Re: [PATCH] mm/pgtable: return null if no ptl in __pte_offset_map_lock

From: José Pekkarinen
Date: Wed Nov 15 2023 - 11:15:23 EST


On 2023-11-15 16:19, Matthew Wilcox wrote:
On Wed, Nov 15, 2023 at 08:55:05AM +0200, José Pekkarinen wrote:
Documentation of __pte_offset_map_lock suggests there are situations where

You should have cc'd Hugh who changed all this code recently.

Hi,

Sorry, he seems to be missing if I run get_maintainer.pl:

$ ./scripts/get_maintainer.pl include/linux/mm.h
Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> (maintainer:MEMORY MANAGEMENT)
linux-mm@xxxxxxxxx (open list:MEMORY MANAGEMENT)
linux-kernel@xxxxxxxxxxxxxxx (open list)

a pmd may not have a corresponding page table, in which case it should
return NULL without changing ptlp. Syzbot found a way to produce a
NULL dereference in the function, showing this case. This patch
provides the suggested exit path if this unlikely situation turns up. The
output of the kasan null-ptr-report follows:

There's no need to include all this nonsense in the changelog.

No problem, we can clean up the patch if we find there is something
worth upstreaming.

spin_lock include/linux/spinlock.h:351 [inline]
__pte_offset_map_lock+0x154/0x360 mm/pgtable-generic.c:373
pte_offset_map_lock include/linux/mm.h:2939 [inline]
filemap_map_pages+0x698/0x11f0 mm/filemap.c:3582

This was the only interesting part.

+++ b/include/linux/mm.h
@@ -2854,7 +2854,7 @@ void ptlock_free(struct ptdesc *ptdesc);

 static inline spinlock_t *ptlock_ptr(struct ptdesc *ptdesc)
 {
-	return ptdesc->ptl;
+	return (likely(ptdesc)) ? ptdesc->ptl : NULL;
 }

I don't think we should be changing ptlock_ptr().

This is where the NULL pointer dereference originates, so the only
alternative I can think of is to protect the life cycle of the ptdesc
to prevent it from dying between the pte check and the spin_unlock of
__pte_offset_map_lock. Would that work for you?

+++ b/mm/pgtable-generic.c
@@ -370,6 +370,8 @@ pte_t *__pte_offset_map_lock(struct mm_struct *mm, pmd_t *pmd,
 	if (unlikely(!pte))
 		return pte;
 	ptl = pte_lockptr(mm, &pmdval);
+	if (unlikely(!ptl))
+		return NULL;
 	spin_lock(ptl);

I don't understand how this could possibly solve the problem. If there's
no PTE level, then __pte_offset_map() should return NULL and we'd already
return due to the check for !pte.
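
The ordering described above can be sketched as a small user-space
model. This is not kernel code: every model_* name is hypothetical, the
lock is a plain flag rather than a spinlock_t, and the retry logic of
the real function is left out. It only illustrates that when the map
step returns NULL, the function returns before any lock pointer is
looked up, so ptlp stays unchanged.

```c
/* User-space model of the control flow only; all model_* names are hypothetical. */
#include <assert.h>
#include <stddef.h>

typedef struct { int locked; } model_lock_t;  /* stands in for spinlock_t */
typedef struct { int value; } model_pte_t;    /* stands in for pte_t */

/* Stand-in for a pmd entry: it either points at a page table or it does not. */
struct model_pmd {
	model_pte_t *table;  /* NULL models "no page table under this pmd" */
	model_lock_t *ptl;   /* lock protecting that table, if any */
};

/* Models the map step: returns NULL when there is no page table. */
static model_pte_t *model_pte_offset_map(struct model_pmd *pmd)
{
	return pmd->table;
}

/* Models the map-then-lock shape: the NULL check comes before the lock. */
static model_pte_t *model_pte_offset_map_lock(struct model_pmd *pmd,
					      model_lock_t **ptlp)
{
	model_pte_t *pte = model_pte_offset_map(pmd);

	if (!pte)
		return NULL;  /* early exit: *ptlp is left untouched */

	/* Assumption of this model only: a present table always has a lock. */
	assert(pmd->ptl != NULL);
	pmd->ptl->locked = 1;  /* models spin_lock(ptl) */
	*ptlp = pmd->ptl;
	return pte;
}
```

In this model a NULL lock pointer can only be reached with a non-NULL
pte, which is the case the question above is asking about.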

I tested the syzbot reproducer on x86 and it no longer produces this
kasan report.

José.