Re: [PATCH] mm/pgtable: return null if no ptl in __pte_offset_map_lock

From: Matthew Wilcox
Date: Wed Nov 15 2023 - 14:04:26 EST


On Wed, Nov 15, 2023 at 06:05:30PM +0200, José Pekkarinen wrote:
> On 2023-11-15 16:19, Matthew Wilcox wrote:
> > On Wed, Nov 15, 2023 at 08:55:05AM +0200, José Pekkarinen wrote:
> > > Documentation of __pte_offset_map_lock suggest there is situations
> > > where
> >
> > You should have cc'd Hugh who changed all this code recently.
>
> Hi,
>
> Sorry, he seems to be missing if I run get_maintainer.pl:
>
> $ ./scripts/get_maintainer.pl include/linux/mm.h
> Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> (maintainer:MEMORY MANAGEMENT)
> linux-mm@xxxxxxxxx (open list:MEMORY MANAGEMENT)
> linux-kernel@xxxxxxxxxxxxxxx (open list)

That's a good example of why get_maintainer.pl is not great. It's
just a stupid perl program. Ideally, you should research what changes
have been made to that code recently and see who else might be
implicated. Who introduced the exact code that you're fixing?

In this specific instance, you can see Hugh already responded to it:

https://lore.kernel.org/all/0000000000005e44550608a0806c@xxxxxxxxxx/T/

Now, part of Hugh's response turns out to be incorrect; syzbot can
reproduce this on a current mainline kernel. But, for some reason,
syzbot has not done a bisect to track it down to a particular commit.
I don't understand why it hasn't; maybe someone who knows syzbot better
can explain why.

> > > +++ b/include/linux/mm.h
> > > @@ -2854,7 +2854,7 @@ void ptlock_free(struct ptdesc *ptdesc);
> > >
> > > static inline spinlock_t *ptlock_ptr(struct ptdesc *ptdesc)
> > > {
> > > - return ptdesc->ptl;
> > > + return (likely(ptdesc)) ? ptdesc->ptl : NULL;
> > > }
> >
> > I don't think we should be changing ptlock_ptr().
>
> This is where the null ptr dereference originates, so the only
> alternative I can think of is to protect the life cycle of the ptdesc
> to prevent it to die between the pte check and the spin_unlock of
> __pte_offset_map_lock. Would that work for you?

Ah! I think I found the problem.

If ALLOC_SPLIT_PTLOCKS is not set, there is no problem as ->ptl
is embedded in the struct page. But if ALLOC_SPLIT_PTLOCKS is set
(eg you have LOCKDEP enabled), we can _return_ a NULL pointer from
ptlock_ptr. The NULL pointer dereference isn't in ptlock_ptr(), it's
in __pte_offset_map_lock().

So, how to solve this? We can't just check the ptl against NULL; the
memory that ptl points to may have been freed. We could grab a reference
to the pmd_page, possibly conditionally on ALLOC_SPLIT_LOCK being set,
but that will slow down everything. We could make page_ptl_cachep
SLAB_TYPESAFE_BY_RCU, preventing the memory from being freed (even if
the lock might not be associated with this page any more).

Other ideas?