Re: [syzbot] [mm?] BUG: unable to handle kernel paging request in __pte_offset_map_lock

From: Matthew Wilcox
Date: Wed Nov 15 2023 - 14:39:15 EST


On Thu, Oct 26, 2023 at 11:07:35PM -0700, Hugh Dickins wrote:
> I've spent a while worrying over this report, but have not been able
> to glean much from it: I'm not at all familiar with arm64 debugging, so
> cannot deduce anything from the registers shown, though suspect they
> would shed good light on it; but it may just be a waste of time, since
> it was on a transient 6.6-rc6-based for-kernelci branch from last week.
>
> If I read right, the reproducer is exercising MADV_PAGEOUT (splitting
> huge pages) and MADV_COLLAPSE (assembling huge pages), on mmaps
> MAP_FIXED MAP_SHARED MAP_ANONYMOUS i.e. shmem.
>
> Suspicion falls on my 6.6-rc1 mm/khugepaged.c changes; but I don't see
> what's wrong, and shall probably give up and ignore this - unless an
> arm64 expert can take it further, or syzbot reproduces it on x86 on a
> known tree.

Just to tie the two threads together ... it looks to me like what's
happening is __pte_offset_map_lock() is racing with pagetable_pte_dtor().
That is, we're walking the page tables, find a pmd, look up its
page/ptdesc, but because CONFIG_LOCKDEP is enabled, ptdesc->ptl is a
pointer to a lock, and that pointer is NULL.
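
To make the suspected ordering concrete, here is a minimal userspace sketch
that replays it single-threaded. It is a model, not kernel code: every
model_* name is hypothetical, and the destructor clearing the pointer to
NULL is an assumption chosen to match the symptom described above, not a
statement about what pagetable_pte_dtor() does internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a spinlock too large to embed in the descriptor. */
struct model_lock {
	int held;
};

/* Hypothetical stand-in for a ptdesc whose ptl is a pointer, not an embedded lock. */
struct model_ptdesc {
	struct model_lock *ptl;
};

/* Models the constructor side: allocate the out-of-line lock. */
static int model_pte_ctor(struct model_ptdesc *pt)
{
	pt->ptl = calloc(1, sizeof(*pt->ptl));
	return pt->ptl ? 0 : -1;
}

/* Models the destructor side. ASSUMPTION: the pointer ends up NULL. */
static void model_pte_dtor(struct model_ptdesc *pt)
{
	free(pt->ptl);
	pt->ptl = NULL;
}

/*
 * Models the walker: it already holds a reference to the descriptor
 * (found via the pmd) and now fetches the lock pointer.  Returns 0 if
 * the lock was taken, -1 where the real walker would have dereferenced
 * a NULL pointer.
 */
static int model_map_lock(struct model_ptdesc *pt)
{
	struct model_lock *ptl = pt->ptl;

	if (!ptl)
		return -1;
	ptl->held = 1;
	return 0;
}

static void model_unlock(struct model_ptdesc *pt)
{
	assert(pt->ptl);
	pt->ptl->held = 0;
}

/* Normal ordering: walker locks and unlocks before teardown. */
static int model_replay_ok(void)
{
	struct model_ptdesc pt;
	int ret;

	if (model_pte_ctor(&pt))
		return -2;
	ret = model_map_lock(&pt);
	if (ret == 0)
		model_unlock(&pt);
	model_pte_dtor(&pt);
	return ret;
}

/* Suspected ordering: teardown runs between the pmd lookup and the lock. */
static int model_replay_race(void)
{
	struct model_ptdesc pt;

	if (model_pte_ctor(&pt))
		return -2;
	/* walker has looked up the descriptor from the pmd at this point */
	model_pte_dtor(&pt);
	/* walker resumes and reaches for the lock */
	return model_map_lock(&pt);
}
```

In this model model_replay_ok() returns 0 and model_replay_race() returns
-1; the -1 case corresponds to the point where the real walker would fault
instead of checking.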

More discussion here:
https://lore.kernel.org/linux-mm/ZVUWLgFgu+jE3QmW@xxxxxxxxxxxxxxxxxxxx/T/#t