Re: [syzbot] WARNING: locking bug in hugetlb_no_page

From: Dmitry Vyukov
Date: Mon Nov 14 2022 - 05:02:34 EST


On Mon, 14 Nov 2022 at 03:24, Mike Kravetz <mike.kravetz@xxxxxxxxxx> wrote:
>
> On 11/13/22 10:50, Mike Kravetz wrote:
> > On 11/13/22 16:36, Dmitry Vyukov wrote:
> > > On Sat, 12 Nov 2022 at 15:03, syzbot
> > > <syzbot+d07c65298d2c15eafcb0@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
> > > >
> > > > Hello,
> > > >
> > > > syzbot found the following issue on:
> > > >
> > > > HEAD commit: 1621b6eaebf7 Merge branch 'for-next/fixes' into for-kernelci
> > > > git tree: git://git.kernel.org/pub/scm/linux/kernel/git/arm64/linux.git for-kernelci
> > > > console output: https://syzkaller.appspot.com/x/log.txt?x=13bd511e880000
> > > > kernel config: https://syzkaller.appspot.com/x/.config?x=606e57fd25c5c6cc
> > > > dashboard link: https://syzkaller.appspot.com/bug?extid=d07c65298d2c15eafcb0
> > > > compiler: Debian clang version 13.0.1-++20220126092033+75e33f71c2da-1~exp1~20220126212112.63, GNU ld (GNU Binutils for Debian) 2.35.2
> > > > userspace arch: arm64
> > > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=13315856880000
> > > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=173614d1880000
> > > >
> > > > Downloadable assets:
> > > > disk image: https://storage.googleapis.com/syzbot-assets/82aa7741098d/disk-1621b6ea.raw.xz
> > > > vmlinux: https://storage.googleapis.com/syzbot-assets/f6be08c4e4c2/vmlinux-1621b6ea.xz
> > > > kernel image: https://storage.googleapis.com/syzbot-assets/296b6946258a/Image-1621b6ea.gz.xz
> > > >
> > > > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > > > Reported-by: syzbot+d07c65298d2c15eafcb0@xxxxxxxxxxxxxxxxxxxxxxxxx
> > >
> > > This may have the same root cause as:
> > >
> > > possible deadlock in hugetlb_fault
> > > https://lore.kernel.org/all/CACT4Y+ZWNV6ApzEv0UrsF2T8JWmXez_-H-EGMii-S_2JbXv07Q@xxxxxxxxxxxxxx/
> > >
> > > and there is a potential explanation as to what may be the problem.
> >
> > Thanks Dmitry!
> >
> > An issue with this new hugetlb locking was previously reported and I have been
> > working on a solution. When I look at the reproducer, I see that it is calling
> > madvise(MADV_DONTNEED). This triggers the other issue and could certainly
> > cause the issue reported here.
> >
> > Proposed patches are here and in next-20221111:
> > https://lore.kernel.org/linux-mm/20221111232628.290160-1-mike.kravetz@xxxxxxxxxx/
> >
> > I am currently trying to run the reproducer, but it is not reproducing quickly.
> > Since this is a timing issue, that is expected. Interesting that this
> > report was run on arm64 and I am trying to reproduce on x86, although the
> > issue is not architecture-specific in any way.
>
> After tweaking my config, I was able to reliably reproduce the issue.
>
> > I'll keep looking, but am fairly confident this is the root cause.
>
> I was also able to verify the series above addresses the issue.

Let's tell syzbot about the fix so that it reports similar issues in the future:

#syz fix:
hugetlb: don't delete vma_lock in hugetlb MADV_DONTNEED processing