Re: [mm] 408579cd62: WARNING:suspicious_RCU_usage

From: Oliver Sang
Date: Tue Jul 04 2023 - 09:51:43 EST


Hi Linus,

On Mon, Jul 03, 2023 at 07:29:48PM -0700, Linus Torvalds wrote:
> On Mon, 3 Jul 2023 at 18:48, Oliver Sang <oliver.sang@xxxxxxxxx> wrote:
> >
> > by patch [1], we found the warning is not fixed.
>
> Hmm. I already committed that "fix" as obvious, since the main
> difference in commit 408579cd627a ("mm: Update do_vmi_align_munmap()
> return semantics") around that validate_mm() call was how it did that
> mmap_read_unlock().
>
> > we also found there are some changes in stack backtrace. now it's as below:
> > (detail dmesg is attached)
> >
> > [ 26.412372][ T1] stack backtrace:
> > [ 26.412846][ T1] CPU: 0 PID: 1 Comm: systemd Not tainted 6.4.0-09908-gcb226fb1fb7a #1
> > [ 26.413506][ T1] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> > [ 26.414326][ T1] Call Trace:
> > [ 26.414605][ T1] <TASK>
> > [ 26.414847][ T1] dump_stack_lvl+0x73/0xc0
> > [ 26.415225][ T1] lockdep_rcu_suspicious+0x1b7/0x280
> > [ 26.415669][ T1] mas_start+0x280/0x400
> > [ 26.416037][ T1] mas_find+0x27a/0x400
> > [ 26.416391][ T1] validate_mm+0x8b/0x2c0
> > [ 26.416757][ T1] __se_sys_brk+0xa35/0xc00
>
> Ok, that is indeed a very different stack trace.
>
> So maybe the fix is a real fix, but the first complaint shut up
> lockdep, so this is the *second* and unrelated complaint.
>
> And indeed: it turns out that do_vma_munmap() does this:
>
>         ret = do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
>         validate_mm(mm);
>
> and so we have *another* validate_mm() that is now done outside the lock.
>
> That one is actually pretty pointless. We've *just* validated the mm
> already inside do_vmi_align_munmap(), except we only did it in one of
> the two return cases.
>
> So I think the fix is to just move that validate_mm() into the other
> return case of do_vmi_align_munmap(), and remove it from the caller.
>
> IOW, something like the attached (NOTE! This is in *addition* to the
> previous patch, which is the same as the one you quoted, just with
> slightly different whitespace as commit ae80b4041984: "mm: validate
> the mm before dropping the mmap lock").

Thanks a lot for the guidance!
I applied the patch below directly on top of ae80b4041984 and confirmed
that the WARNING is gone. Thanks.
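
For reference, the pattern the patch establishes can be sketched as a toy
userspace model in plain C. This is not the kernel code: struct toy_mm and
the toy_* functions are hypothetical stand-ins, and the "lock" is just a
flag. The sketch assumes, as in the discussion above, that the inner
function may drop the lock on its success path, so every validation has to
happen inside it, before any unlock, and the caller must not validate
afterwards.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for mm_struct: tracks only the lock state. */
struct toy_mm {
	bool locked;          /* is the toy "mmap lock" held? */
	int validations;      /* how many times validation ran */
	int unlocked_checks;  /* validations that ran without the lock */
};

/* Stand-in for validate_mm(): records whether it ran under the lock. */
static void toy_validate(struct toy_mm *mm)
{
	mm->validations++;
	if (!mm->locked)
		mm->unlocked_checks++;  /* analogue of the lockdep warning */
}

/* Inner function: validates on BOTH return paths while the lock is held. */
static int toy_align_munmap(struct toy_mm *mm, bool fail, bool unlock)
{
	if (fail)
		goto out_error;

	toy_validate(mm);       /* success path: validate, then unlock */
	if (unlock)
		mm->locked = false;
	return 0;

out_error:
	toy_validate(mm);       /* error path: lock is still held here */
	return -1;
}

/* Caller: forwards the result and does not validate after the call. */
static int toy_vma_munmap(struct toy_mm *mm, bool fail, bool unlock)
{
	return toy_align_munmap(mm, fail, unlock);
}
```

Calling toy_vma_munmap() with unlock set leaves the lock flag cleared on
success, and unlocked_checks stays at zero on both the success and the
error path, which is the property the two patches together aim for.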

>
> Linus

> mm/mmap.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 547b40531791..204ddcd52625 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2571,6 +2571,7 @@ do_vmi_align_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  	__mt_destroy(&mt_detach);
>  start_split_failed:
>  map_count_exceeded:
> +	validate_mm(mm);
>  	return error;
>  }
>
> @@ -3019,12 +3020,9 @@ int do_vma_munmap(struct vma_iterator *vmi, struct vm_area_struct *vma,
>  		  bool unlock)
>  {
>  	struct mm_struct *mm = vma->vm_mm;
> -	int ret;
>
>  	arch_unmap(mm, start, end);
> -	ret = do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
> -	validate_mm(mm);
> -	return ret;
> +	return do_vmi_align_munmap(vmi, vma, mm, start, end, uf, unlock);
>  }
>
> /*