Re: [PATCH v3 1/2] mm: protect free_pgtables with mmap_lock write lock in exit_mmap

From: Suren Baghdasaryan
Date: Wed Dec 08 2021 - 16:24:52 EST


On Wed, Dec 8, 2021 at 11:22 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
>
> On Wed, Dec 08, 2021 at 11:13:42AM -0800, Suren Baghdasaryan wrote:
> > On Wed, Dec 8, 2021 at 8:50 AM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> > >
> > > On Wed, Dec 8, 2021 at 8:05 AM Matthew Wilcox <willy@xxxxxxxxxxxxx> wrote:
> > > >
> > > > On Wed, Dec 08, 2021 at 04:51:58PM +0100, Michal Hocko wrote:
> > > > > On Wed 08-12-21 15:01:24, Matthew Wilcox wrote:
> > > > > > On Tue, Dec 07, 2021 at 03:08:19PM -0800, Suren Baghdasaryan wrote:
> > > > > > > > > /**
> > > > > > > > > * @close: Called when the VMA is being removed from the MM.
> > > > > > > > > * Context: Caller holds mmap_lock.
> > > > > > >
> > > > > > > BTW, is the caller always required to hold mmap_lock for write or it
> > > > > > > *might* hold it?
> > > > > >
> > > > > > __do_munmap() might hold it for read, thanks to:
> > > > > >
> > > > > > >	if (downgrade)
> > > > > > >		mmap_write_downgrade(mm);
> > > > > >
> > > > > > Should probably say:
> > > > > >
> > > > > > * Context: User context. May sleep. Caller holds mmap_lock.
> > > > > >
> > > > > > I don't think we should burden the implementor of the vm_ops with the
> > > > > > knowledge that the VM chooses to not hold the mmap_lock under certain
> > > > > > circumstances when it doesn't matter whether it's holding the mmap_lock
> > > > > > or not.
> > > > >
> > > > > If we document it like that some code might depend on that lock to be
> > > > > held. I think we only want to document that the holder itself is not
> > > > > allowed to take mmap sem or a depending lock.
> > > >
> > > > The only place where we're not currently holding the mmap_lock is at
> > > > task exit, where the mmap_lock is effectively held because nobody else
> > > > can modify the task's mm. Besides, Suren is changing that in this patch
> > > > series anyway, so it will be always true.
> > >
> > > Ok, I'll make it a separate patch after the patch that changes
> > > exit_mmap and this statement will become always true. Sounds
> > > reasonable?
> >
> > Actually, while today vm_ops->close is called with mmap_lock held, I
> > believe we want this comment to reflect the restrictions on the
> > callback itself, not on the user. IOW, we want to say that the
> > callback should not take mmap_lock while the caller might or might not
> > hold it. If so, I think *might* would make more sense here, like this:
> >
> > * Context: User context. May sleep. Caller might hold mmap_lock.
> >
> > WDYT?
>
> We're documenting the contract between the caller and the callee.
> That implies responsibilities on both sides. For example, we're
> placing requirements on the caller that they're not going to tear
> down the VMA in interrupt context. So I preferred what previous-Suren
> said to current-Suren, "this statement will become always true".
>

previous-Suren posted v4 at
https://lore.kernel.org/all/20211208212211.2860249-1-surenb@xxxxxxxxxx
Thanks!
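
For anyone following the thread, here is a minimal userspace sketch of the
caller/callee contract discussed above. It is only a model, not kernel code:
struct model_mm, struct model_vma, struct model_vm_ops, model_close() and
model_remove_vma() are hypothetical stand-ins, and a plain bool stands in for
mmap_lock. The Context: line uses the wording Matthew suggested.

```c
/*
 * Userspace model only, not kernel code. Every name here is a
 * hypothetical stand-in for mm_struct, vm_area_struct and vm_ops->close.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_mm {
	bool mmap_lock_held;	/* stands in for mmap_lock, read or write */
};

struct model_vma;

struct model_vm_ops {
	/**
	 * @close: Called when the VMA is being removed from the MM.
	 * Context: User context.  May sleep.  Caller holds mmap_lock.
	 */
	void (*close)(struct model_vma *vma);
};

struct model_vma {
	struct model_mm *mm;
	const struct model_vm_ops *ops;
	int close_calls;	/* bookkeeping for the example only */
};

/* Callee side: may rely on the lock being held, must not take it itself. */
static void model_close(struct model_vma *vma)
{
	assert(vma->mm->mmap_lock_held);	/* caller's half of the contract */
	vma->close_calls++;
}

/* Caller side: hold the lock across the callback, then release it. */
static void model_remove_vma(struct model_vma *vma)
{
	vma->mm->mmap_lock_held = true;
	if (vma->ops && vma->ops->close)
		vma->ops->close(vma);
	vma->mm->mmap_lock_held = false;
}
```

The assert in model_close() is the point of the model: once the comment says
"Caller holds mmap_lock", a callback is entitled to depend on it, so every
caller has to make the statement true.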