Re: [PATCH v4 3/3] userfaultfd: use per-vma locks in userfaultfd operations

From: Liam R. Howlett
Date: Mon Feb 12 2024 - 15:12:12 EST

* Lokesh Gidra <lokeshgidra@xxxxxxxxxx> [240212 13:08]:
> On Mon, Feb 12, 2024 at 7:20 AM Liam R. Howlett <Liam.Howlett@xxxxxxxxxx> wrote:

> > >
> > > The current implementation has a deadlock problem:

> > On contention you will now abort vs block.
> Is it? On contention mmap_read_trylock() will fail and we do the whole
> operation using lock_mm_and_find_vmas() which blocks on mmap_lock. Am
> I missing something?

You are right, I missed the taking of the lock in the function call.

> >
> > > }
> > > return 0;
> > > }
> > >
> > > Of course this would need defining lock_mm_and_find_vmas() regardless
> > > of CONFIG_PER_VMA_LOCK. I can also remove the prepare_anon condition
> > > in lock_vma().
> >
> > You are adding a lot of complexity for a relatively rare case, which is
> > probably not worth optimising.
> >

> Agreed on reduced complexity. But as Suren pointed out in one of his
> replies that lock_vma_under_rcu() may fail due to seq overflow. That's
> why lock_vma() uses vma_lookup() followed by direct down_read() on
> vma-lock.

I'd rather see another function that doesn't care about anon (I think
src is special that way?), and avoid splitting the locking across
functions as much as possible.

> IMHO what we need here is exactly lock_mm_and_find_vmas()
> and the code can be further simplified as follows:
> err = lock_mm_and_find_vmas(...);
> if (!err) {
> down_read(dst_vma...);
> if (dst_vma != src_vma)
> down_read(src_vma....);
> mmap_read_unlock(mm);
> }
> return err;

If we exactly needed lock_mm_and_find_vmas(), there wouldn't be three
lock/unlock calls depending on the return code.

The fact that lock_mm_and_find_vmas() returns with the mm locked or
unlocked depending on the return code is not reducing the complexity of
this code.

You could use a widget that does something with dst, and a different
widget that does something with src (if they are different). The dst
widget can be used for the lock_vma(), and in the
lock_mm_and_find_vmas(), while the src one can be used in this and the
lock_mm_and_find_vmas(). Neither widget would touch the locks. This way
you can build your functions that have the locking and unlocking
co-located (except the obvious necessity of holding the mmap_read lock
for the !per-vma case).

I've also thought of how you can name the abstraction in the functions:
use a 'prepare() and complete()' to find/lock and unlock what you need.
Might be worth exploring? If we fail to 'prepare()' then we don't need
to 'complete()', which means there won't be mismatched locking hanging
around. Maybe it's too late to change to this sort of thing, but I
thought I'd mention it.