Re: [PATCH 1/5] mm: Introduce mm_struct.has_pinned

From: Jason Gunthorpe
Date: Tue Sep 22 2020 - 12:10:51 EST


On Tue, Sep 22, 2020 at 11:17:36AM -0400, Peter Xu wrote:

> > But it's admittedly a cosmetic point, combined with my perennial fear that
> > I'm missing something when I look at a READ_ONCE()/WRITE_ONCE() pair. :)
>
> Yeah but I hope I'm using it right.. :) I used READ_ONCE/WRITE_ONCE explicitly
> because I think they're cheaper than atomic operations, (which will, iiuc, lock
> the bus).

It is worth thinking a bit about racing fork with
pin_user_pages(). The desired outcome is:

If fork wins the page is write protected, and pin_user_pages_fast()
will COW it.

If pin_user_pages_fast() wins then fork must see the READ_ONCE and
the pin.

As get_user_pages_fast() is lockless it looks like the ordering has to
be like this:

pin_user_pages_fast() fork()
atomic_set(has_pinned, 1);
[..]
atomic_add(page->_refcount)
ordered check write protect()
ordered set write protect()
atomic_read(page->_refcount)
atomic_read(has_pinned)

Such that in all the degenerate racy cases the outcome is that both
sides COW, never neither.

Thus I think it does have to be atomics purely from an ordering
perspective, observing an increased _refcount requires that has_pinned
!= 0 if we are pinning.

So, to make this 100% this ordering will need to be touched up too.

Jason