Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via FAULT_FLAG_UNSHARE (!hugetlb)

From: Linus Torvalds
Date: Fri Dec 17 2021 - 15:37:06 EST


On Fri, Dec 17, 2021 at 12:18 PM David Hildenbrand <david@xxxxxxxxxx> wrote:
>
> On 17.12.21 20:22, Linus Torvalds wrote:
> > On Fri, Dec 17, 2021 at 11:04 AM Linus Torvalds
> > <torvalds@xxxxxxxxxxxxxxxxxxxx> wrote:
> >>
> > - get a "readonly" copy of a local private page using FAULT_FLAG_UNSHARE.
> >
> > This just increments the page count, because mapcount == 1.
> >
> > - fork()
> >
> > - unmap in the original
> >
> > - child now has "mapcount == 1" on a page again, but refcount is
> > elevated, and child HAS TO COW before writing.
>
> Hi Linus,
>
> This is just GUP before fork(), which is in general
> problematic/incompatible with sharing.

Note that my example was not meant to be an example of a problem per
se, but purely as an example of how meaningless 'mapcount' is, and how
'mapcount==1' isn't really a very meaningful test.

So it wasn't mean to show "look, GUP before fork is problematic". We
have that problem already solved at least for regular pages.

It was purely meant to show how "mapcount==1" isn't a meaningful thing
to test, and my worry about how you're adding that nonsensical test to
the new code.

> Let's just take a look at what refcount does *wrong*. Let's use an
> adjusted version of your example above, because it's a perfect fit:
>
> 1. mem = mmap(pagesize, MAP_PRIVATE)
> -> refcount == 1
>
> 2. memset(mem, 0, pagesize); /* Page is mapped R/W */
>
> 3. fork() /* Page gets mapped R/O */
> -> refcount > 1
>
> 4. child quits
> -> refcount == 1
>
> 5. Take a R/O pin (RDMA, VFIO, ...)
> -> refcount > 1
>
> 6. memset(mem, 0xff, pagesize);
> -> Write fault -> COW

I do not believe this is actually a bug.

You asked for a R/O pin, and you got one.

Then somebody else modified that page, and you got exactly what you
asked for - a COW event. The original R/O pin has the original page
that it asked for, and can read it just fine.

So what is your argument?

Linus