Re: [PATCH v1 06/11] mm: support GUP-triggered unsharing via FAULT_FLAG_UNSHARE (!hugetlb)

From: Linus Torvalds
Date: Fri Dec 17 2021 - 20:54:08 EST


[ Going back in the thread to this one ]

On Fri, Dec 17, 2021 at 1:15 PM Nadav Amit <namit@xxxxxxxxxx> wrote:
>
> I think that there is an assumption that once a page is COW-broken,
> it would never have another write-fault that might lead to COW
> breaking later.

Right. I do think there are problems in the current code, I just think
that the patches are a step back.

The problems with the current code are of two kinds:

- I think the largepage code (either THP or explicit hugetlb) doesn't
do as good a job of this whole COW handling as the regular pages do

- some of the "you can make pages read-only again explicitly" kinds of loads.

But honestly, at least for the second case, if somebody does a GUP,
and then starts playing mprotect games on the same virtual memory area
that they did a GUP on, and are surprised when they get another COW
fault that breaks their own connection with a page they did a GUP on
earlier, that's their own fault.

So I think there's some of "If you broke it, you get to keep both
pieces". Literally, in this case. You have your GUP page that you
looked up, and you have your virtual address page that you caused COW
on with mprotect() by making it read-only and then read-write again,
then you have two different pages, and at some point it really is just
"Well, don't do that then".

But yes, there's also some of "some code probably didn't get fully
converted to the new world order". So if VFIO only uses
FOLL_LONGTERM, and didn't ask for the COW breaking, then yes, VFIO
will see page incoherencies. But that should be an issue of "VFIO
should do the right thing".

So part of it is a combination of "if you do crazy things, you'll get
crazy results". And some of it is some kernel pinning code that
doesn't do the right thing to actually make sure it gets a shared page
to be pinned.

And then there's THP and HUGETLB, that I do think needs fixing and
aren't about those two kinds of cases.

I think we never got around to just doing the same thing we did for
regular pages. I think the hugepage code simply doesn't follow that
"COW on GUP, mark to not COW later" pattern.

Linus