Re: [PATCH RFC] mm/userfaultfd: enable writenotify while userfaultfd-wp is enabled for a VMA

From: David Hildenbrand
Date: Wed Dec 07 2022 - 08:35:20 EST


On 06.12.22 22:27, Peter Xu wrote:
On Tue, Dec 06, 2022 at 05:28:07PM +0100, David Hildenbrand wrote:
If no one is using mprotect() with uffd-wp like that, then the reproducer
may not be valid - the reproducer is defining how it should work, but does
that really stand? That's why I said it's ambiguous, because the
definition in this case is unclear.

There are interesting variations like:

mmap(PROT_READ, MAP_POPULATE|MAP_SHARED)
uffd_wp()
mprotect(PROT_READ|PROT_WRITE)

Where we start out with all-write permissions before we enable selective
write permissions.

Could you elaborate on how the above differs from:

mmap(PROT_READ|PROT_WRITE, MAP_POPULATE|MAP_SHARED)
uffd_wp()

?

That mapping would temporarily allow write access. I'd imagine that something like that might be useful when atomically replacing an existing mapping (MAP_FIXED) while the VMA might already be in use by other threads, or when you really want to catch any possible write access.
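To make the ordering concrete, here is a rough sketch of that sequence in user space. It only shows the mmap()/mprotect() steps; the actual uffd-wp registration and write-protection are left out and marked with a comment. The helper name replace_then_allow_write() and the anonymous shared memory are assumptions for illustration, not taken from the reproducer.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: replace an existing mapping in place with a
 * read-only shared mapping, then allow writes via mprotect().
 * Returns 0 on success, -1 on failure.
 */
int replace_then_allow_write(void)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    /* Stand-in for a mapping that other threads might already be using. */
    unsigned char *old = mmap(NULL, len, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (old == MAP_FAILED)
        return -1;

    /* Atomically replace it with a populated, read-only shared mapping. */
    unsigned char *p = mmap(old, len, PROT_READ,
                            MAP_SHARED | MAP_ANONYMOUS | MAP_POPULATE | MAP_FIXED,
                            -1, 0);
    if (p == MAP_FAILED) {
        munmap(old, len);
        return -1;
    }

    /*
     * uffd-wp registration and write-protection would happen here
     * (omitted). Reads are already allowed; writes would still fault.
     */
    unsigned char before = p[0];

    /* Only now hand out write permissions at the VMA level. */
    if (mprotect(p, len, PROT_READ | PROT_WRITE) != 0) {
        munmap(p, len);
        return -1;
    }
    p[0] = 42;

    int ok = (before == 0 && p[0] == 42);
    munmap(p, len);
    return ok ? 0 : -1;
}
```

With this ordering there is no window in which the new mapping is writable before the (omitted) write-protection step has run.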

For example, libvhost-user.c in QEMU uses the following for ordinary postcopy:

    /*
     * In postcopy we're using PROT_NONE here to catch anyone
     * accessing it before we userfault.
     */
    mmap_addr = mmap(0, dev_region->size + dev_region->mmap_offset,
                     PROT_NONE, MAP_SHARED | MAP_NORESERVE,
                     vmsg->fds[0], 0);

I'd imagine that, when using uffd-wp (VM snapshotting with shmem?), one might use PROT_READ instead until the write-protection is properly set, because read access would be fine in the meantime.
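As a small illustration of the difference between the two protections (a sketch only, with no userfaultfd involved): a read through a PROT_NONE mapping faults, while a read through a PROT_READ mapping goes through. The helper name read_succeeds() and the use of a forked child to absorb the fault are assumptions for the example.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one shared anonymous page with `prot` and
 * read its first byte in a child process.
 * Returns 1 if the read succeeded, 0 if the child did not exit
 * cleanly (e.g. it was killed by SIGSEGV), -1 on setup error.
 */
int read_succeeds(int prot)
{
    long page = sysconf(_SC_PAGESIZE);
    assert(page > 0);
    size_t len = (size_t)page;

    volatile unsigned char *p = mmap(NULL, len, prot,
                                     MAP_SHARED | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        munmap((void *)p, len);
        return -1;
    }
    if (pid == 0) {
        /* Child: the read either works or kills us with a signal. */
        unsigned char v = p[0];
        _exit(v == 0 ? 0 : 1);
    }

    int status = 0;
    pid_t waited = waitpid(pid, &status, 0);
    munmap((void *)p, len);
    if (waited != pid)
        return -1;

    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 1 : 0;
}
```

Calling read_succeeds(PROT_NONE) and read_succeeds(PROT_READ) side by side shows why PROT_READ would be the more permissive choice when only writes need to be caught.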

But I'm just pulling use cases out of my magic hat ;) Nothing stops user space from doing things that are not clearly forbidden (well, even then users might complain, but that's a different story).

[...]

Case (2) is rather a corner case, and unless people complain about it being
a real performance issue, it felt cleaner (less code) to not optimize for
that now.

As I didn't have a closer look at the savedwrite removal patchset, I may
not be saying anything sensible here. What I hope is that we don't lose write
bits easily; after all, we even tried to save the dirty and young bits to
avoid the machine cycles in the MMUs.

Hopefully, someone will complain loudly if that corner case is relevant.



Again Peter, I am not against you, not at all. Sorry if I gave you the
impression. I highly appreciate your work and this discussion.

No worries on that part. You're doing great in this email explaining things
and writing things up; I'm especially happy Hugh confirmed it, so it's good to
have those. Let's start with something like this when you NAK something
next time. :)

:)

--
Thanks,

David / dhildenb