Re: [PATCH RFC v2 19/27] mm: mprotect: Introduce PAGE_FAULT_ON_ACCESS for mprotect(PROT_MTE)

From: David Hildenbrand
Date: Thu Nov 30 2023 - 09:39:46 EST


On 30.11.23 15:33, Alexandru Elisei wrote:
On Thu, Nov 30, 2023 at 02:43:48PM +0100, David Hildenbrand wrote:
On 30.11.23 14:32, Alexandru Elisei wrote:
Hi,

On Thu, Nov 30, 2023 at 01:49:34PM +0100, David Hildenbrand wrote:
+
+out_retry:
+	put_page(page);
+	if (vmf->flags & FAULT_FLAG_VMA_LOCK)
+		vma_end_read(vma);
+	if (fault_flag_allow_retry_first(vmf->flags)) {
+		err = VM_FAULT_RETRY;
+	} else {
+		/* Replay the fault. */
+		err = 0;

Hello!

Unfortunately, if the page continues to be pinned, it seems the fault will keep occurring.
I guess that could become a system stability issue (but I'm not familiar with this, so please let me know if I'm mistaken!).

How about migrating the page if the problem repeats?

Yes, I had the same thought: in the previous iteration of the series, the
page was migrated out of the VMA if tag storage couldn't be reserved.

Only short term pins are allowed on MIGRATE_CMA pages, so I expect that the
pin will be released before the fault is replayed. Because of this, and
because it makes the code simpler, I chose not to migrate the page if tag
storage couldn't be reserved.

There are still some cases that are theoretically problematic: vmsplice()
can pin pages forever and doesn't use FOLL_LONGTERM yet.

All these things also affect other users that rely on movability (e.g., CMA,
memory hotunplug).

I wasn't aware of that, thank you for the information. Then to ensure that the
process doesn't hang by replaying the fault indefinitely, I'll migrate the page if
tag storage cannot be reserved. Looking over the code again, I think I can reuse
the same function that migrates tag storage pages out of the MTE VMA (added in
patch #21), so no major changes needed.

It's going to be interesting if migrating that page fails because it is
pinned :/
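
For readers following the thread, the fallback order discussed above can be
sketched as a small userspace model. This is only an illustration of the
decision order (reserve tag storage, else migrate the page, else replay the
fault), not the code from the series: struct toy_page, enum toy_result and
toy_handle_fault() are hypothetical names, and the pin state is reduced to
two booleans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a data page; NOT the kernel's struct page. */
struct toy_page {
	bool tag_storage_pinned;	/* someone holds a pin on the tag storage */
	bool page_pinned;		/* someone holds a pin on the data page */
	bool migrated;			/* set once the data page was moved */
};

enum toy_result {
	TOY_RESERVED,	/* tag storage reserved, fault handled */
	TOY_MIGRATED,	/* reservation failed, page moved instead */
	TOY_RETRY,	/* both attempts failed, fault must be replayed */
};

/*
 * Model of the proposed order of operations (an assumption drawn from
 * this thread, not from the patch itself):
 *   1. try to reserve tag storage; a pin on it makes this fail;
 *   2. on failure, migrate the data page; a pin on it makes this fail;
 *   3. only if both fail, ask for the fault to be replayed.
 */
enum toy_result toy_handle_fault(struct toy_page *page)
{
	assert(page != NULL);

	if (!page->tag_storage_pinned)
		return TOY_RESERVED;

	if (!page->page_pinned) {
		page->migrated = true;
		return TOY_MIGRATED;
	}

	return TOY_RETRY;
}
```

In this model the fault is replayed only when both the tag storage and the
page are pinned at the same time, which is the corner case raised here.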

I imagine that having both the page **and** its tag storage pinned longterm
without FOLL_LONGTERM is going to be exceedingly rare.

Yes. I recall that the rule of thumb is that some O_DIRECT I/O can take up to 10 seconds, although that is extremely rare (and maybe not applicable on arm64).


Am I mistaken in believing that the problematic vmsplice() behaviour is
recognized as something that needs to be fixed?

Yes, it has been for a couple of years. I'm hoping this will actually get fixed now that O_DIRECT mostly uses FOLL_PIN instead of FOLL_GET.

--
Cheers,

David / dhildenb