Re: [PATCH v5 19/25] arm64/mm: Wire up PTE_CONT for user mappings

From: David Hildenbrand
Date: Mon Feb 12 2024 - 10:26:29 EST


On 12.02.24 15:45, Ryan Roberts wrote:
On 12/02/2024 13:54, David Hildenbrand wrote:
If so, I wonder if we could instead do that comparison modulo the access/dirty
bits,

I think that would work - but will need to think a bit more on it.

and leave ptep_get_lockless() only reading a single entry?

I think we will need to do something a bit less fragile. ptep_get() does collect
the access/dirty bits so its confusing if ptep_get_lockless() doesn't IMHO. So
we will likely want to rename the function and make its documentation explicit
that it does not return those bits.

ptep_get_lockless_noyoungdirty()? yuk... Any ideas?

Of course if I could convince you the current implementation is safe, I might be
able to sidestep this optimization until a later date?

As discussed (and pointed out abive), there might be quite some callsites where
we don't really care about uptodate accessed/dirty bits -- where ptep_get() is
used nowadays.

One way to approach that I had in mind was having an explicit interface:

ptep_get()
ptep_get_uptodate()
ptep_get_lockless()
ptep_get_lockless_uptodate()

Yes, I like the direction of this. I guess we anticipate that call sites
requiring the "_uptodate" variant will be the minority so it makes sense to use
the current names for the "_not_uptodate" variants? But to do a slow migration,
it might be better/safer to have the weaker variant use the new name - that
would allow us to downgrade one at a time?

Yes, I was primarily struggling with names. Likely it makes sense to either have two completely new function names, or use the new name only for the "faster but less precise" variant.



Especially the last one might not be needed.
I've done a scan through the code and agree with Mark's original conclusions.
Additionally, huge_pte_alloc() (which isn't used for arm64) doesn't rely on
access/dirty info. So I think I could migrate everything to the weaker variant
fairly easily.


Futher, "uptodate" might not be the best choice because of PageUptodate() and
friends. But it's better than "youngdirty"/"noyoungdirty" IMHO.

Certainly agree with "noyoungdirty" being a horrible name. How about "_sync" /
"_nosync"?

I could live with

ptep_get_sync()
ptep_get_nosync()

with proper documentation :)

I don't think we use "_sync" / "_nosync" in the context of pte operations yet.

Well, there seems to be "__arm_v7s_pte_sync" in iommu code, bit at least in core code nothing jumped at me.

--
Cheers,

David / dhildenb