Re: PTE access rules & abstraction

From: Jeremy Fitzhardinge
Date: Wed Sep 24 2008 - 17:58:08 EST


Benjamin Herrenschmidt wrote:
> Well, the current set accessor, as far as I'm concerned is a big pile of
> steaming shit that evolved from x86-specific gunk raped in different
> horrible ways to make it look like it fits on other architectures and
> additionally mashed with goo to make it somewhat palatable by
> virtualization stuff. Yes, bugs can be fixed but it's still a horrible
> mess.
>

What do you propose then? Ideally one would like to get something that
works for powerpc, s390, all the wacky ia64 modes as well as x86. The
ia64 folks proposed something, but I've not looked at it closely. From
an x86 virtualization perspective, something that's basically x86 with
as much scope for batching and deferring as possible would be fine.

As a start, what's the state machine for a pte? What states can it be
in, and how does it move from state to state? It sounds like powerpc
has at least one extra state above x86 (hashed, with the hash key stored
in the pte itself?).

> Now, regarding the above bug, I'm afraid the only approaches I see that
> would work would be to have either a ptep_get_and_clear_flush(), which I
> suppose x86 virt. people will hate, or maybe to actually have a powerpc
> specific variant of the new start/commit hooks that does the flush.
>

ptep_get_and_clear() is not batchable anyway, because the x86
implementation requires an atomic xchg on the pte, which will likely
result in some sort of trap (and if it doesn't then it doesn't need
batching). The start/commit API was added specifically so that we can do
the mprotect (and fork COW) updates in a batchable way (in Xen it's
implemented with a pte update hypercall which updates the pte without
affecting the A/D bits).
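
The batching idea can be sketched in user space as follows. This is a toy
model under stated assumptions, not the kernel or Xen implementation: the
toy_* names, the flag values and the fixed-size queue are all hypothetical,
and the flush function merely stands in for a hypercall that applies queued
pte updates while preserving whatever A/D bits are set at that moment.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned long toy_pte_t;

/* Hypothetical flag values; not a real page-table layout. */
#define TOY_PRESENT  0x1ul
#define TOY_WRITE    0x2ul
#define TOY_ACCESSED 0x4ul
#define TOY_DIRTY    0x8ul
#define TOY_AD_MASK  (TOY_ACCESSED | TOY_DIRTY)

/* One queued update: which pte to change and the value to install. */
struct toy_update { toy_pte_t *ptep; toy_pte_t val; };

#define TOY_BATCH_MAX 16
static struct toy_update toy_batch[TOY_BATCH_MAX];
static size_t toy_batch_len;

/* start: only read the pte; it stays live, so A/D bits may still change. */
toy_pte_t toy_prot_start(toy_pte_t *ptep)
{
    assert(ptep != NULL);
    return *ptep;
}

/* commit: queue the new value instead of writing it immediately. */
int toy_prot_commit(toy_pte_t *ptep, toy_pte_t val)
{
    if (toy_batch_len == TOY_BATCH_MAX)
        return -1;                      /* queue full; caller must flush */
    toy_batch[toy_batch_len].ptep = ptep;
    toy_batch[toy_batch_len].val = val;
    toy_batch_len++;
    return 0;
}

/* flush: apply all queued updates, keeping A/D bits set in the meantime. */
void toy_batch_flush(void)
{
    for (size_t i = 0; i < toy_batch_len; i++) {
        toy_pte_t *p = toy_batch[i].ptep;
        *p = toy_batch[i].val | (*p & TOY_AD_MASK);
    }
    toy_batch_len = 0;
}

/* mprotect-style pass: write-protect n ptes through start/commit. */
int toy_write_protect(toy_pte_t *ptes, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        toy_pte_t v = toy_prot_start(&ptes[i]);
        if (toy_prot_commit(&ptes[i], v & ~TOY_WRITE) != 0)
            return -1;
    }
    return 0;
}
```

The point of the split is that nothing between start and commit needs an
atomic exchange on the pte, so a whole range of updates can be queued and
applied in one go.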

J