Re: [PATCH] KVM: arm64: Only save S1PIE registers when dirty

From: Marc Zyngier
Date: Sat Mar 02 2024 - 05:28:36 EST


On Fri, 01 Mar 2024 21:13:26 +0000,
Mark Brown <broonie@xxxxxxxxxx> wrote:
>
> [1 <text/plain; us-ascii (7bit)>]
> On Fri, Mar 01, 2024 at 07:32:28PM +0000, Oliver Upton wrote:
> > On Fri, Mar 01, 2024 at 06:05:53PM +0000, Mark Brown wrote:
>
> > > I don't have a good sense if this is a good idea or not, or if this is a
> > > desirable implementation of the concept - the patch is based on some
> > > concerns about the cost of the system register context switching. We
> > > should be able to do something similar for some of the other registers.
>
> > Is there any data beyond a microbenchmark to suggest save elision
> > benefits the VM at all? The idea of baking the trap configuration based
> > on what KVM _thinks_ the guest will do isn't particularly exciting. This
> > doesn't seem to be a one-size-fits-all solution.
>
> No, and as I said above I'm really not confident about this. There's no
> hardware with these registers yet as far as I know so I don't know how
> meaningful any benchmark would be anyway, and as you suggest even with a
> benchmark a new guest could always come along and blow performance up
> with a change in access patterns.
>
> > The overheads of guest exits are extremely configuration dependent, and
> > on VHE the save/restore of EL1 state happens at vcpu_load() / vcpu_put()
> > rather than every exit. There isn't a whole lot KVM can do to lessen the
> > blow of sharing EL1 in the nVHE configuration.
>
> > Looking a bit further out, the cost of traps will be dramatically higher
> > when running as a guest hypervisor, so we'd want to avoid them if
> > possible...
>
> Indeed, but OTOH I got some complaints about adding more system register

Complains from whom? I can't see anything in my inbox, so it my
conclusion that these "issues" are not serious enough to be publicly
mentioned.

> switching in __sysreg_save_el1_state() for one of my other serieses that
> specifically mentioned nested virt and there don't seem to be a huge
> range of other options for reducing what we're doing with context
> switching without using traps to figure out what's in use, especially in
> the nVHE case. I figured I'd send the patch so the idea could be
> considered.

nVHE has a cost. If someone finds it too slow, they can use VHE. If
they rely on nVHE for isolation, then they know exactly what they are
paying for.

They can also avoid exposing the feature to the guest, which will
remove *any* save/restore. because *that* is the right thing to do if
you want to minimise the impact of additional features.

If anything, I'm actually minded to remove existing instances of this
stupid trapping, such as PAuth, which is entirely pointless.

Sysreg access should essentially be free. 90% of it is done out of
context, and requires no synchronisation. If someone has HW that is so
badly affected by something that should have the same cost as a NOP,
they can borrow a hammer from my toolbox and put this HW out of its
misery.

The only case where this sort of trap can be beneficial is when the
cost of the trap (over 60 64bit registers being saved/restored --
that's host and guest's GPRs) is small compared to the cost of the
context switch of the trapped state. This may make sense for FP, SVE
and the like. Doesn't make much sense for anything else.

M.

--
Without deviation from the norm, progress is not possible.