Re: [PATCH V2] KVM: x86/pmu: Disable vPMU if EVENTSEL_GUESTONLY bit doesn't exist

From: Like Xu
Date: Mon Sep 25 2023 - 23:50:04 EST


On 26/9/2023 7:31 am, Sean Christopherson wrote:
On Thu, Sep 14, 2023, Like Xu wrote:
On 7/4/2023 11:37 pm, Sean Christopherson wrote:
On Fri, Apr 07, 2023, Like Xu wrote:
/*
* The guest vPMU counter emulation depends on the EVENTSEL_GUESTONLY bit.
* If this bit is present on the host, the host needs to support at least
the PERFCTR_CORE.
*/

...

/*
* KVM requires guest-only event support in order to isolate guest PMCs
* from host PMCs. SVM doesn't provide a way to atomically load MSRs
* on VMRUN, and manually adjusting counts before/after VMRUN is not
* accurate enough to properly virtualize a PMU.
*/

But now I'm really confused, because if I'm reading the code correctly, perf
invokes amd_core_hw_config() for legacy PMUs, i.e. even if PERFCTR_CORE isn't
supported. And the APM documents the host/guest bits only for "Core Performance
Event-Select Registers".

So either (a) GUESTONLY isn't supported on legacy CPUs and perf is relying on AMD
CPUs ignoring reserved bits or (b) GUESTONLY _is_ supported on legacy PMUs and
pmu_has_guestonly_mode() is checking the wrong MSR when running on older CPUs.

And if (a) is true, then how on earth does KVM support vPMU when running on a
legacy PMU? Is vPMU on AMD just wildly broken? Am I missing something?


(a) It's true and AMD guest vPMU have only been implemented accurately with
the help of this GUESTONLY bit.

There are two other scenarios worth discussing here: one is support L2 vPMU
on the PERFCTR_CORE+ host and this proposal is disabling it; and the other
case is to support AMD legacy vPMU on the PERFCTR_CORE+ host.

Oooh, so the really problematic case is when PERFCTR_CORE+ is supported but
GUESTONLY is not, in which case KVM+perf *think* they can use GUESTONLY (and
HOSTONLY).

That's a straight up KVM (as L0) bug, no? I don't see anything in the APM that
suggests those bits are optional, i.e. KVM is blatantly violating AMD's architecture
by ignoring those bits.

For L2 guest, it often doesn't see all the cpu features corresponding to the
cpu model because KVM and VMM filter some of the capabilities. We can't say
that the absence of these features violates spec, can we ?

I treat it as a KVM flaw or a lack of emulation capability.


I would rather fix KVM (as L0). It doesn't seem _that_ hard to support, e.g.
modify reprogram_counter() to disable the counter if it's supposed to be silent
for the current mode, and reprogram all counters if EFER.SVME is toggled, and on
all nested transitions.

I thought about that too, setting up EFER.SVME and VMRUN is still a little
bit far away, and more micro-testing is needed to correct the behavior
of the emulation here, considering KVM also has to support emulated ins.

It's safe to say that there are no real user scenarios using vPMU in a nested
guest, so I'm more inclined to disable it provisionally (for the sake of more
stable tree users), enabling this feature is honestly at the end of my to-do list.