Re: [PATCH V2] KVM: x86/pmu: Disable vPMU if EVENTSEL_GUESTONLY bit doesn't exist

From: Sean Christopherson
Date: Mon Sep 25 2023 - 19:32:06 EST


On Thu, Sep 14, 2023, Like Xu wrote:
> On 7/4/2023 11:37 pm, Sean Christopherson wrote:
> > On Fri, Apr 07, 2023, Like Xu wrote:
> /*
>  * The guest vPMU counter emulation depends on the EVENTSEL_GUESTONLY bit.
>  * If this bit is present on the host, the host needs to support at least
>  * the PERFCTR_CORE.
>  */

...

> > /*
> > * KVM requires guest-only event support in order to isolate guest PMCs
> > * from host PMCs. SVM doesn't provide a way to atomically load MSRs
> > * on VMRUN, and manually adjusting counts before/after VMRUN is not
> > * accurate enough to properly virtualize a PMU.
> > */
> >
> > But now I'm really confused, because if I'm reading the code correctly, perf
> > invokes amd_core_hw_config() for legacy PMUs, i.e. even if PERFCTR_CORE isn't
> > supported. And the APM documents the host/guest bits only for "Core Performance
> > Event-Select Registers".
> >
> > So either (a) GUESTONLY isn't supported on legacy CPUs and perf is relying on AMD
> > CPUs ignoring reserved bits or (b) GUESTONLY _is_ supported on legacy PMUs and
> > pmu_has_guestonly_mode() is checking the wrong MSR when running on older CPUs.
> >
> > And if (a) is true, then how on earth does KVM support vPMU when running on a
> > legacy PMU? Is vPMU on AMD just wildly broken? Am I missing something?
> >
>
> (a) is true, and the AMD guest vPMU has only been implemented accurately
> with the help of this GUESTONLY bit.
>
> There are two other scenarios worth discussing here: one is supporting an L2
> vPMU on a PERFCTR_CORE+ host, which this proposal disables; the other is
> supporting an AMD legacy vPMU on a PERFCTR_CORE+ host.

Oooh, so the really problematic case is when PERFCTR_CORE+ is supported but
GUESTONLY is not, in which case KVM+perf *think* they can use GUESTONLY (and
HOSTONLY).

That's a straight up KVM (as L0) bug, no? I don't see anything in the APM that
suggests those bits are optional, i.e. KVM is blatantly violating AMD's architecture
by ignoring those bits.

I would rather fix KVM (as L0). It doesn't seem _that_ hard to support, e.g.
modify reprogram_counter() to disable the counter if it's supposed to be silent
for the current mode, and reprogram all counters if EFER.SVME is toggled, and on
all nested transitions.
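
A rough, untested sketch of the idea above, written as standalone C rather
than as a patch. All struct and function names here are hypothetical and are
not KVM's actual code; only the bit positions (enable = bit 22, GuestOnly =
bit 40, HostOnly = bit 41) come from the event-select layout. The treatment of
the EFER.SVME=0 case (a counter with either bit set stays silent) is an
assumption based on one reading of the APM and should be checked against it.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define EVENTSEL_ENABLE    (1ULL << 22)
#define EVENTSEL_GUESTONLY (1ULL << 40)
#define EVENTSEL_HOSTONLY  (1ULL << 41)

/* Hypothetical snapshot of the vCPU state that matters for counting. */
struct vcpu_mode {
	bool svme;          /* guest's EFER.SVME */
	bool in_guest_mode; /* true while the vCPU is running L2 */
};

/* Hypothetical per-counter state: what the guest wrote vs. what is programmed. */
struct vpmc {
	uint64_t eventsel;    /* value written by the guest */
	uint64_t hw_eventsel; /* value handed to the backing counter */
};

/* Return true if the counter must not count in the current mode. */
static bool counter_is_silent(uint64_t eventsel, struct vcpu_mode m)
{
	uint64_t hg = eventsel & (EVENTSEL_GUESTONLY | EVENTSEL_HOSTONLY);

	if (!hg)
		return false;           /* no filter: always counts */
	if (!m.svme)
		return true;            /* assumption: filter bits + SVME=0 => silent */
	if (hg == (EVENTSEL_GUESTONLY | EVENTSEL_HOSTONLY))
		return false;           /* both set: counts in host and guest */
	if (hg == EVENTSEL_GUESTONLY)
		return !m.in_guest_mode;
	return m.in_guest_mode;         /* HostOnly */
}

/* Disable the backing counter if it is supposed to be silent right now. */
static void reprogram_counter(struct vpmc *pmc, struct vcpu_mode m)
{
	pmc->hw_eventsel = pmc->eventsel;
	if (counter_is_silent(pmc->eventsel, m))
		pmc->hw_eventsel &= ~EVENTSEL_ENABLE;
}

/* To be called when EFER.SVME is toggled and on every nested transition. */
static void reprogram_all_counters(struct vpmc *pmcs, size_t nr,
				   struct vcpu_mode m)
{
	for (size_t i = 0; i < nr; i++)
		reprogram_counter(&pmcs[i], m);
}
```

In this sketch, a nested VMRUN or #VMEXIT would flip in_guest_mode and then
call reprogram_all_counters(), so a GuestOnly counter only has its enable bit
set while L2 is running and a HostOnly counter only while L1 is running.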