Re: [PATCH V2] KVM: x86/pmu: Disable vPMU if EVENTSEL_GUESTONLY bit doesn't exist

From: Sean Christopherson
Date: Tue Sep 26 2023 - 11:44:39 EST


On Tue, Sep 26, 2023, Like Xu wrote:
> On 26/9/2023 7:31 am, Sean Christopherson wrote:
> > On Thu, Sep 14, 2023, Like Xu wrote:
> > > On 7/4/2023 11:37 pm, Sean Christopherson wrote:
> > > > On Fri, Apr 07, 2023, Like Xu wrote:
> > > /*
> > > * The guest vPMU counter emulation depends on the EVENTSEL_GUESTONLY bit.
> > > * If this bit is present on the host, the host needs to support at least
> > > the PERFCTR_CORE.
> > > */
> >
> > ...
> >
> > > > /*
> > > > * KVM requires guest-only event support in order to isolate guest PMCs
> > > > * from host PMCs. SVM doesn't provide a way to atomically load MSRs
> > > > * on VMRUN, and manually adjusting counts before/after VMRUN is not
> > > > * accurate enough to properly virtualize a PMU.
> > > > */
> > > >
> > > > But now I'm really confused, because if I'm reading the code correctly, perf
> > > > invokes amd_core_hw_config() for legacy PMUs, i.e. even if PERFCTR_CORE isn't
> > > > supported. And the APM documents the host/guest bits only for "Core Performance
> > > > Event-Select Registers".
> > > >
> > > > So either (a) GUESTONLY isn't supported on legacy CPUs and perf is relying on AMD
> > > > CPUs ignoring reserved bits or (b) GUESTONLY _is_ supported on legacy PMUs and
> > > > pmu_has_guestonly_mode() is checking the wrong MSR when running on older CPUs.
> > > >
> > > > And if (a) is true, then how on earth does KVM support vPMU when running on a
> > > > legacy PMU? Is vPMU on AMD just wildly broken? Am I missing something?
> > > >
> > >
> > > (a) It's true and AMD guest vPMU have only been implemented accurately with
> > > the help of this GUESTONLY bit.
> > >
> > > There are two other scenarios worth discussing here: one is support L2 vPMU
> > > on the PERFCTR_CORE+ host and this proposal is disabling it; and the other
> > > case is to support AMD legacy vPMU on the PERFCTR_CORE+ host.
> >
> > Oooh, so the really problematic case is when PERFCTR_CORE+ is supported but
> > GUESTONLY is not, in which case KVM+perf *think* they can use GUESTONLY (and
> > HOSTONLY).
> >
> > That's a straight up KVM (as L0) bug, no? I don't see anything in the APM that
> > suggests those bits are optional, i.e. KVM is blatantly violating AMD's architecture
> > by ignoring those bits.
>
> For L2 guest, it often doesn't see all the cpu features corresponding to the
> cpu model because KVM and VMM filter some of the capabilities. We can't say
> that the absence of these features violates spec, can we ?

Yes, KVM hides features via architectural means. This is enumerating a feature,
PERFCTR_CORE, and not providing all its functionalality. The two things are
distinctly different.

> I treat it as a KVM flaw or a lack of emulation capability.

A.k.a. a bug.

> > I would rather fix KVM (as L0). It doesn't seem _that_ hard to support, e.g.
> > modify reprogram_counter() to disable the counter if it's supposed to be silent
> > for the current mode, and reprogram all counters if EFER.SVME is toggled, and on
> > all nested transitions.
>
> I thought about that too, setting up EFER.SVME and VMRUN is still a little
> bit far away, and more micro-testing is needed to correct the behavior
> of the emulation here, considering KVM also has to support emulated ins.
>
> It's safe to say that there are no real user scenarios using vPMU in a nested
> guest, so I'm more inclined to disable it provisionally (for the sake of more
> stable tree users), enabling this feature is honestly at the end of my to-do list.

I don't see a way to do that without violating AMD's architecture while still
exposing the PMU to L1.