Re: [PATCH v2] KVM: x86/pmu: Disable all vPMU features support on Intel hybrid CPUs

From: Sean Christopherson
Date: Tue Feb 07 2023 - 12:17:26 EST


On Mon, Feb 06, 2023, Like Xu wrote:
> On 4/2/2023 1:28 am, Sean Christopherson wrote:
> > On Fri, Feb 03, 2023, Like Xu wrote:
> > > On 3/2/2023 2:06 am, Sean Christopherson wrote:
> > > > On Thu, Feb 02, 2023, Like Xu wrote:
> > > > > On 1/2/2023 12:02 am, Sean Christopherson wrote:
> > > > > The perf interface only provides host PMU capabilities and the logic for
> > > > > choosing to disable (or enable) vPMU based on perf input should be left
> > > > > in the KVM part so that subsequent development work can add most code
> > > > > to the just KVM, which is very helpful for downstream users to upgrade
> > > > > loadable KVM module rather than the entire core kernel.
> > > > >
> > > > > My experience interacting with the perf subsystem has taught me that
> > > > > perf change required from KVM should be made as small as possible.
> > > >
> > > > I don't disagree, but I don't think that's relevant in this case. Perf doesn't
> > > > provide the necessary bits for KVM to virtualize a hybrid PMU, so unless KVM is
> > > > somehow able to get away with enumerating a very stripped down vPMU, additional
> > > > modifications to perf_get_x86_pmu_capability() will be required.
> > > >
> > > > What I care more about though is this ugliness in perf_get_x86_pmu_capability():
> > > >
> > > > /*
> > > > * KVM doesn't support the hybrid PMU yet.
> > > > * Return the common value in global x86_pmu,
> > > > * which available for all cores.
> > >
> > > I would have expected w/ current code base, vpmu (excluding pebs and lbr, intel_pt)
> > > to continue to work on any type of pCPU until you decide to disable them completely.
> >
> > Didn't follow this.
>
> My expectation is that, if a guest doesn't enable "PEBS, LBR and intel_pt",
> and only has the most basic pmu conters (its number is the lesser number
> of big and small cores supported), with some pmu_event_fileter allow list
> mechanism, vPMU works regardless of the vcpu model and does not
> require cpu pined. Any complaints from users on this usages ?

No? But I highly doubt the average user is even aware of KVM_SET_PMU_EVENT_FILTER,
let alone knows how to use it. E.g. AFAICT QEMU doesn't support the ioctl().
And for people that do use event filters, I doubt they're running on hybrid CPUs.

> > > > I really don't want to leave that comment lying around as it's flat out wrong in
> > > > that it obviously doesn't address the other differences beyond the number of
> > > > counters. And since there are dependencies on perf, my preference is to disable
> > > > PMU enumeration in perf specifically so that whoever takes on vPMU enabling is
> > > > forced to consider the perf side of things, and get buy in from the perf folks.
> > >
> > > The perf_get_x86_pmu_capability() obviously needs to be revamped,
> > > but until real effective KVM enabling work arrives, any inconsequential intrusion
> > > into perf/core code will only lead to trivial system maintenance.
> >
> > Trivial doesn't mean useless or unnecessary though. IMO, there's value in capturing,
> > in code, that perf_get_x86_pmu_capability() doesn't properly support hybrid vPMUs.
> >
> > That said, poking around perf, checking is_hybrid() is wrong. This quirk suggests
> > that if E-cores are disabled via BIOS, (a) X86_FEATURE_HYBRID_CPU is _supposed_ to
> > be cleared, and (b) the base PMU will reflect the P-core PMU. I.e. someone can
> > enable vPMU by disabling E-cores.
> >
> > /*
> > * Quirk: For some Alder Lake machine, when all E-cores are disabled in
> > * a BIOS, the leaf 0xA will enumerate all counters of P-cores. However,
> > * the X86_FEATURE_HYBRID_CPU is still set. The above codes will
>
> Sigh. Then what if E-cores are manually offline via "/.../cpu$/online" and
> then init kvm module ?

KVM has to be paranoid and assume those CPUs could be onlined in the future.

> I suggest leaving these open issues to that enabling guy (or maybe it's still me).
>
> > * mistakenly add extra counters for P-cores. Correct the number of
> > * counters here.
> > */
> > if ((pmu->num_counters > 8) || (pmu->num_counters_fixed > 4)) {
> > pmu->num_counters = x86_pmu.num_counters;
> > pmu->num_counters_fixed = x86_pmu.num_counters_fixed;
> > }
> >
> > Side topic, someone (*cough* Intel) should fix that, e.g. detect the scenario
> > during boot and manually clear X86_FEATURE_HYBRID_CPU.
>
> Maybe they did it on purpose.

That seems unlikely. My interpretation of the comment, specifically the "Quirk" and
"some ... machine" parts, is that the intended behavior is to clear the HYBRID bit,
but _some_ platforms fail to do so.

I don't think Intel's intent matters though. If the kernel benefits from clearing
HYBRID and there are no downsides, then it should be cleared regardless of what
Intel intended ucode to do.