Re: [PATCH 1/2] KVM: x86/pmu: Reset perf_capabilities in vcpu to 0 if PDCM is disabled

From: Mingwei Zhang
Date: Fri Jan 26 2024 - 14:31:14 EST


+Frederick Mayle +Steven Moreland

On Fri, Jan 26, 2024 at 10:33 AM Sean Christopherson <seanjc@google.com> wrote:
>
> On Thu, Jan 25, 2024, Mingwei Zhang wrote:
> > On Wed, Jan 24, 2024, Sean Christopherson wrote:
> > > On Wed, Jan 24, 2024, Mingwei Zhang wrote:
> > > > I think this causes a lot of confusion during migration: the VMM on the
> > > > source believes that a non-zero value from KVM_GET_MSRS is valid, and the
> > > > VMM on the target will find that it is not.
> > >
> > > Yes, but seeing a non-zero value is a KVM bug that should be fixed.
> > >
> > How about adding an entry in vmx_get_msr() for
> > MSR_IA32_PERF_CAPABILITIES that checks pmu_version? This basically pairs
> > with the implementation in vmx_set_msr() for MSR_IA32_PERF_CAPABILITIES.
> > Doing so allows KVM_GET_MSRS to return 0 for the MSR instead of returning
> > the initial permitted value.
>
> Hrm, I don't hate it as a stopgap. But if we are the only people that are affected,
> because again I'm pretty sure QEMU is fine, I would rather we just fix things in
> our VMM and/or internal kernel.

It is not just QEMU. crosvm is another open source VMM that suffers
from this issue.

>
> Long term, I want some form of fix for the initialization code, even if that means
> adding a quirk to let userspace opt out of KVM setting default values for platform
> MSRs.

Yeah, I definitely agree with that. It applies to all future platform MSRs.
Potentially, any patch that adds a new platform MSR and initializes
it at vCPU creation time should be rejected. And the VMM should be told to
get the feature value and set it appropriately.
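
To make the "get the feature value and set it" flow concrete, here is a
rough sketch of the VMM-side policy. It is a standalone model, not code
from any VMM: vmm_pick_perf_capabilities() and its parameters are
hypothetical names. I am assuming host_supported is what the VMM reads
through KVM's feature-MSR interface, and that the result is what it would
then pass to KVM_SET_MSRS on the vCPU.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical VMM-side helper (not a real KVM or VMM API).
 *
 * host_supported: value KVM reports as supported for the MSR.
 * vmm_wanted:     capabilities the VMM is willing to expose.
 * expose_pdcm:    whether the guest CPUID will advertise PDCM.
 *
 * The value is treated as a plain bitmask for illustration only; real
 * MSR_IA32_PERF_CAPABILITIES fields such as the LBR format are
 * multi-bit values and would need per-field handling.
 */
static uint64_t vmm_pick_perf_capabilities(uint64_t host_supported,
                                           uint64_t vmm_wanted,
                                           bool expose_pdcm)
{
	/* Without PDCM the guest must never observe a non-zero value. */
	if (!expose_pdcm)
		return 0;

	/* Never expose more than the host supports. */
	return host_supported & vmm_wanted;
}
```

The point is that the VMM always writes the MSR explicitly, including
writing 0 when PDCM is hidden, rather than relying on whatever KVM
stored at vCPU creation.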

For existing platform MSRs, I think we should prioritize the "fix"
solution. Refactoring existing code that requires ABI changes may
raise complaints from production folks.

>
> Side topic, vmx_set_msr() should check X86_FEATURE_PDCM, not just the PMU version.

You are right.
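
For reference, the paired get/set behavior I have in mind looks roughly
like the model below. This is a userspace sketch, not the actual
vmx_get_msr()/vmx_set_msr() code: struct model_vcpu and both helpers are
made-up names, and the return convention (0 on success, 1 on rejection)
is only meant to mirror how KVM's MSR handlers report failure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the per-vCPU state discussed in this thread. */
struct model_vcpu {
	bool     guest_has_pdcm;    /* guest CPUID advertises PDCM */
	uint8_t  pmu_version;       /* 0 means the guest has no vPMU */
	uint64_t perf_capabilities; /* value set at creation or by userspace */
};

/* True only if the guest can legitimately see the MSR contents. */
static bool model_perf_caps_visible(const struct model_vcpu *vcpu)
{
	return vcpu->guest_has_pdcm && vcpu->pmu_version != 0;
}

/* Read side: report 0 instead of the stored initial value if hidden. */
static uint64_t model_get_perf_capabilities(const struct model_vcpu *vcpu)
{
	assert(vcpu != NULL);
	return model_perf_caps_visible(vcpu) ? vcpu->perf_capabilities : 0;
}

/* Write side: reject non-zero values if PDCM or the PMU is missing. */
static int model_set_perf_capabilities(struct model_vcpu *vcpu, uint64_t data)
{
	assert(vcpu != NULL);
	if (data != 0 && !model_perf_caps_visible(vcpu))
		return 1;

	vcpu->perf_capabilities = data;
	return 0;
}
```

With both sides using the same check, a value read on the migration
source can always be written back on the target.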
>
> > The benefit is that it does not force the VMM to explicitly set the
> > value. In fact, there are several platform MSRs that have initial values
> > that a VMM may rely on instead of setting them explicitly.
> > MSR_IA32_PERF_CAPABILITIES is only one of them.
>
> Yeah, and all of those are broken. AFAICT, the bad behavior got introduced for
> MSR_PLATFORM_INFO, and then people kept copy+pasting that broken pattern :-(