Re: KVM: x86/vPMU/AMD: Can we detect PMU is off for a VM?

From: Jim Mattson
Date: Thu Nov 09 2023 - 19:06:29 EST


On Thu, Nov 9, 2023 at 3:46 PM Denis V. Lunev <den@xxxxxxxxxxxxx> wrote:
>
> On 11/9/23 23:52, Jim Mattson wrote:
> > On Thu, Nov 9, 2023 at 10:18 AM Konstantin Khorenko
> > <khorenko@xxxxxxxxxxxxx> wrote:
> >> Hi All,
> >>
> >> as a followup for my patch: i have noticed that
> >> currently Intel kernel code provides an ability to detect if PMU is totally disabled for a VM
> >> (pmu->version == 0 in this case), but for AMD code pmu->version is never 0,
> >> no matter if PMU is enabled or disabled for a VM (i mean <pmu state='off'/> in the VM config which
> >> results in "-cpu pmu=off" qemu option).
> >>
> >> So the question is - is it possible to enhance the code for AMD to also honor PMU VM setting or it is
> >> impossible by design?
> > The AMD architectural specification prior to AMD PMU v2 does not allow
> > one to describe a CPU (via CPUID or MSRs) that has fewer than 4
> > general purpose PMU counters. While AMD PMU v2 does allow one to
> > describe such a CPU, legacy software that knows nothing of AMD PMU v2
> > can expect four counters regardless.
> >
> > Having said that, KVM does provide a per-VM capability for disabling
> > the virtual PMU: KVM_CAP_PMU_CAPABILITY(KVM_PMU_CAP_DISABLE). See
> > section 8.35 in Documentation/virt/kvm/api.rst.
> But this means in particular that QEMU should immediately
> use this KVM_PMU_CAP_DISABLE if this capability is supported and
> PMU=off. I am not seeing this code, thus I believe that we have missed
> this. I think that this change is worth adding. We will measure the impact
> :-) Den
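
For reference, a minimal sketch of what the userspace side might look
like. This is not QEMU code: vm_disable_vpmu() is a hypothetical helper
name, and the #ifndef fallback values are an assumption that they match
the upstream uapi headers. Per api.rst, KVM_CHECK_EXTENSION returns the
bitmask of adjustable PMU capabilities, and the capability has to be
enabled on the VM fd before any vCPUs are created.

```c
#include <linux/kvm.h>
#include <sys/ioctl.h>

/* Fallbacks for older headers; values assumed to match upstream uapi. */
#ifndef KVM_CAP_PMU_CAPABILITY
#define KVM_CAP_PMU_CAPABILITY 212
#endif
#ifndef KVM_PMU_CAP_DISABLE
#define KVM_PMU_CAP_DISABLE (1 << 0)
#endif

/*
 * Hypothetical helper: ask KVM to disable the virtual PMU for one VM.
 * Returns 0 on success, -1 if the capability is not offered or the
 * ioctl fails. Must run before the first KVM_CREATE_VCPU on vm_fd.
 */
static int vm_disable_vpmu(int vm_fd)
{
	/* For this capability the return value is a bitmask, not a boolean. */
	int supported = ioctl(vm_fd, KVM_CHECK_EXTENSION,
			      KVM_CAP_PMU_CAPABILITY);

	if (supported <= 0 || !(supported & KVM_PMU_CAP_DISABLE))
		return -1;

	struct kvm_enable_cap cap = {
		.cap = KVM_CAP_PMU_CAPABILITY,
		.args = { KVM_PMU_CAP_DISABLE },
	};

	return ioctl(vm_fd, KVM_ENABLE_CAP, &cap) ? -1 : 0;
}
```

A VMM would call this right after KVM_CREATE_VM when the user asked for
pmu=off, and fall back to the existing CPUID-only handling if it
returns -1.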

At present, even with KVM_PMU_CAP_DISABLE set, KVM will still blindly
cycle through each GP counter (4 minimum for AMD), and for each one it
only finds out that the PMU is disabled when it checks
vcpu->kvm->arch.enable_pmu at the top of get_gp_pmc_amd().

Sean's proposal to clear the metadata should eliminate the overhead
for VMs with the virtual PMU disabled. My proposal should eliminate
the overhead even for VMs with the virtual PMU enabled, as long as no
counters are programmed for "instructions retired."
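
To make the difference concrete, here is a toy model of the three
behaviours. It is not the KVM code: every name below (struct model_pmu,
valid_mask, insn_retired_mask, the model_* functions) is hypothetical,
and the way the second proposal is modelled (a mask of counters
programmed for the event) is an assumption for illustration only. The
lookups field counts how many per-counter lookups each variant performs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_GP_COUNTERS 4	/* the AMD minimum mentioned above */

struct model_pmu {
	bool enable_pmu;		/* stands in for kvm->arch.enable_pmu */
	uint64_t valid_mask;		/* counters the event path walks */
	uint64_t insn_retired_mask;	/* assumed: counters counting the event */
	uint64_t count[MODEL_NR_GP_COUNTERS];
	unsigned int lookups;		/* per-counter lookups performed */
};

/* Per-counter lookup; the enable check happens only in here. */
static uint64_t *model_get_counter(struct model_pmu *pmu, unsigned int idx)
{
	pmu->lookups++;
	if (!pmu->enable_pmu)
		return NULL;
	return &pmu->count[idx];
}

/* Walk the counters selected by mask, bumping those that count the event. */
static void model_walk(struct model_pmu *pmu, uint64_t mask)
{
	for (unsigned int i = 0; i < MODEL_NR_GP_COUNTERS; i++) {
		if (!(mask & (1ULL << i)))
			continue;

		uint64_t *c = model_get_counter(pmu, i);

		if (c && (pmu->insn_retired_mask & (1ULL << i)))
			(*c)++;
	}
}

/* Current behaviour, modelled: every valid counter is looked up. */
static void model_trigger_current(struct model_pmu *pmu)
{
	model_walk(pmu, pmu->valid_mask);
}

/* First proposal, modelled: a disabled PMU advertises no counters. */
static void model_clear_metadata_if_disabled(struct model_pmu *pmu)
{
	if (!pmu->enable_pmu)
		pmu->valid_mask = 0;
}

/* Second proposal, modelled: only walk counters programmed for the event. */
static void model_trigger_filtered(struct model_pmu *pmu)
{
	model_walk(pmu, pmu->valid_mask & pmu->insn_retired_mask);
}
```

In this model a disabled PMU with four valid counters costs four
lookups per event today and none once the metadata is cleared, while an
enabled PMU with nothing programmed for the event costs four lookups
with model_trigger_current() and none with model_trigger_filtered().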