RE: [PATCH 2/4] KVM: VMX/pmu: Save host debugctlmsr just before vm entry

From: Zhang, Xiong Y
Date: Sun Jun 25 2023 - 00:03:15 EST


> On Fri, Jun 16, 2023, Xiong Zhang wrote:
> > Perf defines four types of perf event: per-CPU pinned, per-process
> > pinned, per-CPU, and per-process, in order of priority from high to
> > low. The vLBR event is a per-process pinned event. So during the VM-exit
> > handler, if the vLBR event preempts a lower-priority perf LBR event,
> > perf disables LBR and lets the guest control LBR; if the vLBR event is
> > preempted by a higher-priority perf LBR event, perf enables LBR. In
> > short, the LBR state may change during the VM-exit handler.
> >
> > MSR_IA32_DEBUGCTLMSR[0] controls LBR enabling. KVM saves its value
> > into vmx->host_debugctlmsr in vcpu_load() and restores it from
> > vmx->host_debugctlmsr immediately after VM exit. Since
> > MSR_IA32_DEBUGCTLMSR[0] can change during the VM-exit handler, the
> > saved value in vmx->host_debugctlmsr can be stale. So this commit saves
> > MSR_IA32_DEBUGCTLMSR into vmx->host_debugctlmsr just before VM entry
> > to reflect the real hardware value.
> >
> > Signed-off-by: Xiong Zhang <xiong.y.zhang@xxxxxxxxx>
> > ---
> > arch/x86/kvm/vmx/vmx.c | 5 +----
> > 1 file changed, 1 insertion(+), 4 deletions(-)
> >
> > diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> > index 44fb619803b8..5ca61a26d0d7 100644
> > --- a/arch/x86/kvm/vmx/vmx.c
> > +++ b/arch/x86/kvm/vmx/vmx.c
> > @@ -1459,13 +1459,9 @@ void vmx_vcpu_load_vmcs(struct kvm_vcpu *vcpu, int cpu,
> >   */
> >  static void vmx_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> >  {
> > -	struct vcpu_vmx *vmx = to_vmx(vcpu);
> > -
> >  	vmx_vcpu_load_vmcs(vcpu, cpu, NULL);
> >
> >  	vmx_vcpu_pi_load(vcpu, cpu);
> > -
> > -	vmx->host_debugctlmsr = get_debugctlmsr();
> >  }
> >
> >  static void vmx_vcpu_put(struct kvm_vcpu *vcpu)
> > @@ -7273,6 +7269,7 @@ static fastpath_t vmx_vcpu_run(struct kvm_vcpu *vcpu)
> >  	atomic_switch_perf_msrs(vmx);
> >  	if (intel_pmu_lbr_is_enabled(vcpu))
> >  		vmx_passthrough_lbr_msrs(vcpu);
> > +	vmx->host_debugctlmsr = get_debugctlmsr();
>
> Reading DEBUG_CTL on every VM-Entry is either unnecessary or insufficient. If
> the DEBUG_CTL value is being changed synchronously, then just fix whatever
> KVM path leads to a change in the host value. If DEBUG_CTL is being changed
> asynchronously, then I'm guessing the change is coming from NMI context,
> which means that KVM is buggy no matter how close we put this to VM-Enter.

When a perf event reschedule is needed on a physical CPU, the perf scheduler sends an IPI to the target CPU, and the IPI handler enables or disables LBR according to the attributes of the active event.
If the vLBR event is active, LBR is disabled in the IPI handler.
If a host LBR event is active, LBR is enabled in the IPI handler. This can happen when a host LBR event preempts the vLBR event while the VM-exit handler is running.
So DEBUG_CTL[0] is changed asynchronously, in the perf IPI handler. Host IRQs are disabled close to VM-Enter, so the IPI cannot be delivered there, and host DEBUG_CTL[0] cannot change until KVM re-enables host IRQs.
A perf counter overflow interrupt (PMI) is an NMI, but that NMI handler doesn't change the LBR state, so the host_debugctlmsr value saved by KVM is still correct after the PMI handler runs.
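
To make the ordering argument concrete, here is a minimal user-space model in C. It is only a sketch, not kernel code: struct cpu_model, deliver_ipi(), send_perf_ipi() and snapshot_is_current_at_entry() are hypothetical names made up for illustration, and the model assumes that the perf IPI handler is the only thing that writes DEBUG_CTL[0] (NMIs are left out, on the assumption above that the PMI handler doesn't touch the LBR state).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DEBUGCTL_LBR (1ULL << 0)	/* models DEBUG_CTL[0] */

/* Hypothetical model of one physical CPU; not a kernel structure. */
struct cpu_model {
	uint64_t debugctl;	/* models host MSR_IA32_DEBUGCTLMSR */
	bool irqs_enabled;
	bool ipi_pending;	/* perf reschedule IPI not yet handled */
	bool ipi_enables_lbr;	/* what the pending handler will do to bit 0 */
};

/* Run the perf IPI handler if an IPI is pending and IRQs are enabled. */
static void deliver_ipi(struct cpu_model *c)
{
	if (!c->ipi_pending || !c->irqs_enabled)
		return;
	if (c->ipi_enables_lbr)
		c->debugctl |= DEBUGCTL_LBR;	/* host LBR event scheduled in */
	else
		c->debugctl &= ~DEBUGCTL_LBR;	/* vLBR event scheduled in */
	c->ipi_pending = false;
}

/* The perf scheduler asks this CPU to reschedule its events. */
static void send_perf_ipi(struct cpu_model *c, bool enables_lbr)
{
	c->ipi_pending = true;
	c->ipi_enables_lbr = enables_lbr;
	deliver_ipi(c);
}

/*
 * Walk one vcpu_load -> exit handling -> VM-Enter sequence and report
 * whether the saved value still matches the modeled MSR at VM-Enter.
 */
bool snapshot_is_current_at_entry(bool save_before_entry)
{
	struct cpu_model c = { .debugctl = 0, .irqs_enabled = true };
	uint64_t saved = c.debugctl;		/* save in vcpu_load() */

	/* Exit handling, IRQs on: a host LBR event preempts the vLBR event. */
	send_perf_ipi(&c, true);

	c.irqs_enabled = false;			/* IRQs off ahead of VM-Enter */
	if (save_before_entry)
		saved = c.debugctl;		/* save point from the patch */

	/* A later reschedule request stays pending while IRQs are off. */
	send_perf_ipi(&c, false);
	assert(c.ipi_pending);

	return saved == c.debugctl;		/* state at VM-Enter */
}
```

In this model, snapshot_is_current_at_entry(false) returns false because the value saved in vcpu_load() goes stale once the IPI handler runs during exit handling, while snapshot_is_current_at_entry(true) returns true because nothing can change the modeled MSR between the save point and VM-Enter.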

thanks