RE: [PATCH] KVM: VMX: Disable Intel PT before VM-entry

From: Kang, Luwei
Date: Mon Mar 30 2020 - 23:29:34 EST


> > > On Wed, Mar 18, 2020 at 11:48:18AM +0800, Luwei Kang wrote:
> > > > If the logical processor is operating with Intel PT enabled (
> > > > IA32_RTIT_CTL.TraceEn = 1) at the time of VM entry, the "load
> > > > IA32_RTIT_CTL" VM-entry control must be 0 (SDM 26.2.1.1).
> > > >
> > > > The first disabling of the host Intel PT (clearing TraceEn) flushes
> > > > all the buffered packets out of the processor, which may cause an
> > > > Intel PT PMI. The host Intel PT is then re-enabled in the host
> > > > Intel PT PMI handler.
> > > >
> > > > handle_pmi_common()
> > > > -> intel_pt_interrupt()
> > > > -> pt_config_start()
> > >
> > > IIUC, this is only possible when PT "plays nice" with VMX, correct?
> > > Otherwise pt->vmx_on will be true and pt_config_start() would skip
> > > the WRMSR.
> > >
> > > And IPT PMI must be delivered via NMI (though maybe they're always
> > > delivered via NMI?).
> > >
> > > In any case, redoing WRMSR doesn't seem safe, and it certainly isn't
> > > performant, e.g. what prevents the second WRMSR from triggering a
> > > second IPT PMI?
> > >
> > > pt_guest_enter() is called after the switch to the vCPU has already
> > > been recorded. Can't this be handled in the IPT code, e.g. something
> > > like this?
> > >
> > > diff --git a/arch/x86/events/intel/pt.c b/arch/x86/events/intel/pt.c
> > > index 1db7a51d9792..e38ddae9f0d1 100644
> > > --- a/arch/x86/events/intel/pt.c
> > > +++ b/arch/x86/events/intel/pt.c
> > > @@ -405,7 +405,7 @@ static void pt_config_start(struct perf_event *event)
> > >         ctl |= RTIT_CTL_TRACEEN;
> > >         if (READ_ONCE(pt->vmx_on))
> > >                 perf_aux_output_flag(&pt->handle, PERF_AUX_FLAG_PARTIAL);
> > > -       else
> > > +       else if (!(current->flags & PF_VCPU))
> > >                 wrmsrl(MSR_IA32_RTIT_CTL, ctl);
> >
> > Intel PT can work in SYSTEM or HOST_GUEST mode, selected by the
> > kvm-intel.ko parameter "pt_mode". In SYSTEM mode, both the host and
> > guest PT trace are saved in the host buffer. KVM does nothing during
> > VM-entry/exit in SYSTEM mode, and an Intel PT PMI may happen at any
> > point, so the PT trace may end up disabled while running in KVM (PT
> > only needs to be disabled before VM-entry in HOST_GUEST mode).
>
> Ah, right. What about enhancing intel_pt_handle_vmx() and 'struct pt' to
> replace vmx_on with a field that incorporates the KVM mode?

Some history: the host perf side did not fully agree with introducing
HOST_GUEST mode for PT in KVM, because KVM disables the host trace
before VM-entry in HOST_GUEST mode, so the KVM guest wins in this case.
For example, if Intel PT is already enabled in a KVM guest and the host
then wants to start a system-wide trace (collecting all the trace on
this system), the trace produced by the guest OS is saved in the guest
PT buffer and the host buffer can't get it. So I would prefer not to
introduce the KVM PT mode into the host perf framework. A similar
problem occurs with PEBS virtualization via DS as well.
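
To make the difference between the two modes concrete, here is a rough
user-space sketch of the behavior described above. It is only a model,
not kernel code: struct pt_model, model_vm_entry(),
model_guest_trace_sink() and the enum names are all hypothetical and
exist only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the two "pt_mode" settings; NOT kernel code. */
enum pt_mode { PT_MODE_SYSTEM, PT_MODE_HOST_GUEST };

/* Where trace packets produced while the guest runs end up. */
enum sink { SINK_NONE, SINK_HOST, SINK_GUEST };

struct pt_model {
        bool host_trace_en;   /* models the host's TraceEn bit */
        bool guest_trace_en;  /* models the guest's own TraceEn bit */
};

/*
 * Model of VM-entry. Returns true if the host trace had to be stopped.
 * SYSTEM mode: KVM does nothing, the host keeps tracing.
 * HOST_GUEST mode: the host trace is disabled before entering the guest.
 */
static bool model_vm_entry(struct pt_model *pt, enum pt_mode mode)
{
        assert(pt != NULL);

        if (mode == PT_MODE_SYSTEM || !pt->host_trace_en)
                return false;

        pt->host_trace_en = false;
        return true;
}

/*
 * Which buffer receives the trace generated while the guest runs.
 * SYSTEM mode: host and guest trace both go to the host buffer.
 * HOST_GUEST mode: guest trace goes to the guest buffer only, so a
 * system-wide host trace cannot see it.
 */
static enum sink model_guest_trace_sink(const struct pt_model *pt,
                                        enum pt_mode mode)
{
        assert(pt != NULL);

        if (mode == PT_MODE_SYSTEM)
                return pt->host_trace_en ? SINK_HOST : SINK_NONE;

        return pt->guest_trace_en ? SINK_GUEST : SINK_NONE;
}
```

In this model, with both host and guest tracing enabled, a HOST_GUEST
VM-entry clears host_trace_en and the sink becomes SINK_GUEST, which is
the "guest wins" case that host perf objected to.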

Thanks,
Luwei Kang

> From an outsider's perspective, that'd be an improvement irrespective of
> this bug fix, as 'vmx_on' is misleading, e.g. it can be %false when the
> CPU is post-VMXON, and really means "post-VMXON and Intel PT can't trace
> it".