Re: [PATCH v2 2/3] KVM: nVMX: add kvm_nested_vmlaunch_resume tracepoint

From: Sean Christopherson
Date: Fri Jan 15 2021 - 11:31:15 EST


On Fri, Jan 15, 2021, Paolo Bonzini wrote:
> On 15/01/21 01:14, Sean Christopherson wrote:
> > > + trace_kvm_nested_vmlaunch_resume(kvm_rip_read(vcpu),
> > Hmm, won't this RIP be wrong for the migration case? I.e. it'll be L2, not L1
> > as is the case for the "true" nested VM-Enter path.
>
> It will be the previous RIP---might as well be 0xfffffff0 depending on what
> userspace does. I don't think you can do much better than that, using
> vmcs12->host_rip would be confusing in the SMM case.
>
> > > + vmx->nested.current_vmptr,
> > > + vmcs12->guest_rip,
> > > + vmcs12->vm_entry_intr_info_field);
> > The placement is a bit funky. I assume you put it here so that calls from
> > vmx_set_nested_state() also get traced. But, that also means
> > vmx_pre_leave_smm() will get traced, and it also creates some weirdness where
> > some nested VM-Enters that VM-Fail will get traced, but others will not.
> >
> > Tracing vmx_pre_leave_smm() isn't necessarily bad, but it could be confusing,
> > especially if the debugger looks up the RIP and sees RSM. Ditto for the
> > migration case.
>
> Actually tracing vmx_pre_leave_smm() is good, and pointing to RSM makes
> sense so I'm not worried about that.

Ideally there would be something in the tracepoint to differentiate the various
cases. Not that the RSM/migration cases will pop up often, but I think it's an
easily solved problem that could avoid confusion.

What if we captured vmx->nested.smm.guest_mode and from_vmentry, and explicitly
recorded what triggered the entry?

	TP_printk("from: %s rip: 0x%016llx vmcs: 0x%016llx nrip: 0x%016llx intr_info: 0x%08x",
		  __entry->vmenter ? "VM-Enter" : __entry->smm ? "RSM" : "SET_STATE",
		  __entry->rip, __entry->vmcs, __entry->nested_rip,
		  __entry->entry_intr_info)
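
To make the intended output concrete, here is a rough userspace sketch of the same selection and formatting logic. It is not the kernel tracepoint itself: the struct nested_entry_trace, nested_entry_source() and format_nested_entry() names are made up for illustration, and the fields simply mirror the __entry members used in the TP_printk() above.

```c
#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the tracepoint's __entry fields. */
struct nested_entry_trace {
	uint64_t rip;             /* RIP at the time of the trace */
	uint64_t vmcs;            /* vmx->nested.current_vmptr */
	uint64_t nested_rip;      /* vmcs12->guest_rip */
	uint32_t entry_intr_info; /* vmcs12->vm_entry_intr_info_field */
	bool vmenter;             /* from_vmentry */
	bool smm;                 /* vmx->nested.smm.guest_mode */
};

/* Same precedence as the proposed TP_printk(): VM-Enter, then RSM, else SET_STATE. */
static const char *nested_entry_source(const struct nested_entry_trace *e)
{
	return e->vmenter ? "VM-Enter" : e->smm ? "RSM" : "SET_STATE";
}

/* Formats one trace line into buf; returns what snprintf() returns. */
static int format_nested_entry(char *buf, size_t len,
			       const struct nested_entry_trace *e)
{
	return snprintf(buf, len,
			"from: %s rip: 0x%016" PRIx64 " vmcs: 0x%016" PRIx64
			" nrip: 0x%016" PRIx64 " intr_info: 0x%08" PRIx32,
			nested_entry_source(e), e->rip, e->vmcs,
			e->nested_rip, e->entry_intr_info);
}
```

With vmenter clear and smm set, the line starts with "from: RSM", so a reader who looks up the RIP and finds RSM has the explanation right there in the trace. With both clear it reports "SET_STATE", i.e. the migration case.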

Side topic, can we have an "official" ruling on whether KVM tracepoints should
use colons and/or commas? And probably the same question for whether or not to
prepend zeros. E.g. kvm_entry has "vcpu %u, rip 0x%lx" versus "rip: 0x%016llx
vmcs: 0x%016llx". It bugs me that we're so inconsistent.