Re: [PATCH v2 2/3] KVM: nVMX: add kvm_nested_vmlaunch_resume tracepoint

From: Sean Christopherson
Date: Thu Jan 14 2021 - 19:14:58 EST


On Thu, Jan 14, 2021, Maxim Levitsky wrote:
> This is very helpful for debugging nested VMX issues.
>
> Signed-off-by: Maxim Levitsky <mlevitsk@xxxxxxxxxx>
> ---
> arch/x86/kvm/trace.h | 30 ++++++++++++++++++++++++++++++
> arch/x86/kvm/vmx/nested.c | 6 ++++++
> arch/x86/kvm/x86.c | 1 +
> 3 files changed, 37 insertions(+)
>
> diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
> index 2de30c20bc264..663d1b1d8bf64 100644
> --- a/arch/x86/kvm/trace.h
> +++ b/arch/x86/kvm/trace.h
> @@ -554,6 +554,36 @@ TRACE_EVENT(kvm_nested_vmrun,
> __entry->npt ? "on" : "off")
> );
>
> +
> +/*
> + * Tracepoint for nested VMLAUNCH/VMRESUME

VM-Enter, as below.

> + */
> +TRACE_EVENT(kvm_nested_vmlaunch_resume,

s/vmlaunch_resume/vmenter, both for consistency with other code and so that it
can sanely be reused by SVM. IMO, trace_kvm_entry is wrong :-).
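
I.e. something along these lines, I think (untested, purely the rename, the
body is what's already in the patch):

	TRACE_EVENT(kvm_nested_vmenter,
		TP_PROTO(__u64 rip, __u64 vmcs, __u64 nested_rip,
			 __u32 entry_intr_info),
		TP_ARGS(rip, vmcs, nested_rip, entry_intr_info),

		TP_STRUCT__entry(
			__field(__u64, rip)
			__field(__u64, vmcs)
			__field(__u64, nested_rip)
			__field(__u32, entry_intr_info)
		),

		TP_fast_assign(
			__entry->rip = rip;
			__entry->vmcs = vmcs;
			__entry->nested_rip = nested_rip;
			__entry->entry_intr_info = entry_intr_info;
		),

		TP_printk("rip: 0x%016llx vmcs: 0x%016llx nrip: 0x%016llx entry_intr_info: 0x%08x",
			  __entry->rip, __entry->vmcs, __entry->nested_rip,
			  __entry->entry_intr_info)
	);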

> + TP_PROTO(__u64 rip, __u64 vmcs, __u64 nested_rip,
> + __u32 entry_intr_info),
> + TP_ARGS(rip, vmcs, nested_rip, entry_intr_info),
> +
> + TP_STRUCT__entry(
> + __field( __u64, rip )
> + __field( __u64, vmcs )
> + __field( __u64, nested_rip )
> + __field( __u32, entry_intr_info )
> + ),
> +
> + TP_fast_assign(
> + __entry->rip = rip;
> + __entry->vmcs = vmcs;
> + __entry->nested_rip = nested_rip;
> + __entry->entry_intr_info = entry_intr_info;
> + ),
> +
> + TP_printk("rip: 0x%016llx vmcs: 0x%016llx nrip: 0x%016llx "
> + "entry_intr_info: 0x%08x",
> + __entry->rip, __entry->vmcs, __entry->nested_rip,
> + __entry->entry_intr_info)
> +);
> +
> +
> TRACE_EVENT(kvm_nested_intercepts,
> TP_PROTO(__u16 cr_read, __u16 cr_write, __u32 exceptions,
> __u32 intercept1, __u32 intercept2, __u32 intercept3),
> diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
> index 776688f9d1017..cd51b66480d52 100644
> --- a/arch/x86/kvm/vmx/nested.c
> +++ b/arch/x86/kvm/vmx/nested.c
> @@ -3327,6 +3327,12 @@ enum nvmx_vmentry_status nested_vmx_enter_non_root_mode(struct kvm_vcpu *vcpu,
> !(vmcs12->vm_entry_controls & VM_ENTRY_LOAD_BNDCFGS))
> vmx->nested.vmcs01_guest_bndcfgs = vmcs_read64(GUEST_BNDCFGS);
>
> + trace_kvm_nested_vmlaunch_resume(kvm_rip_read(vcpu),

Hmm, won't this RIP be wrong for the migration case? I.e. it'll be the L2 RIP,
not the L1 RIP as it is on the "true" nested VM-Enter path.

> + vmx->nested.current_vmptr,
> + vmcs12->guest_rip,
> + vmcs12->vm_entry_intr_info_field);

The placement is a bit funky. I assume you put it here so that calls from
vmx_set_nested_state() also get traced. But that also means
vmx_pre_leave_smm() will get traced, and it creates some weirdness where some
nested VM-Enters that VM-Fail get traced, but others do not.

Tracing vmx_pre_leave_smm() isn't necessarily bad, but it could be confusing,
especially if the debugger looks up the RIP and sees RSM. Ditto for the
migration case.

Not sure what would be a good answer.
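
E.g. one option, if losing the vmx_set_nested_state() tracing is acceptable,
would be to trace only the architectural path, i.e. do the tracing in
nested_vmx_run() just before it calls nested_vmx_enter_non_root_mode().
Completely untested sketch (assumes the vmenter rename above):

	/*
	 * Trace only architectural VMLAUNCH/VMRESUME; RSM and nested state
	 * loads from userspace would no longer be traced.
	 */
	trace_kvm_nested_vmenter(kvm_rip_read(vcpu),
				 vmx->nested.current_vmptr,
				 get_vmcs12(vcpu)->guest_rip,
				 get_vmcs12(vcpu)->vm_entry_intr_info_field);

But whether that's the right tradeoff depends on how much you care about the
migration/RSM cases.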

> +
> +
> /*
> * Overwrite vmcs01.GUEST_CR3 with L1's CR3 if EPT is disabled *and*
> * nested early checks are disabled. In the event of a "late" VM-Fail,
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index a480804ae27a3..7c6e94e32100e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -11562,6 +11562,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq);
> EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault);
> EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_msr);
> EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_cr);
> +EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmlaunch_resume);
> EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmrun);
> EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmexit);
> EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_nested_vmexit_inject);
> --
> 2.26.2
>