Re: [PATCH v2] KVM: x86: nSVM/nVMX: Fix handling triple fault on RSM instruction

From: Paolo Bonzini
Date: Thu Feb 08 2024 - 12:46:31 EST

On Thu, Feb 8, 2024 at 2:18 PM Wilczynski, Michal
<michal.wilczynski@xxxxxxxxx> wrote:
> Hi, I've tested the patch and it seems to work, both on Intel and AMD.
> There was a problem with applying this chunk though:
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index ac8b7614e79d..3d18fa7db353 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -119,7 +119,8 @@ KVM_X86_OP(setup_mce)
> KVM_X86_OP(smi_allowed)
> KVM_X86_OP() // <- This shouldn't be there I guess ?
> -KVM_X86_OP(leave_smm)
> +KVM_X86_OP(leave_smm_prepare)
> +KVM_X86_OP(leave_smm_commit)
> KVM_X86_OP(enable_smi_window)
> #endif
> KVM_X86_OP_OPTIONAL(dev_get_attr)
> Anyway, I was a bit averse to this approach, as I noticed in the git log
> that callbacks like e.g. post_leave_smm() used to exist but were later
> removed, so I thought the maintainers don't like introducing extra
> callbacks.

If they are needed, it's fine. In my opinion a new callback is easier
to handle and understand than new state.
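
To illustrate the callback-versus-state point, here is a rough standalone
sketch of a two-phase prepare/commit split. Only the callback names
leave_smm_prepare and leave_smm_commit come from the hunk quoted above;
the signatures, the toy_* types, and the split of responsibilities
(prepare may fail, commit may not) are assumptions made for illustration,
not the actual patch.

```c
#include <stdbool.h>

/* Toy model: the toy_* names are hypothetical, not real KVM symbols. */
struct toy_vcpu {
	bool smram_valid;	/* stand-in for "saved state passes validation" */
	int commits;		/* how many times the commit hook ran */
};

struct toy_x86_ops {
	/* Assumed contract: validate only, return non-zero on failure. */
	int (*leave_smm_prepare)(struct toy_vcpu *v);
	/* Assumed contract: cannot fail, runs only after a successful prepare. */
	void (*leave_smm_commit)(struct toy_vcpu *v);
};

static int toy_prepare(struct toy_vcpu *v)
{
	return v->smram_valid ? 0 : -1;
}

static void toy_commit(struct toy_vcpu *v)
{
	v->commits++;
}

static const struct toy_x86_ops toy_ops = {
	.leave_smm_prepare = toy_prepare,
	.leave_smm_commit = toy_commit,
};

/*
 * The ordering lives in the control flow here: there is no extra flag
 * recording "RSM in progress" that later code would have to check and
 * clear.
 */
static int toy_emulate_rsm(struct toy_vcpu *v, const struct toy_x86_ops *ops)
{
	int ret = ops->leave_smm_prepare(v);

	if (ret)
		return ret;	/* caller would handle the failure */

	ops->leave_smm_commit(v);
	return 0;
}
```

In this toy version a failed prepare leaves the vCPU untouched, which is
the property that would otherwise have to be tracked with new state.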

> > 2) otherwise, if the problem is that we have not gone through the
> > vmenter yet, then KVM needs to do that and _then_ inject the triple
> > fault. The fix is to merge the .triple_fault and .check_nested_events
> > callbacks, with something like the second attached patch - which
> > probably has so many problems that I haven't even tried to compile it.
> Well, in this case, if we know that RSM will fail, it doesn't seem to me
> like it makes sense to run vmenter just to kill the VM anyway; this would
> be more confusing.

Note that the triple fault must not kill the VM, it's just causing a
nested vmexit from L2 to L1. KVM's algorithm to inject a
vmexit-causing event is always to first ensure that the VMCS02 (VMCB02
for AMD) is consistent, and only then trigger the vmexit. So if patch
2 or something like it works, that would be even better.
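
As a rough standalone model of that ordering (consistent L2 state first,
vmexit to L1 second), something like the following. All toy_* names are
made up for this sketch and nothing here is a real KVM symbol; the
nested_run_pending field only mirrors the name used in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names are real KVM symbols. */
enum toy_exit_reason { TOY_EXIT_NONE = 0, TOY_EXIT_TRIPLE_FAULT = 1 };

struct toy_vcpu {
	bool in_l2;			/* vCPU is in the nested guest */
	bool nested_run_pending;	/* the L2 entry has not completed yet */
	bool triple_fault_pending;	/* a triple fault is queued for L2 */
	bool l2_state_consistent;	/* stand-in for a consistent VMCS02/VMCB02 */
	enum toy_exit_reason l1_exit_reason;	/* what L1 ends up seeing */
};

/* Stand-in for actually completing the pending vmenter. */
static void toy_complete_vmenter(struct toy_vcpu *v)
{
	v->l2_state_consistent = true;
	v->nested_run_pending = false;
}

/* Reflect the exit to L1; only legal once the L2 state is consistent. */
static void toy_nested_vmexit(struct toy_vcpu *v, enum toy_exit_reason reason)
{
	assert(v->l2_state_consistent);
	v->in_l2 = false;
	v->l1_exit_reason = reason;
}

/* Returns true if a vmexit was delivered to L1. */
static bool toy_check_nested_events(struct toy_vcpu *v)
{
	if (!v->in_l2 || !v->triple_fault_pending)
		return false;

	/* Not yet: the entry must complete before anything is injected. */
	if (v->nested_run_pending)
		return false;

	v->triple_fault_pending = false;
	toy_nested_vmexit(v, TOY_EXIT_TRIPLE_FAULT);
	return true;
}
```

The triple fault stays queued while the entry is pending and is turned
into an exit to L1 only afterwards; the VM itself keeps running.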

> I've made the fix this way based on our discussion with Sean in v1, and
> tried to mark the RSM instruction with a flag, as one that needs an
> actual HW VMenter to complete successfully, and based on that information
> manipulate nested_run_pending.

I understand, apologies for not noticing v1. Let's wait for Sean's opinion.