Re: [PATCH 0/2] KVM: SVM: Fix unexpected #UD on INT3 in SEV guests

From: Sean Christopherson
Date: Wed Aug 23 2023 - 17:13:10 EST


On Wed, Aug 23, 2023, Tom Lendacky wrote:
> On 8/23/23 15:03, Sean Christopherson wrote:
> > I think the best option is to add a "temporary" patch so that the fix for @stable
> > is short, sweet, and safe, and then do the can_emulate_instruction() cleanup that
> > I was avoiding.
> >
> > E.g. this as patch 2/4 (or maybe 2/5) of this series:
>
> 2/4 or 2/5? Do you mean 2/3 since there were only 2 patches in the series?

I am planning on adding more patches in v2 to cleanup the hack-a-fix. But after
writing the code, I think it's best to sqaush the hack-a-fix in with patch 1
(new full patch at the bottom, though it should be the same result as the earlier
delta patch).

> I'll apply the below patch in between patches 1 and 2 and re-test. Should
> have results in a week :)

Heh, maybe if we send enough emails we can get Wu's feedback before then :-)

One idea to make the original bug repro on every run would be to constantly
toggle nx_huge_pages between "off" and "force" while the guest is booting. Toggling
nx_huge_pages should force KVM to rebuild the SPTEs and all but guarantee trying
to deliver the #BP will hit a #NPF.

--
From: Sean Christopherson <seanjc@xxxxxxxxxx>
Date: Tue, 8 Aug 2023 17:18:42 -0700
Subject: [PATCH 1/4] KVM: SVM: Don't inject #UD if KVM attempts to skip SEV
guest insn

Don't inject a #UD if KVM attempts to "emulate" to skip an instruction
for an SEV guest, and instead resume the guest and hope that it can make
forward progress. When commit 04c40f344def ("KVM: SVM: Inject #UD on
attempted emulation for SEV guest w/o insn buffer") added the completely
arbitrary #UD behavior, there were no known scenarios where a well-behaved
guest would induce a VM-Exit that triggered emulation, i.e. it was thought
that injecting #UD would be helpful.

However, now that KVM (correctly) attempts to re-inject INT3/INTO, e.g. if
a #NPF is encountered when attempting to deliver the INT3/INTO, an SEV
guest can trigger emulation without a buffer, through no fault of its own.
Resuming the guest and retrying the INT3/INTO is architecturally wrong,
e.g. the vCPU will incorrectly re-hit code #DBs, but for SEV guests there
is literally no other option that has a chance of making forward progress.

Drop the #UD injection for all "skip" emulation, not just those related to
INT3/INTO, even though that means that the guest will likely end up in an
infinite loop instead of getting a #UD (the vCPU may also crash, e.g. if
KVM emulated everything about an instruction except for advancing RIP).
There's no evidence that suggests that an unexpected #UD is actually
better than hanging the vCPU, e.g. a soft-hung vCPU can still respond to
IRQs and NMIs to generate a backtrace.

Reported-by: Wu Zongyo <wuzongyo@xxxxxxxxxxxxxxxx>
Closes: https://lore.kernel.org/all/8eb933fd-2cf3-d7a9-32fe-2a1d82eac42a@xxxxxxxxxxxxxxxx
Fixes: 6ef88d6e36c2 ("KVM: SVM: Re-inject INT3/INTO instead of retrying the instruction")
Cc: stable@xxxxxxxxxxxxxxx
Cc: Tom Lendacky <thomas.lendacky@xxxxxxx>
Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
arch/x86/kvm/svm/svm.c | 35 +++++++++++++++++++++++++++--------
1 file changed, 27 insertions(+), 8 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 212706d18c62..f6adc43b315a 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -364,6 +364,8 @@ static void svm_set_interrupt_shadow(struct kvm_vcpu *vcpu, int mask)
svm->vmcb->control.int_state |= SVM_INTERRUPT_SHADOW_MASK;

}
+static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
+ void *insn, int insn_len);

static int __svm_skip_emulated_instruction(struct kvm_vcpu *vcpu,
bool commit_side_effects)
@@ -384,6 +386,14 @@ static int __svm_skip_emulated_instruction(struct kvm_vcpu *vcpu,
}

if (!svm->next_rip) {
+ /*
+ * FIXME: Drop this when kvm_emulate_instruction() does the
+ * right thing and treats "can't emulate" as outright failure
+ * for EMULTYPE_SKIP.
+ */
+ if (!svm_can_emulate_instruction(vcpu, EMULTYPE_SKIP, NULL, 0))
+ return 0;
+
if (unlikely(!commit_side_effects))
old_rflags = svm->vmcb->save.rflags;

@@ -4725,16 +4735,25 @@ static bool svm_can_emulate_instruction(struct kvm_vcpu *vcpu, int emul_type,
* and cannot be decrypted by KVM, i.e. KVM would read cyphertext and
* decode garbage.
*
- * Inject #UD if KVM reached this point without an instruction buffer.
- * In practice, this path should never be hit by a well-behaved guest,
- * e.g. KVM doesn't intercept #UD or #GP for SEV guests, but this path
- * is still theoretically reachable, e.g. via unaccelerated fault-like
- * AVIC access, and needs to be handled by KVM to avoid putting the
- * guest into an infinite loop. Injecting #UD is somewhat arbitrary,
- * but its the least awful option given lack of insight into the guest.
+ * If KVM is NOT trying to simply skip an instruction, inject #UD if
+ * KVM reached this point without an instruction buffer. In practice,
+ * this path should never be hit by a well-behaved guest, e.g. KVM
+ * doesn't intercept #UD or #GP for SEV guests, but this path is still
+ * theoretically reachable, e.g. via unaccelerated fault-like AVIC
+ * access, and needs to be handled by KVM to avoid putting the guest
+ * into an infinite loop. Injecting #UD is somewhat arbitrary, but
+ * its the least awful option given lack of insight into the guest.
+ *
+ * If KVM is trying to skip an instruction, simply resume the guest.
+ * If a #NPF occurs while the guest is vectoring an INT3/INTO, then KVM
+ * will attempt to re-inject the INT3/INTO and skip the instruction.
+ * In that scenario, retrying the INT3/INTO and hoping the guest will
+ * make forward progress is the only option that has a chance of
+ * success (and in practice it will work the vast majority of the time).
*/
if (unlikely(!insn)) {
- kvm_queue_exception(vcpu, UD_VECTOR);
+ if (!(emul_type & EMULTYPE_SKIP))
+ kvm_queue_exception(vcpu, UD_VECTOR);
return false;
}


base-commit: 240f736891887939571854bd6d734b6c9291f22e
--