Re: [PATCH 04/11] KVM: x86: Disable MCE related stuff for TDX

From: Xiaoyao Li
Date: Mon Nov 15 2021 - 10:40:54 EST


On 11/13/2021 1:01 AM, Sean Christopherson wrote:
On Fri, Nov 12, 2021, Xiaoyao Li wrote:
From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>

MCE is not supported for TDX VMs, and KVM cannot inject #MC into a TDX VM.

Introduce kvm_guest_mce_disallowed(), which reports whether MCE is
unavailable based on vm_type, and use it to guard all the MCE-related
CAPs and IOCTLs.

Note: KVM_X86_GET_MCE_CAP_SUPPORTED is KVM scope, so what it reports
may not match the behavior of a specific VM (e.g., a TDX VM here). The
same holds for KVM_CAP_MCE when queried from /dev/kvm. To query the precise
KVM_CAP_MCE of a VM, userspace should use the VM's fd.

[ Xiaoyao: Guard MCE related CAPs ]

Co-developed-by: Kai Huang <kai.huang@xxxxxxxxxxxxxxx>
Signed-off-by: Kai Huang <kai.huang@xxxxxxxxxxxxxxx>
Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
Signed-off-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
---
arch/x86/kvm/x86.c | 10 ++++++++++
arch/x86/kvm/x86.h | 5 +++++
2 files changed, 15 insertions(+)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index b02088343d80..2b21c5169f32 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4150,6 +4150,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
break;
case KVM_CAP_MCE:
r = KVM_MAX_MCE_BANKS;
+ if (kvm)
+ r = kvm_guest_mce_disallowed(kvm) ? 0 : r;

r = KVM_MAX_MCE_BANKS;
if (kvm && kvm_guest_mce_disallowed(kvm))
r = 0;

or

r = (kvm && kvm_guest_mce_disallowed(kvm)) ? 0 : KVM_MAX_MCE_BANKS;

I will use this one in the next submission.
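
For reference, a minimal userspace sketch of the three forms discussed above, to show that they return the same value for every input. The struct layouts, the vm_type values and the mce_cap_*() names are stand-ins invented for this illustration, not the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for kernel definitions; the values are illustrative only. */
#define KVM_MAX_MCE_BANKS 32
enum { KVM_X86_LEGACY_VM = 0, KVM_X86_TDX_VM = 1 };

struct kvm_arch { unsigned long vm_type; };
struct kvm { struct kvm_arch arch; };

static inline bool kvm_guest_mce_disallowed(const struct kvm *kvm)
{
	return kvm->arch.vm_type == KVM_X86_TDX_VM;
}

/* Form used in the posted patch. */
int mce_cap_patch(const struct kvm *kvm)
{
	int r = KVM_MAX_MCE_BANKS;

	if (kvm)
		r = kvm_guest_mce_disallowed(kvm) ? 0 : r;
	return r;
}

/* First suggested form: plain if statement. */
int mce_cap_if(const struct kvm *kvm)
{
	int r = KVM_MAX_MCE_BANKS;

	if (kvm && kvm_guest_mce_disallowed(kvm))
		r = 0;
	return r;
}

/* Second suggested form: single ternary. */
int mce_cap_ternary(const struct kvm *kvm)
{
	return (kvm && kvm_guest_mce_disallowed(kvm)) ? 0 : KVM_MAX_MCE_BANKS;
}
```

A NULL kvm models the query coming from /dev/kvm, where no VM exists yet and the full bank count is reported.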

break;
case KVM_CAP_XCRS:
r = boot_cpu_has(X86_FEATURE_XSAVE);
@@ -5155,6 +5157,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
case KVM_X86_SETUP_MCE: {
u64 mcg_cap;
+ r = -EINVAL;
+ if (kvm_guest_mce_disallowed(vcpu->kvm))
+ goto out;
+
r = -EFAULT;
if (copy_from_user(&mcg_cap, argp, sizeof(mcg_cap)))
goto out;
@@ -5164,6 +5170,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
case KVM_X86_SET_MCE: {
struct kvm_x86_mce mce;
+ r = -EINVAL;
+ if (kvm_guest_mce_disallowed(vcpu->kvm))
+ goto out;
+
r = -EFAULT;
if (copy_from_user(&mce, argp, sizeof(mce)))
goto out;
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index a2813892740d..69c60297bef2 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -441,6 +441,11 @@ static __always_inline bool kvm_irq_injection_disallowed(struct kvm_vcpu *vcpu)
return vcpu->kvm->arch.vm_type == KVM_X86_TDX_VM;
}
+static __always_inline bool kvm_guest_mce_disallowed(struct kvm *kvm)

The "guest" part is potentially confusing and inconsistent with e.g.
kvm_irq_injection_disallowed. And given the current ridiculous spec, CR4.MCE=1
is _required_, so saying "mce disallowed" is arguably wrong from that perspective.

kvm_mce_injection_disallowed() would be more appropriate.

Good advice, I'll rename it to that.
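
A rough sketch of how the renamed helper and the ioctl guard could look, written as a standalone userspace model. The struct layouts and vm_type values are placeholders, and mce_ioctl_precheck() is a hypothetical name for the check done at the top of KVM_X86_SETUP_MCE and KVM_X86_SET_MCE; it returns a negative errno, as kernel ioctl handlers conventionally do.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Placeholder definitions; not the kernel's actual layouts or values. */
enum { KVM_X86_LEGACY_VM = 0, KVM_X86_TDX_VM = 1 };

struct kvm_arch { unsigned long vm_type; };
struct kvm { struct kvm_arch arch; };
struct kvm_vcpu { struct kvm *kvm; };

/* Renamed helper, matching kvm_irq_injection_disallowed() in style. */
static inline bool kvm_mce_injection_disallowed(const struct kvm *kvm)
{
	return kvm->arch.vm_type == KVM_X86_TDX_VM;
}

/*
 * Hypothetical model of the guard in the MCE vCPU ioctls:
 * reject the request before touching any userspace buffer.
 */
int mce_ioctl_precheck(const struct kvm_vcpu *vcpu)
{
	if (kvm_mce_injection_disallowed(vcpu->kvm))
		return -EINVAL;
	return 0;
}
```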

+{
+ return kvm->arch.vm_type == KVM_X86_TDX_VM;
+}
+
void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu);
void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu);
int kvm_spec_ctrl_test_value(u64 value);
--
2.27.0