Re: [PATCH 04/11] KVM: x86: Disable MCE related stuff for TDX

From: Sean Christopherson
Date: Fri Nov 12 2021 - 12:01:36 EST


On Fri, Nov 12, 2021, Xiaoyao Li wrote:
> From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
>
> MCE is not supported for TDX VM and KVM cannot inject #MC to TDX VM.
>
> Introduce kvm_guest_mce_disallowed() which actually reports the MCE
> availability based on vm_type. And use it to guard all the MCE related
> CAPs and IOCTLs.
>
> Note: KVM_X86_GET_MCE_CAP_SUPPORTED is KVM scope so that what it reports
> may not match the behavior of specific VM (e.g., here for TDX VM). The
> same for KVM_CAP_MCE when queried from /dev/kvm. To query the precise
> KVM_CAP_MCE of the VM, userspace should use the VM's fd.
>
> [ Xiaoyao: Guard MCE related CAPs ]
>
> Co-developed-by: Kai Huang <kai.huang@xxxxxxxxxxxxxxx>
> Signed-off-by: Kai Huang <kai.huang@xxxxxxxxxxxxxxx>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
> ---
> arch/x86/kvm/x86.c | 10 ++++++++++
> arch/x86/kvm/x86.h | 5 +++++
> 2 files changed, 15 insertions(+)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index b02088343d80..2b21c5169f32 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4150,6 +4150,8 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> break;
> case KVM_CAP_MCE:
> r = KVM_MAX_MCE_BANKS;
> + if (kvm)
> + r = kvm_guest_mce_disallowed(kvm) ? 0 : r;

r = KVM_MAX_MCE_BANKS;
if (kvm && kvm_guest_mce_disallowed(kvm))
r = 0;

or

r = (kvm && kvm_guest_mce_disallowed(kvm)) ? 0 : KVM_MAX_MCE_BANKS;
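
For what it's worth, here is a standalone userspace sketch (not kernel code)
showing that the patch's form and the two alternatives above return the same
value for a NULL kvm, a regular VM and a TDX VM. Everything prefixed with
mock_/MOCK_ is a stand-in invented for illustration, including the bank count
and the vm_type values.

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in constants; the real values come from the kernel headers. */
#define MOCK_MAX_MCE_BANKS 32
enum { MOCK_VM_DEFAULT = 0, MOCK_VM_TDX = 1 };

/* Minimal stand-in for struct kvm, carrying only the VM type. */
struct mock_kvm {
	int vm_type;
};

static bool mock_mce_disallowed(const struct mock_kvm *kvm)
{
	return kvm->vm_type == MOCK_VM_TDX;
}

/* Form used in the patch: nested ternary under an if. */
static int cap_mce_patch(const struct mock_kvm *kvm)
{
	int r = MOCK_MAX_MCE_BANKS;

	if (kvm)
		r = mock_mce_disallowed(kvm) ? 0 : r;
	return r;
}

/* First alternative: default value, then a single combined check. */
static int cap_mce_if(const struct mock_kvm *kvm)
{
	int r = MOCK_MAX_MCE_BANKS;

	if (kvm && mock_mce_disallowed(kvm))
		r = 0;
	return r;
}

/* Second alternative: one ternary, no intermediate assignment. */
static int cap_mce_ternary(const struct mock_kvm *kvm)
{
	return (kvm && mock_mce_disallowed(kvm)) ? 0 : MOCK_MAX_MCE_BANKS;
}
```

The NULL case corresponds to the query coming through /dev/kvm rather than a
VM fd, where the system-wide value is reported.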

> break;
> case KVM_CAP_XCRS:
> r = boot_cpu_has(X86_FEATURE_XSAVE);
> @@ -5155,6 +5157,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
> case KVM_X86_SETUP_MCE: {
> u64 mcg_cap;
>
> + r = -EINVAL;
> + if (kvm_guest_mce_disallowed(vcpu->kvm))
> + goto out;
> +
> r = -EFAULT;
> if (copy_from_user(&mcg_cap, argp, sizeof(mcg_cap)))
> goto out;
> @@ -5164,6 +5170,10 @@ long kvm_arch_vcpu_ioctl(struct file *filp,
> case KVM_X86_SET_MCE: {
> struct kvm_x86_mce mce;
>
> + r = -EINVAL;
> + if (kvm_guest_mce_disallowed(vcpu->kvm))
> + goto out;
> +
> r = -EFAULT;
> if (copy_from_user(&mce, argp, sizeof(mce)))
> goto out;
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index a2813892740d..69c60297bef2 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -441,6 +441,11 @@ static __always_inline bool kvm_irq_injection_disallowed(struct kvm_vcpu *vcpu)
> return vcpu->kvm->arch.vm_type == KVM_X86_TDX_VM;
> }
>
> +static __always_inline bool kvm_guest_mce_disallowed(struct kvm *kvm)

The "guest" part is potentially confusing and inconsistent with e.g.
kvm_irq_injection_disallowed. And given the current ridiculous spec, CR4.MCE=1
is _required_, so saying "mce disallowed" is arguably wrong from that perspective.

kvm_mce_injection_disallowed() would be more appropriate.
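
A rough userspace sketch of how the rename might read at a call site. The
mock_ types, the MOCK_ vm_type values and mock_setup_mce() are hypothetical
stand-ins for illustration, not the kernel's definitions; only the helper
names follow the patch and the suggestion above.

```c
#include <errno.h>
#include <stdbool.h>

/* Stand-in VM types; the real values are defined by the series. */
enum { MOCK_VM_DEFAULT = 0, MOCK_VM_TDX = 1 };

/* Minimal stand-ins for struct kvm and struct kvm_vcpu. */
struct mock_kvm {
	int vm_type;
};

struct mock_vcpu {
	struct mock_kvm *kvm;
};

/* Existing helper from the series, shown for naming comparison. */
static inline bool kvm_irq_injection_disallowed(struct mock_vcpu *vcpu)
{
	return vcpu->kvm->vm_type == MOCK_VM_TDX;
}

/* Suggested name: describes what KVM cannot do (inject #MC). */
static inline bool kvm_mce_injection_disallowed(struct mock_kvm *kvm)
{
	return kvm->vm_type == MOCK_VM_TDX;
}

/* Hypothetical guard shaped like the KVM_X86_SETUP_MCE hunk. */
static int mock_setup_mce(struct mock_vcpu *vcpu)
{
	if (kvm_mce_injection_disallowed(vcpu->kvm))
		return -EINVAL;

	/* The real ioctl would copy mcg_cap from userspace here. */
	return 0;
}
```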

> +{
> + return kvm->arch.vm_type == KVM_X86_TDX_VM;
> +}
> +
> void kvm_load_guest_xsave_state(struct kvm_vcpu *vcpu);
> void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu);
> int kvm_spec_ctrl_test_value(u64 value);
> --
> 2.27.0
>