Re: [PATCH v12 5/6] KVM: x86: Introduce vm_type to differentiate default VMs from confidential VMs

From: Zhi Wang
Date: Tue Apr 11 2023 - 06:09:22 EST


On Tue, 28 Feb 2023 17:19:15 -0800
isaku.yamahata@xxxxxxxxx wrote:

> From: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
>
> Unlike default VMs, confidential VMs (Intel TDX and AMD SEV-ES) don't allow
> some operations (e.g., memory read/write, register state access, etc).
>
> Introduce vm_type to track the type of the VM to x86 KVM. Other arch KVMs
> already use vm_type, KVM_INIT_VM accepts vm_type, and x86 KVM callback
> vm_init accepts vm_type. So follow them. Further, a different policy can
> be made based on vm_type. Define KVM_X86_DEFAULT_VM for default VM as
> default and define KVM_X86_TDX_VM for Intel TDX VM. The wrapper function
> will be defined as "bool is_td(kvm) { return vm_type == VM_TYPE_TDX; }"
>

Where is the KVM_X86_TDX_VM? It seems the comments are out of date. I guess
KVM_X86_PROTECTED_VM means a generic CC VM now, not specifically to SNP
or TDX.

Is it possible to have an additional vendor (TDX/SNP) VM type besides
KVM_X86_PROTECTED_VM? Although QEMU knows if SEV driver is existing or not
in a system by checking "/dev/sev", the only way it can know if KVM supports
SNP is to check KVM_X86_PROTECTED_VM through the KVM_CAP_VM_TYPES. For TDX,
QEMU only sees KVM_X86_PROTECTED_VM is set and !SEV_DRIVER. This doesn't
seems very clear to QEMU.

Is it better to split the bits in vm_type into two parts of bit fields: a.
generic part: (KVM_X86_{DEFAULT,PROTECTED}_VM). b. vendor part:
KVM_X86_{TDX,SNP}_PROTECTED_VM?

The KVM can still use KVM_X86_PROTECTED_VM in the code flow to deal with non-
vendor specific matter.

When QEMU queries the KVM_CAP_VM_TYPES, besides checking the vm_type in kvm_x86_is_vm_type_supported, KVM also let the vendor callback to set the
KVM_X86_{TDX,SNP}_PROTECTED_VM in the vendor part. Then QEMU would receive a
cap return value with (KVM_X86_PROTECTED_VM | KVM_X86_{TDX,SNP}_PROTECTED_VM)
and immediately know which bunch of the ioctls {TDX/SNP} are available in KVM.

> Add a capability KVM_CAP_VM_TYPES to effectively allow device model,
> e.g. qemu, to query what VM types are supported by KVM. This (introduce a
> new capability and add vm_type) is chosen to align with other arch KVMs
> that have VM types already. Other arch KVMs use different names to query
> supported vm types and there is no common name for it, so new name was
> chosen.
>
> Co-developed-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
> Signed-off-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
> Signed-off-by: Sean Christopherson <sean.j.christopherson@xxxxxxxxx>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> Reviewed-by: Paolo Bonzini <pbonzini@xxxxxxxxxx>
> ---
> Documentation/virt/kvm/api.rst | 4 +++-
> arch/x86/include/asm/kvm-x86-ops.h | 1 +
> arch/x86/include/asm/kvm_host.h | 1 +
> arch/x86/kvm/svm/svm.c | 7 +++++++
> arch/x86/kvm/vmx/main.c | 1 +
> arch/x86/kvm/vmx/vmx.c | 5 +++++
> arch/x86/kvm/vmx/x86_ops.h | 1 +
> arch/x86/kvm/x86.c | 8 +++++++-
> arch/x86/kvm/x86.h | 2 ++
> tools/arch/x86/include/uapi/asm/kvm.h | 3 +++
> tools/include/uapi/linux/kvm.h | 1 +
> 11 files changed, 32 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
> index 191aabc3af8c..fbff5cd6e404 100644
> --- a/Documentation/virt/kvm/api.rst
> +++ b/Documentation/virt/kvm/api.rst
> @@ -150,7 +150,9 @@ You probably want to use 0 as machine type.
> X86:
> ^^^^
>
> -Supported X86 VM types can be queried via KVM_CAP_VM_TYPES.
> +Supported X86 VM types can be queried via KVM_CAP_VM_TYPES, which returns the
> +bitmap of supported vm types. The 1-setting of bit @n means vm type with value
> +@n is supported.
>
> S390:
> ^^^^^
> diff --git a/arch/x86/include/asm/kvm-x86-ops.h b/arch/x86/include/asm/kvm-x86-ops.h
> index 8dc345cc6318..eac4b65d1b01 100644
> --- a/arch/x86/include/asm/kvm-x86-ops.h
> +++ b/arch/x86/include/asm/kvm-x86-ops.h
> @@ -20,6 +20,7 @@ KVM_X86_OP(hardware_disable)
> KVM_X86_OP(hardware_unsetup)
> KVM_X86_OP(has_emulated_msr)
> KVM_X86_OP(vcpu_after_set_cpuid)
> +KVM_X86_OP(is_vm_type_supported)
> KVM_X86_OP(vm_init)
> KVM_X86_OP_OPTIONAL(vm_destroy)
> KVM_X86_OP_OPTIONAL_RET0(vcpu_precreate)
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 8344945dece3..ffb85c35cacc 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1546,6 +1546,7 @@ struct kvm_x86_ops {
> bool (*has_emulated_msr)(struct kvm *kvm, u32 index);
> void (*vcpu_after_set_cpuid)(struct kvm_vcpu *vcpu);
>
> + bool (*is_vm_type_supported)(unsigned long vm_type);
> unsigned int vm_size;
> int (*vm_init)(struct kvm *kvm);
> void (*vm_destroy)(struct kvm *kvm);
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 8ed7e177e73d..d0b01956e420 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4682,6 +4682,12 @@ static void svm_vm_destroy(struct kvm *kvm)
> sev_vm_destroy(kvm);
> }
>
> +static bool svm_is_vm_type_supported(unsigned long type)
> +{
> + /* FIXME: Check if CPU is capable of SEV. */
> + return __kvm_is_vm_type_supported(type);
> +}
> +
> static int svm_vm_init(struct kvm *kvm)
> {
> if (!pause_filter_count || !pause_filter_thresh)
> @@ -4710,6 +4716,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
> .vcpu_free = svm_vcpu_free,
> .vcpu_reset = svm_vcpu_reset,
>
> + .is_vm_type_supported = svm_is_vm_type_supported,
> .vm_size = sizeof(struct kvm_svm),
> .vm_init = svm_vm_init,
> .vm_destroy = svm_vm_destroy,
> diff --git a/arch/x86/kvm/vmx/main.c b/arch/x86/kvm/vmx/main.c
> index d21a7c7d18ea..e1bbe06517b7 100644
> --- a/arch/x86/kvm/vmx/main.c
> +++ b/arch/x86/kvm/vmx/main.c
> @@ -45,6 +45,7 @@ struct kvm_x86_ops vt_x86_ops __initdata = {
> .hardware_disable = vmx_hardware_disable,
> .has_emulated_msr = vmx_has_emulated_msr,
>
> + .is_vm_type_supported = vmx_is_vm_type_supported,
> .vm_size = sizeof(struct kvm_vmx),
> .vm_init = vmx_vm_init,
> .vm_destroy = vmx_vm_destroy,
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index bddbdd2988f4..5bfdfc6f2190 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7470,6 +7470,11 @@ int vmx_vcpu_create(struct kvm_vcpu *vcpu)
> return err;
> }
>
> +bool vmx_is_vm_type_supported(unsigned long type)
> +{
> + return type == KVM_X86_DEFAULT_VM;
> +}
> +
> #define L1TF_MSG_SMT "L1TF CPU bug present and SMT on, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n"
> #define L1TF_MSG_L1D "L1TF CPU bug present and virtualization mitigation disabled, data leak possible. See CVE-2018-3646 and https://www.kernel.org/doc/html/latest/admin-guide/hw-vuln/l1tf.html for details.\n"
>
> diff --git a/arch/x86/kvm/vmx/x86_ops.h b/arch/x86/kvm/vmx/x86_ops.h
> index 0f200aead411..e4dae9842550 100644
> --- a/arch/x86/kvm/vmx/x86_ops.h
> +++ b/arch/x86/kvm/vmx/x86_ops.h
> @@ -32,6 +32,7 @@ void vmx_hardware_unsetup(void);
> int vmx_check_processor_compat(void);
> int vmx_hardware_enable(void);
> void vmx_hardware_disable(void);
> +bool vmx_is_vm_type_supported(unsigned long type);
> int vmx_vm_init(struct kvm *kvm);
> void vmx_vm_destroy(struct kvm *kvm);
> int vmx_vcpu_precreate(struct kvm *kvm);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 45330273bad6..589844a27349 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4352,12 +4352,18 @@ static int kvm_ioctl_get_supported_hv_cpuid(struct kvm_vcpu *vcpu,
> return 0;
> }
>
> -static bool kvm_is_vm_type_supported(unsigned long type)
> +bool __kvm_is_vm_type_supported(unsigned long type)
> {
> return type == KVM_X86_DEFAULT_VM ||
> (type == KVM_X86_PROTECTED_VM &&
> IS_ENABLED(CONFIG_KVM_PROTECTED_VM) && tdp_enabled);
> }
> +EXPORT_SYMBOL_GPL(__kvm_is_vm_type_supported);
> +
> +static bool kvm_is_vm_type_supported(unsigned long type)
> +{
> + return static_call(kvm_x86_is_vm_type_supported)(type);
> +}
>
> int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> {
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 9de72586f406..888f34224bba 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -8,6 +8,8 @@
> #include "kvm_cache_regs.h"
> #include "kvm_emulate.h"
>
> +bool __kvm_is_vm_type_supported(unsigned long type);
> +
> struct kvm_caps {
> /* control of guest tsc rate supported? */
> bool has_tsc_control;
> diff --git a/tools/arch/x86/include/uapi/asm/kvm.h b/tools/arch/x86/include/uapi/asm/kvm.h
> index e48deab8901d..53ce363ba5fe 100644
> --- a/tools/arch/x86/include/uapi/asm/kvm.h
> +++ b/tools/arch/x86/include/uapi/asm/kvm.h
> @@ -529,4 +529,7 @@ struct kvm_pmu_event_filter {
> #define KVM_VCPU_TSC_CTRL 0 /* control group for the timestamp counter (TSC) */
> #define KVM_VCPU_TSC_OFFSET 0 /* attribute for the TSC offset */
>
> +#define KVM_X86_DEFAULT_VM 0
> +#define KVM_X86_PROTECTED_VM 1
> +
> #endif /* _ASM_X86_KVM_H */
> diff --git a/tools/include/uapi/linux/kvm.h b/tools/include/uapi/linux/kvm.h
> index 55155e262646..63474f72ea34 100644
> --- a/tools/include/uapi/linux/kvm.h
> +++ b/tools/include/uapi/linux/kvm.h
> @@ -1175,6 +1175,7 @@ struct kvm_ppc_resize_hpt {
> #define KVM_CAP_DIRTY_LOG_RING_ACQ_REL 223
> #define KVM_CAP_S390_PROTECTED_ASYNC_DISABLE 224
> #define KVM_CAP_DIRTY_LOG_RING_WITH_BITMAP 225
> +#define KVM_CAP_VM_TYPES 227
>
> #ifdef KVM_CAP_IRQ_ROUTING
>