Re: [PATCH v3 10/21] KVM:x86: Add #CP support in guest exception classification

From: Yang, Weijiang
Date: Fri Jun 30 2023 - 05:35:29 EST



On 6/17/2023 2:57 AM, Sean Christopherson wrote:
On Fri, Jun 16, 2023, Weijiang Yang wrote:
On 6/16/2023 7:58 AM, Sean Christopherson wrote:
On Thu, Jun 08, 2023, Weijiang Yang wrote:
On 6/6/2023 5:08 PM, Chao Gao wrote:
On Thu, May 11, 2023 at 12:08:46AM -0400, Yang Weijiang wrote:
Add handling for Control Protection (#CP) exceptions(vector 21).
The new vector is introduced for Intel's Control-Flow Enforcement
Technology (CET) relevant violation cases.

Although #CP belongs contributory exception class, but the actual
effect is conditional on CET being exposed to guest. If CET is not
available to guest, #CP falls back to non-contributory and doesn't
have an error code.
This sounds weird. is this the hardware behavior? If yes, could you
point us to where this behavior is documented?
It's not SDM documented behavior.
The #CP behavior needs to be documented. Please pester whoever you need to in
order to make that happen.
Do you mean documentation for #CP as an generic exception or the behavior in
KVM as this patch shows?
As I pointed out two *years* ago, this entry in the SDM

— The field's deliver-error-code bit (bit 11) is 1 if each of the following
holds: (1) the interruption type is hardware exception; (2) bit 0
(corresponding to CR0.PE) is set in the CR0 field in the guest-state area;
(3) IA32_VMX_BASIC[56] is read as 0 (see Appendix A.1); and (4) the vector
indicates one of the following exceptions: #DF (vector 8), #TS (10),
#NP (11), #SS (12), #GP (13), #PF (14), or #AC (17).

needs to read something like

— The field's deliver-error-code bit (bit 11) is 1 if each of the following
holds: (1) the interruption type is hardware exception; (2) bit 0
(corresponding to CR0.PE) is set in the CR0 field in the guest-state area;
(3) IA32_VMX_BASIC[56] is read as 0 (see Appendix A.1); and (4) the vector
indicates one of the following exceptions: #DF (vector 8), #TS (10),
#NP (11), #SS (12), #GP (13), #PF (14), #AC (17), or #CP (21)[1]

[1] #CP has an error code if and only if IA32_VMX_CR4_FIXED1 enumerates
support for the 1-setting of CR4.CET.

Hi, Sean,

I sent above change request to Gil(added in cc), but he shared different opinion on this issue:


"It is the case that all CET-capable parts enumerate IA32_VMX_BASIC[56] as 1.

 However, there were earlier parts without CET that enumerated IA32_VMX_BASIC[56] as 0.

 On those parts, an attempt to inject an exception with vector 21 (#CP) with an error code would fail.

(Injection of exception 21 with no error code would be allowed.)

 It may make things clearer if we document the statement above (all CET-capable parts enumerate IA32_VMX_BASIC[56] as 1).

I will see if we can update future revisions of the SDM to clarify this."


Then if this is the case,  kvm needs to check IA32_VMX_BASIC[56] before inject exception to nested VM.

And this patch could be removed, instead need another patch like below:

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index ad35355ee43e..6b33aacc8587 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -1076,6 +1076,7 @@
 #define VMX_BASIC_MEM_TYPE_MASK    0x003c000000000000LLU
 #define VMX_BASIC_MEM_TYPE_WB    6LLU
 #define VMX_BASIC_INOUT        0x0040000000000000LLU
+#define VMX_BASIC_CHECK_ERRCODE    0x0140000000000000LLU

 /* Resctrl MSRs: */
 /* - Intel: */
diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 85cffeae7f10..4b1ed4dc03bc 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -79,6 +79,11 @@ static inline bool cpu_has_vmx_basic_inout(void)
     return    (((u64)vmcs_config.basic_cap << 32) & VMX_BASIC_INOUT);
 }

+static inline bool cpu_has_vmx_basic_check_errcode(void)
+{
+    return    (((u64)vmcs_config.basic_cap << 32) & VMX_BASIC_CHECK_ERRCODE);
+}
+
 static inline bool cpu_has_virtual_nmis(void)
 {
     return vmcs_config.pin_based_exec_ctrl & PIN_BASED_VIRTUAL_NMIS &&
diff --git a/arch/x86/kvm/vmx/nested.c b/arch/x86/kvm/vmx/nested.c
index 78524daa2cb2..92aa4fc3d233 100644
--- a/arch/x86/kvm/vmx/nested.c
+++ b/arch/x86/kvm/vmx/nested.c
@@ -1227,9 +1227,9 @@ static int vmx_restore_vmx_basic(struct vcpu_vmx *vmx, u64 data)
 {
     const u64 feature_and_reserved =
         /* feature (except bit 48; see below) */
-        BIT_ULL(49) | BIT_ULL(54) | BIT_ULL(55) |
+        BIT_ULL(49) | BIT_ULL(54) | BIT_ULL(55) | BIT_ULL(56) |
         /* reserved */
-        BIT_ULL(31) | GENMASK_ULL(47, 45) | GENMASK_ULL(63, 56);
+        BIT_ULL(31) | GENMASK_ULL(47, 45) | GENMASK_ULL(63, 57);
     u64 vmx_basic = vmcs_config.nested.basic;

     if (!is_bitwise_subset(vmx_basic, data, feature_and_reserved))
@@ -2873,7 +2873,8 @@ static int nested_check_vm_entry_controls(struct kvm_vcpu *vcpu,
         should_have_error_code =
             intr_type == INTR_TYPE_HARD_EXCEPTION && prot_mode &&
             x86_exception_has_error_code(vector);
-        if (CC(has_error_code != should_have_error_code))
+        if (!cpu_has_vmx_basic_check_errcode() &&
+            CC(has_error_code != should_have_error_code))
             return -EINVAL;

         /* VM-entry exception error code */
@@ -6986,6 +6987,8 @@ static void nested_vmx_setup_basic(struct nested_vmx_msrs *msrs)

     if (cpu_has_vmx_basic_inout())
         msrs->basic |= VMX_BASIC_INOUT;
+    if (cpu_has_vmx_basic_check_errcode())
+        msrs->basic |= VMX_BASIC_CHECK_ERRCODE;
 }

 static void nested_vmx_setup_cr_fixed(struct nested_vmx_msrs *msrs)
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index d70f2e94b187..95c0eab7805c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2748,7 +2748,7 @@ static int setup_vmcs_config(struct vmcs_config *vmcs_conf,
     rdmsrl(MSR_IA32_VMX_MISC, misc_msr);

     vmcs_conf->size = vmx_msr_high & 0x1fff;
-    vmcs_conf->basic_cap = vmx_msr_high & ~0x1fff;
+    vmcs_conf->basic_cap = vmx_msr_high & ~0x7fff;

     vmcs_conf->revision_id = vmx_msr_low;