RE: [PATCH v3 10/21] KVM:x86: Add #CP support in guest exception classification

From: Neiger, Gil
Date: Fri Jun 30 2023 - 11:22:32 EST


Just in case it is not clear: event delivery in real mode never includes an error code. That is why the PE bit in CR0 is checked.

- Gil

-----Original Message-----
From: Sean Christopherson <seanjc@xxxxxxxxxx>
Sent: Friday, June 30, 2023 08:08
To: Yang, Weijiang <weijiang.yang@xxxxxxxxx>
Cc: Gao, Chao <chao.gao@xxxxxxxxx>; pbonzini@xxxxxxxxxx; kvm@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; peterz@xxxxxxxxxxxxx; rppt@xxxxxxxxxx; binbin.wu@xxxxxxxxxxxxxxx; Edgecombe, Rick P <rick.p.edgecombe@xxxxxxxxx>; john.allen@xxxxxxx; Neiger, Gil <gil.neiger@xxxxxxxxx>
Subject: Re: [PATCH v3 10/21] KVM:x86: Add #CP support in guest exception classification

On Fri, Jun 30, 2023, Weijiang Yang wrote:
>
> On 6/17/2023 2:57 AM, Sean Christopherson wrote:
> > > Do you mean documentation for #CP as an generic exception or the
> > > behavior in KVM as this patch shows?
> > As I pointed out two *years* ago, this entry in the SDM
> >
> > — The field's deliver-error-code bit (bit 11) is 1 if each of the following
> > holds: (1) the interruption type is hardware exception; (2) bit 0
> > (corresponding to CR0.PE) is set in the CR0 field in the guest-state area;
> > (3) IA32_VMX_BASIC[56] is read as 0 (see Appendix A.1); and (4) the vector
> > indicates one of the following exceptions: #DF (vector 8), #TS (10),
> > #NP (11), #SS (12), #GP (13), #PF (14), or #AC (17).
> >
> > needs to read something like
> >
> > — The field's deliver-error-code bit (bit 11) is 1 if each of the following
> > holds: (1) the interruption type is hardware exception; (2) bit 0
> > (corresponding to CR0.PE) is set in the CR0 field in the guest-state area;
> > (3) IA32_VMX_BASIC[56] is read as 0 (see Appendix A.1); and (4) the vector
> > indicates one of the following exceptions: #DF (vector 8), #TS (10),
> > #NP (11), #SS (12), #GP (13), #PF (14), #AC (17), or #CP
> > (21)[1]
> >
> > [1] #CP has an error code if and only if IA32_VMX_CR4_FIXED1 enumerates
> > support for the 1-setting of CR4.CET.
>
> Hi, Sean,
>
> I sent above change request to Gil(added in cc), but he shared
> different opinion on this issue:

Heh, "opinion".

>  It may make things clearer if we document the statement above (all
> CET-capable parts enumerate IA32_VMX_BASIC[56] as 1).
>
> I will see if we can update future revisions of the SDM to clarify this."

That would be helpful. Though to be perfectly honest, I simply overlooked the existence of IA32_VMX_BASIC[56].

Thanks!

> Then if this is the case,  kvm needs to check IA32_VMX_BASIC[56]
> before inject exception to nested VM.
>
> And this patch could be removed, instead need another patch like below:
>
> diff --git a/arch/x86/include/asm/msr-index.h
> b/arch/x86/include/asm/msr-index.h
> index ad35355ee43e..6b33aacc8587 100644
> --- a/arch/x86/include/asm/msr-index.h
> +++ b/arch/x86/include/asm/msr-index.h
> @@ -1076,6 +1076,7 @@
>  #define VMX_BASIC_MEM_TYPE_MASK    0x003c000000000000LLU
>  #define VMX_BASIC_MEM_TYPE_WB    6LLU
>  #define VMX_BASIC_INOUT        0x0040000000000000LLU
> +#define VMX_BASIC_CHECK_ERRCODE    0x0140000000000000LLU

"Check Error Code" isn't a great description. The flag enumerates that there the CPU does *not* perform consistency checks on the error code when injecting hardware exceptions.

So something like this?

VMX_BASIC_NO_HW_ERROR_CODE_CC

or maybe

VMX_BASIC_PM_NO_HW_ERROR_CODE_CC

if we want to capture that only protected mode is exempt (I personally prefer just VMX_BASIC_NO_HW_ERROR_CODE_CC as "PM" is a bit ambiguous).

> @@ -2873,7 +2873,8 @@ static int nested_check_vm_entry_controls(struct
> kvm_vcpu *vcpu,
>          should_have_error_code =
>              intr_type == INTR_TYPE_HARD_EXCEPTION && prot_mode &&
>              x86_exception_has_error_code(vector);
> -        if (CC(has_error_code != should_have_error_code))
> +        if (!cpu_has_vmx_basic_check_errcode() &&
> +            CC(has_error_code != should_have_error_code))

This is wrong on mutiple fronts:

1. The new feature flag only excempts hardware exceptions delivered to guests
with CR0.PE=1. The above will skip the consistency check for all event injection.

2. KVM needs to check the CPU model that is exposed to L1, not the capabilities
of the host CPU.

Highlighting the key phrases in the SDM:

The field's deliver-error-code bit (bit 11) is 1 if each of the following holds: (1) the interruption type is
^^^^^^^
hardware exception; (2) bit 0 (corresponding to CR0.PE) is set in the CR0 field in the guest-state area;
(3) IA32_VMX_BASIC[56] is read as 0 (see Appendix A.1); and (4) the vector indicates one of the following
exceptions: #DF (vector 8), #TS (10), #NP (11), #SS (12), #GP (13), #PF (14), or #AC (17).

The field's deliver-error-code bit is 0 if any of the following holds: (1) the interruption type is not hardware
^^^^^^
exception; (2) bit 0 is clear in the CR0 field in the guest-state area; or (3) IA32_VMX_BASIC[56] is read as
0 and the vector is in one of the following ranges: 0–7, 9, 15, 16, or 18–31.

I think what we want is:

/* VM-entry interruption-info field: deliver error code */
if (!prot_mode || intr_type != INTR_TYPE_HARD_EXCEPTION ||
!nested_cpu_has_no_hw_error_code_cc(vcpu)) {
should_have_error_code =
intr_type == INTR_TYPE_HARD_EXCEPTION && prot_mode &&
x86_exception_has_error_code(vector);
if (CC(has_error_code != should_have_error_code))
return -EINVAL;
}