RE: [PATCH v1 12/23] KVM: VMX: Handle FRED event data

From: Li, Xin3
Date: Mon Nov 13 2023 - 23:34:11 EST


> >+ else if (is_nm_fault(intr_info) &&
> >+ vcpu->arch.guest_fpu.fpstate->xfd)
>
> does this necessarily mean the #NM is caused by XFD?

Then the event data should be 0. Or I missed something obvious? I.e.,
it can be easily differentiated and we should just explicitly set it
to 0.

>
> >+ event_data = vcpu->arch.guest_fpu.xfd_err;
> >+
> >+ vmcs_write64(INJECTED_EVENT_DATA, event_data);
> >+ }
> >+ }
> >+
> > vmcs_write32(VM_ENTRY_INTR_INFO_FIELD, intr_info);
> >
> > vmx_clear_hlt(vcpu);
> >@@ -7226,7 +7247,8 @@ static void vmx_recover_nmi_blocking(struct
> >vcpu_vmx *vmx) static void __vmx_complete_interrupts(struct kvm_vcpu *vcpu,
> > u32 idt_vectoring_info,
> > int instr_len_field,
> >- int error_code_field)
> >+ int error_code_field,
> >+ int event_data_field)
>
> event_data_field is used to indicate whether this is a "cancel". I may think it is
> better to simply use a boolean e.g., bool cancel.

I'm fine with the idea if no objections.

>
>
> > {
> > u8 vector;
> > int type;
> >@@ -7260,6 +7282,37 @@ static void __vmx_complete_interrupts(struct
> kvm_vcpu *vcpu,
> > vcpu->arch.event_exit_inst_len = vmcs_read32(instr_len_field);
> > fallthrough;
> > case INTR_TYPE_HARD_EXCEPTION:
> >+ if (kvm_is_fred_enabled(vcpu) && event_data_field) {
> >+ /*
> >+ * Save original-event data for being used as injected-
> event data.
> >+ */
>
> Looks we also expect CPU will update CR2/DR6/XFD_ERR. this hunk looks to me
> just a paranoid check to ensure the cpu works as expected. if that's the case, I
> suggest documenting it a bit in the comment.

These checks are not intended for hw, they make sure nVMX FRED is correctly
implemented and catch regressions.

And yes, in the early stage, I prefer to be paranoid.

>
> >+ u64 event_data = vmcs_read64(event_data_field);
> >+
> >+ switch (vector) {
> >+ case DB_VECTOR:
> >+ get_debugreg(vcpu->arch.dr6, 6);
> >+ WARN_ON(vcpu->arch.dr6 != (event_data ^
> DR6_RESERVED));
> >+ vcpu->arch.dr6 = event_data ^ DR6_RESERVED;
> >+ break;
> >+ case NM_VECTOR:
> >+ if (vcpu->arch.guest_fpu.fpstate->xfd) {
> >+ rdmsrl(MSR_IA32_XFD_ERR, vcpu-
> >arch.guest_fpu.xfd_err);
> >+ WARN_ON(vcpu-
> >arch.guest_fpu.xfd_err != event_data);
> >+ vcpu->arch.guest_fpu.xfd_err =
> event_data;
> >+ } else {
> >+ WARN_ON(event_data != 0);
> >+ }
> >+ break;
> >+ case PF_VECTOR:
> >+ WARN_ON(vcpu->arch.cr2 != event_data);
> >+ vcpu->arch.cr2 = event_data;
> >+ break;
> >+ default:
> >+ WARN_ON(event_data != 0);
>
> I am not sure if this WARN_ON() can be triggeded by nested VMX. It is legitimate
> for L1 VMM to inject any event w/ an event_data.
>
> FRED spec says:
>
> Section 5.2.1 specifies the event data that FRED event delivery of certain events
> saves on the stack. When FRED event delivery is used for an event injected by VM
> entry, the event data saved is the value of the injected-event-data field in the
> VMCS. This value is used instead of what is specified in Section 5.2.1 and is done
> for __ALL__ injected events using FRED event delivery

5.2.1 Saving Information on the Regular Stack also says:
- For any other event, the event data are not currently defined and will
be zero until they are.

Or you mean something else?