Re: [PATCH 08/16] KVM: x86/mmu: WARN and skip MMIO cache on private, reserved page faults

From: Huang, Kai
Date: Tue Mar 05 2024 - 16:33:21 EST




On 5/03/2024 4:51 am, Sean Christopherson wrote:
On Fri, Mar 01, 2024, Kai Huang wrote:
On 1/03/2024 12:06 pm, Sean Christopherson wrote:
E.g. in this case, KVM will just skip various fast paths because of the RSVD flag,
and treat the fault like a PRIVATE fault. Hmm, but page_fault_handle_page_track()
would skip write tracking, which could theoretically cause data corruption, so I
guess arguably it would be safer to bail?

Anyone else have an opinion? This type of bug should never escape development,
so I'm a-ok effectively killing the VM. Unless someone has a good argument for
continuing on, I'll go with Kai's suggestion and squash this:

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index cedacb1b89c5..d796a162b2da 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5892,8 +5892,10 @@ int noinline kvm_mmu_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa, u64 err
 		error_code |= PFERR_PRIVATE_ACCESS;
 	r = RET_PF_INVALID;
-	if (unlikely((error_code & PFERR_RSVD_MASK) &&
-		     !WARN_ON_ONCE(error_code & PFERR_PRIVATE_ACCESS))) {
+	if (unlikely(error_code & PFERR_RSVD_MASK)) {
+		if (WARN_ON_ONCE(error_code & PFERR_PRIVATE_ACCESS))
+			return -EFAULT;
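
For readers following the control-flow change, here is a minimal standalone model of the two conditionals, not kernel code. The flag values, the enum, and the function names are illustrative stand-ins (the real definitions live in KVM's x86 headers), and it assumes that whatever does not take the MMIO cache path falls through to normal fault handling.

```c
#include <stdint.h>

/* Illustrative stand-ins only, not the kernel's PFERR_* definitions. */
#define SKETCH_PFERR_RSVD     (1ULL << 3)
#define SKETCH_PFERR_PRIVATE  (1ULL << 49)

/* Hypothetical labels for where a fault ends up. */
enum sketch_path {
	SKETCH_PATH_MMIO_CACHE,    /* reserved fault handled via the MMIO cache */
	SKETCH_PATH_NORMAL_FAULT,  /* falls through to regular fault handling */
	SKETCH_PATH_BAIL_EFAULT,   /* returns -EFAULT to the caller */
};

/* Before the squash: RSVD+PRIVATE warns, skips the MMIO cache, continues. */
static enum sketch_path sketch_old_dispatch(uint64_t error_code)
{
	if ((error_code & SKETCH_PFERR_RSVD) &&
	    !(error_code & SKETCH_PFERR_PRIVATE))
		return SKETCH_PATH_MMIO_CACHE;
	return SKETCH_PATH_NORMAL_FAULT;
}

/* After the squash: RSVD+PRIVATE warns and bails with -EFAULT. */
static enum sketch_path sketch_new_dispatch(uint64_t error_code)
{
	if (error_code & SKETCH_PFERR_RSVD) {
		if (error_code & SKETCH_PFERR_PRIVATE)
			return SKETCH_PATH_BAIL_EFAULT;
		return SKETCH_PATH_MMIO_CACHE;
	}
	return SKETCH_PATH_NORMAL_FAULT;
}
```

The only input whose outcome differs between the two versions is the RSVD+PRIVATE combination, which is the case being discussed.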

-EFAULT is part of the guest_memfd() memory fault ABI. I haven't thought this
through thoroughly, but do you want to return -EFAULT here?

Yes, I/we do. There are many existing paths that can return -EFAULT from KVM_RUN
without setting run->exit_reason to KVM_EXIT_MEMORY_FAULT. Userspace is responsible
for checking run->exit_reason on -EFAULT (and -EHWPOISON), i.e. must be prepared
to handle a "bare" -EFAULT, where for all intents and purposes "handle" means
"terminate the guest".

Right.


That's actually one of the reasons why KVM_EXIT_MEMORY_FAULT exists, it'd require
an absurd amount of work and churn in KVM to *safely* return useful information
on *all* -EFAULTs. FWIW, I had hopes and dreams of actually doing exactly this,
but have long since abandoned those dreams.

I am not sure whether we need to do that. Perhaps it only feels that way since we changed to using -EFAULT to carry KVM_EXIT_MEMORY_FAULT. :-)


In other words, KVM_EXIT_MEMORY_FAULT essentially communicates to userspace that
(a) userspace can likely fix whatever badness triggered the -EFAULT, and (b) that
KVM is in a state where fixing the underlying problem and resuming the guest is
safe, e.g. won't corrupt the guest (as it could if KVM were in a half-baked state).
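
The userspace side of this contract can be sketched as a small classifier, under stated assumptions. The struct, the enum, and the function name are hypothetical; the exit-reason constant is a local stand-in for the definition in <linux/kvm.h> so that the sketch builds without recent kernel headers. It only decides what a VMM might do after the KVM_RUN ioctl returns; it does not issue the ioctl.

```c
#include <errno.h>
#include <stdint.h>

/* Local stand-in for KVM_EXIT_MEMORY_FAULT from <linux/kvm.h> (assumed value). */
#define SKETCH_EXIT_MEMORY_FAULT 39

#ifndef EHWPOISON
#define EHWPOISON 133	/* Linux value; fallback for headers that lack it */
#endif

/* Hypothetical stand-in for the one field of struct kvm_run used here. */
struct sketch_run {
	uint32_t exit_reason;
};

/* Hypothetical VMM decisions after KVM_RUN returns. */
enum sketch_action {
	SKETCH_HANDLE_EXIT,	/* ioctl returned 0: dispatch on exit_reason */
	SKETCH_RETRY,		/* interrupted by a signal: re-enter the guest */
	SKETCH_FIX_AND_RESUME,	/* annotated memory fault: may be fixable */
	SKETCH_TERMINATE,	/* bare error: treat as fatal for the guest */
};

static enum sketch_action sketch_classify(int ret, int err,
					  const struct sketch_run *run)
{
	if (ret == 0)
		return SKETCH_HANDLE_EXIT;
	if (err == EINTR)
		return SKETCH_RETRY;
	/* Only trust exit_reason when errno is -EFAULT or -EHWPOISON. */
	if ((err == EFAULT || err == EHWPOISON) &&
	    run->exit_reason == SKETCH_EXIT_MEMORY_FAULT)
		return SKETCH_FIX_AND_RESUME;
	/* A "bare" -EFAULT (or any other error) means terminate the guest. */
	return SKETCH_TERMINATE;
}
```

A real VMM would pass the ioctl's return value, the saved errno, and its mmap'ed struct kvm_run; the point of the sketch is that -EFAULT alone does not imply the fault is recoverable.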


Sure. One small issue might be that, in a later check in the code, we actually report KVM_EXIT_MEMORY_FAULT when a private fault hits RET_PF_EMULATE -- see your patch:

[PATCH 01/16] KVM: x86/mmu: Exit to userspace with -EFAULT if private fault hits emulation

So if we just return -EFAULT here without reporting KVM_EXIT_MEMORY_FAULT when private+reserved is hit, that seems slightly inconsistent.

But you may be concerned about corrupting the guest here, as you mentioned above.