Re: [RFC PATCH v4 04/10] KVM: x86: Introduce PFERR_GUEST_ENC_MASK to indicate fault is private

From: Sean Christopherson
Date: Wed Feb 21 2024 - 21:06:18 EST


On Fri, Jul 21, 2023, Isaku Yamahata wrote:
> From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
> Date: Wed, 14 Jun 2023 12:34:00 -0700
> Subject: [PATCH 4/8] KVM: x86: Use PFERR_GUEST_ENC_MASK to indicate fault is
> private
>
> SEV-SNP defines PFERR_GUEST_ENC_MASK (bit 32) in page-fault error bits to
> represent the guest page is encrypted. Use the bit to designate that the
> page fault is private and that it requires looking up memory attributes.
> The vendor kvm page fault handler should set PFERR_GUEST_ENC_MASK bit based
> on their fault information. It may or may not use the hardware value
> directly or parse the hardware value to set the bit.
>
> For KVM_X86_SW_PROTECTED_VM, ask memory attributes for the fault
> privateness. For async page fault, carry the bit and use it for kvm page
> fault handler.
>
> Signed-off-by: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>

..

> @@ -4315,7 +4316,8 @@ void kvm_arch_async_page_ready(struct kvm_vcpu *vcpu, struct kvm_async_pf *work)
> work->arch.cr3 != kvm_mmu_get_guest_pgd(vcpu, vcpu->arch.mmu))
> return;
>
> - kvm_mmu_do_page_fault(vcpu, work->cr2_or_gpa, 0, true, NULL);
> + kvm_mmu_do_page_fault(vcpu, work->cr2_or_gpa, work->arch.error_code,
> + true, NULL);

This is unnecessary; KVM doesn't support async page fault behavior for private
memory (and doesn't need to, because guest_memfd() doesn't support swap).

> diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h
> index 7f9ec1e5b136..3a423403af01 100644
> --- a/arch/x86/kvm/mmu/mmu_internal.h
> +++ b/arch/x86/kvm/mmu/mmu_internal.h
> @@ -295,13 +295,13 @@ static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> .user = err & PFERR_USER_MASK,
> .prefetch = prefetch,
> .is_tdp = likely(vcpu->arch.mmu->page_fault == kvm_tdp_page_fault),
> + .is_private = err & PFERR_GUEST_ENC_MASK,

This breaks SEV and SEV-ES guests, because AFAICT, the APM is lying by defining
PFERR_GUEST_ENC_MASK in the context of SNP. The flag isn't just set when running
SEV-SNP guests, it's set for all C-bit=1 effective accesses when running on SNP
capable hardware (at least, that's my observation).
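To make the failure mode concrete, here is a minimal standalone sketch (not KVM code). It models the behavior I observed and shows why deriving "private" straight from the hardware bit goes wrong. The helper names and the bit position are assumptions for illustration only; check the APM for the real layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hardware-defined error code bit; the position here is illustrative. */
#define PFERR_GUEST_ENC_MASK (1ULL << 34)

/* The bit lives above bit 31, so the error code must be carried as 64 bits. */
static_assert(PFERR_GUEST_ENC_MASK > UINT32_MAX, "ENC bit needs a 64-bit error code");

/*
 * Hypothetical model of the observed behavior: the bit is set for any
 * effective C-bit=1 access on SNP-capable hardware, regardless of whether
 * the guest itself is SEV, SEV-ES or SEV-SNP.
 */
static uint64_t observed_hw_error_code(bool snp_capable_host, bool c_bit_access)
{
	return (snp_capable_host && c_bit_access) ? PFERR_GUEST_ENC_MASK : 0;
}

/* What the patch effectively does: trust the hardware bit unconditionally. */
static bool naive_is_private(uint64_t err)
{
	return !!(err & PFERR_GUEST_ENC_MASK);
}
```

Under this model, a plain SEV guest touching encrypted memory on an SNP-capable host produces naive_is_private() == true, even though that guest has no private memory in the guest_memfd sense.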

Grumpiness about discovering yet another problem that I would have expected
_someone_ to stumble upon...

FYI, I'm going to post a rambling series to clean up code in the page fault path
(it started as a cleanup of the "no slot" code and then grew a few more heads).
One of the patches I'm going to include is something that looks like this patch,
but I'm going to use a KVM-defined synthetic bit, because stuffing a bit that KVM
would need to _clear_ on _some_ hardware is gross.
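
Roughly, the synthetic bit idea looks like the standalone sketch below. This is not the actual patch: PFERR_SYNTH_PRIVATE, to_kvm_error_code() and both bit positions are made-up names and values for illustration. The point is only that vendor code decides when to set a bit that hardware never sets, so common code never has to clear anything.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hardware-defined bit; the position here is illustrative. */
#define PFERR_GUEST_ENC_MASK (1ULL << 34)
/* Hypothetical KVM-defined synthetic bit, chosen so hardware never sets it. */
#define PFERR_SYNTH_PRIVATE  (1ULL << 49)

static_assert((PFERR_GUEST_ENC_MASK & PFERR_SYNTH_PRIVATE) == 0,
	      "synthetic bit must not alias a hardware bit");

/*
 * Vendor-side translation (sketch): only treat the hardware ENC bit as
 * "private" when the VM is actually an SNP guest.  Any stray value in the
 * synthetic position is dropped first so hardware can't spoof it.
 */
static uint64_t to_kvm_error_code(uint64_t hw_err, bool vm_is_snp)
{
	uint64_t err = hw_err & ~PFERR_SYNTH_PRIVATE;

	if (vm_is_snp && (hw_err & PFERR_GUEST_ENC_MASK))
		err |= PFERR_SYNTH_PRIVATE;

	return err;
}

/* Common code consumes only the synthetic bit. */
static bool fault_is_private(uint64_t err)
{
	return !!(err & PFERR_SYNTH_PRIVATE);
}
```

With this shape, an SEV or SEV-ES guest on SNP-capable hardware can still see the hardware bit set, but fault_is_private() stays false because the synthetic bit is only set for SNP VMs.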