Re: [PATCH 11/16] KVM: x86/mmu: Explicitly disallow private accesses to emulated MMIO

From: Huang, Kai
Date: Wed Mar 06 2024 - 18:20:37 EST




On 7/03/2024 12:01 pm, Sean Christopherson wrote:
On Thu, Mar 07, 2024, Kai Huang wrote:


On 7/03/2024 11:43 am, Sean Christopherson wrote:
On Thu, Mar 07, 2024, Kai Huang wrote:


On 28/02/2024 3:41 pm, Sean Christopherson wrote:
Explicitly detect and disallow private accesses to emulated MMIO in
kvm_handle_noslot_fault() instead of relying on kvm_faultin_pfn_private()
to perform the check. This will allow the page fault path to go straight
to kvm_handle_noslot_fault() without bouncing through __kvm_faultin_pfn().

Signed-off-by: Sean Christopherson <seanjc@xxxxxxxxxx>
---
arch/x86/kvm/mmu/mmu.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 5c8caab64ba2..ebdb3fcce3dc 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3314,6 +3314,11 @@ static int kvm_handle_noslot_fault(struct kvm_vcpu *vcpu,
 {
 	gva_t gva = fault->is_tdp ? 0 : fault->addr;
+	if (fault->is_private) {
+		kvm_mmu_prepare_memory_fault_exit(vcpu, fault);
+		return -EFAULT;
+	}
+

As mentioned in another reply in this series, unless I am mistaken, for a TDX
guest the _first_ MMIO access would still cause an EPT violation with the MMIO
GFN being private.

Returning to userspace cannot really help here because the MMIO mapping is
inside the guest.
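
To make the behavior under discussion concrete, here is a toy userspace model
of the control flow the patch introduces: a private access to a GFN with no
memslot (i.e. emulated MMIO) prepares a memory fault exit and returns -EFAULT,
while a shared access goes on to emulation. This is only a sketch. Every
toy_* name, both structs, and the TOY_RET_* values are hypothetical stand-ins
and not the kernel's definitions; only the shape of the is_private check
mirrors the hunk above.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct kvm_page_fault. */
struct toy_fault {
	uint64_t gfn;
	bool is_private;	/* guest accessed the GFN as private */
	bool has_memslot;	/* false: GFN is treated as emulated MMIO */
};

/* Hypothetical stand-in for the vCPU state that records a pending exit. */
struct toy_vcpu {
	bool memory_fault_exit_pending;
	uint64_t exit_gfn;
	bool exit_is_private;
};

/* Made-up return codes for this model only. */
enum { TOY_RET_EMULATE = 1, TOY_RET_MAP = 2 };

/* Models preparing a memory fault exit to userspace (assumed behavior). */
static void toy_prepare_memory_fault_exit(struct toy_vcpu *vcpu,
					  const struct toy_fault *fault)
{
	vcpu->memory_fault_exit_pending = true;
	vcpu->exit_gfn = fault->gfn;
	vcpu->exit_is_private = fault->is_private;
}

/* Mirrors the added check: private accesses to emulated MMIO are rejected. */
static int toy_handle_noslot_fault(struct toy_vcpu *vcpu,
				   const struct toy_fault *fault)
{
	if (fault->is_private) {
		toy_prepare_memory_fault_exit(vcpu, fault);
		return -EFAULT;
	}

	/* Shared access to a no-slot GFN: hand it to MMIO emulation. */
	return TOY_RET_EMULATE;
}

/* Models the fault path going straight to the no-slot handler. */
static int toy_page_fault(struct toy_vcpu *vcpu, const struct toy_fault *fault)
{
	if (!fault->has_memslot)
		return toy_handle_noslot_fault(vcpu, fault);

	return TOY_RET_MAP;
}
```

In this model, the first MMIO access from a guest that has not yet marked the
region shared ends up as an exit to userspace rather than as an emulated
access, which is the case I am worried about.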

That's a guest bug. The guest *knows* it's a TDX VM, it *has* to know. Accessing
emulated MMIO and thus taking a #VE before enabling paging is nonsensical. Either
enable paging and set up MMIO regions as shared, or go straight to TDCALL.

+Kirill,

I kinda forgot the details, but what I am afraid of is that there might be a
bunch of existing TDX guests (since the TDX guest code is upstreamed) using
unmodified drivers, which I suppose don't map MMIO regions as shared.

Kirill,

Could you clarify whether the TDX guest code has mapped MMIO regions as shared
since the beginning?

Y'all get the same answer we gave the SNP folks: KVM does not yet support TDX,
so as far as KVM is concerned, there is no existing functionality to support.

s/firmware/Linux if this is a Linux kernel problem.

On Thu, Feb 08, 2024, Paolo Bonzini wrote:
> On Thu, Feb 8, 2024 at 6:27 PM Sean Christopherson <seanjc@xxxxxxxxxx> wrote:
> > No. KVM does not yet support SNP, so as far as KVM's ABI goes, there are no
> > existing guests. Yes, I realize that I am burying my head in the sand to some
> > extent, but it is simply not sustainable for KVM to keep trying to pick up the
> > pieces of poorly defined hardware specs and broken guest firmware.
>
> 101% agreed. There are cases in which we have to and should bend
> over backwards for guests (e.g. older Linux kernels), but not for
> code that---according to current practices---is chosen by the host
> admin.
>
> (I am of the opinion that "bring your own firmware" is the only sane
> way to handle attestation/measurement, but that's not how things are
> done currently).

Fair enough, and good to know. :-)

(Still better to hear from Kirill, though.)