Re: [PATCH 09/16] KVM: x86/mmu: Move private vs. shared check above slot validity checks

From: Huang, Kai
Date: Wed Mar 06 2024 - 17:07:38 EST




On 6/03/2024 3:02 pm, Sean Christopherson wrote:
On Wed, Mar 06, 2024, Kai Huang wrote:


On 6/03/2024 1:38 pm, Sean Christopherson wrote:
On Wed, Mar 06, 2024, Kai Huang wrote:


On 28/02/2024 3:41 pm, Sean Christopherson wrote:
Prioritize private vs. shared gfn attribute checks above slot validity
checks to ensure a consistent userspace ABI. E.g. as is, KVM will exit to
userspace if there is no memslot, but emulate accesses to the APIC access
page even if the attributes mismatch.

IMHO, it would be helpful to explicitly say that, in the later case (emulate
APIC access page) we still want to report MEMORY_FAULT error first (so that
userspace can have chance to fixup, IIUC) instead of emulating directly,
which will unlikely work.

Hmm, it's not so much that emulating directly won't work, it's that KVM would be
violating its ABI. Emulating APIC accesses after userspace converted the APIC
gfn to private would still work (I think), but KVM's ABI is that emulated MMIO
is shared-only.

But for (at least) TDX guest I recall we _CAN_ allow guest's MMIO to be
mapped as private, right? The guest is supposed to get a #VE anyway?

Not really. KVM can't _map_ emulated MMIO as private memory, because S-EPT
entries can only point at convertible memory.

Right. I was talking about the MMIO mapping in the guest, which can be private I suppose.

KVM _could_ emulate in response
to a !PRESENT EPT violation, but KVM is not going to do that.

https://lore.kernel.org/all/ZcUO5sFEAIH68JIA@xxxxxxxxxx


Right agreed KVM shouldn't "emulate" !PRESENT fault.

I am talking about this -- for TDX guest, if I recall correctly, for guest's MMIO gfn KVM still needs to get the EPT violation for the _first_ access, in which KVM can configure the EPT entry to make sure "suppress #VE" bit is cleared so the later accesses can trigger #VE directly.

I suppose this is still the way we want to implement?

I am afraid for this case, we will still see the MMIO GFN as private, while by default I believe the "guest memory attributes" for that MMIO GFN should be shared? AFAICT, it seems the "guest memory attributes" code doesn't check whether a GFN is MMIO or truly RAM.

So I agree making KVM only "emulate" shared MMIO makes sense, and we need this patch to cover the APIC access page case.

Anyway:

Reviewed-by: Kai Huang <kai.huang@xxxxxxxxx>

Perhaps I am missing something -- I apologize if this has already been
discussed.


FWIW, I doubt there's a legitmate use case for converting the APIC gfn to private,
this is purely to ensure KVM has simple, consistent rules for how private vs.
shared access work.

Again I _think_ for TDX APIC gfn can be private? IIUC virtualizing APIC is
done by the TDX module, which injects #VE to guest when emulation is
required.

It's a moot point for TDX, as x2APIC is mandatory.

Right. My bad to mention.