Re: [PATCH 2/5] KVM: SVM: Use kvm_pat_valid() directly instead of kvm_mtrr_valid()

From: Huang, Kai
Date: Fri May 05 2023 - 07:20:50 EST


On Thu, 2023-05-04 at 08:34 -0700, Sean Christopherson wrote:
> On Wed, May 03, 2023, Kai Huang wrote:
> > > for better or worse, KVM doesn't apply the "zap
> > > SPTEs" logic to guest PAT changes when the VM has a passthrough device
> > > with non-coherent DMA.
> >
> > Is it a bug?
>
> No. KVM's MTRR behavior is using a heuristic to try not to break the VM: if the
> VM has non-coherent DMA, then honor UC mapping in the MTRRs as such mappings may
> be covering the non-coherent DMA.
>
> From vmx_get_mt_mask():
>
> /* We wanted to honor guest CD/MTRR/PAT, but doing so could result in
> * memory aliases with conflicting memory types and sometimes MCEs.
> * We have to be careful as to what are honored and when.
>
> The PAT is problematic because it is referenced via the guest PTEs, versus the
> MTRRs being tied to the guest physical address, e.g. different virtual mappings
> for the same physical address can yield different memtypes via the PAT. My head
> hurts just thinking about how that might interact with shadow paging :-)
>
> Even the MTRRs are somewhat sketchy because they are technically per-CPU, i.e.
> two vCPUs could have different memtypes for the same physical address. But in
> practice, sane software/firmware uses consistent MTRRs across all CPUs.

Agreed on all of the above.

But I think the answer to my question is actually that we simply don't _need_
to zap SPTEs (with non-coherent DMA) when the guest's IA32_PAT is changed:

1) If EPT is enabled, IIUC the guest's PAT is already honored. The VMCS's
GUEST_IA32_PAT always reflects the IA32_PAT that the guest wants to set. EPT's
memtype bits are set according to the guest's MTRRs. That means the guest
changing IA32_PAT doesn't require zapping EPT PTEs, as "EPT PTEs essentially
only replace the guest's MTRRs".

2) If EPT is disabled, looking at the code, if I read it correctly, the
'shadow_memtype_mask' is 0 for Intel, in which case KVM won't try to set any PAT
memtype bits in shadow MMU PTEs, which means the true PAT memtype is always WB
and the guest's memtype is never honored (the guest's MTRRs are also never
actually used by HW), which should be fine I guess?? My brain refused to go
further :)
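
To make the two cases above concrete, here is a rough toy model in C. It is
only a sketch and not KVM code: struct toy_vcpu and all of the toy_* names are
made up for illustration, the guest MTRRs are collapsed into a single memtype,
and the MMIO and CR0.CD cases are left out. The only architectural facts it
relies on are the memtype encodings and the EPT leaf layout (memtype in bits
5:3, "ignore PAT" in bit 6).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 memory type encodings (architectural values). */
enum toy_memtype {
	TOY_MT_UC = 0, TOY_MT_WC = 1, TOY_MT_WT = 4, TOY_MT_WP = 5, TOY_MT_WB = 6
};

/* EPT leaf layout: memtype in bits 5:3, "ignore PAT" (IPAT) in bit 6. */
#define TOY_EPT_MT_SHIFT 3
#define TOY_EPT_IPAT_BIT (1ULL << 6)

/* Hypothetical, heavily simplified vCPU state; not a KVM structure. */
struct toy_vcpu {
	bool ept_enabled;
	bool noncoherent_dma;
	enum toy_memtype guest_mtrr_type; /* assumption: one memtype for all GPAs */
	uint64_t guest_pat;               /* value the guest wrote to IA32_PAT */
};

/* Memtype bits the toy model would place in a leaf SPTE. */
static uint64_t toy_spte_memtype_bits(const struct toy_vcpu *v)
{
	/* Case 2: shadow paging, memtype mask is 0, nothing is encoded. */
	if (!v->ept_enabled)
		return 0;

	/* No non-coherent DMA: force WB and ignore the guest PAT. */
	if (!v->noncoherent_dma)
		return ((uint64_t)TOY_MT_WB << TOY_EPT_MT_SHIFT) | TOY_EPT_IPAT_BIT;

	/* Case 1: only the guest MTRR type is consulted, never guest_pat. */
	return (uint64_t)v->guest_mtrr_type << TOY_EPT_MT_SHIFT;
}

/* Returns true if a guest IA32_PAT write would change the SPTE memtype bits. */
static bool toy_pat_write_changes_spte(struct toy_vcpu *v, uint64_t new_pat)
{
	uint64_t before = toy_spte_memtype_bits(v);

	v->guest_pat = new_pat;
	return toy_spte_memtype_bits(v) != before;
}

/* Walk all four EPT/DMA combinations and check that PAT writes are no-ops. */
static void toy_self_check(void)
{
	for (int ept = 0; ept <= 1; ept++) {
		for (int dma = 0; dma <= 1; dma++) {
			struct toy_vcpu v = {
				.ept_enabled = ept,
				.noncoherent_dma = dma,
				.guest_mtrr_type = TOY_MT_UC,
				.guest_pat = 0x0007040600070406ULL, /* power-on default */
			};

			assert(!toy_pat_write_changes_spte(&v, 0));
		}
	}
}
```

In this model toy_spte_memtype_bits() never reads guest_pat, so a PAT write
cannot change what would be written into an SPTE, which is the whole argument
in a nutshell.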

But anyway back to my question, I think "changing guest's IA32_PAT" shouldn't
result in needing to "zap SPTEs".