Re: [PATCH] KVM: x86: Zap all TDP leaf entries according noncoherent DMA count

From: Yan Zhao
Date: Mon May 08 2023 - 22:06:03 EST


On Mon, May 08, 2023 at 03:19:56PM +0800, Chao Gao wrote:
> On Mon, May 08, 2023 at 11:47:00AM +0800, Yan Zhao wrote:
> >Zap all TDP leaf entries when noncoherent DMA count goes from 0 to !0, or
> >from !0 to 0.
> >
> >When there's no noncoherent DMA device, EPT memory type is
> >((MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) | VMX_EPT_IPAT_BIT)
> >
> >When there're noncoherent DMA devices, EPT memory type needs to honor
> >guest CR0_CD and MTRR settings.
> >
> >So, if noncoherent DMA count changes between 0 and !0, EPT leaf entries
> >need to be zapped to clear stale memory type.
> >
> >This issue might be hidden when VFIO adding/removing MMIO regions of the
> >noncoherent DMA devices on device attaching/de-attaching because
> >usually the MMIO regions will be disabled/enabled for several times during
> >guest PCI probing. And in KVM, TDP entries are all zapped on memslot
> >removal.
> >
> >However, this issue may appear when kvm_mmu_zap_all_fast() is not called
> >before KVM slot removal, e.g. as for TDX, only leaf entries for the
> >memslot to be removed is zapped.
> >
> >static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
> >			struct kvm_memory_slot *slot,
> >			struct kvm_page_track_notifier_node *node)
> >{
> >	if (kvm_gfn_shared_mask(kvm))
> >		/*
> >		 * Secure-EPT requires to release PTs from the leaf. The
> >		 * optimization to zap root PT first with child PT doesn't
> >		 * work.
> >		 */
> >		kvm_mmu_zap_memslot(kvm, slot);
> >	else
> >		kvm_mmu_zap_all_fast(kvm);
> >}
>
> TDX code isn't merged. So, I think you'd better not use TDX as an argument.
>
OK. I only wanted to show that kvm_mmu_zap_all_fast() is not always
desirable on memslot DELETE, and TDX is one example of that.

> >
> >And even without TDX's case, in some extreme conditions if MMIO regions
> >are not disabled during device attaching, e.g. if guest does not cause
> >the MMIO region disabling in QEMU.
> >Then TDP zap will not be called and wrong EPT memory type might be
> >retained.
> >
> >So, do the TDP zapping of all leaf entries when present/non-present state
> >of noncoherent DMA devices changes to ensure stale entries cleaned away.
> >And as this is not a frequent operation, the extra zap should be fine.
> >
> >Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
> >---
> > arch/x86/kvm/x86.c | 6 ++++--
> > 1 file changed, 4 insertions(+), 2 deletions(-)
> >
> >diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> >index e7f78fe79b32..99a825722d95 100644
> >--- a/arch/x86/kvm/x86.c
> >+++ b/arch/x86/kvm/x86.c
> >@@ -13145,13 +13145,15 @@ EXPORT_SYMBOL_GPL(kvm_arch_has_assigned_device);
> >
> > void kvm_arch_register_noncoherent_dma(struct kvm *kvm)
> > {
> >-	atomic_inc(&kvm->arch.noncoherent_dma_count);
> >+	if (atomic_inc_return(&kvm->arch.noncoherent_dma_count) == 1)
>
> >+		kvm_zap_gfn_range(kvm, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
>
> The issue is specific to EPT. shouldn't this be conditional on tdp_enabled, like
> update_mtrr()?
>
Yes, good point.
Maybe the check should also cover shadow_memtype_mask.

> Likewise, shouldn't we omit to call kvm_zap_gfn_range() in kvm_post_set_cr0() if
> tdp_enabled is false?
I think so.
The same tdp_enabled and shadow_memtype_mask check should apply in
update_mtrr() as well.

I will add a helper function in the next version.
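
To make the idea concrete, here is a rough user-space sketch of what such a
helper could look like. It is not the actual patch: struct mock_kvm, the
mock_* functions and the 0x38 mask value are stand-ins invented for
illustration, and only tdp_enabled and shadow_memtype_mask mirror real KVM
names. The zap is skipped unless TDP is on and the memory type is actually
encoded in the leaf entries, and it only fires on the 0 <-> !0 transitions
of the noncoherent DMA count.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* User-space stand-in for the parts of struct kvm used here (hypothetical). */
struct mock_kvm {
	atomic_int noncoherent_dma_count;
	unsigned int zap_calls;	/* how many times the full-range zap ran */
};

/* Stand-ins for the KVM globals of the same name; values are assumptions. */
static bool tdp_enabled = true;
static uint64_t shadow_memtype_mask = 0x38;

/* Stand-in for kvm_zap_gfn_range(kvm, gpa_to_gfn(0), gpa_to_gfn(~0ULL)). */
static void mock_zap_all(struct mock_kvm *kvm)
{
	kvm->zap_calls++;
}

/* Hypothetical helper: zap only if the memory type lives in TDP entries. */
static void mock_zap_on_memtype_change(struct mock_kvm *kvm)
{
	if (tdp_enabled && shadow_memtype_mask)
		mock_zap_all(kvm);
}

static void mock_register_noncoherent_dma(struct mock_kvm *kvm)
{
	/* fetch_add returns the old value, so 0 means a 0 -> 1 transition. */
	if (atomic_fetch_add(&kvm->noncoherent_dma_count, 1) == 0)
		mock_zap_on_memtype_change(kvm);
}

static void mock_unregister_noncoherent_dma(struct mock_kvm *kvm)
{
	int old = atomic_fetch_sub(&kvm->noncoherent_dma_count, 1);

	assert(old > 0);	/* unbalanced unregister would underflow */
	if (old == 1)		/* 1 -> 0 transition */
		mock_zap_on_memtype_change(kvm);
}
```

With this shape, registering a second device or unregistering one of two
devices does not zap, and nothing is zapped at all when tdp_enabled is false.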

Thanks, Chao!
>
> > }
> > EXPORT_SYMBOL_GPL(kvm_arch_register_noncoherent_dma);
> >
> > void kvm_arch_unregister_noncoherent_dma(struct kvm *kvm)
> > {
> >-	atomic_dec(&kvm->arch.noncoherent_dma_count);
> >+	if (!atomic_dec_return(&kvm->arch.noncoherent_dma_count))
> >+		kvm_zap_gfn_range(kvm, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
> > }
> > EXPORT_SYMBOL_GPL(kvm_arch_unregister_noncoherent_dma);
> >
> >--
> >2.17.1
> >
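
For reference, the two memory-type cases from the commit message can be
modelled roughly as below. This is a simplified sketch only:
model_ept_memtype() is a made-up name, and it ignores MMIO pages and the
CD/NW quirk handling. The constants use the values from the x86 headers. It
shows why a leaf entry created while the count was 0 becomes stale once the
count is non-zero: the same guest settings produce a different EPT memory
type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values as defined in the x86 headers. */
#define MTRR_TYPE_UNCACHABLE	0
#define MTRR_TYPE_WRBACK	6
#define VMX_EPT_MT_EPTE_SHIFT	3
#define VMX_EPT_IPAT_BIT	(1ULL << 6)

/*
 * Simplified, hypothetical model of the EPT memory-type choice.
 * Ignores MMIO pages and the CD/NW quirk.
 */
static uint64_t model_ept_memtype(int noncoherent_dma_count,
				  bool guest_cr0_cd,
				  uint8_t guest_mtrr_type)
{
	assert(guest_mtrr_type <= 7);	/* memory type is a 3-bit field */

	/* No noncoherent DMA: force WB and ignore guest PAT. */
	if (noncoherent_dma_count == 0)
		return ((uint64_t)MTRR_TYPE_WRBACK << VMX_EPT_MT_EPTE_SHIFT) |
		       VMX_EPT_IPAT_BIT;

	/* Noncoherent DMA present: honor guest CR0.CD ... */
	if (guest_cr0_cd)
		return ((uint64_t)MTRR_TYPE_UNCACHABLE << VMX_EPT_MT_EPTE_SHIFT) |
		       VMX_EPT_IPAT_BIT;

	/* ... and otherwise the guest MTRR type. */
	return (uint64_t)guest_mtrr_type << VMX_EPT_MT_EPTE_SHIFT;
}
```

For a guest range marked UC in the MTRRs, the model yields WB with IPAT set
while the count is 0, and UC without IPAT once a noncoherent DMA device is
registered, so entries built before the transition carry the wrong type until
they are zapped.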