Re: [PATCH] KVM: x86: Zap all TDP leaf entries according noncoherent DMA count

From: Chao Gao
Date: Mon May 08 2023 - 23:11:50 EST


On Tue, May 09, 2023 at 09:40:37AM +0800, Yan Zhao wrote:
>On Mon, May 08, 2023 at 03:19:56PM +0800, Chao Gao wrote:
>> >However, this issue may appear when kvm_mmu_zap_all_fast() is not called
>> >before KVM slot removal, e.g. as for TDX, only leaf entries for the
>> >memslot to be removed is zapped.
>> >
>> >static void kvm_mmu_invalidate_zap_pages_in_memslot(struct kvm *kvm,
>> > struct kvm_memory_slot *slot,
>> > struct kvm_page_track_notifier_node *node)
>> >{
>> > if (kvm_gfn_shared_mask(kvm))
>> > /*
>> > * Secure-EPT requires to release PTs from the leaf. The
>> > * optimization to zap root PT first with child PT doesn't
>> > * work.
>> > */
>> > kvm_mmu_zap_memslot(kvm, slot);
>> > else
>> > kvm_mmu_zap_all_fast(kvm);
>> >}
>>
>> TDX code isn't merged. So, I think you'd better not use TDX as an argument.
>>
>Ok. But I just want to explain that kvm_mmu_zap_all_fast() is not
>desired in some cases during slot DELETE. TDX is a good example here.

Ok. Maybe just remove the above code snippet. The code snippet is confusing
because people may think it is what current KVM has, but actually it turns out
to be a modified version in pending TDX patches, which are subject to change.

>
>> >
>> >And even without TDX's case, in some extreme conditions if MMIO regions
>> >are not disabled during device attaching, e.g. if guest does not cause
>> >the MMIO region disabling in QEMU.
>> >Then TDP zap will not be called and wrong EPT memory type might be
>> >retained.
>> >
>> >So, do the TDP zapping of all leaf entries when present/non-present state
>> >of noncoherent DMA devices changes to ensure stale entries cleaned away.
>> >And as this is not a frequent operation, the extra zap should be fine.
>> >
>> >Signed-off-by: Yan Zhao <yan.y.zhao@xxxxxxxxx>
>> >---
>> > arch/x86/kvm/x86.c | 6 ++++--
>> > 1 file changed, 4 insertions(+), 2 deletions(-)
>> >
>> >diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> >index e7f78fe79b32..99a825722d95 100644
>> >--- a/arch/x86/kvm/x86.c
>> >+++ b/arch/x86/kvm/x86.c
>> >@@ -13145,13 +13145,15 @@ EXPORT_SYMBOL_GPL(kvm_arch_has_assigned_device);
>> >
>> > void kvm_arch_register_noncoherent_dma(struct kvm *kvm)
>> > {
>> >- atomic_inc(&kvm->arch.noncoherent_dma_count);
>> >+ if (atomic_inc_return(&kvm->arch.noncoherent_dma_count) == 1)
>>
>> >+ kvm_zap_gfn_range(kvm, gpa_to_gfn(0), gpa_to_gfn(~0ULL));
>>
>> The issue is specific to EPT. shouldn't this be conditional on tdp_enabled, like
>> update_mtrr()?
>>
>Yes. good point.
>Maybe also include checking of shadow_memtype_mask.

looks good.