Re: [PATCH 7/8] KVM: gmem: Avoid race with kvm_gmem_release and mmu notifier

From: Sean Christopherson
Date: Fri Aug 18 2023 - 14:16:07 EST


On Tue, Aug 15, 2023, isaku.yamahata@xxxxxxxxx wrote:
> From: Isaku Yamahata <isaku.yamahata@xxxxxxxxx>
>
> Add slots_lock around kvm_flush_shadow_all(). kvm_gmem_release() via
> fput() and kvm_mmu_notifier_release() via mmput() can be called
> simultaneously on process exit because vhost, /dev/vhost_{net, vsock}, can
> delay the call to release mmu_notifier, kvm_mmu_notifier_release() by its
> kernel thread. Vhost uses get_task_mm() and mmput() for the kernel thread
> to access process memory. mmput() can defer after closing the file.
>
> kvm_flush_shadow_all() and kvm_gmem_release() can be called simultaneously.

KVM shouldn't reclaim memory on file release, it should instead do that on the
inode being "evicted": https://lore.kernel.org/all/ZLGiEfJZTyl7M8mS@xxxxxxxxxx

> With TDX KVM, HKID releasing by kvm_flush_shadow_all() and private memory
> releasing by kvm_gmem_release() can race. Add slots_lock to
> kvm_mmu_notifier_release().

No, the right answer is to not release the HKID until the VM is destroyed. gmem
has a reference to its associated kvm instance, and so that will naturally ensure
memory all memory encrypted with the HKID is freed before the HKID is released.
kvm_flush_shadow_all() should only tear down page tables, it shouldn't be freeing
guest_memfd memory.

Then patches 6-8 go away.