Re: [RFC PATCH 3/3] KVM: x86/mmu: skip zap maybe-dma-pinned pages for NUMA migration

From: Yan Zhao
Date: Thu Aug 10 2023 - 05:35:26 EST


On Wed, Aug 09, 2023 at 08:59:16AM -0300, Jason Gunthorpe wrote:
> On Wed, Aug 09, 2023 at 08:11:17AM +0800, Yan Zhao wrote:
>
> > > Can we just tell userspace to mbind() the pinned region to explicitly exclude the
> > > VMA(s) from NUMA balancing?
>
> > For VMs with VFIO mdev mediated devices, the VMAs to be pinned are
> > dynamic, I think it's hard to mbind() in advance.
>
> It is hard to view the mediated devices path as a performance path
> that deserves this kind of intervention :\

You are right, but maybe we can still make it better?

What about introducing a new callback that is invoked once a page is
confirmed to be PROT_NONE-protected for NUMA balancing?

Then, rather than duplicating mm logic in KVM, KVM can rely on this callback
and unmap from the secondary MMU only those pages that are indeed
PROT_NONE-protected for NUMA balancing, excluding pages that are obviously
non-NUMA-migratable.
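
To make the idea concrete, here is a minimal userspace model of the flow.
It is only a sketch and not the RFC v2 code: struct page_model,
numa_protect_cb, numa_protect_range() and secondary_mmu_unmap() are all
hypothetical names, and the maybe_dma_pinned flag merely stands in for
whatever check mm already performs. The point is that the skip decision
stays on the mm side, and the secondary MMU only reacts to pages that were
actually made PROT_NONE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of one page; not a kernel structure. */
struct page_model {
	bool maybe_dma_pinned;    /* stands in for mm's "maybe pinned" check */
	bool prot_none;           /* set once mm has made the mapping PROT_NONE */
	bool mapped_in_secondary; /* still mapped in the secondary MMU */
};

/* Hypothetical callback: invoked only for pages that became PROT_NONE. */
typedef void (*numa_protect_cb)(struct page_model *page);

/* Secondary MMU side: drop the mapping for a page reported by mm. */
static void secondary_mmu_unmap(struct page_model *page)
{
	page->mapped_in_secondary = false;
}

/*
 * mm side: walk the range, skip pages that cannot be NUMA-migrated, and
 * notify the secondary MMU only for pages that were really protected.
 * Returns the number of pages made PROT_NONE.
 */
static size_t numa_protect_range(struct page_model *pages, size_t n,
				 numa_protect_cb cb)
{
	size_t nr_protected = 0;

	assert(pages != NULL || n == 0);

	for (size_t i = 0; i < n; i++) {
		if (pages[i].maybe_dma_pinned)
			continue; /* left untouched, so no callback and no zap */

		pages[i].prot_none = true;
		nr_protected++;

		if (cb)
			cb(&pages[i]);
	}

	return nr_protected;
}
```

In this model a pinned page keeps its secondary MMU mapping because mm never
reports it, so KVM does not need its own copy of the pinned-page check.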

I sent an RFC v2 (commit messages and comments are not yet well polished) to
show this idea:
https://lore.kernel.org/all/20230810085636.25914-1-yan.y.zhao@xxxxxxxxx/

Do you think we can continue the work?

Thanks a lot for your review!