Re: [RFC PATCH v3 1/2] mempinfd: Add new syscall to provide memory pin

From: Matthew Wilcox
Date: Sun Feb 07 2021 - 20:32:03 EST


On Sun, Feb 07, 2021 at 10:24:28PM +0000, Song Bao Hua (Barry Song) wrote:
> > > In high-performance I/O cases, accelerators might want to perform
> > > I/O on memory without I/O page faults, which can dramatically
> > > increase latency. Current memory-related APIs cannot meet this
> > > requirement: e.g. mlock only prevents memory from being swapped out
> > > to a backing device; page migration can still trigger I/O page faults.
> >
> > Well ... we have two requirements. The application wants to not take
> > page faults. The system wants to move the application to a different
> > NUMA node in order to optimise overall performance. Why should the
> > application's desires take precedence over the kernel's desires? And why
> > should it be done this way rather than by the sysadmin using numactl to
> > lock the application to a particular node?
>
> The NUMA balancer is just one of many reasons for page migration. Even a
> simple alloc_pages() can cause memory migration within a single NUMA
> node or on a UMA system.
>
> The other reasons for page migration include but are not limited to:
> * memory moves due to CMA
> * memory moves due to huge page creation
>
> We can hardly ask users to disable COMPACTION, CMA and huge pages
> across the whole system.

You're dodging the question. Should the CMA allocation fail because
another application is using SVA?

I would say no. The application using SVA should take the one-time
performance hit from having its memory moved around.