Re: Is it possible to implement the per-node page cache for programs/libraries?

From: Linus Torvalds
Date: Wed Sep 01 2021 - 21:14:21 EST


On Wed, Sep 1, 2021 at 5:15 PM Barry Song <21cnbao@xxxxxxxxx> wrote:
>
> In case we are running mysql on a machine with 128 cores
> (4 NUMA nodes, 32 cores in each node), how will the reflink help the
> single mysql process to leverage its local libc copy?

That's a fundamentally harder problem anyway, and for the foreseeable
future you should expect the answer to that to be "Not a way in hell".

Because it's not about "local libc copies" at that point any more,
it's about "a single process only has a single page table".

So a single process will have a particular virtual address mapped to
*one* physical page. And no, it doesn't matter how many threads you
have. What makes them threads - not processes - is that they share the
same VM image.
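
The thread-versus-process distinction can be seen from user space with a
minimal sketch like the one below. It is only an illustration, assuming a
Unix-like system where os.fork() is available; the names counter, bump,
run_in_thread and run_in_child_process are made up for this example. A
write made by a thread is visible to the rest of the process because both
use the same address space, while a write made by a forked child lands in
the child's own copy-on-write copy.

```python
import os
import threading

# Lives in this process's one address space (hypothetical example state).
counter = [0]


def bump():
    counter[0] += 1


def run_in_thread():
    # The thread shares the parent's address space, so its write is
    # visible here after join().
    t = threading.Thread(target=bump)
    t.start()
    t.join()
    return counter[0]


def run_in_child_process():
    # The forked child gets its own address space (copy-on-write), so its
    # write never reaches the parent's copy of `counter`.
    pid = os.fork()
    if pid == 0:
        bump()
        os._exit(0)
    os.waitpid(pid, 0)
    return counter[0]


before = counter[0]
after_thread = run_in_thread()
after_fork = run_in_child_process()
print(before, after_thread, after_fork)
```

The same sharing that makes the thread's write visible is what rules out
per-thread physical pages for one virtual address.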

So the only way you will have local NUMA copies is if you
(a) run multiple processes
(b) bind each process to a particular NUMA node
(c) do something special to then have per-node mappings

That "(c)" is what is up for discussion, whether it be with various
user mode hacks, or the "NUMA COW" thing, or whatever.

But (a) and (b) are basically required.
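
Steps (a) and (b) can be sketched from user space roughly as follows.
This is a hedged illustration, not something proposed in this thread: it
assumes Linux, reads each node's CPU list from
/sys/devices/system/node/nodeN/cpulist, and uses os.sched_setaffinity()
to pin one forked worker per node. The helper names (parse_cpulist,
node_cpus, spawn_bound_worker, report) are hypothetical. It does nothing
about (c): every worker still maps the same shared page-cache pages.

```python
import os

NODE_DIR = "/sys/devices/system/node"  # Linux sysfs NUMA topology


def parse_cpulist(text):
    """Parse a kernel cpulist string such as '0-3,8-11' into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus


def node_cpus(node):
    with open(f"{NODE_DIR}/node{node}/cpulist") as f:
        return parse_cpulist(f.read())


def report(cpus):
    print(f"pid {os.getpid()} bound to CPUs {sorted(os.sched_getaffinity(0))}")


def spawn_bound_worker(cpus, work):
    """(a) start a separate process, (b) restrict it to one node's CPUs."""
    pid = os.fork()
    if pid == 0:
        os.sched_setaffinity(0, cpus)
        work(cpus)
        os._exit(0)
    return pid


def main():
    if not os.path.isdir(NODE_DIR):
        print("no NUMA topology exposed in sysfs")
        return
    allowed = os.sched_getaffinity(0)
    pids = []
    for name in sorted(os.listdir(NODE_DIR)):
        if not (name.startswith("node") and name[4:].isdigit()):
            continue
        # Skip nodes with no usable CPUs (CPU-less node, or outside our cpuset).
        cpus = node_cpus(int(name[4:])) & allowed
        if cpus:
            pids.append(spawn_bound_worker(cpus, report))
    for pid in pids:
        os.waitpid(pid, 0)


if __name__ == "__main__":
    main()
```

CPU affinity only controls where each worker runs. Anonymous memory the
worker touches afterwards will normally be allocated on its local node
under the default policy, but shared file pages such as libc text are not
duplicated, which is exactly the open "(c)" part.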

Linus