Re: [RFC] proc: Add a new isolated /proc/pid/mempolicy type.

From: Abel Wu
Date: Mon Sep 26 2022 - 23:22:17 EST


Hi Michal, thanks very much for your patience!

On 9/26/22 10:08 PM, Michal Hocko wrote:
On Mon 26-09-22 20:53:19, Zhongkun He wrote:
[Cc linux-api - please do so for any patches making/updating
kernel<->user interfaces]


On Mon 26-09-22 17:10:33, hezhongkun wrote:
From: Zhongkun He <hezhongkun.hzk@xxxxxxxxxxxxx>

/proc/pid/mempolicy can be used to check and adjust the userspace task's
mempolicy dynamically. In many cases, the application and the control plane
are two separate systems. When the application is created, it doesn't know
how to use memory, and it doesn't care. The control plane will decide the
memory usage policy based on different reasons. In that case, we can
dynamically adjust the mempolicy using the /proc/pid/mempolicy interface.

Is there any reason to make it procfs interface rather than pidfd one?

Hi Michal, thanks for your reply.

I just think that it is easy to display and adjust the mempolicy using
procfs. But it may not be suitable, so I will send a pidfd_set_mempolicy
patch later.

The proc interface has many usability issues. That is why pidfd has been
introduced. So I would rather go with the pidfd interface than repeat
old proc API mistakes.

I can't agree more.


Btw, in order to add a per-thread-group mempolicy, is it possible to add
a mempolicy to mm_struct?

I dunno. This would make the mempolicy interface even more confusing.
Per-mm behavior makes a lot of sense, but we already have per-thread
semantics, so I would stick to them rather than introduce new semantics.

Why is this really important?

We want soft control over the memory footprint of background jobs by
applying NUMA preferences when necessary, so the impact on different NUMA
nodes can be managed to some extent. These NUMA preferences are given by
the control plane, and it might not be suitable for them to overwrite the
policies of tasks that already have specific memory policies (or vice
versa).

Best Regards,
Abel