Re: [RFC PATCH 00/11] mm/mempolicy: Make task->mempolicy externally modifiable via syscall and procfs

From: Andrew Morton
Date: Wed Nov 22 2023 - 16:33:53 EST


On Wed, 22 Nov 2023 16:11:49 -0500 Gregory Price <gourry.memverge@xxxxxxxxx> wrote:

> The patch set changes task->mempolicy to be modifiable by tasks other
> than just current.
>
> The ultimate goal is to make mempolicy more flexible and extensible,
> such as adding interleave weights (which may need to change at runtime
> due to hotplug events). Making mempolicy externally modifiable allows
> for userland daemons to make runtime performance adjustments to running
> tasks without that software needing to be made numa-aware.

Please add to this [0/N] a full description of the security aspect: who
can modify whose mempolicy, along with a full description of the
reasoning behind this decision.

> 3. Add external interfaces which allow for a task mempolicy to be
>    modified by another task. This is implemented in 4 syscalls
>    and a procfs interface:
>         sys_set_task_mempolicy
>         sys_get_task_mempolicy
>         sys_set_task_mempolicy_home_node
>         sys_task_mbind
>         /proc/[pid]/mempolicy

Why is the procfs interface needed? Doesn't it simply duplicate the
syscall interface? Please update [0/N] with a description of this
decision.

> The new syscalls are the same as their current-task counterparts,
> except that they take a pid as an argument. The exception is
> task_mbind, which required a new struct due to the number of args.
>
> The /proc/pid/mempolicy interface reuses the mpol_parse_str format
> to enable get/set of mempolicy via procfs.
>
> mpol_parse_str format:
>         <mode>[=<flags>][:<nodelist>]
>
> Example usage:
>
>         echo "default" > /proc/pid/mempolicy
>         echo "prefer=relative:0" > /proc/pid/mempolicy
>         echo "interleave:0-3" > /proc/pid/mempolicy

What do we get when we read from this? Please add to changelog.
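
For reference, the <mode>[=<flags>][:<nodelist>] format quoted above can
be split into its parts roughly as follows. This is only a userspace
sketch in Python, not the kernel's mpol_parse_str(): the names
parse_mpol_str and parse_nodelist are hypothetical, and the sketch does
not check the mode or flags against the set the kernel accepts.

```python
def parse_nodelist(text):
    """Expand a nodelist such as "0-3,8" into a sorted list of node ids."""
    nodes = set()
    for part in text.split(","):
        if "-" in part:
            low, high = (int(x) for x in part.split("-", 1))
            if low > high:
                raise ValueError("bad node range: " + part)
            nodes.update(range(low, high + 1))
        else:
            # int("") raises ValueError, so an empty entry is rejected.
            nodes.add(int(part))
    return sorted(nodes)


def parse_mpol_str(text):
    """Split "<mode>[=<flags>][:<nodelist>]" into (mode, flags, nodes).

    Hypothetical helper for illustration; it performs no validation of
    the mode or flag names themselves.
    """
    head, colon, nodelist = text.strip().partition(":")
    mode, _, flags = head.partition("=")
    if not mode:
        raise ValueError("missing mode")
    nodes = parse_nodelist(nodelist) if colon else []
    return mode, (flags or None), nodes


if __name__ == "__main__":
    # The three strings from the example usage in the cover letter.
    for s in ("default", "prefer=relative:0", "interleave:0-3"):
        print(s, "->", parse_mpol_str(s))
```

Whether a read of /proc/pid/mempolicy returns a string in this same
format is exactly the open question; the sketch only covers the write
side as described in the cover letter.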

> Changing the mempolicy does not induce memory migrations via the
> procfs interface (which is the exact same behavior as set_mempolicy).
>