Re: [PATCH] sched/core: Use empty mask to reset cpumasks in sched_setaffinity()

From: Peter Zijlstra
Date: Mon Jul 03 2023 - 06:26:28 EST


On Wed, Jun 28, 2023 at 05:16:37PM -0400, Waiman Long wrote:
> Since commit 8f9ea86fdf99 ("sched: Always preserve the user requested
> cpumask"), user provided CPU affinity via sched_setaffinity(2) is
> preserved even if the task is being moved to a different cpuset. However,
> that affinity is also being inherited by any subsequently created child
> processes which may not want or be aware of that affinity.
>
> One way to solve this problem is to provide a way to back off from
> that user provided CPU affinity. This patch implements such a scheme
> by using an empty cpumask to signal a reset of the cpumasks to the
> default as allowed by the current cpuset.
>
> Before this patch, passing in an empty cpumask to sched_setaffinity(2)
> will return an EINVAL error. With this patch, an error will no longer
> be returned. Instead, the user_cpus_ptr that stores the user provided
> affinity, if set, will be cleared and the task's CPU affinity will be
> reset to that of the current cpuset. This reverts the cpumask change
> done by all the previous sched_setaffinity(2) calls.
>

This is a user visible ABI change -- but with very limited motivation.
Why do we want this? Who will use this?
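
For reference, the user-visible difference can be probed from userspace with
something like the sketch below. The helper name try_empty_mask() is made up
for illustration and is not part of the patch. It passes an all-zero mask for
the calling thread and reports what the kernel said. Per the changelog above,
an unpatched kernel should report EINVAL and a patched one should report
success.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* needed for cpu_set_t and the CPU_* macros */
#endif
#include <errno.h>
#include <sched.h>

/*
 * Hypothetical helper: call sched_setaffinity(2) on the calling thread
 * with an empty CPU mask.
 *
 * Returns 0 if the kernel accepted the empty mask, otherwise the errno
 * value it reported.
 */
static int try_empty_mask(void)
{
	cpu_set_t empty;

	CPU_ZERO(&empty);
	if (sched_setaffinity(0, sizeof(empty), &empty) == 0)
		return 0;

	return errno;
}
```

Any existing program that relies on the EINVAL return to detect a bogus mask
would see its behaviour change, which is why this counts as an ABI change.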

> Signed-off-by: Waiman Long <longman@xxxxxxxxxx>
> ---
> kernel/sched/core.c | 26 +++++++++++++++++++++-----
> 1 file changed, 21 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index c52c2eba7c73..f4806d969fc9 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -8317,7 +8317,12 @@ __sched_setaffinity(struct task_struct *p, struct affinity_context *ctx)
>  	}
>
>  	cpuset_cpus_allowed(p, cpus_allowed);
> -	cpumask_and(new_mask, ctx->new_mask, cpus_allowed);
> +
> +	/* Default to cpus_allowed with NULL new_mask */
> +	if (ctx->new_mask)
> +		cpumask_and(new_mask, ctx->new_mask, cpus_allowed);
> +	else
> +		cpumask_copy(new_mask, cpus_allowed);
>
>  	ctx->new_mask = new_mask;
>  	ctx->flags |= SCA_CHECK;
> @@ -8366,6 +8371,7 @@ __sched_setaffinity(struct task_struct *p, struct affinity_context *ctx)
>
>  long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
>  {
> +	bool reset_cpumasks = cpumask_empty(in_mask);
>  	struct affinity_context ac;
>  	struct cpumask *user_mask;
>  	struct task_struct *p;
> @@ -8403,13 +8409,23 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
>  		goto out_put_task;
>
>  	/*
> -	 * With non-SMP configs, user_cpus_ptr/user_mask isn't used and
> -	 * alloc_user_cpus_ptr() returns NULL.
> +	 * If an empty cpumask is passed in, clear user_cpus_ptr, if set,
> +	 * and reset the current cpu affinity to the default for the
> +	 * current cpuset.
>  	 */
> -	user_mask = alloc_user_cpus_ptr(NUMA_NO_NODE);
> +	if (reset_cpumasks) {
> +		in_mask = NULL;	/* To be updated in __sched_setaffinity */
> +		user_mask = NULL;
> +	} else {
> +		/*
> +		 * With non-SMP configs, user_cpus_ptr/user_mask isn't used
> +		 * and alloc_user_cpus_ptr() returns NULL.
> +		 */
> +		user_mask = alloc_user_cpus_ptr(NUMA_NO_NODE);
> +	}
>  	if (user_mask) {
>  		cpumask_copy(user_mask, in_mask);
> -	} else if (IS_ENABLED(CONFIG_SMP)) {
> +	} else if (!reset_cpumasks && IS_ENABLED(CONFIG_SMP)) {
>  		retval = -ENOMEM;
>  		goto out_put_task;
>  	}
> --
> 2.31.1
>
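
The mask selection in the first hunk can be modelled in userspace roughly as
follows. This is only a sketch using glibc's cpu_set_t in place of struct
cpumask. The function name effective_mask() and its parameters are invented
for illustration. A NULL request stands in for the "reset" case and yields
whatever the cpuset allows, while any other request is intersected with it.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE	/* needed for cpu_set_t and the CPU_* macros */
#endif
#include <stddef.h>
#include <sched.h>

/*
 * Hypothetical userspace model of the new_mask computation:
 *   requested == NULL  -> fall back to everything the cpuset allows
 *   otherwise          -> requested AND cpuset_allowed
 */
static void effective_mask(const cpu_set_t *requested,
			   const cpu_set_t *cpuset_allowed,
			   cpu_set_t *out)
{
	if (requested)
		CPU_AND(out, requested, cpuset_allowed);
	else
		*out = *cpuset_allowed;
}
```

Note that in the non-NULL case the result can still come out empty when the
request and the cpuset do not overlap. The model does not cover how the kernel
handles that case.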