Re: [PATCH] sched/core: Use empty mask to reset cpumasks in sched_setaffinity()

From: Waiman Long
Date: Wed Jul 05 2023 - 10:08:30 EST



On 7/5/23 05:37, Peter Zijlstra wrote:
> On Mon, Jul 03, 2023 at 10:55:02AM -0400, Waiman Long wrote:
>
> > Our OpenShift team has actually hit a problem with the recent persistent
> > user provided cpu affinity change because they are relying on the fact that
> > moving a task to a different cpuset will reset cpu affinity to the cpuset
> > default which is no longer true. That is the main reason behind this patch
> > to provide a way to reset cpu affinity to the cpuset default.
>
> Where is the sched_setaffinity() in that story?
>
> So somewhere this thing did a sched_setaffinity() and then starts
> playing with cpusets. Instead of adding more sched_setaffinity() calls,
> can't it just remove some?

I don't know the full picture. From what I understand, there is a master control process that restricts its cpu affinity to a small set of housekeeping CPUs. It then spawns child processes to be run in different containers. The control process doesn't need to change its cpu affinity.

In the past, putting a child process in a different container (cpuset) would reset its affinity to that of the cpuset. That is no longer true because user_cpus_ptr is inherited by the forked child process. I have thought of 2 ways to address that: either we introduce a new clone flag to disable the inheritance of user_cpus_ptr, or we provide a way to reset the cpu affinity to the default, which is what this patch does.
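
The inheritance half of this can be observed from user space. Below is a minimal sketch, assuming Linux and Python's os.sched_setaffinity()/os.sched_getaffinity() wrappers; the helper name child_affinity_after_fork is made up for illustration. It only shows that a mask set with sched_setaffinity() in the parent is carried across fork(); it does not touch cpusets or look at user_cpus_ptr directly.

```python
import os


def child_affinity_after_fork(parent_cpus):
    """Pin the caller to parent_cpus, fork, and return the child's affinity set.

    Hypothetical helper for illustration only. Linux-specific.
    """
    original = os.sched_getaffinity(0)  # remember the mask so we can restore it
    read_fd, write_fd = os.pipe()
    os.sched_setaffinity(0, parent_cpus)  # what the control process does
    try:
        pid = os.fork()
        if pid == 0:
            # Child: report the affinity it inherited, then exit immediately.
            try:
                os.close(read_fd)
                cpus = sorted(os.sched_getaffinity(0))
                os.write(write_fd, ",".join(map(str, cpus)).encode())
            finally:
                os._exit(0)
        # Parent: collect the child's report.
        os.close(write_fd)
        data = os.read(read_fd, 4096)
        os.close(read_fd)
        os.waitpid(pid, 0)
        return {int(c) for c in data.decode().split(",")}
    finally:
        # Only the parent reaches this point; undo the pinning.
        os.sched_setaffinity(0, original)


if __name__ == "__main__":
    housekeeping = {min(os.sched_getaffinity(0))}  # stand-in for housekeeping CPUs
    print("parent pinned to:", housekeeping)
    print("child inherited: ", child_affinity_after_fork(housekeeping))
```

Whether moving such a child into another cpuset afterwards widens its mask again is exactly the behaviour that changed, and this sketch does not try to show that part.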

Cheers,
Longman