Re: [PATCH v2] cgroup/cpuset: Change nr_deadline_tasks to an atomic_t value

From: Waiman Long
Date: Wed Nov 01 2023 - 14:00:49 EST


On 11/1/23 12:34, Michal Koutný wrote:
> On Tue, Oct 24, 2023 at 10:18:34AM -0400, Waiman Long <longman@xxxxxxxxxx> wrote:
>> The nr_deadline_tasks field in cpuset structure was introduced by
>> commit 6c24849f5515 ("sched/cpuset: Keep track of SCHED_DEADLINE task
>> in cpusets"). Unlike nr_migrate_dl_tasks which is only modified under
>> cpuset_mutex, nr_deadline_tasks can be updated under two different
>> locks - cpuset_mutex in most cases or css_set_lock in cgroup_exit(). As
>> a result, data races can happen leading to incorrect nr_deadline_tasks
>> value.
> The effect is that dl_update_tasks_root_domain() processes tasks
> unnecessarily or that it incorrectly skips dl_add_task_root_domain()?
The effect is that dl_update_tasks_root_domain() may either return early when it should not or do unnecessary work. I will update the commit log to reflect that.

>> Since it is not practical to somehow take cpuset_mutex in cgroup_exit(),
>> the easy way out to avoid this possible race condition is by making
>> nr_deadline_tasks an atomic_t value.
> If css_set_lock is useless for this field and it's going to be atomic,
> could you please add (presumably) a cleanup that moves dec_dl_tasks_cs()
> from under css_set_lock in cgroup_exit() to a (new but specific)
> cpuset_cgrp_subsys.exit() handler?

But css_set_lock is needed for updating other css data. It is true that we can move dec_dl_tasks_cs() outside of the lock. I can do that in the next version.
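
To illustrate why an atomic counter makes the choice of lock irrelevant, here is a minimal user-space sketch. It is not kernel code: two pthread mutexes stand in for cpuset_mutex and css_set_lock, C11 atomic_int stands in for atomic_t, and the names lock_a, lock_b, nr_tasks, inc_under_a(), dec_under_b() and run_demo() are hypothetical. One thread increments the counter under one lock while another decrements it under a different lock; because each update is an atomic read-modify-write, no update is lost even though the two locks do not exclude each other.

```c
#include <pthread.h>
#include <stdatomic.h>

#define ITERS 1000000

/* Hypothetical stand-ins for cpuset_mutex and css_set_lock. */
static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER;

/*
 * Stand-in for nr_deadline_tasks. With a plain int and "nr_tasks++",
 * the two threads below would race, since they hold different locks.
 */
static atomic_int nr_tasks;

/* Increments the counter while holding only lock_a. */
static void *inc_under_a(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERS; i++) {
		pthread_mutex_lock(&lock_a);
		atomic_fetch_add(&nr_tasks, 1);
		pthread_mutex_unlock(&lock_a);
	}
	return NULL;
}

/* Decrements the counter while holding only lock_b. */
static void *dec_under_b(void *arg)
{
	(void)arg;
	for (int i = 0; i < ITERS; i++) {
		pthread_mutex_lock(&lock_b);
		atomic_fetch_sub(&nr_tasks, 1);
		pthread_mutex_unlock(&lock_b);
	}
	return NULL;
}

/* Runs both threads to completion and returns the final counter value. */
int run_demo(void)
{
	pthread_t t1, t2;

	atomic_store(&nr_tasks, 0);
	pthread_create(&t1, NULL, inc_under_a, NULL);
	pthread_create(&t2, NULL, dec_under_b, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	return atomic_load(&nr_tasks);
}
```

Built with -pthread and called from a main(), run_demo() returns 0: the equal numbers of increments and decrements cancel exactly. The locks in the sketch protect nothing about the counter itself, which mirrors the point that dec_dl_tasks_cs() no longer needs to sit under css_set_lock once the field is atomic.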

Cheers,
Longman