Re: [PATCH 2/4] sched: Account per task_group nr_iowait

From: Kirill Tkhai
Date: Mon Nov 06 2017 - 11:13:08 EST


On 06.11.2017 19:06, Peter Zijlstra wrote:
> On Mon, Nov 06, 2017 at 05:40:32PM +0300, Kirill Tkhai wrote:
>> The patch makes number of task_group's tasks in iowait state
>> be tracked separately. This may be useful for containers to
>> check nr_iowait state of a single one.
>>
>> Signed-off-by: Kirill Tkhai <ktkhai@xxxxxxxxxxxxx>
>> ---
>> kernel/sched/core.c | 45 +++++++++++++++++++++++++++++++++++++++++++++
>> kernel/sched/sched.h | 5 +++++
>> 2 files changed, 50 insertions(+)
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 712ee54edaa1..86d1ad5f49bd 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -796,12 +796,32 @@ void deactivate_task(struct rq *rq, struct task_struct *p, int flags)
>>
>> static void task_iowait_start(struct rq *rq, struct task_struct *p)
>> {
>> +#ifdef CONFIG_CGROUP_SCHED
>> + struct task_group *tg = task_group(p);
>> +
>> + /* Task's sched_task_group is changed under both of the below locks */
>> + BUG_ON(!raw_spin_is_locked(&p->pi_lock) && !raw_spin_is_locked(&rq->lock));
>
> We have lockdep_assert_held for that.
>
>> + while (task_group_is_autogroup(tg))
>> + tg = tg->parent;
>> + atomic_inc(&tg->stat[rq->cpu].nr_iowait);
>
> You're joking right, more atomic ops on the fast paths..

Some synchronization is needed here... The counter is modified under rq->lock everywhere except in try_to_wake_up().
Would it be better to take one more rq->lock in try_to_wake_up() instead of using an atomic?
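
For reference, a rough sketch (not part of the patch) of what that alternative could look like, combining it with the lockdep_assert_held() suggestion above. The plain non-atomic counter is only valid under the assumption that *every* updater, including the try_to_wake_up() path, holds rq->lock; that locking change itself is not shown here and is the open question:

```c
/*
 * Sketch only: assumes all updaters are serialized by rq->lock,
 * which would require try_to_wake_up() to take the remote rq->lock.
 * The tg->stat[] layout is taken from the patch; making nr_iowait
 * a plain int instead of atomic_t is the assumed change.
 */
static void task_iowait_start(struct rq *rq, struct task_struct *p)
{
#ifdef CONFIG_CGROUP_SCHED
	struct task_group *tg = task_group(p);

	/* Replaces the BUG_ON(): checked only when lockdep is enabled. */
	lockdep_assert_held(&rq->lock);

	while (task_group_is_autogroup(tg))
		tg = tg->parent;
	tg->stat[rq->cpu].nr_iowait++;	/* plain increment under rq->lock */
#endif
	atomic_inc(&rq->nr_iowait);
	delayacct_blkio_start();
}
```

The trade-off being weighed: this removes the extra atomic from the fast path, but at the cost of extra rq->lock acquisition in the wakeup path.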

>> +#endif
>> +
>> atomic_inc(&rq->nr_iowait);
>> delayacct_blkio_start();