Re: [PATCH 1/2] nohz: use seqlock to avoid race on idle time stats v2

From: Denys Vlasenko
Date: Sat Apr 05 2014 - 10:58:04 EST


On Sat, Apr 5, 2014 at 12:08 PM, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
>> > Iowait makes sense but not per cpu. Eventually it's a global
>> > stat. Or per task.
>>
>> There are a lot of situations where admins want to know
>> how much, on average, their CPUs are idle because
>> they wait for IO.
>>
>> If you are running, say, a Web cache,
>> you need to know that stat in order to be able to
>> conjecture "looks like I'm IO bound, perhaps caching
>> some data in RAM will speed it up".
>
> But then accounting iowait time waited until completion on the CPU
> that the task wakes up should work for that.
>
> Doesn't it?

It can easily make iowait count higher than idle count,
or even higher than idle+sys+user+nice count.

IOW, it can show that the system is way more
than 100% IO bound, which doesn't make sense.
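
To put numbers on that, here is a toy model of the proposed
scheme. It is plain Python, not kernel code, and the function
name and the figures are made up for illustration. It assumes
several tasks block on IO for the same interval and all wake
up on the same CPU, each one flushing its full sleep time
into that CPU's iowait counter:

```python
def per_task_iowait(wall_seconds, num_waiters):
    """Toy model: every task sleeps on IO for the whole interval and
    charges its full sleep time to the single CPU it wakes up on."""
    iowait = 0.0
    for _ in range(num_waiters):
        sleeptime = 0.0            # task went to sleep at t = 0
        now = wall_seconds         # and is woken at the end of the interval
        iowait += now - sleeptime  # iowait[cpu] += NOW() - tsk->sleeptime
    return iowait


wall = 10.0
charged = per_task_iowait(wall, num_waiters=5)
print(f"wall clock {wall:.0f}s, iowait charged to the CPU {charged:.0f}s")
```

With five waiters the CPU is charged five times the wall-clock
interval, so its iowait exceeds everything else that CPU could
possibly have accumulated in the same period.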


> So we save, per task, the time when the task went to sleep. And when it wakes up
> we just flush the pending time since that sleep time to the waking CPU:
> iowait[$CPU] += NOW() - tsk->sleeptime;
>
>> Is such counter meaningful for the admin?
>
> Well, you get the iowait time accounting.

The admin wants to know "how often are my CPUs idle
because they have nothing to do until IO completes?"

Merely knowing how long tasks waited for IO
doesn't answer that question.
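
A small sketch of the difference, again as a toy Python model
rather than anything from the kernel. The tick-based timeline
and the names summarize, busy and io_pending are assumptions
made for the example. It computes both numbers for one CPU:
the total time tasks spent blocked on IO, and the time the CPU
sat idle while IO was outstanding:

```python
def summarize(ticks, busy, io_pending):
    """busy[t]: the CPU ran some task during tick t.
    io_pending[t]: number of tasks blocked on IO during tick t.
    Returns (total task wait, ticks the CPU was idle with IO pending)."""
    task_wait = sum(io_pending[t] for t in range(ticks))
    cpu_iowait = sum(1 for t in range(ticks)
                     if not busy[t] and io_pending[t] > 0)
    return task_wait, cpu_iowait


# One task blocked on IO for 10 ticks while another task keeps the CPU busy.
print(summarize(10, [True] * 10, [1] * 10))
# The same blocked task, but the CPU has nothing else to run.
print(summarize(10, [False] * 10, [1] * 10))
```

Both cases report the same task wait time, but only in the
second one was the CPU actually idle because of IO.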