Re: [RFC PATCH 4/4] sched: Upload nohz full CPU load on task enqueue/dequeue

From: Thomas Gleixner
Date: Wed Jan 20 2016 - 09:44:42 EST


On Wed, 20 Jan 2016, Frederic Weisbecker wrote:
> On Wed, Jan 20, 2016 at 10:03:32AM +0100, Thomas Gleixner wrote:
> > I have been telling you for years that you need to fix that remote
> > accounting stuff, but no, you insist on adding more trainwrecks left and
> > right.
>
> The solution you proposed to me was to do remote scheduler_tick() from
> CPU 0 and this was nacked by peterz (and he was right).

He did not nack the general approach of remote accounting, right?

> We all know that we need to fix this remote accounting stuff, but I'm the
> only one who actually _tries_, at least through RFCs to start discussions,
> such that I find the right direction to move forward.

Well, I do not see any attempt to do remote accounting, not even in a
minimalistic form. The current RFC is about dealing with issues which are
caused by the lack of continuous (remote) accounting.

> > > The problem with doing this remotely is that we can miss past cpu loads if
> > > there were several enqueue/dequeue operations happening while tickless.
> >
> > That's complete bullshit.
> >
> > 1) How is remote accounting that happens every tick different from local
> > accounting which happens every tick?
>
> Enqueue/dequeue don't happen on tick, unless there is a wakeup on that interrupt.

And how does that matter? Tick-based accounting, whether remote or local, does
not account for intermediate states at all.
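
To illustrate the point with a toy model (a sketch only, all names are made
up and this is not kernel code): a sampler which looks at the runqueue depth
once per tick sees the same thing no matter which CPU runs it, and it misses
whatever happened between two ticks either way.

```c
#include <stddef.h>

/*
 * Hypothetical model, not kernel code: nr_running[t] is the runqueue
 * depth of one CPU at time step t. A tick fires every 'period' steps.
 */
unsigned long tick_sampled_load(const unsigned int *nr_running,
				size_t len, size_t period)
{
	unsigned long sum = 0;
	size_t t;

	if (!period)
		return 0;

	/* Only the state visible at tick time is accounted. */
	for (t = 0; t < len; t += period)
		sum += nr_running[t];

	return sum;
}

/* Local tick: the CPU samples its own runqueue. */
unsigned long local_tick_load(const unsigned int *rq, size_t len,
			      size_t period)
{
	return tick_sampled_load(rq, len, period);
}

/* Remote tick: a housekeeping CPU samples the very same runqueue. */
unsigned long remote_tick_load(const unsigned int *rq, size_t len,
			       size_t period)
{
	return tick_sampled_load(rq, len, period);
}
```

With a trace of {1, 3, 1, 1, 1, 2, 1, 1} and a tick every 4 steps, both
variants account 2. The spikes at steps 1 and 5 are invisible to both.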

> > 2) How do you have enqueue/dequeue operations when you are running in full
> > nohz, i.e. one task is consuming 100% cpu time in user space?
>
> Well that task is going to sleep, wake up, sleep like any other task. We

If that task goes to sleep, then it leaves the full nohz state.

> need to account these slices properly. If a second task wakes up and restarts
> the tick, we must make sure that the previous tickless frame got accounted
> properly.

The previous tickless frame ends when that task goes to sleep. And that's
where you update the accounting.
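
Roughly along these lines (a minimal sketch, assuming a hypothetical
per-CPU frame record; the struct and function names are invented for
illustration and do not exist in the kernel):

```c
#include <stdint.h>

/* Hypothetical bookkeeping for one tickless frame, not a kernel API. */
struct nohz_frame {
	uint64_t start_ns;	/* when the task started running tickless */
	uint64_t accounted_ns;	/* total time accounted so far */
	int active;		/* nonzero while the frame is open */
};

/* The single task starts running undisturbed: open the frame. */
void nohz_frame_begin(struct nohz_frame *f, uint64_t now_ns)
{
	f->start_ns = now_ns;
	f->active = 1;
}

/* The task goes to sleep: close the frame and account it exactly once. */
uint64_t nohz_frame_end(struct nohz_frame *f, uint64_t now_ns)
{
	uint64_t delta;

	if (!f->active)
		return 0;

	delta = now_ns - f->start_ns;
	f->accounted_ns += delta;
	f->active = 0;

	return delta;
}
```

A frame opened at 1000ns and closed at 5000ns accounts 4000ns. Closing it a
second time accounts nothing, so a later wakeup cannot double count it.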

> Besides, if a SCHED_FIFO task runs (tickless) with SCHED_NORMAL tasks in the
> runqueue, those are typically still accounted with the tick, so perhaps we
> need to keep that behaviour without the tick as well and account those
> SCHED_NORMAL tasks' load.

So we agreed a long time ago that we first fix the issues with a single task
running undisturbed in user space, i.e. tickless. Those issues have never been
resolved fully, but now you try to add more complexity of extra runnable
tasks, nohz tasks sleeping and whatever.

Can we please go back to the point where this all started:

ONE task running with 100% CPU in user space

And get all the issues around that resolved properly, which involves remote
accounting.

Once that works, you can add the new features, i.e. extra runnable tasks and
whatever.

Thanks,

tglx