Re: [GIT PULL, RFC] Full dynticks, CONFIG_NO_HZ_FULL feature

From: Ingo Molnar
Date: Mon May 06 2013 - 05:25:48 EST



* Paul E. McKenney <paulmck@xxxxxxxxxxxxxxxxxx> wrote:

> On Sun, May 05, 2013 at 01:33:45PM -0700, Linus Torvalds wrote:
> > On Sun, May 5, 2013 at 4:03 AM, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> > >
> > > Please consider pulling the latest timers-nohz-for-linus git tree from:
> >
> > Ok, it seems to work for me, so pulled.
> >
> > However, by "work for me" I mean "doesn't actually seem to make any
> > difference for me". Maybe I'm odd, but the most common situation is
> > either a fairly idle machine (in which case the old NOHZ did fine) or
> > a fairly over-crowded one when I'm running something sufficiently
> > threaded (in which case the new NOHZ_FULL doesn't do anything either).
> >
> > So I really hope the "cpu has more than one running thread" case is
> > getting a lot of attention. Not for 3.10, but right now it seems to
> > still result in the same old 1kHz timer interrupts..
> >
> > So I haven't actually found a real load where any of this makes a
> > noticeable *difference*.
>
> The workloads where we expect the most noticeable differences are HPC
> workloads with short iterations and an HPC-style barrier between each
> iteration on the one hand and real-time workloads on the other. My
> guess is that you aren't doing too much of either.

I think Linus might have referred to my 'future plans' entry:

| Future plans:
|
| - there's ongoing work to reduce 1Hz to 0Hz, to essentially shut
| off the periodic tick altogether when there's a single busy task on a
| CPU. We'd first like 1 Hz to be exposed more widely before we go for
| the 0 Hz target though.
|
| - once we reach 0 Hz we can remove the periodic tick assumption from
| nr_running>=2 as well, by essentially interrupting busy tasks only as
| frequently as the sched_latency constraints require us to do - once
| every 4-40 msecs, depending on nr_running.

and indicated that in practice desktop and developer workloads will see
the full win from reduced HZ only once we implement those two points,
extend the scope of dynticks even further, and make HZ truly variable
regardless of rq->nr_running.
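
To make the regimes discussed above concrete, here is a rough sketch of the
tick rate a CPU would see in each case. It is a toy model, not kernel code:
the function name, the mode strings and the HZ=1000 default are all
assumptions made for illustration, and the preemption interval for
nr_running>=2 is passed in as a parameter rather than derived, because the
exact mapping from nr_running to the 4-40 msec range is not spelled out here.

```python
# Illustrative model only: the mode names and default numbers below are
# assumptions for this sketch, not kernel interfaces.

MODES = ("periodic", "nohz_idle", "nohz_full", "planned")


def ticks_per_second(nr_running, mode, hz=1000, preempt_interval_ms=4.0):
    """Approximate scheduler-tick rate on one CPU.

    mode is one of:
      "periodic"  - the tick always runs at hz
      "nohz_idle" - old NOHZ: the tick is stopped only when the CPU is idle
      "nohz_full" - NO_HZ_FULL as pulled: 1 Hz with a single busy task
      "planned"   - the two future-plan items quoted above
    """
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode}")
    if nr_running < 0:
        raise ValueError("nr_running must be non-negative")

    if mode == "periodic":
        return float(hz)
    if nr_running == 0:
        # Simplification: ignores timers that still wake an idle CPU.
        return 0.0
    if mode == "nohz_idle":
        return float(hz)
    if nr_running == 1:
        # 1 Hz residual tick today, 0 Hz once the first plan item lands.
        return 1.0 if mode == "nohz_full" else 0.0

    # nr_running >= 2 from here on.
    if mode == "nohz_full":
        return float(hz)  # still the full periodic tick
    if not 4.0 <= preempt_interval_ms <= 40.0:
        raise ValueError("planned interval is expected to be 4-40 msecs")
    return 1000.0 / preempt_interval_ms


if __name__ == "__main__":
    for mode in MODES:
        rates = [ticks_per_second(n, mode) for n in (0, 1, 2)]
        print(f"{mode:10s} nr_running=0,1,2 -> {rates}")
```

In this model the "fairly idle" and "fairly over-crowded" cases come out
identical under "nohz_idle" and "nohz_full"; only nr_running == 1 differs,
which is consistent with not seeing a difference on a desktop today.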

> We do have some measurements taken on an early prototype of this patchset,
> which are on slides 5 and 6 of:
>
> http://linuxplumbersconf.org/2009/slides/Josh-Triplett-painless-kernel.pdf
>
> This is for an HPC workload with a 100-microsecond iteration time.

Interesting that HZ=1000 caused 8% overhead there. On a regular x86 server
PC I've measured the HZ=1000 overhead on pure user-space execution to be
around 1% (sometimes a bit less, sometimes a bit more).

But even 1% is worth it.
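
For a sense of scale, the back-of-the-envelope arithmetic behind those two
figures can be sketched as below. It assumes the overhead is a fixed cost per
tick that scales linearly with the tick rate, which is a simplification (it
folds cache and TLB disturbance into one number), and both helper names are
made up for this example. At HZ=1000, 1% of a second is 10 msecs spread over
1000 ticks, i.e. roughly 10 usecs per tick; 8% would correspond to roughly
80 usecs per tick.

```python
# Back-of-the-envelope sketch: assumes overhead is purely a per-tick cost
# that scales linearly with the tick rate. Helper names are hypothetical.

USEC_PER_SEC = 1_000_000


def per_tick_cost_us(overhead_fraction, hz):
    """Effective cost of one tick, in microseconds, for a given overhead."""
    if hz <= 0:
        raise ValueError("hz must be positive")
    return overhead_fraction * USEC_PER_SEC / hz


def overhead_fraction_at(hz, cost_us):
    """Fraction of CPU time lost to ticks at rate hz with a fixed cost."""
    return hz * cost_us / USEC_PER_SEC


if __name__ == "__main__":
    for label, overhead in (("x86 server, ~1%", 0.01), ("HPC slides, 8%", 0.08)):
        cost = per_tick_cost_us(overhead, hz=1000)
        residual = overhead_fraction_at(hz=1, cost_us=cost)
        print(f"{label}: ~{cost:.0f} usecs/tick, "
              f"{residual * 100:.4f}% left at 1 Hz")
```

Under the same linear assumption, dropping from 1000 Hz to the 1 Hz residual
tick would shrink the overhead by a factor of 1000.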

Thanks,

Ingo