Re: [PATCH 3/7] Add IRQ_TIME_ACCOUNTING, finer accounting of irqtime -v3

From: Peter Zijlstra
Date: Fri Oct 01 2010 - 07:45:31 EST


On Thu, 2010-09-30 at 09:29 -0700, Venkatesh Pallipadi wrote:
> On Thu, Sep 30, 2010 at 4:06 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> > On Wed, 2010-09-29 at 12:21 -0700, Venkatesh Pallipadi wrote:
> >> +void account_system_vtime(struct task_struct *curr)
> >> +{
> >> + unsigned long flags;
> >> + int cpu;
> >> + u64 now, delta;
> >> +
> >> + if (!sched_clock_irqtime)
> >> + return;
> >> +
> >> + local_irq_save(flags);
> >> +
> >> + now = sched_clock();
> >> + cpu = smp_processor_id();
> >
> > Like said before, that really wants to read like:
> >
> > cpu = smp_processor_id();
> > now = sched_clock_cpu(cpu);
> >
> > sched_clock() is raw tsc + ns conversion and can go all over the place.
>
> sched_clock_cpu() won't really work for here, due to what looks like
> idle and timer tick dependencies. Using sched_clock_cpu(), I end up
> accounting CPU idle time to hardirq due to time captured before the
> handler and after the handler.

huh!?

No, sched_clock_cpu() gives sched_clock() for those few systems that
actually have a usable tsc, for those that don't we use the tick (and
idle hooks) to read GTOD and set up a TICK_NSEC window. We then use TSC
to calculate high res deltas, we filter anything that goes backwards and
anything that exceeds the window.

How does that render it useless for this purpose?

Never use sched_clock(), its crap (just like TSC is).
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/