Re: [PATCH v2 3/4] ftrace: add max execution time measurement to workqueue tracer

From: Ingo Molnar
Date: Mon Apr 13 2009 - 17:22:49 EST



(Oleg, Andrew: it's about workqueue tracing design.)

* Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:

> > if (tsk) {
> > - seq_printf(s, "%3d %6d %6u %s\n", cws->cpu,
> > + seq_printf(s, "%3d %6d %6u %5lu.%06lu"
> > + " %s\n",
> > + cws->cpu,
> > atomic_read(&cws->inserted), cws->executed,
> > + exec_secs, exec_usec_rem,
>
>
> You are measuring the latency from a workqueue thread point of
> view. While I find the work latency measurement very interesting,
> I think this patch does it in the wrong place. The _work_ latency
> point of view seems to me a much richer source of information.
>
> There are several reasons for that.
>
> Indeed this patch is useful for workqueues that always receive the
> same work to perform, so that you can very easily find the guilty
> worklet. But the sense of this design is lost once we consider the
> workqueue threads that receive random works. Of course the best
> example is events/%d. One will observe the max latency that
> happened on events/0, for example, but will only be able to feel
> a silent FUD because there is no way to find which work caused this
> max latency.

Expanding the trace view in a per-worklet fashion is also useful for
debugging: sometimes inefficiencies (or hangs) are related to the
mixing of high-speed worklets with blocking worklets. This is not
exposed if we stay at the workqueue level only.

> Especially the events/%d latency measurement seems to me very
> important because a single work from a random driver can propagate
> its latency all over the system.
>
> A single work that consumes too much cpu time, waits for
> slow-arriving events, sleeps too much, too often tries to take
> contended locks, or whatever... such a single work may delay all
> pending works in the queue, and the max latency for a given
> workqueue alone is not helpful to find these culprits.
>
> Having this max latency snapshot per work and not per workqueue
> thread would be useful for every kind of workqueue latency
> instrumentation:
>
> - workqueues with single works
> - workqueues with random works
>
> A developer will also be able to measure their own worklet action
> and find out if it takes too much time, even if it isn't the worst
> worklet in the workqueue to cause latencies.
>
> The end result would be to have a descending latency sort of works
> per cpu workqueue threads (or better: per workqueue group).
>
> What do you think?

Sounds like a good idea to me. It would also allow histograms based
on worklet identity, etc. Often the most active kevents worklet is a
good candidate for being split out into a workqueue of its own.

And if we have a per worklet tracepoint it would also allow a trace
filter to only trace a given type of worklet.
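
To make the per-worklet idea concrete, here is a rough userspace
sketch of the accounting, not the tracer code itself. Everything in
it is hypothetical: the struct and function names, the fixed-size
table, and the use of a name string as the worklet identity (in the
kernel the natural key would be the work function). It keeps a max
execution time per worklet and sorts the result in descending order,
as suggested above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_WORKLETS 64	/* arbitrary limit for this sketch */

/* Hypothetical per-worklet record, one per distinct worklet. */
struct worklet_stat {
	const char *name;	/* stands in for the work function */
	unsigned int executed;	/* number of completed executions */
	uint64_t max_usecs;	/* longest single execution seen */
	uint64_t total_usecs;	/* sum of all execution times */
};

/* Hypothetical per-workqueue-thread table of worklet records. */
struct wq_stats {
	struct worklet_stat w[MAX_WORKLETS];
	size_t nr;
};

/* Find the record for @name, creating it if needed; NULL if full. */
struct worklet_stat *wq_lookup(struct wq_stats *s, const char *name)
{
	struct worklet_stat *ws;
	size_t i;

	for (i = 0; i < s->nr; i++)
		if (strcmp(s->w[i].name, name) == 0)
			return &s->w[i];

	if (s->nr == MAX_WORKLETS)
		return NULL;

	ws = &s->w[s->nr++];
	memset(ws, 0, sizeof(*ws));
	ws->name = name;
	return ws;
}

/* Account one execution of @name that took @usecs microseconds. */
int wq_account(struct wq_stats *s, const char *name, uint64_t usecs)
{
	struct worklet_stat *ws;

	assert(s != NULL && name != NULL);

	ws = wq_lookup(s, name);
	if (!ws)
		return -1;

	ws->executed++;
	ws->total_usecs += usecs;
	if (usecs > ws->max_usecs)
		ws->max_usecs = usecs;
	return 0;
}

/* qsort() comparator: larger max_usecs sorts first. */
static int cmp_max_desc(const void *a, const void *b)
{
	const struct worklet_stat *x = a, *y = b;

	if (x->max_usecs < y->max_usecs)
		return 1;
	if (x->max_usecs > y->max_usecs)
		return -1;
	return 0;
}

/* Order the table so the worst-latency worklet comes first. */
void wq_sort_by_max(struct wq_stats *s)
{
	qsort(s->w, s->nr, sizeof(s->w[0]), cmp_max_desc);
}
```

With such a table per workqueue thread, the stat output could list
one line per worklet instead of one line per thread, and the culprit
behind a large max latency on events/%d would be at the top.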

Ingo