Re: [PATCH V2] tracing, perf : add cpu hotplug trace events

From: Vincent Guittot
Date: Mon Jan 24 2011 - 04:02:57 EST


On 22 January 2011 03:42, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
> On Fri, Jan 21, 2011 at 06:41:58PM +0100, Vincent Guittot wrote:
>> On 21 January 2011 17:44, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
>> > On Fri, Jan 21, 2011 at 09:43:18AM +0100, Vincent Guittot wrote:
>> >> On 20 January 2011 17:11, Frederic Weisbecker <fweisbec@xxxxxxxxx> wrote:
>> >> > On Thu, Jan 20, 2011 at 09:25:54AM +0100, Vincent Guittot wrote:
>> >> >> Please find below a new proposal for adding trace events for cpu hotplug.
>> >> >> The goal is to measure the latency of each part (kernel, architecture)
>> >> >> and also to trace the cpu hotplug activity alongside other power events. I
>> >> >> have tested these trace events on an arm platform.
>> >> >>
>> >> >> Changes since previous version:
>> >> >> -Use cpu_hotplug for trace name
>> >> >> -Define traces for kernel core and arch parts only
>> >> >> -Use DECLARE_EVENT_CLASS and DEFINE_EVENT
>> >> >> -Use proper indentation
>> >> >>
>> >> >> Subject: [PATCH] cpu hotplug tracepoint
>> >> >>
>> >> >> this patch adds new events for cpu hotplug tracing
>> >> >>  * plug/unplug sequence
>> >> >>  * core and architecture latency measurements
>> >> >>
>> >> >> Signed-off-by: Vincent Guittot <vincent.guittot@xxxxxxxxxx>
>> >> >> ---
>> >> >>  include/trace/events/cpu_hotplug.h |  117 ++++++++++++++++++++++++++++++++++++
>> >> >
>> >> > Note we can't apply new tracepoints if they are not inserted in the code.
>> >>
>> >> I agree, I just want to get some initial feedback on the tracepoint interface
>> >> before providing a patch that inserts the tracepoints in the code.
>> >>
>> >> >
>> >> >> +DEFINE_EVENT(cpu_hotplug, cpu_hotplug_arch_wait_die_start,
>> >> >> +
>> >> >> +     TP_PROTO(unsigned int cpuid),
>> >> >> +
>> >> >> +     TP_ARGS(cpuid)
>> >> >> +);
>> >> >> +
>> >> >> +DEFINE_EVENT(cpu_hotplug, cpu_hotplug_arch_wait_die_end,
>> >> >> +
>> >> >> +     TP_PROTO(unsigned int cpuid),
>> >> >> +
>> >> >> +     TP_ARGS(cpuid)
>> >> >> +);
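
For context, both DEFINE_EVENT()s above reuse a single cpu_hotplug event
class. The class definition itself is not part of the quoted hunk; a minimal
sketch of what it could look like (field layout and printk format here are
only illustrative, not taken from the patch):

DECLARE_EVENT_CLASS(cpu_hotplug,

	TP_PROTO(unsigned int cpuid),

	TP_ARGS(cpuid),

	TP_STRUCT__entry(
		__field(unsigned int, cpuid)
	),

	TP_fast_assign(
		__entry->cpuid = cpuid;
	),

	TP_printk("cpuid=%u", __entry->cpuid)
);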
>> >> >
>> >> > What is wait_die, compared to die, for example?
>> >> >
>> >>
>> >> arch_wait_die is used to trace the caller that waits for the cpu to
>> >> die (__cpu_die), while arch_die is used to trace when the cpu actually
>> >> dies (cpu_die).
>> >
>> > I still can't find the difference.
>> >
>> > Having:
>> >
>> > trace_cpu_hotplug_arch_die_start(cpu)
>> > __cpu_die();
>> > trace_cpu_hotplug_arch_die_end(cpu)
>> >
>> > Isn't that enough to get both the information that a cpu dies
>> > and the time it took to do so?
>> >
>>
>> It's quite interesting to trace the cpu_die function, because that is
>> where the cpu really dies.
>
> Note that in case of success, die and wait_die take nearly the same time;
> the difference will reside in some completion wait/polling, mostly noise.
> Probably unnoticeable and irrelevant most of the time.
>

OK, tracing only __cpu_die should be enough.

> Plus, if you opt for this scheme, you need to put your die hook into
> every architecture, whereas otherwise a simple trace_cpu_die_start() /
> trace_cpu_die_stop() pair around the __cpu_die() call in the generic code
> is enough.
>
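
Something like the following in the generic hot-unplug path would indeed
cover all architectures at once (a sketch only; the exact spot, e.g. in
kernel/cpu.c, and the final event names are still open):

	/* generic cpu-down path, e.g. in kernel/cpu.c (sketch) */
	trace_cpu_die_start(cpu);
	__cpu_die(cpu);		/* arch-specific wait for the cpu to die */
	trace_cpu_die_stop(cpu);

That keeps the arch code untouched and still gives the latency of the whole
die step.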
>> The __cpu_die function can't return if the
>> cpu fails to die at the very last step and then wakes up. But this
>> could be detected with some cpu_die traces.
>>
>>
>> For a normal use case we have something like:
>> cpu0 enters __cpu_die
>> cpu1 enters cpu_die
>> cpu1 acks that it is going to die
>> cpu0 returns from __cpu_die
>>
>> If cpu1 fails to die at the very last step, we could have:
>> cpu0 enters __cpu_die
>> cpu1 enters cpu_idle --> cpu_die
>> cpu1 leaves cpu_die because of some issue and comes back into cpu_idle
>> cpu0 returns from __cpu_die after a timeout or an error ack
>
> If it fails at the hardware level, you'll certainly notice it in your
> power profiling, because a CPU is not supposed to take seconds to
> die. Especially with a visual tool like pytimechart, it will
> be obvious.
>
> For the details, that's something that must be found in syslogs and
> that's it.
>
> I don't think it's a good idea to handle such a buggy and unexpected case at
> the tracepoint level. You don't want to profile bugs, you want to debug them.
> So it doesn't belong in this space IMHO.
>
>> Then, cpu_die traces can be used with power traces for profiling the
>> cpu power state. Maybe the power.h trace file is a better place for
>> the cpu_die traces?
>
> Hmm, these should probably stay inside the cpu hotplug tracepoint family;
> that is where people will look for them in the first place.
>