Re: [Bug #12465] KVM guests stalling on 2.6.28 (bisected)

From: Frédéric Weisbecker
Date: Tue Jan 20 2009 - 09:47:01 EST


2009/1/20 Kevin Shanahan <kmshanah@xxxxxxxxxxx>:
> On Tue, 2009-01-20 at 13:56 +0100, Ingo Molnar wrote:
>> * Kevin Shanahan <kmshanah@xxxxxxxxxxx> wrote:
>> > > This suggests some sort of KVM-specific problem. Scheduler latencies
>> > > in the seconds that occur under normal load situations are noticed and
>> > > reported quickly - and there are no such open regressions currently.
>> >
>> > It at least suggests a problem with interaction between the scheduler
>> > and kvm, otherwise reverting that scheduler patch wouldn't have made the
>> > regression go away.
>>
>> the scheduler affects almost everything, so almost by definition a
>> scheduler change can tickle a race or other timing bug in just about any
>> code - and reverting that change in the scheduler can make the bug go
>> away. But yes, it could also be a genuine scheduler bug - that is always a
>> possibility.
>
> Okay, I understand.
>
>> Could you please run a cfs-debug-info.sh session on a CONFIG_SCHED_DEBUG=y
>> and CONFIG_SCHEDSTATS=y kernel, while you are experiencing those
>> latencies:
>>
>> http://people.redhat.com/mingo/cfs-scheduler/tools/cfs-debug-info.sh
>>
>> and post that (relatively large) somewhere, or send it as a reply after
>> bzip2 -9 compressing it? It will include a lot of information about the
>> delays your tasks are experiencing.
>
> Running it while the problem is occurring will be tricky, as it only
> lasts for a few seconds at a time. Is it going to be useful at all to
> just see those statistics if the system is running normally?
>
> I might need to modify the script a little. Am I right that everything
> above "gathering statistics..." is pretty much static information?
>
> I could run top, vmstat and cat /proc/sched_debug in a loop until the
> problem occurs and then trim it. Something like:
>
> i=0
> while true; do
>   i=$((i + 1))
>   date >> "$FILE"
>   echo "-- top: --" >> "$FILE"
>   # one batch-mode iteration per pass; the sleep below sets the interval
>   top -H -c -b -n 1 >> "$FILE" 2>/dev/null
>   echo "-- vmstat: --" >> "$FILE"
>   vmstat >> "$FILE" 2>/dev/null
>   echo "-- sched_debug #$i: --" >> "$FILE"
>   cat /proc/sched_debug >> "$FILE" 2>/dev/null
>   sleep 0.5
> done
>
> That should take a snapshot every half second or so.
>
> Regards,
> Kevin.
>
> P.S. Please keep kmshanah@xxxxxxxxxxxxxxxxx out of the CC list (it won't
> route properly anyway). I don't know how it got added - the only
> place it would have appeared was in the "revert" commit message
> when I was testing 2.6.28 with the commit I bisected down to
> removed.
>


Another thing you can do is enable CONFIG_FUNCTION_GRAPH_TRACER, as Ingo
suggested, and trace the schedule() function. That way you will see the
time spent in (almost) every function called from schedule(), and perhaps
find where the contention is (if it comes from the scheduler).

How to use it?

echo schedule > /debugfs/tracing/set_graph_function
echo function_graph > /debugfs/tracing/current_tracer
cat /debugfs/tracing/trace

Or even through a pipe:
cat /debugfs/tracing/trace_pipe > ~/func_graph.log

To end the tracing: echo nop > /debugfs/tracing/current_tracer
Or just pause it: echo 0 > /debugfs/tracing/tracing_enabled
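
Since the stalls only last a few seconds, it may help to wrap the steps
above in a small helper that traces for a fixed time and saves the result.
This is only a rough sketch: the trace_schedule function name, the
TRACING_DIR variable and the output file argument are made up for
illustration, and it assumes debugfs is mounted at /debugfs as in the
commands above.

```shell
#!/bin/sh
# Sketch only: trace schedule() with function_graph for a fixed duration.
# TRACING_DIR is an illustrative variable; adjust it to wherever debugfs
# is mounted on your system.
TRACING_DIR=${TRACING_DIR:-/debugfs/tracing}

# trace_schedule <seconds> <output-file>   (hypothetical helper)
trace_schedule() {
    duration=$1
    out=$2

    # Restrict the graph tracer to schedule() and switch it on.
    echo schedule > "$TRACING_DIR/set_graph_function" || return 1
    echo function_graph > "$TRACING_DIR/current_tracer" || return 1

    # Let the trace run for the requested time.
    sleep "$duration"

    # Pause tracing, save what was recorded, then switch the tracer off.
    echo 0 > "$TRACING_DIR/tracing_enabled" || return 1
    cat "$TRACING_DIR/trace" > "$out" || return 1
    echo nop > "$TRACING_DIR/current_tracer"
}
```

For example, as root: trace_schedule 10 ~/func_graph.log
The trace is saved before the tracer is set back to nop so that the
output file holds the function_graph data.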