Re: [Bug #12650] Strange load average and ksoftirqd behavior with 2.6.29-rc2-git1

From: Damien Wyart
Date: Sun Feb 15 2009 - 05:35:25 EST


* Ingo Molnar <mingo@xxxxxxx> [2009-02-15 11:13]:
> Note that if the box you test this on is multi-core or HT, then interpreting
> traces is easier if there's just a single CPU to look at. In that case i'd
> suggest to reproduce with just a single core, by turning the second one off:

> echo 0 > /sys/devices/system/cpu/cpu1/online

> Or, if the problem only occurs with two cpus, restrict tracing to CPU#1:

> echo 2 > /debug/tracing/tracing_cpumask

The box I test on is HT, so I tried the first suggestion and it made the
problem much less visible (but not completely absent).

So I used "echo 1 > /sys/devices/system/cpu/cpu1/online" to go back to
HT mode, and the problem then became much more visible on CPU#1:
ksoftirqd/1 is running a lot while ksoftirqd/0 is almost normal. The load
average is about 0.80 and the total running time for ksoftirqd/1 is
almost one minute (and I booted rc5 only ten minutes ago)!
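
For reference, the value written to tracing_cpumask is a hexadecimal
bitmask with bit N set for CPU#N, which is why "echo 2" selects CPU#1.
A minimal sketch for computing such a mask follows; cpumask_for_cpu is
a hypothetical helper name of mine, not something from the tutorial,
and it only covers CPU numbers below 32:

```shell
#!/bin/sh
# Hypothetical helper: print the hex cpumask that selects a single CPU.
# Assumption: CPU number is below 32, so the mask fits in one 32-bit word.
cpumask_for_cpu() {
    cpu=$1
    # Bit N of the mask corresponds to CPU#N; the mask is written in hex.
    printf '%x\n' $((1 << cpu))
}

# Possible use (needs root and debugfs mounted at /debug, as in the
# quoted mail); left commented out so the sketch has no side effects:
# cpumask_for_cpu 1 > /debug/tracing/tracing_cpumask
```

With this, "cpumask_for_cpu 1" prints 2, matching the value suggested
above for restricting tracing to CPU#1.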

So I followed the tracing steps in the tutorial (with the 1 sec sleep),
which gave me this:
http://damien.wyart.free.fr/trace_2.6.29-rc5_ksoftirqd_prob.txt.gz

As I will be away until tomorrow, I did this on vanilla rc5 to get
something out today, and if tip is really needed, I will work on it
tomorrow. But maybe this vanilla trace will be helpful to you...

Do not hesitate to ask for further tests or info.

--
Damien