Re: Unusually high system CPU usage with recent kernels

From: Tibor Billes
Date: Fri Aug 23 2013 - 09:20:35 EST


> From: Paul E. McKenney Sent: 08/22/13 12:09 AM
> On Wed, Aug 21, 2013 at 11:05:51PM +0200, Tibor Billes wrote:
> > > From: Paul E. McKenney Sent: 08/21/13 09:12 PM
> > > On Wed, Aug 21, 2013 at 08:14:46PM +0200, Tibor Billes wrote:
> > > > > From: Paul E. McKenney Sent: 08/20/13 11:43 PM
> > > > > On Tue, Aug 20, 2013 at 10:52:26PM +0200, Tibor Billes wrote:
> > > > > > > From: Paul E. McKenney Sent: 08/20/13 04:53 PM
> > > > > > > On Tue, Aug 20, 2013 at 08:01:28AM +0200, Tibor Billes wrote:
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > I was using the 3.9.7 stable release and tried to upgrade to the 3.10.x series.
> > > > > > > > The 3.10.x series was showing unusually high (>75%) system CPU usage in some
> > > > > > > > situations, making things really slow. The latest stable I tried is 3.10.7.
> > > > > > > > I also tried 3.11-rc5, they both show this behaviour. This behaviour doesn't
> > > > > > > > show up when the system is idling, only when doing some CPU intensive work,
> > > > > > > > like compiling with multiple threads. Compiling with only one thread seems not
> > > > > > > > to trigger this behaviour.
> > > > > > > >
> > > > > > > > To be more precise I did a `perf record -a` while compiling a large C++ program
> > > > > > > > with scons using 4 threads, the result is appended at the end of this email.
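> > > > > > > >
> > > > > > > > (Roughly, the profiling commands were along these lines -- a
> > > > > > > > sketch, with the actual scons targets omitted:
> > > > > > > >
> > > > > > > >   perf record -a -- scons -j4   # system-wide profile while the build runs
> > > > > > > >   perf report                   # browse where the CPU time went
> > > > > > > > )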
> > > > > > >
> > > > > > > New one on me! You are running a mainstream system (x86_64), so I am
> > > > > > > surprised no one else noticed.
> > > > > > >
> > > > > > > Could you please send along your .config file?
> > > > > >
> > > > > > Here it is
> > > > >
> > > > > Interesting. I don't see RCU stuff all that high on the list, but
> > > > > the items I do see lead me to suspect RCU_FAST_NO_HZ, which has some
> > > > > relevance to the otherwise inexplicable group of commits you located
> > > > > with your bisection. Could you please rerun with CONFIG_RCU_FAST_NO_HZ=n?
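> > > > >
> > > > > One way to do that rerun (a sketch, assuming the in-tree
> > > > > scripts/config helper and your usual build steps):
> > > > >
> > > > >   ./scripts/config --disable RCU_FAST_NO_HZ
> > > > >   make olddefconfig    # re-resolve options that depend on it
> > > > >   make -j4 && make modules_install install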
> > > > >
> > > > > If that helps, there are some things I could try.
> > > >
> > > > It did help. I didn't notice anything unusual when running with CONFIG_RCU_FAST_NO_HZ=n.
> > >
> > > Interesting. Thank you for trying this -- and we at least have a
> > > short-term workaround for this problem. I will put a patch together
> > > for further investigation.
> >
> > I don't specifically need this config option so I'm fine without it in
> > the long term, but I guess it's not supposed to behave like that.
>
> OK, good, we have a long-term workaround for your specific case,
> even better. ;-)
>
> But yes, there are situations where RCU_FAST_NO_HZ needs to work
> a bit better. I hope you will bear with me with a bit more
> testing...

Don't worry, I will :) Unfortunately I didn't have time yesterday and I
won't have time today either. But I'll do what you asked tomorrow and I'll
send you the results.

Tibor

> > > In the meantime, could you please tell me how you were measuring
> > > performance for your kernel builds? Wall-clock time required to complete
> > > one build? Number of builds completed per unit time? Something else?
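> > >
> > > (A wall-clock measurement could be as simple as this sketch, with the
> > > scons invocation assumed:
> > >
> > >   time scons -j4    # "real" = wall-clock; "user"/"sys" = CPU time
> > > )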
> >
> > Actually, I wasn't being that sophisticated. I have a system monitor
> > applet on my top panel (using MATE, Linux Mint) with four little graphs,
> > one of which shows CPU usage. Different colors indicate different kinds
> > of CPU usage: blue shows user space usage, red shows system usage, and
> > two more show nice and iowait. During a normal compile it's almost
> > completely filled with blue user space CPU usage, only the top few
> > pixels show some iowait and system usage. With CONFIG_RCU_FAST_NO_HZ
> > set, about 3/4 of the graph was red system CPU usage, the rest was
> > blue. So I just looked for a pile of red on my graphs when I tested
> > different kernel builds. Compile speed was also so horrible that I
> > couldn't wait for the build to finish. Even the UI became unresponsive.
>
> We have been having problems with CPU accounting, but this one looks
> quite real.
>
> > Now I did some measuring. In the normal case a compile finished in 36
> > seconds, compiled 315 object files. Here are some output lines from
> > dstat -tasm --vm during compile:
> > ----system---- ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- ----swap--- ------memory-usage----- -----virtual-memory----
> >      time     |usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw | used  free| used  buff  cach  free|majpf minpf alloc  free
> > 21-08 21:48:05| 91   8   2   0   0   0|   0  5852k|   0     0 |   0     0 |1413  1772 |   0  7934M| 581M 58.0M  602M 6553M|   0    71k   46k   54k
> > 21-08 21:48:06| 93   6   1   0   0   0|   0  2064k| 137B  131B|   0     0 |1356  1650 |   0  7934M| 649M 58.0M  604M 6483M|   0    72k   47k   28k
> > 21-08 21:48:07| 86  11   4   0   0   0|   0  5872k|   0     0 |   0     0 |2000  2991 |   0  7934M| 577M 58.0M  627M 6531M|   0    99k   67k   79k
> > 21-08 21:48:08| 87   9   3   0   0   0|   0  2840k|   0     0 |   0     0 |2558  4164 |   0  7934M| 597M 58.0M  632M 6507M|   0    96k   57k   51k
> > 21-08 21:48:09| 93   7   1   0   0   0|   0  3032k|   0     0 |   0     0 |1329  1512 |   0  7934M| 641M 58.0M  626M 6469M|   0    61k   48k   39k
> > 21-08 21:48:10| 93   6   0   0   0   0|   0  4984k|   0     0 |   0     0 |1160  1146 |   0  7934M| 572M 58.0M  628M 6536M|   0    50k   40k   57k
> > 21-08 21:48:11| 86   9   6   0   0   0|   0  2520k|   0     0 |   0     0 |2947  4760 |   0  7934M| 605M 58.0M  631M 6500M|   0   103k   55k   45k
> > 21-08 21:48:12| 90   8   2   0   0   0|   0  2840k|   0     0 |   0     0 |2674  4179 |   0  7934M| 671M 58.0M  635M 6431M|   0    84k   59k   42k
> > 21-08 21:48:13| 90   9   1   0   0   0|   0  4656k|   0     0 |   0     0 |1223  1410 |   0  7934M| 643M 58.0M  638M 6455M|   0    90k   62k   68k
> > 21-08 21:48:14| 91   8   1   0   0   0|   0  3572k|   0     0 |   0     0 |1432  1828 |   0  7934M| 647M 58.0M  641M 6447M|   0    81k   59k   57k
> > 21-08 21:48:15| 91   8   1   0   0   0|   0  5116k| 116B    0 |   0     0 |1194  1295 |   0  7934M| 605M 58.0M  644M 6487M|   0    69k   54k   64k
> > 21-08 21:48:16| 87  10   3   0   0   0|   0  5140k|   0     0 |   0     0 |1761  2586 |   0  7934M| 584M 58.0M  650M 6502M|   0   105k   64k   68k
> >
> > The abnormal case compiled only 182 object files in six and a half minutes,
> > then I stopped it. The same dstat output for this case:
> > ----system---- ----total-cpu-usage---- -dsk/total- -net/total- ---paging-- ---system-- ----swap--- ------memory-usage----- -----virtual-memory----
> >      time     |usr sys idl wai hiq siq| read  writ| recv  send|  in   out | int   csw | used  free| used  buff  cach  free|majpf minpf alloc  free
> > 21-08 22:10:49| 27  62   0   0  11   0|   0     0 | 210B    0 |   0     0 |1414  3137k|   0  7934M| 531M 57.6M  595M 6611M|   0  1628  1250   322
> > 21-08 22:10:50| 25  60   4   0  11   0|   0    88k| 126B    0 |   0     0 |1337  3110k|   0  7934M| 531M 57.6M  595M 6611M|   0    91   128   115
> > 21-08 22:10:51| 26  63   0   0  11   0|   0   184k| 294B    0 |   0     0 |1411  3147k|   0  7934M| 531M 57.6M  595M 6611M|   0  1485   814   815
> > 21-08 22:10:52| 26  63   0   0  11   0|   0     0 | 437B  239B|   0     0 |1355  3160k|   0  7934M| 531M 57.6M  595M 6611M|   0    24    94    97
> > 21-08 22:10:53| 26  63   0   0  11   0|   0     0 | 168B    0 |   0     0 |1397  3155k|   0  7934M| 531M 57.6M  595M 6611M|   0   479   285   273
> > 21-08 22:10:54| 26  63   0   0  11   0|   0  4096B| 396B  324B|   0     0 |1346  3154k|   0  7934M| 531M 57.6M  595M 6611M|   0    27   145   145
> > 21-08 22:10:55| 26  63   0   0  11   0|   0    60k|   0     0 |   0     0 |1353  3148k|   0  7934M| 531M 57.6M  595M 6610M|   0    93   117    36
> > 21-08 22:10:56| 26  63   0   0  11   0|   0     0 |   0     0 |   0     0 |1341  3172k|   0  7934M| 531M 57.6M  595M 6610M|   0   158    87    74
> > 21-08 22:10:57| 26  62   1   0  11   0|   0     0 |  42B   60B|   0     0 |1332  3162k|   0  7934M| 531M 57.6M  595M 6610M|   0    56    82    78
> > 21-08 22:10:58| 26  63   0   0  11   0|   0     0 |   0     0 |   0     0 |1334  3178k|   0  7934M| 531M 57.6M  595M 6610M|   0    26    56    56
> > 21-08 22:10:59| 26  63   0   0  11   0|   0     0 |   0     0 |   0     0 |1336  3179k|   0  7934M| 531M 57.6M  595M 6610M|   0     3    33    32
> > 21-08 22:11:00| 26  63   0   0  11   0|   0    24k|  90B  108B|   0     0 |1347  3172k|   0  7934M| 531M 57.6M  595M 6610M|   0    41    73    71
> >
> > I have four logical cores, so 25% corresponds to one core. I don't know
> > whether the ~26% user CPU usage has anything to do with this fact or is
> > just a coincidence. The rest is ~63% system and ~11% hardware interrupt.
> > Do these support what you suspect?
>
> The massive increase in context switches does come as a bit of a surprise!
> It does rule out my initial suspicion of lock contention, but then again
> the fact that you have only four CPUs made that pretty unlikely to begin
> with.
>
> An average of about 2.4k context switches per second over the full run
> in the good case vs. about 3,156k per second over roughly half a run in
> the bad case. That is an increase of more than three orders of magnitude!
>
> Yow!!!
>
> Page faults are actually -higher- in the good case. You have about 6.5GB
> free in both cases, so you are not running out of memory. Lots more disk
> writes in the good case, perhaps consistent with its getting more done.
> Networking is negligible in both cases.
>
> Lots of hardware interrupts in the bad case as well. Would you be willing
> to take a look at /proc/interrupts before and after to see which one you
> are getting hit with? (Or whatever interrupt tracking tool you prefer.)
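>
> A sketch of what I have in mind (the file names are just examples):
>
>   cat /proc/interrupts > irq.before
>   # ... run the build ...
>   cat /proc/interrupts > irq.after
>   diff irq.before irq.after   # counters that jumped identify the source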
>
> One hypothesis is that your workload and configuration are interacting
> with RCU_FAST_NO_HZ to force very large numbers of RCU grace periods.
> Could you please check for this by building with CONFIG_RCU_TRACE=y,
> mounting debugfs somewhere and dumping rcu/rcu_sched/rcugp before and
> after each run?
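>
> For example (a sketch; adjust the path if debugfs is already mounted
> elsewhere):
>
>   mount -t debugfs nodev /sys/kernel/debug
>   cat /sys/kernel/debug/rcu/rcu_sched/rcugp   # once before the run
>   # ... run the build ...
>   cat /sys/kernel/debug/rcu/rcu_sched/rcugp   # once after the run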
>
> Thanx, Paul
