Re: [PATCH] a patch to fix the cpu-offline-online problem caused by pm_idle

From: Peter Zijlstra
Date: Mon Jan 31 2011 - 05:15:54 EST


On Sun, 2011-01-30 at 22:26 -0500, Luming Yu wrote:

> > Guessing is totally the wrong thing when you're sending stuff upstream,
> > esp ugly patches such as this. .32 is more than a year old, anything
> > could have happened.
>
> Ok. the default upstream kernel seems to have NMI watchdog disabled?

Then enable it already, it's a whole CONFIG option away.

> It's not working because of NMI watchdog. If you ignore NMI watchdog,
> then I guess it works but just slow..

Don't guess, test it dammit. And then figure out why it triggers, I
haven't seen _anything_ that would cause it to trigger, nor a sane
explanation for your patch.

> > Ok, so one IPI costs 50-100 us, even with 64 cpu, that's at most 6.4ms
> > nowhere near enough to trigger the NMI watchdog. So what does go wrong?
>
> Good question!
> But we also can't forget there was large latency from C3.

Not 60+ seconds large I hope, I know NHM-EX has some suckage, but surely
not that bad?
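
For scale, the estimate quoted above can be redone as a rough
back-of-envelope sketch. It assumes the worst case of one IPI per CPU sent
strictly one after another, at the upper end of the 50-100 us range; the
constant and function names are made up for illustration, and the 60 s
figure is just the "60+ seconds" mentioned above, not a measured value.

```python
# Back-of-envelope check of the IPI cost estimate quoted above.
# All names here are illustrative; none of them come from the kernel.
IPI_COST_US_MAX = 100   # upper end of the 50-100 us per-IPI estimate
NR_CPUS = 64            # CPU count used in the estimate
STALL_S = 60            # the "60+ seconds" figure, taken as an assumption


def worst_case_ipi_ms(nr_cpus, ipi_cost_us):
    """Total time in ms if one IPI per CPU is sent strictly in sequence."""
    return nr_cpus * ipi_cost_us / 1000.0


total_ms = worst_case_ipi_ms(NR_CPUS, IPI_COST_US_MAX)
# How many times longer the stall is than the worst-case IPI total.
ratio = STALL_S * 1000.0 / total_ms

print(f"worst-case IPI total: {total_ms} ms")
print(f"a {STALL_S} s stall is roughly {ratio:.0f}x longer than that")
```

The gap is several orders of magnitude, so IPI cost alone does not explain
a stall of that length; something else has to be going wrong.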

> And I guess some reschedule ticks that should kick CPUs out of idle
> get lost due to side effects of the CPU PM feature. With nohz=off,
> everything seems to just work.
> Yes, I agree we need to dig into it too.
> But it's kind of combination problem between the special stop_machine
> context and CPU power management...

Yeah, so? Also, incidentally, stop-machine got a rewrite around .35 and
again significant changes in .37, so please do test mainline and not
your dinosaur.

> > Yeah, what are you smoking? Why do you wreck perfectly fine code for one
> > backward ass piece of hardware.
>
> Just make things less complex...

But it's wrong: it very clearly works around a real problem. Don't ever
do that, fix the problem!
