Re: [PATCH] a patch to fix the cpu-offline-online problem caused by pm_idle

From: Luming Yu
Date: Wed Jan 26 2011 - 01:42:33 EST


On Tue, Jan 25, 2011 at 4:12 AM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
> On Mon, 2011-01-24 at 20:59 -0500, Luming Yu wrote:
>>
>> > Ow god this is ugly.. pm_idle should die asap, not find it way into generic code, so NAK!
>>
>> Without the ugly fix, we seem not able to fix the problem in short time.
>> Or , Are you suggesting to wrap pm_idle or similar in some generic
>> code that would not disappear in foreseeable future ,
>
> There are patches out there removing pm_idle from x86 (at least).
> pm_idle is a horribly broken interface that really should die.
>
>> Or Are you just
>> suggesting me don't do the stuff in kerne/cpu.c, and do it in Arch
>> code?
>
> Well, as it stand only about half the architectures out there even have
> a pm_idle pointer, so your patch would break the other half.
>
> If you really need to do this, do it in arch code, but really, why is
> this needed at all? The changelog failed to explain wth happens and why
> this solves it.

OK, how about the new patch in the attachment?

We have seen an extremely slow system under the CPU offline/online test
on a 4-socket NHM-EX system.

The test case offlines and onlines a CPU 1000 times, and its performance
is dominated by IPI and IPI-handler performance. On NHM-EX, sending an
IPI without broadcast is very slow. Having to wake a processor from a
deep C-state by IPI also incurs a heavy delay in the set_mtrr
synchronization done in stop_machine context. NHM-EX's APIC timer,
which stops in C3, adds to the problem. If I understand the problem
correctly, we probably need to tweak the upstream IPI code to get a
clean solution for NHM-EX's slow IPI delivery, so that the reschedule
tick is processed without delay on a CPU that was in a deep C-state.
But that needs more time, so a quick fix is provided to make the test pass.
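
For reference, a rough sketch of the kind of loop such a test runs,
assuming the standard sysfs hotplug file
/sys/devices/system/cpu/cpuN/online. This is not the actual test
script; the name cycle_cpu and its parameters are hypothetical and
only illustrate the offline/online cycle being timed.

```python
import os
import time


def cycle_cpu(cpu, iterations=1000, sysfs_root="/sys/devices/system/cpu"):
    """Offline and re-online one CPU `iterations` times.

    Hypothetical helper for illustration. Returns the elapsed
    wall-clock time in seconds. Writing to the real sysfs file
    requires root and a kernel with CPU hotplug support.
    """
    online_path = os.path.join(sysfs_root, f"cpu{cpu}", "online")
    start = time.monotonic()
    for _ in range(iterations):
        # "0" takes the CPU offline, "1" brings it back online.
        for state in ("0", "1"):
            with open(online_path, "w") as f:
                f.write(state)
    return time.monotonic() - start
```

Called as cycle_cpu(1) on the affected machine, the returned time
would be expected to reflect mostly the IPI and stop_machine cost
described above rather than the sysfs writes themselves.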

Without the patch, the current CPU offline/online feature does not work
reliably, because it interacts implicitly and unnecessarily with
CPU power management.

--Luming

Attachment: switch-idle-procedure.patch
Description: Binary data