Re: 50 Watt idle power regression bisected to Linux-3.10

From: Thomas Gleixner
Date: Wed Dec 11 2013 - 06:28:46 EST


On Wed, 11 Dec 2013, Mike Galbraith wrote:
> Alakazam..
> Yup, magical gremlin repellent works on 8 socket DL980 too.

Now here is a less magical version of the gremlin repellent.

And just for the amusement value: The erratum for the 7400 series
says:

AAI65. MONITOR/MWAIT May Have Excessive False Wakeups

Problem:     Normally, if MWAIT is used to enter a C-state that is
             C1 or higher, a store to the address range armed by
             the MONITOR instruction will cause the processor to
             exit MWAIT. Due to this erratum, false wakeups may
             occur when the monitored address range was recently
             written prior to executing the MONITOR instruction.

Implication: Due to this erratum, performance and power savings may
             be impacted due to excessive false wakeups.

Workaround:  Execute a CLFLUSH Instruction immediately before every
             MONITOR instruction when the monitored location may
             have been recently written.

Now that looks like the very same issue on these Westmere-EX
machines.

These false wakeups could already be observed before the idle
changes; now they are just more prominent.

Adding that clflush() unconditionally fixes the issue, at least on
Boris' machine.
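
To make the ordering in the workaround concrete, here is a rough user-space
model of the idle entry sequence. It is only a sketch: CLFLUSH, MONITOR and
MWAIT are replaced by hypothetical stubs (stub_clflush() and friends) that
merely record the order in which they are called, and NEED_RESCHED_BIT and
run_idle_trace() are illustrative names, not kernel interfaces.

```c
#include <stdio.h>
#include <string.h>

/* Illustrative flag bit; the real kernel uses its own thread-info flags. */
#define NEED_RESCHED_BIT (1u << 3)

/* Records the order in which the stubbed instructions are "executed". */
static char trace[64];

/* Hypothetical stand-ins for the real instructions. They do not touch
 * the cache or the monitor hardware; they only append to the trace. */
static void stub_clflush(volatile void *p) { (void)p; strcat(trace, "clflush "); }
static void stub_monitor(const void *p)    { (void)p; strcat(trace, "monitor "); }
static void stub_mb(void)                  { strcat(trace, "mb "); }
static void stub_mwait(void)               { strcat(trace, "mwait "); }

/* Model of the idle entry: flush the monitored line first, arm the
 * monitor, then re-check the wakeup condition before waiting. */
static void idle_enter(volatile unsigned int *flags)
{
	stub_clflush(flags);
	stub_monitor((const void *)flags);
	stub_mb();
	if (!(*flags & NEED_RESCHED_BIT))
		stub_mwait();
}

/* Runs one idle entry with the given initial flags and returns the
 * recorded call order as a space-separated string. */
const char *run_idle_trace(unsigned int initial_flags)
{
	volatile unsigned int flags = initial_flags;

	trace[0] = '\0';
	idle_enter(&flags);
	return trace;
}
```

With no reschedule pending, run_idle_trace(0) records the flush ahead of the
monitor and ends in the wait; with NEED_RESCHED_BIT already set, the wait is
skipped but the flush still precedes the monitor.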

Mike, can you retest on that 8 socket monstrum, please?

So it looks like the idle power regression is actually a software
change which exposes a hardware "regression".

So much for properly validated advertising which promises core power 0W
at idle for these beasts :)

Thanks,

tglx
---
diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 92d1206..50299ad 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -376,7 +376,7 @@ static int intel_idle(struct cpuidle_device *dev,
 		clockevents_notify(CLOCK_EVT_NOTIFY_BROADCAST_ENTER, &cpu);
 
 	if (!current_set_polling_and_test()) {
-
+		clflush(&current_thread_info()->flags);
 		__monitor((void *)&current_thread_info()->flags, 0, 0);
 		smp_mb();
 		if (!need_resched())