Re: 50 Watt idle power regression bisected to Linux-3.10

From: H. Peter Anvin
Date: Wed Dec 11 2013 - 18:09:54 EST


On 12/11/2013 09:50 AM, Ingo Molnar wrote:
>
> Well, availability could be a problem too, if some CPU (real or
> virtual) implements MWAIT but not CLFLUSH.
>
> In theory we could make mwait an alternatives variant and patch in the
> right combination of instructions? The CLFLUSH goes to the same
> address as on which the monitoring happens, so it could be considered
> one meta-instruction.
>

The first thing to do is probably to drop the use of thread_info as a
wakeup doorbell. It seemed like a good idea at the time -- after all,
there is one for each thread -- but it is extremely likely to be dirty
in the cache, which is (presumably) what makes these kinds of bugs most
likely to trigger. Even if we don't do the CLFLUSH it is likely that
the hardware has to do something expensive behind the scenes.

So I would like to propose that we switch to using a percpu variable
which is a single cache line of nothing at all. It would only ever be
touched by MONITOR and for explicit wakeup. Hopefully that will resolve
this problem without the need for the CLFLUSH.
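
To make the layout concrete, here is a rough user-space sketch of such a
doorbell. It is not kernel code: the 64-byte line size, the fixed CPU
count, and every name in it (mwait_doorbell, doorbell_addr, doorbell_ring,
doorbell_read) are assumptions made for illustration only. A real version
would presumably be a cache-line-aligned percpu definition, with the
address handed to MONITOR.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>

/* Assumption: 64-byte cache lines; the real value is CPU-specific. */
#define CACHE_LINE_SIZE 64
/* Hypothetical fixed CPU count standing in for real percpu storage. */
#define NR_CPUS_SKETCH 4

/*
 * One doorbell per CPU. It fills exactly one cache line and shares it
 * with no other data, so nothing but a wakeup store ever dirties it.
 */
struct mwait_doorbell {
	alignas(CACHE_LINE_SIZE) volatile uint32_t ring;
	unsigned char pad[CACHE_LINE_SIZE - sizeof(uint32_t)];
};

static_assert(sizeof(struct mwait_doorbell) == CACHE_LINE_SIZE,
	      "doorbell must occupy exactly one cache line");
static_assert(alignof(struct mwait_doorbell) == CACHE_LINE_SIZE,
	      "doorbell must start on a cache line boundary");

/* Stand-in for a percpu variable: one slot per CPU. */
static struct mwait_doorbell doorbells[NR_CPUS_SKETCH];

/* The address an idle CPU would arm with MONITOR (not done here). */
static volatile void *doorbell_addr(int cpu)
{
	return &doorbells[cpu].ring;
}

/* Explicit wakeup: a store to the monitored line of the target CPU. */
static void doorbell_ring(int cpu)
{
	doorbells[cpu].ring++;
}

/* Read back the doorbell value; only used to check the sketch. */
static uint32_t doorbell_read(int cpu)
{
	return doorbells[cpu].ring;
}
```

The padding plus alignment is what keeps each doorbell from sharing a
line with anything else; the static asserts fail the build if either
property is lost.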

-hpa

