Re: [RFC PATCH 2/4] x86, mwaitt: introduce mwaitx idle with a configurable timer

From: Borislav Petkov
Date: Wed May 20 2015 - 06:50:39 EST


On Wed, May 20, 2015 at 12:22:58PM +0200, Ingo Molnar wrote:
> Well, HLT does not get any hint from the OS how long the idling is
> expected to last.

MWAIT on AMD doesn't either:

"EAX specifies optional hints for the MWAIT instruction. There are
currently no hints defined and all bits should be 0. Setting a reserved
bit in EAX is ignored by the processor."

I don't know about MWAITX though as I haven't seen any official doc yet.

> Another MWAITX round - we've got no crystal ball, so the hint might be
> wrong if an external event occurs that we did not anticipate.

So if we end up doing a bunch of MWAITX rounds instead of HLT and MWAITX
saves less power than HLT, then we are practically worse off.

I think the idea with MWAITX is to use it to sleep only when you know
the timeout would be shorter - whatever "shorter" means - and thus you
can save yourself the idle entry/exit latency.

If you keep waking up because the timeout expired - which, with a u32 in
EBX, is at most ~1sec on a 4GHz core or ~2sec on a 2GHz core - and your
MWAITX C-state is worse wrt power consumption than your HLT state, then
you lose. And your MWAITX C-state *is* worse currently, see below.
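
For the record, the arithmetic behind those numbers, as a minimal
sketch. It assumes the MWAITX timer counts at the nominal core clock,
which I have not verified against any doc, and
mwaitx_max_timeout_sec() is a made-up helper name, not a kernel
function:

```c
#include <stdint.h>

/*
 * Longest timeout, in seconds, that fits in a 32-bit cycle count.
 * Assumption: the timer ticks once per cycle at timer_hz.
 */
static double mwaitx_max_timeout_sec(double timer_hz)
{
	return (double)UINT32_MAX / timer_hz;
}
```

With 4e9 that comes out at about 1.07 seconds and with 2e9 at about
2.15 seconds.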

> I don't think MWAITX will wake up in itself. (If yes then it's
> essentially a timer in disguise and needs a whole different approach!)

I mean when the MWAITX timeout expires. When it does, we wake up.

Also, normal MWAIT allows for interrupts to wake it up:

"ECX specifies optional extensions for the MWAIT instruction. The only
extension currently defined is ECX bit 0, which allows interrupts to
wake MWAIT, even when eFLAGS.IF = 0. Support for this extension is
indicated by a feature flag returned by the CPUID instruction."

> The question would be: on systems that provide ACPI idle but also have
> MWAITX support, which one behaves better on the hardware side?

I'd venture a guess here that the ACPI side should be using all C-states
available (think of other OSes and having optimal power savings there)
and MWAITX would be worse or the same. Right now it is entering some
funny state between C0 and C1 reportedly:

"But on AMD platform, mwaitx/mwait cannot go to C1 or C1E like intel.
The power consumption of waiting phase is somewhere in between (C0 and
C1). Actually, it's still in C0 but less power consumption than normal
C0."

So my thinking currently is - provided we want to use it at all:

* Use MWAITX on entry to idle, considering that on a busy system the
probability that this sleep will be short is high.

* When the timeout expires and we wake up and realize there's still
nothing to do, we do HLT.
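
Roughly, the two steps above could look like the sketch below. This is
a user-space mock-up of the control flow only: struct idle_ops, its
hooks and the stub_* functions are all hypothetical names I made up for
illustration, not existing kernel interfaces, and the stubs just
pretend to be the hardware:

```c
#include <stdbool.h>
#include <stdint.h>

enum idle_result { IDLE_WOKE_BY_EVENT, IDLE_FELL_BACK_TO_HLT };

/* Hypothetical hooks standing in for the real instructions/checks. */
struct idle_ops {
	bool (*mwaitx_timed)(uint32_t cycles);	/* true: woken before timeout */
	bool (*work_pending)(void);		/* true: something to run */
	void (*halt)(void);			/* stands in for HLT */
};

static enum idle_result two_stage_idle(const struct idle_ops *ops,
				       uint32_t timeout_cycles)
{
	/* Stage 1: shallow wait which is cheap to leave, with a timeout. */
	if (ops->mwaitx_timed(timeout_cycles) || ops->work_pending())
		return IDLE_WOKE_BY_EVENT;

	/* Stage 2: timeout expired and still nothing to do, so HLT. */
	ops->halt();
	return IDLE_FELL_BACK_TO_HLT;
}

/* Stubs for trying out the flow; they only simulate the outcomes. */
static int halt_calls;

static bool stub_mwaitx_timeout(uint32_t cycles) { (void)cycles; return false; }
static bool stub_mwaitx_event(uint32_t cycles) { (void)cycles; return true; }
static bool stub_no_work(void) { return false; }
static void stub_halt(void) { halt_calls++; }
```

On a busy system most calls should return from stage 1 and never reach
the HLT; only a sleep that outlasts the timeout pays the extra wakeup.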

But all that is pointless if we end up in acpi_idle anyway...

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.