Re: ACPI _CST introduced performance regresions on Haswll

From: Mel Gorman
Date: Fri Oct 16 2020 - 10:09:16 EST


On Fri, Oct 16, 2020 at 03:41:12PM +0200, Rafael J. Wysocki wrote:
> > Turns out I didn't even have that. On another machine (same model,
> > same cpu, different BIOS that cannot be updated), enabling the C6 state
> > still did not enable it on boot and dmesg complained about CST not being
> > usable. This is weird because one would expect that if CST was unusable
> > that it would be the same as use_acpi == false.
> >
> > This could potentially be if the ACPI tables are unsuitable due to bad
> > bad FFH information for a lower c-state. If _CST is not found or usable,
> > should acpi_state_table.count be reset to go back to the old behaviour?
>
> Yes, it should, although I would reset it in intel_idle_cst_usable()
> right away before returning 'false'.
>

Good stuff.

> I can send a patch to do the above or please submit the one below as
> it works too.
>

I'm happy to go with your alternative if you prefer. For a finish,
I decided it was worth reporting if the _CST was ignored regardless of
the reason. It performs roughly the same as setting use_acpi = false on
the affected machines.

---8<---
intel_idle: Ignore _CST if control cannot be taken from the platform

e6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems") avoids
enabling c-states that have been disabled by the platform with the
exception of C1E.

Unfortunately, BIOS implementations are not always consistent in terms
of how capabilities are advertised and control cannot always be handed
over. If control cannot be handed over then intel_idle reports that "ACPI
_CST not found or not usable" but does not clear acpi_state_table.count
meaning the information is still partially used.

This patch ignores ACPI information if CST control cannot be requested from
the platform. This was only observed on a number of Haswell platforms that
had identical CPUs but not identical BIOS versions. While this problem
may be rare overall, 24 separate test cases bisected to this specific
commit across 4 separate test machines and is worth addressing. If the
situation occurs, the kernel behaves as it did before commit e6d4f08a6776
and uses any c-states that are discovered.

The affected test cases were all ones that involved a small number of
processes -- exec microbenchmark, pipe microbenchmark, git test suite,
netperf, tbench with one client and system call microbenchmark. Each
case benefits from being able to use turboboost which is prevented if the
lower c-states are unavailable. This may mask real regressions specific
to older hardware so it is worth addressing.

C-state status before and after the patch

5.9.0-vanilla POLL latency:0 disabled:0 default:enabled
5.9.0-vanilla C1 latency:2 disabled:0 default:enabled
5.9.0-vanilla C1E latency:10 disabled:0 default:enabled
5.9.0-vanilla C3 latency:33 disabled:1 default:disabled
5.9.0-vanilla C6 latency:133 disabled:1 default:disabled
5.9.0-ignore-cst-v1r1 POLL latency:0 disabled:0 default:enabled
5.9.0-ignore-cst-v1r1 C1 latency:2 disabled:0 default:enabled
5.9.0-ignore-cst-v1r1 C1E latency:10 disabled:0 default:enabled
5.9.0-ignore-cst-v1r1 C3 latency:33 disabled:0 default:enabled
5.9.0-ignore-cst-v1r1 C6 latency:133 disabled:0 default:enabled

Patch enables C3/C6.

Netperf UDP_STREAM

netperf-udp
5.5.0 5.9.0
vanilla ignore-cst-v1r1
Hmean send-64 193.41 ( 0.00%) 226.54 * 17.13%*
Hmean send-128 392.16 ( 0.00%) 450.54 * 14.89%*
Hmean send-256 769.94 ( 0.00%) 881.85 * 14.53%*
Hmean send-1024 2994.21 ( 0.00%) 3468.95 * 15.85%*
Hmean send-2048 5725.60 ( 0.00%) 6628.99 * 15.78%*
Hmean send-3312 8468.36 ( 0.00%) 10288.02 * 21.49%*
Hmean send-4096 10135.46 ( 0.00%) 12387.57 * 22.22%*
Hmean send-8192 17142.07 ( 0.00%) 19748.11 * 15.20%*
Hmean send-16384 28539.71 ( 0.00%) 30084.45 * 5.41%*

Fixes: e6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems")
Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
---
drivers/idle/intel_idle.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
index 9a810e4a7946..4af2d3f2c8aa 100644
--- a/drivers/idle/intel_idle.c
+++ b/drivers/idle/intel_idle.c
@@ -1212,14 +1212,13 @@ static bool __init intel_idle_acpi_cst_extract(void)
if (!intel_idle_cst_usable())
continue;

- if (!acpi_processor_claim_cst_control()) {
- acpi_state_table.count = 0;
- return false;
- }
+ if (!acpi_processor_claim_cst_control())
+ break;

return true;
}

+ acpi_state_table.count = 0;
pr_debug("ACPI _CST not found or not usable\n");
return false;
}