Re: ACPI _CST introduced performance regresions on Haswll

From: Rafael J. Wysocki
Date: Fri Oct 16 2020 - 11:29:49 EST


On Fri, Oct 16, 2020 at 4:09 PM Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx> wrote:
>
> On Fri, Oct 16, 2020 at 03:41:12PM +0200, Rafael J. Wysocki wrote:
> > > Turns out I didn't even have that. On another machine (same model,
> > > same cpu, different BIOS that cannot be updated), enabling the C6 state
> > > still did not enable it on boot and dmesg complained about CST not being
> > > usable. This is weird because one would expect that if CST was unusable
> > > that it would be the same as use_acpi == false.
> > >
> > > This could potentially be if the ACPI tables are unsuitable due to bad
> > > bad FFH information for a lower c-state. If _CST is not found or usable,
> > > should acpi_state_table.count be reset to go back to the old behaviour?
> >
> > Yes, it should, although I would reset it in intel_idle_cst_usable()
> > right away before returning 'false'.
> >
>
> Good stuff.
>
> > I can send a patch to do the above or please submit the one below as
> > it works too.
> >
>
> I'm happy to go with your alternative if you prefer. For a finish,
> I decided it was worth reporting if the _CST was ignored regardless of
> the reason. It performs roughly the same as setting use_acpi = false on
> the affected machines.
>
> ---8<---
> intel_idle: Ignore _CST if control cannot be taken from the platform
>
> e6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems") avoids
> enabling c-states that have been disabled by the platform with the
> exception of C1E.
>
> Unfortunately, BIOS implementations are not always consistent in terms
> of how capabilities are advertised and control cannot always be handed
> over. If control cannot be handed over then intel_idle reports that "ACPI
> _CST not found or not usable" but does not clear acpi_state_table.count
> meaning the information is still partially used.
>
> This patch ignores ACPI information if CST control cannot be requested from
> the platform. This was only observed on a number of Haswell platforms that
> had identical CPUs but not identical BIOS versions. While this problem
> may be rare overall, 24 separate test cases bisected to this specific
> commit across 4 separate test machines and is worth addressing. If the
> situation occurs, the kernel behaves as it did before commit e6d4f08a6776
> and uses any c-states that are discovered.
>
> The affected test cases were all ones that involved a small number of
> processes -- exec microbenchmark, pipe microbenchmark, git test suite,
> netperf, tbench with one client and system call microbenchmark. Each
> case benefits from being able to use turboboost which is prevented if the
> lower c-states are unavailable. This may mask real regressions specific
> to older hardware so it is worth addressing.
>
> C-state status before and after the patch
>
> 5.9.0-vanilla POLL latency:0 disabled:0 default:enabled
> 5.9.0-vanilla C1 latency:2 disabled:0 default:enabled
> 5.9.0-vanilla C1E latency:10 disabled:0 default:enabled
> 5.9.0-vanilla C3 latency:33 disabled:1 default:disabled
> 5.9.0-vanilla C6 latency:133 disabled:1 default:disabled
> 5.9.0-ignore-cst-v1r1 POLL latency:0 disabled:0 default:enabled
> 5.9.0-ignore-cst-v1r1 C1 latency:2 disabled:0 default:enabled
> 5.9.0-ignore-cst-v1r1 C1E latency:10 disabled:0 default:enabled
> 5.9.0-ignore-cst-v1r1 C3 latency:33 disabled:0 default:enabled
> 5.9.0-ignore-cst-v1r1 C6 latency:133 disabled:0 default:enabled
>
> Patch enables C3/C6.
>
> Netperf UDP_STREAM
>
> netperf-udp
> 5.5.0 5.9.0
> vanilla ignore-cst-v1r1
> Hmean send-64 193.41 ( 0.00%) 226.54 * 17.13%*
> Hmean send-128 392.16 ( 0.00%) 450.54 * 14.89%*
> Hmean send-256 769.94 ( 0.00%) 881.85 * 14.53%*
> Hmean send-1024 2994.21 ( 0.00%) 3468.95 * 15.85%*
> Hmean send-2048 5725.60 ( 0.00%) 6628.99 * 15.78%*
> Hmean send-3312 8468.36 ( 0.00%) 10288.02 * 21.49%*
> Hmean send-4096 10135.46 ( 0.00%) 12387.57 * 22.22%*
> Hmean send-8192 17142.07 ( 0.00%) 19748.11 * 15.20%*
> Hmean send-16384 28539.71 ( 0.00%) 30084.45 * 5.41%*
>
> Fixes: e6d4f08a6776 ("intel_idle: Use ACPI _CST on server systems")
> Signed-off-by: Mel Gorman <mgorman@xxxxxxxxxxxxxxxxxxx>
> ---
> drivers/idle/intel_idle.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/idle/intel_idle.c b/drivers/idle/intel_idle.c
> index 9a810e4a7946..4af2d3f2c8aa 100644
> --- a/drivers/idle/intel_idle.c
> +++ b/drivers/idle/intel_idle.c
> @@ -1212,14 +1212,13 @@ static bool __init intel_idle_acpi_cst_extract(void)
> if (!intel_idle_cst_usable())
> continue;
>
> - if (!acpi_processor_claim_cst_control()) {
> - acpi_state_table.count = 0;
> - return false;
> - }
> + if (!acpi_processor_claim_cst_control())
> + break;
>
> return true;
> }
>
> + acpi_state_table.count = 0;
> pr_debug("ACPI _CST not found or not usable\n");
> return false;
> }

Applied as a fix for 5.10-rc1, thanks!