Re: [linus:master] [x86/bugs] 6613d82e61: stress-ng.mutex.ops_per_sec -7.9% regression

From: Feng Tang
Date: Tue Mar 05 2024 - 00:51:38 EST


Hi Dave,

On Mon, Mar 04, 2024 at 09:58:53AM -0800, Dave Hansen wrote:
> On 3/3/24 21:53, kernel test robot wrote:
> > kernel test robot noticed a -7.9% regression of stress-ng.mutex.ops_per_sec on:
> >
> > commit: 6613d82e617dd7eb8b0c40b2fe3acea655b1d611 ("x86/bugs: Use ALTERNATIVE() instead of mds_user_clear static key")
> > https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> This _looks_ like noise to me.
>
> Some benchmarks went up, some went down. The differential profile shows
> random gunk that basically amounts to "my computer is slow" because it's
> mostly things that change when the result changes, like:
>
> > 182670 +9.0% 199032 stress-ng.mutex.nanosecs_per_mutex
>
> Does anyone think there's something substantial to chase after here?

We checked this further, and it looks like another case of a data/text
alignment effect: 6613d82e617d removes the static key 'mds_user_clear',
which sits in the '.bss' section, and that changes the address alignment
of the data following it in that section.
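
To illustrate the kind of effect meant here, below is a minimal userspace
sketch, not kernel code. It models a section as a struct and shows how
dropping one 16-byte object moves the data behind it relative to 64-byte
cache lines. The 16-byte key size, the 48-byte 'hot_data' object, and the
three preceding keys are assumptions made up for the illustration; the
real objects that moved in '.bss' were not identified in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

#define CACHE_LINE 64u

/* Hypothetical stand-ins: a 16-byte key and a 48-byte hot object. */
struct fake_key { char bytes[16]; };
struct hot_data { char bytes[48]; };

/* Section layout with the key still present. */
struct bss_before {
	struct fake_key earlier[3];      /* assumed preceding keys */
	struct fake_key mds_user_clear;  /* the key the commit removes */
	struct hot_data hot;             /* data that follows it */
};

/* Same layout after the key is removed. */
struct bss_after {
	struct fake_key earlier[3];
	struct hot_data hot;
};

/* Return 1 if [off, off + size) spans more than one cache line. */
int straddles_line(size_t off, size_t size)
{
	return off / CACHE_LINE != (off + size - 1) / CACHE_LINE;
}

/* Print where 'hot' lands in each layout. */
void compare_layouts(void)
{
	size_t before = offsetof(struct bss_before, hot);
	size_t after = offsetof(struct bss_after, hot);

	printf("before: offset %zu, straddles=%d\n",
	       before, straddles_line(before, sizeof(struct hot_data)));
	printf("after:  offset %zu, straddles=%d\n",
	       after, straddles_line(after, sizeof(struct hot_data)));
}
```

Calling compare_layouts() from a main() shows 'hot' starting on a cache
line boundary in the first layout and spanning two lines in the second,
even though nothing about 'hot' itself changed.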

With the debug patch below, which restores the alignment, the
performance recovers:

a0e2dab44d22b913 6613d82e617dd7eb8b0c40b2fe3 398e7f0da8595354dc330938831
---------------- --------------------------- ---------------------------
          302318        -7.9%         278364        +0.3%         303161 stress-ng.mutex.ops_per_sec

---
diff --git a/arch/x86/kernel/cpu/bugs.c b/arch/x86/kernel/cpu/bugs.c
index 48d049cd74e7..1876865dc954 100644
--- a/arch/x86/kernel/cpu/bugs.c
+++ b/arch/x86/kernel/cpu/bugs.c
@@ -111,6 +111,9 @@ DEFINE_STATIC_KEY_FALSE(switch_mm_cond_ibpb);
 /* Control unconditional IBPB in switch_mm() */
 DEFINE_STATIC_KEY_FALSE(switch_mm_always_ibpb);
 
+DEFINE_STATIC_KEY_FALSE(test_static_key);
+EXPORT_SYMBOL_GPL(test_static_key);
+
 /* Control MDS CPU buffer clear before idling (halt, mwait) */
 DEFINE_STATIC_KEY_FALSE(mds_idle_clear);
 EXPORT_SYMBOL_GPL(mds_idle_clear);
---

There was another, similar case in which the alignment of the
percpu section changed:
https://lore.kernel.org/lkml/ZSeF6T0mkrH5pOgD@feng-clx/

Thanks,
Feng