Re: [PATCH v2] mm/slub: disable slab merging in the default configuration

From: David Rientjes
Date: Tue Jul 25 2023 - 19:26:03 EST


On Tue, 18 Jul 2023, Julian Pidancet wrote:

> Hi David,
>
> Many thanks for running all these tests. The amount of attention you've
> given this change is simply amazing. I wish I could have been able to
> assist you by doing more tests, but I've been lacking the necessary
> resources to do so.
>
> I'm as surprised as you are regarding the skylake regression. 20% is
> quite a large number, but perhaps it's less worrying than it looks given
> that benchmarks are usually very different from real-world workloads?
>

I'm not an expert on context_switch1_per_thread_ops so I can't infer
which workloads would be most affected by such a regression, other than to
point out that -18% is quite substantial.

I'm still hoping to run some benchmarks with 64KB page sizes as Christoph
suggested; I should be able to do this with arm64.

It's certainly good news that the overall memory footprint doesn't change
much with this change.

> As Kees Cook was suggesting in his own reply, have you given a thought
> about including this change in -next and see if there are regressions
> showing up in CI performance tests results?
>

I assume that anything we can run with CI performance tests can also be
run without merging into -next?

The performance degradation is substantial for a microbenchmark. I'd like
to complete the picture on other benchmarks and do a complete analysis
with 64KB page sizes since I think the concern Christoph mentions could be
quite real. We just don't have the data yet to make an informed
assessment of it. Certainly would welcome any help that others would like
to provide for running benchmarks with this change as well :P

Once we have a complete picture, we might also want to discuss what we are
hoping to achieve with such a change. I was very supportive of it prior
to the -18% benchmark result. But if most users are simply using whatever
their distro defaults to and other users may already be opting into this
either by the kernel command line or .config, it's hard to determine
exactly the set of users that would be affected by this change. Suddenly
causing a -18% regression overnight for this would be surprising for them.
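
For anyone trying to work out which side of the default a given system
is on, here is a minimal sketch of the two opt-in paths mentioned above.
It assumes the build-time default is controlled by
CONFIG_SLAB_MERGE_DEFAULT and that the slab_nomerge/slub_nomerge and
slab_merge/slub_merge boot parameters override it, with the last one on
the command line winning. The function name merging_enabled is
hypothetical, tokens are matched exactly, and the sketch ignores other
reasons a cache may not be merged (debug flags, for example).

```python
def merging_enabled(cmdline: str, config_text: str) -> bool:
    """Best-effort guess at whether slab merging is on (illustrative only)."""
    # Build-time default: merging is on only if the option is set to 'y'.
    enabled = any(
        line.strip() == "CONFIG_SLAB_MERGE_DEFAULT=y"
        for line in config_text.splitlines()
    )
    # Boot parameters override the default; assume the last one wins.
    for token in cmdline.split():
        if token in ("slab_nomerge", "slub_nomerge"):
            enabled = False
        elif token in ("slab_merge", "slub_merge"):
            enabled = True
    return enabled


if __name__ == "__main__":
    # Typical inputs would be /proc/cmdline and the running kernel's config.
    with open("/proc/cmdline") as f:
        cmdline = f.read()
    print(merging_enabled(cmdline, "CONFIG_SLAB_MERGE_DEFAULT=y\n"))
```

The config text is passed in as a string because its location varies
between distros; the hard-coded value in the usage example is only a
placeholder.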