Re: [RFC PATCH 00/14] Prevent cross-cache attacks in the SLUB allocator

From: Ingo Molnar
Date: Wed Sep 20 2023 - 03:45:01 EST



* Matteo Rizzo <matteorizzo@xxxxxxxxxx> wrote:

> On Mon, 18 Sept 2023 at 19:39, Ingo Molnar <mingo@xxxxxxxxxx> wrote:
> >
> > What's the split of the increase in overhead due to SLAB_VIRTUAL=y, between
> > user-space execution and kernel-space execution?
> >
>
> Same benchmark as before (compiling a kernel on a system running the patched
> kernel):
>
> Intel Skylake:
>
> LABEL          | COUNT | MIN      | MAX      | MEAN     | MEDIAN   | STDDEV
> ---------------+-------+----------+----------+----------+----------+--------
> wall clock     |       |          |          |          |          |
> SLAB_VIRTUAL=n | 150   | 49.700   | 51.320   | 50.449   | 50.430   | 0.29959
> SLAB_VIRTUAL=y | 150   | 50.020   | 51.660   | 50.880   | 50.880   | 0.30495
>                |       | +0.64%   | +0.66%   | +0.85%   | +0.89%   | +1.79%
> system time    |       |          |          |          |          |
> SLAB_VIRTUAL=n | 150   | 358.560  | 362.900  | 360.922  | 360.985  | 0.91761
> SLAB_VIRTUAL=y | 150   | 362.970  | 367.970  | 366.062  | 366.115  | 1.015
>                |       | +1.23%   | +1.40%   | +1.42%   | +1.42%   | +10.60%
> user time      |       |          |          |          |          |
> SLAB_VIRTUAL=n | 150   | 3110.000 | 3124.520 | 3118.143 | 3118.120 | 2.466
> SLAB_VIRTUAL=y | 150   | 3115.070 | 3127.070 | 3120.762 | 3120.925 | 2.654
>                |       | +0.16%   | +0.08%   | +0.08%   | +0.09%   | +7.63%

These Skylake figures are a bit counter-intuitive: how does a +0.08%
increase in user time, which accounts for roughly 89.5% of CPU time,
combined with a +1.42% increase in system time, which accounts for only
about 10.5%, result in a +0.85% increase in wall-clock time?
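
To make the mismatch concrete, here is a rough sketch of the arithmetic,
using only the MEAN column of the Skylake table quoted above. It assumes
that summing user and system time is a fair proxy for total CPU time; the
variable and function names are mine, not from the patch series.

```python
# Mean times in seconds, copied from the Skylake table quoted above.
user_n, user_y = 3118.143, 3120.762   # user time, SLAB_VIRTUAL=n / =y
sys_n, sys_y = 360.922, 366.062       # system time, SLAB_VIRTUAL=n / =y
wall_n, wall_y = 50.449, 50.880       # wall clock, SLAB_VIRTUAL=n / =y


def pct_increase(before, after):
    """Relative increase of `after` over `before`, in percent."""
    return (after - before) / before * 100.0


# Assumption: total CPU time is simply user time plus system time.
cpu_n = user_n + sys_n
cpu_y = user_y + sys_y

user_share = user_n / cpu_n * 100.0
sys_share = sys_n / cpu_n * 100.0
cpu_delta = pct_increase(cpu_n, cpu_y)
wall_delta = pct_increase(wall_n, wall_y)

print(f"user share {user_share:.1f}%, system share {sys_share:.1f}%")
print(f"total CPU time {cpu_delta:+.2f}%, wall clock {wall_delta:+.2f}%")
```

By this estimate total CPU time grows by only about a quarter of a
percent, several times less than the wall-clock increase.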

There might be hidden factors at work in the DMA space, as Linus suggested?

Or perhaps wall-clock time is dominated by the single-threaded final link
of the kernel, a phase that might be disproportionately hurt by these
changes?

(Stddev seems low enough for this not to be a measurement artifact.)
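
As a back-of-the-envelope check of that last point, the sketch below
compares the wall-clock difference against the standard error of the
difference of the two means. It assumes the 150 runs per configuration are
independent samples, which the quoted table does not state.

```python
import math

# Wall-clock mean and standard deviation from the Skylake table above.
n = 150                              # runs per configuration
mean_n, std_n = 50.449, 0.29959      # SLAB_VIRTUAL=n
mean_y, std_y = 50.880, 0.30495      # SLAB_VIRTUAL=y

# Standard error of the difference of two independent sample means.
se_diff = math.sqrt(std_n**2 / n + std_y**2 / n)
diff = mean_y - mean_n
ratio = diff / se_diff

print(f"difference {diff:.3f}s, standard error {se_diff:.4f}s, ratio {ratio:.1f}")
```

Under that independence assumption the difference is more than ten
standard errors wide, so run-to-run noise alone is an unlikely explanation.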

The AMD Milan figures are more intuitive:

> AMD Milan:
>
> LABEL          | COUNT | MIN      | MAX      | MEAN     | MEDIAN   | STDDEV
> ---------------+-------+----------+----------+----------+----------+--------
> wall clock     |       |          |          |          |          |
> SLAB_VIRTUAL=n | 150   | 25.480   | 26.550   | 26.065   | 26.055   | 0.23495
> SLAB_VIRTUAL=y | 150   | 25.820   | 27.080   | 26.531   | 26.540   | 0.25974
>                |       | +1.33%   | +2.00%   | +1.79%   | +1.86%   | +10.55%
> system time    |       |          |          |          |          |
> SLAB_VIRTUAL=n | 150   | 478.530  | 540.420  | 520.803  | 521.485  | 9.166
> SLAB_VIRTUAL=y | 150   | 530.520  | 572.460  | 552.825  | 552.985  | 7.161
>                |       | +10.86%  | +5.93%   | +6.15%   | +6.04%   | -21.88%
> user time      |       |          |          |          |          |
> SLAB_VIRTUAL=n | 150   | 2373.540 | 2403.800 | 2386.343 | 2385.840 | 5.325
> SLAB_VIRTUAL=y | 150   | 2388.690 | 2426.290 | 2408.325 | 2408.895 | 6.667
>                |       | +0.64%   | +0.94%   | +0.92%   | +0.97%   | +25.20%
>
>
> I'm not exactly sure why user time increases by almost 1% on Milan, it
> could be TLB contention.

The other worrying aspect is the +6.15% increase in system time, which is
roughly in line with what we'd expect given the +1.79% increase in
wall-clock time.
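
For reference, a small sketch of how the Milan MEAN figures quoted above
decompose. It again assumes that user plus system time approximates total
CPU time, and expresses each component's increase in percentage points of
the baseline total; the names are illustrative only.

```python
# Mean times in seconds, copied from the AMD Milan table quoted above.
user_n, user_y = 2386.343, 2408.325   # user time, SLAB_VIRTUAL=n / =y
sys_n, sys_y = 520.803, 552.825       # system time, SLAB_VIRTUAL=n / =y
wall_n, wall_y = 26.065, 26.531       # wall clock, SLAB_VIRTUAL=n / =y

# Assumption: baseline total CPU time is user time plus system time.
cpu_n = user_n + sys_n

# Contribution of each component, in percentage points of baseline CPU time.
sys_contrib = (sys_y - sys_n) / cpu_n * 100.0
user_contrib = (user_y - user_n) / cpu_n * 100.0
cpu_delta = sys_contrib + user_contrib

wall_delta = (wall_y - wall_n) / wall_n * 100.0

print(f"system {sys_contrib:+.2f} pts, user {user_contrib:+.2f} pts")
print(f"total CPU time {cpu_delta:+.2f}%, wall clock {wall_delta:+.2f}%")
```

Here the estimated CPU-time increase and the wall-clock increase land
within about a tenth of a percentage point of each other, with system time
contributing the larger part.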

Thanks,

Ingo