Re: [Regression] mqueue performance degradation after "The new cgroup slab memory controller" patchset.

From: Roman Gushchin
Date: Mon Dec 05 2022 - 21:15:22 EST


On Mon, Dec 05, 2022 at 02:55:48PM +0000, Luther, Sven wrote:
> #regzbot ^introduced 10befea91b61c4e2c2d1df06a2e978d182fcf792
>
> We are making heavy use of mqueues, and noticed a degradation of performance between 4.18 & 5.10 linux kernels.
>
> After a coarse per-version tracing, we did a kernel bisection between 5.8 and 5.9
> and traced the issue to a range of 10 patches (of which 9 were skipped as they didn't boot) between:
>
>
> commit 10befea91b61c4e2c2d1df06a2e978d182fcf792 (HEAD, refs/bisect/bad)
> Author: Roman Gushchin <guro@xxxxxx>
> Date: Thu Aug 6 23:21:27 2020 -0700
>
> mm: memcg/slab: use a single set of kmem_caches for all allocations
>
> and:
>
> commit 286e04b8ed7a04279ae277f0f024430246ea5eec (refs/bisect/good-286e04b8ed7a04279ae277f0f024430246ea5eec)
> Author: Roman Gushchin <guro@xxxxxx>
> Date: Thu Aug 6 23:20:52 2020 -0700
>
> mm: memcg/slab: allocate obj_cgroups for non-root slab pages
>
> All of them are part of the "The new cgroup slab memory controller" patchset:
>
> https://lore.kernel.org/all/20200623174037.3951353-18-guro@xxxxxx/T/
>
> from Roman Gushchin, which moves the accounting from the page level to the object level.
>
> Measurements were done using a test program, which measures min/average/max time for mqueue_send/mqueue_rcv,
> and the average for getppid, both measured over 100 000 runs. Results are shown in the following table:
>
> +----------+--------------------------+-------------------------+----------------+
> | kernel | mqueue_rcv (ns) | mqueue_send (ns) | getppid |
> | version | min avg max variation | min avg max variation | (ns) variation |
> +----------+--------------------------+-------------------------+----------------+
> | 4.18.45 | 351 382 17533 base | 383 410 13178 base | 149 base |
> | 5.8-good | 380 392 7156 -2,55% | 376 384 6225 6,77% | 169 -11,83% |
> | 5.8-bad | 524 530 5310 -27,92% | 512 519 8775 -21,00% | 169 -11,83% |
> | 5.10 | 520 533 4078 -28,33% | 518 534 8108 -23,22% | 167 -10,78% |
> | 5.15 | 431 444 8440 -13,96% | 425 437 6170 -6,18% | 171 -12,87% |
> | 6.03 | 474 614 3881 -37,79% | 482 693 931 -40,84% | 171 -12,87% |
> +----------+--------------------------+-------------------------+----------------+
>

Hi Sven!

To prove the concept of local msg caching, I've put together a patch (attached
below). In my test setup it seems to resolve most of the regression. Would you
mind giving it a try? (It has only been tested on my local VM, so don't treat it
as production code.) If it fixes the regression, I can invest more time into it
and post it in an upstreamable form.

Here are my results (5 runs each):

Original (current mm tree, 6.1+):
RX: 1122/1202/114001 1197/1267/26517 1109/1173/29613 1091/1165/54434 1091/1160/26302
TX: 1176/1255/38168 1252/1360/27683 1165/1226/41454 1145/1222/90040 1146/1214/26595

No accounting:
RX: 984/1053/31268 1024/1091/39105 1018/1077/61515 999/1065/30423 1008/1060/115284
TX: 1020/1097/137690 1065/1143/31448 1055/1130/133278 1032/1106/52372 1043/1099/25705

Patched:
RX: 1033/1165/38579 1030/1108/43703 1022/1114/25653 1008/1110/38462 1089/1136/29120
TX: 1047/1184/25373 1048/1116/25425 1034/1122/61275 1022/1121/24636 1105/1155/46600
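
In case it helps when comparing runs: the triples above read as min/avg/max,
and the variation column in the quoted table appears to be computed relative to
each kernel's own average, i.e. (base_avg - avg) / avg, not relative to the
base (382 vs. 530 gives -27,92%). Below is a minimal sketch of that
bookkeeping. The names lat_stats, lat_init, lat_update, lat_avg and
variation_pct are hypothetical helpers made up for illustration; they are not
taken from Sven's test program or from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: running min/avg/max over latency samples (ns). */
struct lat_stats {
	uint64_t min;
	uint64_t max;
	uint64_t sum;
	uint64_t count;
};

static inline void lat_init(struct lat_stats *s)
{
	s->min = UINT64_MAX;
	s->max = 0;
	s->sum = 0;
	s->count = 0;
}

/* Record one sample, e.g. the time spent in a single send or receive. */
static inline void lat_update(struct lat_stats *s, uint64_t ns)
{
	if (ns < s->min)
		s->min = ns;
	if (ns > s->max)
		s->max = ns;
	s->sum += ns;
	s->count++;
}

/* Integer average of all recorded samples; 0 if nothing was recorded. */
static inline uint64_t lat_avg(const struct lat_stats *s)
{
	return s->count ? s->sum / s->count : 0;
}

/*
 * Variation in percent, assuming the formula inferred from the quoted
 * table: (base_avg - avg) / avg. Negative means slower than the base.
 */
static inline double variation_pct(double base_avg, double avg)
{
	assert(avg > 0.0);
	return (base_avg - avg) / avg * 100.0;
}
```

With the table's numbers, variation_pct(382, 530) comes out at about -27.92
and variation_pct(149, 169) at about -11.83, which matches the 5.8-bad row.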

Thanks!

--