Re: [PATCH v3 7/7] sched: Shard per-LLC shared runqueues

From: Chen Yu
Date: Sat Sep 23 2023 - 02:36:39 EST


On 2023-08-31 at 14:14:44 -0500, David Vernet wrote:
> On Thu, Aug 31, 2023 at 06:45:11PM +0800, Chen Yu wrote:
> > On 2023-08-30 at 19:01:47 -0500, David Vernet wrote:
> > > On Wed, Aug 30, 2023 at 02:17:09PM +0800, Chen Yu wrote:
[snip...]
> >
> > Let me run other benchmarks to see if others are sensitive to
> > the resource locality.
>
> Great, thank you Chenyu.
>
> FYI, I'll be on vacation for over a week starting later today, so my
> responses may be delayed.
>
> Thanks in advance for working on this. Looking forward to seeing the
> results when I'm back at work.

Sorry for the late results. I applied your latest patch set on top of upstream
6.6-rc2, commit 27bbf45eae9c (I pulled the latest commit from upstream yesterday).
The good news is that there is a slight but stable overall improvement on tbench,
and no obvious regression is observed on the other benchmarks, on Sapphire Rapids
with 224 CPUs:

tbench throughput
======
case        load            baseline(std%)  compare%( std%)
loopback    56-threads       1.00 (  0.85)   +4.35 (  0.23)
loopback    112-threads      1.00 (  0.38)   +0.91 (  0.05)
loopback    168-threads      1.00 (  0.03)   +2.96 (  0.06)
loopback    224-threads      1.00 (  0.09)   +2.95 (  0.05)
loopback    280-threads      1.00 (  0.12)   +2.48 (  0.25)
loopback    336-threads      1.00 (  0.23)   +2.54 (  0.14)
loopback    392-threads      1.00 (  0.53)   +2.91 (  0.04)
loopback    448-threads      1.00 (  0.10)   +2.76 (  0.07)

schbench 99.0th tail latency
========
case        load            baseline(std%)  compare%( std%)
normal      1-mthreads       1.00 (  0.32)   +0.68 (  0.32)
normal      2-mthreads       1.00 (  1.83)   +4.48 (  3.31)
normal      4-mthreads       1.00 (  0.83)   -0.59 (  1.80)
normal      8-mthreads       1.00 (  4.47)   -1.07 (  3.49)

netperf throughput
=======
case        load            baseline(std%)  compare%( std%)
TCP_RR      56-threads       1.00 (  1.01)   +1.37 (  1.41)
TCP_RR      112-threads      1.00 (  2.44)   -0.94 (  2.63)
TCP_RR      168-threads      1.00 (  2.94)   +3.22 (  4.63)
TCP_RR      224-threads      1.00 (  2.38)   +2.83 (  3.62)
TCP_RR      280-threads      1.00 ( 66.07)   -7.26 ( 78.95)
TCP_RR      336-threads      1.00 ( 21.92)   -0.50 ( 21.48)
TCP_RR      392-threads      1.00 ( 34.31)   -0.00 ( 33.08)
TCP_RR      448-threads      1.00 ( 43.33)   -0.31 ( 43.82)
UDP_RR      56-threads       1.00 (  8.78)   +3.84 (  9.38)
UDP_RR      112-threads      1.00 ( 14.15)   +1.84 (  8.35)
UDP_RR      168-threads      1.00 (  5.10)   +2.95 (  8.85)
UDP_RR      224-threads      1.00 ( 15.13)   +2.76 ( 14.11)
UDP_RR      280-threads      1.00 ( 15.14)   +2.14 ( 16.75)
UDP_RR      336-threads      1.00 ( 25.85)   +1.64 ( 27.42)
UDP_RR      392-threads      1.00 ( 34.34)   +0.40 ( 34.20)
UDP_RR      448-threads      1.00 ( 41.87)   +1.41 ( 41.22)
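
For reading the tables above, here is a minimal sketch of how such rows could
be derived and compared against run-to-run noise. It assumes each row comes
from repeated runs, that std% is the sample standard deviation relative to the
mean, and that compare% is the relative change of the mean against baseline.
How the numbers in this mail were actually produced is not stated here, the
sign convention for the latency table is not stated either, and the function
names summarize() and within_noise() are hypothetical.

```python
from statistics import mean, stdev


def summarize(baseline_runs, compare_runs):
    """Return (baseline_std_pct, change_pct, compare_std_pct).

    Assumed definitions (not confirmed by the mail):
      std%     = 100 * sample stdev / mean
      compare% = 100 * (compare mean - baseline mean) / baseline mean
    """
    base_mean = mean(baseline_runs)
    comp_mean = mean(compare_runs)
    base_std_pct = 100.0 * stdev(baseline_runs) / base_mean
    comp_std_pct = 100.0 * stdev(compare_runs) / comp_mean
    change_pct = 100.0 * (comp_mean - base_mean) / base_mean
    return base_std_pct, change_pct, comp_std_pct


def within_noise(base_std_pct, change_pct, comp_std_pct):
    """Rough heuristic: treat a change as noise if it does not exceed
    the larger of the two relative standard deviations."""
    return abs(change_pct) <= max(base_std_pct, comp_std_pct)


if __name__ == "__main__":
    # Values taken from the tables above.
    print(within_noise(66.07, -7.26, 78.95))  # TCP_RR 280-threads
    print(within_noise(0.85, 4.35, 0.23))     # tbench 56-threads
```

Under this heuristic, the -7.26 on TCP_RR 280-threads sits well inside the
reported std% (66.07 and 78.95), while the +4.35 on tbench 56-threads exceeds
both of its std% values (0.85 and 0.23).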

We can re-run the tests once the next version of the patch set is released.

thanks,
Chenyu