Re: [RFC PATCH 3/3] sched: Implement shared wakequeue in CFS
From: Aaron Lu
Date: Tue Jun 20 2023 - 22:35:57 EST
On Tue, Jun 20, 2023 at 12:36:26PM -0500, David Vernet wrote:
> On Fri, Jun 16, 2023 at 08:53:38AM +0800, Aaron Lu wrote:
> > I also tried that on the 18cores/36threads/LLC Skylake and the contention
> > is indeed much smaller than UDP_RR:
> >
> > 7.30% 7.29% [kernel.vmlinux] [k] native_queued_spin_lock_slowpath
> >
> > But I wouldn't say it's entirely gone. Also consider Skylake has a lot
> > fewer cores per LLC than later Intel servers like Icelake and Sapphire
> > Rapids and I expect things would be worse on those two machines.
>
> I cannot reproduce this contention locally, even on a slightly larger
Was that with the number of netperf clients equal to nr_cpu?
> Skylake. Not really sure what to make of the difference here. Perhaps
> it's because you're running with CONFIG_SCHED_CORE=y? What is the
Yes, I had that config enabled, but I didn't tag any tasks or groups.
> change in throughput when you run the default workload on your SKL?
The throughput dropped by about 5.7% with SWQUEUE:
            avg_throughput      native_queued_spin_lock_slowpath%
NO_SWQUEUE: 9528.061111111108   0.09%
SWQUEUE:    8984.369722222222   8.05%
avg_throughput: the average of the throughput reported by each netperf
client; higher is better.
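For reference, a minimal sketch of how such an average could be computed,
assuming the per-client throughput values have already been extracted to
one number per line. The helper name and the input layout are hypothetical,
and this is not necessarily the script used for the numbers above:

```shell
# Hypothetical helper: read one throughput value per line on stdin
# and print the arithmetic mean. Blank lines are skipped.
avg_throughput() {
    awk 'NF { sum += $1; n++ } END { if (n) printf "%.3f\n", sum / n }'
}

# Example with made-up values (not measured data):
printf '10\n20\n30\n' | avg_throughput
```

With the made-up input above this prints 20.000.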
I ran the workload like this:
"
netserver
for i in `seq 72`; do
	netperf -l 60 -n 72 -6 &
done
sleep 30
perf record -ag -e cycles:pp -- sleep 5 &
wait
"
(the '-n 72' should be redundant but I just keep it there)
Thanks,
Aaron