Re: [tip:timers/core] [timers] 7ee9887703: netperf.Throughput_Mbps -1.2% regression

From: Frederic Weisbecker
Date: Tue Mar 05 2024 - 05:55:24 EST


On Tue, Mar 05, 2024 at 10:17:43AM +0800, Oliver Sang wrote:
> hi, Frederic Weisbecker,
>
> On Mon, Mar 04, 2024 at 12:28:33PM +0100, Frederic Weisbecker wrote:
> > On Mon, Mar 04, 2024 at 10:13:00AM +0800, Oliver Sang wrote:
> > > On Mon, Mar 04, 2024 at 01:32:45AM +0100, Frederic Weisbecker wrote:
> > > > On Fri, Mar 01, 2024 at 04:09:24PM +0800, kernel test robot wrote:
> > > > > commit:
> > > > > 57e95a5c41 ("timers: Introduce function to check timer base is_idle flag")
> > > > > 7ee9887703 ("timers: Implement the hierarchical pull model")
> > > >
> > > > Is this something that is observed also with the commits that follow in this
> > > > branch?
> > >
> > > when this bisect was done, we also tested the tip of the timers/core branch at that time
> > > 8b3843ae3634b vdso/datapage: Quick fix - use asm/page-def.h for ARM64
> > >
> > > the regression still exists on it:
> > >
> > > 57e95a5c4117dc6a 7ee988770326fca440472200c3e 8b3843ae3634b472530fb69c386
> > > ---------------- --------------------------- ---------------------------
> > >          %stddev     %change         %stddev     %change         %stddev
> > >              \          |                \          |                \
> > >       4.10            -1.2%       4.05            -1.2%       4.05        netperf.ThroughputBoth_Mbps
> > >       1049            -1.2%       1037            -1.2%       1036        netperf.ThroughputBoth_total_Mbps
> > >       4.10            -1.2%       4.05            -1.2%       4.05        netperf.Throughput_Mbps
> > >       1049            -1.2%       1037            -1.2%       1036        netperf.Throughput_total_Mbps
> >
> > Oh, I see... :-/
> >
> > > > I.e., would it be possible to compare instead:
> > > >
> > > > 57e95a5c4117 (timers: Introduce function to check timer base is_idle flag)
> > > > VS
> > > > b2cf7507e186 (timers: Always queue timers on the local CPU)
> > > >
> > > > Because the improvements introduced by 7ee9887703 are mostly relevant after
> > > > b2cf7507e186.
> > >
> > > got it. will test.
> > >
> > > at the same time, we noticed the current tip of timers/core is
> > > a184d9835a0a6 (tip/timers/core) tick/sched: Fix build failure for
> > > CONFIG_NO_HZ_COMMON=n
> >
> > Shouldn't be a problem as it fixes an issue introduced after:
> >
> > b2cf7507e186 (timers: Always queue timers on the local CPU)
> >
> > >
> > > though it seems irrelevant, we will still get data for it.
> >
> > Thanks a lot, this will be very helpful. Especially with all the perf diff
> > details like in the initial email report.
>
> the regression still exists on b2cf7507e186 and the current tip of the branch:
>
> =========================================================================================
> cluster/compiler/cpufreq_governor/ip/kconfig/nr_threads/rootfs/runtime/tbox_group/test/testcase:
> cs-localhost/gcc-12/performance/ipv4/x86_64-rhel-8.3/200%/debian-12-x86_64-20240206.cgz/300s/lkp-icl-2sp2/SCTP_STREAM/netperf
>
> commit:
> 57e95a5c4117 (timers: Introduce function to check timer base is_idle flag)
> b2cf7507e186 (timers: Always queue timers on the local CPU)
> a184d9835a0a (tick/sched: Fix build failure for CONFIG_NO_HZ_COMMON=n)
>
> a184d9835a0a689261ea6a4a8dbc18173a031b77
>
> 57e95a5c4117dc6a b2cf7507e18649a30512515ec0c a184d9835a0a689261ea6a4a8db
> ---------------- --------------------------- ---------------------------
>          %stddev     %change         %stddev     %change         %stddev
>              \          |                \          |                \
>       4.10            -1.4%       4.04            -1.5%       4.04        netperf.ThroughputBoth_Mbps
>       1049            -1.4%       1034            -1.5%       1033        netperf.ThroughputBoth_total_Mbps
>       4.10            -1.4%       4.04            -1.5%       4.04        netperf.Throughput_Mbps
>       1049            -1.4%       1034            -1.5%       1033        netperf.Throughput_total_Mbps
>
> details are below [1]

Thanks a lot!

>
> > Because I'm having some trouble
> > running those lkp tests. How is it working BTW? I've seen it downloading
> > two kernel trees but I haven't noticed a kernel build.
>
> you need to build 7ee9887703 and its parent kernel with the config in
> https://download.01.org/0day-ci/archive/20240301/202403011511.24defbbd-oliver.sang@xxxxxxxxx
> then boot into each kernel.
>
> after that, you could run netperf in each kernel by following
> https://download.01.org/0day-ci/archive/20240301/202403011511.24defbbd-oliver.sang@xxxxxxxxx/reproduce
> to get data.
>
> the results will be stored in different paths according to the kernel commit, so you
> can compare the results from both kernels.

Oh I see now.

>
> what's your OS BTW? we cannot support all distributions so far...

openSUSE, but it failed to find a lot of equivalent packages.
Then I tried Ubuntu 22.04.4 LTS, but it failed, saying perf didn't have the
"sched" subcommand. Which distro do you recommend using?

>
> > Are the two compared
> > instances running through kvm?
>
> we run performance tests on bare metal. for netperf, we just test on one
> machine so the test really runs over the local network.

Ok.

Thanks!