Re: [PATCH] perf/core: Introduce cpuctx->cgrp_ctx_list

From: Ingo Molnar
Date: Wed Oct 04 2023 - 03:26:55 EST



* Namhyung Kim <namhyung@xxxxxxxxxx> wrote:

> AFAIK we don't have a tool to measure the context switch overhead
> directly. (I think I should add one to perf ftrace latency). But I can
> see it with a simple perf bench command like this.
>
> $ perf bench sched pipe -l 100000
> # Running 'sched/pipe' benchmark:
> # Executed 100000 pipe operations between two processes
>
> Total time: 0.650 [sec]
>
> 6.505740 usecs/op
> 153710 ops/sec
>
> It runs two tasks that communicate with each other using a pipe, so it
> should stress the context switch code. These are the normal numbers on my
> system. But after I run these two perf stat commands in the background,
> the numbers vary a lot.
>
> $ sudo perf stat -a -e cycles -G user.slice -- sleep 100000 &
> $ sudo perf stat -a -e uncore_imc/cas_count_read/ -- sleep 10000 &
>
> I will show the last two lines of perf bench sched pipe output for
> three runs.
>
> 58.597060 usecs/op # run 1
> 17065 ops/sec
>
> 11.329240 usecs/op # run 2
> 88267 ops/sec
>
> 88.481920 usecs/op # run 3
> 11301 ops/sec
>
> I think the deviation comes from the fact that uncore events are managed
> on a certain number of CPUs only. If the target process runs on a CPU that
> manages the uncore PMU, it'd take longer. Otherwise it won't affect the
> performance much.

The pipe-message context-switching numbers will also vary a lot depending
on CPU migration patterns.

The best way to measure context-switch overhead is to pin the benchmark
to a single CPU with something like:

$ taskset 1 perf stat --null --repeat 10 perf bench sched pipe -l 10000 >/dev/null

Performance counter stats for 'perf bench sched pipe -l 10000' (10 runs):

0.049798 +- 0.000102 seconds time elapsed ( +- 0.21% )

as you can see the 0.21% stddev is pretty low.

If we allow 2 CPUs, both runtime and stddev are much higher:

$ taskset 3 perf stat --null --repeat 10 perf bench sched pipe -l 10000 >/dev/null

Performance counter stats for 'perf bench sched pipe -l 10000' (10 runs):

1.4835 +- 0.0383 seconds time elapsed ( +- 2.58% )
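
For reference, the argument to taskset in both commands is a hexadecimal
CPU affinity mask: bit N selects CPU N, so 1 means CPU 0 only and 3 means
CPUs 0 and 1. A minimal sketch of how such a mask can be built from a list
of CPU numbers (cpu_mask is a hypothetical helper name used only for
illustration; it is not part of taskset or perf):

```shell
# Hypothetical helper: print the hex affinity mask for the given CPU numbers.
cpu_mask() {
    mask=0
    for cpu in "$@"; do
        # Set bit number $cpu in the mask.
        mask=$(( mask | (1 << cpu) ))
    done
    printf '%x\n' "$mask"
}

cpu_mask 0      # prints 1: CPU 0 only, as in the first run
cpu_mask 0 1    # prints 3: CPUs 0 and 1, as in the second run
```

The result can be passed straight to taskset, e.g. taskset "$(cpu_mask 0)"
in place of taskset 1.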

Thanks,

Ingo