Re: [PATCH v3 1/1] x86,sched: On AMD EPYC set freq_max = max_boost in schedutil invariant formula

From: Rafael J. Wysocki
Date: Thu Feb 04 2021 - 08:58:32 EST


On Thu, Feb 4, 2021 at 2:49 PM Giovanni Gherdovich <ggherdovich@xxxxxxx> wrote:
>
> On Wed, 2021-02-03 at 19:25 +0100, Rafael J. Wysocki wrote:
> > [cut]
> >
> > So below is a prototype of an alternative fix for the issue at hand.
> >
> > I can't really test it here, because there's no _CPC in the ACPI tables of my
> > test machines, so testing it would be appreciated. However, AFAICS these
> > machines are affected by the performance issue related to the scale-invariance
> > when they are running acpi-cpufreq, so what we are doing here is not entirely
> > sufficient.
> >
> > It looks like the scale-invariance code should ask the cpufreq driver about
> > the maximum frequency and note that cpufreq drivers may be changed on the
> > fly.
> >
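To illustrate why the two notions of maximum frequency have to agree, here is a rough worked example in Python. It assumes the schedutil formula from the kernel documentation, next_freq = 1.25 * max_freq * util / max, and it assumes that the frequency-invariant utilization is scaled against the max boost frequency. The helper name and the "fully busy at P0" scenario are made up for illustration; the frequencies are the EPYC 7742 values quoted further down in this message.

```python
# Illustrative arithmetic only; this is not kernel code.
# Assumption: next_freq = 1.25 * max_freq * util, with util normalized to 0..1.

P0_GHZ = 2.25         # highest entry in the ACPI frequency table
MAX_BOOST_GHZ = 3.40  # max boost frequency


def schedutil_next_freq(max_freq_ghz, util):
    """Frequency requested for a given invariant utilization (hypothetical helper)."""
    return 1.25 * max_freq_ghz * util


# Assumption: a CPU that is 100% busy while running at P0 reports an invariant
# utilization of P0 / max_boost, because invariance scales against max boost.
util = P0_GHZ / MAX_BOOST_GHZ

# cpufreq reports P0 as cpuinfo.max_freq (the mismatched case).
mismatched = schedutil_next_freq(P0_GHZ, util)
# cpufreq reports max boost as cpuinfo.max_freq (the consistent case).
consistent = schedutil_next_freq(MAX_BOOST_GHZ, util)

print(f"util       = {util:.3f}")
print(f"mismatched = {mismatched:.2f} GHz")
print(f"consistent = {consistent:.2f} GHz")
```

Under these assumptions the mismatched case requests a frequency below P0 even though the CPU is saturated, while the consistent case requests one above P0.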
> > What the patch below does is to add an extra entry to the frequency table for
> > each CPU to represent the maximum "boost" frequency, so as to cause that
> > frequency to be used as cpuinfo.max_freq.
> >
> > The reason why I think it is better to extend the frequency tables instead
> > of simply increasing the frequency for the "P0" entry is because the latter
> > may cause "turbo" frequency to be asked for less often.
> >
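The two options compared above can be sketched as follows. This is a Python illustration of the resulting table shapes only, not the code from the patch; both function names are hypothetical and the table values are the EPYC 7742 numbers quoted later in this message. It relies on cpuinfo.max_freq being taken as the largest entry in the frequency table.

```python
# Hypothetical sketch; not the acpi-cpufreq change itself.
# Frequencies are in kHz, highest first.

def extend_with_boost(freq_table_khz, max_boost_khz):
    """Add max boost as an extra entry, keeping every original entry."""
    table = sorted(freq_table_khz, reverse=True)
    if max_boost_khz > table[0]:
        table.insert(0, max_boost_khz)
    return table


def bump_p0(freq_table_khz, max_boost_khz):
    """Alternative: overwrite the P0 entry with the max boost frequency."""
    table = sorted(freq_table_khz, reverse=True)
    table[0] = max(table[0], max_boost_khz)
    return table


epyc_7742_khz = [1_500_000, 2_000_000, 2_250_000]  # P2, P1, P0
max_boost_khz = 3_400_000

extended = extend_with_boost(epyc_7742_khz, max_boost_khz)
bumped = bump_p0(epyc_7742_khz, max_boost_khz)

print("extended:", extended)
print("bumped:  ", bumped)
print("max of either table:", max(extended), max(bumped))
```

Both variants yield the same maximum, so cpuinfo.max_freq would be 3.40 GHz either way. The difference is that the extended table still has a distinct 2.25 GHz entry, whereas the bumped table jumps from 2.00 GHz straight to 3.40 GHz. The sketch does not model how often the governor ends up asking for each entry.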
> > ---
> > drivers/cpufreq/acpi-cpufreq.c | 107 ++++++++++++++++++++++++++++++++++++-----
> > 1 file changed, 95 insertions(+), 12 deletions(-)
>
> Hello Rafael,
>
> thanks for looking at this. Your patch is indeed cleaner than the one I proposed.
>
> Preliminary testing is favorable; more tests are running.
>
> Results from your patch are in the fourth column below; the performance from
> v5.10 looks restored.
>
> I'll follow up once the tests I queued are completed.

Thank you!

> TEST        : Intel Open Image Denoise, www.openimagedenoise.org
> INVOCATION  : ./denoise -hdr memorial.pfm -out out.pfm -bench 200 -threads $NTHREADS
> CPU         : MODEL            : 2x AMD EPYC 7742
>               FREQUENCY TABLE  : P2: 1.50 GHz
>                                  P1: 2.00 GHz
>                                  P0: 2.25 GHz
>               MAX BOOST        :     3.40 GHz
>
> Results: threads, msecs (ratio). Lower is better.
>
>      v5.10               v5.11-rc4           v5.11-rc4-ggherdov  v5.11-rc6-rafael
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>   1  1069.85 (1.00)      1071.84 (1.00)      1070.42 (1.00)      1069.12 (1.00)
>   2   542.24 (1.00)       544.40 (1.00)       544.48 (1.00)       540.81 (1.00)
>   4   278.00 (1.00)       278.44 (1.00)       277.72 (1.00)       277.79 (1.00)
>   8   149.81 (1.00)       149.61 (1.00)       149.87 (1.00)       149.51 (1.00)
>  16    79.01 (1.00)        79.31 (1.00)        78.94 (1.00)        79.02 (1.00)
>  24    58.01 (1.00)        58.51 (1.01)        58.15 (1.00)        57.84 (1.00)
>  32    46.58 (1.00)        48.30 (1.04)        46.66 (1.00)        46.70 (1.00)
>  48    37.29 (1.00)        51.29 (1.38)        37.27 (1.00)        38.10 (1.02)
>  64    34.01 (1.00)        49.59 (1.46)        33.71 (0.99)        34.51 (1.01)
>  80    31.09 (1.00)        44.27 (1.42)        31.33 (1.01)        31.11 (1.00)
>  96    28.56 (1.00)        40.82 (1.43)        28.47 (1.00)        28.65 (1.00)
> 112    28.09 (1.00)        40.06 (1.43)        28.63 (1.02)        28.38 (1.01)
> 120    28.73 (1.00)        39.78 (1.38)        28.14 (0.98)        28.16 (0.98)
> 128    28.93 (1.00)        39.60 (1.37)        29.38 (1.02)        28.55 (0.99)
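
For reference, the ratios in parentheses appear to be each kernel's time divided by the v5.10 time for the same thread count. That reading is an assumption, but it reproduces the quoted figures. A small Python check on three of the rows (the helper name is made up):

```python
# Recompute the "(ratio)" values for a few rows of the results above.
# Assumption: ratio = msecs / v5.10 msecs, rounded to two decimals.

def ratio(msecs, baseline_msecs):
    """Return the slowdown relative to the baseline as a two-decimal string."""
    return f"{msecs / baseline_msecs:.2f}"


# threads -> (v5.10, v5.11-rc4, v5.11-rc4-ggherdov, v5.11-rc6-rafael), in msecs
rows = {
    48: (37.29, 51.29, 37.27, 38.10),
    64: (34.01, 49.59, 33.71, 34.51),
    128: (28.93, 39.60, 29.38, 28.55),
}

for threads, times in rows.items():
    baseline = times[0]
    print(threads, [ratio(t, baseline) for t in times])
```

On these rows v5.11-rc4 takes roughly 1.4 times as long as v5.10, while both patched kernels stay within about 2% of it.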