Re: [PATCH v2 0/2] intel_powerclamp: New module parameter

From: Zhang, Rui
Date: Mon Feb 06 2023 - 03:05:39 EST


On Sun, 2023-02-05 at 18:45 -0800, srinivas pandruvada wrote:
> Hi Rui,
>
> On Sun, 2023-02-05 at 15:57 +0000, Zhang, Rui wrote:
> > Hi, Srinivas,
> >
> > First of all, the previous build error is gone.
> >
> > Second, I found something strange, which may be related to the
> > scheduler asym-packing, so CC Ricardo.
> >
> I thought you disable ITMT before idle injection and re-enable it
> after removal.

No.

I can reproduce this by playing with raw intel_powerclamp sysfs knobs
and ITMT enabled.
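
A minimal sketch of how a 'cpumask' value such as the fff3 from the quoted
report below maps to CPUs. It assumes the value is a plain hex bitmap in which
bit N stands for cpuN, and it hard-codes the ADL layout from the report
(cpu0-cpu7 Pcore, cpu8-cpu15 Ecore). The helper names are hypothetical and are
not part of the driver.

```python
# Assumed layout from the report: cpu0-cpu7 are Pcore, cpu8-cpu15 are Ecore.
ECORE_CPUS = set(range(8, 16))


def cpus_in_mask(mask_hex):
    """Return the set of CPU numbers whose bit is set in a hex cpumask string."""
    value = int(mask_hex, 16)
    return {cpu for cpu in range(value.bit_length()) if (value >> cpu) & 1}


def covers_all_ecores(mask_hex):
    """True if every (assumed) Ecore CPU is included in the mask."""
    return ECORE_CPUS <= cpus_in_mask(mask_hex)


if __name__ == "__main__":
    for mask in ("fff3", "00ff"):
        print(mask, sorted(cpus_in_mask(mask)), covers_all_ecores(mask))
```

With this layout, fff3 includes every Ecore CPU and leaves out only cpu2 and
cpu3, which matches the "cpumask = FFxy" case in the report.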

>
>
>
> > The test is done with pm linux-intel branch

Sorry, I mean the linux-next branch.

> > + this patch series on an
> > ADL system.
> Can you test on the bleeding-edge branch of linux-pm?
>
> > cpu0~cpu7 are Pcore cpus, cpu8-cpu15 are Ecore cpus, and
> > intel_powerclamp is registered as cooling_device21.
> >
> > 1. run stress -c 16
> > 2. update /sys/module/intel_powerclamp/parameters/cpumask
> > echo 90 > /sys/module/intel_powerclamp/parameters/max_idle
> > 3. echo 90 > /sys/class/thermal/cooling_device21/cur_state
> > 4. echo 0 > /sys/class/thermal/cooling_device21/cur_state
> > I use turbostat to monitor the CPU Busy% in all 4 steps.
> >
> > If 'cpumask' does not include all the Ecore CPUs, all CPUs become
> > 100% busy after idle injection is removed in step 4.
> >
> that should be the case.
>
> > If 'cpumask' includes all the Ecore CPUs, i.e. cpumask = FFxy, in
> > some cases, the Ecore CPUs will drop to a Busy% much lower than 10%,
> > and then they don't come back to busy after idle injection is removed
> > in step
> Do you see that idle injection is removed message in dmesg?

yes.

> We can also check powercap idle-inject, in case some CPUs still have
> not woken from play_idle.

The "ps" command shows that the CPU time of the idle_injection threads
is no longer increasing.
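
The same check can be scripted roughly as follows. This is only a sketch under
two assumptions: that the injection kthreads are named idle_inject/N, and that
utime and stime are fields 14 and 15 of /proc/<pid>/stat as described in
proc(5). The function names are hypothetical.

```python
import os


def cpu_ticks_from_stat(stat_line):
    """Return (comm, utime + stime) in clock ticks from one /proc/<pid>/stat line."""
    # comm is wrapped in parentheses and may itself contain spaces or ')'.
    start = stat_line.index("(")
    end = stat_line.rindex(")")
    comm = stat_line[start + 1:end]
    rest = stat_line[end + 1:].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11] and rest[12].
    return comm, int(rest[11]) + int(rest[12])


def idle_inject_ticks(proc_root="/proc"):
    """Map each task whose name starts with 'idle_inject/' to its CPU ticks."""
    result = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                comm, ticks = cpu_ticks_from_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # the task exited or the line could not be parsed
        if comm.startswith("idle_inject/"):
            result[comm] = ticks
    return result
```

Calling idle_inject_ticks() twice a few seconds apart and comparing the two
dictionaries shows whether any injection thread is still accumulating CPU time.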

>
>
> > 4, although we have 16 stress threads. This also depends on how
> > long we stay in idle injection.
> >
> > Say, when cpumask=fff3, the problem can be triggered occasionally
> > if there is a 10 second timeout between step 3 and step 4, but it is
> > much easier to reproduce if I increase the timeout to 20 seconds.
> >
> > It seems that Pcores can always pull tasks from Ecores, but Ecores
> > cannot pull tasks from Pcore HT siblings.
> >
> That is what the regular load balance threads should do.
> Better to try an upstream kernel first.

I'm already running the linux-next branch of the linux-pm tree + this
patch series.

thanks,
rui

>
> Thanks,
> Srinivas
>
>
> > thanks,
> > rui
> >
> > On Sat, 2023-02-04 at 18:59 -0800, Srinivas Pandruvada wrote:
> > > Split from the series for powerclamp user of powercap idle-inject.
> > >
> > > v2
> > > - Build warnings reported by Rui
> > > - Moved the powerclamp documentation to admin guide folder
> > > - Commit log updated as suggested by Rafael and other code
> > > suggestions
> > >
> > > Srinivas Pandruvada (2):
> > > Documentation:admin-guide: Move intel_powerclamp documentation
> > > thermal/drivers/intel_powerclamp: Add two module parameters
> > >
> > > Documentation/admin-guide/index.rst | 1 +
> > > .../thermal/intel_powerclamp.rst | 22 +++
> > > Documentation/driver-api/thermal/index.rst | 1 -
> > > MAINTAINERS | 1 +
> > > drivers/thermal/intel/intel_powerclamp.c | 177 +++++++++++++++---
> > > 5 files changed, 180 insertions(+), 22 deletions(-)
> > > rename Documentation/{driver-api => admin-guide}/thermal/intel_powerclamp.rst (93%)
> > >