Re: [PATCH v2 0/2] intel_powerclamp: New module parameter

From: Ricardo Neri
Date: Tue Feb 07 2023 - 13:58:34 EST


On Tue, Feb 07, 2023 at 05:51:59AM -0800, srinivas pandruvada wrote:
> On Tue, 2023-02-07 at 05:42 -0800, Ricardo Neri wrote:
> > On Mon, Feb 06, 2023 at 02:02:28AM -0800, srinivas pandruvada wrote:
> > > On Mon, 2023-02-06 at 08:05 +0000, Zhang, Rui wrote:
> > > > On Sun, 2023-02-05 at 18:45 -0800, srinivas pandruvada wrote:
> > > > > Hi Rui,
> > > > >
> > > > > On Sun, 2023-02-05 at 15:57 +0000, Zhang, Rui wrote:
> > > > > > Hi, Srinivas,
> > > > > >
> > > > > > First of all, the previous build error is gone.
> > > > > >
> > > > > > > Second, I found something strange, which may be related to the
> > > > > > > scheduler asym-packing, so CC Ricardo.
> > > > > >
> > > > > > I thought you disable ITMT before idle injection and re-enable it
> > > > > > after removal.
> > > >
> > > > No.
> > > >
> > > > I can reproduce this by playing with raw intel_powerclamp sysfs knobs
> > > > and ITMT enabled.
> > > >
> > >
> > > This issue happens even if ITMT is disabled. If the module mask is
> > > composed of P-cores, it works as expected, and it also works on servers.
> > > Also, if you offline all P-cores and then select a mask among E-cores,
> > > it works. Somehow the P-cores influence the E-cores.
> > >
> > > This patch only concerns the module mask, and that is functioning
> > > correctly.
> > > We have to debug this interaction between P-cores and E-cores separately.
> >
> > Currently, when doing asym_packing, ECores will pull tasks from a PCore
> > only if both SMT siblings are busy, and only from the lower-priority
> > sibling. These patches [1] let ECores pull from either sibling, if both
> > are busy.
> >
> > I presume that by injecting idle, the scheduler thinks that the CPU is
> > idle (i.e., idle_cpu() returns true) and it will not do asym_packing from
> > lower-priority CPUs.
> >
> > However, in your experiment you have 16 threads. If a PCore is overloaded,
> > an ECore should be able to help.
> This issue happens with or without ITMT and also without any idle
> injection active.

I was not able to reproduce this issue on my ADL-S system with ITMT. The
described bug is exactly what an old patchset of mine was supposed to
fix [2]. Maybe the CPU priorities on the failing system are such that they
prevent asym_packing from kicking in.

I was able to reproduce the issue without ITMT.

I had previously found that the scheduler cannot correctly balance load
between SMT and non-SMT cores. My patchset [1] includes fixes for this case.
I applied it on top of Rafael's linux-next branch and it fixed the issue for
me in the non-ITMT case. Perhaps patches 5 and 6 are sufficient, but I
applied the whole series.
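
For anyone trying to reproduce this, it may help to first confirm which CPUs
on the test system are SMT threads and which are not, since the problem is in
balancing between those two kinds of cores. The following is only a rough
sketch and not part of the patchset: it assumes the standard sysfs file
/sys/devices/system/cpu/cpuN/topology/thread_siblings_list is present, and
the helper names (parse_cpu_list, classify_cores, read_sysfs_siblings) are
made up for illustration.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as "0-1" or "0,8" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def classify_cores(siblings_by_cpu):
    """Split CPUs into (smt, non_smt) lists.

    siblings_by_cpu maps a CPU number to the raw text of its
    thread_siblings_list file. A CPU counts as SMT when that list
    names more than one CPU.
    """
    smt, non_smt = [], []
    for cpu in sorted(siblings_by_cpu):
        siblings = parse_cpu_list(siblings_by_cpu[cpu])
        (smt if len(siblings) > 1 else non_smt).append(cpu)
    return smt, non_smt


def read_sysfs_siblings(root="/sys/devices/system/cpu"):
    """Read thread_siblings_list for every CPU that exposes it (assumed path)."""
    result = {}
    pattern = os.path.join(root, "cpu[0-9]*", "topology", "thread_siblings_list")
    for path in glob.glob(pattern):
        # The path ends in .../cpuN/topology/thread_siblings_list.
        cpu = int(path.split(os.sep)[-3][3:])
        with open(path) as f:
            result[cpu] = f.read()
    return result


if __name__ == "__main__":
    smt, non_smt = classify_cores(read_sysfs_siblings())
    print("SMT CPUs:    ", smt)
    print("non-SMT CPUs:", non_smt)
```

Note that this only reflects what the kernel currently reports: offline CPUs
may not expose the topology directory, and a core whose sibling is offline
can show up as non-SMT.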

Thanks and BR,
Ricardo

[1]. https://lore.kernel.org/lkml/20230207045838.11243-1-ricardo.neri-calderon@xxxxxxxxxxxxxxx/
[2]. https://lore.kernel.org/all/20210911011819.12184-7-ricardo.neri-calderon@xxxxxxxxxxxxxxx/