Re: [PATCH 1/2] thermal: cooling: Check Energy Model type in cpufreq_cooling and devfreq_cooling

From: Matthias Kaehlcke
Date: Wed Feb 16 2022 - 17:14:05 EST


On Wed, Feb 16, 2022 at 09:33:50AM -0800, Doug Anderson wrote:
> Hi,
>
> On Wed, Feb 16, 2022 at 7:35 AM Lukasz Luba <lukasz.luba@xxxxxxx> wrote:
> >
> > Hi Matthias,
> >
> > On 2/9/22 10:17 PM, Matthias Kaehlcke wrote:
> > > On Wed, Feb 09, 2022 at 11:16:36AM +0000, Lukasz Luba wrote:
> > >>
> > >>
> > >> On 2/8/22 5:25 PM, Matthias Kaehlcke wrote:
> > >>> On Tue, Feb 08, 2022 at 09:32:28AM +0000, Lukasz Luba wrote:
> > >>>>
> > >>>>
> >
> > [snip]
> >
> > >>>> Could you point me to those devices please?
> > >>>
> > >>> arch/arm64/boot/dts/qcom/sc7180-trogdor-*
> > >>>
> > >>> Though as per above they shouldn't be impacted by your change, since the
> > >>> CPUs always pretend to use milli-Watts.
> > >>>
> > >>> [skipped some questions/answers since sc7180 isn't actually impacted by
> > >>> the change]
> > >>
> > >> Thank you Matthias. I will investigate your setup to get better
> > >> understanding.
> > >
> > > Thanks!
> > >
> >
> > I've checked those DT files and related code.
> > As you already said, this patch is safe for them.
> > So we can apply it IMO.
> >
> >
> > -------------Off-topic------------------
> > Not in $subject comments:
> >
> > AFAICS based on two files which define thermal zones:
> > sc7180-trogdor-homestar.dtsi
> > sc7180-trogdor-coachz.dtsi
> >
> > only the 'big' cores are used as cooling devices in the
> > 'skin_temp_thermal' - the CPU6 and CPU7.
> >
> > I assume you don't want to model the power usage of the
> > Little cluster at all (which is quite big: 6 CPUs), do you?
> > I can see that the Little CPUs have a small dyn-power-coeff,
> > ~30% of the big one, and a lower max freq, but it still might be
> > worth adding them to IPA. You might give them more 'weight' to
> > make sure they receive more power during the power split.

In experiments we saw that including the little cores as cooling
devices for 'skin_temp_thermal' didn't have a significant impact on
thermals, so we left them out.
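
For a rough sense of the numbers being discussed (an illustration only,
not our measurement data): the devicetree binding defines the
coefficient through Pdyn = dynamic-power-coefficient * V^2 * f, with the
coefficient in uW/MHz/V^2, f in MHz and V in volts. A minimal sketch of
that relation follows. The helper names dyn_power_uw() and
cluster_power_uw() are made up for this example, and any coefficients,
frequencies or voltages passed to them here are hypothetical, not taken
from sc7180.dtsi.

```c
#include <stdint.h>

/*
 * Dynamic power of one CPU in microwatts, following the devicetree
 * relation Pdyn = coeff * V^2 * f (coeff in uW/MHz/V^2, f in MHz).
 * The voltage is passed in millivolts to stay in integer arithmetic,
 * hence the division by 10^6 to convert mV^2 back to V^2.
 * Hypothetical helper, not a kernel API.
 */
static uint64_t dyn_power_uw(uint64_t coeff, uint64_t freq_mhz,
			     uint64_t volt_mv)
{
	return coeff * freq_mhz * volt_mv * volt_mv / 1000000ULL;
}

/*
 * Dynamic power of a cluster of identical CPUs, all assumed to run at
 * the same operating point. Hypothetical helper, not a kernel API.
 */
static uint64_t cluster_power_uw(unsigned int nr_cpus, uint64_t coeff,
				 uint64_t freq_mhz, uint64_t volt_mv)
{
	return (uint64_t)nr_cpus * dyn_power_uw(coeff, freq_mhz, volt_mv);
}
```

With made-up inputs where the little coefficient is 30% of the big one,
e.g. cluster_power_uw(6, 120, 1800, 800) against
cluster_power_uw(2, 400, 2400, 900), six littles at their top OPP come
out at roughly half of what the two bigs draw. This only covers dynamic
power at max frequency and says nothing about how much of that heat
actually reaches the skin sensor.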

> > You also don't have a GPU cooling device in that thermal zone.
> > Based on my experience, if your GPU is a power-hungry one,
> > e.g. 2-4 Watts, you might get better results when you model
> > this 'hot' device (which impacts the value your temp sensor reports).
>
> I think the two boards you point at (homestar and coachz) are just the
> two that override the default defined in the SoC dtsi file. If you
> look in sc7180.dtsi you'll see 'gpuss1-thermal' which has a cooling
> map. You can also see the cooling maps for the littles.

Yep, plus thermal zones with cooling maps for the big cores.

> I guess we don't have a `dynamic-power-coefficient` for the GPU,
> though? Seems like we should, but I haven't dug through all the code
> here...

To my knowledge the SC7x80 GPU doesn't register an energy model, which is
one of the reasons the GPU wasn't included as a cooling device for
'skin_temp_thermal'.