Re: [PATCH v6 2/2] thermal/debugfs: Add thermal debugfs information for mitigation episodes

From: Rafael J. Wysocki
Date: Tue Jan 09 2024 - 08:04:12 EST


On Tue, Jan 9, 2024 at 10:41 AM Daniel Lezcano
<daniel.lezcano@xxxxxxxxxx> wrote:
>
> The mitigation episodes are recorded. A mitigation episode starts
> when the first trip point is crossed on the way up and ends when it
> is crossed on the way down. During this episode other trip points can
> also be crossed and are accounted to the same episode. The interesting
> information is the average temperature at the trip point, the
> undershoot and the overshoot. The standard deviation of the mitigated
> temperature will be added later.
>
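For illustration only, the accounting described above can be sketched in user-space Python. This is not the code from the patch: the names `TripStats` and `track_episodes` are hypothetical, and the rules used here are assumptions (a trip counts as crossed from the first sample at or above its temperature until a sample falls below temperature minus hysteresis, and an episode stays open while any trip is crossed).

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class TripStats:
    """Per-trip statistics for one episode (hypothetical structure)."""
    temp: int                      # trip temperature, millidegrees Celsius
    hyst: int                      # hysteresis, millidegrees Celsius
    crossed: bool = False          # is the zone currently above this trip?
    samples: int = 0
    total: int = 0
    min_temp: Optional[int] = None
    max_temp: Optional[int] = None

    def account(self, temp: int) -> None:
        self.samples += 1
        self.total += temp
        self.min_temp = temp if self.min_temp is None else min(self.min_temp, temp)
        self.max_temp = temp if self.max_temp is None else max(self.max_temp, temp)

    @property
    def avg(self) -> Optional[int]:
        return self.total // self.samples if self.samples else None


def track_episodes(samples: Iterable[Tuple[int, int]],
                   trips: List[Tuple[int, int]]) -> List[dict]:
    """samples: (time_ms, temp) pairs; trips: (temp, hyst) pairs in any order."""
    episodes = []
    current = None
    for now, temp in samples:
        if current is None:
            if not any(temp >= trip_temp for trip_temp, _ in trips):
                continue
            # First trip crossed on the way up: open a new episode.
            current = {"start": now,
                       "trips": [TripStats(t, h) for t, h in trips]}
        for ts in current["trips"]:
            if not ts.crossed and temp >= ts.temp:
                ts.crossed = True
            if ts.crossed:
                # Assumption: the sample that ends the crossing is still
                # accounted, so the minimum captures the undershoot.
                ts.account(temp)
                if temp < ts.temp - ts.hyst:
                    ts.crossed = False
        if not any(ts.crossed for ts in current["trips"]):
            # Last trip crossed on the way down: close the episode.
            current["duration"] = now - current["start"]
            episodes.append(current)
            current = None
    return episodes
```

Because the sketch never relies on the position of a trip in the list, it behaves the same whichever way the trips are ordered. It does not handle a trip temperature changing in the middle of an episode.
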
> The thermal debugfs directory structure tries to stay consistent with
> the sysfs one but in a very simplified way:
>
> thermal/
>  `-- thermal_zones
>      |-- 0
>      |   `-- mitigations
>      `-- 1
>          `-- mitigations
>
> The content of the mitigations file has the following format:
>
> ,-Mitigation at 349988258us, duration=130136ms
> | trip | type    | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
> |    0 | passive |     65000 |      2000 |   130136 |    68227 |    62500 |    75625 |
> |    1 | passive |     75000 |      2000 |   104209 |    74857 |    71666 |    77500 |
> ,-Mitigation at 272451637us, duration=75000ms
> | trip | type    | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
> |    0 | passive |     65000 |      2000 |    75000 |    68561 |    62500 |    75000 |
> |    1 | passive |     75000 |      2000 |    60714 |    74820 |    70555 |    77500 |
> ,-Mitigation at 238184119us, duration=27316ms
> | trip | type    | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
> |    0 | passive |     65000 |      2000 |    27316 |    73377 |    62500 |    75000 |
> |    1 | passive |     75000 |      2000 |    19468 |    75284 |    69444 |    77500 |
> ,-Mitigation at 39863713us, duration=136196ms
> | trip | type    | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
> |    0 | passive |     65000 |      2000 |   136196 |    73922 |    62500 |    75000 |
> |    1 | passive |     75000 |      2000 |    91721 |    74386 |    69444 |    78125 |
>
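As a rough user-space sketch, the format shown above could be parsed along these lines. The function name `parse_mitigations` and the dictionary layout are hypothetical and not part of the patch. The parser assumes only what the sample shows: a `,-Mitigation at <N>us, duration=<N>ms` line opens an episode, followed by a pipe-delimited header row and one row per trip.

```python
import re

# Matches the line that opens each episode in the sample output.
HEADER_RE = re.compile(r",-Mitigation at (\d+)us, duration=(\d+)ms")


def parse_mitigations(text):
    """Return one dict per episode, each holding a list of per-trip rows."""
    episodes = []
    columns = []
    for raw in text.splitlines():
        line = raw.strip()
        match = HEADER_RE.match(line)
        if match:
            episodes.append({
                "start_us": int(match.group(1)),
                "duration_ms": int(match.group(2)),
                "trips": [],
            })
            columns = []
            continue
        if not line.startswith("|") or not episodes:
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if cells[0] == "trip":
            # Column header row: remember the names for the rows below.
            columns = cells
            continue
        # Data row: convert numeric cells, keep the trip type as a string.
        row = {name: int(value) if value.lstrip("-").isdigit() else value
               for name, value in zip(columns, cells)}
        episodes[-1]["trips"].append(row)
    return episodes


SAMPLE = """\
,-Mitigation at 349988258us, duration=130136ms
| trip | type | temp(°mC) | hyst(°mC) | duration | avg(°mC) | min(°mC) | max(°mC) |
| 0 | passive | 65000 | 2000 | 130136 | 68227 | 62500 | 75625 |
| 1 | passive | 75000 | 2000 | 104209 | 74857 | 71666 | 77500 |
"""

for episode in parse_mitigations(SAMPLE):
    print(episode["start_us"], episode["duration_ms"], len(episode["trips"]))
```

On a live system the text would presumably come from the `mitigations` file of a zone under the debugfs mount point (commonly `/sys/kernel/debug`, which is an assumption here). Since more fields are planned, a parser keyed on the column names rather than on positions should cope better with additions.
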
> More information for a better understanding of the thermal behavior
> will be added later. The idea is to give detailed statistics about
> the undershoots and overshoots, the rate of temperature change, etc.
> As all the information in a single file would be too much, the idea
> is to create a directory named after the mitigation timestamp where
> all the data could be added.
>
> Please note this code is immune to trip ordering but not to a trip
> temperature change while a mitigation is happening. However, this
> situation should be extremely rare, perhaps never happening, and we
> might ask ourselves whether something should be done in the core
> framework for other components first.
>
> Signed-off-by: Daniel Lezcano <daniel.lezcano@xxxxxxxxxx>

Both patches in the series look good to me now, so I'll queue them up
for 6.8-rc1.

Thanks!