Re: [PATCH v3 2/2] thermal/debugfs: Add thermal debugfs information for mitigation episodes

From: Rafael J. Wysocki
Date: Wed Dec 27 2023 - 14:53:54 EST


Hi Daniel,

On Sat, Dec 23, 2023 at 12:41 PM Daniel Lezcano
<daniel.lezcano@xxxxxxxxxx> wrote:
>
>
> Hi Rafael,
>
> On 21/12/2023 20:26, Rafael J. Wysocki wrote:
>
> [ ... ]
>
>
> >> +/**
> >> + * struct tz_events - Store all events related to a mitigation episode
> >> + *
> >> + * The tz_events structure describes a mitigation episode.
> >
> > So why not call it tz_mitigation?
>
> A mitigation episode = N x tz_events
>
> eg.
> trip A = passive cooling - cpufreq cluster0
> trip B = passive cooling - cpufreq cluster0 + cluster1
> trip C = active cooling + fan
>
> temperature trip A < trip B < trip C
>
> The mitigation episode, as defined, begins at trip A, and we can have
> multiple events (eg. trip B crossed several times, trip C, then trip B
> again etc ...).

I understand this, but I thought that tz_events represented the entire episode.

> [ ... ]
>
> >> + if (dfs->tz.trip_index < 0) {
> >> + tze = thermal_debugfs_tz_event_alloc(tz, now);
> >> + if (!tze)
> >> + return;
> >> +
> >> + list_add(&tze->node, &dfs->tz.tz_events);
> >> + }
> >> +
> >> + dfs->tz.trip_index++;
> >> + dfs->tz.trips_crossed[dfs->tz.trip_index] = trip_id;
> >
> > So trip_index is an index into trips_crossed[] and the value is the ID
> > of the trip passed by thermal_debug_tz_trip_up() IIUC, so the trip IDs
> > in trips_crossed[] are always sorted by the trip temperature, in the
> > ascending order.
> >
> > It would be good to write this down somewhere in a comment.
> >
> > And what if trip temperatures change during a mitigation episode such
> > that the order by the trip temperature changes?
>
> Changing a trip point temperature during a mitigation is a general
> question about the thermal framework.
>
> How the governors will behave with such a change on the fly while they
> are in action?
>
> IMO, we should prevent to change a trip point temperature when this one
> is crossed and has a cooling device bound to it.

Well, it's not that simple.

There are legitimate cases in which this can happen on purpose. For
example, the ACPI spec does not provide a way to get a trip hysteresis
from the platform firmware and instead it recommends the firmware to
update the trip temperature every time it is crossed on the way up in
order to implement hysteresis.

> >> +
> >> + tze = list_first_entry(&dfs->tz.tz_events, struct tz_events, node);
> >> + tze->trip_stats[trip_id].timestamp = now;
> >> + tze->trip_stats[trip_id].max = max(tze->trip_stats[trip_id].max, temperature);
> >> + tze->trip_stats[trip_id].min = min(tze->trip_stats[trip_id].min, temperature);
> >> + tze->trip_stats[trip_id].avg = tze->trip_stats[trip_id].avg +
> >> + (temperature - tze->trip_stats[trip_id].avg) /
> >> + tze->trip_stats[trip_id].count;
> >> +
> >> + mutex_unlock(&dfs->lock);
> >> +}
>
>
>
> --