[PATCH 1/1] thermal: sysfs: avoid actual readings from sysfs

From: Eduardo Valentin
Date: Fri Jun 30 2023 - 21:38:19 EST


On Fri, Jun 30, 2023 at 10:16:38AM +0200, Rafael J. Wysocki wrote:
>
>
>
> On Wed, Jun 28, 2023 at 11:10 PM Eduardo Valentin <evalenti@xxxxxxxxxx> wrote:
> >
> > On Fri, Jun 23, 2023 at 07:31:43PM +0200, Rafael J. Wysocki wrote:
> > >
>
> [cut]
>
> > >
> > > Regardless of where the problem is etc, if my understanding of the
> > > patch is correct, it is proposing to change the behavior of a
> > > well-known sysfs interface in a way that is likely to be incompatible
> > > with at least some of its users. This is an obvious no-go in kernel
> > > development and I would expect you to be well aware of it.
> >
> > yeah I get it.
> >
> > >
> > > IOW, if you want the cached value to be returned, a new interface (eg.
> > > a new sysfs attribute) is needed.
> >
> > Yeah, I am fine with either a new sysfs entry to return the cached value,
> > or a new sysfs entry to change the behavior of the existing /temp, as I
> > mentioned before, either way works for me, if changing the existing one
> > is too intrusive.
> >
> > >
> > > And I think that the use case is not really i2c sensors in general,
> >
> > I2C was just the factual example I had, but you are right, the use case
> > is not isolated to I2C sensor. Rather, to be clear I am not blaming I2C,
> > the actual issue just happen to be easier to see when I2C devices, slower
> > than typical MMIO devices, are being used as input for the control.
> >
> > > because at least in some cases they work just fine AFAICS, but a
> > > platform with a control loop running in the kernel that depends on
> > > sensor reads carried out at a specific, approximately constant, rate
> > > that can be disturbed by user space checking the sensor temperature
> > > via sysfs at "wrong" times. If at the same time the user space
> > > program in question doesn't care about the most recent values reported
> > > by the sensor, it may very well use the values cached by the in-kernel
> > > control loop.
> >
> > That is right, the balance between supporting user space reads and
> > running the control timely is the actual original concern. The problem
> > fades out a bit when you have device reads in the us / ns time scale
> > and control update is in 100s of ms. But becomes more apparent on slower
> > devices, when reads are in ms and policy update is in the 100s ms, that is
> > why the I2C case was quoted. But nothing wrong with I2C, or supporting
> > I2C on the thermal subsystem as we do today via the hwmon interface REGISTER_TZ,
> > the problem is on having to support the control in kernel and a read in
> > userspace that can create jitter to the control.
> >
> > And as you properly stated, for this use case, the userspace does not care
> > about the most recent value of the device, so that is why the change
> > proposes to give cached values.
> >
> > On the flip side though, there may be user space based policies that
> > require the most recent device value. But in this case, the expectation
> > is to disable the in kernel policy and switch the thermal zone to
> > mode == disabled. And that is also why this patch will go the path
> > to request the most recent device value when the /temp sysfs entry
> > is read and the mode is disabled.
> >
> > I would suggest to have an addition sysfs entry that sets the
> > thermal zone into cached mode or not, let's say for the sake of the
> > discussion, we call it 'cached_values', with default to 'not cached'.
> > This way, we could support:
> >
> > a) Default, current situation, where all reads in /temp are always backed up
> > with an actual device .get_temp(). Nothing changes here, keeps reading
> > under /temp, and so long system designer is satisfied with jittering,
> > no need to change anything.
> >
> > b) When one has control in kernel, and frequent userspace reads on /temp
> > but system designer wants to protect the control in kernel to avoid jittering.
> > Just keep reading from /temp but set the new sysfs property 'cached_values' to 'cached'.
> > Then userspace will get updated values as frequent as the kernel control has
> > and the kernel control will not be disturbed by frequent device reads.
> >
> > c) When one has control in userspace, and wants to have the most frequent
> > device read. Here, one can simply disable the in kernel control by
> > setting the 'mode' sysfs entry to 'disabled', and making sure the new sysfs property is set
> > to 'not cached'. Well in fact, the way I thought this originally in this patch
> > was to simply always read the device when /temp is read is 'mode' is 'disabled'.
> >
> > I believe you proposed to have another sysfs entry sysfs entry for reading cached temperature.
> > Something like 'temp_cached'. Thinking of it, as I mentioned before, it will work.
> > The only possible downside is to have two entries to read temperature.
> >
> > Strong opinions on any of the above?
>
> So what about adding a new zone attribute that can be used to specify
> the preferred caching time for the temperature?
>
> That is, if the time interval between two consecutive updates of the
> cached temperature value is less than the value of the new attribute,
> the cached temperature value will be returned by "temp". Otherwise,
> it will cause the sensor to be read and the value obtained from it
> will be returned to user space and cached.
>
> If the value of the new attribute is 0, everything will work as it
> does now (which will also need to be the default behavior).


Yeah, that makes sense to me. I can cook something up in the next version.

--
All the best,
Eduardo Valentin