[RFT PATCH v3 0/1] Summary: hwmon driver for disk and solid state drives with temperature sensors

From: Guenter Roeck
Date: Thu Dec 26 2019 - 12:51:01 EST


In the past, several attempts have been made to add support for reporting
SCSI/[S]ATA drive temperatures to the Linux kernel. Such support is desirable
because it provides a means to report drive temperatures to userspace without
root privileges and in a standard format, and because it makes it possible to
tie reported temperatures into the thermal subsystem.

The most recent attempt was [1] by Linus Walleij. It went through a total
of seven iterations. At the end, it was rejected for a number of reasons;
see the provided link for details. This implementation resides in the
SCSI core. It originally resided in libata but was moved to SCSI per
maintainer request, where it was ultimately rejected.

An earlier submission of a driver to report SCSI/SATA drive temperatures
was made back in 2009 by Constantin Baranov [2]. This submission resides
in the hardware monitoring subsystem. It does not rely on changes in the
SCSI subsystem or in libata-scsi. Instead, it registers itself with the
SCSI subsystem using scsi_register_interface(). It was rejected primarily
because it executes ATA passthrough commands without verification that it
is actually connected to an ATA drive.

Both submissions use SMART attributes to read drive temperature information.
[1] also tries to identify temperature limits from those attributes.
Unfortunately, SMART attributes are not well defined, resulting in relatively
complex code trying to identify the exact format of the reported data.

With the available information and feedback, we can make a number of
observations and conclusions.
a) Using available (S)ATA drive temperature information and converting it to
a SCSI log page is an interesting idea. On the downside, it would add a
substantial amount of complexity to libata-scsi. The code would either
have to be optional, or it would have to be built into the kernel even
if it is never used on a given system. Without access to SCSI drives
supporting this feature, it would be all but impossible to test the code
against such a drive. It would not be possible to test the correctness
of either the code in libata-scsi or the driver using that information.
Overall it would be much easier and much less risky to implement such
code on the receiving side (i.e. in a driver reporting the temperatures)
instead of trying to convert the information from one format to another
first. In summary, this approach is neither practical nor feasible. On top of
that, there is no guarantee that code implementing this functionality would
ever be accepted into the kernel for this very reason.
b) The code needed to read and analyze SCSI temperature log pages is quite
complex (see smartmontools [5]). There is no existing support code
in the Linux kernel; such code would have to be written. This makes
the approach discussed in a) even more risky and less practical.
c) Overall, any attempt to report temperature information for anything
but SATA drives in the kernel is not practical due to the complexity
involved, and due to the inability to test the resulting code with
non-SATA drives.
d) Using SMART data for anything but basic temperature reporting is not
really feasible due to the lack of standardization. Any attempt to do
this would add a substantial amount of code, ambiguity, and risk.

This submission implements a driver to report the temperature of SATA
drives through the hardware monitoring subsystem. It is implemented as
a stand-alone driver in the hardware monitoring subsystem. The driver uses
the mechanism from submission [2] to register with the SCSI subsystem.
By using this mechanism, changes in the SCSI or ATA subsystems are not
required. To reduce risk and complexity, it only instantiates after
reliably validating that it is connected to a SATA drive. It does not
attempt to report the temperature of non-SATA drives.
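
As an illustration of what reporting through the hardware monitoring
subsystem means for userspace, the following is a minimal sketch of reading
a temperature attribute without root privileges. It assumes the standard
hwmon sysfs convention that tempN_input holds millidegrees Celsius as text;
the function names are hypothetical and are not part of this patch.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Convert the text of a hwmon tempN_input attribute (millidegrees
 * Celsius, for example "38000\n") to a long.
 * Returns 0 on success, -1 on malformed input.
 */
int parse_hwmon_millideg(const char *text, long *millideg)
{
	char *end;
	long val;

	errno = 0;
	val = strtol(text, &end, 10);
	if (errno || end == text)
		return -1;
	/* Accept only a trailing newline or end of string. */
	if (*end != '\n' && *end != '\0')
		return -1;

	*millideg = val;
	return 0;
}

/*
 * Read one temperature value from a sysfs attribute chosen by the
 * caller. Returns 0 on success, -1 if the file cannot be read or parsed.
 */
int read_hwmon_temp(const char *path, long *millideg)
{
	char buf[32];
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	return parse_hwmon_millideg(buf, millideg);
}
```

A caller would pass a path such as /sys/class/hwmon/hwmon0/temp1_input; the
hwmon index varies between systems, so that path is only an example.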

The driver uses the SCT Command Transport feature set as specified in
ATA8-ACS [4] to read and report the temperature as well as temperature
limits and lowest/highest temperature information (if available) for
SATA drives. If a drive does not support SCT Command Transport, the driver
attempts to access a limited set of well-known SMART attributes to read
the drive temperature. In that case, only the current drive temperature
is reported.
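
The decoding step described above can be sketched in userspace C as follows.
This is a sketch under stated assumptions, not the driver code: the byte
offsets into the 512-byte SCT status response (200 for the current
temperature, 201/202 for lowest/highest in this power cycle), the 0x80
"invalid" marker, the 12-byte SMART attribute entry layout, and the choice
of attribute IDs 194 and 190 are assumptions that should be checked against
ATA8-ACS [4] and the drive in question. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed offsets into the SCT status response (see lead-in). */
#define SCT_STATUS_TEMP       200  /* current temperature */
#define SCT_STATUS_TEMP_LOW   201  /* lowest, this power cycle */
#define SCT_STATUS_TEMP_HIGH  202  /* highest, this power cycle */
#define SCT_TEMP_INVALID      0x80 /* assumed "no value" marker */

/* Assumed layout of the SMART attribute table (see lead-in). */
#define SMART_ATTR_FIRST      2    /* offset of first entry */
#define SMART_ATTR_SIZE       12   /* bytes per entry */
#define SMART_ATTR_COUNT      30   /* number of entries */
#define SMART_ATTR_RAW        5    /* first raw byte within an entry */

/* Interpret one byte as a signed temperature in degrees Celsius. */
static long byte_to_millideg(uint8_t raw)
{
	long deg = raw >= 0x80 ? (long)raw - 256 : (long)raw;

	return deg * 1000;
}

/* Decode one SCT temperature byte. Returns -1 if marked invalid. */
int sct_temp_millideg(const uint8_t *status, size_t offset, long *millideg)
{
	uint8_t raw = status[offset];

	if (raw == SCT_TEMP_INVALID)
		return -1;
	*millideg = byte_to_millideg(raw);
	return 0;
}

/*
 * SMART fallback: prefer attribute 194, fall back to 190.
 * Returns -1 if neither attribute is present.
 */
int smart_temp_millideg(const uint8_t *data, long *millideg)
{
	int have_190 = 0;
	long temp_190 = 0;
	int i;

	for (i = 0; i < SMART_ATTR_COUNT; i++) {
		const uint8_t *attr = data + SMART_ATTR_FIRST +
				      i * SMART_ATTR_SIZE;

		if (attr[0] == 194) {
			*millideg = byte_to_millideg(attr[SMART_ATTR_RAW]);
			return 0;
		}
		if (attr[0] == 190) {
			temp_190 = byte_to_millideg(attr[SMART_ATTR_RAW]);
			have_190 = 1;
		}
	}
	if (have_190) {
		*millideg = temp_190;
		return 0;
	}
	return -1;
}
```

The SMART path yields only a current temperature, which matches the
limitation described above; limits and lowest/highest values are available
only from the SCT path.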

The driver does not currently report temperatures for SCSI drives. This
will be added with a subsequent patch.

---
v3: Rename satatemp -> drivetemp
    Use cached VPD page 89 data (available with v5.5 and later kernels)
    Relax ATA drive detection; still check if inquiry data is
      present, but don't use it for access detection.
    Modify VPD data analysis following guidance from Martin K. Petersen
    Separate SATA drive detection into separate function
    Marked as RFT. Martin K. Petersen reports:
      "I get a crash in the driver core during probe if the drivetemp module
      is loaded prior to loading ahci or a SCSI HBA driver. This crash is
      unrelated to my changes. Haven't had time to debug."
      This will require further testing before the patch is applied.

v2: scsi_cmd variable is no longer static
    Fixed drive name in Kconfig
    Describe heuristics used to select SCT or SMART in commit message
    Added Reviewed-by: from Linus Walleij

---
References:
[1] https://patchwork.kernel.org/patch/10688021/
[2] https://lore.kernel.org/lkml/20090913040104.ab1d0b69.const@xxxxxxxx/
[3] http://www.t10.org/cgi-bin/ac.pl?t=f&f=sat5r02.pdf
    Information technology - SCSI / ATA Translation - 5 (SAT-5),
    section 10.3.8 (Temperature log page).
[4] http://www.t13.org/documents/uploadeddocuments/docs2008/d1699r6a-ata8-acs.pdf
    ANS T13/1699-D "Information technology - AT Attachment 8 - ATA/ATAPI Command
    Set (ATA8-ACS)"
[5] https://github.com/mirror/smartmontools.git