Re: [PATCH v2] nvme: Add hardware monitoring support

From: Christoph Hellwig
Date: Wed Oct 30 2019 - 10:13:01 EST


On Tue, Oct 29, 2019 at 03:32:14PM -0700, Guenter Roeck wrote:
> This patch adds support to read NVME temperatures from the kernel using the
> hwmon API and adds temperature zones for NVME drives. The thermal subsystem
> can use this information to set thermal policies, and userspace can access
> it using libsensors and/or the "sensors" command.

Except in all upper case or all lower case identifiers, the spelling should
always be "NVMe". This also happens in a few more places, like in the Kconfig
text.

> +static int nvme_hwmon_get_smart_log(struct nvme_hwmon_data *data)
> +{
> + return nvme_get_log(data->ctrl, NVME_NSID_ALL, NVME_LOG_SMART, 0,
> + &data->log, sizeof(data->log), 0);
> +}
> +
> +static int nvme_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
> + u32 attr, int channel, long *val)
> +{
> + struct nvme_hwmon_data *data = dev_get_drvdata(dev);
> + struct nvme_smart_log *log = &data->log;
> + int err;
> + int temp;
> +
> + err = nvme_hwmon_get_smart_log(data);
> + if (err)
> + return err < 0 ? err : -EPROTO;

I think the handling of positive return values (NVMe status codes) fits
better into nvme_hwmon_get_smart_log. Also, EIO sounds like a better error
for generic NVMe level errors.
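
Roughly what I have in mind, as a standalone userspace sketch rather than
a drop-in replacement. fake_nvme_get_log() is a made-up stand-in for
nvme_get_log() and simply returns whatever result it is told to simulate:
negative for an errno, positive for an NVMe status code, zero for success.

```c
#include <errno.h>

/*
 * Hypothetical stand-in for nvme_get_log(), for illustration only.
 * Returns the simulated result unchanged: < 0 is a Linux errno,
 * > 0 is an NVMe status code, 0 is success.
 */
static int fake_nvme_get_log(int simulated_result)
{
	return simulated_result;
}

/*
 * Sketch of the helper with the status translation folded in, so
 * callers only ever see 0 or a negative errno.
 */
static int hwmon_get_smart_log_sketch(int simulated_result)
{
	int ret = fake_nvme_get_log(simulated_result);

	/* Pass through success and errnos, map NVMe status codes to -EIO. */
	return ret <= 0 ? ret : -EIO;
}
```

With that, the caller in nvme_hwmon_read can just do "if (err) return err;".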

> +
> + switch (attr) {
> + case hwmon_temp_max:
> + *val = (data->ctrl->wctemp - 273) * 1000;
> + break;
> + case hwmon_temp_crit:
> + *val = (data->ctrl->cctemp - 273) * 1000;
> + break;
> + case hwmon_temp_input:
> + if (!channel)
> + temp = le16_to_cpup((__le16 *)log->temperature);

This needs to use get_unaligned_le16, otherwise you'll run into problems
on architectures that don't support unaligned loads.
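
The composite temperature sits at bytes 1 and 2 of the SMART log, right
after the one-byte critical warning field, so it is never 16-bit aligned
relative to the start of the log. To illustrate what the unaligned helper
does, here is a userspace sketch. read_le16_unaligned() and
kelvin_to_millicelsius() are made-up names for this example; in the driver
you would just call get_unaligned_le16().

```c
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's get_unaligned_le16(): assemble
 * the value byte by byte so that no 16-bit load from a possibly
 * misaligned address is ever issued.
 */
static uint16_t read_le16_unaligned(const void *p)
{
	const uint8_t *b = p;

	return (uint16_t)(b[0] | (b[1] << 8));
}

/*
 * Kelvin to millidegrees Celsius, using the same 273 offset as the
 * patch.
 */
static long kelvin_to_millicelsius(uint16_t kelvin)
{
	return ((long)kelvin - 273) * 1000;
}
```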

> +static const struct hwmon_ops nvme_hwmon_ops = {
> + .is_visible = nvme_hwmon_is_visible,
> + .read = nvme_hwmon_read,
> + .read_string = nvme_hwmon_read_string,
> +};
> +
> +static const struct hwmon_chip_info nvme_hwmon_chip_info = {
> + .ops = &nvme_hwmon_ops,
> + .info = nvme_hwmon_info,
> +};

Please use tabs to align all the = in an ops structure.
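
Something like the layout below, shown with a made-up demo_ops structure
and dummy callbacks so that it compiles on its own. The real structure is
of course struct hwmon_ops.

```c
/* Hypothetical ops structure, standing in for struct hwmon_ops. */
struct demo_ops {
	int (*is_visible)(void);
	int (*read)(void);
	int (*read_string)(void);
};

static int demo_is_visible(void)  { return 1; }
static int demo_read(void)        { return 2; }
static int demo_read_string(void) { return 3; }

/* Tabs after each member name so that every '=' lands in one column. */
static const struct demo_ops demo_ops = {
	.is_visible	= demo_is_visible,
	.read		= demo_read,
	.read_string	= demo_read_string,
};
```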

> +#if IS_ENABLED(CONFIG_NVME_HWMON)

No real need to use IS_ENABLED here, a plain ifdef should do it.
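
IS_ENABLED() only matters for options that can also be built as a module,
where CONFIG_FOO_MODULE is defined instead of CONFIG_FOO. Assuming
NVME_HWMON is a bool option, only CONFIG_NVME_HWMON can ever be defined.
A sketch of the usual pattern follows. demo_hwmon_init() is a made-up
name, and the int return value is only there so the example is easy to
check:

```c
struct nvme_ctrl;	/* opaque here, only used as a pointer */

#ifdef CONFIG_NVME_HWMON
/* Real implementation lives in the hwmon source file. */
int demo_hwmon_init(struct nvme_ctrl *ctrl);
#else
/* Option disabled: the call compiles away to a no-op stub. */
static inline int demo_hwmon_init(struct nvme_ctrl *ctrl)
{
	(void)ctrl;
	return 0;
}
#endif
```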