Re: [PATCH v2] nvme: Add hardware monitoring support

From: Akinobu Mita
Date: Thu Oct 31 2019 - 09:44:40 EST


On Thu, Oct 31, 2019 at 11:20, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
>
> On 10/30/19 4:16 AM, Akinobu Mita wrote:
> > On Wed, Oct 30, 2019 at 7:32, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:
> >>
> >> nvme devices report temperature information in the controller information
> >> (for limits) and in the smart log. Currently, the only means to retrieve
> >> this information is the nvme command line interface, which requires
> >> super-user privileges.
> >>
> >> At the same time, it would be desirable to use NVME temperature information
> >> for thermal control.
> >>
> >> This patch adds support to read NVME temperatures from the kernel using the
> >> hwmon API and adds temperature zones for NVME drives. The thermal subsystem
> >> can use this information to set thermal policies, and userspace can access
> >> it using libsensors and/or the "sensors" command.
> >>
> >> Example output from the "sensors" command:
> >>
> >> nvme0-pci-0100
> >> Adapter: PCI adapter
> >> Composite: +39.0°C (high = +85.0°C, crit = +85.0°C)
> >> Sensor 1: +39.0°C
> >> Sensor 2: +41.0°C
> >>
> >> Signed-off-by: Guenter Roeck <linux@xxxxxxxxxxxx>
> >> ---
> >> v2: Use devm_kfree() to release memory in error path
> >>
> >> drivers/nvme/host/Kconfig | 10 ++
> >> drivers/nvme/host/Makefile | 1 +
> >> drivers/nvme/host/core.c | 5 +
> >> drivers/nvme/host/nvme-hwmon.c | 163 +++++++++++++++++++++++++++++++++
> >> drivers/nvme/host/nvme.h | 8 ++
> >> 5 files changed, 187 insertions(+)
> >> create mode 100644 drivers/nvme/host/nvme-hwmon.c
> >>
> >> diff --git a/drivers/nvme/host/Kconfig b/drivers/nvme/host/Kconfig
> >> index 2b36f052bfb9..aeb49e16e386 100644
> >> --- a/drivers/nvme/host/Kconfig
> >> +++ b/drivers/nvme/host/Kconfig
> >> @@ -23,6 +23,16 @@ config NVME_MULTIPATH
> >> /dev/nvmeXnY device will show up for each NVMe namespaces,
> >> even if it is accessible through multiple controllers.
> >>
> >> +config NVME_HWMON
> >> + bool "NVME hardware monitoring"
> >> + depends on (NVME_CORE=y && HWMON=y) || (NVME_CORE=m && HWMON)
> >> + help
> >> + This provides support for NVME hardware monitoring. If enabled,
> >> + a hardware monitoring device will be created for each NVME drive
> >> + in the system.
> >> +
> >> + If unsure, say N.
> >> +
> >> config NVME_FABRICS
> >> tristate
> >>
> >> diff --git a/drivers/nvme/host/Makefile b/drivers/nvme/host/Makefile
> >> index 8a4b671c5f0c..03de4797a877 100644
> >> --- a/drivers/nvme/host/Makefile
> >> +++ b/drivers/nvme/host/Makefile
> >> @@ -14,6 +14,7 @@ nvme-core-$(CONFIG_TRACING) += trace.o
> >> nvme-core-$(CONFIG_NVME_MULTIPATH) += multipath.o
> >> nvme-core-$(CONFIG_NVM) += lightnvm.o
> >> nvme-core-$(CONFIG_FAULT_INJECTION_DEBUG_FS) += fault_inject.o
> >> +nvme-core-$(CONFIG_NVME_HWMON) += nvme-hwmon.o
> >>
> >> nvme-y += pci.o
> >>
> >> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> >> index fa7ba09dca77..fc1d4b146717 100644
> >> --- a/drivers/nvme/host/core.c
> >> +++ b/drivers/nvme/host/core.c
> >> @@ -2796,6 +2796,9 @@ int nvme_init_identify(struct nvme_ctrl *ctrl)
> >> ctrl->oncs = le16_to_cpu(id->oncs);
> >> ctrl->mtfa = le16_to_cpu(id->mtfa);
> >> ctrl->oaes = le32_to_cpu(id->oaes);
> >> + ctrl->wctemp = le16_to_cpu(id->wctemp);
> >> + ctrl->cctemp = le16_to_cpu(id->cctemp);
> >> +
> >> atomic_set(&ctrl->abort_limit, id->acl + 1);
> >> ctrl->vwc = id->vwc;
> >> if (id->mdts)
> >> @@ -2897,6 +2900,8 @@ int nvme_init_identify(struct nvme_ctrl *ctrl)
> >>
> >> ctrl->identified = true;
> >>
> >> + nvme_hwmon_init(ctrl);
> >> +
> >> return 0;
> >>
> >> out_free:
> >
> > nvme_init_identify() can be called multiple times in an nvme ctrl's
> > lifetime (e.g. 'nvme reset /dev/nvme*' or the suspend/resume paths), so
> > shouldn't we prevent nvme_hwmon_init() from registering the hwmon
> > device more than once?
> >
> > In the nvme thermal zone patchset[1], the thermal zone is registered in
> > nvme_init_identify() and unregistered in nvme_stop_ctrl().
> >
>
> Doesn't that mean that the initialization should happen in nvme_start_ctrl()
> and not here?

That seems possible, but I would like to ask for the maintainers' opinion.
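
To illustrate the concern, here is a rough userspace sketch (not kernel
code) of one way repeated registration could be avoided. Everything in it
is an assumption made for illustration: struct fake_ctrl, the
hwmon_registered flag, and the fake_*() helpers are hypothetical and do not
exist in the nvme driver or in this patch. It only models the case where
the identify path runs again after a reset.

```c
/*
 * Userspace model only. All names below are hypothetical stand-ins,
 * not part of the nvme driver or of the patch under review.
 */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct fake_ctrl {
	bool hwmon_registered;	/* hypothetical "already registered" flag */
	int register_calls;	/* how many real registrations happened */
};

/* Register the (pretend) hwmon device at most once per registration cycle. */
int fake_hwmon_init(struct fake_ctrl *ctrl)
{
	assert(ctrl != NULL);

	if (ctrl->hwmon_registered)
		return 0;	/* nothing to do on a repeated call */

	ctrl->register_calls++;	/* stands in for the real registration */
	ctrl->hwmon_registered = true;
	return 0;
}

/* Drop the registration, e.g. from a stop/teardown path. */
void fake_hwmon_exit(struct fake_ctrl *ctrl)
{
	assert(ctrl != NULL);
	ctrl->hwmon_registered = false;
}

/* Models an identify path that may run again after reset or resume. */
int fake_init_identify(struct fake_ctrl *ctrl)
{
	return fake_hwmon_init(ctrl);
}
```

With this guard, calling fake_init_identify() twice in a row registers only
once, and a new registration happens only after fake_hwmon_exit(). Whether a
flag like this or moving the call to nvme_start_ctrl() is preferable is
exactly the question for the maintainers.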