Re: [PATCH] nvme: hwmon: fix crash on device teardown

From: Hannes Reinecke
Date: Fri Dec 11 2020 - 09:14:33 EST


On 12/9/20 10:32 PM, Enzo Matsumiya wrote:
Fix a possible NULL pointer dereference when trying to read
hwmon sysfs entries associated to NVMe-oF devices that were
hot-removed or disconnected.

Unregister the NVMe hwmon device upon controller teardown
(nvme_stop_ctrl()).

Signed-off-by: Enzo Matsumiya <ematsumiya@xxxxxxx>
---
drivers/nvme/host/core.c | 1 +
drivers/nvme/host/hwmon.c | 8 ++++++++
drivers/nvme/host/nvme.h | 2 ++
3 files changed, 11 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 9a270e49df17..becc80a0c3b8 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -4344,6 +4344,7 @@ EXPORT_SYMBOL_GPL(nvme_complete_async_event);
void nvme_stop_ctrl(struct nvme_ctrl *ctrl)
{
+ nvme_hwmon_exit(ctrl);
nvme_mpath_stop(ctrl);
nvme_stop_keep_alive(ctrl);
flush_work(&ctrl->async_event_work);
diff --git a/drivers/nvme/host/hwmon.c b/drivers/nvme/host/hwmon.c
index 552dbc04567b..7f62cca4c577 100644
--- a/drivers/nvme/host/hwmon.c
+++ b/drivers/nvme/host/hwmon.c
@@ -71,6 +71,9 @@ static int nvme_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
int temp;
int err;
+ if (data->ctrl->state != NVME_CTRL_LIVE)
+ return -EAGAIN;
+
/*
* First handle attributes which don't require us to read
* the smart log.
@@ -253,3 +256,8 @@ int nvme_hwmon_init(struct nvme_ctrl *ctrl)
return 0;
}
+
+void nvme_hwmon_exit(struct nvme_ctrl *ctrl)
+{
+ hwmon_device_unregister(ctrl->dev);
+}
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 567f7ad18a91..621e9b1575f6 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -807,11 +807,13 @@ static inline struct nvme_ns *nvme_get_ns_from_dev(struct device *dev)
#ifdef CONFIG_NVME_HWMON
int nvme_hwmon_init(struct nvme_ctrl *ctrl);
+void nvme_hwmon_exit(struct nvme_ctrl *ctrl);
#else
static inline int nvme_hwmon_init(struct nvme_ctrl *ctrl)
{
return 0;
}
+static inline void nvme_hwmon_exit(struct nvme_ctrl *ctrl) { }
#endif
u32 nvme_command_effects(struct nvme_ctrl *ctrl, struct nvme_ns *ns,

Something's fishy here.
The hwmon attributes should have been created only once for the lifetime of the controller (creation is protected by '!initialized').
And the hwmon lifetime should've been controlled by the lifetime of the controller (which to my knowledge was the idea behind the devm_* thingies).
So why do we have to deallocate the hwmon attributes?
And why on reset? And who's re-creating them after reset, seeing that 'initialized' should be true?
Hmm?

Cheers,

Hannes
--
Dr. Hannes Reinecke Kernel Storage Architect
hare@xxxxxxx +49 911 74053 688
SUSE Software Solutions GmbH, Maxfeldstr. 5, 90409 Nürnberg
HRB 36809 (AG Nürnberg), Geschäftsführer: Felix Imendörffer