[PATCH 1/2] devcoredump: Remove devcoredump device if failing device is gone

From: Rodrigo Vivi
Date: Fri Jan 26 2024 - 10:11:43 EST


Make dev_coredumpm() a real device-managed helper that frees the
devcoredump device not only after the scheduled delay
(DEVCD_TIMEOUT), but also when the failing/crashed device is gone.

Module removal for drivers using devcoredump is currently broken
if attempted between the crash and the DEVCD_TIMEOUT expiry, since
the sysfs symbolic link won't be deleted.

On top of that, for PCI devices, unbinding the device calls the
PCI .remove callback, which returns void and cannot fail. At that
point our device is pretty much gone, but the read and free
functions are still reachable through the devcoredump device and
can hit NULL dereferences or use-after-free.

So, if the failing device is gone, also request removal of the
devcoredump device, using the same mod_delayed_work() call as when
anything is written to data. A flush cannot be used since it is
synchronous and the devcd would surely be gone right before the
mutex_unlock() on the next line.

Cc: Jose Souza <jose.souza@xxxxxxxxx>
Cc: Maarten Lankhorst <maarten.lankhorst@xxxxxxxxxxxxxxx>
Cc: Johannes Berg <johannes@xxxxxxxxxxxxxxxx>
Cc: Greg Kroah-Hartman <gregkh@xxxxxxxxxxxxxxxxxxx>
Cc: Rafael J. Wysocki <rafael@xxxxxxxxxx>
Signed-off-by: Rodrigo Vivi <rodrigo.vivi@xxxxxxxxx>
---
drivers/base/devcoredump.c | 15 +++++++++++++++
1 file changed, 15 insertions(+)
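
For readers less familiar with managed actions, the removal path
described in the commit message can be modelled in user space. This
is only a sketch under stated assumptions: mock_device,
mock_add_action(), mock_release_all(), mock_devcd and
mock_devcd_remove() are hypothetical stand-ins invented for
illustration, not kernel APIs, and the workqueue is reduced to a
recorded delay. It shows a callback registered against the failing
device running at teardown and requesting an immediate, one-time
deletion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ACTIONS 8

/* Hypothetical stand-in for one managed action entry. */
struct mock_action {
	void (*fn)(void *data);
	void *data;
};

/* Hypothetical stand-in for struct device: only the action list. */
struct mock_device {
	struct mock_action actions[MAX_ACTIONS];
	size_t nr_actions;
};

/* Hypothetical stand-in for struct devcd_entry (locking omitted). */
struct mock_devcd {
	bool delete_work;	/* deletion has already been requested */
	int queued_delay;	/* delay of the queued deletion, -1 if none */
	int nr_requests;	/* how many times deletion was queued */
};

/* Models registering a managed action: 0 on success, -1 if full. */
static int mock_add_action(struct mock_device *dev,
			   void (*fn)(void *data), void *data)
{
	assert(dev && fn);
	if (dev->nr_actions == MAX_ACTIONS)
		return -1;
	dev->actions[dev->nr_actions].fn = fn;
	dev->actions[dev->nr_actions].data = data;
	dev->nr_actions++;
	return 0;
}

/* Models device teardown: run actions in reverse registration order. */
static void mock_release_all(struct mock_device *dev)
{
	while (dev->nr_actions > 0) {
		struct mock_action *a = &dev->actions[--dev->nr_actions];

		a->fn(a->data);
	}
}

/* Mirrors the shape of devcd_remove(): request deletion only once. */
static void mock_devcd_remove(void *data)
{
	struct mock_devcd *devcd = data;

	if (!devcd->delete_work) {
		devcd->delete_work = true;
		devcd->queued_delay = 0;	/* "run as soon as possible" */
		devcd->nr_requests++;
	}
}
```

In this model, tearing down the mock device queues the deletion with
a zero delay, and a second call to the callback is a no-op because
delete_work is already set.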

diff --git a/drivers/base/devcoredump.c b/drivers/base/devcoredump.c
index 7e2d1f0d903a..678ecc2fa242 100644
--- a/drivers/base/devcoredump.c
+++ b/drivers/base/devcoredump.c
@@ -304,6 +304,19 @@ static ssize_t devcd_read_from_sgtable(char *buffer, loff_t offset,
offset);
}

+static void devcd_remove(void *data)
+{
+	struct devcd_entry *devcd = data;
+
+	mutex_lock(&devcd->mutex);
+	if (!devcd->delete_work) {
+		devcd->delete_work = true;
+		/* XXX: Cannot flush otherwise the mutex below will hit a UAF */
+		mod_delayed_work(system_wq, &devcd->del_wk, 0);
+	}
+	mutex_unlock(&devcd->mutex);
+}
+
/**
* dev_coredumpm - create device coredump with read/free methods
* @dev: the struct device for the crashed device
@@ -381,6 +394,8 @@ void dev_coredumpm(struct device *dev, struct module *owner,
 	kobject_uevent(&devcd->devcd_dev.kobj, KOBJ_ADD);
 	INIT_DELAYED_WORK(&devcd->del_wk, devcd_del);
 	schedule_delayed_work(&devcd->del_wk, DEVCD_TIMEOUT);
+	if (devm_add_action(dev, devcd_remove, devcd))
+		dev_warn(dev, "devcoredump managed auto-removal registration failed\n");
 	mutex_unlock(&devcd->mutex);
 	return;
put_device:
--
2.43.0