Re: [PATCH v5 6/7] module: Improve support for asynchronous module exit code

From: Bart Van Assche
Date: Wed Sep 28 2022 - 15:27:16 EST


On 9/27/22 18:09, Ming Lei wrote:
> On Wed, Sep 14, 2022 at 03:56:20PM -0700, Bart Van Assche wrote:
>> Some kernel modules call device_del() from their module exit code and
>> schedule asynchronous work from inside the .release callback without waiting
>> until that callback has finished. As an example, many SCSI LLD drivers call
>
> It isn't only related with device, any kobject has such issue, or any
> reference counter usage has similar potential risk, see previous discussion:
>
> https://lore.kernel.org/lkml/YsZm7lSXYAHT14ui@T590/
>
> IMO, it is one fundamental problem wrt. module vs. reference counting or
> kobject uses at least, since the callback depends on module code
> segment.
>
>> scsi_remove_host() from their module exit code. scsi_remove_host() may
>> invoke scsi_device_dev_release_usercontext() asynchronously.
>> scsi_device_dev_release_usercontext() uses the host template pointer and
>> that pointer usually exists in static storage in the SCSI LLD. Support
>> using the module reference count to keep the module around until
>> asynchronous module exiting has completed by waiting in the delete_module()
>> system call until the module reference count drops to zero.
>
> The issue can't be addressed by the normal mod->refcnt, since user need
> to unload module when the device isn't used.

Hi Ming,

How about removing support for calling scsi_device_put() from atomic context
as is done in the untested patch below?

Thanks,

Bart.

diff --git a/drivers/scsi/scsi.c b/drivers/scsi/scsi.c
index c59eac7a32f2..661753a10b47 100644
--- a/drivers/scsi/scsi.c
+++ b/drivers/scsi/scsi.c
@@ -561,6 +561,8 @@ EXPORT_SYMBOL(scsi_report_opcode);
  */
 int scsi_device_get(struct scsi_device *sdev)
 {
+	might_sleep();
+
 	if (sdev->sdev_state == SDEV_DEL || sdev->sdev_state == SDEV_CANCEL)
 		goto fail;
 	if (!get_device(&sdev->sdev_gendev))
@@ -588,6 +590,7 @@ void scsi_device_put(struct scsi_device *sdev)
 {
 	struct module *mod = sdev->host->hostt->module;

+	might_sleep();
 	put_device(&sdev->sdev_gendev);
 	module_put(mod);
 }
diff --git a/drivers/scsi/scsi_sysfs.c b/drivers/scsi/scsi_sysfs.c
index a3aaafdeac1d..4cfc9317b4ad 100644
--- a/drivers/scsi/scsi_sysfs.c
+++ b/drivers/scsi/scsi_sysfs.c
@@ -441,7 +441,7 @@ static void scsi_device_cls_release(struct device *class_dev)
 	put_device(&sdev->sdev_gendev);
 }

-static void scsi_device_dev_release_usercontext(struct work_struct *work)
+static void scsi_device_dev_release(struct device *dev)
 {
 	struct scsi_device *sdev;
 	struct device *parent;
@@ -450,11 +450,8 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)
 	struct scsi_vpd *vpd_pg0 = NULL, *vpd_pg89 = NULL;
 	struct scsi_vpd *vpd_pgb0 = NULL, *vpd_pgb1 = NULL, *vpd_pgb2 = NULL;
 	unsigned long flags;
-	struct module *mod;
-
-	sdev = container_of(work, struct scsi_device, ew.work);

-	mod = sdev->host->hostt->module;
+	sdev = to_scsi_device(dev);

 	parent = sdev->sdev_gendev.parent;

@@ -516,19 +513,6 @@ static void scsi_device_dev_release_usercontext(struct work_struct *work)

 	if (parent)
 		put_device(parent);
-	module_put(mod);
-}
-
-static void scsi_device_dev_release(struct device *dev)
-{
-	struct scsi_device *sdp = to_scsi_device(dev);
-
-	/* Set module pointer as NULL in case of module unloading */
-	if (!try_module_get(sdp->host->hostt->module))
-		sdp->host->hostt->module = NULL;
-
-	execute_in_process_context(scsi_device_dev_release_usercontext,
-				   &sdp->ew);
 }

 static struct class sdev_class = {