Re: [PATCH] scsi/sg: don't grab scsi host module reference

From: Marc Hartmayer
Date: Tue Jul 04 2023 - 13:04:45 EST


On Thu, Jun 22, 2023 at 12:01 AM +0800, Yu Kuai <yukuai1@xxxxxxxxxxxxxxx> wrote:
> From: Yu Kuai <yukuai3@xxxxxxxxxx>
>
> In order to prevent request_queue from being freed before cleaning up
> blktrace debugfs entries, commit db59133e9279 ("scsi: sg: fix blktrace
> debugfs entries leakage") uses scsi_device_get(). However,
> scsi_device_get() also grabs a scsi module reference, so the scsi
> module can't be removed.
>
> It's reported that blktests can't unload scsi_debug after block/001:
>
> blktests (master) # ./check block
> block/001 (stress device hotplugging) [failed]
> +++ /root/blktests/results/nodev/block/001.out.bad 2023-06-19
> Running block/001
> Stressing sd
> +modprobe: FATAL: Module scsi_debug is in use.
>
> Fix this problem by grabbing a request_queue reference directly, so that
> the scsi host module can still be unloaded while the request_queue
> remains pinned by the sg device.
>
> Reported-by: Chaitanya Kulkarni <chaitanyak@xxxxxxxxxx>
> Link: https://lore.kernel.org/all/1760da91-876d-fc9c-ab51-999a6f66ad50@xxxxxxxxxx/
> Fixes: db59133e9279 ("scsi: sg: fix blktrace debugfs entries leakage")
> Signed-off-by: Yu Kuai <yukuai3@xxxxxxxxxx>
> ---
> drivers/scsi/sg.c | 6 +++---
> 1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/sg.c b/drivers/scsi/sg.c
> index 2433eeef042a..dcb73787c29d 100644
> --- a/drivers/scsi/sg.c
> +++ b/drivers/scsi/sg.c
> @@ -1497,7 +1497,7 @@ sg_add_device(struct device *cl_dev)
>  	int error;
>  	unsigned long iflags;
>
> -	error = scsi_device_get(scsidp);
> +	error = blk_get_queue(scsidp->request_queue);
>  	if (error)
>  		return error;
>
> @@ -1558,7 +1558,7 @@ sg_add_device(struct device *cl_dev)
>  out:
>  	if (cdev)
>  		cdev_del(cdev);
> -	scsi_device_put(scsidp);
> +	blk_put_queue(scsidp->request_queue);
>  	return error;
>  }
>
> @@ -1575,7 +1575,7 @@ sg_device_destroy(struct kref *kref)
>  	 */
>
>  	blk_trace_remove(q);
> -	scsi_device_put(sdp->device);
> +	blk_put_queue(q);
>
>  	write_lock_irqsave(&sg_index_lock, flags);
>  	idr_remove(&sg_index_idr, sdp->index);
> --
> 2.39.2

Hi,

This change (bisected) triggers a regression in our KVM on s390x CI. The
symptom is that a “scsi_debug device” does not bind to the scsi_generic
driver. On s390x you can reproduce the problem as follows (I have not
tested on x86):

With this patch applied:

$ sudo modprobe scsi_debug
$ # Get the 'scsi_host,channel,target_number,LUN' tuple for the scsi_debug device
$ lsscsi |grep scsi_debug |awk '{ print $1 }'
[0:0:0:0]
$ sudo stat /sys/bus/scsi/devices/0:0:0:0/scsi_generic
stat: cannot statx '/sys/bus/scsi/devices/0:0:0:0/scsi_generic': No such file or directory


Patch reverted:

$ sudo modprobe scsi_debug
$ lsscsi |grep scsi_debug |awk '{ print $1 }'
[0:0:0:0]
$ sudo stat /sys/bus/scsi/devices/0:0:0:0/scsi_generic
File: /sys/bus/scsi/devices/0:0:0:0/scsi_generic
Size: 0 Blocks: 0 IO Block: 4096 directory
Device: 0,20 Inode: 12155 Links: 3
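
The check above could also be scripted, e.g. for a CI job. This is only a
minimal sketch: the helper name `sg_bound` and its optional sysfs-root
argument are hypothetical and exist purely for illustration. It tests nothing
more than whether the scsi_generic directory shown above is present.

```shell
#!/bin/sh
# Hypothetical helper (not part of any existing tool): succeeds if the SCSI
# device with the given H:C:T:L address has a scsi_generic directory in
# sysfs, i.e. the sg driver has bound to it.
#   $1: address such as 0:0:0:0
#   $2: sysfs root, defaults to /sys (overridable only to ease testing)
sg_bound() {
    hctl=$1
    sysfs_root=${2:-/sys}
    [ -d "$sysfs_root/bus/scsi/devices/$hctl/scsi_generic" ]
}

# Usage sketch (needs root and the scsi_debug module, so left commented out):
#   modprobe scsi_debug
#   hctl=$(lsscsi | grep scsi_debug | awk '{ print $1 }' | tr -d '[]' | head -n 1)
#   sg_bound "$hctl" || echo "no scsi_generic entry for $hctl"
```

With the patch applied the helper would be expected to fail for the
scsi_debug device, and to succeed with the patch reverted.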


Any ideas?

Marc