RE: SCSI low level driver: how to prevent I/O upon hibernation?

From: Dexuan Cui
Date: Sat Apr 11 2020 - 02:02:08 EST


> From: Ming Lei <tom.leiming@xxxxxxxxx>
> Sent: Friday, April 10, 2020 12:43 AM
> To: Dexuan Cui <decui@xxxxxxxxxxxxx>
>
> Hello Dexuan,
>
> On Fri, Apr 10, 2020 at 1:44 PM Dexuan Cui <decui@xxxxxxxxxxxxx> wrote:
> >
> > Hi all,
> > Can you please recommend the standard way to prevent the upper layer SCSI
> > driver from submitting new I/O requests when the system is doing
> > hibernation?
> >
> > Actually I already asked the question on 5/30 last year ...
> > and I thought all the sdevs are suspended and resumed automatically in
> > drivers/scsi/scsi_pm.c, and the low level SCSI adapter driver (i.e. hv_storvsc)
> > only needs to suspend/resume the state of the adapter itself. However, it
> > looks
> > this is not true, because today I got such a panic in a v5.6 Linux VM running
> > on Hyper-V: the 'suspend' part of the hibernation process finished without
> > any issue, but when the VM was trying to resume back from the 'new'
> > kernel to the 'old' kernel, these events happened:
> >
> > 1. the new kernel loaded the saved state from disk to memory.
> >
> > 2. the new kernel quiesced the devices, including the SCSI DVD device
> > controlled by the hv_storvsc low level SCSI driver, i.e.
> > drivers/scsi/storvsc_drv.c: storvsc_suspend() was called and the related
> > vmbus ringbuffer was freed.
> >
> > 3. However, disk_events_workfn() -> ... -> cdrom_check_events() -> ...
> > -> scsi_queue_rq() -> ... -> storvsc_queuecommand() was still trying to
> > submit I/O commands to the freed vmbus ringbuffer, and as a result, a NULL
> > pointer dereference panic happened.
>
> Last time I replied to you in above link:
>
> "scsi_device_quiesce() has been called by scsi_dev_type_suspend() to prevent
> any non-pm request from entering queue."
>
> That meant no any normal FS request can enter scsi queue after suspend,
> however request with BLK_MQ_REQ_PREEMPT is still allowed to be queued
> to LLD after suspend.
>
> So you can't free related vmbus ringbuffer cause BLK_MQ_REQ_PREEMPT
> request is still to be handled.
>
> Thanks,
> Ming Lei

Actually I think I found the APIs with the help of Long Li:
scsi_host_block and scsi_host_unblock(). The new APIs were just added
on 2/28. :-)

Unluckily scsi_host_block() doesn't allow a state transition from
SDEV_QUIESCE to SDEV_BLOCK and returns -EINVAL for that, so I made the
below patch and it looks hibernation can work reliably for me now.

Please let me know if the change to scsi_device_set_state() is OK.

If the patch looks good, I plan to split and post it sometime next week.

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 47835c4b4ee0..06c260f6cdae 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -2284,6 +2284,7 @@ scsi_device_set_state(struct scsi_device *sdev, enum scsi_device_state state)
switch (oldstate) {
case SDEV_RUNNING:
case SDEV_CREATED_BLOCK:
+ case SDEV_QUIESCE:
case SDEV_OFFLINE:
break;
default:
diff --git a/drivers/scsi/storvsc_drv.c b/drivers/scsi/storvsc_drv.c
index fb41636519ee..fd51d2f03778 100644
--- a/drivers/scsi/storvsc_drv.c
+++ b/drivers/scsi/storvsc_drv.c
@@ -1948,6 +1948,11 @@ static int storvsc_suspend(struct hv_device *hv_dev)
struct storvsc_device *stor_device = hv_get_drvdata(hv_dev);
struct Scsi_Host *host = stor_device->host;
struct hv_host_device *host_dev = shost_priv(host);
+ int ret;
+
+ ret = scsi_host_block(host);
+ if (ret)
+ return ret;

storvsc_wait_to_drain(stor_device);

@@ -1968,10 +1973,15 @@ static int storvsc_suspend(struct hv_device *hv_dev)

static int storvsc_resume(struct hv_device *hv_dev)
{
+ struct storvsc_device *stor_device = hv_get_drvdata(hv_dev);
+ struct Scsi_Host *host = stor_device->host;
int ret;

ret = storvsc_connect_to_vsp(hv_dev, storvsc_ringbuffer_size,
hv_dev_is_fc(hv_dev));
+ if (!ret)
+ ret = scsi_host_unblock(host, SDEV_RUNNING);
+
return ret;
}

Thanks,
-- Dexuan