RE: SCSI low level driver: how to prevent I/O upon hibernation?

From: Dexuan Cui
Date: Sat Apr 11 2020 - 13:21:33 EST


> From: Bart Van Assche <bvanassche@xxxxxxx>
> Sent: Saturday, April 11, 2020 8:03 AM
>
> On 2020-04-10 23:01, Dexuan Cui wrote:
> > Please let me know if the change to scsi_device_set_state() is OK.
>
> Hadn't Ming Lei already root-caused this issue for you? From his e-mail:
> "So you can't free related vmbus ringbuffer cause BLK_MQ_REQ_PREEMPT
> request is still to be handled."
>
> Please follow that advice.
>
> Bart.

Hi Bart, Ming,
I agree Ming has root-caused the issue, but it looks the advice can not
apply to the hibernation scenario. :-)

Sorry for my lack of knowledge of the complex SCSI subsystems -- could
you please elaborate on what a low level SCSI device driver (like hv_storvsc)
should do to safely save/restore the device state upon hibernation?

The nature of "free related vmbus ringbuffer" in hv_storvsc is that: the
driver can not handle any I/O after the device is quiesced in
software_resume() -> load_image_and_restore() -> hibernation_restore()
-> dpm_suspend_start() -> ... -> storvsc_suspend(). BTW, after the SCSI
device is quiesced, the hibernation's resume path also quiesces other
devices, disables non-boot CPUs, and finally jumps to the old kernel's
entry point where the old kernel was suspended, and the old kernel will
resume back.

My intuition is that the upper level SCSI layer should provide an API to
flush any pending I/O and block any new I/O after a SCSI device is
"quiesced"? -- it looks scsi_host_block()/scsi_host_unblock() are such
APIs, which are already used by
drivers/scsi/aacraid/linit.c: aac_suspend()/aac_resume().

That's why I proposed the patch of the same thing for hv_storvsc, and
it looks the patch works for me: without the patch I can easily hit the
panic I reported in the first email; with the patch, I have successfully
done more than 30 rounds of hibernation without the panic.

However, it looks you implied my intuition is wrong and it's *expected*
that the upper level SCSI layer can still submit I/O requests with the
BLK_MQ_REQ_PREEMPT flag after the SCSI device is "quiesced"?

If this is the case, then how is hv_storvsc supposed to handle the I/O
after the SCSI device is quiesced? I can keep the related vmbus ringbuffer,
but the real issue is: the driver is unable to handle any I/O at all since the
vmbus connection to the Hyper-V host is disconnected soon, after
the SCSI device is quiesced. Should hv_storvsc return an error for such I/O,
or block such I/O until the SCSI device is resumed? -- These don't look
good to me, and I really think the upper level SCSI layer should provide
an API to block any new I/O after a SCSI device is "quiesced" -- again, can
you please clarify if scsi_host_block()/scsi_host_unblock() are such APIs?

Looking forward to your replies!

Thanks,
-- Dexuan