Re: [BUG] Oops when SCSI device under multipath is removed

From: Jun'ichi Nomura
Date: Thu Aug 18 2011 - 05:12:41 EST


Hi James,

On 08/16/11 20:26, Jun'ichi Nomura wrote:
> The commit log of 86cbfb5607d4b81b1a993ff689bbd2addd5d3a9b
> ("[SCSI] put stricter guards on queue dead checks") does not
> explain about the move of scsi_free_queue().
>
> But according to the discussion below, it seems
> the move was motivated to solve the following self-deadlock:
> https://lkml.org/lkml/2011/4/12/9
>
> [in the context of kblockd_workqueue]
> blk_delay_work
>   __blk_run_queue
>     scsi_request_fn
>       put_device
>         (puts final sdev refcount)
>         scsi_device_dev_release
>           execute_in_process_context(scsi_device_dev_release_usercontext)
>             [execute immediately because it's in process context]
>             scsi_device_dev_release_usercontext
>               scsi_free_queue
>                 blk_cleanup_queue
>                   blk_sync_queue
>                     (wait for blk_delay_work to complete...)
>
> James, is my understanding correct?
>
> If so, isn't it possible to move the scsi_free_queue back to
> the original place and solve the deadlock instead by
> avoiding the wait in the same context?

Actually, Tejun has posted a patch that replaces
execute_in_process_context() with queue_work()
and asked for your review:

[PATCH RESEND] scsi: don't use execute_in_process_context()
https://lkml.org/lkml/2011/4/30/87

Do you think you can take the patch and revert the move
of scsi_free_queue()?
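
To illustrate why running the release inline deadlocks while deferring
it does not, here is a minimal user-space sketch. It is only a model
under stated assumptions: every name in it (struct work, run,
sync_work, simulate, and so on) is hypothetical and none of it is
kernel code or a kernel API. Instead of really blocking, sync_work()
just records that a wait on a still-running work item was attempted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct work {
    void (*fn)(void);
    bool running;        /* true while fn is executing */
    bool would_deadlock; /* set if someone waits on us while we run */
};

#define MAX_DEFERRED 8
static struct work *deferred[MAX_DEFERRED];
static int n_deferred;
static bool defer_release;

static void delay_fn(void);
static void release_fn(void);

static struct work delay_work   = { delay_fn,   false, false }; /* models blk_delay_work */
static struct work release_work = { release_fn, false, false }; /* models the sdev release */

static void run(struct work *w)
{
    w->running = true;
    w->fn();
    w->running = false;
}

/* Models blk_sync_queue(): waiting for a work item that is still
 * running further up our own call stack can never complete. */
static void sync_work(struct work *w)
{
    if (w->running)
        w->would_deadlock = true; /* a real wait would hang here */
}

/* Models the release path that ends up in blk_sync_queue(). */
static void release_fn(void)
{
    sync_work(&delay_work);
}

/* Models blk_delay_work dropping the final sdev reference. */
static void delay_fn(void)
{
    if (defer_release) {
        /* queue_work()-like: run later, from another context */
        assert(n_deferred < MAX_DEFERRED);
        deferred[n_deferred++] = &release_work;
    } else {
        /* execute_in_process_context()-like: run immediately, inline */
        run(&release_work);
    }
}

/* Returns true if the modelled scenario would self-deadlock. */
static bool simulate(bool defer)
{
    defer_release = defer;
    n_deferred = 0;
    delay_work.would_deadlock = false;

    run(&delay_work);
    for (int i = 0; i < n_deferred; i++)
        run(deferred[i]); /* delay_work has already finished here */

    return delay_work.would_deadlock;
}
```

In this model simulate(false) reports the self-deadlock from the trace
above, and simulate(true) does not, because the release only runs after
the work item it waits for has returned.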

Thanks,
--
Jun'ichi Nomura, NEC Corporation