Re: [PATCH 1/1] block: System crashes when cpu hotplug + bouncing port

From: Daniel Wagner
Date: Tue Jun 29 2021 - 05:49:44 EST


On Tue, Jun 29, 2021 at 05:35:51PM +0800, Ming Lei wrote:
> With the two patches I posted, __nvme_submit_sync_cmd() shouldn't return
> error, can you observe the error?

There are still ways the allocation can fail:

	ret = blk_queue_enter(q, flags);
	if (ret)
		return ERR_PTR(ret);

	ret = -EXDEV;
	data.hctx = q->queue_hw_ctx[hctx_idx];
	if (!blk_mq_hw_queue_mapped(data.hctx))
		goto out_queue_exit;

No, I don't see any errors. I am still trying to reproduce it on real
hardware. The setup with blktests running in QEMU did work with all
patches applied (the ones from me and your patches).

About the error argument: Later in the code path, e.g. in
__nvme_submit_sync_cmd(), transport errors (incl. canceled requests) are
handled as well, hence the upper layer will see errors during connection
attempts. My point is, there is nothing special about the connection
attempt failing. We have error handling code in place and the above
state machine has to deal with it.
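
To illustrate, here is a minimal userspace sketch of how a caller sees
the two allocation failures quoted above. It is not the kernel code:
ERR_PTR(), PTR_ERR() and IS_ERR() are simplified stand-ins for the
kernel helpers, and fake_alloc_request_hctx(), submit_connect(),
struct fake_request and the enter_err/hctx_mapped parameters are
hypothetical names made up for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's ERR_PTR helpers. */
#define MAX_ERRNO 4095

static inline void *ERR_PTR(long error)
{
	return (void *)(intptr_t)error;
}

static inline long PTR_ERR(const void *ptr)
{
	return (long)(intptr_t)ptr;
}

static inline int IS_ERR(const void *ptr)
{
	return (uintptr_t)ptr >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical request type, only here to have something to return. */
struct fake_request {
	int tag;
};

/*
 * Hypothetical allocator mirroring the two checks quoted above:
 * enter_err models a non-zero return from blk_queue_enter(),
 * hctx_mapped models the result of blk_mq_hw_queue_mapped().
 */
static struct fake_request *fake_alloc_request_hctx(int enter_err,
						    int hctx_mapped)
{
	static struct fake_request rq = { .tag = 1 };

	if (enter_err)
		return ERR_PTR(enter_err);
	if (!hctx_mapped)
		return ERR_PTR(-EXDEV);
	return &rq;
}

/*
 * Hypothetical caller: an allocation failure is passed up like any
 * other error, so the upper layer handles it on the same path.
 */
static int submit_connect(int enter_err, int hctx_mapped)
{
	struct fake_request *rq;

	rq = fake_alloc_request_hctx(enter_err, hctx_mapped);
	if (IS_ERR(rq))
		return (int)PTR_ERR(rq);

	return 0;
}
```

With this sketch, submit_connect(0, 0) returns -EXDEV and
submit_connect(-ENODEV, 1) returns -ENODEV, so from the caller's point
of view an unmapped hctx is just one more error to handle.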

Anyway, avoiding the if in the hotpath is a good thing. I just don't
think your argument that no error can happen is correct.