Re: [PATCH v2 5/5] nvme-fc: Freeze queues before destroying them

From: James Smart
Date: Fri Jul 09 2021 - 12:14:16 EST


On 7/8/2021 2:27 AM, Daniel Wagner wrote:
nvme_wait_freeze_timeout() in nvme_fc_recreate_io_queues() needs to be
paired with a nvme_start_freeze(). Without freezing first we will always
time out in nvme_wait_freeze_timeout().

Note there is a similar fix for RDMA 9f98772ba307 ("nvme-rdma: fix
controller reset hang during traffic") which happens to follow the PCI
strategy for resetting the queues.

Signed-off-by: Daniel Wagner <dwagner@xxxxxxx>
---
drivers/nvme/host/fc.c | 1 +
1 file changed, 1 insertion(+)

diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 8e1fc3796735..a38b01485939 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -3249,6 +3249,7 @@ nvme_fc_delete_association(struct nvme_fc_ctrl *ctrl)
nvme_fc_xmt_ls_rsp(disls);
if (ctrl->ctrl.tagset) {
+ nvme_start_freeze(&ctrl->ctrl);
nvme_fc_delete_hw_io_queues(ctrl);
nvme_fc_free_io_queues(ctrl);
}


Thanks for the note. That definitely helped me follow what is being attempted. I also agree with Hannes that the comment from the rdma patch should be present here to explain what's going on.

Looking at the patch, this is not done in the same place or manner as rdma. There, freezing and stopping the queues happens before cancelling, which doesn't correspond to where this was added (this is after all cancellations). We also seem to be missing a nvme_sync_io_queues() call in the sequence. So I believe there's more work to be done on this patch. I'll see what I can do.
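
As a rough illustration of the ordering discussed above, here is a toy user-space model in C. It is only a sketch: the toy_* functions and struct toy_ctrl are made up for this example and are not the kernel's APIs or data structures. The model assumes, as the patch description states, that waiting for a freeze can only succeed if a freeze was started first, and it mirrors the sequence described above (freeze and stop, then sync, then cancel).

```c
#include <stdbool.h>

/* Hypothetical stand-in for controller queue state; NOT the kernel struct. */
struct toy_ctrl {
	bool freeze_started;	/* a freeze has been started */
	bool stopped;		/* queues are quiesced */
	bool synced;		/* pending queue work has been synced */
	int inflight;		/* outstanding requests */
};

/* Toy analogues of the steps named in this thread (assumed semantics). */
void toy_start_freeze(struct toy_ctrl *c)   { c->freeze_started = true; }
void toy_stop_queues(struct toy_ctrl *c)    { c->stopped = true; }
void toy_sync_io_queues(struct toy_ctrl *c) { c->synced = true; }
void toy_cancel_inflight(struct toy_ctrl *c) { c->inflight = 0; }

/*
 * Toy analogue of waiting for the freeze with a timeout: it can only
 * succeed if a freeze was started and nothing is outstanding. Returns
 * false to model a timeout.
 */
bool toy_wait_freeze_timeout(const struct toy_ctrl *c)
{
	return c->freeze_started && c->inflight == 0;
}

/* Ordering as described above: freeze and stop first, then sync, then cancel. */
void toy_teardown_freeze_first(struct toy_ctrl *c)
{
	toy_start_freeze(c);
	toy_stop_queues(c);
	toy_sync_io_queues(c);
	toy_cancel_inflight(c);
}

/* Teardown that never starts a freeze; the later wait cannot succeed. */
void toy_teardown_no_freeze(struct toy_ctrl *c)
{
	toy_cancel_inflight(c);
}
```

In this model, a controller torn down with toy_teardown_freeze_first() passes the later toy_wait_freeze_timeout() check, while one torn down with toy_teardown_no_freeze() always fails it, even though all outstanding requests were cancelled.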

We really need to see about a common layer for transports. So much of what we do is similar. We were ok at the start, but we've drifted apart over time and the requirements of the core layer aren't propagating to all transports.

-- james