Re: [PATCH] nvme-pci: cancel nvme device request before disabling

From: Sagi Grimberg
Date: Fri Aug 14 2020 - 14:00:12 EST



diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index ba725ae47305..c4f1ce0ee1e3 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1249,8 +1249,8 @@ static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
dev_warn_ratelimited(dev->ctrl.device,
"I/O %d QID %d timeout, disable controller\n",
req->tag, nvmeq->qid);
- nvme_dev_disable(dev, true);
nvme_req(req)->flags |= NVME_REQ_CANCELLED;
+ nvme_dev_disable(dev, true);
return BLK_EH_DONE;

Shouldn't this flag have been set in nvme_cancel_request()?

nvme_cancel_request() is not setting this flag to cancelled and this is causing

Right, I see that it doesn't, but I'm saying that it should. We used to
do something like that, and I'm struggling to recall why we're not
anymore.

I also don't recall why, but I know that we rely on the status
propagating back from submit_sync_cmd which won't happen because
it converts the status into -ENODEV.

The driver is not reporting non-response back for all
cancelled requests, and that is probably not what we should be doing.

I'd think that we should modify our callers to handle nvme status
codes as well rather than rely on nvme_submit_sync_cmd to return a
negative codes under some conditions.