Re: [PATCH] nvme-pci: Shutdown when removing dead controller

From: Tyler Ramer
Date: Sat Oct 05 2019 - 17:58:21 EST


> What is the bad CSTS bit? CSTS.RDY?

The reset is triggered when nvme_should_reset() returns true:

static bool nvme_should_reset(struct nvme_dev *dev, u32 csts)
{

	/* If true, indicates loss of adapter communication, possibly by a
	 * NVMe Subsystem reset.
	 */
	bool nssro = dev->subsystem && (csts & NVME_CSTS_NSSRO);

The csts value is read from the controller in nvme_timeout():

static enum blk_eh_timer_return nvme_timeout(struct request *req, bool reserved)
{
...
	u32 csts = readl(dev->bar + NVME_REG_CSTS);
...
	/*
	 * Reset immediately if the controller is failed
	 */
	if (nvme_should_reset(dev, csts)) {
		nvme_warn_reset(dev, csts);
		nvme_dev_disable(dev, false);
		nvme_reset_ctrl(&dev->ctrl);


Again, here's the message printed by nvme_warn_reset:

Aug 26 15:01:27 testhost kernel: nvme nvme4: controller is down; will reset: CSTS=0x3, PCI_STATUS=0x10
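
For reference, the CSTS value in that log line can be decoded by hand. Below is a minimal sketch in plain C, assuming the bit positions from the NVMe CSTS register layout (RDY is bit 0, CFS is bit 1, NSSRO is bit 4). The CSTS_* macros and the csts_*() helpers are names made up for this illustration; they are not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Illustrative names only; bit positions assumed from the NVMe CSTS layout. */
#define CSTS_RDY	(1u << 0)	/* Ready */
#define CSTS_CFS	(1u << 1)	/* Controller Fatal Status */
#define CSTS_NSSRO	(1u << 4)	/* NVM Subsystem Reset Occurred */

/* Each helper reports whether the named bit is set in a raw CSTS value. */
static inline bool csts_rdy(uint32_t csts)
{
	return (csts & CSTS_RDY) != 0;
}

static inline bool csts_cfs(uint32_t csts)
{
	return (csts & CSTS_CFS) != 0;
}

static inline bool csts_nssro(uint32_t csts)
{
	return (csts & CSTS_NSSRO) != 0;
}
```

With csts = 0x3, csts_rdy() and csts_cfs() both return true and csts_nssro() returns false, which suggests that CFS, not RDY alone, is the bit that stands out in the log above.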

From include/linux/nvme.h:
	NVME_REG_CSTS	= 0x001c,	/* Controller Status */

- Tyler