NVMe regression in all kernels after 4.4.x with NVMe in M.2 slot on laptop?

From: Marc MERLIN
Date: Fri Aug 05 2016 - 15:26:28 EST


I've been stuck on 4.4.x for a while (currently 4.4.5) because any
subsequent kernel would fail to suspend or resume (S3 sleep) on my
Thinkpad P70.

Due to lack of time, I only got around to doing a git bisect now
(sorry); I did it between 4.4.0 and 4.5.0.
It's my first bisect, but I hope I did it right, aside from the fact
that my kernel wasn't built from exactly the same .config each time,
since the config had to be adjusted depending on which kernel the
bisect landed on.

That said, the patch found by the bisect is a plausible culprit.
I use a 512GB NVMe SSD in my laptop, and I'd guess very few people do,
which could be why I'm the first/only person to report this.

Sadly, because the NVMe driver changed a lot between 4.4 and 4.5 and
I'm not a kernel hacker, I can't simply reverse-apply the patch on top
of 4.5 to see if that helps: I'd have to unwind a bunch of other
changes too, and that's a bit beyond my expertise and the time I have
right now.

Would this patch make sense as the reason why I can't S3 sleep
anymore, and would you have a test patch against 4.5, 4.6, or 4.7 that
I can try, to see if it fixes the problem?
The symptom is that the red LED (the dot of the i in the ThinkPad logo
on the back cover) flashes in odd patterns when I shut the lid; the
pattern isn't always the same, but none of them is the normal gentle
on/off pulsing that indicates proper S3 sleep.
The caps lock LED also flashes rapidly when I open the lid again, and
the laptop is stone dead at that point.

Boot logs on the 4.4.5 kernel, where sleep works fine:
[ 1.245549] ahci 0000:00:17.0: version 3.0
[ 1.245733] ahci 0000:00:17.0: AHCI 0001.0301 32 slots 2 ports 6 Gbps 0xc impl SATA mode
[ 1.245771] ahci 0000:00:17.0: flags: 64bit ncq sntf pm led clo only pio slum part ems deso sadm sds apst
[ 1.251140] scsi host0: ahci
[ 1.251587] scsi host1: ahci
[ 1.251972] scsi host2: ahci
[ 1.252360] scsi host3: ahci
[ 1.252437] ata1: DUMMY
[ 1.252449] ata2: DUMMY
[ 1.252462] ata3: SATA max UDMA/133 abar m2048@0xd584c000 port 0xd584c200 irq 122
[ 1.252499] ata4: SATA max UDMA/133 abar m2048@0xd584c000 port 0xd584c280 irq 122
[ 1.253374] scsi host4: pata_legacy
[ 1.253439] ata5: PATA max PIO4 cmd 0x1f0 ctl 0x3f6 irq 14
[ 1.355385] nvme0n1: p1 p2 p3 p4 p5 p6 p7 p8
[ 1.570804] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 1.570877] ata4: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[ 1.573097] ata3.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 1.573101] ata3.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 1.573690] ata3.00: supports DRM functions and may not be fully accessible
[ 1.574399] ata3.00: disabling queued TRIM support
[ 1.574402] ata3.00: ATA-9: Samsung SSD 850 EVO 2TB, EMT01B6Q, max UDMA/133
[ 1.574435] ata3.00: 3907029168 sectors, multi 1: LBA48 NCQ (depth 31/32), AA
[ 1.575954] ata3.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 1.575958] ata3.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out
[ 1.576550] ata3.00: supports DRM functions and may not be fully accessible
[ 1.577209] ata3.00: disabling queued TRIM support
[ 1.578007] ata3.00: configured for UDMA/133
[ 1.578037] ata4.00: ACPI cmd ef/02:00:00:00:00:a0 (SET FEATURES) succeeded
[ 1.578040] ata4.00: ACPI cmd f5/00:00:00:00:00:a0 (SECURITY FREEZE LOCK) filtered out


The patch found by the bisect is attached below.

Thanks,
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Microsoft is to operating systems ....
.... what McDonalds is to gourmet cooking
Home page: http://marc.merlins.org/ | PGP 1024R/763BE901
saruman:/usr/src/linux# git bisect good
25646264e15af96c5c630fc742708b1eb3339222 is the first bad commit
commit 25646264e15af96c5c630fc742708b1eb3339222
Author: Keith Busch <keith.busch@xxxxxxxxx>
Date: Mon Jan 4 09:10:57 2016 -0700

NVMe: Remove queue freezing on resets

NVMe submits all commands through the block layer now. This means we
can let requests queue at the blk-mq hardware context since there is no
path that bypasses this anymore so we don't need to freeze the queues
anymore. The driver can simply stop the h/w queues from running during
a reset instead.

This also fixes a WARN in percpu_ref_reinit when the queue was unfrozen
with requeued requests.

Signed-off-by: Keith Busch <keith.busch@xxxxxxxxx>
Signed-off-by: Jens Axboe <axboe@xxxxxx>


diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index e31a256..8da4a8a 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1372,12 +1372,14 @@ out:
 	return ret;
 }
 
-void nvme_stop_queues(struct nvme_ctrl *ctrl)
+void nvme_freeze_queues(struct nvme_ctrl *ctrl)
 {
 	struct nvme_ns *ns;
 
 	mutex_lock(&ctrl->namespaces_mutex);
 	list_for_each_entry(ns, &ctrl->namespaces, list) {
+		blk_mq_freeze_queue_start(ns->queue);
+
 		spin_lock_irq(ns->queue->queue_lock);
 		queue_flag_set(QUEUE_FLAG_STOPPED, ns->queue);
 		spin_unlock_irq(ns->queue->queue_lock);
@@ -1388,13 +1390,14 @@ void nvme_stop_queues(struct nvme_ctrl *ctrl)
 	mutex_unlock(&ctrl->namespaces_mutex);
 }
 
-void nvme_start_queues(struct nvme_ctrl *ctrl)
+void nvme_unfreeze_queues(struct nvme_ctrl *ctrl)
 {
 	struct nvme_ns *ns;
 
 	mutex_lock(&ctrl->namespaces_mutex);
 	list_for_each_entry(ns, &ctrl->namespaces, list) {
 		queue_flag_clear_unlocked(QUEUE_FLAG_STOPPED, ns->queue);
+		blk_mq_unfreeze_queue(ns->queue);
 		blk_mq_start_stopped_hw_queues(ns->queue, true);
 		blk_mq_kick_requeue_list(ns->queue);
 	}
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index 4722fad..4437592 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -238,8 +238,8 @@ int nvme_init_identify(struct nvme_ctrl *ctrl);
 void nvme_scan_namespaces(struct nvme_ctrl *ctrl);
 void nvme_remove_namespaces(struct nvme_ctrl *ctrl);
 
-void nvme_stop_queues(struct nvme_ctrl *ctrl);
-void nvme_start_queues(struct nvme_ctrl *ctrl);
+void nvme_freeze_queues(struct nvme_ctrl *ctrl);
+void nvme_unfreeze_queues(struct nvme_ctrl *ctrl);
 
 struct request *nvme_alloc_request(struct request_queue *q,
 		struct nvme_command *cmd, unsigned int flags);
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 953fe48..ac6c7af 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1064,7 +1064,7 @@ static int nvme_suspend_queue(struct nvme_queue *nvmeq)
 	spin_unlock_irq(&nvmeq->q_lock);
 
 	if (!nvmeq->qid && nvmeq->dev->ctrl.admin_q)
-		blk_mq_stop_hw_queues(nvmeq->dev->ctrl.admin_q);
+		blk_mq_freeze_queue_start(nvmeq->dev->ctrl.admin_q);
 
 	irq_set_affinity_hint(vector, NULL);
 	free_irq(vector, nvmeq);
@@ -1296,7 +1296,7 @@ static int nvme_alloc_admin_tags(struct nvme_dev *dev)
 			return -ENODEV;
 		}
 	} else
-		blk_mq_start_stopped_hw_queues(dev->ctrl.admin_q, true);
+		blk_mq_unfreeze_queue(dev->ctrl.admin_q);
 
 	return 0;
 }
@@ -1917,7 +1917,7 @@ static void nvme_dev_shutdown(struct nvme_dev *dev)
 
 	mutex_lock(&dev->shutdown_lock);
 	if (dev->bar) {
-		nvme_stop_queues(&dev->ctrl);
+		nvme_freeze_queues(&dev->ctrl);
 		csts = readl(dev->bar + NVME_REG_CSTS);
 	}
 	if (csts & NVME_CSTS_CFS || !(csts & NVME_CSTS_RDY)) {
@@ -2026,7 +2026,7 @@ static void nvme_reset_work(struct work_struct *work)
 		dev_warn(dev->dev, "IO queues not created\n");
 		nvme_remove_namespaces(&dev->ctrl);
 	} else {
-		nvme_start_queues(&dev->ctrl);
+		nvme_unfreeze_queues(&dev->ctrl);
 		nvme_dev_add(dev);
 	}