Re: [PATCH] NVMe: Avoid interrupt disable during queue init.

From: Keith Busch
Date: Fri May 22 2015 - 11:11:51 EST


On Fri, 22 May 2015, Parav Pandit wrote:
> On Fri, May 22, 2015 at 8:18 PM, Keith Busch <keith.busch@xxxxxxxxx> wrote:
>> The rcu protection on nvme queues was removed with the blk-mq conversion
>> as we rely on that layer for h/w access.
>
> o.k. But above is at level where data I/Os are not even active. It's
> between nvme_kthread and nvme_resume() from power management
> subsystem.
> I must be missing something.

On resume, everything is already reaped from the queues, so there should
be no harm letting the kthread poll an inactive queue. The proposal to
remove the q_lock during queue init makes it possible for the thread to
see the wrong cq phase bit and mess up the completion queue's head from
reaping non-existent entries.
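
To make the phase bit hazard concrete, here is a minimal user-space
sketch. It is a toy model, not the driver code: struct cq, cq_init(),
cq_post() and cq_poll() are hypothetical names, and the only things
assumed from the hardware are that bit 0 of the status field is the
phase tag and that a freshly initialized queue has zeroed entries and
an expected phase of 1.

```c
#include <stdint.h>
#include <string.h>

#define CQ_DEPTH 4

/* Toy completion entry: bit 0 of status is the phase tag. */
struct cqe {
	uint16_t status;
};

/* Toy completion queue (hypothetical layout, not the driver's). */
struct cq {
	struct cqe entries[CQ_DEPTH];
	uint16_t head;
	uint8_t phase;	/* phase value the consumer expects next */
};

/* Fresh-queue state: zeroed entries, head 0, expected phase 1. */
static void cq_init(struct cq *q)
{
	memset(q->entries, 0, sizeof(q->entries));
	q->head = 0;
	q->phase = 1;
}

/* Device side: post a completion in slot 'slot' with the given phase. */
static void cq_post(struct cq *q, uint16_t slot, uint8_t phase)
{
	q->entries[slot].status = (uint16_t)(phase & 1);
}

/*
 * Consumer side: reap entries while their phase tag matches the
 * expected phase, flipping the expected phase on wrap-around.
 * Returns the number of entries reaped (at most CQ_DEPTH per call).
 */
static int cq_poll(struct cq *q)
{
	int reaped = 0;

	while (reaped < CQ_DEPTH &&
	       (q->entries[q->head].status & 1) == q->phase) {
		reaped++;
		if (++q->head == CQ_DEPTH) {
			q->head = 0;
			q->phase = !q->phase;
		}
	}
	return reaped;
}
```

In this model, polling right after cq_init() reaps nothing, because
the zeroed entries carry phase 0 while the consumer expects 1. If the
poller instead runs with a stale expected phase of 0 against the
zeroed entries, every slot looks valid and cq_poll() reaps CQ_DEPTH
entries that were never posted. Holding a lock across the whole of
init is what keeps the poller from seeing that half-updated state.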

But beyond nvme_resume, it appears a race condition is possible in any
scenario where a device is reinitialized and cannot create the same
number of IO queues as it had originally. Part of the problem is there
doesn't seem to be a way to change a tagset's nr_hw_queues after it was
created. The conditions that lead to this scenario should be uncommon,
so I haven't given it much thought; I need to untangle dynamic namespaces
first. :)