Re: [PATCH v1] virtio_blk: fix race between start and stop queue

From: Jens Axboe
Date: Fri May 16 2014 - 11:45:35 EST


On 2014-05-16 09:43, Ming Lei wrote:
On Fri, May 16, 2014 at 11:32 PM, Jens Axboe <axboe@xxxxxxxxx> wrote:
On 2014-05-16 09:31, Ming Lei wrote:

When there are not enough vring descriptors to add a request
to the vq, blk-mq is put into the stopped state until some of
the pending descriptors are completed and freed.

Unfortunately, the vq's interrupt may arrive just before
blk-mq's BLK_MQ_S_STOPPED flag is set, so blk-mq stays
stopped even though many descriptors have been completed
and freed in the interrupt handler. In the worst case, all
pending descriptors are freed in the interrupt handler and
the queue stays stopped forever.

This patch fixes the problem by starting and stopping blk-mq
while holding vq_lock.
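
The ordering described above can be illustrated with a small userspace
model. This is only a sketch under stated assumptions, not the driver
code: struct model_vq and the model_* and replay_* helpers are
hypothetical names, a pthread mutex stands in for vq_lock, and a plain
bool stands in for the BLK_MQ_S_STOPPED flag. The replay function walks
through the bad interleaving in a single thread, and the locked helpers
show the check and the flag update made as one step.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace model of the queue state (not kernel code). */
struct model_vq {
    pthread_mutex_t lock; /* stands in for vq_lock */
    int num_free;         /* free descriptors in the ring */
    bool stopped;         /* stands in for BLK_MQ_S_STOPPED */
};

static void model_init(struct model_vq *q, int num_free)
{
    pthread_mutex_init(&q->lock, NULL);
    q->num_free = num_free;
    q->stopped = false;
}

/*
 * Submit path: take a descriptor, or mark the queue stopped.
 * The "no room" check and the stop are done under the same lock.
 */
static bool model_submit(struct model_vq *q)
{
    bool ok;

    pthread_mutex_lock(&q->lock);
    if (q->num_free > 0) {
        q->num_free--;
        ok = true;
    } else {
        q->stopped = true;
        ok = false;
    }
    pthread_mutex_unlock(&q->lock);
    return ok;
}

/* Completion path: free n descriptors and restart the queue under the lock. */
static void model_complete(struct model_vq *q, int n)
{
    assert(n > 0);
    pthread_mutex_lock(&q->lock);
    q->num_free += n;
    q->stopped = false;
    pthread_mutex_unlock(&q->lock);
}

/*
 * Replays the problematic interleaving step by step, without the lock:
 *   1. the submit path finds no free descriptor,
 *   2. the completion runs, frees a descriptor and starts the queue,
 *   3. the submit path only now sets the stopped flag.
 * Returns true if the queue ends up stopped although descriptors are free.
 */
static bool replay_racy_interleaving(void)
{
    struct model_vq q;
    bool full;
    bool stuck;

    model_init(&q, 0);
    full = (q.num_free == 0);  /* step 1 */
    q.num_free += 1;           /* step 2 */
    q.stopped = false;
    if (full)
        q.stopped = true;      /* step 3: stop is set too late */

    stuck = q.stopped && q.num_free > 0;
    pthread_mutex_destroy(&q.lock);
    return stuck;
}
```

In this model, model_submit() and model_complete() cannot interleave
between the check and the flag update, so a completion that frees
descriptors always leaves the queue started.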


Thanks, this looks good, I'll apply it for 3.16 (with a stable marker, even
if it is an unlikely event).

Thanks.

It shouldn't be very difficult to trigger in the
non-indirect descriptor case, and it is easy to reproduce
when the module parameter 'virtblk_queue_depth'
is bigger than vq->num_free in the non-indirect case.

I agree, it can definitely be set up so that it would not be hard to
trigger. But I don't recall seeing any hang bugs since 3.13 was released,
which would seem to indicate that it doesn't happen often in the wild
with default settings.

--
Jens Axboe
