Re: [patch]blk-mq: blk_mq_tag_to_rq should handle flush request

From: Jens Axboe
Date: Wed Jun 04 2014 - 22:40:36 EST


On 2014-06-04 20:27, Shaohua Li wrote:
On Wed, Jun 04, 2014 at 08:05:33PM -0600, Jens Axboe wrote:
On 2014-06-04 19:27, Shaohua Li wrote:
On Wed, Jun 04, 2014 at 10:25:22AM -0600, Jens Axboe wrote:
On 06/04/2014 09:47 AM, Jens Axboe wrote:
On 06/04/2014 09:39 AM, Jens Axboe wrote:
On 06/04/2014 09:31 AM, Christoph Hellwig wrote:
On Wed, Jun 04, 2014 at 09:02:19AM -0600, Jens Axboe wrote:
scsi_mq_find_tag only gets the scsi host, which may have multiple
queues. When called from scsi_find_tag we actually have a scsi device,
so that's not an issue, but when called from scsi_host_find_tag the
driver only provides the host.

The only solution I see right now is to have the flush_rq in the shared
tags, but that would potentially be a regression for multiple
devices and heavy flush use cases. I'll see if I can come up with
something better, or maybe Shaohua has an idea.

What about something like the following (untested, uncompiled, maybe
pseudo-code):

struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
{
	struct request *rq = tags->rqs[tag];

	/* A flush in flight borrows this tag: hand back the queue's flush_rq */
	if ((rq->cmd_flags & REQ_FLUSH_SEQ) && rq->q->flush_rq->tag == tag)
		return rq->q->flush_rq;
	return rq;
}

Ah yes, that'll work, the queue is always assigned. I'll make that change.

Something like this in complete form. Compile tested only, I'll test it
on a dev box. Probably doesn't matter too much, but I prefer to
potentially have the faster path (non-flush) just fall inline.
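
For readers following along, here is a minimal userspace sketch of what
such a "complete form" could look like, with the non-flush case returning
first. It is not the committed patch, which is not part of this message.
The struct layouts and the flag value are simplified stand-ins, not the
kernel definitions, and the helper name is_flush_request is borrowed from
a later reply in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types (assumption): only the
 * fields the lookup touches are modelled. The flag value is arbitrary. */
#define REQ_FLUSH_SEQ (1u << 0)

struct request_queue;

struct request {
	unsigned int cmd_flags;
	int tag;
	struct request_queue *q;
};

struct request_queue {
	struct request *flush_rq;	/* per-queue flush request */
};

struct blk_mq_tags {
	struct request **rqs;		/* tag -> request table */
};

/* True if the request in this tag slot is part of a flush sequence and
 * the queue's flush_rq currently carries the same tag. */
static inline bool is_flush_request(const struct request *rq, unsigned int tag)
{
	return (rq->cmd_flags & REQ_FLUSH_SEQ) &&
	       rq->q->flush_rq->tag == (int)tag;
}

struct request *blk_mq_tag_to_rq(struct blk_mq_tags *tags, unsigned int tag)
{
	struct request *rq = tags->rqs[tag];

	/* Common (non-flush) case returns straight away. */
	if (!is_flush_request(rq, tag))
		return rq;

	return rq->q->flush_rq;
}
```

With a two-slot tag table, a plain request in slot 0 is returned as-is,
while a REQ_FLUSH_SEQ request in slot 1 resolves to the queue's flush_rq
whenever flush_rq->tag is 1.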

Works for me, committed.

Sounds like there is a small race here. A FUA request has REQ_FLUSH_SEQ
set too. Assume its tag is 0. When we initialize flush_rq,
blk_mq_rq_init->blk_rq_init->memset could set the flush_rq tag to 0 for a
short time. In that short time, blk_mq_tag_to_rq will return the wrong
request for the FUA request.

We can do (rq->cmd_flags & REQ_FLUSH_SEQ) && !(rq->cmd_flags & REQ_FUA) in
is_flush_request to avoid this issue.
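
To make the window concrete, here is a small single-threaded userspace
model of it. This is only a sketch: the structs and flag values are
simplified stand-ins rather than the kernel definitions, and the function
names is_flush_request_orig, is_flush_request_no_fua and
flush_rq_mid_init are made up for illustration. The memset stands for the
moment in flush_rq initialization when the tag briefly reads as 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Arbitrary flag values for the model (assumption, not the kernel's). */
#define REQ_FLUSH_SEQ (1u << 0)
#define REQ_FUA       (1u << 1)

struct request_queue;

struct request {
	unsigned int cmd_flags;
	int tag;
	struct request_queue *q;
};

struct request_queue {
	struct request *flush_rq;
};

/* The check as discussed so far: flush-sequence flag plus tag match. */
static bool is_flush_request_orig(const struct request *rq, unsigned int tag)
{
	return (rq->cmd_flags & REQ_FLUSH_SEQ) &&
	       rq->q->flush_rq->tag == (int)tag;
}

/* The check with the proposed FUA exclusion added. */
static bool is_flush_request_no_fua(const struct request *rq, unsigned int tag)
{
	return (rq->cmd_flags & REQ_FLUSH_SEQ) &&
	       !(rq->cmd_flags & REQ_FUA) &&
	       rq->q->flush_rq->tag == (int)tag;
}

/* Hypothetical helper: leaves flush_rq in the state it has right after
 * the memset, before a real tag has been assigned (tag reads as 0). */
static void flush_rq_mid_init(struct request *flush_rq)
{
	memset(flush_rq, 0, sizeof(*flush_rq));
}
```

In this model, a FUA request holding tag 0 is misidentified as the flush
request by the original check while flush_rq is mid-init, and is left
alone once the REQ_FUA test is added.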

We don't memset the entire request anymore from the rq alloc path.

blk_kick_flush() still calls blk_rq_init()?

OK, I see what you mean now. I was thinking about the normal use cases
of blk_mq_tag_to_rq(), which would be completion or issue time. If you
are concerned about the "any point in time" validity, then yes, it could
be an issue.

Might be better to fixup flush init, though.

--
Jens Axboe
