Re: [PATCH v2 1/4] blk-flush: flush_rq should inherit first_rq's cmd_flags

From: Chengming Zhou
Date: Mon Jul 31 2023 - 10:07:06 EST


On 2023/7/31 14:09, Christoph Hellwig wrote:
> On Tue, Jul 25, 2023 at 09:00:59PM +0800, chengming.zhou@xxxxxxxxx wrote:
>> From: Chengming Zhou <zhouchengming@xxxxxxxxxxxxx>
>>
>> The cmd_flags in blk_kick_flush() should inherit the original request's
>> cmd_flags, but the current code looks buggy to me:
>
> Should it? I know the code is kinda trying to do it, but does it really
> make sense? Adding Hannes who originally added this inheritance and
> discussing the details below:

I'm not sure, actually I don't get what the current code is doing...
Hope Hannes could provide some details.

blk_flush_complete_seq(rq) -> blk_kick_flush(rq->cmd_flags)

flush_rq will use the cmd_flags of request which just complete a sequence,
there are three cases:

1. blk_insert_flush(rq): rq is pending, wait for flush
2. flush_end_io(flush_rq): rq flush seq done
3. mq_flush_data_end_io(rq): rq data seq done

Only in the 1st case, the rq is the pending request that wait for flush_rq.
In the 2nd and 3rd cases, the rq has nothing to do with the next flush_rq?

So it's more reasonable for flush_rq to use its pending first_rq's cmd_flags?

>
>> flush_rq->cmd_flags = REQ_OP_FLUSH | REQ_PREFLUSH;
>> - flush_rq->cmd_flags |= (flags & REQ_DRV) | (flags & REQ_FAILFAST_MASK);
>> + flush_rq->cmd_flags |= (first_rq->cmd_flags & REQ_DRV) |
>> + (first_rq->cmd_flags & REQ_FAILFAST_MASK);
>
> Two cases here:
>
> 1) REQ_FAILFAST_MASK: I don't think this is actually set on flush request
> currently, and even if it was applying it to the flush that serves more
> than a single originating command seems wrong to me.
> 2) REQ_DRV is only set by drivers that have seen a bio. For dm this
> is used as REQ_DM_POLL_LIST which should never be set for a flush/fua
> request. For nvme-mpath it is REQ_NVME_MPATH, which is set in the
> bio based driver and used for decision making in the I/O completion
> handler. So I guess this one actually does need to get passed
> through.
>

The commit 84fca1b0c461 ("block: pass failfast and driver-specific flags to
flush requests") says:
If flush requests are being sent to the device we need to inherit the
failfast and driver-specific flags, too, otherwise I/O will fail.

1) REQ_FAILFAST_MASK: agree, shouldn't set to the flush_rq I think?
2) REQ_DRV: I don't get why this flag not set would cause I/O fail?

Thanks!