Re: [REGRESSION] 6.7.1: md: raid5 hang and unresponsive system; successfully bisected

From: Dan Moulding
Date: Sun Mar 10 2024 - 00:13:33 EST


> Dan, can you try the following patch?
>
> diff --git a/block/blk-core.c b/block/blk-core.c
> index de771093b526..474462abfbdc 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1183,6 +1183,7 @@ void __blk_flush_plug(struct blk_plug *plug, bool from_schedule)
> if (unlikely(!rq_list_empty(plug->cached_rq)))
> blk_mq_free_plug_rqs(plug);
> }
> +EXPORT_SYMBOL(__blk_flush_plug);
>
> /**
> * blk_finish_plug - mark the end of a batch of submitted I/O
> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
> index 8497880135ee..26e09cdf46a3 100644
> --- a/drivers/md/raid5.c
> +++ b/drivers/md/raid5.c
> @@ -6773,6 +6773,11 @@ static void raid5d(struct md_thread *thread)
> spin_unlock_irq(&conf->device_lock);
> md_check_recovery(mddev);
> spin_lock_irq(&conf->device_lock);
> + } else {
> + spin_unlock_irq(&conf->device_lock);
> + blk_flush_plug(&plug, false);
> + cond_resched();
> + spin_lock_irq(&conf->device_lock);
> }
> }
> pr_debug("%d stripes handled\n", handled);

This patch seems to work! I can no longer reproduce the problem after
applying this.

Thanks,

-- Dan