Re: [REGRESSION] 6.7.1: md: raid5 hang and unresponsive system; successfully bisected

From: Linux regression tracking (Thorsten Leemhuis)
Date: Wed Mar 06 2024 - 03:38:44 EST


On 02.03.24 01:05, Song Liu wrote:
> On Fri, Mar 1, 2024 at 3:12 PM Dan Moulding <dan@xxxxxxxx> wrote:
>>
>>> 5. Looks like the block layer or underlying(scsi/virtio-scsi) may have
>>> some issue which leading to the io request from md layer stayed in a
>>> partial complete statue. I can't see how this can be related with the
>>> commit bed9e27baf52 ("Revert "md/raid5: Wait for MD_SB_CHANGE_PENDING in
>>> raid5d"")
>>
>> There is no question that the above mentioned commit makes this
>> problem appear. While it may be that ultimately the root cause lies
>> outside the md/raid5 code (I'm not able to make such an assessment), I
>> can tell you that change is what turned it into a runtime
>> regression. Prior to that change, I cannot reproduce the problem. One
>> of my RAID-5 arrays has been running on every kernel version since
>> 4.8, without issue. Then kernel 6.7.1 the problem appeared within
>> hours of running the new code and affected not just one but two
>> different machines with RAID-5 arrays. With that change reverted, the
>> problem is not reproducible. Then when I recently upgraded to 6.8-rc5
>> I immediately hit the problem again (because it hadn't been reverted
>> in the mainline yet). I'm now running 6.8.0-rc5 on one of my affected
>> machines without issue after reverting that commit on top of it.
> [...]
> I also tried again to reproduce the issue, but haven't got luck. While
> I will continue try to repro the issue, I will also send the revert to 6.8
> kernel.

Is that revert on the way meanwhile? I'm asking because Linus might
release 6.8 on Sunday.

Ciao, Thorsten (wearing his 'the Linux kernel's regression tracker' hat)
--
Everything you wanna know about Linux kernel regression tracking:
https://linux-regtracking.leemhuis.info/about/#tldr
If I did something stupid, please tell me, as explained on that page.