Re: stalling IO regression since linux 5.12, through 5.18

From: Holger Hoffstätte
Date: Wed Aug 17 2022 - 14:38:24 EST


On 2022-08-17 20:16, Chris Murphy wrote:


On Wed, Aug 17, 2022, at 5:52 AM, Holger Hoffstätte wrote:

Chris, just a shot in the dark but can you try the patch from

https://lore.kernel.org/linux-block/20220803121504.212071-1-yukuai1@xxxxxxxxxxxxxxx/

on top of something more recent than 5.12? Ideally 5.19 where it applies
cleanly.


This patch applies cleanly on 5.12.0. I can try newer kernels later, but I'm sticking with 5.12 because the problem reproduces so easily there and that is where it first appeared. (For sure we'd prefer to be on the 5.19 series.)

Let me know if I should try it still.

I just started running it in 5.19.2 to see if it breaks something;
no issues so far but then again I didn't have any problems to begin with
and only do peasant I/O load, and no MegaRAID.
However if it applies *and builds* on 5.12 I'd just go ahead and see what
catches fire. But you need to set the megaraid setting to fail, otherwise we
won't be able to see whether this is really a contributing factor,
or indeed the other commit that Jan identified.
Unfortunately 5.12 is a bit old already and most of the other important
fixes to sbitmap.c probably won't apply due to some other blk-mq changes.

In any case the plot thickens, so keep going. :)

-h