Re: kernel 5.15 does not boot with 3ware card (never had this issue <= 5.14) - scsi 0:0:0:0: WARNING: (0x06:0x002C) : Command (0x12) timed out, resetting card

From: Douglas Miller
Date: Mon Nov 08 2021 - 09:16:39 EST


The commit I referenced earlier does point back to the commit that caused the problem (that I saw). There was a series of commits related to IRQ domains; this one seems to be the one that actually caused the problem I saw:

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=a5f3d2c17b07


On 11/7/21 07:46, Justin Piszcz wrote:
On Sat, Nov 6, 2021 at 7:54 AM Justin Piszcz <jpiszcz@xxxxxxxxxxxxxxx> wrote:


-----Original Message-----
From: Bart Van Assche <bvanassche@xxxxxxx>
Sent: Wednesday, November 3, 2021 12:23 PM
To: Justin Piszcz <jpiszcz@xxxxxxxxxxxxxxx>; Douglas Miller <dougmill@xxxxxxxxxxxxxxxxxx>
Cc: LKML <linux-kernel@xxxxxxxxxxxxxxx>; linux-scsi@xxxxxxxxxxxxxxx
Subject: Re: kernel 5.15 does not boot with 3ware card (never had this issue <= 5.14) - scsi 0:0:0:0: WARNING: (0x06:0x002C) : Command (0x12) timed out, resetting card

On 11/3/21 9:18 AM, Justin Piszcz wrote:
Thanks!-- Has anyone else reading run into this issue and/or are there
any suggestions how I can troubleshoot this further (as all -rc's have
the same issue)?
How about bisecting this issue
(https://www.kernel.org/doc/html/latest/admin-guide/bug-bisect.html)?
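
For anyone unfamiliar with what bisecting does, here is a minimal Python sketch of the underlying search. It is not git's implementation, only the idea: given a history ordered oldest to newest whose last entry is known bad, halve the range until the first bad entry is found. The names first_bad and is_bad are hypothetical; in practice is_bad would mean "build this commit, boot it, and see whether the 3ware card times out".

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first bad commit in an oldest-to-newest history.

    Assumes the last commit is known bad and that every commit after
    the first bad one is also bad (the same assumption bisecting makes).
    """
    if not commits:
        raise ValueError("need at least one commit")
    lo, hi = 0, len(commits) - 1  # commits[hi] is known bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid          # first bad commit is at mid or earlier
        else:
            lo = mid + 1      # first bad commit is after mid
    return commits[lo]
```

Each test halves the remaining range, so a few thousand commits between two releases need only about a dozen build-and-boot cycles.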

[ .. ]

I was having some issues finding a list of changes with git bisect, so I started checking the kernel .config and boot parameters:

I found the option that was causing the system not to boot (tested with 5.15.0 and latest linux-git as of 6 NOV 2021)
append="3w-sas.use_msi=1"

3w-sas.use_msi defaults to 0, so with the option removed the driver uses IR-IO-APIC instead of MSI, and the machine now boots with 5.15.
https://lwn.net/Articles/358679/
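
To confirm which interrupt type the driver ended up with, one option is to look at the 3w-sas line in /proc/interrupts. The following is a rough Python sketch of that check, assuming the usual layout of that file (IRQ number, per-CPU counts, interrupt chip name, trigger type, device names). The function names and the sample text are made up for illustration and are not output from the affected machine.

```python
def irq_chips_for(driver: str, interrupts_text: str) -> list:
    """Return the interrupt chip name for every IRQ line that lists `driver`."""
    chips = []
    for line in interrupts_text.splitlines():
        fields = line.split()
        # Only consider lines that start with a numeric IRQ such as "16:".
        if len(fields) < 3 or not fields[0].endswith(":"):
            continue
        if not fields[0][:-1].isdigit():
            continue
        # Shared IRQs list several devices separated by commas.
        names = [f.rstrip(",") for f in fields]
        if driver not in names:
            continue
        # Skip the per-CPU counters; the first non-numeric field is the chip.
        non_counts = [f for f in fields[1:] if not f.isdigit()]
        chips.append(non_counts[0])
    return chips


def uses_msi(driver: str, interrupts_text: str) -> bool:
    """True if any IRQ registered by `driver` is served by an MSI chip."""
    return any("MSI" in chip for chip in irq_chips_for(driver, interrupts_text))


# Illustrative sample only, not taken from the machine in this thread.
SAMPLE = """\
           CPU0       CPU1
 16:       1234          0  IR-IO-APIC   16-fasteoi   3w-sas
 42:          0        567  IR-PCI-MSI 524288-edge      eth0
"""
```

On a live system the text would come from open("/proc/interrupts").read(), and uses_msi("3w-sas", text) should be False when booted without 3w-sas.use_msi=1.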

Something changed between 5.14 and 5.15 in x86_64's handling of Message Signaled Interrupts, and starting with 5.15 the kernel no longer boots when 3w-sas.use_msi=1 is specified.
Removing the option only partially fixes the issue: trying to reboot also results in
a hard lockup on cpu 1 (this is semi-reproducible)
https://installkernel.tripod.com/5.15-reboot-lockup.jpg

Back to 5.14.x for now...



Justin.