Re: [Regression][BISECTED] kernel boot hang after 19898ce9cf8a ("wifi: iwlwifi: split 22000.c into multiple files")

From: Thorsten Leemhuis
Date: Sat Jul 08 2023 - 10:18:30 EST


On 07.07.23 12:55, Linux regression tracking (Thorsten Leemhuis) wrote:
> On 07.07.23 10:25, Zhang, Rui wrote:
>>
>> I ran into a NULL pointer dereference and kernel boot hang after
>> switching to the latest upstream kernel. git bisect shows that the
>> commit below is the first offending commit, and I have confirmed that
>> commit 19898ce9cf8a has the issue while 19898ce9cf8a~1 does not.
>
> FWIW, this is the fourth such report about this that I'm aware of.
>
> The first is this one (with two affected users afaics):
> https://bugzilla.kernel.org/show_bug.cgi?id=217622
>
> The second is this one:
> https://lore.kernel.org/all/CAAJw_Zug6VCS5ZqTWaFSr9sd85k%3DtyPm9DEE%2BmV%3DAKoECZM%2BsQ@xxxxxxxxxxxxxx/
>
> The third:
> https://lore.kernel.org/all/9274d9bd3d080a457649ff5addcc1726f08ef5b2.camel@xxxxxxxxxxx/
>
> And in the past few days two people from Fedora land talked to me on IRC
> about problems that in retrospect might be caused by this as well.

I got confirmation: one of those cases is also caused by 19898ce9cf8a.
But I'm writing for a different reason:

Larry (now CCed) looked at the culprit and spotted something that looked
suspicious to him; he posted a patch and is looking for testers:
https://lore.kernel.org/all/0068af47-e475-7e8d-e476-c374e90dff5f@xxxxxxxxxxxx/

Ciao, Thorsten

> This many reports about a problem at this stage of the cycle makes me
> suspect we'll see a lot more once -rc1 is out. That's why I'm raising
> awareness of this. Sadly, a simple revert of just this commit is not
> possible. :-/
>
> Ciao, Thorsten
>
>> commit 19898ce9cf8a33e0ac35cb4c7f68de297cc93cb2 (refs/bisect/bad)
>> Author: Johannes Berg <johannes.berg@xxxxxxxxx>
>> AuthorDate: Wed Jun 21 13:12:07 2023 +0300
>> Commit: Johannes Berg <johannes.berg@xxxxxxxxx>
>> CommitDate: Wed Jun 21 14:07:00 2023 +0200
>>
>> wifi: iwlwifi: split 22000.c into multiple files
>>
>> Split the configuration list in 22000.c into four new files,
>> per new device family, so we don't have this huge unusable
>> file. Yes, this duplicates a few small things, but that's
>> still much better than what we have now.
>>
>> Signed-off-by: Johannes Berg <johannes.berg@xxxxxxxxx>
>> Signed-off-by: Gregory Greenman <gregory.greenman@xxxxxxxxx>
>> Link:
>> https://lore.kernel.org/r/20230621130443.7543603b2ee7.Ia8dd54216d341ef1ddc0531f2c9aa30d30536a5d@changeid
>> Signed-off-by: Johannes Berg <johannes.berg@xxxxxxxxx>
>>
>> I have some screenshots which show that RIP points to iwl_mem_free_skb,
>> I can create a kernel bugzilla and attach the screenshots there if
>> needed.
>>
>> BTW, lspci output of the wifi device and git bisect log attached.
>>
>> If any other information is needed, please let me know.
>
> --
> Everything you wanna know about Linux kernel regression tracking:
> https://linux-regtracking.leemhuis.info/about/#tldr
> That page also explains what to do if mails like this annoy you.
>
> P.S.: for regzbot
>
> #regzbot ^introduced 19898ce9cf8a
> #regzbot dup-of:
> https://lore.kernel.org/all/a5cdc7f8-b340-d372-2971-0d24b01de217@xxxxxxxxx/