Re: [BISECTED] 5.12 hangs at reboot

From: Johannes Berg
Date: Mon Apr 26 2021 - 16:11:50 EST


On Mon, 2021-04-26 at 12:51 -0700, Linus Torvalds wrote:
> On Mon, Apr 26, 2021 at 12:46 PM Johannes Berg
> <johannes@xxxxxxxxxxxxxxxx> wrote:
> >
> > Right. Maybe if it's modules, could try to remove them rather than
> > reboot?
>
> Yes, doing an 'rmmod ath9k' (or whatever that module is called)
> sounds like a good idea, it might trigger the same lockup.
>
> In fact, that might be the reason Harald sees this - maybe Void Linux
> tries to unload modules before rebooting, and other distros don't?

Seems odd if they would, but maybe?

I guess we're well into speculation here now - Harald, even a photo of a
stack dump would help. I'll likely only need an indication of where it's
actually locking up, unless it's in cfg80211_destroy_iface_wk() itself,
and I can't see how that'd be possible.

Looks like with mac80211 this really should just go down into
ieee80211_if_remove() and that looks OK.

And it's coming from a work struct, so I thought maybe some flushing
happened in a bad context, but that's only in wiphy_unregister(),
without the lock(s) held around it, as it should be. Then I figured
maybe wiphy_unregister() itself could be called in a bad context, but
that would've deadlocked by itself earlier, independent of
cfg80211_destroy_iface_wk().
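
To make the "flushing in a bad context" case concrete, here is a small
userspace sketch in Python. It is only an analogy, not kernel code:
big_lock, destroy_iface_work(), flush() and unregister() are made-up
names standing in for a mutex, a work item that needs it, flush_work(),
and the caller. The timeout exists only so the demo returns instead of
hanging the way a real deadlock would.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError

# Hypothetical stand-in for a lock that the work item needs.
big_lock = threading.Lock()


def destroy_iface_work():
    # The work item takes the lock to do its job.
    with big_lock:
        return "iface destroyed"


def flush(future, timeout=0.5):
    # Stand-in for flushing a work item: wait until it has finished.
    # The timeout only keeps this demo from hanging forever.
    try:
        future.result(timeout=timeout)
        return "flushed"
    except TimeoutError:
        return "deadlock"


def unregister(hold_lock_while_flushing):
    with ThreadPoolExecutor(max_workers=1) as pool:
        if hold_lock_while_flushing:
            # Bad context: we wait for the work item while holding the
            # lock it needs, so it can never make progress.
            with big_lock:
                work = pool.submit(destroy_iface_work)
                return flush(work)
        # Good context: flush without the lock held around it.
        work = pool.submit(destroy_iface_work)
        return flush(work)
```

Calling unregister(False) flushes cleanly, while unregister(True) hits
the timeout, which is the userspace equivalent of the hang that was
ruled out above.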


Oh, I have another idea - maybe Void Linux is using iwd instead of
wpa_supplicant, and that insists on doing the netlink owner stuff so
everything is deleted in case it crashes. But I've been looking at the
code pretty much assuming that we get actual calls down, so ...


Dunno. I don't see anything obvious right now, so any additional
information (stack dump, or lockdep report) would be great.

johannes