Re: [PATCH 05/14] mwifiex: re-register wiphy across reset

From: Johannes Berg
Date: Wed Jun 28 2017 - 03:29:17 EST


On Tue, 2017-06-27 at 13:48 -0700, Brian Norris wrote:

> > There isn't really a good way to do this. You can, of course, call
> > wiphy_unregister(), but if you could do that you'd already have the
> > problem solved, I think?
>
> That's probably along the right track. There are still some things
> we'd need to do properly before that though, and this is where all
> the problems are so far. (Also, this is what Kalle was already
> objecting to; he didn't think we should be unregistering/recreating
> the wiphy, but I think he ended up softening on that a bit.)
>
> For one, I still expect I should be removing the wireless dev's
> before unregistering the wihpy, no? Otherwise, there will be existing
> wdevs backed by an unregistered wiphy?

Yeah, that's true - though once you get rid of those they can't be
accessed any more.

> And that gets to the heart of another bug: deleting interfaces (e.g.,
> "iw dev foo del") races with a lot of stuff -- like see
>
> mwifiex_process_sta_event() ->
> Â EVENT_EXT_SCAN_REPORT ->
> ÂÂÂÂnetif_running(priv->netdev)
>
> Because mwifiex_del_virtual_intf() doesn't stop any outstanding
> commands, we can be both deleting the netdev and processing scans for
> it.

Huh, well, I guess you need some kind of locking here anyway, since the
user can always do things like deleting the interface while a scan is
running?

> > > Also, IIUC, we need to wait for all control paths to complete (or
> > > cancel) before we can free up the associated resources; so just
> > > marking "unavailable" isn't enough.
> >
> > Yeah, I suppose so. Though if you just do all the freeing after
> > wiphy_unregister() it'll do that for you?
>
> Yes, I think so. Then part of the problem is probably that some of
> the current "cancel command" logic is tied up with the "free command
> structures" logic. So we're freeing some stuff too early.
>
> Anyway, those sorts of bugs aside, IIUC the full sequence for
> teardown should probably be something like:
>
> 1. Stop TX queues
> 2. Cancel outstanding commands (let them fail or finish, etc.) -- but
> ÂÂÂDON'T free their backing resources yet
> 3. Remove wdevs
> 4. wiphy_unregister()
> 5. Free up resources
>
> Current problems are at least:
>
> * we don't do step 4 in the right place (if at all; see this patch)
> * step 2 mixes in "free"ing resources too early

So I'm not sure what you mean by splitting in 2/5 - this seems
reasonable, but I don't understand why something like a scan request
wouldn't be freed while you cancel it in 2? In fact, you really have to
free it before you remove the corresponding wdev, or cfg80211 will
complain?

johannes