Re: [PATCH] wifi: mac80211: tx: Add __must_hold() annotation

From: Johannes Berg
Date: Mon Jan 15 2024 - 08:15:33 EST


On Sat, 2024-01-13 at 08:32 +0200, Kalle Valo wrote:
>
> > static void ieee80211_set_beacon_cntdwn(struct ieee80211_sub_if_data *sdata,
> > struct beacon_data *beacon,
> > struct ieee80211_link_data *link)
> > + __must_hold(link)
>
> Oh, never seen __must_hold() before and looks very useful. So does this
> work with RCU, mutexes and spinlocks?
>
> In case others are interested, here's the documentation I was able to find:
>
> https://docs.kernel.org/dev-tools/sparse.html#using-sparse-for-lock-checking
>

Except it's not actually useful, and looks more useful than it is. IMHO
it's actually more harmful than anything else.

One might even consider this patch a good example! The function
ieee80211_set_beacon_cntdwn() is called from a number of places in this
file, some of which acquire RCU critical section, and some of which
acquire no locks nor RCU critical section at all. Most of them nest and
are called in RCU.

However, there's basically no way to get sparse to warn on this. Even
inserting a function

void test(void);
void test(void)
{
ieee80211_set_beacon_cntdwn(NULL, NULL, NULL);
}

will not cause sparse to complain, where this *clearly* doesn't hold an
locks.


Also, as we (should) all know, the argument to __acquires(),
__releases() and __must_check() is pretty much ignored. I tried to fix
this in sparse many years ago, some code even got merged (and then
reverted), and if the experience tells me anything then that it's pretty
much not fixable.

__acquires() and __releases() at least are useful for tracking that you
don't have a mismatch, e.g. a function that __acquires() but then takes
a lock in most paths but forgot one, for example. With __must_hold(),
this really isn't the case.

And then we could argue that at least it has a documentation effect, but
.. what does it even mean to "hold 'link'"? There isn't even a lock,
mutex or otherwise, in the link. You can't "own" a reference to it, or
anything like that. The closest thing in current kernels would be to
maybe see if you have the wiphy mutex, but that's likely not the case in
these paths and RCU was used to get to the link struct ...


IOW, I find this lacking from an implementation/validation point of
view, and lacking if not outright confusing from a documentation point
of view. Much better to put something lockdep_assert_held() or similar
into the right places.

As for your comment about RCU in ath11k (which points back to this
thread): I don't find

RCU_LOCKDEP_WARN(!rcu_read_lock_held());
or
WARN_ON_ONCE(!rcu_read_lock_held());

very persuasive, it's much better to have it checked with
rcu_dereference_protected(), rcu_dereference_check(), the condition
argument to list_for_each_rcu(), or (in the case of wiphy) our wrappers
around these like wiphy_dereference(). I cannot think of any case where
you'd want to ensure that some code is in an RCU critical section
without it actually using RCU - and if it does you have
rcu_dereference() and all those things that (a) check anyway, and also
(b) serve as their own documentation.


Anyway, long story short: I don't see value in this patch and won't be
applying it unless somebody here can convince me otherwise, ideally
addressing the concerns stated above.

johannes