Re: [Bug] WARNING in static_key_disable_cpuslocked

From: Jason Baron
Date: Wed Mar 06 2024 - 19:06:17 EST

On 3/6/24 5:16 PM, Josh Poimboeuf wrote:
On Wed, Mar 06, 2024 at 03:12:07PM -0500, Jason Baron wrote:


On 3/6/24 2:31 PM, Josh Poimboeuf wrote:
On Wed, Mar 06, 2024 at 10:54:20AM -0500, Steven Rostedt wrote:
Now I guess the question is, why is something trying to disable something
that is not enabled? Is the above scenario OK? Or should the users of
static_key also prevent this?

Apparently that's an allowed scenario, as the jump label code seems to
be actively trying to support it. Basically the last one "wins".

See for example:

1dbb6704de91 ("jump_label: Fix concurrent static_key_enable/disable()")

Also the purpose of the first atomic_read() is to do a quick test before
grabbing the jump lock. So instead of grabbing the jump lock earlier,
it should actually do the first test atomically:

Makes sense, but the enable path can also set key->enabled to -1.

Ah, this code is really subtle :-/

So I think a concurrent disable could then see the -1 in tmp and still
trigger the WARN.

I think this shouldn't be possible, for the same reason that
static_key_slow_try_dec() warns on -1: key->enabled can only be -1
during the first enable. And disable should never be called before
then.
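
For reference, the key->enabled lifecycle described here can be modeled in
userspace roughly as below. This is only a sketch on C11 atomics, assuming
the locked enable path stores -1, patches, then stores 1. model_key,
model_enable() and the update hook are hypothetical names, not kernel code.

```c
#include <stdatomic.h>

/* Hypothetical userspace stand-in for struct static_key. */
struct model_key {
	atomic_int enabled;	/* 0 = off, -1 = first enable in progress, 1 = on */
};

/* Hypothetical hook standing in for jump_label_update(). */
typedef void (*model_update_fn)(struct model_key *key, void *ctx);

/* Assumed sequence of the locked enable path: 0 -> -1 -> (patch) -> 1. */
static void model_enable(struct model_key *key, model_update_fn update, void *ctx)
{
	if (atomic_load(&key->enabled) != 0)
		return;		/* already enabled, or an enable is in flight */

	atomic_store(&key->enabled, -1);	/* transient value a reader may see */
	if (update)
		update(key, ctx);
	atomic_store_explicit(&key->enabled, 1, memory_order_release);
}

/* Records the value a concurrent reader would observe mid-update. */
static void model_record_enabled(struct model_key *key, void *ctx)
{
	*(int *)ctx = atomic_load(&key->enabled);
}
```

Passing model_record_enabled as the hook shows the only window in which -1
is visible in this model: between the first store and the release store.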

Hmm, right, but in this case the reproducer writes to a sysfs file to
enable/disable randomly, so I'm not sure anything enforces that ordering.
You could try the reproducer; I haven't looked at it in any detail.

The code in question is in mm/vmscan.c, which already takes the local
'state_mutex' in some cases. I think that could easily be extended to
avoid this warning.
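
A rough userspace sketch of what that serialization could look like,
assuming every enable/disable caller takes the same mutex. pthread_mutex_t
stands in for the kernel mutex here, and model_state_mutex,
model_key_enabled and model_set_enabled() are hypothetical names, not
anything taken from mm/vmscan.c.

```c
#include <pthread.h>
#include <stdbool.h>

/*
 * Hypothetical stand-ins: a lock playing the role of state_mutex and an
 * int playing the role of key->enabled.
 */
static pthread_mutex_t model_state_mutex = PTHREAD_MUTEX_INITIALIZER;
static int model_key_enabled;
static int model_transitions;	/* counts real 0 <-> 1 flips */

/*
 * All writers go through here, so in this model a disable can never
 * overlap an enable that is still in its transient state.
 */
static void model_set_enabled(bool on)
{
	pthread_mutex_lock(&model_state_mutex);
	if (model_key_enabled != (int)on) {
		model_key_enabled = on;
		model_transitions++;
	}
	pthread_mutex_unlock(&model_state_mutex);
}
```

With the lock held across the whole transition, repeated or racing writes
collapse into ordered 0 -> 1 and 1 -> 0 flips, which is the ordering the
WARN implicitly relies on.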


So I think we could change the WARN to be:
WARN_ON_ONCE(tmp != 0 && tmp != -1). And also add a similar check
for enable if we have enable vs enable racing?
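
To make the two candidate checks concrete, here is a small userspace sketch
of the disable fast path working from a single snapshot of key->enabled.
The function names are hypothetical and the structure is assumed, not
copied from kernel/jump_label.c.

```c
#include <stdbool.h>

/* True when the modeled disable fast path returns without the jump lock. */
static bool model_disable_is_fast_exit(int tmp)
{
	return tmp != 1;
}

/* Strict check: any snapshot other than 0 on the fast exit warns. */
static bool model_disable_warns_strict(int tmp)
{
	return model_disable_is_fast_exit(tmp) && tmp != 0;
}

/* Relaxed check proposed above: also tolerate the transient -1. */
static bool model_disable_warns_relaxed(int tmp)
{
	return model_disable_is_fast_exit(tmp) && tmp != 0 && tmp != -1;
}
```

The two predicates differ only for a snapshot of -1: the strict form warns
and the relaxed form stays quiet. Any other unexpected value, such as 2,
still warns in both.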

My patch subtly changed the "key->enabled > 0" to "key->enabled != 0".
If I change that back then it should be fine.
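
A hedged sketch of why that comparison matters, assuming the enable fast
path returns early when its test is true and warns if the snapshot is not
1. The enum and function names are hypothetical, and this models only the
decision, not the patching itself.

```c
#include <stdbool.h>

/* Outcome of the modeled enable fast path for one snapshot of key->enabled. */
enum model_enable_outcome {
	MODEL_TAKE_LOCK,	/* fall through to the locked slow path */
	MODEL_RETURN_QUIET,	/* early return, no warning */
	MODEL_RETURN_WARN,	/* early return, WARN fires */
};

/* use_gt_zero selects "tmp > 0"; otherwise the test is "tmp != 0". */
static enum model_enable_outcome model_enable_fast_path(int tmp, bool use_gt_zero)
{
	bool early = use_gt_zero ? (tmp > 0) : (tmp != 0);

	if (!early)
		return MODEL_TAKE_LOCK;
	return tmp != 1 ? MODEL_RETURN_WARN : MODEL_RETURN_QUIET;
}
```

In this model a snapshot of -1 (another enable in flight) warns under
"!= 0" but falls through to the lock under "> 0", where the second enable
simply waits for the first to finish.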

Although it seems like setting key->enabled to -1, while it is used in the
inc/dec API, isn't really doing anything in the enable/disable part here?
But then I think the key->enabled update has to move in front of the
jump_label_update() to make that part work right...

Yeah, this code needs better comments. Let me turn it into a proper
patch.