Re: ww_mutex.sh hangs since v5.16-rc1

From: John Stultz
Date: Fri Jul 07 2023 - 16:23:22 EST


On Tue, Nov 30, 2021 at 5:26 PM Li Zhijian <zhijianx.li@xxxxxxxxx> wrote:
>
> LKP/0Day found that ww_mutex.sh cannot complete since v5.16-rc1.
> Unfortunately we failed to bisect to a first bad commit; instead, the
> bisection finally pointed to the merge commit below (91e1c99e17).
>
> Due to this hang, other tests in the same group are also blocked in 0Day,
> so we hope to fix it ASAP.
>
> So if you have any idea about this, or need more debug information, feel free to let me know :)
>
> BTW, ww_mutex.sh failed in v5.15 without hanging, and it looks like the hang cannot be reproduced on a VM.
>

So, as part of the proxy-execution work, I've recently been trying to
understand why the patch series was causing apparent hangs in the
ww_mutex test with large (64) CPU counts.
I was assuming my changes were causing a lost wakeup somehow, but as I
dug in, it looked like the stress_inorder_work() function was
live-locking.

I noticed that adding printks to the logic would change the behavior,
and finally realized I could reproduce a livelock against mainline by
adding a printk before the "return -EDEADLK;" in __ww_mutex_kill(),
making it clear the logic is timing-sensitive. Then, searching around,
I found this old, unresolved thread.
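
For reference, the perturbation that made this reproducible against
mainline was nothing fancier than something along these lines (the
message text here is just illustrative):

    /* in __ww_mutex_kill(), trimmed */
    if (ww_ctx->acquired > 0) {
        /* added purely to perturb the timing */
        printk("%s: returning -EDEADLK\n", __func__);
        return -EDEADLK;
    }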

Part of the issue is that we may never reach the timeout check at the
end of the do-while loop, as the -EDEADLK case short-cuts back to the
retry label, allowing the test to effectively get stuck.
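
To make that concrete, the retry logic in stress_inorder_work()
(kernel/locking/test-ww_mutex.c) is shaped roughly like the sketch
below (trimmed and paraphrased from memory, locals elided, so treat it
as a sketch rather than the exact code):

    do {
        int contended = -1;
        int n, err;

        ww_acquire_init(&ctx, &ww_class);
    retry:
        err = 0;
        for (n = 0; n < nlocks; n++) {
            if (n == contended)
                continue;   /* already held via ww_mutex_lock_slow() */
            err = ww_mutex_lock(&locks[order[n]], &ctx);
            if (err < 0)
                break;
        }
        if (!err)
            dummy_load(stress);   /* hold all the locks briefly */

        /* drop everything we hold */
        if (contended > n)
            ww_mutex_unlock(&locks[order[contended]]);
        contended = n;
        while (n--)
            ww_mutex_unlock(&locks[order[n]]);

        if (err == -EDEADLK) {
            /* back off: take the contended lock first, then retry,
             * never reaching the timeout check below */
            ww_mutex_lock_slow(&locks[order[contended]], &ctx);
            goto retry;
        }

        ww_acquire_fini(&ctx);
    } while (!time_after(jiffies, stress->timeout));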

Now, I know ww_mutexes are supposed to guarantee forward progress,
since the older context wins, but it's not clear to me that this works
here. The -EDEADLK case results in releasing and reacquiring the locks
(only with the contended lock taken first), and if a second -EDEADLK
occurs, the test starts over again from scratch (though with the new
contended lock chosen first instead), which seems to lose any progress.
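
For comparison, the documented usage pattern (loosely paraphrased from
Documentation/locking/ww-mutex-design.rst; the struct and variable
names here are just illustrative) backs off the same structural way -
drop everything, take the contended lock with ww_mutex_lock_slow(),
and retry - with forward progress meant to come from the stamp
ordering (the oldest context eventually wins each contention) rather
than from the caller retaining locks across the retry:

    struct obj { struct ww_mutex lock; /* ... */ };
    struct obj_entry { struct list_head head; struct obj *obj; };
    static DEFINE_WW_CLASS(ww_class);

    static int lock_objs(struct list_head *list, struct ww_acquire_ctx *ctx)
    {
        struct obj_entry *contended = NULL, *entry;
        struct obj *held = NULL;
        int ret;

        ww_acquire_init(ctx, &ww_class);
    retry:
        list_for_each_entry(entry, list, head) {
            if (entry->obj == held) {   /* taken via lock_slow() below */
                held = NULL;
                continue;
            }
            ret = ww_mutex_lock(&entry->obj->lock, ctx);
            if (ret < 0) {
                contended = entry;
                goto err;
            }
        }
        ww_acquire_done(ctx);
        return 0;

    err:
        /* drop everything taken so far in this pass... */
        list_for_each_entry_continue_reverse(entry, list, head)
            ww_mutex_unlock(&entry->obj->lock);
        /* ...and the previously contended lock, if we hadn't reached it */
        if (held)
            ww_mutex_unlock(&held->lock);

        if (ret == -EDEADLK) {
            /* lost the stamp race: sleep on the contended lock, then retry */
            ww_mutex_lock_slow(&contended->obj->lock, ctx);
            held = contended->obj;
            goto retry;
        }
        ww_acquire_fini(ctx);
        return ret;
    }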

So maybe the test has broken that guarantee in how it restarts, or,
with 128 threads trying to acquire 16 locks in a random order without
hitting contention (and the order shifting slightly each time they do
see contention), it may simply be a very large space to resolve unless
we luck into good timing.

Anyway, I wanted to get some feedback from folks who have a better
theoretical understanding of ww_mutexes. With large CPU counts, are we
just asking for trouble here? Is the test doing something wrong? Or is
there possibly a ww_mutex bug underneath this?

thanks
-john