Re: [PATCH] p9: trans_fd: Fix deadlock when connection cancel

From: asmadeus
Date: Wed Aug 31 2022 - 16:42:31 EST


Schspa Shi wrote on Thu, Sep 01, 2022 at 02:09:50AM +0800:
> To fix it, we can add extra reference counter to avoid deadlock, and
> decrease it after we unlock the client->lock.

Thanks for the patch!

Unfortunately I already sent a slightly different version to the list,
hidden in another syzbot thread, here:
https://lkml.kernel.org/r/YvyD053bdbGE9xoo@xxxxxxxxxxxxx

(yes, sorry, not exactly somewhere I'd expect someone to find it... 9p
hasn't had many contributors recently)


Basically, instead of taking an extra reference, I just release the
client lock before calling p9_client_cb, so it shouldn't hang anymore.

We don't need the lock to call the cb: once we are in p9_conn_cancel we
no longer accept any new request, and by that point the pending requests
have been moved to a local list that isn't shared with anyone else.
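
To make the idea concrete, here is a rough userspace sketch of the
pattern, not the actual trans_fd code. Every toy_* name is made up for
illustration, the lock is a fake one that only counts a second acquire
where a real non-recursive lock would hang, and it assumes the callback
path may want the same lock again:

```c
#include <assert.h>
#include <stddef.h>

/* Fake non-recursive lock (hypothetical): acquiring it while it is
 * already held is where a real spinlock would hang, so count it. */
struct toy_lock {
	int held;
	int self_deadlocks;
};

static void toy_lock_acquire(struct toy_lock *l)
{
	if (l->held)
		l->self_deadlocks++;	/* a real lock would deadlock here */
	l->held = 1;
}

static void toy_lock_release(struct toy_lock *l)
{
	l->held = 0;
}

struct toy_req {
	struct toy_req *next;
	int completed;
};

struct toy_client {
	struct toy_lock lock;
	struct toy_req *pending;	/* shared list, protected by lock */
	int err;			/* non-zero: no new requests accepted */
};

/* Stand-in for the completion callback; assumed to take the lock. */
static void toy_client_cb(struct toy_client *c, struct toy_req *r)
{
	toy_lock_acquire(&c->lock);
	r->completed = 1;
	toy_lock_release(&c->lock);
}

/* Cancel everything pending; returns how many requests were completed. */
static int toy_conn_cancel(struct toy_client *c, int err)
{
	struct toy_req *local, *next;
	int n = 0;

	assert(c != NULL);

	toy_lock_acquire(&c->lock);
	if (c->err) {			/* already cancelled, nothing to do */
		toy_lock_release(&c->lock);
		return 0;
	}
	c->err = err;			/* from now on nothing new is queued */
	local = c->pending;		/* detach the whole list ... */
	c->pending = NULL;		/* ... so only we can see it */
	toy_lock_release(&c->lock);	/* drop the lock before the callbacks */

	for (; local; local = next) {
		next = local->next;
		local->next = NULL;
		toy_client_cb(c, local);	/* safe: lock is not held */
		n++;
	}
	return n;
}
```

If the callbacks were run before the release instead, the counter in the
fake lock would go up once per request, which is the hang in question.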

If you have a test setup, would you mind testing my patch?
That's the main reason I was delaying pushing it.

Since you went out of your way to make this patch, if you agree with my
approach I don't mind adding your Signed-off-by or another tag crediting
your work on it.

Thank you,
--
Dominique