Re: [syzbot] [fs?] INFO: task hung in pipe_release (4)

From: David Howells
Date: Fri Jul 28 2023 - 19:53:16 EST


Jakub Kicinski <kuba@xxxxxxxxxx> wrote:

> Hi David, any ideas about this one? Looks like it triggers on fairly
> recent upstream?

I've managed to reproduce it finally. Instrumenting the pipe_lock/unlock
functions, splice_to_socket() and pipe_release() seems to show that
pipe_release() is being called whilst splice_to_socket() is still running.

I *think* syzbot is arranging things such that splice_to_socket() takes a
significant amount of time so that another thread can close the socket as it
exits.

In this sample logging, the pipe is created by pid 7101:

[ 66.205719] --pipe 7101
[ 66.209942] lock
[ 66.212526] locked
[ 66.215344] unlock
[ 66.218103] unlocked

splice begins in 7101 also and locks the pipe:

[ 66.221057] ==>splice_to_socket() 7101
[ 66.225596] lock
[ 66.228177] locked

but for some reason, pid 7100 then tries to release it:

[ 66.377781] release 7100

and hangs on the __pipe_lock() call in pipe_release():

[ 66.381059] lock
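
For anyone following along, the ordering in that log can be modelled in userspace. This is only a rough sketch and not kernel code: a plain threading.Lock stands in for the pipe mutex, a sleep stands in for whatever keeps splice_to_socket() busy, and simulate(), splicer(), releaser() and the timeout are all made-up names and values for illustration. The real __pipe_lock() has no timeout, which is why the task hangs rather than giving up.

```python
import threading
import time


def simulate(hold_seconds=0.5, wait_seconds=0.1):
    """Model one thread holding the pipe mutex while another tries to release the pipe.

    Returns the ordered list of log messages. All names are hypothetical.
    """
    pipe_mutex = threading.Lock()   # stands in for the pipe mutex
    log = []
    log_guard = threading.Lock()    # protects the log list itself
    holding = threading.Event()     # set once the splicer holds the mutex

    def note(msg):
        with log_guard:
            log.append(msg)

    def splicer():
        # Models the splice path: take the pipe lock, then stay busy for a while.
        note("==>splice_to_socket()")
        note("lock")
        with pipe_mutex:
            note("locked")
            holding.set()
            time.sleep(hold_seconds)  # assumption: stands in for the slow send

    def releaser():
        # Models the release path: it needs the same mutex.
        holding.wait()
        note("release")
        note("lock")
        # The timeout exists only so that this model terminates.
        if pipe_mutex.acquire(timeout=wait_seconds):
            note("locked")
            pipe_mutex.release()
        else:
            note("timed out")

    threads = [threading.Thread(target=splicer), threading.Thread(target=releaser)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return log


if __name__ == "__main__":
    for line in simulate():
        print(line)
```

With the default arguments the releaser never gets as far as "locked", which matches the tail of the log above. If the hold time is shorter than the wait time, the releaser gets the lock as soon as the splicer drops it.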

The syz reproducer does weird things with threading - and I'm wondering if
there's a file struct refcount bug here. Note that splice_to_socket() can't
access the pipe file structs to alter the refcount, and the involved pipe
isn't communicated to udp_sendmsg() in any way - so if there is a refcount
bug, it must be somewhere in the VFS, the pipe driver or the splice
infrastructure:-/.

I'm also not sure what's going on inside udp_sendmsg() as yet. It doesn't
show a stack in /proc/7101/stack, which means it doesn't hit a schedule().

David