Re: INFO: task hung in fuse_reverse_inval_entry

From: Dmitry Vyukov
Date: Mon Jul 23 2018 - 08:22:35 EST


On Mon, Jul 23, 2018 at 2:12 PM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
> On Mon, Jul 23, 2018 at 10:11 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>> On Mon, Jul 23, 2018 at 9:59 AM, syzbot
>> <syzbot+bb6d800770577a083f8c@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>>> Hello,
>>>
>>> syzbot found the following crash on:
>>>
>>> HEAD commit: d72e90f33aa4 Linux 4.18-rc6
>>> git tree: upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=1324f794400000
>>> kernel config: https://syzkaller.appspot.com/x/.config?x=68af3495408deac5
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=bb6d800770577a083f8c
>>> compiler: gcc (GCC) 8.0.1 20180413 (experimental)
>>> syzkaller repro: https://syzkaller.appspot.com/x/repro.syz?x=11564d1c400000
>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16fc570c400000
>>
>>
>> Hi fuse maintainers,
>>
>> We are seeing a bunch of such deadlocks in fuse on syzbot. As far as I
>> understand this is mostly working-as-intended (parts about deadlocks
>> in Documentation/filesystems/fuse.txt). The intended way to resolve
>> this is aborting connections via fusectl, right?
>
> Yes. An alternative is "umount -f".
>
>> The doc says "Under
>> the fuse control filesystem each connection has a directory named by a
>> unique number". The question is: if I start a process and this process
>> can mount fuse, how do I kill it? I mean: totally and certainly get
>> rid of it right away? How do I find these unique numbers for the
>> mounts it created?
>
> It is the device number found in st_dev for the mount. Other than
> doing stat(2) it is possible to find out the device number by reading
> /proc/$PID/mountinfo (third field).

Thanks. I will try to figure out fusectl connection numbers and see if
it's possible to integrate aborting into syzkaller.
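
For reference, a rough sketch of that lookup in Python. It pulls the
device numbers of fuse mounts out of mountinfo text (third field, as
described above) and maps each one to the "abort" file of its
connection directory. The function names are made up for illustration.
It assumes fusectl is mounted at /sys/fs/fuse/connections, and that the
mount sits on an anonymous device (major 0), so that the connection
directory is named by the minor number. fuseblk mounts are skipped,
since their device number belongs to the backing block device.

```python
FUSECTL_ROOT = "/sys/fs/fuse/connections"  # assumed fusectl mount point


def fuse_connection_ids(mountinfo_text):
    """Return device minors of fuse mounts listed in mountinfo text."""
    ids = []
    for line in mountinfo_text.splitlines():
        fields = line.split()
        # Fields: id, parent, major:minor, root, mount point, options,
        # zero or more optional fields, "-", fs type, source, super opts.
        if "-" not in fields[6:]:
            continue
        sep = fields.index("-", 6)
        if sep + 1 >= len(fields):
            continue
        fstype = fields[sep + 1]
        if fstype != "fuse" and not fstype.startswith("fuse."):
            continue
        major, minor = fields[2].split(":")
        if major != "0":
            continue  # assumption: only anonymous devices are handled
        ids.append(int(minor))
    return ids


def abort_path(conn_id):
    """Path of the abort file for one connection (assumed layout)."""
    return "%s/%d/abort" % (FUSECTL_ROOT, conn_id)


def abort_fuse_mounts(pid):
    """Abort every fuse connection visible in /proc/<pid>/mountinfo."""
    with open("/proc/%d/mountinfo" % pid) as f:
        ids = fuse_connection_ids(f.read())
    for conn_id in ids:
        # Writing anything to the abort file aborts the connection.
        with open(abort_path(conn_id), "w") as f:
            f.write("1")
    return ids
```

abort_fuse_mounts() needs enough privileges to write to the abort file,
and it has to run while /proc/$PID/mountinfo is still readable, that
is, before the process has gone away.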

>> Taking into account that there is usually no
>> operator attached to each server, I wonder if kernel could somehow
>> auto-abort fuse on kill?
>
> Depends on what the fuse server is sleeping on. If it's trying to
> acquire an inode lock (e.g. unlink(2)), which is the classical way to
> deadlock a fuse filesystem, then it will go into an uninterruptible
> sleep. There's no way in which that process can be killed except to
> force a release of the offending lock, which can only be done by
> aborting the request that is being performed while holding that lock.

I understand that it is not killed today, but I am asking if we can
make it killable. It's all code that we can change, and if a human
operator can do it, it can be done purely programmatically on kill too,
right?