Re: possible deadlock in send_sigio

From: Eric W. Biederman
Date: Thu Jun 11 2020 - 12:11:57 EST


Waiman Long <longman@xxxxxxxxxx> writes:

> On 4/4/20 1:55 AM, syzbot wrote:
>> Hello,
>>
>> syzbot found the following crash on:
>>
>> HEAD commit: bef7b2a7 Merge tag 'devicetree-for-5.7' of git://git.kerne..
>> git tree: upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=15f39c5de00000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=91b674b8f0368e69
>> dashboard link: https://syzkaller.appspot.com/bug?extid=a9fb1457d720a55d6dc5
>> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1454c3b7e00000
>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12a22ac7e00000
>>
>> The bug was bisected to:
>>
>> commit 7bc3e6e55acf065500a24621f3b313e7e5998acf
>> Author: Eric W. Biederman <ebiederm@xxxxxxxxxxxx>
>> Date: Thu Feb 20 00:22:26 2020 +0000
>>
>> proc: Use a list of inodes to flush from proc
>>
>> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=165c4acde00000
>> final crash: https://syzkaller.appspot.com/x/report.txt?x=155c4acde00000
>> console output: https://syzkaller.appspot.com/x/log.txt?x=115c4acde00000
>>
>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>> Reported-by: syzbot+a9fb1457d720a55d6dc5@xxxxxxxxxxxxxxxxxxxxxxxxx
>> Fixes: 7bc3e6e55acf ("proc: Use a list of inodes to flush from proc")
>>
>> ========================================================
>> WARNING: possible irq lock inversion dependency detected
>> 5.6.0-syzkaller #0 Not tainted
>> --------------------------------------------------------
>> ksoftirqd/0/9 just changed the state of lock:
>> ffffffff898090d8 (tasklist_lock){.+.?}-{2:2}, at: send_sigio+0xa9/0x340 fs/fcntl.c:800
>> but this lock took another, SOFTIRQ-unsafe lock in the past:
>> (&pid->wait_pidfd){+.+.}-{2:2}
>>
>>
>> and interrupts could create inverse lock ordering between them.
>>
>>
>> other info that might help us debug this:
>> Possible interrupt unsafe locking scenario:
>>
>>        CPU0                    CPU1
>>        ----                    ----
>>   lock(&pid->wait_pidfd);
>>                                local_irq_disable();
>>                                lock(tasklist_lock);
>>                                lock(&pid->wait_pidfd);
>>   <Interrupt>
>>     lock(tasklist_lock);
>>
>> *** DEADLOCK ***
>
> That is a false positive. The qrwlock has the special property that it becomes
> unfair (for read lock) at interrupt context. So unless it is taking a write lock
> in the interrupt context, it won't go into deadlock. The current lockdep code
> does not capture the full semantics of qrwlock leading to this false positive.
>

Whatever it was, it was fixed with:
63f818f46af9 ("proc: Use a dedicated lock in struct pid")

It is a classic lock inversion caused by not disabling irqs.

Unless I am completely mistaken, any non-irq code path that does:
	write_lock_irq(&tasklist_lock);
	spin_lock(&pid->lock);

is susceptible to deadlock with:
	spin_lock(&pid->lock);
	<Interrupt>
	read_lock(&tasklist_lock);

Because it remains a lock inversion even with only a read lock taken in
irq context.

Eric