Re: possible deadlock in free_ioctx_users

From: Miklos Szeredi
Date: Mon Sep 10 2018 - 05:50:48 EST


On Mon, Sep 10, 2018 at 11:43 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> On Mon, Sep 10, 2018 at 11:28 AM, Miklos Szeredi <miklos@xxxxxxxxxx> wrote:
>> On Sun, Sep 9, 2018 at 8:41 PM, syzbot
>> <syzbot+d86c4426a01f60feddc7@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote:
>>> Hello,
>>>
>>> syzbot found the following crash on:
>>>
>>> HEAD commit: f8f65382c98a Merge tag 'for-linus' of git://git.kernel.org..
>>> git tree: upstream
>>> console output: https://syzkaller.appspot.com/x/log.txt?x=113260ae400000
>>> kernel config: https://syzkaller.appspot.com/x/.config?x=8f59875069d721b6
>>> dashboard link: https://syzkaller.appspot.com/bug?extid=d86c4426a01f60feddc7
>>> compiler: gcc (GCC) 8.0.1 20180413 (experimental)
>>> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=120baa9e400000
>>> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13979cbe400000
>>>
>>> IMPORTANT: if you fix the bug, please add the following tag to the commit:
>>> Reported-by: syzbot+d86c4426a01f60feddc7@xxxxxxxxxxxxxxxxxxxxxxxxx
>>>
>>> random: sshd: uninitialized urandom read (32 bytes read)
>>> random: sshd: uninitialized urandom read (32 bytes read)
>>> random: sshd: uninitialized urandom read (32 bytes read)
>>>
>>> ========================================================
>>> WARNING: possible irq lock inversion dependency detected
>>> 4.19.0-rc2+ #229 Not tainted
>>> --------------------------------------------------------
>>> swapper/0/0 just changed the state of lock:
>>> 00000000c02bddef (&(&ctx->ctx_lock)->rlock){..-.}, at: spin_lock_irq include/linux/spinlock.h:354 [inline]
>>> 00000000c02bddef (&(&ctx->ctx_lock)->rlock){..-.}, at: free_ioctx_users+0xbc/0x710 fs/aio.c:603
>>> but this lock took another, SOFTIRQ-unsafe lock in the past:
>>> (&fiq->waitq){+.+.}
>>>
>>>
>>> and interrupts could create inverse lock ordering between them.
>>>
>>>
>>> other info that might help us debug this:
>>> Possible interrupt unsafe locking scenario:
>>>
>>>        CPU0                    CPU1
>>>        ----                    ----
>>>   lock(&fiq->waitq);
>>>                                local_irq_disable();
>>>                                lock(&(&ctx->ctx_lock)->rlock);
>>>                                lock(&fiq->waitq);
>>>   <Interrupt>
>>>     lock(&(&ctx->ctx_lock)->rlock);
>>
>> The fuse device doesn't support AIO ops, so this is a false positive, AFAICS.
>
> Hi Miklos,
>
> We still need to annotate this. How?

Good question.

Isn't lockdep assuming too much here? It hasn't shown that this
particular ctx_lock instance was actually taken from interrupt context,
has it?

Thanks,
Miklos
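
For reference, a minimal userspace sketch of the question above. Lockdep
tracks usage per lock class, not per lock instance, so a softirq
acquisition of one ctx_lock instance is combined with a dependency
recorded through a different instance. Everything named toy_* below is
hypothetical and only models that bookkeeping; it is not kernel code and
says nothing about whether this particular report is a false positive.

```c
/* Toy model of class-based lock tracking (hypothetical, userspace only). */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

enum { CTX_LOCK, FIQ_WAITQ, NR_CLASSES };

/* State is kept per class, so all instances of a class share it. */
struct toy_class {
    bool used_in_softirq;               /* some instance taken in softirq context */
    bool taken_softirq_enabled;         /* some instance taken with softirqs on */
    bool held_while_taking[NR_CLASSES]; /* [c]: class c taken while this was held */
};

struct toy_lock { int class_id; };

static struct toy_class classes[NR_CLASSES];

static void toy_reset(void)
{
    memset(classes, 0, sizeof(classes));
}

/* Record one acquisition of 'lock', optionally while 'held' is held. */
static void toy_acquire(const struct toy_lock *held, const struct toy_lock *lock,
                        bool in_softirq, bool softirqs_enabled)
{
    assert(lock->class_id >= 0 && lock->class_id < NR_CLASSES);
    if (in_softirq)
        classes[lock->class_id].used_in_softirq = true;
    if (softirqs_enabled)
        classes[lock->class_id].taken_softirq_enabled = true;
    if (held != NULL)
        classes[held->class_id].held_while_taking[lock->class_id] = true;
}

/* True if class 'outer' is used in softirq and was held while taking
 * class 'inner', which is also taken with softirqs enabled. */
static bool toy_inversion(int outer, int inner)
{
    return classes[outer].used_in_softirq &&
           classes[outer].held_while_taking[inner] &&
           classes[inner].taken_softirq_enabled;
}

/* Replay the events from the report using two distinct ctx_lock instances. */
static bool toy_replay_report(bool include_softirq_acquire)
{
    struct toy_lock ctx_a = { CTX_LOCK }, ctx_b = { CTX_LOCK };
    struct toy_lock waitq = { FIQ_WAITQ };

    toy_reset();
    toy_acquire(NULL, &waitq, false, true);     /* waitq taken, softirqs enabled */
    toy_acquire(NULL, &ctx_a, false, false);    /* instance A, irqs disabled */
    toy_acquire(&ctx_a, &waitq, false, false);  /* A held while taking waitq */
    if (include_softirq_acquire)
        toy_acquire(NULL, &ctx_b, true, false); /* instance B, softirq context */
    return toy_inversion(CTX_LOCK, FIQ_WAITQ);
}
```

Calling toy_replay_report(false) reports no inversion, while
toy_replay_report(true) reports one even though instance A (which took
the waitq lock) and instance B (taken in softirq context) are different
locks. In this model the warning is a statement about the class, which
is why a false positive would need an annotation that splits the class.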