Re: tty: deadlock between n_tracerouter_receivebuf and flush_to_ldisc

From: Dmitry Vyukov
Date: Thu Jan 21 2016 - 05:07:16 EST


On Wed, Jan 20, 2016 at 5:08 PM, Peter Hurley <peter@xxxxxxxxxxxxxxxxxx> wrote:
> On 01/20/2016 05:02 AM, Peter Zijlstra wrote:
>> On Wed, Dec 30, 2015 at 11:44:01AM +0100, Dmitry Vyukov wrote:
>>> -> #3 (&buf->lock){+.+...}:
>>> [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0 kernel/locking/lockdep.c:3585
>>> [< inline >] __raw_spin_lock_irqsave include/linux/spinlock_api_smp.h:112
>>> [<ffffffff85c8e790>] _raw_spin_lock_irqsave+0x50/0x70 kernel/locking/spinlock.c:159
>>> [<ffffffff82b8c050>] tty_get_pgrp+0x20/0x80 drivers/tty/tty_io.c:2502
>>
>> So in any recent code that I look at this function tries to acquire
>> tty->ctrl_lock, not buf->lock. Am I missing something ?!
>
> Yes.
>
> The tty locks were annotated with __lockfunc so were being elided from lockdep
> stacktraces. Greg has a patch in his queue from me that removes the __lockfunc
> annotation ("tty: Remove __lockfunc annotation from tty lock functions").
>
> Unfortunately, I think syzkaller's post-processing stack trace isn't helping
> either, giving the impression that the stack is still inside tty_get_pgrp().
>
> It's not.

I've got a new report on commit
a200dcb34693084e56496960d855afdeaaf9578f (Jan 18).
Here is the unprocessed version:
https://gist.githubusercontent.com/dvyukov/428a0c9bfaa867d8ce84/raw/0754db31668602ad07947f9964238b2f9cf63315/gistfile1.txt
and here is the processed one:
https://gist.githubusercontent.com/dvyukov/42b874213de82d94c35e/raw/2bbced252035821243678de0112e2ed3a766fb5d/gistfile1.txt

Peter, what exactly is wrong with the post-processed version? I would
be interested in fixing the processing script.

As far as I can see, it contains the same stacks, just with line numbers
and inlined frames. I am using a significantly different compilation mode
(kasan + kcov + very recent gcc), so nobody except me will be able to
figure out line numbers based on offsets.
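
To make the offsets point concrete, here is a rough sketch of how a raw
frame line breaks down into address, symbol, offset and size. This is
only an illustration, not the actual processing script: the names
FRAME_RE, parse_frame and function_start are made up for this example,
and the regex assumes the "[<addr>] symbol+0xoff/0xsize" layout seen in
the traces above. The offset is only meaningful relative to the exact
vmlinux that produced the trace, which is why a differently built kernel
cannot be used to recover the line numbers.

```python
import re

# Matches raw frames such as:
#   [<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0
# Frames already marked "[< inline >]" carry no address and do not match.
FRAME_RE = re.compile(
    r"\[<(?P<addr>[0-9a-f]+)>\]\s+"
    r"(?P<sym>[A-Za-z_][\w.]*)\+0x(?P<off>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
)


def parse_frame(line):
    """Split one raw stack frame line into its numeric parts.

    Returns None for lines that are not raw frames.
    """
    m = FRAME_RE.search(line)
    if m is None:
        return None
    return {
        "addr": int(m.group("addr"), 16),
        "symbol": m.group("sym"),
        "offset": int(m.group("off"), 16),
        "size": int(m.group("size"), 16),
    }


def function_start(frame):
    """Address of the function entry, i.e. return address minus offset."""
    return frame["addr"] - frame["offset"]


if __name__ == "__main__":
    raw = "[<ffffffff813f0acf>] lock_acquire+0x19f/0x3c0"
    frame = parse_frame(raw)
    print(frame["symbol"], hex(frame["offset"]), hex(function_start(frame)))
```

Mapping the resulting address to a file:line pair then needs the debug
info of the matching build (for example via addr2line run against that
vmlinux), which is the part only the reporter can do.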