Re: [PATCH v2] locking/hung_task: Show all hung tasks before panic

From: Dmitry Vyukov
Date: Mon Apr 09 2018 - 05:03:53 EST


On Sat, Apr 7, 2018 at 6:24 PM, Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
> Dmitry Vyukov wrote:
>> On Sat, Apr 7, 2018 at 5:39 PM, Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>> > On Sat, Apr 07, 2018 at 09:31:19PM +0900, Tetsuo Handa wrote:
>> >> are for replacing debug_show_all_locks() in check_hung_task() for cases like
>> >> https://syzkaller.appspot.com/bug?id=26aa22915f5e3b7ca2cfca76a939f12c25d624db
>> >> because we are interested in only threads holding locks.
>> >>
>> >> SysRq-t is too much but SysRq-w is useless for killable/interruptible threads...
>> >
>> > Or use a script to process the sysrq-t output? I mean, we can add all
>> > sorts, but where does it end?
>
> Maybe allow khungtaskd to call call_usermodehelper() to run arbitrary operations
> instead of just calling panic()?

This would probably work for syzbot too.
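
On Peter's point about post-processing the sysrq-t output with a script,
here is a minimal sketch of what that could look like. It only narrows the
dump by task state (so it can also keep killable/interruptible sleepers that
SysRq-w skips); it cannot tell which threads hold locks. The per-task header
layout in the regex ("comm state free pid ppid 0xflags", with an optional
dmesg timestamp) is an assumption and is not taken from this thread, so
adjust it for the kernel in use. The names HEADER, split_tasks and
filter_tasks are made up for illustration.

```python
import re

# ASSUMPTION: each task in the sysrq-t dump starts with a header line like
#   "<comm> <state> <free-stack> <pid> <ppid> 0x<flags>"
# optionally prefixed by a "[  123.456789] " timestamp. Adjust as needed.
HEADER = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s*)?"
    r"(?P<comm>\S.{0,14}?)\s+(?P<state>[A-Za-z])\s+"
    r"\d+\s+(?P<pid>\d+)\s+\d+\s+0x[0-9a-fA-F]+\s*$"
)


def split_tasks(lines):
    """Group log lines into (comm, state, pid, block_lines), one per header."""
    tasks = []
    current = None
    for line in lines:
        line = line.rstrip("\n")
        m = HEADER.match(line)
        if m:
            # A new header starts a new task block.
            current = (m.group("comm"), m.group("state"),
                       int(m.group("pid")), [line])
            tasks.append(current)
        elif current is not None:
            # Anything else (e.g. "Call Trace:" and frames) belongs to the
            # most recently seen task. Lines before the first header are
            # dropped.
            current[3].append(line)
    return tasks


def filter_tasks(lines, states=("D",)):
    """Keep only the task blocks whose state letter is in `states`."""
    return [t for t in split_tasks(lines) if t[1] in states]
```

Usage would be something like filter_tasks(open("console.log"),
states=("D", "S")) and then printing each returned block, where
"console.log" stands for wherever the captured console output lives.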

>> Good question.
>> We are talking about a few dozen more stacks, right?
>>
>> Not all kernel bugs are well reproducible, so it's not always possible
>> to go back and hit sysrq-t. And this came up in the context of syzbot,
>> which is an automated system. It reported a bunch of hangs and most of
>> them are real bugs, but not all of them are easily actionable.
>> Can it be a config or a command line argument, which will make syzbot
>> capture more useful context for each such hang?
>>
>
> It will be nice if syzbot testing is done with kdump configured, and the
> result of automated scripting on vmcore (such as "foreach bt -s -l") is
> available.

kdump has come up several times already
(https://github.com/google/syzkaller/issues/491). But it would
require a non-trivial amount of work to pipe it through the whole
system (from investigation/testing and the second kernel through to
storing the dumps and exposing them).