Re: [syzbot] [kernel?] possible deadlock in console_flush_all (2)

From: John Ogness
Date: Wed Mar 27 2024 - 07:06:54 EST


On 2024-03-20, Alexei Starovoitov <alexei.starovoitov@xxxxxxxxx> wrote:
> On Wed, Mar 20, 2024 at 12:30 AM Tetsuo Handa
> <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>>
>> On 2024/03/20 16:12, Alexei Starovoitov wrote:
>> > On Wed, Mar 20, 2024 at 12:05 AM Tetsuo Handa
>> > <penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>> >>
>> >> On 2024/03/20 15:56, Alexei Starovoitov wrote:
>> >>> This has nothing to do with bpf.
>> >>> bpf never calls printk().
>> >>
>> >> Please see the Sample crash report in the dashboard.
>> >> bpf program is hitting printk() via report_bug().
>> >
>> > Exactly. local_bh_enable is simply asking for a splat.
>> > _this_ bug is in printk.
>> > It's a generic issue.
>>
>> I don't follow. printk() is called due to report_bug().
>>
>> If the reason report_bug() is called is that spin_unlock_bh() is bad,
>> this is a bug in sock_map_delete_elem() rather than a bug in printk(), isn't it?
>
> There are two bugs.
> The one you've started the thread about is in printk.

The printk rework (which is not yet fully mainline) will correctly
handle this context.

As to the patch [0] you suggested, it would be more appropriate to call
printk_deferred_enter()/_exit() *within* the locked critical section. But
we really only want these whack-a-mole workarounds for cases that can
occur in a non-bug situation. IMHO this is not such a case and falls
into the category of "known problem, the rework will handle it".
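
The ordering described above can be illustrated with a minimal userspace
sketch. It is only a model, not kernel code: model_lock(), model_unlock(),
model_deferred_enter(), model_deferred_exit() and model_printk() are
hypothetical stand-ins for a spinlock and for the kernel's
printk_deferred_enter()/printk_deferred_exit() and printk(). The point is
just the nesting: deferral is entered after the lock is taken and exited
before the lock is released.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-ins; none of this is the kernel API. */
static bool lock_held;   /* models "the lock is currently held" */
static int defer_depth;  /* models the deferred-printing nesting depth */

static void model_lock(void)
{
	assert(!lock_held);
	lock_held = true;
}

static void model_unlock(void)
{
	assert(lock_held);
	lock_held = false;
}

/* Deferral is only entered and exited while the lock is held. */
static void model_deferred_enter(void)
{
	assert(lock_held);
	defer_depth++;
}

static void model_deferred_exit(void)
{
	assert(lock_held);
	assert(defer_depth > 0);
	defer_depth--;
}

/* Returns true if the message would be deferred rather than printed directly. */
static bool model_printk(const char *msg)
{
	bool deferred = defer_depth > 0;

	printf("%s: %s\n", deferred ? "deferred" : "direct", msg);
	return deferred;
}

/* Lock first, then enter deferral; exit deferral, then unlock. */
static bool critical_section(void)
{
	bool deferred;

	model_lock();
	model_deferred_enter();
	deferred = model_printk("warning inside critical section");
	model_deferred_exit();
	model_unlock();

	return deferred;
}
```

In this model, any message emitted between model_deferred_enter() and
model_deferred_exit() is marked as deferred, and the asserts fail if the
deferral window ever extends outside the locked region.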

John Ogness

[0] https://syzkaller.appspot.com/text?tag=Patch&x=121c92fe180000