Re: Removal of printk safe buffers delays NMI context printk

From: John Ogness
Date: Fri Nov 05 2021 - 09:57:34 EST


On 2021-11-05, Nicholas Piggin <npiggin@xxxxxxxxx> wrote:
>> What was removed from 93d102f094b was irq_work triggering on all
>> CPUs.
>
> No, it was the caller executing the flush for all remote CPUs itself.
> irq work was not involved (and irq work can't be raised in a remote
> CPU from NMI context).

Maybe I am missing something. In 93d102f094b~1 I see:

watchdog_smp_panic
  printk_safe_flush
    __printk_safe_flush
      printk_safe_flush_buffer
        printk_safe_flush_line
          printk_deferred
            vprintk_deferred
              vprintk_emit (but no direct printing)
              defer_console_output
                irq_work_queue

AFAICT, using defer_console_output() instead of your new printk_flush()
should cause exactly the same behavior as before.
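
To make the distinction concrete, here is a small userspace model (not
kernel code) of the two options: storing a record and deferring the
console output to later work, versus storing it and printing right away.
All model_* names, the fixed-size buffer, and the work_pending flag are
hypothetical stand-ins for the ringbuffer, the console, and a queued
irq_work; the real code paths are more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

#define LOG_CAP  8
#define LINE_LEN 64

/* Hypothetical stand-ins for the ringbuffer and console state. */
static char log_buf[LOG_CAP][LINE_LEN];
static int log_head;      /* index of the next record to store */
static int console_seq;   /* index of the next record not yet printed */
static bool work_pending; /* models an irq_work that has been queued */

/* Store a record only; nothing reaches the console here. */
static void model_store(const char *msg)
{
	assert(log_head < LOG_CAP);
	snprintf(log_buf[log_head++], LINE_LEN, "%s", msg);
}

/* Print every stored record the console has not seen yet. */
static int model_console_flush(void)
{
	int printed = 0;

	while (console_seq < log_head) {
		puts(log_buf[console_seq++]);
		printed++;
	}
	return printed;
}

/* Deferred path: store, then only request that printing happen later. */
static void model_printk_deferred(const char *msg)
{
	model_store(msg);
	work_pending = true;
}

/* Runs later, once the modeled CPU gets around to its pending work. */
static int model_irq_work_run(void)
{
	if (!work_pending)
		return 0;
	work_pending = false;
	return model_console_flush();
}

/* Direct path: store and print in the caller's context. */
static int model_printk_direct(const char *msg)
{
	model_store(msg);
	return model_console_flush();
}
```

In this model, a message passed to model_printk_deferred() stays
invisible until model_irq_work_run() is called, whereas
model_printk_direct() prints it (and any backlog) immediately.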

> but we do need that printk flush capability back there and for
> nmi_backtrace.

Agreed. I had not considered this necessary side-effect when I removed
the NMI safe buffers.

I am just wondering if we should fix the regression by going back to
using irq_work (such as defer_console_output()) or if we want to
introduce something new that prints directly.

John Ogness