Re: [PATCH printk v1 16/18] kernel/panic: Add atomic write enforcement to warn/panic

From: John Ogness
Date: Thu Apr 13 2023 - 08:15:57 EST


On 2023-04-13, Petr Mladek <pmladek@xxxxxxxx> wrote:
>> --- a/kernel/panic.c
>> +++ b/kernel/panic.c
>> @@ -329,6 +332,8 @@ void panic(const char *fmt, ...)
>> if (_crash_kexec_post_notifiers)
>> __crash_kexec(NULL);
>>
>> + cons_atomic_flush(NULL, true);
>
> Do we need to explicitly flush the messages here?

This is where the atomic printing actually starts (after the full dump
has been inserted into the ringbuffer).

> cons_atomic_flush() is called also from vprintk_emit(). And there are
> many messages printed with the PANIC priority above.

vprintk_emit() does not print in this case. From cons_atomic_flush():

/*
* When in an elevated priority, the printk() calls are not
* individually flushed. This is to allow the full output to
* be dumped to the ringbuffer before starting with printing
* the backlog.
*/
if (cpu_state->prio > NBCON_PRIO_NORMAL && printk_caller_wctxt)
return;

> This makes an assumption that either printk() in PANIC context
> does not try to show the messages immediately or that this
> explicit console_atomic_flush() tries harder. I think
> that both assumptions are wrong.

Both assumptions are correct, because until this point there has been no
effort to print.

>> @@ -353,6 +358,7 @@ void panic(const char *fmt, ...)
>> * We can't use the "normal" timers since we just panicked.
>> */
>> pr_emerg("Rebooting in %d seconds..\n", panic_timeout);
>> + cons_atomic_flush(NULL, true);
>
> Same here.

This flush is just to make sure the rebooting message is
output. For nbcon consoles printk() calls are never synchronous except
for during early boot (before kthreads are ready).

The same goes for the other cons_atomic_flush() calls in this function.

>> disabled_wait();
>> #endif
>> pr_emerg("---[ end Kernel panic - not syncing: %s ]---\n", buf);
>>
>> /* Do not scroll important messages printed above */
>> suppress_printk = 1;
>> +
>> + cons_atomic_exit(CONS_PRIO_PANIC, prev_prio);
>
> On the contrary, I would explicitly call cons_atomic_flush(NULL, false)
> here instead of hiding it in cons_atomic_exit().

It is not hiding there. That is the semantic. After entering an atomic
block all printk's are only writing to the ringbuffer. On exiting the
atomic block the ringbuffer is flushed via atomic printing.

Exiting CONS_PRIO_PANIC has a special condition that it first tries to
safely flush all consoles, then will try the unsafe variant for consoles
that were not flushed.

> Also I think that we want to set the EMERGENCY prio also in
> oops_enter()?

Agreed.

John