Re: [PATCH printk v2 09/11] panic: Add atomic write enforcement to oops

From: Petr Mladek
Date: Wed Sep 27 2023 - 09:15:14 EST


On Wed 2023-09-20 01:14:54, John Ogness wrote:
> Invoke the atomic write enforcement functions for oops to
> ensure that the information gets out to the consoles.
>
> Since there is no single general function that calls both
> oops_enter() and oops_exit(), the nesting feature of atomic
> write sections is taken advantage of in order to guarantee
> full coverage between the first oops_enter() and the last
> oops_exit().
>
> It is important to note that if there are any legacy consoles
> registered, they will be attempting to directly print from the
> printk-caller context, which may jeopardize the reliability of
> the atomic consoles. Optimally there should be no legacy
> consoles registered.
>
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -630,6 +634,36 @@ bool oops_may_print(void)
> */
> void oops_enter(void)
> {
> + enum nbcon_prio prev_prio;
> + int cpu = -1;
> +
> + /*
> + * If this turns out to be the first CPU in oops, this is the
> + * beginning of the outermost atomic section. Otherwise it is
> + * the beginning of an inner atomic section.
> + */

This sounds strange. What is the advantage of having the inner
atomic context, please? It covers only messages printed inside
oops_enter() and not the whole oops_enter()/exit(). Also see below.

> + prev_prio = nbcon_atomic_enter(NBCON_PRIO_EMERGENCY);
> +
> + if (atomic_try_cmpxchg_relaxed(&oops_cpu, &cpu, smp_processor_id())) {
> + /*
> + * This is the first CPU in oops. Save the outermost
> + * @prev_prio in order to restore it on the outermost
> + * matching oops_exit(), when @oops_nesting == 0.
> + */
> + oops_prev_prio = prev_prio;
> +
> + /*
> + * Enter an inner atomic section that ends at the end of this
> + * function. In this case, the nbcon_atomic_enter() above
> + * began the outermost atomic section.
> + */
> + prev_prio = nbcon_atomic_enter(NBCON_PRIO_EMERGENCY);
> + }
> +
> + /* Track nesting when this CPU is the owner. */
> + if (cpu == -1 || cpu == smp_processor_id())
> + oops_nesting++;
> +
> tracing_off();
> /* can't trust the integrity of the kernel anymore: */
> debug_locks_off();
> @@ -637,6 +671,9 @@ void oops_enter(void)
>
> if (sysctl_oops_all_cpu_backtrace)
> trigger_all_cpu_backtrace();
> +
> + /* Exit inner atomic section. */
> + nbcon_atomic_exit(NBCON_PRIO_EMERGENCY, prev_prio);

This will not flush the messages when:

+ This CPU owns oops_cpu. The flush will have to wait until the
outermost atomic section is exited.

In this case, the inner atomic context is not needed.


+ oops_cpu is owned by another CPU, and that other CPU is
just flushing the messages and blocks the per-console
lock.

The good thing is that the messages printed by this oops_enter()
would likely get flushed by the other CPU.

The bad thing is that oops_exit() on this CPU won't call
nbcon_atomic_exit(), so the following OOPS messages
from this CPU might need to wait for the printk kthread.
IMHO, this is not what we want.


One solution would be to store prev_prio in a per-CPU array
so that each CPU could call its own nbcon_atomic_exit(),
see the sketch below.
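
Something like this (a minimal sketch only; the per-CPU variable
name is made up here, and same-CPU nesting is ignored for brevity):

static DEFINE_PER_CPU(enum nbcon_prio, oops_prev_prio);

void oops_enter(void)
{
        /* Remember the priority seen by this CPU. */
        this_cpu_write(oops_prev_prio,
                       nbcon_atomic_enter(NBCON_PRIO_EMERGENCY));

        tracing_off();
        ...
}

void oops_exit(void)
{
        ...
        /* Restore exactly what this CPU has seen. */
        nbcon_atomic_exit(NBCON_PRIO_EMERGENCY,
                          this_cpu_read(oops_prev_prio));
}

Each CPU would then restore exactly the priority it has seen,
without the oops_cpu/oops_nesting bookkeeping.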

But I am starting to like more and more the idea of storing
and counting nested emergency contexts in struct task_struct.
It is the alternative implementation proposed in my reply to
the 7th patch:
https://lore.kernel.org/r/ZRLBxsXPCym2NC5Q@alley

Then it would be enough to simply call (a rough sketch follows
below):

+ nbcon_emergency_enter() in oops_enter()
+ nbcon_emergency_exit() in oops_exit()
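
For illustration, the oops side would then reduce to something like
this (just a sketch; the task_struct counter name is made up and the
real proposal is in the mail linked above):

/* Sketch only: nesting counter kept in struct task_struct. */
void nbcon_emergency_enter(void)
{
        current->nbcon_emergency_nesting++;
}

void nbcon_emergency_exit(void)
{
        if (!WARN_ON_ONCE(current->nbcon_emergency_nesting == 0))
                current->nbcon_emergency_nesting--;
}

void oops_enter(void)
{
        nbcon_emergency_enter();

        tracing_off();
        ...
}

void oops_exit(void)
{
        ...
        nbcon_emergency_exit();
}

No oops_cpu, no oops_nesting, and no prev_prio to pass around;
the nesting is implicit in the task.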

Best Regards,
Petr

PS: I just hope that you didn't add all this complexity just because
we preferred this behavior at LPC 2022. In particular, I hope
that it was not me who proposed and preferred it.