Re: [RFC PATCH 3/3] watchdog: Turn console verbosity on when reporting softlockup

From: Sergey Senozhatsky
Date: Mon Mar 16 2020 - 22:18:25 EST


On (20/03/15 18:09), Eugeniu Rosca wrote:

[..]

> @@ -428,6 +428,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> }
> }
>
> + console_verbose_start();
> +
> pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
> smp_processor_id(), duration,
> current->comm, task_pid_nr(current));
> @@ -453,6 +455,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> if (softlockup_panic)
> panic("softlockup: hung tasks");
> __this_cpu_write(soft_watchdog_warn, true);
> +
> + console_verbose_end();
> } else
> __this_cpu_write(soft_watchdog_warn, false);


I'm afraid, as of now, this approach is not going to work the way it's
supposed to work in 100% of cases. Because the only thing that printk()
call sort of guarantees is that the message will be stored somewhere.
Either in the main kernel log buffer, on in one of auxiliary per-CPU
log buffers. It does not guarantee, generally speaking, that the message
will be printed on the console immediately.

Consider the following example:

CPU0 CPU1
console_lock();
schedule();

watchdog()
console_verbose_start();
printk()
log_store()
if (!console_trylock())
return;
console_verbose_end();

...
console_unlock()
print logbuf messages to the consoles
we missed the console_verbose_start/end
on CPU1



IIRC, we had a similar approach in the past. See commit 375899cddcbb26
("printk: make sure to print log on console"). And we reverted it, see
a6ae928c25835 ("Revert "printk: make sure to print log on console.").

-ss