Re: [PATCH v4 2/2] tty/sysrq: Dump printk ring buffer messages via sysrq
From: Sreenath Vijayan
Date: Wed Feb 14 2024 - 06:29:50 EST
On Wed, Feb 07, 2024 at 04:09:34PM +0100, Petr Mladek wrote:
> On Thu 2024-02-01 13:12:41, Sreenath Vijayan wrote:
> > When the terminal is unresponsive, one cannot use dmesg to view printk
> > ring buffer messages. Also, syslog services may be disabled, so the
> > messages cannot be checked after a reboot, especially on embedded systems.
> > In this scenario, dump the printk ring buffer messages via sysrq
> > by pressing sysrq+D.
>
> I would use sysrq-R and say that it replays the kernel log on
> consoles.
>
> The word "dump" is ambiguous. People might think that it calls
> dmesg dumpers.
>
Ok, noted. We proposed sysrq-D as it is an alternative to the dmesg
command and might be easier to remember.
> Also the messages would be shown on the terminal only when
> console_loglevel is set to show all messages. This is done
> in __handle_sysrq(). But it is not done in the workqueue
> context.
>
Yes, the initial implementation used the write() callback of the
consoles, so the messages were shown irrespective of the console log
level. The current implementation depends on the console log
level, but many other sysrq keys also dump their messages at KERN_INFO
level. In my understanding, __handle_sysrq() dumps only the
sysrq header at the manipulated loglevel. It restores the original
loglevel before calling the callback function for the key.
If console_loglevel is set to show KERN_INFO messages, it would
dump most of the important printk messages in our case. Also, the
loglevel can now be modified using sysrq itself.
> Finally, the commit message should explain why workqueues are used
> and what are the limitations. Something like:
>
> <add>
> The log is replayed using workqueues. The reason is that it has to
> be done in a safe way (in comparison with panic context).
>
> This also means that the sysrq won't have the desired effect
> when the system is in such a bad state that workqueues do not
> make any progress.
> </add>
>
> Another reason might be that we do not want to do it in
> an interrupt context. But this reason is questionable.
> Many other sysrq commands do complicated work and
> print many messages as well.
>
Noted. Will add this if we proceed with the workqueue implementation.
> Another reason is that the function needs to use console_lock()
> which can't be called in IRQ context. Maybe, we should use
> console_trylock() instead.
>
> The function would replay the messages only when console_trylock()
> succeeds. Users could repeat the sysrq when it fails.
>
> Idea:
>
> Using console_trylock() actually might be more reliable than
> workqueues. console_trylock() might fail repeatedly when:
>
> + the console_lock() owner is stuck. But workqueues would fail
> in this case as well.
>
> + there is a flood of messages. In this case, replaying
> the log would not help much.
>
> Another advantage is that the consoles would be flushed
> in sysrq context with the manipulated console_loglevel.
>
> Best Regards,
> Petr
Yes, this seems to work well from interrupt context when the
console lock owner is not stuck. We can also manipulate
the console_loglevel. Something like this:
// in printk.c
void console_replay_all(void)
{
	if (console_trylock()) {
		__console_rewind_all();
		console_unlock();
	}
}

// in sysrq.c
static void sysrq_handle_dmesg_dump(u8 key)
{
	int orig_log_level = console_loglevel;

	console_loglevel = CONSOLE_LOGLEVEL_DEFAULT;
	console_replay_all();
	console_loglevel = orig_log_level;
}
The downside I see is that the user may have to hit the
key multiple times, or give up trying, if the console lock
owner is busy at the time of the key press. This limitation
should probably be mentioned in the documentation.
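
To illustrate the retry behaviour, here is a minimal user-space sketch,
not kernel code. It assumes a pthread mutex as a stand-in for the console
lock. All model_* names and the loglevel values are hypothetical and exist
only for this illustration. The handler raises the loglevel, attempts the
replay without blocking, restores the loglevel, and reports whether the
replay actually ran.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* User-space stand-ins only; none of these are kernel symbols. */
#define MODEL_LOGLEVEL_DEFAULT 7	/* hypothetical "show everything" level */

static pthread_mutex_t model_console_lock = PTHREAD_MUTEX_INITIALIZER;
static int model_loglevel = 4;		/* hypothetical current loglevel */
static int model_replay_count;		/* number of replays that really ran */

/* Models console_replay_all(): replay only if the lock is free right now. */
static bool model_replay_all(void)
{
	int rc;

	if (pthread_mutex_trylock(&model_console_lock) != 0)
		return false;		/* owner is busy; the caller may retry */

	model_replay_count++;		/* stands in for rewinding the log */
	rc = pthread_mutex_unlock(&model_console_lock);
	assert(rc == 0);
	(void)rc;			/* silence unused warning under NDEBUG */
	return true;
}

/* Models the sysrq handler: raise the loglevel, try once, restore it. */
static bool model_sysrq_handler(void)
{
	int orig_log_level = model_loglevel;
	bool replayed;

	model_loglevel = MODEL_LOGLEVEL_DEFAULT;
	replayed = model_replay_all();
	model_loglevel = orig_log_level;
	return replayed;
}
```

In this model, calling model_sysrq_handler() while another owner holds
model_console_lock returns false and leaves model_replay_count unchanged.
Calling it again after the owner unlocks returns true. The loglevel is
restored in both cases, which matches the "press the key again" behaviour
described above.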
Please let me know your opinion.
Regards,
Sreenath