[PATCH printk v2 0/9] fix console flushing

From: John Ogness
Date: Mon Nov 06 2023 - 16:07:39 EST


Hi,

While testing various flushing scenarios, I stumbled on a
couple of issues that cause console flushing to fail. While
discussing the v1 [0] series, a couple more issues arose.
This series addresses all of these issues:

1. The prb_next_seq() optimization caused inconsistent return
values. Fix prb_next_seq() to the originally intended
behavior but keep an optimization.

2. pr_flush() might not wait until the most recently stored
printk() message if non-finalized records precede it. Fix
pr_flush() to wait for all records to print that are at
least reserved at the time of the call.

3. In panic, the panic messages will not print if non-finalized
records precede them. Add a special condition so that
readers on the panic CPU can drop non-finalized records.

4. It is possible (and easy) to reproduce a scenario where the
console on the panic CPU hands over to a waiter on a stopped
CPU. Do not use the handover feature in panic.

5. If messages are being dropped during panic, non-panic CPUs
are silenced. But by then it is already too late and most
likely the panic messages have been dropped. Change the
non-panic CPU silencing logic to restrict non-panic CPUs
from flooding the ringbuffer.
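
To make issues 2 and 3 concrete, here is a rough userspace toy
model. It is not the kernel code: every toy_* name is
hypothetical, and the real ringbuffer tracks more record states
than the two used here. It only shows why the next readable
sequence number can lag behind the most recently reserved record,
and why a reader on the panic CPU has to skip records that may
never be finalized.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy record states (assumption: only two states are modeled). */
enum toy_state { TOY_RESERVED, TOY_FINALIZED };

struct toy_log {
	const enum toy_state *state;	/* state[i] belongs to sequence number i */
	uint64_t count;			/* number of records reserved so far */
};

/* First sequence number that a normal reader cannot read yet. */
static uint64_t toy_next_readable_seq(const struct toy_log *log)
{
	uint64_t seq = 0;

	while (seq < log->count && log->state[seq] == TOY_FINALIZED)
		seq++;
	return seq;
}

/* Sequence number after the most recently reserved record. */
static uint64_t toy_next_reserve_seq(const struct toy_log *log)
{
	return log->count;
}

/*
 * Advance @seq to the next printable record. A normal reader stops at
 * a non-finalized record and has to wait for its writer. A reader on
 * the panic CPU skips it because the writer may have been stopped.
 */
static bool toy_read_valid(const struct toy_log *log, uint64_t *seq,
			   bool on_panic_cpu)
{
	while (*seq < log->count) {
		if (log->state[*seq] == TOY_FINALIZED)
			return true;
		if (!on_panic_cpu)
			return false;
		(*seq)++;
	}
	return false;
}
```

With records { FINALIZED, RESERVED, FINALIZED }, the readable
sequence stops at 1 while the reserve sequence is 3. A flush that
only waits for the former returns before record 2 is printed, and
a panic reader that does not skip record 1 never reaches it.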

This series also performs some minor cleanups to remove open
coded checks about the panic context and improve documentation
language regarding data-less records.
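
As an illustration of the kind of helper that replaces an open
coded panic context check, here is a userspace sketch. The toy_*
names are hypothetical, the current CPU is passed in as a
parameter because there is no CPU id lookup outside the kernel,
and C11 atomics stand in for the kernel's atomic API.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define TOY_PANIC_CPU_INVALID	(-1)

/* CPU that owns the panic, or TOY_PANIC_CPU_INVALID if there is none. */
static atomic_int toy_panic_cpu = TOY_PANIC_CPU_INVALID;

/* True if any CPU has entered panic. */
static bool toy_panic_in_progress(void)
{
	return atomic_load(&toy_panic_cpu) != TOY_PANIC_CPU_INVALID;
}

/* True if @this_cpu is the CPU handling the panic. */
static bool toy_this_cpu_in_panic(int this_cpu)
{
	return atomic_load(&toy_panic_cpu) == this_cpu;
}

/* True if a panic is in progress on a CPU other than @this_cpu. */
static bool toy_other_cpu_in_panic(int this_cpu)
{
	return toy_panic_in_progress() && !toy_this_cpu_in_panic(this_cpu);
}
```

Callers then ask one named question instead of repeating the
comparison against the panic CPU at each call site.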

Because of the multiple refactorings done in recent history, it
would be helpful to provide the LTS maintainers with the proper
backported patches. I am happy to do this.

The changes since v1:

- Rename NO_LPOS to EMPTY_LINE_LPOS.

- Add and clean up documentation to clarify language regarding
data-less records and special lpos values.

- Implement a new prb_next_seq() optimization to preserve the
intended behavior. This is essentially my RFC [1] with
memory barriers added and based on an alternate implementation
suggested by pmladek [2].

- Introduce new prb_next_reserve_seq() function to return the
sequence number after @head_id.

- Use prb_next_reserve_seq() instead of prb_next_seq() for
pr_flush().

- Implement dropping non-finalized records in panic within
_prb_read_valid() instead of printk_get_next_message(). This
also makes use of the new prb_next_reserve_seq().

- Use the alternate implementation from pmladek [3] to avoid
the handover feature in panic.

- Implement a new strategy to avoid dropping panic messages
when non-panic CPUs are flooding the ringbuffer.

John Ogness

[0] https://lore.kernel.org/lkml/20231013204340.1112036-1-john.ogness@xxxxxxxxxxxxx
[1] https://lore.kernel.org/lkml/20231019132545.1190490-1-john.ogness@xxxxxxxxxxxxx
[2] https://lore.kernel.org/lkml/ZTkxOJbDLPy12n41@alley
[3] https://lore.kernel.org/lkml/ZS-r3QnpKzm7UVip@alley

John Ogness (8):
  printk: ringbuffer: Do not skip non-finalized records with
    prb_next_seq()
  printk: ringbuffer: Clarify special lpos values
  printk: For @suppress_panic_printk check for other CPU in panic
  printk: Add this_cpu_in_panic()
  printk: ringbuffer: Cleanup reader terminology
  printk: Wait for all reserved records with pr_flush()
  printk: Skip non-finalized records in panic
  printk: Avoid non-panic CPUs flooding ringbuffer

Petr Mladek (1):
  printk: Disable passing console lock owner completely during panic()

 kernel/printk/internal.h          |   1 +
 kernel/printk/printk.c            | 108 ++++++----
 kernel/printk/printk_ringbuffer.c | 343 +++++++++++++++++++++++++-----
 kernel/printk/printk_ringbuffer.h |  21 +-
 4 files changed, 382 insertions(+), 91 deletions(-)


base-commit: b4908d68609b57ad1ba4b80bd72c4d2260387e31
--
2.39.2