Re: [PATCH printk v2 5/8] printk: nbcon: Add sequence handling

From: Petr Mladek
Date: Thu Aug 10 2023 - 05:29:17 EST


On Fri 2023-07-28 02:08:30, John Ogness wrote:
> From: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
>
> Add an atomic_long_t field @nbcon_seq to the console struct to
> store the sequence number for nbcon consoles. For nbcon consoles
> this will be used instead of the non-atomic @seq field. The new
> field allows for safe atomic sequence number updates without
> requiring any locking.
>
> On 64bit systems the new field stores the full sequence number.
> On 32bit systems the new field stores the lower 32 bits of the
> sequence number, which are expanded to 64bit as needed by
> folding the values based on the sequence numbers available in
> the ringbuffer.
>
> For 32bit systems, having a 32bit representation in the console
> is sufficient. If a console ever gets more than 2^31 records
> behind the ringbuffer then this is the least of the problems.
>
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3171,8 +3171,27 @@ void console_unblank(void)
> */
> void console_flush_on_panic(enum con_flush_mode mode)
> {
> + struct console *c;
> bool handover;
> - u64 next_seq;
> + short flags;
> + int cookie;
> + u64 seq;
> +
> + seq = prb_first_valid_seq(prb);
> +
> + /*
> + * Safely handle the atomic consoles before trying to flush any
> + * legacy consoles.
> + */

This is a bit weird because the loop below just sets sequence
number for NBCON consoles. But they are not really flushed.

I think that you already agreed with it for v3. But let me mention
it here for completeness.

I would prefer to just add the API and use it later when some
particular action get supported. And the flush could not do
anything until nbcon_write() callback is added.

As is it is now, this patch adds nbcon_read()/write() into random
locations and it is not clear how they will be used and if
it is enough.

That said, it makes sense to update the init() path.


> + if (mode == CONSOLE_REPLAY_ALL) {
> + cookie = console_srcu_read_lock();
> + for_each_console_srcu(c) {
> + flags = console_srcu_read_flags(c);
> + if (flags & CON_NBCON)
> + nbcon_seq_force(c, seq);
> + }
> + console_srcu_read_unlock(cookie);
> + }


> if (!serialized_printing)
> return;

[...]

> @@ -3829,7 +3846,9 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
> if (!console_is_usable(c))
> continue;
>
> - if (locked)
> + if (flags & CON_NBCON)
> + printk_seq = nbcon_seq_read(c);
> + else if (locked)
> printk_seq = c->seq;
> else
> continue;

I think that I mentioned this already in a previous patch. The "else
continue" path is bad. It allows quietly skip messages for classic
consoles when "locked" is false. I know that it should not happen
but...

A solution would be to add WARN_ON_ONCE() before the continue.

> diff --git a/kernel/printk/printk_nbcon.c b/kernel/printk/printk_nbcon.c
> index 39fa64891ec6..8229a0a00d5b 100644
> --- a/kernel/printk/printk_nbcon.c
> +++ b/kernel/printk/printk_nbcon.c
> @@ -108,6 +108,116 @@ static inline bool nbcon_state_try_cmpxchg(struct console *con, struct nbcon_sta
> return atomic_try_cmpxchg(&ACCESS_PRIVATE(con, nbcon_state), &cur->atom, new->atom);
> }
>
> +/* Convert sequence from u64 to unsigned long. */
> +static inline unsigned long __nbcon_seq_to_stored(u64 full_seq)
> +{
> + /* On 32bit systems only the lower 32 bits are stored. */
> + return (unsigned long)full_seq;
> +}
> +
> +/* Convert sequence from unsigned long to u64. */
> +static inline u64 __nbcon_seq_to_full(unsigned long stored_seq)
> +{
> +#ifdef CONFIG_64BIT
> + return stored_seq;
> +#else
> + u64 full_seq;
> + u64 rb_seq;
> +
> + /*
> + * The provided sequence is only the lower 32 bits of the ringbuffer
> + * sequence. It needs to be expanded to 64bit. Get the next sequence
> + * number from the ringbuffer and fold it.
> + */
> + rb_seq = prb_next_seq(prb);
> + full_seq = rb_seq - ((u32)rb_seq - stored_seq);
> +
> + return full_seq;
> +#endif
> +}

I would personally do it this way:

#ifdef CONFIG_64BIT

$define __seq_to_nbcon_seq(seq) seq
$define __nbcon_seq_to_seq(seq) seq

#else /* CONFIG_64BIT */

$define __seq_to_nbcon_seq(seq) ((u32)seq)

static inline u64 __nbcon_seq_to_seq(u32 nbcon_seq)
{
u64 seq;
u64 rb_next_seq;

/*
* The provided sequence is only the lower 32 bits of the ringbuffer
* sequence. It needs to be expanded to 64bit. Get the next sequence
* number from the ringbuffer and fold it.
*
* Having a 32bit representation in the console is sufficient.
* If a console ever gets more than 2^31 records behind
* the ringbuffer then this is the least of the problems.
*
* Also the access to the ring buffer is always safe.
*/
rb_next_seq = prb_next_seq(prb);
seq = rb_next_seq - ((u32)rb_next_seq - nbcon_seq);

return seq;
}

#endif /* CONFIG_64BIT */

It looks more clear to me.

> +
> +/**
> + * nbcon_seq_init - Helper function to initialize the console sequence
> + * @con: Console to work on
> + *
> + * Set @con->nbcon_seq to the starting record (specified with con->seq).
> + * If the starting record no longer exists, the oldest available record
> + * is chosen. This is because on 32bit systems only the lower 32 bits of
> + * the sequence number are stored. The upper 32 bits are derived from the
> + * sequence numbers available in the ringbuffer.

It makes sense even on 64-bit systems. I would do:

s/This is because on 32bit systems/This is especially important on 32bit systems/


> + *
> + * For init only. Do not use for runtime updates.
> + */
> +static void nbcon_seq_init(struct console *con)
> +{
> + u64 seq = max_t(u64, con->seq, prb_first_valid_seq(prb));
> +
> + atomic_long_set(&ACCESS_PRIVATE(con, nbcon_seq), __nbcon_seq_to_stored(seq));
> +
> + /* Clear con->seq since nbcon consoles use con->nbcon_seq instead. */
> + con->seq = 0;
> +}

[...]

> +/**
> + * nbcon_seq_try_update - Try to update the console sequence number
> + * @ctxt: Pointer to an acquire context that contains
> + * all information about the acquire mode
> + *
> + * Return: True if the console sequence was updated, false otherwise.
> + *
> + * On 32bit the sequence in con->nbcon_seq is only the lower 32 bits.
> + * Therefore it must be expanded to 64bit upon a failed cmpxchg in
> + * order to correctly verify that the new sequence (ctxt->seq) is
> + * larger.
> + *
> + * In case of fail the console has been likely taken over. However, the
> + * caller must still assume it has ownership and decide how to proceed.
> + */
> +__maybe_unused
> +static bool nbcon_seq_try_update(struct nbcon_context *ctxt)
> +{
> + struct console *con = ctxt->console;
> + u64 con_seq = nbcon_seq_read(con);
> +
> + while (con_seq < ctxt->seq) {

What if anyone called nbcon_seq_force() to reply the entire log
in the meantime?

IMHO, we should remember the original nbcon_seq before
the context handle a line. And this function should update
nbcon_seq only when it has not been changed by other context
in the meantime.

> + unsigned long seq = __nbcon_seq_to_stored(con_seq);
> +
> + if (atomic_long_try_cmpxchg(&ACCESS_PRIVATE(con, nbcon_seq), &seq,
> + __nbcon_seq_to_stored(ctxt->seq))) {
> + return true;
> + }
> +
> + /* Expand new @seq value for comparing. */
> + con_seq = __nbcon_seq_to_full(seq);
> + }
> + return false;
> +}
> +
> /**
> * nbcon_context_try_acquire_direct - Try to acquire directly
> * @ctxt: The context of the caller

Best Regards,
Petr