Re: [PATCH printk v2 3/4] printk: Skip unfinalized records in panic

From: Petr Mladek
Date: Wed Oct 18 2023 - 11:27:32 EST


On Wed 2023-10-18 15:51:57, John Ogness wrote:
> On 2023-10-18, Petr Mladek <pmladek@xxxxxxxx> wrote:
> > Wait! This means that your patch actually does not work. Here is the diff:
> >
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -2815,8 +2815,19 @@ static bool printk_get_next_message(struct printk_message *pmsg, u64 seq,
> > else
> > prb_rec_init_rd(&r, &info, outbuf, outbuf_sz);
> >
> > - if (!prb_read_valid(prb, seq, &r))
> > - return false;
> > + while (!prb_read_valid(prb, seq, &r)) {
> > + if (this_cpu_in_panic() && seq < prb_next_seq(prb)) {
> > + /*
> > + * The record @seq is not finalized and there may be
> > + * more records in the ringbuffer. Since this is the
> > + * panic CPU, skip over the unfinalized record and
> > + * try to read a finalized record that may follow.
> > + */
> > + seq++;
> > + } else {
> > + return false;
> > + }
> > + }
> >
> > pmsg->seq = r.info->seq;
> > pmsg->dropped = r.info->seq - seq;
> >
> > It skips the invalid reads only when seq < prb_next_seq(). But
> > prb_next_seq(prb) points to the 1st non-finalized record. And
> > all records with seq < prb_next_seq() must be finalized!
>
> Please take a look at prb_next_seq(). It _starts_ its search from:
>
> id = atomic_long_read(&desc_ring->last_finalized_id);

I see. And this this set in

static void desc_make_final(struct prb_desc_ring *desc_ring, unsigned long id)
{
[...]

/* Best effort to remember the last finalized @id. */
atomic_long_set(&desc_ring->last_finalized_id, id);
}

So it is the _last_ finalized id from the timing POV. If there are
more CPUs storing and finalizing the messages in parallel then
it might change forth and back. There might be earlier non-finalized
records and newer finalized ones.

It means that prb_next_seq() really is the best effort and
the description is not valid:

/**
* prb_next_seq() - Get the sequence number after the last available record.
*
* @rb: The ringbuffer to get the sequence number from.
*
* This is the public function available to readers to see what the next
* newest sequence number available to readers will be.
*
* This provides readers a sequence number to jump to if all currently
* available records should be skipped.

It is not guaranteed that it will be the last available record
because there might be newer already finalized records with
some non-finalized records in between.

Also it is not guaranteed that it will be the next record available
to readers because readers should stop on the 1st non-yet-finalized
record and prb_next_seq() might be behind.

It would be great to document these subtle details especially when
we are going to depend on them.

> For console_flush_on_panic(), @last_finalized_id will _always_ be set to
> the last finalized message of the panic messages, being higher than any
> non-finalized records that may exist. There are no other CPUs running
> except the panic CPU.

It is not guaranteed. It might be lower when some still running CPU
manages to finalize an earlier record and there are later
non-finalized records.

But you are right, there is a very high chance that it will point
behind the last message from panic() context sooner or later,
especially after CPUs get stopped.

Well, note that only NMI guarantees that CPUs get stopped. There
are still architectures which stop CPUs using normal interrupts.

> Unfortunately you are not reading the code correctly. (Or rather, you
> are being misled by comments because you incorrectly associate "not
> available to reader" to mean "valid record with an empty string".

You are right. Well, it is really hard to put all the pieces together
just by reading the code. And the unclear comments make it even worse.

Best Regards,
Petr