Re: [PATCH printk v3 04/14] printk: ringbuffer: Do not skip non-finalized records with prb_next_seq()

From: Petr Mladek
Date: Mon Jan 15 2024 - 12:00:38 EST


On Mon 2024-01-15 13:01:36, John Ogness wrote:
> On 2024-01-12, Petr Mladek <pmladek@xxxxxxxx> wrote:
> >>  u64 prb_next_seq(struct printk_ringbuffer *rb)
> >>  {
> >> -        struct prb_desc_ring *desc_ring = &rb->desc_ring;
> >> -        enum desc_state d_state;
> >> -        unsigned long id;
> >>          u64 seq;
> >>
> >> -        /* Check if the cached @id still points to a valid @seq. */
> >> -        id = atomic_long_read(&desc_ring->last_finalized_id);
> >> -        d_state = desc_read(desc_ring, id, NULL, &seq, NULL);
> >> +        seq = desc_last_finalized_seq(rb);
> >
> > desc_last_finalized_seq() does internally:
> >
> >         ulseq = atomic_long_read_acquire(&desc_ring->last_finalized_seq
> >                                         ); /* LMM(desc_last_finalized_seq:A) */
> >
> >
> > It guarantees that this CPU would see the data which were seen by the
> > CPU which updated desc_ring->last_finalized_seq.
> >
> > So far so good.
> >
> > The problem is that I somehow miss the counterpart. Maybe
> > it is not needed. It just looks strange.
>
> As the comments in desc_last_finalized_seq() state: "This pairs with
> desc_update_last_finalized:A."
>
> desc_update_last_finalized() successfully reads a record and then uses a
> cmpxchg_release() to set the new @last_finalized_seq value (of the
> record it just read). That is the counterpart.
>
> >> -        if (d_state == desc_finalized || d_state == desc_reusable) {
> >> -                /*
> >> -                 * Begin searching after the last finalized record.
> >> -                 *
> >> -                 * On 0, the search must begin at 0 because of hack#2
> >> -                 * of the bootstrapping phase it is not known if a
> >> -                 * record at index 0 exists.
> >> -                 */
> >> -                if (seq != 0)
> >> -                        seq++;
> >> -        } else {
> >> -                /*
> >> -                 * The information about the last finalized sequence number
> >> -                 * has gone. It should happen only when there is a flood of
> >> -                 * new messages and the ringbuffer is rapidly recycled.
> >> -                 * Give up and start from the beginning.
> >> -                 */
> >> -                seq = 0;
> >> -        }
> >> +        /*
> >> +         * Begin searching after the last finalized record.
> >> +         *
> >> +         * On 0, the search must begin at 0 because of hack#2
> >> +         * of the bootstrapping phase it is not known if a
> >> +         * record at index 0 exists.
> >> +         */
> >> +        if (seq != 0)
> >> +                seq++;
> >>
> >>          /*
> >>           * The information about the last finalized @seq might be inaccurate.
> >
> > Below is the code:
> >
> > while (_prb_read_valid(rb, &seq, NULL, NULL))
> > seq++;
> >
> > Maybe the release() should be here to make sure that the CPU which
> > would see this "seq" would also see the data.
>
> The acquire is with @last_finalized_seq. So the release must also be
> with @last_finalized_seq. The important thing is that the CPU that
> updates @last_finalized_seq has actually read the corresponding record
> beforehand. That is exactly what desc_update_last_finalized() does.

I probably did not describe it well. The CPU updating @last_finalized_seq
does the right thing. I was not sure about the CPU which reads
@last_finalized_seq via prb_next_seq().

To make it more clear:

u64 prb_next_seq(struct printk_ringbuffer *rb)
{
        u64 seq;

        seq = desc_last_finalized_seq(rb);
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
               |
               `-> This includes atomic_long_read_acquire(last_finalized_seq)

        if (seq != 0)
                seq++;

        while (_prb_read_valid(rb, &seq, NULL, NULL))
                seq++;

        return seq;
}

But where is the atomic_long_read_release(last_finalized_seq) in
this code path?

IMHO, the barrier provided by the acquire() is _important_ to make sure
that _prb_read_valid() would see the valid descriptor.
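
To illustrate the pattern I have in mind, here is a simplified userspace
C11 sketch of the pairing. It is not the real printk code: the names
finalize_record(), read_last_finalized(), and records[] are made up, and
the real writer side uses cmpxchg_release() rather than a plain release
store:

/*
 * Simplified userspace analogue of the acquire/release pairing.
 * Invented names; not the printk ringbuffer implementation.
 */
#include <stdatomic.h>
#include <stdio.h>

#define NR_RECORDS 16

static char records[NR_RECORDS][32];            /* the "finalized" data */
static _Atomic unsigned long last_finalized_seq;

/* Writer side, cf. desc_update_last_finalized(). */
static void finalize_record(unsigned long seq, const char *text)
{
        snprintf(records[seq % NR_RECORDS], sizeof(records[0]), "%s", text);

        /*
         * Release: the data stored above must be visible to any reader
         * which observes the new @last_finalized_seq value.
         */
        atomic_store_explicit(&last_finalized_seq, seq, memory_order_release);
}

/* Reader side, cf. desc_last_finalized_seq() in prb_next_seq(). */
static const char *read_last_finalized(unsigned long *seq)
{
        /*
         * Acquire: pairs with the release above, so the record data
         * written before the release is guaranteed to be visible here.
         */
        *seq = atomic_load_explicit(&last_finalized_seq, memory_order_acquire);

        return records[*seq % NR_RECORDS];
}

int main(void)
{
        unsigned long seq;
        const char *text;

        finalize_record(1, "hello");
        text = read_last_finalized(&seq);
        printf("%lu: %s\n", seq, text);
        return 0;
}

The release side of this pairing is the cmpxchg_release() in
desc_update_last_finalized(), as you wrote. My question is what provides
the ordering on the reader side once prb_next_seq() goes on to read
records beyond the acquired @seq.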

Now, I think that the related read_release(seq) is hidden in:

static int prb_read(struct printk_ringbuffer *rb, u64 seq,
                    struct printk_record *r, unsigned int *line_count)
{
        [...]

        /* Get a local copy of the correct descriptor (if available). */
        err = desc_read_finalized_seq(desc_ring, id, seq, &desc);

        [...]

        /* If requested, copy meta data. */
        if (r->info)
                memcpy(r->info, info, sizeof(*(r->info)));

        /* Copy text data. If it fails, this is a data-less record. */
        if (!copy_data(&rb->text_data_ring, &desc.text_blk_lpos, info->text_len,
                       r->text_buf, r->text_buf_size, line_count)) {
                return -ENOENT;
        }

        /* Ensure the record is still finalized and has the same @seq. */
        return desc_read_finalized_seq(desc_ring, id, seq, &desc);
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
                |
                `-> This includes a memory barrier /* LMM(desc_read:A) */
                    which makes sure that the data are read before
                    the desc/data could be reused.
}

I consider this /* LMM(desc_read:A) */ to be the counterpart of that
acquire() in prb_next_seq().
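
In other words, prb_read() relies on a copy-then-recheck scheme: copy the
data, then verify that the descriptor still describes the same finalized
@seq, and only then trust the copy. A rough userspace sketch of that idea,
again with invented names (struct slot and read_record() are not real
printk structures; the real code validates via the descriptor state
machine):

/*
 * Rough sketch of the copy-then-recheck idea, not the real prb_read().
 */
#include <stdatomic.h>
#include <stdbool.h>
#include <string.h>

struct slot {
        _Atomic unsigned long seq;      /* which record the slot holds */
        char text[32];
};

static bool read_record(struct slot *s, unsigned long seq,
                        char *out, size_t len)
{
        /* The slot must hold @seq before we start copying. */
        if (atomic_load_explicit(&s->seq, memory_order_acquire) != seq)
                return false;

        memcpy(out, s->text, len);

        /*
         * Order the copy before re-reading @seq (the role of the read
         * barrier inside desc_read()). The copy is trusted only if the
         * slot was not reused while we were copying.
         */
        atomic_thread_fence(memory_order_acquire);
        return atomic_load_explicit(&s->seq, memory_order_relaxed) == seq;
}

int main(void)
{
        struct slot s = { .text = "hello" };
        char buf[32];

        atomic_store_explicit(&s.seq, 5, memory_order_release);

        return read_record(&s, 5, buf, sizeof(buf)) ? 0 : 1;
}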


Summary:

I saw atomic_long_read_acquire(last_finalized_seq) called from the
prb_next_seq() code path. The barrier looked important to me, but I saw
neither the counterpart nor any comment about it. I wanted to understand
it because it might be important for reviewing the following patches
which depend on prb_next_seq().

Does it make sense now, please?

Best Regards,
Petr