Re: [RFC PATCH v5 1/3] printk-rb: new printk ringbuffer implementation (writer)

From: Petr Mladek
Date: Fri Jan 03 2020 - 05:24:27 EST


On Mon 2019-12-23 17:01:00, John Ogness wrote:
> Hi Andrea,
>
> On 2019-12-21, Andrea Parri <parri.andrea@xxxxxxxxx> wrote:
> >> +	*desc_out = READ_ONCE(*desc);
> >> +
> >> +	/* Load data before re-checking state. */
> >> +	smp_rmb(); /* matches LMM_REF(desc_reserve:A) */
> >
> > I looked for a matching WRITE_ONCE() or some other type of marked write,
> > but I could not find it. What is the rationale? Or what did I miss?

Good question. READ_ONCE() looks superfluous here because the copy is
surrounded by two read barriers. In any case, there is no
corresponding WRITE_ONCE().

Note that we are copying the entire struct prb_desc here. All values
are written only while state_var is in the desc_reserved state. This
happens between two full write barriers:

+ A writer is allowed to modify the descriptor after a successful
cmpxchg in desc_reserve(), see LMM_TAG(desc_reserve:A).

+ The writer must not touch the descriptor after changing
state_var to the committed state, see
LMM_TAG(prb_commit:A) in prb_commit().

These barriers are mentioned in the comments for the two
read barriers here.
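
To make the pairing easier to follow, here is a rough userspace sketch
of the pattern using C11 atomics instead of the kernel primitives. It
is only an illustration under several assumptions, not the code from
the patch: struct demo_desc, the ST_* states, and the demo_*() helpers
are made-up names, the generation counter packed into state_var is a
simplification of the descriptor ID, and the payload fields use relaxed
atomics only because C11 requires that for racy accesses.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical states, kept in the low 2 bits of state_var. */
#define ST_MASK 3UL
enum { ST_FREE = 0, ST_RESERVED = 1, ST_COMMITTED = 2 };

/* Simplified stand-in for struct prb_desc (assumption). */
struct demo_desc {
	atomic_ulong state_var;	/* (generation << 2) | state */
	atomic_ulong begin;	/* payload, may be read while being rewritten */
	atomic_ulong next;
};

/* Writer: gain exclusive access with a successful cmpxchg. */
static bool demo_reserve(struct demo_desc *d)
{
	unsigned long old = atomic_load_explicit(&d->state_var, memory_order_relaxed);
	unsigned long new = (((old >> 2) + 1) << 2) | ST_RESERVED;

	if ((old & ST_MASK) == ST_RESERVED)
		return false;	/* owned by another writer */
	if (!atomic_compare_exchange_strong(&d->state_var, &old, new))
		return false;
	/* Order the state change before the payload stores that follow. */
	atomic_thread_fence(memory_order_release);
	return true;
}

/* Writer: fill the payload, then publish it. No writes after this. */
static void demo_commit(struct demo_desc *d, unsigned long begin, unsigned long next)
{
	unsigned long cur = atomic_load_explicit(&d->state_var, memory_order_relaxed);

	assert((cur & ST_MASK) == ST_RESERVED);
	atomic_store_explicit(&d->begin, begin, memory_order_relaxed);
	atomic_store_explicit(&d->next, next, memory_order_relaxed);
	atomic_store_explicit(&d->state_var, (cur & ~ST_MASK) | ST_COMMITTED,
			      memory_order_release);
}

/* Reader: check state, copy, re-check state. */
static bool demo_read(struct demo_desc *d, unsigned long *begin, unsigned long *next)
{
	unsigned long s1 = atomic_load_explicit(&d->state_var, memory_order_relaxed);

	if ((s1 & ST_MASK) != ST_COMMITTED)
		return false;
	/* Check state before loading data (first read barrier). */
	atomic_thread_fence(memory_order_acquire);

	*begin = atomic_load_explicit(&d->begin, memory_order_relaxed);
	*next = atomic_load_explicit(&d->next, memory_order_relaxed);

	/* Load data before re-checking state (second read barrier). */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&d->state_var, memory_order_relaxed) == s1;
}
```

The reader trusts its copy only when both state checks return the same
committed value. A copy taken while the descriptor was being recycled
is discarded, which is why the payload stores themselves do not need to
be marked in the kernel version.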

> >> +	do {
> >> +		next_lpos = get_next_lpos(data_ring, begin_lpos, size);
> >> +
> >> +		if (!data_push_tail(rb, data_ring,
> >> +				    next_lpos - DATA_SIZE(data_ring))) {
> >> +			/* Failed to allocate, specify a data-less block. */
> >> +			blk_lpos->begin = INVALID_LPOS;
> >> +			blk_lpos->next = INVALID_LPOS;
> >> +			return NULL;
> >> +		}
> >> +	} while (!atomic_long_try_cmpxchg(&data_ring->head_lpos, &begin_lpos,
> >> +					  next_lpos));
> >> +
> >> +	/*
> >> +	 * No barrier is needed here. The data validity is defined by
> >> +	 * the state of the associated descriptor. They are marked as
> >> +	 * invalid at the moment. And only the winner of the above
> >> +	 * cmpxchg() could write here.
> >> +	 */
> >
> > The (successful) CMPXCHG provides a full barrier. This comment suggests
> > that that could be somehow relaxed? Or the comment could be improved?
>
> You are correct. There is no need for the full barrier here. This code
> is based on Petr's POC. I focussed on making sure needed barriers are in
> place, but did not try to eliminate excessive barriers.

I hope that I'll get a better understanding of the guarantees
of the different atomic operations one day. There are so many variants now.
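
For illustration, a relaxed version of the reservation loop might look
roughly like the following userspace sketch with C11 atomics. It
assumes John's analysis holds, that is, ordering comes from the
descriptor state and not from this cmpxchg. struct demo_ring,
DEMO_RING_SIZE, and demo_alloc() are hypothetical names, and the
overflow check is a simplified stand-in for data_push_tail(): it fails
instead of pushing the tail.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define DEMO_RING_SIZE 64UL	/* hypothetical ring size in bytes */

/* Simplified stand-in for the data ring (assumption). */
struct demo_ring {
	atomic_ulong head_lpos;
	atomic_ulong tail_lpos;
};

/*
 * Reserve @size bytes at the head. On success, store the begin
 * position in @begin and return true.
 */
static bool demo_alloc(struct demo_ring *r, unsigned long size, unsigned long *begin)
{
	unsigned long old = atomic_load_explicit(&r->head_lpos, memory_order_relaxed);
	unsigned long next;

	do {
		next = old + size;

		/* Simplified: give up when the block would overrun the tail. */
		if (next - atomic_load_explicit(&r->tail_lpos, memory_order_relaxed) >
		    DEMO_RING_SIZE)
			return false;

		/*
		 * Relaxed on success and on failure: only the winner may
		 * write to the reserved range, and readers learn about the
		 * data through the descriptor state. On failure, @old is
		 * refreshed with the current head, like try_cmpxchg().
		 */
	} while (!atomic_compare_exchange_weak_explicit(&r->head_lpos, &old, next,
							memory_order_relaxed,
							memory_order_relaxed));

	*begin = old;
	return true;
}
```

In kernel terms this would correspond to something like
atomic_long_try_cmpxchg_relaxed(), but whether that is safe in the
real code depends on the barriers around the descriptor state.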

BTW: Documentation/memory-barriers.txt describes various aspects of
the memory barriers. It describes implicit barriers provided
by spin locks, mutexes, semaphores, and various scheduler-related
operations.

But I can't find any explanation of the different variants of the atomic
operations: acquire, release, fetch, return, try, relaxed. I can find
some clues here and there but it is hard to get the whole picture.

Best Regards,
Petr