Re: [RFC 0/2] srcu: Remove pre-flip memory barrier

From: Joel Fernandes
Date: Tue Dec 20 2022 - 09:22:37 EST




> On Dec 20, 2022, at 9:07 AM, Frederic Weisbecker <frederic@xxxxxxxxxx> wrote:
>
> On Tue, Dec 20, 2022 at 08:44:40AM -0500, Joel Fernandes wrote:
>>> C w-depend-r
>>>
>>> {
>>> PLOCK=LOCK0;
>>> }
>>>
>>> // updater
>>> P0(int *LOCK1, int **PLOCK)
>>> {
>>> int lock1;
>>>
>>> lock1 = READ_ONCE(*LOCK1); // READ from inactive idx
>>> smp_mb();
>>> WRITE_ONCE(*PLOCK, LOCK1); // Flip idx
>>> }
>>>
>>> // reader
>>> P1(int **PLOCK)
>>> {
>>> int *plock;
>>>
>>> plock = READ_ONCE(*PLOCK); // Read active idx
>>> WRITE_ONCE(*plock, 1); // Write to active idx
>>
>> I am a bit lost here, why would the reader want to write to the active idx?
>> The reader does not update the idx, only the lock count.
>
> So &ssp->sda->srcu_lock_count is the base address and idx is the offset, right?
> The write is then displayed that way:
>
> this_cpu_inc(ssp->sda->srcu_lock_count[idx].counter);
>
> But things could also be thought of the other way around, with idx being the base address and
> ssp->sda->srcu_lock_count being the offset.
>
> this_cpu_inc(idx[ssp->sda->srcu_lock_count].counter);
>
> That would require changing some high-level types, but the result would be the same from
> the memory point of view (and even from the ASM point of view). In the end we
> are dealing with the same address and access.
>
> Now ssp->sda->srcu_lock_count is a constant address value. It doesn't change.
> So it can be zero for example. Then the above increment becomes:
>
> this_cpu_inc(idx.counter);
>
> And then it can be modeled as in the above litmus test.
>
> I had to play that trick because litmus doesn't support arrays but I believe
> it stands. Now of course I may well have got something wrong since I've always
> been terrible at maths...

Ah, OK, I see where you were going with that. Yes, there is an address dependency between reading idx and writing the lock count. But IMHO, the update side is trying to order the write to the index against the reads of the lock count of the previous index (as far as E / B+C is concerned). So on the read side you have to consider two consecutive readers, not a single reader, in order to pair the same accesses correctly. But I could be missing something.
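
For reference, the base/offset swap described above can be checked in plain userspace C. This is only a rough sketch: struct slot and the helper names are made up for illustration and are not the kernel's types or API. It relies on C defining a[i] as *(a + i), so idx[array] names the same object as array[idx]. It says nothing about ordering, only that both spellings reach the same address.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for one lock-counter slot; not the kernel's type. */
struct slot {
	unsigned long counter;
};

/* Usual form: the array is the base address, idx is the offset. */
uintptr_t addr_base_plus_idx(struct slot *lock_count, int idx)
{
	return (uintptr_t)&lock_count[idx];
}

/* Swapped form: idx plays the base, the array plays the offset. */
uintptr_t addr_idx_plus_base(struct slot *lock_count, int idx)
{
	return (uintptr_t)&idx[lock_count];
}

/* Returns 1 when both forms compute the same address for this idx. */
int same_location(int idx)
{
	struct slot lock_count[2] = { { 0 }, { 0 } };

	return addr_base_plus_idx(lock_count, idx) ==
	       addr_idx_plus_base(lock_count, idx);
}

/* Increment through one spelling, read back through the other. */
unsigned long inc_then_read(int idx)
{
	struct slot lock_count[2] = { { 0 }, { 0 } };

	lock_count[idx].counter++;      /* stands in for this_cpu_inc() */
	return idx[lock_count].counter; /* same slot, swapped spelling */
}
```

Calling same_location() and inc_then_read() with idx 0 or 1 is enough to see that the store lands in the slot selected by idx whichever way the expression is written.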

>> Further, the comment does not talk about implicit memory ordering, it’s talking about explicit ordering due to B+C on one side, and E on the other.
>
> Not arguing, I'm also still confused by the comment...

;-)

Thanks,

- Joel


>
> Thanks.