Re: [PATCH 01/20] asm-generic/mmiowb: Add generic implementation of mmiowb() tracking

From: Michael Ellerman
Date: Sun Mar 03 2019 - 04:26:50 EST


Nicholas Piggin <npiggin@xxxxxxxxx> writes:
> Will Deacon's on March 2, 2019 12:03 am:
>> In preparation for removing all explicit mmiowb() calls from driver
>> code, implement a tracking system in asm-generic based loosely on the
>> PowerPC implementation. This allows architectures with a non-empty
>> mmiowb() definition to have the barrier automatically inserted in
>> spin_unlock() following a critical section containing an I/O write.
>
> Is there a reason to call this "mmiowb"? We already have wmb that
> orders cacheable stores vs mmio stores don't we?
>
> Yes ia64 "sn2" is broken in that case, but that can be fixed (if
> anyone really cares about the platform any more). Maybe that's
> orthogonal to what you're doing here, I just don't like seeing
> "mmiowb" spread.
>
> This series works for spin locks, but you would want a driver to
> be able to use wmb() to order locks vs mmio when using a bit lock
> or a mutex or whatever else. Calling your wmb-if-io-is-pending
> version io_mb_before_unlock() would kind of match with existing
> patterns.
>
>> +static inline void mmiowb_set_pending(void)
>> +{
>> + struct mmiowb_state *ms = __mmiowb_state();
>> + ms->mmiowb_pending = ms->nesting_count;
>> +}
>> +
>> +static inline void mmiowb_spin_lock(void)
>> +{
>> + struct mmiowb_state *ms = __mmiowb_state();
>> + ms->nesting_count++;
>> +}
>> +
>> +static inline void mmiowb_spin_unlock(void)
>> +{
>> + struct mmiowb_state *ms = __mmiowb_state();
>> +
>> + if (unlikely(ms->mmiowb_pending)) {
>> + ms->mmiowb_pending = 0;
>> + mmiowb();
>> + }
>> +
>> + ms->nesting_count--;
>> +}
>
> Humour me for a minute and tell me what this algorithm is doing, or
> what was broken about the powerpc one, which is basically:
>
> static inline void mmiowb_set_pending(void)
> {
> struct mmiowb_state *ms = __mmiowb_state();
> ms->mmiowb_pending = 1;
> }
>
> static inline void mmiowb_spin_lock(void)
> {
> }

The current powerpc code clears io_sync in spin_lock().

ie, it would be equivalent to:

static inline void mmiowb_spin_lock(void)
{
ms->mmiowb_pending = 0;
}

Which means that:

spin_lock(a);
writel(x, y);
spin_lock(b);
...
spin_unlock(b);
spin_unlock(a);

Does no barrier.

cheers