Re: [PATCH 09/18] soc: qcom: ipa: GSI transactions

From: Alex Elder
Date: Mon May 20 2019 - 12:52:35 EST


On 5/20/19 11:34 AM, Evan Green wrote:
> On Mon, May 20, 2019 at 7:44 AM Alex Elder <elder@xxxxxxxxxx> wrote:
>>
>> On 5/20/19 9:43 AM, Arnd Bergmann wrote:
>>> I have no idea how two 8-bit assignments could do that,
>>> it sounds like a serious gcc bug, unless you mean two
>>> 8-byte assignments, which would be within the range
>>> of expected behavior. If it's actually 8-bit stores, please
>>> open a bug against gcc with a minimized test case.
>>
>> Sorry, it's 8 *byte* assignments, not 8 bit. -Alex
>
> Is it important to the hardware that you're writing all 128 bits of

No, it is not important in the ways you are describing.

We're just geeking out over how to get optimal performance.
A single 128-bit write is nicer than two 64-bit writes,
or more smaller writes.

The hardware won't touch the TRE until the doorbell gets
rung telling it that it is permitted to do so. The doorbell
is an I/O write, which implies a memory barrier, so the entire
TRE will be up-to-date in memory regardless of whether we
write it 128 bits or 8 bits at a time.

-Alex

> this in an atomic manner? My understanding is that while you may get
> different behaviors using various combinations of
> volatile/aligned/packed, this is way deep in the compiler's "free to
> do whatever I want" territory. If the hardware's going to misbehave if
> you don't get an atomic 128-bit write, then I don't think this has
> been nailed down enough, since I think Clang or even a different
> version of GCC is within its right to split the writes up differently.
>
> Is filling out the TRE touching memory that the hardware is also
> watching at the same time? Usually when the hardware cares about the
> contents of a struct, there's a particular (smaller) field that can be
> written out atomically. I remember USB having these structs that
> needed to be filled out, but the hardware wouldn't actually slurp them
> up until the smaller "token" part was non-zero. If the hardware is
> scanning this struct, it might be safer to do it in two steps: 1)
> flush out the filled out struct except for the field with the "go" bit
> (which you'd have zeroed), then 2) set the field containing the "go"
> bit.
>