One CPU still needs to be able to mutate the flags of another CPU to fire an
IPI; AIUI the per-cpu ops are *not* atomic for concurrent access by multiple
CPUs, and in fact there is no API for that, only for "this CPU".
Huh, I really thought we had an API for that, but you're right. Oh well! But
I'd still suggest a per-cpu atomic_t in that case, rather than the array.
I think a more idiomatic (and portable) way to do this would be to use
the relaxed accessors, but with smp_mb__after_atomic() between them. Do you
have a good reason for _not_ doing it like that?
Not particularly, other than symmetry with the case below.
I think it would be better not to rely on arm64-specific ordering unless
there's a good reason to.
We do need the return data here, and the release semantics (or another
barrier before it). But the read below can be made relaxed and a barrier
used instead, and then the same pattern as above, except with a plain
atomic_or().
Yes, I think using atomic_fetch_or() followed by atomic_read() would be
best (obviously with the relevant comments!)
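For reference, the shape being discussed can be sketched in userspace C11 as below. This is only an analogy under stated assumptions: `vipi_flag`, `vipi_enable`, `vipi_post()` and `vipi_unmask()` are hypothetical names, the two variables stand in for one target CPU's state, and `atomic_thread_fence(memory_order_seq_cst)` stands in for smp_mb__after_atomic(), which is not defined in exactly the same terms as a C11 fence.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for one target CPU's vIPI state. */
static _Atomic uint32_t vipi_flag;	/* pending vIPI bits */
static _Atomic uint32_t vipi_enable;	/* unmasked vIPI bits */

/*
 * Sender side: mark irq_bit pending and report whether a hardware IPI
 * would need to be fired at the target.
 */
static bool vipi_post(uint32_t irq_bit)
{
	/*
	 * The return value is needed, and release semantics order earlier
	 * shared-memory writes before the flag becomes visible.
	 */
	uint32_t pending = atomic_fetch_or_explicit(&vipi_flag, irq_bit,
						    memory_order_release);

	/* Full fence, standing in for smp_mb__after_atomic(). */
	atomic_thread_fence(memory_order_seq_cst);

	/* Relaxed read; its ordering comes from the fence above. */
	uint32_t enabled = atomic_load_explicit(&vipi_enable,
						memory_order_relaxed);

	return !(pending & irq_bit) && (enabled & irq_bit);
}

/*
 * Unmask side: the mirror image. A plain (relaxed) OR is enough here
 * because the old value is not needed; the fence orders it against the
 * read of the pending flags.
 */
static bool vipi_unmask(uint32_t irq_bit)
{
	atomic_fetch_or_explicit(&vipi_enable, irq_bit, memory_order_relaxed);

	atomic_thread_fence(memory_order_seq_cst);

	/* True if the vIPI was already pending and must be delivered now. */
	return (atomic_load_explicit(&vipi_flag, memory_order_relaxed) &
		irq_bit) != 0;
}
```

With a full barrier between the write and the read on both sides, at least one of the two paths should observe the other's write, so a vIPI posted concurrently with an unmask is not lost.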
It is ordered, right? As the comment says, it "needs to be ordered after the
aic_ic_write() above". atomic_fetch_andnot() is *supposed* to be fully
ordered and that should include against the writel_relaxed() on
AIC_IPI_FLAG. On ARM it turns out it's not quite fully ordered, but the
acquire semantics of the read half are sufficient for this case, as they
guarantee the flags are always read after the FIQ has been ACKed.
Sorry, I missed that the answer to my question was already written in the
comment. However, I'm still a bit unsure about whether the memory barriers
give you what you need here. The barrier in atomic_fetch_andnot() will
order the previous aic_ic_write(AIC_IPI_ACK) for the purposes of other
CPUs reading those locations, but it doesn't say anything about when the
interrupt controller actually changes state after the Ack.
Given that the AIC is mapped Device-nGnRnE, the Arm ARM offers:
| Additionally, for Device-nGnRnE memory, a read or write of a Location
| in a Memory-mapped peripheral that exhibits side-effects is complete
| only when the read or write both:
|
| * Can begin to affect the state of the Memory-mapped peripheral.
| * Can trigger all associated side-effects, whether they affect other
|   peripheral devices, PEs, or memory.
so without AIC documentation I can't tell whether completion of the Ack write
just begins the process of an Ack (in which case we might need something like
a read-back), or whether the write response back from the AIC only occurs once
the Ack has taken effect. Any ideas?
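To make the read-back option concrete, here is a minimal userspace sketch of what the handler side might look like if a read-back turned out to be necessary. Everything here is an assumption for illustration: `FAKE_IPI_ACK` is a made-up offset and not the real AIC register layout, `vipi_pending` and `handle_ipi()` are hypothetical names, and a volatile access to ordinary memory does not model Device-nGnRnE behaviour. Whether reading back the same register would actually force the Ack to take effect is exactly the open question.

```c
#include <stdatomic.h>
#include <stdint.h>

#define FAKE_IPI_ACK	0x0	/* hypothetical offset, not the real AIC map */

/* Hypothetical stand-in for this CPU's pending vIPI flags. */
static _Atomic uint32_t vipi_pending;

/*
 * Ack the hardware IPI, read the register back, and only then consume
 * the pending flags. Returns the enabled flags that were pending.
 */
static uint32_t handle_ipi(volatile uint32_t *regs, uint32_t enabled)
{
	/* Ack write to the (simulated) interrupt controller. */
	regs[FAKE_IPI_ACK / 4] = 1;

	/*
	 * Read-back: on real hardware the intent would be to wait for the
	 * device to have processed the write before continuing.
	 */
	(void)regs[FAKE_IPI_ACK / 4];

	/*
	 * Clear the enabled bits and fetch the old value. The default
	 * seq_cst ordering here is the C11 stand-in for the fully ordered
	 * atomic_fetch_andnot().
	 */
	return atomic_fetch_and(&vipi_pending, ~enabled) & enabled;
}
```

If the write response from the AIC only arrives once the Ack has taken effect, the read-back would be redundant and the sequence reduces to the Ack write followed by the atomic.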