Re: [RFC] Relaxed PIO read vs. DMA write ordering

From: Jesse Barnes
Date: Fri Jan 09 2004 - 18:17:02 EST


On Fri, Jan 09, 2004 at 11:51:47AM -0800, Jesse Barnes wrote:
> On Thu, Jan 08, 2004 at 11:13:47PM -0800, Jeremy Higdon wrote:
> > In theory, such a distinction would be useful for any platform that
> > uses separate paths for PIO read responses and DMA writes. Perhaps
> > there is only one platform that has that feature?

Let me clarify this a bit more (and remember to Cc Ralf and Christoph
this time): there are two types of ordering we're worried about here,
DMA vs. other DMA ordering and DMA vs. PIO ordering.

Some platforms allow DMA buffers to arrive at their destination out of
order when a barrier bit is unset. SGI machines using Bridge (or a
Bridge variant, like sn2 or Origin) chips are implemented this way, so
if a DMA transaction from a device to system memory doesn't have the
barrier bit set, it's allowed to pass other non-barriered DMA.

PIOs are another matter. SGI machines using the above chips enforce
_no_ ordering whatsoever between DMA and PIO traffic. That is, a PIO
read response can 'pass' a DMA write (even a barriered one) and get to
the requesting CPU before the DMA is in the coherence domain. So in
effect, all PIOs on SGI boxes are 'relaxed' by default. For sn2, we've
implemented a special sn_dma_flush() function that we use following a
PIO read in our read() and in() routines so that the driver can be
assured that any DMA that it thinks is done actually is. I'm not sure
this workaround is possible on Origin machines, so certain devices may
be open to data corruption depending on how they interact with the host
system. Our next I/O chip will implement PIO vs. DMA ordering as a
separate address space.
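
To make the problem concrete, here is a rough userspace model of the
behaviour described above. It is only a sketch, not the sn2 code: the
struct and every model_* name are made up for illustration, and
model_dma_flush() merely stands in for what a flush like sn_dma_flush()
has to guarantee.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical model of a device with one DMA write still in flight. */
struct model_dev {
	uint32_t status;       /* device register, read via PIO */
	uint8_t  inflight[8];  /* DMA data posted but not yet in memory */
	size_t   inflight_len; /* 0 when nothing is pending */
	uint8_t *host_buf;     /* destination buffer in "system memory" */
};

/* Device finishes a transfer: posts the DMA write, then sets status. */
static void model_dev_complete(struct model_dev *d,
			       const uint8_t *data, size_t len)
{
	assert(len <= sizeof(d->inflight));
	memcpy(d->inflight, data, len);
	d->inflight_len = len;
	d->status = 1;
}

/* Stand-in for a DMA flush: push any pending DMA into host memory. */
static void model_dma_flush(struct model_dev *d)
{
	if (d->inflight_len) {
		memcpy(d->host_buf, d->inflight, d->inflight_len);
		d->inflight_len = 0;
	}
}

/* Relaxed PIO read: the response may pass the in-flight DMA write. */
static uint32_t model_read_relaxed(struct model_dev *d)
{
	return d->status;
}

/* Ordered PIO read: read the register, then flush before returning. */
static uint32_t model_read(struct model_dev *d)
{
	uint32_t v = d->status;

	model_dma_flush(d);
	return v;
}
```

In this model a driver that sees status == 1 through
model_read_relaxed() can still find stale data in host_buf, while
model_read() only returns once the buffer contents have landed.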

Given the above, a new read_relaxed() routine or an ioremap_relaxed()
routine seems like the best solution for us. Neither is invasive at all
for current drivers (i.e. existing stuff will 'just work'), but either
one does add complexity to the API. Adding a pci_enable_relaxed()
routine is a completely separate issue since it will be a noop on our
platform (and it's probably silly to implement until we have real
hardware that needs it).
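
For illustration, a driver could use such a routine roughly as below.
This is a hedged sketch only: the signatures of reg_read() and
reg_read_relaxed(), the STAT_* bits and the flush stub are all
assumptions made up for the example, since the proposal doesn't pin
down an interface. The point is that polling can use the cheap relaxed
read, with one ordered read paid for just before the driver touches
DMA'd data.

```c
#include <assert.h>
#include <stdint.h>

/* Counts how many (expensive) DMA flushes the driver ended up paying for. */
static unsigned flush_count;

/* Stub standing in for a platform DMA flush. */
static void dma_flush_stub(void)
{
	flush_count++;
}

/* Ordered read (hypothetical signature): read, then flush DMA. */
static uint32_t reg_read(const volatile uint32_t *addr)
{
	uint32_t v = *addr;

	dma_flush_stub();
	return v;
}

/* Relaxed read (hypothetical signature): no ordering against DMA. */
static uint32_t reg_read_relaxed(const volatile uint32_t *addr)
{
	return *addr;
}

#define STAT_BUSY     0x1u /* made-up status bits */
#define STAT_DMA_DONE 0x2u

/*
 * Poll with relaxed reads; once the device claims completion, do one
 * ordered read so the DMA data is known to be visible.
 * Returns 0 on completion, -1 if max_polls is exhausted.
 */
static int wait_for_dma(const volatile uint32_t *status, int max_polls)
{
	assert(status != 0);
	for (int i = 0; i < max_polls; i++) {
		if (reg_read_relaxed(status) & STAT_DMA_DONE)
			return (reg_read(status) & STAT_DMA_DONE) ? 0 : -1;
	}
	return -1;
}
```

Existing drivers that only ever call the ordered variant keep today's
behaviour; only drivers that opt in to the relaxed read need to think
about where the ordered read goes.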

Anyway, thanks for your patience. I probably made a mistake trying to
tie this proposal to the PCI-X spec since I'm only guessing at how such
hardware might behave...

Thanks,
Jesse