Cache incoherencies (WAS: New resources - pls, explain :-( )

Benjamin Herrenschmidt (bh40@calva.net)
Tue, 24 Aug 1999 11:24:38 +0200


On Sun, Aug 22, 1999, Philip Blundell <Philip.Blundell@pobox.com> wrote:

>>Philip> This is what the dma_cache_xxx functions defined in asm/io.h
>>Philip> are for.
>>
>>Just notice that there is currently no cross-architecture definition
>>of these macros - all architectures name them differently and the ones
>>that are coherent don't have them at all.
>
>Oh, I thought all the ports had these macros these days (even though most of
>them are defined to do nothing). Oops.

And I don't think those macros are enough. I may be wrong, but I believe
the way tulip.c is currently fixed for non-cache-coherent archs is wrong.
Why? Because we should also take care of the cache line granularity,
especially when flushing/invalidating the ring descriptors themselves.

A ring descriptor is usually smaller than a cache line. That means that
flushing one of them (after setting the OWN bit for example, or after
changing the buffer pointer) will also write back whatever data shares
its cache line (the descriptor just before or just after). So you can
overwrite data that the chip is just writing into that neighbouring
descriptor.
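
To make the sharing concrete, here is a minimal sketch. It assumes a
16-byte descriptor and a 32-byte cache line; both numbers, the struct
layout and the helper names are illustrative assumptions, not taken from
tulip.c or any particular CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes, for illustration only. Real values depend on the
 * chip and on the CPU. */
#define DESC_SIZE       16u
#define CACHE_LINE_SIZE 32u

/* Hypothetical descriptor layout, 16 bytes in total. */
struct ring_desc {
    uint32_t status;   /* the OWN bit would live here */
    uint32_t length;
    uint32_t buffer1;
    uint32_t buffer2;
};

/* Index of the cache line holding descriptor i, for a ring whose base
 * address is cache-line aligned. */
static size_t desc_cache_line(size_t i)
{
    return (i * DESC_SIZE) / CACHE_LINE_SIZE;
}

/* Nonzero if flushing descriptor a also writes back descriptor b,
 * i.e. if the two are distinct but live in the same cache line. */
static int flush_clobbers(size_t a, size_t b)
{
    assert(sizeof(struct ring_desc) == DESC_SIZE);
    return a != b && desc_cache_line(a) == desc_cache_line(b);
}
```

With these assumed sizes, flush_clobbers(0, 1) is true and
flush_clobbers(1, 2) is false: descriptors pair up two per line, so
handing descriptor 0 to the chip can write a stale copy of descriptor 1
over whatever the chip has just stored there.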

Fortunately, I think this doesn't happen on StrongARM, since there seems
to be a notion of half cache line (2 dirty bits per line), and the half
line size is just the size of a ring descriptor entry. I'm not sure about
the exact behaviour of those half-lines, so the potential bug may still
be there.

Another issue, which doesn't seem to be handled in tulip.c either, is
that for data written by the chip to memory, the cache must be
invalidated before the chip begins writing (that is, when the buffers are
added to the ring), not when the data is read by the CPU. This is because
you may still have stale dirty cache lines corresponding to this piece of
memory that can be written back at any time, corrupting the data being
written by the chip.
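
The ordering problem can be shown with a toy model of a single
write-back cache line. This is only a user-space simulation under
simplifying assumptions (one line, 32 bytes, whole-line writes); every
name in it is hypothetical and none of it is kernel code.

```c
#include <string.h>

#define LINE 32  /* assumed cache line size in bytes */

/* One write-back cache line shadowing one line of memory. */
struct toy_cache_line {
    unsigned char data[LINE];
    int valid;
    int dirty;
};

/* CPU store: lands in the cache only and marks the line dirty. */
static void cpu_write(struct toy_cache_line *c, unsigned char v)
{
    memset(c->data, v, LINE);
    c->valid = 1;
    c->dirty = 1;
}

/* Device DMA: goes straight to memory, bypassing the cache. */
static void dma_write(unsigned char *mem, unsigned char v)
{
    memset(mem, v, LINE);
}

/* Eviction: a dirty line is written back to memory, then dropped. */
static void evict(struct toy_cache_line *c, unsigned char *mem)
{
    if (c->valid && c->dirty)
        memcpy(mem, c->data, LINE);
    c->valid = c->dirty = 0;
}

/* Invalidate: drop the line without writing it back. */
static void invalidate(struct toy_cache_line *c)
{
    c->valid = c->dirty = 0;
}

/* Replays the sequence and returns what memory holds at the end. */
static unsigned char run(int invalidate_before_dma)
{
    unsigned char mem[LINE] = {0};
    struct toy_cache_line c = {{0}, 0, 0};

    cpu_write(&c, 0xAA);          /* earlier use leaves a dirty line */
    if (invalidate_before_dma)
        invalidate(&c);           /* done when the buffer is queued */
    dma_write(mem, 0x55);         /* chip receives a packet */
    evict(&c, mem);               /* may happen at any time */
    return mem[0];
}
```

In this model run(0) returns 0xAA: the stale line is written back on
top of the packet. run(1) returns 0x55, because the line was discarded
before the chip started writing.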

The only way I see to make the first issue safe is to have the ring
descriptor entries non-cacheable. Non-cache-coherent archs should probably
define a kmalloc flag to allocate non-cacheable space. (I still don't know
what the cleanest way to get non-cacheable space is. ioremap?)

For the buffers themselves, we should use the invalidate routine when
adding skbuffs to the ring (and make sure skbuffs are cache-aligned, and
their size a multiple of the cache line size too).
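
The alignment and size conditions could be checked along these lines. A
rough sketch only: the 32-byte line size and the helper names are
assumptions made for the example, not existing kernel interfaces.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 32u  /* assumed; must be a power of two */

/* Round a length up to the next multiple of the cache line size. */
static size_t cache_align_up(size_t n)
{
    assert((CACHE_LINE_SIZE & (CACHE_LINE_SIZE - 1)) == 0);
    return (n + CACHE_LINE_SIZE - 1) & ~(size_t)(CACHE_LINE_SIZE - 1);
}

/* Nonzero if addr sits on a cache line boundary. */
static int is_cache_aligned(uintptr_t addr)
{
    return (addr & (CACHE_LINE_SIZE - 1)) == 0;
}

/* A receive buffer is safe to invalidate as a whole only if it starts
 * on a line boundary and covers whole lines, so that no unrelated data
 * shares its first or last line. */
static int buffer_is_dma_safe(uintptr_t addr, size_t len)
{
    return is_cache_aligned(addr) && cache_align_up(len) == len;
}
```

For example, with a 32-byte line a 1518-byte frame buffer would be
rounded up to 1536 bytes before being handed to the chip.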

In order to make something compilable on both coherent and incoherent
archs, we should standardize the #define for the cache line size, add a
routine/macro to query the coherency of the bus, and provide a
"magic" bit for kmalloc that gives you non-cacheable space.

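As a rough idea of what a driver could do with such an interface, here
is a user-space sketch. Nothing in it exists in the kernel today: the
struct, the ALLOC_UNCACHED bit and both helpers are hypothetical names
invented for this example, and the 32-byte line size is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "magic" allocation bit asking for non-cacheable space. */
#define ALLOC_UNCACHED 0x1u

/* What each architecture would have to export (hypothetical). */
struct arch_dma_info {
    int    bus_coherent;     /* nonzero if DMA is cache coherent */
    size_t cache_line_size;  /* in bytes, a power of two */
};

/* Two example architectures, values assumed for illustration. */
static const struct arch_dma_info coherent_arch   = { 1, 32 };
static const struct arch_dma_info incoherent_arch = { 0, 32 };

/* Allocation flags a driver would use for its descriptor ring. */
static unsigned int ring_alloc_flags(const struct arch_dma_info *a)
{
    return a->bus_coherent ? 0u : ALLOC_UNCACHED;
}

/* Receive buffer size: padded to whole cache lines only when the
 * architecture needs explicit invalidation. */
static size_t rx_buffer_size(const struct arch_dma_info *a, size_t wanted)
{
    size_t mask = a->cache_line_size - 1;

    assert((a->cache_line_size & mask) == 0);
    if (a->bus_coherent)
        return wanted;
    return (wanted + mask) & ~mask;
}
```

The same driver source then works on both kinds of machine: on
coherent_arch it asks for plain memory and 1518-byte buffers, on
incoherent_arch it asks for an uncached ring and 1536-byte buffers.
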
-- 
           Perso. e-mail: <mailto:bh40@calva.net>
           Work   e-mail: <mailto:benh@mipsys.com>
BenH.      Web   : <http://calvaweb.calvacom.fr/bh40/>
