Re: DMA cache consistency bug introduced in 2.6.28 (Was: Re: [Fdutils] Cannot format floppies under kernel 2.6.*?)

From: Linus Torvalds
Date: Thu Dec 17 2009 - 12:28:55 EST




On Thu, 17 Dec 2009, Alain Knaff wrote:
>
> 1. initial contents: 33 44 55 66
> 2. one DMA transfer is performed
> 3. program changes buffer to: 77 88 99 aa
> 4. new DMA transfer is performed => instead it transmits 33 88 99 aa
> (i.e. first byte is from previous contents)
>
> This used to work in 2.6.27.41, but broke in 2.6.28. It doesn't happen on
> all hardware though.

Do you have a list of hardware it works on? Especially chipsets.

On x86, where all caches are supposed to be totally coherent (except for
I$ under very special circumstances), the above should never be able to
happen. At least not unless there is really buggy hardware involved.
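
Just so we're talking about the same thing, I'm assuming the failing
sequence is essentially this from the program's point of view (purely
illustrative -- the device path, transfer size and values here are made
up to match your report):

	#include <fcntl.h>
	#include <string.h>
	#include <unistd.h>

	int main(void)
	{
		unsigned char buf[512];
		int fd = open("/dev/fd0", O_RDWR);	/* some raw floppy device */

		memset(buf, 0, sizeof(buf));
		buf[0] = 0x33; buf[1] = 0x44; buf[2] = 0x55; buf[3] = 0x66;
		write(fd, buf, sizeof(buf));
		fsync(fd);				/* force the first DMA transfer out */

		buf[0] = 0x77; buf[1] = 0x88; buf[2] = 0x99; buf[3] = 0xaa;
		lseek(fd, 0, SEEK_SET);
		write(fd, buf, sizeof(buf));
		fsync(fd);				/* second transfer: the medium
							 * reportedly ends up with 33 88 99 aa */
		close(fd);
		return 0;
	}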

> It does indeed seem to be related to a DMA-side cache (rather than the
> processor's cache not being flushed to main memory), as doing lots of
> memory-intensive work (a kernel compilation) between steps 2 and 3 doesn't
> fix the problem.

I'm not entirely surprised. Actual CPU bugs are pretty rare in the x86
world. But chipset bugs? Another thing entirely. There are buffers and
caches there, and those are sometimes software-visible. The most obvious
case of that is just the IOMMUs themselves, but from your description I
don't think you actually change the DMA _mappings_, do you? Just the
actual buffer (that was then mapped earlier)?
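
Put differently, in DMA-API terms the pattern you describe is "map once,
rewrite the buffer, transfer again" -- roughly the following sketch
(generic streaming API, not the floppy driver's actual code;
start_transfer()/wait_transfer() are made-up stand-ins for whatever kicks
off and completes the transfer):

	#include <linux/dma-mapping.h>
	#include <linux/string.h>

	/* hypothetical helpers, only so the sketch is complete */
	extern void start_transfer(struct device *dev, dma_addr_t handle, size_t len);
	extern void wait_transfer(struct device *dev);

	/* Sketch only: one mapping, reused for two transfers with
	 * different buffer contents in between. */
	static void reuse_mapped_buffer(struct device *dev, void *buf,
					const void *new_data, size_t len)
	{
		dma_addr_t handle = dma_map_single(dev, buf, len, DMA_TO_DEVICE);

		start_transfer(dev, handle, len);	/* first DMA */
		wait_transfer(dev);

		/* give the buffer back to the CPU, rewrite it, then hand it
		 * back to the device -- the mapping itself never changes */
		dma_sync_single_for_cpu(dev, handle, len, DMA_TO_DEVICE);
		memcpy(buf, new_data, len);
		dma_sync_single_for_device(dev, handle, len, DMA_TO_DEVICE);

		start_transfer(dev, handle, len);	/* second DMA: must see new data */
		wait_transfer(dev);

		dma_unmap_single(dev, handle, len, DMA_TO_DEVICE);
	}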

So I don't think it's the IOMMU code itself necessarily, although an IOMMU
may well be involved (e.g. I could easily see a few cachelines' worth of
actual DMA data caching going on in the whole IOMMU too).

And to some degree the floppy driver might be _more_ likely to see some
kinds of bugs, because it uses that crazy legacy DMA engine. So it's not
going to go through the regular PCI DMA hardware paths; it's going to go
through its own special paths that nobody else uses any more (and that
have thus probably not had as much testing).
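
Just to illustrate what "legacy" means here -- this is a rough sketch of
the old ISA-style setup the floppy path goes through (not the driver's
literal code, and the channel number is from memory):

	#include <asm/dma.h>		/* the legacy ISA DMA helpers */

	/* Rough shape of programming the classic 8237-style DMA engine for
	 * one floppy transfer.  'addr' has to be a low, physically
	 * contiguous address the old controller can actually reach. */
	static void legacy_floppy_dma_setup(unsigned long addr,
					    unsigned int count, int write)
	{
		int chan = 2;				/* the traditional floppy channel */
		unsigned long flags = claim_dma_lock();

		disable_dma(chan);
		clear_dma_ff(chan);
		set_dma_mode(chan, write ? DMA_MODE_WRITE : DMA_MODE_READ);
		set_dma_addr(chan, addr);
		set_dma_count(chan, count);
		enable_dma(chan);

		release_dma_lock(flags);
	}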

> In the diff between 2.6.27.41 and 2.6.28, I noticed a lot of changes in
> arch/x86/kernel/amd_iommu.c and related files; could any of these have
> triggered this behavior?

Could it have triggered? Sure. Chipset caches are often flushed by certain
trivial operations (often the caches are small, and operations like "any
PIO access" will make sure they are flushed). Different IOMMU flush
patterns could easily account for it.
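
One cheap experiment along those lines: stick a throwaway PIO access into
the driver between rewriting the buffer and starting the second transfer,
and see whether the corruption goes away (purely a diagnostic hack,
nothing you'd ever want to keep):

	#include <asm/io.h>

	/* Diagnostic only: a dummy port read, on the theory that any PIO
	 * cycle forces the chipset to flush whatever DMA data it has
	 * buffered.  Port 0x80 is the traditional "harmless" delay port. */
	static inline void poke_chipset(void)
	{
		(void)inb(0x80);
	}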

But I think we'd like to see a list of hardware where this can be
triggered, and quite frankly, a 'git bisect' would be absolutely wonderful,
especially if the list of hardware doesn't show any really obvious
patterns (and I assume they aren't all _that_ obvious, or you'd have
mentioned them).
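
For the bisect, something like this between the last known-good and first
known-bad releases should do (assuming a mainline tree with the release
tags):

	git bisect start
	git bisect bad v2.6.28
	git bisect good v2.6.27
	# build, boot, test floppy formatting at each step, then mark it:
	git bisect good		# or: git bisect bad
	# repeat until git names the first bad commit, then clean up with:
	git bisect reset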

Linus