Re: [BUG] PDC20268 crashing during DMA setup on stock Debian 2.6.12-1-powerpc

From: Kyle Moffett
Date: Wed Oct 19 2005 - 18:15:16 EST



On Oct 19, 2005, at 18:13:44, Benjamin Herrenschmidt wrote:
On Wed, 2005-10-19 at 13:48 -0400, Kyle Moffett wrote:

Do you have any other ideas WRT this bug? I've been browsing around in the code a bit, and I plan to try diffing my 2.6.8.1 version of the files against the latest Debian version to see what changed, although I suspect it will be a relatively fat hunk of changes. Thanks for your help!


Nope. The lspci output looks perfectly normal. It looks like a mixture of issues, with BM-DMA being disabled for a reason I haven't figured out and then the code crashing because it doesn't like BM-DMA being disabled ...

BusMaster-DMA definitely should be enabled on that card. After a lot of looking through icky IDE code, I've determined that the reason for the crash is that if there is a "mate", or another IDE bus on the same card, then hwif->dma_master is set to hwif->mate->dma_base on the secondary channel. Since DMA explicitly wasn't enabled on the primary channel, hwif->dma_master on the secondary is 0 even though DMA is enabled there, and therefore we hit that BUG().
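To make that chain of events concrete, here is a small userspace model of it. This is only a sketch and not the kernel code: the struct, the is_secondary flag, both function names, and the exact condition checked before BUG() are assumptions made for illustration. Only the field names mate, dma_base and dma_master come from the description above.

```c
#include <stddef.h>

/* Hypothetical stand-in for the real hwif structure (assumption). */
struct model_hwif {
    struct model_hwif *mate;   /* other channel on the same card, or NULL */
    unsigned long dma_base;    /* 0 means no DMA base was assigned */
    unsigned long dma_master;  /* value the crash check looks at */
    int is_secondary;          /* assumption: simple flag for the second channel */
};

/*
 * Choose dma_master as described above: the secondary channel takes it
 * from its mate. Falling back to the channel's own dma_base otherwise
 * is an assumption of this sketch.
 */
void model_set_dma_master(struct model_hwif *hwif)
{
    if (hwif->mate && hwif->is_secondary)
        hwif->dma_master = hwif->mate->dma_base;
    else
        hwif->dma_master = hwif->dma_base;
}

/*
 * Returns 1 when DMA is enabled on this channel but dma_master is 0,
 * which is the state assumed here to trip the BUG().
 */
int model_would_bug(const struct model_hwif *hwif)
{
    return hwif->dma_base != 0 && hwif->dma_master == 0;
}
```

With a primary whose dma_base is 0 and a secondary whose dma_base is nonzero, model_would_bug() reports 1 for the secondary. Give the primary a nonzero dma_base and the condition goes away, which is why the real question is why the primary ends up with 0.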

Therefore it seems that the only issue is that ide_get_or_set_dma_base is returning 0 when it should return a valid DMA base. The only ways that function can even theoretically return 0 without printing any weird error messages are "if (hwif->mmio)" or "if (hwif->mate && hwif->mate->dma_base)". The latter can't happen before hwif->mate is set up (since the problem occurs while initializing the primary). Could hwif->mmio be nonzero somehow? The only drivers that seem to set it are pci/sgiioc4, pci/siimage, ppc/pmac, and a couple of misc arch drivers.

I see a couple theoretical possibilities:
* hwif->mate and hwif->mate->dma_base are set for the primary while still initializing it (before any secondary is set up).
* hwif->mmio is set somehow even though it shouldn't be. Is the value ever pre-initialized to 0?
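The two possibilities above can be told apart with a check like the following. It is a hedged userspace sketch, not a copy of ide_get_or_set_dma_base: the struct, the enum and the function name are hypothetical, and it only reproduces the two conditions quoted above, in the order quoted, without modelling what the real function returns.

```c
#include <stddef.h>

/* Hypothetical stand-in holding only the fields discussed above (assumption). */
struct sketch_hwif {
    struct sketch_hwif *mate;  /* other channel on the same card, or NULL */
    unsigned long dma_base;    /* 0 means no DMA base was assigned */
    int mmio;                  /* nonzero for memory-mapped I/O interfaces */
};

/* Which of the two "silent" branches a given interface state would take. */
enum sketch_path {
    SKETCH_PATH_MMIO,   /* the "if (hwif->mmio)" branch */
    SKETCH_PATH_MATE,   /* the "if (hwif->mate && hwif->mate->dma_base)" branch */
    SKETCH_PATH_OTHER   /* neither branch: some other path */
};

enum sketch_path sketch_classify(const struct sketch_hwif *hwif)
{
    if (hwif->mmio)
        return SKETCH_PATH_MMIO;
    if (hwif->mate && hwif->mate->dma_base)
        return SKETCH_PATH_MATE;
    return SKETCH_PATH_OTHER;
}
```

A zero-initialized primary with no mate lands in SKETCH_PATH_OTHER, so on this card either result other than that would point at one of the two possibilities listed above. In the kernel the equivalent would be a printk of hwif->mmio, hwif->mate and hwif->mate->dma_base at the top of the function.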

Again, the best approach is to pour printk's all over setup-pci.c and ide-dma.c to figure out what's going on...

I wish I could, but the machine is remote and in production, so it's hard to find time to do much with it. I'm trying as best I can to walk through the sources by hand, specifically with regard to changes between the two kernel versions. I'm hoping I can come up with a good enough guess by Thanksgiving, so that when I have a week near the server to test things out I can make significant progress.

Cheers,
Kyle Moffett

--
I have yet to see any problem, however complicated, which, when you looked at it in the right way, did not become still more complicated.
-- Poul Anderson


