Re: LibPATA code issues / 2.6.17.3 (What is the next step?)

From: Mark Lord
Date: Fri Jul 14 2006 - 16:00:38 EST


Mark Lord wrote:
Justin Piszcz wrote:
They are Western Digital 400* drives.

[4294678.049000] Vendor: ATA Model: WDC WD4000KD-00N Rev: 01.0
[4294678.050000] Vendor: ATA Model: WDC WD4000KD-00N Rev: 01.0

On a SiL controller, it also happens when they are on a promise controller too.

On Fri, 14 Jul 2006, Mark Lord wrote:

Justin Piszcz wrote:

opcode=0x35 & opcode=0xca

Those are non-DMA WRITE opcodes. Using PIO for I/O is pretty rare these days,
so I'm betting that this is not a hard disk device -- compactflash?

Okay. So why are we issuing PIO WRITE commands to drives that
obviously should only be sent DMA commands by libata?

Perhaps that's the bug.

Oh wait.. I remember this.. No, those are DMA commands,
despite the misleading libata name for them. We went through
this before last spring..

Okay. So I wonder what's really going on.
The next step would be to instrument the interrupt handler,
so that when it sees bad-status, it dumps out the stat/err values
right then and there, before anything else can muck with them.

It might also be good to have it dump out the controller engine's
DMA status/err values, assuming the controller has registers for those.

Then we should get a better picture of what's going on.
Assuming the drives aren't lying to us (a perfectly good assumption here),
then the controller must be aborting the transfer unexpectedly.

Cheers
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/