Re: ATA Write Error and Time-out Notification in User Space

From: Alan Cox
Date: Tue Jan 03 2006 - 13:55:36 EST


On Tue, 2006-01-03 at 12:29 -0600, John Treubig wrote:
> failure point. I put a drive on the Secondary IDE bus hanging off the
> motherboard Nvidia NForce 2 controller, began an access and pulled the plug.
> Sure enough the failures occurred and were passed back to user level, but
> the system did not hang. I've repeated this a number of times. I moved the
> same drive to the Promise Controller and the hang occurs. Thus it seems we
> have proved the Promise sub-system is my problem.


Bingo. Yes, I know why this is occurring now.

There is a known old bug with error handling in some cases on Promise
chips. The core kernel code tries to clean up any remaining data after
an error (to handle chip prefetch/postwrite FIFOs) if DRQ_STAT is
asserted. It's a nice trick that saves on resets and slow recovery, but
it isn't compatible with some Promise controllers.
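
As a rough illustration of the cleanup described above, here is a
userspace sketch, not the kernel code. The struct fake_ctrl, both
functions and the no_drain_on_error flag are hypothetical names made up
for this example; only the DRQ_STAT and ERR_STAT bit values come from
the ATA status register layout. It assumes a controller that keeps
DRQ_STAT asserted while leftover words sit in its FIFO.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define ERR_STAT 0x01  /* ATA status register: error bit */
#define DRQ_STAT 0x08  /* ATA status register: data request bit */

/* Hypothetical stand-in for a controller; not a kernel structure. */
struct fake_ctrl {
    uint8_t status;            /* simulated status register */
    size_t  fifo_words;        /* words still queued in the FIFO */
    int     no_drain_on_error; /* assumption: set for chips that cannot be drained */
};

/* Simulated data-port read: consumes one word, drops DRQ once empty. */
static uint16_t read_data_word(struct fake_ctrl *c)
{
    if (c->fifo_words > 0)
        c->fifo_words--;
    if (c->fifo_words == 0)
        c->status &= (uint8_t)~DRQ_STAT;
    return 0;
}

/*
 * After an error, read and discard leftover data while DRQ is asserted.
 * Returns the number of words drained; 0 means nothing was drained and
 * the caller has to fall back to a reset.
 */
static size_t drain_after_error(struct fake_ctrl *c, size_t max_words)
{
    size_t n = 0;

    assert(c != NULL);
    if (!(c->status & ERR_STAT) || c->no_drain_on_error)
        return 0;
    while ((c->status & DRQ_STAT) && n < max_words) {
        (void)read_data_word(c);
        n++;
    }
    return n;
}
```

The max_words bound is there so the loop cannot spin forever on a chip
that never drops DRQ. The no_drain_on_error flag shows one possible way
to skip the trick for controllers that cannot tolerate it; it is not a
description of the actual fix.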

The -mm tree has a partial fix for this; the base kernel does not have
it fixed.

It's been known for some time, so perhaps the IDE maintainers have
patches waiting for 2.6.16 now that it has opened?

Alan
