Re: [PATCH] libata-eh.c should handle AMNF error condition (error byte bit 0, usually code 0x01) in libata-eh.c along with UNC as a media error so SCSI stack can handle it properly (translation code 0x01 is already present in libata-scsi.c) but was never passed down due to lack of handling in EH.

From: Tejun Heo
Date: Tue Jul 15 2014 - 11:16:42 EST


On Tue, Jul 15, 2014 at 10:28:42AM +0400, Alexey Asemov wrote:
> While using a Linux-based machine (AMD 6550M-based notebook, PCI IDs for
> the controller are 1022:7801 subsys 1025:059d) and ddrescue to salvage
> data from a failing hard drive (WD7500BPVT 2.5" 750G SATA2), I found that
> a pure AMNF 0x01 error code generates a generic "device error" that is
> retried several times by the SCSI stack instead of a "media error" that
> is passed up to software.
>
> So we may assume the deprecated AMNF error code is not dead yet, and it
> is better to handle it properly. It is used by modern enough devices,
> and used properly: the drive returned AMNF only when the IDs for a track
> could not be read completely due to a dying head or positioning problems;
> otherwise it returned UNC (uncorrectable).
>
> Not handling it causes the wrong, generic error code ("device error") to
> be reported down the stack, can damage failing drives further through
> excessive retries, and slows salvaging down considerably. Also,
> libata-scsi.c already has handling code for the 0x01 AMNF error.
>
> https://bugzilla.kernel.org/show_bug.cgi?id=80031
>
> Signed-off-by: Alexey Asemov <alex@xxxxxxxxxx>

Applied to libata/for-3.16-fixes w/ shortened $SUBJ (moved to the
first paragraph).

Thanks.

--
tejun
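
For readers following the thread, the classification the patch description
asks for can be sketched as a small standalone C function. This is only an
illustration under stated assumptions, not the code that was applied: the
macro names, the enum, and classify_error() are hypothetical and local to
this example. The bit positions are the standard ATA error register bits
(AMNF is bit 0, as stated in the subject; UNC is bit 6).

```c
#include <stdint.h>

/* ATA error register bits. The names are local to this sketch. */
#define ERR_AMNF 0x01u /* bit 0: address mark not found */
#define ERR_ABRT 0x04u /* bit 2: command aborted */
#define ERR_UNC  0x40u /* bit 6: uncorrectable data error */

/* Hypothetical error classes, standing in for what the EH reports. */
enum err_class {
    ERR_CLASS_NONE,   /* no error bits set */
    ERR_CLASS_MEDIA,  /* media error: pass up instead of retrying blindly */
    ERR_CLASS_DEVICE  /* generic device error */
};

/*
 * Classify a raw error register value. AMNF is treated the same way as
 * UNC, which is the behaviour the patch description argues for.
 */
static enum err_class classify_error(uint8_t error_reg)
{
    if (error_reg == 0)
        return ERR_CLASS_NONE;
    if (error_reg & (ERR_UNC | ERR_AMNF))
        return ERR_CLASS_MEDIA;
    return ERR_CLASS_DEVICE;
}
```

With this sketch, classify_error(0x01) yields ERR_CLASS_MEDIA rather than
the generic ERR_CLASS_DEVICE, which corresponds to the case described in
the report above.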