Re: ATA device reset, shoud I be concerned?

From: Tejun Heo
Date: Mon Jan 21 2008 - 02:56:45 EST


Alan Cox wrote:
>>> [ 9031.028000] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x2
>>> frozen
>>> [ 9031.028000] ata1.00: cmd c8/00:08:90:ca:ce/00:00:00:00:00/e0 tag 0 cdb 0x0
>>> data 4096 in
>>> [ 9031.028000] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4
>>> (timeout)
>
> We got bored of waiting for the drive to respond to our request. I still
> think we have the timeouts too short or are accounting queue time
> somewhere we shouldn't as there a few other examples where we don't allow
> long enough for a drive to retry out and fail with a media error on a bad
> sector.

Hmm.. That's not what I hear from Mark and vendor contacts. They say
30secs is more than enough. I actually am thinking about reducing it to
15secs (not for FLUSH of course) as many SFF controllers report
transmission failure as timeouts. Of course, if we're ticking the timer
while the command is not in flight, that's a bug. If there are cases
where 30 secs isn't enough, can you please point me to those reports?

Thanks.

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/