Re: sata_mv port lockup on hotplug (kernel 2.6.38.2)

From: Tejun Heo
Date: Tue Sep 06 2011 - 12:42:39 EST


On Wed, Sep 07, 2011 at 01:33:55AM +0900, Tejun Heo wrote:
> Hello,
>
> On Tue, Sep 06, 2011 at 01:19:44PM +0100, Bruce Stenning wrote:
> > ata4: EH complete
> > Waking error handler thread
> > scsi_eh_wakeup: succeeded
> > scsi_schedule_eh: succeeded
> > scsi_restart_operations: waking up host to restart
> > Error handler scsi_eh_3 sleeping
>
> I think the following should fix the problem. The code there is from
> the time when libata shared scsi_host->host_lock. libata no longer
> does that so, in the current state, host_eh_scheduled can be cleared
> with actual pending EH condition.

Hmmm... maybe not. Such a race condition exists iff host_eh_scheduled
is incremented outside of ap->lock, and I can't see how that could
happen. Weird. Can you please instrument the following?

* print the caller of scsi_eh_wakeup(). "%pF" with (void *)_RET_IP_
should do it.

* print why scsi_eh is going back to sleep immediately,
i.e. shost->host_failed, host_eh_scheduled and host_busy in
scsi_error_handler(). It would also be nice to add some printks
around schedule() and enable PRINTK_TIME.
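
To make the second item concrete, here is a rough user-space sketch of the
sleep decision that the three counters feed into. It is not kernel code: the
struct and both function names are hypothetical, and the condition is an
assumption based on how scsi_error_handler() appears to test these fields in
kernels of this era, so check it against the actual source before relying on
it.

```c
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the three Scsi_Host counters named above. */
struct eh_counters {
    unsigned int host_failed;       /* commands that have failed */
    unsigned int host_eh_scheduled; /* explicit EH requests pending */
    unsigned int host_busy;         /* commands still in flight */
};

/*
 * Assumed sleep test: the EH thread goes back to sleep when nothing is
 * pending, or when failed commands do not yet account for every busy
 * command. This mirrors the kernel logic only as far as the assumption
 * stated in the lead-in holds.
 */
static bool eh_would_sleep(const struct eh_counters *c)
{
    return (c->host_failed == 0 && c->host_eh_scheduled == 0) ||
           c->host_failed != c->host_busy;
}

/* Format one log line in the shape the requested printk could take. */
static int eh_format_state(char *buf, size_t len, const struct eh_counters *c)
{
    return snprintf(buf, len,
                    "scsi_eh: failed=%u eh_scheduled=%u busy=%u -> %s",
                    c->host_failed, c->host_eh_scheduled, c->host_busy,
                    eh_would_sleep(c) ? "sleep" : "run");
}
```

Under this assumed condition, a host with host_eh_scheduled set but a
nonzero host_busy still puts the thread back to sleep, which is why logging
all three values at the sleep point helps.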

Thanks.

--
tejun