Re: rmmod e1000 hangs (Was Re: 2.6.22-rc2-mm1)

From: Herbert Xu
Date: Thu May 24 2007 - 06:54:20 EST


On Thu, May 24, 2007 at 08:47:13PM +1000, Herbert Xu wrote:
> On Thu, May 24, 2007 at 11:36:22AM +0100, Jeremy Fitzhardinge wrote:
> >
> > I got a hang while rmmodding e1000. sysrq-t shows:
> >
> > rmmod D 003FFAFC 6616 15923 15911 (NOTLB)
> > e9341e44 00000092 82318c15 003ffafc e9341e2c 00000000 e9341e14 823187a1
> > 003ffafc 00000000 c0123862 d3dbab80 d3dbad1c c2c08a40 77a67d01 000001ca
> > 00000292 e9341e24 c03799cd e9341e54 c0540840 e9341e44 00223389 000000ff
> > Call Trace:
> > [<c03777b1>] schedule_timeout+0x70/0x8e
> > [<c03777e4>] schedule_timeout_uninterruptible+0x15/0x17
> > [<c0133d04>] msleep+0x10/0x16
> > [<c030d5e0>] dev_close+0x39/0x6b
>
> Looks like we're spinning on __LINK_STATE_RX_SCHED. This means that
> someone called netif_poll_disable() without re-enabling it again.
> Perhaps e1000_io_error_detected? Auke?

I think the dual meaning of __LINK_STATE_RX_SCHED is seriously broken.
In dev_close we wait for any outstanding poll to terminate, but the
same bit can mean either that a poll is outstanding or that polling
has been disabled.
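
To make the ambiguity concrete, here is a minimal userspace sketch, not
kernel code. The model_* names and the bounded retry count are assumptions
made for illustration; the real code uses atomic bit operations and sleeps
between checks (the msleep in the trace above).

```c
#include <stdbool.h>

/* Simplified single-threaded model of the one overloaded state bit. */
#define MODEL_RX_SCHED (1UL << 0)

struct model_dev {
	unsigned long state;
};

/* Scheduling a poll sets the bit; fails if the bit is already taken. */
static bool model_rx_schedule(struct model_dev *dev)
{
	if (dev->state & MODEL_RX_SCHED)
		return false;
	dev->state |= MODEL_RX_SCHED;
	return true;
}

/* Completing the poll clears the bit. */
static void model_rx_complete(struct model_dev *dev)
{
	dev->state &= ~MODEL_RX_SCHED;
}

/* Disabling polling takes the very same bit... */
static void model_poll_disable(struct model_dev *dev)
{
	dev->state |= MODEL_RX_SCHED;
}

/* ...and only a matching enable gives it back. */
static void model_poll_enable(struct model_dev *dev)
{
	dev->state &= ~MODEL_RX_SCHED;
}

/*
 * dev_close-style wait, bounded here so the model cannot hang.
 * Returns true once the bit is clear, false if it never clears.
 */
static bool model_close_wait(const struct model_dev *dev, int max_tries)
{
	for (int i = 0; i < max_tries; i++) {
		if (!(dev->state & MODEL_RX_SCHED))
			return true;
		/* the real loop would sleep here before retrying */
	}
	return false;
}
```

After model_poll_disable() with no matching model_poll_enable(),
model_close_wait() never succeeds, and from the bit alone it cannot tell
that case apart from a poll that is still running.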

It's a surprise that it has taken so many years for someone to report
a bug on it. I'll try to get this fixed up, probably by adding a bit.
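
One possible shape of "adding a bit", as a rough userspace sketch only.
The sketch_* names and SKETCH_POLL_DISABLED are hypothetical and this is
not the actual patch; a real version would need atomic bit operations.

```c
#include <stdbool.h>

#define SKETCH_RX_SCHED      (1UL << 0) /* a poll is scheduled or running */
#define SKETCH_POLL_DISABLED (1UL << 1) /* hypothetical: polling turned off */

struct sketch_dev {
	unsigned long state;
};

/* A poll may only be scheduled if none is pending and polling is enabled. */
static bool sketch_rx_schedule(struct sketch_dev *dev)
{
	if (dev->state & (SKETCH_RX_SCHED | SKETCH_POLL_DISABLED))
		return false;
	dev->state |= SKETCH_RX_SCHED;
	return true;
}

static void sketch_rx_complete(struct sketch_dev *dev)
{
	dev->state &= ~SKETCH_RX_SCHED;
}

/* Disabling no longer touches the "poll outstanding" bit. */
static void sketch_poll_disable(struct sketch_dev *dev)
{
	dev->state |= SKETCH_POLL_DISABLED;
}

static void sketch_poll_enable(struct sketch_dev *dev)
{
	dev->state &= ~SKETCH_POLL_DISABLED;
}

/* What a close path would wait on: only a genuinely outstanding poll. */
static bool sketch_poll_outstanding(const struct sketch_dev *dev)
{
	return (dev->state & SKETCH_RX_SCHED) != 0;
}
```

With the two meanings split, a disabled device reports no outstanding
poll, so a close path waiting on sketch_poll_outstanding() would not spin
just because polling was left disabled.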

Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu 许志壬 <herbert@xxxxxxxxxxxxxxxxxxx>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt