Re: hang with "wait_on_bh" trace

Simon Kirby (sim@netnation.com)
Tue, 9 Mar 1999 11:07:48 -0800 (PST)


On Tue, 9 Mar 1999, Alan Curry wrote:

> Ingo Molnar writes the following:
> >
> >this looks like the socket locking bug fixed in 2.2.2.
>
> I was reluctant to try 2.2.2 because of reports of scsi lockups due to some
> timer error. I've seen the fix floating around and I know it's in the -ac's
> and -pre's, but I wanted to stay on the Linus Tree.

This has been fixed now in 2.2.3. 2.2.2ac7 which contained the fix has
been running on a server that got the wait_on_bh lockups for 13 days now
without a lockup (but SCSI timeouts due to (I think) a bad cable that are
producing wait_on_bh messages). Previously (2.2.2 and before) it would
lock up several times a day with wait_on_bh messages.

Previously the wait_on_bh messages also had the tcp_v4_hash() trace, so
perhaps there was also a socket locking bug that was fixed as Ingo
mentioned. However, the SCSI timeouts were definitely causing lockups and
that has (also) been fixed.

> What about the "eth0: Too much work in interrupt" coming from 2.0.36 on the
> same machine? That can't be a 2.2.1 bug...

We use EEPRO100s in over 20 servers here, and haven't had this problem
since 2.0 kernels.

Simon-

| Simon Kirby | Systems Administration |
| mailto:sim@netnation.com | NetNation Communications |
| http://www.netnation.com/ | Tech: (604) 684-6892 |

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/