Re: [PATCH] Fix kernel lockup in RTL-8169 gigabit ethernet driver

From: Malvineous
Date: Sat Apr 03 2004 - 19:59:50 EST


> Adam: did you see deadlocks that disappeared after applying your
> patch? It shouldn't deadlock - it should loop until the nic sends the
> packet to the wire. It might take a few msecs, but then it should
> continue. Perhaps gcc optimized away the reload from memory and loops
> on a register. Or there is another bug that is hidden by your patch.

Yes, it used to deadlock within a second of starting a large transfer
across the network (e.g. copying a 1GB+ file over NFS). Smaller
transfers (e.g. background gkrellmd traffic, web browsing, etc.) were
less likely to cause problems: I could go for a few hours before it
would deadlock.

I initially tried to move the counters outside the loop, as I thought it
was just the one entry in the array causing the problem, but this
slowed network traffic down to approx 5kB/sec. Looking at the
RTL8139 code, "else break" appeared to be the correct action. It
brought network traffic back up to full speed, and it has now been 4.5
days since I booted the kernel with the patch, with everything working
perfectly.
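
To illustrate the "else break" idea, here is a simplified user-space
sketch of a TX reclaim loop. It is not the actual driver code: the
struct, the ring size, the DESC_OWN bit and the function name are all
made up for the example. The point is only that when the oldest
descriptor is still owned by the NIC, the loop bails out instead of
spinning on it.

```c
#include <stdint.h>

#define RING_SIZE 8u            /* hypothetical ring size */
#define DESC_OWN  0x80000000u   /* hypothetical "still owned by NIC" bit */

/* Hypothetical, heavily simplified TX ring state. */
struct tx_ring {
    uint32_t status[RING_SIZE]; /* one status word per descriptor */
    unsigned int cur_tx;        /* next slot the driver will fill */
    unsigned int dirty_tx;      /* oldest slot not yet reclaimed */
};

/* Reclaim completed descriptors; returns how many were reclaimed. */
static unsigned int reclaim_tx(struct tx_ring *ring)
{
    unsigned int reclaimed = 0;
    unsigned int tx_left = ring->cur_tx - ring->dirty_tx;

    while (tx_left > 0) {
        unsigned int entry = ring->dirty_tx % RING_SIZE;

        if ((ring->status[entry] & DESC_OWN) == 0) {
            /* The NIC has finished with this descriptor: release it. */
            ring->dirty_tx++;
            tx_left--;
            reclaimed++;
        } else {
            /*
             * Still owned by the NIC. Without this branch tx_left
             * never changes and the loop never exits.
             */
            break;
        }
    }
    return reclaimed;
}
```

With three descriptors queued and the second one still owned by the
NIC, the first call reclaims one entry and returns; a later call picks
up the rest once the NIC has released them.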

When it did deadlock it was more or less permanent: any program
accessing the NIC (or indeed the hard drive) would immediately deadlock
too, so no program could send network data and the loop would spin
forever waiting for more traffic.

I did add a 'printk' line to see what the variables were, just to make
sure this was the location of the bug, and as I expected none of
the values changed at all.
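
On the register-caching theory from the quoted message, here is a
minimal sketch in plain C of a poll loop that cannot be optimized into
spinning on a stale register copy. The function name and parameters are
hypothetical and this is not how the driver is written. The volatile
qualifier forces a fresh load from memory on every pass, and the poll
limit is an extra assumption added so the loop always terminates.

```c
#include <stdint.h>

/*
 * Poll a status word until own_bit clears or max_polls is reached.
 * Returns the number of polls that saw the bit still set.
 * The volatile qualifier makes the compiler re-read *status each pass.
 */
static unsigned int wait_until_clear(const volatile uint32_t *status,
                                     uint32_t own_bit,
                                     unsigned int max_polls)
{
    unsigned int polls = 0;

    while ((*status & own_bit) != 0 && polls < max_polls)
        polls++;

    return polls;
}
```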

Cheers,
Adam.