Re: sky2 PHY setup

From: Stephen Hemminger
Date: Wed Apr 04 2007 - 14:24:45 EST


On Mon, 26 Mar 2007 21:24:06 -0600
Rob Sims <lkml-z@xxxxxxxxxxx> wrote:

> On Fri, Mar 16, 2007 at 02:16:48PM -0700, Stephen Hemminger wrote:
> > On Fri, 16 Mar 2007 14:36:45 -0600
> > Rob Sims <lkml-z@xxxxxxxxxxx> wrote:
>
> > > Are there some debug hooks that can be activated? My sky2 stops
> > > responding (very light load) about twice a day. The netdev watchdog
> > > notices after a while and is able to reactivate the interface:
>
> > > Mar 15 13:28:12 btd kernel: NETDEV WATCHDOG: eth0: transmit timed out
> > > Mar 15 13:28:12 btd kernel: sky2 eth0: tx timeout
> > > Mar 15 13:28:12 btd kernel: sky2 eth0: transmit ring 458 .. 435 report=458 done=458
> > > Mar 15 13:28:12 btd kernel: sky2 eth0: disabling interface
> > > Mar 15 13:28:12 btd kernel: sky2 eth0: enabling interface
> > > Mar 15 13:28:12 btd kernel: sky2 eth0: ram buffer 48K
> > > Mar 15 13:28:15 btd kernel: sky2 eth0: Link is up at 1000 Mbps, full duplex, flow control both
> >
> > Use ethtool -S to if there are any pause frames, etc. See if frames are
> > still making it into PHY statistics but not being received.
> >
> > Use ethtool -d to dump registers. Need current version of ethtool with decode logic.
> >
> > Then look for things like is Ram buffer read/write pointer changing?
> >
> > Is GMAC stuck in pause:
> >
> > Normal is:
> > GMAC 1
> > Status 0x5010 (see GM_GPSR_XXX in sky2.h)
> > Control 0x1800
> >
> > Stuck is
> > GMAC 1
> > Status 0x5810 (or 0x5A10)
>
> First, here's the described hang in action, on the Core2 Duo on a 1Gb
> hub:
> GMAC 1 Status/Control remains at 0x5010/0x1800 until module is removed.
> Read/write buffer pointers are changing. Full ethtool output in
> http://www.robsims.com/sky2.netmon.log.gz
>
> This machine was also having major throughput problems - 17 kB/s.
> Rebooting brought it to ~ 20 MB/s. Booting into a kernel with the
> proprietary sk98lin kernel module showed ~ 80MB/s. Finally, returning
> to sky2 gave 117 MB/s. Tests run using netcat, dd, /dev/zero, and
> /dev/null, transmitting from the problem box to an e1000 via a Netgear
> GS108. No hangs were observed during the "load test."

The vendor driver does not do hardware flow control correctly. It ignores
Tx pause frames.

Were you using Jumbo MTU?

> I also had a hang on a Pentium 4 w/sky2, 100Mb/s hub. I neglected to
> try removing and re-inserting the module before rebooting.
> GMAC 1
> Status 0xF004
This is [100mbit] [Full duplex] [Tx flow control disabled] [link up]
[Rx flow control disabled]

> Control 0x1800
[TX enabled] [Rx enabled]

You might have over run the hub and it wedged. Try doing:
ethtool -r eth0
that forces a down/up


--
Stephen Hemminger <shemminger@xxxxxxxxxxxxxxxxxxxx>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/