Re: Gigabit NIC question.

From: Christopher E. Brown (cbrown@denalics.net)
Date: Wed Jul 12 2000 - 13:22:05 EST


On Wed, 12 Jul 2000, Donald Becker wrote:

> On Tue, 11 Jul 2000, Christopher E. Brown wrote:
> > On Tue, 11 Jul 2000, Alan Cox wrote:
> > > > Richard> It might not be a 2.0.36 driver that is needed. It may be the
> > > > Richard> 2.0.36 reliability!
> > > >
> > > > 2.0.36 reliawhat?? If you are looking for GigE performance you are
> > > > likely to want SMP and SMP in 2.0.36 is absolutely not reliable.
> > >
> > > Actually by 2.0.36 its very reliable. It still gives sucky performance.
> >
> > forbin:~# uname -a;uptime
> > Linux forbin 2.0.36 #2 SMP Sun Apr 4 14:04:56 AKDT 1999 i686 unknown
> > 7:40pm up 435 days, 2:43, 1 user, load average: 5.46, 5.30, 5.28
> > Dual PPro 200Mhz, 128M ECC FPM @66mhz
> > Ultra-Wide + Cheetas
> > Box has been rebooted once since the last kernel build, when
> > we lost power for 4 hrs. We had 3.5 hrs backup power.
>
> That beats our cluster record. The longest uptime we had with 2.0.* was
> between the power conditioner shorting out when a fan was being replaced,
> and a whole-room shutdown that was traced to a air conditioner on fire --
> only nine months. The 2.0.3* series was very, very reliable.

        We had 5 2.0.3x boxes pass the 300day uptime mark, 2 were SMP.
After the original few reboots during install and kernel update/build
they were just left running until a kernel or major OS update was
needed. The rest of our boxes rarely pass the 200 day mark before some
kernel update or such is needed (the internal only boxes vs the
external reachable ones). The only one left is the one listed above.
No crashes, the others were taken down for upgrades.

        Other than a obscure memory leak in an old sangoma wan driver
(the error handler called for unknown format packets leaked a few
bits) getting triggered when an upstream was sending Cisco Router
Discovery packets ever 5 secs for months we have never had a
production box give us any trouble, and the wanpipe was caught and
rebooted before it crashed ( had at least 5 megs of swap left ;) ).
Keep in mind, I am a total freak about hardware and whatnot. Boxes
built inhouse by me, carefully selected higher end parts. All ECC
memory, proper PS units, conditioned power, redundant cooling, etc.
After wiping out all the fixable hardware issues 2.0.36 is very
stable, even in SMP boxes.

        
> > Right about the network io though, it will easily max a single
> > 100Mbit interface, but won't go past 180Mbit total (box has 2 100Mbit
> > FD interfaces).
>
> Hmmm, we had very similar boxes, Dual P6-200 PR440FX motherboards, and got
> better performance. It was about 270Mbps w/ TCP/IP on three bonded Fast
> Ethernet channels. This beat a single Yellowfin card, but IIRC the Hamachi
> got over 300Mbps (user-level 'ttcp' with larger transfer sizes).
>
> Alas, I no longer have that hardware to test with. And it was a local peak
> in performance.

        Oh, this one will do much better than 180Mbps on a ttcp
benchmark, this is a slightly more real world type test. In the
active system there are just to many interrupts flying around (2x
KNE100TX 21140, 2x BT-958) to scale well. The 180Mbps figure is from
a mixed nfsv2 and ftp transfer test w/ 25 clients per service. ttcp
tests will peg both interfaces with very little slowdown noticed on
the box. Playing with the max_interrupt_work, TX_RING_SIZE, and
RX_RING_SIZE let us top things out though. I figured it was just SMP
locking issues keeping the cards from being services often enough.

---
As folks might have suspected, not much survives except roaches, 
and they don't carry large enough packets fast enough...
        --About the Internet and nuclear war.

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sat Jul 15 2000 - 21:00:15 EST