Re: [PATCH-2.2] Bonding Driver Enhancements - final

From: Constantine Gavrilov (const-g@xpert.com)
Date: Thu Oct 05 2000 - 18:43:27 EST


willy tarreau wrote:
>
> Hello Thomas !
>
> ok, all the suggestions, help and doc included. I've
> extended bonding.txt to explain how to proceed with
> HA.
> The modified ifenslave prog is also included in the
> Documentation directory since many people have
> difficulties getting the original and I don't receive
> replies from Donald Becker about this.
>

Donald's email server has been down for a few days; my machine was not
able to send him e-mail.

> It works for me on two alteon switches. I think it
> can safely be included in 2.2.18 (since it also fixes
> some bugs anyway).
>
> I attach the complete patch against 2.2.1[78] with
> help, doc and ifenslave.c.
>

Regarding your last patch: it does not include the documentation
update (the ifenslave.c compile problem is solved).

I have run the tests on both UP and SMP machines for two versions of
your patch: the early patch-bonding-2.2.17.gz (which defaults to MII
link monitoring and does not include the optimized transmit path) and
the latest version I got while we were running the tests --
patch-bonding-20001005.gz.
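
For reference, "MII link monitoring" means polling the PHY's basic
status register for the link bit every miimon milliseconds. Below is a
minimal standalone user-space illustration of that check, using the
generic SIOCGMIIPHY/SIOCGMIIREG ioctls (the driver gets at the same
bit through the slave NIC's driver; this just shows what "link status"
means at the MII level). Run as root, e.g. "./mii-link eth0".

/* mii-link.c: read the link-status bit from the PHY's Basic Mode
 * Status Register (BMSR) via the generic MII ioctls. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/socket.h>
#include <net/if.h>
#include <linux/sockios.h>
#include <linux/mii.h>

int main(int argc, char **argv)
{
	const char *dev = argc > 1 ? argv[1] : "eth0";
	struct ifreq ifr;
	struct mii_ioctl_data *mii = (struct mii_ioctl_data *)&ifr.ifr_data;
	int fd = socket(AF_INET, SOCK_DGRAM, 0);

	memset(&ifr, 0, sizeof(ifr));
	strncpy(ifr.ifr_name, dev, IFNAMSIZ - 1);

	if (ioctl(fd, SIOCGMIIPHY, &ifr) < 0) {	/* find the PHY address */
		perror("SIOCGMIIPHY");
		return 1;
	}
	mii->reg_num = MII_BMSR;		/* basic status register */
	if (ioctl(fd, SIOCGMIIREG, &ifr) < 0) {
		perror("SIOCGMIIREG");
		return 1;
	}
	printf("%s: link %s\n", dev,
	       (mii->val_out & BMSR_LSTATUS) ? "up" : "down");
	close(fd);
	return 0;
}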

TEST RESULTS.

Hardware:
Machine 1: IBM Thinkpad 600X with a 3Com 3CCFE575BT Cyclone CardBus
card and a 3Com Megahertz 574B PCMCIA card, 128MB RAM, PIII
500MHz/256K CPU (UP kernel tested).

Machine 2: Compaq Proliant 3600xx (1U rackmount version) with
2 x 800MHz/256K CPUs, 256MB RAM and 2 on-board Intel EtherExpress
PRO/100 adapters (SMP and UP kernels tested).

Machine 3: IBM Thinkpad laptop with one NIC.

Switches: BayStack switches; two trunks were defined, each between
ports that belonged to different modules.

Software:
2.2.18-pre15 + updated bonding driver + E820 memory detection +
software RAID + NFSv3 server + IP Virtual Server. (Software RAID, NFS
and IP Virtual Server were built as modules and were not used at
testing time, so they should not have influenced the test results.)

BAD RESULTS.
It did not work with the SMP kernel on the Compaq (it worked fine with
the UP kernel, though). With the old patch, if both links were active
at boot, there would be no network at all. If only one link was active
at boot, the network functioned until the second link was brought up.
There were no kernel error messages with the old patch (link detection
worked fine and was reported even while the network was dead).

With the new patch, if both links were up, the network would function
well for only a few seconds. As soon as packets were queued to be sent
via the second NIC, the network would effectively die. The driver
could never send packets via the second interface (ifconfig showed
zero packet statistics for it at all times). In this case, however, I
saw a lot of messages like "eth1: transmit timeout, status..." and
could do some pinging with variable success. After bringing the links
up and down once or twice, though, those messages would stop, no
pinging could be done, and the packet stats of both NICs stopped
changing.

With both the old and the new patch, trying to bring the second
ethernet interface down (either manually with ifconfig or at reboot)
would freeze the machine. It was not a hard lock-up: the machine would
respond to the keyboard, but all commands would get stuck.
Ctrl+Alt+Del would not reboot; SysRq could not sync or unmount but
could reboot.

This could be a case of a Compaq that is not properly configured for
Linux (I did not have access to the config CD, and AFAIK the machine
cannot be configured from the keyboard). For what it is worth, I did
run some kernel compiles with "-j8" and some massive rpm builds on the
machine, and it would not lock up. If it is not a configuration
problem, it looks as if a lock is taken somewhere and never released.
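
To make clear what kind of bug I mean, here is a made-up illustration
(not the bonding code, all names invented) of a lock taken on a stop
path and leaked on an error exit; the next configuration command then
sleeps on it forever, which would match the "commands get stuck but
the box still answers the keyboard" symptom:

/* Illustrative only.  Build with: cc -pthread leak.c */
#include <stdio.h>
#include <pthread.h>

static pthread_mutex_t cfg_lock = PTHREAD_MUTEX_INITIALIZER;

static int stop_second_interface(int have_slave)
{
	pthread_mutex_lock(&cfg_lock);
	if (!have_slave)
		return -1;	/* BUG: early return, cfg_lock never dropped */
	/* ... detach the slave ... */
	pthread_mutex_unlock(&cfg_lock);
	return 0;
}

int main(void)
{
	stop_second_interface(0);	/* error path leaks the lock */
	printf("the next command now blocks here:\n");
	pthread_mutex_lock(&cfg_lock);	/* never returns */
	return 0;
}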

I realize I forgot to test the backup mode with the SMP kernel. No test
results for that, sorry.

GOOD RESULTS.
Both the old and the optimized patch work fine on the UP machines
(Compaq and laptop). I really tried to stress the network (NFS,
interactive ssh sessions, and several simultaneous ftp transfers of
~500MB files) and checked different link-loss/recovery scenarios. Link
loss was always properly detected and was transparent to all
applications. The traffic was distributed fairly between the two
bonded interfaces. I also tested the new backup mode with the latest
patch; it worked fine without exhibiting any problems.
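
For anyone who has not read the driver: the fair distribution comes
from the round-robin transmit policy, where each outgoing packet goes
to the next slave whose link is up. A rough user-space model of that
selection (the names are mine, not the driver's):

#include <stdio.h>

#define NSLAVES 2

struct slave { const char *name; int link_up; unsigned long tx; };

static struct slave slaves[NSLAVES] = {
	{ "eth0", 1, 0 },
	{ "eth1", 1, 0 },
};
static int cur;				/* next slave to try */

static struct slave *pick_slave(void)
{
	int i;

	for (i = 0; i < NSLAVES; i++) {
		struct slave *s = &slaves[cur];

		cur = (cur + 1) % NSLAVES;
		if (s->link_up)
			return s;	/* skip slaves whose link is down */
	}
	return NULL;			/* no usable slave: drop */
}

int main(void)
{
	int pkt;

	slaves[1].link_up = 0;		/* pretend eth1 lost its link */
	for (pkt = 0; pkt < 10; pkt++) {
		struct slave *s = pick_slave();

		if (s)
			s->tx++;
	}
	printf("eth0 tx=%lu  eth1 tx=%lu\n", slaves[0].tx, slaves[1].tx);
	return 0;
}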

The new patch does indeed optimize the load. One of my laptop cards
(PCMCIA Megahertz 574B) is a lousy performer or has a lousy driver:
normally I can't get more than 4-5MB/sec out of it, and its driver
really stresses the CPU (the CardBus card does not have that problem).
With the old bonding patch, I would see 100% CPU usage with one
high-speed FTP session; with the new patch, I see 100% CPU usage only
with two high-speed FTP sessions. The high usage on this specific card
with the latest bonding patch is not a bonding driver problem. I have
never seen high CPU usage on the Compaq with the Intel cards or on the
laptop with the CardBus card, with either version of the bonding
patch. The difference does show, however, that some optimization was
done.

Caveats.
1) Link status change is always properly detected (this can be
verified by watching the packet statistics of the individual
interfaces change in ifconfig). However, for a relatively short link
loss (1-3 seconds), the restoration of the link was not always
reported (though it was always properly detected). What I mean is
this: I take the link out, that gets reported, and I can verify with
ifconfig that only one interface is being used; then I restore the
link, the restoration is NOT reported, but ifconfig shows that both
links are in use again. If I then take the link down again, the link
loss is reported. I could not trigger this with the laptop cards but
could trigger it reliably with the Intel cards on the Compaq.

2) Backup mode was observed to be more resilient to link status
changes. No network performance loss (not even a short one) was
observed during a switch to the backup interface. The trunk
round-robin mode does exhibit a short (very short with the latest
patch) transfer-speed degradation when a link is lost. The worse case
is when a link is restored -- the link-up is detected immediately, but
it takes 1-2 seconds for the switch to finish negotiation or update
the trunk status. Thus the bonding driver starts to use the NIC before
the switch is actually ready, which causes some retransmits and a drop
in network performance for 1-2 seconds. This was completely
transparent to all applications, though (no NFS or other kernel or
application messages). I do realize that this behavior and the
time-tuning parameters will be switch-specific. Probably, in my case,
if I specify a large monitor interval, network loss will occur when a
link goes down; if I specify a short one, network loss will occur when
the link is restored. (I used 100 ms with the new driver and the
default 1 sec with the initial patch.) Maybe we can (and should?) make
this smoother for the round-robin policy by not re-using an interface
immediately when its link is restored? We could use a second parameter
for interface re-use, or make it static, say 2*miimon (see the sketch
after this list).

3) The previous two issues are really minor and I can live with them.
If we are talking about features, we should remove the
one-bonding-interface limitation!!!
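
To be concrete about the suggestion in caveat 2, here is a small
user-space model of a miimon loop in which a restored link has to stay
up for a couple of monitor intervals before it goes back into the
round-robin rotation. The LINK_COMING_UP state and the updelay_ticks
parameter are my invention, not something the current patch has:

#include <stdio.h>

enum link_state { LINK_DOWN, LINK_COMING_UP, LINK_UP };

struct slave {
	const char *name;
	enum link_state state;
	int up_ticks;		/* consecutive ticks the MII reported link */
};

static const int updelay_ticks = 2;	/* e.g. 2 * miimon */

/* Called every miimon milliseconds with the raw MII link bit. */
static void monitor_tick(struct slave *s, int mii_link)
{
	if (!mii_link) {
		if (s->state == LINK_UP)
			printf("%s: link down, removing from rotation\n",
			       s->name);
		s->state = LINK_DOWN;
		s->up_ticks = 0;
		return;
	}
	switch (s->state) {
	case LINK_DOWN:
		s->state = LINK_COMING_UP;	/* seen, but not used yet */
		s->up_ticks = 1;
		break;
	case LINK_COMING_UP:
		if (++s->up_ticks >= updelay_ticks) {
			s->state = LINK_UP;	/* switch had time to settle */
			printf("%s: link up, back in rotation\n", s->name);
		}
		break;
	case LINK_UP:
		break;
	}
}

int main(void)
{
	struct slave s = { "eth1", LINK_UP, 0 };
	/* link drops for one tick, then comes back */
	int samples[] = { 1, 0, 1, 1, 1 };
	unsigned int i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
		monitor_tick(&s, samples[i]);
	return 0;
}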

-- 
----------------------------------------
Constantine Gavrilov
Unix System Administrator and Programmer
Xpert Integrated Systems
1 Shenkar St, Herzliya 46725, Israel
Phone: (972-8)-952-2361
Fax:   (972-9)-952-2366
----------------------------------------
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/


