problem with IP addresses getting mangled - various NIC drivers and kernels

From: Phil N (pnewlon@toosan.com)
Date: Mon Feb 28 2000 - 16:05:22 EST


I want to apologize to all you busy people about this - but I don't know where to turn and hope that
this is actually the right place to do so. :)

NFS "should work right out of the box", right? I seem to have stumbled on the wrapping paper! :0

I have been trying to implement NFS between two systems that are on different IP networks (VLANs,
within the same physical building). Between the hosts are Cisco Catalyst 5500 core switches with
RSMs (router on a blade, routes between VLANs). The client box is attached to a 10/100 Cisco switch
in the closet that has a GB uplink to the core switch. Traffic is routed/switched by the 5500,
ending up in a different closet also attached by a GB uplink. I got the system working just fine
when the hosts sat on the same shared segment, but as soon as I moved them to their respective
locations performance went straight into the crapper. I put a sniffer on the conversation and saw
that there was a huge amount of broadcast/ARP and retransmission traffic being generated - it
appeared that the IP addresses of the hosts were changing about every 200 packets. I then moved the
hosts back to a shared segment and stuck a sniffer there - sure enough there was a multitude of
different IP addresses for the same MAC address. I thought that it must be the NICs or drivers so I
tried a different PC as a client with different NIC and driver (switched the desktop PC for a
laptop) and still had the same results. (As a side note, I thought that the NFS part might be a
problem so I downloaded the NFSv3 patches for 2.2.14 that Dave Higgens is working on and got the
same result, though I am now not surprised by this as I don't believe it is NFS related.)

I have the sniffer traces to show this but I have condensed it a bit here:

All Red Hat 6.1 systems
Server_1 = 2.2.12-20 kernel, Intel PRO100+ NIC, eepro100 driver, Compaq Deskpro PC PII 350 w/128MB
RAM
Server_2 = 2.2.14 kernel, Intel PRO100+ NIC, eepro100 driver, Compaq Deskpro PC PII 350 w/128MB RAM
Client_1 = 2.2.12-20 kernel, Intel PRO100+ NIC, eepro100 driver, IBM 300GL PII 266 w/64MB RAM
Client_2 = 2.2.12-20 kernel, 3Com 10/100 LAN CardBus, 3c575_cb driver, IBM laptop, PII 350 w/192MB
RAM

Config 1 = Server_1 + client_1
Config 2 = Server_2 + client_1
Config 2 = Server_2 + client_1, replaced the NIC with same type
Config_3 = Server_2 + client_2

Sniffer information for Config_3 = Server_2 + client_2 on a shared hub, the two hosts and the
sniffer were the only devices on the hub.

   Frame Source Address Dest. Address Size Rel. Time Delta Time Abs. Time
       1 pnewl01 [10.255.239.133] 170 000:00:00.000 0.000.000 02/25/2000 04:59:05
        DLC: Destination = Station 00902762C4DC
        DLC: Source = Station 00104BF909A8
        IP: Source address = [10.255.239.134], pnewl01
        IP: Destination address = [10.255.239.133]

       2 [10.255.133.10] [255.239.134.8] 138 000:00:00.000 0.000.416 02/25/2000 04:59:05
        DLC: Destination = Station 00104BF909A8
        DLC: Source = Station 00902762C4DC
        IP: Source address = [10.255.133.10]
        IP: Destination address = [255.239.134.8]
.... <snip>
      29 [255.239.134.10] [255.239.133.3] 186 000:00:00.007 0.000.290 02/25/2000 04:59:05
        DLC: Destination = Station 00902762C4DC
        DLC: Source = Station 00104BF909A8
        IP: Source address = [255.239.134.10]
        IP: Destination address = [255.239.133.3]
.... <snip>

The traffic continues normally until frame 207 where the laptop mangles the IP addresses again as in
frame 29 above. Within 1540 frames there are the 2 correct addresses and 14 variations thereof.

Net Station DLC Station Rx Frames Tx Frames
[10.255.239.134] 00104BF909A8 739 747 (Correct address, client)
[10.255.134.10] 00104BF909A8 0 9
[10.255.8.1] 00104BF909A8 1 0
[239.134.10.255] 00104BF909A8 0 6
[239.134.8.1] 00104BF909A8 7 0
[255.239.134.10] 00104BF909A8 0 12
[255.239.134.8] 00104BF909A8 17 0

[10.255.239.133] 00902762C4DC 733 747 (Correct address, server)
[10.255.10.255] 00902762C4DC 0 1
[10.255.133.10] 00902762C4DC 0 6
[10.255.133.3] 00902762C4DC 4 0
[10.255.3.32] 00902762C4DC 1 0
[239.133.10.255] 00902762C4DC 0 4
[239.133.3.32] 00902762C4DC 11 0
[255.239.133.10] 00902762C4DC 0 6
[255.239.133.3] 00902762C4DC 25 0

I am stuck here, not knowing what to do to find the root cause of this and would appreciate a
direction!

TIA, Phil

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Feb 29 2000 - 21:00:21 EST