Bug: kernel 2.2, DNS and udp masquerading, dynamic IP (not related to ip_dynaddr hack)

From: raphael.marmier@span.ch
Date: Sat Aug 26 2000 - 17:48:39 EST


Thanks for your attention, I've been searching the whole web and
kernel-mailing list archive but came up with nothing significant. To the best
of my knowledge, I feel I have a real bug here that I have to share with
you.

Note: Apparently, this problem has no relation with the
"/proc/sys/net/ipv4/ip_dynaddr hack".

Please CC me on any follow-up to this. Thanks.

AFFECTS:
kernel 2.2.14mdk, 2.2.15mdk, 2.2.16, 2.2.16mdk (sorry, couldn't try earlier
versions of 2.2)

DOESN'T affect 2.0 kernels! (at least not Red Hat 5.2 retail kernel)

--------- DESCRIPTION ----------
A kernel 2.2 based Linux machine is part of a private network masqueraded by
a 2.2 based router. The router connects to an ISP with dialup ppp on
_demand_.

The masqueraded 2.2 based machine can resolve names _only_ the first time
the ppp line is brought up. (thanks to ip_dynaddr feature)

After the ppp line is brought down and up again once, all subsequent
queries to the _same_ nameserver (at the ISP) _fail_ (no response), until
the masqueraded udp _session_ to the name server disappears after timing
out.

Cause: every DNS query from a 2.2 based machine has the same source
_port_. The router assumes it belongs to the same udp session, so it reuses
the same masqueraded session every time.
When the external IP changes, the router keeps rewriting udp frames with the
_old_ source IP. Until the masquerading session times out, the workstation
will never get any answer from that nameserver.

The UDP traffic logs show this phenomenon. See below for the demonstration.

Setting the forwarding timeout for udp sessions to a very low number
artificially solves the problem. Unfortunately, I hear that doing this can
break things.
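
As a rough illustration of the cause described above, and of why a short UDP
timeout hides it, here is a toy model of a masquerading table. It is only a
sketch and not the kernel's code: the class and method names are made up, the
session key and port allocation are assumptions, and ip_dynaddr is ignored.
The addresses and the 160 second timeout are taken from this report.

```python
from dataclasses import dataclass


@dataclass
class MasqEntry:
    external_ip: str    # source address written into outgoing packets
    external_port: int  # masqueraded source port
    last_used: float    # time of the last packet, in seconds


class MasqTable:
    """Toy model of a UDP masquerading table (hypothetical, not kernel code)."""

    def __init__(self, udp_timeout=160.0, first_port=61068):
        self.udp_timeout = udp_timeout
        self.next_port = first_port
        self.entries = {}

    def outgoing(self, now, src, sport, dst, dport, current_external_ip):
        """Return the (source IP, source port) put on the wire for a packet."""
        key = (src, sport, dst, dport)
        entry = self.entries.get(key)
        if entry is not None and now - entry.last_used >= self.udp_timeout:
            entry = None  # the session has timed out and is forgotten
        if entry is None:
            # New session: it picks up the address the link has right now.
            entry = MasqEntry(current_external_ip, self.next_port, now)
            self.next_port += 1
            self.entries[key] = entry
        entry.last_used = now  # every packet refreshes the timer
        return entry.external_ip, entry.external_port


table = MasqTable()
ws, ns = "10.0.0.10", "144.85.30.1"

# First dialup: the session is created with the first external address.
first = table.outgoing(0, ws, 1024, ns, 53, "195.15.85.140")
# After redialling, a query from the same source port hits the old session.
stale = table.outgoing(100, ws, 1024, ns, 53, "195.15.83.37")
# A query from a different source port gets a fresh session.
fresh = table.outgoing(105, ws, 1025, ns, 53, "195.15.83.37")
# Once the old session has been idle for the whole timeout, it is recreated.
recovered = table.outgoing(260, ws, 1024, ns, 53, "195.15.83.37")
```

In this model `stale` still carries 195.15.85.140, while `fresh` and
`recovered` carry 195.15.83.37. Note that each retry refreshes the timer, so
the stale session only goes away after a full timeout without any query.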

This doesn't happen when resolving names from Openstep 4.2, OpenBSD 2.6,
Windows 98, and most notably Red Hat 5.2 (_2.0 kernel_). With all these
platforms, a new masqueraded session gets created by the router for each new
connection to the name server.

The problem also arises when using Coyotelinux* on the workstation. It has a
2.2 kernel but runs with the old libraries, which seems to rule out a library
as the cause. From this I have to conclude that the bug is somewhere in the
kernel. Maybe ipv4/udp.c?

This problem is new to 2.2 (2.0 worked). Unfortunately, I couldn't try the
2.4 kernel due to bandwidth limitations and the high cost of telcos here in
Switzerland.

As I'm not proficient in C, and even less in kernel hacking, I cannot
investigate the code directly, let alone suggest a fix.

regards, Raphael
raphael.marmier@span.ch

------ SITUATION ------

One masquerading router, connecting a LAN (10.0.0.0/8) to an ISP with
    PPP in dial on demand mode. (dynamic IP addresses)

    Problem verified with the router running either Coyotelinux
    (2.2.16 kernel) or Mandrake 7.1 (2.2.15 kernel)

One workstation running Mandrake 7.1 and 7.0, Redhat 5.2,
    Openstep (BSD 4.3), Windows 98se, and Coyotelinux.

One server running OpenBSD 2.6.

------ ROUTER CONFIGURATION ------

pppd options:
    demand defaultroute ipcp-accept-local ipcp-accept-remote

ipchains config:

    #!/bin/sh

    # Flush old forwarding rulesets
    /sbin/ipchains -F forward

    # Prevents some (annoying) troubles with dynamic addresses and forwarding.
    echo "1" > /proc/sys/net/ipv4/ip_dynaddr

    # By default, deny all forwarding
    /sbin/ipchains -P forward DENY
    # Allow local clients to access the outside world
    /sbin/ipchains -A forward -j MASQ -s $LOCAL_NETWORK/$LOCAL_NETMASK -d 0.0.0.0/0

    # MASQ timeouts
    #
    # 2 hrs timeout for TCP session timeouts
    # 10 sec timeout for traffic after the TCP/IP "FIN" packet is received
    # 160 sec timeout for UDP traffic (Important for MASQ'ed ICQ users)
    #
    /sbin/ipchains -M -S 7200 10 160

    # Flush old rulesets
    /sbin/ipchains -F input
    /sbin/ipchains -F output

------ DEMONSTRATION ------

On the linux 2.2 workstation:

     nslookup www.span.ch 144.85.30.1
     ---> success
     nslookup www.span.ch 144.85.30.1
     ---> success
   - ppp link brought down and up
     nslookup www.span.ch 144.85.30.1
     ---> failure: "no response from server"

as shown in the log of the workstation (eth0):

08/26-15:04:46.608250 10.0.0.10:1024 -> 144.85.30.1:53
UDP TTL:64 TOS:0x0 ID:417
Len: 50
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
08/26-15:04:51.166299 144.85.30.1:53 -> 10.0.0.10:1024
UDP TTL:60 TOS:0x0 ID:44355
Len: 172
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
08/26-15:04:51.166903 10.0.0.10:1024 -> 144.85.30.1:53
UDP TTL:64 TOS:0x0 ID:418
Len: 37
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
08/26-15:04:51.325976 144.85.30.1:53 -> 10.0.0.10:1024
UDP TTL:60 TOS:0x0 ID:44365
Len: 136
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
08/26-15:06:54.965944 10.0.0.10:1024 -> 144.85.30.1:53
UDP TTL:64 TOS:0x0 ID:431
Len: 50
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
08/26-15:06:59.962344 10.0.0.10:1024 -> 144.85.30.1:53
UDP TTL:64 TOS:0x0 ID:432
Len: 50
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

Notice how the workstation always uses the same port (here 1024) for every
query.

To the router, these queries all look like one and the same connection. As a
result, everything works fine until its external IP changes. When this
happens, all subsequent DNS queries get rewritten with the old source address.
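
To check which source port a workstation picks for successive lookups without
a packet sniffer, something like the following sketch could be run on it. The
function name is made up for illustration. It opens a fresh UDP socket per
round, roughly as separate nslookup runs would, and reports the local port
the kernel assigned; nothing is sent on the wire.

```python
import socket
from typing import List


def udp_source_ports(server: str, count: int = 5, port: int = 53) -> List[int]:
    """Open `count` fresh UDP sockets in turn and report each local port."""
    ports = []
    for _ in range(count):
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            # connect() on a UDP socket sends nothing; it only fixes the peer
            # and makes the kernel choose a local address and port.
            sock.connect((server, port))
            ports.append(sock.getsockname()[1])
    return ports


# Example (needs a route to the nameserver):
#   print(udp_source_ports("144.85.30.1"))
```

A list that repeats one port would match what the logs above show for the 2.2
workstation; a list of differing ports would match the behaviour reported for
the other platforms.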

-------------------------------

On the linux 2.2 router:

Snort outputs a log for each source IP it encounters in packets traversing
the interface ppp0. Thus, we have two logs, one for the IP of the first
dialup (195.15.85.140), one for the second (195.15.83.37).

intercepted with source IP 195.15.85.140:

08/25-23:23:27.406473 195.15.85.140:61068 -> 144.85.30.1:53
UDP TTL:63 TOS:0x0 ID:14704
Len: 37
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
08/25-23:23:27.636473 144.85.30.1:53 -> 195.15.85.140:61068
UDP TTL:61 TOS:0x0 ID:3262
Len: 136

---> corresponds to nslookup www.span.ch 144.85.30.1, successful.

---> The dialup is brought down and up again

---> nslookup www.span.ch 144.85.30.1 , twice. Frames still get
     written with the old source address:

=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
08/25-23:25:13.816473 195.15.85.140:61068 -> 144.85.30.1:53
UDP TTL:63 TOS:0x0 ID:14856
Len: 50
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
08/25-23:25:18.816473 195.15.85.140:61068 -> 144.85.30.1:53
UDP TTL:63 TOS:0x0 ID:14857
Len: 50
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
08/25-23:26:06.626473 195.15.85.140:61068 -> 144.85.30.1:53
UDP TTL:63 TOS:0x0 ID:14889
Len: 50
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
08/25-23:26:11.626473 195.15.85.140:61068 -> 144.85.30.1:53
UDP TTL:63 TOS:0x0 ID:14896
Len: 50

intercepted with source IP 195.15.83.37:

.... nothing, until I try the secondary nameserver:

08/25-23:27:12.516473 195.15.83.37:61069 -> 144.85.20.30:53
UDP TTL:63 TOS:0x0 ID:14912
Len: 51
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+
08/25-23:27:12.676473 144.85.20.30:53 -> 195.15.83.37:61069
UDP TTL:252 TOS:0x0 ID:46507 DF
Len: 162
=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+=+

which works of course...
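
The same check can be done mechanically. The sketch below parses snort header
lines in the format shown above and lists DNS queries that left with a source
address other than the current one. The function names are made up for
illustration, and the sample only contains packets logged after the redial.

```python
import re
from typing import List, NamedTuple


class Packet(NamedTuple):
    timestamp: str
    src_ip: str
    src_port: int
    dst_ip: str
    dst_port: int


# Matches lines such as:
#   08/25-23:25:13.816473 195.15.85.140:61068 -> 144.85.30.1:53
HEADER = re.compile(
    r"^(\S+) (\d+\.\d+\.\d+\.\d+):(\d+) -> (\d+\.\d+\.\d+\.\d+):(\d+)$"
)


def parse_headers(log_text: str) -> List[Packet]:
    """Extract the address line of every packet; other lines are skipped."""
    packets = []
    for line in log_text.splitlines():
        m = HEADER.match(line.strip())
        if m:
            ts, sip, sport, dip, dport = m.groups()
            packets.append(Packet(ts, sip, int(sport), dip, int(dport)))
    return packets


def wrong_source_queries(packets: List[Packet], current_ip: str) -> List[Packet]:
    """DNS queries (destination port 53) not sent from the current address."""
    return [p for p in packets if p.dst_port == 53 and p.src_ip != current_ip]


SAMPLE = """\
08/25-23:25:13.816473 195.15.85.140:61068 -> 144.85.30.1:53
UDP TTL:63 TOS:0x0 ID:14856
Len: 50
08/25-23:25:18.816473 195.15.85.140:61068 -> 144.85.30.1:53
UDP TTL:63 TOS:0x0 ID:14857
Len: 50
08/25-23:27:12.516473 195.15.83.37:61069 -> 144.85.20.30:53
UDP TTL:63 TOS:0x0 ID:14912
Len: 51
08/25-23:27:12.676473 144.85.20.30:53 -> 195.15.83.37:61069
UDP TTL:252 TOS:0x0 ID:46507 DF
Len: 162
"""

packets = parse_headers(SAMPLE)
stale = wrong_source_queries(packets, current_ip="195.15.83.37")
```

With the second dialup address as `current_ip`, `stale` holds the two queries
to 144.85.30.1 that still carry 195.15.85.140.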

--------------------------

On the router, the masquerading session just gets updated because masq cannot
distinguish between different lookups:

Entrées IP Masquerade
prot expire source destination ports
udp 2:26.37 10.0.0.10 144.85.30.1 1024 -> 53 (61068)

---> nslookup www.span.ch 144.85.30.1

Entrées IP Masquerade
prot expire source destination ports
udp 2:32.90 10.0.0.10 144.85.30.1 1024 -> 53 (61068)
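
To compare the two listings, the entry lines can be parsed as in the sketch
below. It assumes the expire column reads minutes:seconds, which is consistent
with the 160 second UDP timeout configured above but is not confirmed here.
The function name and field names are made up for illustration.

```python
import re

# Matches lines such as:
#   udp 2:26.37 10.0.0.10 144.85.30.1 1024 -> 53 (61068)
ENTRY = re.compile(
    r"^(\w+)\s+(\d+):(\d+\.\d+)\s+(\S+)\s+(\S+)\s+(\d+) -> (\d+) \((\d+)\)$"
)


def parse_masq_entry(line: str) -> dict:
    """Split one masquerade listing line into named fields."""
    m = ENTRY.match(line.strip())
    if m is None:
        raise ValueError("not a masquerade entry: %r" % line)
    proto, minutes, seconds, src, dst, sport, dport, mport = m.groups()
    return {
        "proto": proto,
        # Assumed format: minutes:seconds until the entry expires.
        "expire_s": int(minutes) * 60 + float(seconds),
        "source": src,
        "destination": dst,
        "source_port": int(sport),
        "dest_port": int(dport),
        "masq_port": int(mport),
    }


before = parse_masq_entry("udp 2:26.37 10.0.0.10 144.85.30.1 1024 -> 53 (61068)")
after = parse_masq_entry("udp 2:32.90 10.0.0.10 144.85.30.1 1024 -> 53 (61068)")

same_session = before["masq_port"] == after["masq_port"]
timer_refreshed = after["expire_s"] > before["expire_s"]
```

Under that assumption, both listings describe the same session (masqueraded
port 61068), and the remaining time went up after the lookup instead of down,
which fits the session being refreshed and not recreated.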

--------------------------

When several requests are made from Openstep 4.2, snort logs a whole list of
different connections from different udp ports. I'll spare you the details:

UDP:2722-53
UDP:2723-53
UDP:2727-53
UDP:2728-53
UDP:2729-53
UDP:2730-53
UDP:2731-53
UDP:2732-53
UDP:2733-53
UDP:2734-53
and so on.

WHAT I DIDN'T TRY:

- Running the router with a 2.0 kernel
- Running the router with a 2.4 kernel
- Running the workstation with a 2.4 kernel
- Running both with a 2.4 kernel

* Coyotelinux: www.coyotelinux.com
Logs obtained with snort: www.snort.org
