[PATCH 2.2, 2.4] Documentation/networking/ethertap.txt and Configure.help update

From: bert hubert (ahu@ds9a.nl)
Date: Fri Aug 04 2000 - 13:40:58 EST


Hi,

I tried to get the ethertap device working for a project I was writing and
discovered that the kernel-supplied documentation was out of date,
incomplete and partly just wrong.

Therefore I decided to rewrite Documentation/networking/ethertap.txt.
Configure.help turned out to include outdated instructions on the nature of
the netlink. Netlink is now implemented over a socket family, and the
/dev/route, /dev/tap0 style interfaces are deprecated. I updated
Configure.help to reflect this.
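
For anyone who has not yet seen the socket-based interface, here is a minimal
sketch of what opening a netlink socket for routing messages might look like.
It is not part of the patch and assumes current glibc and kernel headers. The
helper names make_rtnetlink_addr() and open_rtnetlink() are made up for
illustration; AF_NETLINK, NETLINK_ROUTE, struct sockaddr_nl and the RTMGRP_*
group masks come from <linux/netlink.h> and <linux/rtnetlink.h>.

```c
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>

/* Hypothetical helper: build a netlink address that subscribes to the
 * given rtnetlink multicast groups (e.g. RTMGRP_LINK). */
struct sockaddr_nl make_rtnetlink_addr(unsigned int groups)
{
	struct sockaddr_nl addr;

	memset(&addr, 0, sizeof(addr));
	addr.nl_family = AF_NETLINK;
	addr.nl_pid = 0;		/* 0: let the kernel assign our port id */
	addr.nl_groups = groups;	/* bitmask of multicast groups */
	return addr;
}

/* Hypothetical helper: open and bind a routing netlink socket.
 * Returns a file descriptor, or -1 on error (errno is set). */
int open_rtnetlink(unsigned int groups)
{
	struct sockaddr_nl addr = make_rtnetlink_addr(groups);
	int fd = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);

	if (fd < 0)
		return -1;
	if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

A caller would do something like fd = open_rtnetlink(RTMGRP_LINK |
RTMGRP_IPV4_ROUTE) and then recv() datagrams from fd; no device node under
/dev is involved.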

While made for 2.4, it applies to 2.2 as well. I'm probably unable to email
Alan anymore, but he is more than welcome to apply this to 2.2.17 as well.

Regards,

bert hubert

The patch:

diff -u -r linux-2.4.0test5.orig/Documentation/Configure.help linux-2.4.0test5/Documentation/Configure.help
--- linux-2.4.0test5.orig/Documentation/Configure.help Fri Jul 28 01:40:57 2000
+++ linux-2.4.0test5/Documentation/Configure.help Fri Aug 4 20:05:27 2000
@@ -4649,9 +4649,11 @@
 Kernel/User network link driver
 CONFIG_NETLINK
   This driver allows for two-way communication between the kernel and
- user processes; the user processes communicate with the kernel by
- reading from and writing to character special files in the /dev
- directory having major mode 36.
+ user processes. It does so by creating a new socket family, PF_NETLINK.
+ Over this socket, the kernel can send and receive datagrams carrying
+ information. It is documented on many systems in netlink(7); a HOWTO is
+ available as well, for example at
+ http://snafu.freedom.org/linux2.2/docs/netlink-HOWTO.html
 
   So far, the kernel uses this feature to publish some network related
   information if you say Y to "Routing messages", below. You also need
@@ -4665,16 +4667,19 @@
 
 Routing messages
 CONFIG_RTNETLINK
- If you say Y here and create a character special file /dev/route
- with major number 36 and minor number 0 using mknod ("man mknod"),
- you (or some user space utility) can read some network related
- routing information from that file. Everything you write to that
- file will be discarded.
+ If you say Y here, userspace programs can receive some network
+ related routing information over the netlink. 'rtmon', supplied
+ with the iproute2 package (ftp://ftp.inr.ac.ru), can read and
+ interpret this data. Information sent to the kernel over this link
+ is ignored.
 
 Netlink device emulation
 CONFIG_NETLINK_DEV
+ This option will be removed soon. Any programs that want to use
+ character special nodes like /dev/tap0 or /dev/route (all with major
+ number 36) need this option, and need to be rewritten soon to use
+ the real netlink socket.
   This is a backward compatibility option, choose Y for now.
- This option will be removed soon.
 
 Asynchronous Transfer Mode (ATM)
 CONFIG_ATM
diff -u -r linux-2.4.0test5.orig/Documentation/networking/ethertap.txt linux-2.4.0test5/Documentation/networking/ethertap.txt
--- linux-2.4.0test5.orig/Documentation/networking/ethertap.txt Thu Jan 6 23:46:18 2000
+++ linux-2.4.0test5/Documentation/networking/ethertap.txt Fri Aug 4 20:31:23 2000
@@ -1,88 +1,262 @@
-Documentation on setup and use of EtherTap.
+Ethertap programming mini-HOWTO
+-------------------------------
 
-Contact Jay Schulist <jschlst@turbolinux.com> if you
-have questions or need futher assistance.
+The ethertap driver was written by Jay Schulist <jschlst@turbolinux.com>;
+you should contact him for further information. This document was written by
+bert hubert <bert.hubert@netherlabs.nl>. Updates are welcome.
 
-Introduction
-============
+What ethertap can do for you
+----------------------------
 
-Ethertap provides packet reception and transmission for user
-space programs. It can be viewed as a simple Ethernet device,
-which instead of receiving packets from a network wire, it receives
-them from user space.
+Ethertap allows you to easily run your own network stack from userspace.
+Tunnels can benefit greatly from this. You can also use it to do network
+experiments. The alternative would be to use a raw socket to send data and
+use libpcap to receive it. Ethertap replaces both with a single interface
+and also does ARP for you if you want.
 
-Ethertap can be used for anything from AppleTalk to IPX to even
-building bridging tunnels. It also has many other general purpose
-uses.
+The more technical blurb:
 
-Ethertap also can do ARP for you, although this is not enabled by
-default.
+Ethertap provides packet reception and transmission for user space programs.
+It can be viewed as a simple Ethernet device which, instead of receiving
+packets from a network wire, receives them from user space.
 
-SetUp
-=====
+Ethertap can be used for anything from AppleTalk to IPX to even building
+bridging tunnels. It also has many other general purpose uses.
 
-First you will have to enable Ethertap in the kernel configuration.
-Then you will need to create any number of ethertap device files,
-/dev/tap0->/dev/tap15. This is done by the following command.
+Configuring your kernel
+-----------------------
 
-mknod /dev/tap* c 36 16 ( 17 18 19 20 for tap1,2,3,4...)
+Firstly, you need this in Code maturity level options:
 
-** Replace * with the proper tap device number you need. **
+ #
+ # Code maturity level options
+ #
+ CONFIG_EXPERIMENTAL=y
 
-Now with your kernel that has ethertap enabled, you will need
-to ifconfig /dev/tap* 192.168.1.1 (replace 192.168.1.1 with the
-proper IP number for your situation.)
+Then you need Netlink support:
 
-If you want your Ethertap device to ARP for you would ifconfig
-the interface like this: ifconfig tap* 192.168.1.1 arp
+ CONFIG_NETLINK=y
 
-Remember that you need to have a corresponding /dev/tap* file
-for each tap* device you need to ifconfig.
+This allows the kernel to exchange data with userspace applications. There
+are two ways of doing this. The new way works with netlink sockets, and I
+have no experience with that yet. ANK uses it in his excellent iproute2
+package; see for example rtmon.c. iproute2 can be found at
+ftp://ftp.inr.ac.ru/ip-routing/iproute2*
 
-Now Ethertap should be ready to use.
+The new way is partly described in netlink(7), available at
+http://www.europe.redhat.com/documentation/man-pages/man7/netlink.7.php3
 
-Diagram of how Ethertap works. (Courtesy of Alan Cox)
-====================================================
+There is also a Netlink-HOWTO, available on http://snafu.freedom.org/linux2.2/docs/netlink-HOWTO.html
+Sadly I know of no code using ethertap with this new interface.
 
-This is for a tunnel, but you should be able to
-get the general idea.
+The older way works by opening character special files with major number 36.
+Enable this with:
 
- 1.2.3.4 will be the router to the outside world
- 1.2.3.5 our box
- 2.0.0.1 our box (appletalk side)
- 2.0.0.* a pile of macintoys
+ CONFIG_NETLINK_DEV=m
 
+Please be advised that this support is going to be dropped at some point in
+the future!
 
-[1.2.3.4]-------------1.2.3.5[Our Box]2.0.0.1---------> macs
+Then finally in the Network Devices section,
 
-The routing on our box would be
+ CONFIG_ETHERTAP=m
 
- ifconfig eth0 1.2.3.5 netmask 255.255.255.0 up
- route add default gw 1.2.3.4
- ifconfig tap0 2.0.0.1 netmask 255.255.255.0 up arp
- (route add 2.0.0.0 netmask 255.255.255.0)
+You can include it directly in the kernel if you want, of course, no need
+for modules.
 
-C code for a Simple program using an EtherTap device
-====================================================
+Setting it all up
+-----------------
 
-This code is just excerpts from a real program, so some parts are missing
-but the important stuff is below.
+First we need to create the /dev/tap0 device node:
 
-void main (void)
-{
- int TapDevice, eth_pkt_len = 0;
- unsigned char full_pkt_len[MAX_PKT_LEN];
+ # mknod /dev/tap0 c 36 16
+ # mknod /dev/tap1 c 36 17
+ (etc)
+
+Include the relevant modules (ethertap.o, netlink_dev.o, perhaps netlink.o),
+and bring up your tap0 device:
+
+ # ifconfig tap0 10.0.0.123 up
+
+Now that your device is up and running, you can ping it as well. This
+confused me to no end: nothing is connected to our ethertap as yet, so
+how is it that we can ping it?
+
+It turns out that the ethertap is just like a regular network interface -
+even when it's down you can ping it. We need to route stuff to it:
+
+ # route add -host 10.0.0.124 gw 10.0.0.123
+
+Now we can read /dev/tap0 and when we ping 10.0.0.124 from our
+localhost, output should appear on the screen.
+
+ # cat /dev/tap0
+ :ßVU:9````````````````````````şışET@?'
+
+
+Getting this to work from other hosts
+-------------------------------------
+
+For this to work, you often need proxy ARP.
+
+ # echo 1 > /proc/sys/net/ipv4/conf/eth0/proxy_arp
+
+eth0 here stands for the interface that connects to 'other hosts'.
 
- TapDevice = open("/dev/tap0", O_RDWR);
- if(TapDevice < 0)
- {
- perror("Error opening device");
- exit(1);
- }
+Chances are that you are trying this on a non-routing desktop computer, so
+you need to enable ip forwarding:
 
- write(TapDevice, full_packet, eth_pkt_len);
+ # echo 1 > /proc/sys/net/ipv4/ip_forward
 
- close(TapDevice);
+You should now be able to ping 10.0.0.124 from other hosts on your
+10.0.0.0/8 subnet. If you are using public ip space, it should work from
+everywhere.
 
- return;
+ARP
+---
+
+If we were to take things very literally, your tcp/ip pseudo stack would
+also have to implement ARP and MAC addresses. This is often a bit silly as
+the ethertap device is a figment of our imagination anyway. However, should
+you want to go 'all the way', you can add the 'arp' flag to ifconfig:
+
+ # ifconfig tap0 10.0.0.123 up arp
+
+This may also be useful when implementing a bridge, which needs to bridge
+ARP packets as well.
+
+The sample program below will no longer work then, because it does not
+implement ARP.
+
+Sample program
+--------------
+
+A sample program is included somewhere in the bowels of the netfilter
+source. I've extracted this program and list it here. It implements a very
+tiny part of the IP stack and can respond to any pings it receives. It gets
+confused if it receives ARP, as it tries to parse it by treating it as an IP
+packet.
+
+/* Simple program to listen to /dev/tap0 and reply to pings. */
+#include <fcntl.h>
+#include <netinet/ip.h>
+#include <netinet/ip_icmp.h>
+#if defined(__GLIBC__) && (__GLIBC__ == 2)
+#include <netinet/tcp.h>
+#include <netinet/udp.h>
+#else
+#include <linux/tcp.h>
+#include <linux/udp.h>
+#endif
+#include <string.h>
+#include <stdio.h>
+#include <errno.h>
+#include <unistd.h>
+
+u_int16_t csum_partial(void *buffer, unsigned int len, u_int16_t prevsum)
+{
+	u_int32_t sum = 0;
+	u_int16_t *ptr = buffer;
+
+	while (len > 1) {
+		sum += *ptr++;
+		len -= 2;
+	}
+	if (len) {
+		union {
+			u_int8_t byte;
+			u_int16_t wyde;
+		} odd;
+		odd.wyde = 0;
+		odd.byte = *((u_int8_t *)ptr);
+		sum += odd.wyde;
+	}
+	sum = (sum >> 16) + (sum & 0xFFFF);
+	sum += prevsum;
+	return (sum + (sum >> 16));
+}
+
+int main()
+{
+	int fd, len;
+	union {
+		struct {
+			char etherhdr[16];
+			struct iphdr ip;
+		} fmt;
+		unsigned char raw[65536];
+	} u;
+
+	fd = open("/dev/tap0", O_RDWR);
+	if (fd < 0) {
+		perror("Opening `/dev/tap0'");
+		return 1;
+	}
+
+	/* u.fmt.ip.ihl in host order! Film at 11. */
+	while ((len = read(fd, &u, sizeof(u))) > 0) {
+		u_int32_t tmp;
+		struct icmphdr *icmp
+			= (void *)((u_int32_t *)&u.fmt.ip + u.fmt.ip.ihl );
+		struct tcphdr *tcp = (void *)icmp;
+		struct udphdr *udp = (void *)icmp;
+
+		fprintf(stderr, "SRC = %u.%u.%u.%u DST = %u.%u.%u.%u\n",
+			(ntohl(u.fmt.ip.saddr) >> 24) & 0xFF,
+			(ntohl(u.fmt.ip.saddr) >> 16) & 0xFF,
+			(ntohl(u.fmt.ip.saddr) >> 8) & 0xFF,
+			(ntohl(u.fmt.ip.saddr) >> 0) & 0xFF,
+			(ntohl(u.fmt.ip.daddr) >> 24) & 0xFF,
+			(ntohl(u.fmt.ip.daddr) >> 16) & 0xFF,
+			(ntohl(u.fmt.ip.daddr) >> 8) & 0xFF,
+			(ntohl(u.fmt.ip.daddr) >> 0) & 0xFF);
+
+		switch (u.fmt.ip.protocol) {
+		case IPPROTO_ICMP:
+			if (icmp->type == ICMP_ECHO) {
+				fprintf(stderr, "PONG! (iphdr = %u bytes)\n",
+					(unsigned int)((char *)icmp
+						- (char *)&u.fmt.ip));
+
+				/* Turn it around */
+				tmp = u.fmt.ip.saddr;
+				u.fmt.ip.saddr = u.fmt.ip.daddr;
+				u.fmt.ip.daddr = tmp;
+
+				icmp->type = ICMP_ECHOREPLY;
+				icmp->checksum = 0;
+				icmp->checksum
+					= ~csum_partial(icmp,
+						ntohs(u.fmt.ip.tot_len)
+						- u.fmt.ip.ihl*4, 0);
+
+				{
+					unsigned int i;
+					for (i = 44;
+					     i < ntohs(u.fmt.ip.tot_len); i++){
+						printf("%u:0x%02X ", i,
+							((unsigned char *)
+							 &u.fmt.ip)[i]);
+					}
+					printf("\n");
+				}
+				write(fd, &u, len);
+			}
+			break;
+		case IPPROTO_TCP:
+			fprintf(stderr, "TCP: %u -> %u\n", ntohs(tcp->source),
+				ntohs(tcp->dest));
+			break;
+
+		case IPPROTO_UDP:
+			fprintf(stderr, "UDP: %u -> %u\n", ntohs(udp->source),
+				ntohs(udp->dest));
+			break;
+		}
+	}
+	if (len < 0)
+		perror("Reading from `/dev/tap0'");
+	else fprintf(stderr, "Empty read from `/dev/tap0'\n");
+	return len < 0 ? 1 : 0;
 }
+
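
A note on the ICMP checksum step in the sample program (this is not part of
the patch): the reply's checksum is the complement of a one's complement sum
over 16-bit words, with a trailing odd byte padded with zero. The sketch below
shows the same computation as a standalone function. It is only an
illustration: the name inet_checksum() is made up, it reads words with
memcpy() so the buffer need not be aligned, and it assumes buffers no larger
than an IP datagram so the 32-bit accumulator cannot overflow.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: Internet checksum over len bytes at buf.
 * Words are summed in native byte order; storing the returned value
 * back into the packet with a plain 16-bit store (or memcpy) then puts
 * the correct bytes on the wire on either endianness. */
uint16_t inet_checksum(const void *buf, size_t len)
{
	const unsigned char *p = buf;
	uint32_t sum = 0;
	uint16_t word;

	while (len > 1) {
		memcpy(&word, p, sizeof(word));	/* unaligned-safe load */
		sum += word;
		p += 2;
		len -= 2;
	}
	if (len) {
		/* Odd trailing byte: pad with a zero byte after it. */
		word = 0;
		memcpy(&word, p, 1);
		sum += word;
	}
	/* Fold the carries back into the low 16 bits. */
	while (sum >> 16)
		sum = (sum & 0xFFFF) + (sum >> 16);
	return (uint16_t)~sum;
}
```

A handy sanity check: once the checksum has been stored in the header,
running the function over the same bytes again returns 0.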

-- 
                       |              http://www.rent-a-nerd.nl
                       |                     - U N I X -
                       |          Inspice et cautus eris - D11T'95
