UDP packet loss when running lsof

From: John Miller
Date: Mon May 21 2007 - 04:55:20 EST


Hi,

I'm experiencing silent (no counter increased or log message) UDP
packet loss inside the kernel (2.6.16.27 from Novell).

My application receives about 50 MB of multicast UDP traffic and
writes the packets into files. If I comment out the write() call,
the application runs for days without packet loss. With the write()
at some random times (20 minuntes or even hours) blocks of packets
(from 10 to 300) are lost.

While I can run 10 "while true; do true; done" or a "dd" that writes
additional data to disk without any effect on the packet loss, a
simple "lsof" forces packet loss most of the time.

Are there any more configuration options of the kernel that will
help? Which system calls of lsof are blocking the network data
throughput of the kernel?

Some background about the things I already did:

- I increased the max allowed receive buffer through
proc/sys/net/core/rmem_max and the application calls the right
syscall. "netstat -su" does not show any "packet receive errors".

- After getting "kernel: swapper: page allocation failure.
order:0, mode:0x20", I increased /proc/sys/vm/min_free_kbytes

- ixgb.txt in kernel network documentation suggests to increase
net.core.netdev_max_backlog to 300000. This did not help.

- I also had to increase net.core.optmem_max, because the default
value was too small for 700 multicast groups.

- Network card is handled by bnx2 kernel module

- The application uses multiple threads for receiving and writing,
so the receiver threads can still receive while the writer threads
are waiting for I/O completions.

TIA for any help and suggestions.
JM


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/