IGMP join/leave time variability

From: Nat Ersoz (nat.ersoz@myrio.com)
Date: Wed Jul 25 2001 - 21:04:32 EST


I'm seeing variable latency on IGMP joins and leaves. I'm working with the
2.2.19 kernel. I've placed gettimeofday() printf()s in the user space
program and do_gettimeofday() printk()s in the ethernet driver.

So far, what I've found is typical of this captured data:

---- user space timestamps:
996133011.376224 +UserCloseSource
996133011.377821 -UserCloseSource
996133011.378296 +UserOpenSource:
996133011.379933 -UserOpenSource: result=0

---- tcpdump output:
00:36:43.335501 > stb_nat.et.myrio.com > igmp nreport [ttl 1]
00:36:45.245501 > stb_nat.et.myrio.com > igmp nreport [ttl 1]
00:36:51.376707 > stb_nat.et.myrio.com > all-routers.mcast.net: igmp leave [ttl 1]
00:36:52.275523 > stb_nat.et.myrio.com > igmp nreport [ttl 1]
00:36:53.705502 > stb_nat.et.myrio.com > igmp nreport [ttl 1]
00:37:02.495500 > stb_nat.et.myrio.com > igmp nreport [ttl 1]

---- ethernet driver timestamps (natsemi.o, modified)
Jul 26 00:36:35 stb_nat kernel: eth0: Add Multicast 996132995.817524
Jul 26 00:36:35 stb_nat kernel: ^I1.
Jul 26 00:36:35 stb_nat kernel: eth0: Add Multicast 996132995.819686
Jul 26 00:36:35 stb_nat kernel:
Jul 26 00:36:35 stb_nat kernel:

==== Some notes:
1. The user space socket() calls take less than 4 ms to complete.
2. The ethernet multicast filter gets set very quickly: less than 2 ms.
3. Tcpdump reports that the time between this leave and the following join
report is about 900 ms for this particular transaction. We have correlated
tcpdump's results with actual traffic on the ethernet wire using a network
analyzer and found tcpdump to be accurate.

==== Linux 2.2.19 code:
I have dug into the code, and it seems that the function igmp_group_added(),
found in linux/net/ipv4/igmp.c, is where things really happen. It calls
igmp_start_timer() with an IGMP_Initial_Report_Delay value of (1*HZ). From
what I can tell, this amounts to up to 1 second of delay, depending on what
net_random() returns in igmp_start_timer(). That agrees with our
measurements of IGMP joins varying from "very short" delays to something a
bit over a second.

==== Questions:
For our application, it would be desirable to have the leave/join occur
ASAP with respect to the user mode calls.
1. What would be the harm if I set IGMP_Initial_Report_Delay to something
very small, like 5 to 10 jiffies? I'd expect there is no need for
net_random() in that case?
2. I'm guessing that modifying igmp_start_timer() to call
igmp_timer_expire() directly is not a good idea, since the timers provide
protection against race conditions. (?)

Thanks for wading through this. I looked at the 2.4.3 igmp.c code and
noticed that it's somewhat similar. Right now our app is at 2.2.19, however.

Thanks for any help and thoughts you may offer.


Nat Ersoz Myrio Corporation
Phone: 425.897.7278 Fax:425.897.5600
3500 Carillon Point Kirkland, WA 98033
