tbusy debate: summary and patch

Henner Eisen (eis@baty.hanse.de)
Sat, 17 Oct 1998 20:23:42 +0200


Hi,

I consider the conclusions of the recent tbusy debate rather important,
but nobody has posted a final summary so far, so I have tried to write one.

I've attached it as a patch against 2.1.125, so that it can easily be placed
where every future driver author should look for it. The patch also extends
include/linux/netdevice.h with tbusy wrapper functions (as proposed by
Andi Kleen).
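
To give an idea of how the wrappers are meant to be used, here is a
minimal user-space sketch (not part of the patch, and not kernel code).
It assumes a cut-down struct device, a stubbed mark_bh(), and a
hypothetical xxx driver with a counter of free tx slots; only dev_xoff()
and dev_xon() mirror what the patch adds.

```c
/*
 * User-space model of the xmit flow control pattern; not kernel code.
 * struct device is reduced to the fields used here, mark_bh() is a stub,
 * and the xxx_* names and tx_slots_free are hypothetical.
 */
#include <assert.h>
#include <stddef.h>

#define NET_BH 0

struct device {
    unsigned long tbusy;    /* advisory flow control flag */
    int tx_slots_free;      /* hypothetical: free hardware tx slots */
};

static int net_bh_marked;   /* counts mark_bh(NET_BH) calls in this model */

static void mark_bh(int nr) { (void)nr; net_bh_marked++; }

/* Same bodies as the wrappers added to netdevice.h by the patch. */
static void dev_xoff(struct device *dev) { dev->tbusy = 1; }

static void dev_xon(struct device *dev)
{
    if (dev->tbusy) {
        dev->tbusy = 0;
        mark_bh(NET_BH);
    }
}

/* Returns 0 if the frame was accepted, nonzero to reject it. */
static int xxx_hard_start_xmit(struct device *dev, const void *frame)
{
    assert(dev != NULL && frame != NULL);
    if (dev->tx_slots_free == 0)
        return 1;           /* tbusy is advisory: reject reliably here */
    dev->tx_slots_free--;   /* hand the frame to the (imaginary) hardware */
    if (dev->tx_slots_free == 0)
        dev_xoff(dev);      /* ask the upper layer to stop for now */
    return 0;
}

/* Called when the hardware reports a completed transmission. */
static void xxx_tx_done_interrupt(struct device *dev)
{
    dev->tx_slots_free++;
    dev_xon(dev);           /* ready again; also marks NET_BH */
}
```

The xmit method never relies on tbusy to keep callers away; it checks
its own state and returns nonzero when it cannot take the frame.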

I hope I got it right. Core developers, please have a look at it (and feel
free to include it in the kernel tree if it is correct)!

Henner

diff -urN 2.1.125-i4ldev/Documentation/networking/tbusy.usage 2.1.125-ix25/Documentation/networking/tbusy.usage
--- 2.1.125-i4ldev/Documentation/networking/tbusy.usage Thu Jan 1 01:00:00 1970
+++ 2.1.125-ix25/Documentation/networking/tbusy.usage Sat Oct 17 19:35:32 1998
@@ -0,0 +1,144 @@
+
+
+ dev->tbusy, dev->hard_start_xmit() and flow control
+ ===================================================
+
+
+
+Author: Henner Eisen, <eis@baty.hanse.de>
+
+
+
+ The Linux network device structure (include/linux/netdevice.h::struct device)
+ contains a field named "tbusy". Writers of network device drivers need
+ to understand the exact semantics of this.
+
+ Unfortunately, the role of the tbusy flag has changed in the
+ past. Lots of drivers still contain legacy usage patterns of the tbusy
+ flag. Even new drivers often do so because they were cloned from older
+ drivers or because the author tried to understand tbusy usage by
+ looking at older drivers. Thus, there is definitely a need to clarify
+ the tbusy usage. This is an attempt to do so by summarising the stuff
+ from a recent (1998-09) thread on the linux kernel mailing list and other
+ information I've gathered so far.
+
+
+What it's NOT for:
+==================
+
+dev->tbusy is NOT for device locking or serialisation of
+dev->hard_start_xmit() threads.
+
+It was used for this purpose a long time ago. But since the day when
+Linux 1.2.9 was released, the dev->hard_start_xmit() methods are
+called from bh_atomic context. This guarantees that the device's
+hard_start_xmit() methods are entered in a single threaded manner
+and driver authors can rely on that.
+
+Future (2.3.x) Linux versions might get rid of the bh_atomic
+context in order to take better advantage of SMP. However, the network
+core will continue to guarantee that for each device, its
+dev->hard_start_xmit() method will be called in a single threaded
+manner. Thus, only network device drivers that manage multiple
+device structs need to take care of this foreseeable change
+(and apply some private locking when accessing shared driver data
+from their hard_start_xmit() methods).
+
+
+What it is for:
+===============
+
+dev->tbusy is solely for flow control.
+
+Setting dev->tbusy to 1 asks the upper layer to stop calling
+dev->hard_start_xmit(). Setting it to 0 tells the upper layer
+that the device is ready to send more frames onto the wire.
+
+The flag is advisory only. The upper layer might call
+dev->hard_start_xmit() even if dev->tbusy is set. The only way a
+driver can reliably refuse to transmit a frame is by returning
+something != 0 from its dev->hard_start_xmit() method.
+
+
+Usage:
+======
+
+The normal usage is that the dev->hard_start_xmit() method sets
+dev->tbusy to 1 whenever it decides that the device currently does
+not want to get passed any more frames for transmission. When the
+device's interrupt handler detects that the device is ready to process
+more frames again, it sets dev->tbusy to 0.
+
+Whenever the tbusy flag is set to 0, the driver also needs to do a
+mark_bh(NET_BH). Otherwise, sending out the next frame will be delayed
+until something else marks net_bh active.
+
+
+Race Conditions:
+================
+
+There is a race: an interrupt handler might clear tbusy while
+dev->hard_start_xmit() is executing, and hard_start_xmit() might set it
+again afterwards. On return, tbusy is then still set although the device
+is ready to process more frames. There are two different approaches to
+address that problem:
+
+(a) The driver's hard_start_xmit() must carefully take into account
+ that an interrupt handler might change tbusy. Such drivers need to
+ access tbusy only by means of the atomic bitops
+ [test_and_]{set,clear}_bit(0, &dev->tbusy) because tbusy is not
+ declared volatile.
+(b) The driver does not treat tbusy as a special variable (no atomic
+ bitops). This improves performance because the atomic bitops
+ can be rather expensive on certain architectures.
+ The rare cases where the race occurs are handled by a timeout
+ routine that clears tbusy when it was set for too long.
+
+Drivers that do both waste resources!
+
+Currently, the device structure does not provide any special support for
+a timeout routine. Instead, the upper layer occasionally calls
+dev->hard_start_xmit() although dev->tbusy is set. The
+dev->hard_start_xmit() method can then check the device state and
+return 1 when it detects that the device really is still busy. When
+it detects that it is not busy, it just continues processing the frame
+(clearing tbusy and marking net_bh when appropriate).
+
+As timeout handling by the xmit method is ugly, support for this
+is going to be removed. Thus, authors of new drivers should
+already account for this and not blindly clone the hard_start_xmit() entry
+timeout handling of existing drivers.
+
+
+Wrappers:
+=========
+
+As the flow control / tbusy handling is subject to change,
+authors of new and maintainers of old drivers are encouraged not to
+access tbusy directly but rather use the wrapper functions:
+
+For asking the upper layer to stop passing more frames for a
+while, use
+
+ dev_xoff(struct device *dev);
+
+For telling the upper layer that the device is ready to xmit more
+frames, use
+
+ dev_xon(struct device *dev);
+
+This clarifies the flow control semantics and will help to
+support legacy Linux versions with your driver.
+
+Further, don't inline the timeout handler at the entry point of
+your hard_start_xmit() method. Put it in a separate function, e.g.
+
+static void xxx_tx_timeout(struct device *dev);
+
+And use something like
+
+if (jiffies - dev->trans_start >= XXX_TX_TIMEOUT) xxx_tx_timeout(dev);
+
+when entering your dev->hard_start_xmit() method, or call it directly
+from a timer. That way, if somebody adds special timeout support to
+struct device, your driver can easily be changed to use it.
--- 2.1.125-i4ldev/include/linux/netdevice.h Thu Sep 17 18:56:13 1998
+++ 2.1.125-ix25/include/linux/netdevice.h Sat Oct 17 19:18:14 1998
@@ -449,6 +449,37 @@
#endif


+/*
+ * Flow control for network device xmit.
+ *
+ * Wrapper for setting/clearing dev->tbusy, as suggested by Andi Kleen.
+ * These access dev->tbusy non-atomically which is fast but subject to
+ * race conditions. Drivers applying these functions need to account for
+ * possible races by means of a timeout handler.
+ *
+ * See Documentation/networking/tbusy.usage for details.
+ */
+
+/*
+ * Tell higher layer to stop passing new frames for transmission
+ */
+static __inline__ void dev_xoff(struct device *dev)
+{
+ dev->tbusy = 1;
+}
+
+/*
+ * Tell higher layer to start transmitting again
+ */
+static __inline__ void dev_xon(struct device *dev)
+{
+ if (dev->tbusy) {
+ dev->tbusy = 0;
+ mark_bh(NET_BH);
+ }
+}
+
+
#endif /* __KERNEL__ */

#endif /* _LINUX_DEV_H */
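
As a rough illustration of the timeout layout recommended at the end of
tbusy.usage, here is a user-space sketch (again not part of the patch).
It assumes jiffies is an ordinary variable, a cut-down struct device, an
arbitrary XXX_TX_TIMEOUT value and a hypothetical reset counter. Unlike
the one-line example in the text, the check is only made while tbusy is
set, which is an assumption of this model.

```c
/*
 * User-space model of the suggested timeout layout; not kernel code.
 * jiffies is an ordinary variable here, struct device is cut down, and
 * XXX_TX_TIMEOUT, xxx_tx_timeout() and the reset counter are hypothetical.
 */
#include <assert.h>
#include <stddef.h>

#define XXX_TX_TIMEOUT 5UL          /* in ticks; value chosen arbitrarily */

static unsigned long jiffies;       /* stand-in for the kernel tick counter */

struct device {
    unsigned long tbusy;
    unsigned long trans_start;      /* tick of the last transmit start */
    int resets;                     /* hypothetical: counts recoveries */
};

/* Separate timeout handler, so it could later be called from elsewhere. */
static void xxx_tx_timeout(struct device *dev)
{
    dev->resets++;                  /* a real driver would reset the hardware */
    dev->trans_start = jiffies;
    dev->tbusy = 0;                 /* a real driver would use dev_xon() */
}

static int xxx_hard_start_xmit(struct device *dev, const void *frame)
{
    assert(dev != NULL && frame != NULL);
    if (dev->tbusy) {
        /* Unsigned subtraction stays correct when jiffies wraps around. */
        if (jiffies - dev->trans_start >= XXX_TX_TIMEOUT)
            xxx_tx_timeout(dev);
        else
            return 1;               /* still busy, reject the frame */
    }
    dev->trans_start = jiffies;
    dev->tbusy = 1;                 /* single tx buffer in this model */
    return 0;
}
```

Because the recovery logic lives in xxx_tx_timeout(), moving the call
from the xmit entry point to a timer would not touch the handler itself.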
