[RFC] udp_err compliance: RFC1122 vs. BSD

From: Martin Diehl (mdiehlcs@compuserve.de)
Date: Sun Oct 08 2000 - 20:17:04 EST


just thinking about the RFC1122 vs. BSD compliance issue wrt error
reporting on unconnected udp sockets i'd like to make a proposal for some
kind of solution:

  Approach to have a default policy whether error reporting on udp
  sockets follows the official internet standard from RFC1122 or
  traditional BSD. Depending on this choice code, following the
  corresponding design will work without any changes. Other code
  will work after minor compatibility changes. Nothing new here.

  There is a continuing effort to make Linux IP protocols comply to
  RFC1122 which can be seen from the "Status" comments at the beginning
  of various protocol (icmp/tcp/udp) specific files in net/ipv4. From 1.x
  on Linux implemented what is required by rule of RFC1122:
  "UDP MUST pass to the application layer all ICMP error messages that it
  receives from the IP layer." On the other hand, for IPv4/UDP there is
  a different approach from BSD-like applications which do not pass icmp
  errors on unconnected udp sockets to the application layer. Linux
  implemented a SOL_SOCKET level option called SO_BSDCOMPAT to switch
  from default RFC1122 compliant behaviour to BSD-like on a per-socket
  basis. The described behaviour was consistenly realized including latest
  2.2.18-pre* kernel versions. It is documented (as of man-pages-1.29:
  socket(4), socket(7), udp(4)) exactly this way. Especially, fixing
  BSD-like error handling instead of setting SO_BSDCOMPAT is encouraged.
  Furthermore, SO_BSDCOMPAT is claimed to be scheduled for future removal,
  i.e. RFC1122 compliant behaviour being the only option then.

  Linux' handling of icmp errors on unconnected udp sockets was changed
  somewhere at 2.3.4x. It now favours BSD-like behaviour while dropping
  RFC1122 conformance. This change is hard-coded into the respective
  socket layer functions, i.e. there is no option to change this without
  kernel patch. While SO_BSDCOMPAT is still syntactically there, its
  semantics disappeared. Especially with SO_BSDCOMPAT set to 0 (regardless
  whether by default or explicitly) the socket still behaves BSD-like and
  violates RFC1122.
  This change potentially breaks code relying on previous Linux behaviour.
  While probably most of the arising pitfalls might be attributed to
  misconfigured system configuration, there might be a number of network
  applications designed to rely on RFC1122 to minimize network stalls.
  On the other hand, RFC1122 compliant error handling was shown to cause
  significant trouble at hight network load due to delayed errors from
  former requests passed to the wrong socket.
  Taken together, both approaches are somehow buggy depending on the
  individual point of view. While the former implementation allowed
  overriding the default behaviour by changing it to BSD-like on a
  per-socket basis there is no such opportunity from user space with
  the new one.

  Both approaches are desirable from there respective application but
  do mutually exclude each other. Hence there is some need for a
  solution which allows the user to specify a default policy for
  error handling on unconnected udp sockets. It should be feasible to
  override this default behaviour to help porting from both ends.

Detailed requirements:
  Q1: There must be a system wide error handling policy for ipv4/udp
  Q2: The allowed policies are "RFC1122 compliant" or "BSD-like".
  Q3: udp sockets must be created to honor the default policy at the
      moment they are created.
  Q4: Once created, it must be possible to change the way each individual
      socket of this kind is handled at any moment any number of times.
  Q5: If an icmp error is received for an udp socket it is tentatively
      assigned to it. If the socket behaves "BSD-like" at the moment when
      the error arrives, it is disregarded unless the socket is connected
      or IP_RECVERR is set. If the socket is RFC1122 compliant however,
      the error must be passed to the application layer in any case.
  Q6: The handling of potential races (default policy changed while socket
      is created for example) is unspecified.
      Comment: This shouldn't be much an issue since default policy should
          be set once when booting. Sharing sockets between processes on
          SMP boxes would be another candidate not taken into
  Q7: There must be a hard-coded value for setting the default policy at
      bootup. This should make everything behave like 2.2.x.
  Q8: Non ipv4/udp sockets must not be influenced by this approach.

Design outline:
  D1: Introduce a new sysctl option called "udp_rfc1122" to be placed at
      sysctl net.ipv4 with boolean semantics. (covers Q1+Q2)
  D2: Set the present "bsdish" socket property when creating ipv4/udp
      sockets. (covers Q3)
  D3: Keep the SO_BSDCOMPAT socket option which corresponds to the
      "bsdish" socket property and is managed by get-/setsockopt(2).
      (covers Q4)
  D4: Take proper care of the "bsdish" property when handling udp errors
      the same way it was done before 2.3.4x. (covers Q5)
  D5: Initialize the "udp_rfc1122" to be "true" by default. (covers Q7)

    Q8 covered by D2+D4 (somehow ugly: SO_BSDCOMPAT still belongs to
                         SOL_SOCKET for historical reason)
    Q6 covered a priori by its definition - here: like 2.2.x

Implementation (files to modify):
  I1: include/linux/sysctl.h (D1)
  I2: include/net/udp.h (D1,D5)
  I3: net/ipv4/sysctl_net_ipv4.c (D1)
  I4: net/ipv4/udp.c (D1,D2,D4,D5)

  nothing to do for D3 (still there)
  update of Documentation/networking/ip-sysctl.txt

  external: update man-pages socket(4), socket(7), udp(4), sysctl.conf(5)

Well, taken together, this doesn't seem to be a big thing to do - probably
above is a little bit of overkill. Anyway, it would remove two bugs by one
patch (by allowing to choose which one to have ;-)

Comments? Suggestions? Objetions?
Something to think about for IPv6?

In fact, I have it already running here, just needs some testing before
going to post the patch.


PS: Yes, I've read what was said about it on l-k when the change was made.
In fact, that's why I've tried to provide this rather detailed description
of what I think could be a acceptable solution for both sides of the

To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/

This archive was generated by hypermail 2b29 : Sun Oct 15 2000 - 21:00:11 EST