Re: Fw: sendpkt: Connection refused

Edgar Toernig (froese@gmx.de)
Wed, 14 Oct 1998 00:11:43 +0200


Hi,

Andi Kleen wrote:
> On Tue, Oct 13, 1998 at 05:10:16PM +0200, Edgar Toernig wrote:
> > Andi Kleen wrote:
> > > mblack@csihq.com (Mike Black) writes:
> > > ...
> > > > >> Oct 10 06:27:18 defiant dhcpd: sendpkt: Connection refused
> > > > >
> > > > >Because Linux stupidly reports ICMP errors it receives for a given UDP
> > > > >port the next time you try to send on a socket bound to that port,
> > > > >even though the error has nothing to do with the packet you're trying
> > > > >to send. The error is essentially informational at this point - when
> > > > >send_packet gets it while transmitting a packet, it just retransmits
> > > > >the packet. Utterly bogus, but harmless to you.
> > >
> > > He should either fix his program to do correct error handling
> > > or set the SO_BSDCOMPAT option on the socket.
> >
> > What should the "correct error handling" look like?
> >
> > I got the same problem when building a daemon which talks to
> > a large number of clients. The problem: you don't know whether
> > the error return of the send is for the current packet or for
> > one of the thousand packets before. If you try to resend the
> > current packet, another ICMP error may have arrived in the
> > meantime and you just get another -1.
>
> One possibility is to check and clear the pending error first with
> getsockopt(sk, SOL_SOCKET, SO_ERROR, &err, &len), where len holds
> sizeof err.

And between the getsockopt and the send another ICMP arrives...
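
For reference, a minimal sketch of the check-and-clear step suggested
above, assuming a Linux UDP socket. The helper names
take_pending_error and send_after_clear are made up for illustration
and are not part of any API. The window between the two calls is
still there, so this only narrows the race; it does not close it.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Fetch and clear the pending asynchronous error on a socket.
 * Returns the pending errno value (0 if none), or -1 if the
 * getsockopt call itself fails. (Hypothetical helper.) */
int take_pending_error(int sk)
{
    int err = 0;
    socklen_t len = sizeof err;   /* getsockopt wants a pointer to this */

    if (getsockopt(sk, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    return err;
}

/* Clear any stale error, then send. An ICMP error can still arrive
 * between the two calls, so a -1 from sendto remains ambiguous.
 * (Hypothetical helper.) */
ssize_t send_after_clear(int sk, const void *buf, size_t n,
                         const struct sockaddr *to, socklen_t tolen)
{
    (void)take_pending_error(sk);
    return sendto(sk, buf, n, 0, to, tolen);
}
```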

> Also you get different error codes for local errors and errors
> generated by the network, e.g. ENETDOWN, EINVAL and EWOULDBLOCK
> are definitely local errors, EPROTO and EHOSTUNREACH network errors.

Sure, there are some error codes that are definitely generated by the
current send, but there are others (EHOSTUNREACH) that may be generated
by any packet (including the current one), and then you don't know what
to do (resend or ignore).
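
To make the distinction concrete, here is a rough sketch of how a
sender might sort errno values along the lines discussed above. The
enum and the function name classify_send_errno are invented for this
example, and putting ECONNREFUSED in the ambiguous group is my
assumption based on the dhcpd log line. The point is that the second
group gives no hint whether to resend.

```c
#include <errno.h>

/* Hypothetical classification of errno values returned by a UDP send. */
enum send_err_kind {
    SEND_ERR_LOCAL,      /* caused by the current send call */
    SEND_ERR_AMBIGUOUS,  /* network error, possibly for an older packet */
    SEND_ERR_OTHER       /* not covered by this sketch */
};

enum send_err_kind classify_send_errno(int e)
{
    switch (e) {
    case ENETDOWN:
    case EINVAL:
    case EWOULDBLOCK:
        return SEND_ERR_LOCAL;
    case EPROTO:
    case EHOSTUNREACH:
    case ECONNREFUSED:   /* assumption: treated like the other ICMP errors */
        return SEND_ERR_AMBIGUOUS;
    default:
        return SEND_ERR_OTHER;
    }
}
```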

> > IMO, this interface is broken. I came to the conclusion that
> > SO_BSDCOMPAT is the Right Thing and should be the default!
> > (I don't know if this would break some 'standard'...)
> It breaks RFC1122. The main problem is that the BSD API has no way
> to deliver the complete error information, but that is not an excuse
> for hiding the error.

AFAIK, RFC1122 requires that the error is passed to the app-layer and
that the app should be able to disable this mechanism.
It doesn't say how the error should be reported (in socket-world) or
at what time.

And that's what I'm complaining about: it's done the wrong way.
The error report you get is totally useless (you don't get enough
information to perform correct error handling), makes sending
packets guesswork, and, lastly, breaks a lot of applications.

> The vger kernel has a complete solution that will hopefully appear
> in the official 2.2 kernel. The user process can enable a per-socket
> error queue that buffers incoming errors.
>
> [sample code snipped]

These error queues would allow decent error handling and I _hope_ this
will get in.

But I still think it's a bad idea to block outgoing packets (at least
if you work with unconnected sockets). The recvmsg(MSG_ERRQUEUE)
should be enough (especially if select()ing for an exception is
possible).

IMO, SO_BSDCOMPAT should be the default. The arrival of (possibly
unrelated) error messages shouldn't disable normal operation.
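
For anyone who wants the BSD behaviour today, a small sketch of
turning the option on per socket. The function name enable_bsdcompat
is made up; the setsockopt call is the usual one. The #ifdef is there
because I am assuming the constant may be missing on non-Linux
systems, and the option is reportedly ignored by later kernels.

```c
#include <sys/socket.h>

/* Ask the kernel not to report ICMP errors on this UDP socket.
 * Returns 0 on success, -1 on failure. (Hypothetical helper.) */
int enable_bsdcompat(int sk)
{
#ifdef SO_BSDCOMPAT
    int on = 1;
    return setsockopt(sk, SOL_SOCKET, SO_BSDCOMPAT, &on, sizeof on);
#else
    (void)sk;   /* option not available here; nothing to do */
    return 0;
#endif
}
```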

Setting IP_RECVERR and select()ing for recvmsg(MSG_ERRQUEUE) would
be a much cleaner solution, wouldn't break older apps without
error handling, and could still conform to RFC1122.

Ciao, ET.

PS: Don't get me wrong. I would love to see a method to get reliable
error messages (the error queues look right) but why should old
APIs be broken by that?
