RE: UDP recvmsg blocks after select(), 2.6 bug?

From: David Schwartz
Date: Fri Oct 15 2004 - 18:38:00 EST



> > As I asked in a previous mail in this overly long thread, why
> > not returning
> > zero bytes at all. It is perfectly valid to receive an UDP packet with 0

> Zero bytes is "end of file". Don't go trying to co-opt end of
> file. That way lies
> madness and despair.

Not for UDP. Zero bytes means that zero bytes of data were received, a
perfectly legitimate (though seldom useful) number.

> You would *then* need a flag on each file descriptor to determine
> if the most
> previous call before the read op was a select that returned the
> file as readable so
> you knew whether to block or return the not-really-end-of-file.
> Your *app* would
> then also need a flag/context to determine whether the end of
> file just read was
> contextually an aborted read after select.

You mean whether it was a zero-byte datagram or some sort of error.

> [On the larger issues, I am surprised that select() doesn't
> guarantee available data
> and one subsequent non-blocking read, but again in the case of a
> UDP discard after
> the select but before the read, that is the only thing that makes
> sense. I would
> vote (were this a democracy 8-) to put a CAVEAT in the manual
> that listed the _rare_
> cases, as examples, where the warrant of available data may prove
> false; give a nod
> to real life, and _firmly suggest_ that if you are using select,
> you *probably* want
> nonblocking file descriptors too.]

I think it's a really bad idea to make 'select' more complicated by trying
to nail down precise semantics for every possible protocol. The 'select'
function is supposed to be protocol-neutral and trying to say it guarantees
X on protocol Y, where such guarantees constrict what the kernel can do and
do not make user code anything but more fragile, doesn't seem to be a good
idea to me.

For UDP specifically, datagrams are fundamentally discardable whenever that
seems to be a good idea. In general, there are any number of corner cases
for various combinations of protocols and situations where a 'select' hit
will not be followed by an operation that doesn't block.

It just happens to be that 'select' works best when it's a hint that
something has changed and the operation can/should be re-tried. It works
very badly when the results of a 'select' are supposed to change something
because you're supposed to be able to 'select' (and then not perform the
operation) without affecting things. This is level semantics, not edge.

The CAVEAT is that 'select', like every other status information function
provided by the kernel, does not guarantee anything about the future. Just
like 'stat' does not guarantee that the file size will still be the same
later when you call 'read'.

DS


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/