Re: [bug] af_unix: Reading from a stream socket may lock the concurrent poll() call

From: Alexey Moiseytsev
Date: Mon Nov 21 2011 - 18:34:56 EST


On 21.11.2011 18:38, Eric Dumazet wrote:
On Monday, 21 November 2011 at 00:19 +0400, Alexey Moiseytsev wrote:
Hello,

The following program shows how the poll() call hangs on a non-empty
stream socket.

#include <sys/types.h>
#include <sys/socket.h>
#include <pthread.h>
#include <sched.h>   /* sched_yield() */
#include <stdio.h>
#include <unistd.h>
#include <poll.h>

int sockets[2];

int poll_socket(void) {
    struct pollfd pfd = {
        .fd = sockets[1],
        .events = POLLIN
    };
    return poll(&pfd, 1, -1);
}


/* observer routine doesn't modify amount of data available in the
   socket buffer */
void* observer(void* arg) {
    char buffer;
    for (int j = 0; j < 2000; j++) {
        recv(sockets[1], &buffer, sizeof(buffer), MSG_PEEK);
        sched_yield();
    }
    return NULL;
}

int main(void) {
    if (socketpair(PF_UNIX, SOCK_STREAM, 0, sockets) == -1)
        return 1;
    int rc, data[250] = {0};
    if ((rc = send(sockets[0], &data, sizeof(data), MSG_DONTWAIT)) <= 0)
        return 2;
    poll_socket();
    /* If the first poll_socket() call didn't hang then the following
       message will be printed */
    fprintf(stderr, "%d bytes available in input buffer\n", rc);
    pthread_t observer_thread;
    pthread_create(&observer_thread, NULL, observer, NULL);
    for (int j = 0; j < 20000; j++) {
        /* If the first poll_socket() call didn't hang then all the following
           calls should do the same */
        poll_socket();
    }
    fprintf(stderr, "Well done\n");
    pthread_join(observer_thread, NULL);
    close(sockets[0]);
    close(sockets[1]);
    return 0;
}


Expected output: two lines or nothing (in case of error).
Observed output: only the first line (and the process never exits).

So the first poll() reported that there is some data available in the
socket, and one of the later poll() calls reported that there is none.
The second answer is wrong, because the observer thread only peeks
(MSG_PEEK) and never actually consumes any data from the socket.
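
For reference, the single-threaded expectation can be checked with the
small sketch below. It is only an illustration, not part of the original
reproducer: the helper name readable_after_peek() is made up here, and it
uses a poll() timeout of 0 so it can never hang. With no concurrent
reader, poll() should still report POLLIN after a MSG_PEEK read.

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <poll.h>
#include <unistd.h>

/* Hypothetical helper (not from the original report): returns 1 if poll()
 * still reports POLLIN after a MSG_PEEK read, 0 if it does not, and -1 on
 * a setup error. */
int readable_after_peek(void)
{
    int sv[2];
    int data[250] = {0};
    char byte;
    int result = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    if (send(sv[0], data, sizeof(data), MSG_DONTWAIT) > 0 &&
        recv(sv[1], &byte, sizeof(byte), MSG_PEEK) == 1) {
        struct pollfd pfd = { .fd = sv[1], .events = POLLIN };
        /* Timeout 0: report the current state without blocking. */
        int n = poll(&pfd, 1, 0);
        if (n >= 0)
            result = (n == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
    }

    close(sv[0]);
    close(sv[1]);
    return result;
}
```

Calling readable_after_peek() from a main() should yield 1; the hang in
the reproducer only shows up when the peek and the poll() run
concurrently.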

I assume that this bug can be eliminated by adding an
sk->sk_data_ready(...) call right after each call to
skb_queue_head(...) in the unix_stream_recvmsg(...) routine
(net/unix/af_unix.c).
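
The interleaving that this fix is presumably aimed at can be modelled in
user space. The sketch below is a toy model and not kernel code: every
name in it (model_sock, model_poll, poller_stuck, ...) is invented for
illustration, and it assumes the peeking reader takes the buffer off the
queue and puts it back afterwards, with poll() looking at the queue in
between. The notify flag stands in for the proposed sk->sk_data_ready(...)
call.

```c
#include <stdbool.h>

/* Toy user-space model, NOT kernel code; all names are hypothetical. */
struct model_sock {
    int  queued;         /* buffers currently on the receive queue */
    bool poller_asleep;  /* a poll() caller is waiting for a wakeup */
};

/* poll(): return at once if data is queued, otherwise go to sleep. */
static void model_poll(struct model_sock *s)
{
    if (s->queued == 0)
        s->poller_asleep = true;
}

/* First half of a MSG_PEEK read: take the buffer off the queue. */
static void model_peek_dequeue(struct model_sock *s)
{
    s->queued--;
}

/* Second half: put the buffer back at the head of the queue.
 * notify models the proposed sk->sk_data_ready(...) call. */
static void model_peek_requeue(struct model_sock *s, bool notify)
{
    s->queued++;
    if (notify)
        s->poller_asleep = false;
}

/* Returns true if the poller is left asleep although data is queued. */
bool poller_stuck(bool notify)
{
    struct model_sock s = { .queued = 1, .poller_asleep = false };

    model_peek_dequeue(&s);          /* reader takes the only buffer   */
    model_poll(&s);                  /* poller sees an empty queue     */
    model_peek_requeue(&s, notify);  /* reader puts the buffer back    */
    return s.poller_asleep && s.queued > 0;
}
```

In this model poller_stuck(false) is true (the lost wakeup) and
poller_stuck(true) is false, which matches the idea behind the suggested
change.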

Other info:
$ uname -srmo
Linux 2.6.32-5-amd64 x86_64 GNU/Linux


Hi Alexey

I believe you found a bug and your suggested fix should be just fine.

(Another option would be to test in unix_poll() whether at least one
thread is currently handling an skb taken from sk->sk_receive_queue.)

Could you submit an official patch on top of the current Linus tree, or
would you prefer that we take care of this?


Hi,

I will try to send a patch. If I do something wrong, feel free to submit it yourself.