Re: strace, accept(), ERESTARTSYS and EINTR

From: Phil Endecott
Date: Fri Jan 04 2008 - 18:52:43 EST


Hi Jiri,

Jiri Slaby wrote:
On 01/04/2008 10:01 PM, Phil Endecott wrote:
Dear Experts,

I have some code like this:

struct sockaddr_in client_addr;
socklen_t client_size=sizeof(client_addr);
int connfd = accept(fd,(struct sockaddr*)(&client_addr),&client_size);
if (connfd==-1) {
    // [1]
    // ...report error and terminate...
}
int rc = fcntl(connfd,F_SETFD,FD_CLOEXEC);

Please show the socket() call, so we can see what protocol and type you have there.

It's an IPv4 TCP socket:

// error handling & other noise removed:
int fd = socket(PF_INET,SOCK_STREAM,0);
struct sockaddr_in server_addr;
memset(&server_addr,0,sizeof(server_addr));
server_addr.sin_family=AF_INET;
server_addr.sin_addr.s_addr=htonl(INADDR_ANY);
server_addr.sin_port=htons(port);
bind(fd,(struct sockaddr*)&server_addr,sizeof(server_addr));
listen(fd,128);


I believe that I should be checking for errno==EINTR at [1] and retrying
the accept(); currently I'm not doing so.
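
A minimal sketch of the retry described above, assuming the only condition worth retrying is EINTR. The helper name accept_retry is hypothetical and is not part of the original code. It tests for a negative return rather than exactly -1, so a stray negative value is never mistaken for a valid descriptor:

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: call accept() again whenever a signal interrupts it. */
static int accept_retry(int listen_fd, struct sockaddr *addr, socklen_t *len)
{
    socklen_t original_len = len ? *len : 0;

    for (;;) {
        if (len)
            *len = original_len;  /* reset the value-result length before each attempt */

        int connfd = accept(listen_fd, addr, len);
        if (connfd >= 0)
            return connfd;        /* a valid descriptor is never negative */
        if (errno != EINTR)
            return -1;            /* genuine failure; errno describes it */
        /* EINTR: a signal handler ran while we were blocked, so try again. */
    }
}
```

In the snippet above it would replace the direct accept() call, e.g. accept_retry(fd,(struct sockaddr*)(&client_addr),&client_size).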

When I strace -f this application - which is multi-threaded - I see this:

[pid 11079] accept(3, <unfinished ...>
[pid 11093] restart_syscall(<... resuming interrupted call ...> <unfinished ...>
[pid 8799] --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
[pid 11079] <... accept resumed> 0xbfdaa73c, [16]) = ? ERESTARTSYS (To be restarted)
[pid 8799] read(6, <unfinished ...>
[pid 11079] fcntl64(-512, F_SETFD, FD_CLOEXEC) = -1 EBADF (Bad file descriptor)

This shows accept() "returning" ERESTARTSYS; as I understand it this is
an artefact of how strace works, and my code will not have seen accept
return at all at that point. However, the strace output does not show
any other return from the call to accept() before reporting that
thread's call to fcntl(). And the first parameter to fcntl, -512, is
the return value from accept(), which should be -1 or >=0. What is going
on here?

Google found a couple of related reports:

http://lkml.org/lkml/2001/11/22/65 - Phil Howard reports getting
ERESTARTSYS returned from accept(), not only in the strace output, and
fixed his problem by treating it like EINTR. He looked at errno if
accept() returned <0, not ==-1.

http://lkml.org/lkml/2005/9/20/135 - Peter Duellings reports seeing
accept() return -512 with errno==0.

ERESTARTSYS might be returned from system calls only when a signal is pending.
The signal-handling code will change ERESTARTSYS to a proper userspace error, i.e.
ERESTARTSYS (512) must not leak to userspace.

Some fail paths return ERESTARTSYS even if no signal is pending, and that used
to be the point.

There are two odd things happening:

1. ERESTARTSYS is escaping to user-space, rather than EINTR or restarting the accept.
2. It gets out of libc into my code in the form ret=-512, not (ret=-1, errno=512).

Very odd; a user-space mess (e.g. stack corruption) shouldn't be able to change the kernel behaviour, and a kernel problem shouldn't cause the odd libc behaviour. There must be another explanation....
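
For reference, point 2 relies on the usual Linux convention: the raw system call returns a negative errno value, and the C library wrapper turns any result in the range -4095..-1 into a return of -1 with errno set. A simplified model of that step (wrap_syscall_result is a hypothetical name, not a real libc function):

```c
#include <errno.h>

/* Hypothetical model of a libc syscall wrapper's error convention on Linux:
   raw kernel results in [-4095, -1] mean "-errno"; anything else is a
   successful result and is passed through unchanged. */
static long wrap_syscall_result(long raw)
{
    if (raw < 0 && raw >= -4095) {
        errno = (int)-raw;  /* e.g. -512 becomes errno = 512 */
        return -1;
    }
    return raw;
}
```

Under this model a raw -512 would surface as ret=-1 with errno=512, so a bare ret=-512 reaching the caller means this conversion did not take place.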


Phil.