Re: linux syslog problems (now linux-2.2.1+glibc-2.0.111)

Zack Weinberg (zack@rabi.columbia.edu)
Mon, 01 Feb 1999 22:36:43 -0500


On Mon, 1 Feb 1999 17:55:07 -0800, John Kennedy wrote:
>[Zack Weinberg]
>> Try sending a stuck process SIGPIPE, syslog() has code to catch that
>> signal and reinitialize the connection.
>
> Ok, will try that. Hmm...
>
> Looking at the glibc code, I can see that code is present. Looking
>at strace output I could see it get a SIGPIPE, it did a close(), and
>then my strace segfaulted. Feh.

This might be a bug in strace, or it might be the kernel returning
something totally unexpected. Probably the former. Ulrich has an
updated strace somewhere; check ftp.cygnus.com.

> Reattaching strace doesn't get me any more output nor does it unstick.

Strange - you should've seen something like this:

write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = ? ERESTARTSYS (To be restarted)
--- SIGPIPE (Broken pipe) ---
close(4) = 0
sigreturn() = ? (mask now [])
write(4, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 8192) = -1 EBADF (Bad file descriptor)

followed by a write to the console, and syslog() returning.

>However, if I go into gdb, look at the process (load it up, do a `where'
>and bail out), and rerun strace that seems to do the trick and I get
>syslog output (to the daemon) again and strace is showing activity.
>Probably just partially wedged and stuck but gdb manages to interrupt
>something passively enough that it doesn't crash (EINTR without signal?).

Yes, the debugging hook can do things to processes that ordinary kill(2)
can't.

> dhcpd will crash it if you send it a SIGPIPE and it isn't in syslog(),
>so I wonder if it is wedged in the kernel or ignoring other signals
>while it is wedged elsewhere in glibc.

SIGPIPE kills a process by default. syslog() overrides that, but only
while it's active.

> I should have a dhcpd that I can link against a glibc compiled with
>`-g' (statically) and run gdb on that. Maybe that will tell me where it
>is getting wedged after catching the SIGPIPE.
>
>> Try tracing syslogd and see if it's receiving anything. (The mark
>> messages are done by a timer, if the receive pipe is hosed in-kernel
>> those will still happen.)
>
> I can't say. It doesn't *look* like it is receiving anything on the
>UNIX socket that dhcpd is using, but it is certainly receiving stuff on
>other UNIX sockets (logger messages as well as sendmail output) while
>dhcpd is wedged.

Then the bug is unlikely to be in syslogd, unless the pipe has gotten
lost from syslogd's select() fdset. You ought to be able to check
that with lsof and strace.
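
For what "lost from the fdset" would look like, here is a small sketch
under the assumption that the daemon rebuilds its read set from a list
of descriptors before each select() call. It is not taken from
syslogd's source; poll_readable and its parameters are hypothetical. A
descriptor that is missing from the list never gets reported as
readable, no matter how much data is queued on it.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/*
 * Hypothetical helper: build a read set from fds[0..nfds-1] and poll
 * it with a zero timeout. On return, *ready holds the descriptors
 * that are readable; the return value is select(2)'s.
 */
int poll_readable(const int *fds, size_t nfds, fd_set *ready)
{
    struct timeval zero = { 0, 0 };  /* do not block */
    int maxfd = -1;
    size_t i;

    assert(ready != NULL);

    FD_ZERO(ready);
    for (i = 0; i < nfds; i++) {
        /* Skip descriptors that cannot be stored in an fd_set. */
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE)
            continue;
        FD_SET(fds[i], ready);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    /* Only descriptors placed in the set above are examined. */
    return select(maxfd + 1, ready, NULL, NULL, &zero);
}
```

If strace on syslogd shows select() returning without ever reporting
the descriptor that lsof associates with dhcpd's socket, that would
point at this kind of problem.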

zw
