Re: sendmail/linux-2.2.0+mingo hangs

John Kennedy (jk@csuchico.edu)
Sat, 30 Jan 1999 08:56:37 -0800 (PST)


[John Alvord]
> There is a 2.2.1 patch with some critical fixes in it, one of which
> cured a hung in state "D" bug. john alvord

No, they're not stuck in a `D' state. It looks like they're hung
in poll(), which seems reasonable:

[gdb sendmail 2569]
GNU gdb 4.17.0.5 with Linux/x86 hardware watchpoint and FPU support
...
/usr/sbin/2569: No such file or directory.
Attaching to program `/usr/sbin/sendmail', process 2569
Reading symbols from /lib/libresolv.so.2...done.
Reading symbols from /lib/libc.so.6...done.
Reading symbols from /lib/ld-linux.so.2...done.
Reading symbols from /lib/libnss_compat.so.2...done.
Reading symbols from /lib/libnsl.so.1...done.
Reading symbols from /lib/libnss_files.so.2...done.
Reading symbols from /lib/libnss_dns.so.2...done.
Reading symbols from /lib/libnss_nis.so.2...done.
0x400cdd60 in __poll ()

(gdb) where
#0 0x400cdd60 in __poll ()
#1 0x400216a3 in __res_send ()
#2 0x40020730 in res_query ()
#3 0x40020b41 in res_querydomain ()
#4 0x805905e in dns_getcanonname ()
#5 0x8062de3 in getcanonname ()
#6 0x8052a48 in host_map_lookup ()
#7 0x806a52e in map_lookup ()
#8 0x806a080 in rewrite ()
#9 0x806a3a6 in callsubr ()
#10 0x806a1e1 in rewrite ()
#11 0x8068d23 in parseaddr ()
#12 0x805a488 in setsender ()
#13 0x80789ec in smtp ()
#14 0x8060bd7 in main ()
#15 0x4003e8af in __libc_start_main ()

As I recall, that is both a glibc and kernel issue (since glibc has to
make it visible to the end-app), which goes a long way towards letting
me figure out exactly where the problem crept in. Backing down from
linux-2.2.0+mingo to -2.1.127 didn't make the problem go away, but since
my previous version of glibc was pre-poll(), I'm probably not using the
same kernel API I used to.

If that is a DNS-only poll() problem, I could see why this might be
creeping out now. I'm on the GNOME mailing list and I've got anti-spam
stuff turned on in sendmail. Right now, with redhat off the net, it
can't be verified via DNS and that is causing a large backlog to build
up and hammer me.

My other test, DHCPD, is stuck in a different area. It is apparently
getting hung up trying to syslog data. According to gdb, syslogd is
waiting in select() but that is where it would be anyway.

(gdb) where
#0 0x400c6672 in __libc_send ()
#1 0x400c3131 in vsyslog ()
#2 0x400c2e51 in syslog ()
#3 0x8059309 in note (fmt=0x805ab60 "BOOTREQUEST from %s via %s")
at errwarn.c:133
#4 0x804d2f0 in bootp (packet=0xbfffe188) at bootp.c:70
#5 0x80577c1 in do_packet (interface=0x8189bd0, packet=0xbfffe9fc, len=300,
from_port=17408, from={len = 4,
iabuf = "\204ñ¯\001", '\000' <repeats 11 times>}, hfrom=0xbffffa14)
at options.c:607
#6 0x805273f in got_one (l=0x8189c28) at dispatch.c:596
#7 0x8052626 in dispatch () at dispatch.c:560
#8 0x8049a86 in main (argc=1, argv=0xbffffc84, envp=0xbffffc8c) at dhcpd.c:278
#9 0x400308af in __libc_start_main ()

--- john

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/