malloc() problems in 2.2.5-15smp

From: Rick Stevens (rstevens@publichost.com)
Date: Wed Mar 29 2000 - 15:05:48 EST


I know it's kinda late to ask something about such an old kernel
BUT...

We're running 2.2.5-15smp kernels on dual PIII-450s. I have an
application that is behaving very strangely. When running the app
multithreaded, I get random SIGSEGV core dumps on sigwait() calls.
If I run it as a single thread (there's code in there that lets me do
either), I get random SIGSEGV core dumps in the middle of a malloc()
call. Here's gdb's analysis of one such crash:

        [root@stats-01-001 bin]# gdb weblogd core
        GNU gdb 4.17.0.11 with Linux support
        Copyright 1998 Free Software Foundation, Inc.
        GDB is free software, covered by the GNU General Public License, and you are
        welcome to change it and/or distribute copies of it under certain conditions.
        Type "show copying" to see the conditions.
        There is absolutely no warranty for GDB. Type "show warranty" for details.
        This GDB was configured as "i386-redhat-linux"...
        Core was generated by `./weblogd -z'.
        Program terminated with signal 11, Segmentation fault.
        Reading symbols from /lib/libpthread.so.0...done.
        Reading symbols from /lib/libc.so.6...done.
        Reading symbols from /lib/ld-linux.so.2...done.
        #0 0x4008144a in chunk_alloc (ar_ptr=0x40111580, nb=376) at malloc.c:2857
        malloc.c:2857: No such file or directory.
        (gdb) back
        #0 0x4008144a in chunk_alloc (ar_ptr=0x40111580, nb=376) at malloc.c:2857
        #1 0x40080b8a in __libc_malloc (bytes=368) at malloc.c:2616
        #2 0x804a493 in update_clients (msg=0x804d680) at procthread.c:360
        #3 0x804a2e9 in scan_main_ring (arg=0x0) at procthread.c:128
        #4 0x8049d49 in main_loop () at weblogd.c:704
        #5 0x804914a in main (argc=2, argv=0xbffffd84) at weblogd.c:217
        #6 0x40040cb3 in __libc_start_main (main=0x8048eb0 <main>, argc=2,
            argv=0xbffffd84, init=0x8048aa0 <_init>, fini=0x804aa2c <_fini>,
            rtld_fini=0x4000a350 <_dl_fini>, stack_end=0xbffffd7c)
            at ../sysdeps/generic/libc-start.c:78
        (gdb)
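
A crash inside chunk_alloc() can be a symptom of heap bookkeeping that
was overwritten earlier by the application itself, although the
backtrace above does not prove that. One way to test for it is to put
guard bytes after each buffer and check them later. What follows is a
minimal sketch under that assumption: guarded_alloc(), guarded_check()
and guarded_free() are hypothetical helpers, not part of weblogd or
glibc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical debugging helpers; not part of weblogd or glibc. */
#define GUARD_BYTES 16
#define GUARD_FILL  0xA5

/* Header placed in front of the user buffer so the size is known later. */
typedef union {
    size_t size;        /* size the caller asked for */
    max_align_t align;  /* keeps the user pointer suitably aligned */
} guard_header;

/* Allocate size bytes followed by GUARD_BYTES of a known fill pattern. */
void *guarded_alloc(size_t size)
{
    guard_header *hdr;
    unsigned char *user;

    if (size > SIZE_MAX - sizeof(guard_header) - GUARD_BYTES)
        return NULL;
    hdr = malloc(sizeof(guard_header) + size + GUARD_BYTES);
    if (hdr == NULL)
        return NULL;
    hdr->size = size;
    user = (unsigned char *)(hdr + 1);
    memset(user + size, GUARD_FILL, GUARD_BYTES);
    return user;
}

/* Return 1 if the guard bytes are intact, 0 if something wrote past the end. */
int guarded_check(const void *ptr)
{
    const guard_header *hdr = (const guard_header *)ptr - 1;
    const unsigned char *guard = (const unsigned char *)ptr + hdr->size;
    size_t i;

    for (i = 0; i < GUARD_BYTES; i++)
        if (guard[i] != GUARD_FILL)
            return 0;
    return 1;
}

/* Check the guard, then release the block. The assert fires on a damaged
 * guard unless the program was built with NDEBUG defined. */
void guarded_free(void *ptr)
{
    if (ptr == NULL)
        return;
    assert(guarded_check(ptr) && "write past end of guarded buffer");
    free((guard_header *)ptr - 1);
}
```

Swapping the 368-byte allocation in update_clients() for guarded_alloc()
and calling guarded_check() on the live buffers before each new
allocation would show whether an overrun happens before malloc() is
entered. It only catches writes past the end of a buffer, not
double frees or writes before the start.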

Here's the system config:

        Linux version 2.2.5-15smp (root@porky.devel.redhat.com)
            (gcc version egcs-2.91.66 19990314/Linux (egcs-1.1.2 release))
            #1 SMP Mon Apr 19 22:43:28 EDT 1999
        Intel MultiProcessor Specification v1.1
            Virtual Wire compatibility mode.
        OEM ID: OEM00000 Product ID: PROD00000000 APIC at: 0xFEE00000
        Processor #0 Pentium(tm) Pro APIC version 17
        Processor #1 Pentium(tm) Pro APIC version 17
        I/O APIC #2 Version 17 at 0xFEC00000.
        Processors: 2
        mapped APIC to ffffe000 (fee00000)
        mapped IOAPIC to ffffd000 (fec00000)
        Detected 448812216 Hz processor.
        Console: colour VGA+ 80x25
        Calibrating delay loop... 447.28 BogoMIPS
        Memory: 517224k/524224k available (1044k kernel code, 416k reserved, 5472k data, 68k init)

Has anyone else seen this behaviour? Is it a known glitch? Am I
crazy? Any help would be GREATLY appreciated. Please CC: my email
address with any responses as I sometimes don't get a chance to scan
the digests.

----------------------------------------------------------------------
- Rick Stevens, CTO, PublicHost, Inc. rstevens@publichost.com -
- 949-743-2010 (Voice) http://www.publichost.com -
- -
- Try to look unimportant. The bad guys may be low on ammo. -
----------------------------------------------------------------------
