Re: Damn. It's not fixed.

Rick Franchuk (rickf@transpect.net)
Sun, 4 Apr 1999 04:39:14 -0700 (PDT)


On Sat, 3 Apr 1999, Simon Kirby wrote:

> On Sat, 3 Apr 1999, Rick Franchuk wrote:
>
> > Well, it's crashing less, but it's still crashing. Here's the latest panic:
>
> What kernel version is this? Any patches?
>
> > Any thoughts? (Aside from the obvious of 'replace the damn thing' ;)
>
> It's a NULL pointer, so it may not just be dud hardware. Is it running
> SMP? What sort of load does it have? What is running on the machine?
> Does it happen to occur at regular times?

Look back to the last message mang... same machine, same set of
circumstances. To recap though:

PII-450, 256Mb RAM

Kernel 2.2.5 (non AC), non-MTRR and non-SMP

It's a webserver, so load averages vary (you know this score). Typically
running between 0.5-1.5... this one has some moderate CGI usage.

MTBF: Variable. Like I said before, it *seems* to be localized around periods
of heavier disk usage, but uptimes can range between a few hours and several
days, just to make life interesting.

After scanning through recent postings I noticed some people complaining
about oddball Null Pointers after upgrading to 2.2.4 and beyond, so I thought
I'd take a chance and retrograde to 2.2.3. No luck. Stayed up for about 24
hours and died again... check it out:

Unable to handle kernel NULL pointer dereference at virtual address 00000000
current->tss.cr3 = 05855000, pr3 = 05855000
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c011eb2d>]
EFLAGS: 00010002
eax: c04a5740 ebx: cfdedfe0 ecx: 00000000 edx: 00000008
esi: c04a5740 edi: 00000282 ebp: 00000008 esp: c82e9dc8
ds: 0018 es: 0018 ss: 0018
Process httpd (pid: 20414, process nr: 78, stackpage=c82e9000)
Stack: c4ac4000 00000000 c0124ce5 c04a5740 00000008 00000000 c4ac4000 c0124d66
00000000 00000000 00000400 c4ac4000 000d0804 c82e8000 00000000 c01254c5
c4ac4000 00000400 00000000 00000000 00000400 00000001 c0124606 00000400
Call Trace: [<c0124ce5>] [<c0124d66>] [<c01254c5>] [<c0124606>] [<c012488e>] [<c0138fc1>] [<c0139621>]
[<c01398c9>] [<c0137c9e>] [<c0137ab4>] [<c019ce03>] [<c010982d>] [<c01095eb>] [<c0109607>] [<c0123080>]
[<c0108790>] [<c010002b>]
Code: 8b 01 89 03 85 c0 74 2b 8b 43 04 85 c0 75 10 89 19 89 c8 2b

>>EIP: c011eb2d <rw_swap_page_base+59/358>
Trace: c0124ce5 <remove_vfsmnt+d/84>
Trace: c0124d66 <register_filesystem+a/60>
Trace: c01254c5 <sys_ustat+a9/ac>
Trace: c0124606 <show_buffers+66/ec>
Trace: c012488e <sys_bdflush+2/84>
Trace: c0138fc1 <ext2_notify_change+61/154>
Trace: c0139621 <ext2_add_entry+45/31c>
Trace: c01398c9 <ext2_add_entry+2ed/31c>
Trace: c0108790 <enable_8259A_irq+4/2c>
Code: c011eb2d <rw_swap_page_base+59/358> 00000000 <_EIP>: <===
Code: c011eb2d <rw_swap_page_base+59/358> 0: 8b 01 movl (%ecx),%eax <===
Code: c011eb2f <rw_swap_page_base+5b/358> 2: 89 03 movl %eax,(%ebx)
Code: c011eb31 <rw_swap_page_base+5d/358> 4: 85 c0 testl %eax,%eax
Code: c011eb33 <rw_swap_page_base+5f/358> 6: 74 2b je c011eb60 <rw_swap_page_base+8c/358>
Code: c011eb35 <rw_swap_page_base+61/358> 8: 8b 43 04 movl 0x4(%ebx),%eax
Code: c011eb38 <rw_swap_page_base+64/358> b: 85 c0 testl %eax,%eax
Code: c011eb3a <rw_swap_page_base+66/358> d: 75 10 jne c011eb4c <rw_swap_page_base+78/358>
Code: c011eb3c <rw_swap_page_base+68/358> f: 89 19 movl %ebx,(%ecx)
Code: c011eb3e <rw_swap_page_base+6a/358> 11: 89 c8 movl %ecx,%eax
Code: c011eb40 <rw_swap_page_base+6c/358> 13: 2b 00 subl (%eax),%eax

The first thing I noticed is that the trace originates at enable_8259A_irq
again, same as the last one (although it dies in different areas). The 8259
is the PIC for the SCSI card... once again, it's coming down to disk
subsystem strangeness.

--
  __________________________________________
 |                                          |
 |  Rick Franchuk  -  TranSpecT Consulting  |
 |_______                            _______|
         \mailto:rickf@transpect.net/
          \_____ICQ_#_4435025______/

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/