pre2.0.6 oopses. plus details of course.

Chris Evans (chris@ferret.lmh.ox.ac.uk)
Tue, 21 May 1996 14:18:55 +0100 (BST)


Rats. Just noted a few oopses in my logs when I typed a random dmesg to
check for problems.

I ramble on a bit more after the oopses.

Note before I detail them that I have made the following kernel
modifications:

1) Dropped in Becker's v0.25 version of 3c59x.c driver.
2) Enabled tagged queueing on my aic7xxx driver, with CMDS_PER_LUN 8 and
RESET_DELAY 2.
3) Reduced max tasks per user (inconsequential of course)
4) Patched the process accounting to log extra stuff like swaps caused,
size in memory, i/o usage.
5) Patched /proc so /proc/<pid> has changeable permissions (i.e. processes
can be hidden)

general protection: 0228
CPU: 0
EIP: 0010:[<0013888b>]
EFLAGS: 00010206
eax: 00000800 ebx: 000005b4 ecx: 01e2ee18 edx: 00448000
esi: 00444858 edi: 00448000 ebp: 013f0f70 esp: 013f0ebc
ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
Process netscape (pid: 7988, process nr: 32, stackpage=013f0000)
Stack: 0000022b 00000018 00444e20 000005b4 013f0f78 01e2edfc 000005b4 00142f14
013f0f70 00444858 000005b4 01e2edfc 013f0f7c 00000000 00000800 00000000
01e2ee18 00000000 00000000 01ebfc0c 013f0f08 bd3fb659 0014c13a 01e2edfc
Call Trace: [<00142f14>] [<0014c13a>] [<001362e7>] [<00120f47>] [<0010a392>]
Code: 07 76 15 89 f9 f7 d9 83 e1 03 29 cb f3 a4 89 d9 c1 e9 02 f3

ksymoops says...

>>EIP: 13888b <memcpy_toiovec+3f/84>
Trace: 142f14 <tcp_recvmsg+2ec/458>
Trace: 14c13a <inet_recvmsg+76/8c>
Trace: 1362e7 <sock_read+ab/c0>
Trace: 120f47 <sys_read+8b/9c>
Trace: 10a392 <system_call+52/80>
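
For anyone reading along without ksymoops to hand, the lookup it performs
here amounts to "find the nearest symbol at or below the address". A minimal
sketch of that idea follows. The start addresses are back-computed from the
ksymoops lines above (address minus offset) and are an assumption, not
values read from my actual System.map; the names SYMBOLS, STARTS and
resolve are made up for illustration.

```python
import bisect

# Start addresses back-computed from the ksymoops output above
# (address minus offset). Assumed, not read from a real System.map.
SYMBOLS = sorted([
    (0x10a340, "system_call"),
    (0x120ebc, "sys_read"),
    (0x13623c, "sock_read"),
    (0x13884c, "memcpy_toiovec"),
    (0x142c28, "tcp_recvmsg"),
    (0x14c0c4, "inet_recvmsg"),
])
STARTS = [addr for addr, _ in SYMBOLS]


def resolve(addr):
    """Return 'symbol+offset' (offset in hex) for the nearest symbol
    whose start is at or below addr, or None if there is no such symbol."""
    i = bisect.bisect_right(STARTS, addr) - 1
    if i < 0:
        return None
    start, name = SYMBOLS[i]
    return "%s+%x" % (name, addr - start)


if __name__ == "__main__":
    for addr in (0x13888b, 0x142f14, 0x14c13a, 0x1362e7, 0x120f47, 0x10a392):
        print("%x <%s>" % (addr, resolve(addr)))
```

With a table this sparse, an address past the end of a function would still
be attributed to it, so the result is only as good as the symbol list.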

3 seconds later...

general protection: 0000
CPU: 0
EIP: 0010:[<0010ffd6>]
EFLAGS: 00010282
eax: 00824414 ebx: 013f0f08 ecx: 001bd000 edx: 8901e285
esi: 001a1e68 edi: 00824414 ebp: 0109dee0 esp: 0109ded4
ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
Process netscape (pid: 8019, process nr: 31, stackpage=0109d000)
Stack: 008243d0 001a1e68 008243d0 001acbd8 00121eb1 00824414 00796c64 0149494c
00000004 01d58017 0149494c 00150dff 001acbd8 0001867d 00000001 00796c64
0109df70 0109df70 00000004 01f78c18 0001867d 00129fc2 00796c64 01d58017
Call Trace: [<00121eb1>] [<00150dff>] [<00129fc2>] [<0012a1c8>]
[<0012a260>] [<00128503>] [<0010a392>]
Code: 8b 02 83 f8 02 74 07 8b 02 83 f8 01 75 5e 9c 5e fa c7 02 00

ksymoops says...

>>EIP: 10ffd6 <wake_up+22/dc>
Trace: 121eb1 <__iget+135/1e8>
Trace: 150dff <ext2_lookup+11f/138>
Trace: 129fc2 <lookup+da/f4>
Trace: 12a1c8 <_namei+54/bc>
Trace: 12a260 <lnamei+30/48>
Trace: 128503 <sys_readlink+3f/88>
Trace: 10a392 <system_call+52/80>

17 seconds later...

general protection: 0000
CPU: 0
EIP: 0010:[<0010ffd6>]
EFLAGS: 00010282
eax: 00824414 ebx: 013f0f08 ecx: 00edcfac edx: 8901e285
esi: 0000044e edi: 00824414 ebp: 01eddf6c esp: 01eddf60
ds: 0018 es: 0018 fs: 002b gs: 002b ss: 0018
Process update (pid: 9, process nr: 8, stackpage=01edd000)
Stack: 008243d0 0000044e 00000000 bffffe30 00121728 00824414 008243d0 001219dd
008243d0 01f1b414 00000000 00000000 00125920 00000000 00000000 01f1b414
00000000 00000000 01f1b414 00125a69 01f1b414 00000001 0010a392 00000001
Call Trace: [<00121728>] [<001219dd>] [<00125920>] [<00125a69>] [<0010a392>]
Code: 8b 02 83 f8 02 74 07 8b 02 83 f8 01 75 5e 9c 5e fa c7 02 00

ksymoops says....

>>EIP: 10ffd6 <wake_up+22/dc>
Trace: 121728 <write_inode+5c/64>
Trace: 1219dd <sync_inodes+39/54>
Trace: 125920 <sync_old_buffers+14/128>
Trace: 125a69 <sys_bdflush+35/a4>
Trace: 10a392 <system_call+52/80>
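
The two wake_up oopses look very alike, so it may be worth checking which
registers they have in common. A rough sketch of that comparison, with the
register lines copied from the second and third dumps above; the helper
names parse_registers and shared_registers are hypothetical.

```python
import re

# Register lines copied from the second and third oopses above.
OOPS_2 = """eax: 00824414 ebx: 013f0f08 ecx: 001bd000 edx: 8901e285
esi: 001a1e68 edi: 00824414 ebp: 0109dee0 esp: 0109ded4"""
OOPS_3 = """eax: 00824414 ebx: 013f0f08 ecx: 00edcfac edx: 8901e285
esi: 0000044e edi: 00824414 ebp: 01eddf6c esp: 01eddf60"""


def parse_registers(text):
    """Map each 32-bit register name (eax, ebx, ...) to its integer value."""
    pairs = re.findall(r"\b(e[a-z]{2}):\s*([0-9a-fA-F]{8})", text)
    return {name: int(value, 16) for name, value in pairs}


def shared_registers(a, b):
    """Return the registers that hold the same value in both dumps."""
    ra, rb = parse_registers(a), parse_registers(b)
    return {name: ra[name] for name in ra if rb.get(name) == ra[name]}


if __name__ == "__main__":
    for name, value in sorted(shared_registers(OOPS_2, OOPS_3).items()):
        print("%s: %08x" % (name, value))
```

On these two dumps the shared registers are eax, ebx, edx and edi. For what
it is worth, the shared ebx value 013f0f08 falls inside the stackpage
(013f0000) reported by the first oops, assuming a 4 KB stack page; I have
not verified what that implies.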

System has been going fine for 12 hours since, during which demanding
stuff like remote X sessions, netscape, etc. has been going on as well as
simple shell + pine usage.

Unfortunately, ksymoops claims that I need newer binutils to disassemble
oops_decode.o, even though I recently installed 2.6.0.12. That install may
well be the problem anyway (overly rapid installation, no doubt). I'll
upgrade to the new 2.6.0.14 nonetheless, but any pointers on this problem
would be appreciated, and would let me get you a proper disassembly if
wanted. Note that after ksymoops complains, there is actually no
oops_decode.o present, in case there should be....

I don't think the problem is flaky hardware this time. That did cause a
glut of oopses a while ago, but it was traced to a stalled Pentium fan and
has been rectified. 50 consecutive kernel compiles, starting now, should
check for hardware hassles...
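
The compile loop is nothing clever. A sketch of the idea is below: run the
same build over and over and stop at the first failure, since a build that
fails only some of the time points at hardware rather than the source. The
function name repeat_build is made up, and the build command and source
path in the usage note are assumptions about a typical setup.

```python
import subprocess


def repeat_build(command, runs=50, cwd=None):
    """Run `command` up to `runs` times.

    Returns the 1-based number of the first run that exits with a
    non-zero status, or None if every run succeeded.
    """
    for n in range(1, runs + 1):
        result = subprocess.run(command, cwd=cwd)
        if result.returncode != 0:
            return n
    return None
```

Usage would be something like repeat_build(["make", "zImage"],
cwd="/usr/src/linux"), with both the target and the path assumed. Each
run should probably be preceded by a clean so that the full compile is
actually repeated.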

Any more info available on request of course, and if really wanted, I'll
get proper disassembly sorted out.

Cheers,
Chris.