Re: kernel panic: raid1 + loop

From: bug1 (bug1@netconnect.com.au)
Date: Mon Jun 19 2000 - 23:21:30 EST


Goswin Brederlow wrote:
>
> I'm using 2.4.0-test1-ac7 and also have problems with loopback
>
> losetup -d /dev/loop/XXX
>
> leaves the file attached to that device somehow locked. You cannot
> open it for write anymore (while you can open it for write while it is
> attached).
>
> You also can't reattach the file to a loopback anymore.
>
> So I wonder how you reintegrate the file into your raid. Your "losetup
> /dev/loop/XXX file" call should fail (unless the bug was partly
> fixed).
>
> MfG
> Goswin
Hmm, I'm not sure I exactly follow you.

In 2.2.15, 2.4.0-test1-ac18 and 2.4.0-test1-ac20 I can do
losetup /dev/loop1 ./loop1fs
losetup -d /dev/loop1 (as long as nothing is using ./loop1fs)
losetup /dev/loop1 then says the device is not configured, which is
correct.

So are you saying that in 2.4 the loop bug leaves it attached, but
unreported?

I think that between trying different kernels I confused myself, so
I've been redoing some tests today.

Below are the different errors I get in 2.2 and 2.4 kernels; they are
different problems.

I'm thinking the 2.2 problem could be a raid problem... maybe, but the
2.4 one is probably the known loop problem.

Under 2.2.15 (not 2.4.0-test1) I can create a raid1 composed of 2
loopback files.
I can repeatedly remove and reintegrate *one* of the two loop files by
doing:
raidsetfaulty /dev/md1 /dev/loop1
raidhotremove /dev/md1 /dev/loop1
losetup -d /dev/loop1
raidhotadd /dev/md1 /dev/loop1

I repeated the above process a couple of times with no problem; however,
a problem does occur if I try to do it on the other loopback device.

So I can remove and reintegrate /dev/loop1 and then remove /dev/loop2,
but get errors reintegrating it.
If I remove and reintegrate /dev/loop2 first, I can remove /dev/loop1 but
not reintegrate it.
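
As a rough sketch, the remove/reintegrate cycle described above could be
wrapped in a small shell function so it can be repeated per device. The
names cycle_member, run and DRY_RUN are made up for illustration; by
default the script only prints the commands, in the same order as listed
above, rather than running them.

```shell
#!/bin/sh
# Sketch only. cycle_member, run and DRY_RUN are hypothetical names,
# not part of raidtools or losetup.
# With DRY_RUN=1 (the default) commands are printed, not executed.
# Set DRY_RUN=0 and run as root to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Remove one loop member from the array and add it back, using the
# same four commands, in the same order, as in the list above.
# Stops at the first command that fails.
cycle_member() {
    md="$1"
    loopdev="$2"
    run raidsetfaulty "$md" "$loopdev" &&
    run raidhotremove "$md" "$loopdev" &&
    run losetup -d "$loopdev" &&
    run raidhotadd "$md" "$loopdev"
}
```

For example, "cycle_member /dev/md1 /dev/loop1" followed by
"cycle_member /dev/md1 /dev/loop2" prints the two sequences; per the
report above, under 2.2.15 it is the reintegration of the second device
that errors.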

This is the error I get. It doesn't look like a very good trace, but
anyway.

Unable to handle kernel NULL pointer dereference at virtual address 00000000
current->tss.cr3 = 00101000, %cr3 = 00101000
*pde = 00000000
Oops: 0000
CPU: 1
EIP: 0010:[<00000000>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010002
eax: 00000000 ebx: 00000246 ecx: 00000001 edx: c028a5c0
esi: c3b60000 edi: 00000080 ebp: 00000007 esp: c7e3df68
ds: 0018 es: 0018 ss: 0018
Process mdrecoveryd (pid: 6, process nr: 7, stackpage=c7e3d000)
Stack: c01aacea c028a5c0 c7746e2c c7746de0 c6355000 c7e3dfdc c7746e38 00000024
       00000001 00001000 00001000 0000d27a 00000001 00000009 00000080 00000080
       00000000 00000400 00004dc0 c3b60000 c01ab32d c7746de0 c6355300 c7e3c000
Call Trace: [<c01aacea>] [<c01ab32d>] [<c01a9de8>] [<c0107c74>]
Code: <1>Unable to handle kernel NULL pointer dereference at virtual address 00000000
Warning (Oops_code): trailing garbage ignored on Code: line
  Text: 'Code: <1>Unable to handle kernel NULL pointer dereference at virtual address 00000000'
  Garbage: 'Unable to handle kernel NULL pointer dereference at virtual address 00000000'
Warning (Oops_code_values): Code looks like message, not hex digits. No disassembly attempted.

>>EIP; 00000000 Before first symbol
Trace; c01aacea <md_do_sync+432/97c>
Trace; c01ab32d <md_do_recovery+f9/260>
Trace; c01a9de8 <md_write+10/48>
Trace; c0107c74 <kernel_thread+28/38>

current->tss.cr3 = 00101000, %cr3 = 00101000
*pde = 00000000

4 warnings issued. Results may not be reliable.

Under 2.4.0-test1-ac20 it crashes when I try to run mkraid on the two
loop files right at the beginning.
This is what I get.

Kernel panic: loop: block not locked
NMI Watchdog detected LOCKUP on CPU1, registers:
CPU: 1
EIP: 0010:[<c02480e5>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00000082
eax: 00000120 ebx: 00000002 ecx: c0357f40 edx: 00000001
esi: 00000302 edi: 00000000 ebp: c594bd9c esp: c594bd40
ds: 0018 es: 0018 ss: 0018
Stack: 00000003 00000000 c01a85d8 00000302 c0b3eda0 00000000 00000000 00000000
       c031fd46 c01a8796 00000001 00000001 c594bd9c 00000000 c0135489 00000001
       00000001 c594bd9c 00000246 00000000 c033d290 c594be4c 00000000 c0a8cec0
Call Trace: [<c01a85d8>] [<c01a8796>] [<c0135489>] [<c0135597>] [<c0135613>] [<c011b5fa>] [<c01a9429>]
       [<c02847c0>] [<c01a8033>] [<c01a85a0>] [<c01bd1a1>] [<c011bd61>] [<c01bcaec>] [<c01af53e>] [<c01af5c1>]
       [<c01af4c1>] [<c01bc46f>] [<c011e139>] [<c0242ba8>] [<c01ae6cf>] [<c01ae7f4>] [<c0108ffc>]
Code: f3 90 7e f5 e9 0c f6 f5 ff 80 3d 00 3e 2c c0 00 f3 90 7e f5

>>EIP; c02480e5 <stext_lock+3a91/9f30> <=====
Trace; c01a85d8 <__ll_rw_block+1c/1c4>
Trace; c01a8796 <ll_rw_block+16/1c>
Trace; c0135489 <sync_buffers+125/200>
Trace; c0135597 <fsync_dev+f/84>
Trace; c0135613 <sys_sync+7/10>
Trace; c011b5fa <panic+6a/f0>
Trace; c01a9429 <do_lo_request+41/3a4>
Trace; c02847c0 <cpdext+1c980/2238c>
Trace; c01a8033 <generic_make_request+24b/7d4>
Trace; c01a85a0 <generic_make_request+7b8/7d4>
Trace; c01bd1a1 <raid1_sync_request+795/7e4>
Trace; c011bd61 <printk+169/178>
Trace; c01bcaec <raid1_sync_request+e0/7e4>
Trace; c01af53e <md_do_sync+286/788>
Trace; c01af5c1 <md_do_sync+309/788>
Trace; c01af4c1 <md_do_sync+209/788>
Trace; c01bc46f <raid1syncd+93/630>
Trace; c011e139 <put_files_struct+ad/b8>
Trace; c0242ba8 <sprintf+14/1c>
Trace; c01ae6cf <md_thread+31f/42c>
Trace; c01ae7f4 <md_wakeup_thread+18/1c>
Trace; c0108ffc <kernel_thread+28/38>
Code; c02480e5 <stext_lock+3a91/9f30>
00000000 <_EIP>:
Code; c02480e5 <stext_lock+3a91/9f30> <=====
   0: f3 90 repz nop <=====
Code; c02480e7 <stext_lock+3a93/9f30>
   2: 7e f5 jle fffffff9 <_EIP+0xfffffff9> c02480de <stext_lock+3a8a/9f30>
Code; c02480e9 <stext_lock+3a95/9f30>
   4: e9 0c f6 f5 ff jmp fff5f615 <_EIP+0xfff5f615> c01a76fa <blk_get_queue+1a/50>
Code; c02480ee <stext_lock+3a9a/9f30>
   9: 80 3d 00 3e 2c c0 00 cmpb $0x0,0xc02c3e00
Code; c02480f5 <stext_lock+3aa1/9f30>
  10: f3 90 repz nop
Code; c02480f7 <stext_lock+3aa3/9f30>
  12: 7e f5 jle 9 <_EIP+0x9> c02480ee <stext_lock+3a9a/9f30>

2 warnings issued. Results may not be reliable.

Thanks

Glenn



