oops on partition check, depending on drive order

From: bug1 (bug1@netconnect.com.au)
Date: Thu Apr 20 2000 - 22:07:17 EST


Hi, I've been getting strange disk behaviour on my dual Celeron BP6
for a few days. I've tried 2.3.99pre3 and 2.3.99pre5, and am now
using 2.3.99pre6pre3.

I wish I could rule out a hardware problem, but I really don't know what
the cause of my problem is. My system locks up on boot while detecting
partitions, depending on the order of the drives, and when I do boot I
get lockups when reading from my 4-way IDE raid0.

When I boot successfully and read my array I get "ide_dma_timeout func
only:14". I can write with no problems (55MB/s on raid0 :)) but reading
causes this error about half a second after the read begins.

Whether it locks up on boot depends on which controller my IDE drives
are plugged into.

I have 6 IDE channels on my machine: 4 from my BP6 and 2 from a PCI
HPT366 (abit hotrod) controller card.

The drives I have are:
Quantum Fireball EX6.4A. This has always been plugged into my udma33
slot, and I've never had a problem with it.

Two 20.5GB IBM-DPTA-372050
One Quantum Fireball Plus KA18.2
One Quantum Fireball Plus KX20.5

I can boot with any combination of 4 drives.

I can successfully boot with both IBM drives on controllers 3+4 on the
mobo, and the Quantum drives on my PCI udma66 card:
hda: Quantum EX6.4
hdc: IBM-DPTA (udma 4)
hde: Quantum KA (udma 2)
hdg: Quantum KX (udma 4)
hdi: IBM-DPTA (udma 4)

NOTE: the secondary onboard udma33 channel is disabled, so hdk (or hdi?)
becomes hdc

If I swap the positions of my Quantum drives on the PCI udma controller,
I can still boot.

If I swap the positions of my IBM drives on my onboard controller I can
still boot.

BUT if I put an IBM and a Quantum Plus drive together on my onboard udma
channels AND on my PCI card, it locks up and displays the data below.

hda: Quantum Ex6.4
hdc: IBM (remember hdc is really 6th channel, channel2 disabled in
bios)
hde: IBM
hdg: Quantum ka
hdi: Quantum kx

or

hda: Quantum ex6.4
hdc: ibm
hde: Quantum ka
hdg: ibm
hdi: Quantum kx
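
The pattern in the layouts above can be restated as a small check: group
the drives by controller and see which controllers end up with both an
IBM and a Quantum Plus drive. This is only a sketch in Python. It assumes
hdc and hdi sit on the onboard HPT366 and hde and hdg on the PCI card,
which is inferred from the working layout described above rather than
confirmed. The names CONTROLLER_OF and mixed_controllers are made up for
illustration.

```python
# Which controller each device name hangs off.
# ASSUMPTION: inferred from the working layout (IBM drives on the onboard
# HPT366, Quantum Plus drives on the PCI card); not confirmed by the logs.
CONTROLLER_OF = {
    "hdc": "onboard-hpt366",
    "hdi": "onboard-hpt366",
    "hde": "pci-hpt366",
    "hdg": "pci-hpt366",
}


def mixed_controllers(layout):
    """Return the sorted controllers that carry more than one drive brand."""
    brands = {}
    for dev, drive in layout.items():
        ctrl = CONTROLLER_OF.get(dev)
        if ctrl is None:  # hda is on the PIIX4 and is left out of the check
            continue
        brands.setdefault(ctrl, set()).add(drive.split()[0].lower())
    return sorted(c for c, seen in brands.items() if len(seen) > 1)


# The three layouts listed in this mail.
BOOTS = {"hda": "Quantum EX6.4", "hdc": "IBM DPTA", "hde": "Quantum KA",
         "hdg": "Quantum KX", "hdi": "IBM DPTA"}
LOCKS_1 = {"hda": "Quantum EX6.4", "hdc": "IBM DPTA", "hde": "IBM DPTA",
           "hdg": "Quantum KA", "hdi": "Quantum KX"}
LOCKS_2 = {"hda": "Quantum EX6.4", "hdc": "IBM DPTA", "hde": "Quantum KA",
           "hdg": "IBM DPTA", "hdi": "Quantum KX"}

for name, layout in (("boots", BOOTS), ("locks 1", LOCKS_1), ("locks 2", LOCKS_2)):
    print(name, mixed_controllers(layout))
```

Under that assumption both lockup layouts mix brands on both HPT366
controllers, while the working layout mixes them on neither. Three
layouts are far too few to prove the rule.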

I haven't worked out how to use ksymoops properly yet, so this is all I
have.

The first time I got this:

Partition check:
 hda: hda1 hda2 hda3 hda4 < hda5 hda6 hda7 >
 hdc: hdc1 hdc2
 hde: hde1 hde2
 hdg:NMI Watchdog detected LOCKUP on CPU0, registers:
CPU: 0
EIP: 0010:[<c01e9ab9>]
EFLAGS: 00000006
eax: 00000050 ebx: c12991c0 ecx: 00000000 edx: 0000a407
esi: c129c360 edi: 00000286 ebp: c03e5140 esp: c1273d80
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 1, stackpage=c1273000)
Stack: c12991c0 04000001 00000000 00000012 00000000 00000000 c010db01
00000012
       c129c360 c1273de4 c0349a40 c0349a50 00000012 c1273ddc c010de34
00000012
       c1273de4 c127dbe0 c03e5180 00000282 c12910a8 00000000 c127dbe0
c1273e68
Call Trace: [<c010db01>] [<c010de34>] [<c010bee0>] [<c01e0018>]
[<c01ad1ab>] [<c013efd1>] [<c0140b61>]
       [<c015b521>] [<c015b160>] [<c010bee0>] [<c015b23d>] [<c015b1ca>]
[<c01e82fb>] [<c0110018>] [<c01071cb>]
       [<c0109010>]
Code: e6 80 81 3d 28 f4 2f c0 ad 4e ad de 74 19 6a 5c 68 80 e2 2b
console shuts up ...

Rebooted, and this time I struck problems at hde:

The registers ebx, edx, esi, ebp, and the Stack are different from above
but everything else is the same.

Stopped at hde, some of the output is.

EIP: 0010:[<c010de3e>]
EFLAGS: 00000046

Call Trace: [<c010bee0>] [<c01e0018>] [<c01ad1ab>] [<c013efd1>]
[<c0140b61>] [<c015b521>] [<c015b160>]
        [<c010bee0>] [<c015b23d>] [<c015b1ca>] [<c01e82fb>] [<c0110018>]
[<c01071cb>] [<c0109010>]
Code: 74 28 68 37 de 10 c0 68 00 ff 27 c0 e8 69 01 01 00 83 c4 08

Stopped at hdg, the call trace is identical to above, but Code is.
Code: f6 06 01 f3 90 75 f9 e9 16 63 e9 ff f6 06 01 f3 90 75 f9 e9

I can reproduce more of this if needed.
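
Since the traces above are raw addresses, here is a rough idea of what
decoding them involves: look each address up in the System.map of the
running kernel and report the nearest symbol at or below it. This is a
minimal sketch in Python, not a replacement for ksymoops. The map entries
in EXAMPLE_MAP are hypothetical placeholders, not symbols from this
kernel, and load_system_map and resolve are illustrative names.

```python
import bisect


def load_system_map(lines):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) >= 3:
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, address):
    """Return 'name+0xoffset' for the nearest symbol at or below address."""
    starts = [addr for addr, _ in symbols]
    i = bisect.bisect_right(starts, address) - 1
    if i < 0:
        return hex(address)  # below the first symbol, nothing to report
    addr, name = symbols[i]
    return f"{name}+{address - addr:#x}"


# HYPOTHETICAL map entries, for illustration only. The real ones must come
# from the System.map built together with the kernel that oopsed.
EXAMPLE_MAP = [
    "c010da00 T example_symbol_a",
    "c010de00 T example_symbol_b",
    "c01e9a00 t example_symbol_c",
]

symbols = load_system_map(EXAMPLE_MAP)
# A few addresses taken from the oops output above.
for text in ("c010db01", "c010de34", "c01e9ab9"):
    print(text, resolve(symbols, int(text, 16)))
```

With a real map, the list passed to load_system_map would be the lines of
that kernel's System.map file. The result only means something if the map
matches the exact kernel build that produced the oops.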

dmesg follows

> NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
 00 000 00 1 0 0 0 0 0 0 00
 01 0FF 0F 0 0 0 0 0 1 1 59
 02 0FF 0F 0 0 0 0 0 1 1 51
 03 0FF 0F 0 0 0 0 0 1 1 61
 04 0FF 0F 0 0 0 0 0 1 1 69
 05 000 00 1 0 0 0 0 0 0 00
 06 0FF 0F 0 0 0 0 0 1 1 71
 07 0FF 0F 0 0 0 0 0 1 1 79
 08 0FF 0F 0 0 0 0 0 1 1 81
 09 000 00 1 0 0 0 0 0 0 00
 0a 000 00 1 0 0 0 0 0 0 00
 0b 0FF 0F 0 0 0 0 0 1 1 89
 0c 0FF 0F 0 0 0 0 0 1 1 91
 0d 000 00 1 0 0 0 0 0 0 00
 0e 0FF 0F 0 0 0 0 0 1 1 99
 0f 0FF 0F 0 0 0 0 0 1 1 A1
 10 0FF 0F 1 1 0 1 0 1 1 A9
 11 000 00 1 0 0 0 0 0 0 00
 12 0FF 0F 1 1 0 1 0 1 1 B1
 13 0FF 0F 1 1 0 1 0 1 1 B9
 14 000 00 1 0 0 0 0 0 0 00
 15 000 00 1 0 0 0 0 0 0 00
 16 000 00 1 0 0 0 0 0 0 00
 17 000 00 1 0 0 0 0 0 0 00
IRQ to pin mappings:
IRQ0 -> 2
IRQ1 -> 1
IRQ3 -> 3
IRQ4 -> 4
IRQ6 -> 6
IRQ7 -> 7
IRQ8 -> 8
IRQ11 -> 11
IRQ12 -> 12
IRQ13 -> 13
IRQ14 -> 14
IRQ15 -> 15
IRQ16 -> 16
IRQ18 -> 18
IRQ19 -> 19
.................................... done.
calibrating APIC timer ...
..... CPU clock speed is 434.3800 MHz.
..... host bus clock speed is 66.8275 MHz.
cpu: 0, clocks: 668275, slice: 222758
CPU0<C0:668272,C:445504,D:10,S:222758,C:668275>
cpu: 1, clocks: 668275, slice: 222758
CPU1<C0:668272,C:222752,D:4,S:222758,C:668275>
checking TSC synchronization across CPUs: passed.
Setting commenced=1, go go go
mtrr: your CPUs had inconsistent fixed MTRR settings
mtrr: probably your BIOS does not setup all CPUs
PCI: PCI BIOS revision 2.10 entry at 0xfb440, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Using IRQ router PIIX [8086/7000] at 00:07.0
PCI->APIC IRQ transform: (B0,I9,P0) -> 19
PCI->APIC IRQ transform: (B0,I11,P0) -> 18
PCI->APIC IRQ transform: (B0,I11,P0) -> 18
PCI->APIC IRQ transform: (B0,I15,P0) -> 16
PCI->APIC IRQ transform: (B0,I19,P0) -> 18
PCI->APIC IRQ transform: (B0,I19,P1) -> 18
Limiting direct PCI/PCI transfers.
isapnp: Scanning for Pnp cards...
isapnp: No Plug & Play device found
Linux NET4.0 for Linux 2.3
Based upon Swansea University Computer Society NET3.039
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP
IP: routing cache hash table of 512 buckets, 8Kbytes
TCP: Hash tables configured (established 4096 bind 5461)
Starting kswapd v1.6
vga16fb: initializing
vga16fb: unable to reserve VGA memory, exiting
Detected PS/2 Mouse Port.
pty: 256 Unix98 ptys configured
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
loop: registered device at major 7
loop: enabling 8 loop devices
Uniform Multi-Platform E-IDE driver Revision: 6.30
ide: Assuming 33MHz system bus speed for PIO modes; override with
idebus=xx
PIIX4: IDE controller on PCI bus 00 dev 39
PIIX4: not 100% native mode: will probe irqs later
    ide0: BM-DMA at 0xf000-0xf007, BIOS settings: hda:DMA, hdb:pio
HPT366: IDE controller on PCI bus 00 dev 58
HPT366: not 100% native mode: will probe irqs later
    ide2: BM-DMA at 0xa000-0xa007, BIOS settings: hde:DMA, hdf:pio
HPT366: IDE controller on PCI bus 00 dev 59
HPT366: not 100% native mode: will probe irqs later
    ide3: BM-DMA at 0xac00-0xac07, BIOS settings: hdg:DMA, hdh:pio
HPT366: onboard version of chipset, pin1=1 pin2=2
HPT366: IDE controller on PCI bus 00 dev 98
HPT366: not 100% native mode: will probe irqs later
    ide1: BM-DMA at 0xb800-0xb807, BIOS settings: hdc:DMA, hdd:pio
HPT366: IDE controller on PCI bus 00 dev 99
HPT366: not 100% native mode: will probe irqs later
    ide4: BM-DMA at 0xc400-0xc407, BIOS settings: hdi:DMA, hdj:pio
hda: QUANTUM FIREBALL EX6.4A, ATA DISK drive
hdc: IBM-DPTA-372050, ATA DISK drive
hde: QUANTUM FIREBALLP KA18.2, ATA DISK drive
hdg: QUANTUM FIREBALLP KX20.5, ATA DISK drive
hdi: IBM-DPTA-372050, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide1 at 0xb000-0xb007,0xb402 on irq 18
ide2 at 0x9800-0x9807,0x9c02 on irq 18
ide3 at 0xa400-0xa407,0xa802 on irq 18
ide4 at 0xbc00-0xbc07,0xc002 on irq 18
hda: 12594960 sectors (6449 MB) w/418KiB Cache, CHS=13328/15/63,
UDMA(33)
hdc: 40088160 sectors (20525 MB) w/1961KiB Cache, CHS=39770/16/63,
UDMA(66)
hde: 36094464 sectors (18480 MB) w/371KiB Cache, CHS=35808/16/63,
UDMA(33)
hdg: 40160988 sectors (20562 MB) w/418KiB Cache, CHS=39842/16/63,
UDMA(33)
hdi: 40088160 sectors (20525 MB) w/1961KiB Cache, CHS=39770/16/63,
UDMA(66)
Partition check:
 hda: hda1 hda2 hda3 hda4 < hda5 hda6 hda7 >
 hdc: hdc1 hdc2
 hde: hde1 hde2
 hdg: hdg1 hdg2
 hdi: hdi1 hdi2
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
md driver 0.90.0 MAX_MD_DEVS=256, MAX_REAL=12
linear personality registered
raid0 personality registered
md.c: sizeof(mdp_super_t) = 4096
LVM version 0.8final by Heinz Mauelshagen (15/02/2000)
lvm -- Driver successfully initialized
scsi0 : SCSI host adapter emulation for IDE ATAPI devices
scsi : 1 host.
scsi : detected total.
SLIP: version 0.8.4-NET3.019-NEWTTY (dynamic channels, max=256) (6 bit
encapsulation enabled).
CSLIP: code copyright 1989 Regents of the University of California.
SLIP linefill/keepalive option.
Serial driver version 4.93 (2000-03-20) with MANY_PORTS SHARE_IRQ
SERIAL_PCI ISAPNP enabled
ttyS00 at 0x03f8 (irq = 4) is a 16550A
ttyS03 at 0x02e8 (irq = 3) is a 16550A
Real Time Clock Driver v1.10b
Non-volatile memory driver v1.0
8139too Fast Ethernet driver 0.9.4 loaded
eth0: RealTek RTL8139 Fast Ethernet at 0xeb000000, IRQ 19,
00:00:1c:d0:0b:00.
Linux agpgart interface v0.99 (c) Jeff Hartmann
agpgart: Maximum main memory to use for agp memory: 96M
agpgart: Detected Intel 440BX chipset
agpgart: AGP aperture is 64M @ 0xe0000000
kmem_create: Forcing size word alignment - nfs_fh
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 196k freed
Adding Swap: 32564k swap-space (priority -1)
(read) hdc1's sb offset: 11718400 [events: 00000004]
(read) hde1's sb offset: 11718400 [events: 00000004]
(read) hdg1's sb offset: 11718400 [events: 00000008]
autorun ...
considering hdg1 ...
  adding hdg1 ...
created md0
bind<hdg1,1>
running: <hdg1>
now!
hdg1's event counter: 00000008
md: device name has changed from [dev 39:01] to hdg1 since last import!
md0: former device hdg1 is unavailable, removing from array!
md0: former device hdi1 is unavailable, removing from array!
md0: max total readahead window set to 384k
md0: 3 data-disks, max readahead per data-disk: 128k
md: md0, array needs 3 disks, has 1, aborting.
raid0: disks are not ordered, aborting!
pers->run() failed ...
do_md_run() returned -22
unbind<hdg1,0>
export_rdev(hdg1)
md0 stopped.
considering hde1 ...
  adding hde1 ...
  adding hdc1 ...
created md0
bind<hdc1,1>
bind<hde1,2>
running: <hde1><hdc1>
now!
hde1's event counter: 00000004
hdc1's event counter: 00000004
md: device name has changed from hdg1 to hde1 since last import!
md0: former device hde1 is unavailable, removing from array!
md0: max total readahead window set to 384k
md0: 3 data-disks, max readahead per data-disk: 128k
md: md0, array needs 3 disks, has 2, aborting.
raid0: disks are not ordered, aborting!
pers->run() failed ...
do_md_run() returned -22
unbind<hde1,1>
export_rdev(hde1)
unbind<hdc1,0>
export_rdev(hdc1)
md0 stopped.
... autorun DONE.



