Problems making big fs on DPT RAID

Chris Adams (cadams@ro.com)
Wed, 5 Nov 1997 16:04:23 -0600 (CST)


I am having trouble setting up filesystems on a big array with a DPT
PM3334UW, e2fsprogs 1.10 (from Red Hat 4.2), and Linux 2.0.31. Whenever
something accesses the drives attached to the DPT (even the drive that
is not in the array), the kernel gobbles memory, and I don't mean
buffers/cache. Here are the first commands run after a boot in which the
partitions had to be fscked:

1:newnews:~$ w
  3:36pm  up 1 min,  1 user,  load average: 0.49, 0.19, 0.07
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU  WHAT
cadams   ttyp0    sprocket          3:36pm  0.00s  0.13s  0.02s  w
2:newnews:~$ free
             total       used       free     shared    buffers     cached
Mem:        322024     182160     139864       6384     155076       3960
-/+ buffers:            23124     298900
Swap:       130748          0     130748
3:newnews:~$

Here is after a clean reboot (no fsck):

1:newnews:~$ w
  3:42pm  up 0 min,  1 user,  load average: 0.33, 0.09, 0.03
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU  WHAT
cadams   ttyp0    sprocket          3:42pm  0.00s  0.15s  0.01s  w
2:newnews:~$ free
             total       used       free     shared    buffers     cached
Mem:        322024       8908     313116       6380       1668       3920
-/+ buffers:             3320     318704
Swap:       130748          0     130748
3:newnews:~$
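
The "-/+ buffers" row in each output is derived from the "Mem:" row:
used minus buffers and cached, and free plus buffers and cached. A
minimal Python sketch that recomputes it from the figures above (the
function name is made up for illustration):

```python
def without_buffers(used, free, buffers, cached):
    """Recompute the '-/+ buffers' row of free(1) from the 'Mem:' row (kB)."""
    reclaimable = buffers + cached
    return used - reclaimable, free + reclaimable

# Figures copied from the two 'free' outputs above.
after_fsck = without_buffers(used=182160, free=139864, buffers=155076, cached=3960)
clean_boot = without_buffers(used=8908, free=313116, buffers=1668, cached=3920)

print(after_fsck)                     # (23124, 298900)
print(clean_boot)                     # (3320, 318704)
print(after_fsck[0] - clean_boot[0])  # 19804
```

The last figure is the extra memory (in kB) in use outside buffers and
cache after the boot that ran fsck, compared with the clean boot.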

When I try to make a big filesystem on the array (15G or more), the
system crashes. Sometimes I get the same message repeated as fast as the
screen can scroll:

Aiee: scheduling in interrupt 0015b840

According to /System.map, 0015b840 is in the middle of
__get_request_wait.
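
Looking up an address in System.map means finding the last symbol whose
start address is at or below it. A rough Python sketch of that lookup,
assuming the usual "address type name" line layout. The sample
addresses and the two neighbouring symbol names are invented for the
example and are not taken from the real map:

```python
import bisect

def symbol_for(address, system_map_lines):
    """Return (name, offset) for the last symbol starting at or below address."""
    symbols = []
    for line in system_map_lines:
        parts = line.split()
        if len(parts) >= 3:
            # Column 1 is the hex address, column 3 the symbol name.
            symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, address) - 1
    if i < 0:
        return None  # address lies below every symbol in the map
    start, name = symbols[i]
    return name, address - start

# Hypothetical excerpt: only the name __get_request_wait comes from the post.
sample = [
    "0015b7c0 t __get_request_wait",
    "0015b8f0 T some_later_function",
    "0015b6a0 T some_earlier_function",
]
print(symbol_for(0x0015b840, sample))  # ('__get_request_wait', 128)
```

On the real machine, the second argument would be the lines of
/System.map, for example open("/System.map").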

Other times when it crashes, I get something like this (typed by hand,
since it didn't make it to the logs):

scsi0 channel 0 : resetting for second half of retries.
SCSI bus is being reset for host 0 channel 0.
eata_reset called pid:9053 target: 9 lun: 0 reason 0
eata_reset: slot 2 in reset, pid 9060.
eata_reset: slot 4 in reset, pid 9207.
eata_reset: slot 7 in reset, pid 9116.
eata_reset: slot 8 in reset, pid 9165.
eata_reset: slot 10 in reset, pid 9067.
eata_reset: slot 14 in reset, pid 9213.
eata_reset: slot 15 in reset, pid 9214.
eata_reset: slot 16 in reset, pid 9123.
eata_reset: slot 17 in reset, pid 9172.
eata_reset: slot 18 in reset, pid 9074.
eata_reset: slot 19 in reset, pid 9215.
eata_reset: slot 20 in reset, pid 9216.
eata_reset: slot 25 in reset, pid 9179.
eata_reset: slot 26 in reset, pid 9081.
eata_reset: slot 27 in reset, pid 9130.
eata_reset: slot 35 in reset, pid 9088.
eata_reset: slot 36 in reset, pid 9137.
eata_reset: slot 37 in reset, pid 9186.
eata_reset: slot 44 in reset, pid 9095.
eata_reset: slot 45 in reset, pid 9144.
eata_reset: slot 47 in reset, pid 9193.
eata_reset: slot 53 in reset, pid 9102.
eata_reset: slot 54 in reset, pid 9151.
eata_reset: slot 57 in reset, pid 9200.
eata_reset: slot 62 in reset, pid 9109.
eata_reset: slot 63 in reset, pid 9158.
eata_reset: board reset done, enabling interrupts.
eata_reset: interrupts disabled again.
eata_reset: slot 2, DID_RESET, pid 9060 done.
eata_reset: slot 4, DID_RESET, pid 9207 done.
eata_reset: slot 7, DID_RESET, pid 9116 done.
eata_reset: slot 8, DID_RESET, pid 9165 done.
eata_reset: slot 10, DID_RESET, pid 9067 done.
eata_reset: slot 14, DID_RESET, pid 9213 done.
eata_reset: slot 15, DID_RESET, pid 9214 done.
eata_reset: slot 16, DID_RESET, pid 9123 done.
eata_reset: slot 17, DID_RESET, pid 9172 done.
eata_reset: slot 18, DID_RESET, pid 9074 done.
eata_reset: slot 19, DID_RESET, pid 9215 done.
eata_reset: slot 20, DID_RESET, pid 9216 done.
eata_reset: slot 25, DID_RESET, pid 9179 done.
eata_reset: slot 26, DID_RESET, pid 9081 done.
eata_reset: slot 27, DID_RESET, pid 9130 done.
eata_reset: slot 35, DID_RESET, pid 9088 done.
eata_reset: slot 36, DID_RESET, pid 9137 done.
eata_reset: slot 37, DID_RESET, pid 9186 done.
eata_reset: slot 44, DID_RESET, pid 9095 done.
eata_reset: slot 45, DID_RESET, pid 9144 done.
eata_reset: slot 47, DID_RESET, pid 9193 done.
eata_reset: slot 53, DID_RESET, pid 9102 done.
eata_reset: slot 54, DID_RESET, pid 9151 done.
eata_reset: slot 57, DID_RESET, pid 9200 done.
eata_reset: slot 62, DID_RESET, pid 9109 done.
eata_reset: slot 63, DID_RESET, pid 9158 done.
eata_reset: exit, wakeup.
eata_dma: in_handler, reseted command pid 9213 returned
eata_dma: in_handler, reseted command pid 9216 returned
eata_dma: in_handler, reseted command pid 9215 returned
eata_dma: in_handler, reseted command pid 9060 returned
eata_dma: in_handler, reseted command pid 9067 returned
eata_dma: in_handler, reseted command pid 9074 returned
eata_dma: in_handler, reseted command pid 9081 returned
eata_dma: in_handler, reseted command pid 9088 returned
eata_dma: in_handler, reseted command pid 9095 returned
eata_dma: in_handler, reseted command pid 9102 returned
eata_dma: in_handler, reseted command pid 9109 returned
eata_dma: in_handler, reseted command pid 9116 returned
eata_dma: in_handler, reseted command pid 9123 returned
eata_dma: in_handler, reseted command pid 9130 returned
eata_dma: in_handler, reseted command pid 9137 returned
eata_dma: in_handler, reseted command pid 9144 returned
eata_dma: in_handler, reseted command pid 9151 returned
eata_dma: in_handler, reseted command pid 9158 returned
eata_dma: in_handler, reseted command pid 9165 returned
eata_dma: in_handler, reseted command pid 9172 returned
eata_dma: in_handler, reseted command pid 9179 returned
eata_dma: in_handler, reseted command pid 9186 returned
eata_dma: in_handler, reseted command pid 9193 returned
eata_dma: in_handler, reseted command pid 9200 returned
eata_dma: in_handler, reseted command pid 9207 returned
eata_dma: in_handler, reseted command pid 9214 returned
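
Because the trace above was typed by hand, one sanity check is whether
every command reported "in reset" also shows up as DID_RESET and as
returned. A small Python sketch of that check, assuming the trace has
been saved as text. Only the first three slots are included here to
keep it short, and the function name is made up:

```python
import re

def unmatched_resets(trace):
    """Return pids seen 'in reset' that lack a DID_RESET or 'returned' line."""
    in_reset = set(re.findall(r"slot \d+ in reset, pid (\d+)", trace))
    done = set(re.findall(r"DID_RESET, pid (\d+) done", trace))
    returned = set(re.findall(r"reseted command pid (\d+) returned", trace))
    # A pid counts as matched only if it appears in both later phases.
    return sorted(in_reset - (done & returned), key=int)

# Excerpt copied from the trace above (slots 2, 4 and 7 only).
trace = """\
eata_reset: slot 2 in reset, pid 9060.
eata_reset: slot 4 in reset, pid 9207.
eata_reset: slot 7 in reset, pid 9116.
eata_reset: slot 2, DID_RESET, pid 9060 done.
eata_reset: slot 4, DID_RESET, pid 9207 done.
eata_reset: slot 7, DID_RESET, pid 9116 done.
eata_dma: in_handler, reseted command pid 9060 returned
eata_dma: in_handler, reseted command pid 9116 returned
eata_dma: in_handler, reseted command pid 9207 returned
"""
print(unmatched_resets(trace))  # []
```

An empty list means every command in the excerpt went through all three
phases.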

What is really weird is that the system sometimes doesn't crash until
mke2fs has apparently finished making the filesystem (it has printed
"done"). I say "apparently" because I can't verify the result: whenever
the system crashes while making the filesystem, it also crashes when I
try to check it.

Also, can somebody tell me what units the '-R stride=xxx' option to
mke2fs (version 1.10) is in? Is it 512 bytes, 1k, 2k, or something
else? I set the stripe size for the RAID 0 to 256k. What should I set
this option to?
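
If the stride turns out to be counted in filesystem blocks (which is how
the mke2fs man page appears to describe it, but treat that as an
assumption), the value would depend on the block size chosen with -b.
A small Python sketch of the arithmetic, also assuming the 256k figure
is the amount written to one disk before moving to the next:

```python
def stride_in_blocks(stripe_bytes, block_bytes):
    """Filesystem blocks per stripe chunk, assuming stride is in fs blocks."""
    if stripe_bytes % block_bytes:
        raise ValueError("stripe size is not a multiple of the block size")
    return stripe_bytes // block_bytes

stripe = 256 * 1024  # the 256k stripe size set on the RAID 0
for block in (1024, 2048, 4096):  # candidate ext2 block sizes
    print(block, stride_in_blocks(stripe, block))
# prints:
# 1024 256
# 2048 128
# 4096 64
```

Under those assumptions, a filesystem made with 4k blocks would use
stride=64, and one made with 1k blocks would use stride=256.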

-- 
Chris Adams - cadams@ro.com
System Administrator - Renaissance Internet Services
I don't speak for anybody but myself - that's enough trouble.