2.6.26.x hangs on amd64/smp

From: BERTRAND Joel
Date: Thu Sep 25 2008 - 17:08:21 EST


Hello,

System: Debian testing; kernels tested: 2.6.26, 2.6.26.3, 2.6.26.5.
Hardware: Core 2 Duo, 4 GB RAM, software RAID1, CFQ I/O scheduler.

I have written a program that works on cartographic data. It is started as a daemon and calls fork() (and pthread_create()) several times. I have observed that it needs 6 GB in total, with each process taking 1.5 GB. The same program works fine under FreeBSD and Solaris (on the same hardware, of course).
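
For readers who want to approximate this workload, here is a minimal sketch in C. It is not the program from this report: the names touch_memory() and run_workers() are hypothetical, and it assumes that what matters is several forked processes each dirtying a large private buffer.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: allocate `bytes` and write one byte per page so
 * that the kernel must back every page with RAM or swap.
 * Returns 0 on success, -1 if the allocation fails. */
int touch_memory(size_t bytes)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        page = 4096;            /* assumed fallback page size */

    unsigned char *buf = malloc(bytes);
    if (buf == NULL)
        return -1;

    for (size_t off = 0; off < bytes; off += (size_t)page)
        buf[off] = 1;

    free(buf);
    return 0;
}

/* Hypothetical driver: fork `nworkers` children, each touching
 * `bytes_per_worker` bytes, then wait for all of them.
 * Returns the number of children that failed, or -1 if a fork() failed
 * (children already started are still reaped). */
int run_workers(int nworkers, size_t bytes_per_worker)
{
    int started = 0;
    int failed = 0;

    for (int i = 0; i < nworkers; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0)
            _exit(touch_memory(bytes_per_worker) == 0 ? 0 : 1);
        started++;
    }

    for (int i = 0; i < started; i++) {
        int status;
        if (wait(&status) < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
            failed++;
    }

    return started == nworkers ? failed : -1;
}
```

A call such as run_workers(4, (size_t)1536 << 20) would give four workers of 1.5 GB each, which matches the 6 GB total above; the worker count of four is an assumption, and a run at that scale may well push the machine deep into swap.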

When it starts, I can see disk activity (swap), and after 2 or 3 minutes the kernel crashes without leaving any trace (no more disk activity, SysRq does nothing...). I have reproduced this bug while logged in on the console. No message was printed.

If I add some nanosleep() calls to my code, the crash is more difficult to reproduce.
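
As an illustration of that kind of throttling, here is a minimal sketch of a nanosleep() wrapper. The name throttle_ns() is hypothetical, and where such calls sit in the real program is not stated here; the sketch only shows the call itself, restarted when a signal interrupts it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <time.h>

/* Hypothetical helper: sleep for `ns` nanoseconds, resuming with the
 * remaining time if a signal interrupts the sleep.
 * Returns 0 on success, -1 on any error other than EINTR. */
int throttle_ns(long ns)
{
    struct timespec req = {
        .tv_sec  = ns / 1000000000L,
        .tv_nsec = ns % 1000000000L
    };
    struct timespec rem;

    while (nanosleep(&req, &rem) != 0) {
        if (errno != EINTR)
            return -1;
        req = rem;              /* continue with the time left */
    }
    return 0;
}
```

Calling throttle_ns(1000000L) between allocation steps, for example, would pause for one millisecond each time; that interval is an arbitrary choice for illustration.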

cauchy:[~] > cat /proc/mdstat
Personalities : [raid1]
md1 : active raid1 sdb2[1] sda2[0]
5855616 blocks [2/2] [UU]

md2 : active raid1 sdb3[1] sda3[0]
48829440 blocks [2/2] [UU]

md3 : active raid1 sdb4[1] sda4[0]
101474496 blocks [2/2] [UU]

md0 : active raid1 sdb1[1] sda1[0]
128384 blocks [2/2] [UU]

unused devices: <none>

Swap is on /dev/md1.

cauchy:[~] > df -h
Filesystem  Size  Used  Avail  Use%  Mounted on
/dev/md2     46G   28G    16G   64%  /
tmpfs       2,0G     0   2,0G    0%  /lib/init/rw
udev         10M  124K   9,9M    2%  /dev
tmpfs       2,0G     0   2,0G    0%  /dev/shm
/dev/md0    122M   60M    56M   52%  /boot
/dev/md3     96G   56G    35G   62%  /home
cauchy:[~] >

dmesg :
Linux version 2.6.26.5 (root@cauchy) (gcc version 4.3.1 (Debian 4.3.1-9) ) #16 SMP PREEMPT Tue Sep 23 15:54:59 CEST 2008
...
ACPI: BIOS bug: multiple APIC/MADT found, using 0
ACPI: If "acpi_apic_instance=2" works better, notify linux-acpi@xxxxxxxxxxxxxxx
ACPI: DMI detected: Toshiba
...

.config: see http://www.systella.fr/~bertrand/config.2.6.26.5

Regards,

JKB