Problems with 2.4.20-rc-ac (smp+piix)

From: Peter Kjellstroem (cap@nsc.liu.se)
Date: Thu Nov 21 2002 - 11:41:33 EST


Hi Alan,

Sorry for the double send, forgot subject, cc ... Need sleep... ;-)

I have some problems here running late 2.4.20+ac on a dual Xeon system
(supermicro mb). The kernels tried are 2.4.20-rc1-ac4 and rc2-ac1 with
various configurations. All combinations give the same result. One
.config with corresponding boot output and lspci -vv output has been
attached.

The following kernels do not have this problem (smp + dma piix works):
2.4.18-17.7.x (rh)
2.4.20-pre1
2.4.19-pre8-ac5
2.4.18 (custom)
... plus a few more

*Problem 1 (piix driver)
The system hits a strange BUG when running rc.sysinit (right after sysinit
has enabled dma for the ide drive). The kernel doesn't panic or Oops; only
a BUG is printed (approximately like this):

 BUG at panic.c:286, inv. op. 0000
 ...regdump...
 approx. calltrace (no oops):
 __out_of_line_bug
 'ide_dma stuff'
 dorwdisk
 ext3_get_blk
 ...
 system call

That is what I manually wrote down of the calltrace and mapped back
through System.map. I guess I could spend some time on getting a full
calltrace if you really want.
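
For anyone repeating the System.map step, here is a minimal sketch of the
lookup in Python. It assumes the usual three-column System.map layout
(hex address, symbol type, symbol name). The function names and the sample
addresses are made up for illustration and are not taken from the affected
system.

```python
import bisect


def parse_system_map(text):
    """Parse System.map text into a sorted list of (address, name) pairs."""
    symbols = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        addr, _sym_type, name = parts
        symbols.append((int(addr, 16), name))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) for the last symbol at or below addr, else None."""
    addrs = [a for a, _ in symbols]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return None  # address lies below the first known symbol
    base, name = symbols[i]
    return name, addr - base


# Hypothetical map contents, for illustration only.
SAMPLE_MAP = """\
c0100000 T _text
c0105000 T example_func_a
c0105400 T example_func_b
"""

if __name__ == "__main__":
    syms = parse_system_map(SAMPLE_MAP)
    print(resolve(syms, 0xC0105010))
```

Each address from the trace would be passed to resolve() in turn; the offset
shows how far into the function the address falls.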

*Problem 2 (smp detection/init stuff)
After disabling dma and getting past problem 1, the system surprised me by
booting up with only one cpu.

part of bootup messages:

kernel: CPU: L1 I cache: 0K, L1 D cache: 8K
kernel: CPU: L2 cache: 512K
kernel: CPU: Physical Processor ID: 0
kernel: Enabling fast FPU save and restore... done.
kernel: Enabling unmasked SIMD FPU exception support... done.
kernel: Checking 'hlt' instruction... OK.
kernel: POSIX conformance testing by UNIFIX
kernel: mtrr: v1.40 (20010327) Richard Gooch (rgooch@atnf.csiro.au)
kernel: mtrr: detected mtrr type: Intel
kernel: CPU: L1 I cache: 0K, L1 D cache: 8K
kernel: CPU: L2 cache: 512K
kernel: CPU: Physical Processor ID: 0
kernel: CPU0: Intel(R) XEON(TM) CPU 2.20GHz stepping 04
kernel: per-CPU timeslice cutoff: 1462.49 usecs.
kernel: task migration cache decay timeout: 10 msecs.
kernel: enabled ExtINT on CPU#0
kernel: ESR value before enabling vector: 00000000
kernel: ESR value after enabling vector: 00000000
kernel: Error: only one processor found.
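
A quick way to compare boot logs between a working and a failing kernel is
to scan them for the processors that were announced and for the error line
above. The following Python sketch does that. It assumes each processor
that comes up is announced with a "CPU<n>:" line like the CPU0 line in this
log; the function name is hypothetical.

```python
import re


def summarize_smp_boot(log_text):
    """Return (sorted CPU numbers announced, whether the single-CPU error appeared)."""
    cpus = set()
    single_cpu_error = False
    for line in log_text.splitlines():
        # Matches "CPU0:" but not "CPU:" or "CPU#0".
        match = re.search(r"\bCPU(\d+):", line)
        if match:
            cpus.add(int(match.group(1)))
        if "only one processor found" in line:
            single_cpu_error = True
    return sorted(cpus), single_cpu_error


# Lines copied from the boot output quoted in this report.
SAMPLE_LOG = """\
kernel: CPU: L2 cache: 512K
kernel: CPU0: Intel(R) XEON(TM) CPU 2.20GHz stepping 04
kernel: enabled ExtINT on CPU#0
kernel: Error: only one processor found.
"""

if __name__ == "__main__":
    print(summarize_smp_boot(SAMPLE_LOG))
```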

Could any of the following patches be relevant? (from -ac changelog)

Linux 2.4.20-pre1-ac1
* Fix a harmless physical/logical cpu confusion (me)
        in the APM code

Linux 2.4.19rc5-ac1
+ Switch 'processor id' to 'physical id' (me)
        | Keeps glibc happy until we sort out cpu numbers longer term
o Fix incorrect marking of phys_proc_id init (David Luyer)

/Peter

-- 
------------------------------------------------------------
  Peter Kjellstroem              | E-mail: cap@nsc.liu.se
  National Supercomputer Centre  |
  Sweden                         | http://www.nsc.liu.se






