Re: [x86,SMP,patch] smp-2.3.18, please test.

Pete Wyckoff (wyckoff@ca.sandia.gov)
Mon, 20 Sep 1999 11:17:40 -0700


mingo@chiara.csoma.elte.hu said:
>
> this is the latest version of the x86 SMP/APIC/IOAPIC code:
>
> http://www.redhat.com/~mingo/smp-2.3.18-B1
>
[...]
>
> reports, comments, suggestions welcome.

I was really looking forward to getting some info from the NMI watchdog,
but it stays silent. Any guesses where to look next on my hard-lockup
problem?

It's easily triggered by running two processes:

tar cf - / > /dev/null
dd if=/dev/zero | ssh oscar dd of=/dev/null

Neither chewing up the network nor eating the disk alone will cause the
hard lock. Scsi is Buslogic BT-950, net is eepro100 full-duplex and
switched at 100 Mb/s to the destination machine.

Running 2.3.18 with the new apic patch. Here's /proc/interrupts, first.

CPU0 CPU1
0: 54285 60701 IO-APIC-edge timer
1: 1040 1250 IO-APIC-edge keyboard
2: 0 0 XT-PIC cascade
9: 1 0 IO-APIC-edge Cyclom-Y
13: 1 0 XT-PIC fpu
16: 411382 411087 IO-APIC-level BusLogic BT-950
19: 28320 33899 IO-APIC-level eth0
NMI: 114925 114925
ERR: 0

Some lines of interest from the boot follow. Thanks for any suggestions.

-- Pete

Linux version 2.3.18 (pw@flog) (gcc version 2.8.1) #1 SMP Mon Sep 20 10:07:58 PDT 1999
Intel MultiProcessor Specification v1.4
Virtual Wire compatibility mode.
OEM ID: INTEL Product ID: 440GX APIC at: 0xFEE00000
Processor #0 Pentium(tm) Pro APIC version 17
Processor #1 Pentium(tm) Pro APIC version 17
I/O APIC #2 Version 17 at 0xFEC00000.
Processors: 2
mapped APIC to ffffe000 (fee00000)
mapped IOAPIC to ffffd000 (fec00000)
Detected 398276917 Hz processor.
[...]
Initializing CPU#1
masked ExtINT on CPU#1
ESR value before enabling vector: 00000000
ESR value after enabling vector: 00000000
Calibrating delay loop... 398.13 BogoMIPS
OK.
CPU1: Intel Pentium II (Deschutes) stepping 01
Total of 2 processors activated (795.44 BogoMIPS).
enabling symmetric IO mode... ... done.
ENABLING IO-APIC IRQs
init IO_APIC IRQs
IO-APIC (apicid-pin) 2-0, 2-10, 2-11, 2-17, 2-18, 2-20, 2-21, 2-22, 2-23 not connected.
activating NMI Watchdog ... done.
number of MP IRQ sources: 17.
number of IO-APIC #2 registers: 24.
testing the IO APIC.......................
Sep 20 10:42:03 flog kernel:
IO APIC #2......
.... register #00: 02000000
....... : physical APIC id: 02
.... register #01: 00170011
....... : max redirection entries: 0017
....... : IO APIC version: 0011
.... register #02: 01000000
....... : arbitration: 01
.... IRQ redirection table:
NR Log Phy Mask Trig IRR Pol Stat Dest Deli Vect:
00 000 00 1 0 0 0 0 0 0 00
01 0FF 0F 0 0 0 0 0 1 1 59
02 0FF 0F 0 0 0 0 0 1 1 51
03 0FF 0F 0 0 0 0 0 1 1 61
04 0FF 0F 0 0 0 0 0 1 1 69
05 0FF 0F 0 0 0 0 0 1 1 71
06 0FF 0F 0 0 0 0 0 1 1 79
07 0FF 0F 0 0 0 0 0 1 1 81
08 0FF 0F 0 0 0 0 0 1 1 89
09 0FF 0F 0 0 0 0 0 1 1 91
0a 000 00 1 0 0 0 0 0 0 00
0b 000 00 1 0 0 0 0 0 0 00
0c 0FF 0F 0 0 0 0 0 1 1 99
0d 000 00 1 0 0 0 0 0 0 00
0e 0FF 0F 0 0 0 0 0 1 1 A1
0f 0FF 0F 0 0 0 0 0 1 1 A9
10 0FF 0F 1 1 0 1 0 1 1 B1
11 000 00 1 0 0 0 0 0 0 00
12 000 00 1 0 0 0 0 0 0 00
13 0FF 0F 1 1 0 1 0 1 1 B9
14 000 00 1 0 0 0 0 0 0 00
15 000 00 1 0 0 0 0 0 0 00
16 000 00 1 0 0 0 0 0 0 00
17 000 00 1 0 0 0 0 0 0 00
IRQ to pin mappings:
IRQ0 -> 2
IRQ1 -> 1
IRQ3 -> 3
IRQ4 -> 4
IRQ5 -> 5
IRQ6 -> 6
IRQ7 -> 7
IRQ8 -> 8
IRQ9 -> 9
IRQ12 -> 12
IRQ13 -> 13
IRQ14 -> 14
IRQ15 -> 15
IRQ16 -> 16
IRQ19 -> 19
.................................... done.
Initializing CPU#0
checking TSC synchronization across CPUs: passed.

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/