Re: [BUG] 2.5.63: ESR killed my box!

From: Ion Badulescu (ionut@badula.org)
Date: Wed Feb 26 2003 - 16:44:16 EST


On Wed, 26 Feb 2003, Linus Torvalds wrote:

> What about detect_init_APIC()?
>
> That one currently does an unconditional
>
> boot_cpu_physical_apicid = 0;

Mikael's patch (included in the previous message) changes this to

        boot_cpu_physical_apicid = -1U;

which indeed amounts to the same thing as the removal you suggest below.

> What happens if you just remove that line (which means that the code
> further on will do
>
> */
>         if (boot_cpu_physical_apicid == -1U)
>                 boot_cpu_physical_apicid = GET_APIC_ID(apic_read(APIC_ID));
>
> which might be correct.

It's not enough. There are two other problems, further down in
APIC_init_uniprocessor():

1) apic_write_around(APIC_ID, boot_cpu_physical_apicid) places the APIC
ID in the lower 8 bits of the APIC_ID register, when it should be in the
upper 8. As a result, it effectively forces the APIC ID to always be 0 for
the boot CPU, which is fatal on SMP AMD boxes.

2) phys_cpu_present_map = 1 means we always set bit 0, but later on
in setup_local_APIC() we do
        if (!clustered_apic_mode &&
            !test_bit(GET_APIC_ID(apic_read(APIC_ID)), &phys_cpu_present_map))
                BUG();
and the BUG() triggers whenever the APIC ID is not zero.
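
To make the two problems concrete, here is a small userspace sketch (not
kernel code). The SKETCH_* macros and the helper functions are hypothetical
stand-ins I made up for illustration; they assume the ID lives in bits 24-31
of the APIC_ID register, as described above, and that the present map fits
in 32 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed layout: the APIC ID occupies the upper 8 bits (24-31). */
#define SKETCH_GET_APIC_ID(reg)  (((uint32_t)(reg) >> 24) & 0xFFu)
#define SKETCH_SET_APIC_ID(id)   (((uint32_t)(id) & 0xFFu) << 24)

/* Problem 1: the ID is written unshifted, so it lands in the low bits
 * and the ID field reads back as 0 for any 8-bit ID. */
uint32_t id_field_after_raw_write(uint32_t apicid)
{
        return SKETCH_GET_APIC_ID(apicid);
}

/* What a correctly shifted write would read back as. */
uint32_t id_field_after_shifted_write(uint32_t apicid)
{
        return SKETCH_GET_APIC_ID(SKETCH_SET_APIC_ID(apicid));
}

/* Problem 2: simplified test_bit() on a 32-bit present map. */
bool sketch_test_bit(unsigned int nr, uint32_t map)
{
        return nr < 32 && ((map >> nr) & 1u);
}
```

For a boot CPU whose APIC ID is 1, id_field_after_raw_write(1) gives 0,
while sketch_test_bit(1, 1) is false and sketch_test_bit(1, 1u << 1) is
true, which is the difference between "phys_cpu_present_map = 1" and
"phys_cpu_present_map = 1 << boot_cpu_physical_apicid".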

Here's Mikael's patch again. It's quite obviously correct: it fixes the
problem on my SMP AMD boxes and doesn't break anything else I've thrown at
it. It applies cleanly to both 2.4 and 2.5.latest.

Ion

-- 
  It is better to keep your mouth shut and be thought a fool,
            than to open it and remove all doubt.
-----------------------------------------------

--- linux-2.4.21-pre4/arch/i386/kernel/apic.c.~1~	2003-02-23 15:55:31.000000000 +0100
+++ linux-2.4.21-pre4/arch/i386/kernel/apic.c	2003-02-23 16:03:50.000000000 +0100
@@ -649,7 +649,7 @@
 	}
 	set_bit(X86_FEATURE_APIC, &boot_cpu_data.x86_capability);
 	mp_lapic_addr = APIC_DEFAULT_PHYS_BASE;
-	boot_cpu_physical_apicid = 0;
+	boot_cpu_physical_apicid = -1U;
 	if (nmi_watchdog != NMI_NONE)
 		nmi_watchdog = NMI_LOCAL_APIC;
 
@@ -1169,8 +1169,8 @@
 
 	connect_bsp_APIC();
 
-	phys_cpu_present_map = 1;
-	apic_write_around(APIC_ID, boot_cpu_physical_apicid);
+	BUG_ON(boot_cpu_physical_apicid != GET_APIC_ID(apic_read(APIC_ID)));
+	phys_cpu_present_map = 1 << boot_cpu_physical_apicid;
 
 	apic_pm_init2();
 



