Re: [PATCH] x86: Plug racy xAPIC access of CPU hotplug code

From: Jan Kiszka
Date: Tue Jan 28 2014 - 07:10:00 EST


On 2014-01-28 12:55, Ingo Molnar wrote:
>
> * Jan Kiszka <jan.kiszka@xxxxxxxxxxx> wrote:
>
>> On 2014-01-27 21:22, Andi Kleen wrote:
>>> On Mon, Jan 27, 2014 at 08:14:06PM +0100, Jan Kiszka wrote:
>>>> apic_icr_write and its users in smpboot.c were apparently written under
>>>> the assumption that this code would only run during early boot. But
>>>> nowadays we also execute it when onlining a CPU later on while the
>>>> system is fully running. That will make wakeup_cpu_via_init_nmi and,
>>>> thus, also native_apic_icr_write run in plain process context. If we
>>>> migrate the caller to a different CPU at the wrong time or interrupt it
>>>> and write to ICR/ICR2 to send unrelated IPIs, we can end up sending
>>>> INIT, SIPI or NMIs to wrong CPUs.
>>>>
>>>> Fix this by disabling interrupts during the write to the ICR halves and
>>>> disable preemption around waiting for ICR availability and using it.
>>>
>>> If you just want to disable migration use get_cpu()/put_cpu()
>>
>> Fine with me if that is now preferred. Will that be the upstream way of
>> -rt's migrate_disable()?
>
> Your original patch is fine, the suggestion to do ICR accesses with
> just preemption disabled is crap and is really asking for trouble: if
> some IRQ comes in at that point after all then it might cause all
> sorts of hard to debug problems (hangs, delays, missed IPIs, etc.).

Of course, we still need irqs off during ICR writes. I thought Andi was
just suggesting to replace preempt_disable with get_cpu, maybe to
document why we are disabling preemption here.
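
To illustrate why the ICR2/ICR pair needs interrupts off, here is a small
user-space model, not kernel code. Everything in it is hypothetical: the
names (write_icr2, write_icr, irq_point, icr_write, run_scenario), the
command codes and the CPU numbers 3 and 7 are made up for the sketch. Only
the ordering problem comes from the patch description. The model assumes an
interrupt arrives between the two register writes and that its handler
sends an unrelated IPI of its own. The migration side of the problem, which
the preempt_disable() part addresses, is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

enum { CMD_INIT = 1, CMD_RESCHED = 2 };  /* hypothetical command codes */

static uint32_t icr2;       /* model of the destination (high) half */
static uint32_t init_dest;  /* CPU that actually received the INIT */
static bool irqs_enabled;   /* model of the local interrupt flag */
static bool irq_pending;    /* an interrupt is waiting to be delivered */

static void write_icr2(uint32_t dest)
{
	icr2 = dest;
}

/* Writing the low half "sends" the IPI to whatever ICR2 holds right now. */
static void write_icr(uint32_t cmd)
{
	if (cmd == CMD_INIT)
		init_dest = icr2;
}

/* Interrupt handler that sends an unrelated IPI to (made-up) CPU 7. */
static void unrelated_ipi_handler(void)
{
	write_icr2(7);
	write_icr(CMD_RESCHED);
}

/* A point where a pending interrupt is delivered if interrupts are on. */
static void irq_point(void)
{
	if (irq_pending && irqs_enabled) {
		irq_pending = false;
		unrelated_ipi_handler();
	}
}

/* Two-step ICR write, optionally with interrupts off around both halves. */
static void icr_write(uint32_t dest, uint32_t cmd, bool irqs_off)
{
	bool saved = irqs_enabled;

	if (irqs_off)
		irqs_enabled = false;  /* stands in for local_irq_save() */
	write_icr2(dest);
	irq_point();                   /* the window between the two halves */
	write_icr(cmd);
	irqs_enabled = saved;          /* stands in for local_irq_restore() */
	irq_point();                   /* deferred interrupt runs here */
}

/* Send INIT to CPU 3 with an interrupt pending; return who received it. */
static uint32_t run_scenario(bool irqs_off)
{
	icr2 = 0;
	init_dest = 0;
	irqs_enabled = true;
	irq_pending = true;
	icr_write(3, CMD_INIT, irqs_off);
	assert(!irq_pending);          /* the interrupt ran in both cases */
	return init_dest;
}
```

In this model, run_scenario(false) reports that the INIT went to CPU 7
because the handler overwrote the destination half in between, while
run_scenario(true) reports CPU 3 and the handler only runs after both
halves are written.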

Jan

--
Siemens AG, Corporate Technology, CT RTC ITP SES-DE
Corporate Competence Center Embedded Linux