Re: [PATCH 2/3] [BUGFIX] x86/x86_64: fix CPU offlining triggered inactive device IRQ interruption

From: Yinghai Lu
Date: Sat Apr 11 2009 - 03:44:20 EST


On Fri, Apr 10, 2009 at 3:02 PM, Eric W. Biederman
<ebiederm@xxxxxxxxxxxx> wrote:
> Gary Hade <garyhade@xxxxxxxxxx> writes:
>
>> On Thu, Apr 09, 2009 at 06:29:10PM -0700, Eric W. Biederman wrote:
>>> Gary Hade <garyhade@xxxxxxxxxx> writes:
>>>
>>> > Impact: Eliminates a race that can leave the system in an
>>> >         unusable state
>>> >
>>> > During rapid offlining of multiple CPUs there is a chance
>>> > that an IRQ affinity move destination CPU will be offlined
>>> > before the IRQ affinity move initiated during the offlining
>>> > of a previous CPU completes.  This can happen when the device
>>> > is not very active and thus fails to generate the IRQ that is
>>> > needed to complete the IRQ affinity move before the move
>>> > destination CPU is offlined.  When this happens there is an
>>> > -EBUSY return from __assign_irq_vector() during the offlining
>>> > of the IRQ move destination CPU which prevents initiation of
>>> > a new IRQ affinity move operation to an online CPU.  This
>>> > leaves the IRQ affinity set to an offlined CPU.
>>> >
>>> > I have been able to reproduce the problem on some of our
>>> > systems using the following script.  When the system is idle
>>> > the problem often reproduces during the first CPU offlining
>>> > sequence.
>>>
>>> You appear to be focusing on the IBM x460 and x3850.
>>
>> True.  I have also observed IRQ interruptions on an IBM x3950 M2
>> which I believe, but am not certain, were due to the other
>> "I/O redirection table register write with Remote IRR bit set"
>> caused problem.
>>
>> I intend to do more testing on the x3950 M2 and other
>> IBM System x servers but I unfortunately do not currently
>> have access to any Intel based non-IBM MP servers.  I was
>> hoping that my testing request might at least get some
>> others interested in running the simple test script on their
>> systems and reporting their results.  Have you perhaps tried
>> the test on any of the Intel based MP systems that you have
>> access to?
>>
>>> Can you describe
>>> what kind of interrupt setup you are running?
>>
>> Being somewhat of an ioapic neophyte I am not exactly sure
>> what you are asking for here.  This is the ioapic information
>> logged during boot, if that helps at all.
>> x3850:
>>     ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[0])
>>     IOAPIC[0]: apic_id 15, version 0, address 0xfec00000, GSI 0-35
>>     ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[36])
>>     IOAPIC[1]: apic_id 14, version 0, address 0xfec01000, GSI 36-71
>> x460:
>>     ACPI: IOAPIC (id[0x0f] address[0xfec00000] gsi_base[0])
>>     IOAPIC[0]: apic_id 15, version 17, address 0xfec00000, GSI 0-35
>>     ACPI: IOAPIC (id[0x0e] address[0xfec01000] gsi_base[36])
>>     IOAPIC[1]: apic_id 14, version 17, address 0xfec01000, GSI 36-71
>>     ACPI: IOAPIC (id[0x0d] address[0xfec02000] gsi_base[72])
>>     IOAPIC[2]: apic_id 13, version 17, address 0xfec02000, GSI 72-107
>>     ACPI: IOAPIC (id[0x0c] address[0xfec03000] gsi_base[108])
>>     IOAPIC[3]: apic_id 12, version 17, address 0xfec03000, GSI 108-143
>
> Sorry.  My real question is which mode you are running the ioapics in.
>

looks like ack_level_irq.

YH
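
For readers following the thread, the race in Gary's quoted patch
description can be modeled with a small toy simulation.  This is only a
sketch of the sequence of events, not kernel code: IrqCfg, offline_cpu(),
irq_arrives() and simulate() are hypothetical names invented for
illustration, the choice of the highest-numbered online CPU as the move
destination is an assumption, and assign_irq_vector() merely mimics the
-EBUSY behaviour the description attributes to __assign_irq_vector().

```python
import errno


class IrqCfg:
    """Toy stand-in for per-IRQ vector state (not the kernel's structure)."""

    def __init__(self, cpu):
        self.cpu = cpu                 # CPU the IRQ currently targets
        self.move_in_progress = False  # pending until an IRQ hits the new CPU


def assign_irq_vector(cfg, new_cpu):
    """Start a move to new_cpu; refuse while an earlier move is pending."""
    if cfg.move_in_progress:
        return -errno.EBUSY
    cfg.cpu = new_cpu
    cfg.move_in_progress = True
    return 0


def irq_arrives(cfg):
    """An interrupt delivered on the new CPU completes the pending move."""
    cfg.move_in_progress = False


def offline_cpu(cfg, cpu, online):
    """Take cpu offline and retarget the IRQ if it pointed at that CPU."""
    online.discard(cpu)
    if cfg.cpu != cpu:
        return 0
    # Assumption: pick the highest-numbered CPU that is still online.
    return assign_irq_vector(cfg, max(online))


def simulate(device_active):
    """Offline CPU 3, then CPU 2, with or without an IRQ in between."""
    online = {0, 1, 2, 3}
    cfg = IrqCfg(cpu=3)
    offline_cpu(cfg, 3, online)       # starts a move from CPU 3 to CPU 2
    if device_active:
        irq_arrives(cfg)              # move completes before the next offline
    rc = offline_cpu(cfg, 2, online)  # CPU 2 is the pending move destination
    return rc, cfg.cpu, cfg.cpu in online


if __name__ == "__main__":
    print("idle device:  ", simulate(device_active=False))
    print("active device:", simulate(device_active=True))
```

With an idle device the second offlining step gets -EBUSY and the IRQ
stays targeted at the offlined CPU 2; with an active device the pending
move completes first and the IRQ is retargeted to CPU 1.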