Re: Panic when cpu hot-remove

From: Guenter Roeck
Date: Mon Nov 09 2015 - 15:21:59 EST


Gerry,

On Thu, Jun 25, 2015 at 04:11:36PM +0800, Jiang Liu wrote:
> On 2015/6/18 15:54, fandongdong wrote:
> >
> >
> > On 2015/6/18 15:27, fandongdong wrote:
> >>
> >>
> >> On 2015/6/18 13:40, Jiang Liu wrote:
> >>> On 2015/6/17 22:36, Alex Williamson wrote:
> >>>> On Wed, 2015-06-17 at 13:52 +0200, Joerg Roedel wrote:
> >>>>> On Wed, Jun 17, 2015 at 10:42:49AM +0000, èåå wrote:
> >>>>>> Hi maintainer,
> >>>>>>
> >>>>>> We found a problem where a panic happens when a CPU is hot-removed.
> >>>>>> We also traced the problem using the calltrace information.
> >>>>>> An endless loop occurs because the value of head never becomes
> >>>>>> equal to the value of tail in the function qi_check_fault().
> >>>>>> The code in question is as follows:
> >>>>>>
> >>>>>>
> >>>>>> do {
> >>>>>>         if (qi->desc_status[head] == QI_IN_USE)
> >>>>>>                 qi->desc_status[head] = QI_ABORT;
> >>>>>>         head = (head - 2 + QI_LENGTH) % QI_LENGTH;
> >>>>>> } while (head != tail);
> >>>>> Hmm, this code iterates only over every second QI descriptor, and
> >>>>> tail probably points to a descriptor that is not iterated over.
> >>>>>
> >>>>> Jiang, can you please have a look?
> >>>> I think that part is normal; the way we use the queue is to always
> >>>> submit a work operation followed by a wait operation so that we can
> >>>> determine the work operation is complete. That's done via
> >>>> qi_submit_sync(). We have had spurious reports of the queue getting
> >>>> impossibly out of sync, though. I saw one that was somehow linked to
> >>>> the I/OAT DMA engine. Roland Dreier saw something similar[1]. I'm not
> >>>> sure if they're related to this, but maybe worth comparing. Thanks,
> >>> Thanks, Alex and Joerg!
> >>>
> >>> Hi Dongdong,
> >>> Could you please give some instructions on how to reproduce this
> >>> issue? I will try to reproduce it if possible.
> >>> Thanks!
> >>> Gerry
> >> Hi Gerry,
> >>
> >> We're running kernel 4.1.0 on a 4-socket system and we want to
> >> offline socket 1.
> >> The steps are as follows:
> >>
> >> echo 1 > /sys/firmware/acpi/hotplug/force_remove
> >> echo 1 > /sys/devices/LNXSYSTM:00/LNXSYBUS:00/ACPI0004:01/eject
> Hi Dongdong,
> I failed to reproduce this issue on my side. Could you please help
> to confirm the following?
> 1) Is this issue reproducible on your side?
> 2) Does this issue happen if you disable the irqbalance service on your
> system?
> 3) Has the corresponding PCI host bridge been removed before removing
> the socket?
>
> From the log messages, we only noticed entries for CPU and memory,
> but no messages for PCI (IOMMU) devices. And this log message
> "[ 149.976493] acpi ACPI0004:01: Still not present"
> implies that the socket had already been powered off during the ejection.
> So the story may be that you powered off the socket while the host
> bridge on the socket was still in use.
> Thanks!
> Gerry
>
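
As Alex explains above, qi_submit_sync() always queues a work descriptor
immediately followed by a wait descriptor, so the queue advances two slots
at a time and the recovery walk quoted from qi_check_fault() only visits
every second slot. To illustrate Joerg's point, here is a small stand-alone
sketch of that walk; it is simplified, the ring size and the walk_to_tail()
helper are made up for the example, and it is not the driver code. If head
and tail ever get out of step by an odd offset, the two-slot stride can
never land on tail and the loop spins forever.

/*
 * Toy model of the walk in qi_check_fault(); simplified and stand-alone,
 * not the kernel code.  Each submission occupies two consecutive slots
 * (work descriptor + wait descriptor), so the recovery walk steps back
 * two slots at a time.
 */
#include <stdio.h>

#define QI_LENGTH 256	/* ring size; illustrative value only */

/*
 * Walk backwards from 'head' two slots at a time, as qi_check_fault()
 * does, and return how many steps it takes to reach 'tail', or -1 if
 * 'tail' is never reached (i.e. the walk would spin forever).
 */
static int walk_to_tail(int head, int tail)
{
	int steps = 0;

	do {
		head = (head - 2 + QI_LENGTH) % QI_LENGTH;
		if (++steps > QI_LENGTH)
			return -1;
	} while (head != tail);

	return steps;
}

int main(void)
{
	/* Normal case: head and tail have the same parity -> terminates. */
	printf("head=10 tail=4 -> steps=%d\n", walk_to_tail(10, 4));

	/*
	 * Out-of-sync case: head and tail differ by an odd amount, so the
	 * two-slot stride never lands on tail and the real loop would
	 * never exit, matching the endless loop described above.
	 */
	printf("head=10 tail=5 -> steps=%d\n", walk_to_tail(10, 5));

	return 0;
}

Whether that is what happens in our case is unclear, but the symptom (a CPU
stuck inside qi_submit_sync()) would look much the same.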

Was this problem ever resolved?

We are seeing the same (or a similar) problem randomly with our hardware.
No CPU hotplug is involved.

Any idea what I can do (or help with) to track down the problem?

Thanks,
Guenter

---
Sample traceback:

[ 485.547997] Uhhuh. NMI received for unknown reason 29 on CPU 0.
[ 485.633519] Do you have a strange power saving mode enabled?
[ 485.715262] Kernel panic - not syncing: NMI: Not continuing
[ 485.795750] CPU: 0 PID: 25109 Comm: cty Tainted: P W O 4.1.12-juniper-00687-g3de457e-dirty #1
[ 485.932825] Hardware name: Juniper Networks, Inc. 0576/HSW RCB PTX, BIOS NGRE_v0.44 04/07/2015
[ 486.057327] 0000000000000029 ffff88085f605df8 ffffffff80a9e179 0000000000000000
[ 486.164220] ffffffff80e53b4a ffff88085f605e78 ffffffff80a99b6f ffff88085f605e18
[ 486.271116] ffffffff00000008 ffff88085f605e88 ffff88085f605e28 ffffffff81019a00
[ 486.378012] Call Trace:
[ 486.413225] <NMI> [<ffffffff80a9e179>] dump_stack+0x4f/0x7b
[ 486.496228] [<ffffffff80a99b6f>] panic+0xbb/0x1e9
[ 486.565393] [<ffffffff802070ac>] unknown_nmi_error+0x9c/0xa0
[ 486.648394] [<ffffffff8020724c>] default_do_nmi+0x19c/0x1c0
[ 486.730138] [<ffffffff80207356>] do_nmi+0xe6/0x160
[ 486.800564] [<ffffffff80aa859b>] end_repeat_nmi+0x1a/0x1e
[ 486.879793] [<ffffffff8072a896>] ? qi_submit_sync+0x186/0x3f0
[ 486.964051] [<ffffffff8072a896>] ? qi_submit_sync+0x186/0x3f0
[ 487.048307] [<ffffffff8072a896>] ? qi_submit_sync+0x186/0x3f0
[ 487.132564] <<EOE>> [<ffffffff80731823>] modify_irte+0x93/0xd0
[ 487.219342] [<ffffffff80731bd3>] intel_ioapic_set_affinity+0x113/0x1a0
[ 487.314918] [<ffffffff80732130>] set_remapped_irq_affinity+0x20/0x30
[ 487.407979] [<ffffffff802c5fec>] irq_do_set_affinity+0x1c/0x50
[ 487.493494] [<ffffffff802c607d>] setup_affinity+0x5d/0x80
[ 487.572725] [<ffffffff802c68b4>] __setup_irq+0x2c4/0x580
[ 487.650695] [<ffffffff8070ce80>] ? serial8250_modem_status+0xd0/0xd0
[ 487.743755] [<ffffffff802c6cf4>] request_threaded_irq+0xf4/0x1b0
[ 487.831786] [<ffffffff8070febf>] univ8250_setup_irq+0x24f/0x290
[ 487.918560] [<ffffffff80710c27>] serial8250_do_startup+0x117/0x5f0
[ 488.009108] [<ffffffff80711125>] serial8250_startup+0x25/0x30
[ 488.093365] [<ffffffff8070b779>] uart_startup.part.16+0x89/0x1f0
[ 488.181397] [<ffffffff8070c475>] uart_open+0x115/0x160
[ 488.256852] [<ffffffff806e9537>] ? check_tty_count+0x57/0xc0
[ 488.339854] [<ffffffff806ed95c>] tty_open+0xcc/0x610
[ 488.412793] [<ffffffff8073dc92>] ? kobj_lookup+0x112/0x170
[ 488.493283] [<ffffffff803b7e6f>] chrdev_open+0x9f/0x1d0
[ 488.569992] [<ffffffff803b1297>] do_dentry_open+0x217/0x340
[ 488.651735] [<ffffffff803b7dd0>] ? cdev_put+0x30/0x30
[ 488.725934] [<ffffffff803b2577>] vfs_open+0x57/0x60
[ 488.797616] [<ffffffff803bffbb>] do_last+0x3fb/0xee0
[ 488.870557] [<ffffffff803c2620>] path_openat+0x80/0x640
[ 488.947270] [<ffffffff803c3eda>] do_filp_open+0x3a/0x90
[ 489.023984] [<ffffffff80aa6098>] ? _raw_spin_unlock+0x18/0x40
[ 489.108240] [<ffffffff803d0ba7>] ? __alloc_fd+0xa7/0x130
[ 489.186213] [<ffffffff803b2909>] do_sys_open+0x129/0x220
[ 489.264184] [<ffffffff80402a4b>] compat_SyS_open+0x1b/0x20
[ 489.344670] [<ffffffff80aa8d65>] ia32_do_call+0x13/0x13

---
Similar traceback, but during PCIe hotplug:

Call Trace:
<NMI> [<ffffffff80a9218a>] dump_stack+0x4f/0x7b
[<ffffffff80a8df39>] panic+0xbb/0x1df
[<ffffffff8020728c>] unknown_nmi_error+0x9c/0xa0
[<ffffffff8020742c>] default_do_nmi+0x19c/0x1c0
[<ffffffff80207536>] do_nmi+0xe6/0x160
[<ffffffff80a9b31b>] end_repeat_nmi+0x1a/0x1e
[<ffffffff80723dc6>] ? qi_submit_sync+0x186/0x3f0
[<ffffffff80723dc6>] ? qi_submit_sync+0x186/0x3f0
[<ffffffff80723dc6>] ? qi_submit_sync+0x186/0x3f0
<<EOE>> [<ffffffff8072a325>] free_irte+0xe5/0x130
[<ffffffff8072ba0f>] free_remapped_irq+0x2f/0x40
[<ffffffff8023af33>] arch_teardown_hwirq+0x23/0x70
[<ffffffff802c32d8>] irq_free_hwirqs+0x38/0x60
[<ffffffff8023e0e3>] native_teardown_msi_irq+0x13/0x20
[<ffffffff8020777f>] arch_teardown_msi_irq+0xf/0x20
[<ffffffff8069e08f>] default_teardown_msi_irqs+0x5f/0xa0
[<ffffffff8020775f>] arch_teardown_msi_irqs+0xf/0x20
[<ffffffff8069e159>] free_msi_irqs+0x89/0x1a0
[<ffffffff8069f165>] pci_disable_msi+0x45/0x50
[<ffffffff80696d05>] cleanup_service_irqs+0x25/0x40
[<ffffffff8069749e>] pcie_port_device_remove+0x2e/0x40
[<ffffffff8069760e>] pcie_portdrv_remove+0xe/0x10

---
Similar, but at another location in qi_submit_sync:

Call Trace:
<NMI> [<ffffffff80a9218a>] dump_stack+0x4f/0x7b
[<ffffffff80a8df39>] panic+0xbb/0x1df
[<ffffffff8020728c>] unknown_nmi_error+0x9c/0xa0
[<ffffffff8020742c>] default_do_nmi+0x19c/0x1c0
[<ffffffff80207536>] do_nmi+0xe6/0x160
[<ffffffff80a9b31b>] end_repeat_nmi+0x1a/0x1e
[<ffffffff80a98c58>] ? _raw_spin_lock+0x38/0x40
[<ffffffff80a98c58>] ? _raw_spin_lock+0x38/0x40
[<ffffffff80a98c58>] ? _raw_spin_lock+0x38/0x40
<<EOE>> [<ffffffff80723e9d>] qi_submit_sync+0x25d/0x3f0
[<ffffffff8072a325>] free_irte+0xe5/0x130
[<ffffffff8072ba0f>] free_remapped_irq+0x2f/0x40
[<ffffffff8023af33>] arch_teardown_hwirq+0x23/0x70
[<ffffffff802c32d8>] irq_free_hwirqs+0x38/0x60
[<ffffffff8023e0e3>] native_teardown_msi_irq+0x13/0x20
[<ffffffff8020777f>] arch_teardown_msi_irq+0xf/0x20
[<ffffffff8069e08f>] default_teardown_msi_irqs+0x5f/0xa0
[<ffffffff8020775f>] arch_teardown_msi_irqs+0xf/0x20
[<ffffffff8069e159>] free_msi_irqs+0x89/0x1a0
[<ffffffff8069f165>] pci_disable_msi+0x45/0x50
[<ffffffff80696d05>] cleanup_service_irqs+0x25/0x40
[<ffffffff8069749e>] pcie_port_device_remove+0x2e/0x40
[<ffffffff8069760e>] pcie_portdrv_remove+0xe/0x10
[<ffffffff806896ed>] pci_device_remove+0x3d/0xc0

The NMIs seem to be more likely during PCIe hotplug (possibly because our
testing generates a large number of PCIe hotplug events).

---
CPU information:

processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 63
model name : Genuine Intel(R) CPU @ 1.80GHz
stepping : 1
microcode : 0x14