Re: [RFC PATCH] x86/apic: Fix BUG due to multiple allocation of legacy vectors.

From: imran . f . khan
Date: Thu May 20 2021 - 02:23:30 EST




On 20/5/21 3:56 pm, Greg KH wrote:
On Wed, May 19, 2021 at 11:39:28PM +0000, Imran Khan wrote:
During activation of secondary CPUs, lapic_online is
invoked to initialize vectors. While lapic_online
installs legacy vectors on all CPUs, it does not set
the corresponding bits in the per-CPU bitmap maintained
under irq_matrix.
This may result in these legacy vectors being allocated
by irq_matrix_alloc; if that happens, a subsequent invocation
of apic_update_vector will hit a BUG like the one shown below:

[ 154.738226] kernel BUG at arch/x86/kernel/apic/vector.c:172!
[ 154.805956] invalid opcode: 0000 [#1] SMP PTI
[ 154.858092] CPU: 22 PID: 3569 Comm: ifup-eth Not tainted 5.8.0-20200716.x86_64 #1
[ 154.954939] Hardware name: Oracle Corporation ORACLE SERVER X6-2/ASM,MOTHERBOARD,1U
[ 155.073636] RIP: 0010:apic_update_vector+0xa7/0x190
[ 155.131996] Code: 01 00 4a 8b 14 ed 80 69 01 a6 48 89 c8 4a 8d 04 e0 48 8b 04 10 48 85 c0 0f 84 d2 00 00 00 48 3d 00 f0 ff ff 0f 87 c6 00 00 00 <0f> 0b 41 8b 46 10 48 0f
[.....]
[ 156.268168] Call Trace:
[ 156.297409] ? irq_matrix_alloc+0x8a/0x150
[ 156.346408] assign_vector_locked+0xd2/0x170
[ 156.397489] x86_vector_activate+0x1b5/0x320
[ 156.448570] __irq_domain_activate_irq+0x64/0xa0
[ 156.503808] __irq_domain_activate_irq+0x38/0xa0
[ 156.559050] irq_domain_activate_irq+0x2b/0x40
[ 156.612213] irq_activate+0x25/0x30
[ 156.653930] __setup_irq+0x58f/0x7b0
[ 156.696690] request_threaded_irq+0xf8/0x1b0
[ 156.747784] ixgbe_open+0x3af/0x600 [ixgbe]
[ 156.797827] __dev_open+0xd8/0x160
[ 156.838503] dev_open+0x48/0x90
[ 156.876065] bond_enslave+0x2b6/0x12c0 [bonding]
[ 156.931310] ? vsscanf+0x5af/0x8e0
[ 156.971986] ? sscanf+0x4e/0x70
[ 157.009546] bond_option_slaves_set+0x112/0x1c0 [bonding]
[ 157.074148] __bond_opt_set+0xdc/0x320 [bonding]
[ 157.129389] __bond_opt_set_notify+0x2c/0x90 [bonding]
[ 157.190871] bond_opt_tryset_rtnl+0x56/0xa0 [bonding]
[ 157.251315] bonding_sysfs_store_option+0x52/0x90 [bonding]

This patch marks these legacy vectors as assigned in irq_matrix
so that the corresponding bits in the per-CPU bitmaps get set and
these legacy vectors don't get reallocated.

Signed-off-by: Imran Khan <imran.f.khan@xxxxxxxxxx>
---
arch/x86/kernel/apic/vector.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
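
For readers following the thread, the failure mode described in the patch text can be illustrated with a toy model. This is only a sketch under stated assumptions: the names toy_cpu, toy_online, toy_alloc and toy_conflict, and the sizes used, are hypothetical and are not the kernel's irq_matrix API or the code in the patch. It shows a per-CPU bitmap allocator that hands out an already installed vector when the installed vectors were never recorded in the bitmap.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: sizes and names are hypothetical, not the kernel's. */
#define TOY_NR_VECTORS   32
#define TOY_LEGACY_FIRST 0
#define TOY_LEGACY_COUNT 16

struct toy_cpu {
    uint32_t alloc_map;                 /* bit n set: vector n is tracked as in use */
    bool     installed[TOY_NR_VECTORS]; /* true: a handler is already installed */
};

/* Install legacy vectors; optionally record them in the bitmap as well. */
static void toy_online(struct toy_cpu *cpu, bool mark_in_bitmap)
{
    cpu->alloc_map = 0;
    for (int v = 0; v < TOY_NR_VECTORS; v++)
        cpu->installed[v] = false;

    for (int v = TOY_LEGACY_FIRST; v < TOY_LEGACY_FIRST + TOY_LEGACY_COUNT; v++) {
        cpu->installed[v] = true;
        if (mark_in_bitmap)
            cpu->alloc_map |= UINT32_C(1) << v;
    }
}

/* Return the lowest vector whose bit is clear and mark it, or -1 if full. */
static int toy_alloc(struct toy_cpu *cpu)
{
    for (int v = 0; v < TOY_NR_VECTORS; v++) {
        if (!(cpu->alloc_map & (UINT32_C(1) << v))) {
            cpu->alloc_map |= UINT32_C(1) << v;
            return v;
        }
    }
    return -1;
}

/* True when the allocator handed out a vector that was already installed. */
static bool toy_conflict(const struct toy_cpu *cpu, int vector)
{
    return vector >= 0 && cpu->installed[vector];
}
```

In this model, calling toy_online(&cpu, false) followed by toy_alloc(&cpu) returns a vector that is already installed, which loosely mirrors the condition that trips the BUG above. Calling toy_online(&cpu, true) first makes the allocator skip the legacy range.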

<formletter>

This is not the correct way to submit patches for inclusion in the
stable kernel tree. Please read:
https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html
for how to do this properly.

</formletter>


Thanks for clarifying the process and providing the relevant doc. It looks like CC-ing the stable list on an RFC patch was a mistake in the first place. I will wait for review comments and, if the patch gets included, will follow option 2 of the above-mentioned document.

Thanks,
Imran