Re: [PATCH v3 3/3] genirq: Use the maple tree for IRQ descriptors management

From: Yujie Liu
Date: Wed May 10 2023 - 03:28:37 EST


Hi Thomas,

On Mon, May 08, 2023 at 11:36:37AM +0200, Thomas Gleixner wrote:
> >> Under the assumption that the code is correct, then the effect of this
> >> patch is that it changes the timing. Sigh.
> >>
> >> 1) Does this happen with a 64-bit kernel too?
> >
> > It doesn't happen on a 64-bit kernel:
>
> Ok. So one difference might be that a 64 bit kernel enables interrupt
> remapping. Can you add 'intremap=off' to the kernel command line please?

Sorry, my previous info was incorrect.

The block/008 (do IO while hotplugging CPUs) failure also happens on a
64-bit kernel, with or without 'intremap=off', and persists when tested
against v6.3. However, the warning in the default_send_IPI_mask_logical
function is not triggered on a 64-bit kernel. I'm not sure whether that
function is 32-bit specific, since it is set in
arch/x86/kernel/apic/probe_32.c.

== x86_64 kernel ==

compiler/disk/kconfig/rootfs/tbox_group/test/testcase:
gcc-11/1SSD/x86_64-rhel-8.3-func/debian-11.1-x86_64-20220510.cgz/lkp-skl-d06/block-group-00/blktests

commit:
v6.3
32c58fc685e5c ("genirq: Use the maple tree for IRQ descriptors management")

            v6.3 32c58fc685e5cd6b5947a5f8e9a
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :10          70%           7:7     blktests.block/008.fail
           :10          70%           7:7     blktests.block/012.fail

== i386 kernel ==

compiler/disk/kconfig/rootfs/tbox_group/test/testcase:
gcc-11/1SSD/i386-debian-10.3-func/debian-11.1-i386-20220923.cgz/lkp-skl-d06/block-group-00/blktests

commit:
v6.3
32c58fc685e5c ("genirq: Use the maple tree for IRQ descriptors management")

            v6.3 32c58fc685e5cd6b5947a5f8e9a
---------------- ---------------------------
       fail:runs  %reproduction    fail:runs
           |             |             |
           :20          90%          18:49    blktests.block/008.fail
           :20          90%          18:49    blktests.block/012.fail
           :20          80%          16:49    dmesg.EIP:default_send_IPI_mask_logical
           :20          80%          16:49    dmesg.WARNING:at_arch/x86/kernel/apic/ipi.c:#default_send_IPI_mask_logical

> >> 2) Can you enable the irq_vector:vector_*.* tracepoints and provide
> >> the trace?
> >
> > Nothing was written to trace buffer, seems like no irq_vector events
> > were captured during this test.
>
> Can you please apply the patch below? No need to enable the irq_vector
> events. It just dumps the information into dmesg.

The dmesgs of 64-bit and 32-bit kernels are attached.
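
For anyone reading the attached logs: the debug patch quoted below prints
the masks with %*pbl, which renders a cpumask as a list of CPU ranges
rather than as hex. The following is only a rough userspace sketch of that
output style, not the kernel implementation. The function name
format_cpu_list() and the 32-CPU limit are assumptions made for
illustration.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Userspace sketch (not kernel code): write the set bits of a small CPU
 * mask into buf as a range list such as "0-3,6", the style of output that
 * %*pbl produces for a cpumask.
 *
 * Assumptions: at most 32 CPUs, mask passed as a plain unsigned int.
 * If buf is too small, the output is truncated but stays NUL-terminated.
 */
static void format_cpu_list(unsigned int mask, char *buf, size_t len)
{
	size_t used = 0;
	int cpu = 0;

	if (len == 0)
		return;
	buf[0] = '\0';

	while (cpu < 32) {
		int start, end, n;

		/* Skip CPUs that are not in the mask. */
		if (!(mask & (1u << cpu))) {
			cpu++;
			continue;
		}

		/* Extend the run while consecutive CPUs are set. */
		start = cpu;
		while (cpu < 32 && (mask & (1u << cpu)))
			cpu++;
		end = cpu - 1;

		/* Separate runs with a comma; single CPUs print without a dash. */
		if (start == end)
			n = snprintf(buf + used, len - used, "%s%d",
				     used ? "," : "", start);
		else
			n = snprintf(buf + used, len - used, "%s%d-%d",
				     used ? "," : "", start, end);

		/* Stop on error or truncation. */
		if (n < 0 || (size_t)n >= len - used)
			break;
		used += (size_t)n;
	}
}
```

With this sketch, a mask with CPUs 0 to 3 and 6 set (0x4f) is formatted as
"0-3,6", and an empty mask gives an empty string.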

--
Best Regards,
Yujie

> ---
> --- a/kernel/irq/cpuhotplug.c
> +++ b/kernel/irq/cpuhotplug.c
> @@ -57,7 +57,8 @@ static bool migrate_one_irq(struct irq_d
>  	bool maskchip = !irq_can_move_pcntxt(d) && !irqd_irq_masked(d);
>  	const struct cpumask *affinity;
>  	bool brokeaff = false;
> -	int err;
> +	int err, irq = d->irq;
> +	bool move_pending;
>
>  	/*
>  	 * IRQ chip might be already torn down, but the irq descriptor is
> @@ -101,10 +102,16 @@ static bool migrate_one_irq(struct irq_d
>  	 * there is no move pending or the pending mask does not contain
>  	 * any online CPU, use the current affinity mask.
>  	 */
> -	if (irq_fixup_move_pending(desc, true))
> +	move_pending = irqd_is_setaffinity_pending(d);
> +	if (irq_fixup_move_pending(desc, true)) {
>  		affinity = irq_desc_get_pending_mask(desc);
> -	else
> +		pr_info("IRQ %3d: move_pending=%d pending mask: %*pbl\n",
> +			irq, move_pending, cpumask_pr_args(affinity));
> +	} else {
>  		affinity = irq_data_get_affinity_mask(d);
> +		pr_info("IRQ %3d: move_pending=%d affinity mask: %*pbl\n",
> +			irq, move_pending, cpumask_pr_args(affinity));
> +	}
>
>  	/* Mask the chip for interrupts which cannot move in process context */
>  	if (maskchip && chip->irq_mask)
> @@ -136,6 +143,9 @@ static bool migrate_one_irq(struct irq_d
>  		brokeaff = false;
>  	}
>
> +	affinity = irq_data_get_effective_affinity_mask(d);
> +	pr_info("IRQ %3d: Done: %*pbl\n", irq, cpumask_pr_args(affinity));
> +
>  	if (maskchip && chip->irq_unmask)
>  		chip->irq_unmask(d);
>

Attachment: dmesg_i386.xz
Description: application/xz

Attachment: dmesg_x86_64.xz
Description: application/xz