Re: [PATCH v3 3/3] genirq: Use the maple tree for IRQ descriptors management

From: Thomas Gleixner
Date: Mon May 08 2023 - 05:36:47 EST


Yujie!

On Sun, May 07 2023 at 16:05, Yujie Liu wrote:
> Sorry for late reply as we were on public holiday earlier this week.

Holidays are more important and the problems do not run away :)

> On Fri, Apr 28, 2023 at 12:31:14PM +0200, Thomas Gleixner wrote:
>> Under the assumption that the code is correct, then the effect of this
>> patch is that it changes the timing. Sigh.
>>
>> 1) Does this happen with a 64-bit kernel too?
>
> It doesn't happen on a 64-bit kernel:

Ok. So one difference might be that a 64-bit kernel enables interrupt
remapping. Can you add 'intremap=off' to the kernel command line please?

>> 2) Can you enable the irq_vector:vector_*.* tracepoints and provide
>> the trace?
>
> I'm a beginner of kernel and not sure if I'm doing this correctly. Here
> are my test steps:

They are perfectly fine.

> # check the trace
> # cat /sys/kernel/debug/tracing/trace
> # tracer: nop
> #
> # entries-in-buffer/entries-written: 0/0 #P:4
> #
> #                                _-----=> irqs-off/BH-disabled
> #                               / _----=> need-resched
> #                              | / _---=> hardirq/softirq
> #                              || / _--=> preempt-depth
> #                              ||| / _-=> migrate-disable
> #                              |||| /     delay
> #           TASK-PID     CPU#  |||||  TIMESTAMP  FUNCTION
> #              | |         |   |||||     |         |
>
> Nothing was written to trace buffer, seems like no irq_vector events
> were captured during this test.

Stupid me. I completely forgot that this happens on the outgoing CPU at
a point where the tracer for that CPU is already shut down.

Can you please apply the patch below? No need to enable the irq_vector
events. It just dumps the information into dmesg.
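
For reading the resulting dmesg lines: the %*pbl format prints a CPU mask
as a list of ranges rather than as a hex bitmap. The userspace sketch below
mimics that rendering for a single 64-bit mask. mask_to_list() is a
hypothetical helper made up for illustration, not a kernel function, and it
assumes at most 64 CPUs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Userspace sketch: render the set bits of a 64-bit mask as a range list,
 * e.g. bits 0,1,2,3,5 -> "0-3,5". Hypothetical helper, not a kernel API.
 * Returns the number of characters written (excluding the NUL).
 */
static size_t mask_to_list(uint64_t mask, char *buf, size_t len)
{
	size_t used = 0;
	int bit = 0;

	assert(buf != NULL && len > 0);
	buf[0] = '\0';

	while (bit < 64) {
		int start, end, n;

		/* Skip clear bits until the next run starts */
		if (!(mask & (UINT64_C(1) << bit))) {
			bit++;
			continue;
		}

		/* Find the end of this run of consecutive set bits */
		start = bit;
		while (bit < 64 && (mask & (UINT64_C(1) << bit)))
			bit++;
		end = bit - 1;

		if (start == end)
			n = snprintf(buf + used, len - used, "%s%d",
				     used ? "," : "", start);
		else
			n = snprintf(buf + used, len - used, "%s%d-%d",
				     used ? "," : "", start, end);

		/* On truncation drop the partial entry and stop */
		if (n < 0 || (size_t)n >= len - used) {
			buf[used] = '\0';
			break;
		}
		used += (size_t)n;
	}
	return used;
}
```

On the 4 CPU test machine a mask of 0xe (CPUs 1, 2 and 3) would come out as
"1-3", and an empty mask as an empty string.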

Thanks,

tglx
---
--- a/kernel/irq/cpuhotplug.c
+++ b/kernel/irq/cpuhotplug.c
@@ -57,7 +57,8 @@ static bool migrate_one_irq(struct irq_d
 	bool maskchip = !irq_can_move_pcntxt(d) && !irqd_irq_masked(d);
 	const struct cpumask *affinity;
 	bool brokeaff = false;
-	int err;
+	int err, irq = d->irq;
+	bool move_pending;

 	/*
 	 * IRQ chip might be already torn down, but the irq descriptor is
@@ -101,10 +102,16 @@ static bool migrate_one_irq(struct irq_d
 	 * there is no move pending or the pending mask does not contain
 	 * any online CPU, use the current affinity mask.
 	 */
-	if (irq_fixup_move_pending(desc, true))
+	move_pending = irqd_is_setaffinity_pending(d);
+	if (irq_fixup_move_pending(desc, true)) {
 		affinity = irq_desc_get_pending_mask(desc);
-	else
+		pr_info("IRQ %3d: move_pending=%d pending mask: %*pbl\n",
+			irq, move_pending, cpumask_pr_args(affinity));
+	} else {
 		affinity = irq_data_get_affinity_mask(d);
+		pr_info("IRQ %3d: move_pending=%d affinity mask: %*pbl\n",
+			irq, move_pending, cpumask_pr_args(affinity));
+	}

 	/* Mask the chip for interrupts which cannot move in process context */
 	if (maskchip && chip->irq_mask)
@@ -136,6 +143,9 @@ static bool migrate_one_irq(struct irq_d
 		brokeaff = false;
 	}

+	affinity = irq_data_get_effective_affinity_mask(d);
+	pr_info("IRQ %3d: Done: %*pbl\n", irq, cpumask_pr_args(affinity));
+
 	if (maskchip && chip->irq_unmask)
 		chip->irq_unmask(d);