Re: [PATCH] irqdomain: Fix driver re-inserting failures when IRQs not being freed completely

From: Thomas Gleixner
Date: Fri Aug 25 2023 - 14:01:29 EST


On Thu, Jul 20 2023 at 20:24, Jie Zhan wrote:
> Since commit 4615fbc3788d ("genirq/irqdomain: Don't try to free an
> interrupt that has no mapping"), we have found failures when
> re-inserting some specific drivers:
>
> [root@localhost ~]# rmmod hisi_sas_v3_hw
> [root@localhost ~]# modprobe hisi_sas_v3_hw
> [ 1295.622525] hisi_sas_v3_hw: probe of 0000:30:04.0 failed with error -2
>
> This comes from the case where some IRQs allocated from a low-level domain,
> e.g. GIC ITS, are not freed completely, leaving some leaked. Thus, the next
> driver insertion fails to get the same number of IRQs because some IRQs are
> still occupied.

Why?

> Free a contiguous group of IRQs in one go to fix this issue.

Again, why?

> @@ -1445,13 +1445,24 @@ static void irq_domain_free_irqs_hierarchy(struct irq_domain *domain,
>  					   unsigned int nr_irqs)
>  {
>  	unsigned int i;
> +	int n;
>
>  	if (!domain->ops->free)
>  		return;
>
>  	for (i = 0; i < nr_irqs; i++) {
> -		if (irq_domain_get_irq_data(domain, irq_base + i))
> -			domain->ops->free(domain, irq_base + i, 1);
> +		/* Find the largest possible span of IRQs to free in one go */
> +		for (n = 0;
> +			((i + n) < nr_irqs) &&
> +			 (irq_domain_get_irq_data(domain, irq_base + i + n));
> +			n++)
> +			;

For one, this is unreadable gunk. But what's worse, it still does not
explain what this is solving.
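
For reference, what the quoted inner loop computes is the length of the
run of consecutive mapped interrupts starting at irq_base + i. A
standalone sketch of that search follows. It is not kernel code: the
bool array is a hypothetical stand-in for the
irq_domain_get_irq_data() check, and mapped_span_len() is a made-up
name used only for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Length of the run of consecutive mapped entries starting at 'start'.
 * 'mapped' is a hypothetical stand-in for the result of
 * irq_domain_get_irq_data(domain, irq) != NULL for each interrupt.
 */
static unsigned int mapped_span_len(const bool *mapped, unsigned int nr,
				    unsigned int start)
{
	unsigned int n = 0;

	/* Stop at the end of the range or at the first unmapped entry. */
	while (start + n < nr && mapped[start + n])
		n++;

	return n;
}
```

With the run length in hand, a caller would free that many entries in
one call and skip ahead by the same amount. Whether the core should do
that at all is the open question here.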

It's completely sensible to expect that freeing interrupts in a range
one by one just works.

So why do we need to work around an obvious low level failure in the
core code?
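
The expectation above can be stated as a property: freeing a range one
entry at a time must leave the same state as freeing it in one call,
with nothing leaked. A minimal sketch of that property on a toy
allocator follows. Everything in it (struct toy_domain, toy_alloc(),
toy_free(), and so on) is hypothetical and only models a
->free(domain, virq, nr_irqs) style callback; it does not reproduce any
real low level domain.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define TOY_NR_SLOTS 32

/* Hypothetical toy allocator: one flag per slot, true means "in use". */
struct toy_domain {
	bool used[TOY_NR_SLOTS];
};

/* Mark n slots starting at base as in use. */
static void toy_alloc(struct toy_domain *d, unsigned int base, unsigned int n)
{
	for (unsigned int i = 0; i < n; i++)
		d->used[base + i] = true;
}

/* Free n slots starting at base; models a ->free() style callback. */
static void toy_free(struct toy_domain *d, unsigned int base, unsigned int n)
{
	for (unsigned int i = 0; i < n; i++)
		d->used[base + i] = false;
}

/* Count slots still in use, i.e. leaked after a teardown. */
static unsigned int toy_leaked(const struct toy_domain *d)
{
	unsigned int count = 0;

	for (unsigned int i = 0; i < TOY_NR_SLOTS; i++)
		count += d->used[i];

	return count;
}

/* True if freeing one by one matches freeing the whole range at once. */
static bool toy_free_is_consistent(unsigned int base, unsigned int n)
{
	struct toy_domain a, b;

	memset(&a, 0, sizeof(a));
	memset(&b, 0, sizeof(b));
	toy_alloc(&a, base, n);
	toy_alloc(&b, base, n);

	for (unsigned int i = 0; i < n; i++)
		toy_free(&a, base + i, 1);	/* one at a time */
	toy_free(&b, base, n);			/* whole range in one go */

	return toy_leaked(&a) == 0 && toy_leaked(&b) == 0 &&
	       memcmp(&a, &b, sizeof(a)) == 0;
}

/* Self-check of the property for one example range. */
static void toy_selfcheck(void)
{
	assert(toy_free_is_consistent(4, 8));
}
```

A low level free callback for which the equivalent check fails is
broken in its own right, and that is where the fix belongs.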

Thanks,

tglx