Re: [PATCH] irqchip/gic-v3: handle DOMAIN_BUS_ANY in gic_irq_domain_select

From: Marc Zyngier
Date: Mon Feb 19 2024 - 11:38:04 EST


On Mon, 19 Feb 2024 16:21:06 +0000,
Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx> wrote:
>
> On Mon, 19 Feb 2024 at 17:53, Marc Zyngier <maz@xxxxxxxxxx> wrote:
> >
> > On Mon, 19 Feb 2024 14:47:37 +0000,
> > Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx> wrote:
> > >
> > > Before the commit de1ff306dcf4 ("genirq/irqdomain: Remove the param
> > > count restriction from select()") the irq_find_matching_fwspec() was
> > > handling the DOMAIN_BUS_ANY on its own. After this commit it is a job of
> > > the select() callback. However the callback of GICv3 (even though it got
> > > modified to handle zero param_count) wasn't prepared to return true for
> > > DOMAIN_BUS_ANY bus_token.
> > >
> > > This breaks probing of any of the child IRQ domains, since
> > > platform_irqchip_probe() uses irq_find_matching_host(par_np,
> > > DOMAIN_BUS_ANY) to check for the presence of the parent IRQ domain.
> > >
> > > Fixes: 151378251004 ("irqchip/gic-v3: Make gic_irq_domain_select() robust for zero parameter count")
> > > Fixes: de1ff306dcf4 ("genirq/irqdomain: Remove the param count restriction from select()")
> > > Signed-off-by: Dmitry Baryshkov <dmitry.baryshkov@xxxxxxxxxx>
> > > ---
> > > drivers/irqchip/irq-gic-v3.c | 3 ++-
> > > 1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
> > > index 6fb276504bcc..e9e9643c653f 100644
> > > --- a/drivers/irqchip/irq-gic-v3.c
> > > +++ b/drivers/irqchip/irq-gic-v3.c
> > > @@ -1696,7 +1696,8 @@ static int gic_irq_domain_select(struct irq_domain *d,
> > >
> > > /* Handle pure domain searches */
> > > if (!fwspec->param_count)
> > > - return d->bus_token == bus_token;
> > > + return d->bus_token == bus_token ||
> > > + bus_token == DOMAIN_BUS_ANY;
> > >
> > > /* If this is not DT, then we have a single domain */
> > > if (!is_of_node(fwspec->fwnode))
> > >
> >
> > I really dislike the look of this. If that's the case, any irqchip
> > that has a 'select' method (such as imx-intmux) should be similarly
> > hacked. And at this point, this should be handled by the core code.
> >
> > Can you try this instead? I don't have any HW that relies on this
> > behaviour, but I'd expect this to work.
> >
> > Thanks,
> >
> > M.
> >
> > diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> > index aeb41655d6de..3dd1c871e091 100644
> > --- a/kernel/irq/irqdomain.c
> > +++ b/kernel/irq/irqdomain.c
> > @@ -449,7 +449,7 @@ struct irq_domain *irq_find_matching_fwspec(struct irq_fwspec *fwspec,
> > */
> > mutex_lock(&irq_domain_mutex);
> > list_for_each_entry(h, &irq_domain_list, link) {
> > - if (h->ops->select)
> > + if (h->ops->select && bus_token != DOMAIN_BUS_ANY)
> > rc = h->ops->select(h, fwspec, bus_token);
> > else if (h->ops->match)
> > rc = h->ops->match(h, to_of_node(fwnode), bus_token);
>
> This works. But I wonder if the following change is even better. WDYT?
>
> diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> index aeb41655d6de..2f0d2700709e 100644
> --- a/kernel/irq/irqdomain.c
> +++ b/kernel/irq/irqdomain.c
> @@ -449,14 +449,17 @@ struct irq_domain *irq_find_matching_fwspec(struct irq_fwspec *fwspec,
> */
> mutex_lock(&irq_domain_mutex);
> list_for_each_entry(h, &irq_domain_list, link) {
> - if (h->ops->select)
> + if (fwnode != NULL &&
> + h->fwnode == fwnode &&
> + bus_token == DOMAIN_BUS_ANY)
> + rc = true;
> + else if (h->ops->select)
> rc = h->ops->select(h, fwspec, bus_token);
> else if (h->ops->match)
> rc = h->ops->match(h, to_of_node(fwnode), bus_token);
> else
> rc = ((fwnode != NULL) && (h->fwnode == fwnode) &&
> - ((bus_token == DOMAIN_BUS_ANY) ||
> - (h->bus_token == bus_token)));
> + (h->bus_token == bus_token));
>
> if (rc) {
> found = h;
>

Can't say I like it either. It duplicates the existing check without
any obvious benefit. Honestly, this code is shit enough that we should
try to make it simpler, not more complex...

I'd rather we keep the impact as minimal as possible, and use the
upcoming weeks to weed out the effects of these changes (there is
another report of some Renesas machine falling over itself here[1]).
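
For reference, the effect of the minimal core change can be modelled in
user space. This is only a sketch, not kernel code: the types are cut-down
stand-ins for the real ones, gic_like_select() mimics just the "pure domain
search" branch quoted above, the match() path is left out, and
find_domain(), lookup_finds_gic() and the skip_select_for_any flag are
hypothetical names made up for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel definitions (assumption). */
enum irq_domain_bus_token { DOMAIN_BUS_ANY = 0, DOMAIN_BUS_WIRED };

struct irq_fwspec { const void *fwnode; int param_count; };

struct irq_domain;
struct irq_domain_ops {
	int (*select)(struct irq_domain *d, struct irq_fwspec *fwspec,
		      enum irq_domain_bus_token bus_token);
};

struct irq_domain {
	const void *fwnode;
	enum irq_domain_bus_token bus_token;
	const struct irq_domain_ops *ops;
};

/* Models only the zero-param_count branch of a GICv3-style select(). */
static int gic_like_select(struct irq_domain *d, struct irq_fwspec *fwspec,
			   enum irq_domain_bus_token bus_token)
{
	if (fwspec->fwnode != d->fwnode)
		return 0;
	if (!fwspec->param_count)
		return d->bus_token == bus_token;
	return 1; /* remaining cases elided in this sketch */
}

static const struct irq_domain_ops gic_like_ops = { .select = gic_like_select };

/*
 * Hypothetical model of the lookup loop. With skip_select_for_any set,
 * a DOMAIN_BUS_ANY lookup bypasses select() and uses the default check.
 */
static struct irq_domain *find_domain(struct irq_domain *list, size_t n,
				      struct irq_fwspec *fwspec,
				      enum irq_domain_bus_token bus_token,
				      bool skip_select_for_any)
{
	assert(fwspec != NULL);

	for (size_t i = 0; i < n; i++) {
		struct irq_domain *h = &list[i];
		int rc;

		if (h->ops && h->ops->select &&
		    !(skip_select_for_any && bus_token == DOMAIN_BUS_ANY))
			rc = h->ops->select(h, fwspec, bus_token);
		else
			rc = fwspec->fwnode != NULL &&
			     h->fwnode == fwspec->fwnode &&
			     (bus_token == DOMAIN_BUS_ANY ||
			      h->bus_token == bus_token);

		if (rc)
			return h;
	}
	return NULL;
}

/* One wired domain, looked up with DOMAIN_BUS_ANY and no parameters. */
static bool lookup_finds_gic(bool skip_select_for_any)
{
	static const int node; /* dummy object standing in for a fwnode */
	struct irq_domain domains[] = {
		{ .fwnode = &node, .bus_token = DOMAIN_BUS_WIRED,
		  .ops = &gic_like_ops },
	};
	struct irq_fwspec fwspec = { .fwnode = &node, .param_count = 0 };

	return find_domain(domains, 1, &fwspec, DOMAIN_BUS_ANY,
			   skip_select_for_any) != NULL;
}
```

In this model lookup_finds_gic(false) fails, because select() compares
DOMAIN_BUS_WIRED against DOMAIN_BUS_ANY, while lookup_finds_gic(true)
succeeds through the default fwnode check.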

Thanks,

M.

[1] https://lore.kernel.org/all/170802702416.398.14922976721740218856.tip-bot2@tip-bot2

--
Without deviation from the norm, progress is not possible.