Re: irqdomain API: how to set affinity of parent irq of chained irqs?

From: Radu Rendec
Date: Thu Apr 20 2023 - 18:22:55 EST


Hi Marc,

On Fri, 2023-04-07 at 10:18 +0100, Marc Zyngier wrote:
> On Fri, 07 Apr 2023 00:56:40 +0100, Radu Rendec <rrendec@xxxxxxxxxx> wrote:
> > Are you aware of any work being done (or having been done) in this
> > area? Thanks in advance!
> >
> > My colleagues and I are looking into picking this up and implementing
> > the new sysfs interface and the related irqbalance changes, and we are
> > currently evaluating the level of effort. Obviously, we would like to
> > avoid any effort duplication.
>
> I don't think anyone ever tried it (it's far easier to just moan about
> it than to do anything useful). But if you want to start looking into
> that, that'd be great.

Thanks for the feedback, and sorry for the late reply. It looks like I
have already started: I have been working on a "sandbox" driver that
implements hierarchical/muxed interrupts and would let me test in a
generic environment, without requiring mux hardware or messing with
real interrupts.

But first, I would like to clarify something, just to make sure I'm on
the right track. It looks to me like, with the hierarchical IRQ domain
API, there is always a 1:1 end-to-end mapping between a virq and the
hwirq closest to the CPU. IOW, there is a 1:1 mapping between a given
virq and the corresponding hwirq in each IRQ domain along the chain,
and there is no other virq in-between. I looked at many of the irqchip
drivers that implement the hierarchical API and couldn't find a single
one that does muxed IRQs. Furthermore, the revmap in struct irq_domain
is clearly a 1:1 map, so when an IRQ vector is entered, there is no
way to map the hwirq back to multiple virqs (and run the associated
handlers). I tried it in my test driver, and if the .alloc domain op
implementation allocates the same hwirq in the parent domain for two
different virqs, the revmap slot in the parent domain is simply
overwritten.
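
For reference, this is roughly the 1:1 .alloc pattern I am talking
about; a minimal sketch, where sandbox_chip is a hypothetical irq_chip
from my test driver and the fwspec handling is simplified:

static int sandbox_domain_alloc(struct irq_domain *domain,
                                unsigned int virq, unsigned int nr_irqs,
                                void *arg)
{
        struct irq_fwspec *fwspec = arg;
        irq_hw_number_t hwirq = fwspec->param[0];
        struct irq_fwspec parent_fwspec = *fwspec;
        int i, ret;

        for (i = 0; i < nr_irqs; i++) {
                /* 1:1 virq<->hwirq binding in this domain */
                ret = irq_domain_set_hwirq_and_chip(domain, virq + i,
                                                    hwirq + i,
                                                    &sandbox_chip, NULL);
                if (ret)
                        return ret;
        }

        /*
         * 1:1 allocation in the parent domain as well; allocating the
         * same parent hwirq for two different virqs would overwrite
         * the parent's revmap slot, as described above.
         */
        parent_fwspec.fwnode = domain->parent->fwnode;
        return irq_domain_alloc_irqs_parent(domain, virq, nr_irqs,
                                            &parent_fwspec);
}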

If my understanding is correct, muxed IRQs are not possible with the
hierarchical IRQ domain API. That means that in this particular case,
changing the affinity of one IRQ can never indirectly change the
affinity of a different IRQ, because hwirqs are never shared. So this
is just a matter of exposing the affinity through the new sysfs API
for every irqchip driver that opts in.

On the other hand, muxed IRQs *are* possible with the legacy API, and
drivers/irqchip/irq-imx-intmux.c is a clear example of that. However,
in this case one or more additional virqs exist at the mux level, and
it is the mux-level virq handler that implements the logic to invoke
the appropriate downstream (child) virq handler(s). The virq(s) at the
mux level and all the corresponding downstream virqs then share the
same affinity setting, because they also share the same hwirq in the
root domain (which is where affinity is actually implemented). And
yes, in this case the relationship between these virqs is currently
not tracked anywhere. Is this what you had in mind when you mentioned
below a "new infrastructure to track muxed interrupts"?
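
For completeness, this is the chained-handler pattern I mean; a
minimal sketch modeled on what drivers like irq-imx-intmux.c do, where
mux_data and MUX_STATUS are made-up names for the driver state and the
pending register:

static void mux_irq_handler(struct irq_desc *desc)
{
        struct mux_data *priv = irq_desc_get_handler_data(desc);
        struct irq_chip *chip = irq_desc_get_chip(desc);
        unsigned long pending;
        int hwirq;

        chained_irq_enter(chip, desc);

        /* Read the hypothetical pending-bits register of the mux. */
        pending = readl(priv->base + MUX_STATUS);

        /* Invoke the downstream (child) handler for each pending bit. */
        for_each_set_bit(hwirq, &pending, 32)
                generic_handle_domain_irq(priv->domain, hwirq);

        chained_irq_exit(chip, desc);
}

The parent virq's affinity is the only one that matters here: wherever
it points, that is where all the child handlers end up running.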

> One of my concern is that allowing affinity changes for chained
> interrupt may uncover issues in existing drivers, so it would have to
> be an explicit buy-in for any chained irqchip. That's probably not too
> hard to achieve anyway given that you'll need some new infrastructure
> to track the muxed interrupts.

The first thing that comes to mind for the "explicit buy-in" is a new
function pointer in struct irq_chip that sets the affinity in a
mux-aware manner. Something like irq_set_affinity_shared or _chained.
I may not see the whole picture yet, but so far my thinking is that
the existing irq_set_affinity must remain unchanged in order to
preserve the compatibility/behavior of the procfs interface.
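
To illustrate what I have in mind, here is a rough sketch from the
driver side; the callback name is just a strawman, it would mirror the
signature of the existing ->irq_set_affinity, and mux_data/parent_irq
are made-up driver internals:

/* Hypothetical opt-in, mux-aware callback for a chained irqchip. */
static int mux_irq_set_affinity_shared(struct irq_data *d,
                                       const struct cpumask *dest,
                                       bool force)
{
        struct mux_data *priv = irq_data_get_irq_chip_data(d);

        /*
         * Forward the request to the parent interrupt. Every virq
         * muxed onto that parent hwirq implicitly moves along with
         * it, and implementing this callback is how the driver
         * explicitly accepts that side effect.
         */
        return irq_set_affinity(priv->parent_irq, dest);
}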

> Hopefully this will result in something actually happening! ;-)

I really hope so. I am also excited to have the opportunity to work on
this. I will likely need your guidance along the way, but I think it's
better to talk in advance than to submit a huge patch series that
makes no sense :)

Thanks,
Radu