Re: [PATCH] irqdomain: Fix mapping-creation race

From: Marc Zyngier
Date: Thu Jul 28 2022 - 09:14:34 EST


On Thu, 28 Jul 2022 13:56:41 +0100,
Johan Hovold <johan@xxxxxxxxxx> wrote:
>
> On Thu, Jul 28, 2022 at 12:48:23PM +0100, Marc Zyngier wrote:
> > On Thu, 28 Jul 2022 10:27:10 +0100,
> > Johan Hovold <johan+linaro@xxxxxxxxxx> wrote:
> > >
> > > Parallel probing (e.g. due to asynchronous probing) of devices that share
> > > interrupts can currently result in two mappings being created for the
> > > same hardware interrupt.
> >
> > And I thought nobody would be using shared interrupts anymore. Turns
> > out people are still building braindead HW... :-/
> >
> > >
> > > Add a serialising mapping mutex so that looking for an existing mapping
> > > before creating a new one is done atomically.
> > >
> > > Note that serialising the lookup and creation in
> > > irq_create_mapping_affinity() would have been enough to prevent the
> > > duplicate mapping, but that could instead cause
> > > irq_create_fwspec_mapping() to fail when there is a race.
> > >
> > > Fixes: 765230b5f084 ("driver-core: add asynchronous probing support for drivers")
> > > Fixes: b62b2cf5759b ("irqdomain: Fix handling of type settings for existing mappings")
> > > Cc: Dmitry Torokhov <dtor@xxxxxxxxxxxx>
> > > Cc: Jon Hunter <jonathanh@xxxxxxxxxx>
> > > Signed-off-by: Johan Hovold <johan+linaro@xxxxxxxxxx>
> > > ---
> > > kernel/irq/irqdomain.c | 46 +++++++++++++++++++++++++++++++-----------
> > > 1 file changed, 34 insertions(+), 12 deletions(-)
> > >
> > > diff --git a/kernel/irq/irqdomain.c b/kernel/irq/irqdomain.c
> > > index 8fe1da9614ee..d263a7dd4170 100644
> > > --- a/kernel/irq/irqdomain.c
> > > +++ b/kernel/irq/irqdomain.c
> > > @@ -22,6 +22,7 @@
> > >
> > > static LIST_HEAD(irq_domain_list);
> > > static DEFINE_MUTEX(irq_domain_mutex);
> > > +static DEFINE_MUTEX(irq_mapping_mutex);
> >
> > I'd really like to avoid a global mutex. At the very least this should
> > be a per-domain mutex, otherwise this will serialise a lot more than
> > what is needed.
>
> Yeah, I considered that too, but wanted to get your comments on this
> first.
>
> Also note that the likewise global irq_domain_mutex (and
> sparse_irq_lock) are taken in some of these paths so perhaps using finer
> locking won't actually matter that much as this is mostly for parallel
> probing.

It will be a good opportunity to make the locking suck a bit less,
like in irq_domain_associate().
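
To make the per-domain idea concrete, here is a rough user-space sketch,
not kernel code. It assumes a pthread mutex as a stand-in for the kernel
mutex, and every toy_* name is hypothetical. The point it illustrates is
that the "does a mapping exist?" check and the creation sit under one
lock that belongs to the domain, so racing callers on the same hwirq end
up with a single mapping while other domains are not serialised.

```c
#include <assert.h>
#include <pthread.h>

#define TOY_MAX_HWIRQ 16
#define TOY_NPROBES   8

/* Hypothetical stand-in for an irq domain; not the kernel's struct. */
struct toy_domain {
	pthread_mutex_t map_lock;          /* per-domain, not global */
	unsigned int virq[TOY_MAX_HWIRQ];  /* 0 means "no mapping yet" */
	unsigned int next_virq;            /* next number to hand out */
	unsigned int created;              /* how many mappings were created */
};

static void toy_domain_init(struct toy_domain *d)
{
	pthread_mutex_init(&d->map_lock, NULL);
	for (int i = 0; i < TOY_MAX_HWIRQ; i++)
		d->virq[i] = 0;
	d->next_virq = 1;
	d->created = 0;
}

/* Lookup and creation happen under the same lock, so they are atomic. */
static unsigned int toy_create_mapping(struct toy_domain *d, unsigned int hwirq)
{
	unsigned int virq;

	if (hwirq >= TOY_MAX_HWIRQ)
		return 0;

	pthread_mutex_lock(&d->map_lock);
	virq = d->virq[hwirq];
	if (!virq) {
		virq = d->next_virq++;
		d->virq[hwirq] = virq;
		d->created++;
	}
	pthread_mutex_unlock(&d->map_lock);

	return virq;
}

struct toy_probe_arg {
	struct toy_domain *d;
	unsigned int hwirq;
	unsigned int virq;
};

/* Models one device probe asking for a mapping of a shared interrupt. */
static void *toy_probe(void *p)
{
	struct toy_probe_arg *a = p;

	a->virq = toy_create_mapping(a->d, a->hwirq);
	assert(a->virq != 0);
	return NULL;
}

/*
 * Race TOY_NPROBES threads on one hwirq. Returns the number of mappings
 * created, or 0 if the threads did not all see the same virq.
 */
static unsigned int toy_race_demo(void)
{
	struct toy_domain d;
	pthread_t t[TOY_NPROBES];
	struct toy_probe_arg args[TOY_NPROBES];
	unsigned int result;

	toy_domain_init(&d);
	for (int i = 0; i < TOY_NPROBES; i++) {
		args[i] = (struct toy_probe_arg){ &d, 5, 0 };
		pthread_create(&t[i], NULL, toy_probe, &args[i]);
	}
	for (int i = 0; i < TOY_NPROBES; i++)
		pthread_join(t[i], NULL);

	result = d.created;
	for (int i = 1; i < TOY_NPROBES; i++)
		if (args[i].virq != args[0].virq)
			result = 0;

	pthread_mutex_destroy(&d.map_lock);
	return result;
}
```

Calling toy_race_demo() from a test harness should report exactly one
mapping created, however the threads interleave.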

> > > } else {
> > > /* Create mapping */
> > > - virq = irq_create_mapping(domain, hwirq);
> > > + virq = __irq_create_mapping_affinity(domain, hwirq, NULL);
> >
> > This rechecks for the existence of the mapping. Surely we can do a bit
> > better by rejigging this (admittedly bitrotting) code.
>
> I'm sure we can. Should I try to fix the race first with a patch like
> this one that can potentially be backported, and then see what I can do
> about cleaning this up?
>
> After all it has looked like this for the past eight years, ever since
> this code was first merged.

No, let's put the code in shape *first*, then work on the locking,
as it should make the patch simpler. Backports aren't my concern,
really.
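
One possible shape for the cleaned-up code, purely as an illustration and
not a proposal for the actual irqdomain.c layout: the entry point takes
the lock and does the lookup exactly once, and the creation helper
assumes the caller has already established that no mapping exists, so
nothing gets rechecked. This is a user-space sketch with a pthread mutex,
and all sketch_* names are made up for the example.

```c
#include <assert.h>
#include <pthread.h>

#define SKETCH_MAX_HWIRQ 16

/* Hypothetical domain model; not the kernel's struct irq_domain. */
struct sketch_domain {
	pthread_mutex_t lock;
	unsigned int virq[SKETCH_MAX_HWIRQ]; /* 0 means "no mapping" */
	unsigned int next_virq;
	unsigned int lookups;                /* counts lookups, for the demo */
};

#define SKETCH_DOMAIN_INIT \
	{ .lock = PTHREAD_MUTEX_INITIALIZER, .next_virq = 1 }

/* Caller must hold d->lock. */
static unsigned int sketch_find_locked(struct sketch_domain *d,
				       unsigned int hwirq)
{
	d->lookups++;
	return d->virq[hwirq];
}

/* Caller must hold d->lock and must have seen that no mapping exists. */
static unsigned int sketch_create_locked(struct sketch_domain *d,
					 unsigned int hwirq)
{
	unsigned int virq;

	assert(d->virq[hwirq] == 0); /* documents the precondition */
	virq = d->next_virq++;
	d->virq[hwirq] = virq;
	return virq;
}

/* Single entry point: one lock section, one lookup, optional creation. */
static unsigned int sketch_map(struct sketch_domain *d, unsigned int hwirq)
{
	unsigned int virq;

	if (hwirq >= SKETCH_MAX_HWIRQ)
		return 0;

	pthread_mutex_lock(&d->lock);
	virq = sketch_find_locked(d, hwirq);
	if (!virq)
		virq = sketch_create_locked(d, hwirq);
	pthread_mutex_unlock(&d->lock);

	return virq;
}
```

With the lookup and the creation split like this, adding the locking
afterwards only touches the single entry point.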

Thanks,

M.

--
Without deviation from the norm, progress is not possible.