Re: [PATCH 2/2] genirq: proc: fix a procfs entry leak

From: Thomas Gleixner
Date: Tue Aug 29 2023 - 05:13:06 EST


On Tue, Aug 29 2023 at 08:26, Bartosz Golaszewski wrote:
> On Mon, Aug 28, 2023 at 11:44 PM Thomas Gleixner <tglx@xxxxxxxxxxxxx> wrote:
>> That's the module which provides the interrupt domain and hid-whatever
>> is the one which requests the interrupt, no?
>>
> Not at all! This is what I said: "we have the hid-cp2112 module which
> drives a GPIO-and-I2C-expander-on-a-USB-stick". Meaning: the
> hid-cp2112 module registers a USB driver for a GPIO expander on a
> stick. This GPIO chip is the interrupt controller here. It's the USB
> stick that provides interrupts for whatever else needs them (in real
> life: it can be an IIO device on the I2C bus which signals some events
> over the GPIOs). The user can get the interrupt number using the
> gpiod_to_irq() function. It can be unplugged at any moment and module
> references will not stop the USB bus from unbinding it.

Sorry for my confusion, but this all is confusing at best.

So what you are saying is that the GPIO driver, which creates the
interrupt domain is unbound and that unbind destroys the interrupt
domain, right? IOW, the wonderful world of plug and pray.

Let's look at the full picture again.

USB -> USB-device
|----------- GPIO
|------------I2C ---------- I2C-device
(hid-cp2112 driver) (i2c-xx-driver)

i2x-xx-driver is the one which requests the interrupt from
hid-cp2112-GPIO, right?

So when the USB device disconnects, then something needs to tell the
i2c-xx-driver that the I2C interface is not longer available, right?

IOW, the unbind operation needs the following notification and teardown
order:

1) USB core notifies hid-cp2112

2) hid-cp2112 notifies i2c-xx-driver

3) i2c-xx-driver mops up and invokes free_irq()

4) hid-cp2112 removes the interrupt domain

But it seems that you end up with a situation where the notification of
the i2c-xx-driver is either not happening or the driver in question is
simply failing to mop up and free the requested interrupt.

As a consequence you want to work around it by mopping up the requested
interrupts somewhere else.

It's not clear to me what you are trying to achieve here.

If this is meant as a hardening effort to catch such issues and let
the kernel gracefully "survive", then I'm all ears.

If this is just meant to paper over the shortcomings of subsystems or
driver implemtations then I'm not interested at all.

Thanks,

tglx