Re: [PATCH 2/2] gpiolib: cdev: release IRQs when the gpio chip device is removed

From: Kent Gibson
Date: Tue Feb 20 2024 - 19:26:10 EST


On Tue, Feb 20, 2024 at 07:26:57PM +0100, Herve Codina wrote:
> Hi Kent,
>
> On Tue, 20 Feb 2024 22:29:59 +0800
> Kent Gibson <warthog618@xxxxxxxxx> wrote:
>

..

>
> I can call gcdev_unregister() right after gdev->chip = NULL.
> The needed things is to have free_irq() (from the gcdev_unregister()) called
> before calling gpiochip_irqchip_remove().
>
> And so, why not:
> --- 8< ---
> ...
> gpiochip_free_hogs(gc);
> /* Numb the device, cancelling all outstanding operations */
> gdev->chip = NULL;
> gcdev_unregister(gdev);
> gpiochip_irqchip_remove(gc);
> acpi_gpiochip_remove(gc);
> of_gpiochip_remove(gc);
> gpiochip_remove_pin_ranges(gc);
> ...
> --- 8< ---
>

Exactly - why not. Though adding some comments to the code as to why
that ordering is required, as per the numbing, would be helpful for
maintainability.

> >
> > There is also a race here with linereq_set_config(). That can be prevented
> > by holding the lr->config_mutex - assuming the notifier is not being called
> > from atomic context.
>
> I missed that one and indeed, I probably can take the mutex. With the mutex
> holded, no more race condition with linereq_set_config() and so the IRQ cannot
> be re-requested.
>
> >
> > You also have a race with the line being freed that could pull the
> > lr out from under you, so a use after free problem.
>
> I probably missed something but I don't see this use after free.
> Can you give me some details/pointers ?
>

What is to prevent userspace releasing the request and freeing the
linereq while you use it? The use after free is anywhere that is
possible.

AIUI, from the userspace side that is prevented by the file handle not
being closed, and so linereq_release() not being called, while ioctls
are in flight. But as you are calling in from the kernel side there is
nothing in place to prevent userspace freeing the linereq while you are
using it.

>
> > I'd rather live with the warning :(.
> > Fixing that requires rethinking the lifecycle management for the
> > linereq/lineevent.
>
> Well, currently the warning is a big one with a dump_stack included.
> It will be interesting to have it fixed.
>
> The need to fix it is to have free_irq() called before
> gpiochip_irqchip_remove();
>
> Is there really no way to have this correct sequence without rethinking all
> the lifecycle management ?
>

Not that I am aware of. You need to protect against the linereq
being freed while you are using it, which is by definition is lifecycle
management. Though it isn't necessarily __all__ the lifecycle management.

> Also, after the warning related to the IRQ, the following one is present:
> --- 8< ---
> [ 9593.527961] gpio gpiochip9: REMOVING GPIOCHIP WITH GPIOS STILL REQUESTED
> [ 9593.535602] ------------[ cut here ]------------
> [ 9593.540244] WARNING: CPU: 0 PID: 309 at drivers/gpio/gpiolib.c:2352 gpiod_free.part.0+0x20/0x48
> ...
> [ 9593.725016] Call trace:
> [ 9593.727468] gpiod_free.part.0+0x20/0x48
> [ 9593.731404] gpiod_free+0x14/0x24
> [ 9593.734728] lineevent_free+0x40/0x74
> [ 9593.738402] lineevent_release+0x14/0x24
> [ 9593.742335] __fput+0x70/0x2bc
> [ 9593.745403] __fput_sync+0x50/0x5c
> [ 9593.748817] __arm64_sys_close+0x38/0x7c
> [ 9593.752751] invoke_syscall+0x48/0x114
> ...
> [ 9593.815299] ---[ end trace 0000000000000000 ]---
> [ 9593.820616] hotplug-manager dock-hotplug-manager: remove overlay 0 (ovcs id 1)
> gpiomon: error waiting for events: No such device
> #
> --- 8< ---
>

My understanding is that we now handle that case - that is what
the gdev->device_notifier chain is for, and gpiolib and gpiolib-cdev now
perform a controlled cleanupi - so that warning is obsolete and should be
removed from gpiochip_remove().

IIRC, previously you would've gotten a panic shortly after that warning.
Now you get gpiomon noticing that the device has been removed.

Btw, where I mention linereq here, the same applies to lineevent and
linehandle - the uAPI v1 equivalents of linereq.

Cheers,
Kent.