Re: [PATCH v3] gpiolib: fix reference leaks when removing GPIO chips still in use

From: Andy Shevchenko
Date: Tue Aug 15 2023 - 10:44:51 EST


On Tue, Aug 15, 2023 at 03:07:50PM +0200, Linus Walleij wrote:
> On Tue, Aug 15, 2023 at 2:57 PM Andy Shevchenko
> <andriy.shevchenko@xxxxxxxxxxxxxxx> wrote:
> > On Tue, Aug 15, 2023 at 01:40:22PM +0200, Linus Walleij wrote:
> > > On Tue, Aug 15, 2023 at 11:50 AM Andy Shevchenko
> > > <andriy.shevchenko@xxxxxxxxxxxxxxx> wrote:
> > > > On Fri, Aug 11, 2023 at 09:30:34PM +0200, Bartosz Golaszewski wrote:

...

> > > > > + module_put(desc->gdev->owner);
> > > > > + gpio_device_put(desc->gdev);
> > > >
> > > > So, if gdev can be NULL, you will get an Oops with new code.
> > >
> > > I read it such that gdev->chip can be NULL, but not gdev,
> > > and desc->gdev->owner is fine to reference?
> >
> > Basically the Q is
> > "if desc is non-NULL, does it guarantee that gdev is non-NULL either?"
>
> gdev->desc is assigned in one single spot, which is in
> gpiochip_add_data_with_key():
>
> for (i = 0; i < gc->ngpio; i++)
> gdev->descs[i].gdev = gdev;
>
> It is never assigned anywhere else, so I guess yes.
>
> We may also ask if it is ever invalid (i.e. if desc->gdev can point to
> junk).
>
> A gdev turns to junk when its reference count goes down to zero
> and gpiodev_release() is called effectively calling kfree() on the
> struct gpio_device *.
>
> But that can only happen as a result of module_put() getting
> called, pulling the references down to zero. Which is what we
> are discussing. The line after module_put(), desc->gdev
> *could* be NULL.

Yes.

> But then we just call gpio_device_put(desc->gdev) which is
> just a call to device_put(), which is NULL-tolerant.

But gpio_device_put() does not NULL tolerant.
So, oops in this line then.

--
With Best Regards,
Andy Shevchenko