Re: [PATCH v3] gpiolib: fix reference leaks when removing GPIO chips still in use

From: Linus Walleij
Date: Tue Aug 15 2023 - 09:08:56 EST


On Tue, Aug 15, 2023 at 2:57 PM Andy Shevchenko
<andriy.shevchenko@xxxxxxxxxxxxxxx> wrote:
> On Tue, Aug 15, 2023 at 01:40:22PM +0200, Linus Walleij wrote:
> > On Tue, Aug 15, 2023 at 11:50 AM Andy Shevchenko
> > <andriy.shevchenko@xxxxxxxxxxxxxxx> wrote:
> > > On Fri, Aug 11, 2023 at 09:30:34PM +0200, Bartosz Golaszewski wrote:
> > > > From: Bartosz Golaszewski <bartosz.golaszewski@xxxxxxxxxx>
> > > >
> > > > After we remove a GPIO chip that still has some requested descriptors,
> > > > gpiod_free_commit() will fail and we will never put the references to the
> > > > GPIO device and the owning module in gpiod_free().
> > > >
> > > > Rework this function to:
> > > > - not warn on desc == NULL as this is a use-case on which most free
> > > > functions silently return
> > > > - put the references to desc->gdev and desc->gdev->owner unconditionally
> > > > so that the release callback actually gets called when the remaining
> > > > references are dropped by external GPIO users
>
> ...
>
> > > > - if (desc && desc->gdev && gpiod_free_commit(desc)) {
> > >
> > > The commit message doesn't explain disappearing of gdev check.
> > >
> > > > - module_put(desc->gdev->owner);
> > > > - gpio_device_put(desc->gdev);
> > > > - } else {
> > > > + /*
> > > > + * We must not use VALIDATE_DESC_VOID() as the underlying gdev->chip
> > > > + * may already be NULL but we still want to put the references.
> > > > + */
> > > > + if (!desc)
> > > > + return;
> > > > +
> > > > + if (!gpiod_free_commit(desc))
> > > > WARN_ON(extra_checks);
> > > > - }
> > > > +
> > > > + module_put(desc->gdev->owner);
> > > > + gpio_device_put(desc->gdev);
> > > > }
> > >
> > > So, if gdev can be NULL, you will get an Oops with new code.
> >
> > I read it such that gdev->chip can be NULL, but not gdev,
> > and desc->gdev->owner is fine to reference?
>
> Basically the Q is
> "if desc is non-NULL, does it guarantee that gdev is non-NULL either?"

desc->gdev is assigned in one single spot, which is in
gpiochip_add_data_with_key():

	for (i = 0; i < gc->ngpio; i++)
		gdev->descs[i].gdev = gdev;

It is never assigned anywhere else, so I guess yes.

We may also ask if it is ever invalid (i.e. if desc->gdev can point to
junk).

A gdev turns to junk when its reference count goes down to zero
and gpiodev_release() is called, effectively calling kfree() on the
struct gpio_device *.

But that can only happen as a result of module_put() getting
called, pulling the references down to zero. Which is what we
are discussing. On the line after module_put(), desc->gdev
*could* be NULL.

But then we just call gpio_device_put(desc->gdev) which is
just a call to put_device(), which is NULL-tolerant.
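
To make the lifetime argument concrete, here is a rough userspace
sketch of the scheme as I understand it. It is not gpiolib code: every
fake_* name is made up for illustration, a plain int stands in for the
device refcount, and the module reference is left out entirely. It only
shows that the back-pointer is set once when the chip is added, and that
putting the reference unconditionally in the free path is what lets the
release step run after the chip is gone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace model only: all fake_* names are hypothetical, not gpiolib symbols. */
struct fake_gdev;

struct fake_desc {
	struct fake_gdev *gdev;	/* back-pointer, assigned once when the chip is added */
	int requested;
};

struct fake_gdev {
	int refcount;		/* stands in for the struct device refcount */
	int chip_present;	/* models gdev->chip being non-NULL */
	size_t ngpio;
	struct fake_desc *descs;
};

int fake_release_count;	/* number of times the release path has run */

struct fake_gdev *fake_gdev_add(size_t ngpio)
{
	struct fake_gdev *gdev = calloc(1, sizeof(*gdev));
	size_t i;

	if (!gdev)
		return NULL;
	gdev->descs = calloc(ngpio, sizeof(*gdev->descs));
	if (!gdev->descs) {
		free(gdev);
		return NULL;
	}
	gdev->ngpio = ngpio;
	gdev->chip_present = 1;
	gdev->refcount = 1;	/* reference held by the registering driver */
	for (i = 0; i < ngpio; i++)
		gdev->descs[i].gdev = gdev;	/* the single assignment spot */
	return gdev;
}

/* NULL-tolerant put; the last put frees the device and its descriptors. */
void fake_gdev_put(struct fake_gdev *gdev)
{
	if (!gdev)
		return;
	assert(gdev->refcount > 0);
	if (--gdev->refcount == 0) {
		fake_release_count++;
		free(gdev->descs);
		free(gdev);
	}
}

/* Requesting a line takes one reference on the owning device. */
struct fake_desc *fake_request(struct fake_gdev *gdev, size_t offset)
{
	struct fake_desc *desc;

	if (!gdev || !gdev->chip_present || offset >= gdev->ngpio)
		return NULL;
	desc = &gdev->descs[offset];
	if (desc->requested)
		return NULL;
	desc->requested = 1;
	gdev->refcount++;
	return desc;
}

/* Chip removal: the chip goes away, the driver drops its own reference. */
void fake_chip_remove(struct fake_gdev *gdev)
{
	gdev->chip_present = 0;
	fake_gdev_put(gdev);
}

/* Free path: silent on NULL, puts the reference even if the chip is gone. */
void fake_free(struct fake_desc *desc)
{
	struct fake_gdev *gdev;

	if (!desc)
		return;
	gdev = desc->gdev;	/* save it: desc lives inside memory owned by gdev */
	desc->requested = 0;
	fake_gdev_put(gdev);
}
```

In this model, adding a chip, requesting one line, removing the chip and
then freeing the line leaves the release path having run exactly once,
on the final fake_free(). If fake_free() skipped the put whenever
chip_present was 0, the device would never be released, which is the
leak the patch is about.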

Yours,
Linus Walleij