Re: linux 5.15-rc4: refcount underflow when unloading gpio-mockup

From: Andy Shevchenko
Date: Mon Oct 04 2021 - 15:56:41 EST


On Mon, Oct 4, 2021 at 8:51 PM Kent Gibson <warthog618@xxxxxxxxx> wrote:
>
> On Mon, Oct 04, 2021 at 10:17:54PM +0800, Kent Gibson wrote:
> > On Mon, Oct 04, 2021 at 04:20:55PM +0300, Heikki Krogerus wrote:
> > > On Mon, Oct 04, 2021 at 08:47:01PM +0800, Kent Gibson wrote:
> > > > On Mon, Oct 04, 2021 at 03:30:43PM +0300, Heikki Krogerus wrote:
> > > > > On Mon, Oct 04, 2021 at 08:19:42PM +0800, Kent Gibson wrote:
> > > > > > On Mon, Oct 04, 2021 at 11:44:17AM +0200, Greg Kroah-Hartman wrote:
> > > > > > > On Mon, Oct 04, 2021 at 05:34:16PM +0800, Kent Gibson wrote:
> > > > > > > > Hi,
> > > > > > > >
>
> [snip]
>
> > > > Looking at the offending patch, it effectively replaces a call to
> > > > device_add_properties() with one to
> > > > device_create_managed_software_node(), and those two functions appear
> > > > quite different - at least at first glance.
> > > > Is that correct?
> > >
> > > The only real difference between the two functions is that
> > > device_create_managed_software_node() marks the software node it
> > > creates (and it does it exactly the same way as
> > > device_add_properties()) as "managed" with a specific flag.
> > >
> >
>
> That managed flag makes all the difference.
>
> I've tried to find a fix along the same lines as Laurentiu Tudor's
> 5aeb05b27f8 software node: balance refcount for managed software nodes
> but haven't found anything that works.
>
> What does work for me is to revert the call to
> device_create_managed_software_node() to a call to
> device_add_properties():
>
> diff --git a/drivers/base/platform.c b/drivers/base/platform.c
> index 652531f67135..2f30bdb94fab 100644
> --- a/drivers/base/platform.c
> +++ b/drivers/base/platform.c
> @@ -826,8 +826,7 @@ struct platform_device *platform_device_register_full(
> goto err;
>
> if (pdevinfo->properties) {
> - ret = device_create_managed_software_node(&pdev->dev,
> - pdevinfo->properties, NULL);
> + ret = device_add_properties(&pdev->dev, pdevinfo->properties);
> if (ret)
> goto err;
> }
>
> That obviously wont work with your latest series that removes
> device_add_properties(), but that is the simplest, and only, solution
> that I've found so far.

Here is debug prints of what happens

# modprobe gpio-mockup gpio_mockup_ranges=-1,10
[ 212.850993] (null): device_create_managed_software_node <<< 0
[ 212.858177] platform gpio-mockup.0: software_node_notify 0 <<<
[ 212.865264] platform gpio-mockup.0: software_node_notify 1 <<< 1
[ 212.874159] gpio-mockup gpio-mockup.0: no of_node; not parsing pinctrl DT
[ 212.882962] gpiochip_find_base: found new base at 840
[ 212.889873] gpio gpiochip3: software_node_notify 0 <<<
[ 212.896164] gpio gpiochip3: software_node_notify 1 <<< 1
[ 212.905099] gpio gpiochip3: (gpio-mockup-A): added GPIO chardev (254:3)
[ 212.913313] gpio gpiochip840: software_node_notify 0 <<<
[ 212.920676] gpio gpiochip3: registered GPIOs 840 to 849 on gpio-mockup-A
[ 212.935601] modprobe (185) used greatest stack depth: 12744 bytes left

As I read it the software node is created for gpio-mockup device and
then _shared_ with the GPIO device, since it's managed it's gone when
GPIO device gone followed by UAF when mockup (platform) device itself
gone. I.o.w. this software node mustn't be managed because it covers
the lifetime of two different objects. The correct fix is to create
manually software node and assign it to the pdev and manually free in
gpio_mockup_unregister_pdevs()/

Tl;DR: it's a bug in gpio-mockup.


--
With Best Regards,
Andy Shevchenko