Re: [PATCH] HID: google: Don't use devm for hid_hw_stop()

From: Dmitry Torokhov
Date: Wed May 10 2023 - 20:25:23 EST


On Wed, May 10, 2023 at 01:50:01PM -0700, Stephen Boyd wrote:
> Quoting Dmitry Torokhov (2023-05-10 13:24:08)
> > On Wed, May 10, 2023 at 11:51:31AM -0700, Stephen Boyd wrote:
> > > Quoting Dmitry Torokhov (2023-05-05 17:06:07)
> > > > On Fri, May 05, 2023 at 04:24:16PM -0700, Stephen Boyd wrote:
> > > > >
> > > > ...
> > > > > Unfortunately, the hid google hammer driver hand rolls a devm function
> > > > > to call hid_hw_stop() when the driver is unbound and implements an
> > > > > hid_driver::remove() function. The driver core doesn't call the devm
> > > > > release functions until _after_ the bus unbinds the driver, so the order
> > > > > of operations is like this:
> > > >
> > > > Excellent analysis, but the problem is not limited to the hammer driver
> > > > (potentially) and shalt be dealt with appropriately, at the HID bus
> > > > level.
> > >
> > > Thanks. I thought of the bus level approach as well, but I was trying to
> > > keep the fix isolated to the driver that had the problem. I'd like to
> > > get the fix into the stable kernel, as this fixes a regression
> > > introduced by commit d950db3f80a8 ("HID: google: switch to devm when
> > > registering keyboard backlight LED") in v5.18.
> > >
> > > Is the bus level approach going to be acceptable as a stable backport?
> >
> > Sure, why not given the kind of stuff flowing into stable kernels. At
> > least this would be fixing real issue that can be triggered with a real
> > device.
>
> Hmm, ok. I was worried it would be too much "new code" vs. fixing
> something.
>
> > >
> > > This got me thinking that maybe both of these approaches are wrong.
> > > Maybe the call to hid_close_report() should be removed from
> > > hid_device_remove() instead.
> > >
> > > The device is being removed from the bus when hid_device_remove() is
> > > called, but it hasn't been released yet. Other devices like the hidinput
> > > device are referencing the hdev device because they set the hdev as
> > > their parent. Basically, child devices are still bound to some sort of
> > > driver or subsystem when the parent hdev is unbound from its driver,
> > > leading to a state where the child drivers could still access the hdev
> > > while it is being destroyed. If we remove the hid_close_report() call
> > > from this function it will eventually be called by hid_device_release()
> > > when the last reference to the device is dropped, i.e. when the child
> > > devices all get destroyed. In the case of hid-google-hammer, that would
> > > be when hid_hw_stop() is called from the devm release function by driver
> > > core.
> > >
> > > The benefit of this approach is that we don't allocate a devres group
> > > for all the hid devices when only two drivers need it. The possible
> > > downside is that we keep the report around while the device exists but
> > > has no driver bound to it.
> > >
> > > Here's a totally untested patch for that.
> > >
> > > ---8<----
> > > diff --git a/drivers/hid/hid-core.c b/drivers/hid/hid-core.c
> > > index 22623eb4f72f..93905e200cae 100644
> > > --- a/drivers/hid/hid-core.c
> > > +++ b/drivers/hid/hid-core.c
> > > @@ -1211,8 +1211,8 @@ int hid_open_report(struct hid_device *device)
> > > hid_parser_reserved
> > > };
> > >
> > > - if (WARN_ON(device->status & HID_STAT_PARSED))
> > > - return -EBUSY;
> > > + if (device->status & HID_STAT_PARSED)
> > > + hid_close_report(device);
> > >
> > > start = device->dev_rdesc;
> > > if (WARN_ON(!start))
> > > @@ -2662,7 +2662,6 @@ static void hid_device_remove(struct device *dev)
> > > hdrv->remove(hdev);
> > > else /* default remove */
> > > hid_hw_stop(hdev);
> > > - hid_close_report(hdev);
> > > hdev->driver = NULL;
> > > }
> >
> > This will probably work, but it I consider this still being fragile as
> > at some point we might want to add some more unwinding, and we'll run
> > into this issue again. I would feel much safer if the order of release
> > followed (inversely) order of allocations more closely.
> >
>
> Sorry, I'm not following here. How is it fragile? Are you saying that if
> we want to add devm calls into the bus layer itself the order of release
> won't be inverse of allocation/creation?

What I was trying to say is that later someone else might be tempted to
add more traditional-style resources and non-devm-unwinding for them.
Having an explicit devres groups gives exact point when driver-allocated resources
are released, and makes patch authors take it into consideration.

If everything is devm-controlled then we do not need a separate devres
group.

Thanks.

--
Dmitry