Re: [PATCH 2/2] drm/imx: make sure to cleanup DRM before unbinding components

From: Russell King - ARM Linux
Date: Wed Oct 10 2018 - 07:22:27 EST


On Wed, Oct 10, 2018 at 01:02:16PM +0200, Stefan Agner wrote:
> [adding Lucas]
>
> On 10.10.2018 12:38, Russell King - ARM Linux wrote:
> > On Tue, Oct 09, 2018 at 11:30:49PM +0200, Stefan Agner wrote:
> >> In situations where a component fails to bind, a previously
> >> successfully bound component might already registered itself
> >> with the DRM framework (e.g. an encoder). When the master
> >> component then calls drm_mode_config_cleanup, we end up in a
> >> use after free sitution.
> >>
> >> Use the cleanup callback to make sure all framework level
> >> cleanup is done by the time we unbind components.
> >
> > I'm not sure about this approach - the idea about the component bind
> > and unbind callbacks is that unbind undoes _everything_ that bind has
> > done, so everything is correctly ordered. If bind registers something,
> > unbind should unregister it.
> >
> > What seems to be going on is that imx is registering stuff in bind()
> > but not unregistering it in unbind().
>
> Yes indeed, if that can be fixed this seems to be the better approach to
> me.
>
> >
> > Since imx was one of the drivers that the component helper was
> > created for, if it's now crashing, that's a regression in the imx
> > driver. Looking at the commit log, I'd say:
> >
> > commit 8e3b16e2117409625b89807de3912ff773aea354
> > Author: Lucas Stach <l.stach@xxxxxxxxxxxxxx>
> > Date: Thu Aug 11 11:18:49 2016 +0200
> >
> > drm/imx: don't destroy mode objects manually on driver unbind
> >
> > Instead let drm_mode_config_cleanup() do the work when taking down
> > the master device. This requires all cleanup functions to be
> > properly hooked up to the mode object .destroy callback.
> >
> > Signed-off-by: Lucas Stach <l.stach@xxxxxxxxxxxxxx>
> > Signed-off-by: Philipp Zabel <p.zabel@xxxxxxxxxxxxxx>
> >
> > is probably responsible for introducing this problem, since the
> > explicit calls were added by me when imx was stuck in staging due to
> > the problems that the component helper solved.
>
> The commit above does not revert cleanly today, but a quick fixing
> seemed to resolve the problem I am seeing...
>
> >
> > I think what we have here are different opinions on how cleanup
> > should be handled.
>
> In the regular case using the framework cleanup function before calling
> component_unbind_all() works fine.
>
> Its really only the case where a subcomponent fails to bind where unbind
> happens before calling drm_mode_config_cleanup(drm). I guess Lucas was
> not aware of that special case...?

As long as ->unbind undoes everything that ->bind does, it's not a
problem.

WRT using devres groups in one of your previous mails on this topic,
consider what would happen if you use devm_* APIs in the bind callback
and the component helper did not use devres groups.

Any resources taken in the bind callback would not be freed until the
device was unbound from the driver by the driver model. This means
each time the bind callback gets called, new sets of resources are
attempted to be acquired. If some resources are exclusive, the first
bind attempt would succeed, but the second attempt would fail with
-EBUSY for example. In the case of memory, the allocated memory
would not be freed, but new memory would be allocated each time a
bind was attempted.

The alternative would be that devres must not be used at all in the
bind/unbind callbacks - but then we'll be subject to the same error-
prone cleanup that we see with driver probe functions prior to devres.

Consider what the LDB bind/unbind callbacks would look like if devres
was not permitted, and how the memory for the imx_ldb structure would
be managed - you'd need some sort of ref-counting on the imx_ldb
structure, and management of that in the encoder and optional connector
destroy functions so you know when all embedded encoders and connectors
in that structure have been "destroyed" by drm_mode_config_cleanup(),
and hence it's safe to free the data structure.

--
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up