Re: [RFC] component: Fix: Unassign components' masters if bringing up master fails

From: Archit Taneja
Date: Tue Feb 16 2016 - 02:17:08 EST



On 02/16/2016 01:02 AM, Jon Medhurst (Tixy) wrote:
On Thu, 2016-02-11 at 15:05 +0530, Archit Taneja wrote:
component_master_add_with_match can fail if the master's bind op doesn't
go through successfully. In such a scenario, all the components in the
master's match array have their 'master' pointer set to the given master.
These pointers need to be set to NULL again. If they aren't, successive
calls to component_master_add_with_match will fail because the driver
thinks these components already have a master.

This issue can be seen when a driver defers probe because of missing
resources. It is seen after the introduction of commit:

"component: track components via array rather than list"

Add 'master_remove_components' which sets the all the components's masters
in the match array to NULL. This function is also re-used in
component_master_del and replaces code that did the same thing.

Signed-off-by: Archit Taneja <architt@xxxxxxxxxxxxxx>

As Daniel pointed out in his reply, there is already a fix for this
issue in Linux which makes sure no components point to a master if it is
deleted. See commit 57480484f9f7 ("component: Detach components when
deleting master struct")

Similarly, Daniel's fix for the mirror case has just been applied, which
makes sure masters don't refer to components when they are deleted.
Commit 8e7199c2c50f ("component: remove device from master match list on
failed add").


I gave these fixes a try. As expected, they resolve the issue I
observed.


It seems to me that for other error cases (that don't result in deletion
of objects) we would want to leave the references between components and
masters intact once they have been created.

With regard to the $subject patch (below) it looks like it would go
wrong in this situation...

- component_add() is called to add a component

- This calls try_to_bring_up_masters() which calls
try_to_bring_up_master() for each master in the system

- If that master doesn't yet have all components available yet
find_components() returned false, then

- master_remove_components() is called

But, this isn't an error situation that needs rolling back, and as
written the patch only half does this, because it stops components
pointing to the master, but leaves the master's match list pointing to
those components.

You're right. I didn't realize this, this would have really messed
things up.


The actual real error conditions in try_to_bring_up_master() only get
triggered when actually trying to bring up a master, and that only
happens when either:

a) The last component for that master is being added with
component_add()

b) A master is added by component_master_add_with_match() and all the
components it required where already registered.

Both a) and b) should now be handled correctly by the deletion of the
relevant component/master that was being added (thanks to the two fixes
I mentioned at the beginning).

The other components or master should subsequently get cleaned up by
calling component_del() or component_master_del(), which take care of
updating the relevant references between components and master.

For component_master_del this is not immediately obvious, but
take_down_master calls devres_release_group which causes
devm_component_match_release to be called.


Thanks for the explanation.

Archit

--
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora Forum, hosted by The Linux Foundation