Re: Device links between providers and consumers

From: Saravana Kannan
Date: Mon Nov 20 2023 - 23:53:38 EST


Hi Bartosz,

Adding LKML so that others are aware of what the issue is and it'll be
easier when I get to my TODO patches and send them out. I'm hoping
that's okay with you because we didn't discuss anything confidential
here.

On Mon, Nov 20, 2023 at 2:21 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
>
> On Mon, Nov 20, 2023 at 12:38 PM Bartosz Golaszewski <brgl@xxxxxxxx> wrote:
> >
> > Hi Saravana,
> >
> > As I suspected, I couldn't observe the behavior you described during
> > our discussion at the LPC event. I have a DT GPIO provider and a
> > consumer referencing it by phandle. I'm unbinding the provider and the
> > consumer keeps on living, if it tries to use the GPIO, then it will
> > enter the regular code path in GPIO for checking if the provider is
> > there or not.
> >
> > Could you point me in the right direction here?
>
> Thanks for trying it out! Based on the code it should unbind the
> consumers. I haven't ever tried it myself (no need for it).

I took a closer look to show you where the consumer unbind is supposed
to be done, but in doing so I think I know what issue you are hitting.
One of my TODO items for device links should fix your problem.

The force unbinding of consumers when the supplier is unbound is
supposed to happen here:
device_driver_detach()
-> device_release_driver_internal()
-> __device_release_driver()
-> device_links_unbind_consumers()
-> for all "active" consumer -> device_release_driver_internal()

However the problem is the "if (drv)" check in __device_release_driver().

This problem also exists for "class" device suppliers that don't have
a drv. Fixing managed device links for "class" suppliers (and now, bus
suppliers without drv) has been in my TODO list for a while.

The gpio device is one of the cases of a "bus" device probing without
a driver. A while ago, I implemented a gpio_bus_match() that'll probe
the gpio device (so consumer probing isn't blocked) and I was trying
to keep the boilerplate code minimalistic. So, for your test case, a
quick test hack would be to implement an actual stub driver instead of
using a stub bus match. That should fix your problem with the
consumers not unbinding. I'll put up a proper fix sometime soon
(hopefully over the holiday lulls).

Btw, when we were talking in person at the LPC dinner, you were asking
"what would you do if the supplier was an optional supplier but you
forcefully unbound the consumer?" I have a nice answer now:

After a force unbind, we need to add all these consumers to the
deferred probe list and trigger another deferred probe attempt. If the
supplier was optional, the consumer would probe again. This also has
the nice property that the consumer doesn't advertise something to
userspace that it can't deliver (because the supplier has gone
missing) and it makes the error handling easier for drivers. They
don't have to worry about suppliers vanishing in every part of their
code. Once they get the supplier and probe successfully, they
shouldn't have to worry about it vanishing underneath them.

Cheers,
Saravana


>
> Let's start with making sure the basic functionality is working in your case.
>
> Can you check /sys/class/devlink to see if you see a folder with the
> following name?
> <bus:supplier>--<bus:consumer>
>
> Once you find it, can you cat all the file contents and tell me what
> it says before you unbind it?
>
> The "status" should be "available". And "sync_state_only" should be false.
>
> Also, how are you unbinding the supplier? And does the board you are
> playing with something that's upstream? Should we take this discussion
> to LKML?
>
> -Saravana