Re: [PATCH v2 2/3] driver core: fw_devlink: Handle missing drivers for optional suppliers

From: Saravana Kannan
Date: Thu Feb 04 2021 - 14:06:07 EST


On Thu, Feb 4, 2021 at 10:41 AM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
>
> On Tue, Feb 2, 2021 at 8:47 PM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> >
> > On Tue, Feb 2, 2021 at 6:34 AM Rafael J. Wysocki <rafael@xxxxxxxxxx> wrote:
> > >
> > > On Tue, Feb 2, 2021 at 5:33 AM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> > > >
>
> [cut]
>
> > >
> > > > + *
> > > > + * This function requests fw_devlink to set itself up for a deferred probe
> > > > + * retry. This allows fw_devlink to ignore device links it created to
> > > > + * suppliers that'll never probe. This is necessary in case some of the
> > > > + * suppliers are optional and their consumers can probe without them.
> > > > + *
> > > > + * Returns true if deferred probe retry is likely to make any difference.
> > > > + */
> > > > +bool fw_devlink_deferred_probe_retry(void)
> > > > +{
> > > > + if (IS_ENABLED(CONFIG_MODULES))
> > > > + return false;
> > >
> > > To make the above more visible, I'd fold this function into the caller.
> >
> > I had written it this way because I'm thinking of adding a timeout
> > heuristic for MODULES in here. I can move it to the caller if you feel
> > strongly about it.
>
> Not really strongly, but then moving it back when you need doesn't
> sound particularly troublesome to me. :-)

Ok, will move it. I'm also rewriting this patch. So we'll see where this lands.

>
> > >
> > > > +
> > > > + fw_devlink_def_probe_retry = true;
> > > > + return fw_devlink_get_flags() && !fw_devlink_is_permissive();
> > > > +}
> > > > +
> > > > /**
> > > > * fw_devlink_create_devlink - Create a device link from a consumer to fwnode
> > > > * @con - Consumer device for the device link
> > > > diff --git a/drivers/base/dd.c b/drivers/base/dd.c
> > > > index 9179825ff646..11325df2327f 100644
> > > > --- a/drivers/base/dd.c
> > > > +++ b/drivers/base/dd.c
> > > > @@ -317,6 +317,11 @@ static int deferred_probe_initcall(void)
> > > > driver_deferred_probe_trigger();
> > > > /* Sort as many dependencies as possible before exiting initcalls */
> > > > flush_work(&deferred_probe_work);
> > > > +
> > > > + if (fw_devlink_deferred_probe_retry()) {
> > > > + driver_deferred_probe_trigger();
> > > > + flush_work(&deferred_probe_work);
> > > > + }
> > > > initcalls_done = true;
> > > >
> > > > /*
> > > > --
> > >
> > > Overall, the "let's do nothing if modules are not enabled" approach is
> > > a bit disappointing, because what if somebody builds all of the
> > > drivers needed for boot in and enables modules anyway, for example to
> > > allow USB drivers to be probed dynamically?
> >
> > Yeah, I'm disappointed too :( But I'm trying to get it to work for
> > !MODULES so that we can enable fw_devlink=on by default at least for
> > !MODULES to make sure drivers don't introduce more issues going
> > forward. And then I plan to continue working on making it work
> > correctly for MODULES case too.
> >
> > Getting fw_devlink=on to work perfectly for MODULES and !MODULES is
> > not a problem at all. But it needs fixing a bunch of drivers (mostly
> > simple fixes like setting the right flag, handling deferred probes
> > correctly, etc), but I'm hitting a catch-22 here. I can't find the
> > drivers without setting fw_devlink=on by default. But if I did that,
> > it's going to break a bunch of boards.
> >
> > What's your thought on leaving fw_devlink=on by default on 5.12 and
> > fixing drivers as issues are reported?
>
> If there are any issues known today that need to be addressed, I'd fix
> them first and then try to enable fw_devlink=on maybe just for
> !MODULES to start with.

Yeah, that's what I'm thinking of for now.

> > If that's a no, do you have any other ideas on how to deal with this catch-22?
>
> Try to enable, fix issues as they show up in linux-next. If there are
> still outstanding issues before the next release, back off and try in
> the next cycle. Repeat.

If it's just dealing with outstanding issues that are reported, I'm
hoping I can do that. The biggest headache right now is dealing with
devices that have drivers that directly parse the fwnode AND still
have a struct device. So the struct device remains unbound even if the
driver has initialized the device.

>
> This doesn't sound particularly attractive, but I don't have any
> better idea, sorry.

:'( Yeah, another approach I'm thinking of is to have a separate
"strict mode" for fw_devlink=on or above. Where it'll try it's best
till kernel late init and then fallback to permissive. But it's
becoming a headache to deal with some corner cases.

-Saravana