Re: [PATCH v1 2/2] driver core: fw_devlink: Handle missing drivers for optional suppliers

From: Saravana Kannan
Date: Mon Feb 01 2021 - 15:49:52 EST


On Mon, Feb 1, 2021 at 2:32 AM Geert Uytterhoeven <geert@xxxxxxxxxxxxxx> wrote:
>
> Hi Saravana,
>
> On Sat, Jan 30, 2021 at 5:03 AM Saravana Kannan <saravanak@xxxxxxxxxx> wrote:
> > After a deferred probe attempt has exhausted all the devices that can be
> > bound, any device that remains unbound has one/both of these conditions
> > true:
> >
> > (1) It is waiting on its supplier to bind
> > (2) It does not have a matching driver
> >
> > So, to make fw_devlink=on more forgiving of missing drivers for optional
> > suppliers, after we've done a full deferred probe attempt, this patch
> > deletes all device links created by fw_devlink where the supplier hasn't
> > probed yet and the supplier itself is not waiting on any of its
> > suppliers. This allows consumers to probe during another deferred probe
> > attempt if they were waiting on optional suppliers.
> >
> > When modules are enabled, we can't differentiate between a driver
> > that'll never be registered vs a driver that'll be registered soon by
> > loading a module. So, this patch doesn't do anything for the case where
> > modules are enabled.
>
> For the modular case, can't you do a probe regardless? Or limit it
> to devices where the missing provider is a DMAC or IOMMU driver?
> Many drivers can handle missing DMAC controller drivers, and are even
> supposed to work that way. They may even retry obtaining DMA resources
> later.

I don't want to handle this at a property/provider-type level. It'll
be a whack-a-mole that'll never end -- there'll be some driver that
would work without some resource. Letting it probe is not difficult (I
just need to drop these device links), but the problem is that a lot
of drivers are not written properly to handle getting deferred and
then getting reattempted before the supplier, either because:

1. They were never built and tested as a module
2. The supplier gets deferred and the consumer doesn't have a proper
deferred probe implementation, so when we drop the device links, the
consumer might be attempted before the supplier and things go bad.

One hack I'm thinking of is that with CONFIG_MODULES, I can drop these
unmet device links after an N-second timeout, with the timeout
extended every time a new driver is registered. So as long as no two
modules are loaded more than N seconds apart during boot up, it
would all just work out fine. It doesn't solve the problem fully
either, but maybe it'll be good enough? I haven't analyzed this fully
yet -- so apologies in advance if it's stupid.
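
To make the timeout idea a bit more concrete, here's a rough userspace
sketch in C. All the names (struct drop_timer, the drop_timer_*()
helpers, LINK_DROP_TIMEOUT_SECS) and the 10-second value are made up
for illustration; none of this is existing driver core code, and time
is just a plain seconds counter passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical: the "N" in the N-second timeout. The value is arbitrary. */
#define LINK_DROP_TIMEOUT_SECS 10

/* Hypothetical: tracks when the unmet device links may be dropped. */
struct drop_timer {
	long deadline;	/* time, in seconds, at which links get dropped */
};

/* Arm the timer once the first full deferred probe attempt is done. */
void drop_timer_start(struct drop_timer *t, long now)
{
	assert(t != NULL);
	t->deadline = now + LINK_DROP_TIMEOUT_SECS;
}

/* A new driver got registered: push the deadline out by another N seconds. */
void drop_timer_driver_registered(struct drop_timer *t, long now)
{
	assert(t != NULL);
	t->deadline = now + LINK_DROP_TIMEOUT_SECS;
}

/* True once N seconds have passed with no new driver registration. */
bool drop_timer_expired(const struct drop_timer *t, long now)
{
	assert(t != NULL);
	return now >= t->deadline;
}
```

With N = 10, a timer started at t=0 would expire at t=10, but a driver
registered at t=8 moves that to t=18. Only when drop_timer_expired()
returns true would the unmet links get dropped and the consumers get
another deferred probe attempt.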

-Saravana