Re: [RFC PATCH] of: device: Support 2nd sources of probeable but undiscoverable devices

From: Rob Herring
Date: Thu Sep 28 2023 - 16:38:11 EST


On Wed, Sep 27, 2023 at 6:50 PM Doug Anderson <dianders@xxxxxxxxxxxx> wrote:
>
> Hi,
>
> On Tue, Sep 26, 2023 at 7:37 PM Jeff LaBundy <jeff@xxxxxxxxxxx> wrote:
> >
> > Hi Doug,
> >
> > On Fri, Sep 22, 2023 at 05:11:10PM -0700, Doug Anderson wrote:
> > > Hi,
> > >
> > > On Fri, Sep 22, 2023 at 12:08 PM Rob Herring <robh+dt@xxxxxxxxxx> wrote:
> > > >
> > > > > > This seems like overkill to me. Do we really need groups and a mutex
> > > > > > for each group? Worst case is what? 2-3 groups of 2-3 devices?
> > > > > > Instead, what about extending "status" with another value
> > > > > > ("fail-needs-probe"? (fail-xxx is a documented value)). Currently, the
> > > > > > kernel would just ignore nodes with that status. Then we can process
> > > > > > those nodes separately 1-by-1.
> > > > >
> > > > > My worry here is that this has the potential to impact boot speed in a
> > > > > non-trivial way. While trackpads and touchscreens _are_ probable,
> > > > > their probe routines are often quite slow. This is even mentioned in
> > > > > Dmitry's initial patches adding async probe to the kernel. See commit
> > > > > 765230b5f084 ("driver-core: add asynchronous probing support for
> > > > > drivers") where he specifically brings up input devices as examples.
> >
> > Ideally, all but one driver in a group should bail out of probe quickly if
> > the device is not populated. If not, I would consider that to be a bug or at
> > least room for improvement in that driver.
> >
> > The reason input devices can take a while to probe is because they may be
> > loading FW over I2C or performing some sort of calibration procedure; only
> > one driver in the group should get that far.
>
> Hmm, that's not my experience. Specifically I've seen i2c-hid devices
> whose datasheets say that you're not allowed to talk i2c to them at
> all for hundreds of milliseconds after you power them on. See, for
> instance, "i2c-hid-of-goodix.c" which has a "post_gpio_reset_delay_ms"
> of 180 ms and "i2c-hid-of-elan.c" which has one of 300 ms.
>
> As I understand it these touchscreens have firmware on them and that
> firmware can take a while to boot. Until the firmware boots they won't
> respond over i2c. This is simply not something that Linux can do
> anything about.
>
> About the best you could do would be to add a board-specific driver
> that understood that it could power up the rails, wait the maximum
> amount of time that all possible touchscreens might need, and then
> look for i2c ACKs. I'm still hoping to hear from Rob about how I would
> get a board-specific driver to load on a DT system so I can
> investigate / prototype this.

foo_initcall()
{
if (of_machine_is_compatible(...))
platform_device_create();
}

That chunk would have to be built in and if that's part of the driver
module, autoloading wouldn't work.

We could have a match table of board compatibles and driver names. I'm
not worried about that list being big, so I'm happy to stick that in
drivers/of/.

> > > > We could add information on the class of device. touchscreen and
> > > > touchpad aliases or something.
> > >
> > > Ah, I see. So something like "fail-needs-probe-<class>". The
> > > touchscreens could have "fail-needs-probe-touchscreen" and the
> > > trackpads could have "fail-needs-probe-trackpad" ? That could work. In
> > > theory that could fall back to the same solution of grabbing a mutex
> > > based on the group ID...
> > >
> > > Also: if having the mutex in the "struct device" is seen as a bad
> > > idea, it would also be easy to remove. __driver_probe_device() could
> > > just make a call like "of_device_probe_start()" at the beginning that
> > > locks the mutex and then "of_device_probe_end()" that unlocks it. Both
> > > of those calls could easily lookup the mutex in a list, which would
> > > get rid of the need to store it in the "struct device".
> > >
> > >
> > > > > That would lead me to suggest this:
> > > > >
> > > > > &i2c_bus {
> > > > > trackpad-prober {
> > > > > compatible = "mt8173-elm-hana-trackpad-prober";
> > > > >
> > > > > tp1: trackpad@10 {
> > > > > compatible = "hid-over-i2c";
> > > > > reg = <0x10>;
> > > > > ...
> > > > > post-power-on-delay-ms = <200>;
> > > > > };
> > > > > tp2: trackpad@20 {
> > > > > compatible = "hid-over-i2c";
> > > > > reg = <0x20>;
> > > > > ...
> > > > > post-power-on-delay-ms = <200>;
> > > > > };
> > > > > };
> > > > > };
> > > > >
> > > > > ...but I suspect that would be insta-NAKed because it's creating a
> > > > > completely virtual device ("mt8173-elm-hana-trackpad-prober") in the
> > > > > device tree. I don't know if there's something that's functionally
> > > > > similar that would be OK?
> >
> > This solution seems a bit confusing to me, and would require more edits
> > to the dts each time a second source is added. It also means one would
> > have to write a small platform driver for each group of devices, correct?
>
> No matter what we need to add something to the dts each time a second
> source is added, right?

That was my thought. There is the case of the devices are almost the
same, so we lie. That's what you are doing for displays IIRC.

> While it's true that we'd end up with some extra drivers, if we do it
> correctly we don't necessarily need a driver for each group of devices
> nor even a driver per board. If several boards have very similar
> probing requirements then, even if they have unique "compatible"
> strings they could still end up using the same Linux driver.
>
> I've actually been talking offline with folks on ChromeOS more about
> this problem as well. Chen-Yu actually pointed at a patch series (that
> never landed, I guess) that has some similar ideas [1]. I guess in
> that case Hans was actually constructing device tree properties
> manually in the driver. I was thinking more of having all of the
> options listed in the device tree and then doing something that only
> causes some of them to probe.
>
> If Rob was OK with it, I guess I could have some sort of top-level
> "hwmanager" node like Hans did and then have phandle links to all the
> hardware that are managed by it. Then I could just change those to
> "okay"?

That's really just making the mutex node link the other direction. The
devices link to the common mutex node vs. the hwmanager node(s) links
to all the devices. That's really just picking the paint colors.

> Ideally, though, this could somehow use device tree "overlays" I
> guess. That seems like almost a perfect fit. I guess the issue here,
> though, is that I'd want the overlays bundled together with the
> original DT and then the board-specific "hardware prober" driver to
> actually apply the overlays after probing. Does that seem sensible?

BTW, there was an idea long ago from maintainer emeritus Likely to
append overlays to the base DTB for the kernel to apply.

How would that help you here? Are you going to have an overlay for
each device that enables it? It's much easier to just call
of_update_property() to change "status".

Rob