Re: [PATCH 1/2] driver core: Clear FWNODE_FLAG_NOT_DEVICE when a device is added

From: Herve Codina
Date: Tue Mar 12 2024 - 10:18:55 EST


Hi Saravana,

On Tue, 20 Feb 2024 18:41:03 -0800
Saravana Kannan <saravanak@xxxxxxxxxx> wrote:

> On Tue, Feb 20, 2024 at 3:10 AM Herve Codina <herve.codina@xxxxxxxxxxx> wrote:
> >
> > Since commit 1a50d9403fb9 ("treewide: Fix probing of devices in DT
> > overlays"), when using device-tree overlays, the FWNODE_FLAG_NOT_DEVICE
> > is set on each overlay nodes.
> > When an overlay contains a node related to a bus (i2c for instance)
> > and its children nodes representing i2c devices, the flag is cleared for
> > the bus node by the OF notifier but the "standard" probe sequence takes
> > place (the same one is performed without an overlay) for the bus and
> > children devices are created simply by walking the children DT nodes
> > without clearing the FWNODE_FLAG_NOT_DEVICE flag for these devices.
> >
> > Clear the FWNODE_FLAG_NOT_DEVICE when the device is added, no matter if
> > an overlay is used or not.
> >
> > Fixes: 1a50d9403fb9 ("treewide: Fix probing of devices in DT overlays")
> > Cc: <stable@xxxxxxxxxxxxxxx>
> > Signed-off-by: Herve Codina <herve.codina@xxxxxxxxxxx>
> > ---
> > drivers/base/core.c | 1 +
> > 1 file changed, 1 insertion(+)
> >
> > diff --git a/drivers/base/core.c b/drivers/base/core.c
> > index 14d46af40f9a..61d09ac57bfb 100644
> > --- a/drivers/base/core.c
> > +++ b/drivers/base/core.c
> > @@ -3619,6 +3619,7 @@ int device_add(struct device *dev)
> > */
> > if (dev->fwnode && !dev->fwnode->dev) {
> > dev->fwnode->dev = dev;
> > + dev->fwnode->flags &= ~FWNODE_FLAG_NOT_DEVICE;
> > fw_devlink_link_device(dev);
> > }
>
> Temporary Nack on this. I think depending on how we address patch 2/2
> this patch might not be necessary.
>
> Also, I'd ideally prefer this gets cleared before the device is added,
> but it's a position that I'd be willing to change.
>

Some more information about this current patch.

Several month ago, I sent a patch related to a warning raised during driver
unbinding [1]. This warning was raised by __device_links_no_driver() because
we unlink a consumer while its supplier links.status is DL_DEV_UNBINDING.
You suspected an issue with the device removal ordering.

On this system, I applied this current patch clearing FWNODE_FLAG_NOT_DEVICE
in device_add(). This fixes the warning described in [1].

[1]: https://lore.kernel.org/linux-kernel/CAGETcx-Mp0uKBF_BWFFBUm=eVOp8xhxF3+znFB8vTaFwpJWTnw@xxxxxxxxxxxxxx/


The use case on that system, involved DT overlays and the fragment applied is the
following:
--- 8< ---
pci-ep-bus@0 {
compatible = "simple-bus";
#address-cells = <1>;
#size-cells = <1>;

/*
* map @0xe2000000 (32MB) to BAR0 (CPU)
* map @0xe0000000 (16MB) to BAR1 (AMBA)
*/
ranges = <0xe2000000 0x00 0x00 0x00 0x2000000
0xe0000000 0x01 0x00 0x00 0x1000000>;

...

flx0: flexcom@e0040000 {
compatible = "atmel,sama5d2-flexcom";
reg = <0xe0040000 0x100>;
clocks = <&clks GCK_ID_FLEXCOM0>;
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x0 0xe0040000 0x800>;

atmel,flexcom-mode = <ATMEL_FLEXCOM_MODE_TWI>;

i2c_lan966x: i2c@600 {
compatible = "microchip,lan966x-i2c";
reg = <0x600 0x200>;
interrupt-parent = <&itc>;
interrupts = <48>;
#address-cells = <1>;
#size-cells = <0>;
clocks = <&clks GCK_ID_FLEXCOM0>;
assigned-clocks = <&clks GCK_ID_FLEXCOM0>;
assigned-clock-rates = <20000000>;
pinctrl-0 = <&fc0_a_pins>;
pinctrl-names = "default";
i2c-analog-filter;
i2c-digital-filter;
i2c-digital-filter-width-ns = <35>;
};
};
...
};
--- 8< ---
This fragment is applied to a PCI device node.

Without clearing the FWNODE_FLAG_NOT_DEVICE, a link is present between the
i2c@600 and the PCI device. With the flag cleared, this link is replaced by
a link between the i2c@600 and the pci-ep-bus. Which looks better.

The flexcom driver is a MFD driver. As a MFD driver, it simply calls
devm_of_platform_populate(). In this path, devices are created and added but
nothing cleared the FWNODE_FLAG_NOT_DEVICE.

Based on your remark "I'd ideally prefer this gets cleared before the device
is added", I have the feeling that all calls to device_add() should clear the
flag before calling device_add(). So having FWNODE_FLAG_NOT_DEVICE cleared
in device_add() itself in that case makes sense.

What is your opinion ?

Also, feel free to ask for some more traces and/or logs if needed.

Best regards,
Hervé
--
Hervé Codina, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com