Re: [EXT] Re: [PATCH] driver core: improve cycle detection on fwnode graph

From: Saravana Kannan
Date: Mon Jan 29 2024 - 22:11:24 EST


On Fri, Jan 26, 2024 at 1:00 AM Xu Yang <xu.yang_2@xxxxxxx> wrote:
>
> Hi Saravana,
>
> >
> > On Wed, Jan 24, 2024 at 8:21 PM Xu Yang <xu.yang_2@xxxxxxx> wrote:
> > >
> > I think my confusion was because you use ----> in the opposite way to
> > what I have used for all my fw_devlink and cycle detection patches.
>
> Okay, I will follow the usage of "-->" later as yours.
>
> >
> > The part I was referring to is related to how driver/of/property.c has
> > node_not_dev set to true for pasrse_remote_endpoint.
> >
> > > >
> > > > 4. Has this actually caused an issue? If so, what is it? And give me
> > > > an example in an upstream DT.
> > >
> > > Yes, there are two cycles (B.EP->A->C->B and B.EP->A/A.EP->B) in above
> > > example. But only one cycle (B.EP->A->C->B) is recognized.
> > >
> > > My real case as below:
> >
> > I think you still missed some details because usb3_phy0 seems
>
> One line is indeed missing in usb3_phy0.
>
> > irrelevant here. Can you just point me to the dts (not dtsi) file for
> > this platform in the kernel tree?
>
> This parts of dts is not in upstream kernel tree due to some reasons.
> Allow me to show the necessary parts as below again, you can also
> get the full dts file from the link I attached below:
>
> ---
> ptn5110: tcpc@50 {
> compatible = "nxp,ptn5110";
> ...
>
> port {
> typec_dr_sw: endpoint {
> remote-endpoint = <&usb3_drd_sw>;
> };
> };
> };
>
> usb_dwc3_0: usb@38100000 {
> compatible = "snps,dwc3";
> phys = <&usb3_phy0>, <&usb3_phy0>;
> ...
>
> port {
> usb3_drd_sw: endpoint {
> remote-endpoint = <&typec_dr_sw>;
> };
> };
> };
>
> usb3_phy0: usb-phy@381f0040 {
> compatible = "fsl,imx8mp-usb-phy";
> vbus-power-supply = <&ptn5110>;
>
> ...
> };
>
> > Also, can you change all the pr_debug and dev_dbg in
> > drivers/base/core.c to their info equivalent and boot up the system
> > and give me the logs? That'll be a lot easier for me to understand
> > your case.
>
> Thank you for willing to debug this issue.
> The boot log and dts file is under:
> https://drive.google.com/drive/folders/1hlkzg042q5_b5l59DCW2pECXRmTH4Vy_?usp=sharing
>
> >
> > So let's say I see your logs and what you say is true, but you still
> > aren't telling me what's the problem you have because of this
> > incorrect cycle detection. What's breaking? Is something not allowed
> > to probe? If so, which one? What's supposed to be the right order of
> > probes?
> >
>
> Let me describe the issue again based on above log and dts:
>
> usb
> +-----+
> tcpc | |
> +-----+ | +--|
> | |----------->|EP|
> |--+ | | +--|
> |EP|<-----------| |
> |--+ | | B |
> | | +-----+
> | A | |
> +-----+ |
> ^ +-----+ |
> | | | |
> +-----| C |<--+
> | |
> +-----+
> usb-phy
>
> Node A (tcpc) will be populated as device 1-0050.
> Node B (usb) will be populated as device 38100000.usb.
> Node C (usb-phy) will be populated as device 381f0040.usb-phy.
>
> 1. Node C is populated as device C. But nodes A and B are still not
> populated. When do cycle detection with device C, no cycle is found.
> 2. Node B is populated as device B. When do cycle detection with device
> B, it found a fwnode link cycle B-->C-->A-->B.EP. Then, fwnode link
> A-->B.EP, C-->A and B-->C are marked as cycle. The fwnode link B-->C
> is converted to device link too.
> 3. Node A is populated as device A. When do cycle detection with device
> A, it find C-->A is marked as cycle and convert it to device link. It
> also find A-->B.EP is marked as cycle but will not convert it to device
> link since node B.EP is not a device.
>
> Finally, fwnode link B-->C and C-->A is removed, A-->B.EP is only marked
> as cycle and B-->A.EP is neither been marked as cycle nor removed.
>
> So there are 2 cycles and only the first cycle is detected.
> 1. B-->C-->A-->B.EP--B
> 2. B-->A.EP--A-->B.EP--B
>
> In the end, device 38100000.usb (node B) is defered probe due to node B
> still has a supplier node A.EP.
> Device 1-0050 (node A) is also defered probe due to it depends on one device
> which is created by 38100000.usb.
>
> The normal behavior is all of the devices can be successfully probed after two
> cycles are detected.
>

This took me several hours to debug this and I almost gave you the
"wrong" fix. A fix to create fwnode links between A --> and B --> A in
your example and remove EPs from the loop. But when typing up the
commit text, I realized what I was saying wasn't correct because this
cycle detection works fine if you don't have "C" in the example. Yet
again this bug comes down to my attempt to optimize some "unnecessary"
cycle detection logic that ended up being necessary.

Here's a test patch that I'm 99% sure will fix your issue. Please give
it a shot and let me know. After that, I need to run some more local
tests to make sure I'm not messing anything else up, clean up some
redundant logging, and then I can send a proper fix upstream.

diff --git a/drivers/base/core.c b/drivers/base/core.c
index 14d46af40f9a..75203ccc96f6 100644
--- a/drivers/base/core.c
+++ b/drivers/base/core.c
@@ -2060,7 +2060,6 @@ static int fw_devlink_create_devlink(struct device *con,
* SYNC_STATE_ONLY device links don't block probing and supports cycles.
* So cycle detection isn't necessary and shouldn't be done.
*/
- if (!(flags & DL_FLAG_SYNC_STATE_ONLY)) {
device_links_write_lock();
if (__fw_devlink_relax_cycles(con, sup_handle)) {
__fwnode_link_cycle(link);
@@ -2069,7 +2068,6 @@ static int fw_devlink_create_devlink(struct device *con,
sup_handle);
}
device_links_write_unlock();
- }

if (sup_handle->flags & FWNODE_FLAG_NOT_DEVICE)
sup_dev = fwnode_get_next_parent_dev(sup_handle);

Thanks,
Saravana