RE: [EXT] Re: [PATCH] driver core: improve cycle detection on fwnode graph

From: Xu Yang
Date: Mon Jan 29 2024 - 22:40:22 EST


Hi Saravana,

>
> On Fri, Jan 26, 2024 at 1:00 AM Xu Yang <xu.yang_2@xxxxxxx> wrote:
> >
> > Hi Saravana,
> >
> > >
> > > On Wed, Jan 24, 2024 at 8:21 PM Xu Yang <xu.yang_2@xxxxxxx> wrote:
> > > >
> > > I think my confusion was because you use ----> in the opposite way to
> > > what I have used for all my fw_devlink and cycle detection patches.
> >
> > Okay, I will follow the usage of "-->" later as yours.
> >
> > >
> > > The part I was referring to is related to how driver/of/property.c has
> > > node_not_dev set to true for pasrse_remote_endpoint.
> > >
> > > > >
> > > > > 4. Has this actually caused an issue? If so, what is it? And give me
> > > > > an example in an upstream DT.
> > > >
> > > > Yes, there are two cycles (B.EP->A->C->B and B.EP->A/A.EP->B) in above
> > > > example. But only one cycle (B.EP->A->C->B) is recognized.
> > > >
> > > > My real case as below:
> > >
> > > I think you still missed some details because usb3_phy0 seems
> >
> > One line is indeed missing in usb3_phy0.
> >
> > > irrelevant here. Can you just point me to the dts (not dtsi) file for
> > > this platform in the kernel tree?
> >
> > This parts of dts is not in upstream kernel tree due to some reasons.
> > Allow me to show the necessary parts as below again, you can also
> > get the full dts file from the link I attached below:
> >
> > ---
> > ptn5110: tcpc@50 {
> > compatible = "nxp,ptn5110";
> > ...
> >
> > port {
> > typec_dr_sw: endpoint {
> > remote-endpoint = <&usb3_drd_sw>;
> > };
> > };
> > };
> >
> > usb_dwc3_0: usb@38100000 {
> > compatible = "snps,dwc3";
> > phys = <&usb3_phy0>, <&usb3_phy0>;
> > ...
> >
> > port {
> > usb3_drd_sw: endpoint {
> > remote-endpoint = <&typec_dr_sw>;
> > };
> > };
> > };
> >
> > usb3_phy0: usb-phy@381f0040 {
> > compatible = "fsl,imx8mp-usb-phy";
> > vbus-power-supply = <&ptn5110>;
> >
> > ...
> > };
> >
> > > Also, can you change all the pr_debug and dev_dbg in
> > > drivers/base/core.c to their info equivalent and boot up the system
> > > and give me the logs? That'll be a lot easier for me to understand
> > > your case.
> >
> > Thank you for willing to debug this issue.
> > The boot log and dts file is under:
> >
> https://drive.google.com/drive/folders/1hlkzg042
> q5_b5l59DCW2pECXRmTH4Vy_%3Fusp%3Dsharing&data=05%7C02%7Cxu.yang_2%40nxp.com%7Ca3d2f5c60e58402ba7a30
> 8dc21411d1b%7C686ea1d3bc2b4c6fa92cd99c5c301635%7C0%7C0%7C638421810737996636%7CUnknown%7CTWFpbGZsb3d
> 8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C0%7C%7C%7C&sdata=yhw729d%2F5%2
> BKwoHcpjb0Bqcv%2BhhtGEQ75zE0N2d2Agac%3D&reserved=0
> >
> > >
> > > So let's say I see your logs and what you say is true, but you still
> > > aren't telling me what's the problem you have because of this
> > > incorrect cycle detection. What's breaking? Is something not allowed
> > > to probe? If so, which one? What's supposed to be the right order of
> > > probes?
> > >
> >
> > Let me describe the issue again based on above log and dts:
> >
> > usb
> > +-----+
> > tcpc | |
> > +-----+ | +--|
> > | |----------->|EP|
> > |--+ | | +--|
> > |EP|<-----------| |
> > |--+ | | B |
> > | | +-----+
> > | A | |
> > +-----+ |
> > ^ +-----+ |
> > | | | |
> > +-----| C |<--+
> > | |
> > +-----+
> > usb-phy
> >
> > Node A (tcpc) will be populated as device 1-0050.
> > Node B (usb) will be populated as device 38100000.usb.
> > Node C (usb-phy) will be populated as device 381f0040.usb-phy.
> >
> > 1. Node C is populated as device C. But nodes A and B are still not
> > populated. When do cycle detection with device C, no cycle is found.
> > 2. Node B is populated as device B. When do cycle detection with device
> > B, it found a fwnode link cycle B-->C-->A-->B.EP. Then, fwnode link
> > A-->B.EP, C-->A and B-->C are marked as cycle. The fwnode link B-->C
> > is converted to device link too.
> > 3. Node A is populated as device A. When do cycle detection with device
> > A, it find C-->A is marked as cycle and convert it to device link. It
> > also find A-->B.EP is marked as cycle but will not convert it to device
> > link since node B.EP is not a device.
> >
> > Finally, fwnode link B-->C and C-->A is removed, A-->B.EP is only marked
> > as cycle and B-->A.EP is neither been marked as cycle nor removed.
> >
> > So there are 2 cycles and only the first cycle is detected.
> > 1. B-->C-->A-->B.EP--B
> > 2. B-->A.EP--A-->B.EP--B
> >
> > In the end, device 38100000.usb (node B) is defered probe due to node B
> > still has a supplier node A.EP.
> > Device 1-0050 (node A) is also defered probe due to it depends on one device
> > which is created by 38100000.usb.
> >
> > The normal behavior is all of the devices can be successfully probed after two
> > cycles are detected.
> >
>
> This took me several hours to debug this and I almost gave you the
> "wrong" fix. A fix to create fwnode links between A --> and B --> A in
> your example and remove EPs from the loop. But when typing up the
> commit text, I realized what I was saying wasn't correct because this
> cycle detection works fine if you don't have "C" in the example. Yet
> again this bug comes down to my attempt to optimize some "unnecessary"
> cycle detection logic that ended up being necessary.
>
> Here's a test patch that I'm 99% sure will fix your issue. Please give
> it a shot and let me know. After that, I need to run some more local
> tests to make sure I'm not messing anything else up, clean up some
> redundant logging, and then I can send a proper fix upstream.
>
> diff --git a/drivers/base/core.c b/drivers/base/core.c
> index 14d46af40f9a..75203ccc96f6 100644
> --- a/drivers/base/core.c
> +++ b/drivers/base/core.c
> @@ -2060,7 +2060,6 @@ static int fw_devlink_create_devlink(struct device *con,
> * SYNC_STATE_ONLY device links don't block probing and supports cycles.
> * So cycle detection isn't necessary and shouldn't be done.
> */
> - if (!(flags & DL_FLAG_SYNC_STATE_ONLY)) {
> device_links_write_lock();
> if (__fw_devlink_relax_cycles(con, sup_handle)) {
> __fwnode_link_cycle(link);
> @@ -2069,7 +2068,6 @@ static int fw_devlink_create_devlink(struct device *con,
> sup_handle);
> }
> device_links_write_unlock();
> - }
>
> if (sup_handle->flags & FWNODE_FLAG_NOT_DEVICE)
> sup_dev = fwnode_get_next_parent_dev(sup_handle);
>

It works now. All of these devices are probed correctly on my board.
Thanks for your input!

Best Regards,
Xu Yang