Re: [PATCH v2 2/2] Revert "usb: dwc3: Don't switch OTG -> peripheral if extcon is present"

From: Ferry Toth
Date: Tue Oct 18 2022 - 16:48:07 EST


Hi,

On 17-10-2022 at 23:20, Andrey Smirnov wrote:
On Sun, Oct 16, 2022 at 1:59 PM Ferry Toth <fntoth@xxxxxxxxx> wrote:

On 15-10-2022 at 21:54, Andrey Smirnov wrote:
On Thu, Oct 13, 2022 at 12:35 PM Ferry Toth <fntoth@xxxxxxxxx> wrote:
<SNIP>
My end goal here is to find a way to test vanilla v6.0 with the two
patches reverted on your end. I thought that during my testing I saw
tusb1210 print those timeout messages during its probe and that
disabling the driver worked to break the loop, but I went back to
double check and it doesn't work so scratch that idea. Configuring
extcon as a built-in breaks host functionality with or without patches
on my end, so I'm not sure it could be a path.

I won't have time to try things with
0043b-TODO-driver-core-Break-infinite-loop-when-deferred-p.patch until
the weekend, meanwhile can you give this diff a try with vanilla (no
reverts) v6.0:

OK, got a chance to try things with that patch. Both v6.0 and v6.0
with my patches reverted work the same, my Kingston DataTraveller USB
stick enumerates and works as expected.

In other words, you don't need the patch at all to get USB to work. There
has to be a difference in our configs.

My patch? Yeah, it should have zero effect on anything.
!DWC3_VER_IS_PRIOR(DWC3, 330A) is false for Merrifield, so the logical
change from my patch is a no-op. It's a pure coincidence that it
resolved the probe loop that
0043b-TODO-driver-core-Break-infinite-loop-when-deferred-p.patch is
for.

Did you have a chance to look at mine (here:
https://drive.google.com/file/d/1aKJWMqiAXnReeLCvxshzjKwGxIWQ7eJk/view?usp=sharing)

Else, send me yours.

I've been using your config in all of the testing.

modified drivers/phy/ti/phy-tusb1210.c
@@ -127,6 +127,7 @@ static int tusb1210_set_mode(struct phy *phy, enum
phy_mode mode, int submode)
u8 reg;

ret = tusb1210_ulpi_read(tusb, ULPI_OTG_CTRL, &reg);
+ WARN_ON(ret < 0);
if (ret < 0)
return ret;

@@ -152,7 +153,10 @@ static int tusb1210_set_mode(struct phy *phy,
enum phy_mode mode, int submode)
}

tusb->otg_ctrl = reg;
- return tusb1210_ulpi_write(tusb, ULPI_OTG_CTRL, reg);
+ ret = tusb1210_ulpi_write(tusb, ULPI_OTG_CTRL, reg);
+ WARN_ON(ret < 0);
+ return ret;
+
}

#ifdef CONFIG_POWER_SUPPLY

? I'm curious to see if there's masked errors on your end since dwc3
driver doesn't check for those.
root@yuna:~# dmesg | grep -i -E 'warn|assert|error|tusb|dwc3'
8250_mid: probe of 0000:00:04.0 failed with error -16
platform regulatory.0: Direct firmware load for regulatory.db failed
with error -2
brcmfmac mmc2:0001:1: Direct firmware load for
brcm/brcmfmac43340-sdio.Intel Corporation-Merrifield.bin failed with
error -2
sof-audio-pci-intel-tng 0000:00:0d.0: error: I/O region is too small.
sof-audio-pci-intel-tng 0000:00:0d.0: error: failed to probe DSP -19


This is done through configfs only when the switch is set to device mode.
Sure, but can it be disabled? We are looking for unknown variables, so
excluding this would be a reasonable thing to do.
It's not enabled until I flip the switch to device mode.
OK to cut this back and forth short, I think it'd be easier to just
ask you to run what I run. Here's vanilla v6.0 bzImage and initrd
(built with your config + CONFIG_PHY_TUSB1210=y) I tested with
What do you mean by this? My config is with

CONFIG_GENERIC_PHY=y
CONFIG_PHY_TUSB1210=y

$ cat config-6.0.0-edison-acpi-standard | grep 1210
# CONFIG_PHY_TUSB1210 is not set
$ md5sum config-6.0.0-edison-acpi-standard
3c989c708302c1f9e73c6113e71aed9d config-6.0.0-edison-acpi-standard

I had to manually enable it, that's what I meant by my comment.

Unbelievable, it seems I uploaded the wrong config. I just double-checked to see whether there are any other differences:

scripts/diffconfig config-6.0.0-edison-acpi-standard-bad config-6.0.0-edison-acpi-standard-good
 GENERIC_PHY n -> y
 PHY_TUSB1210 n -> y
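
A quick way to catch this kind of mix-up before building is to check the config file for the two options directly. The following is only a minimal sketch: the helper names config_builtin and check_phy_config are made up for illustration, and the option list is simply the two entries from the diffconfig output above.

```shell
#!/bin/sh
# Hypothetical helper: succeed if OPTION is built in (=y) in a kernel config.
# $1 = config file, $2 = option name without the CONFIG_ prefix
config_builtin() {
    grep -q "^CONFIG_$2=y\$" "$1"
}

# Hypothetical helper: check the two options that differed between the
# "bad" and "good" configs and print each one that is not built in.
# Returns non-zero if at least one is missing.
check_phy_config() {
    missing=0
    for opt in GENERIC_PHY PHY_TUSB1210; do
        if ! config_builtin "$1" "$opt"; then
            echo "CONFIG_$opt is not built in"
            missing=1
        fi
    done
    return "$missing"
}
```

Running check_phy_config config-6.0.0-edison-acpi-standard against the wrongly uploaded file would print at least the PHY_TUSB1210 line, since that file contains "# CONFIG_PHY_TUSB1210 is not set".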


https://drive.google.com/drive/folders/1H28AL1coPPZ2kLTYskDuDdWo-oE7DRPH?usp=sharing
let's see how it behaves on your setup. There's also the U-Boot binary
Ok, it's getting weirder and weirder. The following is with my U-Boot
and your kernel/initrd

1) I placed them in /boot which is on my btrfs partition on the emmc (my
U-Boot has btrfs enabled)

Linux kernel version 6.0.0-edison-acpi-standard
(andreysm@neptunefw-builder) #8 SMP PREEMPT_DYNAMIC Sat Oct 15 18:47:19
UTC 2022
Building boot_params at 0x00090000
Loading bzImage at address 100000 (12064480 bytes)
Initial RAM disk at linear address 0x06000000, size 25165824 bytes
Kernel command line: "quiet root=/dev/mmcblk0p8
rootflags=subvol=@,compress=lzo rootfstype=btrfs console=ttyS2,115200n8
earlyprintk=ttyS2,115200n8,keep loglevel=4 systemd.unit=multi-user.target"
Kernel loaded at 00100000, setup_base=00090000

You shouldn't be using root from your storage since:
a) the initrd I uploaded is self-contained, it doesn't need anything else

Yes I know. With the Yocto image we build our own that does switchroot.

Here I am inside your buildroot initrd; no filesystems from the eMMC are mounted. According to dmesg, the btrfs module is loaded later than dwc3 and scans (finds) the btrfs partition in all cases without mounting it.

b) your local data is another variable we don't want to introduce

just "rootfstype=ramfs" should be enough for this and

root=/dev/mmcblk0p8 rootflags=subvol=@,compress=lzo rootfstype=btrfs

should be dropped.
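
The command-line change described here can be sketched as a small shell helper. This is an illustration only: ramfs_cmdline is a hypothetical name, not something provided by U-Boot or the kernel, and it assumes that no argument contains spaces or shell glob characters.

```shell
#!/bin/sh
# Hypothetical helper: remove root=, rootflags= and rootfstype= from a kernel
# command line and append rootfstype=ramfs, leaving all other arguments
# untouched and in their original order.
# $1 = the full command line as one string
ramfs_cmdline() {
    out=""
    for arg in $1; do                 # intentional word splitting on spaces
        case "$arg" in
            root=*|rootflags=*|rootfstype=*) ;;   # drop storage-related options
            *) out="${out:+$out }$arg" ;;
        esac
    done
    printf '%s\n' "${out:+$out }rootfstype=ramfs"
}
```

Applied to the command line from boot 1), this keeps quiet, console=, earlyprintk=, loglevel= and systemd.unit= as they are, removes the three root-related arguments, and ends with rootfstype=ramfs.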

After some experimenting, it appears that "rootfstype=btrfs" causes the buildroot rootfs to fail probing tusb1210.

I think you should be able to reproduce this.

However, changing to "rootfstype=ramfs" for my (Yocto) image (which, now that I think about it, is probably the right thing to do) does not resolve the failure to probe tusb1210. Comparing its dmesg with the buildroot one shows that in my case a lot of stuff happens prior to dwc3:

- raid6 does speed testing (this is used by btrfs)
- btrfs is loaded
- sdhci probed
- acpi tables (for edison-arduino) loaded into configfs
- external gpio muxes setup
- finally xhci (tusb1210 is before this on the buildroot image)



The USB drive is not detected, regardless of whether I boot with the stick
plugged in or plug it in later.

# lsusb
Bus 001 Device 001: ID 1d6b:0002
Bus 002 Device 001: ID 1d6b:0003

No TUSB1210 probed

# dmesg | grep dwc3
#

2) I placed them in my vfat rescue partition

Linux kernel version 6.0.0-edison-acpi-standard
(andreysm@neptunefw-builder) #8 SMP PREEMPT_DYNAMIC Sat Oct 15 18:47:19
UTC 2022
Building boot_params at 0x00090000
Loading bzImage at address 100000 (12064480 bytes)
Initial RAM disk at linear address 0x06000000, size 25165824 bytes
Kernel command line: "debugshell=0 tty1 console=ttyS2,115200n8
root=/dev/mmcblk0p7 rootfstype=vfat systemd.unit=multi-user.target"
Kernel loaded at 00100000, setup_base=00090000

The USB drive is detected.
Yep, that's exactly my point about extra variables. So it looks like
something in your root btrfs partition is triggering this issue. I
don't really know the contents of your root file system, so don't
really have any suggestions there. Maybe old kernel modules are
getting picked up? Or something else is interfering ¯\_(ツ)_/¯

# lsusb
Bus 001 Device 001: ID 1d6b:0002
Bus 001 Device 002: ID 125f:312b
Bus 002 Device 001: ID 1d6b:0003

TUSB1210 probed

# dmesg | grep dwc3
[ 8.605845] tusb1210 dwc3.0.auto.ulpi: GPIO lookup for consumer reset
[ 8.605876] tusb1210 dwc3.0.auto.ulpi: using ACPI for GPIO lookup
[ 8.605927] tusb1210 dwc3.0.auto.ulpi: using lookup tables for GPIO
lookup
[ 8.605941] tusb1210 dwc3.0.auto.ulpi: No GPIO consumer reset found
[ 8.605956] tusb1210 dwc3.0.auto.ulpi: GPIO lookup for consumer cs
[ 8.605970] tusb1210 dwc3.0.auto.ulpi: using ACPI for GPIO lookup
[ 8.606011] tusb1210 dwc3.0.auto.ulpi: using lookup tables for GPIO
lookup
[ 8.606024] tusb1210 dwc3.0.auto.ulpi: No GPIO consumer cs found
[ 8.669317] tusb1210 dwc3.0.auto.ulpi: error -110 writing val 0x41 to
reg 0x80

## note: options debugshell, root and rootfstype are normally handled by
a script in my initrd, so I guess they are no-ops here.

I use in that folder in case you want to give it a try.

Now on Merrifield dwc3_get_extcon() doesn't do anything but call
extcon_get_extcon_dev() which doesn't touch any hardware or interact
with other drivers, so assuming

So current v6.0 has: dwc3_get_extcon - dwc3_get_dr_mode - ... -
dwc3_core_init - .. - dwc3_core_init_mode (not working)

I changed to: dwc3_get_dr_mode - dwc3_get_extcon - .. - dwc3_core_init -
.. - dwc3_core_init_mode (no change)

Then to: dwc3_get_dr_mode - .. - dwc3_core_init - .. - dwc3_get_extcon -
dwc3_core_init_mode (works)
still holds(did you double check that with vanilla v6.0?) the only
I didn't check
difference that I can see is execution timings. It seems to me it's
either an extra delay added by execution of extcon_get_extcon_dev()
(unlikely) or multiple partial probes that include dwc3_core_init()
that change things. You can try to check the latter by adding an
artificial probe deferral point after dwc3_core_init(). Something like
(didn't test this):

modified drivers/usb/dwc3/core.c
@@ -1860,6 +1860,10 @@ static int dwc3_probe(struct platform_device *pdev)
goto err3;

ret = dwc3_core_init(dwc);
+ static int deferral_counter = 0;
+ if (deferral_counter++ < 9) /* I counted 9 deferrals in my testing */
+ ret = -EPROBE_DEFER;
+
if (ret) {
dev_err_probe(dev, ret, "failed to initialize core\n");
goto err4;
Not sure how you wanted this tested, so I assumed vanilla, booting from
btrfs on eMMC. It crashes, but maybe the trace is useful. After the crash it
continues, but no USB appears at all.

I think you'll have to experiment with that code placement to emulate
a deferred probe for the old location of "get extcon". I'd focus on
figuring out the root filesystem variable first before trying to get
this to work.

Yes, I did that as described above. I think that "rootfstype=btrfs" causes some ordering issue, as if xhci goes too soon. It goes before:

spi_master spi5: GPIO lookup for consumer cs

while tusb1210 when it does probe starts with:

tusb1210 dwc3.0.auto.ulpi: GPIO lookup for consumer reset

and xhci follows later.
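
To compare the ordering between boots without reading the logs by eye, something like the following could be run on a saved dmesg. It is only a sketch: first_seen and probe_order are hypothetical helper names, the match is a plain case-insensitive substring match, and it only tells which driver logs first, not when a probe actually completes.

```shell
#!/bin/sh
# Hypothetical helper: print the line number of the first line in a saved
# dmesg log that contains TEXT (case-insensitive substring match).
# Prints nothing if TEXT never appears.
# $1 = log file, $2 = text to look for
first_seen() {
    awk -v p="$2" 'index(tolower($0), tolower(p)) { print NR; exit }' "$1"
}

# Hypothetical helper: report which of two drivers shows up first in the log.
# $1 = log file, $2 and $3 = driver names to compare
probe_order() {
    a=$(first_seen "$1" "$2")
    b=$(first_seen "$1" "$3")
    if [ -z "$a" ] || [ -z "$b" ]; then
        echo "missing"              # at least one driver never logged anything
    elif [ "$a" -lt "$b" ]; then
        echo "$2 first"
    else
        echo "$3 first"
    fi
}
```

After saving the log with dmesg > dmesg.txt, calling probe_order dmesg.txt tusb1210 xhci on both the buildroot and the Yocto boot would show whether the order differs between the two.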

To be explicit, at this point I don't think the revert is really
warranted. I'm also happy to reply/help you with suggestions, but you
are going to have to start driving this.

I agree that a revert on the grounds of a "regression" cannot be justified here, as dwc3 on Merrifield never worked without an out-of-tree patch. And your patch makes that out-of-tree patch obsolete, which is a good thing.

But I do think your patch is exposing an older issue that makes dwc3 sensitive to ordering. I would very much appreciate it if you could try "rootfstype=btrfs" to reproduce. I think it would be a good thing to resolve it, so that the effort here has not been for nothing.

My next step will be to move the code placement around as you suggest. (I can only spend a few hours in the evenings on this, as it is not my day job, which explains why I may be a bit slow to respond here.)