Re: [PATCH v12 1/1] serial: core: Start managing serial controllers to enable runtime PM

From: Tony Lindgren
Date: Fri Jun 02 2023 - 05:29:11 EST


Hi,

* Chen-Yu Tsai <wenst@xxxxxxxxxxxx> [230602 08:33]:
> This patch, in linux-next since 20230601, unfortunately breaks MediaTek
> based Chromebooks. The kernel hangs during the probe of the serial ports,
> which use the 8250_mtk driver. This happens even with the subsequent
> fixes in next-20230602 and on the mailing list:
>
> serial: core: Fix probing serial_base_bus devices
> serial: core: Don't drop port_mutex in serial_core_remove_one_port
> serial: core: Fix error handling for serial_core_ctrl_device_add()

OK thanks for reporting it.

> Without the fixes, the kernel gives "WARNING: bad unlock balance detected!"
> With the fixes, it just silently hangs. The last messages seen on the
> (serial) console are:
>
> Serial: 8250/16550 driver, 4 ports, IRQ sharing disabled
> printk: console [ttyS0] disabled
> mt6577-uart 11002000.serial: using DT '/soc/serial@11002000' for 'rs485-term' GPIO lookup
> of_get_named_gpiod_flags: can't parse 'rs485-term-gpios' property of node '/soc/serial@11002000[0]'
> of_get_named_gpiod_flags: can't parse 'rs485-term-gpio' property of node '/soc/serial@11002000[0]'
> mt6577-uart 11002000.serial: using lookup tables for GPIO lookup
> mt6577-uart 11002000.serial: No GPIO consumer rs485-term found
> mt6577-uart 11002000.serial: using DT '/soc/serial@11002000' for 'rs485-rx-during-tx' GPIO lookup
> of_get_named_gpiod_flags: can't parse 'rs485-rx-during-tx-gpios' property of node '/soc/serial@11002000[0]'
> of_get_named_gpiod_flags: can't parse 'rs485-rx-during-tx-gpio' property of node '/soc/serial@11002000[0]'
> mt6577-uart 11002000.serial: using lookup tables for GPIO lookup
> mt6577-uart 11002000.serial: No GPIO consumer rs485-rx-during-tx found
>
> What can we do to help resolve this?

There may be something blocking serial_ctrl and serial_port from
probing. That was the issue with the drivers that use arch_initcall().

Not sure yet what the issue here might be, but 8250_mtk should be a
fairly similar use case to the 8250_omap driver that I've tested with.
Unfortunately I don't think I have any devices using 8250_mtk to
test with.

The following hack may let you see more info on what goes wrong. It
should also let you add some debug printk to serial_base_match(), for
example, to see if that gets called for mt6577-uart.

Hmm, maybe early_mtk8250_setup() somehow triggers the issue? Not sure why
early_serial8250_setup() would cause issues here though.
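
For the serial_base_match() debug print mentioned above, here is a rough
user-space sketch of the idea, not kernel code. It assumes the match is a
prefix comparison of the device name against the driver name, which should
be checked against serial_base_bus.c. The demo_* types and the function
name are hypothetical stand-ins, and in the kernel the fprintf() would be
a pr_info() or dev_info() call.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-ins, not the kernel's struct device/device_driver. */
struct demo_device { const char *name; };
struct demo_driver { const char *name; };

/*
 * Model of a bus match callback with a debug trace added.
 * ASSUMPTION: a device matches when its name starts with the driver name.
 */
static int demo_base_match(const struct demo_device *dev,
			   const struct demo_driver *drv)
{
	size_t len;
	int matched;

	assert(dev && drv);
	len = strlen(drv->name);
	matched = !strncmp(dev->name, drv->name, len);

	/* Trace every call so a missing match attempt is visible in the log. */
	fprintf(stderr, "%s: dev=%s drv=%s matched=%d\n",
		__func__, dev->name, drv->name, matched);

	return matched;
}
```

If no trace line shows up for the mt6577-uart child devices at all, that
would suggest the match callback is never reached for them, rather than
the match itself failing.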

Regards,

Tony

8< -----------------
diff --git a/drivers/tty/serial/serial_core.c b/drivers/tty/serial/serial_core.c
--- a/drivers/tty/serial/serial_core.c
+++ b/drivers/tty/serial/serial_core.c
@@ -144,7 +144,7 @@ static void __uart_start(struct tty_struct *tty)
 		return;
 
 	port_dev = port->port_dev;
-
+#if 0
 	/* Increment the runtime PM usage count for the active check below */
 	err = pm_runtime_get(&port_dev->dev);
 	if (err < 0) {
@@ -161,6 +161,9 @@ static void __uart_start(struct tty_struct *tty)
 		port->ops->start_tx(port);
 	pm_runtime_mark_last_busy(&port_dev->dev);
 	pm_runtime_put_autosuspend(&port_dev->dev);
+#else
+	port->ops->start_tx(port);
+#endif
 }
 
 static void uart_start(struct tty_struct *tty)
--
2.41.0