Re: [RFC] Fix stuck UCSI controller on DELL

From: Mario Limonciello
Date: Wed Jan 17 2024 - 12:34:57 EST


On 1/17/2024 00:35, Christian A. Ehrhardt wrote:

Hi Mario,

On Tue, Jan 16, 2024 at 09:00:03PM -0600, Mario Limonciello wrote:
On 1/15/2024 12:55, Christian A. Ehrhardt wrote:

Hi Heikki,

sorry to bother you again with this but I'm afraid there's
a misunderstanding wrt. the nature of the quirk. See below:

On Thu, Jan 04, 2024 at 01:59:02PM +0200, Heikki Krogerus wrote:
Hi Christian,

On Wed, Jan 03, 2024 at 11:06:35AM +0100, Christian A. Ehrhardt wrote:
I have a DELL Latitude 5431 where typec only works somewhat.
After the first plug/unplug event the PPM seems to be stuck and
commands end with a timeout (GET_CONNECTOR_STATUS failed (-110)).

This patch fixes it for me but according to my reading it is in
violation of the UCSI spec. On the other hand searching through
the net it appears that many DELL models seem to have timeout problems
with UCSI.

Do we want some kind of quirk here? There does not seem to be a quirk
framework for this part of the code, yet. Or is it ok to just send the
additional ACK in all cases and hope that the PPM will do the right
thing?

We can use DMI quirks. Something like the attached diff (not tested).

thanks,

--
heikki

diff --git a/drivers/usb/typec/ucsi/ucsi_acpi.c b/drivers/usb/typec/ucsi/ucsi_acpi.c
index 6bbf490ac401..7e8b1fcfa024 100644
--- a/drivers/usb/typec/ucsi/ucsi_acpi.c
+++ b/drivers/usb/typec/ucsi/ucsi_acpi.c
@@ -113,18 +113,44 @@ ucsi_zenbook_read(struct ucsi *ucsi, unsigned int offset, void *val, size_t val_
 	return 0;
 }
-static const struct ucsi_operations ucsi_zenbook_ops = {
-	.read = ucsi_zenbook_read,
-	.sync_write = ucsi_acpi_sync_write,
-	.async_write = ucsi_acpi_async_write
-};
+static int ucsi_dell_sync_write(struct ucsi *ucsi, unsigned int offset,
+				const void *val, size_t val_len)
+{
+	u64 ctrl = *(u64 *)val;
+	int ret;
+
+	ret = ucsi_acpi_sync_write(ucsi, offset, val, val_len);
+	if (ret && (ctrl & (UCSI_ACK_CC_CI | UCSI_ACK_CONNECTOR_CHANGE))) {
+		ctrl = UCSI_ACK_CC_CI | UCSI_ACK_COMMAND_COMPLETE;
+
+		dev_dbg(ucsi->dev->parent, "%s: ACK failed\n", __func__);
+		ret = ucsi_acpi_sync_write(ucsi, UCSI_CONTROL, &ctrl, sizeof(ctrl));
+	}

Unfortunately, this has the logic reversed. The quirk (i.e. the
additional UCSI_ACK_COMMAND_COMPLETE) is required after a _successful_
UCSI_ACK_CONNECTOR_CHANGE. Otherwise, _subsequent_ commands will time out
(usually the next GET_CONNECTOR_STATUS).

This means the quirk must be applied _before_ we detect any failure.
Consequently, the quirk has the potential to break working systems.
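
To make the ordering concrete, here is a minimal userspace sketch. It is
not the kernel driver: struct mock_ppm, mock_sync_write() and
quirk_sync_write() are hypothetical names for illustration, and the
constant values are assumed to mirror drivers/usb/typec/ucsi/ucsi.h.
The mock models the behavior described above: the connector change ACK
itself succeeds, and only the following command times out.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror drivers/usb/typec/ucsi/ucsi.h; verify against the tree. */
#define UCSI_ACK_CC_CI             0x04ULL
#define UCSI_GET_CONNECTOR_STATUS  0x12ULL
#define UCSI_ACK_CONNECTOR_CHANGE  (1ULL << 16)
#define UCSI_ACK_COMMAND_COMPLETE  (1ULL << 17)
#define UCSI_COMMAND(cmd)          ((cmd) & 0xffULL)

/* Hypothetical model of the misbehaving PPM, for illustration only. */
struct mock_ppm {
	bool stuck;	/* set by a connector change ACK, cleared by a command complete ACK */
	int writes;	/* number of commands the PPM has received */
};

/* Stand-in for ucsi_acpi_sync_write(): 0 on success, -ETIMEDOUT while stuck. */
static int mock_sync_write(struct mock_ppm *ppm, uint64_t cmd)
{
	bool is_ack = UCSI_COMMAND(cmd) == UCSI_ACK_CC_CI;

	assert(ppm);
	ppm->writes++;
	if (is_ack && (cmd & UCSI_ACK_COMMAND_COMPLETE)) {
		ppm->stuck = false;
		return 0;
	}
	if (ppm->stuck)
		return -ETIMEDOUT;
	if (is_ack && (cmd & UCSI_ACK_CONNECTOR_CHANGE))
		ppm->stuck = true;	/* the ACK succeeds; the next command hangs */
	return 0;
}

/* Quirk: follow a successful connector change ACK with a command complete ACK. */
static int quirk_sync_write(struct mock_ppm *ppm, uint64_t cmd)
{
	bool conn_ack = UCSI_COMMAND(cmd) == UCSI_ACK_CC_CI &&
			(cmd & UCSI_ACK_CONNECTOR_CHANGE);
	int ret = mock_sync_write(ppm, cmd);

	if (ret || !conn_ack)
		return ret;	/* failed, or not the command the quirk cares about */

	return mock_sync_write(ppm, UCSI_ACK_CC_CI | UCSI_ACK_COMMAND_COMPLETE);
}
```

Note that the extra ACK is sent unconditionally here, before any failure
is visible, which is exactly why it could upset a PPM that behaves
correctly.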

Sorry if that wasn't clear from my original mail. Please let me know
if this changes how you want the quirks handled.

Thanks Christian


For the problematic scenario have you tried to play with it a bit to see if
it's too short of a timeout (raise timeout) or to output the response bits
to see if anything else surprising is sent?

It is not a problem with the timeout. Waiting forever in this case
doesn't help. IMHO this is actually a bug in the PPM, i.e. in Dell's
BIOS.

"Usually" the PD controller F/W is distributed with the EC, but yes Dell nominally puts everything in a monolithic BIOS package.


Sending an ack after the timeout fixes things, though.

Does it always fail on the same command, or does it happen to a bunch of
them?

It always fails on the first command after UCSI_ACK_CC_CI for a
connector change. However, there might be no such command if the
next event is a notification.

I did play around with it a bit more and came up with a way to
probe for the issue:

https://lore.kernel.org/all/20240116224041.220740-1-lk@xxxxxxx/

If some variation of your probe-able workaround is picked up, I think it's worth making noise when the probe triggers (dev_warn or dev_notice) to say that it is being used to work around a PPM bug.
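
As a rough illustration of "probe, then make noise", here is a userspace
sketch. It is one conceivable approach under stated assumptions, not
necessarily what the linked patch does: after the first successful
connector change ACK it sends a harmless command, and only if that times
out does it log a warning and enable the extra ACK. struct mock_ppm,
struct quirk_state and probing_sync_write() are hypothetical names,
fprintf() stands in for dev_warn(), and the constant values are assumed
to mirror drivers/usb/typec/ucsi/ucsi.h.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Assumed to mirror drivers/usb/typec/ucsi/ucsi.h; verify against the tree. */
#define UCSI_ACK_CC_CI             0x04ULL
#define UCSI_GET_CAPABILITY        0x06ULL
#define UCSI_ACK_CONNECTOR_CHANGE  (1ULL << 16)
#define UCSI_ACK_COMMAND_COMPLETE  (1ULL << 17)
#define UCSI_COMMAND(cmd)          ((cmd) & 0xffULL)

/* Hypothetical PPM model: only a buggy one gets stuck after a connector change ACK. */
struct mock_ppm {
	bool buggy;
	bool stuck;
};

/* Hypothetical per-device quirk state. */
struct quirk_state {
	bool probed;	/* the probe has run once */
	bool active;	/* the probe found the bug; send the extra ACK from now on */
};

/* Stand-in for ucsi_acpi_sync_write(): 0 on success, -ETIMEDOUT while stuck. */
static int mock_sync_write(struct mock_ppm *ppm, uint64_t cmd)
{
	bool is_ack = UCSI_COMMAND(cmd) == UCSI_ACK_CC_CI;

	if (is_ack && (cmd & UCSI_ACK_COMMAND_COMPLETE)) {
		ppm->stuck = false;
		return 0;
	}
	if (ppm->stuck)
		return -ETIMEDOUT;
	if (ppm->buggy && is_ack && (cmd & UCSI_ACK_CONNECTOR_CHANGE))
		ppm->stuck = true;
	return 0;
}

static int probing_sync_write(struct mock_ppm *ppm, struct quirk_state *q,
			      uint64_t cmd)
{
	const uint64_t ack = UCSI_ACK_CC_CI | UCSI_ACK_COMMAND_COMPLETE;
	bool conn_ack = UCSI_COMMAND(cmd) == UCSI_ACK_CC_CI &&
			(cmd & UCSI_ACK_CONNECTOR_CHANGE);
	int ret;

	assert(ppm && q);
	ret = mock_sync_write(ppm, cmd);
	if (ret || !conn_ack)
		return ret;

	if (!q->probed) {
		q->probed = true;
		/* Probe once with a harmless command right after the ACK. */
		ret = mock_sync_write(ppm, UCSI_GET_CAPABILITY);
		if (ret == 0)
			return mock_sync_write(ppm, ack); /* healthy: regular ACK of the probe */
		if (ret != -ETIMEDOUT)
			return ret;
		q->active = true;
		fprintf(stderr, "firmware bug: extra ACK needed after connector change ACK, enabling workaround\n");
	}

	return q->active ? mock_sync_write(ppm, ack) : 0;
}
```

With this shape a well-behaved PPM only ever sees one additional
GET_CAPABILITY round trip, and the warning is printed once per device
rather than on every connector change.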


regards Christian



+ Dell Client Kernel mailbox

Dell team,

Can you look into this? It sounds like it should be investigated more closely to see where the mismatch between the spec and the real behavior actually lies.