Re: [PATCH v3 1/3] usb: ucsi: Add missing ppm_lock

From: Heikki Krogerus
Date: Wed Jan 24 2024 - 08:08:46 EST


On Wed, Jan 24, 2024 at 01:04:06PM +0100, Christian A. Ehrhardt wrote:
>
> Hi Heikki,
>
> Thanks for looking into this.
>
> On Wed, Jan 24, 2024 at 10:09:04AM +0200, Heikki Krogerus wrote:
> > On Sun, Jan 21, 2024 at 09:41:21PM +0100, Christian A. Ehrhardt wrote:
> > > Calling ->sync_write must be done while holding the PPM lock as
> > > the mailbox logic does not support concurrent commands.
> > >
> > > At least since the addition of partner task this means that
> > > ucsi_acknowledge_connector_change should be called with the
> > > PPM lock held as it calls ->sync_write.
> > >
> > > Thus protect the only call to ucsi_acknowledge_connector_change
> > > with the PPM. All other calls to ->sync_write already happen
> > > under the PPM lock.
> > >
> > > Fixes: b9aa02ca39a4 ("usb: typec: ucsi: Add polling mechanism for partner tasks like alt mode checking")
> > > Cc: stable@xxxxxxxxxxxxxxx
> > > Signed-off-by: Christian A. Ehrhardt <lk@xxxxxxx>
> > > ---
> > > drivers/usb/typec/ucsi/ucsi.c | 2 ++
> > > 1 file changed, 2 insertions(+)
> > >
> > > diff --git a/drivers/usb/typec/ucsi/ucsi.c b/drivers/usb/typec/ucsi/ucsi.c
> > > index 61b64558f96c..8f9dff993b3d 100644
> > > --- a/drivers/usb/typec/ucsi/ucsi.c
> > > +++ b/drivers/usb/typec/ucsi/ucsi.c
> > > @@ -935,7 +935,9 @@ static void ucsi_handle_connector_change(struct work_struct *work)
> > >
> > > clear_bit(EVENT_PENDING, &con->ucsi->flags);
> > >
> > > + mutex_lock(&ucsi->ppm_lock);
> > > ret = ucsi_acknowledge_connector_change(ucsi);
> > > + mutex_unlock(&ucsi->ppm_lock);
> > > if (ret)
> > > dev_err(ucsi->dev, "%s: ACK failed (%d)", __func__, ret);
> >
> > Is this really actually fixing some issue? The function has already
> > taken the connector lock, so what exactly could happen without this?
>
> I've definitely _seen_ issues with the quirk from PATCH 3/3 if the
> lock is missing. I'm pretty sure from looking at the code that races
> with other connectors can happen without the quirk, too.
>
> The PPM lock protects the Mailbox and the ACK_PENDING/COMMAND_PENDING
> bits and I could observe cases where ucsi_acpi_sync_write() was entered
> with the COMMAND_PENDING bit already set. One possible sequence is this:
>
> ucsi_connector_change() for connector #1
> => schedules partner tasks
> => Acks the connector change
> => Releases locks
> ucsi_connecotr_change() for connector #2
> => acquire con lock for #2
> => Start to process connector state change
> => Processing reaches the point right before
> ucsi_acknowledge_connector_change()
> Connector #1 workqueue starts to process one of the partner tasks
> => Acquire con lock for connector #1
> => Acquire ppm lock
> => Enter ucsi_exec_command()
> => ->sync_write() starts to use the mailbox and sets
> COMMAND_PENDING
> => ->sync_write blocks waiting for the command completion
> with wait_for_completion_timeout().
> ucsi_connector_change() for connector #2 continues
> => execute ucsi_acknowledge_connector_change() and start using
> the mailbox that is still in use.
> => BOOM
>
> There is a simpler an much more likely sequence with the dell quirk active.

Okay.

Reviewed-by: Heikki Krogerus <heikki.krogerus@xxxxxxxxxxxxxxx>

--
heikki