Re: [PATCH] tpm_tis_spi: fix:release chip select when flow control fails

From: Jarkko Sakkinen
Date: Fri Apr 21 2023 - 17:47:06 EST


On Fri, Mar 31, 2023 at 02:56:25PM +0800, shaopeijie@xxxxxxxx wrote:
> From: Peijie Shao <shaopeijie@xxxxxxxx>
>
> The TPM's chip select will leave active after spi_bus_unlock when
> flow control timeout, and may interfere other chips sharing the same
> spi bus, or may damage them dule to level conflict on MISO pin.
>
> So the patch deactives the chip select by sending an empty message
> with cs_change=0 if flow control fails.
>
> The reason why flow control timeout for me is unfortunately I got a
> damaged TPM chip. It always pull MISO low during cs active(this can
> be easily emulated by wire MISO to the ground), not responding anything,
> and dmesg shows below:
> ...
> [ 311.150725] tpm_tis_spi: probe of spi0.0 failed with error -110
> ...

We don't really cease to support damaged hardware but it is true
that the *software* failure paths should probably be robust enough
to deativate chip select.

I would rewrite this as

"The failure paths in tpm_tis_spi_transfer() do not deactivate
chip select. Send an empty message (cs_select == 0) to overcome
this."

That's all there needs to be. We do not care about broken hardware.

>
> Signed-off-by: Peijie Shao <shaopeijie@xxxxxxxx>
> ---
> drivers/char/tpm/tpm_tis_spi_main.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/char/tpm/tpm_tis_spi_main.c b/drivers/char/tpm/tpm_tis_spi_main.c
> index a0963a3e92bd..5c8ff343761f 100644
> --- a/drivers/char/tpm/tpm_tis_spi_main.c
> +++ b/drivers/char/tpm/tpm_tis_spi_main.c
> @@ -105,8 +105,19 @@ int tpm_tis_spi_transfer(struct tpm_tis_data *data, u32 addr, u16 len,
> /* Flow control transfers are receive only */
> spi_xfer.tx_buf = NULL;
> ret = phy->flow_control(phy, &spi_xfer);
> - if (ret < 0)
> + if (ret < 0) {
> + /*
> + * Release cs pin if the device is not responding, regardless of the reason.
> + * Notice cs may alreadly been released if the failure was caused inside
> + * spi_sync_locked called by flow_control, in this situation, a pluse may be
> + * generated on cs.
> + */

Please replace above comment with:

/* Deactivate chip select: */

> + memset(&spi_xfer, 0, sizeof(spi_xfer));
> + spi_message_init(&m);
> + spi_message_add_tail(&spi_xfer, &m);
> + spi_sync_locked(phy->spi_device, &m);
> goto exit;
> + }
>
> spi_xfer.cs_change = 0;
> spi_xfer.len = transfer_len;
> --
> 2.39.1
>
>
>

There's three call sites, why are you taking care of only one
of them?

I'd consider instead:

return 0;

exit:
memset(&spi_xfer, 0, sizeof(spi_xfer));
spi_message_init(&m);
spi_message_add_tail(&spi_xfer, &m);
spi_sync_locked(phy->spi_device, &m);
spi_bus_unlock(phy->spi_device->master);
return ret;
}

The the rollback would apply to all call sites.

BR, Jarkko