Re: [PATCH] tpm: return false from tpm_amd_is_rng_defective on non-x86 platforms

From: Jerry Snitselaar
Date: Wed Jul 05 2023 - 13:05:33 EST


On Fri, Jun 30, 2023 at 01:07:00PM +0300, Jarkko Sakkinen wrote:
> On Thu Jun 29, 2023 at 11:41 PM EEST, Jerry Snitselaar wrote:
> > tpm_amd_is_rng_defective is for dealing with an issue related to the
> > AMD firmware TPM, so on non-x86 architectures just have it inline and
> > return false.
> >
> > Cc: Jarkko Sakkinen <jarkko@xxxxxxxxxx>
> > Cc: "Jason A. Donenfeld" <Jason@xxxxxxxxx>
> > Cc: Jason Gunthorpe <jgg@xxxxxxxx>
> > Cc: Peter Huewe <peterhuewe@xxxxxx>
> > Cc: stable@xxxxxxxxxxxxxxx
> > Cc: Linux regressions mailing list <regressions@xxxxxxxxxxxxxxx>
> > Cc: Mario Limonciello <mario.limonciello@xxxxxxx>
> > Reported-by: Aneesh Kumar K. V <aneesh.kumar@xxxxxxxxxxxxx>
> > Reported-by: Sachin Sant <sachinp@xxxxxxxxxxxxx>
> > Closes: https://lore.kernel.org/lkml/99B81401-DB46-49B9-B321-CF832B50CAC3@xxxxxxxxxxxxx/
> > Fixes: f1324bbc4011 ("tpm: disable hwrng for fTPM on some AMD designs")
> > Signed-off-by: Jerry Snitselaar <jsnitsel@xxxxxxxxxx>
> > ---
> > drivers/char/tpm/tpm-chip.c | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/char/tpm/tpm-chip.c b/drivers/char/tpm/tpm-chip.c
> > index cd48033b804a..cf5499e51999 100644
> > --- a/drivers/char/tpm/tpm-chip.c
> > +++ b/drivers/char/tpm/tpm-chip.c
> > @@ -518,6 +518,7 @@ static int tpm_add_legacy_sysfs(struct tpm_chip *chip)
> >   * 6.x.y.z series: 6.0.18.6 +
> >   * 3.x.y.z series: 3.57.y.5 +
> >   */
> > +#ifdef CONFIG_X86
> >  static bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
> >  {
> >  	u32 val1, val2;
> > @@ -566,6 +567,12 @@ static bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
> >
> >  	return true;
> >  }
> > +#else
> > +static inline bool tpm_amd_is_rng_defective(struct tpm_chip *chip)
> > +{
> > +	return false;
> > +}
> > +#endif /* CONFIG_X86 */
> >
> >  static int tpm_hwrng_read(struct hwrng *rng, void *data, size_t max, bool wait)
> >  {
> > --
> > 2.38.1
>
> Sanity check, this was the right patch, right?
>
> I'll apply it.
>
> BR, Jarkko

Sorry, I've been dealing with a family health issue the past week. It wasn't clear
to me why chip->ops was NULL when I first took a look, but I think I understand
now after looking at it again this morning. The stack trace shows the oops in the
device_shutdown() path:

[ 34.381674] NIP [c0000000009db1e4] tpm_amd_is_rng_defective+0x74/0x240
[ 34.381681] LR [c0000000009db928] tpm_chip_unregister+0x138/0x160
[ 34.381685] Call Trace:
[ 34.381686] [c00000009742faa0] [c0000000009db928] tpm_chip_unregister+0x138/0x160
[ 34.381690] [c00000009742fae0] [c0000000009eab94] tpm_ibmvtpm_remove+0x34/0x130
[ 34.381695] [c00000009742fb50] [c000000000115738] vio_bus_remove+0x58/0xd0
[ 34.381701] [c00000009742fb90] [c000000000a01ecc] device_shutdown+0x21c/0x39c
[ 34.381705] [c00000009742fc20] [c0000000001a2684] kernel_restart_prepare+0x54/0x70
[ 34.381710] [c00000009742fc40] [c000000000292c48] kernel_kexec+0xa8/0x100
[ 34.381714] [c00000009742fcb0] [c0000000001a2cd4] __do_sys_reboot+0x214/0x2c0
[ 34.381718] [c00000009742fe10] [c000000000034adc] system_call_exception+0x13c/0x340
[ 34.381723] [c00000009742fe50] [c00000000000d05c] system_call_vectored_common+0x15c/0x2ec

So I think what happened is:

device_shutdown
  -> dev->class->shutdown_pre (tpm_class_shutdown)   // clears chip->ops
  -> dev->bus->shutdown (vio_bus_shutdown)
     -> vio_bus_remove
        -> viodrv->remove (tpm_ibmvtpm_remove)
           -> tpm_chip_unregister
              -> tpm_amd_is_rng_defective -> oops!


I guess anything that gets called in the tpm_chip_unregister path
should check chip->ops before using it. This patch only sidesteps the
problem on non-x86, so I think Mario's patch would still be a good
thing to have.
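
As a rough illustration of that kind of check, here is a minimal userspace
sketch. Everything in it is a hypothetical stand-in: the mock_* names and the
fw_is_defective callback are made up for the example and are not the kernel's
real structures, nor Mario's actual patch. It only models the ordering above:
shutdown_pre clears ops, and the later unregister path must not dereference it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the chip's ops table. */
struct mock_tpm_ops {
	bool (*fw_is_defective)(void);	/* assumed callback, not a real kernel op */
};

/* Hypothetical, simplified stand-in for struct tpm_chip. */
struct mock_tpm_chip {
	const struct mock_tpm_ops *ops;	/* NULL once shutdown_pre has run */
	bool hwrng_registered;
};

static bool mock_always_defective(void)
{
	return true;
}

/* Example ops table for a chip whose firmware RNG is considered defective. */
const struct mock_tpm_ops mock_defective_ops = {
	.fw_is_defective = mock_always_defective,
};

/* Models the class shutdown_pre step, which clears chip->ops. */
void mock_class_shutdown_pre(struct mock_tpm_chip *chip)
{
	chip->ops = NULL;
}

/* Guarded check: return early instead of dereferencing a NULL ops pointer. */
bool mock_is_rng_defective(const struct mock_tpm_chip *chip)
{
	if (!chip->ops)
		return false;

	return chip->ops->fw_is_defective();
}

/* Models the unregister path that runs after ops may already be cleared. */
void mock_chip_unregister(struct mock_tpm_chip *chip)
{
	if (!mock_is_rng_defective(chip))
		chip->hwrng_registered = false;	/* stands in for hwrng_unregister() */
}
```

With the guard in place, calling mock_class_shutdown_pre() followed by
mock_chip_unregister() on the same chip returns cleanly rather than faulting
on chip->ops.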

Regards,
Jerry