Re: [PATCH v3 2/2] tpm: in tpm2_del_space check if ops pointer is still valid

From: Jason Gunthorpe
Date: Fri Feb 05 2021 - 22:40:20 EST


On Fri, Feb 05, 2021 at 05:08:20PM -0800, James Bottomley wrote:

> Effectively all of this shuffles the tpmrm device allocation from
> chip_alloc to chip_add ... I'm not averse to this but it does mean we
> can suffer allocation failures now in the add routine and it makes
> error handling a bit more complex.

We already have to handle failures here, so this doesn't seem any
worse (and the existing error handling looked wrong, I fixed it)

> > rc = cdev_device_add(&chip->cdevs, &chip->devs);
> > if (rc) {
> > dev_err(&chip->devs,
> > "unable to cdev_device_add() %s, major
> > %d, minor %d, err=%d\n",
> > dev_name(&chip->devs), MAJOR(chip-
> > >devs.devt),
> > MINOR(chip->devs.devt), rc);
> > - return rc;
> > + goto out_put_devs;
> > }
> > }
> >
> > @@ -460,6 +459,10 @@ static int tpm_add_char_device(struct tpm_chip
> > *chip)
> > idr_replace(&dev_nums_idr, chip, chip->dev_num);
> > mutex_unlock(&idr_lock);
> >
> > +out_put_devs:
> > + put_device(&chip->devs);
>
> I think there should be a if (chip->flags & TPM_CHIP_FLAG_TPM2) here.
>
> I realise you got everything semantically correct and you only ever go
> to this label from somewhere that already has the check, but guess what
> will happen when the bot rewriters get hold of this ...

Makes sense

> > +out_del_dev:
> > + cdev_device_del(&chip->cdev);
> > return rc;
> > }
> >
> > @@ -640,8 +643,10 @@ void tpm_chip_unregister(struct tpm_chip *chip)
> > if (IS_ENABLED(CONFIG_HW_RANDOM_TPM))
> > hwrng_unregister(&chip->hwrng);
> > tpm_bios_log_teardown(chip);
> > - if (chip->flags & TPM_CHIP_FLAG_TPM2)
> > + if (chip->flags & TPM_CHIP_FLAG_TPM2) {
> > cdev_device_del(&chip->cdevs, &chip->devs);
> > + put_device(&chip->devs);
> > + }
> > tpm_del_char_device(chip);
>
> Actually, I think you want to go further here. If there's a
>
> put_device(&chips->dev)
>
> as the last statement (or moved into tpm_del_char_device) we should
> now

The proper TPM driver remove sequence is:

remove()
{
/* Upon return the core guarentees no driver callback is running or
* will ever run again */
tpm_chip_unregister()

// Safe to do this because nothing will now use the HW resources
free_irq(chip->XXX)
unmap_memory(chip->YYY)

// Now we are done with the memory
put_device(&chip-dev);
}

ie the general driver design should expect the chip memory to continue
to exist after unregister because it will need to refer to it to
destroy any driver resources.

> have no active reference on the devices from the kernel and we can
> eliminate the
>
> rc = devm_add_action_or_reset(pdev,
> (void (*)(void *)) put_device,
> &chip->dev);

This devm exists because adding the put_device to the error unwinds of
every driver probe function was too daunting. It can be removed only
if someone goes and updates every driver to correctly error-unwind
tpm_chip_alloc() with put_device() in the driver probe function.

Jason