Re: [PATCH v2 1/6] hwrng: core: Freeze khwrng thread during suspend

From: Stephen Boyd
Date: Fri Aug 02 2019 - 18:50:52 EST


Quoting Stephen Boyd (2019-07-17 10:03:22)
> Quoting Jason Gunthorpe (2019-07-17 09:50:11)
> > On Wed, Jul 17, 2019 at 09:42:32AM -0700, Stephen Boyd wrote:
>
> Yes. That's exactly my point. A hwrng that's suspended will fail here
> and it's better to just not try until it's guaranteed to have resumed.
>
> >
> > It just seems weird to do this, what about all the other tpm API
> > users? Do they have a racy problem with suspend too?
>
> I haven't looked at them. Are they being called from suspend/resume
> paths? I don't think anything for the userspace API can be a problem
> because those tasks are all frozen. The only problem would be some
> kernel internal API that TPM API exposes. I did a quick grep and I see
> things like IMA or the trusted keys APIs that might need a closer look.
>
> Either way, trying to hold off a TPM operation from the TPM API when
> we're suspended isn't really possible. If something like IMA needs to
> get TPM data from a deep suspend path and it fails because the device is
> suspended, all we can do is return an error from TPM APIs and hope the
> caller can recover. The fix is probably going to be to change the code
> to not call into the TPM API until the hardware has resumed by avoiding
> doing anything with the TPM until resume is over. So we're at best able
> to make the same sort of band-aid here in the TPM API where all we can do is
> say -EAGAIN but we can't tell the caller when to try again.
>

Andrey talked to me a little about this today. Andrey would prefer we
don't just let the TPM go into a wonky state if it's used during
suspend/resume so that it can stay resilient to errors. Sounds OK to me,
but my point still stands that we need to fix the callers.

I'll resurrect the IS_SUSPENDED flag and make it set generically by the
tpm_pm_suspend() and tpm_pm_resume() functions and then spit out a big
WARN_ON() and return an error value like -EAGAIN if the TPM functions
are called when the TPM is suspended. I hope we don't hit the warning
message, but if we do then at least we can track it down rather quickly
and figure out how to fix the caller instead of just silently returning
-EAGAIN and hoping the user notices.
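
A rough userspace sketch of that plan, just to show the shape of the
check. This is not the actual patch: struct model_chip,
model_pm_suspend(), model_pm_resume() and model_get_random() are all
hypothetical stand-ins, and the fprintf() only marks where the kernel
code would use WARN_ON().

```c
/* Userspace model only; every name here is hypothetical, not the TPM API. */
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

struct model_chip {
	bool suspended;		/* stands in for the IS_SUSPENDED flag */
	unsigned int warned;	/* how many times the warning fired */
};

/* Stand-ins for the suspend/resume hooks that set and clear the flag. */
static void model_pm_suspend(struct model_chip *chip)
{
	chip->suspended = true;
}

static void model_pm_resume(struct model_chip *chip)
{
	chip->suspended = false;
}

/* Stand-in for any TPM operation reachable from kernel-internal callers. */
static int model_get_random(struct model_chip *chip, unsigned char *buf,
			    size_t len)
{
	if (chip->suspended) {
		/* The kernel version would WARN_ON() here to expose the caller. */
		chip->warned++;
		fprintf(stderr, "WARNING: TPM used while suspended\n");
		return -EAGAIN;
	}

	/* Placeholder for talking to the hardware: just zero the buffer. */
	for (size_t i = 0; i < len; i++)
		buf[i] = 0;

	return (int)len;
}
```

A caller that hits this between suspend and resume gets -EAGAIN plus a
loud warning pointing at it, and the same call works again once resume
has cleared the flag.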

This patch will still be required to avoid the WARN message, so I'll
resend it with the crypto list Cc'd so it can be picked up.