Re: [PATCH v2] virt: tdx-guest: Handle GetQuote request error code

From: Dan Williams
Date: Fri Feb 23 2024 - 00:48:58 EST


Kuppuswamy Sathyanarayanan wrote:
> During the TDX guest attestation process, TSM ConfigFS ABI is used by
> the user attestation agent to get the signed VM measurement data (a.k.a
> Quote), which can be used by a remote verifier to validate the
> trustworthiness of the guest. When a user requests for the Quote data
> via the ConfigFS ABI, the TDX Quote generation handler
> (tdx_report_new()) forwards the request to VMM (or QE) via a hypercall,
> and then shares the output with the user.
>
> Currently, when handling the Quote generation request, tdx_report_new()
> handler only checks whether the VMM successfully processed the request
> and if it is true it returns success and shares the output to the user
> without actually validating the output data. Since the VMM can return
> error even after processing the Quote request, always returning success
> for the processed requests is incorrect and will create confusion to
> the user. Although for the failed request, output buffer length will
> be zero and can also be used by the user to identify the failure case,
> it will be more clear to return error for all failed cases.

This is a lot of text. More is not necessarily better.

---
The tdx-guest driver marshals requests via hypercall to have a quoting
enclave sign attestation evidence about the current state of the TD.
There are 2 possible failures, a transport failure (failure to
communicate with the quoting agent) and payload failure (a failed
quote). The driver only checks the former, update it to consider the
latter payload errors as well.
---


>
> Validate the Quote data output status and return error code for all
> failed cases.
>
> Fixes: f4738f56d1dc ("virt: tdx-guest: Add Quote generation support using TSM_REPORTS")
> Reported-by: Xiaoyao Li <xiaoyao.li@xxxxxxxxx>
> Closes: https://lore.kernel.org/linux-coco/6bdf569c-684a-4459-af7c-4430691804eb@xxxxxxxxxxxxxxx/T/#u
> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
> ---
>
> Changes since v1:
> * Updated the commit log (Kirill)
>
> drivers/virt/coco/tdx-guest/tdx-guest.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/drivers/virt/coco/tdx-guest/tdx-guest.c b/drivers/virt/coco/tdx-guest/tdx-guest.c
> index 1253bf76b570..61368318fa39 100644
> --- a/drivers/virt/coco/tdx-guest/tdx-guest.c
> +++ b/drivers/virt/coco/tdx-guest/tdx-guest.c
> @@ -228,6 +228,12 @@ static int tdx_report_new(struct tsm_report *report, void *data)
> goto done;
> }
>
> + if (quote_buf->status != GET_QUOTE_SUCCESS) {
> + pr_err("GetQuote request failed, ret %llx\n", quote_buf->status);

Do you really want to spam the log on every error? I would expect
pr_err() for events that are fatal to driver operation that might
indicate conditions where maybe the TD should give up on the host.

Yes, there are other pr_err() in this function and I am kicking myself
for not scrutinizing those more closely. It is likely enough to
distinguish transport errors vs payload / quote errors with ENXIO and
EIO.

Otherwise if there is an exceedingly good reason to keep this driver
chirping into the kernel log then these likely also want to be
rate-limited. If they are "just in case" debug messages, then move them
to pr_debug().