Re: [PATCH V2] virt: Prevent IV reuse in SNP guest driver

From: Tom Lendacky
Date: Fri Oct 21 2022 - 17:27:19 EST


On 10/21/22 15:57, Peter Gonda wrote:
On Fri, Oct 21, 2022 at 1:02 PM Tom Lendacky <thomas.lendacky@xxxxxxx> wrote:
On 10/21/22 12:33, Peter Gonda wrote:
The ASP and an SNP guest use a series of AES-GCM keys called VMPCKs to
communicate securely with each other. The IV to this scheme is a
sequence number that both the ASP and the guest track. Currently this
sequence number in a guest request must exactly match the sequence
number tracked by the ASP. This means that if the guest sees an error
from the host during a request it can only retry that exact request or
disable the VMPCK to prevent an IV reuse. AES-GCM cannot tolerate IV
reuse see:
https://csrc.nist.gov/csrc/media/projects/block-cipher-techniques/documents/bcm/comments/800-38-series-drafts/gcm/joux_comments.pdf

To handle userspace querying the cert_data length. Instead of requesting
the cert length from userspace use the size of the drivers allocated
shared buffer. Then copy that buffer to userspace, or give userspace an
error depending on the size of the buffer given by userspace.

Fixes: fce96cf044308 ("virt: Add SEV-SNP guest driver")
Signed-off-by: Peter Gonda <pgonda@xxxxxxxxxx>
Reported-by: Peter Gonda <pgonda@xxxxxxxxxx>
Reviewed-by: Dionna Glaze <dionnaglaze@xxxxxxxxxx>
Cc: Borislav Petkov <bp@xxxxxxx>
Cc: Tom Lendacky <thomas.lendacky@xxxxxxx>
Cc: Michael Roth <michael.roth@xxxxxxx>
Cc: Haowen Bai <baihaowen@xxxxxxxxx>
Cc: Yang Yingliang <yangyingliang@xxxxxxxxxx>
Cc: Marc Orr <marcorr@xxxxxxxxxx>
Cc: David Rientjes <rientjes@xxxxxxxxxx>
Cc: Ashish Kalra <Ashish.Kalra@xxxxxxx>
Cc: linux-kernel@xxxxxxxxxxxxxxx
Cc: kvm@xxxxxxxxxxxxxxx
---

@@ -477,25 +496,37 @@ static int get_ext_report(struct snp_guest_dev *snp_dev, struct snp_guest_reques
if (!resp)
return -ENOMEM;

- snp_dev->input.data_npages = npages;
+ snp_dev->input.data_npages = sizeof(*snp_dev->certs_data) >> PAGE_SHIFT;
ret = handle_guest_request(snp_dev, SVM_VMGEXIT_EXT_GUEST_REQUEST, arg->msg_version,
SNP_MSG_REPORT_REQ, &req.data,
sizeof(req.data), resp->data, resp_len, &arg->fw_err);

+ resp_cert_len = snp_dev->input.data_npages << PAGE_SHIFT;

The hypervisor is not required to update the number of pages that the
certificates actually used/required if enough pages were supplied. So this
value could always remain as 4 (based on SEV_FW_BLOB_MAX_SIZE) on
successful return.

And if that's the case, we could always just return 4 for the number of
pages no matter what. Otherwise you'll have to update the logic here if
you want to obtain the actual number.

Are you asking for this to just hard code the userspace requirement to
4 pages? We could leave this as written here, that would leave the
guest open to a new GHCB spec where

It's up to you. Ideally, if userspace provides a npages value of 0, then the driver issues the request with 0 and gets back the actual value. Then, to ensure the sequence numbers are updated, you issue the request again with the either the just returned value or SEV_FW_BLOB_MAX_SIZE >> PAGE_SHIFT. That will update the sequence numbers and the driver returns the number of pages required as returned from the first request.

That number can also be cached and then whenever userspace calls down with npages of 0, immediately return the cached value. If the request ever gets a SNP_GUEST_REQ_INVALID_LEN with the cached value, the newly returned value gets cached and the driver performs the request again, like the very first time in order to update the sequence numbers. The driver would then return the new npages value back to userspace. Of course that becomes the "minimum" number of pages now, so even if the hypervisor reduces the size of the certs data, it always requires more than needed.


"State from Hypervisor: on error will contain the number of guest
contiguous pages required to hold the data to be returned"

Is instead:

"State from Hypervisor: contain the number of guest contiguous pages
required to hold the data to be returned"

If the driver always returns 4, I don't see this as requiring any change to the spec. It may be more than is truly needed, but that doesn't matter, the cert data will fit, it just may be more than necessary. This can occur even if you pass back the actual number, if, in between the call with 0, the hypervisor updates the certs such that less pages are required.


I think this would be a non-breaking change since the spec says
nothing of the non-error case currently. Fine with your preference
here. Either Dionna or I can follow up with a series to allow more
than 4pages if needed.

I'd prefer that userspace get the actual number, but really, I don't think it's a big deal to just return 4 until the driver can handle a more dynamic approach should more than 4 pages ever be needed (this would also require support on the hypervisor where currently not more than 4 pages are allowed to be provided, too).

I just wanted you to be aware that in testing you're likely to see 4 always returned to userspace.


The logic required would be parsing the GUID table? I think we'd
rather keep that out of the kernel driver, right?

No, that's not the logic I'm thinking of. I'm just thinking of using the userspace specified npages and acting accordingly.

Thanks,
Tom