Re: [PATCH v4 6/6] virt: tdx-guest: Add Quote generation support using TSM_REPORTS

From: Dan Williams
Date: Tue Oct 03 2023 - 20:54:25 EST


Peter Gonda wrote:
> On Tue, Oct 3, 2023 at 1:29 PM Kuppuswamy Sathyanarayanan
> <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx> wrote:
> >
> > Hi,
> >
> > On 10/3/2023 11:37 AM, Peter Gonda wrote:
> > > On Fri, Sep 29, 2023 at 11:26 AM Peter Gonda <pgonda@xxxxxxxxxx> wrote:
> > >>
> > >> On Thu, Sep 28, 2023 at 4:49 PM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> > >>>
> > >>> Peter Gonda wrote:
> > >>>> On Mon, Sep 25, 2023 at 10:17 PM Dan Williams <dan.j.williams@xxxxxxxxx> wrote:
> > >>>>>
> > >>>>> From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
> > >>>>>
> > >>>>> In TDX guest, the attestation process is used to verify the TDX guest
> > >>>>> trustworthiness to other entities before provisioning secrets to the
> > >>>>> guest. The first step in the attestation process is TDREPORT
> > >>>>> generation, which involves getting the guest measurement data in the
> > >>>>> format of TDREPORT, which is further used to validate the authenticity
> > >>>>> of the TDX guest. TDREPORT by design is integrity-protected and can
> > >>>>> only be verified on the local machine.
> > >>>>>
> > >>>>> To support remote verification of the TDREPORT in a SGX-based
> > >>>>> attestation, the TDREPORT needs to be sent to the SGX Quoting Enclave
> > >>>>> (QE) to convert it to a remotely verifiable Quote. SGX QE by design can
> > >>>>> only run outside of the TDX guest (i.e. in a host process or in a
> > >>>>> normal VM) and guest can use communication channels like vsock or
> > >>>>> TCP/IP to send the TDREPORT to the QE. But for security concerns, the
> > >>>>> TDX guest may not support these communication channels. To handle such
> > >>>>> cases, TDX defines a GetQuote hypercall which can be used by the guest
> > >>>>> to request the host VMM to communicate with the SGX QE. More details
> > >>>>> about GetQuote hypercall can be found in TDX Guest-Host Communication
> > >>>>> Interface (GHCI) for Intel TDX 1.0, section titled
> > >>>>> "TDG.VP.VMCALL<GetQuote>".
> > >>>>>
> > >>>>> Trusted Security Module (TSM) [1] exposes a common ABI for Confidential
> > >>>>> Computing Guest platforms to get the measurement data via ConfigFS.
> > >>>>> Extend the TSM framework and add support to allow an attestation agent
> > >>>>> to get the TDX Quote data (included usage example below).
> > >>>>>
> > >>>>> report=/sys/kernel/config/tsm/report/report0
> > >>>>> mkdir $report
> > >>>>> dd if=/dev/urandom bs=64 count=1 > $report/inblob
> > >>>>> hexdump -C $report/outblob
> > >>>>> rmdir $report
> > >>>>>
> > >>>>> GetQuote TDVMCALL requires TD guest pass a 4K aligned shared buffer
> > >>>>> with TDREPORT data as input, which is further used by the VMM to copy
> > >>>>> the TD Quote result after successful Quote generation. To create the
> > >>>>> shared buffer, allocate a large enough memory and mark it shared using
> > >>>>> set_memory_decrypted() in tdx_guest_init(). This buffer will be re-used
> > >>>>> for GetQuote requests in the TDX TSM handler.
> > >>>>>
> > >>>>> Although this method reserves a fixed chunk of memory for GetQuote
> > >>>>> requests, such one time allocation can help avoid memory fragmentation
> > >>>>> related allocation failures later in the uptime of the guest.
> > >>>>>
> > >>>>> Since the Quote generation process is not time-critical or frequently
> > >>>>> used, the current version uses a polling model for Quote requests and
> > >>>>> it also does not support parallel GetQuote requests.
> > >>>>>
> > >>>>> Link: https://lore.kernel.org/lkml/169342399185.3934343.3035845348326944519.stgit@xxxxxxxxxxxxxxxxxxxxxxxxx/ [1]
> > >>>>> Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy@xxxxxxxxxxxxxxx>
> > >>>>> Reviewed-by: Erdem Aktas <erdemaktas@xxxxxxxxxx>
> > >>>>> Signed-off-by: Dan Williams <dan.j.williams@xxxxxxxxx>
> > >>>>
> > >>>> Hey Dan,
> > >>>>
> > >>>> I tried running your test commands on an SNP enabled guest. To build
> > >>>> the kernel I just checked out linus/master and applied your series. I
> > >>>> haven't done any debugging yet, so I will update with what I find.
> > >>>>
> > >>>> root@Ubuntu2004:~# hexdump -C $report/outblob
> > >>>> [ 219.871875] ------------[ cut here ]------------
> > >>>> [ 219.876642] kernel BUG at include/linux/scatterlist.h:187!
> > >>>
> > >>> Ok, it does not like virtual address of one of the buffers, but my
> > >>> changes "should" not have affected that as get_ext_report() internally
> > >>> uses snp_dev->certs_data and snp_dev->response for bounce buffering the
> > >>> actual request / response memory. First test I want to try once I can
> > >>> get on an SNP system is compare this to the ioctl path just make sure
> > >>> that succeeds.
> > >>
> > >
> > > I think there may be an issue with CONFIG_DEBUG_SG. That was the
> > > warning we were getting in my above stack trace:
> > >
> > >> [ 219.876642] kernel BUG at include/linux/scatterlist.h:187!
> > >
> > > This was for this line in enc_dec_message():
> > >
> > > sg_set_buf(&src[1], src_buf, hdr->msg_sz);
> > >
> > > I am not sure why in sg_set_buf() virt_addr_valid() returns false for
> > > the address given in the sev_report_new() which is from the variable
> > > 'ext_req' which is stack allocated?
> > >
> > > static inline void sg_set_buf(struct scatterlist *sg, const void *buf,
> > > unsigned int buflen)
> > > {
> > > #ifdef CONFIG_DEBUG_SG
> > > BUG_ON(!virt_addr_valid(buf));
> > > #endif
> > > sg_set_page(sg, virt_to_page(buf), buflen, offset_in_page(buf));
> > > }
> > >
> > > When I disable CONFIG_DEBUG_SG in my config. Your patch seems to work,
> > > well at least it doesn't crash the guest. I haven't checked if the
> > > report is valid yet.
> > >
> >
> > Dan, do you think it is related to not allocating direct mapped memory (using
> > kvalloc)?
>
> But I think the issue is the stack allocated variable 'ext_req' here:
>
> sev_report_new()
> + void *buf __free(kvfree) = kvzalloc(size, GFP_KERNEL);
> + if (!buf)
> + return -ENOMEM;
> +
> + guard(mutex)(&snp_cmd_mutex);
> + certs_address = buf + report_size;
> + struct snp_ext_report_req ext_req = {
> + .data = { .vmpl = desc->privlevel },
> + .certs_address = (__u64)certs_address,
> + .certs_len = ext_size,
> + };
> + memcpy(&ext_req.data.user_data, desc->inblob, desc->inblob_len);

If the failure is coming from:

sg_set_buf(&src[1], src_buf, hdr->msg_sz);

...then that is always coming from the stack as get_ext_report()
internally copies either from the user ioctl() address or the kernel
stack into the local stack copy in both cases:

get_ext_report(...)
...
struct snp_ext_report_req req;
...
if (copy_from_sockptr(&req, io->req_data, sizeof(req)))
return -EFAULT;
...
ret = handle_guest_request(..., &req.data, ...);

...where that "&req.data" always becomes the @src_buf argument to
enc_dec_message(). So while I do understand why sg_set_buf() is
complaining, I don't understand why it is not *always* complaining,
regardless of configfs-tsm or ioctl() with CONFIG_DEBUG_SG=y builds.

I will be able to dig deeper once I can test on hardware, but I am
thinking that the entire scheme to pass the source buffer on the kernel
stack is broken and is only happening to work because there are no
crypto-accelerators attached that require that the virtual addresses be
virt_addr_valid() for a later dma_map_sg() event.

...or my eyes are overlooking how the ioctl() path is succeeding.