Re: [RFC PATCH 00/10] KVM: selftests: Add support for test-selectable ucall implementations

From: Michael Roth
Date: Wed Jan 05 2022 - 14:11:31 EST


On Wed, Jan 05, 2022 at 05:43:21PM +0000, Sean Christopherson wrote:
> On Wed, Jan 05, 2022, Michael Roth wrote:
> > On Wed, Jan 05, 2022 at 12:17:33AM +0000, Sean Christopherson wrote:
> > > PIO shouldn't require instruction decoding or a #VC handler. What I was thinking
> > > is that the guest in the selftest would make a direct #VMGEXIT/TDCALL to request
> > > PIO instead of executing an OUT.
> >
> > That seems like a nicer approach. But it sort of lends itself to having
> > test-specific ucall implementations in some form. How are you thinking
> > vm_create() should decide what implementation to use? With this series
> > in place it could be something like:
> >
> > vm_create(..., struct ucall_ops *ops)
> > ucall_init_ops(ops)
> >
> > and with the SEV selftests in their current form it would look something
> > like:
> >
> > sev_vm_create(...)
> > vm_create_with_ucall(..., ops=ucall_ops_pio_vmgexit)
> > ucall_init_ops(ops)
> >
> > is that sort of what you're thinking, or something else?
>
> I keep forgetting ucall() doesn't have access to the VM. But, since we're
> restricting ucall() to a single VM, we can have a global that sets ucall_ops during
> ucall_init() based on the VM type, or skip the ops entirely and just open code the
> behavior in x86's ucall() by snapshotting the VM type. Either way, the goal is to
> avoid having to pass in ucall_ops at the test level.
>
> > > Yeah, I was thinking it could be done at the lowest level vm_create() helper.
> > > We'll need to expand vm_create() (or add yet another layer to avoid modifying a
> > > pile of tests) to allow opting out of initializing ucall, e.g. sev_migrate_tests.c
> > > needs to create multiple concurrent VMs, but happily doesn't need ucall support.
> >
> > Why does sev_migrate_tests need to opt out? Couldn't it use
> > ucall_ops_pio_vmgexit like that SEV case above?
>
> Because it uses multiple VMs, and my rough sketch only allows for a single VM to
> use ucall. Though I suppose we could simply keep appending to the ucall list for
> every VM. The requirement would then be that all VMs are of the same type, i.e.
> utilize the same ucall_ops.

Hmm, maybe I misread your patch. Not supporting multiple VMs was the
reason I gave up on having the ucall structs allocated on-demand and
went with requiring them to be passed as arguments to ucall().

I thought your patch solved that by having each VM keep its own pool,
via vm->ucall_list, and then mapping each pool into its guest
separately via:

ucall_init(vm):
ucall_list = vm->ucall_list
sync_global_to_guest(ucall_list).

then, as long as that ucall_init() is done *after* the test calls
kvm_vm_elf_load() for that VM, the guest ends up with a 'ucall_list'
global that points to its own specific vm->ucall_list. On the test side
it doesn't matter what the 'ucall_list' global is currently set to,
since you have the GPA and know which VM exited.

Or am I missing something there?

Even if that is the case, though, now that we're proposing to do the
ucall_init() inside vm_create(), we run the risk of a test calling
kvm_vm_elf_load() afterward, which would clobber the guest's copy of the
ucall_list global if ucall_init() had since been called for another VM.
That could be worked around by having whatever vm_create() variant we
use also perform kvm_vm_elf_load() unconditionally as part of creation.
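To make the ordering concrete, here's a rough host-side simulation of
that scheme (toy model only, no real KVM calls; vm_create_with_ucall(),
guest_ucall_list, and the pool handling are hypothetical stand-ins, not
proposed API):

```c
#include <assert.h>
#include <stddef.h>

/* Each "VM" keeps its own ucall pool; ucall_init() snapshots it into
 * the host's ucall_list global before syncing that global into the
 * guest. kvm_vm_elf_load() reloads the guest image, which resets the
 * guest's copy of globals, so it must happen before ucall_init(). */
struct vm {
	void *ucall_list;        /* host-side pool for this VM */
	void *guest_ucall_list;  /* the guest's view of the global */
};

static void *ucall_list;  /* host global, synced into guests */

static void kvm_vm_elf_load(struct vm *vm)
{
	vm->guest_ucall_list = NULL;  /* image reload wipes guest globals */
}

static void ucall_init(struct vm *vm)
{
	ucall_list = vm->ucall_list;
	vm->guest_ucall_list = ucall_list;  /* sync_global_to_guest() */
}

/* Proposed workaround: the create variant does the load itself, then
 * inits ucall, so tests can't clobber the guest's copy afterward. */
static void vm_create_with_ucall(struct vm *vm, void *pool)
{
	vm->ucall_list = pool;
	kvm_vm_elf_load(vm);
	ucall_init(vm);
}
```

With that ordering, creating a second VM only changes the host-side
ucall_list snapshot; each guest's already-synced copy still points at
its own pool.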

>
> > I ask because there is a ucall() in the exception handling code where
> > some unhandled exceptions result in the guest automatically issuing a
> > ucall(UCALL_UNHANDLED), so even when tests don't use ucall() they
> > might still rely on it if they enable exception handling. So that might
> > be an argument for always setting up at least the default ucall_ops_pio
> > implementation and creating a pool just in case. (or an argument for
> > dropping the UCALL_UNHANDLED handling).
>
> The sev_migrate_tests don't even run a guest, hence the quick-and-dirty "solution".
> Though thinking toward the future, that may be too dirty as it would prevent tests
> from having multiple "real" VMs.
>
> > > > > To reduce the burden on tests and avoid ordering issues with creating vCPUs,
> > > > > allocate a ucall struct for every possible vCPU when the VM is created and stuff
> > > > > the GPA of the struct in the struct itself so that the guest can communicate the
> > > > > GPA instead of the GVA. Then confidential VMs just need to make all structs shared.
> > > >
> > > > So a separate call like:
> > > >
> > > > ucall_make_shared(vm->ucall_list)
> > > >
> > > > ? Might need some good documentation/assertions to make sure it gets
> > > > called at the right place for confidential VMs, and may need some extra
> > > > hooks in SEV selftest implementation for switching from private to shared
> > > > after the memory has already been allocated, but seems reasonable.
> > >
> > > Again, I was thinking that it would be done unconditionally by ucall_init(), i.e.
> > > would be automatically handled by the selftest framework and would Just Work for
> > > individual tests.
> >
> > Ok, I'll have to think that through more. Currently, with the SEV
> > selftests as they are, we have:
> >
> > sev_vm_create(policy, npages)
> > vm = vm_create(...)
> > vm_set_memory_encryption(vm, encrypt_by_default, enc_bit)
> > //vm_vaddr_alloc_shared() can be used now
> >
> > The ucall struct allocations would need to go through
> > vm_vaddr_alloc_shared() to make sure the selftest library tracks/maps
> > the pages as shared, but that vm_set_memory_encryption() happens too
> > late if the ucall_init() stuff is done in vm_create(). It should be
> > possible to pass the vm_set_memory_encryption() arguments directly to
> > vm_create() to allow for what you're proposing, but I guess we'd need
> > a new vm_create() wrapper that handles both the
> > vm_set_memory_encryption() args and the ucall_ops above, something
> > like:
> >
> > sev_vm_create(policy, npages)
> > vm = vm_create_coco(..., encrypt_by_default, enc_bit/shared_bit, ucall_ops)
> >
> > Or were you thinking something else? Just trying to get an idea of how
> > this will all need to tie in with the SEV selftests and what needs to
> > change on that end.
>
> Hmm, I was thinking the selftest framework would only need to be told the VM type,
> e.g. DEFAULT, SEV, SEV-ES, SEV-SNP, or TDX, and would then handle setting everything
> up, e.g. enumerating the C-bit location and encrypting memory as needed.
>
> One thought would be to extend "enum vm_guest_mode" with flags above NUM_VM_MODES
> to specify the VM type. That way tests that use VM_MODE_DEFAULT would continue to
> work without any updates.

Ok, let me see what that approach looks like on the SEV selftest side.
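For reference, here's a rough sketch of how I'd read the "flags above
NUM_VM_MODES" idea (the VM_TYPE_* names, bit layout, and helpers are
just placeholders for discussion, not the actual selftest definitions):

```c
#include <assert.h>

/* Keep the existing mode enumeration intact and OR a VM-type flag into
 * the high bits, so tests that pass VM_MODE_DEFAULT keep working
 * unchanged (their type field decodes to 0, i.e. a default VM). */
enum vm_guest_mode {
	VM_MODE_DEFAULT,
	NUM_VM_MODES,
};

#define VM_TYPE_SHIFT	8
#define VM_TYPE_MASK	(0xffu << VM_TYPE_SHIFT)
#define VM_TYPE_SEV	(1u << VM_TYPE_SHIFT)
#define VM_TYPE_SEV_ES	(2u << VM_TYPE_SHIFT)
#define VM_TYPE_TDX	(3u << VM_TYPE_SHIFT)

/* Strip the type flags to recover the plain guest mode. */
static inline unsigned int vm_guest_mode_mode(unsigned int mode)
{
	return mode & ~VM_TYPE_MASK;
}

/* Extract just the VM type, for picking ucall_ops, C-bit setup, etc. */
static inline unsigned int vm_guest_mode_type(unsigned int mode)
{
	return mode & VM_TYPE_MASK;
}
```

vm_create() could then key the ucall_ops selection and memory-encryption
setup off vm_guest_mode_type() without any per-test plumbing.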