Re: [PATCH v7 11/14] KVM: Register/unregister the guest private memory regions

From: Chao Peng
Date: Mon Jul 25 2022 - 09:09:23 EST


On Thu, Jul 21, 2022 at 05:58:50PM +0000, Sean Christopherson wrote:
> On Thu, Jul 21, 2022, Chao Peng wrote:
> > On Thu, Jul 21, 2022 at 03:34:59PM +0800, Wei Wang wrote:
> > >
> > >
> > > On 7/21/22 00:21, Sean Christopherson wrote:
> > > Maybe you could tag it with cgs for all the confidential guest support
> > > related stuff: e.g. kvm_vm_ioctl_set_cgs_mem()
> > >
> > > bool is_private = ioctl == KVM_MEMORY_ENCRYPT_REG_REGION;
> > > ...
> > > kvm_vm_ioctl_set_cgs_mem(, is_private)
> >
> > If we plan to use that abbreviation widely throughout KVM (i.e. it's
> > well known), I'm fine with it.
>
> I'd prefer to stay away from "confidential guest", and away from any VM-scoped
> name for that matter. User-unmappable memory has use cases beyond hiding guest
> state from the host, e.g. userspace could use inaccessible/unmappable memory to
> harden itself against unintentional access to guest memory.
>
> > I actually use mem_attr in patch: https://lkml.org/lkml/2022/7/20/610
> > But I don't quite like it either; it's so generic that it says nothing.
> >
> > But I do want a name that can cover future usages other than just
> > private/shared (pKVM for example may have a third state).
>
> I don't think there can be a third top-level state. Memory is either private to
> the guest or it's not. There can be sub-states, e.g. memory could be selectively
> shared or encrypted with a different key, in which case we'd need metadata to
> track that state.
>
> Though that begs the question of whether or not private_fd is the correct
> terminology. E.g. if guest memory is backed by a memfd that can't be mapped by
> userspace (currently F_SEAL_INACCESSIBLE), but something else in the kernel plugs
> that memory into a device or another VM, then arguably that memory is shared,
> especially the multi-VM scenario.
>
> For TDX and SNP "private vs. shared" is likely the correct terminology given the
> current specs, but for generic KVM it's probably better to align with whatever
> terminology is used for memfd. "inaccessible_fd" and "user_inaccessible_fd" are
> a bit odd since the fd itself is accessible.
>
> What about "user_unmappable"? E.g.
>
> F_SEAL_USER_UNMAPPABLE, MFD_USER_UNMAPPABLE, KVM_HAS_USER_UNMAPPABLE_MEMORY,
> MEMFILE_F_USER_INACCESSIBLE, user_unmappable_fd, etc...

For KVM I also think user_unmappable looks better than 'private', e.g.
user_unmappable_fd/KVM_HAS_USER_UNMAPPABLE_MEMORY sound like more
appropriate names. For memfd, however, I don't feel strongly about changing
it from the current 'inaccessible' to 'user_unmappable'. One reason is that
it's not just unmappable but also inaccessible through direct syscalls
like read()/write().

>
> that gives us flexibility to map the memory from within the kernel, e.g. into
> other VMs or devices.
>
> Hmm, and then keep your original "mem_attr_array" name? And probably
>
> int kvm_vm_ioctl_set_mem_attr(struct kvm *kvm, gpa_t gpa, gpa_t size,
>                               bool is_user_mappable)
>
> Then the x86/mmu code for TDX/SNP private faults could be:
>
> is_private = !kvm_is_gpa_user_mappable();
>
> if (fault->is_private != is_private) {
>
> or if we want to avoid mixing up "user_mappable" and "user_unmappable":
>
> is_private = kvm_is_gpa_user_unmappable();
>
> if (fault->is_private != is_private) {
>
> though a helper that returns a negative (not mappable) feels kludgy. And I like
> kvm_is_gpa_user_mappable() because then when there's no "special" memory, it
> defaults to true, which is more intuitive IMO.

yes.
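
To make the default-true behavior concrete, here is a toy sketch outside of
KVM. It is only an illustration under stated assumptions: struct toy_vm, the
fixed-size array and all toy_* names are made up for this example and are not
the real mem_attr_array or any KVM API. It works on page frame numbers rather
than gpa/size to keep it short.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t gfn_t;

#define TOY_NR_GFNS 16 /* hypothetical toy guest size, in pages */

/*
 * Hypothetical stand-in for the per-VM attribute array. A set entry marks
 * the page as user-unmappable, so a zero-initialized VM has no "special"
 * memory at all.
 */
struct toy_vm {
	bool user_unmappable[TOY_NR_GFNS];
};

/* Toy counterpart of the proposed set_mem_attr ioctl body. */
static int toy_set_mem_attr(struct toy_vm *vm, gfn_t start, gfn_t npages,
			    bool is_user_mappable)
{
	/* Reject ranges that fall outside the toy guest. */
	if (start >= TOY_NR_GFNS || npages > TOY_NR_GFNS - start)
		return -1;

	for (gfn_t i = 0; i < npages; i++)
		vm->user_unmappable[start + i] = !is_user_mappable;
	return 0;
}

/* Positive helper: pages nobody marked are user-mappable by default. */
static bool toy_is_gfn_user_mappable(const struct toy_vm *vm, gfn_t gfn)
{
	if (gfn >= TOY_NR_GFNS)
		return true;
	return !vm->user_unmappable[gfn];
}

/* True when the fault type disagrees with the recorded attribute. */
static bool toy_fault_mismatch(const struct toy_vm *vm, gfn_t gfn,
			       bool fault_is_private)
{
	bool is_private = !toy_is_gfn_user_mappable(vm, gfn);

	return fault_is_private != is_private;
}
```

With a zeroed struct toy_vm every gfn reports user-mappable, and only ranges
explicitly passed to toy_set_mem_attr() with is_user_mappable == false make
a shared fault on them show up as a mismatch.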

>
> And then if the future needs more precision, e.g. user-unmappable memory isn't
> necessarily guest-exclusive, the uAPI names still work even though KVM internals
> will need to be reworked, but that's unavoidable. E.g. piggybacking
> KVM_MEMORY_ENCRYPT_(UN)REG_REGION doesn't allow for further differentiation,
> so we'd need to _extend_ the uAPI, but the _existing_ uAPI would still be sane.

Right, that has to be extended.
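
As a rough sketch of what such an extension might look like internally,
assuming the per-page bool became a bitmask: the existing "user mappable"
question stays answerable from bit 0, while a later bit carries the extra
precision. Both flag names and both helpers below are hypothetical and exist
only for this example; nothing here is proposed uAPI.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical attribute bits; none of these names exist in KVM. */
#define TOY_ATTR_USER_UNMAPPABLE (1ULL << 0)
#define TOY_ATTR_GUEST_EXCLUSIVE (1ULL << 1) /* hypothetical later addition */

/* The original question keeps working: only bit 0 matters here. */
static bool toy_attr_user_mappable(uint64_t attrs)
{
	return !(attrs & TOY_ATTR_USER_UNMAPPABLE);
}

/*
 * With the extra bit, "private" can mean user-unmappable AND exclusive to
 * the guest, rather than user-unmappable alone.
 */
static bool toy_attr_is_private(uint64_t attrs)
{
	uint64_t want = TOY_ATTR_USER_UNMAPPABLE | TOY_ATTR_GUEST_EXCLUSIVE;

	return (attrs & want) == want;
}
```

Memory that is user-unmappable but plugged into another VM would then have
only bit 0 set: still not mappable by userspace, but no longer treated as
private.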

Chao