Re: Rename restrictedmem => guardedmem? (was: Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM)

From: Sean Christopherson
Date: Tue Apr 18 2023 - 20:54:25 EST


On Tue, Apr 18, 2023, David Hildenbrand wrote:
> On 17.04.23 21:16, Sean Christopherson wrote:
> > Hidden/Concealed/etc - Too close to secretmem, suffers the "hidden from whom" problem,
> > and depending on the use case, the memory may not actually be concealed from the
> > user that controls the VMM.
> >
> > Restricted - "rmem" collides with "reserved memory" in code.
> >
> > Guarded - Conflicts with s390's "guarded storage", has the "from whom" problem.
> >
> > Inaccessible - Many of the same problems as "hidden".
> >
> > Unmappable - Doesn't cover things like read/write, and is wrong in the sense that
> > the memory is still mappable, just not via mmap().
> >
> > Secured - I'm not getting anywhere near this one :-)
>
> The thing about "secretmem" that I kind-of like (a little) is that it
> explains what it's used for: storing secrets. We don't call it "unmapped"
> memory because we unmap it from the directmap or "unpinnable" memory or
> "inaccessible" memory ... or even "restricted" because it has restrictions
> ... how the secrets are protected is kind of an implementation detail.
>
> So instead of describing *why*/*how* restrictedmem is the weird kid
> (restricted/guarded/hidden/inaccessible/ ...), maybe rather
> describe what it's used for?
>
> I know, I know, "there are other use cases where it will be used outside of
> VM context". I really don't care.

Heh, we originally proposed F_SEAL_GUEST, but that was also sub-optimal[1] ;-)

> "memfd_vm" / "vm_mem" would be sooo (feel free to add some more o's here)
> much easier to get. It's a special fd to be used to back VM memory. Depending
> on the VM type (encrypted/protected/whatever), restrictions might apply (not
> able to mmap, not able to read/write ...). For example, there really is no
> need to disallow mmap/read/write when using that memory to back a simple VM
> where all we want to do is avoid user-space page tables.

In seriousness, I do agree with Jason's very explicit objection[2] against naming
a non-KVM uAPI "guest", or any variation thereof.

An alternative that we haven't considered since the very early days is making the
uAPI a KVM ioctl() instead of a memfd() flag or dedicated syscall. Looking at the
code for "pure shim" implementation[3], that's actually not that crazy of an idea.

I don't know that I love the idea of burying this in KVM, but there are benefits
to coupling restrictedmem to KVM (aside from getting out from behind this bikeshed
that I created).

The big benefit is that the layer of indirection goes away. That simplifies things
like enhancing restrictedmem to allow host userspace access for debug purposes,
batching TLB flushes if a PUNCH_HOLE spans multiple memslots, enforcing exclusive
access, likely the whole "share with a device" story if/when we get there, etc.

The obvious downsides are that (a) maintenance falls under the KVM umbrella, but
that's likely to be true in practice regardless of where the code lands, and
(b) if another use case comes along, e.g. the Gunyah hypervisor[4][5], we risk
someone reinventing a similar solution.

If we can get Gunyah on board and they don't need substantial changes to the
restrictedmem implementation, then I'm all for continuing on the path we're on.
But if Gunyah wants to do their own thing, and the lightweight shim approach is
viable, then it's awfully tempting to put this all behind a KVM ioctl().

[1] https://lore.kernel.org/all/df11d753-6242-8f7c-cb04-c095f68b41fa@xxxxxxxxxx
[2] https://lore.kernel.org/all/20211123171723.GD5112@xxxxxxxx
[3] https://lore.kernel.org/all/ZDiCG%2F7OgDI0SwMR@xxxxxxxxxx
[4] https://lore.kernel.org/all/Y%2FkI66qQFJJ6bkTq@xxxxxxxxxx
[5] https://lore.kernel.org/all/20230304010632.2127470-13-quic_eberman@xxxxxxxxxxx