Re: Rename restrictedmem => guardedmem? (was: Re: [PATCH v10 0/9] KVM: mm: fd-based approach for supporting KVM)

From: Chao Peng
Date: Tue May 09 2023 - 08:52:13 EST


On Fri, May 05, 2023 at 07:39:36PM +0000, Ackerley Tng wrote:
>
> Hi Sean,
>
> Thanks for implementing this POC!
>
> I’ve started porting the selftests (both Chao’s and those I added [1]).

Hi Sean/Ackerley,

Thanks for doing that. Overall making gmem a KVM ioctl() looks good to
me and it should also play nice with Intel TDX. Besides what Ackerley
mentioned below, I think we haven't discussed device assignment, which
will be supported in not too long distance. Current VFIO_IOMMU_MAP_DMA
consumes virtual address so that needs to be fixed for fd-based memory
anyway, and the fix looks not related to whether this being a syscall()
or a KVM ioctl(). There will be some initialization sequence dependency,
e.g. if gmem is finally a VM-scope ioctl() then we need VM created first
before can we map fd-based memory in VFIO, but that sounds not an issue
at all.

I also see Vlastimil/David expressed their preference on ioctl. So maybe
we can move forward on your current PoC. Do you already have a plan to
post a formal version?

Chao

>
> guest mem seems to cover the use cases that have been discussed and
> proposed so far, but I still need to figure out how gmem can work with
>
> + hugetlbfs
> + specification of/storing memory policy (for NUMA node bindings)
> + memory accounting - we may need to account for memory used separately,
> so that guest mem shows up separately on /proc/meminfo and similar
> places.
>
> One issue I’ve found so far is that the pointer to kvm (gmem->kvm) is
> not cleaned up, and hence it is possible to crash the host kernel in the
> following way
>
> 1. Create a KVM VM
> 2. Create a guest mem fd on that VM
> 3. Create a memslot with the guest mem fd (hence binding the fd to the
> VM)
> 4. Close/destroy the KVM VM
> 5. Call fallocate(PUNCH_HOLE) on the guest mem fd, which uses gmem->kvm
> when it tries to do invalidation.
>
> I then tried to clean up the gmem->kvm pointer during unbinding when the
> KVM VM is destroyed.
>
> That works, but then I realized there’s a simpler way to use the pointer
> after freeing:
>
> 1. Create a KVM VM
> 2. Create a guest mem fd on that VM
> 3. Close/destroy the KVM VM
> 4. Call fallocate(PUNCH_HOLE) on the guest mem fd, which uses gmem->kvm
> when it tries to do invalidation.
>
> Perhaps binding should mean setting the gmem->kvm pointer in addition to
> gmem->bindings. This makes binding and unbinding symmetric and avoids
> the use-after-frees described above.
>
> This also means that creating a guest mem fd is no longer dependent on
> the VM. Perhaps we can make creating a gmem fd a system ioctl (like
> KVM_GET_API_VERSION and KVM_CREATE_VM) instead of a vm ioctl?
>
> [1]
> https://lore.kernel.org/all/cover.1678926164.git.ackerleytng@xxxxxxxxxx/T/
>
> Ackerley