Re: [PATCH v13 10/24] gunyah: vm_mgr: Add/remove user memory regions

From: Will Deacon
Date: Thu Jul 20 2023 - 06:40:00 EST


On Tue, Jul 18, 2023 at 07:28:49PM -0700, Elliot Berman wrote:
> On 7/14/2023 5:13 AM, Will Deacon wrote:
> > On Thu, Jul 13, 2023 at 01:28:34PM -0700, Elliot Berman wrote:
> > > On 6/22/2023 4:56 PM, Elliot Berman wrote:
> > > > On 6/7/2023 8:54 AM, Elliot Berman wrote:
> > > > > On 6/5/2023 7:18 AM, Will Deacon wrote:
> > > > > > Right, protected guests will use the new restricted memfd ("guest mem"
> > > > > > now, I think?), but non-protected guests should implement the existing
> > > > > > interface *without* the need for the GUP pin on guest memory pages. Yes,
> > > > > > that means full support for MMU notifiers so that these pages can be
> > > > > > managed properly by the host kernel. We're working on that for pKVM, but
> > > > > > it requires a more flexible form of memory sharing over what we
> > > > > > currently
> > > > > > have so that e.g. the zero page can be shared between multiple entities.
> > > > >
> > > > > Gunyah doesn't support swapping pages out while the guest is running
> > > > > and the design of Gunyah isn't made to give host kernel full control
> > > > > over the S2 page table for its guests. As best I can tell from
> > > > > reading the respective drivers, ACRN and Nitro Enclaves both GUP pin
> > > > > guest memory pages prior to giving them to the guest, so I don't
> > > > > think this requirement from Gunyah is particularly unusual.
> > > > >
> > > >
> > > > I read/dug into mmu notifiers more and I don't think it matches with
> > > > Gunyah's features today. We don't allow the host to freely manage VM's
> > > > pages because it requires the guest VM to have a level of trust on the
> > > > host. Once a page is given to the guest, it's done for the lifetime of
> > > > the VM. Allowing the host to replace pages in the guest memory map isn't
> > > > part of any VM's security model that we run in Gunyah. With that
> > > > requirement, longterm pinning looks like the correct approach to me.
> > >
> > > Is my approach of longterm pinning correct given that Gunyah doesn't allow
> > > host to freely swap pages?
> >
> > No, I really don't think a longterm GUP pin is the right approach for this.
> > GUP pins in general are horrible for the mm layer, but required for cases
> > such as DMA where I/O faults are unrecoverable. Gunyah is not a good
> > justification for such a hack, and I don't think you get to choose which
> > parts of the Linux mm you want and which bits you don't.
> >
> > In other words, either carve out your memory and pin it that way, or
> > implement the proper hooks for the mm to do its job.
>
> I talked to the team about whether we can extend the Gunyah support for
> this. We have plans to support sharing/lending individual pages when the
> guest faults on them. The support also allows (unprotected) pages to be
> removed from the VM. We'll need to temporarily pin the pages of the VM
> configuration device tree blob while the VM is being created and those pages
> can be unpinned once the VM starts. I'll work on this.

That's pleasantly unexpected, thanks for pursuing this!

Will