Re: [RFC PATCH 0/5] Prototype for direct map awareness in page allocator

From: Sean Christopherson
Date: Fri May 19 2023 - 14:25:59 EST


On Fri, May 19, 2023, Mike Rapoport wrote:
> On Fri, May 19, 2023 at 08:40:48AM -0700, Sean Christopherson wrote:
> > On Thu, Mar 09, 2023, Mike Rapoport wrote:
> > > On Thu, Mar 09, 2023 at 01:59:00AM +0000, Edgecombe, Rick P wrote:
> > > > On Wed, 2023-03-08 at 11:41 +0200, Mike Rapoport wrote:
> > > > > From: "Mike Rapoport (IBM)" <rppt@xxxxxxxxxx>
> > > > >
> > > > > Hi,
> > > > >
> > > > > This is a third attempt to make the page allocator aware of the direct map
> > > > > layout and allow grouping of the pages that must be unmapped from the
> > > > > direct map.
> > > > >
> > > > > This is a new implementation of __GFP_UNMAPPED, kinda a follow up for
> > > > > this set:
> > > > >
> > > > > https://lore.kernel.org/all/20220127085608.306306-1-rppt@xxxxxxxxxx
> > > > >
> > > > > but instead of using a migrate type to cache the unmapped pages, the
> > > > > current implementation adds a dedicated cache to serve __GFP_UNMAPPED
> > > > > allocations.
> > > >
> > > > It seems a downside to having a page allocator outside of _the_ page
> > > > allocator is you don't get all of the features that are baked in there.
> > > > For example does secretmem care about numa? I guess in this
> > > > implementation there is just one big cache for all nodes.
> > > >
> > > > Probably most users would want __GFP_ZERO. Would secretmem care about
> > > > __GFP_ACCOUNT?
> > >
> > > The intention was that the pages in cache are always zeroed, so __GFP_ZERO
> > > is always implicitly there, at least should have been.
> >
> > Would it be possible to drop that assumption/requirement, i.e. allow allocation of
> > __GFP_UNMAPPED without __GFP_ZERO? At a glance, __GFP_UNMAPPED looks like it would
> > be a great fit for backing guest memory, in particular for confidential VMs. And
> > for some flavors of CoCo, i.e. TDX, the trusted intermediary is responsible for
> > zeroing/initializing guest memory as the untrusted host (kernel/KVM) doesn't have
> > access to the guest's encryption key. In other words, zeroing in the kernel would
> > be unnecessary work.
>
> Making an unmapped allocation without __GFP_ZERO shouldn't be a problem.
>
> However, using a gfp flag and hooking up into the free path in page
> allocator have issues and preferably should be avoided.
>
> Will something like unmapped_alloc() and unmapped_free() work for your
> usecase?

Yep, I'm leaning more and more towards having KVM implement its own ioctl() for
managing this type of memory. Wiring that up to use dedicated APIs should be no
problem.

Thanks!
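
For illustration only, here is a rough userspace sketch of what a dedicated
unmapped_alloc()/unmapped_free() pair with optional zeroing might look like to
a caller. Only the two function names come from the thread; the signatures,
the UNMAPPED_ZERO flag, MOCK_PAGE_SIZE and MOCK_MAX_ORDER are all assumptions
made up for this sketch, and malloc() stands in for the real cache of pages
removed from the direct map. It is not the proposed kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* All names below except unmapped_alloc/unmapped_free are hypothetical. */
#define MOCK_PAGE_SIZE  4096u   /* assumed page size for the mock */
#define MOCK_MAX_ORDER  10u     /* arbitrary upper bound for the mock */
#define UNMAPPED_ZERO   0x1u    /* hypothetical "caller wants zeroed pages" flag */

/*
 * Userspace mock: allocate 2^order pages. Zeroing happens only when the
 * caller asks for it, so a user whose memory is initialized elsewhere
 * (e.g. by a trusted intermediary) can skip the redundant work.
 */
static void *unmapped_alloc(unsigned int order, unsigned int flags)
{
    size_t size;
    void *p;

    assert(order <= MOCK_MAX_ORDER);
    size = (size_t)MOCK_PAGE_SIZE << order;

    p = malloc(size);           /* stand-in for the unmapped page cache */
    if (!p)
        return NULL;

    if (flags & UNMAPPED_ZERO)
        memset(p, 0, size);

    return p;
}

/* Userspace mock: return the pages; order is unused by the malloc backend. */
static void unmapped_free(void *p, unsigned int order)
{
    (void)order;
    free(p);
}
```

A caller that needs zeroed memory would pass UNMAPPED_ZERO, while a CoCo-style
user could pass 0 and leave initialization to whoever owns the encryption key.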