Re: Memory providers multiplexing (Was: [PATCH net-next v4 4/5] page_pool: remove PP_FLAG_PAGE_FRAG flag)

From: Jason Gunthorpe
Date: Mon Jul 10 2023 - 13:44:14 EST


On Wed, Jul 05, 2023 at 06:17:39PM -0700, Mina Almasry wrote:

> Another issue is that in networks with low MTU, we could be DMAing
> 1400/1500 bytes into each allocation, which is problematic if the
> allocation is 8K+. I would need to investigate a bit to see if/how to
> solve that, and we may end up having to split the page and again run
> into the 'not enough room in struct page' problem.

You don't have an intree driver to use this with, so who knows, but
the out of tree GPU drivers tend to use a 64k memory management page
size, and I don't expect you'd make progress with a design where a 64K
naturaly sized allocator is producing 4k/8k non-compound pages just
for netdev. We are still struggling with pagemap support for variable
page size folios, so there is a bunch of technical blockers before
drivers could do this.

This is why it is so important to come with a complete in-tree
solution, as we cannot review this design if your work is done with
hacked up out of tree drivers.

Fully and properly adding P2P ZONE_DEVICE to a real world driver is a
pretty big ask still.

> > Or allocate per page memory and do a memdesc like thing..
>
> I need to review memdesc more closely. Do you imagine I add a pointer
> in struct page that points to the memdesc?

Pointer to extra memory from the PFN has been the usual meaning of
memdesc, so doing an interm where the pointer is in the struct page is
a reasonable starting point.

> > Though overall, you won't find devices creating struct pages for their
> > P2P memory today, so I'm not sure what the purpose is. Jonathan
> > already got highly slammed for proposing code to the kernel that was
> > unusable. Please don't repeat that. Other than a special NVMe use case
> > the interface for P2P is DMABUF right now and it is not struct page
> > backed.
> >
>
> Our approach is actually to extend DMABUF to provide struct page
> backed attachment mappings, which as far as I understand sidesteps the
> issues Jonathan ran into.

No DMABUF exporters do this today, so your patch series is just as
incomplete as the prior ones. Please don't post it as non-RFC,
unusable code like this must not be merged.

> that supports dmabuf and in fact a lot of my tests use udmabuf to
> minimize the dependencies. The RFC may come with a udmabuf selftest to
> showcase that any dmabuf, even a mocked one, would be supported.

That is not good enough to get merged. You need to get agreement and
coded merged from actual driver owners of dmabuf exporters that they
want to support this direction. As above it has surprising road
blocks outside netdev :\

Jason