Re: Memory providers multiplexing (Was: [PATCH net-next v4 4/5] page_pool: remove PP_FLAG_PAGE_FRAG flag)

From: Mina Almasry
Date: Wed Jul 12 2023 - 18:42:07 EST


On Wed, Jul 12, 2023 at 6:35 AM Christian König
<christian.koenig@xxxxxxx> wrote:
>
> Am 12.07.23 um 15:03 schrieb Jason Gunthorpe:
> > On Wed, Jul 12, 2023 at 09:55:51AM +0200, Christian König wrote:
> >
> >>> Anyone see any glaring issues with this approach? I plan on trying to
> >>> implement a PoC and sending an RFC v2.
> >> Well we already have DMA-buf as user API for this use case, which is
> >> perfectly supported by RDMA if I'm not completely mistaken.
> >>
> >> So what problem do you try to solve here actually?
> > In a nutshell, netdev's design currently needs struct pages to do DMA
> > to its packet buffers.
> >
> > So it cannot consume the scatterlist that dmabuf puts out
> >
> > RDMA doesn't need struct pages at all, so it is fine.
> >
> > If Mina can go down the path of changing netdev to avoid needing
> > struct pages then no changes to DRM side things.
> >
> > Otherwise a P2P struct page and a coexistence with netmem on a
> > ZONE_DEVICE page would be required. :\
>
> Uff, depending on why netdev needs struct page (I think I have a good
> idea why) this isn't really going to work generically either way.
>
> What we may be able to do is to allow copy_file_range() between DMA-buf
> file descriptor and a TCP socket.
>
> If I'm not completely mistaken that should then end up in DMA-bufs
> file_operations->copy_file_range callback (maybe with some minor change
> to allow this).
>
> The DMA-buf framework could then forward this to the exporter that
> owns the backing memory, which could then do the necessary steps.
>

I may be missing something, but the way it works on our end for
receive is that we give a list of buffers (dma_addr + length + other
metadata) to the network card, and the network card writes incoming
packets to these dma_addrs and gives us an rx completion pointing to
the data it DMA'd. Usually the driver does something like an
alloc_page() + dma_map_page() and provides the resulting dma_addr to
the network card. The transmit path works similarly. I'm not sure
that adding copy_file_range() support to dma-buf enables this in
some way.
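
To make the receive flow above concrete, here is a rough user-space
sketch of it. This is not real driver code: struct rx_desc,
struct rx_completion, rx_post_buffer(), nic_receive() and
fake_alloc_and_map() are all hypothetical names made up for
illustration, the "NIC" is simulated with a memcpy(), and the
"dma_addr" is just a pointer value standing in for what
alloc_page() + dma_map_page() would return in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define RING_SIZE 4
#define BUF_LEN   2048

/* Hypothetical descriptor: what the driver hands to the device. */
struct rx_desc {
	uint64_t dma_addr;	/* address the device should write to */
	uint32_t len;		/* capacity of the buffer in bytes */
};

/* Hypothetical completion: what the device reports back. */
struct rx_completion {
	uint32_t desc_idx;	/* which posted descriptor was filled */
	uint32_t pkt_len;	/* number of bytes the device wrote */
};

struct rx_ring {
	struct rx_desc desc[RING_SIZE];
	uint32_t posted;	/* descriptors handed to the device so far */
	uint32_t consumed;	/* descriptors the device has filled so far */
};

/* Backing storage that plays the role of the pages. */
static uint8_t fake_memory[RING_SIZE][BUF_LEN];

/*
 * Stand-in for alloc_page() + dma_map_page(). In this simulation the
 * "DMA address" is simply the buffer's pointer value.
 */
static uint64_t fake_alloc_and_map(uint32_t slot)
{
	return (uint64_t)(uintptr_t)fake_memory[slot];
}

/* Driver side: post one buffer (dma_addr + length) to the device. */
static int rx_post_buffer(struct rx_ring *ring)
{
	uint32_t idx;

	assert(ring != NULL);
	if (ring->posted - ring->consumed >= RING_SIZE)
		return -1;	/* ring is full */

	idx = ring->posted % RING_SIZE;
	ring->desc[idx].dma_addr = fake_alloc_and_map(idx);
	ring->desc[idx].len = BUF_LEN;
	ring->posted++;
	return 0;
}

/*
 * Device side (simulated): write an incoming packet to the next posted
 * dma_addr and produce a completion that points at it.
 */
static int nic_receive(struct rx_ring *ring, const void *pkt,
		       uint32_t pkt_len, struct rx_completion *c)
{
	uint32_t idx;

	if (ring->consumed == ring->posted)
		return -1;	/* no buffer available, packet dropped */

	idx = ring->consumed % RING_SIZE;
	if (pkt_len > ring->desc[idx].len)
		return -1;	/* packet does not fit in the buffer */

	memcpy((void *)(uintptr_t)ring->desc[idx].dma_addr, pkt, pkt_len);
	c->desc_idx = idx;
	c->pkt_len = pkt_len;
	ring->consumed++;
	return 0;
}

/* Driver side: find the data a completion refers to. */
static const uint8_t *rx_completion_data(const struct rx_ring *ring,
					 const struct rx_completion *c)
{
	return (const uint8_t *)(uintptr_t)ring->desc[c->desc_idx].dma_addr;
}
```

The point of the sketch is that the device only ever sees
(dma_addr, length) pairs that the driver posted ahead of time, and
the data lands wherever those addresses point. For simplicity it
assumes the driver has finished reading a buffer before its slot is
posted again.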

--
Thanks,
Mina