Re: [PATCH bpf-next v3 1/3] xsk: Support UMEM chunk_size > PAGE_SIZE

From: Kal Cutter Conley
Date: Fri Apr 14 2023 - 05:04:18 EST


> >> "More annoying" is not a great argument, though. You're basically saying
> >> "please complicate your code so I don't have to complicate mine". And
> >> since kernel API is essentially frozen forever, adding more of them
> >> carries a pretty high cost, which is why kernel developers tend not to
> >> be easily swayed by convenience arguments (if all you want is a more
> >> convenient API, just build one on top of the kernel primitives and wrap
> >> it into a library).
> >
> > I was trying to make a fair comparison from the user's perspective
> > between having to allocate huge pages and deal with discontiguous
> > buffers. That was all.
> >
> > I think the "your code" distinction is a bit harsh. The kernel is a
> > community project. Why isn't it "our" code? I am trying to add a
> > feature that I think is generally useful to people. The kernel only
> > exists to serve its users.
>
> Oh, I'm sorry if that came across as harsh, that was not my intention! I
> was certainly not trying to make a "you vs us" distinction; I was just
> trying to explain why making changes on the kernel side carries a higher
> cost than an equivalent (or even slightly more complex) change on the
> userspace side, because of the UAPI consideration.

No problem! I agree. I am just somewhat confused by the "slightly more
complex" view of the situation. Currently, packets > 4K are not
supported _at all_ with AF_XDP. So this patchset adds something that
is not possible _at all_ today. We have been patiently waiting since
2018 for AF_XDP jumbo packet support. It seems that interest in this
feature from maintainers has been... lacking. :-) Now, if I understand
the situation correctly, you are asking for benchmarks to compare this
implementation with AF_XDP multi-buffer which doesn't exist yet? I am
glad it's being worked on but until there is a patchset, AF_XDP
multi-buffer is still VAPORWARE. :-)

Now in our case, we are primarily interested in throughput and
reducing total system load (so we have more compute budget for other
tasks). The faster we can receive packets the better. The packets we
need to receive are (almost) all 8000-9000 bytes... Using AF_XDP
multi-buffer would mean either (1) allocating extra space per packet
and copying the data to linearize it or (2) rewriting a significant
amount of code to handle segmented packets efficiently. If you want
benchmarks for (1) just use xdpsock and compare -z with -c. I think
that is a good approximation...

Hopefully, the HFT crowd can save this patchset in the end. :-)

-Kal