Re: [PATCH V4 0/5] mlx5 ConnectX control misc driver

From: Jason Gunthorpe
Date: Thu Feb 15 2024 - 08:22:54 EST


On Wed, Feb 14, 2024 at 07:48:32AM -0800, Jakub Kicinski wrote:
> On Wed, 14 Feb 2024 00:29:16 -0800 Christoph Hellwig wrote:
> > With my busy kernel contributor head on I have to voice my
> > dissatisfaction with the subsystem maintainer overreach that's causing
> > the troubles here.
>
> Overreach is unfortunate, I'd love to say "please do merge it as part
> of RDMA". You probably don't trust my opinion but Jason admitted himself
> this is primarily for RDMA.

"admitted"? You make it sound like a crime. I've been very clear on
this need from the RDMA community since the first posting.

> The problem is that some RDMA stuff is built really closely on TCP,

Huh? Since when? Are you talking about soft-iwarp? That is a research
project and Bernard is very responsive, if you have issues ask him and
he will help.

Otherwise the actual HW devices are not entangled with netdev TCP, the
few iWarp devices have their own TCP implementation, in accordance
with what the IETF standardized.

> and given Jason's and co. inability to understand that good fences
> make good neighbors it will soon start getting into the netdev stack :|

I seem to recall you saying RDMA shouldn't call any netdev APIs at
all. We were unable to agree on where to build the fence for this
reason.

> Ah, and I presume they may also want it for their DOCA products.
> So 80% RDMA, 15% DOCA, 5% the rest is my guess.

I don't know all the details about DOCA, but the parts I do know about
run over RDMA.

> Not sure what you mean by "without lots of precedence" but you can ask
> around netdev. We have nacked such interfaces multiple times.
> The best proof the rule exists and is well established it is that Saeed
> has himself asked us a number of times to lift it.
>
> What should be expected of us is fairness and not engaging in politics.
> We have a clear rule against opaque user space to FW interfaces,
> and I don't see how we could enforce that fairly for pure Ethernet
> devices if big vendors get to do whatever they want.

If your community is telling you your rules are not working for them
anymore, it is not nice to tell them that rules exist and cannot be
questioned. Try working together toward a reasonable consensus
solution.

The world has changed a lot, the use cases are different, the users are
different, the devices are different. When Dave made that prohibition
long ago it was not in a world of a multi billion transistor NIC being
deployed in uniform clusters of unimaginable size.

Jason