Re: [PATCH V4 0/5] mlx5 ConnectX control misc driver

From: Jakub Kicinski
Date: Thu Feb 15 2024 - 20:10:28 EST


On Wed, 14 Feb 2024 23:00:40 -0800 Christoph Hellwig wrote:
> On Wed, Feb 14, 2024 at 07:48:32AM -0800, Jakub Kicinski wrote:
> > Overreach is unfortunate, I'd love to say "please do merge it as part
> > of RDMA". You probably don't trust my opinion but Jason admitted himself
> > this is primarily for RDMA. RDMA is what it is in terms of openness and
> > all vendors trying to sell their secret magic sauce.
>
> Come on. RDMA has two important open standards, one of them even done
> in the IETF, that most open of all standards organizations.

While I don't dispute that there are standards which can be read,
the practical interoperability of RDMA devices is extremely low.
By practical I mean having two devices from different vendors
achieve any reasonable performance talking to each other.
Even two devices from _the same_ vendor but different generations
are unlikely to perform well together.

Given how RDMA is deployed (uniform, greenfield/full replacement)
this is entirely reasonable from the engineering perspective.

But this is a bit of a vicious cycle: vendors have little incentive
to interoperate, and primarily focus on adding secret sauce outside of
the standard. In fact you're lucky if the vendor didn't bake some
extension which requires custom switches into the NICs :(

Compare that to WiFi, which is a level of standardization netdev folks
are more accustomed to. You can connect a new device from vendor X to
a 10-year-old AP from vendor Y and it will run with high perf.

Unfortunately, because of the AI craze, I have some experience
with RDMA deployments now. Perhaps you have more, perhaps your
experience differs.