Re: mlx5 ConnectX diagnostic misc driver

From: Jason Gunthorpe
Date: Thu Oct 19 2023 - 12:01:56 EST


On Thu, Oct 19, 2023 at 05:36:04PM +0200, Greg Kroah-Hartman wrote:
> On Thu, Oct 19, 2023 at 08:24:51AM -0700, Jakub Kicinski wrote:
> > > The ConnectX HW family supported by the mlx5 drivers uses an architecture
> > > where a FW component executes "mailbox RPCs" issued by the driver to make
> > > changes to the device. This results in a complex debugging environment
> > > where the FW component has information and complex low level state that
> > > needs to be accessed to userspace for debugging purposes.
> >
> > You're being very dishonest towards Greg by not telling him that this
> > is a networking device, and the networking maintainers explicitly nacked
> > this backdoor.

Do you have a lore link?

> Well, in this case, no way in hell will I be taking this. If this is a
> networking device, it needs to go through the normal networking driver
> review process, thanks for the heads up.

It is not just a networking device. mlx5 is a giant and complex
multi-subsystem piece of hardware.

This is a shared debugging and configuration interface to the device
FW that interacts across all of the different subsystems the driver
supports.

Looking at the capabilities of Saeed's tool on his GitHub, it
significantly, but not exclusively, supports RDMA (i.e.
drivers/infiniband), with some features for the mlx5 VFIO drivers,
mlx5 VDPA and a bunch of low-level PCI stuff too.

Calling it a "networking device" in the sense of "it is owned only by
netdev" is not accurate.

We think misc is an appropriate place to put something like this;
there are many other misc drivers that provide somewhat similar APIs
to access embedded FW in order to manage and debug it.

Jason