Re: [PATCH V3 2/5] misc: mlx5ctl: Add mlx5ctl misc driver

From: Saeed Mahameed
Date: Tue Nov 28 2023 - 14:55:13 EST


On 28 Nov 10:33, Jakub Kicinski wrote:
On Tue, 28 Nov 2023 13:52:24 -0400 Jason Gunthorpe wrote:
> The question at LPC was about making devlink params completely
> transparent to the kernel. Basically added directly from FW.
> That's what I was not happy about.

It is creating a back-porting nightmare for all the enterprise
distributions.

We don't care about enterprise distros, Jason, or stable kernel APIs.

> You can add as many params at the driver level as you want.
> In fact I asked Saeed repeatedly to start posting all those
> params instead of complaining.

That really isn't what you said in the video.

Regardless, configurables are only one part of what mlx5ctl addresses,
we still have all the debuggability problems, which are arguably more
important.

Read-only debug interfaces are "do whatever you want" in netdev.
Params controlling them (ie. writing stuff) need to be reviewed
but are also allowed.

Doesn't mlx5 have a pile of stuff in debugfs already?


Not enough, not scalable, and it's a backporting and maintenance
nightmare, as Jason already showed.

mlx5 supports creating millions of objects, and tools need to selectively
pick which objects to dump for a specific use case. If it's OK with you to
do this in debugfs, then ioctl is much cleaner, so what's your problem
with mlx5ctl?


Nobody bothered to answer my "are you not going to support mstreg over
this" question (arbitrary register writes).

> Let the users complain about the user problems. Also something
> I repeatedly told Saeed. His response was something along the lines
> of users are secret, they can't post on the list, blah, blah.

You mean like the S390 team at IBM did in the video?

This is not a reasonable position. One of the jobs of the vendors is
to aggregate the user requests. Even the giant hyperscale customers
that do have the capacity to come on this list prefer to delegate
these things to us.

If you want to get a direct user forum the kernel mailing list is not
an appropriate place to do it.

Agree to disagree.

> You know one user who is participating in this thread?
> *ME*
> While the lot of you work for vendors.

I'm sick of this vendor bashing. You work for *one* user. You know who
talks to *every* user out there? *ME*.

Users and vendors need debugging of this complex HW. I don't need to
bring a parade of a dozen users to this thread to reinforce that
obvious truth. Indeed when debugging is required the vendor usually
has to do it, so we are the user in this discussion.

You didn't answer the question: what is your alternative debuggability
vision here?

Covered above. And it's been discussed multiple times.

Honestly I don't want to spend any more time discussing this.
Once you're ready to work together in good faith let me know.

On future revisions of this series please carry:

Nacked-by: Jakub Kicinski <kuba@xxxxxxxxxx>

I asked before and never got a technical answer: based on what?

All we got is political views and complaints against vendors.
What is your proposal for accessing all possible debug information from a
vendor-specific device? devlink X Y Z, debugfs? Won't work, sorry.

And I can't accept "do it out of tree" as an answer from a
well-established Linux maintainer. The whole point of this is to have it
available on every Linux box with any mlx5 configuration (not only netdev)
so we can start debugging on the spot.

As for your claim that we need this for setting device parameters, it
is simply not true: we don't need this driver to do that. So please go
back and read the cover letter and the code, and let me know what is
wrong with our approach to getting access to our device's debug info.