Re: [PATCH for-next 0/3] RDMA/hns: Add more debugging information for rdma-tool

From: Leon Romanovsky
Date: Mon Aug 28 2023 - 14:03:52 EST


On Thu, Aug 24, 2023 at 03:58:15PM +0800, Junxian Huang wrote:
>
>
> On 2023/8/19 19:32, Leon Romanovsky wrote:
> > On Wed, Aug 16, 2023 at 05:18:09PM +0800, Junxian Huang wrote:
> >> 1. #1: The first patch supports dumping QP/CQ/MR context entirely in raw
> >> data with rdma-tool.
> >>
> >> 2. #2: The second patch supports query of HW stats with rdma-tool.
> >>
> >> 3. #3: The last patch supports query of SW stats with rdma-tool.
> >>
> >> Chengchang Tang (3):
> >> RDMA/hns: Dump whole QP/CQ/MR resource in raw
> >> RDMA/hns: Support hns HW stats
> >
> > These two patches generate static analyzer warnings.
> > ➜ kernel git:(wip/leon-for-next) mkt ci --rev 0a68261bbbe5
> > 0a68261bbbe5 (HEAD -> build) RDMA/hns: Dump whole QP/CQ/MR resource in raw
> > WARNING: 'informations' may be misspelled - perhaps 'information'?
> > #7:
> > rdma-tool, but these informations are not enough. It is very
> > ^^^^^^^^^^^^
> > ➜ kernel git:(wip/leon-for-next) mkt ci
> > 5a87279591a1 (HEAD -> build) RDMA/hns: Support hns HW stats
> > drivers/infiniband/hw/hns/hns_roce_hw_v2.c:1651:35: warning: restricted __le16 degrades to integer
> >
>
> OK, I'll fix them in V2.
>
> >> RDMA/hns: Support hns SW stats
> >
> > This is not support for SW stats, but actually an implementation of SW
> > statistics which you exposed through rdmatool. That tool is
>
> Yes,
>
> > not the right place for such information and debugfs will be a better
> > fit.
> >
> > Thanks
> >
>
> but from what I have seen, efa and bnxt_re drivers also use rdmatool
> to expose SW statistics.

I'm afraid that it was missed in review.

>
> And could you please explain why rdmatool is not suitable for this?

IMHO, SW statistics are too broad and too coupled with the code to be
really useful in rdmatool.

Let's take an example: your newly added counter in modify QP.
It counts the number of failures in hns_roce_modify_qp(), but that
function returns an error in such a case and users will see it anyway.

So what does this newly added counter give users beyond what they
already know? The answer is nothing, and that answer almost always applies
to SW statistics.
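
To make the point concrete, here is a minimal userspace sketch, not the
actual hns code. The names mock_dev, mock_modify_qp() and
failures_seen_by_caller() are hypothetical stand-ins, and the "negative
state is invalid" check is only an assumption to trigger the error path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a device with one SW error counter. */
struct mock_dev {
	unsigned long modify_qp_err_cnt;
};

/* Mimics the pattern: bump a SW counter on failure, then return the error. */
static int mock_modify_qp(struct mock_dev *dev, int new_state)
{
	/* Assumed validation rule, for illustration only. */
	int ret = (new_state < 0) ? -EINVAL : 0;

	if (ret)
		dev->modify_qp_err_cnt++;

	return ret;
}

/*
 * Caller-side tally built from return values alone.
 * Assumes dev->modify_qp_err_cnt was zero before the first call.
 */
static unsigned long failures_seen_by_caller(struct mock_dev *dev,
					     const int *states, size_t n)
{
	unsigned long seen = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (mock_modify_qp(dev, states[i]) < 0)
			seen++;

	/* The SW counter only repeats what the caller already observed. */
	assert(seen == dev->modify_qp_err_cnt);

	return seen;
}
```

In this sketch the caller can always rebuild the counter value from the
return codes, so the counter carries no extra information.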

It is unlikely that we can remove the counters already added to EFA and
bnxt_re, but if we can, it will be great.

Thanks

>
> Junxian
>
> >>
> >> drivers/infiniband/hw/hns/hns_roce_ah.c | 6 +-
> >> drivers/infiniband/hw/hns/hns_roce_cmd.c | 19 ++-
> >> drivers/infiniband/hw/hns/hns_roce_cq.c | 15 +-
> >> drivers/infiniband/hw/hns/hns_roce_device.h | 50 ++++++
> >> drivers/infiniband/hw/hns/hns_roce_hw_v2.c | 59 +++++++
> >> drivers/infiniband/hw/hns/hns_roce_hw_v2.h | 1 +
> >> drivers/infiniband/hw/hns/hns_roce_main.c | 152 +++++++++++++++++-
> >> drivers/infiniband/hw/hns/hns_roce_mr.c | 26 ++-
> >> drivers/infiniband/hw/hns/hns_roce_pd.c | 10 +-
> >> drivers/infiniband/hw/hns/hns_roce_qp.c | 8 +-
> >> drivers/infiniband/hw/hns/hns_roce_restrack.c | 75 +--------
> >> drivers/infiniband/hw/hns/hns_roce_srq.c | 6 +-
> >> 12 files changed, 325 insertions(+), 102 deletions(-)
> >>
> >> --
> >> 2.30.0
> >>