RE: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW controller

From: Shradha Todi
Date: Thu Feb 15 2024 - 04:58:31 EST




> -----Original Message-----
> From: 'Manivannan Sadhasivam' <manivannan.sadhasivam@xxxxxxxxxx>
> Sent: 04 January 2024 11:21
> To: Shradha Todi <shradha.t@xxxxxxxxxxx>
> Cc: lpieralisi@xxxxxxxxxx; kw@xxxxxxxxx; robh@xxxxxxxxxx;
> bhelgaas@xxxxxxxxxx; jingoohan1@xxxxxxxxx;
> gustavo.pimentel@xxxxxxxxxxxx; josh@xxxxxxxxxxxxxxxx;
> lukas.bulwahn@xxxxxxxxx; hongxing.zhu@xxxxxxx;
> pankaj.dubey@xxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-
> pci@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in PCIe DW
> controller
>
> On Wed, Jan 03, 2024 at 11:13:20AM +0530, Shradha Todi wrote:
> >
> >
> > > -----Original Message-----
> > > From: Shradha Todi <shradha.t@xxxxxxxxxxx>
> > > Sent: 04 December 2023 14:10
> > > To: 'Manivannan Sadhasivam' <manivannan.sadhasivam@xxxxxxxxxx>
> > > Cc: 'lpieralisi@xxxxxxxxxx' <lpieralisi@xxxxxxxxxx>; 'kw@xxxxxxxxx'
> > > <kw@xxxxxxxxx>; 'robh@xxxxxxxxxx' <robh@xxxxxxxxxx>;
> > > 'bhelgaas@xxxxxxxxxx' <bhelgaas@xxxxxxxxxx>; 'jingoohan1@xxxxxxxxx'
> > > <jingoohan1@xxxxxxxxx>; 'gustavo.pimentel@xxxxxxxxxxxx'
> > > <gustavo.pimentel@xxxxxxxxxxxx>; 'josh@xxxxxxxxxxxxxxxx'
> > > <josh@xxxxxxxxxxxxxxxx>; 'lukas.bulwahn@xxxxxxxxx'
> > > <lukas.bulwahn@xxxxxxxxx>; 'hongxing.zhu@xxxxxxx'
> > > <hongxing.zhu@xxxxxxx>; 'pankaj.dubey@xxxxxxxxxxx'
> > > <pankaj.dubey@xxxxxxxxxxx>; 'linux-kernel@xxxxxxxxxxxxxxx' <linux-
> > > kernel@xxxxxxxxxxxxxxx>; 'linux-pci@xxxxxxxxxxxxxxx' <linux-
> > > pci@xxxxxxxxxxxxxxx>
> > > Subject: RE: [PATCH v2 0/3] Add support for RAS DES feature in PCIe
> > > DW controller
> > >
> > >
> > >
> > > > -----Original Message-----
> > > > From: Manivannan Sadhasivam
> > > > [mailto:manivannan.sadhasivam@xxxxxxxxxx]
> > > > Sent: 30 November 2023 22:25
> > > > To: Shradha Todi <shradha.t@xxxxxxxxxxx>
> > > > Cc: lpieralisi@xxxxxxxxxx; kw@xxxxxxxxx; robh@xxxxxxxxxx;
> > > > bhelgaas@xxxxxxxxxx; jingoohan1@xxxxxxxxx;
> > > > gustavo.pimentel@xxxxxxxxxxxx; josh@xxxxxxxxxxxxxxxx;
> > > > lukas.bulwahn@xxxxxxxxx; hongxing.zhu@xxxxxxx;
> > > > pankaj.dubey@xxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; linux-
> > > > pci@xxxxxxxxxxxxxxx
> > > > Subject: Re: [PATCH v2 0/3] Add support for RAS DES feature in
> > > > PCIe DW controller
> > > >
> > > > On Thu, Nov 30, 2023 at 05:20:41PM +0530, Shradha Todi wrote:
> > > > > DesignWare controller provides a vendor specific extended
> > > > > capability called RASDES as an IP feature. This extended
> > > > > capability provides hardware information like:
> > > > > - Debug registers to know the state of the link or controller.
> > > > > - Error injection mechanisms to inject various PCIe errors including
> > > > > sequence number, CRC
> > > > > - Statistical counters to know how many times a particular event
> > > > > occurred
> > > > >
> > > > > However, in Linux we do not have any generic or custom support
> > > > > to be able to use this feature in an efficient manner. This is
> > > > > the reason we are proposing this framework. Debug and bring up
> > > > > time of high-speed IPs are highly dependent on costlier hardware
> > > > > analyzers and this solution will in some ways help to reduce the HW
> analyzer usage.
> > > > >
> > > > > The debugfs entries can be used to get information about
> > > > > underlying hardware and can be shared with user space. Separate
> > > > > debugfs entries has been created to cater to all the DES hooks provided
> by the controller.
> > > > > The debugfs entries interacts with the RASDES registers in the
> > > > > required sequence and provides the meaningful data to the user.
> > > > > This eases the effort to understand and use the register
> > > > > information for
> > > > debugging.
> > > > >
> > > > > v1 version was posted long back and for some reasons I couldn't
> > > > > work on it. I apologize for the long break. I'm restarting this
> > > > > activity and have taken care of all previous review comments shared.
> > > > > v1:
> > > > > https://lore.kernel.org/all/20210518174618.42089-1-shradha.t@sam
> > > > > sung
> > > > > .c
> > > > > om/T/
> > > > >
> > > >
> > > > There is already a series floating to add similar functionality
> > > > via perf
> > > > subsystem:
> > > > https://lore.kernel.org/linux-pci/20231121013400.18367-1-
> > > > xueshuai@xxxxxxxxxxxxxxxxx/
> > > >
> > > > - Mani
> > > >
> > >
> > > Hi Mani,
> > >
> > > The series proposed in perf includes only time based-analysis and
> > > event counters which will monitor performance (Group 6 and 7). The
> > > patch or framework that we have proposed includes debug information,
> > > error injection facility and error counters (Group 0 - 5) which are
> > > not included as part of the functionality implemented via perf. In
> > > my opinion, these functionalities don't count as performance
> > > monitoring or counters but rather as debug counters. How about we take this
> up as a debugfs framework as proposed in my patch?
> > > Or if others feel it can be taken via perf driver then I am happy to
> > > extend the perf driver if authors do not have objection. Let me know what
> you think of this?
> > > Meanwhile I will review the perf patches and share my feedback.
> > >
> >
> > Hello Mani,
> > Any update on the above comment? IMO, even though the perf patches and
> > this patchset are both part of the DWC vendor specific capability -
> > RASDES, they cover different features. The perf file includes
> > performance based parameters like time-based analysis and event
> > counters for count of packets whereas this patchset includes debugging
> > fields, error injection and event counters for count of errors. I
> > think having a separate debugfs file fits more but would you suggest we extend
> the perf file itself?
> >
>
> For the error injection and counters, we already have the EDAC framework. So
> adding them in the DWC driver doesn't make sense to me.
>

Sorry for late response, was going through the EDAC framework to understand better how we can fit RAS DES support in it. Below are some technical challenges found so far:
1: This debugfs framework proposed [1] can run on both side of the link i.e. RC and EP as it will be a part of the link controller platform driver Here for the EP side the assumption is that it has Linux running, which is primarily a use case for chip-to-chip communication. After your suggestion to migrate to EDAC framework we studied and here are the findings:
- If we move to EDAC framework, we need to have RAS DES as a pci_driver which will be binded based on vendor_id and device_id. Our observation is that on EP side system we are unable to bind two function driver (pci_driver), as pci_endpoint_test function driver or some other chip-to-chip function driver will already be bound. On the other hand, on RC side we observed that if we have portdrv enabled in Linux running on RC system, it gets bound to RC controller and then it does not allow EDAC pci_driver to bind. So basically we see a problem here, that we can't have two pci_driver binding to same PCI device
2: Another point is even though we use EDAC driver framework, we may not be able to use any of EDAC framework APIs as they are mostly suitable for memory controller devices sitting on PCI BUS. We will end up using debugfs entries just via a pci_driver placed inside EDAC framework.

Please let me know if my understanding is wrong.

> But first check with the perf driver author if they have any plans on adding the
> proposed functionality. If they do not have any plan or not working on it, then
> look into EDAC.
>
> - Mani
>

Since we already worked and posted patches [1], [2], we will continue to work on this and based on consent from community we will adopt to most suitable framework.
We see many subsystems like ethernet, usb, gpu, cxl having debugfs files that give information about the current status of the running system and as of now based on our findings, we still feel there is no harm in having debugfs entry based support in DesignWare controller driver itself.
Others any opinion please?

[1]: https://lkml.org/lkml/2021/5/18/1371
[2]: https://lkml.org/lkml/2023/11/30/574

> > Shradha
> >
> > > > > Shradha Todi (3):
> > > > > PCI: dwc: Add support for vendor specific capability search
> > > > > PCI: debugfs: Add support for RASDES framework in DWC
> > > > > PCI: dwc: Create debugfs files in DWC driver
> > > > >
> > > > > drivers/pci/controller/dwc/Kconfig | 8 +
> > > > > drivers/pci/controller/dwc/Makefile | 1 +
> > > > > .../controller/dwc/pcie-designware-debugfs.c | 476
> > > > ++++++++++++++++++
> > > > > .../controller/dwc/pcie-designware-debugfs.h | 0
> > > > > drivers/pci/controller/dwc/pcie-designware.c | 20 +
> > > > > drivers/pci/controller/dwc/pcie-designware.h | 18 +
> > > > > 6 files changed, 523 insertions(+) create mode 100644
> > > > > drivers/pci/controller/dwc/pcie-designware-debugfs.c
> > > > > create mode 100644
> > > > > drivers/pci/controller/dwc/pcie-designware-debugfs.h
> > > > >
> > > > > --
> > > > > 2.17.1
> > > > >
> > > >
> > > > --
> > > > மணிவண்ணன் சதாசிவம்
> >
> >
>
> --
> மணிவண்ணன் சதாசிவம்