Re: [PATCH net] net: stmmac: xgmac: fix handling of DPP safety error for DMA channels

From: Furong Xu
Date: Thu Jan 25 2024 - 22:31:31 EST


On Thu, 25 Jan 2024 22:07:15 +0300
Serge Semin <fancer.lancer@xxxxxxxxx> wrote:

> On Thu, Jan 25, 2024 at 10:56:20AM +0800, Furong Xu wrote:
> > On Wed, 24 Jan 2024 17:36:10 +0300
> > Serge Semin <fancer.lancer@xxxxxxxxx> wrote:
> >
> > > On Tue, Jan 23, 2024 at 05:50:06PM +0800, Furong Xu wrote:
> > > > Commit 56e58d6c8a56 ("net: stmmac: Implement Safety Features in
> > > > XGMAC core") checks and reports safety errors, but leaves the
> > > > Data Path Parity Errors for each DMA channel completely unhandled, leading to
> > > > a storm of interrupts.
> > > > Fix it by checking and clearing the DMA_DPP_Interrupt_Status register.
> > > >
> > > > Fixes: 56e58d6c8a56 ("net: stmmac: Implement Safety Features in XGMAC core")
> > > > Signed-off-by: Furong Xu <0x1207@xxxxxxxxx>
> > > > ---
> > > > drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h | 1 +
> > > > drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 6 ++++++
> > > > 2 files changed, 7 insertions(+)
> > > >
> > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
> > > > index 207ff1799f2c..188e11683136 100644
> > > > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
> > > > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h
> > > > @@ -385,6 +385,7 @@
> > > > #define XGMAC_DCEIE BIT(1)
> > > > #define XGMAC_TCEIE BIT(0)
> > > > #define XGMAC_DMA_ECC_INT_STATUS 0x0000306c
> > > > +#define XGMAC_DMA_DPP_INT_STATUS 0x00003074
> > > > #define XGMAC_DMA_CH_CONTROL(x) (0x00003100 + (0x80 * (x)))
> > > > #define XGMAC_SPH BIT(24)
> > > > #define XGMAC_PBLx8 BIT(16)
> > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
> > > > index eb48211d9b0e..874e85b499e2 100644
> > > > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
> > > > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c
> > > > @@ -745,6 +745,12 @@ static void dwxgmac3_handle_mac_err(struct net_device *ndev,
> > > >
> > > > dwxgmac3_log_error(ndev, value, correctable, "MAC",
> > > > dwxgmac3_mac_errors, STAT_OFF(mac_errors), stats);
> > > > +
> > > > + value = readl(ioaddr + XGMAC_DMA_DPP_INT_STATUS);
> > > > + writel(value, ioaddr + XGMAC_DMA_DPP_INT_STATUS);
> > > > +
> > > > + if (value)
> > > > + netdev_err(ndev, "Found DMA_DPP error, status: 0x%x\n", value);
> > >
> > > 1. Why not to implement this in the same way as the rest of the safety
> > > errors handle code? (with the flags described by the
> > > dwxgmac3_error_desc-based table and the respective counters being
> > > incremented should the errors were detected)
> > >
>
> > XGMAC_DMA_DPP_INT_STATUS is just a bitmap of DMA RX and TX channels,
> > bottom 16 bits for 16 DMA TX channels and top 16 bits for 16 DMA RX channels.
> > No other descriptions.
> >
> > And the counters should be updated, I will send a new patch.
>
> Ok. I'll wait for this patch v2 then with the counters fixed. Please
> also note that you are adding the _DMA_ DPP events handling support.
> Thus the more suitable place for this change would be
> dwmac5_handle_dma_err().
>
> >
> > > 2. I don't see this IRQ being enabled in the dwxgmac3_safety_feat_config()
> > > method. How come the respective event has turned to be triggered
> > > anyway?
> > This error report is enabled by default, and cannot be disabled or masked (as the Synopsys Databook says).
> > What we can do is clear it when it asserts.
>
> This sounds so strange that I can barely believe in it. The DW QoS Eth
> MTL DPP feature can be enabled/disabled, but the DW XGMAC DMA DPP
> can't? This doesn't look logical. What's the point in having a never
> maskable IRQ for not that much crucial but optional feature? Moreover
> DPP adds some data flow overhead. If we are sure that no problem with
> the device data paths, then it seems redundant to have it always
> enabled. So I guess it must be switchable. Are you completely sure it
> isn't?

Sorry for my unclear explanation.

I double-checked the DMA_DPP error reporting path on my device.

XGMAC DMA_DPP is controlled by the DDPP bit of MTL_DPP_Control.
The DDPP bit defaults to 0 (Data path Parity Protection is enabled).
When the DDPP bit is set to 1 (Data path Parity Protection is disabled), no DMA_DPP interrupt is reported.

Once a DMA_DPP interrupt is reported, there is no control bit to disable or mask it.
A DMA_DPP error is of the unrecoverable type, and unrecoverable error interrupts cannot be
disabled or masked; this is by design (as the Synopsys Databook says).
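
To illustrate the read-and-clear handling, here is a rough user-space sketch
(not driver code). It assumes the layout mentioned earlier in this thread
(low 16 bits for the TX channels, high 16 bits for the RX channels) and
write-1-to-clear behaviour, as implied by the patch writing the read value
back. struct fake_dpp_reg, fake_readl(), fake_writel() and struct dpp_counts
are hypothetical stand-ins, not existing stmmac names.

```c
#include <stdint.h>

#define DPP_CHANNELS 16u /* 16 TX + 16 RX channels, per the bitmap layout */

/* Hypothetical stand-in for the memory-mapped status register. */
struct fake_dpp_reg {
	uint32_t status;
};

/* Hypothetical counters; the real driver would use its own stats. */
struct dpp_counts {
	unsigned int tx;
	unsigned int rx;
};

static uint32_t fake_readl(const struct fake_dpp_reg *reg)
{
	return reg->status;
}

/* Assumption: writing 1 to a bit clears it (write-1-to-clear). */
static void fake_writel(struct fake_dpp_reg *reg, uint32_t value)
{
	reg->status &= ~value;
}

/* Read the status, clear what was read, and count errors per direction. */
static uint32_t handle_dma_dpp(struct fake_dpp_reg *reg, struct dpp_counts *cnt)
{
	uint32_t value = fake_readl(reg);
	unsigned int ch;

	fake_writel(reg, value);

	for (ch = 0; ch < DPP_CHANNELS; ch++) {
		if (value & (1u << ch))
			cnt->tx++; /* low half: TX channel ch */
		if (value & (1u << (ch + DPP_CHANNELS)))
			cnt->rx++; /* high half: RX channel ch */
	}

	return value;
}
```

Clearing exactly the bits that were read keeps the interrupt from being
reasserted for the same event, which is what stops the interrupt storm.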

An explicit write to MTL_DPP_Control that clears the DDPP bit can be added to
dwxgmac3_safety_feat_config() to make the code clearer.
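
A minimal sketch of the DDPP semantics, again as plain C rather than driver
code. The bit position behind DDPP_BIT_PLACEHOLDER is a placeholder and not
taken from the databook, and both helper names are hypothetical. The only
thing it relies on is the behaviour described in this mail: DDPP = 0 means
protection is enabled, DDPP = 1 means it is disabled.

```c
#include <stdint.h>

/* Placeholder position for the DDPP bit; check the databook for the real one. */
#define DDPP_BIT_PLACEHOLDER (1u << 0)

/* Return the control value with DDPP cleared, i.e. protection enabled. */
static uint32_t dpp_ctrl_enable_protection(uint32_t ctrl)
{
	return ctrl & ~DDPP_BIT_PLACEHOLDER;
}

/* DDPP is a "disable" bit, so protection is on when the bit is 0. */
static int dpp_ctrl_protection_enabled(uint32_t ctrl)
{
	return !(ctrl & DDPP_BIT_PLACEHOLDER);
}
```

In the driver this would be a read-modify-write of MTL_DPP_Control, so the
other bits of the register are left untouched.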

>
> -Serge(y)
>
> > >
> > > -Serge(y)
> > >
> > > > }
> > > >
> > > > static const struct dwxgmac3_error_desc dwxgmac3_mtl_errors[32]= {
> > > > --
> > > > 2.34.1
> > > >
> > > >
> >