Message ID | 20240123095006.979437-1-0x1207@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [net] net: stmmac: xgmac: fix handling of DPP safety error for DMA channels | expand |
On Tue, Jan 23, 2024 at 05:50:06PM +0800, Furong Xu wrote: > Commit 56e58d6c8a56 ("net: stmmac: Implement Safety Features in > XGMAC core") checks and reports safety errors, but leaves the > Data Path Parity Errors for each channel in DMA unhandled at all, lead to > a storm of interrupt. > Fix it by checking and clearing the DMA_DPP_Interrupt_Status register. > > Fixes: 56e58d6c8a56 ("net: stmmac: Implement Safety Features in XGMAC core") > Signed-off-by: Furong Xu <0x1207@gmail.com> > --- > drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h | 1 + > drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 6 ++++++ > 2 files changed, 7 insertions(+) > > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > index 207ff1799f2c..188e11683136 100644 > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > @@ -385,6 +385,7 @@ > #define XGMAC_DCEIE BIT(1) > #define XGMAC_TCEIE BIT(0) > #define XGMAC_DMA_ECC_INT_STATUS 0x0000306c > +#define XGMAC_DMA_DPP_INT_STATUS 0x00003074 > #define XGMAC_DMA_CH_CONTROL(x) (0x00003100 + (0x80 * (x))) > #define XGMAC_SPH BIT(24) > #define XGMAC_PBLx8 BIT(16) > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c > index eb48211d9b0e..874e85b499e2 100644 > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c > @@ -745,6 +745,12 @@ static void dwxgmac3_handle_mac_err(struct net_device *ndev, > > dwxgmac3_log_error(ndev, value, correctable, "MAC", > dwxgmac3_mac_errors, STAT_OFF(mac_errors), stats); > + > + value = readl(ioaddr + XGMAC_DMA_DPP_INT_STATUS); > + writel(value, ioaddr + XGMAC_DMA_DPP_INT_STATUS); > + > + if (value) > + netdev_err(ndev, "Found DMA_DPP error, status: 0x%x\n", value); 1. Why not to implement this in the same way as the rest of the safety errors handle code? (with the flags described by the dwxgmac3_error_desc-based table and the respective counters being incremented should the errors were detected) 2. I don't see this IRQ being enabled in the dwxgmac3_safety_feat_config() method. How come the respective event has turned to be triggered anyway? -Serge(y) > } > > static const struct dwxgmac3_error_desc dwxgmac3_mtl_errors[32]= { > -- > 2.34.1 > >
On Wed, 24 Jan 2024 17:36:10 +0300 Serge Semin <fancer.lancer@gmail.com> wrote: > On Tue, Jan 23, 2024 at 05:50:06PM +0800, Furong Xu wrote: > > Commit 56e58d6c8a56 ("net: stmmac: Implement Safety Features in > > XGMAC core") checks and reports safety errors, but leaves the > > Data Path Parity Errors for each channel in DMA unhandled at all, lead to > > a storm of interrupt. > > Fix it by checking and clearing the DMA_DPP_Interrupt_Status register. > > > > Fixes: 56e58d6c8a56 ("net: stmmac: Implement Safety Features in XGMAC core") > > Signed-off-by: Furong Xu <0x1207@gmail.com> > > --- > > drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h | 1 + > > drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 6 ++++++ > > 2 files changed, 7 insertions(+) > > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > > index 207ff1799f2c..188e11683136 100644 > > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > > @@ -385,6 +385,7 @@ > > #define XGMAC_DCEIE BIT(1) > > #define XGMAC_TCEIE BIT(0) > > #define XGMAC_DMA_ECC_INT_STATUS 0x0000306c > > +#define XGMAC_DMA_DPP_INT_STATUS 0x00003074 > > #define XGMAC_DMA_CH_CONTROL(x) (0x00003100 + (0x80 * (x))) > > #define XGMAC_SPH BIT(24) > > #define XGMAC_PBLx8 BIT(16) > > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c > > index eb48211d9b0e..874e85b499e2 100644 > > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c > > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c > > @@ -745,6 +745,12 @@ static void dwxgmac3_handle_mac_err(struct net_device *ndev, > > > > dwxgmac3_log_error(ndev, value, correctable, "MAC", > > dwxgmac3_mac_errors, STAT_OFF(mac_errors), stats); > > + > > + value = readl(ioaddr + XGMAC_DMA_DPP_INT_STATUS); > > + writel(value, ioaddr + XGMAC_DMA_DPP_INT_STATUS); > > + > > + if (value) > > + netdev_err(ndev, "Found DMA_DPP error, status: 0x%x\n", value); > > 1. Why not to implement this in the same way as the rest of the safety > errors handle code? (with the flags described by the > dwxgmac3_error_desc-based table and the respective counters being > incremented should the errors were detected) > XGMAC_DMA_DPP_INT_STATUS is just a bitmap of DMA RX and TX channels, bottom 16 bits for 16 DMA TX channels and top 16 bits for 16 DMA RX channels. No other descriptions. And the counters should be updated, I will send a new patch. > 2. I don't see this IRQ being enabled in the dwxgmac3_safety_feat_config() > method. How come the respective event has turned to be triggered > anyway? This error report is enabled by default, and cannot be disabled or marked(as Synopsys Databook says). What we can do is clearing it when it asserts. > > -Serge(y) > > > } > > > > static const struct dwxgmac3_error_desc dwxgmac3_mtl_errors[32]= { > > -- > > 2.34.1 > > > >
On Thu, Jan 25, 2024 at 10:56:20AM +0800, Furong Xu wrote: > On Wed, 24 Jan 2024 17:36:10 +0300 > Serge Semin <fancer.lancer@gmail.com> wrote: > > > On Tue, Jan 23, 2024 at 05:50:06PM +0800, Furong Xu wrote: > > > Commit 56e58d6c8a56 ("net: stmmac: Implement Safety Features in > > > XGMAC core") checks and reports safety errors, but leaves the > > > Data Path Parity Errors for each channel in DMA unhandled at all, lead to > > > a storm of interrupt. > > > Fix it by checking and clearing the DMA_DPP_Interrupt_Status register. > > > > > > Fixes: 56e58d6c8a56 ("net: stmmac: Implement Safety Features in XGMAC core") > > > Signed-off-by: Furong Xu <0x1207@gmail.com> > > > --- > > > drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h | 1 + > > > drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 6 ++++++ > > > 2 files changed, 7 insertions(+) > > > > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > > > index 207ff1799f2c..188e11683136 100644 > > > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > > > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > > > @@ -385,6 +385,7 @@ > > > #define XGMAC_DCEIE BIT(1) > > > #define XGMAC_TCEIE BIT(0) > > > #define XGMAC_DMA_ECC_INT_STATUS 0x0000306c > > > +#define XGMAC_DMA_DPP_INT_STATUS 0x00003074 > > > #define XGMAC_DMA_CH_CONTROL(x) (0x00003100 + (0x80 * (x))) > > > #define XGMAC_SPH BIT(24) > > > #define XGMAC_PBLx8 BIT(16) > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c > > > index eb48211d9b0e..874e85b499e2 100644 > > > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c > > > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c > > > @@ -745,6 +745,12 @@ static void dwxgmac3_handle_mac_err(struct net_device *ndev, > > > > > > dwxgmac3_log_error(ndev, value, correctable, "MAC", > > > dwxgmac3_mac_errors, STAT_OFF(mac_errors), stats); > > > + > > > + value = readl(ioaddr + XGMAC_DMA_DPP_INT_STATUS); > > > + writel(value, ioaddr + XGMAC_DMA_DPP_INT_STATUS); > > > + > > > + if (value) > > > + netdev_err(ndev, "Found DMA_DPP error, status: 0x%x\n", value); > > > > 1. Why not to implement this in the same way as the rest of the safety > > errors handle code? (with the flags described by the > > dwxgmac3_error_desc-based table and the respective counters being > > incremented should the errors were detected) > > > XGMAC_DMA_DPP_INT_STATUS is just a bitmap of DMA RX and TX channels, > bottom 16 bits for 16 DMA TX channels and top 16 bits for 16 DMA RX channels. > No other descriptions. > > And the counters should be updated, I will send a new patch. Ok. I'll wait for this patch v2 then with the counters fixed. Please also note that you are adding the _DMA_ DPP events handling support. Thus the more suitable place for this change would be dwmac5_handle_dma_err(). > > > 2. I don't see this IRQ being enabled in the dwxgmac3_safety_feat_config() > > method. How come the respective event has turned to be triggered > > anyway? > This error report is enabled by default, and cannot be disabled or marked(as Synopsys Databook says). > What we can do is clearing it when it asserts. This sounds so strange that I can barely believe in it. The DW QoS Eth MTL DPP feature can be enabled/disabled, but the DW XGMAC DMA DPP can't? This doesn't look logical. What's the point in having a never maskable IRQ for not that much crucial but optional feature? Moreover DPP adds some data flow overhead. If we are sure that no problem with the device data paths, then it seems redundant to have it always enabled. So I guess it must be switchable. Are you completely sure it isn't? -Serge(y) > > > > -Serge(y) > > > > > } > > > > > > static const struct dwxgmac3_error_desc dwxgmac3_mtl_errors[32]= { > > > -- > > > 2.34.1 > > > > > > >
On Thu, 25 Jan 2024 22:07:15 +0300 Serge Semin <fancer.lancer@gmail.com> wrote: > On Thu, Jan 25, 2024 at 10:56:20AM +0800, Furong Xu wrote: > > On Wed, 24 Jan 2024 17:36:10 +0300 > > Serge Semin <fancer.lancer@gmail.com> wrote: > > > > > On Tue, Jan 23, 2024 at 05:50:06PM +0800, Furong Xu wrote: > > > > Commit 56e58d6c8a56 ("net: stmmac: Implement Safety Features in > > > > XGMAC core") checks and reports safety errors, but leaves the > > > > Data Path Parity Errors for each channel in DMA unhandled at all, lead to > > > > a storm of interrupt. > > > > Fix it by checking and clearing the DMA_DPP_Interrupt_Status register. > > > > > > > > Fixes: 56e58d6c8a56 ("net: stmmac: Implement Safety Features in XGMAC core") > > > > Signed-off-by: Furong Xu <0x1207@gmail.com> > > > > --- > > > > drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h | 1 + > > > > drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 6 ++++++ > > > > 2 files changed, 7 insertions(+) > > > > > > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > > > > index 207ff1799f2c..188e11683136 100644 > > > > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > > > > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h > > > > @@ -385,6 +385,7 @@ > > > > #define XGMAC_DCEIE BIT(1) > > > > #define XGMAC_TCEIE BIT(0) > > > > #define XGMAC_DMA_ECC_INT_STATUS 0x0000306c > > > > +#define XGMAC_DMA_DPP_INT_STATUS 0x00003074 > > > > #define XGMAC_DMA_CH_CONTROL(x) (0x00003100 + (0x80 * (x))) > > > > #define XGMAC_SPH BIT(24) > > > > #define XGMAC_PBLx8 BIT(16) > > > > diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c > > > > index eb48211d9b0e..874e85b499e2 100644 > > > > --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c > > > > +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c > > > > @@ -745,6 +745,12 @@ static void dwxgmac3_handle_mac_err(struct net_device *ndev, > > > > > > > > dwxgmac3_log_error(ndev, value, correctable, "MAC", > > > > dwxgmac3_mac_errors, STAT_OFF(mac_errors), stats); > > > > + > > > > + value = readl(ioaddr + XGMAC_DMA_DPP_INT_STATUS); > > > > + writel(value, ioaddr + XGMAC_DMA_DPP_INT_STATUS); > > > > + > > > > + if (value) > > > > + netdev_err(ndev, "Found DMA_DPP error, status: 0x%x\n", value); > > > > > > 1. Why not to implement this in the same way as the rest of the safety > > > errors handle code? (with the flags described by the > > > dwxgmac3_error_desc-based table and the respective counters being > > > incremented should the errors were detected) > > > > > > XGMAC_DMA_DPP_INT_STATUS is just a bitmap of DMA RX and TX channels, > > bottom 16 bits for 16 DMA TX channels and top 16 bits for 16 DMA RX channels. > > No other descriptions. > > > > And the counters should be updated, I will send a new patch. > > Ok. I'll wait for this patch v2 then with the counters fixed. Please > also note that you are adding the _DMA_ DPP events handling support. > Thus the more suitable place for this change would be > dwmac5_handle_dma_err(). > > > > > > 2. I don't see this IRQ being enabled in the dwxgmac3_safety_feat_config() > > > method. How come the respective event has turned to be triggered > > > anyway? > > This error report is enabled by default, and cannot be disabled or marked(as Synopsys Databook says). > > What we can do is clearing it when it asserts. > > This sounds so strange that I can barely believe in it. The DW QoS Eth > MTL DPP feature can be enabled/disabled, but the DW XGMAC DMA DPP > can't? This doesn't look logical. What's the point in having a never > maskable IRQ for not that much crucial but optional feature? Moreover > DPP adds some data flow overhead. If we are sure that no problem with > the device data paths, then it seems redundant to have it always > enabled. So I guess it must be switchable. Are you completely sure it > isn't? Sorry for my bad explanation. Double checked DMA_DPP error report path on my device. XGMAC DMA_DPP is enable by DDPP bit of MTL_DPP_Control. DDPP bit is default to 0(Data path Parity Protection is enabled). When DDPP bit is set to 1(Data path Parity Protection is disabled), no DMA_DPP interrupt is reported. Once DMA_DPP interrupt is reported, there is no control bit to disable it or mask it. DMA_DPP error is unrecoverable type, and unrecoverable error interrupt cannot be disabled or masked, this is a design(as Synopsys Databook says). A explicit ops on MTL_DPP_Control to clear DDPP bit can add to dwxgmac3_safety_feat_config to make code looks better. > > -Serge(y) > > > > > > > -Serge(y) > > > > > > > } > > > > > > > > static const struct dwxgmac3_error_desc dwxgmac3_mtl_errors[32]= { > > > > -- > > > > 2.34.1 > > > > > > > > > >
diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h index 207ff1799f2c..188e11683136 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h @@ -385,6 +385,7 @@ #define XGMAC_DCEIE BIT(1) #define XGMAC_TCEIE BIT(0) #define XGMAC_DMA_ECC_INT_STATUS 0x0000306c +#define XGMAC_DMA_DPP_INT_STATUS 0x00003074 #define XGMAC_DMA_CH_CONTROL(x) (0x00003100 + (0x80 * (x))) #define XGMAC_SPH BIT(24) #define XGMAC_PBLx8 BIT(16) diff --git a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c index eb48211d9b0e..874e85b499e2 100644 --- a/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c +++ b/drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c @@ -745,6 +745,12 @@ static void dwxgmac3_handle_mac_err(struct net_device *ndev, dwxgmac3_log_error(ndev, value, correctable, "MAC", dwxgmac3_mac_errors, STAT_OFF(mac_errors), stats); + + value = readl(ioaddr + XGMAC_DMA_DPP_INT_STATUS); + writel(value, ioaddr + XGMAC_DMA_DPP_INT_STATUS); + + if (value) + netdev_err(ndev, "Found DMA_DPP error, status: 0x%x\n", value); } static const struct dwxgmac3_error_desc dwxgmac3_mtl_errors[32]= {
Commit 56e58d6c8a56 ("net: stmmac: Implement Safety Features in XGMAC core") checks and reports safety errors, but leaves the Data Path Parity Errors for each channel in DMA unhandled at all, lead to a storm of interrupt. Fix it by checking and clearing the DMA_DPP_Interrupt_Status register. Fixes: 56e58d6c8a56 ("net: stmmac: Implement Safety Features in XGMAC core") Signed-off-by: Furong Xu <0x1207@gmail.com> --- drivers/net/ethernet/stmicro/stmmac/dwxgmac2.h | 1 + drivers/net/ethernet/stmicro/stmmac/dwxgmac2_core.c | 6 ++++++ 2 files changed, 7 insertions(+)