Message ID | 20230324140404.95745-1-nbd@nbd.name (mailing list archive) |
---|---|
State | Accepted |
Commit | 07b3af42d8d528374d4f42d688bae86eeb30831a |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links | expand |
> Gesendet: Freitag, 24. März 2023 um 15:04 Uhr > Von: "Felix Fietkau" <nbd@nbd.name> > An: netdev@vger.kernel.org > Cc: "Frank Wunderlich" <frank-w@public-files.de>, "Daniel Golle" <daniel@makrotopia.org> > Betreff: [PATCH net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links > > Using the QDMA tx scheduler to throttle tx to line speed works fine for > switch ports, but apparently caused a regression on non-switch ports. > > Based on a number of tests, it seems that this throttling can be safely > dropped without re-introducing the issues on switch ports that the > tx scheduling changes resolved. > > Link: https://lore.kernel.org/netdev/trinity-92c3826f-c2c8-40af-8339-bc6d0d3ffea4-1678213958520@3c-app-gmx-bs16/ > Fixes: f63959c7eec3 ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues") > Reported-by: Frank Wunderlich <frank-w@public-files.de> > Reported-by: Daniel Golle <daniel@makrotopia.org> > Tested-by: Daniel Golle <daniel@makrotopia.org> > Signed-off-by: Felix Fietkau <nbd@nbd.name> > --- > drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c > index a94aa08515af..282f9435d5ff 100644 > --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c > +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c > @@ -763,8 +763,6 @@ static void mtk_mac_link_up(struct phylink_config *config, > break; > } > > - mtk_set_queue_speed(mac->hw, mac->id, speed); > - > /* Configure duplex */ > if (duplex == DUPLEX_FULL) > mcr |= MAC_MCR_FORCE_DPX; thx for the fix, as daniel already checked it on mt7986/bpi-r3 i tested bpi-r2/mt7623 but unfortunately it does not fix issue on bpi-r2 where the gmac0/mt7530 part is affected. maybe it needs a special handling like you do for mt7621? maybe it is because the trgmii mode used on this path? regards Frank
On 25.03.23 10:28, Frank Wunderlich wrote: >> Gesendet: Freitag, 24. März 2023 um 15:04 Uhr >> Von: "Felix Fietkau" <nbd@nbd.name> >> An: netdev@vger.kernel.org >> Cc: "Frank Wunderlich" <frank-w@public-files.de>, "Daniel Golle" <daniel@makrotopia.org> >> Betreff: [PATCH net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links >> >> Using the QDMA tx scheduler to throttle tx to line speed works fine for >> switch ports, but apparently caused a regression on non-switch ports. >> >> Based on a number of tests, it seems that this throttling can be safely >> dropped without re-introducing the issues on switch ports that the >> tx scheduling changes resolved. >> >> Link: https://lore.kernel.org/netdev/trinity-92c3826f-c2c8-40af-8339-bc6d0d3ffea4-1678213958520@3c-app-gmx-bs16/ >> Fixes: f63959c7eec3 ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues") >> Reported-by: Frank Wunderlich <frank-w@public-files.de> >> Reported-by: Daniel Golle <daniel@makrotopia.org> >> Tested-by: Daniel Golle <daniel@makrotopia.org> >> Signed-off-by: Felix Fietkau <nbd@nbd.name> >> --- >> drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 -- >> 1 file changed, 2 deletions(-) >> >> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c >> index a94aa08515af..282f9435d5ff 100644 >> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c >> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c >> @@ -763,8 +763,6 @@ static void mtk_mac_link_up(struct phylink_config *config, >> break; >> } >> >> - mtk_set_queue_speed(mac->hw, mac->id, speed); >> - >> /* Configure duplex */ >> if (duplex == DUPLEX_FULL) >> mcr |= MAC_MCR_FORCE_DPX; > > thx for the fix, as daniel already checked it on mt7986/bpi-r3 i tested bpi-r2/mt7623 > > but unfortunately it does not fix issue on bpi-r2 where the gmac0/mt7530 part is affected. > > maybe it needs a special handling like you do for mt7621? maybe it is because the trgmii mode used on this path? Could you please test if making it use the MT7621 codepath brings back performance? I don't have any MT7623 hardware for testing right now. - Felix
> Gesendet: Sonntag, 26. März 2023 um 17:56 Uhr > Von: "Felix Fietkau" <nbd@nbd.name> > An: "Frank Wunderlich" <frank-w@public-files.de> > Cc: netdev@vger.kernel.org, "Daniel Golle" <daniel@makrotopia.org> > Betreff: Re: Aw: [PATCH net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links > > On 25.03.23 10:28, Frank Wunderlich wrote: > >> Gesendet: Freitag, 24. März 2023 um 15:04 Uhr > >> Von: "Felix Fietkau" <nbd@nbd.name> > >> An: netdev@vger.kernel.org > >> Cc: "Frank Wunderlich" <frank-w@public-files.de>, "Daniel Golle" <daniel@makrotopia.org> > >> Betreff: [PATCH net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links > >> > >> Using the QDMA tx scheduler to throttle tx to line speed works fine for > >> switch ports, but apparently caused a regression on non-switch ports. > >> > >> Based on a number of tests, it seems that this throttling can be safely > >> dropped without re-introducing the issues on switch ports that the > >> tx scheduling changes resolved. > >> > >> Link: https://lore.kernel.org/netdev/trinity-92c3826f-c2c8-40af-8339-bc6d0d3ffea4-1678213958520@3c-app-gmx-bs16/ > >> Fixes: f63959c7eec3 ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues") > >> Reported-by: Frank Wunderlich <frank-w@public-files.de> > >> Reported-by: Daniel Golle <daniel@makrotopia.org> > >> Tested-by: Daniel Golle <daniel@makrotopia.org> > >> Signed-off-by: Felix Fietkau <nbd@nbd.name> > >> --- > >> drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 -- > >> 1 file changed, 2 deletions(-) > >> > >> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c > >> index a94aa08515af..282f9435d5ff 100644 > >> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c > >> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c > >> @@ -763,8 +763,6 @@ static void mtk_mac_link_up(struct phylink_config *config, > >> break; > >> } > >> > >> - mtk_set_queue_speed(mac->hw, mac->id, speed); > >> - > >> /* Configure duplex */ > >> if (duplex == DUPLEX_FULL) > >> mcr |= MAC_MCR_FORCE_DPX; > > > > thx for the fix, as daniel already checked it on mt7986/bpi-r3 i tested bpi-r2/mt7623 > > > > but unfortunately it does not fix issue on bpi-r2 where the gmac0/mt7530 part is affected. > > > > maybe it needs a special handling like you do for mt7621? maybe it is because the trgmii mode used on this path? > Could you please test if making it use the MT7621 codepath brings back > performance? I don't have any MT7623 hardware for testing right now. Hi, this seems to make the CPU stall (after kernel is loaded completely when userspace begins to start): - if (IS_ENABLED(CONFIG_SOC_MT7621)) { + if (IS_ENABLED(CONFIG_SOC_MT7621) || IS_ENABLED(CONFIG_SOC_MT7623)) { [ 27.252672] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 27.258618] rcu: 2-...0: (0 ticks this GP) idle=54c4/1/0x40000000 softir8 [ 27.266973] rcu: (detected by 1, t=2102 jiffies, g=-891, q=7 ncpus=4) [ 27.273499] Sending NMI from CPU 1 to CPUs 2: [USBD] USB PRB0 LineState: 0 wonder why this happens...i expected some kind of tranmit queue erros with trace... full log here https://pastebin.com/de4dZDt4 but i've found no error there (sorry for cutting on the right side...my terminal window was to small) regards Frank
On 26.03.23 19:10, Frank Wunderlich wrote: >> Gesendet: Sonntag, 26. März 2023 um 17:56 Uhr >> Von: "Felix Fietkau" <nbd@nbd.name> >> An: "Frank Wunderlich" <frank-w@public-files.de> >> Cc: netdev@vger.kernel.org, "Daniel Golle" <daniel@makrotopia.org> >> Betreff: Re: Aw: [PATCH net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links >> >> On 25.03.23 10:28, Frank Wunderlich wrote: >> >> Gesendet: Freitag, 24. März 2023 um 15:04 Uhr >> >> Von: "Felix Fietkau" <nbd@nbd.name> >> >> An: netdev@vger.kernel.org >> >> Cc: "Frank Wunderlich" <frank-w@public-files.de>, "Daniel Golle" <daniel@makrotopia.org> >> >> Betreff: [PATCH net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links >> >> >> >> Using the QDMA tx scheduler to throttle tx to line speed works fine for >> >> switch ports, but apparently caused a regression on non-switch ports. >> >> >> >> Based on a number of tests, it seems that this throttling can be safely >> >> dropped without re-introducing the issues on switch ports that the >> >> tx scheduling changes resolved. >> >> >> >> Link: https://lore.kernel.org/netdev/trinity-92c3826f-c2c8-40af-8339-bc6d0d3ffea4-1678213958520@3c-app-gmx-bs16/ >> >> Fixes: f63959c7eec3 ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues") >> >> Reported-by: Frank Wunderlich <frank-w@public-files.de> >> >> Reported-by: Daniel Golle <daniel@makrotopia.org> >> >> Tested-by: Daniel Golle <daniel@makrotopia.org> >> >> Signed-off-by: Felix Fietkau <nbd@nbd.name> >> >> --- >> >> drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 -- >> >> 1 file changed, 2 deletions(-) >> >> >> >> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c >> >> index a94aa08515af..282f9435d5ff 100644 >> >> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c >> >> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c >> >> @@ -763,8 +763,6 @@ static void mtk_mac_link_up(struct phylink_config *config, >> >> break; >> >> } >> >> >> >> - mtk_set_queue_speed(mac->hw, mac->id, speed); >> >> - >> >> /* Configure duplex */ >> >> if (duplex == DUPLEX_FULL) >> >> mcr |= MAC_MCR_FORCE_DPX; >> > >> > thx for the fix, as daniel already checked it on mt7986/bpi-r3 i tested bpi-r2/mt7623 >> > >> > but unfortunately it does not fix issue on bpi-r2 where the gmac0/mt7530 part is affected. >> > >> > maybe it needs a special handling like you do for mt7621? maybe it is because the trgmii mode used on this path? >> Could you please test if making it use the MT7621 codepath brings back >> performance? I don't have any MT7623 hardware for testing right now. > > Hi, > > this seems to make the CPU stall (after kernel is loaded completely when userspace begins to start): > > - if (IS_ENABLED(CONFIG_SOC_MT7621)) { > + if (IS_ENABLED(CONFIG_SOC_MT7621) || IS_ENABLED(CONFIG_SOC_MT7623)) { > > [ 27.252672] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: > [ 27.258618] rcu: 2-...0: (0 ticks this GP) idle=54c4/1/0x40000000 softir8 > [ 27.266973] rcu: (detected by 1, t=2102 jiffies, g=-891, q=7 ncpus=4) > [ 27.273499] Sending NMI from CPU 1 to CPUs 2: > > [USBD] USB PRB0 LineState: 0 > > wonder why this happens...i expected some kind of tranmit queue erros with trace... > > full log here > https://pastebin.com/de4dZDt4 > > but i've found no error there (sorry for cutting on the right side...my terminal window was to small) The change you made has no effect, because CONFIG_SOC_MT7623 does not exist, so there must be something else that broke your system. Try CONFIG_MACH_MT7623 instead, once you've resolved the system hang. - Felix
> Gesendet: Sonntag, 26. März 2023 um 19:27 Uhr > Von: "Felix Fietkau" <nbd@nbd.name> > An: "Frank Wunderlich" <frank-w@public-files.de> > Cc: netdev@vger.kernel.org, "Daniel Golle" <daniel@makrotopia.org> > Betreff: Re: Aw: Re: [PATCH net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links > > On 26.03.23 19:10, Frank Wunderlich wrote: > >> Gesendet: Sonntag, 26. März 2023 um 17:56 Uhr > >> Von: "Felix Fietkau" <nbd@nbd.name> > >> An: "Frank Wunderlich" <frank-w@public-files.de> > >> Cc: netdev@vger.kernel.org, "Daniel Golle" <daniel@makrotopia.org> > >> Betreff: Re: Aw: [PATCH net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links > >> > >> On 25.03.23 10:28, Frank Wunderlich wrote: > >> >> Gesendet: Freitag, 24. März 2023 um 15:04 Uhr > >> >> Von: "Felix Fietkau" <nbd@nbd.name> > >> >> An: netdev@vger.kernel.org > >> >> Cc: "Frank Wunderlich" <frank-w@public-files.de>, "Daniel Golle" <daniel@makrotopia.org> > >> >> Betreff: [PATCH net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links > >> >> > >> >> Using the QDMA tx scheduler to throttle tx to line speed works fine for > >> >> switch ports, but apparently caused a regression on non-switch ports. > >> >> > >> >> Based on a number of tests, it seems that this throttling can be safely > >> >> dropped without re-introducing the issues on switch ports that the > >> >> tx scheduling changes resolved. > >> >> > >> >> Link: https://lore.kernel.org/netdev/trinity-92c3826f-c2c8-40af-8339-bc6d0d3ffea4-1678213958520@3c-app-gmx-bs16/ > >> >> Fixes: f63959c7eec3 ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues") > >> >> Reported-by: Frank Wunderlich <frank-w@public-files.de> > >> >> Reported-by: Daniel Golle <daniel@makrotopia.org> > >> >> Tested-by: Daniel Golle <daniel@makrotopia.org> > >> >> Signed-off-by: Felix Fietkau <nbd@nbd.name> > >> >> --- > >> >> drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 -- > >> >> 1 file changed, 2 deletions(-) > >> >> > >> >> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c > >> >> index a94aa08515af..282f9435d5ff 100644 > >> >> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c > >> >> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c > >> >> @@ -763,8 +763,6 @@ static void mtk_mac_link_up(struct phylink_config *config, > >> >> break; > >> >> } > >> >> > >> >> - mtk_set_queue_speed(mac->hw, mac->id, speed); > >> >> - > >> >> /* Configure duplex */ > >> >> if (duplex == DUPLEX_FULL) > >> >> mcr |= MAC_MCR_FORCE_DPX; > >> > > >> > thx for the fix, as daniel already checked it on mt7986/bpi-r3 i tested bpi-r2/mt7623 > >> > > >> > but unfortunately it does not fix issue on bpi-r2 where the gmac0/mt7530 part is affected. > >> > > >> > maybe it needs a special handling like you do for mt7621? maybe it is because the trgmii mode used on this path? > >> Could you please test if making it use the MT7621 codepath brings back > >> performance? I don't have any MT7623 hardware for testing right now. > > > > Hi, > > > > this seems to make the CPU stall (after kernel is loaded completely when userspace begins to start): > > > > - if (IS_ENABLED(CONFIG_SOC_MT7621)) { > > + if (IS_ENABLED(CONFIG_SOC_MT7621) || IS_ENABLED(CONFIG_SOC_MT7623)) { > > > > [ 27.252672] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: > > [ 27.258618] rcu: 2-...0: (0 ticks this GP) idle=54c4/1/0x40000000 softir8 > > [ 27.266973] rcu: (detected by 1, t=2102 jiffies, g=-891, q=7 ncpus=4) > > [ 27.273499] Sending NMI from CPU 1 to CPUs 2: > > > > [USBD] USB PRB0 LineState: 0 > > > > wonder why this happens...i expected some kind of tranmit queue erros with trace... > > > > full log here > > https://pastebin.com/de4dZDt4 > > > > but i've found no error there (sorry for cutting on the right side...my terminal window was to small) > The change you made has no effect, because CONFIG_SOC_MT7623 does not > exist, so there must be something else that broke your system. > Try CONFIG_MACH_MT7623 instead, once you've resolved the system hang. it was exactly this change above...dropping it fixed it do not know why i did no compile-error, looked into implementation of IS_ENABLED which is passed through different defines till include/linux/kconfig.h:42:#define ___is_defined(val) ____is_defined(__ARG_PLACEHOLDER_##val) include/linux/kconfig.h:43:#define ____is_defined(arg1_or_junk) __take_second_arg(arg1_or_junk 1, 0) include/linux/kconfig.h:14:#define __take_second_arg(__ignored, val, ...) val i do not expect that there is anything wrong, just wonder why this stalls... used the CONFIG_MACH_MT7623 (which is set in my config) boots up fine, but did not fix the 622Mbit-tx-issue and i'm not sure i have tested it before...all ports of mt7531 are affected, not only wan (i remembered you asked for this) regards Frank
On 26.03.23 19:49, Frank Wunderlich wrote: >> Gesendet: Sonntag, 26. März 2023 um 19:27 Uhr >> Von: "Felix Fietkau" <nbd@nbd.name> >> An: "Frank Wunderlich" <frank-w@public-files.de> >> Cc: netdev@vger.kernel.org, "Daniel Golle" <daniel@makrotopia.org> >> Betreff: Re: Aw: Re: [PATCH net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links >> >> On 26.03.23 19:10, Frank Wunderlich wrote: >> >> Gesendet: Sonntag, 26. März 2023 um 17:56 Uhr >> >> Von: "Felix Fietkau" <nbd@nbd.name> >> >> An: "Frank Wunderlich" <frank-w@public-files.de> >> >> Cc: netdev@vger.kernel.org, "Daniel Golle" <daniel@makrotopia.org> >> >> Betreff: Re: Aw: [PATCH net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links >> >> >> >> On 25.03.23 10:28, Frank Wunderlich wrote: >> >> >> Gesendet: Freitag, 24. März 2023 um 15:04 Uhr >> >> >> Von: "Felix Fietkau" <nbd@nbd.name> >> >> >> An: netdev@vger.kernel.org >> >> >> Cc: "Frank Wunderlich" <frank-w@public-files.de>, "Daniel Golle" <daniel@makrotopia.org> >> >> >> Betreff: [PATCH net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links >> >> >> >> >> >> Using the QDMA tx scheduler to throttle tx to line speed works fine for >> >> >> switch ports, but apparently caused a regression on non-switch ports. >> >> >> >> >> >> Based on a number of tests, it seems that this throttling can be safely >> >> >> dropped without re-introducing the issues on switch ports that the >> >> >> tx scheduling changes resolved. >> >> >> >> >> >> Link: https://lore.kernel.org/netdev/trinity-92c3826f-c2c8-40af-8339-bc6d0d3ffea4-1678213958520@3c-app-gmx-bs16/ >> >> >> Fixes: f63959c7eec3 ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues") >> >> >> Reported-by: Frank Wunderlich <frank-w@public-files.de> >> >> >> Reported-by: Daniel Golle <daniel@makrotopia.org> >> >> >> Tested-by: Daniel Golle <daniel@makrotopia.org> >> >> >> Signed-off-by: Felix Fietkau <nbd@nbd.name> >> >> >> --- >> >> >> drivers/net/ethernet/mediatek/mtk_eth_soc.c | 2 -- >> >> >> 1 file changed, 2 deletions(-) >> >> >> >> >> >> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c >> >> >> index a94aa08515af..282f9435d5ff 100644 >> >> >> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c >> >> >> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c >> >> >> @@ -763,8 +763,6 @@ static void mtk_mac_link_up(struct phylink_config *config, >> >> >> break; >> >> >> } >> >> >> >> >> >> - mtk_set_queue_speed(mac->hw, mac->id, speed); >> >> >> - >> >> >> /* Configure duplex */ >> >> >> if (duplex == DUPLEX_FULL) >> >> >> mcr |= MAC_MCR_FORCE_DPX; >> >> > >> >> > thx for the fix, as daniel already checked it on mt7986/bpi-r3 i tested bpi-r2/mt7623 >> >> > >> >> > but unfortunately it does not fix issue on bpi-r2 where the gmac0/mt7530 part is affected. >> >> > >> >> > maybe it needs a special handling like you do for mt7621? maybe it is because the trgmii mode used on this path? >> >> Could you please test if making it use the MT7621 codepath brings back >> >> performance? I don't have any MT7623 hardware for testing right now. >> > >> > Hi, >> > >> > this seems to make the CPU stall (after kernel is loaded completely when userspace begins to start): >> > >> > - if (IS_ENABLED(CONFIG_SOC_MT7621)) { >> > + if (IS_ENABLED(CONFIG_SOC_MT7621) || IS_ENABLED(CONFIG_SOC_MT7623)) { >> > >> > [ 27.252672] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: >> > [ 27.258618] rcu: 2-...0: (0 ticks this GP) idle=54c4/1/0x40000000 softir8 >> > [ 27.266973] rcu: (detected by 1, t=2102 jiffies, g=-891, q=7 ncpus=4) >> > [ 27.273499] Sending NMI from CPU 1 to CPUs 2: >> > >> > [USBD] USB PRB0 LineState: 0 >> > >> > wonder why this happens...i expected some kind of tranmit queue erros with trace... >> > >> > full log here >> > https://pastebin.com/de4dZDt4 >> > >> > but i've found no error there (sorry for cutting on the right side...my terminal window was to small) >> The change you made has no effect, because CONFIG_SOC_MT7623 does not >> exist, so there must be something else that broke your system. >> Try CONFIG_MACH_MT7623 instead, once you've resolved the system hang. > > it was exactly this change above...dropping it fixed it > > do not know why i did no compile-error, looked into implementation of IS_ENABLED which is passed through different defines till > > include/linux/kconfig.h:42:#define ___is_defined(val) ____is_defined(__ARG_PLACEHOLDER_##val) > include/linux/kconfig.h:43:#define ____is_defined(arg1_or_junk) __take_second_arg(arg1_or_junk 1, 0) > include/linux/kconfig.h:14:#define __take_second_arg(__ignored, val, ...) val > > i do not expect that there is anything wrong, just wonder why this stalls... > > used the CONFIG_MACH_MT7623 (which is set in my config) boots up fine, but did not fix the 622Mbit-tx-issue > > and i'm not sure i have tested it before...all ports of mt7531 are affected, not only wan (i remembered you asked for this) Does the MAC that's connected to the switch use flow control? Can you test if changing that makes a difference? - Felix
> Gesendet: Sonntag, 26. März 2023 um 22:09 Uhr > Von: "Felix Fietkau" <nbd@nbd.name> > On 26.03.23 19:49, Frank Wunderlich wrote: > >> Gesendet: Sonntag, 26. März 2023 um 19:27 Uhr > >> Von: "Felix Fietkau" <nbd@nbd.name> > >> On 26.03.23 19:10, Frank Wunderlich wrote: > >> >> Gesendet: Sonntag, 26. März 2023 um 17:56 Uhr > >> >> Von: "Felix Fietkau" <nbd@nbd.name> > >> >> On 25.03.23 10:28, Frank Wunderlich wrote: > >> >> >> Gesendet: Freitag, 24. März 2023 um 15:04 Uhr > >> >> >> Von: "Felix Fietkau" <nbd@nbd.name> > >> >> > thx for the fix, as daniel already checked it on mt7986/bpi-r3 i tested bpi-r2/mt7623 > >> >> > > >> >> > but unfortunately it does not fix issue on bpi-r2 where the gmac0/mt7530 part is affected. > >> >> > > >> >> > maybe it needs a special handling like you do for mt7621? maybe it is because the trgmii mode used on this path? > >> >> Could you please test if making it use the MT7621 codepath brings back > >> >> performance? I don't have any MT7623 hardware for testing right now. > > used the CONFIG_MACH_MT7623 (which is set in my config) boots up fine, but did not fix the 622Mbit-tx-issue > > > > and i'm not sure i have tested it before...all ports of mt7531 are affected, not only wan (i remembered you asked for this) > Does the MAC that's connected to the switch use flow control? Can you > test if changing that makes a difference? it does use flow control/pause on mac and switch-port, disabled it, but it does not change anything (still ~620Mbit on tx) +++ b/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts @@ -182,7 +182,7 @@ gmac0: mac@0 { fixed-link { speed = <1000>; full-duplex; - pause; + //pause; }; }; @@ -235,7 +235,7 @@ port@6 { fixed-link { speed = <1000>; full-duplex; - pause; + //pause; }; }; }; regards Frank
On Fri, 24 Mar 2023 15:04:04 +0100 Felix Fietkau wrote: > Using the QDMA tx scheduler to throttle tx to line speed works fine for > switch ports, but apparently caused a regression on non-switch ports. > > Based on a number of tests, it seems that this throttling can be safely > dropped without re-introducing the issues on switch ports that the > tx scheduling changes resolved. > > Link: https://lore.kernel.org/netdev/trinity-92c3826f-c2c8-40af-8339-bc6d0d3ffea4-1678213958520@3c-app-gmx-bs16/ > Fixes: f63959c7eec3 ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues") > Reported-by: Frank Wunderlich <frank-w@public-files.de> > Reported-by: Daniel Golle <daniel@makrotopia.org> > Tested-by: Daniel Golle <daniel@makrotopia.org> > Signed-off-by: Felix Fietkau <nbd@nbd.name> My reading of the discussion was that this patch is good, even if it doesn't fix all the models, but it's marked as Changes Requested in PW. Could you confirm that we should apply this one as is, just to be sure?
On 29.03.23 04:01, Jakub Kicinski wrote: > On Fri, 24 Mar 2023 15:04:04 +0100 Felix Fietkau wrote: >> Using the QDMA tx scheduler to throttle tx to line speed works fine for >> switch ports, but apparently caused a regression on non-switch ports. >> >> Based on a number of tests, it seems that this throttling can be safely >> dropped without re-introducing the issues on switch ports that the >> tx scheduling changes resolved. >> >> Link: https://lore.kernel.org/netdev/trinity-92c3826f-c2c8-40af-8339-bc6d0d3ffea4-1678213958520@3c-app-gmx-bs16/ >> Fixes: f63959c7eec3 ("net: ethernet: mtk_eth_soc: implement multi-queue support for per-port queues") >> Reported-by: Frank Wunderlich <frank-w@public-files.de> >> Reported-by: Daniel Golle <daniel@makrotopia.org> >> Tested-by: Daniel Golle <daniel@makrotopia.org> >> Signed-off-by: Felix Fietkau <nbd@nbd.name> > > My reading of the discussion was that this patch is good, even if it > doesn't fix all the models, but it's marked as Changes Requested in PW. > Could you confirm that we should apply this one as is, just to be sure? Yes, please apply this one. I will send another patch to take care of MT7623 soon. - Felix
Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski <kuba@kernel.org>: On Fri, 24 Mar 2023 15:04:04 +0100 you wrote: > Using the QDMA tx scheduler to throttle tx to line speed works fine for > switch ports, but apparently caused a regression on non-switch ports. > > Based on a number of tests, it seems that this throttling can be safely > dropped without re-introducing the issues on switch ports that the > tx scheduling changes resolved. > > [...] Here is the summary with links: - [net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links https://git.kernel.org/netdev/net/c/07b3af42d8d5 You are awesome, thank you!
On 27.03.23 19:28, Frank Wunderlich wrote: > >> Gesendet: Sonntag, 26. März 2023 um 22:09 Uhr >> Von: "Felix Fietkau" <nbd@nbd.name> >> On 26.03.23 19:49, Frank Wunderlich wrote: >> >> Gesendet: Sonntag, 26. März 2023 um 19:27 Uhr >> >> Von: "Felix Fietkau" <nbd@nbd.name> > >> >> On 26.03.23 19:10, Frank Wunderlich wrote: >> >> >> Gesendet: Sonntag, 26. März 2023 um 17:56 Uhr >> >> >> Von: "Felix Fietkau" <nbd@nbd.name> > >> >> >> On 25.03.23 10:28, Frank Wunderlich wrote: >> >> >> >> Gesendet: Freitag, 24. März 2023 um 15:04 Uhr >> >> >> >> Von: "Felix Fietkau" <nbd@nbd.name> > >> >> >> > thx for the fix, as daniel already checked it on mt7986/bpi-r3 i tested bpi-r2/mt7623 >> >> >> > >> >> >> > but unfortunately it does not fix issue on bpi-r2 where the gmac0/mt7530 part is affected. >> >> >> > >> >> >> > maybe it needs a special handling like you do for mt7621? maybe it is because the trgmii mode used on this path? >> >> >> Could you please test if making it use the MT7621 codepath brings back >> >> >> performance? I don't have any MT7623 hardware for testing right now. > >> > used the CONFIG_MACH_MT7623 (which is set in my config) boots up fine, but did not fix the 622Mbit-tx-issue >> > >> > and i'm not sure i have tested it before...all ports of mt7531 are affected, not only wan (i remembered you asked for this) >> Does the MAC that's connected to the switch use flow control? Can you >> test if changing that makes a difference? > > it does use flow control/pause on mac and switch-port, disabled it, but it does not change anything (still ~620Mbit on tx) > > +++ b/arch/arm/boot/dts/mt7623n-bananapi-bpi-r2.dts > @@ -182,7 +182,7 @@ gmac0: mac@0 { > fixed-link { > speed = <1000>; > full-duplex; > - pause; > + //pause; > }; > }; > > @@ -235,7 +235,7 @@ port@6 { > fixed-link { > speed = <1000>; > full-duplex; > - pause; > + //pause; > }; > }; > }; > > regards Frank I think I have an idea on how to properly deal with this issue now. I had overlooked the fact that on MT7623 the link to the switch is only 1000M, so any QDMA shaping to that speed does not make any sense, since throughput is already limited by MAC speed. I will make a patch that enables shaping only if port_speed < mac_speed. - Felix
On 27.03.23 19:28, Frank Wunderlich wrote: > >> Gesendet: Sonntag, 26. März 2023 um 22:09 Uhr >> Von: "Felix Fietkau" <nbd@nbd.name> >> On 26.03.23 19:49, Frank Wunderlich wrote: >> >> Gesendet: Sonntag, 26. März 2023 um 19:27 Uhr >> >> Von: "Felix Fietkau" <nbd@nbd.name> > >> >> On 26.03.23 19:10, Frank Wunderlich wrote: >> >> >> Gesendet: Sonntag, 26. März 2023 um 17:56 Uhr >> >> >> Von: "Felix Fietkau" <nbd@nbd.name> > >> >> >> On 25.03.23 10:28, Frank Wunderlich wrote: >> >> >> >> Gesendet: Freitag, 24. März 2023 um 15:04 Uhr >> >> >> >> Von: "Felix Fietkau" <nbd@nbd.name> > >> >> >> > thx for the fix, as daniel already checked it on mt7986/bpi-r3 i tested bpi-r2/mt7623 >> >> >> > >> >> >> > but unfortunately it does not fix issue on bpi-r2 where the gmac0/mt7530 part is affected. >> >> >> > >> >> >> > maybe it needs a special handling like you do for mt7621? maybe it is because the trgmii mode used on this path? >> >> >> Could you please test if making it use the MT7621 codepath brings back >> >> >> performance? I don't have any MT7623 hardware for testing right now. > >> > used the CONFIG_MACH_MT7623 (which is set in my config) boots up fine, but did not fix the 622Mbit-tx-issue >> > >> > and i'm not sure i have tested it before...all ports of mt7531 are affected, not only wan (i remembered you asked for this) >> Does the MAC that's connected to the switch use flow control? Can you >> test if changing that makes a difference? > > it does use flow control/pause on mac and switch-port, disabled it, but it does not change anything (still ~620Mbit on tx) I finally found a MT7623 device in my stash, so I could run some experiments with it. For some reason I could only reproduce your tx throughput values after switching off TRGMII. With TRGMII enabled, I got around 864 Mbit/s, which is of course still lower than what I get with shaping disabled. I also experimented with bumping the shaping rate to higher values, but got no changes in throughput at all. Based on that, I'm beginning to think that the shaper simply can't handle rates close to the MAC rate and runs into a fundamental limit somehow. Now that I think about it, I do remember that shaping to 2.5 Gbps on MT7622 also reduced link throughput. I think we should simply use this patch to deal with it. Do you agree? - Felix --- --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c @@ -753,6 +753,7 @@ static void mtk_mac_link_up(struct phylink_config *config, MAC_MCR_FORCE_RX_FC); /* Configure speed */ + mac->speed = speed; switch (speed) { case SPEED_2500: case SPEED_1000: @@ -3235,6 +3236,9 @@ static int mtk_device_event(struct notifier_block *n, unsigned long event, void if (dp->index >= MTK_QDMA_NUM_QUEUES) return NOTIFY_DONE; + if (mac->speed > 0 && mac->speed <= s.base.speed) + s.base.speed = 0; + mtk_set_queue_speed(eth, dp->index + 3, s.base.speed); return NOTIFY_DONE;
something ist still strange...i get a rcu stall again with this patch...reverted it and my r2 boots again. [ 29.772755] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 29.778689] rcu: 2-...0: (1 GPs behind) idle=547c/1/0x40000000 softirq=251/258 fqs=427 [ 29.786697] rcu: (detected by 1, t=2104 jiffies, g=-875, q=29 ncpus=4) [ 29.793308] Sending NMI from CPU 1 to CPUs 2: [ 34.492968] vusb: disabling [ 34.495828] vmc: disabling [ 34.498587] vmch: disabling [ 34.501433] vgp1: disabling [ 34.504426] vcamaf: disabling [ 39.797579] rcu: rcu_sched kthread timer wakeup didn't happen for 994 jiffies! g-875 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 [ 39.808619] rcu: Possible timer handling issue on cpu=1 timer-softirq=493 [ 39.815487] rcu: rcu_sched kthread starved for 995 jiffies! g-875 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1 [ 39.825571] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior. [ 39.834520] rcu: RCU grace-period kthread stack dump: [ 39.839564] task:rcu_sched state:I stack:0 pid:14 ppid:2 flags:0x00000000 [ 39.847928] __schedule from schedule+0x54/0xe8 [ 39.852472] schedule from schedule_timeout+0x94/0x158 [ 39.857619] schedule_timeout from rcu_gp_fqs_loop+0x12c/0x50c [ 39.863467] rcu_gp_fqs_loop from rcu_gp_kthread+0x194/0x21c [ 39.869135] rcu_gp_kthread from kthread+0xc8/0xcc [ 39.873931] kthread from ret_from_fork+0x14/0x2c [ 39.878639] Exception stack(0xf0859fb0 to 0xf0859ff8) [ 39.883690] 9fa0: 00000000 00000000 00000000 00000000 [ 39.891864] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 [ 39.900037] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 [ 39.906645] rcu: Stack dump where RCU GP kthread last ran: [ 39.912125] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.3.0-rc1-bpi-r2-rc-net #2 [ 39.919518] Hardware name: Mediatek Cortex-A7 (Device Tree) [ 39.925082] PC is at default_idle_call+0x1c/0xb0 [ 39.929698] LR is at ct_kernel_enter.constprop.0+0x48/0x11c [ 39.935267] pc : [<c0d105ec>] lr : [<c0d0ffa4>] psr: 600e0013 [ 39.941527] sp : f0861fb0 ip : c15721e0 fp : 00000000 [ 39.946746] r10: 00000000 r9 : 410fc073 r8 : 8000406a [ 39.951964] r7 : c1404f74 r6 : c19e0900 r5 : c15727e0 r4 : c19e0900 [ 39.958486] r3 : 00000000 r2 : 2da0a000 r1 : 00000001 r0 : 00008cfc [ 39.965007] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none [ 39.972138] Control: 10c5387d Table: 84f4806a DAC: 00000051 [ 39.977878] default_idle_call from cpuidle_idle_call+0x24/0x68 [ 39.983805] cpuidle_idle_call from do_idle+0x9c/0xd0 [ 39.988863] do_idle from cpu_startup_entry+0x20/0x24 [ 39.993921] cpu_startup_entry from secondary_start_kernel+0x118/0x138 [ 40.000457] secondary_start_kernel from 0x801017a0 maybe i need additional patch or did anything else wrong? still working on 6.3-rc1 https://github.com/frank-w/BPI-Router-Linux/commits/6.3-rc-net regards Frank > Gesendet: Mittwoch, 29. März 2023 um 14:04 Uhr > Von: "Felix Fietkau" <nbd@nbd.name> > An: "Frank Wunderlich" <frank-w@public-files.de> > Cc: netdev@vger.kernel.org, "Daniel Golle" <daniel@makrotopia.org> > Betreff: Re: Aw: Re: Re: Re: [PATCH net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links > > On 27.03.23 19:28, Frank Wunderlich wrote: > > > >> Gesendet: Sonntag, 26. März 2023 um 22:09 Uhr > >> Von: "Felix Fietkau" <nbd@nbd.name> > >> On 26.03.23 19:49, Frank Wunderlich wrote: > >> >> Gesendet: Sonntag, 26. März 2023 um 19:27 Uhr > >> >> Von: "Felix Fietkau" <nbd@nbd.name> > > > >> >> On 26.03.23 19:10, Frank Wunderlich wrote: > >> >> >> Gesendet: Sonntag, 26. März 2023 um 17:56 Uhr > >> >> >> Von: "Felix Fietkau" <nbd@nbd.name> > > > >> >> >> On 25.03.23 10:28, Frank Wunderlich wrote: > >> >> >> >> Gesendet: Freitag, 24. März 2023 um 15:04 Uhr > >> >> >> >> Von: "Felix Fietkau" <nbd@nbd.name> > > > >> >> >> > thx for the fix, as daniel already checked it on mt7986/bpi-r3 i tested bpi-r2/mt7623 > >> >> >> > > >> >> >> > but unfortunately it does not fix issue on bpi-r2 where the gmac0/mt7530 part is affected. > >> >> >> > > >> >> >> > maybe it needs a special handling like you do for mt7621? maybe it is because the trgmii mode used on this path? > >> >> >> Could you please test if making it use the MT7621 codepath brings back > >> >> >> performance? I don't have any MT7623 hardware for testing right now. > > > >> > used the CONFIG_MACH_MT7623 (which is set in my config) boots up fine, but did not fix the 622Mbit-tx-issue > >> > > >> > and i'm not sure i have tested it before...all ports of mt7531 are affected, not only wan (i remembered you asked for this) > >> Does the MAC that's connected to the switch use flow control? Can you > >> test if changing that makes a difference? > > > > it does use flow control/pause on mac and switch-port, disabled it, but it does not change anything (still ~620Mbit on tx) > > I finally found a MT7623 device in my stash, so I could run some > experiments with it. For some reason I could only reproduce your tx > throughput values after switching off TRGMII. With TRGMII enabled, I got > around 864 Mbit/s, which is of course still lower than what I get with > shaping disabled. > I also experimented with bumping the shaping rate to higher values, but > got no changes in throughput at all. > Based on that, I'm beginning to think that the shaper simply can't > handle rates close to the MAC rate and runs into a fundamental limit > somehow. > Now that I think about it, I do remember that shaping to 2.5 Gbps on > MT7622 also reduced link throughput. > > I think we should simply use this patch to deal with it. Do you agree? > > - Felix > > --- > --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c > +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c > @@ -753,6 +753,7 @@ static void mtk_mac_link_up(struct phylink_config *config, > MAC_MCR_FORCE_RX_FC); > > /* Configure speed */ > + mac->speed = speed; > switch (speed) { > case SPEED_2500: > case SPEED_1000: > @@ -3235,6 +3236,9 @@ static int mtk_device_event(struct notifier_block *n, unsigned long event, void > if (dp->index >= MTK_QDMA_NUM_QUEUES) > return NOTIFY_DONE; > > + if (mac->speed > 0 && mac->speed <= s.base.speed) > + s.base.speed = 0; > + > mtk_set_queue_speed(eth, dp->index + 3, s.base.speed); > > return NOTIFY_DONE; > > > >
On 30.03.23 15:58, Frank Wunderlich wrote: > something ist still strange...i get a rcu stall again with this patch...reverted it and my r2 boots again. > > [ 29.772755] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: > [ 29.778689] rcu: 2-...0: (1 GPs behind) idle=547c/1/0x40000000 softirq=251/258 fqs=427 > [ 29.786697] rcu: (detected by 1, t=2104 jiffies, g=-875, q=29 ncpus=4) > [ 29.793308] Sending NMI from CPU 1 to CPUs 2: > [ 34.492968] vusb: disabling > [ 34.495828] vmc: disabling > [ 34.498587] vmch: disabling > [ 34.501433] vgp1: disabling > [ 34.504426] vcamaf: disabling > [ 39.797579] rcu: rcu_sched kthread timer wakeup didn't happen for 994 jiffies! g-875 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 > [ 39.808619] rcu: Possible timer handling issue on cpu=1 timer-softirq=493 > [ 39.815487] rcu: rcu_sched kthread starved for 995 jiffies! g-875 f0x0 RCU_GP_WAIT_FQS(5) ->state=0x402 ->cpu=1 > [ 39.825571] rcu: Unless rcu_sched kthread gets sufficient CPU time, OOM is now expected behavior. > [ 39.834520] rcu: RCU grace-period kthread stack dump: > [ 39.839564] task:rcu_sched state:I stack:0 pid:14 ppid:2 flags:0x00000000 > [ 39.847928] __schedule from schedule+0x54/0xe8 > [ 39.852472] schedule from schedule_timeout+0x94/0x158 > [ 39.857619] schedule_timeout from rcu_gp_fqs_loop+0x12c/0x50c > [ 39.863467] rcu_gp_fqs_loop from rcu_gp_kthread+0x194/0x21c > [ 39.869135] rcu_gp_kthread from kthread+0xc8/0xcc > [ 39.873931] kthread from ret_from_fork+0x14/0x2c > [ 39.878639] Exception stack(0xf0859fb0 to 0xf0859ff8) > [ 39.883690] 9fa0: 00000000 00000000 00000000 00000000 > [ 39.891864] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000 > [ 39.900037] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 > [ 39.906645] rcu: Stack dump where RCU GP kthread last ran: > [ 39.912125] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 6.3.0-rc1-bpi-r2-rc-net #2 > [ 39.919518] Hardware name: Mediatek Cortex-A7 (Device Tree) > [ 39.925082] PC is at default_idle_call+0x1c/0xb0 > [ 39.929698] LR is at ct_kernel_enter.constprop.0+0x48/0x11c > [ 39.935267] pc : [<c0d105ec>] lr : [<c0d0ffa4>] psr: 600e0013 > [ 39.941527] sp : f0861fb0 ip : c15721e0 fp : 00000000 > [ 39.946746] r10: 00000000 r9 : 410fc073 r8 : 8000406a > [ 39.951964] r7 : c1404f74 r6 : c19e0900 r5 : c15727e0 r4 : c19e0900 > [ 39.958486] r3 : 00000000 r2 : 2da0a000 r1 : 00000001 r0 : 00008cfc > [ 39.965007] Flags: nZCv IRQs on FIQs on Mode SVC_32 ISA ARM Segment none > [ 39.972138] Control: 10c5387d Table: 84f4806a DAC: 00000051 > [ 39.977878] default_idle_call from cpuidle_idle_call+0x24/0x68 > [ 39.983805] cpuidle_idle_call from do_idle+0x9c/0xd0 > [ 39.988863] do_idle from cpu_startup_entry+0x20/0x24 > [ 39.993921] cpu_startup_entry from secondary_start_kernel+0x118/0x138 > [ 40.000457] secondary_start_kernel from 0x801017a0 > > maybe i need additional patch or did anything else wrong? > > still working on 6.3-rc1 > https://github.com/frank-w/BPI-Router-Linux/commits/6.3-rc-net Can you try applying this patch to a stable kernel instead? These hangs don't make any sense to me, especially the one triggered by an earlier patch that should definitely have been a no-op because of the wrong config symbol. It really looks to me like you have an issue in that kernel triggered by spurious code changes. - Felix
regards Frank > Gesendet: Donnerstag, 30. März 2023 um 19:06 Uhr > Von: "Felix Fietkau" <nbd@nbd.name> > An: "Frank Wunderlich" <frank-w@public-files.de> > Cc: netdev@vger.kernel.org, "Daniel Golle" <daniel@makrotopia.org> > Betreff: Re: Aw: Re: Re: Re: Re: [PATCH net] net: ethernet: mtk_eth_soc: fix tx throughput regression with direct 1G links > > On 30.03.23 15:58, Frank Wunderlich wrote: > > something ist still strange...i get a rcu stall again with this patch...reverted it and my r2 boots again. > > > > [ 29.772755] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: > > [ 29.778689] rcu: 2-...0: (1 GPs behind) idle=547c/1/0x40000000 softirq=251/258 fqs=427 > > [ 29.786697] rcu: (detected by 1, t=2104 jiffies, g=-875, q=29 ncpus=4) > > [ 29.793308] Sending NMI from CPU 1 to CPUs 2: > > > > maybe i need additional patch or did anything else wrong? > > > > still working on 6.3-rc1 > > https://github.com/frank-w/BPI-Router-Linux/commits/6.3-rc-net > Can you try applying this patch to a stable kernel instead? These hangs > don't make any sense to me, especially the one triggered by an earlier > patch that should definitely have been a no-op because of the wrong > config symbol. > It really looks to me like you have an issue in that kernel triggered by > spurious code changes. Hi, have applied it on top of 6.2.0 which has the "implement multi-queue support for per-port queues" in, and indeed it fixes the problem on mt7623. thx for the fix, you can send it with my tested-tag Tested-By: Frank Wunderlich <frank-w@public-files.de> regards Frank
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c index a94aa08515af..282f9435d5ff 100644 --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c @@ -763,8 +763,6 @@ static void mtk_mac_link_up(struct phylink_config *config, break; } - mtk_set_queue_speed(mac->hw, mac->id, speed); - /* Configure duplex */ if (duplex == DUPLEX_FULL) mcr |= MAC_MCR_FORCE_DPX;