Message ID | 20231129155350.5843-1-hau@realtek.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 4b0768b6556af56ee9b7cf4e68452a2b6289ae45 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net,v2] r8169: fix rtl8125b PAUSE frames blasting when suspended | expand |
On 11/29/2023 7:53 AM, ChunHao Lin wrote: > When FIFO reaches near full state, device will issue pause frame. > If pause slot is enabled(set to 1), in this time, device will issue > pause frame only once. But if pause slot is disabled(set to 0), device > will keep sending pause frames until FIFO reaches near empty state. > > When pause slot is disabled, if there is no one to handle receive > packets, device FIFO will reach near full state and keep sending > pause frames. That will impact entire local area network. > > This issue can be reproduced in Chromebox (not Chromebook) in > developer mode running a test image (and v5.10 kernel): > 1) ping -f $CHROMEBOX (from workstation on same local network) > 2) run "powerd_dbus_suspend" from command line on the $CHROMEBOX > 3) ping $ROUTER (wait until ping fails from workstation) > > Takes about ~20-30 seconds after step 2 for the local network to > stop working. > > Fix this issue by enabling pause slot to only send pause frame once > when FIFO reaches near full state. > Makes sense. Avoiding the spam is good. The naming is a bit confusing but I guess that comes from realtek datasheet? > Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") > Reported-by: Grant Grundler <grundler@chromium.org> > Tested-by: Grant Grundler <grundler@chromium.org> > Cc: stable@vger.kernel.org > Signed-off-by: ChunHao Lin <hau@realtek.com> > --- > v2: > - update comment and title. > --- > drivers/net/ethernet/realtek/r8169_main.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c > index 62cabeeb842a..bb787a52bc75 100644 > --- a/drivers/net/ethernet/realtek/r8169_main.c > +++ b/drivers/net/ethernet/realtek/r8169_main.c > @@ -196,6 +196,7 @@ enum rtl_registers { > /* No threshold before first PCI xfer */ > #define RX_FIFO_THRESH (7 << RXCFG_FIFO_SHIFT) > #define RX_EARLY_OFF (1 << 11) > +#define RX_PAUSE_SLOT_ON (1 << 11) /* 8125b and later */ This confuses me though: RX_EARLY_OFF is (1 << 11) as well.. Is that from a different set of devices? We're writing to the same register RxConfig here I think in both cases? Can you clarify if these are supposed to be the same bit? > #define RXCFG_DMA_SHIFT 8 > /* Unlimited maximum PCI burst. */ > #define RX_DMA_BURST (7 << RXCFG_DMA_SHIFT) > @@ -2306,9 +2307,13 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp) > case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_53: > RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF); > break; > - case RTL_GIGA_MAC_VER_61 ... RTL_GIGA_MAC_VER_63: > + case RTL_GIGA_MAC_VER_61: > RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST); > break; I assume there isn't a VER_62 between these? > + case RTL_GIGA_MAC_VER_63: > + RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST | > + RX_PAUSE_SLOT_ON); We add RX_PAUSE_SLOT_ON now for RTL_GIGA_MAC_VER_63 in addition. Makes sense. > + break; > default: > RTL_W32(tp, RxConfig, RX128_INT_EN | RX_DMA_BURST); > break;
On Wed, Nov 29, 2023 at 3:05 PM Jacob Keller <jacob.e.keller@intel.com> wrote: > On 11/29/2023 7:53 AM, ChunHao Lin wrote: > > When FIFO reaches near full state, device will issue pause frame. > > If pause slot is enabled(set to 1), in this time, device will issue > > pause frame only once. But if pause slot is disabled(set to 0), device > > will keep sending pause frames until FIFO reaches near empty state. > > > > When pause slot is disabled, if there is no one to handle receive > > packets, device FIFO will reach near full state and keep sending > > pause frames. That will impact entire local area network. > > > > This issue can be reproduced in Chromebox (not Chromebook) in > > developer mode running a test image (and v5.10 kernel): > > 1) ping -f $CHROMEBOX (from workstation on same local network) > > 2) run "powerd_dbus_suspend" from command line on the $CHROMEBOX > > 3) ping $ROUTER (wait until ping fails from workstation) > > > > Takes about ~20-30 seconds after step 2 for the local network to > > stop working. > > > > Fix this issue by enabling pause slot to only send pause frame once > > when FIFO reaches near full state. > > > > Makes sense. Avoiding the spam is good. The naming is a bit confusing > but I guess that comes from realtek datasheet? I don't know. It doesn't matter to me what it's called since I don't have access to the data sheet anyway. :/ > > Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") > > Reported-by: Grant Grundler <grundler@chromium.org> > > Tested-by: Grant Grundler <grundler@chromium.org> > > Cc: stable@vger.kernel.org > > Signed-off-by: ChunHao Lin <hau@realtek.com> > > --- > > v2: > > - update comment and title. > > --- > > drivers/net/ethernet/realtek/r8169_main.c | 7 ++++++- > > 1 file changed, 6 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c > > index 62cabeeb842a..bb787a52bc75 100644 > > --- a/drivers/net/ethernet/realtek/r8169_main.c > > +++ b/drivers/net/ethernet/realtek/r8169_main.c > > @@ -196,6 +196,7 @@ enum rtl_registers { > > /* No threshold before first PCI xfer */ > > #define RX_FIFO_THRESH (7 << RXCFG_FIFO_SHIFT) > > #define RX_EARLY_OFF (1 << 11) > > +#define RX_PAUSE_SLOT_ON (1 << 11) /* 8125b and later */ > > This confuses me though: RX_EARLY_OFF is (1 << 11) as well.. Is that > from a different set of devices? Yes, for a different HW version of the device. > We're writing to the same register > RxConfig here I think in both cases? Yes. But to different versions of the HW which use this bit differently. Ergo the comment about "8125b and later". > Can you clarify if these are supposed to be the same bit? Yes, they are the same bit - but different versions of HW use BIT(11) differently. > > > #define RXCFG_DMA_SHIFT 8 > > /* Unlimited maximum PCI burst. */ > > #define RX_DMA_BURST (7 << RXCFG_DMA_SHIFT) > > @@ -2306,9 +2307,13 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp) > > case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_53: > > RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF); > > break; > > - case RTL_GIGA_MAC_VER_61 ... RTL_GIGA_MAC_VER_63: > > + case RTL_GIGA_MAC_VER_61: > > RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST); > > break; > > I assume there isn't a VER_62 between these? Correct. My clue is this code near the top of this file: 149 [RTL_GIGA_MAC_VER_61] = {"RTL8125A", FIRMWARE_8125A_3}, 150 /* reserve 62 for CFG_METHOD_4 in the vendor driver */ 151 [RTL_GIGA_MAC_VER_63] = {"RTL8125B", FIRMWARE_8125B_2}, > > > + case RTL_GIGA_MAC_VER_63: > > + RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST | > > + RX_PAUSE_SLOT_ON); > > We add RX_PAUSE_SLOT_ON now for RTL_GIGA_MAC_VER_63 in addition. Makes > sense. Exactly. thanks for reviewing! cheers, grant > > + break; > > default: > > RTL_W32(tp, RxConfig, RX128_INT_EN | RX_DMA_BURST); > > break;
On 11/29/2023 3:40 PM, Grant Grundler wrote: > On Wed, Nov 29, 2023 at 3:05 PM Jacob Keller <jacob.e.keller@intel.com> wrote: >> On 11/29/2023 7:53 AM, ChunHao Lin wrote: >>> When FIFO reaches near full state, device will issue pause frame. >>> If pause slot is enabled(set to 1), in this time, device will issue >>> pause frame only once. But if pause slot is disabled(set to 0), device >>> will keep sending pause frames until FIFO reaches near empty state. >>> >>> When pause slot is disabled, if there is no one to handle receive >>> packets, device FIFO will reach near full state and keep sending >>> pause frames. That will impact entire local area network. >>> >>> This issue can be reproduced in Chromebox (not Chromebook) in >>> developer mode running a test image (and v5.10 kernel): >>> 1) ping -f $CHROMEBOX (from workstation on same local network) >>> 2) run "powerd_dbus_suspend" from command line on the $CHROMEBOX >>> 3) ping $ROUTER (wait until ping fails from workstation) >>> >>> Takes about ~20-30 seconds after step 2 for the local network to >>> stop working. >>> >>> Fix this issue by enabling pause slot to only send pause frame once >>> when FIFO reaches near full state. >>> >> >> Makes sense. Avoiding the spam is good. The naming is a bit confusing >> but I guess that comes from realtek datasheet? > > I don't know. It doesn't matter to me what it's called since I don't > have access to the data sheet anyway. :/ > The name is fine, i just found it a bit hard to parse since its effectively "PAUSE_SLOT_ON" makes us *not* send pause frames forever. I think its fine as-is, since this is referring to the use of the pause slot in hardware. >>> Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") >>> Reported-by: Grant Grundler <grundler@chromium.org> >>> Tested-by: Grant Grundler <grundler@chromium.org> >>> Cc: stable@vger.kernel.org >>> Signed-off-by: ChunHao Lin <hau@realtek.com> >>> --- >>> v2: >>> - update comment and title. >>> --- >>> drivers/net/ethernet/realtek/r8169_main.c | 7 ++++++- >>> 1 file changed, 6 insertions(+), 1 deletion(-) >>> >>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c >>> index 62cabeeb842a..bb787a52bc75 100644 >>> --- a/drivers/net/ethernet/realtek/r8169_main.c >>> +++ b/drivers/net/ethernet/realtek/r8169_main.c >>> @@ -196,6 +196,7 @@ enum rtl_registers { >>> /* No threshold before first PCI xfer */ >>> #define RX_FIFO_THRESH (7 << RXCFG_FIFO_SHIFT) >>> #define RX_EARLY_OFF (1 << 11) >>> +#define RX_PAUSE_SLOT_ON (1 << 11) /* 8125b and later */ >> >> This confuses me though: RX_EARLY_OFF is (1 << 11) as well.. Is that >> from a different set of devices? > > Yes, for a different HW version of the device. > Great. >> We're writing to the same register >> RxConfig here I think in both cases? > > Yes. But to different versions of the HW which use this bit > differently. Ergo the comment about "8125b and later". > >> Can you clarify if these are supposed to be the same bit? > > Yes, they are the same bit - but different versions of HW use BIT(11) > differently. Thanks for the clarification! > >> >>> #define RXCFG_DMA_SHIFT 8 >>> /* Unlimited maximum PCI burst. */ >>> #define RX_DMA_BURST (7 << RXCFG_DMA_SHIFT) >>> @@ -2306,9 +2307,13 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp) >>> case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_53: >>> RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF); >>> break; >>> - case RTL_GIGA_MAC_VER_61 ... RTL_GIGA_MAC_VER_63: >>> + case RTL_GIGA_MAC_VER_61: >>> RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST); >>> break; >> >> I assume there isn't a VER_62 between these? > > Correct. My clue is this code near the top of this file: > > 149 [RTL_GIGA_MAC_VER_61] = {"RTL8125A", FIRMWARE_8125A_3}, > 150 /* reserve 62 for CFG_METHOD_4 in the vendor driver */ > 151 [RTL_GIGA_MAC_VER_63] = {"RTL8125B", FIRMWARE_8125B_2}, > >> >>> + case RTL_GIGA_MAC_VER_63: >>> + RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST | >>> + RX_PAUSE_SLOT_ON); >> >> We add RX_PAUSE_SLOT_ON now for RTL_GIGA_MAC_VER_63 in addition. Makes >> sense. > > Exactly. > > thanks for reviewing! > Great. For the record: Reviewed-by: Jacob Keller <jacob.e.keller@intel.com>
On Wed, 29 Nov 2023 23:53:50 +0800 ChunHao Lin wrote: > When FIFO reaches near full state, device will issue pause frame. > If pause slot is enabled(set to 1), in this time, device will issue > pause frame only once. But if pause slot is disabled(set to 0), device > will keep sending pause frames until FIFO reaches near empty state. Heiner, looks good?
On 29.11.2023 16:53, ChunHao Lin wrote: > When FIFO reaches near full state, device will issue pause frame. > If pause slot is enabled(set to 1), in this time, device will issue > pause frame only once. But if pause slot is disabled(set to 0), device > will keep sending pause frames until FIFO reaches near empty state. > > When pause slot is disabled, if there is no one to handle receive > packets, device FIFO will reach near full state and keep sending > pause frames. That will impact entire local area network. > > This issue can be reproduced in Chromebox (not Chromebook) in > developer mode running a test image (and v5.10 kernel): > 1) ping -f $CHROMEBOX (from workstation on same local network) > 2) run "powerd_dbus_suspend" from command line on the $CHROMEBOX > 3) ping $ROUTER (wait until ping fails from workstation) > > Takes about ~20-30 seconds after step 2 for the local network to > stop working. > > Fix this issue by enabling pause slot to only send pause frame once > when FIFO reaches near full state. > > Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125") > Reported-by: Grant Grundler <grundler@chromium.org> > Tested-by: Grant Grundler <grundler@chromium.org> > Cc: stable@vger.kernel.org > Signed-off-by: ChunHao Lin <hau@realtek.com> > --- > v2: > - update comment and title. > --- Reviewed-by: Heiner Kallweit <hkallweit1@gmail.com>
Hello: This patch was applied to netdev/net.git (main) by Jakub Kicinski <kuba@kernel.org>: On Wed, 29 Nov 2023 23:53:50 +0800 you wrote: > When FIFO reaches near full state, device will issue pause frame. > If pause slot is enabled(set to 1), in this time, device will issue > pause frame only once. But if pause slot is disabled(set to 0), device > will keep sending pause frames until FIFO reaches near empty state. > > When pause slot is disabled, if there is no one to handle receive > packets, device FIFO will reach near full state and keep sending > pause frames. That will impact entire local area network. > > [...] Here is the summary with links: - [net,v2] r8169: fix rtl8125b PAUSE frames blasting when suspended https://git.kernel.org/netdev/net/c/4b0768b6556a You are awesome, thank you!
diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c index 62cabeeb842a..bb787a52bc75 100644 --- a/drivers/net/ethernet/realtek/r8169_main.c +++ b/drivers/net/ethernet/realtek/r8169_main.c @@ -196,6 +196,7 @@ enum rtl_registers { /* No threshold before first PCI xfer */ #define RX_FIFO_THRESH (7 << RXCFG_FIFO_SHIFT) #define RX_EARLY_OFF (1 << 11) +#define RX_PAUSE_SLOT_ON (1 << 11) /* 8125b and later */ #define RXCFG_DMA_SHIFT 8 /* Unlimited maximum PCI burst. */ #define RX_DMA_BURST (7 << RXCFG_DMA_SHIFT) @@ -2306,9 +2307,13 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp) case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_53: RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF); break; - case RTL_GIGA_MAC_VER_61 ... RTL_GIGA_MAC_VER_63: + case RTL_GIGA_MAC_VER_61: RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST); break; + case RTL_GIGA_MAC_VER_63: + RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST | + RX_PAUSE_SLOT_ON); + break; default: RTL_W32(tp, RxConfig, RX128_INT_EN | RX_DMA_BURST); break;