Message ID | 20240615103038.973-3-paul.barker.ct@bp.renesas.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | Fix maximum TX/RX frame sizes in ravb driver | expand |
Hi Paul, Thanks for your work. On 2024-06-15 11:30:38 +0100, Paul Barker wrote: > The RX frame size limit should not be based on the current MTU setting. > Instead it should be based on the hardware capabilities. > > Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") I agree with the change the RFLR.RFL setting should not be connected to the MTU setting. And this likely comes from the early days of the driver where neither Rx or Tx supported multiple descriptors for each packet. In those days the single descriptor used for each packet was tied to the MTU setting. So likely the fixes tag should point to a later commit? This is a great find and shows a flaw in the driver. But limiting the size of each descriptor used for Tx is only half the solution right? As the driver now supports multiple descriptors for Tx (on GbEth) the driver allows setting an MTU larger then the maximum size for single descriptor on those devices. But the xmit function of the driver still hardcode the maximum of 2 descriptors for each Tx packet. And it only uses the two descriptors to align the data to hardware constrains. Is it not incorrect for the driver to accept an MTU larger then the maximum size of a single descriptor with the current Tx implementation? The driver can only support larger MTU settings for Rx, but not Tx ATM. I think the complete fix is to extend ravb_start_xmit() to fully support split descriptors for packets larger then the maximum single descriptor size. Not just to align the packet between a DT_FSTART and DT_FEND descriptor when needed. > Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com> > --- > drivers/net/ethernet/renesas/ravb_main.c | 4 +++- > 1 file changed, 3 insertions(+), 1 deletion(-) > > diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c > index 02cbf850bd85..481c854cb305 100644 > --- a/drivers/net/ethernet/renesas/ravb_main.c > +++ b/drivers/net/ethernet/renesas/ravb_main.c > @@ -555,8 +555,10 @@ static void ravb_emac_init_gbeth(struct net_device *ndev) > > static void ravb_emac_init_rcar(struct net_device *ndev) > { > + struct ravb_private *priv = netdev_priv(ndev); > + > /* Receive frame limit set register */ > - ravb_write(ndev, ndev->mtu + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN, RFLR); > + ravb_write(ndev, priv->info->rx_max_frame_size + ETH_FCS_LEN, RFLR); > > /* EMAC Mode: PAUSE prohibition; Duplex; RX Checksum; TX; RX */ > ravb_write(ndev, ECMR_ZPF | ECMR_DM | > -- > 2.39.2 >
On Sat, Jun 15, 2024 at 11:30:38AM +0100, Paul Barker wrote: > The RX frame size limit should not be based on the current MTU setting. > Instead it should be based on the hardware capabilities. This is a bit odd. MTU is Maximum Transmission Unit, so clearly is about Tx. MRU does not really exist. Does TCP allow for asymmetric MTU/MRU? Does MTU discovery work correctly for this? In general, it seems like drivers implement min(MTU, MRU) and nothing more. Do you have a real use case for this asymmetry? Andrew
On 16/06/2024 02:23, Andrew Lunn wrote: > On Sat, Jun 15, 2024 at 11:30:38AM +0100, Paul Barker wrote: >> The RX frame size limit should not be based on the current MTU setting. >> Instead it should be based on the hardware capabilities. > > This is a bit odd. MTU is Maximum Transmission Unit, so clearly is > about Tx. MRU does not really exist. Does TCP allow for asymmetric > MTU/MRU? Does MTU discovery work correctly for this? > > In general, it seems like drivers implement min(MTU, MRU) and nothing > more. Do you have a real use case for this asymmetry? > > Andrew Hi Andrew, This change is based on my understanding of MTU/MRU, on the specs of the RZ SoCs I'm working with (primarily RZ/G2L family, RZ/G3S and RZ/G2H) and on some testing. My goal here is just to make the capabilities of the hardware available to users. For the RZ/G2L family and RZ/G3S, we can only support an MTU of up to 1500 bytes, but we can receive frames of up to (IIRC) 8192 bytes. I have tested sending jumbo frames to an RZ/G2L device using both iperf3 and ping and I see no errors. * For iperf3 RX testing, the RZ/G2L is only responding with acks. These are small regardless of the size of the received packets, so the mis-match in MTU between the two hosts causes no issue. * For ping testing, the RZ/G2L will give a fragmented response to the ping packet which the other host can reassemble. For the RZ/G2H, we support sending frames of up to 2047 bytes but we can receive frames of up to 4092 bytes. The driver will need a few more changes to handle reception of packets >2kB in size, but this is something we can do in the near future. Is there any reason why we shouldn't support this? I am by no means an expert in the Linux networking internals so there may be things I'm missing. Thanks,
On 15/06/2024 14:04, Niklas Söderlund wrote: > Hi Paul, > > Thanks for your work. > > On 2024-06-15 11:30:38 +0100, Paul Barker wrote: >> The RX frame size limit should not be based on the current MTU setting. >> Instead it should be based on the hardware capabilities. >> >> Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") > > I agree with the change the RFLR.RFL setting should not be connected to > the MTU setting. And this likely comes from the early days of the driver > where neither Rx or Tx supported multiple descriptors for each packet. > In those days the single descriptor used for each packet was tied to the > MTU setting. So likely the fixes tag should point to a later commit?> If my understanding of MTU & MRU are correct, even with a single descriptor we can always accept the same number of bytes regardless of the current MTU. > This is a great find and shows a flaw in the driver. But limiting the > size of each descriptor used for Tx is only half the solution right? As > the driver now supports multiple descriptors for Tx (on GbEth) the > driver allows setting an MTU larger then the maximum size for single > descriptor on those devices. But the xmit function of the driver still > hardcode the maximum of 2 descriptors for each Tx packet. And it only > uses the two descriptors to align the data to hardware constrains. > > Is it not incorrect for the driver to accept an MTU larger then the > maximum size of a single descriptor with the current Tx implementation? > The driver can only support larger MTU settings for Rx, but not Tx ATM. > > I think the complete fix is to extend ravb_start_xmit() to fully support > split descriptors for packets larger then the maximum single descriptor > size. Not just to align the packet between a DT_FSTART and DT_FEND > descriptor when needed. For the RZ SoCs I have looked at, this isn't an issue. We support transmitting frames of slightly over 1500 bytes on the SoCs with GbEth IP, or 2047 bytes on the SoCs with R-Car AVB IP (e.g. RZ/G2H). Given that a single descriptor can cover up to 4095 bytes of data (with its 12 bit DS field), we don't need split TX descriptors for either of these. Thanks,
On Mon, Jun 17, 2024 at 03:03:21PM +0100, Paul Barker wrote: > On 16/06/2024 02:23, Andrew Lunn wrote: > > On Sat, Jun 15, 2024 at 11:30:38AM +0100, Paul Barker wrote: > >> The RX frame size limit should not be based on the current MTU setting. > >> Instead it should be based on the hardware capabilities. > > > > This is a bit odd. MTU is Maximum Transmission Unit, so clearly is > > about Tx. MRU does not really exist. Does TCP allow for asymmetric > > MTU/MRU? Does MTU discovery work correctly for this? > > > > In general, it seems like drivers implement min(MTU, MRU) and nothing > > more. Do you have a real use case for this asymmetry? > > > > Andrew > > Hi Andrew, > > This change is based on my understanding of MTU/MRU, on the specs of the > RZ SoCs I'm working with (primarily RZ/G2L family, RZ/G3S and RZ/G2H) > and on some testing. My goal here is just to make the capabilities of > the hardware available to users. > > For the RZ/G2L family and RZ/G3S, we can only support an MTU of up to > 1500 bytes, but we can receive frames of up to (IIRC) 8192 bytes. I have > tested sending jumbo frames to an RZ/G2L device using both iperf3 and > ping and I see no errors. > > * For iperf3 RX testing, the RZ/G2L is only responding with acks. These > are small regardless of the size of the received packets, so the > mis-match in MTU between the two hosts causes no issue. > > * For ping testing, the RZ/G2L will give a fragmented response to the > ping packet which the other host can reassemble. > > For the RZ/G2H, we support sending frames of up to 2047 bytes but we can > receive frames of up to 4092 bytes. The driver will need a few more > changes to handle reception of packets >2kB in size, but this is > something we can do in the near future. > > Is there any reason why we shouldn't support this? I am by no means an > expert in the Linux networking internals so there may be things I'm > missing. I've no really experience what the Linux stack will do. I have seen a number of devices which do not limit received packets to the MTU, i.e. they are happy to receive bigger packets. And i don't think this has caused an issue. But is it ever actually used? Do users setup asymmetric jumbo systems? I've no idea. One thing i suggest you test is you put this in the middle of two systems which do support jumbo. 4K 1.5K 4K A <-----> B <-----> C <------> D. Make your device B and C, setting the MTU as supported. But set A and D with a jumbo MTU which your systems should be able to receive. Do IP routing along the chain. And then do some iperf transfers. I would test both IPv4 and IPv6, since MTU path discovery in IPv6 is used a lot, and gateways are not supposed to fragment. I'm assuming here your device does actually have two interfaces? If it is only ever an end system, that simplifies it a lot, and these tests are not needed. Andrew
On 6/15/24 1:30 PM, Paul Barker wrote: > The RX frame size limit should not be based on the current MTU setting. > Instead it should be based on the hardware capabilities. > > Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") > Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com> Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru> Sounds like this is also a fix for net.git tho? [...] > diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c > index 02cbf850bd85..481c854cb305 100644 > --- a/drivers/net/ethernet/renesas/ravb_main.c > +++ b/drivers/net/ethernet/renesas/ravb_main.c > @@ -555,8 +555,10 @@ static void ravb_emac_init_gbeth(struct net_device *ndev) > > static void ravb_emac_init_rcar(struct net_device *ndev) > { > + struct ravb_private *priv = netdev_priv(ndev); > + > /* Receive frame limit set register */ > - ravb_write(ndev, ndev->mtu + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN, RFLR); > + ravb_write(ndev, priv->info->rx_max_frame_size + ETH_FCS_LEN, RFLR); Aha, that's what we're doing in ravb_emac_init_gbeth()... [...] MBR, Sergey
On 17/06/2024 15:29, Andrew Lunn wrote: > On Mon, Jun 17, 2024 at 03:03:21PM +0100, Paul Barker wrote: >> On 16/06/2024 02:23, Andrew Lunn wrote: >>> On Sat, Jun 15, 2024 at 11:30:38AM +0100, Paul Barker wrote: >>>> The RX frame size limit should not be based on the current MTU setting. >>>> Instead it should be based on the hardware capabilities. >>> >>> This is a bit odd. MTU is Maximum Transmission Unit, so clearly is >>> about Tx. MRU does not really exist. Does TCP allow for asymmetric >>> MTU/MRU? Does MTU discovery work correctly for this? >>> >>> In general, it seems like drivers implement min(MTU, MRU) and nothing >>> more. Do you have a real use case for this asymmetry? >>> >>> Andrew >> >> Hi Andrew, >> >> This change is based on my understanding of MTU/MRU, on the specs of the >> RZ SoCs I'm working with (primarily RZ/G2L family, RZ/G3S and RZ/G2H) >> and on some testing. My goal here is just to make the capabilities of >> the hardware available to users. >> >> For the RZ/G2L family and RZ/G3S, we can only support an MTU of up to >> 1500 bytes, but we can receive frames of up to (IIRC) 8192 bytes. I have >> tested sending jumbo frames to an RZ/G2L device using both iperf3 and >> ping and I see no errors. >> >> * For iperf3 RX testing, the RZ/G2L is only responding with acks. These >> are small regardless of the size of the received packets, so the >> mis-match in MTU between the two hosts causes no issue. >> >> * For ping testing, the RZ/G2L will give a fragmented response to the >> ping packet which the other host can reassemble. >> >> For the RZ/G2H, we support sending frames of up to 2047 bytes but we can >> receive frames of up to 4092 bytes. The driver will need a few more >> changes to handle reception of packets >2kB in size, but this is >> something we can do in the near future. >> >> Is there any reason why we shouldn't support this? I am by no means an >> expert in the Linux networking internals so there may be things I'm >> missing. > > I've no really experience what the Linux stack will do. I have seen a > number of devices which do not limit received packets to the MTU, > i.e. they are happy to receive bigger packets. And i don't think this > has caused an issue. But is it ever actually used? Do users setup > asymmetric jumbo systems? I've no idea. > > One thing i suggest you test is you put this in the middle of two > systems which do support jumbo. > > 4K 1.5K 4K > A <-----> B <-----> C <------> D. > > Make your device B and C, setting the MTU as supported. But set A and > D with a jumbo MTU which your systems should be able to receive. Do IP > routing along the chain. And then do some iperf transfers. I would > test both IPv4 and IPv6, since MTU path discovery in IPv6 is used a > lot, and gateways are not supposed to fragment. > > I'm assuming here your device does actually have two interfaces? If > it is only ever an end system, that simplifies it a lot, and these > tests are not needed. > > Andrew Apologies, my response here is abysmally late due to illness, other priorities and then the loss of my main dev box. As you've said, a number of devices do not limit received packet size to the MTU. There are many applications, other than a gateway, where using jumbo packets in even just one direction would be beneficial. For example if an application needs to receive large amounts of data but only needs to send back control and acknowledgement messages. I think we should support this where possible. This is the thought behind the first patch in this series as the GbEth IP present in the RZ/G2L and other Renesas SoCs has a very asymmetric capability (it can receive 8000 byte frames but only transmit 1522 byte frames). If we explicitly do not wish to support this, that restriction should be documented and then (maybe over time) handled uniformly for all network drivers. I'm planning to submit v2 of this series shortly. Thanks,
> Apologies, my response here is abysmally late due to illness, other > priorities and then the loss of my main dev box. Not a problem, life happens. > As you've said, a number of devices do not limit received packet size to > the MTU. There are many applications, other than a gateway, where using > jumbo packets in even just one direction would be beneficial. For > example if an application needs to receive large amounts of data but > only needs to send back control and acknowledgement messages. I think we > should support this where possible. This is the thought behind the first > patch in this series as the GbEth IP present in the RZ/G2L and other > Renesas SoCs has a very asymmetric capability (it can receive 8000 byte > frames but only transmit 1522 byte frames). > > If we explicitly do not wish to support this, that restriction should be > documented and then (maybe over time) handled uniformly for all network > drivers. > > I'm planning to submit v2 of this series shortly. Does the hardware support scatter/gather? How does supporting jumbo receive affect memory usage? Can you give the hardware a number of 2K buffers, and it will use one for a typical packet, and 4 for a jumbo frame? Andrew
On 13/08/2024 15:06, Andrew Lunn wrote: >> Apologies, my response here is abysmally late due to illness, other >> priorities and then the loss of my main dev box. > > Not a problem, life happens. > >> As you've said, a number of devices do not limit received packet size to >> the MTU. There are many applications, other than a gateway, where using >> jumbo packets in even just one direction would be beneficial. For >> example if an application needs to receive large amounts of data but >> only needs to send back control and acknowledgement messages. I think we >> should support this where possible. This is the thought behind the first >> patch in this series as the GbEth IP present in the RZ/G2L and other >> Renesas SoCs has a very asymmetric capability (it can receive 8000 byte >> frames but only transmit 1522 byte frames). >> >> If we explicitly do not wish to support this, that restriction should be >> documented and then (maybe over time) handled uniformly for all network >> drivers. >> >> I'm planning to submit v2 of this series shortly. > > Does the hardware support scatter/gather? How does supporting jumbo > receive affect memory usage? Can you give the hardware a number of 2K > buffers, and it will use one for a typical packet, and 4 for a jumbo > frame? This is exactly what happens. After recent changes [1], we use 2kB RX buffers and an 8kB maximum RX frame size for the GbEth IP. The hardware will split the received frame over one or more buffers as needed. As we would allocate a ring of 2kB buffers in any case, supporting jumbo packets doesn't cause any increase in memory usage or in CPU time spent in memory management. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=966726324b7b14009216fda33b47e0bc003944c6 Thanks,
diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c index 02cbf850bd85..481c854cb305 100644 --- a/drivers/net/ethernet/renesas/ravb_main.c +++ b/drivers/net/ethernet/renesas/ravb_main.c @@ -555,8 +555,10 @@ static void ravb_emac_init_gbeth(struct net_device *ndev) static void ravb_emac_init_rcar(struct net_device *ndev) { + struct ravb_private *priv = netdev_priv(ndev); + /* Receive frame limit set register */ - ravb_write(ndev, ndev->mtu + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN, RFLR); + ravb_write(ndev, priv->info->rx_max_frame_size + ETH_FCS_LEN, RFLR); /* EMAC Mode: PAUSE prohibition; Duplex; RX Checksum; TX; RX */ ravb_write(ndev, ECMR_ZPF | ECMR_DM |
The RX frame size limit should not be based on the current MTU setting. Instead it should be based on the hardware capabilities. Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper") Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com> --- drivers/net/ethernet/renesas/ravb_main.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-)