diff mbox series

[net-next,2/2] net: ravb: Fix R-Car RX frame size limit

Message ID 20240615103038.973-3-paul.barker.ct@bp.renesas.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series Fix maximum TX/RX frame sizes in ravb driver | expand

Checks

Context Check Description
netdev/series_format success Posting correctly formatted
netdev/tree_selection success Clearly marked for net-next
netdev/ynl success Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 845 this patch: 845
netdev/build_tools success No tools touched, skip
netdev/cc_maintainers success CCed 9 of 10 maintainers
netdev/build_clang success Errors and warnings before: 849 this patch: 849
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 849 this patch: 849
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 11 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
netdev/contest success net-next-2024-06-16--18-00 (tests: 659)

Commit Message

Paul Barker June 15, 2024, 10:30 a.m. UTC
The RX frame size limit should not be based on the current MTU setting.
Instead it should be based on the hardware capabilities.

Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
---
 drivers/net/ethernet/renesas/ravb_main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

Comments

Niklas Söderlund June 15, 2024, 1:04 p.m. UTC | #1
Hi Paul,

Thanks for your work.

On 2024-06-15 11:30:38 +0100, Paul Barker wrote:
> The RX frame size limit should not be based on the current MTU setting.
> Instead it should be based on the hardware capabilities.
> 
> Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")

I agree with the change the RFLR.RFL setting should not be connected to 
the MTU setting. And this likely comes from the early days of the driver 
where neither Rx or Tx supported multiple descriptors for each packet.  
In those days the single descriptor used for each packet was tied to the 
MTU setting. So likely the fixes tag should point to a later commit?

This is a great find and shows a flaw in the driver. But limiting the 
size of each descriptor used for Tx is only half the solution right? As 
the driver now supports multiple descriptors for Tx (on GbEth) the 
driver allows setting an MTU larger then the maximum size for single 
descriptor on those devices. But the xmit function of the driver still 
hardcode the maximum of 2 descriptors for each Tx packet. And it only 
uses the two descriptors to align the data to hardware constrains.

Is it not incorrect for the driver to accept an MTU larger then the 
maximum size of a single descriptor with the current Tx implementation?  
The driver can only support larger MTU settings for Rx, but not Tx ATM.

I think the complete fix is to extend ravb_start_xmit() to fully support 
split descriptors for packets larger then the maximum single descriptor 
size. Not just to align the packet between a DT_FSTART and DT_FEND 
descriptor when needed.

> Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
> ---
>  drivers/net/ethernet/renesas/ravb_main.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
> index 02cbf850bd85..481c854cb305 100644
> --- a/drivers/net/ethernet/renesas/ravb_main.c
> +++ b/drivers/net/ethernet/renesas/ravb_main.c
> @@ -555,8 +555,10 @@ static void ravb_emac_init_gbeth(struct net_device *ndev)
>  
>  static void ravb_emac_init_rcar(struct net_device *ndev)
>  {
> +	struct ravb_private *priv = netdev_priv(ndev);
> +
>  	/* Receive frame limit set register */
> -	ravb_write(ndev, ndev->mtu + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN, RFLR);
> +	ravb_write(ndev, priv->info->rx_max_frame_size + ETH_FCS_LEN, RFLR);
>  
>  	/* EMAC Mode: PAUSE prohibition; Duplex; RX Checksum; TX; RX */
>  	ravb_write(ndev, ECMR_ZPF | ECMR_DM |
> -- 
> 2.39.2
>
Andrew Lunn June 16, 2024, 1:23 a.m. UTC | #2
On Sat, Jun 15, 2024 at 11:30:38AM +0100, Paul Barker wrote:
> The RX frame size limit should not be based on the current MTU setting.
> Instead it should be based on the hardware capabilities.

This is a bit odd. MTU is Maximum Transmission Unit, so clearly is
about Tx. MRU does not really exist. Does TCP allow for asymmetric
MTU/MRU? Does MTU discovery work correctly for this?

In general, it seems like drivers implement min(MTU, MRU) and nothing
more. Do you have a real use case for this asymmetry?

      Andrew
Paul Barker June 17, 2024, 2:03 p.m. UTC | #3
On 16/06/2024 02:23, Andrew Lunn wrote:
> On Sat, Jun 15, 2024 at 11:30:38AM +0100, Paul Barker wrote:
>> The RX frame size limit should not be based on the current MTU setting.
>> Instead it should be based on the hardware capabilities.
> 
> This is a bit odd. MTU is Maximum Transmission Unit, so clearly is
> about Tx. MRU does not really exist. Does TCP allow for asymmetric
> MTU/MRU? Does MTU discovery work correctly for this?
> 
> In general, it seems like drivers implement min(MTU, MRU) and nothing
> more. Do you have a real use case for this asymmetry?
> 
>       Andrew

Hi Andrew,

This change is based on my understanding of MTU/MRU, on the specs of the
RZ SoCs I'm working with (primarily RZ/G2L family, RZ/G3S and RZ/G2H)
and on some testing. My goal here is just to make the capabilities of
the hardware available to users.

For the RZ/G2L family and RZ/G3S, we can only support an MTU of up to
1500 bytes, but we can receive frames of up to (IIRC) 8192 bytes. I have
tested sending jumbo frames to an RZ/G2L device using both iperf3 and
ping and I see no errors.

* For iperf3 RX testing, the RZ/G2L is only responding with acks. These
  are small regardless of the size of the received packets, so the
  mis-match in MTU between the two hosts causes no issue.

* For ping testing, the RZ/G2L will give a fragmented response to the
  ping packet which the other host can reassemble.

For the RZ/G2H, we support sending frames of up to 2047 bytes but we can
receive frames of up to 4092 bytes. The driver will need a few more
changes to handle reception of packets >2kB in size, but this is
something we can do in the near future.

Is there any reason why we shouldn't support this? I am by no means an
expert in the Linux networking internals so there may be things I'm
missing.

Thanks,
Paul Barker June 17, 2024, 2:20 p.m. UTC | #4
On 15/06/2024 14:04, Niklas Söderlund wrote:
> Hi Paul,
> 
> Thanks for your work.
> 
> On 2024-06-15 11:30:38 +0100, Paul Barker wrote:
>> The RX frame size limit should not be based on the current MTU setting.
>> Instead it should be based on the hardware capabilities.
>>
>> Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
> 
> I agree with the change the RFLR.RFL setting should not be connected to 
> the MTU setting. And this likely comes from the early days of the driver 
> where neither Rx or Tx supported multiple descriptors for each packet.  
> In those days the single descriptor used for each packet was tied to the 
> MTU setting. So likely the fixes tag should point to a later commit?> 

If my understanding of MTU & MRU are correct, even with a single
descriptor we can always accept the same number of bytes regardless of
the current MTU.

> This is a great find and shows a flaw in the driver. But limiting the 
> size of each descriptor used for Tx is only half the solution right? As 
> the driver now supports multiple descriptors for Tx (on GbEth) the 
> driver allows setting an MTU larger then the maximum size for single 
> descriptor on those devices. But the xmit function of the driver still 
> hardcode the maximum of 2 descriptors for each Tx packet. And it only 
> uses the two descriptors to align the data to hardware constrains.
> 
> Is it not incorrect for the driver to accept an MTU larger then the 
> maximum size of a single descriptor with the current Tx implementation?  
> The driver can only support larger MTU settings for Rx, but not Tx ATM.
> 
> I think the complete fix is to extend ravb_start_xmit() to fully support 
> split descriptors for packets larger then the maximum single descriptor 
> size. Not just to align the packet between a DT_FSTART and DT_FEND 
> descriptor when needed.

For the RZ SoCs I have looked at, this isn't an issue. We support
transmitting frames of slightly over 1500 bytes on the SoCs with GbEth
IP, or 2047 bytes on the SoCs with R-Car AVB IP (e.g. RZ/G2H). Given
that a single descriptor can cover up to 4095 bytes of data (with its 12
bit DS field), we don't need split TX descriptors for either of these.

Thanks,
Andrew Lunn June 17, 2024, 2:29 p.m. UTC | #5
On Mon, Jun 17, 2024 at 03:03:21PM +0100, Paul Barker wrote:
> On 16/06/2024 02:23, Andrew Lunn wrote:
> > On Sat, Jun 15, 2024 at 11:30:38AM +0100, Paul Barker wrote:
> >> The RX frame size limit should not be based on the current MTU setting.
> >> Instead it should be based on the hardware capabilities.
> > 
> > This is a bit odd. MTU is Maximum Transmission Unit, so clearly is
> > about Tx. MRU does not really exist. Does TCP allow for asymmetric
> > MTU/MRU? Does MTU discovery work correctly for this?
> > 
> > In general, it seems like drivers implement min(MTU, MRU) and nothing
> > more. Do you have a real use case for this asymmetry?
> > 
> >       Andrew
> 
> Hi Andrew,
> 
> This change is based on my understanding of MTU/MRU, on the specs of the
> RZ SoCs I'm working with (primarily RZ/G2L family, RZ/G3S and RZ/G2H)
> and on some testing. My goal here is just to make the capabilities of
> the hardware available to users.
> 
> For the RZ/G2L family and RZ/G3S, we can only support an MTU of up to
> 1500 bytes, but we can receive frames of up to (IIRC) 8192 bytes. I have
> tested sending jumbo frames to an RZ/G2L device using both iperf3 and
> ping and I see no errors.
> 
> * For iperf3 RX testing, the RZ/G2L is only responding with acks. These
>   are small regardless of the size of the received packets, so the
>   mis-match in MTU between the two hosts causes no issue.
> 
> * For ping testing, the RZ/G2L will give a fragmented response to the
>   ping packet which the other host can reassemble.
> 
> For the RZ/G2H, we support sending frames of up to 2047 bytes but we can
> receive frames of up to 4092 bytes. The driver will need a few more
> changes to handle reception of packets >2kB in size, but this is
> something we can do in the near future.
> 
> Is there any reason why we shouldn't support this? I am by no means an
> expert in the Linux networking internals so there may be things I'm
> missing.

I've no really experience what the Linux stack will do. I have seen a
number of devices which do not limit received packets to the MTU,
i.e. they are happy to receive bigger packets. And i don't think this
has caused an issue. But is it ever actually used? Do users setup
asymmetric jumbo systems? I've no idea.

One thing i suggest you test is you put this in the middle of two
systems which do support jumbo.

     4K       1.5K       4K
A <-----> B <-----> C <------> D.

Make your device B and C, setting the MTU as supported. But set A and
D with a jumbo MTU which your systems should be able to receive. Do IP
routing along the chain. And then do some iperf transfers. I would
test both IPv4 and IPv6, since MTU path discovery in IPv6 is used a
lot, and gateways are not supposed to fragment.

I'm assuming here your device does actually have two interfaces?  If
it is only ever an end system, that simplifies it a lot, and these
tests are not needed.

    Andrew
Sergey Shtylyov June 17, 2024, 8:18 p.m. UTC | #6
On 6/15/24 1:30 PM, Paul Barker wrote:

> The RX frame size limit should not be based on the current MTU setting.
> Instead it should be based on the hardware capabilities.
> 
> Fixes: c156633f1353 ("Renesas Ethernet AVB driver proper")
> Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>

Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>

   Sounds like this is also a fix for net.git tho?

[...]

> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
> index 02cbf850bd85..481c854cb305 100644
> --- a/drivers/net/ethernet/renesas/ravb_main.c
> +++ b/drivers/net/ethernet/renesas/ravb_main.c
> @@ -555,8 +555,10 @@ static void ravb_emac_init_gbeth(struct net_device *ndev)
>  
>  static void ravb_emac_init_rcar(struct net_device *ndev)
>  {
> +	struct ravb_private *priv = netdev_priv(ndev);
> +
>  	/* Receive frame limit set register */
> -	ravb_write(ndev, ndev->mtu + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN, RFLR);
> +	ravb_write(ndev, priv->info->rx_max_frame_size + ETH_FCS_LEN, RFLR);

   Aha, that's what we're doing in ravb_emac_init_gbeth()...

[...]

MBR, Sergey
Paul Barker Aug. 13, 2024, 1:29 p.m. UTC | #7
On 17/06/2024 15:29, Andrew Lunn wrote:
> On Mon, Jun 17, 2024 at 03:03:21PM +0100, Paul Barker wrote:
>> On 16/06/2024 02:23, Andrew Lunn wrote:
>>> On Sat, Jun 15, 2024 at 11:30:38AM +0100, Paul Barker wrote:
>>>> The RX frame size limit should not be based on the current MTU setting.
>>>> Instead it should be based on the hardware capabilities.
>>>
>>> This is a bit odd. MTU is Maximum Transmission Unit, so clearly is
>>> about Tx. MRU does not really exist. Does TCP allow for asymmetric
>>> MTU/MRU? Does MTU discovery work correctly for this?
>>>
>>> In general, it seems like drivers implement min(MTU, MRU) and nothing
>>> more. Do you have a real use case for this asymmetry?
>>>
>>>       Andrew
>>
>> Hi Andrew,
>>
>> This change is based on my understanding of MTU/MRU, on the specs of the
>> RZ SoCs I'm working with (primarily RZ/G2L family, RZ/G3S and RZ/G2H)
>> and on some testing. My goal here is just to make the capabilities of
>> the hardware available to users.
>>
>> For the RZ/G2L family and RZ/G3S, we can only support an MTU of up to
>> 1500 bytes, but we can receive frames of up to (IIRC) 8192 bytes. I have
>> tested sending jumbo frames to an RZ/G2L device using both iperf3 and
>> ping and I see no errors.
>>
>> * For iperf3 RX testing, the RZ/G2L is only responding with acks. These
>>   are small regardless of the size of the received packets, so the
>>   mis-match in MTU between the two hosts causes no issue.
>>
>> * For ping testing, the RZ/G2L will give a fragmented response to the
>>   ping packet which the other host can reassemble.
>>
>> For the RZ/G2H, we support sending frames of up to 2047 bytes but we can
>> receive frames of up to 4092 bytes. The driver will need a few more
>> changes to handle reception of packets >2kB in size, but this is
>> something we can do in the near future.
>>
>> Is there any reason why we shouldn't support this? I am by no means an
>> expert in the Linux networking internals so there may be things I'm
>> missing.
> 
> I've no really experience what the Linux stack will do. I have seen a
> number of devices which do not limit received packets to the MTU,
> i.e. they are happy to receive bigger packets. And i don't think this
> has caused an issue. But is it ever actually used? Do users setup
> asymmetric jumbo systems? I've no idea.
> 
> One thing i suggest you test is you put this in the middle of two
> systems which do support jumbo.
> 
>      4K       1.5K       4K
> A <-----> B <-----> C <------> D.
> 
> Make your device B and C, setting the MTU as supported. But set A and
> D with a jumbo MTU which your systems should be able to receive. Do IP
> routing along the chain. And then do some iperf transfers. I would
> test both IPv4 and IPv6, since MTU path discovery in IPv6 is used a
> lot, and gateways are not supposed to fragment.
> 
> I'm assuming here your device does actually have two interfaces?  If
> it is only ever an end system, that simplifies it a lot, and these
> tests are not needed.
> 
>     Andrew

Apologies, my response here is abysmally late due to illness, other
priorities and then the loss of my main dev box.

As you've said, a number of devices do not limit received packet size to
the MTU. There are many applications, other than a gateway, where using
jumbo packets in even just one direction would be beneficial. For
example if an application needs to receive large amounts of data but
only needs to send back control and acknowledgement messages. I think we
should support this where possible. This is the thought behind the first
patch in this series as the GbEth IP present in the RZ/G2L and other
Renesas SoCs has a very asymmetric capability (it can receive 8000 byte
frames but only transmit 1522 byte frames).

If we explicitly do not wish to support this, that restriction should be
documented and then (maybe over time) handled uniformly for all network
drivers.

I'm planning to submit v2 of this series shortly.

Thanks,
Andrew Lunn Aug. 13, 2024, 2:06 p.m. UTC | #8
> Apologies, my response here is abysmally late due to illness, other
> priorities and then the loss of my main dev box.

Not a problem, life happens.

> As you've said, a number of devices do not limit received packet size to
> the MTU. There are many applications, other than a gateway, where using
> jumbo packets in even just one direction would be beneficial. For
> example if an application needs to receive large amounts of data but
> only needs to send back control and acknowledgement messages. I think we
> should support this where possible. This is the thought behind the first
> patch in this series as the GbEth IP present in the RZ/G2L and other
> Renesas SoCs has a very asymmetric capability (it can receive 8000 byte
> frames but only transmit 1522 byte frames).
> 
> If we explicitly do not wish to support this, that restriction should be
> documented and then (maybe over time) handled uniformly for all network
> drivers.
> 
> I'm planning to submit v2 of this series shortly.

Does the hardware support scatter/gather? How does supporting jumbo
receive affect memory usage? Can you give the hardware a number of 2K
buffers, and it will use one for a typical packet, and 4 for a jumbo
frame?

	Andrew
Paul Barker Aug. 15, 2024, 9:10 a.m. UTC | #9
On 13/08/2024 15:06, Andrew Lunn wrote:
>> Apologies, my response here is abysmally late due to illness, other
>> priorities and then the loss of my main dev box.
> 
> Not a problem, life happens.
> 
>> As you've said, a number of devices do not limit received packet size to
>> the MTU. There are many applications, other than a gateway, where using
>> jumbo packets in even just one direction would be beneficial. For
>> example if an application needs to receive large amounts of data but
>> only needs to send back control and acknowledgement messages. I think we
>> should support this where possible. This is the thought behind the first
>> patch in this series as the GbEth IP present in the RZ/G2L and other
>> Renesas SoCs has a very asymmetric capability (it can receive 8000 byte
>> frames but only transmit 1522 byte frames).
>>
>> If we explicitly do not wish to support this, that restriction should be
>> documented and then (maybe over time) handled uniformly for all network
>> drivers.
>>
>> I'm planning to submit v2 of this series shortly.
> 
> Does the hardware support scatter/gather? How does supporting jumbo
> receive affect memory usage? Can you give the hardware a number of 2K
> buffers, and it will use one for a typical packet, and 4 for a jumbo
> frame?

This is exactly what happens. After recent changes [1], we use 2kB RX
buffers and an 8kB maximum RX frame size for the GbEth IP. The hardware
will split the received frame over one or more buffers as needed. As we
would allocate a ring of 2kB buffers in any case, supporting jumbo
packets doesn't cause any increase in memory usage or in CPU time spent
in memory management.

[1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=966726324b7b14009216fda33b47e0bc003944c6

Thanks,
diff mbox series

Patch

diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index 02cbf850bd85..481c854cb305 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -555,8 +555,10 @@  static void ravb_emac_init_gbeth(struct net_device *ndev)
 
 static void ravb_emac_init_rcar(struct net_device *ndev)
 {
+	struct ravb_private *priv = netdev_priv(ndev);
+
 	/* Receive frame limit set register */
-	ravb_write(ndev, ndev->mtu + ETH_HLEN + VLAN_HLEN + ETH_FCS_LEN, RFLR);
+	ravb_write(ndev, priv->info->rx_max_frame_size + ETH_FCS_LEN, RFLR);
 
 	/* EMAC Mode: PAUSE prohibition; Duplex; RX Checksum; TX; RX */
 	ravb_write(ndev, ECMR_ZPF | ECMR_DM |