diff mbox series

[6.1.y,cip,7/8] net: ravb: Use NAPI threaded mode on 1-core CPUs with GbEth IP

Message ID 20240729150049.1924352-8-paul.barker.ct@bp.renesas.com (mailing list archive)
State New
Headers show
Series Backport ravb driver performance improvements | expand

Commit Message

Paul Barker July 29, 2024, 3 p.m. UTC
commit 65c482bc226ab25a7884ee6fd1cc09bb08cbd1be upstream.

NAPI Threaded mode (along with the previously enabled SW IRQ Coalescing)
is required to improve network stack performance for single core SoCs
using the GbEth IP (currently the RZ/G2L SoC family and the RZ/G3S SoC).

This patch gives the following improvements during testing with iperf3.

  * RZ/G2UL:
    * TCP TX: +32% bandwidth (638Mbps -> 841Mbps)
    * TXP RX: +8.8% bandwidth (667Mbps -> 726Mbps)
    * UDP RX: +104% bandwidth (53Mbps -> 108Mbps)

  * RZ/G3S:
    * TCP TX: 29% bandwidth (529Mbps -> 681Mbps)
    * UDP RX: +1290% bandwidth (6.46Mbps -> 90Mbps)

  * RZ/Five:
    * UDP RX: Test no longer crashes (0 -> 20 Mbps)

This patch gives the following reductions in performance in the same
testing:

  * RZ/G2UL:
    * UDP TX: -7.5% bandwidth (594Mbps -> 549Mbps)

  * RZ/G3S:
    * UDP TX: -5% bandwidth (625Mbps -> 594Mbps)

These losses are considered acceptable given the benefits shown above.
If UDP TX bandwidth must be maximised for a particular use case, NAPI
threaded mode can be disabled at runtime via sysfs writes.

The improvement of UDP RX bandwidth for the single core SoCs (RZ/G2UL &
RZ/G3S) is particularly critical.

Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
Reviewed-by: Sergey Shtylyov <s.shtylyov@omp.ru>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>

Signed-off-by: Paul Barker <paul.barker.ct@bp.renesas.com>
---
 drivers/net/ethernet/renesas/ravb_main.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Pavel Machek July 31, 2024, 7:56 a.m. UTC | #1
Hi!

> NAPI Threaded mode (along with the previously enabled SW IRQ Coalescing)
> is required to improve network stack performance for single core SoCs
> using the GbEth IP (currently the RZ/G2L SoC family and the RZ/G3S SoC).
> 
> This patch gives the following improvements during testing with iperf3.
> 
>   * RZ/G2UL:
>     * TCP TX: +32% bandwidth (638Mbps -> 841Mbps)
>     * TXP RX: +8.8% bandwidth (667Mbps -> 726Mbps)
>     * UDP RX: +104% bandwidth (53Mbps -> 108Mbps)
> 
>   * RZ/G3S:
>     * TCP TX: 29% bandwidth (529Mbps -> 681Mbps)
>     * UDP RX: +1290% bandwidth (6.46Mbps -> 90Mbps)
> 
>   * RZ/Five:
>     * UDP RX: Test no longer crashes (0 -> 20 Mbps)

I believe you'll really need to get the test to work in both
configurations as you still allow user to select.

> +++ b/drivers/net/ethernet/renesas/ravb_main.c
> @@ -2953,8 +2953,11 @@ static int ravb_probe(struct platform_device *pdev)
>  	if (info->nc_queues)
>  		netif_napi_add(ndev, &priv->napi[RAVB_NC], ravb_poll);
>  
> -	if (info->coalesce_irqs)
> +	if (info->coalesce_irqs) {
>  		netdev_sw_irq_coalesce_default_on(ndev);
> +		if (num_present_cpus() == 1)
> +			dev_set_threaded(ndev, true);
> +	}
>  

And as this heuristic is not really reliable. Users can offline CPUs,
for example, so you can have machine with num_present_cpus() == 2 but
only single CPU online. I'm running in similar configuration on my
notebook at the moment.

Best regards,
								Pavel
diff mbox series

Patch

diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
index cdc3c3be9c4c..fb34491d1fd0 100644
--- a/drivers/net/ethernet/renesas/ravb_main.c
+++ b/drivers/net/ethernet/renesas/ravb_main.c
@@ -2953,8 +2953,11 @@  static int ravb_probe(struct platform_device *pdev)
 	if (info->nc_queues)
 		netif_napi_add(ndev, &priv->napi[RAVB_NC], ravb_poll);
 
-	if (info->coalesce_irqs)
+	if (info->coalesce_irqs) {
 		netdev_sw_irq_coalesce_default_on(ndev);
+		if (num_present_cpus() == 1)
+			dev_set_threaded(ndev, true);
+	}
 
 	/* Network device register */
 	error = register_netdev(ndev);