[net,v3,1/1] net: stmmac: Prevent DSA tags from breaking COE

Message ID 20240108111747.73872-2-romain.gantois@bootlin.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/series_format            success  Posting correctly formatted
netdev/tree_selection           success  Clearly marked for net
netdev/ynl                      success  SINGLE THREAD; Generated files up to date; no warnings/errors; no diff in generated
netdev/fixes_present            success  Fixes tag present in non-next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 1113 this patch: 1113
netdev/cc_maintainers           success  CCed 0 of 0 maintainers
netdev/build_clang              success  Errors and warnings before: 1140 this patch: 1140
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 1140 this patch: 1140
netdev/checkpatch               warning  WARNING: line length of 81 exceeds 80 columns; WARNING: line length of 93 exceeds 80 columns
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            fail     Was 0 now: 1

Commit Message

Romain Gantois Jan. 8, 2024, 11:17 a.m. UTC
Some DSA tagging protocols change the EtherType field in the MAC header
e.g.  DSA_TAG_PROTO_(DSA/EDSA/BRCM/MTK/RTL4C_A/SJA1105). On TX these tagged
frames are ignored by the checksum offload engine and IP header checker of
some stmmac cores.

On RX, the stmmac driver wrongly assumes that checksums have been computed
for these tagged packets, and sets CHECKSUM_UNNECESSARY.

Add an additional check in the stmmac TX and RX hotpaths so that COE is
deactivated for packets with ethertypes that will not trigger the COE and
IP header checks.

Fixes: 6b2c6e4a938f ("net: stmmac: propagate feature flags to vlan")
Cc: stable@vger.kernel.org
Reported-by: Richard Tresidder <rtresidd@electromag.com.au>
Link: https://lore.kernel.org/netdev/e5c6c75f-2dfa-4e50-a1fb-6bf4cdb617c2@electromag.com.au/
Reported-by: Romain Gantois <romain.gantois@bootlin.com>
Link: https://lore.kernel.org/netdev/c57283ed-6b9b-b0e6-ee12-5655c1c54495@bootlin.com/
Signed-off-by: Romain Gantois <romain.gantois@bootlin.com>
---
 .../net/ethernet/stmicro/stmmac/stmmac_main.c | 23 ++++++++++++++++---
 1 file changed, 20 insertions(+), 3 deletions(-)

Comments

Vladimir Oltean Jan. 8, 2024, 1:02 p.m. UTC | #1
On Mon, Jan 08, 2024 at 12:17:45PM +0100, Romain Gantois wrote:
> Some DSA tagging protocols change the EtherType field in the MAC header
> e.g.  DSA_TAG_PROTO_(DSA/EDSA/BRCM/MTK/RTL4C_A/SJA1105). On TX these tagged
> frames are ignored by the checksum offload engine and IP header checker of
> some stmmac cores.
> 
> On RX, the stmmac driver wrongly assumes that checksums have been computed
> for these tagged packets, and sets CHECKSUM_UNNECESSARY.
> 
> Add an additional check in the stmmac TX and RX hotpaths so that COE is
> deactivated for packets with ethertypes that will not trigger the COE and
> IP header checks.
> 
> Fixes: 6b2c6e4a938f ("net: stmmac: propagate feature flags to vlan")
> Cc: stable@vger.kernel.org
> Reported-by: Richard Tresidder <rtresidd@electromag.com.au>
> Link: https://lore.kernel.org/netdev/e5c6c75f-2dfa-4e50-a1fb-6bf4cdb617c2@electromag.com.au/
> Reported-by: Romain Gantois <romain.gantois@bootlin.com>
> Link: https://lore.kernel.org/netdev/c57283ed-6b9b-b0e6-ee12-5655c1c54495@bootlin.com/
> Signed-off-by: Romain Gantois <romain.gantois@bootlin.com>
> ---

Reviewed-by: Vladimir Oltean <vladimir.oltean@nxp.com>

>  .../net/ethernet/stmicro/stmmac/stmmac_main.c | 23 ++++++++++++++++---
>  1 file changed, 20 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index a9b6b383e863..6797c944a2ac 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -4371,6 +4371,19 @@ static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
>  	return NETDEV_TX_OK;
>  }
>  
> +/* Check if ethertype will trigger IP
> + * header checks/COE in hardware
> + */

Nitpick: you could render this in kernel-doc format.
https://docs.kernel.org/doc-guide/kernel-doc.html

> +static inline bool stmmac_has_ip_ethertype(struct sk_buff *skb)

Nitpick: in netdev it is preferred not to use the "inline" keyword at
all in C files, only "static inline" in headers, and to let the compiler
decide by itself when it is appropriate to inline the code (which it
does by itself even without the "inline" keyword). For a bit more
background why, you can view Documentation/process/4.Coding.rst, section
"Inline functions".

> +{
> +	int depth = 0;
> +	__be16 proto;
> +
> +	proto = __vlan_get_protocol(skb, eth_header_parse_protocol(skb), &depth);
> +
> +	return depth <= ETH_HLEN && (proto == htons(ETH_P_IP) || proto == htons(ETH_P_IPV6));
> +}
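
For reference, the two nitpicks above (kernel-doc formatting and dropping "inline") could be applied roughly as follows. This is only a sketch: the function body is taken from the v3 patch, but the kernel-doc wording, the include list and the line wrapping are assumptions, and the snippet only builds inside a kernel tree.

```c
#include <linux/etherdevice.h>
#include <linux/if_ether.h>
#include <linux/if_vlan.h>
#include <linux/skbuff.h>

/**
 * stmmac_has_ip_ethertype() - Check for an IPv4/IPv6 EtherType
 * @skb: socket buffer holding the frame to inspect
 *
 * The description text here is illustrative. The logic is unchanged from
 * the v3 patch; only "inline" was dropped and long lines were wrapped.
 *
 * Return: true if the EtherType will trigger IP header checks/COE in
 * hardware, false otherwise.
 */
static bool stmmac_has_ip_ethertype(struct sk_buff *skb)
{
	int depth = 0;
	__be16 proto;

	proto = __vlan_get_protocol(skb, eth_header_parse_protocol(skb),
				    &depth);

	return depth <= ETH_HLEN &&
	       (proto == htons(ETH_P_IP) || proto == htons(ETH_P_IPV6));
}
```

Wrapping the return statement this way would also keep it under the 80-column limit flagged by checkpatch.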
Miquel Raynal Jan. 8, 2024, 1:29 p.m. UTC | #2
Hi Romain,

> > +/* Check if ethertype will trigger IP
> > + * header checks/COE in hardware
> > + */  
> 
> Nitpick: you could render this in kernel-doc format.
> https://docs.kernel.org/doc-guide/kernel-doc.html
> 
> > +static inline bool stmmac_has_ip_ethertype(struct sk_buff *skb)  
> 
> Nitpick: in netdev it is preferred not to use the "inline" keyword at
> all in C files, only "static inline" in headers, and to let the compiler
> decide by itself when it is appropriate to inline the code (which it
> does by itself even without the "inline" keyword). For a bit more
> background why, you can view Documentation/process/4.Coding.rst, section
> "Inline functions".
> 
> > +{
> > +	int depth = 0;
> > +	__be16 proto;
> > +
> > +	proto = __vlan_get_protocol(skb, eth_header_parse_protocol(skb), &depth);
> > +
> > +	return depth <= ETH_HLEN && (proto == htons(ETH_P_IP) || proto == htons(ETH_P_IPV6));

I also want to nitpick a bit :) If you are to send a v4, maybe you can
enclose the first condition within parentheses to further clarify the
return logic.

Cheers,
Miquèl
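
The suggested parentheses are purely cosmetic: in C, relational operators such as <= bind tighter than &&, so both forms evaluate identically. A minimal userspace sketch of that equivalence follows. The helper names check_v3(), check_parenthesized() and compare_forms() are hypothetical, the constants are local stand-ins for the kernel definitions, and EtherTypes are compared in host byte order for simplicity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the kernel constants of the same name. */
#define ETH_HLEN   14      /* Ethernet header length in bytes */
#define ETH_P_IP   0x0800  /* IPv4 EtherType */
#define ETH_P_IPV6 0x86DD  /* IPv6 EtherType */

/* Return expression shaped like the one posted in v3. */
static bool check_v3(int depth, uint16_t proto)
{
	return depth <= ETH_HLEN && (proto == ETH_P_IP || proto == ETH_P_IPV6);
}

/* Same expression with the first condition parenthesized. */
static bool check_parenthesized(int depth, uint16_t proto)
{
	return (depth <= ETH_HLEN) &&
	       (proto == ETH_P_IP || proto == ETH_P_IPV6);
}

/* Compare both forms over a range of depths and a few EtherTypes. */
static void compare_forms(void)
{
	static const uint16_t protos[] = { ETH_P_IP, ETH_P_IPV6, 0x0806, 0xDADA };

	for (int depth = 0; depth <= 2 * ETH_HLEN; depth++)
		for (size_t i = 0; i < sizeof(protos) / sizeof(protos[0]); i++)
			assert(check_v3(depth, protos[i]) ==
			       check_parenthesized(depth, protos[i]));
}
```

Calling compare_forms() should complete without tripping an assertion, so the choice between the two forms is a matter of readability only.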
Romain Gantois Jan. 8, 2024, 2:23 p.m. UTC | #3
On Mon, 8 Jan 2024, Vladimir Oltean wrote:

...

> Nitpick: you could render this in kernel-doc format.
> https://docs.kernel.org/doc-guide/kernel-doc.html
> 
> > +static inline bool stmmac_has_ip_ethertype(struct sk_buff *skb)
> 
> Nitpick: in netdev it is preferred not to use the "inline" keyword at
> all in C files, only "static inline" in headers, and to let the compiler
> decide by itself when it is appropriate to inline the code (which it
> does by itself even without the "inline" keyword). For a bit more
> background why, you can view Documentation/process/4.Coding.rst, section
> "Inline functions".

I see, the kernel docs were indeed enlightening on this point. As a side note,
I've just benchmarked both the "with-inline" and "without-inline" versions.
First of all, objdump seems to confirm that GCC does indeed honor the keyword
in this particular case. Also, RX performance is better with
stmmac_has_ip_ethertype inlined, but TX performance is consistently worse with
this function inlined, which could very well be caused by cache effects.

In any case, I think it is better to remove the "inline" keyword as you said.
I'll do that in v4.

Best Regards,
Vladimir Oltean Jan. 8, 2024, 2:36 p.m. UTC | #4
On Mon, Jan 08, 2024 at 03:23:38PM +0100, Romain Gantois wrote:
> I see, the kernel docs were indeed enlightening on this point. As a side note,
> I've just benchmarked both the "with-inline" and "without-inline" versions.
> First of all, objdump seems to confirm that GCC does indeed honor the keyword
> in this particular case. Also, RX performance is better with
> stmmac_has_ip_ethertype inlined, but TX performance is consistently worse with
> this function inlined, which could very well be caused by cache effects.
>
> In any case, I think it is better to remove the "inline" keyword as you said.
> I'll do that in v4.

Are you doing any code instrumentation, or just measuring the results
and deducing what might cause them?

It might be worth looking at the perf events and seeing which function
consumes the most time.

CPU_CORE=0
perf record -e cycles -C $CPU_CORE sleep 10 && perf report
perf record -e cache-misses -C $CPU_CORE sleep 10 && perf report
Romain Gantois Jan. 9, 2024, 3:16 p.m. UTC | #5
On Mon, 8 Jan 2024, Vladimir Oltean wrote:

> On Mon, Jan 08, 2024 at 03:23:38PM +0100, Romain Gantois wrote:
> > I see, the kernel docs were indeed enlightening on this point. As a side note,
> > I've just benchmarked both the "with-inline" and "without-inline" versions.
> > First of all, objdump seems to confirm that GCC does indeed honor the keyword
> > in this particular case. Also, RX performance is better with
> > stmmac_has_ip_ethertype inlined, but TX performance is consistently worse with
> > this function inlined, which could very well be caused by cache effects.
> >
> > In any case, I think it is better to remove the "inline" keyword as you said.
> > I'll do that in v4.
> 
> Are you doing any code instrumentation, or just measuring the results
> and deducing what might cause them?
> 
> It might be worth looking at the perf events and seeing which function
> consumes the most time.
> 
> CPU_CORE=0
> perf record -e cycles -C $CPU_CORE sleep 10 && perf report
> perf record -e cache-misses -C $CPU_CORE sleep 10 && perf report
> 

Unfortunately my hardware doesn't support these performance metrics, but I did 
manage to do some instrumentation with the ftrace profiler:

Same test conditions as before: 10-second iperf3 runs with unfragmented UDP
packets.

no inline TX:
  average time per call for stmmac_xmit(): 85us
  average time per call for stmmac_has_ip_ethertype(): 2us

no inline RX:
  average time per call for stmmac_napi_poll_rx(): 8142us
  average time per call for stmmac_has_ip_ethertype(): 2us

inline TX:
  average time per call for stmmac_xmit(): 85us

inline RX:
  average time per call for stmmac_napi_poll_rx(): 8410us

It seems like this time, RX performed slightly worse with the function inlined.
To be honest, I'm starting to doubt the reproducibility of these tests. In any
case, it seems better to just remove the "inline" and let GCC do the optimizing.

Best Regards,
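
To put the ftrace figures above in perspective, the relative change between the two builds can be computed directly. The helper below is a hypothetical illustration, not part of the patch. With the reported averages, stmmac_napi_poll_rx() goes from 8142us to 8410us, a difference of roughly 3.3%, while stmmac_xmit() stays at 85us (0%).

```c
#include <assert.h>

/* Relative change from 'before' to 'after', expressed in percent. */
static double rel_change_pct(double before, double after)
{
	assert(before > 0.0);	/* avoid dividing by zero */
	return (after - before) / before * 100.0;
}
```

For example, rel_change_pct(8142.0, 8410.0) gives the RX figure and rel_change_pct(85.0, 85.0) the TX one. A difference of a few percent between single runs may well be within measurement noise, which would be consistent with the reproducibility doubts expressed above.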
Patch

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index a9b6b383e863..6797c944a2ac 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -4371,6 +4371,19 @@  static netdev_tx_t stmmac_tso_xmit(struct sk_buff *skb, struct net_device *dev)
 	return NETDEV_TX_OK;
 }
 
+/* Check if ethertype will trigger IP
+ * header checks/COE in hardware
+ */
+static inline bool stmmac_has_ip_ethertype(struct sk_buff *skb)
+{
+	int depth = 0;
+	__be16 proto;
+
+	proto = __vlan_get_protocol(skb, eth_header_parse_protocol(skb), &depth);
+
+	return depth <= ETH_HLEN && (proto == htons(ETH_P_IP) || proto == htons(ETH_P_IPV6));
+}
+
 /**
  *  stmmac_xmit - Tx entry point of the driver
  *  @skb : the socket buffer
@@ -4435,9 +4448,13 @@  static netdev_tx_t stmmac_xmit(struct sk_buff *skb, struct net_device *dev)
 	/* DWMAC IPs can be synthesized to support tx coe only for a few tx
 	 * queues. In that case, checksum offloading for those queues that don't
 	 * support tx coe needs to fallback to software checksum calculation.
+	 *
+	 * Packets that won't trigger the COE e.g. most DSA-tagged packets will
+	 * also have to be checksummed in software.
 	 */
 	if (csum_insertion &&
-	    priv->plat->tx_queues_cfg[queue].coe_unsupported) {
+	    (priv->plat->tx_queues_cfg[queue].coe_unsupported ||
+	    !stmmac_has_ip_ethertype(skb))) {
 		if (unlikely(skb_checksum_help(skb)))
 			goto dma_map_err;
 		csum_insertion = !csum_insertion;
@@ -4997,7 +5014,7 @@  static void stmmac_dispatch_skb_zc(struct stmmac_priv *priv, u32 queue,
 	stmmac_rx_vlan(priv->dev, skb);
 	skb->protocol = eth_type_trans(skb, priv->dev);
 
-	if (unlikely(!coe))
+	if (unlikely(!coe) || !stmmac_has_ip_ethertype(skb))
 		skb_checksum_none_assert(skb);
 	else
 		skb->ip_summed = CHECKSUM_UNNECESSARY;
@@ -5513,7 +5530,7 @@  static int stmmac_rx(struct stmmac_priv *priv, int limit, u32 queue)
 		stmmac_rx_vlan(priv->dev, skb);
 		skb->protocol = eth_type_trans(skb, priv->dev);
 
-		if (unlikely(!coe))
+		if (unlikely(!coe) || !stmmac_has_ip_ethertype(skb))
 			skb_checksum_none_assert(skb);
 		else
 			skb->ip_summed = CHECKSUM_UNNECESSARY;
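
To illustrate the effect of the new check, here is a minimal userspace model of it, operating on raw frame bytes instead of an sk_buff. It is a sketch under several assumptions: the function names, the csum_state enum and the constants are local stand-ins, and the depth accounting (14 bytes for an untagged header, plus 4 bytes per in-band VLAN tag) is a simplified approximation of what the kernel's __vlan_get_protocol() reports.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Local stand-ins for the kernel constants of the same name. */
#define ETH_ALEN     6
#define ETH_HLEN     14
#define VLAN_HLEN    4
#define ETH_P_IP     0x0800
#define ETH_P_IPV6   0x86DD
#define ETH_P_8021Q  0x8100
#define ETH_P_8021AD 0x88A8

/* Hypothetical stand-ins for CHECKSUM_NONE / CHECKSUM_UNNECESSARY. */
enum csum_state { CSUM_NONE, CSUM_UNNECESSARY };

/* Read a big-endian 16-bit value from the frame. */
static uint16_t read_be16(const uint8_t *p)
{
	return (uint16_t)((p[0] << 8) | p[1]);
}

static bool is_vlan(uint16_t type)
{
	return type == ETH_P_8021Q || type == ETH_P_8021AD;
}

/* Model of the check: skip VLAN tags, then test depth and EtherType. */
static bool frame_has_ip_ethertype(const uint8_t *frame, size_t len)
{
	size_t depth = ETH_HLEN;
	uint16_t proto;

	assert(frame != NULL);
	if (len < ETH_HLEN)
		return false;

	proto = read_be16(frame + 2 * ETH_ALEN);
	while (is_vlan(proto)) {
		if (len < depth + VLAN_HLEN)
			return false;
		/* The encapsulated EtherType follows the 2-byte TCI. */
		proto = read_be16(frame + depth + 2);
		depth += VLAN_HLEN;
	}

	return depth <= ETH_HLEN &&
	       (proto == ETH_P_IP || proto == ETH_P_IPV6);
}

/* Model of the RX decision made after the patch. */
static enum csum_state rx_csum_state(bool coe, const uint8_t *frame,
				     size_t len)
{
	if (!coe || !frame_has_ip_ethertype(frame, len))
		return CSUM_NONE;
	return CSUM_UNNECESSARY;
}
```

In this model, a plain IPv4 frame on a COE-capable queue is marked CSUM_UNNECESSARY, whereas a frame whose EtherType was rewritten by a DSA tagger (for instance 0xDADA, used by EDSA) falls back to CSUM_NONE, so that the stack verifies the checksum in software.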