diff mbox series

[net-next,v1,2/9] net: dsa: tag_ar9331: detect IGMP and MLD packets

Message ID 20210403114848.30528-3-o.rempel@pengutronix.de (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series ar9331: mainline some parts of switch functionality | expand

Checks

Context Check Description
netdev/cover_letter success Link
netdev/fixes_present success Link
netdev/patch_count success Link
netdev/tree_selection success Clearly marked for net-next
netdev/subject_prefix success Link
netdev/cc_maintainers success CCed 7 of 7 maintainers
netdev/source_inline success Was 0 now: 0
netdev/verify_signedoff success Link
netdev/module_param success Was 0 now: 0
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/verify_fixes success Link
netdev/checkpatch warning CHECK: Please don't use multiple blank lines WARNING: networking block comments don't use an empty /* line, use /* Comment...
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/header_inline success Link

Commit Message

Oleksij Rempel April 3, 2021, 11:48 a.m. UTC
The ar9331 switch is not forwarding IGMP and MLD packets if IGMP
snooping is enabled. This patch is trying to mimic the HW heuristic to take
same decisions as this switch would do to be able to tell the linux
bridge if some packet was prabably forwarded or not.

Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
---
 net/dsa/tag_ar9331.c | 130 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 130 insertions(+)

Comments

Vladimir Oltean April 3, 2021, 1:03 p.m. UTC | #1
Hi Oleksij,

On Sat, Apr 03, 2021 at 01:48:41PM +0200, Oleksij Rempel wrote:
> The ar9331 switch is not forwarding IGMP and MLD packets if IGMP
> snooping is enabled. This patch is trying to mimic the HW heuristic to take
> same decisions as this switch would do to be able to tell the linux
> bridge if some packet was prabably forwarded or not.
> 
> Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> ---

I am not familiar with IGMP/MLD, therefore I don't really understand
what problem you are trying to solve.

Your switch has packet traps for IGMP and MLD, ok. So it doesn't forward
them. Must the IGMP/MLD packets be forwarded by an IGMP/MLD snooping
bridge? Which ones and under what circumstances?
Oleksij Rempel April 3, 2021, 1:26 p.m. UTC | #2
On Sat, Apr 03, 2021 at 04:03:18PM +0300, Vladimir Oltean wrote:
> Hi Oleksij,
> 
> On Sat, Apr 03, 2021 at 01:48:41PM +0200, Oleksij Rempel wrote:
> > The ar9331 switch is not forwarding IGMP and MLD packets if IGMP
> > snooping is enabled. This patch is trying to mimic the HW heuristic to take
> > same decisions as this switch would do to be able to tell the linux
> > bridge if some packet was prabably forwarded or not.
> > 
> > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> > ---
> 
> I am not familiar with IGMP/MLD, therefore I don't really understand
> what problem you are trying to solve.
> 
> Your switch has packet traps for IGMP and MLD, ok. So it doesn't forward
> them. Must the IGMP/MLD packets be forwarded by an IGMP/MLD snooping
> bridge? Which ones and under what circumstances?

I'll better refer to the rfc:
https://tools.ietf.org/html/rfc4541

Regards,
Oleksij
Vladimir Oltean April 3, 2021, 1:46 p.m. UTC | #3
On Sat, Apr 03, 2021 at 03:26:36PM +0200, Oleksij Rempel wrote:
> On Sat, Apr 03, 2021 at 04:03:18PM +0300, Vladimir Oltean wrote:
> > Hi Oleksij,
> > 
> > On Sat, Apr 03, 2021 at 01:48:41PM +0200, Oleksij Rempel wrote:
> > > The ar9331 switch is not forwarding IGMP and MLD packets if IGMP
> > > snooping is enabled. This patch is trying to mimic the HW heuristic to take
> > > same decisions as this switch would do to be able to tell the linux
> > > bridge if some packet was prabably forwarded or not.
> > > 
> > > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> > > ---
> > 
> > I am not familiar with IGMP/MLD, therefore I don't really understand
> > what problem you are trying to solve.
> > 
> > Your switch has packet traps for IGMP and MLD, ok. So it doesn't forward
> > them. Must the IGMP/MLD packets be forwarded by an IGMP/MLD snooping
> > bridge? Which ones and under what circumstances?
> 
> I'll better refer to the rfc:
> https://tools.ietf.org/html/rfc4541

Ok, the question might have been a little bit dumb.
I found this PDF:
https://www.alliedtelesis.com/sites/default/files/documents/how-alliedware/howto_config_igmp1.pdf
and it explains that:
- a snooper floods the Membership Query messages from the network's
  querier towards all ports that are not blocked by STP
- a snooper forwards all Membership Report messages from a client
  towards the All Groups port (which is how it reaches the querier).

I'm asking this because I just want to understand what the bridge code
does. Does the code path for IGMP_HOST_MEMBERSHIP_REPORT (for example)
for a snooper go through should_deliver -> nbp_switchdev_allowed_egress,
which is what you are affecting here?
Andrew Lunn April 3, 2021, 2:49 p.m. UTC | #4
> @@ -31,6 +96,13 @@ static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb,
>  	__le16 *phdr;
>  	u16 hdr;
>  
> +	if (dp->stp_state == BR_STATE_BLOCKING) {
> +		/* TODO: should we reflect it in the stats? */
> +		netdev_warn_once(dev, "%s:%i dropping blocking packet\n",
> +				 __func__, __LINE__);
> +		return NULL;
> +	}
> +
>  	phdr = skb_push(skb, AR9331_HDR_LEN);
>  
>  	hdr = FIELD_PREP(AR9331_HDR_VERSION_MASK, AR9331_HDR_VERSION);

Hi Oleksij

This change does not seem to fit with what this patch is doing.

I also think it is wrong. You still need BPDU to pass through a
blocked port, otherwise spanning tree protocol will be unstable.

	Andrew
Oleksij Rempel April 3, 2021, 3:22 p.m. UTC | #5
On Sat, Apr 03, 2021 at 04:46:06PM +0300, Vladimir Oltean wrote:
> On Sat, Apr 03, 2021 at 03:26:36PM +0200, Oleksij Rempel wrote:
> > On Sat, Apr 03, 2021 at 04:03:18PM +0300, Vladimir Oltean wrote:
> > > Hi Oleksij,
> > > 
> > > On Sat, Apr 03, 2021 at 01:48:41PM +0200, Oleksij Rempel wrote:
> > > > The ar9331 switch is not forwarding IGMP and MLD packets if IGMP
> > > > snooping is enabled. This patch is trying to mimic the HW heuristic to take
> > > > same decisions as this switch would do to be able to tell the linux
> > > > bridge if some packet was prabably forwarded or not.
> > > > 
> > > > Signed-off-by: Oleksij Rempel <o.rempel@pengutronix.de>
> > > > ---
> > > 
> > > I am not familiar with IGMP/MLD, therefore I don't really understand
> > > what problem you are trying to solve.
> > > 
> > > Your switch has packet traps for IGMP and MLD, ok. So it doesn't forward
> > > them. Must the IGMP/MLD packets be forwarded by an IGMP/MLD snooping
> > > bridge? Which ones and under what circumstances?
> > 
> > I'll better refer to the rfc:
> > https://tools.ietf.org/html/rfc4541
> 
> Ok, the question might have been a little bit dumb.
> I found this PDF:
> https://www.alliedtelesis.com/sites/default/files/documents/how-alliedware/howto_config_igmp1.pdf
> and it explains that:
> - a snooper floods the Membership Query messages from the network's
>   querier towards all ports that are not blocked by STP
> - a snooper forwards all Membership Report messages from a client
>   towards the All Groups port (which is how it reaches the querier).
> 
> I'm asking this because I just want to understand what the bridge code
> does. Does the code path for IGMP_HOST_MEMBERSHIP_REPORT (for example)
> for a snooper go through should_deliver -> nbp_switchdev_allowed_egress,
> which is what you are affecting here?

yes.

the exact path should depend on this configuration option:
/sys/devices/virtual/net/test/bridge/multicast_snooping

I assume, some optimization can be done by letting DSA know the state
of multicast_snooping.

Off-topic question, this patch set stops to work after rebasing against
latest netdev. I get following warning:
ip l s lan0 master test
RTNETLINK answers: Invalid argumen

Are there some API changes?

Regards,
Oleksij
Vladimir Oltean April 3, 2021, 4:38 p.m. UTC | #6
On Sat, Apr 03, 2021 at 05:22:24PM +0200, Oleksij Rempel wrote:
> Off-topic question, this patch set stops to work after rebasing against
> latest netdev. I get following warning:
> ip l s lan0 master test
> RTNETLINK answers: Invalid argumen
> 
> Are there some API changes?

Yes, it's likely that you are returning -EINVAL to some of the functions
with which DSA calls you at .port_bridge_join time, see dsa_port_switchdev_sync.
Oleksij Rempel April 3, 2021, 5:14 p.m. UTC | #7
Am 03.04.21 um 16:49 schrieb Andrew Lunn:
>> @@ -31,6 +96,13 @@ static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb,
>>  	__le16 *phdr;
>>  	u16 hdr;
>>
>> +	if (dp->stp_state == BR_STATE_BLOCKING) {
>> +		/* TODO: should we reflect it in the stats? */
>> +		netdev_warn_once(dev, "%s:%i dropping blocking packet\n",
>> +				 __func__, __LINE__);
>> +		return NULL;
>> +	}
>> +
>>  	phdr = skb_push(skb, AR9331_HDR_LEN);
>>
>>  	hdr = FIELD_PREP(AR9331_HDR_VERSION_MASK, AR9331_HDR_VERSION);
>
> Hi Oleksij
>
> This change does not seem to fit with what this patch is doing.

done

> I also think it is wrong. You still need BPDU to pass through a
> blocked port, otherwise spanning tree protocol will be unstable.

We need a better filter, otherwise, in case of software based STP, we are leaking packages on
blocked port. For example DHCP do trigger lots of spam in the kernel log.

I'll drop STP patch for now, it will be better to make a generic soft STP for all switches without
HW offloading. For example ksz9477 is doing SW based STP in similar way.

--
Regards,
Oleksij
Vladimir Oltean April 4, 2021, 12:02 a.m. UTC | #8
On Sat, Apr 03, 2021 at 07:14:56PM +0200, Oleksij Rempel wrote:
> Am 03.04.21 um 16:49 schrieb Andrew Lunn:
> >> @@ -31,6 +96,13 @@ static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb,
> >>  	__le16 *phdr;
> >>  	u16 hdr;
> >>
> >> +	if (dp->stp_state == BR_STATE_BLOCKING) {
> >> +		/* TODO: should we reflect it in the stats? */
> >> +		netdev_warn_once(dev, "%s:%i dropping blocking packet\n",
> >> +				 __func__, __LINE__);
> >> +		return NULL;
> >> +	}
> >> +
> >>  	phdr = skb_push(skb, AR9331_HDR_LEN);
> >>
> >>  	hdr = FIELD_PREP(AR9331_HDR_VERSION_MASK, AR9331_HDR_VERSION);
> >
> > Hi Oleksij
> >
> > This change does not seem to fit with what this patch is doing.
> 
> done
> 
> > I also think it is wrong. You still need BPDU to pass through a
> > blocked port, otherwise spanning tree protocol will be unstable.
> 
> We need a better filter, otherwise, in case of software based STP, we are leaking packages on
> blocked port. For example DHCP do trigger lots of spam in the kernel log.

I have no idea whatsoever what 'software based STP' is, if you have
hardware-accelerated forwarding.

> I'll drop STP patch for now, it will be better to make a generic soft STP for all switches without
> HW offloading. For example ksz9477 is doing SW based STP in similar way.

How about we discuss first about what your switch is not doing properly?
Have you debugged more than just watching the bridge change port states?
As Andrew said, a port needs to accept and send link-local frames
regardless of the STP state. In the BLOCKING state it must send no other
frames and have address learning disabled. Is this what's happening, is
the switch forwarding frames towards a BLOCKING port?
Oleksij Rempel April 4, 2021, 5:35 a.m. UTC | #9
Am 04.04.21 um 02:02 schrieb Vladimir Oltean:
> On Sat, Apr 03, 2021 at 07:14:56PM +0200, Oleksij Rempel wrote:
>> Am 03.04.21 um 16:49 schrieb Andrew Lunn:
>>>> @@ -31,6 +96,13 @@ static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb,
>>>>  	__le16 *phdr;
>>>>  	u16 hdr;
>>>>
>>>> +	if (dp->stp_state == BR_STATE_BLOCKING) {
>>>> +		/* TODO: should we reflect it in the stats? */
>>>> +		netdev_warn_once(dev, "%s:%i dropping blocking packet\n",
>>>> +				 __func__, __LINE__);
>>>> +		return NULL;
>>>> +	}
>>>> +
>>>>  	phdr = skb_push(skb, AR9331_HDR_LEN);
>>>>
>>>>  	hdr = FIELD_PREP(AR9331_HDR_VERSION_MASK, AR9331_HDR_VERSION);
>>>
>>> Hi Oleksij
>>>
>>> This change does not seem to fit with what this patch is doing.
>>
>> done
>>
>>> I also think it is wrong. You still need BPDU to pass through a
>>> blocked port, otherwise spanning tree protocol will be unstable.
>>
>> We need a better filter, otherwise, in case of software based STP, we are leaking packages on
>> blocked port. For example DHCP do trigger lots of spam in the kernel log.
>
> I have no idea whatsoever what 'software based STP' is, if you have
> hardware-accelerated forwarding.

I do not mean hardware-accelerated forwarding, i mean
hardware-accelerated STP port state helpers.

>> I'll drop STP patch for now, it will be better to make a generic soft STP for all switches without
>> HW offloading. For example ksz9477 is doing SW based STP in similar way.
>
> How about we discuss first about what your switch is not doing properly?
> Have you debugged more than just watching the bridge change port states?
> As Andrew said, a port needs to accept and send link-local frames
> regardless of the STP state. In the BLOCKING state it must send no other
> frames and have address learning disabled. Is this what's happening, is
> the switch forwarding frames towards a BLOCKING port?

The switch is not forwarding BPDU frame to the CPU port. So, the linux
bridge will stack by cycling different state of the port where loop was
detected.

--
Regards,
Oleksij
Vladimir Oltean April 4, 2021, 12:58 p.m. UTC | #10
On Sun, Apr 04, 2021 at 07:35:26AM +0200, Oleksij Rempel wrote:
> Am 04.04.21 um 02:02 schrieb Vladimir Oltean:
> > On Sat, Apr 03, 2021 at 07:14:56PM +0200, Oleksij Rempel wrote:
> >> Am 03.04.21 um 16:49 schrieb Andrew Lunn:
> >>>> @@ -31,6 +96,13 @@ static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb,
> >>>>  	__le16 *phdr;
> >>>>  	u16 hdr;
> >>>>
> >>>> +	if (dp->stp_state == BR_STATE_BLOCKING) {
> >>>> +		/* TODO: should we reflect it in the stats? */
> >>>> +		netdev_warn_once(dev, "%s:%i dropping blocking packet\n",
> >>>> +				 __func__, __LINE__);
> >>>> +		return NULL;
> >>>> +	}
> >>>> +
> >>>>  	phdr = skb_push(skb, AR9331_HDR_LEN);
> >>>>
> >>>>  	hdr = FIELD_PREP(AR9331_HDR_VERSION_MASK, AR9331_HDR_VERSION);
> >>>
> >>> Hi Oleksij
> >>>
> >>> This change does not seem to fit with what this patch is doing.
> >>
> >> done
> >>
> >>> I also think it is wrong. You still need BPDU to pass through a
> >>> blocked port, otherwise spanning tree protocol will be unstable.
> >>
> >> We need a better filter, otherwise, in case of software based STP, we are leaking packages on
> >> blocked port. For example DHCP do trigger lots of spam in the kernel log.
> >
> > I have no idea whatsoever what 'software based STP' is, if you have
> > hardware-accelerated forwarding.
> 
> I do not mean hardware-accelerated forwarding, i mean
> hardware-accelerated STP port state helpers.

Still no clue what you mean, sorry.

> >> I'll drop STP patch for now, it will be better to make a generic soft STP for all switches without
> >> HW offloading. For example ksz9477 is doing SW based STP in similar way.
> >
> > How about we discuss first about what your switch is not doing properly?
> > Have you debugged more than just watching the bridge change port states?
> > As Andrew said, a port needs to accept and send link-local frames
> > regardless of the STP state. In the BLOCKING state it must send no other
> > frames and have address learning disabled. Is this what's happening, is
> > the switch forwarding frames towards a BLOCKING port?
> 
> The switch is not forwarding BPDU frame to the CPU port. So, the linux
> bridge will stack by cycling different state of the port where loop was
> detected.

The switch should not be 'forwarding' BPDU frames to the CPU port, it
should be 'trapping' them. The difference is subtle but important. Often
times switches have an Access Control List which allows them to steal
packets from the normal FDB-based forwarding path. It is probably the
case that your switch needs to be told to treat STP BPDUs specially and
not just 'forward' them.
To confirm whether I'm right or wrong, if you disable STP and send a
packet with MAC DA 01:80:c2:00:00:00 to the switch, will it flood it
towards all ports or will it only send them to the CPU?
diff mbox series

Patch

diff --git a/net/dsa/tag_ar9331.c b/net/dsa/tag_ar9331.c
index 002cf7f952e2..0ba90e0f91bb 100644
--- a/net/dsa/tag_ar9331.c
+++ b/net/dsa/tag_ar9331.c
@@ -6,6 +6,8 @@ 
 
 #include <linux/bitfield.h>
 #include <linux/etherdevice.h>
+#include <linux/igmp.h>
+#include <net/addrconf.h>
 
 #include "dsa_priv.h"
 
@@ -24,6 +26,69 @@ 
 #define AR9331_HDR_RESERVED_MASK	GENMASK(5, 4)
 #define AR9331_HDR_PORT_NUM_MASK	GENMASK(3, 0)
 
+/*
+ * This code replicated MLD detection more or less in the same way as the
+ * switch is doing it
+ */
+static int ipv6_mc_check_ip6hdr(struct sk_buff *skb)
+{
+	const struct ipv6hdr *ip6h;
+	unsigned int offset;
+
+	offset = skb_network_offset(skb) + sizeof(*ip6h);
+
+	if (!pskb_may_pull(skb, offset))
+		return -EINVAL;
+
+	ip6h = ipv6_hdr(skb);
+
+	if (ip6h->version != 6)
+		return -EINVAL;
+
+	skb_set_transport_header(skb, offset);
+
+	return 0;
+}
+
+static int ipv6_mc_check_exthdrs(struct sk_buff *skb)
+{
+	const struct ipv6hdr *ip6h;
+	int offset;
+	u8 nexthdr;
+	__be16 frag_off;
+
+	ip6h = ipv6_hdr(skb);
+
+	if (ip6h->nexthdr != IPPROTO_HOPOPTS)
+		return -ENOMSG;
+
+	nexthdr = ip6h->nexthdr;
+	offset = skb_network_offset(skb) + sizeof(*ip6h);
+	offset = ipv6_skip_exthdr(skb, offset, &nexthdr, &frag_off);
+
+	if (offset < 0)
+		return -EINVAL;
+
+	if (nexthdr != IPPROTO_ICMPV6)
+		return -ENOMSG;
+
+	skb_set_transport_header(skb, offset);
+
+	return 0;
+}
+
+static int my_ipv6_mc_check_mld(struct sk_buff *skb)
+{
+	int ret;
+
+	ret = ipv6_mc_check_ip6hdr(skb);
+	if (ret < 0)
+		return ret;
+
+	return ipv6_mc_check_exthdrs(skb);
+}
+
+
 static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb,
 				       struct net_device *dev)
 {
@@ -31,6 +96,13 @@  static struct sk_buff *ar9331_tag_xmit(struct sk_buff *skb,
 	__le16 *phdr;
 	u16 hdr;
 
+	if (dp->stp_state == BR_STATE_BLOCKING) {
+		/* TODO: should we reflect it in the stats? */
+		netdev_warn_once(dev, "%s:%i dropping blocking packet\n",
+				 __func__, __LINE__);
+		return NULL;
+	}
+
 	phdr = skb_push(skb, AR9331_HDR_LEN);
 
 	hdr = FIELD_PREP(AR9331_HDR_VERSION_MASK, AR9331_HDR_VERSION);
@@ -80,11 +152,69 @@  static struct sk_buff *ar9331_tag_rcv(struct sk_buff *skb,
 	return skb;
 }
 
+static void ar9331_tag_rcv_post(struct sk_buff *skb)
+{
+	const struct iphdr *iph;
+	unsigned char *dest;
+	int ret;
+
+	/*
+	 * Since the switch do not tell us which packets was offloaded we assume
+	 * that all of them did. Except:
+	 * - port is not configured for forwarding to any other ports
+	 * - igmp/mld snooping is enabled
+	 * - unicast or multicast flood is disabled on some of bridged ports
+	 * - if we have two port bridge and one is not in forwarding state.
+	 * - packet was dropped on the output port..
+	 * - any other reasons?
+	 */
+	skb->offload_fwd_mark = true;
+
+	dest = eth_hdr(skb)->h_dest;
+	/*
+	 * Complete not multicast traffic seems to be forwarded automatically,
+	 * as long as multicast and unicast flood are enabled
+	 */
+	if (!(is_multicast_ether_addr(dest) && !is_broadcast_ether_addr(dest)))
+		return;
+
+
+	/*
+	 * Documentation do not providing any usable information on how the
+	 * igmp/mld snooping is implemented on this switch. Following
+	 * implementation is based on testing, by sending correct and malformed
+	 * packets to the switch.
+	 * It is not trying to find sane and properly formated packets. Instead
+	 * it is trying to be as close to the switch behavior as possible.
+	 */
+	skb_reset_network_header(skb);
+	switch (ntohs(skb->protocol)) {
+	case ETH_P_IP:
+
+		if (!pskb_network_may_pull(skb, sizeof(*iph)))
+			break;
+
+		iph = ip_hdr(skb);
+		if (iph->protocol == IPPROTO_IGMP)
+			skb->offload_fwd_mark = false;
+
+		break;
+	case ETH_P_IPV6:
+		ret = my_ipv6_mc_check_mld(skb);
+		if (!ret)
+			skb->offload_fwd_mark = false;
+
+		break;
+	}
+}
+
+
 static const struct dsa_device_ops ar9331_netdev_ops = {
 	.name	= "ar9331",
 	.proto	= DSA_TAG_PROTO_AR9331,
 	.xmit	= ar9331_tag_xmit,
 	.rcv	= ar9331_tag_rcv,
+	.rcv_post = ar9331_tag_rcv_post,
 	.overhead = AR9331_HDR_LEN,
 };