Message ID | 20220210142401.4912-1-nbd@nbd.name (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [RFC,1/2] net: bridge: add knob for filtering rx/tx BPDU packets on a port | expand |
On 10/02/2022 16:24, Felix Fietkau wrote: > Some devices (e.g. wireless APs) can't have devices behind them be part of > a bridge topology with redundant links, due to address limitations. > Additionally, broadcast traffic on these devices is somewhat expensive, due to > the low data rate and wakeups of clients in powersave mode. > This knob can be used to ensure that BPDU packets are never sent or forwarded > to/from these devices > > Signed-off-by: Felix Fietkau <nbd@nbd.name> > --- > include/linux/if_bridge.h | 1 + > include/uapi/linux/if_link.h | 1 + > net/bridge/br_forward.c | 5 +++++ > net/bridge/br_input.c | 2 ++ > net/bridge/br_netlink.c | 6 +++++- > net/bridge/br_stp_bpdu.c | 9 +++++++-- > net/core/rtnetlink.c | 4 +++- > 7 files changed, 24 insertions(+), 4 deletions(-) > Why can't netfilter or tc be used to filter these frames?
On 10.02.22 15:55, Nikolay Aleksandrov wrote: > On 10/02/2022 16:24, Felix Fietkau wrote: >> Some devices (e.g. wireless APs) can't have devices behind them be part of >> a bridge topology with redundant links, due to address limitations. >> Additionally, broadcast traffic on these devices is somewhat expensive, due to >> the low data rate and wakeups of clients in powersave mode. >> This knob can be used to ensure that BPDU packets are never sent or forwarded >> to/from these devices >> >> Signed-off-by: Felix Fietkau <nbd@nbd.name> >> --- >> include/linux/if_bridge.h | 1 + >> include/uapi/linux/if_link.h | 1 + >> net/bridge/br_forward.c | 5 +++++ >> net/bridge/br_input.c | 2 ++ >> net/bridge/br_netlink.c | 6 +++++- >> net/bridge/br_stp_bpdu.c | 9 +++++++-- >> net/core/rtnetlink.c | 4 +++- >> 7 files changed, 24 insertions(+), 4 deletions(-) >> > > Why can't netfilter or tc be used to filter these frames? netfilter is slow as hell, and even adding a tc rule that has to look at all frames to check for useless BPDU packets costs a lot more CPU cycles than simply suppressing them at the source. - Felix
(resending, for some reason my first email didn't make it to the mailing list) On 10/02/2022 18:06, Felix Fietkau wrote: > > On 10.02.22 15:55, Nikolay Aleksandrov wrote: >> On 10/02/2022 16:24, Felix Fietkau wrote: >>> Some devices (e.g. wireless APs) can't have devices behind them be part of >>> a bridge topology with redundant links, due to address limitations. >>> Additionally, broadcast traffic on these devices is somewhat expensive, due to >>> the low data rate and wakeups of clients in powersave mode. >>> This knob can be used to ensure that BPDU packets are never sent or forwarded >>> to/from these devices >>> >>> Signed-off-by: Felix Fietkau <nbd@nbd.name> >>> --- >>> include/linux/if_bridge.h | 1 + >>> include/uapi/linux/if_link.h | 1 + >>> net/bridge/br_forward.c | 5 +++++ >>> net/bridge/br_input.c | 2 ++ >>> net/bridge/br_netlink.c | 6 +++++- >>> net/bridge/br_stp_bpdu.c | 9 +++++++-- >>> net/core/rtnetlink.c | 4 +++- >>> 7 files changed, 24 insertions(+), 4 deletions(-) >>> >> >> Why can't netfilter or tc be used to filter these frames? > netfilter is slow as hell, and even adding a tc rule that has to look at all frames to check for useless BPDU packets costs a lot more CPU cycles than simply suppressing them at the source. > > - Felix You can use XDP, should be much faster. I don't want new tests in the fast path for something that is already solved and doesn't need anything bridge-specific. Tomorrow someone will try the same with some other packet type, sorry but absolutely unacceptable. Cheers, Nik
On Thu, 10 Feb 2022 16:55:40 +0200 Nikolay Aleksandrov <nikolay@nvidia.com> wrote: > On 10/02/2022 16:24, Felix Fietkau wrote: > > Some devices (e.g. wireless APs) can't have devices behind them be part of > > a bridge topology with redundant links, due to address limitations. > > Additionally, broadcast traffic on these devices is somewhat expensive, due to > > the low data rate and wakeups of clients in powersave mode. > > This knob can be used to ensure that BPDU packets are never sent or forwarded > > to/from these devices > > > > Signed-off-by: Felix Fietkau <nbd@nbd.name> > > --- > > include/linux/if_bridge.h | 1 + > > include/uapi/linux/if_link.h | 1 + > > net/bridge/br_forward.c | 5 +++++ > > net/bridge/br_input.c | 2 ++ > > net/bridge/br_netlink.c | 6 +++++- > > net/bridge/br_stp_bpdu.c | 9 +++++++-- > > net/core/rtnetlink.c | 4 +++- > > 7 files changed, 24 insertions(+), 4 deletions(-) > > > > Why can't netfilter or tc be used to filter these frames? > > It could but this looks better. BPDU filter matches what other hardware switch vendors offer and has the benefit of not adding another layer of complexity into user configurations. Adding one rule into a complex firewall or starting to have to configure tc is a mess.
diff --git a/include/linux/if_bridge.h b/include/linux/if_bridge.h index 509e18c7e740..18d3b264b754 100644 --- a/include/linux/if_bridge.h +++ b/include/linux/if_bridge.h @@ -58,6 +58,7 @@ struct br_ip_list { #define BR_MRP_LOST_CONT BIT(18) #define BR_MRP_LOST_IN_CONT BIT(19) #define BR_TX_FWD_OFFLOAD BIT(20) +#define BR_BPDU_FILTER BIT(21) #define BR_DEFAULT_AGEING_TIME (300 * HZ) diff --git a/include/uapi/linux/if_link.h b/include/uapi/linux/if_link.h index 6218f93f5c1a..4c847c2d6afa 100644 --- a/include/uapi/linux/if_link.h +++ b/include/uapi/linux/if_link.h @@ -537,6 +537,7 @@ enum { IFLA_BRPORT_MRP_IN_OPEN, IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT, IFLA_BRPORT_MCAST_EHT_HOSTS_CNT, + IFLA_BRPORT_BPDU_FILTER, __IFLA_BRPORT_MAX }; #define IFLA_BRPORT_MAX (__IFLA_BRPORT_MAX - 1) diff --git a/net/bridge/br_forward.c b/net/bridge/br_forward.c index ec646656dbf1..9fe5c888f27d 100644 --- a/net/bridge/br_forward.c +++ b/net/bridge/br_forward.c @@ -199,6 +199,7 @@ static struct net_bridge_port *maybe_deliver( void br_flood(struct net_bridge *br, struct sk_buff *skb, enum br_pkt_type pkt_type, bool local_rcv, bool local_orig) { + const unsigned char *dest = eth_hdr(skb)->h_dest; struct net_bridge_port *prev = NULL; struct net_bridge_port *p; @@ -214,6 +215,10 @@ void br_flood(struct net_bridge *br, struct sk_buff *skb, case BR_PKT_MULTICAST: if (!(p->flags & BR_MCAST_FLOOD) && skb->dev != br->dev) continue; + if ((p->flags & BR_BPDU_FILTER) && + unlikely(is_link_local_ether_addr(dest) && + dest[5] == 0)) + continue; break; case BR_PKT_BROADCAST: if (!(p->flags & BR_BCAST_FLOOD) && skb->dev != br->dev) diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index b50382f957c1..d8263c4849c1 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -316,6 +316,8 @@ static rx_handler_result_t br_handle_frame(struct sk_buff **pskb) fwd_mask |= p->group_fwd_mask; switch (dest[5]) { case 0x00: /* Bridge Group Address */ + if (p->flags & BR_BPDU_FILTER) + goto drop; /* If STP is turned off, then must forward to keep loop detection */ if (p->br->stp_enabled == BR_NO_STP || diff --git a/net/bridge/br_netlink.c b/net/bridge/br_netlink.c index 2ff83d84230d..11215c55adc2 100644 --- a/net/bridge/br_netlink.c +++ b/net/bridge/br_netlink.c @@ -184,6 +184,7 @@ static inline size_t br_port_info_size(void) + nla_total_size(1) /* IFLA_BRPORT_VLAN_TUNNEL */ + nla_total_size(1) /* IFLA_BRPORT_NEIGH_SUPPRESS */ + nla_total_size(1) /* IFLA_BRPORT_ISOLATED */ + + nla_total_size(1) /* IFLA_BRPORT_BPDU_FILTER */ + nla_total_size(sizeof(struct ifla_bridge_id)) /* IFLA_BRPORT_ROOT_ID */ + nla_total_size(sizeof(struct ifla_bridge_id)) /* IFLA_BRPORT_BRIDGE_ID */ + nla_total_size(sizeof(u16)) /* IFLA_BRPORT_DESIGNATED_PORT */ @@ -269,7 +270,8 @@ static int br_port_fill_attrs(struct sk_buff *skb, BR_MRP_LOST_CONT)) || nla_put_u8(skb, IFLA_BRPORT_MRP_IN_OPEN, !!(p->flags & BR_MRP_LOST_IN_CONT)) || - nla_put_u8(skb, IFLA_BRPORT_ISOLATED, !!(p->flags & BR_ISOLATED))) + nla_put_u8(skb, IFLA_BRPORT_ISOLATED, !!(p->flags & BR_ISOLATED)) || + nla_put_u8(skb, IFLA_BRPORT_BPDU_FILTER, !!(p->flags & BR_BPDU_FILTER))) return -EMSGSIZE; timerval = br_timer_value(&p->message_age_timer); @@ -829,6 +831,7 @@ static const struct nla_policy br_port_policy[IFLA_BRPORT_MAX + 1] = { [IFLA_BRPORT_ISOLATED] = { .type = NLA_U8 }, [IFLA_BRPORT_BACKUP_PORT] = { .type = NLA_U32 }, [IFLA_BRPORT_MCAST_EHT_HOSTS_LIMIT] = { .type = NLA_U32 }, + [IFLA_BRPORT_BPDU_FILTER] = { .type = NLA_U8 }, }; /* Change the state of the port and notify spanning tree */ @@ -893,6 +896,7 @@ static int br_setport(struct net_bridge_port *p, struct nlattr *tb[], br_set_port_flag(p, tb, IFLA_BRPORT_VLAN_TUNNEL, BR_VLAN_TUNNEL); br_set_port_flag(p, tb, IFLA_BRPORT_NEIGH_SUPPRESS, BR_NEIGH_SUPPRESS); br_set_port_flag(p, tb, IFLA_BRPORT_ISOLATED, BR_ISOLATED); + br_set_port_flag(p, tb, IFLA_BRPORT_BPDU_FILTER, BR_BPDU_FILTER); changed_mask = old_flags ^ p->flags; diff --git a/net/bridge/br_stp_bpdu.c b/net/bridge/br_stp_bpdu.c index 0e4572f31330..9d2a235260eb 100644 --- a/net/bridge/br_stp_bpdu.c +++ b/net/bridge/br_stp_bpdu.c @@ -80,7 +80,8 @@ void br_send_config_bpdu(struct net_bridge_port *p, struct br_config_bpdu *bpdu) { unsigned char buf[35]; - if (p->br->stp_enabled != BR_KERNEL_STP) + if (p->br->stp_enabled != BR_KERNEL_STP || + (p->flags & BR_BPDU_FILTER)) return; buf[0] = 0; @@ -127,7 +128,8 @@ void br_send_tcn_bpdu(struct net_bridge_port *p) { unsigned char buf[4]; - if (p->br->stp_enabled != BR_KERNEL_STP) + if (p->br->stp_enabled != BR_KERNEL_STP || + (p->flags & BR_BPDU_FILTER)) return; buf[0] = 0; @@ -172,6 +174,9 @@ void br_stp_rcv(const struct stp_proto *proto, struct sk_buff *skb, if (!(br->dev->flags & IFF_UP)) goto out; + if (p->flags & BR_BPDU_FILTER) + goto out; + if (p->state == BR_STATE_DISABLED) goto out; diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index 710da8a36729..00328f0dd22b 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -4707,7 +4707,9 @@ int ndo_dflt_bridge_getlink(struct sk_buff *skb, u32 pid, u32 seq, brport_nla_put_flag(skb, flags, mask, IFLA_BRPORT_MCAST_FLOOD, BR_MCAST_FLOOD) || brport_nla_put_flag(skb, flags, mask, - IFLA_BRPORT_BCAST_FLOOD, BR_BCAST_FLOOD)) { + IFLA_BRPORT_BCAST_FLOOD, BR_BCAST_FLOOD) || + brport_nla_put_flag(skb, flags, mask, + IFLA_BRPORT_BPDU_FILTER, BR_BPDU_FILTER)) { nla_nest_cancel(skb, protinfo); goto nla_put_failure; }
Some devices (e.g. wireless APs) can't have devices behind them be part of a bridge topology with redundant links, due to address limitations. Additionally, broadcast traffic on these devices is somewhat expensive, due to the low data rate and wakeups of clients in powersave mode. This knob can be used to ensure that BPDU packets are never sent or forwarded to/from these devices Signed-off-by: Felix Fietkau <nbd@nbd.name> --- include/linux/if_bridge.h | 1 + include/uapi/linux/if_link.h | 1 + net/bridge/br_forward.c | 5 +++++ net/bridge/br_input.c | 2 ++ net/bridge/br_netlink.c | 6 +++++- net/bridge/br_stp_bpdu.c | 9 +++++++-- net/core/rtnetlink.c | 4 +++- 7 files changed, 24 insertions(+), 4 deletions(-)