Message ID | 20220505225904.342388-1-andrew@lunn.ch (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [RFC] net: bridge: Clear offload_fwd_mark when passing frame up bridge interface. | expand |
On Fri, 6 May 2022 00:59:04 +0200 Andrew Lunn <andrew@lunn.ch> wrote: > It is possible to stack bridges on top of each other. Consider the > following which makes use of an Ethernet switch: > > br1 > / \ > / \ > / \ > br0.11 wlan0 > | > br0 > / | \ > p1 p2 p3 > > br0 is offloaded to the switch. Above br0 is a vlan interface, for > vlan 11. This vlan interface is then a slave of br1. br1 also has > wireless interface as a slave. This setup trunks wireless lan traffic > over the copper network inside a VLAN. > > A frame received on p1 which is passed up to the bridge has the > skb->offload_fwd_mark flag set to true, indicating it that the switch > has dealt with forwarding the frame out ports p2 and p3 as > needed. This flag instructs the software bridge it does not need to > pass the frame back down again. However, the flag is not getting reset > when the frame is passed upwards. As a result br1 sees the flag, > wrongly interprets it, and fails to forward the frame to wlan0. > > When passing a frame upwards, clear the flag. > > RFC because i don't know the bridge code well enough if this is the > correct place to do this, and if there are any side effects, could the > skb be a clone, etc. > > Fixes: f1c2eddf4cb6 ("bridge: switchdev: Use an helper to clear forward mark") > Signed-off-by: Andrew Lunn <andrew@lunn.ch> Bridging of bridges is not supposed to be allowed. See: bridge:br_if.c /* No bridging of bridges */ if (dev->netdev_ops->ndo_start_xmit == br_dev_xmit) { NL_SET_ERR_MSG(extack, "Can not enslave a bridge to a bridge"); return -ELOOP; }
On Thu, May 05, 2022 at 04:07:20PM -0700, Stephen Hemminger wrote: > On Fri, 6 May 2022 00:59:04 +0200 > Andrew Lunn <andrew@lunn.ch> wrote: > > > It is possible to stack bridges on top of each other. Consider the > > following which makes use of an Ethernet switch: > > > > br1 > > / \ > > / \ > > / \ > > br0.11 wlan0 > > | > > br0 > > / | \ > > p1 p2 p3 > > > > br0 is offloaded to the switch. Above br0 is a vlan interface, for > > vlan 11. This vlan interface is then a slave of br1. br1 also has > > wireless interface as a slave. This setup trunks wireless lan traffic > > over the copper network inside a VLAN. > > > > A frame received on p1 which is passed up to the bridge has the > > skb->offload_fwd_mark flag set to true, indicating it that the switch > > has dealt with forwarding the frame out ports p2 and p3 as > > needed. This flag instructs the software bridge it does not need to > > pass the frame back down again. However, the flag is not getting reset > > when the frame is passed upwards. As a result br1 sees the flag, > > wrongly interprets it, and fails to forward the frame to wlan0. > > > > When passing a frame upwards, clear the flag. > > > > RFC because i don't know the bridge code well enough if this is the > > correct place to do this, and if there are any side effects, could the > > skb be a clone, etc. > > > > Fixes: f1c2eddf4cb6 ("bridge: switchdev: Use an helper to clear forward mark") > > Signed-off-by: Andrew Lunn <andrew@lunn.ch> > > Bridging of bridges is not supposed to be allowed. > See: > > bridge:br_if.c > > /* No bridging of bridges */ > if (dev->netdev_ops->ndo_start_xmit == br_dev_xmit) { > NL_SET_ERR_MSG(extack, > "Can not enslave a bridge to a bridge"); > return -ELOOP; > } This is not direct bridging of bridges. There is a vlan interface in the middle. And even if it is not supposed to work, it does work, it is being used, and it regressed. This fixes the regression. Andrew
Hi Andrew, On Fri, May 06, 2022 at 12:59:04AM +0200, Andrew Lunn wrote: > It is possible to stack bridges on top of each other. Consider the > following which makes use of an Ethernet switch: > > br1 > / \ > / \ > / \ > br0.11 wlan0 > | > br0 > / | \ > p1 p2 p3 > > br0 is offloaded to the switch. Above br0 is a vlan interface, for > vlan 11. This vlan interface is then a slave of br1. br1 also has > wireless interface as a slave. This setup trunks wireless lan traffic > over the copper network inside a VLAN. > > A frame received on p1 which is passed up to the bridge has the > skb->offload_fwd_mark flag set to true, indicating it that the switch > has dealt with forwarding the frame out ports p2 and p3 as > needed. This flag instructs the software bridge it does not need to > pass the frame back down again. However, the flag is not getting reset > when the frame is passed upwards. As a result br1 sees the flag, > wrongly interprets it, and fails to forward the frame to wlan0. > > When passing a frame upwards, clear the flag. > > RFC because i don't know the bridge code well enough if this is the > correct place to do this, and if there are any side effects, could the > skb be a clone, etc. Each skb has its own offload_fwd_mark, so clearing it for this skb does not affect a clone. And when a packet is simultaneously forwarded and locally received, the order is first forward/flood it, then receive it. Cloning takes place during forwarding using deliver_clone(), so it shouldn't be the case that you are clearing the offload_fwd_mark for a yet-to-be-forwarded packet, either. So I think we're good there. > > Fixes: f1c2eddf4cb6 ("bridge: switchdev: Use an helper to clear forward mark") > Signed-off-by: Andrew Lunn <andrew@lunn.ch> > --- > net/bridge/br_input.c | 7 +++++++ > 1 file changed, 7 insertions(+) > > diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c > index 196417859c4a..9327a5fad1df 100644 > --- a/net/bridge/br_input.c > +++ b/net/bridge/br_input.c > @@ -39,6 +39,13 @@ static int br_pass_frame_up(struct sk_buff *skb) > dev_sw_netstats_rx_add(brdev, skb->len); > > vg = br_vlan_group_rcu(br); > + > + /* Reset the offload_fwd_mark because there could be a stacked > + * bridge above, and it should not think this bridge it doing > + * that bridges work forward out its ports. "this bridge is doing that bridge's work forwarding out its ports" > + */ > + br_switchdev_frame_unmark(skb); > + > /* Bridge is just like any other port. Make sure the > * packet is allowed except in promisc mode when someone > * may be running packet capture. > -- > 2.36.0 > The good thing with this patch is that it avoids conditionals. The bad thing is that it prevents true offloading of this configuration from being possible (when "wlan0" is "p4"). I don't know what hardware is capable of doing this, but I think it's cautious to not exclude it, either. Some safer alternatives to this patch are based on the idea that we could ignore skb->offload_fwd_mark coming from an unoffloaded bridge port (i.e. treat this condition at br1, not at br0). We could: - clear skb->offload_fwd_mark in br_handle_frame_finish(), if p->hwdom is 0 - change nbp_switchdev_allowed_egress() to return true if cb->src_hwdom == 0
On Fri, 6 May 2022 03:18:22 +0200 Andrew Lunn <andrew@lunn.ch> wrote: > On Thu, May 05, 2022 at 04:07:20PM -0700, Stephen Hemminger wrote: > > On Fri, 6 May 2022 00:59:04 +0200 > > Andrew Lunn <andrew@lunn.ch> wrote: > > > > > It is possible to stack bridges on top of each other. Consider the > > > following which makes use of an Ethernet switch: > > > > > > br1 > > > / \ > > > / \ > > > / \ > > > br0.11 wlan0 > > > | > > > br0 > > > / | \ > > > p1 p2 p3 > > > > > > br0 is offloaded to the switch. Above br0 is a vlan interface, for > > > vlan 11. This vlan interface is then a slave of br1. br1 also has > > > wireless interface as a slave. This setup trunks wireless lan traffic > > > over the copper network inside a VLAN. > > > > > > A frame received on p1 which is passed up to the bridge has the > > > skb->offload_fwd_mark flag set to true, indicating it that the switch > > > has dealt with forwarding the frame out ports p2 and p3 as > > > needed. This flag instructs the software bridge it does not need to > > > pass the frame back down again. However, the flag is not getting reset > > > when the frame is passed upwards. As a result br1 sees the flag, > > > wrongly interprets it, and fails to forward the frame to wlan0. > > > > > > When passing a frame upwards, clear the flag. > > > > > > RFC because i don't know the bridge code well enough if this is the > > > correct place to do this, and if there are any side effects, could the > > > skb be a clone, etc. > > > > > > Fixes: f1c2eddf4cb6 ("bridge: switchdev: Use an helper to clear forward mark") > > > Signed-off-by: Andrew Lunn <andrew@lunn.ch> > > > > Bridging of bridges is not supposed to be allowed. > > See: > > > > bridge:br_if.c > > > > /* No bridging of bridges */ > > if (dev->netdev_ops->ndo_start_xmit == br_dev_xmit) { > > NL_SET_ERR_MSG(extack, > > "Can not enslave a bridge to a bridge"); > > return -ELOOP; > > } > > This is not direct bridging of bridges. There is a vlan interface in > the middle. And even if it is not supposed to work, it does work, it > is being used, and it regressed. This fixes the regression. > > Andrew The problem is that doing this kind of nested bridging screws up Spanning Tree.
> Some safer alternatives to this patch are based on the idea that we > could ignore skb->offload_fwd_mark coming from an unoffloaded bridge > port (i.e. treat this condition at br1, not at br0). We could: > - clear skb->offload_fwd_mark in br_handle_frame_finish(), if p->hwdom is 0 > - change nbp_switchdev_allowed_egress() to return true if cb->src_hwdom == 0 O.K, i will try out these solutions. Thanks Andrew
On Fri, May 06, 2022 at 02:36:45PM +0000, Vladimir Oltean wrote: > Hi Andrew, > > On Fri, May 06, 2022 at 12:59:04AM +0200, Andrew Lunn wrote: > > It is possible to stack bridges on top of each other. Consider the > > following which makes use of an Ethernet switch: > > > > br1 > > / \ > > / \ > > / \ > > br0.11 wlan0 > > | > > br0 > > / | \ > > p1 p2 p3 > > > > br0 is offloaded to the switch. Above br0 is a vlan interface, for > > vlan 11. This vlan interface is then a slave of br1. br1 also has > > wireless interface as a slave. This setup trunks wireless lan traffic > > over the copper network inside a VLAN. > > > > A frame received on p1 which is passed up to the bridge has the > > skb->offload_fwd_mark flag set to true, indicating it that the switch > > has dealt with forwarding the frame out ports p2 and p3 as > > needed. This flag instructs the software bridge it does not need to > > pass the frame back down again. However, the flag is not getting reset > > when the frame is passed upwards. As a result br1 sees the flag, > > wrongly interprets it, and fails to forward the frame to wlan0. > > > > When passing a frame upwards, clear the flag. > > > > RFC because i don't know the bridge code well enough if this is the > > correct place to do this, and if there are any side effects, could the > > skb be a clone, etc. > > Each skb has its own offload_fwd_mark, so clearing it for this skb does > not affect a clone. And when a packet is simultaneously forwarded and > locally received, the order is first forward/flood it, then receive it. > Cloning takes place during forwarding using deliver_clone(), so it > shouldn't be the case that you are clearing the offload_fwd_mark for a > yet-to-be-forwarded packet, either. So I think we're good there. > > > > > Fixes: f1c2eddf4cb6 ("bridge: switchdev: Use an helper to clear forward mark") > > Signed-off-by: Andrew Lunn <andrew@lunn.ch> > > --- > > net/bridge/br_input.c | 7 +++++++ > > 1 file changed, 7 insertions(+) > > > > diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c > > index 196417859c4a..9327a5fad1df 100644 > > --- a/net/bridge/br_input.c > > +++ b/net/bridge/br_input.c > > @@ -39,6 +39,13 @@ static int br_pass_frame_up(struct sk_buff *skb) > > dev_sw_netstats_rx_add(brdev, skb->len); > > > > vg = br_vlan_group_rcu(br); > > + > > + /* Reset the offload_fwd_mark because there could be a stacked > > + * bridge above, and it should not think this bridge it doing > > + * that bridges work forward out its ports. > > "this bridge is doing that bridge's work forwarding out its ports" > > > + */ > > + br_switchdev_frame_unmark(skb); > > + > > /* Bridge is just like any other port. Make sure the > > * packet is allowed except in promisc mode when someone > > * may be running packet capture. > > -- > > 2.36.0 > > > > The good thing with this patch is that it avoids conditionals. > The bad thing is that it prevents true offloading of this configuration > from being possible (when "wlan0" is "p4"). > > I don't know what hardware is capable of doing this, but I think it's > cautious to not exclude it, either. > > Some safer alternatives to this patch are based on the idea that we > could ignore skb->offload_fwd_mark coming from an unoffloaded bridge > port (i.e. treat this condition at br1, not at br0). We could: > - clear skb->offload_fwd_mark in br_handle_frame_finish(), if p->hwdom is 0 > - change nbp_switchdev_allowed_egress() to return true if cb->src_hwdom == 0 I like Andrew's patch because it is the Rx equivalent of br_switchdev_frame_unmark() in br_dev_xmit(). However, if we go with the second option, it should allow us to remove the clearing of the mark in the Tx path as the control block is cleared in the Tx path since commit fd65e5a95d08 ("net: bridge: clear bridge's private skb space on xmit"). I don't know how far back Nik's patch was backported and I don't know how far back Andrew's patch will be backported, so it might be best to submit Andrew's patch to net as-is and then in net-next change nbp_switchdev_allowed_egress() and remove br_switchdev_frame_unmark() from both the Rx and Tx paths. Anyway, I have applied this patch to our tree for testing. Will report tomorrow in case there are any regressions.
> I like Andrew's patch because it is the Rx equivalent of > br_switchdev_frame_unmark() in br_dev_xmit(). However, if we go with the > second option, it should allow us to remove the clearing of the mark in > the Tx path as the control block is cleared in the Tx path since commit > fd65e5a95d08 ("net: bridge: clear bridge's private skb space on xmit"). > > I don't know how far back Nik's patch was backported and I don't know > how far back Andrew's patch will be backported, so it might be best to > submit Andrew's patch to net as-is and then in net-next change > nbp_switchdev_allowed_egress() and remove br_switchdev_frame_unmark() > from both the Rx and Tx paths. > > Anyway, I have applied this patch to our tree for testing. Will report > tomorrow in case there are any regressions. Hi Ido Did your testing find any issues? Thanks Andrew
On Thu, May 12, 2022 at 10:38:03PM +0200, Andrew Lunn wrote: > > I like Andrew's patch because it is the Rx equivalent of > > br_switchdev_frame_unmark() in br_dev_xmit(). However, if we go with the > > second option, it should allow us to remove the clearing of the mark in > > the Tx path as the control block is cleared in the Tx path since commit > > fd65e5a95d08 ("net: bridge: clear bridge's private skb space on xmit"). > > > > I don't know how far back Nik's patch was backported and I don't know > > how far back Andrew's patch will be backported, so it might be best to > > submit Andrew's patch to net as-is and then in net-next change > > nbp_switchdev_allowed_egress() and remove br_switchdev_frame_unmark() > > from both the Rx and Tx paths. > > > > Anyway, I have applied this patch to our tree for testing. Will report > > tomorrow in case there are any regressions. > > Hi Ido > > Did your testing find any issues? No, patch is fine. Thanks!
diff --git a/net/bridge/br_input.c b/net/bridge/br_input.c index 196417859c4a..9327a5fad1df 100644 --- a/net/bridge/br_input.c +++ b/net/bridge/br_input.c @@ -39,6 +39,13 @@ static int br_pass_frame_up(struct sk_buff *skb) dev_sw_netstats_rx_add(brdev, skb->len); vg = br_vlan_group_rcu(br); + + /* Reset the offload_fwd_mark because there could be a stacked + * bridge above, and it should not think this bridge it doing + * that bridges work forward out its ports. + */ + br_switchdev_frame_unmark(skb); + /* Bridge is just like any other port. Make sure the * packet is allowed except in promisc mode when someone * may be running packet capture.
It is possible to stack bridges on top of each other. Consider the following which makes use of an Ethernet switch: br1 / \ / \ / \ br0.11 wlan0 | br0 / | \ p1 p2 p3 br0 is offloaded to the switch. Above br0 is a vlan interface, for vlan 11. This vlan interface is then a slave of br1. br1 also has wireless interface as a slave. This setup trunks wireless lan traffic over the copper network inside a VLAN. A frame received on p1 which is passed up to the bridge has the skb->offload_fwd_mark flag set to true, indicating it that the switch has dealt with forwarding the frame out ports p2 and p3 as needed. This flag instructs the software bridge it does not need to pass the frame back down again. However, the flag is not getting reset when the frame is passed upwards. As a result br1 sees the flag, wrongly interprets it, and fails to forward the frame to wlan0. When passing a frame upwards, clear the flag. RFC because i don't know the bridge code well enough if this is the correct place to do this, and if there are any side effects, could the skb be a clone, etc. Fixes: f1c2eddf4cb6 ("bridge: switchdev: Use an helper to clear forward mark") Signed-off-by: Andrew Lunn <andrew@lunn.ch> --- net/bridge/br_input.c | 7 +++++++ 1 file changed, 7 insertions(+)