Message ID | 20230802114320.4156068-1-william.xuanziyang@huawei.com |
---|---|
State | Accepted |
Commit | 01f4fd27087078c90a0e22860d1dfa2cd0510791 |
Delegated to: | Netdev Maintainers |
Series | [net,v2] bonding: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves |
On Wed, Aug 02, 2023 at 07:43:20PM +0800, Ziyang Xuan wrote:
> BUG_ON(!vlan_info) is triggered in unregister_vlan_dev() with
> following testcase:
>
>   # ip netns add ns1
>   # ip netns exec ns1 ip link add bond0 type bond mode 0
>   # ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
>   # ip netns exec ns1 ip link set bond_slave_1 master bond0
>   # ip netns exec ns1 ip link add link bond_slave_1 name vlan10 type vlan id 10 protocol 802.1ad
>   # ip netns exec ns1 ip link add link bond0 name bond0_vlan10 type vlan id 10 protocol 802.1ad
>   # ip netns exec ns1 ip link set bond_slave_1 nomaster
>   # ip netns del ns1
>
> The logical analysis of the problem is as follows:
>
> 1. create ETH_P_8021AD protocol vlan10 for bond_slave_1:
> register_vlan_dev()
>   vlan_vid_add()
>     vlan_info_alloc()
>     __vlan_vid_add() // add [ETH_P_8021AD, 10] vid to bond_slave_1
>
> 2. create ETH_P_8021AD protocol bond0_vlan10 for bond0:
> register_vlan_dev()
>   vlan_vid_add()
>     __vlan_vid_add()
>       vlan_add_rx_filter_info()
>           if (!vlan_hw_filter_capable(dev, proto)) // condition established because bond0 without NETIF_F_HW_VLAN_STAG_FILTER
>               return 0;
>
>           if (netif_device_present(dev))
>               return dev->netdev_ops->ndo_vlan_rx_add_vid(dev, proto, vid); // will be never called
>               // The slaves of bond0 will not refer to the [ETH_P_8021AD, 10] vid.
>
> 3. detach bond_slave_1 from bond0:
> __bond_release_one()
>   vlan_vids_del_by_dev()
>     list_for_each_entry(vid_info, &vlan_info->vid_list, list)
>         vlan_vid_del(dev, vid_info->proto, vid_info->vid);
>         // bond_slave_1 [ETH_P_8021AD, 10] vid will be deleted.
>         // bond_slave_1->vlan_info will be assigned NULL.
>
> 4. delete vlan10 during delete ns1:
> default_device_exit_batch()
>   dev->rtnl_link_ops->dellink() // unregister_vlan_dev() for vlan10
>     vlan_info = rtnl_dereference(real_dev->vlan_info); // real_dev of vlan10 is bond_slave_1
>         BUG_ON(!vlan_info); // bond_slave_1->vlan_info is NULL now, bug is triggered!!!
>
> Add S-VLAN tag related features support to bond driver. So the bond driver
> will always propagate the VLAN info to its slaves.
>
> Fixes: 8ad227ff89a7 ("net: vlan: add 802.1ad support")
> Suggested-by: Ido Schimmel <idosch@idosch.org>
> Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>

Reviewed-by: Ido Schimmel <idosch@nvidia.com>
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Wed, 2 Aug 2023 19:43:20 +0800 you wrote:
> BUG_ON(!vlan_info) is triggered in unregister_vlan_dev() with
> following testcase:
>
>   # ip netns add ns1
>   # ip netns exec ns1 ip link add bond0 type bond mode 0
>   # ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
>   # ip netns exec ns1 ip link set bond_slave_1 master bond0
>   # ip netns exec ns1 ip link add link bond_slave_1 name vlan10 type vlan id 10 protocol 802.1ad
>   # ip netns exec ns1 ip link add link bond0 name bond0_vlan10 type vlan id 10 protocol 802.1ad
>   # ip netns exec ns1 ip link set bond_slave_1 nomaster
>   # ip netns del ns1
>
> [...]

Here is the summary with links:
  - [net,v2] bonding: Fix incorrect deletion of ETH_P_8021AD protocol vid from slaves
    https://git.kernel.org/netdev/net/c/01f4fd270870

You are awesome, thank you!
    diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
    index 484c9e3e5e82..447b06ea4fc9 100644
    --- a/drivers/net/bonding/bond_main.c
    +++ b/drivers/net/bonding/bond_main.c
    @@ -5901,7 +5901,9 @@ void bond_setup(struct net_device *bond_dev)
     
     	bond_dev->hw_features = BOND_VLAN_FEATURES |
     				NETIF_F_HW_VLAN_CTAG_RX |
    -				NETIF_F_HW_VLAN_CTAG_FILTER;
    +				NETIF_F_HW_VLAN_CTAG_FILTER |
    +				NETIF_F_HW_VLAN_STAG_RX |
    +				NETIF_F_HW_VLAN_STAG_FILTER;
     
     	bond_dev->hw_features |= NETIF_F_GSO_ENCAP_ALL;
     	bond_dev->features |= bond_dev->hw_features;
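The effect of the two added feature bits on the filter-capability check can be illustrated with a small userspace sketch. This is not kernel code: the `TOY_*` names and bit positions are hypothetical, and `toy_filter_capable()` only mirrors the behaviour the commit message attributes to `vlan_hw_filter_capable()` (a device counts as filter capable for a protocol only if it advertises the matching filter feature). Only the two EtherType values are real constants.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* EtherType values for C-VLAN (802.1Q) and S-VLAN (802.1ad) tags. */
#define TOY_ETH_P_8021Q  0x8100
#define TOY_ETH_P_8021AD 0x88A8

/* Hypothetical bit positions; the kernel defines its own NETIF_F_* bits. */
#define TOY_F_CTAG_FILTER (1u << 0)
#define TOY_F_STAG_FILTER (1u << 1)

/* Filter bits advertised by the toy bond before and after the patch. */
#define TOY_BOND_BEFORE (TOY_F_CTAG_FILTER)
#define TOY_BOND_AFTER  (TOY_F_CTAG_FILTER | TOY_F_STAG_FILTER)

/* Assumed model of the capability check: one filter bit per protocol. */
bool toy_filter_capable(uint32_t features, uint16_t proto)
{
    if (proto == TOY_ETH_P_8021Q)
        return (features & TOY_F_CTAG_FILTER) != 0;
    if (proto == TOY_ETH_P_8021AD)
        return (features & TOY_F_STAG_FILTER) != 0;
    return false;
}

/* Checks the before/after behaviour described in the commit message. */
void toy_check_patch_effect(void)
{
    /* 802.1Q vids were already propagated before the patch. */
    assert(toy_filter_capable(TOY_BOND_BEFORE, TOY_ETH_P_8021Q));
    /* 802.1ad vids were not: the add path returned early. */
    assert(!toy_filter_capable(TOY_BOND_BEFORE, TOY_ETH_P_8021AD));
    /* With the S-VLAN filter bit set, the add path reaches the slaves. */
    assert(toy_filter_capable(TOY_BOND_AFTER, TOY_ETH_P_8021AD));
}
```

Calling `toy_check_patch_effect()` from a test program exercises all three cases. The sketch leaves out the RX offload bit (`NETIF_F_HW_VLAN_STAG_RX`), which the patch also adds but which the capability check described in the commit message does not consult.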
BUG_ON(!vlan_info) is triggered in unregister_vlan_dev() with
following testcase:

    # ip netns add ns1
    # ip netns exec ns1 ip link add bond0 type bond mode 0
    # ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
    # ip netns exec ns1 ip link set bond_slave_1 master bond0
    # ip netns exec ns1 ip link add link bond_slave_1 name vlan10 type vlan id 10 protocol 802.1ad
    # ip netns exec ns1 ip link add link bond0 name bond0_vlan10 type vlan id 10 protocol 802.1ad
    # ip netns exec ns1 ip link set bond_slave_1 nomaster
    # ip netns del ns1

The logical analysis of the problem is as follows:

1. create ETH_P_8021AD protocol vlan10 for bond_slave_1:

    register_vlan_dev()
      vlan_vid_add()
        vlan_info_alloc()
        __vlan_vid_add() // add [ETH_P_8021AD, 10] vid to bond_slave_1

2. create ETH_P_8021AD protocol bond0_vlan10 for bond0:

    register_vlan_dev()
      vlan_vid_add()
        __vlan_vid_add()
          vlan_add_rx_filter_info()
              if (!vlan_hw_filter_capable(dev, proto)) // condition established because bond0 without NETIF_F_HW_VLAN_STAG_FILTER
                  return 0;

              if (netif_device_present(dev))
                  return dev->netdev_ops->ndo_vlan_rx_add_vid(dev, proto, vid); // will be never called
                  // The slaves of bond0 will not refer to the [ETH_P_8021AD, 10] vid.

3. detach bond_slave_1 from bond0:

    __bond_release_one()
      vlan_vids_del_by_dev()
        list_for_each_entry(vid_info, &vlan_info->vid_list, list)
            vlan_vid_del(dev, vid_info->proto, vid_info->vid);
            // bond_slave_1 [ETH_P_8021AD, 10] vid will be deleted.
            // bond_slave_1->vlan_info will be assigned NULL.

4. delete vlan10 during delete ns1:

    default_device_exit_batch()
      dev->rtnl_link_ops->dellink() // unregister_vlan_dev() for vlan10
        vlan_info = rtnl_dereference(real_dev->vlan_info); // real_dev of vlan10 is bond_slave_1
            BUG_ON(!vlan_info); // bond_slave_1->vlan_info is NULL now, bug is triggered!!!

Add S-VLAN tag related features support to bond driver. So the bond driver
will always propagate the VLAN info to its slaves.
Fixes: 8ad227ff89a7 ("net: vlan: add 802.1ad support")
Suggested-by: Ido Schimmel <idosch@idosch.org>
Signed-off-by: Ziyang Xuan <william.xuanziyang@huawei.com>
---
v2:
  - Do not add vlan_hw_filter_capable() check in vlan_vids_del_by_dev().
  - Add S-VLAN tag related features support to bond driver to fix the bug.
---
 drivers/net/bonding/bond_main.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)
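
The four-step analysis in the commit message comes down to a reference-count imbalance on the slave: the add through bond0 is skipped, but the delete on release is not. A rough userspace model of that sequence follows. All names (`toy_slave`, `toy_vid_add`, `toy_bug_triggered`, and so on) are hypothetical, the model tracks a single [ETH_P_8021AD, 10] vid on a single slave, and it assumes, as the analysis states, that the slave's `vlan_info` goes away once its last vid reference is dropped.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for bond_slave_1; not a kernel structure. */
struct toy_slave {
    int vid_refs;        /* references held on [ETH_P_8021AD, 10] */
    bool has_vlan_info;  /* models slave->vlan_info != NULL */
};

/* Take one reference on the vid, allocating vlan_info if needed. */
static void toy_vid_add(struct toy_slave *s)
{
    s->vid_refs++;
    s->has_vlan_info = true;
}

/* Drop one reference; the last drop releases vlan_info. */
static void toy_vid_del(struct toy_slave *s)
{
    assert(s->vid_refs > 0);
    if (--s->vid_refs == 0)
        s->has_vlan_info = false;
}

/*
 * Replays steps 1 to 4 of the analysis. Returns true if step 4 would
 * find vlan_info missing, i.e. the BUG_ON(!vlan_info) condition.
 */
bool toy_bug_triggered(bool bond_has_stag_filter)
{
    struct toy_slave slave = { 0, false };

    /* Step 1: vlan10 is created directly on the slave. */
    toy_vid_add(&slave);

    /* Step 2: bond0_vlan10 is created on the bond; the vid reaches
     * the slave only if the bond advertises the S-VLAN filter bit. */
    if (bond_has_stag_filter)
        toy_vid_add(&slave);

    /* Step 3: releasing the slave deletes every bond vid from it,
     * whether or not the matching add was ever propagated. */
    toy_vid_del(&slave);

    /* Step 4: vlan10 still exists and expects vlan_info on the slave. */
    return !slave.has_vlan_info;
}
```

In this model `toy_bug_triggered(false)` corresponds to the bond before the patch and `toy_bug_triggered(true)` to the bond after it: once the add in step 2 is propagated, the delete in step 3 removes only the reference that the bond itself took, and the one owned by vlan10 survives.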