[net,v2] net: check vlan filter feature in vlan_vids_add_by_dev() and vlan_vids_del_by_dev()

Message ID 20231213040641.2653812-1-liujian56@huawei.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Series [net,v2] net: check vlan filter feature in vlan_vids_add_by_dev() and vlan_vids_del_by_dev()

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net
netdev/ynl success SINGLE THREAD; Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1117 this patch: 1117
netdev/cc_maintainers success CCed 5 of 5 maintainers
netdev/build_clang success Errors and warnings before: 1143 this patch: 1143
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1144 this patch: 1144
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 40 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

liujian (CE) Dec. 13, 2023, 4:06 a.m. UTC
I got the bleow warning trace:

WARNING: CPU: 4 PID: 4056 at net/core/dev.c:11066 unregister_netdevice_many_notify
CPU: 4 PID: 4056 Comm: ip Not tainted 6.7.0-rc4+ #15
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
RIP: 0010:unregister_netdevice_many_notify+0x9a4/0x9b0
Call Trace:
 rtnl_dellink
 rtnetlink_rcv_msg
 netlink_rcv_skb
 netlink_unicast
 netlink_sendmsg
 __sock_sendmsg
 ____sys_sendmsg
 ___sys_sendmsg
 __sys_sendmsg
 do_syscall_64
 entry_SYSCALL_64_after_hwframe

It can be reproduced via:

    ip netns add ns1
    ip netns exec ns1 ip link add bond0 type bond mode 0
    ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
    ip netns exec ns1 ip link set bond_slave_1 master bond0
[1] ip netns exec ns1 ethtool -K bond0 rx-vlan-filter off
[2] ip netns exec ns1 ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0
[3] ip netns exec ns1 ip link add link bond0 name bond0.0 type vlan id 0
[4] ip netns exec ns1 ip link set bond_slave_1 nomaster
[5] ip netns exec ns1 ip link del veth2
    ip netns del ns1

This is all caused by command [1] turning off the rx-vlan-filter function
of bond0. The reason is the same as commit 01f4fd270870 ("bonding: Fix
incorrect deletion of ETH_P_8021AD protocol vid from slaves"). Commands
[2] [3] add the same vid to slave and master respectively, causing
command [4] to empty slave->vlan_info. The following command [5] triggers
this problem.

To fix this problem, we should add VLAN_FILTER feature checks in
vlan_vids_add_by_dev() and vlan_vids_del_by_dev() to prevent incorrect
addition or deletion of vlan_vid information.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Liu Jian <liujian56@huawei.com>
---
v1->v2: Modify patch title and commit message.
	Remove superfluous operations in ethtool/features.c and ioctl.c
 net/8021q/vlan_core.c | 21 ++++++++++++++++++++-
 1 file changed, 20 insertions(+), 1 deletion(-)

Comments

Jakub Kicinski Dec. 15, 2023, 2:36 a.m. UTC | #1
On Wed, 13 Dec 2023 12:06:41 +0800 Liu Jian wrote:
> I got the bleow warning trace:

s/bleow/below/

> WARNING: CPU: 4 PID: 4056 at net/core/dev.c:11066 unregister_netdevice_many_notify
> CPU: 4 PID: 4056 Comm: ip Not tainted 6.7.0-rc4+ #15
> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
> RIP: 0010:unregister_netdevice_many_notify+0x9a4/0x9b0
> Call Trace:
>  rtnl_dellink
>  rtnetlink_rcv_msg
>  netlink_rcv_skb
>  netlink_unicast
>  netlink_sendmsg
>  __sock_sendmsg
>  ____sys_sendmsg
>  ___sys_sendmsg
>  __sys_sendmsg
>  do_syscall_64
>  entry_SYSCALL_64_after_hwframe
> 
> It can be reproduced via:
> 
>     ip netns add ns1
>     ip netns exec ns1 ip link add bond0 type bond mode 0
>     ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
>     ip netns exec ns1 ip link set bond_slave_1 master bond0
> [1] ip netns exec ns1 ethtool -K bond0 rx-vlan-filter off
> [2] ip netns exec ns1 ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0
> [3] ip netns exec ns1 ip link add link bond0 name bond0.0 type vlan id 0
> [4] ip netns exec ns1 ip link set bond_slave_1 nomaster
> [5] ip netns exec ns1 ip link del veth2
>     ip netns del ns1

Could you construct a selftest based on those commands?

> This is all caused by command [1] turning off the rx-vlan-filter function
> of bond0. The reason is the same as commit 01f4fd270870 ("bonding: Fix
> incorrect deletion of ETH_P_8021AD protocol vid from slaves"). Commands
> [2] [3] add the same vid to slave and master respectively, causing
> command [4] to empty slave->vlan_info. The following command [5] triggers
> this problem.
> 
> To fix this problem, we should add VLAN_FILTER feature checks in
> vlan_vids_add_by_dev() and vlan_vids_del_by_dev() to prevent incorrect
> addition or deletion of vlan_vid information.
> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")

Did the STAG/CTAG features exist in 2.6? I thought I saw the commit
that added them in git at some point. Could be misremembering...

> Signed-off-by: Liu Jian <liujian56@huawei.com>
> ---
> v1->v2: Modify patch title and commit message.
> 	Remove superfluous operations in ethtool/features.c and ioctl.c
>  net/8021q/vlan_core.c | 21 ++++++++++++++++++++-
>  1 file changed, 20 insertions(+), 1 deletion(-)
> 
> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
> index 0beb44f2fe1f..e94b509386bb 100644
> --- a/net/8021q/vlan_core.c
> +++ b/net/8021q/vlan_core.c
> @@ -407,6 +407,12 @@ int vlan_vids_add_by_dev(struct net_device *dev,
>  		return 0;
>  
>  	list_for_each_entry(vid_info, &vlan_info->vid_list, list) {
> +		if (!(by_dev->features & NETIF_F_HW_VLAN_CTAG_FILTER) &&
> +		    vid_info->proto == htons(ETH_P_8021Q))
> +			continue;
> +		if (!(by_dev->features & NETIF_F_HW_VLAN_STAG_FILTER) &&
> +		    vid_info->proto == htons(ETH_P_8021AD))
> +			continue;

this code is copied 3 times, could you please factor it out to a helper
taking dev and vid_info and deciding if the walk should skip?

liujian (CE) Dec. 16, 2023, 7:40 a.m. UTC | #2
On 2023/12/15 10:36, Jakub Kicinski wrote:
> On Wed, 13 Dec 2023 12:06:41 +0800 Liu Jian wrote:
>> I got the bleow warning trace:
> 
> s/bleow/below/
> 
>> WARNING: CPU: 4 PID: 4056 at net/core/dev.c:11066 unregister_netdevice_many_notify
>> CPU: 4 PID: 4056 Comm: ip Not tainted 6.7.0-rc4+ #15
>> Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.15.0-1 04/01/2014
>> RIP: 0010:unregister_netdevice_many_notify+0x9a4/0x9b0
>> Call Trace:
>>   rtnl_dellink
>>   rtnetlink_rcv_msg
>>   netlink_rcv_skb
>>   netlink_unicast
>>   netlink_sendmsg
>>   __sock_sendmsg
>>   ____sys_sendmsg
>>   ___sys_sendmsg
>>   __sys_sendmsg
>>   do_syscall_64
>>   entry_SYSCALL_64_after_hwframe
>>
>> It can be reproduced via:
>>
>>      ip netns add ns1
>>      ip netns exec ns1 ip link add bond0 type bond mode 0
>>      ip netns exec ns1 ip link add bond_slave_1 type veth peer veth2
>>      ip netns exec ns1 ip link set bond_slave_1 master bond0
>> [1] ip netns exec ns1 ethtool -K bond0 rx-vlan-filter off
>> [2] ip netns exec ns1 ip link add link bond_slave_1 name bond_slave_1.0 type vlan id 0
>> [3] ip netns exec ns1 ip link add link bond0 name bond0.0 type vlan id 0
>> [4] ip netns exec ns1 ip link set bond_slave_1 nomaster
>> [5] ip netns exec ns1 ip link del veth2
>>      ip netns del ns1
> 
> Could you construct a selftest based on those commands?
OK.
> 
>> This is all caused by command [1] turning off the rx-vlan-filter function
>> of bond0. The reason is the same as commit 01f4fd270870 ("bonding: Fix
>> incorrect deletion of ETH_P_8021AD protocol vid from slaves"). Commands
>> [2] [3] add the same vid to slave and master respectively, causing
>> command [4] to empty slave->vlan_info. The following command [5] triggers
>> this problem.
>>
>> To fix this problem, we should add VLAN_FILTER feature checks in
>> vlan_vids_add_by_dev() and vlan_vids_del_by_dev() to prevent incorrect
>> addition or deletion of vlan_vid information.
>>
>> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> 
> Did the STAG/CTAG features exist in 2.6? I thought I saw the commit
> that added them in git at some point. Could be misremembering...
I had only checked that the feature NETIF_F_HW_VLAN_FILTER
(NETIF_F_HW_VLAN_CTAG_FILTER) was already present at that tag.
I now think the following commit is a more suitable Fixes target:
348a1443cc43 ("vlan: introduce functions to do mass addition/deletion of
vids by another device")
> 
>> Signed-off-by: Liu Jian <liujian56@huawei.com>
>> ---
>> v1->v2: Modify patch title and commit message.
>> 	Remove superfluous operations in ethtool/features.c and ioctl.c
>>   net/8021q/vlan_core.c | 21 ++++++++++++++++++++-
>>   1 file changed, 20 insertions(+), 1 deletion(-)
>>
>> diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
>> index 0beb44f2fe1f..e94b509386bb 100644
>> --- a/net/8021q/vlan_core.c
>> +++ b/net/8021q/vlan_core.c
>> @@ -407,6 +407,12 @@ int vlan_vids_add_by_dev(struct net_device *dev,
>>   		return 0;
>>   
>>   	list_for_each_entry(vid_info, &vlan_info->vid_list, list) {
>> +		if (!(by_dev->features & NETIF_F_HW_VLAN_CTAG_FILTER) &&
>> +		    vid_info->proto == htons(ETH_P_8021Q))
>> +			continue;
>> +		if (!(by_dev->features & NETIF_F_HW_VLAN_STAG_FILTER) &&
>> +		    vid_info->proto == htons(ETH_P_8021AD))
>> +			continue;
> 
> this code is copied 3 times, could you please factor it out to a helper
> taking dev and vid_info and deciding if the walk should skip?

I found a suitable existing function: vlan_hw_filter_capable().
Thanks for your review.

Patch

diff --git a/net/8021q/vlan_core.c b/net/8021q/vlan_core.c
index 0beb44f2fe1f..e94b509386bb 100644
--- a/net/8021q/vlan_core.c
+++ b/net/8021q/vlan_core.c
@@ -407,6 +407,12 @@  int vlan_vids_add_by_dev(struct net_device *dev,
 		return 0;
 
 	list_for_each_entry(vid_info, &vlan_info->vid_list, list) {
+		if (!(by_dev->features & NETIF_F_HW_VLAN_CTAG_FILTER) &&
+		    vid_info->proto == htons(ETH_P_8021Q))
+			continue;
+		if (!(by_dev->features & NETIF_F_HW_VLAN_STAG_FILTER) &&
+		    vid_info->proto == htons(ETH_P_8021AD))
+			continue;
 		err = vlan_vid_add(dev, vid_info->proto, vid_info->vid);
 		if (err)
 			goto unwind;
@@ -417,6 +423,12 @@  int vlan_vids_add_by_dev(struct net_device *dev,
 	list_for_each_entry_continue_reverse(vid_info,
 					     &vlan_info->vid_list,
 					     list) {
+		if (!(by_dev->features & NETIF_F_HW_VLAN_CTAG_FILTER) &&
+		    vid_info->proto == htons(ETH_P_8021Q))
+			continue;
+		if (!(by_dev->features & NETIF_F_HW_VLAN_STAG_FILTER) &&
+		    vid_info->proto == htons(ETH_P_8021AD))
+			continue;
 		vlan_vid_del(dev, vid_info->proto, vid_info->vid);
 	}
 
@@ -436,8 +448,15 @@  void vlan_vids_del_by_dev(struct net_device *dev,
 	if (!vlan_info)
 		return;
 
-	list_for_each_entry(vid_info, &vlan_info->vid_list, list)
+	list_for_each_entry(vid_info, &vlan_info->vid_list, list) {
+		if (!(by_dev->features & NETIF_F_HW_VLAN_CTAG_FILTER) &&
+		    vid_info->proto == htons(ETH_P_8021Q))
+			continue;
+		if (!(by_dev->features & NETIF_F_HW_VLAN_STAG_FILTER) &&
+		    vid_info->proto == htons(ETH_P_8021AD))
+			continue;
 		vlan_vid_del(dev, vid_info->proto, vid_info->vid);
+	}
 }
 EXPORT_SYMBOL(vlan_vids_del_by_dev);
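
The review asks for the check that is repeated in all three hunks above to be factored into one helper that takes the device and the vid entry and decides whether the walk should skip it. The following is only a userspace sketch of that idea, not the code that was eventually submitted: struct mock_dev, struct mock_vid, the MOCK_* constants, mock_skip_vid() and mock_count_propagated() are all hypothetical stand-ins, and the flag values are arbitrary rather than the kernel's. It mirrors the two conditions from the patch: skip an 802.1Q vid when the CTAG filter flag is off, and skip an 802.1AD vid when the STAG filter flag is off.

```c
#include <arpa/inet.h>   /* htons() */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in values for illustration; not the kernel's definitions. */
#define MOCK_ETH_P_8021Q   0x8100
#define MOCK_ETH_P_8021AD  0x88A8
#define MOCK_F_CTAG_FILTER (1u << 0)
#define MOCK_F_STAG_FILTER (1u << 1)

/* Hypothetical reduced models of net_device and vlan_vid_info. */
struct mock_dev { uint32_t features; };
struct mock_vid { uint16_t proto; uint16_t vid; };  /* proto in network byte order */

/*
 * Returns true when a walk over by_dev's vid list should skip this entry,
 * i.e. when by_dev does not have the filter feature for the entry's protocol.
 */
bool mock_skip_vid(const struct mock_dev *by_dev, const struct mock_vid *v)
{
	assert(by_dev && v);

	if (v->proto == htons(MOCK_ETH_P_8021Q))
		return !(by_dev->features & MOCK_F_CTAG_FILTER);
	if (v->proto == htons(MOCK_ETH_P_8021AD))
		return !(by_dev->features & MOCK_F_STAG_FILTER);
	return false;  /* other protocols are not skipped, as in the patch */
}

/* Counts the entries that an add or delete walk would act on. */
size_t mock_count_propagated(const struct mock_dev *by_dev,
			     const struct mock_vid *vids, size_t n)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++) {
		if (mock_skip_vid(by_dev, &vids[i]))
			continue;
		count++;
	}
	return count;
}
```

With a helper of this shape, each of the three loops would need only a single "if (skip) continue;" line, and the add, unwind and delete paths could not drift apart.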