Message ID | 20250208141518.191782-1-ericwouds@gmail.com (mailing list archive) |
---|---|
State | RFC |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [RFC,v1,net-next] net: mlxsw_sp: Use switchdev_handle_port_obj_add_foreign() for vxlan | expand |
On Sat, Feb 08, 2025 at 03:15:18PM +0100, Eric Woudstra wrote: > Sending as RFC as I do not own this hardware. This code is not tested. > > Vladimir found this part of the spectrum switchdev, while looking at > another issue here: > > https://lore.kernel.org/all/20250207220408.zipucrmm2yafj4wu@skbuf/ > > As vxlan seems a foreign port, wouldn't it be better to use > switchdev_handle_port_obj_add_foreign() ? Thanks for the patch, but the VXLAN port is not foreign to the other switch ports. That is, forwarding between these ports and VXLAN happens in hardware. And yes, switchdev_bridge_port_offload() does need to be called for the VXLAN port so that it's assigned the same hardware domain as the other ports.
On Mon, Feb 10, 2025 at 08:48:32AM +0200, Ido Schimmel wrote: > On Sat, Feb 08, 2025 at 03:15:18PM +0100, Eric Woudstra wrote: > > Sending as RFC as I do not own this hardware. This code is not tested. > > > > Vladimir found this part of the spectrum switchdev, while looking at > > another issue here: > > > > https://lore.kernel.org/all/20250207220408.zipucrmm2yafj4wu@skbuf/ > > > > As vxlan seems a foreign port, wouldn't it be better to use > > switchdev_handle_port_obj_add_foreign() ? > > Thanks for the patch, but the VXLAN port is not foreign to the other > switch ports. That is, forwarding between these ports and VXLAN happens > in hardware. And yes, switchdev_bridge_port_offload() does need to be > called for the VXLAN port so that it's assigned the same hardware domain > as the other ports. Thanks, this is useful. I'm not providing a patch yet because there are still things I don't understand. Have you seen any of the typical problems associated with the software bridge thinking vxlan isn't part of the same hwdom as the ingress physical port, and, say, flooding packets twice to vxlan, when the switch had already forwarded a copy of the packet? In almost 4 years since commit 2f5dc00f7a3e ("net: bridge: switchdev: let drivers inform which bridge ports are offloaded"), I would have expected such issues would have been noticed? Do we require a Fixes: tag for this? And then, switchdev_bridge_port_offload() has a brport_dev argument, which would pretty clearly be passed as vxlan_dev by mlxsw_sp_bridge_8021d_vxlan_join() and mlxsw_sp_bridge_vlan_aware_vxlan_join(), but it also has one other "struct net_device *dev" argument, on which br_switchdev_port_offload() wants to call dev_get_port_parent_id(), to see what hwdom (what other bridge ports) to associate it to. Usually we use the mlxsw_sp_port->dev as the second argument, but which port to use here? Any random port that's under the bridge, or is there a specific one for the vxlan that should be used?
On Mon, Feb 10, 2025 at 05:22:46PM +0200, Vladimir Oltean wrote: > On Mon, Feb 10, 2025 at 08:48:32AM +0200, Ido Schimmel wrote: > > On Sat, Feb 08, 2025 at 03:15:18PM +0100, Eric Woudstra wrote: > > > Sending as RFC as I do not own this hardware. This code is not tested. > > > > > > Vladimir found this part of the spectrum switchdev, while looking at > > > another issue here: > > > > > > https://lore.kernel.org/all/20250207220408.zipucrmm2yafj4wu@skbuf/ > > > > > > As vxlan seems a foreign port, wouldn't it be better to use > > > switchdev_handle_port_obj_add_foreign() ? > > > > Thanks for the patch, but the VXLAN port is not foreign to the other > > switch ports. That is, forwarding between these ports and VXLAN happens > > in hardware. And yes, switchdev_bridge_port_offload() does need to be > > called for the VXLAN port so that it's assigned the same hardware domain > > as the other ports. > > Thanks, this is useful. I'm not providing a patch yet because there are > still things I don't understand. > > Have you seen any of the typical problems associated with the software > bridge thinking vxlan isn't part of the same hwdom as the ingress > physical port, and, say, flooding packets twice to vxlan, when the > switch had already forwarded a copy of the packet? In almost 4 years > since commit 2f5dc00f7a3e ("net: bridge: switchdev: let drivers inform > which bridge ports are offloaded"), I would have expected such issues > would have been noticed? I'm aware of one report from QA that is on my list. They configured a VXLAN tunnel that floods packets to two remote VTEPs: 00:00:00:00:00:00 dev vx100 dst 1.1.1.2 00:00:00:00:00:00 dev vx100 dst 1.1.1.3 The underlay routes used to forward the VXLAN encapsulated traffic are: 1.1.1.2 via 10.0.0.2 dev swp13 1.1.1.3 via 10.0.0.6 dev swp15 But they made sure not to configure 10.0.0.6 at the other end. What happens is that traffic for 1.1.1.2 is correctly forwarded in hardware, but traffic for 1.1.1.3 encounters a neighbour miss and trapped to the CPU which then forwards it again to 1.1.1.2. Putting the VXLAN device in the same hardware domain as the other switch ports should solve this "double forwarding" scenario. > Do we require a Fixes: tag for this? It's not strictly a regression (never worked) and it's not that critical IMO, so I prefer targeting net-next. > And then, switchdev_bridge_port_offload() has a brport_dev argument, > which would pretty clearly be passed as vxlan_dev by > mlxsw_sp_bridge_8021d_vxlan_join() and > mlxsw_sp_bridge_vlan_aware_vxlan_join(), but it also has one other > "struct net_device *dev" argument, on which br_switchdev_port_offload() > wants to call dev_get_port_parent_id(), to see what hwdom (what other > bridge ports) to associate it to. Right. > Usually we use the mlxsw_sp_port->dev as the second argument, but which > port to use here? Any random port that's under the bridge, or is there a > specific one for the vxlan that should be used? Any random port is fine as they all share the same parent ID. BTW, I asked Amit (Cced) to look into this as there might be some issues with ARP suppression that will make it a bit more complicated to patch. Are you OK with that or do you want the authorship? We can put any tag you choose. I am asking so that it won't appear like I am trying to discredit you. Thanks!
On Tue, Feb 11, 2025 at 04:31:43PM +0200, Ido Schimmel wrote: > On Mon, Feb 10, 2025 at 05:22:46PM +0200, Vladimir Oltean wrote: > > On Mon, Feb 10, 2025 at 08:48:32AM +0200, Ido Schimmel wrote: > > > On Sat, Feb 08, 2025 at 03:15:18PM +0100, Eric Woudstra wrote: > > > > Sending as RFC as I do not own this hardware. This code is not tested. > > > > > > > > Vladimir found this part of the spectrum switchdev, while looking at > > > > another issue here: > > > > > > > > https://lore.kernel.org/all/20250207220408.zipucrmm2yafj4wu@skbuf/ > > > > > > > > As vxlan seems a foreign port, wouldn't it be better to use > > > > switchdev_handle_port_obj_add_foreign() ? > > > > > > Thanks for the patch, but the VXLAN port is not foreign to the other > > > switch ports. That is, forwarding between these ports and VXLAN happens > > > in hardware. And yes, switchdev_bridge_port_offload() does need to be > > > called for the VXLAN port so that it's assigned the same hardware domain > > > as the other ports. > > > > Thanks, this is useful. I'm not providing a patch yet because there are > > still things I don't understand. > > > > Have you seen any of the typical problems associated with the software > > bridge thinking vxlan isn't part of the same hwdom as the ingress > > physical port, and, say, flooding packets twice to vxlan, when the > > switch had already forwarded a copy of the packet? In almost 4 years > > since commit 2f5dc00f7a3e ("net: bridge: switchdev: let drivers inform > > which bridge ports are offloaded"), I would have expected such issues > > would have been noticed? > > I'm aware of one report from QA that is on my list. They configured a > VXLAN tunnel that floods packets to two remote VTEPs: > > 00:00:00:00:00:00 dev vx100 dst 1.1.1.2 > 00:00:00:00:00:00 dev vx100 dst 1.1.1.3 > > The underlay routes used to forward the VXLAN encapsulated traffic are: > > 1.1.1.2 via 10.0.0.2 dev swp13 > 1.1.1.3 via 10.0.0.6 dev swp15 > > But they made sure not to configure 10.0.0.6 at the other end. What > happens is that traffic for 1.1.1.2 is correctly forwarded in hardware, > but traffic for 1.1.1.3 encounters a neighbour miss and trapped to > the CPU which then forwards it again to 1.1.1.2. > > Putting the VXLAN device in the same hardware domain as the other switch > ports should solve this "double forwarding" scenario. Let me see if I understand what you are saying based on the code (please bear with me, VXLAN tunnels are way outside of my area). So in this case, the packet hits a neighbor miss in the VXLAN underlay and reaches the CPU through the MLXSW_SP_TRAP_EXCEPTION(UNRESOLVED_NEIGH, L3_EXCEPTIONS) trap. That is defined to call mlxsw_sp_rx_mark_listener(), which sets skb->offload_fwd_mark = 1 (aka packet was forwarded in L2) but not skb->offload_l3_fwd_mark (aka packet was not forwarded in L3). This corresponds to expected reality: neighbor miss packets should not re-enter the bridge forwarding path in the overlay towards the VXLAN. Yet they still do, because although skb->offload_fwd_mark is correctly set, it only skips software forwarding towards other physical (and/or LAG) mlxsw bridge ports. If my understanding is correct, then yes, I agree that making the hwdomain of the vxlan tunnel coincide with that of the other bridge ports should suppress this packet. > > Do we require a Fixes: tag for this? > > It's not strictly a regression (never worked) and it's not that critical > IMO, so I prefer targeting net-next. Yes, I agree. Before that commit, the bridge would look by itself at the bridge port, recursively searching for lower interfaces until something returned something positively in dev_get_port_parent_id(). Which was a limiting model. If anything, solving the "double forwarding of vxlan exception packets" issue would have required an alternative switchdev bridge port offloading model anyway, like the explicit one that exists now, which is more flexible. > > And then, switchdev_bridge_port_offload() has a brport_dev argument, > > which would pretty clearly be passed as vxlan_dev by > > mlxsw_sp_bridge_8021d_vxlan_join() and > > mlxsw_sp_bridge_vlan_aware_vxlan_join(), but it also has one other > > "struct net_device *dev" argument, on which br_switchdev_port_offload() > > wants to call dev_get_port_parent_id(), to see what hwdom (what other > > bridge ports) to associate it to. > > Right. > > > Usually we use the mlxsw_sp_port->dev as the second argument, but which > > port to use here? Any random port that's under the bridge, or is there a > > specific one for the vxlan that should be used? > > Any random port is fine as they all share the same parent ID. > > BTW, I asked Amit (Cced) to look into this as there might be some issues > with ARP suppression that will make it a bit more complicated to patch. > Are you OK with that or do you want the authorship? We can put any tag > you choose. I am asking so that it won't appear like I am trying to > discredit you. It is completely fine for Amit to take a look at this, I could really do without yet another thing that is completely over my head :) Just one indication from my side on how I would approach things: Because spectrum bridge ports come and go, and the vxlan tunnel bridge port may stay, its hwdom may change over time (depending on how many cards there are plugged in the system). Also, it probably won't work to combine spectrum bridge ports spanning multiple cards in the same bridge. For those reasons, br_switchdev_port_offload() supports multiple calls made to the same vxlan_dev, and behind the scenes, it will just bump the net_bridge_port->offload_count. Thinking a bit more, I think you want exactly that: a vxlan bridge port needs to have an offload_count equal to the number of other spectrum ports in the same bridge. This way, it only stops being offloaded when there are no spectrum ports left (and the offload_count drops to 0). But this needs handling from both directions: when a vxlan joins a bridge with spectrum ports, and when a spectrum port joins a bridge with vxlan tunnels. Plus every leave operation having to call br_switchdev_port_unoffload(), for things to stay balanced. Probably there are tons of other restrictions to be aware of, but I just wanted to say this, because in the previous email we talked about "just any random port" and it seems like it actually needs to be "all ports".
On Tue, Feb 11, 2025 at 05:23:53PM +0200, Vladimir Oltean wrote: > On Tue, Feb 11, 2025 at 04:31:43PM +0200, Ido Schimmel wrote: > > On Mon, Feb 10, 2025 at 05:22:46PM +0200, Vladimir Oltean wrote: > > > On Mon, Feb 10, 2025 at 08:48:32AM +0200, Ido Schimmel wrote: > > > > On Sat, Feb 08, 2025 at 03:15:18PM +0100, Eric Woudstra wrote: > > > > > Sending as RFC as I do not own this hardware. This code is not tested. > > > > > > > > > > Vladimir found this part of the spectrum switchdev, while looking at > > > > > another issue here: > > > > > > > > > > https://lore.kernel.org/all/20250207220408.zipucrmm2yafj4wu@skbuf/ > > > > > > > > > > As vxlan seems a foreign port, wouldn't it be better to use > > > > > switchdev_handle_port_obj_add_foreign() ? > > > > > > > > Thanks for the patch, but the VXLAN port is not foreign to the other > > > > switch ports. That is, forwarding between these ports and VXLAN happens > > > > in hardware. And yes, switchdev_bridge_port_offload() does need to be > > > > called for the VXLAN port so that it's assigned the same hardware domain > > > > as the other ports. > > > > > > Thanks, this is useful. I'm not providing a patch yet because there are > > > still things I don't understand. > > > > > > Have you seen any of the typical problems associated with the software > > > bridge thinking vxlan isn't part of the same hwdom as the ingress > > > physical port, and, say, flooding packets twice to vxlan, when the > > > switch had already forwarded a copy of the packet? In almost 4 years > > > since commit 2f5dc00f7a3e ("net: bridge: switchdev: let drivers inform > > > which bridge ports are offloaded"), I would have expected such issues > > > would have been noticed? > > > > I'm aware of one report from QA that is on my list. They configured a > > VXLAN tunnel that floods packets to two remote VTEPs: > > > > 00:00:00:00:00:00 dev vx100 dst 1.1.1.2 > > 00:00:00:00:00:00 dev vx100 dst 1.1.1.3 > > > > The underlay routes used to forward the VXLAN encapsulated traffic are: > > > > 1.1.1.2 via 10.0.0.2 dev swp13 > > 1.1.1.3 via 10.0.0.6 dev swp15 > > > > But they made sure not to configure 10.0.0.6 at the other end. What > > happens is that traffic for 1.1.1.2 is correctly forwarded in hardware, > > but traffic for 1.1.1.3 encounters a neighbour miss and trapped to > > the CPU which then forwards it again to 1.1.1.2. > > > > Putting the VXLAN device in the same hardware domain as the other switch > > ports should solve this "double forwarding" scenario. > > Let me see if I understand what you are saying based on the code (please > bear with me, VXLAN tunnels are way outside of my area). > > So in this case, the packet hits a neighbor miss in the VXLAN underlay > and reaches the CPU through the > MLXSW_SP_TRAP_EXCEPTION(UNRESOLVED_NEIGH, L3_EXCEPTIONS) trap. That is > defined to call mlxsw_sp_rx_mark_listener(), which sets > skb->offload_fwd_mark = 1 (aka packet was forwarded in L2) but not > skb->offload_l3_fwd_mark (aka packet was not forwarded in L3). > This corresponds to expected reality: neighbor miss packets should not > re-enter the bridge forwarding path in the overlay towards the VXLAN. > > Yet they still do, because although skb->offload_fwd_mark is correctly > set, it only skips software forwarding towards other physical (and/or > LAG) mlxsw bridge ports. > > If my understanding is correct, then yes, I agree that making the hwdomain > of the vxlan tunnel coincide with that of the other bridge ports should > suppress this packet. Yes, the above is correct. > > > > Do we require a Fixes: tag for this? > > > > It's not strictly a regression (never worked) and it's not that critical > > IMO, so I prefer targeting net-next. > > Yes, I agree. Before that commit, the bridge would look by itself at the > bridge port, recursively searching for lower interfaces until something > returned something positively in dev_get_port_parent_id(). Which was a > limiting model. If anything, solving the "double forwarding of vxlan exception > packets" issue would have required an alternative switchdev bridge port > offloading model anyway, like the explicit one that exists now, which is > more flexible. Yes. > > > > And then, switchdev_bridge_port_offload() has a brport_dev argument, > > > which would pretty clearly be passed as vxlan_dev by > > > mlxsw_sp_bridge_8021d_vxlan_join() and > > > mlxsw_sp_bridge_vlan_aware_vxlan_join(), but it also has one other > > > "struct net_device *dev" argument, on which br_switchdev_port_offload() > > > wants to call dev_get_port_parent_id(), to see what hwdom (what other > > > bridge ports) to associate it to. > > > > Right. > > > > > Usually we use the mlxsw_sp_port->dev as the second argument, but which > > > port to use here? Any random port that's under the bridge, or is there a > > > specific one for the vxlan that should be used? > > > > Any random port is fine as they all share the same parent ID. > > > > BTW, I asked Amit (Cced) to look into this as there might be some issues > > with ARP suppression that will make it a bit more complicated to patch. > > Are you OK with that or do you want the authorship? We can put any tag > > you choose. I am asking so that it won't appear like I am trying to > > discredit you. > > It is completely fine for Amit to take a look at this, I could really do > without yet another thing that is completely over my head :) Great, thanks! > > Just one indication from my side on how I would approach things: > > Because spectrum bridge ports come and go, and the vxlan tunnel bridge > port may stay, its hwdom may change over time (depending on how many > cards there are plugged in the system). Also, it probably won't work to > combine spectrum bridge ports spanning multiple cards in the same > bridge. For those reasons, br_switchdev_port_offload() supports multiple > calls made to the same vxlan_dev, and behind the scenes, it will just > bump the net_bridge_port->offload_count. > > Thinking a bit more, I think you want exactly that: a vxlan bridge port > needs to have an offload_count equal to the number of other spectrum > ports in the same bridge. This way, it only stops being offloaded when > there are no spectrum ports left (and the offload_count drops to 0). > But this needs handling from both directions: when a vxlan joins a > bridge with spectrum ports, and when a spectrum port joins a bridge with > vxlan tunnels. Plus every leave operation having to call > br_switchdev_port_unoffload(), for things to stay balanced. > > Probably there are tons of other restrictions to be aware of, but I just > wanted to say this, because in the previous email we talked about "just > any random port" and it seems like it actually needs to be "all ports". I think that "any random port" is fine as the second argument of switchdev_bridge_port_offload() is only used to derive the parent ID which is the same for all mlxsw ports belonging to the same PCI card. And yes, it won't work if you combine ports from different cards in the same bridge, but it's not really a realistic configuration. Packets entering the bridge via the first card would have to go to the CPU if they need to egress ports belonging to the second card. I don't know why anyone would want that. A more realistic way to support multi-ASIC systems is to reload each devlink instance (i.e., ASIC / card) to a different namespace via something like "devlink dev reload pci/0000:01:00.0 netns ns1" or use PCI passthrough to assign each card to a different VM.
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c index d714311fd884..f03b489f7b99 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.c @@ -4002,6 +4002,12 @@ bool mlxsw_sp_port_dev_check(const struct net_device *dev) return dev->netdev_ops == &mlxsw_sp_port_netdev_ops; } +bool mlxsw_sp_foreign_dev_check(const struct net_device *dev, + const struct net_device *foreign_dev) +{ + return netif_is_vxlan(foreign_dev); +} + static int mlxsw_sp_lower_dev_walk(struct net_device *lower_dev, struct netdev_nested_priv *priv) { diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h index b10f80fc651b..4264468c5a6f 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum.h +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum.h @@ -709,6 +709,8 @@ int mlxsw_sp_flow_counter_alloc(struct mlxsw_sp *mlxsw_sp, void mlxsw_sp_flow_counter_free(struct mlxsw_sp *mlxsw_sp, unsigned int counter_index); bool mlxsw_sp_port_dev_check(const struct net_device *dev); +bool mlxsw_sp_foreign_dev_check(const struct net_device *dev, + const struct net_device *foreign_dev); struct mlxsw_sp *mlxsw_sp_lower_get(struct net_device *dev); struct mlxsw_sp_port *mlxsw_sp_port_dev_lower_find(struct net_device *dev); struct mlxsw_sp_port *mlxsw_sp_port_dev_lower_find_rcu(struct net_device *dev); diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c index 6397ff0dc951..0ab80cb22bc4 100644 --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c @@ -2252,6 +2252,14 @@ mlxsw_sp_port_mrouter_update_mdb(struct mlxsw_sp_port *mlxsw_sp_port, } } +static int +mlxsw_sp_switchdev_handle_vxlan_obj_add(struct net_device *vxlan_dev, + const struct switchdev_obj *obj, + struct netlink_ext_ack *extack); +static void +mlxsw_sp_switchdev_handle_vxlan_obj_del(struct net_device *vxlan_dev, + const struct switchdev_obj *obj); + static int mlxsw_sp_port_obj_add(struct net_device *dev, const void *ctx, const struct switchdev_obj *obj, struct netlink_ext_ack *extack) @@ -2262,16 +2270,20 @@ static int mlxsw_sp_port_obj_add(struct net_device *dev, const void *ctx, switch (obj->id) { case SWITCHDEV_OBJ_ID_PORT_VLAN: - vlan = SWITCHDEV_OBJ_PORT_VLAN(obj); - - err = mlxsw_sp_port_vlans_add(mlxsw_sp_port, vlan, extack); - - /* The event is emitted before the changes are actually - * applied to the bridge. Therefore schedule the respin - * call for later, so that the respin logic sees the - * updated bridge state. - */ - mlxsw_sp_span_respin(mlxsw_sp_port->mlxsw_sp); + if (netif_is_vxlan(dev)) { + err = mlxsw_sp_switchdev_handle_vxlan_obj_add(dev, obj, extack); + } else { + vlan = SWITCHDEV_OBJ_PORT_VLAN(obj); + + err = mlxsw_sp_port_vlans_add(mlxsw_sp_port, vlan, extack); + + /* The event is emitted before the changes are actually + * applied to the bridge. Therefore schedule the respin + * call for later, so that the respin logic sees the + * updated bridge state. + */ + mlxsw_sp_span_respin(mlxsw_sp_port->mlxsw_sp); + } break; case SWITCHDEV_OBJ_ID_PORT_MDB: err = mlxsw_sp_port_mdb_add(mlxsw_sp_port, @@ -2401,8 +2413,12 @@ static int mlxsw_sp_port_obj_del(struct net_device *dev, const void *ctx, switch (obj->id) { case SWITCHDEV_OBJ_ID_PORT_VLAN: - err = mlxsw_sp_port_vlans_del(mlxsw_sp_port, - SWITCHDEV_OBJ_PORT_VLAN(obj)); + if (netif_is_vxlan(dev)) { + mlxsw_sp_switchdev_handle_vxlan_obj_del(dev, obj); + } else { + err = mlxsw_sp_port_vlans_del(mlxsw_sp_port, + SWITCHDEV_OBJ_PORT_VLAN(obj)); + } break; case SWITCHDEV_OBJ_ID_PORT_MDB: err = mlxsw_sp_port_mdb_del(mlxsw_sp_port, @@ -3931,19 +3947,17 @@ mlxsw_sp_switchdev_vxlan_vlan_del(struct mlxsw_sp *mlxsw_sp, static int mlxsw_sp_switchdev_vxlan_vlans_add(struct net_device *vxlan_dev, - struct switchdev_notifier_port_obj_info * - port_obj_info) + const struct switchdev_obj *obj, + struct netlink_ext_ack *extack) { struct switchdev_obj_port_vlan *vlan = - SWITCHDEV_OBJ_PORT_VLAN(port_obj_info->obj); + SWITCHDEV_OBJ_PORT_VLAN(obj); bool flag_untagged = vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED; bool flag_pvid = vlan->flags & BRIDGE_VLAN_INFO_PVID; struct mlxsw_sp_bridge_device *bridge_device; - struct netlink_ext_ack *extack; struct mlxsw_sp *mlxsw_sp; struct net_device *br_dev; - extack = switchdev_notifier_info_to_extack(&port_obj_info->info); br_dev = netdev_master_upper_dev_get(vxlan_dev); if (!br_dev) return 0; @@ -3952,8 +3966,6 @@ mlxsw_sp_switchdev_vxlan_vlans_add(struct net_device *vxlan_dev, if (!mlxsw_sp) return 0; - port_obj_info->handled = true; - bridge_device = mlxsw_sp_bridge_device_find(mlxsw_sp->bridge, br_dev); if (!bridge_device) return -EINVAL; @@ -3969,11 +3981,10 @@ mlxsw_sp_switchdev_vxlan_vlans_add(struct net_device *vxlan_dev, static void mlxsw_sp_switchdev_vxlan_vlans_del(struct net_device *vxlan_dev, - struct switchdev_notifier_port_obj_info * - port_obj_info) + const struct switchdev_obj *obj) { struct switchdev_obj_port_vlan *vlan = - SWITCHDEV_OBJ_PORT_VLAN(port_obj_info->obj); + SWITCHDEV_OBJ_PORT_VLAN(obj); struct mlxsw_sp_bridge_device *bridge_device; struct mlxsw_sp *mlxsw_sp; struct net_device *br_dev; @@ -3986,8 +3997,6 @@ mlxsw_sp_switchdev_vxlan_vlans_del(struct net_device *vxlan_dev, if (!mlxsw_sp) return; - port_obj_info->handled = true; - bridge_device = mlxsw_sp_bridge_device_find(mlxsw_sp->bridge, br_dev); if (!bridge_device) return; @@ -4001,15 +4010,15 @@ mlxsw_sp_switchdev_vxlan_vlans_del(struct net_device *vxlan_dev, static int mlxsw_sp_switchdev_handle_vxlan_obj_add(struct net_device *vxlan_dev, - struct switchdev_notifier_port_obj_info * - port_obj_info) + const struct switchdev_obj *obj, + struct netlink_ext_ack *extack) { int err = 0; - switch (port_obj_info->obj->id) { + switch (obj->id) { case SWITCHDEV_OBJ_ID_PORT_VLAN: err = mlxsw_sp_switchdev_vxlan_vlans_add(vxlan_dev, - port_obj_info); + obj, extack); break; default: break; @@ -4020,12 +4029,11 @@ mlxsw_sp_switchdev_handle_vxlan_obj_add(struct net_device *vxlan_dev, static void mlxsw_sp_switchdev_handle_vxlan_obj_del(struct net_device *vxlan_dev, - struct switchdev_notifier_port_obj_info * - port_obj_info) + const struct switchdev_obj *obj) { - switch (port_obj_info->obj->id) { + switch (obj->id) { case SWITCHDEV_OBJ_ID_PORT_VLAN: - mlxsw_sp_switchdev_vxlan_vlans_del(vxlan_dev, port_obj_info); + mlxsw_sp_switchdev_vxlan_vlans_del(vxlan_dev, obj); break; default: break; @@ -4040,20 +4048,16 @@ static int mlxsw_sp_switchdev_blocking_event(struct notifier_block *unused, switch (event) { case SWITCHDEV_PORT_OBJ_ADD: - if (netif_is_vxlan(dev)) - err = mlxsw_sp_switchdev_handle_vxlan_obj_add(dev, ptr); - else - err = switchdev_handle_port_obj_add(dev, ptr, - mlxsw_sp_port_dev_check, - mlxsw_sp_port_obj_add); + err = switchdev_handle_port_obj_add_foreign(dev, ptr, + mlxsw_sp_port_dev_check, + mlxsw_sp_foreign_dev_check, + mlxsw_sp_port_obj_add); return notifier_from_errno(err); case SWITCHDEV_PORT_OBJ_DEL: - if (netif_is_vxlan(dev)) - mlxsw_sp_switchdev_handle_vxlan_obj_del(dev, ptr); - else - err = switchdev_handle_port_obj_del(dev, ptr, - mlxsw_sp_port_dev_check, - mlxsw_sp_port_obj_del); + err = switchdev_handle_port_obj_del_foreign(dev, ptr, + mlxsw_sp_port_dev_check, + mlxsw_sp_foreign_dev_check, + mlxsw_sp_port_obj_del); return notifier_from_errno(err); case SWITCHDEV_PORT_ATTR_SET: err = switchdev_handle_port_attr_set(dev, ptr,
Sending as RFC as I do not own this hardware. This code is not tested. Vladimir found this part of the spectrum switchdev, while looking at another issue here: https://lore.kernel.org/all/20250207220408.zipucrmm2yafj4wu@skbuf/ As vxlan seems a foreign port, wouldn't it be better to use switchdev_handle_port_obj_add_foreign() ? Note that port_obj_info->handled = true is set in __switchdev_handle_port_obj_add() and can be removed. Signed-off-by: Eric Woudstra <ericwouds@gmail.com> --- .../net/ethernet/mellanox/mlxsw/spectrum.c | 6 ++ .../net/ethernet/mellanox/mlxsw/spectrum.h | 2 + .../mellanox/mlxsw/spectrum_switchdev.c | 92 ++++++++++--------- 3 files changed, 56 insertions(+), 44 deletions(-)