Message ID | 20241023123215.5875-1-liuhangbin@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Series | [PATCHv2,net] bonding: add ns target multicast address to slave device |
Hangbin Liu <liuhangbin@gmail.com> wrote:

>Commit 4598380f9c54 ("bonding: fix ns validation on backup slaves")
>tried to resolve the issue where backup slaves couldn't be brought up when
>receiving IPv6 Neighbor Solicitation (NS) messages. However, this fix only
>worked for drivers that receive all multicast messages, such as the veth
>interface.
>
>For standard drivers, the NS multicast message is silently dropped because
>the slave device is not a member of the NS target multicast group.
>
>To address this, we need to make the slave device join the NS target
>multicast group, ensuring it can receive these IPv6 NS messages to validate
>the slave’s status properly.
>
>There are three policies before joining the multicast group:
>1. All settings must be under active-backup mode (alb and tlb do not support
>   arp_validate), with backup slaves and slaves supporting multicast.
>2. We can add or remove multicast groups when arp_validate changes.
>3. Other operations, such as enslaving, releasing, or setting NS targets,
>   need to be guarded by arp_validate.
>
>Fixes: 4e24be018eb9 ("bonding: add new parameter ns_targets")
>Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
>---
>v2: only add/del mcast group on backup slaves when arp_validate is set (Jay Vosburgh)

Sorry for the delay in responding, I've been traveling.

For the above, I suspect I wasn't sufficiently clear in my commentary; what I meant wasn't just checking arp_validate being enabled, but that the implementation could be much less complex if it simply kept all of the multicast addresses added to the backup interface (in addition to the active interface) when arp_validate is enabled.

I suspect the set of multicast addresses involved is likely to be small in the usual case, so the question then is whether the presumably small amount of traffic that inadvertently passes the filter (and is then thrown away by the kernel RX logic) is worth the complexity added here.

That said, I have a few questions below.

> arp_validate doesn't support 3ad, tlb, alb. So let's only do it on ab mode. >--- > drivers/net/bonding/bond_main.c | 18 +++++- > drivers/net/bonding/bond_options.c | 95 +++++++++++++++++++++++++++++- > include/net/bond_options.h | 1 + > 3 files changed, 112 insertions(+), 2 deletions(-) > >diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c >index b1bffd8e9a95..d7c1016619f9 100644 >--- a/drivers/net/bonding/bond_main.c >+++ b/drivers/net/bonding/bond_main.c >@@ -1008,6 +1008,9 @@ static void bond_hw_addr_swap(struct bonding *bond, struct slave *new_active, > > if (bond->dev->flags & IFF_UP) > bond_hw_addr_flush(bond->dev, old_active->dev); >+ >+ /* add target NS maddrs for backup slave */ >+ slave_set_ns_maddrs(bond, old_active, true); > } > > if (new_active) { >@@ -1024,6 +1027,9 @@ static void bond_hw_addr_swap(struct bonding *bond, struct slave *new_active, > dev_mc_sync(new_active->dev, bond->dev); > netif_addr_unlock_bh(bond->dev); > } >+ >+ /* clear target NS maddrs for active slave */ >+ slave_set_ns_maddrs(bond, new_active, false); > } > } > >@@ -2341,6 +2347,12 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, > bond_compute_features(bond); > bond_set_carrier(bond); > >+ /* set target NS maddrs for new slave, need to be called before >+ * bond_select_active_slave(), which will remove the maddr if >+ * the slave is selected as active slave >+ */ >+ slave_set_ns_maddrs(bond, new_slave, true); >+ > if (bond_uses_primary(bond)) { > block_netpoll_tx(); > bond_select_active_slave(bond); >@@ -2350,7 +2362,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, > if (bond_mode_can_use_xmit_hash(bond)) > bond_update_slave_arr(bond, NULL); > >- > if (!slave_dev->netdev_ops->ndo_bpf || > !slave_dev->netdev_ops->ndo_xdp_xmit) { > if (bond->xdp_prog) { >@@ -2548,6 +2559,11 @@ static int __bond_release_one(struct net_device *bond_dev, > if (oldcurrent == slave) > 
bond_change_active_slave(bond, NULL); > >+ /* clear target NS maddrs, must after bond_change_active_slave() >+ * as we need to clear the maddrs on backup slave >+ */ >+ slave_set_ns_maddrs(bond, slave, false); >+ > if (bond_is_lb(bond)) { > /* Must be called only after the slave has been > * detached from the list and the curr_active_slave >diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c >index 95d59a18c022..2554ba70f092 100644 >--- a/drivers/net/bonding/bond_options.c >+++ b/drivers/net/bonding/bond_options.c >@@ -1234,6 +1234,75 @@ static int bond_option_arp_ip_targets_set(struct bonding *bond, > } > > #if IS_ENABLED(CONFIG_IPV6) >+/* convert IPv6 address to link-local solicited-node multicast mac address */ >+static void ipv6_addr_to_solicited_mac(const struct in6_addr *addr, >+ unsigned char mac[ETH_ALEN]) >+{ >+ mac[0] = 0x33; >+ mac[1] = 0x33; >+ mac[2] = 0xFF; >+ mac[3] = addr->s6_addr[13]; >+ mac[4] = addr->s6_addr[14]; >+ mac[5] = addr->s6_addr[15]; >+} Can we make use of ndisc_mc_map() / ipv6_eth_mc_map() to perform this step, instead of creating a new function that's almost the same? 
>+ >+static bool slave_can_set_ns_maddr(struct bonding *bond, struct slave *slave) >+{ >+ return BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP && >+ !bond_is_active_slave(slave) && >+ slave->dev->flags & IFF_MULTICAST; >+} >+ >+static void _slave_set_ns_maddrs(struct bonding *bond, struct slave *slave, bool add) >+{ >+ struct in6_addr *targets = bond->params.ns_targets; >+ unsigned char slot_maddr[ETH_ALEN]; >+ int i; >+ >+ if (!slave_can_set_ns_maddr(bond, slave)) >+ return; >+ >+ for (i = 0; i < BOND_MAX_NS_TARGETS; i++) { >+ if (ipv6_addr_any(&targets[i])) >+ break; >+ >+ ipv6_addr_to_solicited_mac(&targets[i], slot_maddr); >+ if (add) >+ dev_mc_add(slave->dev, slot_maddr); >+ else >+ dev_mc_del(slave->dev, slot_maddr); >+ } >+} >+ >+void slave_set_ns_maddrs(struct bonding *bond, struct slave *slave, bool add) >+{ >+ if (!bond->params.arp_validate) >+ return; >+ >+ _slave_set_ns_maddrs(bond, slave, add); >+} Why does this need a wrapper function vs. having the arp_validate test be first in the larger function? 
-J >+ >+static void slave_set_ns_maddr(struct bonding *bond, struct slave *slave, >+ struct in6_addr *target, struct in6_addr *slot) >+{ >+ unsigned char target_maddr[ETH_ALEN], slot_maddr[ETH_ALEN]; >+ >+ if (!bond->params.arp_validate || !slave_can_set_ns_maddr(bond, slave)) >+ return; >+ >+ /* remove the previous maddr on salve */ >+ if (!ipv6_addr_any(slot)) { >+ ipv6_addr_to_solicited_mac(slot, slot_maddr); >+ dev_mc_del(slave->dev, slot_maddr); >+ } >+ >+ /* add new maddr on slave if target is set */ >+ if (!ipv6_addr_any(target)) { >+ ipv6_addr_to_solicited_mac(target, target_maddr); >+ dev_mc_add(slave->dev, target_maddr); >+ } >+} >+ > static void _bond_options_ns_ip6_target_set(struct bonding *bond, int slot, > struct in6_addr *target, > unsigned long last_rx) >@@ -1243,8 +1312,10 @@ static void _bond_options_ns_ip6_target_set(struct bonding *bond, int slot, > struct slave *slave; > > if (slot >= 0 && slot < BOND_MAX_NS_TARGETS) { >- bond_for_each_slave(bond, slave, iter) >+ bond_for_each_slave(bond, slave, iter) { > slave->target_last_arp_rx[slot] = last_rx; >+ slave_set_ns_maddr(bond, slave, target, &targets[slot]); >+ } > targets[slot] = *target; > } > } >@@ -1296,15 +1367,37 @@ static int bond_option_ns_ip6_targets_set(struct bonding *bond, > { > return -EPERM; > } >+ >+static void _slave_set_ns_maddrs(struct bonding *bond, struct slave *slave, bool add) >+{ >+} >+ >+void slave_set_ns_maddrs(struct bonding *bond, struct slave *slave, bool add) >+{ >+} > #endif > > static int bond_option_arp_validate_set(struct bonding *bond, > const struct bond_opt_value *newval) > { >+ bool changed = (bond->params.arp_validate == 0 && newval->value != 0) || >+ (bond->params.arp_validate != 0 && newval->value == 0); >+ struct list_head *iter; >+ struct slave *slave; >+ > netdev_dbg(bond->dev, "Setting arp_validate to %s (%llu)\n", > newval->string, newval->value); > bond->params.arp_validate = newval->value; > >+ if (changed) { >+ bond_for_each_slave(bond, slave, 
iter) { >+ if (bond->params.arp_validate) >+ _slave_set_ns_maddrs(bond, slave, true); >+ else >+ _slave_set_ns_maddrs(bond, slave, false); >+ } >+ } >+ > return 0; > } > >diff --git a/include/net/bond_options.h b/include/net/bond_options.h >index 473a0147769e..59a91d12cd57 100644 >--- a/include/net/bond_options.h >+++ b/include/net/bond_options.h >@@ -161,5 +161,6 @@ void bond_option_arp_ip_targets_clear(struct bonding *bond); > #if IS_ENABLED(CONFIG_IPV6) > void bond_option_ns_ip6_targets_clear(struct bonding *bond); > #endif >+void slave_set_ns_maddrs(struct bonding *bond, struct slave *slave, bool add); > > #endif /* _NET_BOND_OPTIONS_H */ >-- >2.46.0 --- -Jay Vosburgh, jv@jvosburgh.net
Hi Jay, On Wed, Oct 30, 2024 at 05:21:05PM +0100, Jay Vosburgh wrote: > I suspect the set of multicast addresses involved is likely to > be small in the usual case, so the question then is whether the > presumably small amount of traffic that inadvertently passes the filter > (and is then thrown away by the kernel RX logic) is worth the complexity > added here. Yes, while the code and logic may be complex, the "small amount of traffic", specifically, the IPv6 NS messages, plays a crucial role in determining whether backup slaves are up or not. Without these messages, it would be akin to dropping ARP traffic for IPv4, which could lead to connectivity issues. > > That said, I have a few questions below. > > > arp_validate doesn't support 3ad, tlb, alb. So let's only do it on ab mode. > >--- > > drivers/net/bonding/bond_main.c | 18 +++++- > > drivers/net/bonding/bond_options.c | 95 +++++++++++++++++++++++++++++- > > include/net/bond_options.h | 1 + > > 3 files changed, 112 insertions(+), 2 deletions(-) > > > >diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > >index b1bffd8e9a95..d7c1016619f9 100644 > >--- a/drivers/net/bonding/bond_main.c > >+++ b/drivers/net/bonding/bond_main.c > >@@ -1008,6 +1008,9 @@ static void bond_hw_addr_swap(struct bonding *bond, struct slave *new_active, > > > > if (bond->dev->flags & IFF_UP) > > bond_hw_addr_flush(bond->dev, old_active->dev); > >+ > >+ /* add target NS maddrs for backup slave */ > >+ slave_set_ns_maddrs(bond, old_active, true); > > } > > > > if (new_active) { > >@@ -1024,6 +1027,9 @@ static void bond_hw_addr_swap(struct bonding *bond, struct slave *new_active, > > dev_mc_sync(new_active->dev, bond->dev); > > netif_addr_unlock_bh(bond->dev); > > } > >+ > >+ /* clear target NS maddrs for active slave */ > >+ slave_set_ns_maddrs(bond, new_active, false); > > } > > } > > > >@@ -2341,6 +2347,12 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, > > 
bond_compute_features(bond); > > bond_set_carrier(bond); > > > >+ /* set target NS maddrs for new slave, need to be called before > >+ * bond_select_active_slave(), which will remove the maddr if > >+ * the slave is selected as active slave > >+ */ > >+ slave_set_ns_maddrs(bond, new_slave, true); > >+ > > if (bond_uses_primary(bond)) { > > block_netpoll_tx(); > > bond_select_active_slave(bond); > >@@ -2350,7 +2362,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev, > > if (bond_mode_can_use_xmit_hash(bond)) > > bond_update_slave_arr(bond, NULL); > > > >- > > if (!slave_dev->netdev_ops->ndo_bpf || > > !slave_dev->netdev_ops->ndo_xdp_xmit) { > > if (bond->xdp_prog) { > >@@ -2548,6 +2559,11 @@ static int __bond_release_one(struct net_device *bond_dev, > > if (oldcurrent == slave) > > bond_change_active_slave(bond, NULL); > > > >+ /* clear target NS maddrs, must after bond_change_active_slave() > >+ * as we need to clear the maddrs on backup slave > >+ */ > >+ slave_set_ns_maddrs(bond, slave, false); > >+ > > if (bond_is_lb(bond)) { > > /* Must be called only after the slave has been > > * detached from the list and the curr_active_slave > >diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c > >index 95d59a18c022..2554ba70f092 100644 > >--- a/drivers/net/bonding/bond_options.c > >+++ b/drivers/net/bonding/bond_options.c > >@@ -1234,6 +1234,75 @@ static int bond_option_arp_ip_targets_set(struct bonding *bond, > > } > > > > #if IS_ENABLED(CONFIG_IPV6) > >+/* convert IPv6 address to link-local solicited-node multicast mac address */ > >+static void ipv6_addr_to_solicited_mac(const struct in6_addr *addr, > >+ unsigned char mac[ETH_ALEN]) > >+{ > >+ mac[0] = 0x33; > >+ mac[1] = 0x33; > >+ mac[2] = 0xFF; > >+ mac[3] = addr->s6_addr[13]; > >+ mac[4] = addr->s6_addr[14]; > >+ mac[5] = addr->s6_addr[15]; > >+} > > Can we make use of ndisc_mc_map() / ipv6_eth_mc_map() to perform > this step, instead of creating a new 
function that's almost the same?

Ah, yes, I think so. Thanks for the tip.

> >+void slave_set_ns_maddrs(struct bonding *bond, struct slave *slave, bool add)
> >+{
> >+	if (!bond->params.arp_validate)
> >+		return;
> >+
> >+	_slave_set_ns_maddrs(bond, slave, add);
> >+}
>
> Why does this need a wrapper function vs. having the
> arp_validate test be first in the larger function?

slave_set_ns_maddrs() is called from four places, so I think the wrapper saves some code. I'm fine with removing the wrapper if you think the code would be simpler.

Thanks
Hangbin
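For readers following the exchange about the address helper, the mapping under discussion can be sketched in user space as two steps: derive the solicited-node multicast address (ff02::1:ffXX:XXXX, per RFC 4291) from the NS target, then map that multicast address to an Ethernet address (33:33 followed by the low 32 bits, per RFC 2464). This is only an illustration. `struct ip6`, `solicited_node_mcast()` and `ip6_mcast_to_eth()` are hypothetical stand-ins invented for this sketch; they are not the kernel helpers named in the review.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define ETH_ALEN 6

/* Hypothetical stand-in for the kernel's struct in6_addr. */
struct ip6 {
	uint8_t b[16];
};

/* Step 1: build the solicited-node multicast address ff02::1:ffXX:XXXX,
 * where XX:XXXX are the low 24 bits of the target address.
 */
static void solicited_node_mcast(const struct ip6 *target, struct ip6 *out)
{
	assert(target && out);

	memset(out->b, 0, sizeof(out->b));
	out->b[0] = 0xff;
	out->b[1] = 0x02;
	out->b[11] = 0x01;
	out->b[12] = 0xff;
	memcpy(&out->b[13], &target->b[13], 3);
}

/* Step 2: map an IPv6 multicast address to an Ethernet multicast address:
 * 33:33 followed by the low 32 bits of the IPv6 address.
 */
static void ip6_mcast_to_eth(const struct ip6 *mcast, uint8_t mac[ETH_ALEN])
{
	assert(mcast && mac);

	mac[0] = 0x33;
	mac[1] = 0x33;
	memcpy(&mac[2], &mcast->b[12], 4);
}
```

Chaining the two steps for a target whose last three bytes are ab:cd:ef gives 33:33:ff:ab:cd:ef, which matches what the open-coded `ipv6_addr_to_solicited_mac()` in the patch computes directly from bytes 13 to 15 of the target.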
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index b1bffd8e9a95..d7c1016619f9 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -1008,6 +1008,9 @@ static void bond_hw_addr_swap(struct bonding *bond, struct slave *new_active,
 
 		if (bond->dev->flags & IFF_UP)
 			bond_hw_addr_flush(bond->dev, old_active->dev);
+
+		/* add target NS maddrs for backup slave */
+		slave_set_ns_maddrs(bond, old_active, true);
 	}
 
 	if (new_active) {
@@ -1024,6 +1027,9 @@ static void bond_hw_addr_swap(struct bonding *bond, struct slave *new_active,
 			dev_mc_sync(new_active->dev, bond->dev);
 			netif_addr_unlock_bh(bond->dev);
 		}
+
+		/* clear target NS maddrs for active slave */
+		slave_set_ns_maddrs(bond, new_active, false);
 	}
 }
 
@@ -2341,6 +2347,12 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
 	bond_compute_features(bond);
 	bond_set_carrier(bond);
 
+	/* set target NS maddrs for new slave, need to be called before
+	 * bond_select_active_slave(), which will remove the maddr if
+	 * the slave is selected as active slave
+	 */
+	slave_set_ns_maddrs(bond, new_slave, true);
+
 	if (bond_uses_primary(bond)) {
 		block_netpoll_tx();
 		bond_select_active_slave(bond);
@@ -2350,7 +2362,6 @@ int bond_enslave(struct net_device *bond_dev, struct net_device *slave_dev,
 	if (bond_mode_can_use_xmit_hash(bond))
 		bond_update_slave_arr(bond, NULL);
 
-
 	if (!slave_dev->netdev_ops->ndo_bpf ||
 	    !slave_dev->netdev_ops->ndo_xdp_xmit) {
 		if (bond->xdp_prog) {
@@ -2548,6 +2559,11 @@ static int __bond_release_one(struct net_device *bond_dev,
 	if (oldcurrent == slave)
 		bond_change_active_slave(bond, NULL);
 
+	/* clear target NS maddrs, must after bond_change_active_slave()
+	 * as we need to clear the maddrs on backup slave
+	 */
+	slave_set_ns_maddrs(bond, slave, false);
+
 	if (bond_is_lb(bond)) {
 		/* Must be called only after the slave has been
 		 * detached from the list and the curr_active_slave
diff --git a/drivers/net/bonding/bond_options.c b/drivers/net/bonding/bond_options.c
index 95d59a18c022..2554ba70f092 100644
--- a/drivers/net/bonding/bond_options.c
+++ b/drivers/net/bonding/bond_options.c
@@ -1234,6 +1234,75 @@ static int bond_option_arp_ip_targets_set(struct bonding *bond,
 }
 
 #if IS_ENABLED(CONFIG_IPV6)
+/* convert IPv6 address to link-local solicited-node multicast mac address */
+static void ipv6_addr_to_solicited_mac(const struct in6_addr *addr,
+				       unsigned char mac[ETH_ALEN])
+{
+	mac[0] = 0x33;
+	mac[1] = 0x33;
+	mac[2] = 0xFF;
+	mac[3] = addr->s6_addr[13];
+	mac[4] = addr->s6_addr[14];
+	mac[5] = addr->s6_addr[15];
+}
+
+static bool slave_can_set_ns_maddr(struct bonding *bond, struct slave *slave)
+{
+	return BOND_MODE(bond) == BOND_MODE_ACTIVEBACKUP &&
+	       !bond_is_active_slave(slave) &&
+	       slave->dev->flags & IFF_MULTICAST;
+}
+
+static void _slave_set_ns_maddrs(struct bonding *bond, struct slave *slave, bool add)
+{
+	struct in6_addr *targets = bond->params.ns_targets;
+	unsigned char slot_maddr[ETH_ALEN];
+	int i;
+
+	if (!slave_can_set_ns_maddr(bond, slave))
+		return;
+
+	for (i = 0; i < BOND_MAX_NS_TARGETS; i++) {
+		if (ipv6_addr_any(&targets[i]))
+			break;
+
+		ipv6_addr_to_solicited_mac(&targets[i], slot_maddr);
+		if (add)
+			dev_mc_add(slave->dev, slot_maddr);
+		else
+			dev_mc_del(slave->dev, slot_maddr);
+	}
+}
+
+void slave_set_ns_maddrs(struct bonding *bond, struct slave *slave, bool add)
+{
+	if (!bond->params.arp_validate)
+		return;
+
+	_slave_set_ns_maddrs(bond, slave, add);
+}
+
+static void slave_set_ns_maddr(struct bonding *bond, struct slave *slave,
+			       struct in6_addr *target, struct in6_addr *slot)
+{
+	unsigned char target_maddr[ETH_ALEN], slot_maddr[ETH_ALEN];
+
+	if (!bond->params.arp_validate || !slave_can_set_ns_maddr(bond, slave))
+		return;
+
+	/* remove the previous maddr on salve */
+	if (!ipv6_addr_any(slot)) {
+		ipv6_addr_to_solicited_mac(slot, slot_maddr);
+		dev_mc_del(slave->dev, slot_maddr);
+	}
+
+	/* add new maddr on slave if target is set */
+	if (!ipv6_addr_any(target)) {
+		ipv6_addr_to_solicited_mac(target, target_maddr);
+		dev_mc_add(slave->dev, target_maddr);
+	}
+}
+
 static void _bond_options_ns_ip6_target_set(struct bonding *bond, int slot,
 					    struct in6_addr *target,
 					    unsigned long last_rx)
@@ -1243,8 +1312,10 @@ static void _bond_options_ns_ip6_target_set(struct bonding *bond, int slot,
 	struct slave *slave;
 
 	if (slot >= 0 && slot < BOND_MAX_NS_TARGETS) {
-		bond_for_each_slave(bond, slave, iter)
+		bond_for_each_slave(bond, slave, iter) {
 			slave->target_last_arp_rx[slot] = last_rx;
+			slave_set_ns_maddr(bond, slave, target, &targets[slot]);
+		}
 		targets[slot] = *target;
 	}
 }
@@ -1296,15 +1367,37 @@ static int bond_option_ns_ip6_targets_set(struct bonding *bond,
 {
 	return -EPERM;
 }
+
+static void _slave_set_ns_maddrs(struct bonding *bond, struct slave *slave, bool add)
+{
+}
+
+void slave_set_ns_maddrs(struct bonding *bond, struct slave *slave, bool add)
+{
+}
 #endif
 
 static int bond_option_arp_validate_set(struct bonding *bond,
 					const struct bond_opt_value *newval)
 {
+	bool changed = (bond->params.arp_validate == 0 && newval->value != 0) ||
+		       (bond->params.arp_validate != 0 && newval->value == 0);
+	struct list_head *iter;
+	struct slave *slave;
+
 	netdev_dbg(bond->dev, "Setting arp_validate to %s (%llu)\n",
 		   newval->string, newval->value);
 	bond->params.arp_validate = newval->value;
 
+	if (changed) {
+		bond_for_each_slave(bond, slave, iter) {
+			if (bond->params.arp_validate)
+				_slave_set_ns_maddrs(bond, slave, true);
+			else
+				_slave_set_ns_maddrs(bond, slave, false);
+		}
+	}
+
 	return 0;
 }
 
diff --git a/include/net/bond_options.h b/include/net/bond_options.h
index 473a0147769e..59a91d12cd57 100644
--- a/include/net/bond_options.h
+++ b/include/net/bond_options.h
@@ -161,5 +161,6 @@ void bond_option_arp_ip_targets_clear(struct bonding *bond);
 #if IS_ENABLED(CONFIG_IPV6)
 void bond_option_ns_ip6_targets_clear(struct bonding *bond);
 #endif
+void slave_set_ns_maddrs(struct bonding *bond, struct slave *slave, bool add);
 
 #endif /* _NET_BOND_OPTIONS_H */
Commit 4598380f9c54 ("bonding: fix ns validation on backup slaves")
tried to resolve the issue where backup slaves couldn't be brought up when
receiving IPv6 Neighbor Solicitation (NS) messages. However, this fix only
worked for drivers that receive all multicast messages, such as the veth
interface.

For standard drivers, the NS multicast message is silently dropped because
the slave device is not a member of the NS target multicast group.

To address this, we need to make the slave device join the NS target
multicast group, ensuring it can receive these IPv6 NS messages to validate
the slave’s status properly.

There are three policies before joining the multicast group:
1. All settings must be under active-backup mode (alb and tlb do not support
   arp_validate), with backup slaves and slaves supporting multicast.
2. We can add or remove multicast groups when arp_validate changes.
3. Other operations, such as enslaving, releasing, or setting NS targets,
   need to be guarded by arp_validate.

Fixes: 4e24be018eb9 ("bonding: add new parameter ns_targets")
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
---
v2: only add/del mcast group on backup slaves when arp_validate is set (Jay Vosburgh)
    arp_validate doesn't support 3ad, tlb, alb. So let's only do it on ab mode.
---
 drivers/net/bonding/bond_main.c    | 18 +++++-
 drivers/net/bonding/bond_options.c | 95 +++++++++++++++++++++++++++++-
 include/net/bond_options.h         |  1 +
 3 files changed, 112 insertions(+), 2 deletions(-)
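The three policies listed in the commit message can be modeled outside the kernel as plain predicates, which may make the intended decision table easier to check. This is a minimal sketch under stated assumptions: every type and function name below (`model_bond`, `model_slave`, `model_slave_eligible()`, and so on) is hypothetical and exists only for this illustration; none of them are bonding driver symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; these are not the kernel's types. */
enum model_mode { MODEL_ACTIVE_BACKUP, MODEL_OTHER };

struct model_bond {
	enum model_mode mode;
	unsigned int arp_validate;	/* 0 means validation is disabled */
};

struct model_slave {
	bool is_active;
	bool multicast_capable;
};

enum maddr_action { MADDR_NONE, MADDR_ADD, MADDR_DEL };

/* Policy 1: active-backup mode only, backup slaves only, and the
 * slave device must support multicast.
 */
static bool model_slave_eligible(const struct model_bond *bond,
				 const struct model_slave *slave)
{
	assert(bond && slave);

	return bond->mode == MODEL_ACTIVE_BACKUP &&
	       !slave->is_active &&
	       slave->multicast_capable;
}

/* Policy 3: enslave, release and NS target updates are additionally
 * guarded by arp_validate being enabled.
 */
static bool model_guarded_update_allowed(const struct model_bond *bond,
					 const struct model_slave *slave)
{
	return bond->arp_validate != 0 && model_slave_eligible(bond, slave);
}

/* Policy 2: a change of arp_validate between zero and non-zero triggers
 * adding or removing the groups; other changes do nothing.
 */
static enum maddr_action model_arp_validate_change(unsigned int old_val,
						   unsigned int new_val)
{
	if (old_val == 0 && new_val != 0)
		return MADDR_ADD;
	if (old_val != 0 && new_val == 0)
		return MADDR_DEL;
	return MADDR_NONE;
}
```

In this model, once `arp_validate` has been cleared, `model_guarded_update_allowed()` returns false for every slave while `model_slave_eligible()` can still return true. That may be why the posted patch has the `arp_validate` option handler call the unguarded `_slave_set_ns_maddrs()` rather than the guarded wrapper: the option value has already been updated by the time the groups need to be removed.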