Message ID | 20240626075156.2565966-1-liuhangbin@gmail.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [PATCHv3,net-next] bonding: 3ad: send ifinfo notify when mux state changed | expand |
On 26/06/2024 10:51, Hangbin Liu wrote: > Currently, administrators need to retrieve LACP mux state changes from > the kernel DEBUG log using netdev_dbg and slave_dbg macros. To simplify > this process, let's send the ifinfo notification whenever the mux state > changes. This will enable users to directly access and monitor this > information using the ip monitor command. > > Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> > --- > v3: forgot to use GFP_ATOMIC. (Nikolay Aleksandrov) > export symbol for rtmsg_ifinfo. It's weird that my build succeed with > tools/testing/selftests/drivers/net/bonding/config without export > the symbol, but build failed with tools/testing/selftests/net/config. > v2: don't use call_netdevice_notifiers as it will case sleeping in atomic > context (Nikolay Aleksandrov) > > After this patch, we can see the following info with `ip -d monitor link` > > 7: veth1@if6: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP group default > link/ether 02:0a:04:c2:d6:21 brd ff:ff:ff:ff:ff:ff link-netns b promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535 > veth > bond_slave state BACKUP mii_status UP ... ad_aggregator_id 1 ad_actor_oper_port_state 143 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,expired> ad_partner_oper_port_state 55 ad_partner_oper_port_state_str <active,short_timeout,aggregating,collecting,distributing> ... > 7: veth1@if6: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP group default > link/ether 02:0a:04:c2:d6:21 brd ff:ff:ff:ff:ff:ff link-netns b promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535 > veth > bond_slave state ACTIVE mii_status UP ... ad_aggregator_id 1 ad_actor_oper_port_state 79 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,defaulted> ad_partner_oper_port_state 1 ad_partner_oper_port_state_str <active> ... > 7: veth1@if6: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP group default > link/ether 02:0a:04:c2:d6:21 brd ff:ff:ff:ff:ff:ff link-netns b promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535 > veth > bond_slave state ACTIVE mii_status UP ... ad_aggregator_id 1 ad_actor_oper_port_state 63 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ad_partner_oper_port_state 63 ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ... > --- > drivers/net/bonding/bond_3ad.c | 3 +++ > net/core/rtnetlink.c | 1 + > 2 files changed, 4 insertions(+) > > diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c > index c6807e473ab7..b57c5670b31a 100644 > --- a/drivers/net/bonding/bond_3ad.c > +++ b/drivers/net/bonding/bond_3ad.c > @@ -11,6 +11,7 @@ > #include <linux/etherdevice.h> > #include <linux/if_bonding.h> > #include <linux/pkt_sched.h> > +#include <linux/rtnetlink.h> > #include <net/net_namespace.h> > #include <net/bonding.h> > #include <net/bond_3ad.h> > @@ -1185,6 +1186,8 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr) > default: > break; > } > + > + rtmsg_ifinfo(RTM_NEWLINK, port->slave->dev, 0, GFP_ATOMIC, 0, NULL); > } > } > > diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c > index eabfc8290f5e..4507bb8d5264 100644 > --- a/net/core/rtnetlink.c > +++ b/net/core/rtnetlink.c > @@ -4116,6 +4116,7 @@ void rtmsg_ifinfo(int type, struct net_device *dev, unsigned int change, > rtmsg_ifinfo_event(type, dev, change, rtnl_get_event(0), flags, > NULL, 0, portid, nlh); > } > +EXPORT_SYMBOL(rtmsg_ifinfo); > > void rtmsg_ifinfo_newnet(int type, struct net_device *dev, unsigned int change, > gfp_t flags, int *new_nsid, int new_ifindex) Looks good to me, thanks! Acked-by: Nikolay Aleksandrov <razor@blackwall.org>
Hangbin Liu <liuhangbin@gmail.com> wrote: >Currently, administrators need to retrieve LACP mux state changes from >the kernel DEBUG log using netdev_dbg and slave_dbg macros. To simplify >this process, let's send the ifinfo notification whenever the mux state >changes. This will enable users to directly access and monitor this >information using the ip monitor command. > >Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> >--- >v3: forgot to use GFP_ATOMIC. (Nikolay Aleksandrov) > export symbol for rtmsg_ifinfo. It's weird that my build succeed with > tools/testing/selftests/drivers/net/bonding/config without export > the symbol, but build failed with tools/testing/selftests/net/config. I would hazard to guess that bonding/config works without export because it has CONFIG_BONDING=y which builds bonding into the main image (not as a module), which wouldn't need the EXPORT_SYMBOL. I think the change is fine, the only question is whether it's better to have a wrapper for rtmsg_ifinfo() in net/core/dev.c (where all current callers are). I don't see a particular need, but others might want some consistency. Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com> -J >v2: don't use call_netdevice_notifiers as it will case sleeping in atomic > context (Nikolay Aleksandrov) > >After this patch, we can see the following info with `ip -d monitor link` > >7: veth1@if6: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP group default > link/ether 02:0a:04:c2:d6:21 brd ff:ff:ff:ff:ff:ff link-netns b promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535 > veth > bond_slave state BACKUP mii_status UP ... ad_aggregator_id 1 ad_actor_oper_port_state 143 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,expired> ad_partner_oper_port_state 55 ad_partner_oper_port_state_str <active,short_timeout,aggregating,collecting,distributing> ... >7: veth1@if6: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP group default > link/ether 02:0a:04:c2:d6:21 brd ff:ff:ff:ff:ff:ff link-netns b promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535 > veth > bond_slave state ACTIVE mii_status UP ... ad_aggregator_id 1 ad_actor_oper_port_state 79 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,defaulted> ad_partner_oper_port_state 1 ad_partner_oper_port_state_str <active> ... >7: veth1@if6: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP group default > link/ether 02:0a:04:c2:d6:21 brd ff:ff:ff:ff:ff:ff link-netns b promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535 > veth > bond_slave state ACTIVE mii_status UP ... ad_aggregator_id 1 ad_actor_oper_port_state 63 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ad_partner_oper_port_state 63 ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ... >--- > drivers/net/bonding/bond_3ad.c | 3 +++ > net/core/rtnetlink.c | 1 + > 2 files changed, 4 insertions(+) > >diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c >index c6807e473ab7..b57c5670b31a 100644 >--- a/drivers/net/bonding/bond_3ad.c >+++ b/drivers/net/bonding/bond_3ad.c >@@ -11,6 +11,7 @@ > #include <linux/etherdevice.h> > #include <linux/if_bonding.h> > #include <linux/pkt_sched.h> >+#include <linux/rtnetlink.h> > #include <net/net_namespace.h> > #include <net/bonding.h> > #include <net/bond_3ad.h> >@@ -1185,6 +1186,8 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr) > default: > break; > } >+ >+ rtmsg_ifinfo(RTM_NEWLINK, port->slave->dev, 0, GFP_ATOMIC, 0, NULL); > } > } > >diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c >index eabfc8290f5e..4507bb8d5264 100644 >--- a/net/core/rtnetlink.c >+++ b/net/core/rtnetlink.c >@@ -4116,6 +4116,7 @@ void rtmsg_ifinfo(int type, struct net_device *dev, unsigned int change, > rtmsg_ifinfo_event(type, dev, change, rtnl_get_event(0), flags, > NULL, 0, portid, nlh); > } >+EXPORT_SYMBOL(rtmsg_ifinfo); > > void rtmsg_ifinfo_newnet(int type, struct net_device *dev, unsigned int change, > gfp_t flags, int *new_nsid, int new_ifindex) >-- >2.45.0 > --- -Jay Vosburgh, jay.vosburgh@canonical.com
On Wed, 26 Jun 2024 15:51:56 +0800 Hangbin Liu wrote: > Currently, administrators need to retrieve LACP mux state changes from > the kernel DEBUG log using netdev_dbg and slave_dbg macros. To simplify > this process, let's send the ifinfo notification whenever the mux state > changes. This will enable users to directly access and monitor this > information using the ip monitor command. Hits: RTNL: assertion failed at net/core/rtnetlink.c (1823) On two selftests. Please run the selftests on a debug kernel..
Jakub Kicinski <kuba@kernel.org> wrote: >On Wed, 26 Jun 2024 15:51:56 +0800 Hangbin Liu wrote: >> Currently, administrators need to retrieve LACP mux state changes from >> the kernel DEBUG log using netdev_dbg and slave_dbg macros. To simplify >> this process, let's send the ifinfo notification whenever the mux state >> changes. This will enable users to directly access and monitor this >> information using the ip monitor command. > >Hits: > >RTNL: assertion failed at net/core/rtnetlink.c (1823) > >On two selftests. Please run the selftests on a debug kernel.. Oh, I forgot about needing RTNL. We cannot simply acquire RTNL in ad_mux_machine(), as the bond->mode_lock is already held, and the lock ordering must be RTNL first, then mode_lock, lest we deadlock. Hangbin, I'd suggest you look at how bond_netdev_notify_work() complies with the lock ordering (basically, doing the actual work out of line in a workqueue event), or how the "should_notify" flag is used in bond_3ad_state_machine_handler(). The first is more complicated, but won't skip events; the second may miss intermediate state transitions if it cannot acquire RTNL and has to delay the notification. -J --- -Jay Vosburgh, jay.vosburgh@canonical.com
On Wed, Jun 26, 2024 at 05:06:00PM -0700, Jay Vosburgh wrote: > >Hits: > > > >RTNL: assertion failed at net/core/rtnetlink.c (1823) Thanks for this hits... > > > >On two selftests. Please run the selftests on a debug kernel.. OK, I will try run my tests on debug kernel in future. > > Oh, I forgot about needing RTNL. > > We cannot simply acquire RTNL in ad_mux_machine(), as the > bond->mode_lock is already held, and the lock ordering must be RTNL > first, then mode_lock, lest we deadlock. > > Hangbin, I'd suggest you look at how bond_netdev_notify_work() > complies with the lock ordering (basically, doing the actual work out of > line in a workqueue event), or how the "should_notify" flag is used in > bond_3ad_state_machine_handler(). The first is more complicated, but > won't skip events; the second may miss intermediate state transitions if > it cannot acquire RTNL and has to delay the notification. I think the administer doesn't want to loose the state change info. So how about something like: diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c index c6807e473ab7..046f11c5c47a 100644 --- a/drivers/net/bonding/bond_3ad.c +++ b/drivers/net/bonding/bond_3ad.c @@ -1185,6 +1185,8 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr) default: break; } + + port->slave->should_notify_lacp = 1; } } @@ -2527,6 +2529,9 @@ void bond_3ad_state_machine_handler(struct work_struct *work) bond_slave_state_notify(bond); rtnl_unlock(); } + + /* Notify the mux state changes */ + bond_slave_link_notify(bond); queue_delayed_work(bond->wq, &bond->ad_work, ad_delta_in_ticks); } diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 3c3fcce4acd4..db8f2fb613df 100644 --- a/drivers/net/bonding/bond_main.c +++ b/drivers/net/bonding/bond_main.c @@ -1748,11 +1748,19 @@ static void bond_netdev_notify_work(struct work_struct *_work) notify_work.work); if (rtnl_trylock()) { - struct netdev_bonding_info binfo; + if (slave->should_notify_link) { + struct netdev_bonding_info binfo; + bond_fill_ifslave(slave, &binfo.slave); + bond_fill_ifbond(slave->bond, &binfo.master); + netdev_bonding_info_change(slave->dev, &binfo); + slave->should_notify_link = 0; + } + + if (slave->should_notify_lacp) { + rtmsg_ifinfo(RTM_NEWLINK, slave->dev, 0, GFP_KERNEL, 0, NULL); + slave->should_notify_lacp = 0; + } - bond_fill_ifslave(slave, &binfo.slave); - bond_fill_ifbond(slave->bond, &binfo.master); - netdev_bonding_info_change(slave->dev, &binfo); rtnl_unlock(); } else { queue_delayed_work(slave->bond->wq, &slave->notify_work, 1); diff --git a/include/net/bonding.h b/include/net/bonding.h index b61fb1aa3a56..38d37ea2382c 100644 --- a/include/net/bonding.h +++ b/include/net/bonding.h @@ -170,7 +170,8 @@ struct slave { inactive:1, /* indicates inactive slave */ rx_disabled:1, /* indicates whether slave's Rx is disabled */ should_notify:1, /* indicates whether the state changed */ - should_notify_link:1; /* indicates whether the link changed */ + should_notify_link:1, /* indicates whether the link changed */ + should_notify_lacp:1; /* indicates whether the lacp state changed */ u8 duplex; u32 original_mtu; u32 link_failure_count; @@ -641,11 +642,10 @@ static inline void bond_slave_link_notify(struct bonding *bond) struct slave *tmp; bond_for_each_slave(bond, tmp, iter) { - if (tmp->should_notify_link) { + if (tmp->should_notify_link || tmp->should_notify_lacp) bond_queue_slave_event(tmp); + if (tmp->should_notify_link) bond_lower_state_changed(tmp); - tmp->should_notify_link = 0; - } } } diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index eabfc8290f5e..4507bb8d5264 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -4116,6 +4116,7 @@ void rtmsg_ifinfo(int type, struct net_device *dev, unsigned int change, rtmsg_ifinfo_event(type, dev, change, rtnl_get_event(0), flags, NULL, 0, portid, nlh); } +EXPORT_SYMBOL(rtmsg_ifinfo); void rtmsg_ifinfo_newnet(int type, struct net_device *dev, unsigned int change, gfp_t flags, int *new_nsid, int new_ifindex)
On 27/06/2024 11:26, Hangbin Liu wrote: > On Wed, Jun 26, 2024 at 05:06:00PM -0700, Jay Vosburgh wrote: >>> Hits: >>> >>> RTNL: assertion failed at net/core/rtnetlink.c (1823) > > Thanks for this hits... > >>> >>> On two selftests. Please run the selftests on a debug kernel.. > > OK, I will try run my tests on debug kernel in future. > >> >> Oh, I forgot about needing RTNL. >> +1 & facepalm, completely forgot it was running without rtnl >> We cannot simply acquire RTNL in ad_mux_machine(), as the >> bond->mode_lock is already held, and the lock ordering must be RTNL >> first, then mode_lock, lest we deadlock. >> >> Hangbin, I'd suggest you look at how bond_netdev_notify_work() >> complies with the lock ordering (basically, doing the actual work out of >> line in a workqueue event), or how the "should_notify" flag is used in >> bond_3ad_state_machine_handler(). The first is more complicated, but >> won't skip events; the second may miss intermediate state transitions if >> it cannot acquire RTNL and has to delay the notification. > > I think the administer doesn't want to loose the state change info. So how > about something like: > You can (and will) miss events with the below code. It is kind of best effort, but if the notification is not run before the next state change, you will lose the intermediate changes. > diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c > index c6807e473ab7..046f11c5c47a 100644 > --- a/drivers/net/bonding/bond_3ad.c > +++ b/drivers/net/bonding/bond_3ad.c > @@ -1185,6 +1185,8 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr) > default: > break; > } > + > + port->slave->should_notify_lacp = 1; > } > } > > @@ -2527,6 +2529,9 @@ void bond_3ad_state_machine_handler(struct work_struct *work) > bond_slave_state_notify(bond); > rtnl_unlock(); > } > + > + /* Notify the mux state changes */ > + bond_slave_link_notify(bond); > queue_delayed_work(bond->wq, &bond->ad_work, ad_delta_in_ticks); > } > > diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c > index 3c3fcce4acd4..db8f2fb613df 100644 > --- a/drivers/net/bonding/bond_main.c > +++ b/drivers/net/bonding/bond_main.c > @@ -1748,11 +1748,19 @@ static void bond_netdev_notify_work(struct work_struct *_work) > notify_work.work); > > if (rtnl_trylock()) { > - struct netdev_bonding_info binfo; > + if (slave->should_notify_link) { > + struct netdev_bonding_info binfo; > + bond_fill_ifslave(slave, &binfo.slave); > + bond_fill_ifbond(slave->bond, &binfo.master); > + netdev_bonding_info_change(slave->dev, &binfo); > + slave->should_notify_link = 0; > + } > + > + if (slave->should_notify_lacp) { > + rtmsg_ifinfo(RTM_NEWLINK, slave->dev, 0, GFP_KERNEL, 0, NULL); > + slave->should_notify_lacp = 0; > + } > > - bond_fill_ifslave(slave, &binfo.slave); > - bond_fill_ifbond(slave->bond, &binfo.master); > - netdev_bonding_info_change(slave->dev, &binfo); > rtnl_unlock(); > } else { > queue_delayed_work(slave->bond->wq, &slave->notify_work, 1); > diff --git a/include/net/bonding.h b/include/net/bonding.h > index b61fb1aa3a56..38d37ea2382c 100644 > --- a/include/net/bonding.h > +++ b/include/net/bonding.h > @@ -170,7 +170,8 @@ struct slave { > inactive:1, /* indicates inactive slave */ > rx_disabled:1, /* indicates whether slave's Rx is disabled */ > should_notify:1, /* indicates whether the state changed */ > - should_notify_link:1; /* indicates whether the link changed */ > + should_notify_link:1, /* indicates whether the link changed */ > + should_notify_lacp:1; /* indicates whether the lacp state changed */ > u8 duplex; > u32 original_mtu; > u32 link_failure_count; > @@ -641,11 +642,10 @@ static inline void bond_slave_link_notify(struct bonding *bond) > struct slave *tmp; > > bond_for_each_slave(bond, tmp, iter) { > - if (tmp->should_notify_link) { > + if (tmp->should_notify_link || tmp->should_notify_lacp) > bond_queue_slave_event(tmp); > + if (tmp->should_notify_link) > bond_lower_state_changed(tmp); > - tmp->should_notify_link = 0; > - } > } > } > > diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c > index eabfc8290f5e..4507bb8d5264 100644 > --- a/net/core/rtnetlink.c > +++ b/net/core/rtnetlink.c > @@ -4116,6 +4116,7 @@ void rtmsg_ifinfo(int type, struct net_device *dev, unsigned int change, > rtmsg_ifinfo_event(type, dev, change, rtnl_get_event(0), flags, > NULL, 0, portid, nlh); > } > +EXPORT_SYMBOL(rtmsg_ifinfo); > > void rtmsg_ifinfo_newnet(int type, struct net_device *dev, unsigned int change, > gfp_t flags, int *new_nsid, int new_ifindex)
On Thu, Jun 27, 2024 at 11:29:21AM +0300, Nikolay Aleksandrov wrote: > On 27/06/2024 11:26, Hangbin Liu wrote: > > On Wed, Jun 26, 2024 at 05:06:00PM -0700, Jay Vosburgh wrote: > >>> Hits: > >>> > >>> RTNL: assertion failed at net/core/rtnetlink.c (1823) > > > > Thanks for this hits... > > > >>> > >>> On two selftests. Please run the selftests on a debug kernel.. > > > > OK, I will try run my tests on debug kernel in future. > > > >> > >> Oh, I forgot about needing RTNL. > >> > > +1 & facepalm, completely forgot it was running without rtnl > > >> We cannot simply acquire RTNL in ad_mux_machine(), as the > >> bond->mode_lock is already held, and the lock ordering must be RTNL > >> first, then mode_lock, lest we deadlock. > >> > >> Hangbin, I'd suggest you look at how bond_netdev_notify_work() > >> complies with the lock ordering (basically, doing the actual work out of > >> line in a workqueue event), or how the "should_notify" flag is used in > >> bond_3ad_state_machine_handler(). The first is more complicated, but > >> won't skip events; the second may miss intermediate state transitions if > >> it cannot acquire RTNL and has to delay the notification. > > > > I think the administer doesn't want to loose the state change info. So how > > about something like: > > > > You can (and will) miss events with the below code. It is kind of best effort, > but if the notification is not run before the next state change, you will > lose the intermediate changes. Yes, but at least the admin could get the latest state. With the following code the admin may not get the latest update if lock rtnl failed. if (should_notify_rtnl && rtnl_trylock()) { bond_slave_lacp_notify(bond); rtnl_unlock(); } Thanks Hangbin
On 27/06/2024 13:05, Hangbin Liu wrote: > On Thu, Jun 27, 2024 at 11:29:21AM +0300, Nikolay Aleksandrov wrote: >> On 27/06/2024 11:26, Hangbin Liu wrote: >>> On Wed, Jun 26, 2024 at 05:06:00PM -0700, Jay Vosburgh wrote: >>>>> Hits: >>>>> >>>>> RTNL: assertion failed at net/core/rtnetlink.c (1823) >>> >>> Thanks for this hits... >>> >>>>> >>>>> On two selftests. Please run the selftests on a debug kernel.. >>> >>> OK, I will try run my tests on debug kernel in future. >>> >>>> >>>> Oh, I forgot about needing RTNL. >>>> >> >> +1 & facepalm, completely forgot it was running without rtnl >> >>>> We cannot simply acquire RTNL in ad_mux_machine(), as the >>>> bond->mode_lock is already held, and the lock ordering must be RTNL >>>> first, then mode_lock, lest we deadlock. >>>> >>>> Hangbin, I'd suggest you look at how bond_netdev_notify_work() >>>> complies with the lock ordering (basically, doing the actual work out of >>>> line in a workqueue event), or how the "should_notify" flag is used in >>>> bond_3ad_state_machine_handler(). The first is more complicated, but >>>> won't skip events; the second may miss intermediate state transitions if >>>> it cannot acquire RTNL and has to delay the notification. >>> >>> I think the administer doesn't want to loose the state change info. So how >>> about something like: >>> >> >> You can (and will) miss events with the below code. It is kind of best effort, >> but if the notification is not run before the next state change, you will >> lose the intermediate changes. > > Yes, but at least the admin could get the latest state. With the following > code the admin may not get the latest update if lock rtnl failed. > > if (should_notify_rtnl && rtnl_trylock()) { > bond_slave_lacp_notify(bond); > rtnl_unlock(); > } > > Thanks > Hangbin Well, you mentioned administrators want to see the state changes, please better clarify the exact end goal. Note that technically may even not be the last state as the state change itself happens in parallel (different locks) and any update could be delayed depending on rtnl availability and workqueue re-scheduling. But sure, they will get some update at some point. :) It all depends on what are the requirements. An uglier but lockless alternative would be to poll the slave's sysfs oper state, that doesn't require any locks and would be up-to-date. Cheers, Nik
On Thu, Jun 27, 2024 at 01:33:10PM +0300, Nikolay Aleksandrov wrote: > > Yes, but at least the admin could get the latest state. With the following > > code the admin may not get the latest update if lock rtnl failed. > > > > if (should_notify_rtnl && rtnl_trylock()) { > > bond_slave_lacp_notify(bond); > > rtnl_unlock(); > > } > > > Well, you mentioned administrators want to see the state changes, please > better clarify the exact end goal. Note that technically may even not be > the last state as the state change itself happens in parallel (different > locks) and any update could be delayed depending on rtnl availability > and workqueue re-scheduling. But sure, they will get some update at some point. :) Ah.. Yes, that's a sad fact :( > > It all depends on what are the requirements. > > An uglier but lockless alternative would be to poll the slave's sysfs oper state, > that doesn't require any locks and would be up-to-date. Hmm, that's a workaround, but the admin need to poll the state frequently as they don't know when the state will change. Hi Jay, are you OK to add this sysfs in bonding? Thanks Hangbin
On 27/06/2024 16:17, Hangbin Liu wrote: > On Thu, Jun 27, 2024 at 01:33:10PM +0300, Nikolay Aleksandrov wrote: >>> Yes, but at least the admin could get the latest state. With the following >>> code the admin may not get the latest update if lock rtnl failed. >>> >>> if (should_notify_rtnl && rtnl_trylock()) { >>> bond_slave_lacp_notify(bond); >>> rtnl_unlock(); >>> } >>> >> Well, you mentioned administrators want to see the state changes, please >> better clarify the exact end goal. Note that technically may even not be >> the last state as the state change itself happens in parallel (different >> locks) and any update could be delayed depending on rtnl availability >> and workqueue re-scheduling. But sure, they will get some update at some point. :) > > Ah.. Yes, that's a sad fact :( >> >> It all depends on what are the requirements. >> >> An uglier but lockless alternative would be to poll the slave's sysfs oper state, >> that doesn't require any locks and would be up-to-date. > > Hmm, that's a workaround, but the admin need to poll the state frequently as > they don't know when the state will change. > Oh wait, that wasn't what I was proposing, I was thinking about the port's oper state which is already available via a sysfs attribute. Generally sysfs is getting deprecated. > Hi Jay, are you OK to add this sysfs in bonding? > > Thanks > Hangbin
Hangbin Liu <liuhangbin@gmail.com> wrote: >On Thu, Jun 27, 2024 at 01:33:10PM +0300, Nikolay Aleksandrov wrote: >> > Yes, but at least the admin could get the latest state. With the following >> > code the admin may not get the latest update if lock rtnl failed. >> > >> > if (should_notify_rtnl && rtnl_trylock()) { >> > bond_slave_lacp_notify(bond); >> > rtnl_unlock(); >> > } >> > >> Well, you mentioned administrators want to see the state changes, please >> better clarify the exact end goal. Note that technically may even not be >> the last state as the state change itself happens in parallel (different >> locks) and any update could be delayed depending on rtnl availability >> and workqueue re-scheduling. But sure, they will get some update at some point. :) > >Ah.. Yes, that's a sad fact :( There are basically two paths that will change the LACP state that's passed up via netlink (the aggregator ID, and actor and partner oper port states): bond_3ad_state_machine_handler(), or incoming LACPDUs, which call into ad_rx_machine(). Administrative changes to the bond will do it too, like adding or removing interfaces, but those originate in user space and aren't happening asynchronously. If you want (almost) absolute reliability in communicating every state change for the state machine and LACPDU processing, I think you'd have to (a) create an object with the changed state, (b) queue it somewhere, then (c) call a workqueue event to process that queue out of line. >> It all depends on what are the requirements. >> >> An uglier but lockless alternative would be to poll the slave's sysfs oper state, >> that doesn't require any locks and would be up-to-date. > >Hmm, that's a workaround, but the admin need to poll the state frequently as >they don't know when the state will change. > >Hi Jay, are you OK to add this sysfs in bonding? I think what Nik is proposing is for your userspace to poll the /sys/class/net/${DEV}/operstate. -J --- -Jay Vosburgh, jay.vosburgh@canonical.com
On Thu, Jun 27, 2024 at 07:24:10AM -0700, Jay Vosburgh wrote: > Hangbin Liu <liuhangbin@gmail.com> wrote: > >Ah.. Yes, that's a sad fact :( > > There are basically two paths that will change the LACP state > that's passed up via netlink (the aggregator ID, and actor and partner > oper port states): bond_3ad_state_machine_handler(), or incoming > LACPDUs, which call into ad_rx_machine(). Administrative changes to the Ah, thanks, I didn't notice this. I will also enable lacp notify in ad_rx_machine(). > bond will do it too, like adding or removing interfaces, but those > originate in user space and aren't happening asynchronously. > > If you want (almost) absolute reliability in communicating every > state change for the state machine and LACPDU processing, I think you'd > have to (a) create an object with the changed state, (b) queue it > somewhere, then (c) call a workqueue event to process that queue out of > line. Hmm... This looks too complex. If we store all the states. A frequent flashing may consume the memory. If we made a limit for the queue, we may still loosing some state changes. I'm not sure which way is better. > > >> It all depends on what are the requirements. > >> > >> An uglier but lockless alternative would be to poll the slave's sysfs oper state, > >> that doesn't require any locks and would be up-to-date. > > > >Hmm, that's a workaround, but the admin need to poll the state frequently as > >they don't know when the state will change. > > > >Hi Jay, are you OK to add this sysfs in bonding? > > I think what Nik is proposing is for your userspace to poll the > /sys/class/net/${DEV}/operstate. OK. There are 2 scenarios I got. 1) the local user want to get the local/partner state and make sure not send pkts before they are in DISTRIBUTING state to avoid pkts drop, Or vice versa. Only checking link operstate or up/down status is not enough. 2) the admin want to get the switch/partner status via LACP status incase the switch is crashed. Do you have any suggestion for the implementation? Thanks Hangbin
On 28/06/2024 06:10, Hangbin Liu wrote: > On Thu, Jun 27, 2024 at 07:24:10AM -0700, Jay Vosburgh wrote: >> Hangbin Liu <liuhangbin@gmail.com> wrote: >>> Ah.. Yes, that's a sad fact :( >> >> There are basically two paths that will change the LACP state >> that's passed up via netlink (the aggregator ID, and actor and partner >> oper port states): bond_3ad_state_machine_handler(), or incoming >> LACPDUs, which call into ad_rx_machine(). Administrative changes to the > > Ah, thanks, I didn't notice this. I will also enable lacp notify > in ad_rx_machine(). > >> bond will do it too, like adding or removing interfaces, but those >> originate in user space and aren't happening asynchronously. >> >> If you want (almost) absolute reliability in communicating every >> state change for the state machine and LACPDU processing, I think you'd >> have to (a) create an object with the changed state, (b) queue it >> somewhere, then (c) call a workqueue event to process that queue out of >> line. > > Hmm... This looks too complex. If we store all the states. A frequent flashing > may consume the memory. If we made a limit for the queue, we may still loosing > some state changes. > > I'm not sure which way is better. > >> >>>> It all depends on what are the requirements. >>>> >>>> An uglier but lockless alternative would be to poll the slave's sysfs oper state, >>>> that doesn't require any locks and would be up-to-date. >>> >>> Hmm, that's a workaround, but the admin need to poll the state frequently as >>> they don't know when the state will change. >>> >>> Hi Jay, are you OK to add this sysfs in bonding? >> >> I think what Nik is proposing is for your userspace to poll the >> /sys/class/net/${DEV}/operstate. > Actually I was talking about: /sys/class/net/<bond port>/bonding_slave/ad_actor_oper_port_state /sys/class/net/<bond port>/bonding_slave/ad_partner_oper_port_state etc Wouldn't these work for you? > OK. There are 2 scenarios I got. > > 1) the local user want to get the local/partner state and make sure not > send pkts before they are in DISTRIBUTING state to avoid pkts drop, Or vice > versa. Only checking link operstate or up/down status is not enough. > > 2) the admin want to get the switch/partner status via LACP status incase > the switch is crashed. > > Do you have any suggestion for the implementation? > > Thanks > Hangbin
On 28/06/2024 10:04, Nikolay Aleksandrov wrote: > On 28/06/2024 06:10, Hangbin Liu wrote: >> On Thu, Jun 27, 2024 at 07:24:10AM -0700, Jay Vosburgh wrote: >>> Hangbin Liu <liuhangbin@gmail.com> wrote: >>>> Ah.. Yes, that's a sad fact :( >>> >>> There are basically two paths that will change the LACP state >>> that's passed up via netlink (the aggregator ID, and actor and partner >>> oper port states): bond_3ad_state_machine_handler(), or incoming >>> LACPDUs, which call into ad_rx_machine(). Administrative changes to the >> >> Ah, thanks, I didn't notice this. I will also enable lacp notify >> in ad_rx_machine(). >> >>> bond will do it too, like adding or removing interfaces, but those >>> originate in user space and aren't happening asynchronously. >>> >>> If you want (almost) absolute reliability in communicating every >>> state change for the state machine and LACPDU processing, I think you'd >>> have to (a) create an object with the changed state, (b) queue it >>> somewhere, then (c) call a workqueue event to process that queue out of >>> line. >> >> Hmm... This looks too complex. If we store all the states. A frequent flashing >> may consume the memory. If we made a limit for the queue, we may still loosing >> some state changes. >> >> I'm not sure which way is better. >> >>> >>>>> It all depends on what are the requirements. >>>>> >>>>> An uglier but lockless alternative would be to poll the slave's sysfs oper state, >>>>> that doesn't require any locks and would be up-to-date. >>>> >>>> Hmm, that's a workaround, but the admin need to poll the state frequently as >>>> they don't know when the state will change. >>>> >>>> Hi Jay, are you OK to add this sysfs in bonding? >>> >>> I think what Nik is proposing is for your userspace to poll the >>> /sys/class/net/${DEV}/operstate. >> > > Actually I was talking about: > /sys/class/net/<bond port>/bonding_slave/ad_actor_oper_port_state > /sys/class/net/<bond port>/bonding_slave/ad_partner_oper_port_state > etc > > Wouldn't these work for you? > But it gets much more complicated, I guess it will be easier to read the proc bond file with all the LACP information. That is under RCU only as well. >> OK. There are 2 scenarios I got. >> >> 1) the local user want to get the local/partner state and make sure not >> send pkts before they are in DISTRIBUTING state to avoid pkts drop, Or vice >> versa. Only checking link operstate or up/down status is not enough. >> >> 2) the admin want to get the switch/partner status via LACP status incase >> the switch is crashed. >> >> Do you have any suggestion for the implementation? >> >> Thanks >> Hangbin >
Hi Nikolay, On Fri, Jun 28, 2024 at 10:22:25AM +0300, Nikolay Aleksandrov wrote: > > Actually I was talking about: > > /sys/class/net/<bond port>/bonding_slave/ad_actor_oper_port_state > > /sys/class/net/<bond port>/bonding_slave/ad_partner_oper_port_state > > etc > > > > Wouldn't these work for you? > > > > But it gets much more complicated, I guess it will be easier to read the > proc bond file with all the LACP information. That is under RCU only as > well. Good question. The monitor application want a more elegant/general way to deal with the LACP state and do other network reconfigurations. Here is the requirement I got from customer. 1) As a server administrator, I want ip monitor to show state change events related to LACP bonds so that I can react quickly to network reconfigurations. 2) As a network monitoring application developer, I want my application to be notified about LACP bond operational state changes without having to poll /proc/net/bonding/<bond> and parse its output so that it can trigger predefined failover remediation policies. 3) As a server administrator, I want my LACP bond monitoring application to receive a Netlink-based notification whenever the number of member interfaces is reduced so that the operations support system can provision a member interface replacement. What I understand is the user/admin need to know the latest stable state so they can do some other network configuration based on the status. Losing a middle state notification during fast changes is acceptable. > Well, you mentioned administrators want to see the state changes, please > better clarify the exact end goal. Note that technically may even not be > the last state as the state change itself happens in parallel (different > locks) and any update could be delayed depending on rtnl availability > and workqueue re-scheduling. But sure, they will get some update at some point. :) Would you please help explain why we may not get the latest state? From what I understand: 1) State A -> B, queue notify rtnl_trylock, fail, queue again 2) State B -> C, queue notify rtnl_trylock, success, post current state C 3) State C -> D, queue notify rtnl_trylock, fail, queue again 4) State D -> A, queue notify rtnl_trylock, fail, queue again rtnl_trylock, fail, queue again rtnl_trylock, success, post current state A So how could the step 3) state send but step 4) state not send? BTW, in my code, I should set the should_notify_lacp = 0 first before sending ifinfo message. So that even the should_notify_lacp = 1 in ad_mux_machine() is over written here, it still send the latest status. > + > + if (slave->should_notify_lacp) { > + slave->should_notify_lacp = 0; > + rtmsg_ifinfo(RTM_NEWLINK, slave->dev, 0, GFP_KERNEL, 0, NULL); > + } The side effect is that we may send 2 same latest lacp status(the should_notify_lacp is over written to 1 and queue again), which should be OK. Thanks Hangbin
Hangbin Liu <liuhangbin@gmail.com> wrote: >Hi Nikolay, >On Fri, Jun 28, 2024 at 10:22:25AM +0300, Nikolay Aleksandrov wrote: >> > Actually I was talking about: >> > /sys/class/net/<bond port>/bonding_slave/ad_actor_oper_port_state >> > /sys/class/net/<bond port>/bonding_slave/ad_partner_oper_port_state >> > etc >> > >> > Wouldn't these work for you? >> > >> >> But it gets much more complicated, I guess it will be easier to read the >> proc bond file with all the LACP information. That is under RCU only as >> well. > >Good question. The monitor application want a more elegant/general way >to deal with the LACP state and do other network reconfigurations. >Here is the requirement I got from customer. > >1) As a server administrator, I want ip monitor to show state change events > related to LACP bonds so that I can react quickly to network reconfigurations. >2) As a network monitoring application developer, I want my application to be > notified about LACP bond operational state changes without having to > poll /proc/net/bonding/<bond> and parse its output so that it can trigger > predefined failover remediation policies. >3) As a server administrator, I want my LACP bond monitoring application to > receive a Netlink-based notification whenever the number of member > interfaces is reduced so that the operations support system can provision > a member interface replacement. What notifications are they not getting that they want? Or is it that events happen but lack some information they want? Currently, the end of bond_3ad_state_machine_handler() will send notifications via bond_slave_state_notify() if the state of any member of the bond has changed (via the struct slave.should_notify field). The notifications here come from bond_lower_state_changed(), which propagates link up/down and tx_enabled (active-ness), but not any LACP specifics. These are sent as NETDEV_CHANGELOWERSTATE events, which rtnetlink_event() handles, so they should be visible to ip monitor. >What I understand is the user/admin need to know the latest stable state so >they can do some other network configuration based on the status. Losing >a middle state notification during fast changes is acceptable. This is a much simpler problem to solve. >> Well, you mentioned administrators want to see the state changes, please >> better clarify the exact end goal. Note that technically may even not be >> the last state as the state change itself happens in parallel (different >> locks) and any update could be delayed depending on rtnl availability >> and workqueue re-scheduling. But sure, they will get some update at some point. :) > >Would you please help explain why we may not get the latest state? From what >I understand: > >1) State A -> B, queue notify > rtnl_trylock, fail, queue again >2) State B -> C, queue notify > rtnl_trylock, success, post current state C >3) State C -> D, queue notify > rtnl_trylock, fail, queue again >4) State D -> A, queue notify > rtnl_trylock, fail, queue again > rtnl_trylock, fail, queue again > rtnl_trylock, success, post current state A > >So how could the step 3) state send but step 4) state not send? I'm going to speculate here that the scenario envisioned would be something like CPU A is in the midst of generating an event at the end of the state machine, but CPU B could be processing a LACPDU simultaneously-ish. The CPU A event is sent, but the CPU B event is delayed due to RTNL contention. In this scenario, the last event seen in user space is CPU A, but the actual state has moved on to that set by CPU B, whose notification will be received eventually. >BTW, in my code, I should set the should_notify_lacp = 0 first before sending >ifinfo message. So that even the should_notify_lacp = 1 in ad_mux_machine() >is over written here, it still send the latest status. > >> + >> + if (slave->should_notify_lacp) { >> + slave->should_notify_lacp = 0; >> + rtmsg_ifinfo(RTM_NEWLINK, slave->dev, 0, GFP_KERNEL, 0, NULL); >> + } > >The side effect is that we may send 2 same latest lacp status(the >should_notify_lacp is over written to 1 and queue again), which should >be OK. Looking at the current notifications in bonding, I wonder if it would be sufficient to add the desired information to what bond_lower_state_changed() sends, rather than trying to shoehorn in another rtnl_trylock() gizmo. -J --- -Jay Vosburgh, jay.vosburgh@canonical.com
On Fri, Jun 28, 2024 at 04:36:17PM -0700, Jay Vosburgh wrote: > >Good question. The monitor application want a more elegant/general way > >to deal with the LACP state and do other network reconfigurations. > >Here is the requirement I got from customer. > > > >1) As a server administrator, I want ip monitor to show state change events > > related to LACP bonds so that I can react quickly to network reconfigurations. > >2) As a network monitoring application developer, I want my application to be > > notified about LACP bond operational state changes without having to > > poll /proc/net/bonding/<bond> and parse its output so that it can trigger > > predefined failover remediation policies. > >3) As a server administrator, I want my LACP bond monitoring application to > > receive a Netlink-based notification whenever the number of member > > interfaces is reduced so that the operations support system can provision > > a member interface replacement. > > What notifications are they not getting that they want? Or is > it that events happen but lack some information they want? Yes, e.g. when the switch crashed while link still up, they can get notified from LACP partner state if we send the info via netlink message. > > Currently, the end of bond_3ad_state_machine_handler() will send > notifications via bond_slave_state_notify() if the state of any member > of the bond has changed (via the struct slave.should_notify field). > > The notifications here come from bond_lower_state_changed(), > which propagates link up/down and tx_enabled (active-ness), but not any > LACP specifics. These are sent as NETDEV_CHANGELOWERSTATE events, which > rtnetlink_event() handles, so they should be visible to ip monitor. That's the problem. It only propagates link info via link up/down. The link info is not send if only LACP state changed. > > >What I understand is the user/admin need to know the latest stable state so > >they can do some other network configuration based on the status. Losing > >a middle state notification during fast changes is acceptable. > > This is a much simpler problem to solve. Good. > > >> Well, you mentioned administrators want to see the state changes, please > >> better clarify the exact end goal. Note that technically may even not be > >> the last state as the state change itself happens in parallel (different > >> locks) and any update could be delayed depending on rtnl availability > >> and workqueue re-scheduling. But sure, they will get some update at some point. :) > > > >Would you please help explain why we may not get the latest state? From what > >I understand: > > > >1) State A -> B, queue notify > > rtnl_trylock, fail, queue again > >2) State B -> C, queue notify > > rtnl_trylock, success, post current state C > >3) State C -> D, queue notify > > rtnl_trylock, fail, queue again > >4) State D -> A, queue notify > > rtnl_trylock, fail, queue again > > rtnl_trylock, fail, queue again > > rtnl_trylock, success, post current state A > > > >So how could the step 3) state send but step 4) state not send? > > I'm going to speculate here that the scenario envisioned would > be something like CPU A is in the midst of generating an event at the > end of the state machine, but CPU B could be processing a LACPDU > simultaneously-ish. The CPU A event is sent, but the CPU B event is > delayed due to RTNL contention. In this scenario, the last event seen > in user space is CPU A, but the actual state has moved on to that set by > CPU B, whose notification will be received eventually. Is is possible that LACP event (The second to last state) in CPU A is delayed due to RTNL contention. While the LACP event (the latest state) in CPU B is sent successfully. And later the LACP event in CPU A is sent eventually. This would case the user space receives miss order message. > Looking at the current notifications in bonding, I wonder if it > would be sufficient to add the desired information to what > bond_lower_state_changed() sends, rather than trying to shoehorn in > another rtnl_trylock() gizmo. I'm not sure if the LACP state count for lower state. What do you think of my previous draft patch[1] that replied to you. [1] https://lore.kernel.org/netdev/Zn0iI3SPdRkmfnS1@Laptop-X1/ Thanks Hangbin
On Tue, Jul 02, 2024 at 04:00:06PM +0800, Hangbin Liu wrote: > > Looking at the current notifications in bonding, I wonder if it > > would be sufficient to add the desired information to what > > bond_lower_state_changed() sends, rather than trying to shoehorn in > > another rtnl_trylock() gizmo. > > I'm not sure if the LACP state count for lower state. What do you think of > my previous draft patch[1] that replied to you. > > [1] https://lore.kernel.org/netdev/Zn0iI3SPdRkmfnS1@Laptop-X1/ Hi Jay, Any comments? Thanks Hangbin
On Thu, Jul 11, 2024 at 11:12:58AM +0800, Hangbin Liu wrote: > On Tue, Jul 02, 2024 at 04:00:06PM +0800, Hangbin Liu wrote: > > > Looking at the current notifications in bonding, I wonder if it > > > would be sufficient to add the desired information to what > > > bond_lower_state_changed() sends, rather than trying to shoehorn in > > > another rtnl_trylock() gizmo. > > > > I'm not sure if the LACP state count for lower state. What do you think of > > my previous draft patch[1] that replied to you. > > > > [1] https://lore.kernel.org/netdev/Zn0iI3SPdRkmfnS1@Laptop-X1/ > > Hi Jay, > > Any comments? Ping?
On Fri, Jul 19, 2024 at 02:45:25PM +0800, Hangbin Liu wrote: > On Thu, Jul 11, 2024 at 11:12:58AM +0800, Hangbin Liu wrote: > > On Tue, Jul 02, 2024 at 04:00:06PM +0800, Hangbin Liu wrote: > > > > Looking at the current notifications in bonding, I wonder if it > > > > would be sufficient to add the desired information to what > > > > bond_lower_state_changed() sends, rather than trying to shoehorn in > > > > another rtnl_trylock() gizmo. > > > > > > I'm not sure if the LACP state count for lower state. What do you think of > > > my previous draft patch[1] that replied to you. > > > > > > [1] https://lore.kernel.org/netdev/Zn0iI3SPdRkmfnS1@Laptop-X1/ Hi Jay, I hope I can get some of your comments. Thanks Hangbin
diff --git a/drivers/net/bonding/bond_3ad.c b/drivers/net/bonding/bond_3ad.c index c6807e473ab7..b57c5670b31a 100644 --- a/drivers/net/bonding/bond_3ad.c +++ b/drivers/net/bonding/bond_3ad.c @@ -11,6 +11,7 @@ #include <linux/etherdevice.h> #include <linux/if_bonding.h> #include <linux/pkt_sched.h> +#include <linux/rtnetlink.h> #include <net/net_namespace.h> #include <net/bonding.h> #include <net/bond_3ad.h> @@ -1185,6 +1186,8 @@ static void ad_mux_machine(struct port *port, bool *update_slave_arr) default: break; } + + rtmsg_ifinfo(RTM_NEWLINK, port->slave->dev, 0, GFP_ATOMIC, 0, NULL); } } diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c index eabfc8290f5e..4507bb8d5264 100644 --- a/net/core/rtnetlink.c +++ b/net/core/rtnetlink.c @@ -4116,6 +4116,7 @@ void rtmsg_ifinfo(int type, struct net_device *dev, unsigned int change, rtmsg_ifinfo_event(type, dev, change, rtnl_get_event(0), flags, NULL, 0, portid, nlh); } +EXPORT_SYMBOL(rtmsg_ifinfo); void rtmsg_ifinfo_newnet(int type, struct net_device *dev, unsigned int change, gfp_t flags, int *new_nsid, int new_ifindex)
Currently, administrators need to retrieve LACP mux state changes from the kernel DEBUG log using netdev_dbg and slave_dbg macros. To simplify this process, let's send the ifinfo notification whenever the mux state changes. This will enable users to directly access and monitor this information using the ip monitor command. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> --- v3: forgot to use GFP_ATOMIC. (Nikolay Aleksandrov) export symbol for rtmsg_ifinfo. It's weird that my build succeed with tools/testing/selftests/drivers/net/bonding/config without export the symbol, but build failed with tools/testing/selftests/net/config. v2: don't use call_netdevice_notifiers as it will case sleeping in atomic context (Nikolay Aleksandrov) After this patch, we can see the following info with `ip -d monitor link` 7: veth1@if6: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP group default link/ether 02:0a:04:c2:d6:21 brd ff:ff:ff:ff:ff:ff link-netns b promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535 veth bond_slave state BACKUP mii_status UP ... ad_aggregator_id 1 ad_actor_oper_port_state 143 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,expired> ad_partner_oper_port_state 55 ad_partner_oper_port_state_str <active,short_timeout,aggregating,collecting,distributing> ... 7: veth1@if6: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP group default link/ether 02:0a:04:c2:d6:21 brd ff:ff:ff:ff:ff:ff link-netns b promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535 veth bond_slave state ACTIVE mii_status UP ... ad_aggregator_id 1 ad_actor_oper_port_state 79 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,defaulted> ad_partner_oper_port_state 1 ad_partner_oper_port_state_str <active> ... 7: veth1@if6: <BROADCAST,MULTICAST,SLAVE,UP,LOWER_UP> mtu 1500 qdisc noqueue master bond0 state UP group default link/ether 02:0a:04:c2:d6:21 brd ff:ff:ff:ff:ff:ff link-netns b promiscuity 0 allmulti 0 minmtu 68 maxmtu 65535 veth bond_slave state ACTIVE mii_status UP ... ad_aggregator_id 1 ad_actor_oper_port_state 63 ad_actor_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ad_partner_oper_port_state 63 ad_partner_oper_port_state_str <active,short_timeout,aggregating,in_sync,collecting,distributing> ... --- drivers/net/bonding/bond_3ad.c | 3 +++ net/core/rtnetlink.c | 1 + 2 files changed, 4 insertions(+)