Message ID | 1533278629-26147-1-git-send-email-liuhangbin@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [PATCHv2] teamd: add port_master_ifindex_changed for teamd_event_watch_ops | expand |
Fri, Aug 03, 2018 at 08:43:49AM CEST, liuhangbin@gmail.com wrote: >When configure two active-backup teams, one slave each, put them >down and up in a loop and test traffic each loop, eventually one of >the teams will not accept/transmit traffic as there is no active port. I don't think that Jamie intended his reply as a replacement of description... >Like this: > >\#/bin/bash >\# In this scenario we add eth1 to team1, eth2 to team2. >WAIT=2 >COUNT=0 > >start_team() >{ > num=$1 > teamd -o -n -U -d -t team$num -c '{"runner": {"name": "activebackup"},"link_watch": {"name": "ethtool"}}' -gg > teamdctl team$num port add eth$num >} > >while :; do > echo "===========================================================" > let "COUNT++" > echo "Loop $COUNT" > if teamdctl team1 state | grep -q "active port: eth1" && \ > teamdctl team2 state | grep -q "active port: eth2"; then > echo "Pass" > else > echo "FAIL" > exit 1 > fi > > teamd -k -t team1 > teamd -k -t team2 > sleep "$WAIT" > start_team 1 & > start_team 2 & > sleep "$WAIT" >done This reproducer does not work: =========================================================== Loop 1 Device "team1" does not exist FAIL > >Failure as follows: > >]# teamdctl teamX state dump > "runner": { > "active_port": "" > }, > >This happens because teamd_port_present() fails in the codepath >ab_link_watch_handler() -> teamd_for_each_tdport() -> teamd_get_next_tdport() >as the link states notification received later than interface up notification. > >So the teamd_for_each_tdport() for loop can never run against the tdports >bound to the ctx and a port is never set active. > >Currently we only reproduced this with active-backup mode. Fix it by adding a >new teamd_event_watch_ops port_master_ifindex_changed for ab mode. Okay, so if I understand that correctly, you are getting link up event (handled by ab_event_watch_port_link_changed() before you get port_added event (handled by ab_event_watch_port_added()). So wouldn't the following patch resolve that? diff --git a/teamd/teamd_runner_activebackup.c b/teamd/teamd_runner_activebackup.c index 8a3447f1a63d..8cf455d9b3e1 100644 --- a/teamd/teamd_runner_activebackup.c +++ b/teamd/teamd_runner_activebackup.c @@ -509,8 +509,12 @@ static int ab_event_watch_port_added(struct teamd_context *ctx, struct teamd_port *tdport, void *priv) { struct ab *ab = priv; + int err; - return teamd_port_priv_create(tdport, &ab_port_priv, ab); + err = teamd_port_priv_create(tdport, &ab_port_priv, ab); + if (err) + return err; + return ab_link_watch_handler(ctx, priv); } static int ab_event_watch_port_link_changed(struct teamd_context *ctx,
On Fri, Aug 03, 2018 at 02:36:48PM +0200, Jiri Pirko wrote: > Fri, Aug 03, 2018 at 08:43:49AM CEST, liuhangbin@gmail.com wrote: > >When configure two active-backup teams, one slave each, put them > >down and up in a loop and test traffic each loop, eventually one of > >the teams will not accept/transmit traffic as there is no active port. > > I don't think that Jamie intended his reply as a replacement of > description... OK..I thought his reply is more clear to describe the issue. I could re-update the description if you and Jamie like. > > >Like this: > > > >\#/bin/bash > >\# In this scenario we add eth1 to team1, eth2 to team2. > >WAIT=2 > >COUNT=0 > > > >start_team() > >{ > > num=$1 > > teamd -o -n -U -d -t team$num -c '{"runner": {"name": "activebackup"},"link_watch": {"name": "ethtool"}}' -gg > > teamdctl team$num port add eth$num > >} > > > >while :; do > > echo "===========================================================" > > let "COUNT++" > > echo "Loop $COUNT" > > if teamdctl team1 state | grep -q "active port: eth1" && \ > > teamdctl team2 state | grep -q "active port: eth2"; then > > echo "Pass" > > else > > echo "FAIL" > > exit 1 > > fi > > > > teamd -k -t team1 > > teamd -k -t team2 > > sleep "$WAIT" > > start_team 1 & > > start_team 2 & > > sleep "$WAIT" > >done > > This reproducer does not work: > =========================================================== > Loop 1 > Device "team1" does not exist > FAIL This reproducer need a team interface first. Here is an new one. #!/bin/sh if [ -z $1 ] || [ -z $2 ]; then echo "Usage: $0 iface1 iface2" exit 1 else iface1=$1 iface2=$2 fi WAIT=2 COUNT=0 start_team() { local num=$1 local iface=$2 teamd -o -n -U -d -t team$num -c '{"runner": {"name": "activebackup"},"link_watch": {"name": "ethtool"}}' -gg teamdctl team$num port add $iface } while :; do echo "-----------------------------------------------------------" let "COUNT++" echo "Loop $COUNT" teamd -k -t team1 teamd -k -t team2 sleep "$WAIT" start_team 1 $iface1 & start_team 2 $iface2 & sleep "$WAIT" if teamdctl team1 state | grep -q "active port: $iface1" && \ teamdctl team2 state | grep -q "active port: $iface2"; then echo "Pass" else echo "FAIL" exit 1 fi done Note: I could reproduce the issue easily on VM. But hard to reproduce it on a physical machine. > > > > > > >Failure as follows: > > > >]# teamdctl teamX state dump > > "runner": { > > "active_port": "" > > }, > > > >This happens because teamd_port_present() fails in the codepath > >ab_link_watch_handler() -> teamd_for_each_tdport() -> teamd_get_next_tdport() > >as the link states notification received later than interface up notification. > > > >So the teamd_for_each_tdport() for loop can never run against the tdports > >bound to the ctx and a port is never set active. > > > >Currently we only reproduced this with active-backup mode. Fix it by adding a > >new teamd_event_watch_ops port_master_ifindex_changed for ab mode. > > Okay, so if I understand that correctly, you are getting link up event > (handled by ab_event_watch_port_link_changed() before you get port_added > event (handled by ab_event_watch_port_added()). So wouldn't the > following patch resolve that? No, I got ab_event_watch_port_link_changed() after port_added() event. But there is no master info in tdport when call ab_event_watch_port_link_changed(). That's why I added port_master_ifindex_changed() call back. > > diff --git a/teamd/teamd_runner_activebackup.c b/teamd/teamd_runner_activebackup.c > index 8a3447f1a63d..8cf455d9b3e1 100644 > --- a/teamd/teamd_runner_activebackup.c > +++ b/teamd/teamd_runner_activebackup.c > @@ -509,8 +509,12 @@ static int ab_event_watch_port_added(struct teamd_context *ctx, > struct teamd_port *tdport, void *priv) > { > struct ab *ab = priv; > + int err; > > - return teamd_port_priv_create(tdport, &ab_port_priv, ab); > + err = teamd_port_priv_create(tdport, &ab_port_priv, ab); > + if (err) > + return err; > + return ab_link_watch_handler(ctx, priv); > } > > static int ab_event_watch_port_link_changed(struct teamd_context *ctx, Based on upper reason, and I also tested, this patch not works. Thanks Hangbin
diff --git a/teamd/teamd.h b/teamd/teamd.h index 5dbfb9b..3934fc2 100644 --- a/teamd/teamd.h +++ b/teamd/teamd.h @@ -189,6 +189,9 @@ struct teamd_event_watch_ops { struct teamd_port *tdport, void *priv); int (*port_ifname_changed)(struct teamd_context *ctx, struct teamd_port *tdport, void *priv); + int (*port_master_ifindex_changed)(struct teamd_context *ctx, + struct teamd_port *tdport, + void *priv); int (*option_changed)(struct teamd_context *ctx, struct team_option *option, void *priv); char *option_changed_match_name; @@ -208,6 +211,8 @@ int teamd_event_ifinfo_hwaddr_changed(struct teamd_context *ctx, struct team_ifinfo *ifinfo); int teamd_event_ifinfo_ifname_changed(struct teamd_context *ctx, struct team_ifinfo *ifinfo); +int teamd_event_ifinfo_master_ifindex_changed(struct teamd_context *ctx, + struct team_ifinfo *ifinfo); int teamd_event_ifinfo_admin_state_changed(struct teamd_context *ctx, struct team_ifinfo *ifinfo); int teamd_events_init(struct teamd_context *ctx); diff --git a/teamd/teamd_events.c b/teamd/teamd_events.c index 1a95974..65aa46a 100644 --- a/teamd/teamd_events.c +++ b/teamd/teamd_events.c @@ -167,6 +167,25 @@ int teamd_event_ifinfo_ifname_changed(struct teamd_context *ctx, return 0; } +int teamd_event_ifinfo_master_ifindex_changed(struct teamd_context *ctx, + struct team_ifinfo *ifinfo) +{ + struct event_watch_item *watch; + uint32_t ifindex = team_get_ifinfo_ifindex(ifinfo); + struct teamd_port *tdport = teamd_get_port(ctx, ifindex); + int err; + + list_for_each_node_entry(watch, &ctx->event_watch_list, list) { + if (watch->ops->port_master_ifindex_changed && tdport) { + err = watch->ops->port_master_ifindex_changed(ctx, tdport, + watch->priv); + if (err) + return err; + } + } + return 0; +} + int teamd_event_ifinfo_admin_state_changed(struct teamd_context *ctx, struct team_ifinfo *ifinfo) { diff --git a/teamd/teamd_ifinfo_watch.c b/teamd/teamd_ifinfo_watch.c index f334ff6..6a19532 100644 --- a/teamd/teamd_ifinfo_watch.c +++ b/teamd/teamd_ifinfo_watch.c @@ -59,6 +59,11 @@ static int ifinfo_change_handler_func(struct team_handle *th, void *priv, if (err) return err; } + if (team_is_ifinfo_master_ifindex_changed(ifinfo)) { + err = teamd_event_ifinfo_master_ifindex_changed(ctx, ifinfo); + if (err) + return err; + } } return 0; } diff --git a/teamd/teamd_runner_activebackup.c b/teamd/teamd_runner_activebackup.c index 8a3447f..f92d341 100644 --- a/teamd/teamd_runner_activebackup.c +++ b/teamd/teamd_runner_activebackup.c @@ -520,6 +520,13 @@ static int ab_event_watch_port_link_changed(struct teamd_context *ctx, return ab_link_watch_handler(ctx, priv); } +static int ab_event_watch_port_master_ifindex_changed(struct teamd_context *ctx, + struct teamd_port *tdport, + void *priv) +{ + return ab_link_watch_handler(ctx, priv); +} + static int ab_event_watch_prio_option_changed(struct teamd_context *ctx, struct team_option *option, void *priv) @@ -532,6 +539,7 @@ static const struct teamd_event_watch_ops ab_event_watch_ops = { .port_hwaddr_changed = ab_event_watch_port_hwaddr_changed, .port_added = ab_event_watch_port_added, .port_link_changed = ab_event_watch_port_link_changed, + .port_master_ifindex_changed = ab_event_watch_port_master_ifindex_changed, .option_changed = ab_event_watch_prio_option_changed, .option_changed_match_name = "priority", };
When configure two active-backup teams, one slave each, put them down and up in a loop and test traffic each loop, eventually one of the teams will not accept/transmit traffic as there is no active port. Like this: \#/bin/bash \# In this scenario we add eth1 to team1, eth2 to team2. WAIT=2 COUNT=0 start_team() { num=$1 teamd -o -n -U -d -t team$num -c '{"runner": {"name": "activebackup"},"link_watch": {"name": "ethtool"}}' -gg teamdctl team$num port add eth$num } while :; do echo "===========================================================" let "COUNT++" echo "Loop $COUNT" if teamdctl team1 state | grep -q "active port: eth1" && \ teamdctl team2 state | grep -q "active port: eth2"; then echo "Pass" else echo "FAIL" exit 1 fi teamd -k -t team1 teamd -k -t team2 sleep "$WAIT" start_team 1 & start_team 2 & sleep "$WAIT" done Failure as follows: ]# teamdctl teamX state dump "runner": { "active_port": "" }, This happens because teamd_port_present() fails in the codepath ab_link_watch_handler() -> teamd_for_each_tdport() -> teamd_get_next_tdport() as the link states notification received later than interface up notification. So the teamd_for_each_tdport() for loop can never run against the tdports bound to the ctx and a port is never set active. Currently we only reproduced this with active-backup mode. Fix it by adding a new teamd_event_watch_ops port_master_ifindex_changed for ab mode. V2: update commit description from Jamie Bainbridge's reply. Signed-off-by: Hangbin Liu <liuhangbin@gmail.com> --- teamd/teamd.h | 5 +++++ teamd/teamd_events.c | 19 +++++++++++++++++++ teamd/teamd_ifinfo_watch.c | 5 +++++ teamd/teamd_runner_activebackup.c | 8 ++++++++ 4 files changed, 37 insertions(+)