Message ID | 20201016172914.643282-1-idosch@idosch.org (mailing list archive) |
---|---|
State | Not Applicable |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [net] nexthop: Fix performance regression in nexthop deletion | expand |
Ido Schimmel wrote: > From: Ido Schimmel <idosch@nvidia.com> > > While insertion of 16k nexthops all using the same netdev ('dummy10') > takes less than a second, deletion takes about 130 seconds: > > # time -p ip -b nexthop.batch > real 0.29 > user 0.01 > sys 0.15 > > # time -p ip link set dev dummy10 down > real 131.03 > user 0.06 > sys 0.52 snip... > Since nexthops are always deleted under RTNL, synchronize_net() can be > used instead. It will call synchronize_rcu_expedited() which only blocks > for several microseconds as opposed to multiple milliseconds like > synchronize_rcu(). > > With this patch deletion of 16k nexthops takes less than a second: > > # time -p ip link set dev dummy10 down > real 0.12 > user 0.00 > sys 0.04 That's a nice result! Well done! I can't really speak to whether or not there is any horrible side effect of using synchronize_rcu_expedited(), but FWIW: Reviewed-by: Jesse Brandeburg <jesse.brandeburg@intel.com> > > Tested with fib_nexthops.sh which includes torture tests that prompted > the initial change: > > # ./fib_nexthops.sh > ... > Tests passed: 134 > Tests failed: 0 > > Fixes: 90f33bffa382 ("nexthops: don't modify published nexthop groups") > Signed-off-by: Ido Schimmel <idosch@nvidia.com> Nice fix, good commit message, thanks! > --- > net/ipv4/nexthop.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c > index 8c0f17c6863c..0dc43ad28eb9 100644 > --- a/net/ipv4/nexthop.c > +++ b/net/ipv4/nexthop.c > @@ -845,7 +845,7 @@ static void remove_nexthop_from_groups(struct net *net, struct nexthop *nh, > remove_nh_grp_entry(net, nhge, nlinfo); > > /* make sure all see the newly published array before releasing rtnl */ > - synchronize_rcu(); > + synchronize_net(); > } > > static void remove_nexthop_group(struct nexthop *nh, struct nl_info *nlinfo)
On 10/16/20 11:29 AM, Ido Schimmel wrote: > From: Ido Schimmel <idosch@nvidia.com> > > While insertion of 16k nexthops all using the same netdev ('dummy10') > takes less than a second, deletion takes about 130 seconds: > > # time -p ip -b nexthop.batch > real 0.29 > user 0.01 > sys 0.15 > > # time -p ip link set dev dummy10 down > real 131.03 > user 0.06 > sys 0.52 > > This is because of repeated calls to synchronize_rcu() whenever a > nexthop is removed from a nexthop group: > > # /usr/share/bcc/tools/offcputime -p `pgrep -nx ip` -K > ... > b'finish_task_switch' > b'schedule' > b'schedule_timeout' > b'wait_for_completion' > b'__wait_rcu_gp' > b'synchronize_rcu.part.0' > b'synchronize_rcu' > b'__remove_nexthop' > b'remove_nexthop' > b'nexthop_flush_dev' > b'nh_netdev_event' > b'raw_notifier_call_chain' > b'call_netdevice_notifiers_info' > b'__dev_notify_flags' > b'dev_change_flags' > b'do_setlink' > b'__rtnl_newlink' > b'rtnl_newlink' > b'rtnetlink_rcv_msg' > b'netlink_rcv_skb' > b'rtnetlink_rcv' > b'netlink_unicast' > b'netlink_sendmsg' > b'____sys_sendmsg' > b'___sys_sendmsg' > b'__sys_sendmsg' > b'__x64_sys_sendmsg' > b'do_syscall_64' > b'entry_SYSCALL_64_after_hwframe' > - ip (277) > 126554955 > > Since nexthops are always deleted under RTNL, synchronize_net() can be > used instead. It will call synchronize_rcu_expedited() which only blocks > for several microseconds as opposed to multiple milliseconds like > synchronize_rcu(). > > With this patch deletion of 16k nexthops takes less than a second: > > # time -p ip link set dev dummy10 down > real 0.12 > user 0.00 > sys 0.04 > > Tested with fib_nexthops.sh which includes torture tests that prompted > the initial change: > > # ./fib_nexthops.sh > ... > Tests passed: 134 > Tests failed: 0 > > Fixes: 90f33bffa382 ("nexthops: don't modify published nexthop groups") > Signed-off-by: Ido Schimmel <idosch@nvidia.com> > --- > net/ipv4/nexthop.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > Thanks for finding this, Ido. Reviewed-by: David Ahern <dsahern@gmail.com>
On Fri, 2020-10-16 at 20:29 +0300, Ido Schimmel wrote: > From: Ido Schimmel <idosch@nvidia.com> > > While insertion of 16k nexthops all using the same netdev ('dummy10') > takes less than a second, deletion takes about 130 seconds: > > # time -p ip -b nexthop.batch > real 0.29 > user 0.01 > sys 0.15 > > # time -p ip link set dev dummy10 down > real 131.03 > user 0.06 > sys 0.52 > > This is because of repeated calls to synchronize_rcu() whenever a > nexthop is removed from a nexthop group: > > # /usr/share/bcc/tools/offcputime -p `pgrep -nx ip` -K > ... > b'finish_task_switch' > b'schedule' > b'schedule_timeout' > b'wait_for_completion' > b'__wait_rcu_gp' > b'synchronize_rcu.part.0' > b'synchronize_rcu' > b'__remove_nexthop' > b'remove_nexthop' > b'nexthop_flush_dev' > b'nh_netdev_event' > b'raw_notifier_call_chain' > b'call_netdevice_notifiers_info' > b'__dev_notify_flags' > b'dev_change_flags' > b'do_setlink' > b'__rtnl_newlink' > b'rtnl_newlink' > b'rtnetlink_rcv_msg' > b'netlink_rcv_skb' > b'rtnetlink_rcv' > b'netlink_unicast' > b'netlink_sendmsg' > b'____sys_sendmsg' > b'___sys_sendmsg' > b'__sys_sendmsg' > b'__x64_sys_sendmsg' > b'do_syscall_64' > b'entry_SYSCALL_64_after_hwframe' > - ip (277) > 126554955 > > Since nexthops are always deleted under RTNL, synchronize_net() can be > used instead. It will call synchronize_rcu_expedited() which only blocks > for several microseconds as opposed to multiple milliseconds like > synchronize_rcu(). > > With this patch deletion of 16k nexthops takes less than a second: > > # time -p ip link set dev dummy10 down > real 0.12 > user 0.00 > sys 0.04 > > Tested with fib_nexthops.sh which includes torture tests that prompted > the initial change: > > # ./fib_nexthops.sh > ... > Tests passed: 134 > Tests failed: 0 > > Fixes: 90f33bffa382 ("nexthops: don't modify published nexthop groups") > Signed-off-by: Ido Schimmel <idosch@nvidia.com> > --- > net/ipv4/nexthop.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > Looks good. Acked-by: Nikolay Aleksandrov <nikolay@nvidia.com>
On Fri, 16 Oct 2020 20:29:14 +0300 Ido Schimmel wrote: > From: Ido Schimmel <idosch@nvidia.com> > > While insertion of 16k nexthops all using the same netdev ('dummy10') > takes less than a second, deletion takes about 130 seconds: Applied, thank you!
diff --git a/net/ipv4/nexthop.c b/net/ipv4/nexthop.c index 8c0f17c6863c..0dc43ad28eb9 100644 --- a/net/ipv4/nexthop.c +++ b/net/ipv4/nexthop.c @@ -845,7 +845,7 @@ static void remove_nexthop_from_groups(struct net *net, struct nexthop *nh, remove_nh_grp_entry(net, nhge, nlinfo); /* make sure all see the newly published array before releasing rtnl */ - synchronize_rcu(); + synchronize_net(); } static void remove_nexthop_group(struct nexthop *nh, struct nl_info *nlinfo)