Message ID | 20220220041155.607637-1-eric.dumazet@gmail.com (mailing list archive) |
---|---|
State | Accepted |
Commit | ee8f97efa7a59e7f390ed2de627ddd139beb6243 |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | [v2,net-next] gro_cells: avoid using synchronize_rcu() in gro_cells_destroy() | expand |
Hello, On Sat, 2022-02-19 at 20:11 -0800, Eric Dumazet wrote: > From: Eric Dumazet <edumazet@google.com> > > Another thing making netns dismantles potentially very slow is located > in gro_cells_destroy(), > whenever cleanup_net() has to remove a device using gro_cells framework. > > RTNL is not held at this stage, so synchronize_net() > is calling synchronize_rcu(): > > netdev_run_todo() > ip_tunnel_dev_free() > gro_cells_destroy() > synchronize_net() > synchronize_rcu() // Ouch. > > This patch uses call_rcu(), and gave me a 25x performance improvement > in my tests. > > cleanup_net() is no longer blocked ~10 ms per synchronize_rcu() > call. > > In the case we could not allocate the memory needed to queue the > deferred free, use synchronize_rcu_expedited() > > v2: made percpu_free_defer_callback() static > > Signed-off-by: Eric Dumazet <edumazet@google.com> I'm sorry for the late feedback. I'm wondering if you considered placing the 'defer' pointer inside 'gro_cells' and allocating it at gro_cells_init() init time? Thanks! Paolo
On Mon, Feb 21, 2022 at 12:24 AM Paolo Abeni <pabeni@redhat.com> wrote: > > Hello, > > On Sat, 2022-02-19 at 20:11 -0800, Eric Dumazet wrote: > > From: Eric Dumazet <edumazet@google.com> > > > > Another thing making netns dismantles potentially very slow is located > > in gro_cells_destroy(), > > whenever cleanup_net() has to remove a device using gro_cells framework. > > > > RTNL is not held at this stage, so synchronize_net() > > is calling synchronize_rcu(): > > > > netdev_run_todo() > > ip_tunnel_dev_free() > > gro_cells_destroy() > > synchronize_net() > > synchronize_rcu() // Ouch. > > > > This patch uses call_rcu(), and gave me a 25x performance improvement > > in my tests. > > > > cleanup_net() is no longer blocked ~10 ms per synchronize_rcu() > > call. > > > > In the case we could not allocate the memory needed to queue the > > deferred free, use synchronize_rcu_expedited() > > > > v2: made percpu_free_defer_callback() static > > > > Signed-off-by: Eric Dumazet <edumazet@google.com> > > I'm sorry for the late feedback. I'm wondering if you considered > placing the 'defer' pointer inside 'gro_cells' and allocating it at > gro_cells_init() init time? I did consider this, but I chose not to risk changing structure layouts and adding regression in fast paths, with extra cache line misses.
On Mon, 2022-02-21 at 06:21 -0800, Eric Dumazet wrote: > On Mon, Feb 21, 2022 at 12:24 AM Paolo Abeni <pabeni@redhat.com> wrote: > > > > Hello, > > > > On Sat, 2022-02-19 at 20:11 -0800, Eric Dumazet wrote: > > > From: Eric Dumazet <edumazet@google.com> > > > > > > Another thing making netns dismantles potentially very slow is located > > > in gro_cells_destroy(), > > > whenever cleanup_net() has to remove a device using gro_cells framework. > > > > > > RTNL is not held at this stage, so synchronize_net() > > > is calling synchronize_rcu(): > > > > > > netdev_run_todo() > > > ip_tunnel_dev_free() > > > gro_cells_destroy() > > > synchronize_net() > > > synchronize_rcu() // Ouch. > > > > > > This patch uses call_rcu(), and gave me a 25x performance improvement > > > in my tests. > > > > > > cleanup_net() is no longer blocked ~10 ms per synchronize_rcu() > > > call. > > > > > > In the case we could not allocate the memory needed to queue the > > > deferred free, use synchronize_rcu_expedited() > > > > > > v2: made percpu_free_defer_callback() static > > > > > > Signed-off-by: Eric Dumazet <edumazet@google.com> > > > > I'm sorry for the late feedback. I'm wondering if you considered > > placing the 'defer' pointer inside 'gro_cells' and allocating it at > > gro_cells_init() init time? > > I did consider this, but I chose not to risk changing structure > layouts and adding regression in fast paths, > with extra cache line misses. Understood, thanks! Acked-by: Paolo Abeni <pabeni@redhat.com>
Hello: This patch was applied to netdev/net-next.git (master) by Jakub Kicinski <kuba@kernel.org>: On Sat, 19 Feb 2022 20:11:55 -0800 you wrote: > From: Eric Dumazet <edumazet@google.com> > > Another thing making netns dismantles potentially very slow is located > in gro_cells_destroy(), > whenever cleanup_net() has to remove a device using gro_cells framework. > > RTNL is not held at this stage, so synchronize_net() > is calling synchronize_rcu(): > > [...] Here is the summary with links: - [v2,net-next] gro_cells: avoid using synchronize_rcu() in gro_cells_destroy() https://git.kernel.org/netdev/net-next/c/ee8f97efa7a5 You are awesome, thank you!
diff --git a/net/core/gro_cells.c b/net/core/gro_cells.c index 6eb2e5ec2c5068e1d798557e55d084b785187a9b..8462f926ab457322a12510a4d294ecca617948f0 100644 --- a/net/core/gro_cells.c +++ b/net/core/gro_cells.c @@ -89,8 +89,23 @@ int gro_cells_init(struct gro_cells *gcells, struct net_device *dev) } EXPORT_SYMBOL(gro_cells_init); +struct percpu_free_defer { + struct rcu_head rcu; + void __percpu *ptr; +}; + +static void percpu_free_defer_callback(struct rcu_head *head) +{ + struct percpu_free_defer *defer; + + defer = container_of(head, struct percpu_free_defer, rcu); + free_percpu(defer->ptr); + kfree(defer); +} + void gro_cells_destroy(struct gro_cells *gcells) { + struct percpu_free_defer *defer; int i; if (!gcells->cells) @@ -102,12 +117,23 @@ void gro_cells_destroy(struct gro_cells *gcells) __netif_napi_del(&cell->napi); __skb_queue_purge(&cell->napi_skbs); } - /* This barrier is needed because netpoll could access dev->napi_list - * under rcu protection. + /* We need to observe an rcu grace period before freeing ->cells, + * because netpoll could access dev->napi_list under rcu protection. + * Try hard using call_rcu() instead of synchronize_rcu(), + * because we might be called from cleanup_net(), and we + * definitely do not want to block this critical task. */ - synchronize_net(); - - free_percpu(gcells->cells); + defer = kmalloc(sizeof(*defer), GFP_KERNEL | __GFP_NOWARN); + if (likely(defer)) { + defer->ptr = gcells->cells; + call_rcu(&defer->rcu, percpu_free_defer_callback); + } else { + /* We do not hold RTNL at this point, synchronize_net() + * would not be able to expedite this sync. + */ + synchronize_rcu_expedited(); + free_percpu(gcells->cells); + } gcells->cells = NULL; } EXPORT_SYMBOL(gro_cells_destroy);