From patchwork Wed Sep 21 09:49:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Florian Westphal X-Patchwork-Id: 12983567 X-Patchwork-Delegate: kuba@kernel.org Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D5FACC6FA8E for ; Wed, 21 Sep 2022 09:53:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231654AbiIUJxS (ORCPT ); Wed, 21 Sep 2022 05:53:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:59064 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231793AbiIUJwy (ORCPT ); Wed, 21 Sep 2022 05:52:54 -0400 Received: from Chamillionaire.breakpoint.cc (Chamillionaire.breakpoint.cc [IPv6:2a0a:51c0:0:12e:520::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9A131883FA; Wed, 21 Sep 2022 02:50:16 -0700 (PDT) Received: from fw by Chamillionaire.breakpoint.cc with local (Exim 4.92) (envelope-from ) id 1oawMw-0006Po-M3; Wed, 21 Sep 2022 11:50:14 +0200 From: Florian Westphal To: Cc: , Jakub Kicinski , Paolo Abeni , "David S. Miller" , Eric Dumazet , Antoine Tenart , Florian Westphal Subject: [PATCH net-next 1/4] netfilter: conntrack: fix the gc rescheduling delay Date: Wed, 21 Sep 2022 11:49:57 +0200 Message-Id: <20220921095000.29569-2-fw@strlen.de> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220921095000.29569-1-fw@strlen.de> References: <20220921095000.29569-1-fw@strlen.de> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: netdev@vger.kernel.org X-Patchwork-Delegate: kuba@kernel.org From: Antoine Tenart Commit 2cfadb761d3d ("netfilter: conntrack: revisit gc autotuning") changed the eviction rescheduling to the use average expiry of scanned entries (within 1-60s) by doing: for (...) { expires = clamp(nf_ct_expires(tmp), ...); next_run += expires; next_run /= 2; } The issue is the above will make the average ('next_run' here) more dependent on the last expiration values than the firsts (for sets > 2). Depending on the expiration values used to compute the average, the result can be quite different than what's expected. To fix this we can do the following: for (...) { expires = clamp(nf_ct_expires(tmp), ...); next_run += (expires - next_run) / ++count; } Fixes: 2cfadb761d3d ("netfilter: conntrack: revisit gc autotuning") Cc: Florian Westphal Signed-off-by: Antoine Tenart Signed-off-by: Florian Westphal --- net/netfilter/nf_conntrack_core.c | 10 ++++++++-- 1 file changed, 8 insertions(+), 2 deletions(-) diff --git a/net/netfilter/nf_conntrack_core.c b/net/netfilter/nf_conntrack_core.c index c5851e1321e7..8efa6bd5703c 100644 --- a/net/netfilter/nf_conntrack_core.c +++ b/net/netfilter/nf_conntrack_core.c @@ -67,6 +67,7 @@ struct conntrack_gc_work { struct delayed_work dwork; u32 next_bucket; u32 avg_timeout; + u32 count; u32 start_time; bool exiting; bool early_drop; @@ -1466,6 +1467,7 @@ static void gc_worker(struct work_struct *work) unsigned int expired_count = 0; unsigned long next_run; s32 delta_time; + long count; gc_work = container_of(work, struct conntrack_gc_work, dwork.work); @@ -1475,10 +1477,12 @@ static void gc_worker(struct work_struct *work) if (i == 0) { gc_work->avg_timeout = GC_SCAN_INTERVAL_INIT; + gc_work->count = 1; gc_work->start_time = start_time; } next_run = gc_work->avg_timeout; + count = gc_work->count; end_time = start_time + GC_SCAN_MAX_DURATION; @@ -1498,8 +1502,8 @@ static void gc_worker(struct work_struct *work) hlist_nulls_for_each_entry_rcu(h, n, &ct_hash[i], hnnode) { struct nf_conntrack_net *cnet; - unsigned long expires; struct net *net; + long expires; tmp = nf_ct_tuplehash_to_ctrack(h); @@ -1513,6 +1517,7 @@ static void gc_worker(struct work_struct *work) gc_work->next_bucket = i; gc_work->avg_timeout = next_run; + gc_work->count = count; delta_time = nfct_time_stamp - gc_work->start_time; @@ -1528,8 +1533,8 @@ static void gc_worker(struct work_struct *work) } expires = clamp(nf_ct_expires(tmp), GC_SCAN_INTERVAL_MIN, GC_SCAN_INTERVAL_CLAMP); + expires = (expires - (long)next_run) / ++count; next_run += expires; - next_run /= 2u; if (nf_conntrack_max95 == 0 || gc_worker_skip_ct(tmp)) continue; @@ -1570,6 +1575,7 @@ static void gc_worker(struct work_struct *work) delta_time = nfct_time_stamp - end_time; if (delta_time > 0 && i < hashsz) { gc_work->avg_timeout = next_run; + gc_work->count = count; gc_work->next_bucket = i; next_run = 0; goto early_exit;