diff mbox series

[net,1/3] netfilter: nft_set_rbtree: fix overlap expiration walk

Message ID 20230726152524.26268-2-fw@strlen.de (mailing list archive)
State Accepted
Commit f718863aca469a109895cb855e6b81fff4827d71
Delegated to: Netdev Maintainers
Headers show
Series [net,1/3] netfilter: nft_set_rbtree: fix overlap expiration walk | expand

Checks

Context Check Description
netdev/series_format success Pull request is its own cover letter
netdev/tree_selection success Clearly marked for net
netdev/fixes_present success Fixes tag present in non-next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1343 this patch: 1343
netdev/cc_maintainers fail 2 blamed authors not CCed: pablo@netfilter.org sbrivio@redhat.com; 4 maintainers not CCed: pablo@netfilter.org coreteam@netfilter.org sbrivio@redhat.com kadlec@netfilter.org
netdev/build_clang success Errors and warnings before: 1365 this patch: 1365
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success Fixes tag looks correct
netdev/build_allmodconfig_warn success Errors and warnings before: 1366 this patch: 1366
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 50 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Florian Westphal July 26, 2023, 3:23 p.m. UTC
The lazy gc on insert that should remove timed-out entries fails to release
the other half of the interval, if any.

Can be reproduced with tests/shell/testcases/sets/0044interval_overlap_0
in nftables.git and kmemleak enabled kernel.

Second bug is the use of rbe_prev vs. prev pointer.
If rbe_prev() returns NULL after at least one iteration, rbe_prev points
to element that is not an end interval, hence it should not be removed.

Lastly, check the genmask of the end interval if this is active in the
current generation.

Fixes: c9e6978e2725 ("netfilter: nft_set_rbtree: Switch to node list walk for overlap detection")
Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/nft_set_rbtree.c | 20 ++++++++++++++------
 1 file changed, 14 insertions(+), 6 deletions(-)

Comments

patchwork-bot+netdevbpf@kernel.org July 27, 2023, 5:20 a.m. UTC | #1
Hello:

This series was applied to netdev/net.git (main)
by Florian Westphal <fw@strlen.de>:

On Wed, 26 Jul 2023 17:23:47 +0200 you wrote:
> The lazy gc on insert that should remove timed-out entries fails to release
> the other half of the interval, if any.
> 
> Can be reproduced with tests/shell/testcases/sets/0044interval_overlap_0
> in nftables.git and kmemleak enabled kernel.
> 
> Second bug is the use of rbe_prev vs. prev pointer.
> If rbe_prev() returns NULL after at least one iteration, rbe_prev points
> to element that is not an end interval, hence it should not be removed.
> 
> [...]

Here is the summary with links:
  - [net,1/3] netfilter: nft_set_rbtree: fix overlap expiration walk
    https://git.kernel.org/netdev/net/c/f718863aca46
  - [net,2/3] netfilter: nf_tables: skip immediate deactivate in _PREPARE_ERROR
    https://git.kernel.org/netdev/net/c/0a771f7b266b
  - [net,3/3] netfilter: nf_tables: disallow rule addition to bound chain via NFTA_RULE_CHAIN_ID
    https://git.kernel.org/netdev/net/c/0ebc1064e487

You are awesome, thank you!
diff mbox series

Patch

diff --git a/net/netfilter/nft_set_rbtree.c b/net/netfilter/nft_set_rbtree.c
index 5c05c9b990fb..8d73fffd2d09 100644
--- a/net/netfilter/nft_set_rbtree.c
+++ b/net/netfilter/nft_set_rbtree.c
@@ -217,29 +217,37 @@  static void *nft_rbtree_get(const struct net *net, const struct nft_set *set,
 
 static int nft_rbtree_gc_elem(const struct nft_set *__set,
 			      struct nft_rbtree *priv,
-			      struct nft_rbtree_elem *rbe)
+			      struct nft_rbtree_elem *rbe,
+			      u8 genmask)
 {
 	struct nft_set *set = (struct nft_set *)__set;
 	struct rb_node *prev = rb_prev(&rbe->node);
-	struct nft_rbtree_elem *rbe_prev = NULL;
+	struct nft_rbtree_elem *rbe_prev;
 	struct nft_set_gc_batch *gcb;
 
 	gcb = nft_set_gc_batch_check(set, NULL, GFP_ATOMIC);
 	if (!gcb)
 		return -ENOMEM;
 
-	/* search for expired end interval coming before this element. */
+	/* search for end interval coming before this element.
+	 * end intervals don't carry a timeout extension, they
+	 * are coupled with the interval start element.
+	 */
 	while (prev) {
 		rbe_prev = rb_entry(prev, struct nft_rbtree_elem, node);
-		if (nft_rbtree_interval_end(rbe_prev))
+		if (nft_rbtree_interval_end(rbe_prev) &&
+		    nft_set_elem_active(&rbe_prev->ext, genmask))
 			break;
 
 		prev = rb_prev(prev);
 	}
 
-	if (rbe_prev) {
+	if (prev) {
+		rbe_prev = rb_entry(prev, struct nft_rbtree_elem, node);
+
 		rb_erase(&rbe_prev->node, &priv->root);
 		atomic_dec(&set->nelems);
+		nft_set_gc_batch_add(gcb, rbe_prev);
 	}
 
 	rb_erase(&rbe->node, &priv->root);
@@ -321,7 +329,7 @@  static int __nft_rbtree_insert(const struct net *net, const struct nft_set *set,
 
 		/* perform garbage collection to avoid bogus overlap reports. */
 		if (nft_set_elem_expired(&rbe->ext)) {
-			err = nft_rbtree_gc_elem(set, priv, rbe);
+			err = nft_rbtree_gc_elem(set, priv, rbe, genmask);
 			if (err < 0)
 				return err;