[net-next,1/3] ipv6: use a new flag to indicate elevated refcount.

Message ID 7c3ec1f7c7e4098045d1e42961df8af11619089e.1717087015.git.pabeni@redhat.com
State Changes Requested
Delegated to: Netdev Maintainers
Series dst_cache: cope with device removal

Checks

Context                         Check    Description
netdev/series_format            success  Posting correctly formatted
netdev/tree_selection           success  Clearly marked for net-next
netdev/ynl                      success  Generated files up to date; no warnings/errors; no diff in generated;
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 1397 this patch: 1397
netdev/build_tools              success  Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers           success  CCed 5 of 5 maintainers
netdev/build_clang              success  Errors and warnings before: 906 this patch: 906
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  No Fixes tag
netdev/build_allmodconfig_warn  success  Errors and warnings before: 1412 this patch: 1412
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 25 lines checked
netdev/build_clang_rust         success  No Rust files in patch. Skipping build
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Paolo Abeni May 30, 2024, 5:21 p.m. UTC
ip6_pol_route() can return a dst entry with an elevated reference count
even when the caller passes the RT6_LOOKUP_F_DST_NOREF flag.

Currently the caller uses the rt_uncached list entry field to detect
this scenario: the reference count is elevated only for entries in the
uncached list.

Soon we are going to insert into the uncached list even entries held by
the dst_cache(s), potentially fooling the above check and causing a
reference underflow.

To avoid this issue, introduce and use a new field to mark the entries
with an elevated refcount. No functional change intended.
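
The failure mode described above can be sketched with a small userspace
model. This is only an illustration under stated assumptions: every name
below (struct model_route, model_hold_old(), model_hold_new(),
model_scenario()) is hypothetical and is not kernel code, the refcount is
a plain int rather than the dst refcount, and the failure path of
dst_hold_safe() is ignored.

```c
#include <stdbool.h>

/* Hypothetical userspace model; NOT the kernel's struct rt6_info. */
struct model_route {
	int  refcnt;      /* stands in for the dst reference count */
	bool on_uncached; /* stands in for !list_empty(&rt->dst.rt_uncached) */
	bool count_held;  /* stands in for the new rt6i_count_held flag */
};

/* Models the planned change: a dst_cache entry is put on the uncached
 * list, but no reference is taken on behalf of a lookup caller. */
static void model_dst_cache_add(struct model_route *rt)
{
	rt->on_uncached = true;
}

/* Old caller check: list membership is read as "reference already held". */
static void model_hold_old(struct model_route *rt)
{
	if (!rt->on_uncached)
		rt->refcnt++;
}

/* New caller check: only the explicit flag means "reference already held". */
static void model_hold_new(struct model_route *rt)
{
	if (!rt->count_held)
		rt->refcnt++;
}

/* The caller drops the reference it believes it owns. */
static void model_release(struct model_route *rt)
{
	rt->refcnt--;
}

/* Returns the refcount left after one lookup/release cycle on an entry
 * whose single initial reference belongs to the (modelled) dst_cache. */
static int model_scenario(bool use_flag)
{
	struct model_route rt = { .refcnt = 1 };

	model_dst_cache_add(&rt);
	if (use_flag)
		model_hold_new(&rt);
	else
		model_hold_old(&rt);
	model_release(&rt);
	return rt.refcnt;
}
```

In this model the old check leaves the count at 0, i.e. the caller has
dropped the reference owned by the dst_cache, while the flag-based check
leaves it at 1.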

Before:
pahole -EC rt6_info
/* size: 224, cachelines: 4, members: 9 */
/* sum members: 218, holes: 1, sum holes: 4 */

After:
pahole -EC rt6_info
/* size: 224, cachelines: 4, members: 10 */
/* sum members: 219, holes: 1, sum holes: 4 */
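
The pahole numbers above imply 2 bytes of tail padding before the change
(224 - 218 - 4) and 1 byte after it (224 - 219 - 4): the new bool lands
in existing padding, so the struct size is unchanged. A minimal sketch of
the same effect, assuming a common ABI where long is at least 4 bytes and
naturally aligned; the struct and function names are hypothetical and
much smaller than the real rt6_info:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the layout before the patch. */
struct model_before {
	long           big;          /* stands in for the wider, aligned members */
	unsigned short nfheader_len; /* last member, followed by tail padding */
};

/* Same layout with a one-byte flag appended after the short. */
struct model_after {
	long           big;
	unsigned short nfheader_len;
	bool           count_held;   /* occupies a byte of former tail padding */
};

/* True when appending the flag did not grow the struct. */
static bool model_size_unchanged(void)
{
	return sizeof(struct model_before) == sizeof(struct model_after);
}

/* Offset of the flag: directly after the short, with no hole in between. */
static size_t model_flag_offset(void)
{
	return offsetof(struct model_after, count_held);
}
```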

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/ip6_fib.h | 3 +++
 net/ipv6/route.c      | 4 ++--
 2 files changed, 5 insertions(+), 2 deletions(-)
Patch

diff --git a/include/net/ip6_fib.h b/include/net/ip6_fib.h
index 6cb867ce4878..eb997af5523c 100644
--- a/include/net/ip6_fib.h
+++ b/include/net/ip6_fib.h
@@ -216,6 +216,9 @@  struct rt6_info {
 
 	/* more non-fragment space at head required */
 	unsigned short			rt6i_nfheader_len;
+
+	/* route lookup always acquires a reference */
+	bool				rt6i_count_held;
 };
 
 struct fib6_result {
diff --git a/net/ipv6/route.c b/net/ipv6/route.c
index bbc2a0dd9314..3b729ab86c55 100644
--- a/net/ipv6/route.c
+++ b/net/ipv6/route.c
@@ -2251,6 +2251,7 @@  struct rt6_info *ip6_pol_route(struct net *net, struct fib6_table *table,
 			 * this refcnt is always returned to the caller even
 			 * if caller sets RT6_LOOKUP_F_DST_NOREF flag.
 			 */
+			rt->rt6i_count_held = true;
 			rt6_uncached_list_add(rt);
 			rcu_read_unlock();
 
@@ -2648,8 +2649,7 @@  struct dst_entry *ip6_route_output_flags(struct net *net,
 	rcu_read_lock();
 	dst = ip6_route_output_flags_noref(net, sk, fl6, flags);
 	rt6 = dst_rt6_info(dst);
-	/* For dst cached in uncached_list, refcnt is already taken. */
-	if (list_empty(&rt6->dst.rt_uncached) && !dst_hold_safe(dst)) {
+	if (!rt6->rt6i_count_held && !dst_hold_safe(dst)) {
 		dst = &net->ipv6.ip6_null_entry->dst;
 		dst_hold(dst);
 	}