diff mbox series

[net-next] net: ipv4: replace the right route in case prefsrc is used

Message ID 20231117114837.36100-1-atenart@kernel.org (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net-next] net: ipv4: replace the right route in case prefsrc is used | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1129 this patch: 1129
netdev/cc_maintainers warning 1 maintainers not CCed: dsahern@kernel.org
netdev/build_clang success Errors and warnings before: 1155 this patch: 1155
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1156 this patch: 1156
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 21 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Antoine Tenart Nov. 17, 2023, 11:48 a.m. UTC
In case similar routes with different prefsrc are installed, any
modification of one of those routes will always modify the first one
found as the prefsrc is not matched. Fix this by updating the entry we
found in case prefsrc was set in the request.

Before the patch:

  $ ip route show
  172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.2 metric 100
  172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.3 metric 100
  172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.4 metric 100
  $ ip route change 172.16.42.0/24 dev eth0 proto kernel scope link \
        src 172.16.42.4 metric 100 mtu 1280
  $ ip route show
  172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.4 metric 100 mtu 1280
  172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.3 metric 100
  172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.4 metric 100

After the patch:

  $ ip route show
  172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.2 metric 100
  172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.3 metric 100
  172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.4 metric 100
  $ ip route change 172.16.42.0/24 dev eth0 proto kernel scope link \
        src 172.16.42.4 metric 100 mtu 1280
  $ ip route show
  172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.2 metric 100
  172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.3 metric 100
  172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.4 metric 100 mtu 1280

All fib selftest ran and no failure was seen.

Note: a selftest wasn't added as `ip route` use NLM_F_EXCL which
prevents us from constructing the above routes. But this is a valid
example of what NetworkManager can construct for example.

Signed-off-by: Antoine Tenart <atenart@kernel.org>
---

Hi, comment/question below,

I'm wondering if we want to fix the above case. I made this patch
because we already filter on prefsrc when deleting a route[1] to deal
with the same configurations as above, and that would make the route
replacement consistent with that.

However even with this (same for [1]) things are not 100% failsafe
(and we can argue on the use case and feasibility). For example
consider,

$ ip route show
172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.2 metric 100
172.16.42.0/24 dev eth0 proto kernel scope link metric 100
$ ip route del 172.16.42.0/24 dev eth0 proto kernel scope link metric 100
$ ip route show
172.16.42.0/24 dev eth0 proto kernel scope link metric 100

Also the differing part could be something else that the prefsrc (not
that it would necessarily make sense).

Thoughts?

Thanks!
Antoine

[1] 74cb3c108bc0 ("ipv4: match prefsrc when deleting routes").

---
 net/ipv4/fib_trie.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Comments

Julian Anastasov Nov. 17, 2023, 4:56 p.m. UTC | #1
Hello,

On Fri, 17 Nov 2023, Antoine Tenart wrote:

> In case similar routes with different prefsrc are installed, any
> modification of one of those routes will always modify the first one
> found as the prefsrc is not matched. Fix this by updating the entry we
> found in case prefsrc was set in the request.
> 
> Before the patch:
> 
>   $ ip route show
>   172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.2 metric 100
>   172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.3 metric 100
>   172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.4 metric 100
>   $ ip route change 172.16.42.0/24 dev eth0 proto kernel scope link \
>         src 172.16.42.4 metric 100 mtu 1280
>   $ ip route show
>   172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.4 metric 100 mtu 1280
>   172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.3 metric 100
>   172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.4 metric 100
> 
> After the patch:
> 
>   $ ip route show
>   172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.2 metric 100
>   172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.3 metric 100
>   172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.4 metric 100
>   $ ip route change 172.16.42.0/24 dev eth0 proto kernel scope link \
>         src 172.16.42.4 metric 100 mtu 1280
>   $ ip route show
>   172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.2 metric 100
>   172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.3 metric 100
>   172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.4 metric 100 mtu 1280
> 
> All fib selftest ran and no failure was seen.
> 
> Note: a selftest wasn't added as `ip route` use NLM_F_EXCL which
> prevents us from constructing the above routes. But this is a valid

ip route append/prepend are standard way to create alternative routes,
if you want to encode a selftest.

> example of what NetworkManager can construct for example.
> 
> Signed-off-by: Antoine Tenart <atenart@kernel.org>
> ---
> 
> Hi, comment/question below,
> 
> I'm wondering if we want to fix the above case. I made this patch
> because we already filter on prefsrc when deleting a route[1] to deal
> with the same configurations as above, and that would make the route
> replacement consistent with that.
> 
> However even with this (same for [1]) things are not 100% failsafe
> (and we can argue on the use case and feasibility). For example
> consider,
> 
> $ ip route show
> 172.16.42.0/24 dev eth0 proto kernel scope link src 172.16.42.2 metric 100
> 172.16.42.0/24 dev eth0 proto kernel scope link metric 100
> $ ip route del 172.16.42.0/24 dev eth0 proto kernel scope link metric 100
> $ ip route show
> 172.16.42.0/24 dev eth0 proto kernel scope link metric 100
> 
> Also the differing part could be something else that the prefsrc (not
> that it would necessarily make sense).
> 
> Thoughts?
> 
> Thanks!
> Antoine
> 
> [1] 74cb3c108bc0 ("ipv4: match prefsrc when deleting routes").
> 
> ---
>  net/ipv4/fib_trie.c | 6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
> index 9bdfdab906fe..6cf775d4574e 100644
> --- a/net/ipv4/fib_trie.c
> +++ b/net/ipv4/fib_trie.c
> @@ -1263,10 +1263,11 @@ int fib_table_insert(struct net *net, struct fib_table *tb,
>  
>  		nlflags &= ~NLM_F_EXCL;
>  
> -		/* We have 2 goals:
> +		/* We have 3 goals:
>  		 * 1. Find exact match for type, scope, fib_info to avoid
>  		 * duplicate routes
>  		 * 2. Find next 'fa' (or head), NLM_F_APPEND inserts before it
> +		 * 3. Find the right 'fa' in case a prefsrc is used
>  		 */
>  		fa_match = NULL;
>  		fa_first = fa;
> @@ -1282,6 +1283,9 @@ int fib_table_insert(struct net *net, struct fib_table *tb,
>  				fa_match = fa;
>  				break;
>  			}
> +			if (cfg->fc_prefsrc &&
> +			    cfg->fc_prefsrc == fa->fa_info->fib_prefsrc)

	You may prefer to restrict it for the change operation by
adding && (cfg->fc_nlflags & NLM_F_REPLACE) check, otherwise if
we change the prepend position (fa_first) route with such prefsrc
can not be installed as first one:

prepend 172.16.42.4 mtu 1280 impossible here if following are
src 172.16.42.2
src 172.16.42.3
<- we do not want here for prepend
src 172.16.42.4

	Even if we consider just the change operation, this patch
will change the expectation that we replace the first alternative
route. But I don't know if one uses alternative routes that
differ in prefsrc. More common example would be alternative routes
that differ in gateway, that is what fib_select_default() and
fib_detect_death() notices as a real alternatives that differ in
neigh state.

	Note that link routes (nhc_scope RT_SCOPE_HOST) or
routes with prefixlen!=0 (fib_select_path) are not considered
as alternatives by the kernel. So, even if we can create such
routes, they are probably not used. So, deleting link routes
by prefsrc is good as we remove routes with deleted prefsrc
but for routing we are using just the first link route.

> +				fa_first = fa;
>  		}
>  
>  		if (cfg->fc_nlflags & NLM_F_REPLACE) {
> -- 
> 2.41.0

Regards

--
Julian Anastasov <ja@ssi.bg>
Antoine Tenart Nov. 20, 2023, 9:32 a.m. UTC | #2
Quoting Julian Anastasov (2023-11-17 17:56:57)
> On Fri, 17 Nov 2023, Antoine Tenart wrote:
> > 
> > Note: a selftest wasn't added as `ip route` use NLM_F_EXCL which
> > prevents us from constructing the above routes. But this is a valid
> 
> ip route append/prepend are standard way to create alternative routes,
> if you want to encode a selftest.

Thanks for the pointer.

> > +                     if (cfg->fc_prefsrc &&
> > +                         cfg->fc_prefsrc == fa->fa_info->fib_prefsrc)
> 
>         You may prefer to restrict it for the change operation by
> adding && (cfg->fc_nlflags & NLM_F_REPLACE) check, otherwise if
> we change the prepend position (fa_first) route with such prefsrc
> can not be installed as first one:

Good point, I'll fix in v2 (if any).

>         Even if we consider just the change operation, this patch
> will change the expectation that we replace the first alternative
> route. But I don't know if one uses alternative routes that
> differ in prefsrc.

For example NetworkManager does, but I don't think NLM_F_REPLACE is
used. One could do it manually, but I don't think that is used in
practice either.

>         Note that link routes (nhc_scope RT_SCOPE_HOST) or
> routes with prefixlen!=0 (fib_select_path) are not considered
> as alternatives by the kernel. So, even if we can create such
> routes, they are probably not used. So, deleting link routes
> by prefsrc is good as we remove routes with deleted prefsrc
> but for routing we are using just the first link route.

Right. Replacing routes by prefsrc could help configuring those routes
so they have the right config when the first link route is deleted, but
that's theoretical too. Reasoning here was since we can delete by
prefsrc, we might want to allow to replace too. As you said, there is a
good reason to allow this for route deletion which might not extend to
the change operation.

Thanks!
Antoine
diff mbox series

Patch

diff --git a/net/ipv4/fib_trie.c b/net/ipv4/fib_trie.c
index 9bdfdab906fe..6cf775d4574e 100644
--- a/net/ipv4/fib_trie.c
+++ b/net/ipv4/fib_trie.c
@@ -1263,10 +1263,11 @@  int fib_table_insert(struct net *net, struct fib_table *tb,
 
 		nlflags &= ~NLM_F_EXCL;
 
-		/* We have 2 goals:
+		/* We have 3 goals:
 		 * 1. Find exact match for type, scope, fib_info to avoid
 		 * duplicate routes
 		 * 2. Find next 'fa' (or head), NLM_F_APPEND inserts before it
+		 * 3. Find the right 'fa' in case a prefsrc is used
 		 */
 		fa_match = NULL;
 		fa_first = fa;
@@ -1282,6 +1283,9 @@  int fib_table_insert(struct net *net, struct fib_table *tb,
 				fa_match = fa;
 				break;
 			}
+			if (cfg->fc_prefsrc &&
+			    cfg->fc_prefsrc == fa->fa_info->fib_prefsrc)
+				fa_first = fa;
 		}
 
 		if (cfg->fc_nlflags & NLM_F_REPLACE) {