Message ID | cover.1723036486.git.petrm@nvidia.com (mailing list archive) |
---|---|
Headers | show |
Series | net: nexthop: Increase weight to u16 | expand |
On Wed, 7 Aug 2024 16:13:45 +0200 Petr Machata wrote: > In CLOS networks, as link failures occur at various points in the network, > ECMP weights of the involved nodes are adjusted to compensate. With high > fan-out of the involved nodes, and overall high number of nodes, > a (non-)ECMP weight ratio that we would like to configure does not fit into > 8 bits. Instead of, say, 255:254, we might like to configure something like > 1000:999. For these deployments, the 8-bit weight may not be enough. > > To that end, in this patchset increase the next hop weight from u8 to u16. > > Patch #1 adds a flag that indicates whether the reserved fields are zeroed. > This is a follow-up to a new fix merged in commit 6d745cd0e972 ("net: > nexthop: Initialize all fields in dumped nexthops"). The theory behind this > patch is that there is a strict ordering between the fields actually being > zeroed, the kernel declaring that they are, and the kernel repurposing the > fields. Thus clients can use the flag to tell if it is safe to interpret > the reserved fields in any way. > > Patch #2 contains the substantial code and the commit message covers the > details of the changes. > > Patches #3 to #6 add selftests. I did update iproute2 to the branch you sent me last time, but tests are not happy: # IPv6 groups functional # ---------------------- # TEST: Create nexthop group with single nexthop [ OK ] # TEST: Get nexthop group by id [ OK ] # TEST: Delete nexthop group by id [ OK ] # TEST: Nexthop group with multiple nexthops [ OK ] # TEST: Nexthop group updated when entry is deleted [ OK ] # TEST: Nexthop group with weighted nexthops [ OK ] # TEST: Weighted nexthop group updated when entry is deleted [ OK ] # TEST: Nexthops in groups removed on admin down [ OK ] # TEST: Multiple groups with same nexthop [ OK ] # TEST: Nexthops in group removed on admin down - mixed group [ OK ] # TEST: Nexthop group can not have a group as an entry [ OK ] # TEST: Nexthop group with a blackhole entry [ OK ] # TEST: Nexthop group can not have a blackhole and another nexthop [ OK ] # TEST: Nexthop group replace refcounts [ OK ] # WARNING: Unexpected route entry # TEST: 16-bit weights [FAIL] # # IPv6 resilient groups functional # -------------------------------- # TEST: Nexthop group updated when entry is deleted [ OK ] # TEST: Nexthop buckets updated when entry is deleted [ OK ] # TEST: Nexthop group updated after replace [ OK ] # TEST: Nexthop buckets updated after replace [ OK ] # TEST: Nexthop group updated when entry is deleted - nECMP [ OK ] # TEST: Nexthop buckets updated when entry is deleted - nECMP [ OK ] # TEST: Nexthop group updated after replace - nECMP [ OK ] # TEST: Nexthop buckets updated after replace - nECMP [ OK ] # WARNING: Unexpected route entry # TEST: 16-bit weights [FAIL] https://netdev-3.bots.linux.dev/vmksft-net/results/718641/2-fib-nexthops-sh/stdout
Jakub Kicinski <kuba@kernel.org> writes: > On Wed, 7 Aug 2024 16:13:45 +0200 Petr Machata wrote: >> In CLOS networks, as link failures occur at various points in the network, >> ECMP weights of the involved nodes are adjusted to compensate. With high >> fan-out of the involved nodes, and overall high number of nodes, >> a (non-)ECMP weight ratio that we would like to configure does not fit into >> 8 bits. Instead of, say, 255:254, we might like to configure something like >> 1000:999. For these deployments, the 8-bit weight may not be enough. >> >> To that end, in this patchset increase the next hop weight from u8 to u16. >> >> Patch #1 adds a flag that indicates whether the reserved fields are zeroed. >> This is a follow-up to a new fix merged in commit 6d745cd0e972 ("net: >> nexthop: Initialize all fields in dumped nexthops"). The theory behind this >> patch is that there is a strict ordering between the fields actually being >> zeroed, the kernel declaring that they are, and the kernel repurposing the >> fields. Thus clients can use the flag to tell if it is safe to interpret >> the reserved fields in any way. >> >> Patch #2 contains the substantial code and the commit message covers the >> details of the changes. >> >> Patches #3 to #6 add selftests. > > I did update iproute2 to the branch you sent me last time, but tests > are not happy: > > # IPv6 groups functional > # ---------------------- > # TEST: Create nexthop group with single nexthop [ OK ] > # TEST: Get nexthop group by id [ OK ] > # TEST: Delete nexthop group by id [ OK ] > # TEST: Nexthop group with multiple nexthops [ OK ] > # TEST: Nexthop group updated when entry is deleted [ OK ] > # TEST: Nexthop group with weighted nexthops [ OK ] > # TEST: Weighted nexthop group updated when entry is deleted [ OK ] > # TEST: Nexthops in groups removed on admin down [ OK ] > # TEST: Multiple groups with same nexthop [ OK ] > # TEST: Nexthops in group removed on admin down - mixed group [ OK ] > # TEST: Nexthop group can not have a group as an entry [ OK ] > # TEST: Nexthop group with a blackhole entry [ OK ] > # TEST: Nexthop group can not have a blackhole and another nexthop [ OK ] > # TEST: Nexthop group replace refcounts [ OK ] > # WARNING: Unexpected route entry > # TEST: 16-bit weights [FAIL] > # > # IPv6 resilient groups functional > # -------------------------------- > # TEST: Nexthop group updated when entry is deleted [ OK ] > # TEST: Nexthop buckets updated when entry is deleted [ OK ] > # TEST: Nexthop group updated after replace [ OK ] > # TEST: Nexthop buckets updated after replace [ OK ] > # TEST: Nexthop group updated when entry is deleted - nECMP [ OK ] > # TEST: Nexthop buckets updated when entry is deleted - nECMP [ OK ] > # TEST: Nexthop group updated after replace - nECMP [ OK ] > # TEST: Nexthop buckets updated after replace - nECMP [ OK ] > # WARNING: Unexpected route entry > # TEST: 16-bit weights [FAIL] > > https://netdev-3.bots.linux.dev/vmksft-net/results/718641/2-fib-nexthops-sh/stdout This failure mode is consistent with non-updated iproute2. I only pushed to the iproute2 repository after having sent the kernel patches, so I think you or your automation have picked up the old version. Can you try again, please? I retested on my end and it still works.
On Fri, 9 Aug 2024 11:48:09 +0200 Petr Machata wrote: > This failure mode is consistent with non-updated iproute2. I only pushed > to the iproute2 repository after having sent the kernel patches, so I > think you or your automation have picked up the old version. Can you try > again, please? I retested on my end and it still works. Updated now. iproute2 is at: commit cefc088ed1b96a2c245b60043b52f31cd89c2946 (HEAD -> main) Merge: 354d8a36 54da1517 Author: Your Name Date: Fri Aug 9 21:06:12 2024 -0700 Merge branch 'nhgw16' of https://github.com/pmachata/iproute2 Let's see what happens, I haven't rebuilt the image in a while maybe I'm doing it wrong.
Jakub Kicinski <kuba@kernel.org> writes: > On Fri, 9 Aug 2024 11:48:09 +0200 Petr Machata wrote: >> This failure mode is consistent with non-updated iproute2. I only pushed >> to the iproute2 repository after having sent the kernel patches, so I >> think you or your automation have picked up the old version. Can you try >> again, please? I retested on my end and it still works. > > Updated now. iproute2 is at: > > commit cefc088ed1b96a2c245b60043b52f31cd89c2946 (HEAD -> main) > Merge: 354d8a36 54da1517 > Author: Your Name > Date: Fri Aug 9 21:06:12 2024 -0700 > > Merge branch 'nhgw16' of https://github.com/pmachata/iproute2 > > > Let's see what happens, I haven't rebuilt the image in a while > maybe I'm doing it wrong. Looking good: https://netdev-3.bots.linux.dev/vmksft-net/results/724221/2-fib-nexthops-sh/stdout
Hello: This series was applied to netdev/net-next.git (main) by Jakub Kicinski <kuba@kernel.org>: On Wed, 7 Aug 2024 16:13:45 +0200 you wrote: > In CLOS networks, as link failures occur at various points in the network, > ECMP weights of the involved nodes are adjusted to compensate. With high > fan-out of the involved nodes, and overall high number of nodes, > a (non-)ECMP weight ratio that we would like to configure does not fit into > 8 bits. Instead of, say, 255:254, we might like to configure something like > 1000:999. For these deployments, the 8-bit weight may not be enough. > > [...] Here is the summary with links: - [net-next,v2,1/6] net: nexthop: Add flag to assert that NHGRP reserved fields are zero https://git.kernel.org/netdev/net-next/c/75bab45e6b2d - [net-next,v2,2/6] net: nexthop: Increase weight to u16 https://git.kernel.org/netdev/net-next/c/b72a6a7ab957 - [net-next,v2,3/6] selftests: router_mpath: Sleep after MZ https://git.kernel.org/netdev/net-next/c/110d3ffe9d2b - [net-next,v2,4/6] selftests: router_mpath_nh: Test 16-bit next hop weights https://git.kernel.org/netdev/net-next/c/bb89fdacf99c - [net-next,v2,5/6] selftests: router_mpath_nh_res: Test 16-bit next hop weights https://git.kernel.org/netdev/net-next/c/dce0765c1d5b - [net-next,v2,6/6] selftests: fib_nexthops: Test 16-bit next hop weights https://git.kernel.org/netdev/net-next/c/4b808f447332 You are awesome, thank you!