Message ID | 20230718183351.297506-1-kuifeng@meta.com (mailing list archive) |
---|---|
Series | Remove expired routes with a separated list of routes. |
On Tue, 2023-07-18 at 11:33 -0700, Kui-Feng Lee wrote:
> FIB6 GC walks trees of fib6_tables to remove expired routes. Walking a tree
> can be expensive if the number of routes in a table is big, even if most of
> them are permanent. Checking routes in a separated list of routes having
> expiration will avoid this potential issue.
>
> Background
> ==========
>
> The size of a Linux IPv6 routing table can become a big problem if not
> managed appropriately. Now, Linux has a garbage collector to remove
> expired routes periodically. However, this may lead to a situation in
> which the routing path is blocked for a long period due to an
> excessive number of routes.
>
> For example, years ago, there was a commit c7bb4b89033b ("ipv6: tcp: drop
> silly ICMPv6 packet too big messages") about "ICMPv6 Packet too big
> messages". The root cause is that malicious ICMPv6 packets were sent back
> for every small packet sent to them. These packets add routes with an
> expiration time that prompts the GC to periodically check all routes in the
> tables, including permanent ones.
>
> Why Route Expires
> =================
>
> Users can add IPv6 routes with an expiration time manually. However,
> the Neighbor Discovery protocol may also generate routes that can
> expire. For example, Router Advertisement (RA) messages may create a
> default route with an expiration time. [RFC 4861] For IPv4, it is not
> possible to set an expiration time for a route, and there is no RA, so
> there is no need to worry about such issues.
>
> Create Routes with Expires
> ==========================
>
> You can create routes with expires with the command.
>
> For example,
>
>     ip -6 route add 2001:b000:591::3 via fe80::5054:ff:fe12:3457 \
>         dev enp0s3 expires 30
>
> The route that has been generated will be deleted automatically in 30
> seconds.
>
> GC of FIB6
> ==========
>
> The function called fib6_run_gc() is responsible for performing
> garbage collection (GC) for the Linux IPv6 stack.
> It checks for the expiration of every route by traversing the trees of
> routing tables. The time taken to traverse a routing table increases with its
> size. Holding the routing table lock during traversal is particularly
> undesirable. Therefore, it is preferable to keep the lock for the
> shortest possible duration.
>
> Solution
> ========
>
> The cause of the issue is keeping the routing table locked during the
> traversal of large trees. To solve this problem, we can create a separate
> list of routes that have expiration. This will prevent GC from checking
> permanent routes.
>
> Result
> ======
>
> We conducted a test to measure the execution times of fib6_gc_timer_cb()
> and observed that it enhances the GC of FIB6. During the test, we added
> permanent routes with the following numbers: 1000, 3000, 6000, and
> 9000. Additionally, we added a route with an expiration time.
>
> Here are the average execution times for the kernel without the patch.
>  - 120020 ns with 1000 permanent routes
>  - 308920 ns with 3000 ...
>  - 581470 ns with 6000 ...
>  - 855310 ns with 9000 ...
>
> The kernel with the patch consistently takes around 14000 ns to execute,
> regardless of the number of permanent routes that are installed.
>
> Major changes from v2:
>
>  - Remove unnecessary and incorrect sysctl restoring in the test case.
>
> Major changes from v1:
>
>  - Moved gc_link to avoid creating a hole in fib6_info.
>
>  - Moved fib6_set_expires*() and fib6_clean_expires*() to the header
>    file and inlined. And removed duplicated lines.
>
>  - Added a test case.
>
> ---
> v1: https://lore.kernel.org/all/20230710203609.520720-1-kuifeng@meta.com/
> v2: https://lore.kernel.org/all/20230718180321.294721-1-kuifeng@meta.com/

Too bad I did not notice v3 before starting reviewing v2.

When posting a new version you must wait the 24h quarantine period,
see:

https://elixir.bootlin.com/linux/v6.4/source/Documentation/process/maintainer-netdev.rst#L15

I assume this does not cope with the feedback on previous version ;)

/P
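
For readers following the thread, the idea in the quoted cover letter (keep routes that can expire on a list of their own, so GC never visits permanent routes) can be sketched in user space. This is only a toy Python model, not the kernel code: `Route`, `Table`, `gc_list`, `gc_full_walk()`, `gc_expiring_only()` and `build_table()` are names made up for illustration, a plain dict stands in for the fib6 tree, and locking is not modelled at all.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Route:
    prefix: str
    expires: Optional[float] = None  # None means a permanent route


@dataclass
class Table:
    # Hypothetical stand-in for the routing tree: every route lives here.
    routes: Dict[str, Route] = field(default_factory=dict)
    # Hypothetical stand-in for the separate list: only routes with an expiry.
    gc_list: Dict[str, Route] = field(default_factory=dict)

    def add(self, route: Route) -> None:
        self.routes[route.prefix] = route
        if route.expires is not None:
            self.gc_list[route.prefix] = route
        else:
            # A permanent route replacing an expiring one leaves the GC list.
            self.gc_list.pop(route.prefix, None)

    def remove(self, prefix: str) -> None:
        self.routes.pop(prefix, None)
        self.gc_list.pop(prefix, None)

    def gc_full_walk(self, now: float) -> int:
        """Visit every route, permanent or not. Returns routes visited."""
        visited = 0
        for route in list(self.routes.values()):
            visited += 1
            if route.expires is not None and route.expires <= now:
                self.remove(route.prefix)
        return visited

    def gc_expiring_only(self, now: float) -> int:
        """Visit only routes that carry an expiry. Returns routes visited."""
        visited = 0
        for route in list(self.gc_list.values()):
            visited += 1
            if route.expires <= now:
                self.remove(route.prefix)
        return visited


def build_table(permanent: int) -> Table:
    """Mirror the test setup: N permanent routes plus one expiring route."""
    table = Table()
    for i in range(permanent):
        table.add(Route(prefix=f"2001:db8:{i:x}::/64"))
    table.add(Route(prefix="2001:db8:ffff::1/128", expires=30.0))
    return table
```

In this model the work done by `gc_full_walk()` grows with the total number of routes, while `gc_expiring_only()` depends only on how many routes have an expiry. That matches the shape of the numbers quoted above, though the model says nothing about absolute timings.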