mbox series

[v4,net,0/3] udp: Fix two integer overflows when sk->sk_rcvbuf is close to INT_MAX.

Message ID 20250329180541.34968-1-kuniyu@amazon.com (mailing list archive)
Headers show
Series udp: Fix two integer overflows when sk->sk_rcvbuf is close to INT_MAX. | expand

Message

Kuniyuki Iwashima March 29, 2025, 6:05 p.m. UTC
I got a report that UDP mem usage in /proc/net/sockstat did not
drop even after an application was terminated.

The issue could happen if sk->sk_rmem_alloc wraps around due
to a large sk->sk_rcvbuf, which was INT_MAX in our case.

The patch 2 fixes the issue, and the patch 1 fixes yet another
overflow I found while investigating the issue.


v4:
  * Patch 4
    * Wait RCU for at most 30 sec

v3: https://lore.kernel.org/netdev/20250327202722.63756-1-kuniyu@amazon.com/
  * Rebase
  * Add Willem's tags

v2: https://lore.kernel.org/netdev/20250325195826.52385-1-kuniyu@amazon.com/
  * Patch 1
    * Define rmem and rcvbuf as unsigned int (Eric)
    * Take skb->truesize into account for sk with large rcvbuf (Willem)

  * Patch 3
    * Add a comment

v1: https://lore.kernel.org/netdev/20250323231016.74813-1-kuniyu@amazon.com/


Kuniyuki Iwashima (3):
  udp: Fix multiple wraparounds of sk->sk_rmem_alloc.
  udp: Fix memory accounting leak.
  selftest: net: Check wraparounds for sk->sk_rmem_alloc.

 net/ipv4/udp.c                          |  40 ++---
 tools/testing/selftests/net/.gitignore  |   3 +-
 tools/testing/selftests/net/Makefile    |   2 +-
 tools/testing/selftests/net/so_rcvbuf.c | 188 ++++++++++++++++++++++++
 4 files changed, 214 insertions(+), 19 deletions(-)
 create mode 100644 tools/testing/selftests/net/so_rcvbuf.c

Comments

Jakub Kicinski March 31, 2025, 1:51 p.m. UTC | #1
On Sat, 29 Mar 2025 11:05:10 -0700 Kuniyuki Iwashima wrote:
> v4:
>   * Patch 4
>     * Wait RCU for at most 30 sec

The new test still doesn't pass

TAP version 13
1..1
# timeout set to 3600
# selftests: net: so_rcvbuf
# 0.00 [+0.00] TAP version 13
# 0.00 [+0.00] 1..2
# 0.00 [+0.00] # Starting 2 tests from 2 test cases.
# 0.00 [+0.00] #  RUN           so_rcvbuf.udp_ipv4.rmem_max ...
# 0.00 [+0.00] # so_rcvbuf.c:151:rmem_max:Expected pages (35) == 0 (0)
# 0.00 [+0.00] # rmem_max: Test terminated by assertion
# 0.00 [+0.00] #          FAIL  so_rcvbuf.udp_ipv4.rmem_max
# 0.00 [+0.00] not ok 1 so_rcvbuf.udp_ipv4.rmem_max
# 0.01 [+0.00] #  RUN           so_rcvbuf.udp_ipv6.rmem_max ...
# 0.01 [+0.00] # so_rcvbuf.c:151:rmem_max:Expected pages (35) == 0 (0)
# 0.01 [+0.00] # rmem_max: Test terminated by assertion
# 0.01 [+0.00] #          FAIL  so_rcvbuf.udp_ipv6.rmem_max
# 0.01 [+0.00] not ok 2 so_rcvbuf.udp_ipv6.rmem_max
# 0.01 [+0.00] # FAILED: 0 / 2 tests passed.
# 0.01 [+0.00] # Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0
not ok 1 selftests: net: so_rcvbuf # exit=1


Plus we also see failures in udpgso.sh
Kuniyuki Iwashima March 31, 2025, 6:21 p.m. UTC | #2
From: Jakub Kicinski <kuba@kernel.org>
Date: Mon, 31 Mar 2025 06:51:09 -0700
> On Sat, 29 Mar 2025 11:05:10 -0700 Kuniyuki Iwashima wrote:
> > v4:
> >   * Patch 4
> >     * Wait RCU for at most 30 sec
> 
> The new test still doesn't pass
> 
> TAP version 13
> 1..1
> # timeout set to 3600
> # selftests: net: so_rcvbuf
> # 0.00 [+0.00] TAP version 13
> # 0.00 [+0.00] 1..2
> # 0.00 [+0.00] # Starting 2 tests from 2 test cases.
> # 0.00 [+0.00] #  RUN           so_rcvbuf.udp_ipv4.rmem_max ...
> # 0.00 [+0.00] # so_rcvbuf.c:151:rmem_max:Expected pages (35) == 0 (0)
> # 0.00 [+0.00] # rmem_max: Test terminated by assertion
> # 0.00 [+0.00] #          FAIL  so_rcvbuf.udp_ipv4.rmem_max
> # 0.00 [+0.00] not ok 1 so_rcvbuf.udp_ipv4.rmem_max
> # 0.01 [+0.00] #  RUN           so_rcvbuf.udp_ipv6.rmem_max ...
> # 0.01 [+0.00] # so_rcvbuf.c:151:rmem_max:Expected pages (35) == 0 (0)
> # 0.01 [+0.00] # rmem_max: Test terminated by assertion
> # 0.01 [+0.00] #          FAIL  so_rcvbuf.udp_ipv6.rmem_max
> # 0.01 [+0.00] not ok 2 so_rcvbuf.udp_ipv6.rmem_max
> # 0.01 [+0.00] # FAILED: 0 / 2 tests passed.
> # 0.01 [+0.00] # Totals: pass:0 fail:2 xfail:0 xpass:0 skip:0 error:0
> not ok 1 selftests: net: so_rcvbuf # exit=1
> 
> 
> Plus we also see failures in udpgso.sh

I forgot to update `size` when skb_condense() is called.

Without the change I saw the new test stuck at 1 page after
udpgso.sh, but with the change both passed.

Will post v5.

Thanks!
Jakub Kicinski March 31, 2025, 6:36 p.m. UTC | #3
On Mon, 31 Mar 2025 11:21:34 -0700 Kuniyuki Iwashima wrote:
> > Plus we also see failures in udpgso.sh  
> 
> I forgot to update `size` when skb_condense() is called.
> 
> Without the change I saw the new test stuck at 1 page after
> udpgso.sh, but with the change both passed.
> 
> Will post v5.

Please do test locally if you can.
Kuniyuki Iwashima March 31, 2025, 6:54 p.m. UTC | #4
From: Jakub Kicinski <kuba@kernel.org>
Date: Mon, 31 Mar 2025 11:36:03 -0700
> On Mon, 31 Mar 2025 11:21:34 -0700 Kuniyuki Iwashima wrote:
> > > Plus we also see failures in udpgso.sh  
> > 
> > I forgot to update `size` when skb_condense() is called.
> > 
> > Without the change I saw the new test stuck at 1 page after
> > udpgso.sh, but with the change both passed.
> > 
> > Will post v5.
> 
> Please do test locally if you can.

Sure, will try the same tests with CI.
Kuniyuki Iwashima March 31, 2025, 8:31 p.m. UTC | #5
From: Kuniyuki Iwashima <kuniyu@amazon.com>
Date: Mon, 31 Mar 2025 11:54:53 -0700
> From: Jakub Kicinski <kuba@kernel.org>
> Date: Mon, 31 Mar 2025 11:36:03 -0700
> > On Mon, 31 Mar 2025 11:21:34 -0700 Kuniyuki Iwashima wrote:
> > > > Plus we also see failures in udpgso.sh  
> > > 
> > > I forgot to update `size` when skb_condense() is called.
> > > 
> > > Without the change I saw the new test stuck at 1 page after
> > > udpgso.sh, but with the change both passed.
> > > 
> > > Will post v5.
> > 
> > Please do test locally if you can.
> 
> Sure, will try the same tests with CI.

Is there a way to tell NIPA to run a test in a dedicated VM ?

I see some tests succeed when executed solely but fail when
executed with

  make -C tools/testing/selftests/ TARGETS=net run_tests

When combined with other tests, assuming that the global UDP usage
will soon drop to 0 is not always easy... so it's defeating the
purpose but I'd drop the test in v5 not to make CI unhappy.
Jakub Kicinski March 31, 2025, 11:29 p.m. UTC | #6
On Mon, 31 Mar 2025 13:31:47 -0700 Kuniyuki Iwashima wrote:
> > > Please do test locally if you can.  
> > 
> > Sure, will try the same tests with CI.  
> 
> Is there a way to tell NIPA to run a test in a dedicated VM ?
> 
> I see some tests succeed when executed solely but fail when
> executed with
> 
>   make -C tools/testing/selftests/ TARGETS=net run_tests
> 
> When combined with other tests, assuming that the global UDP usage
> will soon drop to 0 is not always easy... so it's defeating the
> purpose but I'd drop the test in v5 not to make CI unhappy.

Can we account for some level of system noise? Or try to dump all 
the sockets and count the "accounted for" in-use memory?

We can do various things in NIPA, but I'm not sure if it's okay 
for tests inside net/ should require a completely idle system.
If we want a completely idle system maybe user-mode Linux + kunit
is a better direction?

Willem, WDYT?
Willem de Bruijn April 1, 2025, 1:15 a.m. UTC | #7
Jakub Kicinski wrote:
> On Mon, 31 Mar 2025 13:31:47 -0700 Kuniyuki Iwashima wrote:
> > > > Please do test locally if you can.  
> > > 
> > > Sure, will try the same tests with CI.  
> > 
> > Is there a way to tell NIPA to run a test in a dedicated VM ?
> > 
> > I see some tests succeed when executed solely but fail when
> > executed with
> > 
> >   make -C tools/testing/selftests/ TARGETS=net run_tests
> > 
> > When combined with other tests, assuming that the global UDP usage
> > will soon drop to 0 is not always easy... so it's defeating the
> > purpose but I'd drop the test in v5 not to make CI unhappy.
> 
> Can we account for some level of system noise? Or try to dump all 
> the sockets and count the "accounted for" in-use memory?
> 
> We can do various things in NIPA, but I'm not sure if it's okay 
> for tests inside net/ should require a completely idle system.
> If we want a completely idle system maybe user-mode Linux + kunit
> is a better direction?
> 
> Willem, WDYT?

The number of tests depending on global variables like
proto_memory_allocated is thankfully low.

kselftest/runner.sh runs RUN_IN_NETNS tests in parallel. That sounds
the case here. Perhaps we can add a test option to force running
without concurrent other tests?

Otherwise, the specific test drops usage from MAX to 0. And verifies
to reach MAX before exiting its loop.

Other concurrent tests are unlikely to spike very high. It might just
be sufficient to relax that ASSERT to something a bit higher than 0,
but a far cry from INT_MAX, to mean "clearly no longer stressed".
Kuniyuki Iwashima April 1, 2025, 6:35 p.m. UTC | #8
From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
Date: Mon, 31 Mar 2025 21:15:24 -0400
> Jakub Kicinski wrote:
> > On Mon, 31 Mar 2025 13:31:47 -0700 Kuniyuki Iwashima wrote:
> > > > > Please do test locally if you can.  
> > > > 
> > > > Sure, will try the same tests with CI.  
> > > 
> > > Is there a way to tell NIPA to run a test in a dedicated VM ?
> > > 
> > > I see some tests succeed when executed solely but fail when
> > > executed with
> > > 
> > >   make -C tools/testing/selftests/ TARGETS=net run_tests
> > > 
> > > When combined with other tests, assuming that the global UDP usage
> > > will soon drop to 0 is not always easy... so it's defeating the
> > > purpose but I'd drop the test in v5 not to make CI unhappy.
> > 
> > Can we account for some level of system noise? Or try to dump all 
> > the sockets and count the "accounted for" in-use memory?
> > 
> > We can do various things in NIPA, but I'm not sure if it's okay 
> > for tests inside net/ should require a completely idle system.
> > If we want a completely idle system maybe user-mode Linux + kunit
> > is a better direction?
> > 
> > Willem, WDYT?
> 
> The number of tests depending on global variables like
> proto_memory_allocated is thankfully low.
> 
> kselftest/runner.sh runs RUN_IN_NETNS tests in parallel. That sounds
> the case here. Perhaps we can add a test option to force running
> without concurrent other tests?
> 
> Otherwise, the specific test drops usage from MAX to 0. And verifies
> to reach MAX before exiting its loop.

Yes, and we need to query rmem_alloc via netlink.

Also, I haven't investigated, but I saw a weird timeout, when the usage
stuck at 523914, which is smaller than INT_MAX >> PAGE_SHIFT.

Probably the test needs to check sockets' rmem_alloc to be accurate.

I'll drop the test for now and follow up in net-next.

Thanks!