
[mptcp-next,v2,2/2] Squash-to: "mptcp: give rcvlowat some love"

Message ID 21dd84bd44a8d0839564bed351e7d77a16b68474.1696237983.git.pabeni@redhat.com (mailing list archive)
State Accepted, archived
Commit 4b8e173a713a1b707a012257356acfc7a66430f9
Delegated to: Matthieu Baerts
Series [mptcp-next,v2,1/2] tcp: define initial scaling factor value as a macro

Commit Message

Paolo Abeni Oct. 2, 2023, 9:13 a.m. UTC
Christoph reported a couple of serious splats caused by
the mentioned patch.

mptcp_set_rcvlowat() can use msk->scaling_ratio before
that field is initialized, causing a divide by zero: we
need to init it in the sock constructor.
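
For illustration, the conversion that trips here is a fixed-point
division by the per-socket scaling ratio. The user-space sketch below
only mirrors the shape of that arithmetic (the helper, the SCALE
constant and the default value are stand-ins, not the kernel code) to
show why a never-initialized, i.e. zero, ratio is fatal:

#include <stdio.h>
#include <stdint.h>

#define SCALE 8	/* assumed fixed-point shift, stand-in for the kernel one */

/* rough stand-in for the win -> space conversion that
 * mptcp_set_rcvlowat() performs via __tcp_space_from_win()
 */
static int space_from_win(uint8_t scaling_ratio, int win)
{
	if (!scaling_ratio)	/* field never initialized: the kernel divides by zero here */
		return -1;
	return (int)(((uint64_t)win << SCALE) / scaling_ratio);
}

int main(void)
{
	uint8_t before_fix = 0;			/* msk->scaling_ratio left at zero */
	uint8_t after_fix = 1 << (SCALE - 1);	/* some sane default set in the constructor */

	printf("before fix: %d\n", space_from_win(before_fix, 65536));
	printf("after fix:  %d\n", space_from_win(after_fix, 65536));
	return 0;
}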

Additionally, the same function bogusly cast an msk to a
tcp_sock, causing memory corruption: with the reproducer,
this likely clears the sk refcount of the next msk
allocated in the same slab.
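
To make the cast hazard concrete: a tcp_sock is larger than an
mptcp_sock, so writing tcp_sock fields through the mptcp socket
pointer lands beyond the msk object, in whatever the slab allocated
next. The sketch below uses made-up structures (names, fields and
sizes are assumptions; only the size relationship matters):

#include <stdio.h>
#include <stddef.h>

/* made-up layouts: only the fact that the tcp-like struct is
 * bigger than the mptcp-like one matters
 */
struct fake_mptcp_sock {
	char	common[64];	/* shared leading part, think struct sock */
	char	mptcp_priv[32];
};

struct fake_tcp_sock {
	char	common[64];	/* same shared leading part */
	char	tcp_priv[256];
	int	window_clamp;
};

int main(void)
{
	/* a window_clamp write through the bogus cast falls at an
	 * offset larger than the whole mptcp-like object
	 */
	printf("sizeof(fake_mptcp_sock) = %zu\n",
	       sizeof(struct fake_mptcp_sock));
	printf("offsetof(fake_tcp_sock, window_clamp) = %zu\n",
	       offsetof(struct fake_tcp_sock, window_clamp));
	return 0;
}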

The intent was to properly propagate the rcvbuf changes to
the subflows. Let's do that explicitly.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
--
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/442
Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/443

Since the above issues are introduced by the squash-to patch, I think
we can't have the Closes tags in the final patch.

v1 -> v2:
 - use scaling_ratio define (Mat)
---
 net/mptcp/protocol.c |  1 +
 net/mptcp/sockopt.c  | 17 ++++++++++++++---
 2 files changed, 15 insertions(+), 3 deletions(-)

Comments

Matthieu Baerts (NGI0) Oct. 3, 2023, 4:59 p.m. UTC | #1
Hi Paolo,

On 02/10/2023 11:13, Paolo Abeni wrote:
> Christoph reported a couple of serious splats caused by
> the mentioned patch.
> 
> mptcp_set_rcvlowat() can use msk->scaling_ratio before
> that field is initialized, causing a divide by zero: we
> need to init it in the sock constructor.
> 
> Additionally, the same function bogusly cast an msk to a
> tcp_sock, causing memory corruption: with the reproducer,
> this likely clears the sk refcount of the next msk
> allocated in the same slab.
> 
> The intent was to properly propagate the rcvbuf changes to
> the subflows. Let's do that explicitly.
> 
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> --
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/442
> Closes: https://github.com/multipath-tcp/mptcp_net-next/issues/443

Thank you for the patches!

Now in our tree (I added patch 1/2 just before "mptcp: give rcvlowat
some love").

New patches for t/upstream:
- 3a6e7c602676: tcp: define initial scaling factor value as a macro
- 4b8e173a713a: "squashed" patch 2/2 in "mptcp: give rcvlowat some love"
- Results: 3cbb19268039..ad8e597b726a (export)

Tests are now in progress:

https://cirrus-ci.com/github/multipath-tcp/mptcp_net-next/export/20231003T165758

Cheers,
Matt

Patch

diff --git a/net/mptcp/protocol.c b/net/mptcp/protocol.c
index 6f9e116598ed..990fcf53074a 100644
--- a/net/mptcp/protocol.c
+++ b/net/mptcp/protocol.c
@@ -2758,6 +2758,7 @@  static void __mptcp_init_sock(struct sock *sk)
 	msk->rmem_fwd_alloc = 0;
 	WRITE_ONCE(msk->rmem_released, 0);
 	msk->timer_ival = TCP_RTO_MIN;
+	msk->scaling_ratio = TCP_DEFAULT_SCALING_RATIO;
 
 	WRITE_ONCE(msk->first, NULL);
 	inet_csk(sk)->icsk_sync_mss = mptcp_sync_mss;
diff --git a/net/mptcp/sockopt.c b/net/mptcp/sockopt.c
index 340e87195a27..6b37946b5c52 100644
--- a/net/mptcp/sockopt.c
+++ b/net/mptcp/sockopt.c
@@ -1473,6 +1473,7 @@  void mptcp_sockopt_sync_locked(struct mptcp_sock *msk, struct sock *ssk)
  */
 int mptcp_set_rcvlowat(struct sock *sk, int val)
 {
+	struct mptcp_subflow_context *subflow;
 	int space, cap;
 
 	if (sk->sk_userlocks & SOCK_RCVBUF_LOCK)
@@ -1490,9 +1491,19 @@  int mptcp_set_rcvlowat(struct sock *sk, int val)
 		return 0;
 
 	space = __tcp_space_from_win(mptcp_sk(sk)->scaling_ratio, val);
-	if (space > sk->sk_rcvbuf) {
-		WRITE_ONCE(sk->sk_rcvbuf, space);
-		tcp_sk(sk)->window_clamp = val;
+	if (space <= sk->sk_rcvbuf)
+		return 0;
+
+	/* propagate the rcvbuf changes to all the subflows */
+	WRITE_ONCE(sk->sk_rcvbuf, space);
+	mptcp_for_each_subflow(mptcp_sk(sk), subflow) {
+		struct sock *ssk = mptcp_subflow_tcp_sock(subflow);
+		bool slow;
+
+		slow = lock_sock_fast(ssk);
+		WRITE_ONCE(ssk->sk_rcvbuf, space);
+		tcp_sk(ssk)->window_clamp = val;
+		unlock_sock_fast(ssk, slow);
 	}
 	return 0;
 }
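
For context, this path is reached from user space by setting
SO_RCVLOWAT on an MPTCP socket, since the squashed-to patch hooks
mptcp_set_rcvlowat() up as the socket's ->set_rcvlowat() handler. A
minimal usage sketch (assumes a kernel with CONFIG_MPTCP;
IPPROTO_MPTCP is defined below in case the libc headers lack it):

#include <stdio.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>

#ifndef IPPROTO_MPTCP
#define IPPROTO_MPTCP 262
#endif

int main(void)
{
	int val = 64 * 1024;	/* arbitrary receive low-water mark */
	int fd = socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP);	/* needs MPTCP support in the running kernel */

	if (fd < 0) {
		perror("socket(AF_INET, SOCK_STREAM, IPPROTO_MPTCP)");
		return 1;
	}
	/* ends up in mptcp_set_rcvlowat(); with this patch a grown
	 * rcvbuf is propagated to every established subflow
	 */
	if (setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT, &val, sizeof(val)) < 0)
		perror("setsockopt(SO_RCVLOWAT)");
	close(fd);
	return 0;
}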