
mptcp: update add_addr_accepted and accept_add after subflow added

Message ID 1716543865-37448-1-git-send-email-liyonglong@chinatelecom.cn (mailing list archive)
State Superseded, archived
Delegated to: Matthieu Baerts
Series mptcp: update add_addr_accepted and accept_add after subflow added

Checks

Context Check Description
matttbe/build success Build and static analysis OK
matttbe/checkpatch success total: 0 errors, 0 warnings, 0 checks, 31 lines checked
matttbe/shellcheck success MPTCP selftests files have not been modified
matttbe/KVM_Validation__normal success Success! ✅
matttbe/KVM_Validation__debug success Success! ✅
matttbe/KVM_Validation__btf__only_bpftest_all_ success Success! ✅

Commit Message

YonglongLi May 24, 2024, 9:44 a.m. UTC
From: YonglongLi <liyonglong@chinatelecom.cn>

Receiving an RM_ADDR only updates pm.add_addr_accepted and pm.accept_addr
if the removed ID matches the remote ID of an existing subflow. So, when
an ADD_ADDR is received, pm.add_addr_accepted and pm.accept_addr should
only be updated after the subflow has been added to conn_list.

Fixes: f7d6a237d742 ("mptcp: fix per socket endpoint accounting")

Signed-off-by: YonglongLi <liyonglong@chinatelecom.cn>
---
 net/mptcp/pm_netlink.c | 16 ++++++++++------
 1 file changed, 10 insertions(+), 6 deletions(-)

Comments

MPTCP CI May 24, 2024, 10:45 a.m. UTC | #1
Hi YonglongLi,

Thank you for your modifications, that's great!

Our CI did some validations and here is its report:

- KVM Validation: normal: Success! ✅
- KVM Validation: debug: Success! ✅
- KVM Validation: btf (only bpftest_all): Success! ✅
- Task: https://github.com/multipath-tcp/mptcp_net-next/actions/runs/9222300685

Initiator: Patchew Applier
Commits: https://github.com/multipath-tcp/mptcp_net-next/commits/3e6551d3f61f
Patchwork: https://patchwork.kernel.org/project/mptcp/list/?series=855647


If there are some issues, you can reproduce them using the same environment as
the one used by the CI thanks to a docker image, e.g.:

    $ cd [kernel source code]
    $ docker run -v "${PWD}:${PWD}:rw" -w "${PWD}" --privileged --rm -it \
        --pull always mptcp/mptcp-upstream-virtme-docker:latest \
        auto-normal

For more details:

    https://github.com/multipath-tcp/mptcp-upstream-virtme-docker


Please note that despite all the efforts already made to have a stable test
suite when executed on a public CI like this one, it is possible that some
reported issues are not due to your modifications. Still, do not hesitate to
help us improve that ;-)

Cheers,
MPTCP GH Action bot
Bot operated by Matthieu Baerts (NGI0 Core)
Matthieu Baerts (NGI0) May 24, 2024, 3:06 p.m. UTC | #2
Hi Yonglong Li,

On 24/05/2024 11:44, Yonglong Li wrote:
> From: YonglongLi <liyonglong@chinatelecom.cn>
> 
> receive RM_ADDR will update pm.add_addr_accepted and pm.add_addr_accepted
> only if remove id match remote id of subflow. so receive ADD_ADDR should
> update pm.add_addr_accepted and pm.add_addr_accepted after subflow added
> to conn_list.

Do you think it could be possible to cover this case by modifying/adding
one subtest in our selftests?

The commit message is not very clear to me. ("add_addr_accepted" are
also always duplicated, I'm not sure why).

I understand we can have issues with __mptcp_subflow_connect() and we
might want to increment the counter only if it was possible to create a
subflow (even if the connection can fail later -- I'm not sure whether
we decrement the counter in this case as well), but I'm not sure to
understand the context: the reason why this call could fail, and the
link with RM_ADDR.

> 
> Fixes: f7d6a237d742 ("mptcp: fix per socket endpoint accounting")
> 

(no new line here)

> Signed-off-by: YonglongLi <liyonglong@chinatelecom.cn>
> ---
>  net/mptcp/pm_netlink.c | 16 ++++++++++------
>  1 file changed, 10 insertions(+), 6 deletions(-)
> 
> diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
> index 766a840..117cc7b 100644
> --- a/net/mptcp/pm_netlink.c
> +++ b/net/mptcp/pm_netlink.c
> @@ -676,6 +676,7 @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
>  	struct sock *sk = (struct sock *)msk;
>  	unsigned int add_addr_accept_max;
>  	struct mptcp_addr_info remote;
> +	bool subflow_added = false;
>  	unsigned int subflows_max;
>  	int i, nr;
>  
> @@ -704,15 +705,18 @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
>  	if (nr == 0)
>  		return;
>  
> -	msk->pm.add_addr_accepted++;
> -	if (msk->pm.add_addr_accepted >= add_addr_accept_max ||
> -	    msk->pm.subflows >= subflows_max)
> -		WRITE_ONCE(msk->pm.accept_addr, false);
> -
>  	spin_unlock_bh(&msk->pm.lock);
>  	for (i = 0; i < nr; i++)
> -		__mptcp_subflow_connect(sk, &addrs[i], &remote);
> +		if (!__mptcp_subflow_connect(sk, &addrs[i], &remote))
> +			subflow_added = true;

Probably clearer to use:

  if (__mptcp_subflow_connect(...) == 0)

Otherwise, we might think the subflow is added in case of issues.
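
For reference, with that change the whole hunk would read roughly like this
(same names as in the patch, only the return-value check spelled out):

	spin_unlock_bh(&msk->pm.lock);
	/* only count the address as accepted if at least one subflow
	 * could actually be created for it
	 */
	for (i = 0; i < nr; i++)
		if (__mptcp_subflow_connect(sk, &addrs[i], &remote) == 0)
			subflow_added = true;
	spin_lock_bh(&msk->pm.lock);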

>  	spin_lock_bh(&msk->pm.lock);
> +
> +	if (subflow_added) {
> +		msk->pm.add_addr_accepted++;
> +		if (msk->pm.add_addr_accepted >= add_addr_accept_max ||
> +		    msk->pm.subflows >= subflows_max)
> +			WRITE_ONCE(msk->pm.accept_addr, false);
> +	}
>  }
>  
>  void mptcp_pm_nl_addr_send_ack(struct mptcp_sock *msk)

Cheers,
Matt
YonglongLi May 29, 2024, 9:37 a.m. UTC | #3
Hi Matt,

Sorry for the late reply, and thanks for your review.

On 5/24/2024 23:06, Matthieu Baerts wrote:
> Hi Yonglong Li,
> 
> On 24/05/2024 11:44, Yonglong Li wrote:
>> From: YonglongLi <liyonglong@chinatelecom.cn>
>>
>> receive RM_ADDR will update pm.add_addr_accepted and pm.add_addr_accepted
>> only if remove id match remote id of subflow. so receive ADD_ADDR should
>> update pm.add_addr_accepted and pm.add_addr_accepted after subflow added
>> to conn_list.
> 
> Do you think it could be possible to cover this case by modifying/adding
> one subtest in our selftests?
> 

Yes, a test case for this issue could look like this:

       if reset "remove invalid address and re-add"; then
               pm_nl_set_limits $ns1 2 2
               pm_nl_set_limits $ns2 2 2
               # 224.0.0.1 is a multicast address: the subflow to it cannot be established
               pm_nl_add_endpoint $ns1 224.0.0.1 flags signal
               #pm_nl_add_endpoint $ns1 10.0.4.1  flags signal
               run_tests $ns1 $ns2 10.0.1.1 0 -1 0 speed_10 2>/dev/null &
               sleep 0.5

               pm_nl_add_endpoint $ns1 10.0.2.1 flags signal
               pm_nl_add_endpoint $ns1 10.0.3.1 flags signal
               wait_mpj $ns2
               chk_join_nr 2 2 2
               chk_rm_nr 1 0 invert
               chk_add_nr 3 3
               chk_rst_nr 0 0
               kill_tests_wait
       fi


> The commit message is not very clear to me. ("add_addr_accepted" are
> also always duplicated, I'm not sure why).

Sorry, my mistake. I originally meant "pm.add_addr_accepted and pm.accept_addr".

> 
> I understand we can have issues with __mptcp_subflow_connect() and we
> might want to increment the counter only if it was possible to create a
> subflow (even if the connection can fail later -- I'm not sure whether
> we decrement the counter in this case as well), but I'm not sure to
> understand the context: the reason why this call could fail, and the
> link with RM_ADDR.
> 

I think we should make sure that pm.add_addr_accepted and pm.accept_addr are only
updated after the new subflow has been added to conn_list when handling a received
ADD_ADDR. And we should decrement the counter when the last subflow for that
address ID is unlinked from conn_list.

Decrementing the counter could look like this:

@@ -827,10 +829,7 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk,
                if (!mptcp_pm_is_kernel(msk))
                        continue;

-               if (rm_type == MPTCP_MIB_RMADDR) {
-                       msk->pm.add_addr_accepted--;
-                       WRITE_ONCE(msk->pm.accept_addr, true);
-               } else if (rm_type == MPTCP_MIB_RMSUBFLOW) {
+               if (rm_type == MPTCP_MIB_RMSUBFLOW) {
                        msk->pm.local_addr_used--;
                }
        }

@@ -757,11 +757,32 @@ static void subflow_ulp_fallback(struct sock *sk,
 void mptcp_subflow_drop_ctx(struct sock *ssk)
 {
        struct mptcp_subflow_context *ctx = mptcp_subflow_ctx(ssk);
+       struct mptcp_sock *msk = NULL;

        if (!ctx)
                return;

        list_del(&mptcp_subflow_ctx(ssk)->node);
+        if (ctx->conn)
+                msk = mptcp_sk(ctx->conn);
+        if (msk && !list_empty(&msk->conn_list)) {
+                struct mptcp_subflow_context *subflow, *tmp;
+                u8 id = subflow_get_local_id(ctx);
+                bool lastone = true;
+                mptcp_for_each_subflow_safe(msk, subflow, tmp) {
+                        if (id == 0 || id == subflow_get_local_id(subflow)) {
+                                lastone = false;
+                                break;
+                        }
+                }
+                if (lastone == true) {
+                        spin_lock_bh(&msk->pm.lock);
+                        msk->pm.add_addr_accepted++;
+                        WRITE_ONCE(msk->pm.accept_addr, true);
+                        spin_unlock_bh(&msk->pm.lock);
+                }
+        }
+
        if (inet_csk(ssk)->icsk_ulp_ops) {
                subflow_ulp_fallback(ssk, ctx);
                if (ctx->conn)

And do the same thing in __mptcp_close_ssk

WDYT?

>>
>> Fixes: f7d6a237d742 ("mptcp: fix per socket endpoint accounting")
>>
> 
> (no new line here)
> 
>> Signed-off-by: YonglongLi <liyonglong@chinatelecom.cn>
>> ---
>>  net/mptcp/pm_netlink.c | 16 ++++++++++------
>>  1 file changed, 10 insertions(+), 6 deletions(-)
>>
>> diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
>> index 766a840..117cc7b 100644
>> --- a/net/mptcp/pm_netlink.c
>> +++ b/net/mptcp/pm_netlink.c
>> @@ -676,6 +676,7 @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
>>  	struct sock *sk = (struct sock *)msk;
>>  	unsigned int add_addr_accept_max;
>>  	struct mptcp_addr_info remote;
>> +	bool subflow_added = false;
>>  	unsigned int subflows_max;
>>  	int i, nr;
>>  
>> @@ -704,15 +705,18 @@ static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
>>  	if (nr == 0)
>>  		return;
>>  
>> -	msk->pm.add_addr_accepted++;
>> -	if (msk->pm.add_addr_accepted >= add_addr_accept_max ||
>> -	    msk->pm.subflows >= subflows_max)
>> -		WRITE_ONCE(msk->pm.accept_addr, false);
>> -
>>  	spin_unlock_bh(&msk->pm.lock);
>>  	for (i = 0; i < nr; i++)
>> -		__mptcp_subflow_connect(sk, &addrs[i], &remote);
>> +		if (!__mptcp_subflow_connect(sk, &addrs[i], &remote))
>> +			subflow_added = true;
> 
> Probably clearer to use:
> 
>   if (__mptcp_subflow_connect(...) == 0)
> 
> Otherwise, we might think the subflow is added in case of issues.
> 
>>  	spin_lock_bh(&msk->pm.lock);
>> +
>> +	if (subflow_added) {
>> +		msk->pm.add_addr_accepted++;
>> +		if (msk->pm.add_addr_accepted >= add_addr_accept_max ||
>> +		    msk->pm.subflows >= subflows_max)
>> +			WRITE_ONCE(msk->pm.accept_addr, false);
>> +	}
>>  }
>>  
>>  void mptcp_pm_nl_addr_send_ack(struct mptcp_sock *msk)
> 
> Cheers,
> Matt
Matthieu Baerts (NGI0) June 4, 2024, 3:03 p.m. UTC | #4
Hi Yonglong Li,

Thank you for the reply!

On 29/05/2024 11:37, YonglongLi wrote:
> Hi Matt,
> 
> Sorry for the late reply, and thanks for your review.
> 
> On 5/24/2024 23:06, Matthieu Baerts wrote:
>> Hi Yonglong Li,
>>
>> On 24/05/2024 11:44, Yonglong Li wrote:
>>> From: YonglongLi <liyonglong@chinatelecom.cn>
>>>
>>> receive RM_ADDR will update pm.add_addr_accepted and pm.add_addr_accepted
>>> only if remove id match remote id of subflow. so receive ADD_ADDR should
>>> update pm.add_addr_accepted and pm.add_addr_accepted after subflow added
>>> to conn_list.
>>
>> Do you think it could be possible to cover this case by modifying/adding
>> one subtest in our selftests?
>>
> 
> Yes, a test case for this issue could look like this:
> 
>        if reset "remove invalid address and re-add"; then
>                pm_nl_set_limits $ns1 2 2
>                pm_nl_set_limits $ns2 2 2
>                # 224.0.0.1 is a multicast address: the subflow to it cannot be established
>                pm_nl_add_endpoint $ns1 224.0.0.1 flags signal
>                #pm_nl_add_endpoint $ns1 10.0.4.1  flags signal
>                run_tests $ns1 $ns2 10.0.1.1 0 -1 0 speed_10 2>/dev/null &
>                sleep 0.5
> 
>                pm_nl_add_endpoint $ns1 10.0.2.1 flags signal
>                pm_nl_add_endpoint $ns1 10.0.3.1 flags signal
>                wait_mpj $ns2
>                chk_join_nr 2 2 2

Just to be sure I understand the patch: before the modification, here,
we only see one MP_JOIN, because the client has reached the limit of 2
subflows: one subflow to "224.0.0.1" and one to "10.0.2.1". The reason
is that even if the subflow to "224.0.0.1" has failed, it has
incremented the counter, but this counter will never be decremented,
even after having received a RM_ADDR, right?

So there are naively 3 solutions I guess:

(1) either we make sure the counter is decremented when receiving the
corresponding RM_ADDR
(2) or increment it only in case of success
(3) or decrement it only in case of failure

For (1), we would want to decrement the counter only for addresses that are
actually in use: that means the counter should already have been decremented
earlier if the address is no longer used (or we need to store the received
ADD_ADDRs).

If we don't want to store received ADD_ADDR, for (2) and (3), we need to
take into account two types of errors:

(a) at the creation time, when calling __mptcp_subflow_connect(): in
this case, the subflow will not be part of the conn_list after the call

(b) later on: after having received a RST, ICMP, timeout, ... or the
other peer simply decided to close it.

For (a), that can be done with the patch you sent. For (b), that's a bit
more complex, and linked with the diff you sent below.

>                chk_rm_nr 1 0 invert
>                chk_add_nr 3 3
>                chk_rst_nr 0 0
>                kill_tests_wait
>        fi
> 
> 
>> The commit message is not very clear to me. ("add_addr_accepted" are
>> also always duplicated, I'm not sure why).
> 
> Sorry, my mistake. I originally meant "pm.add_addr_accepted and pm.accept_addr".

OK, clearer.

>> I understand we can have issues with __mptcp_subflow_connect() and we
>> might want to increment the counter only if it was possible to create a
>> subflow (even if the connection can fail later -- I'm not sure whether
>> we decrement the counter in this case as well), but I'm not sure to
>> understand the context: the reason why this call could fail, and the
>> link with RM_ADDR.
>>
> 
> I think we should make sure that pm.add_addr_accepted and pm.accept_addr are only
> updated after the new subflow has been added to conn_list when handling a received
> ADD_ADDR. And we should decrement the counter when the last subflow for that
> address ID is unlinked from conn_list.
> 
> Decrementing the counter could look like this:
> 
> @@ -827,10 +829,7 @@ static void mptcp_pm_nl_rm_addr_or_subflow(struct mptcp_sock *msk,
>                 if (!mptcp_pm_is_kernel(msk))
>                         continue;
> 
> -               if (rm_type == MPTCP_MIB_RMADDR) {
> -                       msk->pm.add_addr_accepted--;
> -                       WRITE_ONCE(msk->pm.accept_addr, true);
> -               } else if (rm_type == MPTCP_MIB_RMSUBFLOW) {
> +               if (rm_type == MPTCP_MIB_RMSUBFLOW) {
>                         msk->pm.local_addr_used--;
>                 }
>         }
> 
> @@ -757,11 +757,32 @@ static void subflow_ulp_fallback(struct sock *sk,
>  void mptcp_subflow_drop_ctx(struct sock *ssk)
>  {
>         struct mptcp_subflow_context *ctx = mptcp_subflow_ctx(ssk);
> +       struct mptcp_sock *msk = NULL;
> 
>         if (!ctx)
>                 return;
> 
>         list_del(&mptcp_subflow_ctx(ssk)->node);
> +        if (ctx->conn)
> +                msk = mptcp_sk(ctx->conn);
> +        if (msk && !list_empty(&msk->conn_list)) {
> +                struct mptcp_subflow_context *subflow, *tmp;
> +                u8 id = subflow_get_local_id(ctx);
> +                bool lastone = true;
> +                mptcp_for_each_subflow_safe(msk, subflow, tmp) {
> +                        if (id == 0 || id == subflow_get_local_id(subflow)) {
> +                                lastone = false;
> +                                break;
> +                        }
> +                }
> +                if (lastone == true) {
> +                        spin_lock_bh(&msk->pm.lock);
> +                        msk->pm.add_addr_accepted++;

(you want to decrement it I suppose)

> +                        WRITE_ONCE(msk->pm.accept_addr, true);
> +                        spin_unlock_bh(&msk->pm.lock);
> +                }
> +        }
> +
>         if (inet_csk(ssk)->icsk_ulp_ops) {
>                 subflow_ulp_fallback(ssk, ctx);
>                 if (ctx->conn)
> 
> And do the same thing in __mptcp_close_ssk

Yes, that's the idea (but this should be done in pm_netlink.c, only if
the in-kernel PM is being used, etc.)

I wonder if it would not be interesting to keep the list of received
ADD_ADDR instead of relying on the conn_list. That might simplify the
case here, and the in-kernel PM code in general I think, no? If we want
to take this direction, we can still apply the patch here (with a better
commit message, and a new selftest) as it fixes one part of the issue,
and it can be backported to older versions. The modification of the
stored ADD_ADDR, and the fix for subflows that are being closed before
receiving an RM_ADDR can be done only in future versions.
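
Purely as an illustration of that direction (hypothetical names, nothing that
exists upstream today): the in-kernel PM could remember each accepted ADD_ADDR
in a small per-msk list instead of walking conn_list, e.g.:

	/* Hypothetical sketch, not existing MPTCP code: track announced
	 * addresses so that RM_ADDR / subflow-close handling does not have to
	 * be inferred from conn_list (needs <linux/list.h> and <linux/slab.h>).
	 */
	struct pm_announced_addr {
		struct list_head	list;
		struct mptcp_addr_info	addr;	/* address announced by the peer */
	};

	/* called with msk->pm.lock held, once the ADD_ADDR has been accepted */
	static void pm_remember_announced(struct list_head *announced,
					  const struct mptcp_addr_info *addr)
	{
		struct pm_announced_addr *entry;

		entry = kmalloc(sizeof(*entry), GFP_ATOMIC);
		if (!entry)
			return;

		entry->addr = *addr;
		list_add_tail(&entry->list, announced);
	}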

I will mention that at our public weekly meeting tomorrow.

Cheers,
Matt

Patch

diff --git a/net/mptcp/pm_netlink.c b/net/mptcp/pm_netlink.c
index 766a840..117cc7b 100644
--- a/net/mptcp/pm_netlink.c
+++ b/net/mptcp/pm_netlink.c
@@ -676,6 +676,7 @@  static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
 	struct sock *sk = (struct sock *)msk;
 	unsigned int add_addr_accept_max;
 	struct mptcp_addr_info remote;
+	bool subflow_added = false;
 	unsigned int subflows_max;
 	int i, nr;
 
@@ -704,15 +705,18 @@  static void mptcp_pm_nl_add_addr_received(struct mptcp_sock *msk)
 	if (nr == 0)
 		return;
 
-	msk->pm.add_addr_accepted++;
-	if (msk->pm.add_addr_accepted >= add_addr_accept_max ||
-	    msk->pm.subflows >= subflows_max)
-		WRITE_ONCE(msk->pm.accept_addr, false);
-
 	spin_unlock_bh(&msk->pm.lock);
 	for (i = 0; i < nr; i++)
-		__mptcp_subflow_connect(sk, &addrs[i], &remote);
+		if (!__mptcp_subflow_connect(sk, &addrs[i], &remote))
+			subflow_added = true;
 	spin_lock_bh(&msk->pm.lock);
+
+	if (subflow_added) {
+		msk->pm.add_addr_accepted++;
+		if (msk->pm.add_addr_accepted >= add_addr_accept_max ||
+		    msk->pm.subflows >= subflows_max)
+			WRITE_ONCE(msk->pm.accept_addr, false);
+	}
 }
 
 void mptcp_pm_nl_addr_send_ack(struct mptcp_sock *msk)