diff mbox series

[net-next,v3] net/smc: avoid atomic_set and smp_wmb in the tx path when possible

Message ID 20231117111657.16266-1-lirongqing@baidu.com (mailing list archive)
State Changes Requested
Delegated to: Netdev Maintainers
Headers show
Series [net-next,v3] net/smc: avoid atomic_set and smp_wmb in the tx path when possible | expand

Checks

Context Check Description
netdev/series_format success Single patches do not need cover letters
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1127 this patch: 1127
netdev/cc_maintainers warning 1 maintainers not CCed: tonylu@linux.alibaba.com
netdev/build_clang success Errors and warnings before: 1154 this patch: 1154
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1154 this patch: 1154
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 20 lines checked
netdev/build_clang_rust success No Rust files in patch. Skipping build
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Li RongQing Nov. 17, 2023, 11:16 a.m. UTC
There is rare possibility that conn->tx_pushing is not 1, since
tx_pushing is just checked with 1, so move the setting tx_pushing
to 1 after atomic_dec_and_test() return false, to avoid atomic_set
and smp_wmb in tx path

Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
---

Comments

Alexandra Winter Nov. 17, 2023, 12:27 p.m. UTC | #1
On 17.11.23 12:16, Li RongQing wrote:
> There is rare possibility that conn->tx_pushing is not 1, since
> tx_pushing is just checked with 1, so move the setting tx_pushing
> to 1 after atomic_dec_and_test() return false, to avoid atomic_set
> and smp_wmb in tx path
> 
> Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> ---
> diff v3: improvements in the commit body and comments
> diff v2: fix a typo in commit body and add net-next subject-prefix
>  net/smc/smc_tx.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
> 
> diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
> index 3b0ff3b..2c2933f 100644
> --- a/net/smc/smc_tx.c
> +++ b/net/smc/smc_tx.c
> @@ -667,8 +667,6 @@ int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
>  		return 0;
>  
>  again:
> -	atomic_set(&conn->tx_pushing, 1);
> -	smp_wmb(); /* Make sure tx_pushing is 1 before real send */
>  	rc = __smc_tx_sndbuf_nonempty(conn);
>  
>  	/* We need to check whether someone else have added some data into
> @@ -677,8 +675,11 @@ int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
>  	 * If so, we need to push again to prevent those data hang in the send
>  	 * queue.
>  	 */
> -	if (unlikely(!atomic_dec_and_test(&conn->tx_pushing)))
> +	if (unlikely(!atomic_dec_and_test(&conn->tx_pushing))) {
> +		atomic_set(&conn->tx_pushing, 1);
> +		smp_wmb(); /* Make sure tx_pushing is 1 before send again */
>  		goto again;
> +	}
>  
>  	return rc;
>  }

It seems to me that the purpose of conn->tx_pushing is
a) Serve as a mutex, so only one thread per conn will call __smc_tx_sndbuf_nonempty().
b) Repeat, in case some other thread has added data to sndbuf concurrently.

I agree that this patch does not change the behaviour of this function and removes an
atomic_set() in the likely path.

I wonder however: All callers of smc_tx_sndbuf_nonempty() must hold the socket lock.
So how can we ever run in a concurrency situation?
Is this handling of conn->tx_pushing necessary at all?
Dust Li Nov. 20, 2023, 3:20 a.m. UTC | #2
On Fri, Nov 17, 2023 at 01:27:57PM +0100, Alexandra Winter wrote:
>
>
>On 17.11.23 12:16, Li RongQing wrote:
>> There is rare possibility that conn->tx_pushing is not 1, since
>> tx_pushing is just checked with 1, so move the setting tx_pushing
>> to 1 after atomic_dec_and_test() return false, to avoid atomic_set
>> and smp_wmb in tx path
>> 
>> Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
>> Signed-off-by: Li RongQing <lirongqing@baidu.com>
>> ---
>> diff v3: improvements in the commit body and comments
>> diff v2: fix a typo in commit body and add net-next subject-prefix
>>  net/smc/smc_tx.c | 7 ++++---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>> 
>> diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
>> index 3b0ff3b..2c2933f 100644
>> --- a/net/smc/smc_tx.c
>> +++ b/net/smc/smc_tx.c
>> @@ -667,8 +667,6 @@ int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
>>  		return 0;
>>  
>>  again:
>> -	atomic_set(&conn->tx_pushing, 1);
>> -	smp_wmb(); /* Make sure tx_pushing is 1 before real send */
>>  	rc = __smc_tx_sndbuf_nonempty(conn);
>>  
>>  	/* We need to check whether someone else have added some data into
>> @@ -677,8 +675,11 @@ int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
>>  	 * If so, we need to push again to prevent those data hang in the send
>>  	 * queue.
>>  	 */
>> -	if (unlikely(!atomic_dec_and_test(&conn->tx_pushing)))
>> +	if (unlikely(!atomic_dec_and_test(&conn->tx_pushing))) {
>> +		atomic_set(&conn->tx_pushing, 1);
>> +		smp_wmb(); /* Make sure tx_pushing is 1 before send again */
>>  		goto again;
>> +	}
>>  
>>  	return rc;
>>  }
>
>It seems to me that the purpose of conn->tx_pushing is
>a) Serve as a mutex, so only one thread per conn will call __smc_tx_sndbuf_nonempty().
>b) Repeat, in case some other thread has added data to sndbuf concurrently.
>
>I agree that this patch does not change the behaviour of this function and removes an
>atomic_set() in the likely path.
>
>I wonder however: All callers of smc_tx_sndbuf_nonempty() must hold the socket lock.
>So how can we ever run in a concurrency situation?
>Is this handling of conn->tx_pushing necessary at all?

Hi Sandy,

Overall, I think you are right. But there is something we need to take care.

Before commit 6b88af839d20 ("net/smc: don't send in the BH context if
sock_owned_by_user"), we used to call smc_tx_pending() in the soft IRQ,
without checking sock_owned_by_user(), which would caused a race condition
because bh_lock_sock() did not honor sock_lock(). To address this issue,
I have added the tx_pushing mechanism. However, with commit 6b88af839d20,
we now defer the transmission if sock_lock() is held by the user.
Therefore, there should no longer be a race condition. Nevertheless, if
we remove the tx_pending mechanism, we must always remember not to call
smc_tx_sndbuf_nonempty() in the soft IRQ when the user holds the sock lock.

Thanks
Dust
Alexandra Winter Nov. 20, 2023, 9:11 a.m. UTC | #3
On 20.11.23 04:20, Dust Li wrote:
>> It seems to me that the purpose of conn->tx_pushing is
>> a) Serve as a mutex, so only one thread per conn will call __smc_tx_sndbuf_nonempty().
>> b) Repeat, in case some other thread has added data to sndbuf concurrently.
>>
>> I agree that this patch does not change the behaviour of this function and removes an
>> atomic_set() in the likely path.
>>
>> I wonder however: All callers of smc_tx_sndbuf_nonempty() must hold the socket lock.
>> So how can we ever run in a concurrency situation?
>> Is this handling of conn->tx_pushing necessary at all?
> Hi Sandy,
> 
> Overall, I think you are right. But there is something we need to take care.
> 
> Before commit 6b88af839d20 ("net/smc: don't send in the BH context if
> sock_owned_by_user"), we used to call smc_tx_pending() in the soft IRQ,
> without checking sock_owned_by_user(), which would caused a race condition
> because bh_lock_sock() did not honor sock_lock(). To address this issue,
> I have added the tx_pushing mechanism. However, with commit 6b88af839d20,
> we now defer the transmission if sock_lock() is held by the user.
> Therefore, there should no longer be a race condition. Nevertheless, if
> we remove the tx_pending mechanism, we must always remember not to call
> smc_tx_sndbuf_nonempty() in the soft IRQ when the user holds the sock lock.
> 
> Thanks
> Dust


ok, I understand.
So whoever is willing to give it a try and simplify smc_tx_sndbuf_nonempty(),
should remember to document that requirement/precondition.
Maybe in a Function context section of a kernel-doc function decription?
(as described in https://docs.kernel.org/doc-guide/kernel-doc.html)
Although smc_tx_sndbuf_nonempty() is not exported, this format is helpful.
Alexandra Winter Nov. 20, 2023, 9:17 a.m. UTC | #4
On 20.11.23 10:11, Alexandra Winter wrote:
> 
> 
> On 20.11.23 04:20, Dust Li wrote:
>>> It seems to me that the purpose of conn->tx_pushing is
>>> a) Serve as a mutex, so only one thread per conn will call __smc_tx_sndbuf_nonempty().
>>> b) Repeat, in case some other thread has added data to sndbuf concurrently.
>>>
>>> I agree that this patch does not change the behaviour of this function and removes an
>>> atomic_set() in the likely path.
>>>
>>> I wonder however: All callers of smc_tx_sndbuf_nonempty() must hold the socket lock.
>>> So how can we ever run in a concurrency situation?
>>> Is this handling of conn->tx_pushing necessary at all?
>> Hi Sandy,
>>
>> Overall, I think you are right. But there is something we need to take care.
>>
>> Before commit 6b88af839d20 ("net/smc: don't send in the BH context if
>> sock_owned_by_user"), we used to call smc_tx_pending() in the soft IRQ,
>> without checking sock_owned_by_user(), which would caused a race condition
>> because bh_lock_sock() did not honor sock_lock(). To address this issue,
>> I have added the tx_pushing mechanism. However, with commit 6b88af839d20,
>> we now defer the transmission if sock_lock() is held by the user.
>> Therefore, there should no longer be a race condition. Nevertheless, if
>> we remove the tx_pending mechanism, we must always remember not to call
>> smc_tx_sndbuf_nonempty() in the soft IRQ when the user holds the sock lock.
>>
>> Thanks
>> Dust
> 
> 
> ok, I understand.
> So whoever is willing to give it a try and simplify smc_tx_sndbuf_nonempty(),
> should remember to document that requirement/precondition.
> Maybe in a Function context section of a kernel-doc function decription?
> (as described in https://docs.kernel.org/doc-guide/kernel-doc.html)
> Although smc_tx_sndbuf_nonempty() is not exported, this format is helpful.
> 


Tony Lu <tonylu@linux.alibaba.com> ' mail address has been corrupted in this whole thread.
Please reply to this message (corrected address) or take care, if replying to
other messages in this thread.
Tony Lu Nov. 20, 2023, 9:49 a.m. UTC | #5
On Mon, Nov 20, 2023 at 10:17:15AM +0100, Alexandra Winter wrote:
> 
> 
> On 20.11.23 10:11, Alexandra Winter wrote:
> > 
> > 
> > On 20.11.23 04:20, Dust Li wrote:
> >>> It seems to me that the purpose of conn->tx_pushing is
> >>> a) Serve as a mutex, so only one thread per conn will call __smc_tx_sndbuf_nonempty().
> >>> b) Repeat, in case some other thread has added data to sndbuf concurrently.
> >>>
> >>> I agree that this patch does not change the behaviour of this function and removes an
> >>> atomic_set() in the likely path.
> >>>
> >>> I wonder however: All callers of smc_tx_sndbuf_nonempty() must hold the socket lock.
> >>> So how can we ever run in a concurrency situation?
> >>> Is this handling of conn->tx_pushing necessary at all?
> >> Hi Sandy,
> >>
> >> Overall, I think you are right. But there is something we need to take care.
> >>
> >> Before commit 6b88af839d20 ("net/smc: don't send in the BH context if
> >> sock_owned_by_user"), we used to call smc_tx_pending() in the soft IRQ,
> >> without checking sock_owned_by_user(), which would caused a race condition
> >> because bh_lock_sock() did not honor sock_lock(). To address this issue,
> >> I have added the tx_pushing mechanism. However, with commit 6b88af839d20,
> >> we now defer the transmission if sock_lock() is held by the user.
> >> Therefore, there should no longer be a race condition. Nevertheless, if
> >> we remove the tx_pending mechanism, we must always remember not to call
> >> smc_tx_sndbuf_nonempty() in the soft IRQ when the user holds the sock lock.
> >>
> >> Thanks
> >> Dust
> > 
> > 
> > ok, I understand.
> > So whoever is willing to give it a try and simplify smc_tx_sndbuf_nonempty(),
> > should remember to document that requirement/precondition.
> > Maybe in a Function context section of a kernel-doc function decription?
> > (as described in https://docs.kernel.org/doc-guide/kernel-doc.html)
> > Although smc_tx_sndbuf_nonempty() is not exported, this format is helpful.
> > 
> 
> 
> Tony Lu <tonylu@linux.alibaba.com> ' mail address has been corrupted in this whole thread.
> Please reply to this message (corrected address) or take care, if replying to
> other messages in this thread.

Yes, that's true. Thanks Alexandra.

Please use my correct address, RongQing: Tony Lu <tonylu@linux.alibaba.com>.
Dust Li Nov. 22, 2023, 9:53 a.m. UTC | #6
On Mon, Nov 20, 2023 at 10:11:17AM +0100, Alexandra Winter wrote:
>
>
>On 20.11.23 04:20, Dust Li wrote:
>>> It seems to me that the purpose of conn->tx_pushing is
>>> a) Serve as a mutex, so only one thread per conn will call __smc_tx_sndbuf_nonempty().
>>> b) Repeat, in case some other thread has added data to sndbuf concurrently.
>>>
>>> I agree that this patch does not change the behaviour of this function and removes an
>>> atomic_set() in the likely path.
>>>
>>> I wonder however: All callers of smc_tx_sndbuf_nonempty() must hold the socket lock.
>>> So how can we ever run in a concurrency situation?
>>> Is this handling of conn->tx_pushing necessary at all?
>> Hi Sandy,
>> 
>> Overall, I think you are right. But there is something we need to take care.
>> 
>> Before commit 6b88af839d20 ("net/smc: don't send in the BH context if
>> sock_owned_by_user"), we used to call smc_tx_pending() in the soft IRQ,
>> without checking sock_owned_by_user(), which would caused a race condition
>> because bh_lock_sock() did not honor sock_lock(). To address this issue,
>> I have added the tx_pushing mechanism. However, with commit 6b88af839d20,
>> we now defer the transmission if sock_lock() is held by the user.
>> Therefore, there should no longer be a race condition. Nevertheless, if
>> we remove the tx_pending mechanism, we must always remember not to call
>> smc_tx_sndbuf_nonempty() in the soft IRQ when the user holds the sock lock.
>> 
>> Thanks
>> Dust
>
>
>ok, I understand.
>So whoever is willing to give it a try and simplify smc_tx_sndbuf_nonempty(),
>should remember to document that requirement/precondition.
>Maybe in a Function context section of a kernel-doc function decription?
>(as described in https://docs.kernel.org/doc-guide/kernel-doc.html)
>Although smc_tx_sndbuf_nonempty() is not exported, this format is helpful.

I double checked this and realized that I may have missed something
previously. The original goal of introducing tx_push was to maximize the
amount of data that could be corked in order to achieve the best
performance.

__smc_tx_sndbuf_nonempty() is thread and context safe, meaning that
it can be called simultaneously in both user context and softirq without
causing any bugs, just some CPU waste. Although I think we should remove
all the atomics & locks in the data path and only use sock_lock in the
long-term.

I will collaborate with Li RongQing on a new version that eliminates the
tx_pushing.

Best regards,
Dust
diff mbox series

Patch

diff v3: improvements in the commit body and comments
diff v2: fix a typo in commit body and add net-next subject-prefix
 net/smc/smc_tx.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 3b0ff3b..2c2933f 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -667,8 +667,6 @@  int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
 		return 0;
 
 again:
-	atomic_set(&conn->tx_pushing, 1);
-	smp_wmb(); /* Make sure tx_pushing is 1 before real send */
 	rc = __smc_tx_sndbuf_nonempty(conn);
 
 	/* We need to check whether someone else have added some data into
@@ -677,8 +675,11 @@  int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
 	 * If so, we need to push again to prevent those data hang in the send
 	 * queue.
 	 */
-	if (unlikely(!atomic_dec_and_test(&conn->tx_pushing)))
+	if (unlikely(!atomic_dec_and_test(&conn->tx_pushing))) {
+		atomic_set(&conn->tx_pushing, 1);
+		smp_wmb(); /* Make sure tx_pushing is 1 before send again */
 		goto again;
+	}
 
 	return rc;
 }