diff mbox series

[net-next,v5,1/5] net/smc: Make smc_tcp_listen_work() independent

Message ID 58c544cb206d94b759ff0546bcffe693c3cbfb98.1644323503.git.alibuda@linux.alibaba.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series net/smc: Optimizing performance in short-lived scenarios | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for net-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers success CCed 5 of 5 maintainers
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 45 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

D. Wythe Feb. 8, 2022, 12:53 p.m. UTC
From: "D. Wythe" <alibuda@linux.alibaba.com>

In multithread and 10K connections benchmark, the backend TCP connection
established very slowly, and lots of TCP connections stay in SYN_SENT
state.

Client: smc_run wrk -c 10000 -t 4 http://server

the netstate of server host shows like:
    145042 times the listen queue of a socket overflowed
    145042 SYNs to LISTEN sockets dropped

One reason of this issue is that, since the smc_tcp_listen_work() shared
the same workqueue (smc_hs_wq) with smc_listen_work(), while the
smc_listen_work() do blocking wait for smc connection established. Once
the workqueue became congested, it's will block the accept() from TCP
listen.

This patch creates a independent workqueue(smc_tcp_ls_wq) for
smc_tcp_listen_work(), separate it from smc_listen_work(), which is
quite acceptable considering that smc_tcp_listen_work() runs very fast.

Signed-off-by: D. Wythe <alibuda@linux.alibaba.com>
---
 net/smc/af_smc.c | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)

Comments

Karsten Graul Feb. 8, 2022, 5:06 p.m. UTC | #1
On 08/02/2022 13:53, D. Wythe wrote:
> +static struct workqueue_struct	*smc_tcp_ls_wq;	/* wq for tcp listen work */
>  struct workqueue_struct	*smc_hs_wq;	/* wq for handshake work */
>  struct workqueue_struct	*smc_close_wq;	/* wq for close work */
>  
> @@ -2227,7 +2228,7 @@ static void smc_clcsock_data_ready(struct sock *listen_clcsock)
>  	lsmc->clcsk_data_ready(listen_clcsock);
>  	if (lsmc->sk.sk_state == SMC_LISTEN) {
>  		sock_hold(&lsmc->sk); /* sock_put in smc_tcp_listen_work() */
> -		if (!queue_work(smc_hs_wq, &lsmc->tcp_listen_work))
> +		if (!queue_work(smc_tcp_ls_wq, &lsmc->tcp_listen_work))
>  			sock_put(&lsmc->sk);

It works well this way, but given the fact that there is one tcp_listen worker per 
listen socket and these workers finish relatively quickly, wouldn't it be okay to
use the system_wq instead of using an own queue? But I have no strong opinion about that...
D. Wythe Feb. 9, 2022, 6:24 a.m. UTC | #2
It is indeed okay to use system_wq at present. Dues to the load 
balancing issues we found, queue_work() always submits tasks to the 
worker on the current CPU. tcp_listen_work() execution once may submit a 
large number of tasks to the worker of the current CPU, causing 
unnecessary pending, even though worker on other CPU are totaly free. I 
was plan to make tcp_listen_work() blocked wait on worker of every CPU, 
so I create a new workqueue, and that's the only reason for it. But this 
problem is not very urgent, and I don't have strong opinion too...


在 2022/2/9 上午1:06, Karsten Graul 写道:
> On 08/02/2022 13:53, D. Wythe wrote:
>> +static struct workqueue_struct	*smc_tcp_ls_wq;	/* wq for tcp listen work */
>>   struct workqueue_struct	*smc_hs_wq;	/* wq for handshake work */
>>   struct workqueue_struct	*smc_close_wq;	/* wq for close work */
>>   
>> @@ -2227,7 +2228,7 @@ static void smc_clcsock_data_ready(struct sock *listen_clcsock)
>>   	lsmc->clcsk_data_ready(listen_clcsock);
>>   	if (lsmc->sk.sk_state == SMC_LISTEN) {
>>   		sock_hold(&lsmc->sk); /* sock_put in smc_tcp_listen_work() */
>> -		if (!queue_work(smc_hs_wq, &lsmc->tcp_listen_work))
>> +		if (!queue_work(smc_tcp_ls_wq, &lsmc->tcp_listen_work))
>>   			sock_put(&lsmc->sk);
> 
> It works well this way, but given the fact that there is one tcp_listen worker per
> listen socket and these workers finish relatively quickly, wouldn't it be okay to
> use the system_wq instead of using an own queue? But I have no strong opinion about that...
diff mbox series

Patch

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 00b2e9d..4969ac8 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -59,6 +59,7 @@ 
 						 * creation on client
 						 */
 
+static struct workqueue_struct	*smc_tcp_ls_wq;	/* wq for tcp listen work */
 struct workqueue_struct	*smc_hs_wq;	/* wq for handshake work */
 struct workqueue_struct	*smc_close_wq;	/* wq for close work */
 
@@ -2227,7 +2228,7 @@  static void smc_clcsock_data_ready(struct sock *listen_clcsock)
 	lsmc->clcsk_data_ready(listen_clcsock);
 	if (lsmc->sk.sk_state == SMC_LISTEN) {
 		sock_hold(&lsmc->sk); /* sock_put in smc_tcp_listen_work() */
-		if (!queue_work(smc_hs_wq, &lsmc->tcp_listen_work))
+		if (!queue_work(smc_tcp_ls_wq, &lsmc->tcp_listen_work))
 			sock_put(&lsmc->sk);
 	}
 }
@@ -3024,9 +3025,14 @@  static int __init smc_init(void)
 		goto out_nl;
 
 	rc = -ENOMEM;
+
+	smc_tcp_ls_wq = alloc_workqueue("smc_tcp_ls_wq", 0, 0);
+	if (!smc_tcp_ls_wq)
+		goto out_pnet;
+
 	smc_hs_wq = alloc_workqueue("smc_hs_wq", 0, 0);
 	if (!smc_hs_wq)
-		goto out_pnet;
+		goto out_alloc_tcp_ls_wq;
 
 	smc_close_wq = alloc_workqueue("smc_close_wq", 0, 0);
 	if (!smc_close_wq)
@@ -3097,6 +3103,8 @@  static int __init smc_init(void)
 	destroy_workqueue(smc_close_wq);
 out_alloc_hs_wq:
 	destroy_workqueue(smc_hs_wq);
+out_alloc_tcp_ls_wq:
+	destroy_workqueue(smc_tcp_ls_wq);
 out_pnet:
 	smc_pnet_exit();
 out_nl:
@@ -3115,6 +3123,7 @@  static void __exit smc_exit(void)
 	smc_core_exit();
 	smc_ib_unregister_client();
 	destroy_workqueue(smc_close_wq);
+	destroy_workqueue(smc_tcp_ls_wq);
 	destroy_workqueue(smc_hs_wq);
 	proto_unregister(&smc_proto6);
 	proto_unregister(&smc_proto);