diff mbox series

net/smc: Optimize the search method of reused buf_desc

Message ID 20241029065415.1070-1-liqiang64@huawei.com (mailing list archive)
State Superseded
Delegated to: Netdev Maintainers
Headers show
Series net/smc: Optimize the search method of reused buf_desc | expand

Checks

Context Check Description
netdev/tree_selection success Guessing tree name failed - patch did not apply, async

Commit Message

Li Qiang Oct. 29, 2024, 6:54 a.m. UTC
We create a lock-less linked list for the currently
idle, reusable smc_buf_desc structures.

When the 'used' field is set to 0, the buffer
descriptor is added to the lock-less linked list.

When a new connection is established, a suitable
element is taken directly from that list, which
eliminates the need for traversal and search and
does not require taking a lock.

A lock-less linked list is a linked list that uses
atomic operations to optimize the producer-consumer model.

Signed-off-by: liqiang <liqiang64@huawei.com>
---
 net/smc/smc_core.c | 46 ++++++++++++++++++++++++++++------------------
 net/smc/smc_core.h |  4 ++++
 2 files changed, 32 insertions(+), 18 deletions(-)
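
For readers unfamiliar with the kernel llist API this patch builds on, here is a minimal,
illustrative sketch of the producer/consumer pattern (the demo_* names are stand-ins
invented for this sketch, not code from the patch; the patch applies the same idea to
smc_buf_desc with one free list per compressed buffer size):

#include <linux/compiler.h>
#include <linux/llist.h>
#include <linux/types.h>

/* hypothetical stand-in for smc_buf_desc */
struct demo_buf {
	struct llist_node llist;
	u32 used;
};

/* hypothetical free list; the patch keeps one per compressed buffer size */
static LLIST_HEAD(demo_free_list);

/* producer side (connection teardown): mark the buffer unused and push it;
 * llist_add() is a lock-free cmpxchg on the list head
 */
static void demo_buf_unuse(struct demo_buf *buf)
{
	WRITE_ONCE(buf->used, 0);
	llist_add(&buf->llist, &demo_free_list);
}

/* consumer side (connection setup): pop a reusable buffer, or NULL if none.
 * Per linux/llist.h, concurrent llist_del_first() callers need external
 * serialization, while llist_add() never does.
 */
static struct demo_buf *demo_buf_reuse(void)
{
	struct llist_node *node = llist_del_first(&demo_free_list);
	struct demo_buf *buf;

	if (!node)
		return NULL;
	buf = llist_entry(node, struct demo_buf, llist);
	WRITE_ONCE(buf->used, 1);
	return buf;
}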

Comments

Jakub Kicinski Nov. 1, 2024, 2:06 a.m. UTC | #1
On Tue, 29 Oct 2024 14:54:15 +0800 liqiang wrote:
> We create a lock-less linked list for the currently
> idle, reusable smc_buf_desc structures.
> 
> When the 'used' field is set to 0, the buffer
> descriptor is added to the lock-less linked list.
> 
> When a new connection is established, a suitable
> element is taken directly from that list, which
> eliminates the need for traversal and search and
> does not require taking a lock.
> 
> A lock-less linked list is a linked list that uses
> atomic operations to optimize the producer-consumer model.

Not sure what the story here is but the patch does not apply to net-next
Li Qiang Nov. 1, 2024, 3:11 a.m. UTC | #2
On 2024/11/1 10:06, Jakub Kicinski wrote:
> On Tue, 29 Oct 2024 14:54:15 +0800 liqiang wrote:
>> We create a lock-less linked list for the currently
>> idle, reusable smc_buf_desc structures.
>>
>> When the 'used' field is set to 0, the buffer
>> descriptor is added to the lock-less linked list.
>>
>> When a new connection is established, a suitable
>> element is taken directly from that list, which
>> eliminates the need for traversal and search and
>> does not require taking a lock.
>>
>> A lock-less linked list is a linked list that uses
>> atomic operations to optimize the producer-consumer model.
> 
> Not sure what the story here is but the patch does not apply to net-next
> 

Thanks for the reminder.
I will resend the patch rebased on net-next later.
Dust Li Nov. 1, 2024, 6:50 a.m. UTC | #3
On 2024-10-29 14:54:15, liqiang wrote:
>We create a lock-less linked list for the currently
>idle, reusable smc_buf_desc structures.
>
>When the 'used' field is set to 0, the buffer
>descriptor is added to the lock-less linked list.
>
>When a new connection is established, a suitable
>element is taken directly from that list, which
>eliminates the need for traversal and search and
>does not require taking a lock.
>
>A lock-less linked list is a linked list that uses
>atomic operations to optimize the producer-consumer model.

Do you see any performance issues without this lock-less linked list?
Under what test case? Any performance numbers would be welcome.

Best regards,
Dust
Li Qiang Nov. 1, 2024, 8:27 a.m. UTC | #4
On 2024/11/1 14:50, Dust Li wrote:
> On 2024-10-29 14:54:15, liqiang wrote:
>> We create a lock-less linked list for the currently
>> idle, reusable smc_buf_desc structures.
>>
>> When the 'used' field is set to 0, the buffer
>> descriptor is added to the lock-less linked list.
>>
>> When a new connection is established, a suitable
>> element is taken directly from that list, which
>> eliminates the need for traversal and search and
>> does not require taking a lock.
>>
>> A lock-less linked list is a linked list that uses
>> atomic operations to optimize the producer-consumer model.
> 
> Do you see any performance issues without this lock-less linked list?
> Under what test case? Any performance numbers would be welcome.
> 

I found this optimization through code review. I have re-sent this patch based on the
net-next repo; it contains some of my own test data. Please check it. :-)

> Best regards,
> Dust
> 
D. Wythe Nov. 4, 2024, 1:41 a.m. UTC | #5
On 10/29/24 2:54 PM, liqiang wrote:
> We create a lock-less linked list for the currently
> idle, reusable smc_buf_desc structures.
> 
> When the 'used' field is set to 0, the buffer
> descriptor is added to the lock-less linked list.
> 
> When a new connection is established, a suitable
> element is taken directly from that list, which
> eliminates the need for traversal and search and
> does not require taking a lock.
> 
> A lock-less linked list is a linked list that uses
> atomic operations to optimize the producer-consumer model.



No objection, but could you provide us with some data before and after the optimization?
Li Qiang Nov. 4, 2024, 7:34 a.m. UTC | #6
On 2024/11/4 9:41, D. Wythe wrote:
> 
> 
> On 10/29/24 2:54 PM, liqiang wrote:
>> We create a lock-less linked list for the currently
>> idle, reusable smc_buf_desc structures.
>>
>> When the 'used' field is set to 0, the buffer
>> descriptor is added to the lock-less linked list.
>>
>> When a new connection is established, a suitable
>> element is taken directly from that list, which
>> eliminates the need for traversal and search and
>> does not require taking a lock.
>>
>> A lock-less linked list is a linked list that uses
>> atomic operations to optimize the producer-consumer model.
> 
> 
> 
> No objection, but could you provide us with some data before and after the optimization?

I resent this patch a few days ago with the '[PATCH net-next]' prefix.
It contains more detailed per-function timing and nginx test data.
You can find the test data in that email. :)

Let me summarize it here:
1. The function 'smc_buf_get_slot' takes less time when a new SMC link is established:
5us -> 100ns (with 200 active links), 30us -> 100ns (with 1000 active links).

2. Multi-threaded short-connection performance, tested with wrk and nginx,
has improved significantly.

Environment: QEMU emulator version 1.5.3 @ Intel(R) Xeon(R) CPU E5-2680 v4 @ 2.40GHz
Tested with SMC loopback-ism.
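
The thread does not say how the per-call numbers above were collected; as one
illustrative possibility (not part of the patch), a local debug wrapper around the new
helper in smc_core.c could sample them with ktime, along these lines:

/* hypothetical debug wrapper, assuming <linux/ktime.h> is available here */
static struct smc_buf_desc *smc_buf_get_slot_timed(struct llist_head *buf_llist)
{
	u64 start = ktime_get_ns();
	struct smc_buf_desc *buf_desc = smc_buf_get_slot_free(buf_llist);

	pr_debug("smc: reusable buf_desc lookup took %llu ns\n",
		 ktime_get_ns() - start);
	return buf_desc;
}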

Patch

diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index 3b95828d9976..ac120f62313b 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -16,6 +16,7 @@ 
 #include <linux/wait.h>
 #include <linux/reboot.h>
 #include <linux/mutex.h>
+#include <linux/llist.h>
 #include <linux/list.h>
 #include <linux/smc.h>
 #include <net/tcp.h>
@@ -872,6 +873,8 @@  static int smc_lgr_create(struct smc_sock *smc, struct smc_init_info *ini)
 	for (i = 0; i < SMC_RMBE_SIZES; i++) {
 		INIT_LIST_HEAD(&lgr->sndbufs[i]);
 		INIT_LIST_HEAD(&lgr->rmbs[i]);
+		init_llist_head(&lgr->rmbs_free[i]);
+		init_llist_head(&lgr->sndbufs_free[i]);
 	}
 	lgr->next_link_id = 0;
 	smc_lgr_list.num += SMC_LGR_NUM_INCR;
@@ -1146,6 +1149,10 @@  static void smcr_buf_unuse(struct smc_buf_desc *buf_desc, bool is_rmb,
 		/* memzero_explicit provides potential memory barrier semantics */
 		memzero_explicit(buf_desc->cpu_addr, buf_desc->len);
 		WRITE_ONCE(buf_desc->used, 0);
+		if (is_rmb)
+			llist_add(&buf_desc->llist, &lgr->rmbs_free[buf_desc->bufsiz_comp]);
+		else
+			llist_add(&buf_desc->llist, &lgr->sndbufs_free[buf_desc->bufsiz_comp]);
 	}
 }
 
@@ -1172,6 +1179,8 @@  static void smc_buf_unuse(struct smc_connection *conn,
 		} else {
 			memzero_explicit(conn->sndbuf_desc->cpu_addr, conn->sndbuf_desc->len);
 			WRITE_ONCE(conn->sndbuf_desc->used, 0);
+			llist_add(&conn->sndbuf_desc->llist,
+				  &lgr->sndbufs_free[conn->sndbuf_desc->bufsiz_comp]);
 		}
 	}
 	if (conn->rmb_desc) {
@@ -1181,6 +1190,8 @@  static void smc_buf_unuse(struct smc_connection *conn,
 			memzero_explicit(conn->rmb_desc->cpu_addr,
 					 conn->rmb_desc->len + sizeof(struct smcd_cdc_msg));
 			WRITE_ONCE(conn->rmb_desc->used, 0);
+			llist_add(&conn->rmb_desc->llist,
+				  &lgr->rmbs_free[conn->rmb_desc->bufsiz_comp]);
 		}
 	}
 }
@@ -2042,24 +2053,19 @@  int smc_uncompress_bufsize(u8 compressed)
 	return (int)size;
 }
 
-/* try to reuse a sndbuf or rmb description slot for a certain
- * buffer size; if not available, return NULL
- */
-static struct smc_buf_desc *smc_buf_get_slot(int compressed_bufsize,
-					     struct rw_semaphore *lock,
-					     struct list_head *buf_list)
+/* use a lock-less list to save and find reusable buf_desc */
+static struct smc_buf_desc *smc_buf_get_slot_free(struct llist_head *buf_llist)
 {
-	struct smc_buf_desc *buf_slot;
+	struct smc_buf_desc *buf_free;
+	struct llist_node *llnode;
 
-	down_read(lock);
-	list_for_each_entry(buf_slot, buf_list, list) {
-		if (cmpxchg(&buf_slot->used, 0, 1) == 0) {
-			up_read(lock);
-			return buf_slot;
-		}
-	}
-	up_read(lock);
-	return NULL;
+	/* a lock-less linked list does not need a lock */
+	llnode = llist_del_first(buf_llist);
+	if (!llnode)
+		return NULL;
+	buf_free = llist_entry(llnode, struct smc_buf_desc, llist);
+	WRITE_ONCE(buf_free->used, 1);
+	return buf_free;
 }
 
 /* one of the conditions for announcing a receiver's current window size is
@@ -2364,6 +2370,7 @@  static int __smc_buf_create(struct smc_sock *smc, bool is_smcd, bool is_rmb)
 	struct smc_connection *conn = &smc->conn;
 	struct smc_link_group *lgr = conn->lgr;
 	struct list_head *buf_list;
+	struct llist_head *buf_llist;
 	int bufsize, bufsize_comp;
 	struct rw_semaphore *lock;	/* lock buffer list */
 	bool is_dgraded = false;
@@ -2379,15 +2386,17 @@  static int __smc_buf_create(struct smc_sock *smc, bool is_smcd, bool is_rmb)
 	     bufsize_comp >= 0; bufsize_comp--) {
 		if (is_rmb) {
 			lock = &lgr->rmbs_lock;
+			buf_llist = &lgr->rmbs_free[bufsize_comp];
 			buf_list = &lgr->rmbs[bufsize_comp];
 		} else {
 			lock = &lgr->sndbufs_lock;
+			buf_llist = &lgr->sndbufs_free[bufsize_comp];
 			buf_list = &lgr->sndbufs[bufsize_comp];
 		}
 		bufsize = smc_uncompress_bufsize(bufsize_comp);
 
 		/* check for reusable slot in the link group */
-		buf_desc = smc_buf_get_slot(bufsize_comp, lock, buf_list);
+		buf_desc = smc_buf_get_slot_free(buf_llist);
 		if (buf_desc) {
 			buf_desc->is_dma_need_sync = 0;
 			SMC_STAT_RMB_SIZE(smc, is_smcd, is_rmb, bufsize);
@@ -2412,7 +2421,8 @@  static int __smc_buf_create(struct smc_sock *smc, bool is_smcd, bool is_rmb)
 
 		SMC_STAT_RMB_ALLOC(smc, is_smcd, is_rmb);
 		SMC_STAT_RMB_SIZE(smc, is_smcd, is_rmb, bufsize);
-		buf_desc->used = 1;
+		WRITE_ONCE(buf_desc->used, 1);
+		WRITE_ONCE(buf_desc->bufsiz_comp, bufsize_comp);
 		down_write(lock);
 		list_add(&buf_desc->list, buf_list);
 		up_write(lock);
diff --git a/net/smc/smc_core.h b/net/smc/smc_core.h
index d93cf51dbd7c..32ff6a5f076c 100644
--- a/net/smc/smc_core.h
+++ b/net/smc/smc_core.h
@@ -188,10 +188,12 @@  struct smc_link {
 /* tx/rx buffer list element for sndbufs list and rmbs list of a lgr */
 struct smc_buf_desc {
 	struct list_head	list;
+	struct llist_node	llist;
 	void			*cpu_addr;	/* virtual address of buffer */
 	struct page		*pages;
 	int			len;		/* length of buffer */
 	u32			used;		/* currently used / unused */
+	int			bufsiz_comp;	/* compressed buffer size */
 	union {
 		struct { /* SMC-R */
 			struct sg_table	sgt[SMC_LINKS_PER_LGR_MAX];
@@ -278,8 +280,10 @@  struct smc_link_group {
 	unsigned short		vlan_id;	/* vlan id of link group */
 
 	struct list_head	sndbufs[SMC_RMBE_SIZES];/* tx buffers */
+	struct llist_head	sndbufs_free[SMC_RMBE_SIZES]; /* tx buffer free list */
 	struct rw_semaphore	sndbufs_lock;	/* protects tx buffers */
 	struct list_head	rmbs[SMC_RMBE_SIZES];	/* rx buffers */
+	struct llist_head	rmbs_free[SMC_RMBE_SIZES]; /* rx buffer free list */
 	struct rw_semaphore	rmbs_lock;	/* protects rx buffers */
 
 	u8			id[SMC_LGR_ID_SIZE];	/* unique lgr id */