diff mbox series

[net-next,1/2] net-zerocopy: Reduce compound page head access

Message ID 20230321081202.2370275-1-lixiaoyan@google.com (mailing list archive)
State Accepted
Commit 593ef60c74451ba05a87effb50edb5cd45312115
Headers show
Series [net-next,1/2] net-zerocopy: Reduce compound page head access | expand

Commit Message

Coco Li March 21, 2023, 8:12 a.m. UTC
From: Xiaoyan Li <lixiaoyan@google.com>

When compound pages are enabled, although the mm layer still
returns an array of page pointers, a subset (or all) of them
may have the same page head since a max 180kb skb can span 2
hugepages if it is on the boundary, be a mix of pages and 1 hugepage,
or fit completely in a hugepage. Instead of referencing page head
on all page pointers, use page length arithmetic to only call page
head when referencing a known different page head to avoid touching
a cold cacheline.

Tested:
See next patch with changes to tcp_mmap

Correntess:
On a pair of separate hosts as send with MSG_ZEROCOPY will
force a copy on tx if using loopback alone, check that the SHA
on the message sent is equivalent to checksum on the message received,
since the current program already checks for the length.

echo 1024 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
./tcp_mmap -s -z
./tcp_mmap -H $DADDR -z

SHA256 is correct
received 2 MB (100 % mmap'ed) in 0.005914 s, 2.83686 Gbit
  cpu usage user:0.001984 sys:0.000963, 1473.5 usec per MB, 10 c-switches

Performance:
Run neper between adjacent hosts with the same config
tcp_stream -Z --skip-rx-copy -6 -T 20 -F 1000 --stime-use-proc --test-length=30

Before patch: stime_end=37.670000
After patch: stime_end=30.310000

Signed-off-by: Coco Li <lixiaoyan@google.com>
---
 net/core/datagram.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

Comments

Eric Dumazet March 22, 2023, 2:53 a.m. UTC | #1
On Tue, Mar 21, 2023 at 1:12 AM Coco Li <lixiaoyan@google.com> wrote:
>
> From: Xiaoyan Li <lixiaoyan@google.com>
>
> When compound pages are enabled, although the mm layer still
> returns an array of page pointers, a subset (or all) of them
> may have the same page head since a max 180kb skb can span 2
> hugepages if it is on the boundary, be a mix of pages and 1 hugepage,
> or fit completely in a hugepage. Instead of referencing page head
> on all page pointers, use page length arithmetic to only call page
> head when referencing a known different page head to avoid touching
> a cold cacheline.


Reviewed-by: Eric Dumazet <edumazet@google.com>
patchwork-bot+netdevbpf@kernel.org March 22, 2023, 2:40 p.m. UTC | #2
Hello:

This series was applied to netdev/net-next.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Tue, 21 Mar 2023 16:12:01 +0800 you wrote:
> From: Xiaoyan Li <lixiaoyan@google.com>
> 
> When compound pages are enabled, although the mm layer still
> returns an array of page pointers, a subset (or all) of them
> may have the same page head since a max 180kb skb can span 2
> hugepages if it is on the boundary, be a mix of pages and 1 hugepage,
> or fit completely in a hugepage. Instead of referencing page head
> on all page pointers, use page length arithmetic to only call page
> head when referencing a known different page head to avoid touching
> a cold cacheline.
> 
> [...]

Here is the summary with links:
  - [net-next,1/2] net-zerocopy: Reduce compound page head access
    https://git.kernel.org/netdev/net-next/c/593ef60c7445
  - [net-next,2/2] selftests/net: Add SHA256 computation over data sent in tcp_mmap
    https://git.kernel.org/netdev/net-next/c/5c5945dc695c

You are awesome, thank you!
diff mbox series

Patch

diff --git a/net/core/datagram.c b/net/core/datagram.c
index e4ff2db40c98..5662dff3d381 100644
--- a/net/core/datagram.c
+++ b/net/core/datagram.c
@@ -622,12 +622,12 @@  int __zerocopy_sg_from_iter(struct msghdr *msg, struct sock *sk,
 	frag = skb_shinfo(skb)->nr_frags;
 
 	while (length && iov_iter_count(from)) {
+		struct page *head, *last_head = NULL;
 		struct page *pages[MAX_SKB_FRAGS];
-		struct page *last_head = NULL;
+		int refs, order, n = 0;
 		size_t start;
 		ssize_t copied;
 		unsigned long truesize;
-		int refs, n = 0;
 
 		if (frag == MAX_SKB_FRAGS)
 			return -EMSGSIZE;
@@ -650,9 +650,17 @@  int __zerocopy_sg_from_iter(struct msghdr *msg, struct sock *sk,
 		} else {
 			refcount_add(truesize, &skb->sk->sk_wmem_alloc);
 		}
+
+		head = compound_head(pages[n]);
+		order = compound_order(head);
+
 		for (refs = 0; copied != 0; start = 0) {
 			int size = min_t(int, copied, PAGE_SIZE - start);
-			struct page *head = compound_head(pages[n]);
+
+			if (pages[n] - head > (1UL << order) - 1) {
+				head = compound_head(pages[n]);
+				order = compound_order(head);
+			}
 
 			start += (pages[n] - head) << PAGE_SHIFT;
 			copied -= size;