diff mbox series

[v3,2/5] SUNRPC: Send RPC message on TCP with a single sock_sendmsg() call

Message ID 168979146971.1905271.4709699930756258041.stgit@morisot.1015granger.net (mailing list archive)
State Not Applicable
Delegated to: Netdev Maintainers
Headers show
Series Send RPC-on-TCP with one sock_sendmsg() call | expand

Checks

Context Check Description
netdev/series_format warning Target tree name not specified in the subject
netdev/tree_selection success Guessed tree name to be net-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 1347 this patch: 1347
netdev/cc_maintainers warning 11 maintainers not CCed: kuba@kernel.org anna@kernel.org neilb@suse.de tom@talpey.com kolga@netapp.com Dai.Ngo@oracle.com trond.myklebust@hammerspace.com davem@davemloft.net pabeni@redhat.com edumazet@google.com jlayton@kernel.org
netdev/build_clang success Errors and warnings before: 1441 this patch: 1441
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/deprecated_api success None detected
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 1370 this patch: 1370
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 68 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0

Commit Message

Chuck Lever July 19, 2023, 6:31 p.m. UTC
From: Chuck Lever <chuck.lever@oracle.com>

There is now enough infrastructure in place to combine the stream
record marker into the biovec array used to send each outgoing RPC
message on TCP. The whole message can be more efficiently sent with
a single call to sock_sendmsg() using a bio_vec iterator.

Note that this also helps with RPC-with-TLS: the TLS implementation
can now clearly see where the upper layer message boundaries are.
Before, it would send each component of the xdr_buf (record marker,
head, page payload, tail) in separate TLS records.

Suggested-by: David Howells <dhowells@redhat.com>
Signed-off-by: Chuck Lever <chuck.lever@oracle.com>
---
 include/linux/sunrpc/svcsock.h |    2 ++
 net/sunrpc/svcsock.c           |   33 ++++++++++++++++++---------------
 2 files changed, 20 insertions(+), 15 deletions(-)

Comments

David Howells July 24, 2023, 9:58 a.m. UTC | #1
Chuck Lever <cel@kernel.org> wrote:

> +	buf = page_frag_alloc(&svsk->sk_frag_cache, sizeof(marker),
> +			      GFP_KERNEL);

Is this SMP-safe?  page_frag_alloc() does no locking.

David
Chuck Lever July 24, 2023, 1:27 p.m. UTC | #2
On Mon, Jul 24, 2023 at 10:58:50AM +0100, David Howells wrote:
> Chuck Lever <cel@kernel.org> wrote:
> 
> > +	buf = page_frag_alloc(&svsk->sk_frag_cache, sizeof(marker),
> > +			      GFP_KERNEL);
> 
> Is this SMP-safe?  page_frag_alloc() does no locking.

Note that svc_tcp_sendto() takes xprt->xpt_mutex. There can be only
one thread (per svsk) running in svc_tcp_sendmsg() at a time.
David Howells July 26, 2023, 12:13 p.m. UTC | #3
Chuck Lever <cel@kernel.org> wrote:

> From: Chuck Lever <chuck.lever@oracle.com>
> 
> There is now enough infrastructure in place to combine the stream
> record marker into the biovec array used to send each outgoing RPC
> message on TCP. The whole message can be more efficiently sent with
> a single call to sock_sendmsg() using a bio_vec iterator.
> 
> Note that this also helps with RPC-with-TLS: the TLS implementation
> can now clearly see where the upper layer message boundaries are.
> Before, it would send each component of the xdr_buf (record marker,
> head, page payload, tail) in separate TLS records.
> 
> Suggested-by: David Howells <dhowells@redhat.com>
> Signed-off-by: Chuck Lever <chuck.lever@oracle.com>

Reviewed-by: David Howells <dhowells@redhat.com>
diff mbox series

Patch

diff --git a/include/linux/sunrpc/svcsock.h b/include/linux/sunrpc/svcsock.h
index a7116048a4d4..caf3308f1f07 100644
--- a/include/linux/sunrpc/svcsock.h
+++ b/include/linux/sunrpc/svcsock.h
@@ -38,6 +38,8 @@  struct svc_sock {
 	/* Number of queued send requests */
 	atomic_t		sk_sendqlen;
 
+	struct page_frag_cache  sk_frag_cache;
+
 	struct completion	sk_handshake_done;
 
 	struct page *		sk_pages[RPCSVC_MAXPAGES];	/* received data */
diff --git a/net/sunrpc/svcsock.c b/net/sunrpc/svcsock.c
index 90b1ab95c223..d4d816036c04 100644
--- a/net/sunrpc/svcsock.c
+++ b/net/sunrpc/svcsock.c
@@ -1213,31 +1213,30 @@  static int svc_tcp_recvfrom(struct svc_rqst *rqstp)
 static int svc_tcp_sendmsg(struct svc_sock *svsk, struct svc_rqst *rqstp,
 			   rpc_fraghdr marker, unsigned int *sentp)
 {
-	struct kvec rm = {
-		.iov_base	= &marker,
-		.iov_len	= sizeof(marker),
-	};
 	struct msghdr msg = {
-		.msg_flags	= MSG_MORE,
+		.msg_flags	= MSG_SPLICE_PAGES,
 	};
 	unsigned int count;
+	void *buf;
 	int ret;
 
 	*sentp = 0;
 
-	ret = kernel_sendmsg(svsk->sk_sock, &msg, &rm, 1, rm.iov_len);
-	if (ret < 0)
-		return ret;
-	*sentp += ret;
-	if (ret != rm.iov_len)
-		return -EAGAIN;
+	/* The stream record marker is copied into a temporary page
+	 * fragment buffer so that it can be included in rq_bvec.
+	 */
+	buf = page_frag_alloc(&svsk->sk_frag_cache, sizeof(marker),
+			      GFP_KERNEL);
+	if (!buf)
+		return -ENOMEM;
+	memcpy(buf, &marker, sizeof(marker));
+	bvec_set_virt(rqstp->rq_bvec, buf, sizeof(marker));
 
-	count = xdr_buf_to_bvec(rqstp->rq_bvec, ARRAY_SIZE(rqstp->rq_bvec),
-				&rqstp->rq_res);
+	count = xdr_buf_to_bvec(rqstp->rq_bvec + 1,
+				ARRAY_SIZE(rqstp->rq_bvec) - 1, &rqstp->rq_res);
 
-	msg.msg_flags = MSG_SPLICE_PAGES;
 	iov_iter_bvec(&msg.msg_iter, ITER_SOURCE, rqstp->rq_bvec,
-		      count, rqstp->rq_res.len);
+		      1 + count, sizeof(marker) + rqstp->rq_res.len);
 	ret = sock_sendmsg(svsk->sk_sock, &msg);
 	if (ret < 0)
 		return ret;
@@ -1616,6 +1615,7 @@  static void svc_tcp_sock_detach(struct svc_xprt *xprt)
 static void svc_sock_free(struct svc_xprt *xprt)
 {
 	struct svc_sock *svsk = container_of(xprt, struct svc_sock, sk_xprt);
+	struct page_frag_cache *pfc = &svsk->sk_frag_cache;
 	struct socket *sock = svsk->sk_sock;
 
 	trace_svcsock_free(svsk, sock);
@@ -1625,5 +1625,8 @@  static void svc_sock_free(struct svc_xprt *xprt)
 		sockfd_put(sock);
 	else
 		sock_release(sock);
+	if (pfc->va)
+		__page_frag_cache_drain(virt_to_head_page(pfc->va),
+					pfc->pagecnt_bias);
 	kfree(svsk);
 }