
sunrpc: account for xdr->page_base in xdr_alloc_bvec

Message ID 20230814-sendpage-v1-1-d551b0d7f870@kernel.org (mailing list archive)
State Not Applicable
Delegated to: Netdev Maintainers

Checks

Context                         Check    Description
netdev/series_format            warning  Single patches do not need cover letters; Target tree name not specified in the subject
netdev/tree_selection           success  Guessed tree name to be net-next
netdev/fixes_present            success  Fixes tag not required for -next series
netdev/header_inline            success  No static functions without inline keyword in header files
netdev/build_32bit              success  Errors and warnings before: 1330 this patch: 1330
netdev/cc_maintainers           success  CCed 14 of 14 maintainers
netdev/build_clang              success  Errors and warnings before: 1355 this patch: 1355
netdev/verify_signedoff         success  Signed-off-by tag matches author and committer
netdev/deprecated_api           success  None detected
netdev/check_selftest           success  No net selftest shell script
netdev/verify_fixes             success  Fixes tag looks correct
netdev/build_allmodconfig_warn  success  Errors and warnings before: 1353 this patch: 1353
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 16 lines checked
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/source_inline            success  Was 0 now: 0

Commit Message

Jeff Layton Aug. 14, 2023, 2:32 p.m. UTC
I've been seeing a regression in mainline (v6.5-rc) kernels where
unaligned reads were returning corrupt data.

9d96acbc7f37 added a routine to allocate and populate a bvec array that
can be used to back an iov_iter. When it does this, it always sets the
offset in the first bvec to zero, even when the xdr->page_base is
non-zero.

The old code in svc_tcp_sendmsg used to account for this, as it was
sending the pages one at a time anyway, but now that we just hand the
iov to the network layer, we need to ensure that the bvecs are properly
initialized.

Fix xdr_alloc_bvec to set the offset in the first bvec to the offset
indicated by xdr->page_base, and then 0 in all subsequent bvecs.

Fixes: 9d96acbc7f37 ("SUNRPC: Add a bvec array to struct xdr_buf for use with iovec_iter()")
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
NB: This is only lightly tested so far, but it seems to fix the pynfs
regressions I've been seeing.
---
 net/sunrpc/xdr.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)


---
base-commit: 2ccdd1b13c591d306f0401d98dedc4bdcd02b421
change-id: 20230814-sendpage-b04874eed249

Best regards,

Comments

Jeff Layton Aug. 14, 2023, 2:40 p.m. UTC | #1
On Mon, 2023-08-14 at 10:32 -0400, Jeff Layton wrote:
> I've been seeing a regression in mainline (v6.5-rc) kernels where
> unaligned reads were returning corrupt data.
> 
> 9d96acbc7f37 added a routine to allocate and populate a bvec array that
> can be used to back an iov_iter. When it does this, it always sets the
> offset in the first bvec to zero, even when the xdr->page_base is
> non-zero.
> 
> The old code in svc_tcp_sendmsg used to account for this, as it was
> sending the pages one at a time anyway, but now that we just hand the
> iov to the network layer, we need to ensure that the bvecs are properly
> initialized.
> 
> Fix xdr_alloc_bvec to set the offset in the first bvec to the offset
> indicated by xdr->page_base, and then 0 in all subsequent bvecs.
> 
> Fixes: 9d96acbc7f37 ("SUNRPC: Add a bvec array to struct xdr_buf for use with iovec_iter()")

We might need a different fixes tag here. While I think xdr_alloc_bvec
ought to be where we account for this, the actual patch that broke
things is this one:

    5df5dd03a8f7 sunrpc: Use sendmsg(MSG_SPLICE_PAGES) rather then sendpage

The old code accounted for the fact that the first bvec always had a zero offset.
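
To illustrate what "accounting for" the zero offset can look like when pages are sent one at a time, here is a small user-space sketch. It is not the old svc_tcp_sendmsg code: the names model_chunk, model_split_pages and MODEL_PAGE_SIZE are hypothetical, and a 4096-byte page is assumed. It only shows the arithmetic of deriving a per-page (offset, length) pair from a page_base and a payload length, with the offset applying to the first page only.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for this model */

/* Hypothetical description of one per-page send. */
struct model_chunk {
	size_t page_index;	/* index into the page array */
	unsigned int offset;	/* starting offset within that page */
	unsigned int len;	/* number of bytes to send from that page */
};

/*
 * Split page_len bytes of payload that start page_base bytes into the
 * page array into per-page chunks. Only the first chunk can have a
 * non-zero offset. Returns the number of chunks written (at most cap).
 */
static size_t model_split_pages(unsigned int page_base, unsigned int page_len,
				struct model_chunk *out, size_t cap)
{
	size_t page = page_base / MODEL_PAGE_SIZE;
	unsigned int offset = page_base % MODEL_PAGE_SIZE;
	unsigned int remaining = page_len;
	size_t n = 0;

	assert(out != NULL || cap == 0);

	while (remaining > 0 && n < cap) {
		unsigned int room = MODEL_PAGE_SIZE - offset;
		unsigned int len = remaining < room ? remaining : room;

		out[n].page_index = page++;
		out[n].offset = offset;
		out[n].len = len;
		n++;

		remaining -= len;
		offset = 0;	/* later pages start at their beginning */
	}
	return n;
}
```

With page_base = 100 and page_len = 5000, the model yields two chunks: 3996 bytes from offset 100 of page 0, then 1004 bytes from offset 0 of page 1.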


> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
> NB: This is only lightly tested so far, but it seems to fix the pynfs
> regressions I've been seeing.
> ---
>  net/sunrpc/xdr.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> index 2a22e78af116..d0f5fc8605b8 100644
> --- a/net/sunrpc/xdr.c
> +++ b/net/sunrpc/xdr.c
> @@ -144,6 +144,7 @@ int
>  xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
>  {
>  	size_t i, n = xdr_buf_pagecount(buf);
> +	unsigned int offset = offset_in_page(buf->page_base);
>  
>  	if (n != 0 && buf->bvec == NULL) {
>  		buf->bvec = kmalloc_array(n, sizeof(buf->bvec[0]), gfp);
> @@ -151,7 +152,8 @@ xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
>  			return -ENOMEM;
>  		for (i = 0; i < n; i++) {
>  			bvec_set_page(&buf->bvec[i], buf->pages[i], PAGE_SIZE,
> -				      0);
> +				      offset);
> +			offset = 0;
>  		}
>  	}
>  	return 0;
> 
> ---
> base-commit: 2ccdd1b13c591d306f0401d98dedc4bdcd02b421
> change-id: 20230814-sendpage-b04874eed249
> 
> Best regards,
Trond Myklebust Aug. 14, 2023, 2:51 p.m. UTC | #2
On Mon, 2023-08-14 at 10:32 -0400, Jeff Layton wrote:
> I've been seeing a regression in mainline (v6.5-rc) kernels where
> unaligned reads were returning corrupt data.
> 
> 9d96acbc7f37 added a routine to allocate and populate a bvec array
> that
> can be used to back an iov_iter. When it does this, it always sets
> the
> offset in the first bvec to zero, even when the xdr->page_base is
> non-zero.
> 
> The old code in svc_tcp_sendmsg used to account for this, as it was
> sending the pages one at a time anyway, but now that we just hand the
> iov to the network layer, we need to ensure that the bvecs are
> properly
> initialized.
> 
> Fix xdr_alloc_bvec to set the offset in the first bvec to the offset
> indicated by xdr->page_base, and then 0 in all subsequent bvecs.
> 
> Fixes: 9d96acbc7f37 ("SUNRPC: Add a bvec array to struct xdr_buf for
> use with iovec_iter()")
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
> NB: This is only lightly tested so far, but it seems to fix the pynfs
> regressions I've been seeing.
> ---
>  net/sunrpc/xdr.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> index 2a22e78af116..d0f5fc8605b8 100644
> --- a/net/sunrpc/xdr.c
> +++ b/net/sunrpc/xdr.c
> @@ -144,6 +144,7 @@ int
>  xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
>  {
>         size_t i, n = xdr_buf_pagecount(buf);
> +       unsigned int offset = offset_in_page(buf->page_base);
>  
>         if (n != 0 && buf->bvec == NULL) {
>                 buf->bvec = kmalloc_array(n, sizeof(buf->bvec[0]),
> gfp);
> @@ -151,7 +152,8 @@ xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
>                         return -ENOMEM;
>                 for (i = 0; i < n; i++) {
>                         bvec_set_page(&buf->bvec[i], buf->pages[i],
> PAGE_SIZE,
> -                                     0);
> +                                     offset);
> +                       offset = 0;

NACK. That's going to break the client.

>                 }
>         }
>         return 0;
> 
> ---
> base-commit: 2ccdd1b13c591d306f0401d98dedc4bdcd02b421
> change-id: 20230814-sendpage-b04874eed249
> 
> Best regards,
Jeff Layton Aug. 14, 2023, 3:30 p.m. UTC | #3
On Mon, 2023-08-14 at 14:51 +0000, Trond Myklebust wrote:
> On Mon, 2023-08-14 at 10:32 -0400, Jeff Layton wrote:
> > I've been seeing a regression in mainline (v6.5-rc) kernels where
> > unaligned reads were returning corrupt data.
> > 
> > 9d96acbc7f37 added a routine to allocate and populate a bvec array
> > that
> > can be used to back an iov_iter. When it does this, it always sets
> > the
> > offset in the first bvec to zero, even when the xdr->page_base is
> > non-zero.
> > 
> > The old code in svc_tcp_sendmsg used to account for this, as it was
> > sending the pages one at a time anyway, but now that we just hand the
> > iov to the network layer, we need to ensure that the bvecs are
> > properly
> > initialized.
> > 
> > Fix xdr_alloc_bvec to set the offset in the first bvec to the offset
> > indicated by xdr->page_base, and then 0 in all subsequent bvecs.
> > 
> > Fixes: 9d96acbc7f37 ("SUNRPC: Add a bvec array to struct xdr_buf for
> > use with iovec_iter()")
> > Signed-off-by: Jeff Layton <jlayton@kernel.org>
> > ---
> > NB: This is only lightly tested so far, but it seems to fix the pynfs
> > regressions I've been seeing.
> > ---
> >  net/sunrpc/xdr.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> > 
> > diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
> > index 2a22e78af116..d0f5fc8605b8 100644
> > --- a/net/sunrpc/xdr.c
> > +++ b/net/sunrpc/xdr.c
> > @@ -144,6 +144,7 @@ int
> >  xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
> >  {
> >         size_t i, n = xdr_buf_pagecount(buf);
> > +       unsigned int offset = offset_in_page(buf->page_base);
> >  
> >         if (n != 0 && buf->bvec == NULL) {
> >                 buf->bvec = kmalloc_array(n, sizeof(buf->bvec[0]),
> > gfp);
> > @@ -151,7 +152,8 @@ xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
> >                         return -ENOMEM;
> >                 for (i = 0; i < n; i++) {
> >                         bvec_set_page(&buf->bvec[i], buf->pages[i],
> > PAGE_SIZE,
> > -                                     0);
> > +                                     offset);
> > +                       offset = 0;
> 
> NACK. That's going to break the client.
> 

<rant>
What's the point of setting up a bvec array that doesn't actually
describe the usable data?
</rant>

Sigh, ok...I suppose we'll need to fix this in the svc callers instead.
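
For readers following along, a rough user-space sketch of the caller-side alternative: the segment array keeps zero offsets, and whoever consumes it skips page_base bytes before copying. The names model_seg and model_gather are hypothetical and this is not the sunrpc code. The thread does not spell out why the patch would break the client; one plausible reading, stated here as an assumption, is that a consumer which already skips page_base would skip it a second time if the first segment also carried the offset.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one bvec entry. */
struct model_seg {
	const unsigned char *base;	/* start of the backing "page" */
	size_t len;			/* usable bytes, counted from offset */
	size_t offset;			/* starting offset within the page */
};

/*
 * Copy up to count bytes out of segs into dst after skipping the first
 * skip bytes of the described data. Each segment's own offset is
 * honoured as well. Returns the number of bytes copied.
 */
static size_t model_gather(const struct model_seg *segs, size_t nsegs,
			   size_t skip, unsigned char *dst, size_t count)
{
	size_t copied = 0;
	size_t i;

	assert(segs != NULL && dst != NULL);

	for (i = 0; i < nsegs && copied < count; i++) {
		size_t avail = segs[i].len;
		size_t start = segs[i].offset;
		size_t take;

		if (skip >= avail) {	/* segment is skipped entirely */
			skip -= avail;
			continue;
		}
		start += skip;
		avail -= skip;
		skip = 0;

		take = count - copied < avail ? count - copied : avail;
		memcpy(dst + copied, segs[i].base + start, take);
		copied += take;
	}
	return copied;
}
```

Over a 16-byte buffer "0123456789abcdef" split into two 8-byte segments with zero offsets, skipping 3 and copying 4 yields "3456". If the first segment instead carries offset 3 (with length 5) and the consumer still skips 3, the same call yields "6789", which is the double-skip described above.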

> >                 }
> >         }
> >         return 0;
> > 
> > ---
> > base-commit: 2ccdd1b13c591d306f0401d98dedc4bdcd02b421
> > change-id: 20230814-sendpage-b04874eed249
> > 
> > Best regards,
>

Patch

diff --git a/net/sunrpc/xdr.c b/net/sunrpc/xdr.c
index 2a22e78af116..d0f5fc8605b8 100644
--- a/net/sunrpc/xdr.c
+++ b/net/sunrpc/xdr.c
@@ -144,6 +144,7 @@  int
 xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
 {
 	size_t i, n = xdr_buf_pagecount(buf);
+	unsigned int offset = offset_in_page(buf->page_base);
 
 	if (n != 0 && buf->bvec == NULL) {
 		buf->bvec = kmalloc_array(n, sizeof(buf->bvec[0]), gfp);
@@ -151,7 +152,8 @@  xdr_alloc_bvec(struct xdr_buf *buf, gfp_t gfp)
 			return -ENOMEM;
 		for (i = 0; i < n; i++) {
 			bvec_set_page(&buf->bvec[i], buf->pages[i], PAGE_SIZE,
-				      0);
+				      offset);
+			offset = 0;
 		}
 	}
 	return 0;
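
The effect of the changed loop can be modelled outside the kernel. The sketch below is a user-space approximation, not the kernel code: model_bvec, model_fill_bvecs and MODEL_PAGE_SIZE are hypothetical names, a 4096-byte page is assumed, and the modulo stands in for offset_in_page(). It shows only the pattern the patch introduces, where the first entry takes the in-page part of page_base and every later entry takes 0.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumed page size for this model */

/* Hypothetical stand-in for struct bio_vec; the page pointer is omitted. */
struct model_bvec {
	unsigned int len;
	unsigned int offset;
};

/*
 * Fill n entries the way the patched loop does: the first entry gets
 * the in-page part of page_base, all following entries get offset 0.
 * The length stays at a full page for every entry, as in the patch.
 */
static void model_fill_bvecs(struct model_bvec *bv, size_t n,
			     unsigned int page_base)
{
	unsigned int offset = page_base % MODEL_PAGE_SIZE;
	size_t i;

	assert(bv != NULL || n == 0);

	for (i = 0; i < n; i++) {
		bv[i].len = MODEL_PAGE_SIZE;
		bv[i].offset = offset;
		offset = 0;	/* only the first entry is offset */
	}
}
```

Note that in this model, as in the patch, the first entry keeps a full-page length alongside a non-zero offset, so offset plus length exceeds one page; whether that matters depends on how the array is consumed.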