[RESEND] svcrdma: handle rdma read with a non-zero initial page offset

Message ID 20150928214605.17900.50257.stgit@build2.ogc.int (mailing list archive)
State New, archived

Commit Message

Steve Wise Sept. 28, 2015, 9:46 p.m. UTC
The server rdma_read_chunk_lcl() and rdma_read_chunk_frmr() functions
were not taking into account the initial page_offset when determining
the rdma read length.  This resulted in a read whose starting address
and length exceeded the base/bounds of the frmr.

Most workloads apparently don't tickle this bug, but one test hit it
every time: building the linux kernel on a 16 core node with 'make -j
16 O=/mnt/0' where /mnt/0 is a ramdisk mounted via NFSRDMA.

This bug seems to only be tripped with devices having small fastreg page
list depths.  I didn't see it with mlx4, for instance.

Fixes: 0bf4828983df ('svcrdma: refactor marshalling logic')
Signed-off-by: Steve Wise <swise@opengridcomputing.com>
Tested-by: Chuck Lever <chuck.lever@oracle.com>
---

 net/sunrpc/xprtrdma/svc_rdma_recvfrom.c |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Doug Ledford Oct. 6, 2015, 5:44 p.m. UTC | #1
On 09/28/2015 05:46 PM, Steve Wise wrote:
> The server rdma_read_chunk_lcl() and rdma_read_chunk_frmr() functions
> were not taking into account the initial page_offset when determining
> the rdma read length.  This resulted in a read whose starting address
> and length exceeded the base/bounds of the frmr.
> 
> Most workloads apparently don't tickle this bug, but one test hit it
> every time: building the linux kernel on a 16 core node with 'make -j
> 16 O=/mnt/0' where /mnt/0 is a ramdisk mounted via NFSRDMA.
> 
> This bug seems to only be tripped with devices having small fastreg page
> list depths.  I didn't see it with mlx4, for instance.

Bruce, what's your take on this?  Do you want to push this through or
would you care if I push it through my tree?

J. Bruce Fields Oct. 7, 2015, 5:01 p.m. UTC | #2
On Tue, Oct 06, 2015 at 01:44:25PM -0400, Doug Ledford wrote:
> On 09/28/2015 05:46 PM, Steve Wise wrote:
> > The server rdma_read_chunk_lcl() and rdma_read_chunk_frmr() functions
> > were not taking into account the initial page_offset when determining
> > the rdma read length.  This resulted in a read whose starting address
> > and length exceeded the base/bounds of the frmr.
> > 
> > Most workloads apparently don't tickle this bug, but one test hit it
> > every time: building the linux kernel on a 16 core node with 'make -j
> > 16 O=/mnt/0' where /mnt/0 is a ramdisk mounted via NFSRDMA.
> > 
> > This bug seems to only be tripped with devices having small fastreg page
> > list depths.  I didn't see it with mlx4, for instance.
> 
> Bruce, what's your take on this?  Do you want to push this through or
> would you care if I push it through my tree?

Whoops, sorry, I meant to send a pull request for that last week.  Uh, I
think I'll go ahead and do that now if it's OK with you.

--b.



Doug Ledford Oct. 7, 2015, 5:33 p.m. UTC | #3
On 10/07/2015 01:01 PM, J. Bruce Fields wrote:
> On Tue, Oct 06, 2015 at 01:44:25PM -0400, Doug Ledford wrote:
>> On 09/28/2015 05:46 PM, Steve Wise wrote:
>>> The server rdma_read_chunk_lcl() and rdma_read_chunk_frmr() functions
>>> were not taking into account the initial page_offset when determining
>>> the rdma read length.  This resulted in a read whose starting address
>>> and length exceeded the base/bounds of the frmr.
>>>
>>> Most workloads apparently don't tickle this bug, but one test hit it
>>> every time: building the linux kernel on a 16 core node with 'make -j
>>> 16 O=/mnt/0' where /mnt/0 is a ramdisk mounted via NFSRDMA.
>>>
>>> This bug seems to only be tripped with devices having small fastreg page
>>> list depths.  I didn't see it with mlx4, for instance.
>>
>> Bruce, what's your take on this?  Do you want to push this through or
>> would you care if I push it through my tree?
> 
> Whoops, sorry, I meant to send a pull request for that last week.  Uh, I
> think I'll go ahead and do that now if it's OK with you.

Fine with me.  I was just trying to make sure it didn't get forgotten ;-)

J. Bruce Fields Oct. 7, 2015, 8:41 p.m. UTC | #4
On Wed, Oct 07, 2015 at 01:33:05PM -0400, Doug Ledford wrote:
> On 10/07/2015 01:01 PM, J. Bruce Fields wrote:
> > On Tue, Oct 06, 2015 at 01:44:25PM -0400, Doug Ledford wrote:
> >> On 09/28/2015 05:46 PM, Steve Wise wrote:
> >>> The server rdma_read_chunk_lcl() and rdma_read_chunk_frmr() functions
> >>> were not taking into account the initial page_offset when determining
> >>> the rdma read length.  This resulted in a read whose starting address
> >>> and length exceeded the base/bounds of the frmr.
> >>>
> >>> Most workloads apparently don't tickle this bug, but one test hit it
> >>> every time: building the linux kernel on a 16 core node with 'make -j
> >>> 16 O=/mnt/0' where /mnt/0 is a ramdisk mounted via NFSRDMA.
> >>>
> >>> This bug seems to only be tripped with devices having small fastreg page
> >>> list depths.  I didn't see it with mlx4, for instance.
> >>
> >> Bruce, what's your take on this?  Do you want to push this through or
> >> would you care if I push it through my tree?
> > 
> > Whoops, sorry, I meant to send a pull request for that last week.  Uh, I
> > think I'll go ahead and do that now if it's OK with you.
> 
> Fine with me.  I was just trying to make sure it didn't get forgotten ;-)

Understood, thanks!  I've sent the pull request.

--b.




Patch

diff --git a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
index cb51742..5f6ca47 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_recvfrom.c
@@ -136,7 +136,8 @@  int rdma_read_chunk_lcl(struct svcxprt_rdma *xprt,
 	ctxt->direction = DMA_FROM_DEVICE;
 	ctxt->read_hdr = head;
 	pages_needed = min_t(int, pages_needed, xprt->sc_max_sge_rd);
-	read = min_t(int, pages_needed << PAGE_SHIFT, rs_length);
+	read = min_t(int, (pages_needed << PAGE_SHIFT) - *page_offset,
+		     rs_length);
 
 	for (pno = 0; pno < pages_needed; pno++) {
 		int len = min_t(int, rs_length, PAGE_SIZE - pg_off);
@@ -235,7 +236,8 @@  int rdma_read_chunk_frmr(struct svcxprt_rdma *xprt,
 	ctxt->direction = DMA_FROM_DEVICE;
 	ctxt->frmr = frmr;
 	pages_needed = min_t(int, pages_needed, xprt->sc_frmr_pg_list_len);
-	read = min_t(int, pages_needed << PAGE_SHIFT, rs_length);
+	read = min_t(int, (pages_needed << PAGE_SHIFT) - *page_offset,
+		     rs_length);
 
 	frmr->kva = page_address(rqstp->rq_arg.pages[pg_no]);
 	frmr->direction = DMA_FROM_DEVICE;