[v2,1/8] IB/SRP: Avoid using IB_MR_TYPE_SG_GAPS

Message ID cebcaeae-94a6-de82-cfc8-ce055b273836@grimberg.me (mailing list archive)
State Not Applicable

Commit Message

Sagi Grimberg Feb. 15, 2017, 3:38 p.m. UTC
> Tests have shown that the following error message is reported when
> using SG-GAPS registration with an mlx5 adapter:
>
> scsi host1: ib_srp: failed RECV status WR flushed (5) for CQE ffff880bd4270eb0
> 00000000 00000000 00000000 00000000
> 00000000 00000000 00000000 00000000
> 00000000 00000000 00000000 00000000
> 00000000 0f007806 2500002a ad9fafd1
> scsi host1: ib_srp: reconnect succeeded
> mlx5_0:dump_cqe:262:(pid 7369): dump error cqe
> 00000000 00000000 00000000 00000000
> 00000000 00000000 00000000 00000000
> 00000000 00000000 00000000 00000000
> 00000000 0f007806 25000032 00105dd0
> scsi host1: ib_srp: failed FAST REG status memory management operation error (6) for CQE ffff880b92860138
>
> Hence avoid using SG-GAPS memory registrations. Additionally,
> always configure the blk_queue_virt_boundary() to avoid to trigger
> a mapping failure when using adapters that support SG-GAPS (e.g.
> mlx5).

Hi Guys,

Sorry for addressing this late, but has this failure been investigated?

Max, Israel, what does this error syndrome map to?

Looking at mlx5_ib_sg_to_klms, I think the mr->length is incorrectly
incremented. Does the following change fix the problem?
--
--
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Laurence Oberman Feb. 15, 2017, 3:42 p.m. UTC | #1
----- Original Message -----
> From: "Sagi Grimberg" <sagi@grimberg.me>
> To: "Bart Van Assche" <bart.vanassche@sandisk.com>, "Doug Ledford" <dledford@redhat.com>
> Cc: linux-rdma@vger.kernel.org, "Israel Rukshin" <israelr@mellanox.com>, "Max Gurtovoy" <maxg@mellanox.com>, "Leon
> Romanovsky" <leonro@mellanox.com>, "Mark Bloch" <markb@mellanox.com>, "Yuval Shaia" <yuval.shaia@oracle.com>, "# 4 .
> 7+" <stable@vger.kernel.org>
> Sent: Wednesday, February 15, 2017 10:38:06 AM
> Subject: Re: [PATCH v2 1/8] IB/SRP: Avoid using IB_MR_TYPE_SG_GAPS
> 
> 
> > Tests have shown that the following error message is reported when
> > using SG-GAPS registration with an mlx5 adapter:
> >
> > scsi host1: ib_srp: failed RECV status WR flushed (5) for CQE
> > ffff880bd4270eb0
> > 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000 00000000
> > 00000000 0f007806 2500002a ad9fafd1
> > scsi host1: ib_srp: reconnect succeeded
> > mlx5_0:dump_cqe:262:(pid 7369): dump error cqe
> > 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000 00000000
> > 00000000 0f007806 25000032 00105dd0
> > scsi host1: ib_srp: failed FAST REG status memory management operation
> > error (6) for CQE ffff880b92860138
> >
> > Hence avoid using SG-GAPS memory registrations. Additionally,
> > always configure the blk_queue_virt_boundary() to avoid to trigger
> > a mapping failure when using adapters that support SG-GAPS (e.g.
> > mlx5).
> 
> Hi Guys,
> 
> Sorry for addressing this late, but has this failure been investigated?
> 
> Max, Israel, what does this error syndrome map to?
> 
> Looking at mlx5_ib_sg_to_klms, I think the mr->length is incorrectly
> incremented. Does the following change fix the problem?
> --
> diff --git a/drivers/infiniband/hw/mlx5/mr.c
> b/drivers/infiniband/hw/mlx5/mr.c
> index 8f608debe141..c21c9eee37f6 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -1832,7 +1832,7 @@ mlx5_ib_sg_to_klms(struct mlx5_ib_mr *mr,
>                  klms[i].va = cpu_to_be64(sg_dma_address(sg) + sg_offset);
>                  klms[i].bcount = cpu_to_be32(sg_dma_len(sg) - sg_offset);
>                  klms[i].key = cpu_to_be32(lkey);
> -               mr->ibmr.length += sg_dma_len(sg);
> +               mr->ibmr.length += sg_dma_len(sg) - sg_offset;
> 
>                  sg_offset = 0;
>          }
> --
> 

Hello Sagi,

I will get this tested.
Thanks
Laurence
Max Gurtovoy Feb. 15, 2017, 4:18 p.m. UTC | #2
On 2/15/2017 5:38 PM, Sagi Grimberg wrote:
>
>> Tests have shown that the following error message is reported when
>> using SG-GAPS registration with an mlx5 adapter:
>>
>> scsi host1: ib_srp: failed RECV status WR flushed (5) for CQE
>> ffff880bd4270eb0
>> 00000000 00000000 00000000 00000000
>> 00000000 00000000 00000000 00000000
>> 00000000 00000000 00000000 00000000
>> 00000000 0f007806 2500002a ad9fafd1
>> scsi host1: ib_srp: reconnect succeeded
>> mlx5_0:dump_cqe:262:(pid 7369): dump error cqe
>> 00000000 00000000 00000000 00000000
>> 00000000 00000000 00000000 00000000
>> 00000000 00000000 00000000 00000000
>> 00000000 0f007806 25000032 00105dd0
>> scsi host1: ib_srp: failed FAST REG status memory management operation
>> error (6) for CQE ffff880b92860138
>>
>> Hence avoid using SG-GAPS memory registrations. Additionally,
>> always configure the blk_queue_virt_boundary() to avoid to trigger
>> a mapping failure when using adapters that support SG-GAPS (e.g.
>> mlx5).
>
> Hi Guys,
>
> Sorry for addressing this late, but has this failure been investigated?
>
> Max, Israel, what does this error syndrome map to?

Sagi,
This syndrome says that the number of KLMs to write is bigger than the
number of MTTs.

Artemy started investigating it and proposed a solution that was tested
by Laurence.
Let's see if your fix helps.

>
> Looking at mlx5_ib_sg_to_klms, I think the mr->length is incorrectly
> incremented. Does the following change fix the problem?
> --
> diff --git a/drivers/infiniband/hw/mlx5/mr.c
> b/drivers/infiniband/hw/mlx5/mr.c
> index 8f608debe141..c21c9eee37f6 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -1832,7 +1832,7 @@ mlx5_ib_sg_to_klms(struct mlx5_ib_mr *mr,
>                 klms[i].va = cpu_to_be64(sg_dma_address(sg) + sg_offset);
>                 klms[i].bcount = cpu_to_be32(sg_dma_len(sg) - sg_offset);
>                 klms[i].key = cpu_to_be32(lkey);
> -               mr->ibmr.length += sg_dma_len(sg);
> +               mr->ibmr.length += sg_dma_len(sg) - sg_offset;
>
>                 sg_offset = 0;
>         }
> --
Sagi Grimberg Feb. 15, 2017, 4:27 p.m. UTC | #3
>> Hi Guys,
>>
>> Sorry for addressing this late, but has this failure been investigated?
>>
>> Max, Israel, what does this error syndrome map to?
>
> Sagi,
> this syndrome says that number of klms to write is bigger than number of
> mtts.

That is strange, given that we check for it explicitly...

If this is indeed the case, then mlx5 reports a wrong
max_fast_reg_page_list_len.

> Artemy started investigating it and proposed solution that were tested
> by Laurence.
> Let's see if your fix will help.

Not sure it will help if the syndrome is what you say it is...
Leon Romanovsky Feb. 15, 2017, 4:30 p.m. UTC | #4
On Wed, Feb 15, 2017 at 06:18:02PM +0200, Max Gurtovoy wrote:
>
>
> On 2/15/2017 5:38 PM, Sagi Grimberg wrote:
> >
> > > Tests have shown that the following error message is reported when
> > > using SG-GAPS registration with an mlx5 adapter:
> > >
> > > scsi host1: ib_srp: failed RECV status WR flushed (5) for CQE
> > > ffff880bd4270eb0
> > > 00000000 00000000 00000000 00000000
> > > 00000000 00000000 00000000 00000000
> > > 00000000 00000000 00000000 00000000
> > > 00000000 0f007806 2500002a ad9fafd1
> > > scsi host1: ib_srp: reconnect succeeded
> > > mlx5_0:dump_cqe:262:(pid 7369): dump error cqe
> > > 00000000 00000000 00000000 00000000
> > > 00000000 00000000 00000000 00000000
> > > 00000000 00000000 00000000 00000000
> > > 00000000 0f007806 25000032 00105dd0
> > > scsi host1: ib_srp: failed FAST REG status memory management operation
> > > error (6) for CQE ffff880b92860138
> > >
> > > Hence avoid using SG-GAPS memory registrations. Additionally,
> > > always configure the blk_queue_virt_boundary() to avoid to trigger
> > > a mapping failure when using adapters that support SG-GAPS (e.g.
> > > mlx5).
> >
> > Hi Guys,
> >
> > Sorry for addressing this late, but has this failure been investigated?
> >
> > Max, Israel, what does this error syndrome map to?
>
> Sagi,
> this syndrome says that number of klms to write is bigger than number of
> mtts.
>
> Artemy started investigating it and proposed solution that were tested by
> Laurence.
> Let's see if your fix will help.

No, Artemy's change doesn't fix it.

>
> >
> > Looking at mlx5_ib_sg_to_klms, I think the mr->length is incorrectly
> > incremented. Does the following change fix the problem?
> > --
> > diff --git a/drivers/infiniband/hw/mlx5/mr.c
> > b/drivers/infiniband/hw/mlx5/mr.c
> > index 8f608debe141..c21c9eee37f6 100644
> > --- a/drivers/infiniband/hw/mlx5/mr.c
> > +++ b/drivers/infiniband/hw/mlx5/mr.c
> > @@ -1832,7 +1832,7 @@ mlx5_ib_sg_to_klms(struct mlx5_ib_mr *mr,
> >                 klms[i].va = cpu_to_be64(sg_dma_address(sg) + sg_offset);
> >                 klms[i].bcount = cpu_to_be32(sg_dma_len(sg) - sg_offset);
> >                 klms[i].key = cpu_to_be32(lkey);
> > -               mr->ibmr.length += sg_dma_len(sg);
> > +               mr->ibmr.length += sg_dma_len(sg) - sg_offset;
> >
> >                 sg_offset = 0;
> >         }
> > --
Laurence Oberman Feb. 15, 2017, 4:37 p.m. UTC | #5
----- Original Message -----
> From: "Sagi Grimberg" <sagi@grimberg.me>
> To: "Bart Van Assche" <bart.vanassche@sandisk.com>, "Doug Ledford" <dledford@redhat.com>
> Cc: linux-rdma@vger.kernel.org, "Israel Rukshin" <israelr@mellanox.com>, "Max Gurtovoy" <maxg@mellanox.com>, "Leon
> Romanovsky" <leonro@mellanox.com>, "Mark Bloch" <markb@mellanox.com>, "Yuval Shaia" <yuval.shaia@oracle.com>, "# 4 .
> 7+" <stable@vger.kernel.org>
> Sent: Wednesday, February 15, 2017 10:38:06 AM
> Subject: Re: [PATCH v2 1/8] IB/SRP: Avoid using IB_MR_TYPE_SG_GAPS
> 
> 
> > Tests have shown that the following error message is reported when
> > using SG-GAPS registration with an mlx5 adapter:
> >
> > scsi host1: ib_srp: failed RECV status WR flushed (5) for CQE
> > ffff880bd4270eb0
> > 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000 00000000
> > 00000000 0f007806 2500002a ad9fafd1
> > scsi host1: ib_srp: reconnect succeeded
> > mlx5_0:dump_cqe:262:(pid 7369): dump error cqe
> > 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000 00000000
> > 00000000 00000000 00000000 00000000
> > 00000000 0f007806 25000032 00105dd0
> > scsi host1: ib_srp: failed FAST REG status memory management operation
> > error (6) for CQE ffff880b92860138
> >
> > Hence avoid using SG-GAPS memory registrations. Additionally,
> > always configure the blk_queue_virt_boundary() to avoid to trigger
> > a mapping failure when using adapters that support SG-GAPS (e.g.
> > mlx5).
> 
> Hi Guys,
> 
> Sorry for addressing this late, but has this failure been investigated?
> 
> Max, Israel, what does this error syndrome map to?
> 
> Looking at mlx5_ib_sg_to_klms, I think the mr->length is incorrectly
> incremented. Does the following change fix the problem?
> --
> diff --git a/drivers/infiniband/hw/mlx5/mr.c
> b/drivers/infiniband/hw/mlx5/mr.c
> index 8f608debe141..c21c9eee37f6 100644
> --- a/drivers/infiniband/hw/mlx5/mr.c
> +++ b/drivers/infiniband/hw/mlx5/mr.c
> @@ -1832,7 +1832,7 @@ mlx5_ib_sg_to_klms(struct mlx5_ib_mr *mr,
>                  klms[i].va = cpu_to_be64(sg_dma_address(sg) + sg_offset);
>                  klms[i].bcount = cpu_to_be32(sg_dma_len(sg) - sg_offset);
>                  klms[i].key = cpu_to_be32(lkey);
> -               mr->ibmr.length += sg_dma_len(sg);
> +               mr->ibmr.length += sg_dma_len(sg) - sg_offset;
> 
>                  sg_offset = 0;
>          }
> --
> 

Started with Linus's tree, applied the change requested by Sagi, built the kernel, rebooted and started the tests.

Linux ibclient 4.10.0-rc8.sagi+ #1 SMP Wed Feb 15 11:09:44 EST 2017 x86_64 x86_64 x86_64 GNU/Linux

I very quickly get to this:

[  180.990285] mlx5_0:dump_cqe:262:(pid 0): dump error cqe
[  181.016899] 00000000 00000000 00000000 00000000
[  181.040949] 00000000 00000000 00000000 00000000
[  181.066960] 00000000 00000000 00000000 00000000
[  181.092030] 00000000 0f007806 2500002a bf1913d0
[  181.117254] scsi host2: ib_srp: failed FAST REG status memory management operation error (6) for CQE ffff880bdbe88778
[  196.288933] fast_io_fail_tmo expired for SRP port-2:1 / host2.
[  197.090886] scsi host2: ib_srp: reconnect succeeded
[  197.127628] scsi host2: ib_srp: failed RECV status WR flushed (5) for CQE ffff8817f09b6f30

So it does not help.
I think Bart's and my suggestion to revert for now is the best way forward.
I have already tested this in depth from Bart's tree, and it has been sent to Doug as v2 of Bart's recent 8-patch series.

Thanks
Laurence
Sagi Grimberg Feb. 15, 2017, 4:55 p.m. UTC | #6
> Started with Linus's tree, applied the change requested by Sagi, built the kernel, rebooted and started the tests.
>
> Linux ibclient 4.10.0-rc8.sagi+ #1 SMP Wed Feb 15 11:09:44 EST 2017 x86_64 x86_64 x86_64 GNU/Linux
>
> Very quickly get to this
>
> [  180.990285] mlx5_0:dump_cqe:262:(pid 0): dump error cqe
> [  181.016899] 00000000 00000000 00000000 00000000
> [  181.040949] 00000000 00000000 00000000 00000000
> [  181.066960] 00000000 00000000 00000000 00000000
> [  181.092030] 00000000 0f007806 2500002a bf1913d0
> [  181.117254] scsi host2: ib_srp: failed FAST REG status memory management operation error (6) for CQE ffff880bdbe88778
> [  196.288933] fast_io_fail_tmo expired for SRP port-2:1 / host2.
> [  197.090886] scsi host2: ib_srp: reconnect succeeded
> [  197.127628] scsi host2: ib_srp: failed RECV status WR flushed (5) for CQE ffff8817f09b6f30
>
> So does not help.
> I think my and Barts suggestion to revert for now is the best way forward.
> I have already tested this in-depth from Bart's tree and its been sent to Doug as V2 of Bart'recent 8 patch series.

Yea, probably this is the best way forward.

Bart, I think the change I suggested is still needed regardless,
do you agree?

Max, Leon, is it possible that the max number of klms per MR is
less than what is reported in device capabilities for page_list_len?

If so, this means that either:
1. mlx5 needs to expose the minimum between pages and sg elems (sucks)
2. we need yet another capability for SG_GAPS (sucks^2 because the
whole point was to make it transparent to the user)
3. mlx5 does not support SG_GAPS (sucks^3 because we now have something
that's not supported by any device).
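
The difference between options 1 and 2 above can be sketched in plain C. This is only an illustration under stated assumptions: struct mr_limits, its two fields, and both helper functions are hypothetical names made up for the example, not mlx5 or ib_core definitions, and the thread has not confirmed that the KLM limit is in fact lower than the page limit.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical per-device limits; illustrative names, not mlx5 fields. */
struct mr_limits {
	uint32_t max_pages_per_mr;	/* limit for page-based (MTT) MRs */
	uint32_t max_klms_per_mr;	/* limit for SG_GAPS (KLM) MRs */
};

/* Hypothetical MR types for the example. */
enum mr_type { MR_TYPE_MEM_REG, MR_TYPE_SG_GAPS };

/* Option 1: advertise a single value that is safe for both MR types. */
static uint32_t advertised_page_list_len(const struct mr_limits *lim)
{
	assert(lim);
	return lim->max_pages_per_mr < lim->max_klms_per_mr ?
	       lim->max_pages_per_mr : lim->max_klms_per_mr;
}

/* Option 2: keep a separate limit and let the caller ask per MR type. */
static uint32_t limit_for_type(const struct mr_limits *lim, enum mr_type type)
{
	assert(lim);
	return type == MR_TYPE_SG_GAPS ? lim->max_klms_per_mr
				       : lim->max_pages_per_mr;
}
```

With option 1 every user sees the smaller limit even when it never allocates an SG_GAPS MR; with option 2 the caller has to know which MR type it is going to use, which is the loss of transparency mentioned above.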
Bart Van Assche Feb. 15, 2017, 11:49 p.m. UTC | #7
On Wed, 2017-02-15 at 18:55 +0200, Sagi Grimberg wrote:
> Bart, I think the change I suggested is still needed regardless,
> do you agree?

I'm not sure. I have not yet had a close look at that part of the mlx5 driver.

Bart.
Leon Romanovsky Feb. 16, 2017, 6:14 a.m. UTC | #8
On Wed, Feb 15, 2017 at 06:55:52PM +0200, Sagi Grimberg wrote:
>
> > Started with Linus's tree, applied the change requested by Sagi, built the kernel, rebooted and started the tests.
> >
> > Linux ibclient 4.10.0-rc8.sagi+ #1 SMP Wed Feb 15 11:09:44 EST 2017 x86_64 x86_64 x86_64 GNU/Linux
> >
> > Very quickly get to this
> >
> > [  180.990285] mlx5_0:dump_cqe:262:(pid 0): dump error cqe
> > [  181.016899] 00000000 00000000 00000000 00000000
> > [  181.040949] 00000000 00000000 00000000 00000000
> > [  181.066960] 00000000 00000000 00000000 00000000
> > [  181.092030] 00000000 0f007806 2500002a bf1913d0
> > [  181.117254] scsi host2: ib_srp: failed FAST REG status memory management operation error (6) for CQE ffff880bdbe88778
> > [  196.288933] fast_io_fail_tmo expired for SRP port-2:1 / host2.
> > [  197.090886] scsi host2: ib_srp: reconnect succeeded
> > [  197.127628] scsi host2: ib_srp: failed RECV status WR flushed (5) for CQE ffff8817f09b6f30
> >
> > So does not help.
> > I think my and Barts suggestion to revert for now is the best way forward.
> > I have already tested this in-depth from Bart's tree and its been sent to Doug as V2 of Bart'recent 8 patch series.
>
> Yea, probably this is the best way forward.
>
> Bart, I think the change I suggested is still needed regardless,
> do you agree?
>
> Max, Leon, is it possible that the max number of klms pr mr is
> less than what reported in device capabilities for page_list_len?

I hope not, and we will check.
I already asked this but didn't get any response, so I'll ask again:
iSER has similar code with SG_GAPS, does it work?

>
> If so, this means that either:
> 1. mlx5 needs to expose the minimum between pages and sg elems (sucks)
> 2. we need yet another capability for SG_GAPS (sucks^2 because the
> whole point was to make it transparent to the user)
> 3. mlx5 does not support SG_GAPS (sucks^3 because we now have something
> thats not supported by any device).
Max Gurtovoy Feb. 16, 2017, 9:11 a.m. UTC | #9
On 2/16/2017 8:14 AM, Leon Romanovsky wrote:
> On Wed, Feb 15, 2017 at 06:55:52PM +0200, Sagi Grimberg wrote:
>>
>>> Started with Linus's tree, applied the change requested by Sagi, built the kernel, rebooted and started the tests.
>>>
>>> Linux ibclient 4.10.0-rc8.sagi+ #1 SMP Wed Feb 15 11:09:44 EST 2017 x86_64 x86_64 x86_64 GNU/Linux
>>>
>>> Very quickly get to this
>>>
>>> [  180.990285] mlx5_0:dump_cqe:262:(pid 0): dump error cqe
>>> [  181.016899] 00000000 00000000 00000000 00000000
>>> [  181.040949] 00000000 00000000 00000000 00000000
>>> [  181.066960] 00000000 00000000 00000000 00000000
>>> [  181.092030] 00000000 0f007806 2500002a bf1913d0
>>> [  181.117254] scsi host2: ib_srp: failed FAST REG status memory management operation error (6) for CQE ffff880bdbe88778
>>> [  196.288933] fast_io_fail_tmo expired for SRP port-2:1 / host2.
>>> [  197.090886] scsi host2: ib_srp: reconnect succeeded
>>> [  197.127628] scsi host2: ib_srp: failed RECV status WR flushed (5) for CQE ffff8817f09b6f30
>>>
>>> So does not help.
>>> I think my and Barts suggestion to revert for now is the best way forward.
>>> I have already tested this in-depth from Bart's tree and its been sent to Doug as V2 of Bart'recent 8 patch series.
>>
>> Yea, probably this is the best way forward.
>>
>> Bart, I think the change I suggested is still needed regardless,
>> do you agree?
>>
>> Max, Leon, is it possible that the max number of klms pr mr is
>> less than what reported in device capabilities for page_list_len?
>
> I hope no and we will check.
> I already asked it, but didn't get any response, and I'll repeat it again.
> ISER has similar code with SG_GAPS, does it work?

Yes, I haven't seen issues with that in iSER.
We need to continue debugging.

>
>>
>> If so, this means that either:
>> 1. mlx5 needs to expose the minimum between pages and sg elems (sucks)
>> 2. we need yet another capability for SG_GAPS (sucks^2 because the
>> whole point was to make it transparent to the user)
>> 3. mlx5 does not support SG_GAPS (sucks^3 because we now have something
>> thats not supported by any device).
Patch

diff --git a/drivers/infiniband/hw/mlx5/mr.c b/drivers/infiniband/hw/mlx5/mr.c
index 8f608debe141..c21c9eee37f6 100644
--- a/drivers/infiniband/hw/mlx5/mr.c
+++ b/drivers/infiniband/hw/mlx5/mr.c
@@ -1832,7 +1832,7 @@  mlx5_ib_sg_to_klms(struct mlx5_ib_mr *mr,
                 klms[i].va = cpu_to_be64(sg_dma_address(sg) + sg_offset);
                 klms[i].bcount = cpu_to_be32(sg_dma_len(sg) - sg_offset);
                 klms[i].key = cpu_to_be32(lkey);
-               mr->ibmr.length += sg_dma_len(sg);
+               mr->ibmr.length += sg_dma_len(sg) - sg_offset;

                 sg_offset = 0;
         }
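
The accounting change in this patch can be modelled outside the kernel. The following is a minimal userspace sketch, not the driver code: struct sg_entry, struct klm, map_sg_to_klms() and klm_total() are hypothetical stand-ins for the scatterlist, the KLM descriptors and the loop in mlx5_ib_sg_to_klms(), and byte-order conversion is left out. It only shows that, when sg_offset is non-zero, the unpatched sum no longer matches the byte counts written into the KLM list.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one DMA-mapped scatterlist entry. */
struct sg_entry {
	uint64_t dma_addr;
	uint32_t dma_len;
};

/* Hypothetical stand-in for one KLM descriptor (host byte order). */
struct klm {
	uint64_t va;
	uint32_t bcount;
};

/*
 * Fill klms[] from sg[] and return the accumulated MR length.
 * subtract_offset selects the unpatched (0) or patched (1) accounting.
 */
static uint64_t map_sg_to_klms(const struct sg_entry *sg, size_t n,
			       uint32_t sg_offset, struct klm *klms,
			       int subtract_offset)
{
	uint64_t length = 0;

	for (size_t i = 0; i < n; i++) {
		assert(sg_offset <= sg[i].dma_len);
		klms[i].va = sg[i].dma_addr + sg_offset;
		klms[i].bcount = sg[i].dma_len - sg_offset;
		length += subtract_offset ? sg[i].dma_len - sg_offset
					  : sg[i].dma_len;
		sg_offset = 0;	/* the offset applies to the first entry only */
	}
	return length;
}

/* Sum of the byte counts actually described by the KLM list. */
static uint64_t klm_total(const struct klm *klms, size_t n)
{
	uint64_t total = 0;

	for (size_t i = 0; i < n; i++)
		total += klms[i].bcount;
	return total;
}
```

For two 4096-byte entries and an sg_offset of 512, the unpatched accounting yields a length of 8192 while the KLMs describe 7680 bytes; the patched accounting yields 7680, matching klm_total(). As the test report in comment #5 shows, correcting this mismatch did not make the FAST REG error go away.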