diff mbox series

[for-rc] siw: fix for 'is_kva' flag issue in siw_tx_hdt()

Message ID 20190819111338.9366-1-krishna2@chelsio.com (mailing list archive)
State Changes Requested
Headers show
Series [for-rc] siw: fix for 'is_kva' flag issue in siw_tx_hdt() | expand

Commit Message

Krishnamraju Eraparaju Aug. 19, 2019, 11:13 a.m. UTC
"is_kva" variable in siw_tx_hdt() is not currently being updated for
each while loop iteration(the loop which walks the list of SGE's).

Usecase:

say a WQE has two SGE's :
 - first with "assigned kernel buffer", so not MR allocated.
 - second with PBL memory region, so mem_obj == PBL.

Now while processing first SGE, in iteration 1, "is_kva" is set to "1"
because there is no MR allocated.
And while processing the second SGE in iteration 2, since mem_obj is
PBL, the statement "if (!mem->mem_obj)" becomes false but is_kva is
still "1"(previous state). Thus, the PBL memory is handled as "direct
assigned kernel virtual memory", which eventually results in PAGE
FAULTS, MPA CRC issues.

                if (!(tx_flags(wqe) & SIW_WQE_INLINE)) {
                        mem = wqe->mem[sge_idx];
                        if (!mem->mem_obj)
                                is_kva = 1;
                } else {
                        is_kva = 1;
                }

Note that you need to set MTU size more than the PAGESIZE to recreate
this issue(as the address of "PBL index 0" and "assigned kernel
virtual memory" are always same for SIW). In my case, I used SIW
as ISER initiator with MTU 9000, issue occurs with
SCSI WRITE operation.

Signed-off-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
---
 drivers/infiniband/sw/siw/siw_qp_tx.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Bernard Metzler Aug. 19, 2019, 9:44 p.m. UTC | #1
-----"Krishnamraju Eraparaju" <krishna2@chelsio.com> wrote: -----

>To: jgg@ziepe.ca, bmt@zurich.ibm.com
>From: "Krishnamraju Eraparaju" <krishna2@chelsio.com>
>Date: 08/19/2019 01:14PM
>Cc: linux-rdma@vger.kernel.org, bharat@chelsio.com,
>nirranjan@chelsio.com, "Krishnamraju Eraparaju"
><krishna2@chelsio.com>
>Subject: [EXTERNAL] [PATCH for-rc] siw: fix for 'is_kva' flag issue
>in siw_tx_hdt()
>
>"is_kva" variable in siw_tx_hdt() is not currently being updated for
>each while loop iteration(the loop which walks the list of SGE's).
>
>Usecase:
>
>say a WQE has two SGE's :
> - first with "assigned kernel buffer", so not MR allocated.
> - second with PBL memory region, so mem_obj == PBL.
>
>Now while processing first SGE, in iteration 1, "is_kva" is set to
>"1"
>because there is no MR allocated.
>And while processing the second SGE in iteration 2, since mem_obj is
>PBL, the statement "if (!mem->mem_obj)" becomes false but is_kva is
>still "1"(previous state). Thus, the PBL memory is handled as "direct
>assigned kernel virtual memory", which eventually results in PAGE
>FAULTS, MPA CRC issues.
>
>                if (!(tx_flags(wqe) & SIW_WQE_INLINE)) {
>                        mem = wqe->mem[sge_idx];
>                        if (!mem->mem_obj)
>                                is_kva = 1;
>                } else {
>                        is_kva = 1;
>                }
>

Hi Krishna,
That is a good catch. I was not aware of the possibility of mixed
PBL and kernel buffer addresses in one SQE.

A correct fix must also handle the un-mapping of any kmap()'d
buffers. The current TX code expects all buffers be either kmap()'d or
all not kmap()'d. So the fix is a little more complex, if we must
handle mixed SGL's during un-mapping. I think I can provide it by
tomorrow. It's almost midnight ;)

Thanks!
Bernard.

>Note that you need to set MTU size more than the PAGESIZE to recreate
>this issue(as the address of "PBL index 0" and "assigned kernel
>virtual memory" are always same for SIW). In my case, I used SIW
>as ISER initiator with MTU 9000, issue occurs with
>SCSI WRITE operation.
>
>Signed-off-by: Krishnamraju Eraparaju <krishna2@chelsio.com>
>---
> drivers/infiniband/sw/siw/siw_qp_tx.c | 2 ++
> 1 file changed, 2 insertions(+)
>
>diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c
>b/drivers/infiniband/sw/siw/siw_qp_tx.c
>index 43020d2040fc..e2175a08269a 100644
>--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
>+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
>@@ -465,6 +465,8 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx,
>struct socket *s)
> 			mem = wqe->mem[sge_idx];
> 			if (!mem->mem_obj)
> 				is_kva = 1;
>+			else
>+				is_kva = 0;
> 		} else {
> 			is_kva = 1;
> 		}
>-- 
>2.23.0.rc0
>
>
Doug Ledford Aug. 20, 2019, 5:55 p.m. UTC | #2
On Mon, 2019-08-19 at 21:44 +0000, Bernard Metzler wrote:
> Hi Krishna,
> That is a good catch. I was not aware of the possibility of mixed
> PBL and kernel buffer addresses in one SQE.
> 
> A correct fix must also handle the un-mapping of any kmap()'d
> buffers. The current TX code expects all buffers be either kmap()'d or
> all not kmap()'d. So the fix is a little more complex, if we must
> handle mixed SGL's during un-mapping. I think I can provide it by
> tomorrow. It's almost midnight ;)

I'll wait for a proper fix.  Dropping this patch.  Thanks.
Bernard Metzler Aug. 21, 2019, 1:43 p.m. UTC | #3
-----"Doug Ledford" <dledford@redhat.com> wrote: -----

>To: "Bernard Metzler" <BMT@zurich.ibm.com>, "Krishnamraju Eraparaju"
><krishna2@chelsio.com>
>From: "Doug Ledford" <dledford@redhat.com>
>Date: 08/20/2019 07:56PM
>Cc: jgg@ziepe.ca, linux-rdma@vger.kernel.org, bharat@chelsio.com,
>nirranjan@chelsio.com
>Subject: [EXTERNAL] Re: [PATCH for-rc] siw: fix for 'is_kva' flag
>issue in siw_tx_hdt()
>
>On Mon, 2019-08-19 at 21:44 +0000, Bernard Metzler wrote:
>> Hi Krishna,
>> That is a good catch. I was not aware of the possibility of mixed
>> PBL and kernel buffer addresses in one SQE.
>> 
>> A correct fix must also handle the un-mapping of any kmap()'d
>> buffers. The current TX code expects all buffers be either kmap()'d
>or
>> all not kmap()'d. So the fix is a little more complex, if we must
>> handle mixed SGL's during un-mapping. I think I can provide it by
>> tomorrow. It's almost midnight ;)
>
>I'll wait for a proper fix.  Dropping this patch.  Thanks.
>
Thanks Doug!

I have a fix ready but still have to test it with iSER. 
Unfortunately I have a hard time to test iSER with siw, since
both iSCSI-TCP target and iSER want to bind to the same
TCP port. While this may be considered a bug in that code,
siw is the first RDMA provider to take notice (since using
kernel sockets and not offloaded, hitting a TCP port
already bound).

I sent the patch to Chelsio folks and hope for
the best. They know the trick to make it working.

Thanks
Bernard.
Krishnamraju Eraparaju Aug. 22, 2019, 10:50 a.m. UTC | #4
On Wednesday, August 08/21/19, 2019 at 19:13:18 +0530, Bernard Metzler wrote:
> -----"Doug Ledford" <dledford@redhat.com> wrote: -----
> 
> >To: "Bernard Metzler" <BMT@zurich.ibm.com>, "Krishnamraju Eraparaju"
> ><krishna2@chelsio.com>
> >From: "Doug Ledford" <dledford@redhat.com>
> >Date: 08/20/2019 07:56PM
> >Cc: jgg@ziepe.ca, linux-rdma@vger.kernel.org, bharat@chelsio.com,
> >nirranjan@chelsio.com
> >Subject: [EXTERNAL] Re: [PATCH for-rc] siw: fix for 'is_kva' flag
> >issue in siw_tx_hdt()
> >
> >On Mon, 2019-08-19 at 21:44 +0000, Bernard Metzler wrote:
> >> Hi Krishna,
> >> That is a good catch. I was not aware of the possibility of mixed
> >> PBL and kernel buffer addresses in one SQE.
> >> 
> >> A correct fix must also handle the un-mapping of any kmap()'d
> >> buffers. The current TX code expects all buffers be either kmap()'d
> >or
> >> all not kmap()'d. So the fix is a little more complex, if we must
> >> handle mixed SGL's during un-mapping. I think I can provide it by
> >> tomorrow. It's almost midnight ;)
> >
> >I'll wait for a proper fix.  Dropping this patch.  Thanks.
> >
> Thanks Doug!
> 
> I have a fix ready but still have to test it with iSER. 
> Unfortunately I have a hard time to test iSER with siw, since
> both iSCSI-TCP target and iSER want to bind to the same
> TCP port. While this may be considered a bug in that code,
> siw is the first RDMA provider to take notice (since using
> kernel sockets and not offloaded, hitting a TCP port
> already bound).
Not sure if this will become a serious problem when a iSCSI target
is configured to serve both iSCSI-TCP & iSER connections simultaniously.
Because, offloaded iSER CM handles all the TCP SYN packets that were
destined to iSCSI-TCP.
> 
> I sent the patch to Chelsio folks and hope for
> the best. They know the trick to make it working.
I have tested your patch, it's working fine.
> 
> Thanks
> Bernard.
>
diff mbox series

Patch

diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c
index 43020d2040fc..e2175a08269a 100644
--- a/drivers/infiniband/sw/siw/siw_qp_tx.c
+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c
@@ -465,6 +465,8 @@  static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s)
 			mem = wqe->mem[sge_idx];
 			if (!mem->mem_obj)
 				is_kva = 1;
+			else
+				is_kva = 0;
 		} else {
 			is_kva = 1;
 		}