Message ID | 20190819111338.9366-1-krishna2@chelsio.com (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | [for-rc] siw: fix for 'is_kva' flag issue in siw_tx_hdt() | expand |
-----"Krishnamraju Eraparaju" <krishna2@chelsio.com> wrote: ----- >To: jgg@ziepe.ca, bmt@zurich.ibm.com >From: "Krishnamraju Eraparaju" <krishna2@chelsio.com> >Date: 08/19/2019 01:14PM >Cc: linux-rdma@vger.kernel.org, bharat@chelsio.com, >nirranjan@chelsio.com, "Krishnamraju Eraparaju" ><krishna2@chelsio.com> >Subject: [EXTERNAL] [PATCH for-rc] siw: fix for 'is_kva' flag issue >in siw_tx_hdt() > >"is_kva" variable in siw_tx_hdt() is not currently being updated for >each while loop iteration(the loop which walks the list of SGE's). > >Usecase: > >say a WQE has two SGE's : > - first with "assigned kernel buffer", so not MR allocated. > - second with PBL memory region, so mem_obj == PBL. > >Now while processing first SGE, in iteration 1, "is_kva" is set to >"1" >because there is no MR allocated. >And while processing the second SGE in iteration 2, since mem_obj is >PBL, the statement "if (!mem->mem_obj)" becomes false but is_kva is >still "1"(previous state). Thus, the PBL memory is handled as "direct >assigned kernel virtual memory", which eventually results in PAGE >FAULTS, MPA CRC issues. > > if (!(tx_flags(wqe) & SIW_WQE_INLINE)) { > mem = wqe->mem[sge_idx]; > if (!mem->mem_obj) > is_kva = 1; > } else { > is_kva = 1; > } > Hi Krishna, That is a good catch. I was not aware of the possibility of mixed PBL and kernel buffer addresses in one SQE. A correct fix must also handle the un-mapping of any kmap()'d buffers. The current TX code expects all buffers be either kmap()'d or all not kmap()'d. So the fix is a little more complex, if we must handle mixed SGL's during un-mapping. I think I can provide it by tomorrow. It's almost midnight ;) Thanks! Bernard. >Note that you need to set MTU size more than the PAGESIZE to recreate >this issue(as the address of "PBL index 0" and "assigned kernel >virtual memory" are always same for SIW). In my case, I used SIW >as ISER initiator with MTU 9000, issue occurs with >SCSI WRITE operation. > >Signed-off-by: Krishnamraju Eraparaju <krishna2@chelsio.com> >--- > drivers/infiniband/sw/siw/siw_qp_tx.c | 2 ++ > 1 file changed, 2 insertions(+) > >diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c >b/drivers/infiniband/sw/siw/siw_qp_tx.c >index 43020d2040fc..e2175a08269a 100644 >--- a/drivers/infiniband/sw/siw/siw_qp_tx.c >+++ b/drivers/infiniband/sw/siw/siw_qp_tx.c >@@ -465,6 +465,8 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, >struct socket *s) > mem = wqe->mem[sge_idx]; > if (!mem->mem_obj) > is_kva = 1; >+ else >+ is_kva = 0; > } else { > is_kva = 1; > } >-- >2.23.0.rc0 > >
On Mon, 2019-08-19 at 21:44 +0000, Bernard Metzler wrote: > Hi Krishna, > That is a good catch. I was not aware of the possibility of mixed > PBL and kernel buffer addresses in one SQE. > > A correct fix must also handle the un-mapping of any kmap()'d > buffers. The current TX code expects all buffers be either kmap()'d or > all not kmap()'d. So the fix is a little more complex, if we must > handle mixed SGL's during un-mapping. I think I can provide it by > tomorrow. It's almost midnight ;) I'll wait for a proper fix. Dropping this patch. Thanks.
-----"Doug Ledford" <dledford@redhat.com> wrote: ----- >To: "Bernard Metzler" <BMT@zurich.ibm.com>, "Krishnamraju Eraparaju" ><krishna2@chelsio.com> >From: "Doug Ledford" <dledford@redhat.com> >Date: 08/20/2019 07:56PM >Cc: jgg@ziepe.ca, linux-rdma@vger.kernel.org, bharat@chelsio.com, >nirranjan@chelsio.com >Subject: [EXTERNAL] Re: [PATCH for-rc] siw: fix for 'is_kva' flag >issue in siw_tx_hdt() > >On Mon, 2019-08-19 at 21:44 +0000, Bernard Metzler wrote: >> Hi Krishna, >> That is a good catch. I was not aware of the possibility of mixed >> PBL and kernel buffer addresses in one SQE. >> >> A correct fix must also handle the un-mapping of any kmap()'d >> buffers. The current TX code expects all buffers be either kmap()'d >or >> all not kmap()'d. So the fix is a little more complex, if we must >> handle mixed SGL's during un-mapping. I think I can provide it by >> tomorrow. It's almost midnight ;) > >I'll wait for a proper fix. Dropping this patch. Thanks. > Thanks Doug! I have a fix ready but still have to test it with iSER. Unfortunately I have a hard time to test iSER with siw, since both iSCSI-TCP target and iSER want to bind to the same TCP port. While this may be considered a bug in that code, siw is the first RDMA provider to take notice (since using kernel sockets and not offloaded, hitting a TCP port already bound). I sent the patch to Chelsio folks and hope for the best. They know the trick to make it working. Thanks Bernard.
On Wednesday, August 08/21/19, 2019 at 19:13:18 +0530, Bernard Metzler wrote: > -----"Doug Ledford" <dledford@redhat.com> wrote: ----- > > >To: "Bernard Metzler" <BMT@zurich.ibm.com>, "Krishnamraju Eraparaju" > ><krishna2@chelsio.com> > >From: "Doug Ledford" <dledford@redhat.com> > >Date: 08/20/2019 07:56PM > >Cc: jgg@ziepe.ca, linux-rdma@vger.kernel.org, bharat@chelsio.com, > >nirranjan@chelsio.com > >Subject: [EXTERNAL] Re: [PATCH for-rc] siw: fix for 'is_kva' flag > >issue in siw_tx_hdt() > > > >On Mon, 2019-08-19 at 21:44 +0000, Bernard Metzler wrote: > >> Hi Krishna, > >> That is a good catch. I was not aware of the possibility of mixed > >> PBL and kernel buffer addresses in one SQE. > >> > >> A correct fix must also handle the un-mapping of any kmap()'d > >> buffers. The current TX code expects all buffers be either kmap()'d > >or > >> all not kmap()'d. So the fix is a little more complex, if we must > >> handle mixed SGL's during un-mapping. I think I can provide it by > >> tomorrow. It's almost midnight ;) > > > >I'll wait for a proper fix. Dropping this patch. Thanks. > > > Thanks Doug! > > I have a fix ready but still have to test it with iSER. > Unfortunately I have a hard time to test iSER with siw, since > both iSCSI-TCP target and iSER want to bind to the same > TCP port. While this may be considered a bug in that code, > siw is the first RDMA provider to take notice (since using > kernel sockets and not offloaded, hitting a TCP port > already bound). Not sure if this will become a serious problem when a iSCSI target is configured to serve both iSCSI-TCP & iSER connections simultaniously. Because, offloaded iSER CM handles all the TCP SYN packets that were destined to iSCSI-TCP. > > I sent the patch to Chelsio folks and hope for > the best. They know the trick to make it working. I have tested your patch, it's working fine. > > Thanks > Bernard. >
diff --git a/drivers/infiniband/sw/siw/siw_qp_tx.c b/drivers/infiniband/sw/siw/siw_qp_tx.c index 43020d2040fc..e2175a08269a 100644 --- a/drivers/infiniband/sw/siw/siw_qp_tx.c +++ b/drivers/infiniband/sw/siw/siw_qp_tx.c @@ -465,6 +465,8 @@ static int siw_tx_hdt(struct siw_iwarp_tx *c_tx, struct socket *s) mem = wqe->mem[sge_idx]; if (!mem->mem_obj) is_kva = 1; + else + is_kva = 0; } else { is_kva = 1; }
"is_kva" variable in siw_tx_hdt() is not currently being updated for each while loop iteration(the loop which walks the list of SGE's). Usecase: say a WQE has two SGE's : - first with "assigned kernel buffer", so not MR allocated. - second with PBL memory region, so mem_obj == PBL. Now while processing first SGE, in iteration 1, "is_kva" is set to "1" because there is no MR allocated. And while processing the second SGE in iteration 2, since mem_obj is PBL, the statement "if (!mem->mem_obj)" becomes false but is_kva is still "1"(previous state). Thus, the PBL memory is handled as "direct assigned kernel virtual memory", which eventually results in PAGE FAULTS, MPA CRC issues. if (!(tx_flags(wqe) & SIW_WQE_INLINE)) { mem = wqe->mem[sge_idx]; if (!mem->mem_obj) is_kva = 1; } else { is_kva = 1; } Note that you need to set MTU size more than the PAGESIZE to recreate this issue(as the address of "PBL index 0" and "assigned kernel virtual memory" are always same for SIW). In my case, I used SIW as ISER initiator with MTU 9000, issue occurs with SCSI WRITE operation. Signed-off-by: Krishnamraju Eraparaju <krishna2@chelsio.com> --- drivers/infiniband/sw/siw/siw_qp_tx.c | 2 ++ 1 file changed, 2 insertions(+)