Message ID | 20150724161848.25617.26092.stgit@build2.ogc.int (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
On 7/24/2015 7:18 PM, Steve Wise wrote: > This is in preparation for adding new FRMR-only IO handlers > for devices that support FRMR and not PI. Steve, I've given this some thought and I think we should avoid splitting logic from PI and iWARP. The reason (other than code duplication) is that currently the iser target support only up to 1MB IOs. I have some code (not done yet) to support larger IOs by using multiple registrations per IO (with or without PI). With a little tweaking I think we can get iwarp to fit in too... So, do you mind if I take a crack at it? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Jul 26, 2015 at 01:08:16PM +0300, Sagi Grimberg wrote: > I've given this some thought and I think we should avoid splitting > logic from PI and iWARP. The reason (other than code duplication) is > that currently the iser target support only up to 1MB IOs. I have some > code (not done yet) to support larger IOs by using multiple > registrations per IO (with or without PI). Just curious: How is this going to work with iSER only having a single rkey/offset/len field? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 7/26/2015 1:43 PM, Christoph Hellwig wrote: > On Sun, Jul 26, 2015 at 01:08:16PM +0300, Sagi Grimberg wrote: >> I've given this some thought and I think we should avoid splitting >> logic from PI and iWARP. The reason (other than code duplication) is >> that currently the iser target support only up to 1MB IOs. I have some >> code (not done yet) to support larger IOs by using multiple >> registrations per IO (with or without PI). > > Just curious: How is this going to work with iSER only having a single > rkey/offset/len field? > Good question, On the wire iser sends a single rkey, but the target is allowed to transfer the data however it wants to. Say that the local target HCA supports only 32 pages (128K bytes for 4K pages) registration and the initiator sent: rkey=0x1234 address=0xffffaaaa length=512K The target would allocate a 512K buffer and: register offset 0-128K to lkey=0x1 register offset 128K-256K to lkey=0x2 register offset 256K-384K to lkey=0x3 register offset 384K-512K to lkey=0x4 then constructs sg_list as: sg_list[0] = {addr=buf, length=128K, lkey=0x1} sg_list[1] = {addr=buf+128K, length=128K, lkey=0x2} sg_list[2] = {addr=buf+256K, length=128K, lkey=0x3} sg_list[3] = {addr=buf+384K, length=128K, lkey=0x4} Then set rdma_read wr with: rdma_r_wr.sg_list=&sg_list rdma_r_wr.rdma.addr=0xffffaaaa rdma_r_wr.rdma.rkey=0x1234 post_send(rdma_r_wr); Ideally, the post contains a chain of all 4 registrations and the rdma_read (and an opportunistic good scsi response). -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Jul 26, 2015 at 02:00:51PM +0300, Sagi Grimberg wrote: > On the wire iser sends a single rkey, but the target is allowed to > transfer the data however it wants to. So you're trying to get above the limit of a single RDMA READ, not above the limit for memory registration in the initiator? In that case your explanation makes sense, that's just not what I expected to be the limiting factor. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 7/26/2015 6:53 PM, Christoph Hellwig wrote: > On Sun, Jul 26, 2015 at 02:00:51PM +0300, Sagi Grimberg wrote: >> On the wire iser sends a single rkey, but the target is allowed to >> transfer the data however it wants to. > > So you're trying to get above the limit of a single RDMA READ, not > above the limit for memory registration in the initiator? Correct. > In that case your explanation makes sense, that's just not what I expected > to be the limiting factor. > In the initiator case, there is no way to support transfer size that exceeds the device registration length capabilities (unless we start using higher-order atomic allocations which we won't). -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 7/26/2015 6:00 AM, Sagi Grimberg wrote: > On 7/26/2015 1:43 PM, Christoph Hellwig wrote: >> On Sun, Jul 26, 2015 at 01:08:16PM +0300, Sagi Grimberg wrote: >>> I've given this some thought and I think we should avoid splitting >>> logic from PI and iWARP. The reason (other than code duplication) is >>> that currently the iser target support only up to 1MB IOs. I have some >>> code (not done yet) to support larger IOs by using multiple >>> registrations per IO (with or without PI). >> >> Just curious: How is this going to work with iSER only having a single >> rkey/offset/len field? >> > > Good question, > > On the wire iser sends a single rkey, but the target is allowed to > transfer the data however it wants to. > > Say that the local target HCA supports only 32 pages (128K bytes for 4K > pages) registration and the initiator sent: > rkey=0x1234 > address=0xffffaaaa > length=512K > > The target would allocate a 512K buffer and: > register offset 0-128K to lkey=0x1 > register offset 128K-256K to lkey=0x2 > register offset 256K-384K to lkey=0x3 > register offset 384K-512K to lkey=0x4 > > then constructs sg_list as: > sg_list[0] = {addr=buf, length=128K, lkey=0x1} > sg_list[1] = {addr=buf+128K, length=128K, lkey=0x2} > sg_list[2] = {addr=buf+256K, length=128K, lkey=0x3} > sg_list[3] = {addr=buf+384K, length=128K, lkey=0x4} > > Then set rdma_read wr with: > rdma_r_wr.sg_list=&sg_list > rdma_r_wr.rdma.addr=0xffffaaaa > rdma_r_wr.rdma.rkey=0x1234 > > post_send(rdma_r_wr); > > Ideally, the post contains a chain of all 4 registrations and the > rdma_read (and an opportunistic good scsi response). Just to be clear: This example is for IB only, correct? IW would require rkeys with REMOTE_WRITE and 4 read wrs. And you're ignoring invalidation wrs (or read-with-inv) in the example... -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
>> Ideally, the post contains a chain of all 4 registrations and the >> rdma_read (and an opportunistic good scsi response). > > Just to be clear: This example is for IB only, correct? IW would > require rkeys with REMOTE_WRITE and 4 read wrs. My assumption is that it would depend on max_sge_rd. IB only? iWARP by definition isn't capable of doing rdma_read to more than one scatter? Anyway, we'll need to calculate the number of RDMA_READs. > And you're ignoring invalidation wrs (or read-with-inv) in the example... Yes, didn't want to inflate the example too much... -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 7/26/2015 12:40 PM, Sagi Grimberg wrote: > >>> Ideally, the post contains a chain of all 4 registrations and the >>> rdma_read (and an opportunistic good scsi response). >> >> Just to be clear: This example is for IB only, correct? IW would >> require rkeys with REMOTE_WRITE and 4 read wrs. > > My assumption is that it would depend on max_sge_rd. > yea. > IB only? iWARP by definition isn't capable of doing rdma_read to > more than one scatter? Anyway, we'll need to calculate the number > of RDMA_READs. > The wire protocol limits the destination to a single stg/to/len (aka rkey/addr/len). Devices/fw/sw could implement some magic to support a single stg/to/len that maps to a scatter gather list of stags/tos/lens. >> And you're ignoring invalidation wrs (or read-with-inv) in the >> example... > > Yes, didn't want to inflate the example too much... -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 7/26/2015 5:08 AM, Sagi Grimberg wrote: > On 7/24/2015 7:18 PM, Steve Wise wrote: >> This is in preparation for adding new FRMR-only IO handlers >> for devices that support FRMR and not PI. > > Steve, > > I've given this some thought and I think we should avoid splitting > logic from PI and iWARP. The reason (other than code duplication) is > that currently the iser target support only up to 1MB IOs. I have some > code (not done yet) to support larger IOs by using multiple > registrations per IO (with or without PI). > With a little tweaking I think we can get iwarp to fit in too... > > So, do you mind if I take a crack at it? Sure, go ahead. Let me know how I can help. Certainly I can test it for you. I'm very keen to get this in for 4.3 if possible... -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: linux-rdma-owner@vger.kernel.org [mailto:linux-rdma-owner@vger.kernel.org] On Behalf Of Steve Wise > Sent: Sunday, July 26, 2015 3:17 PM > To: Sagi Grimberg; dledford@redhat.com > Cc: infinipath@intel.com; sagig@mellanox.com; ogerlitz@mellanox.com; roid@mellanox.com; linux-rdma@vger.kernel.org; > eli@mellanox.com; target-devel@vger.kernel.org; linux-nfs@vger.kernel.org; bfields@fieldses.org > Subject: Re: [PATCH V6 6/9] isert: Rename IO functions to more descriptive names > > On 7/26/2015 5:08 AM, Sagi Grimberg wrote: > > On 7/24/2015 7:18 PM, Steve Wise wrote: > >> This is in preparation for adding new FRMR-only IO handlers > >> for devices that support FRMR and not PI. > > > > Steve, > > > > I've given this some thought and I think we should avoid splitting > > logic from PI and iWARP. The reason (other than code duplication) is > > that currently the iser target support only up to 1MB IOs. I have some > > code (not done yet) to support larger IOs by using multiple > > registrations per IO (with or without PI). > > With a little tweaking I think we can get iwarp to fit in too... > > > > So, do you mind if I take a crack at it? > > Sure, go ahead. Let me know how I can help. Certainly I can test it > for you. I'm very keen to get this in for 4.3 if possible... > > Since Sagi is going to work on isert to support iwarp as part of his current isert large work, I'll drop the isert patches. I also want to split up the max_sge_rd patches to their own submission. So I will send out 2 new series for a final submission: 1) iser support for iwarp 2) use max_sge_rd and remove rdma_cap_read_multi_sge(). Steve. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Steve, > > > > I've given this some thought and I think we should avoid splitting > > logic from PI and iWARP. The reason (other than code duplication) is > > that currently the iser target support only up to 1MB IOs. I have some > > code (not done yet) to support larger IOs by using multiple > > registrations per IO (with or without PI). > > With a little tweaking I think we can get iwarp to fit in too... > > > > So, do you mind if I take a crack at it? > > Sure, go ahead. Let me know how I can help. Certainly I can test it > for you. I'm very keen to get this in for 4.3 if possible... > Hey Sagi, how is this coming along? How can I help? Steve. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> Hey Sagi, how is this coming along? How can I help? > Hi Steve, This is taking longer than I expected, the changes needed seem pretty extensive throughout the IO path. I don't think it will be ready for 4.3 I'll try to send you soon a preliminary version to play with. Acceptable? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Sagi Grimberg [mailto:sagig@dev.mellanox.co.il] > Sent: Tuesday, August 04, 2015 12:26 PM > To: Steve Wise; dledford@redhat.com > Cc: infinipath@intel.com; sagig@mellanox.com; ogerlitz@mellanox.com; roid@mellanox.com; linux-rdma@vger.kernel.org; > eli@mellanox.com; target-devel@vger.kernel.org; linux-nfs@vger.kernel.org; bfields@fieldses.org > Subject: Re: [PATCH V6 6/9] isert: Rename IO functions to more descriptive names > > > Hey Sagi, how is this coming along? How can I help? > > > > Hi Steve, > > This is taking longer than I expected, the changes needed seem > pretty extensive throughout the IO path. I don't think it will be ready > for 4.3 > Perhaps then we should go with my version that adds iwarp-only FRMR IO handlers for 4.3. Then they can be phased out as the rework matures. Thoughts? > I'll try to send you soon a preliminary version to play with. > Acceptable? I can test the iwarp parts once you think the code is ready to try. Steve. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > > > > Hey Sagi, how is this coming along? How can I help? > > > > > > > Hi Steve, > > > > This is taking longer than I expected, the changes needed seem > > pretty extensive throughout the IO path. I don't think it will be ready > > for 4.3 > > > > Perhaps then we should go with my version that adds iwarp-only FRMR IO handlers for 4.3. Then they can be phased out as the rework > matures. Thoughts? > Shall I send out the series again for merging into 4.3? Thanks, Steve. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 8/6/2015 12:23 AM, Steve Wise wrote: > >>> >>>> Hey Sagi, how is this coming along? How can I help? >>>> >>> >>> Hi Steve, >>> >>> This is taking longer than I expected, the changes needed seem >>> pretty extensive throughout the IO path. I don't think it will be ready >>> for 4.3 >>> >> >> Perhaps then we should go with my version that adds iwarp-only FRMR IO handlers for 4.3. Then they can be phased out as the rework >> matures. Thoughts? >> > > Shall I send out the series again for merging into 4.3? Hi Steve, Not sure about that.. I'm a bit reluctant in adding a code that branches the isert code even more than it already is. Nic, WDYT? -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>>> Hey Sagi, how is this coming along? How can I help? > >>>> > >>> > >>> Hi Steve, > >>> > >>> This is taking longer than I expected, the changes needed seem > >>> pretty extensive throughout the IO path. I don't think it will be ready > >>> for 4.3 > >>> > >> > >> Perhaps then we should go with my version that adds iwarp-only FRMR IO handlers for 4.3. Then they can be phased out as the rework > >> matures. Thoughts? > >> > > > > Shall I send out the series again for merging into 4.3? > > Hi Steve, > > Not sure about that.. I'm a bit reluctant in adding a code that > branches the isert code even more than it already is. > > Nic, WDYT? Nic is silent... Sagi, do you have an ETA on when you can have the recode ready for detailed review and test? If we can't make linux-4.3, can we be early in staging it for linux-4.4? Thanks, Steve. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> > Nic is silent... > > Sagi, do you have an ETA on when you can have the recode ready for detailed review and test? If we can't make linux-4.3, can we be early in staging it for linux-4.4? Hi Steve, I have something, but its not remotely close to be submission ready. This ended up being a rewrite of the registration path which is pretty convoluted at the moment. My aim is mostly simplifying it in a way that iWARP support would be (almost) straight-forward... I can send you my WIP to test. Sagi. -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c index 6af3dd4..dcd3c55 100644 --- a/drivers/infiniband/ulp/isert/ib_isert.c +++ b/drivers/infiniband/ulp/isert/ib_isert.c @@ -48,15 +48,15 @@ static struct workqueue_struct *isert_comp_wq; static struct workqueue_struct *isert_release_wq; static void -isert_unmap_cmd(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn); +isert_unmap_lkey(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn); static int -isert_map_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd, +isert_map_lkey(struct iscsi_conn *conn, struct iscsi_cmd *cmd, struct isert_rdma_wr *wr); static void -isert_unreg_rdma(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn); +isert_unreg_frmr_pi(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn); static int -isert_reg_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd, - struct isert_rdma_wr *wr); +isert_reg_frmr_pi(struct iscsi_conn *conn, struct iscsi_cmd *cmd, + struct isert_rdma_wr *wr); static int isert_put_response(struct iscsi_conn *conn, struct iscsi_cmd *cmd); static int @@ -367,12 +367,12 @@ isert_create_device_ib_res(struct isert_device *device) if (dev_attr->device_cap_flags & IB_DEVICE_MEM_MGT_EXTENSIONS && dev_attr->device_cap_flags & IB_DEVICE_SIGNATURE_HANDOVER) { device->use_fastreg = 1; - device->reg_rdma_mem = isert_reg_rdma; - device->unreg_rdma_mem = isert_unreg_rdma; + device->reg_rdma_mem = isert_reg_frmr_pi; + device->unreg_rdma_mem = isert_unreg_frmr_pi; } else { device->use_fastreg = 0; - device->reg_rdma_mem = isert_map_rdma; - device->unreg_rdma_mem = isert_unmap_cmd; + device->reg_rdma_mem = isert_map_lkey; + device->unreg_rdma_mem = isert_unmap_lkey; } ret = isert_alloc_comps(device, dev_attr); @@ -1701,7 +1701,7 @@ isert_unmap_data_buf(struct isert_conn *isert_conn, struct isert_data_buf *data) static void -isert_unmap_cmd(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn) +isert_unmap_lkey(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn) { struct isert_rdma_wr *wr = &isert_cmd->rdma_wr; @@ -1726,7 +1726,7 @@ isert_unmap_cmd(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn) } static void -isert_unreg_rdma(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn) +isert_unreg_frmr_pi(struct isert_cmd *isert_cmd, struct isert_conn *isert_conn) { struct isert_rdma_wr *wr = &isert_cmd->rdma_wr; @@ -2442,7 +2442,7 @@ isert_build_rdma_wr(struct isert_conn *isert_conn, struct isert_cmd *isert_cmd, } static int -isert_map_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd, +isert_map_lkey(struct iscsi_conn *conn, struct iscsi_cmd *cmd, struct isert_rdma_wr *wr) { struct se_cmd *se_cmd = &cmd->se_cmd; @@ -2848,8 +2848,8 @@ unmap_prot_cmd: } static int -isert_reg_rdma(struct iscsi_conn *conn, struct iscsi_cmd *cmd, - struct isert_rdma_wr *wr) +isert_reg_frmr_pi(struct iscsi_conn *conn, struct iscsi_cmd *cmd, + struct isert_rdma_wr *wr) { struct se_cmd *se_cmd = &cmd->se_cmd; struct isert_cmd *isert_cmd = iscsit_priv_cmd(cmd);
This is in preparation for adding new FRMR-only IO handlers for devices that support FRMR and not PI. Signed-off-by: Steve Wise <swise@opengridcomputing.com> --- drivers/infiniband/ulp/isert/ib_isert.c | 28 ++++++++++++++-------------- 1 files changed, 14 insertions(+), 14 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html