Message ID | 20210524085215.29005-1-mgurtovoy@nvidia.com (mailing list archive) |
---|---|
State | Accepted |
Delegated to: | Jason Gunthorpe |
Series | [1/1] IB/isert: align target max I/O size to initiator size |
> Since the Linux iser initiator default max I/O size set to 512KB and
> since there is no handshake procedure for this size in iser protocol,
> set the default max IO size of the target to 512KB as well.
>
> For changing the default values, there is a module parameter for both
> drivers.

Is this solving a bug?
On 5/25/2021 6:54 PM, Sagi Grimberg wrote:
>> Since the Linux iser initiator default max I/O size set to 512KB and
>> since there is no handshake procedure for this size in iser protocol,
>> set the default max IO size of the target to 512KB as well.
>>
>> For changing the default values, there is a module parameter for both
>> drivers.
>
> Is this solving a bug?

No. It only improves out-of-box (OOB) behavior for some old Connect-IB devices.

I think it's reasonable to align initiator and target defaults anyway.
On 5/25/21 7:22 PM, Max Gurtovoy wrote:
>
> On 5/25/2021 6:54 PM, Sagi Grimberg wrote:
>>> Since the Linux iser initiator default max I/O size set to 512KB and
>>> since there is no handshake procedure for this size in iser protocol,
>>> set the default max IO size of the target to 512KB as well.
>>>
>>> For changing the default values, there is a module parameter for both
>>> drivers.
>>
>> Is this solving a bug?
>
> No. Only OOB for some old connect-IB devices.
>
> I think it's reasonable to align initiator and target defaults anyway.
>
>

Actually, this patch is solving a bug when trying iser over Connect-IB. We see
the following failure when trying to do discovery:

Server:
[ 124.264648] infiniband mlx5_0: create_qp:2783:(pid 83): Create QP type 2 failed
[ 124.298598] isert: isert_create_qp: rdma_create_qp failed for cma_id -12
[ 124.364768] isert: isert_cma_handler: failed handle connect request -12
[ 128.271609] infiniband mlx5_0: create_qp:2783:(pid 890): Create QP type 2 failed
[ 128.311450] isert: isert_create_qp: rdma_create_qp failed for cma_id -12
[ 128.378995] isert: isert_cma_handler: failed handle connect request -12
[ 130.668362] infiniband mlx5_0: create_qp:2783:(pid 81): Create QP type 2 failed
[ 130.705869] isert: isert_create_qp: rdma_create_qp failed for cma_id -12
[ 130.777306] isert: isert_cma_handler: failed handle connect request -12
[ 132.671161] infiniband mlx5_0: create_qp:2783:(pid 86): Create QP type 2 failed
[ 132.707807] isert: isert_create_qp: rdma_create_qp failed for cma_id -12
[ 132.778867] isert: isert_cma_handler: failed handle connect request -12
[ 132.810653] infiniband mlx5_0: create_qp:2783:(pid 19): Create QP type 2 failed
[ 132.845691] isert: isert_create_qp: rdma_create_qp failed for cma_id -12
[ 132.912706] isert: isert_cma_handler: failed handle connect request -12
[ 134.681936] infiniband mlx5_0: create_qp:2783:(pid 83): Create QP type 2 failed
[ 134.718932] isert: isert_create_qp: rdma_create_qp failed for cma_id -12
[ 134.788804] isert: isert_cma_handler: failed handle connect request -12
[ 136.678428] infiniband mlx5_0: create_qp:2783:(pid 86): Create QP type 2 failed
[ 136.715859] isert: isert_create_qp: rdma_create_qp failed for cma_id -12
[ 136.785058] isert: isert_cma_handler: failed handle connect request -12
[ 136.817414] infiniband mlx5_0: create_qp:2783:(pid 727): Create QP type 2 failed
[ 136.854583] isert: isert_create_qp: rdma_create_qp failed for cma_id -12
[ 136.922975] isert: isert_cma_handler: failed handle connect request -12

Client:
$ iscsiadm -m discovery -t sendtargets -p 172.31.0.6 -I iser
iscsiadm: Connection to discovery portal 172.31.0.6 failed: encountered connection failure
iscsiadm: Connection to discovery portal 172.31.0.6 failed: encountered connection failure
iscsiadm: Connection to discovery portal 172.31.0.6 failed: encountered connection failure
iscsiadm: Connection to discovery portal 172.31.0.6 failed: encountered connection failure
iscsiadm: Connection to discovery portal 172.31.0.6 failed: encountered connection failure
iscsiadm: Connection to discovery portal 172.31.0.6 failed: encountered connection failure
iscsiadm: connection login retries (reopen_max) 5 exceeded
iscsiadm: Could not perform SendTargets discovery: iSCSI PDU timed out

Thanks,
Kamal
On 6/8/2021 1:24 PM, Kamal Heib wrote:
>
> On 5/25/21 7:22 PM, Max Gurtovoy wrote:
>> On 5/25/2021 6:54 PM, Sagi Grimberg wrote:
>>>> Since the Linux iser initiator default max I/O size set to 512KB and
>>>> since there is no handshake procedure for this size in iser protocol,
>>>> set the default max IO size of the target to 512KB as well.
>>>>
>>>> For changing the default values, there is a module parameter for both
>>>> drivers.
>>> Is this solving a bug?
>> No. Only OOB for some old connect-IB devices.
>>
>> I think it's reasonable to align initiator and target defaults anyway.
>>
>>
> Actually, this patch is solving a bug when trying iser over Connect-IB, We see
> the following failure when trying to do discovery:

You can work around this by setting the ib_isert sg_tablesize module param
to 128.

So it's more of an OOB behavior issue than a bug.

Anyway, it is good practice to be able to establish connections on old
devices without workarounds, and this also aligns the target with the
sg_tablesize on the initiator side.

Jason/Sagi,

can you comment on this patch for 5.14?
>> On 5/25/21 7:22 PM, Max Gurtovoy wrote:
>>> On 5/25/2021 6:54 PM, Sagi Grimberg wrote:
>>>>> Since the Linux iser initiator default max I/O size set to 512KB and
>>>>> since there is no handshake procedure for this size in iser protocol,
>>>>> set the default max IO size of the target to 512KB as well.
>>>>>
>>>>> For changing the default values, there is a module parameter for both
>>>>> drivers.
>>>> Is this solving a bug?
>>> No. Only OOB for some old connect-IB devices.
>>>
>>> I think it's reasonable to align initiator and target defaults anyway.
>>>
>>>
>> Actually, this patch is solving a bug when trying iser over
>> Connect-IB, We see
>> the following failure when trying to do discovery:
>
> You can work around this using the ib_isert sg_tablesize module param
> and set it to 128.
>
> So it's more OOB behavior than a bug.
>
> Anyway, This is good practice to be able to establish connections also
> for old devices without WAs and we also aligning to the sg_table size in
> the initiator side.
>
> Jason/Sagi,
>
> can you comment on this patch for 5.14 ?

Actually, if this is the case, why not have a fallback when creating the
QP? It seems more reasonable to make an exception for the old devices
rather than having them dictate the common denominator, no?
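For illustration only, the fallback idea raised above could look roughly like the following userspace sketch. Nothing here is from the thread or from ib_isert: `create_qp_fn`, `create_qp_with_fallback`, and `fake_create_qp` are hypothetical names, the halving policy is an assumption, and the only grounded details are the lower bound of 128 entries and the -12 (`-ENOMEM`) error seen in the server log.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MIN_SG_TABLESIZE 128u	/* lower bound named in the patch */

/* Hypothetical stand-in for QP creation: returns 0 or a negative errno. */
typedef int (*create_qp_fn)(unsigned int sg_tablesize, void *ctx);

/*
 * Try the requested table size first; on -ENOMEM halve it and retry,
 * never going below MIN_SG_TABLESIZE.
 * Returns the table size that worked, or a negative errno.
 */
static int create_qp_with_fallback(create_qp_fn create, void *ctx,
				   unsigned int requested)
{
	unsigned int size = requested;
	int ret;

	assert(create != NULL);
	if (size < MIN_SG_TABLESIZE)
		size = MIN_SG_TABLESIZE;

	for (;;) {
		ret = create(size, ctx);
		if (ret == 0)
			return (int)size;
		/* Give up on other errors, or when already at the minimum. */
		if (ret != -ENOMEM || size == MIN_SG_TABLESIZE)
			return ret;
		size /= 2;
		if (size < MIN_SG_TABLESIZE)
			size = MIN_SG_TABLESIZE;
	}
}

/* Simulated device: fails for any table size above the limit in ctx. */
static int fake_create_qp(unsigned int sg_tablesize, void *ctx)
{
	unsigned int limit = *(const unsigned int *)ctx;

	return sg_tablesize <= limit ? 0 : -ENOMEM;
}
```

With a simulated limit of 128, a request for 256 entries settles on 128 after one retry. The cost of this approach is an extra code path that only old devices exercise.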
On 6/9/2021 2:04 AM, Sagi Grimberg wrote:
>
>>> On 5/25/21 7:22 PM, Max Gurtovoy wrote:
>>>> On 5/25/2021 6:54 PM, Sagi Grimberg wrote:
>>>>>> Since the Linux iser initiator default max I/O size set to 512KB and
>>>>>> since there is no handshake procedure for this size in iser
>>>>>> protocol,
>>>>>> set the default max IO size of the target to 512KB as well.
>>>>>>
>>>>>> For changing the default values, there is a module parameter for
>>>>>> both
>>>>>> drivers.
>>>>> Is this solving a bug?
>>>> No. Only OOB for some old connect-IB devices.
>>>>
>>>> I think it's reasonable to align initiator and target defaults anyway.
>>>>
>>>>
>>> Actually, this patch is solving a bug when trying iser over
>>> Connect-IB, We see
>>> the following failure when trying to do discovery:
>>
>> You can work around this using the ib_isert sg_tablesize module param
>> and set it to 128.
>>
>> So it's more OOB behavior than a bug.
>>
>> Anyway, This is good practice to be able to establish connections
>> also for old devices without WAs and we also aligning to the sg_table
>> size in the initiator side.
>>
>> Jason/Sagi,
>>
>> can you comment on this patch for 5.14 ?
>
> Actually, if this is the case, why not have a fallback when creating the
> QP? Seems more reasonable to have the exception for the old devices
> rather than having those mandate the common denominator no?

We first wanted to support 16MiB for isert, but then we got a report from
Chelsio that it would dramatically reduce the total number of connections
they can support.

So we created a module param and reduced the default to 1MiB. Now we
have a similar issue with Connect-IB, so reducing it to 512KiB (the same as
the default for the Linux iser initiator) seems reasonable.

Users who would like a larger sg_table can use the module param.

I would avoid adding fallbacks for that and maintaining code that might be
dead in a year or two.
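As a rough sanity check of the sizes mentioned above, the sketch below converts an I/O size into a number of scatter/gather entries. It assumes one entry covers one 4 KiB page, which is consistent with the patch comments pairing 128 entries with 512KB and 256 entries with 1MB; `ASSUMED_PAGE_SIZE` and `sg_entries_for_io` are illustrative names, not driver symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: each scatter/gather entry maps one 4 KiB page. */
#define ASSUMED_PAGE_SIZE ((size_t)4096)

/* Number of SG entries needed to cover io_bytes, rounded up. */
static size_t sg_entries_for_io(size_t io_bytes)
{
	assert(io_bytes > 0);
	return (io_bytes + ASSUMED_PAGE_SIZE - 1) / ASSUMED_PAGE_SIZE;
}
```

Under that assumption, 16MiB corresponds to the parameter's maximum of 4096 entries, 1MiB to the old default of 256, and 512KiB to the new default of 128.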
On 6/9/21 11:45 AM, Max Gurtovoy wrote:
>
> On 6/9/2021 2:04 AM, Sagi Grimberg wrote:
>>
>>>> On 5/25/21 7:22 PM, Max Gurtovoy wrote:
>>>>> On 5/25/2021 6:54 PM, Sagi Grimberg wrote:
>>>>>>> Since the Linux iser initiator default max I/O size set to 512KB and
>>>>>>> since there is no handshake procedure for this size in iser
>>>>>>> protocol,
>>>>>>> set the default max IO size of the target to 512KB as well.
>>>>>>>
>>>>>>> For changing the default values, there is a module parameter for
>>>>>>> both
>>>>>>> drivers.
>>>>>> Is this solving a bug?
>>>>> No. Only OOB for some old connect-IB devices.
>>>>>
>>>>> I think it's reasonable to align initiator and target defaults anyway.
>>>>>
>>>>>
>>>> Actually, this patch is solving a bug when trying iser over
>>>> Connect-IB, We see
>>>> the following failure when trying to do discovery:
>>>
>>> You can work around this using the ib_isert sg_tablesize module param
>>> and set it to 128.
>>>
>>> So it's more OOB behavior than a bug.
>>>
>>> Anyway, This is good practice to be able to establish connections
>>> also for old devices without WAs and we also aligning to the sg_table
>>> size in the initiator side.
>>>
>>> Jason/Sagi,
>>>
>>> can you comment on this patch for 5.14 ?
>>
>> Actually, if this is the case, why not have a fallback when creating the
>> QP? Seems more reasonable to have the exception for the old devices
>> rather than having those mandate the common denominator no?
>
> We first wanted to support 16MiB for isert but then we get a report from
> Chelsio that it will dramatically reduce the total amount of connections
> the can support.
>
> So we created a module param and reduced the default to 1MiB. Now we
> have similar issue with Connect-IB so reducing it to 512KiB (same as the
> default for Linux iser initiator) seems reasonable.
>
> Users that would like larger sg_table will use the module param.
>
> I would avoid doing fallbacks for that and maintain a code that might be
> dead in a year or two.
>
>

Well, from the distro's point of view this code is not going to be dead any
time soon, and the current user experience is very bad. Could you guys please
decide on a way to fix this issue?

Thanks,
Kamal
On 6/20/2021 11:11 AM, Kamal Heib wrote:
>
> On 6/9/21 11:45 AM, Max Gurtovoy wrote:
>> On 6/9/2021 2:04 AM, Sagi Grimberg wrote:
>>>>> On 5/25/21 7:22 PM, Max Gurtovoy wrote:
>>>>>> On 5/25/2021 6:54 PM, Sagi Grimberg wrote:
>>>>>>>> Since the Linux iser initiator default max I/O size set to 512KB and
>>>>>>>> since there is no handshake procedure for this size in iser
>>>>>>>> protocol,
>>>>>>>> set the default max IO size of the target to 512KB as well.
>>>>>>>>
>>>>>>>> For changing the default values, there is a module parameter for
>>>>>>>> both
>>>>>>>> drivers.
>>>>>>> Is this solving a bug?
>>>>>> No. Only OOB for some old connect-IB devices.
>>>>>>
>>>>>> I think it's reasonable to align initiator and target defaults anyway.
>>>>>>
>>>>>>
>>>>> Actually, this patch is solving a bug when trying iser over
>>>>> Connect-IB, We see
>>>>> the following failure when trying to do discovery:
>>>> You can work around this using the ib_isert sg_tablesize module param
>>>> and set it to 128.
>>>>
>>>> So it's more OOB behavior than a bug.
>>>>
>>>> Anyway, This is good practice to be able to establish connections
>>>> also for old devices without WAs and we also aligning to the sg_table
>>>> size in the initiator side.
>>>>
>>>> Jason/Sagi,
>>>>
>>>> can you comment on this patch for 5.14 ?
>>> Actually, if this is the case, why not have a fallback when creating the
>>> QP? Seems more reasonable to have the exception for the old devices
>>> rather than having those mandate the common denominator no?
>> We first wanted to support 16MiB for isert but then we get a report from
>> Chelsio that it will dramatically reduce the total amount of connections
>> the can support.
>>
>> So we created a module param and reduced the default to 1MiB. Now we
>> have similar issue with Connect-IB so reducing it to 512KiB (same as the
>> default for Linux iser initiator) seems reasonable.
>>
>> Users that would like larger sg_table will use the module param.
>>
>> I would avoid doing fallbacks for that and maintain a code that might be
>> dead in a year or two.
>>
>>
> Well, from the distro's point of view this code is not going to be dead any time
> soon..., And the current user experience is very bad, Could you guys please
> decide on a way to fix this issue?

As mentioned above, I prefer the simple solution for this issue.

I guess most iSER users are using pretty old HW, so the defaults should be
set accordingly.

For NVMe/RDMA this is a different story and we can use higher defaults.

Adding fallbacks will complicate the code without a real justification for
doing it.

>
> Thanks,
> Kamal
>
>> Well, from the distro's point of view this code is not going to be
>> dead any time
>> soon..., And the current user experience is very bad, Could you guys
>> please
>> decide on a way to fix this issue?
>
> As mention above, I prefer the simple solution for this issue.
>
> I guess the most of iSER users are using pretty old HW so defaults
> should be accordingly.
>
> For NVMe/RDMA this is a different story and we can use higher defaults
>
> Adding fallbacks will complicate the code without a real justification
> for doing it.

Usually, when you end up changing the defaults multiple times, it should be
an indication that you should do something about it.

But hey, if you are killing Connect-IB anyways, and you don't see any sort
of regressions from this, I don't really have a problem with it.
On 6/22/2021 11:56 AM, Sagi Grimberg wrote:
>
>>> Well, from the distro's point of view this code is not going to be
>>> dead any time
>>> soon..., And the current user experience is very bad, Could you guys
>>> please
>>> decide on a way to fix this issue?
>>
>> As mention above, I prefer the simple solution for this issue.
>>
>> I guess the most of iSER users are using pretty old HW so defaults
>> should be accordingly.
>>
>> For NVMe/RDMA this is a different story and we can use higher defaults
>>
>> Adding fallbacks will complicate the code without a real
>> justification for doing it.
>
> Usually when you end up changing the defaults multiple times it should
> be an indication that it should do something about it.
>
> But hey, if you are killing Connect-IB anyways, and you don't see any
> sort of regressions from this I don't really have a problem with it.

I don't know why you conclude that from the above.

I just want to change the defaults to what we had in the past. This will
help OOB for old devices. We did the same for Chelsio.

And we see that the RH team is also interested in it.
>>>> Well, from the distro's point of view this code is not going to be
>>>> dead any time
>>>> soon..., And the current user experience is very bad, Could you guys
>>>> please
>>>> decide on a way to fix this issue?
>>>
>>> As mention above, I prefer the simple solution for this issue.
>>>
>>> I guess the most of iSER users are using pretty old HW so defaults
>>> should be accordingly.
>>>
>>> For NVMe/RDMA this is a different story and we can use higher defaults
>>>
>>> Adding fallbacks will complicate the code without a real
>>> justification for doing it.
>>
>> Usually when you end up changing the defaults multiple times it should
>> be an indication that it should do something about it.
>>
>> But hey, if you are killing Connect-IB anyways, and you don't see any
>> sort of regressions from this I don't really have a problem with it.
>
> I don't know why you conclude it from the above.
>
> I just want to change the defaults to what we had in the past. This will
> help OOB for old devices. We did the same for Chelsio.
>
> And we see that RH team is also interested in it.

I'm fine with this.

Acked-by: Sagi Grimberg <sagi@grimberg.me>
On Mon, May 24, 2021 at 11:52:15AM +0300, Max Gurtovoy wrote:
> Since the Linux iser initiator default max I/O size set to 512KB and
> since there is no handshake procedure for this size in iser protocol,
> set the default max IO size of the target to 512KB as well.
>
> For changing the default values, there is a module parameter for both
> drivers.
>
> Reviewed-by: Alaa Hleihel <alaa@nvidia.com>
> Reviewed-by: Israel Rukshin <israelr@nvidia.com>
> Signed-off-by: Max Gurtovoy <mgurtovoy@nvidia.com>
> Acked-by: Sagi Grimberg <sagi@grimberg.me>
> ---
>  drivers/infiniband/ulp/isert/ib_isert.c | 4 ++--
>  drivers/infiniband/ulp/isert/ib_isert.h | 3 ---
>  2 files changed, 2 insertions(+), 5 deletions(-)

Applied to for-next, thanks

Jason
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c b/drivers/infiniband/ulp/isert/ib_isert.c
index 160efef66031..97214329c571 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -35,10 +35,10 @@ static const struct kernel_param_ops sg_tablesize_ops = {
 	.get = param_get_int,
 };
 
-static int isert_sg_tablesize = ISCSI_ISER_DEF_SG_TABLESIZE;
+static int isert_sg_tablesize = ISCSI_ISER_MIN_SG_TABLESIZE;
 module_param_cb(sg_tablesize, &sg_tablesize_ops, &isert_sg_tablesize, 0644);
 MODULE_PARM_DESC(sg_tablesize,
-		 "Number of gather/scatter entries in a single scsi command, should >= 128 (default: 256, max: 4096)");
+		 "Number of gather/scatter entries in a single scsi command, should >= 128 (default: 128, max: 4096)");
 
 static DEFINE_MUTEX(device_list_mutex);
 static LIST_HEAD(device_list);
diff --git a/drivers/infiniband/ulp/isert/ib_isert.h b/drivers/infiniband/ulp/isert/ib_isert.h
index 6c5af13db4e0..ca8cfebe26ca 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.h
+++ b/drivers/infiniband/ulp/isert/ib_isert.h
@@ -65,9 +65,6 @@
  */
 #define ISER_RX_SIZE (ISCSI_DEF_MAX_RECV_SEG_LEN + 1024)
 
-/* Default I/O size is 1MB */
-#define ISCSI_ISER_DEF_SG_TABLESIZE 256
-
 /* Minimum I/O size is 512KB */
 #define ISCSI_ISER_MIN_SG_TABLESIZE 128
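
The effect of the new default can be modeled in userspace as below. This is a minimal sketch, not the driver's code: the setter behind `sg_tablesize_ops` is not shown in the diff, so `set_sg_tablesize` and its range check are hypothetical, inferred only from the parameter description ("should >= 128", "max: 4096"). The 4 KiB page size is likewise an assumption.

```c
#include <assert.h>
#include <errno.h>

#define ISCSI_ISER_MIN_SG_TABLESIZE	128	/* from the patch */
#define ASSUMED_MAX_SG_TABLESIZE	4096	/* assumption: "max: 4096" in the description */
#define ASSUMED_PAGE_KIB		4	/* assumption: 4 KiB per SG entry */

/* New default after the patch: the minimum table size. */
static int sg_tablesize = ISCSI_ISER_MIN_SG_TABLESIZE;

/* Hypothetical setter: accept only values inside the documented range. */
static int set_sg_tablesize(int val)
{
	if (val < ISCSI_ISER_MIN_SG_TABLESIZE || val > ASSUMED_MAX_SG_TABLESIZE)
		return -EINVAL;
	sg_tablesize = val;
	return 0;
}

/* Max I/O size in KiB implied by the current setting. */
static int current_max_io_kib(void)
{
	assert(sg_tablesize >= ISCSI_ISER_MIN_SG_TABLESIZE);
	return sg_tablesize * ASSUMED_PAGE_KIB;
}
```

In this model the default yields 512 KiB, and a user who wants the previous 1MB behavior would set the value back to 256.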