Message ID | 20210419100014.47144-1-dwagner@suse.de (mailing list archive) |
---|---|
State | Changes Requested |
Series | [RFC] qla2xxx: Add dev_loss_tmo kernel module options |
Hi,

On 4/19/21 3:00 AM, Daniel Wagner wrote:
> Allow setting the default dev_loss_tmo value as a kernel module option.
>
> Cc: Nilesh Javali <njavali@marvell.com>
> Cc: Arun Easi <aeasi@marvell.com>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
>
>  drivers/scsi/qla2xxx/qla_attr.c | 4 ++--
>  drivers/scsi/qla2xxx/qla_gbl.h  | 1 +
>  drivers/scsi/qla2xxx/qla_nvme.c | 2 +-
>  drivers/scsi/qla2xxx/qla_os.c   | 5 +++++
>  4 files changed, 9 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
> index d74c32f84ef5..c686522ff64e 100644
> --- a/drivers/scsi/qla2xxx/qla_os.c
> +++ b/drivers/scsi/qla2xxx/qla_os.c
> @@ -338,6 +338,11 @@ static void qla2x00_free_device(scsi_qla_host_t *);
>  static int qla2xxx_map_queues(struct Scsi_Host *shost);
>  static void qla2x00_destroy_deferred_work(struct qla_hw_data *);
>
> +int ql2xdev_loss_tmo = 60;
> +module_param(ql2xdev_loss_tmo, int, 0444);
> +MODULE_PARM_DESC(ql2xdev_loss_tmo,
> +    "Time to wait for device to recover before reporting\n"
> +    "an error. Default is 60 seconds\n");

No need for the \n in the quoted string. Just change it to a space.
Hi Randy,

On Mon, Apr 19, 2021 at 09:19:13AM -0700, Randy Dunlap wrote:
> > +int ql2xdev_loss_tmo = 60;
> > +module_param(ql2xdev_loss_tmo, int, 0444);
> > +MODULE_PARM_DESC(ql2xdev_loss_tmo,
> > +    "Time to wait for device to recover before reporting\n"
> > +    "an error. Default is 60 seconds\n");
>
> No need for the \n in the quoted string. Just change it to a space.

I just followed the current style in this file. I guess this style
question is up to the maintainers to decide what they want.

Thanks,
Daniel
Hi Daniel,

> On Apr 20, 2021, at 7:37 AM, Daniel Wagner <dwagner@suse.de> wrote:
>
> Hi Randy,
>
> On Mon, Apr 19, 2021 at 09:19:13AM -0700, Randy Dunlap wrote:
>>> +int ql2xdev_loss_tmo = 60;
>>> +module_param(ql2xdev_loss_tmo, int, 0444);
>>> +MODULE_PARM_DESC(ql2xdev_loss_tmo,
>>> +    "Time to wait for device to recover before reporting\n"
>>> +    "an error. Default is 60 seconds\n");
>>
>> No need for the \n in the quoted string. Just change it to a space.
>
> I just followed the current style in this file. I guess this style
> question is up to the maintainers to decide what they want.
>
> Thanks,
> Daniel

Some of the instances in the file need the \n for readability of the
module options. In this particular instance, I would rather move the \n
so that the message is displayed on one line and the default value on a
second line, something like the following:

    "Time to wait for device to recover before reporting an error.\n"
    "Default is 60 seconds\n");

With the change mentioned above I am ok with this patch. You can add my
R-B when you send the official patch.

--
Himanshu Madhani
Oracle Linux Engineering
On 4/20/21 5:37 AM, Daniel Wagner wrote:
> Hi Randy,
>
> On Mon, Apr 19, 2021 at 09:19:13AM -0700, Randy Dunlap wrote:
>>> +int ql2xdev_loss_tmo = 60;
>>> +module_param(ql2xdev_loss_tmo, int, 0444);
>>> +MODULE_PARM_DESC(ql2xdev_loss_tmo,
>>> +    "Time to wait for device to recover before reporting\n"
>>> +    "an error. Default is 60 seconds\n");
>>
>> No need for the \n in the quoted string. Just change it to a space.
>
> I just followed the current style in this file. I guess this style
> question is up to the maintainers to decide what they want.

I see what you are saying, but the file has no consistency of style in
that regard. :) Oh well.
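[Editorial aside, not part of the thread: since the proposed parameter is
declared with mode 0444, module_param() would expose it read-only in
sysfs under the standard /sys/module/<module>/parameters/ path. A minimal
sketch of reading it back on a machine with the patched driver loaded —
the fallback branch covers systems without qla2xxx:]

```shell
# Read the proposed ql2xdev_loss_tmo parameter back from sysfs.
# module_param(..., 0444) makes it world-readable under /sys/module/.
p=/sys/module/qla2xxx/parameters/ql2xdev_loss_tmo
if [ -r "$p" ]; then
    echo "ql2xdev_loss_tmo=$(cat "$p")"
else
    echo "qla2xxx not loaded (or parameter not present)"
fi
```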
On Mon, Apr 19, 2021 at 12:00:14PM +0200, Daniel Wagner wrote:
> Allow setting the default dev_loss_tmo value as a kernel module option.
>
> Cc: Nilesh Javali <njavali@marvell.com>
> Cc: Arun Easi <aeasi@marvell.com>
> Signed-off-by: Daniel Wagner <dwagner@suse.de>
> ---
> Hi,
>
> During array upgrade tests with NVMe/FC on systems equipped with
> QLogic HBAs we faced a problem with the default setting of
> dev_loss_tmo.
>
> When the default timeout hit after 60 seconds the file system went
> into read-only mode. The fix was to set dev_loss_tmo to infinity
> (note this patch can't handle that).
>
> For lpfc devices we could use the sysfs interface under
> fc_remote_ports, which exposes the dev_loss_tmo for SCSI and NVMe
> rports.
>
> The QLogic driver only exposes the rports via fc_remote_ports if SCSI
> is used. There is the debugfs interface to set the dev_loss_tmo, but
> this has two issues. First, it's not watched by udevd, hence no rules
> work. This could be somehow worked around by setting it statically,
> but that is really only an option for testing. Even if the debugfs
> interface is used, there is a bug in the code: in
> qla_nvme_register_remote() the value 0 is assigned to dev_loss_tmo
> and the NVMe core will use its default value of 60 (this code path is
> exercised if the rport drops twice).
>
> Anyway, this patch is just to get the discussion going. Maybe the
> driver could implement the fc_remote_port interface? Hannes was
> pointing out it might make sense to think about a controller sysfs
> API, as there is already a host and the NVMe protocol is all about
> host and controller.
Wouldn't that be really really confusing, if you set essentially the
same thing with two different knobs for one FC HBA? We already have a
`dev_loss_tmo` kernel parameter - granted, only for scsi_transport_fc;
but doesn't qla implement that as well?
I don't really have any horses in this race here, but that sounds strange.
On Mon, Apr 19, 2021 at 12:00:14PM +0200, Daniel Wagner wrote:
>
> The QLogic driver only exposes the rports via fc_remote_ports if SCSI
> is used. There is the debugfs interface to set the dev_loss_tmo, but
> this has two issues. First, it's not watched by udevd, hence no rules
> work. This could be somehow worked around by setting it statically,
> but that is really only an option for testing. Even if the debugfs
> interface is used, there is a bug in the code: in
> qla_nvme_register_remote() the value 0 is assigned to dev_loss_tmo
> and the NVMe core will use its default value of 60 (this code path is
> exercised if the rport drops twice).
>
> Anyway, this patch is just to get the discussion going. Maybe the
> driver could implement the fc_remote_port interface? Hannes was
> pointing out it might make sense to think about a controller sysfs
> API, as there is already a host and the NVMe protocol is all about
> host and controller.
>

Hi Daniel,

I agree that a proper fc_remote_ports interface is needed for NVMe in
qla2xxx. There was a similar concern from me when the dev_loss_tmo
setting in debugfs was added to the driver. Arun agreed the debugfs
approach is a crutch, but the discussion never took off beyond that:

https://patchwork.kernel.org/project/linux-scsi/patch/20200818123203.20361-4-njavali@marvell.com/#23553423

+ James S.

Daniel, WRT your patch, I don't think we should add one more approach
to set dev_loss_tmo via a kernel module parameter, as NVMe adopters are
going to be even more confused about the parameter. Just imagine
knowledge bases populated with all sorts of workarounds that apply to
kernel version x, y, z, etc. :)

What exists for FCP/SCSI is quite clear and reasonable. I don't know
why FC-NVMe rports should be way too different.
Thanks,
Roman
Hi Roman,

On Tue, Apr 20, 2021 at 08:35:10PM +0300, Roman Bolshakov wrote:
> + James S.
>
> Daniel, WRT your patch, I don't think we should add one more approach
> to set dev_loss_tmo via a kernel module parameter, as NVMe adopters
> are going to be even more confused about the parameter. Just imagine
> knowledge bases populated with all sorts of workarounds that apply to
> kernel version x, y, z, etc. :)

Totally agree. I consider this patch just a hack and a way to get the
discussion going, hence the RFC :) Well, maybe we are going to add it
downstream in our kernels until we have a better way of setting
dev_loss_tmo.

As explained, the debugfs interface is not working (okay, that's
something which could be fixed) and it has the big problem that it is
not under the control of udevd. Not sure whether, with some new udev
rules, the debugfs interface could be discovered automatically or not.

> What exists for FCP/SCSI is quite clear and reasonable. I don't know
> why FC-NVMe rports should be way too different.

The lpfc driver does expose the FCP/SCSI and the FC-NVMe rports nicely
via fc_remote_ports and this is what I would like to have from the
qla2xxx driver as well. qla2xxx exposes the FCP/SCSI rports but not the
FC-NVMe rports.

Thanks,
Daniel
Hi Daniel,

On Tue, 20 Apr 2021, 11:28am, Daniel Wagner wrote:
> Hi Roman,
>
> On Tue, Apr 20, 2021 at 08:35:10PM +0300, Roman Bolshakov wrote:
> > + James S.
> >
> > Daniel, WRT your patch, I don't think we should add one more
> > approach to set dev_loss_tmo via a kernel module parameter, as NVMe
> > adopters are going to be even more confused about the parameter.
> > Just imagine knowledge bases populated with all sorts of
> > workarounds that apply to kernel version x, y, z, etc. :)
>
> Totally agree. I consider this patch just a hack and a way to get the
> discussion going, hence the RFC :) Well, maybe we are going to add it
> downstream in our kernels until we have a better way of setting
> dev_loss_tmo.
>
> As explained, the debugfs interface is not working (okay, that's
> something which could be fixed) and it has the big problem that it is
> not under the control of udevd. Not sure whether, with some new udev
> rules, the debugfs interface could be discovered automatically or
> not.

Curious, which udev script does this today for FC SCSI?

In theory, the existing fc nvmediscovery udev event has enough
information to find out the right qla2xxx debugfs node and set
dev_loss_tmo.

> > What exists for FCP/SCSI is quite clear and reasonable. I don't
> > know why FC-NVMe rports should be way too different.
>
> The lpfc driver does expose the FCP/SCSI and the FC-NVMe rports
> nicely via fc_remote_ports and this is what I would like to have from
> the qla2xxx driver as well. qla2xxx exposes the FCP/SCSI rports but
> not the FC-NVMe rports.

Given that FC NVME does not have a sysfs hierarchy like FC SCSI, I see
utility in making FC-NVME ports available via fc_remote_ports. If,
though, an FC target port is dual protocol aware, this would leave us
with only one knob to control both.

I think going with fc_remote_ports is better than introducing one more
way (like this patch) to set this.

Regards,
-Arun
Hi Arun,

On Tue, Apr 20, 2021 at 05:25:52PM -0700, Arun Easi wrote:
> On Tue, 20 Apr 2021, 11:28am, Daniel Wagner wrote:
> > As explained, the debugfs interface is not working (okay, that's
> > something which could be fixed) and it has the big problem that it
> > is not under the control of udevd. Not sure whether, with some new
> > udev rules, the debugfs interface could be discovered automatically
> > or not.
>
> Curious, which udev script does this today for FC SCSI?

I am currently figuring out the 'correct' settings for passing the
various tests our partner does in their labs. That is, there are no
upstream udev rules for this (yet). Anyway, the settings are

  ACTION!="add|change", GOTO="tmo_end"

  # SCSI fc_remote_ports
  KERNELS=="rport-?*", SUBSYSTEM=="fc_remote_ports", ATTR{fast_io_fail_tmo}="5", ATTR{dev_loss_tmo}="4294967295"

  # nvme fabrics
  KERNELS=="ctl", SUBSYSTEMS=="nvme-fabrics", ATTR{transport}=="fc", ATTR{fast_io_fail_tmo}="-1", ATTR{ctrl_loss_tmo}="-1"

  LABEL="tmo_end"

and this works for lpfc but only half for qla2xxx.

> In theory, the existing fc nvmediscovery udev event has enough
> information to find out the right qla2xxx debugfs node and set
> dev_loss_tmo.

Ah, I didn't know about nvmediscovery until very recently. I'll try to
get it working with this approach (as this patch is not really a proper
solution).

> > > What exists for FCP/SCSI is quite clear and reasonable. I don't
> > > know why FC-NVMe rports should be way too different.
> >
> > The lpfc driver does expose the FCP/SCSI and the FC-NVMe rports
> > nicely via fc_remote_ports and this is what I would like to have
> > from the qla2xxx driver as well. qla2xxx exposes the FCP/SCSI
> > rports but not the FC-NVMe rports.
>
> Given that FC NVME does not have a sysfs hierarchy like FC SCSI, I
> see utility in making FC-NVME ports available via fc_remote_ports.
> If, though, an FC target port is dual protocol aware, this would
> leave us with only one knob to control both.
So far I haven't had the need to distinguish between the two protocols
for the dev_loss_tmo setting. I think this is what Hannes was also
getting at: it might make sense to introduce sysfs APIs per protocol.

> I think going with fc_remote_ports is better than introducing one
> more way (like this patch) to set this.

As I said, this patch was really an RFC. I will experiment with
nvmediscovery, though I think this is just a stopgap solution. Having
two completely different ways to configure the same thing is
suboptimal and is asking for a lot of trouble with end customers. I am
really hoping we can streamline the current APIs, so we have only one
recommended way to configure the system, independent of the driver
involved.

Thanks,
Daniel
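[Editorial aside, not part of the thread: a hedged way to check whether
udev rules like the fc_remote_ports settings quoted earlier actually
landed on the SCSI side is to read the timeouts back from sysfs;
/sys/class/fc_remote_ports is the standard scsi_transport_fc location,
and on a machine without FC rports the loop simply finds nothing:]

```shell
# Read back dev_loss_tmo / fast_io_fail_tmo for every SCSI FC remote
# port. On systems without FC rports this prints only the summary line.
count=0
for d in /sys/class/fc_remote_ports/rport-*; do
    [ -d "$d" ] || continue
    printf '%s: dev_loss_tmo=%s fast_io_fail_tmo=%s\n' \
        "${d##*/}" "$(cat "$d/dev_loss_tmo")" "$(cat "$d/fast_io_fail_tmo")"
    count=$((count + 1))
done
echo "checked $count rports"
```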
On Wed, Apr 21, 2021 at 09:56:59AM +0200, Daniel Wagner wrote:
> Ah, I didn't know about nvmediscovery until very recently. I'll try
> to get it working with this approach (as this patch is not really a
> proper solution).

Finally found some time to play with this. FTR, the nvmediscovery
event carries the following information:

  UDEV [65238.364677] change /devices/virtual/fc/fc_udev_device (fc)
  ACTION=change
  DEVPATH=/devices/virtual/fc/fc_udev_device
  FC_EVENT=nvmediscovery
  NVMEFC_HOST_TRADDR=nn-0x20000024ff7fa448:pn-0x21000024ff7fa448
  NVMEFC_TRADDR=nn-0x200200a09890f5bf:pn-0x203800a09890f5bf
  SEQNUM=12357
  SUBSYSTEM=fc
  USEC_INITIALIZED=65238333374

The udev rule I came up with is:

  ACTION=="change", SUBSYSTEM=="fc", ENV{FC_EVENT}=="nvmediscovery", \
  ENV{NVMEFC_TRADDR}=="*", \
  RUN+="/usr/local/sbin/qla2xxx_dev_loss_tmo.sh $env{NVMEFC_TRADDR} 4294967295"

and the script is:

  #!/bin/sh

  TRADDR=$1
  TMO=$2

  id=$(echo $TRADDR | sed -n "s/.*pn-0x\([0-9a-f]\+\)/\1/p")
  find /sys/kernel/debug/qla2xxx -name pn-$id -exec /bin/sh -c "echo $TMO > {}/dev_loss_tmo" \;

I am sure this can be done in a more elegant way. Anyway, I am testing
this right now; the first 30 minutes look good...
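[Editorial aside, not part of the thread: the sed step in the script
above only isolates the port-name hex digits from the NVMEFC_TRADDR
string. A minimal sketch of that extraction, using the address from the
udev monitor output above; the sed variant relies on GNU sed's \+
extension, the parameter-expansion variant needs no external command:]

```shell
# Extract the WWPN hex digits from an NVMEFC_TRADDR value such as
# "nn-0x200200a09890f5bf:pn-0x203800a09890f5bf".
traddr="nn-0x200200a09890f5bf:pn-0x203800a09890f5bf"

# sed variant, as in the script above (GNU sed \+):
id=$(echo "$traddr" | sed -n "s/.*pn-0x\([0-9a-f]\+\)/\1/p")

# pure-shell variant: strip the longest prefix ending in "pn-0x":
id2="${traddr##*pn-0x}"

echo "$id"
echo "$id2"
```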
On Tue, 27 Apr 2021, 2:51am, Daniel Wagner wrote:
>
> The udev rule I came up with is:
>
>   ACTION=="change", SUBSYSTEM=="fc", ENV{FC_EVENT}=="nvmediscovery", \
>   ENV{NVMEFC_TRADDR}=="*", \
>   RUN+="/usr/local/sbin/qla2xxx_dev_loss_tmo.sh $env{NVMEFC_TRADDR} 4294967295"
>
> and the script is:
>
>   #!/bin/sh
>
>   TRADDR=$1
>   TMO=$2
>
>   id=$(echo $TRADDR | sed -n "s/.*pn-0x\([0-9a-f]\+\)/\1/p")
>   find /sys/kernel/debug/qla2xxx -name pn-$id -exec /bin/sh -c "echo $TMO > {}/dev_loss_tmo" \;
>
> I am sure this can be done in a more elegant way. Anyway, I am
> testing this right now; the first 30 minutes look good...

Looks ok to me. Just keep in mind that with this you'd be setting all
instances of pn-XXX (multiple initiator ports seeing the same target
scenario). It looks like this is what you want, but I thought I'd point
that out.

Regards,
-Arun
Hi Arun,

On Tue, Apr 27, 2021 at 03:35:47PM -0700, Arun Easi wrote:
> > I am sure this can be done in a more elegant way. Anyway, I am
> > testing this right now; the first 30 minutes look good...
>
> Looks ok to me. Just keep in mind that with this you'd be setting all
> instances of pn-XXX (multiple initiator ports seeing the same target
> scenario). It looks like this is what you want, but I thought I'd
> point that out.

Good point. Yes, that was the plan: set all ports to the same value.

BTW, an 8 hour port toggle test passed. With this setup we don't need
to add dirty patches downstream.

Thanks,
Daniel
On 4/20/2021 5:25 PM, Arun Easi wrote:
>
> Given that FC NVME does not have a sysfs hierarchy like FC SCSI, I
> see utility in making FC-NVME ports available via fc_remote_ports.
> If, though, an FC target port is dual protocol aware, this would
> leave us with only one knob to control both.
>
> I think going with fc_remote_ports is better than introducing one
> more way (like this patch) to set this.
>
> Regards,
> -Arun

Thanks Arun,

I think we're all better off if qla exports the NVMe nodes via the
SCSI-side fc_remote_ports. In the end we will commonize an FC transport
that then sits above SCSI and NVMe and will definitely be compatible
with what's there.

The registration with SCSI was rather straightforward for lpfc, so I
assume it will be for qla as well, and the devloss interface, although
kludgy in that the driver has to propagate the SCSI callback to NVMe,
also isn't much.

I also don't think we want to keep creating new management points.
It's already ugly enough.

-- james
diff --git a/drivers/scsi/qla2xxx/qla_attr.c b/drivers/scsi/qla2xxx/qla_attr.c
index 3aa9869f6fae..0d2386ba65c0 100644
--- a/drivers/scsi/qla2xxx/qla_attr.c
+++ b/drivers/scsi/qla2xxx/qla_attr.c
@@ -3036,7 +3036,7 @@ qla24xx_vport_create(struct fc_vport *fc_vport, bool disable)
 	}
 
 	/* initialize attributes */
-	fc_host_dev_loss_tmo(vha->host) = ha->port_down_retry_count;
+	fc_host_dev_loss_tmo(vha->host) = ql2xdev_loss_tmo;
 	fc_host_node_name(vha->host) = wwn_to_u64(vha->node_name);
 	fc_host_port_name(vha->host) = wwn_to_u64(vha->port_name);
 	fc_host_supported_classes(vha->host) =
@@ -3260,7 +3260,7 @@ qla2x00_init_host_attr(scsi_qla_host_t *vha)
 	struct qla_hw_data *ha = vha->hw;
 	u32 speeds = FC_PORTSPEED_UNKNOWN;
 
-	fc_host_dev_loss_tmo(vha->host) = ha->port_down_retry_count;
+	fc_host_dev_loss_tmo(vha->host) = ql2xdev_loss_tmo;
 	fc_host_node_name(vha->host) = wwn_to_u64(vha->node_name);
 	fc_host_port_name(vha->host) = wwn_to_u64(vha->port_name);
 	fc_host_supported_classes(vha->host) = ha->base_qpair->enable_class_2 ?
diff --git a/drivers/scsi/qla2xxx/qla_gbl.h b/drivers/scsi/qla2xxx/qla_gbl.h
index fae5cae6f0a8..0b9c24475711 100644
--- a/drivers/scsi/qla2xxx/qla_gbl.h
+++ b/drivers/scsi/qla2xxx/qla_gbl.h
@@ -178,6 +178,7 @@ extern int ql2xdifbundlinginternalbuffers;
 extern int ql2xfulldump_on_mpifail;
 extern int ql2xenforce_iocb_limit;
 extern int ql2xabts_wait_nvme;
+extern int ql2xdev_loss_tmo;
 
 extern int qla2x00_loop_reset(scsi_qla_host_t *);
 extern void qla2x00_abort_all_cmds(scsi_qla_host_t *, int);
diff --git a/drivers/scsi/qla2xxx/qla_nvme.c b/drivers/scsi/qla2xxx/qla_nvme.c
index 0cacb667a88b..cdc5b5075407 100644
--- a/drivers/scsi/qla2xxx/qla_nvme.c
+++ b/drivers/scsi/qla2xxx/qla_nvme.c
@@ -41,7 +41,7 @@ int qla_nvme_register_remote(struct scsi_qla_host *vha, struct fc_port *fcport)
 	req.port_name = wwn_to_u64(fcport->port_name);
 	req.node_name = wwn_to_u64(fcport->node_name);
 	req.port_role = 0;
-	req.dev_loss_tmo = 0;
+	req.dev_loss_tmo = ql2xdev_loss_tmo;
 
 	if (fcport->nvme_prli_service_param & NVME_PRLI_SP_INITIATOR)
 		req.port_role = FC_PORT_ROLE_NVME_INITIATOR;
diff --git a/drivers/scsi/qla2xxx/qla_os.c b/drivers/scsi/qla2xxx/qla_os.c
index d74c32f84ef5..c686522ff64e 100644
--- a/drivers/scsi/qla2xxx/qla_os.c
+++ b/drivers/scsi/qla2xxx/qla_os.c
@@ -338,6 +338,11 @@ static void qla2x00_free_device(scsi_qla_host_t *);
 static int qla2xxx_map_queues(struct Scsi_Host *shost);
 static void qla2x00_destroy_deferred_work(struct qla_hw_data *);
 
+int ql2xdev_loss_tmo = 60;
+module_param(ql2xdev_loss_tmo, int, 0444);
+MODULE_PARM_DESC(ql2xdev_loss_tmo,
+	"Time to wait for device to recover before reporting\n"
+	"an error. Default is 60 seconds\n");
 
 static struct scsi_transport_template *qla2xxx_transport_template = NULL;
 struct scsi_transport_template *qla2xxx_transport_vport_template = NULL;
Allow setting the default dev_loss_tmo value as a kernel module option.

Cc: Nilesh Javali <njavali@marvell.com>
Cc: Arun Easi <aeasi@marvell.com>
Signed-off-by: Daniel Wagner <dwagner@suse.de>
---
Hi,

During array upgrade tests with NVMe/FC on systems equipped with QLogic
HBAs we faced a problem with the default setting of dev_loss_tmo.

When the default timeout hit after 60 seconds the file system went into
read-only mode. The fix was to set dev_loss_tmo to infinity (note this
patch can't handle that).

For lpfc devices we could use the sysfs interface under
fc_remote_ports, which exposes the dev_loss_tmo for SCSI and NVMe
rports.

The QLogic driver only exposes the rports via fc_remote_ports if SCSI
is used. There is the debugfs interface to set the dev_loss_tmo, but
this has two issues. First, it's not watched by udevd, hence no rules
work. This could be somehow worked around by setting it statically, but
that is really only an option for testing. Even if the debugfs
interface is used, there is a bug in the code: in
qla_nvme_register_remote() the value 0 is assigned to dev_loss_tmo and
the NVMe core will use its default value of 60 (this code path is
exercised if the rport drops twice).

Anyway, this patch is just to get the discussion going. Maybe the
driver could implement the fc_remote_port interface? Hannes was
pointing out it might make sense to think about a controller sysfs API,
as there is already a host and the NVMe protocol is all about host and
controller.

Thanks,
Daniel

 drivers/scsi/qla2xxx/qla_attr.c | 4 ++--
 drivers/scsi/qla2xxx/qla_gbl.h  | 1 +
 drivers/scsi/qla2xxx/qla_nvme.c | 2 +-
 drivers/scsi/qla2xxx/qla_os.c   | 5 +++++
 4 files changed, 9 insertions(+), 3 deletions(-)