Message ID | 1512728048-84729-1-git-send-email-hare@suse.de (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
On Fri, 2017-12-08 at 11:14 +0100, Hannes Reinecke wrote: > @@ -541,6 +544,20 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) > retval = submit_rtpg(sdev, buff, bufflen, &sense_hdr, pg->flags); > > if (retval) { > + /* > + * If the target only supports active/optimized there's > + * not much we can do; it's not that we can switch paths > + * or somesuch. > + * So ignore any errors to avoid spurious failures during > + * path failover. > + */ > + if ((pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED) == 0) { > + sdev_printk(KERN_INFO, sdev, > + "%s: ignoring rtpg result %d\n", > + ALUA_DH_NAME, retval); > + kfree(buff); > + return SCSI_DH_OK; > + } Hello Hannes, Sorry but this change looks weird to me. If an RTPG fails, shouldn't alua_rtpg() return SCSI_DH_IO independent of what ALUA states the target supports? Are you perhaps trying to implement a work-around for an array that does not respond to RTPG during path transitions? Thanks, Bart.
On 12/12/2017 01:00 AM, Bart Van Assche wrote: > On Fri, 2017-12-08 at 11:14 +0100, Hannes Reinecke wrote: >> @@ -541,6 +544,20 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) >> retval = submit_rtpg(sdev, buff, bufflen, &sense_hdr, pg->flags); >> >> if (retval) { >> + /* >> + * If the target only supports active/optimized there's >> + * not much we can do; it's not that we can switch paths >> + * or somesuch. >> + * So ignore any errors to avoid spurious failures during >> + * path failover. >> + */ >> + if ((pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED) == 0) { >> + sdev_printk(KERN_INFO, sdev, >> + "%s: ignoring rtpg result %d\n", >> + ALUA_DH_NAME, retval); >> + kfree(buff); >> + return SCSI_DH_OK; >> + } > > Hello Hannes, > > Sorry but this change looks weird to me. If an RTPG fails, shouldn't > alua_rtpg() return SCSI_DH_IO independent of what ALUA states the target > supports? Are you perhaps trying to implement a work-around for an array > that does not respond to RTPG during path transitions? > Yes, precisely. Thing is: if an array is only supporting active/optimized the entire device-handler stuff is basically a no-op as we can't switch paths anyway. So it doesn't really matter if the RTPG fails; in fact, we could just short-circuit the entire logic. I did not do that to allow for a state modification (ie arrays _might_ decide to announce additional states eventually, and only starting off with active/optimized as an initial state set). But if we return SCSI_DH_IO here the multipath logic will not attempt to switch paths, and failover will not work. Cheers, Hannes
On Tue, 2017-12-12 at 08:01 +0100, Hannes Reinecke wrote: > On 12/12/2017 01:00 AM, Bart Van Assche wrote: > > On Fri, 2017-12-08 at 11:14 +0100, Hannes Reinecke wrote: > > > @@ -541,6 +544,20 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) > > > retval = submit_rtpg(sdev, buff, bufflen, &sense_hdr, pg->flags); > > > > > > if (retval) { > > > + /* > > > + * If the target only supports active/optimized there's > > > + * not much we can do; it's not that we can switch paths > > > + * or somesuch. > > > + * So ignore any errors to avoid spurious failures during > > > + * path failover. > > > + */ > > > + if ((pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED) == 0) { > > > + sdev_printk(KERN_INFO, sdev, > > > + "%s: ignoring rtpg result %d\n", > > > + ALUA_DH_NAME, retval); > > > + kfree(buff); > > > + return SCSI_DH_OK; > > > + } > > > > Hello Hannes, > > > > Sorry but this change looks weird to me. If an RTPG fails, shouldn't > > alua_rtpg() return SCSI_DH_IO independent of what ALUA states the target > > supports? Are you perhaps trying to implement a work-around for an array > > that does not respond to RTPG during path transitions? > > > > Yes, precisely. > > Thing is: if an array is only supporting active/optimized the entire > device-handler stuff is basically a no-op as we can't switch paths anyway. > So it doesn't really matter if the RTPG fails; in fact, we could just > short-circuit the entire logic. I did not do that to allow for a state > modification (ie arrays _might_ decide to announce additional states > eventually, and only starting off with active/optimized as an initial > state set). > > But if we return SCSI_DH_IO here the multipath logic will not attempt to > switch paths, and failover will not work. Hello Hannes, I would appreciate it if it would be mentioned more clearly in the comment above pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED that the newly added code is a workaround for non-standard array behavior. I'm afraid that otherwise people who will read that code will be puzzled about why that code has been added. Thanks, Bart.
On 12/12/2017 07:25 PM, Bart Van Assche wrote: > On Tue, 2017-12-12 at 08:01 +0100, Hannes Reinecke wrote: >> On 12/12/2017 01:00 AM, Bart Van Assche wrote: >>> On Fri, 2017-12-08 at 11:14 +0100, Hannes Reinecke wrote: >>>> @@ -541,6 +544,20 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) >>>> retval = submit_rtpg(sdev, buff, bufflen, &sense_hdr, pg->flags); >>>> >>>> if (retval) { >>>> + /* >>>> + * If the target only supports active/optimized there's >>>> + * not much we can do; it's not that we can switch paths >>>> + * or somesuch. >>>> + * So ignore any errors to avoid spurious failures during >>>> + * path failover. >>>> + */ >>>> + if ((pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED) == 0) { >>>> + sdev_printk(KERN_INFO, sdev, >>>> + "%s: ignoring rtpg result %d\n", >>>> + ALUA_DH_NAME, retval); >>>> + kfree(buff); >>>> + return SCSI_DH_OK; >>>> + } >>> >>> Hello Hannes, >>> >>> Sorry but this change looks weird to me. If an RTPG fails, shouldn't >>> alua_rtpg() return SCSI_DH_IO independent of what ALUA states the target >>> supports? Are you perhaps trying to implement a work-around for an array >>> that does not respond to RTPG during path transitions? >>> >> >> Yes, precisely. >> >> Thing is: if an array is only supporting active/optimized the entire >> device-handler stuff is basically a no-op as we can't switch paths anyway. >> So it doesn't really matter if the RTPG fails; in fact, we could just >> short-circuit the entire logic. I did not do that to allow for a state >> modification (ie arrays _might_ decide to announce additional states >> eventually, and only starting off with active/optimized as an initial >> state set). >> >> But if we return SCSI_DH_IO here the multipath logic will not attempt to >> switch paths, and failover will not work. > > Hello Hannes, > > I would appreciate it if it would be mentioned more clearly in the comment > above pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED that the newly added code > is a workaround for non-standard array behavior. I'm afraid that otherwise > people who will read that code will be puzzled about why that code has been > added. > Okay, not a problem. Will be sending out a v2. Cheers, Hannes
diff --git a/drivers/scsi/device_handler/scsi_dh_alua.c b/drivers/scsi/device_handler/scsi_dh_alua.c index fd22dc6..b09c12b 100644 --- a/drivers/scsi/device_handler/scsi_dh_alua.c +++ b/drivers/scsi/device_handler/scsi_dh_alua.c @@ -40,6 +40,7 @@ #define TPGS_SUPPORT_LBA_DEPENDENT 0x10 #define TPGS_SUPPORT_OFFLINE 0x40 #define TPGS_SUPPORT_TRANSITION 0x80 +#define TPGS_SUPPORT_ALL 0xdf #define RTPG_FMT_MASK 0x70 #define RTPG_FMT_EXT_HDR 0x10 @@ -81,6 +82,7 @@ struct alua_port_group { int tpgs; int state; int pref; + int valid_states; unsigned flags; /* used for optimizing STPG */ unsigned char transition_tmo; unsigned long expiry; @@ -243,6 +245,7 @@ static struct alua_port_group *alua_alloc_pg(struct scsi_device *sdev, pg->group_id = group_id; pg->tpgs = tpgs; pg->state = SCSI_ACCESS_STATE_OPTIMAL; + pg->valid_states = TPGS_SUPPORT_ALL; if (optimize_stpg) pg->flags |= ALUA_OPTIMIZE_STPG; kref_init(&pg->kref); @@ -516,7 +519,7 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) { struct scsi_sense_hdr sense_hdr; struct alua_port_group *tmp_pg; - int len, k, off, valid_states = 0, bufflen = ALUA_RTPG_SIZE; + int len, k, off, bufflen = ALUA_RTPG_SIZE; unsigned char *desc, *buff; unsigned err, retval; unsigned int tpg_desc_tbl_off; @@ -541,6 +544,20 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) retval = submit_rtpg(sdev, buff, bufflen, &sense_hdr, pg->flags); if (retval) { + /* + * If the target only supports active/optimized there's + * not much we can do; it's not that we can switch paths + * or somesuch. + * So ignore any errors to avoid spurious failures during + * path failover. + */ + if ((pg->valid_states & ~TPGS_SUPPORT_OPTIMIZED) == 0) { + sdev_printk(KERN_INFO, sdev, + "%s: ignoring rtpg result %d\n", + ALUA_DH_NAME, retval); + kfree(buff); + return SCSI_DH_OK; + } if (!scsi_sense_valid(&sense_hdr)) { sdev_printk(KERN_INFO, sdev, "%s: rtpg failed, result %d\n", @@ -652,7 +669,7 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) rcu_read_unlock(); } if (tmp_pg == pg) - valid_states = desc[1]; + tmp_pg->valid_states = desc[1]; spin_unlock_irqrestore(&tmp_pg->lock, flags); } kref_put(&tmp_pg->kref, release_port_group); @@ -665,13 +682,13 @@ static int alua_rtpg(struct scsi_device *sdev, struct alua_port_group *pg) "%s: port group %02x state %c %s supports %c%c%c%c%c%c%c\n", ALUA_DH_NAME, pg->group_id, print_alua_state(pg->state), pg->pref ? "preferred" : "non-preferred", - valid_states&TPGS_SUPPORT_TRANSITION?'T':'t', - valid_states&TPGS_SUPPORT_OFFLINE?'O':'o', - valid_states&TPGS_SUPPORT_LBA_DEPENDENT?'L':'l', - valid_states&TPGS_SUPPORT_UNAVAILABLE?'U':'u', - valid_states&TPGS_SUPPORT_STANDBY?'S':'s', - valid_states&TPGS_SUPPORT_NONOPTIMIZED?'N':'n', - valid_states&TPGS_SUPPORT_OPTIMIZED?'A':'a'); + pg->valid_states&TPGS_SUPPORT_TRANSITION?'T':'t', + pg->valid_states&TPGS_SUPPORT_OFFLINE?'O':'o', + pg->valid_states&TPGS_SUPPORT_LBA_DEPENDENT?'L':'l', + pg->valid_states&TPGS_SUPPORT_UNAVAILABLE?'U':'u', + pg->valid_states&TPGS_SUPPORT_STANDBY?'S':'s', + pg->valid_states&TPGS_SUPPORT_NONOPTIMIZED?'N':'n', + pg->valid_states&TPGS_SUPPORT_OPTIMIZED?'A':'a'); switch (pg->state) { case SCSI_ACCESS_STATE_TRANSITIONING: