Message ID | 20220503200704.88003-4-hare@suse.de (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | scsi: EH rework prep patches, part 1 | expand |
On 5/3/22 22:06, Hannes Reinecke wrote: > When issuing a host reset we should be waiting for all > ports to become unblocked; just waiting for one might > be resulting in host reset to return too early. > > Signed-off-by: Hannes Reinecke <hare@suse.de> > Reviewed-by: Christoph Hellwig <hch@lst.de> > Cc: Steffen Maier <maier@linux.ibm.com> > Cc: Benjamin Block <bblock@linux.ibm.com> > --- > drivers/s390/scsi/zfcp_scsi.c | 21 +++++++++++++++------ > 1 file changed, 15 insertions(+), 6 deletions(-) > > diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c > index 526ac240d9fe..2e615e1dcde6 100644 > --- a/drivers/s390/scsi/zfcp_scsi.c > +++ b/drivers/s390/scsi/zfcp_scsi.c > @@ -373,9 +373,11 @@ static int zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt) > > static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt) > { > - struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device); > - struct zfcp_adapter *adapter = zfcp_sdev->port->adapter; > - int ret = SUCCESS, fc_ret; > + struct Scsi_Host *host = scpnt->device->host; > + struct zfcp_adapter *adapter = (struct zfcp_adapter *)host->hostdata[0]; > + int ret = SUCCESS; > + unsigned long flags; > + struct zfcp_port *port; > > if (!(adapter->connection_features & FSF_FEATURE_NPIV_MODE)) { > zfcp_erp_port_forced_reopen_all(adapter, 0, "schrh_p"); > @@ -383,9 +385,16 @@ static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt) > } > zfcp_erp_adapter_reopen(adapter, 0, "schrh_1"); > zfcp_erp_wait(adapter); Now internal zfcp recovery for the adapter completed, but there are async pending follow-up things, such as rport(s) unblock, see below. > - fc_ret = fc_block_scsi_eh(scpnt); > - if (fc_ret) > - ret = fc_ret; > + > + spin_lock_irqsave(&adapter->port_list_lock, flags); zfcp_adapter.port_list_lock is of type rwlock_t: + read_lock_irqsave(&adapter->port_list_lock, flags); > + list_for_each_entry(port, &adapter->port_list, list) { > + struct fc_rport *rport = port->rport; While an rport is blocked, port->rport == NULL [zfcp_scsi_rport_block() and zfcp_scsi_rport_register()], so this could be a NULL pointer deref. Typically all rports would still be blocked after above adapter recovery, until they become unblocked (via zfcp's async rport_work) [zfcp_erp_try_rport_unblock() near the end of lun and port recovery that happen as follow-up parts of adapter recovery and before zfcp_erp_wait() returns; it eventually calls zfcp_scsi_schedule_rport_register() queueing work item port->rport_work on which we could do flush_work() [one prior art in zfcp_init_device_configure()]]. Am starting to wonder if we would really need to sync with the async rports unblocks (for this adapter) after zfcp_erp_wait() above and before any deref port->rport. Either within this loop or with a separate loop before this one. ( Another option would be to somehow iterate rports differently via common code lists so we directly get rport references. ) > + > + ret = fc_block_rport(rport); Lock order looks good, we hold port_list_lock here and fc_block_rport() takes Scsi_Host host_lock, so it matches prior art: static void zfcp_erp_action_dismiss_adapter(struct zfcp_adapter *adapter) read_lock(&adapter->port_list_lock); zfcp_erp_action_dismiss_port(port); spin_lock(port->adapter->scsi_host->host_lock); > + if (ret) > + break; So once we find the first rport with (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT) we would break the loop and return from zfcp_scsi_eh_host_reset_handler() with FAST_IO_FAIL. Is that correct? What if there are still other blocked rports if we were to continue with the loop? Or am I missing something regarding the goal of "waiting for all rports to become unblocked"? Once we completed the loop, the question arises which consolidated return code would be the correct one, if we got different ret for different rports in the loop. Does my previous brain dump make sense?: I was pondering in my own patch version what to return of just a subset of ports ran into fast_io_fail. And for now I thought just fast_io_fail would be good to let I/O bubble up for path failover, even if this would include of rport which meanwhile unblocked properly and would not need bubbling up pending requests because they could service them with a simple retry. > + } > + spin_unlock_irqrestore(&adapter->port_list_lock, flags); + read_unlock_irqrestore(&adapter->port_list_lock, flags); > > zfcp_dbf_scsi_eh("schrh_r", adapter, ~0, ret); > return ret;
On Tue, May 03, 2022 at 10:06:43PM +0200, Hannes Reinecke wrote: > @@ -373,9 +373,11 @@ static int zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt) > > static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt) > { > - struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device); > - struct zfcp_adapter *adapter = zfcp_sdev->port->adapter; > - int ret = SUCCESS, fc_ret; > + struct Scsi_Host *host = scpnt->device->host; > + struct zfcp_adapter *adapter = (struct zfcp_adapter *)host->hostdata[0]; > + int ret = SUCCESS; > + unsigned long flags; > + struct zfcp_port *port; > > if (!(adapter->connection_features & FSF_FEATURE_NPIV_MODE)) { > zfcp_erp_port_forced_reopen_all(adapter, 0, "schrh_p"); > @@ -383,9 +385,16 @@ static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt) > } > zfcp_erp_adapter_reopen(adapter, 0, "schrh_1"); > zfcp_erp_wait(adapter); > - fc_ret = fc_block_scsi_eh(scpnt); > - if (fc_ret) > - ret = fc_ret; > + > + spin_lock_irqsave(&adapter->port_list_lock, flags); That doesn't compile: || In file included from ./include/linux/kref.h:16, || drivers/s390/scsi/zfcp_scsi.c: In function ‘zfcp_scsi_eh_host_reset_handler’: drivers/s390/scsi/zfcp_scsi.c|389 col 27| error: passing argument 1 of ‘spinlock_check’ from incompatible pointer type [-Werror=incompatible-pointer-types] || 389 | spin_lock_irqsave(&adapter->port_list_lock, flags); || | ^~~~~~~~~~~~~~~~~~~~~~~~ || | | || | rwlock_t * You probably want `read_lock_irqsave()`. The locking order looks fine, we already have places where we take the `port_list_lock`, and then the `host_lock` (ffr: `zfcp_erp_action_dismiss_adapter()`). > + list_for_each_entry(port, &adapter->port_list, list) { > + struct fc_rport *rport = port->rport; Like Steffen said in the other review, this may be NULL here, since `scpnt` can be from a different context, and there is no guarantees this is always assigned, if you iterate over all port of an adapter. So you want: if (!rport) continue; Or some such, with a similar effect. > + > + ret = fc_block_rport(rport); > + if (ret) > + break; > + } > + spin_unlock_irqrestore(&adapter->port_list_lock, flags); Otherwise I like this better than having the code open-coded here, also don't see any other obvious things missing than the two above. > > zfcp_dbf_scsi_eh("schrh_r", adapter, ~0, ret); > return ret;
On Wed, May 04, 2022 at 11:01:38AM +0000, Benjamin Block wrote: > On Tue, May 03, 2022 at 10:06:43PM +0200, Hannes Reinecke wrote: > > @@ -373,9 +373,11 @@ static int zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt) ... Ah sorry, Steffen was already looking at this. You can ignore my comments, he covered them all already
On Wed, May 04, 2022 at 11:01:38AM +0000, Benjamin Block wrote: > On Tue, May 03, 2022 at 10:06:43PM +0200, Hannes Reinecke wrote: > > + list_for_each_entry(port, &adapter->port_list, list) { > > + struct fc_rport *rport = port->rport; > > Like Steffen said in the other review, this may be NULL here, since > `scpnt` can be from a different context, and there is no guarantees this > is always assigned, if you iterate over all port of an adapter. > > So you want: > > if (!rport) > continue; That is probably wrong; see Steffen's comments w.r.t. waiting for them to appear, instead of skipping them.
On 5/4/22 12:46, Steffen Maier wrote: > On 5/3/22 22:06, Hannes Reinecke wrote: >> When issuing a host reset we should be waiting for all >> ports to become unblocked; just waiting for one might >> be resulting in host reset to return too early. >> >> Signed-off-by: Hannes Reinecke <hare@suse.de> >> Reviewed-by: Christoph Hellwig <hch@lst.de> >> Cc: Steffen Maier <maier@linux.ibm.com> >> Cc: Benjamin Block <bblock@linux.ibm.com> >> --- >> drivers/s390/scsi/zfcp_scsi.c | 21 +++++++++++++++------ >> 1 file changed, 15 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c >> index 526ac240d9fe..2e615e1dcde6 100644 >> --- a/drivers/s390/scsi/zfcp_scsi.c >> +++ b/drivers/s390/scsi/zfcp_scsi.c >> @@ -373,9 +373,11 @@ static int zfcp_scsi_eh_target_reset_handler(struct >> scsi_cmnd *scpnt) >> static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt) >> { >> - struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device); >> - struct zfcp_adapter *adapter = zfcp_sdev->port->adapter; >> - int ret = SUCCESS, fc_ret; >> + struct Scsi_Host *host = scpnt->device->host; >> + struct zfcp_adapter *adapter = (struct zfcp_adapter *)host->hostdata[0]; >> + int ret = SUCCESS; >> + unsigned long flags; >> + struct zfcp_port *port; >> if (!(adapter->connection_features & FSF_FEATURE_NPIV_MODE)) { >> zfcp_erp_port_forced_reopen_all(adapter, 0, "schrh_p"); >> @@ -383,9 +385,16 @@ static int zfcp_scsi_eh_host_reset_handler(struct >> scsi_cmnd *scpnt) >> } >> zfcp_erp_adapter_reopen(adapter, 0, "schrh_1"); >> zfcp_erp_wait(adapter); > > Now internal zfcp recovery for the adapter completed, but there are async > pending follow-up things, such as rport(s) unblock, see below. > >> - fc_ret = fc_block_scsi_eh(scpnt); >> - if (fc_ret) >> - ret = fc_ret; >> + >> + spin_lock_irqsave(&adapter->port_list_lock, flags); > > zfcp_adapter.port_list_lock is of type rwlock_t: > > + read_lock_irqsave(&adapter->port_list_lock, flags); > >> + list_for_each_entry(port, &adapter->port_list, list) { >> + struct fc_rport *rport = port->rport; > > While an rport is blocked, port->rport == NULL [zfcp_scsi_rport_block() and > zfcp_scsi_rport_register()], so this could be a NULL pointer deref. > > Typically all rports would still be blocked after above adapter recovery, until > they become unblocked (via zfcp's async rport_work) > [zfcp_erp_try_rport_unblock() near the end of lun and port recovery that happen > as follow-up parts of adapter recovery and before zfcp_erp_wait() returns; it > eventually calls zfcp_scsi_schedule_rport_register() queueing work item > port->rport_work on which we could do flush_work() [one prior art in > zfcp_init_device_configure()]]. > > Am starting to wonder if we would really need to sync with the async rports > unblocks (for this adapter) after zfcp_erp_wait() above and before any deref > port->rport. Either within this loop or with a separate loop before this one. > ( Another option would be to somehow iterate rports differently via common code > lists so we directly get rport references. ) > >> + >> + ret = fc_block_rport(rport); I think I just noticed that that callee includes an msleep but we hold a read lock on port_list_lock. That does not seem right [scheduling while atomic?!]. Maybe we would need an open coded version after all to be able to drop both locks in correct order (and re-acquire) in the open-coded copy of fc_block_rport()? > > Lock order looks good, we hold port_list_lock here and fc_block_rport() takes > Scsi_Host host_lock, so it matches prior art: > static void zfcp_erp_action_dismiss_adapter(struct zfcp_adapter *adapter) > read_lock(&adapter->port_list_lock); > zfcp_erp_action_dismiss_port(port); > spin_lock(port->adapter->scsi_host->host_lock); > >> + if (ret) >> + break; > > So once we find the first rport with > (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT) > we would break the loop and return from zfcp_scsi_eh_host_reset_handler() with > FAST_IO_FAIL. Is that correct? What if there are still other blocked rports if > we were to continue with the loop? Or am I missing something regarding the goal > of "waiting for all rports to become unblocked"? > > Once we completed the loop, the question arises which consolidated return code > would be the correct one, if we got different ret for different rports in the > loop. Does my previous brain dump make sense?: > I was pondering in my own patch version what to return of just a subset > of ports ran into fast_io_fail. And for now I thought just fast_io_fail > would be good to let I/O bubble up for path failover, even if this would > include of rport which meanwhile unblocked properly and would not need > bubbling up pending requests because they could service them with a > simple retry. > >> + } >> + spin_unlock_irqrestore(&adapter->port_list_lock, flags); > > + read_unlock_irqrestore(&adapter->port_list_lock, flags); > >> zfcp_dbf_scsi_eh("schrh_r", adapter, ~0, ret); >> return ret; > >
Hannes, I'm working on a solution, just need more time. On 5/4/22 13:40, Steffen Maier wrote: > On 5/4/22 12:46, Steffen Maier wrote: >> On 5/3/22 22:06, Hannes Reinecke wrote: >>> When issuing a host reset we should be waiting for all >>> ports to become unblocked; just waiting for one might >>> be resulting in host reset to return too early. >>> >>> Signed-off-by: Hannes Reinecke <hare@suse.de> >>> Reviewed-by: Christoph Hellwig <hch@lst.de> >>> Cc: Steffen Maier <maier@linux.ibm.com> >>> Cc: Benjamin Block <bblock@linux.ibm.com> >>> --- >>> drivers/s390/scsi/zfcp_scsi.c | 21 +++++++++++++++------ >>> 1 file changed, 15 insertions(+), 6 deletions(-) >>> >>> diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c >>> index 526ac240d9fe..2e615e1dcde6 100644 >>> --- a/drivers/s390/scsi/zfcp_scsi.c >>> +++ b/drivers/s390/scsi/zfcp_scsi.c >>> @@ -373,9 +373,11 @@ static int zfcp_scsi_eh_target_reset_handler(struct >>> scsi_cmnd *scpnt) >>> static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt) >>> { >>> - struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device); >>> - struct zfcp_adapter *adapter = zfcp_sdev->port->adapter; >>> - int ret = SUCCESS, fc_ret; >>> + struct Scsi_Host *host = scpnt->device->host; >>> + struct zfcp_adapter *adapter = (struct zfcp_adapter *)host->hostdata[0]; >>> + int ret = SUCCESS; >>> + unsigned long flags; >>> + struct zfcp_port *port; >>> if (!(adapter->connection_features & FSF_FEATURE_NPIV_MODE)) { >>> zfcp_erp_port_forced_reopen_all(adapter, 0, "schrh_p"); >>> @@ -383,9 +385,16 @@ static int zfcp_scsi_eh_host_reset_handler(struct >>> scsi_cmnd *scpnt) >>> } >>> zfcp_erp_adapter_reopen(adapter, 0, "schrh_1"); >>> zfcp_erp_wait(adapter); >> >> Now internal zfcp recovery for the adapter completed, but there are async >> pending follow-up things, such as rport(s) unblock, see below. >> >>> - fc_ret = fc_block_scsi_eh(scpnt); >>> - if (fc_ret) >>> - ret = fc_ret; >>> + >>> + spin_lock_irqsave(&adapter->port_list_lock, flags); >> >> zfcp_adapter.port_list_lock is of type rwlock_t: >> >> + read_lock_irqsave(&adapter->port_list_lock, flags); >> >>> + list_for_each_entry(port, &adapter->port_list, list) { >>> + struct fc_rport *rport = port->rport; >> >> While an rport is blocked, port->rport == NULL [zfcp_scsi_rport_block() and >> zfcp_scsi_rport_register()], so this could be a NULL pointer deref. >> >> Typically all rports would still be blocked after above adapter recovery, >> until they become unblocked (via zfcp's async rport_work) >> [zfcp_erp_try_rport_unblock() near the end of lun and port recovery that >> happen as follow-up parts of adapter recovery and before zfcp_erp_wait() >> returns; it eventually calls zfcp_scsi_schedule_rport_register() queueing >> work item port->rport_work on which we could do flush_work() [one prior art >> in zfcp_init_device_configure()]]. >> >> Am starting to wonder if we would really need to sync with the async rports >> unblocks (for this adapter) after zfcp_erp_wait() above and before any deref >> port->rport. Either within this loop or with a separate loop before this one. >> ( Another option would be to somehow iterate rports differently via common >> code lists so we directly get rport references. ) >> >>> + >>> + ret = fc_block_rport(rport); > > I think I just noticed that that callee includes an msleep but we hold a read > lock on port_list_lock. That does not seem right [scheduling while atomic?!]. > Maybe we would need an open coded version after all to be able to drop both > locks in correct order (and re-acquire) in the open-coded copy of > fc_block_rport()? > >> >> Lock order looks good, we hold port_list_lock here and fc_block_rport() takes >> Scsi_Host host_lock, so it matches prior art: >> static void zfcp_erp_action_dismiss_adapter(struct zfcp_adapter *adapter) >> read_lock(&adapter->port_list_lock); >> zfcp_erp_action_dismiss_port(port); >> spin_lock(port->adapter->scsi_host->host_lock); >> >>> + if (ret) >>> + break; >> >> So once we find the first rport with >> (rport->flags & FC_RPORT_FAST_FAIL_TIMEDOUT) >> we would break the loop and return from zfcp_scsi_eh_host_reset_handler() >> with FAST_IO_FAIL. Is that correct? What if there are still other blocked >> rports if we were to continue with the loop? Or am I missing something >> regarding the goal of "waiting for all rports to become unblocked"? >> >> Once we completed the loop, the question arises which consolidated return >> code would be the correct one, if we got different ret for different rports >> in the loop. Does my previous brain dump make sense?: >> I was pondering in my own patch version what to return of just a subset >> of ports ran into fast_io_fail. And for now I thought just fast_io_fail >> would be good to let I/O bubble up for path failover, even if this would >> include of rport which meanwhile unblocked properly and would not need >> bubbling up pending requests because they could service them with a >> simple retry. >> >>> + } >>> + spin_unlock_irqrestore(&adapter->port_list_lock, flags); >> >> + read_unlock_irqrestore(&adapter->port_list_lock, flags); >> >>> zfcp_dbf_scsi_eh("schrh_r", adapter, ~0, ret); >>> return ret; >> >> > >
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c index 526ac240d9fe..2e615e1dcde6 100644 --- a/drivers/s390/scsi/zfcp_scsi.c +++ b/drivers/s390/scsi/zfcp_scsi.c @@ -373,9 +373,11 @@ static int zfcp_scsi_eh_target_reset_handler(struct scsi_cmnd *scpnt) static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt) { - struct zfcp_scsi_dev *zfcp_sdev = sdev_to_zfcp(scpnt->device); - struct zfcp_adapter *adapter = zfcp_sdev->port->adapter; - int ret = SUCCESS, fc_ret; + struct Scsi_Host *host = scpnt->device->host; + struct zfcp_adapter *adapter = (struct zfcp_adapter *)host->hostdata[0]; + int ret = SUCCESS; + unsigned long flags; + struct zfcp_port *port; if (!(adapter->connection_features & FSF_FEATURE_NPIV_MODE)) { zfcp_erp_port_forced_reopen_all(adapter, 0, "schrh_p"); @@ -383,9 +385,16 @@ static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt) } zfcp_erp_adapter_reopen(adapter, 0, "schrh_1"); zfcp_erp_wait(adapter); - fc_ret = fc_block_scsi_eh(scpnt); - if (fc_ret) - ret = fc_ret; + + spin_lock_irqsave(&adapter->port_list_lock, flags); + list_for_each_entry(port, &adapter->port_list, list) { + struct fc_rport *rport = port->rport; + + ret = fc_block_rport(rport); + if (ret) + break; + } + spin_unlock_irqrestore(&adapter->port_list_lock, flags); zfcp_dbf_scsi_eh("schrh_r", adapter, ~0, ret); return ret;