Message ID | 20231002154927.68643-2-hare@suse.de (mailing list archive) |
---|---|
State | Deferred |
Headers | show |
Series | [01/15] zfcp: do not wait for rports to become unblocked after host reset | expand |
Hey Hannes, I've got a few questions re the rational for this change. On Mon, Oct 02, 2023 at 05:49:13PM +0200, Hannes Reinecke wrote: > zfcp_scsi_eh_host_reset_handler() would call fc_block_rport() to > wait for all rports to become unblocked after host reset. > But after host reset it might happen that the port is gone, hence > fc_block_rport() might fail due to a missing port. > But that's a perfectly legal operation; on FC remote ports might > come and go. > In the same vein FC HBAs are able to deal with ports being temporarily > blocked, so really there is not point in waiting for all ports > to become unblocked during host reset. But in scsi_transport_fc.c we have this documented: * fc_block_scsi_eh - Block SCSI eh thread for blocked fc_rport * @cmnd: SCSI command that scsi_eh is trying to recover * * This routine can be called from a FC LLD scsi_eh callback. It * blocks the scsi_eh thread until the fc_rport leaves the * FC_PORTSTATE_BLOCKED, or the fast_io_fail_tmo fires. This is * necessary to avoid the scsi_eh failing recovery actions for blocked ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ * rports which would lead to offlined SCSI devices. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ So I don't understand what the real expectation by the SCSI EH call back for host reset is then. Is it that all objects (host/target ports/luns) are operational again once we return to the EH thread, or is it ok that some parts are still being recovered (as with our host reset handler, rports might still be blocked after `zfcp_erp_wait()` finishes, because of how this is organized internally). If it's the later, I'd think this change is fine. But then I'd wonder why this function exists in the first place? Is it because in other EH steps it's more important that rports are ready after the step (e.g. because a TUR is send after, and if that fails, things get escalate unnecessarily)? Oh.. speaking of that, we do send a TUR after host reset as well (`scsi_eh_test_devices()`). So doesn't this break then if one or more rports are sill blocked after host reset returns? At least `zfcp_scsi_queuecommand()` will bail very early if the rport is not ready (we call `fc_remote_port_chkready()` as more or less first thing), and so `scsi_send_eh_cmnd()` that is used for the TUR will fail; then it might be retried one time, but this is a tight loop without any delay, so I'd guess this has a good chance to fail as well. And then we'd offline the whole host as further escalation, which would *REALLY* suck (with no automatic recovery no less). My impression from look at the code that follows `scsi_try_host_reset()` in `scsi_error.c` really is, it rather expects things to be ready to be used after, right there and then (admittedly, this is probably already today problematic, as things might go back to not working concurrently because of some fabric event.. but anyway, we can life with that off-chance it seems). Or do I miss something? > Hence remove the call to fc_block_rport() in host reset. > > Signed-off-by: Hannes Reinecke <hare@suse.de> > Cc: Steffen Maier <maier@linux.ibm.com > Cc: Benjamin Block <bblock@linux.ibm.com> > --- > drivers/s390/scsi/zfcp_scsi.c | 4 ---- > 1 file changed, 4 deletions(-) > > diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c > index b2a8cd792266..14f929cca271 100644 > --- a/drivers/s390/scsi/zfcp_scsi.c > +++ b/drivers/s390/scsi/zfcp_scsi.c > @@ -383,10 +383,6 @@ static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt) > } > zfcp_erp_adapter_reopen(adapter, 0, "schrh_1"); > zfcp_erp_wait(adapter); > - fc_ret = fc_block_scsi_eh(scpnt); > - if (fc_ret) > - ret = fc_ret; > - > zfcp_dbf_scsi_eh("schrh_r", adapter, ~0, ret); > return ret; > } > -- > 2.35.3 >
On 10/12/23 15:54, Benjamin Block wrote: > Hey Hannes, > > I've got a few questions re the rational for this change. > > On Mon, Oct 02, 2023 at 05:49:13PM +0200, Hannes Reinecke wrote: >> zfcp_scsi_eh_host_reset_handler() would call fc_block_rport() to >> wait for all rports to become unblocked after host reset. >> But after host reset it might happen that the port is gone, hence >> fc_block_rport() might fail due to a missing port. >> But that's a perfectly legal operation; on FC remote ports might >> come and go. >> In the same vein FC HBAs are able to deal with ports being temporarily >> blocked, so really there is not point in waiting for all ports >> to become unblocked during host reset. > > But in scsi_transport_fc.c we have this documented: > > * fc_block_scsi_eh - Block SCSI eh thread for blocked fc_rport > * @cmnd: SCSI command that scsi_eh is trying to recover > * > * This routine can be called from a FC LLD scsi_eh callback. It > * blocks the scsi_eh thread until the fc_rport leaves the > * FC_PORTSTATE_BLOCKED, or the fast_io_fail_tmo fires. This is > * necessary to avoid the scsi_eh failing recovery actions for blocked > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > * rports which would lead to offlined SCSI devices. > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > So I don't understand what the real expectation by the SCSI EH call back for > host reset is then. > > Is it that all objects (host/target ports/luns) are operational again once we > return to the EH thread, or is it ok that some parts are still being > recovered (as with our host reset handler, rports might still be blocked after > `zfcp_erp_wait()` finishes, because of how this is organized internally). > > If it's the later, I'd think this change is fine. But then I'd wonder why this > function exists in the first place? Is it because in other EH steps it's more > important that rports are ready after the step (e.g. because a TUR is send > after, and if that fails, things get escalate unnecessarily)? > Thing is, fc_block_scsi_eh() is assumed to be called from eh callbacks _before_ any TMFs are to be sent. Typically you would call them in eh_device_reset() or eh_target_reset() to ensure that you can sent TMFs in the first place; no point in attempting to send TMFs is the port is blocked. Your particular case is arguably a mis-use of fc_block_scsi_eh() as it is called _after_ host reset is initiated, essentially serving as a completion point to ensure that all rports are back online. However, for the FC transport implementation rport lifetimes are decoupled from SCSI Host lifetimes; rports may (and do!) come and go during the lifetime of a SCSI host. Consequently there is no difference between a host with all rports blocked (eg during RSCN processing) and a host just coming on-line after SCSI EH where rports are still in the process of getting ready. Hence the use of fc_block_scsi_eh() after host reset is not required, and we can make our life easier by just dropping the call. > Oh.. speaking of that, we do send a TUR after host reset as well > (`scsi_eh_test_devices()`). So doesn't this break then if one or more rports > are sill blocked after host reset returns? > At least `zfcp_scsi_queuecommand()` will bail very early if the rport is > not ready (we call `fc_remote_port_chkready()` as more or less first thing), > and so `scsi_send_eh_cmnd()` that is used for the TUR will fail; then it might > be retried one time, but this is a tight loop without any delay, so I'd guess > this has a good chance to fail as well. > And then we'd offline the whole host as further escalation, which would > *REALLY* suck (with no automatic recovery no less). > > My impression from look at the code that follows `scsi_try_host_reset()` in > `scsi_error.c` really is, it rather expects things to be ready to be used > after, right there and then (admittedly, this is probably already today > problematic, as things might go back to not working concurrently because of > some fabric event.. but anyway, we can life with that off-chance it seems). > > Or do I miss something? > Ah, right. True, when the rports are not ready (ie still being blocked) sending a TEST UNIT READY will fail, with probably unintended consequences. But: if host reset would return FAST_IO_FAIL everything would be dandy as then we would just check if the devices are online (by virtue of scsi_eh_flush_done_q() in scsi_unjam_host()), which they really should as no-one should have set them offline by then. So I guess that's the correct way to go. Will be modifying the patch accordingly. Thanks for the feedback! Cheers, Hannes
On Thu, Oct 12, 2023 at 04:23:47PM +0200, Hannes Reinecke wrote: > On 10/12/23 15:54, Benjamin Block wrote: > > On Mon, Oct 02, 2023 at 05:49:13PM +0200, Hannes Reinecke wrote: > >> zfcp_scsi_eh_host_reset_handler() would call fc_block_rport() to > >> wait for all rports to become unblocked after host reset. > >> But after host reset it might happen that the port is gone, hence > >> fc_block_rport() might fail due to a missing port. > >> But that's a perfectly legal operation; on FC remote ports might > >> come and go. > >> In the same vein FC HBAs are able to deal with ports being temporarily > >> blocked, so really there is not point in waiting for all ports > >> to become unblocked during host reset. > > > > But in scsi_transport_fc.c we have this documented: > > > > * fc_block_scsi_eh - Block SCSI eh thread for blocked fc_rport > > * @cmnd: SCSI command that scsi_eh is trying to recover > > * > > * This routine can be called from a FC LLD scsi_eh callback. It > > * blocks the scsi_eh thread until the fc_rport leaves the > > * FC_PORTSTATE_BLOCKED, or the fast_io_fail_tmo fires. This is > > * necessary to avoid the scsi_eh failing recovery actions for blocked > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > * rports which would lead to offlined SCSI devices. > > ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ > > > > So I don't understand what the real expectation by the SCSI EH call back for > > host reset is then. > > > > Is it that all objects (host/target ports/luns) are operational again once we > > return to the EH thread, or is it ok that some parts are still being > > recovered (as with our host reset handler, rports might still be blocked after > > `zfcp_erp_wait()` finishes, because of how this is organized internally). > > > > If it's the later, I'd think this change is fine. But then I'd wonder why this > > function exists in the first place? Is it because in other EH steps it's more > > important that rports are ready after the step (e.g. because a TUR is send > > after, and if that fails, things get escalate unnecessarily)? > > > > Thing is, fc_block_scsi_eh() is assumed to be called from eh callbacks > _before_ any TMFs are to be sent. > Typically you would call them in eh_device_reset() or eh_target_reset() > to ensure that you can sent TMFs in the first place; no point in attempting > to send TMFs is the port is blocked. Ok. Interesting. We don't really care about the state of the rport when sending TMFs or Aborts, as those commands are sent outside the normal queuecommand flow (we just check "internal bits"), in case of Aborts we even hand this off to firmware. Consequently we don't really care about their state before trying to send either. > Your particular case is arguably a mis-use of fc_block_scsi_eh() as > it is called _after_ host reset is initiated, essentially serving as > a completion point to ensure that all rports are back online. > > However, for the FC transport implementation rport lifetimes are > decoupled from SCSI Host lifetimes; rports may (and do!) come and > go during the lifetime of a SCSI host. Ye, Ack. We can't control Fabric events. > Consequently there is no > difference between a host with all rports blocked (eg during RSCN > processing) and a host just coming on-line after SCSI EH where rports > are still in the process of getting ready. > > Hence the use of fc_block_scsi_eh() after host reset is not required, > and we can make our life easier by just dropping the call. > > > Oh.. speaking of that, we do send a TUR after host reset as well > > (`scsi_eh_test_devices()`). So doesn't this break then if one or more rports > > are sill blocked after host reset returns? > > At least `zfcp_scsi_queuecommand()` will bail very early if the rport is > > not ready (we call `fc_remote_port_chkready()` as more or less first thing), > > and so `scsi_send_eh_cmnd()` that is used for the TUR will fail; then it might > > be retried one time, but this is a tight loop without any delay, so I'd guess > > this has a good chance to fail as well. > > And then we'd offline the whole host as further escalation, which would > > *REALLY* suck (with no automatic recovery no less). > > > > My impression from look at the code that follows `scsi_try_host_reset()` in > > `scsi_error.c` really is, it rather expects things to be ready to be used > > after, right there and then (admittedly, this is probably already today > > problematic, as things might go back to not working concurrently because of > > some fabric event.. but anyway, we can life with that off-chance it seems). > > > > Or do I miss something? > > > > Ah, right. True, when the rports are not ready (ie still being blocked) > sending a TEST UNIT READY will fail, with probably unintended consequences. > > But: if host reset would return FAST_IO_FAIL everything would be dandy Ok, so that would mean, we finish all commands left in the EH work_q with `scsi_eh_finish_cmd()`, and not populate the local `check_list` at all, which in turns means, we don't do anything in `scsi_eh_test_devices()` (no state checks, not TURs). > as then we would just check if the devices are online (by virtue of > scsi_eh_flush_done_q() in scsi_unjam_host()), which they really should > as no-one should have set them offline by then. When returning to `scsi_unjam_host()` directly after we return from `scsi_eh_host_reset()` we call into `scsi_eh_flush_done_q()` and go over all commands that are now in the done-queue (everything, if host reset returned FAST_IO_FAIL). In there we delete the commands from the EH list, and then check whether we ought to retry the command on the same SDEV or return it to some upper layer (i.e. hopefully dm-multipath for our installations). The former depends on whether the SDEV is online again. If everything is fine in the SAN (not cable pulled or something), I think this should be the case, but IFF we assume the rport is still blocked because the async registration (`zfcp_scsi_rport_work()`) hasn't finished yet (the original point for using `fc_block_scsi_eh()`), then the SDEV might still be in state SDEV_TRANSPORT_OFFLINE. This can happen during adapter recovery (where we block, IOW call `fc_remote_port_delete()` on all rports) if fast-io-fail-tmo runs out, and `fc_terminate_rport_io()` is called. That is undone when we call `fc_remote_port_add()` to 'unblock' the rport. This would then set all SDEVs into RUNNING again. And there we have the interaction with `fc_block_scsi_eh()` again. Hmm. I think I could life with both though. If someone drives I/O directly on the SDEV, and it fails after EH because of some unfortunate timing, that's bad luck, and something was actually wrong in the SAN if fast-io-fail-tmo runs out during recovery. They ought to use dm-multipath. And if they do, the commands are re-issued from that layer. I think that should be fine. So I think we can work with returning FAST_IO_FAIL from `zfcp_scsi_eh_host_reset_handler()`, and removing the call to `fc_block_scsi_eh()`. We (Steffen and/or I) might still want to look into some other solution for only returning from that when we know the async rport registrations have ran at least once after adapter recovery. But as far as your patchset goes, I don't think that is a gate.
On 10/12/23 19:49, Benjamin Block wrote: > On Thu, Oct 12, 2023 at 04:23:47PM +0200, Hannes Reinecke wrote: >> On 10/12/23 15:54, Benjamin Block wrote: >>> On Mon, Oct 02, 2023 at 05:49:13PM +0200, Hannes Reinecke wrote: >>>> zfcp_scsi_eh_host_reset_handler() would call fc_block_rport() to >>>> wait for all rports to become unblocked after host reset. >>>> But after host reset it might happen that the port is gone, hence >>>> fc_block_rport() might fail due to a missing port. >>>> But that's a perfectly legal operation; on FC remote ports might >>>> come and go. >>>> In the same vein FC HBAs are able to deal with ports being temporarily >>>> blocked, so really there is not point in waiting for all ports >>>> to become unblocked during host reset. >>> >>> But in scsi_transport_fc.c we have this documented: >>> >>> * fc_block_scsi_eh - Block SCSI eh thread for blocked fc_rport >>> * @cmnd: SCSI command that scsi_eh is trying to recover >>> * >>> * This routine can be called from a FC LLD scsi_eh callback. It >>> * blocks the scsi_eh thread until the fc_rport leaves the >>> * FC_PORTSTATE_BLOCKED, or the fast_io_fail_tmo fires. This is >>> * necessary to avoid the scsi_eh failing recovery actions for blocked >>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>> * rports which would lead to offlined SCSI devices. >>> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ >>> >>> So I don't understand what the real expectation by the SCSI EH call back for >>> host reset is then. >>> >>> Is it that all objects (host/target ports/luns) are operational again once we >>> return to the EH thread, or is it ok that some parts are still being >>> recovered (as with our host reset handler, rports might still be blocked after >>> `zfcp_erp_wait()` finishes, because of how this is organized internally). >>> >>> If it's the later, I'd think this change is fine. But then I'd wonder why this >>> function exists in the first place? Is it because in other EH steps it's more >>> important that rports are ready after the step (e.g. because a TUR is send >>> after, and if that fails, things get escalate unnecessarily)? >>> >> >> Thing is, fc_block_scsi_eh() is assumed to be called from eh callbacks >> _before_ any TMFs are to be sent. >> Typically you would call them in eh_device_reset() or eh_target_reset() >> to ensure that you can sent TMFs in the first place; no point in attempting >> to send TMFs is the port is blocked. > > Ok. Interesting. We don't really care about the state of the rport when > sending TMFs or Aborts, as those commands are sent outside the normal > queuecommand flow (we just check "internal bits"), in case of Aborts we even > hand this off to firmware. Consequently we don't really care about their state > before trying to send either. > Interesting. All others do care about the state of the rport, as for them sending commands to a blocked rport will just cause the TMF to fail. [ .. ] >>> My impression from look at the code that follows `scsi_try_host_reset()` in >>> `scsi_error.c` really is, it rather expects things to be ready to be used >>> after, right there and then (admittedly, this is probably already today >>> problematic, as things might go back to not working concurrently because of >>> some fabric event.. but anyway, we can life with that off-chance it seems). >>> >>> Or do I miss something? >>> >> >> Ah, right. True, when the rports are not ready (ie still being blocked) >> sending a TEST UNIT READY will fail, with probably unintended consequences. >> >> But: if host reset would return FAST_IO_FAIL everything would be dandy > > Ok, so that would mean, we finish all commands left in the EH work_q with > `scsi_eh_finish_cmd()`, and not populate the local `check_list` at all, which > in turns means, we don't do anything in `scsi_eh_test_devices()` (no state > checks, not TURs). > >> as then we would just check if the devices are online (by virtue of >> scsi_eh_flush_done_q() in scsi_unjam_host()), which they really should >> as no-one should have set them offline by then. > > When returning to `scsi_unjam_host()` directly after we return from > `scsi_eh_host_reset()` we call into `scsi_eh_flush_done_q()` and go over all > commands that are now in the done-queue (everything, if host reset returned > FAST_IO_FAIL). > > In there we delete the commands from the EH list, and then check whether we > ought to retry the command on the same SDEV or return it to some upper layer > (i.e. hopefully dm-multipath for our installations). > > The former depends on whether the SDEV is online again. If everything is fine > in the SAN (not cable pulled or something), I think this should be the case, > but IFF we assume the rport is still blocked because the async registration > (`zfcp_scsi_rport_work()`) hasn't finished yet (the original point for using > `fc_block_scsi_eh()`), then the SDEV might still be in state > SDEV_TRANSPORT_OFFLINE. > This can happen during adapter recovery (where we block, IOW call > `fc_remote_port_delete()` on all rports) if fast-io-fail-tmo runs out, and > `fc_terminate_rport_io()` is called. > That is undone when we call `fc_remote_port_add()` to 'unblock' the rport. > This would then set all SDEVs into RUNNING again. And there we have the > interaction with `fc_block_scsi_eh()` again. > Yes, but that is perfectly fine, and in fact exactly as things should work. Once a device is in SDEV_TRANSPORT_OFFLINE it means that the underlying rport has been deleted after dev_loss_tmo has expired. If that happened during SCSI EH, well, tough luck. I/O would have been aborted even without SCSI EH here. All fine by me. > Hmm. I think I could life with both though. If someone drives I/O directly on > the SDEV, and it fails after EH because of some unfortunate timing, that's bad > luck, and something was actually wrong in the SAN if fast-io-fail-tmo runs out > during recovery. They ought to use dm-multipath. > And if they do, the commands are re-issued from that layer. I think that > should be fine. > > So I think we can work with returning FAST_IO_FAIL from > `zfcp_scsi_eh_host_reset_handler()`, and removing the call to > `fc_block_scsi_eh()`. > We (Steffen and/or I) might still want to look into some other solution > for only returning from that when we know the async rport registrations have > ran at least once after adapter recovery. But as far as your patchset goes, I > don't think that is a gate. > Thanks a lot. Modification to zfcp have been bogging me down for quite some while, glad that we've found this rather easy solution. Will be sending an updated version. Cheers, Hannes
diff --git a/drivers/s390/scsi/zfcp_scsi.c b/drivers/s390/scsi/zfcp_scsi.c index b2a8cd792266..14f929cca271 100644 --- a/drivers/s390/scsi/zfcp_scsi.c +++ b/drivers/s390/scsi/zfcp_scsi.c @@ -383,10 +383,6 @@ static int zfcp_scsi_eh_host_reset_handler(struct scsi_cmnd *scpnt) } zfcp_erp_adapter_reopen(adapter, 0, "schrh_1"); zfcp_erp_wait(adapter); - fc_ret = fc_block_scsi_eh(scpnt); - if (fc_ret) - ret = fc_ret; - zfcp_dbf_scsi_eh("schrh_r", adapter, ~0, ret); return ret; }
zfcp_scsi_eh_host_reset_handler() would call fc_block_rport() to wait for all rports to become unblocked after host reset. But after host reset it might happen that the port is gone, hence fc_block_rport() might fail due to a missing port. But that's a perfectly legal operation; on FC remote ports might come and go. In the same vein FC HBAs are able to deal with ports being temporarily blocked, so really there is not point in waiting for all ports to become unblocked during host reset. Hence remove the call to fc_block_rport() in host reset. Signed-off-by: Hannes Reinecke <hare@suse.de> Cc: Steffen Maier <maier@linux.ibm.com Cc: Benjamin Block <bblock@linux.ibm.com> --- drivers/s390/scsi/zfcp_scsi.c | 4 ---- 1 file changed, 4 deletions(-)