
scsi: scsi_transport_srp: don't block target in SRP_PORT_LOST state

Message ID 20210401091105.8046-1-mwilck@suse.com (mailing list archive)
State Accepted

Commit Message

Martin Wilck April 1, 2021, 9:11 a.m. UTC
From: Martin Wilck <mwilck@suse.com>

rport_dev_loss_timedout() sets the rport state to SRP_RPORT_LOST and
the SCSI target state to SDEV_TRANSPORT_OFFLINE. If this races with
srp_reconnect_work(), a warning is printed:

Mar 27 18:48:07 ictm1604s01h4 kernel: dev_loss_tmo expired for SRP port-18:1 / host18.
Mar 27 18:48:07 ictm1604s01h4 kernel: ------------[ cut here ]------------
Mar 27 18:48:07 ictm1604s01h4 kernel: scsi_internal_device_block(18:0:0:100) failed: ret = -22
Mar 27 18:48:07 ictm1604s01h4 kernel: Call Trace:
Mar 27 18:48:07 ictm1604s01h4 kernel:  ? scsi_target_unblock+0x50/0x50 [scsi_mod]
Mar 27 18:48:07 ictm1604s01h4 kernel:  starget_for_each_device+0x80/0xb0 [scsi_mod]
Mar 27 18:48:07 ictm1604s01h4 kernel:  target_block+0x24/0x30 [scsi_mod]
Mar 27 18:48:07 ictm1604s01h4 kernel:  device_for_each_child+0x57/0x90
Mar 27 18:48:07 ictm1604s01h4 kernel:  srp_reconnect_rport+0xe4/0x230 [scsi_transport_srp]
Mar 27 18:48:07 ictm1604s01h4 kernel:  srp_reconnect_work+0x40/0xc0 [scsi_transport_srp]

Avoid this by not trying to block targets for rports in SRP_RPORT_LOST
state.

Signed-off-by: Martin Wilck <mwilck@suse.com>
---
 drivers/scsi/scsi_transport_srp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
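
For readers who want to see the changed condition in isolation, here is a
minimal userspace sketch. It is not kernel code: the model_* names are
hypothetical stand-ins for the rport states, and the two helpers only mirror
the "if" condition before and after this patch, assuming the target is
blocked only when that condition is true.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for the rport states; not the kernel header. */
enum model_rport_state {
	MODEL_RPORT_RUNNING,
	MODEL_RPORT_BLOCKED,
	MODEL_RPORT_FAIL_FAST,
	MODEL_RPORT_LOST,
};

/* Condition before the patch: only FAIL_FAST skips the block attempt. */
bool model_may_block_old(enum model_rport_state s)
{
	return s != MODEL_RPORT_FAIL_FAST;
}

/* Condition after the patch: LOST skips the block attempt as well. */
bool model_may_block_new(enum model_rport_state s)
{
	return s != MODEL_RPORT_FAIL_FAST && s != MODEL_RPORT_LOST;
}

/* Self-check: the two conditions differ only for the LOST state. */
void model_selfcheck(void)
{
	assert(model_may_block_old(MODEL_RPORT_LOST));
	assert(!model_may_block_new(MODEL_RPORT_LOST));
	assert(model_may_block_old(MODEL_RPORT_RUNNING) ==
	       model_may_block_new(MODEL_RPORT_RUNNING));
	assert(model_may_block_old(MODEL_RPORT_BLOCKED) ==
	       model_may_block_new(MODEL_RPORT_BLOCKED));
	assert(!model_may_block_old(MODEL_RPORT_FAIL_FAST));
	assert(!model_may_block_new(MODEL_RPORT_FAIL_FAST));
}
```

In this model the old condition still attempts to block when the rport is
already lost, which corresponds to the case where the trace above shows
scsi_internal_device_block() failing with -22 (-EINVAL).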

Comments

Bart Van Assche April 2, 2021, 7:38 p.m. UTC | #1
On 4/1/21 2:11 AM, mwilck@suse.com wrote:
> rport_dev_loss_timedout() sets the rport state to SRP_PORT_LOST and
> the SCSI target state to SDEV_TRANSPORT_OFFLINE. If this races with
> srp_reconnect_work(), a warning is printed:

Reviewed-by: Bart Van Assche <bvanassche@acm.org>
Laurence Oberman April 2, 2021, 8:15 p.m. UTC | #2
On Fri, 2021-04-02 at 12:38 -0700, Bart Van Assche wrote:
> On 4/1/21 2:11 AM, mwilck@suse.com wrote:
> > rport_dev_loss_timedout() sets the rport state to SRP_PORT_LOST and
> > the SCSI target state to SDEV_TRANSPORT_OFFLINE. If this races with
> > srp_reconnect_work(), a warning is printed:
> 
> Reviewed-by: Bart Van Assche <bvanassche@acm.org>
> 

Indeed I have seen this while running rapid resets in my lab. Was not
sure if it was something I was doing or a real bug.

For example, this script will bring it out if I lower the delay:

#!/bin/bash
# On the ibclient server, in /sys/class/srp_remote_ports, using
# "echo 1 > delete" for the particular port will simulate a port reset.
#
# /sys/class/srp_remote_ports
# [root@ibclient srp_remote_ports]# ls
# port-1:1  port-2:1
for d in /sys/class/srp_remote_ports/*
do
	echo 1 > "$d/delete"
	sleep 60
done

Looks correct, and anyway Bart agrees so:

Reviewed-by: Laurence Oberman <loberman@redhat.com>
Martin K. Petersen April 6, 2021, 4:52 a.m. UTC | #3
On Thu, 1 Apr 2021 11:11:05 +0200, mwilck@suse.com wrote:

> rport_dev_loss_timedout() sets the rport state to SRP_PORT_LOST and
> the SCSI target state to SDEV_TRANSPORT_OFFLINE. If this races with
> srp_reconnect_work(), a warning is printed:
> 
> Mar 27 18:48:07 ictm1604s01h4 kernel: dev_loss_tmo expired for SRP port-18:1 / host18.
> Mar 27 18:48:07 ictm1604s01h4 kernel: ------------[ cut here ]------------
> Mar 27 18:48:07 ictm1604s01h4 kernel: scsi_internal_device_block(18:0:0:100) failed: ret = -22
> Mar 27 18:48:07 ictm1604s01h4 kernel: Call Trace:
> Mar 27 18:48:07 ictm1604s01h4 kernel:  ? scsi_target_unblock+0x50/0x50 [scsi_mod]
> Mar 27 18:48:07 ictm1604s01h4 kernel:  starget_for_each_device+0x80/0xb0 [scsi_mod]
> Mar 27 18:48:07 ictm1604s01h4 kernel:  target_block+0x24/0x30 [scsi_mod]
> Mar 27 18:48:07 ictm1604s01h4 kernel:  device_for_each_child+0x57/0x90
> Mar 27 18:48:07 ictm1604s01h4 kernel:  srp_reconnect_rport+0xe4/0x230 [scsi_transport_srp]
> Mar 27 18:48:07 ictm1604s01h4 kernel:  srp_reconnect_work+0x40/0xc0 [scsi_transport_srp]
> 
> [...]

Applied to 5.12/scsi-fixes, thanks!

[1/1] scsi: scsi_transport_srp: don't block target in SRP_PORT_LOST state
      https://git.kernel.org/mkp/scsi/c/5cd0f6f57639

Patch

diff --git a/drivers/scsi/scsi_transport_srp.c b/drivers/scsi/scsi_transport_srp.c
index 1e939a2a387f..98a34ed10f1a 100644
--- a/drivers/scsi/scsi_transport_srp.c
+++ b/drivers/scsi/scsi_transport_srp.c
@@ -541,7 +541,7 @@  int srp_reconnect_rport(struct srp_rport *rport)
 	res = mutex_lock_interruptible(&rport->mutex);
 	if (res)
 		goto out;
-	if (rport->state != SRP_RPORT_FAIL_FAST)
+	if (rport->state != SRP_RPORT_FAIL_FAST && rport->state != SRP_RPORT_LOST)
 		/*
 		 * sdev state must be SDEV_TRANSPORT_OFFLINE, transition
 		 * to SDEV_BLOCK is illegal. Calling scsi_target_unblock()