Message ID | 20230330110930.175539-1-yanaijie@huawei.com (mailing list archive) |
---|---|
State | Accepted |
Series | [v3] scsi: libsas: abort all inflight requests when device is gone |

Jason,

> When a disk is removed with inflight IO, the application needs to wait
> for 30 seconds (depending on the timeout configuration) to get back from
> the kernel. Xingui tried to fix this issue by aborting the ATA link
> for SATA devices[1]. However, this approach left the SAS devices
> unresolved.

Applied to 6.4/scsi-staging, thanks!

On Thu, 30 Mar 2023 19:09:30 +0800, Jason Yan wrote:

> When a disk is removed with inflight IO, the application needs to wait
> for 30 seconds (depending on the timeout configuration) to get back from
> the kernel. Xingui tried to fix this issue by aborting the ATA link for
> SATA devices[1]. However, this approach left the SAS devices unresolved.
>
> This patch tries to fix this issue by aborting all inflight requests when
> the device is gone. This is implemented by iterating over the tag set.
>
> [...]

Applied to 6.4/scsi-queue, thanks!

[1/1] scsi: libsas: abort all inflight requests when device is gone
      https://git.kernel.org/mkp/scsi/c/0e4b1791d9b1

diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
index 72fdb2e5d047..8c6afe724944 100644
--- a/drivers/scsi/libsas/sas_discover.c
+++ b/drivers/scsi/libsas/sas_discover.c
@@ -360,6 +360,33 @@ static void sas_destruct_ports(struct asd_sas_port *port)
 	}
 }
 
+static bool sas_abort_cmd(struct request *req, void *data)
+{
+	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(req);
+	struct domain_device *dev = data;
+
+	if (dev == cmd_to_domain_dev(cmd))
+		blk_abort_request(req);
+	return true;
+}
+
+static void sas_abort_device_scsi_cmds(struct domain_device *dev)
+{
+	struct sas_ha_struct *sas_ha = dev->port->ha;
+	struct Scsi_Host *shost = sas_ha->core.shost;
+
+	if (dev_is_expander(dev->dev_type))
+		return;
+
+	/*
+	 * For removed device with active IOs, the user space applications have
+	 * to spend very long time waiting for the timeout. This is not
+	 * necessary because a removed device will not return the IOs.
+	 * Abort the inflight IOs here so that EH can be quickly kicked in.
+	 */
+	blk_mq_tagset_busy_iter(&shost->tag_set, sas_abort_cmd, dev);
+}
+
 void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
 {
 	if (!test_bit(SAS_DEV_DESTROY, &dev->state) &&
@@ -372,6 +399,8 @@ void sas_unregister_dev(struct asd_sas_port *port, struct domain_device *dev)
 	}
 
 	if (!test_and_set_bit(SAS_DEV_DESTROY, &dev->state)) {
+		if (test_bit(SAS_DEV_GONE, &dev->state))
+			sas_abort_device_scsi_cmds(dev);
 		sas_rphy_unlink(dev->rphy);
 		list_move_tail(&dev->disco_list_node, &port->destroy_list);
 	}
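
For readers who want to see the filtering pattern in isolation, the following is a user-space sketch of what the patch does with `blk_mq_tagset_busy_iter()` and `sas_abort_cmd()`. It is not kernel code: every `mock_*` type and function is a hypothetical stand-in invented for illustration, and setting the `aborted` flag merely stands in for the call to `blk_abort_request()`. The point it tries to show is that the host's tag set is walked once, and the callback picks out only the requests that belong to the removed device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins for the kernel types in the patch. */
struct mock_device {
	int id;
};

struct mock_request {
	struct mock_device *dev;	/* device this request targets */
	bool in_flight;			/* true while the request holds a tag */
	bool aborted;			/* set by the abort callback */
};

struct mock_tagset {
	struct mock_request *reqs;
	size_t nr_reqs;
};

/* Callback type: return true to keep iterating, false to stop. */
typedef bool (*mock_busy_iter_fn)(struct mock_request *req, void *data);

/* Model of a busy-tag iterator: visit in-flight requests only. */
static void mock_tagset_busy_iter(struct mock_tagset *set,
				  mock_busy_iter_fn fn, void *data)
{
	for (size_t i = 0; i < set->nr_reqs; i++) {
		struct mock_request *req = &set->reqs[i];

		if (!req->in_flight)
			continue;
		if (!fn(req, data))
			break;
	}
}

/* Mirrors sas_abort_cmd(): mark only requests owned by the gone device. */
static bool mock_abort_cmd(struct mock_request *req, void *data)
{
	struct mock_device *dev = data;

	if (req->dev == dev)
		req->aborted = true;
	return true;	/* keep going: other tags may belong to this device */
}

/* Abort everything in flight for one device; returns how many were hit. */
static size_t mock_abort_device_cmds(struct mock_tagset *set,
				     struct mock_device *dev)
{
	size_t count = 0;

	mock_tagset_busy_iter(set, mock_abort_cmd, dev);
	for (size_t i = 0; i < set->nr_reqs; i++)
		if (set->reqs[i].aborted && set->reqs[i].dev == dev)
			count++;
	return count;
}

/* Small scenario: two devices share one tag set, device A is removed. */
static size_t mock_demo(void)
{
	struct mock_device a = { .id = 1 }, b = { .id = 2 };
	struct mock_request reqs[] = {
		{ .dev = &a, .in_flight = true },
		{ .dev = &b, .in_flight = true },
		{ .dev = &a, .in_flight = false },	/* idle tag, skipped */
		{ .dev = &a, .in_flight = true },
	};
	struct mock_tagset set = { .reqs = reqs, .nr_reqs = 4 };
	size_t n = mock_abort_device_cmds(&set, &a);

	assert(!reqs[1].aborted);	/* other device untouched */
	assert(!reqs[2].aborted);	/* not in flight, never visited */
	return n;
}
```

Calling `mock_demo()` from a `main()` exercises the scenario. As in the patch, the callback always returns `true`, so iteration continues past the first match and every in-flight request of the removed device is reached.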