Message ID | 20201116183114.55703-3-hare@suse.de (mailing list archive) |
---|---|
State | New, archived |
Series | scsi: infinite guest hangs with scsi-disk |
On Mon, Nov 16, 2020 at 7:31 PM Hannes Reinecke <hare@suse.de> wrote:
> The current code sets an infinite timeout on SG_IO requests,
> causing the guest to stall if the host experiences a frame
> loss.
> This patch adds an 'io_timeout' parameter for SCSIDevice to
> make the SG_IO timeout configurable, and also shortens the
> default timeout to 30 seconds to avoid infinite stalls.

Hannes, could 30 seconds be a bit too short for tape drives?

Paolo
On 9/20/21 8:56 PM, Paolo Bonzini wrote:
> On Mon, Nov 16, 2020 at 7:31 PM Hannes Reinecke <hare@suse.de> wrote:
>> The current code sets an infinite timeout on SG_IO requests,
>> causing the guest to stall if the host experiences a frame
>> loss.
>> This patch adds an 'io_timeout' parameter for SCSIDevice to
>> make the SG_IO timeout configurable, and also shortens the
>> default timeout to 30 seconds to avoid infinite stalls.
>
> Hannes, could 30 seconds be a bit too short for tape drives?

It would, but then anyone attempting to use tapes via qemu emulation
deserves to suffer.
Tapes are bitchy even when used normally, so attempting to use them
under qemu emulation will land you with lots of unhappy experiences,
where the timeout is the least of your problems.
I sincerely doubt anyone will be using tapes here.
Not in real-world scenarios.

Cheers,

Hannes
On 21/09/21 07:39, Hannes Reinecke wrote:
> It would, but then anyone attempting to use tapes via qemu emulation
> deserves to suffer.
> Tapes are bitchy even when used normally, so attempting to use them
> under qemu emulation will land you with lots of unhappy experiences,
> where the timeout is the least of your problems.
> I sincerely doubt anyone will be using tapes here.
> Not in real-world scenarios.

Hmm, I have customers that disagree. Probably the timeout should be kept infinite for tapes.

Paolo
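
Paolo's suggestion could be sketched roughly as follows. This is only an illustration and not part of the patch: the helper name `sg_timeout_ms_for_type` and both macros are hypothetical. It assumes 0x01 as the SCSI peripheral device type for sequential-access (tape) devices, and reuses `UINT_MAX`, the value the pre-patch scsi-block code passed as its effectively infinite timeout.

```c
#include <limits.h>
#include <stdint.h>

/* SCSI peripheral device type for sequential-access (tape) devices. */
#define SCSI_TYPE_TAPE 0x01

/* Hypothetical name for the value the pre-patch code used as "no timeout". */
#define SG_TIMEOUT_INFINITE_MS UINT_MAX

/*
 * Hypothetical helper (not in the patch): choose the SG_IO timeout in
 * milliseconds for a device of the given SCSI type. Tapes keep the old,
 * effectively infinite timeout; every other type uses io_timeout, which
 * is given in seconds.
 */
static unsigned int sg_timeout_ms_for_type(int scsi_type, uint32_t io_timeout_s)
{
    if (scsi_type == SCSI_TYPE_TAPE) {
        return SG_TIMEOUT_INFINITE_MS;
    }
    /* Same seconds-to-milliseconds conversion as in the patch. */
    return io_timeout_s * 1000;
}
```

With the default of 30 seconds, a disk would get a 30000 ms timeout while a tape would keep `UINT_MAX`. Whether the device type is already known at the point where the timeout is set is a separate question this sketch does not address.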
Hi Hannes,

On 11/16/20 19:31, Hannes Reinecke wrote:
> The current code sets an infinite timeout on SG_IO requests,
> causing the guest to stall if the host experiences a frame
> loss.
> This patch adds an 'io_timeout' parameter for SCSIDevice to
> make the SG_IO timeout configurable, and also shortens the
> default timeout to 30 seconds to avoid infinite stalls.
>
> Signed-off-by: Hannes Reinecke <hare@suse.de>
> ---
>  hw/scsi/scsi-disk.c    |  6 ++++--
>  hw/scsi/scsi-generic.c | 17 +++++++++++------
>  include/hw/scsi/scsi.h |  4 +++-
>  3 files changed, 18 insertions(+), 9 deletions(-)

>  int scsi_SG_IO_FROM_DEV(BlockBackend *blk, uint8_t *cmd, uint8_t cmd_size,
> -                        uint8_t *buf, uint8_t buf_size)
> +                        uint8_t *buf, uint8_t buf_size, uint32_t timeout)
>  {
>      sg_io_hdr_t io_header;
>      uint8_t sensebuf[8];
> @@ -520,7 +522,7 @@ int scsi_SG_IO_FROM_DEV(BlockBackend *blk, uint8_t *cmd, uint8_t cmd_size,
>      io_header.cmd_len = cmd_size;
>      io_header.mx_sb_len = sizeof(sensebuf);
>      io_header.sbp = sensebuf;
> -    io_header.timeout = 6000; /* XXX */
> +    io_header.timeout = timeout * 1000;

> @@ -637,7 +639,7 @@ static int get_stream_blocksize(BlockBackend *blk)
>      cmd[0] = MODE_SENSE;
>      cmd[4] = sizeof(buf);
>
> -    ret = scsi_SG_IO_FROM_DEV(blk, cmd, sizeof(cmd), buf, sizeof(buf));
> +    ret = scsi_SG_IO_FROM_DEV(blk, cmd, sizeof(cmd), buf, sizeof(buf), 6);

Why is this timeout hardcoded? Due to the /* XXX */ comment?
On 9/22/21 5:47 PM, Philippe Mathieu-Daudé wrote:
> Hi Hannes,
>
> On 11/16/20 19:31, Hannes Reinecke wrote:
>> The current code sets an infinite timeout on SG_IO requests,
>> causing the guest to stall if the host experiences a frame
>> loss.
>> This patch adds an 'io_timeout' parameter for SCSIDevice to
>> make the SG_IO timeout configurable, and also shortens the
>> default timeout to 30 seconds to avoid infinite stalls.
>>
>> Signed-off-by: Hannes Reinecke <hare@suse.de>
>> ---
>>  hw/scsi/scsi-disk.c    |  6 ++++--
>>  hw/scsi/scsi-generic.c | 17 +++++++++++------
>>  include/hw/scsi/scsi.h |  4 +++-
>>  3 files changed, 18 insertions(+), 9 deletions(-)
>
>>  int scsi_SG_IO_FROM_DEV(BlockBackend *blk, uint8_t *cmd, uint8_t cmd_size,
>> -                        uint8_t *buf, uint8_t buf_size)
>> +                        uint8_t *buf, uint8_t buf_size, uint32_t timeout)
>>  {
>>      sg_io_hdr_t io_header;
>>      uint8_t sensebuf[8];
>> @@ -520,7 +522,7 @@ int scsi_SG_IO_FROM_DEV(BlockBackend *blk, uint8_t *cmd, uint8_t cmd_size,
>>      io_header.cmd_len = cmd_size;
>>      io_header.mx_sb_len = sizeof(sensebuf);
>>      io_header.sbp = sensebuf;
>> -    io_header.timeout = 6000; /* XXX */
>> +    io_header.timeout = timeout * 1000;
>
>> @@ -637,7 +639,7 @@ static int get_stream_blocksize(BlockBackend *blk)
>>      cmd[0] = MODE_SENSE;
>>      cmd[4] = sizeof(buf);
>> -    ret = scsi_SG_IO_FROM_DEV(blk, cmd, sizeof(cmd), buf, sizeof(buf));
>> +    ret = scsi_SG_IO_FROM_DEV(blk, cmd, sizeof(cmd), buf, sizeof(buf), 6);
>
> Why is this timeout hardcoded? Due to the /* XXX */ comment?

60 seconds is the default command timeout on Linux.
And the problem is that the guest might set a command timeout on the
commands being sent from the guest's user space, but the guest is unable
to communicate this timeout to the host.

Really, one should fix up the virtio spec here to allow for a 'timeout'
field. But in the absence of that, being able to configure it is the next
best attempt.

Cheers,

Hannes
On 22/09/21 17:47, Philippe Mathieu-Daudé wrote:
>> @@ -637,7 +639,7 @@ static int get_stream_blocksize(BlockBackend *blk)
>>      cmd[0] = MODE_SENSE;
>>      cmd[4] = sizeof(buf);
>> -    ret = scsi_SG_IO_FROM_DEV(blk, cmd, sizeof(cmd), buf, sizeof(buf));
>> +    ret = scsi_SG_IO_FROM_DEV(blk, cmd, sizeof(cmd), buf, sizeof(buf), 6);
>
> Why is this timeout hardcoded? Due to the /* XXX */ comment?

This command is only invoked at startup and involves no I/O, so 6 seconds should be plenty.

Paolo
diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index e859534eaf..2959526b52 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2604,7 +2604,7 @@ static int get_device_type(SCSIDiskState *s)
     cmd[4] = sizeof(buf);

     ret = scsi_SG_IO_FROM_DEV(s->qdev.conf.blk, cmd, sizeof(cmd),
-                              buf, sizeof(buf));
+                              buf, sizeof(buf), s->qdev.io_timeout);
     if (ret < 0) {
         return -1;
     }
@@ -2765,7 +2765,7 @@ static BlockAIOCB *scsi_block_do_sgio(SCSIBlockReq *req,
     /* The rest is as in scsi-generic.c. */
     io_header->mx_sb_len = sizeof(r->req.sense);
     io_header->sbp = r->req.sense;
-    io_header->timeout = UINT_MAX;
+    io_header->timeout = s->qdev.io_timeout * 1000;
     io_header->usr_ptr = r;
     io_header->flags |= SG_FLAG_DIRECT_IO;

@@ -3083,6 +3083,8 @@ static Property scsi_block_properties[] = {
                        DEFAULT_MAX_IO_SIZE),
     DEFINE_PROP_INT32("scsi_version", SCSIDiskState, qdev.default_scsi_version,
                       -1),
+    DEFINE_PROP_UINT32("io_timeout", SCSIDiskState, qdev.io_timeout,
+                       DEFAULT_IO_TIMEOUT),
     DEFINE_PROP_END_OF_LIST(),
 };

diff --git a/hw/scsi/scsi-generic.c b/hw/scsi/scsi-generic.c
index 2cb23ca891..e07924b3d7 100644
--- a/hw/scsi/scsi-generic.c
+++ b/hw/scsi/scsi-generic.c
@@ -114,6 +114,8 @@ static int execute_command(BlockBackend *blk,
                            SCSIGenericReq *r, int direction,
                            BlockCompletionFunc *complete)
 {
+    SCSIDevice *s = r->req.dev;
+
     r->io_header.interface_id = 'S';
     r->io_header.dxfer_direction = direction;
     r->io_header.dxferp = r->buf;
@@ -122,7 +124,7 @@ static int execute_command(BlockBackend *blk,
     r->io_header.cmd_len = r->req.cmd.len;
     r->io_header.mx_sb_len = sizeof(r->req.sense);
     r->io_header.sbp = r->req.sense;
-    r->io_header.timeout = MAX_UINT;
+    r->io_header.timeout = s->io_timeout * 1000;
     r->io_header.usr_ptr = r;
     r->io_header.flags |= SG_FLAG_DIRECT_IO;

@@ -505,7 +507,7 @@ static int read_naa_id(const uint8_t *p, uint64_t *p_wwn)
 }

 int scsi_SG_IO_FROM_DEV(BlockBackend *blk, uint8_t *cmd, uint8_t cmd_size,
-                        uint8_t *buf, uint8_t buf_size)
+                        uint8_t *buf, uint8_t buf_size, uint32_t timeout)
 {
     sg_io_hdr_t io_header;
     uint8_t sensebuf[8];
@@ -520,7 +522,7 @@ int scsi_SG_IO_FROM_DEV(BlockBackend *blk, uint8_t *cmd, uint8_t cmd_size,
     io_header.cmd_len = cmd_size;
     io_header.mx_sb_len = sizeof(sensebuf);
     io_header.sbp = sensebuf;
-    io_header.timeout = 6000; /* XXX */
+    io_header.timeout = timeout * 1000;

     ret = blk_ioctl(blk, SG_IO, &io_header);
     if (ret < 0 || io_header.driver_status || io_header.host_status) {
@@ -550,7 +552,7 @@ static void scsi_generic_set_vpd_bl_emulation(SCSIDevice *s)
     cmd[4] = sizeof(buf);

     ret = scsi_SG_IO_FROM_DEV(s->conf.blk, cmd, sizeof(cmd),
-                              buf, sizeof(buf));
+                              buf, sizeof(buf), s->io_timeout);
     if (ret < 0) {
         /*
          * Do not assume anything if we can't retrieve the
@@ -586,7 +588,7 @@ static void scsi_generic_read_device_identification(SCSIDevice *s)
     cmd[4] = sizeof(buf);

     ret = scsi_SG_IO_FROM_DEV(s->conf.blk, cmd, sizeof(cmd),
-                              buf, sizeof(buf));
+                              buf, sizeof(buf), s->io_timeout);
     if (ret < 0) {
         return;
     }
@@ -637,7 +639,7 @@ static int get_stream_blocksize(BlockBackend *blk)
     cmd[0] = MODE_SENSE;
     cmd[4] = sizeof(buf);

-    ret = scsi_SG_IO_FROM_DEV(blk, cmd, sizeof(cmd), buf, sizeof(buf));
+    ret = scsi_SG_IO_FROM_DEV(blk, cmd, sizeof(cmd), buf, sizeof(buf), 6);
     if (ret < 0) {
         return -1;
     }
@@ -727,6 +729,7 @@ static void scsi_generic_realize(SCSIDevice *s, Error **errp)

     /* Only used by scsi-block, but initialize it nevertheless to be clean. */
     s->default_scsi_version = -1;
+    s->io_timeout = DEFAULT_IO_TIMEOUT;
     scsi_generic_read_device_inquiry(s);
 }

@@ -750,6 +753,8 @@ static SCSIRequest *scsi_new_request(SCSIDevice *d, uint32_t tag, uint32_t lun,
 static Property scsi_generic_properties[] = {
     DEFINE_PROP_DRIVE("drive", SCSIDevice, conf.blk),
     DEFINE_PROP_BOOL("share-rw", SCSIDevice, conf.share_rw, false),
+    DEFINE_PROP_UINT32("io_timeout", SCSIDevice, io_timeout,
+                       DEFAULT_IO_TIMEOUT),
     DEFINE_PROP_END_OF_LIST(),
 };

diff --git a/include/hw/scsi/scsi.h b/include/hw/scsi/scsi.h
index 7a55cdbd74..21a6249743 100644
--- a/include/hw/scsi/scsi.h
+++ b/include/hw/scsi/scsi.h
@@ -18,6 +18,7 @@ typedef struct SCSIReqOps SCSIReqOps;

 #define SCSI_SENSE_BUF_SIZE_OLD 96
 #define SCSI_SENSE_BUF_SIZE 252
+#define DEFAULT_IO_TIMEOUT 30

 struct SCSIRequest {
     SCSIBus *bus;
@@ -84,6 +85,7 @@ struct SCSIDevice
     uint64_t port_wwn;
     int scsi_version;
     int default_scsi_version;
+    uint32_t io_timeout;
     bool needs_vpd_bl_emulation;
     bool hba_supports_iothread;
 };
@@ -188,7 +190,7 @@ void scsi_device_unit_attention_reported(SCSIDevice *dev);
 void scsi_generic_read_device_inquiry(SCSIDevice *dev);
 int scsi_device_get_sense(SCSIDevice *dev, uint8_t *buf, int len, bool fixed);
 int scsi_SG_IO_FROM_DEV(BlockBackend *blk, uint8_t *cmd, uint8_t cmd_size,
-                        uint8_t *buf, uint8_t buf_size);
+                        uint8_t *buf, uint8_t buf_size, uint32_t timeout);
 SCSIDevice *scsi_device_find(SCSIBus *bus, int channel, int target, int lun);

 /* scsi-generic.c. */
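
One detail of the `timeout * 1000` conversion in the diff: `io_timeout` is a 32-bit value in seconds and the SG_IO header takes milliseconds, so any setting above 4294967 seconds would wrap around when multiplied. A minimal sketch of a saturating conversion follows, assuming a hypothetical helper `io_timeout_to_ms` that is not part of this patch:

```c
#include <stdint.h>

/*
 * Hypothetical helper (not in the patch): convert a timeout in seconds to
 * the milliseconds used by the SG_IO header, clamping to UINT32_MAX
 * instead of wrapping around when the multiplication would overflow.
 */
static uint32_t io_timeout_to_ms(uint32_t seconds)
{
    if (seconds > UINT32_MAX / 1000) {
        return UINT32_MAX;
    }
    return seconds * 1000;
}
```

For the values used in the patch, nothing changes: 30 becomes 30000 and 6 becomes 6000. Only very large user-supplied settings would be affected, and they would then map to the same maximum value the old code used as its effectively infinite timeout.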
The current code sets an infinite timeout on SG_IO requests,
causing the guest to stall if the host experiences a frame
loss.
This patch adds an 'io_timeout' parameter for SCSIDevice to
make the SG_IO timeout configurable, and also shortens the
default timeout to 30 seconds to avoid infinite stalls.

Signed-off-by: Hannes Reinecke <hare@suse.de>
---
 hw/scsi/scsi-disk.c    |  6 ++++--
 hw/scsi/scsi-generic.c | 17 +++++++++++------
 include/hw/scsi/scsi.h |  4 +++-
 3 files changed, 18 insertions(+), 9 deletions(-)