Message ID | 20220928180603.101533-1-venu.busireddy@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v3] virtio-scsi: Send "REPORTED LUNS CHANGED" sense data upon disk hotplug events. | expand |
On Wed, Sep 28, 2022 at 8:06 PM Venu Busireddy <venu.busireddy@oracle.com> wrote: > > Section 5.6.6.3 of VirtIO specification states, "Events will also > be reported via sense codes..." However, no sense data is sent when > VIRTIO_SCSI_EVT_RESET_RESCAN or VIRTIO_SCSI_EVT_RESET_REMOVED events > are reported (when disk hotplug/hotunplug events occur). SCSI layer > on Solaris depends on this sense data, and hence does not handle disk > hotplug/hotunplug events. > > When disk inventory changes, return a CHECK_CONDITION status with sense > data of 0x06/0x3F/0x0E (sense code REPORTED_LUNS_CHANGED), as per the > specifications in Section 5.14 (h) of SAM-4. > > Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com> > > v2 -> v3: > - Implement the suggestion from Paolo Bonzini <pbonzini@redhat.com>. > > v1 -> v2: > - Send the sense data for VIRTIO_SCSI_EVT_RESET_REMOVED event too. > --- > hw/scsi/scsi-bus.c | 1 + > hw/scsi/virtio-scsi.c | 16 ++++++++++++++++ > include/hw/scsi/scsi.h | 6 ++++++ > include/hw/virtio/virtio-scsi.h | 1 + > 4 files changed, 24 insertions(+) > > diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c > index 4403717c4aaf..b7cb249f2eab 100644 > --- a/hw/scsi/scsi-bus.c > +++ b/hw/scsi/scsi-bus.c > @@ -730,6 +730,7 @@ SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun, > */ > !(buf[0] == REQUEST_SENSE && d->sense_is_ua))) { > ops = &reqops_unit_attention; > + d->clear_reported_luns_changed = true; Any reason to have this flag, and not just clear s->reported_luns_changed after scsi_req_new? Is it to handle the invalid opcode case? I just reread the code and noticed that there is also a *bus* unit attention mechanism, which is unused but seems perfect for this usecase. The first device on the bus to execute a command successfully will consume it. You need something like diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c index 4403717c4a..78274e8477 100644 --- a/hw/scsi/scsi-bus.c +++ b/hw/scsi/scsi-bus.c @@ -1616,6 +1616,24 @@ static int scsi_ua_precedence(SCSISense sense) return (sense.asc << 8) | sense.ascq; } +void scsi_bus_set_ua(SCSIBus *bus, SCSISense sense) +{ + int prec1, prec2; + if (sense.key != UNIT_ATTENTION) { + return; + } + + /* + * Override a pre-existing unit attention condition, except for a more + * important reset condition. + */ + prec1 = scsi_ua_precedence(bus->unit_attention); + prec2 = scsi_ua_precedence(sense); + if (prec2 < prec1) { + bus->unit_attention = sense; + } +} + void scsi_device_set_ua(SCSIDevice *sdev, SCSISense sense) { int prec1, prec2; diff --git a/include/hw/scsi/scsi.h b/include/hw/scsi/scsi.h index 88e1a48343..0c86d0359f 100644 --- a/include/hw/scsi/scsi.h +++ b/include/hw/scsi/scsi.h @@ -186,6 +186,7 @@ SCSIDevice *scsi_bus_legacy_add_drive( BlockdevOnError rerror, BlockdevOnError werror, const char *serial, Error **errp); +void scsi_bus_set_ua(SCSIBus *bus, SCSISense sense); void scsi_bus_legacy_handle_cmdline(SCSIBus *bus); void scsi_legacy_handle_cmdline(void); and if you call the new function in the plug/unplug callbacks it should just work. Paolo
On 2022-09-29 12:49:51 +0200, Paolo Bonzini wrote: > On Wed, Sep 28, 2022 at 8:06 PM Venu Busireddy > <venu.busireddy@oracle.com> wrote: > > > > Section 5.6.6.3 of VirtIO specification states, "Events will also > > be reported via sense codes..." However, no sense data is sent when > > VIRTIO_SCSI_EVT_RESET_RESCAN or VIRTIO_SCSI_EVT_RESET_REMOVED events > > are reported (when disk hotplug/hotunplug events occur). SCSI layer > > on Solaris depends on this sense data, and hence does not handle disk > > hotplug/hotunplug events. > > > > When disk inventory changes, return a CHECK_CONDITION status with sense > > data of 0x06/0x3F/0x0E (sense code REPORTED_LUNS_CHANGED), as per the > > specifications in Section 5.14 (h) of SAM-4. > > > > Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com> > > > > v2 -> v3: > > - Implement the suggestion from Paolo Bonzini <pbonzini@redhat.com>. > > > > v1 -> v2: > > - Send the sense data for VIRTIO_SCSI_EVT_RESET_REMOVED event too. > > --- > > hw/scsi/scsi-bus.c | 1 + > > hw/scsi/virtio-scsi.c | 16 ++++++++++++++++ > > include/hw/scsi/scsi.h | 6 ++++++ > > include/hw/virtio/virtio-scsi.h | 1 + > > 4 files changed, 24 insertions(+) > > > > diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c > > index 4403717c4aaf..b7cb249f2eab 100644 > > --- a/hw/scsi/scsi-bus.c > > +++ b/hw/scsi/scsi-bus.c > > @@ -730,6 +730,7 @@ SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun, > > */ > > !(buf[0] == REQUEST_SENSE && d->sense_is_ua))) { > > ops = &reqops_unit_attention; > > + d->clear_reported_luns_changed = true; > > Any reason to have this flag, and not just clear > s->reported_luns_changed after scsi_req_new? Is it to handle the > invalid opcode case? Immediately after a hotunplug event, qemu (without any action from the guest) processes a REPORT_LUNS command on the lun 0 of the device (haven't figured out what causes this). If we unconditionally clear the s->reported_luns_changed flag right after calling scsi_req_new(), the action taken in scsi_device_set_ua() is undone by the eventual call to scsi_clear_unit_attention(). Here is the sequence of the events: (Note: SCSIDevice = 0x7ff180005010 is lun 1, and SCSIDevice = 0x557fed88fd40 is lun 0) virtio_scsi_hotunplug(): Entered, reported_luns_changed = 0, VirtIOSCSI = 0x557feda9f750, SCSIDevice = 0x7ff180005010, bus = 0x557feda9f9c0 virtio_scsi_hotunplug(): Exiting, reported_luns_changed = 1, VirtIOSCSI = 0x557feda9f750, SCSIDevice = 0x7ff180005010, bus = 0x557feda9f9c0 virtio_scsi_handle_cmd_req_prepare(): Entered, reported_luns_changed = 1, VirtIOSCSI = 0x557feda9f750, SCSIDevice = 0x557fed88fd40, cdb[0] = 0xa0 scsi_device_set_ua(): Entered, SCSIDevice = 0x557fed88fd40 scsi_device_set_ua(): prec1 = 0x7fffffff, sdev->key = 0, sdev->asc = 0x00, sdev->ascq = 0x00 scsi_device_set_ua(): prec2 = 0x00003f0e, sense->key = 6, sense->asc = 0x3f, sense->ascq = 0x0e scsi_req_new(): SCSIDevice = 0x557fed88fd40, bus = 0x557feda9f9c0, buf[0] = a0 scsi_req_new(): sdev.key = 6, sdev.asc = 0x3f, sdev.ascq = 0x0e virtio_scsi_handle_cmd_req_prepare(): Exiting, reported_luns_changed = 0, VirtIOSCSI = 0x557feda9f750, SCSIDevice = 0x557fed88fd40, cdb[0] = 0xa0 scsi_clear_unit_attention(): Entered, buf[0] = 0xa0, SCSIDevice = 0x557fed88fd40, key = 6, asc = 0x3f, ascq = 0x0e scsi_clear_unit_attention(): Exiting, buf[0] = 0xa0, SCSIDevice = 0x557fed88fd40, key = 0, asc = 0x00, ascq = 0x00 As can be seen, before the guest does anything, we cleared the reported_luns_changed flag as well as the unit attention condition. Therefore, when the guest eventually sends a TEST_UNIT_READY command, we don't report anything back, as can be seen by the traces below: virtio_scsi_handle_cmd_req_prepare(): Entered, reported_luns_changed = 0, VirtIOSCSI = 0x557feda9f750, SCSIDevice = 0x557fed88fd40, cdb[0] = 0x00 scsi_req_new(): SCSIDevice = 0x557fed88fd40, bus = 0x557feda9f9c0, buf[0] = 00 scsi_req_new(): sdev.key = 0, sdev.asc = 0x00, sdev.ascq = 0x00 virtio_scsi_handle_cmd_req_prepare(): Exiting, reported_luns_changed = 0, VirtIOSCSI = 0x557feda9f750, SCSIDevice = 0x557fed88fd40, cdb[0] = 0x00 scsi_clear_unit_attention(): Entered, buf[0] = 0x00, SCSIDevice = 0x557fed88fd40, key = 0, asc = 0x00, ascq = 0x00 That is why we need the d->clear_reported_luns_changed flag, to know when we genuinely processed a command from the guest and only then clear the reported_luns_changed flag. > > I just reread the code and noticed that there is also a *bus* unit > attention mechanism, which is unused but seems perfect for this > usecase. The first device on the bus to execute a command successfully > will consume it. > > You need something like > > diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c > index 4403717c4a..78274e8477 100644 > --- a/hw/scsi/scsi-bus.c > +++ b/hw/scsi/scsi-bus.c > @@ -1616,6 +1616,24 @@ static int scsi_ua_precedence(SCSISense sense) > return (sense.asc << 8) | sense.ascq; > } > > +void scsi_bus_set_ua(SCSIBus *bus, SCSISense sense) > +{ > + int prec1, prec2; > + if (sense.key != UNIT_ATTENTION) { > + return; > + } > + > + /* > + * Override a pre-existing unit attention condition, except for a more > + * important reset condition. > + */ > + prec1 = scsi_ua_precedence(bus->unit_attention); > + prec2 = scsi_ua_precedence(sense); > + if (prec2 < prec1) { > + bus->unit_attention = sense; > + } > +} > + > void scsi_device_set_ua(SCSIDevice *sdev, SCSISense sense) > { > int prec1, prec2; I tried the above suggestion. Even with the new suggestion, we suffer the same fate as with v3, except in this case, we end up clearing the bus->unit_attention instead of device->unit_attention! Traces below: (Note: SCSIDevice = 0x7f2c54027790 is lun 1, and SCSIDevice = 0x5599135fbd40 is lun 0) virtio_scsi_hotunplug(): Entered, VirtIOSCSI = 0x55991380b750, SCSIDevice = 0x7f2c54027790, bus = 0x55991380b9c0 scsi_bus_set_ua(): Entered, bus = 0x55991380b9c0 scsi_bus_set_ua(): prec1 = 0x7fffffff, bus->key = 0, bus->asc = 0x00, bus->ascq = 0x00 scsi_bus_set_ua(): prec2 = 0x00003f0e, sense->key = 6, sense->asc = 0x3f, sense->ascq = 0x0e virtio_scsi_hotunplug(): Exiting virtio_scsi_handle_cmd_req_prepare(): Entered, VirtIOSCSI = 0x55991380b750, SCSIDevice = 0x5599135fbd40, cdb[0] = 0xa0 scsi_req_new(): SCSIDevice = 0x5599135fbd40, bus = 0x55991380b9c0, buf[0] = a0 scsi_req_new(): bus.key = 6, bus.asc = 0x3f, bus.ascq = 0x0e scsi_clear_unit_attention(): Entered, buf[0] = 0xa0, bus = 0x55991380b9c0, key = 6, asc = 0x3f, ascq = 0x0e scsi_clear_unit_attention(): Exiting, buf[0] = 0xa0, bus = 0x55991380b9c0, key = 0, asc = 0x00, ascq = 0x00 At the end of the above sequence, bus->unit_attention is cleared! After that, when a TEST_UNIT_READY command arrives from the guest, we do not report anything back, because no unit_attention is pending, as seen below: virtio_scsi_handle_cmd_req_prepare(): Entered, VirtIOSCSI = 0x55991380b750, SCSIDevice = 0x5599135fbd40, cdb[0] = 0x00 scsi_req_new(): SCSIDevice = 0x5599135fbd40, bus = 0x0x55991380b9c0, buf[0] = 00 scsi_req_new(): bus.key = 0, bus.asc = 0x00, bus.ascq = 0x00 scsi_clear_unit_attention(): Entered, buf[0] = 0x00, bus = 0x55991380b9c0, key = 0, asc = 0x00, ascq = 0x00 As a result, the guest does not see any REPORTED_LUNS_CHANGED sense data. > diff --git a/include/hw/scsi/scsi.h b/include/hw/scsi/scsi.h > index 88e1a48343..0c86d0359f 100644 > --- a/include/hw/scsi/scsi.h > +++ b/include/hw/scsi/scsi.h > @@ -186,6 +186,7 @@ SCSIDevice *scsi_bus_legacy_add_drive( > BlockdevOnError rerror, > BlockdevOnError werror, > const char *serial, Error **errp); > +void scsi_bus_set_ua(SCSIBus *bus, SCSISense sense); > void scsi_bus_legacy_handle_cmdline(SCSIBus *bus); > void scsi_legacy_handle_cmdline(void); > > > and if you call the new function in the plug/unplug callbacks it > should just work. What shall we do? Keep the d->clear_reported_luns_changed? Or, is there a better way to define/handle that flag? Regards, Venu > > Paolo >
On Fri, Sep 30, 2022 at 12:31 AM Venu Busireddy <venu.busireddy@oracle.com> wrote: > > > */ > > > !(buf[0] == REQUEST_SENSE && d->sense_is_ua))) { > > > ops = &reqops_unit_attention; > > > + d->clear_reported_luns_changed = true; > > > > Any reason to have this flag, and not just clear > > s->reported_luns_changed after scsi_req_new? Is it to handle the > > invalid opcode case? > > Immediately after a hotunplug event, qemu (without any action from > the guest) processes a REPORT_LUNS command on the lun 0 of the device > (haven't figured out what causes this). There is only one call to virtio_scsi_handle_cmd_req_prepare and it takes the command from the guest, are you sure it is without any action from the guest? > scsi_req_new(): SCSIDevice = 0x557fed88fd40, bus = 0x557feda9f9c0, buf[0] = a0 > scsi_req_new(): sdev.key = 6, sdev.asc = 0x3f, sdev.ascq = 0x0e > virtio_scsi_handle_cmd_req_prepare(): Exiting, reported_luns_changed = 0, VirtIOSCSI = 0x557feda9f750, SCSIDevice = 0x557fed88fd40, cdb[0] = 0xa0 > scsi_clear_unit_attention(): Entered, buf[0] = 0xa0, SCSIDevice = 0x557fed88fd40, key = 6, asc = 0x3f, ascq = 0x0e > scsi_clear_unit_attention(): Exiting, buf[0] = 0xa0, SCSIDevice = 0x557fed88fd40, key = 0, asc = 0x00, ascq = 0x00 > > As can be seen, before the guest does anything, we cleared the > reported_luns_changed flag as well as the unit attention condition. It is correct that REPORT LUNS clears the unit attention: /* * If a REPORT LUNS command enters the enabled command state, [...] * the device server shall clear any pending unit attention condition * with an additional sense code of REPORTED LUNS DATA HAS CHANGED. */ if (req->cmd.buf[0] == REPORT_LUNS && !(ua->asc == SENSE_CODE(REPORTED_LUNS_CHANGED).asc && ua->ascq == SENSE_CODE(REPORTED_LUNS_CHANGED).ascq)) { return; } Paolo
On 2022-09-30 10:41:03 +0200, Paolo Bonzini wrote: > On Fri, Sep 30, 2022 at 12:31 AM Venu Busireddy > <venu.busireddy@oracle.com> wrote: > > > > */ > > > > !(buf[0] == REQUEST_SENSE && d->sense_is_ua))) { > > > > ops = &reqops_unit_attention; > > > > + d->clear_reported_luns_changed = true; > > > > > > Any reason to have this flag, and not just clear > > > s->reported_luns_changed after scsi_req_new? Is it to handle the > > > invalid opcode case? > > > > Immediately after a hotunplug event, qemu (without any action from > > the guest) processes a REPORT_LUNS command on the lun 0 of the device > > (haven't figured out what causes this). > > There is only one call to virtio_scsi_handle_cmd_req_prepare and it > takes the command from the guest, are you sure it is without any > action from the guest? I am sure, based on what I am observing. I am running the scsitrace (scsitrace -n vtioscsi -v) command on the Solaris guest, and I see no output there. I do see output when I run any scsi related commands (such as sg_luns, sg_raw, etc) on the guest. But no output when I hotunplug the disk, either while virtio_scsi_handle_cmd_req_prepare() is running or after that, until I run any sg_* commands on the guest later. However, for whatever it's worth, if I have two or more luns on a virtio-scsi adapter, the spurious REPORT_LUNS processing (virtio_scsi_handle_cmd_req_prepare() call) occurs only when I hotunplug a lun while the other luns are still plugged in, until the last lun is unplugged. I do not see the spurious call to virtio_scsi_handle_cmd_req_prepare() when the last lun is unplugged, whether that was the only lun present, or if it was the last of many. Venu
On Fri, Sep 30, 2022 at 4:42 PM Venu Busireddy <venu.busireddy@oracle.com> wrote: > > > Immediately after a hotunplug event, qemu (without any action from > > > the guest) processes a REPORT_LUNS command on the lun 0 of the device > > > (haven't figured out what causes this). > > > > There is only one call to virtio_scsi_handle_cmd_req_prepare and it > > takes the command from the guest, are you sure it is without any > > action from the guest? > > I am sure, based on what I am observing. I am running the scsitrace > (scsitrace -n vtioscsi -v) command on the Solaris guest, and I see no > output there. Do you have the sources to the driver and/or to the scsitrace dtrace script? Something must be putting the SCSI command in the queue. Perhaps the driver is doing so when it sees an event? And if it is bypassing the normal submission mechanism, the REPORT LUNS commands is hidden in scsitrac; that in turn retruns a unit attention and steals it from the other commands such as TEST UNIT READY, but that's a guest driver bug. But QEMU cannot just return the unit attention twice. I would start with the patch to use the bus unit attention mechanism. It would be even better to have two unit tests that check the behavior prescribed by the standard: 1) UNIT ATTENTION from TEST UNIT READY immediately after a hotunplug notification; 2) no UNIT ATTENTION from REPORT LUNS and also no UNIT ATTENTION from a subsequent TEST UNIT READY command. Debugging the guest is a separate step. Paolo > However, for whatever it's worth, if I have two or more luns > on a virtio-scsi adapter, the spurious REPORT_LUNS processing > (virtio_scsi_handle_cmd_req_prepare() call) occurs only when > I hotunplug a lun while the other luns are still plugged in, > until the last lun is unplugged. I do not see the spurious call to > virtio_scsi_handle_cmd_req_prepare() when the last lun is unplugged, > whether that was the only lun present, or if it was the last of many. > > Venu >
On 2022-09-30 18:25:48 +0200, Paolo Bonzini wrote: > On Fri, Sep 30, 2022 at 4:42 PM Venu Busireddy > <venu.busireddy@oracle.com> wrote: > > > > Immediately after a hotunplug event, qemu (without any action from > > > > the guest) processes a REPORT_LUNS command on the lun 0 of the device > > > > (haven't figured out what causes this). > > > > > > There is only one call to virtio_scsi_handle_cmd_req_prepare and it > > > takes the command from the guest, are you sure it is without any > > > action from the guest? > > > > I am sure, based on what I am observing. I am running the scsitrace > > (scsitrace -n vtioscsi -v) command on the Solaris guest, and I see no > > output there. > > Do you have the sources to the driver and/or to the scsitrace dtrace I do not have access to the source code. I am working on gaining access. > script? Something must be putting the SCSI command in the queue. > Perhaps the driver is doing so when it sees an event? And if it is > bypassing the normal submission mechanism, the REPORT LUNS commands is > hidden in scsitrac; that in turn retruns a unit attention and steals While SAM does say "if a REPORT LUNS command enters the enabled command state, the device server shall process the REPORT LUNS command and shall not report any unit attention condition;," it also says that the unit attention condition will not be cleared if the UA_INTLCK_CTRL is set to 10b or 11b in the "Control mode page." It doesn't appear to me that virtio-scsi supports "Control mode pages." Does it? If it doesn't, is the expected handling of REPORT LUNS command be same as the case of UA_INTLCK_CTRL being set to 00b? And while trying to understand this, and reading the code regarding the handling of UA_INTLCK_CTRL, I ran across the following comment in scsi_req_get_sense(): /* * FIXME: clearing unit attention conditions upon autosense should be done * only if the UA_INTLCK_CTRL field in the Control mode page is set to 00b * (SAM-5, 5.14). * * We assume UA_INTLCK_CTRL to be 00b for HBAs that support autosense, and * 10b for HBAs that do not support it (do not call scsi_req_get_sense). * Here we handle unit attention clearing for UA_INTLCK_CTRL == 00b. */ If virtio-scsi doesn't support "Control mode pages," why does the above comment even say "assume UA_INTLCK_CTRL to be 00b" or address the case of 10b? Also, other than the reference to it in the above comment, UA_INTLCK_CTRL is not used anywhere else in the code. This comment confused me. Is the comment just wrong, or am I missing something? I am just trying to understand this better so that I am better prepared when the client driver folks start asking me questions about the qemu support. Venu > it from the other commands such as TEST UNIT READY, but that's a guest > driver bug. > > But QEMU cannot just return the unit attention twice. I would start > with the patch to use the bus unit attention mechanism. It would be > even better to have two unit tests that check the behavior prescribed > by the standard: 1) UNIT ATTENTION from TEST UNIT READY immediately > after a hotunplug notification; 2) no UNIT ATTENTION from REPORT LUNS > and also no UNIT ATTENTION from a subsequent TEST UNIT READY command. > Debugging the guest is a separate step.
On 2022-10-03 18:13:06 -0500, Venu Busireddy wrote: > On 2022-09-30 18:25:48 +0200, Paolo Bonzini wrote: > > On Fri, Sep 30, 2022 at 4:42 PM Venu Busireddy > > <venu.busireddy@oracle.com> wrote: > > > > > Immediately after a hotunplug event, qemu (without any action from > > > > > the guest) processes a REPORT_LUNS command on the lun 0 of the device > > > > > (haven't figured out what causes this). > > > > > > > > There is only one call to virtio_scsi_handle_cmd_req_prepare and it > > > > takes the command from the guest, are you sure it is without any > > > > action from the guest? > > > > > > I am sure, based on what I am observing. I am running the scsitrace > > > (scsitrace -n vtioscsi -v) command on the Solaris guest, and I see no > > > output there. > > > > Do you have the sources to the driver and/or to the scsitrace dtrace > > I do not have access to the source code. I am working on gaining access. > > > script? Something must be putting the SCSI command in the queue. > > Perhaps the driver is doing so when it sees an event? And if it is > > bypassing the normal submission mechanism, the REPORT LUNS commands is > > hidden in scsitrac; that in turn retruns a unit attention and steals > > While SAM does say "if a REPORT LUNS command enters the enabled command > state, the device server shall process the REPORT LUNS command and shall > not report any unit attention condition;," it also says that the unit > attention condition will not be cleared if the UA_INTLCK_CTRL is set to > 10b or 11b in the "Control mode page." > > It doesn't appear to me that virtio-scsi supports "Control mode pages." Just to clarify, I am referring the mode pages with page code 0x0a (and any subpage codes). > Does it? If it doesn't, is the expected handling of REPORT LUNS command > be same as the case of UA_INTLCK_CTRL being set to 00b? > > And while trying to understand this, and reading the code regarding > the handling of UA_INTLCK_CTRL, I ran across the following comment in > scsi_req_get_sense(): > > /* > * FIXME: clearing unit attention conditions upon autosense should be done > * only if the UA_INTLCK_CTRL field in the Control mode page is set to 00b > * (SAM-5, 5.14). > * > * We assume UA_INTLCK_CTRL to be 00b for HBAs that support autosense, and > * 10b for HBAs that do not support it (do not call scsi_req_get_sense). > * Here we handle unit attention clearing for UA_INTLCK_CTRL == 00b. > */ > > If virtio-scsi doesn't support "Control mode pages," why does the above > comment even say "assume UA_INTLCK_CTRL to be 00b" or address the case > of 10b? Also, other than the reference to it in the above comment, > UA_INTLCK_CTRL is not used anywhere else in the code. This comment > confused me. Is the comment just wrong, or am I missing something? I am > just trying to understand this better so that I am better prepared when > the client driver folks start asking me questions about the qemu support. > > Venu > > > it from the other commands such as TEST UNIT READY, but that's a guest > > driver bug. > > > > But QEMU cannot just return the unit attention twice. I would start > > with the patch to use the bus unit attention mechanism. It would be > > even better to have two unit tests that check the behavior prescribed > > by the standard: 1) UNIT ATTENTION from TEST UNIT READY immediately > > after a hotunplug notification; 2) no UNIT ATTENTION from REPORT LUNS > > and also no UNIT ATTENTION from a subsequent TEST UNIT READY command. > > Debugging the guest is a separate step.
On 10/4/22 01:13, Venu Busireddy wrote: >> script? Something must be putting the SCSI command in the queue. >> Perhaps the driver is doing so when it sees an event? And if it is >> bypassing the normal submission mechanism, the REPORT LUNS commands is >> hidden in scsitrac; that in turn retruns a unit attention and steals > > While SAM does say "if a REPORT LUNS command enters the enabled command > state, the device server shall process the REPORT LUNS command and shall > not report any unit attention condition;," it also says that the unit > attention condition will not be cleared if the UA_INTLCK_CTRL is set to > 10b or 11b in the "Control mode page." > > It doesn't appear to me that virtio-scsi supports "Control mode pages." > Does it? If it doesn't, is the expected handling of REPORT LUNS command > be same as the case of UA_INTLCK_CTRL being set to 00b? In QEMU, all HBAs except for esp.c and lsi53c895a.c support autosense. As in the comment below, 00b is the right value for virtio-scsi. The code to build the 0Ah (control) mode page would be in scsi-disk.c for example. Nobody ever wrote it because the values mentioned in the comment below (00b if HBA supports autosense and therefore calls scsi_req_get_sense; 10b for HBAs with no autosense, typically very old emulated parallel-SCSI hardware) are the ones that make the most sense and OSes will just assume them. 00b is also the default UA_INTLCK_CTRL value, so the mode page is not needed at all for virtio-scsi. Paolo > If virtio-scsi doesn't support "Control mode pages," why does the above > comment even say "assume UA_INTLCK_CTRL to be 00b" or address the case > of 10b? Also, other than the reference to it in the above comment, > UA_INTLCK_CTRL is not used anywhere else in the code. This comment > confused me. Is the comment just wrong, or am I missing something? I am > just trying to understand this better so that I am better prepared when > the client driver folks start asking me questions about the qemu support. > > Venu > >> it from the other commands such as TEST UNIT READY, but that's a guest >> driver bug. >> >> But QEMU cannot just return the unit attention twice. I would start >> with the patch to use the bus unit attention mechanism. It would be >> even better to have two unit tests that check the behavior prescribed >> by the standard: 1) UNIT ATTENTION from TEST UNIT READY immediately >> after a hotunplug notification; 2) no UNIT ATTENTION from REPORT LUNS >> and also no UNIT ATTENTION from a subsequent TEST UNIT READY command. >> Debugging the guest is a separate step. >
On 2022-10-05 23:37:33 +0200, Paolo Bonzini wrote: > On 10/4/22 01:13, Venu Busireddy wrote: > > > script? Something must be putting the SCSI command in the queue. > > > Perhaps the driver is doing so when it sees an event? And if it is > > > bypassing the normal submission mechanism, the REPORT LUNS commands is > > > hidden in scsitrac; that in turn retruns a unit attention and steals > > > > While SAM does say "if a REPORT LUNS command enters the enabled command > > state, the device server shall process the REPORT LUNS command and shall > > not report any unit attention condition;," it also says that the unit > > attention condition will not be cleared if the UA_INTLCK_CTRL is set to > > 10b or 11b in the "Control mode page." > > > > It doesn't appear to me that virtio-scsi supports "Control mode pages." > > Does it? If it doesn't, is the expected handling of REPORT LUNS command > > be same as the case of UA_INTLCK_CTRL being set to 00b? > > In QEMU, all HBAs except for esp.c and lsi53c895a.c support autosense. As in > the comment below, 00b is the right value for virtio-scsi. > > The code to build the 0Ah (control) mode page would be in scsi-disk.c for > example. Nobody ever wrote it because the values mentioned in the comment > below (00b if HBA supports autosense and therefore calls scsi_req_get_sense; > 10b for HBAs with no autosense, typically very old emulated parallel-SCSI > hardware) are the ones that make the most sense and OSes will just assume > them. > > 00b is also the default UA_INTLCK_CTRL value, so the mode page is not needed > at all for virtio-scsi. I do see that the Solaris driver does send the 0x1a command during the initialization, perhaps (?) seeking the value of UA_INTLCK_CTRL. Since QEMU currently does not support it, QEMU sends back a key/asc/ascq=0x05/0x24/0x00 response, indicating that 0x1a is an Illegal Request. I am assuming that the Solaris driver does not handle that response well (I still don't have access to the source code to verify that), confuses itself about the value of UA_INTLCK_CTRL, and hence does not handle the response to the REPORT_LUNS command correctly. Maybe the Solaris driver assumes that QEMU will retain the unit attention condition (UA_INTLCK_CTRL = 10b?), and will respond with a REPORTED_LUNS_CHANGED for a subsequent command? Based on your confirmation that we want to handle the REPORT_LUNS command as if UA_INTLCK_CTRL is set to 0, I will proceed with the assumption that the Solaris driver is at fault, and will work with the Solaris driver folks. In the meantime, as you suggested, I will post v4 with the bus unit attention mechanism implemented. We still need that. Venu
Il gio 6 ott 2022, 15:25 Venu Busireddy <venu.busireddy@oracle.com> ha scritto: > I do see that the Solaris driver does send the 0x1a command during > the initialization, perhaps (?) seeking the value of UA_INTLCK_CTRL. > Since QEMU currently does not support it, QEMU sends back a > key/asc/ascq=0x05/0x24/0x00 response, indicating that 0x1a is an Illegal > Request. What is your QEMU command line and what is the full CDB (apart from 0x1a)? I am assuming that the Solaris driver does not handle that > response well (I still don't have access to the source code to verify > that), confuses itself about the value of UA_INTLCK_CTRL, and hence does > not handle the response to the REPORT_LUNS command correctly. No this has nothing to do with what's happening. The most likely reason for the bug IMO is simple: the event is causing the driver to send the REPORT LUNS command, but it does so in a way that does not handle the unit attention when it is reported. Paolo Maybe the > Solaris driver assumes that QEMU will retain the unit attention condition > (UA_INTLCK_CTRL = 10b?), and will respond with a REPORTED_LUNS_CHANGED > for a subsequent command? > > Based on your confirmation that we want to handle the REPORT_LUNS command > as if UA_INTLCK_CTRL is set to 0, I will proceed with the assumption > that the Solaris driver is at fault, and will work with the Solaris > driver folks. > > In the meantime, as you suggested, I will post v4 with the bus unit > attention mechanism implemented. We still need that. > > Venu > >
On 2022-10-07 06:55:15 -0400, Paolo Bonzini wrote: > Il gio 6 ott 2022, 15:25 Venu Busireddy <venu.busireddy@oracle.com> ha > scritto: > > > I do see that the Solaris driver does send the 0x1a command during > > the initialization, perhaps (?) seeking the value of UA_INTLCK_CTRL. > > Since QEMU currently does not support it, QEMU sends back a > > key/asc/ascq=0x05/0x24/0x00 response, indicating that 0x1a is an Illegal > > Request. > > > What is your QEMU command line and what is the full CDB (apart from 0x1a)? > > I am assuming that the Solaris driver does not handle that > > response well (I still don't have access to the source code to verify > > that), confuses itself about the value of UA_INTLCK_CTRL, and hence does > > not handle the response to the REPORT_LUNS command correctly. > > > No this has nothing to do with what's happening. The most likely reason for > the bug IMO is simple: the event is causing the driver to send the REPORT > LUNS command, but it does so in a way that does not handle the unit > attention when it is reported. I had a developer with access to the Solaris code review how the response to REPORT_LUNS is being handled. And they do see that the response to REPORT_LUNS is mishandled. With the fix proposed in v4, and fixing the handling of REPORT_LUNS on the Solaris side, we believe we will have a complete working solution. Therefore, I believe we can conclude this thread on v3. Do you agree? Venu > > Maybe the > > Solaris driver assumes that QEMU will retain the unit attention condition > > (UA_INTLCK_CTRL = 10b?), and will respond with a REPORTED_LUNS_CHANGED > > for a subsequent command? > > > > Based on your confirmation that we want to handle the REPORT_LUNS command > > as if UA_INTLCK_CTRL is set to 0, I will proceed with the assumption > > that the Solaris driver is at fault, and will work with the Solaris > > driver folks. > > > > In the meantime, as you suggested, I will post v4 with the bus unit > > attention mechanism implemented. We still need that. > > > > Venu > > > >
diff --git a/hw/scsi/scsi-bus.c b/hw/scsi/scsi-bus.c index 4403717c4aaf..b7cb249f2eab 100644 --- a/hw/scsi/scsi-bus.c +++ b/hw/scsi/scsi-bus.c @@ -730,6 +730,7 @@ SCSIRequest *scsi_req_new(SCSIDevice *d, uint32_t tag, uint32_t lun, */ !(buf[0] == REQUEST_SENSE && d->sense_is_ua))) { ops = &reqops_unit_attention; + d->clear_reported_luns_changed = true; } else if (lun != d->lun || buf[0] == REPORT_LUNS || (buf[0] == REQUEST_SENSE && d->sense_len)) { diff --git a/hw/scsi/virtio-scsi.c b/hw/scsi/virtio-scsi.c index 41f2a5630173..b7f66d366fcb 100644 --- a/hw/scsi/virtio-scsi.c +++ b/hw/scsi/virtio-scsi.c @@ -695,9 +695,23 @@ static int virtio_scsi_handle_cmd_req_prepare(VirtIOSCSI *s, VirtIOSCSIReq *req) return -ENOENT; } virtio_scsi_ctx_check(s, d); + + if (s->reported_luns_changed) { + scsi_device_set_ua(d, SENSE_CODE(REPORTED_LUNS_CHANGED)); + } + + /* + * set d->clear_reported_luns_changed. + * scsi_req_new() will clear it if the required conditions are met. + */ + d->clear_reported_luns_changed = false; req->sreq = scsi_req_new(d, req->req.cmd.tag, virtio_scsi_get_lun(req->req.cmd.lun), req->req.cmd.cdb, vs->cdb_size, req); + if (d->clear_reported_luns_changed) { + s->reported_luns_changed = false; + d->clear_reported_luns_changed = false; + } if (req->sreq->cmd.mode != SCSI_XFER_NONE && (req->sreq->cmd.mode != req->mode || @@ -956,6 +970,7 @@ static void virtio_scsi_hotplug(HotplugHandler *hotplug_dev, DeviceState *dev, virtio_scsi_push_event(s, sd, VIRTIO_SCSI_T_TRANSPORT_RESET, VIRTIO_SCSI_EVT_RESET_RESCAN); + s->reported_luns_changed = true; virtio_scsi_release(s); } } @@ -973,6 +988,7 @@ static void virtio_scsi_hotunplug(HotplugHandler *hotplug_dev, DeviceState *dev, virtio_scsi_push_event(s, sd, VIRTIO_SCSI_T_TRANSPORT_RESET, VIRTIO_SCSI_EVT_RESET_REMOVED); + s->reported_luns_changed = true; virtio_scsi_release(s); } diff --git a/include/hw/scsi/scsi.h b/include/hw/scsi/scsi.h index 001103488c23..ea81c6f89e74 100644 --- a/include/hw/scsi/scsi.h +++ b/include/hw/scsi/scsi.h @@ -89,6 +89,12 @@ struct SCSIDevice uint32_t io_timeout; bool needs_vpd_bl_emulation; bool hba_supports_iothread; + /* + * clear_reported_luns_changed is used, if the required + * conditions are met, to inform the virtio-scsi layer that + * any pending REPORTED_LUNS_CHANGED condition can be cleared. + */ + bool clear_reported_luns_changed; }; extern const VMStateDescription vmstate_scsi_device; diff --git a/include/hw/virtio/virtio-scsi.h b/include/hw/virtio/virtio-scsi.h index a36aad9c8695..efbcf9ba069a 100644 --- a/include/hw/virtio/virtio-scsi.h +++ b/include/hw/virtio/virtio-scsi.h @@ -81,6 +81,7 @@ struct VirtIOSCSI { SCSIBus bus; int resetting; bool events_dropped; + bool reported_luns_changed; /* Fields for dataplane below */ AioContext *ctx; /* one iothread per virtio-scsi-pci for now */
Section 5.6.6.3 of VirtIO specification states, "Events will also be reported via sense codes..." However, no sense data is sent when VIRTIO_SCSI_EVT_RESET_RESCAN or VIRTIO_SCSI_EVT_RESET_REMOVED events are reported (when disk hotplug/hotunplug events occur). SCSI layer on Solaris depends on this sense data, and hence does not handle disk hotplug/hotunplug events. When disk inventory changes, return a CHECK_CONDITION status with sense data of 0x06/0x3F/0x0E (sense code REPORTED_LUNS_CHANGED), as per the specifications in Section 5.14 (h) of SAM-4. Signed-off-by: Venu Busireddy <venu.busireddy@oracle.com> v2 -> v3: - Implement the suggestion from Paolo Bonzini <pbonzini@redhat.com>. v1 -> v2: - Send the sense data for VIRTIO_SCSI_EVT_RESET_REMOVED event too. --- hw/scsi/scsi-bus.c | 1 + hw/scsi/virtio-scsi.c | 16 ++++++++++++++++ include/hw/scsi/scsi.h | 6 ++++++ include/hw/virtio/virtio-scsi.h | 1 + 4 files changed, 24 insertions(+)