
question: disks intermittently go missing in the guest when hotplugging several virtio-scsi disks consecutively.

Message ID 0c86fe57-ed34-f44f-83af-ecdf36f7605b@huawei.com (mailing list archive)
State New, archived

Commit Message

Zhengui li July 3, 2018, 1:30 p.m. UTC
I came up with a fix; the patch is below:


The commit 
https://git.qemu.org/?p=qemu.git;a=commitdiff;h=e9447f35718439c1affdee3ef69b2fee50c8106c;hp=4c64d5b52ed5287bb31489bba39cf41628230bc 
introduced this problem.

I think there should not be any I/O requests before the device is
realized, but I have no idea how to avoid them after calling
virtio_scsi_push_event.
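
To make the window concrete, here is a rough sketch of the ordering as I
understand it from the backtrace and the device_set_realized() excerpt
quoted below (simplified, not the exact code):

    device_set_realized(dev, value=true)
      -> hotplug_handler_plug()              /* virtio-scsi plug callback */
           -> virtio_scsi_push_event()       /* guest learns of the new LUN;
                                                the iothread may already pick
                                                up a guest INQUIRY here */
      -> device_reset()                      /* because dev->hotplugged */
           -> scsi_disk_reset()
                -> scsi_device_purge_requests()  /* cancels the in-flight
                                                    INQUIRY, so the guest's
                                                    scsi_probe_lun fails */
      -> dev->realized = true                /* only set at the very end */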


On 2018/7/3 15:20, l00284672 wrote:
> The disk goes missing because scsi_probe_lun fails in the guest. The
> guest code is below:
> static int scsi_probe_lun(struct scsi_device *sdev, unsigned char 
> *inq_result,
>               int result_len, int *bflags)
>
> {
> .........
>     result = scsi_execute_req(sdev,  scsi_cmd, DMA_FROM_DEVICE,
>                       inq_result, try_inquiry_len, &sshdr,
>                       HZ / 2 + HZ * scsi_inq_timeout, 3,
>                       &resid);
>
> .........
> }
>
> The SCSI INQUIRY request from the guest is cancelled by QEMU. The QEMU
> backtrace is below:
>
> (gdb) bt
> #0  scsi_req_cancel_async (req=0x7f86d00055c0, notifier=0x0) at 
> hw/scsi/scsi-bus.c:1825
> #1  0x000055b50ac254c7 in scsi_device_purge_requests 
> (sdev=sdev@entry=0x55b50f9a8990, sense=...) at hw/scsi/scsi-bus.c:1925
> #2  0x000055b50ac1d337 in scsi_disk_reset (dev=0x55b50f9a8990) at 
> hw/scsi/scsi-disk.c:2186
> #3  0x000055b50ab95d91 in device_set_realized (obj=<optimized out>, 
> value=<optimized out>, errp=0x7ffedda81898) at hw/core/qdev.c:949
> #4  0x000055b50acd167e in property_set_bool (obj=0x55b50f9a8990, 
> v=<optimized out>, name=<optimized out>, opaque=0x55b50f9c1b50, 
> errp=0x7ffedda81898) at qom/object.c:1854
> #5  0x000055b50acd5091 in object_property_set_qobject 
> (obj=obj@entry=0x55b50f9a8990, value=value@entry=0x55b50f997350, 
> name=name@entry=0x55b50ae0260b "realized",
>     errp=errp@entry=0x7ffedda81898) at qom/qom-qobject.c:27
> #6  0x000055b50acd3210 in object_property_set_bool 
> (obj=0x55b50f9a8990, value=<optimized out>, name=0x55b50ae0260b 
> "realized", errp=0x7ffedda81898) at qom/object.c:1157
> #7  0x000055b50ab1e0b5 in qdev_device_add 
> (opts=opts@entry=0x55b50ceb9880, errp=errp@entry=0x7ffedda81978) at 
> qdev-monitor.c:627
> #8  0x000055b50ab1e68b in qmp_device_add (qdict=<optimized out>, 
> ret_data=<optimized out>, errp=0x7ffedda819c0) at qdev-monitor.c:807
> #9  0x000055b50ad78787 in do_qmp_dispatch (errp=0x7ffedda819b8, 
> request=0x55b50c9605c0) at qapi/qmp-dispatch.c:114
> #10 qmp_dispatch (request=request@entry=0x55b50d447000) at 
> qapi/qmp-dispatch.c:141
> #11 0x000055b50aa3a102 in handle_qmp_command (parser=<optimized out>, 
> tokens=<optimized out>) at 
> /mnt/sdb/lzg/code/UVP_2.5_CODE/qemu/monitor.c:3907
> #12 0x000055b50ad7da54 in json_message_process_token 
> (lexer=0x55b50c5d4458, input=0x55b50c55a6e0, type=JSON_RCURLY, x=181, 
> y=1449) at qobject/json-streamer.c:105
> #13 0x000055b50ad9fdfb in json_lexer_feed_char 
> (lexer=lexer@entry=0x55b50c5d4458, ch=125 '}', 
> flush=flush@entry=false) at qobject/json-lexer.c:319
> #14 0x000055b50ad9febe in json_lexer_feed (lexer=0x55b50c5d4458, 
> buffer=<optimized out>, size=<optimized out>) at qobject/json-lexer.c:369
> #15 0x000055b50ad7db19 in json_message_parser_feed (parser=<optimized 
> out>, buffer=<optimized out>, size=<optimized out>) at 
> qobject/json-streamer.c:124
> #16 0x000055b50aa389bb in monitor_qmp_read (opaque=<optimized out>, 
> buf=<optimized out>, size=<optimized out>) at 
> /mnt/sdb/lzg/code/UVP_2.5_CODE/qemu/monitor.c:3937
> #17 0x000055b50ab23e26 in tcp_chr_read (chan=<optimized out>, 
> cond=<optimized out>, opaque=0x55b50c5d0fa0) at qemu-char.c:3253
> #18 0x00007f8754cc499a in g_main_context_dispatch () from 
> /usr/lib64/libglib-2.0.so.0
> #19 0x000055b50acded8c in glib_pollfds_poll () at main-loop.c:228
> #20 os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:273
> #21 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:521
> #22 0x000055b50a9ff8ff in main_loop () at vl.c:2095
>
> static void device_set_realized(Object *obj, bool value, Error **errp)
> {
>     .......
>     if (hotplug_ctrl) {
>         hotplug_handler_plug(hotplug_ctrl, dev, &local_err);
>     }
>     .......
>     if (dev->hotplugged) {
>         device_reset(dev);
>     }
>     ......
> }
>
> hotplug_handler_plug calls virtio_scsi_hotplug. If the iothread handles
> a request between hotplug_handler_plug and device_reset, the request
> will be cancelled by scsi_device_purge_requests in device_reset.
>
> On 2018/7/2 21:15, Stefan Hajnoczi wrote:
>> On Wed, Jun 27, 2018 at 06:33:16PM +0800, l00284672 wrote:
>>> Hi, I found a bug where disks (not all of them) intermittently go missing
>>> in the guest when hotplugging several virtio-scsi disks consecutively.
>>> After rebooting the guest,
>> For the record, there is also a bug report here:
>>
>> https://bugs.launchpad.net/qemu/+bug/1779120
>>
>> I plan to reproduce this issue soon.
>>
>> Stefan
>

Comments

Paolo Bonzini July 4, 2018, 12:50 p.m. UTC | #1
On 03/07/2018 15:30, l00284672 wrote:
> I came up with a fix; the patch is below:
> 
> diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
> index 608fb18..4cdc2bb 100644
> --- a/hw/scsi/scsi-disk.c
> +++ b/hw/scsi/scsi-disk.c
> @@ -2184,7 +2184,9 @@ static void scsi_disk_reset(DeviceState *dev)
>      SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev.qdev, dev);
>      uint64_t nb_sectors;
> 
> -    scsi_device_purge_requests(&s->qdev, SENSE_CODE(RESET));
> +    if (dev->realized) {
> +       scsi_device_purge_requests(&s->qdev, SENSE_CODE(RESET));
> +    }
> 
>      blk_get_geometry(s->qdev.conf.blk, &nb_sectors);
>      nb_sectors /= s->qdev.blocksize / 512;
> 
> The commit
> https://git.qemu.org/?p=qemu.git;a=commitdiff;h=e9447f35718439c1affdee3ef69b2fee50c8106c;hp=4c64d5b52ed5287bb31489bba39cf41628230bc
> introduced this problem.
> 
> I think there should not be any I/O requests before the device is
> realized, but I have no idea how to avoid them after calling
> virtio_scsi_push_event.

virtio_scsi_push_event is called too soon.  David, now that you have
introduced pre_plug in HotplugHandler, it seems to me that the "plug"
callback should be called after dev->realized is set to true.
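
For instance, something along these lines (a rough, untested sketch of
the reordering against the device_set_realized() excerpt quoted above):

    if (dev->hotplugged) {
        device_reset(dev);
    }
    dev->realized = true;
    if (hotplug_ctrl) {
        /* Only now notify the guest (e.g. virtio_scsi_push_event()),
         * so a request it sends cannot race with the reset: the
         * device is already realized and reset. */
        hotplug_handler_plug(hotplug_ctrl, dev, &local_err);
    }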

In general, this means splitting the existing plug callbacks into a
pre_plug that checks for errors and a plug that does the operation.
However, the one use of pre_plug that you introduced for memory hotplug
has a plug callback that returns errors (namely pc_memory_plug via
pc_dimm_plug, where it sets properties).  I must be missing something:
why are properties being set after the realize method is called?  Could
these operations actually fail, or could the object_property_set_* calls
use &error_abort as the error pointer?
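
For illustration, the split could look like this (a sketch only;
slot_available(), attach_to_bus() and notify_guest() are made-up
helpers, not existing QEMU functions):

    /* pre_plug: runs before realize and performs every check that
     * can fail. */
    static void foo_pre_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                             Error **errp)
    {
        if (!slot_available(hotplug_dev, dev)) {
            error_setg(errp, "no free slot for hotplugged device");
        }
    }

    /* plug: runs after realize, performs the operation, and should
     * not be able to fail. */
    static void foo_plug(HotplugHandler *hotplug_dev, DeviceState *dev,
                         Error **errp)
    {
        attach_to_bus(hotplug_dev, dev);
        notify_guest(hotplug_dev, dev);
    }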

Thanks,

Paolo

> [...]

Patch

diff --git a/hw/scsi/scsi-disk.c b/hw/scsi/scsi-disk.c
index 608fb18..4cdc2bb 100644
--- a/hw/scsi/scsi-disk.c
+++ b/hw/scsi/scsi-disk.c
@@ -2184,7 +2184,9 @@  static void scsi_disk_reset(DeviceState *dev)
      SCSIDiskState *s = DO_UPCAST(SCSIDiskState, qdev.qdev, dev);
      uint64_t nb_sectors;

-    scsi_device_purge_requests(&s->qdev, SENSE_CODE(RESET));
+    if (dev->realized) {
+       scsi_device_purge_requests(&s->qdev, SENSE_CODE(RESET));
+    }

      blk_get_geometry(s->qdev.conf.blk, &nb_sectors);
      nb_sectors /= s->qdev.blocksize / 512;