Message ID | bdbb9588-a913-c449-415b-34a25fcbde9e@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
On 18.07.2018 [11:10:27 -0400], Farhan Ali wrote:
>
>
> On 07/18/2018 09:42 AM, Farhan Ali wrote:
> >
> >
> > On 07/17/2018 04:52 PM, Nishanth Aravamudan wrote:
> > > iiuc, this possibly implies AIO was not actually used previously on this
> > > guest (it might have silently been falling back to threaded IO?). I
> > > don't have access to s390x, but would it be possible to run qemu under
> > > gdb and see if aio_setup_linux_aio is being called at all (I think it
> > > might not be, but I'm not sure why), and if so, if it's for the context
> > > in question?
> > >
> > > If it's not being called first, could you see what callpath is calling
> > > aio_get_linux_aio when this assertion trips?
> > >
> > > Thanks!
> > > -Nish
> >
> >
> > Hi Nishant,
> >
> > From the coredump of the guest this is the call trace that calls
> > aio_get_linux_aio:
> >
> >
> > Stack trace of thread 145158:
> > #0  0x000003ff94dbe274 raise (libc.so.6)
> > #1  0x000003ff94da39a8 abort (libc.so.6)
> > #2  0x000003ff94db62ce __assert_fail_base (libc.so.6)
> > #3  0x000003ff94db634c __assert_fail (libc.so.6)
> > #4  0x000002aa20db067a aio_get_linux_aio (qemu-system-s390x)
> > #5  0x000002aa20d229a8 raw_aio_plug (qemu-system-s390x)
> > #6  0x000002aa20d309ee bdrv_io_plug (qemu-system-s390x)
> > #7  0x000002aa20b5a8ea virtio_blk_handle_vq (qemu-system-s390x)
> > #8  0x000002aa20db2f6e aio_dispatch_handlers (qemu-system-s390x)
> > #9  0x000002aa20db3c34 aio_poll (qemu-system-s390x)
> > #10 0x000002aa20be32a2 iothread_run (qemu-system-s390x)
> > #11 0x000003ff94f879a8 start_thread (libpthread.so.0)
> > #12 0x000003ff94e797ee thread_start (libc.so.6)
> >
> >
> > Thanks for taking a look and responding.
> >
> > Thanks
> > Farhan
> >
> >
> >
>
> Trying to debug a little further, the block device in this case is a "host
> device". And looking at your commit carefully you use the
> bdrv_attach_aio_context callback to setup a Linux AioContext.
>
> For some reason the "host device" struct (BlockDriver bdrv_host_device in
> block/file-posix.c) does not have a bdrv_attach_aio_context defined.
> So a simple change of adding the callback to the struct solves the issue and
> the guest starts fine.
>
>
> diff --git a/block/file-posix.c b/block/file-posix.c
> index 28824aa..b8d59fb 100644
> --- a/block/file-posix.c
> +++ b/block/file-posix.c
> @@ -3135,6 +3135,7 @@ static BlockDriver bdrv_host_device = {
>      .bdrv_refresh_limits = raw_refresh_limits,
>      .bdrv_io_plug = raw_aio_plug,
>      .bdrv_io_unplug = raw_aio_unplug,
> +    .bdrv_attach_aio_context = raw_aio_attach_aio_context,
>
>      .bdrv_co_truncate = raw_co_truncate,
>      .bdrv_getlength = raw_getlength,
>
>
>
> I am not too familiar with block device code in QEMU, so not sure if
> this is the right fix or if there are some underlying problems.

Oh this is quite embarrassing! I only added the bdrv_attach_aio_context
callback for the file-backed device. Your fix is definitely correct for
host device. Let me make sure there weren't any others missed and I will
send out a properly formatted patch. Thank you for the quick testing and
turnaround!

-Nish
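For readers following along without the QEMU tree at hand, the failure mode discussed above can be modelled in a few lines of C. This is only a sketch under stated assumptions: every name in it (ModelAioContext, ModelDriver, model_attach, model_plug, model_move_and_plug) is hypothetical and is not QEMU's API. It assumes what the stack trace suggests, namely that the plug path requires per-context Linux AIO state that only the attach callback sets up. Where the real code trips an assertion, the model returns false so that it can be exercised safely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-thread AIO context. */
typedef struct ModelAioContext {
    bool linux_aio_ready;   /* true once the per-context AIO state exists */
} ModelAioContext;

/* Hypothetical stand-in for a block driver's callback table. */
typedef struct ModelDriver {
    const char *name;
    void (*attach_aio_context)(ModelAioContext *ctx);  /* may be NULL */
    bool (*io_plug)(ModelAioContext *ctx);
} ModelDriver;

/* Attach hook: performs the per-context setup. */
static void model_attach(ModelAioContext *ctx)
{
    ctx->linux_aio_ready = true;
}

/* Plug hook: the real code asserts here; the model reports failure instead. */
static bool model_plug(ModelAioContext *ctx)
{
    return ctx->linux_aio_ready;
}

/* Driver that registers both hooks (models the file-backed case). */
static const ModelDriver model_file = {
    .name = "file",
    .attach_aio_context = model_attach,
    .io_plug = model_plug,
};

/* Driver that plugs but never attaches (models the unfixed host device). */
static const ModelDriver model_host_device_unfixed = {
    .name = "host_device",
    .attach_aio_context = NULL,
    .io_plug = model_plug,
};

/* Same driver with the one-line fix applied. */
static const ModelDriver model_host_device_fixed = {
    .name = "host_device",
    .attach_aio_context = model_attach,
    .io_plug = model_plug,
};

/*
 * Moves a device into a fresh context, then plugs it. Returns false in
 * the situation where the real code would hit the assertion.
 */
static bool model_move_and_plug(const ModelDriver *drv)
{
    ModelAioContext ctx = { .linux_aio_ready = false };

    assert(drv != NULL);
    if (drv->attach_aio_context) {
        drv->attach_aio_context(&ctx);
    }
    return drv->io_plug(&ctx);
}
```

In this model, model_move_and_plug() succeeds for model_file and model_host_device_fixed and fails only for model_host_device_unfixed, which mirrors why adding the single callback line is enough for the guest to start.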
On 07/18/2018 08:52 PM, Nishanth Aravamudan wrote:
> On 18.07.2018 [11:10:27 -0400], Farhan Ali wrote:
>>
>>
>> On 07/18/2018 09:42 AM, Farhan Ali wrote:
>>>
>>>
>>> On 07/17/2018 04:52 PM, Nishanth Aravamudan wrote:
>>>> iiuc, this possibly implies AIO was not actually used previously on this
>>>> guest (it might have silently been falling back to threaded IO?). I
>>>> don't have access to s390x, but would it be possible to run qemu under
>>>> gdb and see if aio_setup_linux_aio is being called at all (I think it
>>>> might not be, but I'm not sure why), and if so, if it's for the context
>>>> in question?
>>>>
>>>> If it's not being called first, could you see what callpath is calling
>>>> aio_get_linux_aio when this assertion trips?
>>>>
>>>> Thanks!
>>>> -Nish
>>>
>>>
>>> Hi Nishant,
>>>
>>> From the coredump of the guest this is the call trace that calls
>>> aio_get_linux_aio:
>>>
>>>
>>> Stack trace of thread 145158:
>>> #0  0x000003ff94dbe274 raise (libc.so.6)
>>> #1  0x000003ff94da39a8 abort (libc.so.6)
>>> #2  0x000003ff94db62ce __assert_fail_base (libc.so.6)
>>> #3  0x000003ff94db634c __assert_fail (libc.so.6)
>>> #4  0x000002aa20db067a aio_get_linux_aio (qemu-system-s390x)
>>> #5  0x000002aa20d229a8 raw_aio_plug (qemu-system-s390x)
>>> #6  0x000002aa20d309ee bdrv_io_plug (qemu-system-s390x)
>>> #7  0x000002aa20b5a8ea virtio_blk_handle_vq (qemu-system-s390x)
>>> #8  0x000002aa20db2f6e aio_dispatch_handlers (qemu-system-s390x)
>>> #9  0x000002aa20db3c34 aio_poll (qemu-system-s390x)
>>> #10 0x000002aa20be32a2 iothread_run (qemu-system-s390x)
>>> #11 0x000003ff94f879a8 start_thread (libpthread.so.0)
>>> #12 0x000003ff94e797ee thread_start (libc.so.6)
>>>
>>>
>>> Thanks for taking a look and responding.
>>>
>>> Thanks
>>> Farhan
>>>
>>>
>>>
>>
>> Trying to debug a little further, the block device in this case is a "host
>> device". And looking at your commit carefully you use the
>> bdrv_attach_aio_context callback to setup a Linux AioContext.
>>
>> For some reason the "host device" struct (BlockDriver bdrv_host_device in
>> block/file-posix.c) does not have a bdrv_attach_aio_context defined.
>> So a simple change of adding the callback to the struct solves the issue and
>> the guest starts fine.
>>
>>
>> diff --git a/block/file-posix.c b/block/file-posix.c
>> index 28824aa..b8d59fb 100644
>> --- a/block/file-posix.c
>> +++ b/block/file-posix.c
>> @@ -3135,6 +3135,7 @@ static BlockDriver bdrv_host_device = {
>>      .bdrv_refresh_limits = raw_refresh_limits,
>>      .bdrv_io_plug = raw_aio_plug,
>>      .bdrv_io_unplug = raw_aio_unplug,
>> +    .bdrv_attach_aio_context = raw_aio_attach_aio_context,
>>
>>      .bdrv_co_truncate = raw_co_truncate,
>>      .bdrv_getlength = raw_getlength,
>>
>>
>>
>> I am not too familiar with block device code in QEMU, so not sure if
>> this is the right fix or if there are some underlying problems.
>
> Oh this is quite embarrassing! I only added the bdrv_attach_aio_context
> callback for the file-backed device. Your fix is definitely correct for
> host device. Let me make sure there weren't any others missed and I will
> send out a properly formatted patch. Thank you for the quick testing and
> turnaround!

Farhan, can you respin your patch with proper sign-off and patch description?
Adding qemu-block.
Hi Christian,

On 19.07.2018 [08:55:20 +0200], Christian Borntraeger wrote:
>
>
> On 07/18/2018 08:52 PM, Nishanth Aravamudan wrote:
> > On 18.07.2018 [11:10:27 -0400], Farhan Ali wrote:
> >>
> >>
> >> On 07/18/2018 09:42 AM, Farhan Ali wrote:

<snip>

> >> I am not too familiar with block device code in QEMU, so not sure if
> >> this is the right fix or if there are some underlying problems.
> >
> > Oh this is quite embarrassing! I only added the bdrv_attach_aio_context
> > callback for the file-backed device. Your fix is definitely correct for
> > host device. Let me make sure there weren't any others missed and I will
> > send out a properly formatted patch. Thank you for the quick testing and
> > turnaround!
>
> Farhan, can you respin your patch with proper sign-off and patch description?
> Adding qemu-block.

I sent it yesterday, sorry I didn't cc everyone from this e-mail:

http://lists.nongnu.org/archive/html/qemu-block/2018-07/msg00516.html

Thanks,
Nish
diff --git a/block/file-posix.c b/block/file-posix.c
index 28824aa..b8d59fb 100644
--- a/block/file-posix.c
+++ b/block/file-posix.c
@@ -3135,6 +3135,7 @@ static BlockDriver bdrv_host_device = {
     .bdrv_refresh_limits = raw_refresh_limits,
     .bdrv_io_plug = raw_aio_plug,
     .bdrv_io_unplug = raw_aio_unplug,
+    .bdrv_attach_aio_context = raw_aio_attach_aio_context,
 
     .bdrv_co_truncate = raw_co_truncate,
     .bdrv_getlength = raw_getlength,
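
The thread also mentions checking whether any other drivers were missed. One way to picture such a check is a scan over a table of drivers that flags every entry providing a plug callback without an attach callback. The sketch below is purely illustrative: AuditDriver, audit_missing_attach and the table contents are hypothetical, and QEMU's real driver definitions are not laid out as a single array like this. The entry names "file" and "host_device" only echo the two cases discussed in the thread; "example_no_plug" is invented.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic hook type; only presence or absence matters here. */
typedef void (*AuditHook)(void);

/* Hypothetical, simplified view of a driver's callback table. */
typedef struct AuditDriver {
    const char *name;
    AuditHook io_plug;
    AuditHook attach_aio_context;
} AuditDriver;

/* Placeholder standing in for any registered callback. */
static void audit_dummy_hook(void)
{
}

/*
 * Counts drivers that register io_plug but no attach_aio_context.
 * If first_missing is non-NULL, it receives the name of the first
 * offending driver, or NULL when there is none.
 */
static size_t audit_missing_attach(const AuditDriver *drivers, size_t n,
                                   const char **first_missing)
{
    size_t missing = 0;

    assert(drivers != NULL || n == 0);
    if (first_missing) {
        *first_missing = NULL;
    }
    for (size_t i = 0; i < n; i++) {
        if (drivers[i].io_plug && !drivers[i].attach_aio_context) {
            if (missing == 0 && first_missing) {
                *first_missing = drivers[i].name;
            }
            missing++;
        }
    }
    return missing;
}

/* Model of the situation before the patch above. */
static const AuditDriver audit_table_before[] = {
    { "file",            audit_dummy_hook, audit_dummy_hook },
    { "host_device",     audit_dummy_hook, NULL },
    { "example_no_plug", NULL,             NULL },
};

/* Model of the situation after the patch above. */
static const AuditDriver audit_table_after[] = {
    { "file",            audit_dummy_hook, audit_dummy_hook },
    { "host_device",     audit_dummy_hook, audit_dummy_hook },
    { "example_no_plug", NULL,             NULL },
};
```

Against audit_table_before the scan reports one offender, the host_device entry; against audit_table_after it reports none. A driver with neither callback is not flagged, since it never reaches the plug path.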