| Message ID | 20190403083841.830-1-yury-kotov@yandex-team.ru (mailing list archive) |
|---|---|
| State | New, archived |
| Series | migration: fix migration shutdown |
* Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> It fixes heap-use-after-free which was found by clang's ASAN.
>
> Control flow of this use-after-free:
> main_thread:
>  * Got SIGTERM and completes main loop
>  * Calls migration_shutdown
>    - migrate_fd_cancel (so, migration_thread begins to complete)
>    - object_unref(OBJECT(current_migration));
>
> migration_thread:
>  * migration_iteration_finish -> schedule cleanup bh
>  * object_unref(OBJECT(s)); (Now, current_migration is freed)
>  * exits
>
> main_thread:
>  * Calls vm_shutdown -> drain bdrvs -> main loop
>    -> cleanup_bh -> use after free
>
> If you want to reproduce, these couple of sleeps will help:
> vl.c:4613:
>      migration_shutdown();
> +    sleep(2);
> migration.c:3269:
> +    sleep(1);
>      trace_migration_thread_after_loop();
>      migration_iteration_finish(s);

Fun; I hadn't realised you could get more main loop after exiting
the main loop.

Is this fallout from my 892ae715b6b or is this just another case it
didn't catch?
I guess it moved the shutdown early, so it probably is fallout, and
so probably does need to go to 4.0 ?

> Original output:
> qemu-system-x86_64: terminating on signal 15 from pid 31980 (<unknown process>)
> [...]
>
> Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
> ---
>  migration/migration.c | 30 ++++++++++++++++++++++++------
>  1 file changed, 24 insertions(+), 6 deletions(-)
>
> diff --git a/migration/migration.c b/migration/migration.c
> index 609e0df5d0..54d82bb88b 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -128,6 +128,8 @@ static int migration_maybe_pause(MigrationState *s,
>                                    int *current_active_state,
>                                    int new_state);
>  static void migrate_fd_cancel(MigrationState *s);
> +static void migrate_join_thread(MigrationState *s);
> +static void migrate_fd_cleanup(void *opaque);
>
>  void migration_object_init(void)
>  {
> @@ -176,6 +178,17 @@ void migration_shutdown(void)
>       * stop the migration using this structure
>       */
>      migrate_fd_cancel(current_migration);
> +    /*
> +     * migration_thread schedules cleanup_bh, but migration_shutdown is called
> +     * from the end of main, so cleanu_up may not be called because main-loop is
> +     * complete. So, wait for the complition of migration_thread, cancel bh and
> +     * call cleanup ourselves.
> +     */

(Some typos)

> +    migrate_join_thread(current_migration);

I'll admit to being a bit nervous that something in here could be
hanging and thus we could hang here rather than die as SIGTERM is
expecting us to.

> +    if (current_migration->cleanup_bh) {
> +        qemu_bh_cancel(current_migration->cleanup_bh);

Is that needed? Doesn't migrate_fd_cleanup start with a qemu_bh_delete?

> +        migrate_fd_cleanup(current_migration);
> +    }
>      object_unref(OBJECT(current_migration));
>  }
> [...]

Dave

> --
> 2.21.0
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
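For readers following Dave's last question: around this point in QEMU's history, migrate_fd_cleanup() began by deleting the cleanup bottom half itself, which is why an extra qemu_bh_cancel() in migration_shutdown() looks redundant. The fragment below is a hedged reconstruction from memory of migration.c of that era, not text quoted from this thread; see the tree for the authoritative version.

```c
/* Hedged sketch of how migrate_fd_cleanup() opened around QEMU 4.0. */
static void migrate_fd_cleanup(void *opaque)
{
    MigrationState *s = opaque;

    /* The cleanup bottom half removes itself first, so cancelling it
     * beforehand in migration_shutdown() adds nothing. */
    qemu_bh_delete(s->cleanup_bh);
    s->cleanup_bh = NULL;

    /* ... join the migration thread, release the QEMUFile, notify, ... */
}
```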
03.04.2019, 22:06, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
> * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
>> It fixes heap-use-after-free which was found by clang's ASAN.
>> [...]
>
> Fun; I hadn't realised you could get more main loop after exiting
> the main loop.
>
> Is this fallout from my 892ae715b6b or is this just another case it
> didn't catch?
> I guess it moved the shutdown early, so it probably is fallout, and
> so probably does need to go to 4.0 ?
>
I think this is closer to another case than fallout. Because without
892ae715b6b, reproducing of UAF at exit was much easier.
Another way to fix it - remove unref from migration_shutdown
(but keep migrate_fd_cancel) and restore migration_object_finalize.

>> [...]
>> +    /*
>> +     * migration_thread schedules cleanup_bh, but migration_shutdown is called
>> +     * from the end of main, so cleanu_up may not be called because main-loop is
>> +     * complete. So, wait for the complition of migration_thread, cancel bh and
>> +     * call cleanup ourselves.
>> +     */
>
> (Some typos)
>
Ok :)

>> +    migrate_join_thread(current_migration);
>
> I'll admit to being a bit nervous that something in here could be
> hanging and thus we could hang here rather than die as SIGTERM is
> expecting us to.
>
I think it's only way to be sure we havn't other races with migration_thread.
And this joining is cheaper than vm_shutdown IMHO.

>> +    if (current_migration->cleanup_bh) {
>> +        qemu_bh_cancel(current_migration->cleanup_bh);
>
> Is that needed? Doesn't migrate_fd_cleanup start with a qemu_bh_delete?
>
Agree, it's odd.

>> [...]
>
> Dave
>
>> --
>> 2.21.0
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
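Yury's alternative above - cancel in migration_shutdown() but defer the unref to a restored migration_object_finalize() - would look roughly like the sketch below. This is an illustration of the idea only; the function bodies are assumptions, not a patch posted in this thread.

```c
/* Sketch: stop the migration at shutdown, but keep the reference... */
void migration_shutdown(void)
{
    /* Tell the migration thread to wind down. */
    migrate_fd_cancel(current_migration);
    /* Note: no object_unref() here any more. */
}

/* ...and only drop it once nothing scheduled (such as cleanup_bh) can
 * still run against it, e.g. from a restored migration_object_finalize(). */
void migration_object_finalize(void)
{
    object_unref(OBJECT(current_migration));
    current_migration = NULL;
}
```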
* Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> 03.04.2019, 22:06, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
> >> It fixes heap-use-after-free which was found by clang's ASAN.
> >> [...]
> >
> > Fun; I hadn't realised you could get more main loop after exiting
> > the main loop.
> >
> > Is this fallout from my 892ae715b6b or is this just another case it
> > didn't catch?
> > I guess it moved the shutdown early, so it probably is fallout, and
> > so probably does need to go to 4.0 ?
> >
> I think this is closer to another case than fallout. Because without
> 892ae715b6b, reproducing of UAF at exit was much easier.
> Another way to fix it - remove unref from migration_shutdown
> (but keep migrate_fd_cancel) and restore migration_object_finalize.

OK, so in that case I'd rather leave this to 4.1 rather than rush it in
4.0rc's - I'd rather only take regressions at this point.

Dave

> [...]
> > --
> > Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Hi,

I've sent another patch to fix this UAF:
"migration: Fix use-after-free during process exit"
It's more simple and fixes only the regression.

Regards,
Yury

05.04.2019, 12:07, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
> * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
>> 03.04.2019, 22:06, "Dr. David Alan Gilbert" <dgilbert@redhat.com>:
>> > * Yury Kotov (yury-kotov@yandex-team.ru) wrote:
>> >> It fixes heap-use-after-free which was found by clang's ASAN.
>> >> [...]
>> >
>> > Is this fallout from my 892ae715b6b or is this just another case it
>> > didn't catch?
>> > [...]
>> I think this is closer to another case than fallout. Because without
>> 892ae715b6b, reproducing of UAF at exit was much easier.
>> Another way to fix it - remove unref from migration_shutdown
>> (but keep migrate_fd_cancel) and restore migration_object_finalize.
>
> OK, so in that case I'd rather leave this to 4.1 rather than rush it in
> 4.0rc's - I'd rather only take regressions at this point.
>
> Dave
>
>> [...]
> --
> Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
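For completeness: the replacement patch Yury mentions takes a different route from joining the thread - it keeps MigrationState referenced for as long as the cleanup bottom half can still run. The sketch below is a hedged reconstruction of that idea, not the literal text of that patch.

```c
/* Sketch: pin the object across the scheduled bottom half. */
static void migrate_fd_cleanup_bh(void *opaque)
{
    MigrationState *s = opaque;

    migrate_fd_cleanup(s);
    object_unref(OBJECT(s));          /* pairs with the ref taken below */
}

static void migrate_fd_cleanup_schedule(MigrationState *s)
{
    /* Released by migrate_fd_cleanup_bh() once the bh has actually run,
     * so an early object_unref() elsewhere cannot free it under the bh. */
    object_ref(OBJECT(s));
    qemu_bh_schedule(s->cleanup_bh);
}
```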
=================================================================
==31958==ERROR: AddressSanitizer: heap-use-after-free on address 0x61900001d210
  at pc 0x555558a535ca bp 0x7fffffffb190 sp 0x7fffffffb188
READ of size 8 at 0x61900001d210 thread T0 (qemu-vm-0)
    #0 0x555558a535c9 in migrate_fd_cleanup migration/migration.c:1502:23
    #1 0x5555594fde0a in aio_bh_call util/async.c:90:5
    #2 0x5555594fe522 in aio_bh_poll util/async.c:118:13
    #3 0x555559524783 in aio_poll util/aio-posix.c:725:17
    #4 0x555559504fb3 in aio_wait_bh_oneshot util/aio-wait.c:71:5
    #5 0x5555573bddf6 in virtio_blk_data_plane_stop hw/block/dataplane/virtio-blk.c:282:5
    #6 0x5555589d5c09 in virtio_bus_stop_ioeventfd hw/virtio/virtio-bus.c:246:9
    #7 0x5555589e9917 in virtio_pci_stop_ioeventfd hw/virtio/virtio-pci.c:287:5
    #8 0x5555589e22bf in virtio_pci_vmstate_change hw/virtio/virtio-pci.c:1072:9
    #9 0x555557628931 in virtio_vmstate_change hw/virtio/virtio.c:2257:9
    #10 0x555557c36713 in vm_state_notify vl.c:1605:9
    #11 0x55555716ef53 in do_vm_stop cpus.c:1074:9
    #12 0x55555716eeff in vm_shutdown cpus.c:1092:12
    #13 0x555557c4283e in main vl.c:4617:5
    #14 0x7fffdfdb482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)
    #15 0x555556ecb118 in _start (x86_64-softmmu/qemu-system-x86_64+0x1977118)

0x61900001d210 is located 144 bytes inside of 952-byte region [0x61900001d180,0x61900001d538)
freed by thread T6 (live_migration) here:
    #0 0x555556f76782 in __interceptor_free /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:124:3
    #1 0x555558d5fa94 in object_finalize qom/object.c:618:9
    #2 0x555558d57651 in object_unref qom/object.c:1068:9
    #3 0x555558a55588 in migration_thread migration/migration.c:3272:5
    #4 0x5555595393f2 in qemu_thread_start util/qemu-thread-posix.c:502:9
    #5 0x7fffe057f6b9 in start_thread (/lib/x86_64-linux-gnu/libpthread.so.0+0x76b9)

previously allocated by thread T0 (qemu-vm-0) here:
    #0 0x555556f76b03 in __interceptor_malloc /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_malloc_linux.cc:146:3
    #1 0x7ffff6ee37b8 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4f7b8)
    #2 0x555558d58031 in object_new qom/object.c:640:12
    #3 0x555558a31f21 in migration_object_init migration/migration.c:139:25
    #4 0x555557c41398 in main vl.c:4320:5
    #5 0x7fffdfdb482f in __libc_start_main (/lib/x86_64-linux-gnu/libc.so.6+0x2082f)

Thread T6 (live_migration) created by T0 (qemu-vm-0) here:
    #0 0x555556f5f0dd in pthread_create /tmp/final/llvm.src/projects/compiler-rt/lib/asan/asan_interceptors.cc:210:3
    #1 0x555559538cf9 in qemu_thread_create util/qemu-thread-posix.c:539:11
    #2 0x555558a53304 in migrate_fd_connect migration/migration.c:3332:5
    #3 0x555558a72bd8 in migration_channel_connect migration/channel.c:92:5
    #4 0x555558a6ef87 in exec_start_outgoing_migration migration/exec.c:42:5
    #5 0x555558a4f3c2 in qmp_migrate migration/migration.c:1922:9
    #6 0x555558bb4f6a in qmp_marshal_migrate qapi/qapi-commands-migration.c:607:5
    #7 0x555559363738 in do_qmp_dispatch qapi/qmp-dispatch.c:131:5
    #8 0x555559362a15 in qmp_dispatch qapi/qmp-dispatch.c:174:11
    #9 0x5555571bac15 in monitor_qmp_dispatch monitor.c:4124:11
    #10 0x55555719a22d in monitor_qmp_bh_dispatcher monitor.c:4207:9
    #11 0x5555594fde0a in aio_bh_call util/async.c:90:5
    #12 0x5555594fe522 in aio_bh_poll util/async.c:118:13
    #13 0x5555595201e0 in aio_dispatch util/aio-posix.c:460:5
    #14 0x555559503553 in aio_ctx_dispatch util/async.c:261:5
    #15 0x7ffff6ede196 in g_main_context_dispatch (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x4a196)

SUMMARY: AddressSanitizer: heap-use-after-free migration/migration.c:1502:23 in migrate_fd_cleanup
Shadow bytes around the buggy address:
  0x0c327fffb9f0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fffba00: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fffba10: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fffba20: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c327fffba30: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
=>0x0c327fffba40: fd fd[fd]fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c327fffba50: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c327fffba60: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c327fffba70: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c327fffba80: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
  0x0c327fffba90: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
  Shadow gap:              cc
==31958==ABORTING

Signed-off-by: Yury Kotov <yury-kotov@yandex-team.ru>
---
 migration/migration.c | 30 ++++++++++++++++++++++++------
 1 file changed, 24 insertions(+), 6 deletions(-)

diff --git a/migration/migration.c b/migration/migration.c
index 609e0df5d0..54d82bb88b 100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -128,6 +128,8 @@ static int migration_maybe_pause(MigrationState *s,
                                   int *current_active_state,
                                   int new_state);
 static void migrate_fd_cancel(MigrationState *s);
+static void migrate_join_thread(MigrationState *s);
+static void migrate_fd_cleanup(void *opaque);
 
 void migration_object_init(void)
 {
@@ -176,6 +178,17 @@ void migration_shutdown(void)
      * stop the migration using this structure
      */
     migrate_fd_cancel(current_migration);
+    /*
+     * migration_thread schedules cleanup_bh, but migration_shutdown is called
+     * from the end of main, so cleanu_up may not be called because main-loop is
+     * complete. So, wait for the complition of migration_thread, cancel bh and
+     * call cleanup ourselves.
+     */
+    migrate_join_thread(current_migration);
+    if (current_migration->cleanup_bh) {
+        qemu_bh_cancel(current_migration->cleanup_bh);
+        migrate_fd_cleanup(current_migration);
+    }
     object_unref(OBJECT(current_migration));
 }
 
@@ -1495,6 +1508,16 @@ static void block_cleanup_parameters(MigrationState *s)
     }
 }
 
+static void migrate_join_thread(MigrationState *s)
+{
+    qemu_mutex_unlock_iothread();
+    if (s->migration_thread_running) {
+        qemu_thread_join(&s->thread);
+        s->migration_thread_running = false;
+    }
+    qemu_mutex_lock_iothread();
+}
+
 static void migrate_fd_cleanup(void *opaque)
 {
     MigrationState *s = opaque;
@@ -1508,12 +1531,7 @@ static void migrate_fd_cleanup(void *opaque)
     QEMUFile *tmp;
 
     trace_migrate_fd_cleanup();
-    qemu_mutex_unlock_iothread();
-    if (s->migration_thread_running) {
-        qemu_thread_join(&s->thread);
-        s->migration_thread_running = false;
-    }
-    qemu_mutex_lock_iothread();
+    migrate_join_thread(s);
 
     multifd_save_cleanup();
     qemu_mutex_lock(&s->qemu_file_lock);
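The race in the report above is easy to demonstrate outside QEMU. The toy program below is an analogy with invented names, not QEMU code: a worker thread leaves a "scheduled cleanup" behind, the main thread frees the shared state too early, and the pending cleanup then reads freed memory, which AddressSanitizer flags in the same way. Build with: clang -g -fsanitize=address -pthread uaf-demo.c

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

struct state {
    int cancelled;
};

/* Stands in for cleanup_bh: work left behind for the "main loop" to run. */
static struct state *pending_cleanup;

static void *migration_thread(void *opaque)
{
    struct state *s = opaque;

    usleep(100 * 1000);   /* like the sleep(1) in the reproducer above */
    pending_cleanup = s;  /* "schedule cleanup_bh" with s as its opaque */
    return NULL;
}

int main(void)
{
    struct state *s = calloc(1, sizeof(*s));
    pthread_t thread;

    pthread_create(&thread, NULL, migration_thread, s);

    s->cancelled = 1;     /* "migrate_fd_cancel" */
    free(s);              /* "object_unref" in migration_shutdown: too early */

    pthread_join(thread, NULL);
    if (pending_cleanup) {
        /* "more main loop after the main loop": the leftover cleanup runs */
        printf("cancelled = %d\n", pending_cleanup->cancelled); /* UAF read */
    }
    return 0;
}
```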