[v3,0/4] Introduce 'yank' oob qmp command to recover from hanging qemu

Message ID cover.1590344541.git.lukasstraub2@web.de (mailing list archive)

Message

Lukas Straub May 24, 2020, 6:30 p.m. UTC
Hello Everyone,
In many cases, if QEMU has a network connection (QMP, migration, chardev, etc.)
to some other server and that server dies or hangs, QEMU hangs too.
These patches introduce the new 'yank' out-of-band QMP command to recover from
these kinds of hangs. The different subsystems register callbacks which get
executed by the yank command. For example, a callback can shutdown() a
socket. This is intended for the COLO use case, but it can of course be used
for other things too.
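
To illustrate the callback idea described above, here is a minimal,
self-contained sketch. It is not the code from this series: every demo_*
name is hypothetical, the table is a fixed-size array, and locking is left
out entirely (a real implementation would need it, since an out-of-band
command runs concurrently with the code that registers callbacks).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical limit, chosen only to keep the sketch simple. */
#define DEMO_MAX_YANK 16

/* A yank callback receives the opaque pointer it was registered with. */
typedef void (DemoYankFn)(void *opaque);

typedef struct {
    DemoYankFn *func;
    void *opaque;
} DemoYankEntry;

static DemoYankEntry demo_entries[DEMO_MAX_YANK];
static size_t demo_count;

/* Register a callback. Returns 0 on success, -1 if the table is full. */
int demo_yank_register(DemoYankFn *func, void *opaque)
{
    assert(func != NULL);
    if (demo_count == DEMO_MAX_YANK) {
        return -1;
    }
    demo_entries[demo_count].func = func;
    demo_entries[demo_count].opaque = opaque;
    demo_count++;
    return 0;
}

/* Remove a previously registered callback. Returns -1 if not found. */
int demo_yank_unregister(DemoYankFn *func, void *opaque)
{
    for (size_t i = 0; i < demo_count; i++) {
        if (demo_entries[i].func == func &&
            demo_entries[i].opaque == opaque) {
            /* Order does not matter here, so fill the hole with the last entry. */
            demo_entries[i] = demo_entries[demo_count - 1];
            demo_count--;
            return 0;
        }
    }
    return -1;
}

/* Run every registered callback. Returns how many were called. */
size_t demo_yank_all(void)
{
    for (size_t i = 0; i < demo_count; i++) {
        demo_entries[i].func(demo_entries[i].opaque);
    }
    return demo_count;
}

/* Example callback: opaque points to an int holding a socket descriptor. */
void demo_yank_shutdown_socket(void *opaque)
{
    int fd = *(int *)opaque;

    /* Wakes up anything blocked on the socket; the descriptor stays open. */
    shutdown(fd, SHUT_RDWR);
}
```

A subsystem would register demo_yank_shutdown_socket() when it opens a
connection and unregister it before closing the descriptor. shutdown() is
used rather than close() so that a thread blocked in a read or write on the
socket returns with an error or EOF instead of racing with descriptor reuse.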

Regards,
Lukas Straub

v3:
 -don't touch softmmu/vl.c, use __constructor__ attribute instead (Paolo Bonzini)
 -fix build errors
 -rewrite migration patch so it actually passes all tests
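
For readers unfamiliar with the constructor attribute mentioned in the v3
notes: it is a GCC/Clang extension that makes a function run automatically
before main(), so a module can initialize itself without an explicit call
from startup code. The sketch below shows only the general mechanism; the
demo_* names are hypothetical and it does not reproduce how the series
uses the attribute.

```c
#include <assert.h>
#include <stdbool.h>

static bool demo_initialized;

/*
 * GCC/Clang extension: the runtime calls this before main(),
 * so no startup file has to be edited to invoke it.
 */
static void __attribute__((constructor)) demo_module_init(void)
{
    /* Guard against being run twice. */
    assert(!demo_initialized);
    demo_initialized = true;
}

/* Reports whether the constructor has already run. */
bool demo_module_is_initialized(void)
{
    return demo_initialized;
}
```

The order in which constructors from different object files run is not
specified unless priorities are given, so such an initializer should not
depend on another module's constructor having run first.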

v2:
 -don't touch io/ code anymore
 -always register yank functions
 -'yank' now takes a list of instances to yank
 -'query-yank' returns a list of yankable instances

Lukas Straub (4):
  Introduce yank feature
  block/nbd.c: Add yank feature
  chardev/char-socket.c: Add yank feature
  migration: Add yank feature

 Makefile.objs                 |   3 +
 block/nbd.c                   | 101 ++++++++++++--------
 chardev/char-socket.c         |  24 +++++
 migration/channel.c           |  12 +++
 migration/migration.c         |  18 +++-
 migration/multifd.c           |  10 ++
 migration/qemu-file-channel.c |   6 ++
 qapi/misc.json                |  45 +++++++++
 yank.c                        | 174 ++++++++++++++++++++++++++++++++++
 yank.h                        |  67 +++++++++++++
 10 files changed, 422 insertions(+), 38 deletions(-)
 create mode 100644 yank.c
 create mode 100644 yank.h

--
2.20.1

Comments

no-reply@patchew.org May 24, 2020, 9:16 p.m. UTC | #1
Patchew URL: https://patchew.org/QEMU/cover.1590344541.git.lukasstraub2@web.de/



Hi,

This series failed the docker-quick@centos7 build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
make docker-image-centos7 V=1 NETWORK=1
time make docker-test-quick@centos7 SHOW_ENV=1 J=14 NETWORK=1
=== TEST SCRIPT END ===

migration/qemu-file-channel.o: In function `channel_close':
/tmp/qemu-test/src/migration/qemu-file-channel.c:110: undefined reference to `yank_generic_iochannel'
/tmp/qemu-test/src/migration/qemu-file-channel.c:110: undefined reference to `yank_unregister_function'
collect2: error: ld returned 1 exit status
  LINK    tests/test-cutils
make: *** [tests/test-vmstate] Error 1
make: *** Waiting for unfinished jobs....
  TEST    iotest-qcow2: 001
  TEST    iotest-qcow2: 002
---
Not run: 259
Failures: 267
Failed 1 of 119 iotests
make: *** [check-tests/check-block.sh] Error 1
make: *** wait: No child processes.  Stop.
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 664, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=db483e1a7b374ed8a9cf3528ad2357b2', '-u', '1003', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=1', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew2/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-4gmncabb/src/docker-src.2020-05-24-17.04.07.24804:/var/tmp/qemu:z,ro', 'qemu:centos7', '/var/tmp/qemu/run', 'test-quick']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=db483e1a7b374ed8a9cf3528ad2357b2
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-4gmncabb/src'
make: *** [docker-run-test-quick@centos7] Error 2

real    12m5.130s
user    0m8.082s


The full log is available at
http://patchew.org/logs/cover.1590344541.git.lukasstraub2@web.de/testing.docker-quick@centos7/?type=message.
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com

no-reply@patchew.org May 24, 2020, 9:21 p.m. UTC | #2
Patchew URL: https://patchew.org/QEMU/cover.1590344541.git.lukasstraub2@web.de/



Hi,

This series failed the asan build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#!/bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-debug@fedora TARGET_LIST=x86_64-softmmu J=14 NETWORK=1
=== TEST SCRIPT END ===

migration/qemu-file-channel.o: in function `channel_close':
/tmp/qemu-test/src/migration/qemu-file-channel.c:111: undefined reference to `yank_generic_iochannel'
/usr/bin/ld: /tmp/qemu-test/src/migration/qemu-file-channel.c:110: undefined reference to `yank_unregister_function'
clang-8: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [/tmp/qemu-test/src/rules.mak:124: tests/test-vmstate] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 664, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=18e540c0fae14ac68e571023cc096e31', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=x86_64-softmmu', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-92q_wrsw/src/docker-src.2020-05-24-17.15.58.27154:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-debug']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=18e540c0fae14ac68e571023cc096e31
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-92q_wrsw/src'
make: *** [docker-run-test-debug@fedora] Error 2

real    5m14.699s
user    0m8.970s


The full log is available at
http://patchew.org/logs/cover.1590344541.git.lukasstraub2@web.de/testing.asan/?type=message.