[RESEND,v6,00/36] Initial support for multi-process qemu

Message ID cover.1587614626.git.elena.ufimtseva@oracle.com (mailing list archive)
State New, archived

Commit Message

Elena Ufimtseva April 23, 2020, 4:13 a.m. UTC
From: Elena Ufimtseva <elena.ufimtseva@oracle.com>

Hello

This is a resend of the v6 patchset, since we regrettably omitted a few
comments from the v5 review in the previously sent series
(see https://lists.gnu.org/archive/html/qemu-devel/2020-04/msg00828.html).
We also ran more tests and fixed the build errors that were found in v6.

This work started with the presentation given in October 2017 by Marc-Andre
(Red Hat) and Konrad Wilk (Oracle) [1] and continued with Jag's BoF at
KVM Forum 2018. The multi-process project is now available and is presented
in this patchset. This first series enables the emulation of lsi53c895a in
a separate process.

We posted the Proof Of Concept patches [2] before the BoF session in 2018.
Subsequently, we posted RFC v1 [3], RFC v2 [4], RFC v3 [5], RFC v4 [6]
and v5 [7] of the patch series.

This is v6 of the patch series, and it addresses the previous feedback from
the community.
To make the series easier to review, we have separated out some of the
patches and will send them in a separate series. As per the conversation we
had during the last community call, live migration support and
asynchronous communication have been taken out of this series.
The changes also include eliminating the fork/exec of the remote process;
instead, the remote process is started by an orchestrator, which is
implemented in this series as a Python script.
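
As a rough illustration of the orchestrator idea (this is not the launcher
script from this series, whose details may differ), the sketch below starts
two processes that share a connected UNIX socket pair, instead of having one
process fork/exec the other. The function name launch_pair and the "{fd}"
placeholder convention are hypothetical and exist only for this example.

```python
import socket
import subprocess
import sys


def launch_pair(remote_cmd, qemu_cmd):
    """Start two processes that share one connected UNIX socket pair.

    Each command is a list of strings. The literal token "{fd}" in any
    argument is replaced by the number of the socket file descriptor
    that the child inherits (a convention invented for this sketch).
    """
    remote_sock, qemu_sock = socket.socketpair(socket.AF_UNIX,
                                               socket.SOCK_STREAM)
    try:
        procs = []
        for cmd, sock in ((remote_cmd, remote_sock), (qemu_cmd, qemu_sock)):
            fd = sock.fileno()
            argv = [arg.replace("{fd}", str(fd)) for arg in cmd]
            # pass_fds keeps only this one descriptor open in the child.
            procs.append(subprocess.Popen(argv, pass_fds=(fd,)))
        return procs
    finally:
        # The children hold their own copies; the orchestrator lets go.
        remote_sock.close()
        qemu_sock.close()


if __name__ == "__main__":
    # Two tiny Python children stand in for the real programs.
    writer = [sys.executable, "-c",
              "import os, sys; os.write(int(sys.argv[1]), b'ping')", "{fd}"]
    reader = [sys.executable, "-c",
              "import os, sys; print(os.read(int(sys.argv[1]), 4))", "{fd}"]
    for proc in launch_pair(writer, reader):
        proc.wait()
```

In a real setup the two command lists would be the remote process and the
QEMU command lines; no actual option names are assumed here.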

The following people contributed to this patchset:

John G Johnson <john.g.johnson@oracle.com>
Jagannathan Raman <jag.raman@oracle.com>
Elena Ufimtseva <elena.ufimtseva@oracle.com>
Kanth Ghatraju <kanth.ghatraju@oracle.com>
Konrad Wilk <konrad.wilk@oracle.com>

For the full concept writeup about QEMU disaggregation, refer to
docs/devel/multi-process.rst. Please refer to
docs/multi-process.rst for usage information.

We will post separate patchsets for the following improvements for
the experimental Qemu multi-process:
 - Live migration;
 - Asynchronous communication channel;
 - Libvirt support;

We welcome all your ideas, concerns, and questions for this patchset.

Testing results

There is one failure in the travis-ci build test that we have not been able
to reproduce.

 TEST    iotest-qcow2: 041 [fail]
QEMU          -- "/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64" -nodefaults -display none -accel qtest
QEMU_IMG      -- "/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/../../qemu-img" 
QEMU_IO       -- "/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/../../qemu-io"  --cache writeback --aio threads -f qcow2
QEMU_NBD      -- "/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/../../qemu-nbd" 
IMGFMT        -- qcow2 (compat=1.1)
IMGPROTO      -- file
PLATFORM      -- Linux/x86_64 travis-job-fc4e2553-b470-4a8b-812e-a4fcf8ba094f 5.0.0-1031-gcp
TEST_DIR      -- /home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/scratch
SOCK_DIR      -- /tmp/tmp.LOmYANt5Od
SOCKET_SCM_HELPER -- /home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/socket_scm_helper
[1]: https://www.youtube.com/watch?v=Kq1-coHh7lg
[2]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg566538.html
[3]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg602285.html
[4]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg624877.html
[5]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg642000.html
[6]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg655118.html
[7]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg682429.html

 -- 
2.25.GIT


Elena Ufimtseva (18):
  multi-process: Refactor machine_init and exit notifiers
  command-line: refractor parser code
  multi-process: Refactor chardev functions out of vl.c
  multi-process: Refactor monitor functions out of vl.c
  multi-process: add a command line option for debug file
  multi-process: introduce proxy object
  multi-process: Forward PCI config space acceses to the remote process
  multi-process: Introduce build flags to separate remote process code
  multi-process: add parse_cmdline in remote process
  multi-process: add support to parse device option
  multi-process: send heartbeat messages to remote
  multi-process: handle heartbeat messages in remote process
  multi-process: perform device reset in the remote process
  multi-process/mon: choose HMP commands based on target
  multi-process/mon: stub functions to enable QMP module for remote
    process
  multi-process/mon: enable QMP module support in the remote process
  multi-process/mon: Initialize QMP module for remote processes
  multi-process: add configure and usage information

Jagannathan Raman (17):
  memory: alloc RAM from file at offset
  monitor: destaticize HMP commands
  multi-process: Add stub functions to facilitate build of multi-process
  multi-process: Add config option for multi-process QEMU
  multi-process: build system for remote device process
  multi-process: define mpqemu-link object
  multi-process: add functions to synchronize proxy and remote endpoints
  multi-process: setup PCI host bridge for remote device
  multi-process: setup a machine object for remote device process
  multi-process: setup memory manager for remote device
  multi-process: remote process initialization
  multi-process: Initialize Proxy Object's communication channel
  multi-process: Connect Proxy Object with device in the remote process
  multi-process: PCI BAR read/write handling for proxy & remote
    endpoints
  multi-process: Synchronize remote memory
  multi-process: create IOHUB object to handle irq
  multi-process: Retrieve PCI info from remote process

John G Johnson (1):
  multi-process: add the concept description to
    docs/devel/qemu-multiprocess

 MAINTAINERS                          |  39 ++
 Makefile                             |   2 +
 Makefile.objs                        |  41 ++
 Makefile.target                      | 104 ++-
 accel/Makefile.objs                  |   2 +
 accel/stubs/kvm-stub.c               |   5 +
 accel/stubs/tcg-stub.c               | 108 +++
 backends/Makefile.objs               |   2 +
 block/Makefile.objs                  |   5 +
 block/monitor/Makefile.objs          |   2 +
 chardev/char.c                       |  14 +
 configure                            |  15 +
 docs/devel/index.rst                 |   1 +
 docs/devel/multi-process.rst         | 957 +++++++++++++++++++++++++++
 docs/multi-process.rst               |  85 +++
 exec.c                               |  31 +-
 hmp-commands-info.hx                 |  10 +
 hmp-commands.hx                      |  25 +-
 hw/Makefile.objs                     |   7 +
 hw/block/Makefile.objs               |   2 +
 hw/core/Makefile.objs                |  19 +
 hw/nvram/Makefile.objs               |   2 +
 hw/pci/Makefile.objs                 |   4 +
 hw/proxy/memory-sync.c               | 217 ++++++
 hw/proxy/qemu-proxy.c                | 488 ++++++++++++++
 hw/scsi/Makefile.objs                |   2 +
 include/chardev/char.h               |   2 +
 include/exec/address-spaces.h        |   2 +
 include/exec/ram_addr.h              |   4 +-
 include/hw/pci/pci_ids.h             |   3 +
 include/hw/proxy/memory-sync.h       |  37 ++
 include/hw/proxy/qemu-proxy.h        |  79 +++
 include/io/mpqemu-link.h             | 192 ++++++
 include/monitor/monitor.h            |   3 +
 include/qemu-common.h                |   8 +
 include/qemu-parse.h                 |  42 ++
 include/qemu/log.h                   |   1 +
 include/qemu/mmap-alloc.h            |   3 +-
 include/remote/iohub.h               |  50 ++
 include/remote/machine.h             |  32 +
 include/remote/memory.h              |  20 +
 include/remote/pcihost.h             |  45 ++
 include/sysemu/sysemu.h              |   2 +
 io/Makefile.objs                     |   2 +
 io/mpqemu-link.c                     | 407 ++++++++++++
 memory.c                             |   2 +-
 migration/Makefile.objs              |   2 +
 monitor/Makefile.objs                |   4 +
 monitor/misc.c                       |  84 +--
 monitor/monitor-internal.h           |  38 ++
 monitor/monitor.c                    |  37 ++
 qapi/Makefile.objs                   |   2 +
 qemu-parse.c                         |  93 +++
 qom/Makefile.objs                    |   4 +
 remote/Makefile.objs                 |   6 +
 remote/iohub.c                       | 148 +++++
 remote/machine.c                     |  99 +++
 remote/memory.c                      |  63 ++
 remote/pcihost.c                     |  64 ++
 remote/remote-common.h               |  21 +
 remote/remote-main.c                 | 379 +++++++++++
 remote/remote-opts.c                 |  96 +++
 remote/remote-opts.h                 |  15 +
 rules.mak                            |   2 +-
 scripts/hxtool                       |  35 +-
 scripts/mpqemu-launcher-perf-mode.py |  92 +++
 scripts/mpqemu-launcher.py           |  53 ++
 softmmu/vl.c                         | 175 +----
 stubs/Makefile.objs                  |   3 +
 stubs/audio.c                        |  12 +
 stubs/gdbstub.c                      |  23 +
 stubs/get-fd.c                       |  10 +
 stubs/machine-init-add.c             |   7 +
 stubs/machine-init-done.c            |   5 +-
 stubs/machine-init-remove.c          |   8 +
 stubs/migration.c                    | 162 +++++
 stubs/monitor.c                      |  85 ++-
 stubs/net-stub.c                     | 100 +++
 stubs/qapi-misc.c                    |  41 ++
 stubs/qapi-target.c                  |  56 ++
 stubs/replay.c                       |  18 +
 stubs/ui-stub.c                      | 130 ++++
 stubs/vl-stub.c                      | 171 +++++
 stubs/vmstate.c                      |  19 +
 stubs/xen-mapcache.c                 |  22 +
 ui/Makefile.objs                     |   2 +
 util/Makefile.objs                   |   2 +
 util/log.c                           |   2 +
 util/machine-notify.c                |  69 ++
 util/mmap-alloc.c                    |   7 +-
 util/oslib-posix.c                   |   2 +-
 91 files changed, 5356 insertions(+), 237 deletions(-)
 create mode 100644 docs/devel/multi-process.rst
 create mode 100644 docs/multi-process.rst
 create mode 100644 hw/proxy/memory-sync.c
 create mode 100644 hw/proxy/qemu-proxy.c
 create mode 100644 include/hw/proxy/memory-sync.h
 create mode 100644 include/hw/proxy/qemu-proxy.h
 create mode 100644 include/io/mpqemu-link.h
 create mode 100644 include/qemu-parse.h
 create mode 100644 include/remote/iohub.h
 create mode 100644 include/remote/machine.h
 create mode 100644 include/remote/memory.h
 create mode 100644 include/remote/pcihost.h
 create mode 100644 io/mpqemu-link.c
 create mode 100644 qemu-parse.c
 create mode 100644 remote/Makefile.objs
 create mode 100644 remote/iohub.c
 create mode 100644 remote/machine.c
 create mode 100644 remote/memory.c
 create mode 100644 remote/pcihost.c
 create mode 100644 remote/remote-common.h
 create mode 100644 remote/remote-main.c
 create mode 100644 remote/remote-opts.c
 create mode 100644 remote/remote-opts.h
 mode change 100644 => 100755 scripts/hxtool
 create mode 100755 scripts/mpqemu-launcher-perf-mode.py
 create mode 100755 scripts/mpqemu-launcher.py
 create mode 100644 stubs/audio.c
 create mode 100644 stubs/get-fd.c
 create mode 100644 stubs/machine-init-add.c
 create mode 100644 stubs/machine-init-remove.c
 create mode 100644 stubs/migration.c
 create mode 100644 stubs/net-stub.c
 create mode 100644 stubs/qapi-misc.c
 create mode 100644 stubs/qapi-target.c
 create mode 100644 stubs/ui-stub.c
 create mode 100644 stubs/vl-stub.c
 create mode 100644 stubs/xen-mapcache.c
 create mode 100644 util/machine-notify.c

Comments

Stefan Hajnoczi April 24, 2020, 12:48 p.m. UTC | #1
On Wed, Apr 22, 2020 at 09:13:35PM -0700, elena.ufimtseva@oracle.com wrote:
> There is an error in travis-ci build test which does not get reproduced.
> 
>  TEST    iotest-qcow2: 041 [fail]
> QEMU          -- "/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64" -nodefaults -display none -accel qtest
> QEMU_IMG      -- "/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/../../qemu-img" 
> QEMU_IO       -- "/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/../../qemu-io"  --cache writeback --aio threads -f qcow2
> QEMU_NBD      -- "/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/../../qemu-nbd" 
> IMGFMT        -- qcow2 (compat=1.1)
> IMGPROTO      -- file
> PLATFORM      -- Linux/x86_64 travis-job-fc4e2553-b470-4a8b-812e-a4fcf8ba094f 5.0.0-1031-gcp
> TEST_DIR      -- /home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/scratch
> SOCK_DIR      -- /tmp/tmp.LOmYANt5Od
> SOCKET_SCM_HELPER -- /home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/socket_scm_helper
> --- /home/travis/build/elena-ufimtseva/qemu-multiprocess/tests/qemu-iotests/041.out	2020-04-22 00:17:23.701844698 +0000
> +++ /home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/041.out.bad	2020-04-22 00:24:39.234343858 +0000
> @@ -1,5 +1,29 @@
> -..............................................................................................
> +........................FF....................................................................
> +======================================================================
> +FAIL: test_with_other_parent (__main__.TestRepairQuorum)
> +----------------------------------------------------------------------
> +Traceback (most recent call last):
> +  File "041", line 1049, in test_with_other_parent
> +    self.assert_qmp(result, 'return', {})
> +  File "/home/travis/build/elena-ufimtseva/qemu-multiprocess/tests/qemu-iotests/iotests.py", line 821, in assert_qmp
> +    result = self.dictpath(d, path)
> +  File "/home/travis/build/elena-ufimtseva/qemu-multiprocess/tests/qemu-iotests/iotests.py", line 797, in dictpath
> +    self.fail('failed path traversal for "%s" in "%s"' % (path, str(d)))
> +AssertionError: failed path traversal for "return" in "{'error': {'class': 'GenericError', 'desc': "UNIX socket path '/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/scratch/nbd.sock' is too long"}}"

UNIX Domain Socket paths have to be 108 characters or less.  The path in
the failed test case is 110 characters long.  You could rename your
branch to "mpqemu" to solve this failure.
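
The arithmetic can be checked with a short sketch. It assumes the 108-byte
limit quoted above (the size of sun_path on Linux; other systems may differ)
and that the rename shortens the "qemu-multiprocess" path component to
"mpqemu". The helper name socket_path_fits is made up for this example.

```python
import os

# Assumed limit, taken from the message above (sun_path size on Linux).
UNIX_PATH_MAX = 108


def socket_path_fits(path, limit=UNIX_PATH_MAX):
    """Return True if the encoded path is no longer than the limit."""
    return len(os.fsencode(path)) <= limit


failing = ("/home/travis/build/elena-ufimtseva/qemu-multiprocess/"
           "out-of-tree/build/dir/tests/qemu-iotests/scratch/nbd.sock")
# Assumption: the rename changes only this one path component.
renamed = failing.replace("qemu-multiprocess", "mpqemu")

print(len(failing), socket_path_fits(failing))   # 110 False
print(len(renamed), socket_path_fits(renamed))   # 99 True
```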

Stefan
Daniel P. Berrangé April 24, 2020, 12:53 p.m. UTC | #2
On Fri, Apr 24, 2020 at 01:48:23PM +0100, Stefan Hajnoczi wrote:
> On Wed, Apr 22, 2020 at 09:13:35PM -0700, elena.ufimtseva@oracle.com wrote:
> > There is an error in travis-ci build test which does not get reproduced.
> > 
> >  TEST    iotest-qcow2: 041 [fail]
> > QEMU          -- "/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64" -nodefaults -display none -accel qtest
> > QEMU_IMG      -- "/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/../../qemu-img" 
> > QEMU_IO       -- "/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/../../qemu-io"  --cache writeback --aio threads -f qcow2
> > QEMU_NBD      -- "/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/../../qemu-nbd" 
> > IMGFMT        -- qcow2 (compat=1.1)
> > IMGPROTO      -- file
> > PLATFORM      -- Linux/x86_64 travis-job-fc4e2553-b470-4a8b-812e-a4fcf8ba094f 5.0.0-1031-gcp
> > TEST_DIR      -- /home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/scratch
> > SOCK_DIR      -- /tmp/tmp.LOmYANt5Od
> > SOCKET_SCM_HELPER -- /home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/socket_scm_helper
> > --- /home/travis/build/elena-ufimtseva/qemu-multiprocess/tests/qemu-iotests/041.out	2020-04-22 00:17:23.701844698 +0000
> > +++ /home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/041.out.bad	2020-04-22 00:24:39.234343858 +0000
> > @@ -1,5 +1,29 @@
> > -..............................................................................................
> > +........................FF....................................................................
> > +======================================================================
> > +FAIL: test_with_other_parent (__main__.TestRepairQuorum)
> > +----------------------------------------------------------------------
> > +Traceback (most recent call last):
> > +  File "041", line 1049, in test_with_other_parent
> > +    self.assert_qmp(result, 'return', {})
> > +  File "/home/travis/build/elena-ufimtseva/qemu-multiprocess/tests/qemu-iotests/iotests.py", line 821, in assert_qmp
> > +    result = self.dictpath(d, path)
> > +  File "/home/travis/build/elena-ufimtseva/qemu-multiprocess/tests/qemu-iotests/iotests.py", line 797, in dictpath
> > +    self.fail('failed path traversal for "%s" in "%s"' % (path, str(d)))
> > +AssertionError: failed path traversal for "return" in "{'error': {'class': 'GenericError', 'desc': "UNIX socket path '/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/scratch/nbd.sock' is too long"}}"
> 
> UNIX Domain Socket paths have to be 108 characters or less.  The path in
> the failed test case is 110 characters long.  You could rename your
> branch to "mpqemu" to solve this failure.

Renaming is a pretty poor band-aid.

We should fix the i/o tests instead, so that they use a scratch dir under
$TMP to store unix sockets needed by tests instead.


Regards,
Daniel
Eric Blake April 24, 2020, 12:53 p.m. UTC | #3
On 4/24/20 7:48 AM, Stefan Hajnoczi wrote:
> On Wed, Apr 22, 2020 at 09:13:35PM -0700, elena.ufimtseva@oracle.com wrote:
>> There is an error in travis-ci build test which does not get reproduced.
>>
>>   TEST    iotest-qcow2: 041 [fail]
>> QEMU          -- "/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64" -nodefaults -display none -accel qtest

>> +Traceback (most recent call last):
>> +  File "041", line 1049, in test_with_other_parent
>> +    self.assert_qmp(result, 'return', {})
>> +  File "/home/travis/build/elena-ufimtseva/qemu-multiprocess/tests/qemu-iotests/iotests.py", line 821, in assert_qmp
>> +    result = self.dictpath(d, path)
>> +  File "/home/travis/build/elena-ufimtseva/qemu-multiprocess/tests/qemu-iotests/iotests.py", line 797, in dictpath
>> +    self.fail('failed path traversal for "%s" in "%s"' % (path, str(d)))
>> +AssertionError: failed path traversal for "return" in "{'error': {'class': 'GenericError', 'desc': "UNIX socket path '/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/scratch/nbd.sock' is too long"}}"
> 
> UNIX Domain Socket paths have to be 108 characters or less.  The path in
> the failed test case is 110 characters long.  You could rename your
> branch to "mpqemu" to solve this failure.

We recently fixed the iotests to prefer sticking NBD sockets under 
$SOCK_DIR (see commits f0e24942 and friends); did we miss test 41?
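
A minimal sketch of that idea, assuming the harness exports SOCK_DIR as shown
in the log above. The helper name nbd_socket_path and the temporary-directory
fallback are illustrative only and are not taken from iotests.py.

```python
import os
import tempfile


def nbd_socket_path(name="nbd.sock"):
    """Return a socket path under $SOCK_DIR, or under a fresh temp dir."""
    sock_dir = os.environ.get("SOCK_DIR")
    if not sock_dir:
        # Fallback for running outside the harness (an assumption of
        # this sketch, not something the test suite is known to do).
        sock_dir = tempfile.mkdtemp(prefix="sock-")
    return os.path.join(sock_dir, name)
```

With SOCK_DIR=/tmp/tmp.LOmYANt5Od, as in the log, this yields
/tmp/tmp.LOmYANt5Od/nbd.sock, which is far shorter than the path built from
the scratch directory.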
Max Reitz April 24, 2020, 1:42 p.m. UTC | #4
On 24.04.20 14:53, Eric Blake wrote:
> On 4/24/20 7:48 AM, Stefan Hajnoczi wrote:
>> On Wed, Apr 22, 2020 at 09:13:35PM -0700, elena.ufimtseva@oracle.com
>> wrote:
>>> There is an error in travis-ci build test which does not get reproduced.
>>>
>>>   TEST    iotest-qcow2: 041 [fail]
>>> QEMU          --
>>> "/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/../../x86_64-softmmu/qemu-system-x86_64"
>>> -nodefaults -display none -accel qtest
> 
>>> +Traceback (most recent call last):
>>> +  File "041", line 1049, in test_with_other_parent
>>> +    self.assert_qmp(result, 'return', {})
>>> +  File
>>> "/home/travis/build/elena-ufimtseva/qemu-multiprocess/tests/qemu-iotests/iotests.py",
>>> line 821, in assert_qmp
>>> +    result = self.dictpath(d, path)
>>> +  File
>>> "/home/travis/build/elena-ufimtseva/qemu-multiprocess/tests/qemu-iotests/iotests.py",
>>> line 797, in dictpath
>>> +    self.fail('failed path traversal for "%s" in "%s"' % (path,
>>> str(d)))
>>> +AssertionError: failed path traversal for "return" in "{'error':
>>> {'class': 'GenericError', 'desc': "UNIX socket path
>>> '/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/scratch/nbd.sock'
>>> is too long"}}"
>>
>> UNIX Domain Socket paths have to be 108 characters or less.  The path in
>> the failed test case is 110 characters long.  You could rename your
>> branch to "mpqemu" to solve this failure.
> 
> We recently fixed the iotests to prefer sticking NBD sockets under
> $SOCK_DIR (see commits f0e24942 and friends); did we miss test 41?

Looks more like I broke it.  Oops.  Will fix.

Max
Stefan Hajnoczi April 28, 2020, 5:29 p.m. UTC | #5
On Wed, Apr 22, 2020 at 09:13:35PM -0700, elena.ufimtseva@oracle.com wrote:
> We will post separate patchsets for the following improvements for
> the experimental Qemu multi-process:
>  - Live migration;
>  - Asynchronous communication channel;
>  - Libvirt support;
> 
> We welcome all your ideas, concerns, and questions for this patchset.

This patch series does two things:
1. It introduces the remote device infrastructure.
2. It creates the remote device program and the associated build changes
   (makefiles, stubs, etc).

There are many patches and it's likely that a bunch more revisions will
be necessary before this can be merged.

I want to share an idea to reduce the scope and get patches merged more
quickly.  It looks like the series can be reduced to 21 patches using
this approach.

I suggest dropping the remote device program from this patch series (and
maybe never bringing it back).  Instead, use the softmmu target for the
remote device.

Why?  Because the remote device program is just a QEMU that uses the
remote machine type and has no vCPUs:

  $ qemu-system-x86_64 -chardev id=char0,... \
                       -M remote,chardev=char0 \
                       -device lsi53c810 \
                       -drive if=none,id=drive0,file=vm.img,format=raw \
                       -device scsi-hd,drive=drive0

This will use the remote machine type, interrupt controller, and PCI bus
that you have created.

The remote machine type should default to no vCPUs and no memory
creation (the memory comes via the mpqemu link communications channel).

At this point qemu-system-x86_64 contains a lot of code that you don't
want in the final remote device program.  Let's ignore that for a
second.

Now you can submit a 21-patch series containing just the remote device
infrastructure.  This will be easier to merge.

Returning to code size, the next step is to reduce the binary.  QEMU has
a Kconfig-style system for optional features and dependencies.  It's a
better approach than creating a separate make target because it
eliminates the duplication and mess in the makefiles.

For example, you can disable TCG and KVM so that your binary has no
ability to execute guest code.  Currently ./configure disallows this but
I've tried it and it works.

You can add a new default-configs/ file that disables CONFIG_ISAPC,
CONFIG_I440FX, etc.  When you compile QEMU most of hw/ will not be built
anymore.  At this point you have a smaller binary that is still a
softmmu target so the makefiles are shared with the regular
qemu-system-x86_64.

There will be some code for which there is no Kconfig option yet.
Further improvements can be made by adding Kconfig options for any code
that you wish to eliminate.  Instead of writing makefile changes like
you did in this patch series you would be adding Kconfig options.  The
nice thing is that this work isn't specific to the remote device program
- anyone can use the new Kconfig options to reduce the size of their
QEMU.  So not only is it less messy than duplicating the makefiles,
but it also benefits everyone.

The downside to doing this is that it will take a while to eliminate all
code that you don't want via Kconfig.  However, your initial patch
series can be merged sooner and I think this direction is also cleaner.

I hope I've explained the idea properly :).  We can continue reviewing
the current series if you prefer, but I think it would be quicker to
drop the remote device program.

Stefan
Michael S. Tsirkin April 28, 2020, 5:47 p.m. UTC | #6
On Tue, Apr 28, 2020 at 06:29:20PM +0100, Stefan Hajnoczi wrote:
> On Wed, Apr 22, 2020 at 09:13:35PM -0700, elena.ufimtseva@oracle.com wrote:
> > We will post separate patchsets for the following improvements for
> > the experimental Qemu multi-process:
> >  - Live migration;
> >  - Asynchronous communication channel;
> >  - Libvirt support;
> > 
> > We welcome all your ideas, concerns, and questions for this patchset.
> 
> This patch series does two things:
> 1. It introduces the remote device infrastructure.
> 2. It creates the remote device program and the associated build changes
>    (makefiles, stubs, etc).
> 
> There are many patches and it's likely that a bunch more revisions will
> be necessary before this can be merged.
> 
> I want to share an idea to reduce the scope and get patches merged more
> quickly.  It looks like the series can be reduced to 21 patches using
> this approach.
> 
> I suggest dropping the remote device program from this patch series (and
> maybe never bringing it back).  Instead, use the softmmu target for the
> remote device.
> 
> Why?  Because the remote device program is just a QEMU that uses the
> remote machine type and has no vCPUs:
> 
>   $ qemu-system-x86_64 -chardev id=char0,... \
>                        -M remote,chardev=char0 \
> 		       -device lsi53c810 \
> 		       -drive if=none,id=drive0,file=vm.img,format=raw \
> 		       -device scsi-hd,drive=drive0
> 
> This will use the remote machine type, interrupt controller, and PCI bus
> that you have created.
> 
> The remote machine type should default to no vCPUs and no memory
> creation (the memory comes via the mpqemu link communications channel).
> 
> At this point qemu-system-x86_64 contains a lot of code that you don't
> want in the final remote device program.  Let's ignore that for a
> second.
> 
> Now you can submit a 21-patch series containing just the remote device
> infrastructure.  This will be easier to merge.
> 
> Returning to code size, the next step is to reduce the binary.  QEMU has
> a Kconfig-style system for optional features and dependencies.  It's a
> better approach than creating a separate make target because it
> eliminates the duplication and mess in the makefiles.
> 
> For example, you can disable TCG and KVM so that your binary has no
> ability to execute guest code.  Currently ./configure disallows this but
> I've tried it and it works.
> 
> You can add a new default-configs/ file that disables CONFIG_ISAPC,
> CONFIG_I440FX, etc.  When you compile QEMU most of hw/ will not be built
> anymore.  At this point you have a smaller binary that is still a
> softmmu target so the makefiles are shared with the regular
> qemu-system-x86_64.
> 
> There will be some code for which there is no Kconfig option yet.
> Further improvements can be made by adding Kconfig options for any code
> that you wish to eliminate.  Instead of writing makefile changes like
> you did in this patch series you would be adding Kconfig options.  The
> nice thing is that this work isn't specific to the remote device program
> - anyone can use the new Kconfig options to reduce the size of their
> QEMU.  So not only is it less messy than duplicating the makefiles,
> but it also benefits everyone.
> 
> The downside to doing this is that it will take a while to eliminate all
> code that you don't want via Kconfig.  However, your initial patch
> series can be merged sooner and I think this direction is also cleaner.
> 
> I hope I've explained the idea properly :).  We can continue reviewing
> the current series if you prefer, but I think it would be quicker to
> drop the remote device program.
> 
> Stefan

Building QEMU twice just to get the remote is, however, not very
attractive. So how about making remote a special target?
Either remote-softmmu/ or, if that is impossible, x86_64-remote-softmmu/
Stefan Hajnoczi April 29, 2020, 9:30 a.m. UTC | #7
On Tue, Apr 28, 2020 at 01:47:24PM -0400, Michael S. Tsirkin wrote:
> On Tue, Apr 28, 2020 at 06:29:20PM +0100, Stefan Hajnoczi wrote:
> > On Wed, Apr 22, 2020 at 09:13:35PM -0700, elena.ufimtseva@oracle.com wrote:
> > > We will post separate patchsets for the following improvements for
> > > the experimental Qemu multi-process:
> > >  - Live migration;
> > >  - Asynchronous communication channel;
> > >  - Libvirt support;
> > > 
> > > We welcome all your ideas, concerns, and questions for this patchset.
> > 
> > This patch series does two things:
> > 1. It introduces the remote device infrastructure.
> > 2. It creates the remote device program and the associated build changes
> >    (makefiles, stubs, etc).
> > 
> > There are many patches and it's likely that a bunch more revisions will
> > be necessary before this can be merged.
> > 
> > I want to share an idea to reduce the scope and get patches merged more
> > quickly.  It looks like the series can be reduced to 21 patches using
> > this approach.
> > 
> > I suggest dropping the remote device program from this patch series (and
> > maybe never bringing it back).  Instead, use the softmmu target for the
> > remote device.
> > 
> > Why?  Because the remote device program is just a QEMU that uses the
> > remote machine type and has no vCPUs:
> > 
> >   $ qemu-system-x86_64 -chardev id=char0,... \
> >                        -M remote,chardev=char0 \
> > 		       -device lsi53c810 \
> > 		       -drive if=none,id=drive0,file=vm.img,format=raw \
> > 		       -device scsi-hd,drive=drive0
> > 
> > This will use the remote machine type, interrupt controller, and PCI bus
> > that you have created.
> > 
> > The remote machine type should default to no vCPUs and no memory
> > creation (the memory comes via the mpqemu link communications channel).
> > 
> > At this point qemu-system-x86_64 contains a lot of code that you don't
> > want in the final remote device program.  Let's ignore that for a
> > second.
> > 
> > Now you can submit a 21-patch series containing just the remote device
> > infrastructure.  This will be easier to merge.
> > 
> > Returning to code size, the next step is to reduce the binary.  QEMU has
> > a Kconfig-style system for optional features and dependencies.  It's a
> > better approach than creating a separate make target because it
> > eliminates the duplication and mess in the makefiles.
> > 
> > For example, you can disable TCG and KVM so that your binary has no
> > ability to execute guest code.  Currently ./configure disallows this but
> > I've tried it and it works.
> > 
> > You can add a new default-configs/ file that disables CONFIG_ISAPC,
> > CONFIG_I440FX, etc.  When you compile QEMU most of hw/ will not be built
> > anymore.  At this point you have a smaller binary that is still a
> > softmmu target so the makefiles are shared with the regular
> > qemu-system-x86_64.
> > 
> > There will be some code for which there is no Kconfig option yet.
> > Further improvements can be made by adding Kconfig options for any code
> > that you wish to eliminate.  Instead of writing makefile changes like
> > you did in this patch series you would be adding Kconfig options.  The
> > nice thing is that this work isn't specific to the remote device program
> > - anyone can use the new Kconfig options to reduce the size of their
> > QEMU.  So not only is it less messy than duplicating the makefiles,
> > but it also benefits everyone.
> > 
> > The downside to doing this is that it will take a while to eliminate all
> > code that you don't want via Kconfig.  However, your initial patch
> > series can be merged sooner and I think this direction is also cleaner.
> > 
> > I hope I've explained the idea properly :).  We can continue reviewing
> > the current series if you prefer, but I think it would be quicker to
> > drop the remote device program.
> > 
> > Stefan
> 
> > Building QEMU twice just to get the remote is however not very
> attractive. So how about making remote a special target?
> Either remote-softmmu/ or if impossible x86_64-remote-softmmu/

Yes, that's a good idea.  It needs to be the full x86_64-remote-softmmu
because hw/ code depends on the QEMU target :(.

To summarize the big advantage of this approach (besides reducing the
patch series): the existing makefile rules for softmmu will be used to
build the remote device program.  No new main() and no new per-object
file makefile rules are needed.

Stefan

Michael S. Tsirkin April 29, 2020, 9:59 a.m. UTC | #8
On Wed, Apr 29, 2020 at 10:30:30AM +0100, Stefan Hajnoczi wrote:
> > > I suggest dropping the remote device program from this patch series (and
> > > maybe never bringing it back).  Instead, use the softmmu target for the
> > > remote device.

...

> > 
> > > Building QEMU twice just to get the remote is however not very
> > attractive. So how about making remote a special target?
> > Either remote-softmmu/ or if impossible x86_64-remote-softmmu/
> 
> Yes, that's a good idea.  It needs to be the full x86_64-remote-softmmu
> because hw/ code depends on the QEMU target :(.

BTW, using QEMU as the backend source also gives us goodies such as
cross-version compatibility for free.

Stefan Hajnoczi May 11, 2020, 2:40 p.m. UTC | #9
Hi,
Have you decided whether to drop the remote device program in favor of
using a softmmu make target?

Is there anything in this series you'd like me to review before you send
the next revision?

Stefan

Jag Raman May 11, 2020, 7:30 p.m. UTC | #10
> On May 11, 2020, at 10:40 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> 
> Hi,
> Have you decided whether to drop the remote device program in favor of
> using a softmmu make target?
> 
> Is there anything in this series you'd like me to review before you send
> the next revision?

Hi Stefan,

We are planning to drop the separate remote device program in the next
revision. We are planning to use QEMU’s existing event loop instead of
a separate event loop for the remote process, as well as the command
line invocation you suggested in your feedback.

We hope the following core patches look good to you, by and large:
[PATCH RESEND v6 01/36] memory: alloc RAM from file at offset
[PATCH RESEND v6 11/36] multi-process: define mpqemu-link object
[PATCH RESEND v6 12/36] multi-process: add functions to synchronize proxy and remote endpoints
[PATCH RESEND v6 13/36] multi-process: setup PCI host bridge for remote device
[PATCH RESEND v6 14/36] multi-process: setup a machine object for remote device process
[PATCH RESEND v6 15/36] multi-process: setup memory manager for remote device
[PATCH RESEND v6 17/36] multi-process: introduce proxy object
[PATCH RESEND v6 18/36] multi-process: Initialize Proxy Object's communication channel
[PATCH RESEND v6 19/36] multi-process: Connect Proxy Object with device in the remote process
[PATCH RESEND v6 20/36] multi-process: Forward PCI config space acceses to the remote process
[PATCH RESEND v6 21/36] multi-process: PCI BAR read/write handling for proxy & remote endpoints
[PATCH RESEND v6 22/36] multi-process: Synchronize remote memory
[PATCH RESEND v6 23/36] multi-process: create IOHUB object to handle irq
[PATCH RESEND v6 24/36] multi-process: Retrieve PCI info from remote process

Thank you very much!
--
Jag


Stefan Hajnoczi May 12, 2020, 4:13 p.m. UTC | #11
On Mon, May 11, 2020 at 03:30:50PM -0400, Jag Raman wrote:
> > On May 11, 2020, at 10:40 AM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> > 
> > Hi,
> > Have you decided whether to drop the remote device program in favor of
> > using a softmmu make target?
> > 
> > Is there anything in this series you'd like me to review before you send
> > the next revision?
> 
> Hi Stefan,
> 
> We are planning to drop the separate remote device program in the next
> revision. We are planning to use QEMU’s existing event loop instead of
> a separate event loop for the remote process, as well as the command
> line invocation you suggested in your feedback.
> 
> We hope the following core patches look good to you, by and large:
> [PATCH RESEND v6 01/36] memory: alloc RAM from file at offset
> [PATCH RESEND v6 11/36] multi-process: define mpqemu-link object
> [PATCH RESEND v6 12/36] multi-process: add functions to synchronize proxy and remote endpoints
> [PATCH RESEND v6 13/36] multi-process: setup PCI host bridge for remote device
> [PATCH RESEND v6 14/36] multi-process: setup a machine object for remote device process
> [PATCH RESEND v6 15/36] multi-process: setup memory manager for remote device
> [PATCH RESEND v6 17/36] multi-process: introduce proxy object
> [PATCH RESEND v6 18/36] multi-process: Initialize Proxy Object's communication channel
> [PATCH RESEND v6 19/36] multi-process: Connect Proxy Object with device in the remote process
> [PATCH RESEND v6 20/36] multi-process: Forward PCI config space acceses to the remote process
> [PATCH RESEND v6 21/36] multi-process: PCI BAR read/write handling for proxy & remote endpoints
> [PATCH RESEND v6 22/36] multi-process: Synchronize remote memory
> [PATCH RESEND v6 23/36] multi-process: create IOHUB object to handle irq
> [PATCH RESEND v6 24/36] multi-process: Retrieve PCI info from remote process

I've completed the review of these patches. Looking forward to
discussing more.

Stefan

Jag Raman May 12, 2020, 4:55 p.m. UTC | #12
> On May 12, 2020, at 12:13 PM, Stefan Hajnoczi <stefanha@gmail.com> wrote:
> 
> I've completed the review of these patches. Looking forward to
> discussing more.

Thank you very much, Stefan!

We will incorporate the feedback we received from your review.

Thanks!
--
Jag


Patch

--- /home/travis/build/elena-ufimtseva/qemu-multiprocess/tests/qemu-iotests/041.out	2020-04-22 00:17:23.701844698 +0000
+++ /home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/041.out.bad	2020-04-22 00:24:39.234343858 +0000
@@ -1,5 +1,29 @@ 
-..............................................................................................
+........................FF....................................................................
+======================================================================
+FAIL: test_with_other_parent (__main__.TestRepairQuorum)
+----------------------------------------------------------------------
+Traceback (most recent call last):
+  File "041", line 1049, in test_with_other_parent
+    self.assert_qmp(result, 'return', {})
+  File "/home/travis/build/elena-ufimtseva/qemu-multiprocess/tests/qemu-iotests/iotests.py", line 821, in assert_qmp
+    result = self.dictpath(d, path)
+  File "/home/travis/build/elena-ufimtseva/qemu-multiprocess/tests/qemu-iotests/iotests.py", line 797, in dictpath
+    self.fail('failed path traversal for "%s" in "%s"' % (path, str(d)))
+AssertionError: failed path traversal for "return" in "{'error': {'class': 'GenericError', 'desc': "UNIX socket path '/home/travis/build/elena-ufimtseva/qemu-multiprocess/out-of-tree/build/dir/tests/qemu-iotests/scratch/nbd.sock' is too long"}}"
+
Not run: 220 259
Failures: 041
Failed 1 of 116 iotests
/home/travis/build/elena-ufimtseva/qemu-multiprocess/tests/Makefile.include:848: recipe for target 'check-tests/check-block.sh' failed
make: *** [check-tests/check-block.sh] Error 1
The command "if [ "$BUILD_RC" -eq 0 ] ; then
    ${TEST_CMD} ;
else
    $(exit $BUILD_RC);
fi
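
The 041 failure above is reported as a UNIX socket path that is too long, so it most likely comes from the deep out-of-tree build directory on Travis rather than from the series itself. Below is a minimal sketch for checking a scratch path before running the tests. It assumes the Linux size of 108 bytes for sun_path, including the trailing NUL. The helper name, the constant, and the short example path are illustrative and not part of iotests.py.

```python
import os

# Assumption: Linux defines sockaddr_un.sun_path as 108 bytes, including
# the terminating NUL. Other platforms use smaller values (e.g. 104).
SUN_PATH_MAX = 108


def unix_socket_path_fits(path, limit=SUN_PATH_MAX):
    """Return True if path, encoded for the filesystem, fits with its NUL."""
    return len(os.fsencode(path)) < limit


# Path taken from the error message in the log above.
travis_path = ("/home/travis/build/elena-ufimtseva/qemu-multiprocess/"
               "out-of-tree/build/dir/tests/qemu-iotests/scratch/nbd.sock")
# Hypothetical shorter scratch location, for comparison only.
short_path = "/tmp/qemu-iotests/nbd.sock"

print(unix_socket_path_fits(travis_path))
print(unix_socket_path_fits(short_path))
```

With these two paths the check prints False for the Travis scratch path and True for the short one, so running the iotests from a shorter build or scratch directory should avoid this particular failure.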


Thank you!

[1]: http://events17.linuxfoundation.org/sites/events/files/slides/KVM%20FORUM%20multi-process.pdf