[v7,00/21] Initial support for multi-process qemu

Message ID cover.1593273671.git.elena.ufimtseva@oracle.com (mailing list archive)

Message

Elena Ufimtseva June 27, 2020, 5:09 p.m. UTC
From: Elena Ufimtseva <elena.ufimtseva@oracle.com>

Hello

This is v7 of the patchset.
Thank you very much for the detailed feedback on v6. We appreciate your time.

We have addressed the latest comments and suggestions provided on the
v6 patch series and incorporated them into this patchset.

This is the list of changes for v7:
 - QEMU & remote process share the same binary.
   This allowed us to reduce the number of patches as well.

 - We introduced the machine type "remote" that drives the remote process
   initialization.

 - v7 now uses QIOChannel for communication and descriptors management.

 - The remote process uses the main loop instead of a separate loop.

 - Co-routine support in the communication between the QEMU proxy and the
   remote process.
   The communication model based on co-routines needs some more work and
   we would like to hear your take on it.
   Stefan has shared some ideas on how we can proceed and we will take this
   to the next version after additional discussion.
   We did not implement the protocol to listen for and accept new connections.
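
To illustrate the "communication and descriptors management" item above, here
is a minimal sketch of passing a file descriptor alongside a small message
over a UNIX socket using SCM_RIGHTS, which is the kind of mechanism a
socket-backed channel relies on for descriptor passing. It uses plain POSIX
calls rather than the QIOChannel API, and DemoMsg, demo_send_fd(),
demo_recv_fd() and demo_roundtrip() are hypothetical names for illustration
only; DemoMsg is not the MPQemuMsg layout from this series.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical header for illustration; NOT the MPQemuMsg format. */
typedef struct {
    uint32_t cmd;
    uint32_t size;
} DemoMsg;

/* Control buffer sized and aligned for exactly one descriptor. */
typedef union {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
} DemoCtrl;

/* Send the header with one descriptor attached as SCM_RIGHTS data. */
static int demo_send_fd(int sock, const DemoMsg *msg, int fd)
{
    struct iovec iov = { .iov_base = (void *)msg, .iov_len = sizeof(*msg) };
    DemoCtrl u;
    struct msghdr mh;
    struct cmsghdr *cmsg;

    assert(msg != NULL);
    memset(&u, 0, sizeof(u));
    memset(&mh, 0, sizeof(mh));
    mh.msg_iov = &iov;
    mh.msg_iovlen = 1;
    mh.msg_control = u.buf;
    mh.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&mh);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &mh, 0) == (ssize_t)sizeof(*msg) ? 0 : -1;
}

/* Receive the header; return the attached descriptor, or -1 on error. */
static int demo_recv_fd(int sock, DemoMsg *msg)
{
    struct iovec iov = { .iov_base = msg, .iov_len = sizeof(*msg) };
    DemoCtrl u;
    struct msghdr mh;
    struct cmsghdr *cmsg;
    int fd;

    assert(msg != NULL);
    memset(&u, 0, sizeof(u));
    memset(&mh, 0, sizeof(mh));
    mh.msg_iov = &iov;
    mh.msg_iovlen = 1;
    mh.msg_control = u.buf;
    mh.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &mh, 0) != (ssize_t)sizeof(*msg)) {
        return -1;
    }
    cmsg = CMSG_FIRSTHDR(&mh);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
        return -1;
    }
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(fd));
    return fd;
}

/* Pass the read end of a pipe across a socketpair, then read through
 * the received copy. Returns 0 on success, -1 on any failure. */
static int demo_roundtrip(void)
{
    int sv[2], pfd[2], rfd, ok = -1;
    DemoMsg out = { .cmd = 1, .size = 0 }, in = { 0, 0 };
    char c = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0) {
        return -1;
    }
    if (pipe(pfd) < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }
    if (demo_send_fd(sv[0], &out, pfd[0]) == 0 &&
        (rfd = demo_recv_fd(sv[1], &in)) >= 0) {
        if (write(pfd[1], "x", 1) == 1 && read(rfd, &c, 1) == 1 &&
            c == 'x' && in.cmd == out.cmd) {
            ok = 0;
        }
        close(rfd);
    }
    close(pfd[0]);
    close(pfd[1]);
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```

Calling demo_roundtrip() from a main() exercises both ends in one process; in
a proxy/remote setup the two socket ends would live in different processes.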

There are other changes that were incorporated from the feedback we have
received on v6.

We posted the Proof Of Concept patches [2] before the BoF session in 2018.
Subsequently, we posted RFC v1 [3], RFC v2 [4], RFC v3 [5], RFC v4 [6],
v5 [7] and v6 [8] of the patch series.
The following people contributed to this patchset:

John G Johnson <john.g.johnson@oracle.com>
Jagannathan Raman <jag.raman@oracle.com>
Elena Ufimtseva <elena.ufimtseva@oracle.com>
Kanth Ghatraju <kanth.ghatraju@oracle.com>
Konrad Wilk <konrad.wilk@oracle.com>

We would also like to thank the QEMU community for your help, suggestions,
and reviews of this large series of patches.

For the full concept writeup about QEMU multi-process, please refer to
docs/devel/multi-process.rst. Also see docs/multi-process.rst for
usage information.

We will post separate patchsets for the following improvements to
the experimental QEMU multi-process feature:
 - Live migration
 - Communication channel improvements

We welcome all your ideas, concerns, and questions for this patchset.

[1]: http://events17.linuxfoundation.org/sites/events/files/slides/KVM%20FORUM%20multi-process.pdf
[1]: https://www.youtube.com/watch?v=Kq1-coHh7lg
[2]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg566538.html
[3]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg602285.html
[4]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg624877.html
[5]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg642000.html
[6]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg655118.html
[7]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg682429.html
[8]: https://www.mail-archive.com/qemu-devel@nongnu.org/msg697484.html

Elena Ufimtseva (9):
  multi-process: add qio channel function to transmit
  multi-process: define MPQemuMsg format and transmission functions
  multi-process: add co-routines to communicate with remote
  multi-process: Initialize communication channel at the remote end
  multi-process: introduce proxy object
  multi-process: Forward PCI config space accesses to the remote process
  multi-process: heartbeat messages to remote
  multi-process: perform device reset in the remote process
  multi-process: add configure and usage information

Jagannathan Raman (11):
  memory: alloc RAM from file at offset
  multi-process: Add config option for multi-process QEMU
  multi-process: setup PCI host bridge for remote device
  multi-process: setup a machine object for remote device process
  multi-process: Initialize message handler in remote device
  multi-process: setup memory manager for remote device
  multi-process: Connect Proxy Object with device in the remote process
  multi-process: PCI BAR read/write handling for proxy & remote
    endpoints
  multi-process: Synchronize remote memory
  multi-process: create IOHUB object to handle irq
  multi-process: Retrieve PCI info from remote process

John G Johnson (1):
  multi-process: add the concept description to
    docs/devel/qemu-multiprocess

 MAINTAINERS                          |  24 +
 backends/hostmem-memfd.c             |   2 +-
 configure                            |  11 +
 docs/devel/index.rst                 |   1 +
 docs/devel/multi-process.rst         | 957 +++++++++++++++++++++++++++
 docs/multi-process.rst               |  71 ++
 exec.c                               |  11 +-
 hw/Makefile.objs                     |   1 +
 hw/i386/Makefile.objs                |   3 +
 hw/i386/remote-memory.c              |  58 ++
 hw/i386/remote-msg.c                 | 301 +++++++++
 hw/i386/remote.c                     |  99 +++
 hw/misc/ivshmem.c                    |   3 +-
 hw/pci-host/Makefile.objs            |   1 +
 hw/pci-host/remote.c                 |  63 ++
 hw/pci/Makefile.objs                 |   2 +
 hw/pci/memory-sync.c                 | 214 ++++++
 hw/pci/proxy.c                       | 436 ++++++++++++
 hw/remote/Makefile.objs              |   1 +
 hw/remote/iohub.c                    | 153 +++++
 include/exec/memory.h                |   2 +
 include/exec/ram_addr.h              |   2 +-
 include/hw/i386/remote-memory.h      |  20 +
 include/hw/i386/remote.h             |  38 ++
 include/hw/pci-host/remote.h         |  34 +
 include/hw/pci/memory-sync.h         |  30 +
 include/hw/pci/pci_ids.h             |   3 +
 include/hw/pci/proxy.h               |  69 ++
 include/hw/remote/iohub.h            |  50 ++
 include/io/channel.h                 |  24 +
 include/io/mpqemu-link.h             | 140 ++++
 include/qemu/mmap-alloc.h            |   3 +-
 io/Makefile.objs                     |   2 +
 io/channel.c                         |  45 ++
 io/mpqemu-link.c                     | 277 ++++++++
 memory.c                             |   3 +-
 scripts/mpqemu-launcher-perf-mode.py |  67 ++
 scripts/mpqemu-launcher.py           |  47 ++
 util/mmap-alloc.c                    |   7 +-
 util/oslib-posix.c                   |   2 +-
 40 files changed, 3264 insertions(+), 13 deletions(-)
 create mode 100644 docs/devel/multi-process.rst
 create mode 100644 docs/multi-process.rst
 create mode 100644 hw/i386/remote-memory.c
 create mode 100644 hw/i386/remote-msg.c
 create mode 100644 hw/i386/remote.c
 create mode 100644 hw/pci-host/remote.c
 create mode 100644 hw/pci/memory-sync.c
 create mode 100644 hw/pci/proxy.c
 create mode 100644 hw/remote/Makefile.objs
 create mode 100644 hw/remote/iohub.c
 create mode 100644 include/hw/i386/remote-memory.h
 create mode 100644 include/hw/i386/remote.h
 create mode 100644 include/hw/pci-host/remote.h
 create mode 100644 include/hw/pci/memory-sync.h
 create mode 100644 include/hw/pci/proxy.h
 create mode 100644 include/hw/remote/iohub.h
 create mode 100644 include/io/mpqemu-link.h
 create mode 100644 io/mpqemu-link.c
 create mode 100644 scripts/mpqemu-launcher-perf-mode.py
 create mode 100755 scripts/mpqemu-launcher.py

Comments

Stefan Hajnoczi July 2, 2020, 1:40 p.m. UTC | #1
On Sat, Jun 27, 2020 at 10:09:22AM -0700, elena.ufimtseva@oracle.com wrote:
> From: Elena Ufimtseva <elena.ufimtseva@oracle.com>
> 
> This is the v7 of the patchset.

I have completed the review and left comments on the patches.

I'm glad it was possible to simplify this feature. The overall approach
makes sense to me and I see how it forms the base on which
VFIO-over-socket and smaller remote program builds using Kconfig can be
developed.

My main concern is that the object lifecycle has not been fully
implemented in the proxy and remote device. Error handling is
incomplete, resources are leaked, and hot unplug does not work. Thinking
through the lifecycle is very important so that additional work can
build on top of this later. I have tried to point out these issues in
the individual patches.

Jag Raman July 9, 2020, 2:16 p.m. UTC | #2
> On Jul 2, 2020, at 9:40 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> 
> On Sat, Jun 27, 2020 at 10:09:22AM -0700, elena.ufimtseva@oracle.com wrote:
>> From: Elena Ufimtseva <elena.ufimtseva@oracle.com>
>> 
>> This is the v7 of the patchset.
> 
> I have completed the review and left comments on the patches.
> 
> I'm glad it was possible to simplify this feature. The overall approach

Hi Stefan,

We’re also with you on this. The feature looks much simpler now.

> makes sense to me and I see how it forms the base on which
> VFIO-over-socket and smaller remote program builds using Kconfig can be
> developed.
> 
> My main concern is that the object lifecycle has not been fully
> implemented in the proxy and remote device. Error handling is

Thank you for your feedback. FWIW, we did check the unrealize() path
in the object lifecycle management. We noticed that the destructor for the PCI
devices (pci_qdev_unrealize()) is currently not invoking the instance-specific
destructor/unrealize functions. While this is not an excuse for not implementing
the unrealize functions, it currently doesn’t have an impact on the hot unplug path.

You’re correct, we should implement the unrealize/destructor for the Proxy & remote
objects. We’ll also look into the background for why the PCI devices don’t call
the instance-specific destructor.

> incomplete, resources are leaked, and hot unplug does not work. Thinking

We’ll double-check the resource leak issues, specifically with respect to open
file descriptors.

> through the lifecycle is very important so that additional work can
> build on top of this later. I have tried to point out these issues in
> the individual patches.

We have gone over your feedback on the individual patches and will send out
responses shortly.

Thank you very much!
--
Jag

Stefan Hajnoczi July 13, 2020, 11:21 a.m. UTC | #3
On Thu, Jul 09, 2020 at 10:16:31AM -0400, Jag Raman wrote:
> > On Jul 2, 2020, at 9:40 AM, Stefan Hajnoczi <stefanha@redhat.com> wrote:
> > On Sat, Jun 27, 2020 at 10:09:22AM -0700, elena.ufimtseva@oracle.com wrote:
> > makes sense to me and I see how it forms the base on which
> > VFIO-over-socket and smaller remote program builds using Kconfig can be
> > developed.
> > 
> > My main concern is that the object lifecycle has not been fully
> > implemented in the proxy and remote device. Error handling is
> 
> Thank you for your feedback. FWIW, we did check the unrealize() path
> in the object lifecycle management. We noticed that the destructor for the PCI
> devices (pci_qdev_unrealize()) is currently not invoking the instance-specific
> destructor/unrealize functions. While this is not an excuse for not implementing
> the unrealize functions, it currently doesn’t have an impact on the hot unplug path.
> 
> You’re correct, we should implement the unrealize/destructor for the Proxy & remote
> objects. We’ll also look into the background for why the PCI devices don’t call
> the instance-specific destructor.

PCIDeviceClass->exit() is invoked by pci_qdev_unrealize(). I'm not sure
why it's called "exit" instead of "unrealize" but PCI devices implement
it to perform clean-up.

Stefan
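
The point above, that the generic PCI unrealize path calls a per-class
exit() hook for clean-up, can be sketched with a toy model. This is only an
illustration under stated assumptions: the Toy* types and toy_* functions
below are hypothetical stand-ins, not QEMU's PCIDeviceClass or
pci_qdev_unrealize(), which carry more state and do more work. The sketch
shows that a device which allocates in realize() but provides no exit() hook
leaks across a plug/unplug cycle.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ToyPCIDevice ToyPCIDevice;

/* Toy class with the two per-device hooks discussed in the thread. */
typedef struct {
    void (*realize)(ToyPCIDevice *dev);
    void (*exit)(ToyPCIDevice *dev);   /* device-specific clean-up hook */
} ToyPCIDeviceClass;

struct ToyPCIDevice {
    const ToyPCIDeviceClass *klass;
    int *buf;                          /* resource owned by the device */
    int realized;
};

/* Count of buffers currently allocated by toy devices. */
static int toy_live_allocs;

/* Device-specific realize: acquire a resource. */
static void toy_proxy_realize(ToyPCIDevice *dev)
{
    dev->buf = calloc(16, sizeof(int));
    if (dev->buf) {
        toy_live_allocs++;
    }
}

/* Device-specific exit: release what realize acquired. */
static void toy_proxy_exit(ToyPCIDevice *dev)
{
    if (dev->buf) {
        free(dev->buf);
        dev->buf = NULL;
        toy_live_allocs--;
    }
}

/* Generic realize path: delegates to the class hook if present. */
static void toy_pci_qdev_realize(ToyPCIDevice *dev)
{
    assert(dev && dev->klass);
    if (dev->klass->realize) {
        dev->klass->realize(dev);
    }
    dev->realized = 1;
}

/* Generic unrealize path: calls the class exit() hook if present. */
static void toy_pci_qdev_unrealize(ToyPCIDevice *dev)
{
    assert(dev && dev->klass);
    if (dev->klass->exit) {
        dev->klass->exit(dev);
    }
    dev->realized = 0;
}

/* Run one plug/unplug cycle; return how many buffers were leaked. */
static int toy_hotplug_cycle(int with_exit)
{
    ToyPCIDeviceClass k = { toy_proxy_realize,
                            with_exit ? toy_proxy_exit : NULL };
    ToyPCIDevice dev = { &k, NULL, 0 };
    int before = toy_live_allocs;
    int leaked;

    toy_pci_qdev_realize(&dev);
    toy_pci_qdev_unrealize(&dev);
    leaked = toy_live_allocs - before;

    /* Tidy up after the demonstration so the model itself does not leak. */
    if (dev.buf) {
        free(dev.buf);
        toy_live_allocs--;
    }
    return leaked;
}
```

In this model toy_hotplug_cycle(1) reports no leaked buffer, while
toy_hotplug_cycle(0) reports one, which mirrors the concern about proxy and
remote objects that lack clean-up hooks.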