mbox series

[v2,00/20] VM forking

Message ID cover.1576697796.git.tamas.lengyel@intel.com (mailing list archive)
Headers show
Series VM forking | expand

Message

Tamas K Lengyel Dec. 18, 2019, 7:40 p.m. UTC
The following series implements VM forking for Intel HVM guests to allow for
the fast creation of identical VMs without the assosciated high startup costs
of booting or restoring the VM from a savefile.

JIRA issue: https://xenproject.atlassian.net/browse/XEN-89

The main design goal with this series has been to reduce the time of creating
the VM fork as much as possible. To achieve this the VM forking process is
split into two steps:
    1) forking the VM on the hypervisor side;
    2) starting QEMU to handle the backed for emulated devices.

Step 1) involves creating a VM using the new "xl fork-vm" command. The
parent VM is expected to remain paused after forks are created from it (which
is different then what process forking normally entails). During this forking
operation the HVM context and VM settings are copied over to the new forked VM.
This operation is fast and it allows the forked VM to be unpaused and to be
monitored and accessed via VMI. Note however that without its device model
running (depending on what is executing in the VM) it is bound to
misbehave/crash when its trying to access devices that would be emulated by
QEMU. We anticipate that for certain use-cases this would be an acceptable
situation, in case for example when fuzzing is performed of code segments that
don't access such devices.

Step 2) involves launching QEMU to support the forked VM, which requires the
QEMU Xen savefile to be generated manually from the parent VM. This can be
accomplished simply by connecting to its QMP socket and issuing the
"xen-save-devices-state" command as documented by QEMU:
https://github.com/qemu/qemu/blob/master/docs/xen-save-devices-state.txt
Once the QEMU Xen savefile is generated the new "xl fork-launch-dm" command is
used to launch QEMU and load the specified savefile for it.

At runtime the forked VM starts running with an empty p2m which gets lazily
populated when the VM generates EPT faults, similar to how altp2m views are
populated. If the memory access is a read-only access, the p2m entry is
populated with a memory shared entry with its parent. For write memory accesses
or in case memory sharing wasn't possible (for example in case a reference is
held by a third party), a new page is allocated and the page contents are
copied over from the parent VM. Forks can be further forked if needed, thus
allowing for further memory savings.

A VM fork reset hypercall is also added that allows the fork to be reset to the
state it was just after a fork. This is an optimization for cases where the
forks are very short-lived and run without a device model, so resetting saves
some time compared to creating a brand new fork.

The series has been tested with both Linux and Windows VMs and functions as
expected. VM forking time has been measured to be 0.018s, device model launch
to be around 1s depending largely on the number of devices being emulated.

Patches 1-2 implement changes to existing internal Xen APIs to make VM forking
possible.

Patches 3-4 are simple code-formatting fixes for the toolstack and Xen for the
memory sharing paths with no functional changes.

Patches 5-16 are code-cleanups and adjustments of to Xen memory sharing
subsystem with no functional changes.

Patch 17 adds the hypervisor-side code implementing VM forking.

Patch 18 is integration of mem_access with forked VMs.

Patch 19 implements the VM fork reset operation hypervisor side bits.

Patch 20 adds the toolstack-side code implementing VM forking and reset.

Tamas K Lengyel (20):
  x86: make hvm_{get/set}_param accessible
  xen/x86: Make hap_get_allocation accessible
  tools/libxc: clean up memory sharing files
  x86/mem_sharing: cleanup code and comments in various locations
  x86/mem_sharing: make get_two_gfns take locks conditionally
  x86/mem_sharing: drop flags from mem_sharing_unshare_page
  x86/mem_sharing: don't try to unshare twice during page fault
  x86/mem_sharing: define mem_sharing_domain to hold some scattered
    variables
  x86/mem_sharing: Use INVALID_MFN and p2m_is_shared in
    relinquish_shared_pages
  x86/mem_sharing: Make add_to_physmap static and shorten name
  x86/mem_sharing: Convert MEM_SHARING_DESTROY_GFN to a bool
  x86/mem_sharing: Replace MEM_SHARING_DEBUG with gdprintk
  x86/mem_sharing: ASSERT that p2m_set_entry succeeds
  x86/mem_sharing: Enable mem_sharing on first memop
  x86/mem_sharing: Skip xen heap pages in memshr nominate
  x86/mem_sharing: check page type count earlier
  xen/mem_sharing: VM forking
  xen/mem_access: Use __get_gfn_type_access in set_mem_access
  x86/mem_sharing: reset a fork
  xen/tools: VM forking toolstack side

 tools/libxc/include/xenctrl.h     |  30 +-
 tools/libxc/xc_memshr.c           |  34 +-
 tools/libxl/libxl.h               |   7 +
 tools/libxl/libxl_create.c        | 237 +++++---
 tools/libxl/libxl_dm.c            |   2 +-
 tools/libxl/libxl_dom.c           |  83 ++-
 tools/libxl/libxl_internal.h      |   1 +
 tools/libxl/libxl_types.idl       |   1 +
 tools/xl/xl.h                     |   5 +
 tools/xl/xl_cmdtable.c            |  22 +
 tools/xl/xl_saverestore.c         |  96 ++++
 tools/xl/xl_vmcontrol.c           |   8 +
 xen/arch/x86/hvm/hvm.c            | 206 ++++---
 xen/arch/x86/mm/hap/hap.c         |   3 +-
 xen/arch/x86/mm/mem_access.c      |   5 +-
 xen/arch/x86/mm/mem_sharing.c     | 875 +++++++++++++++++++++---------
 xen/arch/x86/mm/p2m.c             |  34 +-
 xen/common/memory.c               |   2 +-
 xen/drivers/passthrough/pci.c     |   3 +-
 xen/include/asm-x86/hap.h         |   1 +
 xen/include/asm-x86/hvm/domain.h  |   6 +-
 xen/include/asm-x86/hvm/hvm.h     |   4 +
 xen/include/asm-x86/mem_sharing.h |  84 ++-
 xen/include/asm-x86/p2m.h         |  14 +-
 xen/include/public/memory.h       |   6 +
 xen/include/xen/sched.h           |   1 +
 26 files changed, 1258 insertions(+), 512 deletions(-)

Comments

Roger Pau Monné Dec. 19, 2019, 9:48 a.m. UTC | #1
On Wed, Dec 18, 2019 at 11:40:37AM -0800, Tamas K Lengyel wrote:
> The following series implements VM forking for Intel HVM guests to allow for
> the fast creation of identical VMs without the assosciated high startup costs
> of booting or restoring the VM from a savefile.
> 
> JIRA issue: https://xenproject.atlassian.net/browse/XEN-89
> 
> The main design goal with this series has been to reduce the time of creating
> the VM fork as much as possible. To achieve this the VM forking process is
> split into two steps:
>     1) forking the VM on the hypervisor side;
>     2) starting QEMU to handle the backed for emulated devices.
> 
> Step 1) involves creating a VM using the new "xl fork-vm" command. The
> parent VM is expected to remain paused after forks are created from it (which
> is different then what process forking normally entails). During this forking
               ^ than
> operation the HVM context and VM settings are copied over to the new forked VM.
> This operation is fast and it allows the forked VM to be unpaused and to be
> monitored and accessed via VMI. Note however that without its device model
> running (depending on what is executing in the VM) it is bound to
> misbehave/crash when its trying to access devices that would be emulated by
> QEMU. We anticipate that for certain use-cases this would be an acceptable
> situation, in case for example when fuzzing is performed of code segments that
> don't access such devices.
> 
> Step 2) involves launching QEMU to support the forked VM, which requires the
> QEMU Xen savefile to be generated manually from the parent VM. This can be
> accomplished simply by connecting to its QMP socket and issuing the
> "xen-save-devices-state" command as documented by QEMU:
> https://github.com/qemu/qemu/blob/master/docs/xen-save-devices-state.txt
> Once the QEMU Xen savefile is generated the new "xl fork-launch-dm" command is
> used to launch QEMU and load the specified savefile for it.

IMO having two different commands is confusing for the end user, I
would rather have something like:

xl fork-vm [-d] ...

Where '-d' would prevent forking any user-space emulators. I don't
thinks there's a need for a separate command to fork the underlying
user-space emulators.

Thanks, Roger.
Tamas K Lengyel Dec. 19, 2019, 3:58 p.m. UTC | #2
On Thu, Dec 19, 2019 at 2:48 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Wed, Dec 18, 2019 at 11:40:37AM -0800, Tamas K Lengyel wrote:
> > The following series implements VM forking for Intel HVM guests to allow for
> > the fast creation of identical VMs without the assosciated high startup costs
> > of booting or restoring the VM from a savefile.
> >
> > JIRA issue: https://xenproject.atlassian.net/browse/XEN-89
> >
> > The main design goal with this series has been to reduce the time of creating
> > the VM fork as much as possible. To achieve this the VM forking process is
> > split into two steps:
> >     1) forking the VM on the hypervisor side;
> >     2) starting QEMU to handle the backed for emulated devices.
> >
> > Step 1) involves creating a VM using the new "xl fork-vm" command. The
> > parent VM is expected to remain paused after forks are created from it (which
> > is different then what process forking normally entails). During this forking
>                ^ than
> > operation the HVM context and VM settings are copied over to the new forked VM.
> > This operation is fast and it allows the forked VM to be unpaused and to be
> > monitored and accessed via VMI. Note however that without its device model
> > running (depending on what is executing in the VM) it is bound to
> > misbehave/crash when its trying to access devices that would be emulated by
> > QEMU. We anticipate that for certain use-cases this would be an acceptable
> > situation, in case for example when fuzzing is performed of code segments that
> > don't access such devices.
> >
> > Step 2) involves launching QEMU to support the forked VM, which requires the
> > QEMU Xen savefile to be generated manually from the parent VM. This can be
> > accomplished simply by connecting to its QMP socket and issuing the
> > "xen-save-devices-state" command as documented by QEMU:
> > https://github.com/qemu/qemu/blob/master/docs/xen-save-devices-state.txt
> > Once the QEMU Xen savefile is generated the new "xl fork-launch-dm" command is
> > used to launch QEMU and load the specified savefile for it.
>
> IMO having two different commands is confusing for the end user, I
> would rather have something like:
>
> xl fork-vm [-d] ...
>
> Where '-d' would prevent forking any user-space emulators. I don't
> thinks there's a need for a separate command to fork the underlying
> user-space emulators.

Keeping it as two commands allows you to start up the fork and let it
run immediately and only start up QEMU when you notice it is needed.
The idea being that you can monitor the kernel and see when it tries
to do some I/O that would require the QEMU backend. If you combine the
commands that option goes away. Also, QEMU itself isn't getting forked
right now, we just start a new QEMU process with the saved-state
getting loaded into it. I looked into implementing a QEMU fork command
but it turns out that for the vast majority of our use-cases QEMU
isn't needed at all, so developing that addition was tabled.

Tamas
Roger Pau Monné Dec. 30, 2019, 5:59 p.m. UTC | #3
On Thu, Dec 19, 2019 at 08:58:01AM -0700, Tamas K Lengyel wrote:
> On Thu, Dec 19, 2019 at 2:48 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Wed, Dec 18, 2019 at 11:40:37AM -0800, Tamas K Lengyel wrote:
> > > The following series implements VM forking for Intel HVM guests to allow for
> > > the fast creation of identical VMs without the assosciated high startup costs
> > > of booting or restoring the VM from a savefile.
> > >
> > > JIRA issue: https://xenproject.atlassian.net/browse/XEN-89
> > >
> > > The main design goal with this series has been to reduce the time of creating
> > > the VM fork as much as possible. To achieve this the VM forking process is
> > > split into two steps:
> > >     1) forking the VM on the hypervisor side;
> > >     2) starting QEMU to handle the backed for emulated devices.
> > >
> > > Step 1) involves creating a VM using the new "xl fork-vm" command. The
> > > parent VM is expected to remain paused after forks are created from it (which
> > > is different then what process forking normally entails). During this forking
> >                ^ than
> > > operation the HVM context and VM settings are copied over to the new forked VM.
> > > This operation is fast and it allows the forked VM to be unpaused and to be
> > > monitored and accessed via VMI. Note however that without its device model
> > > running (depending on what is executing in the VM) it is bound to
> > > misbehave/crash when its trying to access devices that would be emulated by
> > > QEMU. We anticipate that for certain use-cases this would be an acceptable
> > > situation, in case for example when fuzzing is performed of code segments that
> > > don't access such devices.
> > >
> > > Step 2) involves launching QEMU to support the forked VM, which requires the
> > > QEMU Xen savefile to be generated manually from the parent VM. This can be
> > > accomplished simply by connecting to its QMP socket and issuing the
> > > "xen-save-devices-state" command as documented by QEMU:
> > > https://github.com/qemu/qemu/blob/master/docs/xen-save-devices-state.txt
> > > Once the QEMU Xen savefile is generated the new "xl fork-launch-dm" command is
> > > used to launch QEMU and load the specified savefile for it.
> >
> > IMO having two different commands is confusing for the end user, I
> > would rather have something like:
> >
> > xl fork-vm [-d] ...
> >
> > Where '-d' would prevent forking any user-space emulators. I don't
> > thinks there's a need for a separate command to fork the underlying
> > user-space emulators.
> 
> Keeping it as two commands allows you to start up the fork and let it
> run immediately and only start up QEMU when you notice it is needed.
> The idea being that you can monitor the kernel and see when it tries
> to do some I/O that would require the QEMU backend. If you combine the
> commands that option goes away.

I'm not sure I see why, you could still provide a `xl fork-vm [-c]
...` that would just lunch a QEMU instance. End users using xl have
AFAICT no way to tell whether or when a QEMU is needed or not, and
hence the default behavior should be a fully functional one.

IMO I think fork-vm without any options should do a complete fork of a
VM, rather than a partial one without a device model clone.

> Also, QEMU itself isn't getting forked
> right now, we just start a new QEMU process with the saved-state
> getting loaded into it. I looked into implementing a QEMU fork command
> but it turns out that for the vast majority of our use-cases QEMU
> isn't needed at all, so developing that addition was tabled.

Starting a new process with the saved-state looks fine to me, it can
always be improved afterwards if there's a need for it.

Thanks, Roger.
Tamas K Lengyel Dec. 30, 2019, 6:15 p.m. UTC | #4
On Mon, Dec 30, 2019 at 10:59 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Thu, Dec 19, 2019 at 08:58:01AM -0700, Tamas K Lengyel wrote:
> > On Thu, Dec 19, 2019 at 2:48 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > On Wed, Dec 18, 2019 at 11:40:37AM -0800, Tamas K Lengyel wrote:
> > > > The following series implements VM forking for Intel HVM guests to allow for
> > > > the fast creation of identical VMs without the assosciated high startup costs
> > > > of booting or restoring the VM from a savefile.
> > > >
> > > > JIRA issue: https://xenproject.atlassian.net/browse/XEN-89
> > > >
> > > > The main design goal with this series has been to reduce the time of creating
> > > > the VM fork as much as possible. To achieve this the VM forking process is
> > > > split into two steps:
> > > >     1) forking the VM on the hypervisor side;
> > > >     2) starting QEMU to handle the backed for emulated devices.
> > > >
> > > > Step 1) involves creating a VM using the new "xl fork-vm" command. The
> > > > parent VM is expected to remain paused after forks are created from it (which
> > > > is different then what process forking normally entails). During this forking
> > >                ^ than
> > > > operation the HVM context and VM settings are copied over to the new forked VM.
> > > > This operation is fast and it allows the forked VM to be unpaused and to be
> > > > monitored and accessed via VMI. Note however that without its device model
> > > > running (depending on what is executing in the VM) it is bound to
> > > > misbehave/crash when its trying to access devices that would be emulated by
> > > > QEMU. We anticipate that for certain use-cases this would be an acceptable
> > > > situation, in case for example when fuzzing is performed of code segments that
> > > > don't access such devices.
> > > >
> > > > Step 2) involves launching QEMU to support the forked VM, which requires the
> > > > QEMU Xen savefile to be generated manually from the parent VM. This can be
> > > > accomplished simply by connecting to its QMP socket and issuing the
> > > > "xen-save-devices-state" command as documented by QEMU:
> > > > https://github.com/qemu/qemu/blob/master/docs/xen-save-devices-state.txt
> > > > Once the QEMU Xen savefile is generated the new "xl fork-launch-dm" command is
> > > > used to launch QEMU and load the specified savefile for it.
> > >
> > > IMO having two different commands is confusing for the end user, I
> > > would rather have something like:
> > >
> > > xl fork-vm [-d] ...
> > >
> > > Where '-d' would prevent forking any user-space emulators. I don't
> > > thinks there's a need for a separate command to fork the underlying
> > > user-space emulators.
> >
> > Keeping it as two commands allows you to start up the fork and let it
> > run immediately and only start up QEMU when you notice it is needed.
> > The idea being that you can monitor the kernel and see when it tries
> > to do some I/O that would require the QEMU backend. If you combine the
> > commands that option goes away.
>
> I'm not sure I see why, you could still provide a `xl fork-vm [-c]
> ...` that would just lunch a QEMU instance. End users using xl have
> AFAICT no way to tell whether or when a QEMU is needed or not, and
> hence the default behavior should be a fully functional one.
>
> IMO I think fork-vm without any options should do a complete fork of a
> VM, rather than a partial one without a device model clone.

I understand your point but implementing that is outside the scope of
what we are doing right now. There are a lot more steps involved if
you want to create a fully functional VM fork with QEMU, for example
you also have to create a separate disk so you don't clobber the
parent VM's disk. Also, saving the QEMU device state is currently
hard-wired into the save/migration operation, so changing that
plumbing in libxl is quite involved. I actually found it way easier to
just write a script that connects to the socket and saves it to a
target file then going through the pain of adjusting libxl. So while
this could be implemented at this time it won't be.

Tamas
Julien Grall Dec. 30, 2019, 6:43 p.m. UTC | #5
Hi Tamas,

On 30/12/2019 18:15, Tamas K Lengyel wrote:
> On Mon, Dec 30, 2019 at 10:59 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>>
>> On Thu, Dec 19, 2019 at 08:58:01AM -0700, Tamas K Lengyel wrote:
>>> On Thu, Dec 19, 2019 at 2:48 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>>
>>>> On Wed, Dec 18, 2019 at 11:40:37AM -0800, Tamas K Lengyel wrote:
>>>>> The following series implements VM forking for Intel HVM guests to allow for
>>>>> the fast creation of identical VMs without the assosciated high startup costs
>>>>> of booting or restoring the VM from a savefile.
>>>>>
>>>>> JIRA issue: https://xenproject.atlassian.net/browse/XEN-89
>>>>>
>>>>> The main design goal with this series has been to reduce the time of creating
>>>>> the VM fork as much as possible. To achieve this the VM forking process is
>>>>> split into two steps:
>>>>>      1) forking the VM on the hypervisor side;
>>>>>      2) starting QEMU to handle the backed for emulated devices.
>>>>>
>>>>> Step 1) involves creating a VM using the new "xl fork-vm" command. The
>>>>> parent VM is expected to remain paused after forks are created from it (which
>>>>> is different then what process forking normally entails). During this forking
>>>>                 ^ than
>>>>> operation the HVM context and VM settings are copied over to the new forked VM.
>>>>> This operation is fast and it allows the forked VM to be unpaused and to be
>>>>> monitored and accessed via VMI. Note however that without its device model
>>>>> running (depending on what is executing in the VM) it is bound to
>>>>> misbehave/crash when its trying to access devices that would be emulated by
>>>>> QEMU. We anticipate that for certain use-cases this would be an acceptable
>>>>> situation, in case for example when fuzzing is performed of code segments that
>>>>> don't access such devices.
>>>>>
>>>>> Step 2) involves launching QEMU to support the forked VM, which requires the
>>>>> QEMU Xen savefile to be generated manually from the parent VM. This can be
>>>>> accomplished simply by connecting to its QMP socket and issuing the
>>>>> "xen-save-devices-state" command as documented by QEMU:
>>>>> https://github.com/qemu/qemu/blob/master/docs/xen-save-devices-state.txt
>>>>> Once the QEMU Xen savefile is generated the new "xl fork-launch-dm" command is
>>>>> used to launch QEMU and load the specified savefile for it.
>>>>
>>>> IMO having two different commands is confusing for the end user, I
>>>> would rather have something like:
>>>>
>>>> xl fork-vm [-d] ...
>>>>
>>>> Where '-d' would prevent forking any user-space emulators. I don't
>>>> thinks there's a need for a separate command to fork the underlying
>>>> user-space emulators.
>>>
>>> Keeping it as two commands allows you to start up the fork and let it
>>> run immediately and only start up QEMU when you notice it is needed.
>>> The idea being that you can monitor the kernel and see when it tries
>>> to do some I/O that would require the QEMU backend. If you combine the
>>> commands that option goes away.
>>
>> I'm not sure I see why, you could still provide a `xl fork-vm [-c]
>> ...` that would just lunch a QEMU instance. End users using xl have
>> AFAICT no way to tell whether or when a QEMU is needed or not, and
>> hence the default behavior should be a fully functional one.
>>
>> IMO I think fork-vm without any options should do a complete fork of a
>> VM, rather than a partial one without a device model clone.
> 
> I understand your point but implementing that is outside the scope of
> what we are doing right now. There are a lot more steps involved if
> you want to create a fully functional VM fork with QEMU, for example
> you also have to create a separate disk so you don't clobber the
> parent VM's disk. Also, saving the QEMU device state is currently
> hard-wired into the save/migration operation, so changing that
> plumbing in libxl is quite involved. I actually found it way easier to
> just write a script that connects to the socket and saves it to a
> target file then going through the pain of adjusting libxl. So while
> this could be implemented at this time it won't be.
That's fine to not implement it right now, however the user interface 
should be able to cater it.

In this case, I agree with Roger that it is more intuitive to think that 
fork means a complete fork, not a partial one.

You could impose the user to always pass that option to not clone the 
device model and return an error if it is not there.

Cheers,
Tamas K Lengyel Dec. 30, 2019, 8:46 p.m. UTC | #6
On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
>
> Hi Tamas,
>
> On 30/12/2019 18:15, Tamas K Lengyel wrote:
> > On Mon, Dec 30, 2019 at 10:59 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >>
> >> On Thu, Dec 19, 2019 at 08:58:01AM -0700, Tamas K Lengyel wrote:
> >>> On Thu, Dec 19, 2019 at 2:48 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >>>>
> >>>> On Wed, Dec 18, 2019 at 11:40:37AM -0800, Tamas K Lengyel wrote:
> >>>>> The following series implements VM forking for Intel HVM guests to allow for
> >>>>> the fast creation of identical VMs without the assosciated high startup costs
> >>>>> of booting or restoring the VM from a savefile.
> >>>>>
> >>>>> JIRA issue: https://xenproject.atlassian.net/browse/XEN-89
> >>>>>
> >>>>> The main design goal with this series has been to reduce the time of creating
> >>>>> the VM fork as much as possible. To achieve this the VM forking process is
> >>>>> split into two steps:
> >>>>>      1) forking the VM on the hypervisor side;
> >>>>>      2) starting QEMU to handle the backed for emulated devices.
> >>>>>
> >>>>> Step 1) involves creating a VM using the new "xl fork-vm" command. The
> >>>>> parent VM is expected to remain paused after forks are created from it (which
> >>>>> is different then what process forking normally entails). During this forking
> >>>>                 ^ than
> >>>>> operation the HVM context and VM settings are copied over to the new forked VM.
> >>>>> This operation is fast and it allows the forked VM to be unpaused and to be
> >>>>> monitored and accessed via VMI. Note however that without its device model
> >>>>> running (depending on what is executing in the VM) it is bound to
> >>>>> misbehave/crash when its trying to access devices that would be emulated by
> >>>>> QEMU. We anticipate that for certain use-cases this would be an acceptable
> >>>>> situation, in case for example when fuzzing is performed of code segments that
> >>>>> don't access such devices.
> >>>>>
> >>>>> Step 2) involves launching QEMU to support the forked VM, which requires the
> >>>>> QEMU Xen savefile to be generated manually from the parent VM. This can be
> >>>>> accomplished simply by connecting to its QMP socket and issuing the
> >>>>> "xen-save-devices-state" command as documented by QEMU:
> >>>>> https://github.com/qemu/qemu/blob/master/docs/xen-save-devices-state.txt
> >>>>> Once the QEMU Xen savefile is generated the new "xl fork-launch-dm" command is
> >>>>> used to launch QEMU and load the specified savefile for it.
> >>>>
> >>>> IMO having two different commands is confusing for the end user, I
> >>>> would rather have something like:
> >>>>
> >>>> xl fork-vm [-d] ...
> >>>>
> >>>> Where '-d' would prevent forking any user-space emulators. I don't
> >>>> thinks there's a need for a separate command to fork the underlying
> >>>> user-space emulators.
> >>>
> >>> Keeping it as two commands allows you to start up the fork and let it
> >>> run immediately and only start up QEMU when you notice it is needed.
> >>> The idea being that you can monitor the kernel and see when it tries
> >>> to do some I/O that would require the QEMU backend. If you combine the
> >>> commands that option goes away.
> >>
> >> I'm not sure I see why, you could still provide a `xl fork-vm [-c]
> >> ...` that would just lunch a QEMU instance. End users using xl have
> >> AFAICT no way to tell whether or when a QEMU is needed or not, and
> >> hence the default behavior should be a fully functional one.
> >>
> >> IMO I think fork-vm without any options should do a complete fork of a
> >> VM, rather than a partial one without a device model clone.
> >
> > I understand your point but implementing that is outside the scope of
> > what we are doing right now. There are a lot more steps involved if
> > you want to create a fully functional VM fork with QEMU, for example
> > you also have to create a separate disk so you don't clobber the
> > parent VM's disk. Also, saving the QEMU device state is currently
> > hard-wired into the save/migration operation, so changing that
> > plumbing in libxl is quite involved. I actually found it way easier to
> > just write a script that connects to the socket and saves it to a
> > target file then going through the pain of adjusting libxl. So while
> > this could be implemented at this time it won't be.
> That's fine to not implement it right now, however the user interface
> should be able to cater it.
>
> In this case, I agree with Roger that it is more intuitive to think that
> fork means a complete fork, not a partial one.
>
> You could impose the user to always pass that option to not clone the
> device model and return an error if it is not there.

Just to be clear, I can add the option to the "fork-vm" command to
load the QEMU state with it, effectively combining the "fork-vm" and
"fork-launch-dm" into one. But I still need the separate
"fork-launch-dm" command since in our model we need to be able to
launch the VM and run it without QEMU for a while, only launching QEMU
when it is determined to be necessary. So if that's what you are
asking, sure, I can do that.

But keep in mind that the "fork-vm" command even with this update
would still not produce for you a "fully functional" VM on its own.
The user still has to produce a new VM config file, create the new
disk, save the QEMU state, etc. So if your concern is that the
"fork-vm" command's name will imply that it is going to be producing
fully functional VM on its own I would rather just rename the command
because by itself it will never create a fully functional VM.

Tamas
Julien Grall Dec. 31, 2019, 12:20 a.m. UTC | #7
Hi,

On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:

> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
> But keep in mind that the "fork-vm" command even with this update
> would still not produce for you a "fully functional" VM on its own.
> The user still has to produce a new VM config file, create the new
> disk, save the QEMU state, etc.


 If you fork then the configuration should be very similar. Right?

So why does the user requires to provide a new config rather than the
command to update the existing one? To me, it feels this is a call to make
mistake when forking.

How is the new config different from the original VM?

As a side note, I can't see any patch adding documentation.

Cheers,
Tamas K Lengyel Dec. 31, 2019, 12:37 a.m. UTC | #8
On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
>
> Hi,
>
> On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
>>
>> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
>> But keep in mind that the "fork-vm" command even with this update
>> would still not produce for you a "fully functional" VM on its own.
>> The user still has to produce a new VM config file, create the new
>> disk, save the QEMU state, etc.
>
>
>  If you fork then the configuration should be very similar. Right?
>
> So why does the user requires to provide a new config rather than the command to update the existing one? To me, it feels this is a call to make mistake when forking.
>
> How is the new config different from the original VM?

The config must be different at least by giving the fork a different
name. That's the minimum and it's enough only if the VM you are
forking has no disk at all. If it has a disk, you also have to update
the config to point to where the new disk is. I'm using LVM snapshots
but you could also use qcow2, or whatever else there is for disk-CoW.
The fork can also have different options enabled than it's parent. For
example in our test-case, the forks have altp2m enabled while the
parent VM doesn't. There could be other options like that someone
might want to enable for the fork(s). If there is networking involved
you likely also have to attach the fork to a new VLAN as to avoid
MAC-address collision on the bridge. So there are quite a lot of
variation possible, hence its better to have the user generate the new
config they want instead of xl coming up with something on its own.

>
> As a side note, I can't see any patch adding documentation.

It's only an experimental feature so adding documentation was not a
priority. The documentation is pretty much in the cover letter. I'm
happy to add its content as a file under docs in a patch (with the
above extra information).

Tamas
Roger Pau Monné Dec. 31, 2019, 10:40 a.m. UTC | #9
On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
> On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
> >
> > Hi,
> >
> > On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
> >>
> >> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
> >> But keep in mind that the "fork-vm" command even with this update
> >> would still not produce for you a "fully functional" VM on its own.
> >> The user still has to produce a new VM config file, create the new
> >> disk, save the QEMU state, etc.

IMO the default behavior of the fork command should be to leave the
original VM paused, so that you can continue using the same disk and
network config in the fork and you won't need to pass a new config
file.

As Julien already said, maybe I wasn't clear in my previous replies:
I'm not asking you to implement all this, it's fine if the
implementation of the fork-vm xl command requires you to pass certain
options, and that the default behavior is not implemented.

We need an interface that's sane, and that's designed to be easy and
comprehensive to use, not an interface built around what's currently
implemented.

> >
> >  If you fork then the configuration should be very similar. Right?
> >
> > So why does the user requires to provide a new config rather than the command to update the existing one? To me, it feels this is a call to make mistake when forking.
> >
> > How is the new config different from the original VM?
> 
> The config must be different at least by giving the fork a different
> name. That's the minimum and it's enough only if the VM you are
> forking has no disk at all.

Adding an option to pass an explicit name for the fork would be handy,
or else xl could come up with a name by itself, like it's done for
migration, ie: <orignal name>--fork<digit>.

> If it has a disk, you also have to update
> the config to point to where the new disk is. I'm using LVM snapshots
> but you could also use qcow2, or whatever else there is for disk-CoW.
> The fork can also have different options enabled than it's parent. For
> example in our test-case, the forks have altp2m enabled while the
> parent VM doesn't. There could be other options like that someone
> might want to enable for the fork(s). If there is networking involved
> you likely also have to attach the fork to a new VLAN as to avoid
> MAC-address collision on the bridge. So there are quite a lot of
> variation possible, hence its better to have the user generate the new
> config they want instead of xl coming up with something on its own.

Passing a new config file for the fork is indeed fine, but maybe we
don't want this to be the default behavior, as said above I think it's
possible to fork a VM without passing a new config file.

> >
> > As a side note, I can't see any patch adding documentation.
> 
> It's only an experimental feature so adding documentation was not a
> priority. The documentation is pretty much in the cover letter. I'm
> happy to add its content as a file under docs in a patch (with the
> above extra information).

Please also document the new xl command(s) in the man page [0].

Thanks, Roger.

[0] https://xenbits.xen.org/docs/unstable/man/xl.1.html
Tamas K Lengyel Dec. 31, 2019, 3 p.m. UTC | #10
On Tue, Dec 31, 2019 at 3:40 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
> > On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
> > >>
> > >> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
> > >> But keep in mind that the "fork-vm" command even with this update
> > >> would still not produce for you a "fully functional" VM on its own.
> > >> The user still has to produce a new VM config file, create the new
> > >> disk, save the QEMU state, etc.
>
> IMO the default behavior of the fork command should be to leave the
> original VM paused, so that you can continue using the same disk and
> network config in the fork and you won't need to pass a new config
> file.
>
> As Julien already said, maybe I wasn't clear in my previous replies:
> I'm not asking you to implement all this, it's fine if the
> implementation of the fork-vm xl command requires you to pass certain
> options, and that the default behavior is not implemented.
>
> We need an interface that's sane, and that's designed to be easy and
> comprehensive to use, not an interface built around what's currently
> implemented.

OK, so I think that would look like "xl fork-vm <parent_domid>" with
additional options for things like name, disk, vlan, or a completely
new config, all of which are currently not implemented, + an
additional option to not launch QEMU at all, which would be the only
one currently working. Also keeping the separate "xl fork-launch-dm"
as is. Is that what we are talking about?

>
> > >
> > >  If you fork then the configuration should be very similar. Right?
> > >
> > > So why does the user requires to provide a new config rather than the command to update the existing one? To me, it feels this is a call to make mistake when forking.
> > >
> > > How is the new config different from the original VM?
> >
> > The config must be different at least by giving the fork a different
> > name. That's the minimum and it's enough only if the VM you are
> > forking has no disk at all.
>
> Adding an option to pass an explicit name for the fork would be handy,
> or else xl could come up with a name by itself, like it's done for
> migration, ie: <orignal name>--fork<digit>.
>
> > If it has a disk, you also have to update
> > the config to point to where the new disk is. I'm using LVM snapshots
> > but you could also use qcow2, or whatever else there is for disk-CoW.
> > The fork can also have different options enabled than it's parent. For
> > example in our test-case, the forks have altp2m enabled while the
> > parent VM doesn't. There could be other options like that someone
> > might want to enable for the fork(s). If there is networking involved
> > you likely also have to attach the fork to a new VLAN as to avoid
> > MAC-address collision on the bridge. So there are quite a lot of
> > variation possible, hence its better to have the user generate the new
> > config they want instead of xl coming up with something on its own.
>
> Passing a new config file for the fork is indeed fine, but maybe we
> don't want this to be the default behavior, as said above I think it's
> possible to fork a VM without passing a new config file.
>
> > >
> > > As a side note, I can't see any patch adding documentation.
> >
> > It's only an experimental feature so adding documentation was not a
> > priority. The documentation is pretty much in the cover letter. I'm
> > happy to add its content as a file under docs in a patch (with the
> > above extra information).
>
> Please also document the new xl command(s) in the man page [0].

Ack.

Thanks,
Tamas
Roger Pau Monné Dec. 31, 2019, 3:11 p.m. UTC | #11
On Tue, Dec 31, 2019 at 08:00:17AM -0700, Tamas K Lengyel wrote:
> On Tue, Dec 31, 2019 at 3:40 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
> > > On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
> > > >
> > > > Hi,
> > > >
> > > > On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
> > > >>
> > > >> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
> > > >> But keep in mind that the "fork-vm" command even with this update
> > > >> would still not produce for you a "fully functional" VM on its own.
> > > >> The user still has to produce a new VM config file, create the new
> > > >> disk, save the QEMU state, etc.
> >
> > IMO the default behavior of the fork command should be to leave the
> > original VM paused, so that you can continue using the same disk and
> > network config in the fork and you won't need to pass a new config
> > file.
> >
> > As Julien already said, maybe I wasn't clear in my previous replies:
> > I'm not asking you to implement all this, it's fine if the
> > implementation of the fork-vm xl command requires you to pass certain
> > options, and that the default behavior is not implemented.
> >
> > We need an interface that's sane, and that's designed to be easy and
> > comprehensive to use, not an interface built around what's currently
> > implemented.
> 
> OK, so I think that would look like "xl fork-vm <parent_domid>" with
> additional options for things like name, disk, vlan, or a completely
> new config, all of which are currently not implemented, + an
> additional option to not launch QEMU at all, which would be the only
> one currently working. Also keeping the separate "xl fork-launch-dm"
> as is. Is that what we are talking about?

I think fork-launch-vm should just be an option of fork-vm (ie:
--launch-dm-only or some such). I don't think there's a reason to have
a separate top-level command to just launch the device model.

Thanks, Roger.
Tamas K Lengyel Dec. 31, 2019, 4:08 p.m. UTC | #12
On Tue, Dec 31, 2019 at 8:11 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Tue, Dec 31, 2019 at 08:00:17AM -0700, Tamas K Lengyel wrote:
> > On Tue, Dec 31, 2019 at 3:40 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
> > > > On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
> > > > >
> > > > > Hi,
> > > > >
> > > > > On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
> > > > >>
> > > > >> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
> > > > >> But keep in mind that the "fork-vm" command even with this update
> > > > >> would still not produce for you a "fully functional" VM on its own.
> > > > >> The user still has to produce a new VM config file, create the new
> > > > >> disk, save the QEMU state, etc.
> > >
> > > IMO the default behavior of the fork command should be to leave the
> > > original VM paused, so that you can continue using the same disk and
> > > network config in the fork and you won't need to pass a new config
> > > file.
> > >
> > > As Julien already said, maybe I wasn't clear in my previous replies:
> > > I'm not asking you to implement all this, it's fine if the
> > > implementation of the fork-vm xl command requires you to pass certain
> > > options, and that the default behavior is not implemented.
> > >
> > > We need an interface that's sane, and that's designed to be easy and
> > > comprehensive to use, not an interface built around what's currently
> > > implemented.
> >
> > OK, so I think that would look like "xl fork-vm <parent_domid>" with
> > additional options for things like name, disk, vlan, or a completely
> > new config, all of which are currently not implemented, + an
> > additional option to not launch QEMU at all, which would be the only
> > one currently working. Also keeping the separate "xl fork-launch-dm"
> > as is. Is that what we are talking about?
>
> I think fork-launch-vm should just be an option of fork-vm (ie:
> --launch-dm-only or some such). I don't think there's a reason to have
> a separate top-level command to just launch the device model.

It's just that the fork-launch-dm needs the domid of the fork, while
the fork-vm needs the parent's domid. But I guess we can interpret the
"domid" required input differently depending on which sub-option is
specified for the command. Let's see how it pans out.

Thanks,
Tamas
Tamas K Lengyel Dec. 31, 2019, 4:36 p.m. UTC | #13
On Tue, Dec 31, 2019 at 9:08 AM Tamas K Lengyel <tamas@tklengyel.com> wrote:
>
> On Tue, Dec 31, 2019 at 8:11 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Tue, Dec 31, 2019 at 08:00:17AM -0700, Tamas K Lengyel wrote:
> > > On Tue, Dec 31, 2019 at 3:40 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > >
> > > > On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
> > > > > On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
> > > > > >
> > > > > > Hi,
> > > > > >
> > > > > > On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
> > > > > >>
> > > > > >> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
> > > > > >> But keep in mind that the "fork-vm" command even with this update
> > > > > >> would still not produce for you a "fully functional" VM on its own.
> > > > > >> The user still has to produce a new VM config file, create the new
> > > > > >> disk, save the QEMU state, etc.
> > > >
> > > > IMO the default behavior of the fork command should be to leave the
> > > > original VM paused, so that you can continue using the same disk and
> > > > network config in the fork and you won't need to pass a new config
> > > > file.
> > > >
> > > > As Julien already said, maybe I wasn't clear in my previous replies:
> > > > I'm not asking you to implement all this, it's fine if the
> > > > implementation of the fork-vm xl command requires you to pass certain
> > > > options, and that the default behavior is not implemented.
> > > >
> > > > We need an interface that's sane, and that's designed to be easy and
> > > > comprehensive to use, not an interface built around what's currently
> > > > implemented.
> > >
> > > OK, so I think that would look like "xl fork-vm <parent_domid>" with
> > > additional options for things like name, disk, vlan, or a completely
> > > new config, all of which are currently not implemented, + an
> > > additional option to not launch QEMU at all, which would be the only
> > > one currently working. Also keeping the separate "xl fork-launch-dm"
> > > as is. Is that what we are talking about?
> >
> > I think fork-launch-vm should just be an option of fork-vm (ie:
> > --launch-dm-only or some such). I don't think there's a reason to have
> > a separate top-level command to just launch the device model.
>
> It's just that the fork-launch-dm needs the domid of the fork, while
> the fork-vm needs the parent's domid. But I guess we can interpret the
> "domid" required input differently depending on which sub-option is
> specified for the command. Let's see how it pans out.

How does the following look for the interface?

    { "fork-vm",
      &main_fork_vm, 0, 1,
      "Fork a domain from the running parent domid",
      "[options] <Domid>",
      "-h                           Print this help.\n"
      "-N <name>                    Assign name to VM fork.\n"
      "-D <disk>                    Assign disk to VM fork.\n"
      "-B <bridge                   Assign bridge to VM fork.\n"
      "-V <vlan>                    Assign vlan to VM fork.\n"
      "-C <config>                  Use config file for VM fork.\n"
      "-Q <qemu-save-file>          Use qemu save file for VM fork.\n"
      "--launch-dm  <yes|no|late>   Launch device model (QEMU) for VM fork.\n"
      "--fork-reset                 Reset VM fork.\n"
      "-p                           Do not unpause VMs after fork."
      "-h                           Print this help.\n"
      "-d                           Enable debug messages.\n"
    },

Currently the parts that are implemented would look like:
xl fork-vm -p --launch-dm no <parent_domid>
xl fork-vm -p --launch-dm late -C <config> -Q <qemu-save-file> <fork_domid>
xl fork-vm -p --fork-reset <fork_domid>

Tamas
Julien Grall Jan. 8, 2020, 9:42 a.m. UTC | #14
Hi Tamas,

On 31/12/2019 16:36, Tamas K Lengyel wrote:
> On Tue, Dec 31, 2019 at 9:08 AM Tamas K Lengyel <tamas@tklengyel.com> wrote:
>>
>> On Tue, Dec 31, 2019 at 8:11 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>
>>> On Tue, Dec 31, 2019 at 08:00:17AM -0700, Tamas K Lengyel wrote:
>>>> On Tue, Dec 31, 2019 at 3:40 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>>>
>>>>> On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
>>>>>> On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
>>>>>>>>
>>>>>>>> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
>>>>>>>> But keep in mind that the "fork-vm" command even with this update
>>>>>>>> would still not produce for you a "fully functional" VM on its own.
>>>>>>>> The user still has to produce a new VM config file, create the new
>>>>>>>> disk, save the QEMU state, etc.
>>>>>
>>>>> IMO the default behavior of the fork command should be to leave the
>>>>> original VM paused, so that you can continue using the same disk and
>>>>> network config in the fork and you won't need to pass a new config
>>>>> file.
>>>>>
>>>>> As Julien already said, maybe I wasn't clear in my previous replies:
>>>>> I'm not asking you to implement all this, it's fine if the
>>>>> implementation of the fork-vm xl command requires you to pass certain
>>>>> options, and that the default behavior is not implemented.
>>>>>
>>>>> We need an interface that's sane, and that's designed to be easy and
>>>>> comprehensive to use, not an interface built around what's currently
>>>>> implemented.
>>>>
>>>> OK, so I think that would look like "xl fork-vm <parent_domid>" with
>>>> additional options for things like name, disk, vlan, or a completely
>>>> new config, all of which are currently not implemented, + an
>>>> additional option to not launch QEMU at all, which would be the only
>>>> one currently working. Also keeping the separate "xl fork-launch-dm"
>>>> as is. Is that what we are talking about?
>>>
>>> I think fork-launch-vm should just be an option of fork-vm (ie:
>>> --launch-dm-only or some such). I don't think there's a reason to have
>>> a separate top-level command to just launch the device model.
>>
>> It's just that the fork-launch-dm needs the domid of the fork, while
>> the fork-vm needs the parent's domid. But I guess we can interpret the
>> "domid" required input differently depending on which sub-option is
>> specified for the command. Let's see how it pans out.
> 
> How does the following look for the interface?
> 
>      { "fork-vm",
>        &main_fork_vm, 0, 1,
>        "Fork a domain from the running parent domid",
>        "[options] <Domid>",
>        "-h                           Print this help.\n"
>        "-N <name>                    Assign name to VM fork.\n"
>        "-D <disk>                    Assign disk to VM fork.\n"
>        "-B <bridge                   Assign bridge to VM fork.\n"
>        "-V <vlan>                    Assign vlan to VM fork.\n"
>        "-C <config>                  Use config file for VM fork.\n"
>        "-Q <qemu-save-file>          Use qemu save file for VM fork.\n"
>        "--launch-dm  <yes|no|late>   Launch device model (QEMU) for VM fork.\n"
>        "--fork-reset                 Reset VM fork.\n"
>        "-p                           Do not unpause VMs after fork."
>        "-h                           Print this help.\n"
>        "-d                           Enable debug messages.\n"
>      },
> 
> Currently the parts that are implemented would look like:
> xl fork-vm -p --launch-dm no <parent_domid>
> xl fork-vm -p --launch-dm late -C <config> -Q <qemu-save-file> <fork_domid>
> xl fork-vm -p --fork-reset <fork_domid>

The interface looks good to me. Note that I don't think you need to 
describre all the unimplemented option in the help. It would be 
sufficient to describe what you only support and bail out if the user 
gives something different.

What matters is we are able to extend the command line option over the time.

Cheers,
Roger Pau Monné Jan. 8, 2020, 3:08 p.m. UTC | #15
On Tue, Dec 31, 2019 at 09:36:01AM -0700, Tamas K Lengyel wrote:
> On Tue, Dec 31, 2019 at 9:08 AM Tamas K Lengyel <tamas@tklengyel.com> wrote:
> >
> > On Tue, Dec 31, 2019 at 8:11 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > On Tue, Dec 31, 2019 at 08:00:17AM -0700, Tamas K Lengyel wrote:
> > > > On Tue, Dec 31, 2019 at 3:40 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > >
> > > > > On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
> > > > > > On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
> > > > > > >
> > > > > > > Hi,
> > > > > > >
> > > > > > > On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
> > > > > > >>
> > > > > > >> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
> > > > > > >> But keep in mind that the "fork-vm" command even with this update
> > > > > > >> would still not produce for you a "fully functional" VM on its own.
> > > > > > >> The user still has to produce a new VM config file, create the new
> > > > > > >> disk, save the QEMU state, etc.
> > > > >
> > > > > IMO the default behavior of the fork command should be to leave the
> > > > > original VM paused, so that you can continue using the same disk and
> > > > > network config in the fork and you won't need to pass a new config
> > > > > file.
> > > > >
> > > > > As Julien already said, maybe I wasn't clear in my previous replies:
> > > > > I'm not asking you to implement all this, it's fine if the
> > > > > implementation of the fork-vm xl command requires you to pass certain
> > > > > options, and that the default behavior is not implemented.
> > > > >
> > > > > We need an interface that's sane, and that's designed to be easy and
> > > > > comprehensive to use, not an interface built around what's currently
> > > > > implemented.
> > > >
> > > > OK, so I think that would look like "xl fork-vm <parent_domid>" with
> > > > additional options for things like name, disk, vlan, or a completely
> > > > new config, all of which are currently not implemented, + an
> > > > additional option to not launch QEMU at all, which would be the only
> > > > one currently working. Also keeping the separate "xl fork-launch-dm"
> > > > as is. Is that what we are talking about?
> > >
> > > I think fork-launch-vm should just be an option of fork-vm (ie:
> > > --launch-dm-only or some such). I don't think there's a reason to have
> > > a separate top-level command to just launch the device model.
> >
> > It's just that the fork-launch-dm needs the domid of the fork, while
> > the fork-vm needs the parent's domid. But I guess we can interpret the
> > "domid" required input differently depending on which sub-option is
> > specified for the command. Let's see how it pans out.
> 
> How does the following look for the interface?
> 
>     { "fork-vm",
>       &main_fork_vm, 0, 1,
>       "Fork a domain from the running parent domid",
>       "[options] <Domid>",
>       "-h                           Print this help.\n"
>       "-N <name>                    Assign name to VM fork.\n"
>       "-D <disk>                    Assign disk to VM fork.\n"
>       "-B <bridge                   Assign bridge to VM fork.\n"
>       "-V <vlan>                    Assign vlan to VM fork.\n"

IMO I think the name of fork is the only useful option. Being able to
assign disks or bridges from the command line seems quite complicated.
What about VMs with multiple disks? Or VMs with multiple nics on
different bridges?

I think it's easier for both the implementation and the user to just
use a config file in that case.

>       "-C <config>                  Use config file for VM fork.\n"
>       "-Q <qemu-save-file>          Use qemu save file for VM fork.\n"
>       "--launch-dm  <yes|no|late>   Launch device model (QEMU) for VM fork.\n"
>       "--fork-reset                 Reset VM fork.\n"
>       "-p                           Do not unpause VMs after fork."

I think the default behaviour should be to leave the original VM
paused and the forked one running, and hence this should be:

        "-p                           Leave forked VM paused."
	"-u                           Leave parent VM unpaused."

>       "-h                           Print this help.\n"
>       "-d                           Enable debug messages.\n"
>     },
> 
> Currently the parts that are implemented would look like:
> xl fork-vm -p --launch-dm no <parent_domid>
> xl fork-vm -p --launch-dm late -C <config> -Q <qemu-save-file> <fork_domid>

Why do you need a config file for launching the Qemu device model?
Doesn't the save-file contain all the information?

I think you also need something like:

# xl fork-vm --launch-dm late <parent_domid> <fork_domid>

So that a user doesn't need to pass a qemu-save-file?

Can you also list the command used to get the Qemu save-file from the
parent? (just for completeness purposes).

Thanks, Roger.
Tamas K Lengyel Jan. 8, 2020, 3:32 p.m. UTC | #16
On Wed, Jan 8, 2020 at 8:08 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Tue, Dec 31, 2019 at 09:36:01AM -0700, Tamas K Lengyel wrote:
> > On Tue, Dec 31, 2019 at 9:08 AM Tamas K Lengyel <tamas@tklengyel.com> wrote:
> > >
> > > On Tue, Dec 31, 2019 at 8:11 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > >
> > > > On Tue, Dec 31, 2019 at 08:00:17AM -0700, Tamas K Lengyel wrote:
> > > > > On Tue, Dec 31, 2019 at 3:40 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > >
> > > > > > On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
> > > > > > > On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
> > > > > > > >
> > > > > > > > Hi,
> > > > > > > >
> > > > > > > > On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
> > > > > > > >>
> > > > > > > >> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
> > > > > > > >> But keep in mind that the "fork-vm" command even with this update
> > > > > > > >> would still not produce for you a "fully functional" VM on its own.
> > > > > > > >> The user still has to produce a new VM config file, create the new
> > > > > > > >> disk, save the QEMU state, etc.
> > > > > >
> > > > > > IMO the default behavior of the fork command should be to leave the
> > > > > > original VM paused, so that you can continue using the same disk and
> > > > > > network config in the fork and you won't need to pass a new config
> > > > > > file.
> > > > > >
> > > > > > As Julien already said, maybe I wasn't clear in my previous replies:
> > > > > > I'm not asking you to implement all this, it's fine if the
> > > > > > implementation of the fork-vm xl command requires you to pass certain
> > > > > > options, and that the default behavior is not implemented.
> > > > > >
> > > > > > We need an interface that's sane, and that's designed to be easy and
> > > > > > comprehensive to use, not an interface built around what's currently
> > > > > > implemented.
> > > > >
> > > > > OK, so I think that would look like "xl fork-vm <parent_domid>" with
> > > > > additional options for things like name, disk, vlan, or a completely
> > > > > new config, all of which are currently not implemented, + an
> > > > > additional option to not launch QEMU at all, which would be the only
> > > > > one currently working. Also keeping the separate "xl fork-launch-dm"
> > > > > as is. Is that what we are talking about?
> > > >
> > > > I think fork-launch-vm should just be an option of fork-vm (ie:
> > > > --launch-dm-only or some such). I don't think there's a reason to have
> > > > a separate top-level command to just launch the device model.
> > >
> > > It's just that the fork-launch-dm needs the domid of the fork, while
> > > the fork-vm needs the parent's domid. But I guess we can interpret the
> > > "domid" required input differently depending on which sub-option is
> > > specified for the command. Let's see how it pans out.
> >
> > How does the following look for the interface?
> >
> >     { "fork-vm",
> >       &main_fork_vm, 0, 1,
> >       "Fork a domain from the running parent domid",
> >       "[options] <Domid>",
> >       "-h                           Print this help.\n"
> >       "-N <name>                    Assign name to VM fork.\n"
> >       "-D <disk>                    Assign disk to VM fork.\n"
> >       "-B <bridge                   Assign bridge to VM fork.\n"
> >       "-V <vlan>                    Assign vlan to VM fork.\n"
>
> IMO I think the name of fork is the only useful option. Being able to
> assign disks or bridges from the command line seems quite complicated.
> What about VMs with multiple disks? Or VMs with multiple nics on
> different bridges?
>
> I think it's easier for both the implementation and the user to just
> use a config file in that case.

I agree but it sounded to me you guys wanted to have a "complete"
interface even if it's unimplemented. This is what a complete
interface would look to me.

>
> >       "-C <config>                  Use config file for VM fork.\n"
> >       "-Q <qemu-save-file>          Use qemu save file for VM fork.\n"
> >       "--launch-dm  <yes|no|late>   Launch device model (QEMU) for VM fork.\n"
> >       "--fork-reset                 Reset VM fork.\n"
> >       "-p                           Do not unpause VMs after fork."
>
> I think the default behaviour should be to leave the original VM
> paused and the forked one running, and hence this should be:

That is the default. I guess the text saying VMs was not correctly
worded, it just means don't unpause fork after it's created. The
parent remains always paused.

>
>         "-p                           Leave forked VM paused."
>         "-u                           Leave parent VM unpaused."

But you shouldn't unpause the parent VM at all. It should remain
paused as long as there are forks running that were split from it.
Unpausing it will lead to subtle and unexplainable crashes in the fork
since the fork now will use pages that are from a different execution
path. Technically in the future it would be possible to unpause the VM
but it requires to fully populate the pagetables in all forks made
from it with mem_shared entries and deduplicate to the forks all the
pages that's can't be be shared. That was what I originally tried to
do but it was extremely slow, hence the lazy-population of the
pagetable in the forks.

>
> >       "-h                           Print this help.\n"
> >       "-d                           Enable debug messages.\n"
> >     },
> >
> > Currently the parts that are implemented would look like:
> > xl fork-vm -p --launch-dm no <parent_domid>
> > xl fork-vm -p --launch-dm late -C <config> -Q <qemu-save-file> <fork_domid>
>
> Why do you need a config file for launching the Qemu device model?
> Doesn't the save-file contain all the information?

The config is used to populate xenstore, not just for QEMU. The QEMU
save file doesn't contain the xl config. This is not a full VM save
file, it is only the QEMU state that gets dumped with
xen-save-devices-state.

>
> I think you also need something like:
>
> # xl fork-vm --launch-dm late <parent_domid> <fork_domid>
>
> So that a user doesn't need to pass a qemu-save-file?

This doesn't make much sense to me. To launch QEMU you need the config
file to wire things up correctly. Like in order to launch QEMU you
need to tell it the name of the VM, disk path, etc. that are all
contained in the config.

>
> Can you also list the command used to get the Qemu save-file from the
> parent? (just for completeness purposes).

It's explained in the cover letter. You connect to the QEMU socket and
issue the xen-save-devices-state QMP command.

Tamas
George Dunlap Jan. 8, 2020, 4:34 p.m. UTC | #17
On 12/31/19 3:11 PM, Roger Pau Monné wrote:
> On Tue, Dec 31, 2019 at 08:00:17AM -0700, Tamas K Lengyel wrote:
>> On Tue, Dec 31, 2019 at 3:40 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>
>>> On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
>>>> On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
>>>>>>
>>>>>> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
>>>>>> But keep in mind that the "fork-vm" command even with this update
>>>>>> would still not produce for you a "fully functional" VM on its own.
>>>>>> The user still has to produce a new VM config file, create the new
>>>>>> disk, save the QEMU state, etc.
>>>
>>> IMO the default behavior of the fork command should be to leave the
>>> original VM paused, so that you can continue using the same disk and
>>> network config in the fork and you won't need to pass a new config
>>> file.
>>>
>>> As Julien already said, maybe I wasn't clear in my previous replies:
>>> I'm not asking you to implement all this, it's fine if the
>>> implementation of the fork-vm xl command requires you to pass certain
>>> options, and that the default behavior is not implemented.
>>>
>>> We need an interface that's sane, and that's designed to be easy and
>>> comprehensive to use, not an interface built around what's currently
>>> implemented.
>>
>> OK, so I think that would look like "xl fork-vm <parent_domid>" with
>> additional options for things like name, disk, vlan, or a completely
>> new config, all of which are currently not implemented, + an
>> additional option to not launch QEMU at all, which would be the only
>> one currently working. Also keeping the separate "xl fork-launch-dm"
>> as is. Is that what we are talking about?
> 
> I think fork-launch-vm should just be an option of fork-vm (ie:
> --launch-dm-only or some such). I don't think there's a reason to have
> a separate top-level command to just launch the device model.

So first of all, Tamas -- do you actually need to exec xl here?  Would
it make sense for these to start out simply as libxl functions that are
called by your system?

I actually disagree that we want a single command to do all of these.
If we did want `exec xl` to be one of the supported interfaces, I think
it would break down something like this:

`xl fork-domain`: Only forks the domain.
`xl fork-launch-dm`: (or attach-dm?): Start up and attach the
devicemodel to the domain

Then `xl fork` (or maybe `xl fork-vm`) would be something implemented in
the future that would fork the entire domain.

(This is similar to how `git am` works for instance; internally it runs
several steps, including `git mailsplit`, `git mailinfo`, and `git
apply-patch`, each of which can be called individually.)

I think I would also have:

`xl fork-save-dm`: Connect over qmp to the parent domain and save the dm
file

Then have `xl fork-launch-dm` either take a filename (saved from the
previous step) or a parent domain id (in which case it would arrange to
save the file itself).

Although in fact, is there any reason we couldn't store the parent
domain ID in xenstore, so that `xl fork-launch-dm` could find the parent
by itself?  (Although that, of course, is something that could be added
later if it's not something Tamas needs.)

Thoughts?

 -George
Tamas K Lengyel Jan. 8, 2020, 5:06 p.m. UTC | #18
On Wed, Jan 8, 2020 at 9:34 AM George Dunlap <george.dunlap@citrix.com> wrote:
>
> On 12/31/19 3:11 PM, Roger Pau Monné wrote:
> > On Tue, Dec 31, 2019 at 08:00:17AM -0700, Tamas K Lengyel wrote:
> >> On Tue, Dec 31, 2019 at 3:40 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >>>
> >>> On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
> >>>> On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
> >>>>>>
> >>>>>> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
> >>>>>> But keep in mind that the "fork-vm" command even with this update
> >>>>>> would still not produce for you a "fully functional" VM on its own.
> >>>>>> The user still has to produce a new VM config file, create the new
> >>>>>> disk, save the QEMU state, etc.
> >>>
> >>> IMO the default behavior of the fork command should be to leave the
> >>> original VM paused, so that you can continue using the same disk and
> >>> network config in the fork and you won't need to pass a new config
> >>> file.
> >>>
> >>> As Julien already said, maybe I wasn't clear in my previous replies:
> >>> I'm not asking you to implement all this, it's fine if the
> >>> implementation of the fork-vm xl command requires you to pass certain
> >>> options, and that the default behavior is not implemented.
> >>>
> >>> We need an interface that's sane, and that's designed to be easy and
> >>> comprehensive to use, not an interface built around what's currently
> >>> implemented.
> >>
> >> OK, so I think that would look like "xl fork-vm <parent_domid>" with
> >> additional options for things like name, disk, vlan, or a completely
> >> new config, all of which are currently not implemented, + an
> >> additional option to not launch QEMU at all, which would be the only
> >> one currently working. Also keeping the separate "xl fork-launch-dm"
> >> as is. Is that what we are talking about?
> >
> > I think fork-launch-vm should just be an option of fork-vm (ie:
> > --launch-dm-only or some such). I don't think there's a reason to have
> > a separate top-level command to just launch the device model.
>
> So first of all, Tamas -- do you actually need to exec xl here?  Would
> it make sense for these to start out simply as libxl functions that are
> called by your system?

For my current tools & tests - no. I don't start QEMU for the forks at
all. So at this point I don't even need libxl. But I can foresee that
at some point in the future it may become necessary in case we want
allow the forked VM to touch emulated devices. Wiring QEMU up and
making the system functional as a whole I found it easier to do it via
xl. There is just way too many moving components involved to do that
any other way.

>
> I actually disagree that we want a single command to do all of these.
> If we did want `exec xl` to be one of the supported interfaces, I think
> it would break down something like this:
>
> `xl fork-domain`: Only forks the domain.
> `xl fork-launch-dm`: (or attach-dm?): Start up and attach the
> devicemodel to the domain
>
> Then `xl fork` (or maybe `xl fork-vm`) would be something implemented in
> the future that would fork the entire domain.

I really don't have a strong opinion about this either way. I can see
it working either way. Having them all bundled under a single
top-level comment doesn't pollute the help text when someone is just
looking at what xl can do in general. Makes that command a lot more
complex for sure but I don't think it's too bad.

>
> (This is similar to how `git am` works for instance; internally it runs
> several steps, including `git mailsplit`, `git mailinfo`, and `git
> apply-patch`, each of which can be called individually.)
>
> I think I would also have:
>
> `xl fork-save-dm`: Connect over qmp to the parent domain and save the dm
> file

Aye, could be done. For now I didn't bother since its trivial to do
manually already.

>
> Then have `xl fork-launch-dm` either take a filename (saved from the
> previous step) or a parent domain id (in which case it would arrange to
> save the file itself).
>
> Although in fact, is there any reason we couldn't store the parent
> domain ID in xenstore, so that `xl fork-launch-dm` could find the parent
> by itself?  (Although that, of course, is something that could be added
> later if it's not something Tamas needs.)

Could be done. But I store ID internally in my tools anyway since I
need it to initialize VMI. So having it in Xenstore is not required
for me. In fact I would prefer to leave Xenstore out of these
operations as much as possible cause it would slow things down. In my
latest tests forking is down to 0.0007s, having to touch Xenstore for
each would slow things down considerably.

Thanks,
Tamas
George Dunlap Jan. 8, 2020, 5:16 p.m. UTC | #19
On 1/8/20 5:06 PM, Tamas K Lengyel wrote:
> On Wed, Jan 8, 2020 at 9:34 AM George Dunlap <george.dunlap@citrix.com> wrote:
>>
>> On 12/31/19 3:11 PM, Roger Pau Monné wrote:
>>> On Tue, Dec 31, 2019 at 08:00:17AM -0700, Tamas K Lengyel wrote:
>>>> On Tue, Dec 31, 2019 at 3:40 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>>>>>
>>>>> On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
>>>>>> On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
>>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
>>>>>>>>
>>>>>>>> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
>>>>>>>> But keep in mind that the "fork-vm" command even with this update
>>>>>>>> would still not produce for you a "fully functional" VM on its own.
>>>>>>>> The user still has to produce a new VM config file, create the new
>>>>>>>> disk, save the QEMU state, etc.
>>>>>
>>>>> IMO the default behavior of the fork command should be to leave the
>>>>> original VM paused, so that you can continue using the same disk and
>>>>> network config in the fork and you won't need to pass a new config
>>>>> file.
>>>>>
>>>>> As Julien already said, maybe I wasn't clear in my previous replies:
>>>>> I'm not asking you to implement all this, it's fine if the
>>>>> implementation of the fork-vm xl command requires you to pass certain
>>>>> options, and that the default behavior is not implemented.
>>>>>
>>>>> We need an interface that's sane, and that's designed to be easy and
>>>>> comprehensive to use, not an interface built around what's currently
>>>>> implemented.
>>>>
>>>> OK, so I think that would look like "xl fork-vm <parent_domid>" with
>>>> additional options for things like name, disk, vlan, or a completely
>>>> new config, all of which are currently not implemented, + an
>>>> additional option to not launch QEMU at all, which would be the only
>>>> one currently working. Also keeping the separate "xl fork-launch-dm"
>>>> as is. Is that what we are talking about?
>>>
>>> I think fork-launch-vm should just be an option of fork-vm (ie:
>>> --launch-dm-only or some such). I don't think there's a reason to have
>>> a separate top-level command to just launch the device model.
>>
>> So first of all, Tamas -- do you actually need to exec xl here?  Would
>> it make sense for these to start out simply as libxl functions that are
>> called by your system?
> 
> For my current tools & tests - no. I don't start QEMU for the forks at
> all. So at this point I don't even need libxl. But I can foresee that
> at some point in the future it may become necessary in case we want
> allow the forked VM to touch emulated devices. Wiring QEMU up and
> making the system functional as a whole I found it easier to do it via
> xl. There is just way too many moving components involved to do that
> any other way.
> 
>>
>> I actually disagree that we want a single command to do all of these.
>> If we did want `exec xl` to be one of the supported interfaces, I think
>> it would break down something like this:
>>
>> `xl fork-domain`: Only forks the domain.
>> `xl fork-launch-dm`: (or attach-dm?): Start up and attach the
>> devicemodel to the domain
>>
>> Then `xl fork` (or maybe `xl fork-vm`) would be something implemented in
>> the future that would fork the entire domain.
> 
> I really don't have a strong opinion about this either way. I can see
> it working either way. Having them all bundled under a single
> top-level comment doesn't pollute the help text when someone is just
> looking at what xl can do in general. Makes that command a lot more
> complex for sure but I don't think it's too bad.

One thing I don't like about having a single command is that since
you're not planning on implementing the end-to-end "vm fork" command,
then when running the base "fork-vm" command, you'll have to print an
error message that says "This command is not available in its
completeness; you'll have to implement your own via fork-vm --domain,
fork-vm --save-dm, and fork-vm --launch-dm."

Which we could do, but seem a bit strange. :-)

>> Then have `xl fork-launch-dm` either take a filename (saved from the
>> previous step) or a parent domain id (in which case it would arrange to
>> save the file itself).
>>
>> Although in fact, is there any reason we couldn't store the parent
>> domain ID in xenstore, so that `xl fork-launch-dm` could find the parent
>> by itself?  (Although that, of course, is something that could be added
>> later if it's not something Tamas needs.)
> 
> Could be done. But I store ID internally in my tools anyway since I
> need it to initialize VMI. So having it in Xenstore is not required
> for me. In fact I would prefer to leave Xenstore out of these
> operations as much as possible cause it would slow things down. In my
> latest tests forking is down to 0.0007s, having to touch Xenstore for
> each would slow things down considerably.

Right, that makes sense.

 -George
Tamas K Lengyel Jan. 8, 2020, 5:25 p.m. UTC | #20
> >> I actually disagree that we want a single command to do all of these.
> >> If we did want `exec xl` to be one of the supported interfaces, I think
> >> it would break down something like this:
> >>
> >> `xl fork-domain`: Only forks the domain.
> >> `xl fork-launch-dm`: (or attach-dm?): Start up and attach the
> >> devicemodel to the domain
> >>
> >> Then `xl fork` (or maybe `xl fork-vm`) would be something implemented in
> >> the future that would fork the entire domain.
> >
> > I really don't have a strong opinion about this either way. I can see
> > it working either way. Having them all bundled under a single
> > top-level comment doesn't pollute the help text when someone is just
> > looking at what xl can do in general. Makes that command a lot more
> > complex for sure but I don't think it's too bad.
>
> One thing I don't like about having a single command is that since
> you're not planning on implementing the end-to-end "vm fork" command,
> then when running the base "fork-vm" command, you'll have to print an
> error message that says "This command is not available in its
> completeness; you'll have to implement your own via fork-vm --domain,
> fork-vm --save-dm, and fork-vm --launch-dm."
>
> Which we could do, but seem a bit strange. :-)

Yea, it's not a single step to get to a fully functional fork but its close:
1. pause parent vm
2. generate qemu_save_file
3. xl fork-vm -C config -Q qemu_save_file <parent_domid>

For the second fork - provided it has its own config file ready to go
- it is enough to just run the 3. step. Technically we could integrate
all these three steps into one and then the user would only have to
generate the new config file. But I found this setup to be "good
enough" already.

Tamas
Roger Pau Monné Jan. 8, 2020, 6 p.m. UTC | #21
On Wed, Jan 08, 2020 at 08:32:22AM -0700, Tamas K Lengyel wrote:
> On Wed, Jan 8, 2020 at 8:08 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Tue, Dec 31, 2019 at 09:36:01AM -0700, Tamas K Lengyel wrote:
> > > On Tue, Dec 31, 2019 at 9:08 AM Tamas K Lengyel <tamas@tklengyel.com> wrote:
> > > >
> > > > On Tue, Dec 31, 2019 at 8:11 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > >
> > > > > On Tue, Dec 31, 2019 at 08:00:17AM -0700, Tamas K Lengyel wrote:
> > > > > > On Tue, Dec 31, 2019 at 3:40 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > > >
> > > > > > > On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
> > > > > > > > On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > Hi,
> > > > > > > > >
> > > > > > > > > On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
> > > > > > > > >>
> > > > > > > > >> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
> > > > > > > > >> But keep in mind that the "fork-vm" command even with this update
> > > > > > > > >> would still not produce for you a "fully functional" VM on its own.
> > > > > > > > >> The user still has to produce a new VM config file, create the new
> > > > > > > > >> disk, save the QEMU state, etc.
> > > > > > >
> > > > > > > IMO the default behavior of the fork command should be to leave the
> > > > > > > original VM paused, so that you can continue using the same disk and
> > > > > > > network config in the fork and you won't need to pass a new config
> > > > > > > file.
> > > > > > >
> > > > > > > As Julien already said, maybe I wasn't clear in my previous replies:
> > > > > > > I'm not asking you to implement all this, it's fine if the
> > > > > > > implementation of the fork-vm xl command requires you to pass certain
> > > > > > > options, and that the default behavior is not implemented.
> > > > > > >
> > > > > > > We need an interface that's sane, and that's designed to be easy and
> > > > > > > comprehensive to use, not an interface built around what's currently
> > > > > > > implemented.
> > > > > >
> > > > > > OK, so I think that would look like "xl fork-vm <parent_domid>" with
> > > > > > additional options for things like name, disk, vlan, or a completely
> > > > > > new config, all of which are currently not implemented, + an
> > > > > > additional option to not launch QEMU at all, which would be the only
> > > > > > one currently working. Also keeping the separate "xl fork-launch-dm"
> > > > > > as is. Is that what we are talking about?
> > > > >
> > > > > I think fork-launch-vm should just be an option of fork-vm (ie:
> > > > > --launch-dm-only or some such). I don't think there's a reason to have
> > > > > a separate top-level command to just launch the device model.
> > > >
> > > > It's just that the fork-launch-dm needs the domid of the fork, while
> > > > the fork-vm needs the parent's domid. But I guess we can interpret the
> > > > "domid" required input differently depending on which sub-option is
> > > > specified for the command. Let's see how it pans out.
> > >
> > > How does the following look for the interface?
> > >
> > >     { "fork-vm",
> > >       &main_fork_vm, 0, 1,
> > >       "Fork a domain from the running parent domid",
> > >       "[options] <Domid>",
> > >       "-h                           Print this help.\n"
> > >       "-N <name>                    Assign name to VM fork.\n"
> > >       "-D <disk>                    Assign disk to VM fork.\n"
> > >       "-B <bridge                   Assign bridge to VM fork.\n"
> > >       "-V <vlan>                    Assign vlan to VM fork.\n"
> >
> > IMO I think the name of fork is the only useful option. Being able to
> > assign disks or bridges from the command line seems quite complicated.
> > What about VMs with multiple disks? Or VMs with multiple nics on
> > different bridges?
> >
> > I think it's easier for both the implementation and the user to just
> > use a config file in that case.
> 
> I agree but it sounded to me you guys wanted to have a "complete"
> interface even if it's unimplemented. This is what a complete
> interface would look to me.

I would add those options afterwards if there's a need for them. I was
mainly concerned about introducing a top level command (ie: fork-vm)
that would require calling other commands in order to get a functional
fork. I'm not so concerned about having all the possible options
listed now, as long as the default behavior of fork-vm is something
sane that produces a working fork, even if not fully implemented at
this stage.

> >
> > >       "-C <config>                  Use config file for VM fork.\n"
> > >       "-Q <qemu-save-file>          Use qemu save file for VM fork.\n"
> > >       "--launch-dm  <yes|no|late>   Launch device model (QEMU) for VM fork.\n"
> > >       "--fork-reset                 Reset VM fork.\n"
> > >       "-p                           Do not unpause VMs after fork."
> >
> > I think the default behaviour should be to leave the original VM
> > paused and the forked one running, and hence this should be:
> 
> That is the default. I guess the text saying VMs was not correctly
> worded, it just means don't unpause fork after it's created. The
> parent remains always paused.

Ack.

> >
> >         "-p                           Leave forked VM paused."
> >         "-u                           Leave parent VM unpaused."
> 
> But you shouldn't unpause the parent VM at all. It should remain
> paused as long as there are forks running that were split from it.
> Unpausing it will lead to subtle and unexplainable crashes in the fork
> since the fork now will use pages that are from a different execution
> path. Technically in the future it would be possible to unpause the VM
> but it requires to fully populate the pagetables in all forks made
> from it with mem_shared entries and deduplicate to the forks all the
> pages that's can't be be shared.

Oh, OK, since I didn't look at the implementation yet I assumed that
the parent would also be switched to trap on memory writes, so that
you could duplicate the pages before the parent writes to them, and
hence the parent could be left running.

Anyway, let's forget about the "leave parent unpaused" option then.

> That was what I originally tried to
> do but it was extremely slow, hence the lazy-population of the
> pagetable in the forks.
> 
> >
> > >       "-h                           Print this help.\n"
> > >       "-d                           Enable debug messages.\n"
> > >     },
> > >
> > > Currently the parts that are implemented would look like:
> > > xl fork-vm -p --launch-dm no <parent_domid>
> > > xl fork-vm -p --launch-dm late -C <config> -Q <qemu-save-file> <fork_domid>
> >
> > Why do you need a config file for launching the Qemu device model?
> > Doesn't the save-file contain all the information?
> 
> The config is used to populate xenstore, not just for QEMU. The QEMU
> save file doesn't contain the xl config. This is not a full VM save
> file, it is only the QEMU state that gets dumped with
> xen-save-devices-state.

TBH I think it would be easier to have something like my proposal
below, where you tell xl the parent and the forked VM names and xl
does the rest. Even better would be to not have to tell xl the parent
VM name (since I guess this is already tracked internally somewhere?).

Anyway, I'm not going to insist on this but the workflow of the Qemu
forking seems to not be very user friendly unless you know exactly how
to use it.

> 
> >
> > I think you also need something like:
> >
> > # xl fork-vm --launch-dm late <parent_domid> <fork_domid>
> >
> > So that a user doesn't need to pass a qemu-save-file?
> 
> This doesn't make much sense to me. To launch QEMU you need the config
> file to wire things up correctly. Like in order to launch QEMU you
> need to tell it the name of the VM, disk path, etc. that are all
> contained in the config.

You could get all this information from the parent VM, IIRC libxl has
a json version of the config. For example for migration there's no
need to pass any config file, since the incoming VM can be recreated
from the data in the source VM.

Thanks, Roger.
Roger Pau Monné Jan. 8, 2020, 6:07 p.m. UTC | #22
On Wed, Jan 08, 2020 at 04:34:49PM +0000, George Dunlap wrote:
> On 12/31/19 3:11 PM, Roger Pau Monné wrote:
> > On Tue, Dec 31, 2019 at 08:00:17AM -0700, Tamas K Lengyel wrote:
> >> On Tue, Dec 31, 2019 at 3:40 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >>>
> >>> On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
> >>>> On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
> >>>>>
> >>>>> Hi,
> >>>>>
> >>>>> On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
> >>>>>>
> >>>>>> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
> >>>>>> But keep in mind that the "fork-vm" command even with this update
> >>>>>> would still not produce for you a "fully functional" VM on its own.
> >>>>>> The user still has to produce a new VM config file, create the new
> >>>>>> disk, save the QEMU state, etc.
> >>>
> >>> IMO the default behavior of the fork command should be to leave the
> >>> original VM paused, so that you can continue using the same disk and
> >>> network config in the fork and you won't need to pass a new config
> >>> file.
> >>>
> >>> As Julien already said, maybe I wasn't clear in my previous replies:
> >>> I'm not asking you to implement all this, it's fine if the
> >>> implementation of the fork-vm xl command requires you to pass certain
> >>> options, and that the default behavior is not implemented.
> >>>
> >>> We need an interface that's sane, and that's designed to be easy and
> >>> comprehensive to use, not an interface built around what's currently
> >>> implemented.
> >>
> >> OK, so I think that would look like "xl fork-vm <parent_domid>" with
> >> additional options for things like name, disk, vlan, or a completely
> >> new config, all of which are currently not implemented, + an
> >> additional option to not launch QEMU at all, which would be the only
> >> one currently working. Also keeping the separate "xl fork-launch-dm"
> >> as is. Is that what we are talking about?
> > 
> > I think fork-launch-vm should just be an option of fork-vm (ie:
> > --launch-dm-only or some such). I don't think there's a reason to have
> > a separate top-level command to just launch the device model.
> 
> So first of all, Tamas -- do you actually need to exec xl here?  Would
> it make sense for these to start out simply as libxl functions that are
> called by your system?
> 
> I actually disagree that we want a single command to do all of these.
> If we did want `exec xl` to be one of the supported interfaces, I think
> it would break down something like this:
> 
> `xl fork-domain`: Only forks the domain.
> `xl fork-launch-dm`: (or attach-dm?): Start up and attach the
> devicemodel to the domain
> 
> Then `xl fork` (or maybe `xl fork-vm`) would be something implemented in
> the future that would fork the entire domain.

I don't have a strong opinion on whether we should have a bunch of
fork-* commands or a single one. My preference would be for a single
one because I think other commands can be implemented as options.

What I would like to prevent is ending up with something like
fork-domain and fork-vm commands, which look like aliases, and can
lead to confusion.

Roger.
Tamas K Lengyel Jan. 8, 2020, 6:14 p.m. UTC | #23
On Wed, Jan 8, 2020 at 11:01 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Wed, Jan 08, 2020 at 08:32:22AM -0700, Tamas K Lengyel wrote:
> > On Wed, Jan 8, 2020 at 8:08 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > On Tue, Dec 31, 2019 at 09:36:01AM -0700, Tamas K Lengyel wrote:
> > > > On Tue, Dec 31, 2019 at 9:08 AM Tamas K Lengyel <tamas@tklengyel.com> wrote:
> > > > >
> > > > > On Tue, Dec 31, 2019 at 8:11 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > >
> > > > > > On Tue, Dec 31, 2019 at 08:00:17AM -0700, Tamas K Lengyel wrote:
> > > > > > > On Tue, Dec 31, 2019 at 3:40 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > > > >
> > > > > > > > On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
> > > > > > > > > On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > Hi,
> > > > > > > > > >
> > > > > > > > > > On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
> > > > > > > > > >>
> > > > > > > > > >> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
> > > > > > > > > >> But keep in mind that the "fork-vm" command even with this update
> > > > > > > > > >> would still not produce for you a "fully functional" VM on its own.
> > > > > > > > > >> The user still has to produce a new VM config file, create the new
> > > > > > > > > >> disk, save the QEMU state, etc.
> > > > > > > >
> > > > > > > > IMO the default behavior of the fork command should be to leave the
> > > > > > > > original VM paused, so that you can continue using the same disk and
> > > > > > > > network config in the fork and you won't need to pass a new config
> > > > > > > > file.
> > > > > > > >
> > > > > > > > As Julien already said, maybe I wasn't clear in my previous replies:
> > > > > > > > I'm not asking you to implement all this, it's fine if the
> > > > > > > > implementation of the fork-vm xl command requires you to pass certain
> > > > > > > > options, and that the default behavior is not implemented.
> > > > > > > >
> > > > > > > > We need an interface that's sane, and that's designed to be easy and
> > > > > > > > comprehensive to use, not an interface built around what's currently
> > > > > > > > implemented.
> > > > > > >
> > > > > > > OK, so I think that would look like "xl fork-vm <parent_domid>" with
> > > > > > > additional options for things like name, disk, vlan, or a completely
> > > > > > > new config, all of which are currently not implemented, + an
> > > > > > > additional option to not launch QEMU at all, which would be the only
> > > > > > > one currently working. Also keeping the separate "xl fork-launch-dm"
> > > > > > > as is. Is that what we are talking about?
> > > > > >
> > > > > > I think fork-launch-vm should just be an option of fork-vm (ie:
> > > > > > --launch-dm-only or some such). I don't think there's a reason to have
> > > > > > a separate top-level command to just launch the device model.
> > > > >
> > > > > It's just that the fork-launch-dm needs the domid of the fork, while
> > > > > the fork-vm needs the parent's domid. But I guess we can interpret the
> > > > > "domid" required input differently depending on which sub-option is
> > > > > specified for the command. Let's see how it pans out.
> > > >
> > > > How does the following look for the interface?
> > > >
> > > >     { "fork-vm",
> > > >       &main_fork_vm, 0, 1,
> > > >       "Fork a domain from the running parent domid",
> > > >       "[options] <Domid>",
> > > >       "-h                           Print this help.\n"
> > > >       "-N <name>                    Assign name to VM fork.\n"
> > > >       "-D <disk>                    Assign disk to VM fork.\n"
> > > >       "-B <bridge                   Assign bridge to VM fork.\n"
> > > >       "-V <vlan>                    Assign vlan to VM fork.\n"
> > >
> > > IMO I think the name of fork is the only useful option. Being able to
> > > assign disks or bridges from the command line seems quite complicated.
> > > What about VMs with multiple disks? Or VMs with multiple nics on
> > > different bridges?
> > >
> > > I think it's easier for both the implementation and the user to just
> > > use a config file in that case.
> >
> > I agree but it sounded to me you guys wanted to have a "complete"
> > interface even if it's unimplemented. This is what a complete
> > interface would look to me.
>
> I would add those options afterwards if there's a need for them. I was
> mainly concerned about introducing a top level command (ie: fork-vm)
> that would require calling other commands in order to get a functional
> fork. I'm not so concerned about having all the possible options
> listed now, as long as the default behavior of fork-vm is something
> sane that produces a working fork, even if not fully implemented at
> this stage.

OK

> > > Why do you need a config file for launching the Qemu device model?
> > > Doesn't the save-file contain all the information?
> >
> > The config is used to populate xenstore, not just for QEMU. The QEMU
> > save file doesn't contain the xl config. This is not a full VM save
> > file, it is only the QEMU state that gets dumped with
> > xen-save-devices-state.
>
> TBH I think it would be easier to have something like my proposal
> below, where you tell xl the parent and the forked VM names and xl
> does the rest. Even better would be to not have to tell xl the parent
> VM name (since I guess this is already tracked internally somewhere?).

The forked VM has no "name" when it's created. For performance reasons
when the VM fork is created with "--launch-dm no" we explicitly want
to avoid touching Xenstore. Even parsing the config file would be
unneeded overhead at that stage.

>
> Anyway, I'm not going to insist on this but the workflow of the Qemu
> forking seems to not be very user friendly unless you know exactly how
> to use it.
>
> >
> > >
> > > I think you also need something like:
> > >
> > > # xl fork-vm --launch-dm late <parent_domid> <fork_domid>
> > >
> > > So that a user doesn't need to pass a qemu-save-file?
> >
> > This doesn't make much sense to me. To launch QEMU you need the config
> > file to wire things up correctly. Like in order to launch QEMU you
> > need to tell it the name of the VM, disk path, etc. that are all
> > contained in the config.
>
> You could get all this information from the parent VM, IIRC libxl has
> a json version of the config. For example for migration there's no
> need to pass any config file, since the incoming VM can be recreated
> from the data in the source VM.
>

But again, creating a fork with the exact config of the parent is not
possible. Even if the tool would rename the fork on-the-fly as it does
during the migration, the fork would end up thrashing the parent VM's
disk and making it impossible to create any additional forks. It would
also mean that at no point can the original VM be unpaused after the
forks are gone. I don't see any usecase in which that would make any
sense at all.

Tamas
Tamas K Lengyel Jan. 8, 2020, 6:23 p.m. UTC | #24
> > > > Why do you need a config file for launching the Qemu device model?
> > > > Doesn't the save-file contain all the information?
> > >
> > > The config is used to populate xenstore, not just for QEMU. The QEMU
> > > save file doesn't contain the xl config. This is not a full VM save
> > > file, it is only the QEMU state that gets dumped with
> > > xen-save-devices-state.
> >
> > TBH I think it would be easier to have something like my proposal
> > below, where you tell xl the parent and the forked VM names and xl
> > does the rest. Even better would be to not have to tell xl the parent
> > VM name (since I guess this is already tracked internally somewhere?).
>
> The forked VM has no "name" when it's created. For performance reasons
> when the VM fork is created with "--launch-dm no" we explicitly want
> to avoid touching Xenstore. Even parsing the config file would be
> unneeded overhead at that stage.

And to answer your question, no, the parent VM's name is not recorded
anywhere for the fork. Technically not even the parent's domain id is
kept by Xen. The fork only keeps a pointer to the parent's "struct
domain". So right now there is no hypercall interface to retrieve a
fork's parent's ID - it is assumed the tools using the interface are
keeping track of that. Could this information be dumped into Xenstore
as well? Yes. But we specifically want be able to create the fork as
fast possible without any unnecessary overhead.

Tamas
Roger Pau Monné Jan. 8, 2020, 6:36 p.m. UTC | #25
On Wed, Jan 08, 2020 at 11:14:46AM -0700, Tamas K Lengyel wrote:
> On Wed, Jan 8, 2020 at 11:01 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Wed, Jan 08, 2020 at 08:32:22AM -0700, Tamas K Lengyel wrote:
> > > On Wed, Jan 8, 2020 at 8:08 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > >
> > > > On Tue, Dec 31, 2019 at 09:36:01AM -0700, Tamas K Lengyel wrote:
> > > > > On Tue, Dec 31, 2019 at 9:08 AM Tamas K Lengyel <tamas@tklengyel.com> wrote:
> > > > > >
> > > > > > On Tue, Dec 31, 2019 at 8:11 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > > >
> > > > > > > On Tue, Dec 31, 2019 at 08:00:17AM -0700, Tamas K Lengyel wrote:
> > > > > > > > On Tue, Dec 31, 2019 at 3:40 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > > > > >
> > > > > > > > > On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
> > > > > > > > > > On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > Hi,
> > > > > > > > > > >
> > > > > > > > > > > On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
> > > > > > > > > > >>
> > > > > > > > > > >> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
> > > > > > > > > > >> But keep in mind that the "fork-vm" command even with this update
> > > > > > > > > > >> would still not produce for you a "fully functional" VM on its own.
> > > > > > > > > > >> The user still has to produce a new VM config file, create the new
> > > > > > > > > > >> disk, save the QEMU state, etc.
> > > > > > > > >
> > > > > > > > > IMO the default behavior of the fork command should be to leave the
> > > > > > > > > original VM paused, so that you can continue using the same disk and
> > > > > > > > > network config in the fork and you won't need to pass a new config
> > > > > > > > > file.
> > > > > > > > >
> > > > > > > > > As Julien already said, maybe I wasn't clear in my previous replies:
> > > > > > > > > I'm not asking you to implement all this, it's fine if the
> > > > > > > > > implementation of the fork-vm xl command requires you to pass certain
> > > > > > > > > options, and that the default behavior is not implemented.
> > > > > > > > >
> > > > > > > > > We need an interface that's sane, and that's designed to be easy and
> > > > > > > > > comprehensive to use, not an interface built around what's currently
> > > > > > > > > implemented.
> > > > > > > >
> > > > > > > > OK, so I think that would look like "xl fork-vm <parent_domid>" with
> > > > > > > > additional options for things like name, disk, vlan, or a completely
> > > > > > > > new config, all of which are currently not implemented, + an
> > > > > > > > additional option to not launch QEMU at all, which would be the only
> > > > > > > > one currently working. Also keeping the separate "xl fork-launch-dm"
> > > > > > > > as is. Is that what we are talking about?
> > > > > > >
> > > > > > > I think fork-launch-vm should just be an option of fork-vm (ie:
> > > > > > > --launch-dm-only or some such). I don't think there's a reason to have
> > > > > > > a separate top-level command to just launch the device model.
> > > > > >
> > > > > > It's just that the fork-launch-dm needs the domid of the fork, while
> > > > > > the fork-vm needs the parent's domid. But I guess we can interpret the
> > > > > > "domid" required input differently depending on which sub-option is
> > > > > > specified for the command. Let's see how it pans out.
> > > > >
> > > > > How does the following look for the interface?
> > > > >
> > > > >     { "fork-vm",
> > > > >       &main_fork_vm, 0, 1,
> > > > >       "Fork a domain from the running parent domid",
> > > > >       "[options] <Domid>",
> > > > >       "-h                           Print this help.\n"
> > > > >       "-N <name>                    Assign name to VM fork.\n"
> > > > >       "-D <disk>                    Assign disk to VM fork.\n"
> > > > >       "-B <bridge                   Assign bridge to VM fork.\n"
> > > > >       "-V <vlan>                    Assign vlan to VM fork.\n"
> > > >
> > > > IMO I think the name of fork is the only useful option. Being able to
> > > > assign disks or bridges from the command line seems quite complicated.
> > > > What about VMs with multiple disks? Or VMs with multiple nics on
> > > > different bridges?
> > > >
> > > > I think it's easier for both the implementation and the user to just
> > > > use a config file in that case.
> > >
> > > I agree but it sounded to me you guys wanted to have a "complete"
> > > interface even if it's unimplemented. This is what a complete
> > > interface would look to me.
> >
> > I would add those options afterwards if there's a need for them. I was
> > mainly concerned about introducing a top level command (ie: fork-vm)
> > that would require calling other commands in order to get a functional
> > fork. I'm not so concerned about having all the possible options
> > listed now, as long as the default behavior of fork-vm is something
> > sane that produces a working fork, even if not fully implemented at
> > this stage.
> 
> OK
> 
> > > > Why do you need a config file for launching the Qemu device model?
> > > > Doesn't the save-file contain all the information?
> > >
> > > The config is used to populate xenstore, not just for QEMU. The QEMU
> > > save file doesn't contain the xl config. This is not a full VM save
> > > file, it is only the QEMU state that gets dumped with
> > > xen-save-devices-state.
> >
> > TBH I think it would be easier to have something like my proposal
> > below, where you tell xl the parent and the forked VM names and xl
> > does the rest. Even better would be to not have to tell xl the parent
> > VM name (since I guess this is already tracked internally somewhere?).
> 
> The forked VM has no "name" when it's created. For performance reasons
> when the VM fork is created with "--launch-dm no" we explicitly want
> to avoid touching Xenstore. Even parsing the config file would be
> unneeded overhead at that stage.

I think you need another option to tell xl to not name the forked VM,
abusing of "--launch-dm no" to not set a name is not expected I think.

> >
> > Anyway, I'm not going to insist on this but the workflow of the Qemu
> > forking seems to not be very user friendly unless you know exactly how
> > to use it.
> >
> > >
> > > >
> > > > I think you also need something like:
> > > >
> > > > # xl fork-vm --launch-dm late <parent_domid> <fork_domid>
> > > >
> > > > So that a user doesn't need to pass a qemu-save-file?
> > >
> > > This doesn't make much sense to me. To launch QEMU you need the config
> > > file to wire things up correctly. Like in order to launch QEMU you
> > > need to tell it the name of the VM, disk path, etc. that are all
> > > contained in the config.
> >
> > You could get all this information from the parent VM, IIRC libxl has
> > a json version of the config. For example for migration there's no
> > need to pass any config file, since the incoming VM can be recreated
> > from the data in the source VM.
> >
> 
> But again, creating a fork with the exact config of the parent is not
> possible. Even if the tool would rename the fork on-the-fly as it does
> during the migration, the fork would end up thrashing the parent VM's
> disk and making it impossible to create any additional forks. It would
> also mean that at no point can the original VM be unpaused after the
> forks are gone. I don't see any usecase in which that would make any
> sense at all.

You could have the disk(s) as read-only and the VM running completely
from RAM. Alpine-linux has (or had) a mode where it was completely
stateless and running from RAM. I think it's fine to require passing a
config file for the time being, we can look at other options
afterwards.

Thanks, Roger.
Roger Pau Monné Jan. 8, 2020, 6:44 p.m. UTC | #26
On Wed, Jan 08, 2020 at 11:23:29AM -0700, Tamas K Lengyel wrote:
> > > > > Why do you need a config file for launching the Qemu device model?
> > > > > Doesn't the save-file contain all the information?
> > > >
> > > > The config is used to populate xenstore, not just for QEMU. The QEMU
> > > > save file doesn't contain the xl config. This is not a full VM save
> > > > file, it is only the QEMU state that gets dumped with
> > > > xen-save-devices-state.
> > >
> > > TBH I think it would be easier to have something like my proposal
> > > below, where you tell xl the parent and the forked VM names and xl
> > > does the rest. Even better would be to not have to tell xl the parent
> > > VM name (since I guess this is already tracked internally somewhere?).
> >
> > The forked VM has no "name" when it's created. For performance reasons
> > when the VM fork is created with "--launch-dm no" we explicitly want
> > to avoid touching Xenstore. Even parsing the config file would be
> > unneeded overhead at that stage.
> 
> And to answer your question, no, the parent VM's name is not recorded
> anywhere for the fork. Technically not even the parent's domain id is
> kept by Xen. The fork only keeps a pointer to the parent's "struct
> domain"

There's the domain_id field inside of struct domain, so it seems quite
easy to get the parent domid from the fork if there's a pointer to the
parent's struct domain.

> So right now there is no hypercall interface to retrieve a
> fork's parent's ID - it is assumed the tools using the interface are
> keeping track of that. Could this information be dumped into Xenstore
> as well? Yes. But we specifically want be able to create the fork as
> fast possible without any unnecessary overhead.

I think it would be nice to identify forked domains using
XEN_DOMCTL_getdomaininfo: you could add a parent_domid field to
xen_domctl_getdomaininfo and if it's set to something different than
DOMID_INVALID then the domain is a fork of the given domid.

Not saying it should be done now, but AFAICT getting the parent's
domid is feasible and doesn't require xenstore.

Roger.
Tamas K Lengyel Jan. 8, 2020, 7:47 p.m. UTC | #27
On Wed, Jan 8, 2020 at 11:44 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Wed, Jan 08, 2020 at 11:23:29AM -0700, Tamas K Lengyel wrote:
> > > > > > Why do you need a config file for launching the Qemu device model?
> > > > > > Doesn't the save-file contain all the information?
> > > > >
> > > > > The config is used to populate xenstore, not just for QEMU. The QEMU
> > > > > save file doesn't contain the xl config. This is not a full VM save
> > > > > file, it is only the QEMU state that gets dumped with
> > > > > xen-save-devices-state.
> > > >
> > > > TBH I think it would be easier to have something like my proposal
> > > > below, where you tell xl the parent and the forked VM names and xl
> > > > does the rest. Even better would be to not have to tell xl the parent
> > > > VM name (since I guess this is already tracked internally somewhere?).
> > >
> > > The forked VM has no "name" when it's created. For performance reasons
> > > when the VM fork is created with "--launch-dm no" we explicitly want
> > > to avoid touching Xenstore. Even parsing the config file would be
> > > unneeded overhead at that stage.
> >
> > And to answer your question, no, the parent VM's name is not recorded
> > anywhere for the fork. Technically not even the parent's domain id is
> > kept by Xen. The fork only keeps a pointer to the parent's "struct
> > domain"
>
> There's the domain_id field inside of struct domain, so it seems quite
> easy to get the parent domid from the fork if there's a pointer to the
> parent's struct domain.
>
> > So right now there is no hypercall interface to retrieve a
> > fork's parent's ID - it is assumed the tools using the interface are
> > keeping track of that. Could this information be dumped into Xenstore
> > as well? Yes. But we specifically want be able to create the fork as
> > fast possible without any unnecessary overhead.
>
> I think it would be nice to identify forked domains using
> XEN_DOMCTL_getdomaininfo: you could add a parent_domid field to
> xen_domctl_getdomaininfo and if it's set to something different than
> DOMID_INVALID then the domain is a fork of the given domid.
>
> Not saying it should be done now, but AFAICT getting the parent's
> domid is feasible and doesn't require xenstore.
>

Of course it could be done. I was just pointing out that it's not
currently kept separately and there is no interface to retrieve it.
But TBH I have lost the train the though why we would need that in the
first place? When QEMU is being launched the fork is already created
and QEMU doesn't need to know anything about the parent.

Tamas
Tamas K Lengyel Jan. 8, 2020, 7:51 p.m. UTC | #28
On Wed, Jan 8, 2020 at 11:37 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Wed, Jan 08, 2020 at 11:14:46AM -0700, Tamas K Lengyel wrote:
> > On Wed, Jan 8, 2020 at 11:01 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > On Wed, Jan 08, 2020 at 08:32:22AM -0700, Tamas K Lengyel wrote:
> > > > On Wed, Jan 8, 2020 at 8:08 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > >
> > > > > On Tue, Dec 31, 2019 at 09:36:01AM -0700, Tamas K Lengyel wrote:
> > > > > > On Tue, Dec 31, 2019 at 9:08 AM Tamas K Lengyel <tamas@tklengyel.com> wrote:
> > > > > > >
> > > > > > > On Tue, Dec 31, 2019 at 8:11 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > > > >
> > > > > > > > On Tue, Dec 31, 2019 at 08:00:17AM -0700, Tamas K Lengyel wrote:
> > > > > > > > > On Tue, Dec 31, 2019 at 3:40 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > > > > > >
> > > > > > > > > > On Mon, Dec 30, 2019 at 05:37:38PM -0700, Tamas K Lengyel wrote:
> > > > > > > > > > > On Mon, Dec 30, 2019 at 5:20 PM Julien Grall <julien.grall@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi,
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, 30 Dec 2019, 20:49 Tamas K Lengyel, <tamas@tklengyel.com> wrote:
> > > > > > > > > > > >>
> > > > > > > > > > > >> On Mon, Dec 30, 2019 at 11:43 AM Julien Grall <julien@xen.org> wrote:
> > > > > > > > > > > >> But keep in mind that the "fork-vm" command even with this update
> > > > > > > > > > > >> would still not produce for you a "fully functional" VM on its own.
> > > > > > > > > > > >> The user still has to produce a new VM config file, create the new
> > > > > > > > > > > >> disk, save the QEMU state, etc.
> > > > > > > > > >
> > > > > > > > > > IMO the default behavior of the fork command should be to leave the
> > > > > > > > > > original VM paused, so that you can continue using the same disk and
> > > > > > > > > > network config in the fork and you won't need to pass a new config
> > > > > > > > > > file.
> > > > > > > > > >
> > > > > > > > > > As Julien already said, maybe I wasn't clear in my previous replies:
> > > > > > > > > > I'm not asking you to implement all this, it's fine if the
> > > > > > > > > > implementation of the fork-vm xl command requires you to pass certain
> > > > > > > > > > options, and that the default behavior is not implemented.
> > > > > > > > > >
> > > > > > > > > > We need an interface that's sane, and that's designed to be easy and
> > > > > > > > > > comprehensive to use, not an interface built around what's currently
> > > > > > > > > > implemented.
> > > > > > > > >
> > > > > > > > > OK, so I think that would look like "xl fork-vm <parent_domid>" with
> > > > > > > > > additional options for things like name, disk, vlan, or a completely
> > > > > > > > > new config, all of which are currently not implemented, + an
> > > > > > > > > additional option to not launch QEMU at all, which would be the only
> > > > > > > > > one currently working. Also keeping the separate "xl fork-launch-dm"
> > > > > > > > > as is. Is that what we are talking about?
> > > > > > > >
> > > > > > > > I think fork-launch-vm should just be an option of fork-vm (ie:
> > > > > > > > --launch-dm-only or some such). I don't think there's a reason to have
> > > > > > > > a separate top-level command to just launch the device model.
> > > > > > >
> > > > > > > It's just that the fork-launch-dm needs the domid of the fork, while
> > > > > > > the fork-vm needs the parent's domid. But I guess we can interpret the
> > > > > > > "domid" required input differently depending on which sub-option is
> > > > > > > specified for the command. Let's see how it pans out.
> > > > > >
> > > > > > How does the following look for the interface?
> > > > > >
> > > > > >     { "fork-vm",
> > > > > >       &main_fork_vm, 0, 1,
> > > > > >       "Fork a domain from the running parent domid",
> > > > > >       "[options] <Domid>",
> > > > > >       "-h                           Print this help.\n"
> > > > > >       "-N <name>                    Assign name to VM fork.\n"
> > > > > >       "-D <disk>                    Assign disk to VM fork.\n"
> > > > > >       "-B <bridge                   Assign bridge to VM fork.\n"
> > > > > >       "-V <vlan>                    Assign vlan to VM fork.\n"
> > > > >
> > > > > IMO I think the name of fork is the only useful option. Being able to
> > > > > assign disks or bridges from the command line seems quite complicated.
> > > > > What about VMs with multiple disks? Or VMs with multiple nics on
> > > > > different bridges?
> > > > >
> > > > > I think it's easier for both the implementation and the user to just
> > > > > use a config file in that case.
> > > >
> > > > I agree but it sounded to me you guys wanted to have a "complete"
> > > > interface even if it's unimplemented. This is what a complete
> > > > interface would look to me.
> > >
> > > I would add those options afterwards if there's a need for them. I was
> > > mainly concerned about introducing a top level command (ie: fork-vm)
> > > that would require calling other commands in order to get a functional
> > > fork. I'm not so concerned about having all the possible options
> > > listed now, as long as the default behavior of fork-vm is something
> > > sane that produces a working fork, even if not fully implemented at
> > > this stage.
> >
> > OK
> >
> > > > > Why do you need a config file for launching the Qemu device model?
> > > > > Doesn't the save-file contain all the information?
> > > >
> > > > The config is used to populate xenstore, not just for QEMU. The QEMU
> > > > save file doesn't contain the xl config. This is not a full VM save
> > > > file, it is only the QEMU state that gets dumped with
> > > > xen-save-devices-state.
> > >
> > > TBH I think it would be easier to have something like my proposal
> > > below, where you tell xl the parent and the forked VM names and xl
> > > does the rest. Even better would be to not have to tell xl the parent
> > > VM name (since I guess this is already tracked internally somewhere?).
> >
> > The forked VM has no "name" when it's created. For performance reasons
> > when the VM fork is created with "--launch-dm no" we explicitly want
> > to avoid touching Xenstore. Even parsing the config file would be
> > unneeded overhead at that stage.
>
> I think you need another option to tell xl to not name the forked VM,
> abusing of "--launch-dm no" to not set a name is not expected I think.

See my reply below.

>
> > >
> > > Anyway, I'm not going to insist on this but the workflow of the Qemu
> > > forking seems to not be very user friendly unless you know exactly how
> > > to use it.
> > >
> > > >
> > > > >
> > > > > I think you also need something like:
> > > > >
> > > > > # xl fork-vm --launch-dm late <parent_domid> <fork_domid>
> > > > >
> > > > > So that a user doesn't need to pass a qemu-save-file?
> > > >
> > > > This doesn't make much sense to me. To launch QEMU you need the config
> > > > file to wire things up correctly. Like in order to launch QEMU you
> > > > need to tell it the name of the VM, disk path, etc. that are all
> > > > contained in the config.
> > >
> > > You could get all this information from the parent VM, IIRC libxl has
> > > a json version of the config. For example for migration there's no
> > > need to pass any config file, since the incoming VM can be recreated
> > > from the data in the source VM.
> > >
> >
> > But again, creating a fork with the exact config of the parent is not
> > possible. Even if the tool would rename the fork on-the-fly as it does
> > during the migration, the fork would end up thrashing the parent VM's
> > disk and making it impossible to create any additional forks. It would
> > also mean that at no point can the original VM be unpaused after the
> > forks are gone. I don't see any usecase in which that would make any
> > sense at all.
>
> You could have the disk(s) as read-only and the VM running completely
> from RAM. Alpine-linux has (or had) a mode where it was completely
> stateless and running from RAM. I think it's fine to require passing a
> config file for the time being, we can look at other options
> afterwards.
>

OK, there is that. But I would say that's a fairly niche use-case. You
wouldn't have any network access in that fork, no disk, no way to get
information in or out beside the serial console. So I wouldn't want
that setup to be considered the default. If someone wants to that I
would rather have an option that tells xl to automatically name the
fork for you instead of the other way around.

Tamas
Roger Pau Monné Jan. 9, 2020, 9:47 a.m. UTC | #29
On Wed, Jan 08, 2020 at 12:51:35PM -0700, Tamas K Lengyel wrote:
> On Wed, Jan 8, 2020 at 11:37 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> >
> > On Wed, Jan 08, 2020 at 11:14:46AM -0700, Tamas K Lengyel wrote:
> > > On Wed, Jan 8, 2020 at 11:01 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > >
> > > > On Wed, Jan 08, 2020 at 08:32:22AM -0700, Tamas K Lengyel wrote:
> > > > > On Wed, Jan 8, 2020 at 8:08 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > > I think you also need something like:
> > > > > >
> > > > > > # xl fork-vm --launch-dm late <parent_domid> <fork_domid>
> > > > > >
> > > > > > So that a user doesn't need to pass a qemu-save-file?
> > > > >
> > > > > This doesn't make much sense to me. To launch QEMU you need the config
> > > > > file to wire things up correctly. Like in order to launch QEMU you
> > > > > need to tell it the name of the VM, disk path, etc. that are all
> > > > > contained in the config.
> > > >
> > > > You could get all this information from the parent VM, IIRC libxl has
> > > > a json version of the config. For example for migration there's no
> > > > need to pass any config file, since the incoming VM can be recreated
> > > > from the data in the source VM.
> > > >
> > >
> > > But again, creating a fork with the exact config of the parent is not
> > > possible. Even if the tool would rename the fork on-the-fly as it does
> > > during the migration, the fork would end up thrashing the parent VM's
> > > disk and making it impossible to create any additional forks. It would
> > > also mean that at no point can the original VM be unpaused after the
> > > forks are gone. I don't see any usecase in which that would make any
> > > sense at all.
> >
> > You could have the disk(s) as read-only and the VM running completely
> > from RAM. Alpine-linux has (or had) a mode where it was completely
> > stateless and running from RAM. I think it's fine to require passing a
> > config file for the time being, we can look at other options
> > afterwards.
> >
> 
> OK, there is that. But I would say that's a fairly niche use-case. You
> wouldn't have any network access in that fork, no disk, no way to get
> information in or out beside the serial console.

Why won't the fork have network access?

If the parent VM is left paused the fork should behave like a local
migration regarding network access, and thus be fully functional.

> So I wouldn't want
> that setup to be considered the default. If someone wants to that I
> would rather have an option that tells xl to automatically name the
> fork for you instead of the other way around.

Ack, I just want to make sure that whatever interface we end up using
is designed taking into account other use cases apart from the one at
hand.

On an unrelated note, does forking work when using PV interfaces?

Thanks, Roger.
Tamas K Lengyel Jan. 9, 2020, 1:31 p.m. UTC | #30
On Thu, Jan 9, 2020 at 2:48 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
>
> On Wed, Jan 08, 2020 at 12:51:35PM -0700, Tamas K Lengyel wrote:
> > On Wed, Jan 8, 2020 at 11:37 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > >
> > > On Wed, Jan 08, 2020 at 11:14:46AM -0700, Tamas K Lengyel wrote:
> > > > On Wed, Jan 8, 2020 at 11:01 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > >
> > > > > On Wed, Jan 08, 2020 at 08:32:22AM -0700, Tamas K Lengyel wrote:
> > > > > > On Wed, Jan 8, 2020 at 8:08 AM Roger Pau Monné <roger.pau@citrix.com> wrote:
> > > > > > > I think you also need something like:
> > > > > > >
> > > > > > > # xl fork-vm --launch-dm late <parent_domid> <fork_domid>
> > > > > > >
> > > > > > > So that a user doesn't need to pass a qemu-save-file?
> > > > > >
> > > > > > This doesn't make much sense to me. To launch QEMU you need the config
> > > > > > file to wire things up correctly. Like in order to launch QEMU you
> > > > > > need to tell it the name of the VM, disk path, etc. that are all
> > > > > > contained in the config.
> > > > >
> > > > > You could get all this information from the parent VM, IIRC libxl has
> > > > > a json version of the config. For example for migration there's no
> > > > > need to pass any config file, since the incoming VM can be recreated
> > > > > from the data in the source VM.
> > > > >
> > > >
> > > > But again, creating a fork with the exact config of the parent is not
> > > > possible. Even if the tool would rename the fork on-the-fly as it does
> > > > during the migration, the fork would end up thrashing the parent VM's
> > > > disk and making it impossible to create any additional forks. It would
> > > > also mean that at no point can the original VM be unpaused after the
> > > > forks are gone. I don't see any usecase in which that would make any
> > > > sense at all.
> > >
> > > You could have the disk(s) as read-only and the VM running completely
> > > from RAM. Alpine-linux has (or had) a mode where it was completely
> > > stateless and running from RAM. I think it's fine to require passing a
> > > config file for the time being, we can look at other options
> > > afterwards.
> > >
> >
> > OK, there is that. But I would say that's a fairly niche use-case. You
> > wouldn't have any network access in that fork, no disk, no way to get
> > information in or out beside the serial console.
>
> Why won't the fork have network access?

If you have multiple forks you end up having MAC-address collision. I
don't see what would be the point of creating a single fork when the
parent remains paused - you could just keep running the parent since
you aren't gaining anything by creating the fork. The main reason to
create a fork would be to create multiples of them.

>
> If the parent VM is left paused the fork should behave like a local
> migration regarding network access, and thus be fully functional.
>
> > So I wouldn't want
> > that setup to be considered the default. If someone wants to that I
> > would rather have an option that tells xl to automatically name the
> > fork for you instead of the other way around.
>
> Ack, I just want to make sure that whatever interface we end up using
> is designed taking into account other use cases apart from the one at
> hand.
>
> On an unrelated note, does forking work when using PV interfaces?

As I recall yes, but In my Linux tests these were the config options I
tested and work with the fork. Not sure if the vif device by default
is PV or emulated:

vnc=1
vnclisten="0.0.0.0:1"

usb=1
usbdevice=['tablet']

disk = ['phy:/dev/t0vg/debian-stretch,xvda,w']
vif = ['bridge=xenbr0,mac=00:07:5B:BB:00:01']

Tamas