new file mode 100644
@@ -0,0 +1,1088 @@
+..
+ Copyright (C) 2017 Red Hat Inc.
+
+ This work is licensed under the terms of the GNU GPL, version 2 or
+ later. See the COPYING file in the top-level directory.
+
+============================
+Live Block Device Operations
+============================
+
+The QEMU block layer currently (as of QEMU 2.9) supports four major
+kinds of live block device jobs -- stream, commit, mirror, and backup.
+These can be used to manipulate disk image chains to accomplish the
+following tasks: live-copy data from backing files into overlays;
+shorten long disk image chains by merging data from overlays into
+backing files; live-synchronize data from a disk image chain (including
+the current active disk) to another target image; and take
+point-in-time (and incremental) backups of a block device. The sections
+below describe these block (QMP) primitives and give a non-exhaustive
+set of examples to illustrate their use.
+
+.. note::
+ The file ``qapi/block-core.json`` in the QEMU source tree has the
+ canonical QEMU API (QAPI) schema documentation for the QMP
+ primitives discussed here.
+
+.. todo (kashyapc):: Remove the ".. contents::" directive when Sphinx is
+ integrated.
+
+.. contents::
+
+Disk image backing chain notation
+---------------------------------
+
+A simple disk image chain. (This can be created live using QMP
+``blockdev-snapshot-sync``, or offline via ``qemu-img``)::
+
+ (Live QEMU)
+ |
+ .
+ V
+
+ [A] <----- [B]
+
+ (backing file) (overlay)
+
+The arrow can be read as: image [A] is the backing file of disk image
+[B]. Live QEMU is currently writing to image [B]; consequently, [B] is
+also referred to as the "active layer".
+
+There are two kinds of terminology that are common when referring to
+files in a disk image backing chain:
+
+(1) Directional: 'base' and 'top'. Given the simple disk image chain
+ above, image [A] can be referred to as 'base', and image [B] as
+ 'top'. (This terminology is used in the QAPI schema file,
+ ``qapi/block-core.json``.)
+
+(2) Relational: 'backing file' and 'overlay'. Again, taking the same
+ simple disk image chain from the above, disk image [A] is referred
+ to as the backing file, and image [B] as overlay.
+
+ Throughout this document, we will use the relational terminology.
+
+.. important::
+ The overlay files can generally be any format that supports a
+ backing file, although QCOW2 is the preferred format and the one
+ used in this document.
+
+
+Brief overview of live block QMP primitives
+-------------------------------------------
+
+The following are the four different kinds of live block operations that
+QEMU block layer supports.
+
+(1) ``block-stream``: Live copy of data from backing files into overlay
+ files.
+
+ .. note:: Once the 'stream' operation has finished, three things to
+ note:
+
+ (a) QEMU rewrites the backing chain to remove
+ reference to the now-streamed and redundant backing
+ file;
+
+ (b) the streamed file *itself* won't be removed by QEMU,
+ and must be explicitly discarded by the user;
+
+ (c) the streamed file remains valid -- i.e. further
+ overlays can be created based on it. Refer to the
+ ``block-stream`` section further below for more
+ details.
+
+(2) ``block-commit``: Live merge of data from overlay files into backing
+ files (with the optional goal of removing the overlay file from the
+ chain). Since QEMU 2.0, this includes "active ``block-commit``"
+ (i.e. merge the current active layer into the base image).
+
+ .. note:: Once the 'commit' operation has finished, there are three
+ things to note here as well:
+
+ (a) QEMU rewrites the backing chain to remove reference
+ to now-redundant overlay images that have been
+ committed into a backing file;
+
+ (b) the committed file *itself* won't be removed by QEMU
+ -- it ought to be manually removed;
+
+ (c) however, unlike in the case of ``block-stream``, the
+ intermediate images will be rendered invalid -- i.e.
+ no further overlays can be created based on them.
+ Refer to the ``block-commit`` section further below
+ for more details.
+
+(3) ``drive-mirror`` (and ``blockdev-mirror``): Synchronize a running
+ disk to another image.
+
+(4) ``drive-backup`` (and ``blockdev-backup``): Point-in-time (live) copy
+ of a block device to a destination.
+
+
+.. _`Interacting with a QEMU instance`:
+
+Interacting with a QEMU instance
+--------------------------------
+
+The examples in this document use the following invocation of QEMU,
+with a QMP server running over a UNIX socket::
+
+ $ ./x86_64-softmmu/qemu-system-x86_64 -display none -nodefconfig \
+ -M q35 -nodefaults -m 512 \
+ -blockdev node-name=node-A,driver=qcow2,file.driver=file,file.node-name=file,file.filename=./a.qcow2 \
+ -device virtio-blk,drive=node-A,id=virtio0 \
+ -monitor stdio -qmp unix:/tmp/qmp-sock,server,nowait
+
+The ``-blockdev`` command-line option, used above, is available from
+QEMU 2.9 onwards. In the above invocation, notice the ``node-name``
+parameter that is used to refer to the disk image a.qcow2 ('node-A') --
+this is a cleaner way to refer to a disk image than spelling out file
+paths. We will therefore assign a ``node-name`` to each further disk
+image created (either via ``blockdev-snapshot-sync``, or
+``blockdev-add``) as part of the disk image chain, and refer to the
+disks by their ``node-name`` when performing various block operations.
+The exception is ``block-commit``, which, as of QEMU 2.9, does not yet
+accept a ``node-name`` parameter.
+
+To interact with the QEMU instance launched above, we will use the
+``qmp-shell`` utility (located at: ``qemu/scripts/qmp``, as part of the
+QEMU source directory), which takes key-value pairs for QMP commands.
+Invoke it as below (which will also print out the complete raw JSON
+syntax for reference -- examples in the following sections)::
+
+ $ ./qmp-shell -v -p /tmp/qmp-sock
+ (QEMU)
+
+.. note::
+ In the event we have to repeat a certain QMP command, we will: for
+ the first occurrence of it, show the ``qmp-shell`` invocation, *and*
+ the corresponding raw JSON QMP syntax; but for subsequent
+ invocations, present just the ``qmp-shell`` syntax, and omit the
+ equivalent JSON output.
+
+
+Example disk image chain
+------------------------
+
+We will use the disk image chain below (occasionally spelling it out
+where appropriate) when discussing the various primitives::
+
+ [A] <-- [B] <-- [C] <-- [D]
+
+Where [A] is the original base image; [B] and [C] are intermediate
+overlay images; image [D] is the active layer -- i.e. live QEMU is
+writing to it. (The rule of thumb is: live QEMU will always be pointing
+to the rightmost image in a disk image chain.)
+
+The above image chain can be created by invoking
+``blockdev-snapshot-sync`` commands as follows, using the ``qmp-shell``
+(our invocation also prints the raw JSON form of the command). This
+example shows the creation of overlay image [B]::
+
+ (QEMU) blockdev-snapshot-sync node-name=node-A snapshot-file=b.qcow2 snapshot-node-name=node-B format=qcow2
+ {
+ "execute": "blockdev-snapshot-sync",
+ "arguments": {
+ "node-name": "node-A",
+ "snapshot-file": "b.qcow2",
+ "format": "qcow2",
+ "snapshot-node-name": "node-B"
+ }
+ }
+
+Here, "node-A" is the name QEMU internally uses to refer to the base
+image [A] -- it is the backing file, based on which the overlay image,
+[B], is created.
+
+To create the rest of the overlay images, [C], and [D] (omitting the raw
+JSON output for brevity)::
+
+ (QEMU) blockdev-snapshot-sync node-name=node-B snapshot-file=c.qcow2 snapshot-node-name=node-C format=qcow2
+ (QEMU) blockdev-snapshot-sync node-name=node-C snapshot-file=d.qcow2 snapshot-node-name=node-D format=qcow2
+
+
+A note on points-in-time vs file names
+--------------------------------------
+
+In our disk image chain::
+
+ [A] <-- [B] <-- [C] <-- [D]
+
+We have *three* points in time and an active layer:
+
+- Point 1: Guest state when [B] was created is contained in file [A]
+- Point 2: Guest state when [C] was created is contained in [A] + [B]
+- Point 3: Guest state when [D] was created is contained in
+ [A] + [B] + [C]
+- Active layer: Current guest state is contained in [A] + [B] + [C] +
+ [D]
+
+Therefore, be careful with naming choices:
+
+- Naming a file after the time it is created is misleading -- the
+ guest data for that point in time is *not* contained in that file
+ (as explained earlier)
+- Rather, think of files as a *delta* from the backing file
+
+
+Live block streaming --- ``block-stream``
+-----------------------------------------
+
+The ``block-stream`` command lets you live-copy data from backing files
+into overlay images.
+
+Given our original example disk image chain from earlier::
+
+ [A] <-- [B] <-- [C] <-- [D]
+
+The disk image chain can be shortened in one of the following different
+ways (not an exhaustive list).
+
+.. _`Case-1`:
+
+(1) Merge everything into the active layer: I.e. copy all contents from
+ the base image, [A], and overlay images, [B] and [C], into [D],
+ *while* the guest is running. The resulting chain will be a
+ standalone image, [D] -- with contents from [A], [B] and [C] merged
+ into it (where live QEMU writes go to)::
+
+ [D]
+
+.. _`Case-2`:
+
+(2) Taking the same example disk image chain mentioned earlier, merge
+ only images [B] and [C] into [D], the active layer. As a result, the
+ contents of images [B] and [C] will be copied into [D], and the
+ backing file pointer of image [D] will be adjusted to point to image
+ [A]. The resulting chain will be::
+
+ [A] <-- [D]
+
+.. _`Case-3`:
+
+(3) Intermediate streaming (available since QEMU 2.8): Starting afresh
+ with the original example disk image chain, with a total of four
+ images, it is possible to copy contents from image [B] into image
+ [C]. Once the copy is finished, image [B] can now be (optionally)
+ discarded; and the backing file pointer of image [C] will be
+ adjusted to point to [A]. I.e. after performing "intermediate
+ streaming" of [B] into [C], the resulting image chain will be (where
+ live QEMU is writing to [D])::
+
+ [A] <-- [C] <-- [D]
+
+
+QMP invocation for ``block-stream``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For `Case-1`_, to merge contents of all the backing files into the
+active layer, where 'node-D' is the current active image (by default
+``block-stream`` will flatten the entire chain), the ``qmp-shell``
+invocation (and its corresponding JSON output) is::
+
+ (QEMU) block-stream device=node-D job-id=job0
+ {
+ "execute": "block-stream",
+ "arguments": {
+ "device": "node-D",
+ "job-id": "job0"
+ }
+ }
+
+For `Case-2`_, merge contents of the images [B] and [C] into [D], where
+image [D] ends up referring to image [A] as its backing file::
+
+ (QEMU) block-stream device=node-D base-node=node-A job-id=job0
+
+And for `Case-3`_, of "intermediate streaming", merge contents of
+image [B] into [C], where [C] ends up referring to [A] as its backing
+image::
+
+ (QEMU) block-stream device=node-C base-node=node-A job-id=job0
+
+Progress of a ``block-stream`` operation can be monitored via the QMP
+command::
+
+ (QEMU) query-block-jobs
+ {
+ "execute": "query-block-jobs",
+ "arguments": {}
+ }
+
+
+Once the ``block-stream`` operation has completed, QEMU will emit an
+event, ``BLOCK_JOB_COMPLETED``. The intermediate overlays remain valid,
+and can now be (optionally) discarded, or retained to create further
+overlays based on them. Finally, ``block-stream`` jobs can be
+restarted at any time.
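+
+A long-running 'stream' job competes with the guest for I/O bandwidth.
+As a sketch of how it could be throttled, the QMP command
+``block-job-set-speed`` takes the job ID in its ``device`` argument and
+a rate limit in bytes per second (the 10 MiB/s value below is only an
+illustrative assumption; a value of 0 removes the limit)::
+
+ (QEMU) block-job-set-speed device=job0 speed=10485760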
+
+
+Live block commit --- ``block-commit``
+--------------------------------------
+
+The ``block-commit`` command lets you merge live data from overlay
+images into backing file(s). Since QEMU 2.0, this includes "live active
+commit" (i.e. it is possible to merge the "active layer", the right-most
+image in a disk image chain where live QEMU will be writing to, into the
+base image). This is analogous to ``block-stream``, but in the opposite
+direction.
+
+Again, starting afresh with our example disk image chain, where live
+QEMU is writing to the right-most image in the chain, [D]::
+
+ [A] <-- [B] <-- [C] <-- [D]
+
+The disk image chain can be shortened in one of the following ways:
+
+.. _`block-commit_Case-1`:
+
+(1) Commit content from only image [B] into image [A]. The resulting
+ chain is the following, where image [C] is adjusted to point at [A]
+ as its new backing file::
+
+ [A] <-- [C] <-- [D]
+
+(2) Commit content from images [B] and [C] into image [A]. The
+ resulting chain, where image [D] is adjusted to point to image [A]
+ as its new backing file::
+
+ [A] <-- [D]
+
+.. _`block-commit_Case-3`:
+
+(3) Commit content from images [B], [C], and the active layer [D] into
+ image [A]. The resulting chain (in this case, a consolidated single
+ image)::
+
+ [A]
+
+(4) Commit content from only image [C] into image [B]. The
+ resulting chain::
+
+ [A] <-- [B] <-- [D]
+
+(5) Commit content from image [C] and the active layer [D] into image
+ [B]. The resulting chain::
+
+ [A] <-- [B]
+
+
+QMP invocation for ``block-commit``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+For :ref:`Case-1 <block-commit_Case-1>`, to merge contents only from
+image [B] into image [A], the invocation is as follows::
+
+ (QEMU) block-commit device=node-D base=a.qcow2 top=b.qcow2 job-id=job0
+ {
+ "execute": "block-commit",
+ "arguments": {
+ "device": "node-D",
+ "job-id": "job0",
+ "top": "b.qcow2",
+ "base": "a.qcow2"
+ }
+ }
+
+Once the above ``block-commit`` operation has completed, a
+``BLOCK_JOB_COMPLETED`` event will be issued, and no further action is
+required. As the end result, the backing file of image [C] is adjusted
+to point to image [A], and the original 4-image chain will end up being
+transformed to::
+
+ [A] <-- [C] <-- [D]
+
+.. note::
+ The intermediate image [B] is invalid (as in: no further overlays
+ based on it can be created).
+
+ Reasoning: An intermediate image after a 'stream' operation still
+ represents that old point-in-time, and may be valid in that context.
+ However, an intermediate image after a 'commit' operation no longer
+ represents any point-in-time, and is invalid in any context.
+
+
+However, :ref:`Case-3 <block-commit_Case-3>` (also called: "active
+``block-commit``") is a *two-phase* operation: In the first phase, the
+content from the active overlay, along with the intermediate overlays,
+is copied into the backing file (also called the base image). In the
+second phase, QEMU is pivoted to use the said backing file as the
+current active image; this is triggered by issuing the command
+``block-job-complete``. Alternatively, the ``block-commit`` operation
+can be cancelled by issuing the command ``block-job-cancel``, but be
+careful when doing this.
+
+Once the first phase of the ``block-commit`` operation has finished,
+the event ``BLOCK_JOB_READY`` will be emitted, signalling that the
+synchronization is complete. Now the job can be gracefully completed by
+issuing the command ``block-job-complete`` -- until such a command is
+issued, the 'commit' operation remains active.
+
+The following is the flow for :ref:`Case-3 <block-commit_Case-3>` to
+convert a disk image chain such as this::
+
+ [A] <-- [B] <-- [C] <-- [D]
+
+Into::
+
+ [A]
+
+Where content from all the subsequent overlays, [B] and [C], including
+the active layer, [D], is committed back to [A] -- which is where live
+QEMU will perform all its writes from then on.
+
+Start the "active ``block-commit``" operation::
+
+ (QEMU) block-commit device=node-D base=a.qcow2 top=d.qcow2 job-id=job0
+ {
+ "execute": "block-commit",
+ "arguments": {
+ "device": "node-D",
+ "job-id": "job0",
+ "top": "d.qcow2",
+ "base": "a.qcow2"
+ }
+ }
+
+
+Once the synchronization has completed, the event ``BLOCK_JOB_READY`` will
+be emitted.
+
+Then, optionally query for the status of the active block operations.
+We can see the 'commit' job is now ready to be completed, as indicated
+by the line *"ready": true*::
+
+ (QEMU) query-block-jobs
+ {
+ "execute": "query-block-jobs",
+ "arguments": {}
+ }
+ {
+ "return": [
+ {
+ "busy": false,
+ "type": "commit",
+ "len": 1376256,
+ "paused": false,
+ "ready": true,
+ "io-status": "ok",
+ "offset": 1376256,
+ "device": "job0",
+ "speed": 0
+ }
+ ]
+ }
+
+Gracefully complete the 'commit' block device job::
+
+ (QEMU) block-job-complete device=job0
+ {
+ "execute": "block-job-complete",
+ "arguments": {
+ "device": "job0"
+ }
+ }
+ {
+ "return": {}
+ }
+
+Finally, once the above job is completed, an event
+``BLOCK_JOB_COMPLETED`` will be emitted.
+
+.. note::
+ The invocation for rest of the cases (2, 4, and 5), discussed in the
+ previous section, is omitted for brevity.
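+
+Following the same pattern as the two invocations shown above, the
+remaining cases could be sketched as below. These are untested
+illustrations that assume the file names used throughout this document;
+because case 5 involves the active layer, it would also need a
+``block-job-complete`` once ``BLOCK_JOB_READY`` is emitted.
+
+Case 2, commit [B] and [C] into [A]::
+
+ (QEMU) block-commit device=node-D base=a.qcow2 top=c.qcow2 job-id=job0
+
+Case 4, commit only [C] into [B]::
+
+ (QEMU) block-commit device=node-D base=b.qcow2 top=c.qcow2 job-id=job0
+
+Case 5, commit [C] and the active layer [D] into [B]::
+
+ (QEMU) block-commit device=node-D base=b.qcow2 top=d.qcow2 job-id=job0
+ (QEMU) block-job-complete device=job0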
+
+
+Live disk synchronization --- ``drive-mirror`` and ``blockdev-mirror``
+----------------------------------------------------------------------
+
+Synchronize a running disk image chain (all or part of it) to a target
+image.
+
+Again, given our familiar disk image chain::
+
+ [A] <-- [B] <-- [C] <-- [D]
+
+The ``drive-mirror`` (and its newer equivalent ``blockdev-mirror``) allows
+you to copy data from the entire chain into a single target image (which
+can be located on a different host).
+
+Once a 'mirror' job has started and emitted the event
+``BLOCK_JOB_READY``, there are two possible actions:
+
+(1) Issuing the command ``block-job-cancel``: will (after completing
+ synchronization of the content from the disk image chain to the
+ target image, [E]) create a point-in-time (which is at the time of
+ *triggering* the cancel command) copy, contained in image [E], of
+ the entire disk image chain (or only the top-most image, depending
+ on the ``sync`` mode).
+
+(2) Issuing the command ``block-job-complete``: will, after completing
+ synchronization of the content, adjust the guest device (i.e. live
+ QEMU) to point to the target image, causing all new writes from
+ this point on to happen there. One use case for this is live
+ storage migration.
+
+In both cases, QEMU emits the event ``BLOCK_JOB_COMPLETED`` once the
+job has ended.
+
+About synchronization modes: The synchronization mode determines
+*which* part of the disk image chain will be copied to the target.
+Currently, there are four different kinds:
+
+(1) ``full`` -- Synchronize the content of entire disk image chain to
+ the target
+
+(2) ``top`` -- Synchronize only the contents of the top-most disk image
+ in the chain to the target
+
+(3) ``none`` -- Synchronize only the new writes from this point on.
+
+ .. note:: In the case of ``drive-backup`` (or ``blockdev-backup``),
+ the behavior of ``none`` synchronization mode is different.
+ Normally, a ``backup`` job consists of two parts: Anything
+ that is overwritten by the guest is first copied out to
+ the backup, and in the background the whole image is
+ copied from start to end. With ``sync=none``, it's only
+ the first part.
+
+(4) ``incremental`` -- Synchronize content that is described by the
+ dirty bitmap
+
+.. note::
+ Refer to the :doc:`bitmaps` document in the QEMU source
+ tree to learn about the detailed workings of the ``incremental``
+ synchronization mode.
+
+
+QMP invocation for ``drive-mirror``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+To copy the contents of the entire disk image chain, from [A] all the
+way to [D], to a new target (``drive-mirror`` will create the destination
+file, if it doesn't already exist), call it [E]::
+
+ (QEMU) drive-mirror device=node-D target=e.qcow2 sync=full job-id=job0
+ {
+ "execute": "drive-mirror",
+ "arguments": {
+ "device": "node-D",
+ "job-id": "job0",
+ "target": "e.qcow2",
+ "sync": "full"
+ }
+ }
+
+The ``"sync": "full"``, from the above, means: copy the *entire* chain
+to the destination.
+
+Following the above, querying for active block jobs will show that a
+'mirror' job is "ready" to be completed (and QEMU will also emit an
+event, ``BLOCK_JOB_READY``)::
+
+ (QEMU) query-block-jobs
+ {
+ "execute": "query-block-jobs",
+ "arguments": {}
+ }
+ {
+ "return": [
+ {
+ "busy": false,
+ "type": "mirror",
+ "len": 21757952,
+ "paused": false,
+ "ready": true,
+ "io-status": "ok",
+ "offset": 21757952,
+ "device": "job0",
+ "speed": 0
+ }
+ ]
+ }
+
+And, as noted in the previous section, there are two possible actions
+at this point:
+
+(a) Create a point-in-time snapshot by ending the synchronization. The
+ point-in-time is at the time of *ending* the sync. (The result of
+ the following being: the target image, [E], will be populated with
+ content from the entire chain, [A] to [D])::
+
+ (QEMU) block-job-cancel device=job0
+ {
+ "execute": "block-job-cancel",
+ "arguments": {
+ "device": "job0"
+ }
+ }
+
+(b) Or, complete the operation and pivot the live QEMU to the target
+ copy::
+
+ (QEMU) block-job-complete device=job0
+
+In either of the above cases, if you once again run the
+``query-block-jobs`` command, there should not be any active block
+operation.
+
+Comparing 'commit' and 'mirror': In both cases, the overlay images
+can be discarded. However, with 'commit', the *existing* base image
+will be modified (by updating it with contents from overlays); while in
+the case of 'mirror', a *new* target image is populated with the data
+from the disk image chain.
+
+
+QMP invocation for live storage migration with ``drive-mirror`` + NBD
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Live storage migration (without shared storage setup) is one of the most
+common use-cases that takes advantage of the ``drive-mirror`` primitive
+and QEMU's built-in Network Block Device (NBD) server. Here's a quick
+walk-through of this setup.
+
+Given the disk image chain::
+
+ [A] <-- [B] <-- [C] <-- [D]
+
+Instead of copying content from the entire chain, synchronize *only* the
+contents of the *top*-most disk image (i.e. the active layer), [D], to a
+target, say, [TargetDisk].
+
+.. important::
+ The destination host must already have the contents of the backing
+ chain, involving images [A], [B], and [C], visible via other means
+ -- whether by ``cp``, ``rsync``, or by some storage array-specific
+ command.
+
+Sometimes, this is also referred to as "shallow copy" -- because only
+the "active layer", and not the rest of the image chain, is copied to
+the destination.
+
+.. note::
+ In this example, for the sake of simplicity, we'll be using the same
+ ``localhost`` as both source and destination.
+
+As noted earlier, on the destination host the contents of the backing
+chain -- from images [A] to [C] -- are already expected to exist in some
+form (e.g. in a file called ``Contents-of-A-B-C.qcow2``). Now, on the
+destination host, let's create a target overlay image (with the image
+``Contents-of-A-B-C.qcow2`` as its backing file), to which the contents
+of image [D] (from the source QEMU) will be mirrored::
+
+ $ qemu-img create -f qcow2 -b ./Contents-of-A-B-C.qcow2 \
+ -F qcow2 ./target-disk.qcow2
+
+Then start the destination QEMU instance with the following invocation
+(the source QEMU is already running, as discussed in the section
+`Interacting with a QEMU instance`_). As noted earlier, for simplicity's
+sake, the destination QEMU is started on the same host, but it could be
+located elsewhere::
+
+ $ ./x86_64-softmmu/qemu-system-x86_64 -display none -nodefconfig \
+ -M q35 -nodefaults -m 512 \
+ -blockdev node-name=node-TargetDisk,driver=qcow2,file.driver=file,file.node-name=file,file.filename=./target-disk.qcow2 \
+ -device virtio-blk,drive=node-TargetDisk,id=virtio0 \
+ -S -monitor stdio -qmp unix:./qmp-sock2,server,nowait \
+ -incoming tcp:localhost:6666
+
+Given the disk image chain on source QEMU::
+
+ [A] <-- [B] <-- [C] <-- [D]
+
+On the destination host, it is expected that the contents of the chain
+``[A] <-- [B] <-- [C]`` are *already* present; therefore, *only* the
+content of image [D] needs to be copied.
+
+(1) [On *destination* QEMU] As part of the first step, start the
+ built-in NBD server on a given host address and port (``::`` is the
+ IPv6 wildcard address, so the server is not bound to one specific
+ interface)::
+
+ (QEMU) nbd-server-start addr={"type":"inet","data":{"host":"::","port":"49153"}}
+ {
+ "execute": "nbd-server-start",
+ "arguments": {
+ "addr": {
+ "data": {
+ "host": "::",
+ "port": "49153"
+ },
+ "type": "inet"
+ }
+ }
+ }
+
+(2) [On *destination* QEMU] And export the destination disk image using
+ QEMU's built-in NBD server::
+
+ (QEMU) nbd-server-add device=node-TargetDisk writable=true
+ {
+ "execute": "nbd-server-add",
+ "arguments": {
+ "device": "node-TargetDisk",
+ "writable": true
+ }
+ }
+
+(3) [On *source* QEMU] Then, invoke ``drive-mirror`` with
+ ``mode=existing`` (meaning: synchronize to a pre-created, therefore
+ 'existing', file on the target host), and with the synchronization
+ mode set to 'top' (``"sync": "top"``)::
+
+ (QEMU) drive-mirror device=node-D target=nbd:localhost:49153:exportname=node-TargetDisk sync=top mode=existing job-id=job0
+ {
+ "execute": "drive-mirror",
+ "arguments": {
+ "device": "node-D",
+ "mode": "existing",
+ "job-id": "job0",
+ "target": "nbd:localhost:49153:exportname=node-TargetDisk",
+ "sync": "top"
+ }
+ }
+
+(4) [On *source* QEMU] Once ``drive-mirror`` has copied all the data, and
+ the event ``BLOCK_JOB_READY`` is emitted, issue ``block-job-cancel``
+ to gracefully end the synchronization, from source QEMU::
+
+ (QEMU) block-job-cancel device=job0
+ {
+ "execute": "block-job-cancel",
+ "arguments": {
+ "device": "job0"
+ }
+ }
+
+(5) [On *destination* QEMU] Then, stop the NBD server::
+
+ (QEMU) nbd-server-stop
+ {
+ "execute": "nbd-server-stop",
+ "arguments": {}
+ }
+
+(6) [On *destination* QEMU] Finally, resume the guest vCPUs by issuing the
+ QMP command ``cont``::
+
+ (QEMU) cont
+ {
+ "execute": "cont",
+ "arguments": {}
+ }
+
+.. note::
+ Higher-level libraries (e.g. libvirt) automate the entire above
+ process (although note that libvirt does not allow same-host
+ migrations to localhost for other reasons).
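+
+The steps above cover only the storage side. Since the destination QEMU
+was started with ``-incoming tcp:localhost:6666``, the guest RAM and
+device state would still have to be transferred before ``cont`` is
+meaningful on the destination. A minimal sketch of that step, assuming
+it is issued on the *source* QEMU while the 'mirror' job is in the
+"ready" state (the exact ordering relative to ``block-job-cancel`` is an
+assumption here, not something this document prescribes)::
+
+ (QEMU) migrate uri=tcp:localhost:6666
+ (QEMU) query-migrate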
+
+
+Notes on ``blockdev-mirror``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``blockdev-mirror`` command is equivalent in core functionality to
+``drive-mirror``, except that it operates at node-level in a Block
+Driver State (BDS) graph.
+
+Also: for ``blockdev-mirror``, the 'target' image needs to be explicitly
+created (using ``qemu-img``) and attached to live QEMU via
+``blockdev-add``, which assigns a name to the newly added target node.
+
+E.g. the sequence of actions to create a point-in-time backup of an
+entire disk image chain, to a target, using ``blockdev-mirror`` would be:
+
+(0) Create the QCOW2 overlays, to arrive at a backing chain of desired
+ depth
+
+(1) Create the target image (using ``qemu-img``), say, ``e.qcow2``
+
+(2) Attach the above created file (``e.qcow2``), run-time, using
+ ``blockdev-add`` to QEMU
+
+(3) Perform ``blockdev-mirror`` (use ``"sync": "full"`` to copy the
+ entire chain to the target). And notice the event
+ ``BLOCK_JOB_READY``
+
+(4) Optionally, query for active block jobs; there should be a 'mirror'
+ job ready to be completed
+
+(5) Gracefully complete the 'mirror' block device job, and notice the
+ event ``BLOCK_JOB_COMPLETED``
+
+(6) Shut down the guest by issuing the QMP ``quit`` command so that
+ caches are flushed
+
+(7) Then, finally, compare the contents of the disk image chain, and
+ the target copy with ``qemu-img compare``. You should see the
+ message: "Images are identical"
+
+
+QMP invocation for ``blockdev-mirror``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Given the disk image chain::
+
+ [A] <-- [B] <-- [C] <-- [D]
+
+The following is the flow to copy the contents of the entire disk image
+chain, from [A] all the way to [D], to a new target, call it [E].
+
+Create the overlay images, [B], [C], and [D]::
+
+ (QEMU) blockdev-snapshot-sync node-name=node-A snapshot-file=b.qcow2 snapshot-node-name=node-B format=qcow2
+ (QEMU) blockdev-snapshot-sync node-name=node-B snapshot-file=c.qcow2 snapshot-node-name=node-C format=qcow2
+ (QEMU) blockdev-snapshot-sync node-name=node-C snapshot-file=d.qcow2 snapshot-node-name=node-D format=qcow2
+
+Create the target image, [E]::
+
+ $ qemu-img create -f qcow2 e.qcow2 39M
+
+Add the above created target image to QEMU, via ``blockdev-add``::
+
+ (QEMU) blockdev-add driver=qcow2 node-name=node-E file={"driver":"file","filename":"e.qcow2"}
+ {
+ "execute": "blockdev-add",
+ "arguments": {
+ "node-name": "node-E",
+ "driver": "qcow2",
+ "file": {
+ "driver": "file",
+ "filename": "e.qcow2"
+ }
+ }
+ }
+
+Perform ``blockdev-mirror``, and notice the event ``BLOCK_JOB_READY``::
+
+ (QEMU) blockdev-mirror device=node-D target=node-E sync=full job-id=job0
+ {
+ "execute": "blockdev-mirror",
+ "arguments": {
+ "device": "node-D",
+ "job-id": "job0",
+ "target": "node-E",
+ "sync": "full"
+ }
+ }
+
+Query for active block jobs; there should be a 'mirror' job ready::
+
+ (QEMU) query-block-jobs
+ {
+ "execute": "query-block-jobs",
+ "arguments": {}
+ }
+ {
+ "return": [
+ {
+ "busy": false,
+ "type": "mirror",
+ "len": 21561344,
+ "paused": false,
+ "ready": true,
+ "io-status": "ok",
+ "offset": 21561344,
+ "device": "job0",
+ "speed": 0
+ }
+ ]
+ }
+
+Gracefully complete the block device job operation, and notice the
+event ``BLOCK_JOB_COMPLETED``::
+
+ (QEMU) block-job-complete device=job0
+ {
+ "execute": "block-job-complete",
+ "arguments": {
+ "device": "job0"
+ }
+ }
+ {
+ "return": {}
+ }
+
+Shut down the guest by issuing the ``quit`` QMP command::
+
+ (QEMU) quit
+ {
+ "execute": "quit",
+ "arguments": {}
+ }
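+
+With QEMU no longer running, the final step from the list in the
+previous section can be carried out. A sketch, assuming the file names
+used above (``qemu-img`` follows the backing chain of ``d.qcow2`` on its
+own, so only the top-most image needs to be named). The images can be
+expected to match only if the guest wrote nothing further after the
+'mirror' job completed::
+
+ $ qemu-img compare d.qcow2 e.qcow2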
+
+
+Live disk backup --- ``drive-backup`` and ``blockdev-backup``
+-------------------------------------------------------------
+
+The ``drive-backup`` command (and its newer equivalent
+``blockdev-backup``) allows you to create a point-in-time snapshot.
+
+In this case, the point-in-time is when you *start* the ``drive-backup``
+(or ``blockdev-backup``) command.
+
+
+QMP invocation for ``drive-backup``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Yet again, starting afresh with our example disk image chain::
+
+ [A] <-- [B] <-- [C] <-- [D]
+
+To create a target image [E], with content populated from image [A] to
+[D], from the above chain, the following is the syntax. (If the target
+image does not exist, ``drive-backup`` will create it)::
+
+ (QEMU) drive-backup device=node-D sync=full target=e.qcow2 job-id=job0
+ {
+ "execute": "drive-backup",
+ "arguments": {
+ "device": "node-D",
+ "job-id": "job0",
+ "sync": "full",
+ "target": "e.qcow2"
+ }
+ }
+
+Once the above ``drive-backup`` has completed, a ``BLOCK_JOB_COMPLETED`` event
+will be issued, indicating the live block device job operation has
+completed, and no further action is required.
+
+
+Notes on ``blockdev-backup``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+The ``blockdev-backup`` command is equivalent in functionality to
+``drive-backup``, except that it operates at node-level in a Block Driver
+State (BDS) graph.
+
+E.g. the sequence of actions to create a point-in-time backup
+of an entire disk image chain, to a target, using ``blockdev-backup``
+would be:
+
+(0) Create the QCOW2 overlays, to arrive at a backing chain of desired
+ depth
+
+(1) Create the target image (using ``qemu-img``), say, ``e.qcow2``
+
+(2) Attach the above created file (``e.qcow2``), run-time, using
+ ``blockdev-add`` to QEMU
+
+(3) Perform ``blockdev-backup`` (use ``"sync": "full"`` to copy the
+ entire chain to the target). And notice the event
+ ``BLOCK_JOB_COMPLETED``
+
+(4) Shut down the guest, by issuing the QMP ``quit`` command, so that
+ caches are flushed
+
+(5) Then, finally, compare the contents of the disk image chain, and
+ the target copy with ``qemu-img compare``. You should see the
+ message: "Images are identical"
+
+The following section shows an example QMP invocation for
+``blockdev-backup``.
+
+QMP invocation for ``blockdev-backup``
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+Given a disk image chain of depth 1 where image [B] is the active
+overlay (live QEMU is writing to it)::
+
+ [A] <-- [B]
+
+The following is the procedure to copy the content of the entire chain
+to a target image (say, [E]), which will then hold the full content of
+[A] and [B].
+
+Create the overlay [B]::
+
+ (QEMU) blockdev-snapshot-sync node-name=node-A snapshot-file=b.qcow2 snapshot-node-name=node-B format=qcow2
+ {
+ "execute": "blockdev-snapshot-sync",
+ "arguments": {
+ "node-name": "node-A",
+ "snapshot-file": "b.qcow2",
+ "format": "qcow2",
+ "snapshot-node-name": "node-B"
+ }
+ }
+
+
+Create a target image that will contain the copy::
+
+ $ qemu-img create -f qcow2 e.qcow2 39M
+
+Then add it to QEMU via ``blockdev-add``::
+
+ (QEMU) blockdev-add driver=qcow2 node-name=node-E file={"driver":"file","filename":"e.qcow2"}
+ {
+ "execute": "blockdev-add",
+ "arguments": {
+ "node-name": "node-E",
+ "driver": "qcow2",
+ "file": {
+ "driver": "file",
+ "filename": "e.qcow2"
+ }
+ }
+ }
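+
+Before starting the backup, it can be worth confirming that the new
+node is visible to QEMU. As an optional check (a sketch, not part of
+the original procedure), ``query-named-block-nodes`` lists the named
+nodes; an entry whose ``node-name`` is "node-E" should appear in the
+reply::
+
+ (QEMU) query-named-block-nodes
+ {
+ "execute": "query-named-block-nodes",
+ "arguments": {}
+ }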
+
+Then invoke ``blockdev-backup`` to copy the contents of the entire
+image chain, consisting of images [A] and [B], to the target image
+'e.qcow2'::
+
+ (QEMU) blockdev-backup device=node-B target=node-E sync=full job-id=job0
+ {
+ "execute": "blockdev-backup",
+ "arguments": {
+ "device": "node-B",
+ "job-id": "job0",
+ "target": "node-E",
+ "sync": "full"
+ }
+ }
+
+Once the above 'backup' operation has completed, the event
+``BLOCK_JOB_COMPLETED`` will be emitted, signalling successful
+completion.
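+
+As an illustration, the event is expected to look roughly like the
+following. This is a sketch based on the event's description in
+``qapi/block-core.json``; the ``len``, ``offset`` and ``timestamp``
+values are made up, and ``device`` carries the job ID given above::
+
+ {
+ "timestamp": {
+ "seconds": 1500000000,
+ "microseconds": 123456
+ },
+ "event": "BLOCK_JOB_COMPLETED",
+ "data": {
+ "device": "job0",
+ "type": "backup",
+ "speed": 0,
+ "len": 41126400,
+ "offset": 41126400
+ }
+ }
+
+On success ``offset`` equals ``len``; on failure an ``error`` member
+is present in ``data``.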
+
+Next, query for any active block device jobs (there should be none)::
+
+ (QEMU) query-block-jobs
+ {
+ "execute": "query-block-jobs",
+ "arguments": {}
+ }
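+
+With the backup job finished, the reply is expected to contain an
+empty list of jobs, along these lines (a sketch; the exact formatting
+of the output may differ)::
+
+ {
+ "return": []
+ }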
+
+Shut down the guest::
+
+ (QEMU) quit
+ {
+ "execute": "quit",
+ "arguments": {}
+ }
+ {
+ "return": {}
+ }
+
+.. note::
+ The above step is really important; if it is forgotten,
+ ``qemu-img compare`` will fail with the error 'Failed to get shared
+ "write" lock on e.qcow2' when you try to verify the integrity of the
+ backup content against the disk image chain.
+
+
+The end result will be the image 'e.qcow2' containing a
+point-in-time backup of the disk image chain -- i.e. contents from
+images [A] and [B] at the time the ``blockdev-backup`` command was
+initiated.
+
+One way to confirm that the backup disk image contains content
+identical to the disk image chain is to compare the two with
+``qemu-img compare``; you should see "Images are identical". (NB: this
+assumes QEMU was launched with the ``-S`` option, which does not start
+the CPUs at guest boot up)::
+
+ $ qemu-img compare b.qcow2 e.qcow2
+ Warning: Image size mismatch!
+ Images are identical.
+
+NOTE: The "Warning: Image size mismatch!" is expected, as we created the
+target image (e.qcow2) with a size of 39M, which does not match the size
+of the source image.
deleted file mode 100644
@@ -1,72 +0,0 @@
-LIVE BLOCK OPERATIONS
-=====================
-
-High level description of live block operations. Note these are not
-supported for use with the raw format at the moment.
-
-Note also that this document is incomplete and it currently only
-covers the 'stream' operation. Other operations supported by QEMU such
-as 'commit', 'mirror' and 'backup' are not described here yet. Please
-refer to the qapi/block-core.json file for an overview of those.
-
-Snapshot live merge
-===================
-
-Given a snapshot chain, described in this document in the following
-format:
-
-[A] <- [B] <- [C] <- [D] <- [E]
-
-Where the rightmost object ([E] in the example) described is the current
-image which the guest OS has write access to. To the left of it is its base
-image, and so on accordingly until the leftmost image, which has no
-base.
-
-The snapshot live merge operation transforms such a chain into a
-smaller one with fewer elements, such as this transformation relative
-to the first example:
-
-[A] <- [E]
-
-Data is copied in the right direction with destination being the
-rightmost image, but any other intermediate image can be specified
-instead. In this example data is copied from [C] into [D], so [D] can
-be backed by [B]:
-
-[A] <- [B] <- [D] <- [E]
-
-The operation is implemented in QEMU through image streaming facilities.
-
-The basic idea is to execute 'block_stream virtio0' while the guest is
-running. Progress can be monitored using 'info block-jobs'. When the
-streaming operation completes it raises a QMP event. 'block_stream'
-copies data from the backing file(s) into the active image. When finished,
-it adjusts the backing file pointer.
-
-The 'base' parameter specifies an image which data need not be
-streamed from. This image will be used as the backing file for the
-destination image when the operation is finished.
-
-In the first example above, the command would be:
-
-(qemu) block_stream virtio0 file-A.img
-
-In order to specify a destination image different from the active
-(rightmost) one we can use its node name instead.
-
-In the second example above, the command would be:
-
-(qemu) block_stream node-D file-B.img
-
-Live block copy
-===============
-
-To copy an in use image to another destination in the filesystem, one
-should create a live snapshot in the desired destination, then stream
-into that image. Example:
-
-(qemu) snapshot_blkdev ide0-hd0 /new-path/disk.img qcow2
-
-(qemu) block_stream ide0-hd0
-
-