diff mbox

[COLO-Frame,(Base),v21,06/17] COLO: Introduce checkpointing protocol

Message ID 1476792613-11712-7-git-send-email-zhang.zhanghailiang@huawei.com (mailing list archive)
State New, archived
Headers show

Commit Message

Zhanghailiang Oct. 18, 2016, 12:10 p.m. UTC
We need communications protocol of user-defined to control
the checkpointing process.

The new checkpointing request is started by Primary VM,
and the interactive process like below:

Checkpoint synchronizing points:

                   Primary               Secondary
                                            initial work
'checkpoint-ready'    <-------------------- @

'checkpoint-request'  @ -------------------->
                                            Suspend (Only in hybrid mode)
'checkpoint-reply'    <-------------------- @
                      Suspend&Save state
'vmstate-send'        @ -------------------->
                      Send state            Receive state
'vmstate-received'    <-------------------- @
                      Release packets       Load state
'vmstate-load'        <-------------------- @
                      Resume                Resume (Only in hybrid mode)

                      Start Comparing (Only in hybrid mode)
NOTE:
 1) '@' who sends the message
 2) Every sync-point is synchronized by two sides with only
    one handshake(single direction) for low-latency.
    If more strict synchronization is required, a opposite direction
    sync-point should be added.
 3) Since sync-points are single direction, the remote side may
    go forward a lot when this side just receives the sync-point.
 4) For now, we only support 'periodic' checkpoint, for which
   the Secondary VM is not running, later we will support 'hybrid' mode.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
---
v19:
- fix title and comments
v16:
- Rename 'colo_put/get[_check]_cmd()' to 'colo_send/receive[_check]_message()'
v14:
- Rename 'COLOCommand' to 'COLOMessage'. (Markus's suggestion)
- Add Reviewd-by tag
v13:
- Refactor colo command related helper functions, use 'Error **errp' parameter
  instead of return value to indicate success or failure.
- Fix some other comments from Markus.

v12:
- Rename colo_ctl_put() to colo_put_cmd()
- Rename colo_ctl_get() to colo_get_check_cmd() and drop
  the third parameter
- Rename colo_ctl_get_cmd() to colo_get_cmd()
- Remove useless 'invalid' member for COLOcommand enum.
v11:
- Add missing 'checkpoint-ready' communication in comment.
- Use parameter to return 'value' for colo_ctl_get() (Dave's suggestion)
- Fix trace for colo_ctl_get() to trace command and value both
v10:
- Rename enum COLOCmd to COLOCommand (Eric's suggestion).
- Remove unused 'ram-steal'
---
 migration/colo.c       | 200 ++++++++++++++++++++++++++++++++++++++++++++++++-
 migration/trace-events |   2 +
 qapi-schema.json       |  25 +++++++
 3 files changed, 225 insertions(+), 2 deletions(-)

Comments

Amit Shah Oct. 26, 2016, 5:25 a.m. UTC | #1
On (Tue) 18 Oct 2016 [20:10:02], zhanghailiang wrote:
> We need communications protocol of user-defined to control
> the checkpointing process.
> 
> The new checkpointing request is started by Primary VM,
> and the interactive process like below:
> 
> Checkpoint synchronizing points:
> 
>                    Primary               Secondary
>                                             initial work
> 'checkpoint-ready'    <-------------------- @
> 
> 'checkpoint-request'  @ -------------------->
>                                             Suspend (Only in hybrid mode)
> 'checkpoint-reply'    <-------------------- @
>                       Suspend&Save state
> 'vmstate-send'        @ -------------------->
>                       Send state            Receive state
> 'vmstate-received'    <-------------------- @
>                       Release packets       Load state
> 'vmstate-load'        <-------------------- @
>                       Resume                Resume (Only in hybrid mode)
> 
>                       Start Comparing (Only in hybrid mode)
> NOTE:
>  1) '@' who sends the message
>  2) Every sync-point is synchronized by two sides with only
>     one handshake(single direction) for low-latency.
>     If more strict synchronization is required, a opposite direction
>     sync-point should be added.
>  3) Since sync-points are single direction, the remote side may
>     go forward a lot when this side just receives the sync-point.
>  4) For now, we only support 'periodic' checkpoint, for which
>    the Secondary VM is not running, later we will support 'hybrid' mode.
> 
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
> Cc: Eric Blake <eblake@redhat.com>
> Cc: Markus Armbruster <armbru@redhat.com>
> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

Reviewed-by: Amit Shah <amit.shah@redhat.com>

> +static int colo_do_checkpoint_transaction(MigrationState *s)
> +{
> +    Error *local_err = NULL;
> +
> +    colo_send_message(s->to_dst_file, COLO_MESSAGE_CHECKPOINT_REQUEST,
> +                      &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    colo_receive_check_message(s->rp_state.from_dst_file,
> +                    COLO_MESSAGE_CHECKPOINT_REPLY, &local_err);
> +    if (local_err) {
> +        goto out;
> +    }
> +
> +    /* TODO: suspend and save vm state to colo buffer */

I like how you've split up the patches - makes it easier to review.
Thanks for doing this!

> --- a/qapi-schema.json
> +++ b/qapi-schema.json
> @@ -785,6 +785,31 @@
>  { 'command': 'migrate-start-postcopy' }
>  
>  ##
> +# @COLOMessage
> +#
> +# The message transmission between PVM and SVM

Can you expand PVM and SVM for the first use?  It's obvious to someone
who's familiar with COLO, but someone looking at the api may not know
what it all means.  Also, please expand COLO if not already done in
the qapi-schema file.


		Amit
Zhanghailiang Oct. 26, 2016, 2:18 p.m. UTC | #2
On 2016/10/26 13:25, Amit Shah wrote:
> On (Tue) 18 Oct 2016 [20:10:02], zhanghailiang wrote:
>> We need communications protocol of user-defined to control
>> the checkpointing process.
>>
>> The new checkpointing request is started by Primary VM,
>> and the interactive process like below:
>>
>> Checkpoint synchronizing points:
>>
>>                     Primary               Secondary
>>                                              initial work
>> 'checkpoint-ready'    <-------------------- @
>>
>> 'checkpoint-request'  @ -------------------->
>>                                              Suspend (Only in hybrid mode)
>> 'checkpoint-reply'    <-------------------- @
>>                        Suspend&Save state
>> 'vmstate-send'        @ -------------------->
>>                        Send state            Receive state
>> 'vmstate-received'    <-------------------- @
>>                        Release packets       Load state
>> 'vmstate-load'        <-------------------- @
>>                        Resume                Resume (Only in hybrid mode)
>>
>>                        Start Comparing (Only in hybrid mode)
>> NOTE:
>>   1) '@' who sends the message
>>   2) Every sync-point is synchronized by two sides with only
>>      one handshake(single direction) for low-latency.
>>      If more strict synchronization is required, a opposite direction
>>      sync-point should be added.
>>   3) Since sync-points are single direction, the remote side may
>>      go forward a lot when this side just receives the sync-point.
>>   4) For now, we only support 'periodic' checkpoint, for which
>>     the Secondary VM is not running, later we will support 'hybrid' mode.
>>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
>> Signed-off-by: Gonglei <arei.gonglei@huawei.com>
>> Cc: Eric Blake <eblake@redhat.com>
>> Cc: Markus Armbruster <armbru@redhat.com>
>> Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
>> Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
>
> Reviewed-by: Amit Shah <amit.shah@redhat.com>
>
>> +static int colo_do_checkpoint_transaction(MigrationState *s)
>> +{
>> +    Error *local_err = NULL;
>> +
>> +    colo_send_message(s->to_dst_file, COLO_MESSAGE_CHECKPOINT_REQUEST,
>> +                      &local_err);
>> +    if (local_err) {
>> +        goto out;
>> +    }
>> +
>> +    colo_receive_check_message(s->rp_state.from_dst_file,
>> +                    COLO_MESSAGE_CHECKPOINT_REPLY, &local_err);
>> +    if (local_err) {
>> +        goto out;
>> +    }
>> +
>> +    /* TODO: suspend and save vm state to colo buffer */
>
> I like how you've split up the patches - makes it easier to review.
> Thanks for doing this!
>

You're welcome. I'm glad to make your maintainers' review easy. :)

>> --- a/qapi-schema.json
>> +++ b/qapi-schema.json
>> @@ -785,6 +785,31 @@
>>   { 'command': 'migrate-start-postcopy' }
>>
>>   ##
>> +# @COLOMessage
>> +#
>> +# The message transmission between PVM and SVM
>
> Can you expand PVM and SVM for the first use?  It's obvious to someone

Quit reasonable. I will fix it in next version.

> who's familiar with COLO, but someone looking at the api may not know
> what it all means.  Also, please expand COLO if not already done in
> the qapi-schema file.
>

You are right, the full name of COLO appears in patch one, I'll add it
there.

Thanks.
hailiang

>
> 		Amit
>
> .
>
diff mbox

Patch

diff --git a/migration/colo.c b/migration/colo.c
index 7c5769b..bf32d63 100644
--- a/migration/colo.c
+++ b/migration/colo.c
@@ -15,6 +15,7 @@ 
 #include "migration/colo.h"
 #include "trace.h"
 #include "qemu/error-report.h"
+#include "qapi/error.h"
 
 bool colo_supported(void)
 {
@@ -35,22 +36,146 @@  bool migration_incoming_in_colo_state(void)
     return mis && (mis->state == MIGRATION_STATUS_COLO);
 }
 
+static void colo_send_message(QEMUFile *f, COLOMessage msg,
+                              Error **errp)
+{
+    int ret;
+
+    if (msg >= COLO_MESSAGE__MAX) {
+        error_setg(errp, "%s: Invalid message", __func__);
+        return;
+    }
+    qemu_put_be32(f, msg);
+    qemu_fflush(f);
+
+    ret = qemu_file_get_error(f);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Can't send COLO message");
+    }
+    trace_colo_send_message(COLOMessage_lookup[msg]);
+}
+
+static COLOMessage colo_receive_message(QEMUFile *f, Error **errp)
+{
+    COLOMessage msg;
+    int ret;
+
+    msg = qemu_get_be32(f);
+    ret = qemu_file_get_error(f);
+    if (ret < 0) {
+        error_setg_errno(errp, -ret, "Can't receive COLO message");
+        return msg;
+    }
+    if (msg >= COLO_MESSAGE__MAX) {
+        error_setg(errp, "%s: Invalid message", __func__);
+        return msg;
+    }
+    trace_colo_receive_message(COLOMessage_lookup[msg]);
+    return msg;
+}
+
+static void colo_receive_check_message(QEMUFile *f, COLOMessage expect_msg,
+                                       Error **errp)
+{
+    COLOMessage msg;
+    Error *local_err = NULL;
+
+    msg = colo_receive_message(f, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+    if (msg != expect_msg) {
+        error_setg(errp, "Unexpected COLO message %d, expected %d",
+                          msg, expect_msg);
+    }
+}
+
+static int colo_do_checkpoint_transaction(MigrationState *s)
+{
+    Error *local_err = NULL;
+
+    colo_send_message(s->to_dst_file, COLO_MESSAGE_CHECKPOINT_REQUEST,
+                      &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    colo_receive_check_message(s->rp_state.from_dst_file,
+                    COLO_MESSAGE_CHECKPOINT_REPLY, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    /* TODO: suspend and save vm state to colo buffer */
+
+    colo_send_message(s->to_dst_file, COLO_MESSAGE_VMSTATE_SEND, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    /* TODO: send vmstate to Secondary */
+
+    colo_receive_check_message(s->rp_state.from_dst_file,
+                       COLO_MESSAGE_VMSTATE_RECEIVED, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    colo_receive_check_message(s->rp_state.from_dst_file,
+                       COLO_MESSAGE_VMSTATE_LOADED, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    /* TODO: resume Primary */
+
+    return 0;
+out:
+    if (local_err) {
+        error_report_err(local_err);
+    }
+    return -EINVAL;
+}
+
 static void colo_process_checkpoint(MigrationState *s)
 {
+    Error *local_err = NULL;
+    int ret;
+
     s->rp_state.from_dst_file = qemu_file_get_return_path(s->to_dst_file);
     if (!s->rp_state.from_dst_file) {
         error_report("Open QEMUFile from_dst_file failed");
         goto out;
     }
 
+    /*
+     * Wait for Secondary finish loading VM states and enter COLO
+     * restore.
+     */
+    colo_receive_check_message(s->rp_state.from_dst_file,
+                       COLO_MESSAGE_CHECKPOINT_READY, &local_err);
+    if (local_err) {
+        goto out;
+    }
+
     qemu_mutex_lock_iothread();
     vm_start();
     qemu_mutex_unlock_iothread();
     trace_colo_vm_state_change("stop", "run");
 
-    /* TODO: COLO checkpoint savevm loop */
+    while (s->state == MIGRATION_STATUS_COLO) {
+        ret = colo_do_checkpoint_transaction(s);
+        if (ret < 0) {
+            goto out;
+        }
+    }
 
 out:
+    /* Throw the unreported error message after exited from loop */
+    if (local_err) {
+        error_report_err(local_err);
+    }
     migrate_set_state(&s->state, MIGRATION_STATUS_COLO,
                       MIGRATION_STATUS_COMPLETED);
 
@@ -68,9 +193,33 @@  void migrate_start_colo_process(MigrationState *s)
     qemu_mutex_lock_iothread();
 }
 
+static void colo_wait_handle_message(QEMUFile *f, int *checkpoint_request,
+                                     Error **errp)
+{
+    COLOMessage msg;
+    Error *local_err = NULL;
+
+    msg = colo_receive_message(f, &local_err);
+    if (local_err) {
+        error_propagate(errp, local_err);
+        return;
+    }
+
+    switch (msg) {
+    case COLO_MESSAGE_CHECKPOINT_REQUEST:
+        *checkpoint_request = 1;
+        break;
+    default:
+        *checkpoint_request = 0;
+        error_setg(errp, "Got unknown COLO message: %d", msg);
+        break;
+    }
+}
+
 void *colo_process_incoming_thread(void *opaque)
 {
     MigrationIncomingState *mis = opaque;
+    Error *local_err = NULL;
 
     migrate_set_state(&mis->state, MIGRATION_STATUS_ACTIVE,
                       MIGRATION_STATUS_COLO);
@@ -87,9 +236,56 @@  void *colo_process_incoming_thread(void *opaque)
      */
     qemu_file_set_blocking(mis->from_src_file, true);
 
-    /* TODO: COLO checkpoint restore loop */
+    colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_READY,
+                      &local_err);
+    if (local_err) {
+        goto out;
+    }
+
+    while (mis->state == MIGRATION_STATUS_COLO) {
+        int request;
+
+        colo_wait_handle_message(mis->from_src_file, &request, &local_err);
+        if (local_err) {
+            goto out;
+        }
+        assert(request);
+        /* FIXME: This is unnecessary for periodic checkpoint mode */
+        colo_send_message(mis->to_src_file, COLO_MESSAGE_CHECKPOINT_REPLY,
+                     &local_err);
+        if (local_err) {
+            goto out;
+        }
+
+        colo_receive_check_message(mis->from_src_file,
+                           COLO_MESSAGE_VMSTATE_SEND, &local_err);
+        if (local_err) {
+            goto out;
+        }
+
+        /* TODO: read migration data into colo buffer */
+
+        colo_send_message(mis->to_src_file, COLO_MESSAGE_VMSTATE_RECEIVED,
+                     &local_err);
+        if (local_err) {
+            goto out;
+        }
+
+        /* TODO: load vm state */
+
+        colo_send_message(mis->to_src_file, COLO_MESSAGE_VMSTATE_LOADED,
+                     &local_err);
+        if (local_err) {
+            goto out;
+        }
+    }
 
 out:
+    /* Throw the unreported error message after exited from loop */
+    if (local_err) {
+        error_report_err(local_err);
+    }
+
     if (mis->to_src_file) {
         qemu_fclose(mis->to_src_file);
     }
diff --git a/migration/trace-events b/migration/trace-events
index a29e5a0..f374c8c 100644
--- a/migration/trace-events
+++ b/migration/trace-events
@@ -210,3 +210,5 @@  migration_tls_incoming_handshake_complete(void) ""
 
 # migration/colo.c
 colo_vm_state_change(const char *old, const char *new) "Change '%s' => '%s'"
+colo_send_message(const char *msg) "Send '%s' message"
+colo_receive_message(const char *msg) "Receive '%s' message"
diff --git a/qapi-schema.json b/qapi-schema.json
index aaed423..276d2cf 100644
--- a/qapi-schema.json
+++ b/qapi-schema.json
@@ -785,6 +785,31 @@ 
 { 'command': 'migrate-start-postcopy' }
 
 ##
+# @COLOMessage
+#
+# The message transmission between PVM and SVM
+#
+# @checkpoint-ready: SVM is ready for checkpointing
+#
+# @checkpoint-request: PVM tells SVM to prepare for new checkpointing
+#
+# @checkpoint-reply: SVM gets PVM's checkpoint request
+#
+# @vmstate-send: VM's state will be sent by PVM.
+#
+# @vmstate-size: The total size of VMstate.
+#
+# @vmstate-received: VM's state has been received by SVM.
+#
+# @vmstate-loaded: VM's state has been loaded by SVM.
+#
+# Since: 2.8
+##
+{ 'enum': 'COLOMessage',
+  'data': [ 'checkpoint-ready', 'checkpoint-request', 'checkpoint-reply',
+            'vmstate-send', 'vmstate-size', 'vmstate-received',
+            'vmstate-loaded' ] }
+
 # @MouseInfo:
 #
 # Information about a mouse device.