
[02/15] colo-compare: implement the process of checkpoint

Message ID 1487734936-43472-3-git-send-email-zhang.zhanghailiang@huawei.com (mailing list archive)
State New, archived

Commit Message

Zhanghailiang Feb. 22, 2017, 3:42 a.m. UTC
While doing a checkpoint, we need to flush all the unhandled packets.
By using the filter notifier mechanism, we can easily notify
every compare object to do this; the flush runs inside
the compare threads as a coroutine.

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
---
 net/colo-compare.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 net/colo-compare.h | 20 +++++++++++++++
 2 files changed, 92 insertions(+)
 create mode 100644 net/colo-compare.h

Comments

Zhang Chen Feb. 22, 2017, 9:31 a.m. UTC | #1
On 02/22/2017 11:42 AM, zhanghailiang wrote:
> While do checkpoint, we need to flush all the unhandled packets,
> By using the filter notifier mechanism, we can easily to notify
> every compare object to do this process, which runs inside
> of compare threads as a coroutine.

Hi~ Jason and Hailiang.

I will send a patch set later that adds a colo-compare notify mechanism
for Xen, similar to this patch.
I want to add a new chardev socket in colo-compare that connects to Xen
COLO, to notify checkpoint or failover, because we have no other way to
communicate with the Xen code.
That means we will have two notify mechanisms.
What do you think about this?


Thanks
Zhang Chen

>
> Cc: Jason Wang <jasowang@redhat.com>
> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
> ---
>   net/colo-compare.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>   net/colo-compare.h | 20 +++++++++++++++
>   2 files changed, 92 insertions(+)
>   create mode 100644 net/colo-compare.h
>
Zhanghailiang Feb. 23, 2017, 1:02 a.m. UTC | #2
Hi,

On 2017/2/22 17:31, Zhang Chen wrote:
>
>
> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>> While do checkpoint, we need to flush all the unhandled packets,
>> By using the filter notifier mechanism, we can easily to notify
>> every compare object to do this process, which runs inside
>> of compare threads as a coroutine.
>
> Hi~ Jason and Hailiang.
>
> I will send a patch set later about colo-compare notify mechanism for
> Xen like this patch.
> I want to add a new chardev socket way in colo-comapre connect to Xen
> colo, for notify
> checkpoint or failover, Because We have no choice to use this way
> communicate with Xen codes.
> That's means we will have two notify mechanism.
> What do you think about this?
>

I don't think you need another mechanism. What you need to do is
implement a QMP command that calls colo_notify_compares_event().
It will not return until the event (checkpoint or failover) has been
handled by all compares. Will this satisfy your requirement?
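
A minimal sketch of that blocking behaviour, assuming plain pthreads
instead of QEMU's QemuMutex/QemuCond wrappers. The worker threads stand in
for the compare threads, and every name below (notify_and_wait,
worker_handle_event, MAX_WORKERS, events_handled) is hypothetical and not
part of the patch:

```c
#include <assert.h>
#include <pthread.h>

#define MAX_WORKERS 16

/* Stand-ins for event_mtx, event_complete_cond and event_unhandled_count. */
static pthread_mutex_t event_mtx = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t event_complete_cond = PTHREAD_COND_INITIALIZER;
static int event_unhandled_count;
static int events_handled;      /* demonstration only */

/* Runs in each worker thread once it has handled the event. */
static void *worker_handle_event(void *opaque)
{
    (void)opaque;
    /* ... the real handler would flush unhandled packets here ... */
    pthread_mutex_lock(&event_mtx);
    assert(event_unhandled_count > 0);
    event_unhandled_count--;
    events_handled++;
    pthread_cond_broadcast(&event_complete_cond);
    pthread_mutex_unlock(&event_mtx);
    return NULL;
}

/*
 * Notify nworkers workers, then block until every notified worker has
 * handled the event. Returns the number of handled events, or -1 if
 * nworkers is out of range or a worker could not be started.
 */
static int notify_and_wait(int nworkers)
{
    pthread_t tids[MAX_WORKERS];
    int started = 0, handled, i;

    if (nworkers < 0 || nworkers > MAX_WORKERS) {
        return -1;
    }

    pthread_mutex_lock(&event_mtx);
    events_handled = 0;
    for (i = 0; i < nworkers; i++) {
        /* Starting the thread plays the role of filter_notifier_set(). */
        if (pthread_create(&tids[i], NULL, worker_handle_event, NULL) != 0) {
            break;
        }
        started++;
        event_unhandled_count++;
    }
    /* Wait even on partial failure so the counter always drains to zero. */
    while (event_unhandled_count) {
        pthread_cond_wait(&event_complete_cond, &event_mtx);
    }
    handled = events_handled;
    pthread_mutex_unlock(&event_mtx);

    for (i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
    }
    return started == nworkers ? handled : -1;
}
```

Calling notify_and_wait(4) returns only after all four workers have
decremented the counter, which is the property a QMP caller would rely on.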

Thanks,
Hailiang

>
> Thanks
> Zhang Chen
>
>>
>> Cc: Jason Wang <jasowang@redhat.com>
>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>> ---
>>    net/colo-compare.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>    net/colo-compare.h | 20 +++++++++++++++
>>    2 files changed, 92 insertions(+)
>>    create mode 100644 net/colo-compare.h
>>
>
Zhang Chen Feb. 23, 2017, 5:49 a.m. UTC | #3
On 02/23/2017 09:02 AM, Hailiang Zhang wrote:
> Hi,
>
> On 2017/2/22 17:31, Zhang Chen wrote:
>>
>>
>> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>>> While do checkpoint, we need to flush all the unhandled packets,
>>> By using the filter notifier mechanism, we can easily to notify
>>> every compare object to do this process, which runs inside
>>> of compare threads as a coroutine.
>>
>> Hi~ Jason and Hailiang.
>>
>> I will send a patch set later about colo-compare notify mechanism for
>> Xen like this patch.
>> I want to add a new chardev socket way in colo-comapre connect to Xen
>> colo, for notify
>> checkpoint or failover, Because We have no choice to use this way
>> communicate with Xen codes.
>> That's means we will have two notify mechanism.
>> What do you think about this?
>>
>
> I don't think you need another mechanism, what you need to do is to
> realize a qmp command which calls colo_notify_compares_event(),
> It will not  return until the event (checkpoint or failover) be
> handled by all compares. Will this satisfy your requirement ?

No. For colo-frame notifying colo-compare, calling
colo_notify_compares_event() is OK. The problem is colo-compare notifying
colo-frame in Xen: Xen's colo-frame needs a blocking API with a timeout
to read colo-compare's notification, where the timeout is the periodic
checkpoint interval. In this patch set, colo-compare just calls
colo_compare_inconsistent_notify(), which is a non-blocking notification.
We cannot implement a QMP command that Xen keeps polling to get the
notification status, and QEMU cannot accept a QMP command being called
for polling either.
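
For illustration, a blocking read with a timeout on a notification fd can
be sketched with poll(). This is only a sketch under assumptions: the
helper name wait_notify_timeout and the 32-bit event payload are
hypothetical, and the fd could be a pipe, socket, or eventfd-like
descriptor:

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stdint.h>
#include <unistd.h>

/*
 * Wait up to timeout_ms for a notification on fd.
 * Returns 1 and stores the event in *event if one arrived,
 * 0 on timeout (the caller would then start a periodic checkpoint),
 * and -1 on error or if the writer closed the channel.
 */
static int wait_notify_timeout(int fd, int timeout_ms, uint32_t *event)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    ssize_t n;
    int ret;

    assert(event != NULL);

    /* Note: retrying after EINTR restarts the full timeout. */
    do {
        ret = poll(&pfd, 1, timeout_ms);
    } while (ret < 0 && errno == EINTR);

    if (ret <= 0) {
        return ret;             /* 0: timed out, -1: poll() failed */
    }

    n = read(fd, event, sizeof(*event));
    return n == (ssize_t)sizeof(*event) ? 1 : -1;
}
```

With this shape the reader blocks without polling a QMP command, and a
timeout doubles as the trigger for the periodic checkpoint.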


Thanks
Zhang Chen

>
> Thanks,
> Hailiang
>
>>
>> Thanks
>> Zhang Chen
>>
>>>
>>> Cc: Jason Wang <jasowang@redhat.com>
>>> Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
>>> Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
>>> ---
>>>    net/colo-compare.c | 72 
>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>    net/colo-compare.h | 20 +++++++++++++++
>>>    2 files changed, 92 insertions(+)
>>>    create mode 100644 net/colo-compare.h
>>>
>>
>
>
>
> .
>
Jason Wang April 14, 2017, 5:57 a.m. UTC | #4
On 2017-02-22 17:31, Zhang Chen wrote:
>
>
> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>> While do checkpoint, we need to flush all the unhandled packets,
>> By using the filter notifier mechanism, we can easily to notify
>> every compare object to do this process, which runs inside
>> of compare threads as a coroutine.
>
> Hi~ Jason and Hailiang.
>
> I will send a patch set later about colo-compare notify mechanism for 
> Xen like this patch.
> I want to add a new chardev socket way in colo-comapre connect to Xen 
> colo, for notify
> checkpoint or failover, Because We have no choice to use this way 
> communicate with Xen codes.
> That's means we will have two notify mechanism.
> What do you think about this?
>
>
> Thanks
> Zhang Chen 

I was thinking about the possibility of using a similar approach for
colo compare. E.g. can we use a socket? This would save some duplicated
code.

Thanks
Zhanghailiang April 14, 2017, 6:22 a.m. UTC | #5
Hi Jason,

On 2017/4/14 13:57, Jason Wang wrote:
>
> On 2017-02-22 17:31, Zhang Chen wrote:
>>
>> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>>> While do checkpoint, we need to flush all the unhandled packets,
>>> By using the filter notifier mechanism, we can easily to notify
>>> every compare object to do this process, which runs inside
>>> of compare threads as a coroutine.
>> Hi~ Jason and Hailiang.
>>
>> I will send a patch set later about colo-compare notify mechanism for
>> Xen like this patch.
>> I want to add a new chardev socket way in colo-comapre connect to Xen
>> colo, for notify
>> checkpoint or failover, Because We have no choice to use this way
>> communicate with Xen codes.
>> That's means we will have two notify mechanism.
>> What do you think about this?
>>
>>
>> Thanks
>> Zhang Chen
> I was thinking the possibility of using similar way to for colo compare.
> E.g can we use socket? This can saves duplicated codes more or less.

Since filter and COLO already use too many sockets (two unix sockets and
two TCP sockets for each vNIC), I don't want to introduce more ;) But I'm
not sure whether it is possible to make this more flexible and optional:
abstract the duplicated code and pass the opened fd (whether it is an
eventfd or a socket fd) as a parameter, for example.
Is this approach acceptable?

Thanks,
Hailiang

> Thanks
>
>
> .
>
Jason Wang April 14, 2017, 6:38 a.m. UTC | #6
On 2017-04-14 14:22, Hailiang Zhang wrote:
> Hi Jason,
>
> On 2017/4/14 13:57, Jason Wang wrote:
>>
>> On 2017-02-22 17:31, Zhang Chen wrote:
>>>
>>> On 02/22/2017 11:42 AM, zhanghailiang wrote:
>>>> While do checkpoint, we need to flush all the unhandled packets,
>>>> By using the filter notifier mechanism, we can easily to notify
>>>> every compare object to do this process, which runs inside
>>>> of compare threads as a coroutine.
>>> Hi~ Jason and Hailiang.
>>>
>>> I will send a patch set later about colo-compare notify mechanism for
>>> Xen like this patch.
>>> I want to add a new chardev socket way in colo-comapre connect to Xen
>>> colo, for notify
>>> checkpoint or failover, Because We have no choice to use this way
>>> communicate with Xen codes.
>>> That's means we will have two notify mechanism.
>>> What do you think about this?
>>>
>>>
>>> Thanks
>>> Zhang Chen
>> I was thinking the possibility of using similar way to for colo compare.
>> E.g can we use socket? This can saves duplicated codes more or less.
>
> Since there are too many sockets used by filter and COLO, (Two unix 
> sockets and two
>  tcp sockets for each vNIC), I don't want to introduce more ;) , but 
> i'm not sure if it is
> possible to make it more flexible and optional, abstract these 
> duplicated codes,
> pass the opened fd (No matter eventfd or socket fd ) as parameter, for 
> example.
> Is this way acceptable ?
>
> Thanks,
> Hailiang

Yes, that's kind of what I want. We don't want to use two message
formats. Passing an opened fd needs management support, so we still need
a fallback if there's no management layer on top. For qemu/kvm, we can
make everything transparent to the CLI by e.g. socketpair() or others,
but the key is to have a unified message format.
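
A rough sketch of the idea, not a proposal for the actual format: one
fixed-size message sent over whatever fd is available, with socketpair()
as the fallback when no fd is passed in. NotifyMsg, the NOTIFY_* values,
and the helper names are all hypothetical:

```c
#include <assert.h>
#include <stdint.h>
#include <sys/socket.h>
#include <unistd.h>

/* Made-up event values, for illustration only. */
enum { NOTIFY_CHECKPOINT = 1, NOTIFY_FAILOVER = 2 };

/*
 * One message layout for every transport. Fields are in host byte order,
 * which is only safe because both ends run on the same host.
 */
typedef struct NotifyMsg {
    uint32_t event;
    uint32_t payload;
} NotifyMsg;

/* Fallback when no management layer passes an fd: create the pair locally. */
static int notify_channel_create(int fds[2])
{
    return socketpair(AF_UNIX, SOCK_STREAM, 0, fds);
}

/* Send one message; returns 0 on success, -1 on a short or failed send. */
static int notify_send(int fd, uint32_t event, uint32_t payload)
{
    NotifyMsg msg = { .event = event, .payload = payload };

    return send(fd, &msg, sizeof(msg), 0) == (ssize_t)sizeof(msg) ? 0 : -1;
}

/* Receive exactly one message; returns 0 on success, -1 otherwise. */
static int notify_recv(int fd, NotifyMsg *msg)
{
    assert(msg != NULL);
    return recv(fd, msg, sizeof(*msg), MSG_WAITALL) == (ssize_t)sizeof(*msg)
           ? 0 : -1;
}
```

The same notify_send()/notify_recv() pair would work on an fd handed in
by a management layer, so only the channel setup differs between the two
cases.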

Thoughts?

Thanks

>
>> Thanks
>>
>>
>> .
>>
>
>

Patch

diff --git a/net/colo-compare.c b/net/colo-compare.c
index a6fc2ff..61a8ee4 100644
--- a/net/colo-compare.c
+++ b/net/colo-compare.c
@@ -29,17 +29,24 @@ 
 #include "qemu/sockets.h"
 #include "qapi-visit.h"
 #include "net/colo.h"
+#include "net/colo-compare.h"
 
 #define TYPE_COLO_COMPARE "colo-compare"
 #define COLO_COMPARE(obj) \
     OBJECT_CHECK(CompareState, (obj), TYPE_COLO_COMPARE)
 
+static QTAILQ_HEAD(, CompareState) net_compares =
+       QTAILQ_HEAD_INITIALIZER(net_compares);
+
 #define COMPARE_READ_LEN_MAX NET_BUFSIZE
 #define MAX_QUEUE_SIZE 1024
 
 /* TODO: Should be configurable */
 #define REGULAR_PACKET_CHECK_MS 3000
 
+static QemuMutex event_mtx = { .lock = PTHREAD_MUTEX_INITIALIZER };
+static QemuCond event_complete_cond = { .cond = PTHREAD_COND_INITIALIZER };
+static int event_unhandled_count;
 /*
   + CompareState ++
   |               |
@@ -86,6 +93,10 @@  typedef struct CompareState {
 
     GMainContext *worker_context;
     GMainLoop *compare_loop;
+    /* Used for COLO to notify compare to do something */
+    FilterNotifier *notifier;
+
+    QTAILQ_ENTRY(CompareState) next;
 } CompareState;
 
 typedef struct CompareClass {
@@ -375,6 +386,11 @@  static void colo_compare_connection(void *opaque, void *user_data)
     while (!g_queue_is_empty(&conn->primary_list) &&
            !g_queue_is_empty(&conn->secondary_list)) {
         pkt = g_queue_pop_tail(&conn->primary_list);
+        if (!pkt) {
+            error_report("colo-compare pop pkt failed");
+            return;
+        }
+
         switch (conn->ip_proto) {
         case IPPROTO_TCP:
             result = g_queue_find_custom(&conn->secondary_list,
@@ -496,6 +512,52 @@  static gboolean check_old_packet_regular(void *opaque)
     return TRUE;
 }
 
+/* Public API, used by the COLO frame to notify compare objects of an event */
+void colo_notify_compares_event(void *opaque, int event, Error **errp)
+{
+    CompareState *s;
+    int ret;
+
+    qemu_mutex_lock(&event_mtx);
+    QTAILQ_FOREACH(s, &net_compares, next) {
+        ret = filter_notifier_set(s->notifier, event);
+        if (ret < 0) {
+            error_setg_errno(errp, -ret, "Failed to write value to eventfd");
+            goto fail;
+        }
+        event_unhandled_count++;
+    }
+    /* Wait for all compare threads to finish handling this event */
+    while (event_unhandled_count) {
+        qemu_cond_wait(&event_complete_cond, &event_mtx);
+    }
+
+fail:
+    qemu_mutex_unlock(&event_mtx);
+}
+
+static void colo_flush_packets(void *opaque, void *user_data);
+
+static void colo_compare_handle_event(void *opaque, int event)
+{
+    FilterNotifier *notify = opaque;
+    CompareState *s = notify->opaque;
+
+    switch (event) {
+    case COLO_CHECKPOINT:
+        g_queue_foreach(&s->conn_list, colo_flush_packets, s);
+        break;
+    case COLO_FAILOVER:
+        break;
+    default:
+        break;
+    }
+    qemu_mutex_lock(&event_mtx);
+    event_unhandled_count--;
+    qemu_cond_broadcast(&event_complete_cond);
+    qemu_mutex_unlock(&event_mtx);
+}
+
 static void *colo_compare_thread(void *opaque)
 {
     CompareState *s = opaque;
@@ -516,8 +578,12 @@  static void *colo_compare_thread(void *opaque)
                           (GSourceFunc)check_old_packet_regular, s, NULL);
     g_source_attach(timeout_source, s->worker_context);
 
+    s->notifier = filter_noitifier_new(colo_compare_handle_event, s, NULL);
+    g_source_attach(&s->notifier->source, s->worker_context);
+
     g_main_loop_run(s->compare_loop);
 
+    g_source_unref(&s->notifier->source);
     g_source_unref(timeout_source);
     g_main_loop_unref(s->compare_loop);
     g_main_context_unref(s->worker_context);
@@ -660,6 +726,8 @@  static void colo_compare_complete(UserCreatable *uc, Error **errp)
     net_socket_rs_init(&s->pri_rs, compare_pri_rs_finalize);
     net_socket_rs_init(&s->sec_rs, compare_sec_rs_finalize);
 
+    QTAILQ_INSERT_TAIL(&net_compares, s, next);
+
     g_queue_init(&s->conn_list);
 
     s->connection_track_table = g_hash_table_new_full(connection_key_hash,
@@ -726,6 +794,10 @@  static void colo_compare_finalize(Object *obj)
     g_main_loop_quit(s->compare_loop);
     qemu_thread_join(&s->thread);
 
+    if (!QTAILQ_EMPTY(&net_compares)) {
+        QTAILQ_REMOVE(&net_compares, s, next);
+    }
+
     /* Release all unhandled packets after compare thead exited */
     g_queue_foreach(&s->conn_list, colo_flush_packets, s);
 
diff --git a/net/colo-compare.h b/net/colo-compare.h
new file mode 100644
index 0000000..ed823ed
--- /dev/null
+++ b/net/colo-compare.h
@@ -0,0 +1,20 @@ 
+/*
+ * COarse-grain LOck-stepping Virtual Machines for Non-stop Service (COLO)
+ * (a.k.a. Fault Tolerance or Continuous Replication)
+ *
+ * Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
+ * Copyright (c) 2016 FUJITSU LIMITED
+ * Copyright (c) 2016 Intel Corporation
+ *
+ * Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
+ *
+ * This work is licensed under the terms of the GNU GPL, version 2 or
+ * later.  See the COPYING file in the top-level directory.
+ */
+
+#ifndef QEMU_COLO_COMPARE_H
+#define QEMU_COLO_COMPARE_H
+
+void colo_notify_compares_event(void *opaque, int event, Error **errp);
+
+#endif /* QEMU_COLO_COMPARE_H */