[V12,00/19] COLO: integrate colo frame with block replication and COLO proxy

Message ID 20180903043900.28592-1-zhangckid@gmail.com (mailing list archive)

Message

Zhang Chen Sept. 3, 2018, 4:38 a.m. UTC
Hi~ All~

The COLO frame, block replication and the COLO proxy (colo-compare,
filter-mirror, filter-redirector, filter-rewriter) have existed in QEMU
for a long time. It is time to integrate these three parts so that COLO
actually works.

This series contains some optimizations for the COLO frame, including
separating the process of saving RAM from that of saving device state, and
using a COLO_EXIT event to notify users that the VM has exited COLO. Most of
these parts were reviewed long ago in older versions, but this series has just
been rebased on upstream, which has since merged a new migration series, so
some of the patches in this series deserve another review.

We use a notifier/callback mechanism so that COLO compare can notify the COLO
frame when network packets are inconsistent, and we add a handle_event method
to NetFilterClass so that the COLO frame can notify the filters and
colo-compare about checkpoint/failover events. This keeps the design flexible.
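
As a rough illustration of the two notification directions described above
(this is not the code from the series), a minimal standalone C sketch might
look like the following. Only handle_event and default_handle_event are names
taken from this cover letter; every other type and function name is
hypothetical, and all signatures are assumptions rather than QEMU's actual API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event types; the names are illustrative only. */
typedef enum {
    EVENT_CHECKPOINT,
    EVENT_FAILOVER,
} FrameEvent;

/* Direction 1: compare -> frame, through a list of notifier callbacks. */
typedef struct Notifier Notifier;
struct Notifier {
    void (*notify)(Notifier *n, void *data);
    Notifier *next;
};

typedef struct {
    Notifier *head;
} NotifierList;

static void notifier_list_add(NotifierList *list, Notifier *n)
{
    assert(list != NULL && n != NULL && n->notify != NULL);
    n->next = list->head;
    list->head = n;
}

/* Called by the (hypothetical) comparer when packets are inconsistent. */
static void notifier_list_notify(NotifierList *list, void *data)
{
    for (Notifier *n = list->head; n != NULL; n = n->next) {
        n->notify(n, data);
    }
}

/* The frame embeds its notifier as the first member, so a pointer to the
 * notifier can be converted back to a pointer to the frame. */
typedef struct {
    Notifier notifier;
    int checkpoint_requested;
} Frame;

static void frame_on_inconsistency(Notifier *n, void *data)
{
    Frame *frame = (Frame *)n;
    (void)data;
    frame->checkpoint_requested = 1;
}

/* Direction 2: frame -> filters, through a per-class handle_event hook. */
typedef struct Filter Filter;

typedef struct {
    void (*handle_event)(Filter *f, FrameEvent ev);
} FilterClass;

struct Filter {
    const FilterClass *klass;
    int events_seen;
};

/* Default hook for filters that do not care about events: does nothing. */
static void default_handle_event(Filter *f, FrameEvent ev)
{
    (void)f;
    (void)ev;
}

/* Example hook for a filter that reacts to every event by counting it. */
static void counting_handle_event(Filter *f, FrameEvent ev)
{
    (void)ev;
    f->events_seen++;
}

/* Called by the (hypothetical) frame on checkpoint or failover. */
static void notify_filters(Filter **filters, size_t count, FrameEvent ev)
{
    for (size_t i = 0; i < count; i++) {
        filters[i]->klass->handle_event(filters[i], ev);
    }
}
```

In this sketch the frame registers frame_on_inconsistency once with
notifier_list_add, and the comparer only ever calls notifier_list_notify, so
neither side needs to know the other's internals. Likewise, a filter class
that has nothing to do on checkpoint or failover simply keeps
default_handle_event.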

For the newest version, please refer to:
https://github.com/zhangckid/qemu/tree/qemu-colo-18sep1

Please review, thanks.

V12:
 - Rebased on upstream.
 - Removed patch 15/20, leaving it for future work per Jason's comments.
 - Fixed failover bugs in patch 16/19.
 - Renamed dummy_handle_event to default_handle_event in patch 15/19.
 - Cleaned up needless checks.

V11:
 - Rebased on upstream.
 - Used "RAMBLOCK_FOREACH_MIGRATABLE()" to replace "QLIST_FOREACH_RCU()" in patch 08/20.
 - Fixed the "since" version of the COLO-related QAPI commands in patch 10/20.

V10:
 - Rebased on upstream.
 - Removed the "active" in COLOState.
 - Fixed some comments.

V9:
 - Rebased on upstream code.
 - Addressed Jason's comments: added TCP state machine tracking in
   filter-rewriter.
 - Fixed some bugs in colo-compare.
 - Fixed typos.
 - Added filter-rewriter failover handling.
 - Added a net client type check in colo-compare.
 - Added the COLO state diagram.
 - Addressed Markus's and Dave's comments.

V8:
 - Rebased on upstream code.
 - Addressed Markus's comments in patch 10/17.
 - Addressed Markus's comments in patch 11/17.
 - Removed some comments in patch 4/17.
 - Moved "migration_bitmap_clear_dirty()" to a suitable position in
   patch 9/17.
 - Rewrote patch 07/17 to address Dave's comments.
 - Moved "qemu_savevm_live_state" out of the region protected by
   qemu_mutex_lock_iothread.
 - Fixed a bug where the COLO VM crashed with a segmentation fault in
   some states.

V7:
 - Addressed Markus's comments in 11/17.
 - Rebased on upstream.

V6:
 - Addressed Eric Blake's comments: used an enum for the feedback in patch 11/17.
 - Fixed QAPI command separator problem in patch 11/17.


Zhang Chen (15):
  filter-rewriter: Add TCP state machine and fix memory leak in
    connection_track_table
  colo-compare: implement the process of checkpoint
  colo-compare: use notifier to notify packets comparing result
  COLO: integrate colo compare with colo frame
  COLO: Add block replication into colo process
  COLO: Remove colo_state migration struct
  COLO: Load dirty pages into SVM's RAM cache firstly
  ram/COLO: Record the dirty pages that SVM received
  COLO: Flush memory data from ram cache
  qapi/migration.json: Rename COLO unknown mode to none mode.
  qapi: Add new command to query colo status
  savevm: split the process of different stages for loadvm/savevm
  filter: Add handle_event method for NetFilterClass
  filter-rewriter: handle checkpoint and failover event
  docs: Add COLO status diagram to COLO-FT.txt

zhanghailiang (4):
  qmp event: Add COLO_EXIT event to notify users while exited COLO
  COLO: flush host dirty ram from cache
  COLO: notify net filters about checkpoint/failover event
  COLO: quick failover process by kick COLO thread

 docs/COLO-FT.txt          |  34 ++++++
 include/exec/ram_addr.h   |   1 +
 include/migration/colo.h  |  11 +-
 include/net/filter.h      |   5 +
 migration/Makefile.objs   |   2 +-
 migration/colo-comm.c     |  76 --------------
 migration/colo-failover.c |   2 +-
 migration/colo.c          | 212 ++++++++++++++++++++++++++++++++++++--
 migration/migration.c     |  46 ++++++++-
 migration/ram.c           | 166 ++++++++++++++++++++++++++++-
 migration/ram.h           |   4 +
 migration/savevm.c        |  53 ++++++++--
 migration/savevm.h        |   5 +
 migration/trace-events    |   3 +
 net/colo-compare.c        | 115 +++++++++++++++++++--
 net/colo-compare.h        |  24 +++++
 net/colo.c                |  10 +-
 net/colo.h                |  11 +-
 net/filter-rewriter.c     | 162 +++++++++++++++++++++++++++--
 net/filter.c              |  17 +++
 net/net.c                 |  19 ++++
 qapi/migration.json       |  80 +++++++++++++-
 vl.c                      |   2 -
 23 files changed, 921 insertions(+), 139 deletions(-)
 delete mode 100644 migration/colo-comm.c
 create mode 100644 net/colo-compare.h

Comments

Zhang Chen Sept. 10, 2018, 8:16 a.m. UTC | #1
Hi All.
Have any comments?
Ping...

Thanks
Zhang Chen


Jason Wang Sept. 12, 2018, 7:50 a.m. UTC | #2
On 2018-09-10 16:16, Zhang Chen wrote:
> Hi All.
> Have any comments?
> Ping...
>
> Thanks
> Zhang Chen

I've queued them with some tweaks to the commit logs.

Please refer to my comment on patch 1 and send me a patch on top with a
better comment.

Thanks
Zhang Chen Sept. 13, 2018, 3:10 a.m. UTC | #3
On Wed, Sep 12, 2018 at 3:50 PM Jason Wang <jasowang@redhat.com> wrote:

>
>
> On 2018-09-10 16:16, Zhang Chen wrote:
> > Hi All.
> > Have any comments?
> > Ping...
> >
> > Thanks
> > Zhang Chen
>
> I've queued them with some tweaks to the commit logs.
>
> Please refer to my comment on patch 1 and send me a patch on top with a
> better comment.
>
>
OK. I got it.

Thanks
Zhang Chen


Zhang Chen Sept. 13, 2018, 6:40 a.m. UTC | #4

Hi Jason,

I have sent a new version of patch 1.
What do you think of the new comments?

Thanks
Zhang Chen


