migration: disable auto-converge during bulk block migration

Message ID 1505997152-16980-1-git-send-email-pl@kamp.de (mailing list archive)
State New, archived
Commit Message

Peter Lieven Sept. 21, 2017, 12:32 p.m. UTC
auto-converge and block migration currently do not play well together.
During block migration the auto-converge logic detects that ram
migration makes no progress and thus throttles down the vm until
it nearly stalls completely. Avoid this by disabling the throttling
logic during the bulk phase of the block migration.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
---
 migration/block.c | 5 +++++
 migration/block.h | 7 +++++++
 migration/ram.c   | 3 ++-
 3 files changed, 14 insertions(+), 1 deletion(-)
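
To make the failure mode in the commit message concrete, here is a minimal, self-contained model of the gate this patch adds. It is a sketch under stated assumptions, not QEMU's code: the `MigModel` struct, the `model_*` functions, the "dirtied bytes exceed half the transferred RAM bytes, twice" heuristic, and the 20/10/99 throttle steps are all illustrative stand-ins. During the bulk block phase little RAM is transferred per sync period, so an ungated heuristic keeps firing; with the gate, throttling only starts once the bulk phase has completed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for migration state (not QEMU's types). */
typedef struct {
    bool auto_converge;       /* auto-converge capability enabled */
    bool blk_active;          /* a block migration is in progress */
    bool blk_bulk_completed;  /* bulk phase of the block migration is done */
    int dirty_rate_high_cnt;  /* consecutive "dirtying too fast" periods */
    int throttle_pct;         /* 0 means the guest runs unthrottled */
} MigModel;

/* Mirrors the idea of blk_mig_bulk_active() from the patch. */
bool model_blk_bulk_active(const MigModel *m)
{
    return m->blk_active && !m->blk_bulk_completed;
}

/*
 * Models one bitmap-sync period. Returns true if the throttle was raised.
 * The threshold and step sizes are assumptions made for this sketch.
 */
bool model_sync(MigModel *m, uint64_t dirtied_bytes, uint64_t ram_xfer_bytes)
{
    /* The gate: skip the detection entirely during the bulk block phase. */
    if (!m->auto_converge || model_blk_bulk_active(m)) {
        return false;
    }
    if (dirtied_bytes > ram_xfer_bytes / 2 && ++m->dirty_rate_high_cnt >= 2) {
        m->dirty_rate_high_cnt = 0;
        m->throttle_pct = (m->throttle_pct == 0) ? 20 : m->throttle_pct + 10;
        if (m->throttle_pct > 99) {
            m->throttle_pct = 99;
        }
        assert(m->throttle_pct <= 99);
        return true;
    }
    return false;
}
```

In this model, calling `model_sync()` repeatedly while `blk_bulk_completed` is false leaves `throttle_pct` at 0 regardless of how little RAM was transferred; once the flag flips, two consecutive high-dirty-rate periods raise the throttle.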

Comments

Stefan Hajnoczi Sept. 22, 2017, 10:22 a.m. UTC | #1
On Thu, Sep 21, 2017 at 02:32:32PM +0200, Peter Lieven wrote:
> auto-converge and block migration currently do not play well together.
> During block migration the auto-converge logic detects that ram
> migration makes no progress and thus throttles down the vm until
> it nearly stalls completely. Avoid this by disabling the throttling
> logic during the bulk phase of the block migration.

Please include the rationale in a comment here:

> -        if (migrate_auto_converge()) {
> +        if (migrate_auto_converge() && !blk_mig_bulk_active()) {

That way it's clear why auto-converge isn't enabled when block migration
is active.

Michael Roth Sept. 25, 2017, 8:53 p.m. UTC | #2
Quoting Peter Lieven (2017-09-21 07:32:32)
> auto-converge and block migration currently do not play well together.
> During block migration the auto-converge logic detects that ram
> migration makes no progress and thus throttles down the vm until
> it nearly stalls completely. Avoid this by disabling the throttling
> logic during the bulk phase of the block migration.
> 
> Cc: qemu-stable@nongnu.org
> Signed-off-by: Peter Lieven <pl@kamp.de>

FYI: this patch has been tagged for stable 2.10.1, but is not yet
upstream. Patch freeze for 2.10.1 is September 27th.

Peter Lieven Sept. 26, 2017, 10:19 a.m. UTC | #3
On 25.09.2017 at 22:53, Michael Roth wrote:
> Quoting Peter Lieven (2017-09-21 07:32:32)
>> auto-converge and block migration currently do not play well together.
>> During block migration the auto-converge logic detects that ram
>> migration makes no progress and thus throttles down the vm until
>> it nearly stalls completely. Avoid this by disabling the throttling
>> logic during the bulk phase of the block migration.
>>
>> Cc: qemu-stable@nongnu.org
>> Signed-off-by: Peter Lieven <pl@kamp.de>
> FYI: this patch has been tagged for stable 2.10.1, but is not yet
> upstream. Patch freeze for 2.10.1 is September 27th.

I just sent a v2 of the patch that adds a code comment explaining
why auto-converge is disabled during bulk block migration.
Maybe someone can review it and pick it up.

Peter

Peter Lieven Sept. 28, 2017, 12:04 p.m. UTC | #4
On 25.09.2017 at 22:53, Michael Roth wrote:
> Quoting Peter Lieven (2017-09-21 07:32:32)
>> auto-converge and block migration currently do not play well together.
>> During block migration the auto-converge logic detects that ram
>> migration makes no progress and thus throttles down the vm until
>> it nearly stalls completely. Avoid this by disabling the throttling
>> logic during the bulk phase of the block migration.
>>
>> Cc: qemu-stable@nongnu.org
>> Signed-off-by: Peter Lieven <pl@kamp.de>
> FYI: this patch has been tagged for stable 2.10.1, but is not yet
> upstream. Patch freeze for 2.10.1 is September 27th.

Hi Michael,

the patch went upstream yesterday.

Peter
Patch

diff --git a/migration/block.c b/migration/block.c
index 9171f60..606ad4d 100644
--- a/migration/block.c
+++ b/migration/block.c
@@ -161,6 +161,11 @@  int blk_mig_active(void)
     return !QSIMPLEQ_EMPTY(&block_mig_state.bmds_list);
 }
 
+int blk_mig_bulk_active(void)
+{
+    return blk_mig_active() && !block_mig_state.bulk_completed;
+}
+
 uint64_t blk_mig_bytes_transferred(void)
 {
     BlkMigDevState *bmds;
diff --git a/migration/block.h b/migration/block.h
index 22ebe94..3178609 100644
--- a/migration/block.h
+++ b/migration/block.h
@@ -16,6 +16,7 @@ 
 
 #ifdef CONFIG_LIVE_BLOCK_MIGRATION
 int blk_mig_active(void);
+int blk_mig_bulk_active(void);
 uint64_t blk_mig_bytes_transferred(void);
 uint64_t blk_mig_bytes_remaining(void);
 uint64_t blk_mig_bytes_total(void);
@@ -25,6 +26,12 @@  static inline int blk_mig_active(void)
 {
     return false;
 }
+
+static inline int blk_mig_bulk_active(void)
+{
+    return false;
+}
+
 static inline uint64_t blk_mig_bytes_transferred(void)
 {
     return 0;
diff --git a/migration/ram.c b/migration/ram.c
index e18b3e2..720470e 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -46,6 +46,7 @@ 
 #include "exec/ram_addr.h"
 #include "qemu/rcu_queue.h"
 #include "migration/colo.h"
+#include "migration/block.h"
 
 /***********************************************************/
 /* ram save/restore */
@@ -623,7 +624,7 @@  static void migration_bitmap_sync(RAMState *rs)
             / (end_time - rs->time_last_bitmap_sync);
         bytes_xfer_now = ram_counters.transferred;
 
-        if (migrate_auto_converge()) {
+        if (migrate_auto_converge() && !blk_mig_bulk_active()) {
             /* The following detection logic can be refined later. For now:
                Check to see if the dirtied bytes is 50% more than the approx.
                amount of bytes that just got transferred since the last time we
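
The block.h hunk above also adds an inline stub for builds without CONFIG_LIVE_BLOCK_MIGRATION, which is what lets ram.c call blk_mig_bulk_active() without an #ifdef of its own. A minimal sketch of that header pattern follows; `DEMO_FEATURE`, `demo_bulk_active()` and `demo_should_throttle()` are hypothetical names chosen for illustration and do not exist in QEMU.

```c
#include <stdbool.h>

/*
 * Header-style section: when the feature is compiled in, only declare the
 * function; otherwise provide an inline stub that reports "not active".
 * DEMO_FEATURE is intentionally left undefined in this sketch.
 */
#ifdef DEMO_FEATURE
int demo_bulk_active(void);
#else
static inline int demo_bulk_active(void)
{
    return false;
}
#endif

/* Caller side: no #ifdef needed, the stub folds the check away. */
int demo_should_throttle(bool auto_converge)
{
    return auto_converge && !demo_bulk_active();
}
```

With the feature compiled out, the stub always returns false, so the caller's condition reduces to the original auto-converge check and behaviour is unchanged for such builds.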