From patchwork Tue Dec 11 08:24:47 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Wei W" X-Patchwork-Id: 10723231 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0886B13BF for ; Tue, 11 Dec 2018 09:02:51 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id ED1082A60C for ; Tue, 11 Dec 2018 09:02:50 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E19F42A627; Tue, 11 Dec 2018 09:02:50 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id D2C252A60C for ; Tue, 11 Dec 2018 09:02:49 +0000 (UTC) Received: from localhost ([::1]:36641 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gWdwG-0005xK-KC for patchwork-qemu-devel@patchwork.kernel.org; Tue, 11 Dec 2018 04:02:48 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:51577) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gWdrb-0002SR-Rg for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:00 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gWdrY-0006EU-HV for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:57:59 -0500 Received: from mga07.intel.com ([134.134.136.100]:27014) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gWdrY-0006Dg-6H for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:57:56 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 11 Dec 2018 00:57:53 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,341,1539673200"; d="scan'208";a="117803092" Received: from devel-ww.sh.intel.com ([10.239.48.119]) by orsmga001.jf.intel.com with ESMTP; 11 Dec 2018 00:57:51 -0800 From: Wei Wang To: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org, mst@redhat.com, quintela@redhat.com, dgilbert@redhat.com, peterx@redhat.com Date: Tue, 11 Dec 2018 16:24:47 +0800 Message-Id: <1544516693-5395-2-git-send-email-wei.w.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1544516693-5395-1-git-send-email-wei.w.wang@intel.com> References: <1544516693-5395-1-git-send-email-wei.w.wang@intel.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 134.134.136.100 Subject: [Qemu-devel] [PATCH v11 1/7] bitmap: fix bitmap_count_one X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: pbonzini@redhat.com, liliang.opensource@gmail.com, nilal@redhat.com, wei.w.wang@intel.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP BITMAP_LAST_WORD_MASK(nbits) returns 0xffffffff when "nbits=0", which makes bitmap_count_one fail to handle the "nbits=0" case. It appears to be preferred to remain BITMAP_LAST_WORD_MASK identical to the kernel implementation that it is ported from. So this patch fixes bitmap_count_one to handle the nbits=0 case. Inital Discussion Link: https://www.mail-archive.com/qemu-devel@nongnu.org/msg554316.html Signed-off-by: Wei Wang CC: Juan Quintela CC: Dr. David Alan Gilbert CC: Peter Xu Reviewed-by: Dr. David Alan Gilbert --- include/qemu/bitmap.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h index 509eedd..679f1bd 100644 --- a/include/qemu/bitmap.h +++ b/include/qemu/bitmap.h @@ -221,6 +221,10 @@ static inline int bitmap_intersects(const unsigned long *src1, static inline long bitmap_count_one(const unsigned long *bitmap, long nbits) { + if (unlikely(!nbits)) { + return 0; + } + if (small_nbits(nbits)) { return ctpopl(*bitmap & BITMAP_LAST_WORD_MASK(nbits)); } else { From patchwork Tue Dec 11 08:24:48 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Wei W" X-Patchwork-Id: 10723215 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 1055C91E for ; Tue, 11 Dec 2018 08:59:35 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id F367029B3B for ; Tue, 11 Dec 2018 08:59:34 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E7C3D29DA3; Tue, 11 Dec 2018 08:59:34 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 94F2129B3B for ; Tue, 11 Dec 2018 08:59:34 +0000 (UTC) Received: from localhost ([::1]:36620 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gWdt2-0003Zk-HY for patchwork-qemu-devel@patchwork.kernel.org; Tue, 11 Dec 2018 03:59:28 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:51575) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gWdrb-0002SQ-R4 for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:00 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gWdrZ-0006Eg-0W for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:57:59 -0500 Received: from mga07.intel.com ([134.134.136.100]:27013) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gWdrY-0006Ck-OA for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:57:56 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 11 Dec 2018 00:57:56 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,341,1539673200"; d="scan'208";a="117803106" Received: from devel-ww.sh.intel.com ([10.239.48.119]) by orsmga001.jf.intel.com with ESMTP; 11 Dec 2018 00:57:54 -0800 From: Wei Wang To: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org, mst@redhat.com, quintela@redhat.com, dgilbert@redhat.com, peterx@redhat.com Date: Tue, 11 Dec 2018 16:24:48 +0800 Message-Id: <1544516693-5395-3-git-send-email-wei.w.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1544516693-5395-1-git-send-email-wei.w.wang@intel.com> References: <1544516693-5395-1-git-send-email-wei.w.wang@intel.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 134.134.136.100 Subject: [Qemu-devel] [PATCH v11 2/7] bitmap: bitmap_count_one_with_offset X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: pbonzini@redhat.com, liliang.opensource@gmail.com, nilal@redhat.com, wei.w.wang@intel.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP Count the number of 1s in a bitmap starting from an offset. Signed-off-by: Wei Wang CC: Dr. David Alan Gilbert CC: Juan Quintela CC: Michael S. Tsirkin Reviewed-by: Dr. David Alan Gilbert --- include/qemu/bitmap.h | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/include/qemu/bitmap.h b/include/qemu/bitmap.h index 679f1bd..5c31334 100644 --- a/include/qemu/bitmap.h +++ b/include/qemu/bitmap.h @@ -232,6 +232,19 @@ static inline long bitmap_count_one(const unsigned long *bitmap, long nbits) } } +static inline long bitmap_count_one_with_offset(const unsigned long *bitmap, + long offset, long nbits) +{ + long aligned_offset = QEMU_ALIGN_DOWN(offset, BITS_PER_LONG); + long redundant_bits = offset - aligned_offset; + long bits_to_count = nbits + redundant_bits; + const unsigned long *bitmap_start = bitmap + + aligned_offset / BITS_PER_LONG; + + return bitmap_count_one(bitmap_start, bits_to_count) - + bitmap_count_one(bitmap_start, redundant_bits); +} + void bitmap_set(unsigned long *map, long i, long len); void bitmap_set_atomic(unsigned long *map, long i, long len); void bitmap_clear(unsigned long *map, long start, long nr); From patchwork Tue Dec 11 08:24:49 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Wei W" X-Patchwork-Id: 10723213 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 9D7D391E for ; Tue, 11 Dec 2018 08:59:29 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8E50A29B3B for ; Tue, 11 Dec 2018 08:59:29 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 82C6529DA3; Tue, 11 Dec 2018 08:59:29 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id D17AB29BF1 for ; Tue, 11 Dec 2018 08:59:27 +0000 (UTC) Received: from localhost ([::1]:36615 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gWdt0-0002TN-Pz for patchwork-qemu-devel@patchwork.kernel.org; Tue, 11 Dec 2018 03:59:26 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:51597) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gWdrc-0002ST-Ep for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:01 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gWdrb-0006Fh-EM for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:00 -0500 Received: from mga07.intel.com ([134.134.136.100]:27021) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gWdrb-0006FH-4o for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:57:59 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 11 Dec 2018 00:57:58 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,341,1539673200"; d="scan'208";a="117803119" Received: from devel-ww.sh.intel.com ([10.239.48.119]) by orsmga001.jf.intel.com with ESMTP; 11 Dec 2018 00:57:56 -0800 From: Wei Wang To: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org, mst@redhat.com, quintela@redhat.com, dgilbert@redhat.com, peterx@redhat.com Date: Tue, 11 Dec 2018 16:24:49 +0800 Message-Id: <1544516693-5395-4-git-send-email-wei.w.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1544516693-5395-1-git-send-email-wei.w.wang@intel.com> References: <1544516693-5395-1-git-send-email-wei.w.wang@intel.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 134.134.136.100 Subject: [Qemu-devel] [PATCH v11 3/7] migration: use bitmap_mutex in migration_bitmap_clear_dirty X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: pbonzini@redhat.com, liliang.opensource@gmail.com, nilal@redhat.com, wei.w.wang@intel.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP The bitmap mutex is used to synchronize threads to update the dirty bitmap and the migration_dirty_pages counter. For example, the free page optimization clears bits of free pages from the bitmap in an iothread context. This patch makes migration_bitmap_clear_dirty update the bitmap and counter under the mutex. Signed-off-by: Wei Wang CC: Dr. David Alan Gilbert CC: Juan Quintela CC: Michael S. Tsirkin CC: Peter Xu Reviewed-by: Peter Xu --- migration/ram.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/migration/ram.c b/migration/ram.c index 7e7deec..01d267f 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -316,7 +316,7 @@ struct RAMState { uint64_t target_page_count; /* number of dirty bits in the bitmap */ uint64_t migration_dirty_pages; - /* protects modification of the bitmap */ + /* Protects modification of the bitmap and migration dirty pages */ QemuMutex bitmap_mutex; /* The RAMBlock used in the last src_page_requests */ RAMBlock *last_req_rb; @@ -1556,11 +1556,14 @@ static inline bool migration_bitmap_clear_dirty(RAMState *rs, { bool ret; + qemu_mutex_lock(&rs->bitmap_mutex); ret = test_and_clear_bit(page, rb->bmap); if (ret) { rs->migration_dirty_pages--; } + qemu_mutex_unlock(&rs->bitmap_mutex); + return ret; } From patchwork Tue Dec 11 08:24:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Wei W" X-Patchwork-Id: 10723229 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 921541751 for ; Tue, 11 Dec 2018 09:02:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 8271E2A60C for ; Tue, 11 Dec 2018 09:02:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 767472A627; Tue, 11 Dec 2018 09:02:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 086052A60C for ; Tue, 11 Dec 2018 09:02:48 +0000 (UTC) Received: from localhost ([::1]:36632 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gWdwF-0004ph-S0 for patchwork-qemu-devel@patchwork.kernel.org; Tue, 11 Dec 2018 04:02:47 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:51611) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gWdre-0002TP-Hc for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:04 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gWdrd-0006Ir-1Y for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:02 -0500 Received: from mga07.intel.com ([134.134.136.100]:27021) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gWdrc-0006FH-M8 for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:00 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 11 Dec 2018 00:58:00 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,341,1539673200"; d="scan'208";a="117803126" Received: from devel-ww.sh.intel.com ([10.239.48.119]) by orsmga001.jf.intel.com with ESMTP; 11 Dec 2018 00:57:58 -0800 From: Wei Wang To: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org, mst@redhat.com, quintela@redhat.com, dgilbert@redhat.com, peterx@redhat.com Date: Tue, 11 Dec 2018 16:24:50 +0800 Message-Id: <1544516693-5395-5-git-send-email-wei.w.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1544516693-5395-1-git-send-email-wei.w.wang@intel.com> References: <1544516693-5395-1-git-send-email-wei.w.wang@intel.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 134.134.136.100 Subject: [Qemu-devel] [PATCH v11 4/7] migration: API to clear bits of guest free pages from the dirty bitmap X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: pbonzini@redhat.com, liliang.opensource@gmail.com, nilal@redhat.com, wei.w.wang@intel.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP This patch adds an API to clear bits corresponding to guest free pages from the dirty bitmap. Spilt the free page block if it crosses the QEMU RAMBlock boundary. Signed-off-by: Wei Wang CC: Dr. David Alan Gilbert CC: Juan Quintela CC: Michael S. Tsirkin CC: Peter Xu Reviewed-by: Peter Xu --- include/migration/misc.h | 2 ++ migration/ram.c | 47 +++++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 49 insertions(+) diff --git a/include/migration/misc.h b/include/migration/misc.h index 4ebf24c..113320e 100644 --- a/include/migration/misc.h +++ b/include/migration/misc.h @@ -14,11 +14,13 @@ #ifndef MIGRATION_MISC_H #define MIGRATION_MISC_H +#include "exec/cpu-common.h" #include "qemu/notify.h" /* migration/ram.c */ void ram_mig_init(void); +void qemu_guest_free_page_hint(void *addr, size_t len); /* migration/block.c */ diff --git a/migration/ram.c b/migration/ram.c index 01d267f..c13b44f 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3131,6 +3131,53 @@ static void ram_state_resume_prepare(RAMState *rs, QEMUFile *out) } /* + * This function clears bits of the free pages reported by the caller from the + * migration dirty bitmap. @addr is the host address corresponding to the + * start of the continuous guest free pages, and @len is the total bytes of + * those pages. + */ +void qemu_guest_free_page_hint(void *addr, size_t len) +{ + RAMBlock *block; + ram_addr_t offset; + size_t used_len, start, npages; + MigrationState *s = migrate_get_current(); + + /* This function is currently expected to be used during live migration */ + if (!migration_is_setup_or_active(s->state)) { + return; + } + + for (; len > 0; len -= used_len, addr += used_len) { + block = qemu_ram_block_from_host(addr, false, &offset); + if (unlikely(!block || offset >= block->used_length)) { + /* + * The implementation might not support RAMBlock resize during + * live migration, but it could happen in theory with future + * updates. So we add a check here to capture that case. + */ + error_report_once("%s unexpected error", __func__); + return; + } + + if (len <= block->used_length - offset) { + used_len = len; + } else { + used_len = block->used_length - offset; + } + + start = offset >> TARGET_PAGE_BITS; + npages = used_len >> TARGET_PAGE_BITS; + + qemu_mutex_lock(&ram_state->bitmap_mutex); + ram_state->migration_dirty_pages -= + bitmap_count_one_with_offset(block->bmap, start, npages); + bitmap_clear(block->bmap, start, npages); + qemu_mutex_unlock(&ram_state->bitmap_mutex); + } +} + +/* * Each of ram_save_setup, ram_save_iterate and ram_save_complete has * long-running RCU critical section. When rcu-reclaims in the code * start to become numerous it will be necessary to reduce the From patchwork Tue Dec 11 08:24:51 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Wei W" X-Patchwork-Id: 10723219 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AEA2C17FE for ; Tue, 11 Dec 2018 08:59:42 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 9E3A129B3B for ; Tue, 11 Dec 2018 08:59:42 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9264929DA3; Tue, 11 Dec 2018 08:59:42 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 06B2529BF1 for ; Tue, 11 Dec 2018 08:59:42 +0000 (UTC) Received: from localhost ([::1]:36621 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gWdtF-0003h6-9o for patchwork-qemu-devel@patchwork.kernel.org; Tue, 11 Dec 2018 03:59:41 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:51622) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gWdrh-0002W5-6h for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:08 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gWdrf-0006Jq-AZ for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:05 -0500 Received: from mga07.intel.com ([134.134.136.100]:27021) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gWdre-0006FH-UQ for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:03 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 11 Dec 2018 00:58:02 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,341,1539673200"; d="scan'208";a="117803139" Received: from devel-ww.sh.intel.com ([10.239.48.119]) by orsmga001.jf.intel.com with ESMTP; 11 Dec 2018 00:58:00 -0800 From: Wei Wang To: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org, mst@redhat.com, quintela@redhat.com, dgilbert@redhat.com, peterx@redhat.com Date: Tue, 11 Dec 2018 16:24:51 +0800 Message-Id: <1544516693-5395-6-git-send-email-wei.w.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1544516693-5395-1-git-send-email-wei.w.wang@intel.com> References: <1544516693-5395-1-git-send-email-wei.w.wang@intel.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 134.134.136.100 Subject: [Qemu-devel] [PATCH v11 5/7] migration/ram.c: add a notifier chain for precopy X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: pbonzini@redhat.com, liliang.opensource@gmail.com, nilal@redhat.com, wei.w.wang@intel.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP This patch adds a notifier chain for the memory precopy. This enables various precopy optimizations to be invoked at specific places. Signed-off-by: Wei Wang CC: Dr. David Alan Gilbert CC: Juan Quintela CC: Michael S. Tsirkin CC: Peter Xu Reviewed-by: Peter Xu --- include/migration/misc.h | 19 ++++++++++++++++++ migration/ram.c | 51 +++++++++++++++++++++++++++++++++++++++++++++--- migration/savevm.c | 15 ++++++++++++++ vl.c | 1 + 4 files changed, 83 insertions(+), 3 deletions(-) diff --git a/include/migration/misc.h b/include/migration/misc.h index 113320e..15f8d00 100644 --- a/include/migration/misc.h +++ b/include/migration/misc.h @@ -19,6 +19,25 @@ /* migration/ram.c */ +typedef enum PrecopyNotifyReason { + PRECOPY_NOTIFY_SETUP = 0, + PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC = 1, + PRECOPY_NOTIFY_AFTER_BITMAP_SYNC = 2, + PRECOPY_NOTIFY_COMPLETE = 3, + PRECOPY_NOTIFY_CLEANUP = 4, + PRECOPY_NOTIFY_MAX = 5, +} PrecopyNotifyReason; + +typedef struct PrecopyNotifyData { + enum PrecopyNotifyReason reason; + Error **errp; +} PrecopyNotifyData; + +void precopy_infrastructure_init(void); +void precopy_add_notifier(NotifierWithReturn *n); +void precopy_remove_notifier(NotifierWithReturn *n); +int precopy_notify(PrecopyNotifyReason reason, Error **errp); + void ram_mig_init(void); void qemu_guest_free_page_hint(void *addr, size_t len); diff --git a/migration/ram.c b/migration/ram.c index c13b44f..1e00d92 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -328,6 +328,32 @@ typedef struct RAMState RAMState; static RAMState *ram_state; +static NotifierWithReturnList precopy_notifier_list; + +void precopy_infrastructure_init(void) +{ + notifier_with_return_list_init(&precopy_notifier_list); +} + +void precopy_add_notifier(NotifierWithReturn *n) +{ + notifier_with_return_list_add(&precopy_notifier_list, n); +} + +void precopy_remove_notifier(NotifierWithReturn *n) +{ + notifier_with_return_remove(n); +} + +int precopy_notify(PrecopyNotifyReason reason, Error **errp) +{ + PrecopyNotifyData pnd; + pnd.reason = reason; + pnd.errp = errp; + + return notifier_with_return_list_notify(&precopy_notifier_list, &pnd); +} + uint64_t ram_bytes_remaining(void) { return ram_state ? (ram_state->migration_dirty_pages * TARGET_PAGE_SIZE) : @@ -1701,6 +1727,25 @@ static void migration_bitmap_sync(RAMState *rs) } } +static void migration_bitmap_sync_precopy(RAMState *rs) +{ + Error *local_err = NULL; + + /* + * The current notifier usage is just an optimization to migration, so we + * don't stop the normal migration process in the error case. + */ + if (precopy_notify(PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC, &local_err)) { + error_report_err(local_err); + } + + migration_bitmap_sync(rs); + + if (precopy_notify(PRECOPY_NOTIFY_AFTER_BITMAP_SYNC, &local_err)) { + error_report_err(local_err); + } +} + /** * save_zero_page_to_file: send the zero page to the file * @@ -3072,7 +3117,7 @@ static void ram_init_bitmaps(RAMState *rs) ram_list_init_bitmaps(); memory_global_dirty_log_start(); - migration_bitmap_sync(rs); + migration_bitmap_sync_precopy(rs); rcu_read_unlock(); qemu_mutex_unlock_ramlist(); @@ -3348,7 +3393,7 @@ static int ram_save_complete(QEMUFile *f, void *opaque) rcu_read_lock(); if (!migration_in_postcopy()) { - migration_bitmap_sync(rs); + migration_bitmap_sync_precopy(rs); } ram_control_before_iterate(f, RAM_CONTROL_FINISH); @@ -3397,7 +3442,7 @@ static void ram_save_pending(QEMUFile *f, void *opaque, uint64_t max_size, remaining_size < max_size) { qemu_mutex_lock_iothread(); rcu_read_lock(); - migration_bitmap_sync(rs); + migration_bitmap_sync_precopy(rs); rcu_read_unlock(); qemu_mutex_unlock_iothread(); remaining_size = rs->migration_dirty_pages * TARGET_PAGE_SIZE; diff --git a/migration/savevm.c b/migration/savevm.c index 9e45fb4..ec74f44 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1018,6 +1018,7 @@ void qemu_savevm_state_header(QEMUFile *f) void qemu_savevm_state_setup(QEMUFile *f) { SaveStateEntry *se; + Error *local_err = NULL; int ret; trace_savevm_state_setup(); @@ -1039,6 +1040,10 @@ void qemu_savevm_state_setup(QEMUFile *f) break; } } + + if (precopy_notify(PRECOPY_NOTIFY_SETUP, &local_err)) { + error_report_err(local_err); + } } int qemu_savevm_state_resume_prepare(MigrationState *s) @@ -1181,6 +1186,11 @@ int qemu_savevm_state_complete_precopy(QEMUFile *f, bool iterable_only, SaveStateEntry *se; int ret; bool in_postcopy = migration_in_postcopy(); + Error *local_err = NULL; + + if (precopy_notify(PRECOPY_NOTIFY_COMPLETE, &local_err)) { + error_report_err(local_err); + } trace_savevm_state_complete_precopy(); @@ -1313,6 +1323,11 @@ void qemu_savevm_state_pending(QEMUFile *f, uint64_t threshold_size, void qemu_savevm_state_cleanup(void) { SaveStateEntry *se; + Error *local_err = NULL; + + if (precopy_notify(PRECOPY_NOTIFY_CLEANUP, &local_err)) { + error_report_err(local_err); + } trace_savevm_state_cleanup(); QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { diff --git a/vl.c b/vl.c index a5ae5f2..1621e1c 100644 --- a/vl.c +++ b/vl.c @@ -3062,6 +3062,7 @@ int main(int argc, char **argv, char **envp) module_call_init(MODULE_INIT_OPTS); runstate_init(); + precopy_infrastructure_init(); postcopy_infrastructure_init(); monitor_init_globals(); From patchwork Tue Dec 11 08:24:52 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Wei W" X-Patchwork-Id: 10723233 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7725413BF for ; Tue, 11 Dec 2018 09:02:57 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 670932A61C for ; Tue, 11 Dec 2018 09:02:57 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 591C32A627; Tue, 11 Dec 2018 09:02:57 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id E93E12A60C for ; Tue, 11 Dec 2018 09:02:56 +0000 (UTC) Received: from localhost ([::1]:36638 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gWdwO-0004zB-68 for patchwork-qemu-devel@patchwork.kernel.org; Tue, 11 Dec 2018 04:02:56 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:51638) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gWdrk-0002Yk-Fd for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:09 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gWdrh-0006LJ-I1 for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:08 -0500 Received: from mga07.intel.com ([134.134.136.100]:27021) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gWdrh-0006FH-87 for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:05 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 11 Dec 2018 00:58:04 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,341,1539673200"; d="scan'208";a="117803156" Received: from devel-ww.sh.intel.com ([10.239.48.119]) by orsmga001.jf.intel.com with ESMTP; 11 Dec 2018 00:58:02 -0800 From: Wei Wang To: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org, mst@redhat.com, quintela@redhat.com, dgilbert@redhat.com, peterx@redhat.com Date: Tue, 11 Dec 2018 16:24:52 +0800 Message-Id: <1544516693-5395-7-git-send-email-wei.w.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1544516693-5395-1-git-send-email-wei.w.wang@intel.com> References: <1544516693-5395-1-git-send-email-wei.w.wang@intel.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 134.134.136.100 Subject: [Qemu-devel] [PATCH v11 6/7] migration/ram.c: add the free page optimization enable flag X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: pbonzini@redhat.com, liliang.opensource@gmail.com, nilal@redhat.com, wei.w.wang@intel.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP This patch adds the free page optimization enable flag, and a function to set this flag. When the free page optimization is enabled, not all the pages are needed to be sent in the bulk stage. Why using a new flag, instead of directly disabling ram_bulk_stage when the optimization is running? Thanks for Peter Xu's reminder that disabling ram_bulk_stage will affect the use of compression. Please see save_page_use_compression. When xbzrle and compression are used, if free page optimizaion causes the ram_bulk_stage to be disabled, save_page_use_compression will return false, which disables the use of compression. That is, if free page optimization avoids the sending of half of the guest pages, the other half of pages loses the benefits of compression in the meantime. Using a new flag to let migration_bitmap_find_dirty skip the free pages in the bulk stage will avoid the above issue. Signed-off-by: Wei Wang CC: Dr. David Alan Gilbert CC: Juan Quintela CC: Michael S. Tsirkin CC: Peter Xu --- include/migration/misc.h | 1 + migration/ram.c | 18 +++++++++++++++++- 2 files changed, 18 insertions(+), 1 deletion(-) diff --git a/include/migration/misc.h b/include/migration/misc.h index 15f8d00..1227279 100644 --- a/include/migration/misc.h +++ b/include/migration/misc.h @@ -37,6 +37,7 @@ void precopy_infrastructure_init(void); void precopy_add_notifier(NotifierWithReturn *n); void precopy_remove_notifier(NotifierWithReturn *n); int precopy_notify(PrecopyNotifyReason reason, Error **errp); +void precopy_enable_free_page_optimization(void); void ram_mig_init(void); void qemu_guest_free_page_hint(void *addr, size_t len); diff --git a/migration/ram.c b/migration/ram.c index 1e00d92..188bbad 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -290,6 +290,8 @@ struct RAMState { uint32_t last_version; /* We are in the first round */ bool ram_bulk_stage; + /* The free page optimization is enabled */ + bool fpo_enabled; /* How many times we have dirty too many pages */ int dirty_rate_high_cnt; /* these variables are used for bitmap sync */ @@ -354,6 +356,15 @@ int precopy_notify(PrecopyNotifyReason reason, Error **errp) return notifier_with_return_list_notify(&precopy_notifier_list, &pnd); } +void precopy_enable_free_page_optimization(void) +{ + if (!ram_state) { + return; + } + + ram_state->fpo_enabled = true; +} + uint64_t ram_bytes_remaining(void) { return ram_state ? (ram_state->migration_dirty_pages * TARGET_PAGE_SIZE) : @@ -1567,7 +1578,11 @@ unsigned long migration_bitmap_find_dirty(RAMState *rs, RAMBlock *rb, return size; } - if (rs->ram_bulk_stage && start > 0) { + /* + * When the free page optimization is enabled, we need to check the bitmap + * to send the non-free pages rather than all the pages in the bulk stage. + */ + if (!rs->fpo_enabled && rs->ram_bulk_stage && start > 0) { next = start + 1; } else { next = find_next_bit(bitmap, size, start); @@ -2600,6 +2615,7 @@ static void ram_state_reset(RAMState *rs) rs->last_page = 0; rs->last_version = ram_list.version; rs->ram_bulk_stage = true; + rs->fpo_enabled = false; } #define MAX_WAIT 50 /* ms, half buffered_file limit */ From patchwork Tue Dec 11 08:24:53 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Wang, Wei W" X-Patchwork-Id: 10723235 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 00EE11751 for ; Tue, 11 Dec 2018 09:04:11 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id DF20C298DD for ; Tue, 11 Dec 2018 09:04:10 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D2B262A404; Tue, 11 Dec 2018 09:04:10 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from lists.gnu.org (lists.gnu.org [208.118.235.17]) (using TLSv1 with cipher AES256-SHA (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id C584F298DD for ; Tue, 11 Dec 2018 09:04:09 +0000 (UTC) Received: from localhost ([::1]:36640 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gWdxY-0005iy-VN for patchwork-qemu-devel@patchwork.kernel.org; Tue, 11 Dec 2018 04:04:09 -0500 Received: from eggs.gnu.org ([2001:4830:134:3::10]:51676) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1gWdrq-0002bx-1N for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:17 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1gWdrk-0006MK-6W for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:13 -0500 Received: from mga07.intel.com ([134.134.136.100]:27021) by eggs.gnu.org with esmtps (TLS1.0:DHE_RSA_AES_256_CBC_SHA1:32) (Exim 4.71) (envelope-from ) id 1gWdrj-0006FH-N1 for qemu-devel@nongnu.org; Tue, 11 Dec 2018 03:58:08 -0500 X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by orsmga105.jf.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 11 Dec 2018 00:58:07 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.56,341,1539673200"; d="scan'208";a="117803173" Received: from devel-ww.sh.intel.com ([10.239.48.119]) by orsmga001.jf.intel.com with ESMTP; 11 Dec 2018 00:58:05 -0800 From: Wei Wang To: qemu-devel@nongnu.org, virtio-dev@lists.oasis-open.org, mst@redhat.com, quintela@redhat.com, dgilbert@redhat.com, peterx@redhat.com Date: Tue, 11 Dec 2018 16:24:53 +0800 Message-Id: <1544516693-5395-8-git-send-email-wei.w.wang@intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1544516693-5395-1-git-send-email-wei.w.wang@intel.com> References: <1544516693-5395-1-git-send-email-wei.w.wang@intel.com> X-detected-operating-system: by eggs.gnu.org: Genre and OS details not recognized. X-Received-From: 134.134.136.100 Subject: [Qemu-devel] [PATCH v11 7/7] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: pbonzini@redhat.com, liliang.opensource@gmail.com, nilal@redhat.com, wei.w.wang@intel.com Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" X-Virus-Scanned: ClamAV using ClamSMTP The new feature enables the virtio-balloon device to receive hints of guest free pages from the free page vq. A notifier is registered to the migration precopy notifier chain. The notifier calls free_page_start after the migration thread syncs the dirty bitmap, so that the free page optimization starts to clear bits of free pages from the bitmap. It calls the free_page_stop before the migration thread syncs the bitmap, which is the end of the current round of ram save. The free_page_stop is also called to stop the optimization in the case when there is an error occurred in the process of ram saving. Note: balloon will report pages which were free at the time of this call. As the reporting happens asynchronously, dirty bit logging must be enabled before this free_page_start call is made. Guest reporting must be disabled before the migration dirty bitmap is synchronized. Signed-off-by: Wei Wang CC: Michael S. Tsirkin CC: Dr. David Alan Gilbert CC: Juan Quintela CC: Peter Xu --- hw/virtio/virtio-balloon.c | 263 ++++++++++++++++++++++++ include/hw/virtio/virtio-balloon.h | 28 ++- include/standard-headers/linux/virtio_balloon.h | 5 + 3 files changed, 295 insertions(+), 1 deletion(-) diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c index 1728e4f..543bbd4 100644 --- a/hw/virtio/virtio-balloon.c +++ b/hw/virtio/virtio-balloon.c @@ -27,6 +27,7 @@ #include "qapi/visitor.h" #include "trace.h" #include "qemu/error-report.h" +#include "migration/misc.h" #include "hw/virtio/virtio-bus.h" #include "hw/virtio/virtio-access.h" @@ -308,6 +309,184 @@ out: } } +static void virtio_balloon_handle_free_page_vq(VirtIODevice *vdev, + VirtQueue *vq) +{ + VirtIOBalloon *s = VIRTIO_BALLOON(vdev); + qemu_bh_schedule(s->free_page_bh); +} + +static bool get_free_page_hints(VirtIOBalloon *dev) +{ + VirtQueueElement *elem; + VirtIODevice *vdev = VIRTIO_DEVICE(dev); + VirtQueue *vq = dev->free_page_vq; + + while (dev->block_iothread) { + qemu_cond_wait(&dev->free_page_cond, &dev->free_page_lock); + } + + elem = virtqueue_pop(vq, sizeof(VirtQueueElement)); + if (!elem) { + return false; + } + + if (elem->out_num) { + uint32_t id; + size_t size = iov_to_buf(elem->out_sg, elem->out_num, 0, + &id, sizeof(id)); + virtqueue_push(vq, elem, size); + g_free(elem); + + virtio_tswap32s(vdev, &id); + if (unlikely(size != sizeof(id))) { + virtio_error(vdev, "received an incorrect cmd id"); + return false; + } + if (id == dev->free_page_report_cmd_id) { + dev->free_page_report_status = FREE_PAGE_REPORT_S_START; + } else { + /* + * Stop the optimization only when it has started. This + * avoids a stale stop sign for the previous command. + */ + if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) { + dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP; + } + } + } + + if (elem->in_num) { + if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) { + qemu_guest_free_page_hint(elem->in_sg[0].iov_base, + elem->in_sg[0].iov_len); + } + virtqueue_push(vq, elem, 1); + g_free(elem); + } + + return true; +} + +static void virtio_ballloon_get_free_page_hints(void *opaque) +{ + VirtIOBalloon *dev = opaque; + VirtIODevice *vdev = VIRTIO_DEVICE(dev); + VirtQueue *vq = dev->free_page_vq; + bool continue_to_get_hints; + + do { + qemu_mutex_lock(&dev->free_page_lock); + virtio_queue_set_notification(vq, 0); + continue_to_get_hints = get_free_page_hints(dev); + qemu_mutex_unlock(&dev->free_page_lock); + virtio_notify(vdev, vq); + /* + * Start to poll the vq once the reporting started. Otherwise, continue + * only when there are entries on the vq, which need to be given back. + */ + } while (continue_to_get_hints || + dev->free_page_report_status == FREE_PAGE_REPORT_S_START); + virtio_queue_set_notification(vq, 1); +} + +static bool virtio_balloon_free_page_support(void *opaque) +{ + VirtIOBalloon *s = opaque; + VirtIODevice *vdev = VIRTIO_DEVICE(s); + + return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_FREE_PAGE_HINT); +} + +static void virtio_balloon_free_page_start(VirtIOBalloon *s) +{ + VirtIODevice *vdev = VIRTIO_DEVICE(s); + + /* For the stop and copy phase, we don't need to start the optimization */ + if (!vdev->vm_running) { + return; + } + + if (s->free_page_report_cmd_id == UINT_MAX) { + s->free_page_report_cmd_id = + VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN; + } else { + s->free_page_report_cmd_id++; + } + + s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED; + virtio_notify_config(vdev); +} + +static void virtio_balloon_free_page_stop(VirtIOBalloon *s) +{ + VirtIODevice *vdev = VIRTIO_DEVICE(s); + + if (s->free_page_report_status != FREE_PAGE_REPORT_S_STOP) { + /* + * The lock also guarantees us that the + * virtio_ballloon_get_free_page_hints exits after the + * free_page_report_status is set to S_STOP. + */ + qemu_mutex_lock(&s->free_page_lock); + /* + * The guest hasn't done the reporting, so host sends a notification + * to the guest to actively stop the reporting. + */ + s->free_page_report_status = FREE_PAGE_REPORT_S_STOP; + qemu_mutex_unlock(&s->free_page_lock); + virtio_notify_config(vdev); + } +} + +static void virtio_balloon_free_page_done(VirtIOBalloon *s) +{ + VirtIODevice *vdev = VIRTIO_DEVICE(s); + + s->free_page_report_status = FREE_PAGE_REPORT_S_DONE; + virtio_notify_config(vdev); +} + +static int +virtio_balloon_free_page_report_notify(NotifierWithReturn *n, void *data) +{ + VirtIOBalloon *dev = container_of(n, VirtIOBalloon, + free_page_report_notify); + VirtIODevice *vdev = VIRTIO_DEVICE(dev); + PrecopyNotifyData *pnd = data; + + if (!virtio_balloon_free_page_support(dev)) { + /* + * This is an optimization provided to migration, so just return 0 to + * have the normal migration process not affected when this feature is + * not supported. + */ + return 0; + } + + switch (pnd->reason) { + case PRECOPY_NOTIFY_SETUP: + precopy_enable_free_page_optimization(); + break; + case PRECOPY_NOTIFY_COMPLETE: + case PRECOPY_NOTIFY_CLEANUP: + case PRECOPY_NOTIFY_BEFORE_BITMAP_SYNC: + virtio_balloon_free_page_stop(dev); + break; + case PRECOPY_NOTIFY_AFTER_BITMAP_SYNC: + if (vdev->vm_running) { + virtio_balloon_free_page_start(dev); + } else { + virtio_balloon_free_page_done(dev); + } + break; + default: + virtio_error(vdev, "%s: %d reason unknown", __func__, pnd->reason); + } + + return 0; +} + static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data) { VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); @@ -316,6 +495,17 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data) config.num_pages = cpu_to_le32(dev->num_pages); config.actual = cpu_to_le32(dev->actual); + if (dev->free_page_report_status == FREE_PAGE_REPORT_S_REQUESTED) { + config.free_page_report_cmd_id = + cpu_to_le32(dev->free_page_report_cmd_id); + } else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_STOP) { + config.free_page_report_cmd_id = + cpu_to_le32(VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID); + } else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_DONE) { + config.free_page_report_cmd_id = + cpu_to_le32(VIRTIO_BALLOON_FREE_PAGE_REPORT_DONE_ID); + } + trace_virtio_balloon_get_config(config.num_pages, config.actual); memcpy(config_data, &config, sizeof(struct virtio_balloon_config)); } @@ -376,6 +566,7 @@ static uint64_t virtio_balloon_get_features(VirtIODevice *vdev, uint64_t f, VirtIOBalloon *dev = VIRTIO_BALLOON(vdev); f |= dev->host_features; virtio_add_feature(&f, VIRTIO_BALLOON_F_STATS_VQ); + return f; } @@ -412,6 +603,18 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id) return 0; } +static const VMStateDescription vmstate_virtio_balloon_free_page_report = { + .name = "virtio-balloon-device/free-page-report", + .version_id = 1, + .minimum_version_id = 1, + .needed = virtio_balloon_free_page_support, + .fields = (VMStateField[]) { + VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon), + VMSTATE_UINT32(free_page_report_status, VirtIOBalloon), + VMSTATE_END_OF_LIST() + } +}; + static const VMStateDescription vmstate_virtio_balloon_device = { .name = "virtio-balloon-device", .version_id = 1, @@ -422,6 +625,10 @@ static const VMStateDescription vmstate_virtio_balloon_device = { VMSTATE_UINT32(actual, VirtIOBalloon), VMSTATE_END_OF_LIST() }, + .subsections = (const VMStateDescription * []) { + &vmstate_virtio_balloon_free_page_report, + NULL + } }; static void virtio_balloon_device_realize(DeviceState *dev, Error **errp) @@ -446,6 +653,29 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp) s->dvq = virtio_add_queue(vdev, 128, virtio_balloon_handle_output); s->svq = virtio_add_queue(vdev, 128, virtio_balloon_receive_stats); + if (virtio_has_feature(s->host_features, + VIRTIO_BALLOON_F_FREE_PAGE_HINT)) { + s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE, + virtio_balloon_handle_free_page_vq); + s->free_page_report_status = FREE_PAGE_REPORT_S_STOP; + s->free_page_report_cmd_id = + VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN; + s->free_page_report_notify.notify = + virtio_balloon_free_page_report_notify; + precopy_add_notifier(&s->free_page_report_notify); + if (s->iothread) { + object_ref(OBJECT(s->iothread)); + s->free_page_bh = aio_bh_new(iothread_get_aio_context(s->iothread), + virtio_ballloon_get_free_page_hints, s); + qemu_mutex_init(&s->free_page_lock); + qemu_cond_init(&s->free_page_cond); + s->block_iothread = false; + } else { + /* Simply disable this feature if the iothread wasn't created. */ + s->host_features &= ~(1 << VIRTIO_BALLOON_F_FREE_PAGE_HINT); + virtio_error(vdev, "iothread is missing"); + } + } reset_stats(s); } @@ -454,6 +684,11 @@ static void virtio_balloon_device_unrealize(DeviceState *dev, Error **errp) VirtIODevice *vdev = VIRTIO_DEVICE(dev); VirtIOBalloon *s = VIRTIO_BALLOON(dev); + if (virtio_balloon_free_page_support(s)) { + qemu_bh_delete(s->free_page_bh); + virtio_balloon_free_page_stop(s); + precopy_remove_notifier(&s->free_page_report_notify); + } balloon_stats_destroy_timer(s); qemu_remove_balloon_handler(s); virtio_cleanup(vdev); @@ -463,6 +698,10 @@ static void virtio_balloon_device_reset(VirtIODevice *vdev) { VirtIOBalloon *s = VIRTIO_BALLOON(vdev); + if (virtio_balloon_free_page_support(s)) { + virtio_balloon_free_page_stop(s); + } + if (s->stats_vq_elem != NULL) { virtqueue_unpop(s->svq, s->stats_vq_elem, 0); g_free(s->stats_vq_elem); @@ -480,6 +719,26 @@ static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status) * was stopped */ virtio_balloon_receive_stats(vdev, s->svq); } + + if (virtio_balloon_free_page_support(s)) { + /* + * The VM is woken up and the iothread was blocked, so signal it to + * continue. + */ + if (vdev->vm_running && s->block_iothread) { + qemu_mutex_lock(&s->free_page_lock); + s->block_iothread = false; + qemu_cond_signal(&s->free_page_cond); + qemu_mutex_unlock(&s->free_page_lock); + } + + /* The VM is stopped, block the iothread. */ + if (!vdev->vm_running) { + qemu_mutex_lock(&s->free_page_lock); + s->block_iothread = true; + qemu_mutex_unlock(&s->free_page_lock); + } + } } static void virtio_balloon_instance_init(Object *obj) @@ -508,6 +767,10 @@ static const VMStateDescription vmstate_virtio_balloon = { static Property virtio_balloon_properties[] = { DEFINE_PROP_BIT("deflate-on-oom", VirtIOBalloon, host_features, VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false), + DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features, + VIRTIO_BALLOON_F_FREE_PAGE_HINT, false), + DEFINE_PROP_LINK("iothread", VirtIOBalloon, iothread, TYPE_IOTHREAD, + IOThread *), DEFINE_PROP_END_OF_LIST(), }; diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h index e0df352..503349a 100644 --- a/include/hw/virtio/virtio-balloon.h +++ b/include/hw/virtio/virtio-balloon.h @@ -17,11 +17,14 @@ #include "standard-headers/linux/virtio_balloon.h" #include "hw/virtio/virtio.h" +#include "sysemu/iothread.h" #define TYPE_VIRTIO_BALLOON "virtio-balloon-device" #define VIRTIO_BALLOON(obj) \ OBJECT_CHECK(VirtIOBalloon, (obj), TYPE_VIRTIO_BALLOON) +#define VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN 0x80000000 + typedef struct virtio_balloon_stat VirtIOBalloonStat; typedef struct virtio_balloon_stat_modern { @@ -30,15 +33,38 @@ typedef struct virtio_balloon_stat_modern { uint64_t val; } VirtIOBalloonStatModern; +enum virtio_balloon_free_page_report_status { + FREE_PAGE_REPORT_S_STOP = 0, + FREE_PAGE_REPORT_S_REQUESTED = 1, + FREE_PAGE_REPORT_S_START = 2, + FREE_PAGE_REPORT_S_DONE = 3, +}; + typedef struct VirtIOBalloon { VirtIODevice parent_obj; - VirtQueue *ivq, *dvq, *svq; + VirtQueue *ivq, *dvq, *svq, *free_page_vq; + uint32_t free_page_report_status; uint32_t num_pages; uint32_t actual; + uint32_t free_page_report_cmd_id; uint64_t stats[VIRTIO_BALLOON_S_NR]; VirtQueueElement *stats_vq_elem; size_t stats_vq_offset; QEMUTimer *stats_timer; + IOThread *iothread; + QEMUBH *free_page_bh; + /* + * Lock to synchronize threads to access the free page reporting related + * fields (e.g. free_page_report_status). + */ + QemuMutex free_page_lock; + QemuCond free_page_cond; + /* + * Set to block iothread to continue reading free page hints as the VM is + * stopped. + */ + bool block_iothread; + NotifierWithReturn free_page_report_notify; int64_t stats_last_update; int64_t stats_poll_interval; uint32_t host_features; diff --git a/include/standard-headers/linux/virtio_balloon.h b/include/standard-headers/linux/virtio_balloon.h index 4dbb7dc..9eee1c6 100644 --- a/include/standard-headers/linux/virtio_balloon.h +++ b/include/standard-headers/linux/virtio_balloon.h @@ -34,15 +34,20 @@ #define VIRTIO_BALLOON_F_MUST_TELL_HOST 0 /* Tell before reclaiming pages */ #define VIRTIO_BALLOON_F_STATS_VQ 1 /* Memory Stats virtqueue */ #define VIRTIO_BALLOON_F_DEFLATE_ON_OOM 2 /* Deflate balloon on OOM */ +#define VIRTIO_BALLOON_F_FREE_PAGE_HINT 3 /* VQ to report free pages */ /* Size of a PFN in the balloon interface. */ #define VIRTIO_BALLOON_PFN_SHIFT 12 +#define VIRTIO_BALLOON_FREE_PAGE_REPORT_STOP_ID 0 +#define VIRTIO_BALLOON_FREE_PAGE_REPORT_DONE_ID 1 struct virtio_balloon_config { /* Number of pages host wants Guest to give up. */ uint32_t num_pages; /* Number of pages we've actually got in balloon. */ uint32_t actual; + /* Free page report command id, readonly by guest */ + uint32_t free_page_report_cmd_id; }; #define VIRTIO_BALLOON_S_SWAP_IN 0 /* Amount of memory swapped in */