From patchwork Thu May 5 07:32:51 2016
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Liang Li
X-Patchwork-Id: 9021301
From: Liang Li
To: qemu-devel@nongnu.org
Cc: amit.shah@redhat.com, Liang Li, dgilbert@redhat.com, quintela@redhat.com
Date: Thu, 5 May 2016 15:32:51 +0800
Message-Id: <1462433579-13691-2-git-send-email-liang.z.li@intel.com>
In-Reply-To: <1462433579-13691-1-git-send-email-liang.z.li@intel.com>
References: <1462433579-13691-1-git-send-email-liang.z.li@intel.com>
X-Mailer: git-send-email 1.9.1
Subject: [Qemu-devel] [PATCH v2 1/9] migration: Fix multi-thread compression bug

Recently, a bug in the multi-thread compression feature for live
migration was reported. The destination side blocks during live
migration when the host is under heavy workload and the guest runs a
memory-intensive workload. This is most likely to happen when there is
only one decompression thread.

Some parts of the decompression code are incorrect:

1. The main thread, which receives data from the source side, enters a
   busy loop to wait for a free decompression thread.
2. A lock is needed to protect decomp_param[idx].start, because it is
   checked in the main thread and updated in the decompression thread.

Fix these two issues by following the code pattern used for compression.

Signed-off-by: Liang Li
Reported-by: Daniel P. Berrange
Reviewed-by: Daniel P. Berrange
Reviewed-by: Dr. David Alan Gilbert
Reviewed-by: Juan Quintela
Tested-by: Daniel P. Berrange
---
 migration/ram.c | 38 +++++++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/migration/ram.c b/migration/ram.c
index 3f05738..7ab6ab5 100644
--- a/migration/ram.c
+++ b/migration/ram.c
@@ -263,6 +263,7 @@ typedef struct CompressParam CompressParam;
 
 struct DecompressParam {
     bool start;
+    bool done;
     QemuMutex mutex;
     QemuCond cond;
     void *des;
@@ -287,6 +288,8 @@ static bool quit_comp_thread;
 static bool quit_decomp_thread;
 static DecompressParam *decomp_param;
 static QemuThread *decompress_threads;
+static QemuMutex decomp_done_lock;
+static QemuCond decomp_done_cond;
 
 static int do_compress_ram_page(CompressParam *param);
 
@@ -834,6 +837,7 @@ static inline void start_compression(CompressParam *param)
 
 static inline void start_decompression(DecompressParam *param)
 {
+    param->done = false;
     qemu_mutex_lock(&param->mutex);
     param->start = true;
     qemu_cond_signal(&param->cond);
@@ -2193,19 +2197,24 @@ static void *do_data_decompress(void *opaque)
         qemu_mutex_lock(&param->mutex);
         while (!param->start && !quit_decomp_thread) {
             qemu_cond_wait(&param->cond, &param->mutex);
+        }
+        if (!quit_decomp_thread) {
             pagesize = TARGET_PAGE_SIZE;
-            if (!quit_decomp_thread) {
-                /* uncompress() will return failed in some case, especially
-                 * when the page is dirted when doing the compression, it's
-                 * not a problem because the dirty page will be retransferred
-                 * and uncompress() won't break the data in other pages.
-                 */
-                uncompress((Bytef *)param->des, &pagesize,
-                           (const Bytef *)param->compbuf, param->len);
-            }
-            param->start = false;
+            /* uncompress() will return failed in some case, especially
+             * when the page is dirted when doing the compression, it's
+             * not a problem because the dirty page will be retransferred
+             * and uncompress() won't break the data in other pages.
+             */
+            uncompress((Bytef *)param->des, &pagesize,
+                       (const Bytef *)param->compbuf, param->len);
         }
+        param->start = false;
         qemu_mutex_unlock(&param->mutex);
+
+        qemu_mutex_lock(&decomp_done_lock);
+        param->done = true;
+        qemu_cond_signal(&decomp_done_cond);
+        qemu_mutex_unlock(&decomp_done_lock);
     }
 
     return NULL;
@@ -2219,10 +2228,13 @@ void migrate_decompress_threads_create(void)
     decompress_threads = g_new0(QemuThread, thread_count);
     decomp_param = g_new0(DecompressParam, thread_count);
     quit_decomp_thread = false;
+    qemu_mutex_init(&decomp_done_lock);
+    qemu_cond_init(&decomp_done_cond);
     for (i = 0; i < thread_count; i++) {
         qemu_mutex_init(&decomp_param[i].mutex);
         qemu_cond_init(&decomp_param[i].cond);
         decomp_param[i].compbuf = g_malloc0(compressBound(TARGET_PAGE_SIZE));
+        decomp_param[i].done = true;
         qemu_thread_create(decompress_threads + i, "decompress",
                            do_data_decompress, decomp_param + i,
                            QEMU_THREAD_JOINABLE);
@@ -2258,9 +2270,10 @@ static void decompress_data_with_multi_threads(QEMUFile *f,
     int idx, thread_count;
 
     thread_count = migrate_decompress_threads();
+    qemu_mutex_lock(&decomp_done_lock);
     while (true) {
         for (idx = 0; idx < thread_count; idx++) {
-            if (!decomp_param[idx].start) {
+            if (decomp_param[idx].done) {
                 qemu_get_buffer(f, decomp_param[idx].compbuf, len);
                 decomp_param[idx].des = host;
                 decomp_param[idx].len = len;
@@ -2270,8 +2283,11 @@ static void decompress_data_with_multi_threads(QEMUFile *f,
         }
         if (idx < thread_count) {
             break;
+        } else {
+            qemu_cond_wait(&decomp_done_cond, &decomp_done_lock);
         }
     }
+    qemu_mutex_unlock(&decomp_done_lock);
 }
 
 /*
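
For readers without a QEMU tree at hand, the handshake this patch introduces can be sketched outside QEMU with plain POSIX threads. This is only an illustration, not the code in migration/ram.c. The names Worker, worker_main, dispatch and run_demo are hypothetical, the per-worker quit flag stands in for the global quit_decomp_thread, and adding an integer to a running total stands in for uncompress(). The sketch aims to show the two points from the commit message: the dispatcher sleeps on a shared condition variable instead of spinning, and the flag it inspects (done) is only read or written under a lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for DecompressParam: one slot per worker thread. */
typedef struct {
    bool start;              /* guarded by mutex: a job is pending */
    bool quit;               /* guarded by mutex: the worker should exit */
    bool done;               /* guarded by done_lock: the slot is idle */
    int input;               /* job payload, written while the slot is idle */
    pthread_mutex_t mutex;
    pthread_cond_t cond;
} Worker;

/* Shared pair, playing the role of decomp_done_lock / decomp_done_cond. */
static pthread_mutex_t done_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t done_cond = PTHREAD_COND_INITIALIZER;
static long total;           /* guarded by done_lock: sum of processed inputs */

static void *worker_main(void *opaque)
{
    Worker *w = opaque;

    for (;;) {
        pthread_mutex_lock(&w->mutex);
        while (!w->start && !w->quit) {
            pthread_cond_wait(&w->cond, &w->mutex);
        }
        if (!w->start) {     /* woken only because quit was set */
            pthread_mutex_unlock(&w->mutex);
            return NULL;
        }
        int value = w->input; /* "process" the job */
        w->start = false;
        pthread_mutex_unlock(&w->mutex);

        /* Publish completion under the shared lock and wake the dispatcher. */
        pthread_mutex_lock(&done_lock);
        total += value;
        w->done = true;
        pthread_cond_signal(&done_cond);
        pthread_mutex_unlock(&done_lock);
    }
}

/* Hand one job to an idle worker, sleeping (not spinning) if none is idle. */
static void dispatch(Worker *workers, int n_workers, int input)
{
    pthread_mutex_lock(&done_lock);
    for (;;) {
        int idx;
        for (idx = 0; idx < n_workers; idx++) {
            if (workers[idx].done) {
                workers[idx].done = false;
                pthread_mutex_lock(&workers[idx].mutex);
                workers[idx].input = input;
                workers[idx].start = true;
                pthread_cond_signal(&workers[idx].cond);
                pthread_mutex_unlock(&workers[idx].mutex);
                break;
            }
        }
        if (idx < n_workers) {
            break;
        }
        pthread_cond_wait(&done_cond, &done_lock);
    }
    pthread_mutex_unlock(&done_lock);
}

/* Run jobs 1..n_jobs through n_workers threads and return their sum. */
long run_demo(int n_workers, int n_jobs)
{
    Worker *workers = calloc(n_workers, sizeof(*workers));
    pthread_t *threads = calloc(n_workers, sizeof(*threads));
    assert(workers && threads);

    total = 0;
    for (int i = 0; i < n_workers; i++) {
        pthread_mutex_init(&workers[i].mutex, NULL);
        pthread_cond_init(&workers[i].cond, NULL);
        workers[i].done = true;          /* every slot starts idle */
        int rc = pthread_create(&threads[i], NULL, worker_main, &workers[i]);
        assert(rc == 0);
        (void)rc;
    }
    for (int job = 1; job <= n_jobs; job++) {
        dispatch(workers, n_workers, job);
    }

    /* Wait until every worker has reported its last job as done. */
    pthread_mutex_lock(&done_lock);
    for (int i = 0; i < n_workers; i++) {
        while (!workers[i].done) {
            pthread_cond_wait(&done_cond, &done_lock);
        }
    }
    long result = total;
    pthread_mutex_unlock(&done_lock);

    for (int i = 0; i < n_workers; i++) {
        pthread_mutex_lock(&workers[i].mutex);
        workers[i].quit = true;
        pthread_cond_signal(&workers[i].cond);
        pthread_mutex_unlock(&workers[i].mutex);
        pthread_join(threads[i], NULL);
        pthread_cond_destroy(&workers[i].cond);
        pthread_mutex_destroy(&workers[i].mutex);
    }
    free(threads);
    free(workers);
    return result;
}
```

Calling run_demo(1, 8) pushes eight jobs through a single worker, the configuration the commit message singles out as most likely to hang. As in the patch, the worker never holds its own mutex and the shared lock at the same time, so the dispatcher can take the shared lock first and the per-worker mutex second without a lock-order inversion.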