From patchwork Tue Nov 14 05:40:13 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 13454796 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 69643C4167D for ; Tue, 14 Nov 2023 05:43:10 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2mB7-0007AB-Tz; Tue, 14 Nov 2023 00:41:37 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2mB5-00079S-GV for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:35 -0500 Received: from mail-qk1-x72f.google.com ([2607:f8b0:4864:20::72f]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mB3-0001RO-Lb for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:34 -0500 Received: by mail-qk1-x72f.google.com with SMTP id af79cd13be357-7789aed0e46so347374985a.0 for ; Mon, 13 Nov 2023 21:41:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940493; x=1700545293; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=Xr+ydM0w1ChiLZUK2AmzFF5aaNpk3nnn08+T2D9fxos=; b=gYG+oihFRbyXIlUjYeqK1xW72wxSkK+NIcWS8NjNRWhLz4D9wG1qjGj1whgU6lUYsP 0eIyzYEHldNmR7LTvY1VfZ7SeQCe5SJDW8ONXvLHwX5rNcxQCvP4j/HucapY3ofzbsXx WYIbpfM4+pTkA7d0/9dPkCv7MONdH2R7cwmL3MzC74aYWoBPkFQSyc+JMuXJPRDebVau P0wPcmICJQaYhdiqgnk2o7ImmkFZ25ctLATqavy8yvcgRbnmCrk+Jlt83/v3+vAkPP+c 
WP/V7u2ho+SkNe4g9buKxnp6SI94OMYfo8bwFKu+DZLwvNmsVDP6I9Dud2/wCohDhg2E WAxg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940493; x=1700545293; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Xr+ydM0w1ChiLZUK2AmzFF5aaNpk3nnn08+T2D9fxos=; b=j1bQchCSPYnUkcIspZbMIZfbxqX85n2w/r1CO1h4SkguGWXNoZIyWiUwFds/HaMiJC N12nOVRYVg11gQxjIBFVC4WbdTKzqgrQ4/ljQV2xjHIbmJ1v4NzDtC7DF+jUrxVsy/5F YycNvQxAv7CREJYE+E/rRCb/JAouKikDVDb26jazhJqbEjO7aUfQ5KP4QBkth5IIn4H0 NXOrHEAwnWZZgL8vUHfdBHi6nPA/c/bL3NBTxvoGW4yRfsGAI8wKpQTdNd5fzRle9ihn WE8y2r13sES99nT6OMJnLVWB/zYXWbxtHQNN/A3EJvJFwF7Jx6teUsLUz8A+FM+/tWZD nl0Q== X-Gm-Message-State: AOJu0Yw37DKPC2J6N5JvbGKuGVfLmEC0VC1iOh6E0xetV3pIqted4rbT /wQCUvX7yhPdW+JwL1qSJfs75g== X-Google-Smtp-Source: AGHT+IEA9HBOdkrDJf774Pzi34f+3p9eGpWwXEG5+on5ufo70sFE/7fe9sZw2logv71cNshNiqPNFg== X-Received: by 2002:a05:620a:3b06:b0:77b:c536:3722 with SMTP id tl6-20020a05620a3b0600b0077bc5363722mr1024227qkn.25.1699940492818; Mon, 13 Nov 2023 21:41:32 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.41.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:41:32 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Subject: [PATCH v2 01/20] multifd: Add capability to enable/disable zero_page Date: Tue, 14 Nov 2023 05:40:13 +0000 Message-Id: <20231114054032.1192027-2-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::72f; 
envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x72f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Juan Quintela We have to enable it by default until we introduce the new code. Signed-off-by: Juan Quintela --- migration/options.c | 13 +++++++++++++ migration/options.h | 1 + qapi/migration.json | 8 +++++++- 3 files changed, 21 insertions(+), 1 deletion(-) diff --git a/migration/options.c b/migration/options.c index 8d8ec73ad9..00c0c4a0d6 100644 --- a/migration/options.c +++ b/migration/options.c @@ -204,6 +204,8 @@ Property migration_properties[] = { DEFINE_PROP_MIG_CAP("x-switchover-ack", MIGRATION_CAPABILITY_SWITCHOVER_ACK), DEFINE_PROP_MIG_CAP("x-dirty-limit", MIGRATION_CAPABILITY_DIRTY_LIMIT), + DEFINE_PROP_MIG_CAP("main-zero-page", + MIGRATION_CAPABILITY_MAIN_ZERO_PAGE), DEFINE_PROP_END_OF_LIST(), }; @@ -284,6 +286,17 @@ bool migrate_multifd(void) return s->capabilities[MIGRATION_CAPABILITY_MULTIFD]; } +bool migrate_use_main_zero_page(void) +{ + //MigrationState *s; + + //s = migrate_get_current(); + + // We will enable this when we add the right code. 
+ // return s->enabled_capabilities[MIGRATION_CAPABILITY_MAIN_ZERO_PAGE]; + return true; +} + bool migrate_pause_before_switchover(void) { MigrationState *s = migrate_get_current(); diff --git a/migration/options.h b/migration/options.h index 246c160aee..c901eb57c6 100644 --- a/migration/options.h +++ b/migration/options.h @@ -88,6 +88,7 @@ int migrate_multifd_channels(void); MultiFDCompression migrate_multifd_compression(void); int migrate_multifd_zlib_level(void); int migrate_multifd_zstd_level(void); +bool migrate_use_main_zero_page(void); uint8_t migrate_throttle_trigger_threshold(void); const char *migrate_tls_authz(void); const char *migrate_tls_creds(void); diff --git a/qapi/migration.json b/qapi/migration.json index 975761eebd..09e4393591 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -531,6 +531,12 @@ # and can result in more stable read performance. Requires KVM # with accelerator property "dirty-ring-size" set. (Since 8.1) # +# +# @main-zero-page: If enabled, the detection of zero pages will be +# done on the main thread. Otherwise it is done on +# the multifd threads. +# (since 8.2) +# # Features: # # @deprecated: Member @block is deprecated. 
Use blockdev-mirror with @@ -555,7 +561,7 @@ { 'name': 'x-ignore-shared', 'features': [ 'unstable' ] }, 'validate-uuid', 'background-snapshot', 'zero-copy-send', 'postcopy-preempt', 'switchover-ack', - 'dirty-limit'] } + 'dirty-limit', 'main-zero-page'] } ## # @MigrationCapabilityStatus: From patchwork Tue Nov 14 05:40:14 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 13454806 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4F330C4332F for ; Tue, 14 Nov 2023 05:44:02 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2mBC-0007B0-95; Tue, 14 Nov 2023 00:41:42 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2mB9-0007AO-Cr for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:39 -0500 Received: from mail-qk1-x733.google.com ([2607:f8b0:4864:20::733]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mB7-0001RT-8S for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:39 -0500 Received: by mail-qk1-x733.google.com with SMTP id af79cd13be357-77a277eb084so346430485a.2 for ; Mon, 13 Nov 2023 21:41:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940494; x=1700545294; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=z3Qnk3XRMyF+hVRPcPcuJsAD6fV3CsVX4VzBIwkD5H4=; 
b=ibvd3VS0d1LZbzwXEPmsKpo180QMIVJDqrhysBt/Lb5OW7rHHQCZTFFwMNaAxzTTwS hqobNZmQTqnCA2rwjV/LJ62GwG1ahkhzp8/cRC+V91v5SjaWXkYuqZ2bh9j1WQUPGflb zTbOex4S8Rq5vYm5XCqa36Jilqdfmo2ByZXYTA32cHgeer0jUOwVjfLdYOhUUJ2TKC+P 9+vSDURZQCcc1j6VG8XS1mH28hVYpI/wrmLvafCOlzgaI59YXVUNMzZp9PhWWjWzJ9xe JvChr6fm6u5mn+53F3X2CTHU0RBZvJcHjcEoxAup1Jc5oALz3xm+VXon4C2ba8+1HROl kirw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940494; x=1700545294; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=z3Qnk3XRMyF+hVRPcPcuJsAD6fV3CsVX4VzBIwkD5H4=; b=iKKiPEdexIg2oxO1FWG+U4eKzdAuhAbBFRzjcXt6N4uY4tYdiCybvW8Q6PyeAe3dHR InGONL05XA8e0pEFUIOADF9dQ3J9TEe96SvivOmEe7UvxmZo9W7FlsTOwVEQDKiVmbb0 Y+yn+m8ZrCh93JVHzY04d/hlTmU6545ABLvdJCyncUaDCRai/zp/e4j8LTmr2q/aoVNP jonPaCQ7oSRH+gWpl2PMlCroG6UsEN8XKVxYhWBgCATlXXuGlFDIYP7zYEcCQwVgEd8H 8CyMkoFPBxAMboIT5EqPSUj9k7zeH0qQkNTNb3i0rHRBVxEQWHY4vJwnmnSx/CT5ezOn 2EWg== X-Gm-Message-State: AOJu0YyeSNU1Ze1tpGb3JCwecLUUmXXS+XVB1TQu3u6pIKL6OigcBjdT eH/ZmBaLiCwpzuhKseBNaGdM/A== X-Google-Smtp-Source: AGHT+IGFv27ChdLqhVZVKeKHoqW1DF8D01/HCeyJzHQjoBY7a+yGauUKZp3azGdc11o9Yfl/kvkVtA== X-Received: by 2002:a05:620a:8703:b0:777:ac06:ed91 with SMTP id px3-20020a05620a870300b00777ac06ed91mr1239738qkn.23.1699940494447; Mon, 13 Nov 2023 21:41:34 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.41.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:41:34 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Subject: [PATCH v2 02/20] multifd: Support for zero pages transmission Date: Tue, 14 Nov 2023 05:40:14 +0000 Message-Id: 
<20231114054032.1192027-3-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::733; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x733.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Juan Quintela This patch adds the zero page counters, the zero_pages packet field and the matching trace arguments. The detection logic is added in the following patch. 
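The bound that the receive side applies to the new zero_pages counter can be sketched in isolation. This is only an illustration under stated assumptions: `demo_packet_hdr`, `demo_be32_to_cpu`, `demo_cpu_to_be32`, `demo_check_counts` and `demo_counts_fit` are hypothetical names, not QEMU code, and the header is a simplified stand-in for the counters in MultiFDPacket_t. It shows that the zero page count may use at most the room that the normal pages leave in the allocated offset array, with the counters converted from big-endian first.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the counters in MultiFDPacket_t. */
struct demo_packet_hdr {
    uint32_t pages_alloc;   /* capacity of offset[], big-endian on the wire */
    uint32_t normal_pages;  /* number of normal pages, big-endian */
    uint32_t zero_pages;    /* number of zero pages, big-endian */
};

/* Portable big-endian to host conversion (stand-in for be32_to_cpu). */
static uint32_t demo_be32_to_cpu(uint32_t be)
{
    const uint8_t *b = (const uint8_t *)&be;

    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8) | (uint32_t)b[3];
}

/* Portable host to big-endian conversion (stand-in for cpu_to_be32). */
static uint32_t demo_cpu_to_be32(uint32_t host)
{
    uint32_t be;
    uint8_t *b = (uint8_t *)&be;

    b[0] = (uint8_t)(host >> 24);
    b[1] = (uint8_t)(host >> 16);
    b[2] = (uint8_t)(host >> 8);
    b[3] = (uint8_t)host;
    return be;
}

/* Return true when normal + zero pages fit in the allocated offsets. */
static bool demo_check_counts(const struct demo_packet_hdr *hdr)
{
    uint32_t alloc = demo_be32_to_cpu(hdr->pages_alloc);
    uint32_t normal = demo_be32_to_cpu(hdr->normal_pages);
    uint32_t zero = demo_be32_to_cpu(hdr->zero_pages);

    if (normal > alloc) {
        return false;           /* avoids unsigned wrap in the subtraction */
    }
    return zero <= alloc - normal;
}

/* Convenience wrapper: build a wire header from host values and check it. */
static bool demo_counts_fit(uint32_t alloc, uint32_t normal, uint32_t zero)
{
    struct demo_packet_hdr hdr = {
        .pages_alloc = demo_cpu_to_be32(alloc),
        .normal_pages = demo_cpu_to_be32(normal),
        .zero_pages = demo_cpu_to_be32(zero),
    };

    return demo_check_counts(&hdr);
}
```

With these helpers, demo_counts_fit(128, 100, 28) accepts the packet, while demo_counts_fit(128, 100, 29) rejects it because one zero page would fall outside the offset array.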
Signed-off-by: Juan Quintela --- migration/multifd.c | 37 ++++++++++++++++++++++++++++++------- migration/multifd.h | 17 ++++++++++++++++- migration/trace-events | 8 ++++---- 3 files changed, 50 insertions(+), 12 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index ec58c58082..d28ef0028b 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -267,6 +267,7 @@ static void multifd_send_fill_packet(MultiFDSendParams *p) packet->normal_pages = cpu_to_be32(p->normal_num); packet->next_packet_size = cpu_to_be32(p->next_packet_size); packet->packet_num = cpu_to_be64(p->packet_num); + packet->zero_pages = cpu_to_be32(p->zero_num); if (p->pages->block) { strncpy(packet->ramblock, p->pages->block->idstr, 256); @@ -326,7 +327,15 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) p->next_packet_size = be32_to_cpu(packet->next_packet_size); p->packet_num = be64_to_cpu(packet->packet_num); - if (p->normal_num == 0) { + p->zero_num = be32_to_cpu(packet->zero_pages); + if (p->zero_num > packet->pages_alloc - p->normal_num) { + error_setg(errp, "multifd: received packet " + "with %u zero pages and expected maximum pages are %u", + p->zero_num, packet->pages_alloc - p->normal_num) ; + return -1; + } + + if (p->normal_num == 0 && p->zero_num == 0) { return 0; } @@ -431,6 +440,7 @@ static int multifd_send_pages(QEMUFile *f) p->packet_num = multifd_send_state->packet_num++; multifd_send_state->pages = p->pages; p->pages = pages; + qemu_mutex_unlock(&p->mutex); qemu_sem_post(&p->sem); @@ -552,6 +562,8 @@ void multifd_save_cleanup(void) p->iov = NULL; g_free(p->normal); p->normal = NULL; + g_free(p->zero); + p->zero = NULL; multifd_send_state->ops->send_cleanup(p, &local_err); if (local_err) { migrate_set_error(migrate_get_current(), local_err); @@ -680,6 +692,7 @@ static void *multifd_send_thread(void *opaque) uint64_t packet_num = p->packet_num; uint32_t flags; p->normal_num = 0; + p->zero_num = 0; if (use_zero_copy_send) { 
p->iovs_num = 0; @@ -704,12 +717,13 @@ static void *multifd_send_thread(void *opaque) p->flags = 0; p->num_packets++; p->total_normal_pages += p->normal_num; + p->total_zero_pages += p->zero_num; p->pages->num = 0; p->pages->block = NULL; qemu_mutex_unlock(&p->mutex); - trace_multifd_send(p->id, packet_num, p->normal_num, flags, - p->next_packet_size); + trace_multifd_send(p->id, packet_num, p->normal_num, p->zero_num, + flags, p->next_packet_size); if (use_zero_copy_send) { /* Send header first, without zerocopy */ @@ -732,6 +746,8 @@ static void *multifd_send_thread(void *opaque) stat64_add(&mig_stats.multifd_bytes, p->next_packet_size + p->packet_len); + stat64_add(&mig_stats.normal_pages, p->normal_num); + stat64_add(&mig_stats.zero_pages, p->zero_num); p->next_packet_size = 0; qemu_mutex_lock(&p->mutex); p->pending_job--; @@ -762,7 +778,8 @@ out: rcu_unregister_thread(); migration_threads_remove(thread); - trace_multifd_send_thread_end(p->id, p->num_packets, p->total_normal_pages); + trace_multifd_send_thread_end(p->id, p->num_packets, p->total_normal_pages, + p->total_zero_pages); return NULL; } @@ -939,6 +956,7 @@ int multifd_save_setup(Error **errp) p->normal = g_new0(ram_addr_t, page_count); p->page_size = qemu_target_page_size(); p->page_count = page_count; + p->zero = g_new0(ram_addr_t, page_count); if (migrate_zero_copy_send()) { p->write_flags = QIO_CHANNEL_WRITE_FLAG_ZERO_COPY; @@ -1054,6 +1072,8 @@ void multifd_load_cleanup(void) p->iov = NULL; g_free(p->normal); p->normal = NULL; + g_free(p->zero); + p->zero = NULL; multifd_recv_state->ops->recv_cleanup(p); } qemu_sem_destroy(&multifd_recv_state->sem_sync); @@ -1122,10 +1142,11 @@ static void *multifd_recv_thread(void *opaque) flags = p->flags; /* recv methods don't know how to handle the SYNC flag */ p->flags &= ~MULTIFD_FLAG_SYNC; - trace_multifd_recv(p->id, p->packet_num, p->normal_num, flags, - p->next_packet_size); + trace_multifd_recv(p->id, p->packet_num, p->normal_num, p->zero_num, + flags, 
p->next_packet_size); p->num_packets++; p->total_normal_pages += p->normal_num; + p->total_zero_pages += p->zero_num; qemu_mutex_unlock(&p->mutex); if (p->normal_num) { @@ -1150,7 +1171,8 @@ static void *multifd_recv_thread(void *opaque) qemu_mutex_unlock(&p->mutex); rcu_unregister_thread(); - trace_multifd_recv_thread_end(p->id, p->num_packets, p->total_normal_pages); + trace_multifd_recv_thread_end(p->id, p->num_packets, p->total_normal_pages, + p->total_zero_pages); return NULL; } @@ -1191,6 +1213,7 @@ int multifd_load_setup(Error **errp) p->normal = g_new0(ram_addr_t, page_count); p->page_count = page_count; p->page_size = qemu_target_page_size(); + p->zero = g_new0(ram_addr_t, page_count); } for (i = 0; i < thread_count; i++) { diff --git a/migration/multifd.h b/migration/multifd.h index a835643b48..d587b0e19c 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -48,7 +48,10 @@ typedef struct { /* size of the next packet that contains pages */ uint32_t next_packet_size; uint64_t packet_num; - uint64_t unused[4]; /* Reserved for future use */ + /* zero pages */ + uint32_t zero_pages; + uint32_t unused32[1]; /* Reserved for future use */ + uint64_t unused64[3]; /* Reserved for future use */ char ramblock[256]; uint64_t offset[]; } __attribute__((packed)) MultiFDPacket_t; @@ -122,6 +125,8 @@ typedef struct { uint64_t num_packets; /* non zero pages sent through this channel */ uint64_t total_normal_pages; + /* zero pages sent through this channel */ + uint64_t total_zero_pages; /* buffers to send */ struct iovec *iov; /* number of iovs used */ @@ -130,6 +135,10 @@ typedef struct { ram_addr_t *normal; /* num of non zero pages */ uint32_t normal_num; + /* Pages that are zero */ + ram_addr_t *zero; + /* num of zero pages */ + uint32_t zero_num; /* used for compression methods */ void *data; } MultiFDSendParams; @@ -181,12 +190,18 @@ typedef struct { uint8_t *host; /* non zero pages recv through this channel */ uint64_t total_normal_pages; + /* zero pages 
recv through this channel */ + uint64_t total_zero_pages; /* buffers to recv */ struct iovec *iov; /* Pages that are not zero */ ram_addr_t *normal; /* num of non zero pages */ uint32_t normal_num; + /* Pages that are zero */ + ram_addr_t *zero; + /* num of zero pages */ + uint32_t zero_num; /* used for de-compression methods */ void *data; } MultiFDRecvParams; diff --git a/migration/trace-events b/migration/trace-events index de4a743c8a..c0a758db9d 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -128,21 +128,21 @@ postcopy_preempt_reset_channel(void) "" # multifd.c multifd_new_send_channel_async(uint8_t id) "channel %u" multifd_new_send_channel_async_error(uint8_t id, void *err) "channel=%u err=%p" -multifd_recv(uint8_t id, uint64_t packet_num, uint32_t used, uint32_t flags, uint32_t next_packet_size) "channel %u packet_num %" PRIu64 " pages %u flags 0x%x next packet size %u" +multifd_recv(uint8_t id, uint64_t packet_num, uint32_t normal, uint32_t zero, uint32_t flags, uint32_t next_packet_size) "channel %u packet_num %" PRIu64 " normal pages %u zero pages %u flags 0x%x next packet size %u" multifd_recv_new_channel(uint8_t id) "channel %u" multifd_recv_sync_main(long packet_num) "packet num %ld" multifd_recv_sync_main_signal(uint8_t id) "channel %u" multifd_recv_sync_main_wait(uint8_t id) "channel %u" multifd_recv_terminate_threads(bool error) "error %d" -multifd_recv_thread_end(uint8_t id, uint64_t packets, uint64_t pages) "channel %u packets %" PRIu64 " pages %" PRIu64 +multifd_recv_thread_end(uint8_t id, uint64_t packets, uint64_t normal_pages, uint64_t zero_pages) "channel %u packets %" PRIu64 " normal pages %" PRIu64 " zero pages %" PRIu64 multifd_recv_thread_start(uint8_t id) "%u" -multifd_send(uint8_t id, uint64_t packet_num, uint32_t normal, uint32_t flags, uint32_t next_packet_size) "channel %u packet_num %" PRIu64 " normal pages %u flags 0x%x next packet size %u" +multifd_send(uint8_t id, uint64_t packet_num, uint32_t normalpages, 
uint32_t zero_pages, uint32_t flags, uint32_t next_packet_size) "channel %u packet_num %" PRIu64 " normal pages %u zero pages %u flags 0x%x next packet size %u" multifd_send_error(uint8_t id) "channel %u" multifd_send_sync_main(long packet_num) "packet num %ld" multifd_send_sync_main_signal(uint8_t id) "channel %u" multifd_send_sync_main_wait(uint8_t id) "channel %u" multifd_send_terminate_threads(bool error) "error %d" -multifd_send_thread_end(uint8_t id, uint64_t packets, uint64_t normal_pages) "channel %u packets %" PRIu64 " normal pages %" PRIu64 +multifd_send_thread_end(uint8_t id, uint64_t packets, uint64_t normal_pages, uint64_t zero_pages) "channel %u packets %" PRIu64 " normal pages %" PRIu64 " zero pages %" PRIu64 multifd_send_thread_start(uint8_t id) "%u" multifd_tls_outgoing_handshake_start(void *ioc, void *tioc, const char *hostname) "ioc=%p tioc=%p hostname=%s" multifd_tls_outgoing_handshake_error(void *ioc, const char *err) "ioc=%p err=%s" From patchwork Tue Nov 14 05:40:15 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 13454792 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5479DC4167D for ; Tue, 14 Nov 2023 05:42:22 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2mBB-0007Ax-7H; Tue, 14 Nov 2023 00:41:41 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2mB9-0007AG-0h for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:39 -0500 Received: from mail-qk1-x72f.google.com 
([2607:f8b0:4864:20::72f]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mB7-0001Re-Cp for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:38 -0500 Received: by mail-qk1-x72f.google.com with SMTP id af79cd13be357-7781bc3783fso347000685a.1 for ; Mon, 13 Nov 2023 21:41:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940496; x=1700545296; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:from:to:cc:subject:date:message-id :reply-to; bh=m7yOYF2wvI8zWlizu/0wvN4KIBvmGhSdtNfjdan8Svs=; b=h481S6gicIS7lvBG19h6kmCCNVOl2R3ou9gCUtP3t3vDY4OAr5BXuP6hc7Jy4oKIrI pw8kh9jxAE6DzAVEY662q9UcrQRQpvpFZr/B/gKLEzHIejyIA/X3zMs7LODBYlFe/tvt Wrhxjac/UuA0EIR8eW6gQaCo3oxVdToYtAfNuOkRUNIupTIT3HPZ/FNGqWv/ck3Z7m/Q uB0GCJmhYfNA/PWVWckusi3SkrXpIO79d4GFIfIHMS866bYvnPP9EtGkdCggUBqiCV+Y XGw2ZyXJL8YCkOOCxx1QM+/oOtePBDngdhPRHiC+2RgK+UYjztMj0h+QchDTBOzLI9uI KyqQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940496; x=1700545296; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=m7yOYF2wvI8zWlizu/0wvN4KIBvmGhSdtNfjdan8Svs=; b=BHQiz9bSEh0qz1wfyx3C+ylqnyaug4dqhfrPLadLF5+Pnku9T29HYYXw9XESNSixfN 69sXML+joxA5i4jaGyoMeg5U3LW/0Wdk6h26LaHx0V8Titn+hwf5KUhc+qT1C3Q8Bc9k Hl0hlgIR5uloZIV/Nd5yETdXakhfw6RC+jCg6C9pUSswia4Lu8jRsB46hD5WOwoCjaj1 u1UR+II/tZ+f1hyx9jioZRV3H6aOWiUz9AkEMFOggLTgBSLTahcpHIjpO01+laI4tAIm BNodn92IDs5Eguubw0qbpFSOPBpsVpNn8IprhL4IaxoCHW8P3SKf1ozyAU03pjuPlp2u KBNg== X-Gm-Message-State: AOJu0Yww80oMSNOZb8ukVY1H5KHTPDTxb9NwNF7pBJ6zRRqFrVQ5YTmK BmyRuyQSqD7/877N7aXJG6a1Ug== X-Google-Smtp-Source: AGHT+IFoWUVfRtXAcPY2um3Za+B4d76wIjlGb5VeES5aWZsXkzcSdP0lY89WFdppRJums7U1Oo+6+Q== X-Received: by 2002:a05:620a:44cb:b0:77a:6dcd:3fce with SMTP id 
y11-20020a05620a44cb00b0077a6dcd3fcemr1785792qkp.11.1699940495853; Mon, 13 Nov 2023 21:41:35 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.41.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:41:35 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Subject: [PATCH v2 03/20] multifd: Zero pages transmission Date: Tue, 14 Nov 2023 05:40:15 +0000 Message-Id: <20231114054032.1192027-4-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::72f; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x72f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Juan Quintela This implements the zero page detection and handling. 
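The classification step can be sketched outside of QEMU as follows. This is a minimal sketch under stated assumptions, not the code of this patch: `demo_page_is_zero`, `demo_split_pages` and `demo_example` are hypothetical names, the byte scan stands in for QEMU's buffer_is_zero(), and a single output array stands in for the packet's offset[] (normal page offsets first, zero page offsets after them).

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Plain byte scan, used here in place of an optimized zero check. */
static bool demo_page_is_zero(const uint8_t *page, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (page[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Classify each page at host + offsets[i] and store the offsets in out[]:
 * normal pages first, zero pages after them, both in their original order.
 * Returns the number of normal pages; *zero_num receives the zero count.
 */
static uint32_t demo_split_pages(const uint8_t *host, size_t page_size,
                                 const uint64_t *offsets, uint32_t num,
                                 uint64_t *out, uint32_t *zero_num)
{
    uint32_t n = 0;
    uint32_t z = 0;

    for (uint32_t i = 0; i < num; i++) {
        if (demo_page_is_zero(host + offsets[i], page_size)) {
            out[num - 1 - z] = offsets[i];  /* fill zero pages from the end */
            z++;
        } else {
            out[n] = offsets[i];            /* fill normal pages from the front */
            n++;
        }
    }

    /* The zero page tail was written back to front; restore its order. */
    for (uint32_t lo = n, hi = num; lo + 1 < hi; lo++, hi--) {
        uint64_t tmp = out[lo];

        out[lo] = out[hi - 1];
        out[hi - 1] = tmp;
    }

    *zero_num = z;
    return n;
}

/* Usage example: four 64-byte pages, of which pages 1 and 3 hold data. */
static bool demo_example(void)
{
    enum { PAGE = 64, NPAGES = 4 };
    uint8_t host[PAGE * NPAGES] = {0};
    const uint64_t offsets[NPAGES] = {0, PAGE, 2 * PAGE, 3 * PAGE};
    uint64_t out[NPAGES];
    uint32_t zero_num = 0;
    uint32_t normal_num;

    host[PAGE + 5] = 0xab;
    host[3 * PAGE] = 0x01;

    normal_num = demo_split_pages(host, PAGE, offsets, NPAGES, out, &zero_num);

    return normal_num == 2 && zero_num == 2 &&
           out[0] == PAGE && out[1] == 3 * PAGE &&
           out[2] == 0 && out[3] == 2 * PAGE;
}
```

The receiver only needs the two counts to find both groups in the array, which is why the offsets of zero pages can share the existing offset[] storage instead of needing a second array in the packet.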
Signed-off-by: Juan Quintela --- migration/multifd.c | 41 +++++++++++++++++++++++++++++++++++++++-- migration/multifd.h | 5 +++++ 2 files changed, 44 insertions(+), 2 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index d28ef0028b..1b994790d5 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -11,6 +11,7 @@ */ #include "qemu/osdep.h" +#include "qemu/cutils.h" #include "qemu/rcu.h" #include "exec/target_page.h" #include "sysemu/sysemu.h" @@ -279,6 +280,12 @@ static void multifd_send_fill_packet(MultiFDSendParams *p) packet->offset[i] = cpu_to_be64(temp); } + for (i = 0; i < p->zero_num; i++) { + /* there are architectures where ram_addr_t is 32 bit */ + uint64_t temp = p->zero[i]; + + packet->offset[p->normal_num + i] = cpu_to_be64(temp); + } } static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) @@ -361,6 +368,18 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) p->normal[i] = offset; } + for (i = 0; i < p->zero_num; i++) { + uint64_t offset = be64_to_cpu(packet->offset[p->normal_num + i]); + + if (offset > (p->block->used_length - p->page_size)) { + error_setg(errp, "multifd: offset too long %" PRIu64 + " (max " RAM_ADDR_FMT ")", + offset, p->block->used_length); + return -1; + } + p->zero[i] = offset; + } + return 0; } @@ -664,6 +683,8 @@ static void *multifd_send_thread(void *opaque) MultiFDSendParams *p = opaque; MigrationThread *thread = NULL; Error *local_err = NULL; + /* qemu older than 8.2 don't understand zero page on multifd channel */ + bool use_zero_page = !migrate_use_main_zero_page(); int ret = 0; bool use_zero_copy_send = migrate_zero_copy_send(); @@ -689,6 +710,7 @@ static void *multifd_send_thread(void *opaque) qemu_mutex_lock(&p->mutex); if (p->pending_job) { + RAMBlock *rb = p->pages->block; uint64_t packet_num = p->packet_num; uint32_t flags; p->normal_num = 0; @@ -701,8 +723,16 @@ static void *multifd_send_thread(void *opaque) } for (int i = 0; i < p->pages->num; 
i++) { - p->normal[p->normal_num] = p->pages->offset[i]; - p->normal_num++; + uint64_t offset = p->pages->offset[i]; + if (use_zero_page && + buffer_is_zero(rb->host + offset, p->page_size)) { + p->zero[p->zero_num] = offset; + p->zero_num++; + ram_release_page(rb->idstr, offset); + } else { + p->normal[p->normal_num] = offset; + p->normal_num++; + } } if (p->normal_num) { @@ -1156,6 +1186,13 @@ static void *multifd_recv_thread(void *opaque) } } + for (int i = 0; i < p->zero_num; i++) { + void *page = p->host + p->zero[i]; + if (!buffer_is_zero(page, p->page_size)) { + memset(page, 0, p->page_size); + } + } + if (flags & MULTIFD_FLAG_SYNC) { qemu_sem_post(&multifd_recv_state->sem_sync); qemu_sem_wait(&p->sem_sync); diff --git a/migration/multifd.h b/migration/multifd.h index d587b0e19c..13762900d4 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -53,6 +53,11 @@ typedef struct { uint32_t unused32[1]; /* Reserved for future use */ uint64_t unused64[3]; /* Reserved for future use */ char ramblock[256]; + /* + * This array contains the pointers to: + * - normal pages (initial normal_pages entries) + * - zero pages (following zero_pages entries) + */ uint64_t offset[]; } __attribute__((packed)) MultiFDPacket_t; From patchwork Tue Nov 14 05:40:16 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 13454802 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id EA2BEC4167D for ; Tue, 14 Nov 2023 05:43:42 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2mBC-0007B4-Mh; Tue, 14 Nov 2023 00:41:42 -0500 Received: from 
eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2mBA-0007AW-HA for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:40 -0500 Received: from mail-qk1-x733.google.com ([2607:f8b0:4864:20::733]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mB8-0001Rk-5P for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:40 -0500 Received: by mail-qk1-x733.google.com with SMTP id af79cd13be357-778927f2dd3so268541385a.2 for ; Mon, 13 Nov 2023 21:41:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940497; x=1700545297; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=8Nmx9BzmrBNJxBIGmirn1Vxg5B7bMVOoZGw0h67pPtY=; b=MoHKJF9lW0d811oRA3mzAXcEdCSso6A6p5z9x/4Ugi4gPOBshpEpk75Pt+wfiVnOEn yU0NaGBR/GPsxhCZ4udaY49KC5H0ESDzXQ4XSlrb6BaN0zIlq+a2srxxiAe+L3rQsg6l 1hVpj3eL+NqZV0+hcL5u5sE2k9y2vHNTA4ATnuzsmCCzbW32xbn2D8tDQoxUGZM76gS3 ZAYpwzukddIr8l1255MZBKUN3I+rtCQO+tNTFnRskR9L8xP9mP/WaUyTbyZtAXZCdsbG onPk2x1p+nOH+EGf+qjf3cYYDp97CeLl/2Tx0Sn34DGT5Eh8WbUswcseaQX63ClMpaQL 18jw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940497; x=1700545297; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=8Nmx9BzmrBNJxBIGmirn1Vxg5B7bMVOoZGw0h67pPtY=; b=QHuA/F+JAnTYWAhme3fCXBHpxClS8I9TkCV9QvvWKHaEPfkLff/5pZwjqrlUBOiSHy sjuquEbIWnV8Lb1vXyQ3yvGRbEYP40dlDNG9ksvUwQd/2iHYcGn1Zx6QpBJQb05thg1Y JPcM4r/V6oje0/oyHW5himwNXYkzJalZ+0nehuCuBUoGqu3fCRBpq8TpK4Gdkf2pBhlx cbc3Ri2LkXq96640KwL5IZkH+ElPrCoqCDmORYn5IySU5OwC8x62adk3/pNs2mSmXsSw vjf8B+TktradRv2uK4SyLXcEkHVm74XT2m1H1g88FyW9wgH2lpKcU+6/beibZPX6TW0U SZBA== X-Gm-Message-State: 
AOJu0YwTTCfIa7QRzm2bMGxN55fwpd0TmVBZ50/jCIxFqAt1Tb97k5bd Rv+Asi+KfV+3D8Xo+fZdmEX/lA== X-Google-Smtp-Source: AGHT+IHF6WtckOoHIkPzxrodqYoZXm45X5ILsL/ALx/rVgjVpPptR2LHeg8c1/Yah86qG1rBVC77jw== X-Received: by 2002:ae9:c314:0:b0:76f:93e:4b2f with SMTP id n20-20020ae9c314000000b0076f093e4b2fmr1118983qkg.38.1699940497313; Mon, 13 Nov 2023 21:41:37 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.41.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:41:37 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Leonardo Bras Subject: [PATCH v2 04/20] So we use multifd to transmit zero pages. Date: Tue, 14 Nov 2023 05:40:16 +0000 Message-Id: <20231114054032.1192027-5-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::733; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x733.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Juan Quintela Signed-off-by: Juan Quintela Reviewed-by: Leonardo Bras --- 
migration/multifd.c | 7 ++++--- migration/options.c | 13 +++++++------ migration/ram.c | 45 ++++++++++++++++++++++++++++++++++++++------- qapi/migration.json | 1 - 4 files changed, 49 insertions(+), 17 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index 1b994790d5..1198ffde9c 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -13,6 +13,7 @@ #include "qemu/osdep.h" #include "qemu/cutils.h" #include "qemu/rcu.h" +#include "qemu/cutils.h" #include "exec/target_page.h" #include "sysemu/sysemu.h" #include "exec/ramblock.h" @@ -459,7 +460,6 @@ static int multifd_send_pages(QEMUFile *f) p->packet_num = multifd_send_state->packet_num++; multifd_send_state->pages = p->pages; p->pages = pages; - qemu_mutex_unlock(&p->mutex); qemu_sem_post(&p->sem); @@ -684,7 +684,7 @@ static void *multifd_send_thread(void *opaque) MigrationThread *thread = NULL; Error *local_err = NULL; /* qemu older than 8.2 don't understand zero page on multifd channel */ - bool use_zero_page = !migrate_use_main_zero_page(); + bool use_multifd_zero_page = !migrate_use_main_zero_page(); int ret = 0; bool use_zero_copy_send = migrate_zero_copy_send(); @@ -713,6 +713,7 @@ static void *multifd_send_thread(void *opaque) RAMBlock *rb = p->pages->block; uint64_t packet_num = p->packet_num; uint32_t flags; + p->normal_num = 0; p->zero_num = 0; @@ -724,7 +725,7 @@ static void *multifd_send_thread(void *opaque) for (int i = 0; i < p->pages->num; i++) { uint64_t offset = p->pages->offset[i]; - if (use_zero_page && + if (use_multifd_zero_page && buffer_is_zero(rb->host + offset, p->page_size)) { p->zero[p->zero_num] = offset; p->zero_num++; diff --git a/migration/options.c b/migration/options.c index 00c0c4a0d6..97d121d4d7 100644 --- a/migration/options.c +++ b/migration/options.c @@ -195,6 +195,7 @@ Property migration_properties[] = { DEFINE_PROP_MIG_CAP("x-block", MIGRATION_CAPABILITY_BLOCK), DEFINE_PROP_MIG_CAP("x-return-path", MIGRATION_CAPABILITY_RETURN_PATH), 
DEFINE_PROP_MIG_CAP("x-multifd", MIGRATION_CAPABILITY_MULTIFD), + DEFINE_PROP_MIG_CAP("x-main-zero-page", MIGRATION_CAPABILITY_MAIN_ZERO_PAGE), DEFINE_PROP_MIG_CAP("x-background-snapshot", MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT), #ifdef CONFIG_LINUX @@ -288,13 +289,9 @@ bool migrate_multifd(void) bool migrate_use_main_zero_page(void) { - //MigrationState *s; - - //s = migrate_get_current(); + MigrationState *s = migrate_get_current(); - // We will enable this when we add the right code. - // return s->enabled_capabilities[MIGRATION_CAPABILITY_MAIN_ZERO_PAGE]; - return true; + return s->capabilities[MIGRATION_CAPABILITY_MAIN_ZERO_PAGE]; } bool migrate_pause_before_switchover(void) @@ -457,6 +454,7 @@ INITIALIZE_MIGRATE_CAPS_SET(check_caps_background_snapshot, MIGRATION_CAPABILITY_LATE_BLOCK_ACTIVATE, MIGRATION_CAPABILITY_RETURN_PATH, MIGRATION_CAPABILITY_MULTIFD, + MIGRATION_CAPABILITY_MAIN_ZERO_PAGE, MIGRATION_CAPABILITY_PAUSE_BEFORE_SWITCHOVER, MIGRATION_CAPABILITY_AUTO_CONVERGE, MIGRATION_CAPABILITY_RELEASE_RAM, @@ -534,6 +532,9 @@ bool migrate_caps_check(bool *old_caps, bool *new_caps, Error **errp) error_setg(errp, "Postcopy is not yet compatible with multifd"); return false; } + if (new_caps[MIGRATION_CAPABILITY_MAIN_ZERO_PAGE]) { + error_setg(errp, "Postcopy is not yet compatible with main zero page"); + } } if (new_caps[MIGRATION_CAPABILITY_BACKGROUND_SNAPSHOT]) { diff --git a/migration/ram.c b/migration/ram.c index 8c7886ab79..f7a42feff2 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -2059,17 +2059,42 @@ static int ram_save_target_page_legacy(RAMState *rs, PageSearchStatus *pss) if (save_zero_page(rs, pss, offset)) { return 1; } - /* - * Do not use multifd in postcopy as one whole host page should be - * placed. Meanwhile postcopy requires atomic update of pages, so even - * if host page size == guest page size the dest guest during run may - * still see partially copied pages which is data corruption. + * Do not use multifd for: + * 1.
Compression as the first page in the new block should be posted out + * before sending the compressed page + * 2. In postcopy as one whole host page should be placed */ - if (migrate_multifd() && !migration_in_postcopy()) { + if (!migrate_compress() && migrate_multifd() && !migration_in_postcopy()) { + return ram_save_multifd_page(pss->pss_channel, block, offset); + } + + return ram_save_page(rs, pss); +} + +/** + * ram_save_target_page_multifd: save one target page + * + * Returns the number of pages written + * + * @rs: current RAM state + * @pss: data about the page we want to send + */ +static int ram_save_target_page_multifd(RAMState *rs, PageSearchStatus *pss) +{ + RAMBlock *block = pss->block; + ram_addr_t offset = ((ram_addr_t)pss->page) << TARGET_PAGE_BITS; + int res; + + if (!migration_in_postcopy()) { return ram_save_multifd_page(pss->pss_channel, block, offset); } + res = save_zero_page(rs, pss, offset); + if (res > 0) { + return res; + } + return ram_save_page(rs, pss); } @@ -2982,9 +3007,15 @@ static int ram_save_setup(QEMUFile *f, void *opaque) } migration_ops = g_malloc0(sizeof(MigrationOps)); - migration_ops->ram_save_target_page = ram_save_target_page_legacy; + + if (migrate_multifd() && !migrate_use_main_zero_page()) { + migration_ops->ram_save_target_page = ram_save_target_page_multifd; + } else { + migration_ops->ram_save_target_page = ram_save_target_page_legacy; + } qemu_mutex_unlock_iothread(); + ret = multifd_send_sync_main(f); qemu_mutex_lock_iothread(); if (ret < 0) { diff --git a/qapi/migration.json b/qapi/migration.json index 09e4393591..9783289bfc 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -531,7 +531,6 @@ # and can result in more stable read performance. Requires KVM # with accelerator property "dirty-ring-size" set. (Since 8.1) # -# # @main-zero-page: If enabled, the detection of zero pages will be # done on the main thread. Otherwise it is done on # the multifd threads. 
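Reviewer sketch (not part of the patch): the send-thread loop changed in migration/multifd.c above sorts each page offset into either the zero list or the normal list, and only scans for zeroes when detection has been moved off the main thread. The standalone C below illustrates that split under stated assumptions. page_is_zero() is a hypothetical byte-scan stand-in for QEMU's buffer_is_zero(), classify_pages() and its parameters are illustrative names, and the else branch that fills the normal list is assumed from context because the hunk above only shows the zero branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for QEMU's buffer_is_zero(): a plain byte scan. */
static bool page_is_zero(const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (buf[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Split page offsets into zero[] and normal[].
 * When use_multifd_zero_page is false, no page is scanned and every
 * offset is treated as a normal page (assumed behaviour, see lead-in).
 * Returns the number of zero pages; *normal_num receives the rest.
 */
static size_t classify_pages(const uint8_t *host, const uint64_t *offset,
                             size_t num, size_t page_size,
                             bool use_multifd_zero_page,
                             uint64_t *zero, uint64_t *normal,
                             size_t *normal_num)
{
    size_t zero_num = 0;

    assert(page_size > 0);
    *normal_num = 0;
    for (size_t i = 0; i < num; i++) {
        if (use_multifd_zero_page &&
            page_is_zero(host + offset[i], page_size)) {
            zero[zero_num++] = offset[i];
        } else {
            normal[(*normal_num)++] = offset[i];
        }
    }
    return zero_num;
}
```

With the capability semantics described in qapi/migration.json, a caller would pass use_multifd_zero_page = !migrate_use_main_zero_page(), so enabling main-zero-page keeps the old behaviour in which the multifd threads never scan pages.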
From patchwork Tue Nov 14 05:40:17 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 13454808 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6951EC4332F for ; Tue, 14 Nov 2023 05:44:15 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2mBL-0007DR-Js; Tue, 14 Nov 2023 00:41:51 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2mBK-0007DD-9q for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:50 -0500 Received: from mail-qk1-x72b.google.com ([2607:f8b0:4864:20::72b]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mBA-0001S3-L1 for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:42 -0500 Received: by mail-qk1-x72b.google.com with SMTP id af79cd13be357-77897c4ac1fso337145285a.3 for ; Mon, 13 Nov 2023 21:41:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940499; x=1700545299; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=46PzmwJ0rU1BgPMUK6fRjq2lrI+tOWnmggKPgnPdDUc=; b=jzhW8etf1OERKsagiH41GruwoD9J3vrMJu3EV/W4yyu8jBzGgjvnkLvCd1FRTmDXKX zDLoiCqXkx4cz5z3iGiIaKl9wT9uBS7zslobacUAsL3y7UTOE2kg6VZpRWBXWEJU5UlW /LlWvvIw9rsEKnN6r2B1XsA3oPChIte5+AggdRXnJRzLgxE+J8TZA81hqaRdtzgrKBf8 5DJyR/9JeyQdmkPll8Sae8utxG8BrnsmsRp+bZhfp5qOyYlSw6QzNtcb4tQQn6tBb+uz 
ok9SGaXTJ/N7h4uYEqPbOZdtfNenzQovo9QKb0qi1FuLXMnEfSWugyTHTiFfMsGxwj9J QTHQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940499; x=1700545299; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=46PzmwJ0rU1BgPMUK6fRjq2lrI+tOWnmggKPgnPdDUc=; b=Y+nZLCHYBCeqQXPu1b78TzfSX5dpdesLFM5xnyCo5VhzOKZVPkOn/HmzURcq/gUfHg gCYQd8uEq0SVHyiNvHMR53kiCNl4D34+ZFNJgZG2B7babvwSf6fNOGasHXjsC+jpdQff qzwUnM3pblcIRfcBoGi1j/RI5VqqaBrGs8Hea6WvissfpH5mgHi83PgCKTs2kqGiVqZO iFvQ7McMaHoTZrLsKNDiAsk5cFbwU1JoyAkZGPj7r3RCV0Awk4+QkAL96T1SgGvALyEq DnIfxgK7tSNnZPgGJJEKhU4OpzAW1uBSxm8IyHBhOlJPm1BQTGrBROcTEkbVZ/WJASZ0 u1Qw== X-Gm-Message-State: AOJu0Yx7xC50OzEX6d5HYY1aQG8Z+DqYBR7bC4nlOEVni6A49/PuXdsT eSgOmfruiSKTdFs+oLvMwZB2ag== X-Google-Smtp-Source: AGHT+IHu1ZUJ3vrKX++qgwRLRZzIaj64AoP0XD7TSD8cVd1C4uYmtsCPB9epKzTD8/rrgRyUQRZf+g== X-Received: by 2002:a05:620a:6a83:b0:778:8dc1:bb7b with SMTP id ud3-20020a05620a6a8300b007788dc1bb7bmr1260865qkn.27.1699940499397; Mon, 13 Nov 2023 21:41:39 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.41.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:41:39 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH v2 05/20] meson: Introduce new instruction set enqcmd to the build system. 
Date: Tue, 14 Nov 2023 05:40:17 +0000 Message-Id: <20231114054032.1192027-6-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::72b; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x72b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Enable instruction set enqcmd in build. Signed-off-by: Hao Xiang --- meson.build | 2 ++ meson_options.txt | 2 ++ scripts/meson-buildoptions.sh | 3 +++ 3 files changed, 7 insertions(+) diff --git a/meson.build b/meson.build index ec01f8b138..1292ab78a3 100644 --- a/meson.build +++ b/meson.build @@ -2708,6 +2708,8 @@ config_host_data.set('CONFIG_AVX512BW_OPT', get_option('avx512bw') \ int main(int argc, char *argv[]) { return bar(argv[0]); } '''), error_message: 'AVX512BW not available').allowed()) +config_host_data.set('CONFIG_DSA_OPT', get_option('enqcmd')) + # For both AArch64 and AArch32, detect if builtins are available. 
config_host_data.set('CONFIG_ARM_AES_BUILTIN', cc.compiles(''' #include diff --git a/meson_options.txt b/meson_options.txt index c9baeda639..6fe8aca181 100644 --- a/meson_options.txt +++ b/meson_options.txt @@ -121,6 +121,8 @@ option('avx512f', type: 'feature', value: 'disabled', description: 'AVX512F optimizations') option('avx512bw', type: 'feature', value: 'auto', description: 'AVX512BW optimizations') +option('enqcmd', type: 'boolean', value: false, + description: 'ENQCMD optimizations') option('keyring', type: 'feature', value: 'auto', description: 'Linux keyring support') option('libkeyutils', type: 'feature', value: 'auto', diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh index 680fa3f581..bf139e3fb4 100644 --- a/scripts/meson-buildoptions.sh +++ b/scripts/meson-buildoptions.sh @@ -93,6 +93,7 @@ meson_options_help() { printf "%s\n" ' avx2 AVX2 optimizations' printf "%s\n" ' avx512bw AVX512BW optimizations' printf "%s\n" ' avx512f AVX512F optimizations' + printf "%s\n" ' enqcmd ENQCMD optimizations' printf "%s\n" ' blkio libblkio block device driver' printf "%s\n" ' bochs bochs image format support' printf "%s\n" ' bpf eBPF support' @@ -240,6 +241,8 @@ _meson_option_parse() { --disable-avx512bw) printf "%s" -Davx512bw=disabled ;; --enable-avx512f) printf "%s" -Davx512f=enabled ;; --disable-avx512f) printf "%s" -Davx512f=disabled ;; + --enable-enqcmd) printf "%s" -Denqcmd=true ;; + --disable-enqcmd) printf "%s" -Denqcmd=false ;; --enable-gcov) printf "%s" -Db_coverage=true ;; --disable-gcov) printf "%s" -Db_coverage=false ;; --enable-lto) printf "%s" -Db_lto=true ;; From patchwork Tue Nov 14 05:40:18 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 13454801 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using
TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D4CAAC4332F for ; Tue, 14 Nov 2023 05:43:40 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2mBQ-0007ET-0t; Tue, 14 Nov 2023 00:41:56 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2mBN-0007Dv-7z for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:53 -0500 Received: from mail-qk1-x731.google.com ([2607:f8b0:4864:20::731]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mBI-0001SL-49 for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:53 -0500 Received: by mail-qk1-x731.google.com with SMTP id af79cd13be357-7781bc3783fso347003585a.1 for ; Mon, 13 Nov 2023 21:41:42 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940501; x=1700545301; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=0JqkkF9dhAzE0OkPFi/BNF1Vq1buivPQjPn42c685Sk=; b=IcuiRf41gk/rp00y761mN3jG2epWDeVHwSp3OA5yDmD7wqxJeYubQCywco4/rmC0+n d9uNSE9IMbVNKPXXd9uDMgG3e5wf8ukJ9EzwDxKZNqw/Uu9vmwpbuCa5sK2CW/btZhTM 8Yc8qqlyBWDtBjI+QRfgtGrVjDKHFHXAdrzkBFLbrmzKJvpeYGGCD0dUYRvAr9HN0+0Z ufe4ymCyB0hHzrzGxAGCd7dTzzokVnoDy1ANUkZkU9r5fMAgptbA18YXwTegagT3zTYc ZJhSOxhRPaJEUC1KF1NGGzIbi+PwsB8Fegyue8WE46egl1ox6meZtvAvRtYi6HFFowfr kgTg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940501; x=1700545301; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=0JqkkF9dhAzE0OkPFi/BNF1Vq1buivPQjPn42c685Sk=; 
b=pu4NdI0j0zZGTV3f+a5EHjUJDJuvmU91jEbduAd32mQ09FCi/dcXFuEPvFHE6yQj2R EIPQZVW5scJFBJUbxd+ir0s4ygaf5gh/EVX5R1s8WqonS79KoiqQFf3Hwdp2pds7EMWq 3QToFAnW4ToMX7ulc1NWehSxvThafKIXW9z/zQNz2bi80rcZSbYk9vbmQGyU+yX99KyG ydUWqDPQvx+zNZ4YctI7eZtDrK+wMNJfJhmmoExnHGI5ZFJFb76ERlK7oPCHCpE/ehlt H2CQYzU55eQsDDdwG3b1gt4QhpEzI/m/XaVe7YQ6hJwJVQ+MytK1G7fwPWY2quGGdM1X 5fDA== X-Gm-Message-State: AOJu0YwQSjSacAB9EgX7d/HGYivB2AICWS5MZ+YZdfEFT8/xKXHtkYiP iDcNbP1qgjiBHFu7gBRJg2Aukg== X-Google-Smtp-Source: AGHT+IG+fegoSVLIADBs+f0zGCxXl0L/g1xJochqcvPepWp6uvXOzJpHoi4mr+4DRpiuvuyznq78zA== X-Received: by 2002:a05:620a:4448:b0:779:f0a1:3128 with SMTP id w8-20020a05620a444800b00779f0a13128mr1416435qkp.63.1699940501131; Mon, 13 Nov 2023 21:41:41 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.41.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:41:40 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH v2 06/20] util/dsa: Add dependency idxd. 
Date: Tue, 14 Nov 2023 05:40:18 +0000 Message-Id: <20231114054032.1192027-7-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::731; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x731.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Idxd is the device driver for DSA (Intel Data Streaming Accelerator). The driver has been fully functional since Linux kernel 5.19. This change adds the driver's header file, which is needed for userspace development. Signed-off-by: Hao Xiang --- linux-headers/linux/idxd.h | 356 +++++++++++++++++++++++++++++++++++++ 1 file changed, 356 insertions(+) create mode 100644 linux-headers/linux/idxd.h diff --git a/linux-headers/linux/idxd.h b/linux-headers/linux/idxd.h new file mode 100644 index 0000000000..1d553bedbd --- /dev/null +++ b/linux-headers/linux/idxd.h @@ -0,0 +1,356 @@ +/* SPDX-License-Identifier: LGPL-2.1 WITH Linux-syscall-note */ +/* Copyright(c) 2019 Intel Corporation. All rights rsvd.
*/ +#ifndef _USR_IDXD_H_ +#define _USR_IDXD_H_ + +#ifdef __KERNEL__ +#include +#else +#include +#endif + +/* Driver command error status */ +enum idxd_scmd_stat { + IDXD_SCMD_DEV_ENABLED = 0x80000010, + IDXD_SCMD_DEV_NOT_ENABLED = 0x80000020, + IDXD_SCMD_WQ_ENABLED = 0x80000021, + IDXD_SCMD_DEV_DMA_ERR = 0x80020000, + IDXD_SCMD_WQ_NO_GRP = 0x80030000, + IDXD_SCMD_WQ_NO_NAME = 0x80040000, + IDXD_SCMD_WQ_NO_SVM = 0x80050000, + IDXD_SCMD_WQ_NO_THRESH = 0x80060000, + IDXD_SCMD_WQ_PORTAL_ERR = 0x80070000, + IDXD_SCMD_WQ_RES_ALLOC_ERR = 0x80080000, + IDXD_SCMD_PERCPU_ERR = 0x80090000, + IDXD_SCMD_DMA_CHAN_ERR = 0x800a0000, + IDXD_SCMD_CDEV_ERR = 0x800b0000, + IDXD_SCMD_WQ_NO_SWQ_SUPPORT = 0x800c0000, + IDXD_SCMD_WQ_NONE_CONFIGURED = 0x800d0000, + IDXD_SCMD_WQ_NO_SIZE = 0x800e0000, + IDXD_SCMD_WQ_NO_PRIV = 0x800f0000, + IDXD_SCMD_WQ_IRQ_ERR = 0x80100000, + IDXD_SCMD_WQ_USER_NO_IOMMU = 0x80110000, +}; + +#define IDXD_SCMD_SOFTERR_MASK 0x80000000 +#define IDXD_SCMD_SOFTERR_SHIFT 16 + +/* Descriptor flags */ +#define IDXD_OP_FLAG_FENCE 0x0001 +#define IDXD_OP_FLAG_BOF 0x0002 +#define IDXD_OP_FLAG_CRAV 0x0004 +#define IDXD_OP_FLAG_RCR 0x0008 +#define IDXD_OP_FLAG_RCI 0x0010 +#define IDXD_OP_FLAG_CRSTS 0x0020 +#define IDXD_OP_FLAG_CR 0x0080 +#define IDXD_OP_FLAG_CC 0x0100 +#define IDXD_OP_FLAG_ADDR1_TCS 0x0200 +#define IDXD_OP_FLAG_ADDR2_TCS 0x0400 +#define IDXD_OP_FLAG_ADDR3_TCS 0x0800 +#define IDXD_OP_FLAG_CR_TCS 0x1000 +#define IDXD_OP_FLAG_STORD 0x2000 +#define IDXD_OP_FLAG_DRDBK 0x4000 +#define IDXD_OP_FLAG_DSTS 0x8000 + +/* IAX */ +#define IDXD_OP_FLAG_RD_SRC2_AECS 0x010000 +#define IDXD_OP_FLAG_RD_SRC2_2ND 0x020000 +#define IDXD_OP_FLAG_WR_SRC2_AECS_COMP 0x040000 +#define IDXD_OP_FLAG_WR_SRC2_AECS_OVFL 0x080000 +#define IDXD_OP_FLAG_SRC2_STS 0x100000 +#define IDXD_OP_FLAG_CRC_RFC3720 0x200000 + +/* Opcode */ +enum dsa_opcode { + DSA_OPCODE_NOOP = 0, + DSA_OPCODE_BATCH, + DSA_OPCODE_DRAIN, + DSA_OPCODE_MEMMOVE, + DSA_OPCODE_MEMFILL, + DSA_OPCODE_COMPARE, + 
DSA_OPCODE_COMPVAL, + DSA_OPCODE_CR_DELTA, + DSA_OPCODE_AP_DELTA, + DSA_OPCODE_DUALCAST, + DSA_OPCODE_CRCGEN = 0x10, + DSA_OPCODE_COPY_CRC, + DSA_OPCODE_DIF_CHECK, + DSA_OPCODE_DIF_INS, + DSA_OPCODE_DIF_STRP, + DSA_OPCODE_DIF_UPDT, + DSA_OPCODE_CFLUSH = 0x20, +}; + +enum iax_opcode { + IAX_OPCODE_NOOP = 0, + IAX_OPCODE_DRAIN = 2, + IAX_OPCODE_MEMMOVE, + IAX_OPCODE_DECOMPRESS = 0x42, + IAX_OPCODE_COMPRESS, + IAX_OPCODE_CRC64, + IAX_OPCODE_ZERO_DECOMP_32 = 0x48, + IAX_OPCODE_ZERO_DECOMP_16, + IAX_OPCODE_ZERO_COMP_32 = 0x4c, + IAX_OPCODE_ZERO_COMP_16, + IAX_OPCODE_SCAN = 0x50, + IAX_OPCODE_SET_MEMBER, + IAX_OPCODE_EXTRACT, + IAX_OPCODE_SELECT, + IAX_OPCODE_RLE_BURST, + IAX_OPCODE_FIND_UNIQUE, + IAX_OPCODE_EXPAND, +}; + +/* Completion record status */ +enum dsa_completion_status { + DSA_COMP_NONE = 0, + DSA_COMP_SUCCESS, + DSA_COMP_SUCCESS_PRED, + DSA_COMP_PAGE_FAULT_NOBOF, + DSA_COMP_PAGE_FAULT_IR, + DSA_COMP_BATCH_FAIL, + DSA_COMP_BATCH_PAGE_FAULT, + DSA_COMP_DR_OFFSET_NOINC, + DSA_COMP_DR_OFFSET_ERANGE, + DSA_COMP_DIF_ERR, + DSA_COMP_BAD_OPCODE = 0x10, + DSA_COMP_INVALID_FLAGS, + DSA_COMP_NOZERO_RESERVE, + DSA_COMP_XFER_ERANGE, + DSA_COMP_DESC_CNT_ERANGE, + DSA_COMP_DR_ERANGE, + DSA_COMP_OVERLAP_BUFFERS, + DSA_COMP_DCAST_ERR, + DSA_COMP_DESCLIST_ALIGN, + DSA_COMP_INT_HANDLE_INVAL, + DSA_COMP_CRA_XLAT, + DSA_COMP_CRA_ALIGN, + DSA_COMP_ADDR_ALIGN, + DSA_COMP_PRIV_BAD, + DSA_COMP_TRAFFIC_CLASS_CONF, + DSA_COMP_PFAULT_RDBA, + DSA_COMP_HW_ERR1, + DSA_COMP_HW_ERR_DRB, + DSA_COMP_TRANSLATION_FAIL, +}; + +enum iax_completion_status { + IAX_COMP_NONE = 0, + IAX_COMP_SUCCESS, + IAX_COMP_PAGE_FAULT_IR = 0x04, + IAX_COMP_ANALYTICS_ERROR = 0x0a, + IAX_COMP_OUTBUF_OVERFLOW, + IAX_COMP_BAD_OPCODE = 0x10, + IAX_COMP_INVALID_FLAGS, + IAX_COMP_NOZERO_RESERVE, + IAX_COMP_INVALID_SIZE, + IAX_COMP_OVERLAP_BUFFERS = 0x16, + IAX_COMP_INT_HANDLE_INVAL = 0x19, + IAX_COMP_CRA_XLAT, + IAX_COMP_CRA_ALIGN, + IAX_COMP_ADDR_ALIGN, + IAX_COMP_PRIV_BAD, + IAX_COMP_TRAFFIC_CLASS_CONF, + 
IAX_COMP_PFAULT_RDBA, + IAX_COMP_HW_ERR1, + IAX_COMP_HW_ERR_DRB, + IAX_COMP_TRANSLATION_FAIL, + IAX_COMP_PRS_TIMEOUT, + IAX_COMP_WATCHDOG, + IAX_COMP_INVALID_COMP_FLAG = 0x30, + IAX_COMP_INVALID_FILTER_FLAG, + IAX_COMP_INVALID_INPUT_SIZE, + IAX_COMP_INVALID_NUM_ELEMS, + IAX_COMP_INVALID_SRC1_WIDTH, + IAX_COMP_INVALID_INVERT_OUT, +}; + +#define DSA_COMP_STATUS_MASK 0x7f +#define DSA_COMP_STATUS_WRITE 0x80 + +struct dsa_hw_desc { + uint32_t pasid:20; + uint32_t rsvd:11; + uint32_t priv:1; + uint32_t flags:24; + uint32_t opcode:8; + uint64_t completion_addr; + union { + uint64_t src_addr; + uint64_t rdback_addr; + uint64_t pattern; + uint64_t desc_list_addr; + }; + union { + uint64_t dst_addr; + uint64_t rdback_addr2; + uint64_t src2_addr; + uint64_t comp_pattern; + }; + union { + uint32_t xfer_size; + uint32_t desc_count; + }; + uint16_t int_handle; + uint16_t rsvd1; + union { + uint8_t expected_res; + /* create delta record */ + struct { + uint64_t delta_addr; + uint32_t max_delta_size; + uint32_t delt_rsvd; + uint8_t expected_res_mask; + }; + uint32_t delta_rec_size; + uint64_t dest2; + /* CRC */ + struct { + uint32_t crc_seed; + uint32_t crc_rsvd; + uint64_t seed_addr; + }; + /* DIF check or strip */ + struct { + uint8_t src_dif_flags; + uint8_t dif_chk_res; + uint8_t dif_chk_flags; + uint8_t dif_chk_res2[5]; + uint32_t chk_ref_tag_seed; + uint16_t chk_app_tag_mask; + uint16_t chk_app_tag_seed; + }; + /* DIF insert */ + struct { + uint8_t dif_ins_res; + uint8_t dest_dif_flag; + uint8_t dif_ins_flags; + uint8_t dif_ins_res2[13]; + uint32_t ins_ref_tag_seed; + uint16_t ins_app_tag_mask; + uint16_t ins_app_tag_seed; + }; + /* DIF update */ + struct { + uint8_t src_upd_flags; + uint8_t upd_dest_flags; + uint8_t dif_upd_flags; + uint8_t dif_upd_res[5]; + uint32_t src_ref_tag_seed; + uint16_t src_app_tag_mask; + uint16_t src_app_tag_seed; + uint32_t dest_ref_tag_seed; + uint16_t dest_app_tag_mask; + uint16_t dest_app_tag_seed; + }; + + uint8_t op_specific[24]; + }; +} 
__attribute__((packed)); + +struct iax_hw_desc { + uint32_t pasid:20; + uint32_t rsvd:11; + uint32_t priv:1; + uint32_t flags:24; + uint32_t opcode:8; + uint64_t completion_addr; + uint64_t src1_addr; + uint64_t dst_addr; + uint32_t src1_size; + uint16_t int_handle; + union { + uint16_t compr_flags; + uint16_t decompr_flags; + }; + uint64_t src2_addr; + uint32_t max_dst_size; + uint32_t src2_size; + uint32_t filter_flags; + uint32_t num_inputs; +} __attribute__((packed)); + +struct dsa_raw_desc { + uint64_t field[8]; +} __attribute__((packed)); + +/* + * The status field will be modified by hardware, therefore it should be + * volatile and prevent the compiler from optimize the read. + */ +struct dsa_completion_record { + volatile uint8_t status; + union { + uint8_t result; + uint8_t dif_status; + }; + uint16_t rsvd; + uint32_t bytes_completed; + uint64_t fault_addr; + union { + /* common record */ + struct { + uint32_t invalid_flags:24; + uint32_t rsvd2:8; + }; + + uint32_t delta_rec_size; + uint64_t crc_val; + + /* DIF check & strip */ + struct { + uint32_t dif_chk_ref_tag; + uint16_t dif_chk_app_tag_mask; + uint16_t dif_chk_app_tag; + }; + + /* DIF insert */ + struct { + uint64_t dif_ins_res; + uint32_t dif_ins_ref_tag; + uint16_t dif_ins_app_tag_mask; + uint16_t dif_ins_app_tag; + }; + + /* DIF update */ + struct { + uint32_t dif_upd_src_ref_tag; + uint16_t dif_upd_src_app_tag_mask; + uint16_t dif_upd_src_app_tag; + uint32_t dif_upd_dest_ref_tag; + uint16_t dif_upd_dest_app_tag_mask; + uint16_t dif_upd_dest_app_tag; + }; + + uint8_t op_specific[16]; + }; +} __attribute__((packed)); + +struct dsa_raw_completion_record { + uint64_t field[4]; +} __attribute__((packed)); + +struct iax_completion_record { + volatile uint8_t status; + uint8_t error_code; + uint16_t rsvd; + uint32_t bytes_completed; + uint64_t fault_addr; + uint32_t invalid_flags; + uint32_t rsvd2; + uint32_t output_size; + uint8_t output_bits; + uint8_t rsvd3; + uint16_t xor_csum; + uint32_t crc; + 
uint32_t min; + uint32_t max; + uint32_t sum; + uint64_t rsvd4[2]; +} __attribute__((packed)); + +struct iax_raw_completion_record { + uint64_t field[8]; +} __attribute__((packed)); + +#endif From patchwork Tue Nov 14 05:40:19 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 13454803 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 11CEEC4332F for ; Tue, 14 Nov 2023 05:43:46 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2mBT-0007FP-Pt; Tue, 14 Nov 2023 00:41:59 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2mBS-0007F5-TT for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:58 -0500 Received: from mail-qv1-xf2f.google.com ([2607:f8b0:4864:20::f2f]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mBL-0001SS-2t for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:58 -0500 Received: by mail-qv1-xf2f.google.com with SMTP id 6a1803df08f44-6707401e22eso31967646d6.2 for ; Mon, 13 Nov 2023 21:41:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940502; x=1700545302; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ZfD9z1HLcfPZkYP1HElkReRnlB4vX27c4pSPWIDl3KA=; b=fE8W46W0rzJReqDuX9bNB2BDcQd6u6fQMbb7nz3f/8vRBt9zeViG5sQ+BCfnXcooWm 
ZgfEwtDGzcRYiXvuwV2MGBidd7Q7T/0dbYw66/2gHwtrGx76yLCAWFNh6nDcOhkGzR2U epfycjpwZHTwdDx1RIDoHf4NPwy04tdC6ARGWuiFjX92HPFrTwmX21RbtjGu2auxR1VE rNYPyV+pZTi4KczCcMxeLNNBXSLallMNDsvYGj4P6SL9NQ52xPPsBtkx5MSE3+7CKSSo ZfoPxU/QsJJpK1Z3OGZDq73iXcZeV4rlBWSmUuGjPBG9vV1nDl6RaauGzs8MEi7Ugyuq eCZg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940502; x=1700545302; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ZfD9z1HLcfPZkYP1HElkReRnlB4vX27c4pSPWIDl3KA=; b=kVOrWaLqOJQIUFSojOu/IeB+t8zMRu+m09qh9ea54PiyyIu9OW0MHGuDv5mQyre2iF rnGI1Uaruyt6/L4DgK6v1dCEacFzP8Ka2tH+5oRSiT5N2kSvdWEEYMkub3KhNbSzEC34 8xGqFNLRrY+RDTvZSIm84eVHMbhzl/S0X7Zt0Y2u56BinBbm6gma8dJr2LXDuN8dFtof srsE70CO3vHXLJr43gu0j/Cm4Ri67mhAkP48z1cjyy6HQ7Itp/7NTHAHEHpKCfh839vk 9pJlg+sps4ysKuLGhq7FzCxHtM5+z8ldCdta1DJhKjzVFl3PmB+54WldJ/wbvA3V8nnZ ca/w== X-Gm-Message-State: AOJu0YwdKjokYhW/LQNIj98YJv83c3AegQ520y4H0GHMp659JBPiiguB iJZwZP8/bCVkbjV0zFaFWKzlkw== X-Google-Smtp-Source: AGHT+IG/wfd/yn52jPopbWbNFjzu9Vcx6W9F6zyrrR3O+441rjArYi16j1d8TPG3Xw7oJpwiN8eHDw== X-Received: by 2002:a05:6214:400e:b0:675:5058:d028 with SMTP id kd14-20020a056214400e00b006755058d028mr1692818qvb.10.1699940502626; Mon, 13 Nov 2023 21:41:42 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.41.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:41:42 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH v2 07/20] util/dsa: Implement DSA device start and stop logic. 
Date: Tue, 14 Nov 2023 05:40:19 +0000 Message-Id: <20231114054032.1192027-8-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::f2f; envelope-from=hao.xiang@bytedance.com; helo=mail-qv1-xf2f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org * DSA device open and close. * DSA group contains multiple DSA devices. * DSA group configure/start/stop/clean. Signed-off-by: Hao Xiang Signed-off-by: Bryan Zhang --- include/qemu/dsa.h | 49 +++++++ util/dsa.c | 338 +++++++++++++++++++++++++++++++++++++++++++++ util/meson.build | 1 + 3 files changed, 388 insertions(+) create mode 100644 include/qemu/dsa.h create mode 100644 util/dsa.c diff --git a/include/qemu/dsa.h b/include/qemu/dsa.h new file mode 100644 index 0000000000..30246b507e --- /dev/null +++ b/include/qemu/dsa.h @@ -0,0 +1,49 @@ +#ifndef QEMU_DSA_H +#define QEMU_DSA_H + +#include "qemu/thread.h" +#include "qemu/queue.h" + +#ifdef CONFIG_DSA_OPT + +#pragma GCC push_options +#pragma GCC target("enqcmd") + +#include +#include "x86intrin.h" + +#endif + +/** + * @brief Initializes DSA devices. + * + * @param dsa_parameter A list of DSA device paths from the migration parameter. + * @return int Zero if successful, otherwise non-zero.
+ */ +int dsa_init(const char *dsa_parameter); + +/** + * @brief Start logic to enable using DSA. + */ +void dsa_start(void); + +/** + * @brief Stop logic to clean up DSA by halting the device group and cleaning up + * the completion thread. + */ +void dsa_stop(void); + +/** + * @brief Clean up system resources created for DSA offloading. + * This function is called during QEMU process teardown. + */ +void dsa_cleanup(void); + +/** + * @brief Check if DSA is running. + * + * @return True if DSA is running, otherwise false. + */ +bool dsa_is_running(void); + +#endif \ No newline at end of file diff --git a/util/dsa.c b/util/dsa.c new file mode 100644 index 0000000000..8edaa892ec --- /dev/null +++ b/util/dsa.c @@ -0,0 +1,338 @@ +/* + * Use Intel Data Streaming Accelerator to offload certain background + * operations. + * + * Copyright (c) 2023 Hao Xiang + * Bryan Zhang + * + * Permission is hereby granted, free of charge, to any person obtaining a copy + * of this software and associated documentation files (the "Software"), to deal + * in the Software without restriction, including without limitation the rights + * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell + * copies of the Software, and to permit persons to whom the Software is + * furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, + * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN + * THE SOFTWARE. 
+ */ + +#include "qemu/osdep.h" +#include "qemu/queue.h" +#include "qemu/memalign.h" +#include "qemu/lockable.h" +#include "qemu/cutils.h" +#include "qemu/dsa.h" +#include "qemu/bswap.h" +#include "qemu/error-report.h" +#include "qemu/rcu.h" + +#ifdef CONFIG_DSA_OPT + +#pragma GCC push_options +#pragma GCC target("enqcmd") + +#include +#include "x86intrin.h" + +#define DSA_WQ_SIZE 4096 +#define MAX_DSA_DEVICES 16 + +typedef QSIMPLEQ_HEAD(dsa_task_queue, buffer_zero_batch_task) dsa_task_queue; + +struct dsa_device { + void *work_queue; +}; + +struct dsa_device_group { + struct dsa_device *dsa_devices; + int num_dsa_devices; + uint32_t index; + bool running; + QemuMutex task_queue_lock; + QemuCond task_queue_cond; + dsa_task_queue task_queue; +}; + +uint64_t max_retry_count; +static struct dsa_device_group dsa_group; + + +/** + * @brief This function opens a DSA device's work queue and + * maps the DSA device memory into the current process. + * + * @param dsa_wq_path A pointer to the DSA device work queue's file path. + * @return A pointer to the mapped memory. + */ +static void * +map_dsa_device(const char *dsa_wq_path) +{ + void *dsa_device; + int fd; + + fd = open(dsa_wq_path, O_RDWR); + if (fd < 0) { + fprintf(stderr, "open %s failed with errno = %d.\n", + dsa_wq_path, errno); + return MAP_FAILED; + } + dsa_device = mmap(NULL, DSA_WQ_SIZE, PROT_WRITE, + MAP_SHARED | MAP_POPULATE, fd, 0); + close(fd); + if (dsa_device == MAP_FAILED) { + fprintf(stderr, "mmap failed with errno = %d.\n", errno); + return MAP_FAILED; + } + return dsa_device; +} + +/** + * @brief Initializes a DSA device structure. + * + * @param instance A pointer to the DSA device. + * @param work_queue A pointer to the DSA work queue. + */ +static void +dsa_device_init(struct dsa_device *instance, + void *dsa_work_queue) +{ + instance->work_queue = dsa_work_queue; +} + +/** + * @brief Cleans up a DSA device structure. + * + * @param instance A pointer to the DSA device to cleanup. 
+ */
+static void
+dsa_device_cleanup(struct dsa_device *instance)
+{
+    if (instance->work_queue != MAP_FAILED) {
+        munmap(instance->work_queue, DSA_WQ_SIZE);
+    }
+}
+
+/**
+ * @brief Initializes a DSA device group.
+ *
+ * @param group A pointer to the DSA device group.
+ * @param dsa_parameter A list of DSA device paths from the migration parameter.
+ *
+ * @return Zero if successful, non-zero otherwise.
+ */
+static int
+dsa_device_group_init(struct dsa_device_group *group,
+                      const char *dsa_parameter)
+{
+    if (dsa_parameter == NULL || strlen(dsa_parameter) == 0) {
+        return 0;
+    }
+
+    int ret = 0;
+    char *local_dsa_parameter = g_strdup(dsa_parameter);
+    const char *dsa_path[MAX_DSA_DEVICES];
+    int num_dsa_devices = 0;
+    char delim[2] = " ";
+
+    char *current_dsa_path = strtok(local_dsa_parameter, delim);
+
+    while (current_dsa_path != NULL) {
+        dsa_path[num_dsa_devices++] = current_dsa_path;
+        if (num_dsa_devices == MAX_DSA_DEVICES) {
+            break;
+        }
+        current_dsa_path = strtok(NULL, delim);
+    }
+
+    group->dsa_devices =
+        malloc(sizeof(struct dsa_device) * num_dsa_devices);
+    group->num_dsa_devices = num_dsa_devices;
+    group->index = 0;
+
+    group->running = false;
+    qemu_mutex_init(&group->task_queue_lock);
+    qemu_cond_init(&group->task_queue_cond);
+    QSIMPLEQ_INIT(&group->task_queue);
+
+    void *dsa_wq = MAP_FAILED;
+    for (int i = 0; i < num_dsa_devices; i++) {
+        dsa_wq = map_dsa_device(dsa_path[i]);
+        if (dsa_wq == MAP_FAILED) {
+            fprintf(stderr, "map_dsa_device failed MAP_FAILED, "
+                    "using simulation.\n");
+            ret = -1;
+            goto exit;
+        }
+        dsa_device_init(&group->dsa_devices[i], dsa_wq);
+    }
+
+exit:
+    g_free(local_dsa_parameter);
+    return ret;
+}
+
+/**
+ * @brief Starts a DSA device group.
+ *
+ * @param group A pointer to the DSA device group.
+ * @param dsa_path An array of DSA device path.
+ * @param num_dsa_devices The number of DSA devices in the device group.
+ */ +static void +dsa_device_group_start(struct dsa_device_group *group) +{ + group->running = true; +} + +/** + * @brief Stops a DSA device group. + * + * @param group A pointer to the DSA device group. + */ +__attribute__((unused)) +static void +dsa_device_group_stop(struct dsa_device_group *group) +{ + group->running = false; +} + +/** + * @brief Cleans up a DSA device group. + * + * @param group A pointer to the DSA device group. + */ +static void +dsa_device_group_cleanup(struct dsa_device_group *group) +{ + if (!group->dsa_devices) { + return; + } + for (int i = 0; i < group->num_dsa_devices; i++) { + dsa_device_cleanup(&group->dsa_devices[i]); + } + free(group->dsa_devices); + group->dsa_devices = NULL; + + qemu_mutex_destroy(&group->task_queue_lock); + qemu_cond_destroy(&group->task_queue_cond); +} + +/** + * @brief Returns the next available DSA device in the group. + * + * @param group A pointer to the DSA device group. + * + * @return struct dsa_device* A pointer to the next available DSA device + * in the group. + */ +__attribute__((unused)) +static struct dsa_device * +dsa_device_group_get_next_device(struct dsa_device_group *group) +{ + if (group->num_dsa_devices == 0) { + return NULL; + } + uint32_t current = qatomic_fetch_inc(&group->index); + current %= group->num_dsa_devices; + return &group->dsa_devices[current]; +} + +/** + * @brief Check if DSA is running. + * + * @return True if DSA is running, otherwise false. + */ +bool dsa_is_running(void) +{ + return false; +} + +static void +dsa_globals_init(void) +{ + max_retry_count = UINT64_MAX; +} + +/** + * @brief Initializes DSA devices. + * + * @param dsa_parameter A list of DSA device path from migration parameter. + * @return int Zero if successful, otherwise non zero. + */ +int dsa_init(const char *dsa_parameter) +{ + dsa_globals_init(); + + return dsa_device_group_init(&dsa_group, dsa_parameter); +} + +/** + * @brief Start logic to enable using DSA. 
+ *
+ */
+void dsa_start(void)
+{
+    if (dsa_group.num_dsa_devices == 0) {
+        return;
+    }
+    if (dsa_group.running) {
+        return;
+    }
+    dsa_device_group_start(&dsa_group);
+}
+
+/**
+ * @brief Stop logic to clean up DSA by halting the device group and cleaning up
+ * the completion thread.
+ *
+ */
+void dsa_stop(void)
+{
+    struct dsa_device_group *group = &dsa_group;
+
+    if (!group->running) {
+        return;
+    }
+}
+
+/**
+ * @brief Clean up system resources created for DSA offloading.
+ *        This function is called during QEMU process teardown.
+ *
+ */
+void dsa_cleanup(void)
+{
+    dsa_stop();
+    dsa_device_group_cleanup(&dsa_group);
+}
+
+#else
+
+bool dsa_is_running(void)
+{
+    return false;
+}
+
+int dsa_init(const char *dsa_parameter)
+{
+    fprintf(stderr, "Intel Data Streaming Accelerator is not supported "
+                    "on this platform.\n");
+    return -1;
+}
+
+void dsa_start(void) {}
+
+void dsa_stop(void) {}
+
+void dsa_cleanup(void) {}
+
+#endif
+
diff --git a/util/meson.build b/util/meson.build
index c2322ef6e7..f7277c5e9b 100644
--- a/util/meson.build
+++ b/util/meson.build
@@ -85,6 +85,7 @@ if have_block or have_ga
 endif
 if have_block
   util_ss.add(files('aio-wait.c'))
+  util_ss.add(files('dsa.c'))
   util_ss.add(files('buffer.c'))
   util_ss.add(files('bufferiszero.c'))
   util_ss.add(files('hbitmap.c'))

From patchwork Tue Nov 14 05:40:20 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hao Xiang
X-Patchwork-Id: 13454798
From: Hao Xiang
To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org
Cc: Hao Xiang
Subject: [PATCH v2 08/20] util/dsa: Implement DSA task enqueue and dequeue.
Date: Tue, 14 Nov 2023 05:40:20 +0000
Message-Id: <20231114054032.1192027-9-hao.xiang@bytedance.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com>
References: <20231114054032.1192027-1-hao.xiang@bytedance.com>
MIME-Version: 1.0
Sender:
qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org * Use a safe thread queue for DSA task enqueue/dequeue. * Implement DSA task submission. * Implement DSA batch task submission. Signed-off-by: Hao Xiang --- include/qemu/dsa.h | 35 ++++++++ util/dsa.c | 196 +++++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 231 insertions(+) diff --git a/include/qemu/dsa.h b/include/qemu/dsa.h index 30246b507e..23f55185be 100644 --- a/include/qemu/dsa.h +++ b/include/qemu/dsa.h @@ -12,6 +12,41 @@ #include #include "x86intrin.h" +enum dsa_task_type { + DSA_TASK = 0, + DSA_BATCH_TASK +}; + +enum dsa_task_status { + DSA_TASK_READY = 0, + DSA_TASK_PROCESSING, + DSA_TASK_COMPLETION +}; + +typedef void (*buffer_zero_dsa_completion_fn)(void *); + +typedef struct buffer_zero_batch_task { + struct dsa_hw_desc batch_descriptor; + struct dsa_hw_desc *descriptors; + struct dsa_completion_record batch_completion __attribute__((aligned(32))); + struct dsa_completion_record *completions; + struct dsa_device_group *group; + struct dsa_device *device; + buffer_zero_dsa_completion_fn completion_callback; + QemuSemaphore sem_task_complete; + enum dsa_task_type task_type; + enum dsa_task_status status; + bool *results; + int batch_size; + QSIMPLEQ_ENTRY(buffer_zero_batch_task) entry; +} buffer_zero_batch_task; + +#else + +struct buffer_zero_batch_task { + bool *results; +}; + #endif /** diff --git a/util/dsa.c b/util/dsa.c index 8edaa892ec..f82282ce99 100644 --- a/util/dsa.c +++ b/util/dsa.c @@ -245,6 +245,200 @@ dsa_device_group_get_next_device(struct dsa_device_group *group) return &group->dsa_devices[current]; } +/** + * @brief Empties out the DSA task queue. + * + * @param group A pointer to the DSA device group. 
+ */ +static void +dsa_empty_task_queue(struct dsa_device_group *group) +{ + qemu_mutex_lock(&group->task_queue_lock); + dsa_task_queue *task_queue = &group->task_queue; + while (!QSIMPLEQ_EMPTY(task_queue)) { + QSIMPLEQ_REMOVE_HEAD(task_queue, entry); + } + qemu_mutex_unlock(&group->task_queue_lock); +} + +/** + * @brief Adds a task to the DSA task queue. + * + * @param group A pointer to the DSA device group. + * @param context A pointer to the DSA task to enqueue. + * + * @return int Zero if successful, otherwise a proper error code. + */ +static int +dsa_task_enqueue(struct dsa_device_group *group, + struct buffer_zero_batch_task *task) +{ + dsa_task_queue *task_queue = &group->task_queue; + QemuMutex *task_queue_lock = &group->task_queue_lock; + QemuCond *task_queue_cond = &group->task_queue_cond; + + bool notify = false; + + qemu_mutex_lock(task_queue_lock); + + if (!group->running) { + fprintf(stderr, "DSA: Tried to queue task to stopped device queue\n"); + qemu_mutex_unlock(task_queue_lock); + return -1; + } + + // The queue is empty. This enqueue operation is a 0->1 transition. + if (QSIMPLEQ_EMPTY(task_queue)) + notify = true; + + QSIMPLEQ_INSERT_TAIL(task_queue, task, entry); + + // We need to notify the waiter for 0->1 transitions. + if (notify) + qemu_cond_signal(task_queue_cond); + + qemu_mutex_unlock(task_queue_lock); + + return 0; +} + +/** + * @brief Takes a DSA task out of the task queue. + * + * @param group A pointer to the DSA device group. + * @return buffer_zero_batch_task* The DSA task being dequeued. 
+ */
+__attribute__((unused))
+static struct buffer_zero_batch_task *
+dsa_task_dequeue(struct dsa_device_group *group)
+{
+    struct buffer_zero_batch_task *task = NULL;
+    dsa_task_queue *task_queue = &group->task_queue;
+    QemuMutex *task_queue_lock = &group->task_queue_lock;
+    QemuCond *task_queue_cond = &group->task_queue_cond;
+
+    qemu_mutex_lock(task_queue_lock);
+
+    while (true) {
+        if (!group->running)
+            goto exit;
+        task = QSIMPLEQ_FIRST(task_queue);
+        if (task != NULL) {
+            break;
+        }
+        qemu_cond_wait(task_queue_cond, task_queue_lock);
+    }
+
+    QSIMPLEQ_REMOVE_HEAD(task_queue, entry);
+
+exit:
+    qemu_mutex_unlock(task_queue_lock);
+    return task;
+}
+
+/**
+ * @brief Submits a DSA work item to the device work queue.
+ *
+ * @param wq A pointer to the DSA work queue's device memory.
+ * @param descriptor A pointer to the DSA work item descriptor.
+ *
+ * @return Zero if successful, non-zero otherwise.
+ */
+static int
+submit_wi_int(void *wq, struct dsa_hw_desc *descriptor)
+{
+    uint64_t retry = 0;
+
+    _mm_sfence();
+
+    while (true) {
+        if (_enqcmd(wq, descriptor) == 0) {
+            break;
+        }
+        retry++;
+        if (retry > max_retry_count) {
+            fprintf(stderr, "Submit work retry %lu times.\n", retry);
+            exit(1);
+        }
+    }
+
+    return 0;
+}
+
+/**
+ * @brief Synchronously submits a DSA work item to the
+ *        device work queue.
+ *
+ * @param wq A pointer to the DSA work queue's device memory.
+ * @param descriptor A pointer to the DSA work item descriptor.
+ *
+ * @return int Zero if successful, non-zero otherwise.
+ */
+__attribute__((unused))
+static int
+submit_wi(void *wq, struct dsa_hw_desc *descriptor)
+{
+    return submit_wi_int(wq, descriptor);
+}
+
+/**
+ * @brief Asynchronously submits a DSA work item to the
+ *        device work queue.
+ *
+ * @param task A pointer to the buffer zero task.
+ *
+ * @return int Zero if successful, non-zero otherwise.
+ */ +__attribute__((unused)) +static int +submit_wi_async(struct buffer_zero_batch_task *task) +{ + struct dsa_device_group *device_group = task->group; + struct dsa_device *device_instance = task->device; + int ret; + + assert(task->task_type == DSA_TASK); + + task->status = DSA_TASK_PROCESSING; + + ret = submit_wi_int(device_instance->work_queue, + &task->descriptors[0]); + if (ret != 0) + return ret; + + return dsa_task_enqueue(device_group, task); +} + +/** + * @brief Asynchronously submits a DSA batch work item to the + * device work queue. + * + * @param batch_task A pointer to the batch buffer zero task. + * + * @return int Zero if successful, non-zero otherwise. + */ +__attribute__((unused)) +static int +submit_batch_wi_async(struct buffer_zero_batch_task *batch_task) +{ + struct dsa_device_group *device_group = batch_task->group; + struct dsa_device *device_instance = batch_task->device; + int ret; + + assert(batch_task->task_type == DSA_BATCH_TASK); + assert(batch_task->batch_descriptor.desc_count <= batch_task->batch_size); + assert(batch_task->status == DSA_TASK_READY); + + batch_task->status = DSA_TASK_PROCESSING; + + ret = submit_wi_int(device_instance->work_queue, + &batch_task->batch_descriptor); + if (ret != 0) + return ret; + + return dsa_task_enqueue(device_group, batch_task); +} + /** * @brief Check if DSA is running. 
  *
@@ -301,6 +495,8 @@ void dsa_stop(void)
     if (!group->running) {
         return;
     }
+
+    dsa_empty_task_queue(group);
 }
 
 /**

From patchwork Tue Nov 14 05:40:21 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hao Xiang
X-Patchwork-Id: 13454805
From: Hao Xiang
To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org
Cc: Hao Xiang
Subject: [PATCH v2 09/20] util/dsa: Implement DSA task asynchronous completion thread model.
Date: Tue, 14 Nov 2023 05:40:21 +0000
Message-Id: <20231114054032.1192027-10-hao.xiang@bytedance.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com>
References: <20231114054032.1192027-1-hao.xiang@bytedance.com>
MIME-Version: 1.0

* Create a dedicated thread for DSA task completion.
* The DSA completion thread runs a loop and polls for completed tasks.
* Start and stop the DSA completion thread when the DSA device group
  is started and stopped.

A user space application can submit a task directly to the Intel DSA
accelerator by writing to the DSA device memory (mapped into user space).
Once a task is submitted, the device starts processing it and writes the
completion status back to the task. The application can poll the task's
completion status to check for completion. This change uses a dedicated
thread to perform DSA task completion checking.
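
For illustration only (not part of this patch): a minimal, self-contained
sketch of the submit-then-poll model described above. It assumes a
simplified completion record and models the device with a plain function.
Every name here (sim_completion, sim_device_complete, sim_poll_completion,
SIM_COMP_*) is hypothetical, and the status values are placeholders rather
than the real DSA definitions.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Placeholder status codes for this sketch (not the real DSA values). */
enum { SIM_COMP_NONE = 0, SIM_COMP_SUCCESS = 1 };

/* Hypothetical, simplified completion record shared with the "device". */
struct sim_completion {
    _Atomic uint8_t status; /* written last by the device, polled by the CPU */
    uint8_t result;         /* payload, valid once status != SIM_COMP_NONE */
};

/* Stand-in for the hardware: publish the result, then flip the status. */
static void sim_device_complete(struct sim_completion *c, uint8_t result)
{
    c->result = result;
    atomic_store_explicit(&c->status, SIM_COMP_SUCCESS, memory_order_release);
}

/*
 * Poll the status byte until it leaves SIM_COMP_NONE.
 * Returns 0 once the record is complete, or -1 once the retry budget is
 * exhausted, so that the caller can decide how to recover.
 */
static int sim_poll_completion(struct sim_completion *c, uint64_t max_retry)
{
    uint64_t retry = 0;

    while (atomic_load_explicit(&c->status, memory_order_acquire) ==
           SIM_COMP_NONE) {
        if (retry++ >= max_retry) {
            return -1;
        }
    }
    return 0;
}
```

In the patch itself the corresponding loop is poll_completion(), which calls
exit(1) when max_retry_count is exceeded. Returning an error, as the sketch
does, is one possible alternative and not what this series implements.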
Signed-off-by: Hao Xiang --- util/dsa.c | 243 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 242 insertions(+), 1 deletion(-) diff --git a/util/dsa.c b/util/dsa.c index f82282ce99..0e68013ffb 100644 --- a/util/dsa.c +++ b/util/dsa.c @@ -44,6 +44,7 @@ #define DSA_WQ_SIZE 4096 #define MAX_DSA_DEVICES 16 +#define DSA_COMPLETION_THREAD "dsa_completion" typedef QSIMPLEQ_HEAD(dsa_task_queue, buffer_zero_batch_task) dsa_task_queue; @@ -61,8 +62,18 @@ struct dsa_device_group { dsa_task_queue task_queue; }; +struct dsa_completion_thread { + bool stopping; + bool running; + QemuThread thread; + int thread_id; + QemuSemaphore sem_init_done; + struct dsa_device_group *group; +}; + uint64_t max_retry_count; static struct dsa_device_group dsa_group; +static struct dsa_completion_thread completion_thread; /** @@ -439,6 +450,234 @@ submit_batch_wi_async(struct buffer_zero_batch_task *batch_task) return dsa_task_enqueue(device_group, batch_task); } +/** + * @brief Poll for the DSA work item completion. + * + * @param completion A pointer to the DSA work item completion record. + * @param opcode The DSA opcode. + * + * @return Zero if successful, non-zero otherwise. + */ +static int +poll_completion(struct dsa_completion_record *completion, + enum dsa_opcode opcode) +{ + uint8_t status; + uint64_t retry = 0; + + while (true) { + // The DSA operation completes successfully or fails. + status = completion->status; + if (status == DSA_COMP_SUCCESS || + status == DSA_COMP_PAGE_FAULT_NOBOF || + status == DSA_COMP_BATCH_PAGE_FAULT || + status == DSA_COMP_BATCH_FAIL) { + break; + } else if (status != DSA_COMP_NONE) { + /* TODO: Error handling here on unexpected failure. 
*/
+            fprintf(stderr, "DSA opcode %d failed with status = %d.\n",
+                    opcode, status);
+            exit(1);
+        }
+        retry++;
+        if (retry > max_retry_count) {
+            fprintf(stderr, "Wait for completion retry %lu times.\n", retry);
+            exit(1);
+        }
+        _mm_pause();
+    }
+
+    return 0;
+}
+
+/**
+ * @brief Complete a single DSA task in the batch task.
+ *
+ * @param task A pointer to the batch task structure.
+ */
+static void
+poll_task_completion(struct buffer_zero_batch_task *task)
+{
+    assert(task->task_type == DSA_TASK);
+
+    struct dsa_completion_record *completion = &task->completions[0];
+    uint8_t status;
+
+    poll_completion(completion, task->descriptors[0].opcode);
+
+    status = completion->status;
+    if (status == DSA_COMP_SUCCESS) {
+        task->results[0] = (completion->result == 0);
+        return;
+    }
+
+    assert(status == DSA_COMP_PAGE_FAULT_NOBOF);
+}
+
+/**
+ * @brief Poll a batch task status until it completes. If DSA task doesn't
+ *        complete properly, use CPU to complete the task.
+ *
+ * @param batch_task A pointer to the DSA batch task.
+ */
+static void
+poll_batch_task_completion(struct buffer_zero_batch_task *batch_task)
+{
+    struct dsa_completion_record *batch_completion = &batch_task->batch_completion;
+    struct dsa_completion_record *completion;
+    uint8_t batch_status;
+    uint8_t status;
+    bool *results = batch_task->results;
+    uint32_t count = batch_task->batch_descriptor.desc_count;
+
+    poll_completion(batch_completion,
+                    batch_task->batch_descriptor.opcode);
+
+    batch_status = batch_completion->status;
+
+    if (batch_status == DSA_COMP_SUCCESS) {
+        if (batch_completion->bytes_completed == count) {
+            // Let's skip checking each descriptor's completion status
+            // if the batch descriptor says all succeeded.
+ for (int i = 0; i < count; i++) { + assert(batch_task->completions[i].status == DSA_COMP_SUCCESS); + results[i] = (batch_task->completions[i].result == 0); + } + return; + } + } else { + assert(batch_status == DSA_COMP_BATCH_FAIL || + batch_status == DSA_COMP_BATCH_PAGE_FAULT); + } + + for (int i = 0; i < count; i++) { + + completion = &batch_task->completions[i]; + status = completion->status; + + if (status == DSA_COMP_SUCCESS) { + results[i] = (completion->result == 0); + continue; + } + + if (status != DSA_COMP_PAGE_FAULT_NOBOF) { + fprintf(stderr, + "Unexpected completion status = %u.\n", status); + assert(false); + } + } +} + +/** + * @brief Handles an asynchronous DSA batch task completion. + * + * @param task A pointer to the batch buffer zero task structure. + */ +static void +dsa_batch_task_complete(struct buffer_zero_batch_task *batch_task) +{ + batch_task->status = DSA_TASK_COMPLETION; + batch_task->completion_callback(batch_task); +} + +/** + * @brief The function entry point called by a dedicated DSA + * work item completion thread. + * + * @param opaque A pointer to the thread context. + * + * @return void* Not used. 
+ */ +static void * +dsa_completion_loop(void *opaque) +{ + struct dsa_completion_thread *thread_context = + (struct dsa_completion_thread *)opaque; + struct buffer_zero_batch_task *batch_task; + struct dsa_device_group *group = thread_context->group; + + rcu_register_thread(); + + thread_context->thread_id = qemu_get_thread_id(); + qemu_sem_post(&thread_context->sem_init_done); + + while (thread_context->running) { + batch_task = dsa_task_dequeue(group); + assert(batch_task != NULL || !group->running); + if (!group->running) { + assert(!thread_context->running); + break; + } + if (batch_task->task_type == DSA_TASK) { + poll_task_completion(batch_task); + } else { + assert(batch_task->task_type == DSA_BATCH_TASK); + poll_batch_task_completion(batch_task); + } + + dsa_batch_task_complete(batch_task); + } + + rcu_unregister_thread(); + return NULL; +} + +/** + * @brief Initializes a DSA completion thread. + * + * @param completion_thread A pointer to the completion thread context. + * @param group A pointer to the DSA device group. + */ +static void +dsa_completion_thread_init( + struct dsa_completion_thread *completion_thread, + struct dsa_device_group *group) +{ + completion_thread->stopping = false; + completion_thread->running = true; + completion_thread->thread_id = -1; + qemu_sem_init(&completion_thread->sem_init_done, 0); + completion_thread->group = group; + + qemu_thread_create(&completion_thread->thread, + DSA_COMPLETION_THREAD, + dsa_completion_loop, + completion_thread, + QEMU_THREAD_JOINABLE); + + /* Wait for initialization to complete */ + while (completion_thread->thread_id == -1) { + qemu_sem_wait(&completion_thread->sem_init_done); + } +} + +/** + * @brief Stops the completion thread (and implicitly, the device group). + * + * @param opaque A pointer to the completion thread. 
+ */
+static void dsa_completion_thread_stop(void *opaque)
+{
+    struct dsa_completion_thread *thread_context =
+        (struct dsa_completion_thread *)opaque;
+
+    struct dsa_device_group *group = thread_context->group;
+
+    qemu_mutex_lock(&group->task_queue_lock);
+
+    thread_context->stopping = true;
+    thread_context->running = false;
+
+    dsa_device_group_stop(group);
+
+    qemu_cond_signal(&group->task_queue_cond);
+    qemu_mutex_unlock(&group->task_queue_lock);
+
+    qemu_thread_join(&thread_context->thread);
+
+    qemu_sem_destroy(&thread_context->sem_init_done);
+}
+
 /**
  * @brief Check if DSA is running.
  *
@@ -446,7 +685,7 @@ submit_batch_wi_async(struct buffer_zero_batch_task *batch_task)
  */
 bool dsa_is_running(void)
 {
-    return false;
+    return completion_thread.running;
 }
 
 static void
@@ -481,6 +720,7 @@ void dsa_start(void)
         return;
     }
     dsa_device_group_start(&dsa_group);
+    dsa_completion_thread_init(&completion_thread, &dsa_group);
 }
 
 /**
@@ -496,6 +736,7 @@ void dsa_stop(void)
         return;
     }
 
+    dsa_completion_thread_stop(&completion_thread);
     dsa_empty_task_queue(group);
 }
 

From patchwork Tue Nov 14 05:40:22 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hao Xiang
X-Patchwork-Id: 13454799
qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:42:03 -0500 Received: from mail-qk1-x734.google.com ([2607:f8b0:4864:20::734]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mBL-0001Sn-33 for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:42:01 -0500 Received: by mail-qk1-x734.google.com with SMTP id af79cd13be357-778a92c06d6so322125385a.2 for ; Mon, 13 Nov 2023 21:41:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940507; x=1700545307; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=GL6cpR12l80oXg3yMpIzAIPLap4SikzmVMX/10PRhJc=; b=MpsF8mMWQ24B5rltBa5VoR8Z67JruI3rLRXgVfcD0oDUQRp/kOKbULAJ4Mcov8I4lg S+B5rsHqWuFS9TxLAi952ceh9l1HLr7d6QD+yt5a4G7BDFyRA1XLaQtLP4/7KR0gP5aC m/P2toRey8zKH6N5sUCwjfcSTVTfpP9D+p/lSZZA8ihbiWR4ZHvlTNZkZXkTiqLOwPjC g6qT4g+XUrz8C3BdCORrouQqOolxlB68xtceof2oelmnk5bsG7yOy87JFun9Q78xtRNu bxrOA45gg1HxoCCWhhMZU4p7AplFp8EwMR62TUWzVdfXHqXSswr1HTBUQubxU+FixrcE DzMA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940507; x=1700545307; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=GL6cpR12l80oXg3yMpIzAIPLap4SikzmVMX/10PRhJc=; b=uDBhKZhPRfIPQZiBXxTgxhaOfiz68VodIZBQ9Xr5+Bd7TZW/R1U8t+V5fQ/hI4/i2g BpmZz6YM3JxOOQrmcHBClHZ1alU7tf7JAdtbH4wFXcZrlg4LAupUV38dHh1N4fKVHMjN 2NW6wCDDEgJWgPDbJPC5f5gRBgc+iJAbmOVmefiNpr6r7/5utoettK0dKORB+akjxBGu OTRK50zSG4wCnIur9Feo0A0AZLX9EH34oefDQ/saI0PXx51ytqOQiWUsvEtCAVYxi5Px 0oUUbReA8AWmTxoWiJCmKwuU/h9NGEXC1QiPx75TnMcH0Bvtj/7kbV6qq576LLUKsw+E r3jw== X-Gm-Message-State: AOJu0YwPK29m6Mb9aUj2CujBby1LUGWVnt7uD+sNhI4TpF4NdPMXsZ31 3KKhBdxbmO7SMyi4DzacmtYvOg== X-Google-Smtp-Source: 
AGHT+IE/KoQkJyjx4MtGnQp50U09+zeqlZ+xABD2qcWrZjknimRlnmArst1YKe/yhsJGCoTFp+93+A== X-Received: by 2002:a05:620a:2446:b0:778:991f:2f78 with SMTP id h6-20020a05620a244600b00778991f2f78mr1563041qkn.38.1699940507577; Mon, 13 Nov 2023 21:41:47 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.41.46 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:41:47 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH v2 10/20] util/dsa: Implement zero page checking in DSA task. Date: Tue, 14 Nov 2023 05:40:22 +0000 Message-Id: <20231114054032.1192027-11-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::734; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x734.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Create DSA task with operation code DSA_OPCODE_COMPVAL. Here we create two types of DSA tasks, a single DSA task and a batch DSA task. 
Batch DSA task reduces task submission overhead and hence should be the default option. However, due to the way DSA hardware works, a DSA batch task must contain at least two individual tasks. There are times we need to submit a single task and hence a single DSA task submission is also required. Signed-off-by: Hao Xiang Signed-off-by: Bryan Zhang --- include/qemu/dsa.h | 16 +++ util/dsa.c | 252 +++++++++++++++++++++++++++++++++++++++++---- 2 files changed, 247 insertions(+), 21 deletions(-) diff --git a/include/qemu/dsa.h b/include/qemu/dsa.h index 23f55185be..b10e7b8fb7 100644 --- a/include/qemu/dsa.h +++ b/include/qemu/dsa.h @@ -49,6 +49,22 @@ struct buffer_zero_batch_task { #endif +/** + * @brief Initializes a buffer zero batch task. + * + * @param task A pointer to the batch task to initialize. + * @param batch_size The number of DSA tasks in the batch. + */ +void buffer_zero_batch_task_init(struct buffer_zero_batch_task *task, + int batch_size); + +/** + * @brief Performs the proper cleanup on a DSA batch task. + * + * @param task A pointer to the batch task to cleanup. + */ +void buffer_zero_batch_task_destroy(struct buffer_zero_batch_task *task); + /** * @brief Initializes DSA devices. * diff --git a/util/dsa.c b/util/dsa.c index 0e68013ffb..3cc017b8a0 100644 --- a/util/dsa.c +++ b/util/dsa.c @@ -75,6 +75,7 @@ uint64_t max_retry_count; static struct dsa_device_group dsa_group; static struct dsa_completion_thread completion_thread; +static void buffer_zero_dsa_completion(void *context); /** * @brief This function opens a DSA device's work queue and @@ -208,7 +209,6 @@ dsa_device_group_start(struct dsa_device_group *group) * * @param group A pointer to the DSA device group. */ -__attribute__((unused)) static void dsa_device_group_stop(struct dsa_device_group *group) { @@ -244,7 +244,6 @@ dsa_device_group_cleanup(struct dsa_device_group *group) * @return struct dsa_device* A pointer to the next available DSA device * in the group. 
*/ -__attribute__((unused)) static struct dsa_device * dsa_device_group_get_next_device(struct dsa_device_group *group) { @@ -319,7 +318,6 @@ dsa_task_enqueue(struct dsa_device_group *group, * @param group A pointer to the DSA device group. * @return buffer_zero_batch_task* The DSA task being dequeued. */ -__attribute__((unused)) static struct buffer_zero_batch_task * dsa_task_dequeue(struct dsa_device_group *group) { @@ -376,22 +374,6 @@ submit_wi_int(void *wq, struct dsa_hw_desc *descriptor) return 0; } -/** - * @brief Synchronously submits a DSA work item to the - * device work queue. - * - * @param wq A pointer to the DSA worjk queue's device memory. - * @param descriptor A pointer to the DSA work item descriptor. - * - * @return int Zero if successful, non-zero otherwise. - */ -__attribute__((unused)) -static int -submit_wi(void *wq, struct dsa_hw_desc *descriptor) -{ - return submit_wi_int(wq, descriptor); -} - /** * @brief Asynchronously submits a DSA work item to the * device work queue. @@ -400,7 +382,6 @@ submit_wi(void *wq, struct dsa_hw_desc *descriptor) * * @return int Zero if successful, non-zero otherwise. */ -__attribute__((unused)) static int submit_wi_async(struct buffer_zero_batch_task *task) { @@ -428,7 +409,6 @@ submit_wi_async(struct buffer_zero_batch_task *task) * * @return int Zero if successful, non-zero otherwise. */ -__attribute__((unused)) static int submit_batch_wi_async(struct buffer_zero_batch_task *batch_task) { @@ -678,6 +658,231 @@ static void dsa_completion_thread_stop(void *opaque) qemu_sem_destroy(&thread_context->sem_init_done); } +/** + * @brief Initializes a buffer zero comparison DSA task. + * + * @param descriptor A pointer to the DSA task descriptor. + * @param completion A pointer to the DSA task completion record. 
+ */ +static void +buffer_zero_task_init_int(struct dsa_hw_desc *descriptor, + struct dsa_completion_record *completion) +{ + descriptor->opcode = DSA_OPCODE_COMPVAL; + descriptor->flags = IDXD_OP_FLAG_RCR | IDXD_OP_FLAG_CRAV; + descriptor->comp_pattern = (uint64_t)0; + descriptor->completion_addr = (uint64_t)completion; +} + +/** + * @brief Initializes a buffer zero batch task. + * + * @param task A pointer to the batch task to initialize. + * @param batch_size The number of DSA tasks in the batch. + */ +void +buffer_zero_batch_task_init(struct buffer_zero_batch_task *task, + int batch_size) +{ + int descriptors_size = sizeof(*task->descriptors) * batch_size; + memset(task, 0, sizeof(*task)); + + task->descriptors = + (struct dsa_hw_desc *)qemu_memalign(64, descriptors_size); + memset(task->descriptors, 0, descriptors_size); + task->completions = (struct dsa_completion_record *)qemu_memalign( + 32, sizeof(*task->completions) * batch_size); + task->results = g_new0(bool, batch_size); + task->batch_size = batch_size; + + task->batch_completion.status = DSA_COMP_NONE; + task->batch_descriptor.completion_addr = (uint64_t)&task->batch_completion; + // TODO: Ensure that we never send a batch with count <= 1 + task->batch_descriptor.desc_count = 0; + task->batch_descriptor.opcode = DSA_OPCODE_BATCH; + task->batch_descriptor.flags = IDXD_OP_FLAG_RCR | IDXD_OP_FLAG_CRAV; + task->batch_descriptor.desc_list_addr = (uintptr_t)task->descriptors; + task->status = DSA_TASK_READY; + task->group = &dsa_group; + task->device = dsa_device_group_get_next_device(&dsa_group); + + for (int i = 0; i < task->batch_size; i++) { + buffer_zero_task_init_int(&task->descriptors[i], + &task->completions[i]); + } + + qemu_sem_init(&task->sem_task_complete, 0); + task->completion_callback = buffer_zero_dsa_completion; +} + +/** + * @brief Performs the proper cleanup on a DSA batch task. + * + * @param task A pointer to the batch task to cleanup. 
+ */ +void +buffer_zero_batch_task_destroy(struct buffer_zero_batch_task *task) +{ + qemu_vfree(task->descriptors); + qemu_vfree(task->completions); + g_free(task->results); + + qemu_sem_destroy(&task->sem_task_complete); +} + +/** + * @brief Resets a buffer zero comparison DSA batch task. + * + * @param task A pointer to the batch task. + * @param count The number of DSA tasks this batch task will contain. + */ +static void +buffer_zero_batch_task_reset(struct buffer_zero_batch_task *task, size_t count) +{ + task->batch_completion.status = DSA_COMP_NONE; + task->batch_descriptor.desc_count = count; + task->task_type = DSA_BATCH_TASK; + task->status = DSA_TASK_READY; +} + +/** + * @brief Sets a buffer zero comparison DSA task. + * + * @param descriptor A pointer to the DSA task descriptor. + * @param buf A pointer to the memory buffer. + * @param len The length of the buffer. + */ +static void +buffer_zero_task_set_int(struct dsa_hw_desc *descriptor, + const void *buf, + size_t len) +{ + struct dsa_completion_record *completion = + (struct dsa_completion_record *)descriptor->completion_addr; + + descriptor->xfer_size = len; + descriptor->src_addr = (uintptr_t)buf; + completion->status = 0; + completion->result = 0; +} + +/** + * @brief Resets a single (non-batch) buffer zero comparison DSA task. + * + * @param task A pointer to the DSA task. + */ +static void +buffer_zero_task_reset(struct buffer_zero_batch_task *task) +{ + task->completions[0].status = DSA_COMP_NONE; + task->task_type = DSA_TASK; + task->status = DSA_TASK_READY; +} + +/** + * @brief Sets a buffer zero comparison DSA task. + * + * @param task A pointer to the DSA task. + * @param buf A pointer to the memory buffer. + * @param len The buffer length.
+ */ +static void +buffer_zero_task_set(struct buffer_zero_batch_task *task, + const void *buf, + size_t len) +{ + buffer_zero_task_reset(task); + buffer_zero_task_set_int(&task->descriptors[0], buf, len); +} + +/** + * @brief Sets a buffer zero comparison batch task. + * + * @param batch_task A pointer to the batch task. + * @param buf An array of memory buffers. + * @param count The number of buffers in the array. + * @param len The length of the buffers. + */ +static void +buffer_zero_batch_task_set(struct buffer_zero_batch_task *batch_task, + const void **buf, size_t count, size_t len) +{ + assert(count > 0); + assert(count <= batch_task->batch_size); + + buffer_zero_batch_task_reset(batch_task, count); + for (int i = 0; i < count; i++) { + buffer_zero_task_set_int(&batch_task->descriptors[i], buf[i], len); + } +} + +/** + * @brief Asynchronously performs a buffer zero DSA operation. + * + * @param task A pointer to the batch task structure. + * @param buf A pointer to the memory buffer. + * @param len The length of the memory buffer. + * + * @return int Zero if successful, otherwise an appropriate error code. + */ +__attribute__((unused)) +static int +buffer_zero_dsa_async(struct buffer_zero_batch_task *task, + const void *buf, size_t len) +{ + buffer_zero_task_set(task, buf, len); + + return submit_wi_async(task); +} + +/** + * @brief Asynchronously submits a memory comparison batch task to a + * DSA device without waiting for completion. + * + * @param batch_task The batch task to be submitted to DSA device. + * @param buf An array of memory buffers to check for zero. + * @param count The number of buffers. + * @param len The buffer length.
+ */ +__attribute__((unused)) +static int +buffer_zero_dsa_batch_async(struct buffer_zero_batch_task *batch_task, + const void **buf, size_t count, size_t len) +{ + assert(count <= batch_task->batch_size); + buffer_zero_batch_task_set(batch_task, buf, count, len); + + return submit_batch_wi_async(batch_task); +} + +/** + * @brief The completion callback function for buffer zero + * comparison DSA task completion. + * + * @param context A pointer to the callback context. + */ +static void +buffer_zero_dsa_completion(void *context) +{ + assert(context != NULL); + + struct buffer_zero_batch_task *task = + (struct buffer_zero_batch_task *)context; + qemu_sem_post(&task->sem_task_complete); +} + +/** + * @brief Wait for the asynchronous DSA task to complete. + * + * @param batch_task A pointer to the buffer zero comparison batch task. + */ +__attribute__((unused)) +static void +buffer_zero_dsa_wait(struct buffer_zero_batch_task *batch_task) +{ + qemu_sem_wait(&batch_task->sem_task_complete); +} + /** * @brief Check if DSA is running. 
* @@ -753,6 +958,11 @@ void dsa_cleanup(void) #else +void buffer_zero_batch_task_init(struct buffer_zero_batch_task *task, + int batch_size) {} + +void buffer_zero_batch_task_destroy(struct buffer_zero_batch_task *task) {} + bool dsa_is_running(void) { return false; From patchwork Tue Nov 14 05:40:23 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 13454797 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5FBF3C4332F for ; Tue, 14 Nov 2023 05:43:10 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2mBS-0007F2-Aa; Tue, 14 Nov 2023 00:41:58 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2mBO-0007EE-Rf for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:55 -0500 Received: from mail-qk1-x732.google.com ([2607:f8b0:4864:20::732]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mBL-0001Ss-3B for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:54 -0500 Received: by mail-qk1-x732.google.com with SMTP id af79cd13be357-7789a4c01easo338437085a.0 for ; Mon, 13 Nov 2023 21:41:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940509; x=1700545309; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=3DH9IahCxHNZhgGh3NSSJ8rC6cgU5W5EjmHIaRYa9EQ=; 
b=BjGZi4HkKc/1gCNU6pauq8gbd+RDaMnfCIoGZF/4Fs6F8ukg3vGNp3F7nuSbxHos65 9j5qH6nTqaeWNNmGHdbE9ntBXznZaUU5BYx/gao3j4g8rWpfja+d+sgoy3Q43luDhTTB HvlNH8URp1+nr1ynt5x0TaorV5SPGSrWYlmsY1dkYLwSvai/rLTBaZrY3BSUF5hP5GLw r+0pWLJGOTFQ6CLNjAgYrVaHcNqdEf2WSEt29a7Fxkn7SsXZ3EsoTF5LIUCl/ZV1TO9s 0k/2hMyPkm+iT5A4tOQMTEk4AkVCqZfJhX9VdFi2heV/oHKInIUG6UibHGxBwMztIsmX mJRA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940509; x=1700545309; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3DH9IahCxHNZhgGh3NSSJ8rC6cgU5W5EjmHIaRYa9EQ=; b=RJ2EsuUVeghR7ApYGTnt8ORg2LAJ5lWuQ3cG4g3aSfGVZgImPbSSW4IsaSZsGnPZaR fB9xM3xFSHR4znWhpFdoCx5ZGuWG6ewBYQ9W5C+NsBuUhRY+3wUkkVEa7CWFlPvFNQ2z vUyoCibnStAWZL7AWseqHsoLsIknK4Nhq/rWWCY9gyXVVfTroxOLQ49l2G3CBa+a02SG 8mxftAwv6Txikc02IykrASZv2WaX9I8dbECRlK00G38CQGRVtG3QbZCb9a2SyYGFnPKR zMLLqF/DCJPaN9z38WiIBfCDJnP2IrYWHagfI2F6cQW7ZbNZUcu/OphJgevLmkMIwrjv ndkA== X-Gm-Message-State: AOJu0YwkD8olOhVKY+X9nexrmPIoIVaWXbcHBFqSDPJaNBNOBqc5HQmr 1mvj9DN1hIoxGcBOrwMDBvQ1pw== X-Google-Smtp-Source: AGHT+IHyGB9ekwmVMcTep8nrAGfWbU50QvwWWfH27MJjhIl3G7AwPe7FbHW2RtpOxtWEDKhuCHioqw== X-Received: by 2002:a05:620a:d85:b0:77a:a2c:679a with SMTP id q5-20020a05620a0d8500b0077a0a2c679amr1364274qkl.42.1699940509069; Mon, 13 Nov 2023 21:41:49 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.41.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:41:48 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH v2 11/20] util/dsa: Implement DSA task asynchronous submission and wait for completion. 
Date: Tue, 14 Nov 2023 05:40:23 +0000 Message-Id: <20231114054032.1192027-12-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::732; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x732.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org * Add a DSA task completion callback. * The DSA completion thread calls the task's completion callback on every task/batch task completion. * Make the DSA submission path wait for completion. * Implement CPU fallback if DSA is not able to complete the task. Signed-off-by: Hao Xiang Signed-off-by: Bryan Zhang --- include/qemu/dsa.h | 14 +++++ util/dsa.c | 153 ++++++++++++++++++++++++++++++++++++++++++++- 2 files changed, 164 insertions(+), 3 deletions(-) diff --git a/include/qemu/dsa.h b/include/qemu/dsa.h index b10e7b8fb7..3f8ee07004 100644 --- a/include/qemu/dsa.h +++ b/include/qemu/dsa.h @@ -65,6 +65,20 @@ void buffer_zero_batch_task_init(struct buffer_zero_batch_task *task, */ void buffer_zero_batch_task_destroy(struct buffer_zero_batch_task *task); +/** + * @brief Performs buffer zero comparison on a DSA batch task asynchronously. + * + * @param batch_task A pointer to the batch task. + * @param buf An array of memory buffers.
+ * @param count The number of buffers in the array. + * @param len The buffer length. + * + * @return Zero if successful, otherwise non-zero. + */ +int +buffer_is_zero_dsa_batch_async(struct buffer_zero_batch_task *batch_task, + const void **buf, size_t count, size_t len); + /** * @brief Initializes DSA devices. * diff --git a/util/dsa.c b/util/dsa.c index 3cc017b8a0..06c6fbf2ca 100644 --- a/util/dsa.c +++ b/util/dsa.c @@ -470,6 +470,41 @@ poll_completion(struct dsa_completion_record *completion, return 0; } +/** + * @brief Use CPU to complete a single zero page checking task. + * + * @param task A pointer to the task. + */ +static void +task_cpu_fallback(struct buffer_zero_batch_task *task) +{ + assert(task->task_type == DSA_TASK); + + struct dsa_completion_record *completion = &task->completions[0]; + const uint8_t *buf; + size_t len; + + if (completion->status == DSA_COMP_SUCCESS) { + return; + } + + /* + * DSA was able to partially complete the operation. Check the + * result. If we already know this is not a zero page, we can + * return now. + */ + if (completion->bytes_completed != 0 && completion->result != 0) { + task->results[0] = false; + return; + } + + /* Let's fallback to use CPU to complete it. */ + buf = (const uint8_t *)task->descriptors[0].src_addr; + len = task->descriptors[0].xfer_size; + task->results[0] = buffer_is_zero(buf + completion->bytes_completed, + len - completion->bytes_completed); +} + /** * @brief Complete a single DSA task in the batch task. * @@ -548,6 +583,62 @@ poll_batch_task_completion(struct buffer_zero_batch_task *batch_task) } } +/** + * @brief Use CPU to complete the zero page checking batch task. + * + * @param batch_task A pointer to the batch task. 
+ */ +static void +batch_task_cpu_fallback(struct buffer_zero_batch_task *batch_task) +{ + assert(batch_task->task_type == DSA_BATCH_TASK); + + struct dsa_completion_record *batch_completion = + &batch_task->batch_completion; + struct dsa_completion_record *completion; + uint8_t status; + const uint8_t *buf; + size_t len; + bool *results = batch_task->results; + uint32_t count = batch_task->batch_descriptor.desc_count; + + // DSA is able to complete the entire batch task. + if (batch_completion->status == DSA_COMP_SUCCESS) { + assert(count == batch_completion->bytes_completed); + return; + } + + /* + * DSA encounters some error and is not able to complete + * the entire batch task. Use CPU fallback. + */ + for (int i = 0; i < count; i++) { + completion = &batch_task->completions[i]; + status = completion->status; + if (status == DSA_COMP_SUCCESS) { + continue; + } + assert(status == DSA_COMP_PAGE_FAULT_NOBOF); + + /* + * DSA was able to partially complete the operation. Check the + * result. If we already know this is not a zero page, we can + * return now. + */ + if (completion->bytes_completed != 0 && completion->result != 0) { + results[i] = false; + continue; + } + + /* Let's fallback to use CPU to complete it. */ + buf = (uint8_t *)batch_task->descriptors[i].src_addr; + len = batch_task->descriptors[i].xfer_size; + results[i] = + buffer_is_zero(buf + completion->bytes_completed, + len - completion->bytes_completed); + } +} + /** * @brief Handles an asynchronous DSA batch task completion. * @@ -825,7 +916,6 @@ buffer_zero_batch_task_set(struct buffer_zero_batch_task *batch_task, * * @return int Zero if successful, otherwise an appropriate error code. */ -__attribute__((unused)) static int buffer_zero_dsa_async(struct buffer_zero_batch_task *task, const void *buf, size_t len) @@ -844,7 +934,6 @@ buffer_zero_dsa_async(struct buffer_zero_batch_task *task, * @param count The number of buffers. * @param len The buffer length. 
*/ -__attribute__((unused)) static int buffer_zero_dsa_batch_async(struct buffer_zero_batch_task *batch_task, const void **buf, size_t count, size_t len) @@ -876,13 +965,29 @@ buffer_zero_dsa_completion(void *context) * * @param batch_task A pointer to the buffer zero comparison batch task. */ -__attribute__((unused)) static void buffer_zero_dsa_wait(struct buffer_zero_batch_task *batch_task) { qemu_sem_wait(&batch_task->sem_task_complete); } +/** + * @brief Use CPU to complete the zero page checking task if DSA + * is not able to complete it. + * + * @param batch_task A pointer to the batch task. + */ +static void +buffer_zero_cpu_fallback(struct buffer_zero_batch_task *batch_task) +{ + if (batch_task->task_type == DSA_TASK) { + task_cpu_fallback(batch_task); + } else { + assert(batch_task->task_type == DSA_BATCH_TASK); + batch_task_cpu_fallback(batch_task); + } +} + /** * @brief Check if DSA is running. * @@ -956,6 +1061,41 @@ void dsa_cleanup(void) dsa_device_group_cleanup(&dsa_group); } +/** + * @brief Performs buffer zero comparison on a DSA batch task asynchronously. + * + * @param batch_task A pointer to the batch task. + * @param buf An array of memory buffers. + * @param count The number of buffers in the array. + * @param len The buffer length. + * + * @return Zero if successful, otherwise non-zero. + */ +int +buffer_is_zero_dsa_batch_async(struct buffer_zero_batch_task *batch_task, + const void **buf, size_t count, size_t len) +{ + assert(batch_task != NULL); + assert(len != 0); + assert(buf != NULL); + + if (count == 0 || count > batch_task->batch_size) { + return -1; + } + + if (count == 1) { + /* DSA does not accept a batch operation with only one task. */
+ buffer_zero_dsa_async(batch_task, buf[0], len); + } else { + buffer_zero_dsa_batch_async(batch_task, buf, count, len); + } + + buffer_zero_dsa_wait(batch_task); + buffer_zero_cpu_fallback(batch_task); + + return 0; +} + #else void buffer_zero_batch_task_init(struct buffer_zero_batch_task *task, @@ -981,5 +1121,12 @@ void dsa_stop(void) {} void dsa_cleanup(void) {} +int +buffer_is_zero_dsa_batch_async(struct buffer_zero_batch_task *batch_task, + const void **buf, size_t count, size_t len) +{ + exit(1); +} + #endif From patchwork Tue Nov 14 05:40:24 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 13454794 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id CEB81C4332F for ; Tue, 14 Nov 2023 05:42:56 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2mBS-0007F3-Rm; Tue, 14 Nov 2023 00:41:58 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2mBQ-0007Ed-OL for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:57 -0500 Received: from mail-qv1-xf2f.google.com ([2607:f8b0:4864:20::f2f]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mBO-0001TN-Pi for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:56 -0500 Received: by mail-qv1-xf2f.google.com with SMTP id 6a1803df08f44-66d0f945893so43964976d6.1 for ; Mon, 13 Nov 2023 21:41:54 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940514; x=1700545314; 
darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=m1r1zQ1wix8p0/drY4otOIWoj+h39RPeRIVWFAAZ2w0=; b=L1YLAbCY+wJPH3pvwinhWkrroN79JbWaZ3rC1mvQ2RpEnzc2phaoYg6KgFhgSOB0sq Iup4Y8LdhUoQNeWbpRE82NyYnWEuIR09TaM1rHDrWnQ900zf9sJHHv9Mo0Ao0pwDF1TT dipYgtFkNyc595pSpWcrDWDHxxSM7ksLgan8a58BIFUTsQm7zFW2chKbDY8kwZYrDsLi /qo0QDUZy8c1xLXQziUgx7yf+jvXQSr9xFC8KmIdrzyGiS+j07qej+ykmfRmDNh/rQZq +2aSX8KAFMB+GoUz4ZoqV81NEBOcXawHS/C0Oz3oAkXkc0VzAvdOKhCVDpWKA3i0/61Y woAA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940514; x=1700545314; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=m1r1zQ1wix8p0/drY4otOIWoj+h39RPeRIVWFAAZ2w0=; b=vKxFoLDoepZqKI/PLnnrc2qhsHMX5jSHqAItjid62PXVdyenTk6psegi/yofrRhgOb yKZXinWlUnt6CRUI+A5g6XyU8zMZq84meKs0ROUasDzsj82YZuA2yIIOSRdmo0nZapBC TIHD/GMetyyxiyVAhxJMGRrHn8nCqxlCQrMemE/n1zMGhFomSFFImGzqbTxcOh5d3Z91 DokAkRCAGhxW1lukrYSV7LS9m+lGju3RhwhyPbh8EJqr36/hBIXRAxbkvdRdEF4xhLR1 XgQj93UInURmzCK0QmU5NExSdD1sWv1r2unhrYg0NFKincwBEZ4701Kgzoi7oCaTmp++ mjPg== X-Gm-Message-State: AOJu0YysG5R8mrI9J4s7Wlb3czcq15fnXq/6ZDSOuZuwc73IXsAVy3AZ LMb+mdvHZDLczRdcz1qoV3dyqw== X-Google-Smtp-Source: AGHT+IE5GToIbmWtFNvhu174v4ELtJkJjQBCUe87f2CJRVfx+6QqJic7xymCXnSqWlVIY2IEA3Hk+Q== X-Received: by 2002:a05:6214:2d01:b0:66d:670:d425 with SMTP id mz1-20020a0562142d0100b0066d0670d425mr2001134qvb.31.1699940513880; Mon, 13 Nov 2023 21:41:53 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.41.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:41:53 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, 
marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH v2 12/20] migration/multifd: Add new migration option for multifd DSA offloading. Date: Tue, 14 Nov 2023 05:40:24 +0000 Message-Id: <20231114054032.1192027-13-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::f2f; envelope-from=hao.xiang@bytedance.com; helo=mail-qv1-xf2f.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Intel DSA offloading is an optional feature that can be turned on when the proper hardware and software stack is available. To turn on DSA offloading in multifd live migration, set: multifd-dsa-accel="[dsa_dev_path1] [dsa_dev_path2] ... [dsa_dev_pathX]" This feature is turned off by default.
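For illustration only (this is not code from the series): assuming the parameter value is a space-separated list of device paths, as the example above suggests, a consumer could split it roughly as sketched below. The helper name split_dsa_paths is hypothetical, and the sketch says nothing about how the series itself parses the string.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper, not part of this patch series.
 *
 * Splits a space-separated list of device paths in place. Each space
 * that ends a path is overwritten with '\0', and a pointer to the
 * start of every path is stored in paths[]. At most max_paths entries
 * are stored. Returns the number of entries stored.
 */
static size_t split_dsa_paths(char *list, char **paths, size_t max_paths)
{
    size_t n = 0;
    char *p = list;

    assert(list != NULL && paths != NULL);

    while (n < max_paths) {
        p += strspn(p, " ");      /* skip any run of separators */
        if (*p == '\0') {
            break;                /* nothing left but the terminator */
        }
        paths[n++] = p;           /* remember where this path starts */
        p += strcspn(p, " ");     /* move to the end of this path */
        if (*p == '\0') {
            break;                /* last path, already terminated */
        }
        *p++ = '\0';              /* terminate the path in place */
    }
    return n;
}
```

Because the list is modified in place, a caller would pass a writable copy of the parameter string rather than the stored value. Called on a copy of "/dev/dsa/wq0.0 /dev/dsa/wq1.0" (made-up paths), the helper stores two entries.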
Signed-off-by: Hao Xiang --- migration/migration-hmp-cmds.c | 8 ++++++++ migration/options.c | 28 ++++++++++++++++++++++++++++ migration/options.h | 1 + qapi/migration.json | 17 ++++++++++++++--- scripts/meson-buildoptions.sh | 6 +++--- 5 files changed, 54 insertions(+), 6 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 86ae832176..d9451744dd 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -353,6 +353,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) monitor_printf(mon, "%s: '%s'\n", MigrationParameter_str(MIGRATION_PARAMETER_TLS_AUTHZ), params->tls_authz); + monitor_printf(mon, "%s: %s\n", + MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_DSA_ACCEL), + params->multifd_dsa_accel); if (params->has_block_bitmap_mapping) { const BitmapMigrationNodeAliasList *bmnal; @@ -615,6 +618,11 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->has_block_incremental = true; visit_type_bool(v, param, &p->block_incremental, &err); break; + case MIGRATION_PARAMETER_MULTIFD_DSA_ACCEL: + p->multifd_dsa_accel = g_new0(StrOrNull, 1); + p->multifd_dsa_accel->type = QTYPE_QSTRING; + visit_type_str(v, param, &p->multifd_dsa_accel->u.s, &err); + break; case MIGRATION_PARAMETER_MULTIFD_CHANNELS: p->has_multifd_channels = true; visit_type_uint8(v, param, &p->multifd_channels, &err); diff --git a/migration/options.c b/migration/options.c index 97d121d4d7..6e424b5d63 100644 --- a/migration/options.c +++ b/migration/options.c @@ -179,6 +179,8 @@ Property migration_properties[] = { DEFINE_PROP_MIG_MODE("mode", MigrationState, parameters.mode, MIG_MODE_NORMAL), + DEFINE_PROP_STRING("multifd-dsa-accel", MigrationState, + parameters.multifd_dsa_accel), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -901,6 +903,13 @@ const char *migrate_tls_creds(void) return s->parameters.tls_creds; } +const char 
*migrate_multifd_dsa_accel(void) +{ + MigrationState *s = migrate_get_current(); + + return s->parameters.multifd_dsa_accel; +} + const char *migrate_tls_hostname(void) { MigrationState *s = migrate_get_current(); @@ -1025,6 +1034,7 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->vcpu_dirty_limit = s->parameters.vcpu_dirty_limit; params->has_mode = true; params->mode = s->parameters.mode; + params->multifd_dsa_accel = g_strdup(s->parameters.multifd_dsa_accel); return params; } @@ -1033,6 +1043,7 @@ void migrate_params_init(MigrationParameters *params) { params->tls_hostname = g_strdup(""); params->tls_creds = g_strdup(""); + params->multifd_dsa_accel = g_strdup(""); /* Set has_* up only for parameter checks */ params->has_compress_level = true; @@ -1362,6 +1373,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params, if (params->has_mode) { dest->mode = params->mode; } + + if (params->multifd_dsa_accel) { + assert(params->multifd_dsa_accel->type == QTYPE_QSTRING); + dest->multifd_dsa_accel = params->multifd_dsa_accel->u.s; + } } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1506,6 +1522,12 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) if (params->has_mode) { s->parameters.mode = params->mode; } + + if (params->multifd_dsa_accel) { + g_free(s->parameters.multifd_dsa_accel); + assert(params->multifd_dsa_accel->type == QTYPE_QSTRING); + s->parameters.multifd_dsa_accel = g_strdup(params->multifd_dsa_accel->u.s); + } } void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp) @@ -1531,6 +1553,12 @@ void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp) params->tls_authz->type = QTYPE_QSTRING; params->tls_authz->u.s = strdup(""); } + if (params->multifd_dsa_accel + && params->multifd_dsa_accel->type == QTYPE_QNULL) { + qobject_unref(params->multifd_dsa_accel->u.n); + params->multifd_dsa_accel->type = QTYPE_QSTRING; +
params->multifd_dsa_accel->u.s = strdup(""); + } migrate_params_test_apply(params, &tmp); diff --git a/migration/options.h b/migration/options.h index c901eb57c6..56100961a9 100644 --- a/migration/options.h +++ b/migration/options.h @@ -94,6 +94,7 @@ const char *migrate_tls_authz(void); const char *migrate_tls_creds(void); const char *migrate_tls_hostname(void); uint64_t migrate_xbzrle_cache_size(void); +const char *migrate_multifd_dsa_accel(void); /* parameters setters */ diff --git a/qapi/migration.json b/qapi/migration.json index 9783289bfc..a8e3b66d6f 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -879,6 +879,9 @@ # @mode: Migration mode. See description in @MigMode. Default is 'normal'. # (Since 8.2) # +# @multifd-dsa-accel: If enabled, use DSA accelerator offloading for +# certain memory operations. (since 8.2) +# # Features: # # @deprecated: Member @block-incremental is deprecated. Use @@ -902,7 +905,7 @@ 'cpu-throttle-initial', 'cpu-throttle-increment', 'cpu-throttle-tailslow', 'tls-creds', 'tls-hostname', 'tls-authz', 'max-bandwidth', - 'avail-switchover-bandwidth', 'downtime-limit', + 'avail-switchover-bandwidth', 'downtime-limit', 'multifd-dsa-accel', { 'name': 'x-checkpoint-delay', 'features': [ 'unstable' ] }, { 'name': 'block-incremental', 'features': [ 'deprecated' ] }, 'multifd-channels', @@ -1067,6 +1070,9 @@ # @mode: Migration mode. See description in @MigMode. Default is 'normal'. # (Since 8.2) # +# @multifd-dsa-accel: If enabled, use DSA accelerator offloading for +# certain memory operations. (since 8.2) +# # Features: # # @deprecated: Member @block-incremental is deprecated. Use @@ -1120,7 +1126,8 @@ '*x-vcpu-dirty-limit-period': { 'type': 'uint64', 'features': [ 'unstable' ] }, '*vcpu-dirty-limit': 'uint64', - '*mode': 'MigMode'} } + '*mode': 'MigMode', + '*multifd-dsa-accel': 'StrOrNull'} } ## # @migrate-set-parameters: @@ -1295,6 +1302,9 @@ # @mode: Migration mode. See description in @MigMode. Default is 'normal'. 
# (Since 8.2) # +# @multifd-dsa-accel: If enabled, use DSA accelerator offloading for +# certain memory operations. (since 8.2) +# # Features: # # @deprecated: Member @block-incremental is deprecated. Use @@ -1345,7 +1355,8 @@ '*x-vcpu-dirty-limit-period': { 'type': 'uint64', 'features': [ 'unstable' ] }, '*vcpu-dirty-limit': 'uint64', - '*mode': 'MigMode'} } + '*mode': 'MigMode', + '*multifd-dsa-accel': 'str'} } ## # @query-migrate-parameters: diff --git a/scripts/meson-buildoptions.sh b/scripts/meson-buildoptions.sh index bf139e3fb4..35222ab63e 100644 --- a/scripts/meson-buildoptions.sh +++ b/scripts/meson-buildoptions.sh @@ -32,6 +32,7 @@ meson_options_help() { printf "%s\n" ' --enable-debug-stack-usage' printf "%s\n" ' measure coroutine stack usage' printf "%s\n" ' --enable-debug-tcg TCG debugging' + printf "%s\n" ' --enable-enqcmd MENQCMD optimizations' printf "%s\n" ' --enable-fdt[=CHOICE] Whether and how to find the libfdt library' printf "%s\n" ' (choices: auto/disabled/enabled/internal/system)' printf "%s\n" ' --enable-fuzzing build fuzzing targets' @@ -93,7 +94,6 @@ meson_options_help() { printf "%s\n" ' avx2 AVX2 optimizations' printf "%s\n" ' avx512bw AVX512BW optimizations' printf "%s\n" ' avx512f AVX512F optimizations' - printf "%s\n" ' enqcmd ENQCMD optimizations' printf "%s\n" ' blkio libblkio block device driver' printf "%s\n" ' bochs bochs image format support' printf "%s\n" ' bpf eBPF support' @@ -241,8 +241,6 @@ _meson_option_parse() { --disable-avx512bw) printf "%s" -Davx512bw=disabled ;; --enable-avx512f) printf "%s" -Davx512f=enabled ;; --disable-avx512f) printf "%s" -Davx512f=disabled ;; - --enable-enqcmd) printf "%s" -Denqcmd=true ;; - --disable-enqcmd) printf "%s" -Denqcmd=false ;; --enable-gcov) printf "%s" -Db_coverage=true ;; --disable-gcov) printf "%s" -Db_coverage=false ;; --enable-lto) printf "%s" -Db_lto=true ;; @@ -309,6 +307,8 @@ _meson_option_parse() { --disable-docs) printf "%s" -Ddocs=disabled ;; --enable-dsound) printf "%s" 
-Ddsound=enabled ;; --disable-dsound) printf "%s" -Ddsound=disabled ;; + --enable-enqcmd) printf "%s" -Denqcmd=true ;; + --disable-enqcmd) printf "%s" -Denqcmd=false ;; --enable-fdt) printf "%s" -Dfdt=enabled ;; --disable-fdt) printf "%s" -Dfdt=disabled ;; --enable-fdt=*) quote_sh "-Dfdt=$2" ;; From patchwork Tue Nov 14 05:40:25 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 13454804 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2BA8EC4167D for ; Tue, 14 Nov 2023 05:43:48 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2mBg-0007HE-29; Tue, 14 Nov 2023 00:42:13 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2mBS-0007F4-SU for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:58 -0500 Received: from mail-qk1-x729.google.com ([2607:f8b0:4864:20::729]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mBQ-0001TZ-U6 for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:41:58 -0500 Received: by mail-qk1-x729.google.com with SMTP id af79cd13be357-7789923612dso329310585a.0 for ; Mon, 13 Nov 2023 21:41:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940516; x=1700545316; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Xf50/lsBngLq8SDa+dxmzDJFoeEv3vJJjISvORIZOJw=; 
b=il9jFP5WljuvjHlGyxHhMnsAhn45prWOFew9U6rBqyXegdkXYdBeDy4U5MWbHQyYn+ PtMMsnR1IOnbDqT25hBI4XHBr9G6cqexf18xyNjXoYusZ3oEfUCylxcMY0WkbIFmKodg Oc3ePt3z6wm4/RMfZG7vum8Yel4BOZ8PdYwKiYXNvsgeAt3A61pALqwMZuHIzOD68/i9 ro0IDTlZ4lATFAU5czk+GhuxdW3Ww6FB2m67nCiw5BaFJJZxLOHCOh+aRCaQtHIrQQpE 9K80KpjBQL4Jetv3Nwk1SHI0JI16x1OPauAhbxUzeIimh5NGBO4PJPO4XB9Ob75cYDZq Tr6w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940516; x=1700545316; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Xf50/lsBngLq8SDa+dxmzDJFoeEv3vJJjISvORIZOJw=; b=B07B3V1eOBOogZAa/W4SnXg6tVkanGj2Xq9jTCv1E5jtYWnXfxvRTxIpnSp/nrtAcG o7MOmJ+KMzfHLW/ReCOTO+bB/L/987V/t3+BhUJiaFeS8RD2AJEyqVccRCt7NSOvsDAJ tAM3E9hix7OwtDcgapzC0+a8cq88i27xmP3o9wCwwo6N07YGQMtFKYjCB3GBrfV6OLq4 z8iVGq9zZtNHpqagOnn2l06S9deSfCKSvVwualHtTMQ/vtJPlCxNiJn2avXIVA5xjsuh 6GffLpwjs3NFaNlRf5Z6V1sT8ICOvJNJebOFZ4JihxYMitfxRu9DlTfKh734tu0eiAfs Zxog== X-Gm-Message-State: AOJu0YzC72xPl1zXfV/1RXxcQShlxCIJEfwgyNAtfMYpAlOfIYP42fvz Q34ZorbEfU+h0Y0212ryREaMKQ== X-Google-Smtp-Source: AGHT+IFlF2FtpMQCXLpcZW9q2WQtJxlo0IwOsUKekl368VZCdXaDvoTErHM8oNQvZCueO5T+M83ztQ== X-Received: by 2002:a05:620a:4395:b0:77b:be50:9cbe with SMTP id a21-20020a05620a439500b0077bbe509cbemr1627558qkp.57.1699940515759; Mon, 13 Nov 2023 21:41:55 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.41.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:41:55 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH v2 13/20] migration/multifd: Prepare to introduce DSA acceleration on the multifd path. 
Date: Tue, 14 Nov 2023 05:40:25 +0000 Message-Id: <20231114054032.1192027-14-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::729; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x729.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

1. Refactor the multifd_send_thread function.
2. Implement buffer_is_zero_use_cpu to handle CPU-based zero page checking.
3. Introduce the batch task structure in MultiFDSendParams.
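For illustration only, and not part of this patch: a simplified sketch of CPU-based batch zero page checking, where one boolean result is recorded per page. The names zero_batch, page_is_zero and zero_batch_check_cpu are hypothetical stand-ins; the real code uses struct buffer_zero_batch_task and QEMU's optimized buffer_is_zero(), whose definitions are not part of this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Naive byte-by-byte stand-in for QEMU's optimized buffer_is_zero(). */
static bool page_is_zero(const void *buf, size_t len)
{
    const unsigned char *p = buf;

    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/* Hypothetical, simplified batch task: one result slot per page. */
struct zero_batch {
    bool *results;      /* results[i] is true when page i is all zeroes */
    size_t capacity;    /* number of slots available in results[] */
};

/* Check every page of the batch on the CPU and record the outcome. */
void zero_batch_check_cpu(struct zero_batch *task, const void **pages,
                          size_t count, size_t page_size)
{
    assert(task != NULL && count <= task->capacity);

    for (size_t i = 0; i < count; i++) {
        task->results[i] = page_is_zero(pages[i], page_size);
    }
}
```

Keeping the results in a per-batch array separates the checking step from the classification step, so the checker can later be swapped for a different backend without touching the code that sorts pages into zero and normal lists.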
Signed-off-by: Hao Xiang --- migration/multifd.c | 82 ++++++++++++++++++++++++++++++++++++--------- migration/multifd.h | 3 ++ 2 files changed, 70 insertions(+), 15 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index 1198ffde9c..68ab97f918 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -14,6 +14,8 @@ #include "qemu/cutils.h" #include "qemu/rcu.h" #include "qemu/cutils.h" +#include "qemu/dsa.h" +#include "qemu/memalign.h" #include "exec/target_page.h" #include "sysemu/sysemu.h" #include "exec/ramblock.h" @@ -574,6 +576,11 @@ void multifd_save_cleanup(void) p->name = NULL; multifd_pages_clear(p->pages); p->pages = NULL; + g_free(p->addr); + p->addr = NULL; + buffer_zero_batch_task_destroy(p->batch_task); + qemu_vfree(p->batch_task); + p->batch_task = NULL; p->packet_len = 0; g_free(p->packet); p->packet = NULL; @@ -678,13 +685,66 @@ int multifd_send_sync_main(QEMUFile *f) return 0; } +static void set_page(MultiFDSendParams *p, bool zero_page, uint64_t offset) +{ + RAMBlock *rb = p->pages->block; + if (zero_page) { + p->zero[p->zero_num] = offset; + p->zero_num++; + ram_release_page(rb->idstr, offset); + } else { + p->normal[p->normal_num] = offset; + p->normal_num++; + } +} + +static void buffer_is_zero_use_cpu(MultiFDSendParams *p) +{ + const void **buf = (const void **)p->addr; + assert(!migrate_use_main_zero_page()); + + for (int i = 0; i < p->pages->num; i++) { + p->batch_task->results[i] = buffer_is_zero(buf[i], p->page_size); + } +} + +static void set_normal_pages(MultiFDSendParams *p) +{ + for (int i = 0; i < p->pages->num; i++) { + p->batch_task->results[i] = false; + } +} + +static void multifd_zero_page_check(MultiFDSendParams *p) +{ + /* older qemu don't understand zero page on multifd channel */ + bool use_multifd_zero_page = !migrate_use_main_zero_page(); + + RAMBlock *rb = p->pages->block; + + for (int i = 0; i < p->pages->num; i++) { + p->addr[i] = (ram_addr_t)(rb->host + p->pages->offset[i]); + } + + if 
(use_multifd_zero_page) { + buffer_is_zero_use_cpu(p); + } else { + // No zero page checking. All pages are normal pages. + set_normal_pages(p); + } + + for (int i = 0; i < p->pages->num; i++) { + uint64_t offset = p->pages->offset[i]; + bool zero_page = p->batch_task->results[i]; + set_page(p, zero_page, offset); + } +} + static void *multifd_send_thread(void *opaque) { MultiFDSendParams *p = opaque; MigrationThread *thread = NULL; Error *local_err = NULL; - /* qemu older than 8.2 don't understand zero page on multifd channel */ - bool use_multifd_zero_page = !migrate_use_main_zero_page(); int ret = 0; bool use_zero_copy_send = migrate_zero_copy_send(); @@ -710,7 +770,6 @@ static void *multifd_send_thread(void *opaque) qemu_mutex_lock(&p->mutex); if (p->pending_job) { - RAMBlock *rb = p->pages->block; uint64_t packet_num = p->packet_num; uint32_t flags; @@ -723,18 +782,7 @@ static void *multifd_send_thread(void *opaque) p->iovs_num = 1; } - for (int i = 0; i < p->pages->num; i++) { - uint64_t offset = p->pages->offset[i]; - if (use_multifd_zero_page && - buffer_is_zero(rb->host + offset, p->page_size)) { - p->zero[p->zero_num] = offset; - p->zero_num++; - ram_release_page(rb->idstr, offset); - } else { - p->normal[p->normal_num] = offset; - p->normal_num++; - } - } + multifd_zero_page_check(p); if (p->normal_num) { ret = multifd_send_state->ops->send_prepare(p, &local_err); @@ -976,6 +1024,10 @@ int multifd_save_setup(Error **errp) p->pending_job = 0; p->id = i; p->pages = multifd_pages_init(page_count); + p->addr = g_new0(ram_addr_t, page_count); + p->batch_task = + (struct buffer_zero_batch_task *)qemu_memalign(64, sizeof(*p->batch_task)); + buffer_zero_batch_task_init(p->batch_task, page_count); p->packet_len = sizeof(MultiFDPacket_t) + sizeof(uint64_t) * page_count; p->packet = g_malloc0(p->packet_len); diff --git a/migration/multifd.h b/migration/multifd.h index 13762900d4..62f31b03c0 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -119,6 +119,9 
@@ typedef struct { * pending_job != 0 -> multifd_channel can use it. */ MultiFDPages_t *pages; + /* Address of each page in pages */ + ram_addr_t *addr; + struct buffer_zero_batch_task *batch_task; /* thread local variables. No locking required */ From patchwork Tue Nov 14 05:40:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 13454809 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BF02BC4167B for ; Tue, 14 Nov 2023 05:44:15 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2mBl-0007IF-5Y; Tue, 14 Nov 2023 00:42:17 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2mBU-0007FW-7a for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:42:01 -0500 Received: from mail-qk1-x72c.google.com ([2607:f8b0:4864:20::72c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mBS-0001Tj-2S for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:42:00 -0500 Received: by mail-qk1-x72c.google.com with SMTP id af79cd13be357-7789cc5c8ccso437322685a.0 for ; Mon, 13 Nov 2023 21:41:57 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940517; x=1700545317; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=0xlsyQY8WI2WTqdEKVrbd/ZHNlkVulCz9X5dqbIEoOs=; b=D2KTBLxjEso/CYD1+4XxlQolGIiIO/q3jc97nIbkQEHAgrSCmMvL6J+klv8O6RTVIm
QtM+xxm68ZvhRxHens4dKAS6tSsctkp2Jlz78JBleczaloHcBYer8b28gR6Pbu59bTZq eRGThWzUgcWkz6mKadFuKNPy3o0wm5q+1wKAk77CELLUrkqGRSocg6FSFMSiboQcFDTC 9a65itN0oRWZjhF0Nb/JZ2RKq/zzlbS4MWUQ/1ReLTi1Pmrksoxo2vv8VyX1vbHSIOJB RG98j3CZeczH+s25I3xecx4AUxGfnXevKSl+spAhLqbVBA4+SsutIl5dPiBHpmjaT4ZL YKxA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940517; x=1700545317; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=0xlsyQY8WI2WTqdEKVrbd/ZHNlkVulCz9X5dqbIEoOs=; b=BknIYv+25T63rIR2wJ0axbjGiJr0HQbA0SnsL45Ypt4oGJsgjima8qbIxQ+EM0Jglj YU+SqCSmSC/kGOgCXLyXov3d7J51HAvrJ5Jm5mYnEzU1RL1s6vCc3qbwhQEnv7vdqdAZ utiwWcnGLfCALJrxpmjP0mZvN2Gt2nQLo8/3dgQ3BNiuZzWBJYRFBRdr5Pc8rSvtblT9 KGCogsV3lLvLYjS9aIYRT5g2PAZS5NA6M7cTNOAlBuNxtR1k7vAdH6er2sOXN/gaDIj6 H0tQztWIRftojvqnuLpNx0OCFnLTwUZUC+YNdP7R5cpuzLVsNk2UZs1eDPptaSt/pt0E foDg== X-Gm-Message-State: AOJu0YzQx6v/k20hlYPtgKAwGWm3VCAtyBfeleoBYMOryffoR38Qdx7Z xeMsqdttuWwNTKvvmJ3B8pmYIZ05aD3cOuuVtn0= X-Google-Smtp-Source: AGHT+IG9RihCKn9sme494WAgj7CNwJbavzmMX5AuBNqazVGN8mnK45cNLBLt4lEfo5XtB7aGPWrOWQ== X-Received: by 2002:a05:620a:4089:b0:77a:5247:730a with SMTP id f9-20020a05620a408900b0077a5247730amr2044202qko.20.1699940517318; Mon, 13 Nov 2023 21:41:57 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.41.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:41:57 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH v2 14/20] migration/multifd: Enable DSA offloading in multifd sender path. 
Date: Tue, 14 Nov 2023 05:40:26 +0000 Message-Id: <20231114054032.1192027-15-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::72c; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x72c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

The multifd sender path gets an array of pages queued by the migration thread. It performs zero page checking on every page in the array. Each page is classified as either a zero page or a normal page. This change uses Intel DSA to offload the zero page checking from the CPU to the DSA accelerator. The sender thread submits a batch of pages to the DSA hardware and waits for the DSA completion thread to signal that the work is complete.
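For illustration only, and not part of this patch: a small sketch of the three-way choice that multifd_zero_page_check() makes in the diff that follows. The enum, struct and function names here are hypothetical; the two inputs correspond to the values that migrate_use_main_zero_page() and dsa_is_running() return in the real code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical labels for the three code paths in the patch. */
enum zero_check_mode {
    ZERO_CHECK_NONE,    /* no checking: every page is sent as a normal page */
    ZERO_CHECK_CPU,     /* check each page with the CPU */
    ZERO_CHECK_DSA      /* submit the batch to the DSA accelerator */
};

/* Hypothetical snapshot of the two conditions the patch consults. */
struct zero_check_state {
    bool use_main_zero_page;    /* zero pages handled outside multifd */
    bool dsa_running;           /* DSA offloading was started */
};

/*
 * DSA is used only when multifd zero page checking is enabled and DSA
 * is running; otherwise the CPU checks the pages, and when multifd
 * zero page checking is disabled nothing is checked at all.
 */
enum zero_check_mode pick_zero_check_mode(const struct zero_check_state *st)
{
    assert(st != NULL);

    if (st->use_main_zero_page) {
        return ZERO_CHECK_NONE;
    }
    return st->dsa_running ? ZERO_CHECK_DSA : ZERO_CHECK_CPU;
}
```

Note that a running DSA device does not override the main zero page setting: when zero pages are handled on the main channel, the sketch (like the patch) treats every page as normal.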
Signed-off-by: Hao Xiang --- migration/multifd.c | 33 ++++++++++++++++++++++++++++++++- 1 file changed, 32 insertions(+), 1 deletion(-) diff --git a/migration/multifd.c b/migration/multifd.c index 68ab97f918..2f635898ed 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -560,6 +560,8 @@ void multifd_save_cleanup(void) qemu_thread_join(&p->thread); } } + dsa_stop(); + dsa_cleanup(); for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDSendParams *p = &multifd_send_state->params[i]; Error *local_err = NULL; @@ -702,6 +704,7 @@ static void buffer_is_zero_use_cpu(MultiFDSendParams *p) { const void **buf = (const void **)p->addr; assert(!migrate_use_main_zero_page()); + assert(!dsa_is_running()); for (int i = 0; i < p->pages->num; i++) { p->batch_task->results[i] = buffer_is_zero(buf[i], p->page_size); @@ -710,15 +713,29 @@ static void buffer_is_zero_use_cpu(MultiFDSendParams *p) static void set_normal_pages(MultiFDSendParams *p) { + assert(migrate_use_main_zero_page()); + for (int i = 0; i < p->pages->num; i++) { p->batch_task->results[i] = false; } } +static void buffer_is_zero_use_dsa(MultiFDSendParams *p) +{ + assert(!migrate_use_main_zero_page()); + assert(dsa_is_running()); + + buffer_is_zero_dsa_batch_async(p->batch_task, + (const void **)p->addr, + p->pages->num, + p->page_size); +} + static void multifd_zero_page_check(MultiFDSendParams *p) { /* older qemu don't understand zero page on multifd channel */ bool use_multifd_zero_page = !migrate_use_main_zero_page(); + bool use_multifd_dsa_accel = dsa_is_running(); RAMBlock *rb = p->pages->block; @@ -726,7 +743,9 @@ static void multifd_zero_page_check(MultiFDSendParams *p) p->addr[i] = (ram_addr_t)(rb->host + p->pages->offset[i]); } - if (use_multifd_zero_page) { + if (use_multifd_dsa_accel && use_multifd_zero_page) { + buffer_is_zero_use_dsa(p); + } else if (use_multifd_zero_page) { buffer_is_zero_use_cpu(p); } else { // No zero page checking. All pages are normal pages. 
@@ -1001,11 +1020,15 @@ int multifd_save_setup(Error **errp) int thread_count; uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size(); uint8_t i; + const char *dsa_parameter = migrate_multifd_dsa_accel(); if (!migrate_multifd()) { return 0; } + dsa_init(dsa_parameter); + dsa_start(); + thread_count = migrate_multifd_channels(); multifd_send_state = g_malloc0(sizeof(*multifd_send_state)); multifd_send_state->params = g_new0(MultiFDSendParams, thread_count); @@ -1061,6 +1084,7 @@ int multifd_save_setup(Error **errp) return ret; } } + return 0; } @@ -1138,6 +1162,8 @@ void multifd_load_cleanup(void) qemu_thread_join(&p->thread); } + dsa_stop(); + dsa_cleanup(); for (i = 0; i < migrate_multifd_channels(); i++) { MultiFDRecvParams *p = &multifd_recv_state->params[i]; @@ -1272,6 +1298,7 @@ int multifd_load_setup(Error **errp) int thread_count; uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size(); uint8_t i; + const char *dsa_parameter = migrate_multifd_dsa_accel(); /* * Return successfully if multiFD recv state is already initialised @@ -1281,6 +1308,9 @@ int multifd_load_setup(Error **errp) return 0; } + dsa_init(dsa_parameter); + dsa_start(); + thread_count = migrate_multifd_channels(); multifd_recv_state = g_malloc0(sizeof(*multifd_recv_state)); multifd_recv_state->params = g_new0(MultiFDRecvParams, thread_count); @@ -1317,6 +1347,7 @@ int multifd_load_setup(Error **errp) return ret; } } + return 0; } From patchwork Tue Nov 14 05:40:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 13454811 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DEE72C4332F for ; Tue, 14 
Nov 2023 05:44:47 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2mBn-0007Io-7N; Tue, 14 Nov 2023 00:42:19 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2mBV-0007Fy-Lc for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:42:03 -0500 Received: from mail-qk1-x72d.google.com ([2607:f8b0:4864:20::72d]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mBT-0001U3-UH for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:42:01 -0500 Received: by mail-qk1-x72d.google.com with SMTP id af79cd13be357-7789a4c01ddso346439085a.1 for ; Mon, 13 Nov 2023 21:41:59 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940519; x=1700545319; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=UC/+IR5Y7dIjrKdVazsg+msPhpgomaB++LU+9G8M++8=; b=YmfmMnqpEgzggjOjnb2yoyOWMIIfigoQCiPuLQIsUyiDkMe/fLx9MsQUX7nF/7+x0j YDC2pZc/lLQtJaH4In79G/4VuxxCUenVziCPnaT+cwK95qIcZyuDQlSK4Ie9a8YslX+s 6xaX29hZurDn3Z2jrffgXLWqhTDd4HJovmG+j38ZK5n0s1HXAy4BA9PqZpbedMvjCdnw Nt4MOVkgaFFrb09qPHlHA3PtxUhZ5g5fAMxxuPiy3lX4zmYX/C3ePrZRK3RluhroWLtI kSpKawVyceJsekjwb3r8qkKF6PhkyJvdDkghRijZYBn8FQJtEErB3ih+v0ebPJYsdo/p OfyQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940519; x=1700545319; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=UC/+IR5Y7dIjrKdVazsg+msPhpgomaB++LU+9G8M++8=; b=WF1F/3xiKtzvA3X01ir1mRsD8BVgjEqspAcuVIRmmHWul84vOJu5MNAR1E3qF9xo75 kogzA12nMNxi7oW0bhj+d34Otc63bnLce/quYFCIUCme1mC1yAipPVojVxXtBEZYnas4 
S8APRcu8vDdacsX5bVisswmpKyXfICXtbOYnjc03T5zRjDbiQhlLQnAWPrQTeL8TJ3pb 0KkLbtkbGWWApQGRWZ/efWz2/ibN2WazSsvozOwO6wphl3DTdNX2twJuJQgVY6A1NlTO 7fBxSAie6JITRqtXQHW0eXzUkZ1MB1It4ocgVJEWWXR3SbfFDN9hM0e9oau8v74zkDi5 6QoQ== X-Gm-Message-State: AOJu0YwV7gAsUbbJia2eWgv4tk/LPgvcgmHAAf5YrzkH0ER4Kf1fAYmV VJZieTmlkHOjlBSBChKFXKtwPw== X-Google-Smtp-Source: AGHT+IGBzP24lWpYXIa/JTY78ljFJlV021h5ohwVPYqsIhmF/i4pSWOcFdAzLFj5FkNx+ddOIzu3Iw== X-Received: by 2002:a05:620a:9493:b0:77a:605:d693 with SMTP id sn19-20020a05620a949300b0077a0605d693mr959550qkn.22.1699940518794; Mon, 13 Nov 2023 21:41:58 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.41.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:41:58 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH v2 15/20] migration/multifd: Add test hook to set normal page ratio. 
Date: Tue, 14 Nov 2023 05:40:27 +0000 Message-Id: <20231114054032.1192027-16-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::72d; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x72d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Multifd sender thread performs zero page checking. If a page is a zero page, only the page's metadata is sent to the receiver. If a page is a normal page, the entire page's content is sent to the receiver. This change adds a test hook to set the normal page ratio. A zero page will be forced to be sent as a normal page. This is useful for live migration performance analysis and optimization. Signed-off-by: Hao Xiang --- migration/options.c | 31 +++++++++++++++++++++++++++++++ migration/options.h | 1 + qapi/migration.json | 18 +++++++++++++++--- 3 files changed, 47 insertions(+), 3 deletions(-) diff --git a/migration/options.c b/migration/options.c index 6e424b5d63..e7f1e2df24 100644 --- a/migration/options.c +++ b/migration/options.c @@ -79,6 +79,11 @@ #define DEFAULT_MIGRATE_ANNOUNCE_ROUNDS 5 #define DEFAULT_MIGRATE_ANNOUNCE_STEP 100 +/* + * Parameter for multifd normal page test hook. 
+ */ +#define DEFAULT_MIGRATE_MULTIFD_NORMAL_PAGE_RATIO 101 + #define DEFINE_PROP_MIG_CAP(name, x) \ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) @@ -181,6 +186,9 @@ Property migration_properties[] = { MIG_MODE_NORMAL), DEFINE_PROP_STRING("multifd-dsa-accel", MigrationState, parameters.multifd_dsa_accel), + DEFINE_PROP_UINT8("multifd-normal-page-ratio", MigrationState, + parameters.multifd_normal_page_ratio, + DEFAULT_MIGRATE_MULTIFD_NORMAL_PAGE_RATIO), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -860,6 +868,12 @@ int migrate_multifd_channels(void) return s->parameters.multifd_channels; } +uint8_t migrate_multifd_normal_page_ratio(void) +{ + MigrationState *s = migrate_get_current(); + return s->parameters.multifd_normal_page_ratio; +} + MultiFDCompression migrate_multifd_compression(void) { MigrationState *s = migrate_get_current(); @@ -1258,6 +1272,14 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) return false; } + if (params->has_multifd_normal_page_ratio && + params->multifd_normal_page_ratio > 100) { + error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "multifd_normal_page_ratio", + "a value between 0 and 100"); + return false; + } + return true; } @@ -1378,6 +1400,11 @@ static void migrate_params_test_apply(MigrateSetParameters *params, assert(params->multifd_dsa_accel->type == QTYPE_QSTRING); dest->multifd_dsa_accel = params->multifd_dsa_accel->u.s; } + + if (params->has_multifd_normal_page_ratio) { + dest->has_multifd_normal_page_ratio = true; + dest->multifd_normal_page_ratio = params->multifd_normal_page_ratio; + } } static void migrate_params_apply(MigrateSetParameters *params, Error **errp) @@ -1528,6 +1555,10 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) assert(params->multifd_dsa_accel->type == QTYPE_QSTRING); s->parameters.multifd_dsa_accel = g_strdup(params->multifd_dsa_accel->u.s); } + + if 
(params->has_multifd_normal_page_ratio) { + s->parameters.multifd_normal_page_ratio = params->multifd_normal_page_ratio; + } } void qmp_migrate_set_parameters(MigrateSetParameters *params, Error **errp) diff --git a/migration/options.h b/migration/options.h index 56100961a9..21e3e7b0cf 100644 --- a/migration/options.h +++ b/migration/options.h @@ -95,6 +95,7 @@ const char *migrate_tls_creds(void); const char *migrate_tls_hostname(void); uint64_t migrate_xbzrle_cache_size(void); const char *migrate_multifd_dsa_accel(void); +uint8_t migrate_multifd_normal_page_ratio(void); /* parameters setters */ diff --git a/qapi/migration.json b/qapi/migration.json index a8e3b66d6f..bb876c8325 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -882,6 +882,9 @@ # @multifd-dsa-accel: If enabled, use DSA accelerator offloading for # certain memory operations. (since 8.2) # +# @multifd-normal-page-ratio: Test hook setting the normal page ratio. +# (Since 8.2) +# # Features: # # @deprecated: Member @block-incremental is deprecated. Use @@ -915,7 +918,8 @@ 'block-bitmap-mapping', { 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] }, 'vcpu-dirty-limit', - 'mode'] } + 'mode', + 'multifd-normal-page-ratio'] } ## # @MigrateSetParameters: @@ -1073,6 +1077,9 @@ # @multifd-dsa-accel: If enabled, use DSA accelerator offloading for # certain memory operations. (since 8.2) # +# @multifd-normal-page-ratio: Test hook setting the normal page ratio. +# (Since 8.2) +# # Features: # # @deprecated: Member @block-incremental is deprecated. Use @@ -1127,7 +1134,8 @@ 'features': [ 'unstable' ] }, '*vcpu-dirty-limit': 'uint64', '*mode': 'MigMode', - '*multifd-dsa-accel': 'StrOrNull'} } + '*multifd-dsa-accel': 'StrOrNull', + '*multifd-normal-page-ratio': 'uint8'} } ## # @migrate-set-parameters: @@ -1305,6 +1313,9 @@ # @multifd-dsa-accel: If enabled, use DSA accelerator offloading for # certain memory operations. 
(since 8.2) # +# @multifd-normal-page-ratio: Test hook setting the normal page ratio. +# (Since 8.2) +# # Features: # # @deprecated: Member @block-incremental is deprecated. Use @@ -1356,7 +1367,8 @@ 'features': [ 'unstable' ] }, '*vcpu-dirty-limit': 'uint64', '*mode': 'MigMode', - '*multifd-dsa-accel': 'str'} } + '*multifd-dsa-accel': 'str', + '*multifd-normal-page-ratio': 'uint8'} } ## # @query-migrate-parameters: From patchwork Tue Nov 14 05:40:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 13454807 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BE3C3C4167D for ; Tue, 14 Nov 2023 05:44:03 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2mBx-0007dI-Bh; Tue, 14 Nov 2023 00:42:29 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2mBj-0007Ha-Jt for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:42:15 -0500 Received: from mail-qk1-x72b.google.com ([2607:f8b0:4864:20::72b]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mBh-0001Zd-30 for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:42:15 -0500 Received: by mail-qk1-x72b.google.com with SMTP id af79cd13be357-778927f2dd3so268557285a.2 for ; Mon, 13 Nov 2023 21:42:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940528; x=1700545328; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=PPUuy0slv3M1LESng36bIPWqc+GaBXHR9iSzag4Sinc=; b=PHmLSMuFRxlRTX6xlIBbmpKPP2U56/UdwTCcq90eEybiwAuw6D1ibUBUyTA3CTszAG zGCehMvTrQOCf89og4iUSA6nSDTApV1rRLj6seTGsqMKY/k01dxiznO5zCaiGcG+oGVL DaHdsrizajweKscCiLFPOLdFV6k+W04y1eiBNQFvBw1Ife5jVvjUPGNpUH7cyWi2tlop 46KdccVrVd/0yh82CNpd+brluKeIg1cOHeooB5B3dHkkLyGfKnk3oXwftwEnTkZpGzCn g0OVtYU94UPDNg1q8cDHpt6ET9dF0anGQjThDCiLoOhSA6hvpPKbtmIkAA4kC4BmUVqN 4Uog== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940528; x=1700545328; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=PPUuy0slv3M1LESng36bIPWqc+GaBXHR9iSzag4Sinc=; b=bNAgFxGzWAzql3IfzMmw14pzflrRp74x4rbg9QU1HeVtLSI/WXp/hMt3gBDVwLrxSb H+rDP1EV2yLg/X7w5jXSGDcjGmI3FM19Np7qhPto4oaYeHW6C/c3GURNQtD9tmnmgRss O8Q03lul80tXjiZPes4PXO7myHWdoLCr/nAYw6fifR6d2DFIuDRd02GaVvYVyoHZAZjo Xc+0JPUtE+sw2h7yDI9O5Pp18RqqX1T8ioIycA6woDxD5isjmsADaFA5f0ZyLWAVAgQY 0xaymTnTckmcYxLykH9Ky0gdf8ANPDnUTTYof99ueBa1uSr6v02oRacrEEBw7YjyDESn au2A== X-Gm-Message-State: AOJu0YyZC2PbSTMMDj92D8LgXgZyr26b2T+yFbpeNQ6C13HWshbbj/K5 f4qI8qm//PKw+EWf9pHBhww1kA== X-Google-Smtp-Source: AGHT+IEkf5jftCDG6OjFoEUmWdXGHaaaSc4zOmLF3rRppdSFLwNiLrp8/GrU+5ofeXTmRDq6YUYjiw== X-Received: by 2002:a37:ef19:0:b0:777:27a5:d1e with SMTP id j25-20020a37ef19000000b0077727a50d1emr1276218qkk.41.1699940528592; Mon, 13 Nov 2023 21:42:08 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.42.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:42:08 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang 
Subject: [PATCH v2 16/20] migration/multifd: Enable set normal page ratio test hook in multifd. Date: Tue, 14 Nov 2023 05:40:28 +0000 Message-Id: <20231114054032.1192027-17-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::72b; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x72b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org The test hook is disabled by default. To enable it, set the normal page ratio to a value between 0 and 100. A ratio of 50 means that at least 50% of all pages are sent as normal pages. Set the option: migrate_set_parameter multifd-normal-page-ratio 60 Signed-off-by: Hao Xiang --- include/qemu/dsa.h | 7 ++++++- migration/migration-hmp-cmds.c | 7 +++++++ migration/multifd.c | 33 +++++++++++++++++++++++++++++++++ 3 files changed, 46 insertions(+), 1 deletion(-) diff --git a/include/qemu/dsa.h b/include/qemu/dsa.h index 3f8ee07004..bc7f652e0b 100644 --- a/include/qemu/dsa.h +++ b/include/qemu/dsa.h @@ -37,7 +37,10 @@ typedef struct buffer_zero_batch_task { enum dsa_task_type task_type; enum dsa_task_status status; bool *results; - int batch_size; + uint32_t batch_size; + /* Set normal page ratio test hook. */
+ uint32_t normal_page_index; + uint32_t normal_page_counter; QSIMPLEQ_ENTRY(buffer_zero_batch_task) entry; } buffer_zero_batch_task; @@ -45,6 +48,8 @@ typedef struct buffer_zero_batch_task { struct buffer_zero_batch_task { bool *results; + uint32_t normal_page_index; + uint32_t normal_page_counter; }; #endif diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index d9451744dd..788ce699ac 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -356,6 +356,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) monitor_printf(mon, "%s: %s\n", MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_DSA_ACCEL), params->multifd_dsa_accel); + monitor_printf(mon, "%s: %u\n", + MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_NORMAL_PAGE_RATIO), + params->multifd_normal_page_ratio); if (params->has_block_bitmap_mapping) { const BitmapMigrationNodeAliasList *bmnal; @@ -675,6 +678,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) error_setg(&err, "The block-bitmap-mapping parameter can only be set " "through QMP"); break; + case MIGRATION_PARAMETER_MULTIFD_NORMAL_PAGE_RATIO: + p->has_multifd_normal_page_ratio = true; + visit_type_uint8(v, param, &p->multifd_normal_page_ratio, &err); + break; case MIGRATION_PARAMETER_X_VCPU_DIRTY_LIMIT_PERIOD: p->has_x_vcpu_dirty_limit_period = true; visit_type_size(v, param, &p->x_vcpu_dirty_limit_period, &err); diff --git a/migration/multifd.c b/migration/multifd.c index 2f635898ed..c9f9eef5b1 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -687,6 +687,37 @@ int multifd_send_sync_main(QEMUFile *f) return 0; } +static void multifd_normal_page_test_hook(MultiFDSendParams *p) +{ + /* + * The value is between 0 and 100. If the value is 10, at least + * 10% of the pages are normal pages. A zero page can be made + * a normal page but not the other way around.
+ */ + uint8_t multifd_normal_page_ratio = + migrate_multifd_normal_page_ratio(); + struct buffer_zero_batch_task *batch_task = p->batch_task; + + /* A ratio above 100 means the test hook is disabled. */ + if (multifd_normal_page_ratio > 100) { + return; + } + + for (int i = 0; i < p->pages->num; i++) { + if (batch_task->normal_page_counter < multifd_normal_page_ratio) { + /* Turn a zero page into a normal page. */ + batch_task->results[i] = false; + } + batch_task->normal_page_index++; + batch_task->normal_page_counter++; + + if (batch_task->normal_page_index >= 100) { + batch_task->normal_page_index = 0; + batch_task->normal_page_counter = 0; + } + } +} + static void set_page(MultiFDSendParams *p, bool zero_page, uint64_t offset) { RAMBlock *rb = p->pages->block; @@ -752,6 +783,8 @@ static void multifd_zero_page_check(MultiFDSendParams *p) set_normal_pages(p); } + multifd_normal_page_test_hook(p); + for (int i = 0; i < p->pages->num; i++) { uint64_t offset = p->pages->offset[i]; bool zero_page = p->batch_task->results[i]; From patchwork Tue Nov 14 05:40:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 13454812 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 88997C4167B for ; Tue, 14 Nov 2023 05:44:53 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2mBn-0007Ik-7O; Tue, 14 Nov 2023 00:42:19 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2mBj-0007Hb-QU for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:42:15 -0500
Received: from mail-qk1-x72d.google.com ([2607:f8b0:4864:20::72d]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mBh-0001Zq-VB for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:42:15 -0500 Received: by mail-qk1-x72d.google.com with SMTP id af79cd13be357-778927f2dd3so268558485a.2 for ; Mon, 13 Nov 2023 21:42:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940530; x=1700545330; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=9eRnueSlb3meh0xpvGzA55C2FuUJkOMX0kEcQV57/8g=; b=JV3AoPqqMAd8OLh+EcEic7h6rcxxN9C9lpd0/zBokXkg/OyWt0vqXD/atT2ojnB8VI wuHwKtLNsynFdcj9dR578ontssKymyozhRlNSEaBS1BnTg1n61xh4T8k/DcK9thkKpgD 6qIqzkos6Zx26SAo1b72mbl2zCSHL8/XJbAYWWXj+FVhuqZF7wR395kETdNo93x6VBZx FSnlHEPRdqYbHiOYL0GiDK+rC5bMazKK8oD9sLYOTH0+gy614zuZBzrLaPr23a23dHnt +TaDucZ/2C4IXMXUHxb9pAOv3pqrYziIq/20xBNvzihQX5EzRZHLHEu65Rkyt8ng395w R1Jw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940530; x=1700545330; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=9eRnueSlb3meh0xpvGzA55C2FuUJkOMX0kEcQV57/8g=; b=DsNOOIlD12wvD3AW9hgE5nkJTJ/0y5M5d2p6aWhaH3qUi2gsICGw1MAK721EtAw3tV Ui5Dz9MmiQWPYITKzWv1bUfKyTGdOog79wAGANocrCVw+iuxY9sgaI1Jpo2896Czv0eF F00EARuKmYxreGL5TH4am2AdNFMuNkB8wbWsudiGE4QUmT3JVv7QpME2aa/skA+RWBUY O8COABqOyUQjTJqArA19ncTEDb4QocW4o5knCWgE/CrzF7T3nERnP+KYSO+9vdq74YRs uASn8YxkGS92ZocmuvMcMv7p4orqN2/kL64xMK4VaO7CxIF5KQGAJCO0CIomzOKJmoWq m9+A== X-Gm-Message-State: AOJu0YwoBhlH7JsLO0paseXK3lUdnrrY6YjdWwVNZUYVLonONTJVS54X U0wQZZYVAkPCm/7xQuJSvbEnxQ== X-Google-Smtp-Source: AGHT+IFb5lLi69yrlJWJHpQ/LAxsW7J87H+jbc+ggOeNXz8V9gh5HwY4kTwcOo5i5arK+KHj2RQV9g== X-Received: by 2002:a05:620a:46aa:b0:775:9f9d:2c36 
with SMTP id bq42-20020a05620a46aa00b007759f9d2c36mr1505356qkb.54.1699940530456; Mon, 13 Nov 2023 21:42:10 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.42.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:42:10 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH v2 17/20] migration/multifd: Add migration option set packet size. Date: Tue, 14 Nov 2023 05:40:29 +0000 Message-Id: <20231114054032.1192027-18-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::72d; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x72d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org The current multifd packet size is fixed at 128 pages of 4 KiB each (512 KiB). This change adds an option to set the packet size, expressed as a number of pages. Both sender and receiver need to set the same packet size for migration to work.
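As a rough illustration of the constraint this option introduces, the following C sketch checks a page count against the 128 to 1024 range that this patch enforces in migrate_params_check() and converts it to bytes. The macro and function names are hypothetical and are not part of QEMU, and the 4 KiB page size used in the usage note is an assumption for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; the values mirror the bounds used in this patch. */
#define PACKET_PAGES_MIN 128   /* same as the default packet size */
#define PACKET_PAGES_MAX 1024

/* Return true if the packet size (counted in pages) is within range. */
static bool packet_pages_valid(uint64_t pages)
{
    return pages >= PACKET_PAGES_MIN && pages <= PACKET_PAGES_MAX;
}

/* Convert a validated packet size in pages to bytes. */
static uint64_t packet_bytes(uint64_t pages, uint64_t page_size)
{
    assert(packet_pages_valid(pages));
    return pages * page_size;
}
```

With a 4096-byte page, packet_bytes(128, 4096) gives 524288 bytes, which matches the previous fixed size of 512 KiB.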
Signed-off-by: Hao Xiang --- migration/options.c | 34 ++++++++++++++++++++++++++++++++++ migration/options.h | 1 + qapi/migration.json | 21 ++++++++++++++++++--- 3 files changed, 53 insertions(+), 3 deletions(-) diff --git a/migration/options.c b/migration/options.c index e7f1e2df24..81f1bf25d4 100644 --- a/migration/options.c +++ b/migration/options.c @@ -84,6 +84,12 @@ */ #define DEFAULT_MIGRATE_MULTIFD_NORMAL_PAGE_RATIO 101 +/* + * Parameter for multifd packet size. + */ +#define DEFAULT_MIGRATE_MULTIFD_PACKET_SIZE 128 +#define MAX_MIGRATE_MULTIFD_PACKET_SIZE 1024 + #define DEFINE_PROP_MIG_CAP(name, x) \ DEFINE_PROP_BOOL(name, MigrationState, capabilities[x], false) @@ -189,6 +195,9 @@ Property migration_properties[] = { DEFINE_PROP_UINT8("multifd-normal-page-ratio", MigrationState, parameters.multifd_normal_page_ratio, DEFAULT_MIGRATE_MULTIFD_NORMAL_PAGE_RATIO), + DEFINE_PROP_SIZE("multifd-packet-size", MigrationState, + parameters.multifd_packet_size, + DEFAULT_MIGRATE_MULTIFD_PACKET_SIZE), /* Migration capabilities */ DEFINE_PROP_MIG_CAP("x-xbzrle", MIGRATION_CAPABILITY_XBZRLE), @@ -874,6 +883,13 @@ uint8_t migrate_multifd_normal_page_ratio(void) return s->parameters.multifd_normal_page_ratio; } +uint64_t migrate_multifd_packet_size(void) +{ + MigrationState *s = migrate_get_current(); + + return s->parameters.multifd_packet_size; +} + MultiFDCompression migrate_multifd_compression(void) { MigrationState *s = migrate_get_current(); @@ -1012,6 +1028,8 @@ MigrationParameters *qmp_query_migrate_parameters(Error **errp) params->x_checkpoint_delay = s->parameters.x_checkpoint_delay; params->has_block_incremental = true; params->block_incremental = s->parameters.block_incremental; + params->has_multifd_packet_size = true; + params->multifd_packet_size = s->parameters.multifd_packet_size; params->has_multifd_channels = true; params->multifd_channels = s->parameters.multifd_channels; params->has_multifd_compression = true; @@ -1072,6 +1090,7 @@ void 
migrate_params_init(MigrationParameters *params) params->has_downtime_limit = true; params->has_x_checkpoint_delay = true; params->has_block_incremental = true; + params->has_multifd_packet_size = true; params->has_multifd_channels = true; params->has_multifd_compression = true; params->has_multifd_zlib_level = true; @@ -1170,6 +1189,15 @@ bool migrate_params_check(MigrationParameters *params, Error **errp) /* x_checkpoint_delay is now always positive */ + if (params->has_multifd_packet_size && + ((params->multifd_packet_size < DEFAULT_MIGRATE_MULTIFD_PACKET_SIZE) || + (params->multifd_packet_size > MAX_MIGRATE_MULTIFD_PACKET_SIZE))) { + error_setg(errp, QERR_INVALID_PARAMETER_VALUE, + "multifd_packet_size", + "a value between 128 and 1024"); + return false; + } + if (params->has_multifd_channels && (params->multifd_channels < 1)) { error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "multifd_channels", @@ -1351,6 +1379,9 @@ static void migrate_params_test_apply(MigrateSetParameters *params, if (params->has_block_incremental) { dest->block_incremental = params->block_incremental; } + if (params->has_multifd_packet_size) { + dest->multifd_packet_size = params->multifd_packet_size; + } if (params->has_multifd_channels) { dest->multifd_channels = params->multifd_channels; } @@ -1496,6 +1527,9 @@ static void migrate_params_apply(MigrateSetParameters *params, Error **errp) " use blockdev-mirror with NBD instead"); s->parameters.block_incremental = params->block_incremental; } + if (params->has_multifd_packet_size) { + s->parameters.multifd_packet_size = params->multifd_packet_size; + } if (params->has_multifd_channels) { s->parameters.multifd_channels = params->multifd_channels; } diff --git a/migration/options.h b/migration/options.h index 21e3e7b0cf..5816f6dac2 100644 --- a/migration/options.h +++ b/migration/options.h @@ -96,6 +96,7 @@ const char *migrate_tls_hostname(void); uint64_t migrate_xbzrle_cache_size(void); const char *migrate_multifd_dsa_accel(void); uint8_t 
migrate_multifd_normal_page_ratio(void); +uint64_t migrate_multifd_packet_size(void); /* parameters setters */ diff --git a/qapi/migration.json b/qapi/migration.json index bb876c8325..f87daddf33 100644 --- a/qapi/migration.json +++ b/qapi/migration.json @@ -885,6 +885,10 @@ # @multifd-normal-page-ratio: Test hook setting the normal page ratio. # (Since 8.2) # +# @multifd-packet-size: Packet size used to migrate data. This value +# indicates the number of pages in a packet. The default value +# is 128 and max value is 1024. (Since 8.2) +# # Features: # # @deprecated: Member @block-incremental is deprecated. Use @@ -919,7 +923,8 @@ { 'name': 'x-vcpu-dirty-limit-period', 'features': ['unstable'] }, 'vcpu-dirty-limit', 'mode', - 'multifd-normal-page-ratio'] } + 'multifd-normal-page-ratio', + 'multifd-packet-size'] } ## # @MigrateSetParameters: @@ -1080,6 +1085,10 @@ # @multifd-normal-page-ratio: Test hook setting the normal page ratio. # (Since 8.2) # +# @multifd-packet-size: Packet size used to migrate data. This value +# indicates the number of pages in a packet. The default value +# is 128 and max value is 1024. (Since 8.2) +# # Features: # # @deprecated: Member @block-incremental is deprecated. Use @@ -1135,7 +1144,8 @@ '*vcpu-dirty-limit': 'uint64', '*mode': 'MigMode', '*multifd-dsa-accel': 'StrOrNull', - '*multifd-normal-page-ratio': 'uint8'} } + '*multifd-normal-page-ratio': 'uint8', + '*multifd-packet-size' : 'uint64'} } ## # @migrate-set-parameters: @@ -1316,6 +1326,10 @@ # @multifd-normal-page-ratio: Test hook setting the normal page ratio. # (Since 8.2) # +# @multifd-packet-size: Packet size used to migrate data. This value +# indicates the number of pages in a packet. The default value +# is 128 and max value is 1024. (Since 8.2) +# # Features: # # @deprecated: Member @block-incremental is deprecated. 
Use @@ -1368,7 +1382,8 @@ '*vcpu-dirty-limit': 'uint64', '*mode': 'MigMode', '*multifd-dsa-accel': 'str', - '*multifd-normal-page-ratio': 'uint8'} } + '*multifd-normal-page-ratio': 'uint8', + '*multifd-packet-size': 'uint64'} } ## # @query-migrate-parameters: From patchwork Tue Nov 14 05:40:30 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hao Xiang X-Patchwork-Id: 13454800 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D8A0CC4167D for ; Tue, 14 Nov 2023 05:43:36 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1r2mBy-0007lO-2P; Tue, 14 Nov 2023 00:42:30 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1r2mBl-0007IP-Fq for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:42:17 -0500 Received: from mail-qk1-x730.google.com ([2607:f8b0:4864:20::730]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1r2mBh-0001Zz-VM for qemu-devel@nongnu.org; Tue, 14 Nov 2023 00:42:17 -0500 Received: by mail-qk1-x730.google.com with SMTP id af79cd13be357-778940531dbso322665485a.0 for ; Mon, 13 Nov 2023 21:42:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1699940532; x=1700545332; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=vt+w0QrMq+jm8GxkTLfUKv1nYJwURMZ3Gb3we60cQuI=; 
b=FCI4v9axDZMiNucVrhtJtQTU5whLpR9aUdUahWnCnbzBECXfXaThY6itVnJsEfhBV5 mk3xgNXcxQ3RuR7pmmHikPQEFFCm+6KFVA1zra/9d45j6jx4shPPw4FxKOeq+EylCZ3g Ey1lYCiy1+bSMR0HobUy+Ar0mG3aVTiWVSG3rSHb+1i5oL1E28/p8RT6wTuoTT5hn00o TtPLMQBVQH/c6+n4ounx7JgeI2Ioj8CvMoMZY/4zfQ1NC7cgJho9zBghuen7IplhCaYY mVVfTPNy8in/iRePsFSIh6mjP9ErhMQXMRU5FWnzYUsbhb939tMUruubDANtrKwXUBoW V5bg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1699940532; x=1700545332; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=vt+w0QrMq+jm8GxkTLfUKv1nYJwURMZ3Gb3we60cQuI=; b=DxvWsv95XlsXBLaTiqyYbXcxfTtLzd7yfH+ra8dtYRWz+KdWcZmi1yjwlP7uJGBKSB 9yAZuybMSlLNgWLrzMSVMAX0j2kRLgFqx6n5+ApD4dZ7a1XMw+H6hVH6PwpQtBQxWfhy tPGOrPgB10XUKq4uD3WbuIPl0E5Gby6yp8f5UaGy29+VIIoW0+4jw6+6X5eSm3C5VSjP yDVbEtaHqiizMpL/rfOQC2d5hDevwLzpGx8DAlxfeeENu1b6ArSVbdyFDgAMZRsaPhTy 70mrzpxSw5WvlS0Tn4fcnNgfbay8ebrKIub/WplC13FDB+jikN4ZLmdDYaQnTryfAYol oKoQ== X-Gm-Message-State: AOJu0YyaRaX02Nt68U9xq0a3I/0Ag4dJUuwxoqEF0DkW8cfcrQNx+2Gl on2zR0WpSjFcYVuSh6Osoyu65w== X-Google-Smtp-Source: AGHT+IFGZd451rj1nu9ZhzY8GwHunn4e/Y9ORufoZJW2QfoF62XqhilzhyKhbDGKsRNbsUKTHTtcmQ== X-Received: by 2002:a05:620a:35a:b0:76c:e7b7:1d9d with SMTP id t26-20020a05620a035a00b0076ce7b71d9dmr1277050qkm.27.1699940531961; Mon, 13 Nov 2023 21:42:11 -0800 (PST) Received: from n231-230-216.byted.org ([130.44.212.104]) by smtp.gmail.com with ESMTPSA id w2-20020a05620a094200b0077891d2d12dsm2400367qkw.43.2023.11.13.21.42.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 13 Nov 2023 21:42:11 -0800 (PST) From: Hao Xiang To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org Cc: Hao Xiang Subject: [PATCH v2 18/20] migration/multifd: Enable set packet size migration option. 
Date: Tue, 14 Nov 2023 05:40:30 +0000 Message-Id: <20231114054032.1192027-19-hao.xiang@bytedance.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com> References: <20231114054032.1192027-1-hao.xiang@bytedance.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::730; envelope-from=hao.xiang@bytedance.com; helo=mail-qk1-x730.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org During live migration, if both the latency and the bandwidth between sender and receiver are high (a long fat pipe), using a bigger packet size can help reduce total migration time. In addition, Intel DSA offloading performs better with larger batch tasks. Providing an option to set the packet size is therefore useful for performance tuning.
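Because the parameter counts pages rather than bytes, byte-sized buffers have to be derived from it. The sketch below mirrors the arithmetic this patch uses in the zlib and zstd receive setup (page count times target page size, then doubled). The helper name is hypothetical, and the page size is passed in explicitly instead of being read from qemu_target_page_size().

```c
#include <stdint.h>

/*
 * Hypothetical helper: size in bytes of a receive-side buffer for one
 * packet. The packet size is given in pages; the result reserves twice
 * the uncompressed packet size, as the receive setup in this patch does.
 */
static uint64_t recv_buffer_bytes(uint64_t packet_pages, uint64_t page_size)
{
    uint64_t packet_bytes = packet_pages * page_size;

    return packet_bytes * 2;
}
```

For a packet size of 512 pages and 4096-byte pages, this yields a 4 MiB buffer per channel, compared with 1 MiB at the default of 128 pages.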
Set the option: migrate_set_parameter multifd-packet-size 512 Signed-off-by: Hao Xiang --- migration/migration-hmp-cmds.c | 7 +++++++ migration/multifd-zlib.c | 8 ++++++-- migration/multifd-zstd.c | 8 ++++++-- migration/multifd.c | 4 ++-- migration/multifd.h | 3 --- 5 files changed, 21 insertions(+), 9 deletions(-) diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 788ce699ac..2d0c71294c 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -338,6 +338,9 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict) monitor_printf(mon, "%s: %s\n", MigrationParameter_str(MIGRATION_PARAMETER_BLOCK_INCREMENTAL), params->block_incremental ? "on" : "off"); + monitor_printf(mon, "%s: %" PRIu64 "\n", + MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_PACKET_SIZE), + params->multifd_packet_size); monitor_printf(mon, "%s: %u\n", MigrationParameter_str(MIGRATION_PARAMETER_MULTIFD_CHANNELS), params->multifd_channels); @@ -626,6 +629,10 @@ void hmp_migrate_set_parameter(Monitor *mon, const QDict *qdict) p->multifd_dsa_accel->type = QTYPE_QSTRING; visit_type_str(v, param, &p->multifd_dsa_accel->u.s, &err); break; + case MIGRATION_PARAMETER_MULTIFD_PACKET_SIZE: + p->has_multifd_packet_size = true; + visit_type_size(v, param, &p->multifd_packet_size, &err); + break; case MIGRATION_PARAMETER_MULTIFD_CHANNELS: p->has_multifd_channels = true; visit_type_uint8(v, param, &p->multifd_channels, &err); diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c index 37ce48621e..453c85d725 100644 --- a/migration/multifd-zlib.c +++ b/migration/multifd-zlib.c @@ -49,6 +49,8 @@ static int zlib_send_setup(MultiFDSendParams *p, Error **errp) struct zlib_data *z = g_new0(struct zlib_data, 1); z_stream *zs = &z->zs; const char *err_msg; + uint64_t multifd_packet_size = + migrate_multifd_packet_size() * qemu_target_page_size(); zs->zalloc = Z_NULL; zs->zfree = Z_NULL; @@ -58,7 +60,7 @@ static int 
zlib_send_setup(MultiFDSendParams *p, Error **errp) goto err_free_z; } /* This is the maximum size of the compressed buffer */ - z->zbuff_len = compressBound(MULTIFD_PACKET_SIZE); + z->zbuff_len = compressBound(multifd_packet_size); z->zbuff = g_try_malloc(z->zbuff_len); if (!z->zbuff) { err_msg = "out of memory for zbuff"; @@ -186,6 +188,8 @@ static int zlib_send_prepare(MultiFDSendParams *p, Error **errp) */ static int zlib_recv_setup(MultiFDRecvParams *p, Error **errp) { + uint64_t multifd_packet_size = + migrate_multifd_packet_size() * qemu_target_page_size(); struct zlib_data *z = g_new0(struct zlib_data, 1); z_stream *zs = &z->zs; @@ -200,7 +204,7 @@ static int zlib_recv_setup(MultiFDRecvParams *p, Error **errp) return -1; } /* To be safe, we reserve twice the size of the packet */ - z->zbuff_len = MULTIFD_PACKET_SIZE * 2; + z->zbuff_len = multifd_packet_size * 2; z->zbuff = g_try_malloc(z->zbuff_len); if (!z->zbuff) { inflateEnd(zs); diff --git a/migration/multifd-zstd.c b/migration/multifd-zstd.c index b471daadcd..60298861d6 100644 --- a/migration/multifd-zstd.c +++ b/migration/multifd-zstd.c @@ -49,6 +49,8 @@ struct zstd_data { */ static int zstd_send_setup(MultiFDSendParams *p, Error **errp) { + uint64_t multifd_packet_size = + migrate_multifd_packet_size() * qemu_target_page_size(); struct zstd_data *z = g_new0(struct zstd_data, 1); int res; @@ -69,7 +71,7 @@ static int zstd_send_setup(MultiFDSendParams *p, Error **errp) return -1; } /* This is the maximum size of the compressed buffer */ - z->zbuff_len = ZSTD_compressBound(MULTIFD_PACKET_SIZE); + z->zbuff_len = ZSTD_compressBound(multifd_packet_size); z->zbuff = g_try_malloc(z->zbuff_len); if (!z->zbuff) { ZSTD_freeCStream(z->zcs); @@ -175,6 +177,8 @@ static int zstd_send_prepare(MultiFDSendParams *p, Error **errp) */ static int zstd_recv_setup(MultiFDRecvParams *p, Error **errp) { + uint64_t multifd_packet_size = + migrate_multifd_packet_size() * qemu_target_page_size(); struct zstd_data *z = 
g_new0(struct zstd_data, 1);
     int ret;
@@ -196,7 +200,7 @@ static int zstd_recv_setup(MultiFDRecvParams *p, Error **errp)
     }
 
     /* To be safe, we reserve twice the size of the packet */
-    z->zbuff_len = MULTIFD_PACKET_SIZE * 2;
+    z->zbuff_len = multifd_packet_size * 2;
     z->zbuff = g_try_malloc(z->zbuff_len);
     if (!z->zbuff) {
         ZSTD_freeDStream(z->zds);
diff --git a/migration/multifd.c b/migration/multifd.c
index c9f9eef5b1..fbe8bbcc5c 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -1051,7 +1051,7 @@ static void multifd_new_send_channel_create(gpointer opaque)
 int multifd_save_setup(Error **errp)
 {
     int thread_count;
-    uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size();
+    uint32_t page_count = migrate_multifd_packet_size();
     uint8_t i;
     const char *dsa_parameter = migrate_multifd_dsa_accel();
 
@@ -1329,7 +1329,7 @@ static void *multifd_recv_thread(void *opaque)
 int multifd_load_setup(Error **errp)
 {
     int thread_count;
-    uint32_t page_count = MULTIFD_PACKET_SIZE / qemu_target_page_size();
+    uint32_t page_count = migrate_multifd_packet_size();
     uint8_t i;
     const char *dsa_parameter = migrate_multifd_dsa_accel();
 
diff --git a/migration/multifd.h b/migration/multifd.h
index 62f31b03c0..173c3f4171 100644
--- a/migration/multifd.h
+++ b/migration/multifd.h
@@ -34,9 +34,6 @@ int multifd_queue_page(QEMUFile *f, RAMBlock *block, ram_addr_t offset);
 #define MULTIFD_FLAG_ZLIB (1 << 1)
 #define MULTIFD_FLAG_ZSTD (2 << 1)
 
-/* This value needs to be a multiple of qemu_target_page_size() */
-#define MULTIFD_PACKET_SIZE (512 * 1024)
-
 typedef struct {
     uint32_t magic;
     uint32_t version;

From patchwork Tue Nov 14 05:40:31 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hao Xiang
X-Patchwork-Id: 13454795
From: Hao Xiang
To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org
Cc: Hao Xiang
Subject: [PATCH v2 19/20] util/dsa: Add unit test coverage for Intel DSA task submission and completion.
Date: Tue, 14 Nov 2023 05:40:31 +0000
Message-Id: <20231114054032.1192027-20-hao.xiang@bytedance.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com>
References: <20231114054032.1192027-1-hao.xiang@bytedance.com>
MIME-Version: 1.0

* Test DSA start and stop path.
* Test DSA configure and cleanup path.
* Test DSA task submission and completion path.
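
For readers without DSA hardware, the core check in the submission and completion tests can be sketched in plain C: run a batched zero check over an array of page addresses, then compare every per-page result with a software reference. This is only an illustration under stated assumptions. `ref_is_zero()`, `ref_batch_is_zero()` and `count_batch_mismatches()` are hypothetical names that do not exist in QEMU; they stand in for `buffer_is_zero()` and the DSA batch call, and the `max_batch` limit mimics the oversized-batch rejection that the tests expect.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical software reference; stands in for QEMU's buffer_is_zero(). */
static bool ref_is_zero(const void *buf, size_t len)
{
    const unsigned char *p = buf;

    for (size_t i = 0; i < len; i++) {
        if (p[i] != 0) {
            return false;
        }
    }
    return true;
}

/*
 * Hypothetical stand-in for a batched zero check. Fills results[i] for each
 * of the count buffers; returns -1 when count exceeds max_batch, else 0.
 */
static int ref_batch_is_zero(const void **addrs, size_t count, size_t len,
                             size_t max_batch, bool *results)
{
    assert(addrs != NULL && results != NULL && len > 0);
    if (count > max_batch) {
        return -1;
    }
    for (size_t i = 0; i < count; i++) {
        results[i] = ref_is_zero(addrs[i], len);
    }
    return 0;
}

/*
 * Build a batch whose first half is zero pages and whose second half is
 * 0xff pages, then count the results that disagree with that layout.
 */
static int count_batch_mismatches(void)
{
    enum { PAGES = 8, PAGE_LEN = 64 };
    static unsigned char buf[PAGES * PAGE_LEN];
    const void *addrs[PAGES];
    bool results[PAGES];
    int mismatches = 0;

    memset(buf, 0, sizeof(buf));
    memset(buf + (PAGES / 2) * PAGE_LEN, 0xff, (PAGES / 2) * PAGE_LEN);
    for (int i = 0; i < PAGES; i++) {
        addrs[i] = buf + i * PAGE_LEN;
    }
    if (ref_batch_is_zero(addrs, PAGES, PAGE_LEN, PAGES, results) != 0) {
        return -1;
    }
    for (int i = 0; i < PAGES; i++) {
        if (results[i] != (i < PAGES / 2)) {
            mismatches++;
        }
    }
    return mismatches;
}
```

Setting the buffer contents explicitly, rather than reusing whatever happens to be on the stack, keeps the expected result for every page known in advance.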

Signed-off-by: Bryan Zhang
Signed-off-by: Hao Xiang
---
 tests/unit/meson.build |   6 +
 tests/unit/test-dsa.c  | 466 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 472 insertions(+)
 create mode 100644 tests/unit/test-dsa.c

diff --git a/tests/unit/meson.build b/tests/unit/meson.build
index a05d471090..72e22063dc 100644
--- a/tests/unit/meson.build
+++ b/tests/unit/meson.build
@@ -54,6 +54,12 @@ tests = {
   'test-virtio-dmabuf': [meson.project_source_root() / 'hw/display/virtio-dmabuf.c'],
 }
 
+if config_host_data.get('CONFIG_DSA_OPT')
+  tests += {
+    'test-dsa': [],
+  }
+endif
+
 if have_system or have_tools
   tests += {
     'test-qmp-event': [testqapi],
diff --git a/tests/unit/test-dsa.c b/tests/unit/test-dsa.c
new file mode 100644
index 0000000000..d2f23c3dba
--- /dev/null
+++ b/tests/unit/test-dsa.c
@@ -0,0 +1,466 @@
+/*
+ * Test DSA functions.
+ *
+ * Copyright (c) 2023 Hao Xiang
+ * Copyright (c) 2023 Bryan Zhang
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2.1 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see .
+ */
+#include "qemu/osdep.h"
+#include "qemu/host-utils.h"
+
+#include "qemu/cutils.h"
+#include "qemu/memalign.h"
+#include "qemu/dsa.h"
+
+// TODO Make these not-hardcoded.
+static const char *path1 = "/dev/dsa/wq4.0";
+static const char *path2 = "/dev/dsa/wq4.0 /dev/dsa/wq4.1";
+static const int num_devices = 2;
+
+static struct buffer_zero_batch_task batch_task __attribute__((aligned(64)));
+
+// TODO Communicate that DSA must be configured to support this batch size.
+// TODO Alternatively, poke the DSA device to figure out batch size.
+static int batch_size = 128;
+static int page_size = 4096;
+
+// A helper for running a single task and checking for correctness.
+static void do_single_task(void)
+{
+    buffer_zero_batch_task_init(&batch_task, batch_size);
+    char buf[page_size];
+    char* ptr = buf;
+
+    buffer_is_zero_dsa_batch_async(&batch_task,
+                                   (const void**) &ptr,
+                                   1,
+                                   page_size);
+    g_assert(batch_task.results[0] == buffer_is_zero(buf, page_size));
+}
+
+static void test_single_zero(void)
+{
+    g_assert(!dsa_init(path1));
+    dsa_start();
+
+    buffer_zero_batch_task_init(&batch_task, batch_size);
+
+    char buf[page_size];
+    char* ptr = buf;
+
+    memset(buf, 0x0, page_size);
+    buffer_is_zero_dsa_batch_async(&batch_task,
+                                   (const void**) &ptr,
+                                   1, page_size);
+    g_assert(batch_task.results[0]);
+
+    dsa_cleanup();
+}
+
+static void test_single_zero_async(void)
+{
+    test_single_zero();
+}
+
+static void test_single_nonzero(void)
+{
+    g_assert(!dsa_init(path1));
+    dsa_start();
+
+    buffer_zero_batch_task_init(&batch_task, batch_size);
+
+    char buf[page_size];
+    char* ptr = buf;
+
+    memset(buf, 0x1, page_size);
+    buffer_is_zero_dsa_batch_async(&batch_task,
+                                   (const void**) &ptr,
+                                   1, page_size);
+    g_assert(!batch_task.results[0]);
+
+    dsa_cleanup();
+}
+
+static void test_single_nonzero_async(void)
+{
+    test_single_nonzero();
+}
+
+// count == 0 should return quickly without calling into DSA.
+static void test_zero_count_async(void)
+{
+    char buf[page_size];
+    buffer_is_zero_dsa_batch_async(&batch_task,
+                                   (const void **) &buf,
+                                   0,
+                                   page_size);
+}
+
+static void test_null_task_async(void)
+{
+    if (g_test_subprocess()) {
+        g_assert(!dsa_init(path1));
+
+        char buf[page_size * batch_size];
+        char *addrs[batch_size];
+        for (int i = 0; i < batch_size; i++) {
+            addrs[i] = buf + (page_size * i);
+        }
+
+        buffer_is_zero_dsa_batch_async(NULL, (const void**) addrs, batch_size,
+                                       page_size);
+    } else {
+        g_test_trap_subprocess(NULL, 0, 0);
+        g_test_trap_assert_failed();
+    }
+}
+
+static void test_oversized_batch(void)
+{
+    g_assert(!dsa_init(path1));
+    dsa_start();
+
+    buffer_zero_batch_task_init(&batch_task, batch_size);
+
+    int oversized_batch_size = batch_size + 1;
+    char buf[page_size * oversized_batch_size];
+    char *addrs[oversized_batch_size];
+    for (int i = 0; i < oversized_batch_size; i++) {
+        addrs[i] = buf + (page_size * i);
+    }
+
+    int ret = buffer_is_zero_dsa_batch_async(&batch_task,
+                                             (const void**) addrs,
+                                             oversized_batch_size,
+                                             page_size);
+    g_assert(ret != 0);
+
+    dsa_cleanup();
+}
+
+static void test_oversized_batch_async(void)
+{
+    test_oversized_batch();
+}
+
+static void test_zero_len_async(void)
+{
+    if (g_test_subprocess()) {
+        g_assert(!dsa_init(path1));
+
+        buffer_zero_batch_task_init(&batch_task, batch_size);
+
+        char buf[page_size];
+
+        buffer_is_zero_dsa_batch_async(&batch_task,
+                                       (const void**) &buf,
+                                       1,
+                                       0);
+    } else {
+        g_test_trap_subprocess(NULL, 0, 0);
+        g_test_trap_assert_failed();
+    }
+}
+
+static void test_null_buf_async(void)
+{
+    if (g_test_subprocess()) {
+        g_assert(!dsa_init(path1));
+
+        buffer_zero_batch_task_init(&batch_task, batch_size);
+
+        buffer_is_zero_dsa_batch_async(&batch_task, NULL, 1, page_size);
+    } else {
+        g_test_trap_subprocess(NULL, 0, 0);
+        g_test_trap_assert_failed();
+    }
+}
+
+static void test_batch(void)
+{
+    g_assert(!dsa_init(path1));
+    dsa_start();
+
+    buffer_zero_batch_task_init(&batch_task, batch_size);
+
+    char buf[page_size * batch_size];
+    char *addrs[batch_size];
+    for (int i = 0; i < batch_size; i++) {
+        addrs[i] = buf + (page_size * i);
+    }
+
+    // Using whatever is on the stack is somewhat random.
+    // Manually set some pages to zero and some to nonzero.
+    memset(buf + 0, 0, page_size * 10);
+    memset(buf + (10 * page_size), 0xff, page_size * 10);
+
+    buffer_is_zero_dsa_batch_async(&batch_task,
+                                   (const void**) addrs,
+                                   batch_size,
+                                   page_size);
+
+    bool is_zero;
+    for (int i = 0; i < batch_size; i++) {
+        is_zero = buffer_is_zero((const void*) &buf[page_size * i], page_size);
+        g_assert(batch_task.results[i] == is_zero);
+    }
+    dsa_cleanup();
+}
+
+static void test_batch_async(void)
+{
+    test_batch();
+}
+
+static void test_page_fault(void)
+{
+    g_assert(!dsa_init(path1));
+    dsa_start();
+
+    char* buf[2];
+    int prot = PROT_READ | PROT_WRITE;
+    int flags = MAP_SHARED | MAP_ANON;
+    buf[0] = (char*) mmap(NULL, page_size * batch_size, prot, flags, -1, 0);
+    assert(buf[0] != MAP_FAILED);
+    buf[1] = (char*) malloc(page_size * batch_size);
+    assert(buf[1] != NULL);
+
+    for (int j = 0; j < 2; j++) {
+        buffer_zero_batch_task_init(&batch_task, batch_size);
+
+        char *addrs[batch_size];
+        for (int i = 0; i < batch_size; i++) {
+            addrs[i] = buf[j] + (page_size * i);
+        }
+
+        buffer_is_zero_dsa_batch_async(&batch_task,
+                                       (const void**) addrs,
+                                       batch_size,
+                                       page_size);
+
+        bool is_zero;
+        for (int i = 0; i < batch_size; i++) {
+            is_zero = buffer_is_zero((const void*) &buf[j][page_size * i], page_size);
+            g_assert(batch_task.results[i] == is_zero);
+        }
+    }
+
+    assert(!munmap(buf[0], page_size * batch_size));
+    free(buf[1]);
+    dsa_cleanup();
+}
+
+static void test_various_buffer_sizes(void)
+{
+    g_assert(!dsa_init(path1));
+    dsa_start();
+
+    int len = 1 << 4;
+    for (int count = 12; count > 0; count--, len <<= 1) {
+        buffer_zero_batch_task_init(&batch_task, batch_size);
+
+        char buf[len * batch_size];
+        char *addrs[batch_size];
+        for (int i = 0; i < batch_size; i++) {
+            addrs[i] = buf + (len * i);
+        }
+
+        buffer_is_zero_dsa_batch_async(&batch_task,
+                                       (const void**) addrs,
+                                       batch_size,
+                                       len);
+
+        bool is_zero;
+        for (int j = 0; j < batch_size; j++) {
+            is_zero = buffer_is_zero((const void*) &buf[len * j], len);
+            g_assert(batch_task.results[j] == is_zero);
+        }
+    }
+
+    dsa_cleanup();
+}
+
+static void test_various_buffer_sizes_async(void)
+{
+    test_various_buffer_sizes();
+}
+
+static void test_double_start_stop(void)
+{
+    g_assert(!dsa_init(path1));
+    // Double start
+    dsa_start();
+    dsa_start();
+    g_assert(dsa_is_running());
+    do_single_task();
+
+    // Double stop
+    dsa_stop();
+    g_assert(!dsa_is_running());
+    dsa_stop();
+    g_assert(!dsa_is_running());
+
+    // Restart
+    dsa_start();
+    g_assert(dsa_is_running());
+    do_single_task();
+    dsa_cleanup();
+}
+
+static void test_is_running(void)
+{
+    g_assert(!dsa_init(path1));
+
+    g_assert(!dsa_is_running());
+    dsa_start();
+    g_assert(dsa_is_running());
+    dsa_stop();
+    g_assert(!dsa_is_running());
+    dsa_cleanup();
+}
+
+static void test_multiple_engines(void)
+{
+    g_assert(!dsa_init(path2));
+    dsa_start();
+
+    struct buffer_zero_batch_task tasks[num_devices]
+        __attribute__((aligned(64)));
+    char bufs[num_devices][page_size * batch_size];
+    char *addrs[num_devices][batch_size];
+
+    // This is a somewhat implementation-specific way of testing that the tasks
+    // have unique engines assigned to them.
+    buffer_zero_batch_task_init(&tasks[0], batch_size);
+    buffer_zero_batch_task_init(&tasks[1], batch_size);
+    g_assert(tasks[0].device != tasks[1].device);
+
+    for (int i = 0; i < num_devices; i++) {
+        for (int j = 0; j < batch_size; j++) {
+            addrs[i][j] = bufs[i] + (page_size * j);
+        }
+
+        buffer_is_zero_dsa_batch_async(&tasks[i],
+                                       (const void**) addrs[i],
+                                       batch_size, page_size);
+
+        bool is_zero;
+        for (int j = 0; j < batch_size; j++) {
+            is_zero = buffer_is_zero((const void*) &bufs[i][page_size * j],
+                                     page_size);
+            g_assert(tasks[i].results[j] == is_zero);
+        }
+    }
+
+    dsa_cleanup();
+}
+
+static void test_configure_dsa_twice(void)
+{
+    g_assert(!dsa_init(path2));
+    g_assert(!dsa_init(path2));
+    dsa_start();
+    do_single_task();
+    dsa_cleanup();
+}
+
+static void test_configure_dsa_bad_path(void)
+{
+    const char* bad_path = "/not/a/real/path";
+    g_assert(dsa_init(bad_path));
+}
+
+static void test_cleanup_before_configure(void)
+{
+    dsa_cleanup();
+    g_assert(!dsa_init(path2));
+}
+
+static void test_configure_dsa_num_devices(void)
+{
+    g_assert(!dsa_init(path1));
+    dsa_start();
+
+    do_single_task();
+    dsa_stop();
+    dsa_cleanup();
+}
+
+static void test_cleanup_twice(void)
+{
+    g_assert(!dsa_init(path2));
+    dsa_cleanup();
+    dsa_cleanup();
+
+    g_assert(!dsa_init(path2));
+    dsa_start();
+    do_single_task();
+    dsa_cleanup();
+}
+
+static int check_test_setup(void)
+{
+    const char *path[2] = {path1, path2};
+    for (int i = 0; i < sizeof(path) / sizeof(char *); i++) {
+        if (dsa_init(path[i])) {
+            return -1;
+        }
+        dsa_cleanup();
+    }
+    return 0;
+}
+
+int main(int argc, char **argv)
+{
+    g_test_init(&argc, &argv, NULL);
+
+    if (check_test_setup() != 0) {
+        /*
+         * This test requires extra setup. The current
+         * setup is not correct. Just skip this test
+         * for now.
+         */
+        exit(0);
+    }
+
+    if (num_devices > 1) {
+        g_test_add_func("/dsa/multiple_engines", test_multiple_engines);
+    }
+
+    g_test_add_func("/dsa/async/batch", test_batch_async);
+    g_test_add_func("/dsa/async/various_buffer_sizes",
+                    test_various_buffer_sizes_async);
+    g_test_add_func("/dsa/async/null_buf", test_null_buf_async);
+    g_test_add_func("/dsa/async/zero_len", test_zero_len_async);
+    g_test_add_func("/dsa/async/oversized_batch", test_oversized_batch_async);
+    g_test_add_func("/dsa/async/zero_count", test_zero_count_async);
+    g_test_add_func("/dsa/async/single_zero", test_single_zero_async);
+    g_test_add_func("/dsa/async/single_nonzero", test_single_nonzero_async);
+    g_test_add_func("/dsa/async/null_task", test_null_task_async);
+    g_test_add_func("/dsa/async/page_fault", test_page_fault);
+
+    g_test_add_func("/dsa/double_start_stop", test_double_start_stop);
+    g_test_add_func("/dsa/is_running", test_is_running);
+
+    g_test_add_func("/dsa/configure_dsa_twice", test_configure_dsa_twice);
+    g_test_add_func("/dsa/configure_dsa_bad_path", test_configure_dsa_bad_path);
+    g_test_add_func("/dsa/cleanup_before_configure",
+                    test_cleanup_before_configure);
+    g_test_add_func("/dsa/configure_dsa_num_devices",
+                    test_configure_dsa_num_devices);
+    g_test_add_func("/dsa/cleanup_twice", test_cleanup_twice);
+
+    return g_test_run();
+}

From patchwork Tue Nov 14 05:40:32 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hao Xiang
X-Patchwork-Id: 13454810
From: Hao Xiang
To: farosas@suse.de, peter.maydell@linaro.org, quintela@redhat.com, peterx@redhat.com, marcandre.lureau@redhat.com, bryan.zhang@bytedance.com, qemu-devel@nongnu.org
Cc: Hao Xiang
Subject: [PATCH v2 20/20] migration/multifd: Add integration tests for multifd with Intel DSA offloading.
Date: Tue, 14 Nov 2023 05:40:32 +0000
Message-Id: <20231114054032.1192027-21-hao.xiang@bytedance.com>
X-Mailer: git-send-email 2.30.2
In-Reply-To: <20231114054032.1192027-1-hao.xiang@bytedance.com>
References: <20231114054032.1192027-1-hao.xiang@bytedance.com>
MIME-Version: 1.0

* Add test case to start and complete multifd live migration with DSA offloading enabled.
* Add test case to start and cancel multifd live migration with DSA offloading enabled.

Signed-off-by: Bryan Zhang
Signed-off-by: Hao Xiang
---
 tests/qtest/migration-test.c | 77 +++++++++++++++++++++++++++++++++++-
 1 file changed, 76 insertions(+), 1 deletion(-)

diff --git a/tests/qtest/migration-test.c b/tests/qtest/migration-test.c
index 5752412b64..3ffbdd5a65 100644
--- a/tests/qtest/migration-test.c
+++ b/tests/qtest/migration-test.c
@@ -639,6 +639,12 @@ typedef struct {
     const char *opts_target;
 } MigrateStart;
 
+/*
+ * It requires separate steps to configure and enable DSA device.
+ * This test assumes that the configuration is done already.
+ */
+static const char* dsa_dev_path = "/dev/dsa/wq4.0";
+
 /*
  * A hook that runs after the src and dst QEMUs have been
  * created, but before the migration is started. This can
@@ -2775,7 +2781,7 @@ static void test_multifd_tcp_tls_x509_reject_anon_client(void)
  *
  * And see that it works
  */
-static void test_multifd_tcp_cancel(void)
+static void test_multifd_tcp_cancel_common(bool use_dsa)
 {
     MigrateStart args = {
         .hide_stderr = true,
@@ -2796,6 +2802,10 @@ static void test_multifd_tcp_cancel(void)
     migrate_set_capability(from, "multifd", true);
     migrate_set_capability(to, "multifd", true);
 
+    if (use_dsa) {
+        migrate_set_parameter_str(from, "multifd-dsa-accel", dsa_dev_path);
+    }
+
     /* Start incoming migration from the 1st socket */
     migrate_incoming_qmp(to, "tcp:127.0.0.1:0", "{}");
 
@@ -2852,6 +2862,48 @@ static void test_multifd_tcp_cancel(void)
     test_migrate_end(from, to2, true);
 }
 
+/*
+ * This test does:
+ *  source               target
+ *                       migrate_incoming
+ *     migrate
+ *     migrate_cancel
+ *                       launch another target
+ *     migrate
+ *
+ * And see that it works
+ */
+static void test_multifd_tcp_cancel(void)
+{
+    test_multifd_tcp_cancel_common(false);
+}
+
+#ifdef CONFIG_DSA_OPT
+
+static void *test_migrate_precopy_tcp_multifd_start_dsa(QTestState *from,
+                                                        QTestState *to)
+{
+    migrate_set_parameter_str(from, "multifd-dsa-accel", dsa_dev_path);
+    return test_migrate_precopy_tcp_multifd_start_common(from, to, "none");
+}
+
+static void test_multifd_tcp_none_dsa(void)
+{
+    MigrateCommon args = {
+        .listen_uri = "defer",
+        .start_hook = test_migrate_precopy_tcp_multifd_start_dsa,
+    };
+
+    test_precopy_common(&args);
+}
+
+static void test_multifd_tcp_cancel_dsa(void)
+{
+    test_multifd_tcp_cancel_common(true);
+}
+
+#endif
+
 static void calc_dirty_rate(QTestState *who, uint64_t calc_time)
 {
     qtest_qmp_assert_success(who,
@@ -3274,6 +3326,19 @@ static bool kvm_dirty_ring_supported(void)
 #endif
 }
 
+#ifdef CONFIG_DSA_OPT
+static int test_dsa_setup(void)
+{
+    int fd;
+    fd = open(dsa_dev_path, O_RDWR);
+    if (fd < 0) {
+        return -1;
+    }
+    close(fd);
+    return 0;
+}
+#endif
+
 int main(int argc, char **argv)
 {
     bool has_kvm, has_tcg;
@@ -3468,6 +3533,16 @@ int main(int argc, char **argv)
     }
     qtest_add_func("/migration/multifd/tcp/plain/none",
                    test_multifd_tcp_none);
+
+#ifdef CONFIG_DSA_OPT
+    if (g_str_equal(arch, "x86_64") && test_dsa_setup() == 0) {
+        qtest_add_func("/migration/multifd/tcp/plain/none/dsa",
+                       test_multifd_tcp_none_dsa);
+        qtest_add_func("/migration/multifd/tcp/plain/cancel/dsa",
+                       test_multifd_tcp_cancel_dsa);
+    }
+#endif
+
     /*
      * This test is flaky and sometimes fails in CI and otherwise:
      * don't run unless user opts in via environment variable.