From patchwork Sun Nov 17 19:19:56 2024
X-Patchwork-Id: 13877913
From: "Maciej S. Szmigiero"
To: Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Cédric Le Goater, Eric Blake, Markus Armbruster,
    Daniel P. Berrangé, Avihai Horon, Joao Martins, qemu-devel@nongnu.org
Subject: [PATCH v3 01/24] migration: Clarify that {load, save}_cleanup handlers can run without setup
Date: Sun, 17 Nov 2024 20:19:56 +0100

It's possible for the {load,save}_cleanup SaveVMHandlers to get called without the corresponding {load,save}_setup handler having been called first.

One such example is when the {load,save}_setup handler of a preceding device returns an error. In that case the migration core cleanup code calls all corresponding cleanup handlers, even for devices whose setup handler was never called.

Since this behavior can be surprising, document it clearly in the description of these SaveVMHandlers.

Signed-off-by: Maciej S. Szmigiero
---
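As an illustration of the contract documented above (not part of this patch): a load_cleanup implementation has to tolerate load_setup never having run for its device. A minimal sketch, assuming a hypothetical device state with a single buffer allocated at setup time and the usual QEMU/GLib headers already available; the device name and fields are made up for illustration:

typedef struct {
    void *load_buf;    /* allocated by load_setup, released by load_cleanup */
} FooMigState;

static int foo_load_setup(QEMUFile *f, void *opaque, Error **errp)
{
    FooMigState *s = opaque;

    /* May never run if an earlier device's setup handler failed. */
    s->load_buf = g_malloc0(4096);
    return 0;
}

static int foo_load_cleanup(void *opaque)
{
    FooMigState *s = opaque;

    /* Cope with load_setup() not having been called at all. */
    g_clear_pointer(&s->load_buf, g_free);
    return 0;
}

Freeing through g_clear_pointer() keeps the cleanup idempotent regardless of whether load_setup ever executed.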
 include/migration/register.h | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/include/migration/register.h b/include/migration/register.h
index f60e797894e5..0b0292738320 100644
--- a/include/migration/register.h
+++ b/include/migration/register.h
@@ -69,7 +69,9 @@ typedef struct SaveVMHandlers {
 /**
  * @save_cleanup
  *
- * Uninitializes the data structures on the source
+ * Uninitializes the data structures on the source.
+ * Note that this handler can be called even if save_setup
+ * wasn't called earlier.
  *
  * @opaque: data pointer passed to register_savevm_live()
  */
@@ -244,6 +246,8 @@ typedef struct SaveVMHandlers {
  * @load_cleanup
  *
  * Uninitializes the data structures on the destination.
+ * Note that this handler can be called even if load_setup
+ * wasn't called earlier.
  *
  * @opaque: data pointer passed to register_savevm_live()
  *

From patchwork Sun Nov 17 19:19:57 2024
X-Patchwork-Id: 13877917
To: Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Cédric Le Goater, Eric Blake, Markus Armbruster,
    Daniel P. Berrangé, Avihai Horon, Joao Martins, qemu-devel@nongnu.org
Subject: [PATCH v3 02/24] thread-pool: Remove thread_pool_submit() function
Date: Sun, 17 Nov 2024 20:19:57 +0100

From: "Maciej S.
Szmigiero" This function name conflicts with one used by a future generic thread pool function and it was only used by one test anyway. Update the trace event name in thread_pool_submit_aio() accordingly. Signed-off-by: Maciej S. Szmigiero --- include/block/thread-pool.h | 3 +-- tests/unit/test-thread-pool.c | 2 +- util/thread-pool.c | 7 +------ util/trace-events | 2 +- 4 files changed, 4 insertions(+), 10 deletions(-) diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h index 948ff5f30c31..4f6694026123 100644 --- a/include/block/thread-pool.h +++ b/include/block/thread-pool.h @@ -30,13 +30,12 @@ ThreadPool *thread_pool_new(struct AioContext *ctx); void thread_pool_free(ThreadPool *pool); /* - * thread_pool_submit* API: submit I/O requests in the thread's + * thread_pool_submit_{aio,co} API: submit I/O requests in the thread's * current AioContext. */ BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg, BlockCompletionFunc *cb, void *opaque); int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg); -void thread_pool_submit(ThreadPoolFunc *func, void *arg); void thread_pool_update_params(ThreadPool *pool, struct AioContext *ctx); diff --git a/tests/unit/test-thread-pool.c b/tests/unit/test-thread-pool.c index 1483e53473db..7a7055141ddb 100644 --- a/tests/unit/test-thread-pool.c +++ b/tests/unit/test-thread-pool.c @@ -46,7 +46,7 @@ static void done_cb(void *opaque, int ret) static void test_submit(void) { WorkerTestData data = { .n = 0 }; - thread_pool_submit(worker_cb, &data); + thread_pool_submit_aio(worker_cb, &data, NULL, NULL); while (data.n == 0) { aio_poll(ctx, true); } diff --git a/util/thread-pool.c b/util/thread-pool.c index 27eb777e855b..2f751d55b33f 100644 --- a/util/thread-pool.c +++ b/util/thread-pool.c @@ -256,7 +256,7 @@ BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg, QLIST_INSERT_HEAD(&pool->head, req, all); - trace_thread_pool_submit(pool, req, arg); + trace_thread_pool_submit_aio(pool, req, arg); qemu_mutex_lock(&pool->lock); if (pool->idle_threads == 0 && pool->cur_threads < pool->max_threads) { @@ -290,11 +290,6 @@ int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg) return tpc.ret; } -void thread_pool_submit(ThreadPoolFunc *func, void *arg) -{ - thread_pool_submit_aio(func, arg, NULL, NULL); -} - void thread_pool_update_params(ThreadPool *pool, AioContext *ctx) { qemu_mutex_lock(&pool->lock); diff --git a/util/trace-events b/util/trace-events index 49a4962e1886..5be12d7fab89 100644 --- a/util/trace-events +++ b/util/trace-events @@ -14,7 +14,7 @@ aio_co_schedule_bh_cb(void *ctx, void *co) "ctx %p co %p" reentrant_aio(void *ctx, const char *name) "ctx %p name %s" # thread-pool.c -thread_pool_submit(void *pool, void *req, void *opaque) "pool %p req %p opaque %p" +thread_pool_submit_aio(void *pool, void *req, void *opaque) "pool %p req %p opaque %p" thread_pool_complete(void *pool, void *req, void *opaque, int ret) "pool %p req %p opaque %p ret %d" thread_pool_cancel(void *req, void *opaque) "req %p opaque %p" From patchwork Sun Nov 17 19:19:58 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. 
Szmigiero" X-Patchwork-Id: 13877918 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0136DD10F26 for ; Sun, 17 Nov 2024 19:22:44 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkq0-0002fk-P5; Sun, 17 Nov 2024 14:21:36 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkpu-0002eS-5W for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:21:30 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkps-0005rY-6x for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:21:29 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkpo-00000002GTC-0jtP; Sun, 17 Nov 2024 20:21:24 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 03/24] thread-pool: Rename AIO pool functions to *_aio() and data types to *Aio Date: Sun, 17 Nov 2024 20:19:58 +0100 Message-ID: X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" These names conflict with ones used by future generic thread pool equivalents. Generic names should belong to the generic pool type, not specific (AIO) type. Signed-off-by: Maciej S. Szmigiero --- include/block/aio.h | 8 ++--- include/block/thread-pool.h | 8 ++--- util/async.c | 6 ++-- util/thread-pool.c | 58 ++++++++++++++++++------------------- util/trace-events | 4 +-- 5 files changed, 42 insertions(+), 42 deletions(-) diff --git a/include/block/aio.h b/include/block/aio.h index 43883a8a33a8..b2ab3514de23 100644 --- a/include/block/aio.h +++ b/include/block/aio.h @@ -54,7 +54,7 @@ typedef void QEMUBHFunc(void *opaque); typedef bool AioPollFn(void *opaque); typedef void IOHandler(void *opaque); -struct ThreadPool; +struct ThreadPoolAio; struct LinuxAioState; typedef struct LuringState LuringState; @@ -207,7 +207,7 @@ struct AioContext { /* Thread pool for performing work and receiving completion callbacks. * Has its own locking. 
*/ - struct ThreadPool *thread_pool; + struct ThreadPoolAio *thread_pool; #ifdef CONFIG_LINUX_AIO struct LinuxAioState *linux_aio; @@ -500,8 +500,8 @@ void aio_set_event_notifier_poll(AioContext *ctx, */ GSource *aio_get_g_source(AioContext *ctx); -/* Return the ThreadPool bound to this AioContext */ -struct ThreadPool *aio_get_thread_pool(AioContext *ctx); +/* Return the ThreadPoolAio bound to this AioContext */ +struct ThreadPoolAio *aio_get_thread_pool(AioContext *ctx); /* Setup the LinuxAioState bound to this AioContext */ struct LinuxAioState *aio_setup_linux_aio(AioContext *ctx, Error **errp); diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h index 4f6694026123..6f27eb085b45 100644 --- a/include/block/thread-pool.h +++ b/include/block/thread-pool.h @@ -24,10 +24,10 @@ typedef int ThreadPoolFunc(void *opaque); -typedef struct ThreadPool ThreadPool; +typedef struct ThreadPoolAio ThreadPoolAio; -ThreadPool *thread_pool_new(struct AioContext *ctx); -void thread_pool_free(ThreadPool *pool); +ThreadPoolAio *thread_pool_new_aio(struct AioContext *ctx); +void thread_pool_free_aio(ThreadPoolAio *pool); /* * thread_pool_submit_{aio,co} API: submit I/O requests in the thread's @@ -36,7 +36,7 @@ void thread_pool_free(ThreadPool *pool); BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg, BlockCompletionFunc *cb, void *opaque); int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg); +void thread_pool_update_params(ThreadPoolAio *pool, struct AioContext *ctx); -void thread_pool_update_params(ThreadPool *pool, struct AioContext *ctx); #endif diff --git a/util/async.c b/util/async.c index 99db28389f66..f8b7678aefc8 100644 --- a/util/async.c +++ b/util/async.c @@ -369,7 +369,7 @@ aio_ctx_finalize(GSource *source) QEMUBH *bh; unsigned flags; - thread_pool_free(ctx->thread_pool); + thread_pool_free_aio(ctx->thread_pool); #ifdef CONFIG_LINUX_AIO if (ctx->linux_aio) { @@ -435,10 +435,10 @@ GSource *aio_get_g_source(AioContext *ctx) return &ctx->source; } -ThreadPool *aio_get_thread_pool(AioContext *ctx) +ThreadPoolAio *aio_get_thread_pool(AioContext *ctx) { if (!ctx->thread_pool) { - ctx->thread_pool = thread_pool_new(ctx); + ctx->thread_pool = thread_pool_new_aio(ctx); } return ctx->thread_pool; } diff --git a/util/thread-pool.c b/util/thread-pool.c index 2f751d55b33f..908194dc070f 100644 --- a/util/thread-pool.c +++ b/util/thread-pool.c @@ -23,9 +23,9 @@ #include "block/thread-pool.h" #include "qemu/main-loop.h" -static void do_spawn_thread(ThreadPool *pool); +static void do_spawn_thread(ThreadPoolAio *pool); -typedef struct ThreadPoolElement ThreadPoolElement; +typedef struct ThreadPoolElementAio ThreadPoolElementAio; enum ThreadState { THREAD_QUEUED, @@ -33,9 +33,9 @@ enum ThreadState { THREAD_DONE, }; -struct ThreadPoolElement { +struct ThreadPoolElementAio { BlockAIOCB common; - ThreadPool *pool; + ThreadPoolAio *pool; ThreadPoolFunc *func; void *arg; @@ -47,13 +47,13 @@ struct ThreadPoolElement { int ret; /* Access to this list is protected by lock. */ - QTAILQ_ENTRY(ThreadPoolElement) reqs; + QTAILQ_ENTRY(ThreadPoolElementAio) reqs; /* This list is only written by the thread pool's mother thread. */ - QLIST_ENTRY(ThreadPoolElement) all; + QLIST_ENTRY(ThreadPoolElementAio) all; }; -struct ThreadPool { +struct ThreadPoolAio { AioContext *ctx; QEMUBH *completion_bh; QemuMutex lock; @@ -62,10 +62,10 @@ struct ThreadPool { QEMUBH *new_thread_bh; /* The following variables are only accessed from one AioContext. 
*/ - QLIST_HEAD(, ThreadPoolElement) head; + QLIST_HEAD(, ThreadPoolElementAio) head; /* The following variables are protected by lock. */ - QTAILQ_HEAD(, ThreadPoolElement) request_list; + QTAILQ_HEAD(, ThreadPoolElementAio) request_list; int cur_threads; int idle_threads; int new_threads; /* backlog of threads we need to create */ @@ -76,14 +76,14 @@ struct ThreadPool { static void *worker_thread(void *opaque) { - ThreadPool *pool = opaque; + ThreadPoolAio *pool = opaque; qemu_mutex_lock(&pool->lock); pool->pending_threads--; do_spawn_thread(pool); while (pool->cur_threads <= pool->max_threads) { - ThreadPoolElement *req; + ThreadPoolElementAio *req; int ret; if (QTAILQ_EMPTY(&pool->request_list)) { @@ -131,7 +131,7 @@ static void *worker_thread(void *opaque) return NULL; } -static void do_spawn_thread(ThreadPool *pool) +static void do_spawn_thread(ThreadPoolAio *pool) { QemuThread t; @@ -148,14 +148,14 @@ static void do_spawn_thread(ThreadPool *pool) static void spawn_thread_bh_fn(void *opaque) { - ThreadPool *pool = opaque; + ThreadPoolAio *pool = opaque; qemu_mutex_lock(&pool->lock); do_spawn_thread(pool); qemu_mutex_unlock(&pool->lock); } -static void spawn_thread(ThreadPool *pool) +static void spawn_thread(ThreadPoolAio *pool) { pool->cur_threads++; pool->new_threads++; @@ -173,8 +173,8 @@ static void spawn_thread(ThreadPool *pool) static void thread_pool_completion_bh(void *opaque) { - ThreadPool *pool = opaque; - ThreadPoolElement *elem, *next; + ThreadPoolAio *pool = opaque; + ThreadPoolElementAio *elem, *next; defer_call_begin(); /* cb() may use defer_call() to coalesce work */ @@ -184,8 +184,8 @@ restart: continue; } - trace_thread_pool_complete(pool, elem, elem->common.opaque, - elem->ret); + trace_thread_pool_complete_aio(pool, elem, elem->common.opaque, + elem->ret); QLIST_REMOVE(elem, all); if (elem->common.cb) { @@ -217,10 +217,10 @@ restart: static void thread_pool_cancel(BlockAIOCB *acb) { - ThreadPoolElement *elem = (ThreadPoolElement *)acb; - ThreadPool *pool = elem->pool; + ThreadPoolElementAio *elem = (ThreadPoolElementAio *)acb; + ThreadPoolAio *pool = elem->pool; - trace_thread_pool_cancel(elem, elem->common.opaque); + trace_thread_pool_cancel_aio(elem, elem->common.opaque); QEMU_LOCK_GUARD(&pool->lock); if (elem->state == THREAD_QUEUED) { @@ -234,16 +234,16 @@ static void thread_pool_cancel(BlockAIOCB *acb) } static const AIOCBInfo thread_pool_aiocb_info = { - .aiocb_size = sizeof(ThreadPoolElement), + .aiocb_size = sizeof(ThreadPoolElementAio), .cancel_async = thread_pool_cancel, }; BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg, BlockCompletionFunc *cb, void *opaque) { - ThreadPoolElement *req; + ThreadPoolElementAio *req; AioContext *ctx = qemu_get_current_aio_context(); - ThreadPool *pool = aio_get_thread_pool(ctx); + ThreadPoolAio *pool = aio_get_thread_pool(ctx); /* Assert that the thread submitting work is the same running the pool */ assert(pool->ctx == qemu_get_current_aio_context()); @@ -290,7 +290,7 @@ int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg) return tpc.ret; } -void thread_pool_update_params(ThreadPool *pool, AioContext *ctx) +void thread_pool_update_params(ThreadPoolAio *pool, AioContext *ctx) { qemu_mutex_lock(&pool->lock); @@ -317,7 +317,7 @@ void thread_pool_update_params(ThreadPool *pool, AioContext *ctx) qemu_mutex_unlock(&pool->lock); } -static void thread_pool_init_one(ThreadPool *pool, AioContext *ctx) +static void thread_pool_init_one(ThreadPoolAio *pool, AioContext *ctx) { if (!ctx) { ctx = 
qemu_get_aio_context(); @@ -337,14 +337,14 @@ static void thread_pool_init_one(ThreadPool *pool, AioContext *ctx) thread_pool_update_params(pool, ctx); } -ThreadPool *thread_pool_new(AioContext *ctx) +ThreadPoolAio *thread_pool_new_aio(AioContext *ctx) { - ThreadPool *pool = g_new(ThreadPool, 1); + ThreadPoolAio *pool = g_new(ThreadPoolAio, 1); thread_pool_init_one(pool, ctx); return pool; } -void thread_pool_free(ThreadPool *pool) +void thread_pool_free_aio(ThreadPoolAio *pool) { if (!pool) { return; diff --git a/util/trace-events b/util/trace-events index 5be12d7fab89..bd8f25fb5920 100644 --- a/util/trace-events +++ b/util/trace-events @@ -15,8 +15,8 @@ reentrant_aio(void *ctx, const char *name) "ctx %p name %s" # thread-pool.c thread_pool_submit_aio(void *pool, void *req, void *opaque) "pool %p req %p opaque %p" -thread_pool_complete(void *pool, void *req, void *opaque, int ret) "pool %p req %p opaque %p ret %d" -thread_pool_cancel(void *req, void *opaque) "req %p opaque %p" +thread_pool_complete_aio(void *pool, void *req, void *opaque, int ret) "pool %p req %p opaque %p ret %d" +thread_pool_cancel_aio(void *req, void *opaque) "req %p opaque %p" # buffer.c buffer_resize(const char *buf, size_t olen, size_t len) "%s: old %zd, new %zd" From patchwork Sun Nov 17 19:19:59 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13877925 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C8C97D10F26 for ; Sun, 17 Nov 2024 19:23:49 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkq2-0002fu-8l; Sun, 17 Nov 2024 14:21:38 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkpz-0002fR-2U for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:21:36 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkpw-0005s9-Pe for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:21:34 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkpt-00000002GTN-1Kvp; Sun, 17 Nov 2024 20:21:29 +0100 From: "Maciej S. 
Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 04/24] thread-pool: Implement generic (non-AIO) pool support Date: Sun, 17 Nov 2024 20:19:59 +0100 Message-ID: X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Migration code wants to manage device data sending threads in one place. QEMU has an existing thread pool implementation, however it is limited to queuing AIO operations only and essentially has a 1:1 mapping between the current AioContext and the AIO ThreadPool in use. Implement generic (non-AIO) ThreadPool by essentially wrapping Glib's GThreadPool. This brings a few new operations on a pool: * thread_pool_wait() operation waits until all the submitted work requests have finished. * thread_pool_set_max_threads() explicitly sets the maximum thread count in the pool. * thread_pool_adjust_max_threads_to_work() adjusts the maximum thread count in the pool to equal the number of still waiting in queue or unfinished work. Signed-off-by: Maciej S. 
Szmigiero --- include/block/thread-pool.h | 9 +++ util/thread-pool.c | 109 ++++++++++++++++++++++++++++++++++++ 2 files changed, 118 insertions(+) diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h index 6f27eb085b45..3f9f66307b65 100644 --- a/include/block/thread-pool.h +++ b/include/block/thread-pool.h @@ -38,5 +38,14 @@ BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg, int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg); void thread_pool_update_params(ThreadPoolAio *pool, struct AioContext *ctx); +typedef struct ThreadPool ThreadPool; + +ThreadPool *thread_pool_new(void); +void thread_pool_free(ThreadPool *pool); +void thread_pool_submit(ThreadPool *pool, ThreadPoolFunc *func, + void *opaque, GDestroyNotify opaque_destroy); +void thread_pool_wait(ThreadPool *pool); +bool thread_pool_set_max_threads(ThreadPool *pool, int max_threads); +bool thread_pool_adjust_max_threads_to_work(ThreadPool *pool); #endif diff --git a/util/thread-pool.c b/util/thread-pool.c index 908194dc070f..d80c4181c897 100644 --- a/util/thread-pool.c +++ b/util/thread-pool.c @@ -374,3 +374,112 @@ void thread_pool_free_aio(ThreadPoolAio *pool) qemu_mutex_destroy(&pool->lock); g_free(pool); } + +struct ThreadPool { /* type safety */ + GThreadPool *t; + size_t unfinished_el_ctr; + QemuMutex unfinished_el_ctr_mutex; + QemuCond unfinished_el_ctr_zero_cond; +}; + +typedef struct { + ThreadPoolFunc *func; + void *opaque; + GDestroyNotify opaque_destroy; +} ThreadPoolElement; + +static void thread_pool_func(gpointer data, gpointer user_data) +{ + ThreadPool *pool = user_data; + g_autofree ThreadPoolElement *el = data; + + el->func(el->opaque); + + if (el->opaque_destroy) { + el->opaque_destroy(el->opaque); + } + + QEMU_LOCK_GUARD(&pool->unfinished_el_ctr_mutex); + + assert(pool->unfinished_el_ctr > 0); + pool->unfinished_el_ctr--; + + if (pool->unfinished_el_ctr == 0) { + qemu_cond_signal(&pool->unfinished_el_ctr_zero_cond); + } +} + +ThreadPool *thread_pool_new(void) +{ + ThreadPool *pool = g_new(ThreadPool, 1); + + pool->unfinished_el_ctr = 0; + qemu_mutex_init(&pool->unfinished_el_ctr_mutex); + qemu_cond_init(&pool->unfinished_el_ctr_zero_cond); + + pool->t = g_thread_pool_new(thread_pool_func, pool, 0, TRUE, NULL); + /* + * g_thread_pool_new() can only return errors if initial thread(s) + * creation fails but we ask for 0 initial threads above. + */ + assert(pool->t); + + return pool; +} + +void thread_pool_free(ThreadPool *pool) +{ + g_thread_pool_free(pool->t, FALSE, TRUE); + + qemu_cond_destroy(&pool->unfinished_el_ctr_zero_cond); + qemu_mutex_destroy(&pool->unfinished_el_ctr_mutex); + + g_free(pool); +} + +void thread_pool_submit(ThreadPool *pool, ThreadPoolFunc *func, + void *opaque, GDestroyNotify opaque_destroy) +{ + ThreadPoolElement *el = g_new(ThreadPoolElement, 1); + + el->func = func; + el->opaque = opaque; + el->opaque_destroy = opaque_destroy; + + WITH_QEMU_LOCK_GUARD(&pool->unfinished_el_ctr_mutex) { + pool->unfinished_el_ctr++; + } + + /* + * Ignore the return value since this function can only return errors + * if creation of an additional thread fails but even in this case the + * provided work is still getting queued (just for the existing threads). 
+ */
+ g_thread_pool_push(pool->t, el, NULL);
+}
+
+void thread_pool_wait(ThreadPool *pool)
+{
+ QEMU_LOCK_GUARD(&pool->unfinished_el_ctr_mutex);
+
+ if (pool->unfinished_el_ctr > 0) {
+ qemu_cond_wait(&pool->unfinished_el_ctr_zero_cond,
+ &pool->unfinished_el_ctr_mutex);
+ assert(pool->unfinished_el_ctr == 0);
+ }
+}
+
+bool thread_pool_set_max_threads(ThreadPool *pool,
+ int max_threads)
+{
+ assert(max_threads > 0);
+
+ return g_thread_pool_set_max_threads(pool->t, max_threads, NULL);
+}
+
+bool thread_pool_adjust_max_threads_to_work(ThreadPool *pool)
+{
+ QEMU_LOCK_GUARD(&pool->unfinished_el_ctr_mutex);
+
+ return thread_pool_set_max_threads(pool, pool->unfinished_el_ctr);
+}

From patchwork Sun Nov 17 19:20:00 2024
X-Patchwork-Id: 13877932
From: "Maciej S. Szmigiero"
To: Peter Xu, Fabiano Rosas
Cc: Alex Williamson, Cédric Le Goater, Eric Blake, Markus Armbruster,
    Daniel P. Berrangé, Avihai Horon, Joao Martins, qemu-devel@nongnu.org
Subject: [PATCH v3 05/24] migration: Add MIG_CMD_SWITCHOVER_START and its load handler
Date: Sun, 17 Nov 2024 20:20:00 +0100
Message-ID: <624309a8e37f7779336bd5c0573d24f155b2ed41.1731773021.git.maciej.szmigiero@oracle.com>

This QEMU_VM_COMMAND sub-command and its switchover_start SaveVMHandler are used to mark the switchover point in the main migration stream.
It can be used to inform the destination that all pre-switchover main migration stream data has been sent/received so it can start to process post-switchover data that it might have received via other migration channels like the multifd ones. Add also the relevant MigrationState bit stream compatibility property and its hw_compat entry. Signed-off-by: Maciej S. Szmigiero --- hw/core/machine.c | 1 + include/migration/client-options.h | 4 +++ include/migration/register.h | 12 +++++++++ migration/colo.c | 3 +++ migration/migration-hmp-cmds.c | 2 ++ migration/migration.c | 3 +++ migration/migration.h | 2 ++ migration/options.c | 9 +++++++ migration/savevm.c | 39 ++++++++++++++++++++++++++++++ migration/savevm.h | 1 + migration/trace-events | 1 + scripts/analyze-migration.py | 11 +++++++++ 12 files changed, 88 insertions(+) diff --git a/hw/core/machine.c b/hw/core/machine.c index a35c4a8faecb..ed8d39fd769f 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -38,6 +38,7 @@ GlobalProperty hw_compat_9_1[] = { { TYPE_PCI_DEVICE, "x-pcie-ext-tag", "false" }, + { "migration", "send-switchover-start", "off"}, }; const size_t hw_compat_9_1_len = G_N_ELEMENTS(hw_compat_9_1); diff --git a/include/migration/client-options.h b/include/migration/client-options.h index 59f4b55cf4f7..289c9d776221 100644 --- a/include/migration/client-options.h +++ b/include/migration/client-options.h @@ -10,6 +10,10 @@ #ifndef QEMU_MIGRATION_CLIENT_OPTIONS_H #define QEMU_MIGRATION_CLIENT_OPTIONS_H + +/* properties */ +bool migrate_send_switchover_start(void); + /* capabilities */ bool migrate_background_snapshot(void); diff --git a/include/migration/register.h b/include/migration/register.h index 0b0292738320..ff0faf5f68c8 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -279,6 +279,18 @@ typedef struct SaveVMHandlers { * otherwise */ bool (*switchover_ack_needed)(void *opaque); + + /** + * @switchover_start + * + * Notifies that the switchover has started. Called only on + * the destination. + * + * @opaque: data pointer passed to register_savevm_live() + * + * Returns zero to indicate success and negative for error + */ + int (*switchover_start)(void *opaque); } SaveVMHandlers; /** diff --git a/migration/colo.c b/migration/colo.c index 9590f281d0f1..a75c2c41b464 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -452,6 +452,9 @@ static int colo_do_checkpoint_transaction(MigrationState *s, bql_unlock(); goto out; } + + qemu_savevm_maybe_send_switchover_start(s->to_dst_file); + /* Note: device state is saved into buffer */ ret = qemu_save_device_state(fb); diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index 20d1a6e21948..59d0c48a3e0d 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -46,6 +46,8 @@ static void migration_global_dump(Monitor *mon) ms->send_configuration ? "on" : "off"); monitor_printf(mon, "send-section-footer: %s\n", ms->send_section_footer ? "on" : "off"); + monitor_printf(mon, "send-switchover-start: %s\n", + ms->send_switchover_start ? 
"on" : "off"); monitor_printf(mon, "clear-bitmap-shift: %u\n", ms->clear_bitmap_shift); } diff --git a/migration/migration.c b/migration/migration.c index 8c5bd0a75c85..2e9d6d5087d7 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2543,6 +2543,8 @@ static int postcopy_start(MigrationState *ms, Error **errp) } restart_block = true; + qemu_savevm_maybe_send_switchover_start(ms->to_dst_file); + /* * Cause any non-postcopiable, but iterative devices to * send out their final data. @@ -2742,6 +2744,7 @@ static int migration_completion_precopy(MigrationState *s, */ s->block_inactive = !migrate_colo(); migration_rate_set(RATE_LIMIT_DISABLED); + qemu_savevm_maybe_send_switchover_start(s->to_dst_file); ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false, s->block_inactive); out_unlock: diff --git a/migration/migration.h b/migration/migration.h index 0956e9274b2c..2a18349cfec2 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -403,6 +403,8 @@ struct MigrationState { bool send_configuration; /* Whether we send section footer during migration */ bool send_section_footer; + /* Whether we send switchover start notification during migration */ + bool send_switchover_start; /* Needed by postcopy-pause state */ QemuSemaphore postcopy_pause_sem; diff --git a/migration/options.c b/migration/options.c index ad8d6989a807..f916c8ed4e09 100644 --- a/migration/options.c +++ b/migration/options.c @@ -92,6 +92,8 @@ Property migration_properties[] = { send_configuration, true), DEFINE_PROP_BOOL("send-section-footer", MigrationState, send_section_footer, true), + DEFINE_PROP_BOOL("send-switchover-start", MigrationState, + send_switchover_start, true), DEFINE_PROP_BOOL("multifd-flush-after-each-section", MigrationState, multifd_flush_after_each_section, false), DEFINE_PROP_UINT8("x-clear-bitmap-shift", MigrationState, @@ -206,6 +208,13 @@ bool migrate_auto_converge(void) return s->capabilities[MIGRATION_CAPABILITY_AUTO_CONVERGE]; } +bool migrate_send_switchover_start(void) +{ + MigrationState *s = migrate_get_current(); + + return s->send_switchover_start; +} + bool migrate_background_snapshot(void) { MigrationState *s = migrate_get_current(); diff --git a/migration/savevm.c b/migration/savevm.c index f4e4876f7202..a254c38edcca 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -90,6 +90,7 @@ enum qemu_vm_cmd { MIG_CMD_ENABLE_COLO, /* Enable COLO */ MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */ MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */ + MIG_CMD_SWITCHOVER_START, /* Switchover start notification */ MIG_CMD_MAX }; @@ -109,6 +110,7 @@ static struct mig_cmd_args { [MIG_CMD_POSTCOPY_RESUME] = { .len = 0, .name = "POSTCOPY_RESUME" }, [MIG_CMD_PACKAGED] = { .len = 4, .name = "PACKAGED" }, [MIG_CMD_RECV_BITMAP] = { .len = -1, .name = "RECV_BITMAP" }, + [MIG_CMD_SWITCHOVER_START] = { .len = 0, .name = "SWITCHOVER_START" }, [MIG_CMD_MAX] = { .len = -1, .name = "MAX" }, }; @@ -1201,6 +1203,19 @@ void qemu_savevm_send_recv_bitmap(QEMUFile *f, char *block_name) qemu_savevm_command_send(f, MIG_CMD_RECV_BITMAP, len + 1, (uint8_t *)buf); } +static void qemu_savevm_send_switchover_start(QEMUFile *f) +{ + trace_savevm_send_switchover_start(); + qemu_savevm_command_send(f, MIG_CMD_SWITCHOVER_START, 0, NULL); +} + +void qemu_savevm_maybe_send_switchover_start(QEMUFile *f) +{ + if (migrate_send_switchover_start()) { + qemu_savevm_send_switchover_start(f); + } +} + bool qemu_savevm_state_blocked(Error **errp) { SaveStateEntry *se; @@ -1713,6 +1728,7 @@ static 
int qemu_savevm_state(QEMUFile *f, Error **errp) ret = qemu_file_get_error(f); if (ret == 0) { + qemu_savevm_maybe_send_switchover_start(f); qemu_savevm_state_complete_precopy(f, false, false); ret = qemu_file_get_error(f); } @@ -2413,6 +2429,26 @@ static int loadvm_process_enable_colo(MigrationIncomingState *mis) return ret; } +static int loadvm_postcopy_handle_switchover_start(void) +{ + SaveStateEntry *se; + + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { + int ret; + + if (!se->ops || !se->ops->switchover_start) { + continue; + } + + ret = se->ops->switchover_start(se->opaque); + if (ret < 0) { + return ret; + } + } + + return 0; +} + /* * Process an incoming 'QEMU_VM_COMMAND' * 0 just a normal return @@ -2511,6 +2547,9 @@ static int loadvm_process_command(QEMUFile *f) case MIG_CMD_ENABLE_COLO: return loadvm_process_enable_colo(mis); + + case MIG_CMD_SWITCHOVER_START: + return loadvm_postcopy_handle_switchover_start(); } return 0; diff --git a/migration/savevm.h b/migration/savevm.h index 9ec96a995c93..4d402723bc3c 100644 --- a/migration/savevm.h +++ b/migration/savevm.h @@ -53,6 +53,7 @@ void qemu_savevm_send_postcopy_listen(QEMUFile *f); void qemu_savevm_send_postcopy_run(QEMUFile *f); void qemu_savevm_send_postcopy_resume(QEMUFile *f); void qemu_savevm_send_recv_bitmap(QEMUFile *f, char *block_name); +void qemu_savevm_maybe_send_switchover_start(QEMUFile *f); void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name, uint16_t len, diff --git a/migration/trace-events b/migration/trace-events index bb0e0cc6dcfe..551f5af0740f 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -39,6 +39,7 @@ savevm_send_postcopy_run(void) "" savevm_send_postcopy_resume(void) "" savevm_send_colo_enable(void) "" savevm_send_recv_bitmap(char *name) "%s" +savevm_send_switchover_start(void) "" savevm_state_setup(void) "" savevm_state_resume_prepare(void) "" savevm_state_header(void) "" diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py index 8a254a5b6a2e..a4d4042584c0 100755 --- a/scripts/analyze-migration.py +++ b/scripts/analyze-migration.py @@ -564,7 +564,9 @@ class MigrationDump(object): QEMU_VM_SUBSECTION = 0x05 QEMU_VM_VMDESCRIPTION = 0x06 QEMU_VM_CONFIGURATION = 0x07 + QEMU_VM_COMMAND = 0x08 QEMU_VM_SECTION_FOOTER= 0x7e + QEMU_MIG_CMD_SWITCHOVER_START = 0x0b def __init__(self, filename): self.section_classes = { @@ -626,6 +628,15 @@ def read(self, desc_only = False, dump_memory = False, write_memory = False): elif section_type == self.QEMU_VM_SECTION_PART or section_type == self.QEMU_VM_SECTION_END: section_id = file.read32() self.sections[section_id].read() + elif section_type == self.QEMU_VM_COMMAND: + command_type = file.read16() + command_data_len = file.read16() + if command_type != self.QEMU_MIG_CMD_SWITCHOVER_START: + raise Exception("Unknown QEMU_VM_COMMAND: %x" % + (command_type)) + if command_data_len != 0: + raise Exception("Invalid SWITCHOVER_START length: %x" % + (command_data_len)) elif section_type == self.QEMU_VM_SECTION_FOOTER: read_section_id = file.read32() if read_section_id != section_id: From patchwork Sun Nov 17 19:20:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. 
Szmigiero" X-Patchwork-Id: 13877923 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1F15DD10F28 for ; Sun, 17 Nov 2024 19:23:30 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkqA-0002gu-7o; Sun, 17 Nov 2024 14:21:46 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkq8-0002gm-Tk for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:21:45 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkq7-0005sq-EG for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:21:44 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkq3-00000002GTj-2VyB; Sun, 17 Nov 2024 20:21:39 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 06/24] migration: Add qemu_loadvm_load_state_buffer() and its handler Date: Sun, 17 Nov 2024 20:20:01 +0100 Message-ID: X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" qemu_loadvm_load_state_buffer() and its load_state_buffer SaveVMHandler allow providing device state buffer to explicitly specified device via its idstr and instance id. Reviewed-by: Fabiano Rosas Signed-off-by: Maciej S. Szmigiero --- include/migration/register.h | 17 +++++++++++++++++ migration/savevm.c | 23 +++++++++++++++++++++++ migration/savevm.h | 3 +++ 3 files changed, 43 insertions(+) diff --git a/include/migration/register.h b/include/migration/register.h index ff0faf5f68c8..39991f3cc5d0 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -229,6 +229,23 @@ typedef struct SaveVMHandlers { */ int (*load_state)(QEMUFile *f, void *opaque, int version_id); + /* This runs outside the BQL. */ + + /** + * @load_state_buffer + * + * Load device state buffer provided to qemu_loadvm_load_state_buffer(). + * + * @opaque: data pointer passed to register_savevm_live() + * @buf: the data buffer to load + * @len: the data length in buffer + * @errp: pointer to Error*, to store an error if it happens. 
+ * + * Returns zero to indicate success and negative for error + */ + int (*load_state_buffer)(void *opaque, char *buf, size_t len, + Error **errp); + /** * @load_setup * diff --git a/migration/savevm.c b/migration/savevm.c index a254c38edcca..1f58a2fa54ae 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -3085,6 +3085,29 @@ int qemu_loadvm_approve_switchover(void) return migrate_send_rp_switchover_ack(mis); } +int qemu_loadvm_load_state_buffer(const char *idstr, uint32_t instance_id, + char *buf, size_t len, Error **errp) +{ + SaveStateEntry *se; + + se = find_se(idstr, instance_id); + if (!se) { + error_setg(errp, + "Unknown idstr %s or instance id %u for load state buffer", + idstr, instance_id); + return -1; + } + + if (!se->ops || !se->ops->load_state_buffer) { + error_setg(errp, + "idstr %s / instance %u has no load state buffer operation", + idstr, instance_id); + return -1; + } + + return se->ops->load_state_buffer(se->opaque, buf, len, errp); +} + bool save_snapshot(const char *name, bool overwrite, const char *vmstate, bool has_devices, strList *devices, Error **errp) { diff --git a/migration/savevm.h b/migration/savevm.h index 4d402723bc3c..b5a4f8c8b440 100644 --- a/migration/savevm.h +++ b/migration/savevm.h @@ -71,4 +71,7 @@ int qemu_loadvm_approve_switchover(void); int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f, bool in_postcopy, bool inactivate_disks); +int qemu_loadvm_load_state_buffer(const char *idstr, uint32_t instance_id, + char *buf, size_t len, Error **errp); + #endif From patchwork Sun Nov 17 19:20:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13877915 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0B242D10F26 for ; Sun, 17 Nov 2024 19:22:31 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkqF-0002iJ-Qi; Sun, 17 Nov 2024 14:21:52 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqE-0002hs-7Z for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:21:50 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqC-0005tG-Qx for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:21:50 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkq8-00000002GTt-37PJ; Sun, 17 Nov 2024 20:21:44 +0100 From: "Maciej S. 
Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 07/24] migration: Document the BQL behavior of load SaveVMHandlers Date: Sun, 17 Nov 2024 20:20:02 +0100 Message-ID: X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Some of these SaveVMHandlers were missing the BQL behavior annotation, making people wonder what it exactly is. Signed-off-by: Maciej S. Szmigiero --- include/migration/register.h | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/include/migration/register.h b/include/migration/register.h index 39991f3cc5d0..761e4e4d8bcb 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -212,6 +212,8 @@ typedef struct SaveVMHandlers { void (*state_pending_exact)(void *opaque, uint64_t *must_precopy, uint64_t *can_postcopy); + /* This runs inside the BQL. */ + /** * @load_state * @@ -246,6 +248,8 @@ typedef struct SaveVMHandlers { int (*load_state_buffer)(void *opaque, char *buf, size_t len, Error **errp); + /* The following handlers run inside the BQL. */ + /** * @load_setup * @@ -272,6 +276,9 @@ typedef struct SaveVMHandlers { */ int (*load_cleanup)(void *opaque); + + /* This runs outside the BQL. */ + /** * @resume_prepare * @@ -284,6 +291,8 @@ typedef struct SaveVMHandlers { */ int (*resume_prepare)(MigrationState *s, void *opaque); + /* The following handlers run inside the BQL. */ + /** * @switchover_ack_needed * From patchwork Sun Nov 17 19:20:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. 
Szmigiero" X-Patchwork-Id: 13877914 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DC223D10F28 for ; Sun, 17 Nov 2024 19:22:14 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkqJ-0002ij-UD; Sun, 17 Nov 2024 14:21:56 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqJ-0002iZ-5S for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:21:55 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqH-0005tl-IQ for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:21:54 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkqD-00000002GU4-40YE; Sun, 17 Nov 2024 20:21:49 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 08/24] migration: Add thread pool of optional load threads Date: Sun, 17 Nov 2024 20:20:03 +0100 Message-ID: <877b7108c9cb9064615606d4c731cb12c549b7f9.1731773021.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Some drivers might want to make use of auxiliary helper threads during VM state loading, for example to make sure that their blocking (sync) I/O operations don't block the rest of the migration process. Add a migration core managed thread pool to facilitate this use case. The migration core will wait for these threads to finish before (re)starting the VM at destination. Signed-off-by: Maciej S. 
Szmigiero --- include/migration/misc.h | 3 ++ include/qemu/typedefs.h | 1 + migration/savevm.c | 77 ++++++++++++++++++++++++++++++++++++++++ 3 files changed, 81 insertions(+) diff --git a/include/migration/misc.h b/include/migration/misc.h index 804eb23c0607..c92ca018ab3b 100644 --- a/include/migration/misc.h +++ b/include/migration/misc.h @@ -45,9 +45,12 @@ bool migrate_ram_is_ignored(RAMBlock *block); /* migration/block.c */ AnnounceParameters *migrate_announce_params(void); + /* migration/savevm.c */ void dump_vmstate_json_to_file(FILE *out_fp); +void qemu_loadvm_start_load_thread(MigrationLoadThread function, + void *opaque); /* migration/migration.c */ void migration_object_init(void); diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h index 3d84efcac47a..8c8ea5c2840d 100644 --- a/include/qemu/typedefs.h +++ b/include/qemu/typedefs.h @@ -131,5 +131,6 @@ typedef struct IRQState *qemu_irq; * Function types */ typedef void (*qemu_irq_handler)(void *opaque, int n, int level); +typedef int (*MigrationLoadThread)(bool *abort_flag, void *opaque); #endif /* QEMU_TYPEDEFS_H */ diff --git a/migration/savevm.c b/migration/savevm.c index 1f58a2fa54ae..6ea9054c4083 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -54,6 +54,7 @@ #include "qemu/job.h" #include "qemu/main-loop.h" #include "block/snapshot.h" +#include "block/thread-pool.h" #include "qemu/cutils.h" #include "io/channel-buffer.h" #include "io/channel-file.h" @@ -71,6 +72,10 @@ const unsigned int postcopy_ram_discard_version; +static ThreadPool *load_threads; +static int load_threads_ret; +static bool load_threads_abort; + /* Subcommands for QEMU_VM_COMMAND */ enum qemu_vm_cmd { MIG_CMD_INVALID = 0, /* Must be 0 */ @@ -2788,6 +2793,12 @@ static int qemu_loadvm_state_setup(QEMUFile *f, Error **errp) int ret; trace_loadvm_state_setup(); + + assert(!load_threads); + load_threads = thread_pool_new(); + load_threads_ret = 0; + load_threads_abort = false; + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { if (!se->ops || !se->ops->load_setup) { continue; @@ -2806,19 +2817,72 @@ static int qemu_loadvm_state_setup(QEMUFile *f, Error **errp) return ret; } } + + return 0; +} + +struct LoadThreadData { + MigrationLoadThread function; + void *opaque; +}; + +static int qemu_loadvm_load_thread(void *thread_opaque) +{ + struct LoadThreadData *data = thread_opaque; + int ret; + + ret = data->function(&load_threads_abort, data->opaque); + if (ret && !qatomic_read(&load_threads_ret)) { + /* + * Racy with the above read but that's okay - which thread error + * return we report is purely arbitrary anyway. + */ + qatomic_set(&load_threads_ret, ret); + } + return 0; } +void qemu_loadvm_start_load_thread(MigrationLoadThread function, + void *opaque) +{ + struct LoadThreadData *data; + + /* We only set it from this thread so it's okay to read it directly */ + assert(!load_threads_abort); + + data = g_new(struct LoadThreadData, 1); + data->function = function; + data->opaque = opaque; + + thread_pool_submit(load_threads, qemu_loadvm_load_thread, + data, g_free); + thread_pool_adjust_max_threads_to_work(load_threads); +} + void qemu_loadvm_state_cleanup(void) { SaveStateEntry *se; trace_loadvm_state_cleanup(); + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { if (se->ops && se->ops->load_cleanup) { se->ops->load_cleanup(se->opaque); } } + + /* + * We might be called even without earlier qemu_loadvm_state_setup() + * call if qemu_loadvm_state() fails very early. 
+ */ + if (load_threads) { + qatomic_set(&load_threads_abort, true); + bql_unlock(); /* Load threads might be waiting for BQL */ + thread_pool_wait(load_threads); + bql_lock(); + g_clear_pointer(&load_threads, thread_pool_free); + } } /* Return true if we should continue the migration, or false. */ @@ -3007,6 +3071,19 @@ int qemu_loadvm_state(QEMUFile *f) return ret; } + if (ret == 0) { + bql_unlock(); /* Let load threads do work requiring BQL */ + thread_pool_wait(load_threads); + bql_lock(); + + ret = load_threads_ret; + } + /* + * Set this flag unconditionally so we'll catch further attempts to + * start additional threads via an appropriate assert() + */ + qatomic_set(&load_threads_abort, true); + if (ret == 0) { ret = qemu_file_get_error(f); } From patchwork Sun Nov 17 19:20:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13877916 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 847D3D10F28 for ; Sun, 17 Nov 2024 19:22:38 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkqP-0002jU-FP; Sun, 17 Nov 2024 14:22:01 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqN-0002j7-OB for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:21:59 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqM-0005uD-8j for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:21:59 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkqJ-00000002GUE-0OxZ; Sun, 17 Nov 2024 20:21:55 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 09/24] migration/multifd: Split packet into header and RAM data Date: Sun, 17 Nov 2024 20:20:04 +0100 Message-ID: <8cb65952e475c0236d66047270fd3ed907f7404e.1731773021.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. 
Szmigiero" Read packet header first so in the future we will be able to differentiate between a RAM multifd packet and a device state multifd packet. Since these two are of different size we can't read the packet body until we know which packet type it is. Signed-off-by: Maciej S. Szmigiero --- migration/multifd.c | 49 +++++++++++++++++++++++++++++++++++---------- migration/multifd.h | 5 +++++ 2 files changed, 43 insertions(+), 11 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index 498e71fd1024..999b88b7ebcb 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -209,10 +209,10 @@ void multifd_send_fill_packet(MultiFDSendParams *p) memset(packet, 0, p->packet_len); - packet->magic = cpu_to_be32(MULTIFD_MAGIC); - packet->version = cpu_to_be32(MULTIFD_VERSION); + packet->hdr.magic = cpu_to_be32(MULTIFD_MAGIC); + packet->hdr.version = cpu_to_be32(MULTIFD_VERSION); - packet->flags = cpu_to_be32(p->flags); + packet->hdr.flags = cpu_to_be32(p->flags); packet->next_packet_size = cpu_to_be32(p->next_packet_size); packet_num = qatomic_fetch_inc(&multifd_send_state->packet_num); @@ -228,12 +228,12 @@ void multifd_send_fill_packet(MultiFDSendParams *p) p->flags, p->next_packet_size); } -static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) +static int multifd_recv_unfill_packet_header(MultiFDRecvParams *p, + const MultiFDPacketHdr_t *hdr, + Error **errp) { - const MultiFDPacket_t *packet = p->packet; - uint32_t magic = be32_to_cpu(packet->magic); - uint32_t version = be32_to_cpu(packet->version); - int ret = 0; + uint32_t magic = be32_to_cpu(hdr->magic); + uint32_t version = be32_to_cpu(hdr->version); if (magic != MULTIFD_MAGIC) { error_setg(errp, "multifd: received packet magic %x, expected %x", @@ -247,7 +247,16 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) return -1; } - p->flags = be32_to_cpu(packet->flags); + p->flags = be32_to_cpu(hdr->flags); + + return 0; +} + +static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) +{ + const MultiFDPacket_t *packet = p->packet; + int ret = 0; + p->next_packet_size = be32_to_cpu(packet->next_packet_size); p->packet_num = be64_to_cpu(packet->packet_num); p->packets_recved++; @@ -1126,8 +1135,12 @@ static void *multifd_recv_thread(void *opaque) rcu_register_thread(); while (true) { + MultiFDPacketHdr_t hdr; uint32_t flags = 0; bool has_data = false; + uint8_t *pkt_buf; + size_t pkt_len; + p->normal_num = 0; if (use_packets) { @@ -1135,8 +1148,22 @@ static void *multifd_recv_thread(void *opaque) break; } - ret = qio_channel_read_all_eof(p->c, (void *)p->packet, - p->packet_len, &local_err); + ret = qio_channel_read_all_eof(p->c, (void *)&hdr, + sizeof(hdr), &local_err); + if (ret == 0 || ret == -1) { /* 0: EOF -1: Error */ + break; + } + + ret = multifd_recv_unfill_packet_header(p, &hdr, &local_err); + if (ret) { + break; + } + + pkt_buf = (uint8_t *)p->packet + sizeof(hdr); + pkt_len = p->packet_len - sizeof(hdr); + + ret = qio_channel_read_all_eof(p->c, (char *)pkt_buf, pkt_len, + &local_err); if (ret == 0 || ret == -1) { /* 0: EOF -1: Error */ break; } diff --git a/migration/multifd.h b/migration/multifd.h index 50d58c0c9cec..106a48496dc6 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -53,6 +53,11 @@ typedef struct { uint32_t magic; uint32_t version; uint32_t flags; +} __attribute__((packed)) MultiFDPacketHdr_t; + +typedef struct { + MultiFDPacketHdr_t hdr; + /* maximum number of allocated pages */ uint32_t pages_alloc; /* non zero pages */ From 
patchwork Sun Nov 17 19:20:05 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13877919 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 34F9DD10F28 for ; Sun, 17 Nov 2024 19:22:54 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkqV-0002jm-MA; Sun, 17 Nov 2024 14:22:07 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqT-0002je-US for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:05 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqR-0005uh-Ut for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:05 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkqO-00000002GUP-0z9c; Sun, 17 Nov 2024 20:22:00 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 10/24] migration/multifd: Device state transfer support - receive side Date: Sun, 17 Nov 2024 20:20:05 +0100 Message-ID: <8679a04fda669b0e8f0e3b8c598aa4a58a67de40.1731773021.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Add a basic support for receiving device state via multifd channels - channels that are shared with RAM transfers. Depending whether MULTIFD_FLAG_DEVICE_STATE flag is present or not in the packet header either device state (MultiFDPacketDeviceState_t) or RAM data (existing MultiFDPacket_t) is read. The received device state data is provided to qemu_loadvm_load_state_buffer() function for processing in the device's load_state_buffer handler. Signed-off-by: Maciej S. 
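
On the device side, the buffer handed to qemu_loadvm_load_state_buffer() ends up in the device's load_state_buffer handler, which is added earlier in this series and not quoted here, so the exact prototype below is an assumption made for illustration. The point that is visible in the diff that follows is the ownership rule: the chunk buffer is g_autofree in the multifd receive path, so the handler must consume or copy it before returning.

/* Hypothetical consumer; only the ownership rule is taken from the patch. */
static int my_dev_load_state_buffer(void *opaque, char *buf, size_t len,
                                    Error **errp)
{
    MyDevState *dev = opaque;                                  /* hypothetical */

    /* Take a private copy: the caller frees buf as soon as we return. */
    if (!my_dev_queue_chunk(dev, g_memdup2(buf, len), len)) {  /* hypothetical */
        error_setg(errp, "my-dev: failed to queue %zu byte state chunk", len);
        return -1;
    }

    return 0;
}
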
Szmigiero --- migration/multifd.c | 87 +++++++++++++++++++++++++++++++++++++++++---- migration/multifd.h | 26 +++++++++++++- 2 files changed, 105 insertions(+), 8 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index 999b88b7ebcb..9578a985449b 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -21,6 +21,7 @@ #include "file.h" #include "migration.h" #include "migration-stats.h" +#include "savevm.h" #include "socket.h" #include "tls.h" #include "qemu-file.h" @@ -252,14 +253,24 @@ static int multifd_recv_unfill_packet_header(MultiFDRecvParams *p, return 0; } -static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) +static int multifd_recv_unfill_packet_device_state(MultiFDRecvParams *p, + Error **errp) +{ + MultiFDPacketDeviceState_t *packet = p->packet_dev_state; + + packet->instance_id = be32_to_cpu(packet->instance_id); + p->next_packet_size = be32_to_cpu(packet->next_packet_size); + + return 0; +} + +static int multifd_recv_unfill_packet_ram(MultiFDRecvParams *p, Error **errp) { const MultiFDPacket_t *packet = p->packet; int ret = 0; p->next_packet_size = be32_to_cpu(packet->next_packet_size); p->packet_num = be64_to_cpu(packet->packet_num); - p->packets_recved++; if (!(p->flags & MULTIFD_FLAG_SYNC)) { ret = multifd_ram_unfill_packet(p, errp); @@ -271,6 +282,17 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) return ret; } +static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) +{ + p->packets_recved++; + + if (p->flags & MULTIFD_FLAG_DEVICE_STATE) { + return multifd_recv_unfill_packet_device_state(p, errp); + } + + return multifd_recv_unfill_packet_ram(p, errp); +} + static bool multifd_send_should_exit(void) { return qatomic_read(&multifd_send_state->exiting); @@ -1023,6 +1045,7 @@ static void multifd_recv_cleanup_channel(MultiFDRecvParams *p) p->packet_len = 0; g_free(p->packet); p->packet = NULL; + g_clear_pointer(&p->packet_dev_state, g_free); g_free(p->normal); p->normal = NULL; g_free(p->zero); @@ -1124,6 +1147,28 @@ void multifd_recv_sync_main(void) trace_multifd_recv_sync_main(multifd_recv_state->packet_num); } +static int multifd_device_state_recv(MultiFDRecvParams *p, Error **errp) +{ + g_autofree char *idstr = NULL; + g_autofree char *dev_state_buf = NULL; + int ret; + + dev_state_buf = g_malloc(p->next_packet_size); + + ret = qio_channel_read_all(p->c, dev_state_buf, p->next_packet_size, errp); + if (ret != 0) { + return ret; + } + + idstr = g_strndup(p->packet_dev_state->idstr, + sizeof(p->packet_dev_state->idstr)); + + return qemu_loadvm_load_state_buffer(idstr, + p->packet_dev_state->instance_id, + dev_state_buf, p->next_packet_size, + errp); +} + static void *multifd_recv_thread(void *opaque) { MultiFDRecvParams *p = opaque; @@ -1137,6 +1182,7 @@ static void *multifd_recv_thread(void *opaque) while (true) { MultiFDPacketHdr_t hdr; uint32_t flags = 0; + bool is_device_state = false; bool has_data = false; uint8_t *pkt_buf; size_t pkt_len; @@ -1159,8 +1205,14 @@ static void *multifd_recv_thread(void *opaque) break; } - pkt_buf = (uint8_t *)p->packet + sizeof(hdr); - pkt_len = p->packet_len - sizeof(hdr); + is_device_state = p->flags & MULTIFD_FLAG_DEVICE_STATE; + if (is_device_state) { + pkt_buf = (uint8_t *)p->packet_dev_state + sizeof(hdr); + pkt_len = sizeof(*p->packet_dev_state) - sizeof(hdr); + } else { + pkt_buf = (uint8_t *)p->packet + sizeof(hdr); + pkt_len = p->packet_len - sizeof(hdr); + } ret = qio_channel_read_all_eof(p->c, (char *)pkt_buf, pkt_len, &local_err); @@ 
-1178,9 +1230,14 @@ static void *multifd_recv_thread(void *opaque) flags = p->flags; /* recv methods don't know how to handle the SYNC flag */ p->flags &= ~MULTIFD_FLAG_SYNC; - if (!(flags & MULTIFD_FLAG_SYNC)) { - has_data = p->normal_num || p->zero_num; + + if (is_device_state) { + has_data = p->next_packet_size > 0; + } else { + has_data = !(flags & MULTIFD_FLAG_SYNC) && + (p->normal_num || p->zero_num); } + qemu_mutex_unlock(&p->mutex); } else { /* @@ -1209,14 +1266,29 @@ static void *multifd_recv_thread(void *opaque) } if (has_data) { - ret = multifd_recv_state->ops->recv(p, &local_err); + if (is_device_state) { + assert(use_packets); + ret = multifd_device_state_recv(p, &local_err); + } else { + ret = multifd_recv_state->ops->recv(p, &local_err); + } if (ret != 0) { break; } + } else if (is_device_state) { + error_setg(&local_err, + "multifd: received empty device state packet"); + break; } if (use_packets) { if (flags & MULTIFD_FLAG_SYNC) { + if (is_device_state) { + error_setg(&local_err, + "multifd: received SYNC device state packet"); + break; + } + qemu_sem_post(&multifd_recv_state->sem_sync); qemu_sem_wait(&p->sem_sync); } @@ -1285,6 +1357,7 @@ int multifd_recv_setup(Error **errp) p->packet_len = sizeof(MultiFDPacket_t) + sizeof(uint64_t) * page_count; p->packet = g_malloc0(p->packet_len); + p->packet_dev_state = g_malloc0(sizeof(*p->packet_dev_state)); } p->name = g_strdup_printf(MIGRATION_THREAD_DST_MULTIFD, i); p->normal = g_new0(ram_addr_t, page_count); diff --git a/migration/multifd.h b/migration/multifd.h index 106a48496dc6..026b653057e2 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -46,6 +46,12 @@ MultiFDRecvData *multifd_get_recv_data(void); #define MULTIFD_FLAG_UADK (8 << 1) #define MULTIFD_FLAG_QATZIP (16 << 1) +/* + * If set it means that this packet contains device state + * (MultiFDPacketDeviceState_t), not RAM data (MultiFDPacket_t). + */ +#define MULTIFD_FLAG_DEVICE_STATE (1 << 6) + /* This value needs to be a multiple of qemu_target_page_size() */ #define MULTIFD_PACKET_SIZE (512 * 1024) @@ -78,6 +84,16 @@ typedef struct { uint64_t offset[]; } __attribute__((packed)) MultiFDPacket_t; +typedef struct { + MultiFDPacketHdr_t hdr; + + char idstr[256] QEMU_NONSTRING; + uint32_t instance_id; + + /* size of the next packet that contains the actual data */ + uint32_t next_packet_size; +} __attribute__((packed)) MultiFDPacketDeviceState_t; + typedef struct { /* number of used pages */ uint32_t num; @@ -95,6 +111,13 @@ struct MultiFDRecvData { off_t file_offset; }; +typedef struct { + char *idstr; + uint32_t instance_id; + char *buf; + size_t buf_len; +} MultiFDDeviceState_t; + typedef enum { MULTIFD_PAYLOAD_NONE, MULTIFD_PAYLOAD_RAM, @@ -210,8 +233,9 @@ typedef struct { /* thread local variables. No locking required */ - /* pointer to the packet */ + /* pointers to the possible packet types */ MultiFDPacket_t *packet; + MultiFDPacketDeviceState_t *packet_dev_state; /* size of the next packet that contains pages */ uint32_t next_packet_size; /* packets received through this channel */ From patchwork Sun Nov 17 19:20:06 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. 
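
For reference, one device state submission on the wire is the fixed-size MultiFDPacketDeviceState_t defined above followed by next_packet_size opaque bytes. The packed size can be pinned down at compile time; this check is not in the patch, it merely spells the layout out:

/*
 * hdr.magic | hdr.version | hdr.flags   (MULTIFD_FLAG_DEVICE_STATE set)
 * idstr[256]   (QEMU_NONSTRING: not necessarily NUL-terminated, hence the
 *               g_strndup() bound on the receive side)
 * instance_id | next_packet_size        (all integers big-endian)
 * ... followed by next_packet_size bytes of raw device state ...
 */
QEMU_BUILD_BUG_ON(sizeof(MultiFDPacketDeviceState_t) !=
                  sizeof(MultiFDPacketHdr_t) +
                  sizeof_field(MultiFDPacketDeviceState_t, idstr) +
                  2 * sizeof(uint32_t));
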
Szmigiero" X-Patchwork-Id: 13877924 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 58955D10F26 for ; Sun, 17 Nov 2024 19:23:42 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkqb-0002of-0k; Sun, 17 Nov 2024 14:22:13 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqZ-0002n6-3t for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:11 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqX-0005v6-C3 for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:10 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkqT-00000002GUa-1ZLK; Sun, 17 Nov 2024 20:22:05 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 11/24] migration/multifd: Make multifd_send() thread safe Date: Sun, 17 Nov 2024 20:20:06 +0100 Message-ID: <20fadbcc46cc7cc698ce24e83a3e0fc0c35cd2d8.1731773021.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" multifd_send() function is currently not thread safe, make it thread safe by holding a lock during its execution. This way it will be possible to safely call it concurrently from multiple threads. Signed-off-by: Maciej S. Szmigiero --- migration/multifd.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/migration/multifd.c b/migration/multifd.c index 9578a985449b..4575495c8816 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -50,6 +50,10 @@ typedef struct { struct { MultiFDSendParams *params; + + /* multifd_send() body is not thread safe, needs serialization */ + QemuMutex multifd_send_mutex; + /* * Global number of generated multifd packets. 
* @@ -331,6 +335,7 @@ static void multifd_send_kick_main(MultiFDSendParams *p) */ bool multifd_send(MultiFDSendData **send_data) { + QEMU_LOCK_GUARD(&multifd_send_state->multifd_send_mutex); int i; static int next_channel; MultiFDSendParams *p = NULL; /* make happy gcc */ @@ -508,6 +513,7 @@ static void multifd_send_cleanup_state(void) socket_cleanup_outgoing_migration(); qemu_sem_destroy(&multifd_send_state->channels_created); qemu_sem_destroy(&multifd_send_state->channels_ready); + qemu_mutex_destroy(&multifd_send_state->multifd_send_mutex); g_free(multifd_send_state->params); multifd_send_state->params = NULL; g_free(multifd_send_state); @@ -853,6 +859,7 @@ bool multifd_send_setup(void) thread_count = migrate_multifd_channels(); multifd_send_state = g_malloc0(sizeof(*multifd_send_state)); multifd_send_state->params = g_new0(MultiFDSendParams, thread_count); + qemu_mutex_init(&multifd_send_state->multifd_send_mutex); qemu_sem_init(&multifd_send_state->channels_created, 0); qemu_sem_init(&multifd_send_state->channels_ready, 0); qatomic_set(&multifd_send_state->exiting, 0); From patchwork Sun Nov 17 19:20:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13877922 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 43CFAD10F26 for ; Sun, 17 Nov 2024 19:23:29 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkqf-0002sn-5h; Sun, 17 Nov 2024 14:22:17 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqd-0002s3-RC for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:15 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqc-0005vQ-BS for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:15 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkqY-00000002GUm-2BI8; Sun, 17 Nov 2024 20:22:10 +0100 From: "Maciej S. 
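
The serialization above relies on QEMU_LOCK_GUARD() being scope-based: the mutex is taken where the guard is declared and released automatically on every exit path from the enclosing block, so a single line at the top of multifd_send() covers the whole function, including its early returns. A stand-alone illustration of the pattern (the names here are examples, not code from the series; requires "qemu/thread.h" and "qemu/lockable.h"):

static QemuMutex example_mutex;     /* qemu_mutex_init()'d during setup */
static unsigned example_counter;

static bool example_submit(void)
{
    QEMU_LOCK_GUARD(&example_mutex);

    example_counter++;
    if (example_counter % 2) {
        return false;               /* early return still drops the lock */
    }

    return true;                    /* and so does the normal return */
}
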
Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 12/24] migration/multifd: Add an explicit MultiFDSendData destructor Date: Sun, 17 Nov 2024 20:20:07 +0100 Message-ID: <55c9769a2961d621f5843e04ce995e6c226b1432.1731773021.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" This way if there are fields there that needs explicit disposal (like, for example, some attached buffers) they will be handled appropriately. Add a related assert to multifd_set_payload_type() in order to make sure that this function is only used to fill a previously empty MultiFDSendData with some payload, not the other way around. Reviewed-by: Fabiano Rosas Signed-off-by: Maciej S. Szmigiero --- migration/multifd-nocomp.c | 3 +-- migration/multifd.c | 31 ++++++++++++++++++++++++++++--- migration/multifd.h | 5 +++++ 3 files changed, 34 insertions(+), 5 deletions(-) diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c index 55191152f9cb..fa0fd0289eca 100644 --- a/migration/multifd-nocomp.c +++ b/migration/multifd-nocomp.c @@ -41,8 +41,7 @@ void multifd_ram_save_setup(void) void multifd_ram_save_cleanup(void) { - g_free(multifd_ram_send); - multifd_ram_send = NULL; + g_clear_pointer(&multifd_ram_send, multifd_send_data_free); } static void multifd_set_file_bitmap(MultiFDSendParams *p) diff --git a/migration/multifd.c b/migration/multifd.c index 4575495c8816..730acf55cfad 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -123,6 +123,32 @@ MultiFDSendData *multifd_send_data_alloc(void) return g_malloc0(size_minus_payload + max_payload_size); } +void multifd_send_data_clear(MultiFDSendData *data) +{ + if (multifd_payload_empty(data)) { + return; + } + + switch (data->type) { + default: + /* Nothing to do */ + break; + } + + data->type = MULTIFD_PAYLOAD_NONE; +} + +void multifd_send_data_free(MultiFDSendData *data) +{ + if (!data) { + return; + } + + multifd_send_data_clear(data); + + g_free(data); +} + static bool multifd_use_packets(void) { return !migrate_mapped_ram(); @@ -496,8 +522,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp) qemu_sem_destroy(&p->sem_sync); g_free(p->name); p->name = NULL; - g_free(p->data); - p->data = NULL; + g_clear_pointer(&p->data, multifd_send_data_free); p->packet_len = 0; g_free(p->packet); p->packet = NULL; @@ -663,7 +688,7 @@ static void *multifd_send_thread(void *opaque) (uint64_t)p->next_packet_size + p->packet_len); p->next_packet_size = 0; - multifd_set_payload_type(p->data, MULTIFD_PAYLOAD_NONE); + multifd_send_data_clear(p->data); /* 
* Making sure p->data is published before saying "we're diff --git a/migration/multifd.h b/migration/multifd.h index 026b653057e2..d2f1d0d74da7 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -140,6 +140,9 @@ static inline bool multifd_payload_empty(MultiFDSendData *data) static inline void multifd_set_payload_type(MultiFDSendData *data, MultiFDPayloadType type) { + assert(multifd_payload_empty(data)); + assert(type != MULTIFD_PAYLOAD_NONE); + data->type = type; } @@ -353,6 +356,8 @@ static inline void multifd_send_prepare_header(MultiFDSendParams *p) void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc); bool multifd_send(MultiFDSendData **send_data); MultiFDSendData *multifd_send_data_alloc(void); +void multifd_send_data_clear(MultiFDSendData *data); +void multifd_send_data_free(MultiFDSendData *data); static inline uint32_t multifd_ram_page_size(void) { From patchwork Sun Nov 17 19:20:08 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13877933 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3906FD10F28 for ; Sun, 17 Nov 2024 19:24:46 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkqn-00032u-BI; Sun, 17 Nov 2024 14:22:25 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqj-0002wf-Pf for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:21 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqh-0005vj-I2 for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:21 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkqd-00000002GUx-2phQ; Sun, 17 Nov 2024 20:22:15 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 13/24] migration/multifd: Device state transfer support - send side Date: Sun, 17 Nov 2024 20:20:08 +0100 Message-ID: X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. 
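
Putting the new helpers together, the intended ownership flow for a MultiFDSendData looks roughly as follows. This is a condensed, illustrative sketch with error handling elided; it assumes, as the existing RAM path and the device state path added later in this series do, that a successful multifd_send() hands the filled payload to a channel and leaves the caller holding an empty container for the next round.

/* "example_" names are illustrative; compare the real callers in the series. */
static MultiFDSendData *example_send_data;

static void example_setup(void)
{
    example_send_data = multifd_send_data_alloc();
}

static bool example_queue_payload(void)
{
    assert(multifd_payload_empty(example_send_data));
    multifd_set_payload_type(example_send_data, MULTIFD_PAYLOAD_RAM);
    /* ... fill example_send_data->u.ram ... */

    if (!multifd_send(&example_send_data)) {
        /* Failure: the payload is still ours; drop it but keep the object. */
        multifd_send_data_clear(example_send_data);
        return false;
    }

    return true;
}

static void example_cleanup(void)
{
    /* Clears any leftover payload, then frees the container itself. */
    g_clear_pointer(&example_send_data, multifd_send_data_free);
}
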
Szmigiero" A new function multifd_queue_device_state() is provided for device to queue its state for transmission via a multifd channel. Signed-off-by: Maciej S. Szmigiero --- include/migration/misc.h | 4 ++ migration/meson.build | 1 + migration/multifd-device-state.c | 106 +++++++++++++++++++++++++++++++ migration/multifd-nocomp.c | 11 +++- migration/multifd.c | 43 +++++++++++-- migration/multifd.h | 24 ++++--- 6 files changed, 173 insertions(+), 16 deletions(-) create mode 100644 migration/multifd-device-state.c diff --git a/include/migration/misc.h b/include/migration/misc.h index c92ca018ab3b..118e205bbcc6 100644 --- a/include/migration/misc.h +++ b/include/migration/misc.h @@ -109,4 +109,8 @@ bool migration_incoming_postcopy_advised(void); /* True if background snapshot is active */ bool migration_in_bg_snapshot(void); +/* migration/multifd-device-state.c */ +bool multifd_queue_device_state(char *idstr, uint32_t instance_id, + char *data, size_t len); + #endif diff --git a/migration/meson.build b/migration/meson.build index d53cf3417ab8..9788c47bb56e 100644 --- a/migration/meson.build +++ b/migration/meson.build @@ -22,6 +22,7 @@ system_ss.add(files( 'migration-hmp-cmds.c', 'migration.c', 'multifd.c', + 'multifd-device-state.c', 'multifd-nocomp.c', 'multifd-zlib.c', 'multifd-zero-page.c', diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c new file mode 100644 index 000000000000..7741a64fbd4d --- /dev/null +++ b/migration/multifd-device-state.c @@ -0,0 +1,106 @@ +/* + * Multifd device state migration + * + * Copyright (C) 2024 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "qemu/lockable.h" +#include "migration/misc.h" +#include "multifd.h" + +static QemuMutex queue_job_mutex; + +static MultiFDSendData *device_state_send; + +size_t multifd_device_state_payload_size(void) +{ + return sizeof(MultiFDDeviceState_t); +} + +void multifd_device_state_send_setup(void) +{ + qemu_mutex_init(&queue_job_mutex); + + device_state_send = multifd_send_data_alloc(); +} + +void multifd_device_state_clear(MultiFDDeviceState_t *device_state) +{ + g_clear_pointer(&device_state->idstr, g_free); + g_clear_pointer(&device_state->buf, g_free); +} + +void multifd_device_state_send_cleanup(void) +{ + g_clear_pointer(&device_state_send, multifd_send_data_free); + + qemu_mutex_destroy(&queue_job_mutex); +} + +static void multifd_device_state_fill_packet(MultiFDSendParams *p) +{ + MultiFDDeviceState_t *device_state = &p->data->u.device_state; + MultiFDPacketDeviceState_t *packet = p->packet_device_state; + + packet->hdr.flags = cpu_to_be32(p->flags); + strncpy(packet->idstr, device_state->idstr, sizeof(packet->idstr)); + packet->instance_id = cpu_to_be32(device_state->instance_id); + packet->next_packet_size = cpu_to_be32(p->next_packet_size); +} + +static void multifd_prepare_header_device_state(MultiFDSendParams *p) +{ + p->iov[0].iov_len = sizeof(*p->packet_device_state); + p->iov[0].iov_base = p->packet_device_state; + p->iovs_num++; +} + +void multifd_device_state_send_prepare(MultiFDSendParams *p) +{ + MultiFDDeviceState_t *device_state = &p->data->u.device_state; + + assert(multifd_payload_device_state(p->data)); + + multifd_prepare_header_device_state(p); + + assert(!(p->flags & MULTIFD_FLAG_SYNC)); + + p->next_packet_size = device_state->buf_len; + if (p->next_packet_size > 0) { + p->iov[p->iovs_num].iov_base = 
device_state->buf; + p->iov[p->iovs_num].iov_len = p->next_packet_size; + p->iovs_num++; + } + + p->flags |= MULTIFD_FLAG_NOCOMP | MULTIFD_FLAG_DEVICE_STATE; + + multifd_device_state_fill_packet(p); +} + +bool multifd_queue_device_state(char *idstr, uint32_t instance_id, + char *data, size_t len) +{ + /* Device state submissions can come from multiple threads */ + QEMU_LOCK_GUARD(&queue_job_mutex); + MultiFDDeviceState_t *device_state; + + assert(multifd_payload_empty(device_state_send)); + + multifd_set_payload_type(device_state_send, MULTIFD_PAYLOAD_DEVICE_STATE); + device_state = &device_state_send->u.device_state; + device_state->idstr = g_strdup(idstr); + device_state->instance_id = instance_id; + device_state->buf = g_memdup2(data, len); + device_state->buf_len = len; + + if (!multifd_send(&device_state_send)) { + multifd_send_data_clear(device_state_send); + return false; + } + + return true; +} diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c index fa0fd0289eca..23564ce9aea9 100644 --- a/migration/multifd-nocomp.c +++ b/migration/multifd-nocomp.c @@ -84,6 +84,13 @@ static void multifd_nocomp_send_cleanup(MultiFDSendParams *p, Error **errp) return; } +static void multifd_ram_prepare_header(MultiFDSendParams *p) +{ + p->iov[0].iov_len = p->packet_len; + p->iov[0].iov_base = p->packet; + p->iovs_num++; +} + static void multifd_send_prepare_iovs(MultiFDSendParams *p) { MultiFDPages_t *pages = &p->data->u.ram; @@ -117,7 +124,7 @@ static int multifd_nocomp_send_prepare(MultiFDSendParams *p, Error **errp) * Only !zerocopy needs the header in IOV; zerocopy will * send it separately. */ - multifd_send_prepare_header(p); + multifd_ram_prepare_header(p); } multifd_send_prepare_iovs(p); @@ -368,7 +375,7 @@ bool multifd_send_prepare_common(MultiFDSendParams *p) return false; } - multifd_send_prepare_header(p); + multifd_ram_prepare_header(p); return true; } diff --git a/migration/multifd.c b/migration/multifd.c index 730acf55cfad..56419af417cc 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -12,6 +12,7 @@ #include "qemu/osdep.h" #include "qemu/cutils.h" +#include "qemu/iov.h" #include "qemu/rcu.h" #include "exec/target_page.h" #include "sysemu/sysemu.h" @@ -19,6 +20,7 @@ #include "qemu/error-report.h" #include "qapi/error.h" #include "file.h" +#include "migration/misc.h" #include "migration.h" #include "migration-stats.h" #include "savevm.h" @@ -111,7 +113,9 @@ MultiFDSendData *multifd_send_data_alloc(void) * added to the union in the future are larger than * (MultiFDPages_t + flex array). */ - max_payload_size = MAX(multifd_ram_payload_size(), sizeof(MultiFDPayload)); + max_payload_size = MAX(multifd_ram_payload_size(), + multifd_device_state_payload_size()); + max_payload_size = MAX(max_payload_size, sizeof(MultiFDPayload)); /* * Account for any holes the compiler might insert. 
We can't pack @@ -130,6 +134,9 @@ void multifd_send_data_clear(MultiFDSendData *data) } switch (data->type) { + case MULTIFD_PAYLOAD_DEVICE_STATE: + multifd_device_state_clear(&data->u.device_state); + break; default: /* Nothing to do */ break; @@ -232,6 +239,7 @@ static int multifd_recv_initial_packet(QIOChannel *c, Error **errp) return msg.id; } +/* Fills a RAM multifd packet */ void multifd_send_fill_packet(MultiFDSendParams *p) { MultiFDPacket_t *packet = p->packet; @@ -524,6 +532,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp) p->name = NULL; g_clear_pointer(&p->data, multifd_send_data_free); p->packet_len = 0; + g_clear_pointer(&p->packet_device_state, g_free); g_free(p->packet); p->packet = NULL; multifd_send_state->ops->send_cleanup(p, errp); @@ -536,6 +545,7 @@ static void multifd_send_cleanup_state(void) { file_cleanup_outgoing_migration(); socket_cleanup_outgoing_migration(); + multifd_device_state_send_cleanup(); qemu_sem_destroy(&multifd_send_state->channels_created); qemu_sem_destroy(&multifd_send_state->channels_ready); qemu_mutex_destroy(&multifd_send_state->multifd_send_mutex); @@ -662,16 +672,33 @@ static void *multifd_send_thread(void *opaque) * qatomic_store_release() in multifd_send(). */ if (qatomic_load_acquire(&p->pending_job)) { + bool is_device_state = multifd_payload_device_state(p->data); + size_t total_size; + p->flags = 0; p->iovs_num = 0; assert(!multifd_payload_empty(p->data)); - ret = multifd_send_state->ops->send_prepare(p, &local_err); - if (ret != 0) { - break; + if (is_device_state) { + multifd_device_state_send_prepare(p); + + total_size = iov_size(p->iov, p->iovs_num); + } else { + ret = multifd_send_state->ops->send_prepare(p, &local_err); + if (ret != 0) { + break; + } + + /* + * Can't just always measure IOVs since these do not include + * packet header in the zerocopy RAM case. 
+ */ + total_size = (uint64_t)p->next_packet_size + p->packet_len; } if (migrate_mapped_ram()) { + assert(!is_device_state); + ret = file_write_ramblock_iov(p->c, p->iov, p->iovs_num, &p->data->u.ram, &local_err); } else { @@ -684,8 +711,7 @@ static void *multifd_send_thread(void *opaque) break; } - stat64_add(&mig_stats.multifd_bytes, - (uint64_t)p->next_packet_size + p->packet_len); + stat64_add(&mig_stats.multifd_bytes, total_size); p->next_packet_size = 0; multifd_send_data_clear(p->data); @@ -903,6 +929,9 @@ bool multifd_send_setup(void) p->packet_len = sizeof(MultiFDPacket_t) + sizeof(uint64_t) * page_count; p->packet = g_malloc0(p->packet_len); + p->packet_device_state = g_malloc0(sizeof(*p->packet_device_state)); + p->packet_device_state->hdr.magic = cpu_to_be32(MULTIFD_MAGIC); + p->packet_device_state->hdr.version = cpu_to_be32(MULTIFD_VERSION); } p->name = g_strdup_printf(MIGRATION_THREAD_SRC_MULTIFD, i); p->write_flags = 0; @@ -938,6 +967,8 @@ bool multifd_send_setup(void) assert(p->iov); } + multifd_device_state_send_setup(); + return true; err: diff --git a/migration/multifd.h b/migration/multifd.h index d2f1d0d74da7..dec7d9404434 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -121,10 +121,12 @@ typedef struct { typedef enum { MULTIFD_PAYLOAD_NONE, MULTIFD_PAYLOAD_RAM, + MULTIFD_PAYLOAD_DEVICE_STATE, } MultiFDPayloadType; typedef union MultiFDPayload { MultiFDPages_t ram; + MultiFDDeviceState_t device_state; } MultiFDPayload; struct MultiFDSendData { @@ -137,6 +139,11 @@ static inline bool multifd_payload_empty(MultiFDSendData *data) return data->type == MULTIFD_PAYLOAD_NONE; } +static inline bool multifd_payload_device_state(MultiFDSendData *data) +{ + return data->type == MULTIFD_PAYLOAD_DEVICE_STATE; +} + static inline void multifd_set_payload_type(MultiFDSendData *data, MultiFDPayloadType type) { @@ -188,8 +195,9 @@ typedef struct { /* thread local variables. No locking required */ - /* pointer to the packet */ + /* pointers to the possible packet types */ MultiFDPacket_t *packet; + MultiFDPacketDeviceState_t *packet_device_state; /* size of the next packet that contains pages */ uint32_t next_packet_size; /* packets sent through this channel */ @@ -346,13 +354,6 @@ bool multifd_send_prepare_common(MultiFDSendParams *p); void multifd_send_zero_page_detect(MultiFDSendParams *p); void multifd_recv_zero_page_process(MultiFDRecvParams *p); -static inline void multifd_send_prepare_header(MultiFDSendParams *p) -{ - p->iov[0].iov_len = p->packet_len; - p->iov[0].iov_base = p->packet; - p->iovs_num++; -} - void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc); bool multifd_send(MultiFDSendData **send_data); MultiFDSendData *multifd_send_data_alloc(void); @@ -375,4 +376,11 @@ int multifd_ram_flush_and_sync(void); size_t multifd_ram_payload_size(void); void multifd_ram_fill_packet(MultiFDSendParams *p); int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp); + +size_t multifd_device_state_payload_size(void); +void multifd_device_state_send_setup(void); +void multifd_device_state_clear(MultiFDDeviceState_t *device_state); +void multifd_device_state_send_cleanup(void); +void multifd_device_state_send_prepare(MultiFDSendParams *p); + #endif From patchwork Sun Nov 17 19:20:09 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. 
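
A hedged sketch of the producer side: a device carving its state into chunks and pushing them through the new API. Only multifd_queue_device_state(), and the fact that it g_memdup2()s the payload so the caller keeps ownership of its own buffer, comes from this patch; the my_dev_ helper and the chunk size are made up for illustration.

static ssize_t my_dev_read_state(MyDevState *dev, char *buf, size_t max);  /* hypothetical */

static int my_dev_send_state(MyDevState *dev, const char *idstr,
                             uint32_t instance_id)
{
    char chunk[64 * 1024];              /* arbitrary illustrative chunk size */
    ssize_t len;

    while ((len = my_dev_read_state(dev, chunk, sizeof(chunk))) > 0) {
        /*
         * The payload is copied into the queued MultiFDSendData, so chunk
         * can be reused immediately on the next iteration.
         */
        if (!multifd_queue_device_state((char *)idstr, instance_id,
                                        chunk, len)) {
            return -EIO;
        }
    }

    return len < 0 ? -EIO : 0;
}
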
Szmigiero" X-Patchwork-Id: 13877937 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 098DDD10F28 for ; Sun, 17 Nov 2024 19:25:15 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkqs-0003Sd-Ud; Sun, 17 Nov 2024 14:22:31 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqq-0003HO-24 for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:28 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqn-0005w9-3Y for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:27 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkqi-00000002GVB-3S6p; Sun, 17 Nov 2024 20:22:20 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 14/24] migration/multifd: Make MultiFDSendData a struct Date: Sun, 17 Nov 2024 20:20:09 +0100 Message-ID: <90a4097cce137a99a913566c46ff736fd36e7d25.1731773021.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Peter Xu The newly introduced device state buffer can be used for either storing VFIO's read() raw data, but already also possible to store generic device states. After noticing that device states may not easily provide a max buffer size (also the fact that RAM MultiFDPages_t after all also want to have flexibility on managing offset[] array), it may not be a good idea to stick with union on MultiFDSendData.. as it won't play well with such flexibility. Switch MultiFDSendData to a struct. It won't consume a lot more space in reality, after all the real buffers were already dynamically allocated, so it's so far only about the two structs (pages, device_state) that will be duplicated, but they're small. With this, we can remove the pretty hard to understand alloc size logic. Because now we can allocate offset[] together with the SendData, and properly free it when the SendData is freed. 
Signed-off-by: Peter Xu [MSS: Make sure to clear possible device state payload before freeing MultiFDSendData, remove placeholders for other patches not included] Signed-off-by: Maciej S. Szmigiero --- migration/multifd-device-state.c | 5 ----- migration/multifd-nocomp.c | 13 ++++++------- migration/multifd.c | 25 +++++++------------------ migration/multifd.h | 14 +++++++++----- 4 files changed, 22 insertions(+), 35 deletions(-) diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c index 7741a64fbd4d..8cf5a6c2668c 100644 --- a/migration/multifd-device-state.c +++ b/migration/multifd-device-state.c @@ -16,11 +16,6 @@ static QemuMutex queue_job_mutex; static MultiFDSendData *device_state_send; -size_t multifd_device_state_payload_size(void) -{ - return sizeof(MultiFDDeviceState_t); -} - void multifd_device_state_send_setup(void) { qemu_mutex_init(&queue_job_mutex); diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c index 23564ce9aea9..90c0927b9bcb 100644 --- a/migration/multifd-nocomp.c +++ b/migration/multifd-nocomp.c @@ -23,15 +23,14 @@ static MultiFDSendData *multifd_ram_send; -size_t multifd_ram_payload_size(void) +void multifd_ram_payload_alloc(MultiFDPages_t *pages) { - uint32_t n = multifd_ram_page_count(); + pages->offset = g_new0(ram_addr_t, multifd_ram_page_count()); +} - /* - * We keep an array of page offsets at the end of MultiFDPages_t, - * add space for it in the allocation. - */ - return sizeof(MultiFDPages_t) + n * sizeof(ram_addr_t); +void multifd_ram_payload_free(MultiFDPages_t *pages) +{ + g_clear_pointer(&pages->offset, g_free); } void multifd_ram_save_setup(void) diff --git a/migration/multifd.c b/migration/multifd.c index 56419af417cc..4b03253f739e 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -105,26 +105,12 @@ struct { MultiFDSendData *multifd_send_data_alloc(void) { - size_t max_payload_size, size_minus_payload; + MultiFDSendData *new = g_new0(MultiFDSendData, 1); - /* - * MultiFDPages_t has a flexible array at the end, account for it - * when allocating MultiFDSendData. Use max() in case other types - * added to the union in the future are larger than - * (MultiFDPages_t + flex array). - */ - max_payload_size = MAX(multifd_ram_payload_size(), - multifd_device_state_payload_size()); - max_payload_size = MAX(max_payload_size, sizeof(MultiFDPayload)); - - /* - * Account for any holes the compiler might insert. We can't pack - * the structure because that misaligns the members and triggers - * Waddress-of-packed-member. - */ - size_minus_payload = sizeof(MultiFDSendData) - sizeof(MultiFDPayload); + multifd_ram_payload_alloc(&new->u.ram); + /* Device state allocates its payload on-demand */ - return g_malloc0(size_minus_payload + max_payload_size); + return new; } void multifd_send_data_clear(MultiFDSendData *data) @@ -151,8 +137,11 @@ void multifd_send_data_free(MultiFDSendData *data) return; } + /* This also free's device state payload */ multifd_send_data_clear(data); + multifd_ram_payload_free(&data->u.ram); + g_free(data); } diff --git a/migration/multifd.h b/migration/multifd.h index dec7d9404434..05ddfb4bf119 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -99,9 +99,13 @@ typedef struct { uint32_t num; /* number of normal pages */ uint32_t normal_num; + /* + * Pointer to the ramblock. NOTE: it's caller's responsibility to make + * sure the pointer is always valid! 
+ */ RAMBlock *block; - /* offset of each page */ - ram_addr_t offset[]; + /* offset array of each page, managed by multifd */ + ram_addr_t *offset; } MultiFDPages_t; struct MultiFDRecvData { @@ -124,7 +128,7 @@ typedef enum { MULTIFD_PAYLOAD_DEVICE_STATE, } MultiFDPayloadType; -typedef union MultiFDPayload { +typedef struct MultiFDPayload { MultiFDPages_t ram; MultiFDDeviceState_t device_state; } MultiFDPayload; @@ -373,11 +377,11 @@ static inline uint32_t multifd_ram_page_count(void) void multifd_ram_save_setup(void); void multifd_ram_save_cleanup(void); int multifd_ram_flush_and_sync(void); -size_t multifd_ram_payload_size(void); +void multifd_ram_payload_alloc(MultiFDPages_t *pages); +void multifd_ram_payload_free(MultiFDPages_t *pages); void multifd_ram_fill_packet(MultiFDSendParams *p); int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp); -size_t multifd_device_state_payload_size(void); void multifd_device_state_send_setup(void); void multifd_device_state_clear(MultiFDDeviceState_t *device_state); void multifd_device_state_send_cleanup(void); From patchwork Sun Nov 17 19:20:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13877920 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C14ABD10F26 for ; Sun, 17 Nov 2024 19:22:54 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkqw-0003g6-UH; Sun, 17 Nov 2024 14:22:35 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqu-0003XX-BH for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:32 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqr-0005we-Mq for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:32 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkqn-00000002GVP-43q7; Sun, 17 Nov 2024 20:22:25 +0100 From: "Maciej S. 
Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 15/24] migration/multifd: Add migration_has_device_state_support() Date: Sun, 17 Nov 2024 20:20:10 +0100 Message-ID: X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Since device state transfer via multifd channels requires multifd channels with packets and is currently not compatible with multifd compression add an appropriate query function so device can learn whether it can actually make use of it. Signed-off-by: Maciej S. Szmigiero --- include/migration/misc.h | 1 + migration/multifd-device-state.c | 7 +++++++ 2 files changed, 8 insertions(+) diff --git a/include/migration/misc.h b/include/migration/misc.h index 118e205bbcc6..43558d9198f7 100644 --- a/include/migration/misc.h +++ b/include/migration/misc.h @@ -112,5 +112,6 @@ bool migration_in_bg_snapshot(void); /* migration/multifd-device-state.c */ bool multifd_queue_device_state(char *idstr, uint32_t instance_id, char *data, size_t len); +bool migration_has_device_state_support(void); #endif diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c index 8cf5a6c2668c..bcbea926b6be 100644 --- a/migration/multifd-device-state.c +++ b/migration/multifd-device-state.c @@ -11,6 +11,7 @@ #include "qemu/lockable.h" #include "migration/misc.h" #include "multifd.h" +#include "options.h" static QemuMutex queue_job_mutex; @@ -99,3 +100,9 @@ bool multifd_queue_device_state(char *idstr, uint32_t instance_id, return true; } + +bool migration_has_device_state_support(void) +{ + return migrate_multifd() && !migrate_mapped_ram() && + migrate_multifd_compression() == MULTIFD_COMPRESSION_NONE; +} From patchwork Sun Nov 17 19:20:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. 
Szmigiero" X-Patchwork-Id: 13877935 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 65893D10F28 for ; Sun, 17 Nov 2024 19:24:56 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkr1-0004BY-Qa; Sun, 17 Nov 2024 14:22:39 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqz-0003u7-5C for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:37 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkqx-0005x2-Et for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:36 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkqt-00000002GVd-0SzF; Sun, 17 Nov 2024 20:22:31 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 16/24] migration/multifd: Send final SYNC only after device state is complete Date: Sun, 17 Nov 2024 20:20:11 +0100 Message-ID: <0b8131dc6107841969d254e88e9d6e14220f1ea5.1731773021.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Currently, ram_save_complete() sends a final SYNC multifd packet near this function end, after sending all of the remaining RAM data. On the receive side, this SYNC packet will cause multifd channel threads to block, waiting for the final sem_sync posting in multifd_recv_terminate_threads(). However, multifd_recv_terminate_threads() won't be called until the migration is complete, which causes a problem if multifd channels are still required for transferring device state data after RAM transfer is complete but before finishing the migration process. Defer sending the final SYNC packet to the end of sending of post-switchover iterable data instead if device state transfer is possible. Signed-off-by: Maciej S. 
Szmigiero --- migration/multifd-nocomp.c | 18 +++++++++++++++++- migration/multifd.h | 1 + migration/ram.c | 10 +++++++++- migration/savevm.c | 11 +++++++++++ 4 files changed, 38 insertions(+), 2 deletions(-) diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c index 90c0927b9bcb..db87b1262ffa 100644 --- a/migration/multifd-nocomp.c +++ b/migration/multifd-nocomp.c @@ -348,7 +348,7 @@ retry: return true; } -int multifd_ram_flush_and_sync(void) +int multifd_ram_flush(void) { if (!migrate_multifd()) { return 0; @@ -361,6 +361,22 @@ int multifd_ram_flush_and_sync(void) } } + return 0; +} + +int multifd_ram_flush_and_sync(void) +{ + int ret; + + if (!migrate_multifd()) { + return 0; + } + + ret = multifd_ram_flush(); + if (ret) { + return ret; + } + return multifd_send_sync_main(); } diff --git a/migration/multifd.h b/migration/multifd.h index 05ddfb4bf119..3abf9578e2ae 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -376,6 +376,7 @@ static inline uint32_t multifd_ram_page_count(void) void multifd_ram_save_setup(void); void multifd_ram_save_cleanup(void); +int multifd_ram_flush(void); int multifd_ram_flush_and_sync(void); void multifd_ram_payload_alloc(MultiFDPages_t *pages); void multifd_ram_payload_free(MultiFDPages_t *pages); diff --git a/migration/ram.c b/migration/ram.c index 05ff9eb32876..cf7bea3f073b 100644 --- a/migration/ram.c +++ b/migration/ram.c @@ -3283,7 +3283,15 @@ static int ram_save_complete(QEMUFile *f, void *opaque) } } - ret = multifd_ram_flush_and_sync(); + if (migration_has_device_state_support()) { + /* + * Can't do the final SYNC here since device state might still + * be transferring via multifd channels. + */ + ret = multifd_ram_flush(); + } else { + ret = multifd_ram_flush_and_sync(); + } if (ret < 0) { return ret; } diff --git a/migration/savevm.c b/migration/savevm.c index 6ea9054c4083..98049cb9b09a 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -37,6 +37,7 @@ #include "migration/register.h" #include "migration/global_state.h" #include "migration/channel-block.h" +#include "multifd.h" #include "ram.h" #include "qemu-file.h" #include "savevm.h" @@ -1496,6 +1497,7 @@ int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy) int64_t start_ts_each, end_ts_each; SaveStateEntry *se; int ret; + bool multifd_device_state = migration_has_device_state_support(); QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { if (!se->ops || @@ -1528,6 +1530,15 @@ int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy) end_ts_each - start_ts_each); } + if (multifd_device_state) { + /* Send the final SYNC */ + ret = multifd_send_sync_main(); + if (ret) { + qemu_file_set_error(f, ret); + return -1; + } + } + trace_vmstate_downtime_checkpoint("src-iterable-saved"); return 0; From patchwork Sun Nov 17 19:20:12 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. 
Szmigiero" X-Patchwork-Id: 13877921 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6C7D8D10F26 for ; Sun, 17 Nov 2024 19:23:17 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkr5-0004XP-Ky; Sun, 17 Nov 2024 14:22:43 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkr3-0004P2-PY for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:41 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkr1-0005xQ-Nx for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:41 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkqy-00000002GVq-15Bf; Sun, 17 Nov 2024 20:22:36 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 17/24] migration: Add save_live_complete_precopy_thread handler Date: Sun, 17 Nov 2024 20:20:12 +0100 Message-ID: <87f31f02484d33e9a3bf1df08b0c8a9fa5fa9fe0.1731773021.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" This SaveVMHandler helps a device provide its own asynchronous transmission of the remaining data at the end of a precopy phase via multifd channels, in parallel with the transfer done by save_live_complete_precopy handlers. These threads are launched only when multifd device state transfer is supported. Management of these threads is done in the multifd migration code, wrapping them in the generic thread pool. Signed-off-by: Maciej S. 
Szmigiero --- include/migration/misc.h | 8 +++ include/migration/register.h | 23 +++++++++ include/qemu/typedefs.h | 4 ++ migration/multifd-device-state.c | 85 ++++++++++++++++++++++++++++++++ migration/savevm.c | 33 ++++++++++++- 5 files changed, 152 insertions(+), 1 deletion(-) diff --git a/include/migration/misc.h b/include/migration/misc.h index 43558d9198f7..67014122dcff 100644 --- a/include/migration/misc.h +++ b/include/migration/misc.h @@ -114,4 +114,12 @@ bool multifd_queue_device_state(char *idstr, uint32_t instance_id, char *data, size_t len); bool migration_has_device_state_support(void); +void +multifd_spawn_device_state_save_thread(SaveLiveCompletePrecopyThreadHandler hdlr, + char *idstr, uint32_t instance_id, + void *opaque); + +void multifd_abort_device_state_save_threads(void); +int multifd_join_device_state_save_threads(void); + #endif diff --git a/include/migration/register.h b/include/migration/register.h index 761e4e4d8bcb..ab702e0a930b 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -105,6 +105,29 @@ typedef struct SaveVMHandlers { */ int (*save_live_complete_precopy)(QEMUFile *f, void *opaque); + /* This runs in a separate thread. */ + + /** + * @save_live_complete_precopy_thread + * + * Called at the end of a precopy phase from a separate worker thread + * in configurations where multifd device state transfer is supported + * in order to perform asynchronous transmission of the remaining data in + * parallel with @save_live_complete_precopy handlers. + * When postcopy is enabled, devices that support postcopy will skip this + * step. + * + * @idstr: this device section idstr + * @instance_id: this device section instance_id + * @abort_flag: flag indicating that the migration core wants to abort + * the transmission and so the handler should exit ASAP. To be read by + * qatomic_read() or similar. + * @opaque: data pointer passed to register_savevm_live() + * + * Returns zero to indicate success and negative for error + */ + SaveLiveCompletePrecopyThreadHandler save_live_complete_precopy_thread; + /* This runs both outside and inside the BQL. 
*/ /** diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h index 8c8ea5c2840d..926baaad211f 100644 --- a/include/qemu/typedefs.h +++ b/include/qemu/typedefs.h @@ -132,5 +132,9 @@ typedef struct IRQState *qemu_irq; */ typedef void (*qemu_irq_handler)(void *opaque, int n, int level); typedef int (*MigrationLoadThread)(bool *abort_flag, void *opaque); +typedef int (*SaveLiveCompletePrecopyThreadHandler)(char *idstr, + uint32_t instance_id, + bool *abort_flag, + void *opaque); #endif /* QEMU_TYPEDEFS_H */ diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c index bcbea926b6be..74a4aef346c8 100644 --- a/migration/multifd-device-state.c +++ b/migration/multifd-device-state.c @@ -9,12 +9,17 @@ #include "qemu/osdep.h" #include "qemu/lockable.h" +#include "block/thread-pool.h" #include "migration/misc.h" #include "multifd.h" #include "options.h" static QemuMutex queue_job_mutex; +static ThreadPool *send_threads; +static int send_threads_ret; +static bool send_threads_abort; + static MultiFDSendData *device_state_send; void multifd_device_state_send_setup(void) @@ -22,6 +27,10 @@ void multifd_device_state_send_setup(void) qemu_mutex_init(&queue_job_mutex); device_state_send = multifd_send_data_alloc(); + + send_threads = thread_pool_new(); + send_threads_ret = 0; + send_threads_abort = false; } void multifd_device_state_clear(MultiFDDeviceState_t *device_state) @@ -32,6 +41,7 @@ void multifd_device_state_clear(MultiFDDeviceState_t *device_state) void multifd_device_state_send_cleanup(void) { + g_clear_pointer(&send_threads, thread_pool_free); g_clear_pointer(&device_state_send, multifd_send_data_free); qemu_mutex_destroy(&queue_job_mutex); @@ -106,3 +116,78 @@ bool migration_has_device_state_support(void) return migrate_multifd() && !migrate_mapped_ram() && migrate_multifd_compression() == MULTIFD_COMPRESSION_NONE; } + +struct MultiFDDSSaveThreadData { + SaveLiveCompletePrecopyThreadHandler hdlr; + char *idstr; + uint32_t instance_id; + void *handler_opaque; +}; + +static void multifd_device_state_save_thread_data_free(void *opaque) +{ + struct MultiFDDSSaveThreadData *data = opaque; + + g_clear_pointer(&data->idstr, g_free); + g_free(data); +} + +static int multifd_device_state_save_thread(void *opaque) +{ + struct MultiFDDSSaveThreadData *data = opaque; + int ret; + + ret = data->hdlr(data->idstr, data->instance_id, &send_threads_abort, + data->handler_opaque); + if (ret && !qatomic_read(&send_threads_ret)) { + /* + * Racy with the above read but that's okay - which thread error + * return we report is purely arbitrary anyway. + */ + qatomic_set(&send_threads_ret, ret); + } + + return 0; +} + +void +multifd_spawn_device_state_save_thread(SaveLiveCompletePrecopyThreadHandler hdlr, + char *idstr, uint32_t instance_id, + void *opaque) +{ + struct MultiFDDSSaveThreadData *data; + + assert(migration_has_device_state_support()); + + data = g_new(struct MultiFDDSSaveThreadData, 1); + data->hdlr = hdlr; + data->idstr = g_strdup(idstr); + data->instance_id = instance_id; + data->handler_opaque = opaque; + + thread_pool_submit(send_threads, + multifd_device_state_save_thread, + data, multifd_device_state_save_thread_data_free); + + /* + * Make sure that this new thread is actually spawned immediately so it + * can start its work right now. 
+ */ + thread_pool_adjust_max_threads_to_work(send_threads); +} + +void multifd_abort_device_state_save_threads(void) +{ + assert(migration_has_device_state_support()); + + qatomic_set(&send_threads_abort, true); +} + +int multifd_join_device_state_save_threads(void) +{ + assert(migration_has_device_state_support()); + + thread_pool_wait(send_threads); + + return send_threads_ret; +} diff --git a/migration/savevm.c b/migration/savevm.c index 98049cb9b09a..177849e7d493 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -1499,6 +1499,23 @@ int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy) int ret; bool multifd_device_state = migration_has_device_state_support(); + if (multifd_device_state) { + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { + SaveLiveCompletePrecopyThreadHandler hdlr; + + if (!se->ops || (in_postcopy && se->ops->has_postcopy && + se->ops->has_postcopy(se->opaque)) || + !se->ops->save_live_complete_precopy_thread) { + continue; + } + + hdlr = se->ops->save_live_complete_precopy_thread; + multifd_spawn_device_state_save_thread(hdlr, + se->idstr, se->instance_id, + se->opaque); + } + } + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { if (!se->ops || (in_postcopy && se->ops->has_postcopy && @@ -1523,7 +1540,7 @@ int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy) save_section_footer(f, se); if (ret < 0) { qemu_file_set_error(f, ret); - return -1; + goto ret_fail_abort_threads; } end_ts_each = qemu_clock_get_us(QEMU_CLOCK_REALTIME); trace_vmstate_downtime_save("iterable", se->idstr, se->instance_id, @@ -1531,6 +1548,12 @@ int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy) } if (multifd_device_state) { + ret = multifd_join_device_state_save_threads(); + if (ret) { + qemu_file_set_error(f, ret); + return -1; + } + /* Send the final SYNC */ ret = multifd_send_sync_main(); if (ret) { @@ -1542,6 +1565,14 @@ int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy) trace_vmstate_downtime_checkpoint("src-iterable-saved"); return 0; + +ret_fail_abort_threads: + if (multifd_device_state) { + multifd_abort_device_state_save_threads(); + multifd_join_device_state_save_threads(); + } + + return -1; } int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f, From patchwork Sun Nov 17 19:20:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. 
Szmigiero" X-Patchwork-Id: 13877934 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 74A5DD10F28 for ; Sun, 17 Nov 2024 19:24:53 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkrE-00057n-Ml; Sun, 17 Nov 2024 14:22:56 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkr8-0004ux-67 for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:46 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkr6-0005xk-EY for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:45 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkr3-00000002GW1-1duI; Sun, 17 Nov 2024 20:22:41 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 18/24] vfio/migration: Don't run load cleanup if load setup didn't run Date: Sun, 17 Nov 2024 20:20:13 +0100 Message-ID: <72424ece45968b1ae6b39750917a041867c415ab.1731773021.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" It's possible for load_cleanup SaveVMHandler to get called without load_setup handler being called first. Since we'll soon be running cleanup operations there that access objects that need earlier initialization in load_setup, let's make sure these cleanups only run when the load_setup handler has indeed been called earlier. Signed-off-by: Maciej S. 
Szmigiero --- hw/vfio/migration.c | 21 +++++++++++++++++++-- include/hw/vfio/vfio-common.h | 1 + 2 files changed, 20 insertions(+), 2 deletions(-) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index 01aa11013e42..9e2657073012 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -688,16 +688,33 @@ static void vfio_save_state(QEMUFile *f, void *opaque) static int vfio_load_setup(QEMUFile *f, void *opaque, Error **errp) { VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; + int ret; + + assert(!migration->load_setup); + + ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, + migration->device_state, errp); + if (ret) { + return ret; + } - return vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, - vbasedev->migration->device_state, errp); + migration->load_setup = true; + + return 0; } static int vfio_load_cleanup(void *opaque) { VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; + + if (!migration->load_setup) { + return 0; + } vfio_migration_cleanup(vbasedev); + migration->load_setup = false; trace_vfio_load_cleanup(vbasedev->name); return 0; diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index e0ce6ec3a9b3..246250ed8b75 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -66,6 +66,7 @@ typedef struct VFIOMigration { VMChangeStateEntry *vm_state; NotifierWithReturn migration_state; uint32_t device_state; + bool load_setup; int data_fd; void *data_buffer; size_t data_buffer_size; From patchwork Sun Nov 17 19:20:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13877931 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3074AD10F26 for ; Sun, 17 Nov 2024 19:24:28 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkrQ-0005Vz-GQ; Sun, 17 Nov 2024 14:23:05 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkrD-00058E-8h for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:52 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkrB-0005yH-PW for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:51 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkr8-00000002GWD-2D0b; Sun, 17 Nov 2024 20:22:46 +0100 From: "Maciej S. 
Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 19/24] vfio/migration: Add x-migration-multifd-transfer VFIO property Date: Sun, 17 Nov 2024 20:20:14 +0100 Message-ID: X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" This property allows configuring at runtime whether to transfer the particular device state via multifd channels when live migrating that device. It defaults to AUTO, which means that VFIO device state transfer via multifd channels is attempted in configurations that otherwise support it. Signed-off-by: Maciej S. Szmigiero --- hw/core/machine.c | 1 + hw/vfio/pci.c | 9 +++++++++ include/hw/vfio/vfio-common.h | 1 + 3 files changed, 11 insertions(+) diff --git a/hw/core/machine.c b/hw/core/machine.c index ed8d39fd769f..fda0f8280edd 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -39,6 +39,7 @@ GlobalProperty hw_compat_9_1[] = { { TYPE_PCI_DEVICE, "x-pcie-ext-tag", "false" }, { "migration", "send-switchover-start", "off"}, + { "vfio-pci", "x-migration-multifd-transfer", "off" }, }; const size_t hw_compat_9_1_len = G_N_ELEMENTS(hw_compat_9_1); diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 14bcc725c301..9d547cb5cdff 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3354,6 +3354,8 @@ static void vfio_instance_init(Object *obj) pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS; } +static PropertyInfo qdev_prop_on_off_auto_mutable; + static Property vfio_pci_dev_properties[] = { DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host), DEFINE_PROP_UUID_NODEFAULT("vf-token", VFIOPCIDevice, vf_token), @@ -3378,6 +3380,10 @@ static Property vfio_pci_dev_properties[] = { VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false), DEFINE_PROP_ON_OFF_AUTO("enable-migration", VFIOPCIDevice, vbasedev.enable_migration, ON_OFF_AUTO_AUTO), + DEFINE_PROP("x-migration-multifd-transfer", VFIOPCIDevice, + vbasedev.migration_multifd_transfer, + qdev_prop_on_off_auto_mutable, OnOffAuto, + .set_default = true, .defval.i = ON_OFF_AUTO_AUTO), DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice, vbasedev.migration_events, false), DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false), @@ -3475,6 +3481,9 @@ static const TypeInfo vfio_pci_nohotplug_dev_info = { static void register_vfio_pci_dev_type(void) { + qdev_prop_on_off_auto_mutable = qdev_prop_on_off_auto; + qdev_prop_on_off_auto_mutable.realized_set_allowed = true; + type_register_static(&vfio_pci_dev_info); type_register_static(&vfio_pci_nohotplug_dev_info); } diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 246250ed8b75..b1c03a82eec8 100644 --- 
a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -134,6 +134,7 @@ typedef struct VFIODevice { bool no_mmap; bool ram_block_discard_allowed; OnOffAuto enable_migration; + OnOffAuto migration_multifd_transfer; bool migration_events; VFIODeviceOps *ops; unsigned int num_irqs; From patchwork Sun Nov 17 19:20:15 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13877928 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DEBBED10F28 for ; Sun, 17 Nov 2024 19:24:10 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkrT-0005oN-MR; Sun, 17 Nov 2024 14:23:07 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkrI-0005J7-Ra for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:58 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkrH-0005yc-DD for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:22:56 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkrD-00000002GWO-2nNh; Sun, 17 Nov 2024 20:22:51 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 20/24] vfio/migration: Add load_device_config_state_start trace event Date: Sun, 17 Nov 2024 20:20:15 +0100 Message-ID: <9c6f1db39fd6e8e12cfb8946cd4d2a39ad77564c.1731773021.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" And rename existing load_device_config_state trace event to load_device_config_state_end for consistency since it is triggered at the end of loading of the VFIO device config state. This way both the start and end points of particular device config loading operation (a long, BQL-serialized operation) are known. Signed-off-by: Maciej S. 
Szmigiero --- hw/vfio/migration.c | 4 +++- hw/vfio/trace-events | 3 ++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index 9e2657073012..4b2b06b45195 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -285,6 +285,8 @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque) VFIODevice *vbasedev = opaque; uint64_t data; + trace_vfio_load_device_config_state_start(vbasedev->name); + if (vbasedev->ops && vbasedev->ops->vfio_load_config) { int ret; @@ -303,7 +305,7 @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque) return -EINVAL; } - trace_vfio_load_device_config_state(vbasedev->name); + trace_vfio_load_device_config_state_end(vbasedev->name); return qemu_file_get_error(f); } diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index cab1cf1de0a2..1bebe9877d88 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -149,7 +149,8 @@ vfio_display_edid_write_error(void) "" # migration.c vfio_load_cleanup(const char *name) " (%s)" -vfio_load_device_config_state(const char *name) " (%s)" +vfio_load_device_config_state_start(const char *name) " (%s)" +vfio_load_device_config_state_end(const char *name) " (%s)" vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64 vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d" vfio_migration_realize(const char *name) " (%s)" From patchwork Sun Nov 17 19:20:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13877930 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D48C6D10F26 for ; Sun, 17 Nov 2024 19:24:15 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkrg-0006nc-Ix; Sun, 17 Nov 2024 14:23:20 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkrY-0006Tx-Uh for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:23:13 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkrU-0005ys-Dp for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:23:12 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkrI-00000002GWY-3Osl; Sun, 17 Nov 2024 20:22:56 +0100 From: "Maciej S. 
Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 21/24] vfio/migration: Convert bytes_transferred counter to atomic Date: Sun, 17 Nov 2024 20:20:16 +0100 Message-ID: <069dfb52ef5edb06366e94b98a257e17451dda14.1731773021.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_NONE=0.001, T_SPF_HELO_TEMPERROR=0.01 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" So it can be safety accessed from multiple threads. Signed-off-by: Maciej S. Szmigiero --- hw/vfio/migration.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index 4b2b06b45195..683f2ae98d5e 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -391,7 +391,7 @@ static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration) qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE); qemu_put_be64(f, data_size); qemu_put_buffer(f, migration->data_buffer, data_size); - bytes_transferred += data_size; + qatomic_add(&bytes_transferred, data_size); trace_vfio_save_block(migration->vbasedev->name, data_size); @@ -1030,12 +1030,12 @@ static int vfio_block_migration(VFIODevice *vbasedev, Error *err, Error **errp) int64_t vfio_mig_bytes_transferred(void) { - return bytes_transferred; + return qatomic_read(&bytes_transferred); } void vfio_reset_bytes_transferred(void) { - bytes_transferred = 0; + qatomic_set(&bytes_transferred, 0); } /* From patchwork Sun Nov 17 19:20:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. 
Szmigiero" X-Patchwork-Id: 13877926 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B3AACD10F29 for ; Sun, 17 Nov 2024 19:23:51 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkrb-0006Zr-Hg; Sun, 17 Nov 2024 14:23:15 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkrW-0006HB-6x for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:23:11 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkrT-0005zR-DW for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:23:09 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkrN-00000002GWk-3yMY; Sun, 17 Nov 2024 20:23:01 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 22/24] vfio/migration: Multifd device state transfer support - receive side Date: Sun, 17 Nov 2024 20:20:17 +0100 Message-ID: <0567baf58cd68716d06cbe4db360149322e0cfcc.1731773021.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" The multifd received data needs to be reassembled since device state packets sent via different multifd channels can arrive out-of-order. Therefore, each VFIO device state packet carries a header indicating its position in the stream. The last such VFIO device state packet should have VFIO_DEVICE_STATE_CONFIG_STATE flag set and carry the device config state. Since it's important to finish loading device state transferred via the main migration channel (via save_live_iterate SaveVMHandler) before starting loading the data asynchronously transferred via multifd the thread doing the actual loading of the multifd transferred data is only started from switchover_start SaveVMHandler. switchover_start handler is called when MIG_CMD_SWITCHOVER_START sub-command of QEMU_VM_COMMAND is received via the main migration channel. 
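To make the reassembly scheme described above concrete, here is a minimal, self-contained sketch of the idea (all names are hypothetical; this is not the QEMU code from the patch below): packets tagged with an idx may arrive in any order, but buffers are handed to the device strictly in index order, stopping at the first gap.

#include <inttypes.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DEMO_MAX_BUFFERS 8   /* fixed for the demo; the real code grows its buffer array on demand */

typedef struct {
    bool is_present;
    size_t len;
    uint8_t data[32];
} DemoStateBuffer;

static DemoStateBuffer demo_bufs[DEMO_MAX_BUFFERS];
static uint32_t demo_next_idx;   /* next buffer to hand to the device */

/* Called as packets arrive from the multifd channels, possibly out of order. */
static bool demo_store_buffer(uint32_t idx, const void *data, size_t len)
{
    if (idx >= DEMO_MAX_BUFFERS || len > sizeof(demo_bufs[idx].data) ||
        demo_bufs[idx].is_present) {
        return false;   /* out-of-range, oversized or duplicate packet */
    }
    memcpy(demo_bufs[idx].data, data, len);
    demo_bufs[idx].len = len;
    demo_bufs[idx].is_present = true;
    return true;
}

/* Hands buffers to the device strictly in index order, stopping at the first gap. */
static void demo_load_ready_buffers(void)
{
    while (demo_next_idx < DEMO_MAX_BUFFERS && demo_bufs[demo_next_idx].is_present) {
        printf("loading buffer %" PRIu32 " (%zu bytes)\n",
               demo_next_idx, demo_bufs[demo_next_idx].len);
        demo_next_idx++;
    }
}

int main(void)
{
    demo_store_buffer(1, "second chunk", 12);
    demo_load_ready_buffers();   /* loads nothing yet: buffer 0 is still missing */
    demo_store_buffer(0, "first chunk", 11);
    demo_load_ready_buffers();   /* now loads buffer 0, then buffer 1 */
    return 0;
}

In the actual patch this in-order consumption runs in a dedicated load thread that waits on a condition variable whenever the next expected buffer is missing, and that thread is only started from the switchover_start handler, i.e. once MIG_CMD_SWITCHOVER_START has arrived.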
This sub-command is only sent after all save_live_iterate data have already been posted so it is safe to commence loading of the multifd-transferred device state upon receiving it - loading of save_live_iterate data happens synchronously in the main migration thread (much like the processing of MIG_CMD_SWITCHOVER_START) so by the time MIG_CMD_SWITCHOVER_START is processed all the proceeding data must have already been loaded. Signed-off-by: Maciej S. Szmigiero --- hw/vfio/migration.c | 402 ++++++++++++++++++++++++++++++++++ hw/vfio/pci.c | 2 + hw/vfio/trace-events | 6 + include/hw/vfio/vfio-common.h | 19 ++ 4 files changed, 429 insertions(+) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index 683f2ae98d5e..b54879fe6209 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -15,6 +15,7 @@ #include #include +#include "io/channel-buffer.h" #include "sysemu/runstate.h" #include "hw/vfio/vfio-common.h" #include "migration/misc.h" @@ -55,6 +56,15 @@ */ #define VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE (1 * MiB) +#define VFIO_DEVICE_STATE_CONFIG_STATE (1) + +typedef struct VFIODeviceStatePacket { + uint32_t version; + uint32_t idx; + uint32_t flags; + uint8_t data[0]; +} QEMU_PACKED VFIODeviceStatePacket; + static int64_t bytes_transferred; static const char *mig_state_to_str(enum vfio_device_mig_state state) @@ -254,6 +264,292 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev, return ret; } +typedef struct VFIOStateBuffer { + bool is_present; + char *data; + size_t len; +} VFIOStateBuffer; + +static void vfio_state_buffer_clear(gpointer data) +{ + VFIOStateBuffer *lb = data; + + if (!lb->is_present) { + return; + } + + g_clear_pointer(&lb->data, g_free); + lb->is_present = false; +} + +static void vfio_state_buffers_init(VFIOStateBuffers *bufs) +{ + bufs->array = g_array_new(FALSE, TRUE, sizeof(VFIOStateBuffer)); + g_array_set_clear_func(bufs->array, vfio_state_buffer_clear); +} + +static void vfio_state_buffers_destroy(VFIOStateBuffers *bufs) +{ + g_clear_pointer(&bufs->array, g_array_unref); +} + +static void vfio_state_buffers_assert_init(VFIOStateBuffers *bufs) +{ + assert(bufs->array); +} + +static guint vfio_state_buffers_size_get(VFIOStateBuffers *bufs) +{ + return bufs->array->len; +} + +static void vfio_state_buffers_size_set(VFIOStateBuffers *bufs, guint size) +{ + g_array_set_size(bufs->array, size); +} + +static VFIOStateBuffer *vfio_state_buffers_at(VFIOStateBuffers *bufs, guint idx) +{ + return &g_array_index(bufs->array, VFIOStateBuffer, idx); +} + +static int vfio_load_state_buffer(void *opaque, char *data, size_t data_size, + Error **errp) +{ + VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; + VFIODeviceStatePacket *packet = (VFIODeviceStatePacket *)data; + VFIOStateBuffer *lb; + + /* + * Holding BQL here would violate the lock order and can cause + * a deadlock once we attempt to lock load_bufs_mutex below. 
+ */ + assert(!bql_locked()); + + if (!migration->multifd_transfer) { + error_setg(errp, + "got device state packet but not doing multifd transfer"); + return -1; + } + + if (data_size < sizeof(*packet)) { + error_setg(errp, "packet too short at %zu (min is %zu)", + data_size, sizeof(*packet)); + return -1; + } + + if (packet->version != 0) { + error_setg(errp, "packet has unknown version %" PRIu32, + packet->version); + return -1; + } + + if (packet->idx == UINT32_MAX) { + error_setg(errp, "packet has too high idx %" PRIu32, + packet->idx); + return -1; + } + + trace_vfio_load_state_device_buffer_incoming(vbasedev->name, packet->idx); + + QEMU_LOCK_GUARD(&migration->load_bufs_mutex); + + /* config state packet should be the last one in the stream */ + if (packet->flags & VFIO_DEVICE_STATE_CONFIG_STATE) { + migration->load_buf_idx_last = packet->idx; + } + + vfio_state_buffers_assert_init(&migration->load_bufs); + if (packet->idx >= vfio_state_buffers_size_get(&migration->load_bufs)) { + vfio_state_buffers_size_set(&migration->load_bufs, packet->idx + 1); + } + + lb = vfio_state_buffers_at(&migration->load_bufs, packet->idx); + if (lb->is_present) { + error_setg(errp, "state buffer %" PRIu32 " already filled", + packet->idx); + return -1; + } + + assert(packet->idx >= migration->load_buf_idx); + + migration->load_buf_queued_pending_buffers++; + if (migration->load_buf_queued_pending_buffers > + vbasedev->migration_max_queued_buffers) { + error_setg(errp, + "queuing state buffer %" PRIu32 " would exceed the max of %" PRIu64, + packet->idx, vbasedev->migration_max_queued_buffers); + return -1; + } + + lb->data = g_memdup2(&packet->data, data_size - sizeof(*packet)); + lb->len = data_size - sizeof(*packet); + lb->is_present = true; + + qemu_cond_signal(&migration->load_bufs_buffer_ready_cond); + + return 0; +} + +static int vfio_load_device_config_state(QEMUFile *f, void *opaque); + +static int vfio_load_bufs_thread_load_config(VFIODevice *vbasedev) +{ + VFIOMigration *migration = vbasedev->migration; + VFIOStateBuffer *lb; + g_autoptr(QIOChannelBuffer) bioc = NULL; + QEMUFile *f_out = NULL, *f_in = NULL; + uint64_t mig_header; + int ret; + + assert(migration->load_buf_idx == migration->load_buf_idx_last); + lb = vfio_state_buffers_at(&migration->load_bufs, migration->load_buf_idx); + assert(lb->is_present); + + bioc = qio_channel_buffer_new(lb->len); + qio_channel_set_name(QIO_CHANNEL(bioc), "vfio-device-config-load"); + + f_out = qemu_file_new_output(QIO_CHANNEL(bioc)); + qemu_put_buffer(f_out, (uint8_t *)lb->data, lb->len); + + ret = qemu_fflush(f_out); + if (ret) { + g_clear_pointer(&f_out, qemu_fclose); + return ret; + } + + qio_channel_io_seek(QIO_CHANNEL(bioc), 0, 0, NULL); + f_in = qemu_file_new_input(QIO_CHANNEL(bioc)); + + mig_header = qemu_get_be64(f_in); + if (mig_header != VFIO_MIG_FLAG_DEV_CONFIG_STATE) { + g_clear_pointer(&f_out, qemu_fclose); + g_clear_pointer(&f_in, qemu_fclose); + return -EINVAL; + } + + bql_lock(); + ret = vfio_load_device_config_state(f_in, vbasedev); + bql_unlock(); + + g_clear_pointer(&f_out, qemu_fclose); + g_clear_pointer(&f_in, qemu_fclose); + if (ret < 0) { + return ret; + } + + return 0; +} + +static bool vfio_load_bufs_thread_want_abort(VFIODevice *vbasedev, + bool *abort_flag) +{ + VFIOMigration *migration = vbasedev->migration; + + return migration->load_bufs_thread_want_exit || qatomic_read(abort_flag); +} + +static int vfio_load_bufs_thread(bool *abort_flag, void *opaque) +{ + VFIODevice *vbasedev = opaque; + VFIOMigration *migration = 
vbasedev->migration; + QEMU_LOCK_GUARD(&migration->load_bufs_mutex); + int ret; + + assert(migration->load_bufs_thread_running); + + while (!vfio_load_bufs_thread_want_abort(vbasedev, abort_flag)) { + VFIOStateBuffer *lb; + guint bufs_len; + bool starved; + + assert(migration->load_buf_idx <= migration->load_buf_idx_last); + + bufs_len = vfio_state_buffers_size_get(&migration->load_bufs); + if (migration->load_buf_idx >= bufs_len) { + assert(migration->load_buf_idx == bufs_len); + starved = true; + } else { + lb = vfio_state_buffers_at(&migration->load_bufs, + migration->load_buf_idx); + starved = !lb->is_present; + } + + if (starved) { + trace_vfio_load_state_device_buffer_starved(vbasedev->name, + migration->load_buf_idx); + qemu_cond_wait(&migration->load_bufs_buffer_ready_cond, + &migration->load_bufs_mutex); + continue; + } + + if (migration->load_buf_idx == migration->load_buf_idx_last) { + break; + } + + if (migration->load_buf_idx == 0) { + trace_vfio_load_state_device_buffer_start(vbasedev->name); + } + + if (lb->len) { + g_autofree char *buf = NULL; + size_t buf_len; + ssize_t wr_ret; + int errno_save; + + trace_vfio_load_state_device_buffer_load_start(vbasedev->name, + migration->load_buf_idx); + + /* lb might become re-allocated when we drop the lock */ + buf = g_steal_pointer(&lb->data); + buf_len = lb->len; + + /* + * Loading data to the device takes a while, + * drop the lock during this process. + */ + qemu_mutex_unlock(&migration->load_bufs_mutex); + wr_ret = write(migration->data_fd, buf, buf_len); + errno_save = errno; + qemu_mutex_lock(&migration->load_bufs_mutex); + + if (wr_ret < 0) { + ret = -errno_save; + goto ret_signal; + } else if (wr_ret < buf_len) { + ret = -EINVAL; + goto ret_signal; + } + + trace_vfio_load_state_device_buffer_load_end(vbasedev->name, + migration->load_buf_idx); + } + + assert(migration->load_buf_queued_pending_buffers > 0); + migration->load_buf_queued_pending_buffers--; + + if (migration->load_buf_idx == migration->load_buf_idx_last - 1) { + trace_vfio_load_state_device_buffer_end(vbasedev->name); + } + + migration->load_buf_idx++; + } + + if (vfio_load_bufs_thread_want_abort(vbasedev, abort_flag)) { + ret = -ECANCELED; + goto ret_signal; + } + + ret = vfio_load_bufs_thread_load_config(vbasedev); + +ret_signal: + migration->load_bufs_thread_running = false; + qemu_cond_signal(&migration->load_bufs_thread_finished_cond); + + return ret; +} + static int vfio_save_device_config_state(QEMUFile *f, void *opaque, Error **errp) { @@ -430,6 +726,12 @@ static bool vfio_precopy_supported(VFIODevice *vbasedev) return migration->mig_flags & VFIO_MIGRATION_PRE_COPY; } +static bool vfio_multifd_transfer_supported(void) +{ + return migration_has_device_state_support() && + migrate_send_switchover_start(); +} + /* ---------------------------------------------------------------------- */ static int vfio_save_prepare(void *opaque, Error **errp) @@ -695,17 +997,73 @@ static int vfio_load_setup(QEMUFile *f, void *opaque, Error **errp) assert(!migration->load_setup); + /* + * Make a copy of this setting at the start in case it is changed + * mid-migration. 
+ */ + if (vbasedev->migration_multifd_transfer == ON_OFF_AUTO_AUTO) { + migration->multifd_transfer = vfio_multifd_transfer_supported(); + } else { + migration->multifd_transfer = + vbasedev->migration_multifd_transfer == ON_OFF_AUTO_ON; + } + + if (migration->multifd_transfer && !vfio_multifd_transfer_supported()) { + error_setg(errp, + "%s: Multifd device transfer requested but unsupported in the current config", + vbasedev->name); + return -EINVAL; + } + ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, migration->device_state, errp); if (ret) { return ret; } + if (migration->multifd_transfer) { + assert(!migration->load_bufs.array); + vfio_state_buffers_init(&migration->load_bufs); + + qemu_mutex_init(&migration->load_bufs_mutex); + + migration->load_buf_idx = 0; + migration->load_buf_idx_last = UINT32_MAX; + migration->load_buf_queued_pending_buffers = 0; + qemu_cond_init(&migration->load_bufs_buffer_ready_cond); + + migration->load_bufs_thread_running = false; + migration->load_bufs_thread_want_exit = false; + qemu_cond_init(&migration->load_bufs_thread_finished_cond); + } + migration->load_setup = true; return 0; } +static void vfio_load_cleanup_load_bufs_thread(VFIODevice *vbasedev) +{ + VFIOMigration *migration = vbasedev->migration; + + /* The lock order is load_bufs_mutex -> BQL so unlock BQL here first */ + bql_unlock(); + WITH_QEMU_LOCK_GUARD(&migration->load_bufs_mutex) { + if (!migration->load_bufs_thread_running) { + break; + } + + migration->load_bufs_thread_want_exit = true; + + qemu_cond_signal(&migration->load_bufs_buffer_ready_cond); + qemu_cond_wait(&migration->load_bufs_thread_finished_cond, + &migration->load_bufs_mutex); + + assert(!migration->load_bufs_thread_running); + } + bql_lock(); +} + static int vfio_load_cleanup(void *opaque) { VFIODevice *vbasedev = opaque; @@ -715,7 +1073,19 @@ static int vfio_load_cleanup(void *opaque) return 0; } + if (migration->multifd_transfer) { + vfio_load_cleanup_load_bufs_thread(vbasedev); + } + vfio_migration_cleanup(vbasedev); + + if (migration->multifd_transfer) { + qemu_cond_destroy(&migration->load_bufs_thread_finished_cond); + vfio_state_buffers_destroy(&migration->load_bufs); + qemu_cond_destroy(&migration->load_bufs_buffer_ready_cond); + qemu_mutex_destroy(&migration->load_bufs_mutex); + } + migration->load_setup = false; trace_vfio_load_cleanup(vbasedev->name); @@ -725,6 +1095,7 @@ static int vfio_load_cleanup(void *opaque) static int vfio_load_state(QEMUFile *f, void *opaque, int version_id) { VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; int ret = 0; uint64_t data; @@ -736,6 +1107,12 @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id) switch (data) { case VFIO_MIG_FLAG_DEV_CONFIG_STATE: { + if (migration->multifd_transfer) { + error_report("%s: got DEV_CONFIG_STATE but doing multifd transfer", + vbasedev->name); + return -EINVAL; + } + return vfio_load_device_config_state(f, opaque); } case VFIO_MIG_FLAG_DEV_SETUP_STATE: @@ -801,6 +1178,29 @@ static bool vfio_switchover_ack_needed(void *opaque) return vfio_precopy_supported(vbasedev); } +static int vfio_switchover_start(void *opaque) +{ + VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; + + if (!migration->multifd_transfer) { + /* Load thread is only used for multifd transfer */ + return 0; + } + + /* The lock order is load_bufs_mutex -> BQL so unlock BQL here first */ + bql_unlock(); + WITH_QEMU_LOCK_GUARD(&migration->load_bufs_mutex) { + 
assert(!migration->load_bufs_thread_running); + migration->load_bufs_thread_running = true; + } + bql_lock(); + + qemu_loadvm_start_load_thread(vfio_load_bufs_thread, vbasedev); + + return 0; +} + static const SaveVMHandlers savevm_vfio_handlers = { .save_prepare = vfio_save_prepare, .save_setup = vfio_save_setup, @@ -814,7 +1214,9 @@ static const SaveVMHandlers savevm_vfio_handlers = { .load_setup = vfio_load_setup, .load_cleanup = vfio_load_cleanup, .load_state = vfio_load_state, + .load_state_buffer = vfio_load_state_buffer, .switchover_ack_needed = vfio_switchover_ack_needed, + .switchover_start = vfio_switchover_start, }; /* ---------------------------------------------------------------------- */ diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 9d547cb5cdff..72d62ada8a39 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3384,6 +3384,8 @@ static Property vfio_pci_dev_properties[] = { vbasedev.migration_multifd_transfer, qdev_prop_on_off_auto_mutable, OnOffAuto, .set_default = true, .defval.i = ON_OFF_AUTO_AUTO), + DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice, + vbasedev.migration_max_queued_buffers, UINT64_MAX), DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice, vbasedev.migration_events, false), DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false), diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 1bebe9877d88..418b378ebd29 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -153,6 +153,12 @@ vfio_load_device_config_state_start(const char *name) " (%s)" vfio_load_device_config_state_end(const char *name) " (%s)" vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64 vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d" +vfio_load_state_device_buffer_incoming(const char *name, uint32_t idx) " (%s) idx %"PRIu32 +vfio_load_state_device_buffer_start(const char *name) " (%s)" +vfio_load_state_device_buffer_starved(const char *name, uint32_t idx) " (%s) idx %"PRIu32 +vfio_load_state_device_buffer_load_start(const char *name, uint32_t idx) " (%s) idx %"PRIu32 +vfio_load_state_device_buffer_load_end(const char *name, uint32_t idx) " (%s) idx %"PRIu32 +vfio_load_state_device_buffer_end(const char *name) " (%s)" vfio_migration_realize(const char *name) " (%s)" vfio_migration_set_device_state(const char *name, const char *state) " (%s) state %s" vfio_migration_set_state(const char *name, const char *new_state, const char *recover_state) " (%s) new state %s, recover state %s" diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index b1c03a82eec8..0954d6981a22 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -61,6 +61,11 @@ typedef struct VFIORegion { uint8_t nr; /* cache the region number for debug */ } VFIORegion; +/* type safety */ +typedef struct VFIOStateBuffers { + GArray *array; +} VFIOStateBuffers; + typedef struct VFIOMigration { struct VFIODevice *vbasedev; VMChangeStateEntry *vm_state; @@ -73,10 +78,23 @@ typedef struct VFIOMigration { uint64_t mig_flags; uint64_t precopy_init_size; uint64_t precopy_dirty_size; + bool multifd_transfer; bool initial_data_sent; bool event_save_iterate_started; bool event_precopy_empty_hit; + + QemuThread load_bufs_thread; + bool load_bufs_thread_running; + bool load_bufs_thread_want_exit; + + VFIOStateBuffers load_bufs; + QemuCond load_bufs_buffer_ready_cond; + QemuCond load_bufs_thread_finished_cond; + QemuMutex load_bufs_mutex; /* Lock order: this lock -> BQL */ + 
uint32_t load_buf_idx; + uint32_t load_buf_idx_last; + uint32_t load_buf_queued_pending_buffers; } VFIOMigration; struct VFIOGroup; @@ -136,6 +154,7 @@ typedef struct VFIODevice { OnOffAuto enable_migration; OnOffAuto migration_multifd_transfer; bool migration_events; + uint64_t migration_max_queued_buffers; VFIODeviceOps *ops; unsigned int num_irqs; unsigned int num_regions; From patchwork Sun Nov 17 19:20:18 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13877929 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4BFEAD10F28 for ; Sun, 17 Nov 2024 19:24:14 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkrm-0007LS-LU; Sun, 17 Nov 2024 14:23:27 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkri-00077F-K9 for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:23:22 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkrh-0005zk-7o for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:23:22 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkrT-00000002GWv-0NOc; Sun, 17 Nov 2024 20:23:07 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 23/24] migration/qemu-file: Define g_autoptr() cleanup function for QEMUFile Date: Sun, 17 Nov 2024 20:20:18 +0100 Message-ID: <952bafd5a7312f04bc7f05068a1fdd4c64be433f.1731773021.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Automatic memory management helps avoid memory safety issues. Signed-off-by: Maciej S. 
Szmigiero --- migration/qemu-file.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/migration/qemu-file.h b/migration/qemu-file.h index 11c2120edd72..fdf21324df07 100644 --- a/migration/qemu-file.h +++ b/migration/qemu-file.h @@ -33,6 +33,8 @@ QEMUFile *qemu_file_new_input(QIOChannel *ioc); QEMUFile *qemu_file_new_output(QIOChannel *ioc); int qemu_fclose(QEMUFile *f); +G_DEFINE_AUTOPTR_CLEANUP_FUNC(QEMUFile, qemu_fclose) + /* * qemu_file_transferred: * From patchwork Sun Nov 17 19:20:19 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13877927 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 15742D10F26 for ; Sun, 17 Nov 2024 19:23:59 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tCkrq-0007ap-WB; Sun, 17 Nov 2024 14:23:31 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkrn-0007Mw-9P for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:23:27 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tCkrl-000609-Bv for qemu-devel@nongnu.org; Sun, 17 Nov 2024 14:23:26 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tCkrY-00000002GX6-11Vn; Sun, 17 Nov 2024 20:23:12 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v3 24/24] vfio/migration: Multifd device state transfer support - send side Date: Sun, 17 Nov 2024 20:20:19 +0100 Message-ID: <4f5c33b97be09fcb4e1885073e31c5e930a39ad0.1731773021.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.47.0 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: none client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -15 X-Spam_score: -1.6 X-Spam_bar: - X-Spam_report: (-1.6 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.249, RCVD_IN_VALIDITY_CERTIFIED_BLOCKED=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Implement the multifd device state transfer via additional per-device thread inside save_live_complete_precopy_thread handler. Switch between doing the data transfer in the new handler and doing it in the old save_state handler depending on the x-migration-multifd-transfer device property value. Signed-off-by: Maciej S. 
Szmigiero --- hw/vfio/migration.c | 155 +++++++++++++++++++++++++++++++++++++++++++ hw/vfio/trace-events | 2 + 2 files changed, 157 insertions(+) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index b54879fe6209..8709672ada48 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -771,6 +771,24 @@ static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp) uint64_t stop_copy_size = VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE; int ret; + /* + * Make a copy of this setting at the start in case it is changed + * mid-migration. + */ + if (vbasedev->migration_multifd_transfer == ON_OFF_AUTO_AUTO) { + migration->multifd_transfer = vfio_multifd_transfer_supported(); + } else { + migration->multifd_transfer = + vbasedev->migration_multifd_transfer == ON_OFF_AUTO_ON; + } + + if (migration->multifd_transfer && !vfio_multifd_transfer_supported()) { + error_setg(errp, + "%s: Multifd device transfer requested but unsupported in the current config", + vbasedev->name); + return -EINVAL; + } + qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE); vfio_query_stop_copy_size(vbasedev, &stop_copy_size); @@ -942,13 +960,32 @@ static int vfio_save_iterate(QEMUFile *f, void *opaque) return !migration->precopy_init_size && !migration->precopy_dirty_size; } +static void vfio_save_multifd_emit_dummy_eos(VFIODevice *vbasedev, QEMUFile *f) +{ + VFIOMigration *migration = vbasedev->migration; + + assert(migration->multifd_transfer); + + /* + * Emit dummy NOP data on the main migration channel since the actual + * device state transfer is done via multifd channels. + */ + qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE); +} + static int vfio_save_complete_precopy(QEMUFile *f, void *opaque) { VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; ssize_t data_size; int ret; Error *local_err = NULL; + if (migration->multifd_transfer) { + vfio_save_multifd_emit_dummy_eos(vbasedev, f); + return 0; + } + trace_vfio_save_complete_precopy_start(vbasedev->name); /* We reach here with device state STOP or STOP_COPY only */ @@ -974,12 +1011,129 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque) return ret; } +static int +vfio_save_complete_precopy_async_thread_config_state(VFIODevice *vbasedev, + char *idstr, + uint32_t instance_id, + uint32_t idx) +{ + g_autoptr(QIOChannelBuffer) bioc = NULL; + g_autoptr(QEMUFile) f = NULL; + int ret; + g_autofree VFIODeviceStatePacket *packet = NULL; + size_t packet_len; + + bioc = qio_channel_buffer_new(0); + qio_channel_set_name(QIO_CHANNEL(bioc), "vfio-device-config-save"); + + f = qemu_file_new_output(QIO_CHANNEL(bioc)); + + ret = vfio_save_device_config_state(f, vbasedev, NULL); + if (ret) { + return ret; + } + + ret = qemu_fflush(f); + if (ret) { + return ret; + } + + packet_len = sizeof(*packet) + bioc->usage; + packet = g_malloc0(packet_len); + packet->idx = idx; + packet->flags = VFIO_DEVICE_STATE_CONFIG_STATE; + memcpy(&packet->data, bioc->data, bioc->usage); + + if (!multifd_queue_device_state(idstr, instance_id, + (char *)packet, packet_len)) { + return -1; + } + + qatomic_add(&bytes_transferred, packet_len); + + return 0; +} + +static int vfio_save_complete_precopy_thread(char *idstr, + uint32_t instance_id, + bool *abort_flag, + void *opaque) +{ + VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; + int ret; + g_autofree VFIODeviceStatePacket *packet = NULL; + uint32_t idx; + + if (!migration->multifd_transfer) { + /* Nothing to do, vfio_save_complete_precopy() does the transfer. 
*/ + return 0; + } + + trace_vfio_save_complete_precopy_thread_start(vbasedev->name, + idstr, instance_id); + + /* We reach here with device state STOP or STOP_COPY only */ + ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY, + VFIO_DEVICE_STATE_STOP, NULL); + if (ret) { + goto ret_finish; + } + + packet = g_malloc0(sizeof(*packet) + migration->data_buffer_size); + + for (idx = 0; ; idx++) { + ssize_t data_size; + size_t packet_size; + + if (qatomic_read(abort_flag)) { + ret = -ECANCELED; + goto ret_finish; + } + + data_size = read(migration->data_fd, &packet->data, + migration->data_buffer_size); + if (data_size < 0) { + ret = -errno; + goto ret_finish; + } else if (data_size == 0) { + break; + } + + packet->idx = idx; + packet_size = sizeof(*packet) + data_size; + + if (!multifd_queue_device_state(idstr, instance_id, + (char *)packet, packet_size)) { + ret = -1; + goto ret_finish; + } + + qatomic_add(&bytes_transferred, packet_size); + } + + ret = vfio_save_complete_precopy_async_thread_config_state(vbasedev, idstr, + instance_id, + idx); + +ret_finish: + trace_vfio_save_complete_precopy_thread_end(vbasedev->name, ret); + + return ret; +} + static void vfio_save_state(QEMUFile *f, void *opaque) { VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; Error *local_err = NULL; int ret; + if (migration->multifd_transfer) { + vfio_save_multifd_emit_dummy_eos(vbasedev, f); + return; + } + ret = vfio_save_device_config_state(f, opaque, &local_err); if (ret) { error_prepend(&local_err, @@ -1210,6 +1364,7 @@ static const SaveVMHandlers savevm_vfio_handlers = { .is_active_iterate = vfio_is_active_iterate, .save_live_iterate = vfio_save_iterate, .save_live_complete_precopy = vfio_save_complete_precopy, + .save_live_complete_precopy_thread = vfio_save_complete_precopy_thread, .save_state = vfio_save_state, .load_setup = vfio_load_setup, .load_cleanup = vfio_load_cleanup, diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 418b378ebd29..039979bdd98f 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -168,6 +168,8 @@ vfio_save_block_precopy_empty_hit(const char *name) " (%s)" vfio_save_cleanup(const char *name) " (%s)" vfio_save_complete_precopy(const char *name, int ret) " (%s) ret %d" vfio_save_complete_precopy_start(const char *name) " (%s)" +vfio_save_complete_precopy_thread_start(const char *name, const char *idstr, uint32_t instance_id) " (%s) idstr %s instance %"PRIu32 +vfio_save_complete_precopy_thread_end(const char *name, int ret) " (%s) ret %d" vfio_save_device_config_state(const char *name) " (%s)" vfio_save_iterate(const char *name, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy initial size %"PRIu64" precopy dirty size %"PRIu64 vfio_save_iterate_start(const char *name) " (%s)"
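As a closing illustration of the send-side framing added by this last patch: each chunk read from the device's migration data fd is wrapped in a small header carrying an incrementing stream index, and the device config state is sent last with a dedicated flag so the receiver knows the stream is complete. The sketch below models only that framing; the names, the plain pipe, and the printf stand-in for the multifd queue are assumptions for illustration, not the actual QEMU implementation.

#include <errno.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define DEMO_FLAG_CONFIG_STATE 1u
#define DEMO_CHUNK_SIZE 4096u

/* Models the idea of a device state packet header: version, stream position, flags. */
typedef struct {
    uint32_t version;
    uint32_t idx;
    uint32_t flags;
    /* payload bytes follow on the wire */
} DemoPacketHeader;

/* Stand-in for queueing one framed packet on a multifd channel. */
static int demo_queue_packet(const DemoPacketHeader *hdr,
                             const void *payload, size_t payload_len)
{
    printf("queued packet idx %" PRIu32 ", flags %" PRIu32 ", %zu payload bytes\n",
           hdr->idx, hdr->flags, payload_len);
    return 0;
}

/* Reads device state in chunks, frames each chunk, then sends the config state last. */
static int demo_send_device_state(int data_fd, const void *config, size_t config_len)
{
    uint8_t chunk[DEMO_CHUNK_SIZE];
    DemoPacketHeader hdr = { .version = 0, .idx = 0, .flags = 0 };
    ssize_t n;

    while ((n = read(data_fd, chunk, sizeof(chunk))) > 0) {
        if (demo_queue_packet(&hdr, chunk, (size_t)n)) {
            return -1;
        }
        hdr.idx++;
    }
    if (n < 0) {
        return -errno;
    }

    /* The config state packet is always the last one in the stream. */
    hdr.flags = DEMO_FLAG_CONFIG_STATE;
    return demo_queue_packet(&hdr, config, config_len);
}

int main(void)
{
    int fds[2];
    const char config[] = "demo config blob";

    if (pipe(fds)) {
        return 1;
    }
    /* Pretend a bit of device state is already buffered in the kernel. */
    if (write(fds[1], "device state bytes", 18) != 18) {
        return 1;
    }
    close(fds[1]);

    return demo_send_device_state(fds[0], config, sizeof(config)) ? 1 : 0;
}

Tagging every chunk with its position is what lets the receive side accept packets from different multifd channels in any order and still replay them to the device sequentially.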