From patchwork Thu Jan 30 10:08:22 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954426 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 96CA4C02190 for ; Thu, 30 Jan 2025 10:10:05 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRUI-000279-O4; Thu, 30 Jan 2025 05:09:30 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUG-00026q-Vi for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:28 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUE-00071e-OZ for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:28 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRU4-00000006Tw9-0N6B; Thu, 30 Jan 2025 11:09:16 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 01/33] migration: Clarify that {load, save}_cleanup handlers can run without setup Date: Thu, 30 Jan 2025 11:08:22 +0100 Message-ID: <6e82ce38c115d5eec03d7955d94f96050491a1d5.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" It's possible for {load,save}_cleanup SaveVMHandlers to get called without the corresponding {load,save}_setup handler being called first. One such example is if {load,save}_setup handler of a proceeding device returns error. In this case the migration core cleanup code will call all corresponding cleanup handlers, even for these devices which haven't had its setup handler called. Since this behavior can generate some surprises let's clearly document it in these SaveVMHandlers description. Reviewed-by: Fabiano Rosas Reviewed-by: Cédric Le Goater Signed-off-by: Maciej S. Szmigiero --- include/migration/register.h | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/include/migration/register.h b/include/migration/register.h index f60e797894e5..0b0292738320 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -69,7 +69,9 @@ typedef struct SaveVMHandlers { /** * @save_cleanup * - * Uninitializes the data structures on the source + * Uninitializes the data structures on the source. + * Note that this handler can be called even if save_setup + * wasn't called earlier. * * @opaque: data pointer passed to register_savevm_live() */ @@ -244,6 +246,8 @@ typedef struct SaveVMHandlers { * @load_cleanup * * Uninitializes the data structures on the destination. + * Note that this handler can be called even if load_setup + * wasn't called earlier. * * @opaque: data pointer passed to register_savevm_live() * From patchwork Thu Jan 30 10:08:23 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954427 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1F1DDC0218A for ; Thu, 30 Jan 2025 10:10:25 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRUN-00028t-Au; Thu, 30 Jan 2025 05:09:35 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUL-00028P-SD for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:33 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUK-000723-55 for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:33 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRU9-00000006TwL-0wMj; Thu, 30 Jan 2025 11:09:21 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 02/33] thread-pool: Remove thread_pool_submit() function Date: Thu, 30 Jan 2025 11:08:23 +0100 Message-ID: <7c49c98255e6fde131580c3fc0b7a021b20cff1b.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" This function name conflicts with one used by a future generic thread pool function and it was only used by one test anyway. Update the trace event name in thread_pool_submit_aio() accordingly. Acked-by: Fabiano Rosas Reviewed-by: Cédric Le Goater Reviewed-by: Peter Xu Signed-off-by: Maciej S. Szmigiero --- include/block/thread-pool.h | 3 +-- tests/unit/test-thread-pool.c | 6 +++--- util/thread-pool.c | 7 +------ util/trace-events | 2 +- 4 files changed, 6 insertions(+), 12 deletions(-) diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h index 948ff5f30c31..4f6694026123 100644 --- a/include/block/thread-pool.h +++ b/include/block/thread-pool.h @@ -30,13 +30,12 @@ ThreadPool *thread_pool_new(struct AioContext *ctx); void thread_pool_free(ThreadPool *pool); /* - * thread_pool_submit* API: submit I/O requests in the thread's + * thread_pool_submit_{aio,co} API: submit I/O requests in the thread's * current AioContext. */ BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg, BlockCompletionFunc *cb, void *opaque); int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg); -void thread_pool_submit(ThreadPoolFunc *func, void *arg); void thread_pool_update_params(ThreadPool *pool, struct AioContext *ctx); diff --git a/tests/unit/test-thread-pool.c b/tests/unit/test-thread-pool.c index 1483e53473db..33407b595d35 100644 --- a/tests/unit/test-thread-pool.c +++ b/tests/unit/test-thread-pool.c @@ -43,10 +43,10 @@ static void done_cb(void *opaque, int ret) active--; } -static void test_submit(void) +static void test_submit_no_complete(void) { WorkerTestData data = { .n = 0 }; - thread_pool_submit(worker_cb, &data); + thread_pool_submit_aio(worker_cb, &data, NULL, NULL); while (data.n == 0) { aio_poll(ctx, true); } @@ -236,7 +236,7 @@ int main(int argc, char **argv) ctx = qemu_get_current_aio_context(); g_test_init(&argc, &argv, NULL); - g_test_add_func("/thread-pool/submit", test_submit); + g_test_add_func("/thread-pool/submit-no-complete", test_submit_no_complete); g_test_add_func("/thread-pool/submit-aio", test_submit_aio); g_test_add_func("/thread-pool/submit-co", test_submit_co); g_test_add_func("/thread-pool/submit-many", test_submit_many); diff --git a/util/thread-pool.c b/util/thread-pool.c index 27eb777e855b..2f751d55b33f 100644 --- a/util/thread-pool.c +++ b/util/thread-pool.c @@ -256,7 +256,7 @@ BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg, QLIST_INSERT_HEAD(&pool->head, req, all); - trace_thread_pool_submit(pool, req, arg); + trace_thread_pool_submit_aio(pool, req, arg); qemu_mutex_lock(&pool->lock); if (pool->idle_threads == 0 && pool->cur_threads < pool->max_threads) { @@ -290,11 +290,6 @@ int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg) return tpc.ret; } -void thread_pool_submit(ThreadPoolFunc *func, void *arg) -{ - thread_pool_submit_aio(func, arg, NULL, NULL); -} - void thread_pool_update_params(ThreadPool *pool, AioContext *ctx) { qemu_mutex_lock(&pool->lock); diff --git a/util/trace-events b/util/trace-events index 49a4962e1886..5be12d7fab89 100644 --- a/util/trace-events +++ b/util/trace-events @@ -14,7 +14,7 @@ aio_co_schedule_bh_cb(void *ctx, void *co) "ctx %p co %p" reentrant_aio(void *ctx, const char *name) "ctx %p name %s" # thread-pool.c -thread_pool_submit(void *pool, void *req, void *opaque) "pool %p req %p opaque %p" +thread_pool_submit_aio(void *pool, void *req, void *opaque) "pool %p req %p opaque %p" thread_pool_complete(void *pool, void *req, void *opaque, int ret) "pool %p req %p opaque %p ret %d" thread_pool_cancel(void *req, void *opaque) "req %p opaque %p" From patchwork Thu Jan 30 10:08:24 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954428 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 50A64C02193 for ; Thu, 30 Jan 2025 10:10:26 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRUM-00028e-Hp; Thu, 30 Jan 2025 05:09:34 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUL-00027o-0t for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:33 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUI-00072S-SQ for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:32 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRUE-00000006TwX-1WbA; Thu, 30 Jan 2025 11:09:26 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 03/33] thread-pool: Rename AIO pool functions to *_aio() and data types to *Aio Date: Thu, 30 Jan 2025 11:08:24 +0100 Message-ID: X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" These names conflict with ones used by future generic thread pool equivalents. Generic names should belong to the generic pool type, not specific (AIO) type. Acked-by: Fabiano Rosas Reviewed-by: Cédric Le Goater Reviewed-by: Peter Xu Signed-off-by: Maciej S. Szmigiero --- include/block/aio.h | 8 ++--- include/block/thread-pool.h | 8 ++--- util/async.c | 6 ++-- util/thread-pool.c | 58 ++++++++++++++++++------------------- util/trace-events | 4 +-- 5 files changed, 42 insertions(+), 42 deletions(-) diff --git a/include/block/aio.h b/include/block/aio.h index 43883a8a33a8..b2ab3514de23 100644 --- a/include/block/aio.h +++ b/include/block/aio.h @@ -54,7 +54,7 @@ typedef void QEMUBHFunc(void *opaque); typedef bool AioPollFn(void *opaque); typedef void IOHandler(void *opaque); -struct ThreadPool; +struct ThreadPoolAio; struct LinuxAioState; typedef struct LuringState LuringState; @@ -207,7 +207,7 @@ struct AioContext { /* Thread pool for performing work and receiving completion callbacks. * Has its own locking. */ - struct ThreadPool *thread_pool; + struct ThreadPoolAio *thread_pool; #ifdef CONFIG_LINUX_AIO struct LinuxAioState *linux_aio; @@ -500,8 +500,8 @@ void aio_set_event_notifier_poll(AioContext *ctx, */ GSource *aio_get_g_source(AioContext *ctx); -/* Return the ThreadPool bound to this AioContext */ -struct ThreadPool *aio_get_thread_pool(AioContext *ctx); +/* Return the ThreadPoolAio bound to this AioContext */ +struct ThreadPoolAio *aio_get_thread_pool(AioContext *ctx); /* Setup the LinuxAioState bound to this AioContext */ struct LinuxAioState *aio_setup_linux_aio(AioContext *ctx, Error **errp); diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h index 4f6694026123..6f27eb085b45 100644 --- a/include/block/thread-pool.h +++ b/include/block/thread-pool.h @@ -24,10 +24,10 @@ typedef int ThreadPoolFunc(void *opaque); -typedef struct ThreadPool ThreadPool; +typedef struct ThreadPoolAio ThreadPoolAio; -ThreadPool *thread_pool_new(struct AioContext *ctx); -void thread_pool_free(ThreadPool *pool); +ThreadPoolAio *thread_pool_new_aio(struct AioContext *ctx); +void thread_pool_free_aio(ThreadPoolAio *pool); /* * thread_pool_submit_{aio,co} API: submit I/O requests in the thread's @@ -36,7 +36,7 @@ void thread_pool_free(ThreadPool *pool); BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg, BlockCompletionFunc *cb, void *opaque); int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg); +void thread_pool_update_params(ThreadPoolAio *pool, struct AioContext *ctx); -void thread_pool_update_params(ThreadPool *pool, struct AioContext *ctx); #endif diff --git a/util/async.c b/util/async.c index 0fe29436090d..47e3d35a263f 100644 --- a/util/async.c +++ b/util/async.c @@ -369,7 +369,7 @@ aio_ctx_finalize(GSource *source) QEMUBH *bh; unsigned flags; - thread_pool_free(ctx->thread_pool); + thread_pool_free_aio(ctx->thread_pool); #ifdef CONFIG_LINUX_AIO if (ctx->linux_aio) { @@ -435,10 +435,10 @@ GSource *aio_get_g_source(AioContext *ctx) return &ctx->source; } -ThreadPool *aio_get_thread_pool(AioContext *ctx) +ThreadPoolAio *aio_get_thread_pool(AioContext *ctx) { if (!ctx->thread_pool) { - ctx->thread_pool = thread_pool_new(ctx); + ctx->thread_pool = thread_pool_new_aio(ctx); } return ctx->thread_pool; } diff --git a/util/thread-pool.c b/util/thread-pool.c index 2f751d55b33f..908194dc070f 100644 --- a/util/thread-pool.c +++ b/util/thread-pool.c @@ -23,9 +23,9 @@ #include "block/thread-pool.h" #include "qemu/main-loop.h" -static void do_spawn_thread(ThreadPool *pool); +static void do_spawn_thread(ThreadPoolAio *pool); -typedef struct ThreadPoolElement ThreadPoolElement; +typedef struct ThreadPoolElementAio ThreadPoolElementAio; enum ThreadState { THREAD_QUEUED, @@ -33,9 +33,9 @@ enum ThreadState { THREAD_DONE, }; -struct ThreadPoolElement { +struct ThreadPoolElementAio { BlockAIOCB common; - ThreadPool *pool; + ThreadPoolAio *pool; ThreadPoolFunc *func; void *arg; @@ -47,13 +47,13 @@ struct ThreadPoolElement { int ret; /* Access to this list is protected by lock. */ - QTAILQ_ENTRY(ThreadPoolElement) reqs; + QTAILQ_ENTRY(ThreadPoolElementAio) reqs; /* This list is only written by the thread pool's mother thread. */ - QLIST_ENTRY(ThreadPoolElement) all; + QLIST_ENTRY(ThreadPoolElementAio) all; }; -struct ThreadPool { +struct ThreadPoolAio { AioContext *ctx; QEMUBH *completion_bh; QemuMutex lock; @@ -62,10 +62,10 @@ struct ThreadPool { QEMUBH *new_thread_bh; /* The following variables are only accessed from one AioContext. */ - QLIST_HEAD(, ThreadPoolElement) head; + QLIST_HEAD(, ThreadPoolElementAio) head; /* The following variables are protected by lock. */ - QTAILQ_HEAD(, ThreadPoolElement) request_list; + QTAILQ_HEAD(, ThreadPoolElementAio) request_list; int cur_threads; int idle_threads; int new_threads; /* backlog of threads we need to create */ @@ -76,14 +76,14 @@ struct ThreadPool { static void *worker_thread(void *opaque) { - ThreadPool *pool = opaque; + ThreadPoolAio *pool = opaque; qemu_mutex_lock(&pool->lock); pool->pending_threads--; do_spawn_thread(pool); while (pool->cur_threads <= pool->max_threads) { - ThreadPoolElement *req; + ThreadPoolElementAio *req; int ret; if (QTAILQ_EMPTY(&pool->request_list)) { @@ -131,7 +131,7 @@ static void *worker_thread(void *opaque) return NULL; } -static void do_spawn_thread(ThreadPool *pool) +static void do_spawn_thread(ThreadPoolAio *pool) { QemuThread t; @@ -148,14 +148,14 @@ static void do_spawn_thread(ThreadPool *pool) static void spawn_thread_bh_fn(void *opaque) { - ThreadPool *pool = opaque; + ThreadPoolAio *pool = opaque; qemu_mutex_lock(&pool->lock); do_spawn_thread(pool); qemu_mutex_unlock(&pool->lock); } -static void spawn_thread(ThreadPool *pool) +static void spawn_thread(ThreadPoolAio *pool) { pool->cur_threads++; pool->new_threads++; @@ -173,8 +173,8 @@ static void spawn_thread(ThreadPool *pool) static void thread_pool_completion_bh(void *opaque) { - ThreadPool *pool = opaque; - ThreadPoolElement *elem, *next; + ThreadPoolAio *pool = opaque; + ThreadPoolElementAio *elem, *next; defer_call_begin(); /* cb() may use defer_call() to coalesce work */ @@ -184,8 +184,8 @@ restart: continue; } - trace_thread_pool_complete(pool, elem, elem->common.opaque, - elem->ret); + trace_thread_pool_complete_aio(pool, elem, elem->common.opaque, + elem->ret); QLIST_REMOVE(elem, all); if (elem->common.cb) { @@ -217,10 +217,10 @@ restart: static void thread_pool_cancel(BlockAIOCB *acb) { - ThreadPoolElement *elem = (ThreadPoolElement *)acb; - ThreadPool *pool = elem->pool; + ThreadPoolElementAio *elem = (ThreadPoolElementAio *)acb; + ThreadPoolAio *pool = elem->pool; - trace_thread_pool_cancel(elem, elem->common.opaque); + trace_thread_pool_cancel_aio(elem, elem->common.opaque); QEMU_LOCK_GUARD(&pool->lock); if (elem->state == THREAD_QUEUED) { @@ -234,16 +234,16 @@ static void thread_pool_cancel(BlockAIOCB *acb) } static const AIOCBInfo thread_pool_aiocb_info = { - .aiocb_size = sizeof(ThreadPoolElement), + .aiocb_size = sizeof(ThreadPoolElementAio), .cancel_async = thread_pool_cancel, }; BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg, BlockCompletionFunc *cb, void *opaque) { - ThreadPoolElement *req; + ThreadPoolElementAio *req; AioContext *ctx = qemu_get_current_aio_context(); - ThreadPool *pool = aio_get_thread_pool(ctx); + ThreadPoolAio *pool = aio_get_thread_pool(ctx); /* Assert that the thread submitting work is the same running the pool */ assert(pool->ctx == qemu_get_current_aio_context()); @@ -290,7 +290,7 @@ int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg) return tpc.ret; } -void thread_pool_update_params(ThreadPool *pool, AioContext *ctx) +void thread_pool_update_params(ThreadPoolAio *pool, AioContext *ctx) { qemu_mutex_lock(&pool->lock); @@ -317,7 +317,7 @@ void thread_pool_update_params(ThreadPool *pool, AioContext *ctx) qemu_mutex_unlock(&pool->lock); } -static void thread_pool_init_one(ThreadPool *pool, AioContext *ctx) +static void thread_pool_init_one(ThreadPoolAio *pool, AioContext *ctx) { if (!ctx) { ctx = qemu_get_aio_context(); @@ -337,14 +337,14 @@ static void thread_pool_init_one(ThreadPool *pool, AioContext *ctx) thread_pool_update_params(pool, ctx); } -ThreadPool *thread_pool_new(AioContext *ctx) +ThreadPoolAio *thread_pool_new_aio(AioContext *ctx) { - ThreadPool *pool = g_new(ThreadPool, 1); + ThreadPoolAio *pool = g_new(ThreadPoolAio, 1); thread_pool_init_one(pool, ctx); return pool; } -void thread_pool_free(ThreadPool *pool) +void thread_pool_free_aio(ThreadPoolAio *pool) { if (!pool) { return; diff --git a/util/trace-events b/util/trace-events index 5be12d7fab89..bd8f25fb5920 100644 --- a/util/trace-events +++ b/util/trace-events @@ -15,8 +15,8 @@ reentrant_aio(void *ctx, const char *name) "ctx %p name %s" # thread-pool.c thread_pool_submit_aio(void *pool, void *req, void *opaque) "pool %p req %p opaque %p" -thread_pool_complete(void *pool, void *req, void *opaque, int ret) "pool %p req %p opaque %p ret %d" -thread_pool_cancel(void *req, void *opaque) "req %p opaque %p" +thread_pool_complete_aio(void *pool, void *req, void *opaque, int ret) "pool %p req %p opaque %p ret %d" +thread_pool_cancel_aio(void *req, void *opaque) "req %p opaque %p" # buffer.c buffer_resize(const char *buf, size_t olen, size_t len) "%s: old %zd, new %zd" From patchwork Thu Jan 30 10:08:25 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954464 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DEFD0C02190 for ; Thu, 30 Jan 2025 10:15:23 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRUR-00029c-HX; Thu, 30 Jan 2025 05:09:39 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUP-00029A-Ur for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:37 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUN-00072q-Qq for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:37 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRUJ-00000006Twj-2DsZ; Thu, 30 Jan 2025 11:09:31 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 04/33] thread-pool: Implement generic (non-AIO) pool support Date: Thu, 30 Jan 2025 11:08:25 +0100 Message-ID: X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Migration code wants to manage device data sending threads in one place. QEMU has an existing thread pool implementation, however it is limited to queuing AIO operations only and essentially has a 1:1 mapping between the current AioContext and the AIO ThreadPool in use. Implement generic (non-AIO) ThreadPool by essentially wrapping Glib's GThreadPool. This brings a few new operations on a pool: * thread_pool_wait() operation waits until all the submitted work requests have finished. * thread_pool_set_max_threads() explicitly sets the maximum thread count in the pool. * thread_pool_adjust_max_threads_to_work() adjusts the maximum thread count in the pool to equal the number of still waiting in queue or unfinished work. Reviewed-by: Fabiano Rosas Reviewed-by: Peter Xu Signed-off-by: Maciej S. Szmigiero --- include/block/thread-pool.h | 51 ++++++++++++++++ util/thread-pool.c | 119 ++++++++++++++++++++++++++++++++++++ 2 files changed, 170 insertions(+) diff --git a/include/block/thread-pool.h b/include/block/thread-pool.h index 6f27eb085b45..dd48cf07e85f 100644 --- a/include/block/thread-pool.h +++ b/include/block/thread-pool.h @@ -38,5 +38,56 @@ BlockAIOCB *thread_pool_submit_aio(ThreadPoolFunc *func, void *arg, int coroutine_fn thread_pool_submit_co(ThreadPoolFunc *func, void *arg); void thread_pool_update_params(ThreadPoolAio *pool, struct AioContext *ctx); +/* ------------------------------------------- */ +/* Generic thread pool types and methods below */ +typedef struct ThreadPool ThreadPool; + +/* Create a new thread pool. Never returns NULL. */ +ThreadPool *thread_pool_new(void); + +/* + * Free the thread pool. + * Waits for all the previously submitted work to complete before performing + * the actual freeing operation. + */ +void thread_pool_free(ThreadPool *pool); + +/* + * Submit a new work (task) for the pool. + * + * @opaque_destroy is an optional GDestroyNotify for the @opaque argument + * to the work function at @func. + */ +void thread_pool_submit(ThreadPool *pool, ThreadPoolFunc *func, + void *opaque, GDestroyNotify opaque_destroy); + +/* + * Submit a new work (task) for the pool, making sure it starts getting + * processed immediately, launching a new thread for it if necessary. + * + * @opaque_destroy is an optional GDestroyNotify for the @opaque argument + * to the work function at @func. + */ +void thread_pool_submit_immediate(ThreadPool *pool, ThreadPoolFunc *func, + void *opaque, GDestroyNotify opaque_destroy); + +/* + * Wait for all previously submitted work to complete before returning. + * + * Can be used as a barrier between two sets of tasks executed on a thread + * pool without destroying it or in a performance sensitive path where the + * caller just wants to wait for all tasks to complete while deferring the + * pool free operation for later, less performance sensitive time. + */ +void thread_pool_wait(ThreadPool *pool); + +/* Set the maximum number of threads in the pool. */ +bool thread_pool_set_max_threads(ThreadPool *pool, int max_threads); + +/* + * Adjust the maximum number of threads in the pool to give each task its + * own thread (exactly one thread per task). + */ +bool thread_pool_adjust_max_threads_to_work(ThreadPool *pool); #endif diff --git a/util/thread-pool.c b/util/thread-pool.c index 908194dc070f..d2ead6b72857 100644 --- a/util/thread-pool.c +++ b/util/thread-pool.c @@ -374,3 +374,122 @@ void thread_pool_free_aio(ThreadPoolAio *pool) qemu_mutex_destroy(&pool->lock); g_free(pool); } + +struct ThreadPool { + GThreadPool *t; + size_t cur_work; + QemuMutex cur_work_lock; + QemuCond all_finished_cond; +}; + +typedef struct { + ThreadPoolFunc *func; + void *opaque; + GDestroyNotify opaque_destroy; +} ThreadPoolElement; + +static void thread_pool_func(gpointer data, gpointer user_data) +{ + ThreadPool *pool = user_data; + g_autofree ThreadPoolElement *el = data; + + el->func(el->opaque); + + if (el->opaque_destroy) { + el->opaque_destroy(el->opaque); + } + + QEMU_LOCK_GUARD(&pool->cur_work_lock); + + assert(pool->cur_work > 0); + pool->cur_work--; + + if (pool->cur_work == 0) { + qemu_cond_signal(&pool->all_finished_cond); + } +} + +ThreadPool *thread_pool_new(void) +{ + ThreadPool *pool = g_new(ThreadPool, 1); + + pool->cur_work = 0; + qemu_mutex_init(&pool->cur_work_lock); + qemu_cond_init(&pool->all_finished_cond); + + pool->t = g_thread_pool_new(thread_pool_func, pool, 0, TRUE, NULL); + /* + * g_thread_pool_new() can only return errors if initial thread(s) + * creation fails but we ask for 0 initial threads above. + */ + assert(pool->t); + + return pool; +} + +void thread_pool_free(ThreadPool *pool) +{ + /* + * With _wait = TRUE this effectively waits for all + * previously submitted work to complete first. + */ + g_thread_pool_free(pool->t, FALSE, TRUE); + + qemu_cond_destroy(&pool->all_finished_cond); + qemu_mutex_destroy(&pool->cur_work_lock); + + g_free(pool); +} + +void thread_pool_submit(ThreadPool *pool, ThreadPoolFunc *func, + void *opaque, GDestroyNotify opaque_destroy) +{ + ThreadPoolElement *el = g_new(ThreadPoolElement, 1); + + el->func = func; + el->opaque = opaque; + el->opaque_destroy = opaque_destroy; + + WITH_QEMU_LOCK_GUARD(&pool->cur_work_lock) { + pool->cur_work++; + } + + /* + * Ignore the return value since this function can only return errors + * if creation of an additional thread fails but even in this case the + * provided work is still getting queued (just for the existing threads). + */ + g_thread_pool_push(pool->t, el, NULL); +} + +void thread_pool_submit_immediate(ThreadPool *pool, ThreadPoolFunc *func, + void *opaque, GDestroyNotify opaque_destroy) +{ + thread_pool_submit(pool, func, opaque, opaque_destroy); + thread_pool_adjust_max_threads_to_work(pool); +} + +void thread_pool_wait(ThreadPool *pool) +{ + QEMU_LOCK_GUARD(&pool->cur_work_lock); + + while (pool->cur_work > 0) { + qemu_cond_wait(&pool->all_finished_cond, + &pool->cur_work_lock); + } +} + +bool thread_pool_set_max_threads(ThreadPool *pool, + int max_threads) +{ + assert(max_threads > 0); + + return g_thread_pool_set_max_threads(pool->t, max_threads, NULL); +} + +bool thread_pool_adjust_max_threads_to_work(ThreadPool *pool) +{ + QEMU_LOCK_GUARD(&pool->cur_work_lock); + + return thread_pool_set_max_threads(pool, pool->cur_work); +} From patchwork Thu Jan 30 10:08:26 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954454 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 96B87C0218A for ; Thu, 30 Jan 2025 10:14:11 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRUW-0002A0-Sr; Thu, 30 Jan 2025 05:09:45 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUV-00029l-HB for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:43 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUT-00073I-Ew for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:43 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRUO-00000006Twu-2wFX; Thu, 30 Jan 2025 11:09:36 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 05/33] migration: Add MIG_CMD_SWITCHOVER_START and its load handler Date: Thu, 30 Jan 2025 11:08:26 +0100 Message-ID: <9aecc86c354f07911ee81f80594c0260add41065.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" This QEMU_VM_COMMAND sub-command and its switchover_start SaveVMHandler is used to mark the switchover point in main migration stream. It can be used to inform the destination that all pre-switchover main migration stream data has been sent/received so it can start to process post-switchover data that it might have received via other migration channels like the multifd ones. Add also the relevant MigrationState bit stream compatibility property and its hw_compat entry. Reviewed-by: Fabiano Rosas Reviewed-by: Zhang Chen # for the COLO part Signed-off-by: Maciej S. Szmigiero --- hw/core/machine.c | 1 + include/migration/client-options.h | 4 +++ include/migration/register.h | 12 +++++++++ migration/colo.c | 3 +++ migration/migration-hmp-cmds.c | 2 ++ migration/migration.c | 4 +++ migration/migration.h | 2 ++ migration/options.c | 9 +++++++ migration/savevm.c | 39 ++++++++++++++++++++++++++++++ migration/savevm.h | 1 + migration/trace-events | 1 + scripts/analyze-migration.py | 11 +++++++++ 12 files changed, 89 insertions(+) diff --git a/hw/core/machine.c b/hw/core/machine.c index c23b39949649..c2964503c5bd 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -38,6 +38,7 @@ GlobalProperty hw_compat_9_2[] = { {"arm-cpu", "backcompat-pauth-default-use-qarma5", "true"}, + { "migration", "send-switchover-start", "off"}, }; const size_t hw_compat_9_2_len = G_N_ELEMENTS(hw_compat_9_2); diff --git a/include/migration/client-options.h b/include/migration/client-options.h index 59f4b55cf4f7..289c9d776221 100644 --- a/include/migration/client-options.h +++ b/include/migration/client-options.h @@ -10,6 +10,10 @@ #ifndef QEMU_MIGRATION_CLIENT_OPTIONS_H #define QEMU_MIGRATION_CLIENT_OPTIONS_H + +/* properties */ +bool migrate_send_switchover_start(void); + /* capabilities */ bool migrate_background_snapshot(void); diff --git a/include/migration/register.h b/include/migration/register.h index 0b0292738320..ff0faf5f68c8 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -279,6 +279,18 @@ typedef struct SaveVMHandlers { * otherwise */ bool (*switchover_ack_needed)(void *opaque); + + /** + * @switchover_start + * + * Notifies that the switchover has started. Called only on + * the destination. + * + * @opaque: data pointer passed to register_savevm_live() + * + * Returns zero to indicate success and negative for error + */ + int (*switchover_start)(void *opaque); } SaveVMHandlers; /** diff --git a/migration/colo.c b/migration/colo.c index 9a8e5fbe9b94..c976b3ff344d 100644 --- a/migration/colo.c +++ b/migration/colo.c @@ -452,6 +452,9 @@ static int colo_do_checkpoint_transaction(MigrationState *s, bql_unlock(); goto out; } + + qemu_savevm_maybe_send_switchover_start(s->to_dst_file); + /* Note: device state is saved into buffer */ ret = qemu_save_device_state(fb); diff --git a/migration/migration-hmp-cmds.c b/migration/migration-hmp-cmds.c index e8527bef801c..c84e7914ccfb 100644 --- a/migration/migration-hmp-cmds.c +++ b/migration/migration-hmp-cmds.c @@ -46,6 +46,8 @@ static void migration_global_dump(Monitor *mon) ms->send_configuration ? "on" : "off"); monitor_printf(mon, "send-section-footer: %s\n", ms->send_section_footer ? "on" : "off"); + monitor_printf(mon, "send-switchover-start: %s\n", + ms->send_switchover_start ? "on" : "off"); monitor_printf(mon, "clear-bitmap-shift: %u\n", ms->clear_bitmap_shift); } diff --git a/migration/migration.c b/migration/migration.c index 2d1da917c7b1..65b51d360896 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -2536,6 +2536,8 @@ static int postcopy_start(MigrationState *ms, Error **errp) goto fail; } + qemu_savevm_maybe_send_switchover_start(ms->to_dst_file); + /* * Cause any non-postcopiable, but iterative devices to * send out their final data. @@ -2719,6 +2721,8 @@ static int migration_completion_precopy(MigrationState *s, migration_rate_set(RATE_LIMIT_DISABLED); + qemu_savevm_maybe_send_switchover_start(s->to_dst_file); + /* Inactivate disks except in COLO */ ret = qemu_savevm_state_complete_precopy(s->to_dst_file, false, !migrate_colo()); diff --git a/migration/migration.h b/migration/migration.h index 0df2a187afef..c5731626bbfb 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -400,6 +400,8 @@ struct MigrationState { bool send_configuration; /* Whether we send section footer during migration */ bool send_section_footer; + /* Whether we send switchover start notification during migration */ + bool send_switchover_start; /* Needed by postcopy-pause state */ QemuSemaphore postcopy_pause_sem; diff --git a/migration/options.c b/migration/options.c index b8d530032698..466a45e7f2ba 100644 --- a/migration/options.c +++ b/migration/options.c @@ -92,6 +92,8 @@ const Property migration_properties[] = { send_configuration, true), DEFINE_PROP_BOOL("send-section-footer", MigrationState, send_section_footer, true), + DEFINE_PROP_BOOL("send-switchover-start", MigrationState, + send_switchover_start, true), DEFINE_PROP_BOOL("multifd-flush-after-each-section", MigrationState, multifd_flush_after_each_section, false), DEFINE_PROP_UINT8("x-clear-bitmap-shift", MigrationState, @@ -206,6 +208,13 @@ bool migrate_auto_converge(void) return s->capabilities[MIGRATION_CAPABILITY_AUTO_CONVERGE]; } +bool migrate_send_switchover_start(void) +{ + MigrationState *s = migrate_get_current(); + + return s->send_switchover_start; +} + bool migrate_background_snapshot(void) { MigrationState *s = migrate_get_current(); diff --git a/migration/savevm.c b/migration/savevm.c index c929da1ca5a5..5a3be7a06b6f 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -90,6 +90,7 @@ enum qemu_vm_cmd { MIG_CMD_ENABLE_COLO, /* Enable COLO */ MIG_CMD_POSTCOPY_RESUME, /* resume postcopy on dest */ MIG_CMD_RECV_BITMAP, /* Request for recved bitmap on dst */ + MIG_CMD_SWITCHOVER_START, /* Switchover start notification */ MIG_CMD_MAX }; @@ -109,6 +110,7 @@ static struct mig_cmd_args { [MIG_CMD_POSTCOPY_RESUME] = { .len = 0, .name = "POSTCOPY_RESUME" }, [MIG_CMD_PACKAGED] = { .len = 4, .name = "PACKAGED" }, [MIG_CMD_RECV_BITMAP] = { .len = -1, .name = "RECV_BITMAP" }, + [MIG_CMD_SWITCHOVER_START] = { .len = 0, .name = "SWITCHOVER_START" }, [MIG_CMD_MAX] = { .len = -1, .name = "MAX" }, }; @@ -1201,6 +1203,19 @@ void qemu_savevm_send_recv_bitmap(QEMUFile *f, char *block_name) qemu_savevm_command_send(f, MIG_CMD_RECV_BITMAP, len + 1, (uint8_t *)buf); } +static void qemu_savevm_send_switchover_start(QEMUFile *f) +{ + trace_savevm_send_switchover_start(); + qemu_savevm_command_send(f, MIG_CMD_SWITCHOVER_START, 0, NULL); +} + +void qemu_savevm_maybe_send_switchover_start(QEMUFile *f) +{ + if (migrate_send_switchover_start()) { + qemu_savevm_send_switchover_start(f); + } +} + bool qemu_savevm_state_blocked(Error **errp) { SaveStateEntry *se; @@ -1714,6 +1729,7 @@ static int qemu_savevm_state(QEMUFile *f, Error **errp) ret = qemu_file_get_error(f); if (ret == 0) { + qemu_savevm_maybe_send_switchover_start(f); qemu_savevm_state_complete_precopy(f, false, false); ret = qemu_file_get_error(f); } @@ -2410,6 +2426,26 @@ static int loadvm_process_enable_colo(MigrationIncomingState *mis) return ret; } +static int loadvm_postcopy_handle_switchover_start(void) +{ + SaveStateEntry *se; + + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { + int ret; + + if (!se->ops || !se->ops->switchover_start) { + continue; + } + + ret = se->ops->switchover_start(se->opaque); + if (ret < 0) { + return ret; + } + } + + return 0; +} + /* * Process an incoming 'QEMU_VM_COMMAND' * 0 just a normal return @@ -2508,6 +2544,9 @@ static int loadvm_process_command(QEMUFile *f) case MIG_CMD_ENABLE_COLO: return loadvm_process_enable_colo(mis); + + case MIG_CMD_SWITCHOVER_START: + return loadvm_postcopy_handle_switchover_start(); } return 0; diff --git a/migration/savevm.h b/migration/savevm.h index 9ec96a995c93..4d402723bc3c 100644 --- a/migration/savevm.h +++ b/migration/savevm.h @@ -53,6 +53,7 @@ void qemu_savevm_send_postcopy_listen(QEMUFile *f); void qemu_savevm_send_postcopy_run(QEMUFile *f); void qemu_savevm_send_postcopy_resume(QEMUFile *f); void qemu_savevm_send_recv_bitmap(QEMUFile *f, char *block_name); +void qemu_savevm_maybe_send_switchover_start(QEMUFile *f); void qemu_savevm_send_postcopy_ram_discard(QEMUFile *f, const char *name, uint16_t len, diff --git a/migration/trace-events b/migration/trace-events index b82a1c5e40ff..5a73af532f1c 100644 --- a/migration/trace-events +++ b/migration/trace-events @@ -39,6 +39,7 @@ savevm_send_postcopy_run(void) "" savevm_send_postcopy_resume(void) "" savevm_send_colo_enable(void) "" savevm_send_recv_bitmap(char *name) "%s" +savevm_send_switchover_start(void) "" savevm_state_setup(void) "" savevm_state_resume_prepare(void) "" savevm_state_header(void) "" diff --git a/scripts/analyze-migration.py b/scripts/analyze-migration.py index 8e1fbf4c9d9f..67631ac43e9f 100755 --- a/scripts/analyze-migration.py +++ b/scripts/analyze-migration.py @@ -620,7 +620,9 @@ class MigrationDump(object): QEMU_VM_SUBSECTION = 0x05 QEMU_VM_VMDESCRIPTION = 0x06 QEMU_VM_CONFIGURATION = 0x07 + QEMU_VM_COMMAND = 0x08 QEMU_VM_SECTION_FOOTER= 0x7e + QEMU_MIG_CMD_SWITCHOVER_START = 0x0b def __init__(self, filename): self.section_classes = { @@ -685,6 +687,15 @@ def read(self, desc_only = False, dump_memory = False, elif section_type == self.QEMU_VM_SECTION_PART or section_type == self.QEMU_VM_SECTION_END: section_id = file.read32() self.sections[section_id].read() + elif section_type == self.QEMU_VM_COMMAND: + command_type = file.read16() + command_data_len = file.read16() + if command_type != self.QEMU_MIG_CMD_SWITCHOVER_START: + raise Exception("Unknown QEMU_VM_COMMAND: %x" % + (command_type)) + if command_data_len != 0: + raise Exception("Invalid SWITCHOVER_START length: %x" % + (command_data_len)) elif section_type == self.QEMU_VM_SECTION_FOOTER: read_section_id = file.read32() if read_section_id != section_id: From patchwork Thu Jan 30 10:08:27 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954441 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E28EFC02190 for ; Thu, 30 Jan 2025 10:12:04 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRUc-0002Cq-6v; Thu, 30 Jan 2025 05:09:50 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUa-0002Aj-GD for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:48 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUY-00073c-Uq for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:48 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRUT-00000006Tx5-3ZeB; Thu, 30 Jan 2025 11:09:41 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 06/33] migration: Add qemu_loadvm_load_state_buffer() and its handler Date: Thu, 30 Jan 2025 11:08:27 +0100 Message-ID: <7dd24d06e3cd9a6fa0fe8067656ee137a3a8511b.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" qemu_loadvm_load_state_buffer() and its load_state_buffer SaveVMHandler allow providing device state buffer to explicitly specified device via its idstr and instance id. Reviewed-by: Fabiano Rosas Reviewed-by: Peter Xu Signed-off-by: Maciej S. Szmigiero --- include/migration/register.h | 15 +++++++++++++++ migration/savevm.c | 23 +++++++++++++++++++++++ migration/savevm.h | 3 +++ 3 files changed, 41 insertions(+) diff --git a/include/migration/register.h b/include/migration/register.h index ff0faf5f68c8..58891aa54b76 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -229,6 +229,21 @@ typedef struct SaveVMHandlers { */ int (*load_state)(QEMUFile *f, void *opaque, int version_id); + /** + * @load_state_buffer (invoked outside the BQL) + * + * Load device state buffer provided to qemu_loadvm_load_state_buffer(). + * + * @opaque: data pointer passed to register_savevm_live() + * @buf: the data buffer to load + * @len: the data length in buffer + * @errp: pointer to Error*, to store an error if it happens. + * + * Returns true to indicate success and false for errors. + */ + bool (*load_state_buffer)(void *opaque, char *buf, size_t len, + Error **errp); + /** * @load_setup * diff --git a/migration/savevm.c b/migration/savevm.c index 5a3be7a06b6f..b0b74140daea 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -3082,6 +3082,29 @@ int qemu_loadvm_approve_switchover(void) return migrate_send_rp_switchover_ack(mis); } +bool qemu_loadvm_load_state_buffer(const char *idstr, uint32_t instance_id, + char *buf, size_t len, Error **errp) +{ + SaveStateEntry *se; + + se = find_se(idstr, instance_id); + if (!se) { + error_setg(errp, + "Unknown idstr %s or instance id %u for load state buffer", + idstr, instance_id); + return false; + } + + if (!se->ops || !se->ops->load_state_buffer) { + error_setg(errp, + "idstr %s / instance %u has no load state buffer operation", + idstr, instance_id); + return false; + } + + return se->ops->load_state_buffer(se->opaque, buf, len, errp); +} + bool save_snapshot(const char *name, bool overwrite, const char *vmstate, bool has_devices, strList *devices, Error **errp) { diff --git a/migration/savevm.h b/migration/savevm.h index 4d402723bc3c..8b78493dbc0e 100644 --- a/migration/savevm.h +++ b/migration/savevm.h @@ -71,4 +71,7 @@ int qemu_loadvm_approve_switchover(void); int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f, bool in_postcopy, bool inactivate_disks); +bool qemu_loadvm_load_state_buffer(const char *idstr, uint32_t instance_id, + char *buf, size_t len, Error **errp); + #endif From patchwork Thu Jan 30 10:08:28 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954446 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E43CFC02190 for ; Thu, 30 Jan 2025 10:12:23 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRUi-0002Ef-61; Thu, 30 Jan 2025 05:09:56 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUg-0002EU-5w for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:54 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUe-00073x-MQ for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:53 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRUZ-00000006TxG-04Ei; Thu, 30 Jan 2025 11:09:47 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 07/33] io: tls: Allow terminating the TLS session gracefully with EOF Date: Thu, 30 Jan 2025 11:08:28 +0100 Message-ID: X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Currently, hitting EOF on receive without sender terminating the TLS session properly causes the TLS channel to return an error (unless the channel was already shut down for read). Add an optional setting whether we instead just return EOF in that case. This possibility will be soon used by the migration multifd code. Signed-off-by: Maciej S. Szmigiero --- include/io/channel-tls.h | 11 +++++++++++ io/channel-tls.c | 6 ++++++ 2 files changed, 17 insertions(+) diff --git a/include/io/channel-tls.h b/include/io/channel-tls.h index 26c67f17e2d3..8552c0d0266e 100644 --- a/include/io/channel-tls.h +++ b/include/io/channel-tls.h @@ -49,6 +49,7 @@ struct QIOChannelTLS { QCryptoTLSSession *session; QIOChannelShutdown shutdown; guint hs_ioc_tag; + bool premature_eof_okay; }; /** @@ -143,4 +144,14 @@ void qio_channel_tls_handshake(QIOChannelTLS *ioc, QCryptoTLSSession * qio_channel_tls_get_session(QIOChannelTLS *ioc); +/** + * qio_channel_tls_set_premature_eof_okay: + * @ioc: the TLS channel object + * + * Sets whether receiving an EOF without terminating the TLS session properly + * by used the other side is considered okay or an error (the + * default behaviour). + */ +void qio_channel_tls_set_premature_eof_okay(QIOChannelTLS *ioc, bool enabled); + #endif /* QIO_CHANNEL_TLS_H */ diff --git a/io/channel-tls.c b/io/channel-tls.c index aab630e5ae32..1079d6d10de1 100644 --- a/io/channel-tls.c +++ b/io/channel-tls.c @@ -147,6 +147,11 @@ qio_channel_tls_new_client(QIOChannel *master, return NULL; } +void qio_channel_tls_set_premature_eof_okay(QIOChannelTLS *ioc, bool enabled) +{ + ioc->premature_eof_okay = enabled; +} + struct QIOChannelTLSData { QIOTask *task; GMainContext *context; @@ -279,6 +284,7 @@ static ssize_t qio_channel_tls_readv(QIOChannel *ioc, tioc->session, iov[i].iov_base, iov[i].iov_len, + tioc->premature_eof_okay || qatomic_load_acquire(&tioc->shutdown) & QIO_CHANNEL_SHUTDOWN_READ, errp); if (ret == QCRYPTO_TLS_SESSION_ERR_BLOCK) { From patchwork Thu Jan 30 10:08:29 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954444 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 69297C02193 for ; Thu, 30 Jan 2025 10:12:11 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRUm-0002Ik-DQ; Thu, 30 Jan 2025 05:10:00 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUk-0002Hy-GT for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:58 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUj-00074I-0A for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:09:58 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRUe-00000006TxS-0eRP; Thu, 30 Jan 2025 11:09:52 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 08/33] migration/multifd: Allow premature EOF on TLS incoming channels Date: Thu, 30 Jan 2025 11:08:29 +0100 Message-ID: X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Multifd send channels are terminated by calling qio_channel_shutdown(QIO_CHANNEL_SHUTDOWN_BOTH) in multifd_send_terminate_threads(), which in the TLS case essentially calls shutdown(SHUT_RDWR) on the underlying raw socket. Unfortunately, this does not terminate the TLS session properly and the receive side sees this as a GNUTLS_E_PREMATURE_TERMINATION error. The only reason why this wasn't causing migration failures is because the current migration code apparently does not check for migration error being set after the end of the multifd receive process. However, this will change soon so the multifd receive code has to be prepared to not return an error on such premature TLS session EOF. Use the newly introduced QIOChannelTLS method for that. It's worth noting that even if the sender were to be changed to terminate the TLS connection properly the receive side still needs to remain compatible with older QEMU bit stream which does not do this. Signed-off-by: Maciej S. Szmigiero --- migration/multifd.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/migration/multifd.c b/migration/multifd.c index ab73d6d984cf..ceaad930e141 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -1310,6 +1310,7 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error **errp) Error *local_err = NULL; bool use_packets = multifd_use_packets(); int id; + QIOChannelTLS *ioc_tls; if (use_packets) { id = multifd_recv_initial_packet(ioc, &local_err); @@ -1337,6 +1338,13 @@ void multifd_recv_new_channel(QIOChannel *ioc, Error **errp) p->c = ioc; object_ref(OBJECT(ioc)); + ioc_tls = QIO_CHANNEL_TLS(object_dynamic_cast(OBJECT(ioc), + TYPE_QIO_CHANNEL_TLS)); + if (ioc_tls) { + /* Multifd send channels do not terminate the TLS session properly */ + qio_channel_tls_set_premature_eof_okay(ioc_tls, true); + } + p->thread_created = true; qemu_thread_create(&p->thread, p->name, multifd_recv_thread, p, QEMU_THREAD_JOINABLE); From patchwork Thu Jan 30 10:08:30 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954430 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BF1B8C0218A for ; Thu, 30 Jan 2025 10:10:53 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRUs-0002dO-DK; Thu, 30 Jan 2025 05:10:06 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUp-0002SF-Ev for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:03 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUn-00074W-Mb for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:03 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRUj-00000006Txh-1BNQ; Thu, 30 Jan 2025 11:09:57 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 09/33] migration: postcopy_ram_listen_thread() needs to take BQL for some calls Date: Thu, 30 Jan 2025 11:08:30 +0100 Message-ID: <139bf266dbd1e25a1e5a050ecb82e3e59120d705.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" postcopy_ram_listen_thread() is a free running thread, so it needs to take BQL around function calls to migration methods requiring BQL. qemu_loadvm_state_main() needs BQL held since it ultimately calls "load_state" SaveVMHandlers. migration_incoming_state_destroy() needs BQL held since it ultimately calls "load_cleanup" SaveVMHandlers. Signed-off-by: Maciej S. Szmigiero --- migration/savevm.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/migration/savevm.c b/migration/savevm.c index b0b74140daea..0ceea9638cc1 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -2013,7 +2013,9 @@ static void *postcopy_ram_listen_thread(void *opaque) * in qemu_file, and thus we must be blocking now. */ qemu_file_set_blocking(f, true); + bql_lock(); load_res = qemu_loadvm_state_main(f, mis); + bql_unlock(); /* * This is tricky, but, mis->from_src_file can change after it @@ -2073,7 +2075,9 @@ static void *postcopy_ram_listen_thread(void *opaque) * (If something broke then qemu will have to exit anyway since it's * got a bad migration state). */ + bql_lock(); migration_incoming_state_destroy(); + bql_unlock(); rcu_unregister_thread(); mis->have_listen_thread = false; From patchwork Thu Jan 30 10:08:31 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954429 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 63D36C0218A for ; Thu, 30 Jan 2025 10:10:46 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRUw-0002wB-Io; Thu, 30 Jan 2025 05:10:10 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUt-0002mr-UV for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:08 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUr-0007FD-Lq for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:07 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRUo-00000006Txr-1pe6; Thu, 30 Jan 2025 11:10:02 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 10/33] error: define g_autoptr() cleanup function for the Error type Date: Thu, 30 Jan 2025 11:08:31 +0100 Message-ID: <82edf9cfc5f707be405c48a46a42a42df3611aaf.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Automatic memory management helps avoid memory safety issues. Signed-off-by: Maciej S. Szmigiero --- include/qapi/error.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/include/qapi/error.h b/include/qapi/error.h index 71f8fb2c50ee..649ec8f1b6a2 100644 --- a/include/qapi/error.h +++ b/include/qapi/error.h @@ -437,6 +437,8 @@ Error *error_copy(const Error *err); */ void error_free(Error *err); +G_DEFINE_AUTOPTR_CLEANUP_FUNC(Error, error_free) + /* * Convenience function to assert that *@errp is set, then silently free it. */ From patchwork Thu Jan 30 10:08:32 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954437 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 30879C02190 for ; Thu, 30 Jan 2025 10:11:56 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRV2-0003Iv-8a; Thu, 30 Jan 2025 05:10:16 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUz-0003AJ-9R for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:13 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRUx-0007I1-1Y for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:12 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRUt-00000006TyA-2Raz; Thu, 30 Jan 2025 11:10:07 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 11/33] migration: Add thread pool of optional load threads Date: Thu, 30 Jan 2025 11:08:32 +0100 Message-ID: <6bd6ea6a59ef6b350fbffa1ae82830762559fa52.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Some drivers might want to make use of auxiliary helper threads during VM state loading, for example to make sure that their blocking (sync) I/O operations don't block the rest of the migration process. Add a migration core managed thread pool to facilitate this use case. The migration core will wait for these threads to finish before (re)starting the VM at destination. Reviewed-by: Fabiano Rosas Signed-off-by: Maciej S. Szmigiero --- include/migration/misc.h | 3 ++ include/qemu/typedefs.h | 2 + migration/migration.c | 2 +- migration/migration.h | 5 +++ migration/savevm.c | 92 +++++++++++++++++++++++++++++++++++++++- migration/savevm.h | 2 +- 6 files changed, 103 insertions(+), 3 deletions(-) diff --git a/include/migration/misc.h b/include/migration/misc.h index 67f7ef7a0e5c..bc5ce31b52e0 100644 --- a/include/migration/misc.h +++ b/include/migration/misc.h @@ -45,9 +45,12 @@ bool migrate_ram_is_ignored(RAMBlock *block); /* migration/block.c */ AnnounceParameters *migrate_announce_params(void); + /* migration/savevm.c */ void dump_vmstate_json_to_file(FILE *out_fp); +void qemu_loadvm_start_load_thread(MigrationLoadThread function, + void *opaque); /* migration/migration.c */ void migration_object_init(void); diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h index 3d84efcac47a..fd23ff7771b1 100644 --- a/include/qemu/typedefs.h +++ b/include/qemu/typedefs.h @@ -131,5 +131,7 @@ typedef struct IRQState *qemu_irq; * Function types */ typedef void (*qemu_irq_handler)(void *opaque, int n, int level); +typedef bool (*MigrationLoadThread)(void *opaque, bool *should_quit, + Error **errp); #endif /* QEMU_TYPEDEFS_H */ diff --git a/migration/migration.c b/migration/migration.c index 65b51d360896..0f29188499e4 100644 --- a/migration/migration.c +++ b/migration/migration.c @@ -400,7 +400,7 @@ void migration_incoming_state_destroy(void) * RAM state cleanup needs to happen after multifd cleanup, because * multifd threads can use some of its states (receivedmap). */ - qemu_loadvm_state_cleanup(); + qemu_loadvm_state_cleanup(mis); if (mis->to_src_file) { /* Tell source that we are done */ diff --git a/migration/migration.h b/migration/migration.h index c5731626bbfb..1699fe7d91cc 100644 --- a/migration/migration.h +++ b/migration/migration.h @@ -43,6 +43,7 @@ #define MIGRATION_THREAD_DST_PREEMPT "mig/dst/preempt" struct PostcopyBlocktimeContext; +typedef struct ThreadPool ThreadPool; #define MIGRATION_RESUME_ACK_VALUE (1) @@ -187,6 +188,10 @@ struct MigrationIncomingState { Coroutine *colo_incoming_co; QemuSemaphore colo_incoming_sem; + /* Optional load threads pool and its thread exit request flag */ + ThreadPool *load_threads; + bool load_threads_abort; + /* * PostcopyBlocktimeContext to keep information for postcopy * live migration, to calculate vCPU block time diff --git a/migration/savevm.c b/migration/savevm.c index 0ceea9638cc1..74d1960de3c6 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -54,6 +54,7 @@ #include "qemu/job.h" #include "qemu/main-loop.h" #include "block/snapshot.h" +#include "block/thread-pool.h" #include "qemu/cutils.h" #include "io/channel-buffer.h" #include "io/channel-file.h" @@ -131,6 +132,35 @@ static struct mig_cmd_args { * generic extendable format with an exception for two old entities. */ +/***********************************************************/ +/* Optional load threads pool support */ + +static void qemu_loadvm_thread_pool_create(MigrationIncomingState *mis) +{ + assert(!mis->load_threads); + mis->load_threads = thread_pool_new(); + mis->load_threads_abort = false; +} + +static void qemu_loadvm_thread_pool_destroy(MigrationIncomingState *mis) +{ + qatomic_set(&mis->load_threads_abort, true); + + bql_unlock(); /* Load threads might be waiting for BQL */ + g_clear_pointer(&mis->load_threads, thread_pool_free); + bql_lock(); +} + +static bool qemu_loadvm_thread_pool_wait(MigrationState *s, + MigrationIncomingState *mis) +{ + bql_unlock(); /* Let load threads do work requiring BQL */ + thread_pool_wait(mis->load_threads); + bql_lock(); + + return !migrate_has_error(s); +} + /***********************************************************/ /* savevm/loadvm support */ @@ -2810,16 +2840,62 @@ static int qemu_loadvm_state_setup(QEMUFile *f, Error **errp) return 0; } -void qemu_loadvm_state_cleanup(void) +struct LoadThreadData { + MigrationLoadThread function; + void *opaque; +}; + +static int qemu_loadvm_load_thread(void *thread_opaque) +{ + struct LoadThreadData *data = thread_opaque; + MigrationIncomingState *mis = migration_incoming_get_current(); + g_autoptr(Error) local_err = NULL; + + if (!data->function(data->opaque, &mis->load_threads_abort, &local_err)) { + MigrationState *s = migrate_get_current(); + + assert(local_err); + + /* + * In case of multiple load threads failing which thread error + * return we end setting is purely arbitrary. + */ + migrate_set_error(s, local_err); + } + + return 0; +} + +void qemu_loadvm_start_load_thread(MigrationLoadThread function, + void *opaque) +{ + MigrationIncomingState *mis = migration_incoming_get_current(); + struct LoadThreadData *data; + + /* We only set it from this thread so it's okay to read it directly */ + assert(!mis->load_threads_abort); + + data = g_new(struct LoadThreadData, 1); + data->function = function; + data->opaque = opaque; + + thread_pool_submit_immediate(mis->load_threads, qemu_loadvm_load_thread, + data, g_free); +} + +void qemu_loadvm_state_cleanup(MigrationIncomingState *mis) { SaveStateEntry *se; trace_loadvm_state_cleanup(); + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { if (se->ops && se->ops->load_cleanup) { se->ops->load_cleanup(se->opaque); } } + + qemu_loadvm_thread_pool_destroy(mis); } /* Return true if we should continue the migration, or false. */ @@ -2970,6 +3046,7 @@ out: int qemu_loadvm_state(QEMUFile *f) { + MigrationState *s = migrate_get_current(); MigrationIncomingState *mis = migration_incoming_get_current(); Error *local_err = NULL; int ret; @@ -2979,6 +3056,8 @@ int qemu_loadvm_state(QEMUFile *f) return -EINVAL; } + qemu_loadvm_thread_pool_create(mis); + ret = qemu_loadvm_state_header(f); if (ret) { return ret; @@ -3008,6 +3087,17 @@ int qemu_loadvm_state(QEMUFile *f) return ret; } + if (ret == 0) { + if (!qemu_loadvm_thread_pool_wait(s, mis)) { + ret = -EINVAL; + } + } + /* + * Set this flag unconditionally so we'll catch further attempts to + * start additional threads via an appropriate assert() + */ + qatomic_set(&mis->load_threads_abort, true); + if (ret == 0) { ret = qemu_file_get_error(f); } diff --git a/migration/savevm.h b/migration/savevm.h index 8b78493dbc0e..3fa06574e632 100644 --- a/migration/savevm.h +++ b/migration/savevm.h @@ -64,7 +64,7 @@ void qemu_savevm_live_state(QEMUFile *f); int qemu_save_device_state(QEMUFile *f); int qemu_loadvm_state(QEMUFile *f); -void qemu_loadvm_state_cleanup(void); +void qemu_loadvm_state_cleanup(MigrationIncomingState *mis); int qemu_loadvm_state_main(QEMUFile *f, MigrationIncomingState *mis); int qemu_load_device_state(QEMUFile *f); int qemu_loadvm_approve_switchover(void); From patchwork Thu Jan 30 10:08:33 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954433 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6A3ACC02190 for ; Thu, 30 Jan 2025 10:11:34 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRV8-0003kN-4b; Thu, 30 Jan 2025 05:10:22 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRV5-0003cz-Ln for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:19 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRV3-0007IX-WA for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:19 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRUy-00000006TyM-32DN; Thu, 30 Jan 2025 11:10:12 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 12/33] migration/multifd: Split packet into header and RAM data Date: Thu, 30 Jan 2025 11:08:33 +0100 Message-ID: X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Read packet header first so in the future we will be able to differentiate between a RAM multifd packet and a device state multifd packet. Since these two are of different size we can't read the packet body until we know which packet type it is. Reviewed-by: Fabiano Rosas Reviewed-by: Peter Xu Signed-off-by: Maciej S. Szmigiero --- migration/multifd.c | 49 +++++++++++++++++++++++++++++++++++---------- migration/multifd.h | 5 +++++ 2 files changed, 43 insertions(+), 11 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index ceaad930e141..53493676012e 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -209,10 +209,10 @@ void multifd_send_fill_packet(MultiFDSendParams *p) memset(packet, 0, p->packet_len); - packet->magic = cpu_to_be32(MULTIFD_MAGIC); - packet->version = cpu_to_be32(MULTIFD_VERSION); + packet->hdr.magic = cpu_to_be32(MULTIFD_MAGIC); + packet->hdr.version = cpu_to_be32(MULTIFD_VERSION); - packet->flags = cpu_to_be32(p->flags); + packet->hdr.flags = cpu_to_be32(p->flags); packet->next_packet_size = cpu_to_be32(p->next_packet_size); packet_num = qatomic_fetch_inc(&multifd_send_state->packet_num); @@ -228,12 +228,12 @@ void multifd_send_fill_packet(MultiFDSendParams *p) p->flags, p->next_packet_size); } -static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) +static int multifd_recv_unfill_packet_header(MultiFDRecvParams *p, + const MultiFDPacketHdr_t *hdr, + Error **errp) { - const MultiFDPacket_t *packet = p->packet; - uint32_t magic = be32_to_cpu(packet->magic); - uint32_t version = be32_to_cpu(packet->version); - int ret = 0; + uint32_t magic = be32_to_cpu(hdr->magic); + uint32_t version = be32_to_cpu(hdr->version); if (magic != MULTIFD_MAGIC) { error_setg(errp, "multifd: received packet magic %x, expected %x", @@ -247,7 +247,16 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) return -1; } - p->flags = be32_to_cpu(packet->flags); + p->flags = be32_to_cpu(hdr->flags); + + return 0; +} + +static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) +{ + const MultiFDPacket_t *packet = p->packet; + int ret = 0; + p->next_packet_size = be32_to_cpu(packet->next_packet_size); p->packet_num = be64_to_cpu(packet->packet_num); p->packets_recved++; @@ -1130,8 +1139,12 @@ static void *multifd_recv_thread(void *opaque) rcu_register_thread(); while (true) { + MultiFDPacketHdr_t hdr; uint32_t flags = 0; bool has_data = false; + uint8_t *pkt_buf; + size_t pkt_len; + p->normal_num = 0; if (use_packets) { @@ -1139,8 +1152,22 @@ static void *multifd_recv_thread(void *opaque) break; } - ret = qio_channel_read_all_eof(p->c, (void *)p->packet, - p->packet_len, &local_err); + ret = qio_channel_read_all_eof(p->c, (void *)&hdr, + sizeof(hdr), &local_err); + if (ret == 0 || ret == -1) { /* 0: EOF -1: Error */ + break; + } + + ret = multifd_recv_unfill_packet_header(p, &hdr, &local_err); + if (ret) { + break; + } + + pkt_buf = (uint8_t *)p->packet + sizeof(hdr); + pkt_len = p->packet_len - sizeof(hdr); + + ret = qio_channel_read_all_eof(p->c, (char *)pkt_buf, pkt_len, + &local_err); if (ret == 0 || ret == -1) { /* 0: EOF -1: Error */ break; } diff --git a/migration/multifd.h b/migration/multifd.h index bd785b987315..9e4baa066312 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -69,6 +69,11 @@ typedef struct { uint32_t magic; uint32_t version; uint32_t flags; +} __attribute__((packed)) MultiFDPacketHdr_t; + +typedef struct { + MultiFDPacketHdr_t hdr; + /* maximum number of allocated pages */ uint32_t pages_alloc; /* non zero pages */ From patchwork Thu Jan 30 10:08:34 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954432 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 18016C0218A for ; Thu, 30 Jan 2025 10:11:31 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRVC-0003un-Ij; Thu, 30 Jan 2025 05:10:26 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVA-0003mH-5k for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:24 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRV8-0007Iw-4i for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:23 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRV3-00000006Tyb-3bKh; Thu, 30 Jan 2025 11:10:17 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 13/33] migration/multifd: Device state transfer support - receive side Date: Thu, 30 Jan 2025 11:08:34 +0100 Message-ID: <654768fd717b0def70906bf7adfd6739cfd2ecf3.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Add a basic support for receiving device state via multifd channels - channels that are shared with RAM transfers. Depending whether MULTIFD_FLAG_DEVICE_STATE flag is present or not in the packet header either device state (MultiFDPacketDeviceState_t) or RAM data (existing MultiFDPacket_t) is read. The received device state data is provided to qemu_loadvm_load_state_buffer() function for processing in the device's load_state_buffer handler. Signed-off-by: Maciej S. Szmigiero --- migration/multifd.c | 99 ++++++++++++++++++++++++++++++++++++++++----- migration/multifd.h | 26 +++++++++++- 2 files changed, 113 insertions(+), 12 deletions(-) diff --git a/migration/multifd.c b/migration/multifd.c index 53493676012e..810e7b1fb340 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -21,6 +21,7 @@ #include "file.h" #include "migration.h" #include "migration-stats.h" +#include "savevm.h" #include "socket.h" #include "tls.h" #include "qemu-file.h" @@ -252,14 +253,24 @@ static int multifd_recv_unfill_packet_header(MultiFDRecvParams *p, return 0; } -static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) +static int multifd_recv_unfill_packet_device_state(MultiFDRecvParams *p, + Error **errp) +{ + MultiFDPacketDeviceState_t *packet = p->packet_dev_state; + + packet->instance_id = be32_to_cpu(packet->instance_id); + p->next_packet_size = be32_to_cpu(packet->next_packet_size); + + return 0; +} + +static int multifd_recv_unfill_packet_ram(MultiFDRecvParams *p, Error **errp) { const MultiFDPacket_t *packet = p->packet; int ret = 0; p->next_packet_size = be32_to_cpu(packet->next_packet_size); p->packet_num = be64_to_cpu(packet->packet_num); - p->packets_recved++; /* Always unfill, old QEMUs (<9.0) send data along with SYNC */ ret = multifd_ram_unfill_packet(p, errp); @@ -270,6 +281,17 @@ static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) return ret; } +static int multifd_recv_unfill_packet(MultiFDRecvParams *p, Error **errp) +{ + p->packets_recved++; + + if (p->flags & MULTIFD_FLAG_DEVICE_STATE) { + return multifd_recv_unfill_packet_device_state(p, errp); + } + + return multifd_recv_unfill_packet_ram(p, errp); +} + static bool multifd_send_should_exit(void) { return qatomic_read(&multifd_send_state->exiting); @@ -1027,6 +1049,7 @@ static void multifd_recv_cleanup_channel(MultiFDRecvParams *p) p->packet_len = 0; g_free(p->packet); p->packet = NULL; + g_clear_pointer(&p->packet_dev_state, g_free); g_free(p->normal); p->normal = NULL; g_free(p->zero); @@ -1128,6 +1151,32 @@ void multifd_recv_sync_main(void) trace_multifd_recv_sync_main(multifd_recv_state->packet_num); } +static int multifd_device_state_recv(MultiFDRecvParams *p, Error **errp) +{ + g_autofree char *idstr = NULL; + g_autofree char *dev_state_buf = NULL; + int ret; + + dev_state_buf = g_malloc(p->next_packet_size); + + ret = qio_channel_read_all(p->c, dev_state_buf, p->next_packet_size, errp); + if (ret != 0) { + return ret; + } + + idstr = g_strndup(p->packet_dev_state->idstr, + sizeof(p->packet_dev_state->idstr)); + + if (!qemu_loadvm_load_state_buffer(idstr, + p->packet_dev_state->instance_id, + dev_state_buf, p->next_packet_size, + errp)) { + ret = -1; + } + + return ret; +} + static void *multifd_recv_thread(void *opaque) { MultiFDRecvParams *p = opaque; @@ -1141,6 +1190,7 @@ static void *multifd_recv_thread(void *opaque) while (true) { MultiFDPacketHdr_t hdr; uint32_t flags = 0; + bool is_device_state = false; bool has_data = false; uint8_t *pkt_buf; size_t pkt_len; @@ -1163,8 +1213,14 @@ static void *multifd_recv_thread(void *opaque) break; } - pkt_buf = (uint8_t *)p->packet + sizeof(hdr); - pkt_len = p->packet_len - sizeof(hdr); + is_device_state = p->flags & MULTIFD_FLAG_DEVICE_STATE; + if (is_device_state) { + pkt_buf = (uint8_t *)p->packet_dev_state + sizeof(hdr); + pkt_len = sizeof(*p->packet_dev_state) - sizeof(hdr); + } else { + pkt_buf = (uint8_t *)p->packet + sizeof(hdr); + pkt_len = p->packet_len - sizeof(hdr); + } ret = qio_channel_read_all_eof(p->c, (char *)pkt_buf, pkt_len, &local_err); @@ -1183,12 +1239,17 @@ static void *multifd_recv_thread(void *opaque) /* recv methods don't know how to handle the SYNC flag */ p->flags &= ~MULTIFD_FLAG_SYNC; - /* - * Even if it's a SYNC packet, this needs to be set - * because older QEMUs (<9.0) still send data along with - * the SYNC packet. - */ - has_data = p->normal_num || p->zero_num; + if (is_device_state) { + has_data = p->next_packet_size > 0; + } else { + /* + * Even if it's a SYNC packet, this needs to be set + * because older QEMUs (<9.0) still send data along with + * the SYNC packet. + */ + has_data = p->normal_num || p->zero_num; + } + qemu_mutex_unlock(&p->mutex); } else { /* @@ -1217,14 +1278,29 @@ static void *multifd_recv_thread(void *opaque) } if (has_data) { - ret = multifd_recv_state->ops->recv(p, &local_err); + if (is_device_state) { + assert(use_packets); + ret = multifd_device_state_recv(p, &local_err); + } else { + ret = multifd_recv_state->ops->recv(p, &local_err); + } if (ret != 0) { break; } + } else if (is_device_state) { + error_setg(&local_err, + "multifd: received empty device state packet"); + break; } if (use_packets) { if (flags & MULTIFD_FLAG_SYNC) { + if (is_device_state) { + error_setg(&local_err, + "multifd: received SYNC device state packet"); + break; + } + qemu_sem_post(&multifd_recv_state->sem_sync); qemu_sem_wait(&p->sem_sync); } @@ -1293,6 +1369,7 @@ int multifd_recv_setup(Error **errp) p->packet_len = sizeof(MultiFDPacket_t) + sizeof(uint64_t) * page_count; p->packet = g_malloc0(p->packet_len); + p->packet_dev_state = g_malloc0(sizeof(*p->packet_dev_state)); } p->name = g_strdup_printf(MIGRATION_THREAD_DST_MULTIFD, i); p->normal = g_new0(ram_addr_t, page_count); diff --git a/migration/multifd.h b/migration/multifd.h index 9e4baa066312..abf3acdcee40 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -62,6 +62,12 @@ MultiFDRecvData *multifd_get_recv_data(void); #define MULTIFD_FLAG_UADK (8 << 1) #define MULTIFD_FLAG_QATZIP (16 << 1) +/* + * If set it means that this packet contains device state + * (MultiFDPacketDeviceState_t), not RAM data (MultiFDPacket_t). + */ +#define MULTIFD_FLAG_DEVICE_STATE (1 << 6) + /* This value needs to be a multiple of qemu_target_page_size() */ #define MULTIFD_PACKET_SIZE (512 * 1024) @@ -94,6 +100,16 @@ typedef struct { uint64_t offset[]; } __attribute__((packed)) MultiFDPacket_t; +typedef struct { + MultiFDPacketHdr_t hdr; + + char idstr[256] QEMU_NONSTRING; + uint32_t instance_id; + + /* size of the next packet that contains the actual data */ + uint32_t next_packet_size; +} __attribute__((packed)) MultiFDPacketDeviceState_t; + typedef struct { /* number of used pages */ uint32_t num; @@ -111,6 +127,13 @@ struct MultiFDRecvData { off_t file_offset; }; +typedef struct { + char *idstr; + uint32_t instance_id; + char *buf; + size_t buf_len; +} MultiFDDeviceState_t; + typedef enum { MULTIFD_PAYLOAD_NONE, MULTIFD_PAYLOAD_RAM, @@ -227,8 +250,9 @@ typedef struct { /* thread local variables. No locking required */ - /* pointer to the packet */ + /* pointers to the possible packet types */ MultiFDPacket_t *packet; + MultiFDPacketDeviceState_t *packet_dev_state; /* size of the next packet that contains pages */ uint32_t next_packet_size; /* packets received through this channel */ From patchwork Thu Jan 30 10:08:35 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954439 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2E0CAC02193 for ; Thu, 30 Jan 2025 10:12:04 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRVH-0004FJ-0E; Thu, 30 Jan 2025 05:10:31 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVF-0004Ai-9c for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:29 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVD-0007JO-Bo for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:28 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRV9-00000006Tyl-01e7; Thu, 30 Jan 2025 11:10:23 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 14/33] migration/multifd: Make multifd_send() thread safe Date: Thu, 30 Jan 2025 11:08:35 +0100 Message-ID: X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" multifd_send() function is currently not thread safe, make it thread safe by holding a lock during its execution. This way it will be possible to safely call it concurrently from multiple threads. Reviewed-by: Peter Xu Signed-off-by: Maciej S. Szmigiero --- migration/multifd.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/migration/multifd.c b/migration/multifd.c index 810e7b1fb340..02d163fe292d 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -50,6 +50,10 @@ typedef struct { struct { MultiFDSendParams *params; + + /* multifd_send() body is not thread safe, needs serialization */ + QemuMutex multifd_send_mutex; + /* * Global number of generated multifd packets. * @@ -339,6 +343,8 @@ bool multifd_send(MultiFDSendData **send_data) return false; } + QEMU_LOCK_GUARD(&multifd_send_state->multifd_send_mutex); + /* We wait here, until at least one channel is ready */ qemu_sem_wait(&multifd_send_state->channels_ready); @@ -507,6 +513,7 @@ static void multifd_send_cleanup_state(void) socket_cleanup_outgoing_migration(); qemu_sem_destroy(&multifd_send_state->channels_created); qemu_sem_destroy(&multifd_send_state->channels_ready); + qemu_mutex_destroy(&multifd_send_state->multifd_send_mutex); g_free(multifd_send_state->params); multifd_send_state->params = NULL; g_free(multifd_send_state); @@ -857,6 +864,7 @@ bool multifd_send_setup(void) thread_count = migrate_multifd_channels(); multifd_send_state = g_malloc0(sizeof(*multifd_send_state)); multifd_send_state->params = g_new0(MultiFDSendParams, thread_count); + qemu_mutex_init(&multifd_send_state->multifd_send_mutex); qemu_sem_init(&multifd_send_state->channels_created, 0); qemu_sem_init(&multifd_send_state->channels_ready, 0); qatomic_set(&multifd_send_state->exiting, 0); From patchwork Thu Jan 30 10:08:36 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954461 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4DB75C0218A for ; Thu, 30 Jan 2025 10:14:24 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRVL-0004NY-B5; Thu, 30 Jan 2025 05:10:35 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVJ-0004Lg-4K for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:33 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVH-0007Jq-Gt for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:32 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRVE-00000006Tyx-0fQG; Thu, 30 Jan 2025 11:10:28 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 15/33] migration/multifd: Add an explicit MultiFDSendData destructor Date: Thu, 30 Jan 2025 11:08:36 +0100 Message-ID: X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" This way if there are fields there that needs explicit disposal (like, for example, some attached buffers) they will be handled appropriately. Add a related assert to multifd_set_payload_type() in order to make sure that this function is only used to fill a previously empty MultiFDSendData with some payload, not the other way around. Reviewed-by: Fabiano Rosas Reviewed-by: Peter Xu Signed-off-by: Maciej S. Szmigiero --- migration/multifd-nocomp.c | 3 +-- migration/multifd.c | 31 ++++++++++++++++++++++++++++--- migration/multifd.h | 5 +++++ 3 files changed, 34 insertions(+), 5 deletions(-) diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c index 1325dba97cea..e46e79d8b272 100644 --- a/migration/multifd-nocomp.c +++ b/migration/multifd-nocomp.c @@ -42,8 +42,7 @@ void multifd_ram_save_setup(void) void multifd_ram_save_cleanup(void) { - g_free(multifd_ram_send); - multifd_ram_send = NULL; + g_clear_pointer(&multifd_ram_send, multifd_send_data_free); } static void multifd_set_file_bitmap(MultiFDSendParams *p) diff --git a/migration/multifd.c b/migration/multifd.c index 02d163fe292d..3353a5da7593 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -123,6 +123,32 @@ MultiFDSendData *multifd_send_data_alloc(void) return g_malloc0(size_minus_payload + max_payload_size); } +void multifd_send_data_clear(MultiFDSendData *data) +{ + if (multifd_payload_empty(data)) { + return; + } + + switch (data->type) { + default: + /* Nothing to do */ + break; + } + + data->type = MULTIFD_PAYLOAD_NONE; +} + +void multifd_send_data_free(MultiFDSendData *data) +{ + if (!data) { + return; + } + + multifd_send_data_clear(data); + + g_free(data); +} + static bool multifd_use_packets(void) { return !migrate_mapped_ram(); @@ -496,8 +522,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp) qemu_sem_destroy(&p->sem_sync); g_free(p->name); p->name = NULL; - g_free(p->data); - p->data = NULL; + g_clear_pointer(&p->data, multifd_send_data_free); p->packet_len = 0; g_free(p->packet); p->packet = NULL; @@ -665,7 +690,7 @@ static void *multifd_send_thread(void *opaque) (uint64_t)p->next_packet_size + p->packet_len); p->next_packet_size = 0; - multifd_set_payload_type(p->data, MULTIFD_PAYLOAD_NONE); + multifd_send_data_clear(p->data); /* * Making sure p->data is published before saying "we're diff --git a/migration/multifd.h b/migration/multifd.h index abf3acdcee40..436598999e6f 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -156,6 +156,9 @@ static inline bool multifd_payload_empty(MultiFDSendData *data) static inline void multifd_set_payload_type(MultiFDSendData *data, MultiFDPayloadType type) { + assert(multifd_payload_empty(data)); + assert(type != MULTIFD_PAYLOAD_NONE); + data->type = type; } @@ -370,6 +373,8 @@ static inline void multifd_send_prepare_header(MultiFDSendParams *p) void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc); bool multifd_send(MultiFDSendData **send_data); MultiFDSendData *multifd_send_data_alloc(void); +void multifd_send_data_clear(MultiFDSendData *data); +void multifd_send_data_free(MultiFDSendData *data); static inline uint32_t multifd_ram_page_size(void) { From patchwork Thu Jan 30 10:08:37 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954436 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 929E8C02190 for ; Thu, 30 Jan 2025 10:11:47 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRVS-0004Xo-21; Thu, 30 Jan 2025 05:10:42 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVP-0004UQ-CH for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:39 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVM-0007KM-U4 for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:39 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRVJ-00000006Tz7-1HWh; Thu, 30 Jan 2025 11:10:33 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 16/33] migration/multifd: Device state transfer support - send side Date: Thu, 30 Jan 2025 11:08:37 +0100 Message-ID: <0d277461eb7cb849dc1af03c38275f7c9747f0db.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" A new function multifd_queue_device_state() is provided for device to queue its state for transmission via a multifd channel. Signed-off-by: Maciej S. Szmigiero --- include/migration/misc.h | 4 ++ migration/meson.build | 1 + migration/multifd-device-state.c | 107 +++++++++++++++++++++++++++++++ migration/multifd-nocomp.c | 14 +++- migration/multifd.c | 42 ++++++++++-- migration/multifd.h | 27 +++++--- 6 files changed, 179 insertions(+), 16 deletions(-) create mode 100644 migration/multifd-device-state.c diff --git a/include/migration/misc.h b/include/migration/misc.h index bc5ce31b52e0..885022d21a0c 100644 --- a/include/migration/misc.h +++ b/include/migration/misc.h @@ -111,4 +111,8 @@ bool migration_in_bg_snapshot(void); bool migration_block_activate(Error **errp); bool migration_block_inactivate(void); +/* migration/multifd-device-state.c */ +bool multifd_queue_device_state(char *idstr, uint32_t instance_id, + char *data, size_t len); + #endif diff --git a/migration/meson.build b/migration/meson.build index dac687ee3acb..50263fa741e7 100644 --- a/migration/meson.build +++ b/migration/meson.build @@ -23,6 +23,7 @@ system_ss.add(files( 'migration-hmp-cmds.c', 'migration.c', 'multifd.c', + 'multifd-device-state.c', 'multifd-nocomp.c', 'multifd-zlib.c', 'multifd-zero-page.c', diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c new file mode 100644 index 000000000000..2207bea9bf8a --- /dev/null +++ b/migration/multifd-device-state.c @@ -0,0 +1,107 @@ +/* + * Multifd device state migration + * + * Copyright (C) 2024,2025 Oracle and/or its affiliates. + * + * This work is licensed under the terms of the GNU GPL, version 2 or later. + * See the COPYING file in the top-level directory. + */ + +#include "qemu/osdep.h" +#include "qemu/lockable.h" +#include "migration/misc.h" +#include "multifd.h" + +static QemuMutex queue_job_mutex; + +static MultiFDSendData *device_state_send; + +size_t multifd_device_state_payload_size(void) +{ + return sizeof(MultiFDDeviceState_t); +} + +void multifd_device_state_send_setup(void) +{ + qemu_mutex_init(&queue_job_mutex); + + assert(!device_state_send); + device_state_send = multifd_send_data_alloc(); +} + +void multifd_device_state_send_cleanup(void) +{ + g_clear_pointer(&device_state_send, multifd_send_data_free); + + qemu_mutex_destroy(&queue_job_mutex); +} + +void multifd_send_data_clear_device_state(MultiFDDeviceState_t *device_state) +{ + g_clear_pointer(&device_state->idstr, g_free); + g_clear_pointer(&device_state->buf, g_free); +} + +static void multifd_device_state_fill_packet(MultiFDSendParams *p) +{ + MultiFDDeviceState_t *device_state = &p->data->u.device_state; + MultiFDPacketDeviceState_t *packet = p->packet_device_state; + + packet->hdr.flags = cpu_to_be32(p->flags); + strncpy(packet->idstr, device_state->idstr, sizeof(packet->idstr)); + packet->instance_id = cpu_to_be32(device_state->instance_id); + packet->next_packet_size = cpu_to_be32(p->next_packet_size); +} + +static void multifd_prepare_header_device_state(MultiFDSendParams *p) +{ + p->iov[0].iov_len = sizeof(*p->packet_device_state); + p->iov[0].iov_base = p->packet_device_state; + p->iovs_num++; +} + +void multifd_device_state_send_prepare(MultiFDSendParams *p) +{ + MultiFDDeviceState_t *device_state = &p->data->u.device_state; + + assert(multifd_payload_device_state(p->data)); + + multifd_prepare_header_device_state(p); + + assert(!(p->flags & MULTIFD_FLAG_SYNC)); + + p->next_packet_size = device_state->buf_len; + if (p->next_packet_size > 0) { + p->iov[p->iovs_num].iov_base = device_state->buf; + p->iov[p->iovs_num].iov_len = p->next_packet_size; + p->iovs_num++; + } + + p->flags |= MULTIFD_FLAG_NOCOMP | MULTIFD_FLAG_DEVICE_STATE; + + multifd_device_state_fill_packet(p); +} + +bool multifd_queue_device_state(char *idstr, uint32_t instance_id, + char *data, size_t len) +{ + /* Device state submissions can come from multiple threads */ + QEMU_LOCK_GUARD(&queue_job_mutex); + MultiFDDeviceState_t *device_state; + + assert(multifd_payload_empty(device_state_send)); + + multifd_set_payload_type(device_state_send, MULTIFD_PAYLOAD_DEVICE_STATE); + device_state = &device_state_send->u.device_state; + device_state->idstr = g_strdup(idstr); + device_state->instance_id = instance_id; + device_state->buf = g_memdup2(data, len); + device_state->buf_len = len; + + if (!multifd_send(&device_state_send)) { + multifd_send_data_clear(device_state_send); + return false; + } + + return true; +} diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c index e46e79d8b272..c00804652383 100644 --- a/migration/multifd-nocomp.c +++ b/migration/multifd-nocomp.c @@ -14,6 +14,7 @@ #include "exec/ramblock.h" #include "exec/target_page.h" #include "file.h" +#include "migration-stats.h" #include "multifd.h" #include "options.h" #include "qapi/error.h" @@ -85,6 +86,13 @@ static void multifd_nocomp_send_cleanup(MultiFDSendParams *p, Error **errp) return; } +static void multifd_ram_prepare_header(MultiFDSendParams *p) +{ + p->iov[0].iov_len = p->packet_len; + p->iov[0].iov_base = p->packet; + p->iovs_num++; +} + static void multifd_send_prepare_iovs(MultiFDSendParams *p) { MultiFDPages_t *pages = &p->data->u.ram; @@ -118,7 +126,7 @@ static int multifd_nocomp_send_prepare(MultiFDSendParams *p, Error **errp) * Only !zerocopy needs the header in IOV; zerocopy will * send it separately. */ - multifd_send_prepare_header(p); + multifd_ram_prepare_header(p); } multifd_send_prepare_iovs(p); @@ -133,6 +141,8 @@ static int multifd_nocomp_send_prepare(MultiFDSendParams *p, Error **errp) if (ret != 0) { return -1; } + + stat64_add(&mig_stats.multifd_bytes, p->packet_len); } return 0; @@ -431,7 +441,7 @@ int multifd_ram_flush_and_sync(QEMUFile *f) bool multifd_send_prepare_common(MultiFDSendParams *p) { MultiFDPages_t *pages = &p->data->u.ram; - multifd_send_prepare_header(p); + multifd_ram_prepare_header(p); multifd_send_zero_page_detect(p); if (!pages->normal_num) { diff --git a/migration/multifd.c b/migration/multifd.c index 3353a5da7593..61b061a33d35 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -12,6 +12,7 @@ #include "qemu/osdep.h" #include "qemu/cutils.h" +#include "qemu/iov.h" #include "qemu/rcu.h" #include "exec/target_page.h" #include "system/system.h" @@ -19,6 +20,7 @@ #include "qemu/error-report.h" #include "qapi/error.h" #include "file.h" +#include "migration/misc.h" #include "migration.h" #include "migration-stats.h" #include "savevm.h" @@ -111,7 +113,9 @@ MultiFDSendData *multifd_send_data_alloc(void) * added to the union in the future are larger than * (MultiFDPages_t + flex array). */ - max_payload_size = MAX(multifd_ram_payload_size(), sizeof(MultiFDPayload)); + max_payload_size = MAX(multifd_ram_payload_size(), + multifd_device_state_payload_size()); + max_payload_size = MAX(max_payload_size, sizeof(MultiFDPayload)); /* * Account for any holes the compiler might insert. We can't pack @@ -130,6 +134,9 @@ void multifd_send_data_clear(MultiFDSendData *data) } switch (data->type) { + case MULTIFD_PAYLOAD_DEVICE_STATE: + multifd_send_data_clear_device_state(&data->u.device_state); + break; default: /* Nothing to do */ break; @@ -232,6 +239,7 @@ static int multifd_recv_initial_packet(QIOChannel *c, Error **errp) return msg.id; } +/* Fills a RAM multifd packet */ void multifd_send_fill_packet(MultiFDSendParams *p) { MultiFDPacket_t *packet = p->packet; @@ -524,6 +532,7 @@ static bool multifd_send_cleanup_channel(MultiFDSendParams *p, Error **errp) p->name = NULL; g_clear_pointer(&p->data, multifd_send_data_free); p->packet_len = 0; + g_clear_pointer(&p->packet_device_state, g_free); g_free(p->packet); p->packet = NULL; multifd_send_state->ops->send_cleanup(p, errp); @@ -536,6 +545,7 @@ static void multifd_send_cleanup_state(void) { file_cleanup_outgoing_migration(); socket_cleanup_outgoing_migration(); + multifd_device_state_send_cleanup(); qemu_sem_destroy(&multifd_send_state->channels_created); qemu_sem_destroy(&multifd_send_state->channels_ready); qemu_mutex_destroy(&multifd_send_state->multifd_send_mutex); @@ -664,16 +674,32 @@ static void *multifd_send_thread(void *opaque) * qatomic_store_release() in multifd_send(). */ if (qatomic_load_acquire(&p->pending_job)) { + bool is_device_state = multifd_payload_device_state(p->data); + size_t total_size; + p->flags = 0; p->iovs_num = 0; assert(!multifd_payload_empty(p->data)); - ret = multifd_send_state->ops->send_prepare(p, &local_err); - if (ret != 0) { - break; + if (is_device_state) { + multifd_device_state_send_prepare(p); + } else { + ret = multifd_send_state->ops->send_prepare(p, &local_err); + if (ret != 0) { + break; + } } + /* + * The packet header in the zerocopy RAM case is accounted for + * in multifd_nocomp_send_prepare() - where it is actually + * being sent. + */ + total_size = iov_size(p->iov, p->iovs_num); + if (migrate_mapped_ram()) { + assert(!is_device_state); + ret = file_write_ramblock_iov(p->c, p->iov, p->iovs_num, &p->data->u.ram, &local_err); } else { @@ -686,8 +712,7 @@ static void *multifd_send_thread(void *opaque) break; } - stat64_add(&mig_stats.multifd_bytes, - (uint64_t)p->next_packet_size + p->packet_len); + stat64_add(&mig_stats.multifd_bytes, total_size); p->next_packet_size = 0; multifd_send_data_clear(p->data); @@ -908,6 +933,9 @@ bool multifd_send_setup(void) p->packet_len = sizeof(MultiFDPacket_t) + sizeof(uint64_t) * page_count; p->packet = g_malloc0(p->packet_len); + p->packet_device_state = g_malloc0(sizeof(*p->packet_device_state)); + p->packet_device_state->hdr.magic = cpu_to_be32(MULTIFD_MAGIC); + p->packet_device_state->hdr.version = cpu_to_be32(MULTIFD_VERSION); } p->name = g_strdup_printf(MIGRATION_THREAD_SRC_MULTIFD, i); p->write_flags = 0; @@ -943,6 +971,8 @@ bool multifd_send_setup(void) assert(p->iov); } + multifd_device_state_send_setup(); + return true; err: diff --git a/migration/multifd.h b/migration/multifd.h index 436598999e6f..ddc617db9acb 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -137,10 +137,12 @@ typedef struct { typedef enum { MULTIFD_PAYLOAD_NONE, MULTIFD_PAYLOAD_RAM, + MULTIFD_PAYLOAD_DEVICE_STATE, } MultiFDPayloadType; typedef union MultiFDPayload { MultiFDPages_t ram; + MultiFDDeviceState_t device_state; } MultiFDPayload; struct MultiFDSendData { @@ -153,6 +155,11 @@ static inline bool multifd_payload_empty(MultiFDSendData *data) return data->type == MULTIFD_PAYLOAD_NONE; } +static inline bool multifd_payload_device_state(MultiFDSendData *data) +{ + return data->type == MULTIFD_PAYLOAD_DEVICE_STATE; +} + static inline void multifd_set_payload_type(MultiFDSendData *data, MultiFDPayloadType type) { @@ -205,8 +212,9 @@ typedef struct { /* thread local variables. No locking required */ - /* pointer to the packet */ + /* pointers to the possible packet types */ MultiFDPacket_t *packet; + MultiFDPacketDeviceState_t *packet_device_state; /* size of the next packet that contains pages */ uint32_t next_packet_size; /* packets sent through this channel */ @@ -363,13 +371,6 @@ bool multifd_send_prepare_common(MultiFDSendParams *p); void multifd_send_zero_page_detect(MultiFDSendParams *p); void multifd_recv_zero_page_process(MultiFDRecvParams *p); -static inline void multifd_send_prepare_header(MultiFDSendParams *p) -{ - p->iov[0].iov_len = p->packet_len; - p->iov[0].iov_base = p->packet; - p->iovs_num++; -} - void multifd_channel_connect(MultiFDSendParams *p, QIOChannel *ioc); bool multifd_send(MultiFDSendData **send_data); MultiFDSendData *multifd_send_data_alloc(void); @@ -394,4 +395,14 @@ bool multifd_ram_sync_per_section(void); size_t multifd_ram_payload_size(void); void multifd_ram_fill_packet(MultiFDSendParams *p); int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp); + +size_t multifd_device_state_payload_size(void); + +void multifd_send_data_clear_device_state(MultiFDDeviceState_t *device_state); + +void multifd_device_state_send_setup(void); +void multifd_device_state_send_cleanup(void); + +void multifd_device_state_send_prepare(MultiFDSendParams *p); + #endif From patchwork Thu Jan 30 10:08:38 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954431 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id B2074C02190 for ; Thu, 30 Jan 2025 10:11:11 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRVV-0004d6-TR; Thu, 30 Jan 2025 05:10:46 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVU-0004cO-R2 for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:44 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVS-0007Ki-Q7 for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:44 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRVO-00000006TzI-1zX9; Thu, 30 Jan 2025 11:10:38 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 17/33] migration/multifd: Make MultiFDSendData a struct Date: Thu, 30 Jan 2025 11:08:38 +0100 Message-ID: X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: Peter Xu The newly introduced device state buffer can be used for either storing VFIO's read() raw data, but already also possible to store generic device states. After noticing that device states may not easily provide a max buffer size (also the fact that RAM MultiFDPages_t after all also want to have flexibility on managing offset[] array), it may not be a good idea to stick with union on MultiFDSendData.. as it won't play well with such flexibility. Switch MultiFDSendData to a struct. It won't consume a lot more space in reality, after all the real buffers were already dynamically allocated, so it's so far only about the two structs (pages, device_state) that will be duplicated, but they're small. With this, we can remove the pretty hard to understand alloc size logic. Because now we can allocate offset[] together with the SendData, and properly free it when the SendData is freed. Signed-off-by: Peter Xu [MSS: Make sure to clear possible device state payload before freeing MultiFDSendData, remove placeholders for other patches not included] Signed-off-by: Maciej S. Szmigiero --- migration/multifd-device-state.c | 5 ----- migration/multifd-nocomp.c | 13 ++++++------- migration/multifd.c | 25 +++++++------------------ migration/multifd.h | 15 +++++++++------ 4 files changed, 22 insertions(+), 36 deletions(-) diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c index 2207bea9bf8a..d1674b432ff2 100644 --- a/migration/multifd-device-state.c +++ b/migration/multifd-device-state.c @@ -16,11 +16,6 @@ static QemuMutex queue_job_mutex; static MultiFDSendData *device_state_send; -size_t multifd_device_state_payload_size(void) -{ - return sizeof(MultiFDDeviceState_t); -} - void multifd_device_state_send_setup(void) { qemu_mutex_init(&queue_job_mutex); diff --git a/migration/multifd-nocomp.c b/migration/multifd-nocomp.c index c00804652383..ffe75256c9fb 100644 --- a/migration/multifd-nocomp.c +++ b/migration/multifd-nocomp.c @@ -25,15 +25,14 @@ static MultiFDSendData *multifd_ram_send; -size_t multifd_ram_payload_size(void) +void multifd_ram_payload_alloc(MultiFDPages_t *pages) { - uint32_t n = multifd_ram_page_count(); + pages->offset = g_new0(ram_addr_t, multifd_ram_page_count()); +} - /* - * We keep an array of page offsets at the end of MultiFDPages_t, - * add space for it in the allocation. - */ - return sizeof(MultiFDPages_t) + n * sizeof(ram_addr_t); +void multifd_ram_payload_free(MultiFDPages_t *pages) +{ + g_clear_pointer(&pages->offset, g_free); } void multifd_ram_save_setup(void) diff --git a/migration/multifd.c b/migration/multifd.c index 61b061a33d35..0b61b8192231 100644 --- a/migration/multifd.c +++ b/migration/multifd.c @@ -105,26 +105,12 @@ struct { MultiFDSendData *multifd_send_data_alloc(void) { - size_t max_payload_size, size_minus_payload; + MultiFDSendData *new = g_new0(MultiFDSendData, 1); - /* - * MultiFDPages_t has a flexible array at the end, account for it - * when allocating MultiFDSendData. Use max() in case other types - * added to the union in the future are larger than - * (MultiFDPages_t + flex array). - */ - max_payload_size = MAX(multifd_ram_payload_size(), - multifd_device_state_payload_size()); - max_payload_size = MAX(max_payload_size, sizeof(MultiFDPayload)); - - /* - * Account for any holes the compiler might insert. We can't pack - * the structure because that misaligns the members and triggers - * Waddress-of-packed-member. - */ - size_minus_payload = sizeof(MultiFDSendData) - sizeof(MultiFDPayload); + multifd_ram_payload_alloc(&new->u.ram); + /* Device state allocates its payload on-demand */ - return g_malloc0(size_minus_payload + max_payload_size); + return new; } void multifd_send_data_clear(MultiFDSendData *data) @@ -151,8 +137,11 @@ void multifd_send_data_free(MultiFDSendData *data) return; } + /* This also free's device state payload */ multifd_send_data_clear(data); + multifd_ram_payload_free(&data->u.ram); + g_free(data); } diff --git a/migration/multifd.h b/migration/multifd.h index ddc617db9acb..f7811cc0d0cb 100644 --- a/migration/multifd.h +++ b/migration/multifd.h @@ -115,9 +115,13 @@ typedef struct { uint32_t num; /* number of normal pages */ uint32_t normal_num; + /* + * Pointer to the ramblock. NOTE: it's caller's responsibility to make + * sure the pointer is always valid! + */ RAMBlock *block; - /* offset of each page */ - ram_addr_t offset[]; + /* offset array of each page, managed by multifd */ + ram_addr_t *offset; } MultiFDPages_t; struct MultiFDRecvData { @@ -140,7 +144,7 @@ typedef enum { MULTIFD_PAYLOAD_DEVICE_STATE, } MultiFDPayloadType; -typedef union MultiFDPayload { +typedef struct MultiFDPayload { MultiFDPages_t ram; MultiFDDeviceState_t device_state; } MultiFDPayload; @@ -392,12 +396,11 @@ void multifd_ram_save_cleanup(void); int multifd_ram_flush_and_sync(QEMUFile *f); bool multifd_ram_sync_per_round(void); bool multifd_ram_sync_per_section(void); -size_t multifd_ram_payload_size(void); +void multifd_ram_payload_alloc(MultiFDPages_t *pages); +void multifd_ram_payload_free(MultiFDPages_t *pages); void multifd_ram_fill_packet(MultiFDSendParams *p); int multifd_ram_unfill_packet(MultiFDRecvParams *p, Error **errp); -size_t multifd_device_state_payload_size(void); - void multifd_send_data_clear_device_state(MultiFDDeviceState_t *device_state); void multifd_device_state_send_setup(void); From patchwork Thu Jan 30 10:08:39 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954448 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 97B4AC0218A for ; Thu, 30 Jan 2025 10:12:34 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRVb-00050W-4n; Thu, 30 Jan 2025 05:10:51 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVY-0004or-EW for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:48 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVW-0007LA-TZ for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:48 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRVT-00000006TzV-2f7Q; Thu, 30 Jan 2025 11:10:43 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 18/33] migration/multifd: Add multifd_device_state_supported() Date: Thu, 30 Jan 2025 11:08:39 +0100 Message-ID: <1a2ac534131c3bf61303dde0f7d59cfa2a90b3d3.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Since device state transfer via multifd channels requires multifd channels with packets and is currently not compatible with multifd compression add an appropriate query function so device can learn whether it can actually make use of it. Reviewed-by: Fabiano Rosas Reviewed-by: Peter Xu Signed-off-by: Maciej S. Szmigiero --- include/migration/misc.h | 1 + migration/multifd-device-state.c | 7 +++++++ 2 files changed, 8 insertions(+) diff --git a/include/migration/misc.h b/include/migration/misc.h index 885022d21a0c..cc987e6e97af 100644 --- a/include/migration/misc.h +++ b/include/migration/misc.h @@ -114,5 +114,6 @@ bool migration_block_inactivate(void); /* migration/multifd-device-state.c */ bool multifd_queue_device_state(char *idstr, uint32_t instance_id, char *data, size_t len); +bool multifd_device_state_supported(void); #endif diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c index d1674b432ff2..cee3c44bcf2a 100644 --- a/migration/multifd-device-state.c +++ b/migration/multifd-device-state.c @@ -11,6 +11,7 @@ #include "qemu/lockable.h" #include "migration/misc.h" #include "multifd.h" +#include "options.h" static QemuMutex queue_job_mutex; @@ -100,3 +101,9 @@ bool multifd_queue_device_state(char *idstr, uint32_t instance_id, return true; } + +bool multifd_device_state_supported(void) +{ + return migrate_multifd() && !migrate_mapped_ram() && + migrate_multifd_compression() == MULTIFD_COMPRESSION_NONE; +} From patchwork Thu Jan 30 10:08:40 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954435 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id EA6AEC0218A for ; Thu, 30 Jan 2025 10:11:45 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRVu-0005Xf-IG; Thu, 30 Jan 2025 05:11:10 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVf-0005BI-EG for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:06 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVd-0007Ly-AI for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:55 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRVY-00000006Tzp-3IJb; Thu, 30 Jan 2025 11:10:48 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 19/33] migration: Add save_live_complete_precopy_thread handler Date: Thu, 30 Jan 2025 11:08:40 +0100 Message-ID: <7561a5d67e113e166c7d43246d52feec6ca06fb5.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" This SaveVMHandler helps device provide its own asynchronous transmission of the remaining data at the end of a precopy phase via multifd channels, in parallel with the transfer done by save_live_complete_precopy handlers. These threads are launched only when multifd device state transfer is supported. Management of these threads in done in the multifd migration code, wrapping them in the generic thread pool. Signed-off-by: Maciej S. Szmigiero --- include/migration/misc.h | 8 +++ include/migration/register.h | 21 ++++++++ include/qemu/typedefs.h | 4 ++ migration/multifd-device-state.c | 83 ++++++++++++++++++++++++++++++++ migration/savevm.c | 37 +++++++++++++- 5 files changed, 152 insertions(+), 1 deletion(-) diff --git a/include/migration/misc.h b/include/migration/misc.h index cc987e6e97af..008d22df8e72 100644 --- a/include/migration/misc.h +++ b/include/migration/misc.h @@ -116,4 +116,12 @@ bool multifd_queue_device_state(char *idstr, uint32_t instance_id, char *data, size_t len); bool multifd_device_state_supported(void); +void +multifd_spawn_device_state_save_thread(SaveLiveCompletePrecopyThreadHandler hdlr, + char *idstr, uint32_t instance_id, + void *opaque); + +void multifd_abort_device_state_save_threads(void); +int multifd_join_device_state_save_threads(void); + #endif diff --git a/include/migration/register.h b/include/migration/register.h index 58891aa54b76..f63b3ca3fd44 100644 --- a/include/migration/register.h +++ b/include/migration/register.h @@ -105,6 +105,27 @@ typedef struct SaveVMHandlers { */ int (*save_live_complete_precopy)(QEMUFile *f, void *opaque); + /** + * @save_live_complete_precopy_thread (invoked in a separate thread) + * + * Called at the end of a precopy phase from a separate worker thread + * in configurations where multifd device state transfer is supported + * in order to perform asynchronous transmission of the remaining data in + * parallel with @save_live_complete_precopy handlers. + * When postcopy is enabled, devices that support postcopy will skip this + * step. + * + * @idstr: this device section idstr + * @instance_id: this device section instance_id + * @abort_flag: flag indicating that the migration core wants to abort + * the transmission and so the handler should exit ASAP. To be read by + * qatomic_read() or similar. + * @opaque: data pointer passed to register_savevm_live() + * + * Returns zero to indicate success and negative for error + */ + SaveLiveCompletePrecopyThreadHandler save_live_complete_precopy_thread; + /* This runs both outside and inside the BQL. */ /** diff --git a/include/qemu/typedefs.h b/include/qemu/typedefs.h index fd23ff7771b1..76578dee8fd3 100644 --- a/include/qemu/typedefs.h +++ b/include/qemu/typedefs.h @@ -133,5 +133,9 @@ typedef struct IRQState *qemu_irq; typedef void (*qemu_irq_handler)(void *opaque, int n, int level); typedef bool (*MigrationLoadThread)(void *opaque, bool *should_quit, Error **errp); +typedef int (*SaveLiveCompletePrecopyThreadHandler)(char *idstr, + uint32_t instance_id, + bool *abort_flag, + void *opaque); #endif /* QEMU_TYPEDEFS_H */ diff --git a/migration/multifd-device-state.c b/migration/multifd-device-state.c index cee3c44bcf2a..fb09e6a48df9 100644 --- a/migration/multifd-device-state.c +++ b/migration/multifd-device-state.c @@ -9,12 +9,17 @@ #include "qemu/osdep.h" #include "qemu/lockable.h" +#include "block/thread-pool.h" #include "migration/misc.h" #include "multifd.h" #include "options.h" static QemuMutex queue_job_mutex; +static ThreadPool *send_threads; +static int send_threads_ret; +static bool send_threads_abort; + static MultiFDSendData *device_state_send; void multifd_device_state_send_setup(void) @@ -23,10 +28,16 @@ void multifd_device_state_send_setup(void) assert(!device_state_send); device_state_send = multifd_send_data_alloc(); + + assert(!send_threads); + send_threads = thread_pool_new(); + send_threads_ret = 0; + send_threads_abort = false; } void multifd_device_state_send_cleanup(void) { + g_clear_pointer(&send_threads, thread_pool_free); g_clear_pointer(&device_state_send, multifd_send_data_free); qemu_mutex_destroy(&queue_job_mutex); @@ -107,3 +118,75 @@ bool multifd_device_state_supported(void) return migrate_multifd() && !migrate_mapped_ram() && migrate_multifd_compression() == MULTIFD_COMPRESSION_NONE; } + +struct MultiFDDSSaveThreadData { + SaveLiveCompletePrecopyThreadHandler hdlr; + char *idstr; + uint32_t instance_id; + void *handler_opaque; +}; + +static void multifd_device_state_save_thread_data_free(void *opaque) +{ + struct MultiFDDSSaveThreadData *data = opaque; + + g_clear_pointer(&data->idstr, g_free); + g_free(data); +} + +static int multifd_device_state_save_thread(void *opaque) +{ + struct MultiFDDSSaveThreadData *data = opaque; + int ret; + + ret = data->hdlr(data->idstr, data->instance_id, &send_threads_abort, + data->handler_opaque); + if (ret && !qatomic_read(&send_threads_ret)) { + /* + * Racy with the above read but that's okay - which thread error + * return we report is purely arbitrary anyway. + */ + qatomic_set(&send_threads_ret, ret); + } + + return 0; +} + +void +multifd_spawn_device_state_save_thread(SaveLiveCompletePrecopyThreadHandler hdlr, + char *idstr, uint32_t instance_id, + void *opaque) +{ + struct MultiFDDSSaveThreadData *data; + + assert(multifd_device_state_supported()); + + assert(!qatomic_read(&send_threads_abort)); + + data = g_new(struct MultiFDDSSaveThreadData, 1); + data->hdlr = hdlr; + data->idstr = g_strdup(idstr); + data->instance_id = instance_id; + data->handler_opaque = opaque; + + thread_pool_submit_immediate(send_threads, + multifd_device_state_save_thread, + data, + multifd_device_state_save_thread_data_free); +} + +void multifd_abort_device_state_save_threads(void) +{ + assert(multifd_device_state_supported()); + + qatomic_set(&send_threads_abort, true); +} + +int multifd_join_device_state_save_threads(void) +{ + assert(multifd_device_state_supported()); + + thread_pool_wait(send_threads); + + return send_threads_ret; +} diff --git a/migration/savevm.c b/migration/savevm.c index 74d1960de3c6..e47c6c92fe50 100644 --- a/migration/savevm.c +++ b/migration/savevm.c @@ -37,6 +37,7 @@ #include "migration/register.h" #include "migration/global_state.h" #include "migration/channel-block.h" +#include "multifd.h" #include "ram.h" #include "qemu-file.h" #include "savevm.h" @@ -1521,6 +1522,24 @@ int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy) int64_t start_ts_each, end_ts_each; SaveStateEntry *se; int ret; + bool multifd_device_state = multifd_device_state_supported(); + + if (multifd_device_state) { + QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { + SaveLiveCompletePrecopyThreadHandler hdlr; + + if (!se->ops || (in_postcopy && se->ops->has_postcopy && + se->ops->has_postcopy(se->opaque)) || + !se->ops->save_live_complete_precopy_thread) { + continue; + } + + hdlr = se->ops->save_live_complete_precopy_thread; + multifd_spawn_device_state_save_thread(hdlr, + se->idstr, se->instance_id, + se->opaque); + } + } QTAILQ_FOREACH(se, &savevm_state.handlers, entry) { if (!se->ops || @@ -1546,16 +1565,32 @@ int qemu_savevm_state_complete_precopy_iterable(QEMUFile *f, bool in_postcopy) save_section_footer(f, se); if (ret < 0) { qemu_file_set_error(f, ret); - return -1; + goto ret_fail_abort_threads; } end_ts_each = qemu_clock_get_us(QEMU_CLOCK_REALTIME); trace_vmstate_downtime_save("iterable", se->idstr, se->instance_id, end_ts_each - start_ts_each); } + if (multifd_device_state) { + ret = multifd_join_device_state_save_threads(); + if (ret) { + qemu_file_set_error(f, ret); + return -1; + } + } + trace_vmstate_downtime_checkpoint("src-iterable-saved"); return 0; + +ret_fail_abort_threads: + if (multifd_device_state) { + multifd_abort_device_state_save_threads(); + multifd_join_device_state_save_threads(); + } + + return -1; } int qemu_savevm_state_complete_precopy_non_iterable(QEMUFile *f, From patchwork Thu Jan 30 10:08:41 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954443 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 71CA4C0218A for ; Thu, 30 Jan 2025 10:12:09 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRVz-0005cx-BS; Thu, 30 Jan 2025 05:11:15 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVk-0005Bc-VI for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:06 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVi-0007MF-84 for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:10:59 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRVd-00000006U01-3r2a; Thu, 30 Jan 2025 11:10:53 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 20/33] vfio/migration: Add x-migration-load-config-after-iter VFIO property Date: Thu, 30 Jan 2025 11:08:41 +0100 Message-ID: <11f12c43e098ac5e2466e456e8cf8936c54210dc.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" This property allows configuring whether to start the config load only after all iterables were loaded. Such interlocking is required for ARM64 due to this platform VFIO dependency on interrupt controller being loaded first. The property defaults to AUTO, which means ON for ARM, OFF for other platforms. Signed-off-by: Maciej S. Szmigiero --- hw/vfio/migration.c | 25 +++++++++++++++++++++++++ hw/vfio/pci.c | 3 +++ include/hw/vfio/vfio-common.h | 1 + 3 files changed, 29 insertions(+) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index adfa752db527..d801c861d202 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -254,6 +254,31 @@ static int vfio_load_buffer(QEMUFile *f, VFIODevice *vbasedev, return ret; } +static bool vfio_load_config_after_iter(VFIODevice *vbasedev) +{ + if (vbasedev->migration_load_config_after_iter == ON_OFF_AUTO_ON) { + return true; + } else if (vbasedev->migration_load_config_after_iter == ON_OFF_AUTO_OFF) { + return false; + } + + assert(vbasedev->migration_load_config_after_iter == ON_OFF_AUTO_AUTO); + + /* + * Starting the config load only after all iterables were loaded is required + * for ARM64 due to this platform VFIO dependency on interrupt controller + * being loaded first. + * + * See commit d329f5032e17 ("vfio: Move the saving of the config space to + * the right place in VFIO migration"). + */ +#if defined(TARGET_ARM) + return true; +#else + return false; +#endif +} + static int vfio_save_device_config_state(QEMUFile *f, void *opaque, Error **errp) { diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index ab17a98ee5b6..83090c544d95 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3377,6 +3377,9 @@ static const Property vfio_pci_dev_properties[] = { VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false), DEFINE_PROP_ON_OFF_AUTO("enable-migration", VFIOPCIDevice, vbasedev.enable_migration, ON_OFF_AUTO_AUTO), + DEFINE_PROP_ON_OFF_AUTO("x-migration-load-config-after-iter", VFIOPCIDevice, + vbasedev.migration_load_config_after_iter, + ON_OFF_AUTO_AUTO), DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice, vbasedev.migration_events, false), DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false), diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 0c60be5b15c7..153d03745dc7 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -133,6 +133,7 @@ typedef struct VFIODevice { bool no_mmap; bool ram_block_discard_allowed; OnOffAuto enable_migration; + OnOffAuto migration_load_config_after_iter; bool migration_events; VFIODeviceOps *ops; unsigned int num_irqs; From patchwork Thu Jan 30 10:08:42 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954434 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4CE30C0218A for ; Thu, 30 Jan 2025 10:11:40 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRW9-0006HL-4K; Thu, 30 Jan 2025 05:11:25 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVq-0005LO-8B for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:08 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRVm-0007Mj-Mv for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:05 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRVj-00000006U0E-0HN9; Thu, 30 Jan 2025 11:10:59 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 21/33] vfio/migration: Add load_device_config_state_start trace event Date: Thu, 30 Jan 2025 11:08:42 +0100 Message-ID: <5116e9675a0f5894f924bc7088d359767a588bdf.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" And rename existing load_device_config_state trace event to load_device_config_state_end for consistency since it is triggered at the end of loading of the VFIO device config state. This way both the start and end points of particular device config loading operation (a long, BQL-serialized operation) are known. Reviewed-by: Cédric Le Goater Signed-off-by: Maciej S. Szmigiero --- hw/vfio/migration.c | 4 +++- hw/vfio/trace-events | 3 ++- 2 files changed, 5 insertions(+), 2 deletions(-) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index d801c861d202..f5df5ef17080 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -310,6 +310,8 @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque) VFIODevice *vbasedev = opaque; uint64_t data; + trace_vfio_load_device_config_state_start(vbasedev->name); + if (vbasedev->ops && vbasedev->ops->vfio_load_config) { int ret; @@ -328,7 +330,7 @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque) return -EINVAL; } - trace_vfio_load_device_config_state(vbasedev->name); + trace_vfio_load_device_config_state_end(vbasedev->name); return qemu_file_get_error(f); } diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index cab1cf1de0a2..1bebe9877d88 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -149,7 +149,8 @@ vfio_display_edid_write_error(void) "" # migration.c vfio_load_cleanup(const char *name) " (%s)" -vfio_load_device_config_state(const char *name) " (%s)" +vfio_load_device_config_state_start(const char *name) " (%s)" +vfio_load_device_config_state_end(const char *name) " (%s)" vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64 vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d" vfio_migration_realize(const char *name) " (%s)" From patchwork Thu Jan 30 10:08:43 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954463 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C6504C0218A for ; Thu, 30 Jan 2025 10:15:11 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRWA-0006T4-NH; Thu, 30 Jan 2025 05:11:27 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRW3-000696-Cy for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:21 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRW2-0007ND-04 for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:19 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRVo-00000006U0V-0uOq; Thu, 30 Jan 2025 11:11:04 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 22/33] vfio/migration: Convert bytes_transferred counter to atomic Date: Thu, 30 Jan 2025 11:08:43 +0100 Message-ID: <41ce794acd0714c69e212ff43ddea09cd9a47eec.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" So it can be safety accessed from multiple threads. Signed-off-by: Maciej S. Szmigiero --- hw/vfio/migration.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index f5df5ef17080..cbb1e0b6f852 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -416,7 +416,7 @@ static ssize_t vfio_save_block(QEMUFile *f, VFIOMigration *migration) qemu_put_be64(f, VFIO_MIG_FLAG_DEV_DATA_STATE); qemu_put_be64(f, data_size); qemu_put_buffer(f, migration->data_buffer, data_size); - bytes_transferred += data_size; + qatomic_add(&bytes_transferred, data_size); trace_vfio_save_block(migration->vbasedev->name, data_size); @@ -1038,12 +1038,12 @@ static int vfio_block_migration(VFIODevice *vbasedev, Error *err, Error **errp) int64_t vfio_mig_bytes_transferred(void) { - return bytes_transferred; + return qatomic_read(&bytes_transferred); } void vfio_reset_bytes_transferred(void) { - bytes_transferred = 0; + qatomic_set(&bytes_transferred, 0); } /* From patchwork Thu Jan 30 10:08:44 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954452 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D05D8C02198 for ; Thu, 30 Jan 2025 10:14:02 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRWE-000722-Az; Thu, 30 Jan 2025 05:11:30 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWA-0006YK-NY for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:26 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRW7-0007Nh-KJ for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:26 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRVt-00000006U0g-1Zet; Thu, 30 Jan 2025 11:11:09 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 23/33] vfio/migration: Multifd device state transfer support - basic types Date: Thu, 30 Jan 2025 11:08:44 +0100 Message-ID: X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Add basic types and flags used by VFIO multifd device state transfer support. Signed-off-by: Maciej S. Szmigiero --- hw/vfio/migration.c | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index cbb1e0b6f852..715182c4f810 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -47,6 +47,7 @@ #define VFIO_MIG_FLAG_DEV_SETUP_STATE (0xffffffffef100003ULL) #define VFIO_MIG_FLAG_DEV_DATA_STATE (0xffffffffef100004ULL) #define VFIO_MIG_FLAG_DEV_INIT_DATA_SENT (0xffffffffef100005ULL) +#define VFIO_MIG_FLAG_DEV_CONFIG_LOAD_READY (0xffffffffef100006ULL) /* * This is an arbitrary size based on migration of mlx5 devices, where typically @@ -55,6 +56,15 @@ */ #define VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE (1 * MiB) +#define VFIO_DEVICE_STATE_CONFIG_STATE (1) + +typedef struct VFIODeviceStatePacket { + uint32_t version; + uint32_t idx; + uint32_t flags; + uint8_t data[0]; +} QEMU_PACKED VFIODeviceStatePacket; + static int64_t bytes_transferred; static const char *mig_state_to_str(enum vfio_device_mig_state state) From patchwork Thu Jan 30 10:08:45 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954440 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 44654C02194 for ; Thu, 30 Jan 2025 10:12:04 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRWI-0007SS-1X; Thu, 30 Jan 2025 05:11:34 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWE-00078C-Mq for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:30 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWC-0007O3-Sv for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:30 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRVy-00000006U0r-2D3i; Thu, 30 Jan 2025 11:11:14 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 24/33] vfio/migration: Multifd device state transfer support - VFIOStateBuffer(s) Date: Thu, 30 Jan 2025 11:08:45 +0100 Message-ID: <656c5005405aa875c79a190dd0a671993b2408c6.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Add VFIOStateBuffer(s) types and the associated methods. These store received device state buffers and config state waiting to get loaded into the device. Signed-off-by: Maciej S. Szmigiero --- hw/vfio/migration.c | 54 +++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 54 insertions(+) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index 715182c4f810..40cbe1be687d 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -289,6 +289,60 @@ static bool vfio_load_config_after_iter(VFIODevice *vbasedev) #endif } +/* type safety */ +typedef struct VFIOStateBuffers { + GArray *array; +} VFIOStateBuffers; + +typedef struct VFIOStateBuffer { + bool is_present; + char *data; + size_t len; +} VFIOStateBuffer; + +static void vfio_state_buffer_clear(gpointer data) +{ + VFIOStateBuffer *lb = data; + + if (!lb->is_present) { + return; + } + + g_clear_pointer(&lb->data, g_free); + lb->is_present = false; +} + +static void vfio_state_buffers_init(VFIOStateBuffers *bufs) +{ + bufs->array = g_array_new(FALSE, TRUE, sizeof(VFIOStateBuffer)); + g_array_set_clear_func(bufs->array, vfio_state_buffer_clear); +} + +static void vfio_state_buffers_destroy(VFIOStateBuffers *bufs) +{ + g_clear_pointer(&bufs->array, g_array_unref); +} + +static void vfio_state_buffers_assert_init(VFIOStateBuffers *bufs) +{ + assert(bufs->array); +} + +static guint vfio_state_buffers_size_get(VFIOStateBuffers *bufs) +{ + return bufs->array->len; +} + +static void vfio_state_buffers_size_set(VFIOStateBuffers *bufs, guint size) +{ + g_array_set_size(bufs->array, size); +} + +static VFIOStateBuffer *vfio_state_buffers_at(VFIOStateBuffers *bufs, guint idx) +{ + return &g_array_index(bufs->array, VFIOStateBuffer, idx); +} + static int vfio_save_device_config_state(QEMUFile *f, void *opaque, Error **errp) { From patchwork Thu Jan 30 10:08:46 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954438 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BCEB8C0218A for ; Thu, 30 Jan 2025 10:12:03 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRWU-0007uO-1b; Thu, 30 Jan 2025 05:11:49 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWJ-0007ij-OE for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:36 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWI-0007Oq-44 for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:35 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRW3-00000006U11-2tkP; Thu, 30 Jan 2025 11:11:19 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 25/33] vfio/migration: Multifd device state transfer - add support checking function Date: Thu, 30 Jan 2025 11:08:46 +0100 Message-ID: <6895b3d49835cde3fe349fa0b46c9eadb85bb701.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Add vfio_multifd_transfer_supported() function that tells whether the multifd device state transfer is supported. Signed-off-by: Maciej S. Szmigiero --- hw/vfio/migration.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index 40cbe1be687d..3211041939c6 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -519,6 +519,12 @@ static bool vfio_precopy_supported(VFIODevice *vbasedev) return migration->mig_flags & VFIO_MIGRATION_PRE_COPY; } +static bool vfio_multifd_transfer_supported(void) +{ + return multifd_device_state_supported() && + migrate_send_switchover_start(); +} + /* ---------------------------------------------------------------------- */ static int vfio_save_prepare(void *opaque, Error **errp) From patchwork Thu Jan 30 10:08:47 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954450 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3584DC0218A for ; Thu, 30 Jan 2025 10:13:14 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRWh-0008T5-58; Thu, 30 Jan 2025 05:11:59 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWP-0007uM-0l for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:44 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWN-0007PM-8A for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:40 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRW8-00000006U1C-3U7w; Thu, 30 Jan 2025 11:11:24 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 26/33] vfio/migration: Multifd device state transfer support - receive init/cleanup Date: Thu, 30 Jan 2025 11:08:47 +0100 Message-ID: <1fcf182307e8e1f67a3c226e62d26cad3a2f60d0.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Add support for VFIOMultifd data structure that will contain most of the receive-side data together with its init/cleanup methods. Signed-off-by: Maciej S. Szmigiero --- hw/vfio/migration.c | 52 +++++++++++++++++++++++++++++++++-- include/hw/vfio/vfio-common.h | 5 ++++ 2 files changed, 55 insertions(+), 2 deletions(-) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index 3211041939c6..bcdf204d5cf4 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -300,6 +300,9 @@ typedef struct VFIOStateBuffer { size_t len; } VFIOStateBuffer; +typedef struct VFIOMultifd { +} VFIOMultifd; + static void vfio_state_buffer_clear(gpointer data) { VFIOStateBuffer *lb = data; @@ -398,6 +401,18 @@ static int vfio_load_device_config_state(QEMUFile *f, void *opaque) return qemu_file_get_error(f); } +static VFIOMultifd *vfio_multifd_new(void) +{ + VFIOMultifd *multifd = g_new(VFIOMultifd, 1); + + return multifd; +} + +static void vfio_multifd_free(VFIOMultifd *multifd) +{ + g_free(multifd); +} + static void vfio_migration_cleanup(VFIODevice *vbasedev) { VFIOMigration *migration = vbasedev->migration; @@ -785,14 +800,47 @@ static void vfio_save_state(QEMUFile *f, void *opaque) static int vfio_load_setup(QEMUFile *f, void *opaque, Error **errp) { VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; + int ret; + + /* + * Make a copy of this setting at the start in case it is changed + * mid-migration. + */ + if (vbasedev->migration_multifd_transfer == ON_OFF_AUTO_AUTO) { + migration->multifd_transfer = vfio_multifd_transfer_supported(); + } else { + migration->multifd_transfer = + vbasedev->migration_multifd_transfer == ON_OFF_AUTO_ON; + } + + if (migration->multifd_transfer && !vfio_multifd_transfer_supported()) { + error_setg(errp, + "%s: Multifd device transfer requested but unsupported in the current config", + vbasedev->name); + return -EINVAL; + } + + ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, + migration->device_state, errp); + if (ret) { + return ret; + } + + if (migration->multifd_transfer) { + assert(!migration->multifd); + migration->multifd = vfio_multifd_new(); + } - return vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_RESUMING, - vbasedev->migration->device_state, errp); + return 0; } static int vfio_load_cleanup(void *opaque) { VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; + + g_clear_pointer(&migration->multifd, vfio_multifd_free); vfio_migration_cleanup(vbasedev); trace_vfio_load_cleanup(vbasedev->name); diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index 153d03745dc7..c0c9c0b1b263 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -61,6 +61,8 @@ typedef struct VFIORegion { uint8_t nr; /* cache the region number for debug */ } VFIORegion; +typedef struct VFIOMultifd VFIOMultifd; + typedef struct VFIOMigration { struct VFIODevice *vbasedev; VMChangeStateEntry *vm_state; @@ -72,6 +74,8 @@ typedef struct VFIOMigration { uint64_t mig_flags; uint64_t precopy_init_size; uint64_t precopy_dirty_size; + bool multifd_transfer; + VFIOMultifd *multifd; bool initial_data_sent; bool event_save_iterate_started; @@ -133,6 +137,7 @@ typedef struct VFIODevice { bool no_mmap; bool ram_block_discard_allowed; OnOffAuto enable_migration; + OnOffAuto migration_multifd_transfer; OnOffAuto migration_load_config_after_iter; bool migration_events; VFIODeviceOps *ops; From patchwork Thu Jan 30 10:08:48 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954442 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 301FDC02195 for ; Thu, 30 Jan 2025 10:12:06 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRWg-0008Md-G1; Thu, 30 Jan 2025 05:11:58 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWU-00080J-Jo for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:49 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWS-0007Ph-CI for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:46 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRWD-00000006U1M-47i2; Thu, 30 Jan 2025 11:11:29 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 27/33] vfio/migration: Multifd device state transfer support - received buffers queuing Date: Thu, 30 Jan 2025 11:08:48 +0100 Message-ID: <74c4bbaaccd81e883504ae478e84394ddd96bbae.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" The multifd received data needs to be reassembled since device state packets sent via different multifd channels can arrive out-of-order. Therefore, each VFIO device state packet carries a header indicating its position in the stream. The raw device state data is saved into a VFIOStateBuffer for later in-order loading into the device. The last such VFIO device state packet should have VFIO_DEVICE_STATE_CONFIG_STATE flag set and carry the device config state. Signed-off-by: Maciej S. Szmigiero --- hw/vfio/migration.c | 116 ++++++++++++++++++++++++++++++++++ hw/vfio/pci.c | 2 + hw/vfio/trace-events | 1 + include/hw/vfio/vfio-common.h | 1 + 4 files changed, 120 insertions(+) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index bcdf204d5cf4..0c0caec1bd64 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -301,6 +301,12 @@ typedef struct VFIOStateBuffer { } VFIOStateBuffer; typedef struct VFIOMultifd { + VFIOStateBuffers load_bufs; + QemuCond load_bufs_buffer_ready_cond; + QemuMutex load_bufs_mutex; /* Lock order: this lock -> BQL */ + uint32_t load_buf_idx; + uint32_t load_buf_idx_last; + uint32_t load_buf_queued_pending_buffers; } VFIOMultifd; static void vfio_state_buffer_clear(gpointer data) @@ -346,6 +352,103 @@ static VFIOStateBuffer *vfio_state_buffers_at(VFIOStateBuffers *bufs, guint idx) return &g_array_index(bufs->array, VFIOStateBuffer, idx); } +static bool vfio_load_state_buffer_insert(VFIODevice *vbasedev, + VFIODeviceStatePacket *packet, + size_t packet_total_size, + Error **errp) +{ + VFIOMigration *migration = vbasedev->migration; + VFIOMultifd *multifd = migration->multifd; + VFIOStateBuffer *lb; + + vfio_state_buffers_assert_init(&multifd->load_bufs); + if (packet->idx >= vfio_state_buffers_size_get(&multifd->load_bufs)) { + vfio_state_buffers_size_set(&multifd->load_bufs, packet->idx + 1); + } + + lb = vfio_state_buffers_at(&multifd->load_bufs, packet->idx); + if (lb->is_present) { + error_setg(errp, "state buffer %" PRIu32 " already filled", + packet->idx); + return false; + } + + assert(packet->idx >= multifd->load_buf_idx); + + multifd->load_buf_queued_pending_buffers++; + if (multifd->load_buf_queued_pending_buffers > + vbasedev->migration_max_queued_buffers) { + error_setg(errp, + "queuing state buffer %" PRIu32 " would exceed the max of %" PRIu64, + packet->idx, vbasedev->migration_max_queued_buffers); + return false; + } + + lb->data = g_memdup2(&packet->data, packet_total_size - sizeof(*packet)); + lb->len = packet_total_size - sizeof(*packet); + lb->is_present = true; + + return true; +} + +static bool vfio_load_state_buffer(void *opaque, char *data, size_t data_size, + Error **errp) +{ + VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; + VFIOMultifd *multifd = migration->multifd; + VFIODeviceStatePacket *packet = (VFIODeviceStatePacket *)data; + + /* + * Holding BQL here would violate the lock order and can cause + * a deadlock once we attempt to lock load_bufs_mutex below. + */ + assert(!bql_locked()); + + if (!migration->multifd_transfer) { + error_setg(errp, + "got device state packet but not doing multifd transfer"); + return false; + } + + assert(multifd); + + if (data_size < sizeof(*packet)) { + error_setg(errp, "packet too short at %zu (min is %zu)", + data_size, sizeof(*packet)); + return false; + } + + if (packet->version != 0) { + error_setg(errp, "packet has unknown version %" PRIu32, + packet->version); + return false; + } + + if (packet->idx == UINT32_MAX) { + error_setg(errp, "packet has too high idx %" PRIu32, + packet->idx); + return false; + } + + trace_vfio_load_state_device_buffer_incoming(vbasedev->name, packet->idx); + + QEMU_LOCK_GUARD(&multifd->load_bufs_mutex); + + /* config state packet should be the last one in the stream */ + if (packet->flags & VFIO_DEVICE_STATE_CONFIG_STATE) { + multifd->load_buf_idx_last = packet->idx; + } + + if (!vfio_load_state_buffer_insert(vbasedev, packet, data_size, errp)) { + return false; + } + + qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond); + + return true; +} + static int vfio_save_device_config_state(QEMUFile *f, void *opaque, Error **errp) { @@ -405,11 +508,23 @@ static VFIOMultifd *vfio_multifd_new(void) { VFIOMultifd *multifd = g_new(VFIOMultifd, 1); + vfio_state_buffers_init(&multifd->load_bufs); + + qemu_mutex_init(&multifd->load_bufs_mutex); + + multifd->load_buf_idx = 0; + multifd->load_buf_idx_last = UINT32_MAX; + multifd->load_buf_queued_pending_buffers = 0; + qemu_cond_init(&multifd->load_bufs_buffer_ready_cond); + return multifd; } static void vfio_multifd_free(VFIOMultifd *multifd) { + qemu_cond_destroy(&multifd->load_bufs_buffer_ready_cond); + qemu_mutex_destroy(&multifd->load_bufs_mutex); + g_free(multifd); } @@ -940,6 +1055,7 @@ static const SaveVMHandlers savevm_vfio_handlers = { .load_setup = vfio_load_setup, .load_cleanup = vfio_load_cleanup, .load_state = vfio_load_state, + .load_state_buffer = vfio_load_state_buffer, .switchover_ack_needed = vfio_switchover_ack_needed, }; diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 83090c544d95..2700b355ecf1 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3380,6 +3380,8 @@ static const Property vfio_pci_dev_properties[] = { DEFINE_PROP_ON_OFF_AUTO("x-migration-load-config-after-iter", VFIOPCIDevice, vbasedev.migration_load_config_after_iter, ON_OFF_AUTO_AUTO), + DEFINE_PROP_UINT64("x-migration-max-queued-buffers", VFIOPCIDevice, + vbasedev.migration_max_queued_buffers, UINT64_MAX), DEFINE_PROP_BOOL("migration-events", VFIOPCIDevice, vbasedev.migration_events, false), DEFINE_PROP_BOOL("x-no-mmap", VFIOPCIDevice, vbasedev.no_mmap, false), diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 1bebe9877d88..042a3dc54a33 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -153,6 +153,7 @@ vfio_load_device_config_state_start(const char *name) " (%s)" vfio_load_device_config_state_end(const char *name) " (%s)" vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64 vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d" +vfio_load_state_device_buffer_incoming(const char *name, uint32_t idx) " (%s) idx %"PRIu32 vfio_migration_realize(const char *name) " (%s)" vfio_migration_set_device_state(const char *name, const char *state) " (%s) state %s" vfio_migration_set_state(const char *name, const char *new_state, const char *recover_state) " (%s) new state %s, recover state %s" diff --git a/include/hw/vfio/vfio-common.h b/include/hw/vfio/vfio-common.h index c0c9c0b1b263..0e8b0848882e 100644 --- a/include/hw/vfio/vfio-common.h +++ b/include/hw/vfio/vfio-common.h @@ -139,6 +139,7 @@ typedef struct VFIODevice { OnOffAuto enable_migration; OnOffAuto migration_multifd_transfer; OnOffAuto migration_load_config_after_iter; + uint64_t migration_max_queued_buffers; bool migration_events; VFIODeviceOps *ops; unsigned int num_irqs; From patchwork Thu Jan 30 10:08:49 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954453 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id EC87EC02195 for ; Thu, 30 Jan 2025 10:14:05 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRWi-0008Uc-3U; Thu, 30 Jan 2025 05:12:00 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWZ-00086p-4P for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:53 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWX-0007Q1-03 for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:50 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRWJ-00000006U1W-0Z9E; Thu, 30 Jan 2025 11:11:35 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 28/33] vfio/migration: Multifd device state transfer support - load thread Date: Thu, 30 Jan 2025 11:08:49 +0100 Message-ID: <3e1708a19ab743441a4771a8868043887ea4543c.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Since it's important to finish loading device state transferred via the main migration channel (via save_live_iterate SaveVMHandler) before starting loading the data asynchronously transferred via multifd the thread doing the actual loading of the multifd transferred data is only started from switchover_start SaveVMHandler. switchover_start handler is called when MIG_CMD_SWITCHOVER_START sub-command of QEMU_VM_COMMAND is received via the main migration channel. This sub-command is only sent after all save_live_iterate data have already been posted so it is safe to commence loading of the multifd-transferred device state upon receiving it - loading of save_live_iterate data happens synchronously in the main migration thread (much like the processing of MIG_CMD_SWITCHOVER_START) so by the time MIG_CMD_SWITCHOVER_START is processed all the proceeding data must have already been loaded. Signed-off-by: Maciej S. Szmigiero --- hw/vfio/migration.c | 229 +++++++++++++++++++++++++++++++++++++++++++ hw/vfio/trace-events | 5 + 2 files changed, 234 insertions(+) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index 0c0caec1bd64..ab5b097f59c9 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -301,8 +301,16 @@ typedef struct VFIOStateBuffer { } VFIOStateBuffer; typedef struct VFIOMultifd { + QemuThread load_bufs_thread; + bool load_bufs_thread_running; + bool load_bufs_thread_want_exit; + + bool load_bufs_iter_done; + QemuCond load_bufs_iter_done_cond; + VFIOStateBuffers load_bufs; QemuCond load_bufs_buffer_ready_cond; + QemuCond load_bufs_thread_finished_cond; QemuMutex load_bufs_mutex; /* Lock order: this lock -> BQL */ uint32_t load_buf_idx; uint32_t load_buf_idx_last; @@ -449,6 +457,171 @@ static bool vfio_load_state_buffer(void *opaque, char *data, size_t data_size, return true; } +static VFIOStateBuffer *vfio_load_state_buffer_get(VFIOMultifd *multifd) +{ + VFIOStateBuffer *lb; + guint bufs_len; + + bufs_len = vfio_state_buffers_size_get(&multifd->load_bufs); + if (multifd->load_buf_idx >= bufs_len) { + assert(multifd->load_buf_idx == bufs_len); + return NULL; + } + + lb = vfio_state_buffers_at(&multifd->load_bufs, + multifd->load_buf_idx); + if (!lb->is_present) { + return NULL; + } + + return lb; +} + +static int vfio_load_bufs_thread_load_config(VFIODevice *vbasedev) +{ + return -EINVAL; +} + +static bool vfio_load_state_buffer_write(VFIODevice *vbasedev, + VFIOStateBuffer *lb, + Error **errp) +{ + VFIOMigration *migration = vbasedev->migration; + VFIOMultifd *multifd = migration->multifd; + g_autofree char *buf = NULL; + char *buf_cur; + size_t buf_len; + + if (!lb->len) { + return true; + } + + trace_vfio_load_state_device_buffer_load_start(vbasedev->name, + multifd->load_buf_idx); + + /* lb might become re-allocated when we drop the lock */ + buf = g_steal_pointer(&lb->data); + buf_cur = buf; + buf_len = lb->len; + while (buf_len > 0) { + ssize_t wr_ret; + int errno_save; + + /* + * Loading data to the device takes a while, + * drop the lock during this process. + */ + qemu_mutex_unlock(&multifd->load_bufs_mutex); + wr_ret = write(migration->data_fd, buf_cur, buf_len); + errno_save = errno; + qemu_mutex_lock(&multifd->load_bufs_mutex); + + if (wr_ret < 0) { + error_setg(errp, + "writing state buffer %" PRIu32 " failed: %d", + multifd->load_buf_idx, errno_save); + return false; + } + + assert(wr_ret <= buf_len); + buf_len -= wr_ret; + buf_cur += wr_ret; + } + + trace_vfio_load_state_device_buffer_load_end(vbasedev->name, + multifd->load_buf_idx); + + return true; +} + +static bool vfio_load_bufs_thread_want_abort(VFIOMultifd *multifd, + bool *should_quit) +{ + return multifd->load_bufs_thread_want_exit || qatomic_read(should_quit); +} + +static bool vfio_load_bufs_thread(void *opaque, bool *should_quit, Error **errp) +{ + VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; + VFIOMultifd *multifd = migration->multifd; + bool ret = true; + int config_ret; + + assert(multifd); + QEMU_LOCK_GUARD(&multifd->load_bufs_mutex); + + assert(multifd->load_bufs_thread_running); + + while (!vfio_load_bufs_thread_want_abort(multifd, should_quit)) { + VFIOStateBuffer *lb; + + assert(multifd->load_buf_idx <= multifd->load_buf_idx_last); + + lb = vfio_load_state_buffer_get(multifd); + if (!lb) { + trace_vfio_load_state_device_buffer_starved(vbasedev->name, + multifd->load_buf_idx); + qemu_cond_wait(&multifd->load_bufs_buffer_ready_cond, + &multifd->load_bufs_mutex); + continue; + } + + if (multifd->load_buf_idx == multifd->load_buf_idx_last) { + break; + } + + if (multifd->load_buf_idx == 0) { + trace_vfio_load_state_device_buffer_start(vbasedev->name); + } + + if (!vfio_load_state_buffer_write(vbasedev, lb, errp)) { + ret = false; + goto ret_signal; + } + + assert(multifd->load_buf_queued_pending_buffers > 0); + multifd->load_buf_queued_pending_buffers--; + + if (multifd->load_buf_idx == multifd->load_buf_idx_last - 1) { + trace_vfio_load_state_device_buffer_end(vbasedev->name); + } + + multifd->load_buf_idx++; + } + + if (vfio_load_bufs_thread_want_abort(multifd, should_quit)) { + error_setg(errp, "operation cancelled"); + ret = false; + goto ret_signal; + } + + if (vfio_load_config_after_iter(vbasedev)) { + while (!multifd->load_bufs_iter_done) { + qemu_cond_wait(&multifd->load_bufs_iter_done_cond, + &multifd->load_bufs_mutex); + + if (vfio_load_bufs_thread_want_abort(multifd, should_quit)) { + error_setg(errp, "operation cancelled"); + ret = false; + goto ret_signal; + } + } + } + + config_ret = vfio_load_bufs_thread_load_config(vbasedev); + if (config_ret) { + error_setg(errp, "load config state failed: %d", config_ret); + ret = false; + } + +ret_signal: + multifd->load_bufs_thread_running = false; + qemu_cond_signal(&multifd->load_bufs_thread_finished_cond); + + return ret; +} + static int vfio_save_device_config_state(QEMUFile *f, void *opaque, Error **errp) { @@ -517,11 +690,40 @@ static VFIOMultifd *vfio_multifd_new(void) multifd->load_buf_queued_pending_buffers = 0; qemu_cond_init(&multifd->load_bufs_buffer_ready_cond); + multifd->load_bufs_iter_done = false; + qemu_cond_init(&multifd->load_bufs_iter_done_cond); + + multifd->load_bufs_thread_running = false; + multifd->load_bufs_thread_want_exit = false; + qemu_cond_init(&multifd->load_bufs_thread_finished_cond); + return multifd; } +static void vfio_load_cleanup_load_bufs_thread(VFIOMultifd *multifd) +{ + /* The lock order is load_bufs_mutex -> BQL so unlock BQL here first */ + bql_unlock(); + WITH_QEMU_LOCK_GUARD(&multifd->load_bufs_mutex) { + while (multifd->load_bufs_thread_running) { + multifd->load_bufs_thread_want_exit = true; + + qemu_cond_signal(&multifd->load_bufs_buffer_ready_cond); + qemu_cond_signal(&multifd->load_bufs_iter_done_cond); + qemu_cond_wait(&multifd->load_bufs_thread_finished_cond, + &multifd->load_bufs_mutex); + } + } + bql_lock(); +} + static void vfio_multifd_free(VFIOMultifd *multifd) { + vfio_load_cleanup_load_bufs_thread(multifd); + + qemu_cond_destroy(&multifd->load_bufs_thread_finished_cond); + qemu_cond_destroy(&multifd->load_bufs_iter_done_cond); + vfio_state_buffers_destroy(&multifd->load_bufs); qemu_cond_destroy(&multifd->load_bufs_buffer_ready_cond); qemu_mutex_destroy(&multifd->load_bufs_mutex); @@ -1042,6 +1244,32 @@ static bool vfio_switchover_ack_needed(void *opaque) return vfio_precopy_supported(vbasedev); } +static int vfio_switchover_start(void *opaque) +{ + VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; + VFIOMultifd *multifd = migration->multifd; + + if (!migration->multifd_transfer) { + /* Load thread is only used for multifd transfer */ + return 0; + } + + assert(multifd); + + /* The lock order is load_bufs_mutex -> BQL so unlock BQL here first */ + bql_unlock(); + WITH_QEMU_LOCK_GUARD(&multifd->load_bufs_mutex) { + assert(!multifd->load_bufs_thread_running); + multifd->load_bufs_thread_running = true; + } + bql_lock(); + + qemu_loadvm_start_load_thread(vfio_load_bufs_thread, vbasedev); + + return 0; +} + static const SaveVMHandlers savevm_vfio_handlers = { .save_prepare = vfio_save_prepare, .save_setup = vfio_save_setup, @@ -1057,6 +1285,7 @@ static const SaveVMHandlers savevm_vfio_handlers = { .load_state = vfio_load_state, .load_state_buffer = vfio_load_state_buffer, .switchover_ack_needed = vfio_switchover_ack_needed, + .switchover_start = vfio_switchover_start, }; /* ---------------------------------------------------------------------- */ diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 042a3dc54a33..418b378ebd29 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -154,6 +154,11 @@ vfio_load_device_config_state_end(const char *name) " (%s)" vfio_load_state(const char *name, uint64_t data) " (%s) data 0x%"PRIx64 vfio_load_state_device_data(const char *name, uint64_t data_size, int ret) " (%s) size %"PRIu64" ret %d" vfio_load_state_device_buffer_incoming(const char *name, uint32_t idx) " (%s) idx %"PRIu32 +vfio_load_state_device_buffer_start(const char *name) " (%s)" +vfio_load_state_device_buffer_starved(const char *name, uint32_t idx) " (%s) idx %"PRIu32 +vfio_load_state_device_buffer_load_start(const char *name, uint32_t idx) " (%s) idx %"PRIu32 +vfio_load_state_device_buffer_load_end(const char *name, uint32_t idx) " (%s) idx %"PRIu32 +vfio_load_state_device_buffer_end(const char *name) " (%s)" vfio_migration_realize(const char *name) " (%s)" vfio_migration_set_device_state(const char *name, const char *state) " (%s) state %s" vfio_migration_set_state(const char *name, const char *new_state, const char *recover_state) " (%s) new state %s, recover state %s" From patchwork Thu Jan 30 10:08:50 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954445 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 3B721C02190 for ; Thu, 30 Jan 2025 10:12:20 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRWj-0000SN-Vn; Thu, 30 Jan 2025 05:12:02 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWe-0008Jt-FE for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:56 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWc-0007QN-O0 for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:11:56 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRWO-00000006U1g-1BUc; Thu, 30 Jan 2025 11:11:40 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 29/33] vfio/migration: Multifd device state transfer support - config loading support Date: Thu, 30 Jan 2025 11:08:50 +0100 Message-ID: X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Load device config received via multifd using the existing machinery behind vfio_load_device_config_state(). Also, make sure to process the relevant main migration channel flags. Signed-off-by: Maciej S. Szmigiero --- hw/vfio/migration.c | 103 +++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 98 insertions(+), 5 deletions(-) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index ab5b097f59c9..31f651ffee85 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -15,6 +15,7 @@ #include #include +#include "io/channel-buffer.h" #include "system/runstate.h" #include "hw/vfio/vfio-common.h" #include "migration/misc.h" @@ -457,6 +458,57 @@ static bool vfio_load_state_buffer(void *opaque, char *data, size_t data_size, return true; } +static int vfio_load_device_config_state(QEMUFile *f, void *opaque); + +static int vfio_load_bufs_thread_load_config(VFIODevice *vbasedev) +{ + VFIOMigration *migration = vbasedev->migration; + VFIOMultifd *multifd = migration->multifd; + VFIOStateBuffer *lb; + g_autoptr(QIOChannelBuffer) bioc = NULL; + QEMUFile *f_out = NULL, *f_in = NULL; + uint64_t mig_header; + int ret; + + assert(multifd->load_buf_idx == multifd->load_buf_idx_last); + lb = vfio_state_buffers_at(&multifd->load_bufs, multifd->load_buf_idx); + assert(lb->is_present); + + bioc = qio_channel_buffer_new(lb->len); + qio_channel_set_name(QIO_CHANNEL(bioc), "vfio-device-config-load"); + + f_out = qemu_file_new_output(QIO_CHANNEL(bioc)); + qemu_put_buffer(f_out, (uint8_t *)lb->data, lb->len); + + ret = qemu_fflush(f_out); + if (ret) { + g_clear_pointer(&f_out, qemu_fclose); + return ret; + } + + qio_channel_io_seek(QIO_CHANNEL(bioc), 0, 0, NULL); + f_in = qemu_file_new_input(QIO_CHANNEL(bioc)); + + mig_header = qemu_get_be64(f_in); + if (mig_header != VFIO_MIG_FLAG_DEV_CONFIG_STATE) { + g_clear_pointer(&f_out, qemu_fclose); + g_clear_pointer(&f_in, qemu_fclose); + return -EINVAL; + } + + bql_lock(); + ret = vfio_load_device_config_state(f_in, vbasedev); + bql_unlock(); + + g_clear_pointer(&f_out, qemu_fclose); + g_clear_pointer(&f_in, qemu_fclose); + if (ret < 0) { + return ret; + } + + return 0; +} + static VFIOStateBuffer *vfio_load_state_buffer_get(VFIOMultifd *multifd) { VFIOStateBuffer *lb; @@ -477,11 +529,6 @@ static VFIOStateBuffer *vfio_load_state_buffer_get(VFIOMultifd *multifd) return lb; } -static int vfio_load_bufs_thread_load_config(VFIODevice *vbasedev) -{ - return -EINVAL; -} - static bool vfio_load_state_buffer_write(VFIODevice *vbasedev, VFIOStateBuffer *lb, Error **errp) @@ -1168,6 +1215,8 @@ static int vfio_load_cleanup(void *opaque) static int vfio_load_state(QEMUFile *f, void *opaque, int version_id) { VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; + VFIOMultifd *multifd = migration->multifd; int ret = 0; uint64_t data; @@ -1179,6 +1228,12 @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id) switch (data) { case VFIO_MIG_FLAG_DEV_CONFIG_STATE: { + if (migration->multifd_transfer) { + error_report("%s: got DEV_CONFIG_STATE but doing multifd transfer", + vbasedev->name); + return -EINVAL; + } + return vfio_load_device_config_state(f, opaque); } case VFIO_MIG_FLAG_DEV_SETUP_STATE: @@ -1223,6 +1278,44 @@ static int vfio_load_state(QEMUFile *f, void *opaque, int version_id) return ret; } + case VFIO_MIG_FLAG_DEV_CONFIG_LOAD_READY: + { + if (!migration->multifd_transfer) { + error_report("%s: got DEV_CONFIG_LOAD_READY outside multifd transfer", + vbasedev->name); + return -EINVAL; + } + + if (!vfio_load_config_after_iter(vbasedev)) { + error_report("%s: got DEV_CONFIG_LOAD_READY but was disabled", + vbasedev->name); + return -EINVAL; + } + + assert(multifd); + + /* The lock order is load_bufs_mutex -> BQL so unlock BQL here first */ + bql_unlock(); + WITH_QEMU_LOCK_GUARD(&multifd->load_bufs_mutex) { + if (multifd->load_bufs_iter_done) { + /* Can't print error here as we're outside BQL */ + ret = -EINVAL; + break; + } + + multifd->load_bufs_iter_done = true; + qemu_cond_signal(&multifd->load_bufs_iter_done_cond); + + ret = 0; + } + bql_lock(); + + if (ret) { + error_report("%s: duplicate DEV_CONFIG_LOAD_READY", + vbasedev->name); + } + return ret; + } default: error_report("%s: Unknown tag 0x%"PRIx64, vbasedev->name, data); return -EINVAL; From patchwork Thu Jan 30 10:08:51 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954451 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6E388C0218A for ; Thu, 30 Jan 2025 10:13:53 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRWo-00012i-AB; Thu, 30 Jan 2025 05:12:06 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWi-0000CO-Cz for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:12:00 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWg-0007Qj-Pe for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:12:00 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRWT-00000006U1v-1p7J; Thu, 30 Jan 2025 11:11:45 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 30/33] migration/qemu-file: Define g_autoptr() cleanup function for QEMUFile Date: Thu, 30 Jan 2025 11:08:51 +0100 Message-ID: <479b0232aa776e16f9d38300bb7933d361dac77a.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Automatic memory management helps avoid memory safety issues. Reviewed-by: Fabiano Rosas Reviewed-by: Peter Xu Signed-off-by: Maciej S. Szmigiero --- migration/qemu-file.h | 2 ++ 1 file changed, 2 insertions(+) diff --git a/migration/qemu-file.h b/migration/qemu-file.h index 11c2120edd72..fdf21324df07 100644 --- a/migration/qemu-file.h +++ b/migration/qemu-file.h @@ -33,6 +33,8 @@ QEMUFile *qemu_file_new_input(QIOChannel *ioc); QEMUFile *qemu_file_new_output(QIOChannel *ioc); int qemu_fclose(QEMUFile *f); +G_DEFINE_AUTOPTR_CLEANUP_FUNC(QEMUFile, qemu_fclose) + /* * qemu_file_transferred: * From patchwork Thu Jan 30 10:08:52 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954462 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8BCFFC0218A for ; Thu, 30 Jan 2025 10:14:43 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRWt-0001mE-AK; Thu, 30 Jan 2025 05:12:11 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWq-0001VE-5M for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:12:08 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWo-0007RQ-0y for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:12:07 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRWY-00000006U26-2Sex; Thu, 30 Jan 2025 11:11:50 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 31/33] vfio/migration: Multifd device state transfer support - send side Date: Thu, 30 Jan 2025 11:08:52 +0100 Message-ID: <6c79a4f61f0b8bee30cda8242af7f51856392051.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Implement the multifd device state transfer via additional per-device thread inside save_live_complete_precopy_thread handler. Switch between doing the data transfer in the new handler and doing it in the old save_state handler depending on the x-migration-multifd-transfer device property value. Signed-off-by: Maciej S. Szmigiero --- hw/vfio/migration.c | 159 +++++++++++++++++++++++++++++++++++++++++++ hw/vfio/trace-events | 2 + 2 files changed, 161 insertions(+) diff --git a/hw/vfio/migration.c b/hw/vfio/migration.c index 31f651ffee85..37d1c0f3d32f 100644 --- a/hw/vfio/migration.c +++ b/hw/vfio/migration.c @@ -943,6 +943,24 @@ static int vfio_save_setup(QEMUFile *f, void *opaque, Error **errp) uint64_t stop_copy_size = VFIO_MIG_DEFAULT_DATA_BUFFER_SIZE; int ret; + /* + * Make a copy of this setting at the start in case it is changed + * mid-migration. + */ + if (vbasedev->migration_multifd_transfer == ON_OFF_AUTO_AUTO) { + migration->multifd_transfer = vfio_multifd_transfer_supported(); + } else { + migration->multifd_transfer = + vbasedev->migration_multifd_transfer == ON_OFF_AUTO_ON; + } + + if (migration->multifd_transfer && !vfio_multifd_transfer_supported()) { + error_setg(errp, + "%s: Multifd device transfer requested but unsupported in the current config", + vbasedev->name); + return -EINVAL; + } + qemu_put_be64(f, VFIO_MIG_FLAG_DEV_SETUP_STATE); vfio_query_stop_copy_size(vbasedev, &stop_copy_size); @@ -1114,13 +1132,32 @@ static int vfio_save_iterate(QEMUFile *f, void *opaque) return !migration->precopy_init_size && !migration->precopy_dirty_size; } +static void vfio_save_multifd_emit_dummy_eos(VFIODevice *vbasedev, QEMUFile *f) +{ + VFIOMigration *migration = vbasedev->migration; + + assert(migration->multifd_transfer); + + /* + * Emit dummy NOP data on the main migration channel since the actual + * device state transfer is done via multifd channels. + */ + qemu_put_be64(f, VFIO_MIG_FLAG_END_OF_STATE); +} + static int vfio_save_complete_precopy(QEMUFile *f, void *opaque) { VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; ssize_t data_size; int ret; Error *local_err = NULL; + if (migration->multifd_transfer) { + vfio_save_multifd_emit_dummy_eos(vbasedev, f); + return 0; + } + trace_vfio_save_complete_precopy_start(vbasedev->name); /* We reach here with device state STOP or STOP_COPY only */ @@ -1146,12 +1183,133 @@ static int vfio_save_complete_precopy(QEMUFile *f, void *opaque) return ret; } +static int +vfio_save_complete_precopy_async_thread_config_state(VFIODevice *vbasedev, + char *idstr, + uint32_t instance_id, + uint32_t idx) +{ + g_autoptr(QIOChannelBuffer) bioc = NULL; + g_autoptr(QEMUFile) f = NULL; + int ret; + g_autofree VFIODeviceStatePacket *packet = NULL; + size_t packet_len; + + bioc = qio_channel_buffer_new(0); + qio_channel_set_name(QIO_CHANNEL(bioc), "vfio-device-config-save"); + + f = qemu_file_new_output(QIO_CHANNEL(bioc)); + + ret = vfio_save_device_config_state(f, vbasedev, NULL); + if (ret) { + return ret; + } + + ret = qemu_fflush(f); + if (ret) { + return ret; + } + + packet_len = sizeof(*packet) + bioc->usage; + packet = g_malloc0(packet_len); + packet->idx = idx; + packet->flags = VFIO_DEVICE_STATE_CONFIG_STATE; + memcpy(&packet->data, bioc->data, bioc->usage); + + if (!multifd_queue_device_state(idstr, instance_id, + (char *)packet, packet_len)) { + return -1; + } + + qatomic_add(&bytes_transferred, packet_len); + + return 0; +} + +static int vfio_save_complete_precopy_thread(char *idstr, + uint32_t instance_id, + bool *abort_flag, + void *opaque) +{ + VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; + int ret; + g_autofree VFIODeviceStatePacket *packet = NULL; + uint32_t idx; + + if (!migration->multifd_transfer) { + /* Nothing to do, vfio_save_complete_precopy() does the transfer. */ + return 0; + } + + trace_vfio_save_complete_precopy_thread_start(vbasedev->name, + idstr, instance_id); + + /* We reach here with device state STOP or STOP_COPY only */ + ret = vfio_migration_set_state(vbasedev, VFIO_DEVICE_STATE_STOP_COPY, + VFIO_DEVICE_STATE_STOP, NULL); + if (ret) { + goto ret_finish; + } + + packet = g_malloc0(sizeof(*packet) + migration->data_buffer_size); + + for (idx = 0; ; idx++) { + ssize_t data_size; + size_t packet_size; + + if (qatomic_read(abort_flag)) { + ret = -ECANCELED; + goto ret_finish; + } + + data_size = read(migration->data_fd, &packet->data, + migration->data_buffer_size); + if (data_size < 0) { + ret = -errno; + goto ret_finish; + } else if (data_size == 0) { + break; + } + + packet->idx = idx; + packet_size = sizeof(*packet) + data_size; + + if (!multifd_queue_device_state(idstr, instance_id, + (char *)packet, packet_size)) { + ret = -1; + goto ret_finish; + } + + qatomic_add(&bytes_transferred, packet_size); + } + + ret = vfio_save_complete_precopy_async_thread_config_state(vbasedev, idstr, + instance_id, + idx); + +ret_finish: + trace_vfio_save_complete_precopy_thread_end(vbasedev->name, ret); + + return ret; +} + static void vfio_save_state(QEMUFile *f, void *opaque) { VFIODevice *vbasedev = opaque; + VFIOMigration *migration = vbasedev->migration; Error *local_err = NULL; int ret; + if (migration->multifd_transfer) { + if (vfio_load_config_after_iter(vbasedev)) { + qemu_put_be64(f, VFIO_MIG_FLAG_DEV_CONFIG_LOAD_READY); + } else { + vfio_save_multifd_emit_dummy_eos(vbasedev, f); + } + return; + } + ret = vfio_save_device_config_state(f, opaque, &local_err); if (ret) { error_prepend(&local_err, @@ -1372,6 +1530,7 @@ static const SaveVMHandlers savevm_vfio_handlers = { .is_active_iterate = vfio_is_active_iterate, .save_live_iterate = vfio_save_iterate, .save_live_complete_precopy = vfio_save_complete_precopy, + .save_live_complete_precopy_thread = vfio_save_complete_precopy_thread, .save_state = vfio_save_state, .load_setup = vfio_load_setup, .load_cleanup = vfio_load_cleanup, diff --git a/hw/vfio/trace-events b/hw/vfio/trace-events index 418b378ebd29..039979bdd98f 100644 --- a/hw/vfio/trace-events +++ b/hw/vfio/trace-events @@ -168,6 +168,8 @@ vfio_save_block_precopy_empty_hit(const char *name) " (%s)" vfio_save_cleanup(const char *name) " (%s)" vfio_save_complete_precopy(const char *name, int ret) " (%s) ret %d" vfio_save_complete_precopy_start(const char *name) " (%s)" +vfio_save_complete_precopy_thread_start(const char *name, const char *idstr, uint32_t instance_id) " (%s) idstr %s instance %"PRIu32 +vfio_save_complete_precopy_thread_end(const char *name, int ret) " (%s) ret %d" vfio_save_device_config_state(const char *name) " (%s)" vfio_save_iterate(const char *name, uint64_t precopy_init_size, uint64_t precopy_dirty_size) " (%s) precopy initial size %"PRIu64" precopy dirty size %"PRIu64 vfio_save_iterate_start(const char *name) " (%s)" From patchwork Thu Jan 30 10:08:53 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954449 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 7F40DC0218A for ; Thu, 30 Jan 2025 10:13:09 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRWv-00022W-CS; Thu, 30 Jan 2025 05:12:13 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWs-0001l0-Oq for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:12:10 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWr-0007Rq-5F for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:12:10 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRWd-00000006U2G-345M; Thu, 30 Jan 2025 11:11:55 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 32/33] vfio/migration: Add x-migration-multifd-transfer VFIO property Date: Thu, 30 Jan 2025 11:08:53 +0100 Message-ID: X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" This property allows configuring at runtime whether to transfer the particular device state via multifd channels when live migrating that device. It defaults to AUTO, which means that VFIO device state transfer via multifd channels is attempted in configurations that otherwise support it. Signed-off-by: Maciej S. Szmigiero --- hw/vfio/pci.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/hw/vfio/pci.c b/hw/vfio/pci.c index 2700b355ecf1..cd24f386aaf9 100644 --- a/hw/vfio/pci.c +++ b/hw/vfio/pci.c @@ -3353,6 +3353,8 @@ static void vfio_instance_init(Object *obj) pci_dev->cap_present |= QEMU_PCI_CAP_EXPRESS; } +static PropertyInfo qdev_prop_on_off_auto_mutable; + static const Property vfio_pci_dev_properties[] = { DEFINE_PROP_PCI_HOST_DEVADDR("host", VFIOPCIDevice, host), DEFINE_PROP_UUID_NODEFAULT("vf-token", VFIOPCIDevice, vf_token), @@ -3377,6 +3379,10 @@ static const Property vfio_pci_dev_properties[] = { VFIO_FEATURE_ENABLE_IGD_OPREGION_BIT, false), DEFINE_PROP_ON_OFF_AUTO("enable-migration", VFIOPCIDevice, vbasedev.enable_migration, ON_OFF_AUTO_AUTO), + DEFINE_PROP("x-migration-multifd-transfer", VFIOPCIDevice, + vbasedev.migration_multifd_transfer, + qdev_prop_on_off_auto_mutable, OnOffAuto, + .set_default = true, .defval.i = ON_OFF_AUTO_AUTO), DEFINE_PROP_ON_OFF_AUTO("x-migration-load-config-after-iter", VFIOPCIDevice, vbasedev.migration_load_config_after_iter, ON_OFF_AUTO_AUTO), @@ -3477,6 +3483,9 @@ static const TypeInfo vfio_pci_nohotplug_dev_info = { static void register_vfio_pci_dev_type(void) { + qdev_prop_on_off_auto_mutable = qdev_prop_on_off_auto; + qdev_prop_on_off_auto_mutable.realized_set_allowed = true; + type_register_static(&vfio_pci_dev_info); type_register_static(&vfio_pci_nohotplug_dev_info); } From patchwork Thu Jan 30 10:08:54 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Maciej S. Szmigiero" X-Patchwork-Id: 13954447 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 844DEC0218A for ; Thu, 30 Jan 2025 10:12:25 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1tdRWz-0002Xy-Gk; Thu, 30 Jan 2025 05:12:18 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWw-0002HM-Kx for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:12:14 -0500 Received: from vps-ovh.mhejs.net ([145.239.82.108]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1tdRWv-0007S9-3T for qemu-devel@nongnu.org; Thu, 30 Jan 2025 05:12:14 -0500 Received: from MUA by vps-ovh.mhejs.net with esmtpsa (TLS1.3) tls TLS_AES_256_GCM_SHA384 (Exim 4.98) (envelope-from ) id 1tdRWi-00000006U2S-3il1; Thu, 30 Jan 2025 11:12:00 +0100 From: "Maciej S. Szmigiero" To: Peter Xu , Fabiano Rosas Cc: Alex Williamson , =?utf-8?q?C=C3=A9dric_Le_G?= =?utf-8?q?oater?= , Eric Blake , Markus Armbruster , =?utf-8?q?Daniel_P_=2E_Berrang=C3=A9?= , Avihai Horon , Joao Martins , qemu-devel@nongnu.org Subject: [PATCH v4 33/33] hw/core/machine: Add compat for x-migration-multifd-transfer VFIO property Date: Thu, 30 Jan 2025 11:08:54 +0100 Message-ID: <049e09d1fd5ad3523ba23b37321b6171215ddeb8.1738171076.git.maciej.szmigiero@oracle.com> X-Mailer: git-send-email 2.48.1 In-Reply-To: References: MIME-Version: 1.0 Received-SPF: pass client-ip=145.239.82.108; envelope-from=mhej@vps-ovh.mhejs.net; helo=vps-ovh.mhejs.net X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, HEADER_FROM_DIFFERENT_DOMAINS=0.037, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org From: "Maciej S. Szmigiero" Add a hw_compat entry for recently added x-migration-multifd-transfer VFIO property. Signed-off-by: Maciej S. Szmigiero --- hw/core/machine.c | 1 + 1 file changed, 1 insertion(+) diff --git a/hw/core/machine.c b/hw/core/machine.c index c2964503c5bd..3f06ea945859 100644 --- a/hw/core/machine.c +++ b/hw/core/machine.c @@ -39,6 +39,7 @@ GlobalProperty hw_compat_9_2[] = { {"arm-cpu", "backcompat-pauth-default-use-qarma5", "true"}, { "migration", "send-switchover-start", "off"}, + { "vfio-pci", "x-migration-multifd-transfer", "off" }, }; const size_t hw_compat_9_2_len = G_N_ELEMENTS(hw_compat_9_2);