From patchwork Thu Oct 19 13:19:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fiona Ebner X-Patchwork-Id: 13429166 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0BE8ECDB483 for ; Thu, 19 Oct 2023 13:22:21 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qtSyP-0005wo-Cn; Thu, 19 Oct 2023 09:22:02 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qtSwQ-0004UM-C1; Thu, 19 Oct 2023 09:20:05 -0400 Received: from proxmox-new.maurer-it.com ([94.136.29.106]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qtSwO-0004Ge-OD; Thu, 19 Oct 2023 09:19:58 -0400 Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 40EE4430E8; Thu, 19 Oct 2023 15:19:41 +0200 (CEST) From: Fiona Ebner To: qemu-devel@nongnu.org Cc: qemu-block@nongnu.org, vsementsov@yandex-team.ru, jsnow@redhat.com, hreitz@redhat.com, kwolf@redhat.com, pbonzini@redhat.com, t.lamprecht@proxmox.com Subject: [PATCH 1/3] blockjob: drop AioContext lock before calling bdrv_graph_wrlock() Date: Thu, 19 Oct 2023 15:19:34 +0200 Message-Id: <20231019131936.414246-2-f.ebner@proxmox.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231019131936.414246-1-f.ebner@proxmox.com> References: <20231019131936.414246-1-f.ebner@proxmox.com> MIME-Version: 1.0 Received-SPF: pass client-ip=94.136.29.106; envelope-from=f.ebner@proxmox.com; helo=proxmox-new.maurer-it.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Same rationale as in 31b2ddfea3 ("graph-lock: Unlock the AioContext while polling"). Otherwise, a deadlock can happen. The alternative would be to pass a BlockDriverState along to bdrv_graph_wrlock(), but there is no BlockDriverState readily available and it's also better conceptually, because the lock is held for the job. The function is always called with the job's AioContext lock held, via one of the .abort, .clean, .free or .prepare job driver functions. Thus, it's safe to drop it. While mirror_exit_common() does hold a second AioContext lock while calling block_job_remove_all_bdrv(), that is for the main thread's AioContext and does not need to be dropped (bdrv_graph_wrlock(bs) also skips dropping the lock if bdrv_get_aio_context(bs) == qemu_get_aio_context()). Suggested-by: Paolo Bonzini Signed-off-by: Fiona Ebner --- blockjob.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/blockjob.c b/blockjob.c index 807f992b59..953dc1b6dc 100644 --- a/blockjob.c +++ b/blockjob.c @@ -198,7 +198,9 @@ void block_job_remove_all_bdrv(BlockJob *job) * one to make sure that such a concurrent access does not attempt * to process an already freed BdrvChild. */ + aio_context_release(job->job.aio_context); bdrv_graph_wrlock(NULL); + aio_context_acquire(job->job.aio_context); while (job->nodes) { GSList *l = job->nodes; BdrvChild *c = l->data; From patchwork Thu Oct 19 13:19:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fiona Ebner X-Patchwork-Id: 13429165 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id EB5ABCDB482 for ; Thu, 19 Oct 2023 13:22:20 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qtSy4-0005ks-BX; Thu, 19 Oct 2023 09:21:42 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qtSwQ-0004UI-9f; Thu, 19 Oct 2023 09:20:05 -0400 Received: from proxmox-new.maurer-it.com ([94.136.29.106]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qtSwM-0004Gc-Vt; Thu, 19 Oct 2023 09:19:58 -0400 Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 4047E430E7; Thu, 19 Oct 2023 15:19:41 +0200 (CEST) From: Fiona Ebner To: qemu-devel@nongnu.org Cc: qemu-block@nongnu.org, vsementsov@yandex-team.ru, jsnow@redhat.com, hreitz@redhat.com, kwolf@redhat.com, pbonzini@redhat.com, t.lamprecht@proxmox.com Subject: [PATCH 2/3] block: avoid potential deadlock during bdrv_graph_wrlock() in bdrv_close() Date: Thu, 19 Oct 2023 15:19:35 +0200 Message-Id: <20231019131936.414246-3-f.ebner@proxmox.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231019131936.414246-1-f.ebner@proxmox.com> References: <20231019131936.414246-1-f.ebner@proxmox.com> MIME-Version: 1.0 Received-SPF: pass client-ip=94.136.29.106; envelope-from=f.ebner@proxmox.com; helo=proxmox-new.maurer-it.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org by passing the BlockDriverState along, so the held AioContext can be dropped before polling. See commit 31b2ddfea3 ("graph-lock: Unlock the AioContext while polling") which introduced this functionality for more information. The only way to reach bdrv_close() is via bdrv_unref() and for calling that the BlockDriverState's AioContext lock is supposed to be held. Signed-off-by: Fiona Ebner --- Is it safe to expect callers of bdrv_unref() to hold the appropriate AioContext lock? block.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block.c b/block.c index f9cf05ddcf..a527aa1a4c 100644 --- a/block.c +++ b/block.c @@ -5200,7 +5200,7 @@ static void bdrv_close(BlockDriverState *bs) bs->drv = NULL; } - bdrv_graph_wrlock(NULL); + bdrv_graph_wrlock(bs); QLIST_FOREACH_SAFE(child, &bs->children, next, next) { bdrv_unref_child(bs, child); } From patchwork Thu Oct 19 13:19:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Fiona Ebner X-Patchwork-Id: 13429164 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 00A2AC41513 for ; Thu, 19 Oct 2023 13:22:20 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qtSyB-0005rM-Ge; Thu, 19 Oct 2023 09:21:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qtSwR-0004VY-BJ; Thu, 19 Oct 2023 09:20:05 -0400 Received: from proxmox-new.maurer-it.com ([94.136.29.106]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qtSwO-0004Gg-OL; Thu, 19 Oct 2023 09:19:59 -0400 Received: from proxmox-new.maurer-it.com (localhost.localdomain [127.0.0.1]) by proxmox-new.maurer-it.com (Proxmox) with ESMTP id 6CAA3430DF; Thu, 19 Oct 2023 15:19:41 +0200 (CEST) From: Fiona Ebner To: qemu-devel@nongnu.org Cc: qemu-block@nongnu.org, vsementsov@yandex-team.ru, jsnow@redhat.com, hreitz@redhat.com, kwolf@redhat.com, pbonzini@redhat.com, t.lamprecht@proxmox.com Subject: [PATCH 3/3] blockdev: mirror: avoid potential deadlock when using iothread Date: Thu, 19 Oct 2023 15:19:36 +0200 Message-Id: <20231019131936.414246-4-f.ebner@proxmox.com> X-Mailer: git-send-email 2.39.2 In-Reply-To: <20231019131936.414246-1-f.ebner@proxmox.com> References: <20231019131936.414246-1-f.ebner@proxmox.com> MIME-Version: 1.0 Received-SPF: pass client-ip=94.136.29.106; envelope-from=f.ebner@proxmox.com; helo=proxmox-new.maurer-it.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org The bdrv_getlength() function is a generated co-wrapper and uses AIO_WAIT_WHILE() to wait for the spawned coroutine. AIO_WAIT_WHILE() expects the lock to be acquired exactly once. This can happen when the source node is explicitly specified as the @replaces parameter or if there is a filter on top of the source node. Signed-off-by: Fiona Ebner --- blockdev.c | 14 ++++++++++++-- 1 file changed, 12 insertions(+), 2 deletions(-) diff --git a/blockdev.c b/blockdev.c index a01c62596b..877e3a26d4 100644 --- a/blockdev.c +++ b/blockdev.c @@ -2968,6 +2968,7 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs, if (replaces) { BlockDriverState *to_replace_bs; + AioContext *aio_context; AioContext *replace_aio_context; int64_t bs_size, replace_size; @@ -2982,10 +2983,19 @@ static void blockdev_mirror_common(const char *job_id, BlockDriverState *bs, return; } + aio_context = bdrv_get_aio_context(bs); replace_aio_context = bdrv_get_aio_context(to_replace_bs); - aio_context_acquire(replace_aio_context); + /* + * bdrv_getlength() is a co-wrapper and uses AIO_WAIT_WHILE. Be sure not + * to acquire the same AioContext twice. + */ + if (replace_aio_context != aio_context) { + aio_context_acquire(replace_aio_context); + } replace_size = bdrv_getlength(to_replace_bs); - aio_context_release(replace_aio_context); + if (replace_aio_context != aio_context) { + aio_context_release(replace_aio_context); + } if (replace_size < 0) { error_setg_errno(errp, -replace_size,