From patchwork Mon Mar 4 10:05:50 2024
X-Patchwork-Submitter: Mattias Nissler
X-Patchwork-Id: 13580339
From: Mattias Nissler
To: stefanha@redhat.com, jag.raman@oracle.com, peterx@redhat.com, qemu-devel@nongnu.org
Cc: Philippe Mathieu-Daudé, john.levon@nutanix.com, David Hildenbrand, "Michael S. Tsirkin", Paolo Bonzini, Richard Henderson, Marcel Apfelbaum, Elena Ufimtseva, Mattias Nissler, Jonathan Cameron
Subject: [PATCH v8 1/5] softmmu: Per-AddressSpace bounce buffering
Date: Mon, 4 Mar 2024 02:05:50 -0800
Message-Id: <20240304100554.1143763-2-mnissler@rivosinc.com>
In-Reply-To: <20240304100554.1143763-1-mnissler@rivosinc.com>
References: <20240304100554.1143763-1-mnissler@rivosinc.com>

Instead of using a single global bounce buffer, give each AddressSpace
its own bounce buffer. The MapClient callback mechanism moves to
AddressSpace accordingly.

This is in preparation for generalizing bounce buffer handling further
to allow multiple bounce buffers, with a total allocation limit
configured per AddressSpace.
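To illustrate the shape of this change, here is a minimal, self-contained
sketch. It is not QEMU code: ToyAddressSpace and all toy_* names are
hypothetical, and the retry callbacks run synchronously, whereas the patch
schedules a QEMUBH while holding map_client_list_lock. The point it shows is
only that the in-use flag and the callback list live in each address space,
so exhausting one address space's bounce buffer does not block another.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a QEMUBH: a one-shot retry callback. */
typedef struct ToyMapClient {
    void (*retry)(void *opaque);
    void *opaque;
    struct ToyMapClient *next;
} ToyMapClient;

/* Hypothetical, heavily reduced stand-in for AddressSpace. */
typedef struct ToyAddressSpace {
    atomic_bool bounce_in_use;   /* one bounce buffer per address space */
    void *bounce_buffer;
    ToyMapClient *clients;       /* callbacks waiting for the buffer */
} ToyAddressSpace;

static void toy_as_init(ToyAddressSpace *as)
{
    atomic_init(&as->bounce_in_use, false);
    as->bounce_buffer = NULL;
    as->clients = NULL;
}

/* Fire every registered callback once and drop it from the list. */
static void toy_notify_clients(ToyAddressSpace *as)
{
    while (as->clients) {
        ToyMapClient *c = as->clients;
        as->clients = c->next;
        c->retry(c->opaque);
        free(c);
    }
}

/* Returns NULL when this address space's bounce buffer is already taken. */
static void *toy_map(ToyAddressSpace *as, size_t len)
{
    if (atomic_exchange(&as->bounce_in_use, true)) {
        return NULL;
    }
    as->bounce_buffer = malloc(len);
    assert(as->bounce_buffer != NULL);
    return as->bounce_buffer;
}

/* Release the buffer, then tell waiting clients that a retry may succeed. */
static void toy_unmap(ToyAddressSpace *as, void *buffer)
{
    assert(buffer == as->bounce_buffer);
    free(as->bounce_buffer);
    as->bounce_buffer = NULL;
    atomic_store(&as->bounce_in_use, false);
    toy_notify_clients(as);
}

static void toy_register_client(ToyAddressSpace *as,
                                void (*retry)(void *opaque), void *opaque)
{
    ToyMapClient *c = malloc(sizeof(*c));
    assert(c != NULL);
    c->retry = retry;
    c->opaque = opaque;
    c->next = as->clients;
    as->clients = c;
    /* If the buffer is already free, there is nothing to wait for. */
    if (!atomic_load(&as->bounce_in_use)) {
        toy_notify_clients(as);
    }
}

/* Example callback: counts how often a retry was signalled. */
static void toy_count_retry(void *opaque)
{
    ++*(int *)opaque;
}
```

With two ToyAddressSpace instances, a second toy_map() on a busy instance
returns NULL while toy_map() on the other instance still succeeds, and a
callback registered on the busy instance fires exactly once after toy_unmap().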

Reviewed-by: Peter Xu
Tested-by: Jonathan Cameron
Signed-off-by: Mattias Nissler
---
 include/exec/cpu-common.h |   2 -
 include/exec/memory.h     |  45 ++++++++++++++++-
 system/dma-helpers.c      |   4 +-
 system/memory.c           |   7 +++
 system/physmem.c          | 101 ++++++++++++++++----------------------
 5 files changed, 93 insertions(+), 66 deletions(-)

diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h
index 9ead1be100..bd6999fa35 100644
--- a/include/exec/cpu-common.h
+++ b/include/exec/cpu-common.h
@@ -148,8 +148,6 @@ void *cpu_physical_memory_map(hwaddr addr, bool is_write); void cpu_physical_memory_unmap(void *buffer, hwaddr len, bool is_write, hwaddr access_len); -void cpu_register_map_client(QEMUBH *bh); -void cpu_unregister_map_client(QEMUBH *bh); bool cpu_physical_memory_is_io(hwaddr phys_addr);
diff --git a/include/exec/memory.h b/include/exec/memory.h
index 8626a355b3..0658846555 100644
--- a/include/exec/memory.h
+++ b/include/exec/memory.h
@@ -1106,6 +1106,19 @@ struct MemoryListener { QTAILQ_ENTRY(MemoryListener) link_as; }; +typedef struct AddressSpaceMapClient { + QEMUBH *bh; + QLIST_ENTRY(AddressSpaceMapClient) link; +} AddressSpaceMapClient; + +typedef struct { + MemoryRegion *mr; + void *buffer; + hwaddr addr; + hwaddr len; + bool in_use; +} BounceBuffer; + /** * struct AddressSpace: describes a mapping of addresses to #MemoryRegion objects */
@@ -1123,6 +1136,12 @@ struct AddressSpace { struct MemoryRegionIoeventfd *ioeventfds; QTAILQ_HEAD(, MemoryListener) listeners; QTAILQ_ENTRY(AddressSpace) address_spaces_link; + + /* Bounce buffer to use for this address space. 
*/ + BounceBuffer bounce; + /* List of callbacks to invoke when buffers free up */ + QemuMutex map_client_list_lock; + QLIST_HEAD(, AddressSpaceMapClient) map_client_list; }; typedef struct AddressSpaceDispatch AddressSpaceDispatch; @@ -2926,8 +2945,8 @@ bool address_space_access_valid(AddressSpace *as, hwaddr addr, hwaddr len, * May return %NULL and set *@plen to zero(0), if resources needed to perform * the mapping are exhausted. * Use only for reads OR writes - not for read-modify-write operations. - * Use cpu_register_map_client() to know when retrying the map operation is - * likely to succeed. + * Use address_space_register_map_client() to know when retrying the map + * operation is likely to succeed. * * @as: #AddressSpace to be accessed * @addr: address within that address space @@ -2952,6 +2971,28 @@ void *address_space_map(AddressSpace *as, hwaddr addr, void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len, bool is_write, hwaddr access_len); +/* + * address_space_register_map_client: Register a callback to invoke when + * resources for address_space_map() are available again. + * + * address_space_map may fail when there are not enough resources available, + * such as when bounce buffer memory would exceed the limit. The callback can + * be used to retry the address_space_map operation. Note that the callback + * gets automatically removed after firing. + * + * @as: #AddressSpace to be accessed + * @bh: callback to invoke when address_space_map() retry is appropriate + */ +void address_space_register_map_client(AddressSpace *as, QEMUBH *bh); + +/* + * address_space_unregister_map_client: Unregister a callback that has + * previously been registered and not fired yet. + * + * @as: #AddressSpace to be accessed + * @bh: callback to unregister + */ +void address_space_unregister_map_client(AddressSpace *as, QEMUBH *bh); /* Internal functions, part of the implementation of address_space_read. 
*/ MemTxResult address_space_read_full(AddressSpace *as, hwaddr addr, diff --git a/system/dma-helpers.c b/system/dma-helpers.c index 9b221cf94e..74013308f5 100644 --- a/system/dma-helpers.c +++ b/system/dma-helpers.c @@ -169,7 +169,7 @@ static void dma_blk_cb(void *opaque, int ret) if (dbs->iov.size == 0) { trace_dma_map_wait(dbs); dbs->bh = aio_bh_new(ctx, reschedule_dma, dbs); - cpu_register_map_client(dbs->bh); + address_space_register_map_client(dbs->sg->as, dbs->bh); return; } @@ -197,7 +197,7 @@ static void dma_aio_cancel(BlockAIOCB *acb) } if (dbs->bh) { - cpu_unregister_map_client(dbs->bh); + address_space_unregister_map_client(dbs->sg->as, dbs->bh); qemu_bh_delete(dbs->bh); dbs->bh = NULL; } diff --git a/system/memory.c b/system/memory.c index a229a79988..ad0caef1b8 100644 --- a/system/memory.c +++ b/system/memory.c @@ -3133,6 +3133,9 @@ void address_space_init(AddressSpace *as, MemoryRegion *root, const char *name) as->ioeventfds = NULL; QTAILQ_INIT(&as->listeners); QTAILQ_INSERT_TAIL(&address_spaces, as, address_spaces_link); + as->bounce.in_use = false; + qemu_mutex_init(&as->map_client_list_lock); + QLIST_INIT(&as->map_client_list); as->name = g_strdup(name ? 
name : "anonymous"); address_space_update_topology(as); address_space_update_ioeventfds(as); @@ -3140,6 +3143,10 @@ void address_space_init(AddressSpace *as, MemoryRegion *root, const char *name) static void do_address_space_destroy(AddressSpace *as) { + assert(!qatomic_read(&as->bounce.in_use)); + assert(QLIST_EMPTY(&as->map_client_list)); + qemu_mutex_destroy(&as->map_client_list_lock); + assert(QTAILQ_EMPTY(&as->listeners)); flatview_unref(as->current_map); diff --git a/system/physmem.c b/system/physmem.c index e3ebc19eef..2972b83ab0 100644 --- a/system/physmem.c +++ b/system/physmem.c @@ -2971,55 +2971,37 @@ void cpu_flush_icache_range(hwaddr start, hwaddr len) NULL, len, FLUSH_CACHE); } -typedef struct { - MemoryRegion *mr; - void *buffer; - hwaddr addr; - hwaddr len; - bool in_use; -} BounceBuffer; - -static BounceBuffer bounce; - -typedef struct MapClient { - QEMUBH *bh; - QLIST_ENTRY(MapClient) link; -} MapClient; - -QemuMutex map_client_list_lock; -static QLIST_HEAD(, MapClient) map_client_list - = QLIST_HEAD_INITIALIZER(map_client_list); - -static void cpu_unregister_map_client_do(MapClient *client) +static void +address_space_unregister_map_client_do(AddressSpaceMapClient *client) { QLIST_REMOVE(client, link); g_free(client); } -static void cpu_notify_map_clients_locked(void) +static void address_space_notify_map_clients_locked(AddressSpace *as) { - MapClient *client; + AddressSpaceMapClient *client; - while (!QLIST_EMPTY(&map_client_list)) { - client = QLIST_FIRST(&map_client_list); + while (!QLIST_EMPTY(&as->map_client_list)) { + client = QLIST_FIRST(&as->map_client_list); qemu_bh_schedule(client->bh); - cpu_unregister_map_client_do(client); + address_space_unregister_map_client_do(client); } } -void cpu_register_map_client(QEMUBH *bh) +void address_space_register_map_client(AddressSpace *as, QEMUBH *bh) { - MapClient *client = g_malloc(sizeof(*client)); + AddressSpaceMapClient *client = g_malloc(sizeof(*client)); - 
qemu_mutex_lock(&map_client_list_lock); + qemu_mutex_lock(&as->map_client_list_lock); client->bh = bh; - QLIST_INSERT_HEAD(&map_client_list, client, link); + QLIST_INSERT_HEAD(&as->map_client_list, client, link); /* Write map_client_list before reading in_use. */ smp_mb(); - if (!qatomic_read(&bounce.in_use)) { - cpu_notify_map_clients_locked(); + if (!qatomic_read(&as->bounce.in_use)) { + address_space_notify_map_clients_locked(as); } - qemu_mutex_unlock(&map_client_list_lock); + qemu_mutex_unlock(&as->map_client_list_lock); } void cpu_exec_init_all(void) @@ -3035,28 +3017,27 @@ void cpu_exec_init_all(void) finalize_target_page_bits(); io_mem_init(); memory_map_init(); - qemu_mutex_init(&map_client_list_lock); } -void cpu_unregister_map_client(QEMUBH *bh) +void address_space_unregister_map_client(AddressSpace *as, QEMUBH *bh) { - MapClient *client; + AddressSpaceMapClient *client; - qemu_mutex_lock(&map_client_list_lock); - QLIST_FOREACH(client, &map_client_list, link) { + qemu_mutex_lock(&as->map_client_list_lock); + QLIST_FOREACH(client, &as->map_client_list, link) { if (client->bh == bh) { - cpu_unregister_map_client_do(client); + address_space_unregister_map_client_do(client); break; } } - qemu_mutex_unlock(&map_client_list_lock); + qemu_mutex_unlock(&as->map_client_list_lock); } -static void cpu_notify_map_clients(void) +static void address_space_notify_map_clients(AddressSpace *as) { - qemu_mutex_lock(&map_client_list_lock); - cpu_notify_map_clients_locked(); - qemu_mutex_unlock(&map_client_list_lock); + qemu_mutex_lock(&as->map_client_list_lock); + address_space_notify_map_clients_locked(as); + qemu_mutex_unlock(&as->map_client_list_lock); } static bool flatview_access_valid(FlatView *fv, hwaddr addr, hwaddr len, @@ -3123,8 +3104,8 @@ flatview_extend_translation(FlatView *fv, hwaddr addr, * May map a subset of the requested range, given by and returned in *plen. * May return NULL if resources needed to perform the mapping are exhausted. 
* Use only for reads OR writes - not for read-modify-write operations. - * Use cpu_register_map_client() to know when retrying the map operation is - * likely to succeed. + * Use address_space_register_map_client() to know when retrying the map + * operation is likely to succeed. */ void *address_space_map(AddressSpace *as, hwaddr addr, @@ -3147,25 +3128,25 @@ void *address_space_map(AddressSpace *as, mr = flatview_translate(fv, addr, &xlat, &l, is_write, attrs); if (!memory_access_is_direct(mr, is_write)) { - if (qatomic_xchg(&bounce.in_use, true)) { + if (qatomic_xchg(&as->bounce.in_use, true)) { *plen = 0; return NULL; } /* Avoid unbounded allocations */ l = MIN(l, TARGET_PAGE_SIZE); - bounce.buffer = qemu_memalign(TARGET_PAGE_SIZE, l); - bounce.addr = addr; - bounce.len = l; + as->bounce.buffer = qemu_memalign(TARGET_PAGE_SIZE, l); + as->bounce.addr = addr; + as->bounce.len = l; memory_region_ref(mr); - bounce.mr = mr; + as->bounce.mr = mr; if (!is_write) { flatview_read(fv, addr, MEMTXATTRS_UNSPECIFIED, - bounce.buffer, l); + as->bounce.buffer, l); } *plen = l; - return bounce.buffer; + return as->bounce.buffer; } @@ -3183,7 +3164,7 @@ void *address_space_map(AddressSpace *as, void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len, bool is_write, hwaddr access_len) { - if (buffer != bounce.buffer) { + if (buffer != as->bounce.buffer) { MemoryRegion *mr; ram_addr_t addr1; @@ -3199,15 +3180,15 @@ void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len, return; } if (is_write) { - address_space_write(as, bounce.addr, MEMTXATTRS_UNSPECIFIED, - bounce.buffer, access_len); + address_space_write(as, as->bounce.addr, MEMTXATTRS_UNSPECIFIED, + as->bounce.buffer, access_len); } - qemu_vfree(bounce.buffer); - bounce.buffer = NULL; - memory_region_unref(bounce.mr); + qemu_vfree(as->bounce.buffer); + as->bounce.buffer = NULL; + memory_region_unref(as->bounce.mr); /* Clear in_use before reading map_client_list. 
*/ - qatomic_set_mb(&bounce.in_use, false); - cpu_notify_map_clients(); + qatomic_set_mb(&as->bounce.in_use, false); + address_space_notify_map_clients(as); } void *cpu_physical_memory_map(hwaddr addr,

From patchwork Mon Mar 4 10:05:51 2024
X-Patchwork-Submitter: Mattias Nissler
X-Patchwork-Id: 13580335
From: Mattias Nissler
To: stefanha@redhat.com, jag.raman@oracle.com, peterx@redhat.com, qemu-devel@nongnu.org
Cc: Philippe Mathieu-Daudé, john.levon@nutanix.com, David Hildenbrand, "Michael S. Tsirkin", Paolo Bonzini, Richard Henderson, Marcel Apfelbaum, Elena Ufimtseva, Mattias Nissler, Jonathan Cameron
Subject: [PATCH v8 2/5] softmmu: Support concurrent bounce buffers
Date: Mon, 4 Mar 2024 02:05:51 -0800
Message-Id: <20240304100554.1143763-3-mnissler@rivosinc.com>
In-Reply-To: <20240304100554.1143763-1-mnissler@rivosinc.com>
References: <20240304100554.1143763-1-mnissler@rivosinc.com>

When DMA memory can't be directly accessed, as is the case when running
the device model in a separate process without shareable DMA file
descriptors, bounce buffering is used.

It is not uncommon for device models to request mapping of several DMA
regions at the same time. Examples include:
 * net devices, e.g. when transmitting a packet that is split across
   several TX descriptors (observed with igb)
 * USB host controllers, when handling a packet with multiple data TRBs
   (observed with xhci)

Previously, qemu only provided a single bounce buffer per AddressSpace
and would fail DMA map requests while the buffer was already in use. In
turn, this would cause DMA failures that ultimately manifest as hardware
errors from the guest perspective.
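
As a rough model of what this patch moves to, a byte budget per AddressSpace
instead of a single in-use flag, the following sketch mirrors the
add-then-clamp accounting used in the address_space_map() hunk of this patch.
It is an illustration under assumptions, not the QEMU implementation:
ToyBounceBudget and the toy_budget_* functions are hypothetical names, C11
atomics stand in for QEMU's qatomic_* helpers, and overflow of the counter is
not handled.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reduced model: only the two accounting fields. */
typedef struct ToyBounceBudget {
    uint64_t max_size;            /* per-address-space limit in bytes */
    _Atomic uint64_t used_size;   /* bytes currently reserved */
} ToyBounceBudget;

static void toy_budget_init(ToyBounceBudget *b, uint64_t max_size)
{
    b->max_size = max_size;
    atomic_init(&b->used_size, 0);
}

/*
 * Try to reserve up to `want` bytes. Returns the number of bytes granted,
 * which may be smaller than `want`, or 0 when the budget is exhausted.
 */
static uint64_t toy_budget_reserve(ToyBounceBudget *b, uint64_t want)
{
    /* Optimistically add, then hand back whatever overshoots the limit. */
    uint64_t size = atomic_fetch_add(&b->used_size, want) + want;

    if (size > b->max_size) {
        uint64_t excess = size - b->max_size;
        if (excess > want) {
            /* Concurrent reservers can push the overshoot past `want`. */
            excess = want;
        }
        want -= excess;
        atomic_fetch_sub(&b->used_size, excess);
    }
    return want;
}

/* Give back a previously granted reservation. */
static void toy_budget_release(ToyBounceBudget *b, uint64_t granted)
{
    uint64_t before = atomic_fetch_sub(&b->used_size, granted);
    assert(before >= granted);
    (void)before;
}
```

With a 4096-byte budget, two reservations of 2048 and 1024 bytes are both
granted in full, a further 2048-byte request is truncated to the remaining
1024 bytes, and any request after that gets 0 until something is released.
A single in-use flag would have rejected the second request outright.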

This change allocates DMA bounce buffers dynamically instead of
supporting only a single buffer. Thus, multiple DMA mappings also work
correctly when RAM can't be mmap()-ed.

The total bounce buffer allocation size is limited individually for each
AddressSpace. The default limit is 4096 bytes, matching the previous
maximum buffer size. A new x-max-bounce-buffer-size parameter is
provided to configure the limit for PCI devices.

Reviewed-by: Peter Xu
Tested-by: Jonathan Cameron
Signed-off-by: Mattias Nissler
---
 hw/pci/pci.c                |  8 ++++
 include/exec/memory.h       | 14 +++----
 include/hw/pci/pci_device.h |  3 ++
 system/memory.c             |  5 ++-
 system/physmem.c            | 80 +++++++++++++++++++++++++------------
 5 files changed, 74 insertions(+), 36 deletions(-)

diff --git a/hw/pci/pci.c b/hw/pci/pci.c
index 6496d027ca..036b3ff822 100644
--- a/hw/pci/pci.c
+++ b/hw/pci/pci.c
@@ -85,6 +85,8 @@ static Property pci_props[] = { QEMU_PCIE_ERR_UNC_MASK_BITNR, true), DEFINE_PROP_BIT("x-pcie-ari-nextfn-1", PCIDevice, cap_present, QEMU_PCIE_ARI_NEXTFN_1_BITNR, false), + DEFINE_PROP_SIZE("x-max-bounce-buffer-size", PCIDevice, + max_bounce_buffer_size, DEFAULT_MAX_BOUNCE_BUFFER_SIZE), DEFINE_PROP_END_OF_LIST() };
@@ -1203,6 +1205,8 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, "bus master container", UINT64_MAX); address_space_init(&pci_dev->bus_master_as, &pci_dev->bus_master_container_region, pci_dev->name); + pci_dev->bus_master_as.max_bounce_buffer_size = + pci_dev->max_bounce_buffer_size; if (phase_check(PHASE_MACHINE_READY)) { pci_init_bus_master(pci_dev);
@@ -2632,6 +2636,10 @@ static void pci_device_class_init(ObjectClass *klass, void *data) k->unrealize = pci_qdev_unrealize; k->bus_type = TYPE_PCI_BUS; device_class_set_props(k, pci_props); + object_class_property_set_description( + klass, "x-max-bounce-buffer-size", + "Maximum buffer size allocated for bounce buffers used for mapped " + "access to indirect DMA memory"); } static void pci_device_class_base_init(ObjectClass *klass, 
void *data) diff --git a/include/exec/memory.h b/include/exec/memory.h index 0658846555..3fe0e2824c 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -1111,13 +1111,7 @@ typedef struct AddressSpaceMapClient { QLIST_ENTRY(AddressSpaceMapClient) link; } AddressSpaceMapClient; -typedef struct { - MemoryRegion *mr; - void *buffer; - hwaddr addr; - hwaddr len; - bool in_use; -} BounceBuffer; +#define DEFAULT_MAX_BOUNCE_BUFFER_SIZE (4096) /** * struct AddressSpace: describes a mapping of addresses to #MemoryRegion objects @@ -1137,8 +1131,10 @@ struct AddressSpace { QTAILQ_HEAD(, MemoryListener) listeners; QTAILQ_ENTRY(AddressSpace) address_spaces_link; - /* Bounce buffer to use for this address space. */ - BounceBuffer bounce; + /* Maximum DMA bounce buffer size used for indirect memory map requests */ + uint64_t max_bounce_buffer_size; + /* Total size of bounce buffers currently allocated, atomically accessed */ + uint64_t bounce_buffer_size; /* List of callbacks to invoke when buffers free up */ QemuMutex map_client_list_lock; QLIST_HEAD(, AddressSpaceMapClient) map_client_list; diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h index d3dd0f64b2..f4027c5379 100644 --- a/include/hw/pci/pci_device.h +++ b/include/hw/pci/pci_device.h @@ -160,6 +160,9 @@ struct PCIDevice { /* ID of standby device in net_failover pair */ char *failover_pair_id; uint32_t acpi_index; + + /* Maximum DMA bounce buffer size used for indirect memory map requests */ + uint64_t max_bounce_buffer_size; }; static inline int pci_intx(PCIDevice *pci_dev) diff --git a/system/memory.c b/system/memory.c index ad0caef1b8..1cf89654a1 100644 --- a/system/memory.c +++ b/system/memory.c @@ -3133,7 +3133,8 @@ void address_space_init(AddressSpace *as, MemoryRegion *root, const char *name) as->ioeventfds = NULL; QTAILQ_INIT(&as->listeners); QTAILQ_INSERT_TAIL(&address_spaces, as, address_spaces_link); - as->bounce.in_use = false; + as->max_bounce_buffer_size = 
DEFAULT_MAX_BOUNCE_BUFFER_SIZE; + as->bounce_buffer_size = 0; qemu_mutex_init(&as->map_client_list_lock); QLIST_INIT(&as->map_client_list); as->name = g_strdup(name ? name : "anonymous"); @@ -3143,7 +3144,7 @@ void address_space_init(AddressSpace *as, MemoryRegion *root, const char *name) static void do_address_space_destroy(AddressSpace *as) { - assert(!qatomic_read(&as->bounce.in_use)); + assert(qatomic_read(&as->bounce_buffer_size) == 0); assert(QLIST_EMPTY(&as->map_client_list)); qemu_mutex_destroy(&as->map_client_list_lock); diff --git a/system/physmem.c b/system/physmem.c index 2972b83ab0..5dcf4910cf 100644 --- a/system/physmem.c +++ b/system/physmem.c @@ -2971,6 +2971,20 @@ void cpu_flush_icache_range(hwaddr start, hwaddr len) NULL, len, FLUSH_CACHE); } +/* + * A magic value stored in the first 8 bytes of the bounce buffer struct. Used + * to detect illegal pointers passed to address_space_unmap. + */ +#define BOUNCE_BUFFER_MAGIC 0xb4017ceb4ffe12ed + +typedef struct { + uint64_t magic; + MemoryRegion *mr; + hwaddr addr; + size_t len; + uint8_t buffer[]; +} BounceBuffer; + static void address_space_unregister_map_client_do(AddressSpaceMapClient *client) { @@ -2996,9 +3010,9 @@ void address_space_register_map_client(AddressSpace *as, QEMUBH *bh) qemu_mutex_lock(&as->map_client_list_lock); client->bh = bh; QLIST_INSERT_HEAD(&as->map_client_list, client, link); - /* Write map_client_list before reading in_use. */ + /* Write map_client_list before reading bounce_buffer_size. 
*/ smp_mb(); - if (!qatomic_read(&as->bounce.in_use)) { + if (qatomic_read(&as->bounce_buffer_size) < as->max_bounce_buffer_size) { address_space_notify_map_clients_locked(as); } qemu_mutex_unlock(&as->map_client_list_lock); @@ -3128,28 +3142,38 @@ void *address_space_map(AddressSpace *as, mr = flatview_translate(fv, addr, &xlat, &l, is_write, attrs); if (!memory_access_is_direct(mr, is_write)) { - if (qatomic_xchg(&as->bounce.in_use, true)) { + size_t size = qatomic_add_fetch(&as->bounce_buffer_size, l); + if (size > as->max_bounce_buffer_size) { + /* + * Note that the overshot might be larger than l if threads are + * racing and bump bounce_buffer_size at the same time. + */ + size_t excess = MIN(size - as->max_bounce_buffer_size, l); + l -= excess; + qatomic_sub(&as->bounce_buffer_size, excess); + } + + if (l == 0) { *plen = 0; return NULL; } - /* Avoid unbounded allocations */ - l = MIN(l, TARGET_PAGE_SIZE); - as->bounce.buffer = qemu_memalign(TARGET_PAGE_SIZE, l); - as->bounce.addr = addr; - as->bounce.len = l; + BounceBuffer *bounce = g_malloc0(l + sizeof(BounceBuffer)); + bounce->magic = BOUNCE_BUFFER_MAGIC; memory_region_ref(mr); - as->bounce.mr = mr; + bounce->mr = mr; + bounce->addr = addr; + bounce->len = l; + if (!is_write) { flatview_read(fv, addr, MEMTXATTRS_UNSPECIFIED, - as->bounce.buffer, l); + bounce->buffer, l); } *plen = l; - return as->bounce.buffer; + return bounce->buffer; } - memory_region_ref(mr); *plen = flatview_extend_translation(fv, addr, len, mr, xlat, l, is_write, attrs); @@ -3164,12 +3188,11 @@ void *address_space_map(AddressSpace *as, void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len, bool is_write, hwaddr access_len) { - if (buffer != as->bounce.buffer) { - MemoryRegion *mr; - ram_addr_t addr1; + MemoryRegion *mr; + ram_addr_t addr1; - mr = memory_region_from_host(buffer, &addr1); - assert(mr != NULL); + mr = memory_region_from_host(buffer, &addr1); + if (mr != NULL) { if (is_write) { invalidate_and_set_dirty(mr, 
addr1, access_len); } @@ -3179,15 +3202,22 @@ void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len, memory_region_unref(mr); return; } + + + BounceBuffer *bounce = container_of(buffer, BounceBuffer, buffer); + assert(bounce->magic == BOUNCE_BUFFER_MAGIC); + if (is_write) { - address_space_write(as, as->bounce.addr, MEMTXATTRS_UNSPECIFIED, - as->bounce.buffer, access_len); - } - qemu_vfree(as->bounce.buffer); - as->bounce.buffer = NULL; - memory_region_unref(as->bounce.mr); - /* Clear in_use before reading map_client_list. */ - qatomic_set_mb(&as->bounce.in_use, false); + address_space_write(as, bounce->addr, MEMTXATTRS_UNSPECIFIED, + bounce->buffer, access_len); + } + + qatomic_sub(&as->bounce_buffer_size, bounce->len); + bounce->magic = ~BOUNCE_BUFFER_MAGIC; + memory_region_unref(bounce->mr); + g_free(bounce); + /* Write bounce_buffer_size before reading map_client_list. */ + smp_mb(); address_space_notify_map_clients(as); } From patchwork Mon Mar 4 10:05:52 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mattias Nissler X-Patchwork-Id: 13580337 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C2811C48BF6 for ; Mon, 4 Mar 2024 10:07:37 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rh5DA-0007T1-T3; Mon, 04 Mar 2024 05:06:20 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rh5D9-0007SB-Pz for qemu-devel@nongnu.org; Mon, 04 Mar 2024 05:06:19 -0500 Received: from mail-pl1-x631.google.com 
([2607:f8b0:4864:20::631]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rh5D7-0006HQ-O5 for qemu-devel@nongnu.org; Mon, 04 Mar 2024 05:06:19 -0500 Received: by mail-pl1-x631.google.com with SMTP id d9443c01a7336-1dcb3e6ff3fso17404985ad.2 for ; Mon, 04 Mar 2024 02:06:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1709546774; x=1710151574; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=hf8rFfw1McJn3z3VsNsbHmy2YFs5g1/mEqXgQ2kHlrs=; b=rTa0aC+rFXLpcCPnKV3QAyudnkGL4cUd0Bm5OkTuLnp1Gx/LdtWzzRN5B+Jgf6QLYy CI42XHIxfl/V2EiaZ721T4xFhHoZggcXeOpIe9tMBORl4Ge89h987HRACUYh7LsK8vTS dumIW3D4oNo/dXeeQVhISOTpijS2s/EEIEMZt0k/ExnGKJQ3NQZIuiyWAbYsjmc5Hul3 8HY1LGKRx1YHgpefNBNpT2wiOIrR23Ur1GoJy7vmHVr/D0Ow7UFFvTq9qRtPHnTDuGHZ 3Bk5NAneY+FDBNfcWuROJVUavIhrXdCqY0NJqGwJ2v2L+16afGgHR5o0nne1L6P2BI5i yDlw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1709546774; x=1710151574; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=hf8rFfw1McJn3z3VsNsbHmy2YFs5g1/mEqXgQ2kHlrs=; b=m9i9Ie8SpRYbLikivBQXje/J2m2aekVy2OllKl7+VeIUkkGBzFz+Yd4Zw6+8dda1wK +uewm08jCFQrJYuyPa13YEc7TpItFj4GJVwB4DVk4Sj0J0GEtjl7r2B4T/gHUKs6zB+v NpLUwokv7PzkfAlGV/f6Ww/A+e5S1R2EJZGacqlsZqjbJ8AM2vZitSswMf1eiPW+kxcY 7HComnnpjzHgLqJuo9uNXX/Nw4spCRmnlIgihqY2wuWL4BeYQuvOpLKQDlUUfdegutHA Q1InyTjh1ceoO0HByKvsjwDkDJFlIu6iEv+s4anTVtwOU9NPiLVdWt2HDby/56sYvdS1 VgPA== X-Forwarded-Encrypted: i=1; AJvYcCV0rgj6sWqwokvSR0LXQsnkX+bCbKLW0BoXmly67WQm36nTTeaBcLcgu6zjQ6nTN+FZbtVrt4H6PHNnNHf5c5h+Zwy0w0Y= X-Gm-Message-State: AOJu0YySj0kV3ic99s3fqq8bQUwuz9/m/1kRn88Y79g2RkpVHqcN1Zir 
j3i0uLoHPJZjLC65e7bTOiQjaDQNtDXdWXDXy/NpmudDngVuqzKCrSuZrMv+74wpoedNKWcMUdc lCas= X-Google-Smtp-Source: AGHT+IGt3Y9sPyyIpGwiS9Eqb3a5g4P1yZLbtb5piKfpJv3IMLhQA0t/vke7ZaOQtn7d4gbzWdEM8Q== X-Received: by 2002:a17:902:fc8f:b0:1dc:f7c2:938e with SMTP id mf15-20020a170902fc8f00b001dcf7c2938emr8351252plb.38.1709546774242; Mon, 04 Mar 2024 02:06:14 -0800 (PST) Received: from mnissler.ba.rivosinc.com ([64.71.180.162]) by smtp.gmail.com with ESMTPSA id y18-20020a170902e19200b001dc95e7e191sm8091570pla.298.2024.03.04.02.06.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Mar 2024 02:06:14 -0800 (PST) From: Mattias Nissler To: stefanha@redhat.com, jag.raman@oracle.com, peterx@redhat.com, qemu-devel@nongnu.org Cc: =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , john.levon@nutanix.com, David Hildenbrand , "Michael S. Tsirkin" , Paolo Bonzini , Richard Henderson , Marcel Apfelbaum , Elena Ufimtseva , Mattias Nissler Subject: [PATCH v8 3/5] Update subprojects/libvfio-user Date: Mon, 4 Mar 2024 02:05:52 -0800 Message-Id: <20240304100554.1143763-4-mnissler@rivosinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240304100554.1143763-1-mnissler@rivosinc.com> References: <20240304100554.1143763-1-mnissler@rivosinc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::631; envelope-from=mnissler@rivosinc.com; helo=mail-pl1-x631.google.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Brings in assorted bug fixes. 
The following are of particular interest with respect to message-based DMA
support:

* bb308a2 "Fix address calculation for message-based DMA"
  Corrects a bug in DMA address calculation.

* 1569a37 "Pass server->client command over a separate socket pair"
  Adds support for separate sockets for either command direction,
  addressing a bug where libvfio-user gets confused if both client and
  server send commands concurrently.

Reviewed-by: Jagannathan Raman
Signed-off-by: Mattias Nissler
---
 subprojects/libvfio-user.wrap | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/subprojects/libvfio-user.wrap b/subprojects/libvfio-user.wrap
index 416955ca45..cdf0a7a375 100644
--- a/subprojects/libvfio-user.wrap
+++ b/subprojects/libvfio-user.wrap
@@ -1,4 +1,4 @@
 [wrap-git]
 url = https://gitlab.com/qemu-project/libvfio-user.git
-revision = 0b28d205572c80b568a1003db2c8f37ca333e4d7
+revision = 1569a37a54ecb63bd4008708c76339ccf7d06115
 depth = 1

From patchwork Mon Mar 4 10:05:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Mattias Nissler
X-Patchwork-Id: 13580336
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1CE2BC5478C for ; Mon, 4 Mar 2024 10:07:31 +0000 (UTC)
Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rh5DB-0007T3-6c; Mon, 04 Mar 2024 05:06:21 -0500
Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rh5DA-0007SR-7T for qemu-devel@nongnu.org; Mon, 04 Mar 2024 05:06:20 -0500
Received: from mail-pl1-x629.google.com ([2607:f8b0:4864:20::629])
by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rh5D8-0006IL-9O for qemu-devel@nongnu.org; Mon, 04 Mar 2024 05:06:19 -0500 Received: by mail-pl1-x629.google.com with SMTP id d9443c01a7336-1dc13fb0133so31047145ad.3 for ; Mon, 04 Mar 2024 02:06:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1709546775; x=1710151575; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=8rF+7GtIf1rtTmKIckGnSl1eaAm4N6ysn3Y/c7lHL+s=; b=hk6WXwIBjHNwC6OFk1xIw1nVV7bVDy0l9BqRjITIMb6C4PjwHAWRz/zNvJsW7u/k9f ViLKUZGTNsE8H/CoLuYkT9fbixJJ5i+Wp1GIUNWxBz2krfm/82ykz+g/sZW5jw1V6uDx l2T+5qV5Y1arFCd4R2HSSXX79O3Oewjy9INcRMhXECknekSuha1Xo980xzU3cJwwbAko EMO5ely7eO0YmDx1lCf5vnbjRDO5gwIr1q3BqzcfyIyQkwoBevJaUvOFf4YeANbMSpaA 4M9VO3qi8SOMfbQUEIwxvFcNUrh11/zGPK0X0iFbSS9S+QxFOBpP8YB/NinWKnU2otoR laEw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1709546775; x=1710151575; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=8rF+7GtIf1rtTmKIckGnSl1eaAm4N6ysn3Y/c7lHL+s=; b=fTxEbKGTjN9fjK+ekj11dcfVpn9UWG5bGVCDc3a99tMkCDkuYt54ONPbm2Vm53l/BM GB7/otGL8pq8XzY/UFUVbL8vaiXjO+V1RnU+pghzGRL3X/9AwMRmx7dOAxYoPDvld9Cr kRRsSP5kdwsotRpCurJtvzuC+LE7Y1DE95iZLkH8nG8Gnz73IoIVxusU2t+MuBY7WjHh LKcjv//B1lzdV5acz2ydCma7Uf3ZTklmBADhgkfJTaZ/xp6AECBgZKTJUUCp2UFGnfuu rMsQl2y9TXhowzxATp1QeTES20w9ywkqW7/7QrMqNvsQFQjCMbTeDU9zMKU2DxyiKyW6 YNbg== X-Forwarded-Encrypted: i=1; AJvYcCVIYUIa/23YkOrgI7IZbDEnPjExik7dmkhsz0bXmkmoPvoYiy0ucZ2AfmUGVMaUrlLQ+UuQi63PklnlHES/SceSXkMku7g= X-Gm-Message-State: AOJu0Yx5u0xTUuGQF0ftTIQ1vnfuiVIvImPfjwaKrYLpZ0O3WjtMSwAW 2j3Jz3hmsqqS45s5/LWvqxw121m9kzpW6d5XNTb9lFHfFFGU9e2lZ8IVD2mvvJU= X-Google-Smtp-Source: 
AGHT+IGqIhHmp3DUy138+wzLN9QrpTtKaV9YLz3kGPCmnoOJoQhs1RKGrrQIok5dotOJaNSRn45x2g== X-Received: by 2002:a17:902:b717:b0:1db:51ee:8677 with SMTP id d23-20020a170902b71700b001db51ee8677mr7783405pls.59.1709546775355; Mon, 04 Mar 2024 02:06:15 -0800 (PST) Received: from mnissler.ba.rivosinc.com ([64.71.180.162]) by smtp.gmail.com with ESMTPSA id y18-20020a170902e19200b001dc95e7e191sm8091570pla.298.2024.03.04.02.06.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Mar 2024 02:06:15 -0800 (PST) From: Mattias Nissler To: stefanha@redhat.com, jag.raman@oracle.com, peterx@redhat.com, qemu-devel@nongnu.org Cc: =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , john.levon@nutanix.com, David Hildenbrand , "Michael S. Tsirkin" , Paolo Bonzini , Richard Henderson , Marcel Apfelbaum , Elena Ufimtseva , Mattias Nissler Subject: [PATCH v8 4/5] vfio-user: Message-based DMA support Date: Mon, 4 Mar 2024 02:05:53 -0800 Message-Id: <20240304100554.1143763-5-mnissler@rivosinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240304100554.1143763-1-mnissler@rivosinc.com> References: <20240304100554.1143763-1-mnissler@rivosinc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::629; envelope-from=mnissler@rivosinc.com; helo=mail-pl1-x629.google.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Wire up support for DMA for the case where the vfio-user client does not provide mmap()-able file descriptors, but DMA requests must 
be performed via the VFIO-user protocol. This installs an indirect memory
region, which already works for pci_dma_{read,write}, and pci_dma_map works
thanks to the existing DMA bounce buffering support.

Note that while simple scenarios work with this patch, there's a known race
condition in libvfio-user that will mess up the communication channel. See
https://github.com/nutanix/libvfio-user/issues/279 for details as well as
a proposed fix.

Reviewed-by: Jagannathan Raman
Signed-off-by: Mattias Nissler
---
 hw/remote/trace-events    |   2 +
 hw/remote/vfio-user-obj.c | 100 ++++++++++++++++++++++++++++++++------
 2 files changed, 87 insertions(+), 15 deletions(-)

diff --git a/hw/remote/trace-events b/hw/remote/trace-events
index 0d1b7d56a5..358a68fb34 100644
--- a/hw/remote/trace-events
+++ b/hw/remote/trace-events
@@ -9,6 +9,8 @@ vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%x -> 0x%x"
 vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%x <- 0x%x"
 vfu_dma_register(uint64_t gpa, size_t len) "vfu: registering GPA 0x%"PRIx64", %zu bytes"
 vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64""
+vfu_dma_read(uint64_t gpa, size_t len) "vfu: DMA read 0x%"PRIx64", %zu bytes"
+vfu_dma_write(uint64_t gpa, size_t len) "vfu: DMA write 0x%"PRIx64", %zu bytes"
 vfu_bar_register(int i, uint64_t addr, uint64_t size) "vfu: BAR %d: addr 0x%"PRIx64" size 0x%"PRIx64""
 vfu_bar_rw_enter(const char *op, uint64_t addr) "vfu: %s request for BAR address 0x%"PRIx64""
 vfu_bar_rw_exit(const char *op, uint64_t addr) "vfu: Finished %s of BAR address 0x%"PRIx64""
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index d9b879e056..a15e291c9a 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -300,6 +300,63 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf,
     return count;
 }

+static MemTxResult vfu_dma_read(void *opaque, hwaddr addr, uint64_t *val,
+                                unsigned size, MemTxAttrs attrs)
+{
+    MemoryRegion *region = opaque;
+    vfu_ctx_t *vfu_ctx = VFU_OBJECT(region->owner)->vfu_ctx;
+    uint8_t buf[sizeof(uint64_t)];
+
+    trace_vfu_dma_read(region->addr + addr, size);
+
+    g_autofree dma_sg_t *sg = g_malloc0(dma_sg_size());
+    vfu_dma_addr_t vfu_addr = (vfu_dma_addr_t)(region->addr + addr);
+    if (vfu_addr_to_sgl(vfu_ctx, vfu_addr, size, sg, 1, PROT_READ) < 0 ||
+        vfu_sgl_read(vfu_ctx, sg, 1, buf) != 0) {
+        return MEMTX_ERROR;
+    }
+
+    *val = ldn_he_p(buf, size);
+
+    return MEMTX_OK;
+}
+
+static MemTxResult vfu_dma_write(void *opaque, hwaddr addr, uint64_t val,
+                                 unsigned size, MemTxAttrs attrs)
+{
+    MemoryRegion *region = opaque;
+    vfu_ctx_t *vfu_ctx = VFU_OBJECT(region->owner)->vfu_ctx;
+    uint8_t buf[sizeof(uint64_t)];
+
+    trace_vfu_dma_write(region->addr + addr, size);
+
+    stn_he_p(buf, size, val);
+
+    g_autofree dma_sg_t *sg = g_malloc0(dma_sg_size());
+    vfu_dma_addr_t vfu_addr = (vfu_dma_addr_t)(region->addr + addr);
+    if (vfu_addr_to_sgl(vfu_ctx, vfu_addr, size, sg, 1, PROT_WRITE) < 0 ||
+        vfu_sgl_write(vfu_ctx, sg, 1, buf) != 0) {
+        return MEMTX_ERROR;
+    }
+
+    return MEMTX_OK;
+}
+
+static const MemoryRegionOps vfu_dma_ops = {
+    .read_with_attrs = vfu_dma_read,
+    .write_with_attrs = vfu_dma_write,
+    .endianness = DEVICE_HOST_ENDIAN,
+    .valid = {
+        .min_access_size = 1,
+        .max_access_size = 8,
+        .unaligned = true,
+    },
+    .impl = {
+        .min_access_size = 1,
+        .max_access_size = 8,
+    },
+};
+
 static void dma_register(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
 {
     VfuObject *o = vfu_get_private(vfu_ctx);
@@ -308,17 +365,30 @@ static void dma_register(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
     g_autofree char *name = NULL;
     struct iovec *iov = &info->iova;

-    if (!info->vaddr) {
-        return;
-    }
-
     name = g_strdup_printf("mem-%s-%"PRIx64"", o->device,
-                           (uint64_t)info->vaddr);
+                           (uint64_t)iov->iov_base);

     subregion = g_new0(MemoryRegion, 1);

-    memory_region_init_ram_ptr(subregion, NULL, name,
-                               iov->iov_len, info->vaddr);
+    if (info->vaddr) {
+        memory_region_init_ram_ptr(subregion, OBJECT(o), name,
+                                   iov->iov_len, info->vaddr);
+    } else {
+        /*
+         * Note that I/O regions' MemoryRegionOps handle accesses of at most 8
+         * bytes at a time, and larger accesses are broken down. However,
+         * many/most DMA accesses are larger than 8 bytes and VFIO-user can
+         * handle large DMA accesses just fine, thus this size restriction
+         * unnecessarily hurts performance, in particular given that each
+         * access causes a round trip on the VFIO-user socket.
+         *
+         * TODO: Investigate how to plumb larger accesses through memory
+         * regions, possibly by amending MemoryRegionOps or by creating a new
+         * memory region type.
+         */
+        memory_region_init_io(subregion, OBJECT(o), &vfu_dma_ops, subregion,
+                              name, iov->iov_len);
+    }

     dma_as = pci_device_iommu_address_space(o->pci_dev);

@@ -330,20 +400,20 @@ static void dma_register(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
 static void dma_unregister(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
 {
     VfuObject *o = vfu_get_private(vfu_ctx);
+    MemoryRegionSection mr_section;
     AddressSpace *dma_as = NULL;
-    MemoryRegion *mr = NULL;
-    ram_addr_t offset;

-    mr = memory_region_from_host(info->vaddr, &offset);
-    if (!mr) {
+    dma_as = pci_device_iommu_address_space(o->pci_dev);
+
+    mr_section =
+        memory_region_find(dma_as->root, (hwaddr)info->iova.iov_base, 1);
+    if (!mr_section.mr) {
         return;
     }

-    dma_as = pci_device_iommu_address_space(o->pci_dev);
-
-    memory_region_del_subregion(dma_as->root, mr);
+    memory_region_del_subregion(dma_as->root, mr_section.mr);

-    object_unparent((OBJECT(mr)));
+    object_unparent((OBJECT(mr_section.mr)));

     trace_vfu_dma_unregister((uint64_t)info->iova.iov_base);
 }

From patchwork Mon Mar 4 10:05:54 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Mattias Nissler
X-Patchwork-Id: 13580338
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from lists.gnu.org (lists.gnu.org [209.51.188.17])
(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 772F8C5478C for ; Mon, 4 Mar 2024 10:07:42 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1rh5DD-0007Tr-C5; Mon, 04 Mar 2024 05:06:23 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1rh5DC-0007TP-01 for qemu-devel@nongnu.org; Mon, 04 Mar 2024 05:06:22 -0500 Received: from mail-pl1-x62d.google.com ([2607:f8b0:4864:20::62d]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1rh5D8-0006IV-Bg for qemu-devel@nongnu.org; Mon, 04 Mar 2024 05:06:21 -0500 Received: by mail-pl1-x62d.google.com with SMTP id d9443c01a7336-1dcd0431f00so26278765ad.3 for ; Mon, 04 Mar 2024 02:06:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1709546776; x=1710151576; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=cnzpVp5oD/TcNq3luNOD5JKCAhVqC9S3ua2Xfu7/SOs=; b=W5UlTy8Q/fHMqIyiOyCRXpeMaAo7+ksnJn+8SPq+Fr29Q9/LlhTG7ZXS/TtXFf0EN6 0BRD3S0HWR9tVM9Kdk4nEXr1P6yCiyo+DhqROzFnBStEBfLHmGzYRqhTg/QbaKJ3OfNh QsRP1LHuCuJ1SG8+ombYqI5kfGcwREwuAj0TazIo635aSTXGLG4sYwxLjjcIwpxm44SO r/Qtogi/fdwXx+y2YlDOnwlDQanYU9ShjqZ+GmxC2CZfr+Lzv4vf36LCuW3LeEUqJ+Oi rs9GfcWLIsPFT6jYezboKYxpwR4cShwE704kiMPLQ/j6eoobMJBdNgfYgdNu5RkffgIo Ti8w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1709546776; x=1710151576; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; 
bh=cnzpVp5oD/TcNq3luNOD5JKCAhVqC9S3ua2Xfu7/SOs=; b=jgaJVb2DMpx8M1gaQqPWW56/QaKKr/UAzFG53yWcMe5UrC5+QZZEmpC7LKJ3PRZcrC OHAPQfFFTE/aGpfzldMCvSuHj0K9LjFv8tG7IDqIeTNl0UOtulxjo+AsJHzm8InVoLin t2eFThDbzfqXu39877IU/HDlhPxeEI1F+GINudTqimBE7ciRINNG7YTdMvBRqQ9bZFkk WLyzu3/tUIcAXQmLCjoGmvAhgqqRMgoQ2j6kndIfPqNDUOplAYH3isZ7rp1WWTO7qPUR jatUxkmGPBa03VAb+Y+k1dEWjg4ZSxgfVzih3e+2ml9aPq46iN0UbD+yWHo+X9KsQV8d 0c6A== X-Forwarded-Encrypted: i=1; AJvYcCXdieiSITQOboK+CYL2jELmXVOaDMR0uH6gqU87xb7y4q0XgQSj7Z748Cz+3g6SPLD+IpZsJvAEORtFEKCAPTAlQT0xPew= X-Gm-Message-State: AOJu0YwdmjdEHxsJfy6qY0UgVITYh0IxA+ouPeMczO1FljbbCKjyGpcR fT+M0LUe8+VxFg2z8D4dV759G1voHZ3+vi3hsZ6k9yaL7zfS+2ad9QVa6V5zk+Y= X-Google-Smtp-Source: AGHT+IGH9RkGDeaxT65gwnBjYMYPmboDYLOQ54D/1YwQYDNmhLc1i+BuDFOl09iYtSkZ/KLbuKACeg== X-Received: by 2002:a17:902:e5d0:b0:1dc:ffb7:e857 with SMTP id u16-20020a170902e5d000b001dcffb7e857mr5284024plf.57.1709546776503; Mon, 04 Mar 2024 02:06:16 -0800 (PST) Received: from mnissler.ba.rivosinc.com ([64.71.180.162]) by smtp.gmail.com with ESMTPSA id y18-20020a170902e19200b001dc95e7e191sm8091570pla.298.2024.03.04.02.06.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Mar 2024 02:06:16 -0800 (PST) From: Mattias Nissler To: stefanha@redhat.com, jag.raman@oracle.com, peterx@redhat.com, qemu-devel@nongnu.org Cc: =?utf-8?q?Philippe_Mathieu-Daud=C3=A9?= , john.levon@nutanix.com, David Hildenbrand , "Michael S. 
Tsirkin" , Paolo Bonzini , Richard Henderson , Marcel Apfelbaum , Elena Ufimtseva , Mattias Nissler
Subject: [PATCH v8 5/5] vfio-user: Fix config space access byte order
Date: Mon, 4 Mar 2024 02:05:54 -0800
Message-Id: <20240304100554.1143763-6-mnissler@rivosinc.com>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20240304100554.1143763-1-mnissler@rivosinc.com>
References: <20240304100554.1143763-1-mnissler@rivosinc.com>
MIME-Version: 1.0
Received-SPF: pass client-ip=2607:f8b0:4864:20::62d; envelope-from=mnissler@rivosinc.com; helo=mail-pl1-x62d.google.com
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_SCC_BODY_TEXT_LINE=-0.01 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

PCI config space is little-endian, so on a big-endian host we need to
perform byte swaps for values as they are passed to and received from
the generic PCI config space access machinery.

Reviewed-by: Philippe Mathieu-Daudé
Reviewed-by: Stefan Hajnoczi
Reviewed-by: Jagannathan Raman
Signed-off-by: Mattias Nissler
---
 hw/remote/vfio-user-obj.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index a15e291c9a..0e93d7a7b4 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -281,7 +281,7 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf,
     while (bytes > 0) {
         len = (bytes > pci_access_width) ? pci_access_width : bytes;
         if (is_write) {
-            memcpy(&val, ptr, len);
+            val = ldn_le_p(ptr, len);
             pci_host_config_write_common(o->pci_dev, offset,
                                          pci_config_size(o->pci_dev),
                                          val, len);
@@ -289,7 +289,7 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf,
         } else {
             val = pci_host_config_read_common(o->pci_dev, offset,
                                               pci_config_size(o->pci_dev), len);
-            memcpy(ptr, &val, len);
+            stn_le_p(ptr, len, val);
             trace_vfu_cfg_read(offset, val);
         }
         offset += len;
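
To illustrate why the memcpy() calls above are replaced, here is a minimal,
self-contained sketch of width-limited little-endian load/store helpers. The
names load_le() and store_le() are hypothetical and exist only for this
example; they are not QEMU's ldn_le_p()/stn_le_p() implementations and are
merely assumed to behave similarly for accesses of 1 to 8 bytes. Because the
value is assembled byte by byte, the result does not depend on host byte
order, whereas memcpy() into an integer copies the bytes in host order.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a little-endian load of `len` bytes (1..8).
 * Byte 0 of the buffer is the least significant byte of the result,
 * regardless of the byte order of the host running this code.
 */
static uint64_t load_le(const void *ptr, size_t len)
{
    const uint8_t *p = ptr;
    uint64_t val = 0;

    assert(len >= 1 && len <= sizeof(val));
    for (size_t i = 0; i < len; i++) {
        val |= (uint64_t)p[i] << (8 * i);
    }
    return val;
}

/*
 * Hypothetical stand-in for a little-endian store of the low `len`
 * bytes (1..8) of `val`, least significant byte first.
 */
static void store_le(void *ptr, size_t len, uint64_t val)
{
    uint8_t *p = ptr;

    assert(len >= 1 && len <= sizeof(val));
    for (size_t i = 0; i < len; i++) {
        p[i] = (uint8_t)(val >> (8 * i));
    }
}
```

For a config space buffer holding the bytes 86 80 34 12, load_le(buf, 2)
yields 0x8086 on any host. By contrast, a memcpy() of the same two bytes into
a zero-initialized uint32_t on a big-endian host would yield 0x86800000,
which is the kind of mismatch this patch addresses.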