From patchwork Thu Sep 7 13:04:06 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mattias Nissler X-Patchwork-Id: 13376454 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 20FEAEC8724 for ; Thu, 7 Sep 2023 13:12:04 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeEgT-00033x-Il; Thu, 07 Sep 2023 09:04:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qeEgO-00031r-AR for qemu-devel@nongnu.org; Thu, 07 Sep 2023 09:04:28 -0400 Received: from mail-pj1-x1031.google.com ([2607:f8b0:4864:20::1031]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qeEgK-00082c-3l for qemu-devel@nongnu.org; Thu, 07 Sep 2023 09:04:28 -0400 Received: by mail-pj1-x1031.google.com with SMTP id 98e67ed59e1d1-26b41112708so712772a91.3 for ; Thu, 07 Sep 2023 06:04:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1694091862; x=1694696662; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=9QnPkPrqpN3t/VxpGhAabpkEZqu+enV6J+UoWWwNrfM=; b=1214Nxpd6QEpMdRHfvhoIlJtiY9W8DqfyD9tYOQfdOKI3ynZi/nlFoGK+EdvvKDHxv Jp2MBSJhNVJwGxQvHZP9ISWOtgFlaUKV5yqgXCx/ojavFKWrSb6+O2Quj3Zrn2kDt0gx jgNEQCZUYqwFfAv2DlKthW5pZaDZMdnvFP6Hle2di4hZAT4PbEbDGOkb3txDCYGS6llG IxShAJFVYJ3tr1g128giu8AFCzdwGpGn8ZS2OFj/QmXGF1ab3AJxGin0f4plLdRPnhIq jVWmpHulF0DT9195OQ4lxdOgYGmmU2Rr1fT1d6+vnLHBpFav/+j9M3EPjnPVXPyAW92v GWDQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1694091862; x=1694696662; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=9QnPkPrqpN3t/VxpGhAabpkEZqu+enV6J+UoWWwNrfM=; b=R0FJMo5NzQhQ+yBaNY3LXih9FQbTk57nUfBY5elTy26UT4iwNSd6JPcLgJCAmErrl9 vUw2u49bk7Bi8NDzdeYi2Rm0+v0yUwq00y+Kryiw9vd4n8x0oNj22DENH+7egedtUiAC x8n2457yXIvUUz22lBtc+X/0gIZhHf5r2oVjb5Gvv/XJZ/bR4/uUMfSqXYqvV0NTIikl NrYZXhHpUqsBdHbWLSH9DQOxZZLYfJsLhN9XqXdRlQDzx61XHIarmxHQSF+DgG8+D1YV nrCT/D4mLlyWRcJ28KKIui8u5aMNZri+rq6SgRxL8pvJjo/EZxcvqoVx+jCY3/aUmS9V nA0Q== X-Gm-Message-State: AOJu0YxyJxWXo9u7E6ZiA7XqPSqhopQj9SkG4jRK+txHcVVhYJ12JIhg NHg0lJv2Zh7ShYMEdxZn/aHy25CT8Q8tbf604hgMPQ== X-Google-Smtp-Source: AGHT+IEjsNYiTigcAQx/T0qNJb/Mnzv7qSJe55vdOOBuur55l5hxywDdHx5aXN0ap+MQ0MF6myLR+Q== X-Received: by 2002:a17:90b:1956:b0:263:9816:fe0f with SMTP id nk22-20020a17090b195600b002639816fe0fmr18369674pjb.15.1694091861810; Thu, 07 Sep 2023 06:04:21 -0700 (PDT) Received: from mnissler.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id g24-20020a170902fe1800b001b0358848b0sm12821913plj.161.2023.09.07.06.04.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 07 Sep 2023 06:04:21 -0700 (PDT) From: Mattias Nissler To: qemu-devel@nongnu.org Cc: john.levon@nutanix.com, Elena Ufimtseva , "Michael S. Tsirkin" , David Hildenbrand , Marcel Apfelbaum , =?utf-8?q?Philippe_Mathieu-D?= =?utf-8?q?aud=C3=A9?= , Richard Henderson , Jagannathan Raman , stefanha@redhat.com, Peter Xu , Paolo Bonzini , Mattias Nissler Subject: [PATCH v3 1/5] softmmu: Per-AddressSpace bounce buffering Date: Thu, 7 Sep 2023 06:04:06 -0700 Message-Id: <20230907130410.498935-2-mnissler@rivosinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230907130410.498935-1-mnissler@rivosinc.com> References: <20230907130410.498935-1-mnissler@rivosinc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::1031; envelope-from=mnissler@rivosinc.com; helo=mail-pj1-x1031.google.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Instead of using a single global bounce buffer, give each AddressSpace its own bounce buffer. The MapClient callback mechanism moves to AddressSpace accordingly. This is in preparation for generalizing bounce buffer handling further to allow multiple bounce buffers, with a total allocation limit configured per AddressSpace. Signed-off-by: Mattias Nissler --- include/exec/cpu-common.h | 2 - include/exec/memory.h | 45 ++++++++++++++++- softmmu/dma-helpers.c | 4 +- softmmu/memory.c | 3 ++ softmmu/physmem.c | 103 ++++++++++++++++---------------------- 5 files changed, 90 insertions(+), 67 deletions(-) diff --git a/include/exec/cpu-common.h b/include/exec/cpu-common.h index 41788c0bdd..63463c415d 100644 --- a/include/exec/cpu-common.h +++ b/include/exec/cpu-common.h @@ -138,8 +138,6 @@ void *cpu_physical_memory_map(hwaddr addr, bool is_write); void cpu_physical_memory_unmap(void *buffer, hwaddr len, bool is_write, hwaddr access_len); -void cpu_register_map_client(QEMUBH *bh); -void cpu_unregister_map_client(QEMUBH *bh); bool cpu_physical_memory_is_io(hwaddr phys_addr); diff --git a/include/exec/memory.h b/include/exec/memory.h index 68284428f8..7d68936157 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -1076,6 +1076,19 @@ struct MemoryListener { QTAILQ_ENTRY(MemoryListener) link_as; }; +typedef struct AddressSpaceMapClient { + QEMUBH *bh; + QLIST_ENTRY(AddressSpaceMapClient) link; +} AddressSpaceMapClient; + +typedef struct { + MemoryRegion *mr; + void *buffer; + hwaddr addr; + hwaddr len; + bool in_use; +} BounceBuffer; + /** * struct AddressSpace: describes a mapping of addresses to #MemoryRegion objects */ @@ -1092,6 +1105,12 @@ struct AddressSpace { struct MemoryRegionIoeventfd *ioeventfds; QTAILQ_HEAD(, MemoryListener) listeners; QTAILQ_ENTRY(AddressSpace) address_spaces_link; + + /* Bounce buffer to use for this address space. */ + BounceBuffer bounce; + /* List of callbacks to invoke when buffers free up */ + QemuMutex map_client_list_lock; + QLIST_HEAD(, AddressSpaceMapClient) map_client_list; }; typedef struct AddressSpaceDispatch AddressSpaceDispatch; @@ -2832,8 +2851,8 @@ bool address_space_access_valid(AddressSpace *as, hwaddr addr, hwaddr len, * May return %NULL and set *@plen to zero(0), if resources needed to perform * the mapping are exhausted. * Use only for reads OR writes - not for read-modify-write operations. - * Use cpu_register_map_client() to know when retrying the map operation is - * likely to succeed. + * Use address_space_register_map_client() to know when retrying the map + * operation is likely to succeed. * * @as: #AddressSpace to be accessed * @addr: address within that address space @@ -2858,6 +2877,28 @@ void *address_space_map(AddressSpace *as, hwaddr addr, void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len, bool is_write, hwaddr access_len); +/* + * address_space_register_map_client: Register a callback to invoke when + * resources for address_space_map() are available again. + * + * address_space_map may fail when there are not enough resources available, + * such as when bounce buffer memory would exceed the limit. The callback can + * be used to retry the address_space_map operation. Note that the callback + * gets automatically removed after firing. + * + * @as: #AddressSpace to be accessed + * @bh: callback to invoke when address_space_map() retry is appropriate + */ +void address_space_register_map_client(AddressSpace *as, QEMUBH *bh); + +/* + * address_space_unregister_map_client: Unregister a callback that has + * previously been registered and not fired yet. + * + * @as: #AddressSpace to be accessed + * @bh: callback to unregister + */ +void address_space_unregister_map_client(AddressSpace *as, QEMUBH *bh); /* Internal functions, part of the implementation of address_space_read. */ MemTxResult address_space_read_full(AddressSpace *as, hwaddr addr, diff --git a/softmmu/dma-helpers.c b/softmmu/dma-helpers.c index 2463964805..d9fc26c063 100644 --- a/softmmu/dma-helpers.c +++ b/softmmu/dma-helpers.c @@ -167,7 +167,7 @@ static void dma_blk_cb(void *opaque, int ret) if (dbs->iov.size == 0) { trace_dma_map_wait(dbs); dbs->bh = aio_bh_new(ctx, reschedule_dma, dbs); - cpu_register_map_client(dbs->bh); + address_space_register_map_client(dbs->sg->as, dbs->bh); goto out; } @@ -197,7 +197,7 @@ static void dma_aio_cancel(BlockAIOCB *acb) } if (dbs->bh) { - cpu_unregister_map_client(dbs->bh); + address_space_unregister_map_client(dbs->sg->as, dbs->bh); qemu_bh_delete(dbs->bh); dbs->bh = NULL; } diff --git a/softmmu/memory.c b/softmmu/memory.c index 7d9494ce70..5c9622c3d6 100644 --- a/softmmu/memory.c +++ b/softmmu/memory.c @@ -3105,6 +3105,9 @@ void address_space_init(AddressSpace *as, MemoryRegion *root, const char *name) as->ioeventfds = NULL; QTAILQ_INIT(&as->listeners); QTAILQ_INSERT_TAIL(&address_spaces, as, address_spaces_link); + as->bounce.in_use = false; + qemu_mutex_init(&as->map_client_list_lock); + QLIST_INIT(&as->map_client_list); as->name = g_strdup(name ? name : "anonymous"); address_space_update_topology(as); address_space_update_ioeventfds(as); diff --git a/softmmu/physmem.c b/softmmu/physmem.c index 18277ddd67..f40cc564b8 100644 --- a/softmmu/physmem.c +++ b/softmmu/physmem.c @@ -2926,55 +2926,37 @@ void cpu_flush_icache_range(hwaddr start, hwaddr len) NULL, len, FLUSH_CACHE); } -typedef struct { - MemoryRegion *mr; - void *buffer; - hwaddr addr; - hwaddr len; - bool in_use; -} BounceBuffer; - -static BounceBuffer bounce; - -typedef struct MapClient { - QEMUBH *bh; - QLIST_ENTRY(MapClient) link; -} MapClient; - -QemuMutex map_client_list_lock; -static QLIST_HEAD(, MapClient) map_client_list - = QLIST_HEAD_INITIALIZER(map_client_list); - -static void cpu_unregister_map_client_do(MapClient *client) +static void +address_space_unregister_map_client_do(AddressSpaceMapClient *client) { QLIST_REMOVE(client, link); g_free(client); } -static void cpu_notify_map_clients_locked(void) +static void address_space_notify_map_clients_locked(AddressSpace *as) { - MapClient *client; + AddressSpaceMapClient *client; - while (!QLIST_EMPTY(&map_client_list)) { - client = QLIST_FIRST(&map_client_list); + while (!QLIST_EMPTY(&as->map_client_list)) { + client = QLIST_FIRST(&as->map_client_list); qemu_bh_schedule(client->bh); - cpu_unregister_map_client_do(client); + address_space_unregister_map_client_do(client); } } -void cpu_register_map_client(QEMUBH *bh) +void address_space_register_map_client(AddressSpace *as, QEMUBH *bh) { - MapClient *client = g_malloc(sizeof(*client)); + AddressSpaceMapClient *client = g_malloc(sizeof(*client)); - qemu_mutex_lock(&map_client_list_lock); + qemu_mutex_lock(&as->map_client_list_lock); client->bh = bh; - QLIST_INSERT_HEAD(&map_client_list, client, link); - /* Write map_client_list before reading in_use. */ + QLIST_INSERT_HEAD(&as->map_client_list, client, link); + /* Write map_client_list before reading bounce_buffer_size. */ smp_mb(); - if (!qatomic_read(&bounce.in_use)) { - cpu_notify_map_clients_locked(); + if (!qatomic_read(&as->bounce.in_use)) { + address_space_notify_map_clients_locked(as); } - qemu_mutex_unlock(&map_client_list_lock); + qemu_mutex_unlock(&as->map_client_list_lock); } void cpu_exec_init_all(void) @@ -2990,28 +2972,27 @@ void cpu_exec_init_all(void) finalize_target_page_bits(); io_mem_init(); memory_map_init(); - qemu_mutex_init(&map_client_list_lock); } -void cpu_unregister_map_client(QEMUBH *bh) +void address_space_unregister_map_client(AddressSpace *as, QEMUBH *bh) { - MapClient *client; + AddressSpaceMapClient *client; - qemu_mutex_lock(&map_client_list_lock); - QLIST_FOREACH(client, &map_client_list, link) { + qemu_mutex_lock(&as->map_client_list_lock); + QLIST_FOREACH(client, &as->map_client_list, link) { if (client->bh == bh) { - cpu_unregister_map_client_do(client); + address_space_unregister_map_client_do(client); break; } } - qemu_mutex_unlock(&map_client_list_lock); + qemu_mutex_unlock(&as->map_client_list_lock); } -static void cpu_notify_map_clients(void) +static void address_space_notify_map_clients(AddressSpace *as) { - qemu_mutex_lock(&map_client_list_lock); - cpu_notify_map_clients_locked(); - qemu_mutex_unlock(&map_client_list_lock); + qemu_mutex_lock(&as->map_client_list_lock); + address_space_notify_map_clients_locked(as); + qemu_mutex_unlock(&as->map_client_list_lock); } static bool flatview_access_valid(FlatView *fv, hwaddr addr, hwaddr len, @@ -3078,8 +3059,8 @@ flatview_extend_translation(FlatView *fv, hwaddr addr, * May map a subset of the requested range, given by and returned in *plen. * May return NULL if resources needed to perform the mapping are exhausted. * Use only for reads OR writes - not for read-modify-write operations. - * Use cpu_register_map_client() to know when retrying the map operation is - * likely to succeed. + * Use address_space_register_map_client() to know when retrying the map + * operation is likely to succeed. */ void *address_space_map(AddressSpace *as, hwaddr addr, @@ -3102,25 +3083,25 @@ void *address_space_map(AddressSpace *as, mr = flatview_translate(fv, addr, &xlat, &l, is_write, attrs); if (!memory_access_is_direct(mr, is_write)) { - if (qatomic_xchg(&bounce.in_use, true)) { + if (qatomic_xchg(&as->bounce.in_use, true)) { *plen = 0; return NULL; } /* Avoid unbounded allocations */ l = MIN(l, TARGET_PAGE_SIZE); - bounce.buffer = qemu_memalign(TARGET_PAGE_SIZE, l); - bounce.addr = addr; - bounce.len = l; + as->bounce.buffer = qemu_memalign(TARGET_PAGE_SIZE, l); + as->bounce.addr = addr; + as->bounce.len = l; memory_region_ref(mr); - bounce.mr = mr; + as->bounce.mr = mr; if (!is_write) { flatview_read(fv, addr, MEMTXATTRS_UNSPECIFIED, - bounce.buffer, l); + as->bounce.buffer, l); } *plen = l; - return bounce.buffer; + return as->bounce.buffer; } @@ -3138,7 +3119,7 @@ void *address_space_map(AddressSpace *as, void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len, bool is_write, hwaddr access_len) { - if (buffer != bounce.buffer) { + if (buffer != as->bounce.buffer) { MemoryRegion *mr; ram_addr_t addr1; @@ -3154,15 +3135,15 @@ void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len, return; } if (is_write) { - address_space_write(as, bounce.addr, MEMTXATTRS_UNSPECIFIED, - bounce.buffer, access_len); + address_space_write(as, as->bounce.addr, MEMTXATTRS_UNSPECIFIED, + as->bounce.buffer, access_len); } - qemu_vfree(bounce.buffer); - bounce.buffer = NULL; - memory_region_unref(bounce.mr); + qemu_vfree(as->bounce.buffer); + as->bounce.buffer = NULL; + memory_region_unref(as->bounce.mr); /* Clear in_use before reading map_client_list. */ - qatomic_set_mb(&bounce.in_use, false); - cpu_notify_map_clients(); + qatomic_set_mb(&as->bounce.in_use, false); + address_space_notify_map_clients(as); } void *cpu_physical_memory_map(hwaddr addr, From patchwork Thu Sep 7 13:04:07 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mattias Nissler X-Patchwork-Id: 13376430 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0D6E9EC8724 for ; Thu, 7 Sep 2023 13:09:14 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeEgQ-00032g-P0; Thu, 07 Sep 2023 09:04:30 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qeEgO-00031z-NO for qemu-devel@nongnu.org; Thu, 07 Sep 2023 09:04:28 -0400 Received: from mail-pl1-x636.google.com ([2607:f8b0:4864:20::636]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qeEgL-00082v-94 for qemu-devel@nongnu.org; Thu, 07 Sep 2023 09:04:28 -0400 Received: by mail-pl1-x636.google.com with SMTP id d9443c01a7336-1c09673b006so6804795ad.1 for ; Thu, 07 Sep 2023 06:04:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1694091863; x=1694696663; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=xJRHB3e+XiZ0mFQkd2Yg/uxiQhgPeUrgKbxzi/S1ze4=; b=HOYi0JEDzhl8RLMlylX+TIYQW3ZI2SHkyxbi2U4kvsHh6oCPwt90kPv+FgNd4uEbwe FxuEP8JMbx907bapmwpoBjAAUBa1AsoGH2onLtJzCKiLcgfgNH3rbb5l/V1C7jp2mflN 24WUueihH9STkVw3Kc94dFmgVt8MMj4tPkrhP3KarKiDjay04m8iuVS+NQ/KiCbWayyJ jc8tErrFKuA5rBFbCsyOnV+Rxp/Rbwr0iWpaN8BPB+tcspj83+Ju24nw/5ANiE1jepOr Q8EoAaeYFPegKl8Gf0v6MYJXyMH9573xb46fk5a2jcbxsbU58G4YKCvT1yldaYnAD6dJ BY6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1694091863; x=1694696663; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=xJRHB3e+XiZ0mFQkd2Yg/uxiQhgPeUrgKbxzi/S1ze4=; b=dWAmOfMqYIQVolwojFgRbVniDJp6KDDzHpNzqE+sXwTDlpAjqo7ZhUqKy0V/ua/eao jj13pah2QLXHh+0e77BPe2Tb+1kye6SMWp7l7xTx7A6iaZfIG57pBxR2h8aIf7nHP7vf kpTxRYaAzfSADLsutpYDgXxQSIScK9jCJ+x0XNwttFRra00XEf3zpvSB7dAB/B+zGUw9 5HftOl3PlmxiZxS/2Gtl7305wx09tAfvbhV2Z4NW6T+2RTgWO5Kp15NegrOskX2SJh2P xAxIYWegi/reLuwSvN3k6B0RkJr4LdRm6jwmbjen3Dqz0t6haK2b3XoQI+SrjWEp9tsp iD4A== X-Gm-Message-State: AOJu0Yx9afwQmyWYYeYs7h2c6/HmH6yHeutlNltg2I2PnVl7kRNHXOdG 8WQcVEoauUG0RV8vZYgICQK1GWXfOZ7gdLnQ6LifzQ== X-Google-Smtp-Source: AGHT+IER4Jj9Nx//q4fJoPYD5z0wgvt8oUKtkJZ4q9edsFN3fr5Yx973TI2If17AkhAmP2DRQZ4Y+g== X-Received: by 2002:a17:903:1109:b0:1bb:d280:5e0b with SMTP id n9-20020a170903110900b001bbd2805e0bmr18053409plh.18.1694091863100; Thu, 07 Sep 2023 06:04:23 -0700 (PDT) Received: from mnissler.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id g24-20020a170902fe1800b001b0358848b0sm12821913plj.161.2023.09.07.06.04.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 07 Sep 2023 06:04:22 -0700 (PDT) From: Mattias Nissler To: qemu-devel@nongnu.org Cc: john.levon@nutanix.com, Elena Ufimtseva , "Michael S. Tsirkin" , David Hildenbrand , Marcel Apfelbaum , =?utf-8?q?Philippe_Mathieu-D?= =?utf-8?q?aud=C3=A9?= , Richard Henderson , Jagannathan Raman , stefanha@redhat.com, Peter Xu , Paolo Bonzini , Mattias Nissler Subject: [PATCH v3 2/5] softmmu: Support concurrent bounce buffers Date: Thu, 7 Sep 2023 06:04:07 -0700 Message-Id: <20230907130410.498935-3-mnissler@rivosinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230907130410.498935-1-mnissler@rivosinc.com> References: <20230907130410.498935-1-mnissler@rivosinc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::636; envelope-from=mnissler@rivosinc.com; helo=mail-pl1-x636.google.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org When DMA memory can't be directly accessed, as is the case when running the device model in a separate process without shareable DMA file descriptors, bounce buffering is used. It is not uncommon for device models to request mapping of several DMA regions at the same time. Examples include: * net devices, e.g. when transmitting a packet that is split across several TX descriptors (observed with igb) * USB host controllers, when handling a packet with multiple data TRBs (observed with xhci) Previously, qemu only provided a single bounce buffer per AddressSpace and would fail DMA map requests while the buffer was already in use. In turn, this would cause DMA failures that ultimately manifest as hardware errors from the guest perspective. This change allocates DMA bounce buffers dynamically instead of supporting only a single buffer. Thus, multiple DMA mappings work correctly also when RAM can't be mmap()-ed. The total bounce buffer allocation size is limited individually for each AddressSpace. The default limit is 4096 bytes, matching the previous maximum buffer size. A new x-max-bounce-buffer-size parameter is provided to configure the limit for PCI devices. Signed-off-by: Mattias Nissler --- hw/pci/pci.c | 8 ++++ include/exec/memory.h | 14 ++---- include/hw/pci/pci_device.h | 3 ++ softmmu/memory.c | 3 +- softmmu/physmem.c | 94 +++++++++++++++++++++++++------------ 5 files changed, 80 insertions(+), 42 deletions(-) diff --git a/hw/pci/pci.c b/hw/pci/pci.c index 881d774fb6..8c4541b394 100644 --- a/hw/pci/pci.c +++ b/hw/pci/pci.c @@ -85,6 +85,8 @@ static Property pci_props[] = { QEMU_PCIE_ERR_UNC_MASK_BITNR, true), DEFINE_PROP_BIT("x-pcie-ari-nextfn-1", PCIDevice, cap_present, QEMU_PCIE_ARI_NEXTFN_1_BITNR, false), + DEFINE_PROP_SIZE("x-max-bounce-buffer-size", PCIDevice, + max_bounce_buffer_size, 4096), DEFINE_PROP_END_OF_LIST() }; @@ -1208,6 +1210,8 @@ static PCIDevice *do_pci_register_device(PCIDevice *pci_dev, "bus master container", UINT64_MAX); address_space_init(&pci_dev->bus_master_as, &pci_dev->bus_master_container_region, pci_dev->name); + pci_dev->bus_master_as.max_bounce_buffer_size = + pci_dev->max_bounce_buffer_size; if (phase_check(PHASE_MACHINE_READY)) { pci_init_bus_master(pci_dev); @@ -2664,6 +2668,10 @@ static void pci_device_class_init(ObjectClass *klass, void *data) k->unrealize = pci_qdev_unrealize; k->bus_type = TYPE_PCI_BUS; device_class_set_props(k, pci_props); + object_class_property_set_description( + klass, "x-max-bounce-buffer-size", + "Maximum buffer size allocated for bounce buffers used for mapped " + "access to indirect DMA memory"); } static void pci_device_class_base_init(ObjectClass *klass, void *data) diff --git a/include/exec/memory.h b/include/exec/memory.h index 7d68936157..5577542b5e 100644 --- a/include/exec/memory.h +++ b/include/exec/memory.h @@ -1081,14 +1081,6 @@ typedef struct AddressSpaceMapClient { QLIST_ENTRY(AddressSpaceMapClient) link; } AddressSpaceMapClient; -typedef struct { - MemoryRegion *mr; - void *buffer; - hwaddr addr; - hwaddr len; - bool in_use; -} BounceBuffer; - /** * struct AddressSpace: describes a mapping of addresses to #MemoryRegion objects */ @@ -1106,8 +1098,10 @@ struct AddressSpace { QTAILQ_HEAD(, MemoryListener) listeners; QTAILQ_ENTRY(AddressSpace) address_spaces_link; - /* Bounce buffer to use for this address space. */ - BounceBuffer bounce; + /* Maximum DMA bounce buffer size used for indirect memory map requests */ + uint64_t max_bounce_buffer_size; + /* Total size of bounce buffers currently allocated, atomically accessed */ + uint64_t bounce_buffer_size; /* List of callbacks to invoke when buffers free up */ QemuMutex map_client_list_lock; QLIST_HEAD(, AddressSpaceMapClient) map_client_list; diff --git a/include/hw/pci/pci_device.h b/include/hw/pci/pci_device.h index d3dd0f64b2..f4027c5379 100644 --- a/include/hw/pci/pci_device.h +++ b/include/hw/pci/pci_device.h @@ -160,6 +160,9 @@ struct PCIDevice { /* ID of standby device in net_failover pair */ char *failover_pair_id; uint32_t acpi_index; + + /* Maximum DMA bounce buffer size used for indirect memory map requests */ + uint64_t max_bounce_buffer_size; }; static inline int pci_intx(PCIDevice *pci_dev) diff --git a/softmmu/memory.c b/softmmu/memory.c index 5c9622c3d6..e02799359c 100644 --- a/softmmu/memory.c +++ b/softmmu/memory.c @@ -3105,7 +3105,8 @@ void address_space_init(AddressSpace *as, MemoryRegion *root, const char *name) as->ioeventfds = NULL; QTAILQ_INIT(&as->listeners); QTAILQ_INSERT_TAIL(&address_spaces, as, address_spaces_link); - as->bounce.in_use = false; + as->max_bounce_buffer_size = 4096; + as->bounce_buffer_size = 0; qemu_mutex_init(&as->map_client_list_lock); QLIST_INIT(&as->map_client_list); as->name = g_strdup(name ? name : "anonymous"); diff --git a/softmmu/physmem.c b/softmmu/physmem.c index f40cc564b8..e3d1cf5fba 100644 --- a/softmmu/physmem.c +++ b/softmmu/physmem.c @@ -2926,6 +2926,20 @@ void cpu_flush_icache_range(hwaddr start, hwaddr len) NULL, len, FLUSH_CACHE); } +/* + * A magic value stored in the first 8 bytes of the bounce buffer struct. Used + * to detect illegal pointers passed to address_space_unmap. + */ +#define BOUNCE_BUFFER_MAGIC 0xb4017ceb4ffe12ed + +typedef struct { + uint64_t magic; + MemoryRegion *mr; + hwaddr addr; + size_t len; + uint8_t buffer[]; +} BounceBuffer; + static void address_space_unregister_map_client_do(AddressSpaceMapClient *client) { @@ -2953,7 +2967,7 @@ void address_space_register_map_client(AddressSpace *as, QEMUBH *bh) QLIST_INSERT_HEAD(&as->map_client_list, client, link); /* Write map_client_list before reading bounce_buffer_size. */ smp_mb(); - if (!qatomic_read(&as->bounce.in_use)) { + if (qatomic_read(&as->bounce_buffer_size) < as->max_bounce_buffer_size) { address_space_notify_map_clients_locked(as); } qemu_mutex_unlock(&as->map_client_list_lock); @@ -3081,31 +3095,36 @@ void *address_space_map(AddressSpace *as, RCU_READ_LOCK_GUARD(); fv = address_space_to_flatview(as); mr = flatview_translate(fv, addr, &xlat, &l, is_write, attrs); + memory_region_ref(mr); if (!memory_access_is_direct(mr, is_write)) { - if (qatomic_xchg(&as->bounce.in_use, true)) { + size_t size = qatomic_add_fetch(&as->bounce_buffer_size, l); + if (size > as->max_bounce_buffer_size) { + size_t excess = size - as->max_bounce_buffer_size; + l -= excess; + qatomic_sub(&as->bounce_buffer_size, excess); + } + + if (l == 0) { *plen = 0; return NULL; } - /* Avoid unbounded allocations */ - l = MIN(l, TARGET_PAGE_SIZE); - as->bounce.buffer = qemu_memalign(TARGET_PAGE_SIZE, l); - as->bounce.addr = addr; - as->bounce.len = l; - memory_region_ref(mr); - as->bounce.mr = mr; + BounceBuffer *bounce = g_malloc0(l + sizeof(BounceBuffer)); + bounce->magic = BOUNCE_BUFFER_MAGIC; + bounce->mr = mr; + bounce->addr = addr; + bounce->len = l; + if (!is_write) { flatview_read(fv, addr, MEMTXATTRS_UNSPECIFIED, - as->bounce.buffer, l); + bounce->buffer, l); } *plen = l; - return as->bounce.buffer; + return bounce->buffer; } - - memory_region_ref(mr); *plen = flatview_extend_translation(fv, addr, len, mr, xlat, l, is_write, attrs); fuzz_dma_read_cb(addr, *plen, mr); @@ -3119,31 +3138,44 @@ void *address_space_map(AddressSpace *as, void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len, bool is_write, hwaddr access_len) { - if (buffer != as->bounce.buffer) { - MemoryRegion *mr; - ram_addr_t addr1; + MemoryRegion *mr; + ram_addr_t addr1; + + mr = memory_region_from_host(buffer, &addr1); + if (mr == NULL) { + BounceBuffer *bounce = container_of(buffer, BounceBuffer, buffer); + if (bounce->magic != BOUNCE_BUFFER_MAGIC) { + error_report( + "Unmap request for %p, which neither corresponds to a memory " + "region, nor looks like a bounce buffer, ignoring!", + buffer); + return; + } - mr = memory_region_from_host(buffer, &addr1); - assert(mr != NULL); if (is_write) { - invalidate_and_set_dirty(mr, addr1, access_len); + address_space_write(as, bounce->addr, MEMTXATTRS_UNSPECIFIED, + bounce->buffer, access_len); } - if (xen_enabled()) { - xen_invalidate_map_cache_entry(buffer); + + memory_region_unref(bounce->mr); + uint64_t previous_buffer_size = + qatomic_fetch_sub(&as->bounce_buffer_size, bounce->len); + if (previous_buffer_size == as->max_bounce_buffer_size) { + /* Write bounce_buffer_size before reading map_client_list. */ + smp_mb(); + address_space_notify_map_clients(as); } - memory_region_unref(mr); + bounce->magic = ~BOUNCE_BUFFER_MAGIC; + g_free(bounce); return; } + + if (xen_enabled()) { + xen_invalidate_map_cache_entry(buffer); + } if (is_write) { - address_space_write(as, as->bounce.addr, MEMTXATTRS_UNSPECIFIED, - as->bounce.buffer, access_len); - } - qemu_vfree(as->bounce.buffer); - as->bounce.buffer = NULL; - memory_region_unref(as->bounce.mr); - /* Clear in_use before reading map_client_list. */ - qatomic_set_mb(&as->bounce.in_use, false); - address_space_notify_map_clients(as); + invalidate_and_set_dirty(mr, addr1, access_len); + } } void *cpu_physical_memory_map(hwaddr addr, From patchwork Thu Sep 7 13:04:08 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mattias Nissler X-Patchwork-Id: 13376425 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E7A66EC8727 for ; Thu, 7 Sep 2023 13:06:58 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeEgU-000345-Re; Thu, 07 Sep 2023 09:04:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qeEgP-000329-1z for qemu-devel@nongnu.org; Thu, 07 Sep 2023 09:04:30 -0400 Received: from mail-pl1-x62e.google.com ([2607:f8b0:4864:20::62e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qeEgM-000834-HT for qemu-devel@nongnu.org; Thu, 07 Sep 2023 09:04:28 -0400 Received: by mail-pl1-x62e.google.com with SMTP id d9443c01a7336-1bf11b1c7d0so14969725ad.0 for ; Thu, 07 Sep 2023 06:04:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1694091865; x=1694696665; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=KKnBqBACpdkOcg5+pbvAuwZwBRdK/oxFRTqdjZK/py8=; b=Rc2BoaJyB7SFEDE0/udexL6mqpuVIJeEhpEvY1A6MHB6oPCgUauJTkB59UQ51CLHeO eHOKbwLbWA9QPcfMYSVHt6a8e21z2lbRLnpkQxxTQisbT1gSt0OyQt3QXiQDEluqE9c9 jmsTQpx3/nZrIEbXly95kVQ5v2R1QzZc0U0GsdIsnmohnHQWbz/wc8RJO4g+fMhC6lGG LgAzpCPYGuIjfciR6fFLAqSREfXIz8w6Flpx39ZOBlBzarhskpeq8II3GMJTI0j/1b1G VbDnFu/ktPlGLoPrB7LwpL9DnQQ5X60sIJ6igfOri6SywRD4QSuGXWpARvbTXLlXnidR qlFw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1694091865; x=1694696665; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=KKnBqBACpdkOcg5+pbvAuwZwBRdK/oxFRTqdjZK/py8=; b=ilOO2XRDoij/0V0CzKo3yd0CIbewC84Q8lfoqiH+dCu07yO/hyaDUbSLOVGUytuRpn 7qBgDK9o50WXnfrV4BwBZhQX55PHJYk4NlGB3yW2EZj0+ukKTxk60+C3odiSz3McZ5xu JsPDLs9ukYQBa9LZSLHI51ffCr1i0G0HmVqrWZQrrldzqZxTweEu+QTECBPd/MYsYTWy Ovl6sDB/0KEACoM78pTC5g1FtgNg9J5YxZOVQVsoBq67SmD35EpvnQf7sPXvb7KAf3vh 4YWxt6exwNFpCBrmAvs02j3iO5mYiCc5Z31NbFiyfuX9qatkHltJYxLME5GGQ1gNMRXA bVsg== X-Gm-Message-State: AOJu0Yyv1kDNDeCuVgyulaWTEIkhkaCaEcHtTFUcVJAloBgiSmeJyzom svsVmFHUMKK5fNdvLPk4csn/RJ6qVN9XZ91Vz8bjTA== X-Google-Smtp-Source: AGHT+IGTbkYlQ06tMReqaRFVTofhGBoKSy3PTDfgnwfrVsTYYmX8oHXgtvktfNr0MRmbKt+XFZEuOA== X-Received: by 2002:a17:902:ce88:b0:1bc:2fe1:1821 with SMTP id f8-20020a170902ce8800b001bc2fe11821mr3576695plg.17.1694091864618; Thu, 07 Sep 2023 06:04:24 -0700 (PDT) Received: from mnissler.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id g24-20020a170902fe1800b001b0358848b0sm12821913plj.161.2023.09.07.06.04.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 07 Sep 2023 06:04:24 -0700 (PDT) From: Mattias Nissler To: qemu-devel@nongnu.org Cc: john.levon@nutanix.com, Elena Ufimtseva , "Michael S. Tsirkin" , David Hildenbrand , Marcel Apfelbaum , =?utf-8?q?Philippe_Mathieu-D?= =?utf-8?q?aud=C3=A9?= , Richard Henderson , Jagannathan Raman , stefanha@redhat.com, Peter Xu , Paolo Bonzini , Mattias Nissler Subject: [PATCH v3 3/5] Update subprojects/libvfio-user Date: Thu, 7 Sep 2023 06:04:08 -0700 Message-Id: <20230907130410.498935-4-mnissler@rivosinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230907130410.498935-1-mnissler@rivosinc.com> References: <20230907130410.498935-1-mnissler@rivosinc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::62e; envelope-from=mnissler@rivosinc.com; helo=mail-pl1-x62e.google.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Brings in assorted bug fixes. In particular, "Fix address calculation for message-based DMA" corrects a bug in DMA address calculation which is necessary to get DMA across VFIO-user messages working. Signed-off-by: Mattias Nissler --- subprojects/libvfio-user.wrap | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/subprojects/libvfio-user.wrap b/subprojects/libvfio-user.wrap index 416955ca45..135667a40d 100644 --- a/subprojects/libvfio-user.wrap +++ b/subprojects/libvfio-user.wrap @@ -1,4 +1,4 @@ [wrap-git] url = https://gitlab.com/qemu-project/libvfio-user.git -revision = 0b28d205572c80b568a1003db2c8f37ca333e4d7 +revision = f63ef82ad01821417df488cef7ec1fd94c3883fa depth = 1 From patchwork Thu Sep 7 13:04:09 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mattias Nissler X-Patchwork-Id: 13376431 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BB523EC8724 for ; Thu, 7 Sep 2023 13:09:47 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeEgS-00033S-JC; Thu, 07 Sep 2023 09:04:32 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qeEgQ-00032h-Dz for qemu-devel@nongnu.org; Thu, 07 Sep 2023 09:04:30 -0400 Received: from mail-pg1-x52b.google.com ([2607:f8b0:4864:20::52b]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qeEgN-00083J-Db for qemu-devel@nongnu.org; Thu, 07 Sep 2023 09:04:30 -0400 Received: by mail-pg1-x52b.google.com with SMTP id 41be03b00d2f7-53fbf2c42bfso787131a12.3 for ; Thu, 07 Sep 2023 06:04:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1694091866; x=1694696666; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=47aPbA5pxaDOWep6DbmY6koMczW36VjR0/UwUNqiflo=; b=mIsDudvvnBdd/fWsI3In/QxXjXtF6vAcC+0HXi5dOoIxNAicIJKuG2tPSF7wcQ4kSW iY/dGK28HQWZkYl/orYciXBuPiZRM6EJAXrd/tkvJpxFeqSC40M13AtDYqaSHPONec7L NQ7Fb5DBspqsO9pzAooYbEsfbpomgwRFYjARmpgrimyyFgQU27MfxEehRiXz2dg7o8Ca 77rhiGFAnPZLLUM1XaNLZO+1erSofDevQGj+qyU+WjYzN30Aki8h2nmeXjQ9Zu4pqT2D jtLjA0Wknczk2LzAnrwI1GZ0c5axhM22+aDCH2Ykh6RjBizipQycbobKZlpqbxpW1SoG 0tDQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1694091866; x=1694696666; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=47aPbA5pxaDOWep6DbmY6koMczW36VjR0/UwUNqiflo=; b=dSNIkhgRI9FWWp7Q3+ztiH/QeucPER851Cg6bGQscilIv/rulkGFbjN66OyMrw7AP7 Q2YpJQ6Eb9+KrZcA97MFoNypz/DMKXZd49It7mm29OxAnkooXx9GmWIgDtj6xDbyRXu6 Y2pTCORbwzjZInJeaCI9ZFGwekymfbOfGugbJqy4bkQnPwiTUS0C9p4tgZkWQDZezdQo uQCzlBo5W4CmKPaKnKWSgwZQ+myUDPR8kaHi/Mw4p4dHVHj0fYaZEDSKZi4lOUkUJTDf dhNZBxm3hyr0kHZqNjrrUaUZpTU6xKG+5qaxMEpuDc8EMHCZehEGjM5OvRvi+PmOkwYx Jz/g== X-Gm-Message-State: AOJu0Yw1tHOqopWDrPNo1mtlCbxahI06EPDbvub3P9zC+ejOOYjN6z4g qu5BGcWoh4K11C3bHQgV9EkCgto9pc3WfUqKjB+3Vw== X-Google-Smtp-Source: AGHT+IH899vsTVJzsaiHAuEy+C18gDxE8Jmlx5eKR9dnzbPbVp8UmCID1tnWI3rqWDZJKYHmG+XW2Q== X-Received: by 2002:a17:90a:8d14:b0:268:b64b:f684 with SMTP id c20-20020a17090a8d1400b00268b64bf684mr18034190pjo.33.1694091865935; Thu, 07 Sep 2023 06:04:25 -0700 (PDT) Received: from mnissler.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id g24-20020a170902fe1800b001b0358848b0sm12821913plj.161.2023.09.07.06.04.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 07 Sep 2023 06:04:25 -0700 (PDT) From: Mattias Nissler To: qemu-devel@nongnu.org Cc: john.levon@nutanix.com, Elena Ufimtseva , "Michael S. Tsirkin" , David Hildenbrand , Marcel Apfelbaum , =?utf-8?q?Philippe_Mathieu-D?= =?utf-8?q?aud=C3=A9?= , Richard Henderson , Jagannathan Raman , stefanha@redhat.com, Peter Xu , Paolo Bonzini , Mattias Nissler Subject: [PATCH v3 4/5] vfio-user: Message-based DMA support Date: Thu, 7 Sep 2023 06:04:09 -0700 Message-Id: <20230907130410.498935-5-mnissler@rivosinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230907130410.498935-1-mnissler@rivosinc.com> References: <20230907130410.498935-1-mnissler@rivosinc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::52b; envelope-from=mnissler@rivosinc.com; helo=mail-pg1-x52b.google.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Wire up support for DMA for the case where the vfio-user client does not provide mmap()-able file descriptors, but DMA requests must be performed via the VFIO-user protocol. This installs an indirect memory region, which already works for pci_dma_{read,write}, and pci_dma_map works thanks to the existing DMA bounce buffering support. Note that while simple scenarios work with this patch, there's a known race condition in libvfio-user that will mess up the communication channel. See https://github.com/nutanix/libvfio-user/issues/279 for details as well as a proposed fix. Signed-off-by: Mattias Nissler --- hw/remote/trace-events | 2 + hw/remote/vfio-user-obj.c | 84 +++++++++++++++++++++++++++++++++++---- 2 files changed, 79 insertions(+), 7 deletions(-) diff --git a/hw/remote/trace-events b/hw/remote/trace-events index 0d1b7d56a5..358a68fb34 100644 --- a/hw/remote/trace-events +++ b/hw/remote/trace-events @@ -9,6 +9,8 @@ vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%x -> 0x%x" vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%x <- 0x%x" vfu_dma_register(uint64_t gpa, size_t len) "vfu: registering GPA 0x%"PRIx64", %zu bytes" vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64"" +vfu_dma_read(uint64_t gpa, size_t len) "vfu: DMA read 0x%"PRIx64", %zu bytes" +vfu_dma_write(uint64_t gpa, size_t len) "vfu: DMA write 0x%"PRIx64", %zu bytes" vfu_bar_register(int i, uint64_t addr, uint64_t size) "vfu: BAR %d: addr 0x%"PRIx64" size 0x%"PRIx64"" vfu_bar_rw_enter(const char *op, uint64_t addr) "vfu: %s request for BAR address 0x%"PRIx64"" vfu_bar_rw_exit(const char *op, uint64_t addr) "vfu: Finished %s of BAR address 0x%"PRIx64"" diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index 8b10c32a3c..cee5e615a9 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -300,6 +300,63 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf, return count; } +static MemTxResult vfu_dma_read(void *opaque, hwaddr addr, uint64_t *val, + unsigned size, MemTxAttrs attrs) +{ + MemoryRegion *region = opaque; + VfuObject *o = VFU_OBJECT(region->owner); + uint8_t buf[sizeof(uint64_t)]; + + trace_vfu_dma_read(region->addr + addr, size); + + dma_sg_t *sg = alloca(dma_sg_size()); + vfu_dma_addr_t vfu_addr = (vfu_dma_addr_t)(region->addr + addr); + if (vfu_addr_to_sgl(o->vfu_ctx, vfu_addr, size, sg, 1, PROT_READ) < 0 || + vfu_sgl_read(o->vfu_ctx, sg, 1, buf) != 0) { + return MEMTX_ERROR; + } + + *val = ldn_he_p(buf, size); + + return MEMTX_OK; +} + +static MemTxResult vfu_dma_write(void *opaque, hwaddr addr, uint64_t val, + unsigned size, MemTxAttrs attrs) +{ + MemoryRegion *region = opaque; + VfuObject *o = VFU_OBJECT(region->owner); + uint8_t buf[sizeof(uint64_t)]; + + trace_vfu_dma_write(region->addr + addr, size); + + stn_he_p(buf, size, val); + + dma_sg_t *sg = alloca(dma_sg_size()); + vfu_dma_addr_t vfu_addr = (vfu_dma_addr_t)(region->addr + addr); + if (vfu_addr_to_sgl(o->vfu_ctx, vfu_addr, size, sg, 1, PROT_WRITE) < 0 || + vfu_sgl_write(o->vfu_ctx, sg, 1, buf) != 0) { + return MEMTX_ERROR; + } + + return MEMTX_OK; +} + +static const MemoryRegionOps vfu_dma_ops = { + .read_with_attrs = vfu_dma_read, + .write_with_attrs = vfu_dma_write, + .endianness = DEVICE_HOST_ENDIAN, + .valid = { + .min_access_size = 1, + .max_access_size = 8, + .unaligned = true, + }, + .impl = { + .min_access_size = 1, + .max_access_size = 8, + }, +}; + static void dma_register(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info) { VfuObject *o = vfu_get_private(vfu_ctx); @@ -308,17 +365,30 @@ static void dma_register(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info) g_autofree char *name = NULL; struct iovec *iov = &info->iova; - if (!info->vaddr) { - return; - } - name = g_strdup_printf("mem-%s-%"PRIx64"", o->device, - (uint64_t)info->vaddr); + (uint64_t)iov->iov_base); subregion = g_new0(MemoryRegion, 1); - memory_region_init_ram_ptr(subregion, NULL, name, - iov->iov_len, info->vaddr); + if (info->vaddr) { + memory_region_init_ram_ptr(subregion, OBJECT(o), name, + iov->iov_len, info->vaddr); + } else { + /* + * Note that I/O regions' MemoryRegionOps handle accesses of at most 8 + * bytes at a time, and larger accesses are broken down. However, + * many/most DMA accesses are larger than 8 bytes and VFIO-user can + * handle large DMA accesses just fine, thus this size restriction + * unnecessarily hurts performance, in particular given that each + * access causes a round trip on the VFIO-user socket. + * + * TODO: Investigate how to plumb larger accesses through memory + * regions, possibly by amending MemoryRegionOps or by creating a new + * memory region type. + */ + memory_region_init_io(subregion, OBJECT(o), &vfu_dma_ops, subregion, + name, iov->iov_len); + } dma_as = pci_device_iommu_address_space(o->pci_dev); From patchwork Thu Sep 7 13:04:10 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Mattias Nissler X-Patchwork-Id: 13376440 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5C4D9EC8727 for ; Thu, 7 Sep 2023 13:11:29 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qeEgm-0003Dn-JC; Thu, 07 Sep 2023 09:04:56 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qeEgc-0003As-UI for qemu-devel@nongnu.org; Thu, 07 Sep 2023 09:04:44 -0400 Received: from mail-pl1-x632.google.com ([2607:f8b0:4864:20::632]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qeEgO-00083i-U8 for qemu-devel@nongnu.org; Thu, 07 Sep 2023 09:04:41 -0400 Received: by mail-pl1-x632.google.com with SMTP id d9443c01a7336-1c1ff5b741cso8200055ad.2 for ; Thu, 07 Sep 2023 06:04:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20230601.gappssmtp.com; s=20230601; t=1694091867; x=1694696667; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=shPARw+tLNnIUFEFBFESRJz0RNmWaqWCpot+BQW0Pwg=; b=VpiC5toVm6Wn4tPKrqBL1Fk+26N+JNKUa00hmqidn6C2+YSFUQQ9oTFA5Oz+C4cOc0 HSg6JFBdQw0m8ndXxZZcrSaTk6kZOY+tVTIQtcbgtBYEwvm9+o+EdG79YOrl9kpHaR6B SfJ5bULifNBQPBqEaqI1uKeW67zxShjErnjgIuVuzfT9S4lKIc55y1aHPM2W/kJ5ae5o zTrg22uldPwB2UNC+emwUbfjdmO67nvkH46ALgeGGKDFIKLwcMMUElvgG8FJiYHqwYMt J1Yt6EvRhvJkdsU5N2L3TddHS0/t3IJZyToY2VpnMwARAeWt7UoVrbS6gbrSQPm+fndt lgLw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1694091867; x=1694696667; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=shPARw+tLNnIUFEFBFESRJz0RNmWaqWCpot+BQW0Pwg=; b=kZhZeT1Ify/da5ub3LhpY8rTzXH6Kw0V/J+rdl9sEkimU01quRTRi23SwmRCm1grSr ntI3d2EvPc67sTfp5p59IoyJruIb08KjI32yoFKztYy3ECew3Sm5lE6LlcALYDRZ/sC7 WdcvcgzuhWUIlj011BBCxgsqRv1w+rgx80iHkWX/Q4/J+vHsDgKh1H6XtWVzma498ca3 WGUSdHjMj07rmLMv4pP6BXlNU2mpQM9WvhgRliMYNg9KI5nyLenqfi9l3mky118vXPje YQRoLnEOKEFNvwGl7XFpBuwDaZwa8qFdROn8CcNQy8kV6fI8NPD8z534qbNqXTQlwnVV xLSg== X-Gm-Message-State: AOJu0Yw5RE6H6iQVEUttBO/g/0Ou42FuJUOoLz+/Ku+XfmriDTjdsf/W 5/WVESIfQiiewZTb5Sk7x9KudnlF6+86M7ucq81Ozw== X-Google-Smtp-Source: AGHT+IGpAgrY5UZo35/0Eq6WgqoT85hokhjospBGmrfXGUYOauR8RKdMECMlSCV/uSNsnXtPSBcLxQ== X-Received: by 2002:a17:902:ec89:b0:1c3:4444:3c9e with SMTP id x9-20020a170902ec8900b001c344443c9emr12012123plg.58.1694091867221; Thu, 07 Sep 2023 06:04:27 -0700 (PDT) Received: from mnissler.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id g24-20020a170902fe1800b001b0358848b0sm12821913plj.161.2023.09.07.06.04.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 07 Sep 2023 06:04:26 -0700 (PDT) From: Mattias Nissler To: qemu-devel@nongnu.org Cc: john.levon@nutanix.com, Elena Ufimtseva , "Michael S. Tsirkin" , David Hildenbrand , Marcel Apfelbaum , =?utf-8?q?Philippe_Mathieu-D?= =?utf-8?q?aud=C3=A9?= , Richard Henderson , Jagannathan Raman , stefanha@redhat.com, Peter Xu , Paolo Bonzini , Mattias Nissler Subject: [PATCH v3 5/5] vfio-user: Fix config space access byte order Date: Thu, 7 Sep 2023 06:04:10 -0700 Message-Id: <20230907130410.498935-6-mnissler@rivosinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230907130410.498935-1-mnissler@rivosinc.com> References: <20230907130410.498935-1-mnissler@rivosinc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::632; envelope-from=mnissler@rivosinc.com; helo=mail-pl1-x632.google.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org PCI config space is little-endian, so on a big-endian host we need to perform byte swaps for values as they are passed to and received from the generic PCI config space access machinery. Signed-off-by: Mattias Nissler Reviewed-by: Philippe Mathieu-Daudé Reviewed-by: Stefan Hajnoczi --- hw/remote/vfio-user-obj.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c index cee5e615a9..d38b4700f3 100644 --- a/hw/remote/vfio-user-obj.c +++ b/hw/remote/vfio-user-obj.c @@ -281,7 +281,7 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf, while (bytes > 0) { len = (bytes > pci_access_width) ? pci_access_width : bytes; if (is_write) { - memcpy(&val, ptr, len); + val = ldn_le_p(ptr, len); pci_host_config_write_common(o->pci_dev, offset, pci_config_size(o->pci_dev), val, len); @@ -289,7 +289,7 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf, } else { val = pci_host_config_read_common(o->pci_dev, offset, pci_config_size(o->pci_dev), len); - memcpy(ptr, &val, len); + stn_le_p(ptr, len, val); trace_vfu_cfg_read(offset, val); } offset += len;