From patchwork Wed Aug 23 09:29:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mattias Nissler X-Patchwork-Id: 13361911 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F1912EE49B2 for ; Wed, 23 Aug 2023 09:29:54 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qYkB9-00056D-Lf; Wed, 23 Aug 2023 05:29:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qYkB8-0004xy-27 for qemu-devel@nongnu.org; Wed, 23 Aug 2023 05:29:30 -0400 Received: from mail-pf1-x42f.google.com ([2607:f8b0:4864:20::42f]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qYkB6-0006YJ-19 for qemu-devel@nongnu.org; Wed, 23 Aug 2023 05:29:29 -0400 Received: by mail-pf1-x42f.google.com with SMTP id d2e1a72fcca58-68a2bf8d404so2824460b3a.2 for ; Wed, 23 Aug 2023 02:29:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20221208.gappssmtp.com; s=20221208; t=1692782966; x=1693387766; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=hT2FZgX3WdlDwv5kB6R6HuP5QXnrlRh8hxyUD6s8wdQ=; b=aQQjvXnnEqhzjkZ/uVjzAdSY4USm2hcxOo435Ywyuy5lZRY1SXGPqRfqIBf6A/h3xL y9PQXuuEkTmuBPF64kLqPZ+xTEtRb7J0bl5wwbOBwLmC6parCL+tIb+rIHFjYApclysA Fhoo5nfy4PJ5sQ1qM5+0o/ennkfjY5tJx7xUpZ6TiZPm2LiDFNZIRkh4+QGvwvn4yUIw 1P031os3AdtTDFNniSovSvfHfR+HqSkudV4l/bFBMHproWqwgLvNPNyigAJyrvi+RPCq 
6iV50q1wyHuzReZoBpkcFiuYhu9H3A4ibmbpmkQdWIb3VeLYmLOj4fPkCoJX2FOrXCq1 Jd+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692782966; x=1693387766; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=hT2FZgX3WdlDwv5kB6R6HuP5QXnrlRh8hxyUD6s8wdQ=; b=aK9qTYav593b4C4aOq/VshVHHj+LAFi9eucRcNLEqAZCgEqKjmnecBFSd+VeTTBpoS okaUTK3eDobGGx77A6C7nb08DwuadgAlIt7PxnT+addZWm24hiyjdiEjVzh4fWwjlsal CMuEQMHctfdeKyvCs5gYwxzjS6sa5U5hs6b+ATWF+VJyf0aJzyq4Zi9sI1Bw0x4HNz6N Eq+SCgU8fSDM9oftAHC/bTs3O4u9hXBOFAe3PpwYFNOXvfPXJ9AZBbA6sldwbKpHQBF/ esW4zywi+4054+3fy5tjp3j1xtPBYngy17zrq9qiQNmSc5T5Y7nwE2At76jNYetXmrMb 8kbw== X-Gm-Message-State: AOJu0Yz/cah1PFSSY49wCuDy1TJE+kVMKROyWOk/6ugpbFi6E06RrsBA MVWLi/RONQZkDAopSlGoahJ81eM4jSTYI/85d/c= X-Google-Smtp-Source: AGHT+IEX9vHsG1AHGqn74XjPqUGd5NCaCv/Yk5nDnNS7w/yri8h2i3drxdONuF5Xn0FhDeo1NwytoQ== X-Received: by 2002:a05:6a00:cc1:b0:68a:54c6:3aa0 with SMTP id b1-20020a056a000cc100b0068a54c63aa0mr7818368pfv.17.1692782966184; Wed, 23 Aug 2023 02:29:26 -0700 (PDT) Received: from mnissler.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id c10-20020aa78c0a000000b006875df4773fsm1169462pfd.163.2023.08.23.02.29.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Aug 2023 02:29:25 -0700 (PDT) From: Mattias Nissler To: qemu-devel@nongnu.org Cc: john.levon@nutanix.com, stefanha@redhat.com, Jagannathan Raman , =?utf-8?q?Philippe_Mathieu-Daud?= =?utf-8?q?=C3=A9?= , Peter Xu , David Hildenbrand , Paolo Bonzini , Elena Ufimtseva , Mattias Nissler Subject: [PATCH v2 1/4] softmmu: Support concurrent bounce buffers Date: Wed, 23 Aug 2023 02:29:02 -0700 Message-Id: <20230823092905.2259418-2-mnissler@rivosinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230823092905.2259418-1-mnissler@rivosinc.com> References: <20230823092905.2259418-1-mnissler@rivosinc.com> MIME-Version: 
1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42f; envelope-from=mnissler@rivosinc.com; helo=mail-pf1-x42f.google.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

When DMA memory can't be directly accessed, as is the case when
running the device model in a separate process without shareable DMA
file descriptors, bounce buffering is used.

It is not uncommon for device models to request mapping of several DMA
regions at the same time. Examples include:
 * net devices, e.g. when transmitting a packet that is split across
   several TX descriptors (observed with igb)
 * USB host controllers, when handling a packet with multiple data TRBs
   (observed with xhci)

Previously, qemu only provided a single bounce buffer and would fail DMA
map requests while the buffer was already in use. In turn, this would
cause DMA failures that ultimately manifest as hardware errors from the
guest perspective.

This change allocates DMA bounce buffers dynamically instead of
supporting only a single buffer. Thus, multiple DMA mappings work
correctly even when RAM can't be mmap()-ed.

The total bounce buffer allocation size is limited by a new command line
parameter. The default is 4096 bytes to match the previous maximum
buffer size. It is expected that suitable limits will vary quite a bit
in practice depending on device models and workloads.

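The size accounting described above can be sketched outside of QEMU as
follows. This is a minimal illustration that uses C11 atomics in place of
QEMU's qatomic_* helpers. The names bounce_used, bounce_limit,
bounce_reserve and bounce_release are hypothetical and do not appear in the
patch. The patch subtracts the excess from the request directly; clamping
the excess to the request size is an assumption added by this sketch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-ins for the patch's counter and limit. */
static _Atomic size_t bounce_used;     /* bytes currently reserved */
static size_t bounce_limit = 4096;     /* mirrors the default limit */

/*
 * Try to reserve `want` bytes. Returns the number of bytes actually
 * granted, which may be smaller than `want` or zero when the limit
 * has been reached.
 */
static size_t bounce_reserve(size_t want)
{
    /* Optimistically account for the full request first. */
    size_t total = atomic_fetch_add(&bounce_used, want) + want;

    if (total > bounce_limit) {
        size_t excess = total - bounce_limit;

        /* Assumption of this sketch: never hand back more than requested. */
        if (excess > want) {
            excess = want;
        }
        atomic_fetch_sub(&bounce_used, excess);
        want -= excess;
    }
    return want;
}

/* Return `len` previously granted bytes to the pool. */
static void bounce_release(size_t len)
{
    size_t prev = atomic_fetch_sub(&bounce_used, len);

    /* Releasing more than was reserved indicates a caller bug. */
    assert(prev >= len);
    (void)prev;
}
```

With the default limit, two back-to-back requests for 3000 bytes would be
granted 3000 and 1096 bytes, and further requests get nothing until some
space is released.
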
Signed-off-by: Mattias Nissler
---
 include/sysemu/sysemu.h |  2 +
 qemu-options.hx         | 27 +++++++++++++
 softmmu/globals.c       |  1 +
 softmmu/physmem.c       | 84 +++++++++++++++++++++++------------------
 softmmu/vl.c            |  6 +++
 5 files changed, 83 insertions(+), 37 deletions(-)

diff --git a/include/sysemu/sysemu.h b/include/sysemu/sysemu.h
index 25be2a692e..c5dc93cb53 100644
--- a/include/sysemu/sysemu.h
+++ b/include/sysemu/sysemu.h
@@ -61,6 +61,8 @@ extern int nb_option_roms;
 extern const char *prom_envs[MAX_PROM_ENVS];
 extern unsigned int nb_prom_envs;
 
+extern uint64_t max_bounce_buffer_size;
+
 /* serial ports */
 
 /* Return the Chardev for serial port i, or NULL if none */
diff --git a/qemu-options.hx b/qemu-options.hx
index 29b98c3d4c..6071794237 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -4959,6 +4959,33 @@ SRST
 ERST
 #endif
 
+DEF("max-bounce-buffer-size", HAS_ARG,
+    QEMU_OPTION_max_bounce_buffer_size,
+    "-max-bounce-buffer-size size\n"
+    "                DMA bounce buffer size limit in bytes (default=4096)\n",
+    QEMU_ARCH_ALL)
+SRST
+``-max-bounce-buffer-size size``
+    Set the limit in bytes for DMA bounce buffer allocations.
+
+    DMA bounce buffers are used when device models request memory-mapped access
+    to memory regions that can't be directly mapped by the qemu process, so the
+    memory must be read or written to a temporary local buffer for the device
+    model to work with. This is the case e.g. for I/O memory regions, and when
+    running in multi-process mode without shared access to memory.
+
+    Whether bounce buffering is necessary depends heavily on the device model
+    implementation. Some devices use explicit DMA read and write operations
+    which do not require bounce buffers. Some devices, notably storage, will
+    retry a failed DMA map request after bounce buffer space becomes available
+    again. Most other devices will bail when encountering map request failures,
+    which will typically appear to the guest as a hardware error.
+
+    Suitable bounce buffer size values depend on the workload and guest
+    configuration. A few kilobytes up to a few megabytes are common sizes
+    encountered in practice.
+ERST
+
 DEFHEADING()
 
 DEFHEADING(Generic object creation:)
diff --git a/softmmu/globals.c b/softmmu/globals.c
index e83b5428d1..d3cc010717 100644
--- a/softmmu/globals.c
+++ b/softmmu/globals.c
@@ -54,6 +54,7 @@ const char *prom_envs[MAX_PROM_ENVS];
 uint8_t *boot_splash_filedata;
 int only_migratable; /* turn it off unless user states otherwise */
 int icount_align_option;
+uint64_t max_bounce_buffer_size = 4096;
 
 /* The bytes in qemu_uuid are in the order specified by RFC4122, _not_ in the
  * little-endian "wire format" described in the SMBIOS 2.6 specification.
diff --git a/softmmu/physmem.c b/softmmu/physmem.c
index 3df73542e1..9f0fec0c8e 100644
--- a/softmmu/physmem.c
+++ b/softmmu/physmem.c
@@ -50,6 +50,7 @@
 #include "sysemu/dma.h"
 #include "sysemu/hostmem.h"
 #include "sysemu/hw_accel.h"
+#include "sysemu/sysemu.h"
 #include "sysemu/xen-mapcache.h"
 #include "trace/trace-root.h"
 
@@ -2904,13 +2905,12 @@ void cpu_flush_icache_range(hwaddr start, hwaddr len)
 
 typedef struct {
     MemoryRegion *mr;
-    void *buffer;
     hwaddr addr;
-    hwaddr len;
-    bool in_use;
+    size_t len;
+    uint8_t buffer[];
 } BounceBuffer;
 
-static BounceBuffer bounce;
+static size_t bounce_buffer_size;
 
 typedef struct MapClient {
     QEMUBH *bh;
@@ -2945,9 +2945,9 @@ void cpu_register_map_client(QEMUBH *bh)
     qemu_mutex_lock(&map_client_list_lock);
     client->bh = bh;
     QLIST_INSERT_HEAD(&map_client_list, client, link);
-    /* Write map_client_list before reading in_use. */
+    /* Write map_client_list before reading bounce_buffer_size. */
     smp_mb();
-    if (!qatomic_read(&bounce.in_use)) {
+    if (qatomic_read(&bounce_buffer_size) < max_bounce_buffer_size) {
         cpu_notify_map_clients_locked();
     }
     qemu_mutex_unlock(&map_client_list_lock);
@@ -3076,31 +3076,35 @@ void *address_space_map(AddressSpace *as,
     RCU_READ_LOCK_GUARD();
     fv = address_space_to_flatview(as);
     mr = flatview_translate(fv, addr, &xlat, &l, is_write, attrs);
+    memory_region_ref(mr);
 
     if (!memory_access_is_direct(mr, is_write)) {
-        if (qatomic_xchg(&bounce.in_use, true)) {
+        size_t size = qatomic_add_fetch(&bounce_buffer_size, l);
+        if (size > max_bounce_buffer_size) {
+            size_t excess = size - max_bounce_buffer_size;
+            l -= excess;
+            qatomic_sub(&bounce_buffer_size, excess);
+        }
+
+        if (l == 0) {
             *plen = 0;
             return NULL;
         }
-        /* Avoid unbounded allocations */
-        l = MIN(l, TARGET_PAGE_SIZE);
-        bounce.buffer = qemu_memalign(TARGET_PAGE_SIZE, l);
-        bounce.addr = addr;
-        bounce.len = l;
 
-        memory_region_ref(mr);
-        bounce.mr = mr;
+        BounceBuffer *bounce = g_malloc(l + sizeof(BounceBuffer));
+        bounce->mr = mr;
+        bounce->addr = addr;
+        bounce->len = l;
+
         if (!is_write) {
             flatview_read(fv, addr, MEMTXATTRS_UNSPECIFIED,
-                               bounce.buffer, l);
+                               bounce->buffer, l);
         }
 
         *plen = l;
-        return bounce.buffer;
+        return bounce->buffer;
     }
 
-
-    memory_region_ref(mr);
     *plen = flatview_extend_translation(fv, addr, len, mr, xlat,
                                         l, is_write, attrs);
     fuzz_dma_read_cb(addr, *plen, mr);
@@ -3114,31 +3118,37 @@ void *address_space_map(AddressSpace *as,
 void address_space_unmap(AddressSpace *as, void *buffer, hwaddr len,
                          bool is_write, hwaddr access_len)
 {
-    if (buffer != bounce.buffer) {
-        MemoryRegion *mr;
-        ram_addr_t addr1;
+    MemoryRegion *mr;
+    ram_addr_t addr1;
+
+    mr = memory_region_from_host(buffer, &addr1);
+    if (mr == NULL) {
+        /*
+         * Must be a bounce buffer (unless the caller passed a pointer which
+         * wasn't returned by address_space_map, which is illegal).
+         */
+        BounceBuffer *bounce = container_of(buffer, BounceBuffer, buffer);
 
-        mr = memory_region_from_host(buffer, &addr1);
-        assert(mr != NULL);
         if (is_write) {
-            invalidate_and_set_dirty(mr, addr1, access_len);
-        }
-        if (xen_enabled()) {
-            xen_invalidate_map_cache_entry(buffer);
+            address_space_write(as, bounce->addr, MEMTXATTRS_UNSPECIFIED,
+                                bounce->buffer, access_len);
         }
-        memory_region_unref(mr);
+
+        memory_region_unref(bounce->mr);
+        qatomic_sub(&bounce_buffer_size, bounce->len);
+        /* Write bounce_buffer_size before reading map_client_list. */
+        smp_mb();
+        cpu_notify_map_clients();
+        g_free(bounce);
         return;
     }
+
+    if (xen_enabled()) {
+        xen_invalidate_map_cache_entry(buffer);
+    }
     if (is_write) {
-        address_space_write(as, bounce.addr, MEMTXATTRS_UNSPECIFIED,
-                            bounce.buffer, access_len);
-    }
-    qemu_vfree(bounce.buffer);
-    bounce.buffer = NULL;
-    memory_region_unref(bounce.mr);
-    /* Clear in_use before reading map_client_list. */
-    qatomic_set_mb(&bounce.in_use, false);
-    cpu_notify_map_clients();
+        invalidate_and_set_dirty(mr, addr1, access_len);
+    }
 }
 
 void *cpu_physical_memory_map(hwaddr addr,
diff --git a/softmmu/vl.c b/softmmu/vl.c
index b0b96f67fa..dbe52f5ea1 100644
--- a/softmmu/vl.c
+++ b/softmmu/vl.c
@@ -3469,6 +3469,12 @@ void qemu_init(int argc, char **argv)
                 exit(1);
 #endif
                 break;
+            case QEMU_OPTION_max_bounce_buffer_size:
+                if (qemu_strtosz(optarg, NULL, &max_bounce_buffer_size) < 0) {
+                    error_report("invalid -max-bounce-buffer-size value");
+                    exit(1);
+                }
+                break;
             case QEMU_OPTION_object:
                 object_option_parse(optarg);
                 break;

From patchwork Wed Aug 23 09:29:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mattias Nissler X-Patchwork-Id: 13361913 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384
(256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0926CEE49B0 for ; Wed, 23 Aug 2023 09:30:06 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qYkB9-00057D-RW; Wed, 23 Aug 2023 05:29:31 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qYkB8-00051i-Jj for qemu-devel@nongnu.org; Wed, 23 Aug 2023 05:29:30 -0400 Received: from mail-pf1-x42b.google.com ([2607:f8b0:4864:20::42b]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qYkB6-0006YN-VJ for qemu-devel@nongnu.org; Wed, 23 Aug 2023 05:29:30 -0400 Received: by mail-pf1-x42b.google.com with SMTP id d2e1a72fcca58-686be3cbea0so4311565b3a.0 for ; Wed, 23 Aug 2023 02:29:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20221208.gappssmtp.com; s=20221208; t=1692782967; x=1693387767; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=x77jGpzmruuxQnEwz6iNzx3FW0qgVmvAGg6kd+BPzF4=; b=IEJ28iv6HLXWcFjoz68X8qkZlJaJDSZUN1tfAhFvkB4fTJRb+t/viM3XUMqmS/Zz28 dp8ZknP1dCcDoPcOhWjfP4V71o3EELS2xmDDDPNsKOYeUG4v21NLJoe9czixZWPoUel7 BPcGBzhePaF30SvN0/TvIb4VukP9/0a8lDODbVQc8bF1wvjEput+ZwaW50xShdUvYw+j UHrIH+M8+PlifjxEujYiq8sYPGNoUpQHshIe6qqi0PcOyTEuGHNzgvR5AZXsPZWUrs8i IEjN/XgxJ1gucMUut0Hb0KKI+HZQGhSyyrz93x883/BSLo2a/SgfYy2w/sk22Hir6dgL MANw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692782967; x=1693387767; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=x77jGpzmruuxQnEwz6iNzx3FW0qgVmvAGg6kd+BPzF4=; 
b=LguT3pjgckoNTBVEIQ1LROvDkDXkkaZStVpxfD9GoT4GQeBrhVVoTDrO3lK7vEver/ XIkAun6oKnUxzlwdKS8mHB0rCmprZB3fi32qHMGXPyDKOzV36s09hRvcfh46xmHi5sfC 9XAUAozSTtqA1f0r8p83voSV1rGHA2yPJPMsYjHLrWo0nI0zXKGgj1nja7CVcOzbRgql d4ohPyumZOZlu8eKa4WVKLlEu+3cg4PUZgFiEIT+ez5i441VQzXb6W7MZGJy66e6SI0/ 5jKgNoS68qBWcnXQjunUhpiGlyZEk+qzIFVZSQAJ7152aS2yB4O6kxpCrPhGNyTvU4jx q77Q== X-Gm-Message-State: AOJu0YyJZmXwZv4JaexSVMmDUklDa8/92Xvcbm5ZoUiyTWJsIp3HV7gk 272WeR06+tKpnEUftDWg4HM18k6BzML/GeoUO8I= X-Google-Smtp-Source: AGHT+IGKv5b/4f3tOOw0ifRz23Pg8RlDVD/Rz9ktZk/9HWNknRbN4I/GaCmvAmAxJemSWmZ8VMhHyA== X-Received: by 2002:a05:6a20:9148:b0:133:38cb:2b93 with SMTP id x8-20020a056a20914800b0013338cb2b93mr18108978pzc.9.1692782967371; Wed, 23 Aug 2023 02:29:27 -0700 (PDT) Received: from mnissler.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id c10-20020aa78c0a000000b006875df4773fsm1169462pfd.163.2023.08.23.02.29.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Aug 2023 02:29:27 -0700 (PDT) From: Mattias Nissler To: qemu-devel@nongnu.org Cc: john.levon@nutanix.com, stefanha@redhat.com, Jagannathan Raman , =?utf-8?q?Philippe_Mathieu-Daud?= =?utf-8?q?=C3=A9?= , Peter Xu , David Hildenbrand , Paolo Bonzini , Elena Ufimtseva , Mattias Nissler Subject: [PATCH v2 2/4] Update subprojects/libvfio-user Date: Wed, 23 Aug 2023 02:29:03 -0700 Message-Id: <20230823092905.2259418-3-mnissler@rivosinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230823092905.2259418-1-mnissler@rivosinc.com> References: <20230823092905.2259418-1-mnissler@rivosinc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::42b; envelope-from=mnissler@rivosinc.com; helo=mail-pf1-x42b.google.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org 
X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

Brings in assorted bug fixes. In particular, "Fix address calculation
for message-based DMA" fixes a bug in DMA address calculation. That fix
is necessary to get DMA across VFIO-user messages working.

Signed-off-by: Mattias Nissler
---
 subprojects/libvfio-user.wrap | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/subprojects/libvfio-user.wrap b/subprojects/libvfio-user.wrap
index 416955ca45..47aad1ae18 100644
--- a/subprojects/libvfio-user.wrap
+++ b/subprojects/libvfio-user.wrap
@@ -1,4 +1,4 @@
 [wrap-git]
 url = https://gitlab.com/qemu-project/libvfio-user.git
-revision = 0b28d205572c80b568a1003db2c8f37ca333e4d7
+revision = cfb7d908dca025bdea6709801c5790863e902ef8
 depth = 1

From patchwork Wed Aug 23 09:29:04 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mattias Nissler X-Patchwork-Id: 13361912 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id EF549EE49B0 for ; Wed, 23 Aug 2023 09:29:59 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qYkBB-0005HI-SF; Wed, 23 Aug 2023 05:29:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qYkBA-00058h-1c for qemu-devel@nongnu.org; Wed, 23 Aug 2023 05:29:32 -0400 Received: from mail-pf1-x436.google.com
([2607:f8b0:4864:20::436]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qYkB8-0006YY-0r for qemu-devel@nongnu.org; Wed, 23 Aug 2023 05:29:31 -0400 Received: by mail-pf1-x436.google.com with SMTP id d2e1a72fcca58-68a4bcf8a97so2374505b3a.1 for ; Wed, 23 Aug 2023 02:29:29 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=rivosinc-com.20221208.gappssmtp.com; s=20221208; t=1692782968; x=1693387768; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=47aPbA5pxaDOWep6DbmY6koMczW36VjR0/UwUNqiflo=; b=mdWNy2jTcOX5alL0ykaqYEWJOwdSAsENGbfJQIqxUrBCd8EtKc7Est4G95ZQPmhwsl Xj+Ygf87BEGWITfAvo4W35EkUpYw51YZYjJKaMf11C8hRNoRGDEz/tGrOvWDxEvO2e7p FwjFrRjUBjfH0SeGPMpOnKHnVqSgs6Pa5tGRczGzDglTnYRgDQ3QWL2ENKpIxLrmqz7k WBRheiTicuUu9cxoIznoXysdDVSriEJ18St/VvEHYnsTFJGJzY+gWzJkVGd1Vxi6mcoD i7qJnZwMJ9b/xOTVoqI6nYih8lHToHr9Of8kZUiH3439Pbd0xx5FM0Wz3IOBVCcz+gx3 QckQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692782968; x=1693387768; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=47aPbA5pxaDOWep6DbmY6koMczW36VjR0/UwUNqiflo=; b=kjH0VUwibAlx+oIKVi9E3bXZe7DTTr1tW5ZD2xYdrco/8al31owEXQXFMohe+pDKnF 1YDAsx5XoWEoAywYG7tVLIpSOC+negj0atx4Z6//5UxxyiEO8E/hvuzLrJHaWugfFzCo 6slss1BHssy83KA1n71srrs1NZiCtCeFVJhaxZeSZ1NmlohHk+rv7k9BhfWIeaIXM+mD 2qOaWGRMLpm3N056UALoYmylJ2bM6DLmhwizVegEXlnoGCIO/TjQo4ZPWuR3gmAOkXxf ib6pvhH6bthUe60bxNVGQICyaDd+oHAkNsXD22sl8/O7iaGGDqQ5RF0iRR36Me0R+giO OBkA== X-Gm-Message-State: AOJu0Yw/jfP92KHYBXrfuzF1QdsiwBGkAbeYEsSICYcmeQ7yWJOGXgNa r7vxClqw8xpazAnEecY15wETmeM8mSV5qBSgpOC5Qg== X-Google-Smtp-Source: AGHT+IEViNGNzLCDjOar7eMbE8FJhayoHKWUOkHnvKaDw8zHjET/gREidt/l3oSNcoreZRwDQBOLhA== X-Received: by 2002:a05:6a00:1a13:b0:68a:49bc:9be3 with SMTP id 
g19-20020a056a001a1300b0068a49bc9be3mr8784372pfv.29.1692782968461; Wed, 23 Aug 2023 02:29:28 -0700 (PDT) Received: from mnissler.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id c10-20020aa78c0a000000b006875df4773fsm1169462pfd.163.2023.08.23.02.29.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Aug 2023 02:29:28 -0700 (PDT) From: Mattias Nissler To: qemu-devel@nongnu.org Cc: john.levon@nutanix.com, stefanha@redhat.com, Jagannathan Raman , =?utf-8?q?Philippe_Mathieu-Daud?= =?utf-8?q?=C3=A9?= , Peter Xu , David Hildenbrand , Paolo Bonzini , Elena Ufimtseva , Mattias Nissler Subject: [PATCH v2 3/4] vfio-user: Message-based DMA support Date: Wed, 23 Aug 2023 02:29:04 -0700 Message-Id: <20230823092905.2259418-4-mnissler@rivosinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230823092905.2259418-1-mnissler@rivosinc.com> References: <20230823092905.2259418-1-mnissler@rivosinc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::436; envelope-from=mnissler@rivosinc.com; helo=mail-pf1-x436.google.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

Wire up support for DMA for the case where the vfio-user client does not
provide mmap()-able file descriptors, so DMA requests must instead be
performed via the VFIO-user protocol. This installs an indirect memory
region, which already works for pci_dma_{read,write}, and pci_dma_map
works thanks to the existing DMA bounce buffering support.

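To illustrate the cost of routing DMA through an indirect region, the
sketch below models a transfer that is split into accesses of at most 8
bytes, where each access stands for one protocol round trip. It is a
simplified model and not QEMU's actual access-splitting logic; the names
chunk_read_fn, fake_backend_read and read_in_chunks are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback: one invocation models one message round trip. */
typedef int (*chunk_read_fn)(void *opaque, uint64_t addr,
                             void *buf, size_t len);

/* Toy backend for demonstration: `opaque` points at a flat byte array. */
static int fake_backend_read(void *opaque, uint64_t addr,
                             void *buf, size_t len)
{
    memcpy(buf, (const uint8_t *)opaque + addr, len);
    return 0;
}

/*
 * Read `len` bytes starting at `addr` in pieces of at most 8 bytes.
 * Returns the number of callback invocations, or -1 if one of them fails.
 */
static long read_in_chunks(chunk_read_fn fn, void *opaque, uint64_t addr,
                           uint8_t *buf, size_t len)
{
    long calls = 0;

    assert(fn != NULL);
    while (len > 0) {
        size_t n = len < 8 ? len : 8;

        if (fn(opaque, addr, buf, n) != 0) {
            return -1;
        }
        addr += n;
        buf += n;
        len -= n;
        calls++;
    }
    return calls;
}
```

In this model a 20-byte read takes three invocations (8 + 8 + 4 bytes),
which is why larger transfers become expensive when every access is a
separate message.
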
Note that while simple scenarios work with this patch, there is a known
race condition in libvfio-user that can break the communication channel.
See https://github.com/nutanix/libvfio-user/issues/279 for details as
well as a proposed fix.

Signed-off-by: Mattias Nissler
---
 hw/remote/trace-events    |  2 +
 hw/remote/vfio-user-obj.c | 84 +++++++++++++++++++++++++++++++++++----
 2 files changed, 79 insertions(+), 7 deletions(-)

diff --git a/hw/remote/trace-events b/hw/remote/trace-events
index 0d1b7d56a5..358a68fb34 100644
--- a/hw/remote/trace-events
+++ b/hw/remote/trace-events
@@ -9,6 +9,8 @@ vfu_cfg_read(uint32_t offset, uint32_t val) "vfu: cfg: 0x%x -> 0x%x"
 vfu_cfg_write(uint32_t offset, uint32_t val) "vfu: cfg: 0x%x <- 0x%x"
 vfu_dma_register(uint64_t gpa, size_t len) "vfu: registering GPA 0x%"PRIx64", %zu bytes"
 vfu_dma_unregister(uint64_t gpa) "vfu: unregistering GPA 0x%"PRIx64""
+vfu_dma_read(uint64_t gpa, size_t len) "vfu: DMA read 0x%"PRIx64", %zu bytes"
+vfu_dma_write(uint64_t gpa, size_t len) "vfu: DMA write 0x%"PRIx64", %zu bytes"
 vfu_bar_register(int i, uint64_t addr, uint64_t size) "vfu: BAR %d: addr 0x%"PRIx64" size 0x%"PRIx64""
 vfu_bar_rw_enter(const char *op, uint64_t addr) "vfu: %s request for BAR address 0x%"PRIx64""
 vfu_bar_rw_exit(const char *op, uint64_t addr) "vfu: Finished %s of BAR address 0x%"PRIx64""
diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index 8b10c32a3c..cee5e615a9 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -300,6 +300,63 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf,
     return count;
 }
 
+static MemTxResult vfu_dma_read(void *opaque, hwaddr addr, uint64_t *val,
+                                unsigned size, MemTxAttrs attrs)
+{
+    MemoryRegion *region = opaque;
+    VfuObject *o = VFU_OBJECT(region->owner);
+    uint8_t buf[sizeof(uint64_t)];
+
+    trace_vfu_dma_read(region->addr + addr, size);
+
+    dma_sg_t *sg = alloca(dma_sg_size());
+    vfu_dma_addr_t vfu_addr = (vfu_dma_addr_t)(region->addr + addr);
+    if (vfu_addr_to_sgl(o->vfu_ctx, vfu_addr, size, sg, 1, PROT_READ) < 0 ||
+        vfu_sgl_read(o->vfu_ctx, sg, 1, buf) != 0) {
+        return MEMTX_ERROR;
+    }
+
+    *val = ldn_he_p(buf, size);
+
+    return MEMTX_OK;
+}
+
+static MemTxResult vfu_dma_write(void *opaque, hwaddr addr, uint64_t val,
+                                 unsigned size, MemTxAttrs attrs)
+{
+    MemoryRegion *region = opaque;
+    VfuObject *o = VFU_OBJECT(region->owner);
+    uint8_t buf[sizeof(uint64_t)];
+
+    trace_vfu_dma_write(region->addr + addr, size);
+
+    stn_he_p(buf, size, val);
+
+    dma_sg_t *sg = alloca(dma_sg_size());
+    vfu_dma_addr_t vfu_addr = (vfu_dma_addr_t)(region->addr + addr);
+    if (vfu_addr_to_sgl(o->vfu_ctx, vfu_addr, size, sg, 1, PROT_WRITE) < 0 ||
+        vfu_sgl_write(o->vfu_ctx, sg, 1, buf) != 0) {
+        return MEMTX_ERROR;
+    }
+
+    return MEMTX_OK;
+}
+
+static const MemoryRegionOps vfu_dma_ops = {
+    .read_with_attrs = vfu_dma_read,
+    .write_with_attrs = vfu_dma_write,
+    .endianness = DEVICE_HOST_ENDIAN,
+    .valid = {
+        .min_access_size = 1,
+        .max_access_size = 8,
+        .unaligned = true,
+    },
+    .impl = {
+        .min_access_size = 1,
+        .max_access_size = 8,
+    },
+};
+
 static void dma_register(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
 {
     VfuObject *o = vfu_get_private(vfu_ctx);
@@ -308,17 +365,30 @@ static void dma_register(vfu_ctx_t *vfu_ctx, vfu_dma_info_t *info)
     g_autofree char *name = NULL;
     struct iovec *iov = &info->iova;
 
-    if (!info->vaddr) {
-        return;
-    }
-
     name = g_strdup_printf("mem-%s-%"PRIx64"", o->device,
-                           (uint64_t)info->vaddr);
+                           (uint64_t)iov->iov_base);
 
     subregion = g_new0(MemoryRegion, 1);
 
-    memory_region_init_ram_ptr(subregion, NULL, name,
-                               iov->iov_len, info->vaddr);
+    if (info->vaddr) {
+        memory_region_init_ram_ptr(subregion, OBJECT(o), name,
+                                   iov->iov_len, info->vaddr);
+    } else {
+        /*
+         * Note that I/O regions' MemoryRegionOps handle accesses of at most 8
+         * bytes at a time, and larger accesses are broken down. However,
+         * many/most DMA accesses are larger than 8 bytes and VFIO-user can
+         * handle large DMA accesses just fine, thus this size restriction
+         * unnecessarily hurts performance, in particular given that each
+         * access causes a round trip on the VFIO-user socket.
+         *
+         * TODO: Investigate how to plumb larger accesses through memory
+         * regions, possibly by amending MemoryRegionOps or by creating a new
+         * memory region type.
+         */
+        memory_region_init_io(subregion, OBJECT(o), &vfu_dma_ops, subregion,
+                              name, iov->iov_len);
+    }
 
     dma_as = pci_device_iommu_address_space(o->pci_dev);
 

From patchwork Wed Aug 23 09:29:05 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mattias Nissler X-Patchwork-Id: 13361914 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9F3ADEE49AE for ; Wed, 23 Aug 2023 09:30:16 +0000 (UTC) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1qYkBD-0005JC-4b; Wed, 23 Aug 2023 05:29:35 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1qYkBB-0005GG-Ap for qemu-devel@nongnu.org; Wed, 23 Aug 2023 05:29:33 -0400 Received: from mail-oi1-x22a.google.com ([2607:f8b0:4864:20::22a]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1qYkB9-0006Ym-J6 for qemu-devel@nongnu.org; Wed, 23 Aug 2023 05:29:33 -0400 Received: by mail-oi1-x22a.google.com with SMTP id 5614622812f47-3a85c5854deso2060679b6e.0 for ; Wed, 23 Aug 2023 02:29:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256;
c=relaxed/relaxed; d=rivosinc-com.20221208.gappssmtp.com; s=20221208; t=1692782969; x=1693387769; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=shPARw+tLNnIUFEFBFESRJz0RNmWaqWCpot+BQW0Pwg=; b=POlUhmrWg1upRg9uJcARhw66wMLMp/xQmuVxv8oTXzLJkEg6B4SjtattVobFxTlFow gmp0TESHHcy03wPtcJusCowQkIaLOhNP5aCk7pMe0sh6+/lwf9uR/9E+PryE8zIPaTlk 0nFgLN3p/fw7+EvhkZmAwsJS8fHT3jte7VOpuu9ayvrXHbfqYMMqC80Q7klc2d9ImQ78 oG7sE4boyAzgg/f7EEKwEumKgWaeMbATeqbBUo2Vte17FG3ifLAGGO5e0nMm77IcXemc zvGl/LKWLz44jlimLzafzmUArKsOT+1bdFWA3NXcIwsyU/kWo3SA50mf/PTcPoOvhp3i BVzQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1692782969; x=1693387769; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=shPARw+tLNnIUFEFBFESRJz0RNmWaqWCpot+BQW0Pwg=; b=i5QCHdVHpvJQUX4D5zjcP+Np4tPK1RBOP7Vakx2Lz2BhJRkdjW/jTY8Mkhb3XIWXyQ RXXmGfxqfLxLnHl8113mKK82D10hYkbY8P9ykyFsmQ0LhOADr0nGJxkc3qLn7B3DkzVa RQgHL4Nq8OZgNjFUuCBs3a6wvn+NyemmoGXXtiOzYVCZ7SNlzkhg8Hy8QPHDxPB1vp46 HmItepeTmoDlC9i++3j2mXz9P8Tj668tFiZvVW+gydKNRhy8YbKX+vg9LXHw5uAlazWe /zKsqGlAlW+zvFGzGzEGSM7wZCrjWcQ0YBNP4CtNqN+BaFomWK08LLMSfC8KArzlkNLw iWBw== X-Gm-Message-State: AOJu0YxubGqvfey9kb/uF/JdDb4G/tBWDQCv2ZUQRQYW4n1WlEPvfM9F Stg9uWuIfTHz10nriOYfdrAJ8cuaGx9GAfxENwpJqQ== X-Google-Smtp-Source: AGHT+IEIIktRaJjOEgpZVEsFAvzZG+CANLeW9zgdhIp/MDGf5qbOD5X5pknwZHsuL689Xo16Bntv8Q== X-Received: by 2002:aca:d1a:0:b0:3a7:2d4c:5d4e with SMTP id 26-20020aca0d1a000000b003a72d4c5d4emr12601809oin.7.1692782969727; Wed, 23 Aug 2023 02:29:29 -0700 (PDT) Received: from mnissler.ba.rivosinc.com ([66.220.2.162]) by smtp.gmail.com with ESMTPSA id c10-20020aa78c0a000000b006875df4773fsm1169462pfd.163.2023.08.23.02.29.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 23 Aug 2023 02:29:29 -0700 (PDT) From: Mattias 
Nissler To: qemu-devel@nongnu.org Cc: john.levon@nutanix.com, stefanha@redhat.com, Jagannathan Raman , =?utf-8?q?Philippe_Mathieu-Daud?= =?utf-8?q?=C3=A9?= , Peter Xu , David Hildenbrand , Paolo Bonzini , Elena Ufimtseva , Mattias Nissler Subject: [PATCH v2 4/4] vfio-user: Fix config space access byte order Date: Wed, 23 Aug 2023 02:29:05 -0700 Message-Id: <20230823092905.2259418-5-mnissler@rivosinc.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20230823092905.2259418-1-mnissler@rivosinc.com> References: <20230823092905.2259418-1-mnissler@rivosinc.com> MIME-Version: 1.0 Received-SPF: pass client-ip=2607:f8b0:4864:20::22a; envelope-from=mnissler@rivosinc.com; helo=mail-oi1-x22a.google.com X-Spam_score_int: -18 X-Spam_score: -1.9 X-Spam_bar: - X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

PCI config space is little-endian, so on a big-endian host we need to
perform byte swaps for values as they are passed to and received from
the generic PCI config space access machinery.

Signed-off-by: Mattias Nissler
---
 hw/remote/vfio-user-obj.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/hw/remote/vfio-user-obj.c b/hw/remote/vfio-user-obj.c
index cee5e615a9..d38b4700f3 100644
--- a/hw/remote/vfio-user-obj.c
+++ b/hw/remote/vfio-user-obj.c
@@ -281,7 +281,7 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf,
     while (bytes > 0) {
         len = (bytes > pci_access_width) ? pci_access_width : bytes;
         if (is_write) {
-            memcpy(&val, ptr, len);
+            val = ldn_le_p(ptr, len);
             pci_host_config_write_common(o->pci_dev, offset,
                                          pci_config_size(o->pci_dev),
                                          val, len);
@@ -289,7 +289,7 @@ static ssize_t vfu_object_cfg_access(vfu_ctx_t *vfu_ctx, char * const buf,
         } else {
             val = pci_host_config_read_common(o->pci_dev, offset,
                                               pci_config_size(o->pci_dev), len);
-            memcpy(ptr, &val, len);
+            stn_le_p(ptr, len, val);
             trace_vfu_cfg_read(offset, val);
         }
         offset += len;
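
For readers unfamiliar with the helpers used in this change, the sketch
below shows the general idea of a host-independent little-endian load and
store, which is why the result no longer depends on the host byte order
the way a plain memcpy() does. The functions load_le and store_le are
hypothetical stand-ins written for this illustration; they are not QEMU's
ldn_le_p/stn_le_p implementations and, as an assumption of the sketch,
accept any length from 1 to 8 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read `len` (1..8) bytes as a little-endian value. */
static uint64_t load_le(const void *ptr, size_t len)
{
    const uint8_t *p = ptr;
    uint64_t val = 0;

    assert(len >= 1 && len <= 8);
    for (size_t i = 0; i < len; i++) {
        /* Byte i carries bits [8*i, 8*i + 7] regardless of host order. */
        val |= (uint64_t)p[i] << (8 * i);
    }
    return val;
}

/* Hypothetical helper: write the low `len` (1..8) bytes of `val`. */
static void store_le(void *ptr, size_t len, uint64_t val)
{
    uint8_t *p = ptr;

    assert(len >= 1 && len <= 8);
    for (size_t i = 0; i < len; i++) {
        p[i] = (uint8_t)(val >> (8 * i));
    }
}
```

Because the helpers address bytes explicitly, the bytes 0x86 0x80 always
decode to 0x8086 here, whereas memcpy() into an integer would yield 0x8680
on a big-endian host.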