From patchwork Thu Feb 16 11:47:51 2023
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13142979
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, "Michael S. Tsirkin", Stefan Hajnoczi,
 "Dr . David Alan Gilbert", Tiwei Bie
Subject: [PATCH v1 1/2] vhost: Defer filtering memory sections until building the vhost memory structure
Date: Thu, 16 Feb 2023 12:47:51 +0100
Message-Id: <20230216114752.198627-2-david@redhat.com>
In-Reply-To: <20230216114752.198627-1-david@redhat.com>
References: <20230216114752.198627-1-david@redhat.com>

Having multiple devices, some filtering memslots and some not filtering
memslots, messes up the "used_memslot" accounting.
If we have a device that filters out fewer memory sections after a
device that filters out more, we are in trouble, because our memslot
checks stop working reliably. For example, hotplugging a device that
filters out fewer memslots might pass the checks based on max vs. used
memslots, but can then run out of memslots when getting notified about
all memory sections.

Further, it will be helpful in memory device context in the near future
to know that a RAM memory region section will consume a memslot, and be
accounted for in the used vs. free memslots, such that we can implement
reservation of memslots for memory devices properly. Whether a device
filters this out and would theoretically still have a free memslot is
then hidden internally, making overall vhost memslot accounting easier.

Let's filter the memslots when creating the vhost memory array,
accounting all RAM && !ROM memory regions as "used_memslots" even if
vhost_user isn't interested in anonymous RAM regions, because it needs
an fd.

When a device actually filters out regions (which should happen rarely
in practice), we might detect a layout change although only filtered
regions changed. We won't bother about optimizing that for now.

Note: we cannot simply filter out regions, count them as "filtered",
and add that count to the used memslots, because filtered regions could
get merged and result in a smaller effective number of memslots.

Further, we won't touch the hmp/qmp virtio introspection output.
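
To make the accounting idea above concrete, here is a minimal,
self-contained sketch. It is not the QEMU code: struct sketch_sec,
struct sketch_region, sketch_fd_only() and sketch_build_regions() are
hypothetical names invented for this illustration. The assumption is
that every tracked section counts as one used memslot, while the
per-device region table is built by applying the filter late.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a tracked RAM section. */
struct sketch_sec {
    uint64_t gpa;     /* guest physical start address */
    uint64_t size;    /* size in bytes */
    bool has_fd;      /* false: anonymous RAM without a backing fd */
};

/* Hypothetical, simplified stand-in for one vhost memory region entry. */
struct sketch_region {
    uint64_t guest_phys_addr;
    uint64_t memory_size;
};

/* Hypothetical filter: behaves like a backend that needs an fd. */
static bool sketch_fd_only(const struct sketch_sec *s)
{
    return s->has_fd;
}

/*
 * Build the device-specific region table from all tracked sections,
 * skipping the ones the filter rejects. A NULL filter accepts everything.
 * Returns the number of entries written to 'out', which must have room
 * for 'n' entries.
 */
static size_t sketch_build_regions(const struct sketch_sec *secs, size_t n,
                                   bool (*filter)(const struct sketch_sec *),
                                   struct sketch_region *out)
{
    size_t nregions = 0;

    for (size_t i = 0; i < n; i++) {
        if (filter && !filter(&secs[i])) {
            continue;
        }
        out[nregions].guest_phys_addr = secs[i].gpa;
        out[nregions].memory_size = secs[i].size;
        nregions++;
    }
    return nregions;
}
```

In this sketch the globally accounted number of memslots would be n,
the number of tracked sections, independent of how many entries a
particular filter lets through. That is what keeps the "max vs. used"
check stable when devices with different filters are mixed.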

Fixes: 988a27754bbb ("vhost: allow backends to filter memory sections")
Cc: Tiwei Bie
Signed-off-by: David Hildenbrand
---
 hw/virtio/vhost.c | 79 +++++++++++++++++++++++++++++++++--------------
 1 file changed, 55 insertions(+), 24 deletions(-)

diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index eb8c4c378c..b7fb960fa9 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -219,8 +219,13 @@ static void vhost_log_sync_range(struct vhost_dev *dev,
     int i;
     /* FIXME: this is N^2 in number of sections */
     for (i = 0; i < dev->n_mem_sections; ++i) {
-        MemoryRegionSection *section = &dev->mem_sections[i];
-        vhost_sync_dirty_bitmap(dev, section, first, last);
+        MemoryRegionSection *mrs = &dev->mem_sections[i];
+
+        if (dev->vhost_ops->vhost_backend_mem_section_filter &&
+            !dev->vhost_ops->vhost_backend_mem_section_filter(dev, mrs)) {
+            continue;
+        }
+        vhost_sync_dirty_bitmap(dev, mrs, first, last);
     }
 }
 
@@ -503,12 +508,6 @@ static bool vhost_section(struct vhost_dev *dev, MemoryRegionSection *section)
             return false;
         }
 
-        if (dev->vhost_ops->vhost_backend_mem_section_filter &&
-            !dev->vhost_ops->vhost_backend_mem_section_filter(dev, section)) {
-            trace_vhost_reject_section(mr->name, 2);
-            return false;
-        }
-
         trace_vhost_section(mr->name);
         return true;
     } else {
@@ -525,6 +524,43 @@ static void vhost_begin(MemoryListener *listener)
     dev->n_tmp_sections = 0;
 }
 
+static void vhost_realloc_vhost_memory(struct vhost_dev *dev,
+                                       unsigned int nregions)
+{
+    const size_t size = offsetof(struct vhost_memory, regions) +
+                        nregions * sizeof dev->mem->regions[0];
+
+    dev->mem = g_realloc(dev->mem, size);
+    dev->mem->nregions = nregions;
+}
+
+static void vhost_rebuild_vhost_memory(struct vhost_dev *dev)
+{
+    unsigned int nregions = 0;
+    int i;
+
+    vhost_realloc_vhost_memory(dev, dev->n_mem_sections);
+    for (i = 0; i < dev->n_mem_sections; i++) {
+        struct MemoryRegionSection *mrs = dev->mem_sections + i;
+        struct vhost_memory_region *cur_vmr;
+
+        if (dev->vhost_ops->vhost_backend_mem_section_filter &&
+            !dev->vhost_ops->vhost_backend_mem_section_filter(dev, mrs)) {
+            continue;
+        }
+        cur_vmr = dev->mem->regions + nregions;
+        nregions++;
+
+        cur_vmr->guest_phys_addr = mrs->offset_within_address_space;
+        cur_vmr->memory_size = int128_get64(mrs->size);
+        cur_vmr->userspace_addr =
+            (uintptr_t)memory_region_get_ram_ptr(mrs->mr) +
+            mrs->offset_within_region;
+        cur_vmr->flags_padding = 0;
+    }
+    vhost_realloc_vhost_memory(dev, nregions);
+}
+
 static void vhost_commit(MemoryListener *listener)
 {
     struct vhost_dev *dev = container_of(listener, struct vhost_dev,
@@ -532,7 +568,6 @@ static void vhost_commit(MemoryListener *listener)
     MemoryRegionSection *old_sections;
     int n_old_sections;
     uint64_t log_size;
-    size_t regions_size;
     int r;
     int i;
     bool changed = false;
@@ -564,23 +599,19 @@ static void vhost_commit(MemoryListener *listener)
         goto out;
     }
 
-    /* Rebuild the regions list from the new sections list */
-    regions_size = offsetof(struct vhost_memory, regions) +
-                       dev->n_mem_sections * sizeof dev->mem->regions[0];
-    dev->mem = g_realloc(dev->mem, regions_size);
-    dev->mem->nregions = dev->n_mem_sections;
+    /*
+     * Globally track the used memslots *without* device specific
+     * filtering. This way, we always know how many memslots are required
+     * when devices with differing filtering requirements get mixed, and
+     * all RAM memory regions of memory devices will consume memslots.
+     */
     used_memslots = dev->mem->nregions;
-    for (i = 0; i < dev->n_mem_sections; i++) {
-        struct vhost_memory_region *cur_vmr = dev->mem->regions + i;
-        struct MemoryRegionSection *mrs = dev->mem_sections + i;
 
-        cur_vmr->guest_phys_addr = mrs->offset_within_address_space;
-        cur_vmr->memory_size = int128_get64(mrs->size);
-        cur_vmr->userspace_addr =
-            (uintptr_t)memory_region_get_ram_ptr(mrs->mr) +
-            mrs->offset_within_region;
-        cur_vmr->flags_padding = 0;
-    }
+    /*
+     * Rebuild the regions list from the new sections list, filtering out all
+     * sections that this device is not interested in.
+     */
+    vhost_rebuild_vhost_memory(dev);
 
     if (!dev->started) {
         goto out;

From patchwork Thu Feb 16 11:47:52 2023
X-Patchwork-Submitter: David Hildenbrand
X-Patchwork-Id: 13142980
From: David Hildenbrand
To: qemu-devel@nongnu.org
Cc: David Hildenbrand, "Michael S. Tsirkin", Stefan Hajnoczi,
 "Dr . David Alan Gilbert"
Subject: [PATCH v1 2/2] vhost: Remove vhost_backend_can_merge() callback
Date: Thu, 16 Feb 2023 12:47:52 +0100
Message-Id: <20230216114752.198627-3-david@redhat.com>
In-Reply-To: <20230216114752.198627-1-david@redhat.com>
References: <20230216114752.198627-1-david@redhat.com>

Checking whether the memory regions are equal is sufficient: if they
are equal, then most certainly the contained fd is equal.

The whole vhost-user memslot handling is suboptimal and overly
complicated. We shouldn't have to look up a RAM memory region we got
notified about in vhost_user_get_mr_data() using a host pointer. But
that requires a bigger rework, especially an alternative
vhost_set_mem_table() backend call that simply consumes
MemoryRegionSections.

For now, let's just drop vhost_backend_can_merge().

Signed-off-by: David Hildenbrand
Reviewed-by: Igor Mammedov
---
 hw/virtio/vhost-user.c            | 14 --------------
 hw/virtio/vhost-vdpa.c            |  1 -
 hw/virtio/vhost.c                 |  6 +-----
 include/hw/virtio/vhost-backend.h |  4 ----
 4 files changed, 1 insertion(+), 24 deletions(-)

diff --git a/hw/virtio/vhost-user.c b/hw/virtio/vhost-user.c
index e68daa35d4..4bfaf559a7 100644
--- a/hw/virtio/vhost-user.c
+++ b/hw/virtio/vhost-user.c
@@ -2195,19 +2195,6 @@ static int vhost_user_migration_done(struct vhost_dev *dev, char* mac_addr)
     return -ENOTSUP;
 }
 
-static bool vhost_user_can_merge(struct vhost_dev *dev,
-                                 uint64_t start1, uint64_t size1,
-                                 uint64_t start2, uint64_t size2)
-{
-    ram_addr_t offset;
-    int mfd, rfd;
-
-    (void)vhost_user_get_mr_data(start1, &offset, &mfd);
-    (void)vhost_user_get_mr_data(start2, &offset, &rfd);
-
-    return mfd == rfd;
-}
-
 static int vhost_user_net_set_mtu(struct vhost_dev *dev, uint16_t mtu)
 {
     VhostUserMsg msg;
@@ -2704,7 +2691,6 @@ const VhostOps user_ops = {
         .vhost_set_vring_enable = vhost_user_set_vring_enable,
         .vhost_requires_shm_log = vhost_user_requires_shm_log,
         .vhost_migration_done = vhost_user_migration_done,
-        .vhost_backend_can_merge = vhost_user_can_merge,
         .vhost_net_set_mtu = vhost_user_net_set_mtu,
         .vhost_set_iotlb_callback = vhost_user_set_iotlb_callback,
         .vhost_send_device_iotlb_msg = vhost_user_send_device_iotlb_msg,
diff --git a/hw/virtio/vhost-vdpa.c b/hw/virtio/vhost-vdpa.c
index 542e003101..9ab7bc8718 100644
--- a/hw/virtio/vhost-vdpa.c
+++ b/hw/virtio/vhost-vdpa.c
@@ -1317,7 +1317,6 @@ const VhostOps vdpa_ops = {
         .vhost_set_config = vhost_vdpa_set_config,
         .vhost_requires_shm_log = NULL,
         .vhost_migration_done = NULL,
-        .vhost_backend_can_merge = NULL,
         .vhost_net_set_mtu = NULL,
         .vhost_set_iotlb_callback = NULL,
         .vhost_send_device_iotlb_msg = NULL,
diff --git a/hw/virtio/vhost.c b/hw/virtio/vhost.c
index b7fb960fa9..9d8662aa98 100644
--- a/hw/virtio/vhost.c
+++ b/hw/virtio/vhost.c
@@ -733,11 +733,7 @@ static void vhost_region_add_section(struct vhost_dev *dev,
             size_t offset = mrs_gpa - prev_gpa_start;
 
             if (prev_host_start + offset == mrs_host &&
-                section->mr == prev_sec->mr &&
-                (!dev->vhost_ops->vhost_backend_can_merge ||
-                 dev->vhost_ops->vhost_backend_can_merge(dev,
-                    mrs_host, mrs_size,
-                    prev_host_start, prev_size))) {
+                section->mr == prev_sec->mr) {
                 uint64_t max_end = MAX(prev_host_end, mrs_host + mrs_size);
                 need_add = false;
                 prev_sec->offset_within_address_space =
diff --git a/include/hw/virtio/vhost-backend.h b/include/hw/virtio/vhost-backend.h
index c5ab49051e..abf1601ba2 100644
--- a/include/hw/virtio/vhost-backend.h
+++ b/include/hw/virtio/vhost-backend.h
@@ -86,9 +86,6 @@ typedef int (*vhost_set_vring_enable_op)(struct vhost_dev *dev,
 typedef bool (*vhost_requires_shm_log_op)(struct vhost_dev *dev);
 typedef int (*vhost_migration_done_op)(struct vhost_dev *dev,
                                        char *mac_addr);
-typedef bool (*vhost_backend_can_merge_op)(struct vhost_dev *dev,
-                                           uint64_t start1, uint64_t size1,
-                                           uint64_t start2, uint64_t size2);
 typedef int (*vhost_vsock_set_guest_cid_op)(struct vhost_dev *dev,
                                             uint64_t guest_cid);
 typedef int (*vhost_vsock_set_running_op)(struct vhost_dev *dev, int start);
@@ -160,7 +157,6 @@ typedef struct VhostOps {
     vhost_set_vring_enable_op vhost_set_vring_enable;
     vhost_requires_shm_log_op vhost_requires_shm_log;
     vhost_migration_done_op vhost_migration_done;
-    vhost_backend_can_merge_op vhost_backend_can_merge;
     vhost_vsock_set_guest_cid_op vhost_vsock_set_guest_cid;
     vhost_vsock_set_running_op vhost_vsock_set_running;
     vhost_set_iotlb_callback_op vhost_set_iotlb_callback;
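
For readers who want the merge condition that remains after patch 2/2
in isolation, the following is a minimal sketch rather than the QEMU
implementation. struct sketch_section and sketch_can_merge() are
hypothetical names. The sketch assumes the caller has already
established that the two ranges are adjacent or overlapping in guest
physical address space, and it only mirrors the two checks left in the
vhost_region_add_section() hunk: a consistent host address and the same
memory region.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for a MemoryRegionSection. */
struct sketch_section {
    const void *mr;   /* identity of the backing memory region */
    uint64_t gpa;     /* guest physical start address */
    uint64_t host;    /* host virtual start address */
    uint64_t size;    /* size in bytes */
};

/*
 * Two sections may be merged when the host mapping of 'cur' sits at the
 * same offset from 'prev' as its guest physical address does, and both
 * sections belong to the same memory region. No backend callback is
 * consulted: equal regions are taken to imply an equal fd.
 */
static bool sketch_can_merge(const struct sketch_section *prev,
                             const struct sketch_section *cur)
{
    uint64_t offset = cur->gpa - prev->gpa;

    return prev->host + offset == cur->host && prev->mr == cur->mr;
}
```

With these checks, two sections of the same region that are contiguous
in both guest and host address space merge, while a section from a
different region, or one mapped at an unrelated host address, does not.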