From patchwork Thu Sep 14 01:55:22 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Sean Christopherson
X-Patchwork-Id: 13384031
Reply-To: Sean Christopherson
Date: Wed, 13 Sep 2023 18:55:22 -0700
In-Reply-To: <20230914015531.1419405-1-seanjc@google.com>
References: <20230914015531.1419405-1-seanjc@google.com>
Message-ID: <20230914015531.1419405-25-seanjc@google.com>
Subject: [RFC PATCH v12 24/33] KVM: selftests: Add support for creating
 private memslots
From: Sean Christopherson
To: Paolo Bonzini, Marc Zyngier, Oliver Upton, Huacai Chen,
 Michael Ellerman, Anup Patel, Paul Walmsley, Palmer Dabbelt, Albert Ou,
 Sean Christopherson, "Matthew Wilcox (Oracle)", Andrew Morton,
 Paul Moore, James Morris, "Serge E. Hallyn"
Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 kvmarm@lists.linux.dev, linux-mips@vger.kernel.org,
 linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
 linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org,
 linux-mm@kvack.org, linux-security-module@vger.kernel.org,
 linux-kernel@vger.kernel.org, Chao Peng, Fuad Tabba, Jarkko Sakkinen,
 Anish Moorthy, Yu Zhang, Isaku Yamahata, Xu Yilun, Vlastimil Babka,
 Vishal Annapurve, Ackerley Tng, Maciej Szmigiero, David Hildenbrand,
 Quentin Perret, Michael Roth, Wang, Liam Merwick, Isaku Yamahata,
 "Kirill A . Shutemov"

Add support for creating "private" memslots via KVM_CREATE_GUEST_MEMFD and
KVM_SET_USER_MEMORY_REGION2.  Make vm_userspace_mem_region_add() a wrapper
around its effective replacement, vm_mem_add(), so that private memslots
are fully opt-in, i.e. don't require updating all tests that add memory
regions.

Pivot on the KVM_MEM_PRIVATE flag instead of the validity of the "gmem"
file descriptor so that simple tests can let vm_mem_add() do the heavy
lifting of creating the guest memfd, but also allow the caller to pass in
an explicit fd+offset so that fancier tests can do things like back
multiple memslots with a single file.  If the caller passes in a fd, dup()
the fd so that (a) __vm_mem_region_delete() can close the fd associated
with the memory region without needing yet another flag, and (b) the
caller can safely close its copy of the fd without having to first destroy
memslots.

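To illustrate the dup() ownership rule described above, here is a minimal
standalone sketch.  It is not part of the patch and does not use the
selftests framework: a pipe fd stands in for a guest_memfd (which would
need a KVM VM), and struct demo_region, demo_region_add(),
demo_region_delete(), fd_is_open() and demo_shared_backing() are all
hypothetical names made up for this example.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical stand-in for a memslot that owns a private copy of an fd. */
struct demo_region {
	int fd;
};

/* Returns 1 if fd is currently an open descriptor, 0 otherwise. */
static int fd_is_open(int fd)
{
	return fcntl(fd, F_GETFD) != -1;
}

/* Mimic the "caller passed an fd" path: keep a dup(), not the original. */
static void demo_region_add(struct demo_region *r, int caller_fd)
{
	r->fd = dup(caller_fd);
	assert(r->fd >= 0);
}

/* Mimic region deletion: unconditionally close the region's own fd. */
static void demo_region_delete(struct demo_region *r)
{
	if (r->fd >= 0)
		close(r->fd);
	r->fd = -1;
}

/* Back two regions with one file, then drop the caller's fd first. */
int demo_shared_backing(void)
{
	struct demo_region a, b;
	int fds[2];

	/* Assumption: a pipe is used only because it needs no KVM support. */
	if (pipe(fds))
		return -1;

	demo_region_add(&a, fds[0]);
	demo_region_add(&b, fds[0]);

	/* (b) The caller may close its copy before the regions go away. */
	close(fds[0]);
	assert(fd_is_open(a.fd) && fd_is_open(b.fd));

	/* (a) Deleting one region closes only that region's duplicate. */
	demo_region_delete(&a);
	assert(fd_is_open(b.fd));

	demo_region_delete(&b);
	close(fds[1]);
	return 0;
}
```

Calling demo_shared_backing() from a test's main() should return 0.  The
point of the design is that each region always owns exactly one fd, so
deletion never has to ask who created it.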
Co-developed-by: Ackerley Tng
Signed-off-by: Ackerley Tng
Signed-off-by: Sean Christopherson
---
 .../selftests/kvm/include/kvm_util_base.h     | 23 +++++
 .../testing/selftests/kvm/include/test_util.h |  5 ++
 tools/testing/selftests/kvm/lib/kvm_util.c    | 85 ++++++++++++-------
 3 files changed, 82 insertions(+), 31 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 9f144841c2ee..47ea25f9dc97 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -431,6 +431,26 @@ static inline uint64_t vm_get_stat(struct kvm_vm *vm, const char *stat_name)
 
 void vm_create_irqchip(struct kvm_vm *vm);
 
+static inline int __vm_create_guest_memfd(struct kvm_vm *vm, uint64_t size,
+					uint64_t flags)
+{
+	struct kvm_create_guest_memfd gmem = {
+		.size = size,
+		.flags = flags,
+	};
+
+	return __vm_ioctl(vm, KVM_CREATE_GUEST_MEMFD, &gmem);
+}
+
+static inline int vm_create_guest_memfd(struct kvm_vm *vm, uint64_t size,
+					uint64_t flags)
+{
+	int fd = __vm_create_guest_memfd(vm, size, flags);
+
+	TEST_ASSERT(fd >= 0, KVM_IOCTL_ERROR(KVM_CREATE_GUEST_MEMFD, fd));
+	return fd;
+}
+
 void vm_set_user_memory_region(struct kvm_vm *vm, uint32_t slot, uint32_t flags,
 			       uint64_t gpa, uint64_t size, void *hva);
 int __vm_set_user_memory_region(struct kvm_vm *vm, uint32_t slot, uint32_t flags,
@@ -439,6 +459,9 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	enum vm_mem_backing_src_type src_type,
 	uint64_t guest_paddr, uint32_t slot, uint64_t npages,
 	uint32_t flags);
+void vm_mem_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type,
+		uint64_t guest_paddr, uint32_t slot, uint64_t npages,
+		uint32_t flags, int gmem_fd, uint64_t gmem_offset);
 
 void vm_mem_region_set_flags(struct kvm_vm *vm, uint32_t slot, uint32_t flags);
 void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa);
diff --git a/tools/testing/selftests/kvm/include/test_util.h b/tools/testing/selftests/kvm/include/test_util.h
index 7e614adc6cf4..7257f2243ab9 100644
--- a/tools/testing/selftests/kvm/include/test_util.h
+++ b/tools/testing/selftests/kvm/include/test_util.h
@@ -142,6 +142,11 @@ static inline bool backing_src_is_shared(enum vm_mem_backing_src_type t)
 	return vm_mem_backing_src_alias(t)->flag & MAP_SHARED;
 }
 
+static inline bool backing_src_can_be_huge(enum vm_mem_backing_src_type t)
+{
+	return t != VM_MEM_SRC_ANONYMOUS && t != VM_MEM_SRC_SHMEM;
+}
+
 /* Aligns x up to the next multiple of size. Size must be a power of 2. */
 static inline uint64_t align_up(uint64_t x, uint64_t size)
 {
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 3676b37bea38..127f44c6c83c 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -669,6 +669,8 @@ static void __vm_mem_region_delete(struct kvm_vm *vm,
 		TEST_ASSERT(!ret, __KVM_SYSCALL_ERROR("munmap()", ret));
 		close(region->fd);
 	}
+	if (region->region.gmem_fd >= 0)
+		close(region->region.gmem_fd);
 
 	free(region);
 }
@@ -870,36 +872,15 @@ void vm_set_user_memory_region(struct kvm_vm *vm, uint32_t slot, uint32_t flags,
 		    errno, strerror(errno));
 }
 
-/*
- * VM Userspace Memory Region Add
- *
- * Input Args:
- *   vm - Virtual Machine
- *   src_type - Storage source for this region.
- *              NULL to use anonymous memory.
- *   guest_paddr - Starting guest physical address
- *   slot - KVM region slot
- *   npages - Number of physical pages
- *   flags - KVM memory region flags (e.g. KVM_MEM_LOG_DIRTY_PAGES)
- *
- * Output Args: None
- *
- * Return: None
- *
- * Allocates a memory area of the number of pages specified by npages
- * and maps it to the VM specified by vm, at a starting physical address
- * given by guest_paddr. The region is created with a KVM region slot
- * given by slot, which must be unique and < KVM_MEM_SLOTS_NUM. The
- * region is created with the flags given by flags.
- */
-void vm_userspace_mem_region_add(struct kvm_vm *vm,
-	enum vm_mem_backing_src_type src_type,
-	uint64_t guest_paddr, uint32_t slot, uint64_t npages,
-	uint32_t flags)
+/* FIXME: This thing needs to be ripped apart and rewritten. */
+void vm_mem_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type,
+		uint64_t guest_paddr, uint32_t slot, uint64_t npages,
+		uint32_t flags, int gmem_fd, uint64_t gmem_offset)
 {
 	int ret;
 	struct userspace_mem_region *region;
 	size_t backing_src_pagesz = get_backing_src_pagesz(src_type);
+	size_t mem_size = npages * vm->page_size;
 	size_t alignment;
 
 	TEST_ASSERT(vm_adjust_num_guest_pages(vm->mode, npages) == npages,
@@ -952,7 +933,7 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	/* Allocate and initialize new mem region structure. */
 	region = calloc(1, sizeof(*region));
 	TEST_ASSERT(region != NULL, "Insufficient Memory");
-	region->mmap_size = npages * vm->page_size;
+	region->mmap_size = mem_size;
 
 #ifdef __s390x__
 	/* On s390x, the host address must be aligned to 1M (due to PGSTEs) */
@@ -999,14 +980,47 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	/* As needed perform madvise */
 	if ((src_type == VM_MEM_SRC_ANONYMOUS ||
 	     src_type == VM_MEM_SRC_ANONYMOUS_THP) && thp_configured()) {
-		ret = madvise(region->host_mem, npages * vm->page_size,
+		ret = madvise(region->host_mem, mem_size,
 			      src_type == VM_MEM_SRC_ANONYMOUS ? MADV_NOHUGEPAGE : MADV_HUGEPAGE);
 		TEST_ASSERT(ret == 0, "madvise failed, addr: %p length: 0x%lx src_type: %s",
-			    region->host_mem, npages * vm->page_size,
+			    region->host_mem, mem_size,
 			    vm_mem_backing_src_alias(src_type)->name);
 	}
 
 	region->backing_src_type = src_type;
+
+	if (flags & KVM_MEM_PRIVATE) {
+		if (gmem_fd < 0) {
+			uint32_t gmem_flags = 0;
+
+			/*
+			 * Allow hugepages for the guest memfd backing if the
+			 * "normal" backing is allowed/required to be huge.
+			 */
+			if (src_type != VM_MEM_SRC_ANONYMOUS &&
+			    src_type != VM_MEM_SRC_SHMEM)
+				gmem_flags |= KVM_GUEST_MEMFD_ALLOW_HUGEPAGE;
+
+			TEST_ASSERT(!gmem_offset,
+				    "Offset must be zero when creating new guest_memfd");
+			gmem_fd = vm_create_guest_memfd(vm, mem_size, gmem_flags);
+		} else {
+			/*
+			 * Install a unique fd for each memslot so that the fd
+			 * can be closed when the region is deleted without
+			 * needing to track if the fd is owned by the framework
+			 * or by the caller.
+			 */
+			gmem_fd = dup(gmem_fd);
+			TEST_ASSERT(gmem_fd >= 0, __KVM_SYSCALL_ERROR("dup()", gmem_fd));
+		}
+
+		region->region.gmem_fd = gmem_fd;
+		region->region.gmem_offset = gmem_offset;
+	} else {
+		region->region.gmem_fd = -1;
+	}
+
 	region->unused_phy_pages = sparsebit_alloc();
 	sparsebit_set_num(region->unused_phy_pages,
 		guest_paddr >> vm->page_shift, npages);
@@ -1019,9 +1033,10 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	TEST_ASSERT(ret == 0, "KVM_SET_USER_MEMORY_REGION2 IOCTL failed,\n"
 		" rc: %i errno: %i\n"
 		" slot: %u flags: 0x%x\n"
-		" guest_phys_addr: 0x%lx size: 0x%lx",
+		" guest_phys_addr: 0x%lx size: 0x%lx guest_memfd: %d\n",
 		ret, errno, slot, flags,
-		guest_paddr, (uint64_t) region->region.memory_size);
+		guest_paddr, (uint64_t) region->region.memory_size,
+		region->region.gmem_fd);
 
 	/* Add to quick lookup data structures */
 	vm_userspace_mem_region_gpa_insert(&vm->regions.gpa_tree, region);
@@ -1042,6 +1057,14 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm,
 	}
 }
 
+void vm_userspace_mem_region_add(struct kvm_vm *vm,
+				 enum vm_mem_backing_src_type src_type,
+				 uint64_t guest_paddr, uint32_t slot,
+				 uint64_t npages, uint32_t flags)
+{
+	vm_mem_add(vm, src_type, guest_paddr, slot, npages, flags, -1, 0);
+}
+
 /*
  * Memslot to region
  *