From patchwork Tue Feb 11 12:11:19 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13969537
Date: Tue, 11 Feb 2025 12:11:19 +0000
In-Reply-To: <20250211121128.703390-1-tabba@google.com>
References: <20250211121128.703390-1-tabba@google.com>
X-Mailer: git-send-email 2.48.1.502.g6dc24dfdaf-goog
Message-ID: <20250211121128.703390-4-tabba@google.com>
Subject: [PATCH v3 03/11] KVM: guest_memfd: Allow host to map guest_memfd() pages
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
 brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
 xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
 jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
 yu.c.zhang@linux.intel.com, isaku.yamahata@intel.com, mic@digikod.net,
 vbabka@suse.cz, vannapurve@google.com, ackerleytng@google.com,
 mail@maciej.szmigiero.name, david@redhat.com, michael.roth@amd.com,
 wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com,
 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
 steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
 quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com,
 quic_cvanscha@quicinc.com, quic_pderrin@quicinc.com,
 quic_pheragu@quicinc.com, catalin.marinas@arm.com, james.morse@arm.com,
 yuzenghui@huawei.com, oliver.upton@linux.dev, maz@kernel.org,
 will@kernel.org, qperret@google.com, keirf@google.com,
 roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
 rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com,
 hughd@google.com, jthoughton@google.com, tabba@google.com

Add support for mmap() and fault() for guest_memfd backed memory
in the host for VMs that support in-place conversion between
shared and private (shared
memory). To that end, this patch adds the ability to check
whether the VM type has that support, and only allows mapping
its memory if that's the case.

Additionally, this behavior is gated by a new configuration
option, CONFIG_KVM_GMEM_SHARED_MEM.

Signed-off-by: Fuad Tabba
---

This patch series will allow shared memory support for software
VMs on x86. It will also introduce a similar VM type for arm64
and allow shared memory support for that. In the future, pKVM
will also support shared memory.
---
 include/linux/kvm_host.h | 11 +++++
 virt/kvm/Kconfig         |  4 ++
 virt/kvm/guest_memfd.c   | 93 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 108 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 8b5f28f6efff..438aa3df3175 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -728,6 +728,17 @@ static inline bool kvm_arch_has_private_mem(struct kvm *kvm)
 }
 #endif
 
+/*
+ * Arch code must define kvm_arch_gmem_supports_shared_mem if support for
+ * private memory is enabled and it supports in-place shared/private conversion.
+ */
+#if !defined(kvm_arch_gmem_supports_shared_mem) && !IS_ENABLED(CONFIG_KVM_PRIVATE_MEM)
+static inline bool kvm_arch_gmem_supports_shared_mem(struct kvm *kvm)
+{
+	return false;
+}
+#endif
+
 #ifndef kvm_arch_has_readonly_mem
 static inline bool kvm_arch_has_readonly_mem(struct kvm *kvm)
 {
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 54e959e7d68f..4e759e8020c5 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -124,3 +124,7 @@ config HAVE_KVM_ARCH_GMEM_PREPARE
 config HAVE_KVM_ARCH_GMEM_INVALIDATE
 	bool
 	depends on KVM_PRIVATE_MEM
+
+config KVM_GMEM_SHARED_MEM
+	select KVM_PRIVATE_MEM
+	bool
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index c6f6792bec2a..85467a3ef8ea 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -317,9 +317,102 @@ void kvm_gmem_handle_folio_put(struct folio *folio)
 {
 	WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
 }
+
+static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
+{
+	struct kvm_gmem *gmem = file->private_data;
+
+	/* For now, VMs that support shared memory share all their memory. */
+	return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
+}
+
+static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
+{
+	struct inode *inode = file_inode(vmf->vma->vm_file);
+	struct folio *folio;
+	vm_fault_t ret = VM_FAULT_LOCKED;
+
+	filemap_invalidate_lock_shared(inode->i_mapping);
+
+	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
+	if (IS_ERR(folio)) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_filemap;
+	}
+
+	if (folio_test_hwpoison(folio)) {
+		ret = VM_FAULT_HWPOISON;
+		goto out_folio;
+	}
+
+	/* Must be called with folio lock held, i.e., after kvm_gmem_get_folio() */
+	if (!kvm_gmem_offset_is_shared(vmf->vma->vm_file, vmf->pgoff)) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_folio;
+	}
+
+	/*
+	 * Only private folios are marked as "guestmem" so far, and we never
+	 * expect private folios at this point.
+	 */
+	if (WARN_ON_ONCE(folio_test_guestmem(folio))) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_folio;
+	}
+
+	/* No support for huge pages. */
+	if (WARN_ON_ONCE(folio_test_large(folio))) {
+		ret = VM_FAULT_SIGBUS;
+		goto out_folio;
+	}
+
+	if (!folio_test_uptodate(folio)) {
+		clear_highpage(folio_page(folio, 0));
+		kvm_gmem_mark_prepared(folio);
+	}
+
+	vmf->page = folio_file_page(folio, vmf->pgoff);
+
+out_folio:
+	if (ret != VM_FAULT_LOCKED) {
+		folio_unlock(folio);
+		folio_put(folio);
+	}
+
+out_filemap:
+	filemap_invalidate_unlock_shared(inode->i_mapping);
+
+	return ret;
+}
+
+static const struct vm_operations_struct kvm_gmem_vm_ops = {
+	.fault = kvm_gmem_fault,
+};
+
+static int kvm_gmem_mmap(struct file *file, struct vm_area_struct *vma)
+{
+	struct kvm_gmem *gmem = file->private_data;
+
+	if (!kvm_arch_gmem_supports_shared_mem(gmem->kvm))
+		return -ENODEV;
+
+	if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) !=
+	    (VM_SHARED | VM_MAYSHARE)) {
+		return -EINVAL;
+	}
+
+	file_accessed(file);
+	vm_flags_set(vma, VM_DONTDUMP);
+	vma->vm_ops = &kvm_gmem_vm_ops;
+
+	return 0;
+}
+#else
+#define kvm_gmem_mmap NULL
 #endif /* CONFIG_KVM_GMEM_SHARED_MEM */
 
 static struct file_operations kvm_gmem_fops = {
+	.mmap = kvm_gmem_mmap,
 	.open = generic_file_open,
 	.release = kvm_gmem_release,
 	.fallocate = kvm_gmem_fallocate,
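
Note for readers: the two checks at the top of kvm_gmem_mmap() can be
modelled in plain userspace C. The sketch below is only an illustration and
is not part of the patch. model_gmem_mmap_check() and the MODEL_VM_*
constants are hypothetical stand-ins for the kernel's VM_SHARED and
VM_MAYSHARE bits, and the boolean argument stands in for the result of
kvm_arch_gmem_supports_shared_mem().

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel's VM_SHARED / VM_MAYSHARE bits. */
#define MODEL_VM_SHARED   0x08UL
#define MODEL_VM_MAYSHARE 0x80UL

/*
 * Userspace model of the checks in kvm_gmem_mmap():
 *   1. the VM type must support shared guest_memfd memory, else -ENODEV;
 *   2. the mapping must carry both shared bits, else -EINVAL.
 * Returns 0 when the mapping would be allowed.
 */
int model_gmem_mmap_check(bool vm_supports_shared, unsigned long vm_flags)
{
	const unsigned long want = MODEL_VM_SHARED | MODEL_VM_MAYSHARE;

	if (!vm_supports_shared)
		return -ENODEV;

	/* Both bits must be set; having only one of them is not enough. */
	if ((vm_flags & want) != want)
		return -EINVAL;

	return 0;
}
```

Under this model, a call with both bits set on a supporting VM returns 0, a
call with neither bit set (as a private mapping would have) returns -EINVAL,
and any call for a VM type without shared-memory support returns -ENODEV
regardless of the flags.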