From patchwork Fri Mar 28 15:31:27 2025
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 14032172
From: Fuad Tabba
Date: Fri, 28 Mar 2025 15:31:27 +0000
Subject: [PATCH v7 1/7] KVM: guest_memfd: Make guest mem use guest mem inodes
 instead of anonymous inodes
Message-ID: <20250328153133.3504118-2-tabba@google.com>
In-Reply-To: <20250328153133.3504118-1-tabba@google.com>
References: <20250328153133.3504118-1-tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
 brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
 xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
 jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
 isaku.yamahata@intel.com, mic@digikod.net, vbabka@suse.cz,
 vannapurve@google.com, ackerleytng@google.com, mail@maciej.szmigiero.name,
 david@redhat.com, michael.roth@amd.com, wei.w.wang@intel.com,
 liam.merwick@oracle.com, isaku.yamahata@gmail.com,
 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
 steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
 quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
 quic_pderrin@quicinc.com, quic_pheragu@quicinc.com, catalin.marinas@arm.com,
 james.morse@arm.com, yuzenghui@huawei.com, oliver.upton@linux.dev,
 maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com,
 roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
 rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
 jthoughton@google.com, peterx@redhat.com, pankaj.gupta@amd.com,
 tabba@google.com
From: Ackerley Tng

Using guest mem inodes allows us to store metadata for the backing memory on
the inode. Metadata will be added in a later patch to support HugeTLB pages.

Metadata about backing memory should not be stored on the file, since the file
represents a guest_memfd's binding with a struct kvm, and metadata about
backing memory is not unique to a specific binding and struct kvm.
Signed-off-by: Fuad Tabba
Signed-off-by: Ackerley Tng
---
 include/uapi/linux/magic.h |   1 +
 virt/kvm/guest_memfd.c     | 130 +++++++++++++++++++++++++++++++------
 2 files changed, 111 insertions(+), 20 deletions(-)

diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index bb575f3ab45e..169dba2a6920 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -103,5 +103,6 @@
 #define DEVMEM_MAGIC		0x454d444d	/* "DMEM" */
 #define SECRETMEM_MAGIC		0x5345434d	/* "SECM" */
 #define PID_FS_MAGIC		0x50494446	/* "PIDF" */
+#define GUEST_MEMORY_MAGIC	0x474d454d	/* "GMEM" */

 #endif /* __LINUX_MAGIC_H__ */
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index fbf89e643add..844e70c82558 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -1,12 +1,16 @@
 // SPDX-License-Identifier: GPL-2.0
+#include
 #include
 #include
 #include
+#include
 #include
 #include

 #include "kvm_mm.h"

+static struct vfsmount *kvm_gmem_mnt;
+
 struct kvm_gmem {
 	struct kvm *kvm;
 	struct xarray bindings;
@@ -320,6 +324,38 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
 	return gfn - slot->base_gfn + slot->gmem.pgoff;
 }

+static const struct super_operations kvm_gmem_super_operations = {
+	.statfs		= simple_statfs,
+};
+
+static int kvm_gmem_init_fs_context(struct fs_context *fc)
+{
+	struct pseudo_fs_context *ctx;
+
+	if (!init_pseudo(fc, GUEST_MEMORY_MAGIC))
+		return -ENOMEM;
+
+	ctx = fc->fs_private;
+	ctx->ops = &kvm_gmem_super_operations;
+
+	return 0;
+}
+
+static struct file_system_type kvm_gmem_fs = {
+	.name		= "kvm_guest_memory",
+	.init_fs_context = kvm_gmem_init_fs_context,
+	.kill_sb	= kill_anon_super,
+};
+
+static void kvm_gmem_init_mount(void)
+{
+	kvm_gmem_mnt = kern_mount(&kvm_gmem_fs);
+	BUG_ON(IS_ERR(kvm_gmem_mnt));
+
+	/* For giggles. Userspace can never map this anyways. */
+	kvm_gmem_mnt->mnt_flags |= MNT_NOEXEC;
+}
+
 #ifdef CONFIG_KVM_GMEM_SHARED_MEM
 static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
 {
@@ -430,6 +466,8 @@ static struct file_operations kvm_gmem_fops = {
 void kvm_gmem_init(struct module *module)
 {
 	kvm_gmem_fops.owner = module;
+
+	kvm_gmem_init_mount();
 }

 static int kvm_gmem_migrate_folio(struct address_space *mapping,
@@ -511,11 +549,79 @@ static const struct inode_operations kvm_gmem_iops = {
 	.setattr	= kvm_gmem_setattr,
 };

+static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
+						      loff_t size, u64 flags)
+{
+	const struct qstr qname = QSTR_INIT(name, strlen(name));
+	struct inode *inode;
+	int err;
+
+	inode = alloc_anon_inode(kvm_gmem_mnt->mnt_sb);
+	if (IS_ERR(inode))
+		return inode;
+
+	err = security_inode_init_security_anon(inode, &qname, NULL);
+	if (err) {
+		iput(inode);
+		return ERR_PTR(err);
+	}
+
+	inode->i_private = (void *)(unsigned long)flags;
+	inode->i_op = &kvm_gmem_iops;
+	inode->i_mapping->a_ops = &kvm_gmem_aops;
+	inode->i_mode |= S_IFREG;
+	inode->i_size = size;
+	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
+	mapping_set_inaccessible(inode->i_mapping);
+	/* Unmovable mappings are supposed to be marked unevictable as well. */
+	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
+
+	return inode;
+}
+
+static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,
+						  u64 flags)
+{
+	static const char *name = "[kvm-gmem]";
+	struct inode *inode;
+	struct file *file;
+	int err;
+
+	err = -ENOENT;
+	if (!try_module_get(kvm_gmem_fops.owner))
+		goto err;
+
+	inode = kvm_gmem_inode_make_secure_inode(name, size, flags);
+	if (IS_ERR(inode)) {
+		err = PTR_ERR(inode);
+		goto err_put_module;
+	}
+
+	file = alloc_file_pseudo(inode, kvm_gmem_mnt, name, O_RDWR,
+				 &kvm_gmem_fops);
+	if (IS_ERR(file)) {
+		err = PTR_ERR(file);
+		goto err_put_inode;
+	}
+
+	file->f_flags |= O_LARGEFILE;
+	file->private_data = priv;
+
+out:
+	return file;
+
+err_put_inode:
+	iput(inode);
+err_put_module:
+	module_put(kvm_gmem_fops.owner);
+err:
+	file = ERR_PTR(err);
+	goto out;
+}
+
 static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 {
-	const char *anon_name = "[kvm-gmem]";
 	struct kvm_gmem *gmem;
-	struct inode *inode;
 	struct file *file;
 	int fd, err;

@@ -529,32 +635,16 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 		goto err_fd;
 	}

-	file = anon_inode_create_getfile(anon_name, &kvm_gmem_fops, gmem,
-					 O_RDWR, NULL);
+	file = kvm_gmem_inode_create_getfile(gmem, size, flags);
 	if (IS_ERR(file)) {
 		err = PTR_ERR(file);
 		goto err_gmem;
 	}

-	file->f_flags |= O_LARGEFILE;
-
-	inode = file->f_inode;
-	WARN_ON(file->f_mapping != inode->i_mapping);
-
-	inode->i_private = (void *)(unsigned long)flags;
-	inode->i_op = &kvm_gmem_iops;
-	inode->i_mapping->a_ops = &kvm_gmem_aops;
-	inode->i_mode |= S_IFREG;
-	inode->i_size = size;
-	mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
-	mapping_set_inaccessible(inode->i_mapping);
-	/* Unmovable mappings are supposed to be marked unevictable as well. */
-	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
-
 	kvm_get_kvm(kvm);
 	gmem->kvm = kvm;
 	xa_init(&gmem->bindings);
-	list_add(&gmem->entry, &inode->i_mapping->i_private_list);
+	list_add(&gmem->entry, &file_inode(file)->i_mapping->i_private_list);

 	fd_install(fd, file);
 	return fd;

From patchwork Fri Mar 28 15:31:28 2025
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 14032174
From: Fuad Tabba
Date: Fri, 28 Mar 2025 15:31:28 +0000
Subject: [PATCH v7 2/7] KVM: guest_memfd: Introduce kvm_gmem_get_pfn_locked(),
 which retains the folio lock
Message-ID: <20250328153133.3504118-3-tabba@google.com>
In-Reply-To: <20250328153133.3504118-1-tabba@google.com>
References: <20250328153133.3504118-1-tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Create a new variant of kvm_gmem_get_pfn(), which retains the folio lock if it
returns successfully.
This is needed in subsequent patches to protect against races when checking
whether a folio can be shared with the host.

Signed-off-by: Fuad Tabba
---
 include/linux/kvm_host.h | 11 +++++++++++
 virt/kvm/guest_memfd.c   | 27 ++++++++++++++++++++-------
 2 files changed, 31 insertions(+), 7 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index ec3bedc18eab..bc73d7426363 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2535,6 +2535,9 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
 int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
 		     gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
 		     int *max_order);
+int kvm_gmem_get_pfn_locked(struct kvm *kvm, struct kvm_memory_slot *slot,
+			    gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
+			    int *max_order);
 #else
 static inline int kvm_gmem_get_pfn(struct kvm *kvm,
 				   struct kvm_memory_slot *slot, gfn_t gfn,
@@ -2544,6 +2547,14 @@ static inline int kvm_gmem_get_pfn(struct kvm *kvm,
 	KVM_BUG_ON(1, kvm);
 	return -EIO;
 }
+static inline int kvm_gmem_get_pfn_locked(struct kvm *kvm,
+					  struct kvm_memory_slot *slot,
+					  gfn_t gfn, kvm_pfn_t *pfn,
+					  struct page **page, int *max_order)
+{
+	KVM_BUG_ON(1, kvm);
+	return -EIO;
+}
 #endif /* CONFIG_KVM_PRIVATE_MEM */

 #ifdef CONFIG_HAVE_KVM_ARCH_GMEM_PREPARE
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 844e70c82558..ac6b8853699d 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -802,9 +802,9 @@ static struct folio *__kvm_gmem_get_pfn(struct file *file,
 	return folio;
 }

-int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
-		     gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
-		     int *max_order)
+int kvm_gmem_get_pfn_locked(struct kvm *kvm, struct kvm_memory_slot *slot,
+			    gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
+			    int *max_order)
 {
 	pgoff_t index = kvm_gmem_get_index(slot, gfn);
 	struct file *file = kvm_gmem_get_file(slot);
@@ -824,17 +824,30 @@ int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
 	if (!is_prepared)
 		r = kvm_gmem_prepare_folio(kvm, slot, gfn, folio);

-	folio_unlock(folio);
-
-	if (!r)
+	if (!r) {
 		*page = folio_file_page(folio, index);
-	else
+	} else {
+		folio_unlock(folio);
 		folio_put(folio);
+	}

 out:
 	fput(file);
 	return r;
 }
+EXPORT_SYMBOL_GPL(kvm_gmem_get_pfn_locked);
+
+int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
+		     gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
+		     int *max_order)
+{
+	int r = kvm_gmem_get_pfn_locked(kvm, slot, gfn, pfn, page, max_order);
+
+	if (!r)
+		unlock_page(*page);
+
+	return r;
+}
 EXPORT_SYMBOL_GPL(kvm_gmem_get_pfn);

 #ifdef CONFIG_KVM_GENERIC_PRIVATE_MEM

From patchwork Fri Mar 28 15:31:29 2025
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 14032173
From: Fuad Tabba
Date: Fri, 28 Mar 2025 15:31:29 +0000
Subject: [PATCH v7 3/7] KVM: guest_memfd: Track folio sharing within a struct
 kvm_gmem_private
Message-ID: <20250328153133.3504118-4-tabba@google.com>
In-Reply-To: <20250328153133.3504118-1-tabba@google.com>
References: <20250328153133.3504118-1-tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
U2FsdGVkX1+ES0lVJXBsH7Doub2vrC7JBRVBMZdEWZZOFz+RrV9imhA4RskiIPVFF3RuzjqpiW9p+ulN+tZi8JggbOyrgKmnPn/xuXdzRxCKgwlte6VA0+uEBjyzYfj4ADM3gSbL0TNsiWfPm+WKtnvOA599QlwiJf2C+3jgWA2lgKLu++Ym+TtmaB2BX2X7oYi5k8+zAmvYFvm9AHHcrvjB59mB1Ucl8rKOsC7MbF9GGJHi//RKCz8J3mB/gtFrQihzdp6v67v1eSggRGDS8+1aE1QzoJMaL4Z4K+KBFuUS25RXtpP0bxjkBcyGIDj7xAH+Y+ymsD+Ew0um6PJMpGpGCtdf4l6O1WKmoH0+x5krc/M/OeOxG70LSb3eHtmioA8oyO8k7D8RE8Y4S0KJXK5fmVi27WPxq4JTGQCJ2cgE6Sv/m5vnPVRSs/e8o4ds6hiURmeIbUrE3yDRlshbmD7Lkxg8EnsNGqDFtB1FidmIiyOupbPnriHY9fHnEHEh7wQ8OrO+dUBxQUYvAI4f9Qs5hf6N9yP39DlcpNc5hR+jljbwuhEMoFCraY3+EBJeEW/QpsSnKZmgQPc7KqEdC9MJ77lQBTIXuKUD0Gr00HCtjC8Y9IcCyj5JFPlJGZhzvRwBODv5yUnnFFYroqapD93WjxQGXkI0fBiqN44H6yZ1sj+nkN92QHPwxL47gFceuATMIwamW/+KV7ama+bTTMleo6HuZZPelqbhBxxxuEuYLjGhJHN3mNgoOgz0k218IREkwCAXwq1I1jM89E71GpAkzXMpArMIjX0MQZEotawQkx32W+iP1GFWqkXsG78UseCbr735xpl65L5qn/xvSgO+MV14IS4/LfIPozXo6XEiz3IwbparqwoO90TXX1LNO8y5gokIjT9s15R9GaFDD2fi7qd6tNsKrM7GEX9A6FL4ihttbVXu3ZjiQMl0keGHH/6E3u1MZTJdnH56Q5U HDCUwJxu 9O4i7Wt2RXHGErq1DAH9qYOvbAz+5qMsMtJz2w8cyugaSTxvVZdglKzpZd1ZwboUphb7Kpr8eMflgZ59H63Ka4utXwit9Gs43sd0HVt0Dp7mCVr4/7R0TsRN6+MUnR+X+D0Eruyx06nLzqqdVUTV9Ir/kFzACGHuMjAFn3ZI7bTOLSNyx/QFkjj9fAWvjUMODL74lT4nk4NzdrnP41V3kxWub9kct/R2IstRqzpa97tBTSQl9nqkKSpXsEMrJqgn1EMuth6W+UfIqLrwGzjVsLr4efcckMhyfR7WUFggE1F5Rs6IHnpkXWi592F3KJ45/tkooYyIH9CgkHk+2nD1gzJs6GzxZo5mwPBQKYntnXGcXkLduvZZub603VwahPEaJXp4RI97eCqnYzf8OLLH02ZHq+3Q46LxGqlE/oj7B6aQufkLbbl62AImxmwV20u0BSlSqZwwbZEHDKMwvs2BRtpen8MlzgqT29tGt3TAk4NMCQOyz85keEmFo5FbEbInF50WU2pNUR9bGWEeRcfU7bbue6bobqGrTjE3Exi/PTmYLW7z0ifnr31R7n4NBKI2ohM4iwywouTdIobLzFjG4HDE96J7W6xiY82ZfjXbUzFRxrpyZPpuVnQ6wXK2F2oMhBsiAk8cRdKEj8zWaVAfAAkDOXTccbC3qB3D3Tx5hu1bGP2H5u/yHIPbHug== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: From: Ackerley Tng Track guest_memfd folio sharing state within the inode, since it is a property of the 
guest_memfd's memory contents.

The guest_memfd PRIVATE memory attribute is not used for two reasons. First, it reflects the userspace expectation for the memory state, and can therefore be toggled by userspace. Second, although each guest_memfd file has a 1:1 binding with a KVM instance, the plan is to allow multiple files per inode, e.g. to allow intra-host migration to a new KVM instance without destroying the guest_memfd.

Signed-off-by: Ackerley Tng
Co-developed-by: Vishal Annapurve
Signed-off-by: Vishal Annapurve
Co-developed-by: Fuad Tabba
Signed-off-by: Fuad Tabba
---
 virt/kvm/guest_memfd.c | 58 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 53 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index ac6b8853699d..cde16ed3b230 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -17,6 +17,18 @@ struct kvm_gmem {
 	struct list_head entry;
 };
 
+struct kvm_gmem_inode_private {
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+	struct xarray shared_offsets;
+	rwlock_t offsets_lock;
+#endif
+};
+
+static struct kvm_gmem_inode_private *kvm_gmem_private(struct inode *inode)
+{
+	return inode->i_mapping->i_private_data;
+}
+
 #ifdef CONFIG_KVM_GMEM_SHARED_MEM
 void kvm_gmem_handle_folio_put(struct folio *folio)
 {
@@ -324,8 +336,28 @@ static pgoff_t kvm_gmem_get_index(struct kvm_memory_slot *slot, gfn_t gfn)
 	return gfn - slot->base_gfn + slot->gmem.pgoff;
 }
 
+static void kvm_gmem_evict_inode(struct inode *inode)
+{
+	struct kvm_gmem_inode_private *private = kvm_gmem_private(inode);
+
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+	/*
+	 * .evict_inode can be called before private data is set up if there are
+	 * issues during inode creation.
+	 */
+	if (private)
+		xa_destroy(&private->shared_offsets);
+#endif
+
+	truncate_inode_pages_final(inode->i_mapping);
+
+	kfree(private);
+	clear_inode(inode);
+}
+
 static const struct super_operations kvm_gmem_super_operations = {
-	.statfs		= simple_statfs,
+	.statfs		= simple_statfs,
+	.evict_inode	= kvm_gmem_evict_inode,
 };
 
 static int kvm_gmem_init_fs_context(struct fs_context *fc)
@@ -553,6 +585,7 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
 						      loff_t size, u64 flags)
 {
 	const struct qstr qname = QSTR_INIT(name, strlen(name));
+	struct kvm_gmem_inode_private *private;
 	struct inode *inode;
 	int err;
 
@@ -561,10 +594,20 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
 		return inode;
 
 	err = security_inode_init_security_anon(inode, &qname, NULL);
-	if (err) {
-		iput(inode);
-		return ERR_PTR(err);
-	}
+	if (err)
+		goto out;
+
+	err = -ENOMEM;
+	private = kzalloc(sizeof(*private), GFP_KERNEL);
+	if (!private)
+		goto out;
+
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+	xa_init(&private->shared_offsets);
+	rwlock_init(&private->offsets_lock);
+#endif
+
+	inode->i_mapping->i_private_data = private;
 
 	inode->i_private = (void *)(unsigned long)flags;
 	inode->i_op = &kvm_gmem_iops;
@@ -577,6 +620,11 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
 	WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
 
 	return inode;
+
+out:
+	iput(inode);
+
+	return ERR_PTR(err);
 }
 
 static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,

From patchwork Fri Mar 28 15:31:30 2025
Date: Fri, 28 Mar 2025 15:31:30 +0000
Message-ID: <20250328153133.3504118-5-tabba@google.com>
In-Reply-To: <20250328153133.3504118-1-tabba@google.com>
Subject: [PATCH v7 4/7] KVM: guest_memfd: Folio sharing states and functions that manage their transition
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

To allow in-place sharing of guest_memfd folios with the host, guest_memfd needs to track their sharing state, because mapping shared folios is only allowed where it is safe to access them. It is safe to map and access these folios when they are explicitly shared with the host, and potentially when they have not yet been exposed to the guest (e.g., at initialization).

This patch introduces the sharing states for guest_memfd folios, as well as the functions that manage the transitions between those states.
Signed-off-by: Fuad Tabba
---
 include/linux/kvm_host.h |  39 +++++++-
 virt/kvm/guest_memfd.c   | 208 ++++++++++++++++++++++++++++++++++++---
 virt/kvm/kvm_main.c      |  62 ++++++++++++
 3 files changed, 295 insertions(+), 14 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index bc73d7426363..bf82faf16c53 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2600,7 +2600,44 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
 #endif
 
 #ifdef CONFIG_KVM_GMEM_SHARED_MEM
+int kvm_gmem_set_shared(struct kvm *kvm, gfn_t start, gfn_t end);
+int kvm_gmem_clear_shared(struct kvm *kvm, gfn_t start, gfn_t end);
+int kvm_gmem_slot_set_shared(struct kvm_memory_slot *slot, gfn_t start,
+			     gfn_t end);
+int kvm_gmem_slot_clear_shared(struct kvm_memory_slot *slot, gfn_t start,
+			       gfn_t end);
+bool kvm_gmem_slot_is_guest_shared(struct kvm_memory_slot *slot, gfn_t gfn);
 void kvm_gmem_handle_folio_put(struct folio *folio);
-#endif
+#else
+static inline int kvm_gmem_set_shared(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
+static inline int kvm_gmem_clear_shared(struct kvm *kvm, gfn_t start,
+					gfn_t end)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
+static inline int kvm_gmem_slot_set_shared(struct kvm_memory_slot *slot,
+					   gfn_t start, gfn_t end)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
+static inline int kvm_gmem_slot_clear_shared(struct kvm_memory_slot *slot,
+					     gfn_t start, gfn_t end)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
+static inline bool kvm_gmem_slot_is_guest_shared(struct kvm_memory_slot *slot,
+						 gfn_t gfn)
+{
+	WARN_ON_ONCE(1);
+	return false;
+}
+#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
 
 #endif
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index cde16ed3b230..3b4d724084a8 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -29,14 +29,6 @@ static struct kvm_gmem_inode_private *kvm_gmem_private(struct inode *inode)
 	return inode->i_mapping->i_private_data;
 }
 
-#ifdef CONFIG_KVM_GMEM_SHARED_MEM
-void kvm_gmem_handle_folio_put(struct folio *folio)
-{
-	WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
-}
-EXPORT_SYMBOL_GPL(kvm_gmem_handle_folio_put);
-#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
-
 /**
  * folio_file_pfn - like folio_file_page, but return a pfn.
  * @folio: The folio which contains this index.
@@ -389,22 +381,211 @@ static void kvm_gmem_init_mount(void)
 }
 
 #ifdef CONFIG_KVM_GMEM_SHARED_MEM
-static bool kvm_gmem_offset_is_shared(struct file *file, pgoff_t index)
+/*
+ * An enum of the valid folio sharing states:
+ * Bit 0: set if not shared with the guest (guest cannot fault it in)
+ * Bit 1: set if not shared with the host (host cannot fault it in)
+ */
+enum folio_shareability {
+	KVM_GMEM_ALL_SHARED	= 0b00,	/* Shared with the host and the guest. */
+	KVM_GMEM_GUEST_SHARED	= 0b10,	/* Shared only with the guest. */
+	KVM_GMEM_NONE_SHARED	= 0b11,	/* Not shared, transient state. */
+};
+
+static int kvm_gmem_offset_set_shared(struct inode *inode, pgoff_t index)
 {
-	struct kvm_gmem *gmem = file->private_data;
+	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
+	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
+	void *xval = xa_mk_value(KVM_GMEM_ALL_SHARED);
+
+	lockdep_assert_held_write(offsets_lock);
+
+	return xa_err(xa_store(shared_offsets, index, xval, GFP_KERNEL));
+}
+
+/*
+ * Marks the range [start, end) as shared with both the host and the guest.
+ * Called when guest shares memory with the host.
+ */
+static int kvm_gmem_offset_range_set_shared(struct inode *inode,
+					    pgoff_t start, pgoff_t end)
+{
+	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
+	pgoff_t i;
+	int r = 0;
+
+	write_lock(offsets_lock);
+	for (i = start; i < end; i++) {
+		r = kvm_gmem_offset_set_shared(inode, i);
+		if (WARN_ON_ONCE(r))
+			break;
+	}
+	write_unlock(offsets_lock);
+
+	return r;
+}
+
+static int kvm_gmem_offset_clear_shared(struct inode *inode, pgoff_t index)
+{
+	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
+	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
+	void *xval_guest = xa_mk_value(KVM_GMEM_GUEST_SHARED);
+	void *xval_none = xa_mk_value(KVM_GMEM_NONE_SHARED);
+	struct folio *folio;
+	int refcount;
+	int r;
+
+	lockdep_assert_held_write(offsets_lock);
+
+	folio = filemap_lock_folio(inode->i_mapping, index);
+	if (!IS_ERR(folio)) {
+		/* +1 references are expected because of filemap_lock_folio(). */
+		refcount = folio_nr_pages(folio) + 1;
+	} else {
+		r = PTR_ERR(folio);
+		if (WARN_ON_ONCE(r != -ENOENT))
+			return r;
+
+		folio = NULL;
+	}
+
+	if (!folio || folio_ref_freeze(folio, refcount)) {
+		/*
+		 * No outstanding references: transition to guest shared.
+		 */
+		r = xa_err(xa_store(shared_offsets, index, xval_guest, GFP_KERNEL));
+
+		if (folio)
+			folio_ref_unfreeze(folio, refcount);
+	} else {
+		/*
+		 * Outstanding references: the folio cannot be faulted in by
+		 * anyone until they're dropped.
+		 */
+		r = xa_err(xa_store(shared_offsets, index, xval_none, GFP_KERNEL));
+	}
+
+	if (folio) {
+		folio_unlock(folio);
+		folio_put(folio);
+	}
+
+	return r;
+}
+
+/*
+ * Marks the range [start, end) as not shared with the host. If the host doesn't
+ * have any references to a particular folio, then that folio is marked as
+ * shared with the guest.
+ *
+ * However, if the host still has references to the folio, then the folio is
+ * marked as not shared with anyone. Marking it as not shared allows draining
+ * all references from the host, and ensures that the hypervisor does not
+ * transition the folio to private, since the host still might access it.
+ *
+ * Called when guest unshares memory with the host.
+ */
+static int kvm_gmem_offset_range_clear_shared(struct inode *inode,
+					      pgoff_t start, pgoff_t end)
+{
+	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
+	pgoff_t i;
+	int r = 0;
+
+	write_lock(offsets_lock);
+	for (i = start; i < end; i++) {
+		r = kvm_gmem_offset_clear_shared(inode, i);
+		if (WARN_ON_ONCE(r))
+			break;
+	}
+	write_unlock(offsets_lock);
+
+	return r;
+}
+
+void kvm_gmem_handle_folio_put(struct folio *folio)
+{
+	WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
+}
+EXPORT_SYMBOL_GPL(kvm_gmem_handle_folio_put);
+
+/*
+ * Returns true if the folio is shared with the host and the guest.
+ *
+ * Must be called with the offsets_lock lock held.
+ */
+static bool kvm_gmem_offset_is_shared(struct inode *inode, pgoff_t index)
+{
+	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
+	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
+	unsigned long r;
+
+	lockdep_assert_held(offsets_lock);
 
-	/* For now, VMs that support shared memory share all their memory. */
-	return kvm_arch_gmem_supports_shared_mem(gmem->kvm);
+	r = xa_to_value(xa_load(shared_offsets, index));
+
+	return r == KVM_GMEM_ALL_SHARED;
+}
+
+/*
+ * Returns true if the folio is shared with the guest (not transitioning).
+ *
+ * Must be called with the offsets_lock lock held.
+ */
+static bool kvm_gmem_offset_is_guest_shared(struct inode *inode, pgoff_t index)
+{
+	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
+	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
+	unsigned long r;
+
+	lockdep_assert_held(offsets_lock);
+
+	r = xa_to_value(xa_load(shared_offsets, index));
+
+	return (r == KVM_GMEM_ALL_SHARED || r == KVM_GMEM_GUEST_SHARED);
+}
+
+int kvm_gmem_slot_set_shared(struct kvm_memory_slot *slot, gfn_t start, gfn_t end)
+{
+	struct inode *inode = file_inode(READ_ONCE(slot->gmem.file));
+	pgoff_t start_off = slot->gmem.pgoff + start - slot->base_gfn;
+	pgoff_t end_off = start_off + end - start;
+
+	return kvm_gmem_offset_range_set_shared(inode, start_off, end_off);
+}
+
+int kvm_gmem_slot_clear_shared(struct kvm_memory_slot *slot, gfn_t start, gfn_t end)
+{
+	struct inode *inode = file_inode(READ_ONCE(slot->gmem.file));
+	pgoff_t start_off = slot->gmem.pgoff + start - slot->base_gfn;
+	pgoff_t end_off = start_off + end - start;
+
+	return kvm_gmem_offset_range_clear_shared(inode, start_off, end_off);
+}
+
+bool kvm_gmem_slot_is_guest_shared(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+	struct inode *inode = file_inode(READ_ONCE(slot->gmem.file));
+	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
+	unsigned long pgoff = slot->gmem.pgoff + gfn - slot->base_gfn;
+	bool r;
+
+	read_lock(offsets_lock);
+	r = kvm_gmem_offset_is_guest_shared(inode, pgoff);
+	read_unlock(offsets_lock);
+
+	return r;
 }
 
 static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
 {
 	struct inode *inode = file_inode(vmf->vma->vm_file);
+	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
 	struct folio *folio;
 	vm_fault_t ret = VM_FAULT_LOCKED;
 
 	filemap_invalidate_lock_shared(inode->i_mapping);
+	read_lock(offsets_lock);
 
 	folio = kvm_gmem_get_folio(inode, vmf->pgoff);
 	if (IS_ERR(folio)) {
@@ -423,7 +604,7 @@ static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
 		goto out_folio;
 	}
 
-	if (!kvm_gmem_offset_is_shared(vmf->vma->vm_file, vmf->pgoff)) {
+	if (!kvm_gmem_offset_is_shared(inode, vmf->pgoff)) {
 		ret = VM_FAULT_SIGBUS;
 		goto out_folio;
 	}
@@ -457,6 +638,7 @@ static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
 	}
 
 out_filemap:
+	read_unlock(offsets_lock);
 	filemap_invalidate_unlock_shared(inode->i_mapping);
 
 	return ret;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 3e40acb9f5c0..90762252381c 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3091,6 +3091,68 @@ static int next_segment(unsigned long len, int offset)
 	return len;
 }
 
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+int kvm_gmem_set_shared(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	struct kvm_memslot_iter iter;
+	int r = 0;
+
+	mutex_lock(&kvm->slots_lock);
+
+	kvm_for_each_memslot_in_gfn_range(&iter, kvm_memslots(kvm), start, end) {
+		struct kvm_memory_slot *memslot = iter.slot;
+		gfn_t gfn_start, gfn_end;
+
+		if (!kvm_slot_can_be_private(memslot))
+			continue;
+
+		gfn_start = max(start, memslot->base_gfn);
+		gfn_end = min(end, memslot->base_gfn + memslot->npages);
+		if (WARN_ON_ONCE(start >= end))
+			continue;
+
+		r = kvm_gmem_slot_set_shared(memslot, gfn_start, gfn_end);
+		if (WARN_ON_ONCE(r))
+			break;
+	}
+
+	mutex_unlock(&kvm->slots_lock);
+
+	return r;
+}
+EXPORT_SYMBOL_GPL(kvm_gmem_set_shared);
+
+int kvm_gmem_clear_shared(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+	struct kvm_memslot_iter iter;
+	int r = 0;
+
+	mutex_lock(&kvm->slots_lock);
+
+	kvm_for_each_memslot_in_gfn_range(&iter, kvm_memslots(kvm), start, end) {
+		struct kvm_memory_slot *memslot = iter.slot;
+		gfn_t gfn_start, gfn_end;
+
+		if (!kvm_slot_can_be_private(memslot))
+			continue;
+
+		gfn_start = max(start, memslot->base_gfn);
+		gfn_end = min(end, memslot->base_gfn + memslot->npages);
+		if (WARN_ON_ONCE(start >= end))
+			continue;
+
+		r = kvm_gmem_slot_clear_shared(memslot, gfn_start, gfn_end);
+		if (WARN_ON_ONCE(r))
+			break;
+	}
+
+	mutex_unlock(&kvm->slots_lock);
+
+	return r;
+}
+EXPORT_SYMBOL_GPL(kvm_gmem_clear_shared);
+#endif /* CONFIG_KVM_GMEM_SHARED_MEM */
+
 /* Copy @len bytes from guest memory at '(@gfn * PAGE_SIZE) + @offset' to @data */
 static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
 				 void *data, int offset, int len)

From patchwork Fri Mar 28 15:31:31 2025
Date: Fri, 28 Mar 2025 15:31:31 +0000
Message-ID: <20250328153133.3504118-6-tabba@google.com>
In-Reply-To: <20250328153133.3504118-1-tabba@google.com>
Subject: [PATCH v7 5/7] KVM: guest_memfd: Restore folio state after final folio_put()
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Before transitioning a guest_memfd folio to unshared, thereby disallowing access by the host
and allowing the hypervisor to transition its view of the guest page to
private, we need to ensure that the host doesn't have any references to
the folio.

This patch uses the guest_memfd folio type to register a callback that
informs the guest_memfd subsystem when the last reference is dropped,
therefore knowing that the host doesn't have any remaining references.

Signed-off-by: Fuad Tabba
---
The function kvm_gmem_slot_register_callback() isn't used in this
series. It will be used later in code that performs unsharing of memory.
I have tested it with pKVM, based on downstream code [*]. It's included
here since it demonstrates the plan for handling unsharing of private
folios.

[*] https://android-kvm.googlesource.com/linux/+/refs/heads/tabba/guestmem-6.13-v7-pkvm
---
 include/linux/kvm_host.h |   6 ++
 virt/kvm/guest_memfd.c   | 143 ++++++++++++++++++++++++++++++++++++++-
 2 files changed, 148 insertions(+), 1 deletion(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index bf82faf16c53..d9d9d72d8beb 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2607,6 +2607,7 @@ int kvm_gmem_slot_set_shared(struct kvm_memory_slot *slot, gfn_t start,
 int kvm_gmem_slot_clear_shared(struct kvm_memory_slot *slot, gfn_t start,
 			       gfn_t end);
 bool kvm_gmem_slot_is_guest_shared(struct kvm_memory_slot *slot, gfn_t gfn);
+int kvm_gmem_slot_register_callback(struct kvm_memory_slot *slot, gfn_t gfn);
 void kvm_gmem_handle_folio_put(struct folio *folio);
 #else
 static inline int kvm_gmem_set_shared(struct kvm *kvm, gfn_t start, gfn_t end)
@@ -2638,6 +2639,11 @@ static inline bool kvm_gmem_slot_is_guest_shared(struct kvm_memory_slot *slot,
 	WARN_ON_ONCE(1);
 	return false;
 }
+static inline int kvm_gmem_slot_register_callback(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+	WARN_ON_ONCE(1);
+	return -EINVAL;
+}
 #endif /* CONFIG_KVM_GMEM_SHARED_MEM */
 
 #endif

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 3b4d724084a8..ce19bd6c2031 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -392,6 +392,27 @@ enum folio_shareability {
 	KVM_GMEM_NONE_SHARED = 0b11, /* Not shared, transient state. */
 };
 
+/*
+ * Unregisters the __folio_put() callback from the folio.
+ *
+ * Restores a folio's refcount after all pending references have been released,
+ * and removes the folio type, thereby removing the callback. Now the folio can
+ * be freed normally once all actual references have been dropped.
+ *
+ * Must be called with the folio locked and the offsets_lock write lock held.
+ */
+static void kvm_gmem_restore_pending_folio(struct folio *folio, struct inode *inode)
+{
+	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
+
+	lockdep_assert_held_write(offsets_lock);
+	WARN_ON_ONCE(!folio_test_locked(folio));
+	WARN_ON_ONCE(!folio_test_guestmem(folio));
+
+	__folio_clear_guestmem(folio);
+	folio_ref_add(folio, folio_nr_pages(folio));
+}
+
 static int kvm_gmem_offset_set_shared(struct inode *inode, pgoff_t index)
 {
 	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
@@ -400,6 +421,24 @@ static int kvm_gmem_offset_set_shared(struct inode *inode, pgoff_t index)
 
 	lockdep_assert_held_write(offsets_lock);
 
+	/*
+	 * If the folio is NONE_SHARED, it indicates that it is transitioning to
+	 * private (GUEST_SHARED). Transition it to shared (ALL_SHARED)
+	 * immediately, and remove the callback.
+	 */
+	if (xa_to_value(xa_load(shared_offsets, index)) == KVM_GMEM_NONE_SHARED) {
+		struct folio *folio = filemap_lock_folio(inode->i_mapping, index);
+
+		if (WARN_ON_ONCE(IS_ERR(folio)))
+			return PTR_ERR(folio);
+
+		if (folio_test_guestmem(folio))
+			kvm_gmem_restore_pending_folio(folio, inode);
+
+		folio_unlock(folio);
+		folio_put(folio);
+	}
+
 	return xa_err(xa_store(shared_offsets, index, xval, GFP_KERNEL));
 }
 
@@ -503,9 +542,111 @@ static int kvm_gmem_offset_range_clear_shared(struct inode *inode,
 	return r;
 }
 
+/*
+ * Registers a callback to __folio_put(), so that gmem knows that the host does
+ * not have any references to the folio. The callback itself is registered by
+ * setting the folio type to guestmem.
+ *
+ * Returns 0 if a callback was registered or already has been registered, or
+ * -EAGAIN if the host has references, indicating a callback wasn't registered.
+ *
+ * Must be called with the folio locked and the offsets_lock write lock held.
+ */
+static int kvm_gmem_register_callback(struct folio *folio, struct inode *inode, pgoff_t index)
+{
+	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
+	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
+	void *xval_guest = xa_mk_value(KVM_GMEM_GUEST_SHARED);
+	int refcount;
+	int r = 0;
+
+	lockdep_assert_held_write(offsets_lock);
+	WARN_ON_ONCE(!folio_test_locked(folio));
+
+	if (folio_test_guestmem(folio))
+		return 0;
+
+	if (folio_mapped(folio))
+		return -EAGAIN;
+
+	refcount = folio_ref_count(folio);
+	if (!folio_ref_freeze(folio, refcount))
+		return -EAGAIN;
+
+	/*
+	 * Register callback by setting the folio type and subtracting gmem's
+	 * references for it to trigger once outstanding references are dropped.
+	 */
+	if (refcount > 1) {
+		__folio_set_guestmem(folio);
+		refcount -= folio_nr_pages(folio);
+	} else {
+		/* No outstanding references, transition it to guest shared. */
+		r = WARN_ON_ONCE(xa_err(xa_store(shared_offsets, index, xval_guest, GFP_KERNEL)));
+	}
+
+	folio_ref_unfreeze(folio, refcount);
+	return r;
+}
+
+int kvm_gmem_slot_register_callback(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+	unsigned long pgoff = slot->gmem.pgoff + gfn - slot->base_gfn;
+	struct inode *inode = file_inode(READ_ONCE(slot->gmem.file));
+	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
+	struct folio *folio;
+	int r;
+
+	write_lock(offsets_lock);
+
+	folio = filemap_lock_folio(inode->i_mapping, pgoff);
+	if (WARN_ON_ONCE(IS_ERR(folio))) {
+		write_unlock(offsets_lock);
+		return PTR_ERR(folio);
+	}
+
+	r = kvm_gmem_register_callback(folio, inode, pgoff);
+
+	folio_unlock(folio);
+	folio_put(folio);
+	write_unlock(offsets_lock);
+
+	return r;
+}
+EXPORT_SYMBOL_GPL(kvm_gmem_slot_register_callback);
+
+/*
+ * Callback function for __folio_put(), i.e., called once all references by the
+ * host to the folio have been dropped. This allows gmem to transition the state
+ * of the folio to shared with the guest, and allows the hypervisor to continue
+ * transitioning its state to private, since the host cannot attempt to access
+ * it anymore.
+ */
 void kvm_gmem_handle_folio_put(struct folio *folio)
 {
-	WARN_ONCE(1, "A placeholder that shouldn't trigger. Work in progress.");
+	struct address_space *mapping;
+	struct xarray *shared_offsets;
+	rwlock_t *offsets_lock;
+	struct inode *inode;
+	pgoff_t index;
+	void *xval;
+
+	mapping = folio->mapping;
+	if (WARN_ON_ONCE(!mapping))
+		return;
+
+	inode = mapping->host;
+	index = folio->index;
+	shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
+	offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
+	xval = xa_mk_value(KVM_GMEM_GUEST_SHARED);
+
+	write_lock(offsets_lock);
+	folio_lock(folio);
+	kvm_gmem_restore_pending_folio(folio, inode);
+	folio_unlock(folio);
+	WARN_ON_ONCE(xa_err(xa_store(shared_offsets, index, xval, GFP_KERNEL)));
+	write_unlock(offsets_lock);
 }
 EXPORT_SYMBOL_GPL(kvm_gmem_handle_folio_put);

From patchwork Fri Mar 28 15:31:32 2025
Date: Fri, 28 Mar 2025 15:31:32 +0000
In-Reply-To: <20250328153133.3504118-1-tabba@google.com>
References: <20250328153133.3504118-1-tabba@google.com>
Message-ID: <20250328153133.3504118-7-tabba@google.com>
Subject: [PATCH v7 6/7] KVM: guest_memfd: Handle invalidation of shared memory
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

When guest_memfd backed memory is invalidated, e.g., when punching holes
or releasing the file, ensure that the sharing states are updated and
that any
folios in a transient state are restored to an appropriate state.

Signed-off-by: Fuad Tabba
---
 virt/kvm/guest_memfd.c | 57 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 57 insertions(+)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index ce19bd6c2031..eec9d5e09f09 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -118,6 +118,16 @@ static struct folio *kvm_gmem_get_folio(struct inode *inode, pgoff_t index)
 	return filemap_grab_folio(inode->i_mapping, index);
 }
 
+#ifdef CONFIG_KVM_GMEM_SHARED_MEM
+static void kvm_gmem_offset_range_invalidate_shared(struct inode *inode,
+						    pgoff_t start, pgoff_t end);
+#else
+static inline void kvm_gmem_offset_range_invalidate_shared(struct inode *inode,
+							   pgoff_t start, pgoff_t end)
+{
+}
+#endif
+
 static void kvm_gmem_invalidate_begin(struct kvm_gmem *gmem, pgoff_t start,
 				      pgoff_t end)
 {
@@ -127,6 +137,7 @@ static void kvm_gmem_invalidate_begin(struct kvm_gmem *gmem, pgoff_t start,
 	unsigned long index;
 
 	xa_for_each_range(&gmem->bindings, index, slot, start, end - 1) {
+		struct file *file = READ_ONCE(slot->gmem.file);
 		pgoff_t pgoff = slot->gmem.pgoff;
 
 		struct kvm_gfn_range gfn_range = {
@@ -146,6 +157,16 @@ static void kvm_gmem_invalidate_begin(struct kvm_gmem *gmem, pgoff_t start,
 		}
 
 		flush |= kvm_mmu_unmap_gfn_range(kvm, &gfn_range);
+
+		/*
+		 * If this gets called after kvm_gmem_unbind() it means that all
+		 * in-flight operations are gone, and the file has been closed.
+		 */
+		if (file) {
+			kvm_gmem_offset_range_invalidate_shared(file_inode(file),
+								gfn_range.start,
+								gfn_range.end);
+		}
 	}
 
 	if (flush)
@@ -512,6 +533,42 @@ static int kvm_gmem_offset_clear_shared(struct inode *inode, pgoff_t index)
 	return r;
 }
 
+/*
+ * Callback when invalidating memory that is potentially shared.
+ *
+ * Must be called with the offsets_lock write lock held.
+ */
+static void kvm_gmem_offset_range_invalidate_shared(struct inode *inode,
+						    pgoff_t start, pgoff_t end)
+{
+	struct xarray *shared_offsets = &kvm_gmem_private(inode)->shared_offsets;
+	rwlock_t *offsets_lock = &kvm_gmem_private(inode)->offsets_lock;
+	pgoff_t i;
+
+	lockdep_assert_held_write(offsets_lock);
+
+	for (i = start; i < end; i++) {
+		/*
+		 * If the folio is NONE_SHARED, it indicates that it's
+		 * transitioning to private (GUEST_SHARED). Transition it to
+		 * shared (ALL_SHARED) and remove the callback.
+		 */
+		if (xa_to_value(xa_load(shared_offsets, i)) == KVM_GMEM_NONE_SHARED) {
+			struct folio *folio = filemap_lock_folio(inode->i_mapping, i);
+
+			if (!WARN_ON_ONCE(IS_ERR(folio))) {
+				if (folio_test_guestmem(folio))
+					kvm_gmem_restore_pending_folio(folio, inode);
+
+				folio_unlock(folio);
+				folio_put(folio);
+			}
+		}
+
+		xa_erase(shared_offsets, i);
+	}
+}
+
 /*
  * Marks the range [start, end) as not shared with the host. If the host doesn't
  * have any references to a particular folio, then that folio is marked as

From patchwork Fri Mar 28 15:31:33 2025
Date: Fri, 28 Mar 2025 15:31:33 +0000
In-Reply-To: <20250328153133.3504118-1-tabba@google.com>
References: <20250328153133.3504118-1-tabba@google.com>
Message-ID: <20250328153133.3504118-8-tabba@google.com>
Subject: [PATCH v7 7/7] KVM: guest_memfd: Add a guest_memfd() flag to initialize it as shared
From: Fuad Tabba
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Not all use cases require guest_memfd() to be shared with the host when
first created. Add a new flag, GUEST_MEMFD_FLAG_INIT_SHARED, which when
set on KVM_CREATE_GUEST_MEMFD initializes the memory as shared with the
host, and therefore mappable by it. Otherwise, memory is private until
explicitly shared by the guest with the host.

Signed-off-by: Fuad Tabba
---
 Documentation/virt/kvm/api.rst                 |  4 ++++
 include/uapi/linux/kvm.h                       |  1 +
 tools/testing/selftests/kvm/guest_memfd_test.c |  7 +++++--
 virt/kvm/guest_memfd.c                         | 12 ++++++++++++
 4 files changed, 22 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 2b52eb77e29c..a5496d7d323b 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6386,6 +6386,10 @@ most one mapping per page, i.e. binding multiple memory regions to a single
 guest_memfd range is not allowed (any number of memory regions can be bound to
 a single guest_memfd file, but the bound ranges must not overlap).
 
+If the capability KVM_CAP_GMEM_SHARED_MEM is supported, then the flags field
+supports GUEST_MEMFD_FLAG_INIT_SHARED, which initializes the memory as shared
+with the host, and thereby, mappable by it.
+
 See KVM_SET_USER_MEMORY_REGION2 for additional details.
 4.143 KVM_PRE_FAULT_MEMORY

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 117937a895da..22d7e33bf09c 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1566,6 +1566,7 @@ struct kvm_memory_attributes {
 #define KVM_MEMORY_ATTRIBUTE_PRIVATE (1ULL << 3)
 
 #define KVM_CREATE_GUEST_MEMFD	_IOWR(KVMIO, 0xd4, struct kvm_create_guest_memfd)
+#define GUEST_MEMFD_FLAG_INIT_SHARED	(1UL << 0)
 
 struct kvm_create_guest_memfd {
 	__u64 size;

diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
index 38c501e49e0e..4a7fcd6aa372 100644
--- a/tools/testing/selftests/kvm/guest_memfd_test.c
+++ b/tools/testing/selftests/kvm/guest_memfd_test.c
@@ -159,7 +159,7 @@ static void test_invalid_punch_hole(int fd, size_t page_size, size_t total_size)
 static void test_create_guest_memfd_invalid(struct kvm_vm *vm)
 {
 	size_t page_size = getpagesize();
-	uint64_t flag;
+	uint64_t flag = BIT(0);
 	size_t size;
 	int fd;
 
@@ -170,7 +170,10 @@ static void test_create_guest_memfd_invalid(struct kvm_vm *vm)
 			    size);
 	}
 
-	for (flag = BIT(0); flag; flag <<= 1) {
+	if (kvm_has_cap(KVM_CAP_GMEM_SHARED_MEM))
+		flag = GUEST_MEMFD_FLAG_INIT_SHARED << 1;
+
+	for (; flag; flag <<= 1) {
 		fd = __vm_create_guest_memfd(vm, page_size, flag);
 		TEST_ASSERT(fd == -1 && errno == EINVAL,
 			    "guest_memfd() with flag '0x%lx' should fail with EINVAL",

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index eec9d5e09f09..32e149478b04 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -1069,6 +1069,15 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 		goto err_gmem;
 	}
 
+	if (IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM) &&
+	    (flags & GUEST_MEMFD_FLAG_INIT_SHARED)) {
+		err = kvm_gmem_offset_range_set_shared(file_inode(file), 0, size >> PAGE_SHIFT);
+		if (err) {
+			fput(file);
+			goto err_gmem;
+		}
+	}
+
 	kvm_get_kvm(kvm);
 	gmem->kvm = kvm;
 	xa_init(&gmem->bindings);
@@ -1090,6 +1099,9 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
 	u64 flags = args->flags;
 	u64 valid_flags = 0;
 
+	if (IS_ENABLED(CONFIG_KVM_GMEM_SHARED_MEM))
+		valid_flags |= GUEST_MEMFD_FLAG_INIT_SHARED;
+
 	if (flags & ~valid_flags)
 		return -EINVAL;