From patchwork Thu Oct 10 08:59:20 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13829794
Date: Thu, 10 Oct 2024 09:59:20 +0100
Message-ID: <20241010085930.1546800-2-tabba@google.com>
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Subject: [PATCH v3 01/11] KVM: guest_memfd: Make guest mem use guest mem
 inodes instead of anonymous inodes
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Cc: pbonzini@redhat.com, chenhuacai@kernel.org, mpe@ellerman.id.au,
 anup@brainfault.org, paul.walmsley@sifive.com, palmer@dabbelt.com,
 aou@eecs.berkeley.edu, seanjc@google.com, viro@zeniv.linux.org.uk,
 brauner@kernel.org, willy@infradead.org, akpm@linux-foundation.org,
 xiaoyao.li@intel.com, yilun.xu@intel.com, chao.p.peng@linux.intel.com,
 jarkko@kernel.org, amoorthy@google.com, dmatlack@google.com,
 yu.c.zhang@linux.intel.com, isaku.yamahata@intel.com, mic@digikod.net,
 vbabka@suse.cz, vannapurve@google.com, ackerleytng@google.com,
 mail@maciej.szmigiero.name, david@redhat.com, michael.roth@amd.com,
 wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com,
 kirill.shutemov@linux.intel.com, suzuki.poulose@arm.com,
 steven.price@arm.com, quic_eberman@quicinc.com, quic_mnalajal@quicinc.com,
 quic_tsoni@quicinc.com, quic_svaddagi@quicinc.com, quic_cvanscha@quicinc.com,
 quic_pderrin@quicinc.com, quic_pheragu@quicinc.com, catalin.marinas@arm.com,
 james.morse@arm.com, yuzenghui@huawei.com, oliver.upton@linux.dev,
 maz@kernel.org, will@kernel.org, qperret@google.com, keirf@google.com,
 roypat@amazon.co.uk, shuah@kernel.org, hch@infradead.org, jgg@nvidia.com,
 rientjes@google.com, jhubbard@nvidia.com, fvdl@google.com, hughd@google.com,
 jthoughton@google.com, tabba@google.com

From: Ackerley Tng <ackerleytng@google.com>

Using guest mem inodes allows us to store metadata for the backing
memory on the inode. Metadata will be added in a later patch to support
HugeTLB pages.

Metadata about backing memory should not be stored on the file, since
the file represents a guest_memfd's binding with a struct kvm, and
metadata about backing memory is not unique to a specific binding and
struct kvm.
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/uapi/linux/magic.h |   1 +
 virt/kvm/guest_memfd.c     | 119 ++++++++++++++++++++++++++++++-------
 2 files changed, 100 insertions(+), 20 deletions(-)

diff --git a/include/uapi/linux/magic.h b/include/uapi/linux/magic.h
index bb575f3ab45e..169dba2a6920 100644
--- a/include/uapi/linux/magic.h
+++ b/include/uapi/linux/magic.h
@@ -103,5 +103,6 @@
 #define DEVMEM_MAGIC            0x454d444d      /* "DMEM" */
 #define SECRETMEM_MAGIC         0x5345434d      /* "SECM" */
 #define PID_FS_MAGIC            0x50494446      /* "PIDF" */
+#define GUEST_MEMORY_MAGIC      0x474d454d      /* "GMEM" */
 
 #endif /* __LINUX_MAGIC_H__ */
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 8f079a61a56d..5d7fd1f708a6 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -1,12 +1,17 @@
 // SPDX-License-Identifier: GPL-2.0
+#include <linux/fs.h>
+#include <linux/mount.h>
 #include <linux/backing-dev.h>
 #include <linux/falloc.h>
 #include <linux/kvm_host.h>
+#include <linux/pseudo_fs.h>
 #include <linux/pagemap.h>
 #include <linux/anon_inodes.h>
 
 #include "kvm_mm.h"
 
+static struct vfsmount *kvm_gmem_mnt;
+
 struct kvm_gmem {
         struct kvm *kvm;
         struct xarray bindings;
@@ -302,6 +307,38 @@ static inline struct file *kvm_gmem_get_file(struct kvm_memory_slot *slot)
         return get_file_active(&slot->gmem.file);
 }
 
+static const struct super_operations kvm_gmem_super_operations = {
+        .statfs         = simple_statfs,
+};
+
+static int kvm_gmem_init_fs_context(struct fs_context *fc)
+{
+        struct pseudo_fs_context *ctx;
+
+        if (!init_pseudo(fc, GUEST_MEMORY_MAGIC))
+                return -ENOMEM;
+
+        ctx = fc->fs_private;
+        ctx->ops = &kvm_gmem_super_operations;
+
+        return 0;
+}
+
+static struct file_system_type kvm_gmem_fs = {
+        .name            = "kvm_guest_memory",
+        .init_fs_context = kvm_gmem_init_fs_context,
+        .kill_sb         = kill_anon_super,
+};
+
+static void kvm_gmem_init_mount(void)
+{
+        kvm_gmem_mnt = kern_mount(&kvm_gmem_fs);
+        BUG_ON(IS_ERR(kvm_gmem_mnt));
+
+        /* For giggles. Userspace can never map this anyways. */
+        kvm_gmem_mnt->mnt_flags |= MNT_NOEXEC;
+}
+
 static struct file_operations kvm_gmem_fops = {
         .open           = generic_file_open,
         .release        = kvm_gmem_release,
@@ -311,6 +348,8 @@ static struct file_operations kvm_gmem_fops = {
 void kvm_gmem_init(struct module *module)
 {
         kvm_gmem_fops.owner = module;
+
+        kvm_gmem_init_mount();
 }
 
 static int kvm_gmem_migrate_folio(struct address_space *mapping,
@@ -392,11 +431,67 @@ static const struct inode_operations kvm_gmem_iops = {
         .setattr        = kvm_gmem_setattr,
 };
 
+static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
+                                                      loff_t size, u64 flags)
+{
+        const struct qstr qname = QSTR_INIT(name, strlen(name));
+        struct inode *inode;
+        int err;
+
+        inode = alloc_anon_inode(kvm_gmem_mnt->mnt_sb);
+        if (IS_ERR(inode))
+                return inode;
+
+        err = security_inode_init_security_anon(inode, &qname, NULL);
+        if (err) {
+                iput(inode);
+                return ERR_PTR(err);
+        }
+
+        inode->i_private = (void *)(unsigned long)flags;
+        inode->i_op = &kvm_gmem_iops;
+        inode->i_mapping->a_ops = &kvm_gmem_aops;
+        inode->i_mode |= S_IFREG;
+        inode->i_size = size;
+        mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
+        mapping_set_inaccessible(inode->i_mapping);
+        /* Unmovable mappings are supposed to be marked unevictable as well. */
+        WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
+
+        return inode;
+}
+
+static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,
+                                                  u64 flags)
+{
+        static const char *name = "[kvm-gmem]";
+        struct inode *inode;
+        struct file *file;
+
+        if (kvm_gmem_fops.owner && !try_module_get(kvm_gmem_fops.owner))
+                return ERR_PTR(-ENOENT);
+
+        inode = kvm_gmem_inode_make_secure_inode(name, size, flags);
+        if (IS_ERR(inode))
+                return ERR_CAST(inode);
+
+        file = alloc_file_pseudo(inode, kvm_gmem_mnt, name, O_RDWR,
+                                 &kvm_gmem_fops);
+        if (IS_ERR(file)) {
+                iput(inode);
+                return file;
+        }
+
+        file->f_mapping = inode->i_mapping;
+        file->f_flags |= O_LARGEFILE;
+        file->private_data = priv;
+
+        return file;
+}
+
 static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 {
-        const char *anon_name = "[kvm-gmem]";
         struct kvm_gmem *gmem;
-        struct inode *inode;
         struct file *file;
         int fd, err;
@@ -410,32 +505,16 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
                 goto err_fd;
         }
 
-        file = anon_inode_create_getfile(anon_name, &kvm_gmem_fops, gmem,
-                                         O_RDWR, NULL);
+        file = kvm_gmem_inode_create_getfile(gmem, size, flags);
         if (IS_ERR(file)) {
                 err = PTR_ERR(file);
                 goto err_gmem;
         }
 
-        file->f_flags |= O_LARGEFILE;
-
-        inode = file->f_inode;
-        WARN_ON(file->f_mapping != inode->i_mapping);
-
-        inode->i_private = (void *)(unsigned long)flags;
-        inode->i_op = &kvm_gmem_iops;
-        inode->i_mapping->a_ops = &kvm_gmem_aops;
-        inode->i_mode |= S_IFREG;
-        inode->i_size = size;
-        mapping_set_gfp_mask(inode->i_mapping, GFP_HIGHUSER);
-        mapping_set_inaccessible(inode->i_mapping);
-        /* Unmovable mappings are supposed to be marked unevictable as well. */
-        WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
-
         kvm_get_kvm(kvm);
         gmem->kvm = kvm;
         xa_init(&gmem->bindings);
-        list_add(&gmem->entry, &inode->i_mapping->i_private_list);
+        list_add(&gmem->entry, &file_inode(file)->i_mapping->i_private_list);
 
         fd_install(fd, file);
         return fd;
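For reference, a minimal userspace sketch of how the file created by
__kvm_gmem_create() is obtained, via the existing KVM_CREATE_GUEST_MEMFD
ioctl (error handling elided; this snippet is illustrative and not part of
the patch). With this patch applied, the returned fd is backed by an inode
on the kvm_gmem_fs pseudo-mount instead of the shared anonymous inode:

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/* Sketch: create a guest_memfd on an existing VM fd. */
static int create_gmem(int vm_fd, uint64_t size)
{
        struct kvm_create_guest_memfd gmem = {
                .size  = size,  /* must be page-aligned */
                .flags = 0,
        };

        /* Returns the new guest_memfd file descriptor, or -1 on error. */
        return ioctl(vm_fd, KVM_CREATE_GUEST_MEMFD, &gmem);
}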
From patchwork Thu Oct 10 08:59:21 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13829795
Date: Thu, 10 Oct 2024 09:59:21 +0100
Message-ID: <20241010085930.1546800-3-tabba@google.com>
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Subject: [PATCH v3 02/11] KVM: guest_memfd: Track mappability within a
 struct kvm_gmem_private
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
From: Ackerley Tng <ackerleytng@google.com>

Track whether guest_memfd memory can be mapped within the inode, since
it is a property of the guest_memfd's memory contents.

The guest_memfd PRIVATE memory attribute is not used for two reasons.
First, it reflects the userspace expectation for that memory location,
and can therefore be toggled by userspace. Second, although each
guest_memfd file has a 1:1 binding with a KVM instance, the plan is to
allow multiple files per inode, e.g. to allow intra-host migration to
a new KVM instance without destroying the guest_memfd.

Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Co-developed-by: Vishal Annapurve <vannapurve@google.com>
Signed-off-by: Vishal Annapurve <vannapurve@google.com>
Co-developed-by: Fuad Tabba <tabba@google.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 virt/kvm/guest_memfd.c | 56 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 51 insertions(+), 5 deletions(-)

diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 5d7fd1f708a6..4d3ba346c415 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -18,6 +18,17 @@ struct kvm_gmem {
         struct list_head entry;
 };
 
+struct kvm_gmem_inode_private {
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+        struct xarray mappable_offsets;
+#endif
+};
+
+static struct kvm_gmem_inode_private *kvm_gmem_private(struct inode *inode)
+{
+        return inode->i_mapping->i_private_data;
+}
+
 /**
  * folio_file_pfn - like folio_file_page, but return a pfn.
  * @folio: The folio which contains this index.
@@ -307,8 +318,28 @@ static inline struct file *kvm_gmem_get_file(struct kvm_memory_slot *slot)
         return get_file_active(&slot->gmem.file);
 }
 
+static void kvm_gmem_evict_inode(struct inode *inode)
+{
+        struct kvm_gmem_inode_private *private = kvm_gmem_private(inode);
+
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+        /*
+         * .free_inode can be called before private data is set up if there are
+         * issues during inode creation.
+         */
+        if (private)
+                xa_destroy(&private->mappable_offsets);
+#endif
+
+        truncate_inode_pages_final(inode->i_mapping);
+
+        kfree(private);
+        clear_inode(inode);
+}
+
 static const struct super_operations kvm_gmem_super_operations = {
-        .statfs         = simple_statfs,
+        .statfs         = simple_statfs,
+        .evict_inode    = kvm_gmem_evict_inode,
 };
 
 static int kvm_gmem_init_fs_context(struct fs_context *fc)
@@ -435,6 +466,7 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
                                                       loff_t size, u64 flags)
 {
         const struct qstr qname = QSTR_INIT(name, strlen(name));
+        struct kvm_gmem_inode_private *private;
         struct inode *inode;
         int err;
 
@@ -443,10 +475,19 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
                 return inode;
 
         err = security_inode_init_security_anon(inode, &qname, NULL);
-        if (err) {
-                iput(inode);
-                return ERR_PTR(err);
-        }
+        if (err)
+                goto out;
+
+        err = -ENOMEM;
+        private = kzalloc(sizeof(*private), GFP_KERNEL);
+        if (!private)
+                goto out;
+
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+        xa_init(&private->mappable_offsets);
+#endif
+
+        inode->i_mapping->i_private_data = private;
 
         inode->i_private = (void *)(unsigned long)flags;
         inode->i_op = &kvm_gmem_iops;
@@ -459,6 +500,11 @@ static struct inode *kvm_gmem_inode_make_secure_inode(const char *name,
         WARN_ON_ONCE(!mapping_unevictable(inode->i_mapping));
 
         return inode;
+
+out:
+        iput(inode);
+
+        return ERR_PTR(err);
 }
 
 static struct file *kvm_gmem_inode_create_getfile(void *priv, loff_t size,
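As an aside on the data structure chosen above: the xarray stores one
tagged-integer entry per mappable page offset, so no per-entry allocation is
needed and an absent index simply means "not mappable". A standalone sketch
of the pattern (simplified; the real code in the later patches holds the
filemap invalidate lock around these operations):

#include <linux/xarray.h>

/* Sketch: mark offsets [start, end) mappable; xa_erase() would flip an
 * offset back to private.
 */
static int mark_mappable(struct xarray *mappable_offsets,
                         pgoff_t start, pgoff_t end)
{
        pgoff_t i;

        for (i = start; i < end; i++) {
                int err = xa_err(xa_store(mappable_offsets, i,
                                          xa_mk_value(true), GFP_KERNEL));
                if (err)
                        return err;
        }

        return 0;
}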
From patchwork Thu Oct 10 08:59:22 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13829796
Date: Thu, 10 Oct 2024 09:59:22 +0100
Message-ID: <20241010085930.1546800-4-tabba@google.com>
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Subject: [PATCH v3 03/11] KVM: guest_memfd: Introduce
 kvm_gmem_get_pfn_locked(), which retains the folio lock
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Create a new variant of kvm_gmem_get_pfn(), which retains the folio
lock if it returns successfully. This is needed in subsequent patches
in order to protect against races when checking whether a folio can be
mapped by the host.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/linux/kvm_host.h | 11 ++++++++++
 virt/kvm/guest_memfd.c   | 45 +++++++++++++++++++++++++++++---------
 2 files changed, 46 insertions(+), 10 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index db567d26f7b9..acf85995b582 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2464,6 +2464,9 @@ static inline bool kvm_mem_is_private(struct kvm *kvm, gfn_t gfn)
 #ifdef CONFIG_KVM_PRIVATE_MEM
 int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
                      gfn_t gfn, kvm_pfn_t *pfn, int *max_order);
+int kvm_gmem_get_pfn_locked(struct kvm *kvm, struct kvm_memory_slot *slot,
+                            gfn_t gfn, kvm_pfn_t *pfn, int *max_order);
+
 #else
 static inline int kvm_gmem_get_pfn(struct kvm *kvm,
                                    struct kvm_memory_slot *slot, gfn_t gfn,
@@ -2472,6 +2475,14 @@ static inline int kvm_gmem_get_pfn(struct kvm *kvm,
         KVM_BUG_ON(1, kvm);
         return -EIO;
 }
+static inline int kvm_gmem_get_pfn_locked(struct kvm *kvm,
+                                          struct kvm_memory_slot *slot,
+                                          gfn_t gfn,
+                                          kvm_pfn_t *pfn, int *max_order)
+{
+        KVM_BUG_ON(1, kvm);
+        return -EIO;
+}
 #endif /* CONFIG_KVM_PRIVATE_MEM */
 
 #ifdef CONFIG_HAVE_KVM_ARCH_GMEM_PREPARE
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index 4d3ba346c415..f414646c475b 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -714,34 +714,59 @@ __kvm_gmem_get_pfn(struct file *file, struct kvm_memory_slot *slot,
         return folio;
 }
 
-int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
-                     gfn_t gfn, kvm_pfn_t *pfn, int *max_order)
+static int
+kvm_gmem_get_pfn_folio_locked(struct kvm *kvm, struct kvm_memory_slot *slot,
+                              gfn_t gfn, kvm_pfn_t *pfn, int *max_order,
+                              struct folio **folio)
 {
         struct file *file = kvm_gmem_get_file(slot);
-        struct folio *folio;
         bool is_prepared = false;
         int r = 0;
 
         if (!file)
                 return -EFAULT;
 
-        folio = __kvm_gmem_get_pfn(file, slot, gfn, pfn, &is_prepared, max_order);
-        if (IS_ERR(folio)) {
-                r = PTR_ERR(folio);
+        *folio = __kvm_gmem_get_pfn(file, slot, gfn, pfn, &is_prepared, max_order);
+        if (IS_ERR(*folio)) {
+                r = PTR_ERR(*folio);
                 goto out;
         }
 
         if (!is_prepared)
-                r = kvm_gmem_prepare_folio(kvm, slot, gfn, folio);
+                r = kvm_gmem_prepare_folio(kvm, slot, gfn, *folio);
 
-        folio_unlock(folio);
-        if (r < 0)
-                folio_put(folio);
+        if (r) {
+                folio_unlock(*folio);
+                folio_put(*folio);
+        }
 
 out:
         fput(file);
         return r;
 }
+
+int kvm_gmem_get_pfn_locked(struct kvm *kvm, struct kvm_memory_slot *slot,
+                            gfn_t gfn, kvm_pfn_t *pfn, int *max_order)
+{
+        struct folio *folio;
+
+        return kvm_gmem_get_pfn_folio_locked(kvm, slot, gfn, pfn, max_order, &folio);
+
+}
+EXPORT_SYMBOL_GPL(kvm_gmem_get_pfn_locked);
+
+int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
+                     gfn_t gfn, kvm_pfn_t *pfn, int *max_order)
+{
+        struct folio *folio;
+        int r;
+
+        r = kvm_gmem_get_pfn_folio_locked(kvm, slot, gfn, pfn, max_order, &folio);
+        if (!r)
+                folio_unlock(folio);
+
+        return r;
+}
 EXPORT_SYMBOL_GPL(kvm_gmem_get_pfn);
 
 #ifdef CONFIG_KVM_GENERIC_PRIVATE_MEM
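For illustration, the calling convention the new helper enables looks like
the sketch below (a hypothetical caller; the next patch in the series uses
the same shape when checking a folio's refcount): on success the backing
folio is returned locked, so the caller can perform checks that must not
race with fault or mappability paths before unlocking.

/* Sketch: race-free refcount inspection using the _locked variant. */
static bool gfn_has_extra_refs(struct kvm *kvm, struct kvm_memory_slot *slot,
                               gfn_t gfn)
{
        struct page *page;
        kvm_pfn_t pfn;
        bool extra;

        if (kvm_gmem_get_pfn_locked(kvm, slot, gfn, &pfn, NULL))
                return false;

        page = pfn_to_page(pfn);

        /* Two references are expected: the mapping's and the one just taken. */
        extra = page_ref_count(page) > 2;

        put_page(page);
        unlock_page(page);

        return extra;
}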
From patchwork Thu Oct 10 08:59:23 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13829797
Date: Thu, 10 Oct 2024 09:59:23 +0100
Message-ID: <20241010085930.1546800-5-tabba@google.com>
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Subject: [PATCH v3 04/11] KVM: guest_memfd: Allow host to mmap guest_memfd()
 pages when shared
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Add support for mmap() and fault() for guest_memfd in the host.
The ability to fault in a guest page is contingent on that page being
shared with the host.

The guest_memfd PRIVATE memory attribute is not used for two reasons.
First, it reflects the userspace expectation for that memory location,
and can therefore be toggled by userspace. Second, although each
guest_memfd file has a 1:1 binding with a KVM instance, the plan is to
allow multiple files per inode, e.g. to allow intra-host migration to
a new KVM instance without destroying the guest_memfd.

The mapping is restricted to memory explicitly shared with the host.
KVM checks that the host doesn't have any mappings for private memory
via the folio's refcount. To avoid races between paths that check
mappability and paths that check whether the host has any mappings
(via the refcount), the folio lock is held while either check is
performed.

This new feature is gated with a new configuration option,
CONFIG_KVM_GMEM_MAPPABLE.

Co-developed-by: Ackerley Tng <ackerleytng@google.com>
Signed-off-by: Ackerley Tng <ackerleytng@google.com>
Co-developed-by: Elliot Berman <quic_eberman@quicinc.com>
Signed-off-by: Elliot Berman <quic_eberman@quicinc.com>
Signed-off-by: Fuad Tabba <tabba@google.com>
---
Note that the functions kvm_gmem_is_mapped(), kvm_gmem_set_mappable(),
and kvm_gmem_clear_mappable() are not used in this patch series.
They are intended to be used in future patches [*], which check and
toggle mappability when the guest shares/unshares pages with the host.

[*] https://android-kvm.googlesource.com/linux/+/refs/heads/tabba/guestmem-6.12-v3-pkvm

---
 include/linux/kvm_host.h |  52 +++++++++++
 virt/kvm/Kconfig         |   4 +
 virt/kvm/guest_memfd.c   | 185 +++++++++++++++++++++++++++++++++++++++
 virt/kvm/kvm_main.c      | 138 +++++++++++++++++++++++++++++
 4 files changed, 379 insertions(+)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index acf85995b582..bda7fda9945e 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -2527,4 +2527,56 @@ long kvm_arch_vcpu_pre_fault_memory(struct kvm_vcpu *vcpu,
                                     struct kvm_pre_fault_memory *range);
 #endif
 
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+bool kvm_gmem_is_mappable(struct kvm *kvm, gfn_t gfn, gfn_t end);
+bool kvm_gmem_is_mapped(struct kvm *kvm, gfn_t start, gfn_t end);
+int kvm_gmem_set_mappable(struct kvm *kvm, gfn_t start, gfn_t end);
+int kvm_gmem_clear_mappable(struct kvm *kvm, gfn_t start, gfn_t end);
+int kvm_slot_gmem_set_mappable(struct kvm_memory_slot *slot, gfn_t start,
+                               gfn_t end);
+int kvm_slot_gmem_clear_mappable(struct kvm_memory_slot *slot, gfn_t start,
+                                 gfn_t end);
+bool kvm_slot_gmem_is_mappable(struct kvm_memory_slot *slot, gfn_t gfn);
+#else
+static inline bool kvm_gmem_is_mappable(struct kvm *kvm, gfn_t gfn, gfn_t end)
+{
+        WARN_ON_ONCE(1);
+        return false;
+}
+static inline bool kvm_gmem_is_mapped(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+        WARN_ON_ONCE(1);
+        return false;
+}
+static inline int kvm_gmem_set_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+        WARN_ON_ONCE(1);
+        return -EINVAL;
+}
+static inline int kvm_gmem_clear_mappable(struct kvm *kvm, gfn_t start,
+                                          gfn_t end)
+{
+        WARN_ON_ONCE(1);
+        return -EINVAL;
+}
+static inline int kvm_slot_gmem_set_mappable(struct kvm_memory_slot *slot,
+                                             gfn_t start, gfn_t end)
+{
+        WARN_ON_ONCE(1);
+        return -EINVAL;
+}
+static inline int kvm_slot_gmem_clear_mappable(struct kvm_memory_slot *slot,
+                                               gfn_t start, gfn_t end)
+{
+        WARN_ON_ONCE(1);
+        return -EINVAL;
+}
+static inline bool kvm_slot_gmem_is_mappable(struct kvm_memory_slot *slot,
+                                             gfn_t gfn)
+{
+        WARN_ON_ONCE(1);
+        return false;
+}
+#endif /* CONFIG_KVM_GMEM_MAPPABLE */
+
 #endif
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index fd6a3010afa8..2cfcb0848e37 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -120,3 +120,7 @@ config HAVE_KVM_ARCH_GMEM_PREPARE
 config HAVE_KVM_ARCH_GMEM_INVALIDATE
         bool
         depends on KVM_PRIVATE_MEM
+
+config KVM_GMEM_MAPPABLE
+        select KVM_PRIVATE_MEM
+        bool
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index f414646c475b..df3a6f05a16e 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -370,7 +370,184 @@ static void kvm_gmem_init_mount(void)
         kvm_gmem_mnt->mnt_flags |= MNT_NOEXEC;
 }
 
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+static struct folio *
+__kvm_gmem_get_pfn(struct file *file, struct kvm_memory_slot *slot,
+                   gfn_t gfn, kvm_pfn_t *pfn, bool *is_prepared,
+                   int *max_order);
+
+static int gmem_set_mappable(struct inode *inode, pgoff_t start, pgoff_t end)
+{
+        struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
+        void *xval = xa_mk_value(true);
+        pgoff_t i;
+        bool r;
+
+        filemap_invalidate_lock(inode->i_mapping);
+        for (i = start; i < end; i++) {
+                r = xa_err(xa_store(mappable_offsets, i, xval, GFP_KERNEL));
+                if (r)
+                        break;
+        }
+        filemap_invalidate_unlock(inode->i_mapping);
+
+        return r;
+}
+
+static int gmem_clear_mappable(struct inode *inode, pgoff_t start, pgoff_t end)
+{
+        struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
+        pgoff_t i;
+        int r = 0;
+
+        filemap_invalidate_lock(inode->i_mapping);
+        for (i = start; i < end; i++) {
+                struct folio *folio;
+
+                /*
+                 * Holds the folio lock until after checking its refcount,
+                 * to avoid races with paths that fault in the folio.
+                 */
+                folio = kvm_gmem_get_folio(inode, i);
+                if (WARN_ON_ONCE(IS_ERR(folio)))
+                        continue;
+
+                /*
+                 * Check that the host doesn't have any mappings on clearing
+                 * the mappable flag, because clearing the flag implies that the
+                 * memory will be unshared from the host. Therefore, to maintain
+                 * the invariant that the host cannot access private memory, we
+                 * need to check that it doesn't have any mappings to that
+                 * memory before making it private.
+                 *
+                 * Two references are expected because of kvm_gmem_get_folio().
+                 */
+                if (folio_ref_count(folio) > 2)
+                        r = -EPERM;
+                else
+                        xa_erase(mappable_offsets, i);
+
+                folio_put(folio);
+                folio_unlock(folio);
+
+                if (r)
+                        break;
+        }
+        filemap_invalidate_unlock(inode->i_mapping);
+
+        return r;
+}
+
+static bool gmem_is_mappable(struct inode *inode, pgoff_t pgoff)
+{
+        struct xarray *mappable_offsets = &kvm_gmem_private(inode)->mappable_offsets;
+        bool r;
+
+        filemap_invalidate_lock_shared(inode->i_mapping);
+        r = xa_find(mappable_offsets, &pgoff, pgoff, XA_PRESENT);
+        filemap_invalidate_unlock_shared(inode->i_mapping);
+
+        return r;
+}
+
+int kvm_slot_gmem_set_mappable(struct kvm_memory_slot *slot, gfn_t start, gfn_t end)
+{
+        struct inode *inode = file_inode(slot->gmem.file);
+        pgoff_t start_off = slot->gmem.pgoff + start - slot->base_gfn;
+        pgoff_t end_off = start_off + end - start;
+
+        return gmem_set_mappable(inode, start_off, end_off);
+}
+
+int kvm_slot_gmem_clear_mappable(struct kvm_memory_slot *slot, gfn_t start, gfn_t end)
+{
+        struct inode *inode = file_inode(slot->gmem.file);
+        pgoff_t start_off = slot->gmem.pgoff + start - slot->base_gfn;
+        pgoff_t end_off = start_off + end - start;
+
+        return gmem_clear_mappable(inode, start_off, end_off);
+}
+
+bool kvm_slot_gmem_is_mappable(struct kvm_memory_slot *slot, gfn_t gfn)
+{
+        struct inode *inode = file_inode(slot->gmem.file);
+        unsigned long pgoff = slot->gmem.pgoff + gfn - slot->base_gfn;
+
+        return gmem_is_mappable(inode, pgoff);
+}
+
+static vm_fault_t kvm_gmem_fault(struct vm_fault *vmf)
+{
+        struct inode *inode = file_inode(vmf->vma->vm_file);
+        struct folio *folio;
+        vm_fault_t ret = VM_FAULT_LOCKED;
+
+        /*
+         * Holds the folio lock until after checking whether it can be faulted
+         * in, to avoid races with paths that change a folio's mappability.
+         */
+        folio = kvm_gmem_get_folio(inode, vmf->pgoff);
+        if (!folio)
+                return VM_FAULT_SIGBUS;
+
+        if (folio_test_hwpoison(folio)) {
+                ret = VM_FAULT_HWPOISON;
+                goto out;
+        }
+
+        if (!gmem_is_mappable(inode, vmf->pgoff)) {
+                ret = VM_FAULT_SIGBUS;
+                goto out;
+        }
+
+        if (!folio_test_uptodate(folio)) {
+                unsigned long nr_pages = folio_nr_pages(folio);
+                unsigned long i;
+
+                for (i = 0; i < nr_pages; i++)
+                        clear_highpage(folio_page(folio, i));
+
+                folio_mark_uptodate(folio);
+        }
+
+        vmf->page = folio_file_page(folio, vmf->pgoff);
+out:
+        if (ret != VM_FAULT_LOCKED) {
+                folio_put(folio);
+                folio_unlock(folio);
+        }
+
+        return ret;
+}
+
+static const struct vm_operations_struct kvm_gmem_vm_ops = {
+        .fault = kvm_gmem_fault,
+};
+
+static int kvm_gmem_mmap(struct file *file, struct vm_area_struct *vma)
+{
+        if ((vma->vm_flags & (VM_SHARED | VM_MAYSHARE)) !=
+            (VM_SHARED | VM_MAYSHARE)) {
+                return -EINVAL;
+        }
+
+        file_accessed(file);
+        vm_flags_set(vma, VM_DONTDUMP);
+        vma->vm_ops = &kvm_gmem_vm_ops;
+
+        return 0;
+}
+#else
+static int gmem_set_mappable(struct inode *inode, pgoff_t start, pgoff_t end)
+{
+        WARN_ON_ONCE(1);
+        return -EINVAL;
+}
+#define kvm_gmem_mmap NULL
+#endif /* CONFIG_KVM_GMEM_MAPPABLE */
+
 static struct file_operations kvm_gmem_fops = {
+        .mmap           = kvm_gmem_mmap,
         .open           = generic_file_open,
         .release        = kvm_gmem_release,
         .fallocate      = kvm_gmem_fallocate,
@@ -557,6 +734,14 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
                 goto err_gmem;
         }
 
+        if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE)) {
+                err = gmem_set_mappable(file_inode(file), 0, size >> PAGE_SHIFT);
+                if (err) {
+                        fput(file);
+                        goto err_gmem;
+                }
+        }
+
         kvm_get_kvm(kvm);
         gmem->kvm = kvm;
         xa_init(&gmem->bindings);
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 05cbb2548d99..aed9cf2f1685 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3263,6 +3263,144 @@ static int next_segment(unsigned long len, int offset)
         return len;
 }
 
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+static bool __kvm_gmem_is_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+        struct kvm_memslot_iter iter;
+
+        lockdep_assert_held(&kvm->slots_lock);
+
+        kvm_for_each_memslot_in_gfn_range(&iter, kvm_memslots(kvm), start, end) {
+                struct kvm_memory_slot *memslot = iter.slot;
+                gfn_t gfn_start, gfn_end, i;
+
+                gfn_start = max(start, memslot->base_gfn);
+                gfn_end = min(end, memslot->base_gfn + memslot->npages);
+                if (WARN_ON_ONCE(gfn_start >= gfn_end))
+                        continue;
+
+                for (i = gfn_start; i < gfn_end; i++) {
+                        if (!kvm_slot_gmem_is_mappable(memslot, i))
+                                return false;
+                }
+        }
+
+        return true;
+}
+
+bool kvm_gmem_is_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+        bool r;
+
+        mutex_lock(&kvm->slots_lock);
+        r = __kvm_gmem_is_mappable(kvm, start, end);
+        mutex_unlock(&kvm->slots_lock);
+
+        return r;
+}
+
+static bool kvm_gmem_is_pfn_mapped(struct kvm *kvm, struct kvm_memory_slot *memslot, gfn_t gfn_idx)
+{
+        struct page *page;
+        bool is_mapped;
+        kvm_pfn_t pfn;
+
+        /*
+         * Holds the folio lock until after checking its refcount,
+         * to avoid races with paths that fault in the folio.
+         */
+        if (WARN_ON_ONCE(kvm_gmem_get_pfn_locked(kvm, memslot, gfn_idx, &pfn, NULL)))
+                return false;
+
+        page = pfn_to_page(pfn);
+
+        /* Two references are expected because of kvm_gmem_get_pfn_locked(). */
+        is_mapped = page_ref_count(page) > 2;
+
+        put_page(page);
+        unlock_page(page);
+
+        return is_mapped;
+}
+
+static bool __kvm_gmem_is_mapped(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+        struct kvm_memslot_iter iter;
+
+        lockdep_assert_held(&kvm->slots_lock);
+
+        kvm_for_each_memslot_in_gfn_range(&iter, kvm_memslots(kvm), start, end) {
+                struct kvm_memory_slot *memslot = iter.slot;
+                gfn_t gfn_start, gfn_end, i;
+
+                gfn_start = max(start, memslot->base_gfn);
+                gfn_end = min(end, memslot->base_gfn + memslot->npages);
+                if (WARN_ON_ONCE(gfn_start >= gfn_end))
+                        continue;
+
+                for (i = gfn_start; i < gfn_end; i++) {
+                        if (kvm_gmem_is_pfn_mapped(kvm, memslot, i))
+                                return true;
+                }
+        }
+
+        return false;
+}
+
+bool kvm_gmem_is_mapped(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+        bool r;
+
+        mutex_lock(&kvm->slots_lock);
+        r = __kvm_gmem_is_mapped(kvm, start, end);
+        mutex_unlock(&kvm->slots_lock);
+
+        return r;
+}
+
+static int kvm_gmem_toggle_mappable(struct kvm *kvm, gfn_t start, gfn_t end,
+                                    bool is_mappable)
+{
+        struct kvm_memslot_iter iter;
+        int r = 0;
+
+        mutex_lock(&kvm->slots_lock);
+
+        kvm_for_each_memslot_in_gfn_range(&iter, kvm_memslots(kvm), start, end) {
+                struct kvm_memory_slot *memslot = iter.slot;
+                gfn_t gfn_start, gfn_end;
+
+                gfn_start = max(start, memslot->base_gfn);
+                gfn_end = min(end, memslot->base_gfn + memslot->npages);
+                if (WARN_ON_ONCE(start >= end))
+                        continue;
+
+                if (is_mappable)
+                        r = kvm_slot_gmem_set_mappable(memslot, gfn_start, gfn_end);
+                else
+                        r = kvm_slot_gmem_clear_mappable(memslot, gfn_start, gfn_end);
+
+                if (WARN_ON_ONCE(r))
+                        break;
+        }
+
+        mutex_unlock(&kvm->slots_lock);
+
+        return r;
+}
+
+int kvm_gmem_set_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+        return kvm_gmem_toggle_mappable(kvm, start, end, true);
+}
+
+int kvm_gmem_clear_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
+{
+        return kvm_gmem_toggle_mappable(kvm, start, end, false);
+}
+
+#endif /* CONFIG_KVM_GMEM_MAPPABLE */
+
 /* Copy @len bytes from guest memory at '(@gfn * PAGE_SIZE) + @offset' to @data */
 static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
                                  void *data, int offset, int len)
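From the host's point of view, what this patch enables is a sketch like the
following (assuming gmem_fd came from KVM_CREATE_GUEST_MEMFD and the range
is currently mappable). kvm_gmem_mmap() above rejects mappings without
VM_SHARED | VM_MAYSHARE, and faulting a page that is not mappable raises
SIGBUS from kvm_gmem_fault():

#include <stddef.h>
#include <sys/mman.h>

/* Sketch: map a shared guest_memfd range into the host's address space. */
static void *map_gmem(int gmem_fd, size_t size)
{
        /* MAP_SHARED is mandatory; private mappings get -EINVAL. */
        return mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED,
                    gmem_fd, 0);
}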
From patchwork Thu Oct 10 08:59:24 2024
X-Patchwork-Submitter: Fuad Tabba
X-Patchwork-Id: 13829798
Date: Thu, 10 Oct 2024 09:59:24 +0100
Message-ID: <20241010085930.1546800-6-tabba@google.com>
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
References: <20241010085930.1546800-1-tabba@google.com>
Subject: [PATCH v3 05/11] KVM: guest_memfd: Add guest_memfd support to
 kvm_(read|write)_guest_page()
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Make kvm_(read|write)_guest_page() capable of accessing guest memory
for slots that don't have a userspace address, but only if the memory
is mappable, which also indicates that it is accessible by the host.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 virt/kvm/kvm_main.c | 137 ++++++++++++++++++++++++++++++++++++++------
 1 file changed, 118 insertions(+), 19 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index aed9cf2f1685..77e6412034b9 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3399,23 +3399,114 @@ int kvm_gmem_clear_mappable(struct kvm *kvm, gfn_t start, gfn_t end)
         return kvm_gmem_toggle_mappable(kvm, start, end, false);
 }
 
+static int __kvm_read_guest_memfd_page(struct kvm *kvm,
+                                       struct kvm_memory_slot *slot,
+                                       gfn_t gfn, void *data, int offset,
+                                       int len)
+{
+        struct page *page;
+        u64 pfn;
+        int r;
+
+        /*
+         * Holds the folio lock until after checking whether it can be faulted
+         * in, to avoid races with paths that change a folio's mappability.
+         */
+        r = kvm_gmem_get_pfn_locked(kvm, slot, gfn, &pfn, NULL);
+        if (r)
+                return r;
+
+        page = pfn_to_page(pfn);
+
+        if (!kvm_gmem_is_mappable(kvm, gfn, gfn + 1)) {
+                r = -EPERM;
+                goto unlock;
+        }
+        memcpy(data, page_address(page) + offset, len);
+unlock:
+        if (r)
+                put_page(page);
+        else
+                kvm_release_pfn_clean(pfn);
+        unlock_page(page);
+
+        return r;
+}
+
+static int __kvm_write_guest_memfd_page(struct kvm *kvm,
+                                        struct kvm_memory_slot *slot,
+                                        gfn_t gfn, const void *data,
+                                        int offset, int len)
+{
+        struct page *page;
+        u64 pfn;
+        int r;
+
+        /*
+         * Holds the folio lock until after checking whether it can be faulted
+         * in, to avoid races with paths that change a folio's mappability.
+         */
+        r = kvm_gmem_get_pfn_locked(kvm, slot, gfn, &pfn, NULL);
+        if (r)
+                return r;
+
+        page = pfn_to_page(pfn);
+
+        if (!kvm_gmem_is_mappable(kvm, gfn, gfn + 1)) {
+                r = -EPERM;
+                goto unlock;
+        }
+        memcpy(page_address(page) + offset, data, len);
+unlock:
+        if (r)
+                put_page(page);
+        else
+                kvm_release_pfn_dirty(pfn);
+        unlock_page(page);
+
+        return r;
+}
+#else
+static int __kvm_read_guest_memfd_page(struct kvm *kvm,
+                                       struct kvm_memory_slot *slot,
+                                       gfn_t gfn, void *data, int offset,
+                                       int len)
+{
+        WARN_ON_ONCE(1);
+        return -EIO;
+}
+
+static int __kvm_write_guest_memfd_page(struct kvm *kvm,
+                                        struct kvm_memory_slot *slot,
+                                        gfn_t gfn, const void *data,
+                                        int offset, int len)
+{
+        WARN_ON_ONCE(1);
+        return -EIO;
+}
 #endif /* CONFIG_KVM_GMEM_MAPPABLE */
 
 /* Copy @len bytes from guest memory at '(@gfn * PAGE_SIZE) + @offset' to @data */
-static int __kvm_read_guest_page(struct kvm_memory_slot *slot, gfn_t gfn,
-                                 void *data, int offset, int len)
+
+static int __kvm_read_guest_page(struct kvm *kvm, struct kvm_memory_slot *slot,
+                                 gfn_t gfn, void *data, int offset, int len)
 {
-        int r;
         unsigned long addr;
 
         if (WARN_ON_ONCE(offset + len > PAGE_SIZE))
                 return -EFAULT;
 
+        if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE) &&
+            kvm_slot_can_be_private(slot) &&
+            !slot->userspace_addr) {
+                return __kvm_read_guest_memfd_page(kvm, slot, gfn, data,
+                                                   offset, len);
+        }
+
         addr = gfn_to_hva_memslot_prot(slot, gfn, NULL);
         if (kvm_is_error_hva(addr))
                 return -EFAULT;
-        r = __copy_from_user(data, (void __user *)addr + offset, len);
-        if (r)
+        if (__copy_from_user(data, (void __user *)addr + offset, len))
                 return -EFAULT;
         return 0;
 }
@@ -3425,7 +3516,7 @@ int kvm_read_guest_page(struct kvm *kvm, gfn_t gfn, void *data, int offset,
 {
         struct kvm_memory_slot *slot = gfn_to_memslot(kvm, gfn);
 
-        return __kvm_read_guest_page(slot, gfn, data, offset, len);
+        return __kvm_read_guest_page(kvm, slot, gfn, data, offset, len);
 }
 EXPORT_SYMBOL_GPL(kvm_read_guest_page);
 
@@ -3434,7 +3525,7 @@ int kvm_vcpu_read_guest_page(struct kvm_vcpu *vcpu, gfn_t gfn, void *data,
 {
         struct kvm_memory_slot *slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn);
 
-        return __kvm_read_guest_page(slot, gfn, data, offset, len);
+        return __kvm_read_guest_page(vcpu->kvm, slot, gfn, data, offset, len);
 }
 EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_page);
 
@@ -3511,22 +3602,30 @@ EXPORT_SYMBOL_GPL(kvm_vcpu_read_guest_atomic);
 /* Copy @len bytes from @data into guest memory at '(@gfn * PAGE_SIZE) + @offset' */
 static int __kvm_write_guest_page(struct kvm *kvm,
-                                  struct kvm_memory_slot *memslot, gfn_t gfn,
-                                  const void *data, int offset, int len)
+                                  struct kvm_memory_slot *slot, gfn_t gfn,
+                                  const void *data, int offset, int len)
 {
-        int r;
-        unsigned long addr;
-
         if (WARN_ON_ONCE(offset + len > PAGE_SIZE))
                 return -EFAULT;
 
-        addr = gfn_to_hva_memslot(memslot, gfn);
-        if (kvm_is_error_hva(addr))
-                return -EFAULT;
-        r = __copy_to_user((void __user *)addr + offset, data, len);
-        if (r)
-                return -EFAULT;
-        mark_page_dirty_in_slot(kvm, memslot, gfn);
+        if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE) &&
+            kvm_slot_can_be_private(slot) &&
+            !slot->userspace_addr) {
+                int r = __kvm_write_guest_memfd_page(kvm, slot, gfn, data,
+                                                     offset, len);
+
+                if (r)
+                        return r;
+        } else {
+                unsigned long addr = gfn_to_hva_memslot(slot, gfn);
+
+                if (kvm_is_error_hva(addr))
+                        return -EFAULT;
+                if (__copy_to_user((void __user *)addr + offset, data, len))
+                        return -EFAULT;
+        }
+
+        mark_page_dirty_in_slot(kvm, slot, gfn);
         return 0;
 }
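The dispatch added above can be summarized by a small predicate (a
hypothetical helper, written out here only to make the condition explicit):
guest_memfd access is used only for slots that can be private and have no
userspace address to copy through.

/* Sketch (hypothetical helper, not in the patch): when true, reads and
 * writes go through guest_memfd instead of the usual HVA-based path.
 */
static bool use_gmem_access(const struct kvm_memory_slot *slot)
{
        return IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE) &&
               kvm_slot_can_be_private(slot) &&
               !slot->userspace_addr;
}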
From patchwork Thu Oct 10 08:59:25 2024
X-Patchwork-Submitter: Fuad Tabba <tabba@google.com>
X-Patchwork-Id: 13829799
Date: Thu, 10 Oct 2024 09:59:25 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-7-tabba@google.com>
Subject: [PATCH v3 06/11] KVM: guest_memfd: Add KVM capability to check if guest_memfd is host mappable
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Add the KVM capability KVM_CAP_GUEST_MEMFD_MAPPABLE, which is true if mapping guest memory is supported by the host.
Signed-off-by: Fuad Tabba <tabba@google.com>
---
 include/uapi/linux/kvm.h | 1 +
 virt/kvm/kvm_main.c      | 4 ++++
 2 files changed, 5 insertions(+)

diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 637efc055145..2c6057bab71c 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -933,6 +933,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_PRE_FAULT_MEMORY 236
 #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
 #define KVM_CAP_X86_GUEST_MODE 238
+#define KVM_CAP_GUEST_MEMFD_MAPPABLE 239
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 77e6412034b9..c2ff09197795 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -5176,6 +5176,10 @@ static int kvm_vm_ioctl_check_extension_generic(struct kvm *kvm, long arg)
 #ifdef CONFIG_KVM_PRIVATE_MEM
 	case KVM_CAP_GUEST_MEMFD:
 		return !kvm || kvm_arch_has_private_mem(kvm);
+#endif
+#ifdef CONFIG_KVM_GMEM_MAPPABLE
+	case KVM_CAP_GUEST_MEMFD_MAPPABLE:
+		return !kvm || kvm_arch_has_private_mem(kvm);
 #endif
 	default:
 		break;
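[As a usage sketch: a VMM can probe the new capability like any other KVM extension. This example is illustrative only and assumes uapi headers with this series applied; it is not part of the patch.]

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
	/* Probe system-wide support; a VM fd would work as well. */
	int kvm_fd = open("/dev/kvm", O_RDWR);

	if (kvm_fd < 0)
		return 1;

	if (ioctl(kvm_fd, KVM_CHECK_EXTENSION, KVM_CAP_GUEST_MEMFD_MAPPABLE) > 0)
		printf("guest_memfd can be mapped by the host\n");
	else
		printf("guest_memfd cannot be mapped by the host\n");

	return 0;
}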
From patchwork Thu Oct 10 08:59:26 2024
X-Patchwork-Submitter: Fuad Tabba <tabba@google.com>
X-Patchwork-Id: 13829800
Date: Thu, 10 Oct 2024 09:59:26 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-8-tabba@google.com>
Subject: [PATCH v3 07/11] KVM: guest_memfd: Add a guest_memfd() flag to initialize it as mappable
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org

Not all use cases require guest_memfd() to be mappable by the host when first created.
Add a new flag, GUEST_MEMFD_FLAG_INIT_MAPPABLE, which, when set on KVM_CREATE_GUEST_MEMFD, initializes the memory as mappable by the host. Otherwise, memory is private until shared by the guest with the host.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 Documentation/virt/kvm/api.rst | 4 ++++
 include/uapi/linux/kvm.h       | 1 +
 virt/kvm/guest_memfd.c         | 6 +++++-
 3 files changed, 10 insertions(+), 1 deletion(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index e32471977d0a..c503f9443335 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -6380,6 +6380,10 @@ most one mapping per page, i.e. binding multiple memory regions to a single
 guest_memfd range is not allowed (any number of memory regions can be bound to
 a single guest_memfd file, but the bound ranges must not overlap).
 
+If the capability KVM_CAP_GUEST_MEMFD_MAPPABLE is supported, then the flags
+field supports GUEST_MEMFD_FLAG_INIT_MAPPABLE, which initializes the memory
+as mappable by the host.
+
 See KVM_SET_USER_MEMORY_REGION2 for additional details.
 
 4.143 KVM_PRE_FAULT_MEMORY
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 2c6057bab71c..751f167d0f33 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1558,6 +1558,7 @@ struct kvm_memory_attributes {
 #define KVM_MEMORY_ATTRIBUTE_PRIVATE           (1ULL << 3)
 
 #define KVM_CREATE_GUEST_MEMFD	_IOWR(KVMIO,  0xd4, struct kvm_create_guest_memfd)
+#define GUEST_MEMFD_FLAG_INIT_MAPPABLE	BIT(0)
 
 struct kvm_create_guest_memfd {
 	__u64 size;
diff --git a/virt/kvm/guest_memfd.c b/virt/kvm/guest_memfd.c
index df3a6f05a16e..9080fa29cd8c 100644
--- a/virt/kvm/guest_memfd.c
+++ b/virt/kvm/guest_memfd.c
@@ -734,7 +734,8 @@ static int __kvm_gmem_create(struct kvm *kvm, loff_t size, u64 flags)
 		goto err_gmem;
 	}
 
-	if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE)) {
+	if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE) &&
+	    (flags & GUEST_MEMFD_FLAG_INIT_MAPPABLE)) {
 		err = gmem_set_mappable(file_inode(file), 0, size >> PAGE_SHIFT);
 		if (err) {
 			fput(file);
@@ -763,6 +764,9 @@ int kvm_gmem_create(struct kvm *kvm, struct kvm_create_guest_memfd *args)
 	u64 flags = args->flags;
 	u64 valid_flags = 0;
 
+	if (IS_ENABLED(CONFIG_KVM_GMEM_MAPPABLE))
+		valid_flags |= GUEST_MEMFD_FLAG_INIT_MAPPABLE;
+
 	if (flags & ~valid_flags)
 		return -EINVAL;
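[A hedged userspace sketch of the new flag, illustrative only and not part of the patch: vm_fd is assumed to be an existing VM file descriptor, and the helper name is hypothetical.]

#include <stddef.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <linux/kvm.h>

/*
 * Create a guest_memfd that is host-mappable from the start and map it.
 * Without GUEST_MEMFD_FLAG_INIT_MAPPABLE, the mmap() below would fail
 * because the memory starts out private.
 */
static int demo_create_mappable_gmem(int vm_fd, size_t size, void **mem)
{
	struct kvm_create_guest_memfd gmem = {
		.size = size,
		.flags = GUEST_MEMFD_FLAG_INIT_MAPPABLE,
	};
	int fd = ioctl(vm_fd, KVM_CREATE_GUEST_MEMFD, &gmem);

	if (fd < 0)
		return fd;

	*mem = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (*mem == MAP_FAILED) {
		close(fd);
		return -1;
	}

	return fd;
}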
From patchwork Thu Oct 10 08:59:27 2024
X-Patchwork-Submitter: Fuad Tabba <tabba@google.com>
X-Patchwork-Id: 13829801
Date: Thu, 10 Oct 2024 09:59:27 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-9-tabba@google.com>
Subject: [PATCH v3 08/11] KVM: guest_memfd: selftests: guest_memfd mmap() test when mapping is allowed
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Expand the guest_memfd selftests to test mapping guest memory when the capability is supported, and to verify that memory is not mappable when the capability isn't supported.

Also, build the guest_memfd selftest for aarch64.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 tools/testing/selftests/kvm/Makefile          |  1 +
 .../testing/selftests/kvm/guest_memfd_test.c  | 57 +++++++++++++++++--
 2 files changed, 53 insertions(+), 5 deletions(-)

diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 960cf6a77198..c4937b6c2a97 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -172,6 +172,7 @@ TEST_GEN_PROGS_aarch64 += coalesced_io_test
 TEST_GEN_PROGS_aarch64 += demand_paging_test
 TEST_GEN_PROGS_aarch64 += dirty_log_test
 TEST_GEN_PROGS_aarch64 += dirty_log_perf_test
+TEST_GEN_PROGS_aarch64 += guest_memfd_test
 TEST_GEN_PROGS_aarch64 += guest_print_test
 TEST_GEN_PROGS_aarch64 += get-reg-list
 TEST_GEN_PROGS_aarch64 += kvm_create_max_vcpus
diff --git a/tools/testing/selftests/kvm/guest_memfd_test.c b/tools/testing/selftests/kvm/guest_memfd_test.c
index ba0c8e996035..ae64027d5bd8 100644
--- a/tools/testing/selftests/kvm/guest_memfd_test.c
+++ b/tools/testing/selftests/kvm/guest_memfd_test.c
@@ -34,12 +34,55 @@ static void test_file_read_write(int fd)
 		    "pwrite on a guest_mem fd should fail");
 }
 
-static void test_mmap(int fd, size_t page_size)
+static void test_mmap_allowed(int fd, size_t total_size)
 {
+	size_t page_size = getpagesize();
+	char *mem;
+	int ret;
+	int i;
+
+	mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT(mem != MAP_FAILED, "mmaping() guest memory should pass.");
+
+	memset(mem, 0xaa, total_size);
+	for (i = 0; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], 0xaa);
+
+	ret = fallocate(fd, FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE, 0,
+			page_size);
+	TEST_ASSERT(!ret, "fallocate the first page should succeed");
+
+	for (i = 0; i < page_size; i++)
+		TEST_ASSERT_EQ(mem[i], 0x00);
+	for (; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], 0xaa);
+
+	memset(mem, 0xaa, total_size);
+	for (i = 0; i < total_size; i++)
+		TEST_ASSERT_EQ(mem[i], 0xaa);
+
+	ret = munmap(mem, total_size);
+	TEST_ASSERT(!ret, "munmap should succeed");
+}
+
+static void test_mmap_denied(int fd, size_t total_size)
+{
+	size_t page_size = getpagesize();
 	char *mem;
 
 	mem = mmap(NULL, page_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
 	TEST_ASSERT_EQ(mem, MAP_FAILED);
+
+	mem = mmap(NULL, total_size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
+	TEST_ASSERT_EQ(mem, MAP_FAILED);
+}
+
+static void test_mmap(int fd, size_t total_size)
+{
+	if (kvm_has_cap(KVM_CAP_GUEST_MEMFD_MAPPABLE))
+		test_mmap_allowed(fd, total_size);
+	else
+		test_mmap_denied(fd, total_size);
 }
 
 static void test_file_size(int fd, size_t page_size, size_t total_size)
@@ -172,13 +215,17 @@ static void test_create_guest_memfd_multiple(struct kvm_vm *vm)
 
 int main(int argc, char *argv[])
 {
-	size_t page_size;
+	uint64_t flags = 0;
+	struct kvm_vm *vm;
 	size_t total_size;
+	size_t page_size;
 	int fd;
-	struct kvm_vm *vm;
 
 	TEST_REQUIRE(kvm_has_cap(KVM_CAP_GUEST_MEMFD));
 
+	if (kvm_has_cap(KVM_CAP_GUEST_MEMFD_MAPPABLE))
+		flags |= GUEST_MEMFD_FLAG_INIT_MAPPABLE;
+
 	page_size = getpagesize();
 	total_size = page_size * 4;
 
@@ -187,10 +234,10 @@ int main(int argc, char *argv[])
 	test_create_guest_memfd_invalid(vm);
 	test_create_guest_memfd_multiple(vm);
 
-	fd = vm_create_guest_memfd(vm, total_size, 0);
+	fd = vm_create_guest_memfd(vm, total_size, flags);
 
 	test_file_read_write(fd);
-	test_mmap(fd, page_size);
+	test_mmap(fd, total_size);
 	test_file_size(fd, page_size, total_size);
 	test_fallocate(fd, page_size, total_size);
 	test_invalid_punch_hole(fd, page_size, total_size);
From patchwork Thu Oct 10 08:59:28 2024
X-Patchwork-Submitter: Fuad Tabba <tabba@google.com>
X-Patchwork-Id: 13829802
Date: Thu, 10 Oct 2024 09:59:28 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-10-tabba@google.com>
Subject: [PATCH v3 09/11] KVM: arm64: Skip VMA checks for slots without userspace address
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Memory slots backed by guest memory might be created with no intention of being mapped by the host. These are recognized by not having a userspace address in the memory slot. VMA checks are neither possible nor necessary for this kind of slot, so skip them.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/mmu.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index a509b63bd4dd..71ceea661701 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -987,6 +987,10 @@ static void stage2_unmap_memslot(struct kvm *kvm,
 	phys_addr_t size = PAGE_SIZE * memslot->npages;
 	hva_t reg_end = hva + size;
 
+	/* Host will not map this private memory without a userspace address. */
+	if (kvm_slot_can_be_private(memslot) && !hva)
+		return;
+
 	/*
 	 * A memory region could potentially cover multiple VMAs, and any holes
 	 * between them, so iterate over all of them to find out if we should
@@ -2126,6 +2130,10 @@ int kvm_arch_prepare_memory_region(struct kvm *kvm,
 		hva = new->userspace_addr;
 		reg_end = hva + (new->npages << PAGE_SHIFT);
 
+		/* Host will not map this private memory without a userspace address. */
+		if (kvm_slot_can_be_private(new) && !hva)
+			return 0;
+
 		mmap_read_lock(current->mm);
 		/*
 		 * A memory region could potentially cover multiple VMAs, and any holes
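[To illustrate the kind of slot this patch targets, here is a sketch of a VMM registering a guest_memfd-backed memslot with no userspace address. It assumes the KVM_SET_USER_MEMORY_REGION2 UAPI; vm_fd, gmem_fd, and the addresses are placeholders, and the helper name is hypothetical.]

#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

/*
 * Illustration only: register a guest_memfd-backed slot that the host
 * never intends to map. userspace_addr is left at 0, which is what the
 * hunks above key off to skip the VMA checks.
 */
static int demo_set_private_only_slot(int vm_fd, int gmem_fd, uint64_t gpa,
				      uint64_t size)
{
	struct kvm_userspace_memory_region2 region = {
		.slot = 0,
		.flags = KVM_MEM_GUEST_MEMFD,
		.guest_phys_addr = gpa,
		.memory_size = size,
		.userspace_addr = 0,	/* no host mapping intended */
		.guest_memfd = gmem_fd,
		.guest_memfd_offset = 0,
	};

	return ioctl(vm_fd, KVM_SET_USER_MEMORY_REGION2, &region);
}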
From patchwork Thu Oct 10 08:59:29 2024
X-Patchwork-Submitter: Fuad Tabba <tabba@google.com>
X-Patchwork-Id: 13829803
Date: Thu, 10 Oct 2024 09:59:29 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-11-tabba@google.com>
Subject: [PATCH v3 10/11] KVM: arm64: Handle guest_memfd()-backed guest page faults
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Add arm64 support for resolving guest page faults on guest_memfd()-backed memslots. This support is not contingent on pKVM or other confidential computing support, and works in both VHE and nVHE modes.

Without confidential computing, this support is useful for testing and debugging. In the future, it might also be useful should a user want to use guest_memfd() for all code, whether it's for a protected guest or not.

For now, the fault granule is restricted to PAGE_SIZE.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/kvm/mmu.c | 112 ++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 110 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 71ceea661701..250c59f0ca5b 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1422,6 +1422,108 @@ static bool kvm_vma_mte_allowed(struct vm_area_struct *vma)
 	return vma->vm_flags & VM_MTE_ALLOWED;
 }
 
+static int guest_memfd_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
+			     struct kvm_memory_slot *memslot, bool fault_is_perm)
+{
+	struct kvm_mmu_memory_cache *memcache = &vcpu->arch.mmu_page_cache;
+	bool exec_fault = kvm_vcpu_trap_is_exec_fault(vcpu);
+	bool logging_active = memslot_is_logging(memslot);
+	struct kvm_pgtable *pgt = vcpu->arch.hw_mmu->pgt;
+	enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_R;
+	bool write_fault = kvm_is_write_fault(vcpu);
+	struct mm_struct *mm = current->mm;
+	gfn_t gfn = gpa_to_gfn(fault_ipa);
+	struct kvm *kvm = vcpu->kvm;
+	struct page *page;
+	kvm_pfn_t pfn;
+	int ret;
+
+	/* For now, guest_memfd() only supports PAGE_SIZE granules. */
+	if (WARN_ON_ONCE(fault_is_perm &&
+			 kvm_vcpu_trap_get_perm_fault_granule(vcpu) != PAGE_SIZE)) {
+		return -EFAULT;
+	}
+
+	VM_BUG_ON(write_fault && exec_fault);
+
+	if (fault_is_perm && !write_fault && !exec_fault) {
+		kvm_err("Unexpected L2 read permission error\n");
+		return -EFAULT;
+	}
+
+	/*
+	 * Permission faults just need to update the existing leaf entry,
+	 * and so normally don't require allocations from the memcache. The
+	 * only exception to this is when dirty logging is enabled at runtime
+	 * and a write fault needs to collapse a block entry into a table.
+	 */
+	if (!fault_is_perm || (logging_active && write_fault)) {
+		ret = kvm_mmu_topup_memory_cache(memcache,
+						 kvm_mmu_cache_min_pages(vcpu->arch.hw_mmu));
+		if (ret)
+			return ret;
+	}
+
+	/*
+	 * Holds the folio lock until mapped in the guest and its refcount is
+	 * stable, to avoid races with paths that check if the folio is mapped
+	 * by the host.
+	 */
+	ret = kvm_gmem_get_pfn_locked(kvm, memslot, gfn, &pfn, NULL);
+	if (ret)
+		return ret;
+
+	page = pfn_to_page(pfn);
+
+	/*
+	 * Once it's faulted in, a guest_memfd() page will stay in memory.
+	 * Therefore, count it as locked.
+	 */
+	if (!fault_is_perm) {
+		ret = account_locked_vm(mm, 1, true);
+		if (ret)
+			goto unlock_page;
+	}
+
+	read_lock(&kvm->mmu_lock);
+	if (write_fault)
+		prot |= KVM_PGTABLE_PROT_W;
+
+	if (exec_fault)
+		prot |= KVM_PGTABLE_PROT_X;
+
+	if (cpus_have_final_cap(ARM64_HAS_CACHE_DIC))
+		prot |= KVM_PGTABLE_PROT_X;
+
+	/*
+	 * For a FSC_PERM (permission) fault, we only need to relax the
+	 * existing stage-2 permissions.
+	 */
+	if (fault_is_perm)
+		ret = kvm_pgtable_stage2_relax_perms(pgt, fault_ipa, prot);
+	else
+		ret = kvm_pgtable_stage2_map(pgt, fault_ipa, PAGE_SIZE,
+					     __pfn_to_phys(pfn), prot,
+					     memcache,
+					     KVM_PGTABLE_WALK_HANDLE_FAULT |
+					     KVM_PGTABLE_WALK_SHARED);
+
+	/* Mark the page dirty only if the fault is handled successfully */
+	if (write_fault && !ret) {
+		kvm_set_pfn_dirty(pfn);
+		mark_page_dirty_in_slot(kvm, memslot, gfn);
+	}
+	read_unlock(&kvm->mmu_lock);
+
+	if (ret && !fault_is_perm)
+		account_locked_vm(mm, 1, false);
+unlock_page:
+	put_page(page);
+	unlock_page(page);
+
+	return ret != -EAGAIN ? ret : 0;
+}
+
 static int user_mem_abort(struct kvm_vcpu *vcpu, phys_addr_t fault_ipa,
 			  struct kvm_s2_trans *nested,
 			  struct kvm_memory_slot *memslot, unsigned long hva,
@@ -1893,8 +1995,14 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 		goto out_unlock;
 	}
 
-	ret = user_mem_abort(vcpu, fault_ipa, nested, memslot, hva,
-			     esr_fsc_is_permission_fault(esr));
+	if (kvm_slot_can_be_private(memslot)) {
+		ret = guest_memfd_abort(vcpu, fault_ipa, memslot,
+					esr_fsc_is_permission_fault(esr));
+	} else {
+		ret = user_mem_abort(vcpu, fault_ipa, nested, memslot, hva,
+				     esr_fsc_is_permission_fault(esr));
+	}
+
 	if (ret == 0)
 		ret = 1;
 out:
From patchwork Thu Oct 10 08:59:30 2024
X-Patchwork-Submitter: Fuad Tabba <tabba@google.com>
X-Patchwork-Id: 13829804
Date: Thu, 10 Oct 2024 09:59:30 +0100
In-Reply-To: <20241010085930.1546800-1-tabba@google.com>
Message-ID: <20241010085930.1546800-12-tabba@google.com>
Subject: [PATCH v3 11/11] KVM: arm64: Enable guest_memfd private memory when pKVM is enabled
From: Fuad Tabba <tabba@google.com>
To: kvm@vger.kernel.org, linux-arm-msm@vger.kernel.org, linux-mm@kvack.org
Implement kvm_arch_has_private_mem() in arm64 when pKVM is enabled, and make it dependent on the CONFIG_KVM_PRIVATE_MEM configuration option.

Also, now that the infrastructure is in place for arm64 to support guest private memory, enable it in the arm64 kernel configuration.

Signed-off-by: Fuad Tabba <tabba@google.com>
---
 arch/arm64/include/asm/kvm_host.h | 3 +++
 arch/arm64/kvm/Kconfig            | 1 +
 2 files changed, 4 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 94cff508874b..eec32e537097 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -1496,4 +1496,7 @@ void kvm_set_vm_id_reg(struct kvm *kvm, u32 reg, u64 val);
 	(system_supports_fpmr() &&					\
 	 kvm_has_feat((k), ID_AA64PFR2_EL1, FPMR, IMP))
 
+#define kvm_arch_has_private_mem(kvm)					\
+	(IS_ENABLED(CONFIG_KVM_PRIVATE_MEM) && is_protected_kvm_enabled())
+
 #endif /* __ARM64_KVM_HOST_H__ */
diff --git a/arch/arm64/kvm/Kconfig b/arch/arm64/kvm/Kconfig
index ead632ad01b4..fe3451f244b5 100644
--- a/arch/arm64/kvm/Kconfig
+++ b/arch/arm64/kvm/Kconfig
@@ -38,6 +38,7 @@ menuconfig KVM
 	select HAVE_KVM_VCPU_RUN_PID_CHANGE
 	select SCHED_INFO
 	select GUEST_PERF_EVENTS if PERF_EVENTS
+	select KVM_GMEM_MAPPABLE
 	help
 	  Support hosting virtualized guest machines.