From patchwork Tue Jan 18 13:21:21 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Chao Peng
X-Patchwork-Id: 12716366
From: Chao Peng
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, qemu-devel@nongnu.org
Cc: Paolo Bonzini, Jonathan Corbet, Sean Christopherson, Vitaly Kuznetsov,
	Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, x86@kernel.org, "H. Peter Anvin", Hugh Dickins,
	Jeff Layton, "J. Bruce Fields", Andrew Morton, Yu Zhang, Chao Peng,
	"Kirill A.
Shutemov", luto@kernel.org, jun.nakajima@intel.com, dave.hansen@intel.com,
	ak@linux.intel.com, david@redhat.com
Subject: [PATCH v4 12/12] KVM: Expose KVM_MEM_PRIVATE
Date: Tue, 18 Jan 2022 21:21:21 +0800
Message-Id: <20220118132121.31388-13-chao.p.peng@linux.intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20220118132121.31388-1-chao.p.peng@linux.intel.com>
References: <20220118132121.31388-1-chao.p.peng@linux.intel.com>

KVM_MEM_PRIVATE is not exposed by default, but architecture code can turn it
on by implementing kvm_arch_private_memory_supported(). In addition, a
private memslot cannot be movable, and the same file+offset cannot be mapped
into different GFNs.
Signed-off-by: Yu Zhang
Signed-off-by: Chao Peng
---
 include/linux/kvm_host.h |  1 +
 virt/kvm/kvm_main.c      | 49 ++++++++++++++++++++++++++++++++++------
 2 files changed, 43 insertions(+), 7 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 117cf0da9c5e..444b390261c0 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -1328,6 +1328,7 @@ bool kvm_arch_dy_has_pending_interrupt(struct kvm_vcpu *vcpu);
 int kvm_arch_post_init_vm(struct kvm *kvm);
 void kvm_arch_pre_destroy_vm(struct kvm *kvm);
 int kvm_arch_create_vm_debugfs(struct kvm *kvm);
+bool kvm_arch_private_memory_supported(struct kvm *kvm);
 
 #ifndef __KVM_HAVE_ARCH_VM_ALLOC
 /*
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 10e553215618..51d0f08a8601 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1491,10 +1491,19 @@ static void kvm_replace_memslot(struct kvm *kvm,
 	}
 }
 
-static int check_memory_region_flags(const struct kvm_userspace_memory_region *mem)
+bool __weak kvm_arch_private_memory_supported(struct kvm *kvm)
+{
+	return false;
+}
+
+static int check_memory_region_flags(struct kvm *kvm,
+				const struct kvm_userspace_memory_region *mem)
 {
 	u32 valid_flags = KVM_MEM_LOG_DIRTY_PAGES;
 
+	if (kvm_arch_private_memory_supported(kvm))
+		valid_flags |= KVM_MEM_PRIVATE;
+
 #ifdef __KVM_HAVE_READONLY_MEM
 	valid_flags |= KVM_MEM_READONLY;
 #endif
@@ -1873,15 +1882,32 @@ static int kvm_set_memslot(struct kvm *kvm,
 }
 
 static bool kvm_check_memslot_overlap(struct kvm_memslots *slots, int id,
-				      gfn_t start, gfn_t end)
+				      struct file *file,
+				      gfn_t start, gfn_t end,
+				      loff_t start_off, loff_t end_off)
 {
 	struct kvm_memslot_iter iter;
+	struct kvm_memory_slot *slot;
+	struct inode *inode;
+	int bkt;
 
 	kvm_for_each_memslot_in_gfn_range(&iter, slots, start, end) {
 		if (iter.slot->id != id)
 			return true;
 	}
 
+	/* Disallow mapping the same file+offset into multiple gfns. */
+	if (file) {
+		inode = file_inode(file);
+		kvm_for_each_memslot(slot, bkt, slots) {
+			if (slot->private_file &&
+			    file_inode(slot->private_file) == inode &&
+			    !(end_off <= slot->private_offset ||
+			      start_off >= slot->private_offset +
+					   (slot->npages << PAGE_SHIFT)))
+				return true;
+		}
+	}
+
 	return false;
 }
 
@@ -1906,7 +1932,7 @@ int __kvm_set_memory_region(struct kvm *kvm,
 	int as_id, id;
 	int r;
 
-	r = check_memory_region_flags(mem);
+	r = check_memory_region_flags(kvm, mem);
 	if (r)
 		return r;
 
@@ -1919,10 +1945,12 @@ int __kvm_set_memory_region(struct kvm *kvm,
 		return -EINVAL;
 	if (mem->guest_phys_addr & (PAGE_SIZE - 1))
 		return -EINVAL;
-	/* We can read the guest memory with __xxx_user() later on. */
 	if ((mem->userspace_addr & (PAGE_SIZE - 1)) ||
-	    (mem->userspace_addr != untagged_addr(mem->userspace_addr)) ||
-	     !access_ok((void __user *)(unsigned long)mem->userspace_addr,
+	    (mem->userspace_addr != untagged_addr(mem->userspace_addr)))
+		return -EINVAL;
+	/* We can read the guest memory with __xxx_user() later on. */
+	if (!(mem->flags & KVM_MEM_PRIVATE) &&
+	    !access_ok((void __user *)(unsigned long)mem->userspace_addr,
 			mem->memory_size))
 		return -EINVAL;
 	if (as_id >= KVM_ADDRESS_SPACE_NUM || id >= KVM_MEM_SLOTS_NUM)
@@ -1963,6 +1991,9 @@ int __kvm_set_memory_region(struct kvm *kvm,
 		if ((kvm->nr_memslot_pages + npages) < kvm->nr_memslot_pages)
 			return -EINVAL;
 	} else { /* Modify an existing slot. */
+		/* Private memslots are immutable, they can only be deleted. */
+		if (mem->flags & KVM_MEM_PRIVATE)
+			return -EINVAL;
 		if ((mem->userspace_addr != old->userspace_addr) ||
 		    (npages != old->npages) ||
 		    ((mem->flags ^ old->flags) & KVM_MEM_READONLY))
@@ -1983,7 +2014,11 @@ int __kvm_set_memory_region(struct kvm *kvm,
 	}
 
 	if ((change == KVM_MR_CREATE || change == KVM_MR_MOVE) &&
-	    kvm_check_memslot_overlap(slots, id, base_gfn, base_gfn + npages)) {
+	    kvm_check_memslot_overlap(slots, id, file,
+				      base_gfn, base_gfn + npages,
+				      region_ext->private_offset,
+				      region_ext->private_offset +
+						mem->memory_size)) {
		r = -EEXIST;
		goto out;
	}