From patchwork Thu Feb 3 01:00:29 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733668 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 17448C433FE for ; Thu, 3 Feb 2022 01:01:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348647AbiBCBBD (ORCPT ); Wed, 2 Feb 2022 20:01:03 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46148 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241392AbiBCBBC (ORCPT ); Wed, 2 Feb 2022 20:01:02 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B7441C061714 for ; Wed, 2 Feb 2022 17:01:02 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id h11-20020a170902eecb00b0014cc91d4bc4so274836plb.16 for ; Wed, 02 Feb 2022 17:01:02 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=bGFCrw73QNhf8q3al57zaK6xaZK1SHXoQE5u16lUVeY=; b=ooceTLI2YahhNYRyruUQirthjAMCPz1D7/s5nigLxsc2TDYJFv5YjG0bjRNjdkZ6de sbcEiso+Sa6kkNFHnic4y8S4qNCuf8Lem32wOmypQwZmZTcOuZZcTC28Bm9BKgvzdczR F8Is035jijRZi1TT6qzmZNIZvM8i88vmo7enqJu+dLdaGzNls0vY8u8hffPtrMJ4OAGa Mp687JHDmFKsBb5BveJYvioVwo9NfJ6DEVCIlGGrFZOnPL+RwVFPHcDQUNXrTCLBsuIg M1h7YEOYI3EqQ6s8fPjzyUhpek6uFYAkOdPT82DjScyOtGvjfATl5YORtog/UyIpMq6m vjaw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=bGFCrw73QNhf8q3al57zaK6xaZK1SHXoQE5u16lUVeY=; b=FF9xOzDMRkTtY7VHORSTSM8ANpsneKNJe3qCahIVG6RmDyfGbC/X0w2CDYnBvdsdZO p/h+wLr3yoDkwJdaPMn3QFpu+hmFn0wo1Z/A8uV0l9C2xinKCGRUJLyUKIhomMHjSXde ZV9xKkXwq/A7JZr7plmDdS1TkhSqazezEWbiG6gmEaU6PU3PQ3kZ/F9FkdCySM3FL3ns 6RTWUA/8DxANOk8buZcU600p3RHlclYhOA5GZctZBnjDnNZWasc29KpPRHbDzo0BFn3d gGvzRhEDdi6XiMlOOl0OzoxozuvlBMTxLj+BBLfSr97CrhEFok5KJZmhz+BbVGswvi4B rf/A== X-Gm-Message-State: AOAM533/ZnxSYORf6y3HDL3KJ7z2qEWKMnAmxBBHKT0FyLcBZXvKnBRq 6R3JJPwMtCRSGO6FkUehhGQHaqAS7DCunA== X-Google-Smtp-Source: ABdhPJww6OZq+QWEavKLmOGrJ7r+vgeszLp3RkgqsuOc5dX81fXxEH6QtgXeqExqzMbHcGAEoJM4B++UL4E+ww== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:902:8205:: with SMTP id x5mr22694118pln.29.1643850062201; Wed, 02 Feb 2022 17:01:02 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:29 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-2-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 01/23] KVM: x86/mmu: Optimize MMU page cache lookup for all direct SPs From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Commit fb58a9c345f6 ("KVM: x86/mmu: Optimize MMU page cache lookup for 
fully direct MMUs") skipped the unsync checks and write flood clearing for full direct MMUs. We can extend this further and skip the checks for all direct shadow pages. Direct shadow pages are never marked unsynced or have a non-zero write-flooding count. No functional change intended. Signed-off-by: David Matlack Reviewed-by: Sean Christopherson --- arch/x86/kvm/mmu/mmu.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 296f8723f9ae..6ca38277f2ab 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -2052,7 +2052,6 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, int direct, unsigned int access) { - bool direct_mmu = vcpu->arch.mmu->direct_map; union kvm_mmu_page_role role; struct hlist_head *sp_list; unsigned quadrant; @@ -2093,7 +2092,8 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, continue; } - if (direct_mmu) + /* unsync and write-flooding only apply to indirect SPs. */ + if (sp->role.direct) goto trace_get_page; if (sp->unsync) { From patchwork Thu Feb 3 01:00:30 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733669 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C9D0FC433EF for ; Thu, 3 Feb 2022 01:01:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348648AbiBCBBG (ORCPT ); Wed, 2 Feb 2022 20:01:06 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46158 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348649AbiBCBBF (ORCPT ); Wed, 2 Feb 2022 20:01:05 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E6562C061714 for ; Wed, 2 Feb 2022 17:01:04 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id mn21-20020a17090b189500b001b4fa60efcbso5614087pjb.2 for ; Wed, 02 Feb 2022 17:01:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=uMSFpNCTdcqnLaFYij7d569BarSs+O4ODWByz0fWKks=; b=sQV4+lBWGik150D0iVdQwCAhV9Ax5GcoH94Lc5K6m8Swqjgx3yWxtfUvHS2M6AFQYU F8xSfFFBLVlONnO+7Qz0rm/eKMWE8JO03Fz+0M0f5sjyF7ghA+ssNnKNib+Jqb649Pxc gfYJhlfpGSHkUbH71LNbU9Tk1zdR6FQYuqqUD/vzFedrZkzBxKT4BkEmZrSQp9LaW1n/ j6ZJMinRZDLhlzGUbQ/qv7ItCgCp+E0TB3gbc0YRMozOo6yzwOimRtH4pMvEOqvbnQd6 EVEusvMzGPM7IaHlDrW4Uv3NqZkBda/hohNjixTZY4O4qZRnDvb4fnvR3AxeGgddLsav 54uA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=uMSFpNCTdcqnLaFYij7d569BarSs+O4ODWByz0fWKks=; b=PIBfHUN6SUv7122vlWcg0fmHTH2CFT9/m6kFMlXSvPsu9e67FWn5T4Qmx/aVaKlaBQ eM9jhuJ/z1jh9HWSYR1ln5e9BG0FHSGzfJ1S6MNrESsNqIX/GQ+cXkZtr192eqzUQ/Gs A+DVhOBQKRGX82jITnLKYMWRZQpHq/sU66lUFPSF+SbjgC927EkRzjss2qvHcb2WuE1+ 2EBPWOq02Tofxj3VEnbL/B3rM0Rh6zwokjo/r5cNBMaLdInZu4+gwrtP1gRnxzZr6T1/ ES6ZrledM/IAKUAP0OAt3OHLJ9yyvz+pZiLk0EYgZ9B0VPC+9wVzBNqAwRMPd7SZHJiw Ce5w== X-Gm-Message-State: AOAM533w8/KsnlaxMp+nab3AkgWGukoAllj6igXBcqTr6sdxXQ45xUZd 3RB0dmRrLPL6EGAStBExVvbhLfKcunGF/Q== X-Google-Smtp-Source: 
ABdhPJwmpSkcO+hN9r2FueAzu+IBx4Nf88wTLol8M7YPhENUgKTOsadAbdl7wE6KOQZhVChgmsARZpqpBbEq1Q== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:90b:4d82:: with SMTP id oj2mr1187727pjb.1.1643850063939; Wed, 02 Feb 2022 17:01:03 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:30 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-3-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 02/23] KVM: x86/mmu: Derive shadow MMU page role from parent From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Instead of computing the shadow page role from scratch for every new page, we can derive most of the information from the parent shadow page. This avoids redundant calculations such as the quadrant, and reduces the number of parameters to kvm_mmu_get_page(). Preemptivel split out the role calculation to a separate function for use in a following commit. No functional change intended. Signed-off-by: David Matlack --- arch/x86/kvm/mmu/mmu.c | 71 ++++++++++++++++++++++------------ arch/x86/kvm/mmu/paging_tmpl.h | 9 +++-- 2 files changed, 51 insertions(+), 29 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 6ca38277f2ab..fc9a4d9c0ddd 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -2045,30 +2045,14 @@ static void clear_sp_write_flooding_count(u64 *spte) __clear_sp_write_flooding_count(sptep_to_sp(spte)); } -static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, - gfn_t gfn, - gva_t gaddr, - unsigned level, - int direct, - unsigned int access) +static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, gfn_t gfn, + union kvm_mmu_page_role role) { - union kvm_mmu_page_role role; struct hlist_head *sp_list; - unsigned quadrant; struct kvm_mmu_page *sp; int collisions = 0; LIST_HEAD(invalid_list); - role = vcpu->arch.mmu->mmu_role.base; - role.level = level; - role.direct = direct; - role.access = access; - if (role.has_4_byte_gpte) { - quadrant = gaddr >> (PAGE_SHIFT + (PT64_PT_BITS * level)); - quadrant &= (1 << ((PT32_PT_BITS - PT64_PT_BITS) * level)) - 1; - role.quadrant = quadrant; - } - sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]; for_each_valid_sp(vcpu->kvm, sp, sp_list) { if (sp->gfn != gfn) { @@ -2086,7 +2070,7 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, * Unsync pages must not be left as is, because the new * upper-level page will be write-protected. 
*/ - if (level > PG_LEVEL_4K && sp->unsync) + if (role.level > PG_LEVEL_4K && sp->unsync) kvm_mmu_prepare_zap_page(vcpu->kvm, sp, &invalid_list); continue; @@ -2125,14 +2109,14 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, ++vcpu->kvm->stat.mmu_cache_miss; - sp = kvm_mmu_alloc_page(vcpu, direct); + sp = kvm_mmu_alloc_page(vcpu, role.direct); sp->gfn = gfn; sp->role = role; hlist_add_head(&sp->hash_link, sp_list); - if (!direct) { + if (!role.direct) { account_shadowed(vcpu->kvm, sp); - if (level == PG_LEVEL_4K && kvm_vcpu_write_protect_gfn(vcpu, gfn)) + if (role.level == PG_LEVEL_4K && kvm_vcpu_write_protect_gfn(vcpu, gfn)) kvm_flush_remote_tlbs_with_address(vcpu->kvm, gfn, 1); } trace_kvm_mmu_get_page(sp, true); @@ -2144,6 +2128,31 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, return sp; } +static union kvm_mmu_page_role kvm_mmu_child_role(struct kvm_mmu_page *parent_sp, + bool direct, u32 access) +{ + union kvm_mmu_page_role role; + + role = parent_sp->role; + role.level--; + role.access = access; + role.direct = direct; + + return role; +} + +static struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu, + u64 *sptep, gfn_t gfn, + bool direct, u32 access) +{ + struct kvm_mmu_page *parent_sp = sptep_to_sp(sptep); + union kvm_mmu_page_role role; + + role = kvm_mmu_child_role(parent_sp, direct, access); + + return kvm_mmu_get_page(vcpu, gfn, role); +} + static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterator, struct kvm_vcpu *vcpu, hpa_t root, u64 addr) @@ -2942,8 +2951,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) if (is_shadow_present_pte(*it.sptep)) continue; - sp = kvm_mmu_get_page(vcpu, base_gfn, it.addr, - it.level - 1, true, ACC_ALL); + sp = kvm_mmu_get_child_sp(vcpu, it.sptep, base_gfn, true, ACC_ALL); link_shadow_page(vcpu, it.sptep, sp); if (fault->is_tdp && fault->huge_page_disallowed && @@ -3325,9 +3333,22 @@ static int mmu_check_root(struct kvm_vcpu *vcpu, gfn_t root_gfn) static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, gva_t gva, u8 level, bool direct) { + union kvm_mmu_page_role role; struct kvm_mmu_page *sp; + unsigned int quadrant; + + role = vcpu->arch.mmu->mmu_role.base; + role.level = level; + role.direct = direct; + role.access = ACC_ALL; + + if (role.has_4_byte_gpte) { + quadrant = gva >> (PAGE_SHIFT + (PT64_PT_BITS * level)); + quadrant &= (1 << ((PT32_PT_BITS - PT64_PT_BITS) * level)) - 1; + role.quadrant = quadrant; + } - sp = kvm_mmu_get_page(vcpu, gfn, gva, level, direct, ACC_ALL); + sp = kvm_mmu_get_page(vcpu, gfn, role); ++sp->root_count; return __pa(sp->spt); diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 5b5bdac97c7b..f93d4423a067 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -683,8 +683,9 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, if (!is_shadow_present_pte(*it.sptep)) { table_gfn = gw->table_gfn[it.level - 2]; access = gw->pt_access[it.level - 2]; - sp = kvm_mmu_get_page(vcpu, table_gfn, fault->addr, - it.level-1, false, access); + sp = kvm_mmu_get_child_sp(vcpu, it.sptep, table_gfn, + false, access); + /* * We must synchronize the pagetable before linking it * because the guest doesn't need to flush tlb when @@ -740,8 +741,8 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, drop_large_spte(vcpu, it.sptep); if (!is_shadow_present_pte(*it.sptep)) { - sp = kvm_mmu_get_page(vcpu, base_gfn, 
fault->addr, - it.level - 1, true, direct_access); + sp = kvm_mmu_get_child_sp(vcpu, it.sptep, base_gfn, + true, direct_access); link_shadow_page(vcpu, it.sptep, sp); if (fault->huge_page_disallowed && fault->req_level >= it.level) From patchwork Thu Feb 3 01:00:31 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733670 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 40559C433F5 for ; Thu, 3 Feb 2022 01:01:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348656AbiBCBBI (ORCPT ); Wed, 2 Feb 2022 20:01:08 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46166 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348652AbiBCBBG (ORCPT ); Wed, 2 Feb 2022 20:01:06 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5A9C3C061714 for ; Wed, 2 Feb 2022 17:01:06 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id u24-20020a656718000000b0035e911d79edso585872pgf.1 for ; Wed, 02 Feb 2022 17:01:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=itZVBKuazSWkWwWVxGCsy5muThrtWmA2MD1DMqG3pH4=; b=gIEaIO/XDx4xg0FArGsobbA6CP+YVku+jaa33TdXE8o2N+q2nJf99eu7RfqZathebQ Q5Tk7urNCjfVkrBmltUFQ3/7e36BJPNpdhKmZt6LGhvCmwNXNNVZWst94jtrhszY1+ap ckYfUwAVkq7TWZm4+tpsb3S0u9qQ/o3nLVH8Iwo07JHxHF5Kw9Dv/8pPRitWBsl3mKKd 3TPhmz3NybMZKwvTPQ5zNJMzgLl0fgRzfGznmfPwxzXjXKPZ5W5I2gKRepmUtSgTJ/0M XWUPhhL9CKVLDIYjyV7M9PzWN/5r7lDll+HbHkrnUAzYX9e5ynY58H58jtSKO7ij4N0B bINA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=itZVBKuazSWkWwWVxGCsy5muThrtWmA2MD1DMqG3pH4=; b=orf4AOFhg4lcdyxxl30XhUFY1ce1VOetpdgN0F5FXBU+582V3eo4WHBIL45eKeOXGh GNmKf9PisHnwOBGFJsgaLUg/po4weWj5RuH95Q2qc+5jQ1zReYXbVBY4Oa3q3yD1W706 XehszvcnYac4etTvpGC/9segrEARP3dkCOpH5AD4gh9HfJDbdlpS5UF83VZn2Fp/QzLf HOO+VDfU6H/esYAc6sDTiLy4S/m8PKSmaSNoRtH4xRBwv2+6srhTvSzwXRXNVcMivcWe GdbNHyf8IXQpcrfxr981iDzcMOZcmb4YuuVeYFbi9T0k20GsAXeQvDND566/PDuHFeBx KMBQ== X-Gm-Message-State: AOAM5302O33Ir9+37tR95SS4GEcqGhqetRhuyfdpqNubF6F/irx3y/Bh 7mv/E5VP9OWBBUS6ZDgrtkl/eBC8ba71lw== X-Google-Smtp-Source: ABdhPJwIxLlfD9+H/R6rP0BwsYpIWzf6azN55ADZxPwWwM4cBaWhZT7yM4mJPrVvlDPXV8uZXQzEveN8MdFwhw== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a05:6a00:181f:: with SMTP id y31mr31955129pfa.35.1643850065842; Wed, 02 Feb 2022 17:01:05 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:31 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-4-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 03/23] KVM: x86/mmu: Decompose kvm_mmu_get_page() into separate functions From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg 
Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Decompose kvm_mmu_get_page() into separate helper functions to increase readability and prepare for allocating shadow pages without a vcpu pointer. Specifically, pull the guts of kvm_mmu_get_page() into 3 helper functions: kvm_mmu_get_existing_sp_mabye_unsync() - Walks the page hash checking for any existing mmu pages that match the given gfn and role. Does not attempt to synchronize the page if it is unsync. kvm_mmu_get_existing_sp() - Gets an existing page from the page hash if it exists and guarantees the page, if one is returned, is synced. Implemented as a thin wrapper around kvm_mmu_get_existing_page_mabye_unsync. Requres access to a vcpu pointer in order to sync the page. kvm_mmu_create_sp() Allocates an entirely new kvm_mmu_page. This currently requries a vcpu pointer for allocation and looking up the memslot but that will be removed in a future commit. No functional change intended. Signed-off-by: David Matlack --- arch/x86/kvm/mmu/mmu.c | 132 ++++++++++++++++++++++++--------- arch/x86/kvm/mmu/paging_tmpl.h | 5 +- arch/x86/kvm/mmu/spte.c | 5 +- 3 files changed, 101 insertions(+), 41 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index fc9a4d9c0ddd..24b3cf53aa12 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -2045,16 +2045,25 @@ static void clear_sp_write_flooding_count(u64 *spte) __clear_sp_write_flooding_count(sptep_to_sp(spte)); } -static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, gfn_t gfn, - union kvm_mmu_page_role role) +/* + * Looks up an existing SP for the given gfn and role. Makes no attempt to + * sync the SP if it is marked unsync. + * + * If creating an upper-level page table, zaps unsynced pages for the same + * gfn and adds them to the invalid_list. It's the callers responsibility + * to call kvm_mmu_commit_zap_page() on invalid_list. + */ +static struct kvm_mmu_page *kvm_mmu_get_existing_sp_maybe_unsync(struct kvm *kvm, + gfn_t gfn, + union kvm_mmu_page_role role, + struct list_head *invalid_list) { struct hlist_head *sp_list; struct kvm_mmu_page *sp; int collisions = 0; - LIST_HEAD(invalid_list); - sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]; - for_each_valid_sp(vcpu->kvm, sp, sp_list) { + sp_list = &kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]; + for_each_valid_sp(kvm, sp, sp_list) { if (sp->gfn != gfn) { collisions++; continue; @@ -2071,60 +2080,109 @@ static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, gfn_t gfn, * upper-level page will be write-protected. */ if (role.level > PG_LEVEL_4K && sp->unsync) - kvm_mmu_prepare_zap_page(vcpu->kvm, sp, - &invalid_list); + kvm_mmu_prepare_zap_page(kvm, sp, invalid_list); + continue; } - /* unsync and write-flooding only apply to indirect SPs. */ - if (sp->role.direct) - goto trace_get_page; + /* Write-flooding is only tracked for indirect SPs. */ + if (!sp->role.direct) + __clear_sp_write_flooding_count(sp); - if (sp->unsync) { - /* - * The page is good, but is stale. kvm_sync_page does - * get the latest guest state, but (unlike mmu_unsync_children) - * it doesn't write-protect the page or mark it synchronized! - * This way the validity of the mapping is ensured, but the - * overhead of write protection is not incurred until the - * guest invalidates the TLB mapping. This allows multiple - * SPs for a single gfn to be unsync. 
- * - * If the sync fails, the page is zapped. If so, break - * in order to rebuild it. - */ - if (!kvm_sync_page(vcpu, sp, &invalid_list)) - break; + goto out; + } - WARN_ON(!list_empty(&invalid_list)); - kvm_flush_remote_tlbs(vcpu->kvm); - } + sp = NULL; - __clear_sp_write_flooding_count(sp); +out: + if (collisions > kvm->stat.max_mmu_page_hash_collisions) + kvm->stat.max_mmu_page_hash_collisions = collisions; + + return sp; +} -trace_get_page: - trace_kvm_mmu_get_page(sp, false); +/* + * Looks up an existing SP for the given gfn and role if one exists. The + * return SP is guaranteed to be synced. + */ +static struct kvm_mmu_page *kvm_mmu_get_existing_sp(struct kvm_vcpu *vcpu, + gfn_t gfn, + union kvm_mmu_page_role role) +{ + struct kvm_mmu_page *sp; + LIST_HEAD(invalid_list); + + sp = kvm_mmu_get_existing_sp_maybe_unsync(vcpu->kvm, gfn, role, &invalid_list); + if (!sp) goto out; + + if (sp->unsync) { + /* + * The page is good, but is stale. kvm_sync_page does + * get the latest guest state, but (unlike mmu_unsync_children) + * it doesn't write-protect the page or mark it synchronized! + * This way the validity of the mapping is ensured, but the + * overhead of write protection is not incurred until the + * guest invalidates the TLB mapping. This allows multiple + * SPs for a single gfn to be unsync. + * + * If the sync fails, the page is zapped and added to the + * invalid_list. + */ + if (!kvm_sync_page(vcpu, sp, &invalid_list)) { + sp = NULL; + goto out; + } + + WARN_ON(!list_empty(&invalid_list)); + kvm_flush_remote_tlbs(vcpu->kvm); } +out: + kvm_mmu_commit_zap_page(vcpu->kvm, &invalid_list); + return sp; +} + +static struct kvm_mmu_page *kvm_mmu_create_sp(struct kvm_vcpu *vcpu, + gfn_t gfn, + union kvm_mmu_page_role role) +{ + struct kvm_mmu_page *sp; + struct hlist_head *sp_list; + ++vcpu->kvm->stat.mmu_cache_miss; sp = kvm_mmu_alloc_page(vcpu, role.direct); - sp->gfn = gfn; sp->role = role; + + sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]; hlist_add_head(&sp->hash_link, sp_list); + if (!role.direct) { account_shadowed(vcpu->kvm, sp); if (role.level == PG_LEVEL_4K && kvm_vcpu_write_protect_gfn(vcpu, gfn)) kvm_flush_remote_tlbs_with_address(vcpu->kvm, gfn, 1); } - trace_kvm_mmu_get_page(sp, true); -out: - kvm_mmu_commit_zap_page(vcpu->kvm, &invalid_list); - if (collisions > vcpu->kvm->stat.max_mmu_page_hash_collisions) - vcpu->kvm->stat.max_mmu_page_hash_collisions = collisions; + return sp; +} + +static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, gfn_t gfn, + union kvm_mmu_page_role role) +{ + struct kvm_mmu_page *sp; + bool created = false; + + sp = kvm_mmu_get_existing_sp(vcpu, gfn, role); + if (sp) + goto out; + + created = true; + sp = kvm_mmu_create_sp(vcpu, gfn, role); + +out: + trace_kvm_mmu_get_page(sp, created); return sp; } diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index f93d4423a067..c533c191925e 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -692,8 +692,9 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, * the gpte is changed from non-present to present. * Otherwise, the guest may use the wrong mapping. * - * For PG_LEVEL_4K, kvm_mmu_get_page() has already - * synchronized it transiently via kvm_sync_page(). + * For PG_LEVEL_4K, kvm_mmu_get_existing_sp() has + * already synchronized it transiently via + * kvm_sync_page(). * * For higher level pagetable, we synchronize it via * the slower mmu_sync_children(). 
If it needs to diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c index 8b5309faf5b9..20cf9e0d45dd 100644 --- a/arch/x86/kvm/mmu/spte.c +++ b/arch/x86/kvm/mmu/spte.c @@ -149,8 +149,9 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, /* * Optimization: for pte sync, if spte was writable the hash * lookup is unnecessary (and expensive). Write protection - * is responsibility of kvm_mmu_get_page / kvm_mmu_sync_roots. - * Same reasoning can be applied to dirty page accounting. + * is responsibility of kvm_mmu_create_sp() and + * kvm_mmu_sync_roots(). Same reasoning can be applied to dirty + * page accounting. */ if (is_writable_pte(old_spte)) goto out; From patchwork Thu Feb 3 01:00:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733671 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id F302DC433EF for ; Thu, 3 Feb 2022 01:01:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230029AbiBCBBJ (ORCPT ); Wed, 2 Feb 2022 20:01:09 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46174 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348654AbiBCBBI (ORCPT ); Wed, 2 Feb 2022 20:01:08 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B5E70C061714 for ; Wed, 2 Feb 2022 17:01:07 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id i23-20020a635417000000b00364c29f39aaso569611pgb.8 for ; Wed, 02 Feb 2022 17:01:07 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=cj2jh92JV4tPPaqu2IJj/uC6sjmdjzSptfF46c+3oo8=; b=P1yqX+Yx7OHPoM8icN+XvQEpmdYP9PFhOmc19vsJBBzBHbD6Nh8wAYlh9YtAG/x3TQ S50J3uMQw8kUewpJ/eVpo2q2QYK9vWOvym/0qyA29i7Zth5ZQBjGi6BagifIQDBMoN5O 88ImHOQmPI4mZgc1G3CEXZ20qwmOsJ7oTr3ideTiwJSHytelygGSLCXpbiIv5WaLLOEM b+OoRwpLPxcPeB3NSuSZxjxfxoo8RQMnfRxtm4BqpR0cOhgvidLNTgaem1wXfblt75nD USUET4TDR6T/d4ibIdRNDweR3e5YMlp5pZtQ52CgBF08My8unwvjwa/n9YTWCfw3Kv0T lc5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=cj2jh92JV4tPPaqu2IJj/uC6sjmdjzSptfF46c+3oo8=; b=YuI7cXwhQQhfE/mdSLiHsKKprrDLxJiP8Q2ZK1XVCw3ixU2rlfefGB9eSBJVwORxLr a8ne4IAD08Jt+qcaoV0EC2mkKw3HtdObxdmDvAMZ8uAOlWIzm+s3eKtH1x8w4JbfVAcF PEHYnlX6YmstA9I35QhmEGwzawPIYnk4PKIiaa8NNN3T3HoApdgQvywgUqDKH5gzQDPD MQMzdKrFc8OYCieUjLaM/IImrMRRNVKGlO54u/RFmexVtGGt8cxKLnCfyuJTL/mUqxxh 0XYIKsRJZSfi5upiBUnAL2QdvNsZtsc1YVVNbVRHSKx5Uc96W+lTOx/Yu4qdm//ZuIrS zDsg== X-Gm-Message-State: AOAM532BwD96v7o3z4UBQzLaAAfNuSLRf2/xHGQS3PUiUkHpId4jTF9H +diywKZuvq87WqZVPiQpcPyPtW/ly4jXXQ== X-Google-Smtp-Source: ABdhPJz+CcTaXA/f9HJ9QAGYSxF2rMkcxBlYoYfF058AErM+9ZNxYG/rxvAoC/+6KjQCgaELY5+0b8QBavbuDg== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:902:f54b:: with SMTP id h11mr33157406plf.91.1643850067203; Wed, 02 Feb 2022 17:01:07 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:32 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> 
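To make the role derivation from patch 02 above concrete, here is a minimal, self-contained sketch. It is not kernel code: struct page_role below is a simplified stand-in for union kvm_mmu_page_role, and the numeric access values are arbitrary ACC_*-style permission bits. It only models what kvm_mmu_child_role() in that patch does: copy the parent's role, drop one level, and take access/direct from the caller.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified stand-in for union kvm_mmu_page_role: only the fields the
 * child-role derivation touches (level, access, direct). */
struct page_role {
        unsigned level;   /* page table level: 1 = 4K leaf table */
        unsigned access;  /* ACC_*-style permission bits from the gpte */
        bool direct;      /* no guest page table backs this shadow page */
};

/* Models kvm_mmu_child_role(): the child sits one level below the
 * parent and inherits everything else, with access/direct supplied by
 * the caller (e.g. from gw->pt_access[] in FNAME(fetch)). */
static struct page_role child_role(struct page_role parent, bool direct,
                                   unsigned access)
{
        struct page_role role = parent;

        role.level--;
        role.access = access;
        role.direct = direct;
        return role;
}

int main(void)
{
        struct page_role root = { .level = 4, .access = 7, .direct = false };
        struct page_role child = child_role(root, false, 5);

        printf("child level=%u access=%u direct=%d\n",
               child.level, child.access, child.direct);
        return 0;
}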
Message-Id: <20220203010051.2813563-5-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 04/23] KVM: x86/mmu: Rename shadow MMU functions that deal with shadow pages From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Rename 3 functions: kvm_mmu_get_page() -> kvm_mmu_get_sp() kvm_mmu_alloc_page() -> kvm_mmu_alloc_sp() kvm_mmu_free_page() -> kvm_mmu_free_sp() This change makes it clear that these functions deal with shadow pages rather than struct pages. Signed-off-by: David Matlack --- arch/x86/kvm/mmu/mmu.c | 16 ++++++++-------- 1 file changed, 8 insertions(+), 8 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 24b3cf53aa12..6f55af9c66db 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1679,7 +1679,7 @@ static inline void kvm_mod_used_mmu_pages(struct kvm *kvm, long nr) percpu_counter_add(&kvm_total_used_mmu_pages, nr); } -static void kvm_mmu_free_page(struct kvm_mmu_page *sp) +static void kvm_mmu_free_sp(struct kvm_mmu_page *sp) { MMU_WARN_ON(!is_empty_shadow_page(sp->spt)); hlist_del(&sp->hash_link); @@ -1717,7 +1717,7 @@ static void drop_parent_pte(struct kvm_mmu_page *sp, mmu_spte_clear_no_track(parent_pte); } -static struct kvm_mmu_page *kvm_mmu_alloc_page(struct kvm_vcpu *vcpu, int direct) +static struct kvm_mmu_page *kvm_mmu_alloc_sp(struct kvm_vcpu *vcpu, int direct) { struct kvm_mmu_page *sp; @@ -2152,7 +2152,7 @@ static struct kvm_mmu_page *kvm_mmu_create_sp(struct kvm_vcpu *vcpu, ++vcpu->kvm->stat.mmu_cache_miss; - sp = kvm_mmu_alloc_page(vcpu, role.direct); + sp = kvm_mmu_alloc_sp(vcpu, role.direct); sp->gfn = gfn; sp->role = role; @@ -2168,8 +2168,8 @@ static struct kvm_mmu_page *kvm_mmu_create_sp(struct kvm_vcpu *vcpu, return sp; } -static struct kvm_mmu_page *kvm_mmu_get_page(struct kvm_vcpu *vcpu, gfn_t gfn, - union kvm_mmu_page_role role) +static struct kvm_mmu_page *kvm_mmu_get_sp(struct kvm_vcpu *vcpu, gfn_t gfn, + union kvm_mmu_page_role role) { struct kvm_mmu_page *sp; bool created = false; @@ -2208,7 +2208,7 @@ static struct kvm_mmu_page *kvm_mmu_get_child_sp(struct kvm_vcpu *vcpu, role = kvm_mmu_child_role(parent_sp, direct, access); - return kvm_mmu_get_page(vcpu, gfn, role); + return kvm_mmu_get_sp(vcpu, gfn, role); } static void shadow_walk_init_using_root(struct kvm_shadow_walk_iterator *iterator, @@ -2478,7 +2478,7 @@ static void kvm_mmu_commit_zap_page(struct kvm *kvm, list_for_each_entry_safe(sp, nsp, invalid_list, link) { WARN_ON(!sp->role.invalid || sp->root_count); - kvm_mmu_free_page(sp); + kvm_mmu_free_sp(sp); } } @@ -3406,7 +3406,7 @@ static hpa_t mmu_alloc_root(struct kvm_vcpu *vcpu, gfn_t gfn, gva_t gva, role.quadrant = quadrant; } - sp = kvm_mmu_get_page(vcpu, gfn, role); + sp = kvm_mmu_get_sp(vcpu, gfn, role); ++sp->root_count; return __pa(sp->spt); From patchwork Thu Feb 3 01:00:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733672 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org 
(vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C46E2C4332F for ; Thu, 3 Feb 2022 01:01:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348661AbiBCBBK (ORCPT ); Wed, 2 Feb 2022 20:01:10 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46182 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S245258AbiBCBBJ (ORCPT ); Wed, 2 Feb 2022 20:01:09 -0500 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 691B3C061714 for ; Wed, 2 Feb 2022 17:01:09 -0800 (PST) Received: by mail-pg1-x54a.google.com with SMTP id e37-20020a635465000000b00364dfbc8031so566548pgm.10 for ; Wed, 02 Feb 2022 17:01:09 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=CP4dB5ZYlRM1sQ3kgtH5k4DQM/Cy6892DHpE9lMRSSg=; b=pDaDLECbrKjt9eSgi8bK0sQSvctNeEZ81cUxhrK+N9YwlryjExu3neYJU42R+f/YVO bEtkgPANhdzItEeQqt8pd0nw/YXqDoPPiRUFYALlQ1QrVDRAG4mbB5FVT/p6neZ547dL ce3+Bgp1Y8auWtvXJBFyOcynQMhdLmChKCAAZHc24r3i/Q69WW38xBnZZyjXQDhN4vYj VGH3Er78H4ptteZ0a6GHT7eVNuns/lbimtFH+799HmHEPHDjQ+NNAkAKp1haCD5Xo/Es tGRB528vDh4hMYYF9s4xlenQ2qa2uewIpEZHI6UO29DqHFnWbgpHXVif9yZmXbzSarhq FoCw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=CP4dB5ZYlRM1sQ3kgtH5k4DQM/Cy6892DHpE9lMRSSg=; b=4agsCHsJ3YAnANBYdFePhxpbdeOgGa9roz6zCaUItS7YY2fibNIs7Fwg3D1NollCh+ 7jF7JP9veJVbrxS/GfnccwopQ3rfFgxOnIhxFKbh/ny5AXXmFZtPR0du1SbzkZS755g4 NmIvC0ouY/f2D8GCuYChkwrIyQW73+TxhALgtC/0g1gnZu5QJkcD+pU6LVfVVo/O7jl+ lFSPOZ0N7zRBsEM0YK/bDUhoy8HHNWtv6rjvHsexE+XA6HSrljb0JLdz2wfyvXgbNa/D 1+oQrJFeQH8Meieexm2kP2FnWi6I8Zd38lIpY1EyIZ2YUZmFnIiYHjFm+QhzfHAFr1PR BZtg== X-Gm-Message-State: AOAM533u8F1RYUfpX1oZPiOyk9l//ob7MKo2fYfUQqGZAFsKeVpPYLMO 5v4ETDJDpa+k99kP5KOsNcgdWrd9wwy/Vw== X-Google-Smtp-Source: ABdhPJyOXRBsMg9DgG+NXK8VQuqflvVK7oZKFYN8OLvqHYtvUcRswbreMjpWjJF8naLaUMWcImaz/bTkYxq/rA== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a05:6a00:1513:: with SMTP id q19mr31896218pfu.12.1643850068867; Wed, 02 Feb 2022 17:01:08 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:33 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-6-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 05/23] KVM: x86/mmu: Pass memslot to kvm_mmu_create_sp() From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Passing the memslot to kvm_mmu_create_sp() avoids the need for the vCPU pointer when write-protecting indirect 4k shadow pages. This moves us closer to being able to create new shadow pages during VM ioctls for eager page splitting, where there is not vCPU pointer. This change does not negatively impact "Populate memory time" for ept=Y or ept=N configurations since kvm_vcpu_gfn_to_memslot() caches the last use slot. 
So even though we now look up the slot more often, it is a very cheap check. Opportunistically move the code to write-protect GFNs shadowed by PG_LEVEL_4K shadow pages into account_shadowed() to reduce indentation and consolidate the code. This also eliminates a memslot lookup. No functional change intended. Signed-off-by: David Matlack --- arch/x86/kvm/mmu/mmu.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 6f55af9c66db..49f82addf4b5 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -804,16 +804,14 @@ void kvm_mmu_gfn_allow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn) update_gfn_disallow_lpage_count(slot, gfn, -1); } -static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp) +static void account_shadowed(struct kvm *kvm, + struct kvm_memory_slot *slot, + struct kvm_mmu_page *sp) { - struct kvm_memslots *slots; - struct kvm_memory_slot *slot; gfn_t gfn; kvm->arch.indirect_shadow_pages++; gfn = sp->gfn; - slots = kvm_memslots_for_spte_role(kvm, sp->role); - slot = __gfn_to_memslot(slots, gfn); /* the non-leaf shadow pages are keeping readonly. */ if (sp->role.level > PG_LEVEL_4K) @@ -821,6 +819,9 @@ static void account_shadowed(struct kvm *kvm, struct kvm_mmu_page *sp) KVM_PAGE_TRACK_WRITE); kvm_mmu_gfn_disallow_lpage(slot, gfn); + + if (kvm_mmu_slot_gfn_write_protect(kvm, slot, gfn, PG_LEVEL_4K)) + kvm_flush_remote_tlbs_with_address(kvm, gfn, 1); } void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp) @@ -2144,6 +2145,7 @@ static struct kvm_mmu_page *kvm_mmu_get_existing_sp(struct kvm_vcpu *vcpu, } static struct kvm_mmu_page *kvm_mmu_create_sp(struct kvm_vcpu *vcpu, + struct kvm_memory_slot *slot, gfn_t gfn, union kvm_mmu_page_role role) { @@ -2159,11 +2161,8 @@ static struct kvm_mmu_page *kvm_mmu_create_sp(struct kvm_vcpu *vcpu, sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]; hlist_add_head(&sp->hash_link, sp_list); - if (!role.direct) { - account_shadowed(vcpu->kvm, sp); - if (role.level == PG_LEVEL_4K && kvm_vcpu_write_protect_gfn(vcpu, gfn)) - kvm_flush_remote_tlbs_with_address(vcpu->kvm, gfn, 1); - } + if (!role.direct) + account_shadowed(vcpu->kvm, slot, sp); return sp; } @@ -2171,6 +2170,7 @@ static struct kvm_mmu_page *kvm_mmu_create_sp(struct kvm_vcpu *vcpu, static struct kvm_mmu_page *kvm_mmu_get_sp(struct kvm_vcpu *vcpu, gfn_t gfn, union kvm_mmu_page_role role) { + struct kvm_memory_slot *slot; struct kvm_mmu_page *sp; bool created = false; @@ -2179,7 +2179,8 @@ static struct kvm_mmu_page *kvm_mmu_get_sp(struct kvm_vcpu *vcpu, gfn_t gfn, goto out; created = true; - sp = kvm_mmu_create_sp(vcpu, gfn, role); + slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn); + sp = kvm_mmu_create_sp(vcpu, slot, gfn, role); out: trace_kvm_mmu_get_page(sp, created); From patchwork Thu Feb 3 01:00:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733673 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 37EC3C433F5 for ; Thu, 3 Feb 2022 01:01:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348662AbiBCBBL (ORCPT ); Wed, 2 Feb 2022 20:01:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46190 "EHLO 
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348654AbiBCBBK (ORCPT ); Wed, 2 Feb 2022 20:01:10 -0500 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CED3EC061714 for ; Wed, 2 Feb 2022 17:01:10 -0800 (PST) Received: by mail-pg1-x54a.google.com with SMTP id b9-20020a63e709000000b00362f44b02aeso553471pgi.17 for ; Wed, 02 Feb 2022 17:01:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=P2nMI/ec0k+rgC8gRDLK+pDpIPlzWQwlvtvvXy44AgM=; b=ZJytYEy/Q/1huO5wjouG616p7mPrgOPz53jOUe6r6aEcGKu2nCD5kkr+i6QNS9mQen /U0XcxT5QlExhb6AhScqBjFU6JGJPNEAyZyX+cjWtonmdHxVAZOBYeaRuqQnlaXwj50g X4iT/zXNeuKsufWZEiz6FQOqCmhStEA0Cep1CYJXfGPOypLQhnW9DgOZyruvo2V9AM/D Iiv6V4y1LsW0Fm2V/Rw1GEB7kcWVq4lFMqQM5lNM+4Qhq8G7lHCDbahpHiDjy8erVYtW 7Fm5vj1rnrSDGGmA5E2dAurtEXgNqBDERJCQMOG4Qe0/351kQZOkbO17zZtfuAik/BBL 5pPQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=P2nMI/ec0k+rgC8gRDLK+pDpIPlzWQwlvtvvXy44AgM=; b=lTCXtue6mwt2a67+YovcVAvlgelRIqqO6f0uS3gYU0tuatat602RXkyGuj+xzinCFw liskVXHVVL61eiWm23uWJkqyqwWbnqf7PSnsDVHDoOLYxMuZxMtWceK+zo9ddWEEjFa6 gIupJ7SBfrxnQ/gtHrP0TkrvlqSSyVjWEO6mmZMa6iNvBAFPcHHe7OtVsOf6reCShjY1 qgN1txn4A9ToBm6vZIsC3tAp6iEMkALefdb9OlBU/d3LMcpvPpuU3daDi9vR0buS6hA3 hA0xSr3XqsKQ78+duLUE59AV1S890G1fA6ACjQIHKU87ma5c8/Ro1RGQDrF7ru1eXvNi iZhg== X-Gm-Message-State: AOAM531OmhL05/Z+/imifMmd3FF5I4oco3ptfJ95Y/y7+avbvgAuf9Ip si1wBbQdiFbE+fmwf5iTM4oIn1AgMr7f0w== X-Google-Smtp-Source: ABdhPJwgr824OCtUNEUAY+htDI5fMYyvkuo+dbXjawJ6uSYTVedYJfXg2tID0VJ3vagPNzm4/W1ePm2izX/Jkw== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:90b:3648:: with SMTP id nh8mr11125338pjb.145.1643850070308; Wed, 02 Feb 2022 17:01:10 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:34 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-7-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 06/23] KVM: x86/mmu: Separate shadow MMU sp allocation from initialization From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Separate the code that allocates a new shadow page from the vCPU caches from the code that initializes it. This is in preparation for creating new shadow pages from VM ioctls for eager page splitting, where we do not have access to the vCPU caches. No functional change intended. 
Signed-off-by: David Matlack --- arch/x86/kvm/mmu/mmu.c | 44 +++++++++++++++++++++--------------------- 1 file changed, 22 insertions(+), 22 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 49f82addf4b5..d4f90a10b652 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1718,7 +1718,7 @@ static void drop_parent_pte(struct kvm_mmu_page *sp, mmu_spte_clear_no_track(parent_pte); } -static struct kvm_mmu_page *kvm_mmu_alloc_sp(struct kvm_vcpu *vcpu, int direct) +static struct kvm_mmu_page *kvm_mmu_alloc_sp(struct kvm_vcpu *vcpu, bool direct) { struct kvm_mmu_page *sp; @@ -1726,16 +1726,7 @@ static struct kvm_mmu_page *kvm_mmu_alloc_sp(struct kvm_vcpu *vcpu, int direct) sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); if (!direct) sp->gfns = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_gfn_array_cache); - set_page_private(virt_to_page(sp->spt), (unsigned long)sp); - /* - * active_mmu_pages must be a FIFO list, as kvm_zap_obsolete_pages() - * depends on valid pages being added to the head of the list. See - * comments in kvm_zap_obsolete_pages(). - */ - sp->mmu_valid_gen = vcpu->kvm->arch.mmu_valid_gen; - list_add(&sp->link, &vcpu->kvm->arch.active_mmu_pages); - kvm_mod_used_mmu_pages(vcpu->kvm, +1); return sp; } @@ -2144,27 +2135,34 @@ static struct kvm_mmu_page *kvm_mmu_get_existing_sp(struct kvm_vcpu *vcpu, return sp; } -static struct kvm_mmu_page *kvm_mmu_create_sp(struct kvm_vcpu *vcpu, - struct kvm_memory_slot *slot, - gfn_t gfn, - union kvm_mmu_page_role role) + +static void kvm_mmu_init_sp(struct kvm *kvm, struct kvm_mmu_page *sp, + struct kvm_memory_slot *slot, gfn_t gfn, + union kvm_mmu_page_role role) { - struct kvm_mmu_page *sp; struct hlist_head *sp_list; - ++vcpu->kvm->stat.mmu_cache_miss; + ++kvm->stat.mmu_cache_miss; + + set_page_private(virt_to_page(sp->spt), (unsigned long)sp); - sp = kvm_mmu_alloc_sp(vcpu, role.direct); sp->gfn = gfn; sp->role = role; + sp->mmu_valid_gen = kvm->arch.mmu_valid_gen; - sp_list = &vcpu->kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]; + /* + * active_mmu_pages must be a FIFO list, as kvm_zap_obsolete_pages() + * depends on valid pages being added to the head of the list. See + * comments in kvm_zap_obsolete_pages(). 
+ */ + list_add(&sp->link, &kvm->arch.active_mmu_pages); + kvm_mod_used_mmu_pages(kvm, 1); + + sp_list = &kvm->arch.mmu_page_hash[kvm_page_table_hashfn(gfn)]; hlist_add_head(&sp->hash_link, sp_list); if (!role.direct) - account_shadowed(vcpu->kvm, slot, sp); - - return sp; + account_shadowed(kvm, slot, sp); } static struct kvm_mmu_page *kvm_mmu_get_sp(struct kvm_vcpu *vcpu, gfn_t gfn, @@ -2179,8 +2177,10 @@ static struct kvm_mmu_page *kvm_mmu_get_sp(struct kvm_vcpu *vcpu, gfn_t gfn, goto out; created = true; + sp = kvm_mmu_alloc_sp(vcpu, role.direct); + slot = kvm_vcpu_gfn_to_memslot(vcpu, gfn); - sp = kvm_mmu_create_sp(vcpu, slot, gfn, role); + kvm_mmu_init_sp(vcpu->kvm, sp, slot, gfn, role); out: trace_kvm_mmu_get_page(sp, created); From patchwork Thu Feb 3 01:00:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733674 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CDCB4C433EF for ; Thu, 3 Feb 2022 01:01:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232280AbiBCBBO (ORCPT ); Wed, 2 Feb 2022 20:01:14 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46202 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348654AbiBCBBM (ORCPT ); Wed, 2 Feb 2022 20:01:12 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8A725C06173B for ; Wed, 2 Feb 2022 17:01:12 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id p29-20020a634f5d000000b003624b087f05so575170pgl.7 for ; Wed, 02 Feb 2022 17:01:12 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=6KmMfMviXjhh47073t9JRPIjwz/2bj2JAaTlcyJJ9So=; b=OHS/4Tryvc4rLrAkA7SgL54tIZSl/+WdXyA5qOzSGcqgECQsX0xmmvmuwVxFJBB8n0 UxeBMdL7I9fFd7nbiosXRu5snlxQhwcQu232GJKVojDQ4bF29WgAOZCVvb3F+nr2WQRY gM7uepdJSDU80azqpYqBQneApdRtqYMg0TPciUyHlFawXNndLEisiGpoWIhKGCXCz4Oh kovlkcVv1NfeEuOZmoiSngPnJ+Ab+p5WGyeMAsHN/Hc+vd52K/pLdWwDlHJRyF55bouE r9DACAtMSFlH+XSMYe4oAOTsOm6oaNP+UB5RDXxRkoirKa2v7yZN7ZRL30wL0tcYET7G K/dA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=6KmMfMviXjhh47073t9JRPIjwz/2bj2JAaTlcyJJ9So=; b=tfuPcxElJJ8+iyPZmWg4su2w2nb0Caz85sDtuhT9lXscBX4JAcJ8OdhdVyJNe5Wmg6 m5zM2WGp5aTkPQCuyMx9xIu15C9bbRpXFYIgLM74ZEBo48YhBZlr6yBB9P5+J6MGz+f7 1nvvUbrvXUcwvIvgrGeV0AE88FtahtTmBWVvG2qwM8fnUqPpxT+2+PqihnFD1BFEVVBk 2LD8ilkQDPv+A53yUKTHPvSHfCI69V8Usn2pp9ypAKCBUh+/zySGDEkQ4NuI+spWvNR7 uHGGU6lc8Zu5Os6d5eRp2Ap6ELTKyX4kZOirJaBu8q2hj+4RXkVj+6L02zlVtc0+JDfj VSEA== X-Gm-Message-State: AOAM532g5gCUS69UHN3onfYvb316O5x35WB+MWBSLxAXM7TMyJKjvyQy jQ9UvS0hFXB8TeIftXIHvf0XDTphOXdT/A== X-Google-Smtp-Source: ABdhPJzBvRJYYNGhqcwrxLrYThKR4uZHb9SZRU/KWAXLiv724Ph1LTf5kTkD0+Q2PplUV5xPxnlMESNZNwHrCA== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a62:2982:: with SMTP id p124mr32182607pfp.53.1643850071923; Wed, 02 Feb 2022 17:01:11 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:35 +0000 In-Reply-To: 
<20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-8-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 07/23] KVM: x86/mmu: Move huge page split sp allocation code to mmu.c From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Move the code that allocates a new shadow page for splitting huge pages into mmu.c. Currently this code is only used by the TDP MMU but it will be reused in subsequent commits to also split huge pages mapped by the shadow MMU. No functional change intended. Signed-off-by: David Matlack --- arch/x86/kvm/mmu/mmu.c | 26 ++++++++++++++++++++++++++ arch/x86/kvm/mmu/mmu_internal.h | 2 ++ arch/x86/kvm/mmu/tdp_mmu.c | 23 ++--------------------- 3 files changed, 30 insertions(+), 21 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index d4f90a10b652..3acdf372fa9a 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1730,6 +1730,32 @@ static struct kvm_mmu_page *kvm_mmu_alloc_sp(struct kvm_vcpu *vcpu, bool direct) return sp; } +/* + * Allocate a new shadow page using the provided GFP flags to split a huge page. + * + * Huge page splitting always uses direct shadow pages since the huge page is + * being mapped directly with a lower level page table. Thus there's no need to + * allocate the gfns array. + */ +struct kvm_mmu_page *kvm_mmu_alloc_direct_sp_for_split(gfp_t gfp) +{ + struct kvm_mmu_page *sp; + + gfp |= __GFP_ZERO; + + sp = kmem_cache_alloc(mmu_page_header_cache, gfp); + if (!sp) + return NULL; + + sp->spt = (void *)__get_free_page(gfp); + if (!sp->spt) { + kmem_cache_free(mmu_page_header_cache, sp); + return NULL; + } + + return sp; +} + static void mark_unsync(u64 *spte); static void kvm_mmu_mark_parents_unsync(struct kvm_mmu_page *sp) { diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index da6166b5c377..2c80028695ca 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -160,4 +160,6 @@ void *mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc); void account_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp); void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp); +struct kvm_mmu_page *kvm_mmu_alloc_direct_sp_for_split(gfp_t gfp); + #endif /* __KVM_X86_MMU_INTERNAL_H */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 8def8f810cb0..0d58c3d15894 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1263,25 +1263,6 @@ bool kvm_tdp_mmu_wrprot_slot(struct kvm *kvm, return spte_set; } -static struct kvm_mmu_page *__tdp_mmu_alloc_sp_for_split(gfp_t gfp) -{ - struct kvm_mmu_page *sp; - - gfp |= __GFP_ZERO; - - sp = kmem_cache_alloc(mmu_page_header_cache, gfp); - if (!sp) - return NULL; - - sp->spt = (void *)__get_free_page(gfp); - if (!sp->spt) { - kmem_cache_free(mmu_page_header_cache, sp); - return NULL; - } - - return sp; -} - static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_split(struct kvm *kvm, struct tdp_iter *iter, bool shared) @@ -1297,7 +1278,7 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_split(struct kvm *kvm, * If this allocation fails we drop the 
lock and retry with reclaim * allowed. */ - sp = __tdp_mmu_alloc_sp_for_split(GFP_NOWAIT | __GFP_ACCOUNT); + sp = kvm_mmu_alloc_direct_sp_for_split(GFP_NOWAIT | __GFP_ACCOUNT); if (sp) return sp; @@ -1309,7 +1290,7 @@ static struct kvm_mmu_page *tdp_mmu_alloc_sp_for_split(struct kvm *kvm, write_unlock(&kvm->mmu_lock); iter->yielded = true; - sp = __tdp_mmu_alloc_sp_for_split(GFP_KERNEL_ACCOUNT); + sp = kvm_mmu_alloc_direct_sp_for_split(GFP_KERNEL_ACCOUNT); if (shared) read_lock(&kvm->mmu_lock); From patchwork Thu Feb 3 01:00:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733675 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D8F9AC433FE for ; Thu, 3 Feb 2022 01:01:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234702AbiBCBBP (ORCPT ); Wed, 2 Feb 2022 20:01:15 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46208 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345224AbiBCBBO (ORCPT ); Wed, 2 Feb 2022 20:01:14 -0500 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 22E87C061714 for ; Wed, 2 Feb 2022 17:01:14 -0800 (PST) Received: by mail-pg1-x54a.google.com with SMTP id 27-20020a63135b000000b0036285f54b6aso548537pgt.19 for ; Wed, 02 Feb 2022 17:01:14 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=FiCud8ItmVAz7EYTwAPbFj4Foo/9mt3PrXdDTZQK99E=; b=ZU9e3JwAD7nmDMlTFHM17n+7TdHTAA46fhVNdiATSYONfxK5KS+V1ZU9jZPrh4CJLX KcuZnRPE4YU0t8+lPdlavOeLoqgw1bjdZoRuQ8fro00z1brGspKFq+4obDjMZmalZmkk MqIwcWVIjdcscXpo2RhJfGfwMoPm8ignQ5BaRuHb68AXNH4VF2Sple1dNC9WCyTR9Urq if+/Yk9bZ5r35GGg3OEQ+4iFHR+O2gMF2K1rQkOIy0MdbS/NMitNVH3ArHEGgrwbK7bW Qqd+q3oTl53YXHmmQhCWLFPEjy7bEDeRTnbERLqtjvnJ2VNR6O0e2a1RoBFUuJn/5WHV ZAXg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=FiCud8ItmVAz7EYTwAPbFj4Foo/9mt3PrXdDTZQK99E=; b=G5EP6z36TCHXqFjdyIsf6CSVb9mT5556rND02OVz1YsxJEonm/Guyc6ourv28kNEIN KivLEZkGsEe3tRNuoPxXSxRYbAsbJq/drvWL3Bk4GweAIDYzV4nqMY2qaYm9ZmfUi+oe YMnpO09ouUcSId3a2ZEs2ojszKP4fSLoxb57y10IDZTz+LIkx/v0mLw1+tinQ+tclJhH gk0JypAnM7TWBuF4IWLIy/oDFE0/q06c0qb4PDuP8y+GcnKpIW5ouXhdjtng7br+abqL RGihQCxkT4bKDF75wGMuJcD32zT09xSkc0BncyRlOXc515GyBe6pNnKZGq28nH3TQBrQ 4B6A== X-Gm-Message-State: AOAM531XWWrJ+lUglDaf3MkKX26l9dS4m4S97n0rCrFpx926aLLc8wiQ 0i26LRRKe1WvVeV0yY8yRPTIEaO8iRzBAQ== X-Google-Smtp-Source: ABdhPJz4JZExpZKil/jPdcDA6GrR7mzWDf+S8sIBRmVa5ScktNhlyZDYfILYuWYrUCRszNVSYOXGi0DIuLBkzQ== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:902:e552:: with SMTP id n18mr33380571plf.152.1643850073636; Wed, 02 Feb 2022 17:01:13 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:36 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-9-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog 
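The call site that patch 07 leaves in tdp_mmu_alloc_sp_for_split() follows a pattern worth spelling out: try an atomic (GFP_NOWAIT) allocation while mmu_lock is held, and only if that fails drop the lock, allocate with reclaim allowed, retake the lock, and tell the caller the iteration yielded so it restarts. Below is a minimal user-space model of that control flow; the locking is only indicated by comments, the allocation failure is simulated, and none of the functions are kernel APIs.

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

/* Simulated allocations: the "nowait" path may fail, the blocking path
 * may sleep and reclaim in the kernel, so here it simply succeeds. */
static void *alloc_nowait(bool *fail_once)
{
        if (*fail_once) {
                *fail_once = false;
                return NULL;    /* models GFP_NOWAIT failing under pressure */
        }
        return malloc(64);
}

static void *alloc_blocking(void)
{
        return malloc(64);      /* models GFP_KERNEL_ACCOUNT */
}

/* Control flow modeled on tdp_mmu_alloc_sp_for_split(). */
static void *alloc_sp_for_split(bool *fail_once, bool *yielded)
{
        void *sp;

        /* mmu_lock is held here */
        sp = alloc_nowait(fail_once);
        if (sp)
                return sp;

        /* drop mmu_lock ... */
        *yielded = true;
        sp = alloc_blocking();
        /* ... retake mmu_lock; caller sees yielded and restarts the iterator */
        return sp;
}

int main(void)
{
        bool fail_once = true, yielded = false;
        void *sp = alloc_sp_for_split(&fail_once, &yielded);

        printf("sp=%p yielded=%d\n", sp, yielded);
        free(sp);
        return 0;
}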
Subject: [PATCH 08/23] KVM: x86/mmu: Use common code to free kvm_mmu_page structs From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Use a common function to free kvm_mmu_page structs in the TDP MMU and the shadow MMU. This reduces the amount of duplicate code and is needed in subsequent commits that allocate and free kvm_mmu_pages for eager page splitting. No functional change intended. Signed-off-by: David Matlack --- arch/x86/kvm/mmu/mmu.c | 8 ++++---- arch/x86/kvm/mmu/mmu_internal.h | 2 ++ arch/x86/kvm/mmu/tdp_mmu.c | 3 +-- 3 files changed, 7 insertions(+), 6 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 3acdf372fa9a..09a178e64a04 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1680,11 +1680,8 @@ static inline void kvm_mod_used_mmu_pages(struct kvm *kvm, long nr) percpu_counter_add(&kvm_total_used_mmu_pages, nr); } -static void kvm_mmu_free_sp(struct kvm_mmu_page *sp) +void kvm_mmu_free_sp(struct kvm_mmu_page *sp) { - MMU_WARN_ON(!is_empty_shadow_page(sp->spt)); - hlist_del(&sp->hash_link); - list_del(&sp->link); free_page((unsigned long)sp->spt); if (!sp->role.direct) free_page((unsigned long)sp->gfns); @@ -2505,6 +2502,9 @@ static void kvm_mmu_commit_zap_page(struct kvm *kvm, list_for_each_entry_safe(sp, nsp, invalid_list, link) { WARN_ON(!sp->role.invalid || sp->root_count); + MMU_WARN_ON(!is_empty_shadow_page(sp->spt)); + hlist_del(&sp->hash_link); + list_del(&sp->link); kvm_mmu_free_sp(sp); } } diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index 2c80028695ca..c68f45c4a745 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -162,4 +162,6 @@ void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp); struct kvm_mmu_page *kvm_mmu_alloc_direct_sp_for_split(gfp_t gfp); +void kvm_mmu_free_sp(struct kvm_mmu_page *sp); + #endif /* __KVM_X86_MMU_INTERNAL_H */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 0d58c3d15894..60bb29cd2b96 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -59,8 +59,7 @@ static bool zap_gfn_range(struct kvm *kvm, struct kvm_mmu_page *root, static void tdp_mmu_free_sp(struct kvm_mmu_page *sp) { - free_page((unsigned long)sp->spt); - kmem_cache_free(mmu_page_header_cache, sp); + kvm_mmu_free_sp(sp); } /* From patchwork Thu Feb 3 01:00:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733676 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 80FF7C433EF for ; Thu, 3 Feb 2022 01:01:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348672AbiBCBBQ (ORCPT ); Wed, 2 Feb 2022 20:01:16 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46216 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348666AbiBCBBQ (ORCPT ); Wed, 2 Feb 2022 20:01:16 -0500 Received: from mail-yb1-xb49.google.com (mail-yb1-xb49.google.com 
Date: Thu, 3 Feb 2022 01:00:37 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-10-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 09/23] KVM: x86/mmu: Use common code to allocate kvm_mmu_page structs from vCPU caches From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Now that allocating a kvm_mmu_page struct is isolated to a helper function, it can be re-used in the TDP MMU. No functional change intended.
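For reference, the shared helper just pops the page-header and page-table objects out of the per-vCPU memory caches that were topped up before the MMU lock was taken, so the TDP MMU wrapper collapses to a one-line call. A simplified sketch (cache and field names as they appear elsewhere in this series; the indirect-case gfns allocation and any error handling are omitted):

static struct kvm_mmu_page *kvm_mmu_alloc_sp(struct kvm_vcpu *vcpu, bool direct)
{
        struct kvm_mmu_page *sp;

        /* Both objects come from per-vCPU caches filled before mmu_lock
         * was taken, so no sleeping allocation is needed here. */
        sp = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache);
        sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache);

        return sp;
}

static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu)
{
        /* TDP MMU shadow pages are always direct. */
        return kvm_mmu_alloc_sp(vcpu, true);
}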
Signed-off-by: David Matlack --- arch/x86/kvm/mmu/mmu.c | 2 +- arch/x86/kvm/mmu/mmu_internal.h | 1 + arch/x86/kvm/mmu/tdp_mmu.c | 7 +------ 3 files changed, 3 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 09a178e64a04..48ebf2bebb90 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1715,7 +1715,7 @@ static void drop_parent_pte(struct kvm_mmu_page *sp, mmu_spte_clear_no_track(parent_pte); } -static struct kvm_mmu_page *kvm_mmu_alloc_sp(struct kvm_vcpu *vcpu, bool direct) +struct kvm_mmu_page *kvm_mmu_alloc_sp(struct kvm_vcpu *vcpu, bool direct) { struct kvm_mmu_page *sp; diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index c68f45c4a745..c5f2c0b9177d 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -162,6 +162,7 @@ void unaccount_huge_nx_page(struct kvm *kvm, struct kvm_mmu_page *sp); struct kvm_mmu_page *kvm_mmu_alloc_direct_sp_for_split(gfp_t gfp); +struct kvm_mmu_page *kvm_mmu_alloc_sp(struct kvm_vcpu *vcpu, bool direct); void kvm_mmu_free_sp(struct kvm_mmu_page *sp); #endif /* __KVM_X86_MMU_INTERNAL_H */ diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 60bb29cd2b96..4ff1af24b5aa 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -172,12 +172,7 @@ static struct kvm_mmu_page *tdp_mmu_next_root(struct kvm *kvm, static struct kvm_mmu_page *tdp_mmu_alloc_sp(struct kvm_vcpu *vcpu) { - struct kvm_mmu_page *sp; - - sp = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache); - sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); - - return sp; + return kvm_mmu_alloc_sp(vcpu, true); } static void tdp_mmu_init_sp(struct kvm_mmu_page *sp, gfn_t gfn, From patchwork Thu Feb 3 01:00:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733677 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CF29BC433F5 for ; Thu, 3 Feb 2022 01:01:18 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348669AbiBCBBR (ORCPT ); Wed, 2 Feb 2022 20:01:17 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46224 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348674AbiBCBBR (ORCPT ); Wed, 2 Feb 2022 20:01:17 -0500 Received: from mail-pf1-x44a.google.com (mail-pf1-x44a.google.com [IPv6:2607:f8b0:4864:20::44a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4617CC061714 for ; Wed, 2 Feb 2022 17:01:17 -0800 (PST) Received: by mail-pf1-x44a.google.com with SMTP id o194-20020a62cdcb000000b004c9d2b4bfd8so498348pfg.7 for ; Wed, 02 Feb 2022 17:01:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=F59SUfYeyQQX4JGyCTCS3BzaOvVCAmTejwx5MdGi9kw=; b=Zd79YR7yD03r949tweioi1h5+KsJuhg6GQeEcUDAh279Dp0qozn3z3pcg734+eh/QO rsC+CVpIGjs/adJvxbK8c0ewUboamc9f8VU8YGYDtMFZP48uGua7sqK6njjfn6MZ0wdL vLbAI787BTO1CnhGaYH5P8SWQsNPi4JEdgnQmdrOF3dJ0ursY4gp9jkh/lStC34cwyTz DKvDc2owlUaAXcIPhfitDkB35PTTMNrX+pNwz+Oqf7fZQU7fijH9mKvieoyJ+/fmSA/n 0F9g+/qSCKJcoe2asIC1w5PbFgz9Dp+MVLvtPAPEeqWNEUtXKJxn0ACuBamVie0oocIR zvhA== X-Google-DKIM-Signature: v=1; 
Date: Thu, 3 Feb 2022 01:00:38 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-11-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 10/23] KVM: x86/mmu: Pass const memslot to rmap_add() From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org rmap_add() only uses the slot to call gfn_to_rmap() which takes a const memslot. No functional change intended.
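The const-ness is safe to propagate because rmap_add() only reads through the slot pointer before forwarding it. A stand-alone illustration of the same rule, with hypothetical names rather than KVM code:

#include <stdio.h>

struct slot { unsigned long base_gfn; };

/* The callee only reads through the pointer, so it takes const. */
static unsigned long slot_index(const struct slot *slot, unsigned long gfn)
{
        return gfn - slot->base_gfn;
}

/*
 * A caller that merely forwards the pointer can therefore take const as
 * well -- the same relationship rmap_add() has with gfn_to_rmap().
 */
static unsigned long caller_index(const struct slot *slot, unsigned long gfn)
{
        return slot_index(slot, gfn);
}

int main(void)
{
        struct slot s = { .base_gfn = 0x1000 };

        printf("index = %lu\n", caller_index(&s, 0x1234));
        return 0;
}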
Signed-off-by: David Matlack Reviewed-by: Ben Gardon --- arch/x86/kvm/mmu/mmu.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 48ebf2bebb90..a5e3bb632542 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1607,7 +1607,7 @@ static bool kvm_test_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, #define RMAP_RECYCLE_THRESHOLD 1000 -static void rmap_add(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot, +static void rmap_add(struct kvm_vcpu *vcpu, const struct kvm_memory_slot *slot, u64 *spte, gfn_t gfn) { struct kvm_mmu_page *sp; From patchwork Thu Feb 3 01:00:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733678 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D1EE7C433F5 for ; Thu, 3 Feb 2022 01:01:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348679AbiBCBBU (ORCPT ); Wed, 2 Feb 2022 20:01:20 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46234 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348674AbiBCBBT (ORCPT ); Wed, 2 Feb 2022 20:01:19 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F3D0AC061714 for ; Wed, 2 Feb 2022 17:01:18 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id n13-20020a17090a928d00b001b80df27e05so775999pjo.8 for ; Wed, 02 Feb 2022 17:01:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=CpxBz2rSpFHFrskwqdmipYbFWhPiqrMn8PP5pj8ez7U=; b=m1JYGJaisGEH4T7QmXuoj7WXTaKj+gwcooNraiEcvjGbzn/GJFosljSEkMIl0WYPQz dGerQ1UlZlU6fStcTKnNm0JN5xKIrFp0GqR7GECtQ+YBFAcUBer/ptPAMqPBRo1NJbnR fh54cI8hJz7XDwNQ02GULCVzLHyyByWSr/KTShzSWG1wuTpX57KEFboFJkbLJyWwM5Sa VqktwXwlrtsa7w5kgLQ9TNCArz6uL02QhhlK0RJzvK5xQ9Fu1W2gV9gRalkiY0igWX73 yPC15xCGhCHIj73m/lIygsC2vVoDytZx84EGi7iPs+/qHXM1XVU1ed8/8s6kH1ave3fy TJHw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=CpxBz2rSpFHFrskwqdmipYbFWhPiqrMn8PP5pj8ez7U=; b=s1Z+GuvpJitEvT99QdmYraNw0OzqHCE1kuT68WjkvDYIBXAf+hYN5ztREwq25jOc/n 1ld5bAmy+ZIEmZQuOHwp1+oce8/yYapm91b+3a4V7bo8BNwBRgks4MMje9TTYUZef+th /vsNV6Yhsa+Edphd8k61O51AA5m9eRdZDdGEzeZMCZBhmMoSmmGB7pP3slPFwS+jwca9 9Ql7S0mj/odTOS8SiNEe3BPLgS2q/OEj5SlB52m++V/LyL6Tt+BxoRECSfzhh01wePFj Fe2wF/hiTxfrWeIsu6K9EkH65Ew5VGF59tpTkasSpIn0t7MiM8/bj+4dLvKOcTAweUid bPng== X-Gm-Message-State: AOAM533FQThxRn/6mIXqBKaCP4Eond4kxSbjOK4KvcVhlUY8DKGvCE1k W1hZoqOyPfV6STtu3RX+lLcikzttUpqOQg== X-Google-Smtp-Source: ABdhPJxQpSONJqRxVDJXGlKvbxEW3+WV+iwEhrPYoeisLaikAYvcz7Ym+61cml/miI4NPesOudSKWofx7tU2cA== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:902:6b4b:: with SMTP id g11mr32942650plt.109.1643850078439; Wed, 02 Feb 2022 17:01:18 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:39 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: 
<20220203010051.2813563-12-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 11/23] KVM: x86/mmu: Pass const memslot to kvm_mmu_init_sp() and descendants From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Use a const pointer so that kvm_mmu_init_sp() can be called from contexts where we have a const pointer. No functional change intended. Signed-off-by: David Matlack Reviewed-by: Ben Gardon --- arch/x86/include/asm/kvm_page_track.h | 2 +- arch/x86/kvm/mmu/mmu.c | 7 +++---- arch/x86/kvm/mmu/mmu_internal.h | 2 +- arch/x86/kvm/mmu/page_track.c | 4 ++-- arch/x86/kvm/mmu/tdp_mmu.c | 2 +- arch/x86/kvm/mmu/tdp_mmu.h | 2 +- 6 files changed, 9 insertions(+), 10 deletions(-) diff --git a/arch/x86/include/asm/kvm_page_track.h b/arch/x86/include/asm/kvm_page_track.h index eb186bc57f6a..3a2dc183ae9a 100644 --- a/arch/x86/include/asm/kvm_page_track.h +++ b/arch/x86/include/asm/kvm_page_track.h @@ -58,7 +58,7 @@ int kvm_page_track_create_memslot(struct kvm *kvm, unsigned long npages); void kvm_slot_page_track_add_page(struct kvm *kvm, - struct kvm_memory_slot *slot, gfn_t gfn, + const struct kvm_memory_slot *slot, gfn_t gfn, enum kvm_page_track_mode mode); void kvm_slot_page_track_remove_page(struct kvm *kvm, struct kvm_memory_slot *slot, gfn_t gfn, diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index a5e3bb632542..de7c47ee0def 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -805,7 +805,7 @@ void kvm_mmu_gfn_allow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn) } static void account_shadowed(struct kvm *kvm, - struct kvm_memory_slot *slot, + const struct kvm_memory_slot *slot, struct kvm_mmu_page *sp) { gfn_t gfn; @@ -1384,7 +1384,7 @@ int kvm_cpu_dirty_log_size(void) } bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm, - struct kvm_memory_slot *slot, u64 gfn, + const struct kvm_memory_slot *slot, u64 gfn, int min_level) { struct kvm_rmap_head *rmap_head; @@ -2158,9 +2158,8 @@ static struct kvm_mmu_page *kvm_mmu_get_existing_sp(struct kvm_vcpu *vcpu, return sp; } - static void kvm_mmu_init_sp(struct kvm *kvm, struct kvm_mmu_page *sp, - struct kvm_memory_slot *slot, gfn_t gfn, + const struct kvm_memory_slot *slot, gfn_t gfn, union kvm_mmu_page_role role) { struct hlist_head *sp_list; diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index c5f2c0b9177d..e6bcea5a0aa9 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -123,7 +123,7 @@ int mmu_try_to_unsync_pages(struct kvm *kvm, const struct kvm_memory_slot *slot, void kvm_mmu_gfn_disallow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn); void kvm_mmu_gfn_allow_lpage(const struct kvm_memory_slot *slot, gfn_t gfn); bool kvm_mmu_slot_gfn_write_protect(struct kvm *kvm, - struct kvm_memory_slot *slot, u64 gfn, + const struct kvm_memory_slot *slot, u64 gfn, int min_level); void kvm_flush_remote_tlbs_with_address(struct kvm *kvm, u64 start_gfn, u64 pages); diff --git a/arch/x86/kvm/mmu/page_track.c b/arch/x86/kvm/mmu/page_track.c index 68eb1fb548b6..ebd704946a35 100644 --- a/arch/x86/kvm/mmu/page_track.c +++ b/arch/x86/kvm/mmu/page_track.c @@ -83,7 
+83,7 @@ int kvm_page_track_write_tracking_alloc(struct kvm_memory_slot *slot) return 0; } -static void update_gfn_track(struct kvm_memory_slot *slot, gfn_t gfn, +static void update_gfn_track(const struct kvm_memory_slot *slot, gfn_t gfn, enum kvm_page_track_mode mode, short count) { int index, val; @@ -111,7 +111,7 @@ static void update_gfn_track(struct kvm_memory_slot *slot, gfn_t gfn, * @mode: tracking mode, currently only write track is supported. */ void kvm_slot_page_track_add_page(struct kvm *kvm, - struct kvm_memory_slot *slot, gfn_t gfn, + const struct kvm_memory_slot *slot, gfn_t gfn, enum kvm_page_track_mode mode) { diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 4ff1af24b5aa..34c451f1eac9 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1645,7 +1645,7 @@ static bool write_protect_gfn(struct kvm *kvm, struct kvm_mmu_page *root, * Returns true if an SPTE was set and a TLB flush is needed. */ bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm, - struct kvm_memory_slot *slot, gfn_t gfn, + const struct kvm_memory_slot *slot, gfn_t gfn, int min_level) { struct kvm_mmu_page *root; diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h index 3f987785702a..b1265149a05d 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.h +++ b/arch/x86/kvm/mmu/tdp_mmu.h @@ -64,7 +64,7 @@ void kvm_tdp_mmu_zap_collapsible_sptes(struct kvm *kvm, const struct kvm_memory_slot *slot); bool kvm_tdp_mmu_write_protect_gfn(struct kvm *kvm, - struct kvm_memory_slot *slot, gfn_t gfn, + const struct kvm_memory_slot *slot, gfn_t gfn, int min_level); void kvm_tdp_mmu_try_split_huge_pages(struct kvm *kvm, From patchwork Thu Feb 3 01:00:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733679 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C5527C433EF for ; Thu, 3 Feb 2022 01:01:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348678AbiBCBBV (ORCPT ); Wed, 2 Feb 2022 20:01:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46246 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348675AbiBCBBU (ORCPT ); Wed, 2 Feb 2022 20:01:20 -0500 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9CDFAC061714 for ; Wed, 2 Feb 2022 17:01:20 -0800 (PST) Received: by mail-pg1-x54a.google.com with SMTP id t7-20020a634447000000b0036579648983so582303pgk.3 for ; Wed, 02 Feb 2022 17:01:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=y4ZDSzE5ON4YdtRzBulHie84tQs+2RQs6E8TDceZkmE=; b=Co+cTdlUHj6lqw8GTk8KzU6DETBKf8ntBjmrbNAz1o6lemkOI0FbHKtvkY9lzpzDxF JQ8kZGur4tqwj2oGY7X9ewYLLflS45UdFJ2ow8bqIHUYAe22Z9VkAyElxGbMw0jPCd1q VnM0fGakYQ95lM46AyzTRYouAqjBp0oy76lGkYvgZ7g7KJ4fYzbK6WP3qaOEMWdBcIOf l+IVctVgAOIO//jmBRRbeaELjCIKowHcMcUX3RId8pNAMsKLsZUdkjwa49sjfhG2R0Py lxZD6QhCSGdy1V1d4fVkojcGyL8ytCbgERnbmb2VSS4s98nLgZkWlkEF4W5/ZuavhQl5 /1RA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version 
:references:subject:from:to:cc; bh=y4ZDSzE5ON4YdtRzBulHie84tQs+2RQs6E8TDceZkmE=; b=U8iyOdLaFBt3fQ5HeDxmehjZaFJW1sDbasQgCL368DHFCmQaXz2onuBtIbH4zmo6uP Xg46vc8I1xU7zkITn+Ga8l5uUjwmwAXJTiwheYo0xQ4J7tmKJXT93bgCc3jVEfAZoclu H7QOA1vBhrRrWkHN5S7oHKMxyxsBWE/ermzDUGd0A7SzVR1mBJyX2G7fi4mPjieYSZMB bkYCZo9h6vU7576th1awZpBMMaIOpjjWy9D0afogoA+hKvA8ewT4yoZi2kAFRNhYq1Et mSXWCaa2yw9l01WeZMM3CAFKm0oCDKKuJ9BpD0n75iUwPYnvplQf2WgdYVmHp120B7bU mnbg== X-Gm-Message-State: AOAM532ke2ZUakZOu2oyryEivy7Rmq7oPoW5iZzyJntc7TUTQxhgJfqq j2yzAzwjcZmaUK4kWMtS/roPV9jJs1PBqg== X-Google-Smtp-Source: ABdhPJxFw07Ea+BRSOjqA2kzJDJL1A+Kj6A3wosxu5j9kIWlPxhLwG/ZtLiFj7D5RjnRd1Qf3SwYv7+y7+Bwcg== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a05:6a00:230e:: with SMTP id h14mr32502294pfh.10.1643850080107; Wed, 02 Feb 2022 17:01:20 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:40 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-13-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 12/23] KVM: x86/mmu: Decouple rmap_add() and link_shadow_page() from kvm_vcpu From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Allow adding new entries to the rmap and linking shadow pages without a struct kvm_vcpu pointer by moving the implementation of rmap_add() and link_shadow_page() into inner helper functions. No functional change intended. Signed-off-by: David Matlack Reviewed-by: Ben Gardon --- arch/x86/kvm/mmu/mmu.c | 43 +++++++++++++++++++++++++++--------------- 1 file changed, 28 insertions(+), 15 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index de7c47ee0def..c2f7f026d414 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -736,9 +736,9 @@ static void mmu_free_memory_caches(struct kvm_vcpu *vcpu) kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache); } -static struct pte_list_desc *mmu_alloc_pte_list_desc(struct kvm_vcpu *vcpu) +static struct pte_list_desc *mmu_alloc_pte_list_desc(struct kvm_mmu_memory_cache *cache) { - return kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_pte_list_desc_cache); + return kvm_mmu_memory_cache_alloc(cache); } static void mmu_free_pte_list_desc(struct pte_list_desc *pte_list_desc) @@ -885,7 +885,7 @@ gfn_to_memslot_dirty_bitmap(struct kvm_vcpu *vcpu, gfn_t gfn, /* * Returns the number of pointers in the rmap chain, not counting the new one. 
*/ -static int pte_list_add(struct kvm_vcpu *vcpu, u64 *spte, +static int pte_list_add(struct kvm_mmu_memory_cache *cache, u64 *spte, struct kvm_rmap_head *rmap_head) { struct pte_list_desc *desc; @@ -896,7 +896,7 @@ static int pte_list_add(struct kvm_vcpu *vcpu, u64 *spte, rmap_head->val = (unsigned long)spte; } else if (!(rmap_head->val & 1)) { rmap_printk("%p %llx 1->many\n", spte, *spte); - desc = mmu_alloc_pte_list_desc(vcpu); + desc = mmu_alloc_pte_list_desc(cache); desc->sptes[0] = (u64 *)rmap_head->val; desc->sptes[1] = spte; desc->spte_count = 2; @@ -908,7 +908,7 @@ static int pte_list_add(struct kvm_vcpu *vcpu, u64 *spte, while (desc->spte_count == PTE_LIST_EXT) { count += PTE_LIST_EXT; if (!desc->more) { - desc->more = mmu_alloc_pte_list_desc(vcpu); + desc->more = mmu_alloc_pte_list_desc(cache); desc = desc->more; desc->spte_count = 0; break; @@ -1607,8 +1607,10 @@ static bool kvm_test_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, #define RMAP_RECYCLE_THRESHOLD 1000 -static void rmap_add(struct kvm_vcpu *vcpu, const struct kvm_memory_slot *slot, - u64 *spte, gfn_t gfn) +static void __rmap_add(struct kvm *kvm, + struct kvm_mmu_memory_cache *cache, + const struct kvm_memory_slot *slot, + u64 *spte, gfn_t gfn) { struct kvm_mmu_page *sp; struct kvm_rmap_head *rmap_head; @@ -1617,15 +1619,21 @@ static void rmap_add(struct kvm_vcpu *vcpu, const struct kvm_memory_slot *slot, sp = sptep_to_sp(spte); kvm_mmu_page_set_gfn(sp, spte - sp->spt, gfn); rmap_head = gfn_to_rmap(gfn, sp->role.level, slot); - rmap_count = pte_list_add(vcpu, spte, rmap_head); + rmap_count = pte_list_add(cache, spte, rmap_head); if (rmap_count > RMAP_RECYCLE_THRESHOLD) { - kvm_unmap_rmapp(vcpu->kvm, rmap_head, NULL, gfn, sp->role.level, __pte(0)); + kvm_unmap_rmapp(kvm, rmap_head, NULL, gfn, sp->role.level, __pte(0)); kvm_flush_remote_tlbs_with_address( - vcpu->kvm, sp->gfn, KVM_PAGES_PER_HPAGE(sp->role.level)); + kvm, sp->gfn, KVM_PAGES_PER_HPAGE(sp->role.level)); } } +static void rmap_add(struct kvm_vcpu *vcpu, const struct kvm_memory_slot *slot, + u64 *spte, gfn_t gfn) +{ + __rmap_add(vcpu->kvm, &vcpu->arch.mmu_pte_list_desc_cache, slot, spte, gfn); +} + bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) { bool young = false; @@ -1693,13 +1701,13 @@ static unsigned kvm_page_table_hashfn(gfn_t gfn) return hash_64(gfn, KVM_MMU_HASH_SHIFT); } -static void mmu_page_add_parent_pte(struct kvm_vcpu *vcpu, +static void mmu_page_add_parent_pte(struct kvm_mmu_memory_cache *cache, struct kvm_mmu_page *sp, u64 *parent_pte) { if (!parent_pte) return; - pte_list_add(vcpu, parent_pte, &sp->parent_ptes); + pte_list_add(cache, parent_pte, &sp->parent_ptes); } static void mmu_page_remove_parent_pte(struct kvm_mmu_page *sp, @@ -2297,8 +2305,8 @@ static void shadow_walk_next(struct kvm_shadow_walk_iterator *iterator) __shadow_walk_next(iterator, *iterator->sptep); } -static void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep, - struct kvm_mmu_page *sp) +static void __link_shadow_page(struct kvm_mmu_memory_cache *cache, u64 *sptep, + struct kvm_mmu_page *sp) { u64 spte; @@ -2308,12 +2316,17 @@ static void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep, mmu_spte_set(sptep, spte); - mmu_page_add_parent_pte(vcpu, sp, sptep); + mmu_page_add_parent_pte(cache, sp, sptep); if (sp->unsync_children || sp->unsync) mark_unsync(sptep); } +static void link_shadow_page(struct kvm_vcpu *vcpu, u64 *sptep, struct kvm_mmu_page *sp) +{ + __link_shadow_page(&vcpu->arch.mmu_pte_list_desc_cache, sptep, sp); +} + static 
void validate_direct_spte(struct kvm_vcpu *vcpu, u64 *sptep, unsigned direct_access) { From patchwork Thu Feb 3 01:00:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733680 Date: Thu, 3 Feb 2022 01:00:41 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-14-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 13/23] KVM: x86/mmu: Update page stats in __rmap_add() From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org
Update the page stats in __rmap_add() rather than at the call site. This will avoid having to manually update page stats when splitting huge pages in a subsequent commit. No functional change intended. Signed-off-by: David Matlack Reviewed-by: Ben Gardon --- arch/x86/kvm/mmu/mmu.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index c2f7f026d414..ae1564e67e49 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1621,6 +1621,8 @@ static void __rmap_add(struct kvm *kvm, rmap_head = gfn_to_rmap(gfn, sp->role.level, slot); rmap_count = pte_list_add(cache, spte, rmap_head); + kvm_update_page_stats(kvm, sp->role.level, 1); + if (rmap_count > RMAP_RECYCLE_THRESHOLD) { kvm_unmap_rmapp(kvm, rmap_head, NULL, gfn, sp->role.level, __pte(0)); kvm_flush_remote_tlbs_with_address( @@ -2831,7 +2833,6 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot, if (!was_rmapped) { WARN_ON_ONCE(ret == RET_PF_SPURIOUS); - kvm_update_page_stats(vcpu->kvm, level, 1); rmap_add(vcpu, slot, sptep, gfn); } From patchwork Thu Feb 3 01:00:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733681 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 81FEBC433F5 for ; Thu, 3 Feb 2022 01:01:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348698AbiBCBBZ (ORCPT ); Wed, 2 Feb 2022 20:01:25 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46264 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348682AbiBCBBX (ORCPT ); Wed, 2 Feb 2022 20:01:23 -0500 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E9D7DC06173B for ; Wed, 2 Feb 2022 17:01:23 -0800 (PST) Received: by mail-pg1-x54a.google.com with SMTP id f35-20020a631f23000000b0035ec54b3bbcso600948pgf.0 for ; Wed, 02 Feb 2022 17:01:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=7i2xHlLNvbgpmgwj6zOsmV+bBljQ8N869Z7/+3akjow=; b=e5yDU2i7oo0aPLzbtIWfvKcYXNa8Wis5VQ1ZuaH9yXXH4eaW1pWi7Nv1qrPVjFKNC4 NuAP8hYAchLKoKiYmciQ43Qwl+SjfV3DkfPWuP/IybjqTbyE5Ltui56IqKmHcMDgEPcr Zwa/bHwd6WDP6Q3vVnuMo/SOOF0bFPExz6cKgjh5Snv9ZXJfjUPEuqAu5Z+0g5Rl1c2t siHfqO1cyAFl5EfNRJMHdCAhb5mdQzkS+HblGw+R/tmHcLbwC66NZ8WqLZloaGoAky5f +x+K7gr+lA/fVdwZ+N3A/1B3oS+31jy5ksKW9J9dx2SPMhAoGhDsTsDz8ak1WBIfz+up aEaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=7i2xHlLNvbgpmgwj6zOsmV+bBljQ8N869Z7/+3akjow=; b=rtfazsfxrE4nTMVR5zftibau/zx4YVWUoCW1tpOmVPfVGRKOFeJeDX9k0pd/VndoFq O0u0+6mr3cDCdwt1VG99wIQGmlW1+acP0HhyYMpj0QXFzaV1uz4XMHXj1mOmPVZIONrr FNDbPLEAQm6qpZeK8i/+WlM5i2Ka57T7Jn9VKMh4+wMhrr70drC6v61Yg5D2L6V99rQC x6WM10cmalm6ab/AOTvJUuxWzPfgdhYqUJEfwzs0UZJb9xGTTTVnTKalCN6LpB3VAfjC EgTLB/qnEoqtzRS+pIDbRDWfNxAWJg/OMsk3c1eFWtEPCi5JPcrg1ZaMCCrcb4gVEvy+ s4Jg== X-Gm-Message-State: AOAM532FvApbfCbT1EmqdSsUt1PViVsFemMS9HWiRUnyOiJgPu7b88b6 IQSu83gbU+/stTvKlkvUSkK1wseX25/C+g== X-Google-Smtp-Source: 
Date: Thu, 3 Feb 2022 01:00:42 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-15-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 14/23] KVM: x86/mmu: Cache the access bits of shadowed translations From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , Aleksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org In order to split a huge page we need to know what access bits to assign to the role of the new child page table. This can't be easily derived from the huge page SPTE itself since KVM applies its own access policies on top, such as for HugePage NX. We could walk the guest page tables to determine the correct access bits, but that is difficult to plumb outside of a vCPU fault context. Instead, we can store the original access bits for each leaf SPTE alongside the GFN in the gfns array. The access bits only take up 3 bits, which leaves 61 bits left over for gfns, which is more than enough. So this change does not require any additional memory. In order to keep the access bit cache in sync with the guest, we have to extend FNAME(sync_page) to also update the access bits. Now that the gfns array caches more information than just GFNs, rename it to shadowed_translation. No functional change intended.
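The size claim is easy to check: packing a 3-bit ACC_* mask next to the shadowed GFN still fits in a single 64-bit word, so the per-shadow-page array costs no more memory than the old gfn_t array. A user-space sketch of the packing (bit widths mirror the struct added by this patch; uint64_t stands in for the kernel's u64, and it assumes a GCC/Clang-style compiler that accepts 64-bit bit-fields, as the kernel does):

#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* 3 bits of access mask plus the shadowed GFN, packed into one word. */
struct shadowed_translation_entry {
        uint64_t access : 3;
        uint64_t gfn    : 56;
};

/* Mirrors the BUILD_BUG_ON added to kvm_mmu_alloc_sp() in this patch. */
static_assert(sizeof(struct shadowed_translation_entry) == sizeof(uint64_t),
              "entry must stay the size of the old gfn_t");

int main(void)
{
        struct shadowed_translation_entry e = { .access = 0x7, .gfn = 0xfeedbeef };

        printf("access=%#x gfn=%#llx size=%zu\n",
               (unsigned)e.access, (unsigned long long)e.gfn, sizeof(e));
        return 0;
}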
Signed-off-by: David Matlack --- arch/x86/include/asm/kvm_host.h | 2 +- arch/x86/kvm/mmu/mmu.c | 32 +++++++++++++++++++------------- arch/x86/kvm/mmu/mmu_internal.h | 15 +++++++++++++-- arch/x86/kvm/mmu/paging_tmpl.h | 7 +++++-- 4 files changed, 38 insertions(+), 18 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index c371ee7e45f7..f00004c13ccf 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -686,7 +686,7 @@ struct kvm_vcpu_arch { struct kvm_mmu_memory_cache mmu_pte_list_desc_cache; struct kvm_mmu_memory_cache mmu_shadow_page_cache; - struct kvm_mmu_memory_cache mmu_gfn_array_cache; + struct kvm_mmu_memory_cache mmu_shadowed_translation_cache; struct kvm_mmu_memory_cache mmu_page_header_cache; /* diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index ae1564e67e49..e2306a39526a 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -719,7 +719,7 @@ static int mmu_topup_memory_caches(struct kvm_vcpu *vcpu, bool maybe_indirect) if (r) return r; if (maybe_indirect) { - r = kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_gfn_array_cache, + r = kvm_mmu_topup_memory_cache(&vcpu->arch.mmu_shadowed_translation_cache, PT64_ROOT_MAX_LEVEL); if (r) return r; @@ -732,7 +732,7 @@ static void mmu_free_memory_caches(struct kvm_vcpu *vcpu) { kvm_mmu_free_memory_cache(&vcpu->arch.mmu_pte_list_desc_cache); kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadow_page_cache); - kvm_mmu_free_memory_cache(&vcpu->arch.mmu_gfn_array_cache); + kvm_mmu_free_memory_cache(&vcpu->arch.mmu_shadowed_translation_cache); kvm_mmu_free_memory_cache(&vcpu->arch.mmu_page_header_cache); } @@ -749,15 +749,17 @@ static void mmu_free_pte_list_desc(struct pte_list_desc *pte_list_desc) static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index) { if (!sp->role.direct) - return sp->gfns[index]; + return sp->shadowed_translation[index].gfn; return sp->gfn + (index << ((sp->role.level - 1) * PT64_LEVEL_BITS)); } -static void kvm_mmu_page_set_gfn(struct kvm_mmu_page *sp, int index, gfn_t gfn) +static void kvm_mmu_page_set_gfn_access(struct kvm_mmu_page *sp, int index, + gfn_t gfn, u32 access) { if (!sp->role.direct) { - sp->gfns[index] = gfn; + sp->shadowed_translation[index].gfn = gfn; + sp->shadowed_translation[index].access = access; return; } @@ -1610,14 +1612,14 @@ static bool kvm_test_age_rmapp(struct kvm *kvm, struct kvm_rmap_head *rmap_head, static void __rmap_add(struct kvm *kvm, struct kvm_mmu_memory_cache *cache, const struct kvm_memory_slot *slot, - u64 *spte, gfn_t gfn) + u64 *spte, gfn_t gfn, u32 access) { struct kvm_mmu_page *sp; struct kvm_rmap_head *rmap_head; int rmap_count; sp = sptep_to_sp(spte); - kvm_mmu_page_set_gfn(sp, spte - sp->spt, gfn); + kvm_mmu_page_set_gfn_access(sp, spte - sp->spt, gfn, access); rmap_head = gfn_to_rmap(gfn, sp->role.level, slot); rmap_count = pte_list_add(cache, spte, rmap_head); @@ -1631,9 +1633,9 @@ static void __rmap_add(struct kvm *kvm, } static void rmap_add(struct kvm_vcpu *vcpu, const struct kvm_memory_slot *slot, - u64 *spte, gfn_t gfn) + u64 *spte, gfn_t gfn, u32 access) { - __rmap_add(vcpu->kvm, &vcpu->arch.mmu_pte_list_desc_cache, slot, spte, gfn); + __rmap_add(vcpu->kvm, &vcpu->arch.mmu_pte_list_desc_cache, slot, spte, gfn, access); } bool kvm_age_gfn(struct kvm *kvm, struct kvm_gfn_range *range) @@ -1694,7 +1696,7 @@ void kvm_mmu_free_sp(struct kvm_mmu_page *sp) { free_page((unsigned long)sp->spt); if (!sp->role.direct) - free_page((unsigned long)sp->gfns); + 
free_page((unsigned long)sp->shadowed_translation); kmem_cache_free(mmu_page_header_cache, sp); } @@ -1731,8 +1733,12 @@ struct kvm_mmu_page *kvm_mmu_alloc_sp(struct kvm_vcpu *vcpu, bool direct) sp = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_page_header_cache); sp->spt = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadow_page_cache); + + BUILD_BUG_ON(sizeof(sp->shadowed_translation[0]) != sizeof(u64)); + if (!direct) - sp->gfns = kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_gfn_array_cache); + sp->shadowed_translation = + kvm_mmu_memory_cache_alloc(&vcpu->arch.mmu_shadowed_translation_cache); return sp; } @@ -1742,7 +1748,7 @@ struct kvm_mmu_page *kvm_mmu_alloc_sp(struct kvm_vcpu *vcpu, bool direct) * * Huge page splitting always uses direct shadow pages since the huge page is * being mapped directly with a lower level page table. Thus there's no need to - * allocate the gfns array. + * allocate the shadowed_translation array. */ struct kvm_mmu_page *kvm_mmu_alloc_direct_sp_for_split(gfp_t gfp) { @@ -2833,7 +2839,7 @@ static int mmu_set_spte(struct kvm_vcpu *vcpu, struct kvm_memory_slot *slot, if (!was_rmapped) { WARN_ON_ONCE(ret == RET_PF_SPURIOUS); - rmap_add(vcpu, slot, sptep, gfn); + rmap_add(vcpu, slot, sptep, gfn, pte_access); } return ret; diff --git a/arch/x86/kvm/mmu/mmu_internal.h b/arch/x86/kvm/mmu/mmu_internal.h index e6bcea5a0aa9..9ee175adcc12 100644 --- a/arch/x86/kvm/mmu/mmu_internal.h +++ b/arch/x86/kvm/mmu/mmu_internal.h @@ -30,6 +30,11 @@ extern bool dbg; #define INVALID_PAE_ROOT 0 #define IS_VALID_PAE_ROOT(x) (!!(x)) +struct shadowed_translation_entry { + u64 access:3; + u64 gfn:56; +}; + struct kvm_mmu_page { /* * Note, "link" through "spt" fit in a single 64 byte cache line on @@ -51,8 +56,14 @@ struct kvm_mmu_page { gfn_t gfn; u64 *spt; - /* hold the gfn of each spte inside spt */ - gfn_t *gfns; + /* + * For indirect shadow pages, caches the result of the intermediate + * guest translation being shadowed by each SPTE. + * + * NULL for direct shadow pages. + */ + struct shadowed_translation_entry *shadowed_translation; + /* Currently serving as active root */ union { int root_count; diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index c533c191925e..703dfb062bf0 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -1016,7 +1016,7 @@ static gpa_t FNAME(gva_to_gpa)(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu, } /* - * Using the cached information from sp->gfns is safe because: + * Using the information in sp->shadowed_translation is safe because: * - The spte has a reference to the struct page, so the pfn for a given gfn * can't change unless all sptes pointing to it are nuked first. 
* @@ -1090,12 +1090,15 @@ static int FNAME(sync_page)(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp) if (sync_mmio_spte(vcpu, &sp->spt[i], gfn, pte_access)) continue; - if (gfn != sp->gfns[i]) { + if (gfn != sp->shadowed_translation[i].gfn) { drop_spte(vcpu->kvm, &sp->spt[i]); flush = true; continue; } + if (pte_access != sp->shadowed_translation[i].access) + sp->shadowed_translation[i].access = pte_access; + sptep = &sp->spt[i]; spte = *sptep; host_writable = spte & shadow_host_writable_mask; From patchwork Thu Feb 3 01:00:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733682 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 41EB6C433F5 for ; Thu, 3 Feb 2022 01:01:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348693AbiBCBB1 (ORCPT ); Wed, 2 Feb 2022 20:01:27 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46274 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348702AbiBCBB0 (ORCPT ); Wed, 2 Feb 2022 20:01:26 -0500 Received: from mail-pj1-x1049.google.com (mail-pj1-x1049.google.com [IPv6:2607:f8b0:4864:20::1049]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EE775C06173B for ; Wed, 2 Feb 2022 17:01:25 -0800 (PST) Received: by mail-pj1-x1049.google.com with SMTP id n9-20020a17090a73c900b001b5cafefa27so802091pjk.2 for ; Wed, 02 Feb 2022 17:01:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=dG8lD0mSLVvxVjmwCeTPEOlVoMVFkHAjpgklC5C2hr0=; b=Tb+hFO3l/tZXXnLRSMyCx+rNoSddClKnj5+cIBYTAX0vggUv5HCtzn+HcKfOhB7MrN xaL3mPzE8ojn/7c0A5nqichAroxT2P6lMZBG61FhbSmQEIF+XngX22mnisK394gW6H0P Tz+W7qoBAa9uLRiHSXLiy4LuBcrlz07C4pcb1Vi+uY6Hb7p3oR2/duOwL343mZiH8K94 y5+/HJURWK3eSZM8O+RIk7IfG56TI1Jw7lcDhUO0Qs4HdwdB/1i//tc8jHsIpXrAtNUh r//XLjNrkrIRIANgWt0m2gfIU38w8d64Tj6OPvgwE32/ULqgT88lynZKtItvWdd2JpTO rYHQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=dG8lD0mSLVvxVjmwCeTPEOlVoMVFkHAjpgklC5C2hr0=; b=ztw1Rx8xX69zJCjOm3ZaSXOoaPYyOi4cB12V95qK56YPiytLSxboew/kQGAyKgHWUh HHi9wNqlSoMNI+/vKMN+6b5l2cYT1RN0oK3PlFqvY7oNA4RpRz0pez43ky/AAwdDJrbD dcrzqHp49MYy2/RApacdBgcCuAI6Xfo+clwil1W4JLVbjbi1Rcxq+6bf3Q+bgTUiloX+ Qi2Oj6qt7y39v6UJb9omtNcs+/r7YY49OJrJ6tFMtDa4jtuAM46FPqGKjrSbn6TMBJh2 dXgotj1IJ+Ez7kSBKoUyCXlOuVDqQydQ22PHGShOgOEdY5GmOndUaB29ksjRBI0b5WF9 etcA== X-Gm-Message-State: AOAM531fdWnjs3mor9rFI7jwFnU6qp9HG3XHNaBRbSii6JlAUr32D5kP fVjelSkxqtvJBaU2J319AolJR6BERR3VIQ== X-Google-Smtp-Source: ABdhPJwKB7lDChWUDiVhb8aJ/SmbuUf4xyHtt90S1m7+3fNDfb7XbDiAptxnp2AmDTOh97DW70jPs75bCEElMQ== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:903:300d:: with SMTP id o13mr33031316pla.110.1643850085134; Wed, 02 Feb 2022 17:01:25 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:43 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-16-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 
2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 15/23] KVM: x86/mmu: Pass access information to make_huge_page_split_spte() From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Currently make_huge_page_split_spte() assumes execute permissions can be granted to any 4K SPTE when splitting huge pages. This is true for the TDP MMU but is not necessarily true for the shadow MMU. Huge pages mapped by the shadow MMU may be shadowing huge pages that the guest has disallowed execute permissions. No functional change intended. Signed-off-by: David Matlack Reviewed-by: Ben Gardon --- arch/x86/kvm/mmu/spte.c | 5 +++-- arch/x86/kvm/mmu/spte.h | 3 ++- arch/x86/kvm/mmu/tdp_mmu.c | 2 +- 3 files changed, 6 insertions(+), 4 deletions(-) diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c index 20cf9e0d45dd..7cba5cffc240 100644 --- a/arch/x86/kvm/mmu/spte.c +++ b/arch/x86/kvm/mmu/spte.c @@ -215,7 +215,8 @@ static u64 make_spte_executable(u64 spte) * This is used during huge page splitting to build the SPTEs that make up the * new page table. */ -u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index) +u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index, + unsigned int access) { u64 child_spte; int child_level; @@ -243,7 +244,7 @@ u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index) * When splitting to a 4K page, mark the page executable as the * NX hugepage mitigation no longer applies. */ - if (is_nx_huge_page_enabled()) + if (is_nx_huge_page_enabled() && (access & ACC_EXEC_MASK)) child_spte = make_spte_executable(child_spte); } diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h index 73f12615416f..c7ccdd5c440d 100644 --- a/arch/x86/kvm/mmu/spte.h +++ b/arch/x86/kvm/mmu/spte.h @@ -415,7 +415,8 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp, unsigned int pte_access, gfn_t gfn, kvm_pfn_t pfn, u64 old_spte, bool prefetch, bool can_unsync, bool host_writable, u64 *new_spte); -u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index); +u64 make_huge_page_split_spte(u64 huge_spte, int huge_level, int index, + unsigned int access); u64 make_nonleaf_spte(u64 *child_pt, bool ad_disabled); u64 make_mmio_spte(struct kvm_vcpu *vcpu, u64 gfn, unsigned int access); u64 mark_spte_for_access_track(u64 spte); diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c index 34c451f1eac9..02bfbc1bebbe 100644 --- a/arch/x86/kvm/mmu/tdp_mmu.c +++ b/arch/x86/kvm/mmu/tdp_mmu.c @@ -1310,7 +1310,7 @@ static int tdp_mmu_split_huge_page(struct kvm *kvm, struct tdp_iter *iter, * not been linked in yet and thus is not reachable from any other CPU. 
*/ for (i = 0; i < PT64_ENT_PER_PAGE; i++) - sp->spt[i] = make_huge_page_split_spte(huge_spte, level, i); + sp->spt[i] = make_huge_page_split_spte(huge_spte, level, i, ACC_ALL); /* * Replace the huge spte with a pointer to the populated lower level From patchwork Thu Feb 3 01:00:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733683 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C424C433FE for ; Thu, 3 Feb 2022 01:01:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348692AbiBCBBa (ORCPT ); Wed, 2 Feb 2022 20:01:30 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46280 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348674AbiBCBB1 (ORCPT ); Wed, 2 Feb 2022 20:01:27 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F3CEDC061714 for ; Wed, 2 Feb 2022 17:01:26 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id em10-20020a17090b014a00b001b5f2f3b5ffso805379pjb.1 for ; Wed, 02 Feb 2022 17:01:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=0l1gqv5EOwd7xJV6eVY8d2Pk48r8BizOfmRGLMRqKB4=; b=QjPBEa8FnfTxC06z6gyVawJMYzwITsMZ7gHrkdBJjAPjmPqsQtW2vt+ornPjsq9l/o XVDGKPEWRXsZ++nImCs1pv7Cz0NPF1EZkSIUb4s/DtAOejGEdj1GlSV7V6VQ6MUjL0vf dgS3FsWgZK3J9YmofNxs7KN61nzmE0G/RNBaSIamwLcUyX8BmUOBMF4GhQi/+cCEvlyh pNqBc+cmWqSS6ld9M7Lod5CcYeqqtusz+bA6mkpbVZqwGkWw6g8oLW/A6cLTQ6NYNhMS mwYBcSjoTqQdwNenRtdMqZ5a+wQBxEHEqqwzzSVui89kXYZt20Gk0/IyjiVQ/EeALdIi /XYA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=0l1gqv5EOwd7xJV6eVY8d2Pk48r8BizOfmRGLMRqKB4=; b=G+luk39NGw7CcVMA7WtgSTYDq1j1GDiHoSrcxmViq/bwedMxq6zLdCJwxCZXqbjTNd TSj+3Il7d9rwQSCGXqzS4OxrSTDjq27+bcMTd+bljKmcbI8JfJ5kUiHFDH6HSxl0YWpO mgzHFV196+1WPddK1x1ZYw4hRMLDfUp0JwC5RLocfzCDswkpoDfrJOp4pM+dGFndvfgM 5imQxE30MS9flKchgr5YDwoyb6zhdzYKhIVa+cl4FxeBEkgnLXGCNz35PCx4x+x3hDh5 fHnC4yFWUseMcKbyhMSXI9iJEPnjSWGl60HWjdVdAi3ni58DpoNmrYqDmj/l/3IETKdK AGtw== X-Gm-Message-State: AOAM533+s1fdnXpjdWgyTVCOz/jie7c4BpHXHFOlEt/uFOue2EkoDdKK XkutZF+7TURYP0rmokv32m5yLNXriatpmg== X-Google-Smtp-Source: ABdhPJwYe2XqR1jNLCi5DnX59bLGtbJ4Lg9q6Gc/kHR/d015B28h15jzCJG3v4lXTuungdSN4oLHLf21539SCg== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:90a:2e03:: with SMTP id q3mr11023002pjd.184.1643850086506; Wed, 02 Feb 2022 17:01:26 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:44 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-17-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 16/23] KVM: x86/mmu: Zap collapsible SPTEs at all levels in the shadow MMU From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li 
, Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Currently KVM only zaps collapsible 4KiB SPTEs in the shadow MMU (i.e. in the rmap). This leads to correct behavior because KVM never creates intermediate huge pages during dirty logging. For example, a 1GiB page is never partially split to a 2MiB page. However this behavior will stop being correct once the shadow MMU participates in eager page splitting, which can in fact leave behind partially split huge pages. In preparation for that change, change the shadow MMU to iterate over all levels when zapping collapsible SPTEs. No functional change intended. Signed-off-by: David Matlack Reviewed-by: Ben Gardon --- arch/x86/kvm/mmu/mmu.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index e2306a39526a..99ad7cc8683f 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6038,18 +6038,25 @@ static bool kvm_mmu_zap_collapsible_spte(struct kvm *kvm, return need_tlb_flush; } +static void kvm_rmap_zap_collapsible_sptes(struct kvm *kvm, + const struct kvm_memory_slot *slot) +{ + bool flush; + + flush = slot_handle_level(kvm, slot, kvm_mmu_zap_collapsible_spte, + PG_LEVEL_4K, KVM_MAX_HUGEPAGE_LEVEL, true); + + if (flush) + kvm_arch_flush_remote_tlbs_memslot(kvm, slot); + +} + void kvm_mmu_zap_collapsible_sptes(struct kvm *kvm, const struct kvm_memory_slot *slot) { if (kvm_memslots_have_rmaps(kvm)) { write_lock(&kvm->mmu_lock); - /* - * Zap only 4k SPTEs since the legacy MMU only supports dirty - * logging at a 4k granularity and never creates collapsible - * 2m SPTEs during dirty logging. 
- */ - if (slot_handle_level_4k(kvm, slot, kvm_mmu_zap_collapsible_spte, true)) - kvm_arch_flush_remote_tlbs_memslot(kvm, slot); + kvm_rmap_zap_collapsible_sptes(kvm, slot); write_unlock(&kvm->mmu_lock); } From patchwork Thu Feb 3 01:00:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733684 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 96F18C433F5 for ; Thu, 3 Feb 2022 01:01:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348695AbiBCBBc (ORCPT ); Wed, 2 Feb 2022 20:01:32 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46290 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348682AbiBCBBa (ORCPT ); Wed, 2 Feb 2022 20:01:30 -0500 Received: from mail-pl1-x649.google.com (mail-pl1-x649.google.com [IPv6:2607:f8b0:4864:20::649]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C0801C061714 for ; Wed, 2 Feb 2022 17:01:28 -0800 (PST) Received: by mail-pl1-x649.google.com with SMTP id h11-20020a170902eecb00b0014cc91d4bc4so275640plb.16 for ; Wed, 02 Feb 2022 17:01:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=5i4VY2bHCiSczZZRyAsjbx+sp+PsOe+n0ifiX66TGKQ=; b=JZjr3o/Xj0ZRx0suwEy6bYbMd9qW24xXIJPQHNr1KXJ0UewxNeP9Ctdvx+E57Vwn7p bA6O/ilFcBB7nikhezvMGjcNUt9zmq4WM8Od2Nf/DYpMKe3H5FHlsKtTnxRY3OZdAUF1 Cb6yZ7H2REBHyiIMe0vZRU6iojZnWGP5O5RDb4ymtaJNHQqgl1pzwv9ln3FXcvfG+fdS eAdSmMwfwpLdhOCjnkUXWqF6Uuxz3W/Sp+SxXoz2Tsa6B52RcYILQlj1kGaoNQlLA79b M2CDemAUsjgJwlAF0kKTR+wlZ8nLYKHeSmJjIy3hx8Ec15jLeMILAUQhGxPkLeOqaKPD 9z/g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=5i4VY2bHCiSczZZRyAsjbx+sp+PsOe+n0ifiX66TGKQ=; b=o//V7+WMkyrzx1al+Z6vo42mpwKrUqVGv1CMWLOxDQkq6nbLR3+4jVrtTSsIHm6gm9 ApE/ODUOI1ThFR2RvQGD4mwwzN7Fd1L0jzDl7sB1iPH0u6zK59tVfZwjIv/whKKqsnzA LGHd2m3HyREKdWTJuKl2C0mcsuDh6ABIv+O2TLvnqNfWMsE3pezGeWbPkQcJykZRqZ6D 2BAvw3wuixCXSzUQ2KVGY+RwAPzECa8yti5e3xE0HA6y8jm3PNTpifmyNlXxPJFV4LeA hA8OsJBqlZ+x0a/D/Z98MOYlmCRJ5QQfCEDgFhRIz+A3Sfx6JKJl/ZAiqsF+HzDXIKKb 9kUQ== X-Gm-Message-State: AOAM530k5rpJgXRhdftMRE+voOpvnZAk1SrMegLMpfmu8JAkUBig0453 sq3AOic+nPFyCMYSFauOc0slIN48rUYL5w== X-Google-Smtp-Source: ABdhPJzRpt54nGYwo/AmMVwVGQneUJIFBBVlhMoEm1c2aqnch/FkFPkJfXePf85np72f8WDCgNjRrmAJTB+Rwg== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:902:c215:: with SMTP id 21mr31892573pll.134.1643850088211; Wed, 02 Feb 2022 17:01:28 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:45 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-18-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 17/23] KVM: x86/mmu: Pass bool flush parameter to drop_large_spte() From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew 
Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org drop_large_spte() drops a large SPTE if it exists and then flushes TLBs. Its helper function, __drop_large_spte(), does the drop without the flush. This difference is not obvious from the name. To make the code more readable, pass an explicit flush parameter. Also replace the vCPU pointer with a KVM pointer so we can get rid of the double-underscore helper function. This is also in preparation for a future commit that will conditionally flush after dropping a large SPTE. No functional change intended. Signed-off-by: David Matlack --- arch/x86/kvm/mmu/mmu.c | 25 +++++++++++-------------- arch/x86/kvm/mmu/paging_tmpl.h | 4 ++-- 2 files changed, 13 insertions(+), 16 deletions(-) diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 99ad7cc8683f..2d47a54e62a5 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -1162,23 +1162,20 @@ static void drop_spte(struct kvm *kvm, u64 *sptep) } -static bool __drop_large_spte(struct kvm *kvm, u64 *sptep) +static void drop_large_spte(struct kvm *kvm, u64 *sptep, bool flush) { - if (is_large_pte(*sptep)) { - WARN_ON(sptep_to_sp(sptep)->role.level == PG_LEVEL_4K); - drop_spte(kvm, sptep); - return true; - } + struct kvm_mmu_page *sp; - return false; -} + if (!is_large_pte(*sptep)) + return; -static void drop_large_spte(struct kvm_vcpu *vcpu, u64 *sptep) -{ - if (__drop_large_spte(vcpu->kvm, sptep)) { - struct kvm_mmu_page *sp = sptep_to_sp(sptep); + sp = sptep_to_sp(sptep); + WARN_ON(sp->role.level == PG_LEVEL_4K); + + drop_spte(kvm, sptep); - kvm_flush_remote_tlbs_with_address(vcpu->kvm, sp->gfn, + if (flush) { + kvm_flush_remote_tlbs_with_address(kvm, sp->gfn, KVM_PAGES_PER_HPAGE(sp->role.level)); } } @@ -3051,7 +3048,7 @@ static int __direct_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault) if (it.level == fault->goal_level) break; - drop_large_spte(vcpu, it.sptep); + drop_large_spte(vcpu->kvm, it.sptep, true); if (is_shadow_present_pte(*it.sptep)) continue; diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h index 703dfb062bf0..ba61de29f2e5 100644 --- a/arch/x86/kvm/mmu/paging_tmpl.h +++ b/arch/x86/kvm/mmu/paging_tmpl.h @@ -677,7 +677,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, gfn_t table_gfn; clear_sp_write_flooding_count(it.sptep); - drop_large_spte(vcpu, it.sptep); + drop_large_spte(vcpu->kvm, it.sptep, true); sp = NULL; if (!is_shadow_present_pte(*it.sptep)) { @@ -739,7 +739,7 @@ static int FNAME(fetch)(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault, validate_direct_spte(vcpu, it.sptep, direct_access); - drop_large_spte(vcpu, it.sptep); + drop_large_spte(vcpu->kvm, it.sptep, true); if (!is_shadow_present_pte(*it.sptep)) { sp = kvm_mmu_get_child_sp(vcpu, it.sptep, base_gfn, From patchwork Thu Feb 3 01:00:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733685 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 75707C433FE for ; Thu, 3 Feb 2022 01:01:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348702AbiBCBBe (ORCPT ); Wed, 2 Feb 2022 20:01:34 -0500 Received: from lindbergh.monkeyblade.net 
([23.128.96.19]:46298 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S239692AbiBCBBa (ORCPT ); Wed, 2 Feb 2022 20:01:30 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 77531C06173B for ; Wed, 2 Feb 2022 17:01:30 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id o72-20020a17090a0a4e00b001b4e5b5b6c0so785235pjo.5 for ; Wed, 02 Feb 2022 17:01:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=Qj3auNadsh6GGRYCWptAJo1hgXqrZQ/CToWtEGrh5UI=; b=D+UAxZxH0WCgCI653hb8Ke7Wg7u4p0T8IW3ql+aF7Ec+PyRXU6pAgLjs239bBQhCAm P/IbMgz6T0O2f82VdgtBer1XD49oG+mcZUVIsA55I/Xr8WXfhXwXv2G0R6yISE2dqAP4 fQwXkBFG+dpZ/xc+m+OpviSSOrN2a5e9hzF/L+MJCRSiF2qb+0sXBFS9JG5WbUlAuSg1 IFJHPspcKGIcbflKkol3UnOn0YWA63RhOU019wJV4u4drL7xuC91xac2pabLnzF3xJr9 V8g1fwWGUdqf/qeWDqbhyhwu8AZ6Onn4pbr54Z+1daZmUO5nhdjS92QWMMHEAijRztnd gW7A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=Qj3auNadsh6GGRYCWptAJo1hgXqrZQ/CToWtEGrh5UI=; b=IYAFkuCkp5rV7fNC4lZZwRbbIXrOCcfF/QgLl3W7NH9RANbrMBX1jN6gx3G4LlkDwX h3zIkOXqpHsZrT4tmDcig/0YnxDQ1ARazkffQYlL5U/e28lF3iqlaselEbjmxXicEs+k aHnazSY5SVdZ3UJFFkgFV6vExqPrc7/8V+vhjzaXX+vpk0PG7Lj8wsHNuIXjLOJpLlN8 4g1imqm3vpIaEcuY7ctTMAO2fH9wZRMJtj2o7jjQcIL0jeoxl5T6eedYZS3zUgUn+Cvg 3DrHRs1v43swf4WM7TYsUofLFVeB6X4nXi32cCEs3kj+/r6z/EJuM3IfBtAZsCb0LmNa r8rA== X-Gm-Message-State: AOAM533eOlcR++xC6T77ZpXlI4gDPMuB+j+Dp+awMXkAm9jBX+vVxahN w8GGi4NKyDVKL8gEjtXAcd6kLIUt3/qYiQ== X-Google-Smtp-Source: ABdhPJyVDpWK+YN6/gICHFQrM3VTkcsJI6WgadaF82ceFKzSZvYl9LlcneTk7O/87iw+9Hr+mAny4ebgXU26Yg== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:902:8602:: with SMTP id f2mr1201878plo.36.1643850089948; Wed, 02 Feb 2022 17:01:29 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:46 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-19-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 18/23] KVM: x86/mmu: Extend Eager Page Splitting to the shadow MMU From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Extend KVM's eager page splitting to also split huge pages that are mapped by the shadow MMU. Specifically, walk through the rmap splitting all 1GiB pages to 2MiB pages, and splitting all 2MiB pages to 4KiB pages. Splitting huge pages mapped by the shadow MMU requries dealing with some extra complexity beyond that of the TDP MMU: (1) The shadow MMU has a limit on the number of shadow pages that are allowed to be allocated. So, as a policy, Eager Page Splitting refuses to split if there are KVM_MIN_FREE_MMU_PAGES or fewer pages available. (2) Huge pages may be mapped by indirect shadow pages which have the possibility of being unsync. 
As a policy we opt not to split such pages as their translation may no longer be valid. (3) Splitting a huge page may end up re-using an existing lower level shadow page tables. This is unlike the TDP MMU which always allocates new shadow page tables when splitting. This commit does *not* handle such aliasing and opts not to split such huge pages. (4) When installing the lower level SPTEs, they must be added to the rmap which may require allocating additional pte_list_desc structs. This commit does *not* handle such cases and instead opts to leave such lower-level SPTEs non-present. In this situation TLBs must be flushed before dropping the MMU lock as a portion of the huge page region is being unmapped. Suggested-by: Peter Feiner [ This commit is based off of the original implementation of Eager Page Splitting from Peter in Google's kernel from 2016. ] Signed-off-by: David Matlack --- .../admin-guide/kernel-parameters.txt | 3 - arch/x86/kvm/mmu/mmu.c | 349 ++++++++++++++++++ 2 files changed, 349 insertions(+), 3 deletions(-) diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt index 1b54e410e206..09d236cb15d6 100644 --- a/Documentation/admin-guide/kernel-parameters.txt +++ b/Documentation/admin-guide/kernel-parameters.txt @@ -2351,9 +2351,6 @@ the KVM_CLEAR_DIRTY ioctl, and only for the pages being cleared. - Eager page splitting currently only supports splitting - huge pages mapped by the TDP MMU. - Default is Y (on). kvm.enable_vmware_backdoor=[KVM] Support VMware backdoor PV interface. diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 2d47a54e62a5..825cfdec589b 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -738,6 +738,11 @@ static void mmu_free_memory_caches(struct kvm_vcpu *vcpu) static struct pte_list_desc *mmu_alloc_pte_list_desc(struct kvm_mmu_memory_cache *cache) { + static const gfp_t gfp_nocache = GFP_ATOMIC | __GFP_ACCOUNT | __GFP_ZERO; + + if (WARN_ON_ONCE(!cache)) + return kmem_cache_alloc(pte_list_desc_cache, gfp_nocache); + return kvm_mmu_memory_cache_alloc(cache); } @@ -754,6 +759,28 @@ static gfn_t kvm_mmu_page_get_gfn(struct kvm_mmu_page *sp, int index) return sp->gfn + (index << ((sp->role.level - 1) * PT64_LEVEL_BITS)); } +static gfn_t sptep_to_gfn(u64 *sptep) +{ + struct kvm_mmu_page *sp = sptep_to_sp(sptep); + + return kvm_mmu_page_get_gfn(sp, sptep - sp->spt); +} + +static unsigned int kvm_mmu_page_get_access(struct kvm_mmu_page *sp, int index) +{ + if (!sp->role.direct) + return sp->shadowed_translation[index].access; + + return sp->role.access; +} + +static unsigned int sptep_to_access(u64 *sptep) +{ + struct kvm_mmu_page *sp = sptep_to_sp(sptep); + + return kvm_mmu_page_get_access(sp, sptep - sp->spt); +} + static void kvm_mmu_page_set_gfn_access(struct kvm_mmu_page *sp, int index, gfn_t gfn, u32 access) { @@ -923,6 +950,41 @@ static int pte_list_add(struct kvm_mmu_memory_cache *cache, u64 *spte, return count; } +static struct kvm_rmap_head *gfn_to_rmap(gfn_t gfn, int level, + const struct kvm_memory_slot *slot); + +static bool pte_list_need_new_desc(struct kvm_rmap_head *rmap_head) +{ + struct pte_list_desc *desc; + + if (!rmap_head->val) + return false; + + if (!(rmap_head->val & 1)) + return true; + + desc = (struct pte_list_desc *)(rmap_head->val & ~1ul); + while (desc->spte_count == PTE_LIST_EXT) { + if (!desc->more) + return true; + desc = desc->more; + } + + return false; +} + +/* + * Return true if the rmap for the given gfn and level needs a new + * 
pte_list_desc struct allocated to add a new spte. + */ +static bool rmap_need_new_pte_list_desc(const struct kvm_memory_slot *slot, + gfn_t gfn, int level) +{ + struct kvm_rmap_head *rmap_head = gfn_to_rmap(gfn, level, slot); + + return pte_list_need_new_desc(rmap_head); +} + static void pte_list_desc_remove_entry(struct kvm_rmap_head *rmap_head, struct pte_list_desc *desc, int i, @@ -2129,6 +2191,24 @@ static struct kvm_mmu_page *kvm_mmu_get_existing_sp_maybe_unsync(struct kvm *kvm return sp; } +static struct kvm_mmu_page *kvm_mmu_get_existing_direct_sp(struct kvm *kvm, + gfn_t gfn, + union kvm_mmu_page_role role) +{ + struct kvm_mmu_page *sp; + LIST_HEAD(invalid_list); + + BUG_ON(!role.direct); + + sp = kvm_mmu_get_existing_sp_maybe_unsync(kvm, gfn, role, &invalid_list); + + /* Direct SPs are never unsync. */ + WARN_ON_ONCE(sp && sp->unsync); + + kvm_mmu_commit_zap_page(kvm, &invalid_list); + return sp; +} + /* * Looks up an existing SP for the given gfn and role if one exists. The * return SP is guaranteed to be synced. @@ -5955,12 +6035,275 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, kvm_arch_flush_remote_tlbs_memslot(kvm, memslot); } + +static int alloc_memory_for_split(struct kvm *kvm, struct kvm_mmu_page **spp, gfp_t gfp) +{ + if (*spp) + return 0; + + *spp = kvm_mmu_alloc_direct_sp_for_split(gfp); + + return *spp ? 0 : -ENOMEM; +} + +static int prepare_to_split_huge_page(struct kvm *kvm, + const struct kvm_memory_slot *slot, + u64 *huge_sptep, + struct kvm_mmu_page **spp, + bool *flush, + bool *dropped_lock) +{ + int r = 0; + + *dropped_lock = false; + + if (kvm_mmu_available_pages(kvm) <= KVM_MIN_FREE_MMU_PAGES) + return -ENOSPC; + + if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) + goto drop_lock; + + r = alloc_memory_for_split(kvm, spp, GFP_NOWAIT | __GFP_ACCOUNT); + if (r) + goto drop_lock; + + return 0; + +drop_lock: + if (*flush) + kvm_arch_flush_remote_tlbs_memslot(kvm, slot); + + *flush = false; + *dropped_lock = true; + + write_unlock(&kvm->mmu_lock); + cond_resched(); + r = alloc_memory_for_split(kvm, spp, GFP_KERNEL_ACCOUNT); + write_lock(&kvm->mmu_lock); + + return r; +} + +static struct kvm_mmu_page *kvm_mmu_get_sp_for_split(struct kvm *kvm, + const struct kvm_memory_slot *slot, + u64 *huge_sptep, + struct kvm_mmu_page **spp) +{ + struct kvm_mmu_page *huge_sp = sptep_to_sp(huge_sptep); + struct kvm_mmu_page *split_sp; + union kvm_mmu_page_role role; + unsigned int access; + gfn_t gfn; + + gfn = sptep_to_gfn(huge_sptep); + access = sptep_to_access(huge_sptep); + + /* + * Huge page splitting always uses direct shadow pages since we are + * directly mapping the huge page GFN region with smaller pages. + */ + role = kvm_mmu_child_role(huge_sp, true, access); + split_sp = kvm_mmu_get_existing_direct_sp(kvm, gfn, role); + + /* + * Opt not to split if the lower-level SP already exists. This requires + * more complex handling as the SP may be already partially filled in + * and may need extra pte_list_desc structs to update parent_ptes. 
+ */ + if (split_sp) + return NULL; + + swap(split_sp, *spp); + kvm_mmu_init_sp(kvm, split_sp, slot, gfn, role); + trace_kvm_mmu_get_page(split_sp, true); + + return split_sp; +} + +static int kvm_mmu_split_huge_page(struct kvm *kvm, + const struct kvm_memory_slot *slot, + u64 *huge_sptep, struct kvm_mmu_page **spp, + bool *flush) + +{ + struct kvm_mmu_page *split_sp; + u64 huge_spte, split_spte; + int split_level, index; + unsigned int access; + u64 *split_sptep; + gfn_t split_gfn; + + split_sp = kvm_mmu_get_sp_for_split(kvm, slot, huge_sptep, spp); + if (!split_sp) + return -EOPNOTSUPP; + + /* + * Since we did not allocate pte_list_desc_structs for the split, we + * cannot add a new parent SPTE to parent_ptes. This should never happen + * in practice though since this is a fresh SP. + * + * Note, this makes it safe to pass NULL to __link_shadow_page() below. + */ + if (WARN_ON_ONCE(pte_list_need_new_desc(&split_sp->parent_ptes))) + return -EINVAL; + + huge_spte = READ_ONCE(*huge_sptep); + + split_level = split_sp->role.level; + access = split_sp->role.access; + + for (index = 0; index < PT64_ENT_PER_PAGE; index++) { + split_sptep = &split_sp->spt[index]; + split_gfn = kvm_mmu_page_get_gfn(split_sp, index); + + BUG_ON(is_shadow_present_pte(*split_sptep)); + + /* + * Since we did not allocate pte_list_desc structs for the + * split, we can't add a new SPTE that maps this GFN. + * Skipping this SPTE means we're only partially mapping the + * huge page, which means we'll need to flush TLBs before + * dropping the MMU lock. + * + * Note, this make it safe to pass NULL to __rmap_add() below. + */ + if (rmap_need_new_pte_list_desc(slot, split_gfn, split_level)) { + *flush = true; + continue; + } + + split_spte = make_huge_page_split_spte( + huge_spte, split_level + 1, index, access); + + mmu_spte_set(split_sptep, split_spte); + __rmap_add(kvm, NULL, slot, split_sptep, split_gfn, access); + } + + /* + * Replace the huge spte with a pointer to the populated lower level + * page table. Since we are making this change without a TLB flush vCPUs + * will see a mix of the split mappings and the original huge mapping, + * depending on what's currently in their TLB. This is fine from a + * correctness standpoint since the translation will be the same either + * way. + */ + drop_large_spte(kvm, huge_sptep, false); + __link_shadow_page(NULL, huge_sptep, split_sp); + + return 0; +} + +static bool should_split_huge_page(u64 *huge_sptep) +{ + struct kvm_mmu_page *huge_sp = sptep_to_sp(huge_sptep); + + if (WARN_ON_ONCE(!is_large_pte(*huge_sptep))) + return false; + + if (huge_sp->role.invalid) + return false; + + /* + * As a policy, do not split huge pages if SP on which they reside + * is unsync. Unsync means the guest is modifying the page table being + * shadowed by huge_sp, so splitting may be a waste of cycles and + * memory. 
+ */ + if (huge_sp->unsync) + return false; + + return true; +} + +static bool rmap_try_split_huge_pages(struct kvm *kvm, + struct kvm_rmap_head *rmap_head, + const struct kvm_memory_slot *slot) +{ + struct kvm_mmu_page *sp = NULL; + struct rmap_iterator iter; + u64 *huge_sptep, spte; + bool flush = false; + bool dropped_lock; + int level; + gfn_t gfn; + int r; + +restart: + for_each_rmap_spte(rmap_head, &iter, huge_sptep) { + if (!should_split_huge_page(huge_sptep)) + continue; + + spte = *huge_sptep; + level = sptep_to_sp(huge_sptep)->role.level; + gfn = sptep_to_gfn(huge_sptep); + + r = prepare_to_split_huge_page(kvm, slot, huge_sptep, &sp, &flush, &dropped_lock); + if (r) { + trace_kvm_mmu_split_huge_page(gfn, spte, level, r); + break; + } + + if (dropped_lock) + goto restart; + + r = kvm_mmu_split_huge_page(kvm, slot, huge_sptep, &sp, &flush); + + trace_kvm_mmu_split_huge_page(gfn, spte, level, r); + + /* + * If splitting is successful we must restart the iterator + * because huge_sptep has just been removed from it. + */ + if (!r) + goto restart; + } + + if (sp) + kvm_mmu_free_sp(sp); + + return flush; +} + +static void kvm_rmap_try_split_huge_pages(struct kvm *kvm, + const struct kvm_memory_slot *slot, + gfn_t start, gfn_t end, + int target_level) +{ + bool flush; + int level; + + /* + * Split huge pages starting with KVM_MAX_HUGEPAGE_LEVEL and working + * down to the target level. This ensures pages are recursively split + * all the way to the target level. There's no need to split pages + * already at the target level. + * + * Note that TLB flushes must be done before dropping the MMU lock since + * rmap_try_split_huge_pages() may partially split any given huge page, + * i.e. it may effectively unmap (make non-present) a portion of the + * huge page. + */ + for (level = KVM_MAX_HUGEPAGE_LEVEL; level > target_level; level--) { + flush = slot_handle_level_range(kvm, slot, + rmap_try_split_huge_pages, + level, level, start, end - 1, + true, flush); + } + + if (flush) + kvm_arch_flush_remote_tlbs_memslot(kvm, slot); +} + /* Must be called with the mmu_lock held in write-mode. 
*/ void kvm_mmu_try_split_huge_pages(struct kvm *kvm, const struct kvm_memory_slot *memslot, u64 start, u64 end, int target_level) { + if (kvm_memslots_have_rmaps(kvm)) + kvm_rmap_try_split_huge_pages(kvm, memslot, start, end, + target_level); + if (is_tdp_mmu_enabled(kvm)) kvm_tdp_mmu_try_split_huge_pages(kvm, memslot, start, end, target_level, false); @@ -5978,6 +6321,12 @@ void kvm_mmu_slot_try_split_huge_pages(struct kvm *kvm, u64 start = memslot->base_gfn; u64 end = start + memslot->npages; + if (kvm_memslots_have_rmaps(kvm)) { + write_lock(&kvm->mmu_lock); + kvm_rmap_try_split_huge_pages(kvm, memslot, start, end, target_level); + write_unlock(&kvm->mmu_lock); + } + if (is_tdp_mmu_enabled(kvm)) { read_lock(&kvm->mmu_lock); kvm_tdp_mmu_try_split_huge_pages(kvm, memslot, start, end, target_level, true); From patchwork Thu Feb 3 01:00:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733687 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9E793C4332F for ; Thu, 3 Feb 2022 01:01:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348716AbiBCBBg (ORCPT ); Wed, 2 Feb 2022 20:01:36 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46306 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348694AbiBCBBc (ORCPT ); Wed, 2 Feb 2022 20:01:32 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1F047C061714 for ; Wed, 2 Feb 2022 17:01:32 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id s9-20020a17090aad8900b001b82d1e4dc8so625315pjq.6 for ; Wed, 02 Feb 2022 17:01:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=z1NP3hwhOriDa3BDG7PiuvhtUnfRoAwi4K6Zz7Br1zg=; b=F3HSzYHHWzUMA5RvJ6GaJZPFElCy7JAP8kGrG5bhqnxBBdO3nK8W7xYvyNXhCcT0+0 pgdlVGsaD6nWT9AKEyOJroLYneDdqLQB0uxCLKwayfy/UGvqnfRFSk9znVaXxynjsKGW EO6aLF1pFPtK63Ed1p0KGfFAP0b/KettJpUO/OKiJb0AixfXqoH1IPgHdJm1Gkx2xLww jhyydleV90ydaLYC+6GJ3VsSiu+7/8yNWMkqW8y9wRVZ77VtZq48551D9nmAmMYbg4nO +AaJt+ZMud1ad67CnQLaJ6nyrDMalYImWrdKfuE4p9TtvXDN44ypmG43k8a6D4LM+JTD 0g6w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=z1NP3hwhOriDa3BDG7PiuvhtUnfRoAwi4K6Zz7Br1zg=; b=IH0sS+7dOIa1VLQb8um7gNK75urNdJ8jdWpwL7W/BOHuOkCi7Af+KScCfYFbtPwul3 1rvxSWV3MVLHlxbWNkPMP4uHMJlGjW4yd1Z41yX6RNLsZ2pgxF5oPF0qdtOkVM4anL3k jm4yvC9kJRfLbJcwL35djryHiDUH6v9jgaGgBi4p4gBxTGC+I3m7o/bh1MofTEffibh7 ++rQHVR9l11uTEe2K9PoG+EUPPHmXUq3AVVmc3gZVZfX6KjtaJpl1Dqhk7FiHBBEnmcd DX8jcicizKIVDYrqcN5jACVJDRl2z5/Vnl1ftXIWV1f7n4ERcqZOooLYt4eraYrg57Bl INqg== X-Gm-Message-State: AOAM533vCh8Vz7ZalM7Po0BJR05TixvlFD6yn7R1rnj/W7R49ujRKTxH Vx63aZPg9xRP56fj/o0JN2adYUIAetuQew== X-Google-Smtp-Source: ABdhPJwSbQ1cMCQWoR30BepRmmTyLOdyZYZTQTgqIaRgKahEhC6344L+GHCQfyOMTnWqsSqqGBRnRpQNgb0WSw== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a62:1d4a:: with SMTP id d71mr31951117pfd.46.1643850091582; Wed, 02 Feb 
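[ A note on the locking in the two hunks just above: the rmap-based splitter added by this patch requires mmu_lock held for write, hence the write_lock()/write_unlock() pair around kvm_rmap_try_split_huge_pages() in kvm_mmu_slot_try_split_huge_pages(), whereas the TDP MMU splitter can run under the read lock; going by the surrounding calls, the trailing true/false passed to kvm_tdp_mmu_try_split_huge_pages() is its flag for whether mmu_lock is held for read (shared). ]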
2022 17:01:31 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:47 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-20-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 19/23] KVM: Allow for different capacities in kvm_mmu_memory_cache structs From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Allow the capacity of the kvm_mmu_memory_cache struct to be chosen at declaration time rather than being fixed for all declarations. This will be used in a follow-up commit to declare an cache in x86 with a capacity of 512+ objects without having to increase the capacity of all caches in KVM. No functional change intended. Signed-off-by: David Matlack --- arch/arm64/include/asm/kvm_host.h | 2 +- arch/arm64/kvm/mmu.c | 12 ++++++------ arch/mips/include/asm/kvm_host.h | 2 +- arch/x86/include/asm/kvm_host.h | 8 ++++---- include/linux/kvm_types.h | 24 ++++++++++++++++++++++-- virt/kvm/kvm_main.c | 8 +++++++- 6 files changed, 41 insertions(+), 15 deletions(-) diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h index 3b44ea17af88..a450b91cc2d9 100644 --- a/arch/arm64/include/asm/kvm_host.h +++ b/arch/arm64/include/asm/kvm_host.h @@ -357,7 +357,7 @@ struct kvm_vcpu_arch { bool pause; /* Cache some mmu pages needed inside spinlock regions */ - struct kvm_mmu_memory_cache mmu_page_cache; + DEFINE_KVM_MMU_MEMORY_CACHE(mmu_page_cache); /* Target CPU and feature flags */ int target; diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c index bc2aba953299..9c853c529b49 100644 --- a/arch/arm64/kvm/mmu.c +++ b/arch/arm64/kvm/mmu.c @@ -765,7 +765,8 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, { phys_addr_t addr; int ret = 0; - struct kvm_mmu_memory_cache cache = { 0, __GFP_ZERO, NULL, }; + DEFINE_KVM_MMU_MEMORY_CACHE(cache) page_cache = {}; + struct kvm_mmu_memory_cache *cache = &page_cache.cache; struct kvm_pgtable *pgt = kvm->arch.mmu.pgt; enum kvm_pgtable_prot prot = KVM_PGTABLE_PROT_DEVICE | KVM_PGTABLE_PROT_R | @@ -774,18 +775,17 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, if (is_protected_kvm_enabled()) return -EPERM; + cache->gfp_zero = __GFP_ZERO; size += offset_in_page(guest_ipa); guest_ipa &= PAGE_MASK; for (addr = guest_ipa; addr < guest_ipa + size; addr += PAGE_SIZE) { - ret = kvm_mmu_topup_memory_cache(&cache, - kvm_mmu_cache_min_pages(kvm)); + ret = kvm_mmu_topup_memory_cache(cache, kvm_mmu_cache_min_pages(kvm)); if (ret) break; spin_lock(&kvm->mmu_lock); - ret = kvm_pgtable_stage2_map(pgt, addr, PAGE_SIZE, pa, prot, - &cache); + ret = kvm_pgtable_stage2_map(pgt, addr, PAGE_SIZE, pa, prot, cache); spin_unlock(&kvm->mmu_lock); if (ret) break; @@ -793,7 +793,7 @@ int kvm_phys_addr_ioremap(struct kvm *kvm, phys_addr_t guest_ipa, pa += PAGE_SIZE; } - kvm_mmu_free_memory_cache(&cache); + kvm_mmu_free_memory_cache(cache); return ret; } diff --git a/arch/mips/include/asm/kvm_host.h b/arch/mips/include/asm/kvm_host.h index 72b90d45a46e..82bbcbc3ead6 100644 --- a/arch/mips/include/asm/kvm_host.h +++ b/arch/mips/include/asm/kvm_host.h @@ -346,7 +346,7 @@ struct kvm_vcpu_arch { 
unsigned long pending_exceptions_clr; /* Cache some mmu pages needed inside spinlock regions */ - struct kvm_mmu_memory_cache mmu_page_cache; + DEFINE_KVM_MMU_MEMORY_CACHE(mmu_page_cache); /* vcpu's vzguestid is different on each host cpu in an smp system */ u32 vzguestid[NR_CPUS]; diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index f00004c13ccf..d0b12bfe5818 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -684,10 +684,10 @@ struct kvm_vcpu_arch { */ struct kvm_mmu *walk_mmu; - struct kvm_mmu_memory_cache mmu_pte_list_desc_cache; - struct kvm_mmu_memory_cache mmu_shadow_page_cache; - struct kvm_mmu_memory_cache mmu_shadowed_translation_cache; - struct kvm_mmu_memory_cache mmu_page_header_cache; + DEFINE_KVM_MMU_MEMORY_CACHE(mmu_pte_list_desc_cache); + DEFINE_KVM_MMU_MEMORY_CACHE(mmu_shadow_page_cache); + DEFINE_KVM_MMU_MEMORY_CACHE(mmu_shadowed_translation_cache); + DEFINE_KVM_MMU_MEMORY_CACHE(mmu_page_header_cache); /* * QEMU userspace and the guest each have their own FPU state. diff --git a/include/linux/kvm_types.h b/include/linux/kvm_types.h index dceac12c1ce5..9575fb8d333f 100644 --- a/include/linux/kvm_types.h +++ b/include/linux/kvm_types.h @@ -78,14 +78,34 @@ struct gfn_to_pfn_cache { * MMU flows is problematic, as is triggering reclaim, I/O, etc... while * holding MMU locks. Note, these caches act more like prefetch buffers than * classical caches, i.e. objects are not returned to the cache on being freed. + * + * The storage for the cache objects is laid out after the struct to allow + * different declarations to choose different capacities. If the capacity field + * is 0, the capacity is assumed to be KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE. */ struct kvm_mmu_memory_cache { int nobjs; + int capacity; gfp_t gfp_zero; struct kmem_cache *kmem_cache; - void *objects[KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE]; + void *objects[0]; }; -#endif + +/* + * Note, if defining a memory cache with a non-default capacity, you must + * initialize the capacity field at runtime. + */ +#define __DEFINE_KVM_MMU_MEMORY_CACHE(_name, _capacity) \ + struct { \ + struct kvm_mmu_memory_cache _name; \ + void *_name##_objects[_capacity]; \ + } + +/* Define a memory cache with the default capacity. */ +#define DEFINE_KVM_MMU_MEMORY_CACHE(_name) \ + __DEFINE_KVM_MMU_MEMORY_CACHE(_name, KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE) + +#endif /* KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE */ #define HALT_POLL_HIST_COUNT 32 diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 034c567a680c..afa4bdb6481e 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -373,11 +373,17 @@ static inline void *mmu_memory_cache_alloc_obj(struct kvm_mmu_memory_cache *mc, int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min) { + int capacity; void *obj; + if (mc->capacity) + capacity = mc->capacity; + else + capacity = KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE; + if (mc->nobjs >= min) return 0; - while (mc->nobjs < ARRAY_SIZE(mc->objects)) { + while (mc->nobjs < capacity) { obj = mmu_memory_cache_alloc_obj(mc, GFP_KERNEL_ACCOUNT); if (!obj) return mc->nobjs >= min ? 
0 : -ENOMEM; From patchwork Thu Feb 3 01:00:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733686 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8D071C433EF for ; Thu, 3 Feb 2022 01:01:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348694AbiBCBBh (ORCPT ); Wed, 2 Feb 2022 20:01:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46316 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348691AbiBCBBd (ORCPT ); Wed, 2 Feb 2022 20:01:33 -0500 Received: from mail-pg1-x549.google.com (mail-pg1-x549.google.com [IPv6:2607:f8b0:4864:20::549]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DB2CEC06173B for ; Wed, 2 Feb 2022 17:01:33 -0800 (PST) Received: by mail-pg1-x549.google.com with SMTP id t18-20020a63dd12000000b00342725203b5so555415pgg.16 for ; Wed, 02 Feb 2022 17:01:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=UjS/m788bWTTP9wb5eU5WUd6J2LTxRg7OTe2uKv1w1A=; b=mxPTE0pM1YB12vcQ+/Q9pS5r6Kr86decoFTctphYYzfqGHdBO/N8MTBwdyzL8Lpp5T tqm7+UvEeI4ufc2jgWj4ajYjLQ29XN2xr7BwrW0l6VAGajxEXno4HJ+rVwEL9ZE1gKtQ Qsk8mIGGCLA2MzIaLVzeAifI2YTS4vC5S7S6GbP/zVhqDW3DnY8WM2gDYxLhSM5KJz+/ d/OeSn0Zob8/qmb8SV+/PH8WIGQy7+obzCaEjyvn0rhLNbG8YDdqcDJOZgjS4zJlmwBr DQG29puh4x+pdPD6iNv77SduYUzAqY44WnjF1AARZ04QIn2uf0nGryeOUg5mibp9lpOU /lUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=UjS/m788bWTTP9wb5eU5WUd6J2LTxRg7OTe2uKv1w1A=; b=GpUJOIdf7EZaG5We4ZiHNjVBeYv20HA6jClv8aXK6+DgfHvtTbHpE9KwaPMEXLdgO7 KkAwKm2IWXdpu1HvQsXEai8Jk9xxsAC5tm4KKf2audrKcM/X6TQ9H/C83nkxEkjhjD3W 9JApdlql64GSEigyzfzYAV0sEcFsmw+ZfWKjRZHD/Lf0t0bgCDVkADGCyJewiHnkpL2h HT5oIy947KXgYA7JAva3h3fMty/eRjJL3FRrigj/66rALPVnGuH6cCw9KlLshHqVWLA8 k1Fv8dsBOienlSYxoekL4gT0o+7saSVvhrc05IdZg6M9mLQs8oiootQm8cn3VwQ4yAxq Cdiw== X-Gm-Message-State: AOAM532kum9gsdRKcVvB5CS/9/7zUGNkY5QoKe26Hauof1JMl4/EYNCE RwQP6cEhKO9iBCVKrkIr4TubDan8osdBUg== X-Google-Smtp-Source: ABdhPJxwkiBhf2xN7XxwPzPE8ihlAl7rE8BHxyR3KnX4FSmKqrPDo008riaRJJwUj1avHHZ8qaLAWXLBcnlsNg== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a05:6a00:1342:: with SMTP id k2mr31952794pfu.20.1643850093337; Wed, 02 Feb 2022 17:01:33 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:48 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-21-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 20/23] KVM: Allow GFP flags to be passed when topping up MMU caches From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org This will be used in a subsequent commit to top-up MMU 
caches under the MMU lock with GFP_NOWAIT as part of eager page splitting. No functional change intended. Signed-off-by: David Matlack Reviewed-by: Ben Gardon --- include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 9 +++++++-- 2 files changed, 8 insertions(+), 2 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index b3810976a27f..128f4c5a8122 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1329,6 +1329,7 @@ void kvm_reload_remote_mmus(struct kvm *kvm); #ifdef KVM_ARCH_NR_OBJS_PER_MEMORY_CACHE int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min); +int __kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min, gfp_t gfp); int kvm_mmu_memory_cache_nr_free_objects(struct kvm_mmu_memory_cache *mc); void kvm_mmu_free_memory_cache(struct kvm_mmu_memory_cache *mc); void *kvm_mmu_memory_cache_alloc(struct kvm_mmu_memory_cache *mc); diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index afa4bdb6481e..c39e7ba21fab 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -371,7 +371,7 @@ static inline void *mmu_memory_cache_alloc_obj(struct kvm_mmu_memory_cache *mc, return (void *)__get_free_page(gfp_flags); } -int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min) +int __kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min, gfp_t gfp) { int capacity; void *obj; @@ -384,7 +384,7 @@ int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min) if (mc->nobjs >= min) return 0; while (mc->nobjs < capacity) { - obj = mmu_memory_cache_alloc_obj(mc, GFP_KERNEL_ACCOUNT); + obj = mmu_memory_cache_alloc_obj(mc, gfp); if (!obj) return mc->nobjs >= min ? 0 : -ENOMEM; mc->objects[mc->nobjs++] = obj; @@ -392,6 +392,11 @@ int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min) return 0; } +int kvm_mmu_topup_memory_cache(struct kvm_mmu_memory_cache *mc, int min) +{ + return __kvm_mmu_topup_memory_cache(mc, min, GFP_KERNEL_ACCOUNT); +} + int kvm_mmu_memory_cache_nr_free_objects(struct kvm_mmu_memory_cache *mc) { return mc->nobjs; From patchwork Thu Feb 3 01:00:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733688 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C64FC433FE for ; Thu, 3 Feb 2022 01:01:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348707AbiBCBBj (ORCPT ); Wed, 2 Feb 2022 20:01:39 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46326 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348709AbiBCBBf (ORCPT ); Wed, 2 Feb 2022 20:01:35 -0500 Received: from mail-pg1-x54a.google.com (mail-pg1-x54a.google.com [IPv6:2607:f8b0:4864:20::54a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8A2A0C061714 for ; Wed, 2 Feb 2022 17:01:35 -0800 (PST) Received: by mail-pg1-x54a.google.com with SMTP id 125-20020a630383000000b0035d88cc4fedso549099pgd.20 for ; Wed, 02 Feb 2022 17:01:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=ZsqQTkXhg8Ie3Ecu3y1aWjvt5l3mqJi5K8xgwpfLxrw=; b=NI/XjNPfaKWmpX4pvyx6v2BFMvv8n0NI0tsjEZjaLJzKWqofQ3CeXyQ3s2PjoPZzqJ 
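A short usage sketch of the helper added above, condensed from how the later huge page splitting patches end up calling it (an illustration of the intended pattern, not code from this patch): under mmu_lock only non-sleeping allocations are safe, so the cache is topped up opportunistically with GFP_NOWAIT, and on failure the caller drops the lock and retries with the normal sleeping flags.

	r = __kvm_mmu_topup_memory_cache(cache, min, GFP_NOWAIT | __GFP_ACCOUNT);
	if (r) {
		write_unlock(&kvm->mmu_lock);
		cond_resched();
		r = __kvm_mmu_topup_memory_cache(cache, min, GFP_KERNEL_ACCOUNT);
		write_lock(&kvm->mmu_lock);
	}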
BHk4HvtE+5o3gFTr86LFtv6PVouPnXCnAt1hz32h3yH3cKQ2XbMoUmTj3FGpPLlVspSI LNGPBXQRFjAIZmw67hYNOIitCNv3GD1LGFsfw0daPzmt1zE5QcOB5gZT7esj0E7UlBKI AUeLhTtkk6yAFiirE8ODKRJrdsIZjNSd1YmNCy8DOVGbpfmvDLo+pLp8p8fS766qJAUG uJCCbFPu2NK4nw1k/adJUVcZiKdHqk8Wt2OZh6XU6z91A0TOx1uOe7dlQI9lRooSNVN/ SiTw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=ZsqQTkXhg8Ie3Ecu3y1aWjvt5l3mqJi5K8xgwpfLxrw=; b=tOuFEsczdvHjhoHXaEU4E+iQVAPh6PYQY6UPQYXpiMC+Qtd6npl5f3FwIIoNv5/k5W 0ooIZt3j5df7oJSUdlcoRuUb5eQfRcdhZkbj9sMBojWqz7D7yuw95cViHM587myiw5jt 1+Dc/29j+Qu3bHQuO2lnjEFpTumE5ke7a+6rKX+kFhIpYlzmxjXWpZfrcMbVgxS9ep8t 1vRnMJ4CxxyncUjr7n9uAe858J5t7XQF8KU5HU3kpb+n3aF3Kyr/GAI4kQjGE+j1v49P 7otFGFHO9AlIr2WH7k+Re98Drx6El7yeN7BvMYo3KjDzQVZz7ZHPX97c4OmJJr8knjjv J0ZA== X-Gm-Message-State: AOAM531wdUIM0a01EtCjSOPmyu/mGcTv/J5Ny6qsO1/K12cKjm7uXlRL NqxFl9EdU28WwLqApj2254aXcWCe3Z4zMw== X-Google-Smtp-Source: ABdhPJy9ejZI4VsRf2Kga9TJenxF4gjw8WSy+IUooYXexJ3+3Ti7kdlLU8J9uhVbm4mXyrp35b13oFdYrQSt5w== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:aa7:9634:: with SMTP id r20mr31763851pfg.57.1643850095023; Wed, 02 Feb 2022 17:01:35 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:49 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-22-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 21/23] KVM: x86/mmu: Fully split huge pages that require extra pte_list_desc structs From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org When splitting a huge page we need to add all of the lower level SPTEs to the memslot rmap. The current implementation of eager page splitting bails if adding an SPTE would require allocating an extra pte_list_desc struct. Fix this limitation by allocating enough pte_list_desc structs before splitting the huge page. This eliminates the need for TLB flushing under the MMU lock because the huge page is always entirely split (no subregion of the huge page is unmapped). Signed-off-by: David Matlack --- arch/x86/include/asm/kvm_host.h | 10 ++++ arch/x86/kvm/mmu/mmu.c | 101 ++++++++++++++++++-------------- 2 files changed, 67 insertions(+), 44 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index d0b12bfe5818..a0f7578f7a26 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1232,6 +1232,16 @@ struct kvm_arch { hpa_t hv_root_tdp; spinlock_t hv_root_tdp_lock; #endif + + /* + * Memory cache used to allocate pte_list_desc structs while splitting + * huge pages. In the worst case, to split one huge page we need 512 + * pte_list_desc structs to add each new lower level leaf sptep to the + * memslot rmap. 
+ */ +#define HUGE_PAGE_SPLIT_DESC_CACHE_CAPACITY 512 + __DEFINE_KVM_MMU_MEMORY_CACHE(huge_page_split_desc_cache, + HUGE_PAGE_SPLIT_DESC_CACHE_CAPACITY); }; struct kvm_vm_stat { diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index 825cfdec589b..c7981a934237 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -5905,6 +5905,11 @@ void kvm_mmu_init_vm(struct kvm *kvm) node->track_write = kvm_mmu_pte_write; node->track_flush_slot = kvm_mmu_invalidate_zap_pages_in_memslot; kvm_page_track_register_notifier(kvm, node); + + kvm->arch.huge_page_split_desc_cache.capacity = + HUGE_PAGE_SPLIT_DESC_CACHE_CAPACITY; + kvm->arch.huge_page_split_desc_cache.kmem_cache = pte_list_desc_cache; + kvm->arch.huge_page_split_desc_cache.gfp_zero = __GFP_ZERO; } void kvm_mmu_uninit_vm(struct kvm *kvm) @@ -6035,9 +6040,42 @@ void kvm_mmu_slot_remove_write_access(struct kvm *kvm, kvm_arch_flush_remote_tlbs_memslot(kvm, memslot); } +static int min_descs_for_split(const struct kvm_memory_slot *slot, u64 *huge_sptep) +{ + struct kvm_mmu_page *huge_sp = sptep_to_sp(huge_sptep); + int split_level = huge_sp->role.level - 1; + int i, min = 0; + gfn_t gfn; + + gfn = kvm_mmu_page_get_gfn(huge_sp, huge_sptep - huge_sp->spt); -static int alloc_memory_for_split(struct kvm *kvm, struct kvm_mmu_page **spp, gfp_t gfp) + for (i = 0; i < PT64_ENT_PER_PAGE; i++) { + if (rmap_need_new_pte_list_desc(slot, gfn, split_level)) + min++; + + gfn += KVM_PAGES_PER_HPAGE(split_level); + } + + return min; +} + +static int topup_huge_page_split_desc_cache(struct kvm *kvm, int min, gfp_t gfp) +{ + struct kvm_mmu_memory_cache *cache = + &kvm->arch.huge_page_split_desc_cache; + + return __kvm_mmu_topup_memory_cache(cache, min, gfp); +} + +static int alloc_memory_for_split(struct kvm *kvm, struct kvm_mmu_page **spp, + int min_descs, gfp_t gfp) { + int r; + + r = topup_huge_page_split_desc_cache(kvm, min_descs, gfp); + if (r) + return r; + if (*spp) return 0; @@ -6050,9 +6088,9 @@ static int prepare_to_split_huge_page(struct kvm *kvm, const struct kvm_memory_slot *slot, u64 *huge_sptep, struct kvm_mmu_page **spp, - bool *flush, bool *dropped_lock) { + int min_descs = min_descs_for_split(slot, huge_sptep); int r = 0; *dropped_lock = false; @@ -6063,22 +6101,18 @@ static int prepare_to_split_huge_page(struct kvm *kvm, if (need_resched() || rwlock_needbreak(&kvm->mmu_lock)) goto drop_lock; - r = alloc_memory_for_split(kvm, spp, GFP_NOWAIT | __GFP_ACCOUNT); + r = alloc_memory_for_split(kvm, spp, min_descs, GFP_NOWAIT | __GFP_ACCOUNT); if (r) goto drop_lock; return 0; drop_lock: - if (*flush) - kvm_arch_flush_remote_tlbs_memslot(kvm, slot); - - *flush = false; *dropped_lock = true; write_unlock(&kvm->mmu_lock); cond_resched(); - r = alloc_memory_for_split(kvm, spp, GFP_KERNEL_ACCOUNT); + r = alloc_memory_for_split(kvm, spp, min_descs, GFP_KERNEL_ACCOUNT); write_lock(&kvm->mmu_lock); return r; @@ -6122,10 +6156,10 @@ static struct kvm_mmu_page *kvm_mmu_get_sp_for_split(struct kvm *kvm, static int kvm_mmu_split_huge_page(struct kvm *kvm, const struct kvm_memory_slot *slot, - u64 *huge_sptep, struct kvm_mmu_page **spp, - bool *flush) + u64 *huge_sptep, struct kvm_mmu_page **spp) { + struct kvm_mmu_memory_cache *cache; struct kvm_mmu_page *split_sp; u64 huge_spte, split_spte; int split_level, index; @@ -6138,9 +6172,9 @@ static int kvm_mmu_split_huge_page(struct kvm *kvm, return -EOPNOTSUPP; /* - * Since we did not allocate pte_list_desc_structs for the split, we - * cannot add a new parent SPTE to parent_ptes. 
This should never happen - * in practice though since this is a fresh SP. + * We did not allocate an extra pte_list_desc struct to add huge_sptep + * to split_sp->parent_ptes. An extra pte_list_desc struct should never + * be necessary in practice though since split_sp is brand new. * * Note, this makes it safe to pass NULL to __link_shadow_page() below. */ @@ -6151,6 +6185,7 @@ static int kvm_mmu_split_huge_page(struct kvm *kvm, split_level = split_sp->role.level; access = split_sp->role.access; + cache = &kvm->arch.huge_page_split_desc_cache; for (index = 0; index < PT64_ENT_PER_PAGE; index++) { split_sptep = &split_sp->spt[index]; @@ -6158,25 +6193,11 @@ static int kvm_mmu_split_huge_page(struct kvm *kvm, BUG_ON(is_shadow_present_pte(*split_sptep)); - /* - * Since we did not allocate pte_list_desc structs for the - * split, we can't add a new SPTE that maps this GFN. - * Skipping this SPTE means we're only partially mapping the - * huge page, which means we'll need to flush TLBs before - * dropping the MMU lock. - * - * Note, this make it safe to pass NULL to __rmap_add() below. - */ - if (rmap_need_new_pte_list_desc(slot, split_gfn, split_level)) { - *flush = true; - continue; - } - split_spte = make_huge_page_split_spte( huge_spte, split_level + 1, index, access); mmu_spte_set(split_sptep, split_spte); - __rmap_add(kvm, NULL, slot, split_sptep, split_gfn, access); + __rmap_add(kvm, cache, slot, split_sptep, split_gfn, access); } /* @@ -6222,7 +6243,6 @@ static bool rmap_try_split_huge_pages(struct kvm *kvm, struct kvm_mmu_page *sp = NULL; struct rmap_iterator iter; u64 *huge_sptep, spte; - bool flush = false; bool dropped_lock; int level; gfn_t gfn; @@ -6237,7 +6257,7 @@ static bool rmap_try_split_huge_pages(struct kvm *kvm, level = sptep_to_sp(huge_sptep)->role.level; gfn = sptep_to_gfn(huge_sptep); - r = prepare_to_split_huge_page(kvm, slot, huge_sptep, &sp, &flush, &dropped_lock); + r = prepare_to_split_huge_page(kvm, slot, huge_sptep, &sp, &dropped_lock); if (r) { trace_kvm_mmu_split_huge_page(gfn, spte, level, r); break; @@ -6246,7 +6266,7 @@ static bool rmap_try_split_huge_pages(struct kvm *kvm, if (dropped_lock) goto restart; - r = kvm_mmu_split_huge_page(kvm, slot, huge_sptep, &sp, &flush); + r = kvm_mmu_split_huge_page(kvm, slot, huge_sptep, &sp); trace_kvm_mmu_split_huge_page(gfn, spte, level, r); @@ -6261,7 +6281,7 @@ static bool rmap_try_split_huge_pages(struct kvm *kvm, if (sp) kvm_mmu_free_sp(sp); - return flush; + return false; } static void kvm_rmap_try_split_huge_pages(struct kvm *kvm, @@ -6269,7 +6289,6 @@ static void kvm_rmap_try_split_huge_pages(struct kvm *kvm, gfn_t start, gfn_t end, int target_level) { - bool flush; int level; /* @@ -6277,21 +6296,15 @@ static void kvm_rmap_try_split_huge_pages(struct kvm *kvm, * down to the target level. This ensures pages are recursively split * all the way to the target level. There's no need to split pages * already at the target level. - * - * Note that TLB flushes must be done before dropping the MMU lock since - * rmap_try_split_huge_pages() may partially split any given huge page, - * i.e. it may effectively unmap (make non-present) a portion of the - * huge page. 
*/ for (level = KVM_MAX_HUGEPAGE_LEVEL; level > target_level; level--) { - flush = slot_handle_level_range(kvm, slot, - rmap_try_split_huge_pages, - level, level, start, end - 1, - true, flush); + slot_handle_level_range(kvm, slot, + rmap_try_split_huge_pages, + level, level, start, end - 1, + true, false); } - if (flush) - kvm_arch_flush_remote_tlbs_memslot(kvm, slot); + kvm_mmu_free_memory_cache(&kvm->arch.huge_page_split_desc_cache); } /* Must be called with the mmu_lock held in write-mode. */ From patchwork Thu Feb 3 01:00:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733689 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7C30BC433F5 for ; Thu, 3 Feb 2022 01:01:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348696AbiBCBBk (ORCPT ); Wed, 2 Feb 2022 20:01:40 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46338 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348697AbiBCBBh (ORCPT ); Wed, 2 Feb 2022 20:01:37 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 65119C061714 for ; Wed, 2 Feb 2022 17:01:37 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id e7-20020a17090ac20700b001b586e65885so5620041pjt.1 for ; Wed, 02 Feb 2022 17:01:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=4swT+J8ld9uECCywNjrn0CGH0iBu0/rxiDB3j6YxIS0=; b=kfNqUaNs01ebgU8QLWrYhMQRA/AvWFWeCTqp5ANPgZ/RXtJJAfsu2pgCQopjAyVBcZ TE8vYvM1sUcUItdVQN7Mpkdo0Z1vs1Sahb/+D3SjP965XKegSiCpGtGGgSOYFqGQmgK8 PuWbRHX5h22dVJZPd/odXFtpOTeyxjROCIBwzwoJYSaGE0GnolxmXjUyfNPGbCON/1XR HC1zLnQzuKSh+bLKGTzVGNpoRAYLgmq21cQ1vpgYUDLFDx5oqIoz2reW3CFO9IErhr2a MUD8IRahRtaQlvTUhxVBZkN7kWZW1hGQTBNP6AbpqHWY8EVACqpPgWqFOUthII+ADSWI uLpg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=4swT+J8ld9uECCywNjrn0CGH0iBu0/rxiDB3j6YxIS0=; b=WQx1TG+9tDCjhVVMm9ias6/wFEqo7YeLIctk67FP6h792/JamC0S1fdL3w4f17r0JE fMqC4ULjFqZUO6vYfzsRSAfNeXztXUJeTIyuAkDZ2TRdZ2RB4rN3s3ZEnAu12McMUu0c W9dKU4rbZbiZJhn/C5Hv8DerMDunxUwHeumXTfWhJBgK7WoS/UNHAePgx+CYaOtMNZRP 09twuqLyPxNTsNQnTTrq95WOs0Fj3jH1ZiKk2psCJdmCO+Q28wi5NxdGVfjqeoyplPzu yH9ucnFpH3JjiJ+MTJ08FnIFy5zzexSdrtT38ljU8mvL96yCRZ3xohkXOAgZ/5lmAylo 0NSQ== X-Gm-Message-State: AOAM532Ocw+H+d2AljuGddyEck7VMFIMolikX3FRsghW48e3ef6jJpAv IXA8Rr8S6++Y1KwMGHnAV231SZEcNUE07A== X-Google-Smtp-Source: ABdhPJx03leHb2o6bWu+D9mBJEXVE5KqKiQXhRKd7gQXpvE94sUCp1IsPetMT7WDyP+Bg6HLUQsfr94gphwo2g== X-Received: from dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:90b:4c8e:: with SMTP id my14mr1189912pjb.0.1643850096420; Wed, 02 Feb 2022 17:01:36 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:50 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-23-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 
2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 22/23] KVM: x86/mmu: Split huge pages aliased by multiple SPTEs From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org The existing huge page splitting code bails if it encounters a huge page that is aliased by another SPTE that has already been split (either due to NX huge pages or eager page splitting). Extend the huge page splitting code to also handle such aliases. The thing we have to be careful about is dealing with what's already in the lower level page table. If eager page splitting was the only operation that split huge pages, this would be fine. However huge pages can also be split by NX huge pages. This means the lower level page table may only be partially filled in and may point to even lower level page tables that are partially filled in. We can fill in the rest of the page table but dealing with the lower level page tables would be too complex. To handle this we flush TLBs after dropping the huge SPTE whenever we are about to install a lower level page table that was partially filled in (*). We can skip the TLB flush if the lower level page table was empty (no aliasing) or identical to what we were already going to populate it with (aliased huge page that was just eagerly split). (*) This TLB flush could probably be delayed until we're about to drop the MMU lock, which would also let us batch flushes for multiple splits. However such scenarios should be rare in practice (a huge page must be aliased in multiple SPTEs and have been split for NX Huge Pages in only some of them). Flushing immediately is simpler to plumb and also reduces the chances of tripping over a CPU bug (e.g. see iTLB multi-hit). Signed-off-by: David Matlack --- arch/x86/include/asm/kvm_host.h | 5 ++- arch/x86/kvm/mmu/mmu.c | 77 +++++++++++++++------------------ 2 files changed, 38 insertions(+), 44 deletions(-) diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h index a0f7578f7a26..c11f27f38981 100644 --- a/arch/x86/include/asm/kvm_host.h +++ b/arch/x86/include/asm/kvm_host.h @@ -1237,9 +1237,10 @@ struct kvm_arch { * Memory cache used to allocate pte_list_desc structs while splitting * huge pages. In the worst case, to split one huge page we need 512 * pte_list_desc structs to add each new lower level leaf sptep to the - * memslot rmap. + * memslot rmap plus 1 to extend the parent_ptes rmap of the new lower + * level page table. */ -#define HUGE_PAGE_SPLIT_DESC_CACHE_CAPACITY 512 +#define HUGE_PAGE_SPLIT_DESC_CACHE_CAPACITY 513 __DEFINE_KVM_MMU_MEMORY_CACHE(huge_page_split_desc_cache, HUGE_PAGE_SPLIT_DESC_CACHE_CAPACITY); }; diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c index c7981a934237..62fbff8979ba 100644 --- a/arch/x86/kvm/mmu/mmu.c +++ b/arch/x86/kvm/mmu/mmu.c @@ -6056,7 +6056,8 @@ static int min_descs_for_split(const struct kvm_memory_slot *slot, u64 *huge_spt gfn += KVM_PAGES_PER_HPAGE(split_level); } - return min; + /* Plus 1 to extend the parent_ptes rmap of the lower level SP. 
*/ + return min + 1; } static int topup_huge_page_split_desc_cache(struct kvm *kvm, int min, gfp_t gfp) @@ -6126,6 +6127,7 @@ static struct kvm_mmu_page *kvm_mmu_get_sp_for_split(struct kvm *kvm, struct kvm_mmu_page *huge_sp = sptep_to_sp(huge_sptep); struct kvm_mmu_page *split_sp; union kvm_mmu_page_role role; + bool created = false; unsigned int access; gfn_t gfn; @@ -6138,25 +6140,21 @@ static struct kvm_mmu_page *kvm_mmu_get_sp_for_split(struct kvm *kvm, */ role = kvm_mmu_child_role(huge_sp, true, access); split_sp = kvm_mmu_get_existing_direct_sp(kvm, gfn, role); - - /* - * Opt not to split if the lower-level SP already exists. This requires - * more complex handling as the SP may be already partially filled in - * and may need extra pte_list_desc structs to update parent_ptes. - */ if (split_sp) - return NULL; + goto out; + created = true; swap(split_sp, *spp); kvm_mmu_init_sp(kvm, split_sp, slot, gfn, role); - trace_kvm_mmu_get_page(split_sp, true); +out: + trace_kvm_mmu_get_page(split_sp, created); return split_sp; } -static int kvm_mmu_split_huge_page(struct kvm *kvm, - const struct kvm_memory_slot *slot, - u64 *huge_sptep, struct kvm_mmu_page **spp) +static void kvm_mmu_split_huge_page(struct kvm *kvm, + const struct kvm_memory_slot *slot, + u64 *huge_sptep, struct kvm_mmu_page **spp) { struct kvm_mmu_memory_cache *cache; @@ -6164,22 +6162,11 @@ static int kvm_mmu_split_huge_page(struct kvm *kvm, u64 huge_spte, split_spte; int split_level, index; unsigned int access; + bool flush = false; u64 *split_sptep; gfn_t split_gfn; split_sp = kvm_mmu_get_sp_for_split(kvm, slot, huge_sptep, spp); - if (!split_sp) - return -EOPNOTSUPP; - - /* - * We did not allocate an extra pte_list_desc struct to add huge_sptep - * to split_sp->parent_ptes. An extra pte_list_desc struct should never - * be necessary in practice though since split_sp is brand new. - * - * Note, this makes it safe to pass NULL to __link_shadow_page() below. - */ - if (WARN_ON_ONCE(pte_list_need_new_desc(&split_sp->parent_ptes))) - return -EINVAL; huge_spte = READ_ONCE(*huge_sptep); @@ -6191,7 +6178,20 @@ static int kvm_mmu_split_huge_page(struct kvm *kvm, split_sptep = &split_sp->spt[index]; split_gfn = kvm_mmu_page_get_gfn(split_sp, index); - BUG_ON(is_shadow_present_pte(*split_sptep)); + /* + * split_sp may have populated page table entries if this huge + * page is aliased in multiple shadow page table entries. We + * know the existing SP will be mapping the same GFN->PFN + * translation since this is a direct SP. However, the SPTE may + * point to an even lower level page table that may only be + * partially filled in (e.g. for NX huge pages). In other words, + * we may be unmapping a portion of the huge page, which + * requires a TLB flush. + */ + if (is_shadow_present_pte(*split_sptep)) { + flush |= !is_last_spte(*split_sptep, split_level); + continue; + } split_spte = make_huge_page_split_spte( huge_spte, split_level + 1, index, access); @@ -6202,16 +6202,12 @@ static int kvm_mmu_split_huge_page(struct kvm *kvm, /* * Replace the huge spte with a pointer to the populated lower level - * page table. Since we are making this change without a TLB flush vCPUs - * will see a mix of the split mappings and the original huge mapping, - * depending on what's currently in their TLB. This is fine from a - * correctness standpoint since the translation will be the same either - * way. + * page table. If the lower-level page table indentically maps the huge + * page, there's no need for a TLB flush. 
Otherwise, flush TLBs after + * dropping the huge page and before installing the shadow page table. */ - drop_large_spte(kvm, huge_sptep, false); - __link_shadow_page(NULL, huge_sptep, split_sp); - - return 0; + drop_large_spte(kvm, huge_sptep, flush); + __link_shadow_page(cache, huge_sptep, split_sp); } static bool should_split_huge_page(u64 *huge_sptep) @@ -6266,16 +6262,13 @@ static bool rmap_try_split_huge_pages(struct kvm *kvm, if (dropped_lock) goto restart; - r = kvm_mmu_split_huge_page(kvm, slot, huge_sptep, &sp); - - trace_kvm_mmu_split_huge_page(gfn, spte, level, r); - /* - * If splitting is successful we must restart the iterator - * because huge_sptep has just been removed from it. + * After splitting we must restart the iterator because + * huge_sptep has just been removed from it. */ - if (!r) - goto restart; + kvm_mmu_split_huge_page(kvm, slot, huge_sptep, &sp); + trace_kvm_mmu_split_huge_page(gfn, spte, level, 0); + goto restart; } if (sp) From patchwork Thu Feb 3 01:00:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: David Matlack X-Patchwork-Id: 12733690 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 65CFCC433EF for ; Thu, 3 Feb 2022 01:01:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348713AbiBCBBl (ORCPT ); Wed, 2 Feb 2022 20:01:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:46342 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348704AbiBCBBj (ORCPT ); Wed, 2 Feb 2022 20:01:39 -0500 Received: from mail-pj1-x104a.google.com (mail-pj1-x104a.google.com [IPv6:2607:f8b0:4864:20::104a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D4DEBC06173B for ; Wed, 2 Feb 2022 17:01:38 -0800 (PST) Received: by mail-pj1-x104a.google.com with SMTP id o72-20020a17090a0a4e00b001b4e5b5b6c0so785540pjo.5 for ; Wed, 02 Feb 2022 17:01:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=date:in-reply-to:message-id:mime-version:references:subject:from:to :cc; bh=kP0Hm9+AffHHY/1UdAipn0JE+3nNfhLxmoZeHFaZkJU=; b=gR6j3Ofw/NicTnv99Oi6LGqf5cf8CMgKZ1eRIInU7crAz8OFrKvq7Q5cRh+aYm86BN 8aTO9+YKu1yPL2FRcV1q8kutU4NcQyJYilPA6VhNvu8VFfW7lW2n5PSxmn0FX7Bq1L9M TtUJZKLjoz/Gw1flC8hOdId7GOExeLKibk1oqxsbIyHMi7OXinbEoS/Z565/OVVX4kQA RX3x9KLa81B97L+BEnutq7y/SXN5HHrJWxpWhth4QNjyFnDk4a5zUTGFZaW/BPt+t+lZ Jw0Qa6dZ0PpWWYYk7t/ZpDm+LMG76DZSJOY7LWqqXPDfGnThULwA37SnKKsYR8P0fVN0 bENw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=x-gm-message-state:date:in-reply-to:message-id:mime-version :references:subject:from:to:cc; bh=kP0Hm9+AffHHY/1UdAipn0JE+3nNfhLxmoZeHFaZkJU=; b=7WDkBpFc1OR9fT2zLf6knPKFJIEHud4QdTN0YOUnnvkU7n15PXmWFb12DILW5VWeL+ CWx081xl8no7cfGm+wcXN6SJMTVDuZIpMMhI2KEiZuFRb/C/GkYjfuwqx98lnUxksGCr xBgxKZQlAa07wkP3BlbGj0SPi0eNg30EplrKc2G0WSiQLVB80MfnKSbruPjVMLfTGblb GU889YU+bgfsnLak/KXrLx4HT4elNoFtCFrWq0r9PIbFRzpqOqM5lfKtHlWJPTPNIC3M WA8+H1uTRzhLuQJFdjY9qRo/ig57dFwmAEqqZid5mONpRW9rtsb+iIx7aj7nzxtIua5+ igOA== X-Gm-Message-State: AOAM533YN2wHe+HsrIhUEmfq4rahbWdt5ngslHoTVjdEhcYBqsSSHqa1 nYL8or9go8MwAg8h56ps8lUwXT2ivEaDLA== X-Google-Smtp-Source: ABdhPJwkhnq+CDnvbVowYI8RemBupjY7lQI+1x0zr8ztW1F+4vQfp1jNo8twVWIYvR3k46Z0onZn8MnZL+hM1A== X-Received: from 
dmatlack-heavy.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:19cd]) (user=dmatlack job=sendgmr) by 2002:a17:902:e844:: with SMTP id t4mr32917485plg.104.1643850098316; Wed, 02 Feb 2022 17:01:38 -0800 (PST) Date: Thu, 3 Feb 2022 01:00:51 +0000 In-Reply-To: <20220203010051.2813563-1-dmatlack@google.com> Message-Id: <20220203010051.2813563-24-dmatlack@google.com> Mime-Version: 1.0 References: <20220203010051.2813563-1-dmatlack@google.com> X-Mailer: git-send-email 2.35.0.rc2.247.g8bbb082509-goog Subject: [PATCH 23/23] KVM: selftests: Map x86_64 guest virtual memory with huge pages From: David Matlack To: Paolo Bonzini Cc: Marc Zyngier , Huacai Chen , leksandar Markovic , Sean Christopherson , Vitaly Kuznetsov , Peter Xu , Wanpeng Li , Jim Mattson , Joerg Roedel , Peter Feiner , Andrew Jones , maciej.szmigiero@oracle.com, kvm@vger.kernel.org, David Matlack Precedence: bulk List-ID: X-Mailing-List: kvm@vger.kernel.org Override virt_map() in x86_64 selftests to use the largest page size possible when mapping guest virtual memory. This enables testing eager page splitting with shadow paging (e.g. kvm_intel.ept=N), as it allows KVM to shadow guest memory with huge pages. Signed-off-by: David Matlack --- .../selftests/kvm/include/x86_64/processor.h | 6 ++++ tools/testing/selftests/kvm/lib/kvm_util.c | 4 +-- .../selftests/kvm/lib/x86_64/processor.c | 31 +++++++++++++++++++ 3 files changed, 39 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/kvm/include/x86_64/processor.h b/tools/testing/selftests/kvm/include/x86_64/processor.h index 8a470da7b71a..0d6014b7eaf0 100644 --- a/tools/testing/selftests/kvm/include/x86_64/processor.h +++ b/tools/testing/selftests/kvm/include/x86_64/processor.h @@ -465,6 +465,12 @@ enum x86_page_size { X86_PAGE_SIZE_2M, X86_PAGE_SIZE_1G, }; + +static inline size_t page_size_bytes(enum x86_page_size page_size) +{ + return 1UL << (page_size * 9 + 12); +} + void __virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, enum x86_page_size page_size); diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index d8cf851ab119..33c4a43bffcd 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -1393,8 +1393,8 @@ vm_vaddr_t vm_vaddr_alloc_page(struct kvm_vm *vm) * Within the VM given by @vm, creates a virtual translation for * @npages starting at @vaddr to the page range starting at @paddr. 
*/ -void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, - unsigned int npages) +void __weak virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, + unsigned int npages) { size_t page_size = vm->page_size; size_t size = npages * page_size; diff --git a/tools/testing/selftests/kvm/lib/x86_64/processor.c b/tools/testing/selftests/kvm/lib/x86_64/processor.c index 9f000dfb5594..7df84292d5de 100644 --- a/tools/testing/selftests/kvm/lib/x86_64/processor.c +++ b/tools/testing/selftests/kvm/lib/x86_64/processor.c @@ -282,6 +282,37 @@ void virt_pg_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr) __virt_pg_map(vm, vaddr, paddr, X86_PAGE_SIZE_4K); } +void virt_map(struct kvm_vm *vm, uint64_t vaddr, uint64_t paddr, unsigned int npages) +{ + size_t size = (size_t) npages * vm->page_size; + size_t vend = vaddr + size; + enum x86_page_size page_size; + size_t stride; + + TEST_ASSERT(vaddr + size > vaddr, "Vaddr overflow"); + TEST_ASSERT(paddr + size > paddr, "Paddr overflow"); + + /* + * Map the region with all 1G pages if possible, falling back to all + * 2M pages, and finally all 4K pages. This could be improved to use + * a mix of page sizes so that more of the region is mapped with large + * pages. + */ + for (page_size = X86_PAGE_SIZE_1G; page_size >= X86_PAGE_SIZE_4K; page_size--) { + stride = page_size_bytes(page_size); + + if (!(vaddr % stride) && !(paddr % stride) && !(size % stride)) + break; + } + + TEST_ASSERT(page_size >= X86_PAGE_SIZE_4K, + "Cannot map unaligned region: vaddr 0x%lx paddr 0x%lx npages 0x%x\n", + vaddr, paddr, npages); + + for (; vaddr < vend; vaddr += stride, paddr += stride) + __virt_pg_map(vm, vaddr, paddr, page_size); +} + static struct pageTableEntry *_vm_get_page_table_entry(struct kvm_vm *vm, int vcpuid, uint64_t vaddr) {
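To make the page-size arithmetic above concrete: page_size_bytes() turns the x86_page_size enum into a byte count by shifting, since each x86 paging level covers 9 more address bits (512 entries per table). Working through the three cases (with X86_PAGE_SIZE_4K as the first enumerator, i.e. 0):

	page_size_bytes(X86_PAGE_SIZE_4K) = 1UL << (0 * 9 + 12) = 4096          /* 4 KiB */
	page_size_bytes(X86_PAGE_SIZE_2M) = 1UL << (1 * 9 + 12) = 2097152       /* 2 MiB */
	page_size_bytes(X86_PAGE_SIZE_1G) = 1UL << (2 * 9 + 12) = 1073741824    /* 1 GiB */

So a virt_map() call whose vaddr, paddr, and total size are all 1GiB-aligned gets mapped entirely with 1GiB pages, a region that is only 2MiB-aligned falls back to 2MiB pages, and anything else is mapped with 4KiB pages, which is exactly what the loop over page_size selects.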