[v4,08/18] KVM: x86/mmu: Support GFN direct bits

From: Isaku Yamahata <isaku.yamahata@intel.com>

From: Isaku Yamahata <isaku.yamahata@intel.com>

Teach the MMU to map guest GFNs at a massaged position on the TDP, to aid
in implementing TDX shared memory.

Like other Coco technologies, TDX has the concept of private and shared
memory. For TDX the private and shared mappings are managed on separate
EPT roots. The private half is managed indirectly through calls into a
protected runtime environment called the TDX module, where the shared half
is managed within KVM in normal page tables.

For TDX, the shared half will be mapped in the higher alias, with a "shared
bit" set in the GPA. However, KVM will still manage it with the same
memslots as the private half. This means memslot looks ups and zapping
operations will be provided with a GFN without the shared bit set.

So KVM will either need to apply or strip the shared bit before mapping or
zapping the shared EPT. Having GFNs sometimes have the shared bit and
sometimes not would make the code confusing.

So instead arrange the code such that GFNs never have shared bit set.
Create a concept of "direct bits", that is stripped from the fault
address when setting fault->gfn, and applied within the TDP MMU iterator.
Calling code will behave as if it is operating on the PTE mapping the GFN
(without shared bits) but within the iterator, the actual mappings will be
shifted using bits specific for the root. SPs will have the GFN set
without the shared bit. In the end the TDP MMU will behave like it is
mapping things at the GFN without the shared bit but with a strange page
table format where everything is offset by the shared bit.

Since TDX only needs to shift the mapping like this for the shared bit,
which is mapped as the normal TDP root, add a "gfn_direct_bits" field to
the kvm_arch structure for each VM with a default value of 0. It will
have the bit set at the position of the GPA shared bit in GFN through TD
specific initialization code. Keep TDX specific concepts out of the MMU
code by not naming it "shared".

Ranged TLB flushes (i.e. flush_remote_tlbs_range()) target specific GFN
ranges. In convention established above, these would need to target the
shifted GFN range. It won't matter functionally, since the actual
implementation will always result in a full flush for the only planned
user (TDX). For correctness reasons, future changes can provide a TDX
x86_ops.flush_remote_tlbs_range implementation to return -EOPNOTSUPP and
force the full flush for TDs.

This leaves one problem. Some operations use a concept of max GFN (i.e.
kvm_mmu_max_gfn()), to iterate over the whole TDP range. When applying the
direct mask to the start of the range, the iterator would end up skipping
iterating over the range not covered by the direct mask bit. For safety,
make sure the __tdp_mmu_zap_root() operation iterates over the full GFN
range supported by the underlying TDP format. Add a new iterator helper,
for_each_tdp_pte_min_level_all(), that iterates the entire TDP GFN range,
regardless of root.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
v4:
 - Add for_each_tdp_pte_min_level_all()
 - Log typos

v3:
 - Add comment for kvm_gfn_root_mask() (Paolo)
 - Change names mask -> bits (Paolo)
 - Add comment in struct definition for fault->gfn not containing shared
   bit. (Paolo)
 - Drop special handling in kvm_arch_flush_remote_tlbs_range(),
   implement kvm_x86_ops.flush_remote_tlbs_range in a future patch.
   (Paolo)
 - Do addition of kvm arg to iterator in previous patch (Paolo)
 - OR gfn_bits in try_step_side() too, because of issue seen with 4
   level EPT
 - Add warning for GFN bits in wrong arg in tdp_iter_start()

v2:
 - Rename from "KVM: x86/mmu: Add address conversion functions for TDX shared bit of GPA"
 - Dropped Binbin's reviewed-by tag because of the extend of the changes
 - Rename gfn_shared_mask to gfn_direct_mask.
 - Don't include shared bits in GFNs, hide the existence in the TDP MMU
   iterator.
 - Don't do range flushes if a gfn_direct_mask is present.
---
 arch/x86/include/asm/kvm_host.h |  2 ++
 arch/x86/kvm/mmu.h              |  5 +++++
 arch/x86/kvm/mmu/mmu_internal.h | 28 ++++++++++++++++++++++++++--
 arch/x86/kvm/mmu/tdp_iter.c     | 10 ++++++----
 arch/x86/kvm/mmu/tdp_iter.h     | 15 +++++++++++----
 arch/x86/kvm/mmu/tdp_mmu.c      |  5 +----
 6 files changed, 51 insertions(+), 14 deletions(-)

Message ID	20240718211230.1492011-9-rick.p.edgecombe@intel.com (mailing list archive)
State	New, archived
Headers	show Received: from mgamail.intel.com (mgamail.intel.com [198.175.65.15]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 45A7D1482E8; Thu, 18 Jul 2024 21:12:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=198.175.65.15 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721337173; cv=none; b=EyFrOCYIg4rS8x4pyNmkwNDn3MhWZp/uvLVGhlNSulhpkTrgIYNGh1wIRH1rNtLyfsBOB01SZkHGN3+T1FN3l3RNEuZ0VtgKAp+EfmzelwK5ie4phglLsaNSoMlc626ZBUUlPdf98/AIz8sW2hpM4oOabaweym74XuFH3ZCo+KI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1721337173; c=relaxed/simple; bh=ErxbfdqQRNxj/yVhrRUoQP5sQ+QEnblBq2Uf+/OullE=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=Q4/LD0gNzMktt7A/MgN0Myt/AKVXIx06r50Tf3SpQR2NOCECQBmF+MoTkNpy6BTd+n/bcP8JEw7s0BwVNalFeIcIjg/HD2c9HuEpD6IGmqXX+M2GGASToIUr8DgZ6/zyTX/nPrystFm9fXWOTwmDoTIizVi2QfaG4ylfw1Ts928= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com; spf=pass smtp.mailfrom=intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=b5FklUsU; arc=none smtp.client-ip=198.175.65.15 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=intel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=intel.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="b5FklUsU" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1721337171; x=1752873171; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ErxbfdqQRNxj/yVhrRUoQP5sQ+QEnblBq2Uf+/OullE=; b=b5FklUsU1QoNQZBLHlZRANycb7CF387yAedZMI7CDjr8llqRde28P43D lfspGk2lvrN/OpJOItZPI+sjw//X2W6vBk6QDJlG1q8woen7/1vKH/con l87c4XehAh1rjCJ1/HDUAFVcr2S5dGatyxZHJrQbgz47d33eSA09OJlJk 5B55vpJBC/BcljZuMDmCY/pt5A8eHCRttTQ57pzliU2cGuwjEQ/VGwSN5 ZDg5ujahIaqVcxsTmP0Latftd9SqL6SDuNgNE+ubzRA9lIz71qobpJVTk w/fpDtUBezY3bqjFgOj3HUc4MsbTSpqx1P0IVn2aJ7XtDkY0Cg5qRFeSv A==; X-CSE-ConnectionGUID: F1uQolgDTfixn/96rqUUCQ== X-CSE-MsgGUID: mc1G8hj6Qlmd2mZxKGjpsA== X-IronPort-AV: E=McAfee;i="6700,10204,11137"; a="22697428" X-IronPort-AV: E=Sophos;i="6.09,218,1716274800"; d="scan'208";a="22697428" Received: from orviesa003.jf.intel.com ([10.64.159.143]) by orvoesa107.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2024 14:12:46 -0700 X-CSE-ConnectionGUID: +jGES0tmQqC/fhGO3g4izw== X-CSE-MsgGUID: AHw6fDmzSCKGQoHRtB7pYQ== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.09,218,1716274800"; d="scan'208";a="55760394" Received: from ccbilbre-mobl3.amr.corp.intel.com (HELO rpedgeco-desk4..) ([10.124.223.76]) by ORVIESA003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 18 Jul 2024 14:12:45 -0700 From: Rick Edgecombe <rick.p.edgecombe@intel.com> To: seanjc@google.com, pbonzini@redhat.com, kvm@vger.kernel.org Cc: kai.huang@intel.com, dmatlack@google.com, erdemaktas@google.com, isaku.yamahata@gmail.com, linux-kernel@vger.kernel.org, sagis@google.com, yan.y.zhao@intel.com, rick.p.edgecombe@intel.com, Isaku Yamahata <isaku.yamahata@intel.com> Subject: [PATCH v4 08/18] KVM: x86/mmu: Support GFN direct bits Date: Thu, 18 Jul 2024 14:12:20 -0700 Message-Id: <20240718211230.1492011-9-rick.p.edgecombe@intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240718211230.1492011-1-rick.p.edgecombe@intel.com> References: <20240718211230.1492011-1-rick.p.edgecombe@intel.com> Precedence: bulk X-Mailing-List: kvm@vger.kernel.org List-Id: <kvm.vger.kernel.org> List-Subscribe: <mailto:kvm+subscribe@vger.kernel.org> List-Unsubscribe: <mailto:kvm+unsubscribe@vger.kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	TDX MMU prep series part 1 \| expand [v4,00/18] TDX MMU prep series part 1 [v4,01/18] KVM: x86/mmu: Zap invalid roots with mmu_lock holding for write at uninit [v4,02/18] KVM: Add member to struct kvm_gfn_range for target alias [v4,03/18] KVM: x86: Add a VM type define for TDX [v4,04/18] KVM: x86/mmu: Add an external pointer to struct kvm_mmu_page [v4,05/18] KVM: x86/mmu: Add an is_mirror member for union kvm_mmu_page_role [v4,06/18] KVM: x86/mmu: Make kvm_tdp_mmu_alloc_root() return void [v4,07/18] KVM: x86/tdp_mmu: Take struct kvm in iter loops [v4,08/18] KVM: x86/mmu: Support GFN direct bits [v4,09/18] KVM: x86/tdp_mmu: Extract root invalid check from tdx_mmu_next_root() [v4,10/18] KVM: x86/tdp_mmu: Introduce KVM MMU root types to specify page table type [v4,11/18] KVM: x86/tdp_mmu: Take root in tdp_mmu_for_each_pte() [v4,12/18] KVM: x86/tdp_mmu: Support mirror root for TDP MMU [v4,13/18] KVM: x86/tdp_mmu: Propagate attr_filter to MMU notifier callbacks [v4,14/18] KVM: x86/tdp_mmu: Propagate building mirror page tables [v4,15/18] KVM: x86/tdp_mmu: Propagate tearing down mirror page tables [v4,16/18] KVM: x86/tdp_mmu: Take root types for kvm_tdp_mmu_invalidate_all_roots() [v4,17/18] KVM: x86/tdp_mmu: Don't zap valid mirror roots in kvm_tdp_mmu_zap_all() [v4,18/18] KVM: x86/mmu: Prevent aliased memslot GFNs

[v4,08/18] KVM: x86/mmu: Support GFN direct bits

Commit Message

Patch