[10/16] KVM: x86/tdp_mmu: Support TDX private mapping for TDP MMU

From: Isaku Yamahata <isaku.yamahata@intel.com>

Allocate a mirrored page table for the private page table and implement MMU
hooks to operate on the private page table.

To handle a page fault on a private GPA, KVM walks the mirrored page table in
unencrypted memory and then uses MMU hooks in kvm_x86_ops to propagate
changes from the mirrored page table to the private page table.

  private KVM page fault   |
      |                    |
      V                    |
 private GPA               |     CPU protected EPTP
      |                    |           |
      V                    |           V
 mirrored PT root          |     private PT root
      |                    |           |
      V                    |           V
   mirrored PT --hook to propagate-->private PT
      |                    |           |
      \--------------------+------\    |
                           |      |    |
                           |      V    V
                           |    private guest page
                           |
                           |
     non-encrypted memory  |    encrypted memory
                           |

PT:          page table
Private PT:  the CPU uses it, but it is invisible to KVM. The TDX module
             manages this table to map private guest pages.
Mirrored PT: it is visible to KVM, but the CPU doesn't use it. KVM uses it
             to propagate PT changes to the actual private PT.

SPTEs in the mirrored page table (referred to as mirrored SPTEs hereafter)
can be modified atomically with mmu_lock held for read. However, the MMU
hooks to the private page table are not atomic operations.

To address this, a special REMOVED_SPTE is introduced and the following
sequence is used when mirrored SPTEs are updated atomically.

1. The mirrored SPTE is first atomically set to REMOVED_SPTE.
2. The updater that succeeds in step 1 proceeds with the following steps.
3. Invoke MMU hooks to modify the private page table with the target value.
4. (a) On hook success, update the mirrored SPTE to the target value.
   (b) On hook failure, restore the mirrored SPTE to its original value.

The KVM TDP MMU ensures other threads will not overwrite REMOVED_SPTE.
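
The freeze/propagate/publish sequence can be sketched in userspace C11 as
follows. This is only an illustration, not the code in this patch: every
sketch_* name, the sentinel value, and the hook signature are hypothetical,
and the real REMOVED_SPTE encoding and hook prototypes are defined by KVM.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical sentinel; the real REMOVED_SPTE encoding is defined by KVM. */
#define SKETCH_REMOVED_SPTE ((uint64_t)0x5a0)
/* Hypothetical "lost the race, retry" code, distinct from hook errors. */
#define SKETCH_RETRY 1

/* Hypothetical stand-in for an MMU hook: 0 on success, negative on error. */
typedef int (*sketch_hook_fn)(uint64_t new_spte);

int sketch_hook_ok(uint64_t new_spte)   { (void)new_spte; return 0; }
int sketch_hook_fail(uint64_t new_spte) { (void)new_spte; return -5; }

/*
 * Returns 0 on success, SKETCH_RETRY if the entry is frozen or another
 * thread changed it first, or the hook's error after restoring old_spte.
 */
int sketch_set_mirrored_spte(_Atomic uint64_t *sptep, uint64_t old_spte,
                             uint64_t new_spte, sketch_hook_fn hook)
{
    uint64_t expected = old_spte;
    int ret;

    /* The sentinel is never a valid target value. */
    assert(new_spte != SKETCH_REMOVED_SPTE);

    /* A frozen entry belongs to another updater; never overwrite it. */
    if (old_spte == SKETCH_REMOVED_SPTE)
        return SKETCH_RETRY;

    /* Step 1: freeze the entry. Only one concurrent updater can succeed. */
    if (!atomic_compare_exchange_strong(sptep, &expected, SKETCH_REMOVED_SPTE))
        return SKETCH_RETRY;

    /* Steps 2-3: the winner propagates the change through the hook. */
    ret = hook(new_spte);

    /* Step 4: publish the target on success, restore the original on failure. */
    atomic_store(sptep, ret == 0 ? new_spte : old_spte);
    return ret;
}
```

While the entry holds the sentinel, any other thread's compare-and-exchange
against the old value fails, so the non-atomic hook runs with the entry
effectively owned by a single updater.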

This sequence also applies when SPTEs are atomically updated from
non-present to present, in order to prevent conflicts when multiple vCPUs
attempt to set private SPTEs to different page sizes simultaneously,
although only the 4K page size is currently supported for the private page
table.

Support for 2M pages can be added in future patches.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Co-developed-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
TDX MMU Part 1:
 - Remove unnecessary gfn, access twist in
   tdp_mmu_map_handle_target_level(). (Chao Gao)
 - Open code call to kvm_mmu_alloc_private_spt() instead of doing it in
   tdp_mmu_alloc_sp()
 - Update comment in set_private_spte_present() (Yan)
 - Open code call to kvm_mmu_init_private_spt() (Yan)
 - Add comments on TDX MMU hooks (Yan)
 - Fix various whitespace alignment (Yan)
 - Remove pointless warnings and conditionals in
   handle_removed_private_spte() (Yan)
 - Remove redundant lockdep assert in tdp_mmu_set_spte() (Yan)
 - Remove incorrect comment in handle_changed_spte() (Yan)
 - Remove unneeded kvm_pfn_to_refcounted_page() and
   is_error_noslot_pfn() check in kvm_tdp_mmu_map() (Yan)
 - Do kvm_gfn_for_root() branchless (Rick)
 - Update kvm_tdp_mmu_alloc_root() callers to not check error code (Rick)
 - Add comment for stripping shared bit for fault.gfn (Chao)

v19:
- drop CONFIG_KVM_MMU_PRIVATE

v18:
- Rename freezed => frozen

v14 -> v15:
- Refined is_private condition check in kvm_tdp_mmu_map().
  Add kvm_gfn_shared_mask() check.
- catch up for struct kvm_range change
---
 arch/x86/include/asm/kvm-x86-ops.h |   5 +
 arch/x86/include/asm/kvm_host.h    |  25 +++
 arch/x86/kvm/mmu/mmu.c             |  13 +-
 arch/x86/kvm/mmu/mmu_internal.h    |  19 +-
 arch/x86/kvm/mmu/tdp_iter.h        |   2 +-
 arch/x86/kvm/mmu/tdp_mmu.c         | 269 +++++++++++++++++++++++++----
 arch/x86/kvm/mmu/tdp_mmu.h         |   2 +-
 7 files changed, 293 insertions(+), 42 deletions(-)

Message ID: 20240515005952.3410568-11-rick.p.edgecombe@intel.com
State: New
From: Rick Edgecombe <rick.p.edgecombe@intel.com>
To: pbonzini@redhat.com, seanjc@google.com, kvm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, isaku.yamahata@gmail.com,
    erdemaktas@google.com, sagis@google.com, yan.y.zhao@intel.com,
    dmatlack@google.com, rick.p.edgecombe@intel.com
Subject: [PATCH 10/16] KVM: x86/tdp_mmu: Support TDX private mapping for TDP MMU
Date: Tue, 14 May 2024 17:59:46 -0700
Series: TDX MMU prep series part 1
