[v6,12/18] KVM: x86/tdp_mmu: Support mirror root for TDP MMU

From: Isaku Yamahata <isaku.yamahata@intel.com>

Add the ability for the TDP MMU to maintain a mirror of a separate
mapping.

Like other CoCo technologies, TDX has the concept of private and shared
memory. For TDX the private and shared mappings are managed on separate
EPT roots. The private half is managed indirectly through calls into a
protected runtime environment called the TDX module, while the shared half
is managed within KVM in normal page tables.

In order to handle both shared and private memory, KVM needs to learn to
handle faults and other operations on the correct root for the operation.
KVM could learn the concept of private roots, and operate on them by
calling out to operations that call into the TDX module. But there are two
problems with that:
1. Calls into the TDX module are relatively slow compared to the simple
   accesses required to read a PTE managed directly by KVM.
2. Other CoCo technologies deal with private memory completely differently
   and it will make the code confusing when being read from their
   perspective. Special operations added for TDX that set private or zap
   private memory will have nothing to do with these other private memory
   technologies (SEV, etc.).

To handle these, instead teach the TDP MMU about a new concept: "mirror
roots". Such roots maintain page tables that are not actually mapped,
and are just used to traverse quickly to determine if the mid level page
tables need to be installed. When the memory being mirrored needs to
actually be changed, calls can be made via x86_ops.

  private KVM page fault   |
      |                    |
      V                    |
 private GPA               |     CPU protected EPTP
      |                    |           |
      V                    |           V
 mirror PT root            |     external PT root
      |                    |           |
      V                    |           V
   mirror PT   --hook to propagate-->external PT
      |                    |           |
      \--------------------+------\    |
                           |      |    |
                           |      V    V
                           |    private guest page
                           |
                           |
     non-encrypted memory  |    encrypted memory
                           |

Leave calling out to actually update the private page tables that are being
mirrored for later changes. Just implement the handling of MMU operations
on mirrored roots.
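
To illustrate the mirror idea described above, here is a minimal userspace
sketch. It is not the code in this patch (which leaves the propagation hook
for later changes). All names prefixed with sketch_ are hypothetical, the
table is flattened to a single level, and the "external" side merely counts
calls to stand in for the slow path into the TDX module.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_NR_ENTRIES 8

/* Hypothetical stand-in for the page table owned by the protected side. */
struct sketch_external_pt {
	uint64_t entries[SKETCH_NR_ENTRIES];
	unsigned int nr_calls;	/* how often the slow path was taken */
};

/* Hypothetical hook type; the text only says such calls go via x86_ops. */
typedef int (*sketch_set_external_fn)(struct sketch_external_pt *ext,
				      unsigned int idx, uint64_t val);

/* Mirror table: ordinary memory that is cheap to read and never mapped. */
struct sketch_mirror_pt {
	uint64_t entries[SKETCH_NR_ENTRIES];
	struct sketch_external_pt *ext;
	sketch_set_external_fn set_external;
};

static int sketch_set_external(struct sketch_external_pt *ext,
			       unsigned int idx, uint64_t val)
{
	ext->entries[idx] = val;
	ext->nr_calls++;
	return 0;
}

/* Fast check: only the mirror is read, the external side is not consulted. */
static bool sketch_mirror_present(const struct sketch_mirror_pt *m,
				  unsigned int idx)
{
	assert(idx < SKETCH_NR_ENTRIES);
	return m->entries[idx] != 0;
}

/* Install an entry, calling the hook only when something actually changes. */
static int sketch_mirror_map(struct sketch_mirror_pt *m, unsigned int idx,
			     uint64_t val)
{
	int ret;

	assert(val != 0);
	if (sketch_mirror_present(m, idx))
		return 0;	/* already installed, no slow call needed */

	ret = m->set_external(m->ext, idx, val);
	if (ret)
		return ret;

	m->entries[idx] = val;	/* keep the mirror in sync */
	return 0;
}
```

In this sketch a repeated sketch_mirror_map() on the same index is answered
from the mirror alone, which is the property the mirror roots are meant to
provide.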

In order to direct operations to the correct root, add root types
KVM_DIRECT_ROOTS and KVM_MIRROR_ROOTS. Tie the usage of mirrored/direct
roots to private/shared with conditionals. It could also be implemented by
making the kvm_tdp_mmu_root_types and kvm_gfn_range_filter enum bits line
up such that conversion could be a direct assignment with a cast. Don't do
this because the mapping of private to mirrored is confusing enough, so it
is worth not hiding the logic in type casting.
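
The conditional conversion described above could look roughly like the
following sketch. Only the names KVM_DIRECT_ROOTS and KVM_MIRROR_ROOTS come
from this message; the bit values, the SKETCH_FILTER_* names, the has_mirror
flag and the helper name are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values only; the real enum values may differ. */
enum sketch_root_types {
	KVM_DIRECT_ROOTS = 1 << 1,
	KVM_MIRROR_ROOTS = 1 << 2,
};

/* Hypothetical stand-ins for the shared/private range filter bits. */
enum sketch_range_filter {
	SKETCH_FILTER_SHARED  = 1 << 0,
	SKETCH_FILTER_PRIVATE = 1 << 1,
};

/*
 * Explicit conditionals keep the private -> mirror and shared -> direct
 * association visible, instead of hiding it behind a cast that relies on
 * the two enums having matching bit positions.
 */
static unsigned int sketch_filter_to_root_types(bool has_mirror,
						unsigned int filter)
{
	unsigned int types = 0;

	/* A VM without mirror roots only ever has direct roots. */
	if (!has_mirror)
		return KVM_DIRECT_ROOTS;

	if (filter & SKETCH_FILTER_PRIVATE)
		types |= KVM_MIRROR_ROOTS;
	if (filter & SKETCH_FILTER_SHARED)
		types |= KVM_DIRECT_ROOTS;

	assert(types != 0);	/* a filter must select at least one kind */
	return types;
}
```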

Clean up the mirror root in kvm_mmu_destroy() instead of the normal place
in kvm_mmu_free_roots(), because the private root that is being mirrored
cannot be rebuilt like a normal root. It needs to persist for the lifetime
of the VM.

The TDX module will also need to be provided with page tables to use for
the actual mapping being mirrored by the mirrored page tables. Allocate
these in the mapping path using the recently added
kvm_mmu_alloc_external_spt().
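
As a rough userspace sketch of the allocation described above: when a new
page table page is created under a mirror root, a second page is allocated
alongside it for the protected side to use. The struct layout and the
sketch_* helpers are hypothetical and only model the idea; they are not the
signature or behavior of kvm_mmu_alloc_external_spt().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

#define SKETCH_PT_SIZE 4096

/* Hypothetical, simplified shadow page descriptor. */
struct sketch_sp {
	bool is_mirror;
	void *spt;		/* table that KVM itself walks */
	void *external_spt;	/* page for the protected side, mirror only */
};

static void sketch_free_sp(struct sketch_sp *sp)
{
	if (!sp)
		return;
	/* Direct pages never carry an external page in this sketch. */
	assert(sp->is_mirror || !sp->external_spt);
	free(sp->external_spt);
	free(sp->spt);
	free(sp);
}

/* Allocate a child page table; mirror pages also get an external page. */
static struct sketch_sp *sketch_alloc_child_sp(bool is_mirror)
{
	struct sketch_sp *sp = calloc(1, sizeof(*sp));

	if (!sp)
		return NULL;

	sp->is_mirror = is_mirror;
	sp->spt = calloc(1, SKETCH_PT_SIZE);
	if (!sp->spt)
		goto err;

	if (is_mirror) {
		sp->external_spt = calloc(1, SKETCH_PT_SIZE);
		if (!sp->external_spt)
			goto err;
	}
	return sp;

err:
	sketch_free_sp(sp);
	return NULL;
}
```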

Don't support 2M pages for now. This is avoided by forcing 4k pages in the
fault. Add a KVM_BUG_ON() to verify.
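
The 4k restriction can be pictured with the sketch below. It is a userspace
model under assumed names: the sketch_* types and helpers are hypothetical,
and a plain assert() stands in for the KVM_BUG_ON() check.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical level numbering for this sketch. */
enum sketch_pg_level {
	SKETCH_PG_LEVEL_4K = 1,
	SKETCH_PG_LEVEL_2M = 2,
};

struct sketch_fault {
	bool is_private;	/* fault is handled on the mirror root */
	int max_level;		/* largest mapping size the fault may use */
};

/* Fault setup: private faults are forced down to 4k mappings. */
static void sketch_fault_init(struct sketch_fault *f, bool is_private)
{
	f->is_private = is_private;
	f->max_level = is_private ? SKETCH_PG_LEVEL_4K : SKETCH_PG_LEVEL_2M;
}

/* The invariant being verified: mirror roots only ever see 4k mappings. */
static bool sketch_mirror_level_ok(bool is_mirror_root, int level)
{
	return !is_mirror_root || level == SKETCH_PG_LEVEL_4K;
}

/* Mapping path: returns the level used, after checking the invariant. */
static int sketch_map(const struct sketch_fault *f)
{
	int level = f->max_level;

	/* Stand-in for the KVM_BUG_ON() mentioned above. */
	assert(sketch_mirror_level_ok(f->is_private, level));
	return level;
}
```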

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Co-developed-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Message-ID: <20240718211230.1492011-13-rick.p.edgecombe@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/mmu.h              | 16 ++++++++++++
 arch/x86/kvm/mmu/mmu.c          | 12 ++++++++-
 arch/x86/kvm/mmu/tdp_mmu.c      | 31 ++++++++++++++++++------
 arch/x86/kvm/mmu/tdp_mmu.h      | 43 ++++++++++++++++++++++++++++++---
 5 files changed, 91 insertions(+), 12 deletions(-)

Message ID	20241222193445.349800-13-pbonzini@redhat.com (mailing list archive)
State	New
Headers	From: Paolo Bonzini <pbonzini@redhat.com>
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: yan.y.zhao@intel.com, isaku.yamahata@intel.com, binbin.wu@linux.intel.com, rick.p.edgecombe@intel.com, Kai Huang <kai.huang@intel.com>
Subject: [PATCH v6 12/18] KVM: x86/tdp_mmu: Support mirror root for TDP MMU
Date: Sun, 22 Dec 2024 14:34:39 -0500
Message-ID: <20241222193445.349800-13-pbonzini@redhat.com>
In-Reply-To: <20241222193445.349800-1-pbonzini@redhat.com>
References: <20241222193445.349800-1-pbonzini@redhat.com>
Series	TDX MMU prep series part 1
[v6,00/18] TDX MMU prep series part 1
[v6,01/18] KVM: x86/mmu: Zap invalid roots with mmu_lock held for write at uninit
[v6,02/18] KVM: Add member to struct kvm_gfn_range to indicate private/shared
[v6,03/18] KVM: x86: Add a VM type define for TDX
[v6,04/18] KVM: x86/mmu: Add an external pointer to struct kvm_mmu_page
[v6,05/18] KVM: x86/mmu: Add an is_mirror member for union kvm_mmu_page_role
[v6,06/18] KVM: x86/mmu: Make kvm_tdp_mmu_alloc_root() return void
[v6,07/18] KVM: x86/tdp_mmu: Take struct kvm in iter loops
[v6,08/18] KVM: x86/mmu: Support GFN direct bits
[v6,09/18] KVM: x86/tdp_mmu: Extract root invalid check from tdx_mmu_next_root()
[v6,10/18] KVM: x86/tdp_mmu: Introduce KVM MMU root types to specify page table type
[v6,11/18] KVM: x86/tdp_mmu: Take root in tdp_mmu_for_each_pte()
[v6,12/18] KVM: x86/tdp_mmu: Support mirror root for TDP MMU
[v6,13/18] KVM: x86/tdp_mmu: Propagate attr_filter to MMU notifier callbacks
[v6,14/18] KVM: x86/tdp_mmu: Propagate building mirror page tables
[v6,15/18] KVM: x86/tdp_mmu: Propagate tearing down mirror page tables
[v6,16/18] KVM: x86/tdp_mmu: Take root types for kvm_tdp_mmu_invalidate_all_roots()
[v6,17/18] KVM: x86/tdp_mmu: Don't zap valid mirror roots in kvm_tdp_mmu_zap_all()
[v6,18/18] KVM: x86/mmu: Prevent aliased memslot GFNs
