diff mbox series

[v3,13/17] KVM: x86/tdp_mmu: Support mirror root for TDP MMU

Message ID 20240619223614.290657-14-rick.p.edgecombe@intel.com (mailing list archive)
State New, archived
Headers show
Series TDX MMU prep series part 1 | expand

Commit Message

Rick Edgecombe June 19, 2024, 10:36 p.m. UTC
From: Isaku Yamahata <isaku.yamahata@intel.com>

Add the ability for the TDP MMU to maintain a mirror of a separate
mapping.

Like other Coco technologies, TDX has the concept of private and shared
memory. For TDX the private and shared mappings are managed on separate
EPT roots. The private half is managed indirectly though calls into a
protected runtime environment called the TDX module, where the shared half
is managed within KVM in normal page tables.

In order to handle both shared and private memory, KVM needs to learn to
handle faults and other operations on the correct root for the operation.
KVM could learn the concept of private roots, and operate on them by
calling out to operations that call into the TDX module. But there are two
problems with that:
1. Calls into the TDX module are relatively slow compared to the simple
   accesses required to read a PTE managed directly by KVM.
2. Other Coco technologies deal with private memory completely differently
   and it will make the code confusing when being read from their
   perspective. Special operations added for TDX that set private or zap
   private memory will have nothing to do with these other private memory
   technologies. (SEV, etc).

To handle these, instead teach the TDP MMU about a new concept "mirror
roots". Such roots maintain page tables that are not actually mapped,
and are just used to traverse quickly to determine if the mid level page
tables need to be installed. When the memory be mirrored needs to actually
be changed, calls can be made to via x86_ops.

  private KVM page fault   |
      |                    |
      V                    |
 private GPA               |     CPU protected EPTP
      |                    |           |
      V                    |           V
 mirror PT root            |     private PT root
      |                    |           |
      V                    |           V
   mirror PT   --hook to propagate-->private PT
      |                    |           |
      \--------------------+------\    |
                           |      |    |
                           |      V    V
                           |    private guest page
                           |
                           |
     non-encrypted memory  |    encrypted memory
                           |

Leave calling out to actually update the private page tables that are being
mirrored for later changes. Just implement the handling of MMU operations
on to mirrored roots.

In order to direct operations to correct root, add root types
KVM_DIRECT_ROOTS and KVM_MIRROR_ROOTS. Tie the usage of mirrored/direct
roots to private/shared with conditionals. It could also be implemented by
making the kvm_tdp_mmu_root_types and kvm_gfn_range_filter enum bits line
up such that conversion could be a direct assignment with a case. Don't do
this because the mapping of private to mirrored is confusing enough. So it
is worth not hiding the logic in type casting.

Cleanup the mirror root in kvm_mmu_destroy() instead of the normal place
in kvm_mmu_free_roots(), because the private root that is being cannot be
rebuilt like a normal root. It needs to persist for the lifetime of the VM.

The TDX module will also need to be provided with page tables to use for
the actual mapping being mirrored by the mirrored page tables. Allocate
these in the mapping path using the recently added
kvm_mmu_alloc_private_spt().

Don't support 2M page for now. This is avoided by forcing 4k pages in the
fault. Add a KVM_BUG_ON() to verify.

Signed-off-by: Isaku Yamahata <isaku.yamahata@intel.com>
Co-developed-by: Kai Huang <kai.huang@intel.com>
Signed-off-by: Kai Huang <kai.huang@intel.com>
Co-developed-by: Yan Zhao <yan.y.zhao@intel.com>
Signed-off-by: Yan Zhao <yan.y.zhao@intel.com>
Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
---
TDX MMU Prep v3:
 - Remove handle_changed_spte() changes
 - Drop kvm_on_mirror()/kvm_on_direct(), open code (Paolo)
 - Rename tdp_mmu_get_fault_root_type() -> tdp_mmu_get_root_for_fault()  (Paolo)
 - Use fault->addr (in helper) to determine direct vs mirror root (Paolo)
 - Rename process -> filter (Paolo)

TDX MMU Prep v2:
 - Rename private->mirror
 - Split apart from "KVM: x86/tdp_mmu: Support TDX private mapping for TDP
   MMU"
 - Update log
 - Sprinkle a few comments
 - Use kvm_on_*() helpers to direct iterator to proper root
 - Drop BUGGY_KVM_ROOTS because the translation between the process enum
   is no longer automatic, and the warn already happens elsewhere.

TDX MMU Prep:
 - Remove unnecessary gfn, access twist in
   tdp_mmu_map_handle_target_level(). (Chao Gao)
 - Open code call to kvm_mmu_alloc_private_spt() instead oCf doing it in
   tdp_mmu_alloc_sp()
 - Update comment in set_private_spte_present() (Yan)
 - Open code call to kvm_mmu_init_private_spt() (Yan)
 - Add comments on TDX MMU hooks (Yan)
 - Fix various whitespace alignment (Yan)
 - Remove pointless warnings and conditionals in
   handle_removed_private_spte() (Yan)
 - Remove redundant lockdep assert in tdp_mmu_set_spte() (Yan)
 - Remove incorrect comment in handle_changed_spte() (Yan)
 - Remove unneeded kvm_pfn_to_refcounted_page() and
   is_error_noslot_pfn() check in kvm_tdp_mmu_map() (Yan)
 - Do kvm_gfn_for_root() branchless (Rick)
 - Update kvm_tdp_mmu_alloc_root() callers to not check error code (Rick)
 - Add comment for stripping shared bit for fault.gfn (Chao)

v19:
 - drop CONFIG_KVM_MMU_PRIVATE
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/mmu.h              |  7 ++++++
 arch/x86/kvm/mmu/mmu.c          | 11 ++++++++-
 arch/x86/kvm/mmu/tdp_mmu.c      | 34 ++++++++++++++++++++------
 arch/x86/kvm/mmu/tdp_mmu.h      | 43 ++++++++++++++++++++++++++++++---
 5 files changed, 84 insertions(+), 12 deletions(-)

Comments

Yan Zhao June 24, 2024, 8:30 a.m. UTC | #1
On Wed, Jun 19, 2024 at 03:36:10PM -0700, Rick Edgecombe wrote:
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index e9c1783a8743..287dcc2685e4 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3701,7 +3701,9 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
>  	int r;
>  
>  	if (tdp_mmu_enabled) {
> -		kvm_tdp_mmu_alloc_root(vcpu);
> +		if (kvm_has_mirrored_tdp(vcpu->kvm))
> +			kvm_tdp_mmu_alloc_root(vcpu, true);
> +		kvm_tdp_mmu_alloc_root(vcpu, false);
>  		return 0;
>  	}
mmu_alloc_direct_roots() is called when vcpu->arch.mmu->root.hpa is INVALID_PAGE
in kvm_mmu_reload(), which could happen after direct root is invalidated.

> -void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu)
> +void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu, bool mirror)
>  {
>  	struct kvm_mmu *mmu = vcpu->arch.mmu;
>  	union kvm_mmu_page_role role = mmu->root_role;
> @@ -241,6 +246,9 @@ void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu)
>  	struct kvm *kvm = vcpu->kvm;
>  	struct kvm_mmu_page *root;
>  
> +	if (mirror)
> +		role.is_mirror = 1;
> +
Could we add a validity check of mirror_root_hpa to prevent an incorrect ref
count increment of the mirror root?

+       if (mirror) {
+               if (mmu->mirror_root_hpa != INVALID_PAGE)
+                       return;
+
                role.is_mirror = true;
+       }

>  	/*
>  	 * Check for an existing root before acquiring the pages lock to avoid
>  	 * unnecessary serialization if multiple vCPUs are loading a new root.
> @@ -292,8 +300,12 @@ void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu)
>  	 * and actually consuming the root if it's invalidated after dropping
>  	 * mmu_lock, and the root can't be freed as this vCPU holds a reference.
>  	 */
> -	mmu->root.hpa = __pa(root->spt);
> -	mmu->root.pgd = 0;
> +	if (mirror) {
> +		mmu->mirror_root_hpa = __pa(root->spt);
> +	} else {
> +		mmu->root.hpa = __pa(root->spt);
> +		mmu->root.pgd = 0;
> +	}
>  }
Rick Edgecombe June 25, 2024, 12:51 a.m. UTC | #2
On Mon, 2024-06-24 at 16:30 +0800, Yan Zhao wrote:
> On Wed, Jun 19, 2024 at 03:36:10PM -0700, Rick Edgecombe wrote:
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index e9c1783a8743..287dcc2685e4 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -3701,7 +3701,9 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu
> > *vcpu)
> >         int r;
> >   
> >         if (tdp_mmu_enabled) {
> > -               kvm_tdp_mmu_alloc_root(vcpu);
> > +               if (kvm_has_mirrored_tdp(vcpu->kvm))
> > +                       kvm_tdp_mmu_alloc_root(vcpu, true);
> > +               kvm_tdp_mmu_alloc_root(vcpu, false);
> >                 return 0;
> >         }
> mmu_alloc_direct_roots() is called when vcpu->arch.mmu->root.hpa is
> INVALID_PAGE
> in kvm_mmu_reload(), which could happen after direct root is invalidated.
> 
> > -void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu)
> > +void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu, bool mirror)
> >   {
> >         struct kvm_mmu *mmu = vcpu->arch.mmu;
> >         union kvm_mmu_page_role role = mmu->root_role;
> > @@ -241,6 +246,9 @@ void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu)
> >         struct kvm *kvm = vcpu->kvm;
> >         struct kvm_mmu_page *root;
> >   
> > +       if (mirror)
> > +               role.is_mirror = 1;
> > +
> Could we add a validity check of mirror_root_hpa to prevent an incorrect ref
> count increment of the mirror root?

I was originally suspicious of the asymmetry of the tear down of mirror and
direct roots vs the allocation. Do you see a concrete problem, or just
advocating for safety?

> 
> +       if (mirror) {
> +               if (mmu->mirror_root_hpa != INVALID_PAGE)
> +                       return;
> +
>                 role.is_mirror = true;
> +       }
Yan Zhao June 25, 2024, 5:43 a.m. UTC | #3
On Tue, Jun 25, 2024 at 08:51:33AM +0800, Edgecombe, Rick P wrote:
> On Mon, 2024-06-24 at 16:30 +0800, Yan Zhao wrote:
> > On Wed, Jun 19, 2024 at 03:36:10PM -0700, Rick Edgecombe wrote:
> > > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > > index e9c1783a8743..287dcc2685e4 100644
> > > --- a/arch/x86/kvm/mmu/mmu.c
> > > +++ b/arch/x86/kvm/mmu/mmu.c
> > > @@ -3701,7 +3701,9 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu
> > > *vcpu)
> > >         int r;
> > >   
> > >         if (tdp_mmu_enabled) {
> > > -               kvm_tdp_mmu_alloc_root(vcpu);
> > > +               if (kvm_has_mirrored_tdp(vcpu->kvm))
> > > +                       kvm_tdp_mmu_alloc_root(vcpu, true);
> > > +               kvm_tdp_mmu_alloc_root(vcpu, false);
> > >                 return 0;
> > >         }
> > mmu_alloc_direct_roots() is called when vcpu->arch.mmu->root.hpa is
> > INVALID_PAGE
> > in kvm_mmu_reload(), which could happen after direct root is invalidated.
> > 
> > > -void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu)
> > > +void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu, bool mirror)
> > >   {
> > >         struct kvm_mmu *mmu = vcpu->arch.mmu;
> > >         union kvm_mmu_page_role role = mmu->root_role;
> > > @@ -241,6 +246,9 @@ void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu)
> > >         struct kvm *kvm = vcpu->kvm;
> > >         struct kvm_mmu_page *root;
> > >   
> > > +       if (mirror)
> > > +               role.is_mirror = 1;
> > > +
> > Could we add a validity check of mirror_root_hpa to prevent an incorrect ref
> > count increment of the mirror root?
> 
> I was originally suspicious of the asymmetry of the tear down of mirror and
> direct roots vs the allocation. Do you see a concrete problem, or just
> advocating for safety?
IMO it's a concrete problem, though rare up to now. e.g.

After repeatedly hot-plugping and hot-unplugping memory, which increases
memslots generation, kvm_mmu_zap_all_fast() will be called to invalidate direct
roots when the memslots generation wraps around.

> 
> > 
> > +       if (mirror) {
> > +               if (mmu->mirror_root_hpa != INVALID_PAGE)
> > +                       return;
> > +
> >                 role.is_mirror = true;
> > +       }
>
Rick Edgecombe June 25, 2024, 8:33 p.m. UTC | #4
On Tue, 2024-06-25 at 13:43 +0800, Yan Zhao wrote:
> > > > I was originally suspicious of the asymmetry of the tear down of mirror
> > > > and
> > > > direct roots vs the allocation. Do you see a concrete problem, or just
> > > > advocating for safety?
> > IMO it's a concrete problem, though rare up to now. e.g.
> > 
> > After repeatedly hot-plugping and hot-unplugping memory, which increases
> > memslots generation, kvm_mmu_zap_all_fast() will be called to invalidate >
> > direct
> > roots when the memslots generation wraps around.

Hmm, yes. I'm not sure about putting the check there though. It adds even more
confusion to the lifecycle.
 - mirror_root_hpa != INVALID_PAGE check in a different placed than
   root.hpa != INVALID_PAGE check.
 - they get allocated in the same place
 - they are torn down in the different places.

Can you think of clearer fix for it. Maybe we can just move the mirror root
allocation such that it's not subjected to the reload path? Like something that
matches the tear down in kvm_mmu_destroy()?
Yan Zhao June 26, 2024, 5:05 a.m. UTC | #5
On Wed, Jun 26, 2024 at 04:33:20AM +0800, Edgecombe, Rick P wrote:
> On Tue, 2024-06-25 at 13:43 +0800, Yan Zhao wrote:
> > > > > I was originally suspicious of the asymmetry of the tear down of mirror
> > > > > and
> > > > > direct roots vs the allocation. Do you see a concrete problem, or just
> > > > > advocating for safety?
> > > IMO it's a concrete problem, though rare up to now. e.g.
> > > 
> > > After repeatedly hot-plugping and hot-unplugping memory, which increases
> > > memslots generation, kvm_mmu_zap_all_fast() will be called to invalidate >
> > > direct
> > > roots when the memslots generation wraps around.
> 
> Hmm, yes. I'm not sure about putting the check there though. It adds even more
> confusion to the lifecycle.
>  - mirror_root_hpa != INVALID_PAGE check in a different placed than
>    root.hpa != INVALID_PAGE check.
>  - they get allocated in the same place
>  - they are torn down in the different places.
> 
> Can you think of clearer fix for it. Maybe we can just move the mirror root
> allocation such that it's not subjected to the reload path? Like something that
> matches the tear down in kvm_mmu_destroy()?
But we still need the reload path to have each vcpu to hold a ref count of the
mirror root.
What about below fix?

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 026e8edfb0bd..4decd13457ec 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -127,9 +127,28 @@ void kvm_mmu_sync_prev_roots(struct kvm_vcpu *vcpu);
 void kvm_mmu_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
                         int bytes);

+static inline bool kvm_has_mirrored_tdp(const struct kvm *kvm)
+{
+       return kvm->arch.vm_type == KVM_X86_TDX_VM;
+}
+
+static inline bool kvm_mmu_root_hpa_is_invalid(struct kvm_vcpu *vcpu)
+{
+       return vcpu->arch.mmu->root.hpa == INVALID_PAGE;
+}
+
+static inline bool kvm_mmu_mirror_root_hpa_is_invalid(struct kvm_vcpu *vcpu)
+{
+       if (!kvm_has_mirrored_tdp(vcpu->kvm))
+               return false;
+
+       return vcpu->arch.mmu->mirror_root_hpa == INVALID_PAGE;
+}
+
 static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
 {
-       if (likely(vcpu->arch.mmu->root.hpa != INVALID_PAGE))
+       if (!kvm_mmu_root_hpa_is_invalid(vcpu) &&
+           !kvm_mmu_mirror_root_hpa_is_invalid(vcpu))
                return 0;

        return kvm_mmu_load(vcpu);
@@ -322,11 +341,6 @@ static inline gpa_t kvm_translate_gpa(struct kvm_vcpu *vcpu,
        return translate_nested_gpa(vcpu, gpa, access, exception);
 }

-static inline bool kvm_has_mirrored_tdp(const struct kvm *kvm)
-{
-       return kvm->arch.vm_type == KVM_X86_TDX_VM;
-}
-
 static inline gfn_t kvm_gfn_direct_bits(const struct kvm *kvm)
 {
        return kvm->arch.gfn_direct_bits;
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index e1299eb03e63..5e7d92074f70 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3702,9 +3702,11 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
        int r;

        if (tdp_mmu_enabled) {
-               if (kvm_has_mirrored_tdp(vcpu->kvm))
+               if (kvm_mmu_mirror_root_hpa_is_invalid(vcpu))
                        kvm_tdp_mmu_alloc_root(vcpu, true);
-               kvm_tdp_mmu_alloc_root(vcpu, false);
+
+               if (kvm_mmu_root_hpa_is_invalid(vcpu))
+                       kvm_tdp_mmu_alloc_root(vcpu, false);
                return 0;
        }
Rick Edgecombe July 3, 2024, 7:40 p.m. UTC | #6
On Wed, 2024-06-26 at 13:05 +0800, Yan Zhao wrote:
> But we still need the reload path to have each vcpu to hold a ref count of the
> mirror root.
> What about below fix?
> 
> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index 026e8edfb0bd..4decd13457ec 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -127,9 +127,28 @@ void kvm_mmu_sync_prev_roots(struct kvm_vcpu *vcpu);
>  void kvm_mmu_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
>                          int bytes);
> 
> +static inline bool kvm_has_mirrored_tdp(const struct kvm *kvm)
> +{
> +       return kvm->arch.vm_type == KVM_X86_TDX_VM;
> +}
> +
> +static inline bool kvm_mmu_root_hpa_is_invalid(struct kvm_vcpu *vcpu)
> +{
> +       return vcpu->arch.mmu->root.hpa == INVALID_PAGE;
> +}
> +
> +static inline bool kvm_mmu_mirror_root_hpa_is_invalid(struct kvm_vcpu *vcpu)
> +{
> +       if (!kvm_has_mirrored_tdp(vcpu->kvm))
> +               return false;
> +
> +       return vcpu->arch.mmu->mirror_root_hpa == INVALID_PAGE;
> +}
> +
>  static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
>  {
> -       if (likely(vcpu->arch.mmu->root.hpa != INVALID_PAGE))
> +       if (!kvm_mmu_root_hpa_is_invalid(vcpu) &&
> +           !kvm_mmu_mirror_root_hpa_is_invalid(vcpu))

If we go this way, we should keep the likely. But, I'm not convinced.

>                 return 0;
> 
>         return kvm_mmu_load(vcpu);
> @@ -322,11 +341,6 @@ static inline gpa_t kvm_translate_gpa(struct kvm_vcpu
> *vcpu,
>         return translate_nested_gpa(vcpu, gpa, access, exception);
>  }
> 
> -static inline bool kvm_has_mirrored_tdp(const struct kvm *kvm)
> -{
> -       return kvm->arch.vm_type == KVM_X86_TDX_VM;
> -}
> -
>  static inline gfn_t kvm_gfn_direct_bits(const struct kvm *kvm)
>  {
>         return kvm->arch.gfn_direct_bits;
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index e1299eb03e63..5e7d92074f70 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3702,9 +3702,11 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu
> *vcpu)
>         int r;
> 
>         if (tdp_mmu_enabled) {
> -               if (kvm_has_mirrored_tdp(vcpu->kvm))
> +               if (kvm_mmu_mirror_root_hpa_is_invalid(vcpu))
>                         kvm_tdp_mmu_alloc_root(vcpu, true);
> -               kvm_tdp_mmu_alloc_root(vcpu, false);
> +
> +               if (kvm_mmu_root_hpa_is_invalid(vcpu))
> +                       kvm_tdp_mmu_alloc_root(vcpu, false);

So we can have either:
mirror_root_hpa = INVALID_PAGE
root.hpa = INVALID_PAGE

or:
mirror_root_hpa = root
root.hpa = INVALID_PAGE

We don't ever have:
mirror_root_hpa = INVALID_PAGE
root.hpa = root

Because mirror_root_hpa is special.

>                 return 0;
>         }
> 

Having the check in kvm_mmu_reload() and here both is really ugly.  Right now we
have kvm_mmu_reload() which only allocates new roots if needed, then
kvm_mmu_load() goes and allocates/references them. Now mmu_alloc_direct_roots()
has the same logic to only reload as needed.

So could we just leave the kvm_mmu_reload() logic as is, and just have special
logic in mmu_alloc_direct_roots() to not avoid re-allocating mirror roots? This
is actually what v19 had, but it was thought to be a historical relic:
   "Historically we needed it.  We don't need it now. We can drop it."
https://lore.kernel.org/kvm/20240325200122.GH2357401@ls.amr.corp.intel.com/

So just have this small fixup instead?

diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index ae5d5dee86af..2e062178222e 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3705,7 +3705,8 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
        int r;
 
        if (tdp_mmu_enabled) {
-               if (kvm_has_mirrored_tdp(vcpu->kvm))
+               if (kvm_has_mirrored_tdp(vcpu->kvm) &&
+                   !VALID_PAGE(mmu->mirror_root_hpa))
                        kvm_tdp_mmu_alloc_root(vcpu, true);
                kvm_tdp_mmu_alloc_root(vcpu, false);
                return 0;
Yan Zhao July 4, 2024, 8:09 a.m. UTC | #7
On Thu, Jul 04, 2024 at 03:40:35AM +0800, Edgecombe, Rick P wrote:
> On Wed, 2024-06-26 at 13:05 +0800, Yan Zhao wrote:
> > But we still need the reload path to have each vcpu to hold a ref count of the
> > mirror root.
> > What about below fix?
> > 
> > diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> > index 026e8edfb0bd..4decd13457ec 100644
> > --- a/arch/x86/kvm/mmu.h
> > +++ b/arch/x86/kvm/mmu.h
> > @@ -127,9 +127,28 @@ void kvm_mmu_sync_prev_roots(struct kvm_vcpu *vcpu);
> >  void kvm_mmu_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
> >                          int bytes);
> > 
> > +static inline bool kvm_has_mirrored_tdp(const struct kvm *kvm)
> > +{
> > +       return kvm->arch.vm_type == KVM_X86_TDX_VM;
> > +}
> > +
> > +static inline bool kvm_mmu_root_hpa_is_invalid(struct kvm_vcpu *vcpu)
> > +{
> > +       return vcpu->arch.mmu->root.hpa == INVALID_PAGE;
> > +}
> > +
> > +static inline bool kvm_mmu_mirror_root_hpa_is_invalid(struct kvm_vcpu *vcpu)
> > +{
> > +       if (!kvm_has_mirrored_tdp(vcpu->kvm))
> > +               return false;
> > +
> > +       return vcpu->arch.mmu->mirror_root_hpa == INVALID_PAGE;
> > +}
> > +
> >  static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
> >  {
> > -       if (likely(vcpu->arch.mmu->root.hpa != INVALID_PAGE))
> > +       if (!kvm_mmu_root_hpa_is_invalid(vcpu) &&
> > +           !kvm_mmu_mirror_root_hpa_is_invalid(vcpu))
> 
> If we go this way, we should keep the likely. But, I'm not convinced.
> 
> >                 return 0;
> > 
> >         return kvm_mmu_load(vcpu);
> > @@ -322,11 +341,6 @@ static inline gpa_t kvm_translate_gpa(struct kvm_vcpu
> > *vcpu,
> >         return translate_nested_gpa(vcpu, gpa, access, exception);
> >  }
> > 
> > -static inline bool kvm_has_mirrored_tdp(const struct kvm *kvm)
> > -{
> > -       return kvm->arch.vm_type == KVM_X86_TDX_VM;
> > -}
> > -
> >  static inline gfn_t kvm_gfn_direct_bits(const struct kvm *kvm)
> >  {
> >         return kvm->arch.gfn_direct_bits;
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index e1299eb03e63..5e7d92074f70 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -3702,9 +3702,11 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu
> > *vcpu)
> >         int r;
> > 
> >         if (tdp_mmu_enabled) {
> > -               if (kvm_has_mirrored_tdp(vcpu->kvm))
> > +               if (kvm_mmu_mirror_root_hpa_is_invalid(vcpu))
> >                         kvm_tdp_mmu_alloc_root(vcpu, true);
> > -               kvm_tdp_mmu_alloc_root(vcpu, false);
> > +
> > +               if (kvm_mmu_root_hpa_is_invalid(vcpu))
> > +                       kvm_tdp_mmu_alloc_root(vcpu, false);
> 
> So we can have either:
> mirror_root_hpa = INVALID_PAGE
> root.hpa = INVALID_PAGE
> 
> or:
> mirror_root_hpa = root
> root.hpa = INVALID_PAGE
> 
> We don't ever have:
> mirror_root_hpa = INVALID_PAGE
> root.hpa = root
> 
> Because mirror_root_hpa is special.
> 
Good point.

> >                 return 0;
> >         }
> > 
> 
> Having the check in kvm_mmu_reload() and here both is really ugly.  Right now we
> have kvm_mmu_reload() which only allocates new roots if needed, then
> kvm_mmu_load() goes and allocates/references them. Now mmu_alloc_direct_roots()
> has the same logic to only reload as needed.
> 
> So could we just leave the kvm_mmu_reload() logic as is, and just have special
> logic in mmu_alloc_direct_roots() to not avoid re-allocating mirror roots? This
> is actually what v19 had, but it was thought to be a historical relic:
>    "Historically we needed it.  We don't need it now. We can drop it."
> https://lore.kernel.org/kvm/20240325200122.GH2357401@ls.amr.corp.intel.com/
> 
> So just have this small fixup instead?
Yes.

> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index ae5d5dee86af..2e062178222e 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3705,7 +3705,8 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
>         int r;
>  
>         if (tdp_mmu_enabled) {
> -               if (kvm_has_mirrored_tdp(vcpu->kvm))
> +               if (kvm_has_mirrored_tdp(vcpu->kvm) &&
> +                   !VALID_PAGE(mmu->mirror_root_hpa))
>                         kvm_tdp_mmu_alloc_root(vcpu, true);
>                 kvm_tdp_mmu_alloc_root(vcpu, false);
>                 return 0;
> 
Perhaps also a comment in kvm_mmu_reload() to address concerns like why checking
only root.hpa in kvm_mmu_reload() is enough.

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 03737f3aaeeb..aba98c8cc67d 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -129,6 +129,15 @@ void kvm_mmu_track_write(struct kvm_vcpu *vcpu, gpa_t gpa, const u8 *new,
 
 static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
 {
+       /*
+        * Checking root.hpa is sufficient even when KVM has mirror root.
+        * We can have either:
+        * (1) mirror_root_hpa = INVALID_PAGE, root.hpa = INVALID_PAGE
+        * (2) mirror_root_hpa = root ,        root.hpa = INVALID_PAGE
+        * (3) mirror_root_hpa = root1,        root.hpa = root2
+        * We don't ever have:
+        *     mirror_root_hpa = INVALID_PAGE, root.hpa = root
+        */
        if (likely(vcpu->arch.mmu->root.hpa != INVALID_PAGE))
                return 0;
 
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index a5f803f1d17e..eee35e958971 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3705,7 +3705,8 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
        int r;
 
        if (tdp_mmu_enabled) {
-               if (kvm_has_mirrored_tdp(vcpu->kvm))
+               if (kvm_has_mirrored_tdp(vcpu->kvm) &&
+                   !VALID_PAGE(mmu->mirror_root_hpa))
                        kvm_tdp_mmu_alloc_root(vcpu, true);
                kvm_tdp_mmu_alloc_root(vcpu, false);
                return 0;
Yan Zhao July 4, 2024, 8:51 a.m. UTC | #8
On Wed, Jun 19, 2024 at 03:36:10PM -0700, Rick Edgecombe wrote:
> diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
> index b887c225ff24..2903f03a34be 100644
> --- a/arch/x86/kvm/mmu/tdp_mmu.h
> +++ b/arch/x86/kvm/mmu/tdp_mmu.h
> @@ -10,7 +10,7 @@
>  void kvm_mmu_init_tdp_mmu(struct kvm *kvm);
>  void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm);
>  
> -void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu);
> +void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu, bool private);
>  
>  __must_check static inline bool kvm_tdp_mmu_get_root(struct kvm_mmu_page *root)
This function name is very similar to below tdp_mmu_get_root().

> +static inline struct kvm_mmu_page *tdp_mmu_get_root_for_fault(struct kvm_vcpu *vcpu,
> +							      struct kvm_page_fault *fault)
> +{
> +	if (unlikely(!kvm_is_addr_direct(vcpu->kvm, fault->addr)))
> +		return root_to_sp(vcpu->arch.mmu->mirror_root_hpa);
> +
> +	return root_to_sp(vcpu->arch.mmu->root.hpa);
> +}
> +
> +static inline struct kvm_mmu_page *tdp_mmu_get_root(struct kvm_vcpu *vcpu,
> +						    enum kvm_tdp_mmu_root_types type)
> +{
> +	if (unlikely(type == KVM_MIRROR_ROOTS))
> +		return root_to_sp(vcpu->arch.mmu->mirror_root_hpa);
> +
> +	return root_to_sp(vcpu->arch.mmu->root.hpa);
> +}
No need to put the two functions in tdp_mmu.h as they are not called in mmu.c.

Could we move them to tdp_mmu.c and rename them to something like
tdp_mmu_type_to_root() and tdp_mmu_fault_to_root() ?
Rick Edgecombe July 9, 2024, 10:36 p.m. UTC | #9
On Thu, 2024-07-04 at 16:09 +0800, Yan Zhao wrote:
> Perhaps also a comment in kvm_mmu_reload() to address concerns like why
> checking
> only root.hpa in kvm_mmu_reload() is enough.

Sounds good, and thanks again for catching this.

> 
> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index 03737f3aaeeb..aba98c8cc67d 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -129,6 +129,15 @@ void kvm_mmu_track_write(struct kvm_vcpu *vcpu, gpa_t
> gpa, const u8 *new,
>  
>  static inline int kvm_mmu_reload(struct kvm_vcpu *vcpu)
>  {
> +       /*
> +        * Checking root.hpa is sufficient even when KVM has mirror root.
> +        * We can have either:
> +        * (1) mirror_root_hpa = INVALID_PAGE, root.hpa = INVALID_PAGE
> +        * (2) mirror_root_hpa = root ,        root.hpa = INVALID_PAGE
Looks good to me except for the space  ^

> +        * (3) mirror_root_hpa = root1,        root.hpa = root2
> +        * We don't ever have:
> +        *     mirror_root_hpa = INVALID_PAGE, root.hpa = root
> +        */
>         if (likely(vcpu->arch.mmu->root.hpa != INVALID_PAGE))
>                 return 0;
>  
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index a5f803f1d17e..eee35e958971 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -3705,7 +3705,8 @@ static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
>         int r;
>  
>         if (tdp_mmu_enabled) {
> -               if (kvm_has_mirrored_tdp(vcpu->kvm))
> +               if (kvm_has_mirrored_tdp(vcpu->kvm) &&
> +                   !VALID_PAGE(mmu->mirror_root_hpa))
>                         kvm_tdp_mmu_alloc_root(vcpu, true);
>                 kvm_tdp_mmu_alloc_root(vcpu, false);
>                 return 0;
Rick Edgecombe July 9, 2024, 10:38 p.m. UTC | #10
On Thu, 2024-07-04 at 16:51 +0800, Yan Zhao wrote:
> On Wed, Jun 19, 2024 at 03:36:10PM -0700, Rick Edgecombe wrote:
> > diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
> > index b887c225ff24..2903f03a34be 100644
> > --- a/arch/x86/kvm/mmu/tdp_mmu.h
> > +++ b/arch/x86/kvm/mmu/tdp_mmu.h
> > @@ -10,7 +10,7 @@
> >   void kvm_mmu_init_tdp_mmu(struct kvm *kvm);
> >   void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm);
> >   
> > -void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu);
> > +void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu, bool private);
> >   
> >   __must_check static inline bool kvm_tdp_mmu_get_root(struct kvm_mmu_page
> > *root)
> This function name is very similar to below tdp_mmu_get_root().

That is true.

> 
> > +static inline struct kvm_mmu_page *tdp_mmu_get_root_for_fault(struct
> > kvm_vcpu *vcpu,
> > +                                                             struct
> > kvm_page_fault *fault)
> > +{
> > +       if (unlikely(!kvm_is_addr_direct(vcpu->kvm, fault->addr)))
> > +               return root_to_sp(vcpu->arch.mmu->mirror_root_hpa);
> > +
> > +       return root_to_sp(vcpu->arch.mmu->root.hpa);
> > +}
> > +
> > +static inline struct kvm_mmu_page *tdp_mmu_get_root(struct kvm_vcpu *vcpu,
> > +                                                   enum
> > kvm_tdp_mmu_root_types type)
> > +{
> > +       if (unlikely(type == KVM_MIRROR_ROOTS))
> > +               return root_to_sp(vcpu->arch.mmu->mirror_root_hpa);
> > +
> > +       return root_to_sp(vcpu->arch.mmu->root.hpa);
> > +}
> No need to put the two functions in tdp_mmu.h as they are not called in mmu.c.

I think it is nice to have the root/enum logic above close together.

> 
> Could we move them to tdp_mmu.c and rename them to something like
> tdp_mmu_type_to_root() and tdp_mmu_fault_to_root() ?

tdp_mmu_get_root_for_fault() was proposed by Paolo, and tdp_mmu_get_root() was
discussed without comment. Not to say there is anything wrong with the names
proposed, but I think this is wading into bikeshedding territory at this stage.
Rick Edgecombe July 11, 2024, 11:54 p.m. UTC | #11
On Tue, 2024-07-09 at 15:38 -0700, Rick Edgecombe wrote:
> > Could we move them to tdp_mmu.c and rename them to something like
> > tdp_mmu_type_to_root() and tdp_mmu_fault_to_root() ?
> 
> tdp_mmu_get_root_for_fault() was proposed by Paolo, and tdp_mmu_get_root() was
> discussed without comment. Not to say there is anything wrong with the names
> proposed, but I think this is wading into bikeshedding territory at this
> stage.

I kept thinking about this comment. On more consideration tdp_mmu_get_root() is
a problematically terrible name since "get" usually means take a reference to
something which is something you can do to roots and (and what
kvm_tdp_mmu_get_root() is doing). So people might think tdp_mmu_get_root() is
taking a reference.
Yan Zhao July 12, 2024, 1:42 a.m. UTC | #12
On Fri, Jul 12, 2024 at 07:54:50AM +0800, Edgecombe, Rick P wrote:
> On Tue, 2024-07-09 at 15:38 -0700, Rick Edgecombe wrote:
> > > Could we move them to tdp_mmu.c and rename them to something like
> > > tdp_mmu_type_to_root() and tdp_mmu_fault_to_root() ?
> > 
> > tdp_mmu_get_root_for_fault() was proposed by Paolo, and tdp_mmu_get_root() was
> > discussed without comment. Not to say there is anything wrong with the names
> > proposed, but I think this is wading into bikeshedding territory at this
> > stage.
> 
> I kept thinking about this comment. On more consideration tdp_mmu_get_root() is
> a problematically terrible name since "get" usually means take a reference to
> something which is something you can do to roots and (and what
> kvm_tdp_mmu_get_root() is doing). So people might think tdp_mmu_get_root() is
> taking a reference.
Right, I have exactly the same feeling to "get".
diff mbox series

Patch

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6e07eff06a58..d67e88a69fc4 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -460,6 +460,7 @@  struct kvm_mmu {
 	int (*sync_spte)(struct kvm_vcpu *vcpu,
 			 struct kvm_mmu_page *sp, int i);
 	struct kvm_mmu_root_info root;
+	hpa_t mirror_root_hpa;
 	union kvm_cpu_role cpu_role;
 	union kvm_mmu_page_role root_role;
 
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 63179a4fba7b..7b12ba761c51 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -328,4 +328,11 @@  static inline gfn_t kvm_gfn_direct_bits(const struct kvm *kvm)
 {
 	return kvm->arch.gfn_direct_bits;
 }
+
+static inline bool kvm_is_addr_direct(struct kvm *kvm, gpa_t gpa)
+{
+	gpa_t gpa_direct_bits = gfn_to_gpa(kvm_gfn_direct_bits(kvm));
+
+	return !gpa_direct_bits || (gpa & gpa_direct_bits);
+}
 #endif
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index e9c1783a8743..287dcc2685e4 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3701,7 +3701,9 @@  static int mmu_alloc_direct_roots(struct kvm_vcpu *vcpu)
 	int r;
 
 	if (tdp_mmu_enabled) {
-		kvm_tdp_mmu_alloc_root(vcpu);
+		if (kvm_has_mirrored_tdp(vcpu->kvm))
+			kvm_tdp_mmu_alloc_root(vcpu, true);
+		kvm_tdp_mmu_alloc_root(vcpu, false);
 		return 0;
 	}
 
@@ -6245,6 +6247,7 @@  static int __kvm_mmu_create(struct kvm_vcpu *vcpu, struct kvm_mmu *mmu)
 
 	mmu->root.hpa = INVALID_PAGE;
 	mmu->root.pgd = 0;
+	mmu->mirror_root_hpa = INVALID_PAGE;
 	for (i = 0; i < KVM_MMU_NUM_PREV_ROOTS; i++)
 		mmu->prev_roots[i] = KVM_MMU_ROOT_INFO_INVALID;
 
@@ -7220,6 +7223,12 @@  int kvm_mmu_vendor_module_init(void)
 void kvm_mmu_destroy(struct kvm_vcpu *vcpu)
 {
 	kvm_mmu_unload(vcpu);
+	if (tdp_mmu_enabled) {
+		read_lock(&vcpu->kvm->mmu_lock);
+		mmu_free_root_page(vcpu->kvm, &vcpu->arch.mmu->mirror_root_hpa,
+				   NULL);
+		read_unlock(&vcpu->kvm->mmu_lock);
+	}
 	free_mmu_pages(&vcpu->arch.root_mmu);
 	free_mmu_pages(&vcpu->arch.guest_mmu);
 	mmu_free_memory_caches(vcpu);
diff --git a/arch/x86/kvm/mmu/tdp_mmu.c b/arch/x86/kvm/mmu/tdp_mmu.c
index 2200bdc7681f..a0010c62425f 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.c
+++ b/arch/x86/kvm/mmu/tdp_mmu.c
@@ -95,10 +95,15 @@  void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root)
 static bool tdp_mmu_root_match(struct kvm_mmu_page *root,
 			       enum kvm_tdp_mmu_root_types types)
 {
+	if (WARN_ON_ONCE(!(types & KVM_VALID_ROOTS)))
+		return false;
+
 	if (root->role.invalid)
 		return types & KVM_INVALID_ROOTS;
+	if (likely(!is_mirror_sp(root)))
+		return types & KVM_DIRECT_ROOTS;
 
-	return true;
+	return types & KVM_MIRROR_ROOTS;
 }
 
 /*
@@ -233,7 +238,7 @@  static void tdp_mmu_init_child_sp(struct kvm_mmu_page *child_sp,
 	tdp_mmu_init_sp(child_sp, iter->sptep, iter->gfn, role);
 }
 
-void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu)
+void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu, bool mirror)
 {
 	struct kvm_mmu *mmu = vcpu->arch.mmu;
 	union kvm_mmu_page_role role = mmu->root_role;
@@ -241,6 +246,9 @@  void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu)
 	struct kvm *kvm = vcpu->kvm;
 	struct kvm_mmu_page *root;
 
+	if (mirror)
+		role.is_mirror = 1;
+
 	/*
 	 * Check for an existing root before acquiring the pages lock to avoid
 	 * unnecessary serialization if multiple vCPUs are loading a new root.
@@ -292,8 +300,12 @@  void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu)
 	 * and actually consuming the root if it's invalidated after dropping
 	 * mmu_lock, and the root can't be freed as this vCPU holds a reference.
 	 */
-	mmu->root.hpa = __pa(root->spt);
-	mmu->root.pgd = 0;
+	if (mirror) {
+		mmu->mirror_root_hpa = __pa(root->spt);
+	} else {
+		mmu->root.hpa = __pa(root->spt);
+		mmu->root.pgd = 0;
+	}
 }
 
 static void handle_changed_spte(struct kvm *kvm, int as_id, gfn_t gfn,
@@ -1113,8 +1125,8 @@  static int tdp_mmu_split_huge_page(struct kvm *kvm, struct tdp_iter *iter,
  */
 int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 {
+	struct kvm_mmu_page *root = tdp_mmu_get_root_for_fault(vcpu, fault);
 	struct kvm *kvm = vcpu->kvm;
-	struct kvm_mmu_page *root = root_to_sp(vcpu->arch.mmu->root.hpa);
 	struct tdp_iter iter;
 	struct kvm_mmu_page *sp;
 	int ret = RET_PF_RETRY;
@@ -1152,13 +1164,18 @@  int kvm_tdp_mmu_map(struct kvm_vcpu *vcpu, struct kvm_page_fault *fault)
 		 */
 		sp = tdp_mmu_alloc_sp(vcpu);
 		tdp_mmu_init_child_sp(sp, &iter);
+		if (is_mirror_sp(sp))
+			kvm_mmu_alloc_external_spt(vcpu, sp);
 
 		sp->nx_huge_page_disallowed = fault->huge_page_disallowed;
 
-		if (is_shadow_present_pte(iter.old_spte))
+		if (is_shadow_present_pte(iter.old_spte)) {
+			/* Don't support large page for mirrored roots (TDX) */
+			KVM_BUG_ON(is_mirror_sptep(iter.sptep), vcpu->kvm);
 			r = tdp_mmu_split_huge_page(kvm, &iter, sp, true);
-		else
+		} else {
 			r = tdp_mmu_link_sp(kvm, &iter, sp, true);
+		}
 
 		/*
 		 * Force the guest to retry if installing an upper level SPTE
@@ -1813,7 +1830,8 @@  int kvm_tdp_mmu_get_walk(struct kvm_vcpu *vcpu, u64 addr, u64 *sptes,
 u64 *kvm_tdp_mmu_fast_pf_get_last_sptep(struct kvm_vcpu *vcpu, gfn_t gfn,
 					u64 *spte)
 {
-	struct kvm_mmu_page *root = root_to_sp(vcpu->arch.mmu->root.hpa);
+	/* Fast pf is not supported for mirrored roots  */
+	struct kvm_mmu_page *root = tdp_mmu_get_root(vcpu, KVM_DIRECT_ROOTS);
 	struct tdp_iter iter;
 	tdp_ptep_t sptep = NULL;
 
diff --git a/arch/x86/kvm/mmu/tdp_mmu.h b/arch/x86/kvm/mmu/tdp_mmu.h
index b887c225ff24..2903f03a34be 100644
--- a/arch/x86/kvm/mmu/tdp_mmu.h
+++ b/arch/x86/kvm/mmu/tdp_mmu.h
@@ -10,7 +10,7 @@ 
 void kvm_mmu_init_tdp_mmu(struct kvm *kvm);
 void kvm_mmu_uninit_tdp_mmu(struct kvm *kvm);
 
-void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu);
+void kvm_tdp_mmu_alloc_root(struct kvm_vcpu *vcpu, bool private);
 
 __must_check static inline bool kvm_tdp_mmu_get_root(struct kvm_mmu_page *root)
 {
@@ -21,11 +21,48 @@  void kvm_tdp_mmu_put_root(struct kvm *kvm, struct kvm_mmu_page *root);
 
 enum kvm_tdp_mmu_root_types {
 	KVM_INVALID_ROOTS = BIT(0),
-
-	KVM_VALID_ROOTS = BIT(1),
+	KVM_DIRECT_ROOTS = BIT(1),
+	KVM_MIRROR_ROOTS = BIT(2),
+	KVM_VALID_ROOTS = KVM_DIRECT_ROOTS | KVM_MIRROR_ROOTS,
 	KVM_ALL_ROOTS = KVM_VALID_ROOTS | KVM_INVALID_ROOTS,
 };
 
+static inline enum kvm_tdp_mmu_root_types kvm_gfn_range_filter_to_root_types(struct kvm *kvm,
+							     enum kvm_gfn_range_filter process)
+{
+	enum kvm_tdp_mmu_root_types ret = 0;
+
+	if (!kvm_has_mirrored_tdp(kvm))
+		return KVM_DIRECT_ROOTS;
+
+	if (process & KVM_FILTER_PRIVATE)
+		ret |= KVM_MIRROR_ROOTS;
+	if (process & KVM_FILTER_SHARED)
+		ret |= KVM_DIRECT_ROOTS;
+
+	WARN_ON_ONCE(!ret);
+
+	return ret;
+}
+
+static inline struct kvm_mmu_page *tdp_mmu_get_root_for_fault(struct kvm_vcpu *vcpu,
+							      struct kvm_page_fault *fault)
+{
+	if (unlikely(!kvm_is_addr_direct(vcpu->kvm, fault->addr)))
+		return root_to_sp(vcpu->arch.mmu->mirror_root_hpa);
+
+	return root_to_sp(vcpu->arch.mmu->root.hpa);
+}
+
+static inline struct kvm_mmu_page *tdp_mmu_get_root(struct kvm_vcpu *vcpu,
+						    enum kvm_tdp_mmu_root_types type)
+{
+	if (unlikely(type == KVM_MIRROR_ROOTS))
+		return root_to_sp(vcpu->arch.mmu->mirror_root_hpa);
+
+	return root_to_sp(vcpu->arch.mmu->root.hpa);
+}
+
 bool kvm_tdp_mmu_zap_leafs(struct kvm *kvm, gfn_t start, gfn_t end, bool flush);
 bool kvm_tdp_mmu_zap_sp(struct kvm *kvm, struct kvm_mmu_page *sp);
 void kvm_tdp_mmu_zap_all(struct kvm *kvm);