
[Part2,RFC,v4,28/40] KVM: X86: Introduce kvm_mmu_map_tdp_page() for use by SEV

Message ID 20210707183616.5620-29-brijesh.singh@amd.com (mailing list archive)
State Not Applicable
Delegated to: Herbert Xu
Series Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support

Commit Message

Brijesh Singh July 7, 2021, 6:36 p.m. UTC
Introduce a helper to directly fault in a TDP page without going through
the full page fault path.  This allows SEV-SNP to build the nested page
table while handling the page state change VMGEXIT.  A guest may issue a
page state change VMGEXIT before accessing the page.  Create the fault so
that the VMGEXIT handler can get the TDP page level and keep the TDP and
RMP page levels in sync.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/kvm/mmu.h     |  2 ++
 arch/x86/kvm/mmu/mmu.c | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+)
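
[Editorial sketch, not part of the patch: the snippet below illustrates the call
pattern the commit message describes, with the new helper bookended by TDP walks
as noted in the review further down.  The wrapper name, error-code bits, target
level, and the exact kvm_mmu_get_tdp_walk() signature are assumptions.]

/*
 * Illustrative only -- not from this series as posted.  Shows how a page
 * state change handler (e.g. in arch/x86/kvm/svm/sev.c) might pre-fault a
 * GPA so a TDP mapping level exists to mirror into the RMP.  The
 * kvm_mmu_get_tdp_walk() signature is assumed from the same series; the
 * wrapper name and the PG_LEVEL_2M target are hypothetical.
 */
static int snp_prefault_gpa_for_psc(struct kvm_vcpu *vcpu, gpa_t gpa, int *level)
{
	kvm_pfn_t pfn;

	/* If the GPA is already mapped, just report the existing level. */
	if (kvm_mmu_get_tdp_walk(vcpu, gpa, &pfn, level))
		return 0;

	/* Fault the page in even though the guest has not accessed it yet. */
	kvm_mmu_map_tdp_page(vcpu, gpa, PFERR_USER_MASK, PG_LEVEL_2M);

	/* Re-walk to pick up the level of the newly installed mapping. */
	return kvm_mmu_get_tdp_walk(vcpu, gpa, &pfn, level) ? 0 : -ENOENT;
}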

Comments

Sean Christopherson July 16, 2021, 6:15 p.m. UTC | #1
On Wed, Jul 07, 2021, Brijesh Singh wrote:
> +int kvm_mmu_map_tdp_page(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, int max_level)
> +{
> +	int r;
> +
> +	/*
> +	 * Loop on the page fault path to handle the case where an mmu_notifier
> +	 * invalidation triggers RET_PF_RETRY.  In the normal page fault path,
> +	 * KVM needs to resume the guest in case the invalidation changed any
> +	 * of the page fault properties, i.e. the gpa or error code.  For this
> +	 * path, the gpa and error code are fixed by the caller, and the caller
> +	 * expects failure if and only if the page fault can't be fixed.
> +	 */
> +	do {
> +		r = direct_page_fault(vcpu, gpa, error_code, false, max_level, true);
> +	} while (r == RET_PF_RETRY);
> +
> +	return r;

This implementation is completely broken, which in turn means that the page state
change code is not well tested.  The mess is likely masked to some extent because
the call is bookended by calls to kvm_mmu_get_tdp_walk(), i.e. most of the time
it's not called, and when it is called, the bugs are hidden by the second walk
detecting that the mapping was not installed.

  1. direct_page_fault() does not return a pfn, it returns the action that should
     be taken by the caller.
  2. The while() can be optimized to bail on no_slot PFNs.
  3. mmu_topup_memory_caches() needs to be called here, otherwise @pfn will be
     uninitialized.  The alternative would be to set @pfn when that fails in
     direct_page_fault().
  4. The 'int' return value is wrong, it needs to be kvm_pfn_t.

A correct implementation can be found in the TDX series; the easiest thing would
be to suck in those patches (a rough sketch along those lines follows the links below).

https://lore.kernel.org/kvm/ceffc7ef0746c6064330ef5c30bc0bb5994a1928.1625186503.git.isaku.yamahata@intel.com/
https://lore.kernel.org/kvm/a7e7602375e1f63b32eda19cb8011f11794ebe28.1625186503.git.isaku.yamahata@intel.com/
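
[Editorial sketch, not the actual TDX patch: a version of the helper reworked
along the review points above.  The __direct_page_fault() out-parameter variant
assumed here is hypothetical.]

/*
 * Sketch only, shaped by the review points above.  Assumes a hypothetical
 * __direct_page_fault() variant that also reports the resolved pfn.
 */
kvm_pfn_t kvm_mmu_map_tdp_page(struct kvm_vcpu *vcpu, gpa_t gpa,
			       u32 error_code, int max_level)
{
	kvm_pfn_t pfn;
	int r;

	/* Top up the caches so the fault path cannot leave @pfn untouched. */
	if (mmu_topup_memory_caches(vcpu, false))
		return KVM_PFN_ERR_FAULT;

	/*
	 * Loop to absorb RET_PF_RETRY from mmu_notifier invalidations; the
	 * gpa and error code are fixed by the caller, so there is no guest
	 * state to re-check.  Bail early on no-slot pfns instead of retrying.
	 */
	do {
		r = __direct_page_fault(vcpu, gpa, error_code, false,
					max_level, true, &pfn);
	} while (r == RET_PF_RETRY && !is_noslot_pfn(pfn));

	if (r < 0 || is_error_noslot_pfn(pfn))
		return KVM_PFN_ERR_FAULT;

	return pfn;
}
EXPORT_SYMBOL_GPL(kvm_mmu_map_tdp_page);

[Returning a pfn (or KVM_PFN_ERR_FAULT) covers points 1 and 4, the memory-cache
top-up covers point 3, and the no-slot check in the loop covers point 2.]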

> +}
> +EXPORT_SYMBOL_GPL(kvm_mmu_map_tdp_page);
> +
>  static void nonpaging_init_context(struct kvm_vcpu *vcpu,
>  				   struct kvm_mmu *context)
>  {
> -- 
> 2.17.1
>

Patch

diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index 88d0ed5225a4..005ce139c97d 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -114,6 +114,8 @@  static inline void kvm_mmu_load_pgd(struct kvm_vcpu *vcpu)
 int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 		       bool prefault);
 
+int kvm_mmu_map_tdp_page(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, int max_level);
+
 static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
 					u32 err, bool prefault)
 {
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 7991ffae7b31..df8923fb664f 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -3842,6 +3842,26 @@  int kvm_tdp_page_fault(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code,
 				 max_level, true);
 }
 
+int kvm_mmu_map_tdp_page(struct kvm_vcpu *vcpu, gpa_t gpa, u32 error_code, int max_level)
+{
+	int r;
+
+	/*
+	 * Loop on the page fault path to handle the case where an mmu_notifier
+	 * invalidation triggers RET_PF_RETRY.  In the normal page fault path,
+	 * KVM needs to resume the guest in case the invalidation changed any
+	 * of the page fault properties, i.e. the gpa or error code.  For this
+	 * path, the gpa and error code are fixed by the caller, and the caller
+	 * expects failure if and only if the page fault can't be fixed.
+	 */
+	do {
+		r = direct_page_fault(vcpu, gpa, error_code, false, max_level, true);
+	} while (r == RET_PF_RETRY);
+
+	return r;
+}
+EXPORT_SYMBOL_GPL(kvm_mmu_map_tdp_page);
+
 static void nonpaging_init_context(struct kvm_vcpu *vcpu,
 				   struct kvm_mmu *context)
 {