diff mbox series

[Part2,RFC,v4,20/40] KVM: SVM: Make AVIC backing, VMSA and VMCB memory allocation SNP safe

Message ID 20210707183616.5620-21-brijesh.singh@amd.com (mailing list archive)
State Not Applicable
Delegated to: Herbert Xu
Headers show
Series Add AMD Secure Nested Paging (SEV-SNP) Hypervisor Support | expand

Commit Message

Brijesh Singh July 7, 2021, 6:35 p.m. UTC
When SEV-SNP is globally enabled on a system, the VMRUN instruction
performs additional security checks on AVIC backing, VMSA, and VMCB page.
On a successful VMRUN, these pages are marked "in-use" by the
hardware in the RMP entry, and any attempt to modify the RMP entry for
these pages will result in page-fault (RMP violation check).

While performing the RMP check, hardware will try to create a 2MB TLB
entry for the large page accesses. When it does this, it first reads
the RMP for the base of 2MB region and verifies that all this memory is
safe. If AVIC backing, VMSA, and VMCB memory happen to be the base of
2MB region, then RMP check will fail because of the "in-use" marking for
the base entry of this 2MB region.

e.g.

1. A VMCB was allocated on 2MB-aligned address.
2. The VMRUN instruction marks this RMP entry as "in-use".
3. Another process allocated some other page of memory that happened to be
   within the same 2MB region.
4. That process tried to write its page using physmap.

If the physmap entry in step #4 uses a large (1G/2M) page, then the
hardware will attempt to create a 2M TLB entry. The hardware will find
that the "in-use" bit is set in the RMP entry (because it was a
VMCB page) and will cause an RMP violation check.

See APM2 section 15.36.12 for more information on VMRUN checks when
SEV-SNP is globally active.

A generic allocator can return a page which are 2M aligned and will not
be safe to be used when SEV-SNP is globally enabled. Add a
snp_safe_alloc_page() helper that can be used for allocating the
SNP safe memory. The helper allocated 2 pages and splits them into order-1
allocation. It frees one page and keeps one of the page which is not
2M aligned.

Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
---
 arch/x86/include/asm/kvm_host.h |  1 +
 arch/x86/kvm/lapic.c            |  5 ++++-
 arch/x86/kvm/svm/sev.c          | 27 +++++++++++++++++++++++++++
 arch/x86/kvm/svm/svm.c          | 16 ++++++++++++++--
 arch/x86/kvm/svm/svm.h          |  1 +
 5 files changed, 47 insertions(+), 3 deletions(-)

Comments

Marc Orr July 14, 2021, 1:35 p.m. UTC | #1
On Wed, Jul 7, 2021 at 11:38 AM Brijesh Singh <brijesh.singh@amd.com> wrote:
>
> When SEV-SNP is globally enabled on a system, the VMRUN instruction
> performs additional security checks on AVIC backing, VMSA, and VMCB page.
> On a successful VMRUN, these pages are marked "in-use" by the
> hardware in the RMP entry, and any attempt to modify the RMP entry for
> these pages will result in page-fault (RMP violation check).
>
> While performing the RMP check, hardware will try to create a 2MB TLB
> entry for the large page accesses. When it does this, it first reads
> the RMP for the base of 2MB region and verifies that all this memory is
> safe. If AVIC backing, VMSA, and VMCB memory happen to be the base of
> 2MB region, then RMP check will fail because of the "in-use" marking for
> the base entry of this 2MB region.
>
> e.g.
>
> 1. A VMCB was allocated on 2MB-aligned address.
> 2. The VMRUN instruction marks this RMP entry as "in-use".
> 3. Another process allocated some other page of memory that happened to be
>    within the same 2MB region.
> 4. That process tried to write its page using physmap.
>
> If the physmap entry in step #4 uses a large (1G/2M) page, then the
> hardware will attempt to create a 2M TLB entry. The hardware will find
> that the "in-use" bit is set in the RMP entry (because it was a
> VMCB page) and will cause an RMP violation check.
>
> See APM2 section 15.36.12 for more information on VMRUN checks when
> SEV-SNP is globally active.
>
> A generic allocator can return a page which are 2M aligned and will not
> be safe to be used when SEV-SNP is globally enabled. Add a
> snp_safe_alloc_page() helper that can be used for allocating the
> SNP safe memory. The helper allocated 2 pages and splits them into order-1
> allocation. It frees one page and keeps one of the page which is not
> 2M aligned.
>
> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>

Co-developed-by: Marc Orr <marcorr@google.com>

The original version of this patch had this tag. I think it got
dropped on accident.

> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/lapic.c            |  5 ++++-
>  arch/x86/kvm/svm/sev.c          | 27 +++++++++++++++++++++++++++
>  arch/x86/kvm/svm/svm.c          | 16 ++++++++++++++--
>  arch/x86/kvm/svm/svm.h          |  1 +
>  5 files changed, 47 insertions(+), 3 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 55efbacfc244..188110ab2c02 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1383,6 +1383,7 @@ struct kvm_x86_ops {
>         int (*complete_emulated_msr)(struct kvm_vcpu *vcpu, int err);
>
>         void (*vcpu_deliver_sipi_vector)(struct kvm_vcpu *vcpu, u8 vector);
> +       void *(*alloc_apic_backing_page)(struct kvm_vcpu *vcpu);
>  };
>
>  struct kvm_x86_nested_ops {
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index c0ebef560bd1..d4c77f66d7d5 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2441,7 +2441,10 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns)
>
>         vcpu->arch.apic = apic;
>
> -       apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
> +       if (kvm_x86_ops.alloc_apic_backing_page)
> +               apic->regs = kvm_x86_ops.alloc_apic_backing_page(vcpu);
> +       else
> +               apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
>         if (!apic->regs) {
>                 printk(KERN_ERR "malloc apic regs error for vcpu %x\n",
>                        vcpu->vcpu_id);
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index b8505710c36b..411ed72f63af 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2692,3 +2692,30 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
>                 break;
>         }
>  }
> +
> +struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> +{
> +       unsigned long pfn;
> +       struct page *p;
> +
> +       if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
> +               return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> +
> +       p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
> +       if (!p)
> +               return NULL;
> +
> +       /* split the page order */
> +       split_page(p, 1);
> +
> +       /* Find a non-2M aligned page */
> +       pfn = page_to_pfn(p);
> +       if (IS_ALIGNED(__pfn_to_phys(pfn), PMD_SIZE)) {
> +               pfn++;
> +               __free_page(p);
> +       } else {
> +               __free_page(pfn_to_page(pfn + 1));
> +       }
> +
> +       return pfn_to_page(pfn);
> +}
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 2acf187a3100..a7adf6ca1713 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -1336,7 +1336,7 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
>         svm = to_svm(vcpu);
>
>         err = -ENOMEM;
> -       vmcb01_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> +       vmcb01_page = snp_safe_alloc_page(vcpu);
>         if (!vmcb01_page)
>                 goto out;
>
> @@ -1345,7 +1345,7 @@ static int svm_create_vcpu(struct kvm_vcpu *vcpu)
>                  * SEV-ES guests require a separate VMSA page used to contain
>                  * the encrypted register state of the guest.
>                  */
> -               vmsa_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> +               vmsa_page = snp_safe_alloc_page(vcpu);
>                 if (!vmsa_page)
>                         goto error_free_vmcb_page;
>
> @@ -4439,6 +4439,16 @@ static int svm_vm_init(struct kvm *kvm)
>         return 0;
>  }
>
> +static void *svm_alloc_apic_backing_page(struct kvm_vcpu *vcpu)
> +{
> +       struct page *page = snp_safe_alloc_page(vcpu);
> +
> +       if (!page)
> +               return NULL;
> +
> +       return page_address(page);
> +}
> +
>  static struct kvm_x86_ops svm_x86_ops __initdata = {
>         .hardware_unsetup = svm_hardware_teardown,
>         .hardware_enable = svm_hardware_enable,
> @@ -4564,6 +4574,8 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
>         .complete_emulated_msr = svm_complete_emulated_msr,
>
>         .vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector,
> +
> +       .alloc_apic_backing_page = svm_alloc_apic_backing_page,
>  };
>
>  static struct kvm_x86_init_ops svm_init_ops __initdata = {
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 5f874168551b..1175edb02d33 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -554,6 +554,7 @@ void sev_es_create_vcpu(struct vcpu_svm *svm);
>  void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
>  void sev_es_prepare_guest_switch(struct vcpu_svm *svm, unsigned int cpu);
>  void sev_es_unmap_ghcb(struct vcpu_svm *svm);
> +struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu);
>
>  /* vmenter.S */
>
> --
> 2.17.1
>
>
Brijesh Singh July 14, 2021, 4:47 p.m. UTC | #2
On 7/14/21 8:35 AM, Marc Orr wrote:
>> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> 
> Co-developed-by: Marc Orr <marcorr@google.com>
> 
thank you for adding it, it was dropped accidentally.

-Brijesh
Sean Christopherson July 20, 2021, 6:02 p.m. UTC | #3
IMO, the CPU behavior is a bug, even if the behavior is working as intended for
the microarchitecture.  I.e. this should be treated as an erratum.

On Wed, Jul 07, 2021, Brijesh Singh wrote:
> When SEV-SNP is globally enabled on a system, the VMRUN instruction
> performs additional security checks on AVIC backing, VMSA, and VMCB page.
> On a successful VMRUN, these pages are marked "in-use" by the
> hardware in the RMP entry

There's a lot of "noise" in this intro.  That the CPU does additional checks at
VMRUN isn't all that interesting, what's relevant is that the CPU tags the
associated RMP entry with a special flag.  And IIUC, it does that for _all_ VMs,
not just SNP VMs.

Also, what happens if the pages aren't covered by the RMP?  Table 15-41 states
that the VMCB, AVIC, and VMSA for non-SNP guests need to be Hypervisor-Owned,
but doesn't explicitly require them to be RMP covered.  On the other hand, it
does state that the VMSA for an SNP guest must be Guest-Owned and RMP-Covered.
That implies that the Hypervisor-Owned pages do not need to contained within the
RMP, but how does that work if the CPU is setting a magic flag in the RMP?  Does
VMRUN explode?  Does the CPU corrupt random memory?

Is the in-use flag visible to software?  We've already established that "struct
rmpentry" is microarchitectural, so why not document it in the PPR?  It could be
useful info for debugging unexpected RMP violations, even if the flag isn't stable.

Are there other possible collisions with the in-use flag?  The APM states that
the in-use flag results in RMPUPDATE failing with FAIL_INUSE.  That's the same
error code that's returned if two CPUs attempt RMPUPDATE on the same entry.  That
implies that the RMPUPDATE also sets the in-use flag.  If that's true, then isn't
it possible that the spurious RMP violation #PF could happen if the kernel accesses
a hugepage at the same time a CPU is doing RMPUPDATE on the associated 2mb-aligned
entry?

> and any attempt to modify the RMP entry for these pages will result in
> page-fault (RMP violation check).

Again, not that relevant since KVM isn't attempting to modify the RMP entry.
I've no objection to mentioning this behavior in passing, but it should not be
the focal point of the intro.

> While performing the RMP check, hardware will try to create a 2MB TLB
> entry for the large page accesses. When it does this, it first reads
> the RMP for the base of 2MB region and verifies that all this memory is
> safe. If AVIC backing, VMSA, and VMCB memory happen to be the base of
> 2MB region, then RMP check will fail because of the "in-use" marking for
> the base entry of this 2MB region.

There's a critical piece missing here, which is why an RMP violation is thrown
on "in-use" pages.  E.g. are any translations problematic, or just writable
translations?  It may not affect the actual KVM workaround, but knowing exactly
what goes awry is important.

> e.g.
> 
> 1. A VMCB was allocated on 2MB-aligned address.
> 2. The VMRUN instruction marks this RMP entry as "in-use".
> 3. Another process allocated some other page of memory that happened to be
>    within the same 2MB region.
> 4. That process tried to write its page using physmap.

Please explicitly call out the relevance of the physmap.  IIUC, only the physmap,
a.k.a. direct map, is problematic because that's the only scenario where a large
page can overlap one of the magic pages.  That should be explicitly stated.

> If the physmap entry in step #4 uses a large (1G/2M) page, then the

Be consistent with 2MB vs. 2M, i.e. choose one.

> hardware will attempt to create a 2M TLB entry. The hardware will find
> that the "in-use" bit is set in the RMP entry (because it was a
> VMCB page) and will cause an RMP violation check.

So what happens if the problematic page isn't 2mb aligned?  The lack of an RMP
violation on access implies that the hypervisor can bypass the in-use restriction
and create a 2mb hugepage, i.e. access the in-use page.  Same question for if the
TLB entry exists before the page is marked in-use, which also begs the question
of why the in-use flag is being checked at all on RMP lookups.

> See APM2 section 15.36.12 for more information on VMRUN checks when
> SEV-SNP is globally active.
> 
> A generic allocator can return a page which are 2M aligned and will not
> be safe to be used when SEV-SNP is globally enabled.

> Add a snp_safe_alloc_page() helper that can be used for allocating the SNP
> safe memory. The helper allocated 2 pages and splits them into order-1
> allocation. It frees one page and keeps one of the page which is not 2M
> aligned.

I know it's personal preference as to whether to lead with the solution or the
problem statement, but in this case it would be very helpful to at least provide
a brief summary early on so that the reader has some idea of where the changelog
is headed.  As is, the actual change is buried after a big pile of hardware
details.

E.g. something like this

  Implement a workaround for an SNP erratum where the CPU will incorrectly
  signal an RMP violation #PF if a hugepage (2mb or 1gb) collides with the
  RMP entry of a VMCB, VMSA, or AVIC backing page.

  When SEV-SNP is globally enabled, the CPU marks the VMCB, VMSA, and AVIC
  backing   pages as "in-use" in the RMP after a successful VMRUN.  This is
  done for _all_   VMs, not just SNP-Active VMs.

  If the hypervisor accesses an in-use page through a writable translation,
  the CPU will throw an RMP violation #PF.  On early SNP hardware, if an
  in-use page is 2mb aligned and software accesses any part of the associated
  2mb region with a hupage, the CPU will incorrectly treat the entire 2mb
  region as in-use and signal a spurious RMP violation #PF.

  <gory details on the workaround>

> Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
> ---
>  arch/x86/include/asm/kvm_host.h |  1 +
>  arch/x86/kvm/lapic.c            |  5 ++++-
>  arch/x86/kvm/svm/sev.c          | 27 +++++++++++++++++++++++++++
>  arch/x86/kvm/svm/svm.c          | 16 ++++++++++++++--
>  arch/x86/kvm/svm/svm.h          |  1 +
>  5 files changed, 47 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 55efbacfc244..188110ab2c02 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1383,6 +1383,7 @@ struct kvm_x86_ops {
>  	int (*complete_emulated_msr)(struct kvm_vcpu *vcpu, int err);
>  
>  	void (*vcpu_deliver_sipi_vector)(struct kvm_vcpu *vcpu, u8 vector);
> +	void *(*alloc_apic_backing_page)(struct kvm_vcpu *vcpu);
>  };
>  
>  struct kvm_x86_nested_ops {
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index c0ebef560bd1..d4c77f66d7d5 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2441,7 +2441,10 @@ int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns)
>  
>  	vcpu->arch.apic = apic;
>  
> -	apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
> +	if (kvm_x86_ops.alloc_apic_backing_page)
> +		apic->regs = kvm_x86_ops.alloc_apic_backing_page(vcpu);

This can be a static_call().

> +	else
> +		apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
>  	if (!apic->regs) {
>  		printk(KERN_ERR "malloc apic regs error for vcpu %x\n",
>  		       vcpu->vcpu_id);
> diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
> index b8505710c36b..411ed72f63af 100644
> --- a/arch/x86/kvm/svm/sev.c
> +++ b/arch/x86/kvm/svm/sev.c
> @@ -2692,3 +2692,30 @@ void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
>  		break;
>  	}
>  }
> +
> +struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
> +{
> +	unsigned long pfn;
> +	struct page *p;
> +
> +	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
> +		return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
> +
> +	p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
> +	if (!p)
> +		return NULL;
> +
> +	/* split the page order */
> +	split_page(p, 1);
> +
> +	/* Find a non-2M aligned page */

This isn't "finding" anything, it's identifying which of the two pages is
_guaranteed_ to be unaligned.  The whole function needs a much bigger comment to
explain what's going on.

> +	pfn = page_to_pfn(p);
> +	if (IS_ALIGNED(__pfn_to_phys(pfn), PMD_SIZE)) {
> +		pfn++;
> +		__free_page(p);
> +	} else {
> +		__free_page(pfn_to_page(pfn + 1));
> +	}
> +
> +	return pfn_to_page(pfn);
> +}
Brijesh Singh Aug. 3, 2021, 2:38 p.m. UTC | #4
Hi Sean,


On 7/20/21 1:02 PM, Sean Christopherson wrote:
> IMO, the CPU behavior is a bug, even if the behavior is working as intended for
> the microarchitecture.  I.e. this should be treated as an erratum.
> 

I agreed with your comment that it should be treated as an erratum. I 
now have agreement from the hardware team to publish this as an erratum 
with explanation and recommendation. This will certainly help in 
documenting on "why" we are making the page split.

...

>>   
>> -	apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
>> +	if (kvm_x86_ops.alloc_apic_backing_page)
>> +		apic->regs = kvm_x86_ops.alloc_apic_backing_page(vcpu);
> 
> This can be a static_call().

Noted.

> 
> This isn't "finding" anything, it's identifying which of the two pages is
> _guaranteed_ to be unaligned.  The whole function needs a much bigger comment to
> explain what's going on.

Let me add more comment to clarify it.

> 
>> +	pfn = page_to_pfn(p);
>> +	if (IS_ALIGNED(__pfn_to_phys(pfn), PMD_SIZE)) {
>> +		pfn++;
>> +		__free_page(p);
>> +	} else {
>> +		__free_page(pfn_to_page(pfn + 1));
>> +	}
>> +
>> +	return pfn_to_page(pfn);
>> +}

thanks
diff mbox series

Patch

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 55efbacfc244..188110ab2c02 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1383,6 +1383,7 @@  struct kvm_x86_ops {
 	int (*complete_emulated_msr)(struct kvm_vcpu *vcpu, int err);
 
 	void (*vcpu_deliver_sipi_vector)(struct kvm_vcpu *vcpu, u8 vector);
+	void *(*alloc_apic_backing_page)(struct kvm_vcpu *vcpu);
 };
 
 struct kvm_x86_nested_ops {
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index c0ebef560bd1..d4c77f66d7d5 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2441,7 +2441,10 @@  int kvm_create_lapic(struct kvm_vcpu *vcpu, int timer_advance_ns)
 
 	vcpu->arch.apic = apic;
 
-	apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
+	if (kvm_x86_ops.alloc_apic_backing_page)
+		apic->regs = kvm_x86_ops.alloc_apic_backing_page(vcpu);
+	else
+		apic->regs = (void *)get_zeroed_page(GFP_KERNEL_ACCOUNT);
 	if (!apic->regs) {
 		printk(KERN_ERR "malloc apic regs error for vcpu %x\n",
 		       vcpu->vcpu_id);
diff --git a/arch/x86/kvm/svm/sev.c b/arch/x86/kvm/svm/sev.c
index b8505710c36b..411ed72f63af 100644
--- a/arch/x86/kvm/svm/sev.c
+++ b/arch/x86/kvm/svm/sev.c
@@ -2692,3 +2692,30 @@  void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector)
 		break;
 	}
 }
+
+struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu)
+{
+	unsigned long pfn;
+	struct page *p;
+
+	if (!cpu_feature_enabled(X86_FEATURE_SEV_SNP))
+		return alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+
+	p = alloc_pages(GFP_KERNEL_ACCOUNT | __GFP_ZERO, 1);
+	if (!p)
+		return NULL;
+
+	/* split the page order */
+	split_page(p, 1);
+
+	/* Find a non-2M aligned page */
+	pfn = page_to_pfn(p);
+	if (IS_ALIGNED(__pfn_to_phys(pfn), PMD_SIZE)) {
+		pfn++;
+		__free_page(p);
+	} else {
+		__free_page(pfn_to_page(pfn + 1));
+	}
+
+	return pfn_to_page(pfn);
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 2acf187a3100..a7adf6ca1713 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -1336,7 +1336,7 @@  static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 	svm = to_svm(vcpu);
 
 	err = -ENOMEM;
-	vmcb01_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+	vmcb01_page = snp_safe_alloc_page(vcpu);
 	if (!vmcb01_page)
 		goto out;
 
@@ -1345,7 +1345,7 @@  static int svm_create_vcpu(struct kvm_vcpu *vcpu)
 		 * SEV-ES guests require a separate VMSA page used to contain
 		 * the encrypted register state of the guest.
 		 */
-		vmsa_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
+		vmsa_page = snp_safe_alloc_page(vcpu);
 		if (!vmsa_page)
 			goto error_free_vmcb_page;
 
@@ -4439,6 +4439,16 @@  static int svm_vm_init(struct kvm *kvm)
 	return 0;
 }
 
+static void *svm_alloc_apic_backing_page(struct kvm_vcpu *vcpu)
+{
+	struct page *page = snp_safe_alloc_page(vcpu);
+
+	if (!page)
+		return NULL;
+
+	return page_address(page);
+}
+
 static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.hardware_unsetup = svm_hardware_teardown,
 	.hardware_enable = svm_hardware_enable,
@@ -4564,6 +4574,8 @@  static struct kvm_x86_ops svm_x86_ops __initdata = {
 	.complete_emulated_msr = svm_complete_emulated_msr,
 
 	.vcpu_deliver_sipi_vector = svm_vcpu_deliver_sipi_vector,
+
+	.alloc_apic_backing_page = svm_alloc_apic_backing_page,
 };
 
 static struct kvm_x86_init_ops svm_init_ops __initdata = {
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 5f874168551b..1175edb02d33 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -554,6 +554,7 @@  void sev_es_create_vcpu(struct vcpu_svm *svm);
 void sev_vcpu_deliver_sipi_vector(struct kvm_vcpu *vcpu, u8 vector);
 void sev_es_prepare_guest_switch(struct vcpu_svm *svm, unsigned int cpu);
 void sev_es_unmap_ghcb(struct vcpu_svm *svm);
+struct page *snp_safe_alloc_page(struct kvm_vcpu *vcpu);
 
 /* vmenter.S */