diff mbox series

[v2] KVM: SVM: Inhibit AVIC on SNP-enabled system without HvInUseWrAllowed feature

Message ID 20241018085037.14131-1-suravee.suthikulpanit@amd.com (mailing list archive)
State New
Headers show
Series [v2] KVM: SVM: Inhibit AVIC on SNP-enabled system without HvInUseWrAllowed feature | expand

Commit Message

Suravee Suthikulpanit Oct. 18, 2024, 8:50 a.m. UTC
On SNP-enabled system, VMRUN marks AVIC Backing Page as in-use while
the guest is running for both secure and non-secure guest. Any hypervisor
write to the in-use vCPU's AVIC backing page (e.g. to inject an interrupt)
will generate unexpected #PF in the host.

Currently, attempt to run AVIC guest would result in the following error:

    BUG: unable to handle page fault for address: ff3a442e549cc270
    #PF: supervisor write access in kernel mode
    #PF: error_code(0x80000003) - RMP violation
    PGD b6ee01067 P4D b6ee02067 PUD 10096d063 PMD 11c540063 PTE 80000001149cc163
    SEV-SNP: PFN 0x1149cc unassigned, dumping non-zero entries in 2M PFN region: [0x114800 - 0x114a00]
    ...

Newer AMD system is enhanced to allow hypervisor to modify the backing page
for non-secure guest on SNP-enabled system. This enhancement is available
when the CPUID Fn8000_001F_EAX bit 30 is set (HvInUseWrAllowed).

This table describes AVIC support matrix w.r.t. SNP enablement:

               | Non-SNP system |     SNP system
-----------------------------------------------------
 Non-SNP guest |  AVIC Activate | AVIC Activate iff
               |                | HvInuseWrAllowed=1
-----------------------------------------------------
     SNP guest |      N/A       |    Secure AVIC
               |                |    x2APIC only

Introduce APICV_INHIBIT_REASON_HVINUSEWR_NOT_ALLOWED to deactivate AVIC
when the feature is not available on SNP-enabled system.

See the AMD64 Architecture Programmer’s Manual (APM) Volume 2 for detail.
(https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/
programmer-references/40332.pdf)

Fixes: 216d106c7ff7 ("x86/sev: Add SEV-SNP host initialization support")
Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
---

Change log v2:
 * Use APICv inhibit bit instead of disabling AVIC in driver.

 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/include/asm/kvm_host.h    | 9 ++++++++-
 arch/x86/kvm/svm/avic.c            | 6 ++++++
 arch/x86/kvm/svm/svm.h             | 3 ++-
 4 files changed, 17 insertions(+), 2 deletions(-)

Comments

Joao Martins Oct. 18, 2024, 9:57 a.m. UTC | #1
On 18/10/2024 09:50, Suravee Suthikulpanit wrote:
> On SNP-enabled system, VMRUN marks AVIC Backing Page as in-use while
> the guest is running for both secure and non-secure guest. Any hypervisor
> write to the in-use vCPU's AVIC backing page (e.g. to inject an interrupt)
> will generate unexpected #PF in the host.
> 
> Currently, attempt to run AVIC guest would result in the following error:
> 
>     BUG: unable to handle page fault for address: ff3a442e549cc270
>     #PF: supervisor write access in kernel mode
>     #PF: error_code(0x80000003) - RMP violation
>     PGD b6ee01067 P4D b6ee02067 PUD 10096d063 PMD 11c540063 PTE 80000001149cc163
>     SEV-SNP: PFN 0x1149cc unassigned, dumping non-zero entries in 2M PFN region: [0x114800 - 0x114a00]
>     ...
> 
> Newer AMD system is enhanced to allow hypervisor to modify the backing page
> for non-secure guest on SNP-enabled system. This enhancement is available
> when the CPUID Fn8000_001F_EAX bit 30 is set (HvInUseWrAllowed).
> 
> This table describes AVIC support matrix w.r.t. SNP enablement:
> 
>                | Non-SNP system |     SNP system
> -----------------------------------------------------
>  Non-SNP guest |  AVIC Activate | AVIC Activate iff
>                |                | HvInuseWrAllowed=1
> -----------------------------------------------------
>      SNP guest |      N/A       |    Secure AVIC
>                |                |    x2APIC only
> 
> Introduce APICV_INHIBIT_REASON_HVINUSEWR_NOT_ALLOWED to deactivate AVIC
> when the feature is not available on SNP-enabled system.
> 
I misread your first sentence in v1 wrt to non-secure guests -- but it's a lot
more obvious now. If this was sort of a dynamic condition at runtime (like the
other inhibits triggered by guest behavior or something that can change at
runtime post-boot, or modparam) then the inhibit system would be best acquainted
for preventing enabling AVIC on a per-vm basis. But it appears this is
global-defined-at-boot that blocks any non-secure guest from using AVIC if we
boot as an SNP-enabled host i.e. based on testing BSP-defined feature bits solely.

Your original proposal perhaps is better where you disable AVIC globally in
avic_hardware_setup(). Apologies for (mistankenly) misleading you and wasting
your time :/

> See the AMD64 Architecture Programmer’s Manual (APM) Volume 2 for detail.
> (https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/
> programmer-references/40332.pdf)
> 
> Fixes: 216d106c7ff7 ("x86/sev: Add SEV-SNP host initialization support")
> Signed-off-by: Suravee Suthikulpanit <suravee.suthikulpanit@amd.com>
> ---
> 
> Change log v2:
>  * Use APICv inhibit bit instead of disabling AVIC in driver.
> 
>  arch/x86/include/asm/cpufeatures.h | 1 +
>  arch/x86/include/asm/kvm_host.h    | 9 ++++++++-
>  arch/x86/kvm/svm/avic.c            | 6 ++++++
>  arch/x86/kvm/svm/svm.h             | 3 ++-
>  4 files changed, 17 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index dd4682857c12..921b6de80e24 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -448,6 +448,7 @@
>  #define X86_FEATURE_SME_COHERENT	(19*32+10) /* AMD hardware-enforced cache coherency */
>  #define X86_FEATURE_DEBUG_SWAP		(19*32+14) /* "debug_swap" AMD SEV-ES full debug state swap support */
>  #define X86_FEATURE_SVSM		(19*32+28) /* "svsm" SVSM present */
> +#define X86_FEATURE_HV_INUSE_WR_ALLOWED	(19*32+30) /* Write to in-use hypervisor-owned pages allowed */
>  
>  /* AMD-defined Extended Feature 2 EAX, CPUID level 0x80000021 (EAX), word 20 */
>  #define X86_FEATURE_NO_NESTED_DATA_BP	(20*32+ 0) /* No Nested Data Breakpoints */
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 4a68cb3eba78..1fef50025512 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1276,6 +1276,12 @@ enum kvm_apicv_inhibit {
>  	 */
>  	APICV_INHIBIT_REASON_LOGICAL_ID_ALIASED,
>  
> +	/*
> +	 * Non-SNP guest cannot activate AVIC on SNP-enabled system w/o
> +	 * CPUID HvInUseWrAllowed feature.
> +	 */
> +	APICV_INHIBIT_REASON_HVINUSEWR_NOT_ALLOWED,
> +
>  	NR_APICV_INHIBIT_REASONS,
>  };
>  
> @@ -1294,7 +1300,8 @@ enum kvm_apicv_inhibit {
>  	__APICV_INHIBIT_REASON(IRQWIN),			\
>  	__APICV_INHIBIT_REASON(PIT_REINJ),		\
>  	__APICV_INHIBIT_REASON(SEV),			\
> -	__APICV_INHIBIT_REASON(LOGICAL_ID_ALIASED)
> +	__APICV_INHIBIT_REASON(LOGICAL_ID_ALIASED),	\
> +	__APICV_INHIBIT_REASON(HVINUSEWR_NOT_ALLOWED)
>  
>  struct kvm_arch {
>  	unsigned long n_used_mmu_pages;
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 4b74ea91f4e6..cc4f0c00334a 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -202,6 +202,12 @@ int avic_vm_init(struct kvm *kvm)
>  	if (!enable_apicv)
>  		return 0;
>  
> +	if (cc_platform_has(CC_ATTR_HOST_SEV_SNP) &&
> +	    !boot_cpu_has(X86_FEATURE_HV_INUSE_WR_ALLOWED)) {
> +		pr_debug("APICv Inhibit due to Missing HvInUseWrAllowed.\n");
> +		kvm_set_apicv_inhibit(kvm, APICV_INHIBIT_REASON_HVINUSEWR_NOT_ALLOWED);
> +	}
> +
>  	/* Allocating physical APIC ID table (4KB) */
>  	p_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
>  	if (!p_page)
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 76107c7d0595..13046bad2d6e 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -682,7 +682,8 @@ extern struct kvm_x86_nested_ops svm_nested_ops;
>  	BIT(APICV_INHIBIT_REASON_PHYSICAL_ID_ALIASED) |	\
>  	BIT(APICV_INHIBIT_REASON_APIC_ID_MODIFIED) |	\
>  	BIT(APICV_INHIBIT_REASON_APIC_BASE_MODIFIED) |	\
> -	BIT(APICV_INHIBIT_REASON_LOGICAL_ID_ALIASED)	\
> +	BIT(APICV_INHIBIT_REASON_LOGICAL_ID_ALIASED) |	\
> +	BIT(APICV_INHIBIT_REASON_HVINUSEWR_NOT_ALLOWED)	\
>  )
>  
>  bool avic_hardware_setup(void);
Suravee Suthikulpanit Oct. 18, 2024, 10:06 a.m. UTC | #2
On 10/18/2024 4:57 PM, Joao Martins wrote:
> On 18/10/2024 09:50, Suravee Suthikulpanit wrote:
>> On SNP-enabled system, VMRUN marks AVIC Backing Page as in-use while
>> the guest is running for both secure and non-secure guest. Any hypervisor
>> write to the in-use vCPU's AVIC backing page (e.g. to inject an interrupt)
>> will generate unexpected #PF in the host.
>>
>> Currently, attempt to run AVIC guest would result in the following error:
>>
>>      BUG: unable to handle page fault for address: ff3a442e549cc270
>>      #PF: supervisor write access in kernel mode
>>      #PF: error_code(0x80000003) - RMP violation
>>      PGD b6ee01067 P4D b6ee02067 PUD 10096d063 PMD 11c540063 PTE 80000001149cc163
>>      SEV-SNP: PFN 0x1149cc unassigned, dumping non-zero entries in 2M PFN region: [0x114800 - 0x114a00]
>>      ...
>>
>> Newer AMD system is enhanced to allow hypervisor to modify the backing page
>> for non-secure guest on SNP-enabled system. This enhancement is available
>> when the CPUID Fn8000_001F_EAX bit 30 is set (HvInUseWrAllowed).
>>
>> This table describes AVIC support matrix w.r.t. SNP enablement:
>>
>>                 | Non-SNP system |     SNP system
>> -----------------------------------------------------
>>   Non-SNP guest |  AVIC Activate | AVIC Activate iff
>>                 |                | HvInuseWrAllowed=1
>> -----------------------------------------------------
>>       SNP guest |      N/A       |    Secure AVIC
>>                 |                |    x2APIC only
>>
>> Introduce APICV_INHIBIT_REASON_HVINUSEWR_NOT_ALLOWED to deactivate AVIC
>> when the feature is not available on SNP-enabled system.
>>
> I misread your first sentence in v1 wrt to non-secure guests -- but it's a lot
> more obvious now. If this was sort of a dynamic condition at runtime (like the
> other inhibits triggered by guest behavior or something that can change at
> runtime post-boot, or modparam) then the inhibit system would be best acquainted
> for preventing enabling AVIC on a per-vm basis. But it appears this is
> global-defined-at-boot that blocks any non-secure guest from using AVIC if we
> boot as an SNP-enabled host i.e. based on testing BSP-defined feature bits solely.
> 
> Your original proposal perhaps is better where you disable AVIC globally in
> avic_hardware_setup(). Apologies for (mistankenly) misleading you and wasting
> your time :/

Repost from v1 thread:

I was considering the APICV inhibit as well, and decided to go with 
disabling AVIC since it does not require additional 
APICV_INHIBIT_REASON_XXX flag, and we can simply disable AVIC support 
during kvm-amd driver initialization.

After rethink this, it is better to use per-VM APICv inhibition instead 
since certain AVIC data structures will be needed for secure AVIC 
support in the future.

Thanks,
Suravee
Sean Christopherson Oct. 21, 2024, 8:06 p.m. UTC | #3
On Fri, Oct 18, 2024, Suravee Suthikulpanit wrote:
> On 10/18/2024 4:57 PM, Joao Martins wrote:
> > On 18/10/2024 09:50, Suravee Suthikulpanit wrote:
> > > On SNP-enabled system, VMRUN marks AVIC Backing Page as in-use while
> > > the guest is running for both secure and non-secure guest. Any hypervisor
> > > write to the in-use vCPU's AVIC backing page (e.g. to inject an interrupt)
> > > will generate unexpected #PF in the host.
> > > 
> > > Currently, attempt to run AVIC guest would result in the following error:
> > > 
> > >      BUG: unable to handle page fault for address: ff3a442e549cc270
> > >      #PF: supervisor write access in kernel mode
> > >      #PF: error_code(0x80000003) - RMP violation
> > >      PGD b6ee01067 P4D b6ee02067 PUD 10096d063 PMD 11c540063 PTE 80000001149cc163
> > >      SEV-SNP: PFN 0x1149cc unassigned, dumping non-zero entries in 2M PFN region: [0x114800 - 0x114a00]
> > >      ...
> > > 
> > > Newer AMD system is enhanced to allow hypervisor to modify the backing page
> > > for non-secure guest on SNP-enabled system. This enhancement is available
> > > when the CPUID Fn8000_001F_EAX bit 30 is set (HvInUseWrAllowed).
> > > 
> > > This table describes AVIC support matrix w.r.t. SNP enablement:
> > > 
> > >                 | Non-SNP system |     SNP system
> > > -----------------------------------------------------
> > >   Non-SNP guest |  AVIC Activate | AVIC Activate iff
> > >                 |                | HvInuseWrAllowed=1
> > > -----------------------------------------------------
> > >       SNP guest |      N/A       |    Secure AVIC
> > >                 |                |    x2APIC only
> > > 
> > > Introduce APICV_INHIBIT_REASON_HVINUSEWR_NOT_ALLOWED to deactivate AVIC

Please use human/reader friendly terms, that's a very convoluted way of saying:

	APICV_INHIBIT_REASON_IN_USE_AVIC_PAGE_READ_ONLY

> > > when the feature is not available on SNP-enabled system.
> > > 
> > I misread your first sentence in v1 wrt to non-secure guests -- but it's a lot
> > more obvious now. If this was sort of a dynamic condition at runtime (like the
> > other inhibits triggered by guest behavior or something that can change at
> > runtime post-boot, or modparam) then the inhibit system would be best acquainted
> > for preventing enabling AVIC on a per-vm basis. But it appears this is
> > global-defined-at-boot that blocks any non-secure guest from using AVIC if we
> > boot as an SNP-enabled host i.e. based on testing BSP-defined feature bits solely.
> > 
> > Your original proposal perhaps is better where you disable AVIC globally in
> > avic_hardware_setup(). Apologies for (mistankenly) misleading you and wasting
> > your time :/
> 
> Repost from v1 thread:
> 
> I was considering the APICV inhibit as well, and decided to go with
> disabling AVIC since it does not require additional APICV_INHIBIT_REASON_XXX
> flag, and we can simply disable AVIC support during kvm-amd driver
> initialization.
> 
> After rethink this, it is better to use per-VM APICv inhibition instead
> since certain AVIC data structures will be needed for secure AVIC support in
> the future.

I don't follow.  I agree with Joao, this seems like an all-or-nothing situation.
There's no point in an inhibit unless Secure AVIC CPUs will exist WITHOUT
HvInuseWrAllowed, but even then, to keep things simple(r), I'm tempted to make
SNP+AVIC require HvInuseWrAllowed
diff mbox series

Patch

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index dd4682857c12..921b6de80e24 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -448,6 +448,7 @@ 
 #define X86_FEATURE_SME_COHERENT	(19*32+10) /* AMD hardware-enforced cache coherency */
 #define X86_FEATURE_DEBUG_SWAP		(19*32+14) /* "debug_swap" AMD SEV-ES full debug state swap support */
 #define X86_FEATURE_SVSM		(19*32+28) /* "svsm" SVSM present */
+#define X86_FEATURE_HV_INUSE_WR_ALLOWED	(19*32+30) /* Write to in-use hypervisor-owned pages allowed */
 
 /* AMD-defined Extended Feature 2 EAX, CPUID level 0x80000021 (EAX), word 20 */
 #define X86_FEATURE_NO_NESTED_DATA_BP	(20*32+ 0) /* No Nested Data Breakpoints */
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 4a68cb3eba78..1fef50025512 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1276,6 +1276,12 @@  enum kvm_apicv_inhibit {
 	 */
 	APICV_INHIBIT_REASON_LOGICAL_ID_ALIASED,
 
+	/*
+	 * Non-SNP guest cannot activate AVIC on SNP-enabled system w/o
+	 * CPUID HvInUseWrAllowed feature.
+	 */
+	APICV_INHIBIT_REASON_HVINUSEWR_NOT_ALLOWED,
+
 	NR_APICV_INHIBIT_REASONS,
 };
 
@@ -1294,7 +1300,8 @@  enum kvm_apicv_inhibit {
 	__APICV_INHIBIT_REASON(IRQWIN),			\
 	__APICV_INHIBIT_REASON(PIT_REINJ),		\
 	__APICV_INHIBIT_REASON(SEV),			\
-	__APICV_INHIBIT_REASON(LOGICAL_ID_ALIASED)
+	__APICV_INHIBIT_REASON(LOGICAL_ID_ALIASED),	\
+	__APICV_INHIBIT_REASON(HVINUSEWR_NOT_ALLOWED)
 
 struct kvm_arch {
 	unsigned long n_used_mmu_pages;
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 4b74ea91f4e6..cc4f0c00334a 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -202,6 +202,12 @@  int avic_vm_init(struct kvm *kvm)
 	if (!enable_apicv)
 		return 0;
 
+	if (cc_platform_has(CC_ATTR_HOST_SEV_SNP) &&
+	    !boot_cpu_has(X86_FEATURE_HV_INUSE_WR_ALLOWED)) {
+		pr_debug("APICv Inhibit due to Missing HvInUseWrAllowed.\n");
+		kvm_set_apicv_inhibit(kvm, APICV_INHIBIT_REASON_HVINUSEWR_NOT_ALLOWED);
+	}
+
 	/* Allocating physical APIC ID table (4KB) */
 	p_page = alloc_page(GFP_KERNEL_ACCOUNT | __GFP_ZERO);
 	if (!p_page)
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 76107c7d0595..13046bad2d6e 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -682,7 +682,8 @@  extern struct kvm_x86_nested_ops svm_nested_ops;
 	BIT(APICV_INHIBIT_REASON_PHYSICAL_ID_ALIASED) |	\
 	BIT(APICV_INHIBIT_REASON_APIC_ID_MODIFIED) |	\
 	BIT(APICV_INHIBIT_REASON_APIC_BASE_MODIFIED) |	\
-	BIT(APICV_INHIBIT_REASON_LOGICAL_ID_ALIASED)	\
+	BIT(APICV_INHIBIT_REASON_LOGICAL_ID_ALIASED) |	\
+	BIT(APICV_INHIBIT_REASON_HVINUSEWR_NOT_ALLOWED)	\
 )
 
 bool avic_hardware_setup(void);