Message ID | 6fc4cd0f07f884da4345951670aebff8815270b7.1737505394.git.ashish.kalra@amd.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Fix broken SNP support with KVM module built-in | expand |
On 1/21/25 19:00, Ashish Kalra wrote: > From: Ashish Kalra <ashish.kalra@amd.com> > > This patch fixes issues with enabling SNP host support and effectively > SNP support which is broken with respect to the KVM module being > built-in. Should this be the second patch in the series, since the first patch is what allows the change from device_initcall() to subsys_initcall()? > > SNP host support is enabled in snp_rmptable_init() which is invoked as a > device_initcall(). Here device_initcall() is used as snp_rmptable_init() > expects AMD IOMMU SNP support to be enabled prior to it and the AMD > IOMMU driver enables SNP support after PCI bus enumeration. > > Now, if kvm_amd module is built-in, it gets initialized before SNP host > support is enabled in snp_rmptable_init() : > > [ 10.131811] kvm_amd: TSC scaling supported > [ 10.136384] kvm_amd: Nested Virtualization enabled > [ 10.141734] kvm_amd: Nested Paging enabled > [ 10.146304] kvm_amd: LBR virtualization supported > [ 10.151557] kvm_amd: SEV enabled (ASIDs 100 - 509) > [ 10.156905] kvm_amd: SEV-ES enabled (ASIDs 1 - 99) > [ 10.162256] kvm_amd: SEV-SNP enabled (ASIDs 1 - 99) > [ 10.171508] kvm_amd: Virtual VMLOAD VMSAVE supported > [ 10.177052] kvm_amd: Virtual GIF supported > ... > ... > [ 10.201648] kvm_amd: in svm_enable_virtualization_cpu > > And then svm_x86_ops->enable_virtualization_cpu() > (svm_enable_virtualization_cpu) programs MSR_VM_HSAVE_PA as following: > wrmsrl(MSR_VM_HSAVE_PA, sd->save_area_pa); > > So VM_HSAVE_PA is non-zero before SNP support is enabled on all CPUs. > > snp_rmptable_init() gets invoked after svm_enable_virtualization_cpu() > as following : > ... > [ 11.256138] kvm_amd: in svm_enable_virtualization_cpu > ... > [ 11.264918] SEV-SNP: in snp_rmptable_init > > This triggers a #GP exception in snp_rmptable_init() when snp_enable() > is invoked to set SNP_EN in SYSCFG MSR: > > [ 11.294289] unchecked MSR access error: WRMSR to 0xc0010010 (tried to write 0x0000000003fc0000) at rIP: 0xffffffffaf5d5c28 (native_write_msr+0x8/0x30) > ... > [ 11.294404] Call Trace: > [ 11.294482] <IRQ> > [ 11.294513] ? show_stack_regs+0x26/0x30 > [ 11.294522] ? ex_handler_msr+0x10f/0x180 > [ 11.294529] ? search_extable+0x2b/0x40 > [ 11.294538] ? fixup_exception+0x2dd/0x340 > [ 11.294542] ? exc_general_protection+0x14f/0x440 > [ 11.294550] ? asm_exc_general_protection+0x2b/0x30 > [ 11.294557] ? __pfx_snp_enable+0x10/0x10 > [ 11.294567] ? native_write_msr+0x8/0x30 > [ 11.294570] ? __snp_enable+0x5d/0x70 > [ 11.294575] snp_enable+0x19/0x20 > [ 11.294578] __flush_smp_call_function_queue+0x9c/0x3a0 > [ 11.294586] generic_smp_call_function_single_interrupt+0x17/0x20 > [ 11.294589] __sysvec_call_function+0x20/0x90 > [ 11.294596] sysvec_call_function+0x80/0xb0 > [ 11.294601] </IRQ> > [ 11.294603] <TASK> > [ 11.294605] asm_sysvec_call_function+0x1f/0x30 > ... > [ 11.294631] arch_cpu_idle+0xd/0x20 > [ 11.294633] default_idle_call+0x34/0xd0 > [ 11.294636] do_idle+0x1f1/0x230 > [ 11.294643] ? complete+0x71/0x80 > [ 11.294649] cpu_startup_entry+0x30/0x40 > [ 11.294652] start_secondary+0x12d/0x160 > [ 11.294655] common_startup_64+0x13e/0x141 > [ 11.294662] </TASK> > > This #GP exception is getting triggered due to the following errata for > AMD family 19h Models 10h-1Fh Processors: > > Processor may generate spurious #GP(0) Exception on WRMSR instruction: > Description: > The Processor will generate a spurious #GP(0) Exception on a WRMSR > instruction if the following conditions are all met: > - the target of the WRMSR is a SYSCFG register. > - the write changes the value of SYSCFG.SNPEn from 0 to 1. > - One of the threads that share the physical core has a non-zero > value in the VM_HSAVE_PA MSR. > > The document being referred to above: > https://www.amd.com/content/dam/amd/en/documents/processor-tech-docs/revision-guides/57095-PUB_1_01.pdf > > To summarize, with kvm_amd module being built-in, KVM/SVM initialization > happens before host SNP is enabled and this SVM initialization > sets VM_HSAVE_PA to non-zero, which then triggers a #GP when > SYSCFG.SNPEn is being set and this will subsequently cause > SNP_INIT(_EX) to fail with INVALID_CONFIG error as SYSCFG[SnpEn] is not > set on all CPUs. > > This patch fixes the current SNP host enabling code and effectively SNP > which is broken with respect to the KVM module being built-in. > > Essentially SNP host enabling code should be invoked before KVM > initialization, which is currently not the case when KVM is built-in. > > With the AMD IOMMU driver patch applied which moves SNP enable check > before enabling IOMMUs, snp_rmptable_init() can now be called early > with subsys_initcall() which enables SNP host support before KVM > initialization with kvm_amd module built-in. > > Fixes: c3b86e61b756 ("x86/cpufeatures: Enable/unmask SEV-SNP CPU feature") > Signed-off-by: Ashish Kalra <ashish.kalra@amd.com> > --- > arch/x86/virt/svm/sev.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c > index 1dcc027ec77e..d5dc4889c445 100644 > --- a/arch/x86/virt/svm/sev.c > +++ b/arch/x86/virt/svm/sev.c > @@ -571,7 +571,7 @@ static int __init snp_rmptable_init(void) > /* > * This must be called after the IOMMU has been initialized. This comment is slightly stale now. Maybe modify it to indicate that the IOMMU SNP check must have been done. Thanks, Tom > */ > -device_initcall(snp_rmptable_init); > +subsys_initcall(snp_rmptable_init); > > static void set_rmp_segment_info(unsigned int segment_shift) > {
diff --git a/arch/x86/virt/svm/sev.c b/arch/x86/virt/svm/sev.c index 1dcc027ec77e..d5dc4889c445 100644 --- a/arch/x86/virt/svm/sev.c +++ b/arch/x86/virt/svm/sev.c @@ -571,7 +571,7 @@ static int __init snp_rmptable_init(void) /* * This must be called after the IOMMU has been initialized. */ -device_initcall(snp_rmptable_init); +subsys_initcall(snp_rmptable_init); static void set_rmp_segment_info(unsigned int segment_shift) {