Message ID | 20241001050110.3643764-7-xin@zytor.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Enable FRED with KVM VMX | expand |
On 9/30/24 22:00, Xin Li (Intel) wrote: > The per CPU variable cea_exception_stacks contains per CPU stacks for > NMI, #DB and #DF, which is referenced in KVM to set host FRED RSP[123] > each time a vCPU is loaded onto a CPU, thus it needs to be exported. Nit: It's not obvious how 'cea_exception_stacks' get used in this series. It's never referenced explicitly. I did figure it out by looking for 'RSP[123]' references, but a much better changelog would be something like: The per CPU array 'cea_exception_stacks' points to per CPU stacks for NMI, #DB and #DF. It is normally referenced via the #define: __this_cpu_ist_top_va(). FRED introduced new fields in the host-state area of the VMCS for stack levels 1->3 (HOST_IA32_FRED_RSP[123]). KVM must populate these each time a vCPU is loaded onto a CPU. See how that explicitly gives the reader greppable strings for "__this_cpu_ist_top_va" and "HOST_IA32_FRED_RSP"? That makes it much easier to figure out what is going on. I was also momentarily confused about why these loads need to be done on _every_ vCPU load. I think it's because the host state can change as the vCPU moves around to different physical CPUs and __this_cpu_ist_top_va() can and will change. But it's a detail that I think deserves to be explained in the changelog. There is also this note in vmx_vcpu_load_vmcs(): > /* > * Linux uses per-cpu TSS and GDT, so set these when switching > * processors. See 22.2.4. > */ which makes me think that it might not be bad to pull *all* of the per-cpu VMCS field population code out into a helper since the reasoning of why these need to be repopulated is identical. Also, what's the purpose of clearing GUEST_IA32_FRED_RSP[123] at init_vmcs() time? I would have thought that those values wouldn't matter until the VMCS gets loaded at vmx_vcpu_load_vmcs() when they are overwritten anyway. Or, I could be just totally misunderstanding how KVM consumes the VMCS. :)
On 10/1/2024 9:12 AM, Dave Hansen wrote: > On 9/30/24 22:00, Xin Li (Intel) wrote: >> The per CPU variable cea_exception_stacks contains per CPU stacks for >> NMI, #DB and #DF, which is referenced in KVM to set host FRED RSP[123] >> each time a vCPU is loaded onto a CPU, thus it needs to be exported. > > Nit: It's not obvious how 'cea_exception_stacks' get used in this > series. It's never referenced explicitly. > > I did figure it out by looking for 'RSP[123]' references, but a much > better changelog would be something like: > > The per CPU array 'cea_exception_stacks' points to per CPU > stacks for NMI, #DB and #DF. It is normally referenced via the > #define: __this_cpu_ist_top_va(). > > FRED introduced new fields in the host-state area of the VMCS > for stack levels 1->3 (HOST_IA32_FRED_RSP[123]). KVM must > populate these each time a vCPU is loaded onto a CPU. > Yeah, this is way clearer. > See how that explicitly gives the reader greppable strings for > "__this_cpu_ist_top_va" and "HOST_IA32_FRED_RSP"? That makes it much > easier to figure out what is going on. Nice for a maintainer in 20 years :) > > I was also momentarily confused about why these loads need to be done on > _every_ vCPU load. I think it's because the host state can change as > the vCPU moves around to different physical CPUs and > __this_cpu_ist_top_va() can and will change. But it's a detail that I > think deserves to be explained in the changelog. There is also this > note in vmx_vcpu_load_vmcs(): Makes sense to me. > >> /* >> * Linux uses per-cpu TSS and GDT, so set these when switching >> * processors. See 22.2.4. >> */ > > which makes me think that it might not be bad to pull *all* of the > per-cpu VMCS field population code out into a helper since the reasoning > of why these need to be repopulated is identical. > > Also, what's the purpose of clearing GUEST_IA32_FRED_RSP[123] at > init_vmcs() time? I would have thought that those values wouldn't > matter until the VMCS gets loaded at vmx_vcpu_load_vmcs() when they are > overwritten anyway. Or, I could be just totally misunderstanding how > KVM consumes the VMCS. :) I don't see any misunderstanding. However we just do what the SDM claims, even it seems that it's not a must *logically*. FRED spec says: The RESET state of each of the new MSRs is zero. INIT does not change the value of the new MSRs Thanks! Xin
On 10/1/24 10:51, Xin Li wrote: ...>> Also, what's the purpose of clearing GUEST_IA32_FRED_RSP[123] at >> init_vmcs() time? I would have thought that those values wouldn't >> matter until the VMCS gets loaded at vmx_vcpu_load_vmcs() when they are >> overwritten anyway. Or, I could be just totally misunderstanding how >> KVM consumes the VMCS.
diff --git a/arch/x86/mm/cpu_entry_area.c b/arch/x86/mm/cpu_entry_area.c index 575f863f3c75..b8af71b67d9a 100644 --- a/arch/x86/mm/cpu_entry_area.c +++ b/arch/x86/mm/cpu_entry_area.c @@ -17,6 +17,7 @@ static DEFINE_PER_CPU_PAGE_ALIGNED(struct entry_stack_page, entry_stack_storage) #ifdef CONFIG_X86_64 static DEFINE_PER_CPU_PAGE_ALIGNED(struct exception_stacks, exception_stacks); DEFINE_PER_CPU(struct cea_exception_stacks*, cea_exception_stacks); +EXPORT_PER_CPU_SYMBOL(cea_exception_stacks); static DEFINE_PER_CPU_READ_MOSTLY(unsigned long, _cea_offset);