Message ID | 20240608000639.3295768-5-seanjc@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: Register cpuhp/syscore callbacks when enabling virt | expand |
> +static void kvm_uninit_virtualization(void) > +{ > + if (enable_virt_at_load) > + kvm_disable_virtualization(); > + > + WARN_ON(kvm_usage_count); > +} > Hi Sean, The above "WARN_ON(kvm_usage_count);" assumes the kvm_uninit_virtualization() is the last call of kvm_disable_virtualization(), and it is called ... > @@ -6433,6 +6468,8 @@ void kvm_exit(void) > */ > misc_deregister(&kvm_dev); > > + kvm_uninit_virtualization(); > + > ... from kvm_exit(). Accordingly, kvm_init_virtualization() is called in kvm_init(). For TDX, we want to "explicitly call kvm_enable_virtualization() + initializing TDX module" before kvm_init() in vt_init(), since kvm_init() is supposed to be the last step after initializing TDX. In the exit path, accordingly, for TDX we want to call kvm_exit() first, and then "do TDX cleanup staff + explicitly call kvm_disable_virtualizaation()". This will trigger the above "WARN_ON(kvm_usage_count);" when enable_virt_at_load is true, because kvm_uninit_virtualization() isn't the last call of kvm_disable_virtualization(). To resolve, I think one way is we can move kvm_init_virtualization() out of kvm_init(), but I am not sure whether there's another common place that kvm_init_virtualization() can be called for all ARCHs. Do you have any comments?
On Fri, 2024-08-02 at 12:02 +0000, Huang, Kai wrote: > This will trigger the above "WARN_ON(kvm_usage_count);" when > enable_virt_at_load is true, because kvm_uninit_virtualization() isn't > the last call of kvm_disable_virtualization(). Correct: the WARN_ON() will be triggered regardless of enable_virt_at_load is true or false, if kvm_uninit_virtualization() isn't the last call of kvm_disable_virtualization().
On Fri, Aug 02, 2024, Kai Huang wrote: > > > +static void kvm_uninit_virtualization(void) > > +{ > > + if (enable_virt_at_load) > > + kvm_disable_virtualization(); > > + > > + WARN_ON(kvm_usage_count); > > +} > > > > Hi Sean, > > The above "WARN_ON(kvm_usage_count);" assumes the > kvm_uninit_virtualization() is the last call of > kvm_disable_virtualization(), and it is called ... > > > @@ -6433,6 +6468,8 @@ void kvm_exit(void) > > */ > > misc_deregister(&kvm_dev); > > > > + kvm_uninit_virtualization(); > > + > > > > ... from kvm_exit(). > > Accordingly, kvm_init_virtualization() is called in kvm_init(). > > For TDX, we want to "explicitly call kvm_enable_virtualization() + > initializing TDX module" before kvm_init() in vt_init(), since kvm_init() > is supposed to be the last step after initializing TDX. > > In the exit path, accordingly, for TDX we want to call kvm_exit() first, > and then "do TDX cleanup staff + explicitly call > kvm_disable_virtualizaation()". > > This will trigger the above "WARN_ON(kvm_usage_count);" when > enable_virt_at_load is true, because kvm_uninit_virtualization() isn't > the last call of kvm_disable_virtualization(). > > To resolve, I think one way is we can move kvm_init_virtualization() out > of kvm_init(), but I am not sure whether there's another common place > that kvm_init_virtualization() can be called for all ARCHs. > > Do you have any comments? Drat. That's my main coment, though not the exact word I used :-) I managed to completely forget about TDX needing to enable virtualization to do its setup before creating /dev/kvm. A few options jump to mind: 1. Expose kvm_enable_virtualization() to arch code and delete the WARN_ON(). 2. Move kvm_init_virtualization() as you suggested. 3. Move the call to misc_register() out of kvm_init(), so that arch code can do additional setup between kvm_init() and kvm_register_dev_kvm() or whatever. I'm leaning towards #1. IIRC, that was my original intent before going down the "enable virtualization at module load" path. And it's not mutually exclusive with allowing virtualization to be forced on at module load. If #1 isn't a good option for whatever reason, I'd lean slightly for #3 over #2, purely because it's less arbitrary (registering /dev/kvm is the only thing that has strict ordering requirements). But I don't know that having a separate registration API would be a net positive, e.g. it's kinda nice that kvm_init() needs to be last, because it helps ensure some amount of guaranteed ordering between common KVM and arch code.
On Mon, 2024-08-12 at 19:31 -0700, Sean Christopherson wrote: > On Fri, Aug 02, 2024, Kai Huang wrote: > > > > > +static void kvm_uninit_virtualization(void) > > > +{ > > > + if (enable_virt_at_load) > > > + kvm_disable_virtualization(); > > > + > > > + WARN_ON(kvm_usage_count); > > > +} > > > > > > > Hi Sean, > > > > The above "WARN_ON(kvm_usage_count);" assumes the > > kvm_uninit_virtualization() is the last call of > > kvm_disable_virtualization(), and it is called ... > > > > > @@ -6433,6 +6468,8 @@ void kvm_exit(void) > > > */ > > > misc_deregister(&kvm_dev); > > > > > > + kvm_uninit_virtualization(); > > > + > > > > > > > ... from kvm_exit(). > > > > Accordingly, kvm_init_virtualization() is called in kvm_init(). > > > > For TDX, we want to "explicitly call kvm_enable_virtualization() + > > initializing TDX module" before kvm_init() in vt_init(), since kvm_init() > > is supposed to be the last step after initializing TDX. > > > > In the exit path, accordingly, for TDX we want to call kvm_exit() first, > > and then "do TDX cleanup staff + explicitly call > > kvm_disable_virtualizaation()". > > > > This will trigger the above "WARN_ON(kvm_usage_count);" when > > enable_virt_at_load is true, because kvm_uninit_virtualization() isn't > > the last call of kvm_disable_virtualization(). > > > > To resolve, I think one way is we can move kvm_init_virtualization() out > > of kvm_init(), but I am not sure whether there's another common place > > that kvm_init_virtualization() can be called for all ARCHs. > > > > Do you have any comments? > > Drat. That's my main coment, though not the exact word I used :-) > > I managed to completely forget about TDX needing to enable virtualization to do > its setup before creating /dev/kvm. A few options jump to mind: > > 1. Expose kvm_enable_virtualization() to arch code and delete the WARN_ON(). > > 2. Move kvm_init_virtualization() as you suggested. > > 3. Move the call to misc_register() out of kvm_init(), so that arch code can > do additional setup between kvm_init() and kvm_register_dev_kvm() or whatever. > > I'm leaning towards #1. IIRC, that was my original intent before going down the > "enable virtualization at module load" path. And it's not mutually exclusive > with allowing virtualization to be forced on at module load. > > If #1 isn't a good option for whatever reason, I'd lean slightly for #3 over #2, > purely because it's less arbitrary (registering /dev/kvm is the only thing that > has strict ordering requirements). But I don't know that having a separate > registration API would be a net positive, e.g. it's kinda nice that kvm_init() > needs to be last, because it helps ensure some amount of guaranteed ordering > between common KVM and arch code. I agree with option 1). We just allow arch code to do additional kvm_enable_virtualization() before kvm_init() and kvm_disable_virtualization() after kvm_exit(). I think it's kinda normal behaviour anyway. And this is exactly what I am doing :-) https://github.com/intel/tdx/commit/2f7cef685527a5ef952346ff5ab9adbb8bb6f371 https://github.com/intel/tdx/commit/6c76ffa47a98ca370fad389271dc3cedf304df2d
On 13/08/2024 5:22 pm, Huang, Kai wrote: > On Mon, 2024-08-12 at 19:31 -0700, Sean Christopherson wrote: >> On Fri, Aug 02, 2024, Kai Huang wrote: >>> >>>> +static void kvm_uninit_virtualization(void) >>>> +{ >>>> + if (enable_virt_at_load) >>>> + kvm_disable_virtualization(); >>>> + >>>> + WARN_ON(kvm_usage_count); >>>> +} >>>> >>> >>> Hi Sean, >>> >>> The above "WARN_ON(kvm_usage_count);" assumes the >>> kvm_uninit_virtualization() is the last call of >>> kvm_disable_virtualization(), and it is called ... >>> >>>> @@ -6433,6 +6468,8 @@ void kvm_exit(void) >>>> */ >>>> misc_deregister(&kvm_dev); >>>> >>>> + kvm_uninit_virtualization(); >>>> + >>>> >>> >>> ... from kvm_exit(). >>> >>> Accordingly, kvm_init_virtualization() is called in kvm_init(). >>> >>> For TDX, we want to "explicitly call kvm_enable_virtualization() + >>> initializing TDX module" before kvm_init() in vt_init(), since kvm_init() >>> is supposed to be the last step after initializing TDX. >>> >>> In the exit path, accordingly, for TDX we want to call kvm_exit() first, >>> and then "do TDX cleanup staff + explicitly call >>> kvm_disable_virtualizaation()". >>> >>> This will trigger the above "WARN_ON(kvm_usage_count);" when >>> enable_virt_at_load is true, because kvm_uninit_virtualization() isn't >>> the last call of kvm_disable_virtualization(). >>> >>> To resolve, I think one way is we can move kvm_init_virtualization() out >>> of kvm_init(), but I am not sure whether there's another common place >>> that kvm_init_virtualization() can be called for all ARCHs. >>> >>> Do you have any comments? >> >> Drat. That's my main coment, though not the exact word I used :-) >> >> I managed to completely forget about TDX needing to enable virtualization to do >> its setup before creating /dev/kvm. A few options jump to mind: >> >> 1. Expose kvm_enable_virtualization() to arch code and delete the WARN_ON(). >> >> 2. Move kvm_init_virtualization() as you suggested. >> >> 3. Move the call to misc_register() out of kvm_init(), so that arch code can >> do additional setup between kvm_init() and kvm_register_dev_kvm() or whatever. >> >> I'm leaning towards #1. IIRC, that was my original intent before going down the >> "enable virtualization at module load" path. And it's not mutually exclusive >> with allowing virtualization to be forced on at module load. >> >> If #1 isn't a good option for whatever reason, I'd lean slightly for #3 over #2, >> purely because it's less arbitrary (registering /dev/kvm is the only thing that >> has strict ordering requirements). But I don't know that having a separate >> registration API would be a net positive, e.g. it's kinda nice that kvm_init() >> needs to be last, because it helps ensure some amount of guaranteed ordering >> between common KVM and arch code. > > I agree with option 1). We just allow arch code to do additional > kvm_enable_virtualization() before kvm_init() and kvm_disable_virtualization() > after kvm_exit(). I think it's kinda normal behaviour anyway. > > And this is exactly what I am doing :-) > > https://github.com/intel/tdx/commit/2f7cef685527a5ef952346ff5ab9adbb8bb6f371 > https://github.com/intel/tdx/commit/6c76ffa47a98ca370fad389271dc3cedf304df2d > Hi Sean, Forgot to ask: I assume you will post v4 with that WARN_ON() in kvm_uninit_virtualizaiton() removed, correct? I am not sure whether you will include the patch to export kvm_enable_virtualization() and kvm_disable_virtualization() but either way is fine to me. I am thinking if we can get this patchset merged to kvm-coco-queue then we can actually start to review (and try to integrate) the "init TDX during KVM loading" patches. I appreciate if you and Paolo could share some plan on this. Thanks!
On 6/8/24 02:06, Sean Christopherson wrote: > Add an off-by-default module param, enable_virt_at_load, to let userspace > force virtualization to be enabled in hardware when KVM is initialized, > i.e. just before /dev/kvm is exposed to userspace. Enabling virtualization > during KVM initialization allows userspace to avoid the additional latency > when creating/destroying the first/last VM. Now that KVM uses the cpuhp > framework to do per-CPU enabling, the latency could be non-trivial as the > cpuhup bringup/teardown is serialized across CPUs, e.g. the latency could > be problematic for use case that need to spin up VMs quickly. > > Enabling virtualizaton during initialization will also allow KVM to setup > the Intel TDX Module, which requires VMX to be fully enabled, without > needing additional APIs to temporarily enable virtualization. > > Signed-off-by: Sean Christopherson <seanjc@google.com> I think we should enable it by default and wait for someone to complain. Or notice, even. Paolo > --- > virt/kvm/kvm_main.c | 37 +++++++++++++++++++++++++++++++++++++ > 1 file changed, 37 insertions(+) > > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c > index 98e52d12f137..7bdd744e4821 100644 > --- a/virt/kvm/kvm_main.c > +++ b/virt/kvm/kvm_main.c > @@ -5495,6 +5495,9 @@ static struct miscdevice kvm_dev = { > }; > > #ifdef CONFIG_KVM_GENERIC_HARDWARE_ENABLING > +static bool enable_virt_at_load; > +module_param(enable_virt_at_load, bool, 0444); > + > __visible bool kvm_rebooting; > EXPORT_SYMBOL_GPL(kvm_rebooting); > > @@ -5645,15 +5648,41 @@ static void kvm_disable_virtualization(void) > unregister_syscore_ops(&kvm_syscore_ops); > cpuhp_remove_state(CPUHP_AP_KVM_ONLINE); > } > + > +static int kvm_init_virtualization(void) > +{ > + if (enable_virt_at_load) > + return kvm_enable_virtualization(); > + > + return 0; > +} > + > +static void kvm_uninit_virtualization(void) > +{ > + if (enable_virt_at_load) > + kvm_disable_virtualization(); > + > + WARN_ON(kvm_usage_count); > +} > #else /* CONFIG_KVM_GENERIC_HARDWARE_ENABLING */ > static int kvm_enable_virtualization(void) > { > return 0; > } > > +static int kvm_init_virtualization(void) > +{ > + return 0; > +} > + > static void kvm_disable_virtualization(void) > { > > +} > + > +static void kvm_uninit_virtualization(void) > +{ > + > } > #endif /* CONFIG_KVM_GENERIC_HARDWARE_ENABLING */ > > @@ -6395,6 +6424,10 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module) > > kvm_gmem_init(module); > > + r = kvm_init_virtualization(); > + if (r) > + goto err_virt; > + > /* > * Registration _must_ be the very last thing done, as this exposes > * /dev/kvm to userspace, i.e. all infrastructure must be setup! > @@ -6408,6 +6441,8 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module) > return 0; > > err_register: > + kvm_uninit_virtualization(); > +err_virt: > kvm_vfio_ops_exit(); > err_vfio: > kvm_async_pf_deinit(); > @@ -6433,6 +6468,8 @@ void kvm_exit(void) > */ > misc_deregister(&kvm_dev); > > + kvm_uninit_virtualization(); > + > debugfs_remove_recursive(kvm_debugfs_dir); > for_each_possible_cpu(cpu) > free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 98e52d12f137..7bdd744e4821 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -5495,6 +5495,9 @@ static struct miscdevice kvm_dev = { }; #ifdef CONFIG_KVM_GENERIC_HARDWARE_ENABLING +static bool enable_virt_at_load; +module_param(enable_virt_at_load, bool, 0444); + __visible bool kvm_rebooting; EXPORT_SYMBOL_GPL(kvm_rebooting); @@ -5645,15 +5648,41 @@ static void kvm_disable_virtualization(void) unregister_syscore_ops(&kvm_syscore_ops); cpuhp_remove_state(CPUHP_AP_KVM_ONLINE); } + +static int kvm_init_virtualization(void) +{ + if (enable_virt_at_load) + return kvm_enable_virtualization(); + + return 0; +} + +static void kvm_uninit_virtualization(void) +{ + if (enable_virt_at_load) + kvm_disable_virtualization(); + + WARN_ON(kvm_usage_count); +} #else /* CONFIG_KVM_GENERIC_HARDWARE_ENABLING */ static int kvm_enable_virtualization(void) { return 0; } +static int kvm_init_virtualization(void) +{ + return 0; +} + static void kvm_disable_virtualization(void) { +} + +static void kvm_uninit_virtualization(void) +{ + } #endif /* CONFIG_KVM_GENERIC_HARDWARE_ENABLING */ @@ -6395,6 +6424,10 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module) kvm_gmem_init(module); + r = kvm_init_virtualization(); + if (r) + goto err_virt; + /* * Registration _must_ be the very last thing done, as this exposes * /dev/kvm to userspace, i.e. all infrastructure must be setup! @@ -6408,6 +6441,8 @@ int kvm_init(unsigned vcpu_size, unsigned vcpu_align, struct module *module) return 0; err_register: + kvm_uninit_virtualization(); +err_virt: kvm_vfio_ops_exit(); err_vfio: kvm_async_pf_deinit(); @@ -6433,6 +6468,8 @@ void kvm_exit(void) */ misc_deregister(&kvm_dev); + kvm_uninit_virtualization(); + debugfs_remove_recursive(kvm_debugfs_dir); for_each_possible_cpu(cpu) free_cpumask_var(per_cpu(cpu_kick_mask, cpu));
Add an off-by-default module param, enable_virt_at_load, to let userspace force virtualization to be enabled in hardware when KVM is initialized, i.e. just before /dev/kvm is exposed to userspace. Enabling virtualization during KVM initialization allows userspace to avoid the additional latency when creating/destroying the first/last VM. Now that KVM uses the cpuhp framework to do per-CPU enabling, the latency could be non-trivial as the cpuhup bringup/teardown is serialized across CPUs, e.g. the latency could be problematic for use case that need to spin up VMs quickly. Enabling virtualizaton during initialization will also allow KVM to setup the Intel TDX Module, which requires VMX to be fully enabled, without needing additional APIs to temporarily enable virtualization. Signed-off-by: Sean Christopherson <seanjc@google.com> --- virt/kvm/kvm_main.c | 37 +++++++++++++++++++++++++++++++++++++ 1 file changed, 37 insertions(+)