Message ID | 20221027083831.2985-1-santosh.shukla@amd.com (mailing list archive) |
---|---|
Headers | show |
Series | Virtual NMI feature | expand |
On 10/27/2022 2:08 PM, Santosh Shukla wrote: > VNMI Spec is at [1]. > > Change History: > > v5 (6.1-rc2) > 01,02,06 - Renamed s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI (Jim Mattson) > Gentle reminder. Thanks, Santosh
On Mon, 2022-11-14 at 13:32 +0530, Santosh Shukla wrote: > > > On 10/27/2022 2:08 PM, Santosh Shukla wrote: > > VNMI Spec is at [1]. > > > > Change History: > > > > v5 (6.1-rc2) > > 01,02,06 - Renamed s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI (Jim Mattson) > > > > Gentle reminder. > > Thanks, > Santosh > I started reviewing it today and I think there are still few issues, and the biggest one is that if a NMI arrives while vNMI injection is pending, current code just drops such NMI. We had a discussion about this, like forcing immeditate vm exit in this case and such but I have a simplier idea: In this case we can just open the NMI window in the good old way by intercepting IRET, STGI, and or RSM (which is intercepted anyway), and only if we already *just* intercepted IRET, only then just drop the new NMI instead of single stepping over it based on reasoning that its 3rd NMI (one is almost done the servicing (its IRET is executing), one is pending injection, and we want to inject another one. Does this sound good to you? It won't work for SEV-ES as it looks like it doesn't intercept IRET, but it might be a reasonable tradeof for SEV-ES guests to accept that we can't inject a NMI if one is already pending injection. Best regards, Maxim Levitsky
Hi Maxim, On 11/14/2022 8:01 PM, Maxim Levitsky wrote: > On Mon, 2022-11-14 at 13:32 +0530, Santosh Shukla wrote: >> >> >> On 10/27/2022 2:08 PM, Santosh Shukla wrote: >>> VNMI Spec is at [1]. >>> >>> Change History: >>> >>> v5 (6.1-rc2) >>> 01,02,06 - Renamed s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI (Jim Mattson) >>> >> >> Gentle reminder. >> >> Thanks, >> Santosh >> > > I started reviewing it today and I think there are still few issues, > and the biggest one is that if a NMI arrives while vNMI injection > is pending, current code just drops such NMI. > > We had a discussion about this, like forcing immeditate vm exit I believe, We discussed above case in [1] i.e.. HW can handle the second (/pending)virtual NMI while the guest processing first virtual NMI w/o vmexit. is it same scenario or different one that you are mentioning? [1] https://lore.kernel.org/lkml/1782cdbb-8274-8c3d-fa98-29147f1e5d1e@amd.com/ Thanks, Santosh > in this case and such but I have a simplier idea: > > In this case we can just open the NMI window in the good old way > by intercepting IRET, STGI, and or RSM (which is intercepted anyway), > > and only if we already *just* intercepted IRET, only then just drop > the new NMI instead of single stepping over it based on reasoning that > its 3rd NMI (one is almost done the servicing (its IRET is executing), > one is pending injection, and we want to inject another one. > > Does this sound good to you? It won't work for SEV-ES as it looks > like it doesn't intercept IRET, but it might be a reasonable tradeof > for SEV-ES guests to accept that we can't inject a NMI if one is > already pending injection. > > Best regards, > Maxim Levitsky >
On Wed, 2022-11-16 at 11:10 +0530, Santosh Shukla wrote: > Hi Maxim, > > On 11/14/2022 8:01 PM, Maxim Levitsky wrote: > > On Mon, 2022-11-14 at 13:32 +0530, Santosh Shukla wrote: > > > > > > > > > On 10/27/2022 2:08 PM, Santosh Shukla wrote: > > > > VNMI Spec is at [1]. > > > > > > > > Change History: > > > > > > > > v5 (6.1-rc2) > > > > 01,02,06 - Renamed s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI (Jim Mattson) > > > > > > > > > > Gentle reminder. > > > > > > Thanks, > > > Santosh > > > > > > > I started reviewing it today and I think there are still few issues, > > and the biggest one is that if a NMI arrives while vNMI injection > > is pending, current code just drops such NMI. > > > > We had a discussion about this, like forcing immeditate vm exit > > I believe, We discussed above case in [1] i.e.. HW can handle > the second (/pending)virtual NMI while the guest processing first virtual NMI w/o vmexit. > is it same scenario or different one that you are mentioning? > > [1] https://lore.kernel.org/lkml/1782cdbb-8274-8c3d-fa98-29147f1e5d1e@amd.com/ You misunderstood the issue. Hardware can handle the case when a NMI is in service (that is V_NMI_MASK is set) and another one is injected (V_NMI_PENDING can be set), but it is not possible to handle the case when a NMI is already injected (V_NMI_PENDING set) but and KVM wants to inject another one before the first one went into the service (that is V_NMI_MASK is not set yet). Also same can happen when NMIs are blocked in SMM, since V_NMI_MASK is set despite no NMI in service, we will be able to inject only one NMI by setting the V_NMI_PENDING. I think I was able to solve all these issues and I will today post a modified patch series of yours, which should cover all these cases and have some nice refactoring as well. Best regards, Maxim Levitsky > > Thanks, > Santosh > > > in this case and such but I have a simplier idea: > > > > In this case we can just open the NMI window in the good old way > > by intercepting IRET, STGI, and or RSM (which is intercepted anyway), > > > > and only if we already *just* intercepted IRET, only then just drop > > the new NMI instead of single stepping over it based on reasoning that > > its 3rd NMI (one is almost done the servicing (its IRET is executing), > > one is pending injection, and we want to inject another one. > > > > Does this sound good to you? It won't work for SEV-ES as it looks > > like it doesn't intercept IRET, but it might be a reasonable tradeof > > for SEV-ES guests to accept that we can't inject a NMI if one is > > already pending injection. > > > > Best regards, > > Maxim Levitsky > > >
Hello Maxim,. On 11/16/2022 2:51 PM, Maxim Levitsky wrote: > On Wed, 2022-11-16 at 11:10 +0530, Santosh Shukla wrote: >> Hi Maxim, >> >> On 11/14/2022 8:01 PM, Maxim Levitsky wrote: >>> On Mon, 2022-11-14 at 13:32 +0530, Santosh Shukla wrote: >>>> >>>> >>>> On 10/27/2022 2:08 PM, Santosh Shukla wrote: >>>>> VNMI Spec is at [1]. >>>>> >>>>> Change History: >>>>> >>>>> v5 (6.1-rc2) >>>>> 01,02,06 - Renamed s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI (Jim Mattson) >>>>> >>>> >>>> Gentle reminder. >>>> >>>> Thanks, >>>> Santosh >>>> >>> >>> I started reviewing it today and I think there are still few issues, >>> and the biggest one is that if a NMI arrives while vNMI injection >>> is pending, current code just drops such NMI. >>> >>> We had a discussion about this, like forcing immeditate vm exit >> >> I believe, We discussed above case in [1] i.e.. HW can handle >> the second (/pending)virtual NMI while the guest processing first virtual NMI w/o vmexit. >> is it same scenario or different one that you are mentioning? >> >> [1] https://lore.kernel.org/lkml/1782cdbb-8274-8c3d-fa98-29147f1e5d1e@amd.com/>> > You misunderstood the issue. > > Hardware can handle the case when a NMI is in service (that is V_NMI_MASK is set) and another one is injected > (V_NMI_PENDING can be set), > > but it is not possible to handle the case when a NMI is already injected (V_NMI_PENDING set) but > and KVM wants to inject another one before the first one went into the service (that is V_NMI_MASK is not set > yet). > In this case, HW will collapse the NMI. Note that the HW will take the pending NMI at the boundary of IRET instruction such that it will check for the V_NMI_PENDING and if its set then HW will *take* the NMI, HW will clear the V_NMI_PENDING bit and set the V_NMI_MASK w/o the VMEXIT!,. > Also same can happen when NMIs are blocked in SMM, since V_NMI_MASK is set despite no NMI in service, > we will be able to inject only one NMI by setting the V_NMI_PENDING. > Ditto,. HW will collapse the NMI. Thanks, Santosh > I think I was able to solve all these issues and I will today post a modified patch series of yours, > which should cover all these cases and have some nice refactoring as well. > > > Best regards, > Maxim Levitsky > > >> >> Thanks, >> Santosh >> >>> in this case and such but I have a simplier idea: >>> >>> In this case we can just open the NMI window in the good old way >>> by intercepting IRET, STGI, and or RSM (which is intercepted anyway), >>> >>> and only if we already *just* intercepted IRET, only then just drop >>> the new NMI instead of single stepping over it based on reasoning that >>> its 3rd NMI (one is almost done the servicing (its IRET is executing), >>> one is pending injection, and we want to inject another one. >>> >>> Does this sound good to you? It won't work for SEV-ES as it looks >>> like it doesn't intercept IRET, but it might be a reasonable tradeof >>> for SEV-ES guests to accept that we can't inject a NMI if one is >>> already pending injection. >>> >>> Best regards, >>> Maxim Levitsky >>> >> > >
On 11/16/22 09:44, Santosh Shukla wrote: > Hello Maxim,. > > On 11/16/2022 2:51 PM, Maxim Levitsky wrote: >> On Wed, 2022-11-16 at 11:10 +0530, Santosh Shukla wrote: >>> Hi Maxim, >>> >>> On 11/14/2022 8:01 PM, Maxim Levitsky wrote: >>>> On Mon, 2022-11-14 at 13:32 +0530, Santosh Shukla wrote: >>>>> >>>>> >>>>> On 10/27/2022 2:08 PM, Santosh Shukla wrote: >>>>>> VNMI Spec is at [1]. >>>>>> >>>>>> Change History: >>>>>> >>>>>> v5 (6.1-rc2) >>>>>> 01,02,06 - Renamed s/X86_FEATURE_V_NMI/X86_FEATURE_AMD_VNMI (Jim Mattson) >>>>>> >>>>> >>>>> Gentle reminder. >>>>> >>>>> Thanks, >>>>> Santosh >>>>> >>>> >>>> I started reviewing it today and I think there are still few issues, >>>> and the biggest one is that if a NMI arrives while vNMI injection >>>> is pending, current code just drops such NMI. >>>> >>>> We had a discussion about this, like forcing immeditate vm exit >>> >>> I believe, We discussed above case in [1] i.e.. HW can handle >>> the second (/pending)virtual NMI while the guest processing first virtual NMI w/o vmexit. >>> is it same scenario or different one that you are mentioning? >>> >>> [1] https://lore.kernel.org/lkml/1782cdbb-8274-8c3d-fa98-29147f1e5d1e@amd.com/>> >> You misunderstood the issue. >> >> Hardware can handle the case when a NMI is in service (that is V_NMI_MASK is set) and another one is injected >> (V_NMI_PENDING can be set), >> >> but it is not possible to handle the case when a NMI is already injected (V_NMI_PENDING set) but >> and KVM wants to inject another one before the first one went into the service (that is V_NMI_MASK is not set >> yet). >> > > In this case, HW will collapse the NMI. > > Note that the HW will take the pending NMI at the boundary of IRET instruction such that > it will check for the V_NMI_PENDING and if its set then HW will *take* the NMI, > HW will clear the V_NMI_PENDING bit and set the V_NMI_MASK w/o the VMEXIT!,. > > >> Also same can happen when NMIs are blocked in SMM, since V_NMI_MASK is set despite no NMI in service, >> we will be able to inject only one NMI by setting the V_NMI_PENDING. >> > > Ditto,. HW will collapse the NMI. Note, this is how bare-metal NMIs are also handled. Multiple NMIs are collapsed into a single NMI if they are received while an NMI is currently being processed. Thanks, Tom > > Thanks, > Santosh > >> I think I was able to solve all these issues and I will today post a modified patch series of yours, >> which should cover all these cases and have some nice refactoring as well. >> >> >> Best regards, >> Maxim Levitsky >> >> >>> >>> Thanks, >>> Santosh >>> >>>> in this case and such but I have a simplier idea: >>>> >>>> In this case we can just open the NMI window in the good old way >>>> by intercepting IRET, STGI, and or RSM (which is intercepted anyway), >>>> >>>> and only if we already *just* intercepted IRET, only then just drop >>>> the new NMI instead of single stepping over it based on reasoning that >>>> its 3rd NMI (one is almost done the servicing (its IRET is executing), >>>> one is pending injection, and we want to inject another one. >>>> >>>> Does this sound good to you? It won't work for SEV-ES as it looks >>>> like it doesn't intercept IRET, but it might be a reasonable tradeof >>>> for SEV-ES guests to accept that we can't inject a NMI if one is >>>> already pending injection. >>>> >>>> Best regards, >>>> Maxim Levitsky >>>> >>> >> >> >
On Wed, Nov 16, 2022, Tom Lendacky wrote: > On 11/16/22 09:44, Santosh Shukla wrote: > > Hello Maxim,. > > > > On 11/16/2022 2:51 PM, Maxim Levitsky wrote: > > > On Wed, 2022-11-16 at 11:10 +0530, Santosh Shukla wrote: > > > > Hi Maxim, > > > > > > > > On 11/14/2022 8:01 PM, Maxim Levitsky wrote: > > > > > On Mon, 2022-11-14 at 13:32 +0530, Santosh Shukla wrote: > > > > > I started reviewing it today and I think there are still few issues, > > > > > and the biggest one is that if a NMI arrives while vNMI injection > > > > > is pending, current code just drops such NMI. I don't think it gets dropped, just improperly delayed. > > > > > We had a discussion about this, like forcing immeditate vm exit > > > > > > > > I believe, We discussed above case in [1] i.e.. HW can handle > > > > the second (/pending)virtual NMI while the guest processing first virtual NMI w/o vmexit. > > > > is it same scenario or different one that you are mentioning? > > > > > > > > [1] https://lore.kernel.org/lkml/1782cdbb-8274-8c3d-fa98-29147f1e5d1e@amd.com/>> > > > You misunderstood the issue. > > > > > > Hardware can handle the case when a NMI is in service (that is V_NMI_MASK > > > is set) and another one is injected (V_NMI_PENDING can be set), > > > > > > but it is not possible to handle the case when a NMI is already injected > > > (V_NMI_PENDING set) but and KVM wants to inject another one before the > > > first one went into the service (that is V_NMI_MASK is not set yet). > > > > In this case, HW will collapse the NMI. Yes, but practically speaking two NMIs can't arrive at the exact same instance on bare metal. One NMI will always arrive first and get vectored, and then the second NMI will arrive and be pended. In a virtual environment, two NMIs that are sent far apart can arrive together, e.g. if the vCPU is scheduled out for an extended period of time. KVM mitigates this side effect by allowing two NMIs to be pending. The problem here isn't that second the NMI is lost, it's that KVM can't get control when the first NMI completes (IRETs). KVM can pend both NMIs and queue the first for injection/vectoring (set V_NMI_PENDING), but AIUI there is no mechanism (and no code) to force a VM-Exit on the IRET so that KVM can inject the second NMI. Santosh's response in v2[*] suggested that hardware would allow KVM to "post" an NMI while the vCPU is running, but I don't see any code in this series to actually do that. svm_inject_nmi() is only called from the vCPU's run loop, i.e. requires a VM-Exit. [*] https://lore.kernel.org/all/1782cdbb-8274-8c3d-fa98-29147f1e5d1e@amd.com > > Note that the HW will take the pending NMI at the boundary of IRET instruction such that > > it will check for the V_NMI_PENDING and if its set then HW will *take* the NMI, > > HW will clear the V_NMI_PENDING bit and set the V_NMI_MASK w/o the VMEXIT!,. > > > > > > > Also same can happen when NMIs are blocked in SMM, since V_NMI_MASK is > > > set despite no NMI in service, we will be able to inject only one NMI by > > > setting the V_NMI_PENDING. I believe this one is a non-issue. Like bare metal, KVM only allows one NMI to be pending if NMIs are blocked. static void process_nmi(struct kvm_vcpu *vcpu) { unsigned limit = 2; /* * x86 is limited to one NMI running, and one NMI pending after it. * If an NMI is already in progress, limit further NMIs to just one. * Otherwise, allow two (and we'll inject the first one immediately). */ if (static_call(kvm_x86_get_nmi_mask)(vcpu) || vcpu->arch.nmi_injected) limit = 1; vcpu->arch.nmi_pending += atomic_xchg(&vcpu->arch.nmi_queued, 0); vcpu->arch.nmi_pending = min(vcpu->arch.nmi_pending, limit); kvm_make_request(KVM_REQ_EVENT, vcpu); } > > Ditto,. HW will collapse the NMI. > > Note, this is how bare-metal NMIs are also handled. Multiple NMIs are > collapsed into a single NMI if they are received while an NMI is currently > being processed.