Message ID | 1453254177-103002-2-git-send-email-feng.wu@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 2016/1/20 9:42, Feng Wu wrote: > When the interrupt is not single destination any more, we need > to change back IRTE to remapped mode explicitly. > > Signed-off-by: Feng Wu <feng.wu@intel.com> > --- > arch/x86/kvm/vmx.c | 11 ++++++++++- > 1 file changed, 10 insertions(+), 1 deletion(-) > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > index e2951b6..13d14d4 100644 > --- a/arch/x86/kvm/vmx.c > +++ b/arch/x86/kvm/vmx.c > @@ -10764,8 +10764,17 @@ static int vmx_update_pi_irte(struct kvm *kvm, unsigned int host_irq, > */ > > kvm_set_msi_irq(e, &irq); > - if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) > + if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) { > + /* > + * Make sure the IRTE is in remapped mode if > + * we don't handle it in posted mode. > + */ > + pi_set_sn(vcpu_to_pi_desc(vcpu)); > + ret = irq_set_vcpu_affinity(host_irq, NULL); > + pi_clear_sn(vcpu_to_pi_desc(vcpu)); > + > continue; > + } > > vcpu_info.pi_desc_addr = __pa(vcpu_to_pi_desc(vcpu)); > vcpu_info.vector = irq.vector; > I am still feel weird with this change: according the semantic of VT-d posted interrupt, the interrupt will injected to guest through posted notification and /proc/interrupts shows the same meaning. But now, without being aware of user, the interrupt changes to legacy way and it appears on different entry on /proc/interrupts. It looks weird. Any comments? Paolo.
> -----Original Message----- > From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] > Sent: Thursday, January 21, 2016 11:06 AM > To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; > rkrcmar@redhat.com > Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org > Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the > interrupt is not single-destination > > On 2016/1/20 9:42, Feng Wu wrote: > > When the interrupt is not single destination any more, we need > > to change back IRTE to remapped mode explicitly. > > > > Signed-off-by: Feng Wu <feng.wu@intel.com> > > --- > > arch/x86/kvm/vmx.c | 11 ++++++++++- > > 1 file changed, 10 insertions(+), 1 deletion(-) > > > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > > index e2951b6..13d14d4 100644 > > --- a/arch/x86/kvm/vmx.c > > +++ b/arch/x86/kvm/vmx.c > > @@ -10764,8 +10764,17 @@ static int vmx_update_pi_irte(struct kvm > *kvm, unsigned int host_irq, > > */ > > > > kvm_set_msi_irq(e, &irq); > > - if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) > > + if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) { > > + /* > > + * Make sure the IRTE is in remapped mode if > > + * we don't handle it in posted mode. > > + */ > > + pi_set_sn(vcpu_to_pi_desc(vcpu)); > > + ret = irq_set_vcpu_affinity(host_irq, NULL); > > + pi_clear_sn(vcpu_to_pi_desc(vcpu)); > > + > > continue; > > + } > > > > vcpu_info.pi_desc_addr = __pa(vcpu_to_pi_desc(vcpu)); > > vcpu_info.vector = irq.vector; > > > > I am still feel weird with this change: according the semantic of VT-d > posted interrupt, the interrupt will injected to guest through posted > notification and /proc/interrupts shows the same meaning. But now, > without being aware of user, the interrupt changes to legacy way and it > appears on different entry on /proc/interrupts. It looks weird. I don't think it has problem here, IMO, this is exactly how it works. There should be different entry for the interrupts in VT-d PI mode and leagcy mode. For VT-d PI mode, it is delivered by notification event, for legacy mode, it is delivered by VFIO. Thanks, Feng > > Any comments? Paolo. > > -- > best regards > yang -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2016/1/21 11:14, Wu, Feng wrote: > > >> -----Original Message----- >> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] >> Sent: Thursday, January 21, 2016 11:06 AM >> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; >> rkrcmar@redhat.com >> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org >> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the >> interrupt is not single-destination >> >> On 2016/1/20 9:42, Feng Wu wrote: >>> When the interrupt is not single destination any more, we need >>> to change back IRTE to remapped mode explicitly. >>> >>> Signed-off-by: Feng Wu <feng.wu@intel.com> >>> --- >>> arch/x86/kvm/vmx.c | 11 ++++++++++- >>> 1 file changed, 10 insertions(+), 1 deletion(-) >>> >>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c >>> index e2951b6..13d14d4 100644 >>> --- a/arch/x86/kvm/vmx.c >>> +++ b/arch/x86/kvm/vmx.c >>> @@ -10764,8 +10764,17 @@ static int vmx_update_pi_irte(struct kvm >> *kvm, unsigned int host_irq, >>> */ >>> >>> kvm_set_msi_irq(e, &irq); >>> - if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) >>> + if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) { >>> + /* >>> + * Make sure the IRTE is in remapped mode if >>> + * we don't handle it in posted mode. >>> + */ >>> + pi_set_sn(vcpu_to_pi_desc(vcpu)); >>> + ret = irq_set_vcpu_affinity(host_irq, NULL); >>> + pi_clear_sn(vcpu_to_pi_desc(vcpu)); >>> + >>> continue; >>> + } >>> >>> vcpu_info.pi_desc_addr = __pa(vcpu_to_pi_desc(vcpu)); >>> vcpu_info.vector = irq.vector; >>> >> >> I am still feel weird with this change: according the semantic of VT-d >> posted interrupt, the interrupt will injected to guest through posted >> notification and /proc/interrupts shows the same meaning. But now, >> without being aware of user, the interrupt changes to legacy way and it >> appears on different entry on /proc/interrupts. It looks weird. > > I don't think it has problem here, IMO, this is exactly how it works. > There should be different entry for the interrupts in VT-d PI mode > and leagcy mode. I am not saying any problem here. Just feel weird. From a normal user's point, he has turned on the VT-d pi and according the semantic of VT-d pi, he should not observe the interrupt through legacy mode, but now he do see it. Maybe print out a message here will be helpful, like what you did for disabled lapic found during irq injection. > > For VT-d PI mode, it is delivered by notification event, for legacy mode, > it is delivered by VFIO. > > Thanks, > Feng > >> >> Any comments? Paolo. >> >> -- >> best regards >> yang
> -----Original Message----- > From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On > Behalf Of Yang Zhang > Sent: Thursday, January 21, 2016 11:35 AM > To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; > rkrcmar@redhat.com > Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org > Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the > interrupt is not single-destination > > On 2016/1/21 11:14, Wu, Feng wrote: > > > > > >> -----Original Message----- > >> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] > >> Sent: Thursday, January 21, 2016 11:06 AM > >> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; > >> rkrcmar@redhat.com > >> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org > >> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the > >> interrupt is not single-destination > >> > >> On 2016/1/20 9:42, Feng Wu wrote: > >>> When the interrupt is not single destination any more, we need > >>> to change back IRTE to remapped mode explicitly. > >>> > >>> Signed-off-by: Feng Wu <feng.wu@intel.com> > >>> --- > >>> arch/x86/kvm/vmx.c | 11 ++++++++++- > >>> 1 file changed, 10 insertions(+), 1 deletion(-) > >>> > >>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > >>> index e2951b6..13d14d4 100644 > >>> --- a/arch/x86/kvm/vmx.c > >>> +++ b/arch/x86/kvm/vmx.c > >>> @@ -10764,8 +10764,17 @@ static int vmx_update_pi_irte(struct kvm > >> *kvm, unsigned int host_irq, > >>> */ > >>> > >>> kvm_set_msi_irq(e, &irq); > >>> - if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) > >>> + if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) { > >>> + /* > >>> + * Make sure the IRTE is in remapped mode if > >>> + * we don't handle it in posted mode. > >>> + */ > >>> + pi_set_sn(vcpu_to_pi_desc(vcpu)); > >>> + ret = irq_set_vcpu_affinity(host_irq, NULL); > >>> + pi_clear_sn(vcpu_to_pi_desc(vcpu)); > >>> + > >>> continue; > >>> + } > >>> > >>> vcpu_info.pi_desc_addr = __pa(vcpu_to_pi_desc(vcpu)); > >>> vcpu_info.vector = irq.vector; > >>> > >> > >> I am still feel weird with this change: according the semantic of VT-d > >> posted interrupt, the interrupt will injected to guest through posted > >> notification and /proc/interrupts shows the same meaning. But now, > >> without being aware of user, the interrupt changes to legacy way and it > >> appears on different entry on /proc/interrupts. It looks weird. > > > > I don't think it has problem here, IMO, this is exactly how it works. > > There should be different entry for the interrupts in VT-d PI mode > > and leagcy mode. > > I am not saying any problem here. Just feel weird. From a normal user's > point, he has turned on the VT-d pi and according the semantic of VT-d > pi, he should not observe the interrupt through legacy mode, but now he > do see it. Maybe print out a message here will be helpful, like what you > did for disabled lapic found during irq injection. Even VT-d PI is on, not all interrupts can be handled by it, the reason the interrupts is changed back to legacy mode is because the user changes the affinity, and it cannot be handle in PI mode, and hence legacy mode is used. It is the user's behavior that cause this mode change, seems it is not so weird to me. But add some message here is good idea, just like what I did later in this function, I can also add the following trace message here. trace_kvm_pi_irte_update(vcpu->vcpu_id, host_irq, e->gsi, vcpu_info.vector, vcpu_info.pi_desc_addr, set); Thanks, Feng > > > > > For VT-d PI mode, it is delivered by notification event, for legacy mode, > > it is delivered by VFIO. > > > > Thanks, > > Feng > > > >> > >> Any comments? Paolo. > >> > >> -- > >> best regards > >> yang > > > -- > best regards > yang > -- > To unsubscribe from this list: send the line "unsubscribe kvm" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> From: Wu, Feng > Sent: Thursday, January 21, 2016 12:43 PM > > > -----Original Message----- > > From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On > > Behalf Of Yang Zhang > > Sent: Thursday, January 21, 2016 11:35 AM > > To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; > > rkrcmar@redhat.com > > Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org > > Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the > > interrupt is not single-destination > > > > On 2016/1/21 11:14, Wu, Feng wrote: > > > > > > > > >> -----Original Message----- > > >> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] > > >> Sent: Thursday, January 21, 2016 11:06 AM > > >> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; > > >> rkrcmar@redhat.com > > >> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org > > >> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the > > >> interrupt is not single-destination > > >> > > >> On 2016/1/20 9:42, Feng Wu wrote: > > >>> When the interrupt is not single destination any more, we need > > >>> to change back IRTE to remapped mode explicitly. > > >>> > > >>> Signed-off-by: Feng Wu <feng.wu@intel.com> > > >>> --- > > >>> arch/x86/kvm/vmx.c | 11 ++++++++++- > > >>> 1 file changed, 10 insertions(+), 1 deletion(-) > > >>> > > >>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > > >>> index e2951b6..13d14d4 100644 > > >>> --- a/arch/x86/kvm/vmx.c > > >>> +++ b/arch/x86/kvm/vmx.c > > >>> @@ -10764,8 +10764,17 @@ static int vmx_update_pi_irte(struct kvm > > >> *kvm, unsigned int host_irq, > > >>> */ > > >>> > > >>> kvm_set_msi_irq(e, &irq); > > >>> - if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) > > >>> + if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) { > > >>> + /* > > >>> + * Make sure the IRTE is in remapped mode if > > >>> + * we don't handle it in posted mode. > > >>> + */ > > >>> + pi_set_sn(vcpu_to_pi_desc(vcpu)); > > >>> + ret = irq_set_vcpu_affinity(host_irq, NULL); > > >>> + pi_clear_sn(vcpu_to_pi_desc(vcpu)); > > >>> + > > >>> continue; > > >>> + } > > >>> > > >>> vcpu_info.pi_desc_addr = __pa(vcpu_to_pi_desc(vcpu)); > > >>> vcpu_info.vector = irq.vector; > > >>> > > >> > > >> I am still feel weird with this change: according the semantic of VT-d > > >> posted interrupt, the interrupt will injected to guest through posted > > >> notification and /proc/interrupts shows the same meaning. But now, > > >> without being aware of user, the interrupt changes to legacy way and it > > >> appears on different entry on /proc/interrupts. It looks weird. > > > > > > I don't think it has problem here, IMO, this is exactly how it works. > > > There should be different entry for the interrupts in VT-d PI mode > > > and leagcy mode. > > > > I am not saying any problem here. Just feel weird. From a normal user's > > point, he has turned on the VT-d pi and according the semantic of VT-d > > pi, he should not observe the interrupt through legacy mode, but now he > > do see it. Maybe print out a message here will be helpful, like what you > > did for disabled lapic found during irq injection. > > Even VT-d PI is on, not all interrupts can be handled by it, the reason the > interrupts is changed back to legacy mode is because the user changes > the affinity, and it cannot be handle in PI mode, and hence legacy mode > is used. It is the user's behavior that cause this mode change, seems it is > not so weird to me. But add some message here is good idea, just like > what I did later in this function, I can also add the following trace > message here. > > trace_kvm_pi_irte_update(vcpu->vcpu_id, host_irq, e->gsi, > vcpu_info.vector, vcpu_info.pi_desc_addr, set); > > Thanks, > Feng > Right. Whether the interrupt is delivered into guest directly with PI or with legacy mode, is completely agnostic to the guest. Guest can't tell or set any expectation on the underlying delivering method, so there is no guest visible impact. Thanks Kevin -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2016/1/21 12:42, Wu, Feng wrote: > > >> -----Original Message----- >> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] On >> Behalf Of Yang Zhang >> Sent: Thursday, January 21, 2016 11:35 AM >> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; >> rkrcmar@redhat.com >> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org >> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the >> interrupt is not single-destination >> >> On 2016/1/21 11:14, Wu, Feng wrote: >>> >>> >>>> -----Original Message----- >>>> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] >>>> Sent: Thursday, January 21, 2016 11:06 AM >>>> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; >>>> rkrcmar@redhat.com >>>> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org >>>> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the >>>> interrupt is not single-destination >>>> >>>> On 2016/1/20 9:42, Feng Wu wrote: >>>>> When the interrupt is not single destination any more, we need >>>>> to change back IRTE to remapped mode explicitly. >>>>> >>>>> Signed-off-by: Feng Wu <feng.wu@intel.com> >>>>> --- >>>>> arch/x86/kvm/vmx.c | 11 ++++++++++- >>>>> 1 file changed, 10 insertions(+), 1 deletion(-) >>>>> >>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c >>>>> index e2951b6..13d14d4 100644 >>>>> --- a/arch/x86/kvm/vmx.c >>>>> +++ b/arch/x86/kvm/vmx.c >>>>> @@ -10764,8 +10764,17 @@ static int vmx_update_pi_irte(struct kvm >>>> *kvm, unsigned int host_irq, >>>>> */ >>>>> >>>>> kvm_set_msi_irq(e, &irq); >>>>> - if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) >>>>> + if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) { >>>>> + /* >>>>> + * Make sure the IRTE is in remapped mode if >>>>> + * we don't handle it in posted mode. >>>>> + */ >>>>> + pi_set_sn(vcpu_to_pi_desc(vcpu)); >>>>> + ret = irq_set_vcpu_affinity(host_irq, NULL); >>>>> + pi_clear_sn(vcpu_to_pi_desc(vcpu)); >>>>> + >>>>> continue; >>>>> + } >>>>> >>>>> vcpu_info.pi_desc_addr = __pa(vcpu_to_pi_desc(vcpu)); >>>>> vcpu_info.vector = irq.vector; >>>>> >>>> >>>> I am still feel weird with this change: according the semantic of VT-d >>>> posted interrupt, the interrupt will injected to guest through posted >>>> notification and /proc/interrupts shows the same meaning. But now, >>>> without being aware of user, the interrupt changes to legacy way and it >>>> appears on different entry on /proc/interrupts. It looks weird. >>> >>> I don't think it has problem here, IMO, this is exactly how it works. >>> There should be different entry for the interrupts in VT-d PI mode >>> and leagcy mode. >> >> I am not saying any problem here. Just feel weird. From a normal user's >> point, he has turned on the VT-d pi and according the semantic of VT-d >> pi, he should not observe the interrupt through legacy mode, but now he >> do see it. Maybe print out a message here will be helpful, like what you >> did for disabled lapic found during irq injection. > > Even VT-d PI is on, not all interrupts can be handled by it, the reason the No, we can handle it but we don't do it due to the complexity.For example, we can use wake up vector to delivery the interrupt which still is in PI mode but doesn't require any mode change. > interrupts is changed back to legacy mode is because the user changes > the affinity, and it cannot be handle in PI mode, and hence legacy mode > is used. It is the user's behavior that cause this mode change, seems it is > not so weird to me. But add some message here is good idea, just like Why user's behavior can change the mode? According the current design, there is no way for user to turn on/off dynamically.Why we need to rollback to legacy mode is we don't want to handle multi-destination interrupt in PI mode but it doesn't mean we cannot do it like i said before.
> -----Original Message----- > From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] > Sent: Thursday, January 21, 2016 1:00 PM > To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; > rkrcmar@redhat.com > Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org > Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the > interrupt is not single-destination > > On 2016/1/21 12:42, Wu, Feng wrote: > > > > > >> -----Original Message----- > >> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] > On > >> Behalf Of Yang Zhang > >> Sent: Thursday, January 21, 2016 11:35 AM > >> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; > >> rkrcmar@redhat.com > >> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org > >> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the > >> interrupt is not single-destination > >> > >> On 2016/1/21 11:14, Wu, Feng wrote: > >>> > >>> > >>>> -----Original Message----- > >>>> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] > >>>> Sent: Thursday, January 21, 2016 11:06 AM > >>>> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; > >>>> rkrcmar@redhat.com > >>>> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org > >>>> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if > the > >>>> interrupt is not single-destination > >>>> > >>>> On 2016/1/20 9:42, Feng Wu wrote: > >>>>> When the interrupt is not single destination any more, we need > >>>>> to change back IRTE to remapped mode explicitly. > >>>>> > >>>>> Signed-off-by: Feng Wu <feng.wu@intel.com> > >>>>> --- > >>>>> arch/x86/kvm/vmx.c | 11 ++++++++++- > >>>>> 1 file changed, 10 insertions(+), 1 deletion(-) > >>>>> > >>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > >>>>> index e2951b6..13d14d4 100644 > >>>>> --- a/arch/x86/kvm/vmx.c > >>>>> +++ b/arch/x86/kvm/vmx.c > >>>>> @@ -10764,8 +10764,17 @@ static int vmx_update_pi_irte(struct kvm > >>>> *kvm, unsigned int host_irq, > >>>>> */ > >>>>> > >>>>> kvm_set_msi_irq(e, &irq); > >>>>> - if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) > >>>>> + if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) { > >>>>> + /* > >>>>> + * Make sure the IRTE is in remapped mode if > >>>>> + * we don't handle it in posted mode. > >>>>> + */ > >>>>> + pi_set_sn(vcpu_to_pi_desc(vcpu)); > >>>>> + ret = irq_set_vcpu_affinity(host_irq, NULL); > >>>>> + pi_clear_sn(vcpu_to_pi_desc(vcpu)); > >>>>> + > >>>>> continue; > >>>>> + } > >>>>> > >>>>> vcpu_info.pi_desc_addr = > __pa(vcpu_to_pi_desc(vcpu)); > >>>>> vcpu_info.vector = irq.vector; > >>>>> > >>>> > >>>> I am still feel weird with this change: according the semantic of VT-d > >>>> posted interrupt, the interrupt will injected to guest through posted > >>>> notification and /proc/interrupts shows the same meaning. But now, > >>>> without being aware of user, the interrupt changes to legacy way and it > >>>> appears on different entry on /proc/interrupts. It looks weird. > >>> > >>> I don't think it has problem here, IMO, this is exactly how it works. > >>> There should be different entry for the interrupts in VT-d PI mode > >>> and leagcy mode. > >> > >> I am not saying any problem here. Just feel weird. From a normal user's > >> point, he has turned on the VT-d pi and according the semantic of VT-d > >> pi, he should not observe the interrupt through legacy mode, but now he > >> do see it. Maybe print out a message here will be helpful, like what you > >> did for disabled lapic found during irq injection. > > > > Even VT-d PI is on, not all interrupts can be handled by it, the reason the > > No, we can handle it but we don't do it due to the complexity.For > example, we can use wake up vector to delivery the interrupt which still > is in PI mode but doesn't require any mode change. I mean, multi-cast and broadcast interrupts cannot be handled in PI mode. > > > interrupts is changed back to legacy mode is because the user changes > > the affinity, and it cannot be handle in PI mode, and hence legacy mode > > is used. It is the user's behavior that cause this mode change, seems it is > > not so weird to me. But add some message here is good idea, just like > > Why user's behavior can change the mode? Like you mentioned before, if the interrupt is changed from single-destination to multiple-destination by guest. And this is the reason of adding the rollback logic here, right? Thanks, Feng > According the current design, > there is no way for user to turn on/off dynamically.Why we need to > rollback to legacy mode is we don't want to handle multi-destination > interrupt in PI mode but it doesn't mean we cannot do it like i said before. > > > -- > best regards > yang -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2016/1/21 13:07, Wu, Feng wrote: > > >> -----Original Message----- >> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] >> Sent: Thursday, January 21, 2016 1:00 PM >> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; >> rkrcmar@redhat.com >> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org >> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the >> interrupt is not single-destination >> >> On 2016/1/21 12:42, Wu, Feng wrote: >>> >>> >>>> -----Original Message----- >>>> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] >> On >>>> Behalf Of Yang Zhang >>>> Sent: Thursday, January 21, 2016 11:35 AM >>>> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; >>>> rkrcmar@redhat.com >>>> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org >>>> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the >>>> interrupt is not single-destination >>>> >>>> On 2016/1/21 11:14, Wu, Feng wrote: >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] >>>>>> Sent: Thursday, January 21, 2016 11:06 AM >>>>>> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; >>>>>> rkrcmar@redhat.com >>>>>> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org >>>>>> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if >> the >>>>>> interrupt is not single-destination >>>>>> >>>>>> On 2016/1/20 9:42, Feng Wu wrote: >>>>>>> When the interrupt is not single destination any more, we need >>>>>>> to change back IRTE to remapped mode explicitly. >>>>>>> >>>>>>> Signed-off-by: Feng Wu <feng.wu@intel.com> >>>>>>> --- >>>>>>> arch/x86/kvm/vmx.c | 11 ++++++++++- >>>>>>> 1 file changed, 10 insertions(+), 1 deletion(-) >>>>>>> >>>>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c >>>>>>> index e2951b6..13d14d4 100644 >>>>>>> --- a/arch/x86/kvm/vmx.c >>>>>>> +++ b/arch/x86/kvm/vmx.c >>>>>>> @@ -10764,8 +10764,17 @@ static int vmx_update_pi_irte(struct kvm >>>>>> *kvm, unsigned int host_irq, >>>>>>> */ >>>>>>> >>>>>>> kvm_set_msi_irq(e, &irq); >>>>>>> - if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) >>>>>>> + if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) { >>>>>>> + /* >>>>>>> + * Make sure the IRTE is in remapped mode if >>>>>>> + * we don't handle it in posted mode. >>>>>>> + */ >>>>>>> + pi_set_sn(vcpu_to_pi_desc(vcpu)); >>>>>>> + ret = irq_set_vcpu_affinity(host_irq, NULL); >>>>>>> + pi_clear_sn(vcpu_to_pi_desc(vcpu)); >>>>>>> + >>>>>>> continue; >>>>>>> + } >>>>>>> >>>>>>> vcpu_info.pi_desc_addr = >> __pa(vcpu_to_pi_desc(vcpu)); >>>>>>> vcpu_info.vector = irq.vector; >>>>>>> >>>>>> >>>>>> I am still feel weird with this change: according the semantic of VT-d >>>>>> posted interrupt, the interrupt will injected to guest through posted >>>>>> notification and /proc/interrupts shows the same meaning. But now, >>>>>> without being aware of user, the interrupt changes to legacy way and it >>>>>> appears on different entry on /proc/interrupts. It looks weird. >>>>> >>>>> I don't think it has problem here, IMO, this is exactly how it works. >>>>> There should be different entry for the interrupts in VT-d PI mode >>>>> and leagcy mode. >>>> >>>> I am not saying any problem here. Just feel weird. From a normal user's >>>> point, he has turned on the VT-d pi and according the semantic of VT-d >>>> pi, he should not observe the interrupt through legacy mode, but now he >>>> do see it. Maybe print out a message here will be helpful, like what you >>>> did for disabled lapic found during irq injection. >>> >>> Even VT-d PI is on, not all interrupts can be handled by it, the reason the >> >> No, we can handle it but we don't do it due to the complexity.For >> example, we can use wake up vector to delivery the interrupt which still >> is in PI mode but doesn't require any mode change. > > I mean, multi-cast and broadcast interrupts cannot be handled in PI mode. We may have different understanding on PI mode. My understanding is if we set the IRTE to PI format, than the subsequent interrupt will be handled in PI mode. multi-cast and broadcast interrupts cannot be injected to guest directly but it doesn't mean cannot be handled in PI mode. As i said, we can handle it in wake up vector or via other approach.But it is much complexity. I agree that rollback to legacy mode is the best choice, but may need some additional messages to tell the user(host administrator) why we change to legacy mode. I think not all of them are familiar with the detail of VT-d PI. If they find there are still some interrupts goto legacy mode even they have turned on PI, they may get confused. > >> >>> interrupts is changed back to legacy mode is because the user changes >>> the affinity, and it cannot be handle in PI mode, and hence legacy mode >>> is used. It is the user's behavior that cause this mode change, seems it is >>> not so weird to me. But add some message here is good idea, just like >> >> Why user's behavior can change the mode? > > Like you mentioned before, if the interrupt is changed from single-destination > to multiple-destination by guest. And this is the reason of adding the rollback > logic here, right? The user means the host administrator. > > Thanks, > Feng > >> According the current design, >> there is no way for user to turn on/off dynamically.Why we need to >> rollback to legacy mode is we don't want to handle multi-destination >> interrupt in PI mode but it doesn't mean we cannot do it like i said before. >> >> >> -- >> best regards >> yang
> -----Original Message----- > From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] > Sent: Thursday, January 21, 2016 1:36 PM > To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; > rkrcmar@redhat.com > Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org > Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the > interrupt is not single-destination > > On 2016/1/21 13:07, Wu, Feng wrote: > > > > > >> -----Original Message----- > >> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] > >> Sent: Thursday, January 21, 2016 1:00 PM > >> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; > >> rkrcmar@redhat.com > >> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org > >> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the > >> interrupt is not single-destination > >> > >> On 2016/1/21 12:42, Wu, Feng wrote: > >>> > >>> > >>>> -----Original Message----- > >>>> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] > >> On > >>>> Behalf Of Yang Zhang > >>>> Sent: Thursday, January 21, 2016 11:35 AM > >>>> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; > >>>> rkrcmar@redhat.com > >>>> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org > >>>> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if > the > >>>> interrupt is not single-destination > >>>> > >>>> On 2016/1/21 11:14, Wu, Feng wrote: > >>>>> > >>>>> > >>>>>> -----Original Message----- > >>>>>> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] > >>>>>> Sent: Thursday, January 21, 2016 11:06 AM > >>>>>> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; > >>>>>> rkrcmar@redhat.com > >>>>>> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org > >>>>>> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if > >> the > >>>>>> interrupt is not single-destination > >>>>>> > >>>>>> On 2016/1/20 9:42, Feng Wu wrote: > >>>>>>> When the interrupt is not single destination any more, we need > >>>>>>> to change back IRTE to remapped mode explicitly. > >>>>>>> > >>>>>>> Signed-off-by: Feng Wu <feng.wu@intel.com> > >>>>>>> --- > >>>>>>> arch/x86/kvm/vmx.c | 11 ++++++++++- > >>>>>>> 1 file changed, 10 insertions(+), 1 deletion(-) > >>>>>>> > >>>>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > >>>>>>> index e2951b6..13d14d4 100644 > >>>>>>> --- a/arch/x86/kvm/vmx.c > >>>>>>> +++ b/arch/x86/kvm/vmx.c > >>>>>>> @@ -10764,8 +10764,17 @@ static int vmx_update_pi_irte(struct > kvm > >>>>>> *kvm, unsigned int host_irq, > >>>>>>> */ > >>>>>>> > >>>>>>> kvm_set_msi_irq(e, &irq); > >>>>>>> - if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) > >>>>>>> + if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) { > >>>>>>> + /* > >>>>>>> + * Make sure the IRTE is in remapped mode if > >>>>>>> + * we don't handle it in posted mode. > >>>>>>> + */ > >>>>>>> + pi_set_sn(vcpu_to_pi_desc(vcpu)); > >>>>>>> + ret = irq_set_vcpu_affinity(host_irq, NULL); > >>>>>>> + pi_clear_sn(vcpu_to_pi_desc(vcpu)); > >>>>>>> + > >>>>>>> continue; > >>>>>>> + } > >>>>>>> > >>>>>>> vcpu_info.pi_desc_addr = > >> __pa(vcpu_to_pi_desc(vcpu)); > >>>>>>> vcpu_info.vector = irq.vector; > >>>>>>> > >>>>>> > >>>>>> I am still feel weird with this change: according the semantic of VT-d > >>>>>> posted interrupt, the interrupt will injected to guest through posted > >>>>>> notification and /proc/interrupts shows the same meaning. But now, > >>>>>> without being aware of user, the interrupt changes to legacy way and > it > >>>>>> appears on different entry on /proc/interrupts. It looks weird. > >>>>> > >>>>> I don't think it has problem here, IMO, this is exactly how it works. > >>>>> There should be different entry for the interrupts in VT-d PI mode > >>>>> and leagcy mode. > >>>> > >>>> I am not saying any problem here. Just feel weird. From a normal user's > >>>> point, he has turned on the VT-d pi and according the semantic of VT-d > >>>> pi, he should not observe the interrupt through legacy mode, but now > he > >>>> do see it. Maybe print out a message here will be helpful, like what you > >>>> did for disabled lapic found during irq injection. > >>> > >>> Even VT-d PI is on, not all interrupts can be handled by it, the reason the > >> > >> No, we can handle it but we don't do it due to the complexity.For > >> example, we can use wake up vector to delivery the interrupt which still > >> is in PI mode but doesn't require any mode change. > > > > I mean, multi-cast and broadcast interrupts cannot be handled in PI mode. > > We may have different understanding on PI mode. My understanding is if > we set the IRTE to PI format, than the subsequent interrupt will be > handled in PI mode. multi-cast and broadcast interrupts cannot be > injected to guest directly but it doesn't mean cannot be handled in PI > mode. As i said, we can handle it in wake up vector or via other > approach.But it is much complexity. For the multicast/broastcast, we cannot set the related IRTE in PI mode, since we cannot set only one destination in IRTE. If an interrupt is for multiple destination, how can you use VT-d PI to injection it to all the destinations? Thanks, Feng -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2016/1/21 13:41, Wu, Feng wrote: > > >> -----Original Message----- >> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] >> Sent: Thursday, January 21, 2016 1:36 PM >> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; >> rkrcmar@redhat.com >> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org >> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the >> interrupt is not single-destination >> >> On 2016/1/21 13:07, Wu, Feng wrote: >>> >>> >>>> -----Original Message----- >>>> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] >>>> Sent: Thursday, January 21, 2016 1:00 PM >>>> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; >>>> rkrcmar@redhat.com >>>> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org >>>> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the >>>> interrupt is not single-destination >>>> >>>> On 2016/1/21 12:42, Wu, Feng wrote: >>>>> >>>>> >>>>>> -----Original Message----- >>>>>> From: kvm-owner@vger.kernel.org [mailto:kvm-owner@vger.kernel.org] >>>> On >>>>>> Behalf Of Yang Zhang >>>>>> Sent: Thursday, January 21, 2016 11:35 AM >>>>>> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; >>>>>> rkrcmar@redhat.com >>>>>> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org >>>>>> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if >> the >>>>>> interrupt is not single-destination >>>>>> >>>>>> On 2016/1/21 11:14, Wu, Feng wrote: >>>>>>> >>>>>>> >>>>>>>> -----Original Message----- >>>>>>>> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] >>>>>>>> Sent: Thursday, January 21, 2016 11:06 AM >>>>>>>> To: Wu, Feng <feng.wu@intel.com>; pbonzini@redhat.com; >>>>>>>> rkrcmar@redhat.com >>>>>>>> Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org >>>>>>>> Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if >>>> the >>>>>>>> interrupt is not single-destination >>>>>>>> >>>>>>>> On 2016/1/20 9:42, Feng Wu wrote: >>>>>>>>> When the interrupt is not single destination any more, we need >>>>>>>>> to change back IRTE to remapped mode explicitly. >>>>>>>>> >>>>>>>>> Signed-off-by: Feng Wu <feng.wu@intel.com> >>>>>>>>> --- >>>>>>>>> arch/x86/kvm/vmx.c | 11 ++++++++++- >>>>>>>>> 1 file changed, 10 insertions(+), 1 deletion(-) >>>>>>>>> >>>>>>>>> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c >>>>>>>>> index e2951b6..13d14d4 100644 >>>>>>>>> --- a/arch/x86/kvm/vmx.c >>>>>>>>> +++ b/arch/x86/kvm/vmx.c >>>>>>>>> @@ -10764,8 +10764,17 @@ static int vmx_update_pi_irte(struct >> kvm >>>>>>>> *kvm, unsigned int host_irq, >>>>>>>>> */ >>>>>>>>> >>>>>>>>> kvm_set_msi_irq(e, &irq); >>>>>>>>> - if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) >>>>>>>>> + if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) { >>>>>>>>> + /* >>>>>>>>> + * Make sure the IRTE is in remapped mode if >>>>>>>>> + * we don't handle it in posted mode. >>>>>>>>> + */ >>>>>>>>> + pi_set_sn(vcpu_to_pi_desc(vcpu)); >>>>>>>>> + ret = irq_set_vcpu_affinity(host_irq, NULL); >>>>>>>>> + pi_clear_sn(vcpu_to_pi_desc(vcpu)); >>>>>>>>> + >>>>>>>>> continue; >>>>>>>>> + } >>>>>>>>> >>>>>>>>> vcpu_info.pi_desc_addr = >>>> __pa(vcpu_to_pi_desc(vcpu)); >>>>>>>>> vcpu_info.vector = irq.vector; >>>>>>>>> >>>>>>>> >>>>>>>> I am still feel weird with this change: according the semantic of VT-d >>>>>>>> posted interrupt, the interrupt will injected to guest through posted >>>>>>>> notification and /proc/interrupts shows the same meaning. But now, >>>>>>>> without being aware of user, the interrupt changes to legacy way and >> it >>>>>>>> appears on different entry on /proc/interrupts. It looks weird. >>>>>>> >>>>>>> I don't think it has problem here, IMO, this is exactly how it works. >>>>>>> There should be different entry for the interrupts in VT-d PI mode >>>>>>> and leagcy mode. >>>>>> >>>>>> I am not saying any problem here. Just feel weird. From a normal user's >>>>>> point, he has turned on the VT-d pi and according the semantic of VT-d >>>>>> pi, he should not observe the interrupt through legacy mode, but now >> he >>>>>> do see it. Maybe print out a message here will be helpful, like what you >>>>>> did for disabled lapic found during irq injection. >>>>> >>>>> Even VT-d PI is on, not all interrupts can be handled by it, the reason the >>>> >>>> No, we can handle it but we don't do it due to the complexity.For >>>> example, we can use wake up vector to delivery the interrupt which still >>>> is in PI mode but doesn't require any mode change. >>> >>> I mean, multi-cast and broadcast interrupts cannot be handled in PI mode. >> >> We may have different understanding on PI mode. My understanding is if >> we set the IRTE to PI format, than the subsequent interrupt will be >> handled in PI mode. multi-cast and broadcast interrupts cannot be >> injected to guest directly but it doesn't mean cannot be handled in PI >> mode. As i said, we can handle it in wake up vector or via other >> approach.But it is much complexity. > > For the multicast/broastcast, we cannot set the related IRTE in PI > mode, since we cannot set only one destination in IRTE. If an interrupt > is for multiple destination, how can you use VT-d PI to injection it > to all the destinations? You may still not get my point. Anyway, it doesn't matter. Rollback to legacy mode still is the best choice so far.
2016-01-20 09:42+0800, Feng Wu: > When the interrupt is not single destination any more, we need > to change back IRTE to remapped mode explicitly. > > Signed-off-by: Feng Wu <feng.wu@intel.com> > --- > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > @@ -10764,8 +10764,17 @@ static int vmx_update_pi_irte(struct kvm *kvm, unsigned int host_irq, > - if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) > + if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) { > + /* > + * Make sure the IRTE is in remapped mode if > + * we don't handle it in posted mode. > + */ > + pi_set_sn(vcpu_to_pi_desc(vcpu)); What could go wrong if we didn't suppress notifications here? Thanks. > + ret = irq_set_vcpu_affinity(host_irq, NULL); > + pi_clear_sn(vcpu_to_pi_desc(vcpu)); > + > continue; > + } -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
2016-01-21 13:44+0800, Yang Zhang: > On 2016/1/21 13:41, Wu, Feng wrote: >>>From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] >>>We may have different understanding on PI mode. My understanding is if >>>we set the IRTE to PI format, than the subsequent interrupt will be >>>handled in PI mode. multi-cast and broadcast interrupts cannot be >>>injected to guest directly but it doesn't mean cannot be handled in PI >>>mode. As i said, we can handle it in wake up vector or via other >>>approach.But it is much complexity. KVM has to intercept the interrupt, so we'd need to trigger a deferred work from the notification handler to send the multicast. Reusing existing PI vectors would mean slowing them down, so we should define a new PI notification vector just for this purpose, which would be confusing in /proc/interrupts anyway. On top of that, we'd need to define new PIRR array(s) and create unique PID for every IRTE, to avoid parsing those PIRR arrays as the vector is stored in IRTE ... it's going a bit too far, I guess. >>For the multicast/broastcast, we cannot set the related IRTE in PI >>mode, since we cannot set only one destination in IRTE. If an interrupt >>is for multiple destination, how can you use VT-d PI to injection it >>to all the destinations? > > You may still not get my point. Anyway, it doesn't matter. Rollback to > legacy mode still is the best choice so far. I think we can't do much better than we do now. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Radim Kr?má? [mailto:rkrcmar@redhat.com] > Sent: Friday, January 22, 2016 12:20 AM > To: Wu, Feng <feng.wu@intel.com> > Cc: pbonzini@redhat.com; linux-kernel@vger.kernel.org; > kvm@vger.kernel.org > Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the > interrupt is not single-destination > > 2016-01-20 09:42+0800, Feng Wu: > > When the interrupt is not single destination any more, we need > > to change back IRTE to remapped mode explicitly. > > > > Signed-off-by: Feng Wu <feng.wu@intel.com> > > --- > > diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c > > @@ -10764,8 +10764,17 @@ static int vmx_update_pi_irte(struct kvm > *kvm, unsigned int host_irq, > > - if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) > > + if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) { > > + /* > > + * Make sure the IRTE is in remapped mode if > > + * we don't handle it in posted mode. > > + */ > > + pi_set_sn(vcpu_to_pi_desc(vcpu)); > > What could go wrong if we didn't suppress notifications here? This is a good question. I also thought about this before, but after thinking it a bit more, seems we don't need to do this. If we don't do this, the in-flight interrupts will continue to be delivered in PI mode while we are changing it to remapped mode in IRTE. Even if we do this, the in-flight interrupts are also delivered in PI mode before setting 'SN' anyway, so seems we really don't need this, what is your opinion? Thanks, Feng Thanks, Feng > > Thanks. > > > + ret = irq_set_vcpu_affinity(host_irq, NULL); > > + pi_clear_sn(vcpu_to_pi_desc(vcpu)); > > + > > continue; > > + } -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2016/1/22 0:35, rkrcmar@redhat.com wrote: > 2016-01-21 13:44+0800, Yang Zhang: >> On 2016/1/21 13:41, Wu, Feng wrote: >>>> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] >>>> We may have different understanding on PI mode. My understanding is if >>>> we set the IRTE to PI format, than the subsequent interrupt will be >>>> handled in PI mode. multi-cast and broadcast interrupts cannot be >>>> injected to guest directly but it doesn't mean cannot be handled in PI >>>> mode. As i said, we can handle it in wake up vector or via other >>>> approach.But it is much complexity. > > KVM has to intercept the interrupt, so we'd need to trigger a deferred > work from the notification handler to send the multicast. > Reusing existing PI vectors would mean slowing them down, so we should > define a new PI notification vector just for this purpose, which would > be confusing in /proc/interrupts anyway. > On top of that, we'd need to define new PIRR array(s) and create unique > PID for every IRTE, to avoid parsing those PIRR arrays as the vector is > stored in IRTE ... it's going a bit too far, I guess. Not so complicated. We can reuse the wake up vector and check whether the interrupt is multicast when one of destination vcpu handles it. If it is multicast, then also notifies other vcpus. It is totally handed in PI mode and we already have the wakeup vector in /proc/interrupts. > >>> For the multicast/broastcast, we cannot set the related IRTE in PI >>> mode, since we cannot set only one destination in IRTE. If an interrupt >>> is for multiple destination, how can you use VT-d PI to injection it >>> to all the destinations? >> >> You may still not get my point. Anyway, it doesn't matter. Rollback to >> legacy mode still is the best choice so far. > > I think we can't do much better than we do now.
2016-01-22 01:49+0000, Wu, Feng: >> From: Radim Kr?má? [mailto:rkrcmar@redhat.com] >> 2016-01-20 09:42+0800, Feng Wu: >> > - if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) >> > + if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) { >> > + /* >> > + * Make sure the IRTE is in remapped mode if >> > + * we don't handle it in posted mode. >> > + */ >> > + pi_set_sn(vcpu_to_pi_desc(vcpu)); >> >> What could go wrong if we didn't suppress notifications here? > > This is a good question. I also thought about this before, but after > thinking it a bit more, seems we don't need to do this. > If we don't do this, the in-flight interrupts will continue to be > delivered in PI mode while we are changing it to remapped > mode in IRTE. Even if we do this, the in-flight interrupts are > also delivered in PI mode before setting 'SN' anyway, so seems > we really don't need this, what is your opinion? I'd remove it. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
2016-01-22 10:03+0800, Yang Zhang: > On 2016/1/22 0:35, rkrcmar@redhat.com wrote: >>2016-01-21 13:44+0800, Yang Zhang: >>>On 2016/1/21 13:41, Wu, Feng wrote: >>>>>From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] >>>>>We may have different understanding on PI mode. My understanding is if >>>>>we set the IRTE to PI format, than the subsequent interrupt will be >>>>>handled in PI mode. multi-cast and broadcast interrupts cannot be >>>>>injected to guest directly but it doesn't mean cannot be handled in PI >>>>>mode. As i said, we can handle it in wake up vector or via other >>>>>approach.But it is much complexity. >> >>KVM has to intercept the interrupt, so we'd need to trigger a deferred >>work from the notification handler to send the multicast. >>Reusing existing PI vectors would mean slowing them down, so we should >>define a new PI notification vector just for this purpose, which would >>be confusing in /proc/interrupts anyway. >>On top of that, we'd need to define new PIRR array(s) and create unique >>PID for every IRTE, to avoid parsing those PIRR arrays as the vector is >>stored in IRTE ... it's going a bit too far, I guess. > > Not so complicated. We can reuse the wake up vector and check whether the > interrupt is multicast when one of destination vcpu handles it. I'm not sure what you mean now ... I guess it is: - Deliver the interrupt to a guest VCPU and relay the multicast to other VCPUs. No, it's strictly worse than intercepting it in the host. - Modify host's wakeup vector handler to send the multicast. It's so complicated, because all information you start with in the host is a vector number. You start with no idea what the multicast interrupt should be. We could add per-multicast PID to the list of parsed PIDs in wakeup_handler and use PID->multicast interrupt mapping to tell which interrupt we should send, but that seems worse than just delivering a non-remapped interrupt. Also, if wakeup vector were used for wakeup and multicast, we'd be uselessly doing work, because we can't tell which reason triggered the interrupt before finishing one part -- using separate vectors for that would be a bit nicer. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2016/1/22 21:31, rkrcmar@redhat.com wrote: > 2016-01-22 10:03+0800, Yang Zhang: >> On 2016/1/22 0:35, rkrcmar@redhat.com wrote: >>> 2016-01-21 13:44+0800, Yang Zhang: >>>> On 2016/1/21 13:41, Wu, Feng wrote: >>>>>> From: Yang Zhang [mailto:yang.zhang.wz@gmail.com] >>>>>> We may have different understanding on PI mode. My understanding is if >>>>>> we set the IRTE to PI format, than the subsequent interrupt will be >>>>>> handled in PI mode. multi-cast and broadcast interrupts cannot be >>>>>> injected to guest directly but it doesn't mean cannot be handled in PI >>>>>> mode. As i said, we can handle it in wake up vector or via other >>>>>> approach.But it is much complexity. >>> >>> KVM has to intercept the interrupt, so we'd need to trigger a deferred >>> work from the notification handler to send the multicast. >>> Reusing existing PI vectors would mean slowing them down, so we should >>> define a new PI notification vector just for this purpose, which would >>> be confusing in /proc/interrupts anyway. >>> On top of that, we'd need to define new PIRR array(s) and create unique >>> PID for every IRTE, to avoid parsing those PIRR arrays as the vector is >>> stored in IRTE ... it's going a bit too far, I guess. >> >> Not so complicated. We can reuse the wake up vector and check whether the >> interrupt is multicast when one of destination vcpu handles it. > > I'm not sure what you mean now ... I guess it is: > - Deliver the interrupt to a guest VCPU and relay the multicast to other > VCPUs. No, it's strictly worse than intercepting it in the host. It is still handled in host context not guest context. The wakeup event cannot be consumed like posted event. So it relies on hypervisor to inject the interrupt to guest. We can add the check at this point. > > - Modify host's wakeup vector handler to send the multicast. > It's so complicated, because all information you start with in the > host is a vector number. You start with no idea what the multicast > interrupt should be. > > We could add per-multicast PID to the list of parsed PIDs in > wakeup_handler and use PID->multicast interrupt mapping to tell which > interrupt we should send, but that seems worse than just delivering a > non-remapped interrupt. > > Also, if wakeup vector were used for wakeup and multicast, we'd be > uselessly doing work, because we can't tell which reason triggered the > interrupt before finishing one part -- using separate vectors for that > would be a bit nicer. >
On 22/01/2016 14:05, Radim Krcmár wrote: > > This is a good question. I also thought about this before, but after > > thinking it a bit more, seems we don't need to do this. > > If we don't do this, the in-flight interrupts will continue to be > > delivered in PI mode while we are changing it to remapped > > mode in IRTE. Even if we do this, the in-flight interrupts are > > also delivered in PI mode before setting 'SN' anyway, so seems > > we really don't need this, what is your opinion? > I'd remove it. It may be necessary because IRTE writes (128 bits) are not atomic. If so, no need to send v5, I'll add it back. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
DQoNCj4gLS0tLS1PcmlnaW5hbCBNZXNzYWdlLS0tLS0NCj4gRnJvbTogUGFvbG8gQm9uemluaSBb bWFpbHRvOnBhb2xvLmJvbnppbmlAZ21haWwuY29tXSBPbiBCZWhhbGYgT2YgUGFvbG8NCj4gQm9u emluaQ0KPiBTZW50OiBNb25kYXksIEphbnVhcnkgMjUsIDIwMTYgODoyMyBQTQ0KPiBUbzogUmFk aW0gS3JjbcOhciA8cmtyY21hckByZWRoYXQuY29tPjsgV3UsIEZlbmcgPGZlbmcud3VAaW50ZWwu Y29tPg0KPiBDYzogbGludXgta2VybmVsQHZnZXIua2VybmVsLm9yZzsga3ZtQHZnZXIua2VybmVs Lm9yZw0KPiBTdWJqZWN0OiBSZTogW1BBVENIIHYzIDEvNF0gS1ZNOiBSZWNvdmVyIElSVEUgdG8g cmVtYXBwZWQgbW9kZSBpZiB0aGUNCj4gaW50ZXJydXB0IGlzIG5vdCBzaW5nbGUtZGVzdGluYXRp b24NCj4gDQo+IA0KPiANCj4gT24gMjIvMDEvMjAxNiAxNDowNSwgUmFkaW0gS3JjbcOhciB3cm90 ZToNCj4gPiA+IFRoaXMgaXMgYSBnb29kIHF1ZXN0aW9uLiBJIGFsc28gdGhvdWdodCBhYm91dCB0 aGlzIGJlZm9yZSwgYnV0IGFmdGVyDQo+ID4gPiB0aGlua2luZyBpdCBhIGJpdCBtb3JlLCBzZWVt cyB3ZSBkb24ndCBuZWVkIHRvIGRvIHRoaXMuDQo+ID4gPiBJZiB3ZSBkb24ndCBkbyB0aGlzLCB0 aGUgaW4tZmxpZ2h0IGludGVycnVwdHMgd2lsbCBjb250aW51ZSB0byBiZQ0KPiA+ID4gZGVsaXZl cmVkIGluIFBJIG1vZGUgd2hpbGUgd2UgYXJlIGNoYW5naW5nIGl0IHRvIHJlbWFwcGVkDQo+ID4g PiBtb2RlIGluIElSVEUuIEV2ZW4gaWYgd2UgZG8gdGhpcywgdGhlIGluLWZsaWdodCBpbnRlcnJ1 cHRzIGFyZQ0KPiA+ID4gYWxzbyBkZWxpdmVyZWQgaW4gUEkgbW9kZSBiZWZvcmUgc2V0dGluZyAn U04nIGFueXdheSwgc28gc2VlbXMNCj4gPiA+IHdlIHJlYWxseSBkb24ndCBuZWVkIHRoaXMsIHdo YXQgaXMgeW91ciBvcGluaW9uPw0KPiA+IEknZCByZW1vdmUgaXQuDQo+IA0KPiBJdCBtYXkgYmUg bmVjZXNzYXJ5IGJlY2F1c2UgSVJURSB3cml0ZXMgKDEyOCBiaXRzKSBhcmUgbm90IGF0b21pYy4N Cg0KSVJURSBpcyB1cGRhdGVkIGF0b21pY2FsbHksIEkgYWRkZWQgdGhlIHBhdGNoIHRvIHN1cHBv cnQgdGhpcy4gUGxlYXNlDQpyZWZlciB0byAzNDRjYjRlMGI2ZjNhMGRiZWYwNjQzZWFjYjQ5NDYz MzhlYjIyOGMwLg0KDQpUaGFua3MsDQpGZW5nDQoNCj4gDQo+IElmIHNvLCBubyBuZWVkIHRvIHNl bmQgdjUsIEknbGwgYWRkIGl0IGJhY2suDQo+IA0KPiBQYW9sbw0K -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 25/01/2016 13:26, Wu, Feng wrote: >> > It may be necessary because IRTE writes (128 bits) are not atomic. > IRTE is updated atomically, I added the patch to support this. Please > refer to 344cb4e0b6f3a0dbef0643eacb4946338eb228c0. Great, I hadn't noticed that patch. Thanks. Paolo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Paolo Bonzini [mailto:pbonzini@redhat.com] > Sent: Monday, January 25, 2016 8:39 PM > To: Wu, Feng <feng.wu@intel.com>; Radim Krcmár <rkrcmar@redhat.com> > Cc: linux-kernel@vger.kernel.org; kvm@vger.kernel.org > Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the > interrupt is not single-destination > > > > On 25/01/2016 13:26, Wu, Feng wrote: > >> > It may be necessary because IRTE writes (128 bits) are not atomic. > > IRTE is updated atomically, I added the patch to support this. Please > > refer to 344cb4e0b6f3a0dbef0643eacb4946338eb228c0. > > Great, I hadn't noticed that patch. Thanks. I might forget to cc KVM mailing list, sorry for that! Thanks, Feng > > Paolo
2016-01-25 09:49+0800, Yang Zhang: > On 2016/1/22 21:31, rkrcmar@redhat.com wrote: >>2016-01-22 10:03+0800, Yang Zhang: >>>Not so complicated. We can reuse the wake up vector and check whether the >>>interrupt is multicast when one of destination vcpu handles it. >> >>I'm not sure what you mean now ... I guess it is: >>- Deliver the interrupt to a guest VCPU and relay the multicast to other >> VCPUs. No, it's strictly worse than intercepting it in the host. > > It is still handled in host context not guest context. The wakeup event > cannot be consumed like posted event. Ok. ("when one of destination vcpu handles it" confused me into thinking that you'd like to handle it with the notification vector.) > So it relies on hypervisor to inject > the interrupt to guest. We can add the check at this point. Yes, but I don't think we want to do that, because of following drawbacks: >>- Modify host's wakeup vector handler to send the multicast. >> It's so complicated, because all information you start with in the >> host is a vector number. You start with no idea what the multicast >> interrupt should be. >> >> We could add per-multicast PID to the list of parsed PIDs in >> wakeup_handler and use PID->multicast interrupt mapping to tell which >> interrupt we should send, but that seems worse than just delivering a >> non-remapped interrupt. (should have been "remapped, but non-posted".) >> Also, if wakeup vector were used for wakeup and multicast, we'd be >> uselessly doing work, because we can't tell which reason triggered the >> interrupt before finishing one part -- using separate vectors for that >> would be a bit nicer. (imprecise -- we would always have to check for ON bit of all PIDs from blocked VCPUs, for the original meaning of wakeup vector, and always either read the PIRR or check for ON bit of all PIDs that encode multicast interrupts; then we have to clear ON bits for multicasts.) --- There might be a benefit of using posted interrupts for host interrupts when we run out of free interrupt vectors: we could start using vectors by multiple sources through posted interrupts, if using posted interrupts is the fastest way to distinguish the interrupt source. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
2016-01-25 12:26+0000, Wu, Feng: >> From: Paolo Bonzini [mailto:paolo.bonzini@gmail.com] On Behalf Of Paolo >> It may be necessary because IRTE writes (128 bits) are not atomic. > > IRTE is updated atomically, I added the patch to support this. Please > refer to 344cb4e0b6f3a0dbef0643eacb4946338eb228c0. I also think that SN bit is not affected by atomicity: if the IRTE could have been read half-updated while changing from posted to non-posted, then it wouldn't point to the correct PID, because its address is not within 64 bits, so the SN bit wouldn't matter. IRTE invalidation seems important in VT-d ... -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
> -----Original Message----- > From: Radim Krcmár [mailto:rkrcmar@redhat.com] > Sent: Monday, January 25, 2016 10:06 PM > To: Wu, Feng <feng.wu@intel.com> > Cc: Paolo Bonzini <pbonzini@redhat.com>; linux-kernel@vger.kernel.org; > kvm@vger.kernel.org > Subject: Re: [PATCH v3 1/4] KVM: Recover IRTE to remapped mode if the > interrupt is not single-destination > > 2016-01-25 12:26+0000, Wu, Feng: > >> From: Paolo Bonzini [mailto:paolo.bonzini@gmail.com] On Behalf Of Paolo > >> It may be necessary because IRTE writes (128 bits) are not atomic. > > > > IRTE is updated atomically, I added the patch to support this. Please > > refer to 344cb4e0b6f3a0dbef0643eacb4946338eb228c0. > > I also think that SN bit is not affected by atomicity: if the IRTE could > have been read half-updated while changing from posted to non-posted, > then it wouldn't point to the correct PID, because its address is not > within 64 bits, so the SN bit wouldn't matter. Yes, like the comments in the commit, we should atomically update the IRTE in PI case (PI -> non-PI, non-PI -> PI), without which, it cannot guarantee the correctness, since 'pda' is not within 64 bits, like Radim pointed out above. Thanks, Feng > > IRTE invalidation seems important in VT-d ... -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2016/1/25 21:59, rkrcmar@redhat.com wrote: > 2016-01-25 09:49+0800, Yang Zhang: >> On 2016/1/22 21:31, rkrcmar@redhat.com wrote: >>> 2016-01-22 10:03+0800, Yang Zhang: >>>> Not so complicated. We can reuse the wake up vector and check whether the >>>> interrupt is multicast when one of destination vcpu handles it. >>> >>> I'm not sure what you mean now ... I guess it is: >>> - Deliver the interrupt to a guest VCPU and relay the multicast to other >>> VCPUs. No, it's strictly worse than intercepting it in the host. >> >> It is still handled in host context not guest context. The wakeup event >> cannot be consumed like posted event. > > Ok. ("when one of destination vcpu handles it" confused me into > thinking that you'd like to handle it with the notification vector.) Sorry for my poor english. :( > >> So it relies on hypervisor to inject >> the interrupt to guest. We can add the check at this point. > > Yes, but I don't think we want to do that, because of following > drawbacks: > >>> - Modify host's wakeup vector handler to send the multicast. >>> It's so complicated, because all information you start with in the >>> host is a vector number. You start with no idea what the multicast >>> interrupt should be. >>> >>> We could add per-multicast PID to the list of parsed PIDs in >>> wakeup_handler and use PID->multicast interrupt mapping to tell which >>> interrupt we should send, but that seems worse than just delivering a >>> non-remapped interrupt. > > (should have been "remapped, but non-posted".) > >>> Also, if wakeup vector were used for wakeup and multicast, we'd be >>> uselessly doing work, because we can't tell which reason triggered the >>> interrupt before finishing one part -- using separate vectors for that >>> would be a bit nicer. > > (imprecise -- we would always have to check for ON bit of all PIDs from > blocked VCPUs, for the original meaning of wakeup vector, and always This is what KVM does currently. > either read the PIRR or check for ON bit of all PIDs that encode > multicast interrupts; then we have to clear ON bits for multicasts.) Also, most part of work is covered by current logic except checking the multicast. > > > --- > There might be a benefit of using posted interrupts for host interrupts > when we run out of free interrupt vectors: we could start using vectors > by multiple sources through posted interrupts, if using posted Do you mean per vcpu posted interrupts? > interrupts is the fastest way to distinguish the interrupt source.
2016-01-26 09:44+0800, Yang Zhang: > On 2016/1/25 21:59, rkrcmar@redhat.com wrote: >>2016-01-25 09:49+0800, Yang Zhang: >>>On 2016/1/22 21:31, rkrcmar@redhat.com wrote: >>>>2016-01-22 10:03+0800, Yang Zhang: >>>>>Not so complicated. We can reuse the wake up vector and check whether the >>>>>interrupt is multicast when one of destination vcpu handles it. >>>> >>>>I'm not sure what you mean now ... I guess it is: >>>>- Deliver the interrupt to a guest VCPU and relay the multicast to other >>>> VCPUs. No, it's strictly worse than intercepting it in the host. >>> >>>It is still handled in host context not guest context. The wakeup event >>>cannot be consumed like posted event. >> >>Ok. ("when one of destination vcpu handles it" confused me into >>thinking that you'd like to handle it with the notification vector.) > > Sorry for my poor english. :( It's good. Ambiguity is hard to avoid if a reader doesn't want to assume only the most likely meaning. >>>> Also, if wakeup vector were used for wakeup and multicast, we'd be >>>> uselessly doing work, because we can't tell which reason triggered the >>>> interrupt before finishing one part -- using separate vectors for that >>>> would be a bit nicer. >> >>(imprecise -- we would always have to check for ON bit of all PIDs from >> blocked VCPUs, for the original meaning of wakeup vector, and always > > This is what KVM does currently. Yep. >> either read the PIRR or check for ON bit of all PIDs that encode >> multicast interrupts; then we have to clear ON bits for multicasts.) > > Also, most part of work is covered by current logic except checking the > multicast. We could reuse the setup that gets us to wakeup_handler, but there is nothing to share in the handler itself. Sharing a handler means that we always have to execute both parts. We must create new PID anyway and compared to the extra work needed for multicast handling, a new vector + handler is a relatively small code investment that adds clarity to the design (and performance). (Taking the vector splitting to the extreme, we'd improve performance if we added a vector per assigned device. That is practically the same as non-posted mode, just more complicated.) >>--- >>There might be a benefit of using posted interrupts for host interrupts >>when we run out of free interrupt vectors: we could start using vectors >>by multiple sources through posted interrupts, if using posted > > Do you mean per vcpu posted interrupts? I mean using posting for host device interrupts (no virt involved). Let's say we have 300 devices for one CPU and CPU has 200 useable vectors. We have 100 device interrupts that need to be shared in some vectors and using posting might be faster than directly checking multiple devices. (I couldn't come up with a plausible scenario where we might want to use posting for host interrupts.) -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 2016/1/27 2:22, rkrcmar@redhat.com wrote: > 2016-01-26 09:44+0800, Yang Zhang: >> On 2016/1/25 21:59, rkrcmar@redhat.com wrote: >>> 2016-01-25 09:49+0800, Yang Zhang: >>>> On 2016/1/22 21:31, rkrcmar@redhat.com wrote: >>>>> 2016-01-22 10:03+0800, Yang Zhang: >>>>>> Not so complicated. We can reuse the wake up vector and check whether the >>>>>> interrupt is multicast when one of destination vcpu handles it. >>>>> >>>>> I'm not sure what you mean now ... I guess it is: >>>>> - Deliver the interrupt to a guest VCPU and relay the multicast to other >>>>> VCPUs. No, it's strictly worse than intercepting it in the host. >>>> >>>> It is still handled in host context not guest context. The wakeup event >>>> cannot be consumed like posted event. >>> >>> Ok. ("when one of destination vcpu handles it" confused me into >>> thinking that you'd like to handle it with the notification vector.) >> >> Sorry for my poor english. :( > > It's good. Ambiguity is hard to avoid if a reader doesn't want to > assume only the most likely meaning. > >>>>> Also, if wakeup vector were used for wakeup and multicast, we'd be >>>>> uselessly doing work, because we can't tell which reason triggered the >>>>> interrupt before finishing one part -- using separate vectors for that >>>>> would be a bit nicer. >>> >>> (imprecise -- we would always have to check for ON bit of all PIDs from >>> blocked VCPUs, for the original meaning of wakeup vector, and always >> >> This is what KVM does currently. > > Yep. > >>> either read the PIRR or check for ON bit of all PIDs that encode >>> multicast interrupts; then we have to clear ON bits for multicasts.) >> >> Also, most part of work is covered by current logic except checking the >> multicast. > > We could reuse the setup that gets us to wakeup_handler, but there is > nothing to share in the handler itself. Sharing a handler means that we > always have to execute both parts. I don't quite understand it. There is nothing need to be modified for wakeup logic. The only thing we need to do is add the checking before the vcpu pick up the pending interrupt(This is happened in VCPU context, not in handler). > > We must create new PID anyway and compared to the extra work needed for > multicast handling, a new vector + handler is a relatively small code > investment that adds clarity to the design (and performance). No new PID is needed. If the target vcpu is running, no additional work is required in wakeup handler. If target vcpu is not running, the current logic will wake up the vcpu, then let vcpu itself to check whether pending interrupt is a multicast and handle it in vcpu's context. > > (Taking the vector splitting to the extreme, we'd improve performance if > we added a vector per assigned device. That is practically the same as > non-posted mode, just more complicated.) > >>> --- >>> There might be a benefit of using posted interrupts for host interrupts >>> when we run out of free interrupt vectors: we could start using vectors >>> by multiple sources through posted interrupts, if using posted >> >> Do you mean per vcpu posted interrupts? > > I mean using posting for host device interrupts (no virt involved). > > Let's say we have 300 devices for one CPU and CPU has 200 useable > vectors. We have 100 device interrupts that need to be shared in some > vectors and using posting might be faster than directly checking > multiple devices. Yes, this is an good point.
2016-01-27 10:07+0800, Yang Zhang: > On 2016/1/27 2:22, rkrcmar@redhat.com wrote: >>2016-01-26 09:44+0800, Yang Zhang: >>>On 2016/1/25 21:59, rkrcmar@redhat.com wrote: >>>>>> Also, if wakeup vector were used for wakeup and multicast, we'd be >>>>>> uselessly doing work, because we can't tell which reason triggered the >>>>>> interrupt before finishing one part -- using separate vectors for that >>>>>> would be a bit nicer. >>>> >>>>(imprecise -- we would always have to check for ON bit of all PIDs from >>>> blocked VCPUs, for the original meaning of wakeup vector, and always >>>> either read the PIRR or check for ON bit of all PIDs that encode >>>> multicast interrupts; then we have to clear ON bits for multicasts.) >>> >>>Also, most part of work is covered by current logic except checking the >>>multicast. >> >>We could reuse the setup that gets us to wakeup_handler, but there is >>nothing to share in the handler itself. Sharing a handler means that we >>always have to execute both parts. > > I don't quite understand it. There is nothing need to be modified for wakeup > logic. The only thing we need to do is add the checking before the vcpu pick > up the pending interrupt(This is happened in VCPU context, not in handler). I see, there are few problems with that. >>We must create new PID anyway and compared to the extra work needed for >>multicast handling, a new vector + handler is a relatively small code >>investment that adds clarity to the design (and performance). > > No new PID is needed. If the target vcpu is running, no additional work is > required in wakeup handler. If target vcpu is not running, the current logic > will wake up the vcpu, then let vcpu itself to check whether pending > interrupt is a multicast and handle it in vcpu's context. We do need a new PID. The existing VCPU PID switches between wakeup vector and notification vector, so if the VCPU was running when the device triggered an interrupt, we'd deliver the posted interrupt without exiting, but we need to handle the interrupt in the host. => We need at least one PID that is never set to notification vector. Reusing VCPU's PIRR is in new PID(s) is not doable. Parsing PIRR would be our only option of recognizing multicast interrupts and if the guest configured many sources to send the same vector, we'd have to do unacceptable things to tell which one was triggered. => We also need at least on one new PIRR. Handling the interrupt in VCPU context doesn't pose any advantage and we even want to do it outside, because all VCPUs can be running when the interrupt arrives and can therefore be posted further. I hope I covered other disadvantages of PIDs and PIRRs earlier. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c index e2951b6..13d14d4 100644 --- a/arch/x86/kvm/vmx.c +++ b/arch/x86/kvm/vmx.c @@ -10764,8 +10764,17 @@ static int vmx_update_pi_irte(struct kvm *kvm, unsigned int host_irq, */ kvm_set_msi_irq(e, &irq); - if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) + if (!kvm_intr_is_single_vcpu(kvm, &irq, &vcpu)) { + /* + * Make sure the IRTE is in remapped mode if + * we don't handle it in posted mode. + */ + pi_set_sn(vcpu_to_pi_desc(vcpu)); + ret = irq_set_vcpu_affinity(host_irq, NULL); + pi_clear_sn(vcpu_to_pi_desc(vcpu)); + continue; + } vcpu_info.pi_desc_addr = __pa(vcpu_to_pi_desc(vcpu)); vcpu_info.vector = irq.vector;
When the interrupt is not single destination any more, we need to change back IRTE to remapped mode explicitly. Signed-off-by: Feng Wu <feng.wu@intel.com> --- arch/x86/kvm/vmx.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-)