Message ID | 20181119172536.52649-2-mimu@linux.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | KVM: s390: make use of the GIB | expand |
On Mon, 19 Nov 2018 18:25:25 +0100 Michael Mueller <mimu@linux.ibm.com> wrote: > Do not call __deliver_io() for adapter interruptions already > pending in the IPM. That is a double effort. They will > be processed as soon the vcpu control is given to SIE. > > Signed-off-by: Michael Mueller <mimu@linux.ibm.com> > --- > arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++------------------------- > 1 file changed, 25 insertions(+), 29 deletions(-) I think this patch does what it says on the tin, but I'm a bit lost as to the why. (Might make more sense with the gib.) Currently, we are trying to process any I/O interrupts, even if we'd get them delivered via the gisa, when we're out of the SIE anyway. IIRC, while this looks a bit like a belt-and-suspenders approach, it also prevented performance problems when the vcpu did not go back into the SIE immediately (it even may exit to userspace). Also, if you're ignoring the I/O interrupts pending in the ipm, you may end up delivering interrupts with a lower priority (higher isc) first. I'm not sure that's what we want. But maybe I'm just missing another bit of the code that makes this safe. Can you elaborate a bit?
On 20.11.18 12:33, Cornelia Huck wrote: > On Mon, 19 Nov 2018 18:25:25 +0100 > Michael Mueller <mimu@linux.ibm.com> wrote: > >> Do not call __deliver_io() for adapter interruptions already >> pending in the IPM. That is a double effort. They will >> be processed as soon the vcpu control is given to SIE. >> >> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> >> --- >> arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++------------------------- >> 1 file changed, 25 insertions(+), 29 deletions(-) > > I think this patch does what it says on the tin, but I'm a bit lost as > to the why. (Might make more sense with the gib.) > > Currently, we are trying to process any I/O interrupts, even if we'd > get them delivered via the gisa, when we're out of the SIE anyway. > IIRC, while this looks a bit like a belt-and-suspenders approach, it > also prevented performance problems when the vcpu did not go back into > the SIE immediately (it even may exit to userspace). > > Also, if you're ignoring the I/O interrupts pending in the ipm, you may > end up delivering interrupts with a lower priority (higher isc) first. > I'm not sure that's what we want. > > But maybe I'm just missing another bit of the code that makes this > safe. Can you elaborate a bit? > For interrupt priorities to work at least somewhat predictable, we should always try to inject all interrupts even if the HW would be doing it for us. In the order of priority. But we don't have the same thing for external calls injected via SCA. I remember that I once had a patch lying around a couple of years ago to fix that ... it went missing :) IO interrupt almost have lowest priority, so letting the HW inject them could be problematic when mixing IO interrupt priorities between SW and HW injected ones (hat Conny described). There are other corner cases if a e.g. a RESTART interrupt is pending for that CPU. We would deliver eventually the RESTART interrupt before delivering the IO interrupt, which would be wrong.
On 20/11/2018 12:33, Cornelia Huck wrote: > On Mon, 19 Nov 2018 18:25:25 +0100 > Michael Mueller <mimu@linux.ibm.com> wrote: > >> Do not call __deliver_io() for adapter interruptions already >> pending in the IPM. That is a double effort. They will >> be processed as soon the vcpu control is given to SIE. >> >> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> >> --- >> arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++------------------------- >> 1 file changed, 25 insertions(+), 29 deletions(-) > > I think this patch does what it says on the tin, but I'm a bit lost as > to the why. (Might make more sense with the gib.) > > Currently, we are trying to process any I/O interrupts, even if we'd > get them delivered via the gisa, when we're out of the SIE anyway. > IIRC, while this looks a bit like a belt-and-suspenders approach, it > also prevented performance problems when the vcpu did not go back into > the SIE immediately (it even may exit to userspace). > > Also, if you're ignoring the I/O interrupts pending in the ipm, you may > end up delivering interrupts with a lower priority (higher isc) first. > I'm not sure that's what we want. > > But maybe I'm just missing another bit of the code that makes this > safe. Can you elaborate a bit? > I do not think we should worry. In the architecture all interrupts are asynchronous to any activity of the CPU. The priority of the interrupt is controlled intern by each sub-channel and adapter and then the by each CPU among sub-channel and adapter requests. While the first system is completely hardware dependent the second is collisioning with software IRQ we may dispatch out of KVM/QEMU. The assignment of these priority is model dependent and must guarantee that no interrupt is delayed so much that it could cause recovery actions to be initiated. In our case, we can take for sure that the priority seen by the vCPU, that is dispatched by the software by touching the SIE page or by the GISA mechanism, are compatible with the architecture. If we agree on this the only problem may arise from the first level of interruption also inside the subchannel priority mechanism and from the delay induced by GISA when delivering these interrupts to the vCPU. This delay occurs on an asynchronous interrupt, so yes, there will be a delay. Is it larger or smaller than the delay introduced by the software? Why should we worry if it is, the interrupt is asynchronous, should we worry, for example, if the AP card take longer to send the interrupt? I don't think so. Regards, Pierre
On 11/20/2018 12:33 PM, Cornelia Huck wrote: > On Mon, 19 Nov 2018 18:25:25 +0100 > Michael Mueller <mimu@linux.ibm.com> wrote: > >> Do not call __deliver_io() for adapter interruptions already >> pending in the IPM. That is a double effort. They will >> be processed as soon the vcpu control is given to SIE. >> >> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> >> --- >> arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++------------------------- >> 1 file changed, 25 insertions(+), 29 deletions(-) > > I think this patch does what it says on the tin, but I'm a bit lost as > to the why. (Might make more sense with the gib.) > > Currently, we are trying to process any I/O interrupts, even if we'd > get them delivered via the gisa, when we're out of the SIE anyway. > IIRC, while this looks a bit like a belt-and-suspenders approach, it > also prevented performance problems when the vcpu did not go back into > the SIE immediately (it even may exit to userspace In fact, doing the inject when in SIE is likely better performance-wise. Right now we "inject" the floating interrupt and then we handle the requests. That can actually mean that it could take a while until the interrupt is actually noticed by the guest (when in SIE). If you now have a 2nd CPU enabled this interrupt could have been delivered to the guest much earlier but it is "stuck" in the local CPU. > > Also, if you're ignoring the I/O interrupts pending in the ipm, you may > end up delivering interrupts with a lower priority (higher isc) first. > I'm not sure that's what we want. FWIW, LPAR has the same relaxation regarding priorities of subclasses. > But maybe I'm just missing another bit of the code that makes this > safe. Can you elaborate a bit?
On 20.11.18 12:33, Cornelia Huck wrote: > On Mon, 19 Nov 2018 18:25:25 +0100 > Michael Mueller <mimu@linux.ibm.com> wrote: > >> Do not call __deliver_io() for adapter interruptions already >> pending in the IPM. That is a double effort. They will >> be processed as soon the vcpu control is given to SIE. >> >> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> >> --- >> arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++------------------------- >> 1 file changed, 25 insertions(+), 29 deletions(-) > > I think this patch does what it says on the tin, but I'm a bit lost as > to the why. (Might make more sense with the gib.) > > Currently, we are trying to process any I/O interrupts, even if we'd > get them delivered via the gisa, when we're out of the SIE anyway. > IIRC, while this looks a bit like a belt-and-suspenders approach, it > also prevented performance problems when the vcpu did not go back into > the SIE immediately (it even may exit to userspace). > > Also, if you're ignoring the I/O interrupts pending in the ipm, you may > end up delivering interrupts with a lower priority (higher isc) first. > I'm not sure that's what we want. > > But maybe I'm just missing another bit of the code that makes this > safe. Can you elaborate a bit? Function kvm_s390_deliver_pending_interrupts() is called in the beginning of vcpu_pre_run() and we are about to enter the SIE anyway. SIE will also grant the right ISC priority for adapter interruptions. I basically want to avoid that a GISA that is part of the alert list will get its IPM bits cleared outside the GIB alert interruption context. process_gib_alert_list() >
On Tue, 20 Nov 2018 22:03:59 +0100 Christian Borntraeger <borntraeger@de.ibm.com> wrote: > On 11/20/2018 12:33 PM, Cornelia Huck wrote: > > On Mon, 19 Nov 2018 18:25:25 +0100 > > Michael Mueller <mimu@linux.ibm.com> wrote: > > > >> Do not call __deliver_io() for adapter interruptions already > >> pending in the IPM. That is a double effort. They will > >> be processed as soon the vcpu control is given to SIE. > >> > >> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> > >> --- > >> arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++------------------------- > >> 1 file changed, 25 insertions(+), 29 deletions(-) > > > > I think this patch does what it says on the tin, but I'm a bit lost as > > to the why. (Might make more sense with the gib.) > > > > Currently, we are trying to process any I/O interrupts, even if we'd > > get them delivered via the gisa, when we're out of the SIE anyway. > > IIRC, while this looks a bit like a belt-and-suspenders approach, it > > also prevented performance problems when the vcpu did not go back into > > the SIE immediately (it even may exit to userspace > > In fact, doing the inject when in SIE is likely better performance-wise. > Right now we "inject" the floating interrupt and then we handle > the requests. That can actually mean that it could take a while > until the interrupt is actually noticed by the guest (when > in SIE). If you now have a 2nd CPU enabled this interrupt could > have been delivered to the guest much earlier but it is "stuck" in > the local CPU. Hm, yes. Do we see any different effects if we have a guest with only one cpu (or only one cpu enabled for I/O interrupts?) Or does all of this even out in practice? > > Also, if you're ignoring the I/O interrupts pending in the ipm, you may > > end up delivering interrupts with a lower priority (higher isc) first. > > I'm not sure that's what we want. > > FWIW, LPAR has the same relaxation regarding priorities of subclasses. Interesting to know, thanks. What about restart etc. interrupts, as David has noted?
On Wed, 21 Nov 2018 09:30:34 +0100 Michael Mueller <mimu@linux.ibm.com> wrote: > On 20.11.18 12:33, Cornelia Huck wrote: > > On Mon, 19 Nov 2018 18:25:25 +0100 > > Michael Mueller <mimu@linux.ibm.com> wrote: > > > >> Do not call __deliver_io() for adapter interruptions already > >> pending in the IPM. That is a double effort. They will > >> be processed as soon the vcpu control is given to SIE. > >> > >> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> > >> --- > >> arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++------------------------- > >> 1 file changed, 25 insertions(+), 29 deletions(-) > > > > I think this patch does what it says on the tin, but I'm a bit lost as > > to the why. (Might make more sense with the gib.) > > > > Currently, we are trying to process any I/O interrupts, even if we'd > > get them delivered via the gisa, when we're out of the SIE anyway. > > IIRC, while this looks a bit like a belt-and-suspenders approach, it > > also prevented performance problems when the vcpu did not go back into > > the SIE immediately (it even may exit to userspace). > > > > Also, if you're ignoring the I/O interrupts pending in the ipm, you may > > end up delivering interrupts with a lower priority (higher isc) first. > > I'm not sure that's what we want. > > > > But maybe I'm just missing another bit of the code that makes this > > safe. Can you elaborate a bit? > > > Function kvm_s390_deliver_pending_interrupts() is called in the > beginning of vcpu_pre_run() and we are about to enter the SIE anyway. > SIE will also grant the right ISC priority for adapter interruptions. > > I basically want to avoid that a GISA that is part of the alert list > will get its IPM bits cleared outside the GIB alert interruption > context. process_gib_alert_list() Unfinished sentence? Anyway, I think the patch description would benefit from adding a sentence or two describing possible performance benefits and ease of alert handling when the gib is introduced (?)
On 21.11.18 13:19, Cornelia Huck wrote: > On Wed, 21 Nov 2018 09:30:34 +0100 > Michael Mueller <mimu@linux.ibm.com> wrote: > >> On 20.11.18 12:33, Cornelia Huck wrote: >>> On Mon, 19 Nov 2018 18:25:25 +0100 >>> Michael Mueller <mimu@linux.ibm.com> wrote: >>> >>>> Do not call __deliver_io() for adapter interruptions already >>>> pending in the IPM. That is a double effort. They will >>>> be processed as soon the vcpu control is given to SIE. >>>> >>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> >>>> --- >>>> arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++------------------------- >>>> 1 file changed, 25 insertions(+), 29 deletions(-) >>> >>> I think this patch does what it says on the tin, but I'm a bit lost as >>> to the why. (Might make more sense with the gib.) >>> >>> Currently, we are trying to process any I/O interrupts, even if we'd >>> get them delivered via the gisa, when we're out of the SIE anyway. >>> IIRC, while this looks a bit like a belt-and-suspenders approach, it >>> also prevented performance problems when the vcpu did not go back into >>> the SIE immediately (it even may exit to userspace). >>> >>> Also, if you're ignoring the I/O interrupts pending in the ipm, you may >>> end up delivering interrupts with a lower priority (higher isc) first. >>> I'm not sure that's what we want. >>> >>> But maybe I'm just missing another bit of the code that makes this >>> safe. Can you elaborate a bit? >> >> >> Function kvm_s390_deliver_pending_interrupts() is called in the >> beginning of vcpu_pre_run() and we are about to enter the SIE anyway. >> SIE will also grant the right ISC priority for adapter interruptions. >> >> I basically want to avoid that a GISA that is part of the alert list >> will get its IPM bits cleared outside the GIB alert interruption >> context. process_gib_alert_list() > > Unfinished sentence? > > Anyway, I think the patch description would benefit from adding a > sentence or two describing possible performance benefits and ease of > alert handling when the gib is introduced (?) The patch is independent from the GIB related code. It shall be active with the introduction of the GISA already. Yes, the GIB code will benefit from this change as well but I don't like the forward reference to upcoming patches if not really required. I will update the commit message to: Do not call __deliver_io() for adapter interruptions already pending in the IPM. That is a double effort. They will be processed as soon the vcpu control is given to SIE or by another SIE instance of the guest currently running. >
On Tue, 20 Nov 2018 13:00:03 +0100 David Hildenbrand <david@redhat.com> wrote: > On 20.11.18 12:33, Cornelia Huck wrote: > > On Mon, 19 Nov 2018 18:25:25 +0100 > > Michael Mueller <mimu@linux.ibm.com> wrote: > > > >> Do not call __deliver_io() for adapter interruptions already > >> pending in the IPM. That is a double effort. They will > >> be processed as soon the vcpu control is given to SIE. > >> > >> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> > >> --- > >> arch/s390/kvm/interrupt.c | 54 > >> ++++++++++++++++++++++------------------------- 1 file changed, 25 > >> insertions(+), 29 deletions(-) > > > > I think this patch does what it says on the tin, but I'm a bit lost > > as to the why. (Might make more sense with the gib.) > > > > Currently, we are trying to process any I/O interrupts, even if we'd > > get them delivered via the gisa, when we're out of the SIE anyway. > > IIRC, while this looks a bit like a belt-and-suspenders approach, it > > also prevented performance problems when the vcpu did not go back > > into the SIE immediately (it even may exit to userspace). > > > > Also, if you're ignoring the I/O interrupts pending in the ipm, you > > may end up delivering interrupts with a lower priority (higher isc) > > first. I'm not sure that's what we want. > > > > But maybe I'm just missing another bit of the code that makes this > > safe. Can you elaborate a bit? > > > > For interrupt priorities to work at least somewhat predictable, we > should always try to inject all interrupts even if the HW would be > doing it for us. In the order of priority. > > But we don't have the same thing for external calls injected via SCA. > I remember that I once had a patch lying around a couple of years ago > to fix that ... it went missing :) > > IO interrupt almost have lowest priority, so letting the HW inject > them could be problematic when mixing IO interrupt priorities between > SW and HW injected ones (hat Conny described). > > There are other corner cases if a e.g. a RESTART interrupt is pending > for that CPU. We would deliver eventually the RESTART interrupt before > delivering the IO interrupt, which would be wrong. > I do share David's concern. Could somebody try to explain why this RESTART scenario David described is not actually a problem -- AFAIU it is a problem. Regards, Halil
On 21.11.18 22:05, Halil Pasic wrote: > On Tue, 20 Nov 2018 13:00:03 +0100 > David Hildenbrand <david@redhat.com> wrote: > >> On 20.11.18 12:33, Cornelia Huck wrote: >>> On Mon, 19 Nov 2018 18:25:25 +0100 >>> Michael Mueller <mimu@linux.ibm.com> wrote: >>> >>>> Do not call __deliver_io() for adapter interruptions already >>>> pending in the IPM. That is a double effort. They will >>>> be processed as soon the vcpu control is given to SIE. >>>> >>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> >>>> --- >>>> arch/s390/kvm/interrupt.c | 54 >>>> ++++++++++++++++++++++------------------------- 1 file changed, 25 >>>> insertions(+), 29 deletions(-) >>> I think this patch does what it says on the tin, but I'm a bit lost >>> as to the why. (Might make more sense with the gib.) >>> >>> Currently, we are trying to process any I/O interrupts, even if we'd >>> get them delivered via the gisa, when we're out of the SIE anyway. >>> IIRC, while this looks a bit like a belt-and-suspenders approach, it >>> also prevented performance problems when the vcpu did not go back >>> into the SIE immediately (it even may exit to userspace). >>> >>> Also, if you're ignoring the I/O interrupts pending in the ipm, you >>> may end up delivering interrupts with a lower priority (higher isc) >>> first. I'm not sure that's what we want. >>> >>> But maybe I'm just missing another bit of the code that makes this >>> safe. Can you elaborate a bit? >>> >> For interrupt priorities to work at least somewhat predictable, we >> should always try to inject all interrupts even if the HW would be >> doing it for us. In the order of priority. >> >> But we don't have the same thing for external calls injected via SCA. >> I remember that I once had a patch lying around a couple of years ago >> to fix that ... it went missing :) >> >> IO interrupt almost have lowest priority, so letting the HW inject >> them could be problematic when mixing IO interrupt priorities between >> SW and HW injected ones (hat Conny described). >> >> There are other corner cases if a e.g. a RESTART interrupt is pending >> for that CPU. We would deliver eventually the RESTART interrupt before >> delivering the IO interrupt, which would be wrong. >> > I do share David's concern. Could somebody try to explain why this > RESTART scenario David described is not actually a problem -- AFAIU it > is a problem. > > Regards, > Halil > Before I start arguing why this is *not* a problem I ask you both why you consider this being a problem. We are talking about the CPU restart here, right?
On 22.11.18 12:21, Michael Mueller wrote: > > > On 21.11.18 22:05, Halil Pasic wrote: >> On Tue, 20 Nov 2018 13:00:03 +0100 >> David Hildenbrand <david@redhat.com> wrote: >> >>> On 20.11.18 12:33, Cornelia Huck wrote: >>>> On Mon, 19 Nov 2018 18:25:25 +0100 >>>> Michael Mueller <mimu@linux.ibm.com> wrote: >>>> >>>>> Do not call __deliver_io() for adapter interruptions already >>>>> pending in the IPM. That is a double effort. They will >>>>> be processed as soon the vcpu control is given to SIE. >>>>> >>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> >>>>> --- >>>>> arch/s390/kvm/interrupt.c | 54 >>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25 >>>>> insertions(+), 29 deletions(-) >>>> I think this patch does what it says on the tin, but I'm a bit lost >>>> as to the why. (Might make more sense with the gib.) >>>> >>>> Currently, we are trying to process any I/O interrupts, even if we'd >>>> get them delivered via the gisa, when we're out of the SIE anyway. >>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it >>>> also prevented performance problems when the vcpu did not go back >>>> into the SIE immediately (it even may exit to userspace). >>>> >>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you >>>> may end up delivering interrupts with a lower priority (higher isc) >>>> first. I'm not sure that's what we want. >>>> >>>> But maybe I'm just missing another bit of the code that makes this >>>> safe. Can you elaborate a bit? >>>> >>> For interrupt priorities to work at least somewhat predictable, we >>> should always try to inject all interrupts even if the HW would be >>> doing it for us. In the order of priority. >>> >>> But we don't have the same thing for external calls injected via SCA. >>> I remember that I once had a patch lying around a couple of years ago >>> to fix that ... it went missing :) >>> >>> IO interrupt almost have lowest priority, so letting the HW inject >>> them could be problematic when mixing IO interrupt priorities between >>> SW and HW injected ones (hat Conny described). >>> >>> There are other corner cases if a e.g. a RESTART interrupt is pending >>> for that CPU. We would deliver eventually the RESTART interrupt before >>> delivering the IO interrupt, which would be wrong. >>> >> I do share David's concern. Could somebody try to explain why this >> RESTART scenario David described is not actually a problem -- AFAIU it >> is a problem. >> >> Regards, >> Halil >> > Before I start arguing why this is *not* a problem I ask you both why > you consider > this being a problem. We are talking about the CPU restart here, right? > > When sending a restart interrupt to a running CPU (e.g. system_reset) an IO interrupt might remain pending and not delivered. One could make a guess how bad that is (depending on the type of guest and use case), however it is guest observable difference to what is documented in the PoP. Restart interrupt has (almost) lowest priority.
On 22.11.18 17:38, David Hildenbrand wrote: > On 22.11.18 12:21, Michael Mueller wrote: >> >> On 21.11.18 22:05, Halil Pasic wrote: >>> On Tue, 20 Nov 2018 13:00:03 +0100 >>> David Hildenbrand <david@redhat.com> wrote: >>> >>>> On 20.11.18 12:33, Cornelia Huck wrote: >>>>> On Mon, 19 Nov 2018 18:25:25 +0100 >>>>> Michael Mueller <mimu@linux.ibm.com> wrote: >>>>> >>>>>> Do not call __deliver_io() for adapter interruptions already >>>>>> pending in the IPM. That is a double effort. They will >>>>>> be processed as soon the vcpu control is given to SIE. >>>>>> >>>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> >>>>>> --- >>>>>> arch/s390/kvm/interrupt.c | 54 >>>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25 >>>>>> insertions(+), 29 deletions(-) >>>>> I think this patch does what it says on the tin, but I'm a bit lost >>>>> as to the why. (Might make more sense with the gib.) >>>>> >>>>> Currently, we are trying to process any I/O interrupts, even if we'd >>>>> get them delivered via the gisa, when we're out of the SIE anyway. >>>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it >>>>> also prevented performance problems when the vcpu did not go back >>>>> into the SIE immediately (it even may exit to userspace). >>>>> >>>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you >>>>> may end up delivering interrupts with a lower priority (higher isc) >>>>> first. I'm not sure that's what we want. >>>>> >>>>> But maybe I'm just missing another bit of the code that makes this >>>>> safe. Can you elaborate a bit? >>>>> >>>> For interrupt priorities to work at least somewhat predictable, we >>>> should always try to inject all interrupts even if the HW would be >>>> doing it for us. In the order of priority. >>>> >>>> But we don't have the same thing for external calls injected via SCA. >>>> I remember that I once had a patch lying around a couple of years ago >>>> to fix that ... it went missing :) >>>> >>>> IO interrupt almost have lowest priority, so letting the HW inject >>>> them could be problematic when mixing IO interrupt priorities between >>>> SW and HW injected ones (hat Conny described). >>>> >>>> There are other corner cases if a e.g. a RESTART interrupt is pending >>>> for that CPU. We would deliver eventually the RESTART interrupt before >>>> delivering the IO interrupt, which would be wrong. >>>> >>> I do share David's concern. Could somebody try to explain why this >>> RESTART scenario David described is not actually a problem -- AFAIU it >>> is a problem. >>> >>> Regards, >>> Halil >>> >> Before I start arguing why this is *not* a problem I ask you both why >> you consider >> this being a problem. We are talking about the CPU restart here, right? >> >> > When sending a restart interrupt to a running CPU (e.g. system_reset) an > IO interrupt might remain pending and not delivered. Thanks for your answer David. The whole GISA is cleared as well in this case. So no indication of any unprocessed adapter interruption survives the system reset. > > One could make a guess how bad that is (depending on the type of guest > and use case), however it is guest observable difference to what is > documented in the PoP. Restart interrupt has (almost) lowest priority. > Independent of this patch I made a minimal change in the process_gib_alert_list() routine that will recognizes if the the interruption has already been delivered by kvm_s390_deliver_pending_interrupts() and in that case not dispatch an idle vcpu for that guest. That means I will pull back this patch from the series with the next release.
On 22/11/2018 17:38, David Hildenbrand wrote: > On 22.11.18 12:21, Michael Mueller wrote: >> >> >> On 21.11.18 22:05, Halil Pasic wrote: >>> On Tue, 20 Nov 2018 13:00:03 +0100 >>> David Hildenbrand <david@redhat.com> wrote: >>> >>>> On 20.11.18 12:33, Cornelia Huck wrote: >>>>> On Mon, 19 Nov 2018 18:25:25 +0100 >>>>> Michael Mueller <mimu@linux.ibm.com> wrote: >>>>> >>>>>> Do not call __deliver_io() for adapter interruptions already >>>>>> pending in the IPM. That is a double effort. They will >>>>>> be processed as soon the vcpu control is given to SIE. >>>>>> >>>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> >>>>>> --- >>>>>> arch/s390/kvm/interrupt.c | 54 >>>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25 >>>>>> insertions(+), 29 deletions(-) >>>>> I think this patch does what it says on the tin, but I'm a bit lost >>>>> as to the why. (Might make more sense with the gib.) >>>>> >>>>> Currently, we are trying to process any I/O interrupts, even if we'd >>>>> get them delivered via the gisa, when we're out of the SIE anyway. >>>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it >>>>> also prevented performance problems when the vcpu did not go back >>>>> into the SIE immediately (it even may exit to userspace). >>>>> >>>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you >>>>> may end up delivering interrupts with a lower priority (higher isc) >>>>> first. I'm not sure that's what we want. >>>>> >>>>> But maybe I'm just missing another bit of the code that makes this >>>>> safe. Can you elaborate a bit? >>>>> >>>> For interrupt priorities to work at least somewhat predictable, we >>>> should always try to inject all interrupts even if the HW would be >>>> doing it for us. In the order of priority. >>>> >>>> But we don't have the same thing for external calls injected via SCA. >>>> I remember that I once had a patch lying around a couple of years ago >>>> to fix that ... it went missing :) >>>> >>>> IO interrupt almost have lowest priority, so letting the HW inject >>>> them could be problematic when mixing IO interrupt priorities between >>>> SW and HW injected ones (hat Conny described). >>>> >>>> There are other corner cases if a e.g. a RESTART interrupt is pending >>>> for that CPU. We would deliver eventually the RESTART interrupt before >>>> delivering the IO interrupt, which would be wrong. >>>> >>> I do share David's concern. Could somebody try to explain why this >>> RESTART scenario David described is not actually a problem -- AFAIU it >>> is a problem. >>> >>> Regards, >>> Halil >>> >> Before I start arguing why this is *not* a problem I ask you both why >> you consider >> this being a problem. We are talking about the CPU restart here, right? >> >> > > When sending a restart interrupt to a running CPU (e.g. system_reset) an > IO interrupt might remain pending and not delivered. > > One could make a guess how bad that is (depending on the type of guest > and use case), however it is guest observable difference to what is > documented in the PoP. Restart interrupt has (almost) lowest priority. > I do not see the priority as a problem. RESET and IO IRQ are asynchronous from each other anyway. But there I see another issue that your question point. Generally, on system RESET all peripheral also get a RESET and all interrupt are cleared. This is the case for AP VFIO devices. This is the case for local interrupts. I miss the GISA reset code to avoid that the adapter IO IRQ is cleared on RESET. May be I did not look at the right place. Regards, Pierre
On 22.11.18 18:33, Pierre Morel wrote: > On 22/11/2018 17:38, David Hildenbrand wrote: >> On 22.11.18 12:21, Michael Mueller wrote: >>> >>> >>> On 21.11.18 22:05, Halil Pasic wrote: >>>> On Tue, 20 Nov 2018 13:00:03 +0100 >>>> David Hildenbrand <david@redhat.com> wrote: >>>> >>>>> On 20.11.18 12:33, Cornelia Huck wrote: >>>>>> On Mon, 19 Nov 2018 18:25:25 +0100 >>>>>> Michael Mueller <mimu@linux.ibm.com> wrote: >>>>>> >>>>>>> Do not call __deliver_io() for adapter interruptions already >>>>>>> pending in the IPM. That is a double effort. They will >>>>>>> be processed as soon the vcpu control is given to SIE. >>>>>>> >>>>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> >>>>>>> --- >>>>>>> arch/s390/kvm/interrupt.c | 54 >>>>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25 >>>>>>> insertions(+), 29 deletions(-) >>>>>> I think this patch does what it says on the tin, but I'm a bit lost >>>>>> as to the why. (Might make more sense with the gib.) >>>>>> >>>>>> Currently, we are trying to process any I/O interrupts, even if we'd >>>>>> get them delivered via the gisa, when we're out of the SIE anyway. >>>>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it >>>>>> also prevented performance problems when the vcpu did not go back >>>>>> into the SIE immediately (it even may exit to userspace). >>>>>> >>>>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you >>>>>> may end up delivering interrupts with a lower priority (higher isc) >>>>>> first. I'm not sure that's what we want. >>>>>> >>>>>> But maybe I'm just missing another bit of the code that makes this >>>>>> safe. Can you elaborate a bit? >>>>>> >>>>> For interrupt priorities to work at least somewhat predictable, we >>>>> should always try to inject all interrupts even if the HW would be >>>>> doing it for us. In the order of priority. >>>>> >>>>> But we don't have the same thing for external calls injected via SCA. >>>>> I remember that I once had a patch lying around a couple of years ago >>>>> to fix that ... it went missing :) >>>>> >>>>> IO interrupt almost have lowest priority, so letting the HW inject >>>>> them could be problematic when mixing IO interrupt priorities between >>>>> SW and HW injected ones (hat Conny described). >>>>> >>>>> There are other corner cases if a e.g. a RESTART interrupt is pending >>>>> for that CPU. We would deliver eventually the RESTART interrupt before >>>>> delivering the IO interrupt, which would be wrong. >>>>> >>>> I do share David's concern. Could somebody try to explain why this >>>> RESTART scenario David described is not actually a problem -- AFAIU it >>>> is a problem. >>>> >>>> Regards, >>>> Halil >>>> >>> Before I start arguing why this is *not* a problem I ask you both why >>> you consider >>> this being a problem. We are talking about the CPU restart here, right? >>> >>> >> >> When sending a restart interrupt to a running CPU (e.g. system_reset) an >> IO interrupt might remain pending and not delivered. >> >> One could make a guess how bad that is (depending on the type of guest >> and use case), however it is guest observable difference to what is >> documented in the PoP. Restart interrupt has (almost) lowest priority. >> > > I do not see the priority as a problem. > RESET and IO IRQ are asynchronous from each other anyway. > > But there I see another issue that your question point. > > Generally, on system RESET all peripheral also get a RESET and all > interrupt are cleared. > This is the case for AP VFIO devices. > This is the case for local interrupts. > > I miss the GISA reset code to avoid that the adapter IO IRQ is cleared > on RESET. > > May be I did not look at the right place. See kvm_s390_clear_float_irqs() and how it as called. > > Regards, > Pierre > >
On 23.11.18 09:37, Michael Mueller wrote: > > > On 22.11.18 18:33, Pierre Morel wrote: >> On 22/11/2018 17:38, David Hildenbrand wrote: >>> On 22.11.18 12:21, Michael Mueller wrote: >>>> >>>> >>>> On 21.11.18 22:05, Halil Pasic wrote: >>>>> On Tue, 20 Nov 2018 13:00:03 +0100 >>>>> David Hildenbrand <david@redhat.com> wrote: >>>>> >>>>>> On 20.11.18 12:33, Cornelia Huck wrote: >>>>>>> On Mon, 19 Nov 2018 18:25:25 +0100 >>>>>>> Michael Mueller <mimu@linux.ibm.com> wrote: >>>>>>> >>>>>>>> Do not call __deliver_io() for adapter interruptions already >>>>>>>> pending in the IPM. That is a double effort. They will >>>>>>>> be processed as soon the vcpu control is given to SIE. >>>>>>>> >>>>>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> >>>>>>>> --- >>>>>>>> arch/s390/kvm/interrupt.c | 54 >>>>>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25 >>>>>>>> insertions(+), 29 deletions(-) >>>>>>> I think this patch does what it says on the tin, but I'm a bit lost >>>>>>> as to the why. (Might make more sense with the gib.) >>>>>>> >>>>>>> Currently, we are trying to process any I/O interrupts, even if we'd >>>>>>> get them delivered via the gisa, when we're out of the SIE anyway. >>>>>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it >>>>>>> also prevented performance problems when the vcpu did not go back >>>>>>> into the SIE immediately (it even may exit to userspace). >>>>>>> >>>>>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you >>>>>>> may end up delivering interrupts with a lower priority (higher isc) >>>>>>> first. I'm not sure that's what we want. >>>>>>> >>>>>>> But maybe I'm just missing another bit of the code that makes this >>>>>>> safe. Can you elaborate a bit? >>>>>>> >>>>>> For interrupt priorities to work at least somewhat predictable, we >>>>>> should always try to inject all interrupts even if the HW would be >>>>>> doing it for us. In the order of priority. >>>>>> >>>>>> But we don't have the same thing for external calls injected via SCA. >>>>>> I remember that I once had a patch lying around a couple of years ago >>>>>> to fix that ... it went missing :) >>>>>> >>>>>> IO interrupt almost have lowest priority, so letting the HW inject >>>>>> them could be problematic when mixing IO interrupt priorities between >>>>>> SW and HW injected ones (hat Conny described). >>>>>> >>>>>> There are other corner cases if a e.g. a RESTART interrupt is pending >>>>>> for that CPU. We would deliver eventually the RESTART interrupt before >>>>>> delivering the IO interrupt, which would be wrong. >>>>>> >>>>> I do share David's concern. Could somebody try to explain why this >>>>> RESTART scenario David described is not actually a problem -- AFAIU it >>>>> is a problem. >>>>> >>>>> Regards, >>>>> Halil >>>>> >>>> Before I start arguing why this is *not* a problem I ask you both why >>>> you consider >>>> this being a problem. We are talking about the CPU restart here, right? >>>> >>>> >>> >>> When sending a restart interrupt to a running CPU (e.g. system_reset) an >>> IO interrupt might remain pending and not delivered. >>> >>> One could make a guess how bad that is (depending on the type of guest >>> and use case), however it is guest observable difference to what is >>> documented in the PoP. Restart interrupt has (almost) lowest priority. >>> >> >> I do not see the priority as a problem. >> RESET and IO IRQ are asynchronous from each other anyway. >> >> But there I see another issue that your question point. >> >> Generally, on system RESET all peripheral also get a RESET and all >> interrupt are cleared. >> This is the case for AP VFIO devices. >> This is the case for local interrupts. Please don't confuse SIGP RESTART (e.g. RESTART interrupt any CPU can happily send to another CPU) with system resets. For ordinary SIGP RESTARTs, no floating interrupts are cleared. However, the question is if the asynchronous nature of IO interrupts indeed give no guarantees in respect to priorities against RESTART interrupts (IOW, will it be observerable). I'll have to think about this a little bit longer :) >> >> I miss the GISA reset code to avoid that the adapter IO IRQ is cleared >> on RESET. >> >> May be I did not look at the right place. > > See kvm_s390_clear_float_irqs() and how it as called. > >> >> Regards, >> Pierre >> >> >
On 23/11/2018 09:50, David Hildenbrand wrote: > On 23.11.18 09:37, Michael Mueller wrote: >> >> >> On 22.11.18 18:33, Pierre Morel wrote: >>> On 22/11/2018 17:38, David Hildenbrand wrote: >>>> On 22.11.18 12:21, Michael Mueller wrote: >>>>> >>>>> >>>>> On 21.11.18 22:05, Halil Pasic wrote: >>>>>> On Tue, 20 Nov 2018 13:00:03 +0100 >>>>>> David Hildenbrand <david@redhat.com> wrote: >>>>>> >>>>>>> On 20.11.18 12:33, Cornelia Huck wrote: >>>>>>>> On Mon, 19 Nov 2018 18:25:25 +0100 >>>>>>>> Michael Mueller <mimu@linux.ibm.com> wrote: >>>>>>>> >>>>>>>>> Do not call __deliver_io() for adapter interruptions already >>>>>>>>> pending in the IPM. That is a double effort. They will >>>>>>>>> be processed as soon the vcpu control is given to SIE. >>>>>>>>> >>>>>>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> >>>>>>>>> --- >>>>>>>>> arch/s390/kvm/interrupt.c | 54 >>>>>>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25 >>>>>>>>> insertions(+), 29 deletions(-) >>>>>>>> I think this patch does what it says on the tin, but I'm a bit lost >>>>>>>> as to the why. (Might make more sense with the gib.) >>>>>>>> >>>>>>>> Currently, we are trying to process any I/O interrupts, even if we'd >>>>>>>> get them delivered via the gisa, when we're out of the SIE anyway. >>>>>>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it >>>>>>>> also prevented performance problems when the vcpu did not go back >>>>>>>> into the SIE immediately (it even may exit to userspace). >>>>>>>> >>>>>>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you >>>>>>>> may end up delivering interrupts with a lower priority (higher isc) >>>>>>>> first. I'm not sure that's what we want. >>>>>>>> >>>>>>>> But maybe I'm just missing another bit of the code that makes this >>>>>>>> safe. Can you elaborate a bit? >>>>>>>> >>>>>>> For interrupt priorities to work at least somewhat predictable, we >>>>>>> should always try to inject all interrupts even if the HW would be >>>>>>> doing it for us. In the order of priority. >>>>>>> >>>>>>> But we don't have the same thing for external calls injected via SCA. >>>>>>> I remember that I once had a patch lying around a couple of years ago >>>>>>> to fix that ... it went missing :) >>>>>>> >>>>>>> IO interrupt almost have lowest priority, so letting the HW inject >>>>>>> them could be problematic when mixing IO interrupt priorities between >>>>>>> SW and HW injected ones (hat Conny described). >>>>>>> >>>>>>> There are other corner cases if a e.g. a RESTART interrupt is pending >>>>>>> for that CPU. We would deliver eventually the RESTART interrupt before >>>>>>> delivering the IO interrupt, which would be wrong. >>>>>>> >>>>>> I do share David's concern. Could somebody try to explain why this >>>>>> RESTART scenario David described is not actually a problem -- AFAIU it >>>>>> is a problem. >>>>>> >>>>>> Regards, >>>>>> Halil >>>>>> >>>>> Before I start arguing why this is *not* a problem I ask you both why >>>>> you consider >>>>> this being a problem. We are talking about the CPU restart here, right? >>>>> >>>>> >>>> >>>> When sending a restart interrupt to a running CPU (e.g. system_reset) an >>>> IO interrupt might remain pending and not delivered. >>>> >>>> One could make a guess how bad that is (depending on the type of guest >>>> and use case), however it is guest observable difference to what is >>>> documented in the PoP. Restart interrupt has (almost) lowest priority. >>>> >>> >>> I do not see the priority as a problem. >>> RESET and IO IRQ are asynchronous from each other anyway. >>> >>> But there I see another issue that your question point. >>> >>> Generally, on system RESET all peripheral also get a RESET and all >>> interrupt are cleared. >>> This is the case for AP VFIO devices. >>> This is the case for local interrupts. > > Please don't confuse SIGP RESTART (e.g. RESTART interrupt any CPU can > happily send to another CPU) with system resets. I do not, you were speaking about system_reset. And I answered on this. > For ordinary SIGP > RESTARTs, no floating interrupts are cleared. > > However, the question is if the asynchronous nature of IO interrupts > indeed give no guarantees in respect to priorities against RESTART > interrupts (IOW, will it be observerable). I'll have to think about this > a little bit longer :) > >>> >>> I miss the GISA reset code to avoid that the adapter IO IRQ is cleared >>> on RESET. >>> >>> May be I did not look at the right place. >> >> See kvm_s390_clear_float_irqs() and how it as called. >> >>> >>> Regards, >>> Pierre >>> >>> >> > >
On 23/11/2018 09:37, Michael Mueller wrote: > > > On 22.11.18 18:33, Pierre Morel wrote: >> On 22/11/2018 17:38, David Hildenbrand wrote: >>> On 22.11.18 12:21, Michael Mueller wrote: >>>> >>>> >>>> On 21.11.18 22:05, Halil Pasic wrote: >>>>> On Tue, 20 Nov 2018 13:00:03 +0100 >>>>> David Hildenbrand <david@redhat.com> wrote: >>>>> >>>>>> On 20.11.18 12:33, Cornelia Huck wrote: >>>>>>> On Mon, 19 Nov 2018 18:25:25 +0100 >>>>>>> Michael Mueller <mimu@linux.ibm.com> wrote: >>>>>>> >>>>>>>> Do not call __deliver_io() for adapter interruptions already >>>>>>>> pending in the IPM. That is a double effort. They will >>>>>>>> be processed as soon the vcpu control is given to SIE. >>>>>>>> >>>>>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> >>>>>>>> --- >>>>>>>> arch/s390/kvm/interrupt.c | 54 >>>>>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25 >>>>>>>> insertions(+), 29 deletions(-) >>>>>>> I think this patch does what it says on the tin, but I'm a bit lost >>>>>>> as to the why. (Might make more sense with the gib.) >>>>>>> >>>>>>> Currently, we are trying to process any I/O interrupts, even if we'd >>>>>>> get them delivered via the gisa, when we're out of the SIE anyway. >>>>>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it >>>>>>> also prevented performance problems when the vcpu did not go back >>>>>>> into the SIE immediately (it even may exit to userspace). >>>>>>> >>>>>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you >>>>>>> may end up delivering interrupts with a lower priority (higher isc) >>>>>>> first. I'm not sure that's what we want. >>>>>>> >>>>>>> But maybe I'm just missing another bit of the code that makes this >>>>>>> safe. Can you elaborate a bit? >>>>>>> >>>>>> For interrupt priorities to work at least somewhat predictable, we >>>>>> should always try to inject all interrupts even if the HW would be >>>>>> doing it for us. In the order of priority. >>>>>> >>>>>> But we don't have the same thing for external calls injected via SCA. >>>>>> I remember that I once had a patch lying around a couple of years ago >>>>>> to fix that ... it went missing :) >>>>>> >>>>>> IO interrupt almost have lowest priority, so letting the HW inject >>>>>> them could be problematic when mixing IO interrupt priorities between >>>>>> SW and HW injected ones (hat Conny described). >>>>>> >>>>>> There are other corner cases if a e.g. a RESTART interrupt is pending >>>>>> for that CPU. We would deliver eventually the RESTART interrupt >>>>>> before >>>>>> delivering the IO interrupt, which would be wrong. >>>>>> >>>>> I do share David's concern. Could somebody try to explain why this >>>>> RESTART scenario David described is not actually a problem -- AFAIU it >>>>> is a problem. >>>>> >>>>> Regards, >>>>> Halil >>>>> >>>> Before I start arguing why this is *not* a problem I ask you both why >>>> you consider >>>> this being a problem. We are talking about the CPU restart here, >>>> right? >>>> >>>> >>> >>> When sending a restart interrupt to a running CPU (e.g. system_reset) an >>> IO interrupt might remain pending and not delivered. >>> >>> One could make a guess how bad that is (depending on the type of guest >>> and use case), however it is guest observable difference to what is >>> documented in the PoP. Restart interrupt has (almost) lowest priority. >>> >> >> I do not see the priority as a problem. >> RESET and IO IRQ are asynchronous from each other anyway. >> >> But there I see another issue that your question point. >> >> Generally, on system RESET all peripheral also get a RESET and all >> interrupt are cleared. >> This is the case for AP VFIO devices. >> This is the case for local interrupts. >> >> I miss the GISA reset code to avoid that the adapter IO IRQ is cleared >> on RESET. >> >> May be I did not look at the right place. > > See kvm_s390_clear_float_irqs() and how it as called. right, thanks. > >> >> Regards, >> Pierre >> >> >
On 23.11.18 10:00, Pierre Morel wrote: > On 23/11/2018 09:50, David Hildenbrand wrote: >> On 23.11.18 09:37, Michael Mueller wrote: >>> >>> >>> On 22.11.18 18:33, Pierre Morel wrote: >>>> On 22/11/2018 17:38, David Hildenbrand wrote: >>>>> On 22.11.18 12:21, Michael Mueller wrote: >>>>>> >>>>>> >>>>>> On 21.11.18 22:05, Halil Pasic wrote: >>>>>>> On Tue, 20 Nov 2018 13:00:03 +0100 >>>>>>> David Hildenbrand <david@redhat.com> wrote: >>>>>>> >>>>>>>> On 20.11.18 12:33, Cornelia Huck wrote: >>>>>>>>> On Mon, 19 Nov 2018 18:25:25 +0100 >>>>>>>>> Michael Mueller <mimu@linux.ibm.com> wrote: >>>>>>>>> >>>>>>>>>> Do not call __deliver_io() for adapter interruptions already >>>>>>>>>> pending in the IPM. That is a double effort. They will >>>>>>>>>> be processed as soon the vcpu control is given to SIE. >>>>>>>>>> >>>>>>>>>> Signed-off-by: Michael Mueller <mimu@linux.ibm.com> >>>>>>>>>> --- >>>>>>>>>> arch/s390/kvm/interrupt.c | 54 >>>>>>>>>> ++++++++++++++++++++++------------------------- 1 file changed, 25 >>>>>>>>>> insertions(+), 29 deletions(-) >>>>>>>>> I think this patch does what it says on the tin, but I'm a bit lost >>>>>>>>> as to the why. (Might make more sense with the gib.) >>>>>>>>> >>>>>>>>> Currently, we are trying to process any I/O interrupts, even if we'd >>>>>>>>> get them delivered via the gisa, when we're out of the SIE anyway. >>>>>>>>> IIRC, while this looks a bit like a belt-and-suspenders approach, it >>>>>>>>> also prevented performance problems when the vcpu did not go back >>>>>>>>> into the SIE immediately (it even may exit to userspace). >>>>>>>>> >>>>>>>>> Also, if you're ignoring the I/O interrupts pending in the ipm, you >>>>>>>>> may end up delivering interrupts with a lower priority (higher isc) >>>>>>>>> first. I'm not sure that's what we want. >>>>>>>>> >>>>>>>>> But maybe I'm just missing another bit of the code that makes this >>>>>>>>> safe. Can you elaborate a bit? >>>>>>>>> >>>>>>>> For interrupt priorities to work at least somewhat predictable, we >>>>>>>> should always try to inject all interrupts even if the HW would be >>>>>>>> doing it for us. In the order of priority. >>>>>>>> >>>>>>>> But we don't have the same thing for external calls injected via SCA. >>>>>>>> I remember that I once had a patch lying around a couple of years ago >>>>>>>> to fix that ... it went missing :) >>>>>>>> >>>>>>>> IO interrupt almost have lowest priority, so letting the HW inject >>>>>>>> them could be problematic when mixing IO interrupt priorities between >>>>>>>> SW and HW injected ones (hat Conny described). >>>>>>>> >>>>>>>> There are other corner cases if a e.g. a RESTART interrupt is pending >>>>>>>> for that CPU. We would deliver eventually the RESTART interrupt before >>>>>>>> delivering the IO interrupt, which would be wrong. >>>>>>>> >>>>>>> I do share David's concern. Could somebody try to explain why this >>>>>>> RESTART scenario David described is not actually a problem -- AFAIU it >>>>>>> is a problem. >>>>>>> >>>>>>> Regards, >>>>>>> Halil >>>>>>> >>>>>> Before I start arguing why this is *not* a problem I ask you both why >>>>>> you consider >>>>>> this being a problem. We are talking about the CPU restart here, right? >>>>>> >>>>>> >>>>> >>>>> When sending a restart interrupt to a running CPU (e.g. system_reset) an >>>>> IO interrupt might remain pending and not delivered. >>>>> >>>>> One could make a guess how bad that is (depending on the type of guest >>>>> and use case), however it is guest observable difference to what is >>>>> documented in the PoP. Restart interrupt has (almost) lowest priority. >>>>> >>>> >>>> I do not see the priority as a problem. >>>> RESET and IO IRQ are asynchronous from each other anyway. >>>> >>>> But there I see another issue that your question point. >>>> >>>> Generally, on system RESET all peripheral also get a RESET and all >>>> interrupt are cleared. >>>> This is the case for AP VFIO devices. >>>> This is the case for local interrupts. >> >> Please don't confuse SIGP RESTART (e.g. RESTART interrupt any CPU can >> happily send to another CPU) with system resets. > > I do not, you were speaking about system_reset. > And I answered on this. I think I was confused. "nmi" uses RESTART interrupts from "external". "system_reset" does not use RESTART interrupts on the QEMU side. (so the example I gave was both confusing and wrong :) ) In Linux, restart interrupts are used for multiple purposes (hibernation restore, bringing up CPUs, kdump). Not sure if the priority thingy could be a problem there (I guess not, but I am not sure yet about the implications).
diff --git a/arch/s390/kvm/interrupt.c b/arch/s390/kvm/interrupt.c index fcb55b02990e..1f4c0c7286f7 100644 --- a/arch/s390/kvm/interrupt.c +++ b/arch/s390/kvm/interrupt.c @@ -271,14 +271,9 @@ static unsigned long disable_iscs(struct kvm_vcpu *vcpu, return active_mask; } -static unsigned long deliverable_irqs(struct kvm_vcpu *vcpu) +static unsigned long __deliverable_irqs(struct kvm_vcpu *vcpu, + unsigned long active_mask) { - unsigned long active_mask; - - active_mask = pending_irqs(vcpu); - if (!active_mask) - return 0; - if (psw_extint_disabled(vcpu)) active_mask &= ~IRQ_PEND_EXT_MASK; if (psw_ioint_disabled(vcpu)) @@ -315,6 +310,28 @@ static unsigned long deliverable_irqs(struct kvm_vcpu *vcpu) return active_mask; } +static unsigned long deliverable_irqs_no_gisa(struct kvm_vcpu *vcpu) +{ + unsigned long active_mask; + + active_mask = pending_irqs_no_gisa(vcpu); + if (!active_mask) + return 0; + + return __deliverable_irqs(vcpu, active_mask); +} + +static unsigned long deliverable_irqs(struct kvm_vcpu *vcpu) +{ + unsigned long active_mask; + + active_mask = pending_irqs(vcpu); + if (!active_mask) + return 0; + + return __deliverable_irqs(vcpu, active_mask); +} + static void __set_cpu_idle(struct kvm_vcpu *vcpu) { kvm_s390_set_cpuflags(vcpu, CPUSTAT_WAIT); @@ -957,7 +974,6 @@ static int __must_check __deliver_io(struct kvm_vcpu *vcpu, struct list_head *isc_list; struct kvm_s390_float_interrupt *fi; struct kvm_s390_interrupt_info *inti = NULL; - struct kvm_s390_io_info io; u32 isc; int rc = 0; @@ -995,28 +1011,8 @@ static int __must_check __deliver_io(struct kvm_vcpu *vcpu, if (inti) { rc = __do_deliver_io(vcpu, &(inti->io)); kfree(inti); - goto out; } - if (vcpu->kvm->arch.gisa && - kvm_s390_gisa_tac_ipm_gisc(vcpu->kvm->arch.gisa, isc)) { - /* - * in case an adapter interrupt was not delivered - * in SIE context KVM will handle the delivery - */ - VCPU_EVENT(vcpu, 4, "%s isc %u", "deliver: I/O (AI/gisa)", isc); - memset(&io, 0, sizeof(io)); - io.io_int_word = isc_to_int_word(isc); - vcpu->stat.deliver_io++; - trace_kvm_s390_deliver_interrupt(vcpu->vcpu_id, - KVM_S390_INT_IO(1, 0, 0, 0), - ((__u32)io.subchannel_id << 16) | - io.subchannel_nr, - ((__u64)io.io_int_parm << 32) | - io.io_int_word); - rc = __do_deliver_io(vcpu, &io); - } -out: return rc; } @@ -1205,7 +1201,7 @@ int __must_check kvm_s390_deliver_pending_interrupts(struct kvm_vcpu *vcpu) if (cpu_timer_irq_pending(vcpu)) set_bit(IRQ_PEND_EXT_CPU_TIMER, &li->pending_irqs); - while ((irqs = deliverable_irqs(vcpu)) && !rc) { + while ((irqs = deliverable_irqs_no_gisa(vcpu)) && !rc) { /* bits are in the reverse order of interrupt priority */ irq_type = find_last_bit(&irqs, IRQ_PEND_COUNT); switch (irq_type) {
Do not call __deliver_io() for adapter interruptions already pending in the IPM. That is a double effort. They will be processed as soon the vcpu control is given to SIE. Signed-off-by: Michael Mueller <mimu@linux.ibm.com> --- arch/s390/kvm/interrupt.c | 54 ++++++++++++++++++++++------------------------- 1 file changed, 25 insertions(+), 29 deletions(-)