Message ID | 20191115201407.45042-1-stewart.hildebrand@dornerworks.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | xen: arm: context switch vtimer PPI state | expand |
(+ Andre) Hi, On 15/11/2019 20:14, Stewart Hildebrand wrote: > From: Ian Campbell <ian.campbell@citrix.com> > > ... instead of artificially masking the timer interrupt in the timer > state and relying on the guest to unmask (which it isn't required to > do per the h/w spec, although Linux does). > > By using the newly added hwppi save/restore functionality we make use > of the GICD I[SC]ACTIVER registers to save and restore the active > state of the interrupt, which prevents the nested invocations which > the current masking works around. > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@dornerworks.com> > --- > v2: Rebased, in particular over Julien's passthrough stuff which > reworked a bunch of IRQ related stuff. > Also largely rewritten since precursor patches now lay very > different groundwork. > > v3: Address feedback from v2 [1]: > * Remove virt_timer_irqs performance counter since it is now unused. > * Add caveat to comment about not using I*ACTIVER register. > * HACK: don't initialize pending_irq->irq in vtimer for new vGIC to > allows us to build with CONFIG_NEW_VGIC=y > > [1] https://lists.xenproject.org/archives/html/xen-devel/2015-11/msg01065.html > --- > > Note: Regarding Stefano's comment in [2], I did test it with the call > to gic_hwppi_set_pending removed, and I was able to boot dom0. When dealing with the vGIC, testing is not enough to justify the removal of some code. We need a worded justification of why it is (or not) necessary. In this case the timer is level (despite some broken HW misconfiguring it), so by removing set_pending() you don't affect anything as restoring the timer registers will automatically mark the interrupt pending. > > [2] https://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02683.html > --- > xen/arch/arm/time.c | 26 ++---------------- > xen/arch/arm/vtimer.c | 45 +++++++++++++++++++++++++++++--- > xen/include/asm-arm/domain.h | 1 + > xen/include/asm-arm/perfc_defn.h | 1 - > 4 files changed, 44 insertions(+), 29 deletions(-) > > diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c > index 739bcf186c..e3a23b8e16 100644 > --- a/xen/arch/arm/time.c > +++ b/xen/arch/arm/time.c > @@ -243,28 +243,6 @@ static void timer_interrupt(int irq, void *dev_id, struct cpu_user_regs *regs) > } > } > > -static void vtimer_interrupt(int irq, void *dev_id, struct cpu_user_regs *regs) > -{ > - /* > - * Edge-triggered interrupts can be used for the virtual timer. Even > - * if the timer output signal is masked in the context switch, the > - * GIC will keep track that of any interrupts raised while IRQS are > - * disabled. As soon as IRQs are re-enabled, the virtual interrupt > - * will be injected to Xen. > - * > - * If an IDLE vCPU was scheduled next then we should ignore the > - * interrupt. > - */ > - if ( unlikely(is_idle_vcpu(current)) ) > - return; > - > - perfc_incr(virt_timer_irqs); > - > - current->arch.virt_timer.ctl = READ_SYSREG32(CNTV_CTL_EL0); > - WRITE_SYSREG32(current->arch.virt_timer.ctl | CNTx_CTL_MASK, CNTV_CTL_EL0); > - vgic_inject_irq(current->domain, current, current->arch.virt_timer.irq, true); > -} > - > /* > * Arch timer interrupt really ought to be level triggered, since the > * design of the timer/comparator mechanism is based around that > @@ -304,8 +282,8 @@ void init_timer_interrupt(void) > > request_irq(timer_irq[TIMER_HYP_PPI], 0, timer_interrupt, > "hyptimer", NULL); > - request_irq(timer_irq[TIMER_VIRT_PPI], 0, vtimer_interrupt, > - "virtimer", NULL); > + route_hwppi_to_current_vcpu(timer_irq[TIMER_VIRT_PPI], "virtimer"); > + > request_irq(timer_irq[TIMER_PHYS_NONSECURE_PPI], 0, timer_interrupt, > "phytimer", NULL); > > diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c > index e6aebdac9e..6e3498952d 100644 > --- a/xen/arch/arm/vtimer.c > +++ b/xen/arch/arm/vtimer.c > @@ -55,9 +55,19 @@ static void phys_timer_expired(void *data) > static void virt_timer_expired(void *data) > { > struct vtimer *t = data; > - t->ctl |= CNTx_CTL_MASK; > - vgic_inject_irq(t->v->domain, t->v, t->irq, true); > - perfc_incr(vtimer_virt_inject); > + t->ctl |= CNTx_CTL_PENDING; I don't think this is necessary. If the software timer fire, then the virtual timer (in HW) would have fired too. So by restoring the timer, then the HW should by itself set the pending bit and trigger the interrupt. > + if ( !(t->ctl & CNTx_CTL_MASK) ) The timer is only set if the virtual timer is enabled and not masked. So I think this check is unnecessary as we could never reached this code with the virtual timer masked. > + { > + /* > + * An edge triggered interrupt should now be pending. Since This does not make sense, the timer interrupt ought to be level. So why are you even speaking about edge here? > + * this timer can never expire while the domain is scheduled > + * we know that the gic_restore_hwppi in virt_timer_restore > + * will cause the real hwppi to occur and be routed. > + */ > + gic_hwppi_set_pending(&t->ppi_state); > + vcpu_unblock(t->v); > + perfc_incr(vtimer_virt_inject); > + } I think the implementation of virt_timer_expired could only be: vcpu_unlock(...); perfc_incr(vtimer_virt_inject); > } > > int domain_vtimer_init(struct domain *d, struct xen_arch_domainconfig *config) > @@ -98,9 +108,14 @@ int domain_vtimer_init(struct domain *d, struct xen_arch_domainconfig *config) > > int vcpu_vtimer_init(struct vcpu *v) > { > +#ifndef CONFIG_NEW_VGIC > + struct pending_irq *p; > +#endif > struct vtimer *t = &v->arch.phys_timer; > bool d0 = is_hardware_domain(v->domain); > > + const unsigned host_vtimer_irq_ppi = timer_get_irq(TIMER_VIRT_PPI); > + > /* > * Hardware domain uses the hardware interrupts, guests get the virtual > * platform. > @@ -118,10 +133,18 @@ int vcpu_vtimer_init(struct vcpu *v) > init_timer(&t->timer, virt_timer_expired, t, v->processor); > t->ctl = 0; > t->irq = d0 > - ? timer_get_irq(TIMER_VIRT_PPI) > + ? host_vtimer_irq_ppi > : GUEST_TIMER_VIRT_PPI; > t->v = v; > > +#ifndef CONFIG_NEW_VGIC > + p = irq_to_pending(v, t->irq); > + p->irq = t->irq; > +#endif p->irq is initialized by vcpu_vgic_init(), so why do you need to override it? > + > + gic_hwppi_state_init(&v->arch.virt_timer.ppi_state, > + host_vtimer_irq_ppi); > + > v->arch.vtimer_initialized = 1; > > return 0; > @@ -149,6 +172,16 @@ void virt_timer_save(struct vcpu *v) > set_timer(&v->arch.virt_timer.timer, ticks_to_ns(v->arch.virt_timer.cval + > v->domain->arch.virt_timer_base.offset - boot_count)); > } > + > + /* > + * Since the vtimer irq is a PPI we don't need to worry about > + * racing against it becoming active while we are saving the > + * state, since that requires the guest to be reading the IAR, > + * as long as the guest is not using I*ACTIVER register which we > + * don't yet implement. I really don't think this is the correct place to document it. If someone were to implement I*ACTIVER then this would likely be missed. But, in this case, I think we should not rely in the vtimer about the implementation of I*ACTIVER and DTRT from the start. > + */ > + gic_save_and_mask_hwppi(v, v->arch.virt_timer.irq, > + &v->arch.virt_timer.ppi_state); I am not entirely sure of the ordering here. Don't we want to restoring the interrupt state first and then the timer registers? > } > > void virt_timer_restore(struct vcpu *v) > @@ -162,6 +195,10 @@ void virt_timer_restore(struct vcpu *v) > WRITE_SYSREG64(v->domain->arch.virt_timer_base.offset, CNTVOFF_EL2); > WRITE_SYSREG64(v->arch.virt_timer.cval, CNTV_CVAL_EL0); > WRITE_SYSREG32(v->arch.virt_timer.ctl, CNTV_CTL_EL0); > + > + gic_restore_hwppi(v, > + v->arch.virt_timer.irq, > + &v->arch.virt_timer.ppi_state); > } > > static bool vtimer_cntp_ctl(struct cpu_user_regs *regs, uint32_t *r, bool read) > diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h > index c3f4cd5069..b8fe142960 100644 > --- a/xen/include/asm-arm/domain.h > +++ b/xen/include/asm-arm/domain.h > @@ -51,6 +51,7 @@ struct vtimer { > struct timer timer; > uint32_t ctl; > uint64_t cval; > + struct hwppi_state ppi_state; > }; > > struct arch_domain > diff --git a/xen/include/asm-arm/perfc_defn.h b/xen/include/asm-arm/perfc_defn.h > index 6a83185163..198dd4eadb 100644 > --- a/xen/include/asm-arm/perfc_defn.h > +++ b/xen/include/asm-arm/perfc_defn.h > @@ -70,7 +70,6 @@ PERFCOUNTER(guest_irqs, "#GUEST-IRQS") > > PERFCOUNTER(hyp_timer_irqs, "Hypervisor timer interrupts") > PERFCOUNTER(phys_timer_irqs, "Physical timer interrupts") > -PERFCOUNTER(virt_timer_irqs, "Virtual timer interrupts") Please add a word in the commit message explaining why virt_timer_irqs is removed. > PERFCOUNTER(maintenance_irqs, "Maintenance interrupts") > > PERFCOUNTER(atomics_guest, "atomics: guest access") > Cheers,
On Mon, 25 Nov 2019, Julien Grall wrote: > On 15/11/2019 20:14, Stewart Hildebrand wrote: > > From: Ian Campbell <ian.campbell@citrix.com> > > > > ... instead of artificially masking the timer interrupt in the timer > > state and relying on the guest to unmask (which it isn't required to > > do per the h/w spec, although Linux does). > > > > By using the newly added hwppi save/restore functionality we make use > > of the GICD I[SC]ACTIVER registers to save and restore the active > > state of the interrupt, which prevents the nested invocations which > > the current masking works around. > > > > Signed-off-by: Ian Campbell <ian.campbell@citrix.com> > > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@dornerworks.com> > > --- > > v2: Rebased, in particular over Julien's passthrough stuff which > > reworked a bunch of IRQ related stuff. > > Also largely rewritten since precursor patches now lay very > > different groundwork. > > > > v3: Address feedback from v2 [1]: > > * Remove virt_timer_irqs performance counter since it is now unused. > > * Add caveat to comment about not using I*ACTIVER register. > > * HACK: don't initialize pending_irq->irq in vtimer for new vGIC to > > allows us to build with CONFIG_NEW_VGIC=y > > > > [1] > > https://lists.xenproject.org/archives/html/xen-devel/2015-11/msg01065.html > > --- > > > > Note: Regarding Stefano's comment in [2], I did test it with the call > > to gic_hwppi_set_pending removed, and I was able to boot dom0. > > When dealing with the vGIC, testing is not enough to justify the removal of > some code. We need a worded justification of why it is (or not) necessary. > > In this case the timer is level (despite some broken HW misconfiguring it), so > by removing set_pending() you don't affect anything as restoring the timer > registers will automatically mark the interrupt pending. > > > > > [2] > > https://lists.xenproject.org/archives/html/xen-devel/2015-12/msg02683.html > > --- > > xen/arch/arm/time.c | 26 ++---------------- > > xen/arch/arm/vtimer.c | 45 +++++++++++++++++++++++++++++--- > > xen/include/asm-arm/domain.h | 1 + > > xen/include/asm-arm/perfc_defn.h | 1 - > > 4 files changed, 44 insertions(+), 29 deletions(-) > > > > diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c > > index 739bcf186c..e3a23b8e16 100644 > > --- a/xen/arch/arm/time.c > > +++ b/xen/arch/arm/time.c > > @@ -243,28 +243,6 @@ static void timer_interrupt(int irq, void *dev_id, > > struct cpu_user_regs *regs) > > } > > } > > -static void vtimer_interrupt(int irq, void *dev_id, struct cpu_user_regs > > *regs) > > -{ > > - /* > > - * Edge-triggered interrupts can be used for the virtual timer. Even > > - * if the timer output signal is masked in the context switch, the > > - * GIC will keep track that of any interrupts raised while IRQS are > > - * disabled. As soon as IRQs are re-enabled, the virtual interrupt > > - * will be injected to Xen. > > - * > > - * If an IDLE vCPU was scheduled next then we should ignore the > > - * interrupt. > > - */ > > - if ( unlikely(is_idle_vcpu(current)) ) > > - return; > > - > > - perfc_incr(virt_timer_irqs); > > - > > - current->arch.virt_timer.ctl = READ_SYSREG32(CNTV_CTL_EL0); > > - WRITE_SYSREG32(current->arch.virt_timer.ctl | CNTx_CTL_MASK, > > CNTV_CTL_EL0); > > - vgic_inject_irq(current->domain, current, current->arch.virt_timer.irq, > > true); > > -} > > - > > /* > > * Arch timer interrupt really ought to be level triggered, since the > > * design of the timer/comparator mechanism is based around that > > @@ -304,8 +282,8 @@ void init_timer_interrupt(void) > > request_irq(timer_irq[TIMER_HYP_PPI], 0, timer_interrupt, > > "hyptimer", NULL); > > - request_irq(timer_irq[TIMER_VIRT_PPI], 0, vtimer_interrupt, > > - "virtimer", NULL); > > + route_hwppi_to_current_vcpu(timer_irq[TIMER_VIRT_PPI], "virtimer"); > > + > > request_irq(timer_irq[TIMER_PHYS_NONSECURE_PPI], 0, timer_interrupt, > > "phytimer", NULL); > > diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c > > index e6aebdac9e..6e3498952d 100644 > > --- a/xen/arch/arm/vtimer.c > > +++ b/xen/arch/arm/vtimer.c > > @@ -55,9 +55,19 @@ static void phys_timer_expired(void *data) > > static void virt_timer_expired(void *data) > > { > > struct vtimer *t = data; > > - t->ctl |= CNTx_CTL_MASK; > > - vgic_inject_irq(t->v->domain, t->v, t->irq, true); > > - perfc_incr(vtimer_virt_inject); > > + t->ctl |= CNTx_CTL_PENDING; > > I don't think this is necessary. If the software timer fire, then the virtual > timer (in HW) would have fired too. So by restoring the timer, then the HW > should by itself set the pending bit and trigger the interrupt. > > > + if ( !(t->ctl & CNTx_CTL_MASK) ) > > The timer is only set if the virtual timer is enabled and not masked. So I > think this check is unnecessary as we could never reached this code with the > virtual timer masked. > > > + { > > + /* > > + * An edge triggered interrupt should now be pending. Since > > This does not make sense, the timer interrupt ought to be level. So why are > you even speaking about edge here? > > > + * this timer can never expire while the domain is scheduled > > + * we know that the gic_restore_hwppi in virt_timer_restore > > + * will cause the real hwppi to occur and be routed. > > + */ > > + gic_hwppi_set_pending(&t->ppi_state); > > + vcpu_unblock(t->v); > > + perfc_incr(vtimer_virt_inject); > > + } > > I think the implementation of virt_timer_expired could only be: > > vcpu_unlock(...); ^ vcpu_unblock Your reasoning seems sound > perfc_incr(vtimer_virt_inject);
diff --git a/xen/arch/arm/time.c b/xen/arch/arm/time.c index 739bcf186c..e3a23b8e16 100644 --- a/xen/arch/arm/time.c +++ b/xen/arch/arm/time.c @@ -243,28 +243,6 @@ static void timer_interrupt(int irq, void *dev_id, struct cpu_user_regs *regs) } } -static void vtimer_interrupt(int irq, void *dev_id, struct cpu_user_regs *regs) -{ - /* - * Edge-triggered interrupts can be used for the virtual timer. Even - * if the timer output signal is masked in the context switch, the - * GIC will keep track that of any interrupts raised while IRQS are - * disabled. As soon as IRQs are re-enabled, the virtual interrupt - * will be injected to Xen. - * - * If an IDLE vCPU was scheduled next then we should ignore the - * interrupt. - */ - if ( unlikely(is_idle_vcpu(current)) ) - return; - - perfc_incr(virt_timer_irqs); - - current->arch.virt_timer.ctl = READ_SYSREG32(CNTV_CTL_EL0); - WRITE_SYSREG32(current->arch.virt_timer.ctl | CNTx_CTL_MASK, CNTV_CTL_EL0); - vgic_inject_irq(current->domain, current, current->arch.virt_timer.irq, true); -} - /* * Arch timer interrupt really ought to be level triggered, since the * design of the timer/comparator mechanism is based around that @@ -304,8 +282,8 @@ void init_timer_interrupt(void) request_irq(timer_irq[TIMER_HYP_PPI], 0, timer_interrupt, "hyptimer", NULL); - request_irq(timer_irq[TIMER_VIRT_PPI], 0, vtimer_interrupt, - "virtimer", NULL); + route_hwppi_to_current_vcpu(timer_irq[TIMER_VIRT_PPI], "virtimer"); + request_irq(timer_irq[TIMER_PHYS_NONSECURE_PPI], 0, timer_interrupt, "phytimer", NULL); diff --git a/xen/arch/arm/vtimer.c b/xen/arch/arm/vtimer.c index e6aebdac9e..6e3498952d 100644 --- a/xen/arch/arm/vtimer.c +++ b/xen/arch/arm/vtimer.c @@ -55,9 +55,19 @@ static void phys_timer_expired(void *data) static void virt_timer_expired(void *data) { struct vtimer *t = data; - t->ctl |= CNTx_CTL_MASK; - vgic_inject_irq(t->v->domain, t->v, t->irq, true); - perfc_incr(vtimer_virt_inject); + t->ctl |= CNTx_CTL_PENDING; + if ( !(t->ctl & CNTx_CTL_MASK) ) + { + /* + * An edge triggered interrupt should now be pending. Since + * this timer can never expire while the domain is scheduled + * we know that the gic_restore_hwppi in virt_timer_restore + * will cause the real hwppi to occur and be routed. + */ + gic_hwppi_set_pending(&t->ppi_state); + vcpu_unblock(t->v); + perfc_incr(vtimer_virt_inject); + } } int domain_vtimer_init(struct domain *d, struct xen_arch_domainconfig *config) @@ -98,9 +108,14 @@ int domain_vtimer_init(struct domain *d, struct xen_arch_domainconfig *config) int vcpu_vtimer_init(struct vcpu *v) { +#ifndef CONFIG_NEW_VGIC + struct pending_irq *p; +#endif struct vtimer *t = &v->arch.phys_timer; bool d0 = is_hardware_domain(v->domain); + const unsigned host_vtimer_irq_ppi = timer_get_irq(TIMER_VIRT_PPI); + /* * Hardware domain uses the hardware interrupts, guests get the virtual * platform. @@ -118,10 +133,18 @@ int vcpu_vtimer_init(struct vcpu *v) init_timer(&t->timer, virt_timer_expired, t, v->processor); t->ctl = 0; t->irq = d0 - ? timer_get_irq(TIMER_VIRT_PPI) + ? host_vtimer_irq_ppi : GUEST_TIMER_VIRT_PPI; t->v = v; +#ifndef CONFIG_NEW_VGIC + p = irq_to_pending(v, t->irq); + p->irq = t->irq; +#endif + + gic_hwppi_state_init(&v->arch.virt_timer.ppi_state, + host_vtimer_irq_ppi); + v->arch.vtimer_initialized = 1; return 0; @@ -149,6 +172,16 @@ void virt_timer_save(struct vcpu *v) set_timer(&v->arch.virt_timer.timer, ticks_to_ns(v->arch.virt_timer.cval + v->domain->arch.virt_timer_base.offset - boot_count)); } + + /* + * Since the vtimer irq is a PPI we don't need to worry about + * racing against it becoming active while we are saving the + * state, since that requires the guest to be reading the IAR, + * as long as the guest is not using I*ACTIVER register which we + * don't yet implement. + */ + gic_save_and_mask_hwppi(v, v->arch.virt_timer.irq, + &v->arch.virt_timer.ppi_state); } void virt_timer_restore(struct vcpu *v) @@ -162,6 +195,10 @@ void virt_timer_restore(struct vcpu *v) WRITE_SYSREG64(v->domain->arch.virt_timer_base.offset, CNTVOFF_EL2); WRITE_SYSREG64(v->arch.virt_timer.cval, CNTV_CVAL_EL0); WRITE_SYSREG32(v->arch.virt_timer.ctl, CNTV_CTL_EL0); + + gic_restore_hwppi(v, + v->arch.virt_timer.irq, + &v->arch.virt_timer.ppi_state); } static bool vtimer_cntp_ctl(struct cpu_user_regs *regs, uint32_t *r, bool read) diff --git a/xen/include/asm-arm/domain.h b/xen/include/asm-arm/domain.h index c3f4cd5069..b8fe142960 100644 --- a/xen/include/asm-arm/domain.h +++ b/xen/include/asm-arm/domain.h @@ -51,6 +51,7 @@ struct vtimer { struct timer timer; uint32_t ctl; uint64_t cval; + struct hwppi_state ppi_state; }; struct arch_domain diff --git a/xen/include/asm-arm/perfc_defn.h b/xen/include/asm-arm/perfc_defn.h index 6a83185163..198dd4eadb 100644 --- a/xen/include/asm-arm/perfc_defn.h +++ b/xen/include/asm-arm/perfc_defn.h @@ -70,7 +70,6 @@ PERFCOUNTER(guest_irqs, "#GUEST-IRQS") PERFCOUNTER(hyp_timer_irqs, "Hypervisor timer interrupts") PERFCOUNTER(phys_timer_irqs, "Physical timer interrupts") -PERFCOUNTER(virt_timer_irqs, "Virtual timer interrupts") PERFCOUNTER(maintenance_irqs, "Maintenance interrupts") PERFCOUNTER(atomics_guest, "atomics: guest access")