Message ID | 1482372913-18366-3-git-send-email-sstabellini@kernel.org (mailing list archive)
---|---
State | New, archived
Hi Stefano,

On 22/12/16 02:15, Stefano Stabellini wrote:
> Always take the vgic lock of the old vcpu. When more than one irq
> migration is requested before the first one completes, take the vgic
> lock of the oldest vcpu.
>
> Write the new vcpu id into the rank from vgic_migrate_irq, protected by
> the oldest vgic vcpu lock.
>
> Use barriers to ensure proper ordering between clearing inflight and
> MIGRATING and setting vcpu to GIC_INVALID_VCPU.

The field p->status was designed to be accessed without any lock using
*_bit helpers.

Currently vgic_migrate_irq is protected by the rank lock (an ASSERT
would probably be useful for documentation) and can only be called once
at a time.

Let's take the following example to describe the problem:
 1) IRQ has been injected into vCPU A (e.g. present in the LR)
 2) IRQ is migrated to vCPU B
 3) IRQ is migrated to vCPU C
 4) IRQ is EOIed by the guest

When vgic_migrate_irq is called for the first time (step 2), the vCPU A
vgic lock will be taken and GIC_IRQ_GUEST_MIGRATING will be set at the
end. The second time (step 3), GIC_IRQ_GUEST_MIGRATING is already set,
so the function will return directly, no lock needed.

So, I think the vgic lock taken is already correct.

Regards,
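For context, the early return Julien describes is the
GIC_IRQ_GUEST_MIGRATING check at the top of the pre-patch
vgic_migrate_irq. A simplified sketch of that entry path, assuming the
Xen 4.8-era code (not a verbatim excerpt; the full function appears in
the diff below):

    void vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq)
    {
        unsigned long flags;
        struct pending_irq *p = irq_to_pending(old, irq);

        /* nothing to do for virtual interrupts */
        if ( p->desc == NULL )
            return;

        /* migration already in progress: return without taking any lock */
        if ( test_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
            return;

        /* ... otherwise take old's vgic lock and do the migration ... */
    }

This is why only the first of two back-to-back migrations (step 2, but
not step 3) takes the old vCPU's vgic lock.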
On Wed, 28 Dec 2016, Julien Grall wrote:
> Hi Stefano,
>
> On 22/12/16 02:15, Stefano Stabellini wrote:
> > Always take the vgic lock of the old vcpu. When more than one irq
> > migration is requested before the first one completes, take the vgic
> > lock of the oldest vcpu.
> >
> > Write the new vcpu id into the rank from vgic_migrate_irq, protected by
> > the oldest vgic vcpu lock.
> >
> > Use barriers to ensure proper ordering between clearing inflight and
> > MIGRATING and setting vcpu to GIC_INVALID_VCPU.
>
> The field p->status was designed to be accessed without any lock using
> *_bit helpers.
>
> Currently vgic_migrate_irq is protected by the rank lock (an ASSERT
> would probably be useful for documentation) and can only be called once
> at a time.
>
> Let's take the following example to describe the problem:
>  1) IRQ has been injected into vCPU A (e.g. present in the LR)
>  2) IRQ is migrated to vCPU B
>  3) IRQ is migrated to vCPU C
>  4) IRQ is EOIed by the guest
>
> When vgic_migrate_irq is called for the first time (step 2), the vCPU A
> vgic lock will be taken and GIC_IRQ_GUEST_MIGRATING will be set at the
> end. The second time (step 3), GIC_IRQ_GUEST_MIGRATING is already set,
> so the function will return directly, no lock needed.

Right, but the caller, vgic_store_itargetsr, will still write to
rank->vcpu.

> So, I think the vgic lock taken is already correct.

The problem arises from 3) and 4) running simultaneously, and
specifically from the rank->vcpu write in vgic_store_itargetsr running
concurrently with gic_update_one_lr, as described in
alpine.DEB.2.10.1612191337370.13831@sstabellini-ThinkPad-X260:

  CPU0                                  CPU1
  <GIC_IRQ_GUEST_MIGRATING is set>      <GIC_IRQ_GUEST_MIGRATING is set>
  ----------------------------------------------------------------------
  vgic_migrate_irq C (does nothing)
                                        clear GIC_IRQ_GUEST_MIGRATING
                                        read rank->vcpu B
  set rank->vcpu C
                                        irq_set_affinity B

Result: the irq physical affinity is B instead of C.

Please note that the new patch
(1483486167-24607-1-git-send-email-sstabellini@kernel.org) doesn't have
this problem because it removes the call to irq_set_affinity in
gic_update_one_lr.
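The racing reader here is the MIGRATING branch of gic_update_one_lr,
visible as context lines in the diff below. A simplified sketch,
assuming the pre-patch code (the guard condition is paraphrased):

    /* gic_update_one_lr, run on EOI while holding the old vCPU's vgic lock */
    if ( test_and_clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status) )
    {
        /*
         * Reads rank->vcpu; vgic_store_itargetsr can still update
         * rank->vcpu between this read ...
         */
        struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
        /*
         * ... and this write of the physical affinity, leaving the
         * affinity on the stale target (B instead of C above).
         */
        irq_set_affinity(p->desc, cpumask_of(v_target->processor));
    }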
Hi Stefano,

On 03/01/17 23:30, Stefano Stabellini wrote:
> On Wed, 28 Dec 2016, Julien Grall wrote:
>> On 22/12/16 02:15, Stefano Stabellini wrote:
>>> Always take the vgic lock of the old vcpu. When more than one irq
>>> migration is requested before the first one completes, take the vgic
>>> lock of the oldest vcpu.
>>>
>>> Write the new vcpu id into the rank from vgic_migrate_irq, protected by
>>> the oldest vgic vcpu lock.
>>>
>>> Use barriers to ensure proper ordering between clearing inflight and
>>> MIGRATING and setting vcpu to GIC_INVALID_VCPU.
>>
>> The field p->status was designed to be accessed without any lock using
>> *_bit helpers.
>>
>> Currently vgic_migrate_irq is protected by the rank lock (an ASSERT
>> would probably be useful for documentation) and can only be called once
>> at a time.
>>
>> Let's take the following example to describe the problem:
>>  1) IRQ has been injected into vCPU A (e.g. present in the LR)
>>  2) IRQ is migrated to vCPU B
>>  3) IRQ is migrated to vCPU C
>>  4) IRQ is EOIed by the guest
>>
>> When vgic_migrate_irq is called for the first time (step 2), the vCPU A
>> vgic lock will be taken and GIC_IRQ_GUEST_MIGRATING will be set at the
>> end. The second time (step 3), GIC_IRQ_GUEST_MIGRATING is already set,
>> so the function will return directly, no lock needed.
>
> Right, but the caller, vgic_store_itargetsr, will still write to
> rank->vcpu.
>
>> So, I think the vgic lock taken is already correct.
>
> The problem arises from 3) and 4) running simultaneously, and
> specifically from the rank->vcpu write in vgic_store_itargetsr running
> concurrently with gic_update_one_lr, as described in
> alpine.DEB.2.10.1612191337370.13831@sstabellini-ThinkPad-X260:
>
>   CPU0                                  CPU1
>   <GIC_IRQ_GUEST_MIGRATING is set>      <GIC_IRQ_GUEST_MIGRATING is set>
>   ----------------------------------------------------------------------
>   vgic_migrate_irq C (does nothing)
>                                         clear GIC_IRQ_GUEST_MIGRATING
>                                         read rank->vcpu B
>   set rank->vcpu C
>                                         irq_set_affinity B
>
> Result: the irq physical affinity is B instead of C.
>
> Please note that the new patch
> (1483486167-24607-1-git-send-email-sstabellini@kernel.org) doesn't have
> this problem because it removes the call to irq_set_affinity in
> gic_update_one_lr.

Sorry, I think there was a bit of confusion with my previous e-mail. I
was describing the behavior with a change in the code similar to your
v3.

Cheers,
diff --git a/xen/arch/arm/gic.c b/xen/arch/arm/gic.c
index 3189693..51148b4 100644
--- a/xen/arch/arm/gic.c
+++ b/xen/arch/arm/gic.c
@@ -512,6 +512,11 @@ static void gic_update_one_lr(struct vcpu *v, int i)
             struct vcpu *v_target = vgic_get_target_vcpu(v, irq);
             irq_set_affinity(p->desc, cpumask_of(v_target->processor));
         }
+        /*
+         * Clear MIGRATING, set new affinity, then clear vcpu. This
+         * barrier pairs with the one in vgic_migrate_irq.
+         */
+        smp_mb();
         p->vcpu = GIC_INVALID_VCPU;
     }
 }
diff --git a/xen/arch/arm/vgic-v2.c b/xen/arch/arm/vgic-v2.c
index 3dbcfe8..38b1be1 100644
--- a/xen/arch/arm/vgic-v2.c
+++ b/xen/arch/arm/vgic-v2.c
@@ -154,15 +154,9 @@ static void vgic_store_itargetsr(struct domain *d, struct vgic_irq_rank *rank,
 
         old_target = rank->vcpu[offset];
 
-        /* Only migrate the vIRQ if the target vCPU has changed */
-        if ( new_target != old_target )
-        {
-            vgic_migrate_irq(d->vcpu[old_target],
-                             d->vcpu[new_target],
-                             virq);
-        }
-
-        rank->vcpu[offset] = new_target;
+        vgic_migrate_irq(d->vcpu[old_target],
+                         d->vcpu[new_target],
+                         virq, &rank->vcpu[offset]);
     }
 }
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index d61479d..6fb0fdd 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -150,11 +150,7 @@ static void vgic_store_irouter(struct domain *d, struct vgic_irq_rank *rank,
     if ( !new_vcpu )
         return;
 
-    /* Only migrate the IRQ if the target vCPU has changed */
-    if ( new_vcpu != old_vcpu )
-        vgic_migrate_irq(old_vcpu, new_vcpu, virq);
-
-    rank->vcpu[offset] = new_vcpu->vcpu_id;
+    vgic_migrate_irq(old_vcpu, new_vcpu, virq, &rank->vcpu[offset]);
 }
 
 static inline bool vgic_reg64_check_access(struct hsr_dabt dabt)
diff --git a/xen/arch/arm/vgic.c b/xen/arch/arm/vgic.c
index f2e3eda..cceac24 100644
--- a/xen/arch/arm/vgic.c
+++ b/xen/arch/arm/vgic.c
@@ -257,9 +257,8 @@ static int vgic_get_virq_priority(struct vcpu *v, unsigned int virq)
     return priority;
 }
 
-void vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq)
+static void __vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq)
 {
-    unsigned long flags;
     struct pending_irq *p = irq_to_pending(old, irq);
 
     /* nothing to do for virtual interrupts */
@@ -272,12 +271,9 @@ void vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq)
 
     perfc_incr(vgic_irq_migrates);
 
-    spin_lock_irqsave(&old->arch.vgic.lock, flags);
-
     if ( list_empty(&p->inflight) )
     {
         irq_set_affinity(p->desc, cpumask_of(new->processor));
-        spin_unlock_irqrestore(&old->arch.vgic.lock, flags);
         return;
     }
     /* If the IRQ is still lr_pending, re-inject it to the new vcpu */
@@ -287,7 +283,6 @@ void vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq)
         list_del_init(&p->lr_queue);
         list_del_init(&p->inflight);
         irq_set_affinity(p->desc, cpumask_of(new->processor));
-        spin_unlock_irqrestore(&old->arch.vgic.lock, flags);
         vgic_vcpu_inject_irq(new, irq);
         return;
     }
@@ -296,7 +291,48 @@ void vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq)
     if ( !list_empty(&p->inflight) )
         set_bit(GIC_IRQ_GUEST_MIGRATING, &p->status);
-    spin_unlock_irqrestore(&old->arch.vgic.lock, flags);
+}
+
+void vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq,
+                      uint8_t *rank_vcpu)
+{
+    struct pending_irq *p;
+    unsigned long flags;
+    struct vcpu *v;
+    uint8_t vcpu;
+
+    /* Only migrate the IRQ if the target vCPU has changed */
+    if ( new == old )
+        return;
+
+    /*
+     * In most cases, p->vcpu is either invalid or the same as "old".
+     * The only exceptions are cases where the interrupt has already
+     * been migrated to a different vcpu, but the irq migration is still
+     * in progress (GIC_IRQ_GUEST_MIGRATING has been set). If that is
+     * the case, then "old" points to an intermediary vcpu we don't care
+     * about. We want to take the lock on the older vcpu instead,
+     * because that is the one gic_update_one_lr holds.
+     *
+     * The vgic lock is the only lock protecting accesses to rank_vcpu
+     * from gic_update_one_lr. However, writes to rank_vcpu are still
+     * protected by the rank lock.
+     */
+    p = irq_to_pending(old, irq);
+    vcpu = p->vcpu;
+
+    /* This pairs with the barrier in gic_update_one_lr. */
+    smp_mb();
+
+    if ( vcpu != GIC_INVALID_VCPU )
+        v = old->domain->vcpu[vcpu];
+    else
+        v = old;
+
+    spin_lock_irqsave(&v->arch.vgic.lock, flags);
+    __vgic_migrate_irq(old, new, irq);
+    *rank_vcpu = new->vcpu_id;
+    spin_unlock_irqrestore(&v->arch.vgic.lock, flags);
 }
 
 void arch_move_irqs(struct vcpu *v)
diff --git a/xen/include/asm-arm/vgic.h b/xen/include/asm-arm/vgic.h
index fde5b32..dce2f84 100644
--- a/xen/include/asm-arm/vgic.h
+++ b/xen/include/asm-arm/vgic.h
@@ -314,7 +314,8 @@ extern int vcpu_vgic_free(struct vcpu *v);
 extern bool vgic_to_sgi(struct vcpu *v, register_t sgir, enum gic_sgi_mode irqmode,
         int virq, const struct sgi_target *target);
-extern void vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq);
+extern void vgic_migrate_irq(struct vcpu *old, struct vcpu *new, unsigned int irq,
+                             uint8_t *rank_vcpu);
 /* Reserve a specific guest vIRQ */
 extern bool vgic_reserve_virq(struct domain *d, unsigned int virq);
Always take the vgic lock of the old vcpu. When more than one irq
migration is requested before the first one completes, take the vgic
lock of the oldest vcpu.

Write the new vcpu id into the rank from vgic_migrate_irq, protected by
the oldest vgic vcpu lock.

Use barriers to ensure proper ordering between clearing inflight and
MIGRATING and setting vcpu to GIC_INVALID_VCPU.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
---
 xen/arch/arm/gic.c         |  5 +++++
 xen/arch/arm/vgic-v2.c     | 12 +++--------
 xen/arch/arm/vgic-v3.c     |  6 +-----
 xen/arch/arm/vgic.c        | 50 +++++++++++++++++++++++++++++++++++++++-------
 xen/include/asm-arm/vgic.h |  3 ++-
 5 files changed, 54 insertions(+), 22 deletions(-)
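The barrier pairing mentioned above can be read out of the gic.c and
vgic.c hunks; schematically (a sketch of the ordering argument, not
verbatim code):

    /* gic_update_one_lr (writer), under the old vCPU's vgic lock */
    clear_bit(GIC_IRQ_GUEST_MIGRATING, &p->status); /* and clear inflight */
    irq_set_affinity(p->desc, cpumask_of(v_target->processor));
    smp_mb();                       /* order all of the above before ... */
    p->vcpu = GIC_INVALID_VCPU;     /* ... declaring the migration done */

    /* vgic_migrate_irq (reader) */
    vcpu = p->vcpu;                 /* choose whose vgic lock to take */
    smp_mb();                       /* pairs with the writer's barrier */
    v = (vcpu != GIC_INVALID_VCPU) ? old->domain->vcpu[vcpu] : old;
    spin_lock_irqsave(&v->arch.vgic.lock, flags);

The intent is that a reader seeing a still-valid p->vcpu takes that
vCPU's vgic lock, i.e. the lock gic_update_one_lr runs under, and so
serializes the rank->vcpu update against the affinity write; a reader
seeing GIC_INVALID_VCPU knows the migration, including irq_set_affinity,
has already completed.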