Message ID | 1469619037-13826-1-git-send-email-wanpeng.li@hotmail.com (mailing list archive)
---|---
State | New, archived
On 07/27/2016 07:30 AM, Wanpeng Li wrote:
> From: Wanpeng Li <wanpeng.li@hotmail.com>
>
> When the lock holder vCPU is racing with the queue head vCPU:
>
> [...]
>
> With preemption at the right moment, it is possible that both the
> lock holder and queue head vCPUs can be racing to set node->state,
> which can result in a hash entry race. Making sure the state is never
> set to vcpu_halted will prevent this race from happening.
>
> This patch fixes it by setting vcpu_hashed only after all the hashing
> is done.
>
> Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
> [...]

Acked-by: Waiman Long <Waiman.Long@hpe.com>

Cheers,
Longman

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
2016-08-03 11:28 GMT+08:00 Waiman Long <waiman.long@hpe.com>:
> On 07/27/2016 07:30 AM, Wanpeng Li wrote:
>> [...]
>
> Acked-by: Waiman Long <Waiman.Long@hpe.com>

Thanks. :)

Regards,
Wanpeng Li
From: Wanpeng Li <wanpeng.li@hotmail.com>

When the lock holder vCPU is racing with the queue head vCPU:

lock holder vCPU          queue head vCPU
=====================     ==================

node->locked = 1;
<preemption>              READ_ONCE(node->locked)
...                       pv_wait_head_or_lock():
                            SPIN_THRESHOLD loop;
                            pv_hash();
                            lock->locked = _Q_SLOW_VAL;
                            node->state  = vcpu_hashed;
pv_kick_node():
  cmpxchg(node->state,
     vcpu_halted, vcpu_hashed);
  lock->locked = _Q_SLOW_VAL;
  pv_hash();

With preemption at the right moment, it is possible that both the
lock holder and queue head vCPUs can be racing to set node->state,
which can result in a hash entry race. Making sure the state is never
set to vcpu_halted will prevent this race from happening.

This patch fixes it by setting vcpu_hashed only after all the hashing
is done.

Reviewed-by: Davidlohr Bueso <dave@stgolabs.net>
Reviewed-by: Pan Xinhui <xinhui.pan@linux.vnet.ibm.com>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Waiman Long <Waiman.Long@hpe.com>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Signed-off-by: Wanpeng Li <wanpeng.li@hotmail.com>
---
v3 -> v4:
 * update patch subject
 * add code comments
v2 -> v3:
 * fix typo in patch description
v1 -> v2:
 * adjust patch description

 kernel/locking/qspinlock_paravirt.h | 23 ++++++++++++++++++++++-
 1 file changed, 22 insertions(+), 1 deletion(-)

diff --git a/kernel/locking/qspinlock_paravirt.h b/kernel/locking/qspinlock_paravirt.h
index 21ede57..ca96db4 100644
--- a/kernel/locking/qspinlock_paravirt.h
+++ b/kernel/locking/qspinlock_paravirt.h
@@ -450,7 +450,28 @@ pv_wait_head_or_lock(struct qspinlock *lock, struct mcs_spinlock *node)
 				goto gotlock;
 			}
 		}
-		WRITE_ONCE(pn->state, vcpu_halted);
+		/*
+		 * lock holder vCPU          queue head vCPU
+		 * ----------------          ---------------
+		 * node->locked = 1;
+		 * <preemption>              READ_ONCE(node->locked)
+		 * ...                       pv_wait_head_or_lock():
+		 *                             SPIN_THRESHOLD loop;
+		 *                             pv_hash();
+		 *                             lock->locked = _Q_SLOW_VAL;
+		 *                             node->state  = vcpu_hashed;
+		 * pv_kick_node():
+		 *   cmpxchg(node->state,
+		 *      vcpu_halted, vcpu_hashed);
+		 *   lock->locked = _Q_SLOW_VAL;
+		 *   pv_hash();
+		 *
+		 * With preemption at the right moment, it is possible that both the
+		 * lock holder and queue head vCPUs can be racing to set node->state.
+		 * Making sure the state is never set to vcpu_halted will prevent this
+		 * race from happening.
+		 */
+		WRITE_ONCE(pn->state, vcpu_hashed);
 		qstat_inc(qstat_pv_wait_head, true);
 		qstat_inc(qstat_pv_wait_again, waitcnt);
 		pv_wait(&l->locked, _Q_SLOW_VAL);