Message ID | alpine.LFD.2.02.1208091243080.5231@xanadu.home (mailing list archive)
---|---
State | New, archived
On Thu, Aug 09, 2012 at 05:57:33PM +0100, Nicolas Pitre wrote:
> On Thu, 9 Aug 2012, Nicolas Pitre wrote:
> > Yes, that looks fine. I'd remove that if (prev < 0) entirely though.
> > We'll just swap a 0 for a 0 if the count wasn't < 0, or a 0 for a 1 if
> > the mutex just got unlocked which is also fine. This is especially
> > beneficial when a native xchg processor instruction is used.
>
> In fact, this suggestion isn't entirely correct either. The inner xchg
> in this case should be -1 and not 'count'. If 'count' is 0 and the
> mutex becomes contended in the small window between the two xchg's then
> the contention mark would be lost again.
>
> In other words, I think this should look like this:
>
> diff --git a/include/asm-generic/mutex-xchg.h b/include/asm-generic/mutex-xchg.h
> index 580a6d35c7..44a66c99c8 100644
> --- a/include/asm-generic/mutex-xchg.h
> +++ b/include/asm-generic/mutex-xchg.h
> @@ -25,8 +25,11 @@
>  static inline void
>  __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
>  {
> -	if (unlikely(atomic_xchg(count, 0) != 1))
> -		fail_fn(count);
> +	if (unlikely(atomic_xchg(count, 0) != 1)) {
> +		/* Mark lock contention explicitly */
> +		if (likely(atomic_xchg(count, -1) != 1))
> +			fail_fn(count);
> +	}
>  }
>
>  /**

Doesn't this mean that we're no longer just swapping 0 for a 0 if the lock
was taken, therefore needlessly sending the current owner down the slowpath
on unlock? If we have the explicit test, that doesn't happen and the
disassembly isn't much different (an additional cmp/conditional branch).

Will
On Thu, 9 Aug 2012, Will Deacon wrote:
> On Thu, Aug 09, 2012 at 05:57:33PM +0100, Nicolas Pitre wrote:
> > On Thu, 9 Aug 2012, Nicolas Pitre wrote:
> > > Yes, that looks fine. I'd remove that if (prev < 0) entirely though.
> > > We'll just swap a 0 for a 0 if the count wasn't < 0, or a 0 for a 1 if
> > > the mutex just got unlocked which is also fine. This is especially
> > > beneficial when a native xchg processor instruction is used.
> >
> > In fact, this suggestion isn't entirely correct either. The inner xchg
> > in this case should be -1 and not 'count'. If 'count' is 0 and the
> > mutex becomes contended in the small window between the two xchg's then
> > the contention mark would be lost again.
> >
> > In other words, I think this should look like this:
> >
> > diff --git a/include/asm-generic/mutex-xchg.h b/include/asm-generic/mutex-xchg.h
> > index 580a6d35c7..44a66c99c8 100644
> > --- a/include/asm-generic/mutex-xchg.h
> > +++ b/include/asm-generic/mutex-xchg.h
> > @@ -25,8 +25,11 @@
> >  static inline void
> >  __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
> >  {
> > -	if (unlikely(atomic_xchg(count, 0) != 1))
> > -		fail_fn(count);
> > +	if (unlikely(atomic_xchg(count, 0) != 1)) {
> > +		/* Mark lock contention explicitly */
> > +		if (likely(atomic_xchg(count, -1) != 1))
> > +			fail_fn(count);
> > +	}
> >  }
> >
> >  /**
>
> Doesn't this mean that we're no longer just swapping 0 for a 0 if the lock
> was taken, therefore needlessly sending the current owner down the slowpath
> on unlock?

If the lock was taken, this means the count was either 0 or -1. If it
was 1 then we just put a 0 there and we own it. But if the count was 0
then we should store -1 instead, which is what the inner xchg does. If
the count was already -1 then we store -1 back. That more closely mimics
what the atomic dec does, which is what we want.

Nicolas
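To make the count encoding Nicolas describes concrete, here is a minimal
userspace C11 model of the double-xchg fastpath. The names `fastpath_lock`
and `slowpath` are hypothetical stand-ins for the kernel's
__mutex_fastpath_lock() and its failure path, not the real kernel API:

```c
#include <stdatomic.h>
#include <stdio.h>

/*
 * Model of the count encoding (hypothetical userspace sketch):
 *    1 -> unlocked
 *    0 -> locked, no waiters suspected
 *   -1 -> locked, possibly contended
 */
static atomic_int count = 1;

static void slowpath(void)
{
	/* In the kernel this would queue the task and sleep. */
	puts("taking the slowpath");
}

static void fastpath_lock(void)
{
	if (atomic_exchange(&count, 0) != 1) {
		/*
		 * Lock was not free. Store -1 rather than the value we
		 * displaced: if another task wrote -1 between the two
		 * exchanges, writing a 0 back would erase the contention
		 * mark and the unlocker would skip the wakeup.
		 */
		if (atomic_exchange(&count, -1) != 1)
			slowpath();
	}
}

int main(void)
{
	fastpath_lock();                     /* uncontended: 1 -> 0 */
	printf("count = %d\n", atomic_load(&count));
	fastpath_lock();                     /* contended: 0 -> -1, slowpath */
	printf("count = %d\n", atomic_load(&count));
	return 0;
}
```

The inner exchange trades an occasional spurious slowpath unlock for never
losing a waiter, which is the safe direction to err in.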
On Thu, Aug 09, 2012 at 07:09:02PM +0100, Nicolas Pitre wrote:
> On Thu, 9 Aug 2012, Will Deacon wrote:
> > On Thu, Aug 09, 2012 at 05:57:33PM +0100, Nicolas Pitre wrote:
> > > On Thu, 9 Aug 2012, Nicolas Pitre wrote:
> > > diff --git a/include/asm-generic/mutex-xchg.h b/include/asm-generic/mutex-xchg.h
> > > index 580a6d35c7..44a66c99c8 100644
> > > --- a/include/asm-generic/mutex-xchg.h
> > > +++ b/include/asm-generic/mutex-xchg.h
> > > @@ -25,8 +25,11 @@
> > >  static inline void
> > >  __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
> > >  {
> > > -	if (unlikely(atomic_xchg(count, 0) != 1))
> > > -		fail_fn(count);
> > > +	if (unlikely(atomic_xchg(count, 0) != 1)) {
> > > +		/* Mark lock contention explicitly */
> > > +		if (likely(atomic_xchg(count, -1) != 1))
> > > +			fail_fn(count);
> > > +	}
> > >  }
> > >
> > >  /**
> >
> > Doesn't this mean that we're no longer just swapping 0 for a 0 if the lock
> > was taken, therefore needlessly sending the current owner down the slowpath
> > on unlock?
>
> If the lock was taken, this means the count was either 0 or -1. If it
> was 1 then we just put a 0 there and we own it. But if the count was 0
> then we should store -1 instead, which is what the inner xchg does. If
> the count was already -1 then we store -1 back. That more closely mimics
> what the atomic dec does, which is what we want.

Ok, I just wasn't sure that marking the lock contended was required when it
was previously locked, given that we'll drop into spinning on the owner
anyway. I'll add a commit message to the above and re-post if that's ok?

Will
On Thu, 9 Aug 2012, Will Deacon wrote:
> On Thu, Aug 09, 2012 at 07:09:02PM +0100, Nicolas Pitre wrote:
> > On Thu, 9 Aug 2012, Will Deacon wrote:
> > > On Thu, Aug 09, 2012 at 05:57:33PM +0100, Nicolas Pitre wrote:
> > > > On Thu, 9 Aug 2012, Nicolas Pitre wrote:
> > > > diff --git a/include/asm-generic/mutex-xchg.h b/include/asm-generic/mutex-xchg.h
> > > > index 580a6d35c7..44a66c99c8 100644
> > > > --- a/include/asm-generic/mutex-xchg.h
> > > > +++ b/include/asm-generic/mutex-xchg.h
> > > > @@ -25,8 +25,11 @@
> > > >  static inline void
> > > >  __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
> > > >  {
> > > > -	if (unlikely(atomic_xchg(count, 0) != 1))
> > > > -		fail_fn(count);
> > > > +	if (unlikely(atomic_xchg(count, 0) != 1)) {
> > > > +		/* Mark lock contention explicitly */
> > > > +		if (likely(atomic_xchg(count, -1) != 1))
> > > > +			fail_fn(count);
> > > > +	}
> > > >  }
> > > >
> > > >  /**
> > >
> > > Doesn't this mean that we're no longer just swapping 0 for a 0 if the lock
> > > was taken, therefore needlessly sending the current owner down the slowpath
> > > on unlock?
> >
> > If the lock was taken, this means the count was either 0 or -1. If it
> > was 1 then we just put a 0 there and we own it. But if the count was 0
> > then we should store -1 instead, which is what the inner xchg does. If
> > the count was already -1 then we store -1 back. That more closely mimics
> > what the atomic dec does, which is what we want.
>
> Ok, I just wasn't sure that marking the lock contended was required when it
> was previously locked, given that we'll drop into spinning on the owner
> anyway.

That's fine, and the owner will put 1 back when it unlocks it, as well as
processing the wait queue, which is what we need.

> I'll add a commit message to the above and re-post if that's ok?

Sure. Don't forget to update __mutex_fastpath_lock_retval() as well.

Nicolas
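Nicolas's reminder concerns the sibling fastpath in the same header. A
sketch of the analogous change, assuming the same double-xchg approach is
applied there (kernel context, so atomic_t, likely()/unlikely() and the
fail_fn convention are taken as given):

```c
/*
 * Sketch: the retval variant of the fastpath, with the same explicit
 * contention marking. Returns 0 on success, otherwise propagates the
 * failure path's return value.
 */
static inline int
__mutex_fastpath_lock_retval(atomic_t *count, int (*fail_fn)(atomic_t *))
{
	if (unlikely(atomic_xchg(count, 0) != 1))
		/* Lock not free: mark it contended before failing */
		if (likely(atomic_xchg(count, -1) != 1))
			return fail_fn(count);
	return 0;
}
```

Returning 0 on success lets callers such as mutex_lock_interruptible()
distinguish a clean acquisition from a slowpath result.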
On Thu, 2012-08-09 at 12:57 -0400, Nicolas Pitre wrote:
>
> In other words, I think this should look like this:
>
> diff --git a/include/asm-generic/mutex-xchg.h b/include/asm-generic/mutex-xchg.h
> index 580a6d35c7..44a66c99c8 100644
> --- a/include/asm-generic/mutex-xchg.h
> +++ b/include/asm-generic/mutex-xchg.h
> @@ -25,8 +25,11 @@
>  static inline void
>  __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
>  {
> -	if (unlikely(atomic_xchg(count, 0) != 1))
> -		fail_fn(count);
> +	if (unlikely(atomic_xchg(count, 0) != 1)) {
> +		/* Mark lock contention explicitly */
> +		if (likely(atomic_xchg(count, -1) != 1))
> +			fail_fn(count);
> +	}
>  }

OK, I like this. Thanks guys! Will, will you send a final and complete
patch?
Hi Peter,

On Mon, Aug 13, 2012 at 09:15:04AM +0100, Peter Zijlstra wrote:
> On Thu, 2012-08-09 at 12:57 -0400, Nicolas Pitre wrote:
> >
> > In other words, I think this should look like this:
> >
> > diff --git a/include/asm-generic/mutex-xchg.h b/include/asm-generic/mutex-xchg.h
> > index 580a6d35c7..44a66c99c8 100644
> > --- a/include/asm-generic/mutex-xchg.h
> > +++ b/include/asm-generic/mutex-xchg.h
> > @@ -25,8 +25,11 @@
> >  static inline void
> >  __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
> >  {
> > -	if (unlikely(atomic_xchg(count, 0) != 1))
> > -		fail_fn(count);
> > +	if (unlikely(atomic_xchg(count, 0) != 1)) {
> > +		/* Mark lock contention explicitly */
> > +		if (likely(atomic_xchg(count, -1) != 1))
> > +			fail_fn(count);
> > +	}
> >  }
>
> OK, I like this. Thanks guys! Will, will you send a final and complete
> patch?

I sent one out on Friday but somehow managed to drop you from CC -- sorry
about that:

http://thread.gmane.org/gmane.linux.kernel/1341305

Cheers,

Will
On Mon, 2012-08-13 at 09:35 -0400, Nicolas Pitre wrote:
> Date: Fri, 10 Aug 2012 15:22:09 +0100
> From: Will Deacon <will.deacon@arm.com>
> Subject: [PATCH] mutex: place lock in contended state after fastpath_lock failure
>
> ARM recently moved to asm-generic/mutex-xchg.h for its mutex
> implementation after the previous implementation was found to be missing
> some crucial memory barriers. However, this has revealed some problems
> running hackbench on SMP platforms due to the way in which the
> MUTEX_SPIN_ON_OWNER code operates.
>
> The symptoms are that a bunch of hackbench tasks are left waiting on an
> unlocked mutex and therefore never get woken up to claim it. This boils
> down to the following sequence of events:
>
>         Task A        Task B        Task C        Lock value
> 0                                                     1
> 1       lock()                                        0
> 2                     lock()                          0
> 3                     spin(A)                         0
> 4       unlock()                                      1
> 5                                   lock()            0
> 6                     cmpxchg(1,0)                    0
> 7                     contended()                    -1
> 8       lock()                                        0
> 9       spin(C)                                       0
> 10                                  unlock()          1
> 11      cmpxchg(1,0)                                  0
> 12      unlock()                                      1
>
> At this point, the lock is unlocked, but Task B is in an uninterruptible
> sleep with nobody to wake it up.
>
> This patch fixes the problem by ensuring we put the lock into the
> contended state if we fail to acquire it on the fastpath, ensuring that
> any blocked waiters are woken up when the mutex is released.
>
> Cc: Arnd Bergmann <arnd@arndb.de>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Chris Mason <chris.mason@fusionio.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Will Deacon <will.deacon@arm.com>
> Reviewed-by: Nicolas Pitre <nico@linaro.org>

Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

Will you carry this through the ARM tree or do you want me/Ingo to take it?
On Mon, Aug 13, 2012 at 03:05:17PM +0100, Peter Zijlstra wrote:
> Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>

Thanks Peter.

> Will you carry this through the ARM tree or do you want me/Ingo to take
> it?

Please can you guys take it? The move to mutex-dec can go through the ARM
tree but it's probably best if you take the core change.

Cheers,

Will
On Mon, 2012-08-13 at 15:11 +0100, Will Deacon wrote:
>
> > Will you carry this through the ARM tree or do you want me/Ingo to take
> > it?
>
> Please can you guys take it? The move to mutex-dec can go through the ARM
> tree but it's probably best if you take the core change.

OK, done. Thanks!
```diff
diff --git a/include/asm-generic/mutex-xchg.h b/include/asm-generic/mutex-xchg.h
index 580a6d35c7..44a66c99c8 100644
--- a/include/asm-generic/mutex-xchg.h
+++ b/include/asm-generic/mutex-xchg.h
@@ -25,8 +25,11 @@
 static inline void
 __mutex_fastpath_lock(atomic_t *count, void (*fail_fn)(atomic_t *))
 {
-	if (unlikely(atomic_xchg(count, 0) != 1))
-		fail_fn(count);
+	if (unlikely(atomic_xchg(count, 0) != 1)) {
+		/* Mark lock contention explicitly */
+		if (likely(atomic_xchg(count, -1) != 1))
+			fail_fn(count);
+	}
 }

 /**
```
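For context on why the -1 mark matters, the unlock fastpath in the same
header takes the slowpath, and thus wakes any sleeping waiters, only when
the value it displaces is non-zero. A sketch of that era's asm-generic
implementation, from memory rather than the patch itself:

```c
static inline void
__mutex_fastpath_unlock(atomic_t *count, void (*fail_fn)(atomic_t *))
{
	/*
	 * Swap in 1 (unlocked). A previous value of 0 means nobody was
	 * waiting; anything else (-1, contended) means a task may be
	 * asleep, so take the slowpath to wake it. This is exactly why
	 * erasing the -1 in the lock fastpath loses wakeups.
	 */
	if (unlikely(atomic_xchg(count, 1) != 0))
		fail_fn(count);
}
```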