
[3/3] arm64/locking: qspinlocks and qrwlocks support

Message ID 1491860104-4103-4-git-send-email-ynorov@caviumnetworks.com (mailing list archive)
State New, archived

Commit Message

Yury Norov April 10, 2017, 9:35 p.m. UTC
From: Jan Glauber <jglauber@cavium.com>

Ported from x86_64 with paravirtualization support removed.

Signed-off-by: Jan Glauber <jglauber@cavium.com>

Note: this patch removes the protection against direct inclusion of
arch/arm64/include/asm/spinlock_types.h. This is done because
kernel/locking/qrwlock.c ends up including it through the header
include/asm-generic/qrwlock_types.h. Until now the only user of
qrwlock.c was x86, which has no such protection either.
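
For reference, the inclusion chain in question looks roughly like this (an
illustrative sketch of the relevant includes, not verbatim kernel source):

	/* kernel/locking/qrwlock.c pulls in the arch header ... */
	#include <asm/qrwlock.h>

	/* ... which includes asm-generic/qrwlock_types.h, and that header does: */
	#include <linux/types.h>
	#include <asm/spinlock_types.h>	/* direct inclusion of the arch types */

Because nothing on this path defines __LINUX_SPINLOCK_TYPES_H or
__ASM_SPINLOCK_H, the old "please don't include this file directly"
guard would fire, hence its removal below.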

I'm not happy about removing the protection, but if it's OK for x86,
it should also be OK for arm64. If not, I think we should fix it
for x86 and add the protection there too.

Yury

Signed-off-by: Yury Norov <ynorov@caviumnetworks.com>
---
 arch/arm64/Kconfig                      |  2 ++
 arch/arm64/include/asm/qrwlock.h        |  7 +++++++
 arch/arm64/include/asm/qspinlock.h      | 20 ++++++++++++++++++++
 arch/arm64/include/asm/spinlock.h       | 12 ++++++++++++
 arch/arm64/include/asm/spinlock_types.h | 14 +++++++++++---
 5 files changed, 52 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm64/include/asm/qrwlock.h
 create mode 100644 arch/arm64/include/asm/qspinlock.h

Comments

Peter Zijlstra April 13, 2017, 6:12 p.m. UTC | #1
On Tue, Apr 11, 2017 at 01:35:04AM +0400, Yury Norov wrote:

> +++ b/arch/arm64/include/asm/qspinlock.h
> @@ -0,0 +1,20 @@
> +#ifndef _ASM_ARM64_QSPINLOCK_H
> +#define _ASM_ARM64_QSPINLOCK_H
> +
> +#include <asm-generic/qspinlock_types.h>
> +
> +#define	queued_spin_unlock queued_spin_unlock
> +/**
> + * queued_spin_unlock - release a queued spinlock
> + * @lock : Pointer to queued spinlock structure
> + *
> + * A smp_store_release() on the least-significant byte.
> + */
> +static inline void queued_spin_unlock(struct qspinlock *lock)
> +{
> +	smp_store_release((u8 *)lock, 0);
> +}

I'm afraid this isn't enough for arm64. I suspect you want your own
variant of queued_spin_unlock_wait() and queued_spin_is_locked() as
well.

Much memory ordering fun to be had there.
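
For context, the generic fallbacks referred to here look roughly like this
(a simplified sketch relying on the asm-generic/qspinlock_types.h
definitions, not the exact generic code of the time):

	static inline int queued_spin_is_locked(struct qspinlock *lock)
	{
		/* any non-zero value means the lock is held or contended */
		return atomic_read(&lock->val);
	}

	static inline void queued_spin_unlock_wait(struct qspinlock *lock)
	{
		/* spin until the locked byte is observed to be clear */
		while (atomic_read(&lock->val) & _Q_LOCKED_MASK)
			cpu_relax();
	}

On a weakly ordered architecture, these relaxed reads by themselves say
nothing about what the caller may observe afterwards, which is where the
arm64-specific ordering concerns come from.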
Yury Norov April 20, 2017, 6:23 p.m. UTC | #2
On Thu, Apr 13, 2017 at 08:12:12PM +0200, Peter Zijlstra wrote:
> On Tue, Apr 11, 2017 at 01:35:04AM +0400, Yury Norov wrote:
> 
> > +++ b/arch/arm64/include/asm/qspinlock.h
> > @@ -0,0 +1,20 @@
> > +#ifndef _ASM_ARM64_QSPINLOCK_H
> > +#define _ASM_ARM64_QSPINLOCK_H
> > +
> > +#include <asm-generic/qspinlock_types.h>
> > +
> > +#define	queued_spin_unlock queued_spin_unlock
> > +/**
> > + * queued_spin_unlock - release a queued spinlock
> > + * @lock : Pointer to queued spinlock structure
> > + *
> > + * A smp_store_release() on the least-significant byte.
> > + */
> > +static inline void queued_spin_unlock(struct qspinlock *lock)
> > +{
> > +	smp_store_release((u8 *)lock, 0);
> > +}
> 
> I'm afraid this isn't enough for arm64. I suspect you want your own
> variant of queued_spin_unlock_wait() and queued_spin_is_locked() as
> well.
> 
> Much memory ordering fun to be had there.

Hi Peter,

Is there some test to reproduce the locking failure for this case? I
ask because I ran locktorture for many hours on my qemu (emulating
cortex-a57) and saw no failures in the test reports. And Jan ran it
on ThunderX, and Adam on QDF2400, without any problems. So even if I
rework those functions, how could I check them for correctness?

Anyway, regarding queued_spin_unlock_wait(), is my understanding
correct that you mean adding smp_mb() before entering the for(;;)
loop, and using ldaxr/stxr instead of atomic_read()?

Yury
Mark Rutland April 20, 2017, 7 p.m. UTC | #3
On Thu, Apr 20, 2017 at 09:23:18PM +0300, Yury Norov wrote:
> On Thu, Apr 13, 2017 at 08:12:12PM +0200, Peter Zijlstra wrote:
> > On Tue, Apr 11, 2017 at 01:35:04AM +0400, Yury Norov wrote:
> > 
> > > +++ b/arch/arm64/include/asm/qspinlock.h
> > > @@ -0,0 +1,20 @@
> > > +#ifndef _ASM_ARM64_QSPINLOCK_H
> > > +#define _ASM_ARM64_QSPINLOCK_H
> > > +
> > > +#include <asm-generic/qspinlock_types.h>
> > > +
> > > +#define	queued_spin_unlock queued_spin_unlock
> > > +/**
> > > + * queued_spin_unlock - release a queued spinlock
> > > + * @lock : Pointer to queued spinlock structure
> > > + *
> > > + * A smp_store_release() on the least-significant byte.
> > > + */
> > > +static inline void queued_spin_unlock(struct qspinlock *lock)
> > > +{
> > > +	smp_store_release((u8 *)lock, 0);
> > > +}
> > 
> > I'm afraid this isn't enough for arm64. I suspect you want your own
> > variant of queued_spin_unlock_wait() and queued_spin_is_locked() as
> > well.
> > 
> > Much memory ordering fun to be had there.
> 
> Hi Peter,
> 
> Is there some test to reproduce the locking failure for this case? I
> ask because I ran locktorture for many hours on my qemu (emulating
> cortex-a57) and saw no failures in the test reports.

Even with multi-threaded TCG, a system emulated with QEMU will have far
stronger memory ordering than a real platform. So stress tests on such a
system are useless for testing memory ordering properties.

I would strongly advise that you use a real platform for anything beyond
basic tests when touching code in this area.

> And Jan ran it on ThunderX, and Adam on QDF2400, without any problems.
> So even if I rework those functions, how could I check them for
> correctness?

Given the variation the architecture permits, and how difficult it is to
diagnose issues in this area, testing isn't enough here.

You need at least some informal proof that the primitives do what
they should, i.e. you should be able to explain why the code is correct.

Thanks,
Mark.
Peter Zijlstra April 20, 2017, 7:05 p.m. UTC | #4
On Thu, Apr 20, 2017 at 09:23:18PM +0300, Yury Norov wrote:
> Is there some test to reproduce the locking failure for this case?

Possibly sysvsem stress before commit:

  27d7be1801a4 ("ipc/sem.c: avoid using spin_unlock_wait()")

Although a similar scheme is also used in nf_conntrack, see commit:

  b316ff783d17 ("locking/spinlock, netfilter: Fix nf_conntrack_lock() barriers")

> I
> ask because I ran locktorture for many hours on my qemu (emulating
> cortex-a57) and saw no failures in the test reports. And Jan ran it
> on ThunderX, and Adam on QDF2400, without any problems. So even if I
> rework those functions, how could I check them for correctness?

Running them doesn't prove them correct. Memory ordering bugs have been
in the kernel for many years without 'ever' triggering. This is stuff
you have to think about.

> Anyway, regarding queued_spin_unlock_wait(), is my understanding
> correct that you mean adding smp_mb() before entering the for(;;)
> loop, and using ldaxr/stxr instead of atomic_read()?

You'll have to ask Will, I always forget the arm64 details.
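
For illustration, the variant Yury describes above might look something like
this (a hypothetical sketch of the idea only, relying on struct qspinlock and
_Q_LOCKED_MASK from asm-generic/qspinlock_types.h; whether this ordering is
actually sufficient on arm64 is exactly the open question):

	static inline void queued_spin_unlock_wait(struct qspinlock *lock)
	{
		/* order everything before the call against the wait loop */
		smp_mb();

		/* wait until the current lock holder, if any, releases the lock */
		while (atomic_read(&lock->val) & _Q_LOCKED_MASK)
			cpu_relax();

		/* give the final read acquire semantics for the caller */
		smp_acquire__after_ctrl_dep();
	}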

Patch

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index f2b0b52..ac1c170 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -24,6 +24,8 @@  config ARM64
 	select ARCH_WANT_COMPAT_IPC_PARSE_VERSION
 	select ARCH_WANT_FRAME_POINTERS
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
+	select ARCH_USE_QUEUED_SPINLOCKS
+	select ARCH_USE_QUEUED_RWLOCKS
 	select ARM_AMBA
 	select ARM_ARCH_TIMER
 	select ARM_GIC
diff --git a/arch/arm64/include/asm/qrwlock.h b/arch/arm64/include/asm/qrwlock.h
new file mode 100644
index 0000000..626f6eb
--- /dev/null
+++ b/arch/arm64/include/asm/qrwlock.h
@@ -0,0 +1,7 @@ 
+#ifndef _ASM_ARM64_QRWLOCK_H
+#define _ASM_ARM64_QRWLOCK_H
+
+#include <asm-generic/qrwlock_types.h>
+#include <asm-generic/qrwlock.h>
+
+#endif /* _ASM_ARM64_QRWLOCK_H */
diff --git a/arch/arm64/include/asm/qspinlock.h b/arch/arm64/include/asm/qspinlock.h
new file mode 100644
index 0000000..98f50fc
--- /dev/null
+++ b/arch/arm64/include/asm/qspinlock.h
@@ -0,0 +1,20 @@ 
+#ifndef _ASM_ARM64_QSPINLOCK_H
+#define _ASM_ARM64_QSPINLOCK_H
+
+#include <asm-generic/qspinlock_types.h>
+
+#define	queued_spin_unlock queued_spin_unlock
+/**
+ * queued_spin_unlock - release a queued spinlock
+ * @lock : Pointer to queued spinlock structure
+ *
+ * A smp_store_release() on the least-significant byte.
+ */
+static inline void queued_spin_unlock(struct qspinlock *lock)
+{
+	smp_store_release((u8 *)lock, 0);
+}
+
+#include <asm-generic/qspinlock.h>
+
+#endif /* _ASM_ARM64_QSPINLOCK_H */
diff --git a/arch/arm64/include/asm/spinlock.h b/arch/arm64/include/asm/spinlock.h
index cae331d..3771339 100644
--- a/arch/arm64/include/asm/spinlock.h
+++ b/arch/arm64/include/asm/spinlock.h
@@ -20,6 +20,10 @@ 
 #include <asm/spinlock_types.h>
 #include <asm/processor.h>
 
+#ifdef CONFIG_QUEUED_SPINLOCKS
+#include <asm/qspinlock.h>
+#else
+
 /*
  * Spinlock implementation.
  *
@@ -187,6 +191,12 @@  static inline int arch_spin_is_contended(arch_spinlock_t *lock)
 }
 #define arch_spin_is_contended	arch_spin_is_contended
 
+#endif /* CONFIG_QUEUED_SPINLOCKS */
+
+#ifdef CONFIG_QUEUED_RWLOCKS
+#include <asm/qrwlock.h>
+#else
+
 /*
  * Write lock implementation.
  *
@@ -351,6 +361,8 @@  static inline int arch_read_trylock(arch_rwlock_t *rw)
 /* read_can_lock - would read_trylock() succeed? */
 #define arch_read_can_lock(x)		((x)->lock < 0x80000000)
 
+#endif /* CONFIG_QUEUED_RWLOCKS */
+
 #define arch_read_lock_flags(lock, flags) arch_read_lock(lock)
 #define arch_write_lock_flags(lock, flags) arch_write_lock(lock)
 
diff --git a/arch/arm64/include/asm/spinlock_types.h b/arch/arm64/include/asm/spinlock_types.h
index 55be59a..0f0f156 100644
--- a/arch/arm64/include/asm/spinlock_types.h
+++ b/arch/arm64/include/asm/spinlock_types.h
@@ -16,9 +16,9 @@ 
 #ifndef __ASM_SPINLOCK_TYPES_H
 #define __ASM_SPINLOCK_TYPES_H
 
-#if !defined(__LINUX_SPINLOCK_TYPES_H) && !defined(__ASM_SPINLOCK_H)
-# error "please don't include this file directly"
-#endif
+#ifdef CONFIG_QUEUED_SPINLOCKS
+#include <asm-generic/qspinlock_types.h>
+#else
 
 #include <linux/types.h>
 
@@ -36,10 +36,18 @@  typedef struct {
 
 #define __ARCH_SPIN_LOCK_UNLOCKED	{ 0 , 0 }
 
+#endif /* CONFIG_QUEUED_SPINLOCKS */
+
+#ifdef CONFIG_QUEUED_RWLOCKS
+#include <asm-generic/qrwlock_types.h>
+#else
+
 typedef struct {
 	volatile unsigned int lock;
 } arch_rwlock_t;
 
 #define __ARCH_RW_LOCK_UNLOCKED		{ 0 }
 
+#endif /* CONFIG_QUEUED_RWLOCKS */
+
 #endif