riscv: Implement smp_cond_load8/16() with Zawrs

Message ID: 20241216032253.685728-1-guoren@kernel.org (mailing list archive)
State: Superseded
Series: riscv: Implement smp_cond_load8/16() with Zawrs

Checks

Context Check Description
conchuod/vmtest-for-next-PR success PR summary
conchuod/patch-1-test-1 success .github/scripts/patches/tests/build_rv32_defconfig.sh took 102.83s
conchuod/patch-1-test-2 success .github/scripts/patches/tests/build_rv64_clang_allmodconfig.sh took 1969.00s
conchuod/patch-1-test-3 success .github/scripts/patches/tests/build_rv64_gcc_allmodconfig.sh took 2337.85s
conchuod/patch-1-test-4 success .github/scripts/patches/tests/build_rv64_nommu_k210_defconfig.sh took 15.96s
conchuod/patch-1-test-5 success .github/scripts/patches/tests/build_rv64_nommu_virt_defconfig.sh took 17.84s
conchuod/patch-1-test-6 warning .github/scripts/patches/tests/checkpatch.sh took 0.41s
conchuod/patch-1-test-7 success .github/scripts/patches/tests/dtb_warn_rv64.sh took 36.73s
conchuod/patch-1-test-8 success .github/scripts/patches/tests/header_inline.sh took 0.00s
conchuod/patch-1-test-9 success .github/scripts/patches/tests/kdoc.sh took 0.49s
conchuod/patch-1-test-10 success .github/scripts/patches/tests/module_param.sh took 0.01s
conchuod/patch-1-test-11 success .github/scripts/patches/tests/verify_fixes.sh took 0.00s
conchuod/patch-1-test-12 success .github/scripts/patches/tests/verify_signedoff.sh took 0.02s

Commit Message

Guo Ren Dec. 16, 2024, 3:22 a.m. UTC
From: Guo Ren <guoren@linux.alibaba.com>

RISC-V uses the queued spinlock implementation, which calls the
smp_cond_load_acquire() macro on a one-byte value. So, complete the
__cmpwait() implementation with byte and halfword versions.

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Signed-off-by: Guo Ren <guoren@kernel.org>
---
 arch/riscv/include/asm/cmpxchg.h | 38 +++++++++++++++++++++++++++++---
 1 file changed, 35 insertions(+), 3 deletions(-)
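
For context, the one-byte caller that motivates this change is the generic
qspinlock slowpath, roughly as follows (paraphrased from
kernel/locking/qspinlock.c, not a verbatim quote):

	/* lock->locked is a u8, so this waits on a single byte and
	 * reaches __cmpwait() with size == 1 on riscv: */
	if (val & _Q_LOCKED_MASK)
		smp_cond_load_acquire(&lock->locked, !VAL);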

Comments

Andrew Jones Dec. 16, 2024, 3:42 p.m. UTC | #1
On Sun, Dec 15, 2024 at 10:22:53PM -0500, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
> 
> RISC-V code uses the queued spinlock implementation, which calls
> the macros smp_cond_load_acquire for one byte. So, complement the
> implementation of byte and halfword versions.
> 
> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> Signed-off-by: Guo Ren <guoren@kernel.org>
> ---
>  arch/riscv/include/asm/cmpxchg.h | 38 +++++++++++++++++++++++++++++---
>  1 file changed, 35 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
> index 4cadc56220fe..2bd42a11ff8f 100644
> --- a/arch/riscv/include/asm/cmpxchg.h
> +++ b/arch/riscv/include/asm/cmpxchg.h
> @@ -365,16 +365,48 @@ static __always_inline void __cmpwait(volatile void *ptr,
>  {
>  	unsigned long tmp;
>  
> +	u32 *__ptr32b;
> +	ulong __s, __val, __mask;
> +
>  	asm goto(ALTERNATIVE("j %l[no_zawrs]", "nop",
>  			     0, RISCV_ISA_EXT_ZAWRS, 1)
>  		 : : : : no_zawrs);
>  
>  	switch (size) {
>  	case 1:
> -		fallthrough;
> +		__ptr32b = (u32 *)((ulong)(ptr) & ~0x3);
> +		__s = ((ulong)(ptr) & 0x3) * BITS_PER_BYTE;
> +		__val = val << __s;
> +		__mask = 0xf << __s;

This mask should be 0xff and the mask below should be 0xffff.
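
To make that concrete, here is the arithmetic for a byte at word-offset 1
(an arbitrary offset chosen for illustration):

	/* __s = (1 & 0x3) * BITS_PER_BYTE = 8 */
	0xf  << 8	/* = 0x00000f00: covers only the low nibble of the byte */
	0xff << 8	/* = 0x0000ff00: covers the full byte, as intended */

	/* likewise for the halfword case below:
	 * 0xff << __s covers 8 of its 16 bits; 0xffff << __s covers all 16 */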

> +
> +		asm volatile(
> +		"	lr.w	%0, %1\n"
> +		"	and	%0, %0, %3\n"
> +		"	xor	%0, %0, %2\n"
> +		"	bnez	%0, 1f\n"
> +			ZAWRS_WRS_NTO "\n"
> +		"1:"
> +		: "=&r" (tmp), "+A" (*(__ptr32b))
> +		: "r" (__val), "r" (__mask)
> +		: "memory");
> +		break;
>  	case 2:
> -		/* RISC-V doesn't have lr instructions on byte and half-word. */
> -		goto no_zawrs;
> +		__ptr32b = (u32 *)((ulong)(ptr) & ~0x3);
> +		__s = ((ulong)(ptr) & 0x2) * BITS_PER_BYTE;
> +		__val = val << __s;
> +		__mask = 0xff << __s;
> +
> +		asm volatile(
> +		"	lr.w	%0, %1\n"
> +		"	and	%0, %0, %3\n"
> +		"	xor	%0, %0, %2\n"
> +		"	bnez	%0, 1f\n"
> +			ZAWRS_WRS_NTO "\n"
> +		"1:"
> +		: "=&r" (tmp), "+A" (*(__ptr32b))
> +		: "r" (__val), "r" (__mask)
> +		: "memory");
> +		break;
>  	case 4:
>  		asm volatile(
>  		"	lr.w	%0, %1\n"
> -- 
> 2.40.1
>

Thanks,
drew
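
Worth noting for readers following the diff: the whole fast path is gated by
the boot-time ALTERNATIVE at the top of __cmpwait(), which this patch leaves
untouched (copied from the diff above, with comments added here):

	/* Patched at boot: if the CPU advertises Zawrs, the "j %l[no_zawrs]"
	 * is replaced by a nop and execution falls through to the lr.w +
	 * wrs.nto cases; otherwise control jumps to the no_zawrs fallback. */
	asm goto(ALTERNATIVE("j %l[no_zawrs]", "nop",
			     0, RISCV_ISA_EXT_ZAWRS, 1)
		 : : : : no_zawrs);
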
Guo Ren Dec. 17, 2024, 1:22 a.m. UTC | #2
On Mon, Dec 16, 2024 at 11:42 PM Andrew Jones <ajones@ventanamicro.com> wrote:
>
> On Sun, Dec 15, 2024 at 10:22:53PM -0500, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > RISC-V code uses the queued spinlock implementation, which calls
> > the macros smp_cond_load_acquire for one byte. So, complement the
> > implementation of byte and halfword versions.
> >
> > Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
> > Signed-off-by: Guo Ren <guoren@kernel.org>
> > ---
> >  arch/riscv/include/asm/cmpxchg.h | 38 +++++++++++++++++++++++++++++---
> >  1 file changed, 35 insertions(+), 3 deletions(-)
> >
> > diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
> > index 4cadc56220fe..2bd42a11ff8f 100644
> > --- a/arch/riscv/include/asm/cmpxchg.h
> > +++ b/arch/riscv/include/asm/cmpxchg.h
> > @@ -365,16 +365,48 @@ static __always_inline void __cmpwait(volatile void *ptr,
> >  {
> >       unsigned long tmp;
> >
> > +     u32 *__ptr32b;
> > +     ulong __s, __val, __mask;
> > +
> >       asm goto(ALTERNATIVE("j %l[no_zawrs]", "nop",
> >                            0, RISCV_ISA_EXT_ZAWRS, 1)
> >                : : : : no_zawrs);
> >
> >       switch (size) {
> >       case 1:
> > -             fallthrough;
> > +             __ptr32b = (u32 *)((ulong)(ptr) & ~0x3);
> > +             __s = ((ulong)(ptr) & 0x3) * BITS_PER_BYTE;
> > +             __val = val << __s;
> > +             __mask = 0xf << __s;
>
> This mask should be 0xff and the mask below should be 0xffff.
Thanks for catching it; it's hard to test. I will correct it in
the next version.

>
> > +
> > +             asm volatile(
> > +             "       lr.w    %0, %1\n"
> > +             "       and     %0, %0, %3\n"
> > +             "       xor     %0, %0, %2\n"
> > +             "       bnez    %0, 1f\n"
> > +                     ZAWRS_WRS_NTO "\n"
> > +             "1:"
> > +             : "=&r" (tmp), "+A" (*(__ptr32b))
> > +             : "r" (__val), "r" (__mask)
> > +             : "memory");
> > +             break;
> >       case 2:
> > -             /* RISC-V doesn't have lr instructions on byte and half-word. */
> > -             goto no_zawrs;
> > +             __ptr32b = (u32 *)((ulong)(ptr) & ~0x3);
> > +             __s = ((ulong)(ptr) & 0x2) * BITS_PER_BYTE;
> > +             __val = val << __s;
> > +             __mask = 0xff << __s;
> > +
> > +             asm volatile(
> > +             "       lr.w    %0, %1\n"
> > +             "       and     %0, %0, %3\n"
> > +             "       xor     %0, %0, %2\n"
> > +             "       bnez    %0, 1f\n"
> > +                     ZAWRS_WRS_NTO "\n"
> > +             "1:"
> > +             : "=&r" (tmp), "+A" (*(__ptr32b))
> > +             : "r" (__val), "r" (__mask)
> > +             : "memory");
> > +             break;
> >       case 4:
> >               asm volatile(
> >               "       lr.w    %0, %1\n"
> > --
> > 2.40.1
> >
>
> Thanks,
> drew
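
With the mask fix folded in, a C-level sketch of what the new size == 1 case
computes looks like this (the name cmpwait_u8_sketch is made up for
illustration, and plain C cannot express the lr.w reservation that wrs.nto
relies on, so treat this as a model of the logic, not the real code):

	static inline void cmpwait_u8_sketch(volatile void *ptr, unsigned long val)
	{
		u32 *__ptr32b = (u32 *)((unsigned long)ptr & ~0x3UL);	/* containing aligned word */
		unsigned long __s = ((unsigned long)ptr & 0x3) * BITS_PER_BYTE;
		unsigned long __val = val << __s;	/* expected value, shifted into place */
		unsigned long __mask = 0xffUL << __s;	/* full byte, per the review fix */

		/* The asm performs this load with lr.w so a reservation on the
		 * word is still registered when wrs.nto executes; any write to
		 * the word invalidates it and wakes the hart, after which the
		 * smp_cond_load_acquire() loop re-checks the condition. */
		if ((*__ptr32b & __mask) == __val)
			;	/* ZAWRS_WRS_NTO stalls here in the real asm */
	}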

Patch

diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 4cadc56220fe..2bd42a11ff8f 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -365,16 +365,48 @@ static __always_inline void __cmpwait(volatile void *ptr,
 {
 	unsigned long tmp;
 
+	u32 *__ptr32b;
+	ulong __s, __val, __mask;
+
 	asm goto(ALTERNATIVE("j %l[no_zawrs]", "nop",
 			     0, RISCV_ISA_EXT_ZAWRS, 1)
 		 : : : : no_zawrs);
 
 	switch (size) {
 	case 1:
-		fallthrough;
+		__ptr32b = (u32 *)((ulong)(ptr) & ~0x3);
+		__s = ((ulong)(ptr) & 0x3) * BITS_PER_BYTE;
+		__val = val << __s;
+		__mask = 0xf << __s;
+
+		asm volatile(
+		"	lr.w	%0, %1\n"
+		"	and	%0, %0, %3\n"
+		"	xor	%0, %0, %2\n"
+		"	bnez	%0, 1f\n"
+			ZAWRS_WRS_NTO "\n"
+		"1:"
+		: "=&r" (tmp), "+A" (*(__ptr32b))
+		: "r" (__val), "r" (__mask)
+		: "memory");
+		break;
 	case 2:
-		/* RISC-V doesn't have lr instructions on byte and half-word. */
-		goto no_zawrs;
+		__ptr32b = (u32 *)((ulong)(ptr) & ~0x3);
+		__s = ((ulong)(ptr) & 0x2) * BITS_PER_BYTE;
+		__val = val << __s;
+		__mask = 0xff << __s;
+
+		asm volatile(
+		"	lr.w	%0, %1\n"
+		"	and	%0, %0, %3\n"
+		"	xor	%0, %0, %2\n"
+		"	bnez	%0, 1f\n"
+			ZAWRS_WRS_NTO "\n"
+		"1:"
+		: "=&r" (tmp), "+A" (*(__ptr32b))
+		: "r" (__val), "r" (__mask)
+		: "memory");
+		break;
 	case 4:
 		asm volatile(
 		"	lr.w	%0, %1\n"