[V11,00/17] riscv: Add Native/Paravirt qspinlock support

Message ID	20230910082911.3378782-1-guoren@kernel.org (mailing list archive)
Headers	show Return-Path: <kvm-owner@vger.kernel.org> From: guoren@kernel.org To: paul.walmsley@sifive.com, anup@brainfault.org, peterz@infradead.org, mingo@redhat.com, will@kernel.org, palmer@rivosinc.com, longman@redhat.com, boqun.feng@gmail.com, tglx@linutronix.de, paulmck@kernel.org, rostedt@goodmis.org, rdunlap@infradead.org, catalin.marinas@arm.com, conor.dooley@microchip.com, xiaoguang.xing@sophgo.com, bjorn@rivosinc.com, alexghiti@rivosinc.com, keescook@chromium.org, greentime.hu@sifive.com, ajones@ventanamicro.com, jszhang@kernel.org, wefu@redhat.com, wuwei2016@iscas.ac.cn, leobras@redhat.com Cc: linux-arch@vger.kernel.org, linux-riscv@lists.infradead.org, linux-doc@vger.kernel.org, kvm@vger.kernel.org, virtualization@lists.linux-foundation.org, linux-csky@vger.kernel.org, Guo Ren <guoren@linux.alibaba.com> Subject: [PATCH V11 00/17] riscv: Add Native/Paravirt qspinlock support Date: Sun, 10 Sep 2023 04:28:54 -0400 Message-Id: <20230910082911.3378782-1-guoren@kernel.org> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	riscv: Add Native/Paravirt qspinlock support \| expand [V11,00/17] riscv: Add Native/Paravirt qspinlock support [V11,01/17] asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock [V11,02/17] asm-generic: ticket-lock: Move into ticket_spinlock.h [V11,03/17] riscv: Use Zicbop in arch_xchg when available [V11,04/17] locking/qspinlock: Improve xchg_tail for number of cpus >= 16k [V11,05/17] riscv: qspinlock: Add basic queued_spinlock support [V11,06/17] riscv: qspinlock: Introduce combo spinlock [V11,07/17] riscv: qspinlock: Introduce qspinlock param for command line [V11,08/17] riscv: qspinlock: Add virt_spin_lock() support for KVM guest [V11,09/17] riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup [V11,10/17] riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors [V11,11/17] RISC-V: paravirt: pvqspinlock: Add paravirt qspinlock skeleton [V11,12/17] RISC-V: paravirt: pvqspinlock: Add nopvspin kernel parameter [V11,13/17] RISC-V: paravirt: pvqspinlock: Add SBI implementation [V11,14/17] RISC-V: paravirt: pvqspinlock: Add kconfig entry [V11,15/17] RISC-V: paravirt: pvqspinlock: Add trace point for pv_kick/wait [V11,16/17] RISC-V: paravirt: pvqspinlock: KVM: Add paravirt qspinlock skeleton [V11,17/17] RISC-V: paravirt: pvqspinlock: KVM: Implement kvm_sbi_ext_pvlock_kick_cpu()

Guo Ren Sept. 10, 2023, 8:28 a.m. UTC

From: Guo Ren <guoren@linux.alibaba.com>

patch[1 - 10]: Native   qspinlock
patch[11 -17]: Paravirt qspinlock

patch[4]: Add prefetchw in qspinlock's xchg_tail when cpus >= 16k

This series based on:
 - [RFC PATCH v5 0/5] Rework & improve riscv cmpxchg.h and atomic.h
   https://lore.kernel.org/linux-riscv/20230810040349.92279-2-leobras@redhat.com/
 - [PATCH V3] asm-generic: ticket-lock: Optimize arch_spin_value_unlocked
   https://lore.kernel.org/linux-riscv/20230908154339.3250567-1-guoren@kernel.org/ 

I merge them into sg2042-master branch, then you could directly try it on
sg2042 hardware platform:

https://github.com/guoren83/linux/tree/sg2042-master-qspinlock-64ilp32_v5

Use sophgo_mango_ubuntu_defconfig for sg2042 64/128 cores hardware
platform.

Native qspinlock
================

This time we've proved the qspinlock on th1520 [1] & sg2042 [2], which
gives stability and performance improvement. All T-HEAD processors have
a strong LR/SC forward progress guarantee than the requirements of the
ISA, which could satisfy the xchg_tail of native_qspinlock. Now,
qspinlock has been run with us for more than 1 year, and we have enough
confidence to enable it for all the T-HEAD processors. Of causes, we
found a livelock problem with the qspinlock lock torture test from the
CPU store merge buffer delay mechanism, which caused the queued spinlock
becomes a dead ring and RCU warning to come out. We introduce a custom
WRITE_ONCE to solve this. Do we need explicit ISA instruction to signal
it? Or let hardware handle this.

We've tested the patch on SOPHGO sg2042 & th1520 and passed the stress
test on Fedora & Ubuntu & OpenEuler ... Here is the performance
comparison between qspinlock and ticket_lock on sg2042 (64 cores):

sysbench test=threads threads=32 yields=100 lock=8 (+13.8%):
  queued_spinlock 0.5109/0.00
  ticket_spinlock 0.5814/0.00

perf futex/hash (+6.7%):
  queued_spinlock 1444393 operations/sec (+- 0.09%)
  ticket_spinlock 1353215 operations/sec (+- 0.15%)

perf futex/wake-parallel (+8.6%):
  queued_spinlock (waking 1/64 threads) in 0.0253 ms (+-2.90%)
  ticket_spinlock (waking 1/64 threads) in 0.0275 ms (+-3.12%)

perf futex/requeue (+4.2%):
  queued_spinlock Requeued 64 of 64 threads in 0.0785 ms (+-0.55%)
  ticket_spinlock Requeued 64 of 64 threads in 0.0818 ms (+-4.12%)

System Benchmarks (+6.4%)
  queued_spinlock:
    System Benchmarks Index Values               BASELINE       RESULT    INDEX
    Dhrystone 2 using register variables         116700.0  628613745.4  53865.8
    Double-Precision Whetstone                       55.0     182422.8  33167.8
    Execl Throughput                                 43.0      13116.6   3050.4
    File Copy 1024 bufsize 2000 maxblocks          3960.0    7762306.2  19601.8
    File Copy 256 bufsize 500 maxblocks            1655.0    3417556.8  20649.9
    File Copy 4096 bufsize 8000 maxblocks          5800.0    7427995.7  12806.9
    Pipe Throughput                               12440.0   23058600.5  18535.9
    Pipe-based Context Switching                   4000.0    2835617.7   7089.0
    Process Creation                                126.0      12537.3    995.0
    Shell Scripts (1 concurrent)                     42.4      57057.4  13456.9
    Shell Scripts (8 concurrent)                      6.0       7367.1  12278.5
    System Call Overhead                          15000.0   33308301.3  22205.5
                                                                       ========
    System Benchmarks Index Score                                       12426.1

  ticket_spinlock:
    System Benchmarks Index Values               BASELINE       RESULT    INDEX
    Dhrystone 2 using register variables         116700.0  626541701.9  53688.2
    Double-Precision Whetstone                       55.0     181921.0  33076.5
    Execl Throughput                                 43.0      12625.1   2936.1
    File Copy 1024 bufsize 2000 maxblocks          3960.0    6553792.9  16550.0
    File Copy 256 bufsize 500 maxblocks            1655.0    3189231.6  19270.3
    File Copy 4096 bufsize 8000 maxblocks          5800.0    7221277.0  12450.5
    Pipe Throughput                               12440.0   20594018.7  16554.7
    Pipe-based Context Switching                   4000.0    2571117.7   6427.8
    Process Creation                                126.0      10798.4    857.0
    Shell Scripts (1 concurrent)                     42.4      57227.5  13497.1
    Shell Scripts (8 concurrent)                      6.0       7329.2  12215.3
    System Call Overhead                          15000.0   30766778.4  20511.2
                                                                       ========
    System Benchmarks Index Score                                       11670.7

The qspinlock has a significant improvement on SOPHGO SG2042 64
cores platform than the ticket_lock.

Paravirt qspinlock
==================

We implemented kvm_kick_cpu/kvm_wait_cpu and add tracepoints to observe the
behaviors. Also, introduce a new SBI extension SBI_EXT_PVLOCK (0xAB0401). If the
name and number are approved, I will send a formal proposal to the SBI spec.

Changlog:
V11:
 - Based on Leonardo Bras's cmpxchg_small patches v5.
 - Based on Guo Ren's Optimize arch_spin_value_unlocked patch v3.
 - Remove abusing alternative framework and use jump_label instead.
 - Introduce prefetch.w to improve T-HEAD processors' LR/SC forward progress
   guarantee.
 - Optimize qspinlock xchg_tail when NR_CPUS >= 16K.

V10:
https://lore.kernel.org/linux-riscv/20230802164701.192791-1-guoren@kernel.org/
 - Using an alternative framework instead of static_key_branch in the
   asm/spinlock.h.
 - Fixup store merge buffer problem, which causes qspinlock lock
   torture test livelock.
 - Add paravirt qspinlock support, include KVM backend
 - Add Compact NUMA-awared qspinlock support 

V9:
https://lore.kernel.org/linux-riscv/20220808071318.3335746-1-guoren@kernel.org/
 - Cleanup generic ticket-lock code, (Using smp_mb__after_spinlock as
   RCsc)
 - Add qspinlock and combo-lock for riscv
 - Add qspinlock to openrisc
 - Use generic header in csky
 - Optimize cmpxchg & atomic code

V8:
https://lore.kernel.org/linux-riscv/20220724122517.1019187-1-guoren@kernel.org/
 - Coding convention ticket fixup
 - Move combo spinlock into riscv and simply asm-generic/spinlock.h
 - Fixup xchg16 with wrong return value
 - Add csky qspinlock
 - Add combo & qspinlock & ticket-lock comparison
 - Clean up unnecessary riscv acquire and release definitions
 - Enable ARCH_INLINE_READ*/WRITE*/SPIN* for riscv & csky

V7:
https://lore.kernel.org/linux-riscv/20220628081946.1999419-1-guoren@kernel.org/
 - Add combo spinlock (ticket & queued) support
 - Rename ticket_spinlock.h
 - Remove unnecessary atomic_read in ticket_spin_value_unlocked  

V6:
https://lore.kernel.org/linux-riscv/20220621144920.2945595-1-guoren@kernel.org/
 - Fixup Clang compile problem Reported-by: kernel test robot
 - Cleanup asm-generic/spinlock.h
 - Remove changelog in patch main comment part, suggested by
   Conor.Dooley
 - Remove "default y if NUMA" in Kconfig

V5:
https://lore.kernel.org/linux-riscv/20220620155404.1968739-1-guoren@kernel.org/
 - Update comment with RISC-V forward guarantee feature.
 - Back to V3 direction and optimize asm code.

V4:
https://lore.kernel.org/linux-riscv/1616868399-82848-4-git-send-email-guoren@kernel.org/
 - Remove custom sub-word xchg implementation
 - Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32 in locking/qspinlock

V3:
https://lore.kernel.org/linux-riscv/1616658937-82063-1-git-send-email-guoren@kernel.org/
 - Coding convention by Peter Zijlstra's advices

V2:
https://lore.kernel.org/linux-riscv/1606225437-22948-2-git-send-email-guoren@kernel.org/
 - Coding convention in cmpxchg.h
 - Re-implement short xchg
 - Remove char & cmpxchg implementations

V1:
https://lore.kernel.org/linux-riscv/20190211043829.30096-1-michaeljclark@mac.com/
 - Using cmpxchg loop to implement sub-word atomic


Guo Ren (17):
  asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
  asm-generic: ticket-lock: Move into ticket_spinlock.h
  riscv: Use Zicbop in arch_xchg when available
  locking/qspinlock: Improve xchg_tail for number of cpus >= 16k
  riscv: qspinlock: Add basic queued_spinlock support
  riscv: qspinlock: Introduce combo spinlock
  riscv: qspinlock: Introduce qspinlock param for command line
  riscv: qspinlock: Add virt_spin_lock() support for KVM guest
  riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
  riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
  RISC-V: paravirt: pvqspinlock: Add paravirt qspinlock skeleton
  RISC-V: paravirt: pvqspinlock: Add nopvspin kernel parameter
  RISC-V: paravirt: pvqspinlock: Add SBI implementation
  RISC-V: paravirt: pvqspinlock: Add kconfig entry
  RISC-V: paravirt: pvqspinlock: Add trace point for pv_kick/wait
  RISC-V: paravirt: pvqspinlock: KVM: Add paravirt qspinlock skeleton
  RISC-V: paravirt: pvqspinlock: KVM: Implement
    kvm_sbi_ext_pvlock_kick_cpu()

 .../admin-guide/kernel-parameters.txt         |   8 +-
 arch/riscv/Kconfig                            |  50 ++++++++
 arch/riscv/Kconfig.errata                     |  19 +++
 arch/riscv/errata/thead/errata.c              |  29 +++++
 arch/riscv/include/asm/Kbuild                 |   2 +-
 arch/riscv/include/asm/cmpxchg.h              |   4 +-
 arch/riscv/include/asm/errata_list.h          |  13 --
 arch/riscv/include/asm/hwcap.h                |   1 +
 arch/riscv/include/asm/insn-def.h             |   5 +
 arch/riscv/include/asm/kvm_vcpu_sbi.h         |   1 +
 arch/riscv/include/asm/processor.h            |  13 ++
 arch/riscv/include/asm/qspinlock.h            |  35 ++++++
 arch/riscv/include/asm/qspinlock_paravirt.h   |  29 +++++
 arch/riscv/include/asm/rwonce.h               |  24 ++++
 arch/riscv/include/asm/sbi.h                  |  14 +++
 arch/riscv/include/asm/spinlock.h             | 113 ++++++++++++++++++
 arch/riscv/include/asm/vendorid_list.h        |  14 +++
 arch/riscv/include/uapi/asm/kvm.h             |   1 +
 arch/riscv/kernel/Makefile                    |   1 +
 arch/riscv/kernel/cpufeature.c                |   1 +
 arch/riscv/kernel/qspinlock_paravirt.c        |  83 +++++++++++++
 arch/riscv/kernel/sbi.c                       |   2 +-
 arch/riscv/kernel/setup.c                     |  60 ++++++++++
 .../kernel/trace_events_filter_paravirt.h     |  60 ++++++++++
 arch/riscv/kvm/Makefile                       |   1 +
 arch/riscv/kvm/vcpu_sbi.c                     |   4 +
 arch/riscv/kvm/vcpu_sbi_pvlock.c              |  57 +++++++++
 include/asm-generic/rwonce.h                  |   2 +
 include/asm-generic/spinlock.h                |  87 +-------------
 include/asm-generic/spinlock_types.h          |  12 +-
 include/asm-generic/ticket_spinlock.h         | 103 ++++++++++++++++
 kernel/locking/qspinlock.c                    |   5 +-
 32 files changed, 739 insertions(+), 114 deletions(-)
 create mode 100644 arch/riscv/include/asm/qspinlock.h
 create mode 100644 arch/riscv/include/asm/qspinlock_paravirt.h
 create mode 100644 arch/riscv/include/asm/rwonce.h
 create mode 100644 arch/riscv/include/asm/spinlock.h
 create mode 100644 arch/riscv/kernel/qspinlock_paravirt.c
 create mode 100644 arch/riscv/kernel/trace_events_filter_paravirt.h
 create mode 100644 arch/riscv/kvm/vcpu_sbi_pvlock.c
 create mode 100644 include/asm-generic/ticket_spinlock.h

Conor Dooley Sept. 10, 2023, 8:58 a.m. UTC | #1

On Sun, Sep 10, 2023 at 04:28:54AM -0400, guoren@kernel.org wrote:

> Changlog:
> V11:
>  - Based on Leonardo Bras's cmpxchg_small patches v5.
>  - Based on Guo Ren's Optimize arch_spin_value_unlocked patch v3.
>  - Remove abusing alternative framework and use jump_label instead.

btw, I didn't say that using alternatives was the problem, it was
abusing the errata framework to perform feature detection that I had
a problem with. That's not changed in v11.

A stronger forward progress guarantee is not an erratum, AFAICT.

>  - Introduce prefetch.w to improve T-HEAD processors' LR/SC forward progress
>    guarantee.
>  - Optimize qspinlock xchg_tail when NR_CPUS >= 16K.

Guo Ren Sept. 10, 2023, 9:16 a.m. UTC | #2

On Sun, Sep 10, 2023 at 4:58 PM Conor Dooley <conor@kernel.org> wrote:
>
> On Sun, Sep 10, 2023 at 04:28:54AM -0400, guoren@kernel.org wrote:
>
> > Changlog:
> > V11:
> >  - Based on Leonardo Bras's cmpxchg_small patches v5.
> >  - Based on Guo Ren's Optimize arch_spin_value_unlocked patch v3.
> >  - Remove abusing alternative framework and use jump_label instead.
>
> btw, I didn't say that using alternatives was the problem, it was
> abusing the errata framework to perform feature detection that I had
> a problem with. That's not changed in v11.
I've removed errata feature detection. The only related patches are:
 - riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
 - riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors

Which one is your concern? Could you reply on the exact patch thread? Thx.

>
> A stronger forward progress guarantee is not an erratum, AFAICT.
Sorry, there is no erratum of "stronger forward progress guarantee" in the V11.

>
> >  - Introduce prefetch.w to improve T-HEAD processors' LR/SC forward progress
> >    guarantee.
> >  - Optimize qspinlock xchg_tail when NR_CPUS >= 16K.



--
Best Regards
 Guo Ren

Guo Ren Sept. 10, 2023, 9:20 a.m. UTC | #3

On Sun, Sep 10, 2023 at 5:16 PM Guo Ren <guoren@kernel.org> wrote:
>
> On Sun, Sep 10, 2023 at 4:58 PM Conor Dooley <conor@kernel.org> wrote:
> >
> > On Sun, Sep 10, 2023 at 04:28:54AM -0400, guoren@kernel.org wrote:
> >
> > > Changlog:
> > > V11:
> > >  - Based on Leonardo Bras's cmpxchg_small patches v5.
> > >  - Based on Guo Ren's Optimize arch_spin_value_unlocked patch v3.
> > >  - Remove abusing alternative framework and use jump_label instead.
> >
> > btw, I didn't say that using alternatives was the problem, it was
> > abusing the errata framework to perform feature detection that I had
> > a problem with. That's not changed in v11.
> I've removed errata feature detection. The only related patches are:
>  - riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
>  - riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
And this one:
 - riscv: Use Zicbop in arch_xchg when available

>
> Which one is your concern? Could you reply on the exact patch thread? Thx.
>
> >
> > A stronger forward progress guarantee is not an erratum, AFAICT.
> Sorry, there is no erratum of "stronger forward progress guarantee" in the V11.
>
> >
> > >  - Introduce prefetch.w to improve T-HEAD processors' LR/SC forward progress
> > >    guarantee.
> > >  - Optimize qspinlock xchg_tail when NR_CPUS >= 16K.
>
>
>
> --
> Best Regards
>  Guo Ren

Conor Dooley Sept. 10, 2023, 9:31 a.m. UTC | #4

On Sun, Sep 10, 2023 at 05:16:46PM +0800, Guo Ren wrote:
> On Sun, Sep 10, 2023 at 4:58 PM Conor Dooley <conor@kernel.org> wrote:
> >
> > On Sun, Sep 10, 2023 at 04:28:54AM -0400, guoren@kernel.org wrote:
> >
> > > Changlog:
> > > V11:
> > >  - Based on Leonardo Bras's cmpxchg_small patches v5.
> > >  - Based on Guo Ren's Optimize arch_spin_value_unlocked patch v3.
> > >  - Remove abusing alternative framework and use jump_label instead.
> >
> > btw, I didn't say that using alternatives was the problem, it was
> > abusing the errata framework to perform feature detection that I had
> > a problem with. That's not changed in v11.
> I've removed errata feature detection. The only related patches are:
>  - riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
>  - riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
> 
> Which one is your concern? Could you reply on the exact patch thread? Thx.

riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors

Please go back and re-read the comments I left on v11 about using the
errata code for feature detection.

> > A stronger forward progress guarantee is not an erratum, AFAICT.

> Sorry, there is no erratum of "stronger forward progress guarantee" in the V11.

"riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors" still
uses the errata framework to detect the presence of the stronger forward
progress guarantee in v11.

Guo Ren Sept. 10, 2023, 9:49 a.m. UTC | #5

On Sun, Sep 10, 2023 at 5:32 PM Conor Dooley <conor@kernel.org> wrote:
>
> On Sun, Sep 10, 2023 at 05:16:46PM +0800, Guo Ren wrote:
> > On Sun, Sep 10, 2023 at 4:58 PM Conor Dooley <conor@kernel.org> wrote:
> > >
> > > On Sun, Sep 10, 2023 at 04:28:54AM -0400, guoren@kernel.org wrote:
> > >
> > > > Changlog:
> > > > V11:
> > > >  - Based on Leonardo Bras's cmpxchg_small patches v5.
> > > >  - Based on Guo Ren's Optimize arch_spin_value_unlocked patch v3.
> > > >  - Remove abusing alternative framework and use jump_label instead.
> > >
> > > btw, I didn't say that using alternatives was the problem, it was
> > > abusing the errata framework to perform feature detection that I had
> > > a problem with. That's not changed in v11.
> > I've removed errata feature detection. The only related patches are:
> >  - riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
> >  - riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
> >
> > Which one is your concern? Could you reply on the exact patch thread? Thx.
>
> riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
>
> Please go back and re-read the comments I left on v11 about using the
> errata code for feature detection.
>
> > > A stronger forward progress guarantee is not an erratum, AFAICT.
>
> > Sorry, there is no erratum of "stronger forward progress guarantee" in the V11.
>
> "riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors" still
> uses the errata framework to detect the presence of the stronger forward
> progress guarantee in v11.
Oh, thx for pointing it out. I could replace it with this:

diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 88690751f2ee..4be92766d3e3 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -310,7 +310,8 @@ static void __init riscv_spinlock_init(void)
 {
 #ifdef CONFIG_RISCV_COMBO_SPINLOCKS
        if (!enable_qspinlock_key &&
-           (sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM)) {
+           (sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM) &&
+           (sbi_get_mvendorid() != THEAD_VENDOR_ID)) {
                static_branch_disable(&combo_qspinlock_key);
                pr_info("Ticket spinlock: enabled\n");
        } else {

Conor Dooley Sept. 10, 2023, 7:45 p.m. UTC | #6

On Sun, Sep 10, 2023 at 05:49:13PM +0800, Guo Ren wrote:
> On Sun, Sep 10, 2023 at 5:32 PM Conor Dooley <conor@kernel.org> wrote:
> >
> > On Sun, Sep 10, 2023 at 05:16:46PM +0800, Guo Ren wrote:
> > > On Sun, Sep 10, 2023 at 4:58 PM Conor Dooley <conor@kernel.org> wrote:
> > > >
> > > > On Sun, Sep 10, 2023 at 04:28:54AM -0400, guoren@kernel.org wrote:
> > > >
> > > > > Changlog:
> > > > > V11:
> > > > >  - Based on Leonardo Bras's cmpxchg_small patches v5.
> > > > >  - Based on Guo Ren's Optimize arch_spin_value_unlocked patch v3.
> > > > >  - Remove abusing alternative framework and use jump_label instead.
> > > >
> > > > btw, I didn't say that using alternatives was the problem, it was
> > > > abusing the errata framework to perform feature detection that I had
> > > > a problem with. That's not changed in v11.
> > > I've removed errata feature detection. The only related patches are:
> > >  - riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
> > >  - riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
> > >
> > > Which one is your concern? Could you reply on the exact patch thread? Thx.
> >
> > riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
> >
> > Please go back and re-read the comments I left on v11 about using the
> > errata code for feature detection.
> >
> > > > A stronger forward progress guarantee is not an erratum, AFAICT.
> >
> > > Sorry, there is no erratum of "stronger forward progress guarantee" in the V11.
> >
> > "riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors" still
> > uses the errata framework to detect the presence of the stronger forward
> > progress guarantee in v11.
> Oh, thx for pointing it out. I could replace it with this:
> 
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index 88690751f2ee..4be92766d3e3 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -310,7 +310,8 @@ static void __init riscv_spinlock_init(void)
>  {
>  #ifdef CONFIG_RISCV_COMBO_SPINLOCKS
>         if (!enable_qspinlock_key &&
> -           (sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM)) {
> +           (sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM) &&
> +           (sbi_get_mvendorid() != THEAD_VENDOR_ID)) {
>                 static_branch_disable(&combo_qspinlock_key);
>                 pr_info("Ticket spinlock: enabled\n");
>         } else {

As I said on v11, I am opposed to feature probing using mvendorid & Co,
partially due to the exact sort of check here to see if the kernel is
running as a KVM guest. IMO, whether a platform has this stronger
guarantee needs to be communicated by firmware, using ACPI or DT.
I made some comments on v11, referring similar discussion about the
thead vector stuff. Please go take a look at that.

Guo Ren Sept. 11, 2023, 3:36 a.m. UTC | #7

On Mon, Sep 11, 2023 at 3:45 AM Conor Dooley <conor@kernel.org> wrote:
>
> On Sun, Sep 10, 2023 at 05:49:13PM +0800, Guo Ren wrote:
> > On Sun, Sep 10, 2023 at 5:32 PM Conor Dooley <conor@kernel.org> wrote:
> > >
> > > On Sun, Sep 10, 2023 at 05:16:46PM +0800, Guo Ren wrote:
> > > > On Sun, Sep 10, 2023 at 4:58 PM Conor Dooley <conor@kernel.org> wrote:
> > > > >
> > > > > On Sun, Sep 10, 2023 at 04:28:54AM -0400, guoren@kernel.org wrote:
> > > > >
> > > > > > Changlog:
> > > > > > V11:
> > > > > >  - Based on Leonardo Bras's cmpxchg_small patches v5.
> > > > > >  - Based on Guo Ren's Optimize arch_spin_value_unlocked patch v3.
> > > > > >  - Remove abusing alternative framework and use jump_label instead.
> > > > >
> > > > > btw, I didn't say that using alternatives was the problem, it was
> > > > > abusing the errata framework to perform feature detection that I had
> > > > > a problem with. That's not changed in v11.
> > > > I've removed errata feature detection. The only related patches are:
> > > >  - riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
> > > >  - riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
> > > >
> > > > Which one is your concern? Could you reply on the exact patch thread? Thx.
> > >
> > > riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
> > >
> > > Please go back and re-read the comments I left on v11 about using the
> > > errata code for feature detection.
> > >
> > > > > A stronger forward progress guarantee is not an erratum, AFAICT.
> > >
> > > > Sorry, there is no erratum of "stronger forward progress guarantee" in the V11.
> > >
> > > "riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors" still
> > > uses the errata framework to detect the presence of the stronger forward
> > > progress guarantee in v11.
> > Oh, thx for pointing it out. I could replace it with this:
> >
> > diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> > index 88690751f2ee..4be92766d3e3 100644
> > --- a/arch/riscv/kernel/setup.c
> > +++ b/arch/riscv/kernel/setup.c
> > @@ -310,7 +310,8 @@ static void __init riscv_spinlock_init(void)
> >  {
> >  #ifdef CONFIG_RISCV_COMBO_SPINLOCKS
> >         if (!enable_qspinlock_key &&
> > -           (sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM)) {
> > +           (sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM) &&
> > +           (sbi_get_mvendorid() != THEAD_VENDOR_ID)) {
> >                 static_branch_disable(&combo_qspinlock_key);
> >                 pr_info("Ticket spinlock: enabled\n");
> >         } else {
>
> As I said on v11, I am opposed to feature probing using mvendorid & Co,
> partially due to the exact sort of check here to see if the kernel is
> running as a KVM guest. IMO, whether a platform has this stronger
KVM can't use any fairness lock, so forcing it using a Test-Set lock
or paravirt qspinlock is the right way. KVM is not a vendor platform.

> guarantee needs to be communicated by firmware, using ACPI or DT.
> I made some comments on v11, referring similar discussion about the
> thead vector stuff. Please go take a look at that.
I prefer forcing T-HEAD processors using qspinlock, but if all people
thought it must be in the ACPI or DT, I would compromise. Then, I
would delete the qspinlock cmdline param patch and move it into DT.

By the way, what's the kind of DT format? How about:
        cpus {
                #address-cells = <1>;
                #size-cells = <0>;
+              qspinlock;
                cpu0: cpu@0 {
                        compatible = "sifive,bullet0", "riscv";
                        device_type = "cpu";
                        i-cache-block-size = <64>;
                        i-cache-sets = <128>;

--
Best Regards
 Guo Ren

Conor Dooley Sept. 11, 2023, 12:52 p.m. UTC | #8

On Mon, Sep 11, 2023 at 11:36:27AM +0800, Guo Ren wrote:
> On Mon, Sep 11, 2023 at 3:45 AM Conor Dooley <conor@kernel.org> wrote:
> >
> > On Sun, Sep 10, 2023 at 05:49:13PM +0800, Guo Ren wrote:
> > > On Sun, Sep 10, 2023 at 5:32 PM Conor Dooley <conor@kernel.org> wrote:
> > > >
> > > > On Sun, Sep 10, 2023 at 05:16:46PM +0800, Guo Ren wrote:
> > > > > On Sun, Sep 10, 2023 at 4:58 PM Conor Dooley <conor@kernel.org> wrote:
> > > > > >
> > > > > > On Sun, Sep 10, 2023 at 04:28:54AM -0400, guoren@kernel.org wrote:
> > > > > >
> > > > > > > Changlog:
> > > > > > > V11:
> > > > > > >  - Based on Leonardo Bras's cmpxchg_small patches v5.
> > > > > > >  - Based on Guo Ren's Optimize arch_spin_value_unlocked patch v3.
> > > > > > >  - Remove abusing alternative framework and use jump_label instead.
> > > > > >
> > > > > > btw, I didn't say that using alternatives was the problem, it was
> > > > > > abusing the errata framework to perform feature detection that I had
> > > > > > a problem with. That's not changed in v11.
> > > > > I've removed errata feature detection. The only related patches are:
> > > > >  - riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
> > > > >  - riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
> > > > >
> > > > > Which one is your concern? Could you reply on the exact patch thread? Thx.
> > > >
> > > > riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
> > > >
> > > > Please go back and re-read the comments I left on v11 about using the
> > > > errata code for feature detection.
> > > >
> > > > > > A stronger forward progress guarantee is not an erratum, AFAICT.
> > > >
> > > > > Sorry, there is no erratum of "stronger forward progress guarantee" in the V11.
> > > >
> > > > "riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors" still
> > > > uses the errata framework to detect the presence of the stronger forward
> > > > progress guarantee in v11.
> > > Oh, thx for pointing it out. I could replace it with this:
> > >
> > > diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> > > index 88690751f2ee..4be92766d3e3 100644
> > > --- a/arch/riscv/kernel/setup.c
> > > +++ b/arch/riscv/kernel/setup.c
> > > @@ -310,7 +310,8 @@ static void __init riscv_spinlock_init(void)
> > >  {
> > >  #ifdef CONFIG_RISCV_COMBO_SPINLOCKS
> > >         if (!enable_qspinlock_key &&
> > > -           (sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM)) {
> > > +           (sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM) &&
> > > +           (sbi_get_mvendorid() != THEAD_VENDOR_ID)) {
> > >                 static_branch_disable(&combo_qspinlock_key);
> > >                 pr_info("Ticket spinlock: enabled\n");
> > >         } else {
> >
> > As I said on v11, I am opposed to feature probing using mvendorid & Co,
> > partially due to the exact sort of check here to see if the kernel is
> > running as a KVM guest. IMO, whether a platform has this stronger

> KVM can't use any fairness lock, so forcing it using a Test-Set lock
> or paravirt qspinlock is the right way. KVM is not a vendor platform.

My point is that KVM should be telling the guest what additional features
it is capable of using, rather than the kernel making some assumptions
based on$vendorid etc that are invalid when the kernel is running as a
KVM guest.
If the mvendorid etc related assumptions were dropped, the kernel would
then default away from your qspinlock & there'd not be a need to
special-case KVM AFAICT.

> > guarantee needs to be communicated by firmware, using ACPI or DT.
> > I made some comments on v11, referring similar discussion about the
> > thead vector stuff. Please go take a look at that.
> I prefer forcing T-HEAD processors using qspinlock, but if all people
> thought it must be in the ACPI or DT, I would compromise. Then, I
> would delete the qspinlock cmdline param patch and move it into DT.
> 
> By the way, what's the kind of DT format? How about:

I added the new "riscv,isa-extensions" property in part to make
communicating vendor extensions like this easier. Please try to use
that. "qspinlock" is software configuration though, the vendor extension
should focus on the guarantee of strong forward progress, since that is
the non-standard aspect of your IP.

A commandline property may still be desirable, to control the locking
method used, since the DT should be a description of the hardware, not
for configuring software policy in your operating system.

Thanks,
Conor.

>         cpus {
>                 #address-cells = <1>;
>                 #size-cells = <0>;
> +              qspinlock;
>                 cpu0: cpu@0 {
>                         compatible = "sifive,bullet0", "riscv";
>                         device_type = "cpu";
>                         i-cache-block-size = <64>;
>                         i-cache-sets = <128>;
> 
> --
> Best Regards
>  Guo Ren

Guo Ren Sept. 12, 2023, 1:33 a.m. UTC | #9

On Mon, Sep 11, 2023 at 8:53 PM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> On Mon, Sep 11, 2023 at 11:36:27AM +0800, Guo Ren wrote:
> > On Mon, Sep 11, 2023 at 3:45 AM Conor Dooley <conor@kernel.org> wrote:
> > >
> > > On Sun, Sep 10, 2023 at 05:49:13PM +0800, Guo Ren wrote:
> > > > On Sun, Sep 10, 2023 at 5:32 PM Conor Dooley <conor@kernel.org> wrote:
> > > > >
> > > > > On Sun, Sep 10, 2023 at 05:16:46PM +0800, Guo Ren wrote:
> > > > > > On Sun, Sep 10, 2023 at 4:58 PM Conor Dooley <conor@kernel.org> wrote:
> > > > > > >
> > > > > > > On Sun, Sep 10, 2023 at 04:28:54AM -0400, guoren@kernel.org wrote:
> > > > > > >
> > > > > > > > Changlog:
> > > > > > > > V11:
> > > > > > > >  - Based on Leonardo Bras's cmpxchg_small patches v5.
> > > > > > > >  - Based on Guo Ren's Optimize arch_spin_value_unlocked patch v3.
> > > > > > > >  - Remove abusing alternative framework and use jump_label instead.
> > > > > > >
> > > > > > > btw, I didn't say that using alternatives was the problem, it was
> > > > > > > abusing the errata framework to perform feature detection that I had
> > > > > > > a problem with. That's not changed in v11.
> > > > > > I've removed errata feature detection. The only related patches are:
> > > > > >  - riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
> > > > > >  - riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
> > > > > >
> > > > > > Which one is your concern? Could you reply on the exact patch thread? Thx.
> > > > >
> > > > > riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
> > > > >
> > > > > Please go back and re-read the comments I left on v11 about using the
> > > > > errata code for feature detection.
> > > > >
> > > > > > > A stronger forward progress guarantee is not an erratum, AFAICT.
> > > > >
> > > > > > Sorry, there is no erratum of "stronger forward progress guarantee" in the V11.
> > > > >
> > > > > "riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors" still
> > > > > uses the errata framework to detect the presence of the stronger forward
> > > > > progress guarantee in v11.
> > > > Oh, thx for pointing it out. I could replace it with this:
> > > >
> > > > diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> > > > index 88690751f2ee..4be92766d3e3 100644
> > > > --- a/arch/riscv/kernel/setup.c
> > > > +++ b/arch/riscv/kernel/setup.c
> > > > @@ -310,7 +310,8 @@ static void __init riscv_spinlock_init(void)
> > > >  {
> > > >  #ifdef CONFIG_RISCV_COMBO_SPINLOCKS
> > > >         if (!enable_qspinlock_key &&
> > > > -           (sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM)) {
> > > > +           (sbi_get_firmware_id() != SBI_EXT_BASE_IMPL_ID_KVM) &&
> > > > +           (sbi_get_mvendorid() != THEAD_VENDOR_ID)) {
> > > >                 static_branch_disable(&combo_qspinlock_key);
> > > >                 pr_info("Ticket spinlock: enabled\n");
> > > >         } else {
> > >
> > > As I said on v11, I am opposed to feature probing using mvendorid & Co,
> > > partially due to the exact sort of check here to see if the kernel is
> > > running as a KVM guest. IMO, whether a platform has this stronger
>
> > KVM can't use any fairness lock, so forcing it using a Test-Set lock
> > or paravirt qspinlock is the right way. KVM is not a vendor platform.
>
> My point is that KVM should be telling the guest what additional features
> it is capable of using, rather than the kernel making some assumptions
> based on$vendorid etc that are invalid when the kernel is running as a
> KVM guest.
> If the mvendorid etc related assumptions were dropped, the kernel would
> then default away from your qspinlock & there'd not be a need to
> special-case KVM AFAICT.
>
> > > guarantee needs to be communicated by firmware, using ACPI or DT.
> > > I made some comments on v11, referring similar discussion about the
> > > thead vector stuff. Please go take a look at that.
> > I prefer forcing T-HEAD processors using qspinlock, but if all people
> > thought it must be in the ACPI or DT, I would compromise. Then, I
> > would delete the qspinlock cmdline param patch and move it into DT.
> >
> > By the way, what's the kind of DT format? How about:
>
> I added the new "riscv,isa-extensions" property in part to make
> communicating vendor extensions like this easier. Please try to use
> that. "qspinlock" is software configuration though, the vendor extension
> should focus on the guarantee of strong forward progress, since that is
> the non-standard aspect of your IP.
The qspinlock contains three paths:
 - Native qspinlock, this is your strong forward progress.
 - virt_spin_lock, for KVM guest when paravirt qspinlock disabled.
   https://lore.kernel.org/linux-riscv/20230910082911.3378782-9-guoren@kernel.org/
 - paravirt qspinlock, for KVM guest

So, we need a software configuration here, "riscv,isa-extensions" is
all about vendor extension.

>
> A commandline property may still be desirable, to control the locking
> method used, since the DT should be a description of the hardware, not
> for configuring software policy in your operating system.
Okay, I would keep the cmdline property.

>
> Thanks,
> Conor.
>
> >         cpus {
> >                 #address-cells = <1>;
> >                 #size-cells = <0>;
> > +              qspinlock;
> >                 cpu0: cpu@0 {
> >                         compatible = "sifive,bullet0", "riscv";
> >                         device_type = "cpu";
> >                         i-cache-block-size = <64>;
> >                         i-cache-sets = <128>;
> >
> > --
> > Best Regards
> >  Guo Ren

Conor Dooley Sept. 12, 2023, 8:07 a.m. UTC | #10

On Tue, Sep 12, 2023 at 09:33:57AM +0800, Guo Ren wrote:
> On Mon, Sep 11, 2023 at 8:53 PM Conor Dooley <conor.dooley@microchip.com> wrote:

> > I added the new "riscv,isa-extensions" property in part to make
> > communicating vendor extensions like this easier. Please try to use
> > that. "qspinlock" is software configuration though, the vendor extension
> > should focus on the guarantee of strong forward progress, since that is
> > the non-standard aspect of your IP.
> 
> The qspinlock contains three paths:
>  - Native qspinlock, this is your strong forward progress.
>  - virt_spin_lock, for KVM guest when paravirt qspinlock disabled.
>    https://lore.kernel.org/linux-riscv/20230910082911.3378782-9-guoren@kernel.org/
>  - paravirt qspinlock, for KVM guest
> 
> So, we need a software configuration here, "riscv,isa-extensions" is
> all about vendor extension.

Ah right, yes it would only be able to be used to determine whether or
not the platform is capable of supporting these spinlocks, not whether or
not the kernel is a guest. I think I misinterpreted that snippet you posted,
thinking you were trying to disable your new spinlock for KVM, sorry.
On that note though, what about other sorts of guests? Will non-KVM
guests not also want to use this virt spinlock?

Thanks,
Conor.

Guo Ren Sept. 12, 2023, 10:58 a.m. UTC | #11

On Tue, Sep 12, 2023 at 4:08 PM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> On Tue, Sep 12, 2023 at 09:33:57AM +0800, Guo Ren wrote:
> > On Mon, Sep 11, 2023 at 8:53 PM Conor Dooley <conor.dooley@microchip.com> wrote:
>
> > > I added the new "riscv,isa-extensions" property in part to make
> > > communicating vendor extensions like this easier. Please try to use
> > > that. "qspinlock" is software configuration though, the vendor extension
> > > should focus on the guarantee of strong forward progress, since that is
> > > the non-standard aspect of your IP.
> >
> > The qspinlock contains three paths:
> >  - Native qspinlock, this is your strong forward progress.
> >  - virt_spin_lock, for KVM guest when paravirt qspinlock disabled.
> >    https://lore.kernel.org/linux-riscv/20230910082911.3378782-9-guoren@kernel.org/
> >  - paravirt qspinlock, for KVM guest
> >
> > So, we need a software configuration here, "riscv,isa-extensions" is
> > all about vendor extension.
>
> Ah right, yes it would only be able to be used to determine whether or
> not the platform is capable of supporting these spinlocks, not whether or
> not the kernel is a guest. I think I misinterpreted that snippet you posted,
> thinking you were trying to disable your new spinlock for KVM, sorry.
> On that note though, what about other sorts of guests? Will non-KVM
> guests not also want to use this virt spinlock?
I only put KVM guests here, and I can't answer other hypervisor that
is another topic.

>
> Thanks,
> Conor.

Leonardo Bras Nov. 6, 2023, 8:42 p.m. UTC | #12

On Sun, Sep 10, 2023 at 04:28:54AM -0400, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
> 
> patch[1 - 10]: Native   qspinlock
> patch[11 -17]: Paravirt qspinlock
> 
> patch[4]: Add prefetchw in qspinlock's xchg_tail when cpus >= 16k
> 
> This series based on:
>  - [RFC PATCH v5 0/5] Rework & improve riscv cmpxchg.h and atomic.h
>    https://lore.kernel.org/linux-riscv/20230810040349.92279-2-leobras@redhat.com/
>  - [PATCH V3] asm-generic: ticket-lock: Optimize arch_spin_value_unlocked
>    https://lore.kernel.org/linux-riscv/20230908154339.3250567-1-guoren@kernel.org/ 
> 
> I merge them into sg2042-master branch, then you could directly try it on
> sg2042 hardware platform:
> 
> https://github.com/guoren83/linux/tree/sg2042-master-qspinlock-64ilp32_v5
> 
> Use sophgo_mango_ubuntu_defconfig for sg2042 64/128 cores hardware
> platform.
> 
> Native qspinlock
> ================
> 
> This time we've proved the qspinlock on th1520 [1] & sg2042 [2], which
> gives stability and performance improvement. All T-HEAD processors have
> a strong LR/SC forward progress guarantee than the requirements of the
> ISA, which could satisfy the xchg_tail of native_qspinlock. Now,
> qspinlock has been run with us for more than 1 year, and we have enough
> confidence to enable it for all the T-HEAD processors. Of causes, we
> found a livelock problem with the qspinlock lock torture test from the
> CPU store merge buffer delay mechanism, which caused the queued spinlock
> becomes a dead ring and RCU warning to come out. We introduce a custom
> WRITE_ONCE to solve this. Do we need explicit ISA instruction to signal
> it? Or let hardware handle this.
> 
> We've tested the patch on SOPHGO sg2042 & th1520 and passed the stress
> test on Fedora & Ubuntu & OpenEuler ... Here is the performance
> comparison between qspinlock and ticket_lock on sg2042 (64 cores):
> 
> sysbench test=threads threads=32 yields=100 lock=8 (+13.8%):
>   queued_spinlock 0.5109/0.00
>   ticket_spinlock 0.5814/0.00
> 
> perf futex/hash (+6.7%):
>   queued_spinlock 1444393 operations/sec (+- 0.09%)
>   ticket_spinlock 1353215 operations/sec (+- 0.15%)
> 
> perf futex/wake-parallel (+8.6%):
>   queued_spinlock (waking 1/64 threads) in 0.0253 ms (+-2.90%)
>   ticket_spinlock (waking 1/64 threads) in 0.0275 ms (+-3.12%)
> 
> perf futex/requeue (+4.2%):
>   queued_spinlock Requeued 64 of 64 threads in 0.0785 ms (+-0.55%)
>   ticket_spinlock Requeued 64 of 64 threads in 0.0818 ms (+-4.12%)
> 
> System Benchmarks (+6.4%)
>   queued_spinlock:
>     System Benchmarks Index Values               BASELINE       RESULT    INDEX
>     Dhrystone 2 using register variables         116700.0  628613745.4  53865.8
>     Double-Precision Whetstone                       55.0     182422.8  33167.8
>     Execl Throughput                                 43.0      13116.6   3050.4
>     File Copy 1024 bufsize 2000 maxblocks          3960.0    7762306.2  19601.8
>     File Copy 256 bufsize 500 maxblocks            1655.0    3417556.8  20649.9
>     File Copy 4096 bufsize 8000 maxblocks          5800.0    7427995.7  12806.9
>     Pipe Throughput                               12440.0   23058600.5  18535.9
>     Pipe-based Context Switching                   4000.0    2835617.7   7089.0
>     Process Creation                                126.0      12537.3    995.0
>     Shell Scripts (1 concurrent)                     42.4      57057.4  13456.9
>     Shell Scripts (8 concurrent)                      6.0       7367.1  12278.5
>     System Call Overhead                          15000.0   33308301.3  22205.5
>                                                                        ========
>     System Benchmarks Index Score                                       12426.1
> 
>   ticket_spinlock:
>     System Benchmarks Index Values               BASELINE       RESULT    INDEX
>     Dhrystone 2 using register variables         116700.0  626541701.9  53688.2
>     Double-Precision Whetstone                       55.0     181921.0  33076.5
>     Execl Throughput                                 43.0      12625.1   2936.1
>     File Copy 1024 bufsize 2000 maxblocks          3960.0    6553792.9  16550.0
>     File Copy 256 bufsize 500 maxblocks            1655.0    3189231.6  19270.3
>     File Copy 4096 bufsize 8000 maxblocks          5800.0    7221277.0  12450.5
>     Pipe Throughput                               12440.0   20594018.7  16554.7
>     Pipe-based Context Switching                   4000.0    2571117.7   6427.8
>     Process Creation                                126.0      10798.4    857.0
>     Shell Scripts (1 concurrent)                     42.4      57227.5  13497.1
>     Shell Scripts (8 concurrent)                      6.0       7329.2  12215.3
>     System Call Overhead                          15000.0   30766778.4  20511.2
>                                                                        ========
>     System Benchmarks Index Score                                       11670.7
> 
> The qspinlock has a significant improvement on SOPHGO SG2042 64
> cores platform than the ticket_lock.
> 
> Paravirt qspinlock
> ==================
> 
> We implemented kvm_kick_cpu/kvm_wait_cpu and add tracepoints to observe the
> behaviors. Also, introduce a new SBI extension SBI_EXT_PVLOCK (0xAB0401). If the
> name and number are approved, I will send a formal proposal to the SBI spec.
> 

Hello Guo Ren,

Any update on this series?

Thanks!
Leo


> Changlog:
> V11:
>  - Based on Leonardo Bras's cmpxchg_small patches v5.
>  - Based on Guo Ren's Optimize arch_spin_value_unlocked patch v3.
>  - Remove abusing alternative framework and use jump_label instead.
>  - Introduce prefetch.w to improve T-HEAD processors' LR/SC forward progress
>    guarantee.
>  - Optimize qspinlock xchg_tail when NR_CPUS >= 16K.
> 
> V10:
> https://lore.kernel.org/linux-riscv/20230802164701.192791-1-guoren@kernel.org/
>  - Using an alternative framework instead of static_key_branch in the
>    asm/spinlock.h.
>  - Fixup store merge buffer problem, which causes qspinlock lock
>    torture test livelock.
>  - Add paravirt qspinlock support, include KVM backend
>  - Add Compact NUMA-awared qspinlock support 
> 
> V9:
> https://lore.kernel.org/linux-riscv/20220808071318.3335746-1-guoren@kernel.org/
>  - Cleanup generic ticket-lock code, (Using smp_mb__after_spinlock as
>    RCsc)
>  - Add qspinlock and combo-lock for riscv
>  - Add qspinlock to openrisc
>  - Use generic header in csky
>  - Optimize cmpxchg & atomic code
> 
> V8:
> https://lore.kernel.org/linux-riscv/20220724122517.1019187-1-guoren@kernel.org/
>  - Coding convention ticket fixup
>  - Move combo spinlock into riscv and simply asm-generic/spinlock.h
>  - Fixup xchg16 with wrong return value
>  - Add csky qspinlock
>  - Add combo & qspinlock & ticket-lock comparison
>  - Clean up unnecessary riscv acquire and release definitions
>  - Enable ARCH_INLINE_READ*/WRITE*/SPIN* for riscv & csky
> 
> V7:
> https://lore.kernel.org/linux-riscv/20220628081946.1999419-1-guoren@kernel.org/
>  - Add combo spinlock (ticket & queued) support
>  - Rename ticket_spinlock.h
>  - Remove unnecessary atomic_read in ticket_spin_value_unlocked  
> 
> V6:
> https://lore.kernel.org/linux-riscv/20220621144920.2945595-1-guoren@kernel.org/
>  - Fixup Clang compile problem Reported-by: kernel test robot
>  - Cleanup asm-generic/spinlock.h
>  - Remove changelog in patch main comment part, suggested by
>    Conor.Dooley
>  - Remove "default y if NUMA" in Kconfig
> 
> V5:
> https://lore.kernel.org/linux-riscv/20220620155404.1968739-1-guoren@kernel.org/
>  - Update comment with RISC-V forward guarantee feature.
>  - Back to V3 direction and optimize asm code.
> 
> V4:
> https://lore.kernel.org/linux-riscv/1616868399-82848-4-git-send-email-guoren@kernel.org/
>  - Remove custom sub-word xchg implementation
>  - Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32 in locking/qspinlock
> 
> V3:
> https://lore.kernel.org/linux-riscv/1616658937-82063-1-git-send-email-guoren@kernel.org/
>  - Coding convention by Peter Zijlstra's advices
> 
> V2:
> https://lore.kernel.org/linux-riscv/1606225437-22948-2-git-send-email-guoren@kernel.org/
>  - Coding convention in cmpxchg.h
>  - Re-implement short xchg
>  - Remove char & cmpxchg implementations
> 
> V1:
> https://lore.kernel.org/linux-riscv/20190211043829.30096-1-michaeljclark@mac.com/
>  - Using cmpxchg loop to implement sub-word atomic
> 
> 
> Guo Ren (17):
>   asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
>   asm-generic: ticket-lock: Move into ticket_spinlock.h
>   riscv: Use Zicbop in arch_xchg when available
>   locking/qspinlock: Improve xchg_tail for number of cpus >= 16k
>   riscv: qspinlock: Add basic queued_spinlock support
>   riscv: qspinlock: Introduce combo spinlock
>   riscv: qspinlock: Introduce qspinlock param for command line
>   riscv: qspinlock: Add virt_spin_lock() support for KVM guest
>   riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
>   riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
>   RISC-V: paravirt: pvqspinlock: Add paravirt qspinlock skeleton
>   RISC-V: paravirt: pvqspinlock: Add nopvspin kernel parameter
>   RISC-V: paravirt: pvqspinlock: Add SBI implementation
>   RISC-V: paravirt: pvqspinlock: Add kconfig entry
>   RISC-V: paravirt: pvqspinlock: Add trace point for pv_kick/wait
>   RISC-V: paravirt: pvqspinlock: KVM: Add paravirt qspinlock skeleton
>   RISC-V: paravirt: pvqspinlock: KVM: Implement
>     kvm_sbi_ext_pvlock_kick_cpu()
> 
>  .../admin-guide/kernel-parameters.txt         |   8 +-
>  arch/riscv/Kconfig                            |  50 ++++++++
>  arch/riscv/Kconfig.errata                     |  19 +++
>  arch/riscv/errata/thead/errata.c              |  29 +++++
>  arch/riscv/include/asm/Kbuild                 |   2 +-
>  arch/riscv/include/asm/cmpxchg.h              |   4 +-
>  arch/riscv/include/asm/errata_list.h          |  13 --
>  arch/riscv/include/asm/hwcap.h                |   1 +
>  arch/riscv/include/asm/insn-def.h             |   5 +
>  arch/riscv/include/asm/kvm_vcpu_sbi.h         |   1 +
>  arch/riscv/include/asm/processor.h            |  13 ++
>  arch/riscv/include/asm/qspinlock.h            |  35 ++++++
>  arch/riscv/include/asm/qspinlock_paravirt.h   |  29 +++++
>  arch/riscv/include/asm/rwonce.h               |  24 ++++
>  arch/riscv/include/asm/sbi.h                  |  14 +++
>  arch/riscv/include/asm/spinlock.h             | 113 ++++++++++++++++++
>  arch/riscv/include/asm/vendorid_list.h        |  14 +++
>  arch/riscv/include/uapi/asm/kvm.h             |   1 +
>  arch/riscv/kernel/Makefile                    |   1 +
>  arch/riscv/kernel/cpufeature.c                |   1 +
>  arch/riscv/kernel/qspinlock_paravirt.c        |  83 +++++++++++++
>  arch/riscv/kernel/sbi.c                       |   2 +-
>  arch/riscv/kernel/setup.c                     |  60 ++++++++++
>  .../kernel/trace_events_filter_paravirt.h     |  60 ++++++++++
>  arch/riscv/kvm/Makefile                       |   1 +
>  arch/riscv/kvm/vcpu_sbi.c                     |   4 +
>  arch/riscv/kvm/vcpu_sbi_pvlock.c              |  57 +++++++++
>  include/asm-generic/rwonce.h                  |   2 +
>  include/asm-generic/spinlock.h                |  87 +-------------
>  include/asm-generic/spinlock_types.h          |  12 +-
>  include/asm-generic/ticket_spinlock.h         | 103 ++++++++++++++++
>  kernel/locking/qspinlock.c                    |   5 +-
>  32 files changed, 739 insertions(+), 114 deletions(-)
>  create mode 100644 arch/riscv/include/asm/qspinlock.h
>  create mode 100644 arch/riscv/include/asm/qspinlock_paravirt.h
>  create mode 100644 arch/riscv/include/asm/rwonce.h
>  create mode 100644 arch/riscv/include/asm/spinlock.h
>  create mode 100644 arch/riscv/kernel/qspinlock_paravirt.c
>  create mode 100644 arch/riscv/kernel/trace_events_filter_paravirt.h
>  create mode 100644 arch/riscv/kvm/vcpu_sbi_pvlock.c
>  create mode 100644 include/asm-generic/ticket_spinlock.h
> 
> -- 
> 2.36.1
>

Guo Ren Nov. 12, 2023, 4:23 a.m. UTC | #13

On Mon, Nov 6, 2023 at 3:42 PM Leonardo Bras <leobras@redhat.com> wrote:
>
> On Sun, Sep 10, 2023 at 04:28:54AM -0400, guoren@kernel.org wrote:
> > From: Guo Ren <guoren@linux.alibaba.com>
> >
> > patch[1 - 10]: Native   qspinlock
> > patch[11 -17]: Paravirt qspinlock
> >
> > patch[4]: Add prefetchw in qspinlock's xchg_tail when cpus >= 16k
> >
> > This series based on:
> >  - [RFC PATCH v5 0/5] Rework & improve riscv cmpxchg.h and atomic.h
> >    https://lore.kernel.org/linux-riscv/20230810040349.92279-2-leobras@redhat.com/
> >  - [PATCH V3] asm-generic: ticket-lock: Optimize arch_spin_value_unlocked
> >    https://lore.kernel.org/linux-riscv/20230908154339.3250567-1-guoren@kernel.org/
> >
> > I merge them into sg2042-master branch, then you could directly try it on
> > sg2042 hardware platform:
> >
> > https://github.com/guoren83/linux/tree/sg2042-master-qspinlock-64ilp32_v5
> >
> > Use sophgo_mango_ubuntu_defconfig for sg2042 64/128 cores hardware
> > platform.
> >
> > Native qspinlock
> > ================
> >
> > This time we've proved the qspinlock on th1520 [1] & sg2042 [2], which
> > gives stability and performance improvement. All T-HEAD processors have
> > a strong LR/SC forward progress guarantee than the requirements of the
> > ISA, which could satisfy the xchg_tail of native_qspinlock. Now,
> > qspinlock has been run with us for more than 1 year, and we have enough
> > confidence to enable it for all the T-HEAD processors. Of causes, we
> > found a livelock problem with the qspinlock lock torture test from the
> > CPU store merge buffer delay mechanism, which caused the queued spinlock
> > becomes a dead ring and RCU warning to come out. We introduce a custom
> > WRITE_ONCE to solve this. Do we need explicit ISA instruction to signal
> > it? Or let hardware handle this.
> >
> > We've tested the patch on SOPHGO sg2042 & th1520 and passed the stress
> > test on Fedora & Ubuntu & OpenEuler ... Here is the performance
> > comparison between qspinlock and ticket_lock on sg2042 (64 cores):
> >
> > sysbench test=threads threads=32 yields=100 lock=8 (+13.8%):
> >   queued_spinlock 0.5109/0.00
> >   ticket_spinlock 0.5814/0.00
> >
> > perf futex/hash (+6.7%):
> >   queued_spinlock 1444393 operations/sec (+- 0.09%)
> >   ticket_spinlock 1353215 operations/sec (+- 0.15%)
> >
> > perf futex/wake-parallel (+8.6%):
> >   queued_spinlock (waking 1/64 threads) in 0.0253 ms (+-2.90%)
> >   ticket_spinlock (waking 1/64 threads) in 0.0275 ms (+-3.12%)
> >
> > perf futex/requeue (+4.2%):
> >   queued_spinlock Requeued 64 of 64 threads in 0.0785 ms (+-0.55%)
> >   ticket_spinlock Requeued 64 of 64 threads in 0.0818 ms (+-4.12%)
> >
> > System Benchmarks (+6.4%)
> >   queued_spinlock:
> >     System Benchmarks Index Values               BASELINE       RESULT    INDEX
> >     Dhrystone 2 using register variables         116700.0  628613745.4  53865.8
> >     Double-Precision Whetstone                       55.0     182422.8  33167.8
> >     Execl Throughput                                 43.0      13116.6   3050.4
> >     File Copy 1024 bufsize 2000 maxblocks          3960.0    7762306.2  19601.8
> >     File Copy 256 bufsize 500 maxblocks            1655.0    3417556.8  20649.9
> >     File Copy 4096 bufsize 8000 maxblocks          5800.0    7427995.7  12806.9
> >     Pipe Throughput                               12440.0   23058600.5  18535.9
> >     Pipe-based Context Switching                   4000.0    2835617.7   7089.0
> >     Process Creation                                126.0      12537.3    995.0
> >     Shell Scripts (1 concurrent)                     42.4      57057.4  13456.9
> >     Shell Scripts (8 concurrent)                      6.0       7367.1  12278.5
> >     System Call Overhead                          15000.0   33308301.3  22205.5
> >                                                                        ========
> >     System Benchmarks Index Score                                       12426.1
> >
> >   ticket_spinlock:
> >     System Benchmarks Index Values               BASELINE       RESULT    INDEX
> >     Dhrystone 2 using register variables         116700.0  626541701.9  53688.2
> >     Double-Precision Whetstone                       55.0     181921.0  33076.5
> >     Execl Throughput                                 43.0      12625.1   2936.1
> >     File Copy 1024 bufsize 2000 maxblocks          3960.0    6553792.9  16550.0
> >     File Copy 256 bufsize 500 maxblocks            1655.0    3189231.6  19270.3
> >     File Copy 4096 bufsize 8000 maxblocks          5800.0    7221277.0  12450.5
> >     Pipe Throughput                               12440.0   20594018.7  16554.7
> >     Pipe-based Context Switching                   4000.0    2571117.7   6427.8
> >     Process Creation                                126.0      10798.4    857.0
> >     Shell Scripts (1 concurrent)                     42.4      57227.5  13497.1
> >     Shell Scripts (8 concurrent)                      6.0       7329.2  12215.3
> >     System Call Overhead                          15000.0   30766778.4  20511.2
> >                                                                        ========
> >     System Benchmarks Index Score                                       11670.7
> >
> > The qspinlock has a significant improvement on SOPHGO SG2042 64
> > cores platform than the ticket_lock.
> >
> > Paravirt qspinlock
> > ==================
> >
> > We implemented kvm_kick_cpu/kvm_wait_cpu and add tracepoints to observe the
> > behaviors. Also, introduce a new SBI extension SBI_EXT_PVLOCK (0xAB0401). If the
> > name and number are approved, I will send a formal proposal to the SBI spec.
> >
>
> Hello Guo Ren,
>
> Any update on this series?
Found a nested virtualization problem, and I'm solving that. After
that, I'll update v12.

>
> Thanks!
> Leo
>
>
> > Changlog:
> > V11:
> >  - Based on Leonardo Bras's cmpxchg_small patches v5.
> >  - Based on Guo Ren's Optimize arch_spin_value_unlocked patch v3.
> >  - Remove abusing alternative framework and use jump_label instead.
> >  - Introduce prefetch.w to improve T-HEAD processors' LR/SC forward progress
> >    guarantee.
> >  - Optimize qspinlock xchg_tail when NR_CPUS >= 16K.
> >
> > V10:
> > https://lore.kernel.org/linux-riscv/20230802164701.192791-1-guoren@kernel.org/
> >  - Using an alternative framework instead of static_key_branch in the
> >    asm/spinlock.h.
> >  - Fixup store merge buffer problem, which causes qspinlock lock
> >    torture test livelock.
> >  - Add paravirt qspinlock support, include KVM backend
> >  - Add Compact NUMA-awared qspinlock support
> >
> > V9:
> > https://lore.kernel.org/linux-riscv/20220808071318.3335746-1-guoren@kernel.org/
> >  - Cleanup generic ticket-lock code, (Using smp_mb__after_spinlock as
> >    RCsc)
> >  - Add qspinlock and combo-lock for riscv
> >  - Add qspinlock to openrisc
> >  - Use generic header in csky
> >  - Optimize cmpxchg & atomic code
> >
> > V8:
> > https://lore.kernel.org/linux-riscv/20220724122517.1019187-1-guoren@kernel.org/
> >  - Coding convention ticket fixup
> >  - Move combo spinlock into riscv and simply asm-generic/spinlock.h
> >  - Fixup xchg16 with wrong return value
> >  - Add csky qspinlock
> >  - Add combo & qspinlock & ticket-lock comparison
> >  - Clean up unnecessary riscv acquire and release definitions
> >  - Enable ARCH_INLINE_READ*/WRITE*/SPIN* for riscv & csky
> >
> > V7:
> > https://lore.kernel.org/linux-riscv/20220628081946.1999419-1-guoren@kernel.org/
> >  - Add combo spinlock (ticket & queued) support
> >  - Rename ticket_spinlock.h
> >  - Remove unnecessary atomic_read in ticket_spin_value_unlocked
> >
> > V6:
> > https://lore.kernel.org/linux-riscv/20220621144920.2945595-1-guoren@kernel.org/
> >  - Fixup Clang compile problem Reported-by: kernel test robot
> >  - Cleanup asm-generic/spinlock.h
> >  - Remove changelog in patch main comment part, suggested by
> >    Conor.Dooley
> >  - Remove "default y if NUMA" in Kconfig
> >
> > V5:
> > https://lore.kernel.org/linux-riscv/20220620155404.1968739-1-guoren@kernel.org/
> >  - Update comment with RISC-V forward guarantee feature.
> >  - Back to V3 direction and optimize asm code.
> >
> > V4:
> > https://lore.kernel.org/linux-riscv/1616868399-82848-4-git-send-email-guoren@kernel.org/
> >  - Remove custom sub-word xchg implementation
> >  - Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32 in locking/qspinlock
> >
> > V3:
> > https://lore.kernel.org/linux-riscv/1616658937-82063-1-git-send-email-guoren@kernel.org/
> >  - Coding convention by Peter Zijlstra's advices
> >
> > V2:
> > https://lore.kernel.org/linux-riscv/1606225437-22948-2-git-send-email-guoren@kernel.org/
> >  - Coding convention in cmpxchg.h
> >  - Re-implement short xchg
> >  - Remove char & cmpxchg implementations
> >
> > V1:
> > https://lore.kernel.org/linux-riscv/20190211043829.30096-1-michaeljclark@mac.com/
> >  - Using cmpxchg loop to implement sub-word atomic
> >
> >
> > Guo Ren (17):
> >   asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
> >   asm-generic: ticket-lock: Move into ticket_spinlock.h
> >   riscv: Use Zicbop in arch_xchg when available
> >   locking/qspinlock: Improve xchg_tail for number of cpus >= 16k
> >   riscv: qspinlock: Add basic queued_spinlock support
> >   riscv: qspinlock: Introduce combo spinlock
> >   riscv: qspinlock: Introduce qspinlock param for command line
> >   riscv: qspinlock: Add virt_spin_lock() support for KVM guest
> >   riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
> >   riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
> >   RISC-V: paravirt: pvqspinlock: Add paravirt qspinlock skeleton
> >   RISC-V: paravirt: pvqspinlock: Add nopvspin kernel parameter
> >   RISC-V: paravirt: pvqspinlock: Add SBI implementation
> >   RISC-V: paravirt: pvqspinlock: Add kconfig entry
> >   RISC-V: paravirt: pvqspinlock: Add trace point for pv_kick/wait
> >   RISC-V: paravirt: pvqspinlock: KVM: Add paravirt qspinlock skeleton
> >   RISC-V: paravirt: pvqspinlock: KVM: Implement
> >     kvm_sbi_ext_pvlock_kick_cpu()
> >
> >  .../admin-guide/kernel-parameters.txt         |   8 +-
> >  arch/riscv/Kconfig                            |  50 ++++++++
> >  arch/riscv/Kconfig.errata                     |  19 +++
> >  arch/riscv/errata/thead/errata.c              |  29 +++++
> >  arch/riscv/include/asm/Kbuild                 |   2 +-
> >  arch/riscv/include/asm/cmpxchg.h              |   4 +-
> >  arch/riscv/include/asm/errata_list.h          |  13 --
> >  arch/riscv/include/asm/hwcap.h                |   1 +
> >  arch/riscv/include/asm/insn-def.h             |   5 +
> >  arch/riscv/include/asm/kvm_vcpu_sbi.h         |   1 +
> >  arch/riscv/include/asm/processor.h            |  13 ++
> >  arch/riscv/include/asm/qspinlock.h            |  35 ++++++
> >  arch/riscv/include/asm/qspinlock_paravirt.h   |  29 +++++
> >  arch/riscv/include/asm/rwonce.h               |  24 ++++
> >  arch/riscv/include/asm/sbi.h                  |  14 +++
> >  arch/riscv/include/asm/spinlock.h             | 113 ++++++++++++++++++
> >  arch/riscv/include/asm/vendorid_list.h        |  14 +++
> >  arch/riscv/include/uapi/asm/kvm.h             |   1 +
> >  arch/riscv/kernel/Makefile                    |   1 +
> >  arch/riscv/kernel/cpufeature.c                |   1 +
> >  arch/riscv/kernel/qspinlock_paravirt.c        |  83 +++++++++++++
> >  arch/riscv/kernel/sbi.c                       |   2 +-
> >  arch/riscv/kernel/setup.c                     |  60 ++++++++++
> >  .../kernel/trace_events_filter_paravirt.h     |  60 ++++++++++
> >  arch/riscv/kvm/Makefile                       |   1 +
> >  arch/riscv/kvm/vcpu_sbi.c                     |   4 +
> >  arch/riscv/kvm/vcpu_sbi_pvlock.c              |  57 +++++++++
> >  include/asm-generic/rwonce.h                  |   2 +
> >  include/asm-generic/spinlock.h                |  87 +-------------
> >  include/asm-generic/spinlock_types.h          |  12 +-
> >  include/asm-generic/ticket_spinlock.h         | 103 ++++++++++++++++
> >  kernel/locking/qspinlock.c                    |   5 +-
> >  32 files changed, 739 insertions(+), 114 deletions(-)
> >  create mode 100644 arch/riscv/include/asm/qspinlock.h
> >  create mode 100644 arch/riscv/include/asm/qspinlock_paravirt.h
> >  create mode 100644 arch/riscv/include/asm/rwonce.h
> >  create mode 100644 arch/riscv/include/asm/spinlock.h
> >  create mode 100644 arch/riscv/kernel/qspinlock_paravirt.c
> >  create mode 100644 arch/riscv/kernel/trace_events_filter_paravirt.h
> >  create mode 100644 arch/riscv/kvm/vcpu_sbi_pvlock.c
> >  create mode 100644 include/asm-generic/ticket_spinlock.h
> >
> > --
> > 2.36.1
> >
>

Leonardo Bras Nov. 13, 2023, 10:19 a.m. UTC | #14

On Sun, Nov 12, 2023 at 1:24 AM Guo Ren <guoren@kernel.org> wrote:
>
> On Mon, Nov 6, 2023 at 3:42 PM Leonardo Bras <leobras@redhat.com> wrote:
> >
> > On Sun, Sep 10, 2023 at 04:28:54AM -0400, guoren@kernel.org wrote:
> > > From: Guo Ren <guoren@linux.alibaba.com>
> > >
> > > patch[1 - 10]: Native   qspinlock
> > > patch[11 -17]: Paravirt qspinlock
> > >
> > > patch[4]: Add prefetchw in qspinlock's xchg_tail when cpus >= 16k
> > >
> > > This series based on:
> > >  - [RFC PATCH v5 0/5] Rework & improve riscv cmpxchg.h and atomic.h
> > >    https://lore.kernel.org/linux-riscv/20230810040349.92279-2-leobras@redhat.com/
> > >  - [PATCH V3] asm-generic: ticket-lock: Optimize arch_spin_value_unlocked
> > >    https://lore.kernel.org/linux-riscv/20230908154339.3250567-1-guoren@kernel.org/
> > >
> > > I merge them into sg2042-master branch, then you could directly try it on
> > > sg2042 hardware platform:
> > >
> > > https://github.com/guoren83/linux/tree/sg2042-master-qspinlock-64ilp32_v5
> > >
> > > Use sophgo_mango_ubuntu_defconfig for sg2042 64/128 cores hardware
> > > platform.
> > >
> > > Native qspinlock
> > > ================
> > >
> > > This time we've proved the qspinlock on th1520 [1] & sg2042 [2], which
> > > gives stability and performance improvement. All T-HEAD processors have
> > > a strong LR/SC forward progress guarantee than the requirements of the
> > > ISA, which could satisfy the xchg_tail of native_qspinlock. Now,
> > > qspinlock has been run with us for more than 1 year, and we have enough
> > > confidence to enable it for all the T-HEAD processors. Of causes, we
> > > found a livelock problem with the qspinlock lock torture test from the
> > > CPU store merge buffer delay mechanism, which caused the queued spinlock
> > > becomes a dead ring and RCU warning to come out. We introduce a custom
> > > WRITE_ONCE to solve this. Do we need explicit ISA instruction to signal
> > > it? Or let hardware handle this.
> > >
> > > We've tested the patch on SOPHGO sg2042 & th1520 and passed the stress
> > > test on Fedora & Ubuntu & OpenEuler ... Here is the performance
> > > comparison between qspinlock and ticket_lock on sg2042 (64 cores):
> > >
> > > sysbench test=threads threads=32 yields=100 lock=8 (+13.8%):
> > >   queued_spinlock 0.5109/0.00
> > >   ticket_spinlock 0.5814/0.00
> > >
> > > perf futex/hash (+6.7%):
> > >   queued_spinlock 1444393 operations/sec (+- 0.09%)
> > >   ticket_spinlock 1353215 operations/sec (+- 0.15%)
> > >
> > > perf futex/wake-parallel (+8.6%):
> > >   queued_spinlock (waking 1/64 threads) in 0.0253 ms (+-2.90%)
> > >   ticket_spinlock (waking 1/64 threads) in 0.0275 ms (+-3.12%)
> > >
> > > perf futex/requeue (+4.2%):
> > >   queued_spinlock Requeued 64 of 64 threads in 0.0785 ms (+-0.55%)
> > >   ticket_spinlock Requeued 64 of 64 threads in 0.0818 ms (+-4.12%)
> > >
> > > System Benchmarks (+6.4%)
> > >   queued_spinlock:
> > >     System Benchmarks Index Values               BASELINE       RESULT    INDEX
> > >     Dhrystone 2 using register variables         116700.0  628613745.4  53865.8
> > >     Double-Precision Whetstone                       55.0     182422.8  33167.8
> > >     Execl Throughput                                 43.0      13116.6   3050.4
> > >     File Copy 1024 bufsize 2000 maxblocks          3960.0    7762306.2  19601.8
> > >     File Copy 256 bufsize 500 maxblocks            1655.0    3417556.8  20649.9
> > >     File Copy 4096 bufsize 8000 maxblocks          5800.0    7427995.7  12806.9
> > >     Pipe Throughput                               12440.0   23058600.5  18535.9
> > >     Pipe-based Context Switching                   4000.0    2835617.7   7089.0
> > >     Process Creation                                126.0      12537.3    995.0
> > >     Shell Scripts (1 concurrent)                     42.4      57057.4  13456.9
> > >     Shell Scripts (8 concurrent)                      6.0       7367.1  12278.5
> > >     System Call Overhead                          15000.0   33308301.3  22205.5
> > >                                                                        ========
> > >     System Benchmarks Index Score                                       12426.1
> > >
> > >   ticket_spinlock:
> > >     System Benchmarks Index Values               BASELINE       RESULT    INDEX
> > >     Dhrystone 2 using register variables         116700.0  626541701.9  53688.2
> > >     Double-Precision Whetstone                       55.0     181921.0  33076.5
> > >     Execl Throughput                                 43.0      12625.1   2936.1
> > >     File Copy 1024 bufsize 2000 maxblocks          3960.0    6553792.9  16550.0
> > >     File Copy 256 bufsize 500 maxblocks            1655.0    3189231.6  19270.3
> > >     File Copy 4096 bufsize 8000 maxblocks          5800.0    7221277.0  12450.5
> > >     Pipe Throughput                               12440.0   20594018.7  16554.7
> > >     Pipe-based Context Switching                   4000.0    2571117.7   6427.8
> > >     Process Creation                                126.0      10798.4    857.0
> > >     Shell Scripts (1 concurrent)                     42.4      57227.5  13497.1
> > >     Shell Scripts (8 concurrent)                      6.0       7329.2  12215.3
> > >     System Call Overhead                          15000.0   30766778.4  20511.2
> > >                                                                        ========
> > >     System Benchmarks Index Score                                       11670.7
> > >
> > > The qspinlock has a significant improvement on SOPHGO SG2042 64
> > > cores platform than the ticket_lock.
> > >
> > > Paravirt qspinlock
> > > ==================
> > >
> > > We implemented kvm_kick_cpu/kvm_wait_cpu and add tracepoints to observe the
> > > behaviors. Also, introduce a new SBI extension SBI_EXT_PVLOCK (0xAB0401). If the
> > > name and number are approved, I will send a formal proposal to the SBI spec.
> > >
> >
> > Hello Guo Ren,
> >
> > Any update on this series?
> Found a nested virtualization problem, and I'm solving that. After
> that, I'll update v12.

Oh, nice to hear :)
I am very excited about this series, please let me know of any update.

Thanks!
Leo

>
> >
> > Thanks!
> > Leo
> >
> >
> > > Changlog:
> > > V11:
> > >  - Based on Leonardo Bras's cmpxchg_small patches v5.
> > >  - Based on Guo Ren's Optimize arch_spin_value_unlocked patch v3.
> > >  - Remove abusing alternative framework and use jump_label instead.
> > >  - Introduce prefetch.w to improve T-HEAD processors' LR/SC forward progress
> > >    guarantee.
> > >  - Optimize qspinlock xchg_tail when NR_CPUS >= 16K.
> > >
> > > V10:
> > > https://lore.kernel.org/linux-riscv/20230802164701.192791-1-guoren@kernel.org/
> > >  - Using an alternative framework instead of static_key_branch in the
> > >    asm/spinlock.h.
> > >  - Fixup store merge buffer problem, which causes qspinlock lock
> > >    torture test livelock.
> > >  - Add paravirt qspinlock support, include KVM backend
> > >  - Add Compact NUMA-awared qspinlock support
> > >
> > > V9:
> > > https://lore.kernel.org/linux-riscv/20220808071318.3335746-1-guoren@kernel.org/
> > >  - Cleanup generic ticket-lock code, (Using smp_mb__after_spinlock as
> > >    RCsc)
> > >  - Add qspinlock and combo-lock for riscv
> > >  - Add qspinlock to openrisc
> > >  - Use generic header in csky
> > >  - Optimize cmpxchg & atomic code
> > >
> > > V8:
> > > https://lore.kernel.org/linux-riscv/20220724122517.1019187-1-guoren@kernel.org/
> > >  - Coding convention ticket fixup
> > >  - Move combo spinlock into riscv and simply asm-generic/spinlock.h
> > >  - Fixup xchg16 with wrong return value
> > >  - Add csky qspinlock
> > >  - Add combo & qspinlock & ticket-lock comparison
> > >  - Clean up unnecessary riscv acquire and release definitions
> > >  - Enable ARCH_INLINE_READ*/WRITE*/SPIN* for riscv & csky
> > >
> > > V7:
> > > https://lore.kernel.org/linux-riscv/20220628081946.1999419-1-guoren@kernel.org/
> > >  - Add combo spinlock (ticket & queued) support
> > >  - Rename ticket_spinlock.h
> > >  - Remove unnecessary atomic_read in ticket_spin_value_unlocked
> > >
> > > V6:
> > > https://lore.kernel.org/linux-riscv/20220621144920.2945595-1-guoren@kernel.org/
> > >  - Fixup Clang compile problem Reported-by: kernel test robot
> > >  - Cleanup asm-generic/spinlock.h
> > >  - Remove changelog in patch main comment part, suggested by
> > >    Conor.Dooley
> > >  - Remove "default y if NUMA" in Kconfig
> > >
> > > V5:
> > > https://lore.kernel.org/linux-riscv/20220620155404.1968739-1-guoren@kernel.org/
> > >  - Update comment with RISC-V forward guarantee feature.
> > >  - Back to V3 direction and optimize asm code.
> > >
> > > V4:
> > > https://lore.kernel.org/linux-riscv/1616868399-82848-4-git-send-email-guoren@kernel.org/
> > >  - Remove custom sub-word xchg implementation
> > >  - Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32 in locking/qspinlock
> > >
> > > V3:
> > > https://lore.kernel.org/linux-riscv/1616658937-82063-1-git-send-email-guoren@kernel.org/
> > >  - Coding convention by Peter Zijlstra's advices
> > >
> > > V2:
> > > https://lore.kernel.org/linux-riscv/1606225437-22948-2-git-send-email-guoren@kernel.org/
> > >  - Coding convention in cmpxchg.h
> > >  - Re-implement short xchg
> > >  - Remove char & cmpxchg implementations
> > >
> > > V1:
> > > https://lore.kernel.org/linux-riscv/20190211043829.30096-1-michaeljclark@mac.com/
> > >  - Using cmpxchg loop to implement sub-word atomic
> > >
> > >
> > > Guo Ren (17):
> > >   asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
> > >   asm-generic: ticket-lock: Move into ticket_spinlock.h
> > >   riscv: Use Zicbop in arch_xchg when available
> > >   locking/qspinlock: Improve xchg_tail for number of cpus >= 16k
> > >   riscv: qspinlock: Add basic queued_spinlock support
> > >   riscv: qspinlock: Introduce combo spinlock
> > >   riscv: qspinlock: Introduce qspinlock param for command line
> > >   riscv: qspinlock: Add virt_spin_lock() support for KVM guest
> > >   riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
> > >   riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
> > >   RISC-V: paravirt: pvqspinlock: Add paravirt qspinlock skeleton
> > >   RISC-V: paravirt: pvqspinlock: Add nopvspin kernel parameter
> > >   RISC-V: paravirt: pvqspinlock: Add SBI implementation
> > >   RISC-V: paravirt: pvqspinlock: Add kconfig entry
> > >   RISC-V: paravirt: pvqspinlock: Add trace point for pv_kick/wait
> > >   RISC-V: paravirt: pvqspinlock: KVM: Add paravirt qspinlock skeleton
> > >   RISC-V: paravirt: pvqspinlock: KVM: Implement
> > >     kvm_sbi_ext_pvlock_kick_cpu()
> > >
> > >  .../admin-guide/kernel-parameters.txt         |   8 +-
> > >  arch/riscv/Kconfig                            |  50 ++++++++
> > >  arch/riscv/Kconfig.errata                     |  19 +++
> > >  arch/riscv/errata/thead/errata.c              |  29 +++++
> > >  arch/riscv/include/asm/Kbuild                 |   2 +-
> > >  arch/riscv/include/asm/cmpxchg.h              |   4 +-
> > >  arch/riscv/include/asm/errata_list.h          |  13 --
> > >  arch/riscv/include/asm/hwcap.h                |   1 +
> > >  arch/riscv/include/asm/insn-def.h             |   5 +
> > >  arch/riscv/include/asm/kvm_vcpu_sbi.h         |   1 +
> > >  arch/riscv/include/asm/processor.h            |  13 ++
> > >  arch/riscv/include/asm/qspinlock.h            |  35 ++++++
> > >  arch/riscv/include/asm/qspinlock_paravirt.h   |  29 +++++
> > >  arch/riscv/include/asm/rwonce.h               |  24 ++++
> > >  arch/riscv/include/asm/sbi.h                  |  14 +++
> > >  arch/riscv/include/asm/spinlock.h             | 113 ++++++++++++++++++
> > >  arch/riscv/include/asm/vendorid_list.h        |  14 +++
> > >  arch/riscv/include/uapi/asm/kvm.h             |   1 +
> > >  arch/riscv/kernel/Makefile                    |   1 +
> > >  arch/riscv/kernel/cpufeature.c                |   1 +
> > >  arch/riscv/kernel/qspinlock_paravirt.c        |  83 +++++++++++++
> > >  arch/riscv/kernel/sbi.c                       |   2 +-
> > >  arch/riscv/kernel/setup.c                     |  60 ++++++++++
> > >  .../kernel/trace_events_filter_paravirt.h     |  60 ++++++++++
> > >  arch/riscv/kvm/Makefile                       |   1 +
> > >  arch/riscv/kvm/vcpu_sbi.c                     |   4 +
> > >  arch/riscv/kvm/vcpu_sbi_pvlock.c              |  57 +++++++++
> > >  include/asm-generic/rwonce.h                  |   2 +
> > >  include/asm-generic/spinlock.h                |  87 +-------------
> > >  include/asm-generic/spinlock_types.h          |  12 +-
> > >  include/asm-generic/ticket_spinlock.h         | 103 ++++++++++++++++
> > >  kernel/locking/qspinlock.c                    |   5 +-
> > >  32 files changed, 739 insertions(+), 114 deletions(-)
> > >  create mode 100644 arch/riscv/include/asm/qspinlock.h
> > >  create mode 100644 arch/riscv/include/asm/qspinlock_paravirt.h
> > >  create mode 100644 arch/riscv/include/asm/rwonce.h
> > >  create mode 100644 arch/riscv/include/asm/spinlock.h
> > >  create mode 100644 arch/riscv/kernel/qspinlock_paravirt.c
> > >  create mode 100644 arch/riscv/kernel/trace_events_filter_paravirt.h
> > >  create mode 100644 arch/riscv/kvm/vcpu_sbi_pvlock.c
> > >  create mode 100644 include/asm-generic/ticket_spinlock.h
> > >
> > > --
> > > 2.36.1
> > >
> >
>
>
> --
> Best Regards
>  Guo Ren
>

[V11,00/17] riscv: Add Native/Paravirt qspinlock support

Message

Comments