From patchwork Fri Mar 15 13:40:11 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Andrew Jones
X-Patchwork-Id: 13593486
From: Andrew Jones
To: linux-riscv@lists.infradead.org, kvm-riscv@lists.infradead.org
Cc: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
 conor.dooley@microchip.com, anup@brainfault.org, atishp@atishpatra.org,
 christoph.muellner@vrull.eu, heiko@sntech.de, charlie@rivosinc.com,
 David.Laight@ACULAB.COM, Heiko Stuebner
Subject: [PATCH 1/5] riscv: Add Zawrs support for spinlocks
Date: Fri, 15 Mar 2024 14:40:11 +0100
Message-ID: <20240315134009.580167-8-ajones@ventanamicro.com>
X-Mailer: git-send-email 2.44.0
In-Reply-To: <20240315134009.580167-7-ajones@ventanamicro.com>
References: <20240315134009.580167-7-ajones@ventanamicro.com>

From: Christoph Müllner

The current RISC-V code uses the generic ticket lock implementation, which
calls the macros smp_cond_load_relaxed() and smp_cond_load_acquire().
Currently, RISC-V uses the generic implementation of these macros.
This patch introduces a RISC-V specific implementation of these macros
that peels off the first loop iteration and modifies the waiting loop so
that it is possible to use the WRS.STO instruction of the Zawrs ISA
extension to stall the CPU.

The resulting implementation of smp_cond_load_*() will only work for
32-bit or 64-bit types for RV64 and 32-bit types for RV32.
This is caused by the restrictions of the LR instruction (RISC-V only
has LR.W and LR.D). Compiler assertions guard this new restriction.

This patch uses the existing RISC-V ISA extension framework
to detect the presence of Zawrs at run-time.
If available, a NOP instruction will be replaced by WRS.NTO or WRS.STO.

The whole mechanism is gated by a Kconfig setting, which defaults to Y.

The Zawrs specification can be found here:
https://github.com/riscv/riscv-zawrs/blob/main/zawrs.adoc

Signed-off-by: Christoph Müllner
[rebase, update to review comments]
Signed-off-by: Heiko Stuebner
[rebase, move ALT_WRS* to barrier.h]
Signed-off-by: Andrew Jones
---
 arch/riscv/Kconfig               | 13 +++++
 arch/riscv/include/asm/barrier.h | 82 ++++++++++++++++++++++++++++++++
 arch/riscv/include/asm/hwcap.h   |  1 +
 arch/riscv/kernel/cpufeature.c   |  1 +
 4 files changed, 97 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index e3142ce531a0..2c296113aeb1 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -569,6 +569,19 @@ config RISCV_ISA_V_PREEMPTIVE
 	  preemption. Enabling this config will result in higher memory
 	  consumption due to the allocation of per-task's kernel Vector context.
 
+config RISCV_ISA_ZAWRS
+	bool "Zawrs extension support for more efficient busy waiting"
+	depends on RISCV_ALTERNATIVE
+	default y
+	help
+	  Enable the use of the Zawrs (wait for reservation set) extension
+	  when available.
+
+	  The Zawrs extension instructions (wrs.nto and wrs.sto) are used for
+	  more efficient busy waiting.
+
+	  If you don't know what to do here, say Y.
+
 config TOOLCHAIN_HAS_ZBB
 	bool
 	default y
diff --git a/arch/riscv/include/asm/barrier.h b/arch/riscv/include/asm/barrier.h
index 110752594228..93b3f572d643 100644
--- a/arch/riscv/include/asm/barrier.h
+++ b/arch/riscv/include/asm/barrier.h
@@ -11,11 +11,26 @@
 #define _ASM_RISCV_BARRIER_H
 
 #ifndef __ASSEMBLY__
+#include
+#include
+#include
+#include
 
 #define nop()		__asm__ __volatile__ ("nop")
 #define __nops(n)	".rept " #n "\nnop\n.endr\n"
 #define nops(n)		__asm__ __volatile__ (__nops(n))
 
+#define ZAWRS_WRS_NTO	".long 0x00d00073"
+#define ZAWRS_WRS_STO	".long 0x01d00073"
+#define ALT_WRS_NTO()							\
+	__asm__ __volatile__ (ALTERNATIVE(				\
+		"nop\n", ZAWRS_WRS_NTO "\n",				\
+		0, RISCV_ISA_EXT_ZAWRS, CONFIG_RISCV_ISA_ZAWRS))
+#define ALT_WRS_STO()							\
+	__asm__ __volatile__ (ALTERNATIVE(				\
+		"nop\n", ZAWRS_WRS_STO "\n",				\
+		0, RISCV_ISA_EXT_ZAWRS, CONFIG_RISCV_ISA_ZAWRS))
+
 #define RISCV_FENCE(p, s) \
 	__asm__ __volatile__ ("fence " #p "," #s : : : "memory")
 
@@ -44,6 +59,39 @@ do {									\
 	___p1;								\
 })
 
+#define ___smp_load_reservedN(attr, ptr)				\
+({									\
+	typeof(*ptr) ___p1;						\
+									\
+	__asm__ __volatile__ ("lr." attr "       %[p], %[c]\n"		\
+			      : [p]"=&r" (___p1), [c]"+A"(*ptr));	\
+	___p1;								\
+})
+
+#define __smp_load_reserved_relaxed(ptr)				\
+({									\
+	typeof(*ptr) ___p1;						\
+									\
+	if (sizeof(*ptr) == sizeof(int))				\
+		___p1 = ___smp_load_reservedN("w", ptr);		\
+	else if (sizeof(*ptr) == sizeof(long))				\
+		___p1 = ___smp_load_reservedN("d", ptr);		\
+	else								\
+		compiletime_assert(0,					\
+		"Need type compatible with LR/SC instructions for "	\
+		__stringify(ptr));					\
+	___p1;								\
+})
+
+#define __smp_load_reserved_acquire(ptr)				\
+({									\
+	typeof(*ptr) ___p1;						\
+									\
+	___p1 = __smp_load_reserved_relaxed(ptr);			\
+	RISCV_FENCE(r, rw);						\
+	___p1;								\
+})
+
 /*
  * This is a very specific barrier: it's currently only used in two places in
  * the kernel, both in the scheduler.  See include/linux/spinlock.h for the two
@@ -71,6 +119,40 @@ do {									\
  */
 #define smp_mb__after_spinlock()	RISCV_FENCE(iorw,iorw)
 
+#define smp_cond_load_relaxed(ptr, cond_expr)				\
+({									\
+	typeof(ptr) __PTR = (ptr);					\
+	__unqual_scalar_typeof(*ptr) VAL;				\
+									\
+	VAL = READ_ONCE(*__PTR);					\
+	if (!(cond_expr)) {						\
+		for (;;) {						\
+			VAL = __smp_load_reserved_relaxed(__PTR);	\
+			if (cond_expr)					\
+				break;					\
+			ALT_WRS_STO();					\
+		}							\
+	}								\
+	(typeof(*ptr))VAL;						\
+})
+
+#define smp_cond_load_acquire(ptr, cond_expr)				\
+({									\
+	typeof(ptr) __PTR = (ptr);					\
+	__unqual_scalar_typeof(*ptr) VAL;				\
+									\
+	VAL = smp_load_acquire(__PTR);					\
+	if (!(cond_expr)) {						\
+		for (;;) {						\
+			VAL = __smp_load_reserved_acquire(__PTR);	\
+			if (cond_expr)					\
+				break;					\
+			ALT_WRS_STO();					\
+		}							\
+	}								\
+	(typeof(*ptr))VAL;						\
+})
+
 #include
 
 #endif /* __ASSEMBLY__ */
diff --git a/arch/riscv/include/asm/hwcap.h b/arch/riscv/include/asm/hwcap.h
index 1f2d2599c655..eac7214a4bd0 100644
--- a/arch/riscv/include/asm/hwcap.h
+++ b/arch/riscv/include/asm/hwcap.h
@@ -80,6 +80,7 @@
 #define RISCV_ISA_EXT_ZFA		71
 #define RISCV_ISA_EXT_ZTSO		72
 #define RISCV_ISA_EXT_ZACAS		73
+#define RISCV_ISA_EXT_ZAWRS		74
 
 #define RISCV_ISA_EXT_XLINUXENVCFG	127
 
diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index 79a5a35fab96..0e3c79094b07 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -271,6 +271,7 @@ const struct riscv_isa_ext_data riscv_isa_ext[] = {
 	__RISCV_ISA_EXT_DATA(zihintpause, RISCV_ISA_EXT_ZIHINTPAUSE),
 	__RISCV_ISA_EXT_DATA(zihpm, RISCV_ISA_EXT_ZIHPM),
 	__RISCV_ISA_EXT_DATA(zacas, RISCV_ISA_EXT_ZACAS),
+	__RISCV_ISA_EXT_DATA(zawrs, RISCV_ISA_EXT_ZAWRS),
 	__RISCV_ISA_EXT_DATA(zfa, RISCV_ISA_EXT_ZFA),
 	__RISCV_ISA_EXT_DATA(zfh, RISCV_ISA_EXT_ZFH),
 	__RISCV_ISA_EXT_DATA(zfhmin, RISCV_ISA_EXT_ZFHMIN),
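
The control flow described in the commit message (peel off the first
iteration, then loop on a load-reserved followed by a stall hint) can be
sketched in portable C11 for readers who want to step through it on any
host. This is an illustration only and not part of the patch. All names
below (cond_load_relaxed, load_reserved_relaxed, wait_hint,
remote_store_hook, release_lock, is_zero) are hypothetical. The LR.W is
replaced by a plain relaxed atomic load, and WRS.STO is replaced by a
counter plus a test hook that plays the role of another hart storing to
the watched location.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for ALT_WRS_STO(). Real hardware would stall
 * until the reservation set is written or a short timeout expires.
 * Here we only count stalls and let an optional hook simulate a store
 * from another hart. */
static unsigned long stall_count;
static void (*remote_store_hook)(_Atomic uint32_t *ptr);

static void wait_hint(_Atomic uint32_t *ptr)
{
	stall_count++;
	if (remote_store_hook)
		remote_store_hook(ptr);
}

/* Hypothetical stand-in for LR.W: a relaxed atomic load. It does not
 * register a reservation set, which only matters on real hardware. */
static uint32_t load_reserved_relaxed(_Atomic uint32_t *ptr)
{
	return atomic_load_explicit(ptr, memory_order_relaxed);
}

/* The condition is a callback here, since plain C functions cannot
 * take an expression over VAL the way the kernel macro does. */
typedef int (*cond_fn)(uint32_t val, void *arg);

static uint32_t cond_load_relaxed(_Atomic uint32_t *ptr, cond_fn cond,
				  void *arg)
{
	/* Mirrors the patch's size restriction: only LR.W/LR.D widths. */
	static_assert(sizeof(uint32_t) == sizeof(int) ||
		      sizeof(uint32_t) == sizeof(long),
		      "need a type compatible with LR.W or LR.D");
	uint32_t val;

	/* Peeled first iteration: ordinary load, no reservation. */
	val = atomic_load_explicit(ptr, memory_order_relaxed);
	if (!cond(val, arg)) {
		for (;;) {
			/* Load-reserved, then re-check before stalling. */
			val = load_reserved_relaxed(ptr);
			if (cond(val, arg))
				break;
			wait_hint(ptr);
		}
	}
	return val;
}

/* Example condition: wait until the value becomes zero. */
static int is_zero(uint32_t val, void *arg)
{
	(void)arg;
	return val == 0;
}

/* Example hook: simulates the lock holder releasing the lock. */
static void release_lock(_Atomic uint32_t *ptr)
{
	atomic_store_explicit(ptr, 0, memory_order_release);
}
```

When the condition already holds, cond_load_relaxed() returns from the
peeled load without ever reaching wait_hint(), which corresponds to the
uncontended case never executing LR or WRS.STO. When the condition does
not hold, each failed check is followed by exactly one stall hint before
the value is reloaded.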