From patchwork Wed Dec 13 13:13:16 2023
X-Patchwork-Submitter: Andy Chiu
X-Patchwork-Id: 13491001
From: Andy Chiu
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Subject: [v4, 1/6] riscv: Add support for kernel mode vector
Date: Wed, 13 Dec 2023 13:13:16 +0000
Message-Id: <20231213131321.12862-2-andy.chiu@sifive.com>
In-Reply-To: <20231213131321.12862-1-andy.chiu@sifive.com>
References: <20231213131321.12862-1-andy.chiu@sifive.com>

From: Greentime Hu

Add kernel_vector_begin() and kernel_vector_end() function declarations
and corresponding definitions in kernel_mode_vector.c. These are needed
to wrap uses of vector in kernel mode.

Co-developed-by: Vincent Chen
Signed-off-by: Vincent Chen
Signed-off-by: Greentime Hu
Signed-off-by: Andy Chiu
---
Changelog v4:
 - Use kernel_v_flags and helpers to track vector context. Thus remove
   A-b from Conor.
Changelog v3:
 - Reorder patch 1 to patch 3 to make use of {get,put}_cpu_vector_context
   later.
 - Export {get,put}_cpu_vector_context.
 - Save V context after disabling preemption.
   (Guo)
 - Fix a build fail. (Conor)
 - Remove irqs_disabled() check as it is not needed, fix styling. (Björn)
Changelog v2:
 - 's/kernel_rvv/kernel_vector' and return void in kernel_vector_begin
   (Conor)
 - export may_use_simd to include/asm/simd.h
---
 arch/riscv/include/asm/processor.h     | 15 +++-
 arch/riscv/include/asm/simd.h          | 42 ++++++++++++
 arch/riscv/include/asm/vector.h        | 21 ++++++
 arch/riscv/kernel/Makefile             |  1 +
 arch/riscv/kernel/kernel_mode_vector.c | 95 ++++++++++++++++++++++++++
 arch/riscv/kernel/process.c            |  2 +-
 6 files changed, 174 insertions(+), 2 deletions(-)
 create mode 100644 arch/riscv/include/asm/simd.h
 create mode 100644 arch/riscv/kernel/kernel_mode_vector.c

diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 3e23e1786d05..10c796a792e7 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -72,6 +72,18 @@ struct task_struct;
 struct pt_regs;
 
+/*
+ * We use a flag to track in-kernel Vector context. Currently the flag has the
+ * following meaning:
+ *
+ * - bit 0 indicates whether the in-kernel Vector context is active. The
+ *   activation of this state disables the preemption.
+ */
+
+#define RISCV_KERNEL_MODE_V_MASK	0x1
+
+#define RISCV_KERNEL_MODE_V	0x1
+
 /* CPU-specific state of a task */
 struct thread_struct {
 	/* Callee-saved registers */
@@ -80,7 +92,8 @@ struct thread_struct {
 	unsigned long s[12];	/* s[0]: frame pointer */
 	struct __riscv_d_ext_state fstate;
 	unsigned long bad_cause;
-	unsigned long vstate_ctrl;
+	u32 riscv_v_flags;
+	u32 vstate_ctrl;
 	struct __riscv_v_ext_state vstate;
 };
 
diff --git a/arch/riscv/include/asm/simd.h b/arch/riscv/include/asm/simd.h
new file mode 100644
index 000000000000..269752bfa2cc
--- /dev/null
+++ b/arch/riscv/include/asm/simd.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2017 Linaro Ltd.
+ * Copyright (C) 2023 SiFive
+ */
+
+#ifndef __ASM_SIMD_H
+#define __ASM_SIMD_H
+
+#include
+#include
+#include
+#include
+#include
+
+#ifdef CONFIG_RISCV_ISA_V
+/*
+ * may_use_simd - whether it is allowable at this time to issue vector
+ *                instructions or access the vector register file
+ *
+ * Callers must not assume that the result remains true beyond the next
+ * preempt_enable() or return from softirq context.
+ */
+static __must_check inline bool may_use_simd(void)
+{
+	/*
+	 * RISCV_KERNEL_MODE_V is only set while preemption is disabled,
+	 * and is clear whenever preemption is enabled.
+	 */
+	return !in_hardirq() && !in_nmi() && !(riscv_v_ctx_cnt() & RISCV_KERNEL_MODE_V_MASK);
+}
+
+#else /* ! CONFIG_RISCV_ISA_V */
+
+static __must_check inline bool may_use_simd(void)
+{
+	return false;
+}
+
+#endif /* ! CONFIG_RISCV_ISA_V */
+
+#endif

diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
index c5ee07b3df07..a2db3dbe97eb 100644
--- a/arch/riscv/include/asm/vector.h
+++ b/arch/riscv/include/asm/vector.h
@@ -22,6 +22,27 @@ extern unsigned long riscv_v_vsize;
 int riscv_v_setup_vsize(void);
 bool riscv_v_first_use_handler(struct pt_regs *regs);
+void kernel_vector_begin(void);
+void kernel_vector_end(void);
+void get_cpu_vector_context(void);
+void put_cpu_vector_context(void);
+
+static inline void riscv_v_ctx_cnt_add(u32 offset)
+{
+	current->thread.riscv_v_flags += offset;
+	barrier();
+}
+
+static inline void riscv_v_ctx_cnt_sub(u32 offset)
+{
+	barrier();
+	current->thread.riscv_v_flags -= offset;
+}
+
+static inline u32 riscv_v_ctx_cnt(void)
+{
+	return READ_ONCE(current->thread.riscv_v_flags);
+}
 
 static __always_inline bool has_vector(void)
 {

diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index 95cf25d48405..0597bb668b6e 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -62,6 +62,7 @@ obj-$(CONFIG_MMU) += vdso.o vdso/
 obj-$(CONFIG_RISCV_M_MODE) += traps_misaligned.o
 obj-$(CONFIG_FPU) += fpu.o
 obj-$(CONFIG_RISCV_ISA_V) += vector.o
+obj-$(CONFIG_RISCV_ISA_V) += kernel_mode_vector.o
 obj-$(CONFIG_SMP) += smpboot.o
 obj-$(CONFIG_SMP) += smp.o
 obj-$(CONFIG_SMP) += cpu_ops.o

diff --git a/arch/riscv/kernel/kernel_mode_vector.c b/arch/riscv/kernel/kernel_mode_vector.c
new file mode 100644
index 000000000000..c9ccf21dd16c
--- /dev/null
+++ b/arch/riscv/kernel/kernel_mode_vector.c
@@ -0,0 +1,95 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ * Author: Catalin Marinas
+ * Copyright (C) 2017 Linaro Ltd.
+ * Copyright (C) 2021 SiFive
+ */
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+
+/*
+ * Claim ownership of the CPU vector context for use by the calling context.
+ *
+ * The caller may freely manipulate the vector context metadata until
+ * put_cpu_vector_context() is called.
+ */
+void get_cpu_vector_context(void)
+{
+	preempt_disable();
+
+	WARN_ON(riscv_v_ctx_cnt() & RISCV_KERNEL_MODE_V_MASK);
+	riscv_v_ctx_cnt_add(RISCV_KERNEL_MODE_V);
+}
+
+/*
+ * Release the CPU vector context.
+ *
+ * Must be called from a context in which get_cpu_vector_context() was
+ * previously called, with no call to put_cpu_vector_context() in the
+ * meantime.
+ */
+void put_cpu_vector_context(void)
+{
+	WARN_ON(!(riscv_v_ctx_cnt() & RISCV_KERNEL_MODE_V_MASK));
+	riscv_v_ctx_cnt_sub(RISCV_KERNEL_MODE_V);
+
+	preempt_enable();
+}
+
+/*
+ * kernel_vector_begin(): obtain the CPU vector registers for use by the calling
+ * context
+ *
+ * Must not be called unless may_use_simd() returns true.
+ * Task context in the vector registers is saved back to memory as necessary.
+ *
+ * A matching call to kernel_vector_end() must be made before returning from the
+ * calling context.
+ *
+ * The caller may freely use the vector registers until kernel_vector_end() is called.
+ */
+void kernel_vector_begin(void)
+{
+	if (WARN_ON(!has_vector()))
+		return;
+
+	BUG_ON(!may_use_simd());
+
+	get_cpu_vector_context();
+
+	riscv_v_vstate_save(current, task_pt_regs(current));
+
+	riscv_v_enable();
+}
+EXPORT_SYMBOL_GPL(kernel_vector_begin);
+
+/*
+ * kernel_vector_end(): give the CPU vector registers back to the current task
+ *
+ * Must be called from a context in which kernel_vector_begin() was previously
+ * called, with no call to kernel_vector_end() in the meantime.
+ *
+ * The caller must not use the vector registers after this function is called,
+ * unless kernel_vector_begin() is called again in the meantime.
+ */
+void kernel_vector_end(void)
+{
+	if (WARN_ON(!has_vector()))
+		return;
+
+	riscv_v_vstate_restore(current, task_pt_regs(current));
+
+	riscv_v_disable();
+
+	put_cpu_vector_context();
+}
+EXPORT_SYMBOL_GPL(kernel_vector_end);

diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index e32d737e039f..bb879ab38101 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -169,7 +169,6 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	*dst = *src;
 	/* clear entire V context, including datap for a new task */
 	memset(&dst->thread.vstate, 0, sizeof(struct __riscv_v_ext_state));
-
 	return 0;
 }
@@ -203,6 +202,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
 		childregs->a0 = 0; /* Return value of fork() */
 		p->thread.s[0] = 0;
 	}
+	p->thread.riscv_v_flags = 0;
 	p->thread.ra = (unsigned long)ret_from_fork;
 	p->thread.sp = (unsigned long)childregs; /* kernel sp */
 	return 0;

From patchwork Wed Dec 13 13:13:17 2023
X-Patchwork-Submitter: Andy Chiu
X-Patchwork-Id: 13491002
From: Andy Chiu
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Subject: [v4, 2/6] riscv: vector: make Vector always available for softirq context
Date: Wed, 13 Dec 2023 13:13:17 +0000
Message-Id: <20231213131321.12862-3-andy.chiu@sifive.com>
In-Reply-To: <20231213131321.12862-1-andy.chiu@sifive.com>
References: <20231213131321.12862-1-andy.chiu@sifive.com>

By disabling bottom halves while kernel-mode Vector is active, softirq
will not be able to nest on top of any kernel-mode Vector. After this
patch, Vector context cannot start with irqs disabled. Otherwise
local_bh_enable() may run in a wrong context.

Disabling bh is not enough for an RT kernel to prevent preemption. So we
must disable preemption, which also implies disabling bh on RT.
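The flag protocol behind this can be sketched as a small userspace model. This is a hedged sketch, not kernel code: `bh_disable_cnt`/`preempt_cnt` are stand-in counters for `local_bh_disable()`/`preempt_disable()` nesting, `preempt_rt` models `IS_ENABLED(CONFIG_PREEMPT_RT)`, and the irq-state checks of the real `may_use_simd()` are omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Mirrors the definitions from patch 1 (bit 0 = in-kernel V active). */
#define RISCV_KERNEL_MODE_V_MASK 0x1
#define RISCV_KERNEL_MODE_V      0x1

static unsigned int riscv_v_flags;    /* models current->thread.riscv_v_flags */
static int bh_disable_cnt;            /* models local_bh_disable() nesting */
static int preempt_cnt;               /* models preempt_disable() nesting */
static const bool preempt_rt = false; /* pretend !CONFIG_PREEMPT_RT */

static void get_cpu_vector_context(void)
{
	if (!preempt_rt)
		bh_disable_cnt++;     /* local_bh_disable() */
	else
		preempt_cnt++;        /* preempt_disable() */

	/* WARN_ON in the kernel: claiming an already-claimed context */
	assert(!(riscv_v_flags & RISCV_KERNEL_MODE_V_MASK));
	riscv_v_flags += RISCV_KERNEL_MODE_V;
}

static void put_cpu_vector_context(void)
{
	/* WARN_ON in the kernel: releasing an unclaimed context */
	assert(riscv_v_flags & RISCV_KERNEL_MODE_V_MASK);
	riscv_v_flags -= RISCV_KERNEL_MODE_V;

	if (!preempt_rt)
		bh_disable_cnt--;     /* local_bh_enable() */
	else
		preempt_cnt--;        /* preempt_enable() */
}

/* Only the flag check of may_use_simd(); irq/nmi checks omitted here. */
static bool may_use_simd(void)
{
	return !(riscv_v_flags & RISCV_KERNEL_MODE_V_MASK);
}
```

Flipping `preempt_rt` changes which counter moves, not the flag protocol itself, which is the point of this patch: RT swaps the bh protection for preemption protection while nesting is still refused through bit 0.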
Related-to: commit 696207d4258b ("arm64/sve: Make kernel FPU protection RT friendly")
Related-to: commit 66c3ec5a7120 ("arm64: neon: Forbid when irqs are disabled")
Signed-off-by: Andy Chiu
---
Changelog v4:
 - new patch since v4
---
 arch/riscv/include/asm/simd.h          |  6 +++++-
 arch/riscv/kernel/kernel_mode_vector.c | 10 ++++++++--
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/include/asm/simd.h b/arch/riscv/include/asm/simd.h
index 269752bfa2cc..cd6180fe37c0 100644
--- a/arch/riscv/include/asm/simd.h
+++ b/arch/riscv/include/asm/simd.h
@@ -26,8 +26,12 @@ static __must_check inline bool may_use_simd(void)
 	/*
 	 * RISCV_KERNEL_MODE_V is only set while preemption is disabled,
 	 * and is clear whenever preemption is enabled.
+	 *
+	 * Kernel-mode Vector temporarily disables bh. So we must not return
+	 * true on irqs_disabled(). Otherwise we would fail the lockdep check
+	 * calling local_bh_enable().
 	 */
-	return !in_hardirq() && !in_nmi() && !(riscv_v_ctx_cnt() & RISCV_KERNEL_MODE_V_MASK);
+	return !in_hardirq() && !in_nmi() && !irqs_disabled() && !(riscv_v_ctx_cnt() & RISCV_KERNEL_MODE_V_MASK);
 }
 
 #else /* ! CONFIG_RISCV_ISA_V */

diff --git a/arch/riscv/kernel/kernel_mode_vector.c b/arch/riscv/kernel/kernel_mode_vector.c
index c9ccf21dd16c..52e42f74ec9a 100644
--- a/arch/riscv/kernel/kernel_mode_vector.c
+++ b/arch/riscv/kernel/kernel_mode_vector.c
@@ -23,7 +23,10 @@
  */
 void get_cpu_vector_context(void)
 {
-	preempt_disable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_bh_disable();
+	else
+		preempt_disable();
 
 	WARN_ON(riscv_v_ctx_cnt() & RISCV_KERNEL_MODE_V_MASK);
 	riscv_v_ctx_cnt_add(RISCV_KERNEL_MODE_V);
@@ -41,7 +44,10 @@ void put_cpu_vector_context(void)
 	WARN_ON(!(riscv_v_ctx_cnt() & RISCV_KERNEL_MODE_V_MASK));
 	riscv_v_ctx_cnt_sub(RISCV_KERNEL_MODE_V);
 
-	preempt_enable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_bh_enable();
+	else
+		preempt_enable();
 }
 
 /*

From patchwork Wed Dec 13 13:13:18 2023
X-Patchwork-Submitter: Andy Chiu
X-Patchwork-Id: 13491003

From: Andy Chiu
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Subject: [v4, 3/6] riscv: Add vector extension XOR implementation
Date: Wed, 13 Dec 2023 13:13:18 +0000
Message-Id: <20231213131321.12862-4-andy.chiu@sifive.com>
In-Reply-To: <20231213131321.12862-1-andy.chiu@sifive.com>
References: <20231213131321.12862-1-andy.chiu@sifive.com>
From: Greentime Hu

This patch adds support for vector optimized XOR and it is tested in
qemu.

Co-developed-by: Han-Kuan Chen
Signed-off-by: Han-Kuan Chen
Signed-off-by: Greentime Hu
Signed-off-by: Andy Chiu
Acked-by: Conor Dooley
---
Changelog v2:
 - 's/rvv/vector/' (Conor)
---
 arch/riscv/include/asm/xor.h | 82 ++++++++++++++++++++++++++++++++++++
 arch/riscv/lib/Makefile      |  1 +
 arch/riscv/lib/xor.S         | 81 +++++++++++++++++++++++++++++++++++
 3 files changed, 164 insertions(+)
 create mode 100644 arch/riscv/include/asm/xor.h
 create mode 100644 arch/riscv/lib/xor.S

diff --git a/arch/riscv/include/asm/xor.h b/arch/riscv/include/asm/xor.h
new file mode 100644
index 000000000000..903c3275f8d0
--- /dev/null
+++ b/arch/riscv/include/asm/xor.h
@@ -0,0 +1,82 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2021 SiFive
+ */
+
+#include
+#include
+#ifdef CONFIG_RISCV_ISA_V
+#include
+#include
+
+void xor_regs_2_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2);
+void xor_regs_3_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2,
+		 const unsigned long *__restrict p3);
+void xor_regs_4_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2,
+		 const unsigned long *__restrict p3,
+		 const unsigned long *__restrict p4);
+void xor_regs_5_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2,
+		 const unsigned long *__restrict p3,
+		 const unsigned long *__restrict p4,
+		 const unsigned long *__restrict p5);
+
+static void xor_vector_2(unsigned long bytes, unsigned long *__restrict p1,
+			 const unsigned long *__restrict p2)
+{
+	kernel_vector_begin();
+	xor_regs_2_(bytes, p1, p2);
+	kernel_vector_end();
+}
+
+static void xor_vector_3(unsigned long bytes, unsigned long *__restrict p1,
+			 const unsigned long *__restrict p2,
+			 const unsigned long *__restrict p3)
+{
+	kernel_vector_begin();
+	xor_regs_3_(bytes, p1, p2, p3);
+	kernel_vector_end();
+}
+
+static void xor_vector_4(unsigned long bytes, unsigned long *__restrict p1,
+			 const unsigned long *__restrict p2,
+			 const unsigned long *__restrict p3,
+			 const unsigned long *__restrict p4)
+{
+	kernel_vector_begin();
+	xor_regs_4_(bytes, p1, p2, p3, p4);
+	kernel_vector_end();
+}
+
+static void xor_vector_5(unsigned long bytes, unsigned long *__restrict p1,
+			 const unsigned long *__restrict p2,
+			 const unsigned long *__restrict p3,
+			 const unsigned long *__restrict p4,
+			 const unsigned long *__restrict p5)
+{
+	kernel_vector_begin();
+	xor_regs_5_(bytes, p1, p2, p3, p4, p5);
+	kernel_vector_end();
+}
+
+static struct xor_block_template xor_block_rvv = {
+	.name = "rvv",
+	.do_2 = xor_vector_2,
+	.do_3 = xor_vector_3,
+	.do_4 = xor_vector_4,
+	.do_5 = xor_vector_5
+};
+
+#undef XOR_TRY_TEMPLATES
+#define XOR_TRY_TEMPLATES	\
+	do {	\
+		xor_speed(&xor_block_8regs);	\
+		xor_speed(&xor_block_32regs);	\
+		if (has_vector()) {	\
+			xor_speed(&xor_block_rvv);\
+		}	\
+	} while (0)
+#endif

diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 26cb2502ecf8..494f9cd1a00c 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -11,3 +11,4 @@ lib-$(CONFIG_64BIT) += tishift.o
 lib-$(CONFIG_RISCV_ISA_ZICBOZ) += clear_page.o
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
+lib-$(CONFIG_RISCV_ISA_V) += xor.o

diff --git a/arch/riscv/lib/xor.S b/arch/riscv/lib/xor.S
new file mode 100644
index 000000000000..3bc059e18171
--- /dev/null
+++ b/arch/riscv/lib/xor.S
@@ -0,0 +1,81 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2021 SiFive
+ */
+#include
+#include
+#include
+
+ENTRY(xor_regs_2_)
+	vsetvli a3, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a3
+	vxor.vv v16, v0, v8
+	add a2, a2, a3
+	vse8.v v16, (a1)
+	add a1, a1, a3
+	bnez a0, xor_regs_2_
+	ret
+END(xor_regs_2_)
+EXPORT_SYMBOL(xor_regs_2_)
+
+ENTRY(xor_regs_3_)
+	vsetvli a4, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a4
+	vxor.vv v0, v0, v8
+	vle8.v v16, (a3)
+	add a2, a2, a4
+	vxor.vv v16, v0, v16
+	add a3, a3, a4
+	vse8.v v16, (a1)
+	add a1, a1, a4
+	bnez a0, xor_regs_3_
+	ret
+END(xor_regs_3_)
+EXPORT_SYMBOL(xor_regs_3_)
+
+ENTRY(xor_regs_4_)
+	vsetvli a5, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a5
+	vxor.vv v0, v0, v8
+	vle8.v v16, (a3)
+	add a2, a2, a5
+	vxor.vv v0, v0, v16
+	vle8.v v24, (a4)
+	add a3, a3, a5
+	vxor.vv v16, v0, v24
+	add a4, a4, a5
+	vse8.v v16, (a1)
+	add a1, a1, a5
+	bnez a0, xor_regs_4_
+	ret
+END(xor_regs_4_)
+EXPORT_SYMBOL(xor_regs_4_)
+
+ENTRY(xor_regs_5_)
+	vsetvli a6, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a6
+	vxor.vv v0, v0, v8
+	vle8.v v16, (a3)
+	add a2, a2, a6
+	vxor.vv v0, v0, v16
+	vle8.v v24, (a4)
+	add a3, a3, a6
+	vxor.vv v0, v0, v24
+	vle8.v v8, (a5)
+	add a4, a4, a6
+	vxor.vv v16, v0, v8
+	add a5, a5, a6
+	vse8.v v16, (a1)
+	add a1, a1, a6
+	bnez a0, xor_regs_5_
+	ret
+END(xor_regs_5_)
+EXPORT_SYMBOL(xor_regs_5_)

From patchwork Wed Dec 13 13:13:19 2023
X-Patchwork-Submitter: Andy Chiu
X-Patchwork-Id: 13491004
From: Andy Chiu
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Cc: greentime.hu@sifive.com, guoren@linux.alibaba.com, bjorn@kernel.org,
    charlie@rivosinc.com, ardb@kernel.org, arnd@arndb.de, Andy Chiu,
    Paul Walmsley, Albert Ou, Oleg Nesterov, Guo Ren, Jisheng Zhang,
    Yipeng Zou, Conor Dooley, Björn Töpel, Vincent Chen, Heiko Stuebner,
    Peter Zijlstra, Mathis Salmen
Subject: [v4, 4/6] riscv: sched: defer restoring Vector context for user
Date: Wed, 13 Dec 2023 13:13:19 +0000
Message-Id: <20231213131321.12862-5-andy.chiu@sifive.com>
In-Reply-To: <20231213131321.12862-1-andy.chiu@sifive.com>
References: <20231213131321.12862-1-andy.chiu@sifive.com>

A user touches its Vector registers only after the kernel actually returns to userspace, so we can delay restoring the Vector context for as long as we are still running in kernel mode. Add a thread flag that indicates the need to restore Vector, and do the restore at the last arch-specific exit-to-user hook. This saves the context-restore cost when we switch over multiple processes that run V in kernel mode. For example, if the kernel performs a context switch from A->B->C and returns to C's userspace, then there is no need to restore B's V-registers.
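The switching behaviour described above can be sketched as a minimal userspace C model (the names here, `struct task`, `switch_to`, `exit_to_user`, are hypothetical stand-ins for the kernel's `TIF_RISCV_V_DEFER_RESTORE` bookkeeping, not real kernel APIs): each context switch merely marks the incoming task, and only the exit-to-user hook pays for a restore, so an A->B->C switch chain restores exactly one context.

```c
#include <stdbool.h>

/* Userspace model only: "defer_restore" mirrors the new
 * TIF_RISCV_V_DEFER_RESTORE thread flag, and "restores" counts how
 * many times a real vstate restore would have run. */
struct task {
    bool defer_restore;
    int restores;
};

/* Models __switch_to_vector() after this patch: no restore, just mark
 * the incoming task as needing one. */
static void switch_to(struct task *next)
{
    next->defer_restore = true;
}

/* Models arch_exit_to_user_mode_prepare(): restore once, lazily,
 * right before returning to userspace. */
static void exit_to_user(struct task *t)
{
    if (t->defer_restore) {
        t->defer_restore = false;
        t->restores++; /* stands in for riscv_v_vstate_restore() */
    }
}
```

Driving the model through an A->B->C switch chain and then exiting to user from C leaves `a.restores == 0`, `b.restores == 0`, and `c.restores == 1`, which is the saving the commit message describes.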
Besides, this also prevents us from repeatedly restoring the V context when kernel-mode Vector is executed multiple times.

The cost is that we must disable preemption and mark Vector as busy during vstate_{save,restore}, so that the V context does not get restored back immediately when a trap-causing context switch happens in the middle of vstate_{save,restore}.

Signed-off-by: Andy Chiu
Acked-by: Conor Dooley
---
Changelog v4:
 - fix typos and re-add Conor's A-b.
Changelog v3:
 - Guard {get,put}_cpu_vector_context between vstate_* operation and
   explain it in the commit msg.
 - Drop R-b from Björn and A-b from Conor.
Changelog v2:
 - rename and add comment for the new thread flag (Conor)
---
 arch/riscv/include/asm/entry-common.h  | 17 +++++++++++++++++
 arch/riscv/include/asm/thread_info.h   |  2 ++
 arch/riscv/include/asm/vector.h        | 11 ++++++++++-
 arch/riscv/kernel/kernel_mode_vector.c |  2 +-
 arch/riscv/kernel/process.c            |  2 ++
 arch/riscv/kernel/ptrace.c             |  5 ++++-
 arch/riscv/kernel/signal.c             |  5 ++++-
 arch/riscv/kernel/vector.c             |  2 +-
 8 files changed, 41 insertions(+), 5 deletions(-)

diff --git a/arch/riscv/include/asm/entry-common.h b/arch/riscv/include/asm/entry-common.h
index 6e4dee49d84b..5bc90b38590f 100644
--- a/arch/riscv/include/asm/entry-common.h
+++ b/arch/riscv/include/asm/entry-common.h
@@ -4,6 +4,23 @@
 #define _ASM_RISCV_ENTRY_COMMON_H
 
 #include
+#include
+#include
+
+static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs,
+						  unsigned long ti_work)
+{
+	if (ti_work & _TIF_RISCV_V_DEFER_RESTORE) {
+		clear_thread_flag(TIF_RISCV_V_DEFER_RESTORE);
+		/*
+		 * We are already called with irq disabled, so go without
+		 * keeping track of vector_context_busy.
+		 */
+		riscv_v_vstate_restore(current, regs);
+	}
+}
+
+#define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare
 
 void handle_page_fault(struct pt_regs *regs);
 void handle_break(struct pt_regs *regs);
diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h
index 1833beb00489..b182f2d03e25 100644
--- a/arch/riscv/include/asm/thread_info.h
+++ b/arch/riscv/include/asm/thread_info.h
@@ -93,12 +93,14 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
 #define TIF_NOTIFY_SIGNAL	9	/* signal notifications exist */
 #define TIF_UPROBE		10	/* uprobe breakpoint or singlestep */
 #define TIF_32BIT		11	/* compat-mode 32bit process */
+#define TIF_RISCV_V_DEFER_RESTORE	12 /* restore Vector before returning to user */
 
 #define _TIF_NOTIFY_RESUME	(1 << TIF_NOTIFY_RESUME)
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)
 #define _TIF_NOTIFY_SIGNAL	(1 << TIF_NOTIFY_SIGNAL)
 #define _TIF_UPROBE		(1 << TIF_UPROBE)
+#define _TIF_RISCV_V_DEFER_RESTORE	(1 << TIF_RISCV_V_DEFER_RESTORE)
 
 #define _TIF_WORK_MASK \
 	(_TIF_NOTIFY_RESUME | _TIF_SIGPENDING | _TIF_NEED_RESCHED | \
diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
index a2db3dbe97eb..83c36b661b14 100644
--- a/arch/riscv/include/asm/vector.h
+++ b/arch/riscv/include/asm/vector.h
@@ -205,6 +205,15 @@ static inline void riscv_v_vstate_restore(struct task_struct *task,
 	}
 }
 
+static inline void riscv_v_vstate_set_restore(struct task_struct *task,
+					      struct pt_regs *regs)
+{
+	if ((regs->status & SR_VS) != SR_VS_OFF) {
+		set_tsk_thread_flag(task, TIF_RISCV_V_DEFER_RESTORE);
+		riscv_v_vstate_on(regs);
+	}
+}
+
 static inline void __switch_to_vector(struct task_struct *prev,
 				      struct task_struct *next)
 {
@@ -212,7 +221,7 @@ static inline void __switch_to_vector(struct task_struct *prev,
 
 	regs = task_pt_regs(prev);
 	riscv_v_vstate_save(prev, regs);
-	riscv_v_vstate_restore(next, task_pt_regs(next));
+	riscv_v_vstate_set_restore(next, task_pt_regs(next));
 }
 
 void riscv_v_vstate_ctrl_init(struct task_struct *tsk);
diff --git a/arch/riscv/kernel/kernel_mode_vector.c b/arch/riscv/kernel/kernel_mode_vector.c
index 52e42f74ec9a..c5b86b554d1a 100644
--- a/arch/riscv/kernel/kernel_mode_vector.c
+++ b/arch/riscv/kernel/kernel_mode_vector.c
@@ -92,7 +92,7 @@ void kernel_vector_end(void)
 	if (WARN_ON(!has_vector()))
 		return;
 
-	riscv_v_vstate_restore(current, task_pt_regs(current));
+	riscv_v_vstate_set_restore(current, task_pt_regs(current));
 
 	riscv_v_disable();
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index bb879ab38101..bfd570cf601e 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -153,6 +153,7 @@ void flush_thread(void)
 	riscv_v_vstate_off(task_pt_regs(current));
 	kfree(current->thread.vstate.datap);
 	memset(&current->thread.vstate, 0, sizeof(struct __riscv_v_ext_state));
+	clear_tsk_thread_flag(current, TIF_RISCV_V_DEFER_RESTORE);
 #endif
 }
 
@@ -169,6 +170,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	*dst = *src;
 	/* clear entire V context, including datap for a new task */
 	memset(&dst->thread.vstate, 0, sizeof(struct __riscv_v_ext_state));
+	clear_tsk_thread_flag(dst, TIF_RISCV_V_DEFER_RESTORE);
 
 	return 0;
 }
diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c
index 2afe460de16a..7b93bcbdf9fa 100644
--- a/arch/riscv/kernel/ptrace.c
+++ b/arch/riscv/kernel/ptrace.c
@@ -99,8 +99,11 @@ static int riscv_vr_get(struct task_struct *target,
 	 * Ensure the vector registers have been saved to the memory before
 	 * copying them to membuf.
 	 */
-	if (target == current)
+	if (target == current) {
+		get_cpu_vector_context();
 		riscv_v_vstate_save(current, task_pt_regs(current));
+		put_cpu_vector_context();
+	}
 
 	ptrace_vstate.vstart = vstate->vstart;
 	ptrace_vstate.vl = vstate->vl;
diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c
index 180d951d3624..d31d2c74d31f 100644
--- a/arch/riscv/kernel/signal.c
+++ b/arch/riscv/kernel/signal.c
@@ -86,7 +86,10 @@ static long save_v_state(struct pt_regs *regs, void __user **sc_vec)
 	/* datap is designed to be 16 byte aligned for better performance */
 	WARN_ON(unlikely(!IS_ALIGNED((unsigned long)datap, 16)));
 
+	get_cpu_vector_context();
 	riscv_v_vstate_save(current, regs);
+	put_cpu_vector_context();
+
 	/* Copy everything of vstate but datap. */
 	err = __copy_to_user(&state->v_state, &current->thread.vstate,
 			     offsetof(struct __riscv_v_ext_state, datap));
@@ -134,7 +137,7 @@ static long __restore_v_state(struct pt_regs *regs, void __user *sc_vec)
 	if (unlikely(err))
 		return err;
 
-	riscv_v_vstate_restore(current, regs);
+	riscv_v_vstate_set_restore(current, regs);
 
 	return err;
 }
diff --git a/arch/riscv/kernel/vector.c b/arch/riscv/kernel/vector.c
index 8d92fb6c522c..9d583b760db4 100644
--- a/arch/riscv/kernel/vector.c
+++ b/arch/riscv/kernel/vector.c
@@ -167,7 +167,7 @@ bool riscv_v_first_use_handler(struct pt_regs *regs)
 		return true;
 	}
 	riscv_v_vstate_on(regs);
-	riscv_v_vstate_restore(current, regs);
+	riscv_v_vstate_set_restore(current, regs);
 
 	return true;
 }

From patchwork Wed Dec 13 13:13:20 2023
X-Patchwork-Submitter: Andy Chiu
X-Patchwork-Id: 13491005
From: Andy Chiu
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Cc: greentime.hu@sifive.com, guoren@linux.alibaba.com, bjorn@kernel.org,
    charlie@rivosinc.com, ardb@kernel.org, arnd@arndb.de, Andy Chiu,
    Paul Walmsley, Albert Ou, Conor Dooley, Andrew Jones, Han-Kuan Chen,
    Heiko Stuebner, Aurelien Jarno, Björn Töpel, Alexandre Ghiti, Bo YU
Subject: [v4, 5/6] riscv: lib: vectorize copy_to_user/copy_from_user
Date: Wed, 13 Dec 2023 13:13:20 +0000
Message-Id: <20231213131321.12862-6-andy.chiu@sifive.com>
In-Reply-To: <20231213131321.12862-1-andy.chiu@sifive.com>
References: <20231213131321.12862-1-andy.chiu@sifive.com>

This patch utilizes Vector to perform copy_to_user/copy_from_user. If Vector is available and the size of the copy is large enough for Vector to perform better than scalar code, then direct the kernel to do Vector copies for userspace. Though the best programming practice for users is to reduce copies, this provides a faster variant when copies are inevitable.

The optimal size for using Vector, copy_to_user_thres, is only a heuristic for now. We can add DT parsing if people feel the need to customize it.
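The dispatch described above (threshold check, vector attempt, scalar retry of the uncopied tail) can be modeled in plain userspace C. All names here (`vector_copy`, `usercopy`, `USERCOPY_THRES`, `fault_at`) are hypothetical stand-ins: the simulated partial copy plays the role of a page fault inside `__asm_vector_usercopy`, and plain `memcpy` stands in for the scalar fallback path.

```c
#include <stddef.h>
#include <string.h>

#define USERCOPY_THRES 768 /* mirrors the 768-byte heuristic in the patch */

/* Stand-in for __asm_vector_usercopy: copies until "fault_at" and
 * returns the number of bytes it could NOT copy (0 on full success). */
static size_t vector_copy(char *dst, const char *src, size_t n, size_t fault_at)
{
    size_t done = (n < fault_at) ? n : fault_at;
    memcpy(dst, src, done);
    return n - done;
}

/* Mirrors the shape of enter_vector_usercopy(): small copies skip the
 * vector path entirely; a partial vector copy advances both pointers
 * and hands the remaining tail to the scalar fallback. */
static size_t usercopy(char *dst, const char *src, size_t n, size_t fault_at)
{
    if (n < USERCOPY_THRES) {
        memcpy(dst, src, n); /* scalar-only path */
        return 0;
    }
    size_t remain = vector_copy(dst, src, n, fault_at);
    if (remain) {
        size_t copied = n - remain;
        memcpy(dst + copied, src + copied, remain); /* scalar retry of tail */
    }
    return 0;
}
```

Even when the "vector" pass gives up halfway, the destination ends up fully populated, which is the invariant the real fixup path must preserve.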
The exception fixup code of __asm_vector_usercopy must fall back to the scalar version, because accessing user pages might fault and the handler must be sleepable. Current kernel-mode Vector does not allow tasks to be preemptible, so we must deactivate Vector and perform the scalar fallback in such a case.

The original implementation of the Vector operations comes from https://github.com/sifive/sifive-libc, which we agree to contribute to the Linux kernel.

Signed-off-by: Andy Chiu
---
Changelog v4:
 - new patch since v4
---
 arch/riscv/lib/Makefile          |  2 ++
 arch/riscv/lib/riscv_v_helpers.c | 38 ++++++++++++++++++++++
 arch/riscv/lib/uaccess.S         | 11 +++++++
 arch/riscv/lib/uaccess_vector.S  | 55 ++++++++++++++++++++++++++++++++
 4 files changed, 106 insertions(+)
 create mode 100644 arch/riscv/lib/riscv_v_helpers.c
 create mode 100644 arch/riscv/lib/uaccess_vector.S

diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 494f9cd1a00c..1fe8d797e0f2 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -12,3 +12,5 @@ lib-$(CONFIG_RISCV_ISA_ZICBOZ)	+= clear_page.o
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
 
 lib-$(CONFIG_RISCV_ISA_V)	+= xor.o
+lib-$(CONFIG_RISCV_ISA_V)	+= riscv_v_helpers.o
+lib-$(CONFIG_RISCV_ISA_V)	+= uaccess_vector.o
diff --git a/arch/riscv/lib/riscv_v_helpers.c b/arch/riscv/lib/riscv_v_helpers.c
new file mode 100644
index 000000000000..d763b9c69fb7
--- /dev/null
+++ b/arch/riscv/lib/riscv_v_helpers.c
@@ -0,0 +1,38 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2023 SiFive
+ * Author: Andy Chiu
+ */
+#include
+#include
+
+#include
+#include
+
+size_t riscv_v_usercopy_thres = 768;
+int __asm_vector_usercopy(void *dst, void *src, size_t n);
+int fallback_scalar_usercopy(void *dst, void *src, size_t n);
+asmlinkage int enter_vector_usercopy(void *dst, void *src, size_t n)
+{
+	size_t remain, copied;
+
+	/* skip has_vector() check because it has been done by the asm */
+	if (!may_use_simd())
+		goto fallback;
+
+	kernel_vector_begin();
+	remain = __asm_vector_usercopy(dst, src, n);
+	kernel_vector_end();
+
+	if (remain) {
+		copied = n - remain;
+		dst += copied;
+		src += copied;
+		goto fallback;
+	}
+
+	return remain;
+
+fallback:
+	return fallback_scalar_usercopy(dst, src, n);
+}
diff --git a/arch/riscv/lib/uaccess.S b/arch/riscv/lib/uaccess.S
index 09b47ebacf2e..b43fd189b534 100644
--- a/arch/riscv/lib/uaccess.S
+++ b/arch/riscv/lib/uaccess.S
@@ -3,6 +3,8 @@
 #include
 #include
 #include
+#include
+#include
 
 	.macro fixup op reg addr lbl
 100:
@@ -12,6 +14,14 @@
 
 ENTRY(__asm_copy_to_user)
 ENTRY(__asm_copy_from_user)
+#ifdef CONFIG_RISCV_ISA_V
+	ALTERNATIVE("j fallback_scalar_usercopy", "nop", 0, RISCV_ISA_EXT_v, CONFIG_RISCV_ISA_V)
+	la	t0, riscv_v_usercopy_thres
+	REG_L	t0, (t0)
+	bltu	a2, t0, fallback_scalar_usercopy
+	tail	enter_vector_usercopy
+#endif
+ENTRY(fallback_scalar_usercopy)
 
 	/* Enable access to user memory */
 	li	t6, SR_SUM
@@ -181,6 +191,7 @@ ENTRY(__asm_copy_from_user)
 	csrc	CSR_STATUS, t6
 	sub	a0, t5, a0
 	ret
+ENDPROC(fallback_scalar_usercopy)
 ENDPROC(__asm_copy_to_user)
 ENDPROC(__asm_copy_from_user)
 EXPORT_SYMBOL(__asm_copy_to_user)
diff --git a/arch/riscv/lib/uaccess_vector.S b/arch/riscv/lib/uaccess_vector.S
new file mode 100644
index 000000000000..98226f77efbd
--- /dev/null
+++ b/arch/riscv/lib/uaccess_vector.S
@@ -0,0 +1,55 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include
+#include
+#include
+#include
+#include
+
+#define pDst a0
+#define pSrc a1
+#define iNum a2
+
+#define iVL a3
+#define pDstPtr a4
+
+#define ELEM_LMUL_SETTING m8
+#define vData v0
+
+	.macro fixup op reg addr lbl
+100:
+	\op \reg, \addr
+	_asm_extable 100b, \lbl
+	.endm
+
+ENTRY(__asm_vector_usercopy)
+	/* Enable access to user memory */
+	li	t6, SR_SUM
+	csrs	CSR_STATUS, t6
+
+	/* Save for return value */
+	mv	t5, a2
+
+	mv	pDstPtr, pDst
+loop:
+	vsetvli	iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
+	fixup vle8.v vData, (pSrc), 10f
+	fixup vse8.v vData, (pDstPtr), 10f
+	sub	iNum, iNum, iVL
+	add	pSrc, pSrc, iVL
+	add	pDstPtr, pDstPtr, iVL
+	bnez	iNum, loop
+
+.Lout_copy_user:
+	/* Disable access to user memory */
+	csrc	CSR_STATUS, t6
+	li	a0, 0
+	ret
+
+	/* Exception fixup code */
+10:
+	/* Disable access to user memory */
+	csrc	CSR_STATUS, t6
+	mv	a0, iNum
+	ret
+ENDPROC(__asm_vector_usercopy)

From patchwork Wed Dec 13 13:13:21 2023
X-Patchwork-Submitter: Andy Chiu
X-Patchwork-Id: 13491006
From: Andy Chiu
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Cc: greentime.hu@sifive.com, guoren@linux.alibaba.com, bjorn@kernel.org,
    charlie@rivosinc.com, ardb@kernel.org, arnd@arndb.de, Andy Chiu,
    Paul Walmsley, Albert Ou, Conor Dooley, Andrew Jones, Han-Kuan Chen,
    Heiko Stuebner
Subject: [v4, 6/6] riscv: lib: add vectorized mem* routines
Date: Wed, 13 Dec 2023 13:13:21 +0000
Message-Id: <20231213131321.12862-7-andy.chiu@sifive.com>
In-Reply-To: <20231213131321.12862-1-andy.chiu@sifive.com>
References: <20231213131321.12862-1-andy.chiu@sifive.com>

Provide vectorized memcpy/memset/memmove to accelerate common memory operations. Also, group them into the V_OPT_TEMPLATE3 macro because their setup/tear-down and fallback logic is the same.

The original implementation of the Vector operations comes from https://github.com/sifive/sifive-libc, which we agree to contribute to the Linux kernel.
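Of the three routines, only memmove has a subtlety worth spelling out: it copies forward when the source sits at or above the destination, and backward when the destination lies inside the source range. The direction choice in `__asm_memmove_vector` can be modelled in plain C; `memmove_model`, `CHUNK`, and `tmp` are illustrative stand-ins, with `tmp` playing the role of the vector register group that makes intra-chunk overlap safe.

```c
#include <string.h>

/* CHUNK stands in for the per-iteration vector length (vsetvli result). */
#define CHUNK 8

static void *memmove_model(void *dstv, const void *srcv, size_t n)
{
    unsigned char *dst = dstv;
    const unsigned char *src = srcv;
    unsigned char tmp[CHUNK]; /* plays the role of the vector register */

    if (src >= dst) {                /* like "bgeu pSrc, pDst, forward_copy_loop" */
        for (size_t off = 0; off < n; ) {
            size_t vl = (n - off < CHUNK) ? n - off : CHUNK;
            memcpy(tmp, src + off, vl);  /* vle8.v: load before store */
            memcpy(dst + off, tmp, vl);  /* vse8.v */
            off += vl;
        }
    } else {                         /* dst above src: walk backward from the end */
        for (size_t left = n; left > 0; ) {
            size_t vl = (left < CHUNK) ? left : CHUNK;
            left -= vl;
            memcpy(tmp, src + left, vl);
            memcpy(dst + left, tmp, vl);
        }
    }
    return dstv;
}
```

Because each chunk is loaded in full before any of it is stored, overlapping ranges survive in either direction, exactly as a vector load/store pair does.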
Signed-off-by: Andy Chiu
---
Changelog v4:
 - new patch since v4
---
 arch/riscv/lib/Makefile          |  3 ++
 arch/riscv/lib/memcpy_vector.S   | 29 +++++++++++++++++++
 arch/riscv/lib/memmove_vector.S  | 49 ++++++++++++++++++++++++++++++++
 arch/riscv/lib/memset.S          |  2 +-
 arch/riscv/lib/memset_vector.S   | 33 +++++++++++++++++++++
 arch/riscv/lib/riscv_v_helpers.c | 21 ++++++++++++++
 6 files changed, 136 insertions(+), 1 deletion(-)
 create mode 100644 arch/riscv/lib/memcpy_vector.S
 create mode 100644 arch/riscv/lib/memmove_vector.S
 create mode 100644 arch/riscv/lib/memset_vector.S

diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 1fe8d797e0f2..3111863afd2e 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -14,3 +14,6 @@ obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
 lib-$(CONFIG_RISCV_ISA_V)	+= xor.o
 lib-$(CONFIG_RISCV_ISA_V)	+= riscv_v_helpers.o
 lib-$(CONFIG_RISCV_ISA_V)	+= uaccess_vector.o
+lib-$(CONFIG_RISCV_ISA_V)	+= memset_vector.o
+lib-$(CONFIG_RISCV_ISA_V)	+= memcpy_vector.o
+lib-$(CONFIG_RISCV_ISA_V)	+= memmove_vector.o
diff --git a/arch/riscv/lib/memcpy_vector.S b/arch/riscv/lib/memcpy_vector.S
new file mode 100644
index 000000000000..4176b6e0a53c
--- /dev/null
+++ b/arch/riscv/lib/memcpy_vector.S
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#include
+#include
+
+#define pDst a0
+#define pSrc a1
+#define iNum a2
+
+#define iVL a3
+#define pDstPtr a4
+
+#define ELEM_LMUL_SETTING m8
+#define vData v0
+
+
+/* void *memcpy(void *, const void *, size_t) */
+SYM_FUNC_START(__asm_memcpy_vector)
+	mv	pDstPtr, pDst
+loop:
+	vsetvli	iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
+	vle8.v	vData, (pSrc)
+	sub	iNum, iNum, iVL
+	add	pSrc, pSrc, iVL
+	vse8.v	vData, (pDstPtr)
+	add	pDstPtr, pDstPtr, iVL
+	bnez	iNum, loop
+	ret
+SYM_FUNC_END(__asm_memcpy_vector)
diff --git a/arch/riscv/lib/memmove_vector.S b/arch/riscv/lib/memmove_vector.S
new file mode 100644
index 000000000000..4cea9d244dc9
--- /dev/null
+++ b/arch/riscv/lib/memmove_vector.S
@@ -0,0 +1,49 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#include
+#include
+
+#define pDst a0
+#define pSrc a1
+#define iNum a2
+
+#define iVL a3
+#define pDstPtr a4
+#define pSrcBackwardPtr a5
+#define pDstBackwardPtr a6
+
+#define ELEM_LMUL_SETTING m8
+#define vData v0
+
+SYM_FUNC_START(__asm_memmove_vector)
+
+	mv	pDstPtr, pDst
+
+	bgeu	pSrc, pDst, forward_copy_loop
+	add	pSrcBackwardPtr, pSrc, iNum
+	add	pDstBackwardPtr, pDst, iNum
+	bltu	pDst, pSrcBackwardPtr, backward_copy_loop
+
+forward_copy_loop:
+	vsetvli	iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
+
+	vle8.v	vData, (pSrc)
+	sub	iNum, iNum, iVL
+	add	pSrc, pSrc, iVL
+	vse8.v	vData, (pDstPtr)
+	add	pDstPtr, pDstPtr, iVL
+
+	bnez	iNum, forward_copy_loop
+	ret
+
+backward_copy_loop:
+	vsetvli	iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
+
+	sub	pSrcBackwardPtr, pSrcBackwardPtr, iVL
+	vle8.v	vData, (pSrcBackwardPtr)
+	sub	iNum, iNum, iVL
+	sub	pDstBackwardPtr, pDstBackwardPtr, iVL
+	vse8.v	vData, (pDstBackwardPtr)
+	bnez	iNum, backward_copy_loop
+	ret
+
+SYM_FUNC_END(__asm_memmove_vector)
diff --git a/arch/riscv/lib/memset.S b/arch/riscv/lib/memset.S
index 34c5360c6705..55207e6f5736 100644
--- a/arch/riscv/lib/memset.S
+++ b/arch/riscv/lib/memset.S
@@ -110,4 +110,4 @@ WEAK(memset)
 	bltu	t0, a3, 5b
 6:
 	ret
-END(__memset)
+ENDPROC(__memset)
diff --git a/arch/riscv/lib/memset_vector.S b/arch/riscv/lib/memset_vector.S
new file mode 100644
index 000000000000..4611feed72ac
--- /dev/null
+++ b/arch/riscv/lib/memset_vector.S
@@ -0,0 +1,33 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+#include
+#include
+
+#define pDst a0
+#define iValue a1
+#define iNum a2
+
+#define iVL a3
+#define iTemp a4
+#define pDstPtr a5
+
+#define ELEM_LMUL_SETTING m8
+#define vData v0
+
+/* void *memset(void *, int, size_t) */
+SYM_FUNC_START(__asm_memset_vector)
+
+	mv	pDstPtr, pDst
+
+	vsetvli	iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
+	vmv.v.x	vData, iValue
+
+loop:
+	vse8.v	vData, (pDstPtr)
+	sub	iNum, iNum, iVL
+	add	pDstPtr, pDstPtr, iVL
+	vsetvli	iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma
+	bnez	iNum, loop
+
+	ret
+
+SYM_FUNC_END(__asm_memset_vector)
diff --git a/arch/riscv/lib/riscv_v_helpers.c b/arch/riscv/lib/riscv_v_helpers.c
index d763b9c69fb7..12e8c5deb013 100644
--- a/arch/riscv/lib/riscv_v_helpers.c
+++ b/arch/riscv/lib/riscv_v_helpers.c
@@ -36,3 +36,24 @@ asmlinkage int enter_vector_usercopy(void *dst, void *src, size_t n)
 fallback:
 	return fallback_scalar_usercopy(dst, src, n);
 }
+
+#define V_OPT_TEMPLATE3(prefix, type_r, type_0, type_1)			\
+extern type_r __asm_##prefix##_vector(type_0, type_1, size_t n);	\
+type_r prefix(type_0 a0, type_1 a1, size_t n)				\
+{									\
+	type_r ret;							\
+	if (has_vector() && may_use_simd() && n > riscv_v_##prefix##_thres) { \
+		kernel_vector_begin();					\
+		ret = __asm_##prefix##_vector(a0, a1, n);		\
+		kernel_vector_end();					\
+		return ret;						\
+	}								\
+	return __##prefix(a0, a1, n);					\
+}
+
+static size_t riscv_v_memset_thres = 1280;
+V_OPT_TEMPLATE3(memset, void *, void*, int)
+static size_t riscv_v_memcpy_thres = 768;
+V_OPT_TEMPLATE3(memcpy, void *, void*, const void *)
+static size_t riscv_v_memmove_thres = 512;
+V_OPT_TEMPLATE3(memmove, void *, void*, const void *)
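The V_OPT_TEMPLATE3 pattern above (one macro stamping out a threshold-gated wrapper per mem* routine) can be exercised in userspace C. Here `memcpy_dispatch`, the `__asm_*`/`__*_scalar` stubs, and `vector_calls` are hypothetical names for illustration; only the 768-byte memcpy threshold value is taken from the patch.

```c
#include <stddef.h>
#include <string.h>

static int vector_calls; /* counts how often the "vector" path is taken */

/* Stubs standing in for the real vector/scalar implementations. */
static void *__asm_memcpy_vector(void *d, const void *s, size_t n)
{ vector_calls++; return memcpy(d, s, n); }
static void *__memcpy_scalar(void *d, const void *s, size_t n)
{ return memcpy(d, s, n); }

/* Same shape as the kernel's V_OPT_TEMPLATE3: take the vector path only
 * above a per-routine size threshold, otherwise the scalar fallback. */
#define V_OPT_TEMPLATE3(prefix, type_r, type_0, type_1)              \
type_r prefix##_dispatch(type_0 a0, type_1 a1, size_t n)             \
{                                                                    \
    if (n > prefix##_thres)                                          \
        return __asm_##prefix##_vector(a0, a1, n);                   \
    return __##prefix##_scalar(a0, a1, n);                           \
}

static const size_t memcpy_thres = 768; /* threshold value from the patch */
V_OPT_TEMPLATE3(memcpy, void *, void *, const void *)
```

A 64-byte `memcpy_dispatch` call stays on the scalar stub while a 2048-byte call crosses the threshold and bumps `vector_calls`, which is exactly the per-size routing the macro exists to centralize.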