From patchwork Thu Dec 14 15:57:16 2023
X-Patchwork-Submitter: Andy Chiu
X-Patchwork-Id: 13493218
From: Andy Chiu
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Cc: greentime.hu@sifive.com, guoren@linux.alibaba.com, bjorn@kernel.org, charlie@rivosinc.com, ardb@kernel.org, arnd@arndb.de, Vincent Chen, Andy Chiu, Paul Walmsley, Albert Ou, Heiko Stuebner, Conor Dooley, Clément Léger, Guo Ren, Xiao Wang, Björn Töpel, Alexandre Ghiti, Sami Tolvanen, Sia Jee Heng, Jisheng Zhang, Peter Zijlstra
Subject: [v5, 1/6] riscv: Add support for kernel mode vector
Date: Thu, 14 Dec 2023 15:57:16 +0000
Message-Id: <20231214155721.1753-2-andy.chiu@sifive.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20231214155721.1753-1-andy.chiu@sifive.com>
References: <20231214155721.1753-1-andy.chiu@sifive.com>

From: Greentime Hu

Add kernel_vector_begin() and kernel_vector_end() function declarations
and corresponding definitions in kernel_mode_vector.c. These are needed
to wrap uses of Vector in kernel mode.

Co-developed-by: Vincent Chen
Signed-off-by: Vincent Chen
Signed-off-by: Greentime Hu
Signed-off-by: Andy Chiu
---
Changelog v4:
 - Use kernel_v_flags and helpers to track vector context.
Changelog v3:
 - Reorder patch 1 to patch 3 to make use of {get,put}_cpu_vector_context later.
 - Export {get,put}_cpu_vector_context.
 - Save V context after disabling preemption. (Guo)
 - Fix a build fail. (Conor)
 - Remove irqs_disabled() check as it is not needed, fix styling. (Björn)
Changelog v2:
 - 's/kernel_rvv/kernel_vector' and return void in kernel_vector_begin (Conor)
 - export may_use_simd to include/asm/simd.h
---
 arch/riscv/include/asm/processor.h     | 15 +++-
 arch/riscv/include/asm/simd.h          | 42 ++++++++++++
 arch/riscv/include/asm/vector.h        | 21 ++++++
 arch/riscv/kernel/Makefile             |  1 +
 arch/riscv/kernel/kernel_mode_vector.c | 95 ++++++++++++++++++++++++++
 arch/riscv/kernel/process.c            |  2 +-
 6 files changed, 174 insertions(+), 2 deletions(-)
 create mode 100644 arch/riscv/include/asm/simd.h
 create mode 100644 arch/riscv/kernel/kernel_mode_vector.c

diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index f19f861cda54..a47763c262e1 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -73,6 +73,18 @@
 struct task_struct;
 struct pt_regs;
 
+/*
+ * We use a flag to track in-kernel Vector context. Currently the flag has the
+ * following meaning:
+ *
+ * - bit 0 indicates whether the in-kernel Vector context is active. The
+ *   activation of this state disables the preemption.
+ */
+
+#define RISCV_KERNEL_MODE_V_MASK	0x1
+
+#define RISCV_KERNEL_MODE_V	0x1
+
 /* CPU-specific state of a task */
 struct thread_struct {
 	/* Callee-saved registers */
@@ -81,7 +93,8 @@ struct thread_struct {
 	unsigned long s[12];	/* s[0]: frame pointer */
 	struct __riscv_d_ext_state fstate;
 	unsigned long bad_cause;
-	unsigned long vstate_ctrl;
+	u32 riscv_v_flags;
+	u32 vstate_ctrl;
 	struct __riscv_v_ext_state vstate;
 	unsigned long align_ctl;
 };
diff --git a/arch/riscv/include/asm/simd.h b/arch/riscv/include/asm/simd.h
new file mode 100644
index 000000000000..269752bfa2cc
--- /dev/null
+++ b/arch/riscv/include/asm/simd.h
@@ -0,0 +1,42 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2017 Linaro Ltd.
+ * Copyright (C) 2023 SiFive
+ */
+
+#ifndef __ASM_SIMD_H
+#define __ASM_SIMD_H
+
+#include
+#include
+#include
+#include
+#include
+
+#ifdef CONFIG_RISCV_ISA_V
+/*
+ * may_use_simd - whether it is allowable at this time to issue vector
+ *                instructions or access the vector register file
+ *
+ * Callers must not assume that the result remains true beyond the next
+ * preempt_enable() or return from softirq context.
+ */
+static __must_check inline bool may_use_simd(void)
+{
+	/*
+	 * RISCV_KERNEL_MODE_V is only set while preemption is disabled,
+	 * and is clear whenever preemption is enabled.
+	 */
+	return !in_hardirq() && !in_nmi() && !(riscv_v_ctx_cnt() & RISCV_KERNEL_MODE_V_MASK);
+}
+
+#else /* ! CONFIG_RISCV_ISA_V */
+
+static __must_check inline bool may_use_simd(void)
+{
+	return false;
+}
+
+#endif /* ! CONFIG_RISCV_ISA_V */
+
+#endif
diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
index 87aaef656257..6254830c0668 100644
--- a/arch/riscv/include/asm/vector.h
+++ b/arch/riscv/include/asm/vector.h
@@ -22,6 +22,27 @@ extern unsigned long riscv_v_vsize;
 int riscv_v_setup_vsize(void);
 bool riscv_v_first_use_handler(struct pt_regs *regs);
 
+void kernel_vector_begin(void);
+void kernel_vector_end(void);
+void get_cpu_vector_context(void);
+void put_cpu_vector_context(void);
+
+static inline void riscv_v_ctx_cnt_add(u32 offset)
+{
+	current->thread.riscv_v_flags += offset;
+	barrier();
+}
+
+static inline void riscv_v_ctx_cnt_sub(u32 offset)
+{
+	barrier();
+	current->thread.riscv_v_flags -= offset;
+}
+
+static inline u32 riscv_v_ctx_cnt(void)
+{
+	return READ_ONCE(current->thread.riscv_v_flags);
+}
 
 static __always_inline bool has_vector(void)
 {
diff --git a/arch/riscv/kernel/Makefile b/arch/riscv/kernel/Makefile
index fee22a3d1b53..8c58595696b3 100644
--- a/arch/riscv/kernel/Makefile
+++ b/arch/riscv/kernel/Makefile
@@ -63,6 +63,7 @@ obj-$(CONFIG_MMU) += vdso.o vdso/
 obj-$(CONFIG_RISCV_MISALIGNED)	+= traps_misaligned.o
 obj-$(CONFIG_FPU)		+= fpu.o
 obj-$(CONFIG_RISCV_ISA_V)	+= vector.o
+obj-$(CONFIG_RISCV_ISA_V)	+= kernel_mode_vector.o
 obj-$(CONFIG_SMP)		+= smpboot.o
 obj-$(CONFIG_SMP)		+= smp.o
 obj-$(CONFIG_SMP)		+= cpu_ops.o
diff --git a/arch/riscv/kernel/kernel_mode_vector.c b/arch/riscv/kernel/kernel_mode_vector.c
new file mode 100644
index 000000000000..c9ccf21dd16c
--- /dev/null
+++ b/arch/riscv/kernel/kernel_mode_vector.c
@@ -0,0 +1,95 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2012 ARM Ltd.
+ * Author: Catalin Marinas
+ * Copyright (C) 2017 Linaro Ltd.
+ * Copyright (C) 2021 SiFive
+ */
+#include
+#include
+#include
+#include
+#include
+
+#include
+#include
+#include
+
+/*
+ * Claim ownership of the CPU vector context for use by the calling context.
+ *
+ * The caller may freely manipulate the vector context metadata until
+ * put_cpu_vector_context() is called.
+ */
+void get_cpu_vector_context(void)
+{
+	preempt_disable();
+
+	WARN_ON(riscv_v_ctx_cnt() & RISCV_KERNEL_MODE_V_MASK);
+	riscv_v_ctx_cnt_add(RISCV_KERNEL_MODE_V);
+}
+
+/*
+ * Release the CPU vector context.
+ *
+ * Must be called from a context in which get_cpu_vector_context() was
+ * previously called, with no call to put_cpu_vector_context() in the
+ * meantime.
+ */
+void put_cpu_vector_context(void)
+{
+	WARN_ON(!(riscv_v_ctx_cnt() & RISCV_KERNEL_MODE_V_MASK));
+	riscv_v_ctx_cnt_sub(RISCV_KERNEL_MODE_V);
+
+	preempt_enable();
+}
+
+/*
+ * kernel_vector_begin(): obtain the CPU vector registers for use by the calling
+ * context
+ *
+ * Must not be called unless may_use_simd() returns true.
+ * Task context in the vector registers is saved back to memory as necessary.
+ *
+ * A matching call to kernel_vector_end() must be made before returning from the
+ * calling context.
+ *
+ * The caller may freely use the vector registers until kernel_vector_end() is
+ * called.
+ */
+void kernel_vector_begin(void)
+{
+	if (WARN_ON(!has_vector()))
+		return;
+
+	BUG_ON(!may_use_simd());
+
+	get_cpu_vector_context();
+
+	riscv_v_vstate_save(current, task_pt_regs(current));
+
+	riscv_v_enable();
+}
+EXPORT_SYMBOL_GPL(kernel_vector_begin);
+
+/*
+ * kernel_vector_end(): give the CPU vector registers back to the current task
+ *
+ * Must be called from a context in which kernel_vector_begin() was previously
+ * called, with no call to kernel_vector_end() in the meantime.
+ *
+ * The caller must not use the vector registers after this function is called,
+ * unless kernel_vector_begin() is called again in the meantime.
+ */
+void kernel_vector_end(void)
+{
+	if (WARN_ON(!has_vector()))
+		return;
+
+	riscv_v_vstate_restore(current, task_pt_regs(current));
+
+	riscv_v_disable();
+
+	put_cpu_vector_context();
+}
+EXPORT_SYMBOL_GPL(kernel_vector_end);
diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c
index 4f21d970a129..5c4dcf518684 100644
--- a/arch/riscv/kernel/process.c
+++ b/arch/riscv/kernel/process.c
@@ -187,7 +187,6 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src)
 	*dst = *src;
 	/* clear entire V context, including datap for a new task */
 	memset(&dst->thread.vstate, 0, sizeof(struct __riscv_v_ext_state));
-
 	return 0;
 }
@@ -221,6 +220,7 @@ int copy_thread(struct task_struct *p, const struct kernel_clone_args *args)
 		childregs->a0 = 0; /* Return value of fork() */
 		p->thread.s[0] = 0;
 	}
+	p->thread.riscv_v_flags = 0;
 	p->thread.ra = (unsigned long)ret_from_fork;
 	p->thread.sp = (unsigned long)childregs; /* kernel sp */
 	return 0;
 }

From patchwork Thu Dec 14 15:57:17 2023
X-Patchwork-Submitter: Andy Chiu
X-Patchwork-Id: 13493219
From: Andy Chiu
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Cc: greentime.hu@sifive.com, guoren@linux.alibaba.com, bjorn@kernel.org, charlie@rivosinc.com, ardb@kernel.org, arnd@arndb.de, Andy Chiu, Paul Walmsley, Albert Ou, Vincent Chen, Conor Dooley
Subject: [v5, 2/6] riscv: vector: make Vector always available for softirq context
Date: Thu, 14 Dec 2023 15:57:17 +0000
Message-Id: <20231214155721.1753-3-andy.chiu@sifive.com>
In-Reply-To: <20231214155721.1753-1-andy.chiu@sifive.com>
References: <20231214155721.1753-1-andy.chiu@sifive.com>

By disabling bottom halves in active kernel-mode Vector, softirq will
not be able to nest on top of any kernel-mode Vector. After this patch,
Vector context cannot start with irqs disabled. Otherwise
local_bh_enable() may run in a wrong context.

Disabling bh is not enough for an RT kernel to prevent preemption. So we
must disable preemption, which also implies disabling bh on RT.
Related-to: commit 696207d4258b ("arm64/sve: Make kernel FPU protection RT friendly")
Related-to: commit 66c3ec5a7120 ("arm64: neon: Forbid when irqs are disabled")
Signed-off-by: Andy Chiu
---
Changelog v4:
 - new patch since v4
---
 arch/riscv/include/asm/simd.h          |  6 +++++-
 arch/riscv/kernel/kernel_mode_vector.c | 10 ++++++++--
 2 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/arch/riscv/include/asm/simd.h b/arch/riscv/include/asm/simd.h
index 269752bfa2cc..cd6180fe37c0 100644
--- a/arch/riscv/include/asm/simd.h
+++ b/arch/riscv/include/asm/simd.h
@@ -26,8 +26,12 @@ static __must_check inline bool may_use_simd(void)
 	/*
 	 * RISCV_KERNEL_MODE_V is only set while preemption is disabled,
 	 * and is clear whenever preemption is enabled.
+	 *
+	 * Kernel-mode Vector temporarily disables bh. So we must not return
+	 * true on irqs_disabled(). Otherwise we would fail the lockdep check
+	 * calling local_bh_enable().
 	 */
-	return !in_hardirq() && !in_nmi() && !(riscv_v_ctx_cnt() & RISCV_KERNEL_MODE_V_MASK);
+	return !in_hardirq() && !in_nmi() && !irqs_disabled() && !(riscv_v_ctx_cnt() & RISCV_KERNEL_MODE_V_MASK);
 }
 
 #else /* ! CONFIG_RISCV_ISA_V */
diff --git a/arch/riscv/kernel/kernel_mode_vector.c b/arch/riscv/kernel/kernel_mode_vector.c
index c9ccf21dd16c..52e42f74ec9a 100644
--- a/arch/riscv/kernel/kernel_mode_vector.c
+++ b/arch/riscv/kernel/kernel_mode_vector.c
@@ -23,7 +23,10 @@
  */
 void get_cpu_vector_context(void)
 {
-	preempt_disable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_bh_disable();
+	else
+		preempt_disable();
 
 	WARN_ON(riscv_v_ctx_cnt() & RISCV_KERNEL_MODE_V_MASK);
 	riscv_v_ctx_cnt_add(RISCV_KERNEL_MODE_V);
@@ -41,7 +44,10 @@ void put_cpu_vector_context(void)
 	WARN_ON(!(riscv_v_ctx_cnt() & RISCV_KERNEL_MODE_V_MASK));
 	riscv_v_ctx_cnt_sub(RISCV_KERNEL_MODE_V);
 
-	preempt_enable();
+	if (!IS_ENABLED(CONFIG_PREEMPT_RT))
+		local_bh_enable();
+	else
+		preempt_enable();
 }
 
 /*

From patchwork Thu Dec 14 15:57:18 2023
X-Patchwork-Submitter: Andy Chiu
X-Patchwork-Id: 13493220
From: Andy Chiu
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Cc: greentime.hu@sifive.com, guoren@linux.alibaba.com, bjorn@kernel.org, charlie@rivosinc.com, ardb@kernel.org, arnd@arndb.de, Han-Kuan Chen, Andy Chiu, Paul Walmsley, Albert Ou, Conor Dooley, Andrew Jones, Heiko Stuebner
Subject: [v5, 3/6] riscv: Add vector extension XOR implementation
Date: Thu, 14 Dec 2023 15:57:18 +0000
Message-Id: <20231214155721.1753-4-andy.chiu@sifive.com>
In-Reply-To: <20231214155721.1753-1-andy.chiu@sifive.com>
References: <20231214155721.1753-1-andy.chiu@sifive.com>
From: Greentime Hu

Add a vector-optimized XOR implementation. It has been tested in QEMU.

Co-developed-by: Han-Kuan Chen
Signed-off-by: Han-Kuan Chen
Signed-off-by: Greentime Hu
Signed-off-by: Andy Chiu
Acked-by: Conor Dooley
---
Changelog v2:
 - 's/rvv/vector/' (Conor)
---
 arch/riscv/include/asm/xor.h | 82 ++++++++++++++++++++++++++++++++++++
 arch/riscv/lib/Makefile      |  1 +
 arch/riscv/lib/xor.S         | 81 +++++++++++++++++++++++++++++++++++
 3 files changed, 164 insertions(+)
 create mode 100644 arch/riscv/include/asm/xor.h
 create mode 100644 arch/riscv/lib/xor.S

diff --git a/arch/riscv/include/asm/xor.h b/arch/riscv/include/asm/xor.h
new file mode 100644
index 000000000000..903c3275f8d0
--- /dev/null
+++ b/arch/riscv/include/asm/xor.h
@@ -0,0 +1,82 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2021 SiFive
+ */
+
+#include
+#include
+#ifdef CONFIG_RISCV_ISA_V
+#include
+#include
+
+void xor_regs_2_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2);
+void xor_regs_3_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2,
+		 const unsigned long *__restrict p3);
+void xor_regs_4_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2,
+		 const unsigned long *__restrict p3,
+		 const unsigned long *__restrict p4);
+void xor_regs_5_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2,
+		 const unsigned long *__restrict p3,
+		 const unsigned long *__restrict p4,
+		 const unsigned long *__restrict p5);
+
+static void xor_vector_2(unsigned long bytes, unsigned long *__restrict p1,
+			 const unsigned long *__restrict p2)
+{
+	kernel_vector_begin();
+	xor_regs_2_(bytes, p1, p2);
+	kernel_vector_end();
+}
+
+static void xor_vector_3(unsigned long bytes, unsigned long *__restrict p1,
+			 const unsigned long *__restrict p2,
+			 const unsigned long *__restrict p3)
+{
+	kernel_vector_begin();
+	xor_regs_3_(bytes, p1, p2, p3);
+	kernel_vector_end();
+}
+
+static void xor_vector_4(unsigned long bytes, unsigned long *__restrict p1,
+			 const unsigned long *__restrict p2,
+			 const unsigned long *__restrict p3,
+			 const unsigned long *__restrict p4)
+{
+	kernel_vector_begin();
+	xor_regs_4_(bytes, p1, p2, p3, p4);
+	kernel_vector_end();
+}
+
+static void xor_vector_5(unsigned long bytes, unsigned long *__restrict p1,
+			 const unsigned long *__restrict p2,
+			 const unsigned long *__restrict p3,
+			 const unsigned long *__restrict p4,
+			 const unsigned long *__restrict p5)
+{
+	kernel_vector_begin();
+	xor_regs_5_(bytes, p1, p2, p3, p4, p5);
+	kernel_vector_end();
+}
+
+static struct xor_block_template xor_block_rvv = {
+	.name = "rvv",
+	.do_2 = xor_vector_2,
+	.do_3 = xor_vector_3,
+	.do_4 = xor_vector_4,
+	.do_5 = xor_vector_5
+};
+
+#undef XOR_TRY_TEMPLATES
+#define XOR_TRY_TEMPLATES			\
+	do {					\
+		xor_speed(&xor_block_8regs);	\
+		xor_speed(&xor_block_32regs);	\
+		if (has_vector()) {		\
+			xor_speed(&xor_block_rvv);\
+		}				\
+	} while (0)
+#endif
diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 26cb2502ecf8..494f9cd1a00c 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -11,3 +11,4 @@ lib-$(CONFIG_64BIT)	+= tishift.o
 lib-$(CONFIG_RISCV_ISA_ZICBOZ)	+= clear_page.o
 
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
+lib-$(CONFIG_RISCV_ISA_V)	+= xor.o
diff --git a/arch/riscv/lib/xor.S b/arch/riscv/lib/xor.S
new file mode 100644
index 000000000000..3bc059e18171
--- /dev/null
+++ b/arch/riscv/lib/xor.S
@@ -0,0 +1,81 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2021 SiFive
+ */
+#include
+#include
+#include
+
+ENTRY(xor_regs_2_)
+	vsetvli a3, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a3
+	vxor.vv v16, v0, v8
+	add a2, a2, a3
+	vse8.v v16, (a1)
+	add a1, a1, a3
+	bnez a0, xor_regs_2_
+	ret
+END(xor_regs_2_)
+EXPORT_SYMBOL(xor_regs_2_)
+
+ENTRY(xor_regs_3_)
+	vsetvli a4, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a4
+	vxor.vv v0, v0, v8
+	vle8.v v16, (a3)
+	add a2, a2, a4
+	vxor.vv v16, v0, v16
+	add a3, a3, a4
+	vse8.v v16, (a1)
+	add a1, a1, a4
+	bnez a0, xor_regs_3_
+	ret
+END(xor_regs_3_)
+EXPORT_SYMBOL(xor_regs_3_)
+
+ENTRY(xor_regs_4_)
+	vsetvli a5, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a5
+	vxor.vv v0, v0, v8
+	vle8.v v16, (a3)
+	add a2, a2, a5
+	vxor.vv v0, v0, v16
+	vle8.v v24, (a4)
+	add a3, a3, a5
+	vxor.vv v16, v0, v24
+	add a4, a4, a5
+	vse8.v v16, (a1)
+	add a1, a1, a5
+	bnez a0, xor_regs_4_
+	ret
+END(xor_regs_4_)
+EXPORT_SYMBOL(xor_regs_4_)
+
+ENTRY(xor_regs_5_)
+	vsetvli a6, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a6
+	vxor.vv v0, v0, v8
+	vle8.v v16, (a3)
+	add a2, a2, a6
+	vxor.vv v0, v0, v16
+	vle8.v v24, (a4)
+	add a3, a3, a6
+	vxor.vv v0, v0, v24
+	vle8.v v8, (a5)
+	add a4, a4, a6
+	vxor.vv v16, v0, v8
+	add a5, a5, a6
+	vse8.v v16, (a1)
+	add a1, a1, a6
+	bnez a0, xor_regs_5_
+	ret
+END(xor_regs_5_)
+EXPORT_SYMBOL(xor_regs_5_)

From patchwork Thu Dec 14 15:57:19 2023
X-Patchwork-Submitter: Andy Chiu
X-Patchwork-Id: 13493221
[59.124.168.89]) by smtp.gmail.com with ESMTPSA id f4-20020a170902e98400b001d35223d0besm3320799plb.251.2023.12.14.07.58.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Dec 2023 07:58:42 -0800 (PST) From: Andy Chiu To: linux-riscv@lists.infradead.org, palmer@dabbelt.com Cc: greentime.hu@sifive.com, guoren@linux.alibaba.com, bjorn@kernel.org, charlie@rivosinc.com, ardb@kernel.org, arnd@arndb.de, Andy Chiu , Paul Walmsley , Albert Ou , Oleg Nesterov , =?utf-8?b?QmrDtnJuIFTDtnBlbA==?= , Guo Ren , =?utf-8?b?Q2zDqW1lbnQgTMOpZ2Vy?= , Conor Dooley , Sami Tolvanen , Jisheng Zhang , Deepak Gupta , Vincent Chen , Heiko Stuebner , Xiao Wang , Peter Zijlstra , Mathis Salmen , Haorong Lu , Joel Granados Subject: [v5, 4/6] riscv: sched: defer restoring Vector context for user Date: Thu, 14 Dec 2023 15:57:19 +0000 Message-Id: <20231214155721.1753-5-andy.chiu@sifive.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20231214155721.1753-1-andy.chiu@sifive.com> References: <20231214155721.1753-1-andy.chiu@sifive.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231214_075845_172677_1F0E77EA X-CRM114-Status: GOOD ( 21.46 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org User will use its Vector registers only after the kernel really returns to the userspace. So we can delay restoring Vector registers as long as we are still running in kernel mode. So, add a thread flag to indicates the need of restoring Vector and do the restore at the last arch-specific exit-to-user hook. This save the context restoring cost when we switch over multiple processes that run V in kernel mode. 
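The deferral described above can be modeled in a few lines of plain C. This is only an illustrative sketch — `struct task`, `switch_to_model`, `exit_to_user_model`, and the `restores` counter are invented for the example, not kernel APIs: a context switch merely sets the defer flag, and only the final exit to user space pays for a real vstate restore.

```c
#include <stdbool.h>

/* Illustrative model of the deferred-restore scheme; none of these
 * names are real kernel symbols. */
struct task {
	bool defer_restore;	/* models TIF_RISCV_V_DEFER_RESTORE */
	int restores;		/* counts actual (costly) vstate restores */
};

/* The context switch no longer restores V state; it only marks `next`. */
static void switch_to_model(struct task *next)
{
	next->defer_restore = true;
}

/* Models arch_exit_to_user_mode_prepare(): the one place V state is
 * actually restored, just before returning to user space. */
static void exit_to_user_model(struct task *t)
{
	if (t->defer_restore) {
		t->defer_restore = false;
		t->restores++;	/* models riscv_v_vstate_restore() */
	}
}
```

Only the task that actually returns to user space pays for a restore; tasks switched through while staying in kernel mode never do.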
For example, if the kernel performs a context switch from A->B->C, and returns to C's userspace, then there is no need to restore B's V-registers. Besides, this also prevents us from repeatedly restoring V context when executing kernel-mode Vector multiple times. The cost of this is that we must disable preemption and mark Vector as busy during vstate_{save,restore}, because with the deferred restore the V context would no longer be restored back immediately when a trap-causing context switch happens in the middle of vstate_{save,restore}. Signed-off-by: Andy Chiu Acked-by: Conor Dooley --- Changelog v4: - fix typos and re-add Conor's A-b. Changelog v3: - Guard {get,put}_cpu_vector_context between vstate_* operation and explain it in the commit msg. - Drop R-b from Björn and A-b from Conor. Changelog v2: - rename and add comment for the new thread flag (Conor) --- arch/riscv/include/asm/entry-common.h | 17 +++++++++++++++++ arch/riscv/include/asm/thread_info.h | 2 ++ arch/riscv/include/asm/vector.h | 11 ++++++++++- arch/riscv/kernel/kernel_mode_vector.c | 2 +- arch/riscv/kernel/process.c | 2 ++ arch/riscv/kernel/ptrace.c | 5 ++++- arch/riscv/kernel/signal.c | 5 ++++- arch/riscv/kernel/vector.c | 2 +- 8 files changed, 41 insertions(+), 5 deletions(-) diff --git a/arch/riscv/include/asm/entry-common.h b/arch/riscv/include/asm/entry-common.h index 7ab5e34318c8..6361a8488642 100644 --- a/arch/riscv/include/asm/entry-common.h +++ b/arch/riscv/include/asm/entry-common.h @@ -4,6 +4,23 @@ #define _ASM_RISCV_ENTRY_COMMON_H #include +#include +#include + +static inline void arch_exit_to_user_mode_prepare(struct pt_regs *regs, + unsigned long ti_work) +{ + if (ti_work & _TIF_RISCV_V_DEFER_RESTORE) { + clear_thread_flag(TIF_RISCV_V_DEFER_RESTORE); + /* + * We are already called with irq disabled, so go without + * keeping track of vector_context_busy.
+ */ + riscv_v_vstate_restore(current, regs); + } +} + +#define arch_exit_to_user_mode_prepare arch_exit_to_user_mode_prepare void handle_page_fault(struct pt_regs *regs); void handle_break(struct pt_regs *regs); diff --git a/arch/riscv/include/asm/thread_info.h b/arch/riscv/include/asm/thread_info.h index 574779900bfb..1047a97ddbc8 100644 --- a/arch/riscv/include/asm/thread_info.h +++ b/arch/riscv/include/asm/thread_info.h @@ -103,12 +103,14 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src); #define TIF_NOTIFY_SIGNAL 9 /* signal notifications exist */ #define TIF_UPROBE 10 /* uprobe breakpoint or singlestep */ #define TIF_32BIT 11 /* compat-mode 32bit process */ +#define TIF_RISCV_V_DEFER_RESTORE 12 /* restore Vector before returning to user */ #define _TIF_NOTIFY_RESUME (1 << TIF_NOTIFY_RESUME) #define _TIF_SIGPENDING (1 << TIF_SIGPENDING) #define _TIF_NEED_RESCHED (1 << TIF_NEED_RESCHED) #define _TIF_NOTIFY_SIGNAL (1 << TIF_NOTIFY_SIGNAL) #define _TIF_UPROBE (1 << TIF_UPROBE) +#define _TIF_RISCV_V_DEFER_RESTORE (1 << TIF_RISCV_V_DEFER_RESTORE) #define _TIF_WORK_MASK \ (_TIF_NOTIFY_RESUME | _TIF_SIGPENDING | _TIF_NEED_RESCHED | \ diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h index 6254830c0668..e706613aae2c 100644 --- a/arch/riscv/include/asm/vector.h +++ b/arch/riscv/include/asm/vector.h @@ -205,6 +205,15 @@ static inline void riscv_v_vstate_restore(struct task_struct *task, } } +static inline void riscv_v_vstate_set_restore(struct task_struct *task, + struct pt_regs *regs) +{ + if ((regs->status & SR_VS) != SR_VS_OFF) { + set_tsk_thread_flag(task, TIF_RISCV_V_DEFER_RESTORE); + riscv_v_vstate_on(regs); + } +} + static inline void __switch_to_vector(struct task_struct *prev, struct task_struct *next) { @@ -212,7 +221,7 @@ static inline void __switch_to_vector(struct task_struct *prev, regs = task_pt_regs(prev); riscv_v_vstate_save(prev, regs); - riscv_v_vstate_restore(next, task_pt_regs(next)); +
riscv_v_vstate_set_restore(next, task_pt_regs(next)); } void riscv_v_vstate_ctrl_init(struct task_struct *tsk); diff --git a/arch/riscv/kernel/kernel_mode_vector.c b/arch/riscv/kernel/kernel_mode_vector.c index 52e42f74ec9a..c5b86b554d1a 100644 --- a/arch/riscv/kernel/kernel_mode_vector.c +++ b/arch/riscv/kernel/kernel_mode_vector.c @@ -92,7 +92,7 @@ void kernel_vector_end(void) if (WARN_ON(!has_vector())) return; - riscv_v_vstate_restore(current, task_pt_regs(current)); + riscv_v_vstate_set_restore(current, task_pt_regs(current)); riscv_v_disable(); diff --git a/arch/riscv/kernel/process.c b/arch/riscv/kernel/process.c index 5c4dcf518684..58127b1c6c71 100644 --- a/arch/riscv/kernel/process.c +++ b/arch/riscv/kernel/process.c @@ -171,6 +171,7 @@ void flush_thread(void) riscv_v_vstate_off(task_pt_regs(current)); kfree(current->thread.vstate.datap); memset(&current->thread.vstate, 0, sizeof(struct __riscv_v_ext_state)); + clear_tsk_thread_flag(current, TIF_RISCV_V_DEFER_RESTORE); #endif } @@ -187,6 +188,7 @@ int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src) *dst = *src; /* clear entire V context, including datap for a new task */ memset(&dst->thread.vstate, 0, sizeof(struct __riscv_v_ext_state)); + clear_tsk_thread_flag(dst, TIF_RISCV_V_DEFER_RESTORE); return 0; } diff --git a/arch/riscv/kernel/ptrace.c b/arch/riscv/kernel/ptrace.c index 2afe460de16a..7b93bcbdf9fa 100644 --- a/arch/riscv/kernel/ptrace.c +++ b/arch/riscv/kernel/ptrace.c @@ -99,8 +99,11 @@ static int riscv_vr_get(struct task_struct *target, * Ensure the vector registers have been saved to the memory before * copying them to membuf.
*/ - if (target == current) + if (target == current) { + get_cpu_vector_context(); riscv_v_vstate_save(current, task_pt_regs(current)); + put_cpu_vector_context(); + } ptrace_vstate.vstart = vstate->vstart; ptrace_vstate.vl = vstate->vl; diff --git a/arch/riscv/kernel/signal.c b/arch/riscv/kernel/signal.c index 88b6220b2608..aca4a12c8416 100644 --- a/arch/riscv/kernel/signal.c +++ b/arch/riscv/kernel/signal.c @@ -86,7 +86,10 @@ static long save_v_state(struct pt_regs *regs, void __user **sc_vec) /* datap is designed to be 16 byte aligned for better performance */ WARN_ON(unlikely(!IS_ALIGNED((unsigned long)datap, 16))); + get_cpu_vector_context(); riscv_v_vstate_save(current, regs); + put_cpu_vector_context(); + /* Copy everything of vstate but datap. */ err = __copy_to_user(&state->v_state, &current->thread.vstate, offsetof(struct __riscv_v_ext_state, datap)); @@ -134,7 +137,7 @@ static long __restore_v_state(struct pt_regs *regs, void __user *sc_vec) if (unlikely(err)) return err; - riscv_v_vstate_restore(current, regs); + riscv_v_vstate_set_restore(current, regs); return err; } diff --git a/arch/riscv/kernel/vector.c b/arch/riscv/kernel/vector.c index 578b6292487e..66e8c6ab09d2 100644 --- a/arch/riscv/kernel/vector.c +++ b/arch/riscv/kernel/vector.c @@ -167,7 +167,7 @@ bool riscv_v_first_use_handler(struct pt_regs *regs) return true; } riscv_v_vstate_on(regs); - riscv_v_vstate_restore(current, regs); + riscv_v_vstate_set_restore(current, regs); return true; }

From patchwork Thu Dec 14 15:57:20 2023
From: Andy Chiu
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Cc: greentime.hu@sifive.com, guoren@linux.alibaba.com, bjorn@kernel.org, charlie@rivosinc.com, ardb@kernel.org, arnd@arndb.de, Andy Chiu, Paul Walmsley, Albert Ou, Conor Dooley, Andrew Jones, Han-Kuan Chen, Heiko Stuebner, Aurelien Jarno, Bo YU, Alexandre Ghiti, Clément Léger
Subject: [v5, 5/6] riscv: lib: vectorize copy_to_user/copy_from_user
Date: Thu, 14 Dec 2023 15:57:20 +0000
Message-Id: <20231214155721.1753-6-andy.chiu@sifive.com>
In-Reply-To: <20231214155721.1753-1-andy.chiu@sifive.com>
References: <20231214155721.1753-1-andy.chiu@sifive.com>

This patch utilizes Vector to perform copy_to_user/copy_from_user. If Vector is available and the size of the copy is large enough for Vector to outperform the scalar code, direct the kernel to do Vector copies for userspace. Though the best programming practice for users is to reduce copies, this provides a faster variant when copies are inevitable. The optimal size for using Vector, copy_to_user_thres, is only a heuristic for now; we can add DT parsing if people feel the need to customize it.
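The size-based dispatch just described — scalar below the threshold, Vector at or above it, mirroring the `bltu a2, t0, fallback_scalar_usercopy` check in the assembly below — can be sketched in C. The enum and function names here are invented for illustration; the real code also rechecks `may_use_simd()` inside `enter_vector_usercopy` rather than up front:

```c
#include <stdbool.h>
#include <stddef.h>

enum copy_path { COPY_SCALAR, COPY_VECTOR };

/* Mirrors riscv_v_usercopy_thres: the heuristic break-even size. */
static size_t usercopy_thres = 768;

/* Pick the copy path: Vector only when it is usable in this context
 * (not in a nested/atomic SIMD section) and the copy is big enough
 * to amortize the vector setup cost. */
static enum copy_path pick_copy_path(size_t n, bool simd_usable)
{
	if (!simd_usable || n < usercopy_thres)
		return COPY_SCALAR;
	return COPY_VECTOR;
}
```

Small copies and copies issued where kernel-mode Vector is unavailable always take the scalar routine.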
The exception fixup code of __asm_vector_usercopy must fall back to the scalar version, because accessing user pages might fault and fault handling must be able to sleep. Current kernel-mode Vector does not allow tasks to be preemptible, so we must deactivate Vector and take the scalar fallback in that case. The original implementation of the Vector operations comes from https://github.com/sifive/sifive-libc, which we have agreed to contribute to the Linux kernel. Signed-off-by: Andy Chiu --- Changelog v4: - new patch since v4 --- arch/riscv/lib/Makefile | 2 ++ arch/riscv/lib/riscv_v_helpers.c | 38 ++++++++++++++++++++++ arch/riscv/lib/uaccess.S | 11 +++++++ arch/riscv/lib/uaccess_vector.S | 55 ++++++++++++++++++++++++++++++++ 4 files changed, 106 insertions(+) create mode 100644 arch/riscv/lib/riscv_v_helpers.c create mode 100644 arch/riscv/lib/uaccess_vector.S diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile index 494f9cd1a00c..1fe8d797e0f2 100644 --- a/arch/riscv/lib/Makefile +++ b/arch/riscv/lib/Makefile @@ -12,3 +12,5 @@ lib-$(CONFIG_RISCV_ISA_ZICBOZ) += clear_page.o obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o lib-$(CONFIG_RISCV_ISA_V) += xor.o +lib-$(CONFIG_RISCV_ISA_V) += riscv_v_helpers.o +lib-$(CONFIG_RISCV_ISA_V) += uaccess_vector.o diff --git a/arch/riscv/lib/riscv_v_helpers.c b/arch/riscv/lib/riscv_v_helpers.c new file mode 100644 index 000000000000..d763b9c69fb7 --- /dev/null +++ b/arch/riscv/lib/riscv_v_helpers.c @@ -0,0 +1,38 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Copyright (C) 2023 SiFive + * Author: Andy Chiu + */ +#include +#include + +#include +#include + +size_t riscv_v_usercopy_thres = 768; +int __asm_vector_usercopy(void *dst, void *src, size_t n); +int fallback_scalar_usercopy(void *dst, void *src, size_t n); +asmlinkage int enter_vector_usercopy(void *dst, void *src, size_t n) +{ + size_t remain, copied; + + /* skip has_vector() check because it has been done by the asm */ + if (!may_use_simd()) + goto fallback; + +
kernel_vector_begin(); + remain = __asm_vector_usercopy(dst, src, n); + kernel_vector_end(); + + if (remain) { + copied = n - remain; + dst += copied; + src += copied; + goto fallback; + } + + return remain; + +fallback: + return fallback_scalar_usercopy(dst, src, n); +} diff --git a/arch/riscv/lib/uaccess.S b/arch/riscv/lib/uaccess.S index 3ab438f30d13..ae8c1453cfcf 100644 --- a/arch/riscv/lib/uaccess.S +++ b/arch/riscv/lib/uaccess.S @@ -3,6 +3,8 @@ #include #include #include +#include +#include .macro fixup op reg addr lbl 100: @@ -11,6 +13,14 @@ .endm SYM_FUNC_START(__asm_copy_to_user) +#ifdef CONFIG_RISCV_ISA_V + ALTERNATIVE("j fallback_scalar_usercopy", "nop", 0, RISCV_ISA_EXT_v, CONFIG_RISCV_ISA_V) + la t0, riscv_v_usercopy_thres + REG_L t0, (t0) + bltu a2, t0, fallback_scalar_usercopy + tail enter_vector_usercopy +#endif +SYM_FUNC_START(fallback_scalar_usercopy) /* Enable access to user memory */ li t6, SR_SUM @@ -181,6 +191,7 @@ SYM_FUNC_START(__asm_copy_to_user) sub a0, t5, a0 ret SYM_FUNC_END(__asm_copy_to_user) +SYM_FUNC_END(fallback_scalar_usercopy) EXPORT_SYMBOL(__asm_copy_to_user) SYM_FUNC_ALIAS(__asm_copy_from_user, __asm_copy_to_user) EXPORT_SYMBOL(__asm_copy_from_user) diff --git a/arch/riscv/lib/uaccess_vector.S b/arch/riscv/lib/uaccess_vector.S new file mode 100644 index 000000000000..5bebcb1276a2 --- /dev/null +++ b/arch/riscv/lib/uaccess_vector.S @@ -0,0 +1,55 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#include +#include +#include +#include +#include + +#define pDst a0 +#define pSrc a1 +#define iNum a2 + +#define iVL a3 +#define pDstPtr a4 + +#define ELEM_LMUL_SETTING m8 +#define vData v0 + + .macro fixup op reg addr lbl +100: + \op \reg, \addr + _asm_extable 100b, \lbl + .endm + +SYM_FUNC_START(__asm_vector_usercopy) + /* Enable access to user memory */ + li t6, SR_SUM + csrs CSR_STATUS, t6 + + /* Save for return value */ + mv t5, a2 + + mv pDstPtr, pDst +loop: + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + fixup vle8.v vData, 
(pSrc), 10f + fixup vse8.v vData, (pDstPtr), 10f + sub iNum, iNum, iVL + add pSrc, pSrc, iVL + add pDstPtr, pDstPtr, iVL + bnez iNum, loop + +.Lout_copy_user: + /* Disable access to user memory */ + csrc CSR_STATUS, t6 + li a0, 0 + ret + + /* Exception fixup code */ +10: + /* Disable access to user memory */ + csrc CSR_STATUS, t6 + mv a0, iNum + ret +SYM_FUNC_END(__asm_vector_usercopy)

From patchwork Thu Dec 14 15:57:21 2023
From: Andy Chiu
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Cc: greentime.hu@sifive.com, guoren@linux.alibaba.com, bjorn@kernel.org, charlie@rivosinc.com, ardb@kernel.org, arnd@arndb.de, Andy Chiu, Paul Walmsley, Albert Ou, Conor Dooley, Andrew Jones, Han-Kuan Chen, Heiko Stuebner
Subject: [v5, 6/6] riscv: lib: add vectorized mem* routines
Date: Thu, 14 Dec 2023 15:57:21 +0000
Message-Id: <20231214155721.1753-7-andy.chiu@sifive.com>
In-Reply-To: <20231214155721.1753-1-andy.chiu@sifive.com>
References: <20231214155721.1753-1-andy.chiu@sifive.com>

Provide vectorized memcpy/memset/memmove to accelerate common memory operations. Also, group them under the V_OPT_TEMPLATE3 macro because their setup/tear-down and fallback logic is the same. The original implementation of the Vector operations comes from https://github.com/sifive/sifive-libc, which we have agreed to contribute to the Linux kernel.
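Of the three routines, memmove is the one with a direction decision. A scalar model of the choice made by `__asm_memmove_vector` in the diff below — copy forward unless the destination overlaps the source from above, in which case copy backward so no byte is clobbered before it is read (`memmove_model` is an illustrative stand-in; the real routine strip-mines these loops with vsetvli/vle8.v/vse8.v):

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of __asm_memmove_vector's direction selection. */
static void *memmove_model(void *dst, const void *src, size_t n)
{
	uint8_t *d = dst;
	const uint8_t *s = src;

	/* forward when src >= dst, or when dst starts past src+n
	 * (no overlap); this matches the bgeu/bltu pair in the asm */
	if (s >= d || d >= s + n) {
		for (size_t i = 0; i < n; i++)	/* forward_copy_loop */
			d[i] = s[i];
	} else {
		for (size_t i = n; i-- > 0; )	/* backward_copy_loop */
			d[i] = s[i];
	}
	return dst;
}
```

Copying a region two bytes forward within the same buffer exercises the backward loop; copying it two bytes back exercises the forward loop.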
Signed-off-by: Andy Chiu --- Changelog v4: - new patch since v4 --- arch/riscv/lib/Makefile | 3 ++ arch/riscv/lib/memcpy_vector.S | 29 +++++++++++++++++++ arch/riscv/lib/memmove_vector.S | 49 ++++++++++++++++++++++++++++++++ arch/riscv/lib/memset_vector.S | 33 +++++++++++++++++++++ arch/riscv/lib/riscv_v_helpers.c | 21 ++++++++++++++ 5 files changed, 135 insertions(+) create mode 100644 arch/riscv/lib/memcpy_vector.S create mode 100644 arch/riscv/lib/memmove_vector.S create mode 100644 arch/riscv/lib/memset_vector.S diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile index 1fe8d797e0f2..3111863afd2e 100644 --- a/arch/riscv/lib/Makefile +++ b/arch/riscv/lib/Makefile @@ -14,3 +14,6 @@ obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o lib-$(CONFIG_RISCV_ISA_V) += xor.o lib-$(CONFIG_RISCV_ISA_V) += riscv_v_helpers.o lib-$(CONFIG_RISCV_ISA_V) += uaccess_vector.o +lib-$(CONFIG_RISCV_ISA_V) += memset_vector.o +lib-$(CONFIG_RISCV_ISA_V) += memcpy_vector.o +lib-$(CONFIG_RISCV_ISA_V) += memmove_vector.o diff --git a/arch/riscv/lib/memcpy_vector.S b/arch/riscv/lib/memcpy_vector.S new file mode 100644 index 000000000000..4176b6e0a53c --- /dev/null +++ b/arch/riscv/lib/memcpy_vector.S @@ -0,0 +1,29 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ + +#include +#include + +#define pDst a0 +#define pSrc a1 +#define iNum a2 + +#define iVL a3 +#define pDstPtr a4 + +#define ELEM_LMUL_SETTING m8 +#define vData v0 + + +/* void *memcpy(void *, const void *, size_t) */ +SYM_FUNC_START(__asm_memcpy_vector) + mv pDstPtr, pDst +loop: + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + vle8.v vData, (pSrc) + sub iNum, iNum, iVL + add pSrc, pSrc, iVL + vse8.v vData, (pDstPtr) + add pDstPtr, pDstPtr, iVL + bnez iNum, loop + ret +SYM_FUNC_END(__asm_memcpy_vector) diff --git a/arch/riscv/lib/memmove_vector.S b/arch/riscv/lib/memmove_vector.S new file mode 100644 index 000000000000..4cea9d244dc9 --- /dev/null +++ b/arch/riscv/lib/memmove_vector.S @@ -0,0 +1,49 @@ +/* 
SPDX-License-Identifier: GPL-2.0-only */ +#include +#include + +#define pDst a0 +#define pSrc a1 +#define iNum a2 + +#define iVL a3 +#define pDstPtr a4 +#define pSrcBackwardPtr a5 +#define pDstBackwardPtr a6 + +#define ELEM_LMUL_SETTING m8 +#define vData v0 + +SYM_FUNC_START(__asm_memmove_vector) + + mv pDstPtr, pDst + + bgeu pSrc, pDst, forward_copy_loop + add pSrcBackwardPtr, pSrc, iNum + add pDstBackwardPtr, pDst, iNum + bltu pDst, pSrcBackwardPtr, backward_copy_loop + +forward_copy_loop: + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + + vle8.v vData, (pSrc) + sub iNum, iNum, iVL + add pSrc, pSrc, iVL + vse8.v vData, (pDstPtr) + add pDstPtr, pDstPtr, iVL + + bnez iNum, forward_copy_loop + ret + +backward_copy_loop: + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + + sub pSrcBackwardPtr, pSrcBackwardPtr, iVL + vle8.v vData, (pSrcBackwardPtr) + sub iNum, iNum, iVL + sub pDstBackwardPtr, pDstBackwardPtr, iVL + vse8.v vData, (pDstBackwardPtr) + bnez iNum, backward_copy_loop + ret + +SYM_FUNC_END(__asm_memmove_vector) diff --git a/arch/riscv/lib/memset_vector.S b/arch/riscv/lib/memset_vector.S new file mode 100644 index 000000000000..4611feed72ac --- /dev/null +++ b/arch/riscv/lib/memset_vector.S @@ -0,0 +1,33 @@ +/* SPDX-License-Identifier: GPL-2.0-only */ +#include +#include + +#define pDst a0 +#define iValue a1 +#define iNum a2 + +#define iVL a3 +#define iTemp a4 +#define pDstPtr a5 + +#define ELEM_LMUL_SETTING m8 +#define vData v0 + +/* void *memset(void *, int, size_t) */ +SYM_FUNC_START(__asm_memset_vector) + + mv pDstPtr, pDst + + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + vmv.v.x vData, iValue + +loop: + vse8.v vData, (pDstPtr) + sub iNum, iNum, iVL + add pDstPtr, pDstPtr, iVL + vsetvli iVL, iNum, e8, ELEM_LMUL_SETTING, ta, ma + bnez iNum, loop + + ret + +SYM_FUNC_END(__asm_memset_vector) diff --git a/arch/riscv/lib/riscv_v_helpers.c b/arch/riscv/lib/riscv_v_helpers.c index d763b9c69fb7..12e8c5deb013 100644 --- 
a/arch/riscv/lib/riscv_v_helpers.c +++ b/arch/riscv/lib/riscv_v_helpers.c @@ -36,3 +36,24 @@ asmlinkage int enter_vector_usercopy(void *dst, void *src, size_t n) fallback: return fallback_scalar_usercopy(dst, src, n); } + +#define V_OPT_TEMPLATE3(prefix, type_r, type_0, type_1) \ +extern type_r __asm_##prefix##_vector(type_0, type_1, size_t n); \ +type_r prefix(type_0 a0, type_1 a1, size_t n) \ +{ \ + type_r ret; \ + if (has_vector() && may_use_simd() && n > riscv_v_##prefix##_thres) { \ + kernel_vector_begin(); \ + ret = __asm_##prefix##_vector(a0, a1, n); \ + kernel_vector_end(); \ + return ret; \ + } \ + return __##prefix(a0, a1, n); \ +} + +static size_t riscv_v_memset_thres = 1280; +V_OPT_TEMPLATE3(memset, void *, void*, int) +static size_t riscv_v_memcpy_thres = 768; +V_OPT_TEMPLATE3(memcpy, void *, void*, const void *) +static size_t riscv_v_memmove_thres = 512; +V_OPT_TEMPLATE3(memmove, void *, void*, const void *)
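Every vector loop in this series shares the same vsetvli strip-mining shape: each trip asks the hardware how many elements it will handle this time, processes that many, and advances. A scalar C model of the pattern, using memset as the example (`VLMAX_MODEL` is an arbitrary stand-in for the implementation-defined VLEN/8 * LMUL chunk size, and all names are illustrative):

```c
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the maximum element count vsetvli would grant with
 * e8/m8; the real value depends on the hardware's VLEN. */
#define VLMAX_MODEL 64

/* Scalar model of the strip-mined vector memset loop: each trip
 * handles vl = min(remaining, VLMAX) bytes, then advances by vl,
 * exactly as the vsetvli/vse8.v/bnez loop does. */
static void memset_stripmine_model(uint8_t *dst, int value, size_t n)
{
	while (n) {
		size_t vl = n < VLMAX_MODEL ? n : VLMAX_MODEL; /* vsetvli */
		for (size_t i = 0; i < vl; i++)		/* vse8.v */
			dst[i] = (uint8_t)value;
		n -= vl;				/* sub iNum, iNum, iVL */
		dst += vl;				/* add pDstPtr, ..., iVL */
	}
}
```

Because vsetvli clamps the granted length to what remains, the final partial chunk needs no special-case tail handling — the same property the assembly relies on.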