From patchwork Fri Dec 29 14:36:17 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Andy Chiu
X-Patchwork-Id: 13506537
From: Andy Chiu
To: linux-riscv@lists.infradead.org, palmer@dabbelt.com
Cc: paul.walmsley@sifive.com, greentime.hu@sifive.com,
 guoren@linux.alibaba.com, bjorn@kernel.org, charlie@rivosinc.com,
 ardb@kernel.org, arnd@arndb.de, peterz@infradead.org, tglx@linutronix.de,
 ebiggers@kernel.org, Andy Chiu, Albert Ou
Subject: [v9, 00/10] riscv: support kernel-mode Vector
Date: Fri, 29 Dec 2023 14:36:17 +0000
Message-Id: <20231229143627.22898-1-andy.chiu@sifive.com>
X-Mailer: git-send-email 2.17.1

This series adds support for running Vector in kernel mode.
Additionally, kernel-mode Vector can be configured to run without
turning off preemption on a CONFIG_PREEMPT kernel. Along with this
support, we add Vector-optimized copy_{to,from}_user and provide a
simple threshold to decide when to run the vectorized functions.

We decided to drop patch 6 ("riscv: lib: add vectorized mem* routines")
from the last series for the moment. We would like to discuss the issue
before adding it back (or not). We hope this helps keep the review
process going.

The issue is that a side effect of the mem* functions can violate the
caller's expectations. This can happen when the destination or source
memory overlaps with the memory touched by kernel_vector_{begin,end}().
In the observed case, an optimized task_struct assignment in
copy_process() caused an implicit write to current->softirqs_enabled in
kernel_vector_begin(). Since the field is located within the source
memory region, the temporary change to the field was copied to the
destination task_struct. After the copy, a softirq assertion check
failed because the kernel expects that softirqs are not disabled for the
new task_struct. Please note that riscv_v_flags in the destination
task_struct also has a stale value after this copy, but that is fine,
because the flag is not shared outside of kernel-mode Vector and we make
sure to initialize it before starting any threads.

(with CONFIG_PROVE_LOCKING)
copy_process:
	arch_dup_task_struct:
		*dst = *src; # optimized to memcpy():
			kernel_vector_begin():
				current->softirqs_enabled = 0;
			__asm_memcpy_vector();
	DEBUG_LOCKS_WARN_ON(!dst->softirqs_enabled);

A possible solution is to provide an alias check and fall back to the
scalar version if either the source or the destination memory overlaps
with the current task_struct. This will increase the overhead for both
versions of the mem* routines.

Or, we could use the vectorized version of mem* only when we don't have
to alter shared state in task_struct. For example, if we have already
disabled bh, or if we are in softirq. In these cases we can proceed to
save the dirty context, if any, and use Vector. This would apply only to
tightly constrained operations such as mem*, where we have a clear bound
and a fallback.

Another direction is to minimize the footprint of kernel_vector_begin(),
e.g. upgrading local_bh_disable to preempt_disable should mitigate the
current issue. This will require us to add percpu storage for V on
non-preemptible Vector.

Since preempt_v does not alter any shared state in task_struct when it
is activated in task context, it is safer to call vectorized mem*.
However, since the current fallback for preempt_v still touches the
shared state, it is not considered entirely safe to use vectorized mem*.
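
To make the alias-check option above concrete, here is a minimal
userspace sketch. It is not code from this series: struct fake_task,
fake_current, fake_vector_begin()/fake_vector_end(), copy_unchecked()
and copy_checked() are all hypothetical stand-ins, and plain memcpy()
stands in for both the scalar routine and __asm_memcpy_vector(). It only
models the shared-state side effect and the overlap test.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the part of task_struct that matters here. */
struct fake_task {
	int softirqs_enabled;
	int payload[16];
};

/* Hypothetical stand-in for "current". */
static struct fake_task fake_current = { .softirqs_enabled = 1 };

/* Do the half-open ranges [a, a + alen) and [b, b + blen) intersect? */
static bool ranges_overlap(const void *a, size_t alen,
			   const void *b, size_t blen)
{
	uintptr_t as = (uintptr_t)a, bs = (uintptr_t)b;

	return as < bs + blen && bs < as + alen;
}

/* Model only the shared-state side effect of entering/leaving Vector. */
static void fake_vector_begin(void) { fake_current.softirqs_enabled = 0; }
static void fake_vector_end(void)   { fake_current.softirqs_enabled = 1; }

/* "Vectorized" copy with no alias check: reproduces the hazard. */
static void *copy_unchecked(void *dst, const void *src, size_t n)
{
	fake_vector_begin();
	memcpy(dst, src, n);	/* stands in for __asm_memcpy_vector() */
	fake_vector_end();
	return dst;
}

/* Same copy, but use the scalar path if either buffer aliases "current". */
static void *copy_checked(void *dst, const void *src, size_t n)
{
	if (ranges_overlap(dst, n, &fake_current, sizeof(fake_current)) ||
	    ranges_overlap(src, n, &fake_current, sizeof(fake_current)))
		return memcpy(dst, src, n);	/* scalar fallback */

	return copy_unchecked(dst, src, n);
}
```

Copying fake_current with copy_unchecked() leaves the destination with
softirqs_enabled == 0, which mirrors the failed assertion in the trace
above, while copy_checked() takes the scalar path and preserves the
value. The two range comparisons on every call are the extra overhead
mentioned for both versions of the routines.
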
Still, it is possible to make preempt_v safe for vectorized mem* by
refusing to launch a new kernel thread if the V context allocation
fails, so that kernel threads always use preempt_v in their task
context.

This series is composed of 4 parts:
 patch 1-4: adds basic support for kernel-mode Vector
 patch 5: includes vectorized copy_{to,from}_user into the kernel
 patch 6: refactor context switch code in fpu [2]
 patch 7-10: provides some code refactors and support for preemptible
             kernel-mode Vector.

This series can be merged if we feel any part of {1~4, 5, 6, 7~10} is
mature enough.

This series was tested on QEMU with V, and we verified that booting and
normal userspace operations all work as usual with thresholds set to 0.
We also tested by launching multiple kernel threads that continuously
execute and verify Vector operations in the background. The module that
tests these operations is expected to be upstreamed later.

v8 of this series can be found at [1]

[1]: https://lore.kernel.org/all/20231223042914.18599-1-andy.chiu@sifive.com/
[2]: https://lore.kernel.org/linux-riscv/20231221070449.1809020-1-songshuaishuai@tinylab.org/

Patch summary:
 - Updated patches: 1, 4, 10
 - New patch: 6
 - Unchanged patch: 2, 3, 5, 7, 8, 9
 - Deleted patch: 6 (from v8)

Changelog v9:
 - Use one bit to record the on/off status of kernel-mode Vector
 - Temporarily drop vectorized mem* functions
 - Add a patch to refactor context switch in fpu
 - silence lockdep and use WARN_ON instead
Changelog v8:
 - Address build fail on no-mmu config
 - Fix build fail with W=1
 - Refactor patches (1, 2), Eric
Changelog v7:
 - Fix build fail for allmodconfig and test building the series with
   allmodconfig/allyesconfig
Changelog v6:
 - Provide a more robust check on the use of non-preemptible Vector.
 - Add Kconfigs to set threshold value at compile time. (Charlie)
 - Add a patch to utilize kmem_cache_* for V context allocations.
 - Re-write and add preemptible Vector.
Changelog v5:
 - Rebase on top of riscv for-next (6.7-rc1)
Changelog v4:
 - Use kernel_v_flags and helpers to track vector context.
 - Prevent softirq from nesting V context for non-preempt V
 - Add user copy and mem* routines
Changelog v3:
 - Rebase on top of riscv for-next (6.6-rc1)
 - Fix a build issue (Conor)
 - Guard vstate_save, vstate_restore with {get,put}_cpu_vector_context.
 - Save V context after disabling preemption. (Guo)
 - Remove irqs_disabled() check from may_use_simd(). (Björn)
 - Comment about nesting V context.
Changelog v2:
 - fix build issues
 - Follow arm's way of starting kernel-mode simd code:
   - add include/asm/simd.h and rename may_use_vector() ->
     may_use_simd()
   - return void in kernel_vector_begin(), and BUG_ON if may_use_simd()
     fails
 - Change naming scheme for functions/macros (Conor):
   - remove KMV
   - 's/rvv/vector/'
   - 's/RISCV_ISA_V_PREEMPTIVE_KMV/RISCV_ISA_V_PREEMPTIVE/'
   - 's/TIF_RISCV_V_KMV/TIF_RISCV_V_KERNEL_MODE/'

Andy Chiu (8):
  riscv: vector: make Vector always available for softirq context
  riscv: sched: defer restoring Vector context for user
  riscv: lib: vectorize copy_to_user/copy_from_user
  riscv: fpu: drop SR_SD bit checking
  riscv: vector: do not pass task_struct into
    riscv_v_vstate_{save,restore}()
  riscv: vector: use a mask to write vstate_ctrl
  riscv: vector: use kmem_cache to manage vector context
  riscv: vector: allow kernel-mode Vector with preemption

Greentime Hu (2):
  riscv: Add support for kernel mode vector
  riscv: Add vector extension XOR implementation

 arch/riscv/Kconfig                      |  22 +++
 arch/riscv/include/asm/asm-prototypes.h |  27 +++
 arch/riscv/include/asm/entry-common.h   |  17 ++
 arch/riscv/include/asm/processor.h      |  42 +++-
 arch/riscv/include/asm/simd.h           |  64 ++++++
 arch/riscv/include/asm/switch_to.h      |   3 +-
 arch/riscv/include/asm/thread_info.h    |   2 +
 arch/riscv/include/asm/vector.h         | 100 ++++++++--
 arch/riscv/include/asm/xor.h            |  68 +++++++
 arch/riscv/kernel/Makefile              |   1 +
 arch/riscv/kernel/entry.S               |   8 +
 arch/riscv/kernel/kernel_mode_vector.c  | 251 ++++++++++++++++++++++++
 arch/riscv/kernel/process.c             |  13 +-
 arch/riscv/kernel/ptrace.c              |   7 +-
 arch/riscv/kernel/signal.c              |   7 +-
 arch/riscv/kernel/vector.c              |  50 ++++-
 arch/riscv/lib/Makefile                 |   7 +-
 arch/riscv/lib/riscv_v_helpers.c        |  44 +++++
 arch/riscv/lib/uaccess.S                |  10 +
 arch/riscv/lib/uaccess_vector.S         |  50 +++++
 arch/riscv/lib/xor.S                    |  81 ++++++++
 21 files changed, 846 insertions(+), 28 deletions(-)
 create mode 100644 arch/riscv/include/asm/simd.h
 create mode 100644 arch/riscv/include/asm/xor.h
 create mode 100644 arch/riscv/kernel/kernel_mode_vector.c
 create mode 100644 arch/riscv/lib/riscv_v_helpers.c
 create mode 100644 arch/riscv/lib/uaccess_vector.S
 create mode 100644 arch/riscv/lib/xor.S