From patchwork Tue Dec 5 09:27:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13479633 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 6DF85C10DCE for ; Tue, 5 Dec 2023 09:28:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=3sCCSF0IrcNmuFQM0+yBn7UHSdrxw5CgljLnCWjbVIc=; b=eVOnDKSM3FArRw uoe/z5IrJm7zzZK8z4vjSr5TznMqD7sZIWAU++b/L0Co6Q6K80vZ7xt8Uc+bm8LIfhNRoklrv3c5F 6GJn+1NBEHCCInlGZnqdaC1be4TAH7qWKHrTddyvA0QfEA9Q1ojBqf/LqEzxb55AkIz8xI8WExG3l qC4nceOtCgSThq4OwLGBygTmRDzjvxj53G132/cKTS+/dYQfBVBT8BIoJFegdg/YvQM86Nb7uYcSV suIyBGcXECAcY9Sxa+eRc4Gw/jqgRgWeekf5sxggBxfacjWTdtjkslLhq2/rnC0B2nVj7PlKkmB6/ jXabL2W2AE1/uPvRw6LA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rARj0-006nZz-2y; Tue, 05 Dec 2023 09:28:18 +0000 Received: from mail-pg1-x533.google.com ([2607:f8b0:4864:20::533]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rARiy-006nYY-0Z for linux-riscv@lists.infradead.org; Tue, 05 Dec 2023 09:28:17 +0000 Received: by mail-pg1-x533.google.com with SMTP id 41be03b00d2f7-5c659db0ce2so2449645a12.0 for ; Tue, 05 Dec 2023 01:28:15 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1701768495; x=1702373295; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=2+Duv2eL4Rh6BmoQvomNjrPNP7WadF4MX+fx63CbvwQ=; b=RdtWWbQ8N8f0ODm1DoOZVmS1pLt2v03TwkhANi29LJvCtFbGRApOcwSCqhQuN2PT1Z 0KVDYHeG+7JUtcgLCHO3mcRRn3tpnDfzDFR53NZR1FwW3TNVMH8KM4GprJMKQ+CZH/Af x1+INTzWPJMBhfiTMpsqRzV69cPw6a4AP1u2mhjeNXGZqxnCYIUxEX/RGEvQnhp7YXpr l7GldRqN1owwwCiJVOSXob4+5KY/nFLcb/OkqqyyRl+06S070gxwqP0BKXvBgxCYqWyQ /2pzRvOFgL9vCYg1GrqM4GcCwlupm1Juvb7/DHoaAMQ0AZ3TigsiFWfFUxhPYY0tcqxI zXTA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701768495; x=1702373295; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=2+Duv2eL4Rh6BmoQvomNjrPNP7WadF4MX+fx63CbvwQ=; b=Met+x6EYN3i2HiRhRCQ+HqkcQnmyjeaCQdYriEBiao+hHQGi51Tuv/JxMjdUCLcgOe BxhUcNKzCcVe50HckUcCVRGuJ5m37nFAeeoc/JjUp6Hky4yl0xR+GRBZBG9qGeFQl+mY YJyJnnhO1cGAtx4KqK9CA3ICN/hWuPC2NHe3IgM3ysuy4jF7FcseDKslOfhe0Onag9l+ bc+r1C0gQ36R8S1Bm65m4uh2dhjE4pdAtAeMNuygpG7Iw56PZ150JslnMTllqokPFUWH 49hFf4ZGEht3X3IJ8ZX4bQz6TkyiMJls2jbUhV0wb2fdmqtYZVNDqelsfYtdshnOG7Nn kidA== X-Gm-Message-State: AOJu0Yyj8qClvrl83zKjndCCBwva1la+uETwnkOJwEjOmN/3wyfGmedc 1HgqEpio/qFO/aKn2CfJniCBLA== X-Google-Smtp-Source: AGHT+IHo8qyIK+a7xAlqBDFbhHitNbdyhcjGou5e7sEK1zHUFcXdJWS5bgBXVVX5Sw11ZixDo6r/6Q== X-Received: by 2002:a05:6a20:160a:b0:18f:97c:9778 with SMTP id l10-20020a056a20160a00b0018f097c9778mr7394313pzj.96.1701768494761; Tue, 05 Dec 2023 01:28:14 -0800 (PST) Received: from localhost.localdomain ([101.10.93.135]) by smtp.gmail.com with ESMTPSA id l6-20020a056a00140600b006cdd723bb6fsm8858788pfu.115.2023.12.05.01.28.11 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Dec 2023 01:28:14 -0800 (PST) From: Jerry 
 Shih
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org, conor@kernel.org
Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org
Subject: [PATCH v3 01/12] RISC-V: add helper function to read the vector VLEN
Date: Tue, 5 Dec 2023 17:27:50 +0800
Message-Id: <20231205092801.1335-2-jerry.shih@sifive.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20231205092801.1335-1-jerry.shih@sifive.com>
References: <20231205092801.1335-1-jerry.shih@sifive.com>
MIME-Version: 1.0
X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3
X-CRM114-CacheID: sfid-20231205_012816_214392_725FC099
X-CRM114-Status: GOOD ( 11.36 )
X-BeenThere: linux-riscv@lists.infradead.org
X-Mailman-Version: 2.1.34
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Sender: "linux-riscv"
Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org

From: Heiko Stuebner

VLEN describes the length of each vector register and some instructions
need specific minimal VLENs to work correctly.

The vector code already includes a variable riscv_v_vsize that contains
the value of "32 vector registers with vlenb length" that gets filled
during boot. vlenb is the value contained in the CSR_VLENB register and
the value represents "VLEN / 8".

So add riscv_vector_vlen() to return the actual VLEN value for in-kernel
users when they need to check the available VLEN.
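
As a rough illustration of the arithmetic described above (a userspace
sketch only, not part of this patch): a core with vlenb = 16 has
riscv_v_vsize = 32 * 16 = 512 bytes, so riscv_v_vsize / 32 * 8 gives a
VLEN of 128 bits. The model_* names below are hypothetical and are not
kernel symbols.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical userspace model of the VLEN calculation. None of these
 * names exist in the kernel; they only mirror the arithmetic.
 */

/* Bytes needed to hold all 32 vector registers, i.e. 32 * vlenb. */
static size_t model_vsize_from_vlenb(size_t vlenb)
{
	return 32 * vlenb;
}

/* Rebuild VLEN in bits: vsize / 32 is vlenb, and vlenb * 8 is VLEN. */
static int model_vector_vlen(size_t vsize)
{
	/* vsize is assumed to be an exact multiple of the register count. */
	assert(vsize % 32 == 0);
	return (int)(vsize / 32 * 8);
}

/* Nonzero if the modelled VLEN is at least min_vlen bits. */
static int model_vlen_at_least(size_t vsize, int min_vlen)
{
	return model_vector_vlen(vsize) >= min_vlen;
}
```

Under these assumptions, a caller that needs at least 128-bit vector
registers could gate its fast path on model_vlen_at_least(vsize, 128).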

Signed-off-by: Heiko Stuebner
Reviewed-by: Eric Biggers
Signed-off-by: Jerry Shih
---
 arch/riscv/include/asm/vector.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
index 9fb2dea66abd..1fd3e5510b64 100644
--- a/arch/riscv/include/asm/vector.h
+++ b/arch/riscv/include/asm/vector.h
@@ -244,4 +244,15 @@ void kernel_vector_allow_preemption(void);
 #define kernel_vector_allow_preemption() do {} while (0)
 #endif
 
+/*
+ * Return the implementation's vlen value.
+ *
+ * riscv_v_vsize contains the value of "32 vector registers with vlenb length"
+ * so rebuild the vlen value in bits from it.
+ */
+static inline int riscv_vector_vlen(void)
+{
+	return riscv_v_vsize / 32 * 8;
+}
+
 #endif /* ! __ASM_RISCV_VECTOR_H */

From patchwork Tue Dec 5 09:27:51 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jerry Shih
X-Patchwork-Id: 13479634
From: Jerry Shih
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org, conor@kernel.org
Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org
Subject: [PATCH v3 02/12] RISC-V: hook new crypto subdir into build-system
Date: Tue, 5 Dec 2023 17:27:51 +0800
Message-Id: <20231205092801.1335-3-jerry.shih@sifive.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20231205092801.1335-1-jerry.shih@sifive.com>
References: <20231205092801.1335-1-jerry.shih@sifive.com>

From: Heiko Stuebner

Create a crypto subdirectory for added accelerated cryptography routines
and hook it into the riscv Kbuild and the main crypto Kconfig.

Signed-off-by: Heiko Stuebner
Reviewed-by: Eric Biggers
Signed-off-by: Jerry Shih
---
 arch/riscv/Kbuild          | 1 +
 arch/riscv/crypto/Kconfig  | 5 +++++
 arch/riscv/crypto/Makefile | 4 ++++
 crypto/Kconfig             | 3 +++
 4 files changed, 13 insertions(+)
 create mode 100644 arch/riscv/crypto/Kconfig
 create mode 100644 arch/riscv/crypto/Makefile

diff --git a/arch/riscv/Kbuild b/arch/riscv/Kbuild
index d25ad1c19f88..2c585f7a0b6e 100644
--- a/arch/riscv/Kbuild
+++ b/arch/riscv/Kbuild
@@ -2,6 +2,7 @@
 
 obj-y += kernel/ mm/ net/
 obj-$(CONFIG_BUILTIN_DTB) += boot/dts/
+obj-$(CONFIG_CRYPTO) += crypto/
 obj-y += errata/
 obj-$(CONFIG_KVM) += kvm/
 
diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig
new file mode 100644
index 000000000000..10d60edc0110
--- /dev/null
+++ b/arch/riscv/crypto/Kconfig
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0
+
+menu "Accelerated Cryptographic Algorithms for CPU (riscv)"
+
+endmenu
diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile
new file mode 100644
index 000000000000..b3b6332c9f6d
--- /dev/null
+++ b/arch/riscv/crypto/Makefile
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# linux/arch/riscv/crypto/Makefile
+#
diff --git a/crypto/Kconfig b/crypto/Kconfig
index 650b1b3620d8..c7b23d2c58e4 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1436,6 +1436,9 @@ endif
 if PPC
 source "arch/powerpc/crypto/Kconfig"
 endif
+if RISCV
+source "arch/riscv/crypto/Kconfig"
+endif
 if S390
 source "arch/s390/crypto/Kconfig"
 endif

From patchwork Tue Dec 5 09:27:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jerry Shih
X-Patchwork-Id: 13479635
From: Jerry Shih
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org, conor@kernel.org
Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org
Subject: [PATCH v3 03/12] RISC-V: crypto: add OpenSSL perl module for vector instructions
Date: Tue, 5 Dec 2023 17:27:52 +0800
Message-Id: <20231205092801.1335-4-jerry.shih@sifive.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20231205092801.1335-1-jerry.shih@sifive.com>
References: <20231205092801.1335-1-jerry.shih@sifive.com>

OpenSSL has some RISC-V vector cryptography implementations which could
be reused for the kernel. These implementations use a number of perl
helpers for handling vector-crypto-extension instructions. This patch
takes these perl helpers from OpenSSL (openssl/openssl#21923). The unused
scalar crypto instructions in the original perl module are skipped.

Co-developed-by: Christoph Müllner
Signed-off-by: Christoph Müllner
Co-developed-by: Heiko Stuebner
Signed-off-by: Heiko Stuebner
Co-developed-by: Phoebe Chen
Signed-off-by: Phoebe Chen
Signed-off-by: Jerry Shih
---
Changelog v3:
 - Only use opcodes for vector crypto extensions.
   We use the asm mnemonics for all instructions in the RVV 1.0
   extension, which are already supported in the kernel.
---
 arch/riscv/crypto/riscv.pm | 359 +++++++++++++++++++++++++++++++++++++
 1 file changed, 359 insertions(+)
 create mode 100644 arch/riscv/crypto/riscv.pm

diff --git a/arch/riscv/crypto/riscv.pm b/arch/riscv/crypto/riscv.pm
new file mode 100644
index 000000000000..d91ad902ca04
--- /dev/null
+++ b/arch/riscv/crypto/riscv.pm
@@ -0,0 +1,359 @@
+#! /usr/bin/env perl
+# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause
+#
+# This file is dual-licensed, meaning that you can use it under your
+# choice of either of the following two licenses:
+#
+# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved.
+#
+# Licensed under the Apache License 2.0 (the "License"). You can obtain
+# a copy in the file LICENSE in the source distribution or at
+# https://www.openssl.org/source/license.html
+#
+# or
+#
+# Copyright (c) 2023, Christoph Müllner
+# Copyright (c) 2023, Jerry Shih
+# Copyright (c) 2023, Phoebe Chen
+# All rights reserved.
+#
+# Redistribution and use in source and binary forms, with or without
+# modification, are permitted provided that the following conditions
+# are met:
+# 1. Redistributions of source code must retain the above copyright
+#    notice, this list of conditions and the following disclaimer.
+# 2. Redistributions in binary form must reproduce the above copyright
+#    notice, this list of conditions and the following disclaimer in the
+#    documentation and/or other materials provided with the distribution.
+#
+# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+ +use strict; +use warnings; + +# Set $have_stacktrace to 1 if we have Devel::StackTrace +my $have_stacktrace = 0; +if (eval {require Devel::StackTrace;1;}) { + $have_stacktrace = 1; +} + +my @regs = map("x$_",(0..31)); +# Mapping from the RISC-V psABI ABI mnemonic names to the register number. +my @regaliases = ('zero','ra','sp','gp','tp','t0','t1','t2','s0','s1', + map("a$_",(0..7)), + map("s$_",(2..11)), + map("t$_",(3..6)) +); + +my %reglookup; +@reglookup{@regs} = @regs; +@reglookup{@regaliases} = @regs; + +# Takes a register name, possibly an alias, and converts it to a register index +# from 0 to 31 +sub read_reg { + my $reg = lc shift; + if (!exists($reglookup{$reg})) { + my $trace = ""; + if ($have_stacktrace) { + $trace = Devel::StackTrace->new->as_string; + } + die("Unknown register ".$reg."\n".$trace); + } + my $regstr = $reglookup{$reg}; + if (!($regstr =~ /^x([0-9]+)$/)) { + my $trace = ""; + if ($have_stacktrace) { + $trace = Devel::StackTrace->new->as_string; + } + die("Could not process register ".$reg."\n".$trace); + } + return $1; +} + +my @vregs = map("v$_",(0..31)); +my %vreglookup; +@vreglookup{@vregs} = @vregs; + +sub read_vreg { + my $vreg = lc shift; + if (!exists($vreglookup{$vreg})) { + my $trace = ""; + if ($have_stacktrace) { + $trace = Devel::StackTrace->new->as_string; + } + die("Unknown vector register ".$vreg."\n".$trace); + } + if (!($vreg =~ /^v([0-9]+)$/)) { + my $trace = ""; + if ($have_stacktrace) { + $trace = Devel::StackTrace->new->as_string; + } + die("Could not process vector register ".$vreg."\n".$trace); + } + return $1; +} + +# Read the vm settings and convert to mask encoding. +sub read_mask_vreg { + my $vreg = shift; + # The default value is unmasked. + my $mask_bit = 1; + + if (defined($vreg)) { + my $reg_id = read_vreg $vreg; + if ($reg_id == 0) { + $mask_bit = 0; + } else { + my $trace = ""; + if ($have_stacktrace) { + $trace = Devel::StackTrace->new->as_string; + } + die("The ".$vreg." 
is not the mask register v0.\n".$trace); + } + } + return $mask_bit; +} + +# Vector crypto instructions + +## Zvbb and Zvkb instructions +## +## vandn (also in zvkb) +## vbrev +## vbrev8 (also in zvkb) +## vrev8 (also in zvkb) +## vclz +## vctz +## vcpop +## vrol (also in zvkb) +## vror (also in zvkb) +## vwsll + +sub vbrev8_v { + # vbrev8.v vd, vs2, vm + my $template = 0b010010_0_00000_01000_010_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($vs2 << 20) | ($vd << 7)); +} + +sub vrev8_v { + # vrev8.v vd, vs2, vm + my $template = 0b010010_0_00000_01001_010_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($vs2 << 20) | ($vd << 7)); +} + +sub vror_vi { + # vror.vi vd, vs2, uimm + my $template = 0b01010_0_1_00000_00000_011_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $uimm = shift; + my $uimm_i5 = $uimm >> 5; + my $uimm_i4_0 = $uimm & 0b11111; + + return ".word ".($template | ($uimm_i5 << 26) | ($vs2 << 20) | ($uimm_i4_0 << 15) | ($vd << 7)); +} + +sub vwsll_vv { + # vwsll.vv vd, vs2, vs1, vm + my $template = 0b110101_0_00000_00000_000_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($vs2 << 20) | ($vs1 << 15) | ($vd << 7)); +} + +## Zvbc instructions + +sub vclmulh_vx { + # vclmulh.vx vd, vs2, rs1 + my $template = 0b0011011_00000_00000_110_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $rs1 = read_reg shift; + return ".word ".($template | ($vs2 << 20) | ($rs1 << 15) | ($vd << 7)); +} + +sub vclmul_vx_v0t { + # vclmul.vx vd, vs2, rs1, v0.t + my $template = 0b0011000_00000_00000_110_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $rs1 = read_reg 
shift; + return ".word ".($template | ($vs2 << 20) | ($rs1 << 15) | ($vd << 7)); +} + +sub vclmul_vx { + # vclmul.vx vd, vs2, rs1 + my $template = 0b0011001_00000_00000_110_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $rs1 = read_reg shift; + return ".word ".($template | ($vs2 << 20) | ($rs1 << 15) | ($vd << 7)); +} + +## Zvkg instructions + +sub vghsh_vv { + # vghsh.vv vd, vs2, vs1 + my $template = 0b1011001_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vs1 << 15) | ($vd << 7)); +} + +sub vgmul_vv { + # vgmul.vv vd, vs2 + my $template = 0b1010001_00000_10001_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vd << 7)); +} + +## Zvkned instructions + +sub vaesdf_vs { + # vaesdf.vs vd, vs2 + my $template = 0b101001_1_00000_00001_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vd << 7)); +} + +sub vaesdm_vs { + # vaesdm.vs vd, vs2 + my $template = 0b101001_1_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vd << 7)); +} + +sub vaesef_vs { + # vaesef.vs vd, vs2 + my $template = 0b101001_1_00000_00011_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vd << 7)); +} + +sub vaesem_vs { + # vaesem.vs vd, vs2 + my $template = 0b101001_1_00000_00010_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vd << 7)); +} + +sub vaeskf1_vi { + # vaeskf1.vi vd, vs2, uimmm + my $template = 0b100010_1_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $uimm = shift; + return ".word ".($template | ($uimm << 15) | ($vs2 << 
20) | ($vd << 7)); +} + +sub vaeskf2_vi { + # vaeskf2.vi vd, vs2, uimm + my $template = 0b101010_1_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $uimm = shift; + return ".word ".($template | ($vs2 << 20) | ($uimm << 15) | ($vd << 7)); +} + +sub vaesz_vs { + # vaesz.vs vd, vs2 + my $template = 0b101001_1_00000_00111_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vd << 7)); +} + +## Zvknha and Zvknhb instructions + +sub vsha2ms_vv { + # vsha2ms.vv vd, vs2, vs1 + my $template = 0b1011011_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + return ".word ".($template | ($vs2 << 20)| ($vs1 << 15 )| ($vd << 7)); +} + +sub vsha2ch_vv { + # vsha2ch.vv vd, vs2, vs1 + my $template = 0b101110_10000_00000_001_00000_01110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + return ".word ".($template | ($vs2 << 20)| ($vs1 << 15 )| ($vd << 7)); +} + +sub vsha2cl_vv { + # vsha2cl.vv vd, vs2, vs1 + my $template = 0b101111_10000_00000_001_00000_01110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + return ".word ".($template | ($vs2 << 20)| ($vs1 << 15 )| ($vd << 7)); +} + +## Zvksed instructions + +sub vsm4k_vi { + # vsm4k.vi vd, vs2, uimm + my $template = 0b1000011_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $uimm = shift; + return ".word ".($template | ($vs2 << 20) | ($uimm << 15) | ($vd << 7)); +} + +sub vsm4r_vs { + # vsm4r.vs vd, vs2 + my $template = 0b1010011_00000_10000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vd << 7)); +} + +## zvksh instructions + +sub vsm3c_vi { + # vsm3c.vi vd, vs2, uimm + my $template = 0b1010111_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; 
+ my $vs2 = read_vreg shift; + my $uimm = shift; + return ".word ".($template | ($vs2 << 20) | ($uimm << 15 ) | ($vd << 7)); +} + +sub vsm3me_vv { + # vsm3me.vv vd, vs2, vs1 + my $template = 0b1000001_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vs1 << 15 ) | ($vd << 7)); +} + +1; From patchwork Tue Dec 5 09:27:53 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13479636 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 57061C4167B for ; Tue, 5 Dec 2023 09:28:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=hjL5S7OjtkGRJBwx8Dfthrt1W2SEmn/hw3vPuskWxR8=; b=NIBAIiBQC9ZrWu ubz3x89t5GF2RVE093raU08Vv9+qL6nfQJD96iGSZSFEejjcUcAefp9sqByhxW1iI3xufH92ib/My 5Va0XuWCw5MrDv3XfXFgsQkJ4kmU3Dva6VQtBzP4Ap7BFWov60uMjYMiki3mGInKsOQaDgi966Fgx cN8/AMyS/+VsZ7Ags3OQPC8exCapXVYwcaPmTn114vDd1x+l1TRUW34q/c8qxgxKYtI0t2laS1Shr VQ04CvA7YsJRZHFug28wGbMVvfd2+DqeGCQrdySh82skX6oVia/SiaJbjhTRU3Kxn1LG1edEJtjUf 1Q5veoVSI1EyB9ZKWV5g==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rARjD-006ngn-0P; Tue, 05 Dec 
2023 09:28:31 +0000 Received: from mail-pf1-x42b.google.com ([2607:f8b0:4864:20::42b]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rARj8-006ndk-2Z for linux-riscv@lists.infradead.org; Tue, 05 Dec 2023 09:28:29 +0000 Received: by mail-pf1-x42b.google.com with SMTP id d2e1a72fcca58-6cdd214bce1so5722499b3a.3 for ; Tue, 05 Dec 2023 01:28:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1701768505; x=1702373305; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=6Kmnx+Y86sgulAsiPpTedcqOMl+J6uedayIKHjFyzO8=; b=YjzCrXUgI2KmJ6GJHPX9ZPdooNP1k+4u58YTSPo5K5k18EVd/i9ib4U2g65fePwCPU lInOz0hjuxuXMRWuIHoFSrYKH+caAUtYZKfn2uVwJxcT7lxNvz5UYkZ0hjE9YYOTa8Mf iiCXB3+s+z9j1pj6UbnzSBGgVW4Wt37caDjg57/IsrYiJxSLLGhErStQKnNchCcEf/C6 pDMGj5WTfHPZ5DJTtVzLtQYZAWFVgdzyqhMvWbyXEOYuMIIdsYE9OnWUxCCE/8L0rrvn mQhp1gf3xfcGvOVLn5cADmAne2RmL3VLPGbHCavFEACS8fxipgJLWAly0W8yDWnr8k/1 t0Xw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701768505; x=1702373305; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=6Kmnx+Y86sgulAsiPpTedcqOMl+J6uedayIKHjFyzO8=; b=gBZQrzkbzvWCcOxy9ImP5IjPttgIM8JM8/rakyCzFd6lxJZeNUBb6wxAx5Q5cfagM5 xBR7yk4WsQ4NaVvFETfY6xhXZSA8LkbXX+oI7J10IzHoxp85fhyK0HxnStxKQ1vOt6UU s8VVBf8IuxvNhhpKy7NpYDxKoDJcVuJ9UNSLYnvCk8SeZTjE3wO9QsulSqHGJ312PmRr Ymbfen98L+btrajA1S7WwvCUIEQNbZ0pG/yECklLeQ7Y4GuhtDSAmTZuQtuzIults8Kc ZHkYDwLc1oiBW2X1mrnuziF9tuKxt8kFozYCgZRn6YYKox1NLBGry8ub0m1ybunb1Xa/ lIDw== X-Gm-Message-State: AOJu0YwdYXHJa+lGxhPd9zs1OJJ9s3MvWLWwzixC9mEZRtqtLHMR8cuw qEFtIqhfN7GP6YmrdrhN7b+LSw== X-Google-Smtp-Source: AGHT+IECVZQ6+fFEHljhnLtEbUN3kwfwd7RMqPIo0WA204kiBYi99WQEK+Ma8PmyE81fPDrPGeHa7g== X-Received: by 
2002:a05:6a00:2186:b0:6ce:450c:6586 with SMTP id h6-20020a056a00218600b006ce450c6586mr1230717pfi.26.1701768505304; Tue, 05 Dec 2023 01:28:25 -0800 (PST) Received: from localhost.localdomain ([101.10.93.135]) by smtp.gmail.com with ESMTPSA id l6-20020a056a00140600b006cdd723bb6fsm8858788pfu.115.2023.12.05.01.28.21 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Dec 2023 01:28:24 -0800 (PST) From: Jerry Shih To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org, conor@kernel.org Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v3 04/12] RISC-V: crypto: add Zvkned accelerated AES implementation Date: Tue, 5 Dec 2023 17:27:53 +0800 Message-Id: <20231205092801.1335-5-jerry.shih@sifive.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20231205092801.1335-1-jerry.shih@sifive.com> References: <20231205092801.1335-1-jerry.shih@sifive.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231205_012826_838055_93BF8BB1 X-CRM114-Status: GOOD ( 29.50 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org Add the AES implementation using the Zvkned vector crypto extension, ported from OpenSSL (openssl/openssl#21923). Co-developed-by: Christoph Müllner Signed-off-by: Christoph Müllner Co-developed-by: Heiko Stuebner Signed-off-by: Heiko Stuebner Co-developed-by: Phoebe Chen Signed-off-by: Phoebe Chen Signed-off-by: Jerry Shih --- Changelog v3: - Rename aes_setkey() to aes_setkey_zvkned().
- Rename riscv64_aes_setkey() to riscv64_aes_setkey_zvkned(). - Use the generic AES software key expansion everywhere. - Remove rv64i_zvkned_set_encrypt_key(). We still need to provide the expanded decryption key for the SW fallback path, which cannot be generated directly with the zvkned extension. So, we use the pure generic software key expansion everywhere to simplify the set_key flow. - Use asm mnemonics for the instructions in the RVV 1.0 extension. Changelog v2: - Do not turn on the kconfig `AES_RISCV64` option by default. - Use the `crypto_aes_ctx` structure for the AES key. - Use the `Zvkned` extension for AES-128/256 key expansion. - Export riscv64_aes_* symbols for other modules. - Add the `asmlinkage` qualifier to the crypto asm functions. - Reorder structure riscv64_aes_alg_zvkned members initialization in the order declared. --- arch/riscv/crypto/Kconfig | 11 + arch/riscv/crypto/Makefile | 11 + arch/riscv/crypto/aes-riscv64-glue.c | 137 +++++++ arch/riscv/crypto/aes-riscv64-glue.h | 18 + arch/riscv/crypto/aes-riscv64-zvkned.pl | 453 ++++++++++++++++++++++++ 5 files changed, 630 insertions(+) create mode 100644 arch/riscv/crypto/aes-riscv64-glue.c create mode 100644 arch/riscv/crypto/aes-riscv64-glue.h create mode 100644 arch/riscv/crypto/aes-riscv64-zvkned.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index 10d60edc0110..65189d4d47b3 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -2,4 +2,15 @@ menu "Accelerated Cryptographic Algorithms for CPU (riscv)" +config CRYPTO_AES_RISCV64 + tristate "Ciphers: AES" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_ALGAPI + select CRYPTO_LIB_AES + help + Block ciphers: AES cipher algorithms (FIPS-197) + + Architecture: riscv64 using: + - Zvkned vector crypto extension + endmenu diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index b3b6332c9f6d..90ca91d8df26 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -2,3 +2,14 @@ # #
linux/arch/riscv/crypto/Makefile # + +obj-$(CONFIG_CRYPTO_AES_RISCV64) += aes-riscv64.o +aes-riscv64-y := aes-riscv64-glue.o aes-riscv64-zvkned.o + +quiet_cmd_perlasm = PERLASM $@ + cmd_perlasm = $(PERL) $(<) void $(@) + +$(obj)/aes-riscv64-zvkned.S: $(src)/aes-riscv64-zvkned.pl + $(call cmd,perlasm) + +clean-files += aes-riscv64-zvkned.S diff --git a/arch/riscv/crypto/aes-riscv64-glue.c b/arch/riscv/crypto/aes-riscv64-glue.c new file mode 100644 index 000000000000..f29898c25652 --- /dev/null +++ b/arch/riscv/crypto/aes-riscv64-glue.c @@ -0,0 +1,137 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Port of the OpenSSL AES implementation for RISC-V + * + * Copyright (C) 2023 VRULL GmbH + * Author: Heiko Stuebner + * + * Copyright (C) 2023 SiFive, Inc. + * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "aes-riscv64-glue.h" + +/* aes cipher using zvkned vector crypto extension */ +asmlinkage void rv64i_zvkned_encrypt(const u8 *in, u8 *out, + const struct crypto_aes_ctx *key); +asmlinkage void rv64i_zvkned_decrypt(const u8 *in, u8 *out, + const struct crypto_aes_ctx *key); + +int riscv64_aes_setkey_zvkned(struct crypto_aes_ctx *ctx, const u8 *key, + unsigned int keylen) +{ + int ret; + + ret = aes_check_keylen(keylen); + if (ret < 0) + return -EINVAL; + + /* + * The RISC-V AES vector crypto key expanding doesn't support AES-192. + * So, we use the generic software key expanding here for all cases. 
+ */ + return aes_expandkey(ctx, key, keylen); +} +EXPORT_SYMBOL(riscv64_aes_setkey_zvkned); + +void riscv64_aes_encrypt_zvkned(const struct crypto_aes_ctx *ctx, u8 *dst, + const u8 *src) +{ + if (crypto_simd_usable()) { + kernel_vector_begin(); + rv64i_zvkned_encrypt(src, dst, ctx); + kernel_vector_end(); + } else { + aes_encrypt(ctx, dst, src); + } +} +EXPORT_SYMBOL(riscv64_aes_encrypt_zvkned); + +void riscv64_aes_decrypt_zvkned(const struct crypto_aes_ctx *ctx, u8 *dst, + const u8 *src) +{ + if (crypto_simd_usable()) { + kernel_vector_begin(); + rv64i_zvkned_decrypt(src, dst, ctx); + kernel_vector_end(); + } else { + aes_decrypt(ctx, dst, src); + } +} +EXPORT_SYMBOL(riscv64_aes_decrypt_zvkned); + +static int aes_setkey_zvkned(struct crypto_tfm *tfm, const u8 *key, + unsigned int keylen) +{ + struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm); + + return riscv64_aes_setkey_zvkned(ctx, key, keylen); +} + +static void aes_encrypt_zvkned(struct crypto_tfm *tfm, u8 *dst, const u8 *src) +{ + const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm); + + riscv64_aes_encrypt_zvkned(ctx, dst, src); +} + +static void aes_decrypt_zvkned(struct crypto_tfm *tfm, u8 *dst, const u8 *src) +{ + const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm); + + riscv64_aes_decrypt_zvkned(ctx, dst, src); +} + +static struct crypto_alg riscv64_aes_alg_zvkned = { + .cra_flags = CRYPTO_ALG_TYPE_CIPHER, + .cra_blocksize = AES_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct crypto_aes_ctx), + .cra_priority = 300, + .cra_name = "aes", + .cra_driver_name = "aes-riscv64-zvkned", + .cra_cipher = { + .cia_min_keysize = AES_MIN_KEY_SIZE, + .cia_max_keysize = AES_MAX_KEY_SIZE, + .cia_setkey = aes_setkey_zvkned, + .cia_encrypt = aes_encrypt_zvkned, + .cia_decrypt = aes_decrypt_zvkned, + }, + .cra_module = THIS_MODULE, +}; + +static inline bool check_aes_ext(void) +{ + return riscv_isa_extension_available(NULL, ZVKNED) && + riscv_vector_vlen() >= 128; +} + +static int __init riscv64_aes_mod_init(void) +{ + if 
(check_aes_ext()) + return crypto_register_alg(&riscv64_aes_alg_zvkned); + + return -ENODEV; +} + +static void __exit riscv64_aes_mod_fini(void) +{ + crypto_unregister_alg(&riscv64_aes_alg_zvkned); +} + +module_init(riscv64_aes_mod_init); +module_exit(riscv64_aes_mod_fini); + +MODULE_DESCRIPTION("AES (RISC-V accelerated)"); +MODULE_AUTHOR("Heiko Stuebner "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("aes"); diff --git a/arch/riscv/crypto/aes-riscv64-glue.h b/arch/riscv/crypto/aes-riscv64-glue.h new file mode 100644 index 000000000000..2b544125091e --- /dev/null +++ b/arch/riscv/crypto/aes-riscv64-glue.h @@ -0,0 +1,18 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef AES_RISCV64_GLUE_H +#define AES_RISCV64_GLUE_H + +#include +#include + +int riscv64_aes_setkey_zvkned(struct crypto_aes_ctx *ctx, const u8 *key, + unsigned int keylen); + +void riscv64_aes_encrypt_zvkned(const struct crypto_aes_ctx *ctx, u8 *dst, + const u8 *src); + +void riscv64_aes_decrypt_zvkned(const struct crypto_aes_ctx *ctx, u8 *dst, + const u8 *src); + +#endif /* AES_RISCV64_GLUE_H */ diff --git a/arch/riscv/crypto/aes-riscv64-zvkned.pl b/arch/riscv/crypto/aes-riscv64-zvkned.pl new file mode 100644 index 000000000000..466357b4503c --- /dev/null +++ b/arch/riscv/crypto/aes-riscv64-zvkned.pl @@ -0,0 +1,453 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Christoph Müllner +# Copyright (c) 2023, Phoebe Chen +# Copyright (c) 2023, Jerry Shih +# All rights reserved. 
+# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 +# - RISC-V Vector AES block cipher extension ('Zvkned') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? 
shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +.text +___ + +my ($V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, + $V8, $V9, $V10, $V11, $V12, $V13, $V14, $V15, + $V16, $V17, $V18, $V19, $V20, $V21, $V22, $V23, + $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, +) = map("v$_",(0..31)); + +{ +################################################################################ +# void rv64i_zvkned_encrypt(const unsigned char *in, unsigned char *out, +# const AES_KEY *key); +my ($INP, $OUTP, $KEYP) = ("a0", "a1", "a2"); +my ($T0) = ("t0"); +my ($KEY_LEN) = ("a3"); + +$code .= <<___; +.p2align 3 +.globl rv64i_zvkned_encrypt +.type rv64i_zvkned_encrypt,\@function +rv64i_zvkned_encrypt: + # Load key length. + lwu $KEY_LEN, 480($KEYP) + + # Get proper routine for key length. + li $T0, 32 + beq $KEY_LEN, $T0, L_enc_256 + li $T0, 24 + beq $KEY_LEN, $T0, L_enc_192 + li $T0, 16 + beq $KEY_LEN, $T0, L_enc_128 + + j L_fail_m2 +.size rv64i_zvkned_encrypt,.-rv64i_zvkned_encrypt +___ + +$code .= <<___; +.p2align 3 +L_enc_128: + vsetivli zero, 4, e32, m1, ta, ma + + vle32.v $V1, ($INP) + + vle32.v $V10, ($KEYP) + @{[vaesz_vs $V1, $V10]} # with round key w[ 0, 3] + addi $KEYP, $KEYP, 16 + vle32.v $V11, ($KEYP) + @{[vaesem_vs $V1, $V11]} # with round key w[ 4, 7] + addi $KEYP, $KEYP, 16 + vle32.v $V12, ($KEYP) + @{[vaesem_vs $V1, $V12]} # with round key w[ 8,11] + addi $KEYP, $KEYP, 16 + vle32.v $V13, ($KEYP) + @{[vaesem_vs $V1, $V13]} # with round key w[12,15] + addi $KEYP, $KEYP, 16 + vle32.v $V14, ($KEYP) + @{[vaesem_vs $V1, $V14]} # with round key w[16,19] + addi $KEYP, $KEYP, 16 + vle32.v $V15, ($KEYP) + @{[vaesem_vs $V1, $V15]} # with round key w[20,23] + addi $KEYP, $KEYP, 16 + vle32.v $V16, ($KEYP) + @{[vaesem_vs $V1, $V16]} # with round key w[24,27] + addi $KEYP, $KEYP, 16 + vle32.v $V17, ($KEYP) + @{[vaesem_vs $V1, $V17]} # with round key w[28,31] + addi $KEYP, $KEYP, 16 + vle32.v $V18, ($KEYP) + @{[vaesem_vs $V1, $V18]} # with round key w[32,35] + addi $KEYP, 
$KEYP, 16 + vle32.v $V19, ($KEYP) + @{[vaesem_vs $V1, $V19]} # with round key w[36,39] + addi $KEYP, $KEYP, 16 + vle32.v $V20, ($KEYP) + @{[vaesef_vs $V1, $V20]} # with round key w[40,43] + + vse32.v $V1, ($OUTP) + + ret +.size L_enc_128,.-L_enc_128 +___ + +$code .= <<___; +.p2align 3 +L_enc_192: + vsetivli zero, 4, e32, m1, ta, ma + + vle32.v $V1, ($INP) + + vle32.v $V10, ($KEYP) + @{[vaesz_vs $V1, $V10]} + addi $KEYP, $KEYP, 16 + vle32.v $V11, ($KEYP) + @{[vaesem_vs $V1, $V11]} + addi $KEYP, $KEYP, 16 + vle32.v $V12, ($KEYP) + @{[vaesem_vs $V1, $V12]} + addi $KEYP, $KEYP, 16 + vle32.v $V13, ($KEYP) + @{[vaesem_vs $V1, $V13]} + addi $KEYP, $KEYP, 16 + vle32.v $V14, ($KEYP) + @{[vaesem_vs $V1, $V14]} + addi $KEYP, $KEYP, 16 + vle32.v $V15, ($KEYP) + @{[vaesem_vs $V1, $V15]} + addi $KEYP, $KEYP, 16 + vle32.v $V16, ($KEYP) + @{[vaesem_vs $V1, $V16]} + addi $KEYP, $KEYP, 16 + vle32.v $V17, ($KEYP) + @{[vaesem_vs $V1, $V17]} + addi $KEYP, $KEYP, 16 + vle32.v $V18, ($KEYP) + @{[vaesem_vs $V1, $V18]} + addi $KEYP, $KEYP, 16 + vle32.v $V19, ($KEYP) + @{[vaesem_vs $V1, $V19]} + addi $KEYP, $KEYP, 16 + vle32.v $V20, ($KEYP) + @{[vaesem_vs $V1, $V20]} + addi $KEYP, $KEYP, 16 + vle32.v $V21, ($KEYP) + @{[vaesem_vs $V1, $V21]} + addi $KEYP, $KEYP, 16 + vle32.v $V22, ($KEYP) + @{[vaesef_vs $V1, $V22]} + + vse32.v $V1, ($OUTP) + ret +.size L_enc_192,.-L_enc_192 +___ + +$code .= <<___; +.p2align 3 +L_enc_256: + vsetivli zero, 4, e32, m1, ta, ma + + vle32.v $V1, ($INP) + + vle32.v $V10, ($KEYP) + @{[vaesz_vs $V1, $V10]} + addi $KEYP, $KEYP, 16 + vle32.v $V11, ($KEYP) + @{[vaesem_vs $V1, $V11]} + addi $KEYP, $KEYP, 16 + vle32.v $V12, ($KEYP) + @{[vaesem_vs $V1, $V12]} + addi $KEYP, $KEYP, 16 + vle32.v $V13, ($KEYP) + @{[vaesem_vs $V1, $V13]} + addi $KEYP, $KEYP, 16 + vle32.v $V14, ($KEYP) + @{[vaesem_vs $V1, $V14]} + addi $KEYP, $KEYP, 16 + vle32.v $V15, ($KEYP) + @{[vaesem_vs $V1, $V15]} + addi $KEYP, $KEYP, 16 + vle32.v $V16, ($KEYP) + @{[vaesem_vs $V1, $V16]} + addi $KEYP, 
$KEYP, 16 + vle32.v $V17, ($KEYP) + @{[vaesem_vs $V1, $V17]} + addi $KEYP, $KEYP, 16 + vle32.v $V18, ($KEYP) + @{[vaesem_vs $V1, $V18]} + addi $KEYP, $KEYP, 16 + vle32.v $V19, ($KEYP) + @{[vaesem_vs $V1, $V19]} + addi $KEYP, $KEYP, 16 + vle32.v $V20, ($KEYP) + @{[vaesem_vs $V1, $V20]} + addi $KEYP, $KEYP, 16 + vle32.v $V21, ($KEYP) + @{[vaesem_vs $V1, $V21]} + addi $KEYP, $KEYP, 16 + vle32.v $V22, ($KEYP) + @{[vaesem_vs $V1, $V22]} + addi $KEYP, $KEYP, 16 + vle32.v $V23, ($KEYP) + @{[vaesem_vs $V1, $V23]} + addi $KEYP, $KEYP, 16 + vle32.v $V24, ($KEYP) + @{[vaesef_vs $V1, $V24]} + + vse32.v $V1, ($OUTP) + ret +.size L_enc_256,.-L_enc_256 +___ + +################################################################################ +# void rv64i_zvkned_decrypt(const unsigned char *in, unsigned char *out, +# const AES_KEY *key); +$code .= <<___; +.p2align 3 +.globl rv64i_zvkned_decrypt +.type rv64i_zvkned_decrypt,\@function +rv64i_zvkned_decrypt: + # Load key length. + lwu $KEY_LEN, 480($KEYP) + + # Get proper routine for key length. 
+ li $T0, 32 + beq $KEY_LEN, $T0, L_dec_256 + li $T0, 24 + beq $KEY_LEN, $T0, L_dec_192 + li $T0, 16 + beq $KEY_LEN, $T0, L_dec_128 + + j L_fail_m2 +.size rv64i_zvkned_decrypt,.-rv64i_zvkned_decrypt +___ + +$code .= <<___; +.p2align 3 +L_dec_128: + vsetivli zero, 4, e32, m1, ta, ma + + vle32.v $V1, ($INP) + + addi $KEYP, $KEYP, 160 + vle32.v $V20, ($KEYP) + @{[vaesz_vs $V1, $V20]} # with round key w[40,43] + addi $KEYP, $KEYP, -16 + vle32.v $V19, ($KEYP) + @{[vaesdm_vs $V1, $V19]} # with round key w[36,39] + addi $KEYP, $KEYP, -16 + vle32.v $V18, ($KEYP) + @{[vaesdm_vs $V1, $V18]} # with round key w[32,35] + addi $KEYP, $KEYP, -16 + vle32.v $V17, ($KEYP) + @{[vaesdm_vs $V1, $V17]} # with round key w[28,31] + addi $KEYP, $KEYP, -16 + vle32.v $V16, ($KEYP) + @{[vaesdm_vs $V1, $V16]} # with round key w[24,27] + addi $KEYP, $KEYP, -16 + vle32.v $V15, ($KEYP) + @{[vaesdm_vs $V1, $V15]} # with round key w[20,23] + addi $KEYP, $KEYP, -16 + vle32.v $V14, ($KEYP) + @{[vaesdm_vs $V1, $V14]} # with round key w[16,19] + addi $KEYP, $KEYP, -16 + vle32.v $V13, ($KEYP) + @{[vaesdm_vs $V1, $V13]} # with round key w[12,15] + addi $KEYP, $KEYP, -16 + vle32.v $V12, ($KEYP) + @{[vaesdm_vs $V1, $V12]} # with round key w[ 8,11] + addi $KEYP, $KEYP, -16 + vle32.v $V11, ($KEYP) + @{[vaesdm_vs $V1, $V11]} # with round key w[ 4, 7] + addi $KEYP, $KEYP, -16 + vle32.v $V10, ($KEYP) + @{[vaesdf_vs $V1, $V10]} # with round key w[ 0, 3] + + vse32.v $V1, ($OUTP) + + ret +.size L_dec_128,.-L_dec_128 +___ + +$code .= <<___; +.p2align 3 +L_dec_192: + vsetivli zero, 4, e32, m1, ta, ma + + vle32.v $V1, ($INP) + + addi $KEYP, $KEYP, 192 + vle32.v $V22, ($KEYP) + @{[vaesz_vs $V1, $V22]} # with round key w[48,51] + addi $KEYP, $KEYP, -16 + vle32.v $V21, ($KEYP) + @{[vaesdm_vs $V1, $V21]} # with round key w[44,47] + addi $KEYP, $KEYP, -16 + vle32.v $V20, ($KEYP) + @{[vaesdm_vs $V1, $V20]} # with round key w[40,43] + addi $KEYP, $KEYP, -16 + vle32.v $V19, ($KEYP) + @{[vaesdm_vs $V1, $V19]} # with round key 
w[36,39] + addi $KEYP, $KEYP, -16 + vle32.v $V18, ($KEYP) + @{[vaesdm_vs $V1, $V18]} # with round key w[32,35] + addi $KEYP, $KEYP, -16 + vle32.v $V17, ($KEYP) + @{[vaesdm_vs $V1, $V17]} # with round key w[28,31] + addi $KEYP, $KEYP, -16 + vle32.v $V16, ($KEYP) + @{[vaesdm_vs $V1, $V16]} # with round key w[24,27] + addi $KEYP, $KEYP, -16 + vle32.v $V15, ($KEYP) + @{[vaesdm_vs $V1, $V15]} # with round key w[20,23] + addi $KEYP, $KEYP, -16 + vle32.v $V14, ($KEYP) + @{[vaesdm_vs $V1, $V14]} # with round key w[16,19] + addi $KEYP, $KEYP, -16 + vle32.v $V13, ($KEYP) + @{[vaesdm_vs $V1, $V13]} # with round key w[12,15] + addi $KEYP, $KEYP, -16 + vle32.v $V12, ($KEYP) + @{[vaesdm_vs $V1, $V12]} # with round key w[ 8,11] + addi $KEYP, $KEYP, -16 + vle32.v $V11, ($KEYP) + @{[vaesdm_vs $V1, $V11]} # with round key w[ 4, 7] + addi $KEYP, $KEYP, -16 + vle32.v $V10, ($KEYP) + @{[vaesdf_vs $V1, $V10]} # with round key w[ 0, 3] + + vse32.v $V1, ($OUTP) + + ret +.size L_dec_192,.-L_dec_192 +___ + +$code .= <<___; +.p2align 3 +L_dec_256: + vsetivli zero, 4, e32, m1, ta, ma + + vle32.v $V1, ($INP) + + addi $KEYP, $KEYP, 224 + vle32.v $V24, ($KEYP) + @{[vaesz_vs $V1, $V24]} # with round key w[56,59] + addi $KEYP, $KEYP, -16 + vle32.v $V23, ($KEYP) + @{[vaesdm_vs $V1, $V23]} # with round key w[52,55] + addi $KEYP, $KEYP, -16 + vle32.v $V22, ($KEYP) + @{[vaesdm_vs $V1, $V22]} # with round key w[48,51] + addi $KEYP, $KEYP, -16 + vle32.v $V21, ($KEYP) + @{[vaesdm_vs $V1, $V21]} # with round key w[44,47] + addi $KEYP, $KEYP, -16 + vle32.v $V20, ($KEYP) + @{[vaesdm_vs $V1, $V20]} # with round key w[40,43] + addi $KEYP, $KEYP, -16 + vle32.v $V19, ($KEYP) + @{[vaesdm_vs $V1, $V19]} # with round key w[36,39] + addi $KEYP, $KEYP, -16 + vle32.v $V18, ($KEYP) + @{[vaesdm_vs $V1, $V18]} # with round key w[32,35] + addi $KEYP, $KEYP, -16 + vle32.v $V17, ($KEYP) + @{[vaesdm_vs $V1, $V17]} # with round key w[28,31] + addi $KEYP, $KEYP, -16 + vle32.v $V16, ($KEYP) + @{[vaesdm_vs $V1, $V16]} # with 
round key w[24,27] + addi $KEYP, $KEYP, -16 + vle32.v $V15, ($KEYP) + @{[vaesdm_vs $V1, $V15]} # with round key w[20,23] + addi $KEYP, $KEYP, -16 + vle32.v $V14, ($KEYP) + @{[vaesdm_vs $V1, $V14]} # with round key w[16,19] + addi $KEYP, $KEYP, -16 + vle32.v $V13, ($KEYP) + @{[vaesdm_vs $V1, $V13]} # with round key w[12,15] + addi $KEYP, $KEYP, -16 + vle32.v $V12, ($KEYP) + @{[vaesdm_vs $V1, $V12]} # with round key w[ 8,11] + addi $KEYP, $KEYP, -16 + vle32.v $V11, ($KEYP) + @{[vaesdm_vs $V1, $V11]} # with round key w[ 4, 7] + addi $KEYP, $KEYP, -16 + vle32.v $V10, ($KEYP) + @{[vaesdf_vs $V1, $V10]} # with round key w[ 0, 3] + + vse32.v $V1, ($OUTP) + + ret +.size L_dec_256,.-L_dec_256 +___ +} + +$code .= <<___; +L_fail_m1: + li a0, -1 + ret +.size L_fail_m1,.-L_fail_m1 + +L_fail_m2: + li a0, -2 + ret +.size L_fail_m2,.-L_fail_m2 + +L_end: + ret +.size L_end,.-L_end +___ + +print $code; + +close STDOUT or die "error closing STDOUT: $!"; From patchwork Tue Dec 5 09:27:54 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13479637 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1EEF6C10F05 for ; Tue, 5 Dec 2023 09:28:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; 
bh=Ll4xGcIR5oBIqRPZsNI4DdrDw4pk23fFeFXRgXTbE7s=; b=DHrWQJFbxtWgVE eKcrcw+9X2ZMtd+zINuQkzH99l5MPCm3AcInLwrXNckb6XtVDjEVTliVMvupFNHGXaaIllzo25bQU /P07W/E3vcGPjW1yZH3S6kQebFOX1635vytDiqM0vjrnR6PHJrrdD00cVS9FAQs4Nfwk1aJWSc9e4 R2Nply2RhKVF05m9u764/nGzyxo5UMMfuX41hdnfoQNVYoxgsIcViZblINyngsHkBWgrsqLLHpZdE aqokgmgeVX3GAyQYXMxudnTrDCqPcqOY6UQbrMS2U+1TEq3H7FhSF/eixvQzObQB5V2TdpB3D9Q9A ViHR6O2RX4DA6Jax8Zlg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rARjE-006njA-32; Tue, 05 Dec 2023 09:28:32 +0000 Received: from mail-oi1-x22b.google.com ([2607:f8b0:4864:20::22b]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rARjB-006nfM-2i for linux-riscv@lists.infradead.org; Tue, 05 Dec 2023 09:28:31 +0000 Received: by mail-oi1-x22b.google.com with SMTP id 5614622812f47-3b84e328327so3073681b6e.2 for ; Tue, 05 Dec 2023 01:28:29 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1701768508; x=1702373308; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=I9NcydlPjJuZdBvk47SliSWRPXO51iH+ClRWgIXdngI=; b=kMWlO9IJsNXWjRWr+dtnbgktw4khdpDy9KGw3xMmCLqS40e8BnaAfAd4oSjnbVq8Lm QxlIivwJUkT0Lt8mzbHdHwSqLcf8VHntAIfgjlE9ji6XU4YJO6601l+zRnPL2AAIZ4zS 0UPRTUcmnKBvD8di7Uas0/XxJmer53F2THlfsbYpeEID4sfzvzFkkK+CdFRxEsrvn/dq 8FcsC2y6rXvmcctqiISGUV1Ajs8c1ZZkGhw5iPWJKa8hEvgFM3Nn3HJ8CqvKm9jDGiXS wUEc7kzt6QU4uPcaJ5Gtg+NrYgdFoCiqYSSfHvB9SxZ9rYlHMW9UZeE6xHBPXa3zXC59 3okQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701768508; x=1702373308; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=I9NcydlPjJuZdBvk47SliSWRPXO51iH+ClRWgIXdngI=; 
b=cv3a9TR79IANT7zcxlcKePFPf2cGZaNczqQrnxtxVPI4sw6/AKaMh8eA1kARg9U2x4 ssyj+oZO8PGe5lxJFP0iDDztw9njpSKUMnTUJ3zB3uDsN7+vKPUEN+f6+5/ip0Dj0Ydd mwPywqz37ZjheT7wjE2BL9Zfj69PrAz+4GyBVwEV2LvWPfMLo9gE6QEz27mUDi+2sN94 DQ/MI96VPmzorTvVaZazKwtgIEb686P0RekcpZZHoCLGkjeHiQ4/+jjzab7MHwRjJjQO aDI/BoPL+z9Fh8ndtpXRts0QWZbuOSVHA2Gkjv0echRbjirFh0mcuNSdtO6Blyrv8m14 ObNw== X-Gm-Message-State: AOJu0YxKC6hrKGnHyGIciAVN/B73ylGl/K0EmlB5EneBPfWluqfq+8xG m+QN5AWYPDYRVHuH18vU0dgoGw== X-Google-Smtp-Source: AGHT+IHZ3EK+AWz07LHe5BsDXoplI+8JX1FVHrlgMUB8wBmoWXqufxaT2Ka9FS50Dim4x8aDAfAYQQ== X-Received: by 2002:a05:6808:16a7:b0:3b8:44dc:7ce0 with SMTP id bb39-20020a05680816a700b003b844dc7ce0mr7330943oib.2.1701768508622; Tue, 05 Dec 2023 01:28:28 -0800 (PST) Received: from localhost.localdomain ([101.10.93.135]) by smtp.gmail.com with ESMTPSA id l6-20020a056a00140600b006cdd723bb6fsm8858788pfu.115.2023.12.05.01.28.25 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Dec 2023 01:28:28 -0800 (PST) From: Jerry Shih To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org, conor@kernel.org Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v3 05/12] crypto: simd - Update `walksize` in simd skcipher Date: Tue, 5 Dec 2023 17:27:54 +0800 Message-Id: <20231205092801.1335-6-jerry.shih@sifive.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20231205092801.1335-1-jerry.shih@sifive.com> References: <20231205092801.1335-1-jerry.shih@sifive.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231205_012829_895453_60D8F757 X-CRM114-Status: UNSURE ( 9.10 ) X-CRM114-Notice: Please train this message. 
X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org The `walksize` assignment is missing in the simd skcipher. Signed-off-by: Jerry Shih --- crypto/cryptd.c | 1 + crypto/simd.c | 1 + 2 files changed, 2 insertions(+) diff --git a/crypto/cryptd.c b/crypto/cryptd.c index bbcc368b6a55..253d13504ccb 100644 --- a/crypto/cryptd.c +++ b/crypto/cryptd.c @@ -405,6 +405,7 @@ static int cryptd_create_skcipher(struct crypto_template *tmpl, (alg->base.cra_flags & CRYPTO_ALG_INTERNAL); inst->alg.ivsize = crypto_skcipher_alg_ivsize(alg); inst->alg.chunksize = crypto_skcipher_alg_chunksize(alg); + inst->alg.walksize = crypto_skcipher_alg_walksize(alg); inst->alg.min_keysize = crypto_skcipher_alg_min_keysize(alg); inst->alg.max_keysize = crypto_skcipher_alg_max_keysize(alg); diff --git a/crypto/simd.c b/crypto/simd.c index edaa479a1ec5..ea0caabf90f1 100644 --- a/crypto/simd.c +++ b/crypto/simd.c @@ -181,6 +181,7 @@ struct simd_skcipher_alg *simd_skcipher_create_compat(const char *algname, alg->ivsize = ialg->ivsize; alg->chunksize = ialg->chunksize; + alg->walksize = ialg->walksize; alg->min_keysize = ialg->min_keysize; alg->max_keysize = ialg->max_keysize; From patchwork Tue Dec 5 09:27:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13479639 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BE105C4167B for ; Tue, 5 Dec 2023 09:28:47 +0000 (UTC) DKIM-Signature: v=1;
a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=jOmEe4tHzOVMprY5tzX7ACz+uq7BYXAynHSiWKSqA08=; b=lXa9QfPqV11+Eb ZAQyv67e48CJs9NoT+FshsSf0mC1qpjalYDqLl1gTYG+yqzs2hfPutavAWpcHZeLaYHSXPhx9Amwm lGxPk8f3i9/4BcjJBT4rXokUxyqI+Y6XkhNpTvVlMOD3MAzg32rQ6jx6QGi+vVMbI4isTKHfSjZP6 5DXsu+pmN+tWGyor2sfIK3c0lNNhp3oPcC9Zq5PoMheW6hF9vmtW9KXmZW3mz7tJlHfNfYjGHlxFL h1RWMIzrKqTMBCeeHt+GDbwq/qAw8m7JlO50rEpHWy4Jt5uTRC+PG+LprsvqL0oiXagRqNMQJ59bp iX7YkfCWp7hFZGdKOMmg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rARjN-006nrN-2B; Tue, 05 Dec 2023 09:28:41 +0000 Received: from mail-pg1-x535.google.com ([2607:f8b0:4864:20::535]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rARjG-006nkf-2c for linux-riscv@lists.infradead.org; Tue, 05 Dec 2023 09:28:40 +0000 Received: by mail-pg1-x535.google.com with SMTP id 41be03b00d2f7-5c210e34088so3208224a12.2 for ; Tue, 05 Dec 2023 01:28:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1701768513; x=1702373313; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ykhn4EhOxudXgOFHE4SFrUZMQRGl7jFEpRjCPRpYABc=; b=gXA0bHxEmkQkQKvKQvqmH7RW8Z2H7QjLc/KKcQR4zv9YqzNqEVsMG7oZxIP4KLtlMb az/gQKGuNGVzdIXxMhqNOatxHv/Ev446svIFfaPxbwa15M6yVvmsUXxLFrqtLcoJts8N UNTdcAmDVdPBnu34kwEMsbz1Xoq6cnfwRddG9yo3ZSKzI+uXvmoVtBRS+96tyB3Mw6F/ wyaWY3VhZiul8f9LnTCxXNXKO/Wn+BXwdzIOTpxDzIeQ34rX5TCNli/IhezXh9z7HORv 
k3c39xjSAahfu+vHe+2ZPj3Xmv6hji18hvDjblPiEv0GQTrQfo2A9hlXFwe51boHp9V/ fMiA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701768513; x=1702373313; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ykhn4EhOxudXgOFHE4SFrUZMQRGl7jFEpRjCPRpYABc=; b=OXV3LSh8J/AmFj0uGWYPj49YLdNgYndtZBlggC0y5gYjPSrQZRGiL1EDbPnHDQNO6Q uU7gt1KNYFy0FG5Aat0fLLZGiFQ/e0XMwswNV4bJzapeMYBXk7az8/wt8hwjc2OMig5d koXFxl/Gff7RRqbJiPMkH5P/m/qun2WFi5UcO2CTWstTe13aSB3MZTirOQCeev9qWrjV BrFydnTpTMfNPiz8KT1gp35VHJAZAcw1TKAC67vJzfsksdaFYewbUlA1yGFMZy09GDZ4 rWoDrKs7i/EWei4/DJNjgh+HXxryk1kRCz0qDx4+pp6JMHh3xbsmJRzFnEzjzn1A8rWy vwxg== X-Gm-Message-State: AOJu0YzEMI65RdQkLy2nEqjuIPYPt4m1xdaIBeIvjcmONwlf6qd5/IlH Nnv7+fLpZclvmLH/bnrrHrXJag== X-Google-Smtp-Source: AGHT+IFBlxqJtjXedomAV97U/+CSfvmGjnmUcZO4Igmw9kMmPX/+yNhjUIpAofrld4Cy2h+DxOk7Qg== X-Received: by 2002:a05:6a20:7da6:b0:18f:97c:6148 with SMTP id v38-20020a056a207da600b0018f097c6148mr7143021pzj.69.1701768512753; Tue, 05 Dec 2023 01:28:32 -0800 (PST) Received: from localhost.localdomain ([101.10.93.135]) by smtp.gmail.com with ESMTPSA id l6-20020a056a00140600b006cdd723bb6fsm8858788pfu.115.2023.12.05.01.28.28 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Dec 2023 01:28:32 -0800 (PST) From: Jerry Shih To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org, conor@kernel.org Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v3 06/12] RISC-V: crypto: add accelerated AES-CBC/CTR/ECB/XTS implementations Date: Tue, 5 Dec 2023 17:27:55 +0800 Message-Id: <20231205092801.1335-7-jerry.shih@sifive.com> X-Mailer: 
git-send-email 2.28.0 In-Reply-To: <20231205092801.1335-1-jerry.shih@sifive.com> References: <20231205092801.1335-1-jerry.shih@sifive.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231205_012834_932886_99B881F2 X-CRM114-Status: GOOD ( 23.72 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org Port the vector-crypto accelerated CBC, CTR, ECB and XTS block modes for the AES cipher from OpenSSL (openssl/openssl#21923). In addition, support the XTS-AES-192 mode, which does not exist in OpenSSL. Co-developed-by: Phoebe Chen Signed-off-by: Phoebe Chen Signed-off-by: Jerry Shih --- Changelog v3: - Update the extension checking conditions in riscv64_aes_block_mod_init(). - Add the `riscv64` prefix to all setkey, encrypt and decrypt functions. - Update the xts_crypt() implementation. Use an approach similar to x86's aes-xts implementation. - Use asm mnemonics for the instructions in the RVV 1.0 extension. Changelog v2: - Do not turn on the kconfig `AES_BLOCK_RISCV64` option by default. - Update the asm functions to use the AES key in the `crypto_aes_ctx` structure. - Switch to the simd skcipher interface for the AES-CBC/CTR/ECB/XTS modes. The kernel-vector implementation is still under discussion. Until the final version of kernel-vector lands, use the simd skcipher interface to skip the fallback path for all AES modes in all kinds of contexts. If kernel-vector can always be enabled in softirq in the future, we could bring the original sync skcipher algorithm back. - Refine the aes-xts comments on head and tail block handling. - Update the VLEN constraint for aes-xts mode. - Add the `asmlinkage` qualifier to the crypto asm functions. - Rename aes-riscv64-zvbb-zvkg-zvkned to aes-riscv64-zvkned-zvbb-zvkg.
- Rename aes-riscv64-zvkb-zvkned to aes-riscv64-zvkned-zvkb. - Reorder structure riscv64_aes_algs_zvkned, riscv64_aes_alg_zvkned_zvkb and riscv64_aes_alg_zvkned_zvbb_zvkg members initialization in the order declared. --- arch/riscv/crypto/Kconfig | 21 + arch/riscv/crypto/Makefile | 11 + .../crypto/aes-riscv64-block-mode-glue.c | 494 +++++++++ .../crypto/aes-riscv64-zvkned-zvbb-zvkg.pl | 949 ++++++++++++++++++ arch/riscv/crypto/aes-riscv64-zvkned-zvkb.pl | 415 ++++++++ arch/riscv/crypto/aes-riscv64-zvkned.pl | 746 ++++++++++++++ 6 files changed, 2636 insertions(+) create mode 100644 arch/riscv/crypto/aes-riscv64-block-mode-glue.c create mode 100644 arch/riscv/crypto/aes-riscv64-zvkned-zvbb-zvkg.pl create mode 100644 arch/riscv/crypto/aes-riscv64-zvkned-zvkb.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index 65189d4d47b3..9d991ddda289 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -13,4 +13,25 @@ config CRYPTO_AES_RISCV64 Architecture: riscv64 using: - Zvkned vector crypto extension +config CRYPTO_AES_BLOCK_RISCV64 + tristate "Ciphers: AES, modes: ECB/CBC/CTR/XTS" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_AES_RISCV64 + select CRYPTO_SIMD + select CRYPTO_SKCIPHER + help + Length-preserving ciphers: AES cipher algorithms (FIPS-197) + with block cipher modes: + - ECB (Electronic Codebook) mode (NIST SP 800-38A) + - CBC (Cipher Block Chaining) mode (NIST SP 800-38A) + - CTR (Counter) mode (NIST SP 800-38A) + - XTS (XOR Encrypt XOR Tweakable Block Cipher with Ciphertext + Stealing) mode (NIST SP 800-38E and IEEE 1619) + + Architecture: riscv64 using: + - Zvkned vector crypto extension + - Zvbb vector extension (XTS) + - Zvkb vector crypto extension (CTR/XTS) + - Zvkg vector crypto extension (XTS) + endmenu diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index 90ca91d8df26..9574b009762f 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -6,10 +6,21 @@ 
obj-$(CONFIG_CRYPTO_AES_RISCV64) += aes-riscv64.o aes-riscv64-y := aes-riscv64-glue.o aes-riscv64-zvkned.o +obj-$(CONFIG_CRYPTO_AES_BLOCK_RISCV64) += aes-block-riscv64.o +aes-block-riscv64-y := aes-riscv64-block-mode-glue.o aes-riscv64-zvkned-zvbb-zvkg.o aes-riscv64-zvkned-zvkb.o + quiet_cmd_perlasm = PERLASM $@ cmd_perlasm = $(PERL) $(<) void $(@) $(obj)/aes-riscv64-zvkned.S: $(src)/aes-riscv64-zvkned.pl $(call cmd,perlasm) +$(obj)/aes-riscv64-zvkned-zvbb-zvkg.S: $(src)/aes-riscv64-zvkned-zvbb-zvkg.pl + $(call cmd,perlasm) + +$(obj)/aes-riscv64-zvkned-zvkb.S: $(src)/aes-riscv64-zvkned-zvkb.pl + $(call cmd,perlasm) + clean-files += aes-riscv64-zvkned.S +clean-files += aes-riscv64-zvkned-zvbb-zvkg.S +clean-files += aes-riscv64-zvkned-zvkb.S diff --git a/arch/riscv/crypto/aes-riscv64-block-mode-glue.c b/arch/riscv/crypto/aes-riscv64-block-mode-glue.c new file mode 100644 index 000000000000..b1d59f6da923 --- /dev/null +++ b/arch/riscv/crypto/aes-riscv64-block-mode-glue.c @@ -0,0 +1,494 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Port of the OpenSSL AES block mode implementations for RISC-V + * + * Copyright (C) 2023 SiFive, Inc. 
+ * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "aes-riscv64-glue.h" + +struct riscv64_aes_xts_ctx { + struct crypto_aes_ctx ctx1; + struct crypto_aes_ctx ctx2; +}; + +/* aes cbc block mode using zvkned vector crypto extension */ +asmlinkage void rv64i_zvkned_cbc_encrypt(const u8 *in, u8 *out, size_t length, + const struct crypto_aes_ctx *key, + u8 *ivec); +asmlinkage void rv64i_zvkned_cbc_decrypt(const u8 *in, u8 *out, size_t length, + const struct crypto_aes_ctx *key, + u8 *ivec); +/* aes ecb block mode using zvkned vector crypto extension */ +asmlinkage void rv64i_zvkned_ecb_encrypt(const u8 *in, u8 *out, size_t length, + const struct crypto_aes_ctx *key); +asmlinkage void rv64i_zvkned_ecb_decrypt(const u8 *in, u8 *out, size_t length, + const struct crypto_aes_ctx *key); + +/* aes ctr block mode using zvkb and zvkned vector crypto extension */ +/* This func operates on 32-bit counter. Caller has to handle the overflow. 
*/ +asmlinkage void +rv64i_zvkb_zvkned_ctr32_encrypt_blocks(const u8 *in, u8 *out, size_t length, + const struct crypto_aes_ctx *key, + u8 *ivec); + +/* aes xts block mode using zvbb, zvkg and zvkned vector crypto extension */ +asmlinkage void +rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt(const u8 *in, u8 *out, size_t length, + const struct crypto_aes_ctx *key, u8 *iv, + int update_iv); +asmlinkage void +rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt(const u8 *in, u8 *out, size_t length, + const struct crypto_aes_ctx *key, u8 *iv, + int update_iv); + +/* ecb */ +static int riscv64_aes_setkey(struct crypto_skcipher *tfm, const u8 *in_key, + unsigned int key_len) +{ + struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); + + return riscv64_aes_setkey_zvkned(ctx, in_key, key_len); +} + +static int riscv64_ecb_encrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + const struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + unsigned int nbytes; + int err; + + /* If we have error here, the `nbytes` will be zero. 
*/ + err = skcipher_walk_virt(&walk, req, false); + while ((nbytes = walk.nbytes)) { + kernel_vector_begin(); + rv64i_zvkned_ecb_encrypt(walk.src.virt.addr, walk.dst.virt.addr, + nbytes & ~(AES_BLOCK_SIZE - 1), ctx); + kernel_vector_end(); + err = skcipher_walk_done(&walk, nbytes & (AES_BLOCK_SIZE - 1)); + } + + return err; +} + +static int riscv64_ecb_decrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + const struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + unsigned int nbytes; + int err; + + err = skcipher_walk_virt(&walk, req, false); + while ((nbytes = walk.nbytes)) { + kernel_vector_begin(); + rv64i_zvkned_ecb_decrypt(walk.src.virt.addr, walk.dst.virt.addr, + nbytes & ~(AES_BLOCK_SIZE - 1), ctx); + kernel_vector_end(); + err = skcipher_walk_done(&walk, nbytes & (AES_BLOCK_SIZE - 1)); + } + + return err; +} + +/* cbc */ +static int riscv64_cbc_encrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + const struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + unsigned int nbytes; + int err; + + err = skcipher_walk_virt(&walk, req, false); + while ((nbytes = walk.nbytes)) { + kernel_vector_begin(); + rv64i_zvkned_cbc_encrypt(walk.src.virt.addr, walk.dst.virt.addr, + nbytes & ~(AES_BLOCK_SIZE - 1), ctx, + walk.iv); + kernel_vector_end(); + err = skcipher_walk_done(&walk, nbytes & (AES_BLOCK_SIZE - 1)); + } + + return err; +} + +static int riscv64_cbc_decrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + const struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + unsigned int nbytes; + int err; + + err = skcipher_walk_virt(&walk, req, false); + while ((nbytes = walk.nbytes)) { + kernel_vector_begin(); + rv64i_zvkned_cbc_decrypt(walk.src.virt.addr, walk.dst.virt.addr, + nbytes & ~(AES_BLOCK_SIZE - 1), ctx, + walk.iv); + 
kernel_vector_end(); + err = skcipher_walk_done(&walk, nbytes & (AES_BLOCK_SIZE - 1)); + } + + return err; +} + +/* ctr */ +static int riscv64_ctr_encrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + const struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + unsigned int ctr32; + unsigned int nbytes; + unsigned int blocks; + unsigned int current_blocks; + unsigned int current_length; + int err; + + /* the ctr iv uses big endian */ + ctr32 = get_unaligned_be32(req->iv + 12); + err = skcipher_walk_virt(&walk, req, false); + while ((nbytes = walk.nbytes)) { + if (nbytes != walk.total) { + nbytes &= ~(AES_BLOCK_SIZE - 1); + blocks = nbytes / AES_BLOCK_SIZE; + } else { + /* This is the last walk. We should handle the tail data. */ + blocks = DIV_ROUND_UP(nbytes, AES_BLOCK_SIZE); + } + ctr32 += blocks; + + kernel_vector_begin(); + /* + * The `if` block below detects the overflow, which is then handled by + * limiting the amount of blocks to the exact overflow point. 
+ */ + if (ctr32 >= blocks) { + rv64i_zvkb_zvkned_ctr32_encrypt_blocks( + walk.src.virt.addr, walk.dst.virt.addr, nbytes, + ctx, req->iv); + } else { + /* use 2 ctr32 function calls for overflow case */ + current_blocks = blocks - ctr32; + current_length = + min(nbytes, current_blocks * AES_BLOCK_SIZE); + rv64i_zvkb_zvkned_ctr32_encrypt_blocks( + walk.src.virt.addr, walk.dst.virt.addr, + current_length, ctx, req->iv); + crypto_inc(req->iv, 12); + + if (ctr32) { + rv64i_zvkb_zvkned_ctr32_encrypt_blocks( + walk.src.virt.addr + + current_blocks * AES_BLOCK_SIZE, + walk.dst.virt.addr + + current_blocks * AES_BLOCK_SIZE, + nbytes - current_length, ctx, req->iv); + } + } + kernel_vector_end(); + + err = skcipher_walk_done(&walk, walk.nbytes - nbytes); + } + + return err; +} + +/* xts */ +static int riscv64_xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key, + unsigned int key_len) +{ + struct riscv64_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm); + unsigned int xts_single_key_len = key_len / 2; + int ret; + + ret = xts_verify_key(tfm, in_key, key_len); + if (ret) + return ret; + ret = riscv64_aes_setkey_zvkned(&ctx->ctx1, in_key, xts_single_key_len); + if (ret) + return ret; + return riscv64_aes_setkey_zvkned( + &ctx->ctx2, in_key + xts_single_key_len, xts_single_key_len); +} + +static int xts_crypt(struct skcipher_request *req, bool encrypt) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + const struct riscv64_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_request sub_req; + struct scatterlist sg_src[2], sg_dst[2]; + struct scatterlist *src, *dst; + struct skcipher_walk walk; + unsigned int walk_size = crypto_skcipher_walksize(tfm); + unsigned int tail = req->cryptlen & (AES_BLOCK_SIZE - 1); + unsigned int nbytes; + unsigned int update_iv = 1; + int err; + + /* xts input size should be bigger than AES_BLOCK_SIZE */ + if (req->cryptlen < AES_BLOCK_SIZE) + return -EINVAL; + + riscv64_aes_encrypt_zvkned(&ctx->ctx2, req->iv, req->iv); + 
+ if (unlikely(tail > 0 && req->cryptlen > walk_size)) { + /* + * Find the largest tail size which is smaller than the `walk` size while the + * non-ciphertext-stealing parts still fit the AES block boundary. + */ + tail = walk_size + tail - AES_BLOCK_SIZE; + + skcipher_request_set_tfm(&sub_req, tfm); + skcipher_request_set_callback( + &sub_req, skcipher_request_flags(req), NULL, NULL); + skcipher_request_set_crypt(&sub_req, req->src, req->dst, + req->cryptlen - tail, req->iv); + req = &sub_req; + } else { + tail = 0; + } + + err = skcipher_walk_virt(&walk, req, false); + if (!walk.nbytes) + return err; + + while ((nbytes = walk.nbytes)) { + if (nbytes < walk.total) + nbytes &= ~(AES_BLOCK_SIZE - 1); + else + update_iv = (tail > 0); + + kernel_vector_begin(); + if (encrypt) + rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt( + walk.src.virt.addr, walk.dst.virt.addr, nbytes, + &ctx->ctx1, req->iv, update_iv); + else + rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt( + walk.src.virt.addr, walk.dst.virt.addr, nbytes, + &ctx->ctx1, req->iv, update_iv); + kernel_vector_end(); + + err = skcipher_walk_done(&walk, walk.nbytes - nbytes); + } + + if (unlikely(tail > 0 && !err)) { + dst = src = scatterwalk_ffwd(sg_src, req->src, req->cryptlen); + if (req->dst != req->src) + dst = scatterwalk_ffwd(sg_dst, req->dst, req->cryptlen); + + skcipher_request_set_crypt(req, src, dst, tail, req->iv); + + err = skcipher_walk_virt(&walk, req, false); + if (err) + return err; + + kernel_vector_begin(); + if (encrypt) + rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt( + walk.src.virt.addr, walk.dst.virt.addr, + walk.nbytes, &ctx->ctx1, req->iv, 0); + else + rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt( + walk.src.virt.addr, walk.dst.virt.addr, + walk.nbytes, &ctx->ctx1, req->iv, 0); + kernel_vector_end(); + + err = skcipher_walk_done(&walk, 0); + } + + return err; +} + +static int riscv64_xts_encrypt(struct skcipher_request *req) +{ + return xts_crypt(req, true); +} + +static int riscv64_xts_decrypt(struct skcipher_request *req) +{
+ return xts_crypt(req, false); +} + +static struct skcipher_alg riscv64_aes_algs_zvkned[] = { + { + .setkey = riscv64_aes_setkey, + .encrypt = riscv64_ecb_encrypt, + .decrypt = riscv64_ecb_decrypt, + .min_keysize = AES_MIN_KEY_SIZE, + .max_keysize = AES_MAX_KEY_SIZE, + .walksize = AES_BLOCK_SIZE * 8, + .base = { + .cra_flags = CRYPTO_ALG_INTERNAL, + .cra_blocksize = AES_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct crypto_aes_ctx), + .cra_priority = 300, + .cra_name = "__ecb(aes)", + .cra_driver_name = "__ecb-aes-riscv64-zvkned", + .cra_module = THIS_MODULE, + }, + }, { + .setkey = riscv64_aes_setkey, + .encrypt = riscv64_cbc_encrypt, + .decrypt = riscv64_cbc_decrypt, + .min_keysize = AES_MIN_KEY_SIZE, + .max_keysize = AES_MAX_KEY_SIZE, + .ivsize = AES_BLOCK_SIZE, + .walksize = AES_BLOCK_SIZE * 8, + .base = { + .cra_flags = CRYPTO_ALG_INTERNAL, + .cra_blocksize = AES_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct crypto_aes_ctx), + .cra_priority = 300, + .cra_name = "__cbc(aes)", + .cra_driver_name = "__cbc-aes-riscv64-zvkned", + .cra_module = THIS_MODULE, + }, + } +}; + +static struct simd_skcipher_alg + *riscv64_aes_simd_algs_zvkned[ARRAY_SIZE(riscv64_aes_algs_zvkned)]; + +static struct skcipher_alg riscv64_aes_alg_zvkned_zvkb[] = { + { + .setkey = riscv64_aes_setkey, + .encrypt = riscv64_ctr_encrypt, + .decrypt = riscv64_ctr_encrypt, + .min_keysize = AES_MIN_KEY_SIZE, + .max_keysize = AES_MAX_KEY_SIZE, + .ivsize = AES_BLOCK_SIZE, + .chunksize = AES_BLOCK_SIZE, + .walksize = AES_BLOCK_SIZE * 8, + .base = { + .cra_flags = CRYPTO_ALG_INTERNAL, + .cra_blocksize = 1, + .cra_ctxsize = sizeof(struct crypto_aes_ctx), + .cra_priority = 300, + .cra_name = "__ctr(aes)", + .cra_driver_name = "__ctr-aes-riscv64-zvkned-zvkb", + .cra_module = THIS_MODULE, + }, + } +}; + +static struct simd_skcipher_alg *riscv64_aes_simd_alg_zvkned_zvkb[ARRAY_SIZE( + riscv64_aes_alg_zvkned_zvkb)]; + +static struct skcipher_alg riscv64_aes_alg_zvkned_zvbb_zvkg[] = { + { + .setkey = riscv64_xts_setkey, 
+ .encrypt = riscv64_xts_encrypt, + .decrypt = riscv64_xts_decrypt, + .min_keysize = AES_MIN_KEY_SIZE * 2, + .max_keysize = AES_MAX_KEY_SIZE * 2, + .ivsize = AES_BLOCK_SIZE, + .chunksize = AES_BLOCK_SIZE, + .walksize = AES_BLOCK_SIZE * 8, + .base = { + .cra_flags = CRYPTO_ALG_INTERNAL, + .cra_blocksize = AES_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct riscv64_aes_xts_ctx), + .cra_priority = 300, + .cra_name = "__xts(aes)", + .cra_driver_name = "__xts-aes-riscv64-zvkned-zvbb-zvkg", + .cra_module = THIS_MODULE, + }, + } +}; + +static struct simd_skcipher_alg + *riscv64_aes_simd_alg_zvkned_zvbb_zvkg[ARRAY_SIZE( + riscv64_aes_alg_zvkned_zvbb_zvkg)]; + +static int __init riscv64_aes_block_mod_init(void) +{ + int ret = -ENODEV; + + if (riscv_isa_extension_available(NULL, ZVKNED) && + riscv_vector_vlen() >= 128 && riscv_vector_vlen() <= 2048) { + ret = simd_register_skciphers_compat( + riscv64_aes_algs_zvkned, + ARRAY_SIZE(riscv64_aes_algs_zvkned), + riscv64_aes_simd_algs_zvkned); + if (ret) + return ret; + + if (riscv_isa_extension_available(NULL, ZVKB)) { + ret = simd_register_skciphers_compat( + riscv64_aes_alg_zvkned_zvkb, + ARRAY_SIZE(riscv64_aes_alg_zvkned_zvkb), + riscv64_aes_simd_alg_zvkned_zvkb); + if (ret) + goto unregister_zvkned; + } + + if (riscv_isa_extension_available(NULL, ZVBB) && + riscv_isa_extension_available(NULL, ZVKG)) { + ret = simd_register_skciphers_compat( + riscv64_aes_alg_zvkned_zvbb_zvkg, + ARRAY_SIZE(riscv64_aes_alg_zvkned_zvbb_zvkg), + riscv64_aes_simd_alg_zvkned_zvbb_zvkg); + if (ret) + goto unregister_zvkned_zvkb; + } + } + + return ret; + +unregister_zvkned_zvkb: + simd_unregister_skciphers(riscv64_aes_alg_zvkned_zvkb, + ARRAY_SIZE(riscv64_aes_alg_zvkned_zvkb), + riscv64_aes_simd_alg_zvkned_zvkb); +unregister_zvkned: + simd_unregister_skciphers(riscv64_aes_algs_zvkned, + ARRAY_SIZE(riscv64_aes_algs_zvkned), + riscv64_aes_simd_algs_zvkned); + + return ret; +} + +static void __exit riscv64_aes_block_mod_fini(void) +{ + 
simd_unregister_skciphers(riscv64_aes_alg_zvkned_zvbb_zvkg, + ARRAY_SIZE(riscv64_aes_alg_zvkned_zvbb_zvkg), + riscv64_aes_simd_alg_zvkned_zvbb_zvkg); + simd_unregister_skciphers(riscv64_aes_alg_zvkned_zvkb, + ARRAY_SIZE(riscv64_aes_alg_zvkned_zvkb), + riscv64_aes_simd_alg_zvkned_zvkb); + simd_unregister_skciphers(riscv64_aes_algs_zvkned, + ARRAY_SIZE(riscv64_aes_algs_zvkned), + riscv64_aes_simd_algs_zvkned); +} + +module_init(riscv64_aes_block_mod_init); +module_exit(riscv64_aes_block_mod_fini); + +MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS (RISC-V accelerated)"); +MODULE_AUTHOR("Jerry Shih "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("cbc(aes)"); +MODULE_ALIAS_CRYPTO("ctr(aes)"); +MODULE_ALIAS_CRYPTO("ecb(aes)"); +MODULE_ALIAS_CRYPTO("xts(aes)"); diff --git a/arch/riscv/crypto/aes-riscv64-zvkned-zvbb-zvkg.pl b/arch/riscv/crypto/aes-riscv64-zvkned-zvbb-zvkg.pl new file mode 100644 index 000000000000..a67d74593860 --- /dev/null +++ b/arch/riscv/crypto/aes-riscv64-zvkned-zvbb-zvkg.pl @@ -0,0 +1,949 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Jerry Shih +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. 
Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 && VLEN <= 2048 +# - RISC-V Vector Bit-manipulation extension ('Zvbb') +# - RISC-V Vector GCM/GMAC extension ('Zvkg') +# - RISC-V Vector AES block cipher extension ('Zvkned') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? 
shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +.text +___ + +{ +################################################################################ +# void rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt(const unsigned char *in, +# unsigned char *out, size_t length, +# const AES_KEY *key, +# unsigned char iv[16], +# int update_iv) +my ($INPUT, $OUTPUT, $LENGTH, $KEY, $IV, $UPDATE_IV) = ("a0", "a1", "a2", "a3", "a4", "a5"); +my ($TAIL_LENGTH) = ("a6"); +my ($VL) = ("a7"); +my ($T0, $T1, $T2, $T3) = ("t0", "t1", "t2", "t3"); +my ($STORE_LEN32) = ("t4"); +my ($LEN32) = ("t5"); +my ($V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, + $V8, $V9, $V10, $V11, $V12, $V13, $V14, $V15, + $V16, $V17, $V18, $V19, $V20, $V21, $V22, $V23, + $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, +) = map("v$_",(0..31)); + +# load iv to v28 +sub load_xts_iv0 { + my $code=<<___; + vsetivli zero, 4, e32, m1, ta, ma + vle32.v $V28, ($IV) +___ + + return $code; +} + +# prepare input data(v24), iv(v28), bit-reversed-iv(v16), bit-reversed-iv-multiplier(v20) +sub init_first_round { + my $code=<<___; + # load input + vsetvli $VL, $LEN32, e32, m4, ta, ma + vle32.v $V24, ($INPUT) + + li $T0, 5 + # We could simplify the initialization steps if we have `block<=1`. + blt $LEN32, $T0, 1f + + # Note: We use `vgmul` for GF(2^128) multiplication. The `vgmul` uses + # different order of coefficients. We should use`vbrev8` to reverse the + # data when we use `vgmul`. + vsetivli zero, 4, e32, m1, ta, ma + @{[vbrev8_v $V0, $V28]} + vsetvli zero, $LEN32, e32, m4, ta, ma + vmv.v.i $V16, 0 + # v16: [r-IV0, r-IV0, ...] + @{[vaesz_vs $V16, $V0]} + + # Prepare GF(2^128) multiplier [1, x, x^2, x^3, ...] in v8. + # We use `vwsll` to get power of 2 multipliers. Current rvv spec only + # supports `SEW<=64`. So, the maximum `VLEN` for this approach is `2048`. + # SEW64_BITS * AES_BLOCK_SIZE / LMUL + # = 64 * 128 / 4 = 2048 + # + # TODO: truncate the vl to `2048` for `vlen>2048` case. 
+ slli $T0, $LEN32, 2 + vsetvli zero, $T0, e32, m1, ta, ma + # v2: [`1`, `1`, `1`, `1`, ...] + vmv.v.i $V2, 1 + # v3: [`0`, `1`, `2`, `3`, ...] + vid.v $V3 + vsetvli zero, $T0, e64, m2, ta, ma + # v4: [`1`, 0, `1`, 0, `1`, 0, `1`, 0, ...] + vzext.vf2 $V4, $V2 + # v6: [`0`, 0, `1`, 0, `2`, 0, `3`, 0, ...] + vzext.vf2 $V6, $V3 + slli $T0, $LEN32, 1 + vsetvli zero, $T0, e32, m2, ta, ma + # v8: [1<<0=1, 0, 0, 0, 1<<1=x, 0, 0, 0, 1<<2=x^2, 0, 0, 0, ...] + @{[vwsll_vv $V8, $V4, $V6]} + + # Compute [r-IV0*1, r-IV0*x, r-IV0*x^2, r-IV0*x^3, ...] in v16 + vsetvli zero, $LEN32, e32, m4, ta, ma + @{[vbrev8_v $V8, $V8]} + @{[vgmul_vv $V16, $V8]} + + # Compute [IV0*1, IV0*x, IV0*x^2, IV0*x^3, ...] in v28. + # Reverse the bits order back. + @{[vbrev8_v $V28, $V16]} + + # Prepare the x^n multiplier in v20. The `n` is the aes-xts block number + # in a LMUL=4 register group. + # n = ((VLEN*LMUL)/(32*4)) = ((VLEN*4)/(32*4)) + # = (VLEN/32) + # We could use vsetvli with `e32, m1` to compute the `n` number. 
+ vsetvli $T0, zero, e32, m1, ta, ma + li $T1, 1 + sll $T0, $T1, $T0 + vsetivli zero, 2, e64, m1, ta, ma + vmv.v.i $V0, 0 + vsetivli zero, 1, e64, m1, tu, ma + vmv.v.x $V0, $T0 + vsetivli zero, 2, e64, m1, ta, ma + @{[vbrev8_v $V0, $V0]} + vsetvli zero, $LEN32, e32, m4, ta, ma + vmv.v.i $V20, 0 + @{[vaesz_vs $V20, $V0]} + + j 2f +1: + vsetivli zero, 4, e32, m1, ta, ma + @{[vbrev8_v $V16, $V28]} +2: +___ + + return $code; +} + +# prepare xts enc last block's input(v24) and iv(v28) +sub handle_xts_enc_last_block { + my $code=<<___; + bnez $TAIL_LENGTH, 2f + + beqz $UPDATE_IV, 1f + ## Store next IV + addi $VL, $VL, -4 + vsetivli zero, 4, e32, m4, ta, ma + # multiplier + vslidedown.vx $V16, $V16, $VL + + # setup `x` multiplier with byte-reversed order + # 0b00000010 => 0b01000000 (0x40) + li $T0, 0x40 + vsetivli zero, 4, e32, m1, ta, ma + vmv.v.i $V28, 0 + vsetivli zero, 1, e8, m1, tu, ma + vmv.v.x $V28, $T0 + + # IV * `x` + vsetivli zero, 4, e32, m1, ta, ma + @{[vgmul_vv $V16, $V28]} + # Reverse the IV's bits order back to big-endian + @{[vbrev8_v $V28, $V16]} + + vse32.v $V28, ($IV) +1: + + ret +2: + # slidedown second to last block + addi $VL, $VL, -4 + vsetivli zero, 4, e32, m4, ta, ma + # ciphertext + vslidedown.vx $V24, $V24, $VL + # multiplier + vslidedown.vx $V16, $V16, $VL + + vsetivli zero, 4, e32, m1, ta, ma + vmv.v.v $V25, $V24 + + # load last block into v24 + # note: We should load the last block before store the second to last block + # for in-place operation. 
+ vsetvli zero, $TAIL_LENGTH, e8, m1, tu, ma + vle8.v $V24, ($INPUT) + + # setup `x` multiplier with byte-reversed order + # 0b00000010 => 0b01000000 (0x40) + li $T0, 0x40 + vsetivli zero, 4, e32, m1, ta, ma + vmv.v.i $V28, 0 + vsetivli zero, 1, e8, m1, tu, ma + vmv.v.x $V28, $T0 + + # compute IV for last block + vsetivli zero, 4, e32, m1, ta, ma + @{[vgmul_vv $V16, $V28]} + @{[vbrev8_v $V28, $V16]} + + # store second to last block + vsetvli zero, $TAIL_LENGTH, e8, m1, ta, ma + vse8.v $V25, ($OUTPUT) +___ + + return $code; +} + +# prepare xts dec second to last block's input(v24) and iv(v29) and +# last block's and iv(v28) +sub handle_xts_dec_last_block { + my $code=<<___; + bnez $TAIL_LENGTH, 2f + + beqz $UPDATE_IV, 1f + ## Store next IV + # setup `x` multiplier with byte-reversed order + # 0b00000010 => 0b01000000 (0x40) + li $T0, 0x40 + vsetivli zero, 4, e32, m1, ta, ma + vmv.v.i $V28, 0 + vsetivli zero, 1, e8, m1, tu, ma + vmv.v.x $V28, $T0 + + beqz $LENGTH, 3f + addi $VL, $VL, -4 + vsetivli zero, 4, e32, m4, ta, ma + # multiplier + vslidedown.vx $V16, $V16, $VL + +3: + # IV * `x` + vsetivli zero, 4, e32, m1, ta, ma + @{[vgmul_vv $V16, $V28]} + # Reverse the IV's bits order back to big-endian + @{[vbrev8_v $V28, $V16]} + + vse32.v $V28, ($IV) +1: + + ret +2: + # load second to last block's ciphertext + vsetivli zero, 4, e32, m1, ta, ma + vle32.v $V24, ($INPUT) + addi $INPUT, $INPUT, 16 + + # setup `x` multiplier with byte-reversed order + # 0b00000010 => 0b01000000 (0x40) + li $T0, 0x40 + vsetivli zero, 4, e32, m1, ta, ma + vmv.v.i $V20, 0 + vsetivli zero, 1, e8, m1, tu, ma + vmv.v.x $V20, $T0 + + beqz $LENGTH, 1f + # slidedown third to last block + addi $VL, $VL, -4 + vsetivli zero, 4, e32, m4, ta, ma + # multiplier + vslidedown.vx $V16, $V16, $VL + + # compute IV for last block + vsetivli zero, 4, e32, m1, ta, ma + @{[vgmul_vv $V16, $V20]} + @{[vbrev8_v $V28, $V16]} + + # compute IV for second to last block + @{[vgmul_vv $V16, $V20]} + @{[vbrev8_v $V29, 
$V16]} + j 2f +1: + # compute IV for second to last block + vsetivli zero, 4, e32, m1, ta, ma + @{[vgmul_vv $V16, $V20]} + @{[vbrev8_v $V29, $V16]} +2: +___ + + return $code; +} + +# Load all 11 round keys to v1-v11 registers. +sub aes_128_load_key { + my $code=<<___; + vsetivli zero, 4, e32, m1, ta, ma + vle32.v $V1, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V2, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V3, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V4, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V5, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V6, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V7, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V8, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V9, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V10, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V11, ($KEY) +___ + + return $code; +} + +# Load all 13 round keys to v1-v13 registers. +sub aes_192_load_key { + my $code=<<___; + vsetivli zero, 4, e32, m1, ta, ma + vle32.v $V1, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V2, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V3, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V4, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V5, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V6, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V7, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V8, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V9, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V10, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V11, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V12, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V13, ($KEY) +___ + + return $code; +} + +# Load all 15 round keys to v1-v15 registers. 
+sub aes_256_load_key { + my $code=<<___; + vsetivli zero, 4, e32, m1, ta, ma + vle32.v $V1, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V2, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V3, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V4, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V5, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V6, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V7, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V8, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V9, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V10, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V11, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V12, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V13, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V14, ($KEY) + addi $KEY, $KEY, 16 + vle32.v $V15, ($KEY) +___ + + return $code; +} + +# aes-128 enc with round keys v1-v11 +sub aes_128_enc { + my $code=<<___; + @{[vaesz_vs $V24, $V1]} + @{[vaesem_vs $V24, $V2]} + @{[vaesem_vs $V24, $V3]} + @{[vaesem_vs $V24, $V4]} + @{[vaesem_vs $V24, $V5]} + @{[vaesem_vs $V24, $V6]} + @{[vaesem_vs $V24, $V7]} + @{[vaesem_vs $V24, $V8]} + @{[vaesem_vs $V24, $V9]} + @{[vaesem_vs $V24, $V10]} + @{[vaesef_vs $V24, $V11]} +___ + + return $code; +} + +# aes-128 dec with round keys v1-v11 +sub aes_128_dec { + my $code=<<___; + @{[vaesz_vs $V24, $V11]} + @{[vaesdm_vs $V24, $V10]} + @{[vaesdm_vs $V24, $V9]} + @{[vaesdm_vs $V24, $V8]} + @{[vaesdm_vs $V24, $V7]} + @{[vaesdm_vs $V24, $V6]} + @{[vaesdm_vs $V24, $V5]} + @{[vaesdm_vs $V24, $V4]} + @{[vaesdm_vs $V24, $V3]} + @{[vaesdm_vs $V24, $V2]} + @{[vaesdf_vs $V24, $V1]} +___ + + return $code; +} + +# aes-192 enc with round keys v1-v13 +sub aes_192_enc { + my $code=<<___; + @{[vaesz_vs $V24, $V1]} + @{[vaesem_vs $V24, $V2]} + @{[vaesem_vs $V24, $V3]} + @{[vaesem_vs $V24, $V4]} + @{[vaesem_vs $V24, $V5]} + @{[vaesem_vs $V24, $V6]} + @{[vaesem_vs $V24, $V7]} + @{[vaesem_vs $V24, $V8]} + @{[vaesem_vs $V24, $V9]} + @{[vaesem_vs $V24, $V10]} + @{[vaesem_vs $V24, $V11]} + @{[vaesem_vs $V24, $V12]} + @{[vaesef_vs $V24, $V13]} +___ + + return 
$code; +} + +# aes-192 dec with round keys v1-v13 +sub aes_192_dec { + my $code=<<___; + @{[vaesz_vs $V24, $V13]} + @{[vaesdm_vs $V24, $V12]} + @{[vaesdm_vs $V24, $V11]} + @{[vaesdm_vs $V24, $V10]} + @{[vaesdm_vs $V24, $V9]} + @{[vaesdm_vs $V24, $V8]} + @{[vaesdm_vs $V24, $V7]} + @{[vaesdm_vs $V24, $V6]} + @{[vaesdm_vs $V24, $V5]} + @{[vaesdm_vs $V24, $V4]} + @{[vaesdm_vs $V24, $V3]} + @{[vaesdm_vs $V24, $V2]} + @{[vaesdf_vs $V24, $V1]} +___ + + return $code; +} + +# aes-256 enc with round keys v1-v15 +sub aes_256_enc { + my $code=<<___; + @{[vaesz_vs $V24, $V1]} + @{[vaesem_vs $V24, $V2]} + @{[vaesem_vs $V24, $V3]} + @{[vaesem_vs $V24, $V4]} + @{[vaesem_vs $V24, $V5]} + @{[vaesem_vs $V24, $V6]} + @{[vaesem_vs $V24, $V7]} + @{[vaesem_vs $V24, $V8]} + @{[vaesem_vs $V24, $V9]} + @{[vaesem_vs $V24, $V10]} + @{[vaesem_vs $V24, $V11]} + @{[vaesem_vs $V24, $V12]} + @{[vaesem_vs $V24, $V13]} + @{[vaesem_vs $V24, $V14]} + @{[vaesef_vs $V24, $V15]} +___ + + return $code; +} + +# aes-256 dec with round keys v1-v15 +sub aes_256_dec { + my $code=<<___; + @{[vaesz_vs $V24, $V15]} + @{[vaesdm_vs $V24, $V14]} + @{[vaesdm_vs $V24, $V13]} + @{[vaesdm_vs $V24, $V12]} + @{[vaesdm_vs $V24, $V11]} + @{[vaesdm_vs $V24, $V10]} + @{[vaesdm_vs $V24, $V9]} + @{[vaesdm_vs $V24, $V8]} + @{[vaesdm_vs $V24, $V7]} + @{[vaesdm_vs $V24, $V6]} + @{[vaesdm_vs $V24, $V5]} + @{[vaesdm_vs $V24, $V4]} + @{[vaesdm_vs $V24, $V3]} + @{[vaesdm_vs $V24, $V2]} + @{[vaesdf_vs $V24, $V1]} +___ + + return $code; +} + +$code .= <<___; +.p2align 3 +.globl rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt +.type rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt,\@function +rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt: + @{[load_xts_iv0]} + + # aes block size is 16 + andi $TAIL_LENGTH, $LENGTH, 15 + mv $STORE_LEN32, $LENGTH + beqz $TAIL_LENGTH, 1f + sub $LENGTH, $LENGTH, $TAIL_LENGTH + addi $STORE_LEN32, $LENGTH, -16 +1: + # We make the `LENGTH` become e32 length here. 
+ srli $LEN32, $LENGTH, 2 + srli $STORE_LEN32, $STORE_LEN32, 2 + + # Load key length. + lwu $T0, 480($KEY) + li $T1, 32 + li $T2, 24 + li $T3, 16 + beq $T0, $T1, aes_xts_enc_256 + beq $T0, $T2, aes_xts_enc_192 + beq $T0, $T3, aes_xts_enc_128 +.size rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt,.-rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt +___ + +$code .= <<___; +.p2align 3 +aes_xts_enc_128: + @{[init_first_round]} + @{[aes_128_load_key]} + + vsetvli $VL, $LEN32, e32, m4, ta, ma + j 1f + +.Lenc_blocks_128: + vsetvli $VL, $LEN32, e32, m4, ta, ma + # load plaintext into v24 + vle32.v $V24, ($INPUT) + # update iv + @{[vgmul_vv $V16, $V20]} + # reverse the iv's bits order back + @{[vbrev8_v $V28, $V16]} +1: + vxor.vv $V24, $V24, $V28 + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + add $INPUT, $INPUT, $T0 + @{[aes_128_enc]} + vxor.vv $V24, $V24, $V28 + + # store ciphertext + vsetvli zero, $STORE_LEN32, e32, m4, ta, ma + vse32.v $V24, ($OUTPUT) + add $OUTPUT, $OUTPUT, $T0 + sub $STORE_LEN32, $STORE_LEN32, $VL + + bnez $LEN32, .Lenc_blocks_128 + + @{[handle_xts_enc_last_block]} + + # xts last block + vsetivli zero, 4, e32, m1, ta, ma + vxor.vv $V24, $V24, $V28 + @{[aes_128_enc]} + vxor.vv $V24, $V24, $V28 + + # store last block ciphertext + addi $OUTPUT, $OUTPUT, -16 + vse32.v $V24, ($OUTPUT) + + ret +.size aes_xts_enc_128,.-aes_xts_enc_128 +___ + +$code .= <<___; +.p2align 3 +aes_xts_enc_192: + @{[init_first_round]} + @{[aes_192_load_key]} + + vsetvli $VL, $LEN32, e32, m4, ta, ma + j 1f + +.Lenc_blocks_192: + vsetvli $VL, $LEN32, e32, m4, ta, ma + # load plaintext into v24 + vle32.v $V24, ($INPUT) + # update iv + @{[vgmul_vv $V16, $V20]} + # reverse the iv's bits order back + @{[vbrev8_v $V28, $V16]} +1: + vxor.vv $V24, $V24, $V28 + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + add $INPUT, $INPUT, $T0 + @{[aes_192_enc]} + vxor.vv $V24, $V24, $V28 + + # store ciphertext + vsetvli zero, $STORE_LEN32, e32, m4, ta, ma + vse32.v $V24, ($OUTPUT) + add $OUTPUT, $OUTPUT, $T0 + sub 
$STORE_LEN32, $STORE_LEN32, $VL + + bnez $LEN32, .Lenc_blocks_192 + + @{[handle_xts_enc_last_block]} + + # xts last block + vsetivli zero, 4, e32, m1, ta, ma + vxor.vv $V24, $V24, $V28 + @{[aes_192_enc]} + vxor.vv $V24, $V24, $V28 + + # store last block ciphertext + addi $OUTPUT, $OUTPUT, -16 + vse32.v $V24, ($OUTPUT) + + ret +.size aes_xts_enc_192,.-aes_xts_enc_192 +___ + +$code .= <<___; +.p2align 3 +aes_xts_enc_256: + @{[init_first_round]} + @{[aes_256_load_key]} + + vsetvli $VL, $LEN32, e32, m4, ta, ma + j 1f + +.Lenc_blocks_256: + vsetvli $VL, $LEN32, e32, m4, ta, ma + # load plaintext into v24 + vle32.v $V24, ($INPUT) + # update iv + @{[vgmul_vv $V16, $V20]} + # reverse the iv's bits order back + @{[vbrev8_v $V28, $V16]} +1: + vxor.vv $V24, $V24, $V28 + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + add $INPUT, $INPUT, $T0 + @{[aes_256_enc]} + vxor.vv $V24, $V24, $V28 + + # store ciphertext + vsetvli zero, $STORE_LEN32, e32, m4, ta, ma + vse32.v $V24, ($OUTPUT) + add $OUTPUT, $OUTPUT, $T0 + sub $STORE_LEN32, $STORE_LEN32, $VL + + bnez $LEN32, .Lenc_blocks_256 + + @{[handle_xts_enc_last_block]} + + # xts last block + vsetivli zero, 4, e32, m1, ta, ma + vxor.vv $V24, $V24, $V28 + @{[aes_256_enc]} + vxor.vv $V24, $V24, $V28 + + # store last block ciphertext + addi $OUTPUT, $OUTPUT, -16 + vse32.v $V24, ($OUTPUT) + + ret +.size aes_xts_enc_256,.-aes_xts_enc_256 +___ + +################################################################################ +# void rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt(const unsigned char *in, +# unsigned char *out, size_t length, +# const AES_KEY *key, +# unsigned char iv[16], +# int update_iv) +$code .= <<___; +.p2align 3 +.globl rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt +.type rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt,\@function +rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt: + @{[load_xts_iv0]} + + # aes block size is 16 + andi $TAIL_LENGTH, $LENGTH, 15 + beqz $TAIL_LENGTH, 1f + sub $LENGTH, $LENGTH, $TAIL_LENGTH + addi $LENGTH, $LENGTH, -16 +1: + # 
We make the `LENGTH` become e32 length here. + srli $LEN32, $LENGTH, 2 + + # Load key length. + lwu $T0, 480($KEY) + li $T1, 32 + li $T2, 24 + li $T3, 16 + beq $T0, $T1, aes_xts_dec_256 + beq $T0, $T2, aes_xts_dec_192 + beq $T0, $T3, aes_xts_dec_128 +.size rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt,.-rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt +___ + +$code .= <<___; +.p2align 3 +aes_xts_dec_128: + @{[init_first_round]} + @{[aes_128_load_key]} + + beqz $LEN32, 2f + + vsetvli $VL, $LEN32, e32, m4, ta, ma + j 1f + +.Ldec_blocks_128: + vsetvli $VL, $LEN32, e32, m4, ta, ma + # load ciphertext into v24 + vle32.v $V24, ($INPUT) + # update iv + @{[vgmul_vv $V16, $V20]} + # reverse the iv's bits order back + @{[vbrev8_v $V28, $V16]} +1: + vxor.vv $V24, $V24, $V28 + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + add $INPUT, $INPUT, $T0 + @{[aes_128_dec]} + vxor.vv $V24, $V24, $V28 + + # store plaintext + vse32.v $V24, ($OUTPUT) + add $OUTPUT, $OUTPUT, $T0 + + bnez $LEN32, .Ldec_blocks_128 + +2: + @{[handle_xts_dec_last_block]} + + ## xts second to last block + vsetivli zero, 4, e32, m1, ta, ma + vxor.vv $V24, $V24, $V29 + @{[aes_128_dec]} + vxor.vv $V24, $V24, $V29 + vmv.v.v $V25, $V24 + + # load last block ciphertext + vsetvli zero, $TAIL_LENGTH, e8, m1, tu, ma + vle8.v $V24, ($INPUT) + + # store second to last block plaintext + addi $T0, $OUTPUT, 16 + vse8.v $V25, ($T0) + + ## xts last block + vsetivli zero, 4, e32, m1, ta, ma + vxor.vv $V24, $V24, $V28 + @{[aes_128_dec]} + vxor.vv $V24, $V24, $V28 + + # store second to last block plaintext + vse32.v $V24, ($OUTPUT) + + ret +.size aes_xts_dec_128,.-aes_xts_dec_128 +___ + +$code .= <<___; +.p2align 3 +aes_xts_dec_192: + @{[init_first_round]} + @{[aes_192_load_key]} + + beqz $LEN32, 2f + + vsetvli $VL, $LEN32, e32, m4, ta, ma + j 1f + +.Ldec_blocks_192: + vsetvli $VL, $LEN32, e32, m4, ta, ma + # load ciphertext into v24 + vle32.v $V24, ($INPUT) + # update iv + @{[vgmul_vv $V16, $V20]} + # reverse the iv's bits order back + @{[vbrev8_v 
$V28, $V16]} +1: + vxor.vv $V24, $V24, $V28 + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + add $INPUT, $INPUT, $T0 + @{[aes_192_dec]} + vxor.vv $V24, $V24, $V28 + + # store plaintext + vse32.v $V24, ($OUTPUT) + add $OUTPUT, $OUTPUT, $T0 + + bnez $LEN32, .Ldec_blocks_192 + +2: + @{[handle_xts_dec_last_block]} + + ## xts second to last block + vsetivli zero, 4, e32, m1, ta, ma + vxor.vv $V24, $V24, $V29 + @{[aes_192_dec]} + vxor.vv $V24, $V24, $V29 + vmv.v.v $V25, $V24 + + # load last block ciphertext + vsetvli zero, $TAIL_LENGTH, e8, m1, tu, ma + vle8.v $V24, ($INPUT) + + # store second to last block plaintext + addi $T0, $OUTPUT, 16 + vse8.v $V25, ($T0) + + ## xts last block + vsetivli zero, 4, e32, m1, ta, ma + vxor.vv $V24, $V24, $V28 + @{[aes_192_dec]} + vxor.vv $V24, $V24, $V28 + + # store second to last block plaintext + vse32.v $V24, ($OUTPUT) + + ret +.size aes_xts_dec_192,.-aes_xts_dec_192 +___ + +$code .= <<___; +.p2align 3 +aes_xts_dec_256: + @{[init_first_round]} + @{[aes_256_load_key]} + + beqz $LEN32, 2f + + vsetvli $VL, $LEN32, e32, m4, ta, ma + j 1f + +.Ldec_blocks_256: + vsetvli $VL, $LEN32, e32, m4, ta, ma + # load ciphertext into v24 + vle32.v $V24, ($INPUT) + # update iv + @{[vgmul_vv $V16, $V20]} + # reverse the iv's bits order back + @{[vbrev8_v $V28, $V16]} +1: + vxor.vv $V24, $V24, $V28 + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + add $INPUT, $INPUT, $T0 + @{[aes_256_dec]} + vxor.vv $V24, $V24, $V28 + + # store plaintext + vse32.v $V24, ($OUTPUT) + add $OUTPUT, $OUTPUT, $T0 + + bnez $LEN32, .Ldec_blocks_256 + +2: + @{[handle_xts_dec_last_block]} + + ## xts second to last block + vsetivli zero, 4, e32, m1, ta, ma + vxor.vv $V24, $V24, $V29 + @{[aes_256_dec]} + vxor.vv $V24, $V24, $V29 + vmv.v.v $V25, $V24 + + # load last block ciphertext + vsetvli zero, $TAIL_LENGTH, e8, m1, tu, ma + vle8.v $V24, ($INPUT) + + # store second to last block plaintext + addi $T0, $OUTPUT, 16 + vse8.v $V25, ($T0) + + ## xts last block + vsetivli zero, 4, e32, m1, 
ta, ma + vxor.vv $V24, $V24, $V28 + @{[aes_256_dec]} + vxor.vv $V24, $V24, $V28 + + # store second to last block plaintext + vse32.v $V24, ($OUTPUT) + + ret +.size aes_xts_dec_256,.-aes_xts_dec_256 +___ +} + +print $code; + +close STDOUT or die "error closing STDOUT: $!"; diff --git a/arch/riscv/crypto/aes-riscv64-zvkned-zvkb.pl b/arch/riscv/crypto/aes-riscv64-zvkned-zvkb.pl new file mode 100644 index 000000000000..c3506e5523be --- /dev/null +++ b/arch/riscv/crypto/aes-riscv64-zvkned-zvkb.pl @@ -0,0 +1,415 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Jerry Shih +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 +# - RISC-V Vector Cryptography Bit-manipulation extension ('Zvkb') +# - RISC-V Vector AES block cipher extension ('Zvkned') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +.text +___ + +################################################################################ +# void rv64i_zvkb_zvkned_ctr32_encrypt_blocks(const unsigned char *in, +# unsigned char *out, size_t length, +# const void *key, +# unsigned char ivec[16]); +{ +my ($INP, $OUTP, $LEN, $KEYP, $IVP) = ("a0", "a1", "a2", "a3", "a4"); +my ($T0, $T1, $T2, $T3) = ("t0", "t1", "t2", "t3"); +my ($VL) = ("t4"); +my ($LEN32) = ("t5"); +my ($CTR) = ("t6"); +my ($MASK) = ("v0"); +my ($V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, + $V8, $V9, $V10, $V11, $V12, $V13, $V14, $V15, + $V16, $V17, $V18, $V19, $V20, $V21, $V22, $V23, + $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, +) = map("v$_",(0..31)); + +# Prepare the AES ctr input data into v16. 
+sub init_aes_ctr_input { + my $code=<<___; + # Setup mask into v0 + # The mask pattern for 4*N-th elements + # mask v0: [000100010001....] + # Note: + # We could setup the mask just for the maximum element length instead of + # the VLMAX. + li $T0, 0b10001000 + vsetvli $T2, zero, e8, m1, ta, ma + vmv.v.x $MASK, $T0 + # Load IV. + # v31:[IV0, IV1, IV2, big-endian count] + vsetivli zero, 4, e32, m1, ta, ma + vle32.v $V31, ($IVP) + # Convert the big-endian counter into little-endian. + vsetivli zero, 4, e32, m1, ta, mu + @{[vrev8_v $V31, $V31, $MASK]} + # Splat the IV to v16 + vsetvli zero, $LEN32, e32, m4, ta, ma + vmv.v.i $V16, 0 + @{[vaesz_vs $V16, $V31]} + # Prepare the ctr pattern into v20 + # v20: [x, x, x, 0, x, x, x, 1, x, x, x, 2, ...] + viota.m $V20, $MASK, $MASK.t + # v16:[IV0, IV1, IV2, count+0, IV0, IV1, IV2, count+1, ...] + vsetvli $VL, $LEN32, e32, m4, ta, mu + vadd.vv $V16, $V16, $V20, $MASK.t +___ + + return $code; +} + +$code .= <<___; +.p2align 3 +.globl rv64i_zvkb_zvkned_ctr32_encrypt_blocks +.type rv64i_zvkb_zvkned_ctr32_encrypt_blocks,\@function +rv64i_zvkb_zvkned_ctr32_encrypt_blocks: + # The aes block size is 16 bytes. + # We try to get the minimum aes block number including the tail data. + addi $T0, $LEN, 15 + # the minimum block number + srli $T0, $T0, 4 + # We make the block number become e32 length here. + slli $LEN32, $T0, 2 + + # Load key length. + lwu $T0, 480($KEYP) + li $T1, 32 + li $T2, 24 + li $T3, 16 + + beq $T0, $T1, ctr32_encrypt_blocks_256 + beq $T0, $T2, ctr32_encrypt_blocks_192 + beq $T0, $T3, ctr32_encrypt_blocks_128 + + ret +.size rv64i_zvkb_zvkned_ctr32_encrypt_blocks,.-rv64i_zvkb_zvkned_ctr32_encrypt_blocks +___ + +$code .= <<___; +.p2align 3 +ctr32_encrypt_blocks_128: + # Load all 11 round keys to v1-v11 registers. 
+ vsetivli zero, 4, e32, m1, ta, ma + vle32.v $V1, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V2, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V3, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V4, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V5, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V6, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V7, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V8, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V9, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V10, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V11, ($KEYP) + + @{[init_aes_ctr_input]} + + ##### AES body + j 2f +1: + vsetvli $VL, $LEN32, e32, m4, ta, mu + # Increase ctr in v16. + vadd.vx $V16, $V16, $CTR, $MASK.t +2: + # Prepare the AES ctr input into v24. + # The ctr data uses big-endian form. + vmv.v.v $V24, $V16 + @{[vrev8_v $V24, $V24, $MASK]} + srli $CTR, $VL, 2 + sub $LEN32, $LEN32, $VL + + # Load plaintext in bytes into v20. + vsetvli $T0, $LEN, e8, m4, ta, ma + vle8.v $V20, ($INP) + sub $LEN, $LEN, $T0 + add $INP, $INP, $T0 + + vsetvli zero, $VL, e32, m4, ta, ma + @{[vaesz_vs $V24, $V1]} + @{[vaesem_vs $V24, $V2]} + @{[vaesem_vs $V24, $V3]} + @{[vaesem_vs $V24, $V4]} + @{[vaesem_vs $V24, $V5]} + @{[vaesem_vs $V24, $V6]} + @{[vaesem_vs $V24, $V7]} + @{[vaesem_vs $V24, $V8]} + @{[vaesem_vs $V24, $V9]} + @{[vaesem_vs $V24, $V10]} + @{[vaesef_vs $V24, $V11]} + + # ciphertext + vsetvli zero, $T0, e8, m4, ta, ma + vxor.vv $V24, $V24, $V20 + + # Store the ciphertext. + vse8.v $V24, ($OUTP) + add $OUTP, $OUTP, $T0 + + bnez $LEN, 1b + + ## store ctr iv + vsetivli zero, 4, e32, m1, ta, mu + # Increase ctr in v16. + vadd.vx $V16, $V16, $CTR, $MASK.t + # Convert ctr data back to big-endian. + @{[vrev8_v $V16, $V16, $MASK]} + vse32.v $V16, ($IVP) + + ret +.size ctr32_encrypt_blocks_128,.-ctr32_encrypt_blocks_128 +___ + +$code .= <<___; +.p2align 3 +ctr32_encrypt_blocks_192: + # Load all 13 round keys to v1-v13 registers. 
+ vsetivli zero, 4, e32, m1, ta, ma + vle32.v $V1, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V2, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V3, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V4, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V5, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V6, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V7, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V8, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V9, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V10, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V11, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V12, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V13, ($KEYP) + + @{[init_aes_ctr_input]} + + ##### AES body + j 2f +1: + vsetvli $VL, $LEN32, e32, m4, ta, mu + # Increase ctr in v16. + vadd.vx $V16, $V16, $CTR, $MASK.t +2: + # Prepare the AES ctr input into v24. + # The ctr data uses big-endian form. + vmv.v.v $V24, $V16 + @{[vrev8_v $V24, $V24, $MASK]} + srli $CTR, $VL, 2 + sub $LEN32, $LEN32, $VL + + # Load plaintext in bytes into v20. + vsetvli $T0, $LEN, e8, m4, ta, ma + vle8.v $V20, ($INP) + sub $LEN, $LEN, $T0 + add $INP, $INP, $T0 + + vsetvli zero, $VL, e32, m4, ta, ma + @{[vaesz_vs $V24, $V1]} + @{[vaesem_vs $V24, $V2]} + @{[vaesem_vs $V24, $V3]} + @{[vaesem_vs $V24, $V4]} + @{[vaesem_vs $V24, $V5]} + @{[vaesem_vs $V24, $V6]} + @{[vaesem_vs $V24, $V7]} + @{[vaesem_vs $V24, $V8]} + @{[vaesem_vs $V24, $V9]} + @{[vaesem_vs $V24, $V10]} + @{[vaesem_vs $V24, $V11]} + @{[vaesem_vs $V24, $V12]} + @{[vaesef_vs $V24, $V13]} + + # ciphertext + vsetvli zero, $T0, e8, m4, ta, ma + vxor.vv $V24, $V24, $V20 + + # Store the ciphertext. + vse8.v $V24, ($OUTP) + add $OUTP, $OUTP, $T0 + + bnez $LEN, 1b + + ## store ctr iv + vsetivli zero, 4, e32, m1, ta, mu + # Increase ctr in v16. + vadd.vx $V16, $V16, $CTR, $MASK.t + # Convert ctr data back to big-endian. 
+ @{[vrev8_v $V16, $V16, $MASK]} + vse32.v $V16, ($IVP) + + ret +.size ctr32_encrypt_blocks_192,.-ctr32_encrypt_blocks_192 +___ + +$code .= <<___; +.p2align 3 +ctr32_encrypt_blocks_256: + # Load all 15 round keys to v1-v15 registers. + vsetivli zero, 4, e32, m1, ta, ma + vle32.v $V1, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V2, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V3, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V4, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V5, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V6, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V7, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V8, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V9, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V10, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V11, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V12, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V13, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V14, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V15, ($KEYP) + + @{[init_aes_ctr_input]} + + ##### AES body + j 2f +1: + vsetvli $VL, $LEN32, e32, m4, ta, mu + # Increase ctr in v16. + vadd.vx $V16, $V16, $CTR, $MASK.t +2: + # Prepare the AES ctr input into v24. + # The ctr data uses big-endian form. + vmv.v.v $V24, $V16 + @{[vrev8_v $V24, $V24, $MASK]} + srli $CTR, $VL, 2 + sub $LEN32, $LEN32, $VL + + # Load plaintext in bytes into v20. + vsetvli $T0, $LEN, e8, m4, ta, ma + vle8.v $V20, ($INP) + sub $LEN, $LEN, $T0 + add $INP, $INP, $T0 + + vsetvli zero, $VL, e32, m4, ta, ma + @{[vaesz_vs $V24, $V1]} + @{[vaesem_vs $V24, $V2]} + @{[vaesem_vs $V24, $V3]} + @{[vaesem_vs $V24, $V4]} + @{[vaesem_vs $V24, $V5]} + @{[vaesem_vs $V24, $V6]} + @{[vaesem_vs $V24, $V7]} + @{[vaesem_vs $V24, $V8]} + @{[vaesem_vs $V24, $V9]} + @{[vaesem_vs $V24, $V10]} + @{[vaesem_vs $V24, $V11]} + @{[vaesem_vs $V24, $V12]} + @{[vaesem_vs $V24, $V13]} + @{[vaesem_vs $V24, $V14]} + @{[vaesef_vs $V24, $V15]} + + # ciphertext + vsetvli zero, $T0, e8, m4, ta, ma + vxor.vv $V24, $V24, $V20 + + # Store the ciphertext. 
+ vse8.v $V24, ($OUTP) + add $OUTP, $OUTP, $T0 + + bnez $LEN, 1b + + ## store ctr iv + vsetivli zero, 4, e32, m1, ta, mu + # Increase ctr in v16. + vadd.vx $V16, $V16, $CTR, $MASK.t + # Convert ctr data back to big-endian. + @{[vrev8_v $V16, $V16, $MASK]} + vse32.v $V16, ($IVP) + + ret +.size ctr32_encrypt_blocks_256,.-ctr32_encrypt_blocks_256 +___ +} + +print $code; + +close STDOUT or die "error closing STDOUT: $!"; diff --git a/arch/riscv/crypto/aes-riscv64-zvkned.pl b/arch/riscv/crypto/aes-riscv64-zvkned.pl index 466357b4503c..1ac84fb660ba 100644 --- a/arch/riscv/crypto/aes-riscv64-zvkned.pl +++ b/arch/riscv/crypto/aes-riscv64-zvkned.pl @@ -67,6 +67,752 @@ my ($V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, ) = map("v$_",(0..31)); +# Load all 11 round keys to v1-v11 registers. +sub aes_128_load_key { + my $KEYP = shift; + + my $code=<<___; + vsetivli zero, 4, e32, m1, ta, ma + vle32.v $V1, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V2, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V3, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V4, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V5, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V6, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V7, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V8, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V9, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V10, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V11, ($KEYP) +___ + + return $code; +} + +# Load all 13 round keys to v1-v13 registers. 
+sub aes_192_load_key { + my $KEYP = shift; + + my $code=<<___; + vsetivli zero, 4, e32, m1, ta, ma + vle32.v $V1, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V2, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V3, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V4, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V5, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V6, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V7, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V8, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V9, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V10, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V11, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V12, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V13, ($KEYP) +___ + + return $code; +} + +# Load all 15 round keys to v1-v15 registers. +sub aes_256_load_key { + my $KEYP = shift; + + my $code=<<___; + vsetivli zero, 4, e32, m1, ta, ma + vle32.v $V1, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V2, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V3, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V4, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V5, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V6, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V7, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V8, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V9, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V10, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V11, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V12, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V13, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V14, ($KEYP) + addi $KEYP, $KEYP, 16 + vle32.v $V15, ($KEYP) +___ + + return $code; +} + +# aes-128 encryption with round keys v1-v11 +sub aes_128_encrypt { + my $code=<<___; + @{[vaesz_vs $V24, $V1]} # with round key w[ 0, 3] + @{[vaesem_vs $V24, $V2]} # with round key w[ 4, 7] + @{[vaesem_vs $V24, $V3]} # with round key w[ 8,11] + @{[vaesem_vs $V24, $V4]} # with round key w[12,15] + @{[vaesem_vs $V24, $V5]} # with round key w[16,19] + @{[vaesem_vs $V24, $V6]} # with round key w[20,23] + @{[vaesem_vs 
$V24, $V7]} # with round key w[24,27] + @{[vaesem_vs $V24, $V8]} # with round key w[28,31] + @{[vaesem_vs $V24, $V9]} # with round key w[32,35] + @{[vaesem_vs $V24, $V10]} # with round key w[36,39] + @{[vaesef_vs $V24, $V11]} # with round key w[40,43] +___ + + return $code; +} + +# aes-128 decryption with round keys v1-v11 +sub aes_128_decrypt { + my $code=<<___; + @{[vaesz_vs $V24, $V11]} # with round key w[40,43] + @{[vaesdm_vs $V24, $V10]} # with round key w[36,39] + @{[vaesdm_vs $V24, $V9]} # with round key w[32,35] + @{[vaesdm_vs $V24, $V8]} # with round key w[28,31] + @{[vaesdm_vs $V24, $V7]} # with round key w[24,27] + @{[vaesdm_vs $V24, $V6]} # with round key w[20,23] + @{[vaesdm_vs $V24, $V5]} # with round key w[16,19] + @{[vaesdm_vs $V24, $V4]} # with round key w[12,15] + @{[vaesdm_vs $V24, $V3]} # with round key w[ 8,11] + @{[vaesdm_vs $V24, $V2]} # with round key w[ 4, 7] + @{[vaesdf_vs $V24, $V1]} # with round key w[ 0, 3] +___ + + return $code; +} + +# aes-192 encryption with round keys v1-v13 +sub aes_192_encrypt { + my $code=<<___; + @{[vaesz_vs $V24, $V1]} # with round key w[ 0, 3] + @{[vaesem_vs $V24, $V2]} # with round key w[ 4, 7] + @{[vaesem_vs $V24, $V3]} # with round key w[ 8,11] + @{[vaesem_vs $V24, $V4]} # with round key w[12,15] + @{[vaesem_vs $V24, $V5]} # with round key w[16,19] + @{[vaesem_vs $V24, $V6]} # with round key w[20,23] + @{[vaesem_vs $V24, $V7]} # with round key w[24,27] + @{[vaesem_vs $V24, $V8]} # with round key w[28,31] + @{[vaesem_vs $V24, $V9]} # with round key w[32,35] + @{[vaesem_vs $V24, $V10]} # with round key w[36,39] + @{[vaesem_vs $V24, $V11]} # with round key w[40,43] + @{[vaesem_vs $V24, $V12]} # with round key w[44,47] + @{[vaesef_vs $V24, $V13]} # with round key w[48,51] +___ + + return $code; +} + +# aes-192 decryption with round keys v1-v13 +sub aes_192_decrypt { + my $code=<<___; + @{[vaesz_vs $V24, $V13]} # with round key w[48,51] + @{[vaesdm_vs $V24, $V12]} # with round key w[44,47] + @{[vaesdm_vs $V24, 
$V11]} # with round key w[40,43] + @{[vaesdm_vs $V24, $V10]} # with round key w[36,39] + @{[vaesdm_vs $V24, $V9]} # with round key w[32,35] + @{[vaesdm_vs $V24, $V8]} # with round key w[28,31] + @{[vaesdm_vs $V24, $V7]} # with round key w[24,27] + @{[vaesdm_vs $V24, $V6]} # with round key w[20,23] + @{[vaesdm_vs $V24, $V5]} # with round key w[16,19] + @{[vaesdm_vs $V24, $V4]} # with round key w[12,15] + @{[vaesdm_vs $V24, $V3]} # with round key w[ 8,11] + @{[vaesdm_vs $V24, $V2]} # with round key w[ 4, 7] + @{[vaesdf_vs $V24, $V1]} # with round key w[ 0, 3] +___ + + return $code; +} + +# aes-256 encryption with round keys v1-v15 +sub aes_256_encrypt { + my $code=<<___; + @{[vaesz_vs $V24, $V1]} # with round key w[ 0, 3] + @{[vaesem_vs $V24, $V2]} # with round key w[ 4, 7] + @{[vaesem_vs $V24, $V3]} # with round key w[ 8,11] + @{[vaesem_vs $V24, $V4]} # with round key w[12,15] + @{[vaesem_vs $V24, $V5]} # with round key w[16,19] + @{[vaesem_vs $V24, $V6]} # with round key w[20,23] + @{[vaesem_vs $V24, $V7]} # with round key w[24,27] + @{[vaesem_vs $V24, $V8]} # with round key w[28,31] + @{[vaesem_vs $V24, $V9]} # with round key w[32,35] + @{[vaesem_vs $V24, $V10]} # with round key w[36,39] + @{[vaesem_vs $V24, $V11]} # with round key w[40,43] + @{[vaesem_vs $V24, $V12]} # with round key w[44,47] + @{[vaesem_vs $V24, $V13]} # with round key w[48,51] + @{[vaesem_vs $V24, $V14]} # with round key w[52,55] + @{[vaesef_vs $V24, $V15]} # with round key w[56,59] +___ + + return $code; +} + +# aes-256 decryption with round keys v1-v15 +sub aes_256_decrypt { + my $code=<<___; + @{[vaesz_vs $V24, $V15]} # with round key w[56,59] + @{[vaesdm_vs $V24, $V14]} # with round key w[52,55] + @{[vaesdm_vs $V24, $V13]} # with round key w[48,51] + @{[vaesdm_vs $V24, $V12]} # with round key w[44,47] + @{[vaesdm_vs $V24, $V11]} # with round key w[40,43] + @{[vaesdm_vs $V24, $V10]} # with round key w[36,39] + @{[vaesdm_vs $V24, $V9]} # with round key w[32,35] + @{[vaesdm_vs $V24, $V8]} # 
with round key w[28,31] + @{[vaesdm_vs $V24, $V7]} # with round key w[24,27] + @{[vaesdm_vs $V24, $V6]} # with round key w[20,23] + @{[vaesdm_vs $V24, $V5]} # with round key w[16,19] + @{[vaesdm_vs $V24, $V4]} # with round key w[12,15] + @{[vaesdm_vs $V24, $V3]} # with round key w[ 8,11] + @{[vaesdm_vs $V24, $V2]} # with round key w[ 4, 7] + @{[vaesdf_vs $V24, $V1]} # with round key w[ 0, 3] +___ + + return $code; +} + +{ +############################################################################### +# void rv64i_zvkned_cbc_encrypt(const unsigned char *in, unsigned char *out, +# size_t length, const AES_KEY *key, +# unsigned char *ivec, const int enc); +my ($INP, $OUTP, $LEN, $KEYP, $IVP, $ENC) = ("a0", "a1", "a2", "a3", "a4", "a5"); +my ($T0, $T1) = ("t0", "t1"); + +$code .= <<___; +.p2align 3 +.globl rv64i_zvkned_cbc_encrypt +.type rv64i_zvkned_cbc_encrypt,\@function +rv64i_zvkned_cbc_encrypt: + # check whether the length is a multiple of 16 and >= 16 + li $T1, 16 + blt $LEN, $T1, L_end + andi $T1, $LEN, 15 + bnez $T1, L_end + + # Load key length. + lwu $T0, 480($KEYP) + + # Get proper routine for key length. + li $T1, 16 + beq $T1, $T0, L_cbc_enc_128 + + li $T1, 24 + beq $T1, $T0, L_cbc_enc_192 + + li $T1, 32 + beq $T1, $T0, L_cbc_enc_256 + + ret +.size rv64i_zvkned_cbc_encrypt,.-rv64i_zvkned_cbc_encrypt +___ + +$code .= <<___; +.p2align 3 +L_cbc_enc_128: + # Load all 11 round keys to v1-v11 registers. + @{[aes_128_load_key $KEYP]} + + # Load IV. + vle32.v $V16, ($IVP) + + vle32.v $V24, ($INP) + vxor.vv $V24, $V24, $V16 + j 2f + +1: + vle32.v $V17, ($INP) + vxor.vv $V24, $V24, $V17 + +2: + # AES body + @{[aes_128_encrypt]} + + vse32.v $V24, ($OUTP) + + addi $INP, $INP, 16 + addi $OUTP, $OUTP, 16 + addi $LEN, $LEN, -16 + + bnez $LEN, 1b + + vse32.v $V24, ($IVP) + + ret +.size L_cbc_enc_128,.-L_cbc_enc_128 +___ + +$code .= <<___; +.p2align 3 +L_cbc_enc_192: + # Load all 13 round keys to v1-v13 registers. + @{[aes_192_load_key $KEYP]} + + # Load IV.
+ vle32.v $V16, ($IVP) + + vle32.v $V24, ($INP) + vxor.vv $V24, $V24, $V16 + j 2f + +1: + vle32.v $V17, ($INP) + vxor.vv $V24, $V24, $V17 + +2: + # AES body + @{[aes_192_encrypt]} + + vse32.v $V24, ($OUTP) + + addi $INP, $INP, 16 + addi $OUTP, $OUTP, 16 + addi $LEN, $LEN, -16 + + bnez $LEN, 1b + + vse32.v $V24, ($IVP) + + ret +.size L_cbc_enc_192,.-L_cbc_enc_192 +___ + +$code .= <<___; +.p2align 3 +L_cbc_enc_256: + # Load all 15 round keys to v1-v15 registers. + @{[aes_256_load_key $KEYP]} + + # Load IV. + vle32.v $V16, ($IVP) + + vle32.v $V24, ($INP) + vxor.vv $V24, $V24, $V16 + j 2f + +1: + vle32.v $V17, ($INP) + vxor.vv $V24, $V24, $V17 + +2: + # AES body + @{[aes_256_encrypt]} + + vse32.v $V24, ($OUTP) + + addi $INP, $INP, 16 + addi $OUTP, $OUTP, 16 + addi $LEN, $LEN, -16 + + bnez $LEN, 1b + + vse32.v $V24, ($IVP) + + ret +.size L_cbc_enc_256,.-L_cbc_enc_256 +___ + +############################################################################### +# void rv64i_zvkned_cbc_decrypt(const unsigned char *in, unsigned char *out, +# size_t length, const AES_KEY *key, +# unsigned char *ivec, const int enc); +$code .= <<___; +.p2align 3 +.globl rv64i_zvkned_cbc_decrypt +.type rv64i_zvkned_cbc_decrypt,\@function +rv64i_zvkned_cbc_decrypt: + # check whether the length is a multiple of 16 and >= 16 + li $T1, 16 + blt $LEN, $T1, L_end + andi $T1, $LEN, 15 + bnez $T1, L_end + + # Load key length. + lwu $T0, 480($KEYP) + + # Get proper routine for key length. + li $T1, 16 + beq $T1, $T0, L_cbc_dec_128 + + li $T1, 24 + beq $T1, $T0, L_cbc_dec_192 + + li $T1, 32 + beq $T1, $T0, L_cbc_dec_256 + + ret +.size rv64i_zvkned_cbc_decrypt,.-rv64i_zvkned_cbc_decrypt +___ + +$code .= <<___; +.p2align 3 +L_cbc_dec_128: + # Load all 11 round keys to v1-v11 registers. + @{[aes_128_load_key $KEYP]} + + # Load IV. 
+ vle32.v $V16, ($IVP) + + vle32.v $V24, ($INP) + vmv.v.v $V17, $V24 + j 2f + +1: + vle32.v $V24, ($INP) + vmv.v.v $V17, $V24 + addi $OUTP, $OUTP, 16 + +2: + # AES body + @{[aes_128_decrypt]} + + vxor.vv $V24, $V24, $V16 + vse32.v $V24, ($OUTP) + vmv.v.v $V16, $V17 + + addi $LEN, $LEN, -16 + addi $INP, $INP, 16 + + bnez $LEN, 1b + + vse32.v $V16, ($IVP) + + ret +.size L_cbc_dec_128,.-L_cbc_dec_128 +___ + +$code .= <<___; +.p2align 3 +L_cbc_dec_192: + # Load all 13 round keys to v1-v13 registers. + @{[aes_192_load_key $KEYP]} + + # Load IV. + vle32.v $V16, ($IVP) + + vle32.v $V24, ($INP) + vmv.v.v $V17, $V24 + j 2f + +1: + vle32.v $V24, ($INP) + vmv.v.v $V17, $V24 + addi $OUTP, $OUTP, 16 + +2: + # AES body + @{[aes_192_decrypt]} + + vxor.vv $V24, $V24, $V16 + vse32.v $V24, ($OUTP) + vmv.v.v $V16, $V17 + + addi $LEN, $LEN, -16 + addi $INP, $INP, 16 + + bnez $LEN, 1b + + vse32.v $V16, ($IVP) + + ret +.size L_cbc_dec_192,.-L_cbc_dec_192 +___ + +$code .= <<___; +.p2align 3 +L_cbc_dec_256: + # Load all 15 round keys to v1-v15 registers. + @{[aes_256_load_key $KEYP]} + + # Load IV. 
+ vle32.v $V16, ($IVP) + + vle32.v $V24, ($INP) + vmv.v.v $V17, $V24 + j 2f + +1: + vle32.v $V24, ($INP) + vmv.v.v $V17, $V24 + addi $OUTP, $OUTP, 16 + +2: + # AES body + @{[aes_256_decrypt]} + + vxor.vv $V24, $V24, $V16 + vse32.v $V24, ($OUTP) + vmv.v.v $V16, $V17 + + addi $LEN, $LEN, -16 + addi $INP, $INP, 16 + + bnez $LEN, 1b + + vse32.v $V16, ($IVP) + + ret +.size L_cbc_dec_256,.-L_cbc_dec_256 +___ +} + +{ +############################################################################### +# void rv64i_zvkned_ecb_encrypt(const unsigned char *in, unsigned char *out, +# size_t length, const AES_KEY *key, +# const int enc); +my ($INP, $OUTP, $LEN, $KEYP, $ENC) = ("a0", "a1", "a2", "a3", "a4"); +my ($VL) = ("a5"); +my ($LEN32) = ("a6"); +my ($T0, $T1) = ("t0", "t1"); + +$code .= <<___; +.p2align 3 +.globl rv64i_zvkned_ecb_encrypt +.type rv64i_zvkned_ecb_encrypt,\@function +rv64i_zvkned_ecb_encrypt: + # Make the LEN become e32 length. + srli $LEN32, $LEN, 2 + + # Load key length. + lwu $T0, 480($KEYP) + + # Get proper routine for key length. + li $T1, 16 + beq $T1, $T0, L_ecb_enc_128 + + li $T1, 24 + beq $T1, $T0, L_ecb_enc_192 + + li $T1, 32 + beq $T1, $T0, L_ecb_enc_256 + + ret +.size rv64i_zvkned_ecb_encrypt,.-rv64i_zvkned_ecb_encrypt +___ + +$code .= <<___; +.p2align 3 +L_ecb_enc_128: + # Load all 11 round keys to v1-v11 registers. + @{[aes_128_load_key $KEYP]} + +1: + vsetvli $VL, $LEN32, e32, m4, ta, ma + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + + vle32.v $V24, ($INP) + + # AES body + @{[aes_128_encrypt]} + + vse32.v $V24, ($OUTP) + + add $INP, $INP, $T0 + add $OUTP, $OUTP, $T0 + + bnez $LEN32, 1b + + ret +.size L_ecb_enc_128,.-L_ecb_enc_128 +___ + +$code .= <<___; +.p2align 3 +L_ecb_enc_192: + # Load all 13 round keys to v1-v13 registers. 
+ @{[aes_192_load_key $KEYP]} + +1: + vsetvli $VL, $LEN32, e32, m4, ta, ma + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + + vle32.v $V24, ($INP) + + # AES body + @{[aes_192_encrypt]} + + vse32.v $V24, ($OUTP) + + add $INP, $INP, $T0 + add $OUTP, $OUTP, $T0 + + bnez $LEN32, 1b + + ret +.size L_ecb_enc_192,.-L_ecb_enc_192 +___ + +$code .= <<___; +.p2align 3 +L_ecb_enc_256: + # Load all 15 round keys to v1-v15 registers. + @{[aes_256_load_key $KEYP]} + +1: + vsetvli $VL, $LEN32, e32, m4, ta, ma + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + + vle32.v $V24, ($INP) + + # AES body + @{[aes_256_encrypt]} + + vse32.v $V24, ($OUTP) + + add $INP, $INP, $T0 + add $OUTP, $OUTP, $T0 + + bnez $LEN32, 1b + + ret +.size L_ecb_enc_256,.-L_ecb_enc_256 +___ + +############################################################################### +# void rv64i_zvkned_ecb_decrypt(const unsigned char *in, unsigned char *out, +# size_t length, const AES_KEY *key, +# const int enc); +$code .= <<___; +.p2align 3 +.globl rv64i_zvkned_ecb_decrypt +.type rv64i_zvkned_ecb_decrypt,\@function +rv64i_zvkned_ecb_decrypt: + # Make the LEN become e32 length. + srli $LEN32, $LEN, 2 + + # Load key length. + lwu $T0, 480($KEYP) + + # Get proper routine for key length. + li $T1, 16 + beq $T1, $T0, L_ecb_dec_128 + + li $T1, 24 + beq $T1, $T0, L_ecb_dec_192 + + li $T1, 32 + beq $T1, $T0, L_ecb_dec_256 + + ret +.size rv64i_zvkned_ecb_decrypt,.-rv64i_zvkned_ecb_decrypt +___ + +$code .= <<___; +.p2align 3 +L_ecb_dec_128: + # Load all 11 round keys to v1-v11 registers. + @{[aes_128_load_key $KEYP]} + +1: + vsetvli $VL, $LEN32, e32, m4, ta, ma + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + + vle32.v $V24, ($INP) + + # AES body + @{[aes_128_decrypt]} + + vse32.v $V24, ($OUTP) + + add $INP, $INP, $T0 + add $OUTP, $OUTP, $T0 + + bnez $LEN32, 1b + + ret +.size L_ecb_dec_128,.-L_ecb_dec_128 +___ + +$code .= <<___; +.p2align 3 +L_ecb_dec_192: + # Load all 13 round keys to v1-v13 registers. 
+ @{[aes_192_load_key $KEYP]} + +1: + vsetvli $VL, $LEN32, e32, m4, ta, ma + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + + vle32.v $V24, ($INP) + + # AES body + @{[aes_192_decrypt]} + + vse32.v $V24, ($OUTP) + + add $INP, $INP, $T0 + add $OUTP, $OUTP, $T0 + + bnez $LEN32, 1b + + ret +.size L_ecb_dec_192,.-L_ecb_dec_192 +___ + +$code .= <<___; +.p2align 3 +L_ecb_dec_256: + # Load all 15 round keys to v1-v15 registers. + @{[aes_256_load_key $KEYP]} + +1: + vsetvli $VL, $LEN32, e32, m4, ta, ma + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + + vle32.v $V24, ($INP) + + # AES body + @{[aes_256_decrypt]} + + vse32.v $V24, ($OUTP) + + add $INP, $INP, $T0 + add $OUTP, $OUTP, $T0 + + bnez $LEN32, 1b + + ret +.size L_ecb_dec_256,.-L_ecb_dec_256 +___ +} + { ################################################################################ # void rv64i_zvkned_encrypt(const unsigned char *in, unsigned char *out, From patchwork Tue Dec 5 09:27:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13479638 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 34147C10DCE for ; Tue, 5 Dec 2023 09:28:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=OhYR7c3h8OudUw8s47AjquNAgdt3w+Ui0lViaF8LJPo=; 
From: Jerry Shih
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org, conor@kernel.org
Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org
Subject: [PATCH v3 07/12] RISC-V: crypto: add Zvkg accelerated GCM GHASH implementation
Date: Tue, 5 Dec 2023 17:27:56 +0800
Message-Id: <20231205092801.1335-8-jerry.shih@sifive.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20231205092801.1335-1-jerry.shih@sifive.com>
References: <20231205092801.1335-1-jerry.shih@sifive.com>
MIME-Version: 1.0
Add a GCM GHASH implementation using the Zvkg extension, based on the OpenSSL implementation (openssl/openssl#21923).

The perlasm here differs from the original OpenSSL implementation. OpenSSL assumes that H is stored in little-endian form, so it has to convert H to big-endian before using the Zvkg instructions. The kernel already holds H in big-endian form, so no endian conversion is needed.

Co-developed-by: Christoph Müllner
Signed-off-by: Christoph Müllner
Co-developed-by: Heiko Stuebner
Signed-off-by: Heiko Stuebner
Signed-off-by: Jerry Shih
---
Changelog v3:
 - Use asm mnemonics for the instructions in RVV 1.0 extension.

Changelog v2:
 - Do not turn on kconfig `GHASH_RISCV64` option by default.
 - Add `asmlinkage` qualifier for crypto asm function.
 - Update the ghash fallback path in ghash_blocks().
 - Rename structure riscv64_ghash_context to riscv64_ghash_tfm_ctx.
 - Fold ghash_update_zvkg() and ghash_final_zvkg().
 - Initialize the members of riscv64_ghash_alg_zvkg in declaration order.
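Reviewer note: the block update that the Zvkg loop in this patch is meant to perform, Xi = (Xi xor block) * H over big-endian 16-byte blocks, can be sketched in portable C as below. This is only a bit-serial reference following NIST SP 800-38D, not the code in this patch and not the kernel's gf128mul helpers. The names `gf128_mul_ref` and `ghash_update_ref` are hypothetical, and the routine is not constant time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define GHASH_BLOCK_BYTES 16

/*
 * z = x * y in GF(2^128) using the GCM bit order of NIST SP 800-38D:
 * bit 0 is the most significant bit of byte 0.
 * Hypothetical reference helper; bit-serial and not constant time.
 */
static void gf128_mul_ref(uint8_t z[16], const uint8_t x[16],
			  const uint8_t y[16])
{
	uint8_t v[16];
	uint8_t acc[16] = { 0 };

	memcpy(v, y, 16);
	for (int i = 0; i < 128; i++) {
		/* Add the current multiple of y if bit i of x is set. */
		if (x[i / 8] & (0x80u >> (i % 8)))
			for (int j = 0; j < 16; j++)
				acc[j] ^= v[j];

		/* Multiply v by the field element x: shift right, then reduce. */
		int carry = v[15] & 1;

		for (int j = 15; j > 0; j--)
			v[j] = (uint8_t)((v[j] >> 1) | (v[j - 1] << 7));
		v[0] >>= 1;
		if (carry)
			v[0] ^= 0xe1;
	}
	memcpy(z, acc, 16);
}

/*
 * Fold len bytes (a nonzero multiple of 16) into the running hash xi.
 * Both xi and h are plain big-endian byte strings, so no byte swapping
 * is needed before or after the multiplication.
 */
static void ghash_update_ref(uint8_t xi[16], const uint8_t h[16],
			     const uint8_t *inp, size_t len)
{
	assert(len != 0 && len % GHASH_BLOCK_BYTES == 0);

	do {
		for (int j = 0; j < GHASH_BLOCK_BYTES; j++)
			xi[j] ^= inp[j];
		gf128_mul_ref(xi, xi, h);
		inp += GHASH_BLOCK_BYTES;
		len -= GHASH_BLOCK_BYTES;
	} while (len);
}
```

A reference like this can be useful for comparing the output of the vector path against a slow but simple implementation on the same input.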
--- arch/riscv/crypto/Kconfig | 10 ++ arch/riscv/crypto/Makefile | 7 + arch/riscv/crypto/ghash-riscv64-glue.c | 175 ++++++++++++++++++++++++ arch/riscv/crypto/ghash-riscv64-zvkg.pl | 100 ++++++++++++++ 4 files changed, 292 insertions(+) create mode 100644 arch/riscv/crypto/ghash-riscv64-glue.c create mode 100644 arch/riscv/crypto/ghash-riscv64-zvkg.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index 9d991ddda289..6863f01a2ab0 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -34,4 +34,14 @@ config CRYPTO_AES_BLOCK_RISCV64 - Zvkb vector crypto extension (CTR/XTS) - Zvkg vector crypto extension (XTS) +config CRYPTO_GHASH_RISCV64 + tristate "Hash functions: GHASH" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_GCM + help + GCM GHASH function (NIST SP 800-38D) + + Architecture: riscv64 using: + - Zvkg vector crypto extension + endmenu diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index 9574b009762f..94a7f8eaa8a7 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -9,6 +9,9 @@ aes-riscv64-y := aes-riscv64-glue.o aes-riscv64-zvkned.o obj-$(CONFIG_CRYPTO_AES_BLOCK_RISCV64) += aes-block-riscv64.o aes-block-riscv64-y := aes-riscv64-block-mode-glue.o aes-riscv64-zvkned-zvbb-zvkg.o aes-riscv64-zvkned-zvkb.o +obj-$(CONFIG_CRYPTO_GHASH_RISCV64) += ghash-riscv64.o +ghash-riscv64-y := ghash-riscv64-glue.o ghash-riscv64-zvkg.o + quiet_cmd_perlasm = PERLASM $@ cmd_perlasm = $(PERL) $(<) void $(@) @@ -21,6 +24,10 @@ $(obj)/aes-riscv64-zvkned-zvbb-zvkg.S: $(src)/aes-riscv64-zvkned-zvbb-zvkg.pl $(obj)/aes-riscv64-zvkned-zvkb.S: $(src)/aes-riscv64-zvkned-zvkb.pl $(call cmd,perlasm) +$(obj)/ghash-riscv64-zvkg.S: $(src)/ghash-riscv64-zvkg.pl + $(call cmd,perlasm) + clean-files += aes-riscv64-zvkned.S clean-files += aes-riscv64-zvkned-zvbb-zvkg.S clean-files += aes-riscv64-zvkned-zvkb.S +clean-files += ghash-riscv64-zvkg.S diff --git a/arch/riscv/crypto/ghash-riscv64-glue.c 
b/arch/riscv/crypto/ghash-riscv64-glue.c new file mode 100644 index 000000000000..b01ab5714677 --- /dev/null +++ b/arch/riscv/crypto/ghash-riscv64-glue.c @@ -0,0 +1,175 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * RISC-V optimized GHASH routines + * + * Copyright (C) 2023 VRULL GmbH + * Author: Heiko Stuebner + * + * Copyright (C) 2023 SiFive, Inc. + * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* ghash using zvkg vector crypto extension */ +asmlinkage void gcm_ghash_rv64i_zvkg(be128 *Xi, const be128 *H, const u8 *inp, + size_t len); + +struct riscv64_ghash_tfm_ctx { + be128 key; +}; + +struct riscv64_ghash_desc_ctx { + be128 shash; + u8 buffer[GHASH_BLOCK_SIZE]; + u32 bytes; +}; + +static inline void ghash_blocks(const struct riscv64_ghash_tfm_ctx *tctx, + struct riscv64_ghash_desc_ctx *dctx, + const u8 *src, size_t srclen) +{ + /* The srclen is nonzero and a multiple of 16. */ + if (crypto_simd_usable()) { + kernel_vector_begin(); + gcm_ghash_rv64i_zvkg(&dctx->shash, &tctx->key, src, srclen); + kernel_vector_end(); + } else { + do { + crypto_xor((u8 *)&dctx->shash, src, GHASH_BLOCK_SIZE); + gf128mul_lle(&dctx->shash, &tctx->key); + srclen -= GHASH_BLOCK_SIZE; + src += GHASH_BLOCK_SIZE; + } while (srclen); + } +} + +static int ghash_init(struct shash_desc *desc) +{ + struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc); + + *dctx = (struct riscv64_ghash_desc_ctx){}; + + return 0; +} + +static int ghash_update_zvkg(struct shash_desc *desc, const u8 *src, + unsigned int srclen) +{ + size_t len; + const struct riscv64_ghash_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); + struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc); + + if (dctx->bytes) { + if (dctx->bytes + srclen < GHASH_BLOCK_SIZE) { + memcpy(dctx->buffer + dctx->bytes, src, srclen); + dctx->bytes += srclen; + return 0; + } + memcpy(dctx->buffer + dctx->bytes, src, + GHASH_BLOCK_SIZE - dctx->bytes); + + 
ghash_blocks(tctx, dctx, dctx->buffer, GHASH_BLOCK_SIZE); + + src += GHASH_BLOCK_SIZE - dctx->bytes; + srclen -= GHASH_BLOCK_SIZE - dctx->bytes; + dctx->bytes = 0; + } + len = srclen & ~(GHASH_BLOCK_SIZE - 1); + + if (len) { + ghash_blocks(tctx, dctx, src, len); + src += len; + srclen -= len; + } + + if (srclen) { + memcpy(dctx->buffer, src, srclen); + dctx->bytes = srclen; + } + + return 0; +} + +static int ghash_final_zvkg(struct shash_desc *desc, u8 *out) +{ + const struct riscv64_ghash_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); + struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc); + int i; + + if (dctx->bytes) { + for (i = dctx->bytes; i < GHASH_BLOCK_SIZE; i++) + dctx->buffer[i] = 0; + + ghash_blocks(tctx, dctx, dctx->buffer, GHASH_BLOCK_SIZE); + } + + memcpy(out, &dctx->shash, GHASH_DIGEST_SIZE); + + return 0; +} + +static int ghash_setkey(struct crypto_shash *tfm, const u8 *key, + unsigned int keylen) +{ + struct riscv64_ghash_tfm_ctx *tctx = crypto_shash_ctx(tfm); + + if (keylen != GHASH_BLOCK_SIZE) + return -EINVAL; + + memcpy(&tctx->key, key, GHASH_BLOCK_SIZE); + + return 0; +} + +static struct shash_alg riscv64_ghash_alg_zvkg = { + .init = ghash_init, + .update = ghash_update_zvkg, + .final = ghash_final_zvkg, + .setkey = ghash_setkey, + .descsize = sizeof(struct riscv64_ghash_desc_ctx), + .digestsize = GHASH_DIGEST_SIZE, + .base = { + .cra_blocksize = GHASH_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct riscv64_ghash_tfm_ctx), + .cra_priority = 303, + .cra_name = "ghash", + .cra_driver_name = "ghash-riscv64-zvkg", + .cra_module = THIS_MODULE, + }, +}; + +static inline bool check_ghash_ext(void) +{ + return riscv_isa_extension_available(NULL, ZVKG) && + riscv_vector_vlen() >= 128; +} + +static int __init riscv64_ghash_mod_init(void) +{ + if (check_ghash_ext()) + return crypto_register_shash(&riscv64_ghash_alg_zvkg); + + return -ENODEV; +} + +static void __exit riscv64_ghash_mod_fini(void) +{ + crypto_unregister_shash(&riscv64_ghash_alg_zvkg); +} + 
+module_init(riscv64_ghash_mod_init); +module_exit(riscv64_ghash_mod_fini); + +MODULE_DESCRIPTION("GCM GHASH (RISC-V accelerated)"); +MODULE_AUTHOR("Heiko Stuebner "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("ghash"); diff --git a/arch/riscv/crypto/ghash-riscv64-zvkg.pl b/arch/riscv/crypto/ghash-riscv64-zvkg.pl new file mode 100644 index 000000000000..a414d77554d2 --- /dev/null +++ b/arch/riscv/crypto/ghash-riscv64-zvkg.pl @@ -0,0 +1,100 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Christoph Müllner +# Copyright (c) 2023, Jerry Shih +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 +# - RISC-V Vector GCM/GMAC extension ('Zvkg') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? 
shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +.text +___ + +############################################################################### +# void gcm_ghash_rv64i_zvkg(be128 *Xi, const be128 *H, const u8 *inp, size_t len) +# +# input: Xi: current hash value +# H: hash key +# inp: pointer to input data +# len: length of input data in bytes (multiple of block size) +# output: Xi: Xi+1 (next hash value Xi) +{ +my ($Xi,$H,$inp,$len) = ("a0","a1","a2","a3"); +my ($vXi,$vH,$vinp,$Vzero) = ("v1","v2","v3","v4"); + +$code .= <<___; +.p2align 3 +.globl gcm_ghash_rv64i_zvkg +.type gcm_ghash_rv64i_zvkg,\@function +gcm_ghash_rv64i_zvkg: + vsetivli zero, 4, e32, m1, ta, ma + vle32.v $vH, ($H) + vle32.v $vXi, ($Xi) + +Lstep: + vle32.v $vinp, ($inp) + add $inp, $inp, 16 + add $len, $len, -16 + @{[vghsh_vv $vXi, $vH, $vinp]} + bnez $len, Lstep + + vse32.v $vXi, ($Xi) + ret + +.size gcm_ghash_rv64i_zvkg,.-gcm_ghash_rv64i_zvkg +___ +} + +print $code; + +close STDOUT or die "error closing STDOUT: $!"; From patchwork Tue Dec 5 09:27:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13479640 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E691BC10F05 for ; Tue, 5 Dec 2023 09:28:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: 
From: Jerry Shih
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org, conor@kernel.org
Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org
Subject: [PATCH v3 08/12] RISC-V: crypto: add Zvknha/b accelerated SHA224/256 implementations
Date: Tue, 5 Dec 2023 17:27:57 +0800
Message-Id: <20231205092801.1335-9-jerry.shih@sifive.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20231205092801.1335-1-jerry.shih@sifive.com>
References: <20231205092801.1335-1-jerry.shih@sifive.com>
MIME-Version: 1.0
Add SHA-224 and SHA-256 implementations using the Zvknha or Zvknhb vector crypto extensions, based on the OpenSSL implementation (openssl/openssl#21923).

Co-developed-by: Charalampos Mitrodimas
Signed-off-by: Charalampos Mitrodimas
Co-developed-by: Heiko Stuebner
Signed-off-by: Heiko Stuebner
Co-developed-by: Phoebe Chen
Signed-off-by: Phoebe Chen
Signed-off-by: Jerry Shih
---
Changelog v3:
 - Use `SYM_TYPED_FUNC_START` for sha256 indirect-call asm symbol.
 - Use asm mnemonics for the instructions in RVV 1.0 extension.

Changelog v2:
 - Do not turn on kconfig `SHA256_RISCV64` option by default.
 - Add `asmlinkage` qualifier for crypto asm function.
 - Rename sha256-riscv64-zvkb-zvknha_or_zvknhb to sha256-riscv64-zvknha_or_zvknhb-zvkb.
 - Initialize the members of sha256_algs in declaration order.
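Reviewer note: the assembly in this patch reorders the SHA-256 state from {a,b,c,d},{e,f,g,h} into {f,e,b,a},{h,g,d,c} with two byte-indexed vector loads that share the packed index value 0x00041014. The effect of that index pattern can be sketched in scalar C as below. This only emulates the gather and is not the patch's code; `gather4_u32` and `load_state_reordered` are hypothetical names, and reading the index bytes from least to most significant assumes the little-endian element layout of the vector register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Emulate a 4-element indexed load of 32-bit words with 8-bit byte
 * offsets. Offset i is byte i of packed_idx, least significant first.
 * Hypothetical helper for illustration only.
 */
static void gather4_u32(uint32_t dst[4], const uint8_t *base,
			size_t base_len, uint32_t packed_idx)
{
	for (int i = 0; i < 4; i++) {
		uint8_t off = (uint8_t)(packed_idx >> (8 * i));

		assert(off + sizeof(uint32_t) <= base_len);
		memcpy(&dst[i], base + off, sizeof(uint32_t));
	}
}

/*
 * state holds {a,b,c,d,e,f,g,h}. With offsets 20, 16, 4, 0:
 *   from state      -> lo = {f,e,b,a}
 *   from state + 8  -> hi = {h,g,d,c}
 */
static void load_state_reordered(uint32_t lo[4], uint32_t hi[4],
				 const uint32_t state[8])
{
	const uint8_t *p = (const uint8_t *)state;
	const uint32_t index_pattern = 0x00041014;

	gather4_u32(lo, p, 32, index_pattern);
	gather4_u32(hi, p + 8, 24, index_pattern);
}
```

Using one pattern for both loads works because the second base pointer is advanced by 8 bytes, which shifts every selected word by two positions.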
--- arch/riscv/crypto/Kconfig | 11 + arch/riscv/crypto/Makefile | 7 + arch/riscv/crypto/sha256-riscv64-glue.c | 145 ++++++++ .../sha256-riscv64-zvknha_or_zvknhb-zvkb.pl | 317 ++++++++++++++++++ 4 files changed, 480 insertions(+) create mode 100644 arch/riscv/crypto/sha256-riscv64-glue.c create mode 100644 arch/riscv/crypto/sha256-riscv64-zvknha_or_zvknhb-zvkb.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index 6863f01a2ab0..d31af9190717 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -44,4 +44,15 @@ config CRYPTO_GHASH_RISCV64 Architecture: riscv64 using: - Zvkg vector crypto extension +config CRYPTO_SHA256_RISCV64 + tristate "Hash functions: SHA-224 and SHA-256" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_SHA256 + help + SHA-224 and SHA-256 secure hash algorithm (FIPS 180) + + Architecture: riscv64 using: + - Zvknha or Zvknhb vector crypto extensions + - Zvkb vector crypto extension + endmenu diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index 94a7f8eaa8a7..e9d7717ec943 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -12,6 +12,9 @@ aes-block-riscv64-y := aes-riscv64-block-mode-glue.o aes-riscv64-zvkned-zvbb-zvk obj-$(CONFIG_CRYPTO_GHASH_RISCV64) += ghash-riscv64.o ghash-riscv64-y := ghash-riscv64-glue.o ghash-riscv64-zvkg.o +obj-$(CONFIG_CRYPTO_SHA256_RISCV64) += sha256-riscv64.o +sha256-riscv64-y := sha256-riscv64-glue.o sha256-riscv64-zvknha_or_zvknhb-zvkb.o + quiet_cmd_perlasm = PERLASM $@ cmd_perlasm = $(PERL) $(<) void $(@) @@ -27,7 +30,11 @@ $(obj)/aes-riscv64-zvkned-zvkb.S: $(src)/aes-riscv64-zvkned-zvkb.pl $(obj)/ghash-riscv64-zvkg.S: $(src)/ghash-riscv64-zvkg.pl $(call cmd,perlasm) +$(obj)/sha256-riscv64-zvknha_or_zvknhb-zvkb.S: $(src)/sha256-riscv64-zvknha_or_zvknhb-zvkb.pl + $(call cmd,perlasm) + clean-files += aes-riscv64-zvkned.S clean-files += aes-riscv64-zvkned-zvbb-zvkg.S clean-files += aes-riscv64-zvkned-zvkb.S clean-files += 
ghash-riscv64-zvkg.S +clean-files += sha256-riscv64-zvknha_or_zvknhb-zvkb.S diff --git a/arch/riscv/crypto/sha256-riscv64-glue.c b/arch/riscv/crypto/sha256-riscv64-glue.c new file mode 100644 index 000000000000..760d89031d1c --- /dev/null +++ b/arch/riscv/crypto/sha256-riscv64-glue.c @@ -0,0 +1,145 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Linux/riscv64 port of the OpenSSL SHA256 implementation for RISC-V 64 + * + * Copyright (C) 2022 VRULL GmbH + * Author: Heiko Stuebner + * + * Copyright (C) 2023 SiFive, Inc. + * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * sha256 using zvkb and zvknha/b vector crypto extension + * + * This asm function will just take the first 256 bits as the sha256 state from + * the pointer to `struct sha256_state`. + */ +asmlinkage void +sha256_block_data_order_zvkb_zvknha_or_zvknhb(struct sha256_state *digest, + const u8 *data, int num_blks); + +static int riscv64_sha256_update(struct shash_desc *desc, const u8 *data, + unsigned int len) +{ + int ret = 0; + + /* + * Make sure struct sha256_state begins directly with the SHA256 + * 256-bit internal state, as this is what the asm function expects.
+ */ + BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0); + + if (crypto_simd_usable()) { + kernel_vector_begin(); + ret = sha256_base_do_update( + desc, data, len, + sha256_block_data_order_zvkb_zvknha_or_zvknhb); + kernel_vector_end(); + } else { + ret = crypto_sha256_update(desc, data, len); + } + + return ret; +} + +static int riscv64_sha256_finup(struct shash_desc *desc, const u8 *data, + unsigned int len, u8 *out) +{ + if (crypto_simd_usable()) { + kernel_vector_begin(); + if (len) + sha256_base_do_update( + desc, data, len, + sha256_block_data_order_zvkb_zvknha_or_zvknhb); + sha256_base_do_finalize( + desc, sha256_block_data_order_zvkb_zvknha_or_zvknhb); + kernel_vector_end(); + + return sha256_base_finish(desc, out); + } + + return crypto_sha256_finup(desc, data, len, out); +} + +static int riscv64_sha256_final(struct shash_desc *desc, u8 *out) +{ + return riscv64_sha256_finup(desc, NULL, 0, out); +} + +static struct shash_alg sha256_algs[] = { + { + .init = sha256_base_init, + .update = riscv64_sha256_update, + .final = riscv64_sha256_final, + .finup = riscv64_sha256_finup, + .descsize = sizeof(struct sha256_state), + .digestsize = SHA256_DIGEST_SIZE, + .base = { + .cra_blocksize = SHA256_BLOCK_SIZE, + .cra_priority = 150, + .cra_name = "sha256", + .cra_driver_name = "sha256-riscv64-zvknha_or_zvknhb-zvkb", + .cra_module = THIS_MODULE, + }, + }, { + .init = sha224_base_init, + .update = riscv64_sha256_update, + .final = riscv64_sha256_final, + .finup = riscv64_sha256_finup, + .descsize = sizeof(struct sha256_state), + .digestsize = SHA224_DIGEST_SIZE, + .base = { + .cra_blocksize = SHA224_BLOCK_SIZE, + .cra_priority = 150, + .cra_name = "sha224", + .cra_driver_name = "sha224-riscv64-zvknha_or_zvknhb-zvkb", + .cra_module = THIS_MODULE, + }, + }, +}; + +static inline bool check_sha256_ext(void) +{ + /* + * From the spec: + * The Zvknhb ext supports both SHA-256 and SHA-512 and Zvknha only + * supports SHA-256. 
+ */ + return (riscv_isa_extension_available(NULL, ZVKNHA) || + riscv_isa_extension_available(NULL, ZVKNHB)) && + riscv_isa_extension_available(NULL, ZVKB) && + riscv_vector_vlen() >= 128; +} + +static int __init riscv64_sha256_mod_init(void) +{ + if (check_sha256_ext()) + return crypto_register_shashes(sha256_algs, + ARRAY_SIZE(sha256_algs)); + + return -ENODEV; +} + +static void __exit riscv64_sha256_mod_fini(void) +{ + crypto_unregister_shashes(sha256_algs, ARRAY_SIZE(sha256_algs)); +} + +module_init(riscv64_sha256_mod_init); +module_exit(riscv64_sha256_mod_fini); + +MODULE_DESCRIPTION("SHA-256 (RISC-V accelerated)"); +MODULE_AUTHOR("Heiko Stuebner "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("sha224"); +MODULE_ALIAS_CRYPTO("sha256"); diff --git a/arch/riscv/crypto/sha256-riscv64-zvknha_or_zvknhb-zvkb.pl b/arch/riscv/crypto/sha256-riscv64-zvknha_or_zvknhb-zvkb.pl new file mode 100644 index 000000000000..b664cd65fbfc --- /dev/null +++ b/arch/riscv/crypto/sha256-riscv64-zvknha_or_zvknhb-zvkb.pl @@ -0,0 +1,317 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Christoph Müllner +# Copyright (c) 2023, Phoebe Chen +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. 
Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# The generated code of this file depends on the following RISC-V extensions: +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 +# - RISC-V Vector Cryptography Bit-manipulation extension ('Zvkb') +# - RISC-V Vector SHA-2 Secure Hash extension ('Zvknha' or 'Zvknhb') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? 
shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +#include + +.text +___ + +my ($V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, + $V8, $V9, $V10, $V11, $V12, $V13, $V14, $V15, + $V16, $V17, $V18, $V19, $V20, $V21, $V22, $V23, + $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, +) = map("v$_",(0..31)); + +my $K256 = "K256"; + +# Function arguments +my ($H, $INP, $LEN, $KT, $H2, $INDEX_PATTERN) = ("a0", "a1", "a2", "a3", "t3", "t4"); + +sub sha_256_load_constant { + my $code=<<___; + la $KT, $K256 # Load round constants K256 + vle32.v $V10, ($KT) + addi $KT, $KT, 16 + vle32.v $V11, ($KT) + addi $KT, $KT, 16 + vle32.v $V12, ($KT) + addi $KT, $KT, 16 + vle32.v $V13, ($KT) + addi $KT, $KT, 16 + vle32.v $V14, ($KT) + addi $KT, $KT, 16 + vle32.v $V15, ($KT) + addi $KT, $KT, 16 + vle32.v $V16, ($KT) + addi $KT, $KT, 16 + vle32.v $V17, ($KT) + addi $KT, $KT, 16 + vle32.v $V18, ($KT) + addi $KT, $KT, 16 + vle32.v $V19, ($KT) + addi $KT, $KT, 16 + vle32.v $V20, ($KT) + addi $KT, $KT, 16 + vle32.v $V21, ($KT) + addi $KT, $KT, 16 + vle32.v $V22, ($KT) + addi $KT, $KT, 16 + vle32.v $V23, ($KT) + addi $KT, $KT, 16 + vle32.v $V24, ($KT) + addi $KT, $KT, 16 + vle32.v $V25, ($KT) +___ + + return $code; +} + +################################################################################ +# void sha256_block_data_order_zvkb_zvknha_or_zvknhb(void *c, const void *p, size_t len) +$code .= <<___; +SYM_TYPED_FUNC_START(sha256_block_data_order_zvkb_zvknha_or_zvknhb) + vsetivli zero, 4, e32, m1, ta, ma + + @{[sha_256_load_constant]} + + # H is stored as {a,b,c,d},{e,f,g,h}, but we need {f,e,b,a},{h,g,d,c} + # The dst vtype is e32m1 and the index vtype is e8mf4. + # We use index-load with the following index pattern at v26. + # i8 index: + # 20, 16, 4, 0 + # Instead of setting the i8 index, we could use a single 32bit + # little-endian value to cover the 4xi8 index. 
+ # i32 value: + # 0x 00 04 10 14 + li $INDEX_PATTERN, 0x00041014 + vsetivli zero, 1, e32, m1, ta, ma + vmv.v.x $V26, $INDEX_PATTERN + + addi $H2, $H, 8 + + # Use index-load to get {f,e,b,a},{h,g,d,c} + vsetivli zero, 4, e32, m1, ta, ma + vluxei8.v $V6, ($H), $V26 + vluxei8.v $V7, ($H2), $V26 + + # Setup v0 mask for the vmerge to replace the first word (idx==0) in key-scheduling. + # The AVL is 4 in SHA, so we could use a single e8(8 element masking) for masking. + vsetivli zero, 1, e8, m1, ta, ma + vmv.v.i $V0, 0x01 + + vsetivli zero, 4, e32, m1, ta, ma + +L_round_loop: + # Decrement length by 1 + add $LEN, $LEN, -1 + + # Keep the current state as we need it later: H' = H+{a',b',c',...,h'}. + vmv.v.v $V30, $V6 + vmv.v.v $V31, $V7 + + # Load the 512-bits of the message block in v1-v4 and perform + # an endian swap on each 4 bytes element. + vle32.v $V1, ($INP) + @{[vrev8_v $V1, $V1]} + add $INP, $INP, 16 + vle32.v $V2, ($INP) + @{[vrev8_v $V2, $V2]} + add $INP, $INP, 16 + vle32.v $V3, ($INP) + @{[vrev8_v $V3, $V3]} + add $INP, $INP, 16 + vle32.v $V4, ($INP) + @{[vrev8_v $V4, $V4]} + add $INP, $INP, 16 + + # Quad-round 0 (+0, Wt from oldest to newest in v1->v2->v3->v4) + vadd.vv $V5, $V10, $V1 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + vmerge.vvm $V5, $V3, $V2, $V0 + @{[vsha2ms_vv $V1, $V5, $V4]} # Generate W[19:16] + + # Quad-round 1 (+1, v2->v3->v4->v1) + vadd.vv $V5, $V11, $V2 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + vmerge.vvm $V5, $V4, $V3, $V0 + @{[vsha2ms_vv $V2, $V5, $V1]} # Generate W[23:20] + + # Quad-round 2 (+2, v3->v4->v1->v2) + vadd.vv $V5, $V12, $V3 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + vmerge.vvm $V5, $V1, $V4, $V0 + @{[vsha2ms_vv $V3, $V5, $V2]} # Generate W[27:24] + + # Quad-round 3 (+3, v4->v1->v2->v3) + vadd.vv $V5, $V13, $V4 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + vmerge.vvm $V5, $V2, $V1, $V0 + @{[vsha2ms_vv $V4, $V5, $V3]} # Generate W[31:28] 
+ + # Quad-round 4 (+0, v1->v2->v3->v4) + vadd.vv $V5, $V14, $V1 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + vmerge.vvm $V5, $V3, $V2, $V0 + @{[vsha2ms_vv $V1, $V5, $V4]} # Generate W[35:32] + + # Quad-round 5 (+1, v2->v3->v4->v1) + vadd.vv $V5, $V15, $V2 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + vmerge.vvm $V5, $V4, $V3, $V0 + @{[vsha2ms_vv $V2, $V5, $V1]} # Generate W[39:36] + + # Quad-round 6 (+2, v3->v4->v1->v2) + vadd.vv $V5, $V16, $V3 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + vmerge.vvm $V5, $V1, $V4, $V0 + @{[vsha2ms_vv $V3, $V5, $V2]} # Generate W[43:40] + + # Quad-round 7 (+3, v4->v1->v2->v3) + vadd.vv $V5, $V17, $V4 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + vmerge.vvm $V5, $V2, $V1, $V0 + @{[vsha2ms_vv $V4, $V5, $V3]} # Generate W[47:44] + + # Quad-round 8 (+0, v1->v2->v3->v4) + vadd.vv $V5, $V18, $V1 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + vmerge.vvm $V5, $V3, $V2, $V0 + @{[vsha2ms_vv $V1, $V5, $V4]} # Generate W[51:48] + + # Quad-round 9 (+1, v2->v3->v4->v1) + vadd.vv $V5, $V19, $V2 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + vmerge.vvm $V5, $V4, $V3, $V0 + @{[vsha2ms_vv $V2, $V5, $V1]} # Generate W[55:52] + + # Quad-round 10 (+2, v3->v4->v1->v2) + vadd.vv $V5, $V20, $V3 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + vmerge.vvm $V5, $V1, $V4, $V0 + @{[vsha2ms_vv $V3, $V5, $V2]} # Generate W[59:56] + + # Quad-round 11 (+3, v4->v1->v2->v3) + vadd.vv $V5, $V21, $V4 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + vmerge.vvm $V5, $V2, $V1, $V0 + @{[vsha2ms_vv $V4, $V5, $V3]} # Generate W[63:60] + + # Quad-round 12 (+0, v1->v2->v3->v4) + # Note that we stop generating new message schedule words (Wt, v1-13) + # as we already generated all the words we end up consuming (i.e., W[63:60]). 
+ vadd.vv $V5, $V22, $V1 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + + # Quad-round 13 (+1, v2->v3->v4->v1) + vadd.vv $V5, $V23, $V2 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + + # Quad-round 14 (+2, v3->v4->v1->v2) + vadd.vv $V5, $V24, $V3 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + + # Quad-round 15 (+3, v4->v1->v2->v3) + vadd.vv $V5, $V25, $V4 + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + + # H' = H+{a',b',c',...,h'} + vadd.vv $V6, $V30, $V6 + vadd.vv $V7, $V31, $V7 + bnez $LEN, L_round_loop + + # Store {f,e,b,a},{h,g,d,c} back to {a,b,c,d},{e,f,g,h}. + vsuxei8.v $V6, ($H), $V26 + vsuxei8.v $V7, ($H2), $V26 + + ret +SYM_FUNC_END(sha256_block_data_order_zvkb_zvknha_or_zvknhb) + +.p2align 2 +.type $K256,\@object +$K256: + .word 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5 + .word 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5 + .word 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3 + .word 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174 + .word 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc + .word 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da + .word 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7 + .word 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967 + .word 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13 + .word 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85 + .word 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3 + .word 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070 + .word 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5 + .word 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3 + .word 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208 + .word 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2 +.size $K256,.-$K256 +___ + +print $code; + +close STDOUT or die "error closing STDOUT: $!"; From patchwork Tue Dec 5 09:27:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13479641 Return-Path: 
From: Jerry Shih
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org,
ardb@kernel.org, conor@kernel.org
Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org
Subject: [PATCH v3 09/12] RISC-V: crypto: add Zvknhb accelerated SHA384/512 implementations
Date: Tue, 5 Dec 2023 17:27:58 +0800
Message-Id: <20231205092801.1335-10-jerry.shih@sifive.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20231205092801.1335-1-jerry.shih@sifive.com>
References: <20231205092801.1335-1-jerry.shih@sifive.com>
MIME-Version: 1.0

Add SHA-384 and SHA-512 implementations using the Zvknhb vector crypto extension, ported from OpenSSL (openssl/openssl#21923).

Co-developed-by: Charalampos Mitrodimas
Signed-off-by: Charalampos Mitrodimas
Co-developed-by: Heiko Stuebner
Signed-off-by: Heiko Stuebner
Co-developed-by: Phoebe Chen
Signed-off-by: Phoebe Chen
Signed-off-by: Jerry Shih
---
Changelog v3:
 - Use `SYM_TYPED_FUNC_START` for the sha512 indirect-call asm symbol.
 - Use asm mnemonics for the instructions in the RVV 1.0 extension.

Changelog v2:
 - Do not turn on the kconfig `SHA512_RISCV64` option by default.
 - Add the `asmlinkage` qualifier for the crypto asm function.
 - Rename sha512-riscv64-zvkb-zvknhb to sha512-riscv64-zvknhb-zvkb.
 - Reorder the initialization of the sha512_algs structure members to match the order in which they are declared.
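
Note on the state layout: the assembly in this patch loads the SHA-512 state with an indexed load (vluxei8.v) whose four byte offsets are packed into the 32-bit value 0x00082028. The following is a minimal scalar sketch of that gather in plain C, not kernel code. The helper names `unpack_index_pattern` and `gather4_u64` are hypothetical and exist only for illustration. It assumes, as the comments in the perlasm state, that H is stored as {a,b,c,d,e,f,g,h} and that the second load starts 16 bytes into H.

```c
#include <stdint.h>
#include <string.h>

/*
 * Unpack a 32-bit index pattern into four byte offsets, lowest byte first.
 * For 0x00082028 this yields {0x28, 0x20, 0x08, 0x00}, i.e. {40, 32, 8, 0}.
 */
static void unpack_index_pattern(uint32_t pattern, uint8_t idx[4])
{
	for (int i = 0; i < 4; i++)
		idx[i] = (uint8_t)(pattern >> (8 * i));
}

/*
 * Scalar model of an indexed load of four 64-bit elements:
 * element i is read from base + idx[i] (byte offsets).
 */
static void gather4_u64(uint64_t dst[4], const uint8_t *base,
			const uint8_t idx[4])
{
	for (int i = 0; i < 4; i++)
		memcpy(&dst[i], base + idx[i], sizeof(dst[i]));
}
```

With H = {a,...,h}, gathering from H gives {f,e,b,a} and gathering from H + 16 bytes gives {h,g,d,c}, which is the element order described in the assembly comments. The same offsets are reused by the indexed store at the end of the function to write the state back in its original order.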
--- arch/riscv/crypto/Kconfig | 11 + arch/riscv/crypto/Makefile | 7 + arch/riscv/crypto/sha512-riscv64-glue.c | 139 +++++++++ .../crypto/sha512-riscv64-zvknhb-zvkb.pl | 265 ++++++++++++++++++ 4 files changed, 422 insertions(+) create mode 100644 arch/riscv/crypto/sha512-riscv64-glue.c create mode 100644 arch/riscv/crypto/sha512-riscv64-zvknhb-zvkb.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index d31af9190717..ad0b08a13c9a 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -55,4 +55,15 @@ config CRYPTO_SHA256_RISCV64 - Zvknha or Zvknhb vector crypto extensions - Zvkb vector crypto extension +config CRYPTO_SHA512_RISCV64 + tristate "Hash functions: SHA-384 and SHA-512" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_SHA512 + help + SHA-384 and SHA-512 secure hash algorithm (FIPS 180) + + Architecture: riscv64 using: + - Zvknhb vector crypto extension + - Zvkb vector crypto extension + endmenu diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index e9d7717ec943..8aabef950ad3 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -15,6 +15,9 @@ ghash-riscv64-y := ghash-riscv64-glue.o ghash-riscv64-zvkg.o obj-$(CONFIG_CRYPTO_SHA256_RISCV64) += sha256-riscv64.o sha256-riscv64-y := sha256-riscv64-glue.o sha256-riscv64-zvknha_or_zvknhb-zvkb.o +obj-$(CONFIG_CRYPTO_SHA512_RISCV64) += sha512-riscv64.o +sha512-riscv64-y := sha512-riscv64-glue.o sha512-riscv64-zvknhb-zvkb.o + quiet_cmd_perlasm = PERLASM $@ cmd_perlasm = $(PERL) $(<) void $(@) @@ -33,8 +36,12 @@ $(obj)/ghash-riscv64-zvkg.S: $(src)/ghash-riscv64-zvkg.pl $(obj)/sha256-riscv64-zvknha_or_zvknhb-zvkb.S: $(src)/sha256-riscv64-zvknha_or_zvknhb-zvkb.pl $(call cmd,perlasm) +$(obj)/sha512-riscv64-zvknhb-zvkb.S: $(src)/sha512-riscv64-zvknhb-zvkb.pl + $(call cmd,perlasm) + clean-files += aes-riscv64-zvkned.S clean-files += aes-riscv64-zvkned-zvbb-zvkg.S clean-files += aes-riscv64-zvkned-zvkb.S clean-files += ghash-riscv64-zvkg.S 
clean-files += sha256-riscv64-zvknha_or_zvknhb-zvkb.S +clean-files += sha512-riscv64-zvknhb-zvkb.S diff --git a/arch/riscv/crypto/sha512-riscv64-glue.c b/arch/riscv/crypto/sha512-riscv64-glue.c new file mode 100644 index 000000000000..3dd8e1c9d402 --- /dev/null +++ b/arch/riscv/crypto/sha512-riscv64-glue.c @@ -0,0 +1,139 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Linux/riscv64 port of the OpenSSL SHA512 implementation for RISC-V 64 + * + * Copyright (C) 2023 VRULL GmbH + * Author: Heiko Stuebner + * + * Copyright (C) 2023 SiFive, Inc. + * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * sha512 using the zvkb and zvknhb vector crypto extensions + * + * This asm function only uses the first 512 bits of the sha512 state from + * the pointer to `struct sha512_state`. + */ +asmlinkage void sha512_block_data_order_zvkb_zvknhb(struct sha512_state *digest, + const u8 *data, + int num_blks); + +static int riscv64_sha512_update(struct shash_desc *desc, const u8 *data, + unsigned int len) +{ + int ret = 0; + + /* + * Make sure struct sha512_state begins directly with the SHA512 + * 512-bit internal state, as this is what the asm function expects.
+ */ + BUILD_BUG_ON(offsetof(struct sha512_state, state) != 0); + + if (crypto_simd_usable()) { + kernel_vector_begin(); + ret = sha512_base_do_update( + desc, data, len, sha512_block_data_order_zvkb_zvknhb); + kernel_vector_end(); + } else { + ret = crypto_sha512_update(desc, data, len); + } + + return ret; +} + +static int riscv64_sha512_finup(struct shash_desc *desc, const u8 *data, + unsigned int len, u8 *out) +{ + if (crypto_simd_usable()) { + kernel_vector_begin(); + if (len) + sha512_base_do_update( + desc, data, len, + sha512_block_data_order_zvkb_zvknhb); + sha512_base_do_finalize(desc, + sha512_block_data_order_zvkb_zvknhb); + kernel_vector_end(); + + return sha512_base_finish(desc, out); + } + + return crypto_sha512_finup(desc, data, len, out); +} + +static int riscv64_sha512_final(struct shash_desc *desc, u8 *out) +{ + return riscv64_sha512_finup(desc, NULL, 0, out); +} + +static struct shash_alg sha512_algs[] = { + { + .init = sha512_base_init, + .update = riscv64_sha512_update, + .final = riscv64_sha512_final, + .finup = riscv64_sha512_finup, + .descsize = sizeof(struct sha512_state), + .digestsize = SHA512_DIGEST_SIZE, + .base = { + .cra_blocksize = SHA512_BLOCK_SIZE, + .cra_priority = 150, + .cra_name = "sha512", + .cra_driver_name = "sha512-riscv64-zvknhb-zvkb", + .cra_module = THIS_MODULE, + }, + }, + { + .init = sha384_base_init, + .update = riscv64_sha512_update, + .final = riscv64_sha512_final, + .finup = riscv64_sha512_finup, + .descsize = sizeof(struct sha512_state), + .digestsize = SHA384_DIGEST_SIZE, + .base = { + .cra_blocksize = SHA384_BLOCK_SIZE, + .cra_priority = 150, + .cra_name = "sha384", + .cra_driver_name = "sha384-riscv64-zvknhb-zvkb", + .cra_module = THIS_MODULE, + }, + }, +}; + +static inline bool check_sha512_ext(void) +{ + return riscv_isa_extension_available(NULL, ZVKNHB) && + riscv_isa_extension_available(NULL, ZVKB) && + riscv_vector_vlen() >= 128; +} + +static int __init riscv64_sha512_mod_init(void) +{ + if 
(check_sha512_ext()) + return crypto_register_shashes(sha512_algs, + ARRAY_SIZE(sha512_algs)); + + return -ENODEV; +} + +static void __exit riscv64_sha512_mod_fini(void) +{ + crypto_unregister_shashes(sha512_algs, ARRAY_SIZE(sha512_algs)); +} + +module_init(riscv64_sha512_mod_init); +module_exit(riscv64_sha512_mod_fini); + +MODULE_DESCRIPTION("SHA-512 (RISC-V accelerated)"); +MODULE_AUTHOR("Heiko Stuebner "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("sha384"); +MODULE_ALIAS_CRYPTO("sha512"); diff --git a/arch/riscv/crypto/sha512-riscv64-zvknhb-zvkb.pl b/arch/riscv/crypto/sha512-riscv64-zvknhb-zvkb.pl new file mode 100644 index 000000000000..1635b382b523 --- /dev/null +++ b/arch/riscv/crypto/sha512-riscv64-zvknhb-zvkb.pl @@ -0,0 +1,265 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Christoph Müllner +# Copyright (c) 2023, Phoebe Chen +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. 
+# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# The generated code of this file depends on the following RISC-V extensions: +# - RV64I +# - RISC-V vector ('V') with VLEN >= 128 +# - RISC-V Vector Cryptography Bit-manipulation extension ('Zvkb') +# - RISC-V Vector SHA-2 Secure Hash extension ('Zvknhb') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? 
shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +#include + +.text +___ + +my ($V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, + $V8, $V9, $V10, $V11, $V12, $V13, $V14, $V15, + $V16, $V17, $V18, $V19, $V20, $V21, $V22, $V23, + $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, +) = map("v$_",(0..31)); + +my $K512 = "K512"; + +# Function arguments +my ($H, $INP, $LEN, $KT, $H2, $INDEX_PATTERN) = ("a0", "a1", "a2", "a3", "t3", "t4"); + +################################################################################ +# void sha512_block_data_order_zvkb_zvknhb(void *c, const void *p, size_t len) +$code .= <<___; +SYM_TYPED_FUNC_START(sha512_block_data_order_zvkb_zvknhb) + vsetivli zero, 4, e64, m2, ta, ma + + # H is stored as {a,b,c,d},{e,f,g,h}, but we need {f,e,b,a},{h,g,d,c} + # The dst vtype is e64m2 and the index vtype is e8mf4. + # We use index-load with the following index pattern at v1. + # i8 index: + # 40, 32, 8, 0 + # Instead of setting the i8 index, we could use a single 32bit + # little-endian value to cover the 4xi8 index. + # i32 value: + # 0x 00 08 20 28 + li $INDEX_PATTERN, 0x00082028 + vsetivli zero, 1, e32, m1, ta, ma + vmv.v.x $V1, $INDEX_PATTERN + + addi $H2, $H, 16 + + # Use index-load to get {f,e,b,a},{h,g,d,c} + vsetivli zero, 4, e64, m2, ta, ma + vluxei8.v $V22, ($H), $V1 + vluxei8.v $V24, ($H2), $V1 + + # Setup v0 mask for the vmerge to replace the first word (idx==0) in key-scheduling. + # The AVL is 4 in SHA, so we could use a single e8(8 element masking) for masking. + vsetivli zero, 1, e8, m1, ta, ma + vmv.v.i $V0, 0x01 + + vsetivli zero, 4, e64, m2, ta, ma + +L_round_loop: + # Load round constants K512 + la $KT, $K512 + + # Decrement length by 1 + addi $LEN, $LEN, -1 + + # Keep the current state as we need it later: H' = H+{a',b',c',...,h'}. + vmv.v.v $V26, $V22 + vmv.v.v $V28, $V24 + + # Load the 1024-bits of the message block in v10-v16 and perform the endian + # swap. 
+ vle64.v $V10, ($INP) + @{[vrev8_v $V10, $V10]} + addi $INP, $INP, 32 + vle64.v $V12, ($INP) + @{[vrev8_v $V12, $V12]} + addi $INP, $INP, 32 + vle64.v $V14, ($INP) + @{[vrev8_v $V14, $V14]} + addi $INP, $INP, 32 + vle64.v $V16, ($INP) + @{[vrev8_v $V16, $V16]} + addi $INP, $INP, 32 + + .rept 4 + # Quad-round 0 (+0, v10->v12->v14->v16) + vle64.v $V20, ($KT) + addi $KT, $KT, 32 + vadd.vv $V18, $V20, $V10 + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + vmerge.vvm $V18, $V14, $V12, $V0 + @{[vsha2ms_vv $V10, $V18, $V16]} + + # Quad-round 1 (+1, v12->v14->v16->v10) + vle64.v $V20, ($KT) + addi $KT, $KT, 32 + vadd.vv $V18, $V20, $V12 + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + vmerge.vvm $V18, $V16, $V14, $V0 + @{[vsha2ms_vv $V12, $V18, $V10]} + + # Quad-round 2 (+2, v14->v16->v10->v12) + vle64.v $V20, ($KT) + addi $KT, $KT, 32 + vadd.vv $V18, $V20, $V14 + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + vmerge.vvm $V18, $V10, $V16, $V0 + @{[vsha2ms_vv $V14, $V18, $V12]} + + # Quad-round 3 (+3, v16->v10->v12->v14) + vle64.v $V20, ($KT) + addi $KT, $KT, 32 + vadd.vv $V18, $V20, $V16 + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + vmerge.vvm $V18, $V12, $V10, $V0 + @{[vsha2ms_vv $V16, $V18, $V14]} + .endr + + # Quad-round 16 (+0, v10->v12->v14->v16) + # Note that we stop generating new message schedule words (Wt, v10-16) + # as we already generated all the words we end up consuming (i.e., W[79:76]). 
+ vle64.v $V20, ($KT) + addi $KT, $KT, 32 + vadd.vv $V18, $V20, $V10 + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + + # Quad-round 17 (+1, v12->v14->v16->v10) + vle64.v $V20, ($KT) + addi $KT, $KT, 32 + vadd.vv $V18, $V20, $V12 + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + + # Quad-round 18 (+2, v14->v16->v10->v12) + vle64.v $V20, ($KT) + addi $KT, $KT, 32 + vadd.vv $V18, $V20, $V14 + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + + # Quad-round 19 (+3, v16->v10->v12->v14) + vle64.v $V20, ($KT) + # No $KT increment needed. + vadd.vv $V18, $V20, $V16 + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + + # H' = H+{a',b',c',...,h'} + vadd.vv $V22, $V26, $V22 + vadd.vv $V24, $V28, $V24 + bnez $LEN, L_round_loop + + # Store {f,e,b,a},{h,g,d,c} back to {a,b,c,d},{e,f,g,h}. + vsuxei8.v $V22, ($H), $V1 + vsuxei8.v $V24, ($H2), $V1 + + ret +SYM_FUNC_END(sha512_block_data_order_zvkb_zvknhb) + +.p2align 3 +.type $K512,\@object +$K512: + .dword 0x428a2f98d728ae22, 0x7137449123ef65cd + .dword 0xb5c0fbcfec4d3b2f, 0xe9b5dba58189dbbc + .dword 0x3956c25bf348b538, 0x59f111f1b605d019 + .dword 0x923f82a4af194f9b, 0xab1c5ed5da6d8118 + .dword 0xd807aa98a3030242, 0x12835b0145706fbe + .dword 0x243185be4ee4b28c, 0x550c7dc3d5ffb4e2 + .dword 0x72be5d74f27b896f, 0x80deb1fe3b1696b1 + .dword 0x9bdc06a725c71235, 0xc19bf174cf692694 + .dword 0xe49b69c19ef14ad2, 0xefbe4786384f25e3 + .dword 0x0fc19dc68b8cd5b5, 0x240ca1cc77ac9c65 + .dword 0x2de92c6f592b0275, 0x4a7484aa6ea6e483 + .dword 0x5cb0a9dcbd41fbd4, 0x76f988da831153b5 + .dword 0x983e5152ee66dfab, 0xa831c66d2db43210 + .dword 0xb00327c898fb213f, 0xbf597fc7beef0ee4 + .dword 0xc6e00bf33da88fc2, 0xd5a79147930aa725 + .dword 0x06ca6351e003826f, 0x142929670a0e6e70 + .dword 0x27b70a8546d22ffc, 0x2e1b21385c26c926 + .dword 0x4d2c6dfc5ac42aed, 0x53380d139d95b3df + .dword 0x650a73548baf63de, 0x766a0abb3c77b2a8 + .dword 0x81c2c92e47edaee6, 0x92722c851482353b + .dword
0xa2bfe8a14cf10364, 0xa81a664bbc423001 + .dword 0xc24b8b70d0f89791, 0xc76c51a30654be30 + .dword 0xd192e819d6ef5218, 0xd69906245565a910 + .dword 0xf40e35855771202a, 0x106aa07032bbd1b8 + .dword 0x19a4c116b8d2d0c8, 0x1e376c085141ab53 + .dword 0x2748774cdf8eeb99, 0x34b0bcb5e19b48a8 + .dword 0x391c0cb3c5c95a63, 0x4ed8aa4ae3418acb + .dword 0x5b9cca4f7763e373, 0x682e6ff3d6b2b8a3 + .dword 0x748f82ee5defb2fc, 0x78a5636f43172f60 + .dword 0x84c87814a1f0ab72, 0x8cc702081a6439ec + .dword 0x90befffa23631e28, 0xa4506cebde82bde9 + .dword 0xbef9a3f7b2c67915, 0xc67178f2e372532b + .dword 0xca273eceea26619c, 0xd186b8c721c0c207 + .dword 0xeada7dd6cde0eb1e, 0xf57d4f7fee6ed178 + .dword 0x06f067aa72176fba, 0x0a637dc5a2c898a6 + .dword 0x113f9804bef90dae, 0x1b710b35131c471b + .dword 0x28db77f523047d84, 0x32caab7b40c72493 + .dword 0x3c9ebe0a15c9bebc, 0x431d67c49c100d4c + .dword 0x4cc5d4becb3e42b6, 0x597f299cfc657e2a + .dword 0x5fcb6fab3ad6faec, 0x6c44198c4a475817 +.size $K512,.-$K512 +___ + +print $code; + +close STDOUT or die "error closing STDOUT: $!"; From patchwork Tue Dec 5 09:27:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13479642 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id D3106C07E97 for ; Tue, 5 Dec 2023 09:28:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: 
From: Jerry Shih
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org, conor@kernel.org
Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org
Subject: [PATCH v3 10/12] RISC-V: crypto: add Zvksed accelerated SM4 implementation
Date: Tue, 5 Dec 2023 17:27:59 +0800
Message-Id: <20231205092801.1335-11-jerry.shih@sifive.com>
X-Mailer: git-send-email 2.28.0
In-Reply-To: <20231205092801.1335-1-jerry.shih@sifive.com>
References: <20231205092801.1335-1-jerry.shih@sifive.com>
MIME-Version: 1.0

Add an SM4 implementation using the Zvksed vector crypto extension, ported from OpenSSL (openssl/openssl#21923).

The perlasm here differs from the original implementation in OpenSSL. In OpenSSL, SM4 has separate set_encrypt_key and set_decrypt_key functions. In the kernel, these set_key functions are merged into a single one in order to skip the redundant key expansion instructions.

Co-developed-by: Christoph Müllner
Signed-off-by: Christoph Müllner
Co-developed-by: Heiko Stuebner
Signed-off-by: Heiko Stuebner
Signed-off-by: Jerry Shih
---
Changelog v3:
 - Use asm mnemonics for the instructions in the RVV 1.0 extension.

Changelog v2:
 - Do not turn on the kconfig `SM4_RISCV64` option by default.
 - Add the missing `static` declaration for riscv64_sm4_zvksed_alg.
 - Add the `asmlinkage` qualifier for the crypto asm function.
 - Rename sm4-riscv64-zvkb-zvksed to sm4-riscv64-zvksed-zvkb.
 - Reorder the initialization of the riscv64_sm4_zvksed_zvkb_alg structure members to match the order in which they are declared.
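
Note on the merged set_key: SM4 decryption runs the same rounds as encryption with the 32 round keys applied in reverse order, so a single key expansion can serve both directions. The following is a minimal sketch of that relationship in plain C, not the code from this patch. The names `SM4_ROUNDS` and `sm4_derive_dec_keys` are hypothetical, and the sketch assumes the encryption round keys have already been expanded elsewhere.

```c
#include <stddef.h>
#include <stdint.h>

#define SM4_ROUNDS 32 /* SM4 uses 32 round keys of 32 bits each */

/*
 * Hypothetical helper: derive the decryption round keys from the
 * already-expanded encryption round keys by reversing their order,
 * so the key expansion itself only has to run once.
 */
static void sm4_derive_dec_keys(const uint32_t enc_rk[SM4_ROUNDS],
				uint32_t dec_rk[SM4_ROUNDS])
{
	for (size_t i = 0; i < SM4_ROUNDS; i++)
		dec_rk[i] = enc_rk[SM4_ROUNDS - 1 - i];
}
```

A combined set_key can therefore fill both key arrays from one expansion pass, which is presumably the redundancy the commit message refers to.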
--- arch/riscv/crypto/Kconfig | 17 ++ arch/riscv/crypto/Makefile | 7 + arch/riscv/crypto/sm4-riscv64-glue.c | 121 +++++++++++ arch/riscv/crypto/sm4-riscv64-zvksed.pl | 268 ++++++++++++++++++++++++ 4 files changed, 413 insertions(+) create mode 100644 arch/riscv/crypto/sm4-riscv64-glue.c create mode 100644 arch/riscv/crypto/sm4-riscv64-zvksed.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index ad0b08a13c9a..b28cf1972250 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -66,4 +66,21 @@ config CRYPTO_SHA512_RISCV64 - Zvknhb vector crypto extension - Zvkb vector crypto extension +config CRYPTO_SM4_RISCV64 + tristate "Ciphers: SM4 (ShangMi 4)" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_ALGAPI + select CRYPTO_SM4 + help + SM4 cipher algorithms (OSCCA GB/T 32907-2016, + ISO/IEC 18033-3:2010/Amd 1:2021) + + SM4 (GB/T 32907-2016) is a cryptographic standard issued by the + Organization of State Commercial Administration of China (OSCCA) + as an authorized cryptographic algorithm for use within China.
+ + Architecture: riscv64 using: + - Zvksed vector crypto extension + - Zvkb vector crypto extension + endmenu diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index 8aabef950ad3..8e34861bba34 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -18,6 +18,9 @@ sha256-riscv64-y := sha256-riscv64-glue.o sha256-riscv64-zvknha_or_zvknhb-zvkb.o obj-$(CONFIG_CRYPTO_SHA512_RISCV64) += sha512-riscv64.o sha512-riscv64-y := sha512-riscv64-glue.o sha512-riscv64-zvknhb-zvkb.o +obj-$(CONFIG_CRYPTO_SM4_RISCV64) += sm4-riscv64.o +sm4-riscv64-y := sm4-riscv64-glue.o sm4-riscv64-zvksed.o + quiet_cmd_perlasm = PERLASM $@ cmd_perlasm = $(PERL) $(<) void $(@) @@ -39,9 +42,13 @@ $(obj)/sha256-riscv64-zvknha_or_zvknhb-zvkb.S: $(src)/sha256-riscv64-zvknha_or_z $(obj)/sha512-riscv64-zvknhb-zvkb.S: $(src)/sha512-riscv64-zvknhb-zvkb.pl $(call cmd,perlasm) +$(obj)/sm4-riscv64-zvksed.S: $(src)/sm4-riscv64-zvksed.pl + $(call cmd,perlasm) + clean-files += aes-riscv64-zvkned.S clean-files += aes-riscv64-zvkned-zvbb-zvkg.S clean-files += aes-riscv64-zvkned-zvkb.S clean-files += ghash-riscv64-zvkg.S clean-files += sha256-riscv64-zvknha_or_zvknhb-zvkb.S clean-files += sha512-riscv64-zvknhb-zvkb.S +clean-files += sm4-riscv64-zvksed.S diff --git a/arch/riscv/crypto/sm4-riscv64-glue.c b/arch/riscv/crypto/sm4-riscv64-glue.c new file mode 100644 index 000000000000..9d9d24b67ee3 --- /dev/null +++ b/arch/riscv/crypto/sm4-riscv64-glue.c @@ -0,0 +1,121 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Linux/riscv64 port of the OpenSSL SM4 implementation for RISC-V 64 + * + * Copyright (C) 2023 VRULL GmbH + * Author: Heiko Stuebner + * + * Copyright (C) 2023 SiFive, Inc. 
+ * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* sm4 using zvksed vector crypto extension */ +asmlinkage void rv64i_zvksed_sm4_encrypt(const u8 *in, u8 *out, const u32 *key); +asmlinkage void rv64i_zvksed_sm4_decrypt(const u8 *in, u8 *out, const u32 *key); +asmlinkage int rv64i_zvksed_sm4_set_key(const u8 *user_key, + unsigned int key_len, u32 *enc_key, + u32 *dec_key); + +static int riscv64_sm4_setkey_zvksed(struct crypto_tfm *tfm, const u8 *key, + unsigned int key_len) +{ + struct sm4_ctx *ctx = crypto_tfm_ctx(tfm); + int ret = 0; + + if (crypto_simd_usable()) { + kernel_vector_begin(); + if (rv64i_zvksed_sm4_set_key(key, key_len, ctx->rkey_enc, + ctx->rkey_dec)) + ret = -EINVAL; + kernel_vector_end(); + } else { + ret = sm4_expandkey(ctx, key, key_len); + } + + return ret; +} + +static void riscv64_sm4_encrypt_zvksed(struct crypto_tfm *tfm, u8 *dst, + const u8 *src) +{ + const struct sm4_ctx *ctx = crypto_tfm_ctx(tfm); + + if (crypto_simd_usable()) { + kernel_vector_begin(); + rv64i_zvksed_sm4_encrypt(src, dst, ctx->rkey_enc); + kernel_vector_end(); + } else { + sm4_crypt_block(ctx->rkey_enc, dst, src); + } +} + +static void riscv64_sm4_decrypt_zvksed(struct crypto_tfm *tfm, u8 *dst, + const u8 *src) +{ + const struct sm4_ctx *ctx = crypto_tfm_ctx(tfm); + + if (crypto_simd_usable()) { + kernel_vector_begin(); + rv64i_zvksed_sm4_decrypt(src, dst, ctx->rkey_dec); + kernel_vector_end(); + } else { + sm4_crypt_block(ctx->rkey_dec, dst, src); + } +} + +static struct crypto_alg riscv64_sm4_zvksed_zvkb_alg = { + .cra_flags = CRYPTO_ALG_TYPE_CIPHER, + .cra_blocksize = SM4_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct sm4_ctx), + .cra_priority = 300, + .cra_name = "sm4", + .cra_driver_name = "sm4-riscv64-zvksed-zvkb", + .cra_cipher = { + .cia_min_keysize = SM4_KEY_SIZE, + .cia_max_keysize = SM4_KEY_SIZE, + .cia_setkey = riscv64_sm4_setkey_zvksed, + .cia_encrypt = 
riscv64_sm4_encrypt_zvksed, + .cia_decrypt = riscv64_sm4_decrypt_zvksed, + }, + .cra_module = THIS_MODULE, +}; + +static inline bool check_sm4_ext(void) +{ + return riscv_isa_extension_available(NULL, ZVKSED) && + riscv_isa_extension_available(NULL, ZVKB) && + riscv_vector_vlen() >= 128; +} + +static int __init riscv64_sm4_mod_init(void) +{ + if (check_sm4_ext()) + return crypto_register_alg(&riscv64_sm4_zvksed_zvkb_alg); + + return -ENODEV; +} + +static void __exit riscv64_sm4_mod_fini(void) +{ + crypto_unregister_alg(&riscv64_sm4_zvksed_zvkb_alg); +} + +module_init(riscv64_sm4_mod_init); +module_exit(riscv64_sm4_mod_fini); + +MODULE_DESCRIPTION("SM4 (RISC-V accelerated)"); +MODULE_AUTHOR("Heiko Stuebner "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("sm4"); diff --git a/arch/riscv/crypto/sm4-riscv64-zvksed.pl b/arch/riscv/crypto/sm4-riscv64-zvksed.pl new file mode 100644 index 000000000000..5669a3b38944 --- /dev/null +++ b/arch/riscv/crypto/sm4-riscv64-zvksed.pl @@ -0,0 +1,268 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Christoph Müllner +# Copyright (c) 2023, Jerry Shih +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. 
Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# The generated code of this file depends on the following RISC-V extensions: +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 +# - RISC-V Vector Cryptography Bit-manipulation extension ('Zvkb') +# - RISC-V Vector SM4 Block Cipher extension ('Zvksed') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? 
shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +.text +___ + +#### +# int rv64i_zvksed_sm4_set_key(const u8 *user_key, unsigned int key_len, +# u32 *enc_key, u32 *dec_key); +# +{ +my ($ukey,$key_len,$enc_key,$dec_key)=("a0","a1","a2","a3"); +my ($fk,$stride)=("a4","a5"); +my ($t0,$t1)=("t0","t1"); +my ($vukey,$vfk,$vk0,$vk1,$vk2,$vk3,$vk4,$vk5,$vk6,$vk7)=("v1","v2","v3","v4","v5","v6","v7","v8","v9","v10"); +$code .= <<___; +.p2align 3 +.globl rv64i_zvksed_sm4_set_key +.type rv64i_zvksed_sm4_set_key,\@function +rv64i_zvksed_sm4_set_key: + li $t0, 16 + beq $t0, $key_len, 1f + li a0, 1 + ret +1: + + vsetivli zero, 4, e32, m1, ta, ma + + # Load the user key + vle32.v $vukey, ($ukey) + @{[vrev8_v $vukey, $vukey]} + + # Load the FK. + la $fk, FK + vle32.v $vfk, ($fk) + + # Generate round keys. + vxor.vv $vukey, $vukey, $vfk + @{[vsm4k_vi $vk0, $vukey, 0]} # rk[0:3] + @{[vsm4k_vi $vk1, $vk0, 1]} # rk[4:7] + @{[vsm4k_vi $vk2, $vk1, 2]} # rk[8:11] + @{[vsm4k_vi $vk3, $vk2, 3]} # rk[12:15] + @{[vsm4k_vi $vk4, $vk3, 4]} # rk[16:19] + @{[vsm4k_vi $vk5, $vk4, 5]} # rk[20:23] + @{[vsm4k_vi $vk6, $vk5, 6]} # rk[24:27] + @{[vsm4k_vi $vk7, $vk6, 7]} # rk[28:31] + + # Store enc round keys + vse32.v $vk0, ($enc_key) # rk[0:3] + addi $enc_key, $enc_key, 16 + vse32.v $vk1, ($enc_key) # rk[4:7] + addi $enc_key, $enc_key, 16 + vse32.v $vk2, ($enc_key) # rk[8:11] + addi $enc_key, $enc_key, 16 + vse32.v $vk3, ($enc_key) # rk[12:15] + addi $enc_key, $enc_key, 16 + vse32.v $vk4, ($enc_key) # rk[16:19] + addi $enc_key, $enc_key, 16 + vse32.v $vk5, ($enc_key) # rk[20:23] + addi $enc_key, $enc_key, 16 + vse32.v $vk6, ($enc_key) # rk[24:27] + addi $enc_key, $enc_key, 16 + vse32.v $vk7, ($enc_key) # rk[28:31] + + # Store dec round keys in reverse order + addi $dec_key, $dec_key, 12 + li $stride, -4 + vsse32.v $vk7, ($dec_key), $stride # rk[31:28] + addi $dec_key, $dec_key, 16 + vsse32.v $vk6, ($dec_key), $stride # rk[27:24] + addi $dec_key, $dec_key, 16 + vsse32.v $vk5, 
($dec_key), $stride # rk[23:20] + addi $dec_key, $dec_key, 16 + vsse32.v $vk4, ($dec_key), $stride # rk[19:16] + addi $dec_key, $dec_key, 16 + vsse32.v $vk3, ($dec_key), $stride # rk[15:12] + addi $dec_key, $dec_key, 16 + vsse32.v $vk2, ($dec_key), $stride # rk[11:8] + addi $dec_key, $dec_key, 16 + vsse32.v $vk1, ($dec_key), $stride # rk[7:4] + addi $dec_key, $dec_key, 16 + vsse32.v $vk0, ($dec_key), $stride # rk[3:0] + + li a0, 0 + ret +.size rv64i_zvksed_sm4_set_key,.-rv64i_zvksed_sm4_set_key +___ +} + +#### +# void rv64i_zvksed_sm4_encrypt(const unsigned char *in, unsigned char *out, +# const SM4_KEY *key); +# +{ +my ($in,$out,$keys,$stride)=("a0","a1","a2","t0"); +my ($vdata,$vk0,$vk1,$vk2,$vk3,$vk4,$vk5,$vk6,$vk7,$vgen)=("v1","v2","v3","v4","v5","v6","v7","v8","v9","v10"); +$code .= <<___; +.p2align 3 +.globl rv64i_zvksed_sm4_encrypt +.type rv64i_zvksed_sm4_encrypt,\@function +rv64i_zvksed_sm4_encrypt: + vsetivli zero, 4, e32, m1, ta, ma + + # Load input data + vle32.v $vdata, ($in) + @{[vrev8_v $vdata, $vdata]} + + # Order of elements was adjusted in sm4_set_key() + # Encrypt with all keys + vle32.v $vk0, ($keys) # rk[0:3] + @{[vsm4r_vs $vdata, $vk0]} + addi $keys, $keys, 16 + vle32.v $vk1, ($keys) # rk[4:7] + @{[vsm4r_vs $vdata, $vk1]} + addi $keys, $keys, 16 + vle32.v $vk2, ($keys) # rk[8:11] + @{[vsm4r_vs $vdata, $vk2]} + addi $keys, $keys, 16 + vle32.v $vk3, ($keys) # rk[12:15] + @{[vsm4r_vs $vdata, $vk3]} + addi $keys, $keys, 16 + vle32.v $vk4, ($keys) # rk[16:19] + @{[vsm4r_vs $vdata, $vk4]} + addi $keys, $keys, 16 + vle32.v $vk5, ($keys) # rk[20:23] + @{[vsm4r_vs $vdata, $vk5]} + addi $keys, $keys, 16 + vle32.v $vk6, ($keys) # rk[24:27] + @{[vsm4r_vs $vdata, $vk6]} + addi $keys, $keys, 16 + vle32.v $vk7, ($keys) # rk[28:31] + @{[vsm4r_vs $vdata, $vk7]} + + # Save the ciphertext (in reverse element order) + @{[vrev8_v $vdata, $vdata]} + li $stride, -4 + addi $out, $out, 12 + vsse32.v $vdata, ($out), $stride + + ret +.size 
rv64i_zvksed_sm4_encrypt,.-rv64i_zvksed_sm4_encrypt +___ +} + +#### +# void rv64i_zvksed_sm4_decrypt(const unsigned char *in, unsigned char *out, +# const SM4_KEY *key); +# +{ +my ($in,$out,$keys,$stride)=("a0","a1","a2","t0"); +my ($vdata,$vk0,$vk1,$vk2,$vk3,$vk4,$vk5,$vk6,$vk7,$vgen)=("v1","v2","v3","v4","v5","v6","v7","v8","v9","v10"); +$code .= <<___; +.p2align 3 +.globl rv64i_zvksed_sm4_decrypt +.type rv64i_zvksed_sm4_decrypt,\@function +rv64i_zvksed_sm4_decrypt: + vsetivli zero, 4, e32, m1, ta, ma + + # Load input data + vle32.v $vdata, ($in) + @{[vrev8_v $vdata, $vdata]} + + # Order of key elements was adjusted in sm4_set_key() + # Decrypt with all keys + vle32.v $vk7, ($keys) # rk[31:28] + @{[vsm4r_vs $vdata, $vk7]} + addi $keys, $keys, 16 + vle32.v $vk6, ($keys) # rk[27:24] + @{[vsm4r_vs $vdata, $vk6]} + addi $keys, $keys, 16 + vle32.v $vk5, ($keys) # rk[23:20] + @{[vsm4r_vs $vdata, $vk5]} + addi $keys, $keys, 16 + vle32.v $vk4, ($keys) # rk[19:16] + @{[vsm4r_vs $vdata, $vk4]} + addi $keys, $keys, 16 + vle32.v $vk3, ($keys) # rk[15:12] + @{[vsm4r_vs $vdata, $vk3]} + addi $keys, $keys, 16 + vle32.v $vk2, ($keys) # rk[11:8] + @{[vsm4r_vs $vdata, $vk2]} + addi $keys, $keys, 16 + vle32.v $vk1, ($keys) # rk[7:4] + @{[vsm4r_vs $vdata, $vk1]} + addi $keys, $keys, 16 + vle32.v $vk0, ($keys) # rk[3:0] + @{[vsm4r_vs $vdata, $vk0]} + + # Save the plaintext (in reverse element order) + @{[vrev8_v $vdata, $vdata]} + li $stride, -4 + addi $out, $out, 12 + vsse32.v $vdata, ($out), $stride + + ret +.size rv64i_zvksed_sm4_decrypt,.-rv64i_zvksed_sm4_decrypt +___ +} + +$code .= <<___; +# Family Key (little-endian 32-bit chunks) +.p2align 3 +FK: + .word 0xA3B1BAC6, 0x56AA3350, 0x677D9197, 0xB27022DC +.size FK,.-FK +___ + +print $code; + +close STDOUT or die "error closing STDOUT: $!"; From patchwork Tue Dec 5 09:28:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13479643
Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 18A0FC10DCE for ; Tue, 5 Dec 2023 09:29:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=w8mz09LDOINMQ9xkUeKMNBfe8RvMd0ElAyE482gWyJs=; b=d1DyKrGk7RjpGy lxvTcK1jABHLWAl+8AuAFZvER5rC+LC+qWddudciMGtpCSYpZX007dhSDFNfXMbgORUX3/I1s5ZoO terKhCMqNevPYa5ulW2jUFCeQerhzB+nUEph4464OA71RdhNwrAw94YLAYYyth5EpQmDGtP9no1Om PqIYjpIGtyjjCL2c/0sqrEO4M+/AfxL54cyCJDJ53y4d6wfPXpdmtPvphR33cR/WhhW3Mo0x1+bCp IdugYgWd8VCEL5n+uOaWj9ZpFRmBmA4wayM6SiXYame4E/BpEBHZ3r+409m+NxUTPwuRHajKtDr1V 9okIKgo9CrfiCrwCjzXQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rARjg-006o5g-2r; Tue, 05 Dec 2023 09:29:00 +0000 Received: from mail-oo1-xc35.google.com ([2607:f8b0:4864:20::c35]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rARjY-006nzG-0W for linux-riscv@lists.infradead.org; Tue, 05 Dec 2023 09:28:58 +0000 Received: by mail-oo1-xc35.google.com with SMTP id 006d021491bc7-58ceabd7cdeso3302443eaf.3 for ; Tue, 05 Dec 2023 01:28:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1701768530; x=1702373330; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to 
:message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=UrD2FGOwoOZDuE5NlOXPwH6yjK7drVhBh94o1xKypaI=; b=Td20o2VdlNy+2GuZwqswDOAxRIWwNQdJmJ1r8hm5pdOICsrQdCfL02w8V58Ujb9zJP BzqGB/tQA4e8Aawrn7hLMpo2dhqhUH8+pgC+PzxwSkJR1gxKXElDWCJZmfr3MHsSy+Ji ZXM0IRKc5VGDRJuNhFOE35/BbVSL0Kw3MaKNWsm3Pal0AC8aNOsJ+YXrmjGcCyyY7Z49 /BAGrdVo7bB15iHquSvzVouoYr5F+7Q63YkX8GnbXtukpWyuDyQyaTYyLPHTrMgmCTX4 XfFu0PFeGIc2tls3b/+rK93F9VE8FnA6o6YceoBsoUU6vlCLlb70tSOMQCtOxEW3LgrW M6Mw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701768530; x=1702373330; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=UrD2FGOwoOZDuE5NlOXPwH6yjK7drVhBh94o1xKypaI=; b=J4rMF8VV5h4RQEbUuO2rH/QRy7o24bImkhw3Ymf0geTWxwtL9QmMelWjNMXzE73J3N Q6epIHl3/J4oxdxvZcIqIBdtsrjyNLRmRiCKtRkea1lh1tF17umq/JpNnntGv3FRrpb4 BWTCE+qwJnlrg8pVjHXxwAZnyJf+xRLyLndVR1ydB39Kv1NrCvJAVsGID9DPoUzbO7it Z7NapCWFlkVjrFbfIjrEMMcJwoFMNX3mZiGlpSf52qZBVUkEerTTbC05PJhx5TFgKTPY CSqAkovCKhQDZQjwyAv05vglUeWcGArzFHQ1UfEXBkqtT78Jpho9+ttzdy2nzPzbe6vh 984A== X-Gm-Message-State: AOJu0YxZkzLwuOqowiwbEB+U8BgT4UATT4DEimjWn4dUPF/bj9LQdn9j QkAgKsp9WbNQcppXz3V6EF2g2g== X-Google-Smtp-Source: AGHT+IFlw/eNWQfj1E9SO5ziBC4YcAjeADQV1vJQPokw/77J0tn0kdtl0+xtImh1bzSOK9d8sTqAdg== X-Received: by 2002:a05:6358:9486:b0:170:17eb:1de with SMTP id i6-20020a056358948600b0017017eb01demr2962408rwb.33.1701768530373; Tue, 05 Dec 2023 01:28:50 -0800 (PST) Received: from localhost.localdomain ([101.10.93.135]) by smtp.gmail.com with ESMTPSA id l6-20020a056a00140600b006cdd723bb6fsm8858788pfu.115.2023.12.05.01.28.47 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Tue, 05 Dec 2023 01:28:50 -0800 (PST) From: Jerry Shih To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, 
ardb@kernel.org, conor@kernel.org Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v3 11/12] RISC-V: crypto: add Zvksh accelerated SM3 implementation Date: Tue, 5 Dec 2023 17:28:00 +0800 Message-Id: <20231205092801.1335-12-jerry.shih@sifive.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20231205092801.1335-1-jerry.shih@sifive.com> References: <20231205092801.1335-1-jerry.shih@sifive.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231205_012852_236904_250D1174 X-CRM114-Status: GOOD ( 31.99 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org Add SM3 implementation using Zvksh vector crypto extension from OpenSSL (openssl/openssl#21923). Co-developed-by: Christoph Müllner Signed-off-by: Christoph Müllner Co-developed-by: Heiko Stuebner Signed-off-by: Heiko Stuebner Signed-off-by: Jerry Shih --- Changelog v3: - Use `SYM_TYPED_FUNC_START` for sm3 indirect-call asm symbol. - Use asm mnemonics for the instructions in RVV 1.0 extension. Changelog v2: - Do not turn on kconfig `SM3_RISCV64` option by default. - Add `asmlinkage` qualifier for crypto asm function. - Rename sm3-riscv64-zvkb-zvksh to sm3-riscv64-zvksh-zvkb. - Reorder structure sm3_alg members initialization in the order declared. 
--- arch/riscv/crypto/Kconfig | 12 ++ arch/riscv/crypto/Makefile | 7 + arch/riscv/crypto/sm3-riscv64-glue.c | 124 ++++++++++++++ arch/riscv/crypto/sm3-riscv64-zvksh.pl | 227 +++++++++++++++++++++++++ 4 files changed, 370 insertions(+) create mode 100644 arch/riscv/crypto/sm3-riscv64-glue.c create mode 100644 arch/riscv/crypto/sm3-riscv64-zvksh.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index b28cf1972250..7415fb303785 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -66,6 +66,18 @@ config CRYPTO_SHA512_RISCV64 - Zvknhb vector crypto extension - Zvkb vector crypto extension +config CRYPTO_SM3_RISCV64 + tristate "Hash functions: SM3 (ShangMi 3)" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_HASH + select CRYPTO_SM3 + help + SM3 (ShangMi 3) secure hash function (OSCCA GM/T 0004-2012) + + Architecture: riscv64 using: + - Zvksh vector crypto extension + - Zvkb vector crypto extension + config CRYPTO_SM4_RISCV64 tristate "Ciphers: SM4 (ShangMi 4)" depends on 64BIT && RISCV_ISA_V diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index 8e34861bba34..b1f857695c1c 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -18,6 +18,9 @@ sha256-riscv64-y := sha256-riscv64-glue.o sha256-riscv64-zvknha_or_zvknhb-zvkb.o obj-$(CONFIG_CRYPTO_SHA512_RISCV64) += sha512-riscv64.o sha512-riscv64-y := sha512-riscv64-glue.o sha512-riscv64-zvknhb-zvkb.o +obj-$(CONFIG_CRYPTO_SM3_RISCV64) += sm3-riscv64.o +sm3-riscv64-y := sm3-riscv64-glue.o sm3-riscv64-zvksh.o + obj-$(CONFIG_CRYPTO_SM4_RISCV64) += sm4-riscv64.o sm4-riscv64-y := sm4-riscv64-glue.o sm4-riscv64-zvksed.o @@ -42,6 +45,9 @@ $(obj)/sha256-riscv64-zvknha_or_zvknhb-zvkb.S: $(src)/sha256-riscv64-zvknha_or_z $(obj)/sha512-riscv64-zvknhb-zvkb.S: $(src)/sha512-riscv64-zvknhb-zvkb.pl $(call cmd,perlasm) +$(obj)/sm3-riscv64-zvksh.S: $(src)/sm3-riscv64-zvksh.pl + $(call cmd,perlasm) + $(obj)/sm4-riscv64-zvksed.S: $(src)/sm4-riscv64-zvksed.pl 
$(call cmd,perlasm) @@ -51,4 +57,5 @@ clean-files += aes-riscv64-zvkned-zvkb.S clean-files += ghash-riscv64-zvkg.S clean-files += sha256-riscv64-zvknha_or_zvknhb-zvkb.S clean-files += sha512-riscv64-zvknhb-zvkb.S +clean-files += sm3-riscv64-zvksh.S clean-files += sm4-riscv64-zvksed.S diff --git a/arch/riscv/crypto/sm3-riscv64-glue.c b/arch/riscv/crypto/sm3-riscv64-glue.c new file mode 100644 index 000000000000..0e5a2b84c930 --- /dev/null +++ b/arch/riscv/crypto/sm3-riscv64-glue.c @@ -0,0 +1,124 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Linux/riscv64 port of the OpenSSL SM3 implementation for RISC-V 64 + * + * Copyright (C) 2023 VRULL GmbH + * Author: Heiko Stuebner + * + * Copyright (C) 2023 SiFive, Inc. + * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * sm3 using zvksh vector crypto extension + * + * This asm function takes only the first 256 bits as the sm3 state from + * the pointer to `struct sm3_state`. + */ +asmlinkage void ossl_hwsm3_block_data_order_zvksh(struct sm3_state *digest, + u8 const *o, int num); + +static int riscv64_sm3_update(struct shash_desc *desc, const u8 *data, + unsigned int len) +{ + int ret = 0; + + /* + * Make sure struct sm3_state begins directly with the SM3 256-bit internal + * state, as this is what the asm function expects.
+ */ + BUILD_BUG_ON(offsetof(struct sm3_state, state) != 0); + + if (crypto_simd_usable()) { + kernel_vector_begin(); + ret = sm3_base_do_update(desc, data, len, + ossl_hwsm3_block_data_order_zvksh); + kernel_vector_end(); + } else { + sm3_update(shash_desc_ctx(desc), data, len); + } + + return ret; +} + +static int riscv64_sm3_finup(struct shash_desc *desc, const u8 *data, + unsigned int len, u8 *out) +{ + struct sm3_state *ctx; + + if (crypto_simd_usable()) { + kernel_vector_begin(); + if (len) + sm3_base_do_update(desc, data, len, + ossl_hwsm3_block_data_order_zvksh); + sm3_base_do_finalize(desc, ossl_hwsm3_block_data_order_zvksh); + kernel_vector_end(); + + return sm3_base_finish(desc, out); + } + + ctx = shash_desc_ctx(desc); + if (len) + sm3_update(ctx, data, len); + sm3_final(ctx, out); + + return 0; +} + +static int riscv64_sm3_final(struct shash_desc *desc, u8 *out) +{ + return riscv64_sm3_finup(desc, NULL, 0, out); +} + +static struct shash_alg sm3_alg = { + .init = sm3_base_init, + .update = riscv64_sm3_update, + .final = riscv64_sm3_final, + .finup = riscv64_sm3_finup, + .descsize = sizeof(struct sm3_state), + .digestsize = SM3_DIGEST_SIZE, + .base = { + .cra_blocksize = SM3_BLOCK_SIZE, + .cra_priority = 150, + .cra_name = "sm3", + .cra_driver_name = "sm3-riscv64-zvksh-zvkb", + .cra_module = THIS_MODULE, + }, +}; + +static inline bool check_sm3_ext(void) +{ + return riscv_isa_extension_available(NULL, ZVKSH) && + riscv_isa_extension_available(NULL, ZVKB) && + riscv_vector_vlen() >= 128; +} + +static int __init riscv64_sm3_mod_init(void) +{ + if (check_sm3_ext()) + return crypto_register_shash(&sm3_alg); + + return -ENODEV; +} + +static void __exit riscv64_sm3_mod_fini(void) +{ + crypto_unregister_shash(&sm3_alg); +} + +module_init(riscv64_sm3_mod_init); +module_exit(riscv64_sm3_mod_fini); + +MODULE_DESCRIPTION("SM3 (RISC-V accelerated)"); +MODULE_AUTHOR("Heiko Stuebner "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("sm3"); diff --git 
a/arch/riscv/crypto/sm3-riscv64-zvksh.pl b/arch/riscv/crypto/sm3-riscv64-zvksh.pl new file mode 100644 index 000000000000..6a2399d3a5cf --- /dev/null +++ b/arch/riscv/crypto/sm3-riscv64-zvksh.pl @@ -0,0 +1,227 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Christoph Müllner +# Copyright (c) 2023, Jerry Shih +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# The generated code of this file depends on the following RISC-V extensions: +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 +# - RISC-V Vector Cryptography Bit-manipulation extension ('Zvkb') +# - RISC-V Vector SM3 Secure Hash extension ('Zvksh') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +#include + +.text +___ + +################################################################################ +# ossl_hwsm3_block_data_order_zvksh(SM3_CTX *c, const void *p, size_t num); +{ +my ($CTX, $INPUT, $NUM) = ("a0", "a1", "a2"); +my ($V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, + $V8, $V9, $V10, $V11, $V12, $V13, $V14, $V15, + $V16, $V17, $V18, $V19, $V20, $V21, $V22, $V23, + $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, +) = map("v$_",(0..31)); + +$code .= <<___; +SYM_TYPED_FUNC_START(ossl_hwsm3_block_data_order_zvksh) + vsetivli zero, 8, e32, m2, ta, ma + + # Load initial state of hash context (c->A-H). + vle32.v $V0, ($CTX) + @{[vrev8_v $V0, $V0]} + +L_sm3_loop: + # Copy the previous state to v2. 
+ # It will be XOR'ed with the current state at the end of the round. + vmv.v.v $V2, $V0 + + # Load the 64B block in 2x32B chunks. + vle32.v $V6, ($INPUT) # v6 := {w7, ..., w0} + addi $INPUT, $INPUT, 32 + + vle32.v $V8, ($INPUT) # v8 := {w15, ..., w8} + addi $INPUT, $INPUT, 32 + + addi $NUM, $NUM, -1 + + # As vsm3c consumes only w0, w1, w4, w5 we need to slide the input + # 2 elements down so we process elements w2, w3, w6, w7 + # This will be repeated for each odd round. + vslidedown.vi $V4, $V6, 2 # v4 := {X, X, w7, ..., w2} + + @{[vsm3c_vi $V0, $V6, 0]} + @{[vsm3c_vi $V0, $V4, 1]} + + # Prepare a vector with {w11, ..., w4} + vslidedown.vi $V4, $V4, 2 # v4 := {X, X, X, X, w7, ..., w4} + vslideup.vi $V4, $V8, 4 # v4 := {w11, w10, w9, w8, w7, w6, w5, w4} + + @{[vsm3c_vi $V0, $V4, 2]} + vslidedown.vi $V4, $V4, 2 # v4 := {X, X, w11, w10, w9, w8, w7, w6} + @{[vsm3c_vi $V0, $V4, 3]} + + @{[vsm3c_vi $V0, $V8, 4]} + vslidedown.vi $V4, $V8, 2 # v4 := {X, X, w15, w14, w13, w12, w11, w10} + @{[vsm3c_vi $V0, $V4, 5]} + + @{[vsm3me_vv $V6, $V8, $V6]} # v6 := {w23, w22, w21, w20, w19, w18, w17, w16} + + # Prepare a register with {w19, w18, w17, w16, w15, w14, w13, w12} + vslidedown.vi $V4, $V4, 2 # v4 := {X, X, X, X, w15, w14, w13, w12} + vslideup.vi $V4, $V6, 4 # v4 := {w19, w18, w17, w16, w15, w14, w13, w12} + + @{[vsm3c_vi $V0, $V4, 6]} + vslidedown.vi $V4, $V4, 2 # v4 := {X, X, w19, w18, w17, w16, w15, w14} + @{[vsm3c_vi $V0, $V4, 7]} + + @{[vsm3c_vi $V0, $V6, 8]} + vslidedown.vi $V4, $V6, 2 # v4 := {X, X, w23, w22, w21, w20, w19, w18} + @{[vsm3c_vi $V0, $V4, 9]} + + @{[vsm3me_vv $V8, $V6, $V8]} # v8 := {w31, w30, w29, w28, w27, w26, w25, w24} + + # Prepare a register with {w27, w26, w25, w24, w23, w22, w21, w20} + vslidedown.vi $V4, $V4, 2 # v4 := {X, X, X, X, w23, w22, w21, w20} + vslideup.vi $V4, $V8, 4 # v4 := {w27, w26, w25, w24, w23, w22, w21, w20} + + @{[vsm3c_vi $V0, $V4, 10]} + vslidedown.vi $V4, $V4, 2 # v4 := {X, X, w27, w26, w25, w24, w23, w22} + @{[vsm3c_vi 
$V0, $V4, 11]} + + @{[vsm3c_vi $V0, $V8, 12]} + vslidedown.vi $V4, $V8, 2 # v4 := {X, X, w31, w30, w29, w28, w27, w26} + @{[vsm3c_vi $V0, $V4, 13]} + + @{[vsm3me_vv $V6, $V8, $V6]} # v6 := {w39, w38, w37, w36, w35, w34, w33, w32} + + # Prepare a register with {w35, w34, w33, w32, w31, w30, w29, w28} + vslidedown.vi $V4, $V4, 2 # v4 := {X, X, X, X, w31, w30, w29, w28} + vslideup.vi $V4, $V6, 4 # v4 := {w35, w34, w33, w32, w31, w30, w29, w28} + + @{[vsm3c_vi $V0, $V4, 14]} + vslidedown.vi $V4, $V4, 2 # v4 := {X, X, w35, w34, w33, w32, w31, w30} + @{[vsm3c_vi $V0, $V4, 15]} + + @{[vsm3c_vi $V0, $V6, 16]} + vslidedown.vi $V4, $V6, 2 # v4 := {X, X, w39, w38, w37, w36, w35, w34} + @{[vsm3c_vi $V0, $V4, 17]} + + @{[vsm3me_vv $V8, $V6, $V8]} # v8 := {w47, w46, w45, w44, w43, w42, w41, w40} + + # Prepare a register with {w43, w42, w41, w40, w39, w38, w37, w36} + vslidedown.vi $V4, $V4, 2 # v4 := {X, X, X, X, w39, w38, w37, w36} + vslideup.vi $V4, $V8, 4 # v4 := {w43, w42, w41, w40, w39, w38, w37, w36} + + @{[vsm3c_vi $V0, $V4, 18]} + vslidedown.vi $V4, $V4, 2 # v4 := {X, X, w43, w42, w41, w40, w39, w38} + @{[vsm3c_vi $V0, $V4, 19]} + + @{[vsm3c_vi $V0, $V8, 20]} + vslidedown.vi $V4, $V8, 2 # v4 := {X, X, w47, w46, w45, w44, w43, w42} + @{[vsm3c_vi $V0, $V4, 21]} + + @{[vsm3me_vv $V6, $V8, $V6]} # v6 := {w55, w54, w53, w52, w51, w50, w49, w48} + + # Prepare a register with {w51, w50, w49, w48, w47, w46, w45, w44} + vslidedown.vi $V4, $V4, 2 # v4 := {X, X, X, X, w47, w46, w45, w44} + vslideup.vi $V4, $V6, 4 # v4 := {w51, w50, w49, w48, w47, w46, w45, w44} + + @{[vsm3c_vi $V0, $V4, 22]} + vslidedown.vi $V4, $V4, 2 # v4 := {X, X, w51, w50, w49, w48, w47, w46} + @{[vsm3c_vi $V0, $V4, 23]} + + @{[vsm3c_vi $V0, $V6, 24]} + vslidedown.vi $V4, $V6, 2 # v4 := {X, X, w55, w54, w53, w52, w51, w50} + @{[vsm3c_vi $V0, $V4, 25]} + + @{[vsm3me_vv $V8, $V6, $V8]} # v8 := {w63, w62, w61, w60, w59, w58, w57, w56} + + # Prepare a register with {w59, w58, w57, w56, w55, w54, w53, w52} +
vslidedown.vi $V4, $V4, 2 # v4 := {X, X, X, X, w55, w54, w53, w52} + vslideup.vi $V4, $V8, 4 # v4 := {w59, w58, w57, w56, w55, w54, w53, w52} + + @{[vsm3c_vi $V0, $V4, 26]} + vslidedown.vi $V4, $V4, 2 # v4 := {X, X, w59, w58, w57, w56, w55, w54} + @{[vsm3c_vi $V0, $V4, 27]} + + @{[vsm3c_vi $V0, $V8, 28]} + vslidedown.vi $V4, $V8, 2 # v4 := {X, X, w63, w62, w61, w60, w59, w58} + @{[vsm3c_vi $V0, $V4, 29]} + + @{[vsm3me_vv $V6, $V8, $V6]} # v6 := {w71, w70, w69, w68, w67, w66, w65, w64} + + # Prepare a register with {w67, w66, w65, w64, w63, w62, w61, w60} + vslidedown.vi $V4, $V4, 2 # v4 := {X, X, X, X, w63, w62, w61, w60} + vslideup.vi $V4, $V6, 4 # v4 := {w67, w66, w65, w64, w63, w62, w61, w60} + + @{[vsm3c_vi $V0, $V4, 30]} + vslidedown.vi $V4, $V4, 2 # v4 := {X, X, w67, w66, w65, w64, w63, w62} + @{[vsm3c_vi $V0, $V4, 31]} + + # XOR in the previous state. + vxor.vv $V0, $V0, $V2 + + bnez $NUM, L_sm3_loop # Check if there are any more blocks to process +L_sm3_end: + @{[vrev8_v $V0, $V0]} + vse32.v $V0, ($CTX) + ret +SYM_FUNC_END(ossl_hwsm3_block_data_order_zvksh) +___ +} + +print $code; + +close STDOUT or die "error closing STDOUT: $!";
From patchwork Tue Dec 5 09:28:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13479644 From: Jerry Shih To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org, conor@kernel.org Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v3 12/12] RISC-V: crypto: add Zvkb accelerated ChaCha20 implementation Date: Tue, 5 Dec 2023 17:28:01 +0800 Message-Id: <20231205092801.1335-13-jerry.shih@sifive.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20231205092801.1335-1-jerry.shih@sifive.com> References: <20231205092801.1335-1-jerry.shih@sifive.com>
Add a ChaCha20 vector implementation from OpenSSL (openssl/openssl#21923). Signed-off-by: Jerry Shih --- Changelog v3: - Rename kconfig CRYPTO_CHACHA20_RISCV64 to CRYPTO_CHACHA_RISCV64. - Rename chacha20_encrypt() to riscv64_chacha20_encrypt(). - Use asm mnemonics for the instructions in RVV 1.0 extension. Changelog v2: - Do not turn on kconfig `CHACHA20_RISCV64` option by default. - Use simd skcipher interface. - Add `asmlinkage` qualifier for crypto asm function. - Reorder structure riscv64_chacha_alg_zvkb members initialization in the order declared. - Use smaller iv buffer instead of whole state matrix as chacha20's input.
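Note on the last v2 item: the asm routine takes only the 8 key words and a 4-word counter/nonce buffer and rebuilds the 16-word ChaCha state itself. The layout it reconstructs can be sketched in plain C as below. This is an illustration only; the helper name chacha_sketch_init_state() is hypothetical and does not appear in the patch, and the sketch assumes the usual layout of constants in words 0..3, key in words 4..11, and counter/nonce in words 12..15, which matches the register assignment in the perlasm.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical helper, not part of the patch: assemble the 16-word
 * ChaCha state from the key and the 4-word counter/nonce buffer.
 */
static void chacha_sketch_init_state(uint32_t state[16],
				     const uint32_t key[8],
				     const uint32_t iv[4])
{
	/* "expa", "nd 3", "2-by", "te k" as little-endian words */
	static const uint32_t sigma[4] = {
		0x61707865, 0x3320646e, 0x79622d32, 0x6b206574
	};

	memcpy(&state[0], sigma, sizeof(sigma));         /* words 0..3   */
	memcpy(&state[4], key, 8 * sizeof(uint32_t));    /* words 4..11  */
	memcpy(&state[12], iv, 4 * sizeof(uint32_t));    /* words 12..15 */
}
```

In the vector code each state word lives in its own vector register, one lane per block, so only word 12 (the low counter word) differs between lanes.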
--- arch/riscv/crypto/Kconfig | 12 + arch/riscv/crypto/Makefile | 7 + arch/riscv/crypto/chacha-riscv64-glue.c | 122 +++++++++ arch/riscv/crypto/chacha-riscv64-zvkb.pl | 321 +++++++++++++++++++++++ 4 files changed, 462 insertions(+) create mode 100644 arch/riscv/crypto/chacha-riscv64-glue.c create mode 100644 arch/riscv/crypto/chacha-riscv64-zvkb.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index 7415fb303785..a5c19532400e 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -34,6 +34,18 @@ config CRYPTO_AES_BLOCK_RISCV64 - Zvkb vector crypto extension (CTR/XTS) - Zvkg vector crypto extension (XTS) +config CRYPTO_CHACHA_RISCV64 + tristate "Ciphers: ChaCha" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_SIMD + select CRYPTO_SKCIPHER + select CRYPTO_LIB_CHACHA_GENERIC + help + Length-preserving ciphers: ChaCha20 stream cipher algorithm + + Architecture: riscv64 using: + - Zvkb vector crypto extension + config CRYPTO_GHASH_RISCV64 tristate "Hash functions: GHASH" depends on 64BIT && RISCV_ISA_V diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index b1f857695c1c..31021eb3929c 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -9,6 +9,9 @@ aes-riscv64-y := aes-riscv64-glue.o aes-riscv64-zvkned.o obj-$(CONFIG_CRYPTO_AES_BLOCK_RISCV64) += aes-block-riscv64.o aes-block-riscv64-y := aes-riscv64-block-mode-glue.o aes-riscv64-zvkned-zvbb-zvkg.o aes-riscv64-zvkned-zvkb.o +obj-$(CONFIG_CRYPTO_CHACHA_RISCV64) += chacha-riscv64.o +chacha-riscv64-y := chacha-riscv64-glue.o chacha-riscv64-zvkb.o + obj-$(CONFIG_CRYPTO_GHASH_RISCV64) += ghash-riscv64.o ghash-riscv64-y := ghash-riscv64-glue.o ghash-riscv64-zvkg.o @@ -36,6 +39,9 @@ $(obj)/aes-riscv64-zvkned-zvbb-zvkg.S: $(src)/aes-riscv64-zvkned-zvbb-zvkg.pl $(obj)/aes-riscv64-zvkned-zvkb.S: $(src)/aes-riscv64-zvkned-zvkb.pl $(call cmd,perlasm) +$(obj)/chacha-riscv64-zvkb.S: $(src)/chacha-riscv64-zvkb.pl + $(call cmd,perlasm) + 
$(obj)/ghash-riscv64-zvkg.S: $(src)/ghash-riscv64-zvkg.pl $(call cmd,perlasm) @@ -54,6 +60,7 @@ $(obj)/sm4-riscv64-zvksed.S: $(src)/sm4-riscv64-zvksed.pl clean-files += aes-riscv64-zvkned.S clean-files += aes-riscv64-zvkned-zvbb-zvkg.S clean-files += aes-riscv64-zvkned-zvkb.S +clean-files += chacha-riscv64-zvkb.S clean-files += ghash-riscv64-zvkg.S clean-files += sha256-riscv64-zvknha_or_zvknhb-zvkb.S clean-files += sha512-riscv64-zvknhb-zvkb.S diff --git a/arch/riscv/crypto/chacha-riscv64-glue.c b/arch/riscv/crypto/chacha-riscv64-glue.c new file mode 100644 index 000000000000..3beaa23fcb64 --- /dev/null +++ b/arch/riscv/crypto/chacha-riscv64-glue.c @@ -0,0 +1,122 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Port of the OpenSSL ChaCha20 implementation for RISC-V 64 + * + * Copyright (C) 2023 SiFive, Inc. + * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* chacha20 using zvkb vector crypto extension */ +asmlinkage void ChaCha20_ctr32_zvkb(u8 *out, const u8 *input, size_t len, + const u32 *key, const u32 *counter); + +static int riscv64_chacha20_encrypt(struct skcipher_request *req) +{ + u32 iv[CHACHA_IV_SIZE / sizeof(u32)]; + u8 block_buffer[CHACHA_BLOCK_SIZE]; + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + const struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + unsigned int nbytes; + unsigned int tail_bytes; + int err; + + iv[0] = get_unaligned_le32(req->iv); + iv[1] = get_unaligned_le32(req->iv + 4); + iv[2] = get_unaligned_le32(req->iv + 8); + iv[3] = get_unaligned_le32(req->iv + 12); + + err = skcipher_walk_virt(&walk, req, false); + while (walk.nbytes) { + nbytes = walk.nbytes & (~(CHACHA_BLOCK_SIZE - 1)); + tail_bytes = walk.nbytes & (CHACHA_BLOCK_SIZE - 1); + kernel_vector_begin(); + if (nbytes) { + ChaCha20_ctr32_zvkb(walk.dst.virt.addr, + walk.src.virt.addr, nbytes, + ctx->key, iv); + iv[0] += nbytes / CHACHA_BLOCK_SIZE; + } + if 
(walk.nbytes == walk.total && tail_bytes > 0) { + memcpy(block_buffer, walk.src.virt.addr + nbytes, + tail_bytes); + ChaCha20_ctr32_zvkb(block_buffer, block_buffer, + CHACHA_BLOCK_SIZE, ctx->key, iv); + memcpy(walk.dst.virt.addr + nbytes, block_buffer, + tail_bytes); + tail_bytes = 0; + } + kernel_vector_end(); + + err = skcipher_walk_done(&walk, tail_bytes); + } + + return err; +} + +static struct skcipher_alg riscv64_chacha_alg_zvkb[] = { + { + .setkey = chacha20_setkey, + .encrypt = riscv64_chacha20_encrypt, + .decrypt = riscv64_chacha20_encrypt, + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = CHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .walksize = CHACHA_BLOCK_SIZE * 4, + .base = { + .cra_flags = CRYPTO_ALG_INTERNAL, + .cra_blocksize = 1, + .cra_ctxsize = sizeof(struct chacha_ctx), + .cra_priority = 300, + .cra_name = "__chacha20", + .cra_driver_name = "__chacha20-riscv64-zvkb", + .cra_module = THIS_MODULE, + }, + } +}; + +static struct simd_skcipher_alg + *riscv64_chacha_simd_alg_zvkb[ARRAY_SIZE(riscv64_chacha_alg_zvkb)]; + +static inline bool check_chacha20_ext(void) +{ + return riscv_isa_extension_available(NULL, ZVKB) && + riscv_vector_vlen() >= 128; +} + +static int __init riscv64_chacha_mod_init(void) +{ + if (check_chacha20_ext()) + return simd_register_skciphers_compat( + riscv64_chacha_alg_zvkb, + ARRAY_SIZE(riscv64_chacha_alg_zvkb), + riscv64_chacha_simd_alg_zvkb); + + return -ENODEV; +} + +static void __exit riscv64_chacha_mod_fini(void) +{ + simd_unregister_skciphers(riscv64_chacha_alg_zvkb, + ARRAY_SIZE(riscv64_chacha_alg_zvkb), + riscv64_chacha_simd_alg_zvkb); +} + +module_init(riscv64_chacha_mod_init); +module_exit(riscv64_chacha_mod_fini); + +MODULE_DESCRIPTION("ChaCha20 (RISC-V accelerated)"); +MODULE_AUTHOR("Jerry Shih "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("chacha20"); diff --git a/arch/riscv/crypto/chacha-riscv64-zvkb.pl b/arch/riscv/crypto/chacha-riscv64-zvkb.pl new file mode 100644 index 
000000000000..a76069f62e11 --- /dev/null +++ b/arch/riscv/crypto/chacha-riscv64-zvkb.pl @@ -0,0 +1,321 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023-2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You may not use +# this file except in compliance with the License. You can obtain a copy +# in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Jerry Shih +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 +# - RISC-V Vector Cryptography Bit-manipulation extension ('Zvkb') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? 
shift : undef; + +$output and open STDOUT, ">$output"; + +my $code = <<___; +.text +___ + +# void ChaCha20_ctr32_zvkb(unsigned char *out, const unsigned char *inp, +# size_t len, const unsigned int key[8], +# const unsigned int counter[4]); +################################################################################ +my ( $OUTPUT, $INPUT, $LEN, $KEY, $COUNTER ) = ( "a0", "a1", "a2", "a3", "a4" ); +my ( $T0 ) = ( "t0" ); +my ( $CONST_DATA0, $CONST_DATA1, $CONST_DATA2, $CONST_DATA3 ) = + ( "a5", "a6", "a7", "t1" ); +my ( $KEY0, $KEY1, $KEY2,$KEY3, $KEY4, $KEY5, $KEY6, $KEY7, + $COUNTER0, $COUNTER1, $NONCE0, $NONCE1 +) = ( "s0", "s1", "s2", "s3", "s4", "s5", "s6", + "s7", "s8", "s9", "s10", "s11" ); +my ( $VL, $STRIDE, $CHACHA_LOOP_COUNT ) = ( "t2", "t3", "t4" ); +my ( + $V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, $V8, $V9, $V10, + $V11, $V12, $V13, $V14, $V15, $V16, $V17, $V18, $V19, $V20, $V21, + $V22, $V23, $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, +) = map( "v$_", ( 0 .. 31 ) ); + +sub chacha_quad_round_group { + my ( + $A0, $B0, $C0, $D0, $A1, $B1, $C1, $D1, + $A2, $B2, $C2, $D2, $A3, $B3, $C3, $D3 + ) = @_; + + my $code = <<___; + # a += b; d ^= a; d <<<= 16; + vadd.vv $A0, $A0, $B0 + vadd.vv $A1, $A1, $B1 + vadd.vv $A2, $A2, $B2 + vadd.vv $A3, $A3, $B3 + vxor.vv $D0, $D0, $A0 + vxor.vv $D1, $D1, $A1 + vxor.vv $D2, $D2, $A2 + vxor.vv $D3, $D3, $A3 + @{[vror_vi $D0, $D0, 32 - 16]} + @{[vror_vi $D1, $D1, 32 - 16]} + @{[vror_vi $D2, $D2, 32 - 16]} + @{[vror_vi $D3, $D3, 32 - 16]} + # c += d; b ^= c; b <<<= 12; + vadd.vv $C0, $C0, $D0 + vadd.vv $C1, $C1, $D1 + vadd.vv $C2, $C2, $D2 + vadd.vv $C3, $C3, $D3 + vxor.vv $B0, $B0, $C0 + vxor.vv $B1, $B1, $C1 + vxor.vv $B2, $B2, $C2 + vxor.vv $B3, $B3, $C3 + @{[vror_vi $B0, $B0, 32 - 12]} + @{[vror_vi $B1, $B1, 32 - 12]} + @{[vror_vi $B2, $B2, 32 - 12]} + @{[vror_vi $B3, $B3, 32 - 12]} + # a += b; d ^= a; d <<<= 8; + vadd.vv $A0, $A0, $B0 + vadd.vv $A1, $A1, $B1 + vadd.vv $A2, $A2, $B2 + vadd.vv $A3, $A3, $B3 + 
vxor.vv $D0, $D0, $A0 + vxor.vv $D1, $D1, $A1 + vxor.vv $D2, $D2, $A2 + vxor.vv $D3, $D3, $A3 + @{[vror_vi $D0, $D0, 32 - 8]} + @{[vror_vi $D1, $D1, 32 - 8]} + @{[vror_vi $D2, $D2, 32 - 8]} + @{[vror_vi $D3, $D3, 32 - 8]} + # c += d; b ^= c; b <<<= 7; + vadd.vv $C0, $C0, $D0 + vadd.vv $C1, $C1, $D1 + vadd.vv $C2, $C2, $D2 + vadd.vv $C3, $C3, $D3 + vxor.vv $B0, $B0, $C0 + vxor.vv $B1, $B1, $C1 + vxor.vv $B2, $B2, $C2 + vxor.vv $B3, $B3, $C3 + @{[vror_vi $B0, $B0, 32 - 7]} + @{[vror_vi $B1, $B1, 32 - 7]} + @{[vror_vi $B2, $B2, 32 - 7]} + @{[vror_vi $B3, $B3, 32 - 7]} +___ + + return $code; +} + +$code .= <<___; +.p2align 3 +.globl ChaCha20_ctr32_zvkb +.type ChaCha20_ctr32_zvkb,\@function +ChaCha20_ctr32_zvkb: + srli $LEN, $LEN, 6 + beqz $LEN, .Lend + + addi sp, sp, -96 + sd s0, 0(sp) + sd s1, 8(sp) + sd s2, 16(sp) + sd s3, 24(sp) + sd s4, 32(sp) + sd s5, 40(sp) + sd s6, 48(sp) + sd s7, 56(sp) + sd s8, 64(sp) + sd s9, 72(sp) + sd s10, 80(sp) + sd s11, 88(sp) + + li $STRIDE, 64 + + #### chacha block data + # "expa" little endian + li $CONST_DATA0, 0x61707865 + # "nd 3" little endian + li $CONST_DATA1, 0x3320646e + # "2-by" little endian + li $CONST_DATA2, 0x79622d32 + # "te k" little endian + li $CONST_DATA3, 0x6b206574 + + lw $KEY0, 0($KEY) + lw $KEY1, 4($KEY) + lw $KEY2, 8($KEY) + lw $KEY3, 12($KEY) + lw $KEY4, 16($KEY) + lw $KEY5, 20($KEY) + lw $KEY6, 24($KEY) + lw $KEY7, 28($KEY) + + lw $COUNTER0, 0($COUNTER) + lw $COUNTER1, 4($COUNTER) + lw $NONCE0, 8($COUNTER) + lw $NONCE1, 12($COUNTER) + +.Lblock_loop: + vsetvli $VL, $LEN, e32, m1, ta, ma + + # init chacha const states + vmv.v.x $V0, $CONST_DATA0 + vmv.v.x $V1, $CONST_DATA1 + vmv.v.x $V2, $CONST_DATA2 + vmv.v.x $V3, $CONST_DATA3 + + # init chacha key states + vmv.v.x $V4, $KEY0 + vmv.v.x $V5, $KEY1 + vmv.v.x $V6, $KEY2 + vmv.v.x $V7, $KEY3 + vmv.v.x $V8, $KEY4 + vmv.v.x $V9, $KEY5 + vmv.v.x $V10, $KEY6 + vmv.v.x $V11, $KEY7 + + # init chacha counter states + vid.v $V12 + vadd.vx $V12, $V12, $COUNTER0 + vmv.v.x $V13,
$COUNTER1 + + # init chacha nonce states + vmv.v.x $V14, $NONCE0 + vmv.v.x $V15, $NONCE1 + + # load the top-half of input data + vlsseg8e32.v $V16, ($INPUT), $STRIDE + + li $CHACHA_LOOP_COUNT, 10 +.Lround_loop: + addi $CHACHA_LOOP_COUNT, $CHACHA_LOOP_COUNT, -1 + @{[chacha_quad_round_group + $V0, $V4, $V8, $V12, + $V1, $V5, $V9, $V13, + $V2, $V6, $V10, $V14, + $V3, $V7, $V11, $V15]} + @{[chacha_quad_round_group + $V0, $V5, $V10, $V15, + $V1, $V6, $V11, $V12, + $V2, $V7, $V8, $V13, + $V3, $V4, $V9, $V14]} + bnez $CHACHA_LOOP_COUNT, .Lround_loop + + # load the bottom-half of input data + addi $T0, $INPUT, 32 + vlsseg8e32.v $V24, ($T0), $STRIDE + + # add chacha top-half initial block states + vadd.vx $V0, $V0, $CONST_DATA0 + vadd.vx $V1, $V1, $CONST_DATA1 + vadd.vx $V2, $V2, $CONST_DATA2 + vadd.vx $V3, $V3, $CONST_DATA3 + vadd.vx $V4, $V4, $KEY0 + vadd.vx $V5, $V5, $KEY1 + vadd.vx $V6, $V6, $KEY2 + vadd.vx $V7, $V7, $KEY3 + # xor with the top-half input + vxor.vv $V16, $V16, $V0 + vxor.vv $V17, $V17, $V1 + vxor.vv $V18, $V18, $V2 + vxor.vv $V19, $V19, $V3 + vxor.vv $V20, $V20, $V4 + vxor.vv $V21, $V21, $V5 + vxor.vv $V22, $V22, $V6 + vxor.vv $V23, $V23, $V7 + + # save the top-half of output + vssseg8e32.v $V16, ($OUTPUT), $STRIDE + + # add chacha bottom-half initial block states + vadd.vx $V8, $V8, $KEY4 + vadd.vx $V9, $V9, $KEY5 + vadd.vx $V10, $V10, $KEY6 + vadd.vx $V11, $V11, $KEY7 + vid.v $V0 + vadd.vx $V12, $V12, $COUNTER0 + vadd.vx $V13, $V13, $COUNTER1 + vadd.vx $V14, $V14, $NONCE0 + vadd.vx $V15, $V15, $NONCE1 + vadd.vv $V12, $V12, $V0 + # xor with the bottom-half input + vxor.vv $V24, $V24, $V8 + vxor.vv $V25, $V25, $V9 + vxor.vv $V26, $V26, $V10 + vxor.vv $V27, $V27, $V11 + vxor.vv $V29, $V29, $V13 + vxor.vv $V28, $V28, $V12 + vxor.vv $V30, $V30, $V14 + vxor.vv $V31, $V31, $V15 + + # save the bottom-half of output + addi $T0, $OUTPUT, 32 + vssseg8e32.v $V24, ($T0), $STRIDE + + # update counter + add $COUNTER0, $COUNTER0, $VL + sub $LEN, $LEN, $VL + # increase 
offset for `4 * 16 * VL = 64 * VL` + slli $T0, $VL, 6 + add $INPUT, $INPUT, $T0 + add $OUTPUT, $OUTPUT, $T0 + bnez $LEN, .Lblock_loop + + ld s0, 0(sp) + ld s1, 8(sp) + ld s2, 16(sp) + ld s3, 24(sp) + ld s4, 32(sp) + ld s5, 40(sp) + ld s6, 48(sp) + ld s7, 56(sp) + ld s8, 64(sp) + ld s9, 72(sp) + ld s10, 80(sp) + ld s11, 88(sp) + addi sp, sp, 96 + +.Lend: + ret +.size ChaCha20_ctr32_zvkb,.-ChaCha20_ctr32_zvkb +___ + +print $code; + +close STDOUT or die "error closing STDOUT: $!";
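For readers following chacha_quad_round_group above: each group applies the standard ChaCha quarter round to four (a, b, c, d) register tuples at once, and expresses every left rotation by n as a right rotation by 32 - n, since the perlasm uses vror_vi. A scalar C sketch of one quarter round written the same way is shown below. The names ror32_sketch() and chacha_quarter_round_sketch() are hypothetical and are not part of the patch; this is a reference model for one lane, not the vector code.

```c
#include <stdint.h>

/* Rotate right by n bits; n must be in 1..31. Hypothetical helper. */
static inline uint32_t ror32_sketch(uint32_t x, unsigned int n)
{
	return (x >> n) | (x << (32 - n));
}

/*
 * One ChaCha quarter round on scalar words, mirroring the comment
 * lines in chacha_quad_round_group. A left rotation by n is written
 * as a right rotation by 32 - n, as in the vror_vi calls.
 */
static void chacha_quarter_round_sketch(uint32_t *a, uint32_t *b,
					uint32_t *c, uint32_t *d)
{
	/* a += b; d ^= a; d <<<= 16; */
	*a += *b; *d ^= *a; *d = ror32_sketch(*d, 32 - 16);
	/* c += d; b ^= c; b <<<= 12; */
	*c += *d; *b ^= *c; *b = ror32_sketch(*b, 32 - 12);
	/* a += b; d ^= a; d <<<= 8; */
	*a += *b; *d ^= *a; *d = ror32_sketch(*d, 32 - 8);
	/* c += d; b ^= c; b <<<= 7; */
	*c += *d; *b ^= *c; *b = ror32_sketch(*b, 32 - 7);
}
```

In the round loop, the first chacha_quad_round_group call corresponds to running this on the four columns of the state and the second call on the four diagonals, repeated 10 times for the 20 rounds.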