From patchwork Mon Nov 27 07:06:51 2023
X-Patchwork-Submitter: Jerry Shih
X-Patchwork-Id: 13469228
From: Jerry Shih
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
 herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com,
 ebiggers@kernel.org, ardb@kernel.org
Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com,
 linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-crypto@vger.kernel.org
Subject: [PATCH v2 01/13] RISC-V: add helper function to read the vector VLEN
Date: Mon, 27 Nov 2023 15:06:51 +0800
Message-Id: <20231127070703.1697-2-jerry.shih@sifive.com>
In-Reply-To: <20231127070703.1697-1-jerry.shih@sifive.com>

From: Heiko Stuebner

VLEN describes the length of each vector register, and some instructions
need a specific minimal VLEN to work correctly.

The vector code already includes a variable, riscv_v_vsize, that holds the
size of "32 vector registers with vlenb length" and gets filled in during
boot. vlenb is the value contained in the CSR_VLENB register and represents
VLEN / 8.

So add riscv_vector_vlen() to return the actual VLEN value in bits for
in-kernel users that need to check the available VLEN.

Signed-off-by: Heiko Stuebner
Signed-off-by: Jerry Shih
Reviewed-by: Eric Biggers
---
 arch/riscv/include/asm/vector.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/riscv/include/asm/vector.h b/arch/riscv/include/asm/vector.h
index 9fb2dea66abd..1fd3e5510b64 100644
--- a/arch/riscv/include/asm/vector.h
+++ b/arch/riscv/include/asm/vector.h
@@ -244,4 +244,15 @@ void kernel_vector_allow_preemption(void);
 #define kernel_vector_allow_preemption() do {} while (0)
 #endif
 
+/*
+ * Return the implementation's vlen value.
+ *
+ * riscv_v_vsize contains the value of "32 vector registers with vlenb length"
+ * so rebuild the vlen value in bits from it.
+ */
+static inline int riscv_vector_vlen(void)
+{
+	return riscv_v_vsize / 32 * 8;
+}
+
 #endif /* ! __ASM_RISCV_VECTOR_H */
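To make the arithmetic above concrete, here is a minimal stand-alone sketch
(not part of the patch; the VLEN of 128 bits is a hypothetical example): with
VLEN = 128, vlenb = VLEN / 8 = 16 bytes, riscv_v_vsize = 32 * vlenb = 512
bytes, and the helper recovers 512 / 32 * 8 = 128. The ">= 128" gate mirrors
the check the vector-crypto drivers later in this series perform.

#include <stdio.h>

/* Stand-in for the kernel's riscv_v_vsize ("32 vector registers of vlenb
 * bytes"), filled with a hypothetical VLEN of 128 bits: vlenb = 16 bytes. */
static unsigned int riscv_v_vsize = 32 * 16;

/* Same derivation as the new helper: bytes for 32 registers -> bits per register. */
static inline int riscv_vector_vlen(void)
{
	return riscv_v_vsize / 32 * 8;
}

int main(void)
{
	printf("VLEN = %d bits, VLEN >= 128: %s\n", riscv_vector_vlen(),
	       riscv_vector_vlen() >= 128 ? "yes" : "no");
	return 0;
}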
From patchwork Mon Nov 27 07:06:52 2023
X-Patchwork-Submitter: Jerry Shih
X-Patchwork-Id: 13469224
From: Jerry Shih
Subject: [PATCH v2 02/13] RISC-V: hook new crypto subdir into build-system
Date: Mon, 27 Nov 2023 15:06:52 +0800
Message-Id: <20231127070703.1697-3-jerry.shih@sifive.com>
In-Reply-To: <20231127070703.1697-1-jerry.shih@sifive.com>

From: Heiko Stuebner

Create a crypto subdirectory for the accelerated cryptography routines that
later patches add, and hook it into the riscv Kbuild and the main crypto
Kconfig.
Signed-off-by: Heiko Stuebner
Signed-off-by: Jerry Shih
Reviewed-by: Eric Biggers
---
 arch/riscv/Kbuild          | 1 +
 arch/riscv/crypto/Kconfig  | 5 +++++
 arch/riscv/crypto/Makefile | 4 ++++
 crypto/Kconfig             | 3 +++
 4 files changed, 13 insertions(+)
 create mode 100644 arch/riscv/crypto/Kconfig
 create mode 100644 arch/riscv/crypto/Makefile

diff --git a/arch/riscv/Kbuild b/arch/riscv/Kbuild
index d25ad1c19f88..2c585f7a0b6e 100644
--- a/arch/riscv/Kbuild
+++ b/arch/riscv/Kbuild
@@ -2,6 +2,7 @@
 obj-y += kernel/ mm/ net/
 obj-$(CONFIG_BUILTIN_DTB) += boot/dts/
+obj-$(CONFIG_CRYPTO) += crypto/
 obj-y += errata/
 obj-$(CONFIG_KVM) += kvm/
diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig
new file mode 100644
index 000000000000..10d60edc0110
--- /dev/null
+++ b/arch/riscv/crypto/Kconfig
@@ -0,0 +1,5 @@
+# SPDX-License-Identifier: GPL-2.0
+
+menu "Accelerated Cryptographic Algorithms for CPU (riscv)"
+
+endmenu
diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile
new file mode 100644
index 000000000000..b3b6332c9f6d
--- /dev/null
+++ b/arch/riscv/crypto/Makefile
@@ -0,0 +1,4 @@
+# SPDX-License-Identifier: GPL-2.0-only
+#
+# linux/arch/riscv/crypto/Makefile
+#
diff --git a/crypto/Kconfig b/crypto/Kconfig
index 650b1b3620d8..c7b23d2c58e4 100644
--- a/crypto/Kconfig
+++ b/crypto/Kconfig
@@ -1436,6 +1436,9 @@ endif
 if PPC
 source "arch/powerpc/crypto/Kconfig"
 endif
+if RISCV
+source "arch/riscv/crypto/Kconfig"
+endif
 if S390
 source "arch/s390/crypto/Kconfig"
 endif

From patchwork Mon Nov 27 07:06:53 2023
X-Patchwork-Submitter: Jerry Shih
X-Patchwork-Id: 13469225
From: Jerry Shih
Subject: [PATCH v2 03/13] RISC-V: crypto: add OpenSSL perl module for vector instructions
Date: Mon, 27 Nov 2023 15:06:53 +0800
Message-Id: <20231127070703.1697-4-jerry.shih@sifive.com>
In-Reply-To: <20231127070703.1697-1-jerry.shih@sifive.com>

OpenSSL has some RISC-V vector cryptography implementations which can be
reused for the kernel. These implementations rely on a number of perl
helpers that emit vector and vector-crypto-extension instructions. This
patch takes those perl helpers from OpenSSL (openssl/openssl#21923). The
scalar crypto instructions in the original perl module are unused here and
are skipped.
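For readers unfamiliar with the approach: each helper in riscv.pm fills the
register operands into a fixed instruction template and emits a raw ".word",
so the generated assembly builds even with toolchains that do not yet know
the vector / vector-crypto mnemonics. The C sketch below (not part of the
patch) reproduces the field layout of the module's vadd_vv helper - vm at
bit 25, vs2 at bit 20, vs1 at bit 15, vd at bit 7 - purely as an
illustration of that encoding trick.

#include <stdint.h>
#include <stdio.h>

/* Template taken from riscv.pm's vadd_vv:
 * 0b000000_0_00000_00000_000_00000_1010111 (only the opcode bits are set). */
#define VADD_VV_TEMPLATE 0x00000057u

/* Build the .word for "vadd.vv vd, vs2, vs1" (vm = 1 means unmasked). */
static uint32_t encode_vadd_vv(unsigned int vd, unsigned int vs2,
			       unsigned int vs1, unsigned int vm)
{
	return VADD_VV_TEMPLATE | (vm << 25) | (vs2 << 20) | (vs1 << 15) | (vd << 7);
}

int main(void)
{
	/* vadd.vv v1, v2, v3 (unmasked) -> .word 0x022180d7 */
	printf(".word 0x%08x\n", encode_vadd_vv(1, 2, 3, 1));
	return 0;
}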
Co-developed-by: Christoph Müllner Signed-off-by: Christoph Müllner Co-developed-by: Heiko Stuebner Signed-off-by: Heiko Stuebner Co-developed-by: Phoebe Chen Signed-off-by: Phoebe Chen Signed-off-by: Jerry Shih --- arch/riscv/crypto/riscv.pm | 828 +++++++++++++++++++++++++++++++++++++ 1 file changed, 828 insertions(+) create mode 100644 arch/riscv/crypto/riscv.pm diff --git a/arch/riscv/crypto/riscv.pm b/arch/riscv/crypto/riscv.pm new file mode 100644 index 000000000000..e188f7476e3e --- /dev/null +++ b/arch/riscv/crypto/riscv.pm @@ -0,0 +1,828 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Christoph Müllner +# Copyright (c) 2023, Jerry Shih +# Copyright (c) 2023, Phoebe Chen +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +use strict; +use warnings; + +# Set $have_stacktrace to 1 if we have Devel::StackTrace +my $have_stacktrace = 0; +if (eval {require Devel::StackTrace;1;}) { + $have_stacktrace = 1; +} + +my @regs = map("x$_",(0..31)); +# Mapping from the RISC-V psABI ABI mnemonic names to the register number. 
+my @regaliases = ('zero','ra','sp','gp','tp','t0','t1','t2','s0','s1', + map("a$_",(0..7)), + map("s$_",(2..11)), + map("t$_",(3..6)) +); + +my %reglookup; +@reglookup{@regs} = @regs; +@reglookup{@regaliases} = @regs; + +# Takes a register name, possibly an alias, and converts it to a register index +# from 0 to 31 +sub read_reg { + my $reg = lc shift; + if (!exists($reglookup{$reg})) { + my $trace = ""; + if ($have_stacktrace) { + $trace = Devel::StackTrace->new->as_string; + } + die("Unknown register ".$reg."\n".$trace); + } + my $regstr = $reglookup{$reg}; + if (!($regstr =~ /^x([0-9]+)$/)) { + my $trace = ""; + if ($have_stacktrace) { + $trace = Devel::StackTrace->new->as_string; + } + die("Could not process register ".$reg."\n".$trace); + } + return $1; +} + +# Read the sew setting(8, 16, 32 and 64) and convert to vsew encoding. +sub read_sew { + my $sew_setting = shift; + + if ($sew_setting eq "e8") { + return 0; + } elsif ($sew_setting eq "e16") { + return 1; + } elsif ($sew_setting eq "e32") { + return 2; + } elsif ($sew_setting eq "e64") { + return 3; + } else { + my $trace = ""; + if ($have_stacktrace) { + $trace = Devel::StackTrace->new->as_string; + } + die("Unsupported SEW setting:".$sew_setting."\n".$trace); + } +} + +# Read the LMUL settings and convert to vlmul encoding. +sub read_lmul { + my $lmul_setting = shift; + + if ($lmul_setting eq "mf8") { + return 5; + } elsif ($lmul_setting eq "mf4") { + return 6; + } elsif ($lmul_setting eq "mf2") { + return 7; + } elsif ($lmul_setting eq "m1") { + return 0; + } elsif ($lmul_setting eq "m2") { + return 1; + } elsif ($lmul_setting eq "m4") { + return 2; + } elsif ($lmul_setting eq "m8") { + return 3; + } else { + my $trace = ""; + if ($have_stacktrace) { + $trace = Devel::StackTrace->new->as_string; + } + die("Unsupported LMUL setting:".$lmul_setting."\n".$trace); + } +} + +# Read the tail policy settings and convert to vta encoding. +sub read_tail_policy { + my $tail_setting = shift; + + if ($tail_setting eq "ta") { + return 1; + } elsif ($tail_setting eq "tu") { + return 0; + } else { + my $trace = ""; + if ($have_stacktrace) { + $trace = Devel::StackTrace->new->as_string; + } + die("Unsupported tail policy setting:".$tail_setting."\n".$trace); + } +} + +# Read the mask policy settings and convert to vma encoding. +sub read_mask_policy { + my $mask_setting = shift; + + if ($mask_setting eq "ma") { + return 1; + } elsif ($mask_setting eq "mu") { + return 0; + } else { + my $trace = ""; + if ($have_stacktrace) { + $trace = Devel::StackTrace->new->as_string; + } + die("Unsupported mask policy setting:".$mask_setting."\n".$trace); + } +} + +my @vregs = map("v$_",(0..31)); +my %vreglookup; +@vreglookup{@vregs} = @vregs; + +sub read_vreg { + my $vreg = lc shift; + if (!exists($vreglookup{$vreg})) { + my $trace = ""; + if ($have_stacktrace) { + $trace = Devel::StackTrace->new->as_string; + } + die("Unknown vector register ".$vreg."\n".$trace); + } + if (!($vreg =~ /^v([0-9]+)$/)) { + my $trace = ""; + if ($have_stacktrace) { + $trace = Devel::StackTrace->new->as_string; + } + die("Could not process vector register ".$vreg."\n".$trace); + } + return $1; +} + +# Read the vm settings and convert to mask encoding. +sub read_mask_vreg { + my $vreg = shift; + # The default value is unmasked. + my $mask_bit = 1; + + if (defined($vreg)) { + my $reg_id = read_vreg $vreg; + if ($reg_id == 0) { + $mask_bit = 0; + } else { + my $trace = ""; + if ($have_stacktrace) { + $trace = Devel::StackTrace->new->as_string; + } + die("The ".$vreg." 
is not the mask register v0.\n".$trace); + } + } + return $mask_bit; +} + +# Vector instructions + +sub vadd_vv { + # vadd.vv vd, vs2, vs1, vm + my $template = 0b000000_0_00000_00000_000_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($vs2 << 20) | ($vs1 << 15) | ($vd << 7)); +} + +sub vadd_vx { + # vadd.vx vd, vs2, rs1, vm + my $template = 0b000000_0_00000_00000_100_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $rs1 = read_reg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($vs2 << 20) | ($rs1 << 15) | ($vd << 7)); +} + +sub vsub_vv { + # vsub.vv vd, vs2, vs1, vm + my $template = 0b000010_0_00000_00000_000_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($vs2 << 20) | ($vs1 << 15) | ($vd << 7)); +} + +sub vsub_vx { + # vsub.vx vd, vs2, rs1, vm + my $template = 0b000010_0_00000_00000_100_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $rs1 = read_reg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($vs2 << 20) | ($rs1 << 15) | ($vd << 7)); +} + +sub vid_v { + # vid.v vd + my $template = 0b0101001_00000_10001_010_00000_1010111; + my $vd = read_vreg shift; + return ".word ".($template | ($vd << 7)); +} + +sub viota_m { + # viota.m vd, vs2, vm + my $template = 0b010100_0_00000_10000_010_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($vs2 << 20) | ($vd << 7)); +} + +sub vle8_v { + # vle8.v vd, (rs1), vm + my $template = 0b000000_0_00000_00000_000_00000_0000111; + my $vd = read_vreg shift; + my $rs1 = read_reg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($rs1 << 15) | ($vd << 7)); +} + +sub vle32_v { + # vle32.v vd, (rs1), vm + my $template = 0b000000_0_00000_00000_110_00000_0000111; + my $vd = read_vreg shift; + my $rs1 = read_reg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($rs1 << 15) | ($vd << 7)); +} + +sub vle64_v { + # vle64.v vd, (rs1) + my $template = 0b0000001_00000_00000_111_00000_0000111; + my $vd = read_vreg shift; + my $rs1 = read_reg shift; + return ".word ".($template | ($rs1 << 15) | ($vd << 7)); +} + +sub vlse32_v { + # vlse32.v vd, (rs1), rs2 + my $template = 0b0000101_00000_00000_110_00000_0000111; + my $vd = read_vreg shift; + my $rs1 = read_reg shift; + my $rs2 = read_reg shift; + return ".word ".($template | ($rs2 << 20) | ($rs1 << 15) | ($vd << 7)); +} + +sub vlsseg_nf_e32_v { + # vlssege32.v vd, (rs1), rs2 + my $template = 0b0000101_00000_00000_110_00000_0000111; + my $nf = shift; + $nf -= 1; + my $vd = read_vreg shift; + my $rs1 = read_reg shift; + my $rs2 = read_reg shift; + return ".word ".($template | ($nf << 29) | ($rs2 << 20) | ($rs1 << 15) | ($vd << 7)); +} + +sub vlse64_v { + # vlse64.v vd, (rs1), rs2 + my $template = 0b0000101_00000_00000_111_00000_0000111; + my $vd = read_vreg shift; + my $rs1 = read_reg shift; + my $rs2 = read_reg shift; + return ".word ".($template | ($rs2 << 20) | ($rs1 << 15) | ($vd << 7)); +} + +sub vluxei8_v { + # vluxei8.v vd, (rs1), vs2, vm + my $template = 0b000001_0_00000_00000_000_00000_0000111; + my $vd = read_vreg shift; + my $rs1 = read_reg shift; + my $vs2 
= read_vreg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($vs2 << 20) | ($rs1 << 15) | ($vd << 7)); +} + +sub vmerge_vim { + # vmerge.vim vd, vs2, imm, v0 + my $template = 0b0101110_00000_00000_011_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $imm = shift; + return ".word ".($template | ($vs2 << 20) | ($imm << 15) | ($vd << 7)); +} + +sub vmerge_vvm { + # vmerge.vvm vd vs2 vs1 + my $template = 0b0101110_00000_00000_000_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vs1 << 15) | ($vd << 7)) +} + +sub vmseq_vi { + # vmseq.vi vd vs1, imm + my $template = 0b0110001_00000_00000_011_00000_1010111; + my $vd = read_vreg shift; + my $vs1 = read_vreg shift; + my $imm = shift; + return ".word ".($template | ($vs1 << 20) | ($imm << 15) | ($vd << 7)) +} + +sub vmsgtu_vx { + # vmsgtu.vx vd vs2, rs1, vm + my $template = 0b011110_0_00000_00000_100_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $rs1 = read_reg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($vs2 << 20) | ($rs1 << 15) | ($vd << 7)) +} + +sub vmv_v_i { + # vmv.v.i vd, imm + my $template = 0b0101111_00000_00000_011_00000_1010111; + my $vd = read_vreg shift; + my $imm = shift; + return ".word ".($template | ($imm << 15) | ($vd << 7)); +} + +sub vmv_v_x { + # vmv.v.x vd, rs1 + my $template = 0b0101111_00000_00000_100_00000_1010111; + my $vd = read_vreg shift; + my $rs1 = read_reg shift; + return ".word ".($template | ($rs1 << 15) | ($vd << 7)); +} + +sub vmv_v_v { + # vmv.v.v vd, vs1 + my $template = 0b0101111_00000_00000_000_00000_1010111; + my $vd = read_vreg shift; + my $vs1 = read_vreg shift; + return ".word ".($template | ($vs1 << 15) | ($vd << 7)); +} + +sub vor_vv_v0t { + # vor.vv vd, vs2, vs1, v0.t + my $template = 0b0010100_00000_00000_000_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vs1 << 15) | ($vd << 7)); +} + +sub vse8_v { + # vse8.v vd, (rs1), vm + my $template = 0b000000_0_00000_00000_000_00000_0100111; + my $vd = read_vreg shift; + my $rs1 = read_reg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($rs1 << 15) | ($vd << 7)); +} + +sub vse32_v { + # vse32.v vd, (rs1), vm + my $template = 0b000000_0_00000_00000_110_00000_0100111; + my $vd = read_vreg shift; + my $rs1 = read_reg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($rs1 << 15) | ($vd << 7)); +} + +sub vssseg_nf_e32_v { + # vsssege32.v vs3, (rs1), rs2 + my $template = 0b0000101_00000_00000_110_00000_0100111; + my $nf = shift; + $nf -= 1; + my $vs3 = read_vreg shift; + my $rs1 = read_reg shift; + my $rs2 = read_reg shift; + return ".word ".($template | ($nf << 29) | ($rs2 << 20) | ($rs1 << 15) | ($vs3 << 7)); +} + +sub vsuxei8_v { + # vsuxei8.v vs3, (rs1), vs2, vm + my $template = 0b000001_0_00000_00000_000_00000_0100111; + my $vs3 = read_vreg shift; + my $rs1 = read_reg shift; + my $vs2 = read_vreg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($vs2 << 20) | ($rs1 << 15) | ($vs3 << 7)); +} + +sub vse64_v { + # vse64.v vd, (rs1) + my $template = 0b0000001_00000_00000_111_00000_0100111; + my $vd = read_vreg shift; + my $rs1 = read_reg shift; + return ".word ".($template | ($rs1 << 15) | ($vd << 7)); +} + +sub 
vsetivli__x0_2_e64_m1_tu_mu { + # vsetivli x0, 2, e64, m1, tu, mu + return ".word 0xc1817057"; +} + +sub vsetivli__x0_4_e32_m1_tu_mu { + # vsetivli x0, 4, e32, m1, tu, mu + return ".word 0xc1027057"; +} + +sub vsetivli__x0_4_e64_m1_tu_mu { + # vsetivli x0, 4, e64, m1, tu, mu + return ".word 0xc1827057"; +} + +sub vsetivli__x0_8_e32_m1_tu_mu { + # vsetivli x0, 8, e32, m1, tu, mu + return ".word 0xc1047057"; +} + +sub vsetvli { + # vsetvli rd, rs1, vtypei + my $template = 0b0_00000000000_00000_111_00000_1010111; + my $rd = read_reg shift; + my $rs1 = read_reg shift; + my $sew = read_sew shift; + my $lmul = read_lmul shift; + my $tail_policy = read_tail_policy shift; + my $mask_policy = read_mask_policy shift; + my $vtypei = ($mask_policy << 7) | ($tail_policy << 6) | ($sew << 3) | $lmul; + + return ".word ".($template | ($vtypei << 20) | ($rs1 << 15) | ($rd << 7)); +} + +sub vsetivli { + # vsetvli rd, uimm, vtypei + my $template = 0b11_0000000000_00000_111_00000_1010111; + my $rd = read_reg shift; + my $uimm = shift; + my $sew = read_sew shift; + my $lmul = read_lmul shift; + my $tail_policy = read_tail_policy shift; + my $mask_policy = read_mask_policy shift; + my $vtypei = ($mask_policy << 7) | ($tail_policy << 6) | ($sew << 3) | $lmul; + + return ".word ".($template | ($vtypei << 20) | ($uimm << 15) | ($rd << 7)); +} + +sub vslidedown_vi { + # vslidedown.vi vd, vs2, uimm + my $template = 0b0011111_00000_00000_011_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $uimm = shift; + return ".word ".($template | ($vs2 << 20) | ($uimm << 15) | ($vd << 7)); +} + +sub vslidedown_vx { + # vslidedown.vx vd, vs2, rs1 + my $template = 0b0011111_00000_00000_100_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $rs1 = read_reg shift; + return ".word ".($template | ($vs2 << 20) | ($rs1 << 15) | ($vd << 7)); +} + +sub vslideup_vi_v0t { + # vslideup.vi vd, vs2, uimm, v0.t + my $template = 0b0011100_00000_00000_011_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $uimm = shift; + return ".word ".($template | ($vs2 << 20) | ($uimm << 15) | ($vd << 7)); +} + +sub vslideup_vi { + # vslideup.vi vd, vs2, uimm + my $template = 0b0011101_00000_00000_011_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $uimm = shift; + return ".word ".($template | ($vs2 << 20) | ($uimm << 15) | ($vd << 7)); +} + +sub vsll_vi { + # vsll.vi vd, vs2, uimm, vm + my $template = 0b1001011_00000_00000_011_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $uimm = shift; + return ".word ".($template | ($vs2 << 20) | ($uimm << 15) | ($vd << 7)); +} + +sub vsrl_vx { + # vsrl.vx vd, vs2, rs1 + my $template = 0b1010001_00000_00000_100_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $rs1 = read_reg shift; + return ".word ".($template | ($vs2 << 20) | ($rs1 << 15) | ($vd << 7)); +} + +sub vsse32_v { + # vse32.v vs3, (rs1), rs2 + my $template = 0b0000101_00000_00000_110_00000_0100111; + my $vs3 = read_vreg shift; + my $rs1 = read_reg shift; + my $rs2 = read_reg shift; + return ".word ".($template | ($rs2 << 20) | ($rs1 << 15) | ($vs3 << 7)); +} + +sub vsse64_v { + # vsse64.v vs3, (rs1), rs2 + my $template = 0b0000101_00000_00000_111_00000_0100111; + my $vs3 = read_vreg shift; + my $rs1 = read_reg shift; + my $rs2 = read_reg shift; + return ".word ".($template | ($rs2 << 20) | ($rs1 << 15) | ($vs3 << 7)); +} + +sub vxor_vv_v0t { + # vxor.vv vd, vs2, vs1, v0.t + my $template = 
0b0010110_00000_00000_000_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vs1 << 15) | ($vd << 7)); +} + +sub vxor_vv { + # vxor.vv vd, vs2, vs1 + my $template = 0b0010111_00000_00000_000_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vs1 << 15) | ($vd << 7)); +} + +sub vzext_vf2 { + # vzext.vf2 vd, vs2, vm + my $template = 0b010010_0_00000_00110_010_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($vs2 << 20) | ($vd << 7)); +} + +# Vector crypto instructions + +## Zvbb and Zvkb instructions +## +## vandn (also in zvkb) +## vbrev +## vbrev8 (also in zvkb) +## vrev8 (also in zvkb) +## vclz +## vctz +## vcpop +## vrol (also in zvkb) +## vror (also in zvkb) +## vwsll + +sub vbrev8_v { + # vbrev8.v vd, vs2, vm + my $template = 0b010010_0_00000_01000_010_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($vs2 << 20) | ($vd << 7)); +} + +sub vrev8_v { + # vrev8.v vd, vs2, vm + my $template = 0b010010_0_00000_01001_010_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($vs2 << 20) | ($vd << 7)); +} + +sub vror_vi { + # vror.vi vd, vs2, uimm + my $template = 0b01010_0_1_00000_00000_011_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $uimm = shift; + my $uimm_i5 = $uimm >> 5; + my $uimm_i4_0 = $uimm & 0b11111; + + return ".word ".($template | ($uimm_i5 << 26) | ($vs2 << 20) | ($uimm_i4_0 << 15) | ($vd << 7)); +} + +sub vwsll_vv { + # vwsll.vv vd, vs2, vs1, vm + my $template = 0b110101_0_00000_00000_000_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + my $vm = read_mask_vreg shift; + return ".word ".($template | ($vm << 25) | ($vs2 << 20) | ($vs1 << 15) | ($vd << 7)); +} + +## Zvbc instructions + +sub vclmulh_vx { + # vclmulh.vx vd, vs2, rs1 + my $template = 0b0011011_00000_00000_110_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $rs1 = read_reg shift; + return ".word ".($template | ($vs2 << 20) | ($rs1 << 15) | ($vd << 7)); +} + +sub vclmul_vx_v0t { + # vclmul.vx vd, vs2, rs1, v0.t + my $template = 0b0011000_00000_00000_110_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $rs1 = read_reg shift; + return ".word ".($template | ($vs2 << 20) | ($rs1 << 15) | ($vd << 7)); +} + +sub vclmul_vx { + # vclmul.vx vd, vs2, rs1 + my $template = 0b0011001_00000_00000_110_00000_1010111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $rs1 = read_reg shift; + return ".word ".($template | ($vs2 << 20) | ($rs1 << 15) | ($vd << 7)); +} + +## Zvkg instructions + +sub vghsh_vv { + # vghsh.vv vd, vs2, vs1 + my $template = 0b1011001_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vs1 << 15) | ($vd << 7)); +} + +sub vgmul_vv { + # vgmul.vv vd, vs2 + my $template = 0b1010001_00000_10001_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vd << 7)); +} + +## Zvkned instructions + +sub 
vaesdf_vs { + # vaesdf.vs vd, vs2 + my $template = 0b101001_1_00000_00001_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vd << 7)); +} + +sub vaesdm_vs { + # vaesdm.vs vd, vs2 + my $template = 0b101001_1_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vd << 7)); +} + +sub vaesef_vs { + # vaesef.vs vd, vs2 + my $template = 0b101001_1_00000_00011_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vd << 7)); +} + +sub vaesem_vs { + # vaesem.vs vd, vs2 + my $template = 0b101001_1_00000_00010_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vd << 7)); +} + +sub vaeskf1_vi { + # vaeskf1.vi vd, vs2, uimmm + my $template = 0b100010_1_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $uimm = shift; + return ".word ".($template | ($uimm << 15) | ($vs2 << 20) | ($vd << 7)); +} + +sub vaeskf2_vi { + # vaeskf2.vi vd, vs2, uimm + my $template = 0b101010_1_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $uimm = shift; + return ".word ".($template | ($vs2 << 20) | ($uimm << 15) | ($vd << 7)); +} + +sub vaesz_vs { + # vaesz.vs vd, vs2 + my $template = 0b101001_1_00000_00111_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vd << 7)); +} + +## Zvknha and Zvknhb instructions + +sub vsha2ms_vv { + # vsha2ms.vv vd, vs2, vs1 + my $template = 0b1011011_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + return ".word ".($template | ($vs2 << 20)| ($vs1 << 15 )| ($vd << 7)); +} + +sub vsha2ch_vv { + # vsha2ch.vv vd, vs2, vs1 + my $template = 0b101110_10000_00000_001_00000_01110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + return ".word ".($template | ($vs2 << 20)| ($vs1 << 15 )| ($vd << 7)); +} + +sub vsha2cl_vv { + # vsha2cl.vv vd, vs2, vs1 + my $template = 0b101111_10000_00000_001_00000_01110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + return ".word ".($template | ($vs2 << 20)| ($vs1 << 15 )| ($vd << 7)); +} + +## Zvksed instructions + +sub vsm4k_vi { + # vsm4k.vi vd, vs2, uimm + my $template = 0b1000011_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $uimm = shift; + return ".word ".($template | ($vs2 << 20) | ($uimm << 15) | ($vd << 7)); +} + +sub vsm4r_vs { + # vsm4r.vs vd, vs2 + my $template = 0b1010011_00000_10000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vd << 7)); +} + +## zvksh instructions + +sub vsm3c_vi { + # vsm3c.vi vd, vs2, uimm + my $template = 0b1010111_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $uimm = shift; + return ".word ".($template | ($vs2 << 20) | ($uimm << 15 ) | ($vd << 7)); +} + +sub vsm3me_vv { + # vsm3me.vv vd, vs2, vs1 + my $template = 0b1000001_00000_00000_010_00000_1110111; + my $vd = read_vreg shift; + my $vs2 = read_vreg shift; + my $vs1 = read_vreg shift; + return ".word ".($template | ($vs2 << 20) | ($vs1 << 15 ) | ($vd << 7)); +} + +1; From patchwork Mon Nov 27 07:06:54 2023 
X-Patchwork-Submitter: Jerry Shih
X-Patchwork-Id: 13469227
From: Jerry Shih
Subject: [PATCH v2 04/13] RISC-V: crypto: add Zvkned accelerated AES implementation
Date: Mon, 27 Nov 2023 15:06:54 +0800
Message-Id: <20231127070703.1697-5-jerry.shih@sifive.com>
In-Reply-To: <20231127070703.1697-1-jerry.shih@sifive.com>

Add an AES implementation using the Zvkned vector crypto extension, ported
from OpenSSL (openssl/openssl#21923).

Co-developed-by: Christoph Müllner
Signed-off-by: Christoph Müllner
Co-developed-by: Heiko Stuebner
Signed-off-by: Heiko Stuebner
Co-developed-by: Phoebe Chen
Signed-off-by: Phoebe Chen
Signed-off-by: Jerry Shih
---
Changelog v2:
 - Do not turn on the kconfig `AES_RISCV64` option by default.
 - Switch to the `crypto_aes_ctx` structure for the AES key.
 - Use the `Zvkned` extension for AES-128/256 key expansion.
 - Export riscv64_aes_* symbols for other modules.
 - Add the `asmlinkage` qualifier to the crypto asm functions.
 - Reorder the riscv64_aes_alg_zvkned structure member initialization to
   follow the declaration order.
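Since the changelog notes that the riscv64_aes_* symbols are now exported for
other modules, here is an illustrative-only sketch (not part of the patch) of
how another in-kernel user could call them through the aes-riscv64-glue.h
interface added below. The helper names and struct crypto_aes_ctx come from
this patch; the demo module itself, its include path and the all-zero test
key are assumptions made up for the example.

// SPDX-License-Identifier: GPL-2.0-only
/* Hypothetical demo caller for the exported Zvkned AES glue helpers. */
#include <linux/module.h>
#include <linux/printk.h>
#include <crypto/aes.h>

#include "aes-riscv64-glue.h"

static int __init zvkned_aes_demo_init(void)
{
	static const u8 key[AES_KEYSIZE_128];	/* all-zero demo key */
	u8 in[AES_BLOCK_SIZE] = {}, out[AES_BLOCK_SIZE];
	struct crypto_aes_ctx ctx;
	int err;

	/* Expands the key with Zvkned when usable, else via aes_expandkey(). */
	err = riscv64_aes_setkey(&ctx, key, sizeof(key));
	if (err)
		return err;

	/* Encrypts one block; falls back to aes_encrypt() without the vector unit. */
	riscv64_aes_encrypt_zvkned(&ctx, out, in);

	print_hex_dump(KERN_INFO, "aes-zvkned: ", DUMP_PREFIX_OFFSET, 16, 1,
		       out, sizeof(out), false);
	return 0;
}

static void __exit zvkned_aes_demo_exit(void)
{
}

module_init(zvkned_aes_demo_init);
module_exit(zvkned_aes_demo_exit);
MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Demo caller for the Zvkned AES glue helpers");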
--- arch/riscv/crypto/Kconfig | 11 + arch/riscv/crypto/Makefile | 11 + arch/riscv/crypto/aes-riscv64-glue.c | 151 ++++++ arch/riscv/crypto/aes-riscv64-glue.h | 18 + arch/riscv/crypto/aes-riscv64-zvkned.pl | 593 ++++++++++++++++++++++++ 5 files changed, 784 insertions(+) create mode 100644 arch/riscv/crypto/aes-riscv64-glue.c create mode 100644 arch/riscv/crypto/aes-riscv64-glue.h create mode 100644 arch/riscv/crypto/aes-riscv64-zvkned.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index 10d60edc0110..65189d4d47b3 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -2,4 +2,15 @@ menu "Accelerated Cryptographic Algorithms for CPU (riscv)" +config CRYPTO_AES_RISCV64 + tristate "Ciphers: AES" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_ALGAPI + select CRYPTO_LIB_AES + help + Block ciphers: AES cipher algorithms (FIPS-197) + + Architecture: riscv64 using: + - Zvkned vector crypto extension + endmenu diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index b3b6332c9f6d..90ca91d8df26 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -2,3 +2,14 @@ # # linux/arch/riscv/crypto/Makefile # + +obj-$(CONFIG_CRYPTO_AES_RISCV64) += aes-riscv64.o +aes-riscv64-y := aes-riscv64-glue.o aes-riscv64-zvkned.o + +quiet_cmd_perlasm = PERLASM $@ + cmd_perlasm = $(PERL) $(<) void $(@) + +$(obj)/aes-riscv64-zvkned.S: $(src)/aes-riscv64-zvkned.pl + $(call cmd,perlasm) + +clean-files += aes-riscv64-zvkned.S diff --git a/arch/riscv/crypto/aes-riscv64-glue.c b/arch/riscv/crypto/aes-riscv64-glue.c new file mode 100644 index 000000000000..091e368edb30 --- /dev/null +++ b/arch/riscv/crypto/aes-riscv64-glue.c @@ -0,0 +1,151 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Port of the OpenSSL AES implementation for RISC-V + * + * Copyright (C) 2023 VRULL GmbH + * Author: Heiko Stuebner + * + * Copyright (C) 2023 SiFive, Inc. + * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "aes-riscv64-glue.h" + +/* aes cipher using zvkned vector crypto extension */ +asmlinkage int rv64i_zvkned_set_encrypt_key(const u8 *user_key, const int bytes, + const struct crypto_aes_ctx *key); +asmlinkage void rv64i_zvkned_encrypt(const u8 *in, u8 *out, + const struct crypto_aes_ctx *key); +asmlinkage void rv64i_zvkned_decrypt(const u8 *in, u8 *out, + const struct crypto_aes_ctx *key); + +int riscv64_aes_setkey(struct crypto_aes_ctx *ctx, const u8 *key, + unsigned int keylen) +{ + int ret; + + ret = aes_check_keylen(keylen); + if (ret < 0) + return -EINVAL; + + /* + * The RISC-V AES vector crypto key expanding doesn't support AES-192. + * Use the generic software key expanding for that case. + */ + if ((keylen == 16 || keylen == 32) && crypto_simd_usable()) { + /* + * All zvkned-based functions use encryption expanding keys for both + * encryption and decryption. 
+ */ + kernel_vector_begin(); + rv64i_zvkned_set_encrypt_key(key, keylen, ctx); + kernel_vector_end(); + } else { + ret = aes_expandkey(ctx, key, keylen); + } + + return ret; +} +EXPORT_SYMBOL(riscv64_aes_setkey); + +void riscv64_aes_encrypt_zvkned(const struct crypto_aes_ctx *ctx, u8 *dst, + const u8 *src) +{ + if (crypto_simd_usable()) { + kernel_vector_begin(); + rv64i_zvkned_encrypt(src, dst, ctx); + kernel_vector_end(); + } else { + aes_encrypt(ctx, dst, src); + } +} +EXPORT_SYMBOL(riscv64_aes_encrypt_zvkned); + +void riscv64_aes_decrypt_zvkned(const struct crypto_aes_ctx *ctx, u8 *dst, + const u8 *src) +{ + if (crypto_simd_usable()) { + kernel_vector_begin(); + rv64i_zvkned_decrypt(src, dst, ctx); + kernel_vector_end(); + } else { + aes_decrypt(ctx, dst, src); + } +} +EXPORT_SYMBOL(riscv64_aes_decrypt_zvkned); + +static int aes_setkey(struct crypto_tfm *tfm, const u8 *key, + unsigned int keylen) +{ + struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm); + + return riscv64_aes_setkey(ctx, key, keylen); +} + +static void aes_encrypt_zvkned(struct crypto_tfm *tfm, u8 *dst, const u8 *src) +{ + const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm); + + riscv64_aes_encrypt_zvkned(ctx, dst, src); +} + +static void aes_decrypt_zvkned(struct crypto_tfm *tfm, u8 *dst, const u8 *src) +{ + const struct crypto_aes_ctx *ctx = crypto_tfm_ctx(tfm); + + riscv64_aes_decrypt_zvkned(ctx, dst, src); +} + +static struct crypto_alg riscv64_aes_alg_zvkned = { + .cra_flags = CRYPTO_ALG_TYPE_CIPHER, + .cra_blocksize = AES_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct crypto_aes_ctx), + .cra_priority = 300, + .cra_name = "aes", + .cra_driver_name = "aes-riscv64-zvkned", + .cra_cipher = { + .cia_min_keysize = AES_MIN_KEY_SIZE, + .cia_max_keysize = AES_MAX_KEY_SIZE, + .cia_setkey = aes_setkey, + .cia_encrypt = aes_encrypt_zvkned, + .cia_decrypt = aes_decrypt_zvkned, + }, + .cra_module = THIS_MODULE, +}; + +static inline bool check_aes_ext(void) +{ + return riscv_isa_extension_available(NULL, ZVKNED) && + riscv_vector_vlen() >= 128; +} + +static int __init riscv64_aes_mod_init(void) +{ + if (check_aes_ext()) + return crypto_register_alg(&riscv64_aes_alg_zvkned); + + return -ENODEV; +} + +static void __exit riscv64_aes_mod_fini(void) +{ + crypto_unregister_alg(&riscv64_aes_alg_zvkned); +} + +module_init(riscv64_aes_mod_init); +module_exit(riscv64_aes_mod_fini); + +MODULE_DESCRIPTION("AES (RISC-V accelerated)"); +MODULE_AUTHOR("Heiko Stuebner "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("aes"); diff --git a/arch/riscv/crypto/aes-riscv64-glue.h b/arch/riscv/crypto/aes-riscv64-glue.h new file mode 100644 index 000000000000..0416bbc4318e --- /dev/null +++ b/arch/riscv/crypto/aes-riscv64-glue.h @@ -0,0 +1,18 @@ +/* SPDX-License-Identifier: GPL-2.0 */ + +#ifndef AES_RISCV64_GLUE_H +#define AES_RISCV64_GLUE_H + +#include +#include + +int riscv64_aes_setkey(struct crypto_aes_ctx *ctx, const u8 *key, + unsigned int keylen); + +void riscv64_aes_encrypt_zvkned(const struct crypto_aes_ctx *ctx, u8 *dst, + const u8 *src); + +void riscv64_aes_decrypt_zvkned(const struct crypto_aes_ctx *ctx, u8 *dst, + const u8 *src); + +#endif /* AES_RISCV64_GLUE_H */ diff --git a/arch/riscv/crypto/aes-riscv64-zvkned.pl b/arch/riscv/crypto/aes-riscv64-zvkned.pl new file mode 100644 index 000000000000..303e82d9f6f0 --- /dev/null +++ b/arch/riscv/crypto/aes-riscv64-zvkned.pl @@ -0,0 +1,593 @@ +#! 
/usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Christoph Müllner +# Copyright (c) 2023, Phoebe Chen +# Copyright (c) 2023, Jerry Shih +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 +# - RISC-V Vector AES block cipher extension ('Zvkned') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +.text +___ + +my ($V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, + $V8, $V9, $V10, $V11, $V12, $V13, $V14, $V15, + $V16, $V17, $V18, $V19, $V20, $V21, $V22, $V23, + $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, +) = map("v$_",(0..31)); + +{ +################################################################################ +# int rv64i_zvkned_set_encrypt_key(const unsigned char *userKey, const int bytes, +# AES_KEY *key) +my ($UKEY, $BYTES, $KEYP) = ("a0", "a1", "a2"); +my ($T0) = ("t0"); + +$code .= <<___; +.p2align 3 +.globl rv64i_zvkned_set_encrypt_key +.type rv64i_zvkned_set_encrypt_key,\@function +rv64i_zvkned_set_encrypt_key: + beqz $UKEY, L_fail_m1 + beqz $KEYP, L_fail_m1 + + # Store the key length. + sw $BYTES, 480($KEYP) + + li $T0, 32 + beq $BYTES, $T0, L_set_key_256 + li $T0, 16 + beq $BYTES, $T0, L_set_key_128 + + j L_fail_m2 + +L_set_key_128: + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + + # Load the key + @{[vle32_v $V10, $UKEY]} + + # Generate keys for round 2-11 into registers v11-v20. 
+ @{[vaeskf1_vi $V11, $V10, 1]} # v11 <- rk2 (w[ 4, 7]) + @{[vaeskf1_vi $V12, $V11, 2]} # v12 <- rk3 (w[ 8,11]) + @{[vaeskf1_vi $V13, $V12, 3]} # v13 <- rk4 (w[12,15]) + @{[vaeskf1_vi $V14, $V13, 4]} # v14 <- rk5 (w[16,19]) + @{[vaeskf1_vi $V15, $V14, 5]} # v15 <- rk6 (w[20,23]) + @{[vaeskf1_vi $V16, $V15, 6]} # v16 <- rk7 (w[24,27]) + @{[vaeskf1_vi $V17, $V16, 7]} # v17 <- rk8 (w[28,31]) + @{[vaeskf1_vi $V18, $V17, 8]} # v18 <- rk9 (w[32,35]) + @{[vaeskf1_vi $V19, $V18, 9]} # v19 <- rk10 (w[36,39]) + @{[vaeskf1_vi $V20, $V19, 10]} # v20 <- rk11 (w[40,43]) + + # Store the round keys + @{[vse32_v $V10, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V11, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V12, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V13, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V14, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V15, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V16, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V17, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V18, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V19, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V20, $KEYP]} + + li a0, 1 + ret + +L_set_key_256: + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + + # Load the key + @{[vle32_v $V10, $UKEY]} + addi $UKEY, $UKEY, 16 + @{[vle32_v $V11, $UKEY]} + + @{[vmv_v_v $V12, $V10]} + @{[vaeskf2_vi $V12, $V11, 2]} + @{[vmv_v_v $V13, $V11]} + @{[vaeskf2_vi $V13, $V12, 3]} + @{[vmv_v_v $V14, $V12]} + @{[vaeskf2_vi $V14, $V13, 4]} + @{[vmv_v_v $V15, $V13]} + @{[vaeskf2_vi $V15, $V14, 5]} + @{[vmv_v_v $V16, $V14]} + @{[vaeskf2_vi $V16, $V15, 6]} + @{[vmv_v_v $V17, $V15]} + @{[vaeskf2_vi $V17, $V16, 7]} + @{[vmv_v_v $V18, $V16]} + @{[vaeskf2_vi $V18, $V17, 8]} + @{[vmv_v_v $V19, $V17]} + @{[vaeskf2_vi $V19, $V18, 9]} + @{[vmv_v_v $V20, $V18]} + @{[vaeskf2_vi $V20, $V19, 10]} + @{[vmv_v_v $V21, $V19]} + @{[vaeskf2_vi $V21, $V20, 11]} + @{[vmv_v_v $V22, $V20]} + @{[vaeskf2_vi $V22, $V21, 12]} + @{[vmv_v_v $V23, $V21]} + @{[vaeskf2_vi $V23, $V22, 13]} + @{[vmv_v_v $V24, $V22]} + @{[vaeskf2_vi $V24, $V23, 14]} + + @{[vse32_v $V10, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V11, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V12, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V13, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V14, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V15, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V16, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V17, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V18, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V19, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V20, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V21, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V22, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V23, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vse32_v $V24, $KEYP]} + + li a0, 1 + ret +.size rv64i_zvkned_set_encrypt_key,.-rv64i_zvkned_set_encrypt_key +___ +} + +{ +################################################################################ +# void rv64i_zvkned_encrypt(const unsigned char *in, unsigned char *out, +# const AES_KEY *key); +my ($INP, $OUTP, $KEYP) = ("a0", "a1", "a2"); +my ($T0) = ("t0"); +my ($KEY_LEN) = ("a3"); + +$code .= <<___; +.p2align 3 +.globl rv64i_zvkned_encrypt +.type rv64i_zvkned_encrypt,\@function +rv64i_zvkned_encrypt: + # Load key length. + lwu $KEY_LEN, 480($KEYP) + + # Get proper routine for key length. 
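+    # The byte offset 480 used above matches
+    # offsetof(struct crypto_aes_ctx, key_length) in the kernel glue code
+    # (2 * 60 u32 round-key words = 480 bytes), so the dispatch below picks
+    # the 10/12/14-round path from the stored key length in bytes
+    # (16, 24 or 32).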
+ li $T0, 32 + beq $KEY_LEN, $T0, L_enc_256 + li $T0, 24 + beq $KEY_LEN, $T0, L_enc_192 + li $T0, 16 + beq $KEY_LEN, $T0, L_enc_128 + + j L_fail_m2 +.size rv64i_zvkned_encrypt,.-rv64i_zvkned_encrypt +___ + +$code .= <<___; +.p2align 3 +L_enc_128: + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + + @{[vle32_v $V1, $INP]} + + @{[vle32_v $V10, $KEYP]} + @{[vaesz_vs $V1, $V10]} # with round key w[ 0, 3] + addi $KEYP, $KEYP, 16 + @{[vle32_v $V11, $KEYP]} + @{[vaesem_vs $V1, $V11]} # with round key w[ 4, 7] + addi $KEYP, $KEYP, 16 + @{[vle32_v $V12, $KEYP]} + @{[vaesem_vs $V1, $V12]} # with round key w[ 8,11] + addi $KEYP, $KEYP, 16 + @{[vle32_v $V13, $KEYP]} + @{[vaesem_vs $V1, $V13]} # with round key w[12,15] + addi $KEYP, $KEYP, 16 + @{[vle32_v $V14, $KEYP]} + @{[vaesem_vs $V1, $V14]} # with round key w[16,19] + addi $KEYP, $KEYP, 16 + @{[vle32_v $V15, $KEYP]} + @{[vaesem_vs $V1, $V15]} # with round key w[20,23] + addi $KEYP, $KEYP, 16 + @{[vle32_v $V16, $KEYP]} + @{[vaesem_vs $V1, $V16]} # with round key w[24,27] + addi $KEYP, $KEYP, 16 + @{[vle32_v $V17, $KEYP]} + @{[vaesem_vs $V1, $V17]} # with round key w[28,31] + addi $KEYP, $KEYP, 16 + @{[vle32_v $V18, $KEYP]} + @{[vaesem_vs $V1, $V18]} # with round key w[32,35] + addi $KEYP, $KEYP, 16 + @{[vle32_v $V19, $KEYP]} + @{[vaesem_vs $V1, $V19]} # with round key w[36,39] + addi $KEYP, $KEYP, 16 + @{[vle32_v $V20, $KEYP]} + @{[vaesef_vs $V1, $V20]} # with round key w[40,43] + + @{[vse32_v $V1, $OUTP]} + + ret +.size L_enc_128,.-L_enc_128 +___ + +$code .= <<___; +.p2align 3 +L_enc_192: + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + + @{[vle32_v $V1, $INP]} + + @{[vle32_v $V10, $KEYP]} + @{[vaesz_vs $V1, $V10]} # with round key w[ 0, 3] + addi $KEYP, $KEYP, 16 + @{[vle32_v $V11, $KEYP]} + @{[vaesem_vs $V1, $V11]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V12, $KEYP]} + @{[vaesem_vs $V1, $V12]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V13, $KEYP]} + @{[vaesem_vs $V1, $V13]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V14, $KEYP]} + @{[vaesem_vs $V1, $V14]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V15, $KEYP]} + @{[vaesem_vs $V1, $V15]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V16, $KEYP]} + @{[vaesem_vs $V1, $V16]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V17, $KEYP]} + @{[vaesem_vs $V1, $V17]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V18, $KEYP]} + @{[vaesem_vs $V1, $V18]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V19, $KEYP]} + @{[vaesem_vs $V1, $V19]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V20, $KEYP]} + @{[vaesem_vs $V1, $V20]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V21, $KEYP]} + @{[vaesem_vs $V1, $V21]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V22, $KEYP]} + @{[vaesef_vs $V1, $V22]} + + @{[vse32_v $V1, $OUTP]} + ret +.size L_enc_192,.-L_enc_192 +___ + +$code .= <<___; +.p2align 3 +L_enc_256: + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + + @{[vle32_v $V1, $INP]} + + @{[vle32_v $V10, $KEYP]} + @{[vaesz_vs $V1, $V10]} # with round key w[ 0, 3] + addi $KEYP, $KEYP, 16 + @{[vle32_v $V11, $KEYP]} + @{[vaesem_vs $V1, $V11]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V12, $KEYP]} + @{[vaesem_vs $V1, $V12]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V13, $KEYP]} + @{[vaesem_vs $V1, $V13]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V14, $KEYP]} + @{[vaesem_vs $V1, $V14]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V15, $KEYP]} + @{[vaesem_vs $V1, $V15]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V16, $KEYP]} + @{[vaesem_vs $V1, $V16]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V17, $KEYP]} + @{[vaesem_vs $V1, $V17]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V18, $KEYP]} + @{[vaesem_vs $V1, $V18]} + addi 
$KEYP, $KEYP, 16 + @{[vle32_v $V19, $KEYP]} + @{[vaesem_vs $V1, $V19]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V20, $KEYP]} + @{[vaesem_vs $V1, $V20]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V21, $KEYP]} + @{[vaesem_vs $V1, $V21]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V22, $KEYP]} + @{[vaesem_vs $V1, $V22]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V23, $KEYP]} + @{[vaesem_vs $V1, $V23]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V24, $KEYP]} + @{[vaesef_vs $V1, $V24]} + + @{[vse32_v $V1, $OUTP]} + ret +.size L_enc_256,.-L_enc_256 +___ + +################################################################################ +# void rv64i_zvkned_decrypt(const unsigned char *in, unsigned char *out, +# const AES_KEY *key); +$code .= <<___; +.p2align 3 +.globl rv64i_zvkned_decrypt +.type rv64i_zvkned_decrypt,\@function +rv64i_zvkned_decrypt: + # Load key length. + lwu $KEY_LEN, 480($KEYP) + + # Get proper routine for key length. + li $T0, 32 + beq $KEY_LEN, $T0, L_dec_256 + li $T0, 24 + beq $KEY_LEN, $T0, L_dec_192 + li $T0, 16 + beq $KEY_LEN, $T0, L_dec_128 + + j L_fail_m2 +.size rv64i_zvkned_decrypt,.-rv64i_zvkned_decrypt +___ + +$code .= <<___; +.p2align 3 +L_dec_128: + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + + @{[vle32_v $V1, $INP]} + + addi $KEYP, $KEYP, 160 + @{[vle32_v $V20, $KEYP]} + @{[vaesz_vs $V1, $V20]} # with round key w[40,43] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V19, $KEYP]} + @{[vaesdm_vs $V1, $V19]} # with round key w[36,39] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V18, $KEYP]} + @{[vaesdm_vs $V1, $V18]} # with round key w[32,35] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V17, $KEYP]} + @{[vaesdm_vs $V1, $V17]} # with round key w[28,31] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V16, $KEYP]} + @{[vaesdm_vs $V1, $V16]} # with round key w[24,27] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V15, $KEYP]} + @{[vaesdm_vs $V1, $V15]} # with round key w[20,23] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V14, $KEYP]} + @{[vaesdm_vs $V1, $V14]} # with round key w[16,19] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V13, $KEYP]} + @{[vaesdm_vs $V1, $V13]} # with round key w[12,15] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V12, $KEYP]} + @{[vaesdm_vs $V1, $V12]} # with round key w[ 8,11] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V11, $KEYP]} + @{[vaesdm_vs $V1, $V11]} # with round key w[ 4, 7] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V10, $KEYP]} + @{[vaesdf_vs $V1, $V10]} # with round key w[ 0, 3] + + @{[vse32_v $V1, $OUTP]} + + ret +.size L_dec_128,.-L_dec_128 +___ + +$code .= <<___; +.p2align 3 +L_dec_192: + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + + @{[vle32_v $V1, $INP]} + + addi $KEYP, $KEYP, 192 + @{[vle32_v $V22, $KEYP]} + @{[vaesz_vs $V1, $V22]} # with round key w[48,51] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V21, $KEYP]} + @{[vaesdm_vs $V1, $V21]} # with round key w[44,47] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V20, $KEYP]} + @{[vaesdm_vs $V1, $V20]} # with round key w[40,43] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V19, $KEYP]} + @{[vaesdm_vs $V1, $V19]} # with round key w[36,39] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V18, $KEYP]} + @{[vaesdm_vs $V1, $V18]} # with round key w[32,35] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V17, $KEYP]} + @{[vaesdm_vs $V1, $V17]} # with round key w[28,31] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V16, $KEYP]} + @{[vaesdm_vs $V1, $V16]} # with round key w[24,27] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V15, $KEYP]} + @{[vaesdm_vs $V1, $V15]} # with round key w[20,23] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V14, $KEYP]} + @{[vaesdm_vs $V1, $V14]} # with round key w[16,19] + addi $KEYP, 
$KEYP, -16 + @{[vle32_v $V13, $KEYP]} + @{[vaesdm_vs $V1, $V13]} # with round key w[12,15] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V12, $KEYP]} + @{[vaesdm_vs $V1, $V12]} # with round key w[ 8,11] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V11, $KEYP]} + @{[vaesdm_vs $V1, $V11]} # with round key w[ 4, 7] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V10, $KEYP]} + @{[vaesdf_vs $V1, $V10]} # with round key w[ 0, 3] + + @{[vse32_v $V1, $OUTP]} + + ret +.size L_dec_192,.-L_dec_192 +___ + +$code .= <<___; +.p2align 3 +L_dec_256: + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + + @{[vle32_v $V1, $INP]} + + addi $KEYP, $KEYP, 224 + @{[vle32_v $V24, $KEYP]} + @{[vaesz_vs $V1, $V24]} # with round key w[56,59] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V23, $KEYP]} + @{[vaesdm_vs $V1, $V23]} # with round key w[52,55] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V22, $KEYP]} + @{[vaesdm_vs $V1, $V22]} # with round key w[48,51] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V21, $KEYP]} + @{[vaesdm_vs $V1, $V21]} # with round key w[44,47] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V20, $KEYP]} + @{[vaesdm_vs $V1, $V20]} # with round key w[40,43] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V19, $KEYP]} + @{[vaesdm_vs $V1, $V19]} # with round key w[36,39] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V18, $KEYP]} + @{[vaesdm_vs $V1, $V18]} # with round key w[32,35] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V17, $KEYP]} + @{[vaesdm_vs $V1, $V17]} # with round key w[28,31] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V16, $KEYP]} + @{[vaesdm_vs $V1, $V16]} # with round key w[24,27] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V15, $KEYP]} + @{[vaesdm_vs $V1, $V15]} # with round key w[20,23] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V14, $KEYP]} + @{[vaesdm_vs $V1, $V14]} # with round key w[16,19] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V13, $KEYP]} + @{[vaesdm_vs $V1, $V13]} # with round key w[12,15] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V12, $KEYP]} + @{[vaesdm_vs $V1, $V12]} # with round key w[ 8,11] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V11, $KEYP]} + @{[vaesdm_vs $V1, $V11]} # with round key w[ 4, 7] + addi $KEYP, $KEYP, -16 + @{[vle32_v $V10, $KEYP]} + @{[vaesdf_vs $V1, $V10]} # with round key w[ 0, 3] + + @{[vse32_v $V1, $OUTP]} + + ret +.size L_dec_256,.-L_dec_256 +___ +} + +$code .= <<___; +L_fail_m1: + li a0, -1 + ret +.size L_fail_m1,.-L_fail_m1 + +L_fail_m2: + li a0, -2 + ret +.size L_fail_m2,.-L_fail_m2 + +L_end: + ret +.size L_end,.-L_end +___ + +print $code; + +close STDOUT or die "error closing STDOUT: $!"; From patchwork Mon Nov 27 07:06:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13469226 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C9D05C07D5B for ; Mon, 27 Nov 2023 07:07:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; 
bh=Ll4xGcIR5oBIqRPZsNI4DdrDw4pk23fFeFXRgXTbE7s=; b=g0jzSWESXfrWve GKtmtYCZ04NC9MhKUJRdnj/cNgqGJ8KfeKKJiS+xYvGd3J8WQuVyyxXbehshEM4l7WMKxOx74pqNa af0WsmXlsY1bo/VCv7ipkoxsnEMATzKQ6el2aQSnPCEzE/unEwsTiN72L0iGqydVIsxakS/RT9R95 05F1XllQynVcNwBwlvH94jRbOkebDfFD8Khw9As8fh7OZ7DwaEN0Cl3K23EfzY9MK5YTE2ZTu35pi vqNJp5lhDPj8Hdbc4C/ZuZJdinxv1FxgTqLEzf6Cwq7+BTG0ACP6LTesVPKA1l10cLZTvLR+P2gGC er2lKeBq7xBARieyN8Bw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1r7ViQ-001esA-22; Mon, 27 Nov 2023 07:07:34 +0000 Received: from mail-pl1-x634.google.com ([2607:f8b0:4864:20::634]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1r7ViM-001eqU-0V for linux-riscv@lists.infradead.org; Mon, 27 Nov 2023 07:07:31 +0000 Received: by mail-pl1-x634.google.com with SMTP id d9443c01a7336-1cfc35090b0so6525645ad.1 for ; Sun, 26 Nov 2023 23:07:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1701068848; x=1701673648; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=I9NcydlPjJuZdBvk47SliSWRPXO51iH+ClRWgIXdngI=; b=BcJrbVspd9KKZjBqq9xWXD4lFxmOlG/vOEioIassLVAbw4yVjLI8e4AG4LHc99k0qu dUV9xGJJzNHM0PtbM6DZid0m04wK/Px/V9+fJgppcLvtfbpgyp3ai2PF13XLXqDzh+Sm mQS6buDRDCtVfoP+rTkh5GXGFIyxYB4RqRhIHaOhJCTCm7nS8PsRIpZRUcM2VUQzwq0m HUsI0V60veAWOjm4APiaA0LnSjZA1yiWPrl7Sv+nJwAMU4hpNCqs/X1f2Jy/i491oUbh /9Zb0aBjJAek/KNeSCbHgc2al9a0JiKdU3Wy7uCEqa4zaBdJG2h4E53+mMhiKAN+daR+ IUeQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701068848; x=1701673648; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=I9NcydlPjJuZdBvk47SliSWRPXO51iH+ClRWgIXdngI=; b=Y/LopBQbiiXCTcZ6ErIrPwcs0Qc/IxzAR7ZXHO8TDyGIUUK6DHRAqHK5vxGNuQ43S7 UJ217bEMHiTKCCbZeQYTyfvFlbd08FwgBoZhFH4ADd+GG+qBKoPrSXNxpqx6LcXtAanM pnvjtmyosK7bhc/K1u6PSvtx3LYhSQLX0HGWDmoCJMV4GidCP1OELL5pDNVgaXbBp6l7 +N6ZAEIHl4vBfWXLTMYIa9yJeGo84KTcftxFVEO26nX2REm66vcz0z4NScteOYT6cs4G uvduGo8AAEQCmiA39Tk2f1kvhWpumKMcWjCLXQ1A7Q0ciS2UNRXCOKiZ1YFw7WHm0fmr OXsQ== X-Gm-Message-State: AOJu0Yyc/ovPlhNUXgFzZopKrFkJ+p9m4dXMTDZotPA6gRANc9ZSkJCa qK/CQcUVXhAkTM60NQudlKb5Bg== X-Google-Smtp-Source: AGHT+IELRS3hxE4VCKaoEswdjN2LTSxsJZBADubHAmIAofnrpOYQED6FfVAgy1DMUxcK/odvBEY6yg== X-Received: by 2002:a17:903:1d1:b0:1cf:d58b:da39 with SMTP id e17-20020a17090301d100b001cfd58bda39mr1135943plh.64.1701068848370; Sun, 26 Nov 2023 23:07:28 -0800 (PST) Received: from localhost.localdomain ([101.10.45.230]) by smtp.gmail.com with ESMTPSA id jh15-20020a170903328f00b001cfcd3a764esm1340134plb.77.2023.11.26.23.07.25 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sun, 26 Nov 2023 23:07:28 -0800 (PST) From: Jerry Shih To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v2 05/13] crypto: simd - Update `walksize` in simd skcipher Date: Mon, 27 Nov 2023 15:06:55 +0800 Message-Id: <20231127070703.1697-6-jerry.shih@sifive.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: 
<20231127070703.1697-1-jerry.shih@sifive.com> References: <20231127070703.1697-1-jerry.shih@sifive.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231126_230730_207280_A9E14715 X-CRM114-Status: UNSURE ( 9.02 ) X-CRM114-Notice: Please train this message. X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org The `walksize` assignment is missed in simd skcipher. Signed-off-by: Jerry Shih --- crypto/cryptd.c | 1 + crypto/simd.c | 1 + 2 files changed, 2 insertions(+) diff --git a/crypto/cryptd.c b/crypto/cryptd.c index bbcc368b6a55..253d13504ccb 100644 --- a/crypto/cryptd.c +++ b/crypto/cryptd.c @@ -405,6 +405,7 @@ static int cryptd_create_skcipher(struct crypto_template *tmpl, (alg->base.cra_flags & CRYPTO_ALG_INTERNAL); inst->alg.ivsize = crypto_skcipher_alg_ivsize(alg); inst->alg.chunksize = crypto_skcipher_alg_chunksize(alg); + inst->alg.walksize = crypto_skcipher_alg_walksize(alg); inst->alg.min_keysize = crypto_skcipher_alg_min_keysize(alg); inst->alg.max_keysize = crypto_skcipher_alg_max_keysize(alg); diff --git a/crypto/simd.c b/crypto/simd.c index edaa479a1ec5..ea0caabf90f1 100644 --- a/crypto/simd.c +++ b/crypto/simd.c @@ -181,6 +181,7 @@ struct simd_skcipher_alg *simd_skcipher_create_compat(const char *algname, alg->ivsize = ialg->ivsize; alg->chunksize = ialg->chunksize; + alg->walksize = ialg->walksize; alg->min_keysize = ialg->min_keysize; alg->max_keysize = ialg->max_keysize; From patchwork Mon Nov 27 07:06:56 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13469229 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 8AE3BC07E98 for ; Mon, 27 Nov 2023 07:07:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=nDnOGGhHO1/zVIVSmg4Fc48Vf+ApevGCKpXqmBw2SFA=; b=WG1giO52l/gKWz X8qt5baQQr/tdR9j8iE2Ccj2Sipue7MQvYA+2YyAmZxqlaTCZ6iIaCq4xGPvSzQdMdyLL+2IfB1Pm cYlo3v1FPqOsGkd+R7ryqP1yqM2DgwLnfq/04KfcSb6MsRXS7jgQ0XBK68zgzPVnP0tDJGoTQkYmF 5vxGssEo0GUK7c6Z1nb76sfqjwnjl4NLmQIPNkSRcqeBoMhq+ASO9iVbxtIm8gLKHOK0eeKmz+ECp KGRkAR+GDG/qHvpGWbTUtJQQyK+WHfcWnRvxx2GIX5+zS0QkLoCwiM4J+5EZP4iYSZqEbT18jZoi2 b/vXHes8qIE2U5MfxopQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1r7ViV-001euU-1x; Mon, 27 Nov 2023 07:07:39 +0000 Received: from mail-pl1-x62b.google.com ([2607:f8b0:4864:20::62b]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1r7ViS-001erR-2F for linux-riscv@lists.infradead.org; Mon, 27 Nov 2023 07:07:38 +0000 
Received: by mail-pl1-x62b.google.com with SMTP id d9443c01a7336-1cfd78f8a12so1916845ad.2 for ; Sun, 26 Nov 2023 23:07:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1701068851; x=1701673651; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=KWDZyqBJhOVpYqYhwbudR4lRinTiZ1fIB/oQ8fHi3tg=; b=QLBx8iCbdEuDrXqBqeM1d8jQpXFsjxs5UiXVtoeJ4HOcJog8llQnUJR6GYMpMQD/Z4 KLoavIQn6sOIyzi58F1XGcFCTjd+mfNlv//9dtYOOTBCrmPHWKra3+CAKeAjIEsoIYsZ TI3PJ+TtRNnK5YmSLKPWt62TvenyzypJCJbKxaK+puYNTGOWy2q5iW3Djb0KJ1lA/+M4 1XJj/CqooCIYynlIjenoOo2J36J4Rq+CRqR/qzvL08O7aFz0593EhZ/3U8PFomx5Z0R0 DMexxmwuGSYaHMSUZvlIFeZsOE+gePFQggDqMf84z320nn/AbS5QtoK0jTJAhhuA6qz5 6ULA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701068851; x=1701673651; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=KWDZyqBJhOVpYqYhwbudR4lRinTiZ1fIB/oQ8fHi3tg=; b=Seh89jRcZnYPj9cvdc/0rLdbk83sBNtc/RcY1AOm8j2XzdJlM2lz9XV6mK7fCl3ObZ iGhxmKV1rAi8HCfRIzBc/OjbIL53SABpdj0ZYeE/1GVgIVYuREo8ksuoi/f7sMv8twW2 Th775qrz14wwrHh+ojqvActmamGbU4KFxYdbun+LxHi/LqtFwvEziyYpjp678vbnEhNs yMU0qebUWy6gmcvnSvY3M/iGSU5LH1xgGcZfrJq146eEh7ya6Xw/Jedz9CHcmUSLfJks Ek+UQIWZd6lr3rSxAW+TW4C/7XyEEtuLAeSsMEkr5WoSCHQP6kVGpeV9fFpR34d5O/jc 2vjA== X-Gm-Message-State: AOJu0YyCqaIXX/Tz8SNcMCMgvYRmZ5QzbvIYqqrSfxyvkipldlxM31KL 81vxznX9xR9ccjishzwDHD1WYA== X-Google-Smtp-Source: AGHT+IH+pvi642lKHelJCCdm6Sbw3UusLCtlqBwe0TaqWn+6ozeTKQf9INKnJ4k0xOSABXksoQFmKA== X-Received: by 2002:a17:902:820b:b0:1cf:cd4e:ca02 with SMTP id x11-20020a170902820b00b001cfcd4eca02mr2699019pln.24.1701068851234; Sun, 26 Nov 2023 23:07:31 -0800 (PST) Received: from localhost.localdomain ([101.10.45.230]) by smtp.gmail.com with ESMTPSA id jh15-20020a170903328f00b001cfcd3a764esm1340134plb.77.2023.11.26.23.07.28 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sun, 26 Nov 2023 23:07:31 -0800 (PST) From: Jerry Shih To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v2 06/13] crypto: scatterwalk - Add scatterwalk_next() to get the next scatterlist in scatter_walk Date: Mon, 27 Nov 2023 15:06:56 +0800 Message-Id: <20231127070703.1697-7-jerry.shih@sifive.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20231127070703.1697-1-jerry.shih@sifive.com> References: <20231127070703.1697-1-jerry.shih@sifive.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231126_230736_756419_9FB68081 X-CRM114-Status: GOOD ( 11.62 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org In some situations, we might split the `skcipher_request` into several segments. When we try to move to next segment, we might use `scatterwalk_ffwd()` to get the corresponding `scatterlist` iterating from the head of `scatterlist`. 
This helper function could just gather the information in `skcipher_walk` and move to next `scatterlist` directly. Signed-off-by: Jerry Shih --- include/crypto/scatterwalk.h | 9 +++++++-- 1 file changed, 7 insertions(+), 2 deletions(-) diff --git a/include/crypto/scatterwalk.h b/include/crypto/scatterwalk.h index 32fc4473175b..b1a90afe695d 100644 --- a/include/crypto/scatterwalk.h +++ b/include/crypto/scatterwalk.h @@ -98,7 +98,12 @@ void scatterwalk_map_and_copy(void *buf, struct scatterlist *sg, unsigned int start, unsigned int nbytes, int out); struct scatterlist *scatterwalk_ffwd(struct scatterlist dst[2], - struct scatterlist *src, - unsigned int len); + struct scatterlist *src, unsigned int len); + +static inline struct scatterlist *scatterwalk_next(struct scatterlist dst[2], + struct scatter_walk *src) +{ + return scatterwalk_ffwd(dst, src->sg, src->offset - src->sg->offset); +} #endif /* _CRYPTO_SCATTERWALK_H */ From patchwork Mon Nov 27 07:06:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13469231 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 354D7C4167B for ; Mon, 27 Nov 2023 07:07:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=tlW/QI0Apv5z2EtLaGRSKFYVX1t4fZqK2MttYDwCWPc=; b=wdzfCIYjC4m9vl OJrcuwY8mteQAF5BiJ+OngctjBoLuv8RQhnvZZwWXOK2LD4sDm/sH/jMC7xEKrqGuE11p4OQLdHuL CbpPATSAwxKEtySjyWyeVZyTPIJDhNYKqB9FaUiK+oliZghb4tJIWP8sRWwXAl3GCv21ySg/xzRX5 ncx8P25fkBzVDWuTzqCC2bZl+2W3zpwIZ9Wj8FzeAX7r6vUJI+XFR3CLI/dxFNIcoggmPydADn013 0bYpU/xXA4UOt5Ev+cl+60EXHjKL0UCkFR4M+5wlPL5kDJfceQLcZtImtGzvxxgq7IDDsJFVuCtRn zzFPvgbLINyVYJg/tfMQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1r7ViZ-001ex4-0V; Mon, 27 Nov 2023 07:07:43 +0000 Received: from mail-pl1-x634.google.com ([2607:f8b0:4864:20::634]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1r7ViR-001ese-2Y for linux-riscv@lists.infradead.org; Mon, 27 Nov 2023 07:07:41 +0000 Received: by mail-pl1-x634.google.com with SMTP id d9443c01a7336-1cfc9c4acb6so4186345ad.0 for ; Sun, 26 Nov 2023 23:07:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1701068855; x=1701673655; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=s1s7i3VA49GEZVjag/Kbhi+FIt8+X1iRiytC3VJdGAg=; b=YVMmCbIYW4YXEsT59boL/36qIBTHNjo9ggIrOTQb9hbR9F+CpVce45JUZwYSrW9Ie0 776C/XVCw/TZ3ld7y+H/ROV6m1lCcR6hJ5f9c7lOLJAr5GDWIfRdgGqsyFsUJ3Np9v69 lWRF476mElhzCD+/Tu9OqAg1CV2qc/QWybL1//kzjRHiIcZZi6kbA7UKPu49XTs39Ia4 wHn7ASc+iieBp24DR5XuVMUPaIBZ5848jOkXraC6XD01TShYi4jfYK7SihfqWOgPHMiH 
HRtn7xCzzikYcLF80XoBVRqdghiAVu80fgQZiSigCiS/SuJqLVhh+KlnCKE4TrEcSsMW ouNQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701068855; x=1701673655; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=s1s7i3VA49GEZVjag/Kbhi+FIt8+X1iRiytC3VJdGAg=; b=ZU1UZhJL9J28cyAum5BdDiBd65Psb+xgoTff8fkfygtGVUvap9TW1006naLZwEtAqO aHwUPIKT2hKlwLJD+SL5W1wnIeKSfv/rpudVPjfBG335vB/6NxOxx6Ph9kHrmO4s2ets xdomeAcP5tb18en9Uc83tpN7JAmz4odxa/NTFowNAa0Pek2+NXWQDVCuvp1f9NH8K0iJ lE+vk7ppSbSsRj4gGC6XStKuh2kB2PnVnNwTZ4iQ/aQF1fn4YUcQGqvHg/yI55gffNoX lqMRF4+30esSizD7xCExq3p0lPflyKy5RnVrWQ2xQ3odepxoGc5y4w8T0fWATXcJypc+ yvwA== X-Gm-Message-State: AOJu0Yx5JC/OQlgWhPUejOrbHIfEq5lK8P289D0LhA9SgBB28JjgLfXz QfHGk1lid10bM+TcbmtYeIWFdg== X-Google-Smtp-Source: AGHT+IEdpkYekzvZiIQ/njqiH1GpCr+lk2R945AdXDLzoB19kY/lDSKWeYSQh2uFfNGIUhkXyGVo9g== X-Received: by 2002:a17:902:ead2:b0:1cf:63bb:82a6 with SMTP id p18-20020a170902ead200b001cf63bb82a6mr9164128pld.65.1701068854621; Sun, 26 Nov 2023 23:07:34 -0800 (PST) Received: from localhost.localdomain ([101.10.45.230]) by smtp.gmail.com with ESMTPSA id jh15-20020a170903328f00b001cfcd3a764esm1340134plb.77.2023.11.26.23.07.31 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sun, 26 Nov 2023 23:07:34 -0800 (PST) From: Jerry Shih To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v2 07/13] RISC-V: crypto: add accelerated AES-CBC/CTR/ECB/XTS implementations Date: Mon, 27 Nov 2023 15:06:57 +0800 Message-Id: <20231127070703.1697-8-jerry.shih@sifive.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20231127070703.1697-1-jerry.shih@sifive.com> References: <20231127070703.1697-1-jerry.shih@sifive.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231126_230735_915259_8632C372 X-CRM114-Status: GOOD ( 23.79 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org Port the vector-crypto accelerated CBC, CTR, ECB and XTS block modes for AES cipher from OpenSSL(openssl/openssl#21923). In addition, support XTS-AES-192 mode which is not existed in OpenSSL. Co-developed-by: Phoebe Chen Signed-off-by: Phoebe Chen Signed-off-by: Jerry Shih --- Changelog v2: - Do not turn on kconfig `AES_BLOCK_RISCV64` option by default. - Update asm function for using aes key in `crypto_aes_ctx` structure. - Turn to use simd skcipher interface for AES-CBC/CTR/ECB/XTS modes. We still have lots of discussions for kernel-vector implementation. Before the final version of kernel-vector, use simd skcipher interface to skip the fallback path for all aes modes in all kinds of contexts. If we could always enable kernel-vector in softirq in the future, we could make the original sync skcipher algorithm back. - Refine aes-xts comments for head and tail blocks handling. - Update VLEN constraint for aex-xts mode. - Add `asmlinkage` qualifier for crypto asm function. 
- Rename aes-riscv64-zvbb-zvkg-zvkned to aes-riscv64-zvkned-zvbb-zvkg. - Rename aes-riscv64-zvkb-zvkned to aes-riscv64-zvkned-zvkb. - Reorder structure riscv64_aes_algs_zvkned, riscv64_aes_alg_zvkned_zvkb and riscv64_aes_alg_zvkned_zvbb_zvkg members initialization in the order declared. --- arch/riscv/crypto/Kconfig | 21 + arch/riscv/crypto/Makefile | 11 + .../crypto/aes-riscv64-block-mode-glue.c | 514 ++++++++++ .../crypto/aes-riscv64-zvkned-zvbb-zvkg.pl | 949 ++++++++++++++++++ arch/riscv/crypto/aes-riscv64-zvkned-zvkb.pl | 415 ++++++++ arch/riscv/crypto/aes-riscv64-zvkned.pl | 746 ++++++++++++++ 6 files changed, 2656 insertions(+) create mode 100644 arch/riscv/crypto/aes-riscv64-block-mode-glue.c create mode 100644 arch/riscv/crypto/aes-riscv64-zvkned-zvbb-zvkg.pl create mode 100644 arch/riscv/crypto/aes-riscv64-zvkned-zvkb.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index 65189d4d47b3..9d991ddda289 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -13,4 +13,25 @@ config CRYPTO_AES_RISCV64 Architecture: riscv64 using: - Zvkned vector crypto extension +config CRYPTO_AES_BLOCK_RISCV64 + tristate "Ciphers: AES, modes: ECB/CBC/CTR/XTS" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_AES_RISCV64 + select CRYPTO_SIMD + select CRYPTO_SKCIPHER + help + Length-preserving ciphers: AES cipher algorithms (FIPS-197) + with block cipher modes: + - ECB (Electronic Codebook) mode (NIST SP 800-38A) + - CBC (Cipher Block Chaining) mode (NIST SP 800-38A) + - CTR (Counter) mode (NIST SP 800-38A) + - XTS (XOR Encrypt XOR Tweakable Block Cipher with Ciphertext + Stealing) mode (NIST SP 800-38E and IEEE 1619) + + Architecture: riscv64 using: + - Zvkned vector crypto extension + - Zvbb vector extension (XTS) + - Zvkb vector crypto extension (CTR/XTS) + - Zvkg vector crypto extension (XTS) + endmenu diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index 90ca91d8df26..9574b009762f 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -6,10 +6,21 @@ obj-$(CONFIG_CRYPTO_AES_RISCV64) += aes-riscv64.o aes-riscv64-y := aes-riscv64-glue.o aes-riscv64-zvkned.o +obj-$(CONFIG_CRYPTO_AES_BLOCK_RISCV64) += aes-block-riscv64.o +aes-block-riscv64-y := aes-riscv64-block-mode-glue.o aes-riscv64-zvkned-zvbb-zvkg.o aes-riscv64-zvkned-zvkb.o + quiet_cmd_perlasm = PERLASM $@ cmd_perlasm = $(PERL) $(<) void $(@) $(obj)/aes-riscv64-zvkned.S: $(src)/aes-riscv64-zvkned.pl $(call cmd,perlasm) +$(obj)/aes-riscv64-zvkned-zvbb-zvkg.S: $(src)/aes-riscv64-zvkned-zvbb-zvkg.pl + $(call cmd,perlasm) + +$(obj)/aes-riscv64-zvkned-zvkb.S: $(src)/aes-riscv64-zvkned-zvkb.pl + $(call cmd,perlasm) + clean-files += aes-riscv64-zvkned.S +clean-files += aes-riscv64-zvkned-zvbb-zvkg.S +clean-files += aes-riscv64-zvkned-zvkb.S diff --git a/arch/riscv/crypto/aes-riscv64-block-mode-glue.c b/arch/riscv/crypto/aes-riscv64-block-mode-glue.c new file mode 100644 index 000000000000..36fdd83b11ef --- /dev/null +++ b/arch/riscv/crypto/aes-riscv64-block-mode-glue.c @@ -0,0 +1,514 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Port of the OpenSSL AES block mode implementations for RISC-V + * + * Copyright (C) 2023 SiFive, Inc. 
+ * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "aes-riscv64-glue.h" + +struct riscv64_aes_xts_ctx { + struct crypto_aes_ctx ctx1; + struct crypto_aes_ctx ctx2; +}; + +/* aes cbc block mode using zvkned vector crypto extension */ +asmlinkage void rv64i_zvkned_cbc_encrypt(const u8 *in, u8 *out, size_t length, + const struct crypto_aes_ctx *key, + u8 *ivec); +asmlinkage void rv64i_zvkned_cbc_decrypt(const u8 *in, u8 *out, size_t length, + const struct crypto_aes_ctx *key, + u8 *ivec); +/* aes ecb block mode using zvkned vector crypto extension */ +asmlinkage void rv64i_zvkned_ecb_encrypt(const u8 *in, u8 *out, size_t length, + const struct crypto_aes_ctx *key); +asmlinkage void rv64i_zvkned_ecb_decrypt(const u8 *in, u8 *out, size_t length, + const struct crypto_aes_ctx *key); + +/* aes ctr block mode using zvkb and zvkned vector crypto extension */ +/* This func operates on 32-bit counter. Caller has to handle the overflow. */ +asmlinkage void +rv64i_zvkb_zvkned_ctr32_encrypt_blocks(const u8 *in, u8 *out, size_t length, + const struct crypto_aes_ctx *key, + u8 *ivec); + +/* aes xts block mode using zvbb, zvkg and zvkned vector crypto extension */ +asmlinkage void +rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt(const u8 *in, u8 *out, size_t length, + const struct crypto_aes_ctx *key, u8 *iv, + int update_iv); +asmlinkage void +rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt(const u8 *in, u8 *out, size_t length, + const struct crypto_aes_ctx *key, u8 *iv, + int update_iv); + +typedef void (*aes_xts_func)(const u8 *in, u8 *out, size_t length, + const struct crypto_aes_ctx *key, u8 *iv, + int update_iv); + +/* ecb */ +static int aes_setkey(struct crypto_skcipher *tfm, const u8 *in_key, + unsigned int key_len) +{ + struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); + + return riscv64_aes_setkey(ctx, in_key, key_len); +} + +static int ecb_encrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + const struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + unsigned int nbytes; + int err; + + /* If we have error here, the `nbytes` will be zero. 
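+	 * In that case the while loop below is skipped and the error code
+	 * from skcipher_walk_virt() is returned as-is.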
*/ + err = skcipher_walk_virt(&walk, req, false); + while ((nbytes = walk.nbytes)) { + kernel_vector_begin(); + rv64i_zvkned_ecb_encrypt(walk.src.virt.addr, walk.dst.virt.addr, + nbytes & (~(AES_BLOCK_SIZE - 1)), ctx); + kernel_vector_end(); + err = skcipher_walk_done(&walk, nbytes & (AES_BLOCK_SIZE - 1)); + } + + return err; +} + +static int ecb_decrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + const struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + unsigned int nbytes; + int err; + + err = skcipher_walk_virt(&walk, req, false); + while ((nbytes = walk.nbytes)) { + kernel_vector_begin(); + rv64i_zvkned_ecb_decrypt(walk.src.virt.addr, walk.dst.virt.addr, + nbytes & (~(AES_BLOCK_SIZE - 1)), ctx); + kernel_vector_end(); + err = skcipher_walk_done(&walk, nbytes & (AES_BLOCK_SIZE - 1)); + } + + return err; +} + +/* cbc */ +static int cbc_encrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + const struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + unsigned int nbytes; + int err; + + err = skcipher_walk_virt(&walk, req, false); + while ((nbytes = walk.nbytes)) { + kernel_vector_begin(); + rv64i_zvkned_cbc_encrypt(walk.src.virt.addr, walk.dst.virt.addr, + nbytes & (~(AES_BLOCK_SIZE - 1)), ctx, + walk.iv); + kernel_vector_end(); + err = skcipher_walk_done(&walk, nbytes & (AES_BLOCK_SIZE - 1)); + } + + return err; +} + +static int cbc_decrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + const struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + unsigned int nbytes; + int err; + + err = skcipher_walk_virt(&walk, req, false); + while ((nbytes = walk.nbytes)) { + kernel_vector_begin(); + rv64i_zvkned_cbc_decrypt(walk.src.virt.addr, walk.dst.virt.addr, + nbytes & (~(AES_BLOCK_SIZE - 1)), ctx, + walk.iv); + kernel_vector_end(); + err = skcipher_walk_done(&walk, nbytes & (AES_BLOCK_SIZE - 1)); + } + + return err; +} + +/* ctr */ +static int ctr_encrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + const struct crypto_aes_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + unsigned int ctr32; + unsigned int nbytes; + unsigned int blocks; + unsigned int current_blocks; + unsigned int current_length; + int err; + + /* the ctr iv uses big endian */ + ctr32 = get_unaligned_be32(req->iv + 12); + err = skcipher_walk_virt(&walk, req, false); + while ((nbytes = walk.nbytes)) { + if (nbytes != walk.total) { + nbytes &= (~(AES_BLOCK_SIZE - 1)); + blocks = nbytes / AES_BLOCK_SIZE; + } else { + /* This is the last walk. We should handle the tail data. */ + blocks = DIV_ROUND_UP(nbytes, AES_BLOCK_SIZE); + } + ctr32 += blocks; + + kernel_vector_begin(); + /* + * The `if` block below detects the overflow, which is then handled by + * limiting the amount of blocks to the exact overflow point. 
+ */ + if (ctr32 >= blocks) { + rv64i_zvkb_zvkned_ctr32_encrypt_blocks( + walk.src.virt.addr, walk.dst.virt.addr, nbytes, + ctx, req->iv); + } else { + /* use 2 ctr32 function calls for overflow case */ + current_blocks = blocks - ctr32; + current_length = + min(nbytes, current_blocks * AES_BLOCK_SIZE); + rv64i_zvkb_zvkned_ctr32_encrypt_blocks( + walk.src.virt.addr, walk.dst.virt.addr, + current_length, ctx, req->iv); + crypto_inc(req->iv, 12); + + if (ctr32) { + rv64i_zvkb_zvkned_ctr32_encrypt_blocks( + walk.src.virt.addr + + current_blocks * AES_BLOCK_SIZE, + walk.dst.virt.addr + + current_blocks * AES_BLOCK_SIZE, + nbytes - current_length, ctx, req->iv); + } + } + kernel_vector_end(); + + err = skcipher_walk_done(&walk, walk.nbytes - nbytes); + } + + return err; +} + +/* xts */ +static int xts_setkey(struct crypto_skcipher *tfm, const u8 *in_key, + unsigned int key_len) +{ + struct riscv64_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm); + unsigned int xts_single_key_len = key_len / 2; + int ret; + + ret = xts_verify_key(tfm, in_key, key_len); + if (ret) + return ret; + ret = riscv64_aes_setkey(&ctx->ctx1, in_key, xts_single_key_len); + if (ret) + return ret; + return riscv64_aes_setkey(&ctx->ctx2, in_key + xts_single_key_len, + xts_single_key_len); +} + +static int xts_crypt(struct skcipher_request *req, aes_xts_func func) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + const struct riscv64_aes_xts_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_request sub_req; + struct scatterlist sg_src[2], sg_dst[2]; + struct scatterlist *src, *dst; + struct skcipher_walk walk; + unsigned int walk_size = crypto_skcipher_walksize(tfm); + unsigned int tail_bytes; + unsigned int head_bytes; + unsigned int nbytes; + unsigned int update_iv = 1; + int err; + + /* xts input size should be bigger than AES_BLOCK_SIZE */ + if (req->cryptlen < AES_BLOCK_SIZE) + return -EINVAL; + + /* + * We split xts-aes cryption into `head` and `tail` parts. + * The head block contains the input from the beginning which doesn't need + * `ciphertext stealing` method. + * The tail block contains at least two AES blocks including ciphertext + * stealing data from the end. + */ + if (req->cryptlen <= walk_size) { + /* + * All data is in one `walk`. We could handle it within one AES-XTS call in + * the end. + */ + tail_bytes = req->cryptlen; + head_bytes = 0; + } else { + if (req->cryptlen & (AES_BLOCK_SIZE - 1)) { + /* + * with ciphertext stealing + * + * Find the largest tail size which is small than `walk` size while the + * head part still fits AES block boundary. + */ + tail_bytes = req->cryptlen & (AES_BLOCK_SIZE - 1); + tail_bytes = walk_size + tail_bytes - AES_BLOCK_SIZE; + head_bytes = req->cryptlen - tail_bytes; + } else { + /* no ciphertext stealing */ + tail_bytes = 0; + head_bytes = req->cryptlen; + } + } + + riscv64_aes_encrypt_zvkned(&ctx->ctx2, req->iv, req->iv); + + if (head_bytes && tail_bytes) { + /* If we have to parts, setup new request for head part only. 
*/ + skcipher_request_set_tfm(&sub_req, tfm); + skcipher_request_set_callback( + &sub_req, skcipher_request_flags(req), NULL, NULL); + skcipher_request_set_crypt(&sub_req, req->src, req->dst, + head_bytes, req->iv); + req = &sub_req; + } + + if (head_bytes) { + err = skcipher_walk_virt(&walk, req, false); + while ((nbytes = walk.nbytes)) { + if (nbytes == walk.total) + update_iv = (tail_bytes > 0); + + nbytes &= (~(AES_BLOCK_SIZE - 1)); + kernel_vector_begin(); + func(walk.src.virt.addr, walk.dst.virt.addr, nbytes, + &ctx->ctx1, req->iv, update_iv); + kernel_vector_end(); + + err = skcipher_walk_done(&walk, walk.nbytes - nbytes); + } + if (err || !tail_bytes) + return err; + + /* + * Setup new request for tail part. + * We use `scatterwalk_next()` to find the next scatterlist from last + * walk instead of iterating from the beginning. + */ + dst = src = scatterwalk_next(sg_src, &walk.in); + if (req->dst != req->src) + dst = scatterwalk_next(sg_dst, &walk.out); + skcipher_request_set_crypt(req, src, dst, tail_bytes, req->iv); + } + + /* tail */ + err = skcipher_walk_virt(&walk, req, false); + if (err) + return err; + if (walk.nbytes != tail_bytes) + return -EINVAL; + kernel_vector_begin(); + func(walk.src.virt.addr, walk.dst.virt.addr, walk.nbytes, &ctx->ctx1, + req->iv, 0); + kernel_vector_end(); + + return skcipher_walk_done(&walk, 0); +} + +static int xts_encrypt(struct skcipher_request *req) +{ + return xts_crypt(req, rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt); +} + +static int xts_decrypt(struct skcipher_request *req) +{ + return xts_crypt(req, rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt); +} + +static struct skcipher_alg riscv64_aes_algs_zvkned[] = { + { + .setkey = aes_setkey, + .encrypt = ecb_encrypt, + .decrypt = ecb_decrypt, + .min_keysize = AES_MIN_KEY_SIZE, + .max_keysize = AES_MAX_KEY_SIZE, + .walksize = AES_BLOCK_SIZE * 8, + .base = { + .cra_flags = CRYPTO_ALG_INTERNAL, + .cra_blocksize = AES_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct crypto_aes_ctx), + .cra_priority = 300, + .cra_name = "__ecb(aes)", + .cra_driver_name = "__ecb-aes-riscv64-zvkned", + .cra_module = THIS_MODULE, + }, + }, { + .setkey = aes_setkey, + .encrypt = cbc_encrypt, + .decrypt = cbc_decrypt, + .min_keysize = AES_MIN_KEY_SIZE, + .max_keysize = AES_MAX_KEY_SIZE, + .ivsize = AES_BLOCK_SIZE, + .walksize = AES_BLOCK_SIZE * 8, + .base = { + .cra_flags = CRYPTO_ALG_INTERNAL, + .cra_blocksize = AES_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct crypto_aes_ctx), + .cra_priority = 300, + .cra_name = "__cbc(aes)", + .cra_driver_name = "__cbc-aes-riscv64-zvkned", + .cra_module = THIS_MODULE, + }, + } +}; + +static struct simd_skcipher_alg + *riscv64_aes_simd_algs_zvkned[ARRAY_SIZE(riscv64_aes_algs_zvkned)]; + +static struct skcipher_alg riscv64_aes_alg_zvkned_zvkb[] = { + { + .setkey = aes_setkey, + .encrypt = ctr_encrypt, + .decrypt = ctr_encrypt, + .min_keysize = AES_MIN_KEY_SIZE, + .max_keysize = AES_MAX_KEY_SIZE, + .ivsize = AES_BLOCK_SIZE, + .chunksize = AES_BLOCK_SIZE, + .walksize = AES_BLOCK_SIZE * 8, + .base = { + .cra_flags = CRYPTO_ALG_INTERNAL, + .cra_blocksize = 1, + .cra_ctxsize = sizeof(struct crypto_aes_ctx), + .cra_priority = 300, + .cra_name = "__ctr(aes)", + .cra_driver_name = "__ctr-aes-riscv64-zvkned-zvkb", + .cra_module = THIS_MODULE, + }, + } +}; + +static struct simd_skcipher_alg *riscv64_aes_simd_alg_zvkned_zvkb[ARRAY_SIZE( + riscv64_aes_alg_zvkned_zvkb)]; + +static struct skcipher_alg riscv64_aes_alg_zvkned_zvbb_zvkg[] = { + { + .setkey = xts_setkey, + .encrypt = xts_encrypt, + .decrypt = xts_decrypt, + 
.min_keysize = AES_MIN_KEY_SIZE * 2, + .max_keysize = AES_MAX_KEY_SIZE * 2, + .ivsize = AES_BLOCK_SIZE, + .chunksize = AES_BLOCK_SIZE, + .walksize = AES_BLOCK_SIZE * 8, + .base = { + .cra_flags = CRYPTO_ALG_INTERNAL, + .cra_blocksize = AES_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct riscv64_aes_xts_ctx), + .cra_priority = 300, + .cra_name = "__xts(aes)", + .cra_driver_name = "__xts-aes-riscv64-zvkned-zvbb-zvkg", + .cra_module = THIS_MODULE, + }, + } +}; + +static struct simd_skcipher_alg + *riscv64_aes_simd_alg_zvkned_zvbb_zvkg[ARRAY_SIZE( + riscv64_aes_alg_zvkned_zvbb_zvkg)]; + +static int __init riscv64_aes_block_mod_init(void) +{ + int ret = -ENODEV; + + if (riscv_isa_extension_available(NULL, ZVKNED) && + riscv_vector_vlen() >= 128 && riscv_vector_vlen() <= 2048) { + ret = simd_register_skciphers_compat( + riscv64_aes_algs_zvkned, + ARRAY_SIZE(riscv64_aes_algs_zvkned), + riscv64_aes_simd_algs_zvkned); + if (ret) + return ret; + + if (riscv_isa_extension_available(NULL, ZVBB)) { + ret = simd_register_skciphers_compat( + riscv64_aes_alg_zvkned_zvkb, + ARRAY_SIZE(riscv64_aes_alg_zvkned_zvkb), + riscv64_aes_simd_alg_zvkned_zvkb); + if (ret) + goto unregister_zvkned; + + if (riscv_isa_extension_available(NULL, ZVKG)) { + ret = simd_register_skciphers_compat( + riscv64_aes_alg_zvkned_zvbb_zvkg, + ARRAY_SIZE( + riscv64_aes_alg_zvkned_zvbb_zvkg), + riscv64_aes_simd_alg_zvkned_zvbb_zvkg); + if (ret) + goto unregister_zvkned_zvkb; + } + } + } + + return ret; + +unregister_zvkned_zvkb: + simd_unregister_skciphers(riscv64_aes_alg_zvkned_zvkb, + ARRAY_SIZE(riscv64_aes_alg_zvkned_zvkb), + riscv64_aes_simd_alg_zvkned_zvkb); +unregister_zvkned: + simd_unregister_skciphers(riscv64_aes_algs_zvkned, + ARRAY_SIZE(riscv64_aes_algs_zvkned), + riscv64_aes_simd_algs_zvkned); + + return ret; +} + +static void __exit riscv64_aes_block_mod_fini(void) +{ + simd_unregister_skciphers(riscv64_aes_alg_zvkned_zvbb_zvkg, + ARRAY_SIZE(riscv64_aes_alg_zvkned_zvbb_zvkg), + riscv64_aes_simd_alg_zvkned_zvbb_zvkg); + simd_unregister_skciphers(riscv64_aes_alg_zvkned_zvkb, + ARRAY_SIZE(riscv64_aes_alg_zvkned_zvkb), + riscv64_aes_simd_alg_zvkned_zvkb); + simd_unregister_skciphers(riscv64_aes_algs_zvkned, + ARRAY_SIZE(riscv64_aes_algs_zvkned), + riscv64_aes_simd_algs_zvkned); +} + +module_init(riscv64_aes_block_mod_init); +module_exit(riscv64_aes_block_mod_fini); + +MODULE_DESCRIPTION("AES-ECB/CBC/CTR/XTS (RISC-V accelerated)"); +MODULE_AUTHOR("Jerry Shih "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("cbc(aes)"); +MODULE_ALIAS_CRYPTO("ctr(aes)"); +MODULE_ALIAS_CRYPTO("ecb(aes)"); +MODULE_ALIAS_CRYPTO("xts(aes)"); diff --git a/arch/riscv/crypto/aes-riscv64-zvkned-zvbb-zvkg.pl b/arch/riscv/crypto/aes-riscv64-zvkned-zvbb-zvkg.pl new file mode 100644 index 000000000000..6b6aad1cc97a --- /dev/null +++ b/arch/riscv/crypto/aes-riscv64-zvkned-zvbb-zvkg.pl @@ -0,0 +1,949 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Jerry Shih +# All rights reserved. 
+# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 && VLEN <= 2048 +# - RISC-V Vector Bit-manipulation extension ('Zvbb') +# - RISC-V Vector GCM/GMAC extension ('Zvkg') +# - RISC-V Vector AES block cipher extension ('Zvkned') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +.text +___ + +{ +################################################################################ +# void rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt(const unsigned char *in, +# unsigned char *out, size_t length, +# const AES_KEY *key, +# unsigned char iv[16], +# int update_iv) +my ($INPUT, $OUTPUT, $LENGTH, $KEY, $IV, $UPDATE_IV) = ("a0", "a1", "a2", "a3", "a4", "a5"); +my ($TAIL_LENGTH) = ("a6"); +my ($VL) = ("a7"); +my ($T0, $T1, $T2, $T3) = ("t0", "t1", "t2", "t3"); +my ($STORE_LEN32) = ("t4"); +my ($LEN32) = ("t5"); +my ($V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, + $V8, $V9, $V10, $V11, $V12, $V13, $V14, $V15, + $V16, $V17, $V18, $V19, $V20, $V21, $V22, $V23, + $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, +) = map("v$_",(0..31)); + +# load iv to v28 +sub load_xts_iv0 { + my $code=<<___; + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vle32_v $V28, $IV]} +___ + + return $code; +} + +# prepare input data(v24), iv(v28), bit-reversed-iv(v16), bit-reversed-iv-multiplier(v20) +sub init_first_round { + my $code=<<___; + # load input + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + @{[vle32_v $V24, $INPUT]} + + li $T0, 5 + # We could simplify the initialization steps if we have `block<=1`. + blt $LEN32, $T0, 1f + + # Note: We use `vgmul` for GF(2^128) multiplication. The `vgmul` uses + # different order of coefficients. We should use`vbrev8` to reverse the + # data when we use `vgmul`. 
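+    # vgmul works on the bit-reflected (GHASH) representation of GF(2^128)
+    # elements, whereas the XTS tweak uses the standard bit order within
+    # each byte, so the tweak is passed through vbrev8 before the
+    # multiplications and reversed back afterwards.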
+ @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vbrev8_v $V0, $V28]} + @{[vsetvli "zero", $LEN32, "e32", "m4", "ta", "ma"]} + @{[vmv_v_i $V16, 0]} + # v16: [r-IV0, r-IV0, ...] + @{[vaesz_vs $V16, $V0]} + + # Prepare GF(2^128) multiplier [1, x, x^2, x^3, ...] in v8. + # We use `vwsll` to get power of 2 multipliers. Current rvv spec only + # supports `SEW<=64`. So, the maximum `VLEN` for this approach is `2048`. + # SEW64_BITS * AES_BLOCK_SIZE / LMUL + # = 64 * 128 / 4 = 2048 + # + # TODO: truncate the vl to `2048` for `vlen>2048` case. + slli $T0, $LEN32, 2 + @{[vsetvli "zero", $T0, "e32", "m1", "ta", "ma"]} + # v2: [`1`, `1`, `1`, `1`, ...] + @{[vmv_v_i $V2, 1]} + # v3: [`0`, `1`, `2`, `3`, ...] + @{[vid_v $V3]} + @{[vsetvli "zero", $T0, "e64", "m2", "ta", "ma"]} + # v4: [`1`, 0, `1`, 0, `1`, 0, `1`, 0, ...] + @{[vzext_vf2 $V4, $V2]} + # v6: [`0`, 0, `1`, 0, `2`, 0, `3`, 0, ...] + @{[vzext_vf2 $V6, $V3]} + slli $T0, $LEN32, 1 + @{[vsetvli "zero", $T0, "e32", "m2", "ta", "ma"]} + # v8: [1<<0=1, 0, 0, 0, 1<<1=x, 0, 0, 0, 1<<2=x^2, 0, 0, 0, ...] + @{[vwsll_vv $V8, $V4, $V6]} + + # Compute [r-IV0*1, r-IV0*x, r-IV0*x^2, r-IV0*x^3, ...] in v16 + @{[vsetvli "zero", $LEN32, "e32", "m4", "ta", "ma"]} + @{[vbrev8_v $V8, $V8]} + @{[vgmul_vv $V16, $V8]} + + # Compute [IV0*1, IV0*x, IV0*x^2, IV0*x^3, ...] in v28. + # Reverse the bits order back. + @{[vbrev8_v $V28, $V16]} + + # Prepare the x^n multiplier in v20. The `n` is the aes-xts block number + # in a LMUL=4 register group. + # n = ((VLEN*LMUL)/(32*4)) = ((VLEN*4)/(32*4)) + # = (VLEN/32) + # We could use vsetvli with `e32, m1` to compute the `n` number. + @{[vsetvli $T0, "zero", "e32", "m1", "ta", "ma"]} + li $T1, 1 + sll $T0, $T1, $T0 + @{[vsetivli "zero", 2, "e64", "m1", "ta", "ma"]} + @{[vmv_v_i $V0, 0]} + @{[vsetivli "zero", 1, "e64", "m1", "tu", "ma"]} + @{[vmv_v_x $V0, $T0]} + @{[vsetivli "zero", 2, "e64", "m1", "ta", "ma"]} + @{[vbrev8_v $V0, $V0]} + @{[vsetvli "zero", $LEN32, "e32", "m4", "ta", "ma"]} + @{[vmv_v_i $V20, 0]} + @{[vaesz_vs $V20, $V0]} + + j 2f +1: + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vbrev8_v $V16, $V28]} +2: +___ + + return $code; +} + +# prepare xts enc last block's input(v24) and iv(v28) +sub handle_xts_enc_last_block { + my $code=<<___; + bnez $TAIL_LENGTH, 2f + + beqz $UPDATE_IV, 1f + ## Store next IV + addi $VL, $VL, -4 + @{[vsetivli "zero", 4, "e32", "m4", "ta", "ma"]} + # multiplier + @{[vslidedown_vx $V16, $V16, $VL]} + + # setup `x` multiplier with byte-reversed order + # 0b00000010 => 0b01000000 (0x40) + li $T0, 0x40 + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vmv_v_i $V28, 0]} + @{[vsetivli "zero", 1, "e8", "m1", "tu", "ma"]} + @{[vmv_v_x $V28, $T0]} + + # IV * `x` + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vgmul_vv $V16, $V28]} + # Reverse the IV's bits order back to big-endian + @{[vbrev8_v $V28, $V16]} + + @{[vse32_v $V28, $IV]} +1: + + ret +2: + # slidedown second to last block + addi $VL, $VL, -4 + @{[vsetivli "zero", 4, "e32", "m4", "ta", "ma"]} + # ciphertext + @{[vslidedown_vx $V24, $V24, $VL]} + # multiplier + @{[vslidedown_vx $V16, $V16, $VL]} + + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vmv_v_v $V25, $V24]} + + # load last block into v24 + # note: We should load the last block before store the second to last block + # for in-place operation. 
+ @{[vsetvli "zero", $TAIL_LENGTH, "e8", "m1", "tu", "ma"]} + @{[vle8_v $V24, $INPUT]} + + # setup `x` multiplier with byte-reversed order + # 0b00000010 => 0b01000000 (0x40) + li $T0, 0x40 + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vmv_v_i $V28, 0]} + @{[vsetivli "zero", 1, "e8", "m1", "tu", "ma"]} + @{[vmv_v_x $V28, $T0]} + + # compute IV for last block + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vgmul_vv $V16, $V28]} + @{[vbrev8_v $V28, $V16]} + + # store second to last block + @{[vsetvli "zero", $TAIL_LENGTH, "e8", "m1", "ta", "ma"]} + @{[vse8_v $V25, $OUTPUT]} +___ + + return $code; +} + +# prepare xts dec second to last block's input(v24) and iv(v29) and +# last block's and iv(v28) +sub handle_xts_dec_last_block { + my $code=<<___; + bnez $TAIL_LENGTH, 2f + + beqz $UPDATE_IV, 1f + ## Store next IV + # setup `x` multiplier with byte-reversed order + # 0b00000010 => 0b01000000 (0x40) + li $T0, 0x40 + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vmv_v_i $V28, 0]} + @{[vsetivli "zero", 1, "e8", "m1", "tu", "ma"]} + @{[vmv_v_x $V28, $T0]} + + beqz $LENGTH, 3f + addi $VL, $VL, -4 + @{[vsetivli "zero", 4, "e32", "m4", "ta", "ma"]} + # multiplier + @{[vslidedown_vx $V16, $V16, $VL]} + +3: + # IV * `x` + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vgmul_vv $V16, $V28]} + # Reverse the IV's bits order back to big-endian + @{[vbrev8_v $V28, $V16]} + + @{[vse32_v $V28, $IV]} +1: + + ret +2: + # load second to last block's ciphertext + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vle32_v $V24, $INPUT]} + addi $INPUT, $INPUT, 16 + + # setup `x` multiplier with byte-reversed order + # 0b00000010 => 0b01000000 (0x40) + li $T0, 0x40 + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vmv_v_i $V20, 0]} + @{[vsetivli "zero", 1, "e8", "m1", "tu", "ma"]} + @{[vmv_v_x $V20, $T0]} + + beqz $LENGTH, 1f + # slidedown third to last block + addi $VL, $VL, -4 + @{[vsetivli "zero", 4, "e32", "m4", "ta", "ma"]} + # multiplier + @{[vslidedown_vx $V16, $V16, $VL]} + + # compute IV for last block + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vgmul_vv $V16, $V20]} + @{[vbrev8_v $V28, $V16]} + + # compute IV for second to last block + @{[vgmul_vv $V16, $V20]} + @{[vbrev8_v $V29, $V16]} + j 2f +1: + # compute IV for second to last block + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vgmul_vv $V16, $V20]} + @{[vbrev8_v $V29, $V16]} +2: +___ + + return $code; +} + +# Load all 11 round keys to v1-v11 registers. +sub aes_128_load_key { + my $code=<<___; + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vle32_v $V1, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V2, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V3, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V4, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V5, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V6, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V7, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V8, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V9, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V10, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V11, $KEY]} +___ + + return $code; +} + +# Load all 13 round keys to v1-v13 registers. 
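+# (AES-192 uses 12 rounds, hence 12 + 1 = 13 round keys of 16 bytes each,
+# stored back to back in the expanded key schedule.)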
+sub aes_192_load_key { + my $code=<<___; + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vle32_v $V1, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V2, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V3, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V4, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V5, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V6, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V7, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V8, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V9, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V10, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V11, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V12, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V13, $KEY]} +___ + + return $code; +} + +# Load all 15 round keys to v1-v15 registers. +sub aes_256_load_key { + my $code=<<___; + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vle32_v $V1, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V2, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V3, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V4, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V5, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V6, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V7, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V8, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V9, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V10, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V11, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V12, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V13, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V14, $KEY]} + addi $KEY, $KEY, 16 + @{[vle32_v $V15, $KEY]} +___ + + return $code; +} + +# aes-128 enc with round keys v1-v11 +sub aes_128_enc { + my $code=<<___; + @{[vaesz_vs $V24, $V1]} + @{[vaesem_vs $V24, $V2]} + @{[vaesem_vs $V24, $V3]} + @{[vaesem_vs $V24, $V4]} + @{[vaesem_vs $V24, $V5]} + @{[vaesem_vs $V24, $V6]} + @{[vaesem_vs $V24, $V7]} + @{[vaesem_vs $V24, $V8]} + @{[vaesem_vs $V24, $V9]} + @{[vaesem_vs $V24, $V10]} + @{[vaesef_vs $V24, $V11]} +___ + + return $code; +} + +# aes-128 dec with round keys v1-v11 +sub aes_128_dec { + my $code=<<___; + @{[vaesz_vs $V24, $V11]} + @{[vaesdm_vs $V24, $V10]} + @{[vaesdm_vs $V24, $V9]} + @{[vaesdm_vs $V24, $V8]} + @{[vaesdm_vs $V24, $V7]} + @{[vaesdm_vs $V24, $V6]} + @{[vaesdm_vs $V24, $V5]} + @{[vaesdm_vs $V24, $V4]} + @{[vaesdm_vs $V24, $V3]} + @{[vaesdm_vs $V24, $V2]} + @{[vaesdf_vs $V24, $V1]} +___ + + return $code; +} + +# aes-192 enc with round keys v1-v13 +sub aes_192_enc { + my $code=<<___; + @{[vaesz_vs $V24, $V1]} + @{[vaesem_vs $V24, $V2]} + @{[vaesem_vs $V24, $V3]} + @{[vaesem_vs $V24, $V4]} + @{[vaesem_vs $V24, $V5]} + @{[vaesem_vs $V24, $V6]} + @{[vaesem_vs $V24, $V7]} + @{[vaesem_vs $V24, $V8]} + @{[vaesem_vs $V24, $V9]} + @{[vaesem_vs $V24, $V10]} + @{[vaesem_vs $V24, $V11]} + @{[vaesem_vs $V24, $V12]} + @{[vaesef_vs $V24, $V13]} +___ + + return $code; +} + +# aes-192 dec with round keys v1-v13 +sub aes_192_dec { + my $code=<<___; + @{[vaesz_vs $V24, $V13]} + @{[vaesdm_vs $V24, $V12]} + @{[vaesdm_vs $V24, $V11]} + @{[vaesdm_vs $V24, $V10]} + @{[vaesdm_vs $V24, $V9]} + @{[vaesdm_vs $V24, $V8]} + @{[vaesdm_vs $V24, $V7]} + @{[vaesdm_vs $V24, $V6]} + @{[vaesdm_vs $V24, $V5]} + @{[vaesdm_vs $V24, $V4]} + @{[vaesdm_vs $V24, $V3]} + @{[vaesdm_vs $V24, $V2]} + @{[vaesdf_vs $V24, $V1]} +___ + + return $code; +} + +# aes-256 enc with round keys v1-v15 +sub aes_256_enc { + my $code=<<___; + @{[vaesz_vs $V24, $V1]} + @{[vaesem_vs $V24, $V2]} + @{[vaesem_vs $V24, $V3]} + @{[vaesem_vs $V24, $V4]} + @{[vaesem_vs $V24, $V5]} + @{[vaesem_vs $V24, $V6]} + @{[vaesem_vs $V24, $V7]} + 
@{[vaesem_vs $V24, $V8]} + @{[vaesem_vs $V24, $V9]} + @{[vaesem_vs $V24, $V10]} + @{[vaesem_vs $V24, $V11]} + @{[vaesem_vs $V24, $V12]} + @{[vaesem_vs $V24, $V13]} + @{[vaesem_vs $V24, $V14]} + @{[vaesef_vs $V24, $V15]} +___ + + return $code; +} + +# aes-256 dec with round keys v1-v15 +sub aes_256_dec { + my $code=<<___; + @{[vaesz_vs $V24, $V15]} + @{[vaesdm_vs $V24, $V14]} + @{[vaesdm_vs $V24, $V13]} + @{[vaesdm_vs $V24, $V12]} + @{[vaesdm_vs $V24, $V11]} + @{[vaesdm_vs $V24, $V10]} + @{[vaesdm_vs $V24, $V9]} + @{[vaesdm_vs $V24, $V8]} + @{[vaesdm_vs $V24, $V7]} + @{[vaesdm_vs $V24, $V6]} + @{[vaesdm_vs $V24, $V5]} + @{[vaesdm_vs $V24, $V4]} + @{[vaesdm_vs $V24, $V3]} + @{[vaesdm_vs $V24, $V2]} + @{[vaesdf_vs $V24, $V1]} +___ + + return $code; +} + +$code .= <<___; +.p2align 3 +.globl rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt +.type rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt,\@function +rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt: + @{[load_xts_iv0]} + + # aes block size is 16 + andi $TAIL_LENGTH, $LENGTH, 15 + mv $STORE_LEN32, $LENGTH + beqz $TAIL_LENGTH, 1f + sub $LENGTH, $LENGTH, $TAIL_LENGTH + addi $STORE_LEN32, $LENGTH, -16 +1: + # We make the `LENGTH` become e32 length here. + srli $LEN32, $LENGTH, 2 + srli $STORE_LEN32, $STORE_LEN32, 2 + + # Load key length. + lwu $T0, 480($KEY) + li $T1, 32 + li $T2, 24 + li $T3, 16 + beq $T0, $T1, aes_xts_enc_256 + beq $T0, $T2, aes_xts_enc_192 + beq $T0, $T3, aes_xts_enc_128 +.size rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt,.-rv64i_zvbb_zvkg_zvkned_aes_xts_encrypt +___ + +$code .= <<___; +.p2align 3 +aes_xts_enc_128: + @{[init_first_round]} + @{[aes_128_load_key]} + + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + j 1f + +.Lenc_blocks_128: + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + # load plaintext into v24 + @{[vle32_v $V24, $INPUT]} + # update iv + @{[vgmul_vv $V16, $V20]} + # reverse the iv's bits order back + @{[vbrev8_v $V28, $V16]} +1: + @{[vxor_vv $V24, $V24, $V28]} + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + add $INPUT, $INPUT, $T0 + @{[aes_128_enc]} + @{[vxor_vv $V24, $V24, $V28]} + + # store ciphertext + @{[vsetvli "zero", $STORE_LEN32, "e32", "m4", "ta", "ma"]} + @{[vse32_v $V24, $OUTPUT]} + add $OUTPUT, $OUTPUT, $T0 + sub $STORE_LEN32, $STORE_LEN32, $VL + + bnez $LEN32, .Lenc_blocks_128 + + @{[handle_xts_enc_last_block]} + + # xts last block + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vxor_vv $V24, $V24, $V28]} + @{[aes_128_enc]} + @{[vxor_vv $V24, $V24, $V28]} + + # store last block ciphertext + addi $OUTPUT, $OUTPUT, -16 + @{[vse32_v $V24, $OUTPUT]} + + ret +.size aes_xts_enc_128,.-aes_xts_enc_128 +___ + +$code .= <<___; +.p2align 3 +aes_xts_enc_192: + @{[init_first_round]} + @{[aes_192_load_key]} + + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + j 1f + +.Lenc_blocks_192: + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + # load plaintext into v24 + @{[vle32_v $V24, $INPUT]} + # update iv + @{[vgmul_vv $V16, $V20]} + # reverse the iv's bits order back + @{[vbrev8_v $V28, $V16]} +1: + @{[vxor_vv $V24, $V24, $V28]} + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + add $INPUT, $INPUT, $T0 + @{[aes_192_enc]} + @{[vxor_vv $V24, $V24, $V28]} + + # store ciphertext + @{[vsetvli "zero", $STORE_LEN32, "e32", "m4", "ta", "ma"]} + @{[vse32_v $V24, $OUTPUT]} + add $OUTPUT, $OUTPUT, $T0 + sub $STORE_LEN32, $STORE_LEN32, $VL + + bnez $LEN32, .Lenc_blocks_192 + + @{[handle_xts_enc_last_block]} + + # xts last block + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vxor_vv $V24, $V24, $V28]} + @{[aes_192_enc]} + 
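+ # The XOR with the tweak below completes the XTS whitening
+ # C = Enc(K, P ^ T) ^ T for this final (stolen) block.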
@{[vxor_vv $V24, $V24, $V28]} + + # store last block ciphertext + addi $OUTPUT, $OUTPUT, -16 + @{[vse32_v $V24, $OUTPUT]} + + ret +.size aes_xts_enc_192,.-aes_xts_enc_192 +___ + +$code .= <<___; +.p2align 3 +aes_xts_enc_256: + @{[init_first_round]} + @{[aes_256_load_key]} + + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + j 1f + +.Lenc_blocks_256: + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + # load plaintext into v24 + @{[vle32_v $V24, $INPUT]} + # update iv + @{[vgmul_vv $V16, $V20]} + # reverse the iv's bits order back + @{[vbrev8_v $V28, $V16]} +1: + @{[vxor_vv $V24, $V24, $V28]} + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + add $INPUT, $INPUT, $T0 + @{[aes_256_enc]} + @{[vxor_vv $V24, $V24, $V28]} + + # store ciphertext + @{[vsetvli "zero", $STORE_LEN32, "e32", "m4", "ta", "ma"]} + @{[vse32_v $V24, $OUTPUT]} + add $OUTPUT, $OUTPUT, $T0 + sub $STORE_LEN32, $STORE_LEN32, $VL + + bnez $LEN32, .Lenc_blocks_256 + + @{[handle_xts_enc_last_block]} + + # xts last block + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vxor_vv $V24, $V24, $V28]} + @{[aes_256_enc]} + @{[vxor_vv $V24, $V24, $V28]} + + # store last block ciphertext + addi $OUTPUT, $OUTPUT, -16 + @{[vse32_v $V24, $OUTPUT]} + + ret +.size aes_xts_enc_256,.-aes_xts_enc_256 +___ + +################################################################################ +# void rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt(const unsigned char *in, +# unsigned char *out, size_t length, +# const AES_KEY *key, +# unsigned char iv[16], +# int update_iv) +$code .= <<___; +.p2align 3 +.globl rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt +.type rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt,\@function +rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt: + @{[load_xts_iv0]} + + # aes block size is 16 + andi $TAIL_LENGTH, $LENGTH, 15 + beqz $TAIL_LENGTH, 1f + sub $LENGTH, $LENGTH, $TAIL_LENGTH + addi $LENGTH, $LENGTH, -16 +1: + # We make the `LENGTH` become e32 length here. + srli $LEN32, $LENGTH, 2 + + # Load key length. 
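+ # The word at byte offset 480 of the key struct is the key length in
+ # bytes (16/24/32). This matches the kernel's struct crypto_aes_ctx,
+ # where u32 key_length follows the key_enc[60] and key_dec[60] arrays
+ # (2 * 60 * 4 = 480 bytes).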
+ lwu $T0, 480($KEY) + li $T1, 32 + li $T2, 24 + li $T3, 16 + beq $T0, $T1, aes_xts_dec_256 + beq $T0, $T2, aes_xts_dec_192 + beq $T0, $T3, aes_xts_dec_128 +.size rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt,.-rv64i_zvbb_zvkg_zvkned_aes_xts_decrypt +___ + +$code .= <<___; +.p2align 3 +aes_xts_dec_128: + @{[init_first_round]} + @{[aes_128_load_key]} + + beqz $LEN32, 2f + + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + j 1f + +.Ldec_blocks_128: + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + # load ciphertext into v24 + @{[vle32_v $V24, $INPUT]} + # update iv + @{[vgmul_vv $V16, $V20]} + # reverse the iv's bits order back + @{[vbrev8_v $V28, $V16]} +1: + @{[vxor_vv $V24, $V24, $V28]} + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + add $INPUT, $INPUT, $T0 + @{[aes_128_dec]} + @{[vxor_vv $V24, $V24, $V28]} + + # store plaintext + @{[vse32_v $V24, $OUTPUT]} + add $OUTPUT, $OUTPUT, $T0 + + bnez $LEN32, .Ldec_blocks_128 + +2: + @{[handle_xts_dec_last_block]} + + ## xts second to last block + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vxor_vv $V24, $V24, $V29]} + @{[aes_128_dec]} + @{[vxor_vv $V24, $V24, $V29]} + @{[vmv_v_v $V25, $V24]} + + # load last block ciphertext + @{[vsetvli "zero", $TAIL_LENGTH, "e8", "m1", "tu", "ma"]} + @{[vle8_v $V24, $INPUT]} + + # store second to last block plaintext + addi $T0, $OUTPUT, 16 + @{[vse8_v $V25, $T0]} + + ## xts last block + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vxor_vv $V24, $V24, $V28]} + @{[aes_128_dec]} + @{[vxor_vv $V24, $V24, $V28]} + + # store second to last block plaintext + @{[vse32_v $V24, $OUTPUT]} + + ret +.size aes_xts_dec_128,.-aes_xts_dec_128 +___ + +$code .= <<___; +.p2align 3 +aes_xts_dec_192: + @{[init_first_round]} + @{[aes_192_load_key]} + + beqz $LEN32, 2f + + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + j 1f + +.Ldec_blocks_192: + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + # load ciphertext into v24 + @{[vle32_v $V24, $INPUT]} + # update iv + @{[vgmul_vv $V16, $V20]} + # reverse the iv's bits order back + @{[vbrev8_v $V28, $V16]} +1: + @{[vxor_vv $V24, $V24, $V28]} + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + add $INPUT, $INPUT, $T0 + @{[aes_192_dec]} + @{[vxor_vv $V24, $V24, $V28]} + + # store plaintext + @{[vse32_v $V24, $OUTPUT]} + add $OUTPUT, $OUTPUT, $T0 + + bnez $LEN32, .Ldec_blocks_192 + +2: + @{[handle_xts_dec_last_block]} + + ## xts second to last block + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vxor_vv $V24, $V24, $V29]} + @{[aes_192_dec]} + @{[vxor_vv $V24, $V24, $V29]} + @{[vmv_v_v $V25, $V24]} + + # load last block ciphertext + @{[vsetvli "zero", $TAIL_LENGTH, "e8", "m1", "tu", "ma"]} + @{[vle8_v $V24, $INPUT]} + + # store second to last block plaintext + addi $T0, $OUTPUT, 16 + @{[vse8_v $V25, $T0]} + + ## xts last block + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vxor_vv $V24, $V24, $V28]} + @{[aes_192_dec]} + @{[vxor_vv $V24, $V24, $V28]} + + # store second to last block plaintext + @{[vse32_v $V24, $OUTPUT]} + + ret +.size aes_xts_dec_192,.-aes_xts_dec_192 +___ + +$code .= <<___; +.p2align 3 +aes_xts_dec_256: + @{[init_first_round]} + @{[aes_256_load_key]} + + beqz $LEN32, 2f + + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + j 1f + +.Ldec_blocks_256: + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + # load ciphertext into v24 + @{[vle32_v $V24, $INPUT]} + # update iv + @{[vgmul_vv $V16, $V20]} + # reverse the iv's bits order back + @{[vbrev8_v $V28, $V16]} +1: + @{[vxor_vv $V24, $V24, $V28]} + slli $T0, $VL, 2 + sub $LEN32, $LEN32, 
$VL + add $INPUT, $INPUT, $T0 + @{[aes_256_dec]} + @{[vxor_vv $V24, $V24, $V28]} + + # store plaintext + @{[vse32_v $V24, $OUTPUT]} + add $OUTPUT, $OUTPUT, $T0 + + bnez $LEN32, .Ldec_blocks_256 + +2: + @{[handle_xts_dec_last_block]} + + ## xts second to last block + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vxor_vv $V24, $V24, $V29]} + @{[aes_256_dec]} + @{[vxor_vv $V24, $V24, $V29]} + @{[vmv_v_v $V25, $V24]} + + # load last block ciphertext + @{[vsetvli "zero", $TAIL_LENGTH, "e8", "m1", "tu", "ma"]} + @{[vle8_v $V24, $INPUT]} + + # store second to last block plaintext + addi $T0, $OUTPUT, 16 + @{[vse8_v $V25, $T0]} + + ## xts last block + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vxor_vv $V24, $V24, $V28]} + @{[aes_256_dec]} + @{[vxor_vv $V24, $V24, $V28]} + + # store second to last block plaintext + @{[vse32_v $V24, $OUTPUT]} + + ret +.size aes_xts_dec_256,.-aes_xts_dec_256 +___ +} + +print $code; + +close STDOUT or die "error closing STDOUT: $!"; diff --git a/arch/riscv/crypto/aes-riscv64-zvkned-zvkb.pl b/arch/riscv/crypto/aes-riscv64-zvkned-zvkb.pl new file mode 100644 index 000000000000..3b8c324bc4d5 --- /dev/null +++ b/arch/riscv/crypto/aes-riscv64-zvkned-zvkb.pl @@ -0,0 +1,415 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Jerry Shih +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 +# - RISC-V Vector Cryptography Bit-manipulation extension ('Zvkb') +# - RISC-V Vector AES block cipher extension ('Zvkned') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? 
pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +.text +___ + +################################################################################ +# void rv64i_zvkb_zvkned_ctr32_encrypt_blocks(const unsigned char *in, +# unsigned char *out, size_t length, +# const void *key, +# unsigned char ivec[16]); +{ +my ($INP, $OUTP, $LEN, $KEYP, $IVP) = ("a0", "a1", "a2", "a3", "a4"); +my ($T0, $T1, $T2, $T3) = ("t0", "t1", "t2", "t3"); +my ($VL) = ("t4"); +my ($LEN32) = ("t5"); +my ($CTR) = ("t6"); +my ($MASK) = ("v0"); +my ($V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, + $V8, $V9, $V10, $V11, $V12, $V13, $V14, $V15, + $V16, $V17, $V18, $V19, $V20, $V21, $V22, $V23, + $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, +) = map("v$_",(0..31)); + +# Prepare the AES ctr input data into v16. +sub init_aes_ctr_input { + my $code=<<___; + # Setup mask into v0 + # The mask pattern for 4*N-th elements + # mask v0: [000100010001....] + # Note: + # We could setup the mask just for the maximum element length instead of + # the VLMAX. + li $T0, 0b10001000 + @{[vsetvli $T2, "zero", "e8", "m1", "ta", "ma"]} + @{[vmv_v_x $MASK, $T0]} + # Load IV. + # v31:[IV0, IV1, IV2, big-endian count] + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vle32_v $V31, $IVP]} + # Convert the big-endian counter into little-endian. + @{[vsetivli "zero", 4, "e32", "m1", "ta", "mu"]} + @{[vrev8_v $V31, $V31, $MASK]} + # Splat the IV to v16 + @{[vsetvli "zero", $LEN32, "e32", "m4", "ta", "ma"]} + @{[vmv_v_i $V16, 0]} + @{[vaesz_vs $V16, $V31]} + # Prepare the ctr pattern into v20 + # v20: [x, x, x, 0, x, x, x, 1, x, x, x, 2, ...] + @{[viota_m $V20, $MASK, $MASK]} + # v16:[IV0, IV1, IV2, count+0, IV0, IV1, IV2, count+1, ...] + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "mu"]} + @{[vadd_vv $V16, $V16, $V20, $MASK]} +___ + + return $code; +} + +$code .= <<___; +.p2align 3 +.globl rv64i_zvkb_zvkned_ctr32_encrypt_blocks +.type rv64i_zvkb_zvkned_ctr32_encrypt_blocks,\@function +rv64i_zvkb_zvkned_ctr32_encrypt_blocks: + # The aes block size is 16 bytes. + # We try to get the minimum aes block number including the tail data. + addi $T0, $LEN, 15 + # the minimum block number + srli $T0, $T0, 4 + # We make the block number become e32 length here. + slli $LEN32, $T0, 2 + + # Load key length. + lwu $T0, 480($KEYP) + li $T1, 32 + li $T2, 24 + li $T3, 16 + + beq $T0, $T1, ctr32_encrypt_blocks_256 + beq $T0, $T2, ctr32_encrypt_blocks_192 + beq $T0, $T3, ctr32_encrypt_blocks_128 + + ret +.size rv64i_zvkb_zvkned_ctr32_encrypt_blocks,.-rv64i_zvkb_zvkned_ctr32_encrypt_blocks +___ + +$code .= <<___; +.p2align 3 +ctr32_encrypt_blocks_128: + # Load all 11 round keys to v1-v11 registers. + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vle32_v $V1, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V2, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V3, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V4, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V5, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V6, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V7, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V8, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V9, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V10, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V11, $KEYP]} + + @{[init_aes_ctr_input]} + + ##### AES body + j 2f +1: + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "mu"]} + # Increase ctr in v16. + @{[vadd_vx $V16, $V16, $CTR, $MASK]} +2: + # Prepare the AES ctr input into v24. 
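+ # v16 keeps the counter words in little-endian so the per-iteration
+ # increment is a plain masked vadd_vx; the copy in v24 is byte-reversed
+ # back for the cipher input.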
+ # The ctr data uses big-endian form. + @{[vmv_v_v $V24, $V16]} + @{[vrev8_v $V24, $V24, $MASK]} + srli $CTR, $VL, 2 + sub $LEN32, $LEN32, $VL + + # Load plaintext in bytes into v20. + @{[vsetvli $T0, $LEN, "e8", "m4", "ta", "ma"]} + @{[vle8_v $V20, $INP]} + sub $LEN, $LEN, $T0 + add $INP, $INP, $T0 + + @{[vsetvli "zero", $VL, "e32", "m4", "ta", "ma"]} + @{[vaesz_vs $V24, $V1]} + @{[vaesem_vs $V24, $V2]} + @{[vaesem_vs $V24, $V3]} + @{[vaesem_vs $V24, $V4]} + @{[vaesem_vs $V24, $V5]} + @{[vaesem_vs $V24, $V6]} + @{[vaesem_vs $V24, $V7]} + @{[vaesem_vs $V24, $V8]} + @{[vaesem_vs $V24, $V9]} + @{[vaesem_vs $V24, $V10]} + @{[vaesef_vs $V24, $V11]} + + # ciphertext + @{[vsetvli "zero", $T0, "e8", "m4", "ta", "ma"]} + @{[vxor_vv $V24, $V24, $V20]} + + # Store the ciphertext. + @{[vse8_v $V24, $OUTP]} + add $OUTP, $OUTP, $T0 + + bnez $LEN, 1b + + ## store ctr iv + @{[vsetivli "zero", 4, "e32", "m1", "ta", "mu"]} + # Increase ctr in v16. + @{[vadd_vx $V16, $V16, $CTR, $MASK]} + # Convert ctr data back to big-endian. + @{[vrev8_v $V16, $V16, $MASK]} + @{[vse32_v $V16, $IVP]} + + ret +.size ctr32_encrypt_blocks_128,.-ctr32_encrypt_blocks_128 +___ + +$code .= <<___; +.p2align 3 +ctr32_encrypt_blocks_192: + # Load all 13 round keys to v1-v13 registers. + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vle32_v $V1, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V2, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V3, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V4, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V5, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V6, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V7, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V8, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V9, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V10, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V11, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V12, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V13, $KEYP]} + + @{[init_aes_ctr_input]} + + ##### AES body + j 2f +1: + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "mu"]} + # Increase ctr in v16. + @{[vadd_vx $V16, $V16, $CTR, $MASK]} +2: + # Prepare the AES ctr input into v24. + # The ctr data uses big-endian form. + @{[vmv_v_v $V24, $V16]} + @{[vrev8_v $V24, $V24, $MASK]} + srli $CTR, $VL, 2 + sub $LEN32, $LEN32, $VL + + # Load plaintext in bytes into v20. + @{[vsetvli $T0, $LEN, "e8", "m4", "ta", "ma"]} + @{[vle8_v $V20, $INP]} + sub $LEN, $LEN, $T0 + add $INP, $INP, $T0 + + @{[vsetvli "zero", $VL, "e32", "m4", "ta", "ma"]} + @{[vaesz_vs $V24, $V1]} + @{[vaesem_vs $V24, $V2]} + @{[vaesem_vs $V24, $V3]} + @{[vaesem_vs $V24, $V4]} + @{[vaesem_vs $V24, $V5]} + @{[vaesem_vs $V24, $V6]} + @{[vaesem_vs $V24, $V7]} + @{[vaesem_vs $V24, $V8]} + @{[vaesem_vs $V24, $V9]} + @{[vaesem_vs $V24, $V10]} + @{[vaesem_vs $V24, $V11]} + @{[vaesem_vs $V24, $V12]} + @{[vaesef_vs $V24, $V13]} + + # ciphertext + @{[vsetvli "zero", $T0, "e8", "m4", "ta", "ma"]} + @{[vxor_vv $V24, $V24, $V20]} + + # Store the ciphertext. + @{[vse8_v $V24, $OUTP]} + add $OUTP, $OUTP, $T0 + + bnez $LEN, 1b + + ## store ctr iv + @{[vsetivli "zero", 4, "e32", "m1", "ta", "mu"]} + # Increase ctr in v16. + @{[vadd_vx $V16, $V16, $CTR, $MASK]} + # Convert ctr data back to big-endian. + @{[vrev8_v $V16, $V16, $MASK]} + @{[vse32_v $V16, $IVP]} + + ret +.size ctr32_encrypt_blocks_192,.-ctr32_encrypt_blocks_192 +___ + +$code .= <<___; +.p2align 3 +ctr32_encrypt_blocks_256: + # Load all 15 round keys to v1-v15 registers. 
+ @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vle32_v $V1, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V2, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V3, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V4, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V5, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V6, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V7, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V8, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V9, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V10, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V11, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V12, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V13, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V14, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V15, $KEYP]} + + @{[init_aes_ctr_input]} + + ##### AES body + j 2f +1: + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "mu"]} + # Increase ctr in v16. + @{[vadd_vx $V16, $V16, $CTR, $MASK]} +2: + # Prepare the AES ctr input into v24. + # The ctr data uses big-endian form. + @{[vmv_v_v $V24, $V16]} + @{[vrev8_v $V24, $V24, $MASK]} + srli $CTR, $VL, 2 + sub $LEN32, $LEN32, $VL + + # Load plaintext in bytes into v20. + @{[vsetvli $T0, $LEN, "e8", "m4", "ta", "ma"]} + @{[vle8_v $V20, $INP]} + sub $LEN, $LEN, $T0 + add $INP, $INP, $T0 + + @{[vsetvli "zero", $VL, "e32", "m4", "ta", "ma"]} + @{[vaesz_vs $V24, $V1]} + @{[vaesem_vs $V24, $V2]} + @{[vaesem_vs $V24, $V3]} + @{[vaesem_vs $V24, $V4]} + @{[vaesem_vs $V24, $V5]} + @{[vaesem_vs $V24, $V6]} + @{[vaesem_vs $V24, $V7]} + @{[vaesem_vs $V24, $V8]} + @{[vaesem_vs $V24, $V9]} + @{[vaesem_vs $V24, $V10]} + @{[vaesem_vs $V24, $V11]} + @{[vaesem_vs $V24, $V12]} + @{[vaesem_vs $V24, $V13]} + @{[vaesem_vs $V24, $V14]} + @{[vaesef_vs $V24, $V15]} + + # ciphertext + @{[vsetvli "zero", $T0, "e8", "m4", "ta", "ma"]} + @{[vxor_vv $V24, $V24, $V20]} + + # Store the ciphertext. + @{[vse8_v $V24, $OUTP]} + add $OUTP, $OUTP, $T0 + + bnez $LEN, 1b + + ## store ctr iv + @{[vsetivli "zero", 4, "e32", "m1", "ta", "mu"]} + # Increase ctr in v16. + @{[vadd_vx $V16, $V16, $CTR, $MASK]} + # Convert ctr data back to big-endian. + @{[vrev8_v $V16, $V16, $MASK]} + @{[vse32_v $V16, $IVP]} + + ret +.size ctr32_encrypt_blocks_256,.-ctr32_encrypt_blocks_256 +___ +} + +print $code; + +close STDOUT or die "error closing STDOUT: $!"; diff --git a/arch/riscv/crypto/aes-riscv64-zvkned.pl b/arch/riscv/crypto/aes-riscv64-zvkned.pl index 303e82d9f6f0..71a9248320c0 100644 --- a/arch/riscv/crypto/aes-riscv64-zvkned.pl +++ b/arch/riscv/crypto/aes-riscv64-zvkned.pl @@ -67,6 +67,752 @@ my ($V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, ) = map("v$_",(0..31)); +# Load all 11 round keys to v1-v11 registers. +sub aes_128_load_key { + my $KEYP = shift; + + my $code=<<___; + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vle32_v $V1, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V2, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V3, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V4, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V5, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V6, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V7, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V8, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V9, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V10, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V11, $KEYP]} +___ + + return $code; +} + +# Load all 13 round keys to v1-v13 registers. 
+sub aes_192_load_key { + my $KEYP = shift; + + my $code=<<___; + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vle32_v $V1, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V2, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V3, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V4, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V5, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V6, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V7, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V8, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V9, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V10, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V11, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V12, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V13, $KEYP]} +___ + + return $code; +} + +# Load all 15 round keys to v1-v15 registers. +sub aes_256_load_key { + my $KEYP = shift; + + my $code=<<___; + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vle32_v $V1, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V2, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V3, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V4, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V5, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V6, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V7, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V8, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V9, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V10, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V11, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V12, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V13, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V14, $KEYP]} + addi $KEYP, $KEYP, 16 + @{[vle32_v $V15, $KEYP]} +___ + + return $code; +} + +# aes-128 encryption with round keys v1-v11 +sub aes_128_encrypt { + my $code=<<___; + @{[vaesz_vs $V24, $V1]} # with round key w[ 0, 3] + @{[vaesem_vs $V24, $V2]} # with round key w[ 4, 7] + @{[vaesem_vs $V24, $V3]} # with round key w[ 8,11] + @{[vaesem_vs $V24, $V4]} # with round key w[12,15] + @{[vaesem_vs $V24, $V5]} # with round key w[16,19] + @{[vaesem_vs $V24, $V6]} # with round key w[20,23] + @{[vaesem_vs $V24, $V7]} # with round key w[24,27] + @{[vaesem_vs $V24, $V8]} # with round key w[28,31] + @{[vaesem_vs $V24, $V9]} # with round key w[32,35] + @{[vaesem_vs $V24, $V10]} # with round key w[36,39] + @{[vaesef_vs $V24, $V11]} # with round key w[40,43] +___ + + return $code; +} + +# aes-128 decryption with round keys v1-v11 +sub aes_128_decrypt { + my $code=<<___; + @{[vaesz_vs $V24, $V11]} # with round key w[40,43] + @{[vaesdm_vs $V24, $V10]} # with round key w[36,39] + @{[vaesdm_vs $V24, $V9]} # with round key w[32,35] + @{[vaesdm_vs $V24, $V8]} # with round key w[28,31] + @{[vaesdm_vs $V24, $V7]} # with round key w[24,27] + @{[vaesdm_vs $V24, $V6]} # with round key w[20,23] + @{[vaesdm_vs $V24, $V5]} # with round key w[16,19] + @{[vaesdm_vs $V24, $V4]} # with round key w[12,15] + @{[vaesdm_vs $V24, $V3]} # with round key w[ 8,11] + @{[vaesdm_vs $V24, $V2]} # with round key w[ 4, 7] + @{[vaesdf_vs $V24, $V1]} # with round key w[ 0, 3] +___ + + return $code; +} + +# aes-192 encryption with round keys v1-v13 +sub aes_192_encrypt { + my $code=<<___; + @{[vaesz_vs $V24, $V1]} # with round key w[ 0, 3] + @{[vaesem_vs $V24, $V2]} # with round key w[ 4, 7] + @{[vaesem_vs $V24, $V3]} # with round key w[ 8,11] + @{[vaesem_vs $V24, $V4]} # with round key w[12,15] + @{[vaesem_vs $V24, $V5]} # with round key w[16,19] + @{[vaesem_vs $V24, $V6]} # with round key w[20,23] + @{[vaesem_vs $V24, $V7]} # with round key w[24,27] + 
@{[vaesem_vs $V24, $V8]} # with round key w[28,31] + @{[vaesem_vs $V24, $V9]} # with round key w[32,35] + @{[vaesem_vs $V24, $V10]} # with round key w[36,39] + @{[vaesem_vs $V24, $V11]} # with round key w[40,43] + @{[vaesem_vs $V24, $V12]} # with round key w[44,47] + @{[vaesef_vs $V24, $V13]} # with round key w[48,51] +___ + + return $code; +} + +# aes-192 decryption with round keys v1-v13 +sub aes_192_decrypt { + my $code=<<___; + @{[vaesz_vs $V24, $V13]} # with round key w[48,51] + @{[vaesdm_vs $V24, $V12]} # with round key w[44,47] + @{[vaesdm_vs $V24, $V11]} # with round key w[40,43] + @{[vaesdm_vs $V24, $V10]} # with round key w[36,39] + @{[vaesdm_vs $V24, $V9]} # with round key w[32,35] + @{[vaesdm_vs $V24, $V8]} # with round key w[28,31] + @{[vaesdm_vs $V24, $V7]} # with round key w[24,27] + @{[vaesdm_vs $V24, $V6]} # with round key w[20,23] + @{[vaesdm_vs $V24, $V5]} # with round key w[16,19] + @{[vaesdm_vs $V24, $V4]} # with round key w[12,15] + @{[vaesdm_vs $V24, $V3]} # with round key w[ 8,11] + @{[vaesdm_vs $V24, $V2]} # with round key w[ 4, 7] + @{[vaesdf_vs $V24, $V1]} # with round key w[ 0, 3] +___ + + return $code; +} + +# aes-256 encryption with round keys v1-v15 +sub aes_256_encrypt { + my $code=<<___; + @{[vaesz_vs $V24, $V1]} # with round key w[ 0, 3] + @{[vaesem_vs $V24, $V2]} # with round key w[ 4, 7] + @{[vaesem_vs $V24, $V3]} # with round key w[ 8,11] + @{[vaesem_vs $V24, $V4]} # with round key w[12,15] + @{[vaesem_vs $V24, $V5]} # with round key w[16,19] + @{[vaesem_vs $V24, $V6]} # with round key w[20,23] + @{[vaesem_vs $V24, $V7]} # with round key w[24,27] + @{[vaesem_vs $V24, $V8]} # with round key w[28,31] + @{[vaesem_vs $V24, $V9]} # with round key w[32,35] + @{[vaesem_vs $V24, $V10]} # with round key w[36,39] + @{[vaesem_vs $V24, $V11]} # with round key w[40,43] + @{[vaesem_vs $V24, $V12]} # with round key w[44,47] + @{[vaesem_vs $V24, $V13]} # with round key w[48,51] + @{[vaesem_vs $V24, $V14]} # with round key w[52,55] + @{[vaesef_vs $V24, $V15]} # with round key w[56,59] +___ + + return $code; +} + +# aes-256 decryption with round keys v1-v15 +sub aes_256_decrypt { + my $code=<<___; + @{[vaesz_vs $V24, $V15]} # with round key w[56,59] + @{[vaesdm_vs $V24, $V14]} # with round key w[52,55] + @{[vaesdm_vs $V24, $V13]} # with round key w[48,51] + @{[vaesdm_vs $V24, $V12]} # with round key w[44,47] + @{[vaesdm_vs $V24, $V11]} # with round key w[40,43] + @{[vaesdm_vs $V24, $V10]} # with round key w[36,39] + @{[vaesdm_vs $V24, $V9]} # with round key w[32,35] + @{[vaesdm_vs $V24, $V8]} # with round key w[28,31] + @{[vaesdm_vs $V24, $V7]} # with round key w[24,27] + @{[vaesdm_vs $V24, $V6]} # with round key w[20,23] + @{[vaesdm_vs $V24, $V5]} # with round key w[16,19] + @{[vaesdm_vs $V24, $V4]} # with round key w[12,15] + @{[vaesdm_vs $V24, $V3]} # with round key w[ 8,11] + @{[vaesdm_vs $V24, $V2]} # with round key w[ 4, 7] + @{[vaesdf_vs $V24, $V1]} # with round key w[ 0, 3] +___ + + return $code; +} + +{ +############################################################################### +# void rv64i_zvkned_cbc_encrypt(const unsigned char *in, unsigned char *out, +# size_t length, const AES_KEY *key, +# unsigned char *ivec, const int enc); +my ($INP, $OUTP, $LEN, $KEYP, $IVP, $ENC) = ("a0", "a1", "a2", "a3", "a4", "a5"); +my ($T0, $T1) = ("t0", "t1", "t2"); + +$code .= <<___; +.p2align 3 +.globl rv64i_zvkned_cbc_encrypt +.type rv64i_zvkned_cbc_encrypt,\@function +rv64i_zvkned_cbc_encrypt: + # check whether the length is a multiple of 16 and >= 16 + li $T1, 16 + blt 
$LEN, $T1, L_end + andi $T1, $LEN, 15 + bnez $T1, L_end + + # Load key length. + lwu $T0, 480($KEYP) + + # Get proper routine for key length. + li $T1, 16 + beq $T1, $T0, L_cbc_enc_128 + + li $T1, 24 + beq $T1, $T0, L_cbc_enc_192 + + li $T1, 32 + beq $T1, $T0, L_cbc_enc_256 + + ret +.size rv64i_zvkned_cbc_encrypt,.-rv64i_zvkned_cbc_encrypt +___ + +$code .= <<___; +.p2align 3 +L_cbc_enc_128: + # Load all 11 round keys to v1-v11 registers. + @{[aes_128_load_key $KEYP]} + + # Load IV. + @{[vle32_v $V16, $IVP]} + + @{[vle32_v $V24, $INP]} + @{[vxor_vv $V24, $V24, $V16]} + j 2f + +1: + @{[vle32_v $V17, $INP]} + @{[vxor_vv $V24, $V24, $V17]} + +2: + # AES body + @{[aes_128_encrypt]} + + @{[vse32_v $V24, $OUTP]} + + addi $INP, $INP, 16 + addi $OUTP, $OUTP, 16 + addi $LEN, $LEN, -16 + + bnez $LEN, 1b + + @{[vse32_v $V24, $IVP]} + + ret +.size L_cbc_enc_128,.-L_cbc_enc_128 +___ + +$code .= <<___; +.p2align 3 +L_cbc_enc_192: + # Load all 13 round keys to v1-v13 registers. + @{[aes_192_load_key $KEYP]} + + # Load IV. + @{[vle32_v $V16, $IVP]} + + @{[vle32_v $V24, $INP]} + @{[vxor_vv $V24, $V24, $V16]} + j 2f + +1: + @{[vle32_v $V17, $INP]} + @{[vxor_vv $V24, $V24, $V17]} + +2: + # AES body + @{[aes_192_encrypt]} + + @{[vse32_v $V24, $OUTP]} + + addi $INP, $INP, 16 + addi $OUTP, $OUTP, 16 + addi $LEN, $LEN, -16 + + bnez $LEN, 1b + + @{[vse32_v $V24, $IVP]} + + ret +.size L_cbc_enc_192,.-L_cbc_enc_192 +___ + +$code .= <<___; +.p2align 3 +L_cbc_enc_256: + # Load all 15 round keys to v1-v15 registers. + @{[aes_256_load_key $KEYP]} + + # Load IV. + @{[vle32_v $V16, $IVP]} + + @{[vle32_v $V24, $INP]} + @{[vxor_vv $V24, $V24, $V16]} + j 2f + +1: + @{[vle32_v $V17, $INP]} + @{[vxor_vv $V24, $V24, $V17]} + +2: + # AES body + @{[aes_256_encrypt]} + + @{[vse32_v $V24, $OUTP]} + + addi $INP, $INP, 16 + addi $OUTP, $OUTP, 16 + addi $LEN, $LEN, -16 + + bnez $LEN, 1b + + @{[vse32_v $V24, $IVP]} + + ret +.size L_cbc_enc_256,.-L_cbc_enc_256 +___ + +############################################################################### +# void rv64i_zvkned_cbc_decrypt(const unsigned char *in, unsigned char *out, +# size_t length, const AES_KEY *key, +# unsigned char *ivec, const int enc); +$code .= <<___; +.p2align 3 +.globl rv64i_zvkned_cbc_decrypt +.type rv64i_zvkned_cbc_decrypt,\@function +rv64i_zvkned_cbc_decrypt: + # check whether the length is a multiple of 16 and >= 16 + li $T1, 16 + blt $LEN, $T1, L_end + andi $T1, $LEN, 15 + bnez $T1, L_end + + # Load key length. + lwu $T0, 480($KEYP) + + # Get proper routine for key length. + li $T1, 16 + beq $T1, $T0, L_cbc_dec_128 + + li $T1, 24 + beq $T1, $T0, L_cbc_dec_192 + + li $T1, 32 + beq $T1, $T0, L_cbc_dec_256 + + ret +.size rv64i_zvkned_cbc_decrypt,.-rv64i_zvkned_cbc_decrypt +___ + +$code .= <<___; +.p2align 3 +L_cbc_dec_128: + # Load all 11 round keys to v1-v11 registers. + @{[aes_128_load_key $KEYP]} + + # Load IV. + @{[vle32_v $V16, $IVP]} + + @{[vle32_v $V24, $INP]} + @{[vmv_v_v $V17, $V24]} + j 2f + +1: + @{[vle32_v $V24, $INP]} + @{[vmv_v_v $V17, $V24]} + addi $OUTP, $OUTP, 16 + +2: + # AES body + @{[aes_128_decrypt]} + + @{[vxor_vv $V24, $V24, $V16]} + @{[vse32_v $V24, $OUTP]} + @{[vmv_v_v $V16, $V17]} + + addi $LEN, $LEN, -16 + addi $INP, $INP, 16 + + bnez $LEN, 1b + + @{[vse32_v $V16, $IVP]} + + ret +.size L_cbc_dec_128,.-L_cbc_dec_128 +___ + +$code .= <<___; +.p2align 3 +L_cbc_dec_192: + # Load all 13 round keys to v1-v13 registers. + @{[aes_192_load_key $KEYP]} + + # Load IV. 
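+ # CBC decryption: P[i] = Dec(K, C[i]) ^ C[i-1] with C[-1] = IV. Each
+ # ciphertext block is also copied to v17 so it can serve as the chaining
+ # value for the next block after v24 has been overwritten.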
+ @{[vle32_v $V16, $IVP]} + + @{[vle32_v $V24, $INP]} + @{[vmv_v_v $V17, $V24]} + j 2f + +1: + @{[vle32_v $V24, $INP]} + @{[vmv_v_v $V17, $V24]} + addi $OUTP, $OUTP, 16 + +2: + # AES body + @{[aes_192_decrypt]} + + @{[vxor_vv $V24, $V24, $V16]} + @{[vse32_v $V24, $OUTP]} + @{[vmv_v_v $V16, $V17]} + + addi $LEN, $LEN, -16 + addi $INP, $INP, 16 + + bnez $LEN, 1b + + @{[vse32_v $V16, $IVP]} + + ret +.size L_cbc_dec_192,.-L_cbc_dec_192 +___ + +$code .= <<___; +.p2align 3 +L_cbc_dec_256: + # Load all 15 round keys to v1-v15 registers. + @{[aes_256_load_key $KEYP]} + + # Load IV. + @{[vle32_v $V16, $IVP]} + + @{[vle32_v $V24, $INP]} + @{[vmv_v_v $V17, $V24]} + j 2f + +1: + @{[vle32_v $V24, $INP]} + @{[vmv_v_v $V17, $V24]} + addi $OUTP, $OUTP, 16 + +2: + # AES body + @{[aes_256_decrypt]} + + @{[vxor_vv $V24, $V24, $V16]} + @{[vse32_v $V24, $OUTP]} + @{[vmv_v_v $V16, $V17]} + + addi $LEN, $LEN, -16 + addi $INP, $INP, 16 + + bnez $LEN, 1b + + @{[vse32_v $V16, $IVP]} + + ret +.size L_cbc_dec_256,.-L_cbc_dec_256 +___ +} + +{ +############################################################################### +# void rv64i_zvkned_ecb_encrypt(const unsigned char *in, unsigned char *out, +# size_t length, const AES_KEY *key, +# const int enc); +my ($INP, $OUTP, $LEN, $KEYP, $ENC) = ("a0", "a1", "a2", "a3", "a4"); +my ($VL) = ("a5"); +my ($LEN32) = ("a6"); +my ($T0, $T1) = ("t0", "t1"); + +$code .= <<___; +.p2align 3 +.globl rv64i_zvkned_ecb_encrypt +.type rv64i_zvkned_ecb_encrypt,\@function +rv64i_zvkned_ecb_encrypt: + # Make the LEN become e32 length. + srli $LEN32, $LEN, 2 + + # Load key length. + lwu $T0, 480($KEYP) + + # Get proper routine for key length. + li $T1, 16 + beq $T1, $T0, L_ecb_enc_128 + + li $T1, 24 + beq $T1, $T0, L_ecb_enc_192 + + li $T1, 32 + beq $T1, $T0, L_ecb_enc_256 + + ret +.size rv64i_zvkned_ecb_encrypt,.-rv64i_zvkned_ecb_encrypt +___ + +$code .= <<___; +.p2align 3 +L_ecb_enc_128: + # Load all 11 round keys to v1-v11 registers. + @{[aes_128_load_key $KEYP]} + +1: + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + + @{[vle32_v $V24, $INP]} + + # AES body + @{[aes_128_encrypt]} + + @{[vse32_v $V24, $OUTP]} + + add $INP, $INP, $T0 + add $OUTP, $OUTP, $T0 + + bnez $LEN32, 1b + + ret +.size L_ecb_enc_128,.-L_ecb_enc_128 +___ + +$code .= <<___; +.p2align 3 +L_ecb_enc_192: + # Load all 13 round keys to v1-v13 registers. + @{[aes_192_load_key $KEYP]} + +1: + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + + @{[vle32_v $V24, $INP]} + + # AES body + @{[aes_192_encrypt]} + + @{[vse32_v $V24, $OUTP]} + + add $INP, $INP, $T0 + add $OUTP, $OUTP, $T0 + + bnez $LEN32, 1b + + ret +.size L_ecb_enc_192,.-L_ecb_enc_192 +___ + +$code .= <<___; +.p2align 3 +L_ecb_enc_256: + # Load all 15 round keys to v1-v15 registers. 
+ @{[aes_256_load_key $KEYP]} + +1: + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + + @{[vle32_v $V24, $INP]} + + # AES body + @{[aes_256_encrypt]} + + @{[vse32_v $V24, $OUTP]} + + add $INP, $INP, $T0 + add $OUTP, $OUTP, $T0 + + bnez $LEN32, 1b + + ret +.size L_ecb_enc_256,.-L_ecb_enc_256 +___ + +############################################################################### +# void rv64i_zvkned_ecb_decrypt(const unsigned char *in, unsigned char *out, +# size_t length, const AES_KEY *key, +# const int enc); +$code .= <<___; +.p2align 3 +.globl rv64i_zvkned_ecb_decrypt +.type rv64i_zvkned_ecb_decrypt,\@function +rv64i_zvkned_ecb_decrypt: + # Make the LEN become e32 length. + srli $LEN32, $LEN, 2 + + # Load key length. + lwu $T0, 480($KEYP) + + # Get proper routine for key length. + li $T1, 16 + beq $T1, $T0, L_ecb_dec_128 + + li $T1, 24 + beq $T1, $T0, L_ecb_dec_192 + + li $T1, 32 + beq $T1, $T0, L_ecb_dec_256 + + ret +.size rv64i_zvkned_ecb_decrypt,.-rv64i_zvkned_ecb_decrypt +___ + +$code .= <<___; +.p2align 3 +L_ecb_dec_128: + # Load all 11 round keys to v1-v11 registers. + @{[aes_128_load_key $KEYP]} + +1: + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + + @{[vle32_v $V24, $INP]} + + # AES body + @{[aes_128_decrypt]} + + @{[vse32_v $V24, $OUTP]} + + add $INP, $INP, $T0 + add $OUTP, $OUTP, $T0 + + bnez $LEN32, 1b + + ret +.size L_ecb_dec_128,.-L_ecb_dec_128 +___ + +$code .= <<___; +.p2align 3 +L_ecb_dec_192: + # Load all 13 round keys to v1-v13 registers. + @{[aes_192_load_key $KEYP]} + +1: + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + + @{[vle32_v $V24, $INP]} + + # AES body + @{[aes_192_decrypt]} + + @{[vse32_v $V24, $OUTP]} + + add $INP, $INP, $T0 + add $OUTP, $OUTP, $T0 + + bnez $LEN32, 1b + + ret +.size L_ecb_dec_192,.-L_ecb_dec_192 +___ + +$code .= <<___; +.p2align 3 +L_ecb_dec_256: + # Load all 15 round keys to v1-v15 registers. 
+ @{[aes_256_load_key $KEYP]} + +1: + @{[vsetvli $VL, $LEN32, "e32", "m4", "ta", "ma"]} + slli $T0, $VL, 2 + sub $LEN32, $LEN32, $VL + + @{[vle32_v $V24, $INP]} + + # AES body + @{[aes_256_decrypt]} + + @{[vse32_v $V24, $OUTP]} + + add $INP, $INP, $T0 + add $OUTP, $OUTP, $T0 + + bnez $LEN32, 1b + + ret +.size L_ecb_dec_256,.-L_ecb_dec_256 +___ +} + { ################################################################################ # int rv64i_zvkned_set_encrypt_key(const unsigned char *userKey, const int bytes, From patchwork Mon Nov 27 07:06:58 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13469230 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 4C658C0755A for ; Mon, 27 Nov 2023 07:07:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=QjDEKbrhHnPhv/E91t4iEAs4glPUEUhsxPfhuPj+L1w=; b=omcDyL24Gx8bhz K2XOo9h2DwlxV/p5DmwxDzcheInE+IT6VpIyh5BWqP1qrX6pQcpqJ5v6Yhmq3rN6DYiSlUC4izC00 w6ov19MFDs7VZbUZmA23mG8HaAEh/+B+XNjwxwjrr0wRhDLj0U40UiVZ8JYs/V9dvW/zM0UmZZg1h 74iwgf0mpSv1eJy0XskBjxuAO7MHBDuduBTr352Lk8rFh5tR/Hw5IA6m+Gztbu2hhFsn//ue7axBJ 1msP49h8qLWXuJ6JJZgf+I5wi4t15AqFzA/0qIbIliAjgXPTQIo0GCK+Er/CvT+0ox19T+iNCXC5x aZ5evF8kMeFf9EwGsF+Q==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1r7Vic-001ez1-0V; Mon, 27 Nov 2023 07:07:46 +0000 Received: from mail-pl1-x631.google.com ([2607:f8b0:4864:20::631]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1r7ViV-001ety-2h for linux-riscv@lists.infradead.org; Mon, 27 Nov 2023 07:07:42 +0000 Received: by mail-pl1-x631.google.com with SMTP id d9443c01a7336-1cfb2176150so11004785ad.3 for ; Sun, 26 Nov 2023 23:07:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1701068857; x=1701673657; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=7M9E14E6fLJh7AJb55zP++mgBfFiPcFBw2pNzr7Sm0I=; b=f7B9By9mNOSiBx/CFgVDCotbY2R3Z+5D9RUlVOoztyOIQ06e+MjaYhygyDCtfIGtxY ePrxpEDbJmq5NjihrGLsmKB/k7Rpr1xElERUOA3KKVR1sU1F9UlFCflLeAhb5mu762zi vleb6RLSoAxSjD7M0OxqZF7Zp0uiQUl14l10LjAnzgJnZoq6lib/heQhDWnD2J/pcnJw nslPYG3ua+2mVPxXu/LUTejp3DYVGJppI/qNJJc59ds4zTEvnKSbNT+YEVdexp842pnG SB3Fm1myR4gGmXQroH3yMVNqw85tC29WgqeAebPf1luTN+Z90qzT81BQyNiPyCiZjzON 2umw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701068857; x=1701673657; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=7M9E14E6fLJh7AJb55zP++mgBfFiPcFBw2pNzr7Sm0I=; 
b=D5rM4nOAwjyh27l4sStQLM6byAkQfPCYPogDswiYmWRFmEfP73xC3iMiqBL43ha3N7 WFH9cIZmWr1Ya5gINPEMqDX1MSfuOIdOj60CndUAJU9+gWHld+SyPniOBa0Iy84zXdFL g3pprpujeQZz4w7ynax2fk+LIHjhjkTyP6vYsYB9K0zvKPhqBzxXo6tiPvxYEZkN181M TL6cAL+RV8SiMjdHoxmoXk0FrT+LxtyW2mqUCgNM9MrSj9dtfzkopg7SD1rkNxPbOmGS JoKSpn6yNiI0/HpdN3GCdk8BeJ5d0wc3JxJk116lUcetIO4LcIb+k/MkCCwuJfUcxq61 sftA== X-Gm-Message-State: AOJu0Yx7H+Dr4MrQPL3WeyyusvEMw7WddxhhGpPSaG79t6Co+5xSWnm1 WBcnxmjxTLsDgcKkI671oTqfkA== X-Google-Smtp-Source: AGHT+IHVda7BxvkREXau4SuF+gpfjUKcDNap6z/qKUI6eGy91unUWy63cKgknIxJ/anhs1Q3NGprvg== X-Received: by 2002:a17:902:b488:b0:1cc:5aef:f2c1 with SMTP id y8-20020a170902b48800b001cc5aeff2c1mr9325175plr.33.1701068857649; Sun, 26 Nov 2023 23:07:37 -0800 (PST) Received: from localhost.localdomain ([101.10.45.230]) by smtp.gmail.com with ESMTPSA id jh15-20020a170903328f00b001cfcd3a764esm1340134plb.77.2023.11.26.23.07.34 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sun, 26 Nov 2023 23:07:37 -0800 (PST) From: Jerry Shih To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v2 08/13] RISC-V: crypto: add Zvkg accelerated GCM GHASH implementation Date: Mon, 27 Nov 2023 15:06:58 +0800 Message-Id: <20231127070703.1697-9-jerry.shih@sifive.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20231127070703.1697-1-jerry.shih@sifive.com> References: <20231127070703.1697-1-jerry.shih@sifive.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231126_230739_944440_CA8697F0 X-CRM114-Status: GOOD ( 31.13 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org Add a gcm hash implementation using the Zvkg extension from OpenSSL (openssl/openssl#21923). The perlasm here is different from the original implementation in OpenSSL. The OpenSSL assumes that the H is stored in little-endian. Thus, it needs to convert the H to big-endian for Zvkg instructions. In kernel, we have the big-endian H directly. There is no need for endian conversion. Co-developed-by: Christoph Müllner Signed-off-by: Christoph Müllner Co-developed-by: Heiko Stuebner Signed-off-by: Heiko Stuebner Signed-off-by: Jerry Shih --- Changelog v2: - Do not turn on kconfig `GHASH_RISCV64` option by default. - Add `asmlinkage` qualifier for crypto asm function. - Update the ghash fallback path in ghash_blocks(). - Rename structure riscv64_ghash_context to riscv64_ghash_tfm_ctx. - Fold ghash_update_zvkg() and ghash_final_zvkg(). - Reorder structure riscv64_ghash_alg_zvkg members initialization in the order declared. 
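For reference, the transform gcm_ghash_rv64i_zvkg() performs is the plain
GHASH block function, Xi = (Xi xor block) * H in GF(2^128). A scalar sketch
built from the same kernel helpers the fallback path below uses (the header
locations are my assumption) looks like this:

	#include <crypto/algapi.h>	/* crypto_xor(); assumed header */
	#include <crypto/gf128mul.h>	/* gf128mul_lle(), be128 */

	/* Illustrative only: one GHASH pass over len bytes, len % 16 == 0. */
	static void ghash_scalar_sketch(be128 *Xi, const be128 *H,
					const u8 *inp, size_t len)
	{
		while (len) {
			crypto_xor((u8 *)Xi, inp, 16);
			gf128mul_lle(Xi, H);
			inp += 16;
			len -= 16;
		}
	}

The routine below performs the same update with vghsh.vv, one 16-byte block
per loop iteration.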
--- arch/riscv/crypto/Kconfig | 10 ++ arch/riscv/crypto/Makefile | 7 + arch/riscv/crypto/ghash-riscv64-glue.c | 175 ++++++++++++++++++++++++ arch/riscv/crypto/ghash-riscv64-zvkg.pl | 100 ++++++++++++++ 4 files changed, 292 insertions(+) create mode 100644 arch/riscv/crypto/ghash-riscv64-glue.c create mode 100644 arch/riscv/crypto/ghash-riscv64-zvkg.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index 9d991ddda289..6863f01a2ab0 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -34,4 +34,14 @@ config CRYPTO_AES_BLOCK_RISCV64 - Zvkb vector crypto extension (CTR/XTS) - Zvkg vector crypto extension (XTS) +config CRYPTO_GHASH_RISCV64 + tristate "Hash functions: GHASH" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_GCM + help + GCM GHASH function (NIST SP 800-38D) + + Architecture: riscv64 using: + - Zvkg vector crypto extension + endmenu diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index 9574b009762f..94a7f8eaa8a7 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -9,6 +9,9 @@ aes-riscv64-y := aes-riscv64-glue.o aes-riscv64-zvkned.o obj-$(CONFIG_CRYPTO_AES_BLOCK_RISCV64) += aes-block-riscv64.o aes-block-riscv64-y := aes-riscv64-block-mode-glue.o aes-riscv64-zvkned-zvbb-zvkg.o aes-riscv64-zvkned-zvkb.o +obj-$(CONFIG_CRYPTO_GHASH_RISCV64) += ghash-riscv64.o +ghash-riscv64-y := ghash-riscv64-glue.o ghash-riscv64-zvkg.o + quiet_cmd_perlasm = PERLASM $@ cmd_perlasm = $(PERL) $(<) void $(@) @@ -21,6 +24,10 @@ $(obj)/aes-riscv64-zvkned-zvbb-zvkg.S: $(src)/aes-riscv64-zvkned-zvbb-zvkg.pl $(obj)/aes-riscv64-zvkned-zvkb.S: $(src)/aes-riscv64-zvkned-zvkb.pl $(call cmd,perlasm) +$(obj)/ghash-riscv64-zvkg.S: $(src)/ghash-riscv64-zvkg.pl + $(call cmd,perlasm) + clean-files += aes-riscv64-zvkned.S clean-files += aes-riscv64-zvkned-zvbb-zvkg.S clean-files += aes-riscv64-zvkned-zvkb.S +clean-files += ghash-riscv64-zvkg.S diff --git a/arch/riscv/crypto/ghash-riscv64-glue.c b/arch/riscv/crypto/ghash-riscv64-glue.c new file mode 100644 index 000000000000..b01ab5714677 --- /dev/null +++ b/arch/riscv/crypto/ghash-riscv64-glue.c @@ -0,0 +1,175 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * RISC-V optimized GHASH routines + * + * Copyright (C) 2023 VRULL GmbH + * Author: Heiko Stuebner + * + * Copyright (C) 2023 SiFive, Inc. + * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* ghash using zvkg vector crypto extension */ +asmlinkage void gcm_ghash_rv64i_zvkg(be128 *Xi, const be128 *H, const u8 *inp, + size_t len); + +struct riscv64_ghash_tfm_ctx { + be128 key; +}; + +struct riscv64_ghash_desc_ctx { + be128 shash; + u8 buffer[GHASH_BLOCK_SIZE]; + u32 bytes; +}; + +static inline void ghash_blocks(const struct riscv64_ghash_tfm_ctx *tctx, + struct riscv64_ghash_desc_ctx *dctx, + const u8 *src, size_t srclen) +{ + /* The srclen is nonzero and a multiple of 16. 
*/ + if (crypto_simd_usable()) { + kernel_vector_begin(); + gcm_ghash_rv64i_zvkg(&dctx->shash, &tctx->key, src, srclen); + kernel_vector_end(); + } else { + do { + crypto_xor((u8 *)&dctx->shash, src, GHASH_BLOCK_SIZE); + gf128mul_lle(&dctx->shash, &tctx->key); + srclen -= GHASH_BLOCK_SIZE; + src += GHASH_BLOCK_SIZE; + } while (srclen); + } +} + +static int ghash_init(struct shash_desc *desc) +{ + struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc); + + *dctx = (struct riscv64_ghash_desc_ctx){}; + + return 0; +} + +static int ghash_update_zvkg(struct shash_desc *desc, const u8 *src, + unsigned int srclen) +{ + size_t len; + const struct riscv64_ghash_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); + struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc); + + if (dctx->bytes) { + if (dctx->bytes + srclen < GHASH_BLOCK_SIZE) { + memcpy(dctx->buffer + dctx->bytes, src, srclen); + dctx->bytes += srclen; + return 0; + } + memcpy(dctx->buffer + dctx->bytes, src, + GHASH_BLOCK_SIZE - dctx->bytes); + + ghash_blocks(tctx, dctx, dctx->buffer, GHASH_BLOCK_SIZE); + + src += GHASH_BLOCK_SIZE - dctx->bytes; + srclen -= GHASH_BLOCK_SIZE - dctx->bytes; + dctx->bytes = 0; + } + len = srclen & ~(GHASH_BLOCK_SIZE - 1); + + if (len) { + ghash_blocks(tctx, dctx, src, len); + src += len; + srclen -= len; + } + + if (srclen) { + memcpy(dctx->buffer, src, srclen); + dctx->bytes = srclen; + } + + return 0; +} + +static int ghash_final_zvkg(struct shash_desc *desc, u8 *out) +{ + const struct riscv64_ghash_tfm_ctx *tctx = crypto_shash_ctx(desc->tfm); + struct riscv64_ghash_desc_ctx *dctx = shash_desc_ctx(desc); + int i; + + if (dctx->bytes) { + for (i = dctx->bytes; i < GHASH_BLOCK_SIZE; i++) + dctx->buffer[i] = 0; + + ghash_blocks(tctx, dctx, dctx->buffer, GHASH_BLOCK_SIZE); + } + + memcpy(out, &dctx->shash, GHASH_DIGEST_SIZE); + + return 0; +} + +static int ghash_setkey(struct crypto_shash *tfm, const u8 *key, + unsigned int keylen) +{ + struct riscv64_ghash_tfm_ctx *tctx = crypto_shash_ctx(tfm); + + if (keylen != GHASH_BLOCK_SIZE) + return -EINVAL; + + memcpy(&tctx->key, key, GHASH_BLOCK_SIZE); + + return 0; +} + +static struct shash_alg riscv64_ghash_alg_zvkg = { + .init = ghash_init, + .update = ghash_update_zvkg, + .final = ghash_final_zvkg, + .setkey = ghash_setkey, + .descsize = sizeof(struct riscv64_ghash_desc_ctx), + .digestsize = GHASH_DIGEST_SIZE, + .base = { + .cra_blocksize = GHASH_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct riscv64_ghash_tfm_ctx), + .cra_priority = 303, + .cra_name = "ghash", + .cra_driver_name = "ghash-riscv64-zvkg", + .cra_module = THIS_MODULE, + }, +}; + +static inline bool check_ghash_ext(void) +{ + return riscv_isa_extension_available(NULL, ZVKG) && + riscv_vector_vlen() >= 128; +} + +static int __init riscv64_ghash_mod_init(void) +{ + if (check_ghash_ext()) + return crypto_register_shash(&riscv64_ghash_alg_zvkg); + + return -ENODEV; +} + +static void __exit riscv64_ghash_mod_fini(void) +{ + crypto_unregister_shash(&riscv64_ghash_alg_zvkg); +} + +module_init(riscv64_ghash_mod_init); +module_exit(riscv64_ghash_mod_fini); + +MODULE_DESCRIPTION("GCM GHASH (RISC-V accelerated)"); +MODULE_AUTHOR("Heiko Stuebner "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("ghash"); diff --git a/arch/riscv/crypto/ghash-riscv64-zvkg.pl b/arch/riscv/crypto/ghash-riscv64-zvkg.pl new file mode 100644 index 000000000000..4beea4ac9cbe --- /dev/null +++ b/arch/riscv/crypto/ghash-riscv64-zvkg.pl @@ -0,0 +1,100 @@ +#! 
/usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Christoph Müllner +# Copyright (c) 2023, Jerry Shih +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 +# - RISC-V Vector GCM/GMAC extension ('Zvkg') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? 
shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +.text +___ + +############################################################################### +# void gcm_ghash_rv64i_zvkg(be128 *Xi, const be128 *H, const u8 *inp, size_t len) +# +# input: Xi: current hash value +# H: hash key +# inp: pointer to input data +# len: length of input data in bytes (multiple of block size) +# output: Xi: Xi+1 (next hash value Xi) +{ +my ($Xi,$H,$inp,$len) = ("a0","a1","a2","a3"); +my ($vXi,$vH,$vinp,$Vzero) = ("v1","v2","v3","v4"); + +$code .= <<___; +.p2align 3 +.globl gcm_ghash_rv64i_zvkg +.type gcm_ghash_rv64i_zvkg,\@function +gcm_ghash_rv64i_zvkg: + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vle32_v $vH, $H]} + @{[vle32_v $vXi, $Xi]} + +Lstep: + @{[vle32_v $vinp, $inp]} + add $inp, $inp, 16 + add $len, $len, -16 + @{[vghsh_vv $vXi, $vH, $vinp]} + bnez $len, Lstep + + @{[vse32_v $vXi, $Xi]} + ret + +.size gcm_ghash_rv64i_zvkg,.-gcm_ghash_rv64i_zvkg +___ +} + +print $code; + +close STDOUT or die "error closing STDOUT: $!"; From patchwork Mon Nov 27 07:06:59 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13469232 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 76A06C07D5A for ; Mon, 27 Nov 2023 07:07:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=KH6gia3JJ9TU8fkhQyQ0NBAB+ypZKaM9UU0C/wIOCwQ=; b=e6NSToM6xElJDK T9DpeAOHMkZD1mkua1GRbQxjmY1lumESnqVBk+loi0IsIzRsw/L4qfFPM231TK7TLwPeBajr8+YDj Po3scsrjsNnSRompYEwbQKPnPMOoVx9knm+m8+1r1qNvqqQmQTdMRGjT8BqleUmluD0oVBfR+xhqj B26eV95nXwHgdJJXr57tweptDZtXPV1lbT3SBzqyeOrdEleYUYQk9c6Iz+n1iOBhgOkwOGecpsgWw PHO05s7ELg2/RD0CDzfAdzQBAjXY8V0/MEMfav1VL+LtjZs4ybLe74zDpruw1GDrK2o1VkAlP/bA5 PUWNqXIurBzOwFRKorsQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1r7Vid-001ezu-0d; Mon, 27 Nov 2023 07:07:47 +0000 Received: from mail-pl1-x632.google.com ([2607:f8b0:4864:20::632]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1r7ViX-001evj-0h for linux-riscv@lists.infradead.org; Mon, 27 Nov 2023 07:07:44 +0000 Received: by mail-pl1-x632.google.com with SMTP id d9443c01a7336-1ce28faa92dso28038985ad.2 for ; Sun, 26 Nov 2023 23:07:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1701068860; x=1701673660; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=H2ie9UBY5BJCsnebwLbVxGGtRlv0a5BfHvQORsnSkzU=; b=nR4qqurgpbHn8DT1DFRVmFIHonrgvgsAuL53M3h+mtrhQv9M8JbvFUl1o0PgZELDe4 MhViTishvl9yC90MIu19JZb6MNkCd4TGcLuRB6mrKdk0TCwadgQIDq/3WaZfGQDFwtls 
1QOCD6bPwlOVbw9THE0olXS/Ibq0gKFx6ysrnLBdYlzVtn0xZSj8/zQdliACl3TT18/Q j4mmVpZtegw4q91nPVuUbEEZTzqBYSuNuAOH+6ifHm6anMoIzGtKlD1H5F6PHujr4gSk mPM3LbSZwRh7/NgdCezFGBNlqM/VWMttpx+yyzbUww+TqA8vmnXM3H28Fxcu4TfuZD4V ZP/Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701068860; x=1701673660; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=H2ie9UBY5BJCsnebwLbVxGGtRlv0a5BfHvQORsnSkzU=; b=qp/VXSYBRXvROOiyhBUWHx/aEM19G1V8mI/FYU+EMcBd4ZKQn3YQ7pczlw1B196rBU Gc8Bjfb3k9TdkTfHyebaZ4RL2Ehhca0mP/0DDp8Ots/noOSc+VPN/vP/4juqQTM7aq/0 HwHAJg9ZL5LCqtL0NHpcz5ulcdRoXncOkD2w7PNyDLzVw9UVonfpPg98x+pjmsgooDFH t89qROG8NTM8mYqkiSEd2uIINllroH51gMQ7pnWbpYFiLyjB0Y+RWUPMtPCxNZmkymyx ZGAM6TkTfdskHTiCbjoU0vBm4fqilDHpuERyRcJZPWZ5rDbHM2BdPNg/u3JwMvISdxt6 Hyqw== X-Gm-Message-State: AOJu0YyEuWTjtXdHmhfaYXrmbz7pCVOEBGd1hWo8TqQikPMt7nJplMsF Z29lY96uLqayNiBL0CxuLaPnyQ== X-Google-Smtp-Source: AGHT+IE1JBBVZSJGyCAbSJ13cr1H5pF+3UXdDbWfyBIJioNv3kdxLCkMNdaACGujT6Ku8o5fRzSGSQ== X-Received: by 2002:a17:902:d38d:b0:1cf:a4e8:d2a1 with SMTP id e13-20020a170902d38d00b001cfa4e8d2a1mr8132283pld.42.1701068860527; Sun, 26 Nov 2023 23:07:40 -0800 (PST) Received: from localhost.localdomain ([101.10.45.230]) by smtp.gmail.com with ESMTPSA id jh15-20020a170903328f00b001cfcd3a764esm1340134plb.77.2023.11.26.23.07.37 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sun, 26 Nov 2023 23:07:40 -0800 (PST) From: Jerry Shih To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v2 09/13] RISC-V: crypto: add Zvknha/b accelerated SHA224/256 implementations Date: Mon, 27 Nov 2023 15:06:59 +0800 Message-Id: <20231127070703.1697-10-jerry.shih@sifive.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20231127070703.1697-1-jerry.shih@sifive.com> References: <20231127070703.1697-1-jerry.shih@sifive.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231126_230741_271658_B044AC73 X-CRM114-Status: GOOD ( 31.61 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org Add SHA224 and 256 implementations using Zvknha or Zvknhb vector crypto extensions from OpenSSL(openssl/openssl#21923). Co-developed-by: Charalampos Mitrodimas Signed-off-by: Charalampos Mitrodimas Co-developed-by: Heiko Stuebner Signed-off-by: Heiko Stuebner Co-developed-by: Phoebe Chen Signed-off-by: Phoebe Chen Signed-off-by: Jerry Shih --- Changelog v2: - Do not turn on kconfig `SHA256_RISCV64` option by default. - Add `asmlinkage` qualifier for crypto asm function. - Rename sha256-riscv64-zvkb-zvknha_or_zvknhb to sha256-riscv64-zvknha_or_zvknhb-zvkb. - Reorder structure sha256_algs members initialization in the order declared. 
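A note for anyone trying the drivers out: callers reach them through the regular shash API rather than by driver name. A minimal sketch of such a caller (illustrative only, not part of this patch; it assumes nothing beyond the standard <crypto/hash.h> and <crypto/sha2.h> helpers):

#include <crypto/hash.h>
#include <crypto/sha2.h>
#include <linux/err.h>

/* Hash `len` bytes at `data` into `digest` using whichever "sha256"
 * implementation the crypto core rates highest.
 */
static int sha256_demo(const u8 *data, unsigned int len,
		       u8 digest[SHA256_DIGEST_SIZE])
{
	struct crypto_shash *tfm;
	int err;

	tfm = crypto_alloc_shash("sha256", 0, 0);
	if (IS_ERR(tfm))
		return PTR_ERR(tfm);

	err = crypto_shash_tfm_digest(tfm, data, len, digest);
	crypto_free_shash(tfm);

	return err;
}

The core resolves "sha256" to the registered implementation with the highest cra_priority, so this driver is picked over the generic C code whenever the Zvknha/Zvknhb and Zvkb checks pass at module init.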
--- arch/riscv/crypto/Kconfig | 11 + arch/riscv/crypto/Makefile | 7 + arch/riscv/crypto/sha256-riscv64-glue.c | 145 ++++++++ .../sha256-riscv64-zvknha_or_zvknhb-zvkb.pl | 318 ++++++++++++++++++ 4 files changed, 481 insertions(+) create mode 100644 arch/riscv/crypto/sha256-riscv64-glue.c create mode 100644 arch/riscv/crypto/sha256-riscv64-zvknha_or_zvknhb-zvkb.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index 6863f01a2ab0..d31af9190717 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -44,4 +44,15 @@ config CRYPTO_GHASH_RISCV64 Architecture: riscv64 using: - Zvkg vector crypto extension +config CRYPTO_SHA256_RISCV64 + tristate "Hash functions: SHA-224 and SHA-256" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_SHA256 + help + SHA-224 and SHA-256 secure hash algorithm (FIPS 180) + + Architecture: riscv64 using: + - Zvknha or Zvknhb vector crypto extensions + - Zvkb vector crypto extension + endmenu diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index 94a7f8eaa8a7..e9d7717ec943 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -12,6 +12,9 @@ aes-block-riscv64-y := aes-riscv64-block-mode-glue.o aes-riscv64-zvkned-zvbb-zvk obj-$(CONFIG_CRYPTO_GHASH_RISCV64) += ghash-riscv64.o ghash-riscv64-y := ghash-riscv64-glue.o ghash-riscv64-zvkg.o +obj-$(CONFIG_CRYPTO_SHA256_RISCV64) += sha256-riscv64.o +sha256-riscv64-y := sha256-riscv64-glue.o sha256-riscv64-zvknha_or_zvknhb-zvkb.o + quiet_cmd_perlasm = PERLASM $@ cmd_perlasm = $(PERL) $(<) void $(@) @@ -27,7 +30,11 @@ $(obj)/aes-riscv64-zvkned-zvkb.S: $(src)/aes-riscv64-zvkned-zvkb.pl $(obj)/ghash-riscv64-zvkg.S: $(src)/ghash-riscv64-zvkg.pl $(call cmd,perlasm) +$(obj)/sha256-riscv64-zvknha_or_zvknhb-zvkb.S: $(src)/sha256-riscv64-zvknha_or_zvknhb-zvkb.pl + $(call cmd,perlasm) + clean-files += aes-riscv64-zvkned.S clean-files += aes-riscv64-zvkned-zvbb-zvkg.S clean-files += aes-riscv64-zvkned-zvkb.S clean-files += ghash-riscv64-zvkg.S +clean-files += sha256-riscv64-zvknha_or_zvknhb-zvkb.S diff --git a/arch/riscv/crypto/sha256-riscv64-glue.c b/arch/riscv/crypto/sha256-riscv64-glue.c new file mode 100644 index 000000000000..760d89031d1c --- /dev/null +++ b/arch/riscv/crypto/sha256-riscv64-glue.c @@ -0,0 +1,145 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Linux/riscv64 port of the OpenSSL SHA256 implementation for RISC-V 64 + * + * Copyright (C) 2022 VRULL GmbH + * Author: Heiko Stuebner + * + * Copyright (C) 2023 SiFive, Inc. + * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * sha256 using zvkb and zvknha/b vector crypto extension + * + * This asm function will just take the first 256-bit as the sha256 state from + * the pointer to `struct sha256_state`. + */ +asmlinkage void +sha256_block_data_order_zvkb_zvknha_or_zvknhb(struct sha256_state *digest, + const u8 *data, int num_blks); + +static int riscv64_sha256_update(struct shash_desc *desc, const u8 *data, + unsigned int len) +{ + int ret = 0; + + /* + * Make sure struct sha256_state begins directly with the SHA256 + * 256-bit internal state, as this is what the asm function expect. 
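+	 * (In <crypto/sha2.h>, struct sha256_state lays out the u32 state[8]
+	 * words first, ahead of the byte count and partial-block buffer; the
+	 * BUILD_BUG_ON below double-checks that offset.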
+ */ + BUILD_BUG_ON(offsetof(struct sha256_state, state) != 0); + + if (crypto_simd_usable()) { + kernel_vector_begin(); + ret = sha256_base_do_update( + desc, data, len, + sha256_block_data_order_zvkb_zvknha_or_zvknhb); + kernel_vector_end(); + } else { + ret = crypto_sha256_update(desc, data, len); + } + + return ret; +} + +static int riscv64_sha256_finup(struct shash_desc *desc, const u8 *data, + unsigned int len, u8 *out) +{ + if (crypto_simd_usable()) { + kernel_vector_begin(); + if (len) + sha256_base_do_update( + desc, data, len, + sha256_block_data_order_zvkb_zvknha_or_zvknhb); + sha256_base_do_finalize( + desc, sha256_block_data_order_zvkb_zvknha_or_zvknhb); + kernel_vector_end(); + + return sha256_base_finish(desc, out); + } + + return crypto_sha256_finup(desc, data, len, out); +} + +static int riscv64_sha256_final(struct shash_desc *desc, u8 *out) +{ + return riscv64_sha256_finup(desc, NULL, 0, out); +} + +static struct shash_alg sha256_algs[] = { + { + .init = sha256_base_init, + .update = riscv64_sha256_update, + .final = riscv64_sha256_final, + .finup = riscv64_sha256_finup, + .descsize = sizeof(struct sha256_state), + .digestsize = SHA256_DIGEST_SIZE, + .base = { + .cra_blocksize = SHA256_BLOCK_SIZE, + .cra_priority = 150, + .cra_name = "sha256", + .cra_driver_name = "sha256-riscv64-zvknha_or_zvknhb-zvkb", + .cra_module = THIS_MODULE, + }, + }, { + .init = sha224_base_init, + .update = riscv64_sha256_update, + .final = riscv64_sha256_final, + .finup = riscv64_sha256_finup, + .descsize = sizeof(struct sha256_state), + .digestsize = SHA224_DIGEST_SIZE, + .base = { + .cra_blocksize = SHA224_BLOCK_SIZE, + .cra_priority = 150, + .cra_name = "sha224", + .cra_driver_name = "sha224-riscv64-zvknha_or_zvknhb-zvkb", + .cra_module = THIS_MODULE, + }, + }, +}; + +static inline bool check_sha256_ext(void) +{ + /* + * From the spec: + * The Zvknhb ext supports both SHA-256 and SHA-512 and Zvknha only + * supports SHA-256. + */ + return (riscv_isa_extension_available(NULL, ZVKNHA) || + riscv_isa_extension_available(NULL, ZVKNHB)) && + riscv_isa_extension_available(NULL, ZVKB) && + riscv_vector_vlen() >= 128; +} + +static int __init riscv64_sha256_mod_init(void) +{ + if (check_sha256_ext()) + return crypto_register_shashes(sha256_algs, + ARRAY_SIZE(sha256_algs)); + + return -ENODEV; +} + +static void __exit riscv64_sha256_mod_fini(void) +{ + crypto_unregister_shashes(sha256_algs, ARRAY_SIZE(sha256_algs)); +} + +module_init(riscv64_sha256_mod_init); +module_exit(riscv64_sha256_mod_fini); + +MODULE_DESCRIPTION("SHA-256 (RISC-V accelerated)"); +MODULE_AUTHOR("Heiko Stuebner "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("sha224"); +MODULE_ALIAS_CRYPTO("sha256"); diff --git a/arch/riscv/crypto/sha256-riscv64-zvknha_or_zvknhb-zvkb.pl b/arch/riscv/crypto/sha256-riscv64-zvknha_or_zvknhb-zvkb.pl new file mode 100644 index 000000000000..51b2d9d4f8f1 --- /dev/null +++ b/arch/riscv/crypto/sha256-riscv64-zvknha_or_zvknhb-zvkb.pl @@ -0,0 +1,318 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). 
You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Christoph Müllner +# Copyright (c) 2023, Phoebe Chen +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# The generated code of this file depends on the following RISC-V extensions: +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 +# - RISC-V Vector Cryptography Bit-manipulation extension ('Zvkb') +# - RISC-V Vector SHA-2 Secure Hash extension ('Zvknha' or 'Zvknhb') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? 
shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +.text +___ + +my ($V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, + $V8, $V9, $V10, $V11, $V12, $V13, $V14, $V15, + $V16, $V17, $V18, $V19, $V20, $V21, $V22, $V23, + $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, +) = map("v$_",(0..31)); + +my $K256 = "K256"; + +# Function arguments +my ($H, $INP, $LEN, $KT, $H2, $INDEX_PATTERN) = ("a0", "a1", "a2", "a3", "t3", "t4"); + +sub sha_256_load_constant { + my $code=<<___; + la $KT, $K256 # Load round constants K256 + @{[vle32_v $V10, $KT]} + addi $KT, $KT, 16 + @{[vle32_v $V11, $KT]} + addi $KT, $KT, 16 + @{[vle32_v $V12, $KT]} + addi $KT, $KT, 16 + @{[vle32_v $V13, $KT]} + addi $KT, $KT, 16 + @{[vle32_v $V14, $KT]} + addi $KT, $KT, 16 + @{[vle32_v $V15, $KT]} + addi $KT, $KT, 16 + @{[vle32_v $V16, $KT]} + addi $KT, $KT, 16 + @{[vle32_v $V17, $KT]} + addi $KT, $KT, 16 + @{[vle32_v $V18, $KT]} + addi $KT, $KT, 16 + @{[vle32_v $V19, $KT]} + addi $KT, $KT, 16 + @{[vle32_v $V20, $KT]} + addi $KT, $KT, 16 + @{[vle32_v $V21, $KT]} + addi $KT, $KT, 16 + @{[vle32_v $V22, $KT]} + addi $KT, $KT, 16 + @{[vle32_v $V23, $KT]} + addi $KT, $KT, 16 + @{[vle32_v $V24, $KT]} + addi $KT, $KT, 16 + @{[vle32_v $V25, $KT]} +___ + + return $code; +} + +################################################################################ +# void sha256_block_data_order_zvkb_zvknha_or_zvknhb(void *c, const void *p, size_t len) +$code .= <<___; +.p2align 2 +.globl sha256_block_data_order_zvkb_zvknha_or_zvknhb +.type sha256_block_data_order_zvkb_zvknha_or_zvknhb,\@function +sha256_block_data_order_zvkb_zvknha_or_zvknhb: + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + + @{[sha_256_load_constant]} + + # H is stored as {a,b,c,d},{e,f,g,h}, but we need {f,e,b,a},{h,g,d,c} + # The dst vtype is e32m1 and the index vtype is e8mf4. + # We use index-load with the following index pattern at v26. + # i8 index: + # 20, 16, 4, 0 + # Instead of setting the i8 index, we could use a single 32bit + # little-endian value to cover the 4xi8 index. + # i32 value: + # 0x 00 04 10 14 + li $INDEX_PATTERN, 0x00041014 + @{[vsetivli "zero", 1, "e32", "m1", "ta", "ma"]} + @{[vmv_v_x $V26, $INDEX_PATTERN]} + + addi $H2, $H, 8 + + # Use index-load to get {f,e,b,a},{h,g,d,c} + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + @{[vluxei8_v $V6, $H, $V26]} + @{[vluxei8_v $V7, $H2, $V26]} + + # Setup v0 mask for the vmerge to replace the first word (idx==0) in key-scheduling. + # The AVL is 4 in SHA, so we could use a single e8(8 element masking) for masking. + @{[vsetivli "zero", 1, "e8", "m1", "ta", "ma"]} + @{[vmv_v_i $V0, 0x01]} + + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + +L_round_loop: + # Decrement length by 1 + add $LEN, $LEN, -1 + + # Keep the current state as we need it later: H' = H+{a',b',c',...,h'}. + @{[vmv_v_v $V30, $V6]} + @{[vmv_v_v $V31, $V7]} + + # Load the 512-bits of the message block in v1-v4 and perform + # an endian swap on each 4 bytes element. 
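+	# (FIPS 180-4 defines the SHA-256 message words as big-endian, while the
+	# vector loads below follow the hart's little-endian byte order, so the
+	# Zvkb vrev8.v restores the expected word layout.)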
+ @{[vle32_v $V1, $INP]} + @{[vrev8_v $V1, $V1]} + add $INP, $INP, 16 + @{[vle32_v $V2, $INP]} + @{[vrev8_v $V2, $V2]} + add $INP, $INP, 16 + @{[vle32_v $V3, $INP]} + @{[vrev8_v $V3, $V3]} + add $INP, $INP, 16 + @{[vle32_v $V4, $INP]} + @{[vrev8_v $V4, $V4]} + add $INP, $INP, 16 + + # Quad-round 0 (+0, Wt from oldest to newest in v1->v2->v3->v4) + @{[vadd_vv $V5, $V10, $V1]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + @{[vmerge_vvm $V5, $V3, $V2, $V0]} + @{[vsha2ms_vv $V1, $V5, $V4]} # Generate W[19:16] + + # Quad-round 1 (+1, v2->v3->v4->v1) + @{[vadd_vv $V5, $V11, $V2]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + @{[vmerge_vvm $V5, $V4, $V3, $V0]} + @{[vsha2ms_vv $V2, $V5, $V1]} # Generate W[23:20] + + # Quad-round 2 (+2, v3->v4->v1->v2) + @{[vadd_vv $V5, $V12, $V3]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + @{[vmerge_vvm $V5, $V1, $V4, $V0]} + @{[vsha2ms_vv $V3, $V5, $V2]} # Generate W[27:24] + + # Quad-round 3 (+3, v4->v1->v2->v3) + @{[vadd_vv $V5, $V13, $V4]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + @{[vmerge_vvm $V5, $V2, $V1, $V0]} + @{[vsha2ms_vv $V4, $V5, $V3]} # Generate W[31:28] + + # Quad-round 4 (+0, v1->v2->v3->v4) + @{[vadd_vv $V5, $V14, $V1]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + @{[vmerge_vvm $V5, $V3, $V2, $V0]} + @{[vsha2ms_vv $V1, $V5, $V4]} # Generate W[35:32] + + # Quad-round 5 (+1, v2->v3->v4->v1) + @{[vadd_vv $V5, $V15, $V2]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + @{[vmerge_vvm $V5, $V4, $V3, $V0]} + @{[vsha2ms_vv $V2, $V5, $V1]} # Generate W[39:36] + + # Quad-round 6 (+2, v3->v4->v1->v2) + @{[vadd_vv $V5, $V16, $V3]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + @{[vmerge_vvm $V5, $V1, $V4, $V0]} + @{[vsha2ms_vv $V3, $V5, $V2]} # Generate W[43:40] + + # Quad-round 7 (+3, v4->v1->v2->v3) + @{[vadd_vv $V5, $V17, $V4]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + @{[vmerge_vvm $V5, $V2, $V1, $V0]} + @{[vsha2ms_vv $V4, $V5, $V3]} # Generate W[47:44] + + # Quad-round 8 (+0, v1->v2->v3->v4) + @{[vadd_vv $V5, $V18, $V1]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + @{[vmerge_vvm $V5, $V3, $V2, $V0]} + @{[vsha2ms_vv $V1, $V5, $V4]} # Generate W[51:48] + + # Quad-round 9 (+1, v2->v3->v4->v1) + @{[vadd_vv $V5, $V19, $V2]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + @{[vmerge_vvm $V5, $V4, $V3, $V0]} + @{[vsha2ms_vv $V2, $V5, $V1]} # Generate W[55:52] + + # Quad-round 10 (+2, v3->v4->v1->v2) + @{[vadd_vv $V5, $V20, $V3]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + @{[vmerge_vvm $V5, $V1, $V4, $V0]} + @{[vsha2ms_vv $V3, $V5, $V2]} # Generate W[59:56] + + # Quad-round 11 (+3, v4->v1->v2->v3) + @{[vadd_vv $V5, $V21, $V4]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + @{[vmerge_vvm $V5, $V2, $V1, $V0]} + @{[vsha2ms_vv $V4, $V5, $V3]} # Generate W[63:60] + + # Quad-round 12 (+0, v1->v2->v3->v4) + # Note that we stop generating new message schedule words (Wt, v1-13) + # as we already generated all the words we end up consuming (i.e., W[63:60]). 
+ @{[vadd_vv $V5, $V22, $V1]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + + # Quad-round 13 (+1, v2->v3->v4->v1) + @{[vadd_vv $V5, $V23, $V2]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + + # Quad-round 14 (+2, v3->v4->v1->v2) + @{[vadd_vv $V5, $V24, $V3]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + + # Quad-round 15 (+3, v4->v1->v2->v3) + @{[vadd_vv $V5, $V25, $V4]} + @{[vsha2cl_vv $V7, $V6, $V5]} + @{[vsha2ch_vv $V6, $V7, $V5]} + + # H' = H+{a',b',c',...,h'} + @{[vadd_vv $V6, $V30, $V6]} + @{[vadd_vv $V7, $V31, $V7]} + bnez $LEN, L_round_loop + + # Store {f,e,b,a},{h,g,d,c} back to {a,b,c,d},{e,f,g,h}. + @{[vsuxei8_v $V6, $H, $V26]} + @{[vsuxei8_v $V7, $H2, $V26]} + + ret +.size sha256_block_data_order_zvkb_zvknha_or_zvknhb,.-sha256_block_data_order_zvkb_zvknha_or_zvknhb + +.p2align 2 +.type $K256,\@object +$K256: + .word 0x428a2f98, 0x71374491, 0xb5c0fbcf, 0xe9b5dba5 + .word 0x3956c25b, 0x59f111f1, 0x923f82a4, 0xab1c5ed5 + .word 0xd807aa98, 0x12835b01, 0x243185be, 0x550c7dc3 + .word 0x72be5d74, 0x80deb1fe, 0x9bdc06a7, 0xc19bf174 + .word 0xe49b69c1, 0xefbe4786, 0x0fc19dc6, 0x240ca1cc + .word 0x2de92c6f, 0x4a7484aa, 0x5cb0a9dc, 0x76f988da + .word 0x983e5152, 0xa831c66d, 0xb00327c8, 0xbf597fc7 + .word 0xc6e00bf3, 0xd5a79147, 0x06ca6351, 0x14292967 + .word 0x27b70a85, 0x2e1b2138, 0x4d2c6dfc, 0x53380d13 + .word 0x650a7354, 0x766a0abb, 0x81c2c92e, 0x92722c85 + .word 0xa2bfe8a1, 0xa81a664b, 0xc24b8b70, 0xc76c51a3 + .word 0xd192e819, 0xd6990624, 0xf40e3585, 0x106aa070 + .word 0x19a4c116, 0x1e376c08, 0x2748774c, 0x34b0bcb5 + .word 0x391c0cb3, 0x4ed8aa4a, 0x5b9cca4f, 0x682e6ff3 + .word 0x748f82ee, 0x78a5636f, 0x84c87814, 0x8cc70208 + .word 0x90befffa, 0xa4506ceb, 0xbef9a3f7, 0xc67178f2 +.size $K256,.-$K256 +___ + +print $code; + +close STDOUT or die "error closing STDOUT: $!"; From patchwork Mon Nov 27 07:07:00 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13469233 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id BFFDCC4167B for ; Mon, 27 Nov 2023 07:07:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=Pns7uiwcPAPZz9stQSHt9l8BJ/oLX+8u3uWbVf0RrqA=; b=Pv6HBa2L6yUaDp tVJA6ZglAW72fxGQHpNsyJKNJcf6hF9v5HM7QWR9qzDziCpBkMaHjyc7p0UnZ9ROBM81PtqcECWSh onU0kK+UeY6wuoR3PeLZOe9lc5zY9A/LvuHTENzgnkYzYKojc6gSQH44zA8g6T3ewvUai7CIkptBr zbelR6PgWyT/0bODeMFUKMxy1yeOpFT75NfArIi0hpgt/CcgRHgN/tGZ5wnbTBQatr0CUKVd2zBeg ySMm0FfjdlwO4pewRdBmcdGzDhCaVUV5xJNDecqkbOAtqTW87mIjWQ7XnP8nqhl+1eOrMIPMfqbLC +uMRjWSDVv0765WVaq+w==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1r7Vih-001f40-13; Mon, 27 Nov 2023 07:07:52 +0000 Received: from mail-pl1-x635.google.com ([2607:f8b0:4864:20::635]) by 
bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1r7Vic-001exj-2A for linux-riscv@lists.infradead.org; Mon, 27 Nov 2023 07:07:49 +0000 Received: by mail-pl1-x635.google.com with SMTP id d9443c01a7336-1cc9b626a96so26865185ad.2 for ; Sun, 26 Nov 2023 23:07:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1701068863; x=1701673663; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Zx+JOV6SwFsQe7mCBswIKAe2MRx+dnQNvWPjGwWvo/Q=; b=R+IhLlTlD/1LWV7bJqY5J6IHSmcaHQSB23dI0attFyVVtBHZ0r5mM5Sq6777tzgrHz thZlEpCsnJgBwzjD1ZL5TkuSFxC6FL6tDpDDEF8RAkYG9Qut7+mGjWbl60zliSITQfuD 4lILHfKY42YwfPgvaSPpluew66gcOfot0BuA4NaYlt/vYzkKXjrlh/HkX7ihYdTtwg3D Jg2dmYlYYwM+1y50vOZ5bFLZkXszy8W6rJiH6gt5bzOB5L1xVQxQ/lV+j1s70sEYKxUH gIc5ZonWEqVTP32M1rpG1xzzVB1DIHBXe1hNOSiEZBXEYNiwSw7c6W2vmzcJd6QE0vCW 4wEw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701068863; x=1701673663; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Zx+JOV6SwFsQe7mCBswIKAe2MRx+dnQNvWPjGwWvo/Q=; b=EQroxFsthRK2sX34r1cBuf4JEzrk7X49Skzqnu2PLNhZf7DX6vdRWKZ6s53TB2JRO2 pZSBdz4TB2xAyM4xtUlfd+5EoSuOZNbkc7SdN5rgX8Qw9j49/Zx3LDc7DiA8LZ09sIZs gITL1cG2g6o69bNSwR+Tf+5EVlOhJhCAPmvXH/WqhSJHaH3dagCMAYJ7cKA3OkPa2EO9 d/zA6uZ0g3PmsZ57otpjvutuBHWUzioFe/cG31ApU9S4RJwJfEi9xUJ78Lc3KlU32f5e pZR83+2jrplWf/29H/xZ1lNhI6vSStWudVj61cKx8jJZ154XENnsnHSAFpsLlqzV8lVX ZJ2Q== X-Gm-Message-State: AOJu0Yw6Rfl7TnUBzTSp1/CDF2HmBlTf8ze/8h2afDDv7deQ/GpyZgDd DWPaw1aquB27k+lsAuCRbe2rzg== X-Google-Smtp-Source: AGHT+IFdhvBp2RdlsIAPjfy87G49FrSij7r62uYf+IVyP/edmjf3e/EKCMqEtk41/Rq8ZCgftaiBvQ== X-Received: by 2002:a17:902:8e89:b0:1cc:7adb:16a5 with SMTP id bg9-20020a1709028e8900b001cc7adb16a5mr9377034plb.13.1701068863528; Sun, 26 Nov 2023 23:07:43 -0800 (PST) Received: from localhost.localdomain ([101.10.45.230]) by smtp.gmail.com with ESMTPSA id jh15-20020a170903328f00b001cfcd3a764esm1340134plb.77.2023.11.26.23.07.40 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sun, 26 Nov 2023 23:07:43 -0800 (PST) From: Jerry Shih To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v2 10/13] RISC-V: crypto: add Zvknhb accelerated SHA384/512 implementations Date: Mon, 27 Nov 2023 15:07:00 +0800 Message-Id: <20231127070703.1697-11-jerry.shih@sifive.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20231127070703.1697-1-jerry.shih@sifive.com> References: <20231127070703.1697-1-jerry.shih@sifive.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231126_230746_735898_4B049334 X-CRM114-Status: GOOD ( 32.61 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org Add SHA384 and 512 implementations using Zvknhb vector crypto extension from 
OpenSSL(openssl/openssl#21923). Co-developed-by: Charalampos Mitrodimas Signed-off-by: Charalampos Mitrodimas Co-developed-by: Heiko Stuebner Signed-off-by: Heiko Stuebner Co-developed-by: Phoebe Chen Signed-off-by: Phoebe Chen Signed-off-by: Jerry Shih --- Changelog v2: - Do not turn on kconfig `SHA512_RISCV64` option by default. - Add `asmlinkage` qualifier for crypto asm function. - Rename sha512-riscv64-zvkb-zvknhb to sha512-riscv64-zvknhb-zvkb. - Reorder structure sha512_algs members initialization in the order declared. --- arch/riscv/crypto/Kconfig | 11 + arch/riscv/crypto/Makefile | 7 + arch/riscv/crypto/sha512-riscv64-glue.c | 139 +++++++++ .../crypto/sha512-riscv64-zvknhb-zvkb.pl | 266 ++++++++++++++++++ 4 files changed, 423 insertions(+) create mode 100644 arch/riscv/crypto/sha512-riscv64-glue.c create mode 100644 arch/riscv/crypto/sha512-riscv64-zvknhb-zvkb.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index d31af9190717..ad0b08a13c9a 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -55,4 +55,15 @@ config CRYPTO_SHA256_RISCV64 - Zvknha or Zvknhb vector crypto extensions - Zvkb vector crypto extension +config CRYPTO_SHA512_RISCV64 + tristate "Hash functions: SHA-384 and SHA-512" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_SHA512 + help + SHA-384 and SHA-512 secure hash algorithm (FIPS 180) + + Architecture: riscv64 using: + - Zvknhb vector crypto extension + - Zvkb vector crypto extension + endmenu diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index e9d7717ec943..8aabef950ad3 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -15,6 +15,9 @@ ghash-riscv64-y := ghash-riscv64-glue.o ghash-riscv64-zvkg.o obj-$(CONFIG_CRYPTO_SHA256_RISCV64) += sha256-riscv64.o sha256-riscv64-y := sha256-riscv64-glue.o sha256-riscv64-zvknha_or_zvknhb-zvkb.o +obj-$(CONFIG_CRYPTO_SHA512_RISCV64) += sha512-riscv64.o +sha512-riscv64-y := sha512-riscv64-glue.o sha512-riscv64-zvknhb-zvkb.o + quiet_cmd_perlasm = PERLASM $@ cmd_perlasm = $(PERL) $(<) void $(@) @@ -33,8 +36,12 @@ $(obj)/ghash-riscv64-zvkg.S: $(src)/ghash-riscv64-zvkg.pl $(obj)/sha256-riscv64-zvknha_or_zvknhb-zvkb.S: $(src)/sha256-riscv64-zvknha_or_zvknhb-zvkb.pl $(call cmd,perlasm) +$(obj)/sha512-riscv64-zvknhb-zvkb.S: $(src)/sha512-riscv64-zvknhb-zvkb.pl + $(call cmd,perlasm) + clean-files += aes-riscv64-zvkned.S clean-files += aes-riscv64-zvkned-zvbb-zvkg.S clean-files += aes-riscv64-zvkned-zvkb.S clean-files += ghash-riscv64-zvkg.S clean-files += sha256-riscv64-zvknha_or_zvknhb-zvkb.S +clean-files += sha512-riscv64-zvknhb-zvkb.S diff --git a/arch/riscv/crypto/sha512-riscv64-glue.c b/arch/riscv/crypto/sha512-riscv64-glue.c new file mode 100644 index 000000000000..3dd8e1c9d402 --- /dev/null +++ b/arch/riscv/crypto/sha512-riscv64-glue.c @@ -0,0 +1,139 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Linux/riscv64 port of the OpenSSL SHA512 implementation for RISC-V 64 + * + * Copyright (C) 2023 VRULL GmbH + * Author: Heiko Stuebner + * + * Copyright (C) 2023 SiFive, Inc. + * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * sha512 using zvkb and zvknhb vector crypto extension + * + * This asm function will just take the first 512-bit as the sha512 state from + * the pointer to `struct sha512_state`. 
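+ * (struct sha512_state keeps its u64 state[8] words first, ahead of the
+ * message count and partial-block buffer; riscv64_sha512_update() below
+ * asserts exactly that with a BUILD_BUG_ON.)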
+ */ +asmlinkage void sha512_block_data_order_zvkb_zvknhb(struct sha512_state *digest, + const u8 *data, + int num_blks); + +static int riscv64_sha512_update(struct shash_desc *desc, const u8 *data, + unsigned int len) +{ + int ret = 0; + + /* + * Make sure struct sha512_state begins directly with the SHA512 + * 512-bit internal state, as this is what the asm function expect. + */ + BUILD_BUG_ON(offsetof(struct sha512_state, state) != 0); + + if (crypto_simd_usable()) { + kernel_vector_begin(); + ret = sha512_base_do_update( + desc, data, len, sha512_block_data_order_zvkb_zvknhb); + kernel_vector_end(); + } else { + ret = crypto_sha512_update(desc, data, len); + } + + return ret; +} + +static int riscv64_sha512_finup(struct shash_desc *desc, const u8 *data, + unsigned int len, u8 *out) +{ + if (crypto_simd_usable()) { + kernel_vector_begin(); + if (len) + sha512_base_do_update( + desc, data, len, + sha512_block_data_order_zvkb_zvknhb); + sha512_base_do_finalize(desc, + sha512_block_data_order_zvkb_zvknhb); + kernel_vector_end(); + + return sha512_base_finish(desc, out); + } + + return crypto_sha512_finup(desc, data, len, out); +} + +static int riscv64_sha512_final(struct shash_desc *desc, u8 *out) +{ + return riscv64_sha512_finup(desc, NULL, 0, out); +} + +static struct shash_alg sha512_algs[] = { + { + .init = sha512_base_init, + .update = riscv64_sha512_update, + .final = riscv64_sha512_final, + .finup = riscv64_sha512_finup, + .descsize = sizeof(struct sha512_state), + .digestsize = SHA512_DIGEST_SIZE, + .base = { + .cra_blocksize = SHA512_BLOCK_SIZE, + .cra_priority = 150, + .cra_name = "sha512", + .cra_driver_name = "sha512-riscv64-zvknhb-zvkb", + .cra_module = THIS_MODULE, + }, + }, + { + .init = sha384_base_init, + .update = riscv64_sha512_update, + .final = riscv64_sha512_final, + .finup = riscv64_sha512_finup, + .descsize = sizeof(struct sha512_state), + .digestsize = SHA384_DIGEST_SIZE, + .base = { + .cra_blocksize = SHA384_BLOCK_SIZE, + .cra_priority = 150, + .cra_name = "sha384", + .cra_driver_name = "sha384-riscv64-zvknhb-zvkb", + .cra_module = THIS_MODULE, + }, + }, +}; + +static inline bool check_sha512_ext(void) +{ + return riscv_isa_extension_available(NULL, ZVKNHB) && + riscv_isa_extension_available(NULL, ZVKB) && + riscv_vector_vlen() >= 128; +} + +static int __init riscv64_sha512_mod_init(void) +{ + if (check_sha512_ext()) + return crypto_register_shashes(sha512_algs, + ARRAY_SIZE(sha512_algs)); + + return -ENODEV; +} + +static void __exit riscv64_sha512_mod_fini(void) +{ + crypto_unregister_shashes(sha512_algs, ARRAY_SIZE(sha512_algs)); +} + +module_init(riscv64_sha512_mod_init); +module_exit(riscv64_sha512_mod_fini); + +MODULE_DESCRIPTION("SHA-512 (RISC-V accelerated)"); +MODULE_AUTHOR("Heiko Stuebner "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("sha384"); +MODULE_ALIAS_CRYPTO("sha512"); diff --git a/arch/riscv/crypto/sha512-riscv64-zvknhb-zvkb.pl b/arch/riscv/crypto/sha512-riscv64-zvknhb-zvkb.pl new file mode 100644 index 000000000000..4be448266a59 --- /dev/null +++ b/arch/riscv/crypto/sha512-riscv64-zvknhb-zvkb.pl @@ -0,0 +1,266 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). 
You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Christoph Müllner +# Copyright (c) 2023, Phoebe Chen +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# The generated code of this file depends on the following RISC-V extensions: +# - RV64I +# - RISC-V vector ('V') with VLEN >= 128 +# - RISC-V Vector Cryptography Bit-manipulation extension ('Zvkb') +# - RISC-V Vector SHA-2 Secure Hash extension ('Zvknhb') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +.text +___ + +my ($V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, + $V8, $V9, $V10, $V11, $V12, $V13, $V14, $V15, + $V16, $V17, $V18, $V19, $V20, $V21, $V22, $V23, + $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, +) = map("v$_",(0..31)); + +my $K512 = "K512"; + +# Function arguments +my ($H, $INP, $LEN, $KT, $H2, $INDEX_PATTERN) = ("a0", "a1", "a2", "a3", "t3", "t4"); + +################################################################################ +# void sha512_block_data_order_zvkb_zvknhb(void *c, const void *p, size_t len) +$code .= <<___; +.p2align 2 +.globl sha512_block_data_order_zvkb_zvknhb +.type sha512_block_data_order_zvkb_zvknhb,\@function +sha512_block_data_order_zvkb_zvknhb: + @{[vsetivli "zero", 4, "e64", "m2", "ta", "ma"]} + + # H is stored as {a,b,c,d},{e,f,g,h}, but we need {f,e,b,a},{h,g,d,c} + # The dst vtype is e64m2 and the index vtype is e8mf4. + # We use index-load with the following index pattern at v1. + # i8 index: + # 40, 32, 8, 0 + # Instead of setting the i8 index, we could use a single 32bit + # little-endian value to cover the 4xi8 index. 
+ # i32 value: + # 0x 00 08 20 28 + li $INDEX_PATTERN, 0x00082028 + @{[vsetivli "zero", 1, "e32", "m1", "ta", "ma"]} + @{[vmv_v_x $V1, $INDEX_PATTERN]} + + addi $H2, $H, 16 + + # Use index-load to get {f,e,b,a},{h,g,d,c} + @{[vsetivli "zero", 4, "e64", "m2", "ta", "ma"]} + @{[vluxei8_v $V22, $H, $V1]} + @{[vluxei8_v $V24, $H2, $V1]} + + # Setup v0 mask for the vmerge to replace the first word (idx==0) in key-scheduling. + # The AVL is 4 in SHA, so we could use a single e8(8 element masking) for masking. + @{[vsetivli "zero", 1, "e8", "m1", "ta", "ma"]} + @{[vmv_v_i $V0, 0x01]} + + @{[vsetivli "zero", 4, "e64", "m2", "ta", "ma"]} + +L_round_loop: + # Load round constants K512 + la $KT, $K512 + + # Decrement length by 1 + addi $LEN, $LEN, -1 + + # Keep the current state as we need it later: H' = H+{a',b',c',...,h'}. + @{[vmv_v_v $V26, $V22]} + @{[vmv_v_v $V28, $V24]} + + # Load the 1024-bits of the message block in v10-v16 and perform the endian + # swap. + @{[vle64_v $V10, $INP]} + @{[vrev8_v $V10, $V10]} + addi $INP, $INP, 32 + @{[vle64_v $V12, $INP]} + @{[vrev8_v $V12, $V12]} + addi $INP, $INP, 32 + @{[vle64_v $V14, $INP]} + @{[vrev8_v $V14, $V14]} + addi $INP, $INP, 32 + @{[vle64_v $V16, $INP]} + @{[vrev8_v $V16, $V16]} + addi $INP, $INP, 32 + + .rept 4 + # Quad-round 0 (+0, v10->v12->v14->v16) + @{[vle64_v $V20, $KT]} + addi $KT, $KT, 32 + @{[vadd_vv $V18, $V20, $V10]} + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + @{[vmerge_vvm $V18, $V14, $V12, $V0]} + @{[vsha2ms_vv $V10, $V18, $V16]} + + # Quad-round 1 (+1, v12->v14->v16->v10) + @{[vle64_v $V20, $KT]} + addi $KT, $KT, 32 + @{[vadd_vv $V18, $V20, $V12]} + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + @{[vmerge_vvm $V18, $V16, $V14, $V0]} + @{[vsha2ms_vv $V12, $V18, $V10]} + + # Quad-round 2 (+2, v14->v16->v10->v12) + @{[vle64_v $V20, $KT]} + addi $KT, $KT, 32 + @{[vadd_vv $V18, $V20, $V14]} + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + @{[vmerge_vvm $V18, $V10, $V16, $V0]} + @{[vsha2ms_vv $V14, $V18, $V12]} + + # Quad-round 3 (+3, v16->v10->v12->v14) + @{[vle64_v $V20, $KT]} + addi $KT, $KT, 32 + @{[vadd_vv $V18, $V20, $V16]} + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + @{[vmerge_vvm $V18, $V12, $V10, $V0]} + @{[vsha2ms_vv $V16, $V18, $V14]} + .endr + + # Quad-round 16 (+0, v10->v12->v14->v16) + # Note that we stop generating new message schedule words (Wt, v10-16) + # as we already generated all the words we end up consuming (i.e., W[79:76]). + @{[vle64_v $V20, $KT]} + addi $KT, $KT, 32 + @{[vadd_vv $V18, $V20, $V10]} + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + + # Quad-round 17 (+1, v12->v14->v16->v10) + @{[vle64_v $V20, $KT]} + addi $KT, $KT, 32 + @{[vadd_vv $V18, $V20, $V12]} + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + + # Quad-round 18 (+2, v14->v16->v10->v12) + @{[vle64_v $V20, $KT]} + addi $KT, $KT, 32 + @{[vadd_vv $V18, $V20, $V14]} + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + + # Quad-round 19 (+3, v16->v10->v12->v14) + @{[vle64_v $V20, $KT]} + # No t1 increment needed. + @{[vadd_vv $V18, $V20, $V16]} + @{[vsha2cl_vv $V24, $V22, $V18]} + @{[vsha2ch_vv $V22, $V24, $V18]} + + # H' = H+{a',b',c',...,h'} + @{[vadd_vv $V22, $V26, $V22]} + @{[vadd_vv $V24, $V28, $V24]} + bnez $LEN, L_round_loop + + # Store {f,e,b,a},{h,g,d,c} back to {a,b,c,d},{e,f,g,h}. 
+ @{[vsuxei8_v $V22, $H, $V1]} + @{[vsuxei8_v $V24, $H2, $V1]} + + ret +.size sha512_block_data_order_zvkb_zvknhb,.-sha512_block_data_order_zvkb_zvknhb + +.p2align 3 +.type $K512,\@object +$K512: + .dword 0x428a2f98d728ae22, 0x7137449123ef65cd + .dword 0xb5c0fbcfec4d3b2f, 0xe9b5dba58189dbbc + .dword 0x3956c25bf348b538, 0x59f111f1b605d019 + .dword 0x923f82a4af194f9b, 0xab1c5ed5da6d8118 + .dword 0xd807aa98a3030242, 0x12835b0145706fbe + .dword 0x243185be4ee4b28c, 0x550c7dc3d5ffb4e2 + .dword 0x72be5d74f27b896f, 0x80deb1fe3b1696b1 + .dword 0x9bdc06a725c71235, 0xc19bf174cf692694 + .dword 0xe49b69c19ef14ad2, 0xefbe4786384f25e3 + .dword 0x0fc19dc68b8cd5b5, 0x240ca1cc77ac9c65 + .dword 0x2de92c6f592b0275, 0x4a7484aa6ea6e483 + .dword 0x5cb0a9dcbd41fbd4, 0x76f988da831153b5 + .dword 0x983e5152ee66dfab, 0xa831c66d2db43210 + .dword 0xb00327c898fb213f, 0xbf597fc7beef0ee4 + .dword 0xc6e00bf33da88fc2, 0xd5a79147930aa725 + .dword 0x06ca6351e003826f, 0x142929670a0e6e70 + .dword 0x27b70a8546d22ffc, 0x2e1b21385c26c926 + .dword 0x4d2c6dfc5ac42aed, 0x53380d139d95b3df + .dword 0x650a73548baf63de, 0x766a0abb3c77b2a8 + .dword 0x81c2c92e47edaee6, 0x92722c851482353b + .dword 0xa2bfe8a14cf10364, 0xa81a664bbc423001 + .dword 0xc24b8b70d0f89791, 0xc76c51a30654be30 + .dword 0xd192e819d6ef5218, 0xd69906245565a910 + .dword 0xf40e35855771202a, 0x106aa07032bbd1b8 + .dword 0x19a4c116b8d2d0c8, 0x1e376c085141ab53 + .dword 0x2748774cdf8eeb99, 0x34b0bcb5e19b48a8 + .dword 0x391c0cb3c5c95a63, 0x4ed8aa4ae3418acb + .dword 0x5b9cca4f7763e373, 0x682e6ff3d6b2b8a3 + .dword 0x748f82ee5defb2fc, 0x78a5636f43172f60 + .dword 0x84c87814a1f0ab72, 0x8cc702081a6439ec + .dword 0x90befffa23631e28, 0xa4506cebde82bde9 + .dword 0xbef9a3f7b2c67915, 0xc67178f2e372532b + .dword 0xca273eceea26619c, 0xd186b8c721c0c207 + .dword 0xeada7dd6cde0eb1e, 0xf57d4f7fee6ed178 + .dword 0x06f067aa72176fba, 0x0a637dc5a2c898a6 + .dword 0x113f9804bef90dae, 0x1b710b35131c471b + .dword 0x28db77f523047d84, 0x32caab7b40c72493 + .dword 0x3c9ebe0a15c9bebc, 0x431d67c49c100d4c + .dword 0x4cc5d4becb3e42b6, 0x597f299cfc657e2a + .dword 0x5fcb6fab3ad6faec, 0x6c44198c4a475817 +.size $K512,.-$K512 +___ + +print $code; + +close STDOUT or die "error closing STDOUT: $!"; From patchwork Mon Nov 27 07:07:01 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13469234 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 1824DC4167B for ; Mon, 27 Nov 2023 07:08:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=iTUIzTq+A4KRuc+wlKYOry2oOZLCf2SkLwtbfGt7ZOY=; b=Cl/UulzNqOcsUn f7/5jw5v1aQvYqAdDRSM4fj3+tc+/vqgOTnEZznQqPSP2pHrBvd+02CeShtEcnh7F7TxmECZGgs1F j4dk6fohkSxtX5LzviTapYcIAAhg56gF12ajnV4fEElCToH6zIrkpNfmngcHGeRVX6U2QG94O+Q9P FDjGi5g/pwMGnF8I37wiHeXNOGiccxGgQnlHwIH0QSTPfNtBegpqf6SUQGFx1i9EzJD8iHWM+Tc4I 
FkUB/vE3lvlqhFXh8dPTJLs9QtsWwE6eg6Qf1OvaH/+SM/79LtYHdjRPS7kdfh3D+JC13MIpf2Exo KD6syCnPRyldTLi2pq1Q==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1r7Vim-001f7g-1L; Mon, 27 Nov 2023 07:07:56 +0000 Received: from mail-pf1-x430.google.com ([2607:f8b0:4864:20::430]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1r7Vif-001ezt-2V for linux-riscv@lists.infradead.org; Mon, 27 Nov 2023 07:07:53 +0000 Received: by mail-pf1-x430.google.com with SMTP id d2e1a72fcca58-6cb74a527ceso2925823b3a.2 for ; Sun, 26 Nov 2023 23:07:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1701068867; x=1701673667; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=MHvQMLoaTueUI0HUb6KNgDLBfeUfZVP5F06b8uLBkiE=; b=SzXiaonCK1+GcZFG7H8NBs8DC88O5NmcNxrtMMqYKwFM62zBD34ttaKY/Jykl+E1be BvrWAetAm9H6b1KuNRsiYuBDppFLmrMLt/I1UrC0ubBAzaBgTmgTiY/tSfon4ss3vfoc szUBpYcS2ZQ9CpERH9xKwOhDqafeQ+F4SmToGohPFDWAfqBJ2KlwovSZ4ItBGP2vsJTh OMnYDR8pFZ24xhcr8odyfJdpBkEOBFYqM2+wqZ96A6QShawWoxLMz5iVW+ynTwE0xRQN D/1dxQoOt1EJ2XsPNxHa+L+2RhWI8NSNS2DDzAfe7qBzfJpwwIaMfRP8c59TlFgeaj5a rFVA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701068867; x=1701673667; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=MHvQMLoaTueUI0HUb6KNgDLBfeUfZVP5F06b8uLBkiE=; b=wt8WYXEQ7SnwkcOxtslt3ZK+c40GhHsiekpxVa1aBkiRqWP+wwHUqbD6Q6rIjEkNps 8QOLm+glEbsU2kkT3aEb1GwNBkVHwPqxFQfxz6uQ/jUO/iBsx1a8v6OvUN3PVoTO1AT7 Nb/ZSXshHBiow4qyth8iKYYHzUtUsKq8cXwxk2Z4dm+MY5XW0fJgCn+IOPKd2YvnEGiC umfNHZjahH8eFy0yl6ZZMzDDgzMD7xkJKNmIbAxeizTSl5bS7CSrCBUl4f0xw4+ZqRW1 OTy4QwJ7O5rRAnZY0uO4Qh3RAlK6gfkaROKEu3RIHsbim7wVI5XFw0ryVzwzgGCa9Pb+ n6vA== X-Gm-Message-State: AOJu0Yw+3g9+IpMDeSksIDT+Djb2Js+M/BzWr7n8x0vt6EFHFYBSDSIH Nh89zqSfDvs39u/Al+8R6CFlWg== X-Google-Smtp-Source: AGHT+IFK7haKFNWcAoAL0u/ApDI4eyKdaHASKTvf+XvN5fJLNzDmGaowZ3pjGzDqkx0PXPdo+0RfNQ== X-Received: by 2002:a05:6a20:3d28:b0:18c:2287:29cf with SMTP id y40-20020a056a203d2800b0018c228729cfmr8525491pzi.40.1701068866585; Sun, 26 Nov 2023 23:07:46 -0800 (PST) Received: from localhost.localdomain ([101.10.45.230]) by smtp.gmail.com with ESMTPSA id jh15-20020a170903328f00b001cfcd3a764esm1340134plb.77.2023.11.26.23.07.43 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sun, 26 Nov 2023 23:07:46 -0800 (PST) From: Jerry Shih To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v2 11/13] RISC-V: crypto: add Zvksed accelerated SM4 implementation Date: Mon, 27 Nov 2023 15:07:01 +0800 Message-Id: <20231127070703.1697-12-jerry.shih@sifive.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20231127070703.1697-1-jerry.shih@sifive.com> References: <20231127070703.1697-1-jerry.shih@sifive.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231126_230749_866899_E4849584 X-CRM114-Status: GOOD ( 29.11 ) X-BeenThere: 
linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org Add SM4 implementation using Zvksed vector crypto extension from OpenSSL (openssl/openssl#21923). The perlasm here is different from the original implementation in OpenSSL. In OpenSSL, SM4 has the separated set_encrypt_key and set_decrypt_key functions. In kernel, these set_key functions are merged into a single one in order to skip the redundant key expanding instructions. Co-developed-by: Christoph Müllner Signed-off-by: Christoph Müllner Co-developed-by: Heiko Stuebner Signed-off-by: Heiko Stuebner Signed-off-by: Jerry Shih --- Changelog v2: - Do not turn on kconfig `SM4_RISCV64` option by default. - Add the missed `static` declaration for riscv64_sm4_zvksed_alg. - Add `asmlinkage` qualifier for crypto asm function. - Rename sm4-riscv64-zvkb-zvksed to sm4-riscv64-zvksed-zvkb. - Reorder structure riscv64_sm4_zvksed_zvkb_alg members initialization in the order declared. --- arch/riscv/crypto/Kconfig | 17 ++ arch/riscv/crypto/Makefile | 7 + arch/riscv/crypto/sm4-riscv64-glue.c | 121 +++++++++++ arch/riscv/crypto/sm4-riscv64-zvksed.pl | 268 ++++++++++++++++++++++++ 4 files changed, 413 insertions(+) create mode 100644 arch/riscv/crypto/sm4-riscv64-glue.c create mode 100644 arch/riscv/crypto/sm4-riscv64-zvksed.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index ad0b08a13c9a..b28cf1972250 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -66,4 +66,21 @@ config CRYPTO_SHA512_RISCV64 - Zvknhb vector crypto extension - Zvkb vector crypto extension +config CRYPTO_SM4_RISCV64 + tristate "Ciphers: SM4 (ShangMi 4)" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_ALGAPI + select CRYPTO_SM4 + help + SM4 cipher algorithms (OSCCA GB/T 32907-2016, + ISO/IEC 18033-3:2010/Amd 1:2021) + + SM4 (GBT.32907-2016) is a cryptographic standard issued by the + Organization of State Commercial Administration of China (OSCCA) + as an authorized cryptographic algorithms for the use within China. 
+ + Architecture: riscv64 using: + - Zvksed vector crypto extension + - Zvkb vector crypto extension + endmenu diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index 8aabef950ad3..8e34861bba34 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -18,6 +18,9 @@ sha256-riscv64-y := sha256-riscv64-glue.o sha256-riscv64-zvknha_or_zvknhb-zvkb.o obj-$(CONFIG_CRYPTO_SHA512_RISCV64) += sha512-riscv64.o sha512-riscv64-y := sha512-riscv64-glue.o sha512-riscv64-zvknhb-zvkb.o +obj-$(CONFIG_CRYPTO_SM4_RISCV64) += sm4-riscv64.o +sm4-riscv64-y := sm4-riscv64-glue.o sm4-riscv64-zvksed.o + quiet_cmd_perlasm = PERLASM $@ cmd_perlasm = $(PERL) $(<) void $(@) @@ -39,9 +42,13 @@ $(obj)/sha256-riscv64-zvknha_or_zvknhb-zvkb.S: $(src)/sha256-riscv64-zvknha_or_z $(obj)/sha512-riscv64-zvknhb-zvkb.S: $(src)/sha512-riscv64-zvknhb-zvkb.pl $(call cmd,perlasm) +$(obj)/sm4-riscv64-zvksed.S: $(src)/sm4-riscv64-zvksed.pl + $(call cmd,perlasm) + clean-files += aes-riscv64-zvkned.S clean-files += aes-riscv64-zvkned-zvbb-zvkg.S clean-files += aes-riscv64-zvkned-zvkb.S clean-files += ghash-riscv64-zvkg.S clean-files += sha256-riscv64-zvknha_or_zvknhb-zvkb.S clean-files += sha512-riscv64-zvknhb-zvkb.S +clean-files += sm4-riscv64-zvksed.S diff --git a/arch/riscv/crypto/sm4-riscv64-glue.c b/arch/riscv/crypto/sm4-riscv64-glue.c new file mode 100644 index 000000000000..9d9d24b67ee3 --- /dev/null +++ b/arch/riscv/crypto/sm4-riscv64-glue.c @@ -0,0 +1,121 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Linux/riscv64 port of the OpenSSL SM4 implementation for RISC-V 64 + * + * Copyright (C) 2023 VRULL GmbH + * Author: Heiko Stuebner + * + * Copyright (C) 2023 SiFive, Inc. + * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* sm4 using zvksed vector crypto extension */ +asmlinkage void rv64i_zvksed_sm4_encrypt(const u8 *in, u8 *out, const u32 *key); +asmlinkage void rv64i_zvksed_sm4_decrypt(const u8 *in, u8 *out, const u32 *key); +asmlinkage int rv64i_zvksed_sm4_set_key(const u8 *user_key, + unsigned int key_len, u32 *enc_key, + u32 *dec_key); + +static int riscv64_sm4_setkey_zvksed(struct crypto_tfm *tfm, const u8 *key, + unsigned int key_len) +{ + struct sm4_ctx *ctx = crypto_tfm_ctx(tfm); + int ret = 0; + + if (crypto_simd_usable()) { + kernel_vector_begin(); + if (rv64i_zvksed_sm4_set_key(key, key_len, ctx->rkey_enc, + ctx->rkey_dec)) + ret = -EINVAL; + kernel_vector_end(); + } else { + ret = sm4_expandkey(ctx, key, key_len); + } + + return ret; +} + +static void riscv64_sm4_encrypt_zvksed(struct crypto_tfm *tfm, u8 *dst, + const u8 *src) +{ + const struct sm4_ctx *ctx = crypto_tfm_ctx(tfm); + + if (crypto_simd_usable()) { + kernel_vector_begin(); + rv64i_zvksed_sm4_encrypt(src, dst, ctx->rkey_enc); + kernel_vector_end(); + } else { + sm4_crypt_block(ctx->rkey_enc, dst, src); + } +} + +static void riscv64_sm4_decrypt_zvksed(struct crypto_tfm *tfm, u8 *dst, + const u8 *src) +{ + const struct sm4_ctx *ctx = crypto_tfm_ctx(tfm); + + if (crypto_simd_usable()) { + kernel_vector_begin(); + rv64i_zvksed_sm4_decrypt(src, dst, ctx->rkey_dec); + kernel_vector_end(); + } else { + sm4_crypt_block(ctx->rkey_dec, dst, src); + } +} + +static struct crypto_alg riscv64_sm4_zvksed_zvkb_alg = { + .cra_flags = CRYPTO_ALG_TYPE_CIPHER, + .cra_blocksize = SM4_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct sm4_ctx), + .cra_priority = 300, + .cra_name = "sm4", + .cra_driver_name = "sm4-riscv64-zvksed-zvkb", + .cra_cipher = 
{ + .cia_min_keysize = SM4_KEY_SIZE, + .cia_max_keysize = SM4_KEY_SIZE, + .cia_setkey = riscv64_sm4_setkey_zvksed, + .cia_encrypt = riscv64_sm4_encrypt_zvksed, + .cia_decrypt = riscv64_sm4_decrypt_zvksed, + }, + .cra_module = THIS_MODULE, +}; + +static inline bool check_sm4_ext(void) +{ + return riscv_isa_extension_available(NULL, ZVKSED) && + riscv_isa_extension_available(NULL, ZVKB) && + riscv_vector_vlen() >= 128; +} + +static int __init riscv64_sm4_mod_init(void) +{ + if (check_sm4_ext()) + return crypto_register_alg(&riscv64_sm4_zvksed_zvkb_alg); + + return -ENODEV; +} + +static void __exit riscv64_sm4_mod_fini(void) +{ + crypto_unregister_alg(&riscv64_sm4_zvksed_zvkb_alg); +} + +module_init(riscv64_sm4_mod_init); +module_exit(riscv64_sm4_mod_fini); + +MODULE_DESCRIPTION("SM4 (RISC-V accelerated)"); +MODULE_AUTHOR("Heiko Stuebner "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("sm4"); diff --git a/arch/riscv/crypto/sm4-riscv64-zvksed.pl b/arch/riscv/crypto/sm4-riscv64-zvksed.pl new file mode 100644 index 000000000000..dab1d026a360 --- /dev/null +++ b/arch/riscv/crypto/sm4-riscv64-zvksed.pl @@ -0,0 +1,268 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Christoph Müllner +# Copyright (c) 2023, Jerry Shih +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
+ +# The generated code of this file depends on the following RISC-V extensions: +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 +# - RISC-V Vector Cryptography Bit-manipulation extension ('Zvkb') +# - RISC-V Vector SM4 Block Cipher extension ('Zvksed') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +.text +___ + +#### +# int rv64i_zvksed_sm4_set_key(const u8 *user_key, unsigned int key_len, +# u32 *enc_key, u32 *dec_key); +# +{ +my ($ukey,$key_len,$enc_key,$dec_key)=("a0","a1","a2","a3"); +my ($fk,$stride)=("a4","a5"); +my ($t0,$t1)=("t0","t1"); +my ($vukey,$vfk,$vk0,$vk1,$vk2,$vk3,$vk4,$vk5,$vk6,$vk7)=("v1","v2","v3","v4","v5","v6","v7","v8","v9","v10"); +$code .= <<___; +.p2align 3 +.globl rv64i_zvksed_sm4_set_key +.type rv64i_zvksed_sm4_set_key,\@function +rv64i_zvksed_sm4_set_key: + li $t0, 16 + beq $t0, $key_len, 1f + li a0, 1 + ret +1: + + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + + # Load the user key + @{[vle32_v $vukey, $ukey]} + @{[vrev8_v $vukey, $vukey]} + + # Load the FK. + la $fk, FK + @{[vle32_v $vfk, $fk]} + + # Generate round keys. + @{[vxor_vv $vukey, $vukey, $vfk]} + @{[vsm4k_vi $vk0, $vukey, 0]} # rk[0:3] + @{[vsm4k_vi $vk1, $vk0, 1]} # rk[4:7] + @{[vsm4k_vi $vk2, $vk1, 2]} # rk[8:11] + @{[vsm4k_vi $vk3, $vk2, 3]} # rk[12:15] + @{[vsm4k_vi $vk4, $vk3, 4]} # rk[16:19] + @{[vsm4k_vi $vk5, $vk4, 5]} # rk[20:23] + @{[vsm4k_vi $vk6, $vk5, 6]} # rk[24:27] + @{[vsm4k_vi $vk7, $vk6, 7]} # rk[28:31] + + # Store enc round keys + @{[vse32_v $vk0, $enc_key]} # rk[0:3] + addi $enc_key, $enc_key, 16 + @{[vse32_v $vk1, $enc_key]} # rk[4:7] + addi $enc_key, $enc_key, 16 + @{[vse32_v $vk2, $enc_key]} # rk[8:11] + addi $enc_key, $enc_key, 16 + @{[vse32_v $vk3, $enc_key]} # rk[12:15] + addi $enc_key, $enc_key, 16 + @{[vse32_v $vk4, $enc_key]} # rk[16:19] + addi $enc_key, $enc_key, 16 + @{[vse32_v $vk5, $enc_key]} # rk[20:23] + addi $enc_key, $enc_key, 16 + @{[vse32_v $vk6, $enc_key]} # rk[24:27] + addi $enc_key, $enc_key, 16 + @{[vse32_v $vk7, $enc_key]} # rk[28:31] + + # Store dec round keys in reverse order + addi $dec_key, $dec_key, 12 + li $stride, -4 + @{[vsse32_v $vk7, $dec_key, $stride]} # rk[31:28] + addi $dec_key, $dec_key, 16 + @{[vsse32_v $vk6, $dec_key, $stride]} # rk[27:24] + addi $dec_key, $dec_key, 16 + @{[vsse32_v $vk5, $dec_key, $stride]} # rk[23:20] + addi $dec_key, $dec_key, 16 + @{[vsse32_v $vk4, $dec_key, $stride]} # rk[19:16] + addi $dec_key, $dec_key, 16 + @{[vsse32_v $vk3, $dec_key, $stride]} # rk[15:12] + addi $dec_key, $dec_key, 16 + @{[vsse32_v $vk2, $dec_key, $stride]} # rk[11:8] + addi $dec_key, $dec_key, 16 + @{[vsse32_v $vk1, $dec_key, $stride]} # rk[7:4] + addi $dec_key, $dec_key, 16 + @{[vsse32_v $vk0, $dec_key, $stride]} # rk[3:0] + + li a0, 0 + ret +.size rv64i_zvksed_sm4_set_key,.-rv64i_zvksed_sm4_set_key +___ +} + +#### +# void rv64i_zvksed_sm4_encrypt(const unsigned char *in, unsigned char *out, +# const SM4_KEY *key); +# +{ +my ($in,$out,$keys,$stride)=("a0","a1","a2","t0"); +my ($vdata,$vk0,$vk1,$vk2,$vk3,$vk4,$vk5,$vk6,$vk7,$vgen)=("v1","v2","v3","v4","v5","v6","v7","v8","v9","v10"); +$code .= <<___; +.p2align 3 +.globl 
rv64i_zvksed_sm4_encrypt +.type rv64i_zvksed_sm4_encrypt,\@function +rv64i_zvksed_sm4_encrypt: + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + + # Load input data + @{[vle32_v $vdata, $in]} + @{[vrev8_v $vdata, $vdata]} + + # Order of elements was adjusted in sm4_set_key() + # Encrypt with all keys + @{[vle32_v $vk0, $keys]} # rk[0:3] + @{[vsm4r_vs $vdata, $vk0]} + addi $keys, $keys, 16 + @{[vle32_v $vk1, $keys]} # rk[4:7] + @{[vsm4r_vs $vdata, $vk1]} + addi $keys, $keys, 16 + @{[vle32_v $vk2, $keys]} # rk[8:11] + @{[vsm4r_vs $vdata, $vk2]} + addi $keys, $keys, 16 + @{[vle32_v $vk3, $keys]} # rk[12:15] + @{[vsm4r_vs $vdata, $vk3]} + addi $keys, $keys, 16 + @{[vle32_v $vk4, $keys]} # rk[16:19] + @{[vsm4r_vs $vdata, $vk4]} + addi $keys, $keys, 16 + @{[vle32_v $vk5, $keys]} # rk[20:23] + @{[vsm4r_vs $vdata, $vk5]} + addi $keys, $keys, 16 + @{[vle32_v $vk6, $keys]} # rk[24:27] + @{[vsm4r_vs $vdata, $vk6]} + addi $keys, $keys, 16 + @{[vle32_v $vk7, $keys]} # rk[28:31] + @{[vsm4r_vs $vdata, $vk7]} + + # Save the ciphertext (in reverse element order) + @{[vrev8_v $vdata, $vdata]} + li $stride, -4 + addi $out, $out, 12 + @{[vsse32_v $vdata, $out, $stride]} + + ret +.size rv64i_zvksed_sm4_encrypt,.-rv64i_zvksed_sm4_encrypt +___ +} + +#### +# void rv64i_zvksed_sm4_decrypt(const unsigned char *in, unsigned char *out, +# const SM4_KEY *key); +# +{ +my ($in,$out,$keys,$stride)=("a0","a1","a2","t0"); +my ($vdata,$vk0,$vk1,$vk2,$vk3,$vk4,$vk5,$vk6,$vk7,$vgen)=("v1","v2","v3","v4","v5","v6","v7","v8","v9","v10"); +$code .= <<___; +.p2align 3 +.globl rv64i_zvksed_sm4_decrypt +.type rv64i_zvksed_sm4_decrypt,\@function +rv64i_zvksed_sm4_decrypt: + @{[vsetivli "zero", 4, "e32", "m1", "ta", "ma"]} + + # Load input data + @{[vle32_v $vdata, $in]} + @{[vrev8_v $vdata, $vdata]} + + # Order of key elements was adjusted in sm4_set_key() + # Decrypt with all keys + @{[vle32_v $vk7, $keys]} # rk[31:28] + @{[vsm4r_vs $vdata, $vk7]} + addi $keys, $keys, 16 + @{[vle32_v $vk6, $keys]} # rk[27:24] + @{[vsm4r_vs $vdata, $vk6]} + addi $keys, $keys, 16 + @{[vle32_v $vk5, $keys]} # rk[23:20] + @{[vsm4r_vs $vdata, $vk5]} + addi $keys, $keys, 16 + @{[vle32_v $vk4, $keys]} # rk[19:16] + @{[vsm4r_vs $vdata, $vk4]} + addi $keys, $keys, 16 + @{[vle32_v $vk3, $keys]} # rk[15:11] + @{[vsm4r_vs $vdata, $vk3]} + addi $keys, $keys, 16 + @{[vle32_v $vk2, $keys]} # rk[11:8] + @{[vsm4r_vs $vdata, $vk2]} + addi $keys, $keys, 16 + @{[vle32_v $vk1, $keys]} # rk[7:4] + @{[vsm4r_vs $vdata, $vk1]} + addi $keys, $keys, 16 + @{[vle32_v $vk0, $keys]} # rk[3:0] + @{[vsm4r_vs $vdata, $vk0]} + + # Save the ciphertext (in reverse element order) + @{[vrev8_v $vdata, $vdata]} + li $stride, -4 + addi $out, $out, 12 + @{[vsse32_v $vdata, $out, $stride]} + + ret +.size rv64i_zvksed_sm4_decrypt,.-rv64i_zvksed_sm4_decrypt +___ +} + +$code .= <<___; +# Family Key (little-endian 32-bit chunks) +.p2align 3 +FK: + .word 0xA3B1BAC6, 0x56AA3350, 0x677D9197, 0xB27022DC +.size FK,.-FK +___ + +print $code; + +close STDOUT or die "error closing STDOUT: $!"; From patchwork Mon Nov 27 07:07:02 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13469235 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by 
smtp.lore.kernel.org (Postfix) with ESMTPS id B5E02C07D5A for ; Mon, 27 Nov 2023 07:08:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=2Un6sPFQgYDqfFwtlvTgEKDxCSnsZMPUQ7wVHqjPulE=; b=ch+jTwkooffdeU qOo4CvaY5NlTRTZnUUbTQNz19CTLqwNldHlBBnuvdY1jB6a6R8fwK3FIFtefTjJdZgaC/x6GZ/Jjm JCDsbzYKTMz6D2DVMhAjGRO7DCZo/qAk1Z+oH7PQYugDeMseKhggCtZQr+Oat8RgrkYcsF7arvQkf ijxVhtT1e4W/Z7GRfDdvnKnHQjGFqAF23i97TkZdi8GDgAUuyrZq7xN9lLZgEyJphzaHc9/75fteT nXy3OXuoSKZO19hgvd5046+mPp//WSnH0AjDsLi0ZRKLYa7fN77mYSMt9x8AFCZXe+13R/eZTkiGb DtAotDk60SrVhx8EW1hQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1r7Vin-001f8P-1N; Mon, 27 Nov 2023 07:07:57 +0000 Received: from mail-pl1-x62b.google.com ([2607:f8b0:4864:20::62b]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1r7Vig-001f2K-16 for linux-riscv@lists.infradead.org; Mon, 27 Nov 2023 07:07:53 +0000 Received: by mail-pl1-x62b.google.com with SMTP id d9443c01a7336-1cfa168faefso19524055ad.3 for ; Sun, 26 Nov 2023 23:07:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1701068870; x=1701673670; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=yncBKNuxBeXEf1T1sdiq2nXhaNeK0KdzufyHRNnVzQo=; b=VN8SFypZN8pHvMXvyiU/du+HwpMiqNYZ4rIp3DSCe9lNy7eSpCEnEap2PgfS9Vrw7F lluieiUuM+W48j3tOKOpaHC5eZ4N7+I0iN29n16fqIG5DkPqR99R1dKXD2XzDq7+JPKc QAvoiDLn2Wb1EAHbuqgGMQkS17QyJoG20wHE5DNFE03b11fWJfdp6kQ1ZBtpGctk9c6T 9mpV2yJMjBAG1sSRL6pC/fLAW83aTJbSrhJ/1vQGCcAu6HuwarAlmgNeFwpJ+uks7I+o kxraYBZLDs5oHJL/z7qvKXR1HlIxkcIxoF/MErGz6Ai+C3eFGYwOcbUGqZgWwHCK778H 2+Iw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701068870; x=1701673670; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=yncBKNuxBeXEf1T1sdiq2nXhaNeK0KdzufyHRNnVzQo=; b=WjrKV/icjzGg2l2NFCsC9eonuHegZbIbXR5kcMsAtBXKnLreqi8fR8K66zWc3ZsQei fzTzFO66zlZGvdXfNj6tIYy3ZCB31/3h7zKxEvRNA2ql221ScI7GaNbHaXcYJHqbgbky ctNo/UyCQTgkUpPKf7LZnF6LGA4CELS1CGZYbk2iBGXikwtY+cgi6ymSD+sCpglyONC3 qmOG2erX79dCEI+L6Bhc40W/lG61POw0hBkUE9UuSHCYNxMoTEEHW/+nOSvS0WF4db2b KUvARunPhWFZZeXPCmt/jlAxieYDXLiAwWfyvJ2nVFQL+ikm501d2pS8oOuIGu2XdHgY AeIA== X-Gm-Message-State: AOJu0YwPqOmeAQNK462dLiB/kZ0Hy9mIHg/m+zaQS3R6YKG9yFFaQnYf G84E7GBWPFu3SqX1CEb6NLYLHw== X-Google-Smtp-Source: AGHT+IGIlYXxV0/6Hz/XM9VtJY6cDuUp1mCFCtnsRfP772eQikHti4XfmoPs6aOvt6V4rIekwrrhow== X-Received: by 2002:a17:902:ee82:b0:1c6:2ae1:dc28 with SMTP id a2-20020a170902ee8200b001c62ae1dc28mr10388718pld.36.1701068869624; Sun, 26 Nov 2023 23:07:49 -0800 (PST) Received: from localhost.localdomain ([101.10.45.230]) by smtp.gmail.com with ESMTPSA id jh15-20020a170903328f00b001cfcd3a764esm1340134plb.77.2023.11.26.23.07.46 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sun, 26 Nov 2023 23:07:49 -0800 (PST) From: Jerry Shih To: paul.walmsley@sifive.com, palmer@dabbelt.com, 
aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v2 12/13] RISC-V: crypto: add Zvksh accelerated SM3 implementation Date: Mon, 27 Nov 2023 15:07:02 +0800 Message-Id: <20231127070703.1697-13-jerry.shih@sifive.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20231127070703.1697-1-jerry.shih@sifive.com> References: <20231127070703.1697-1-jerry.shih@sifive.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231126_230750_432078_F87AA026 X-CRM114-Status: GOOD ( 31.70 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org Add SM3 implementation using Zvksh vector crypto extension from OpenSSL (openssl/openssl#21923). Co-developed-by: Christoph Müllner Signed-off-by: Christoph Müllner Co-developed-by: Heiko Stuebner Signed-off-by: Heiko Stuebner Signed-off-by: Jerry Shih --- Changelog v2: - Do not turn on kconfig `SM3_RISCV64` option by default. - Add `asmlinkage` qualifier for crypto asm function. - Rename sm3-riscv64-zvkb-zvksh to sm3-riscv64-zvksh-zvkb. - Reorder structure sm3_alg members initialization in the order declared. --- arch/riscv/crypto/Kconfig | 12 ++ arch/riscv/crypto/Makefile | 7 + arch/riscv/crypto/sm3-riscv64-glue.c | 124 +++++++++++++ arch/riscv/crypto/sm3-riscv64-zvksh.pl | 230 +++++++++++++++++++++++++ 4 files changed, 373 insertions(+) create mode 100644 arch/riscv/crypto/sm3-riscv64-glue.c create mode 100644 arch/riscv/crypto/sm3-riscv64-zvksh.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index b28cf1972250..7415fb303785 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -66,6 +66,18 @@ config CRYPTO_SHA512_RISCV64 - Zvknhb vector crypto extension - Zvkb vector crypto extension +config CRYPTO_SM3_RISCV64 + tristate "Hash functions: SM3 (ShangMi 3)" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_HASH + select CRYPTO_SM3 + help + SM3 (ShangMi 3) secure hash function (OSCCA GM/T 0004-2012) + + Architecture: riscv64 using: + - Zvksh vector crypto extension + - Zvkb vector crypto extension + config CRYPTO_SM4_RISCV64 tristate "Ciphers: SM4 (ShangMi 4)" depends on 64BIT && RISCV_ISA_V diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index 8e34861bba34..b1f857695c1c 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -18,6 +18,9 @@ sha256-riscv64-y := sha256-riscv64-glue.o sha256-riscv64-zvknha_or_zvknhb-zvkb.o obj-$(CONFIG_CRYPTO_SHA512_RISCV64) += sha512-riscv64.o sha512-riscv64-y := sha512-riscv64-glue.o sha512-riscv64-zvknhb-zvkb.o +obj-$(CONFIG_CRYPTO_SM3_RISCV64) += sm3-riscv64.o +sm3-riscv64-y := sm3-riscv64-glue.o sm3-riscv64-zvksh.o + obj-$(CONFIG_CRYPTO_SM4_RISCV64) += sm4-riscv64.o sm4-riscv64-y := sm4-riscv64-glue.o sm4-riscv64-zvksed.o @@ -42,6 +45,9 @@ $(obj)/sha256-riscv64-zvknha_or_zvknhb-zvkb.S: $(src)/sha256-riscv64-zvknha_or_z $(obj)/sha512-riscv64-zvknhb-zvkb.S: $(src)/sha512-riscv64-zvknhb-zvkb.pl $(call cmd,perlasm) +$(obj)/sm3-riscv64-zvksh.S: $(src)/sm3-riscv64-zvksh.pl + $(call cmd,perlasm) + 
$(obj)/sm4-riscv64-zvksed.S: $(src)/sm4-riscv64-zvksed.pl $(call cmd,perlasm) @@ -51,4 +57,5 @@ clean-files += aes-riscv64-zvkned-zvkb.S clean-files += ghash-riscv64-zvkg.S clean-files += sha256-riscv64-zvknha_or_zvknhb-zvkb.S clean-files += sha512-riscv64-zvknhb-zvkb.S +clean-files += sm3-riscv64-zvksh.S clean-files += sm4-riscv64-zvksed.S diff --git a/arch/riscv/crypto/sm3-riscv64-glue.c b/arch/riscv/crypto/sm3-riscv64-glue.c new file mode 100644 index 000000000000..63c7af338877 --- /dev/null +++ b/arch/riscv/crypto/sm3-riscv64-glue.c @@ -0,0 +1,124 @@ +// SPDX-License-Identifier: GPL-2.0-or-later +/* + * Linux/riscv64 port of the OpenSSL SM3 implementation for RISC-V 64 + * + * Copyright (C) 2023 VRULL GmbH + * Author: Heiko Stuebner + * + * Copyright (C) 2023 SiFive, Inc. + * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include + +/* + * sm3 using zvksh vector crypto extension + * + * This asm function will just take the first 256-bit as the sm3 state from + * the pointer to `struct sm3_state`. + */ +asmlinkage void ossl_hwsm3_block_data_order_zvksh(struct sm3_state *digest, + u8 const *o, int num); + +static int riscv64_sm3_update(struct shash_desc *desc, const u8 *data, + unsigned int len) +{ + int ret = 0; + + /* + * Make sure struct sm3_state begins directly with the SM3 256-bit internal + * state, as this is what the asm function expect. + */ + BUILD_BUG_ON(offsetof(struct sm3_state, state) != 0); + + if (crypto_simd_usable()) { + kernel_vector_begin(); + ret = sm3_base_do_update(desc, data, len, + ossl_hwsm3_block_data_order_zvksh); + kernel_vector_end(); + } else { + sm3_update(shash_desc_ctx(desc), data, len); + } + + return ret; +} + +static int riscv64_sm3_finup(struct shash_desc *desc, const u8 *data, + unsigned int len, u8 *out) +{ + struct sm3_state *ctx; + + if (crypto_simd_usable()) { + kernel_vector_begin(); + if (len) + sm3_base_do_update(desc, data, len, + ossl_hwsm3_block_data_order_zvksh); + sm3_base_do_finalize(desc, ossl_hwsm3_block_data_order_zvksh); + kernel_vector_end(); + + return sm3_base_finish(desc, out); + } + + ctx = shash_desc_ctx(desc); + if (len) + sm3_update(ctx, data, len); + sm3_final(ctx, out); + + return 0; +} + +static int riscv64_sm3_final(struct shash_desc *desc, u8 *out) +{ + return riscv64_sm3_finup(desc, NULL, 0, out); +} + +static struct shash_alg sm3_alg = { + .init = sm3_base_init, + .update = riscv64_sm3_update, + .final = riscv64_sm3_final, + .finup = riscv64_sm3_finup, + .descsize = sizeof(struct sm3_state), + .digestsize = SM3_DIGEST_SIZE, + .base = { + .cra_blocksize = SM3_BLOCK_SIZE, + .cra_priority = 150, + .cra_name = "sm3", + .cra_driver_name = "sm3-riscv64-zvksh-zvkb", + .cra_module = THIS_MODULE, + }, +}; + +static inline bool check_sm3_ext(void) +{ + return riscv_isa_extension_available(NULL, ZVKSH) && + riscv_isa_extension_available(NULL, ZVKB) && + riscv_vector_vlen() >= 128; +} + +static int __init riscv64_riscv64_sm3_mod_init(void) +{ + if (check_sm3_ext()) + return crypto_register_shash(&sm3_alg); + + return -ENODEV; +} + +static void __exit riscv64_sm3_mod_fini(void) +{ + crypto_unregister_shash(&sm3_alg); +} + +module_init(riscv64_riscv64_sm3_mod_init); +module_exit(riscv64_sm3_mod_fini); + +MODULE_DESCRIPTION("SM3 (RISC-V accelerated)"); +MODULE_AUTHOR("Heiko Stuebner "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("sm3"); diff --git a/arch/riscv/crypto/sm3-riscv64-zvksh.pl b/arch/riscv/crypto/sm3-riscv64-zvksh.pl new file mode 100644 index 
000000000000..942d78d982e9 --- /dev/null +++ b/arch/riscv/crypto/sm3-riscv64-zvksh.pl @@ -0,0 +1,230 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023 The OpenSSL Project Authors. All Rights Reserved. +# +# Licensed under the Apache License 2.0 (the "License"). You can obtain +# a copy in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Christoph Müllner +# Copyright (c) 2023, Jerry Shih +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# The generated code of this file depends on the following RISC-V extensions: +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 +# - RISC-V Vector Cryptography Bit-manipulation extension ('Zvkb') +# - RISC-V Vector SM3 Secure Hash extension ('Zvksh') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? shift : undef; + +$output and open STDOUT,">$output"; + +my $code=<<___; +.text +___ + +################################################################################ +# ossl_hwsm3_block_data_order_zvksh(SM3_CTX *c, const void *p, size_t num); +{ +my ($CTX, $INPUT, $NUM) = ("a0", "a1", "a2"); +my ($V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, + $V8, $V9, $V10, $V11, $V12, $V13, $V14, $V15, + $V16, $V17, $V18, $V19, $V20, $V21, $V22, $V23, + $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, +) = map("v$_",(0..31)); + +$code .= <<___; +.text +.p2align 3 +.globl ossl_hwsm3_block_data_order_zvksh +.type ossl_hwsm3_block_data_order_zvksh,\@function +ossl_hwsm3_block_data_order_zvksh: + @{[vsetivli "zero", 8, "e32", "m2", "ta", "ma"]} + + # Load initial state of hash context (c->A-H). + @{[vle32_v $V0, $CTX]} + @{[vrev8_v $V0, $V0]} + +L_sm3_loop: + # Copy the previous state to v2. 
+ # It will be XOR'ed with the current state at the end of the round. + @{[vmv_v_v $V2, $V0]} + + # Load the 64B block in 2x32B chunks. + @{[vle32_v $V6, $INPUT]} # v6 := {w7, ..., w0} + addi $INPUT, $INPUT, 32 + + @{[vle32_v $V8, $INPUT]} # v8 := {w15, ..., w8} + addi $INPUT, $INPUT, 32 + + addi $NUM, $NUM, -1 + + # As vsm3c consumes only w0, w1, w4, w5 we need to slide the input + # 2 elements down so we process elements w2, w3, w6, w7 + # This will be repeated for each odd round. + @{[vslidedown_vi $V4, $V6, 2]} # v4 := {X, X, w7, ..., w2} + + @{[vsm3c_vi $V0, $V6, 0]} + @{[vsm3c_vi $V0, $V4, 1]} + + # Prepare a vector with {w11, ..., w4} + @{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, X, X, w7, ..., w4} + @{[vslideup_vi $V4, $V8, 4]} # v4 := {w11, w10, w9, w8, w7, w6, w5, w4} + + @{[vsm3c_vi $V0, $V4, 2]} + @{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, w11, w10, w9, w8, w7, w6} + @{[vsm3c_vi $V0, $V4, 3]} + + @{[vsm3c_vi $V0, $V8, 4]} + @{[vslidedown_vi $V4, $V8, 2]} # v4 := {X, X, w15, w14, w13, w12, w11, w10} + @{[vsm3c_vi $V0, $V4, 5]} + + @{[vsm3me_vv $V6, $V8, $V6]} # v6 := {w23, w22, w21, w20, w19, w18, w17, w16} + + # Prepare a register with {w19, w18, w17, w16, w15, w14, w13, w12} + @{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, X, X, w15, w14, w13, w12} + @{[vslideup_vi $V4, $V6, 4]} # v4 := {w19, w18, w17, w16, w15, w14, w13, w12} + + @{[vsm3c_vi $V0, $V4, 6]} + @{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, w19, w18, w17, w16, w15, w14} + @{[vsm3c_vi $V0, $V4, 7]} + + @{[vsm3c_vi $V0, $V6, 8]} + @{[vslidedown_vi $V4, $V6, 2]} # v4 := {X, X, w23, w22, w21, w20, w19, w18} + @{[vsm3c_vi $V0, $V4, 9]} + + @{[vsm3me_vv $V8, $V6, $V8]} # v8 := {w31, w30, w29, w28, w27, w26, w25, w24} + + # Prepare a register with {w27, w26, w25, w24, w23, w22, w21, w20} + @{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, X, X, w23, w22, w21, w20} + @{[vslideup_vi $V4, $V8, 4]} # v4 := {w27, w26, w25, w24, w23, w22, w21, w20} + + @{[vsm3c_vi $V0, $V4, 10]} + @{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, w27, w26, w25, w24, w23, w22} + @{[vsm3c_vi $V0, $V4, 11]} + + @{[vsm3c_vi $V0, $V8, 12]} + @{[vslidedown_vi $V4, $V8, 2]} # v4 := {x, X, w31, w30, w29, w28, w27, w26} + @{[vsm3c_vi $V0, $V4, 13]} + + @{[vsm3me_vv $V6, $V8, $V6]} # v6 := {w32, w33, w34, w35, w36, w37, w38, w39} + + # Prepare a register with {w35, w34, w33, w32, w31, w30, w29, w28} + @{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, X, X, w31, w30, w29, w28} + @{[vslideup_vi $V4, $V6, 4]} # v4 := {w35, w34, w33, w32, w31, w30, w29, w28} + + @{[vsm3c_vi $V0, $V4, 14]} + @{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, w35, w34, w33, w32, w31, w30} + @{[vsm3c_vi $V0, $V4, 15]} + + @{[vsm3c_vi $V0, $V6, 16]} + @{[vslidedown_vi $V4, $V6, 2]} # v4 := {X, X, w39, w38, w37, w36, w35, w34} + @{[vsm3c_vi $V0, $V4, 17]} + + @{[vsm3me_vv $V8, $V6, $V8]} # v8 := {w47, w46, w45, w44, w43, w42, w41, w40} + + # Prepare a register with {w43, w42, w41, w40, w39, w38, w37, w36} + @{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, X, X, w39, w38, w37, w36} + @{[vslideup_vi $V4, $V8, 4]} # v4 := {w43, w42, w41, w40, w39, w38, w37, w36} + + @{[vsm3c_vi $V0, $V4, 18]} + @{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, w43, w42, w41, w40, w39, w38} + @{[vsm3c_vi $V0, $V4, 19]} + + @{[vsm3c_vi $V0, $V8, 20]} + @{[vslidedown_vi $V4, $V8, 2]} # v4 := {X, X, w47, w46, w45, w44, w43, w42} + @{[vsm3c_vi $V0, $V4, 21]} + + @{[vsm3me_vv $V6, $V8, $V6]} # v6 := {w55, w54, w53, w52, w51, w50, w49, w48} + + # Prepare a register with {w51, w50, w49, w48, w47, w46, w45, w44} + 
@{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, X, X, w47, w46, w45, w44} + @{[vslideup_vi $V4, $V6, 4]} # v4 := {w51, w50, w49, w48, w47, w46, w45, w44} + + @{[vsm3c_vi $V0, $V4, 22]} + @{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, w51, w50, w49, w48, w47, w46} + @{[vsm3c_vi $V0, $V4, 23]} + + @{[vsm3c_vi $V0, $V6, 24]} + @{[vslidedown_vi $V4, $V6, 2]} # v4 := {X, X, w55, w54, w53, w52, w51, w50} + @{[vsm3c_vi $V0, $V4, 25]} + + @{[vsm3me_vv $V8, $V6, $V8]} # v8 := {w63, w62, w61, w60, w59, w58, w57, w56} + + # Prepare a register with {w59, w58, w57, w56, w55, w54, w53, w52} + @{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, X, X, w55, w54, w53, w52} + @{[vslideup_vi $V4, $V8, 4]} # v4 := {w59, w58, w57, w56, w55, w54, w53, w52} + + @{[vsm3c_vi $V0, $V4, 26]} + @{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, w59, w58, w57, w56, w55, w54} + @{[vsm3c_vi $V0, $V4, 27]} + + @{[vsm3c_vi $V0, $V8, 28]} + @{[vslidedown_vi $V4, $V8, 2]} # v4 := {X, X, w63, w62, w61, w60, w59, w58} + @{[vsm3c_vi $V0, $V4, 29]} + + @{[vsm3me_vv $V6, $V8, $V6]} # v6 := {w71, w70, w69, w68, w67, w66, w65, w64} + + # Prepare a register with {w67, w66, w65, w64, w63, w62, w61, w60} + @{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, X, X, w63, w62, w61, w60} + @{[vslideup_vi $V4, $V6, 4]} # v4 := {w67, w66, w65, w64, w63, w62, w61, w60} + + @{[vsm3c_vi $V0, $V4, 30]} + @{[vslidedown_vi $V4, $V4, 2]} # v4 := {X, X, w67, w66, w65, w64, w63, w62} + @{[vsm3c_vi $V0, $V4, 31]} + + # XOR in the previous state. + @{[vxor_vv $V0, $V0, $V2]} + + bnez $NUM, L_sm3_loop # Check if there are any more block to process +L_sm3_end: + @{[vrev8_v $V0, $V0]} + @{[vse32_v $V0, $CTX]} + ret + +.size ossl_hwsm3_block_data_order_zvksh,.-ossl_hwsm3_block_data_order_zvksh +___ +} + +print $code; + +close STDOUT or die "error closing STDOUT: $!"; From patchwork Mon Nov 27 07:07:03 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jerry Shih X-Patchwork-Id: 13469236 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 18323C4167B for ; Mon, 27 Nov 2023 07:08:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-Id:Date:Subject:Cc:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=3VRVncjsHS+IYh2Qk+zJN7cfkeeSGPzzBm9dppM7EYY=; b=zn0lwwFlzby7jF i/ZfBHykyt10cda8bd9FTWTttF+ge3IdIbB/a8eet7dz8ZMJv+18zTb+xjaFpxNYNQaqX2MYqkaoW iLCzxbix9M5kSAopm9H8zaKCnY/NIBYmB9Uem8z+lwXni2VRCPm0RVkhN3L4A/vibekgOU7B/t7Zi R1LFw8rzDnU34CExKf0odwMqQJkKm2XCLv1jdl6fDGBaDsR/nuBAvSEQeFu90I0BkuWteWE1PjJrr GcZGu9v7b549qvwJ3/S3w6Jlo4ekA/dGC+t1UcjltiV4i+jLk8TNzHdaOO6+fyf8VMYcOFb80pRKN 11RYvJJ3WOztZltmWoKw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1r7Vip-001fAp-2v; Mon, 27 Nov 2023 07:07:59 +0000 Received: from mail-pf1-x430.google.com ([2607:f8b0:4864:20::430]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red 
Hat Linux)) id 1r7Vij-001f5J-0f for linux-riscv@lists.infradead.org; Mon, 27 Nov 2023 07:07:56 +0000 Received: by mail-pf1-x430.google.com with SMTP id d2e1a72fcca58-6cbb71c3020so3305519b3a.1 for ; Sun, 26 Nov 2023 23:07:53 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=sifive.com; s=google; t=1701068872; x=1701673672; darn=lists.infradead.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=65aClSoG7Vvogs/9DSpZCzSUcof37h/9Wvainxt/D1s=; b=ik9RBrzSgZfLoNVRpMRLXMLDW/1+e/vBapJ/f6q/t2pWLFm8m8Z0gZz+XlCz5c74ox zaELTLYP0CsfznQq9A6KIM6C3wwEEyIuOp0bXrjEXlMasgNEQUj8ME8ENoOYc5x09iBl 8di0TjA36evS/nBDuK06rv6/wzu4JLvY7GkzSyHy6Jvp9s2AlZ4hjK8TveSy+82qtSUD U+cMkkQglCPChLxkFOnNcArcCaxaSZ3MWWgjP/1nPN2u5wTOFx5fjLT/+ebEy9GQv3fO Xsws9MOejfH1Sk12Inw4HWHTF66TgMPYuTZGF5+Cd0awfc6wQLVsDiusubLb3Em+9us2 i4+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1701068872; x=1701673672; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=65aClSoG7Vvogs/9DSpZCzSUcof37h/9Wvainxt/D1s=; b=NaJwb0vOBw+sSuoB67u3APXdl3KQYBgXHjjF6Cg23hRSgmnnTy3PDwi1bc9eVQmj7r 1bP+7iHNlwMXqt9vkT1DdgjNgJ+M1EB6ov4tQOFicAyq7dkaPeFY0Xddv2EH8F/04q1K Um1kVb4Weto6ajp+lJ7acZ3/5e8JzQgkueBUdypcV+sFLjvFfKhAvdr0qFdozJhBpR+i IWJtOkBtfGnxKR1e3UdmQ5DJeMyli5A/AgGuNVf+64YdW0ogIe7Z7RCbOQ6kHeogoflB iJfP7zq+jySCt0n5Bwyym/ipcf7U6JzDhW/QW7K5YEK+wivjhLFbD8R2FlHxLRulCgyB 1bIw== X-Gm-Message-State: AOJu0Yxak1b0EELas+TEjn120qzsjUP6ejy4pxG9imeqYA82ZroK/86D ExSKXdBS/VCGcg88sLSymf4t+A== X-Google-Smtp-Source: AGHT+IHcnauxtqQExXlH4Csib1LUNoeakDZRIE0yrXb2DtrTMmRFI7ZHKdBCpbiQLIdAZt7kTNKppQ== X-Received: by 2002:a17:902:c942:b0:1cd:f823:456d with SMTP id i2-20020a170902c94200b001cdf823456dmr14867788pla.20.1701068872601; Sun, 26 Nov 2023 23:07:52 -0800 (PST) Received: from localhost.localdomain ([101.10.45.230]) by smtp.gmail.com with ESMTPSA id jh15-20020a170903328f00b001cfcd3a764esm1340134plb.77.2023.11.26.23.07.49 (version=TLS1_2 cipher=ECDHE-ECDSA-AES128-GCM-SHA256 bits=128/128); Sun, 26 Nov 2023 23:07:52 -0800 (PST) From: Jerry Shih To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, herbert@gondor.apana.org.au, davem@davemloft.net, conor.dooley@microchip.com, ebiggers@kernel.org, ardb@kernel.org Cc: heiko@sntech.de, phoebe.chen@sifive.com, hongrong.hsu@sifive.com, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org Subject: [PATCH v2 13/13] RISC-V: crypto: add Zvkb accelerated ChaCha20 implementation Date: Mon, 27 Nov 2023 15:07:03 +0800 Message-Id: <20231127070703.1697-14-jerry.shih@sifive.com> X-Mailer: git-send-email 2.28.0 In-Reply-To: <20231127070703.1697-1-jerry.shih@sifive.com> References: <20231127070703.1697-1-jerry.shih@sifive.com> MIME-Version: 1.0 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231126_230753_328470_5A00A887 X-CRM114-Status: GOOD ( 26.69 ) X-BeenThere: linux-riscv@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-riscv" Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org Add a ChaCha20 vector implementation from OpenSSL(openssl/openssl#21923). 
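For reference, a minimal C sketch of the IV handling this driver uses (see the v2 changelog note below about keeping a small IV buffer instead of a whole ChaCha state matrix). The assumption, taken from the glue code in this patch, is that the first little-endian IV word is the 32-bit block counter that the driver advances by the number of full 64-byte blocks processed, and the remaining three words carry the rest of the IV/nonce. The helper names here are illustrative only and are not part of the patch.

#include <linux/types.h>
#include <asm/unaligned.h>

/*
 * Illustrative sketch (not from this patch): map the 16-byte skcipher IV
 * onto the u32[4] counter argument passed to ChaCha20_ctr32_zvkb().
 */
static void chacha20_iv_to_words(const u8 iv[16], u32 words[4])
{
	words[0] = get_unaligned_le32(iv);	/* initial block counter */
	words[1] = get_unaligned_le32(iv + 4);	/* rest of the IV/nonce */
	words[2] = get_unaligned_le32(iv + 8);
	words[3] = get_unaligned_le32(iv + 12);
}

/* After n full 64-byte blocks, only the counter word advances. */
static void chacha20_advance_counter(u32 words[4], unsigned int nblocks)
{
	words[0] += nblocks;
}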
Signed-off-by: Jerry Shih --- Changelog v2: - Do not turn on kconfig `CHACHA20_RISCV64` option by default. - Use simd skcipher interface. - Add `asmlinkage` qualifier for crypto asm function. - Reorder structure riscv64_chacha_alg_zvkb members initialization in the order declared. - Use smaller iv buffer instead of whole state matrix as chacha20's input. --- arch/riscv/crypto/Kconfig | 12 + arch/riscv/crypto/Makefile | 7 + arch/riscv/crypto/chacha-riscv64-glue.c | 122 +++++++++ arch/riscv/crypto/chacha-riscv64-zvkb.pl | 321 +++++++++++++++++++++++ 4 files changed, 462 insertions(+) create mode 100644 arch/riscv/crypto/chacha-riscv64-glue.c create mode 100644 arch/riscv/crypto/chacha-riscv64-zvkb.pl diff --git a/arch/riscv/crypto/Kconfig b/arch/riscv/crypto/Kconfig index 7415fb303785..1932297a1e73 100644 --- a/arch/riscv/crypto/Kconfig +++ b/arch/riscv/crypto/Kconfig @@ -34,6 +34,18 @@ config CRYPTO_AES_BLOCK_RISCV64 - Zvkb vector crypto extension (CTR/XTS) - Zvkg vector crypto extension (XTS) +config CRYPTO_CHACHA20_RISCV64 + tristate "Ciphers: ChaCha20" + depends on 64BIT && RISCV_ISA_V + select CRYPTO_SIMD + select CRYPTO_SKCIPHER + select CRYPTO_LIB_CHACHA_GENERIC + help + Length-preserving ciphers: ChaCha20 stream cipher algorithm + + Architecture: riscv64 using: + - Zvkb vector crypto extension + config CRYPTO_GHASH_RISCV64 tristate "Hash functions: GHASH" depends on 64BIT && RISCV_ISA_V diff --git a/arch/riscv/crypto/Makefile b/arch/riscv/crypto/Makefile index b1f857695c1c..748c53aa38dc 100644 --- a/arch/riscv/crypto/Makefile +++ b/arch/riscv/crypto/Makefile @@ -9,6 +9,9 @@ aes-riscv64-y := aes-riscv64-glue.o aes-riscv64-zvkned.o obj-$(CONFIG_CRYPTO_AES_BLOCK_RISCV64) += aes-block-riscv64.o aes-block-riscv64-y := aes-riscv64-block-mode-glue.o aes-riscv64-zvkned-zvbb-zvkg.o aes-riscv64-zvkned-zvkb.o +obj-$(CONFIG_CRYPTO_CHACHA20_RISCV64) += chacha-riscv64.o +chacha-riscv64-y := chacha-riscv64-glue.o chacha-riscv64-zvkb.o + obj-$(CONFIG_CRYPTO_GHASH_RISCV64) += ghash-riscv64.o ghash-riscv64-y := ghash-riscv64-glue.o ghash-riscv64-zvkg.o @@ -36,6 +39,9 @@ $(obj)/aes-riscv64-zvkned-zvbb-zvkg.S: $(src)/aes-riscv64-zvkned-zvbb-zvkg.pl $(obj)/aes-riscv64-zvkned-zvkb.S: $(src)/aes-riscv64-zvkned-zvkb.pl $(call cmd,perlasm) +$(obj)/chacha-riscv64-zvkb.S: $(src)/chacha-riscv64-zvkb.pl + $(call cmd,perlasm) + $(obj)/ghash-riscv64-zvkg.S: $(src)/ghash-riscv64-zvkg.pl $(call cmd,perlasm) @@ -54,6 +60,7 @@ $(obj)/sm4-riscv64-zvksed.S: $(src)/sm4-riscv64-zvksed.pl clean-files += aes-riscv64-zvkned.S clean-files += aes-riscv64-zvkned-zvbb-zvkg.S clean-files += aes-riscv64-zvkned-zvkb.S +clean-files += chacha-riscv64-zvkb.S clean-files += ghash-riscv64-zvkg.S clean-files += sha256-riscv64-zvknha_or_zvknhb-zvkb.S clean-files += sha512-riscv64-zvknhb-zvkb.S diff --git a/arch/riscv/crypto/chacha-riscv64-glue.c b/arch/riscv/crypto/chacha-riscv64-glue.c new file mode 100644 index 000000000000..96047cb75222 --- /dev/null +++ b/arch/riscv/crypto/chacha-riscv64-glue.c @@ -0,0 +1,122 @@ +// SPDX-License-Identifier: GPL-2.0-only +/* + * Port of the OpenSSL ChaCha20 implementation for RISC-V 64 + * + * Copyright (C) 2023 SiFive, Inc. 
+ * Author: Jerry Shih + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include + +/* chacha20 using zvkb vector crypto extension */ +asmlinkage void ChaCha20_ctr32_zvkb(u8 *out, const u8 *input, size_t len, + const u32 *key, const u32 *counter); + +static int chacha20_encrypt(struct skcipher_request *req) +{ + u32 iv[CHACHA_IV_SIZE / sizeof(u32)]; + u8 block_buffer[CHACHA_BLOCK_SIZE]; + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + const struct chacha_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + unsigned int nbytes; + unsigned int tail_bytes; + int err; + + iv[0] = get_unaligned_le32(req->iv); + iv[1] = get_unaligned_le32(req->iv + 4); + iv[2] = get_unaligned_le32(req->iv + 8); + iv[3] = get_unaligned_le32(req->iv + 12); + + err = skcipher_walk_virt(&walk, req, false); + while (walk.nbytes) { + nbytes = walk.nbytes & (~(CHACHA_BLOCK_SIZE - 1)); + tail_bytes = walk.nbytes & (CHACHA_BLOCK_SIZE - 1); + kernel_vector_begin(); + if (nbytes) { + ChaCha20_ctr32_zvkb(walk.dst.virt.addr, + walk.src.virt.addr, nbytes, + ctx->key, iv); + iv[0] += nbytes / CHACHA_BLOCK_SIZE; + } + if (walk.nbytes == walk.total && tail_bytes > 0) { + memcpy(block_buffer, walk.src.virt.addr + nbytes, + tail_bytes); + ChaCha20_ctr32_zvkb(block_buffer, block_buffer, + CHACHA_BLOCK_SIZE, ctx->key, iv); + memcpy(walk.dst.virt.addr + nbytes, block_buffer, + tail_bytes); + tail_bytes = 0; + } + kernel_vector_end(); + + err = skcipher_walk_done(&walk, tail_bytes); + } + + return err; +} + +static struct skcipher_alg riscv64_chacha_alg_zvkb[] = { + { + .setkey = chacha20_setkey, + .encrypt = chacha20_encrypt, + .decrypt = chacha20_encrypt, + .min_keysize = CHACHA_KEY_SIZE, + .max_keysize = CHACHA_KEY_SIZE, + .ivsize = CHACHA_IV_SIZE, + .chunksize = CHACHA_BLOCK_SIZE, + .walksize = CHACHA_BLOCK_SIZE * 4, + .base = { + .cra_flags = CRYPTO_ALG_INTERNAL, + .cra_blocksize = 1, + .cra_ctxsize = sizeof(struct chacha_ctx), + .cra_priority = 300, + .cra_name = "__chacha20", + .cra_driver_name = "__chacha20-riscv64-zvkb", + .cra_module = THIS_MODULE, + }, + } +}; + +static struct simd_skcipher_alg + *riscv64_chacha_simd_alg_zvkb[ARRAY_SIZE(riscv64_chacha_alg_zvkb)]; + +static inline bool check_chacha20_ext(void) +{ + return riscv_isa_extension_available(NULL, ZVKB) && + riscv_vector_vlen() >= 128; +} + +static int __init riscv64_chacha_mod_init(void) +{ + if (check_chacha20_ext()) + return simd_register_skciphers_compat( + riscv64_chacha_alg_zvkb, + ARRAY_SIZE(riscv64_chacha_alg_zvkb), + riscv64_chacha_simd_alg_zvkb); + + return -ENODEV; +} + +static void __exit riscv64_chacha_mod_fini(void) +{ + simd_unregister_skciphers(riscv64_chacha_alg_zvkb, + ARRAY_SIZE(riscv64_chacha_alg_zvkb), + riscv64_chacha_simd_alg_zvkb); +} + +module_init(riscv64_chacha_mod_init); +module_exit(riscv64_chacha_mod_fini); + +MODULE_DESCRIPTION("ChaCha20 (RISC-V accelerated)"); +MODULE_AUTHOR("Jerry Shih "); +MODULE_LICENSE("GPL"); +MODULE_ALIAS_CRYPTO("chacha20"); diff --git a/arch/riscv/crypto/chacha-riscv64-zvkb.pl b/arch/riscv/crypto/chacha-riscv64-zvkb.pl new file mode 100644 index 000000000000..6602ef79452f --- /dev/null +++ b/arch/riscv/crypto/chacha-riscv64-zvkb.pl @@ -0,0 +1,321 @@ +#! /usr/bin/env perl +# SPDX-License-Identifier: Apache-2.0 OR BSD-2-Clause +# +# This file is dual-licensed, meaning that you can use it under your +# choice of either of the following two licenses: +# +# Copyright 2023-2023 The OpenSSL Project Authors. All Rights Reserved. 
+# +# Licensed under the Apache License 2.0 (the "License"). You may not use +# this file except in compliance with the License. You can obtain a copy +# in the file LICENSE in the source distribution or at +# https://www.openssl.org/source/license.html +# +# or +# +# Copyright (c) 2023, Jerry Shih +# All rights reserved. +# +# Redistribution and use in source and binary forms, with or without +# modification, are permitted provided that the following conditions +# are met: +# 1. Redistributions of source code must retain the above copyright +# notice, this list of conditions and the following disclaimer. +# 2. Redistributions in binary form must reproduce the above copyright +# notice, this list of conditions and the following disclaimer in the +# documentation and/or other materials provided with the distribution. +# +# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS +# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT +# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR +# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT +# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, +# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT +# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, +# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY +# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT +# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE +# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +# - RV64I +# - RISC-V Vector ('V') with VLEN >= 128 +# - RISC-V Vector Cryptography Bit-manipulation extension ('Zvkb') + +use strict; +use warnings; + +use FindBin qw($Bin); +use lib "$Bin"; +use lib "$Bin/../../perlasm"; +use riscv; + +# $output is the last argument if it looks like a file (it has an extension) +# $flavour is the first argument if it doesn't look like a file +my $output = $#ARGV >= 0 && $ARGV[$#ARGV] =~ m|\.\w+$| ? pop : undef; +my $flavour = $#ARGV >= 0 && $ARGV[0] !~ m|\.| ? shift : undef; + +$output and open STDOUT, ">$output"; + +my $code = <<___; +.text +___ + +# void ChaCha20_ctr32_zvkb(unsigned char *out, const unsigned char *inp, +# size_t len, const unsigned int key[8], +# const unsigned int counter[4]); +################################################################################ +my ( $OUTPUT, $INPUT, $LEN, $KEY, $COUNTER ) = ( "a0", "a1", "a2", "a3", "a4" ); +my ( $T0 ) = ( "t0" ); +my ( $CONST_DATA0, $CONST_DATA1, $CONST_DATA2, $CONST_DATA3 ) = + ( "a5", "a6", "a7", "t1" ); +my ( $KEY0, $KEY1, $KEY2,$KEY3, $KEY4, $KEY5, $KEY6, $KEY7, + $COUNTER0, $COUNTER1, $NONCE0, $NONCE1 +) = ( "s0", "s1", "s2", "s3", "s4", "s5", "s6", + "s7", "s8", "s9", "s10", "s11" ); +my ( $VL, $STRIDE, $CHACHA_LOOP_COUNT ) = ( "t2", "t3", "t4" ); +my ( + $V0, $V1, $V2, $V3, $V4, $V5, $V6, $V7, $V8, $V9, $V10, + $V11, $V12, $V13, $V14, $V15, $V16, $V17, $V18, $V19, $V20, $V21, + $V22, $V23, $V24, $V25, $V26, $V27, $V28, $V29, $V30, $V31, +) = map( "v$_", ( 0 .. 
31 ) ); + +sub chacha_quad_round_group { + my ( + $A0, $B0, $C0, $D0, $A1, $B1, $C1, $D1, + $A2, $B2, $C2, $D2, $A3, $B3, $C3, $D3 + ) = @_; + + my $code = <<___; + # a += b; d ^= a; d <<<= 16; + @{[vadd_vv $A0, $A0, $B0]} + @{[vadd_vv $A1, $A1, $B1]} + @{[vadd_vv $A2, $A2, $B2]} + @{[vadd_vv $A3, $A3, $B3]} + @{[vxor_vv $D0, $D0, $A0]} + @{[vxor_vv $D1, $D1, $A1]} + @{[vxor_vv $D2, $D2, $A2]} + @{[vxor_vv $D3, $D3, $A3]} + @{[vror_vi $D0, $D0, 32 - 16]} + @{[vror_vi $D1, $D1, 32 - 16]} + @{[vror_vi $D2, $D2, 32 - 16]} + @{[vror_vi $D3, $D3, 32 - 16]} + # c += d; b ^= c; b <<<= 12; + @{[vadd_vv $C0, $C0, $D0]} + @{[vadd_vv $C1, $C1, $D1]} + @{[vadd_vv $C2, $C2, $D2]} + @{[vadd_vv $C3, $C3, $D3]} + @{[vxor_vv $B0, $B0, $C0]} + @{[vxor_vv $B1, $B1, $C1]} + @{[vxor_vv $B2, $B2, $C2]} + @{[vxor_vv $B3, $B3, $C3]} + @{[vror_vi $B0, $B0, 32 - 12]} + @{[vror_vi $B1, $B1, 32 - 12]} + @{[vror_vi $B2, $B2, 32 - 12]} + @{[vror_vi $B3, $B3, 32 - 12]} + # a += b; d ^= a; d <<<= 8; + @{[vadd_vv $A0, $A0, $B0]} + @{[vadd_vv $A1, $A1, $B1]} + @{[vadd_vv $A2, $A2, $B2]} + @{[vadd_vv $A3, $A3, $B3]} + @{[vxor_vv $D0, $D0, $A0]} + @{[vxor_vv $D1, $D1, $A1]} + @{[vxor_vv $D2, $D2, $A2]} + @{[vxor_vv $D3, $D3, $A3]} + @{[vror_vi $D0, $D0, 32 - 8]} + @{[vror_vi $D1, $D1, 32 - 8]} + @{[vror_vi $D2, $D2, 32 - 8]} + @{[vror_vi $D3, $D3, 32 - 8]} + # c += d; b ^= c; b <<<= 7; + @{[vadd_vv $C0, $C0, $D0]} + @{[vadd_vv $C1, $C1, $D1]} + @{[vadd_vv $C2, $C2, $D2]} + @{[vadd_vv $C3, $C3, $D3]} + @{[vxor_vv $B0, $B0, $C0]} + @{[vxor_vv $B1, $B1, $C1]} + @{[vxor_vv $B2, $B2, $C2]} + @{[vxor_vv $B3, $B3, $C3]} + @{[vror_vi $B0, $B0, 32 - 7]} + @{[vror_vi $B1, $B1, 32 - 7]} + @{[vror_vi $B2, $B2, 32 - 7]} + @{[vror_vi $B3, $B3, 32 - 7]} +___ + + return $code; +} + +$code .= <<___; +.p2align 3 +.globl ChaCha20_ctr32_zvkb +.type ChaCha20_ctr32_zvkb,\@function +ChaCha20_ctr32_zvkb: + srli $LEN, $LEN, 6 + beqz $LEN, .Lend + + addi sp, sp, -96 + sd s0, 0(sp) + sd s1, 8(sp) + sd s2, 16(sp) + sd s3, 24(sp) + sd s4, 32(sp) + sd s5, 40(sp) + sd s6, 48(sp) + sd s7, 56(sp) + sd s8, 64(sp) + sd s9, 72(sp) + sd s10, 80(sp) + sd s11, 88(sp) + + li $STRIDE, 64 + + #### chacha block data + # "expa" little endian + li $CONST_DATA0, 0x61707865 + # "nd 3" little endian + li $CONST_DATA1, 0x3320646e + # "2-by" little endian + li $CONST_DATA2, 0x79622d32 + # "te k" little endian + li $CONST_DATA3, 0x6b206574 + + lw $KEY0, 0($KEY) + lw $KEY1, 4($KEY) + lw $KEY2, 8($KEY) + lw $KEY3, 12($KEY) + lw $KEY4, 16($KEY) + lw $KEY5, 20($KEY) + lw $KEY6, 24($KEY) + lw $KEY7, 28($KEY) + + lw $COUNTER0, 0($COUNTER) + lw $COUNTER1, 4($COUNTER) + lw $NONCE0, 8($COUNTER) + lw $NONCE1, 12($COUNTER) + +.Lblock_loop: + @{[vsetvli $VL, $LEN, "e32", "m1", "ta", "ma"]} + + # init chacha const states + @{[vmv_v_x $V0, $CONST_DATA0]} + @{[vmv_v_x $V1, $CONST_DATA1]} + @{[vmv_v_x $V2, $CONST_DATA2]} + @{[vmv_v_x $V3, $CONST_DATA3]} + + # init chacha key states + @{[vmv_v_x $V4, $KEY0]} + @{[vmv_v_x $V5, $KEY1]} + @{[vmv_v_x $V6, $KEY2]} + @{[vmv_v_x $V7, $KEY3]} + @{[vmv_v_x $V8, $KEY4]} + @{[vmv_v_x $V9, $KEY5]} + @{[vmv_v_x $V10, $KEY6]} + @{[vmv_v_x $V11, $KEY7]} + + # init chacha key states + @{[vid_v $V12]} + @{[vadd_vx $V12, $V12, $COUNTER0]} + @{[vmv_v_x $V13, $COUNTER1]} + + # init chacha nonce states + @{[vmv_v_x $V14, $NONCE0]} + @{[vmv_v_x $V15, $NONCE1]} + + # load the top-half of input data + @{[vlsseg_nf_e32_v 8, $V16, $INPUT, $STRIDE]} + + li $CHACHA_LOOP_COUNT, 10 +.Lround_loop: + addi $CHACHA_LOOP_COUNT, $CHACHA_LOOP_COUNT, -1 + 
@{[chacha_quad_round_group + $V0, $V4, $V8, $V12, + $V1, $V5, $V9, $V13, + $V2, $V6, $V10, $V14, + $V3, $V7, $V11, $V15]} + @{[chacha_quad_round_group + $V0, $V5, $V10, $V15, + $V1, $V6, $V11, $V12, + $V2, $V7, $V8, $V13, + $V3, $V4, $V9, $V14]} + bnez $CHACHA_LOOP_COUNT, .Lround_loop + + # load the bottom-half of input data + addi $T0, $INPUT, 32 + @{[vlsseg_nf_e32_v 8, $V24, $T0, $STRIDE]} + + # add chacha top-half initial block states + @{[vadd_vx $V0, $V0, $CONST_DATA0]} + @{[vadd_vx $V1, $V1, $CONST_DATA1]} + @{[vadd_vx $V2, $V2, $CONST_DATA2]} + @{[vadd_vx $V3, $V3, $CONST_DATA3]} + @{[vadd_vx $V4, $V4, $KEY0]} + @{[vadd_vx $V5, $V5, $KEY1]} + @{[vadd_vx $V6, $V6, $KEY2]} + @{[vadd_vx $V7, $V7, $KEY3]} + # xor with the top-half input + @{[vxor_vv $V16, $V16, $V0]} + @{[vxor_vv $V17, $V17, $V1]} + @{[vxor_vv $V18, $V18, $V2]} + @{[vxor_vv $V19, $V19, $V3]} + @{[vxor_vv $V20, $V20, $V4]} + @{[vxor_vv $V21, $V21, $V5]} + @{[vxor_vv $V22, $V22, $V6]} + @{[vxor_vv $V23, $V23, $V7]} + + # save the top-half of output + @{[vssseg_nf_e32_v 8, $V16, $OUTPUT, $STRIDE]} + + # add chacha bottom-half initial block states + @{[vadd_vx $V8, $V8, $KEY4]} + @{[vadd_vx $V9, $V9, $KEY5]} + @{[vadd_vx $V10, $V10, $KEY6]} + @{[vadd_vx $V11, $V11, $KEY7]} + @{[vid_v $V0]} + @{[vadd_vx $V12, $V12, $COUNTER0]} + @{[vadd_vx $V13, $V13, $COUNTER1]} + @{[vadd_vx $V14, $V14, $NONCE0]} + @{[vadd_vx $V15, $V15, $NONCE1]} + @{[vadd_vv $V12, $V12, $V0]} + # xor with the bottom-half input + @{[vxor_vv $V24, $V24, $V8]} + @{[vxor_vv $V25, $V25, $V9]} + @{[vxor_vv $V26, $V26, $V10]} + @{[vxor_vv $V27, $V27, $V11]} + @{[vxor_vv $V29, $V29, $V13]} + @{[vxor_vv $V28, $V28, $V12]} + @{[vxor_vv $V30, $V30, $V14]} + @{[vxor_vv $V31, $V31, $V15]} + + # save the bottom-half of output + addi $T0, $OUTPUT, 32 + @{[vssseg_nf_e32_v 8, $V24, $T0, $STRIDE]} + + # update counter + add $COUNTER0, $COUNTER0, $VL + sub $LEN, $LEN, $VL + # increase offset for `4 * 16 * VL = 64 * VL` + slli $T0, $VL, 6 + add $INPUT, $INPUT, $T0 + add $OUTPUT, $OUTPUT, $T0 + bnez $LEN, .Lblock_loop + + ld s0, 0(sp) + ld s1, 8(sp) + ld s2, 16(sp) + ld s3, 24(sp) + ld s4, 32(sp) + ld s5, 40(sp) + ld s6, 48(sp) + ld s7, 56(sp) + ld s8, 64(sp) + ld s9, 72(sp) + ld s10, 80(sp) + ld s11, 88(sp) + addi sp, sp, 96 + +.Lend: + ret +.size ChaCha20_ctr32_zvkb,.-ChaCha20_ctr32_zvkb +___ + +print $code; + +close STDOUT or die "error closing STDOUT: $!";
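For cross-checking the vector scheduling above, here is a scalar reference sketch with the same (out, in, len, key, counter) contract as ChaCha20_ctr32_zvkb(): ten double rounds of the standard ChaCha quarter round (left rotations by 16/12/8/7, which the vector code expresses as vror_vi by 32-16, 32-12, 32-8 and 32-7) over the 16-word state whose first row is the "expand 32-byte k" constants loaded above. It assumes len is a multiple of 64 bytes, matching how the glue code only passes whole blocks to the assembly; the function name chacha20_ctr32_ref() is made up for illustration and is not part of the patch or of any kernel API.

#include <linux/types.h>
#include <linux/string.h>
#include <linux/bitops.h>
#include <asm/unaligned.h>

/* One ChaCha quarter round on four 32-bit words (RFC 7539 rotations). */
#define CHACHA_QR(a, b, c, d) do {			\
	a += b; d ^= a; d = rol32(d, 16);		\
	c += d; b ^= c; b = rol32(b, 12);		\
	a += b; d ^= a; d = rol32(d, 8);		\
	c += d; b ^= c; b = rol32(b, 7);		\
} while (0)

/*
 * Scalar reference sketch (not part of the patch): process whole 64-byte
 * blocks, with key[8] and counter[4] as host-order 32-bit words.  Only
 * counter word 0 advances per block, matching the vector loop above.
 */
static void chacha20_ctr32_ref(u8 *out, const u8 *in, size_t len,
			       const u32 key[8], const u32 counter[4])
{
	u32 iv[4] = { counter[0], counter[1], counter[2], counter[3] };

	for (; len >= 64; len -= 64, in += 64, out += 64) {
		/* Row 0 is "expa" "nd 3" "2-by" "te k", as in the asm. */
		u32 x[16] = { 0x61707865, 0x3320646e, 0x79622d32, 0x6b206574,
			      key[0], key[1], key[2], key[3],
			      key[4], key[5], key[6], key[7],
			      iv[0], iv[1], iv[2], iv[3] };
		u32 s[16];
		int i;

		memcpy(s, x, sizeof(s));
		for (i = 0; i < 10; i++) {
			/* Column rounds: (0,4,8,12) ... (3,7,11,15). */
			CHACHA_QR(x[0], x[4], x[8],  x[12]);
			CHACHA_QR(x[1], x[5], x[9],  x[13]);
			CHACHA_QR(x[2], x[6], x[10], x[14]);
			CHACHA_QR(x[3], x[7], x[11], x[15]);
			/* Diagonal rounds: (0,5,10,15) ... (3,4,9,14). */
			CHACHA_QR(x[0], x[5], x[10], x[15]);
			CHACHA_QR(x[1], x[6], x[11], x[12]);
			CHACHA_QR(x[2], x[7], x[8],  x[13]);
			CHACHA_QR(x[3], x[4], x[9],  x[14]);
		}

		/* Keystream = final state + initial state, serialized LE. */
		for (i = 0; i < 16; i++)
			put_unaligned_le32(get_unaligned_le32(in + 4 * i) ^
					   (x[i] + s[i]), out + 4 * i);

		iv[0]++;	/* 32-bit block counter */
	}
}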