Message ID | 20220926093620.99898-17-tianjia.zhang@linux.alibaba.com (mailing list archive) |
State | Superseded |
Delegated to: | Herbert Xu |
Series | Optimizing SM3 and SM4 algorithms using NEON/CE/SVE instructions
(cc Mark Brown)

Hello Tianjia,

On Mon, 26 Sept 2022 at 11:37, Tianjia Zhang
<tianjia.zhang@linux.alibaba.com> wrote:
>
> Scalable Vector Extension (SVE) is the next-generation SIMD extension for
> arm64. SVE allows flexible vector length implementations with a range of
> possible values in CPU implementations. The vector length can vary from a
> minimum of 128 bits up to a maximum of 2048 bits, at 128-bit increments.
> The SVE design guarantees that the same application can run on different
> implementations that support SVE, without the need to recompile the code.
>
> SVE was originally introduced by ARMv8, and ARMv9 introduced SVE2 to
> expand and improve it. Similar to the Crypto Extension supported by the
> NEON instruction set for the algorithm, SVE also supports the similar
> instructions, called cryptography acceleration instructions, but this is
> also optional instruction set.
>
> This patch uses SM4 cryptography acceleration instructions and SVE2
> instructions to optimize the SM4 algorithm for ECB/CBC/CFB/CTR modes.
> Since the encryption of CBC/CFB cannot be parallelized, the Crypto
> Extension instruction is used.
>

Given that we currently do not support the use of SVE in kernel mode,
this patch cannot be accepted at this time (but the rest of the series
looks reasonable to me, although I have only skimmed over the patches).

In view of the disappointing benchmark results below, I don't think
this is worth the hassle at the moment. If we can find a case where
using SVE in kernel mode truly makes a [favorable] difference, we can
revisit this, but not without a thorough analysis of the impact it
will have to support SVE in the kernel. Also, the fact that SVE may
also cover cryptographic extensions does not necessarily imply that a
micro-architecture will perform those crypto transformations in
parallel, and so the performance may be the same even if VL > 128.

In summary, please drop this patch for now, and once there are more
encouraging performance numbers, please resubmit it as part of a
series that explicitly enables SVE in kernel mode on arm64, and
documents the requirements and constraints.

I have cc'ed Mark, who has been working on the SVE support and who
might have something to add here as well.

Thanks,
Ard.

> Since no test environment with a Vector Length (VL) greater than 128 bits
> was found, the performance data was obtained on a machine with a VL is
> 128 bits, because this driver is enabled when the VL is greater than 128
> bits, so this performance is only for reference. It can be seen from the
> data that there is little difference between the data optimized by Crypto
> Extension and SVE (VL=128 bits), and the optimization effect will be more
> obvious when VL=256 bits or longer.
>
> Benchmark on T-Head Yitian-710 2.75 GHz, the data comes from the 218 mode
> of tcrypt, and compared with that optimized by Crypto Extension. The
> abscissas are blocks of different lengths.
> The data is tabulated and the unit is Mb/s:
>
> sm4-ce      |     16      64     128     256    1024    1420    4096
> ------------+--------------------------------------------------------------
> ECB enc     | 315.18 1162.65 1815.66 2553.50 3692.91 3727.20 4001.93
> ECB dec     | 316.06 1172.97 1817.81 2554.66 3692.18 3786.54 4001.93
> CBC enc     | 304.82  629.54  768.65  864.72  953.90  963.32  974.06
> CBC dec     | 306.05 1142.53 1805.11 2481.67 3522.06 3587.87 3790.99
> CFB enc     | 309.48  635.70  774.44  865.85  950.62  952.68  968.24
> CFB dec     | 315.98 1170.38 1828.75 2509.72 3543.63 3539.40 3793.25
> CTR enc     | 285.83 1036.59 1583.50 2147.26 2933.54 2954.66 3041.14
> CTR dec     | 285.29 1037.47 1584.67 2145.51 2934.10 2950.89 3041.62
>
> sm4-sve-ce (VL = 128 bits)
> ECB enc     | 310.00 1154.70 1813.26 2579.74 3766.90 3869.45 4100.26
> ECB dec     | 315.60 1176.22 1838.06 2593.69 3774.95 3878.42 4098.83
> CBC enc     | 303.44  622.65  764.67  861.40  953.18  963.05  973.77
> CBC dec     | 302.13 1091.15 1689.10 2267.79 3182.84 3242.68 3408.92
> CFB enc     | 296.62  620.41  762.94  858.96  948.18  956.04  967.67
> CFB dec     | 291.23 1065.50 1637.33 2228.12 3158.52 3213.35 3403.83
> CTR enc     | 272.27  959.35 1466.34 1934.24 2562.80 2595.87 2695.15
> CTR dec     | 273.40  963.65 1471.83 1938.97 2563.12 2597.25 2694.54
>
> Signed-off-by: Tianjia Zhang <tianjia.zhang@linux.alibaba.com>
> ---
>  arch/arm64/crypto/Kconfig           |   19 +
>  arch/arm64/crypto/Makefile          |    3 +
>  arch/arm64/crypto/sm4-sve-ce-core.S | 1028 +++++++++++++++++++++++++++
>  arch/arm64/crypto/sm4-sve-ce-glue.c |  332 +++
>  4 files changed, 1382 insertions(+)
>  create mode 100644 arch/arm64/crypto/sm4-sve-ce-core.S
>  create mode 100644 arch/arm64/crypto/sm4-sve-ce-glue.c
>
> diff --git a/arch/arm64/crypto/Kconfig b/arch/arm64/crypto/Kconfig
> index 6793d5bc3ee5..bbb5a7a08af5 100644
> --- a/arch/arm64/crypto/Kconfig
> +++ b/arch/arm64/crypto/Kconfig
> @@ -249,6 +249,25 @@ config CRYPTO_SM4_ARM64_CE_BLK
>  	    - ARMv8 Crypto Extensions
>  	    - NEON (Advanced SIMD) extensions
>
> +config CRYPTO_SM4_ARM64_SVE_CE_BLK
> +	tristate "Ciphers: SM4, modes: ECB/CBC/CFB/CTR (ARMv9 cryptography acceleration with SVE2)"
> +	depends on KERNEL_MODE_NEON
> +	select CRYPTO_SKCIPHER
> +	select CRYPTO_SM4
> +	select CRYPTO_SM4_ARM64_CE_BLK
> +	help
> +	  Length-preserving ciphers: SM4 cipher algorithms (OSCCA GB/T 32907-2016)
> +	  with block cipher modes:
> +	  - ECB (Electronic Codebook) mode (NIST SP800-38A)
> +	  - CBC (Cipher Block Chaining) mode (NIST SP800-38A)
> +	  - CFB (Cipher Feedback) mode (NIST SP800-38A)
> +	  - CTR (Counter) mode (NIST SP800-38A)
> +
> +	  Architecture: arm64 using:
> +	  - ARMv8 Crypto Extensions
> +	  - ARMv9 cryptography acceleration with SVE2
> +	  - NEON (Advanced SIMD) extensions
> +
>  config CRYPTO_SM4_ARM64_NEON_BLK
>  	tristate "Ciphers: SM4, modes: ECB/CBC/CFB/CTR (NEON)"
>  	depends on KERNEL_MODE_NEON
> diff --git a/arch/arm64/crypto/Makefile b/arch/arm64/crypto/Makefile
> index 4818e204c2ac..355dd9053434 100644
> --- a/arch/arm64/crypto/Makefile
> +++ b/arch/arm64/crypto/Makefile
> @@ -38,6 +38,9 @@ sm4-ce-gcm-y := sm4-ce-gcm-glue.o sm4-ce-gcm-core.o
>  obj-$(CONFIG_CRYPTO_SM4_ARM64_NEON_BLK) += sm4-neon.o
>  sm4-neon-y := sm4-neon-glue.o sm4-neon-core.o
>
> +obj-$(CONFIG_CRYPTO_SM4_ARM64_SVE_CE_BLK) += sm4-sve-ce.o
> +sm4-sve-ce-y := sm4-sve-ce-glue.o sm4-sve-ce-core.o
> +
>  obj-$(CONFIG_CRYPTO_GHASH_ARM64_CE) += ghash-ce.o
>  ghash-ce-y := ghash-ce-glue.o ghash-ce-core.o
>
> diff --git a/arch/arm64/crypto/sm4-sve-ce-core.S b/arch/arm64/crypto/sm4-sve-ce-core.S
> new file mode 100644
> index 000000000000..caecbdf2536c
> ---
/dev/null > +++ b/arch/arm64/crypto/sm4-sve-ce-core.S > @@ -0,0 +1,1028 @@ > +/* SPDX-License-Identifier: GPL-2.0-or-later */ > +/* > + * SM4 Cipher Algorithm for ARMv9 Crypto Extensions with SVE2 > + * as specified in > + * https://tools.ietf.org/id/draft-ribose-cfrg-sm4-10.html > + * > + * Copyright (C) 2022, Alibaba Group. > + * Copyright (C) 2022 Tianjia Zhang <tianjia.zhang@linux.alibaba.com> > + */ > + > +#include <linux/linkage.h> > +#include <asm/assembler.h> > + > +.arch armv8-a+crypto+sve+sve2 > + > +.irp b, 0, 15, 24, 25, 26, 27, 28, 29, 30, 31 > + .set .Lv\b\().4s, \b > +.endr > + > +.irp b, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, \ > + 16, 24, 25, 26, 27, 28, 29, 30, 31 > + .set .Lz\b\().s, \b > +.endr > + > +.macro sm4e, vd, vn > + .inst 0xcec08400 | (.L\vn << 5) | .L\vd > +.endm > + > +.macro sm4e_sve, zd, zn > + .inst 0x4523e000 | (.L\zn << 5) | .L\zd > +.endm > + > + > +/* Register macros */ > + > +#define RCTR z16 > +#define RCTRv v16 > +#define RIV z16 > +#define RIVv v16 > +#define RSWAP128 z17 > +#define RZERO z18 > +#define RLE128_INC z19 > + > +#define RTMP0 z20 > +#define RTMP0v v20 > +#define RTMP1 z21 > +#define RTMP2 z22 > +#define RTMP3 z23 > + > + > +/* Helper macros. */ > + > +#define SM4_PREPARE(ptr) \ > + adr_l x7, .Lbswap128_mask; \ > + ptrue p0.b, ALL; \ > + rdvl x5, #1; \ > + ld1b {RSWAP128.b}, p0/z, [x7]; \ > + \ > + ld1 {v24.16b-v27.16b}, [ptr], #64; \ > + ld1 {v28.16b-v31.16b}, [ptr]; \ > + dup z24.q, z24.q[0]; \ > + dup z25.q, z25.q[0]; \ > + dup z26.q, z26.q[0]; \ > + dup z27.q, z27.q[0]; \ > + dup z28.q, z28.q[0]; \ > + dup z29.q, z29.q[0]; \ > + dup z30.q, z30.q[0]; \ > + dup z31.q, z31.q[0]; > + > +#define SM4_SVE_CE_CRYPT_BLK(b0) \ > + revb b0.s, p0/m, b0.s; \ > + sm4e_sve b0.s, z24.s; \ > + sm4e_sve b0.s, z25.s; \ > + sm4e_sve b0.s, z26.s; \ > + sm4e_sve b0.s, z27.s; \ > + sm4e_sve b0.s, z28.s; \ > + sm4e_sve b0.s, z29.s; \ > + sm4e_sve b0.s, z30.s; \ > + sm4e_sve b0.s, z31.s; \ > + tbl b0.b, {b0.b}, RSWAP128.b; \ > + revb b0.s, p0/m, b0.s; > + > +#define SM4_SVE_CE_CRYPT_BLK4(b0, b1, b2, b3) \ > + revb b0.s, p0/m, b0.s; \ > + revb b1.s, p0/m, b1.s; \ > + revb b2.s, p0/m, b2.s; \ > + revb b3.s, p0/m, b3.s; \ > + sm4e_sve b0.s, z24.s; \ > + sm4e_sve b1.s, z24.s; \ > + sm4e_sve b2.s, z24.s; \ > + sm4e_sve b3.s, z24.s; \ > + sm4e_sve b0.s, z25.s; \ > + sm4e_sve b1.s, z25.s; \ > + sm4e_sve b2.s, z25.s; \ > + sm4e_sve b3.s, z25.s; \ > + sm4e_sve b0.s, z26.s; \ > + sm4e_sve b1.s, z26.s; \ > + sm4e_sve b2.s, z26.s; \ > + sm4e_sve b3.s, z26.s; \ > + sm4e_sve b0.s, z27.s; \ > + sm4e_sve b1.s, z27.s; \ > + sm4e_sve b2.s, z27.s; \ > + sm4e_sve b3.s, z27.s; \ > + sm4e_sve b0.s, z28.s; \ > + sm4e_sve b1.s, z28.s; \ > + sm4e_sve b2.s, z28.s; \ > + sm4e_sve b3.s, z28.s; \ > + sm4e_sve b0.s, z29.s; \ > + sm4e_sve b1.s, z29.s; \ > + sm4e_sve b2.s, z29.s; \ > + sm4e_sve b3.s, z29.s; \ > + sm4e_sve b0.s, z30.s; \ > + sm4e_sve b1.s, z30.s; \ > + sm4e_sve b2.s, z30.s; \ > + sm4e_sve b3.s, z30.s; \ > + sm4e_sve b0.s, z31.s; \ > + sm4e_sve b1.s, z31.s; \ > + sm4e_sve b2.s, z31.s; \ > + sm4e_sve b3.s, z31.s; \ > + tbl b0.b, {b0.b}, RSWAP128.b; \ > + tbl b1.b, {b1.b}, RSWAP128.b; \ > + tbl b2.b, {b2.b}, RSWAP128.b; \ > + tbl b3.b, {b3.b}, RSWAP128.b; \ > + revb b0.s, p0/m, b0.s; \ > + revb b1.s, p0/m, b1.s; \ > + revb b2.s, p0/m, b2.s; \ > + revb b3.s, p0/m, b3.s; > + > +#define SM4_SVE_CE_CRYPT_BLK8(b0, b1, b2, b3, b4, b5, b6, b7) \ > + revb b0.s, p0/m, b0.s; \ > + revb b1.s, p0/m, b1.s; \ > + revb b2.s, p0/m, b2.s; \ > + revb b3.s, p0/m, b3.s; \ > + revb 
b4.s, p0/m, b4.s; \ > + revb b5.s, p0/m, b5.s; \ > + revb b6.s, p0/m, b6.s; \ > + revb b7.s, p0/m, b7.s; \ > + sm4e_sve b0.s, z24.s; \ > + sm4e_sve b1.s, z24.s; \ > + sm4e_sve b2.s, z24.s; \ > + sm4e_sve b3.s, z24.s; \ > + sm4e_sve b4.s, z24.s; \ > + sm4e_sve b5.s, z24.s; \ > + sm4e_sve b6.s, z24.s; \ > + sm4e_sve b7.s, z24.s; \ > + sm4e_sve b0.s, z25.s; \ > + sm4e_sve b1.s, z25.s; \ > + sm4e_sve b2.s, z25.s; \ > + sm4e_sve b3.s, z25.s; \ > + sm4e_sve b4.s, z25.s; \ > + sm4e_sve b5.s, z25.s; \ > + sm4e_sve b6.s, z25.s; \ > + sm4e_sve b7.s, z25.s; \ > + sm4e_sve b0.s, z26.s; \ > + sm4e_sve b1.s, z26.s; \ > + sm4e_sve b2.s, z26.s; \ > + sm4e_sve b3.s, z26.s; \ > + sm4e_sve b4.s, z26.s; \ > + sm4e_sve b5.s, z26.s; \ > + sm4e_sve b6.s, z26.s; \ > + sm4e_sve b7.s, z26.s; \ > + sm4e_sve b0.s, z27.s; \ > + sm4e_sve b1.s, z27.s; \ > + sm4e_sve b2.s, z27.s; \ > + sm4e_sve b3.s, z27.s; \ > + sm4e_sve b4.s, z27.s; \ > + sm4e_sve b5.s, z27.s; \ > + sm4e_sve b6.s, z27.s; \ > + sm4e_sve b7.s, z27.s; \ > + sm4e_sve b0.s, z28.s; \ > + sm4e_sve b1.s, z28.s; \ > + sm4e_sve b2.s, z28.s; \ > + sm4e_sve b3.s, z28.s; \ > + sm4e_sve b4.s, z28.s; \ > + sm4e_sve b5.s, z28.s; \ > + sm4e_sve b6.s, z28.s; \ > + sm4e_sve b7.s, z28.s; \ > + sm4e_sve b0.s, z29.s; \ > + sm4e_sve b1.s, z29.s; \ > + sm4e_sve b2.s, z29.s; \ > + sm4e_sve b3.s, z29.s; \ > + sm4e_sve b4.s, z29.s; \ > + sm4e_sve b5.s, z29.s; \ > + sm4e_sve b6.s, z29.s; \ > + sm4e_sve b7.s, z29.s; \ > + sm4e_sve b0.s, z30.s; \ > + sm4e_sve b1.s, z30.s; \ > + sm4e_sve b2.s, z30.s; \ > + sm4e_sve b3.s, z30.s; \ > + sm4e_sve b4.s, z30.s; \ > + sm4e_sve b5.s, z30.s; \ > + sm4e_sve b6.s, z30.s; \ > + sm4e_sve b7.s, z30.s; \ > + sm4e_sve b0.s, z31.s; \ > + sm4e_sve b1.s, z31.s; \ > + sm4e_sve b2.s, z31.s; \ > + sm4e_sve b3.s, z31.s; \ > + sm4e_sve b4.s, z31.s; \ > + sm4e_sve b5.s, z31.s; \ > + sm4e_sve b6.s, z31.s; \ > + sm4e_sve b7.s, z31.s; \ > + tbl b0.b, {b0.b}, RSWAP128.b; \ > + tbl b1.b, {b1.b}, RSWAP128.b; \ > + tbl b2.b, {b2.b}, RSWAP128.b; \ > + tbl b3.b, {b3.b}, RSWAP128.b; \ > + tbl b4.b, {b4.b}, RSWAP128.b; \ > + tbl b5.b, {b5.b}, RSWAP128.b; \ > + tbl b6.b, {b6.b}, RSWAP128.b; \ > + tbl b7.b, {b7.b}, RSWAP128.b; \ > + revb b0.s, p0/m, b0.s; \ > + revb b1.s, p0/m, b1.s; \ > + revb b2.s, p0/m, b2.s; \ > + revb b3.s, p0/m, b3.s; \ > + revb b4.s, p0/m, b4.s; \ > + revb b5.s, p0/m, b5.s; \ > + revb b6.s, p0/m, b6.s; \ > + revb b7.s, p0/m, b7.s; > + > +#define SM4_CE_CRYPT_BLK(b0) \ > + rev32 b0.16b, b0.16b; \ > + sm4e b0.4s, v24.4s; \ > + sm4e b0.4s, v25.4s; \ > + sm4e b0.4s, v26.4s; \ > + sm4e b0.4s, v27.4s; \ > + sm4e b0.4s, v28.4s; \ > + sm4e b0.4s, v29.4s; \ > + sm4e b0.4s, v30.4s; \ > + sm4e b0.4s, v31.4s; \ > + rev64 b0.4s, b0.4s; \ > + ext b0.16b, b0.16b, b0.16b, #8; \ > + rev32 b0.16b, b0.16b; > + > +#define inc_le128(zctr) \ > + mov RCTRv.d[1], x8; \ > + mov RCTRv.d[0], x7; \ > + mov zctr.d, RLE128_INC.d; \ > + dup RCTR.q, RCTR.q[0]; \ > + adds x8, x8, x5, LSR #4; \ > + adclt zctr.d, RCTR.d, RZERO.d; \ > + adclt RCTR.d, zctr.d, RZERO.d; \ > + adc x7, x7, xzr; \ > + trn1 zctr.d, RCTR.d, zctr.d; \ > + revb zctr.d, p0/m, zctr.d; > + > +#define inc_le128_4x(zctr0, zctr1, zctr2, zctr3) \ > + mov v8.d[1], x8; \ > + mov v8.d[0], x7; \ > + adds x8, x8, x5, LSR #4; \ > + mov zctr0.d, RLE128_INC.d; \ > + adc x7, x7, xzr; \ > + mov v9.d[1], x8; \ > + mov v9.d[0], x7; \ > + adds x8, x8, x5, LSR #4; \ > + mov zctr1.d, RLE128_INC.d; \ > + adc x7, x7, xzr; \ > + mov v10.d[1], x8; \ > + mov v10.d[0], x7; \ > + adds x8, x8, x5, LSR #4; \ > + mov zctr2.d, RLE128_INC.d; 
\ > + adc x7, x7, xzr; \ > + mov v11.d[1], x8; \ > + mov v11.d[0], x7; \ > + adds x8, x8, x5, LSR #4; \ > + mov zctr3.d, RLE128_INC.d; \ > + adc x7, x7, xzr; \ > + dup z8.q, z8.q[0]; \ > + dup z9.q, z9.q[0]; \ > + dup z10.q, z10.q[0]; \ > + dup z11.q, z11.q[0]; \ > + adclt zctr0.d, z8.d, RZERO.d; \ > + adclt zctr1.d, z9.d, RZERO.d; \ > + adclt zctr2.d, z10.d, RZERO.d; \ > + adclt zctr3.d, z11.d, RZERO.d; \ > + adclt z8.d, zctr0.d, RZERO.d; \ > + adclt z9.d, zctr1.d, RZERO.d; \ > + adclt z10.d, zctr2.d, RZERO.d; \ > + adclt z11.d, zctr3.d, RZERO.d; \ > + trn1 zctr0.d, z8.d, zctr0.d; \ > + trn1 zctr1.d, z9.d, zctr1.d; \ > + trn1 zctr2.d, z10.d, zctr2.d; \ > + trn1 zctr3.d, z11.d, zctr3.d; \ > + revb zctr0.d, p0/m, zctr0.d; \ > + revb zctr1.d, p0/m, zctr1.d; \ > + revb zctr2.d, p0/m, zctr2.d; \ > + revb zctr3.d, p0/m, zctr3.d; > + > +#define inc_le128_8x(zctr0, zctr1, zctr2, zctr3, \ > + zctr4, zctr5, zctr6, zctr7) \ > + mov v8.d[1], x8; \ > + mov v8.d[0], x7; \ > + adds x8, x8, x5, LSR #4; \ > + mov zctr0.d, RLE128_INC.d; \ > + adc x7, x7, xzr; \ > + mov v9.d[1], x8; \ > + mov v9.d[0], x7; \ > + adds x8, x8, x5, LSR #4; \ > + mov zctr1.d, RLE128_INC.d; \ > + adc x7, x7, xzr; \ > + mov v10.d[1], x8; \ > + mov v10.d[0], x7; \ > + adds x8, x8, x5, LSR #4; \ > + mov zctr2.d, RLE128_INC.d; \ > + adc x7, x7, xzr; \ > + mov v11.d[1], x8; \ > + mov v11.d[0], x7; \ > + adds x8, x8, x5, LSR #4; \ > + mov zctr3.d, RLE128_INC.d; \ > + adc x7, x7, xzr; \ > + mov v12.d[1], x8; \ > + mov v12.d[0], x7; \ > + adds x8, x8, x5, LSR #4; \ > + mov zctr4.d, RLE128_INC.d; \ > + adc x7, x7, xzr; \ > + mov v13.d[1], x8; \ > + mov v13.d[0], x7; \ > + adds x8, x8, x5, LSR #4; \ > + mov zctr5.d, RLE128_INC.d; \ > + adc x7, x7, xzr; \ > + mov v14.d[1], x8; \ > + mov v14.d[0], x7; \ > + adds x8, x8, x5, LSR #4; \ > + mov zctr6.d, RLE128_INC.d; \ > + adc x7, x7, xzr; \ > + mov v15.d[1], x8; \ > + mov v15.d[0], x7; \ > + adds x8, x8, x5, LSR #4; \ > + mov zctr7.d, RLE128_INC.d; \ > + adc x7, x7, xzr; \ > + dup z8.q, z8.q[0]; \ > + dup z9.q, z9.q[0]; \ > + dup z10.q, z10.q[0]; \ > + dup z11.q, z11.q[0]; \ > + dup z12.q, z12.q[0]; \ > + dup z13.q, z13.q[0]; \ > + dup z14.q, z14.q[0]; \ > + dup z15.q, z15.q[0]; \ > + adclt zctr0.d, z8.d, RZERO.d; \ > + adclt zctr1.d, z9.d, RZERO.d; \ > + adclt zctr2.d, z10.d, RZERO.d; \ > + adclt zctr3.d, z11.d, RZERO.d; \ > + adclt zctr4.d, z12.d, RZERO.d; \ > + adclt zctr5.d, z13.d, RZERO.d; \ > + adclt zctr6.d, z14.d, RZERO.d; \ > + adclt zctr7.d, z15.d, RZERO.d; \ > + adclt z8.d, zctr0.d, RZERO.d; \ > + adclt z9.d, zctr1.d, RZERO.d; \ > + adclt z10.d, zctr2.d, RZERO.d; \ > + adclt z11.d, zctr3.d, RZERO.d; \ > + adclt z12.d, zctr4.d, RZERO.d; \ > + adclt z13.d, zctr5.d, RZERO.d; \ > + adclt z14.d, zctr6.d, RZERO.d; \ > + adclt z15.d, zctr7.d, RZERO.d; \ > + trn1 zctr0.d, z8.d, zctr0.d; \ > + trn1 zctr1.d, z9.d, zctr1.d; \ > + trn1 zctr2.d, z10.d, zctr2.d; \ > + trn1 zctr3.d, z11.d, zctr3.d; \ > + trn1 zctr4.d, z12.d, zctr4.d; \ > + trn1 zctr5.d, z13.d, zctr5.d; \ > + trn1 zctr6.d, z14.d, zctr6.d; \ > + trn1 zctr7.d, z15.d, zctr7.d; \ > + revb zctr0.d, p0/m, zctr0.d; \ > + revb zctr1.d, p0/m, zctr1.d; \ > + revb zctr2.d, p0/m, zctr2.d; \ > + revb zctr3.d, p0/m, zctr3.d; \ > + revb zctr4.d, p0/m, zctr4.d; \ > + revb zctr5.d, p0/m, zctr5.d; \ > + revb zctr6.d, p0/m, zctr6.d; \ > + revb zctr7.d, p0/m, zctr7.d; > + > + > +.align 3 > +SYM_FUNC_START(sm4_sve_ce_crypt) > + /* input: > + * x0: round key array, CTX > + * x1: dst > + * x2: src > + * w3: nblocks > + */ > + uxtw x3, w3 > + 
SM4_PREPARE(x0) > + > +.Lcrypt_loop_8x: > + sub x3, x3, x5, LSR #1 /* x3 - (8 * VL) */ > + tbnz x3, #63, .Lcrypt_4x > + > + ld1b {z0.b}, p0/z, [x2] > + ld1b {z1.b}, p0/z, [x2, #1, MUL VL] > + ld1b {z2.b}, p0/z, [x2, #2, MUL VL] > + ld1b {z3.b}, p0/z, [x2, #3, MUL VL] > + ld1b {z4.b}, p0/z, [x2, #4, MUL VL] > + ld1b {z5.b}, p0/z, [x2, #5, MUL VL] > + ld1b {z6.b}, p0/z, [x2, #6, MUL VL] > + ld1b {z7.b}, p0/z, [x2, #7, MUL VL] > + > + SM4_SVE_CE_CRYPT_BLK8(z0, z1, z2, z3, z4, z5, z6, z7) > + > + st1b {z0.b}, p0, [x1] > + st1b {z1.b}, p0, [x1, #1, MUL VL] > + st1b {z2.b}, p0, [x1, #2, MUL VL] > + st1b {z3.b}, p0, [x1, #3, MUL VL] > + st1b {z4.b}, p0, [x1, #4, MUL VL] > + st1b {z5.b}, p0, [x1, #5, MUL VL] > + st1b {z6.b}, p0, [x1, #6, MUL VL] > + st1b {z7.b}, p0, [x1, #7, MUL VL] > + > + addvl x2, x2, #8 > + addvl x1, x1, #8 > + > + cbz x3, .Lcrypt_end > + b .Lcrypt_loop_8x > + > +.Lcrypt_4x: > + add x3, x3, x5, LSR #1 > + cmp x3, x5, LSR #2 > + blt .Lcrypt_loop_1x > + > + sub x3, x3, x5, LSR #2 /* x3 - (4 * VL) */ > + > + ld1b {z0.b}, p0/z, [x2] > + ld1b {z1.b}, p0/z, [x2, #1, MUL VL] > + ld1b {z2.b}, p0/z, [x2, #2, MUL VL] > + ld1b {z3.b}, p0/z, [x2, #3, MUL VL] > + > + SM4_SVE_CE_CRYPT_BLK4(z0, z1, z2, z3) > + > + st1b {z0.b}, p0, [x1] > + st1b {z1.b}, p0, [x1, #1, MUL VL] > + st1b {z2.b}, p0, [x1, #2, MUL VL] > + st1b {z3.b}, p0, [x1, #3, MUL VL] > + > + addvl x2, x2, #4 > + addvl x1, x1, #4 > + > + cbz x3, .Lcrypt_end > + > +.Lcrypt_loop_1x: > + cmp x3, x5, LSR #4 > + blt .Lcrypt_ce_loop_1x > + > + sub x3, x3, x5, LSR #4 /* x3 - VL */ > + > + ld1b {z0.b}, p0/z, [x2] > + > + SM4_SVE_CE_CRYPT_BLK(z0) > + > + st1b {z0.b}, p0, [x1] > + > + addvl x2, x2, #1 > + addvl x1, x1, #1 > + > + cbz x3, .Lcrypt_end > + b .Lcrypt_loop_1x > + > +.Lcrypt_ce_loop_1x: > + sub x3, x3, #1 > + > + ld1 {v0.16b}, [x2], #16 > + SM4_CE_CRYPT_BLK(v0) > + st1 {v0.16b}, [x1], #16 > + > + cbnz x3, .Lcrypt_ce_loop_1x > + > +.Lcrypt_end: > + ret > +SYM_FUNC_END(sm4_sve_ce_crypt) > + > +.align 3 > +SYM_FUNC_START(sm4_sve_ce_cbc_dec) > + /* input: > + * x0: round key array, CTX > + * x1: dst > + * x2: src > + * x3: iv (big endian, 128 bit) > + * w4: nblocks > + */ > + uxtw x4, w4 > + SM4_PREPARE(x0) > + > + ld1 {RIVv.16b}, [x3] > + ext RIV.b, RIV.b, RIV.b, #16 > + > +.Lcbc_dec_loop_8x: > + sub x4, x4, x5, LSR #1 /* x4 - (8 * VL) */ > + tbnz x4, #63, .Lcbc_dec_4x > + > + ld1b {z15.b}, p0/z, [x2] > + ld1b {z14.b}, p0/z, [x2, #1, MUL VL] > + ld1b {z13.b}, p0/z, [x2, #2, MUL VL] > + ld1b {z12.b}, p0/z, [x2, #3, MUL VL] > + ld1b {z11.b}, p0/z, [x2, #4, MUL VL] > + ld1b {z10.b}, p0/z, [x2, #5, MUL VL] > + ld1b {z9.b}, p0/z, [x2, #6, MUL VL] > + ld1b {z8.b}, p0/z, [x2, #7, MUL VL] > + rev z0.b, z15.b > + rev z1.b, z14.b > + rev z2.b, z13.b > + rev z3.b, z12.b > + rev z4.b, z11.b > + rev z5.b, z10.b > + rev z6.b, z9.b > + rev z7.b, z8.b > + rev RTMP0.b, RIV.b > + ext z7.b, z7.b, z6.b, #16 > + ext z6.b, z6.b, z5.b, #16 > + ext z5.b, z5.b, z4.b, #16 > + ext z4.b, z4.b, z3.b, #16 > + ext z3.b, z3.b, z2.b, #16 > + ext z2.b, z2.b, z1.b, #16 > + ext z1.b, z1.b, z0.b, #16 > + ext z0.b, z0.b, RTMP0.b, #16 > + rev z7.b, z7.b > + rev z6.b, z6.b > + rev z5.b, z5.b > + rev z4.b, z4.b > + rev z3.b, z3.b > + rev z2.b, z2.b > + rev z1.b, z1.b > + rev z0.b, z0.b > + mov RIV.d, z8.d > + > + SM4_SVE_CE_CRYPT_BLK8(z15, z14, z13, z12, z11, z10, z9, z8) > + > + eor z0.d, z0.d, z15.d > + eor z1.d, z1.d, z14.d > + eor z2.d, z2.d, z13.d > + eor z3.d, z3.d, z12.d > + eor z4.d, z4.d, z11.d > + eor z5.d, z5.d, z10.d > + eor z6.d, z6.d, z9.d > + eor z7.d, 
z7.d, z8.d > + st1b {z0.b}, p0, [x1] > + st1b {z1.b}, p0, [x1, #1, MUL VL] > + st1b {z2.b}, p0, [x1, #2, MUL VL] > + st1b {z3.b}, p0, [x1, #3, MUL VL] > + st1b {z4.b}, p0, [x1, #4, MUL VL] > + st1b {z5.b}, p0, [x1, #5, MUL VL] > + st1b {z6.b}, p0, [x1, #6, MUL VL] > + st1b {z7.b}, p0, [x1, #7, MUL VL] > + > + addvl x2, x2, #8 > + addvl x1, x1, #8 > + > + cbz x4, .Lcbc_dec_end > + b .Lcbc_dec_loop_8x > + > +.Lcbc_dec_4x: > + add x4, x4, x5, LSR #1 > + cmp x4, x5, LSR #2 > + blt .Lcbc_dec_loop_1x > + > + sub x4, x4, x5, LSR #2 /* x4 - (4 * VL) */ > + > + ld1b {z15.b}, p0/z, [x2] > + ld1b {z14.b}, p0/z, [x2, #1, MUL VL] > + ld1b {z13.b}, p0/z, [x2, #2, MUL VL] > + ld1b {z12.b}, p0/z, [x2, #3, MUL VL] > + rev z0.b, z15.b > + rev z1.b, z14.b > + rev z2.b, z13.b > + rev z3.b, z12.b > + rev RTMP0.b, RIV.b > + ext z3.b, z3.b, z2.b, #16 > + ext z2.b, z2.b, z1.b, #16 > + ext z1.b, z1.b, z0.b, #16 > + ext z0.b, z0.b, RTMP0.b, #16 > + rev z3.b, z3.b > + rev z2.b, z2.b > + rev z1.b, z1.b > + rev z0.b, z0.b > + mov RIV.d, z12.d > + > + SM4_SVE_CE_CRYPT_BLK4(z15, z14, z13, z12) > + > + eor z0.d, z0.d, z15.d > + eor z1.d, z1.d, z14.d > + eor z2.d, z2.d, z13.d > + eor z3.d, z3.d, z12.d > + st1b {z0.b}, p0, [x1] > + st1b {z1.b}, p0, [x1, #1, MUL VL] > + st1b {z2.b}, p0, [x1, #2, MUL VL] > + st1b {z3.b}, p0, [x1, #3, MUL VL] > + > + addvl x2, x2, #4 > + addvl x1, x1, #4 > + > + cbz x4, .Lcbc_dec_end > + > +.Lcbc_dec_loop_1x: > + cmp x4, x5, LSR #4 > + blt .Lcbc_dec_ce > + > + sub x4, x4, x5, LSR #4 /* x4 - VL */ > + > + ld1b {z15.b}, p0/z, [x2] > + rev RTMP0.b, RIV.b > + rev z0.b, z15.b > + ext z0.b, z0.b, RTMP0.b, #16 > + rev z0.b, z0.b > + mov RIV.d, z15.d > + > + SM4_SVE_CE_CRYPT_BLK(z15) > + > + eor z0.d, z0.d, z15.d > + st1b {z0.b}, p0, [x1] > + > + addvl x2, x2, #1 > + addvl x1, x1, #1 > + > + cbz x4, .Lcbc_dec_end > + b .Lcbc_dec_loop_1x > + > +.Lcbc_dec_ce: > + rev RIV.s, RIV.s > + tbl RIV.b, {RIV.b}, RSWAP128.b > + > +.Lcbc_dec_ce_loop_1x: > + sub x4, x4, #1 > + > + ld1 {v15.16b}, [x2], #16 > + mov v0.16b, RIVv.16b > + mov RIVv.16b, v15.16b > + SM4_CE_CRYPT_BLK(v15) > + eor v0.16b, v0.16b, v15.16b > + st1 {v0.16b}, [x1], #16 > + > + cbnz x4, .Lcbc_dec_ce_loop_1x > + > + ext RIV.b, RIV.b, RIV.b, #16 > + > +.Lcbc_dec_end: > + /* store new IV */ > + rev RIV.s, RIV.s > + tbl RIV.b, {RIV.b}, RSWAP128.b > + st1 {RIVv.16b}, [x3] > + > + ret > +SYM_FUNC_END(sm4_sve_ce_cbc_dec) > + > +.align 3 > +SYM_FUNC_START(sm4_sve_ce_cfb_dec) > + /* input: > + * x0: round key array, CTX > + * x1: dst > + * x2: src > + * x3: iv (big endian, 128 bit) > + * w4: nblocks > + */ > + uxtw x4, w4 > + SM4_PREPARE(x0) > + > + ld1 {RIVv.16b}, [x3] > + ext RIV.b, RIV.b, RIV.b, #16 > + > +.Lcfb_dec_loop_8x: > + sub x4, x4, x5, LSR #1 /* x4 - (8 * VL) */ > + tbnz x4, #63, .Lcfb_dec_4x > + > + ld1b {z15.b}, p0/z, [x2] > + ld1b {z14.b}, p0/z, [x2, #1, MUL VL] > + ld1b {z13.b}, p0/z, [x2, #2, MUL VL] > + ld1b {z12.b}, p0/z, [x2, #3, MUL VL] > + ld1b {z11.b}, p0/z, [x2, #4, MUL VL] > + ld1b {z10.b}, p0/z, [x2, #5, MUL VL] > + ld1b {z9.b}, p0/z, [x2, #6, MUL VL] > + ld1b {z8.b}, p0/z, [x2, #7, MUL VL] > + rev z0.b, z15.b > + rev z1.b, z14.b > + rev z2.b, z13.b > + rev z3.b, z12.b > + rev z4.b, z11.b > + rev z5.b, z10.b > + rev z6.b, z9.b > + rev z7.b, z8.b > + rev RTMP0.b, RIV.b > + ext z7.b, z7.b, z6.b, #16 > + ext z6.b, z6.b, z5.b, #16 > + ext z5.b, z5.b, z4.b, #16 > + ext z4.b, z4.b, z3.b, #16 > + ext z3.b, z3.b, z2.b, #16 > + ext z2.b, z2.b, z1.b, #16 > + ext z1.b, z1.b, z0.b, #16 > + ext z0.b, z0.b, RTMP0.b, #16 > + rev z7.b, z7.b > 
+ rev z6.b, z6.b > + rev z5.b, z5.b > + rev z4.b, z4.b > + rev z3.b, z3.b > + rev z2.b, z2.b > + rev z1.b, z1.b > + rev z0.b, z0.b > + mov RIV.d, z8.d > + > + SM4_SVE_CE_CRYPT_BLK8(z0, z1, z2, z3, z4, z5, z6, z7) > + > + eor z0.d, z0.d, z15.d > + eor z1.d, z1.d, z14.d > + eor z2.d, z2.d, z13.d > + eor z3.d, z3.d, z12.d > + eor z4.d, z4.d, z11.d > + eor z5.d, z5.d, z10.d > + eor z6.d, z6.d, z9.d > + eor z7.d, z7.d, z8.d > + st1b {z0.b}, p0, [x1] > + st1b {z1.b}, p0, [x1, #1, MUL VL] > + st1b {z2.b}, p0, [x1, #2, MUL VL] > + st1b {z3.b}, p0, [x1, #3, MUL VL] > + st1b {z4.b}, p0, [x1, #4, MUL VL] > + st1b {z5.b}, p0, [x1, #5, MUL VL] > + st1b {z6.b}, p0, [x1, #6, MUL VL] > + st1b {z7.b}, p0, [x1, #7, MUL VL] > + > + addvl x2, x2, #8 > + addvl x1, x1, #8 > + > + cbz x4, .Lcfb_dec_end > + b .Lcfb_dec_loop_8x > + > +.Lcfb_dec_4x: > + add x4, x4, x5, LSR #1 > + cmp x4, x5, LSR #2 > + blt .Lcfb_dec_loop_1x > + > + sub x4, x4, x5, LSR #2 /* x4 - (4 * VL) */ > + > + ld1b {z15.b}, p0/z, [x2] > + ld1b {z14.b}, p0/z, [x2, #1, MUL VL] > + ld1b {z13.b}, p0/z, [x2, #2, MUL VL] > + ld1b {z12.b}, p0/z, [x2, #3, MUL VL] > + rev z0.b, z15.b > + rev z1.b, z14.b > + rev z2.b, z13.b > + rev z3.b, z12.b > + rev RTMP0.b, RIV.b > + ext z3.b, z3.b, z2.b, #16 > + ext z2.b, z2.b, z1.b, #16 > + ext z1.b, z1.b, z0.b, #16 > + ext z0.b, z0.b, RTMP0.b, #16 > + rev z3.b, z3.b > + rev z2.b, z2.b > + rev z1.b, z1.b > + rev z0.b, z0.b > + mov RIV.d, z12.d > + > + SM4_SVE_CE_CRYPT_BLK4(z0, z1, z2, z3) > + > + eor z0.d, z0.d, z15.d > + eor z1.d, z1.d, z14.d > + eor z2.d, z2.d, z13.d > + eor z3.d, z3.d, z12.d > + st1b {z0.b}, p0, [x1] > + st1b {z1.b}, p0, [x1, #1, MUL VL] > + st1b {z2.b}, p0, [x1, #2, MUL VL] > + st1b {z3.b}, p0, [x1, #3, MUL VL] > + > + addvl x2, x2, #4 > + addvl x1, x1, #4 > + > + cbz x4, .Lcfb_dec_end > + > +.Lcfb_dec_loop_1x: > + cmp x4, x5, LSR #4 > + blt .Lcfb_dec_ce > + > + sub x4, x4, x5, LSR #4 /* x4 - VL */ > + > + ld1b {z15.b}, p0/z, [x2] > + rev RTMP0.b, RIV.b > + rev z0.b, z15.b > + ext z0.b, z0.b, RTMP0.b, #16 > + rev z0.b, z0.b > + mov RIV.d, z15.d > + > + SM4_SVE_CE_CRYPT_BLK(z0) > + > + eor z0.d, z0.d, z15.d > + st1b {z0.b}, p0, [x1] > + > + addvl x2, x2, #1 > + addvl x1, x1, #1 > + > + cbz x4, .Lcfb_dec_end > + b .Lcfb_dec_loop_1x > + > +.Lcfb_dec_ce: > + rev RIV.s, RIV.s > + tbl RIV.b, {RIV.b}, RSWAP128.b > + > +.Lcfb_dec_ce_loop_1x: > + sub x4, x4, #1 > + > + ld1 {v15.16b}, [x2], #16 > + mov v0.16b, RIVv.16b > + mov RIVv.16b, v15.16b > + SM4_CE_CRYPT_BLK(v0) > + eor v0.16b, v0.16b, v15.16b > + st1 {v0.16b}, [x1], #16 > + > + cbnz x4, .Lcfb_dec_ce_loop_1x > + > + ext RIV.b, RIV.b, RIV.b, #16 > + > +.Lcfb_dec_end: > + /* store new IV */ > + rev RIV.s, RIV.s > + tbl RIV.b, {RIV.b}, RSWAP128.b > + st1 {RIVv.16b}, [x3] > + > + ret > +SYM_FUNC_END(sm4_sve_ce_cfb_dec) > + > +.align 3 > +SYM_FUNC_START(sm4_sve_ce_ctr_crypt) > + /* input: > + * x0: round key array, CTX > + * x1: dst > + * x2: src > + * x3: ctr (big endian, 128 bit) > + * w4: nblocks > + */ > + uxtw x4, w4 > + SM4_PREPARE(x0) > + > + dup RZERO.d, #0 > + adr_l x6, .Lle128_inc > + ld1b {RLE128_INC.b}, p0/z, [x6] > + > + ldp x7, x8, [x3] > + rev x7, x7 > + rev x8, x8 > + > +.Lctr_loop_8x: > + sub x4, x4, x5, LSR #1 /* x4 - (8 * VL) */ > + tbnz x4, #63, .Lctr_4x > + > + inc_le128_8x(z0, z1, z2, z3, z4, z5, z6, z7) > + > + ld1b {z8.b}, p0/z, [x2] > + ld1b {z9.b}, p0/z, [x2, #1, MUL VL] > + ld1b {z10.b}, p0/z, [x2, #2, MUL VL] > + ld1b {z11.b}, p0/z, [x2, #3, MUL VL] > + ld1b {z12.b}, p0/z, [x2, #4, MUL VL] > + ld1b {z13.b}, p0/z, [x2, #5, 
MUL VL] > + ld1b {z14.b}, p0/z, [x2, #6, MUL VL] > + ld1b {z15.b}, p0/z, [x2, #7, MUL VL] > + > + SM4_SVE_CE_CRYPT_BLK8(z0, z1, z2, z3, z4, z5, z6, z7) > + > + eor z0.d, z0.d, z8.d > + eor z1.d, z1.d, z9.d > + eor z2.d, z2.d, z10.d > + eor z3.d, z3.d, z11.d > + eor z4.d, z4.d, z12.d > + eor z5.d, z5.d, z13.d > + eor z6.d, z6.d, z14.d > + eor z7.d, z7.d, z15.d > + > + st1b {z0.b}, p0, [x1] > + st1b {z1.b}, p0, [x1, #1, MUL VL] > + st1b {z2.b}, p0, [x1, #2, MUL VL] > + st1b {z3.b}, p0, [x1, #3, MUL VL] > + st1b {z4.b}, p0, [x1, #4, MUL VL] > + st1b {z5.b}, p0, [x1, #5, MUL VL] > + st1b {z6.b}, p0, [x1, #6, MUL VL] > + st1b {z7.b}, p0, [x1, #7, MUL VL] > + > + addvl x2, x2, #8 > + addvl x1, x1, #8 > + > + cbz x4, .Lctr_end > + b .Lctr_loop_8x > + > +.Lctr_4x: > + add x4, x4, x5, LSR #1 > + cmp x4, x5, LSR #2 > + blt .Lctr_loop_1x > + > + sub x4, x4, x5, LSR #2 /* x4 - (4 * VL) */ > + > + inc_le128_4x(z0, z1, z2, z3) > + > + ld1b {z8.b}, p0/z, [x2] > + ld1b {z9.b}, p0/z, [x2, #1, MUL VL] > + ld1b {z10.b}, p0/z, [x2, #2, MUL VL] > + ld1b {z11.b}, p0/z, [x2, #3, MUL VL] > + > + SM4_SVE_CE_CRYPT_BLK4(z0, z1, z2, z3) > + > + eor z0.d, z0.d, z8.d > + eor z1.d, z1.d, z9.d > + eor z2.d, z2.d, z10.d > + eor z3.d, z3.d, z11.d > + > + st1b {z0.b}, p0, [x1] > + st1b {z1.b}, p0, [x1, #1, MUL VL] > + st1b {z2.b}, p0, [x1, #2, MUL VL] > + st1b {z3.b}, p0, [x1, #3, MUL VL] > + > + addvl x2, x2, #4 > + addvl x1, x1, #4 > + > + cbz x4, .Lctr_end > + > +.Lctr_loop_1x: > + cmp x4, x5, LSR #4 > + blt .Lctr_ce_loop_1x > + > + sub x4, x4, x5, LSR #4 /* x4 - VL */ > + > + inc_le128(z0) > + ld1b {z8.b}, p0/z, [x2] > + > + SM4_SVE_CE_CRYPT_BLK(z0) > + > + eor z0.d, z0.d, z8.d > + st1b {z0.b}, p0, [x1] > + > + addvl x2, x2, #1 > + addvl x1, x1, #1 > + > + cbz x4, .Lctr_end > + b .Lctr_loop_1x > + > +.Lctr_ce_loop_1x: > + sub x4, x4, #1 > + > + /* inc_le128 for CE */ > + mov v0.d[1], x8 > + mov v0.d[0], x7 > + adds x8, x8, #1 > + rev64 v0.16b, v0.16b > + adc x7, x7, xzr > + > + ld1 {v8.16b}, [x2], #16 > + > + SM4_CE_CRYPT_BLK(v0) > + > + eor v0.16b, v0.16b, v8.16b > + st1 {v0.16b}, [x1], #16 > + > + cbnz x4, .Lctr_ce_loop_1x > + > +.Lctr_end: > + /* store new CTR */ > + rev x7, x7 > + rev x8, x8 > + stp x7, x8, [x3] > + > + ret > +SYM_FUNC_END(sm4_sve_ce_ctr_crypt) > + > +.align 3 > +SYM_FUNC_START(sm4_sve_get_vl) > + /* VL in bytes */ > + rdvl x0, #1 > + > + ret > +SYM_FUNC_END(sm4_sve_get_vl) > + > + > + .section ".rodata", "a" > + .align 4 > +.Lbswap128_mask: > + .byte 0x0c, 0x0d, 0x0e, 0x0f, 0x08, 0x09, 0x0a, 0x0b > + .byte 0x04, 0x05, 0x06, 0x07, 0x00, 0x01, 0x02, 0x03 > + .byte 0x1c, 0x1d, 0x1e, 0x1f, 0x18, 0x19, 0x1a, 0x1b > + .byte 0x14, 0x15, 0x16, 0x17, 0x10, 0x11, 0x12, 0x13 > + .byte 0x2c, 0x2d, 0x2e, 0x2f, 0x28, 0x29, 0x2a, 0x2b > + .byte 0x24, 0x25, 0x26, 0x27, 0x20, 0x21, 0x22, 0x23 > + .byte 0x3c, 0x3d, 0x3e, 0x3f, 0x38, 0x39, 0x3a, 0x3b > + .byte 0x34, 0x35, 0x36, 0x37, 0x30, 0x31, 0x32, 0x33 > + .byte 0x4c, 0x4d, 0x4e, 0x4f, 0x48, 0x49, 0x4a, 0x4b > + .byte 0x44, 0x45, 0x46, 0x47, 0x40, 0x41, 0x42, 0x43 > + .byte 0x5c, 0x5d, 0x5e, 0x5f, 0x58, 0x59, 0x5a, 0x5b > + .byte 0x54, 0x55, 0x56, 0x57, 0x50, 0x51, 0x52, 0x53 > + .byte 0x6c, 0x6d, 0x6e, 0x6f, 0x68, 0x69, 0x6a, 0x6b > + .byte 0x64, 0x65, 0x66, 0x67, 0x60, 0x61, 0x62, 0x63 > + .byte 0x7c, 0x7d, 0x7e, 0x7f, 0x78, 0x79, 0x7a, 0x7b > + .byte 0x74, 0x75, 0x76, 0x77, 0x70, 0x71, 0x72, 0x73 > + .byte 0x8c, 0x8d, 0x8e, 0x8f, 0x88, 0x89, 0x8a, 0x8b > + .byte 0x84, 0x85, 0x86, 0x87, 0x80, 0x81, 0x82, 0x83 > + .byte 0x9c, 0x9d, 0x9e, 0x9f, 0x98, 0x99, 0x9a, 
0x9b > + .byte 0x94, 0x95, 0x96, 0x97, 0x90, 0x91, 0x92, 0x93 > + .byte 0xac, 0xad, 0xae, 0xaf, 0xa8, 0xa9, 0xaa, 0xab > + .byte 0xa4, 0xa5, 0xa6, 0xa7, 0xa0, 0xa1, 0xa2, 0xa3 > + .byte 0xbc, 0xbd, 0xbe, 0xbf, 0xb8, 0xb9, 0xba, 0xbb > + .byte 0xb4, 0xb5, 0xb6, 0xb7, 0xb0, 0xb1, 0xb2, 0xb3 > + .byte 0xcc, 0xcd, 0xce, 0xcf, 0xc8, 0xc9, 0xca, 0xcb > + .byte 0xc4, 0xc5, 0xc6, 0xc7, 0xc0, 0xc1, 0xc2, 0xc3 > + .byte 0xdc, 0xdd, 0xde, 0xdf, 0xd8, 0xd9, 0xda, 0xdb > + .byte 0xd4, 0xd5, 0xd6, 0xd7, 0xd0, 0xd1, 0xd2, 0xd3 > + .byte 0xec, 0xed, 0xee, 0xef, 0xe8, 0xe9, 0xea, 0xeb > + .byte 0xe4, 0xe5, 0xe6, 0xe7, 0xe0, 0xe1, 0xe2, 0xe3 > + .byte 0xfc, 0xfd, 0xfe, 0xff, 0xf8, 0xf9, 0xfa, 0xfb > + .byte 0xf4, 0xf5, 0xf6, 0xf7, 0xf0, 0xf1, 0xf2, 0xf3 > + > +.Lle128_inc: > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x05, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x07, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x09, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x0a, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x0b, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x0c, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x0d, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x0e, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x0f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 > diff --git a/arch/arm64/crypto/sm4-sve-ce-glue.c b/arch/arm64/crypto/sm4-sve-ce-glue.c > new file mode 100644 > index 000000000000..fc797b72b5f0 > --- /dev/null > +++ b/arch/arm64/crypto/sm4-sve-ce-glue.c > @@ -0,0 +1,332 @@ > +/* SPDX-License-Identifier: GPL-2.0-or-later */ > +/* > + * SM4 Cipher Algorithm, using ARMv9 Crypto Extensions with SVE2 > + * as specified in > + * https://tools.ietf.org/id/draft-ribose-cfrg-sm4-10.html > + * > + * Copyright (C) 2022, Alibaba Group. 
> + * Copyright (C) 2022 Tianjia Zhang <tianjia.zhang@linux.alibaba.com> > + */ > + > +#include <linux/module.h> > +#include <linux/crypto.h> > +#include <linux/kernel.h> > +#include <linux/cpufeature.h> > +#include <asm/neon.h> > +#include <asm/simd.h> > +#include <crypto/internal/simd.h> > +#include <crypto/internal/skcipher.h> > +#include <crypto/sm4.h> > +#include "sm4-ce.h" > + > +asmlinkage void sm4_sve_ce_crypt(const u32 *rkey, u8 *dst, > + const u8 *src, unsigned int nblocks); > +asmlinkage void sm4_sve_ce_cbc_dec(const u32 *rkey_dec, u8 *dst, > + const u8 *src, u8 *iv, > + unsigned int nblocks); > +asmlinkage void sm4_sve_ce_cfb_dec(const u32 *rkey_enc, u8 *dst, > + const u8 *src, u8 *iv, > + unsigned int nblocks); > +asmlinkage void sm4_sve_ce_ctr_crypt(const u32 *rkey_enc, u8 *dst, > + const u8 *src, u8 *iv, > + unsigned int nblocks); > +asmlinkage unsigned int sm4_sve_get_vl(void); > + > + > +static int sm4_setkey(struct crypto_skcipher *tfm, const u8 *key, > + unsigned int key_len) > +{ > + struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm); > + > + if (key_len != SM4_KEY_SIZE) > + return -EINVAL; > + > + kernel_neon_begin(); > + sm4_ce_expand_key(key, ctx->rkey_enc, ctx->rkey_dec, > + crypto_sm4_fk, crypto_sm4_ck); > + kernel_neon_end(); > + > + return 0; > +} > + > +static int ecb_crypt(struct skcipher_request *req, const u32 *rkey) > +{ > + struct skcipher_walk walk; > + unsigned int nbytes; > + int err; > + > + err = skcipher_walk_virt(&walk, req, false); > + > + while ((nbytes = walk.nbytes) > 0) { > + const u8 *src = walk.src.virt.addr; > + u8 *dst = walk.dst.virt.addr; > + unsigned int nblocks; > + > + nblocks = nbytes / SM4_BLOCK_SIZE; > + if (nblocks) { > + kernel_neon_begin(); > + > + sm4_sve_ce_crypt(rkey, dst, src, nblocks); > + > + kernel_neon_end(); > + } > + > + err = skcipher_walk_done(&walk, nbytes % SM4_BLOCK_SIZE); > + } > + > + return err; > +} > + > +static int ecb_encrypt(struct skcipher_request *req) > +{ > + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); > + struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm); > + > + return ecb_crypt(req, ctx->rkey_enc); > +} > + > +static int ecb_decrypt(struct skcipher_request *req) > +{ > + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); > + struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm); > + > + return ecb_crypt(req, ctx->rkey_dec); > +} > + > +static int cbc_crypt(struct skcipher_request *req, const u32 *rkey, > + void (*sm4_cbc_crypt)(const u32 *rkey, u8 *dst, > + const u8 *src, u8 *iv, unsigned int nblocks)) > +{ > + struct skcipher_walk walk; > + unsigned int nbytes; > + int err; > + > + err = skcipher_walk_virt(&walk, req, false); > + > + while ((nbytes = walk.nbytes) > 0) { > + const u8 *src = walk.src.virt.addr; > + u8 *dst = walk.dst.virt.addr; > + unsigned int nblocks; > + > + nblocks = nbytes / SM4_BLOCK_SIZE; > + if (nblocks) { > + kernel_neon_begin(); > + > + sm4_cbc_crypt(rkey, dst, src, walk.iv, nblocks); > + > + kernel_neon_end(); > + } > + > + err = skcipher_walk_done(&walk, nbytes % SM4_BLOCK_SIZE); > + } > + > + return err; > +} > + > +static int cbc_encrypt(struct skcipher_request *req) > +{ > + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); > + struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm); > + > + return cbc_crypt(req, ctx->rkey_enc, sm4_ce_cbc_enc); > +} > + > +static int cbc_decrypt(struct skcipher_request *req) > +{ > + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); > + struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm); > + > + return cbc_crypt(req, 
ctx->rkey_dec, sm4_sve_ce_cbc_dec); > +} > + > +static int cfb_crypt(struct skcipher_request *req, > + void (*sm4_cfb_crypt)(const u32 *rkey, u8 *dst, > + const u8 *src, u8 *iv, unsigned int nblocks)) > +{ > + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); > + struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm); > + struct skcipher_walk walk; > + unsigned int nbytes; > + int err; > + > + err = skcipher_walk_virt(&walk, req, false); > + > + while ((nbytes = walk.nbytes) > 0) { > + const u8 *src = walk.src.virt.addr; > + u8 *dst = walk.dst.virt.addr; > + unsigned int nblocks; > + > + nblocks = nbytes / SM4_BLOCK_SIZE; > + if (nblocks) { > + kernel_neon_begin(); > + > + sm4_cfb_crypt(ctx->rkey_enc, dst, src, > + walk.iv, nblocks); > + > + kernel_neon_end(); > + > + dst += nblocks * SM4_BLOCK_SIZE; > + src += nblocks * SM4_BLOCK_SIZE; > + nbytes -= nblocks * SM4_BLOCK_SIZE; > + } > + > + /* tail */ > + if (walk.nbytes == walk.total && nbytes > 0) { > + u8 keystream[SM4_BLOCK_SIZE]; > + > + sm4_ce_crypt_block(ctx->rkey_enc, keystream, walk.iv); > + crypto_xor_cpy(dst, src, keystream, nbytes); > + nbytes = 0; > + } > + > + err = skcipher_walk_done(&walk, nbytes); > + } > + > + return err; > +} > + > +static int cfb_encrypt(struct skcipher_request *req) > +{ > + return cfb_crypt(req, sm4_ce_cfb_enc); > +} > + > +static int cfb_decrypt(struct skcipher_request *req) > +{ > + return cfb_crypt(req, sm4_sve_ce_cfb_dec); > +} > + > +static int ctr_crypt(struct skcipher_request *req) > +{ > + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); > + struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm); > + struct skcipher_walk walk; > + unsigned int nbytes; > + int err; > + > + err = skcipher_walk_virt(&walk, req, false); > + > + while ((nbytes = walk.nbytes) > 0) { > + const u8 *src = walk.src.virt.addr; > + u8 *dst = walk.dst.virt.addr; > + unsigned int nblocks; > + > + nblocks = nbytes / SM4_BLOCK_SIZE; > + if (nblocks) { > + kernel_neon_begin(); > + > + sm4_sve_ce_ctr_crypt(ctx->rkey_enc, dst, src, > + walk.iv, nblocks); > + > + kernel_neon_end(); > + > + dst += nblocks * SM4_BLOCK_SIZE; > + src += nblocks * SM4_BLOCK_SIZE; > + nbytes -= nblocks * SM4_BLOCK_SIZE; > + } > + > + /* tail */ > + if (walk.nbytes == walk.total && nbytes > 0) { > + u8 keystream[SM4_BLOCK_SIZE]; > + > + sm4_ce_crypt_block(ctx->rkey_enc, keystream, walk.iv); > + crypto_inc(walk.iv, SM4_BLOCK_SIZE); > + crypto_xor_cpy(dst, src, keystream, nbytes); > + nbytes = 0; > + } > + > + err = skcipher_walk_done(&walk, nbytes); > + } > + > + return err; > +} > + > +static struct skcipher_alg sm4_algs[] = { > + { > + .base = { > + .cra_name = "ecb(sm4)", > + .cra_driver_name = "ecb-sm4-sve-ce", > + .cra_priority = 500, > + .cra_blocksize = SM4_BLOCK_SIZE, > + .cra_ctxsize = sizeof(struct sm4_ctx), > + .cra_module = THIS_MODULE, > + }, > + .min_keysize = SM4_KEY_SIZE, > + .max_keysize = SM4_KEY_SIZE, > + .setkey = sm4_setkey, > + .encrypt = ecb_encrypt, > + .decrypt = ecb_decrypt, > + }, { > + .base = { > + .cra_name = "cbc(sm4)", > + .cra_driver_name = "cbc-sm4-sve-ce", > + .cra_priority = 500, > + .cra_blocksize = SM4_BLOCK_SIZE, > + .cra_ctxsize = sizeof(struct sm4_ctx), > + .cra_module = THIS_MODULE, > + }, > + .min_keysize = SM4_KEY_SIZE, > + .max_keysize = SM4_KEY_SIZE, > + .ivsize = SM4_BLOCK_SIZE, > + .setkey = sm4_setkey, > + .encrypt = cbc_encrypt, > + .decrypt = cbc_decrypt, > + }, { > + .base = { > + .cra_name = "cfb(sm4)", > + .cra_driver_name = "cfb-sm4-sve-ce", > + .cra_priority = 500, > + .cra_blocksize = 1, > + 
.cra_ctxsize = sizeof(struct sm4_ctx), > + .cra_module = THIS_MODULE, > + }, > + .min_keysize = SM4_KEY_SIZE, > + .max_keysize = SM4_KEY_SIZE, > + .ivsize = SM4_BLOCK_SIZE, > + .chunksize = SM4_BLOCK_SIZE, > + .setkey = sm4_setkey, > + .encrypt = cfb_encrypt, > + .decrypt = cfb_decrypt, > + }, { > + .base = { > + .cra_name = "ctr(sm4)", > + .cra_driver_name = "ctr-sm4-sve-ce", > + .cra_priority = 500, > + .cra_blocksize = 1, > + .cra_ctxsize = sizeof(struct sm4_ctx), > + .cra_module = THIS_MODULE, > + }, > + .min_keysize = SM4_KEY_SIZE, > + .max_keysize = SM4_KEY_SIZE, > + .ivsize = SM4_BLOCK_SIZE, > + .chunksize = SM4_BLOCK_SIZE, > + .setkey = sm4_setkey, > + .encrypt = ctr_crypt, > + .decrypt = ctr_crypt, > + } > +}; > + > +static int __init sm4_sve_ce_init(void) > +{ > + if (sm4_sve_get_vl() <= 16) > + return -ENODEV; > + > + return crypto_register_skciphers(sm4_algs, ARRAY_SIZE(sm4_algs)); > +} > + > +static void __exit sm4_sve_ce_exit(void) > +{ > + crypto_unregister_skciphers(sm4_algs, ARRAY_SIZE(sm4_algs)); > +} > + > +module_cpu_feature_match(SVESM4, sm4_sve_ce_init); > +module_exit(sm4_sve_ce_exit); > + > +MODULE_DESCRIPTION("SM4 ECB/CBC/CFB/CTR using ARMv9 Crypto Extensions with SVE2"); > +MODULE_ALIAS_CRYPTO("sm4-sve-ce"); > +MODULE_ALIAS_CRYPTO("sm4"); > +MODULE_ALIAS_CRYPTO("ecb(sm4)"); > +MODULE_ALIAS_CRYPTO("cbc(sm4)"); > +MODULE_ALIAS_CRYPTO("cfb(sm4)"); > +MODULE_ALIAS_CRYPTO("ctr(sm4)"); > +MODULE_AUTHOR("Tianjia Zhang <tianjia.zhang@linux.alibaba.com>"); > +MODULE_LICENSE("GPL v2"); > -- > 2.24.3 (Apple Git-128) >
On Mon, Sep 26, 2022 at 12:02:04PM +0200, Ard Biesheuvel wrote:

> Given that we currently do not support the use of SVE in kernel mode,
> this patch cannot be accepted at this time (but the rest of the series
> looks reasonable to me, although I have only skimmed over the patches)

> In view of the disappointing benchmark results below, I don't think
> this is worth the hassle at the moment. If we can find a case where
> using SVE in kernel mode truly makes a [favorable] difference, we can
> revisit this, but not without a thorough analysis of the impact it
> will have to support SVE in the kernel. Also, the fact that SVE may

The kernel code doesn't really distinguish between FPSIMD and SVE in
terms of state management, and with the sharing of the V and Z registers
the architecture is very similar too, so it shouldn't be too much hassle;
the only thing we should need is some management for the VL when
starting kernel mode SVE (probably just setting the maximum VL as a
first pass).

The current code should *work*, and on a system with only a single VL
supported it'd be equivalent since setting the VL is a noop; it'd just
mean that any kernel mode SVE would end up using whatever the last VL
set on the PE happened to be, which could result in inconsistent
performance.

> also cover cryptographic extensions does not necessarily imply that a
> micro-architecture will perform those crypto transformations in
> parallel and so the performance may be the same even if VL > 128.

Indeed, though so long as the performance is comparable I guess it
doesn't really hurt - if we run into situations where for some
implementations SVE performs worse then we'd need to do something more
complicated than just using SVE if it's available but...

> In summary, please drop this patch for now, and once there are more
> encouraging performance numbers, please resubmit it as part of a
> series that explicitly enables SVE in kernel mode on arm64, and
> documents the requirements and constraints.

...in any case, as you say, until there are cases where SVE does better
for some in-kernel use case we probably just shouldn't merge things.

Having said that, I have been tempted to put together a branch which has
a kernel_sve_begin() implementation and collects proposed algorithm
implementations, so they're there for people to experiment with as new
hardware becomes available. There's clearly interest in trying to use
SVE in kernel and it makes sense to try to avoid common pitfalls and
reduce duplication of effort.

A couple of very minor comments on the patch:

> > +config CRYPTO_SM4_ARM64_SVE_CE_BLK
> > +	tristate "Ciphers: SM4, modes: ECB/CBC/CFB/CTR (ARMv9 cryptography
> +acceleration with SVE2)"
> > +	depends on KERNEL_MODE_NEON
> > +	select CRYPTO_SKCIPHER
> > +	select CRYPTO_SM4
> > +	select CRYPTO_SM4_ARM64_CE_BLK
> > +	help

Our current baseline binutils version requirement predates SVE support,
so we'd either need to manually encode all SVE instructions used or add
a suitable dependency. The dependency seems a lot more reasonable here,
and we could require a new enough version to avoid the manual encoding
that is done in the patch (though I've not checked how new a version
that'd end up requiring; it might be unreasonable, so perhaps just
depending on binutils having basic SVE support and continuing with the
manual encoding might be more helpful).
> > +.macro sm4e, vd, vn
> > +	.inst 0xcec08400 | (.L\vn << 5) | .L\vd
> > +.endm

For any manual encodings that do get left it'd be good to note the
binutils and LLVM versions which support the instruction so we can
hopefully at some point switch to assembling them normally.

> > +static int __init sm4_sve_ce_init(void)
> > +{
> > +	if (sm4_sve_get_vl() <= 16)
> > +		return -ENODEV;

I'm not clear what this check is attempting to guard against - what's
the issue with larger VLs?

If it is needed then we already have a sve_get_vl() in the core kernel
which we should probably be making available to modules rather than
having them open code something (eg, making it a static inline rather
than putting it in asm).
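[As a concrete illustration of that last suggestion, here is a minimal
sketch of a module-visible, static-inline sve_get_vl(). This is an
assumption for illustration only - the name, placement, and making it
available to modules are exactly the points under discussion, not an
existing kernel API - and it presumes the assembler accepts the RDVL
instruction; otherwise RDVL itself would need manual encoding too.]

/*
 * Illustrative sketch only (an assumption, not the existing kernel
 * implementation): a static inline sve_get_vl() that modules could
 * call directly, along the lines suggested above.
 */
static inline unsigned long sve_get_vl(void)
{
	unsigned long vl;

	/* RDVL reads the current vector length in bytes (multiplier #1). */
	asm volatile("rdvl %0, #1" : "=r" (vl));

	return vl;
}

[With something like this exported, the driver would not need to carry
its own sm4_sve_get_vl() helper in assembly.]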
Hi Ard,

On 9/26/22 6:02 PM, Ard Biesheuvel wrote:
> (cc Mark Brown)
>
> Hello Tianjia,
>
> On Mon, 26 Sept 2022 at 11:37, Tianjia Zhang
> <tianjia.zhang@linux.alibaba.com> wrote:
>>
>> Scalable Vector Extension (SVE) is the next-generation SIMD extension for
>> arm64. SVE allows flexible vector length implementations with a range of
>> possible values in CPU implementations. The vector length can vary from a
>> minimum of 128 bits up to a maximum of 2048 bits, at 128-bit increments.
>> The SVE design guarantees that the same application can run on different
>> implementations that support SVE, without the need to recompile the code.
>>
>> SVE was originally introduced by ARMv8, and ARMv9 introduced SVE2 to
>> expand and improve it. Similar to the Crypto Extension supported by the
>> NEON instruction set for the algorithm, SVE also supports the similar
>> instructions, called cryptography acceleration instructions, but this is
>> also optional instruction set.
>>
>> This patch uses SM4 cryptography acceleration instructions and SVE2
>> instructions to optimize the SM4 algorithm for ECB/CBC/CFB/CTR modes.
>> Since the encryption of CBC/CFB cannot be parallelized, the Crypto
>> Extension instruction is used.
>>
>
> Given that we currently do not support the use of SVE in kernel mode,
> this patch cannot be accepted at this time (but the rest of the series
> looks reasonable to me, although I have only skimmed over the patches)
>
> In view of the disappointing benchmark results below, I don't think
> this is worth the hassle at the moment. If we can find a case where
> using SVE in kernel mode truly makes a [favorable] difference, we can
> revisit this, but not without a thorough analysis of the impact it
> will have to support SVE in the kernel. Also, the fact that SVE may
> also cover cryptographic extensions does not necessarily imply that a
> micro-architecture will perform those crypto transformations in
> parallel and so the performance may be the same even if VL > 128.
>
> In summary, please drop this patch for now, and once there are more
> encouraging performance numbers, please resubmit it as part of a
> series that explicitly enables SVE in kernel mode on arm64, and
> documents the requirements and constraints.
>
> I have cc'ed Mark, who has been working on the SVE support and who
> might have something to add here as well.
>
> Thanks,
> Ard.
>

Thanks for your reply. The current performance of SVE is really
unsatisfactory. One reason is that the SVE implementation has to deal
with more complex data shifting operations, in CBC/CFB mode and also in
CTR mode, which needs more instructions to complete the 128-bit counter
increment, while the CE implementation does not have these
complications. In addition, I naively thought that when the VL is 256
bits the performance would simply double compared to 128 bits; at
present this is not the case. Maybe it is not worth using SVE until
there are significantly improved performance data.

I'll follow your advice and drop this patch.

Best regards,
Tianjia
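[As an aside on the counter handling mentioned above: the per-block
work that makes CTR mode expensive is the 128-bit big-endian counter
increment. A minimal C sketch of that operation follows, purely as an
illustration - it mirrors what the kernel's crypto_inc() does for the
16-byte counter in the partial-block tail of the glue code. In the CE
path this amounts to an adds/adc pair on two general-purpose registers,
while the SVE path has to materialise one such counter per 128-bit
lane, which is where the extra instructions come from.]

#include <stdint.h>

/*
 * Increment a 128-bit big-endian counter by one, as CTR mode requires
 * for every block.  Illustration only; the driver does this in
 * assembly for full blocks and via crypto_inc() for the tail.
 */
static void ctr128_inc_be(uint8_t ctr[16])
{
	int i;

	for (i = 15; i >= 0; i--) {
		if (++ctr[i] != 0)	/* stop once a byte does not wrap to zero */
			break;
	}
}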
Hi Mark,

On 9/27/22 1:14 AM, Mark Brown wrote:
> On Mon, Sep 26, 2022 at 12:02:04PM +0200, Ard Biesheuvel wrote:
>
>> Given that we currently do not support the use of SVE in kernel mode,
>> this patch cannot be accepted at this time (but the rest of the series
>> looks reasonable to me, although I have only skimmed over the patches)
>
>> In view of the disappointing benchmark results below, I don't think
>> this is worth the hassle at the moment. If we can find a case where
>> using SVE in kernel mode truly makes a [favorable] difference, we can
>> revisit this, but not without a thorough analysis of the impact it
>> will have to support SVE in the kernel. Also, the fact that SVE may
>
> The kernel code doesn't really distinguish between FPSIMD and SVE in
> terms of state management, and with the sharing of the V and Z registers
> the architecture is very similar too, so it shouldn't be too much hassle;
> the only thing we should need is some management for the VL when
> starting kernel mode SVE (probably just setting the maximum VL as a
> first pass).
>
> The current code should *work*, and on a system with only a single VL
> supported it'd be equivalent since setting the VL is a noop; it'd just
> mean that any kernel mode SVE would end up using whatever the last VL
> set on the PE happened to be, which could result in inconsistent
> performance.
>
>> also cover cryptographic extensions does not necessarily imply that a
>> micro-architecture will perform those crypto transformations in
>> parallel and so the performance may be the same even if VL > 128.
>
> Indeed, though so long as the performance is comparable I guess it
> doesn't really hurt - if we run into situations where for some
> implementations SVE performs worse then we'd need to do something more
> complicated than just using SVE if it's available but...
>
>> In summary, please drop this patch for now, and once there are more
>> encouraging performance numbers, please resubmit it as part of a
>> series that explicitly enables SVE in kernel mode on arm64, and
>> documents the requirements and constraints.
>
> ...in any case, as you say, until there are cases where SVE does better
> for some in-kernel use case we probably just shouldn't merge things.
>
> Having said that, I have been tempted to put together a branch which has
> a kernel_sve_begin() implementation and collects proposed algorithm
> implementations, so they're there for people to experiment with as new
> hardware becomes available. There's clearly interest in trying to use
> SVE in kernel and it makes sense to try to avoid common pitfalls and
> reduce duplication of effort.
>

Your reply helped me a lot. I did encounter problems when using a qemu
environment with a VL larger than 128 bits, but the same code tested
against the pure user-mode library libgcrypt seems to be normal, so
maybe it is just a coincidence that it works fine on the 128-bit
physical machine. I am looking forward to your experimental branch, and
I believe that there will be breakthroughs in hardware in the near
future.

> A couple of very minor comments on the patch:
>
>>> +config CRYPTO_SM4_ARM64_SVE_CE_BLK
>>> +	tristate "Ciphers: SM4, modes: ECB/CBC/CFB/CTR (ARMv9 cryptography
>> +acceleration with SVE2)"
>>> +	depends on KERNEL_MODE_NEON
>>> +	select CRYPTO_SKCIPHER
>>> +	select CRYPTO_SM4
>>> +	select CRYPTO_SM4_ARM64_CE_BLK
>>> +	help
>
> Our current baseline binutils version requirement predates SVE support,
> so we'd either need to manually encode all SVE instructions used or add
> a suitable dependency. The dependency seems a lot more reasonable here,
> and we could require a new enough version to avoid the manual encoding
> that is done in the patch (though I've not checked how new a version
> that'd end up requiring; it might be unreasonable, so perhaps just
> depending on binutils having basic SVE support and continuing with the
> manual encoding might be more helpful).
>
>>> +.macro sm4e, vd, vn
>>> +	.inst 0xcec08400 | (.L\vn << 5) | .L\vd
>>> +.endm
>
> For any manual encodings that do get left it'd be good to note the
> binutils and LLVM versions which support the instruction so we can
> hopefully at some point switch to assembling them normally.
>
>>> +static int __init sm4_sve_ce_init(void)
>>> +{
>>> +	if (sm4_sve_get_vl() <= 16)
>>> +		return -ENODEV;
>
> I'm not clear what this check is attempting to guard against - what's
> the issue with larger VLs?

Since there is no physical environment, this check is based on my naive
assumption that the performance when VL is 256 bits should theoretically
be twice that of 128 bits. Because SVE needs to handle more complex data
shifting operations and CTR incrementing operations, the idea was that
SVE would only bring a performance improvement when VL is greater than
or equal to 256 bits, and that otherwise falling back to CE is the more
suitable choice. Now it seems that this assumption itself is not valid,
so I will drop this patch first.

>
> If it is needed then we already have a sve_get_vl() in the core kernel
> which we should probably be making available to modules rather than
> having them open code something (eg, making it a static inline rather
> than putting it in asm).

Yes, I agree, exporting sve_get_vl() to modules is the more appropriate
approach.

Best regards,
Tianjia
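[Putting the two agreed points together, here is a minimal sketch of
what the probe check could look like if the core kernel's sve_get_vl()
(VL in bytes) were made available to modules as discussed. This is
hypothetical: that export does not exist yet and the patch is being
dropped, so the sketch only illustrates the direction.]

/*
 * Hypothetical revision of sm4_sve_ce_init(): rely on a module-visible
 * sve_get_vl() from the core kernel instead of the driver's own
 * sm4_sve_get_vl() assembly helper.  Registration is still skipped
 * when the vector length is only 128 bits (SM4_BLOCK_SIZE bytes),
 * where the plain CE driver is expected to be at least as fast.
 */
static int __init sm4_sve_ce_init(void)
{
	if (sve_get_vl() <= SM4_BLOCK_SIZE)
		return -ENODEV;

	return crypto_register_skciphers(sm4_algs, ARRAY_SIZE(sm4_algs));
}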
mov v14.d[0], x7; \ + adds x8, x8, x5, LSR #4; \ + mov zctr6.d, RLE128_INC.d; \ + adc x7, x7, xzr; \ + mov v15.d[1], x8; \ + mov v15.d[0], x7; \ + adds x8, x8, x5, LSR #4; \ + mov zctr7.d, RLE128_INC.d; \ + adc x7, x7, xzr; \ + dup z8.q, z8.q[0]; \ + dup z9.q, z9.q[0]; \ + dup z10.q, z10.q[0]; \ + dup z11.q, z11.q[0]; \ + dup z12.q, z12.q[0]; \ + dup z13.q, z13.q[0]; \ + dup z14.q, z14.q[0]; \ + dup z15.q, z15.q[0]; \ + adclt zctr0.d, z8.d, RZERO.d; \ + adclt zctr1.d, z9.d, RZERO.d; \ + adclt zctr2.d, z10.d, RZERO.d; \ + adclt zctr3.d, z11.d, RZERO.d; \ + adclt zctr4.d, z12.d, RZERO.d; \ + adclt zctr5.d, z13.d, RZERO.d; \ + adclt zctr6.d, z14.d, RZERO.d; \ + adclt zctr7.d, z15.d, RZERO.d; \ + adclt z8.d, zctr0.d, RZERO.d; \ + adclt z9.d, zctr1.d, RZERO.d; \ + adclt z10.d, zctr2.d, RZERO.d; \ + adclt z11.d, zctr3.d, RZERO.d; \ + adclt z12.d, zctr4.d, RZERO.d; \ + adclt z13.d, zctr5.d, RZERO.d; \ + adclt z14.d, zctr6.d, RZERO.d; \ + adclt z15.d, zctr7.d, RZERO.d; \ + trn1 zctr0.d, z8.d, zctr0.d; \ + trn1 zctr1.d, z9.d, zctr1.d; \ + trn1 zctr2.d, z10.d, zctr2.d; \ + trn1 zctr3.d, z11.d, zctr3.d; \ + trn1 zctr4.d, z12.d, zctr4.d; \ + trn1 zctr5.d, z13.d, zctr5.d; \ + trn1 zctr6.d, z14.d, zctr6.d; \ + trn1 zctr7.d, z15.d, zctr7.d; \ + revb zctr0.d, p0/m, zctr0.d; \ + revb zctr1.d, p0/m, zctr1.d; \ + revb zctr2.d, p0/m, zctr2.d; \ + revb zctr3.d, p0/m, zctr3.d; \ + revb zctr4.d, p0/m, zctr4.d; \ + revb zctr5.d, p0/m, zctr5.d; \ + revb zctr6.d, p0/m, zctr6.d; \ + revb zctr7.d, p0/m, zctr7.d; + + +.align 3 +SYM_FUNC_START(sm4_sve_ce_crypt) + /* input: + * x0: round key array, CTX + * x1: dst + * x2: src + * w3: nblocks + */ + uxtw x3, w3 + SM4_PREPARE(x0) + +.Lcrypt_loop_8x: + sub x3, x3, x5, LSR #1 /* x3 - (8 * VL) */ + tbnz x3, #63, .Lcrypt_4x + + ld1b {z0.b}, p0/z, [x2] + ld1b {z1.b}, p0/z, [x2, #1, MUL VL] + ld1b {z2.b}, p0/z, [x2, #2, MUL VL] + ld1b {z3.b}, p0/z, [x2, #3, MUL VL] + ld1b {z4.b}, p0/z, [x2, #4, MUL VL] + ld1b {z5.b}, p0/z, [x2, #5, MUL VL] + ld1b {z6.b}, p0/z, [x2, #6, MUL VL] + ld1b {z7.b}, p0/z, [x2, #7, MUL VL] + + SM4_SVE_CE_CRYPT_BLK8(z0, z1, z2, z3, z4, z5, z6, z7) + + st1b {z0.b}, p0, [x1] + st1b {z1.b}, p0, [x1, #1, MUL VL] + st1b {z2.b}, p0, [x1, #2, MUL VL] + st1b {z3.b}, p0, [x1, #3, MUL VL] + st1b {z4.b}, p0, [x1, #4, MUL VL] + st1b {z5.b}, p0, [x1, #5, MUL VL] + st1b {z6.b}, p0, [x1, #6, MUL VL] + st1b {z7.b}, p0, [x1, #7, MUL VL] + + addvl x2, x2, #8 + addvl x1, x1, #8 + + cbz x3, .Lcrypt_end + b .Lcrypt_loop_8x + +.Lcrypt_4x: + add x3, x3, x5, LSR #1 + cmp x3, x5, LSR #2 + blt .Lcrypt_loop_1x + + sub x3, x3, x5, LSR #2 /* x3 - (4 * VL) */ + + ld1b {z0.b}, p0/z, [x2] + ld1b {z1.b}, p0/z, [x2, #1, MUL VL] + ld1b {z2.b}, p0/z, [x2, #2, MUL VL] + ld1b {z3.b}, p0/z, [x2, #3, MUL VL] + + SM4_SVE_CE_CRYPT_BLK4(z0, z1, z2, z3) + + st1b {z0.b}, p0, [x1] + st1b {z1.b}, p0, [x1, #1, MUL VL] + st1b {z2.b}, p0, [x1, #2, MUL VL] + st1b {z3.b}, p0, [x1, #3, MUL VL] + + addvl x2, x2, #4 + addvl x1, x1, #4 + + cbz x3, .Lcrypt_end + +.Lcrypt_loop_1x: + cmp x3, x5, LSR #4 + blt .Lcrypt_ce_loop_1x + + sub x3, x3, x5, LSR #4 /* x3 - VL */ + + ld1b {z0.b}, p0/z, [x2] + + SM4_SVE_CE_CRYPT_BLK(z0) + + st1b {z0.b}, p0, [x1] + + addvl x2, x2, #1 + addvl x1, x1, #1 + + cbz x3, .Lcrypt_end + b .Lcrypt_loop_1x + +.Lcrypt_ce_loop_1x: + sub x3, x3, #1 + + ld1 {v0.16b}, [x2], #16 + SM4_CE_CRYPT_BLK(v0) + st1 {v0.16b}, [x1], #16 + + cbnz x3, .Lcrypt_ce_loop_1x + +.Lcrypt_end: + ret +SYM_FUNC_END(sm4_sve_ce_crypt) + +.align 3 +SYM_FUNC_START(sm4_sve_ce_cbc_dec) + /* input: + * x0: round key 
array, CTX + * x1: dst + * x2: src + * x3: iv (big endian, 128 bit) + * w4: nblocks + */ + uxtw x4, w4 + SM4_PREPARE(x0) + + ld1 {RIVv.16b}, [x3] + ext RIV.b, RIV.b, RIV.b, #16 + +.Lcbc_dec_loop_8x: + sub x4, x4, x5, LSR #1 /* x4 - (8 * VL) */ + tbnz x4, #63, .Lcbc_dec_4x + + ld1b {z15.b}, p0/z, [x2] + ld1b {z14.b}, p0/z, [x2, #1, MUL VL] + ld1b {z13.b}, p0/z, [x2, #2, MUL VL] + ld1b {z12.b}, p0/z, [x2, #3, MUL VL] + ld1b {z11.b}, p0/z, [x2, #4, MUL VL] + ld1b {z10.b}, p0/z, [x2, #5, MUL VL] + ld1b {z9.b}, p0/z, [x2, #6, MUL VL] + ld1b {z8.b}, p0/z, [x2, #7, MUL VL] + rev z0.b, z15.b + rev z1.b, z14.b + rev z2.b, z13.b + rev z3.b, z12.b + rev z4.b, z11.b + rev z5.b, z10.b + rev z6.b, z9.b + rev z7.b, z8.b + rev RTMP0.b, RIV.b + ext z7.b, z7.b, z6.b, #16 + ext z6.b, z6.b, z5.b, #16 + ext z5.b, z5.b, z4.b, #16 + ext z4.b, z4.b, z3.b, #16 + ext z3.b, z3.b, z2.b, #16 + ext z2.b, z2.b, z1.b, #16 + ext z1.b, z1.b, z0.b, #16 + ext z0.b, z0.b, RTMP0.b, #16 + rev z7.b, z7.b + rev z6.b, z6.b + rev z5.b, z5.b + rev z4.b, z4.b + rev z3.b, z3.b + rev z2.b, z2.b + rev z1.b, z1.b + rev z0.b, z0.b + mov RIV.d, z8.d + + SM4_SVE_CE_CRYPT_BLK8(z15, z14, z13, z12, z11, z10, z9, z8) + + eor z0.d, z0.d, z15.d + eor z1.d, z1.d, z14.d + eor z2.d, z2.d, z13.d + eor z3.d, z3.d, z12.d + eor z4.d, z4.d, z11.d + eor z5.d, z5.d, z10.d + eor z6.d, z6.d, z9.d + eor z7.d, z7.d, z8.d + st1b {z0.b}, p0, [x1] + st1b {z1.b}, p0, [x1, #1, MUL VL] + st1b {z2.b}, p0, [x1, #2, MUL VL] + st1b {z3.b}, p0, [x1, #3, MUL VL] + st1b {z4.b}, p0, [x1, #4, MUL VL] + st1b {z5.b}, p0, [x1, #5, MUL VL] + st1b {z6.b}, p0, [x1, #6, MUL VL] + st1b {z7.b}, p0, [x1, #7, MUL VL] + + addvl x2, x2, #8 + addvl x1, x1, #8 + + cbz x4, .Lcbc_dec_end + b .Lcbc_dec_loop_8x + +.Lcbc_dec_4x: + add x4, x4, x5, LSR #1 + cmp x4, x5, LSR #2 + blt .Lcbc_dec_loop_1x + + sub x4, x4, x5, LSR #2 /* x4 - (4 * VL) */ + + ld1b {z15.b}, p0/z, [x2] + ld1b {z14.b}, p0/z, [x2, #1, MUL VL] + ld1b {z13.b}, p0/z, [x2, #2, MUL VL] + ld1b {z12.b}, p0/z, [x2, #3, MUL VL] + rev z0.b, z15.b + rev z1.b, z14.b + rev z2.b, z13.b + rev z3.b, z12.b + rev RTMP0.b, RIV.b + ext z3.b, z3.b, z2.b, #16 + ext z2.b, z2.b, z1.b, #16 + ext z1.b, z1.b, z0.b, #16 + ext z0.b, z0.b, RTMP0.b, #16 + rev z3.b, z3.b + rev z2.b, z2.b + rev z1.b, z1.b + rev z0.b, z0.b + mov RIV.d, z12.d + + SM4_SVE_CE_CRYPT_BLK4(z15, z14, z13, z12) + + eor z0.d, z0.d, z15.d + eor z1.d, z1.d, z14.d + eor z2.d, z2.d, z13.d + eor z3.d, z3.d, z12.d + st1b {z0.b}, p0, [x1] + st1b {z1.b}, p0, [x1, #1, MUL VL] + st1b {z2.b}, p0, [x1, #2, MUL VL] + st1b {z3.b}, p0, [x1, #3, MUL VL] + + addvl x2, x2, #4 + addvl x1, x1, #4 + + cbz x4, .Lcbc_dec_end + +.Lcbc_dec_loop_1x: + cmp x4, x5, LSR #4 + blt .Lcbc_dec_ce + + sub x4, x4, x5, LSR #4 /* x4 - VL */ + + ld1b {z15.b}, p0/z, [x2] + rev RTMP0.b, RIV.b + rev z0.b, z15.b + ext z0.b, z0.b, RTMP0.b, #16 + rev z0.b, z0.b + mov RIV.d, z15.d + + SM4_SVE_CE_CRYPT_BLK(z15) + + eor z0.d, z0.d, z15.d + st1b {z0.b}, p0, [x1] + + addvl x2, x2, #1 + addvl x1, x1, #1 + + cbz x4, .Lcbc_dec_end + b .Lcbc_dec_loop_1x + +.Lcbc_dec_ce: + rev RIV.s, RIV.s + tbl RIV.b, {RIV.b}, RSWAP128.b + +.Lcbc_dec_ce_loop_1x: + sub x4, x4, #1 + + ld1 {v15.16b}, [x2], #16 + mov v0.16b, RIVv.16b + mov RIVv.16b, v15.16b + SM4_CE_CRYPT_BLK(v15) + eor v0.16b, v0.16b, v15.16b + st1 {v0.16b}, [x1], #16 + + cbnz x4, .Lcbc_dec_ce_loop_1x + + ext RIV.b, RIV.b, RIV.b, #16 + +.Lcbc_dec_end: + /* store new IV */ + rev RIV.s, RIV.s + tbl RIV.b, {RIV.b}, RSWAP128.b + st1 {RIVv.16b}, [x3] + + ret +SYM_FUNC_END(sm4_sve_ce_cbc_dec) + 
+.align 3 +SYM_FUNC_START(sm4_sve_ce_cfb_dec) + /* input: + * x0: round key array, CTX + * x1: dst + * x2: src + * x3: iv (big endian, 128 bit) + * w4: nblocks + */ + uxtw x4, w4 + SM4_PREPARE(x0) + + ld1 {RIVv.16b}, [x3] + ext RIV.b, RIV.b, RIV.b, #16 + +.Lcfb_dec_loop_8x: + sub x4, x4, x5, LSR #1 /* x4 - (8 * VL) */ + tbnz x4, #63, .Lcfb_dec_4x + + ld1b {z15.b}, p0/z, [x2] + ld1b {z14.b}, p0/z, [x2, #1, MUL VL] + ld1b {z13.b}, p0/z, [x2, #2, MUL VL] + ld1b {z12.b}, p0/z, [x2, #3, MUL VL] + ld1b {z11.b}, p0/z, [x2, #4, MUL VL] + ld1b {z10.b}, p0/z, [x2, #5, MUL VL] + ld1b {z9.b}, p0/z, [x2, #6, MUL VL] + ld1b {z8.b}, p0/z, [x2, #7, MUL VL] + rev z0.b, z15.b + rev z1.b, z14.b + rev z2.b, z13.b + rev z3.b, z12.b + rev z4.b, z11.b + rev z5.b, z10.b + rev z6.b, z9.b + rev z7.b, z8.b + rev RTMP0.b, RIV.b + ext z7.b, z7.b, z6.b, #16 + ext z6.b, z6.b, z5.b, #16 + ext z5.b, z5.b, z4.b, #16 + ext z4.b, z4.b, z3.b, #16 + ext z3.b, z3.b, z2.b, #16 + ext z2.b, z2.b, z1.b, #16 + ext z1.b, z1.b, z0.b, #16 + ext z0.b, z0.b, RTMP0.b, #16 + rev z7.b, z7.b + rev z6.b, z6.b + rev z5.b, z5.b + rev z4.b, z4.b + rev z3.b, z3.b + rev z2.b, z2.b + rev z1.b, z1.b + rev z0.b, z0.b + mov RIV.d, z8.d + + SM4_SVE_CE_CRYPT_BLK8(z0, z1, z2, z3, z4, z5, z6, z7) + + eor z0.d, z0.d, z15.d + eor z1.d, z1.d, z14.d + eor z2.d, z2.d, z13.d + eor z3.d, z3.d, z12.d + eor z4.d, z4.d, z11.d + eor z5.d, z5.d, z10.d + eor z6.d, z6.d, z9.d + eor z7.d, z7.d, z8.d + st1b {z0.b}, p0, [x1] + st1b {z1.b}, p0, [x1, #1, MUL VL] + st1b {z2.b}, p0, [x1, #2, MUL VL] + st1b {z3.b}, p0, [x1, #3, MUL VL] + st1b {z4.b}, p0, [x1, #4, MUL VL] + st1b {z5.b}, p0, [x1, #5, MUL VL] + st1b {z6.b}, p0, [x1, #6, MUL VL] + st1b {z7.b}, p0, [x1, #7, MUL VL] + + addvl x2, x2, #8 + addvl x1, x1, #8 + + cbz x4, .Lcfb_dec_end + b .Lcfb_dec_loop_8x + +.Lcfb_dec_4x: + add x4, x4, x5, LSR #1 + cmp x4, x5, LSR #2 + blt .Lcfb_dec_loop_1x + + sub x4, x4, x5, LSR #2 /* x4 - (4 * VL) */ + + ld1b {z15.b}, p0/z, [x2] + ld1b {z14.b}, p0/z, [x2, #1, MUL VL] + ld1b {z13.b}, p0/z, [x2, #2, MUL VL] + ld1b {z12.b}, p0/z, [x2, #3, MUL VL] + rev z0.b, z15.b + rev z1.b, z14.b + rev z2.b, z13.b + rev z3.b, z12.b + rev RTMP0.b, RIV.b + ext z3.b, z3.b, z2.b, #16 + ext z2.b, z2.b, z1.b, #16 + ext z1.b, z1.b, z0.b, #16 + ext z0.b, z0.b, RTMP0.b, #16 + rev z3.b, z3.b + rev z2.b, z2.b + rev z1.b, z1.b + rev z0.b, z0.b + mov RIV.d, z12.d + + SM4_SVE_CE_CRYPT_BLK4(z0, z1, z2, z3) + + eor z0.d, z0.d, z15.d + eor z1.d, z1.d, z14.d + eor z2.d, z2.d, z13.d + eor z3.d, z3.d, z12.d + st1b {z0.b}, p0, [x1] + st1b {z1.b}, p0, [x1, #1, MUL VL] + st1b {z2.b}, p0, [x1, #2, MUL VL] + st1b {z3.b}, p0, [x1, #3, MUL VL] + + addvl x2, x2, #4 + addvl x1, x1, #4 + + cbz x4, .Lcfb_dec_end + +.Lcfb_dec_loop_1x: + cmp x4, x5, LSR #4 + blt .Lcfb_dec_ce + + sub x4, x4, x5, LSR #4 /* x4 - VL */ + + ld1b {z15.b}, p0/z, [x2] + rev RTMP0.b, RIV.b + rev z0.b, z15.b + ext z0.b, z0.b, RTMP0.b, #16 + rev z0.b, z0.b + mov RIV.d, z15.d + + SM4_SVE_CE_CRYPT_BLK(z0) + + eor z0.d, z0.d, z15.d + st1b {z0.b}, p0, [x1] + + addvl x2, x2, #1 + addvl x1, x1, #1 + + cbz x4, .Lcfb_dec_end + b .Lcfb_dec_loop_1x + +.Lcfb_dec_ce: + rev RIV.s, RIV.s + tbl RIV.b, {RIV.b}, RSWAP128.b + +.Lcfb_dec_ce_loop_1x: + sub x4, x4, #1 + + ld1 {v15.16b}, [x2], #16 + mov v0.16b, RIVv.16b + mov RIVv.16b, v15.16b + SM4_CE_CRYPT_BLK(v0) + eor v0.16b, v0.16b, v15.16b + st1 {v0.16b}, [x1], #16 + + cbnz x4, .Lcfb_dec_ce_loop_1x + + ext RIV.b, RIV.b, RIV.b, #16 + +.Lcfb_dec_end: + /* store new IV */ + rev RIV.s, RIV.s + tbl RIV.b, {RIV.b}, RSWAP128.b + st1 
{RIVv.16b}, [x3] + + ret +SYM_FUNC_END(sm4_sve_ce_cfb_dec) + +.align 3 +SYM_FUNC_START(sm4_sve_ce_ctr_crypt) + /* input: + * x0: round key array, CTX + * x1: dst + * x2: src + * x3: ctr (big endian, 128 bit) + * w4: nblocks + */ + uxtw x4, w4 + SM4_PREPARE(x0) + + dup RZERO.d, #0 + adr_l x6, .Lle128_inc + ld1b {RLE128_INC.b}, p0/z, [x6] + + ldp x7, x8, [x3] + rev x7, x7 + rev x8, x8 + +.Lctr_loop_8x: + sub x4, x4, x5, LSR #1 /* x4 - (8 * VL) */ + tbnz x4, #63, .Lctr_4x + + inc_le128_8x(z0, z1, z2, z3, z4, z5, z6, z7) + + ld1b {z8.b}, p0/z, [x2] + ld1b {z9.b}, p0/z, [x2, #1, MUL VL] + ld1b {z10.b}, p0/z, [x2, #2, MUL VL] + ld1b {z11.b}, p0/z, [x2, #3, MUL VL] + ld1b {z12.b}, p0/z, [x2, #4, MUL VL] + ld1b {z13.b}, p0/z, [x2, #5, MUL VL] + ld1b {z14.b}, p0/z, [x2, #6, MUL VL] + ld1b {z15.b}, p0/z, [x2, #7, MUL VL] + + SM4_SVE_CE_CRYPT_BLK8(z0, z1, z2, z3, z4, z5, z6, z7) + + eor z0.d, z0.d, z8.d + eor z1.d, z1.d, z9.d + eor z2.d, z2.d, z10.d + eor z3.d, z3.d, z11.d + eor z4.d, z4.d, z12.d + eor z5.d, z5.d, z13.d + eor z6.d, z6.d, z14.d + eor z7.d, z7.d, z15.d + + st1b {z0.b}, p0, [x1] + st1b {z1.b}, p0, [x1, #1, MUL VL] + st1b {z2.b}, p0, [x1, #2, MUL VL] + st1b {z3.b}, p0, [x1, #3, MUL VL] + st1b {z4.b}, p0, [x1, #4, MUL VL] + st1b {z5.b}, p0, [x1, #5, MUL VL] + st1b {z6.b}, p0, [x1, #6, MUL VL] + st1b {z7.b}, p0, [x1, #7, MUL VL] + + addvl x2, x2, #8 + addvl x1, x1, #8 + + cbz x4, .Lctr_end + b .Lctr_loop_8x + +.Lctr_4x: + add x4, x4, x5, LSR #1 + cmp x4, x5, LSR #2 + blt .Lctr_loop_1x + + sub x4, x4, x5, LSR #2 /* x4 - (4 * VL) */ + + inc_le128_4x(z0, z1, z2, z3) + + ld1b {z8.b}, p0/z, [x2] + ld1b {z9.b}, p0/z, [x2, #1, MUL VL] + ld1b {z10.b}, p0/z, [x2, #2, MUL VL] + ld1b {z11.b}, p0/z, [x2, #3, MUL VL] + + SM4_SVE_CE_CRYPT_BLK4(z0, z1, z2, z3) + + eor z0.d, z0.d, z8.d + eor z1.d, z1.d, z9.d + eor z2.d, z2.d, z10.d + eor z3.d, z3.d, z11.d + + st1b {z0.b}, p0, [x1] + st1b {z1.b}, p0, [x1, #1, MUL VL] + st1b {z2.b}, p0, [x1, #2, MUL VL] + st1b {z3.b}, p0, [x1, #3, MUL VL] + + addvl x2, x2, #4 + addvl x1, x1, #4 + + cbz x4, .Lctr_end + +.Lctr_loop_1x: + cmp x4, x5, LSR #4 + blt .Lctr_ce_loop_1x + + sub x4, x4, x5, LSR #4 /* x4 - VL */ + + inc_le128(z0) + ld1b {z8.b}, p0/z, [x2] + + SM4_SVE_CE_CRYPT_BLK(z0) + + eor z0.d, z0.d, z8.d + st1b {z0.b}, p0, [x1] + + addvl x2, x2, #1 + addvl x1, x1, #1 + + cbz x4, .Lctr_end + b .Lctr_loop_1x + +.Lctr_ce_loop_1x: + sub x4, x4, #1 + + /* inc_le128 for CE */ + mov v0.d[1], x8 + mov v0.d[0], x7 + adds x8, x8, #1 + rev64 v0.16b, v0.16b + adc x7, x7, xzr + + ld1 {v8.16b}, [x2], #16 + + SM4_CE_CRYPT_BLK(v0) + + eor v0.16b, v0.16b, v8.16b + st1 {v0.16b}, [x1], #16 + + cbnz x4, .Lctr_ce_loop_1x + +.Lctr_end: + /* store new CTR */ + rev x7, x7 + rev x8, x8 + stp x7, x8, [x3] + + ret +SYM_FUNC_END(sm4_sve_ce_ctr_crypt) + +.align 3 +SYM_FUNC_START(sm4_sve_get_vl) + /* VL in bytes */ + rdvl x0, #1 + + ret +SYM_FUNC_END(sm4_sve_get_vl) + + + .section ".rodata", "a" + .align 4 +.Lbswap128_mask: + .byte 0x0c, 0x0d, 0x0e, 0x0f, 0x08, 0x09, 0x0a, 0x0b + .byte 0x04, 0x05, 0x06, 0x07, 0x00, 0x01, 0x02, 0x03 + .byte 0x1c, 0x1d, 0x1e, 0x1f, 0x18, 0x19, 0x1a, 0x1b + .byte 0x14, 0x15, 0x16, 0x17, 0x10, 0x11, 0x12, 0x13 + .byte 0x2c, 0x2d, 0x2e, 0x2f, 0x28, 0x29, 0x2a, 0x2b + .byte 0x24, 0x25, 0x26, 0x27, 0x20, 0x21, 0x22, 0x23 + .byte 0x3c, 0x3d, 0x3e, 0x3f, 0x38, 0x39, 0x3a, 0x3b + .byte 0x34, 0x35, 0x36, 0x37, 0x30, 0x31, 0x32, 0x33 + .byte 0x4c, 0x4d, 0x4e, 0x4f, 0x48, 0x49, 0x4a, 0x4b + .byte 0x44, 0x45, 0x46, 0x47, 0x40, 0x41, 0x42, 0x43 + .byte 0x5c, 0x5d, 0x5e, 0x5f, 
0x58, 0x59, 0x5a, 0x5b + .byte 0x54, 0x55, 0x56, 0x57, 0x50, 0x51, 0x52, 0x53 + .byte 0x6c, 0x6d, 0x6e, 0x6f, 0x68, 0x69, 0x6a, 0x6b + .byte 0x64, 0x65, 0x66, 0x67, 0x60, 0x61, 0x62, 0x63 + .byte 0x7c, 0x7d, 0x7e, 0x7f, 0x78, 0x79, 0x7a, 0x7b + .byte 0x74, 0x75, 0x76, 0x77, 0x70, 0x71, 0x72, 0x73 + .byte 0x8c, 0x8d, 0x8e, 0x8f, 0x88, 0x89, 0x8a, 0x8b + .byte 0x84, 0x85, 0x86, 0x87, 0x80, 0x81, 0x82, 0x83 + .byte 0x9c, 0x9d, 0x9e, 0x9f, 0x98, 0x99, 0x9a, 0x9b + .byte 0x94, 0x95, 0x96, 0x97, 0x90, 0x91, 0x92, 0x93 + .byte 0xac, 0xad, 0xae, 0xaf, 0xa8, 0xa9, 0xaa, 0xab + .byte 0xa4, 0xa5, 0xa6, 0xa7, 0xa0, 0xa1, 0xa2, 0xa3 + .byte 0xbc, 0xbd, 0xbe, 0xbf, 0xb8, 0xb9, 0xba, 0xbb + .byte 0xb4, 0xb5, 0xb6, 0xb7, 0xb0, 0xb1, 0xb2, 0xb3 + .byte 0xcc, 0xcd, 0xce, 0xcf, 0xc8, 0xc9, 0xca, 0xcb + .byte 0xc4, 0xc5, 0xc6, 0xc7, 0xc0, 0xc1, 0xc2, 0xc3 + .byte 0xdc, 0xdd, 0xde, 0xdf, 0xd8, 0xd9, 0xda, 0xdb + .byte 0xd4, 0xd5, 0xd6, 0xd7, 0xd0, 0xd1, 0xd2, 0xd3 + .byte 0xec, 0xed, 0xee, 0xef, 0xe8, 0xe9, 0xea, 0xeb + .byte 0xe4, 0xe5, 0xe6, 0xe7, 0xe0, 0xe1, 0xe2, 0xe3 + .byte 0xfc, 0xfd, 0xfe, 0xff, 0xf8, 0xf9, 0xfa, 0xfb + .byte 0xf4, 0xf5, 0xf6, 0xf7, 0xf0, 0xf1, 0xf2, 0xf3 + +.Lle128_inc: + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x01, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x02, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x03, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x04, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x05, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x06, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x07, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x09, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x0a, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x0b, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x0c, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x0d, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x0e, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x0f, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 + .byte 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 diff --git a/arch/arm64/crypto/sm4-sve-ce-glue.c b/arch/arm64/crypto/sm4-sve-ce-glue.c new file mode 100644 index 000000000000..fc797b72b5f0 --- /dev/null +++ b/arch/arm64/crypto/sm4-sve-ce-glue.c @@ -0,0 +1,332 @@ +/* SPDX-License-Identifier: GPL-2.0-or-later */ +/* + * SM4 Cipher Algorithm, using ARMv9 Crypto Extensions with SVE2 + * as specified in + * https://tools.ietf.org/id/draft-ribose-cfrg-sm4-10.html + * + * Copyright (C) 2022, Alibaba Group. 
+ * Copyright (C) 2022 Tianjia Zhang <tianjia.zhang@linux.alibaba.com> + */ + +#include <linux/module.h> +#include <linux/crypto.h> +#include <linux/kernel.h> +#include <linux/cpufeature.h> +#include <asm/neon.h> +#include <asm/simd.h> +#include <crypto/internal/simd.h> +#include <crypto/internal/skcipher.h> +#include <crypto/sm4.h> +#include "sm4-ce.h" + +asmlinkage void sm4_sve_ce_crypt(const u32 *rkey, u8 *dst, + const u8 *src, unsigned int nblocks); +asmlinkage void sm4_sve_ce_cbc_dec(const u32 *rkey_dec, u8 *dst, + const u8 *src, u8 *iv, + unsigned int nblocks); +asmlinkage void sm4_sve_ce_cfb_dec(const u32 *rkey_enc, u8 *dst, + const u8 *src, u8 *iv, + unsigned int nblocks); +asmlinkage void sm4_sve_ce_ctr_crypt(const u32 *rkey_enc, u8 *dst, + const u8 *src, u8 *iv, + unsigned int nblocks); +asmlinkage unsigned int sm4_sve_get_vl(void); + + +static int sm4_setkey(struct crypto_skcipher *tfm, const u8 *key, + unsigned int key_len) +{ + struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm); + + if (key_len != SM4_KEY_SIZE) + return -EINVAL; + + kernel_neon_begin(); + sm4_ce_expand_key(key, ctx->rkey_enc, ctx->rkey_dec, + crypto_sm4_fk, crypto_sm4_ck); + kernel_neon_end(); + + return 0; +} + +static int ecb_crypt(struct skcipher_request *req, const u32 *rkey) +{ + struct skcipher_walk walk; + unsigned int nbytes; + int err; + + err = skcipher_walk_virt(&walk, req, false); + + while ((nbytes = walk.nbytes) > 0) { + const u8 *src = walk.src.virt.addr; + u8 *dst = walk.dst.virt.addr; + unsigned int nblocks; + + nblocks = nbytes / SM4_BLOCK_SIZE; + if (nblocks) { + kernel_neon_begin(); + + sm4_sve_ce_crypt(rkey, dst, src, nblocks); + + kernel_neon_end(); + } + + err = skcipher_walk_done(&walk, nbytes % SM4_BLOCK_SIZE); + } + + return err; +} + +static int ecb_encrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm); + + return ecb_crypt(req, ctx->rkey_enc); +} + +static int ecb_decrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm); + + return ecb_crypt(req, ctx->rkey_dec); +} + +static int cbc_crypt(struct skcipher_request *req, const u32 *rkey, + void (*sm4_cbc_crypt)(const u32 *rkey, u8 *dst, + const u8 *src, u8 *iv, unsigned int nblocks)) +{ + struct skcipher_walk walk; + unsigned int nbytes; + int err; + + err = skcipher_walk_virt(&walk, req, false); + + while ((nbytes = walk.nbytes) > 0) { + const u8 *src = walk.src.virt.addr; + u8 *dst = walk.dst.virt.addr; + unsigned int nblocks; + + nblocks = nbytes / SM4_BLOCK_SIZE; + if (nblocks) { + kernel_neon_begin(); + + sm4_cbc_crypt(rkey, dst, src, walk.iv, nblocks); + + kernel_neon_end(); + } + + err = skcipher_walk_done(&walk, nbytes % SM4_BLOCK_SIZE); + } + + return err; +} + +static int cbc_encrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm); + + return cbc_crypt(req, ctx->rkey_enc, sm4_ce_cbc_enc); +} + +static int cbc_decrypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm); + + return cbc_crypt(req, ctx->rkey_dec, sm4_sve_ce_cbc_dec); +} + +static int cfb_crypt(struct skcipher_request *req, + void (*sm4_cfb_crypt)(const u32 *rkey, u8 *dst, + const u8 *src, u8 *iv, unsigned int nblocks)) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct 
sm4_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + unsigned int nbytes; + int err; + + err = skcipher_walk_virt(&walk, req, false); + + while ((nbytes = walk.nbytes) > 0) { + const u8 *src = walk.src.virt.addr; + u8 *dst = walk.dst.virt.addr; + unsigned int nblocks; + + nblocks = nbytes / SM4_BLOCK_SIZE; + if (nblocks) { + kernel_neon_begin(); + + sm4_cfb_crypt(ctx->rkey_enc, dst, src, + walk.iv, nblocks); + + kernel_neon_end(); + + dst += nblocks * SM4_BLOCK_SIZE; + src += nblocks * SM4_BLOCK_SIZE; + nbytes -= nblocks * SM4_BLOCK_SIZE; + } + + /* tail */ + if (walk.nbytes == walk.total && nbytes > 0) { + u8 keystream[SM4_BLOCK_SIZE]; + + sm4_ce_crypt_block(ctx->rkey_enc, keystream, walk.iv); + crypto_xor_cpy(dst, src, keystream, nbytes); + nbytes = 0; + } + + err = skcipher_walk_done(&walk, nbytes); + } + + return err; +} + +static int cfb_encrypt(struct skcipher_request *req) +{ + return cfb_crypt(req, sm4_ce_cfb_enc); +} + +static int cfb_decrypt(struct skcipher_request *req) +{ + return cfb_crypt(req, sm4_sve_ce_cfb_dec); +} + +static int ctr_crypt(struct skcipher_request *req) +{ + struct crypto_skcipher *tfm = crypto_skcipher_reqtfm(req); + struct sm4_ctx *ctx = crypto_skcipher_ctx(tfm); + struct skcipher_walk walk; + unsigned int nbytes; + int err; + + err = skcipher_walk_virt(&walk, req, false); + + while ((nbytes = walk.nbytes) > 0) { + const u8 *src = walk.src.virt.addr; + u8 *dst = walk.dst.virt.addr; + unsigned int nblocks; + + nblocks = nbytes / SM4_BLOCK_SIZE; + if (nblocks) { + kernel_neon_begin(); + + sm4_sve_ce_ctr_crypt(ctx->rkey_enc, dst, src, + walk.iv, nblocks); + + kernel_neon_end(); + + dst += nblocks * SM4_BLOCK_SIZE; + src += nblocks * SM4_BLOCK_SIZE; + nbytes -= nblocks * SM4_BLOCK_SIZE; + } + + /* tail */ + if (walk.nbytes == walk.total && nbytes > 0) { + u8 keystream[SM4_BLOCK_SIZE]; + + sm4_ce_crypt_block(ctx->rkey_enc, keystream, walk.iv); + crypto_inc(walk.iv, SM4_BLOCK_SIZE); + crypto_xor_cpy(dst, src, keystream, nbytes); + nbytes = 0; + } + + err = skcipher_walk_done(&walk, nbytes); + } + + return err; +} + +static struct skcipher_alg sm4_algs[] = { + { + .base = { + .cra_name = "ecb(sm4)", + .cra_driver_name = "ecb-sm4-sve-ce", + .cra_priority = 500, + .cra_blocksize = SM4_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct sm4_ctx), + .cra_module = THIS_MODULE, + }, + .min_keysize = SM4_KEY_SIZE, + .max_keysize = SM4_KEY_SIZE, + .setkey = sm4_setkey, + .encrypt = ecb_encrypt, + .decrypt = ecb_decrypt, + }, { + .base = { + .cra_name = "cbc(sm4)", + .cra_driver_name = "cbc-sm4-sve-ce", + .cra_priority = 500, + .cra_blocksize = SM4_BLOCK_SIZE, + .cra_ctxsize = sizeof(struct sm4_ctx), + .cra_module = THIS_MODULE, + }, + .min_keysize = SM4_KEY_SIZE, + .max_keysize = SM4_KEY_SIZE, + .ivsize = SM4_BLOCK_SIZE, + .setkey = sm4_setkey, + .encrypt = cbc_encrypt, + .decrypt = cbc_decrypt, + }, { + .base = { + .cra_name = "cfb(sm4)", + .cra_driver_name = "cfb-sm4-sve-ce", + .cra_priority = 500, + .cra_blocksize = 1, + .cra_ctxsize = sizeof(struct sm4_ctx), + .cra_module = THIS_MODULE, + }, + .min_keysize = SM4_KEY_SIZE, + .max_keysize = SM4_KEY_SIZE, + .ivsize = SM4_BLOCK_SIZE, + .chunksize = SM4_BLOCK_SIZE, + .setkey = sm4_setkey, + .encrypt = cfb_encrypt, + .decrypt = cfb_decrypt, + }, { + .base = { + .cra_name = "ctr(sm4)", + .cra_driver_name = "ctr-sm4-sve-ce", + .cra_priority = 500, + .cra_blocksize = 1, + .cra_ctxsize = sizeof(struct sm4_ctx), + .cra_module = THIS_MODULE, + }, + .min_keysize = SM4_KEY_SIZE, + .max_keysize = SM4_KEY_SIZE, + .ivsize = 
SM4_BLOCK_SIZE, + .chunksize = SM4_BLOCK_SIZE, + .setkey = sm4_setkey, + .encrypt = ctr_crypt, + .decrypt = ctr_crypt, + } +}; + +static int __init sm4_sve_ce_init(void) +{ + if (sm4_sve_get_vl() <= 16) + return -ENODEV; + + return crypto_register_skciphers(sm4_algs, ARRAY_SIZE(sm4_algs)); +} + +static void __exit sm4_sve_ce_exit(void) +{ + crypto_unregister_skciphers(sm4_algs, ARRAY_SIZE(sm4_algs)); +} + +module_cpu_feature_match(SVESM4, sm4_sve_ce_init); +module_exit(sm4_sve_ce_exit); + +MODULE_DESCRIPTION("SM4 ECB/CBC/CFB/CTR using ARMv9 Crypto Extensions with SVE2"); +MODULE_ALIAS_CRYPTO("sm4-sve-ce"); +MODULE_ALIAS_CRYPTO("sm4"); +MODULE_ALIAS_CRYPTO("ecb(sm4)"); +MODULE_ALIAS_CRYPTO("cbc(sm4)"); +MODULE_ALIAS_CRYPTO("cfb(sm4)"); +MODULE_ALIAS_CRYPTO("ctr(sm4)"); +MODULE_AUTHOR("Tianjia Zhang <tianjia.zhang@linux.alibaba.com>"); +MODULE_LICENSE("GPL v2");