Message ID | 20241007012430.163606-1-ebiggers@kernel.org (mailing list archive) |
---|---|
Series | AEGIS x86 assembly tuning |
On Mon, Oct 7, 2024 at 3:33 AM Eric Biggers <ebiggers@kernel.org> wrote:
>
> This series cleans up the AES-NI optimized implementation of AEGIS-128.
>
> Performance is improved by 1-5% depending on the input lengths. Binary
> code size is reduced by about 20% (measuring glue + assembly combined),
> and source code length is reduced by about 150 lines.
>
> The first patch also fixes a bug which could theoretically cause
> incorrect behavior but was seemingly not being encountered in practice.
>
> Note: future optimizations for AEGIS-128 could involve adding AVX512 /
> AVX10 optimized assembly code. However, unfortunately due to the way
> that AEGIS-128 is specified, its level of parallelism is limited, and it
> can't really take advantage of vector lengths greater than 128 bits.
> So, probably this would provide only another modest improvement, mostly
> coming from being able to use the ternary logic instructions.
>
> Eric Biggers (10):
>   crypto: x86/aegis128 - access 32-bit arguments as 32-bit
>   crypto: x86/aegis128 - remove no-op init and exit functions
>   crypto: x86/aegis128 - eliminate some indirect calls
>   crypto: x86/aegis128 - don't bother with special code for aligned data
>   crypto: x86/aegis128 - optimize length block preparation using SSE4.1
>   crypto: x86/aegis128 - improve assembly function prototypes
>   crypto: x86/aegis128 - optimize partial block handling using SSE4.1
>   crypto: x86/aegis128 - take advantage of block-aligned len
>   crypto: x86/aegis128 - remove unneeded FRAME_BEGIN and FRAME_END
>   crypto: x86/aegis128 - remove unneeded RETs
>
>  arch/x86/crypto/Kconfig               |   4 +-
>  arch/x86/crypto/aegis128-aesni-asm.S  | 532 ++++++++++----------------
>  arch/x86/crypto/aegis128-aesni-glue.c | 145 ++++---
>  3 files changed, 261 insertions(+), 420 deletions(-)
>
>
> base-commit: 9852d85ec9d492ebef56dc5f229416c925758edc
> --
> 2.46.2
>

Nice work!
Notwithstanding my non-blocking comment on patch #3:

Reviewed-by: Ondrej Mosnacek <omosnace@redhat.com>

--
Ondrej Mosnacek
Senior Software Engineer, Linux Security - SELinux kernel
Red Hat, Inc.