mbox series

[0/3] crypto: x86/twofish-3way - Cleanup and optimize asm

Message ID cover.1675653010.git.peter@n8pjl.ca (mailing list archive)
Headers show
Series crypto: x86/twofish-3way - Cleanup and optimize asm | expand

Message

Peter Lafreniere Feb. 6, 2023, 3:24 a.m. UTC
1/3 removes the unused xor argument to encode functions. This argument
is deadweight and its removal shaves off a both a few cycles per call as
well as a small amount of lines.

2/3 moves handling for cbc mode decryption to assembly in order to
remove overhead, yielding a ~6% speedup on AMD Zen1.

3/3 makes a minor readability change that doesn't fit well into 2/3.

Peter Lafreniere (3):
  crypto: x86/twofish-3way - Remove unused encode parameter
  crypto: x86/twofish-3way - Perform cbc xor in assembly
  crypto: x86/twofish-3way - Remove unused macro argument

 arch/x86/crypto/twofish-x86_64-asm_64-3way.S | 71 ++++++++++++--------
 arch/x86/crypto/twofish.h                    | 19 ++++--
 arch/x86/crypto/twofish_avx_glue.c           |  5 --
 arch/x86/crypto/twofish_glue_3way.c          | 22 +-----
 4 files changed, 59 insertions(+), 58 deletions(-)

Comments

Herbert Xu Feb. 14, 2023, 8:44 a.m. UTC | #1
Peter Lafreniere <peter@n8pjl.ca> wrote:
> 1/3 removes the unused xor argument to encode functions. This argument
> is deadweight and its removal shaves off a both a few cycles per call as
> well as a small amount of lines.
> 
> 2/3 moves handling for cbc mode decryption to assembly in order to
> remove overhead, yielding a ~6% speedup on AMD Zen1.
> 
> 3/3 makes a minor readability change that doesn't fit well into 2/3.
> 
> Peter Lafreniere (3):
>  crypto: x86/twofish-3way - Remove unused encode parameter
>  crypto: x86/twofish-3way - Perform cbc xor in assembly
>  crypto: x86/twofish-3way - Remove unused macro argument
> 
> arch/x86/crypto/twofish-x86_64-asm_64-3way.S | 71 ++++++++++++--------
> arch/x86/crypto/twofish.h                    | 19 ++++--
> arch/x86/crypto/twofish_avx_glue.c           |  5 --
> arch/x86/crypto/twofish_glue_3way.c          | 22 +-----
> 4 files changed, 59 insertions(+), 58 deletions(-)

This fails the self-test on my machine:

[   91.709458] alg: skcipher: ecb-twofish-3way decryption test failed (wrong result) on test vector 3, cfg="in-place (one sglist)"
[   91.710358] alg: self-tests for ecb(twofish) using ecb-twofish-3way failed (rc=-22)
[   91.710370] ------------[ cut here ]------------
[   91.711228] alg: self-tests for ecb(twofish) using ecb-twofish-3way failed (rc=-22)
[   91.711288] WARNING: CPU: 1 PID: 3245 at crypto/testmgr.c:5858 alg_test.part.0.cold+0xea/0x128 [cryptomgr]
[   91.712672] Modules linked in: twofish_x86_64_3way(+) sm3_generic rmd160 ip_vti ip_tunnel af_key ah6 ah4 esp6 esp4 xfrm4_tunnel tunnel4 ipcomp ipcomp6 xfrm6_tunnel xfrm_ipcomp tunnel6 cfb sm4_generic sm4_aesni_avx_x86_64 sm4 sm3 poly1305_generic poly1305_x86_64 nhpoly1305_sse2 nhpoly1305 libpoly1305 des3_ede_x86_64 curve25519_x86_64 libcurve25519_generic chacha_generic chacha_x86_64 libchacha aria_aesni_avx_x86_64 aria_generic aegis128 aegis128_aesni chacha20poly1305 cmac camellia_aesni_avx_x86_64 camellia_generic camellia_x86_64 lzo lzo_compress lzo_decompress cast6_avx_x86_64 cast6_generic cast5_avx_x86_64 cast5_generic cast_common deflate ccm serpent_avx_x86_64 serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64 blowfish_common twofish_generic twofish_x86_64 twofish_common xcbc md5 des_generic libdes xt_CHECKSUM nft_chain_nat xt_MASQUERADE nf_nat nf_conntrack xfrm_user xfrm_algo nf_defrag_ipv6 nf_defrag_ipv4 xt_tcpudp nft_compat bridge stp llc nf_tables libcrc32c
[   91.712714]  nfnetlink af_packet intel_rapl_msr intel_rapl_common iosf_mbi kvm_intel kvm irqbypass crc32_generic crc32_pclmul polyval_clmulni polyval_generic ghash_clmulni_intel sha512_ssse3 sha512_generic xctr ghash_generic gf128mul gcm crypto_null xts sd_mod t10_pi ctr crc64_rocksoft_generic crc64_rocksoft crc_t10dif cts crct10dif_generic crct10dif_pclmul crc64 cbc sr_mod crct10dif_common cdrom sg joydev mousedev ppdev aes_generic ecb aesni_intel snd_pcm bochs drm_vram_helper drm_ttm_helper snd_timer virtio_net libaes ttm snd libcryptoutils drm_kms_helper evdev input_leds ata_generic pata_acpi crypto_simd cryptd rapl net_failover failover psmouse drm soundcore led_class ata_piix drm_panel_orientation_quirks cfbfillrect cfbimgblt cfbcopyarea syscopyarea sysfillrect sysimgblt libata pcspkr serio_raw fb fbdev backlight floppy intel_agp rtc_cmos i2c_piix4 intel_gtt scsi_mod parport_pc agpgart i2c_core scsi_common parport tiny_power_button button qemu_fw_cfg ip_tables x_tables sha256_ssse3
[   91.719801]  sha256_generic libsha256 sha1_ssse3 sha1_generic hmac ipv6 autofs4 ext4 crc16 mbcache jbd2 dm_snapshot dm_bufio dm_mod virtio_pci virtio_pci_legacy_dev virtio_pci_modern_dev virtio_blk virtio_ring virtio unix crc32c_generic crc32c_intel cryptomgr kpp crypto_acompress akcipher rng aead crypto_hash skcipher crypto_algapi crypto [last unloaded: twofish_x86_64_3way]
[   91.729585] CPU: 1 PID: 3245 Comm: cryptomgr_test Not tainted 6.2.0-rc1+ #4
[   91.730089] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.14.0-2 04/01/2014
[   91.730727] RIP: 0010:alg_test.part.0.cold+0xea/0x128 [cryptomgr]
[   91.731154] Code: dd ea 46 e1 44 8b 85 54 ff ff ff 41 83 f8 fe 0f 84 61 9f ff ff 44 89 c1 4c 89 ea 4c 89 fe 48 c7 c7 30 82 2a a0 e8 b6 cd 46 e1 <0f> 0b 44 8b 85 54 ff ff ff e9 3e 9f ff ff 0f 0b 48 c7 c7 08 81 2a
[   91.732625] RSP: 0018:ffffc900001e3e18 EFLAGS: 00010296
[   91.732989] RAX: 0000000000000047 RBX: 0000000000000079 RCX: ffff88801ed1c4c8
[   91.733494] RDX: 00000000ffffffd8 RSI: 0000000000000027 RDI: ffff88801ed1c4c0
[   91.733996] RBP: ffffc900001e3ed8 R08: ffffffff81ea3dc8 R09: 0000000000000003
[   91.734516] R10: 00000000000001d3 R11: ffffffff81e8e9a8 R12: 000000000000007b
[   91.735028] R13: ffff888003334200 R14: 000000000000007b R15: ffff888003334280
[   91.735532] FS:  0000000000000000(0000) GS:ffff88801ed00000(0000) knlGS:0000000000000000
[   91.736113] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   91.736536] CR2: 00007f253884c060 CR3: 000000000393f003 CR4: 0000000000060ee0
[   91.737068] Call Trace:
[   91.737191]  <TASK>
[   91.737288]  ? __schedule+0x274/0xf90
[   91.737529]  ? _raw_spin_unlock_irqrestore+0xd/0x40
[   91.737860]  ? try_to_wake_up+0x18f/0x2d0
[   91.738126]  ? __pfx_cryptomgr_test+0x10/0x10 [cryptomgr]
[   91.738510]  alg_test+0x18/0x50 [cryptomgr]
[   91.739551]  cryptomgr_test+0x24/0x50 [cryptomgr]
[   91.740362]  kthread+0xe9/0x110
[   91.741011]  ? __pfx_kthread+0x10/0x10
[   91.741650]  ret_from_fork+0x2c/0x50
[   91.742272]  </TASK>
[   91.742799] ---[ end trace 0000000000000000 ]---

Cheers,