From patchwork Wed Jul 25 01:29:07 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Eric Biggers
X-Patchwork-Id: 10543403
X-Patchwork-Delegate: herbert@gondor.apana.org.au
Return-Path:
Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org
 [172.30.200.125])
 by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C65D617FD
 for ; Wed, 25 Jul 2018 01:29:52 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B6D08298BD
 for ; Wed, 25 Jul 2018 01:29:52 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
 id AB403298D4; Wed, 25 Jul 2018 01:29:52 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
 pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED,
 DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI
 autolearn=ham version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 59D50298BD
 for ; Wed, 25 Jul 2018 01:29:52 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S2388517AbeGYCjD (ORCPT );
 Tue, 24 Jul 2018 22:39:03 -0400
Received: from mail-pf1-f196.google.com ([209.85.210.196]:44202 "EHLO
 mail-pf1-f196.google.com" rhost-flags-OK-OK-OK-OK)
 by vger.kernel.org with ESMTP id S2388468AbeGYCjD (ORCPT );
 Tue, 24 Jul 2018 22:39:03 -0400
Received: by mail-pf1-f196.google.com with SMTP id k21-v6so1277650pff.11
 for ; Tue, 24 Jul 2018 18:29:51 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id;
 bh=QWypyZ80I2gVYezogQjAKBoQshD6EQR/R/j4gUdowQM=;
 b=aivK+nkimlE5hKDRvGtKCCG7NV8mdOaIFRNdyK1rzzL0OoyX+JHbGNcoU/QTAKC+ha
 qMuYvJWlz8WqWbWcl/RXBojEniboW1Xm5ruH+nYzNw2cquf+ZUwLH7hWwd7BI7iKL0KB
 L+QCgJmVJIHAiF99SxIwnYuDHWHwe87uqXB93dLFdyIP1I2E9Gu23y2oZaEPxgJCfUEi
 jPxxEEoHnMU1LQsRwJVp6LwwcDqlB7JFwGnrAAs4oOrYkzFhFeOfEhKAPfzQixLIxuSQ
 Yo8/H53QZ181YGWOJvQQiQ2GMY3JXbSRw52pdxsoYwcmxipbaLlyopoG4sMw39p07iJv
 DDEQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id;
 bh=QWypyZ80I2gVYezogQjAKBoQshD6EQR/R/j4gUdowQM=;
 b=YztQg6OhVbfu7i5ULiQrbwjwyNPQZhQvxLfVqAZvUSfKym78BpnSJcFlywizgxCQkK
 Fx4LpliTqtg7MfQ6a1mU8+eOXJaiObCUo207cg8+tAU0ALBeHTbmtGd5+2xU3HS3COkj
 xuYkQb9S4F11CmSeRCwYqD66ZCndEPS9GqSpHKHgWAJAIEfhGw35A8VABxROh1leozFF
 BJ1Y12o+DKMvlJ1g2ED/Btmxk/U0Vx7OUbhXIyC9+QH8gLnocL7n6WaZJliAKO0xsdGZ
 RIhb7FIdiB2VH+iGYeypiWtSAqG4gVP2gqk8zOxSKel37EmOi6okdZAO1cRZlF1Sf4J6
 tsYA==
X-Gm-Message-State: AOUpUlHn60xR5EAv8k83RlmSmobhPsaBcR1o/07ePghVIq6ZrJv2ViZO
 RHxE8uQ4lvLY9nM8sgaRWOHEvdPI
X-Google-Smtp-Source: AAOMgpfWUbd6rnAMZMeY+8uYP9x5Oy5OiUMldCqnxoPp0F4ombjmOrr/cZxJQO+HO+7+gYH8OdHFpQ==
X-Received: by 2002:a62:1314:: with SMTP id
 b20-v6mr20149662pfj.230.1532482190443;
 Tue, 24 Jul 2018 18:29:50 -0700 (PDT)
Received: from sol.localdomain (c-67-185-97-198.hsd1.wa.comcast.net.
 [67.185.97.198])
 by smtp.gmail.com with ESMTPSA id f126-v6sm18306577pgc.88.2018.07.24.18.29.49
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
 Tue, 24 Jul 2018 18:29:49 -0700 (PDT)
From: Eric Biggers
To: linux-crypto@vger.kernel.org, Herbert Xu
Cc: linux-arm-kernel@lists.infradead.org, Ard Biesheuvel, Eric Biggers
Subject: [PATCH] crypto: arm/chacha20 - always use vrev for 16-bit rotates
Date: Tue, 24 Jul 2018 18:29:07 -0700
Message-Id: <20180725012907.1614-1-ebiggers3@gmail.com>
X-Mailer: git-send-email 2.18.0
Sender: linux-crypto-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-crypto@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Eric Biggers

The 4-way ChaCha20 NEON code implements 16-bit rotates with vrev32.16,
but the one-way code (used on remainder blocks) implements it with
vshl + vsri, which is slower.  Switch the one-way code to vrev32.16 too.

Signed-off-by: Eric Biggers
Acked-by: Ard Biesheuvel
---
 arch/arm/crypto/chacha20-neon-core.S | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/arch/arm/crypto/chacha20-neon-core.S b/arch/arm/crypto/chacha20-neon-core.S
index 3fecb2124c35..451a849ad518 100644
--- a/arch/arm/crypto/chacha20-neon-core.S
+++ b/arch/arm/crypto/chacha20-neon-core.S
@@ -51,9 +51,8 @@ ENTRY(chacha20_block_xor_neon)
 .Ldoubleround:
 	// x0 += x1, x3 = rotl32(x3 ^ x0, 16)
 	vadd.i32	q0, q0, q1
-	veor		q4, q3, q0
-	vshl.u32	q3, q4, #16
-	vsri.u32	q3, q4, #16
+	veor		q3, q3, q0
+	vrev32.16	q3, q3
 
 	// x2 += x3, x1 = rotl32(x1 ^ x2, 12)
 	vadd.i32	q2, q2, q3
@@ -82,9 +81,8 @@ ENTRY(chacha20_block_xor_neon)
 
 	// x0 += x1, x3 = rotl32(x3 ^ x0, 16)
 	vadd.i32	q0, q0, q1
-	veor		q4, q3, q0
-	vshl.u32	q3, q4, #16
-	vsri.u32	q3, q4, #16
+	veor		q3, q3, q0
+	vrev32.16	q3, q3
 
 	// x2 += x3, x1 = rotl32(x1 ^ x2, 12)
 	vadd.i32	q2, q2, q3
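
Note on why the substitution is safe: rotating a 32-bit word by 16 bits is
the same as swapping its two 16-bit halves, which is what vrev32.16 does in
each 32-bit lane. The C below is only a scalar sketch of that equivalence
and is not kernel code. The names rot16_shift, rot16_rev and
check_rot16_equivalence are hypothetical, the first two modelling one lane
of the vshl.u32 + vsri.u32 pair and of vrev32.16 respectively. It says
nothing about relative speed on any particular core.

```c
#include <assert.h>
#include <stdint.h>

/* Scalar model of one lane of vshl.u32 #16 followed by vsri.u32 #16. */
static uint32_t rot16_shift(uint32_t x)
{
	uint32_t r = x << 16;			/* vshl: high half = old low half, low half = 0 */

	return (r & 0xffff0000u) | (x >> 16);	/* vsri: insert old high half into low 16 bits */
}

/* Scalar model of one lane of vrev32.16: reverse the 16-bit elements. */
static uint32_t rot16_rev(uint32_t x)
{
	uint16_t h[2] = { (uint16_t)(x & 0xffffu), (uint16_t)(x >> 16) };
	uint16_t tmp = h[0];

	h[0] = h[1];
	h[1] = tmp;
	return (uint32_t)h[0] | ((uint32_t)h[1] << 16);
}

/* Compare both models over a pseudo-random sequence of inputs. */
static void check_rot16_equivalence(void)
{
	uint32_t x = 1;
	int i;

	for (i = 0; i < 100000; i++) {
		assert(rot16_shift(x) == rot16_rev(x));
		x = x * 1664525u + 1013904223u;	/* simple LCG step */
	}
}
```

Calling check_rot16_equivalence() from a test program exercises both models
on the same inputs. The second form also needs no temporary, which matches
the patch no longer using q4 for this step.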