From patchwork Tue Oct 15 10:41:38 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Ard Biesheuvel
X-Patchwork-Id: 13836152
Date: Tue, 15 Oct 2024 12:41:38 +0200
X-Mailer: git-send-email 2.47.0.rc1.288.g06298d1525-goog
Message-ID: <20241015104138.2875879-4-ardb+git@google.com>
Subject: [PATCH 0/2] arm64: Speed up CRC-32 using PMULL instructions
From: Ard Biesheuvel
To: linux-arm-kernel@lists.infradead.org
Cc: linux-kernel@vger.kernel.org, linux-crypto@vger.kernel.org,
 herbert@gondor.apana.org.au, will@kernel.org, catalin.marinas@arm.com,
 Ard Biesheuvel, Eric Biggers, Kees Cook

From: Ard Biesheuvel

The CRC-32 code is library code, and is not part of the crypto
subsystem. This means that callers may not generally be aware of the
kind of implementation that backs it, and so we've refrained from using
FP/SIMD code in the past, as it disables preemption, and this may incur
scheduling latencies that the caller did not anticipate.

This was solved a while ago, and on arm64, kernel mode FP/SIMD no longer
disables preemption.

This means we can happily use PMULL instructions in the CRC-32 library
code, which permits an optimization to be implemented that results in a
speedup of 2x - 2.8x for inputs >1k in size (on Apple M2).

Patch #1 implements some prep work to handle the scalar CRC-32
alternatives patching in C code.

Cc: Eric Biggers
Cc: Kees Cook

Ard Biesheuvel (2):
  arm64/lib: Handle CRC-32 alternative in C code
  arm64/crc32: Implement 4-way interleave using PMULL

 arch/arm64/lib/Makefile      |   2 +-
 arch/arm64/lib/crc32-glue.c  |  70 ++++++
 arch/arm64/lib/crc32-pmull.S | 240 ++++++++++++++++++++
 arch/arm64/lib/crc32.S       |  21 +-
 4 files changed, 317 insertions(+), 16 deletions(-)
 create mode 100644 arch/arm64/lib/crc32-glue.c
 create mode 100644 arch/arm64/lib/crc32-pmull.S