From patchwork Thu Jun 13 00:18:12 2024
X-Patchwork-Submitter: Jisheng Zhang
X-Patchwork-Id: 13695716
From: Jisheng Zhang
To: Catalin Marinas, Will Deacon
Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: [PATCH] arm64/lib: copy_page: s/stnp/stp
Date: Thu, 13 Jun 2024 08:18:12 +0800
Message-ID: <20240613001812.2141-1-jszhang@kernel.org>

stnp performs a non-temporal store, which hints to the memory system
that caching is not useful for this data. But the scenarios where
copy_page() is used may not carry this implication, although I must
admit there are cases where stnp helps performance (the good case). In
the good case, we can rely on the HW write streaming mechanism in some
implementations, such as Cortex-A55, to detect the case and take action.

Testing with https://github.com/apinski-cavium/copy_page_benchmark,
this patch can reduce the time by about 3% on Cortex-A55 platforms.

Signed-off-by: Jisheng Zhang
---
 arch/arm64/lib/copy_page.S | 32 ++++++++++++++++----------------
 1 file changed, 16 insertions(+), 16 deletions(-)

diff --git a/arch/arm64/lib/copy_page.S b/arch/arm64/lib/copy_page.S
index 6a56d7cf309d..4c74fe2d8bd6 100644
--- a/arch/arm64/lib/copy_page.S
+++ b/arch/arm64/lib/copy_page.S
@@ -32,21 +32,21 @@ SYM_FUNC_START(__pi_copy_page)
 1:
 	tst	x0, #(PAGE_SIZE - 1)
 
-	stnp	x2, x3, [x0, #-256]
+	stp	x2, x3, [x0, #-256]
 	ldp	x2, x3, [x1]
-	stnp	x4, x5, [x0, #16 - 256]
+	stp	x4, x5, [x0, #16 - 256]
 	ldp	x4, x5, [x1, #16]
-	stnp	x6, x7, [x0, #32 - 256]
+	stp	x6, x7, [x0, #32 - 256]
 	ldp	x6, x7, [x1, #32]
-	stnp	x8, x9, [x0, #48 - 256]
+	stp	x8, x9, [x0, #48 - 256]
 	ldp	x8, x9, [x1, #48]
-	stnp	x10, x11, [x0, #64 - 256]
+	stp	x10, x11, [x0, #64 - 256]
 	ldp	x10, x11, [x1, #64]
-	stnp	x12, x13, [x0, #80 - 256]
+	stp	x12, x13, [x0, #80 - 256]
 	ldp	x12, x13, [x1, #80]
-	stnp	x14, x15, [x0, #96 - 256]
+	stp	x14, x15, [x0, #96 - 256]
 	ldp	x14, x15, [x1, #96]
-	stnp	x16, x17, [x0, #112 - 256]
+	stp	x16, x17, [x0, #112 - 256]
 	ldp	x16, x17, [x1, #112]
 
 	add	x0, x0, #128
@@ -54,14 +54,14 @@ SYM_FUNC_START(__pi_copy_page)
 
 	b.ne	1b
 
-	stnp	x2, x3, [x0, #-256]
-	stnp	x4, x5, [x0, #16 - 256]
-	stnp	x6, x7, [x0, #32 - 256]
-	stnp	x8, x9, [x0, #48 - 256]
-	stnp	x10, x11, [x0, #64 - 256]
-	stnp	x12, x13, [x0, #80 - 256]
-	stnp	x14, x15, [x0, #96 - 256]
-	stnp	x16, x17, [x0, #112 - 256]
+	stp	x2, x3, [x0, #-256]
+	stp	x4, x5, [x0, #16 - 256]
+	stp	x6, x7, [x0, #32 - 256]
+	stp	x8, x9, [x0, #48 - 256]
+	stp	x10, x11, [x0, #64 - 256]
+	stp	x12, x13, [x0, #80 - 256]
+	stp	x14, x15, [x0, #96 - 256]
+	stp	x16, x17, [x0, #112 - 256]
 
 	ret
 SYM_FUNC_END(__pi_copy_page)
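
For readers who want a feel for how a figure like the 3% quoted in the commit message might be obtained, here is a rough userspace sketch of a page-copy timing loop. It is not the code from the linked benchmark, and every name in it (SKETCH_PAGE_SIZE, sketch_copy_page, sketch_time_copy) is hypothetical. memcpy() stands in for copy_page(), so the sketch by itself says nothing about stnp versus stp; a real comparison would presumably substitute one routine built from each variant and run both on the target CPU.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <time.h>

/* Assumed page size for this sketch; the kernel's PAGE_SIZE is configurable. */
#define SKETCH_PAGE_SIZE 4096u

/*
 * Hypothetical stand-in for the routine under test. memcpy() keeps the
 * sketch portable; it does not exercise the kernel's stnp/stp loop.
 */
static void sketch_copy_page(void *dst, const void *src)
{
	memcpy(dst, src, SKETCH_PAGE_SIZE);
}

/*
 * Copy npages pages from src to dst, repeated `rounds` times, and return
 * the average CPU time per page copy in nanoseconds as measured by
 * clock(). Returns -1.0 if npages or rounds is zero.
 */
static double sketch_time_copy(uint8_t *dst, const uint8_t *src,
			       size_t npages, unsigned int rounds)
{
	clock_t start, end;
	unsigned int r;
	size_t p;

	assert(dst != NULL && src != NULL);
	if (npages == 0 || rounds == 0)
		return -1.0;

	start = clock();
	for (r = 0; r < rounds; r++)
		for (p = 0; p < npages; p++)
			sketch_copy_page(dst + p * SKETCH_PAGE_SIZE,
					 src + p * SKETCH_PAGE_SIZE);
	end = clock();

	return (double)(end - start) * 1e9 / CLOCKS_PER_SEC /
	       ((double)npages * rounds);
}
```

A caller would allocate two buffers of npages * SKETCH_PAGE_SIZE bytes, fill the source, and compare the returned per-page time between the two copy routines. clock() has coarse resolution, so a large page count and many rounds are needed before a difference of a few percent becomes visible above the noise.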