From patchwork Wed Sep 23 18:55:39 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Feng Kan X-Patchwork-Id: 7252041 Return-Path: X-Original-To: patchwork-linux-arm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork1.web.kernel.org (Postfix) with ESMTP id D588F9F30C for ; Wed, 23 Sep 2015 19:17:35 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id B4624208BB for ; Wed, 23 Sep 2015 19:17:34 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.9]) (using TLSv1.2 with cipher AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 8D543206F3 for ; Wed, 23 Sep 2015 19:17:33 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1ZepCi-0002HG-9P; Wed, 23 Sep 2015 18:55:44 +0000 Received: from mail-pa0-x235.google.com ([2607:f8b0:400e:c03::235]) by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1ZepCO-0001Ml-TD for linux-arm-kernel@lists.infradead.org; Wed, 23 Sep 2015 18:55:27 +0000 Received: by pacex6 with SMTP id ex6so47960443pac.0 for ; Wed, 23 Sep 2015 11:55:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=apm.com; s=apm; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=eC3SbcWscUT2C6btG0s3aMd4Wz1r1msibI/YcNbvBr0=; b=KoAv8Tw0B3ySW7+MK3UK8cnI17TR2QNm1XnrpsQ10hj7BruGeSDDgmJqDAxhGuSIvT 0mGGLo3RZXbqgeK5Ea8ooSZ6MoiuEXmSic7OxlRDds59MJl8319mvVafVlUq57lHvLMM OROuX0yHzsVHz4+0LlQLsooGeUF75n8ZVdiBs= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=eC3SbcWscUT2C6btG0s3aMd4Wz1r1msibI/YcNbvBr0=; b=Ch8+4gnZIYKrDDVuwHp59bIx55Cwb4rUHemYZoamt0By/cdDHQi6gUFhb1VMnAdOcQ Rx9DITXRubjoSo70FKYsD2XKGtlFGmotP6YUrp+dPl3Ui8zQvl6pyzId9e+b4+8ODAV3 2DCVadD58o7yt/NvRfg1KoEegsHHpWfNVvQZWAKKRH9e5vbwgXElLiJ4AB36EMhKsFWy PmY8CWoKniS9z4CGje3MqIVCkWOZV+uoNhuP117slNM49ItvQNOV4EteweeBNaTwLJKV WRHc5k4kRmyIAJmIllJEXs37eFzYZFO3goYmZUYixvhQ3CdxhQaB3lkovGY3318DbqlD CuxA== X-Gm-Message-State: ALoCoQkBmxbAWG0Pv4i55T00zDWTEZTRuNMO2j3F9k930fXQC4CHa33oXc5+ayvuJfAOEasRpLmL X-Received: by 10.68.248.102 with SMTP id yl6mr39946519pbc.66.1443034504281; Wed, 23 Sep 2015 11:55:04 -0700 (PDT) Received: from fkan-lpt.amcc.com (70-35-53-82.static.wiline.com. [70.35.53.82]) by smtp.gmail.com with ESMTPSA id zc4sm9322975pbb.24.2015.09.23.11.55.02 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Wed, 23 Sep 2015 11:55:02 -0700 (PDT) From: Feng Kan To: patches@apm.com, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, philipp.tomsich@theobroma-systems.com, dann.frazier@canonical.com, tim.gardner@canonical.com, craig.magina@canonical.com, soni.trilok.oss@gmail.com Subject: [PATCH V5 2/2] arm64: copy_to-from-in_user optimization using copy template Date: Wed, 23 Sep 2015 11:55:39 -0700 Message-Id: <1443034539-1848-3-git-send-email-fkan@apm.com> X-Mailer: git-send-email 1.9.1 In-Reply-To: <1443034539-1848-1-git-send-email-fkan@apm.com> References: <1443034539-1848-1-git-send-email-fkan@apm.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20150923_115525_018119_AF0F3736 X-CRM114-Status: GOOD ( 12.75 ) X-Spam-Score: -2.7 (--) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Feng Kan , Balamurugan Shanmugam MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Spam-Status: No, score=-4.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_MED,RP_MATCHES_RCVD,T_DKIM_INVALID,UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch optimize copy_to-from-in_user for arm 64bit architecture. The copy template is used as template file for all the copy*.S files. Minor change was made to it to accommodate the copy to/from/in user files. Signed-off-by: Feng Kan Signed-off-by: Balamurugan Shanmugam --- arch/arm64/lib/copy_from_user.S | 78 +++++++++++++++++++++++------------------ arch/arm64/lib/copy_in_user.S | 67 ++++++++++++++++++++--------------- arch/arm64/lib/copy_to_user.S | 67 ++++++++++++++++++++--------------- 3 files changed, 120 insertions(+), 92 deletions(-) diff --git a/arch/arm64/lib/copy_from_user.S b/arch/arm64/lib/copy_from_user.S index 1be9ef2..4699cd7 100644 --- a/arch/arm64/lib/copy_from_user.S +++ b/arch/arm64/lib/copy_from_user.S @@ -18,6 +18,7 @@ #include #include +#include #include #include @@ -31,49 +32,58 @@ * Returns: * x0 - bytes not copied */ + + .macro ldrb1 ptr, regB, val + USER(9998f, ldrb \ptr, [\regB], \val) + .endm + + .macro strb1 ptr, regB, val + strb \ptr, [\regB], \val + .endm + + .macro ldrh1 ptr, regB, val + USER(9998f, ldrh \ptr, [\regB], \val) + .endm + + .macro strh1 ptr, regB, val + strh \ptr, [\regB], \val + .endm + + .macro ldr1 ptr, regB, val + USER(9998f, ldr \ptr, [\regB], \val) + .endm + + .macro str1 ptr, regB, val + str \ptr, [\regB], \val + .endm + + .macro ldp1 ptr, regB, regC, val + USER(9998f, ldp \ptr, \regB, [\regC], \val) + .endm + + .macro stp1 ptr, regB, regC, val + stp \ptr, \regB, [\regC], \val + .endm + +end .req x5 ENTRY(__copy_from_user) ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \ CONFIG_ARM64_PAN) - add x5, x1, x2 // upper user buffer boundary - subs x2, x2, #16 - b.mi 1f -0: -USER(9f, ldp x3, x4, [x1], #16) - subs x2, x2, #16 - stp x3, x4, [x0], #16 - b.pl 0b -1: adds x2, x2, #8 - b.mi 2f -USER(9f, ldr x3, [x1], #8 ) - sub x2, x2, #8 - str x3, [x0], #8 -2: adds x2, x2, #4 - b.mi 3f -USER(9f, ldr w3, [x1], #4 ) - sub x2, x2, #4 - str w3, [x0], #4 -3: adds x2, x2, #2 - b.mi 4f -USER(9f, ldrh w3, [x1], #2 ) - sub x2, x2, #2 - strh w3, [x0], #2 -4: adds x2, x2, #1 - b.mi 5f -USER(9f, ldrb w3, [x1] ) - strb w3, [x0] -5: mov x0, #0 + add end, x0, x2 +#include "copy_template.S" ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(1)), ARM64_HAS_PAN, \ CONFIG_ARM64_PAN) + mov x0, #0 // Nothing to copy ret ENDPROC(__copy_from_user) .section .fixup,"ax" .align 2 -9: sub x2, x5, x1 - mov x3, x2 -10: strb wzr, [x0], #1 // zero remaining buffer space - subs x3, x3, #1 - b.ne 10b - mov x0, x2 // bytes not copied +9998: + sub x0, end, dst +9999: + strb wzr, [dst], #1 // zero remaining buffer space + cmp dst, end + b.lo 9999b ret .previous diff --git a/arch/arm64/lib/copy_in_user.S b/arch/arm64/lib/copy_in_user.S index 1b94661e..81c8fc9 100644 --- a/arch/arm64/lib/copy_in_user.S +++ b/arch/arm64/lib/copy_in_user.S @@ -20,6 +20,7 @@ #include #include +#include #include #include @@ -33,44 +34,52 @@ * Returns: * x0 - bytes not copied */ + .macro ldrb1 ptr, regB, val + USER(9998f, ldrb \ptr, [\regB], \val) + .endm + + .macro strb1 ptr, regB, val + USER(9998f, strb \ptr, [\regB], \val) + .endm + + .macro ldrh1 ptr, regB, val + USER(9998f, ldrh \ptr, [\regB], \val) + .endm + + .macro strh1 ptr, regB, val + USER(9998f, strh \ptr, [\regB], \val) + .endm + + .macro ldr1 ptr, regB, val + USER(9998f, ldr \ptr, [\regB], \val) + .endm + + .macro str1 ptr, regB, val + USER(9998f, str \ptr, [\regB], \val) + .endm + + .macro ldp1 ptr, regB, regC, val + USER(9998f, ldp \ptr, \regB, [\regC], \val) + .endm + + .macro stp1 ptr, regB, regC, val + USER(9998f, stp \ptr, \regB, [\regC], \val) + .endm + +end .req x5 ENTRY(__copy_in_user) ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \ CONFIG_ARM64_PAN) - add x5, x0, x2 // upper user buffer boundary - subs x2, x2, #16 - b.mi 1f -0: -USER(9f, ldp x3, x4, [x1], #16) - subs x2, x2, #16 -USER(9f, stp x3, x4, [x0], #16) - b.pl 0b -1: adds x2, x2, #8 - b.mi 2f -USER(9f, ldr x3, [x1], #8 ) - sub x2, x2, #8 -USER(9f, str x3, [x0], #8 ) -2: adds x2, x2, #4 - b.mi 3f -USER(9f, ldr w3, [x1], #4 ) - sub x2, x2, #4 -USER(9f, str w3, [x0], #4 ) -3: adds x2, x2, #2 - b.mi 4f -USER(9f, ldrh w3, [x1], #2 ) - sub x2, x2, #2 -USER(9f, strh w3, [x0], #2 ) -4: adds x2, x2, #1 - b.mi 5f -USER(9f, ldrb w3, [x1] ) -USER(9f, strb w3, [x0] ) -5: mov x0, #0 + add end, x0, x2 +#include "copy_template.S" ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(1)), ARM64_HAS_PAN, \ CONFIG_ARM64_PAN) + mov x0, #0 ret ENDPROC(__copy_in_user) .section .fixup,"ax" .align 2 -9: sub x0, x5, x0 // bytes not copied +9998: sub x0, end, dst // bytes not copied ret .previous diff --git a/arch/arm64/lib/copy_to_user.S b/arch/arm64/lib/copy_to_user.S index a257b47..7512bbb 100644 --- a/arch/arm64/lib/copy_to_user.S +++ b/arch/arm64/lib/copy_to_user.S @@ -18,6 +18,7 @@ #include #include +#include #include #include @@ -31,44 +32,52 @@ * Returns: * x0 - bytes not copied */ + .macro ldrb1 ptr, regB, val + ldrb \ptr, [\regB], \val + .endm + + .macro strb1 ptr, regB, val + USER(9998f, strb \ptr, [\regB], \val) + .endm + + .macro ldrh1 ptr, regB, val + ldrh \ptr, [\regB], \val + .endm + + .macro strh1 ptr, regB, val + USER(9998f, strh \ptr, [\regB], \val) + .endm + + .macro ldr1 ptr, regB, val + ldr \ptr, [\regB], \val + .endm + + .macro str1 ptr, regB, val + USER(9998f, str \ptr, [\regB], \val) + .endm + + .macro ldp1 ptr, regB, regC, val + ldp \ptr, \regB, [\regC], \val + .endm + + .macro stp1 ptr, regB, regC, val + USER(9998f, stp \ptr, \regB, [\regC], \val) + .endm + +end .req x5 ENTRY(__copy_to_user) ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(0)), ARM64_HAS_PAN, \ CONFIG_ARM64_PAN) - add x5, x0, x2 // upper user buffer boundary - subs x2, x2, #16 - b.mi 1f -0: - ldp x3, x4, [x1], #16 - subs x2, x2, #16 -USER(9f, stp x3, x4, [x0], #16) - b.pl 0b -1: adds x2, x2, #8 - b.mi 2f - ldr x3, [x1], #8 - sub x2, x2, #8 -USER(9f, str x3, [x0], #8 ) -2: adds x2, x2, #4 - b.mi 3f - ldr w3, [x1], #4 - sub x2, x2, #4 -USER(9f, str w3, [x0], #4 ) -3: adds x2, x2, #2 - b.mi 4f - ldrh w3, [x1], #2 - sub x2, x2, #2 -USER(9f, strh w3, [x0], #2 ) -4: adds x2, x2, #1 - b.mi 5f - ldrb w3, [x1] -USER(9f, strb w3, [x0] ) -5: mov x0, #0 + add end, x0, x2 +#include "copy_template.S" ALTERNATIVE("nop", __stringify(SET_PSTATE_PAN(1)), ARM64_HAS_PAN, \ CONFIG_ARM64_PAN) + mov x0, #0 ret ENDPROC(__copy_to_user) .section .fixup,"ax" .align 2 -9: sub x0, x5, x0 // bytes not copied +9998: sub x0, end, dst // bytes not copied ret .previous