From: Zhenyu Ye
Subject: [RFC PATCH v5 0/2] arm64: tlb: add support for TLBI RANGE instructions
Date: Wed, 8 Jul 2020 20:40:29 +0800
Message-ID: <20200708124031.1414-1-yezhenyu2@huawei.com>
X-Patchwork-Id: 11651563

ARMv8.4-TLBI provides TLB invalidation instructions that apply to a
range of input addresses. This series adds support for this feature.

I tested this feature on an FPGA machine whose CPUs support the TLBI
range instructions. As the page num increases, the performance improves
significantly. When page num = 256, the performance is improved by more
than 10 times.

Below is the test data when the stride = PTE:

	[page num]	[classic]	[tlbi range]
	1		16051		13524
	2		11366		11146
	3		11582		12171
	4		11694		11101
	5		12138		12267
	6		12290		11105
	7		12400		12002
	8		12837		11097
	9		14791		12140
	10		15461		11087
	16		18233		11094
	32		26983		11079
	64		43840		11092
	128		77754		11098
	256		145514		11089
	512		280932		11111

See more details in:

https://lore.kernel.org/linux-arm-kernel/504c7588-97e5-e014-fca0-c5511ae0d256@huawei.com/

---
ChangeList:
v5:
- rebase this series on Linux 5.8-rc4.
- remove the __TG macro.
- move the odd range_pages check into loop.

v4:
combine the __flush_tlb_range() and the __directly into the same
function with a single loop for both.

v3:
rebase this series on Linux 5.7-rc1.

v2:
Link: https://lkml.org/lkml/2019/11/11/348

Zhenyu Ye (2):
  arm64: tlb: Detect the ARMv8.4 TLBI RANGE feature
  arm64: tlb: Use the TLBI RANGE feature in arm64

 arch/arm64/include/asm/cpucaps.h  |   3 +-
 arch/arm64/include/asm/sysreg.h   |   3 +
 arch/arm64/include/asm/tlbflush.h | 101 +++++++++++++++++++++++++-----
 arch/arm64/kernel/cpufeature.c    |  10 +++
 4 files changed, 102 insertions(+), 15 deletions(-)
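
The speedup implied by the stride = PTE test data above can be checked with a short standalone script. This is only a sketch for reading the numbers, not part of the series: the names MEASUREMENTS, speedup() and speedup_table() are made up for illustration, and the values are copied from the table as posted (the post does not state their units, so only the ratios are meaningful here).

```python
# Measurements copied from the "stride = PTE" table in the cover letter:
# page num -> (classic, tlbi range). Units are not stated in the post.
MEASUREMENTS = {
    1: (16051, 13524),
    2: (11366, 11146),
    3: (11582, 12171),
    4: (11694, 11101),
    5: (12138, 12267),
    6: (12290, 11105),
    7: (12400, 12002),
    8: (12837, 11097),
    9: (14791, 12140),
    10: (15461, 11087),
    16: (18233, 11094),
    32: (26983, 11079),
    64: (43840, 11092),
    128: (77754, 11098),
    256: (145514, 11089),
    512: (280932, 11111),
}


def speedup(classic, tlbi_range):
    """Ratio of the classic measurement to the TLBI range measurement."""
    return classic / tlbi_range


def speedup_table(measurements):
    """Map each page count to its classic / tlbi-range ratio."""
    return {
        pages: speedup(classic, tlbi_range)
        for pages, (classic, tlbi_range) in sorted(measurements.items())
    }


if __name__ == "__main__":
    for pages, ratio in speedup_table(MEASUREMENTS).items():
        print(f"{pages:>4} pages: {ratio:.2f}x")
```

With these numbers the ratio is roughly 13x at 256 pages and roughly 25x at 512 pages, while for a few small page counts (3 and 5) the ratio is slightly below 1, i.e. the classic path measured marginally better there.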