From patchwork Mon Aug 22 08:21:17 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yicong Yang
X-Patchwork-Id: 12950323
From: Yicong Yang
CC: Barry Song <21cnbao@gmail.com>, Barry Song
Subject: [PATCH v3 1/4] Revert "Documentation/features: mark BATCHED_UNMAP_TLB_FLUSH doesn't apply to ARM64"
Date: Mon, 22 Aug 2022 16:21:17 +0800
Message-ID: <20220822082120.8347-2-yangyicong@huawei.com>
X-Mailer: git-send-email 2.31.0
In-Reply-To: <20220822082120.8347-1-yangyicong@huawei.com>
References: <20220822082120.8347-1-yangyicong@huawei.com>
Sender: owner-linux-mm@kvack.org

From: Barry Song

This reverts commit 6bfef171d0d74cb050112e0e49feb20bfddf7f42.

I was wrong. Although ARM64 can flush the TLB in hardware, the flush is
not free; it is still expensive. We still have a good chance to enable
batched and deferred TLB flush on ARM64 for memory reclamation. One
possible way is to only queue tlbi instructions in the hardware's queue
and, when the flush has to be synchronized, to do that with a dsb. We
just need to adapt the existing BATCHED_UNMAP_TLB_FLUSH.

Tested-by: Xin Hao
Signed-off-by: Barry Song
Signed-off-by: Yicong Yang
---
 Documentation/features/arch-support.txt        | 1 -
 Documentation/features/vm/TLB/arch-support.txt | 2 +-
 2 files changed, 1 insertion(+), 2 deletions(-)

diff --git a/Documentation/features/arch-support.txt b/Documentation/features/arch-support.txt
index 118ae031840b..d22a1095e661 100644
--- a/Documentation/features/arch-support.txt
+++ b/Documentation/features/arch-support.txt
@@ -8,5 +8,4 @@ The meaning of entries in the tables is:
     | ok |  # feature supported by the architecture
     |TODO|  # feature not yet supported by the architecture
     | .. |  # feature cannot be supported by the hardware
-    | N/A|  # feature doesn't apply to the architecture

diff --git a/Documentation/features/vm/TLB/arch-support.txt b/Documentation/features/vm/TLB/arch-support.txt
index 039e4e91ada3..1c009312b9c1 100644
--- a/Documentation/features/vm/TLB/arch-support.txt
+++ b/Documentation/features/vm/TLB/arch-support.txt
@@ -9,7 +9,7 @@
     |       alpha: | TODO |
     |         arc: | TODO |
     |         arm: | TODO |
-    |       arm64: | N/A  |
+    |       arm64: | TODO |
     |        csky: | TODO |
     |     hexagon: | TODO |
     |        ia64: | TODO |

From patchwork Mon Aug 22 08:21:18 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yicong Yang
X-Patchwork-Id: 12950326
From: Yicong Yang
CC: Barry Song <21cnbao@gmail.com>, Anshuman Khandual
Subject: [PATCH v3 2/4] mm/tlbbatch: Introduce arch_tlbbatch_should_defer()
Date: Mon, 22 Aug 2022 16:21:18 +0800
Message-ID: <20220822082120.8347-3-yangyicong@huawei.com>
X-Mailer: git-send-email 2.31.0
In-Reply-To: <20220822082120.8347-1-yangyicong@huawei.com>
References: <20220822082120.8347-1-yangyicong@huawei.com>
Sender: owner-linux-mm@kvack.org

From: Anshuman Khandual

The entire scheme of deferred TLB flush in the reclaim path rests on the
assumption that refilling TLB entries costs less than flushing out
individual entries by sending IPIs to remote CPUs. But architectures can
have different ways to evaluate that. Hence, apart from checking
TTU_BATCH_FLUSH in the TTU flags, the rest of the decision should be
architecture specific.

Signed-off-by: Anshuman Khandual
[https://lore.kernel.org/linuxppc-dev/20171101101735.2318-2-khandual@linux.vnet.ibm.com/]
Signed-off-by: Yicong Yang
[Rebase and fix incorrect return value type]
Reviewed-by: Kefeng Wang
Reviewed-by: Anshuman Khandual
---
 arch/x86/include/asm/tlbflush.h | 12 ++++++++++++
 mm/rmap.c                       |  9 +--------
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index cda3118f3b27..8a497d902c16 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -240,6 +240,18 @@ static inline void flush_tlb_page(struct vm_area_struct *vma, unsigned long a)
 	flush_tlb_mm_range(vma->vm_mm, a, a + PAGE_SIZE, PAGE_SHIFT, false);
 }

+static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
+{
+	bool should_defer = false;
+
+	/* If remote CPUs need to be flushed then defer batch the flush */
+	if (cpumask_any_but(mm_cpumask(mm), get_cpu()) < nr_cpu_ids)
+		should_defer = true;
+	put_cpu();
+
+	return should_defer;
+}
+
 static inline u64 inc_mm_tlb_gen(struct mm_struct *mm)
 {
 	/*
diff --git a/mm/rmap.c b/mm/rmap.c
index edc06c52bc82..a17a004550c6 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -687,17 +687,10 @@ static void set_tlb_ubc_flush_pending(struct mm_struct *mm, bool writable)
  */
 static bool should_defer_flush(struct mm_struct *mm, enum ttu_flags flags)
 {
-	bool should_defer = false;
-
 	if (!(flags & TTU_BATCH_FLUSH))
 		return false;

-	/* If remote CPUs need to be flushed then defer batch the flush */
-	if (cpumask_any_but(mm_cpumask(mm), get_cpu()) < nr_cpu_ids)
-		should_defer = true;
-	put_cpu();
-
-	return should_defer;
+	return arch_tlbbatch_should_defer(mm);
 }

 /*

From patchwork Mon Aug 22 08:21:19 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yicong Yang
X-Patchwork-Id: 12950324
From: Yicong Yang
CC: Barry Song <21cnbao@gmail.com>, Barry Song, Thomas Gleixner,
 Ingo Molnar, Borislav Petkov, Dave Hansen, "H. Peter Anvin",
 Nadav Amit, Mel Gorman
Subject: [PATCH v3 3/4] mm: rmap: Extend tlbbatch APIs to fit new platforms
Date: Mon, 22 Aug 2022 16:21:19 +0800
Message-ID: <20220822082120.8347-4-yangyicong@huawei.com>
X-Mailer: git-send-email 2.31.0
In-Reply-To: <20220822082120.8347-1-yangyicong@huawei.com>
References: <20220822082120.8347-1-yangyicong@huawei.com>
Sender: owner-linux-mm@kvack.org

From: Barry Song

Add a uaddr parameter to the tlbbatch APIs so that platforms like ARM64
can use it with their specific hardware features. On ARM64, this could
mean sending a tlbi for the page at this particular uaddr into the
hardware queue.

Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Borislav Petkov
Cc: Dave Hansen
Cc: "H. Peter Anvin"
Cc: Nadav Amit
Cc: Mel Gorman
Tested-by: Xin Hao
Signed-off-by: Barry Song
Signed-off-by: Yicong Yang
Reviewed-by: Kefeng Wang
---
 arch/x86/include/asm/tlbflush.h |  3 ++-
 mm/rmap.c                       | 10 ++++++----
 2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 8a497d902c16..5bd78ae55cd4 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -264,7 +264,8 @@ static inline u64 inc_mm_tlb_gen(struct mm_struct *mm)
 }

 static inline void arch_tlbbatch_add_mm(struct arch_tlbflush_unmap_batch *batch,
-					struct mm_struct *mm)
+					struct mm_struct *mm,
+					unsigned long uaddr)
 {
 	inc_mm_tlb_gen(mm);
 	cpumask_or(&batch->cpumask, &batch->cpumask, mm_cpumask(mm));
diff --git a/mm/rmap.c b/mm/rmap.c
index a17a004550c6..7187a72b63b1 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -642,12 +642,13 @@ void try_to_unmap_flush_dirty(void)
 #define TLB_FLUSH_BATCH_PENDING_LARGE			\
 	(TLB_FLUSH_BATCH_PENDING_MASK / 2)

-static void set_tlb_ubc_flush_pending(struct mm_struct *mm, bool writable)
+static void set_tlb_ubc_flush_pending(struct mm_struct *mm, bool writable,
+				      unsigned long uaddr)
 {
 	struct tlbflush_unmap_batch *tlb_ubc = &current->tlb_ubc;
 	int batch, nbatch;

-	arch_tlbbatch_add_mm(&tlb_ubc->arch, mm);
+	arch_tlbbatch_add_mm(&tlb_ubc->arch, mm, uaddr);
 	tlb_ubc->flush_required = true;

 	/*
@@ -725,7 +726,8 @@ void flush_tlb_batched_pending(struct mm_struct *mm)
 	}
 }
 #else
-static void set_tlb_ubc_flush_pending(struct mm_struct *mm, bool writable)
+static void set_tlb_ubc_flush_pending(struct mm_struct *mm, bool writable,
+				      unsigned long uaddr)
 {
 }

@@ -1587,7 +1589,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 				 */
 				pteval = ptep_get_and_clear(mm, address, pvmw.pte);

-				set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
+				set_tlb_ubc_flush_pending(mm, pte_dirty(pteval), address);
 			} else {
 				pteval = ptep_clear_flush(vma, address, pvmw.pte);
 			}

From patchwork Mon Aug 22 08:21:20 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yicong Yang
X-Patchwork-Id: 12950325
From: Yicong Yang
CC: Barry Song <21cnbao@gmail.com>, Barry Song, Nadav Amit, Mel Gorman
Subject: [PATCH v3 4/4] arm64: support batched/deferred tlb shootdown during page reclamation
Date: Mon, 22 Aug 2022 16:21:20 +0800
Message-ID: <20220822082120.8347-5-yangyicong@huawei.com>
X-Mailer: git-send-email 2.31.0
In-Reply-To: <20220822082120.8347-1-yangyicong@huawei.com>
References: <20220822082120.8347-1-yangyicong@huawei.com>
Sender: owner-linux-mm@kvack.org

From: Barry Song

On x86, batched and deferred TLB shootdown has led to a 90% performance
increase on TLB shootdown. On arm64, hardware can do TLB shootdown
without software IPIs, but a synchronous tlbi is still quite expensive.

Even the simplest program that requires swapout shows this:

 #include <sys/types.h>
 #include <stdio.h>
 #include <sys/mman.h>
 #include <string.h>

 int main()
 {
 #define SIZE (1 * 1024 * 1024)
         volatile unsigned char *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
                                          MAP_SHARED | MAP_ANONYMOUS, -1, 0);

         /* cast away volatile for the libc calls */
         memset((void *)p, 0x88, SIZE);

         for (int k = 0; k < 10000; k++) {
                 /* swap in */
                 for (int i = 0; i < SIZE; i += 4096) {
                         (void)p[i];
                 }

                 /* swap out */
                 madvise((void *)p, SIZE, MADV_PAGEOUT);
         }
 }

Perf result on Snapdragon 888 with 8 cores, using zRAM as the swap
block device:

 ~ # perf record taskset -c 4 ./a.out
 [ perf record: Woken up 10 times to write data ]
 [ perf record: Captured and wrote 2.297 MB perf.data (60084 samples) ]
 ~ # perf report
 # To display the perf.data header info, please use --header/--header-only options.
 # To display the perf.data header info, please use --header/--header-only options.
 #
 #
 # Total Lost Samples: 0
 #
 # Samples: 60K of event 'cycles'
 # Event count (approx.): 35706225414
 #
 # Overhead  Command  Shared Object      Symbol
 # ........  .......  .................  .............................................................................
 #
    21.07%  a.out    [kernel.kallsyms]  [k] _raw_spin_unlock_irq
     8.23%  a.out    [kernel.kallsyms]  [k] _raw_spin_unlock_irqrestore
     6.67%  a.out    [kernel.kallsyms]  [k] filemap_map_pages
     6.16%  a.out    [kernel.kallsyms]  [k] __zram_bvec_write
     5.36%  a.out    [kernel.kallsyms]  [k] ptep_clear_flush
     3.71%  a.out    [kernel.kallsyms]  [k] _raw_spin_lock
     3.49%  a.out    [kernel.kallsyms]  [k] memset64
     1.63%  a.out    [kernel.kallsyms]  [k] clear_page
     1.42%  a.out    [kernel.kallsyms]  [k] _raw_spin_unlock
     1.26%  a.out    [kernel.kallsyms]  [k] mod_zone_state.llvm.8525150236079521930
     1.23%  a.out    [kernel.kallsyms]  [k] xas_load
     1.15%  a.out    [kernel.kallsyms]  [k] zram_slot_lock

ptep_clear_flush() takes 5.36% of CPU time in this micro-benchmark,
which swaps a page mapped by only one process in and out. If the page
is mapped by multiple processes, typically more than 100 on a phone,
the overhead would be much higher, as we have to run the TLB flush 100
times for one single page. In addition, TLB flush overhead increases
with the number of CPU cores due to the poor scalability of TLB
shootdown in hardware, so ARM64 servers should expect much higher
overhead.

Further perf annotate shows that 95% of the CPU time of
ptep_clear_flush() is actually spent in the final dsb(), waiting for
the TLB flush to complete. This gives us a very good chance to leverage
the existing batched TLB flush in the kernel. The minimum modification
is to send only the asynchronous tlbi in the first stage, and to issue
the dsb in the second stage, when we have to sync.

With the above micro-benchmark, the elapsed time to finish the program
decreases by around 5%.
Typical elapsed time w/o patch:
 ~ # time taskset -c 4 ./a.out
 0.21user 14.34system 0:14.69elapsed
w/ patch:
 ~ # time taskset -c 4 ./a.out
 0.22user 13.45system 0:13.80elapsed

Also, Yicong Yang added the following observation:
	Tested with the benchmark in the commit on a Kunpeng920 arm64
	server, observed an improvement of around 12.5% with the command
	`time ./swap_bench`.
		w/o		w/
	real	0m13.460s	0m11.771s
	user	0m0.248s	0m0.279s
	sys	0m12.039s	0m11.458s

	Originally a 16.99% overhead of ptep_clear_flush() was observed,
	which has been eliminated by this patch:

	[root@localhost yang]# perf record -- ./swap_bench && perf report
	[...]
	16.99%  swap_bench  [kernel.kallsyms]  [k] ptep_clear_flush

Cc: Jonathan Corbet
Cc: Nadav Amit
Cc: Mel Gorman
Tested-by: Yicong Yang
Tested-by: Xin Hao
Signed-off-by: Barry Song
Signed-off-by: Yicong Yang
Reviewed-by: Kefeng Wang
---
 .../features/vm/TLB/arch-support.txt |  2 +-
 arch/arm64/Kconfig                   |  1 +
 arch/arm64/include/asm/tlbbatch.h    | 12 ++++++++
 arch/arm64/include/asm/tlbflush.h    | 28 +++++++++++++++++--
 4 files changed, 40 insertions(+), 3 deletions(-)
 create mode 100644 arch/arm64/include/asm/tlbbatch.h

diff --git a/Documentation/features/vm/TLB/arch-support.txt b/Documentation/features/vm/TLB/arch-support.txt
index 1c009312b9c1..2caf815d7c6c 100644
--- a/Documentation/features/vm/TLB/arch-support.txt
+++ b/Documentation/features/vm/TLB/arch-support.txt
@@ -9,7 +9,7 @@
     |       alpha: | TODO |
     |         arc: | TODO |
     |         arm: | TODO |
-    |       arm64: | TODO |
+    |       arm64: |  ok  |
     |        csky: | TODO |
     |     hexagon: | TODO |
     |        ia64: | TODO |
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 571cc234d0b3..09d45cd6d665 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -93,6 +93,7 @@ config ARM64
 	select ARCH_SUPPORTS_INT128 if CC_HAS_INT128
 	select ARCH_SUPPORTS_NUMA_BALANCING
 	select ARCH_SUPPORTS_PAGE_TABLE_CHECK
+	select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
 	select ARCH_WANT_COMPAT_IPC_PARSE_VERSION if COMPAT
 	select ARCH_WANT_DEFAULT_BPF_JIT
 	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
diff --git a/arch/arm64/include/asm/tlbbatch.h b/arch/arm64/include/asm/tlbbatch.h
new file mode 100644
index 000000000000..fedb0b87b8db
--- /dev/null
+++ b/arch/arm64/include/asm/tlbbatch.h
@@ -0,0 +1,12 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ARCH_ARM64_TLBBATCH_H
+#define _ARCH_ARM64_TLBBATCH_H
+
+struct arch_tlbflush_unmap_batch {
+	/*
+	 * For arm64, HW can do tlb shootdown, so we don't
+	 * need to record cpumask for sending IPI
+	 */
+};
+
+#endif /* _ARCH_ARM64_TLBBATCH_H */
diff --git a/arch/arm64/include/asm/tlbflush.h b/arch/arm64/include/asm/tlbflush.h
index 412a3b9a3c25..23cbc987321a 100644
--- a/arch/arm64/include/asm/tlbflush.h
+++ b/arch/arm64/include/asm/tlbflush.h
@@ -254,17 +254,24 @@ static inline void flush_tlb_mm(struct mm_struct *mm)
 	dsb(ish);
 }

-static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
+
+static inline void __flush_tlb_page_nosync(struct mm_struct *mm,
 					 unsigned long uaddr)
 {
 	unsigned long addr;

 	dsb(ishst);
-	addr = __TLBI_VADDR(uaddr, ASID(vma->vm_mm));
+	addr = __TLBI_VADDR(uaddr, ASID(mm));
 	__tlbi(vale1is, addr);
 	__tlbi_user(vale1is, addr);
 }

+static inline void flush_tlb_page_nosync(struct vm_area_struct *vma,
+					 unsigned long uaddr)
+{
+	return __flush_tlb_page_nosync(vma->vm_mm, uaddr);
+}
+
 static inline void flush_tlb_page(struct vm_area_struct *vma,
 				  unsigned long uaddr)
 {
@@ -272,6 +279,23 @@ static inline void flush_tlb_page(struct vm_area_struct *vma,
 	dsb(ish);
 }

+static inline bool arch_tlbbatch_should_defer(struct mm_struct *mm)
+{
+	return true;
+}
+
+static inline void arch_tlbbatch_add_mm(struct arch_tlbflush_unmap_batch *batch,
+					struct mm_struct *mm,
+					unsigned long uaddr)
+{
+	__flush_tlb_page_nosync(mm, uaddr);
+}
+
+static inline void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
+{
+	dsb(ish);
+}
+
 /*
  * This is meant to avoid soft lock-ups on large TLB flushing ranges and not
  * necessarily a performance improvement.