From patchwork Wed Jul  6 07:27:07 2022
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 12907439
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, anshuman.khandual@arm.com,
 catalin.marinas@arm.com, linux-arm-kernel@lists.infradead.org,
 linux-mm@kvack.org, steven.price@arm.com, will@kernel.org
Cc: aarcange@redhat.com, guojian@oppo.com, hanchuanhua@oppo.com,
 hannes@cmpxchg.org, hughd@google.com, linux-kernel@vger.kernel.org,
 minchan@kernel.org, shy828301@gmail.com, v-songbaohua@oppo.com,
 ying.huang@intel.com, zhangshiming@oppo.com
Subject: [PATCH v3] arm64: enable THP_SWAP for arm64
Date: Wed, 6 Jul 2022 19:27:07 +1200
Message-Id: <20220706072707.114376-1-21cnbao@gmail.com>

From: Barry Song

THP_SWAP has been proven to improve the swap throughput significantly
on x86_64 according to commit bd4c82c22c367e ("mm, THP, swap: delay
splitting THP after swapped out").
As long as arm64 uses 4K page size, it is quite similar to x86_64
in having 2MB PMD THP. THP_SWAP is architecture-independent, so
enabling it on arm64 will benefit arm64 as well.
A corner case is that MTE assumes that only base pages can be
swapped. We won't enable THP_SWAP for ARM64 hardware with MTE
support until MTE is reworked to coexist with THP_SWAP.

A micro-benchmark is written to measure thp swapout throughput as
below,

 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
 #include <sys/mman.h>
 #include <sys/time.h>

 #define SIZE (400UL * 1024 * 1024)

 unsigned long long tv_to_ms(struct timeval tv)
 {
 	return tv.tv_sec * 1000 + tv.tv_usec / 1000;
 }

 int main(void)
 {
 	struct timeval tv_b, tv_e;
 	void *p = mmap(NULL, SIZE, PROT_READ | PROT_WRITE,
 		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

 	/* mmap() reports failure as MAP_FAILED, not NULL */
 	if (p == MAP_FAILED) {
 		perror("fail to get memory");
 		exit(-1);
 	}

 	madvise(p, SIZE, MADV_HUGEPAGE);
 	memset(p, 0x11, SIZE); /* write to get mem */

 	gettimeofday(&tv_b, NULL);
 	madvise(p, SIZE, MADV_PAGEOUT);
 	gettimeofday(&tv_e, NULL);

 	printf("swp out bandwidth: %llu bytes/ms\n",
 	       SIZE / (tv_to_ms(tv_e) - tv_to_ms(tv_b)));
 	return 0;
 }

Testing is done on a ROCK 3A board with an rk3568 64-bit quad-core
Cortex-A55 processor.
thp swp throughput w/o patch: 2734 bytes/ms (mean of 10 tests)
thp swp throughput w/ patch: 3331 bytes/ms (mean of 10 tests)

Cc: "Huang, Ying"
Cc: Minchan Kim
Cc: Johannes Weiner
Cc: Hugh Dickins
Cc: Andrea Arcangeli
Cc: Anshuman Khandual
Cc: Steven Price
Cc: Yang Shi
Signed-off-by: Barry Song
---
 -v3:
 * refine the commit log;
 * add a benchmark result;
 * refine the macro of arch_thp_swp_supported
 Thanks to the comments of Anshuman, Andrew, Steven

 arch/arm64/Kconfig               |  1 +
 arch/arm64/include/asm/pgtable.h |  6 ++++++
 include/linux/huge_mm.h          | 12 ++++++++++++
 mm/swap_slots.c                  |  2 +-
 4 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1652a9800ebe..e1c540e80eec 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -101,6 +101,7 @@ config ARM64
 	select ARCH_WANT_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_WANTS_NO_INSTR
+	select ARCH_WANTS_THP_SWAP if ARM64_4K_PAGES
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
 	select ARM_AMBA
 	select ARM_ARCH_TIMER
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 0b6632f18364..78d6f6014bfb 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -45,6 +45,12 @@
 	__flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1)
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
+static inline bool arch_thp_swp_supported(void)
+{
+	return !system_supports_mte();
+}
+#define arch_thp_swp_supported arch_thp_swp_supported
+
 /*
  * Outside of a few very special situations (e.g. hibernation), we always
  * use broadcast TLB invalidation instructions, therefore a spurious page
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index de29821231c9..4ddaf6ad73ef 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -461,4 +461,16 @@ static inline int split_folio_to_list(struct folio *folio,
 	return split_huge_page_to_list(&folio->page, list);
 }
 
+/*
+ * archs that select ARCH_WANTS_THP_SWAP but don't support THP_SWP due to
+ * limitations in the implementation like arm64 MTE can override this to
+ * false
+ */
+#ifndef arch_thp_swp_supported
+static inline bool arch_thp_swp_supported(void)
+{
+	return true;
+}
+#endif
+
 #endif /* _LINUX_HUGE_MM_H */
diff --git a/mm/swap_slots.c b/mm/swap_slots.c
index 2a65a89b5b4d..10b94d64cc25 100644
--- a/mm/swap_slots.c
+++ b/mm/swap_slots.c
@@ -307,7 +307,7 @@ swp_entry_t folio_alloc_swap(struct folio *folio)
 	entry.val = 0;
 
 	if (folio_test_large(folio)) {
-		if (IS_ENABLED(CONFIG_THP_SWAP))
+		if (IS_ENABLED(CONFIG_THP_SWAP) && arch_thp_swp_supported())
 			get_swap_pages(1, &entry, folio_nr_pages(folio));
 		goto out;
 	}
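
For readers unfamiliar with the override idiom used by
arch_thp_swp_supported(), the following is a minimal userspace sketch
of how the arch definition, the generic fallback and the caller fit
together. It is not kernel code: fake_system_supports_mte and
swap_slots_for() are hypothetical stand-ins (for system_supports_mte()
and for the check in folio_alloc_swap()) introduced only for
illustration, and the THP_SWAP config option is modelled as a plain
function argument.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the arm64 system_supports_mte() check. */
static bool fake_system_supports_mte;

/*
 * "Arch header": provide the hook and define a macro with the same
 * name so that generic code can detect the override with #ifndef.
 */
static inline bool arch_thp_swp_supported(void)
{
	return !fake_system_supports_mte;
}
#define arch_thp_swp_supported arch_thp_swp_supported

/*
 * "Generic header": the fallback is compiled only when no arch has
 * defined the macro above. Here it is skipped.
 */
#ifndef arch_thp_swp_supported
static inline bool arch_thp_swp_supported(void)
{
	return true;
}
#endif

/*
 * Hypothetical caller modelled on the patched check. Returns how many
 * contiguous swap slots would be requested: nr_pages for a large
 * folio that may be swapped as a whole, 0 when a large folio gets no
 * entry (the caller has to fall back), and 1 for a base page.
 */
static int swap_slots_for(bool thp_swap_enabled, bool large, int nr_pages)
{
	if (large) {
		if (thp_swap_enabled && arch_thp_swp_supported())
			return nr_pages;
		return 0;
	}
	return 1;
}
```

With fake_system_supports_mte left false, swap_slots_for(true, true,
512) asks for 512 slots; once it is set to true the same call returns
0, which corresponds to the MTE corner case described in the commit
log. Because the decision is a runtime call rather than a Kconfig
dependency, a single kernel image can behave correctly on hardware
with and without MTE.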