From patchwork Thu Jan 18 11:10:31 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13522709
From: Barry Song <21cnbao@gmail.com>
To: ryan.roberts@arm.com, akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, mhocko@suse.com, shy828301@gmail.com, wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org, ying.huang@intel.com, yuzhao@google.com, surenb@google.com, steven.price@arm.com, Barry Song
Subject: [PATCH RFC 1/6] arm64: mm: swap: support THP_SWAP on hardware with MTE
Date: Fri, 19 Jan 2024 00:10:31 +1300
Message-Id: <20240118111036.72641-2-21cnbao@gmail.com>
In-Reply-To: <20240118111036.72641-1-21cnbao@gmail.com>
References: <20231025144546.577640-1-ryan.roberts@arm.com> <20240118111036.72641-1-21cnbao@gmail.com>

From: Barry Song

Commit d0637c505f8a1 ("arm64: enable THP_SWAP for arm64") brings up THP_SWAP on ARM64,
but it doesn't enable THP_SWAP on hardware with MTE, because the MTE code assumes that tag save/restore always handles a folio with only one page. This limitation should be removed: more and more ARM64 SoCs support MTE, so the co-existence of MTE and THP_SWAP is becoming increasingly important.

This patch makes MTE tag saving support large folios, so large folios no longer need to be split into base pages before being swapped out on ARM64 SoCs with MTE.

arch_prepare_to_swap() now takes a folio rather than a page as its parameter, because THP swap-out handles the folio as a whole. It saves tags for all pages in a large folio.

Because arch_swap_restore() now restores tags per folio, refaulting a large folio that is still in the swapcache in do_swap_page() may incur extra loops and early exits. If a large folio has nr pages, do_swap_page() only sets the PTE of the particular page that caused the page fault. Thus do_swap_page() runs nr times, and each time arch_swap_restore() loops over all nr subpages of the folio, so for now the algorithmic complexity is O(nr^2).

Once do_swap_page() supports mapping large folios, the extra loops and early exits will decrease, but they will not be removed completely, because a large folio might be only partially tagged in corner cases such as:
1. a large folio in the swapcache can be partially unmapped, so the MTE tags for the unmapped pages are invalidated;
2. users might use mprotect() to set MTE tags on part of a large folio.

arch_thp_swp_supported() is dropped since ARM64 MTE was the only user that needed it.
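
For illustration only (this is not part of the patch), the two pieces of arithmetic in the description can be sketched in user space. The helper names first_slot() and restore_attempts() are hypothetical. The sketch assumes that nr is a power of two and that a folio's swap slots are naturally aligned, which is what the mask in arch_swap_restore() relies on.

```c
#include <assert.h>

/*
 * Hypothetical user-space model; these helpers are not kernel APIs.
 *
 * first_slot() mirrors "entry.val -= swp_offset(entry) & (nr - 1)":
 * it rounds a swap offset down to the first slot of its folio.
 * Assumes nr is a power of two and the slots are naturally aligned.
 */
static inline unsigned long first_slot(unsigned long offset, unsigned long nr)
{
	return offset - (offset & (nr - 1));
}

/*
 * restore_attempts() counts per-page tag-restore attempts when each of
 * the nr subpages faults separately and every fault walks all nr
 * subpages of the folio, i.e. the O(nr^2) behaviour described above.
 */
static inline unsigned long restore_attempts(unsigned long nr)
{
	unsigned long fault, i, attempts = 0;

	for (fault = 0; fault < nr; fault++)
		for (i = 0; i < nr; i++)
			attempts++;
	return attempts;
}
```

With a 16-page folio, a fault on swap offset 0x1234 maps back to the first slot 0x1230, and faulting in every subpage one at a time costs 16 * 16 = 256 restore attempts, most of which would exit early in the real code.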
Reviewed-by: Steven Price Signed-off-by: Barry Song Acked-by: Chris Li --- arch/arm64/include/asm/pgtable.h | 21 +++------------- arch/arm64/mm/mteswap.c | 42 ++++++++++++++++++++++++++++++++ include/linux/huge_mm.h | 12 --------- include/linux/pgtable.h | 2 +- mm/page_io.c | 2 +- mm/swap_slots.c | 2 +- 6 files changed, 49 insertions(+), 32 deletions(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 79ce70fbb751..9902395ca426 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -45,12 +45,6 @@ __flush_tlb_range(vma, addr, end, PUD_SIZE, false, 1) #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ -static inline bool arch_thp_swp_supported(void) -{ - return !system_supports_mte(); -} -#define arch_thp_swp_supported arch_thp_swp_supported - /* * Outside of a few very special situations (e.g. hibernation), we always * use broadcast TLB invalidation instructions, therefore a spurious page @@ -1042,12 +1036,8 @@ static inline pmd_t pmdp_establish(struct vm_area_struct *vma, #ifdef CONFIG_ARM64_MTE #define __HAVE_ARCH_PREPARE_TO_SWAP -static inline int arch_prepare_to_swap(struct page *page) -{ - if (system_supports_mte()) - return mte_save_tags(page); - return 0; -} +#define arch_prepare_to_swap arch_prepare_to_swap +extern int arch_prepare_to_swap(struct folio *folio); #define __HAVE_ARCH_SWAP_INVALIDATE static inline void arch_swap_invalidate_page(int type, pgoff_t offset) @@ -1063,11 +1053,8 @@ static inline void arch_swap_invalidate_area(int type) } #define __HAVE_ARCH_SWAP_RESTORE -static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio) -{ - if (system_supports_mte()) - mte_restore_tags(entry, &folio->page); -} +#define arch_swap_restore arch_swap_restore +extern void arch_swap_restore(swp_entry_t entry, struct folio *folio); #endif /* CONFIG_ARM64_MTE */ diff --git a/arch/arm64/mm/mteswap.c b/arch/arm64/mm/mteswap.c index a31833e3ddc5..b9ca1b35902f 100644 --- 
a/arch/arm64/mm/mteswap.c +++ b/arch/arm64/mm/mteswap.c @@ -68,6 +68,13 @@ void mte_invalidate_tags(int type, pgoff_t offset) mte_free_tag_storage(tags); } +static inline void __mte_invalidate_tags(struct page *page) +{ + swp_entry_t entry = page_swap_entry(page); + + mte_invalidate_tags(swp_type(entry), swp_offset(entry)); +} + void mte_invalidate_tags_area(int type) { swp_entry_t entry = swp_entry(type, 0); @@ -83,3 +90,38 @@ void mte_invalidate_tags_area(int type) } xa_unlock(&mte_pages); } + +int arch_prepare_to_swap(struct folio *folio) +{ + int err; + long i; + + if (system_supports_mte()) { + long nr = folio_nr_pages(folio); + + for (i = 0; i < nr; i++) { + err = mte_save_tags(folio_page(folio, i)); + if (err) + goto out; + } + } + return 0; + +out: + while (i--) + __mte_invalidate_tags(folio_page(folio, i)); + return err; +} + +void arch_swap_restore(swp_entry_t entry, struct folio *folio) +{ + if (system_supports_mte()) { + long i, nr = folio_nr_pages(folio); + + entry.val -= swp_offset(entry) & (nr - 1); + for (i = 0; i < nr; i++) { + mte_restore_tags(entry, folio_page(folio, i)); + entry.val++; + } + } +} diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h index 5adb86af35fc..67219d2309dd 100644 --- a/include/linux/huge_mm.h +++ b/include/linux/huge_mm.h @@ -530,16 +530,4 @@ static inline int split_folio(struct folio *folio) return split_folio_to_list(folio, NULL); } -/* - * archs that select ARCH_WANTS_THP_SWAP but don't support THP_SWP due to - * limitations in the implementation like arm64 MTE can override this to - * false - */ -#ifndef arch_thp_swp_supported -static inline bool arch_thp_swp_supported(void) -{ - return true; -} -#endif - #endif /* _LINUX_HUGE_MM_H */ diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index f6d0e3513948..37fe83b0c358 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -925,7 +925,7 @@ static inline int arch_unmap_one(struct mm_struct *mm, * prototypes must be defined in the 
arch-specific asm/pgtable.h file. */ #ifndef __HAVE_ARCH_PREPARE_TO_SWAP -static inline int arch_prepare_to_swap(struct page *page) +static inline int arch_prepare_to_swap(struct folio *folio) { return 0; } diff --git a/mm/page_io.c b/mm/page_io.c index ae2b49055e43..a9a7c236aecc 100644 --- a/mm/page_io.c +++ b/mm/page_io.c @@ -189,7 +189,7 @@ int swap_writepage(struct page *page, struct writeback_control *wbc) * Arch code may have to preserve more data than just the page * contents, e.g. memory tags. */ - ret = arch_prepare_to_swap(&folio->page); + ret = arch_prepare_to_swap(folio); if (ret) { folio_mark_dirty(folio); folio_unlock(folio); diff --git a/mm/swap_slots.c b/mm/swap_slots.c index 0bec1f705f8e..2325adbb1f19 100644 --- a/mm/swap_slots.c +++ b/mm/swap_slots.c @@ -307,7 +307,7 @@ swp_entry_t folio_alloc_swap(struct folio *folio) entry.val = 0; if (folio_test_large(folio)) { - if (IS_ENABLED(CONFIG_THP_SWAP) && arch_thp_swp_supported()) + if (IS_ENABLED(CONFIG_THP_SWAP)) get_swap_pages(1, &entry, folio_nr_pages(folio)); goto out; }

From patchwork Thu Jan 18 11:10:32 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13522710
From: Barry Song <21cnbao@gmail.com>
To: ryan.roberts@arm.com, akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, mhocko@suse.com, shy828301@gmail.com, wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org, ying.huang@intel.com, yuzhao@google.com, surenb@google.com, steven.price@arm.com, Chuanhua Han, Barry Song
Subject: [PATCH RFC 2/6] mm: swap: introduce swap_nr_free() for batched swap_free()
Date: Fri, 19 Jan 2024 00:10:32 +1300
Message-Id: <20240118111036.72641-3-21cnbao@gmail.com>
In-Reply-To: <20240118111036.72641-1-21cnbao@gmail.com>
References: <20231025144546.577640-1-ryan.roberts@arm.com> <20240118111036.72641-1-21cnbao@gmail.com>

From: Chuanhua Han

While swapping in a large folio, we need to free swaps related to the whole folio. To avoid frequently acquiring and releasing swap locks, it is better to introduce an API for batched free.
Signed-off-by: Chuanhua Han Co-developed-by: Barry Song Signed-off-by: Barry Song --- include/linux/swap.h | 6 ++++++ mm/swapfile.c | 29 +++++++++++++++++++++++++++++ 2 files changed, 35 insertions(+) diff --git a/include/linux/swap.h b/include/linux/swap.h index 4db00ddad261..31a4ee2dcd1c 100644 --- a/include/linux/swap.h +++ b/include/linux/swap.h @@ -478,6 +478,7 @@ extern void swap_shmem_alloc(swp_entry_t); extern int swap_duplicate(swp_entry_t); extern int swapcache_prepare(swp_entry_t); extern void swap_free(swp_entry_t); +extern void swap_nr_free(swp_entry_t entry, int nr_pages); extern void swapcache_free_entries(swp_entry_t *entries, int n); extern int free_swap_and_cache(swp_entry_t); int swap_type_of(dev_t device, sector_t offset); @@ -553,6 +554,11 @@ static inline void swap_free(swp_entry_t swp) { } +static inline void swap_nr_free(swp_entry_t entry, int nr_pages) +{ + +} + static inline void put_swap_folio(struct folio *folio, swp_entry_t swp) { } diff --git a/mm/swapfile.c b/mm/swapfile.c index 556ff7347d5f..6321bda96b77 100644 --- a/mm/swapfile.c +++ b/mm/swapfile.c @@ -1335,6 +1335,35 @@ void swap_free(swp_entry_t entry) __swap_entry_free(p, entry); } +void swap_nr_free(swp_entry_t entry, int nr_pages) +{ + int i; + struct swap_cluster_info *ci; + struct swap_info_struct *p; + unsigned type = swp_type(entry); + unsigned long offset = swp_offset(entry); + DECLARE_BITMAP(usage, SWAPFILE_CLUSTER) = { 0 }; + + VM_BUG_ON(offset % SWAPFILE_CLUSTER + nr_pages > SWAPFILE_CLUSTER); + + if (nr_pages == 1) { + swap_free(entry); + return; + } + + p = _swap_info_get(entry); + + ci = lock_cluster(p, offset); + for (i = 0; i < nr_pages; i++) { + if (__swap_entry_free_locked(p, offset + i, 1)) + __bitmap_set(usage, i, 1); + } + unlock_cluster(ci); + + for_each_clear_bit(i, usage, nr_pages) + free_swap_slot(swp_entry(type, offset + i)); +} + /* * Called after dropping swapcache to decrease refcnt to swap entries.
*/

From patchwork Thu Jan 18 11:10:33 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13522711
From: Barry Song <21cnbao@gmail.com>
To: ryan.roberts@arm.com, akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, mhocko@suse.com, shy828301@gmail.com, wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org, ying.huang@intel.com, yuzhao@google.com, surenb@google.com, steven.price@arm.com, Chuanhua Han, Barry Song
Subject: [PATCH RFC 3/6] mm: swap: make should_try_to_free_swap() support large-folio
Date: Fri, 19 Jan 2024 00:10:33 +1300
Message-Id: <20240118111036.72641-4-21cnbao@gmail.com>
In-Reply-To: <20240118111036.72641-1-21cnbao@gmail.com>
References: <20231025144546.577640-1-ryan.roberts@arm.com> <20240118111036.72641-1-21cnbao@gmail.com>

From: Chuanhua Han

should_try_to_free_swap() works with an assumption that swap-in is always done at normal page
granularity, i.e. folio_nr_pages = 1. To support large folio swap-in, this patch removes the assumption. The old check compared the reference count against 2 (one reference held by the swap cache plus the faulting path's own); for a large folio the swap cache holds one reference per page, so the expected count becomes 1 + folio_nr_pages(folio). Signed-off-by: Chuanhua Han Co-developed-by: Barry Song Signed-off-by: Barry Song Acked-by: Chris Li --- mm/memory.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/memory.c b/mm/memory.c index 7e1f4849463a..f61a48929ba7 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3714,7 +3714,7 @@ static inline bool should_try_to_free_swap(struct folio *folio, * reference only in case it's likely that we'll be the exlusive user. */ return (fault_flags & FAULT_FLAG_WRITE) && !folio_test_ksm(folio) && - folio_ref_count(folio) == 2; + folio_ref_count(folio) == (1 + folio_nr_pages(folio)); } static vm_fault_t pte_marker_clear(struct vm_fault *vmf) From patchwork Thu Jan 18 11:10:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com> X-Patchwork-Id: 13522712 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D48DCC47DAF for ; Thu, 18 Jan 2024 11:12:07 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 74A156B0089; Thu, 18 Jan 2024 06:12:07 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 720606B008A; Thu, 18 Jan 2024 06:12:07 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 5E8826B008C; Thu, 18 Jan 2024 06:12:07 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 4F0C76B0089 for ; Thu, 18 Jan 2024 06:12:07 -0500 (EST) Received: from smtpin24.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 2D965160C12 for ; Thu, 18 Jan
2024 11:12:07 +0000 (UTC) X-FDA: 81692167494.24.8ACABF8 Received: from mail-oo1-f45.google.com (mail-oo1-f45.google.com [209.85.161.45]) by imf29.hostedemail.com (Postfix) with ESMTP id 5F69612001F for ; Thu, 18 Jan 2024 11:12:05 +0000 (UTC) Authentication-Results: imf29.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=ZBuGzUuc; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf29.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.161.45 as permitted sender) smtp.mailfrom=21cnbao@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1705576325; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=vxtV65B3xdBW2CsjvzUu/C8QzJUJxjQlgzUH6d2NXwk=; b=rgWv3IHkfa0ThyX8zccis2Y6grH+CRcc4fXXL22m7/+nDxreKAp1wV6Ayhm8XX5+BByThn B9WDQutJnp4FuOLpbFtrE9xWJt3Z8kEDQe+xNHJ6IOMk+Mlw2rH8vfUZM4NU51ac2LvleP FiCrwqhlyWGf+ikE07eJTQXRD+OTCKo= ARC-Authentication-Results: i=1; imf29.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=ZBuGzUuc; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf29.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.161.45 as permitted sender) smtp.mailfrom=21cnbao@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1705576325; a=rsa-sha256; cv=none; b=aR7O7rwGeQSZTMsrUkdzj7UiPmIw2pfdYlmBr6e5zqM8BSiByn+8UUZlsZ4OwIZAjzp+9Z DTV44oNeisdokkQSLiZ+77RjDYlgA9BqpWRDUlhvjTWtiUPyCnlmBV8baDauv2AbFwKAdx Fj6eKsdMVri4wssa+sMMhf5o60b3AEc= Received: by mail-oo1-f45.google.com with SMTP id 006d021491bc7-5986d902ae6so4977714eaf.3 for ; Thu, 18 Jan 2024 03:12:05 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1705576324; x=1706181124; darn=kvack.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=vxtV65B3xdBW2CsjvzUu/C8QzJUJxjQlgzUH6d2NXwk=; b=ZBuGzUuc2kQ2NO+Lf+AHp72FW1vN2AbsF9BAfELW2v+Sbb/jjW5I7ip6SzjhaCZ8op QjxNFPkmFjTkqIyxaqLORgHgqfhOG+dk4uujBAGafg+6iE16sEjNsTihP7qW8ak5vxON JRygxhGnnX9LL/W9hOrvW4607PzWkwp4uL6Q17sMFYZjiIP9yMDJcyn+R5DHVNFgRVnS e69P/op2VVZvR1pGqcLz6pvQBpxu1VD7IvYK0qCvCEQ2vvz5AQYy/vKFNmw71XTTPUHM 7TY3B6nrzbbNrBZgCJuPFv5ttHXFnqH5k47EZzout5W3HX6qnXN41IUcxAjBG8taLJ0Y hsNw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1705576324; x=1706181124; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=vxtV65B3xdBW2CsjvzUu/C8QzJUJxjQlgzUH6d2NXwk=; b=H1vZWG+/BoNBBr0zAdkkJ2Y9OhVazjynLutYpVaTXcNT7A3vKNSGC7sHqr335XiF6p uPH9xGq8dLiAgVLvRjgiZzPr1Jbwwfh0UTVUMXma8F20E3t+VS9d+LkpRIfRritZrVo4 8fzYZsiyIufe02b7VIOiXr9LTv8jD/HoWuODb+8xBq1i1xC7B7suxjTzkULd6CLrROn4 HKk4dwSJB0TYpEfGiWg0I3ujTxs1F+eGKjUu8h1sjTWJ9vOe1dEV1gO6a/z3Mhi2R4Zz k6t05yQfl56hJ0CYYVq7X0QVpN44KT8gFqMQuxj7O/Sg+f/307nXCTI3B4hzYJ3/flVP Gq5g== X-Gm-Message-State: AOJu0YzxPE1HQqbttpKhg1Vmx0Mi2Krr7DAqAd3WHFFB6CaJKF+8S8Bq jjmM14NlhNpPCThSET/QT2EzSDT9UuIxWJnkBTxSqck3YAxgVMJd X-Google-Smtp-Source: AGHT+IEeSGuh99kBzDsDR/89TCeSTS2RWEiKEOvRPUVCzJfTDywJ9YQlQRSPdbA+rjeC2MmjCJPSvw== X-Received: by 2002:a05:6359:ba7:b0:175:5c8c:3ab with SMTP id gf39-20020a0563590ba700b001755c8c03abmr609446rwb.65.1705576324307; Thu, 18 Jan 2024 03:12:04 -0800 (PST) Received: from barry-desktop.. (143.122.224.49.dyn.cust.vf.net.nz. 
[49.224.122.143]) by smtp.gmail.com with ESMTPSA id t19-20020a056a0021d300b006d9be753ac7sm3039107pfj.108.2024.01.18.03.11.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 18 Jan 2024 03:12:03 -0800 (PST) From: Barry Song <21cnbao@gmail.com> To: ryan.roberts@arm.com, akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org, mhocko@suse.com, shy828301@gmail.com, wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org, ying.huang@intel.com, yuzhao@google.com, surenb@google.com, steven.price@arm.com, Chuanhua Han , Barry Song Subject: [PATCH RFC 4/6] mm: support large folios swapin as a whole Date: Fri, 19 Jan 2024 00:10:34 +1300 Message-Id: <20240118111036.72641-5-21cnbao@gmail.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240118111036.72641-1-21cnbao@gmail.com> References: <20231025144546.577640-1-ryan.roberts@arm.com> <20240118111036.72641-1-21cnbao@gmail.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 5F69612001F X-Rspam-User: X-Rspamd-Server: rspam04 X-Stat-Signature: fw9hmojfh815cbf9x3gmiawspmcgenax X-HE-Tag: 1705576325-341397 X-HE-Meta: 
U2FsdGVkX1+sI7qzgIzSFJ2yqlZgIYrDDKCMN/3wuhLq9KAWfnH2kxfo9QnwTiYZlCsa79pj/az4eDe+VN1X7wR9MfG+2t0Dua9hRCzrYPysX+T50GLTSD5HXd/6496di/yQ9RsLan/8VB61lM/uYHSduU17NjCXtZMw7QQin4Bh0Tpv4E6YpefnRllvyoigkODACexSKWeYsrn4GN/Ni2XBGwEzLV14/fV6l7poq4vH9T3wsUawAqvhFd+0o+JNT4YaKHBE09HBb9qmOsV2WPEdm173abNKmGZSxyeZZVfYLf/gXVp1EBXrN/ZUqxOKUn4z7pNCx9TYd9pJrASHkmTRPSikHXzsr5f5B+L4OliAlF3fB5ihTDqhsF935nK9KEji7eumz/qOudRCVYMdkA0mCJK8g71jpOwg7eU0/JbY+fTr3KXiuS27tRYyxmPdDJyrWKExB/4d+f31/scesNuUu6w9AiTONZjluQdCSXuHLbiXI9kROmlsgxbqXRspJ0sW63OnNqV6Jow1r7wB2SmARFKRUb6VY7gjwEjuzCEBrkC0CS85OCSRMw0t8GcEjknQq5UZEHvx2R4uUkYR8Qa8duJyKpbhkE7Tv3JubuQFR4K8rJIhqXcUahcEDal2R0XhgWp4XUdxpbqShJC4e0mqKZnNOi1K97d66Ypa4Fd88h+AipVNppFWMjxpdxReOTTuj0zbaKC5xiYaL/DoythaD5ukdbSft8CfvEDIm09CLgKjv3r/0MmfJ/usHEmwW76zWG/mpacNkdxGKZiKHx88NrWHBJZ23UCtTzaUjdxObnrmH6CBouU9xlfbodycn72isKwPSDaWBjU9/BbV3KBG/aFSmjPIouhYhFcV3UfLSq4U/sclrBf+v4qYEzRMh1cxfj54UVEFE5CW/kP2B9ugnBT9x2yADQFtDfPy9TVc3pjKgJWLwcYNQNCzkpGeSkkPJ/L9bUec/e440Ff 3Zol0wcZ z4kGwghvTdAlDzQPxAlnvZVUs/HNpbq1PWm9pyMNrLosgJH6QRF/+fpU6kh38PQMrtQxWenWNtyj5C/Qt5TN8ZEnJyU5U/xNKX0GiqPd09oDHngfuVm6L9YFfHFxEJk5ZisYtdk4oFY/D4sAcFqdEb2i21AdFqzJ9FW/acKWFAH24UFE29ZPnn1Xk5+e3NuYysKkS75P72qmhsj7mnj9LPBKseZ0JZphotx8j33GiyF9xdBTCjqjAN/r80vOrTO0r/NNzB1pPyZcwSS33cUdkuiTYNm9xf5Dkoe0mXDwk69yAEzFttnjP/bZXvQ9QLK27Kn3XVN3f6hqw0amtHyJ+qEjnuIJUjlBjtMwo07yh0L1O4+PK5qSYSK+2ORL1LtG5iKn5Lizq2x8l8orEA14uFsxbIMlSmrkv7Q/chi4bbIIJq4K6mF9sUmlHqis7iPCG+eTAWHCkeaYxzfyZUddIwOHqqo/+S0YaMSJh X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: From: Chuanhua Han On an embedded system like Android, more than half of anon memory is actually in swap devices such as zRAM. For example, when an app is switched to the background, most of its memory might be swapped out.
Now that we have mTHP, if we don't support swapping large folios in, then once those large folios are swapped out we immediately lose the performance gain we can get through large folios and hardware optimizations such as CONT-PTE. This patch brings up mTHP swap-in support. Right now, we limit mTHP swap-in to contiguous swap entries that were likely swapped out from an mTHP as a whole. The current implementation only covers the SWP_SYNCHRONOUS_IO case; it doesn't support swapin_readahead with large folios yet. We also re-fault large folios that are still in the swapcache as a whole, which effectively reduces the extra loops and early exits that were added to arch_swap_restore() when MTE restore was made to work on folios rather than pages. Signed-off-by: Chuanhua Han Co-developed-by: Barry Song Signed-off-by: Barry Song --- mm/memory.c | 108 +++++++++++++++++++++++++++++++++++++++++++++------- 1 file changed, 94 insertions(+), 14 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index f61a48929ba7..928b3f542932 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -107,6 +107,8 @@ EXPORT_SYMBOL(mem_map); static vm_fault_t do_fault(struct vm_fault *vmf); static vm_fault_t do_anonymous_page(struct vm_fault *vmf); static bool vmf_pte_changed(struct vm_fault *vmf); +static struct folio *alloc_anon_folio(struct vm_fault *vmf, + bool (*pte_range_check)(pte_t *, int)); /* * Return true if the original pte was a uffd-wp pte marker (so the pte was @@ -3784,6 +3786,34 @@ static vm_fault_t handle_pte_marker(struct vm_fault *vmf) return VM_FAULT_SIGBUS; } +static bool pte_range_swap(pte_t *pte, int nr_pages) +{ + int i; + swp_entry_t entry; + unsigned type; + pgoff_t start_offset; + + entry = pte_to_swp_entry(ptep_get_lockless(pte)); + if (non_swap_entry(entry)) + return false; + start_offset = swp_offset(entry); + if (start_offset % nr_pages) + return false; + + type = swp_type(entry); + for (i = 1; i < nr_pages; i++) { + entry =
pte_to_swp_entry(ptep_get_lockless(pte + i)); + if (non_swap_entry(entry)) + return false; + if (swp_offset(entry) != start_offset + i) + return false; + if (swp_type(entry) != type) + return false; + } + + return true; +} + /* * We enter with non-exclusive mmap_lock (to exclude vma changes, * but allow concurrent faults), and pte mapped but not yet locked. @@ -3804,6 +3834,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) pte_t pte; vm_fault_t ret = 0; void *shadow = NULL; + int nr_pages = 1; + unsigned long start_address; + pte_t *start_pte; if (!pte_unmap_same(vmf)) goto out; @@ -3868,13 +3901,20 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) if (data_race(si->flags & SWP_SYNCHRONOUS_IO) && __swap_count(entry) == 1) { /* skip swapcache */ - folio = vma_alloc_folio(GFP_HIGHUSER_MOVABLE, 0, - vma, vmf->address, false); + folio = alloc_anon_folio(vmf, pte_range_swap); page = &folio->page; if (folio) { __folio_set_locked(folio); __folio_set_swapbacked(folio); + if (folio_test_large(folio)) { + unsigned long start_offset; + + nr_pages = folio_nr_pages(folio); + start_offset = swp_offset(entry) & ~(nr_pages - 1); + entry = swp_entry(swp_type(entry), start_offset); + } + if (mem_cgroup_swapin_charge_folio(folio, vma->vm_mm, GFP_KERNEL, entry)) { @@ -3980,6 +4020,39 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) */ vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, &vmf->ptl); + + start_address = vmf->address; + start_pte = vmf->pte; + if (folio_test_large(folio)) { + unsigned long nr = folio_nr_pages(folio); + unsigned long addr = ALIGN_DOWN(vmf->address, nr * PAGE_SIZE); + pte_t *pte_t = vmf->pte - (vmf->address - addr) / PAGE_SIZE; + + /* + * case 1: we are allocating large_folio, try to map it as a whole + * iff the swap entries are still entirely mapped; + * case 2: we hit a large folio in swapcache, and all swap entries + * are still entirely mapped, try to map a large folio as a whole. 
+ * otherwise, map only the faulting page within the large folio + * which is swapcache + */ + if (pte_range_swap(pte_t, nr)) { + start_address = addr; + start_pte = pte_t; + if (unlikely(folio == swapcache)) { + /* + * the below has been done before swap_read_folio() + * for case 1 + */ + nr_pages = nr; + entry = pte_to_swp_entry(ptep_get(start_pte)); + page = &folio->page; + } + } else if (nr_pages > 1) { /* ptes have changed for case 1 */ + goto out_nomap; + } + } + if (unlikely(!vmf->pte || !pte_same(ptep_get(vmf->pte), vmf->orig_pte))) goto out_nomap; @@ -4047,12 +4120,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) * We're already holding a reference on the page but haven't mapped it * yet. */ - swap_free(entry); + swap_nr_free(entry, nr_pages); if (should_try_to_free_swap(folio, vma, vmf->flags)) folio_free_swap(folio); - inc_mm_counter(vma->vm_mm, MM_ANONPAGES); - dec_mm_counter(vma->vm_mm, MM_SWAPENTS); + folio_ref_add(folio, nr_pages - 1); + add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages); + add_mm_counter(vma->vm_mm, MM_SWAPENTS, -nr_pages); + pte = mk_pte(page, vma->vm_page_prot); /* @@ -4062,14 +4137,14 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) * exclusivity. 
*/ if (!folio_test_ksm(folio) && - (exclusive || folio_ref_count(folio) == 1)) { + (exclusive || folio_ref_count(folio) == nr_pages)) { if (vmf->flags & FAULT_FLAG_WRITE) { pte = maybe_mkwrite(pte_mkdirty(pte), vma); vmf->flags &= ~FAULT_FLAG_WRITE; } rmap_flags |= RMAP_EXCLUSIVE; } - flush_icache_page(vma, page); + flush_icache_pages(vma, page, nr_pages); if (pte_swp_soft_dirty(vmf->orig_pte)) pte = pte_mksoft_dirty(pte); if (pte_swp_uffd_wp(vmf->orig_pte)) @@ -4081,14 +4156,15 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) folio_add_new_anon_rmap(folio, vma, vmf->address); folio_add_lru_vma(folio, vma); } else { - folio_add_anon_rmap_pte(folio, page, vma, vmf->address, + folio_add_anon_rmap_ptes(folio, page, nr_pages, vma, start_address, rmap_flags); } VM_BUG_ON(!folio_test_anon(folio) || (pte_write(pte) && !PageAnonExclusive(page))); - set_pte_at(vma->vm_mm, vmf->address, vmf->pte, pte); - arch_do_swap_page(vma->vm_mm, vma, vmf->address, pte, vmf->orig_pte); + set_ptes(vma->vm_mm, start_address, start_pte, pte, nr_pages); + + arch_do_swap_page(vma->vm_mm, vma, start_address, pte, vmf->orig_pte); folio_unlock(folio); if (folio != swapcache && swapcache) { @@ -4105,6 +4181,9 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) } if (vmf->flags & FAULT_FLAG_WRITE) { + if (folio_test_large(folio) && nr_pages > 1) + vmf->orig_pte = ptep_get(vmf->pte); + ret |= do_wp_page(vmf); if (ret & VM_FAULT_ERROR) ret &= VM_FAULT_ERROR; @@ -4112,7 +4191,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf) } /* No need to invalidate - it was non-present before */ - update_mmu_cache_range(vmf, vma, vmf->address, vmf->pte, 1); + update_mmu_cache_range(vmf, vma, start_address, start_pte, nr_pages); unlock: if (vmf->pte) pte_unmap_unlock(vmf->pte, vmf->ptl); @@ -4148,7 +4227,8 @@ static bool pte_range_none(pte_t *pte, int nr_pages) return true; } -static struct folio *alloc_anon_folio(struct vm_fault *vmf) +static struct folio *alloc_anon_folio(struct vm_fault *vmf, + bool 
(*pte_range_check)(pte_t *, int)) { #ifdef CONFIG_TRANSPARENT_HUGEPAGE struct vm_area_struct *vma = vmf->vma; @@ -4190,7 +4270,7 @@ static struct folio *alloc_anon_folio(struct vm_fault *vmf) order = highest_order(orders); while (orders) { addr = ALIGN_DOWN(vmf->address, PAGE_SIZE << order); - if (pte_range_none(pte + pte_index(addr), 1 << order)) + if (pte_range_check(pte + pte_index(addr), 1 << order)) break; order = next_order(&orders, order); } @@ -4269,7 +4349,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) if (unlikely(anon_vma_prepare(vma))) goto oom; /* Returns NULL on OOM or ERR_PTR(-EAGAIN) if we must retry the fault */ - folio = alloc_anon_folio(vmf); + folio = alloc_anon_folio(vmf, pte_range_none); if (IS_ERR(folio)) return 0; if (!folio) From patchwork Thu Jan 18 11:10:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com> X-Patchwork-Id: 13522713 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id DCA4BC4707B for ; Thu, 18 Jan 2024 11:12:31 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 771C36B008C; Thu, 18 Jan 2024 06:12:31 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 721086B0092; Thu, 18 Jan 2024 06:12:31 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 59AA56B0093; Thu, 18 Jan 2024 06:12:31 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 485986B008C for ; Thu, 18 Jan 2024 06:12:31 -0500 (EST) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 18477160AF2 for ; 
Thu, 18 Jan 2024 11:12:31 +0000 (UTC) X-FDA: 81692168502.02.4B0C112 Received: from mail-pf1-f180.google.com (mail-pf1-f180.google.com [209.85.210.180]) by imf16.hostedemail.com (Postfix) with ESMTP id 431B6180018 for ; Thu, 18 Jan 2024 11:12:29 +0000 (UTC) Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=m6QVRXFQ; spf=pass (imf16.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.210.180 as permitted sender) smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1705576349; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=lCBEGTZpNMmcOnq2Y92D+y0mof5icxn15UOhqT4EV4Y=; b=iedAE1s0yHcBhfzLABssg2D2lvseBBWTOoNRIV3715p0fihfqs98LkvqyAymRfZ1KcaMA0 mWBYBvNr3K0YL7lwRPL9/Y61HB8BdsRS+EayYKkrQ5ES2GMJUXXVS4yAZmtNvELKP4aTr3 hic1MEgcqzYjrzWrWxkbKT31YgSiNqw= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=m6QVRXFQ; spf=pass (imf16.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.210.180 as permitted sender) smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1705576349; a=rsa-sha256; cv=none; b=h3C3l8s1TSzlwRrCPl9QCyQt0zCwdVN3WViJzAduzigoTPGM7POQ+QrQFXZjOzyO1thBiJ s8YtrCLuuH7MVwH3ey1/2KSAnC8+fNRpZCBc/N3qngwuQfcTrtz3IEyuUi6nRFSMw1nNC/ AbibVJuRY/rUScqne2jAFd1em2TVUr0= Received: by mail-pf1-f180.google.com with SMTP id d2e1a72fcca58-6dac8955af0so7015751b3a.0 for ; Thu, 18 Jan 2024 03:12:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1705576348; x=1706181148; darn=kvack.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=lCBEGTZpNMmcOnq2Y92D+y0mof5icxn15UOhqT4EV4Y=; b=m6QVRXFQ4bTtZvyDvbR1VKDQaCtr0ZuFXb+Jdi8yGTPkWJlwxSaeafDxwWsNVxfXug XFtCAIiSCIf7e7Ro+6i88d4zTnX5Ci8WBeX2hx17Ax9NmXxysMNqiR9wLUySBK3Y6cac iVrS2Ja9DMO5E9+vztJSDdwWvrb6HdbOA+63HoTecFJ/6Cjdp/wn1gXsuDT3q8fS4GJU a4adc7JYrYAuM812UvCOyUD2n9REdyCbWZUxq6JurvYRnZ130jAX6uhQxzq0oNhf8wyU H1mEU3Thh+5vFmXhydb6Cve50HpB24J2M2cStf50sgYSV5PeJRsvEAVJ0OgdwwmUNvfB icSw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1705576348; x=1706181148; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=lCBEGTZpNMmcOnq2Y92D+y0mof5icxn15UOhqT4EV4Y=; b=gb4/gNESvlgOqZg9YWDZ+F4kzWqdgLVmobP2yM2VuJXqweAzuDWtl9pMd5Xuhh0T+5 q9u3RBDMyzPoDg35GMdLshD2CLkJ6VIjV0LpEgmILGttWUPEYMXeGOce97dAWxNo9B3O kdm05q4bzqIxt9rsWYnYUuAMdpmI80BXdqRZ7qxYJQ+KqYtoJ57ptO4XK+I7o5Fddtw2 L9/594ytvyN/+EIGlRNbt6J24+1aFfd0O61/vk4JEVqkh5HvYGj/OlGonvNfiIYK0oTu WG9S6Imc7wWZi9SKgqXK2asM6EJHTuGR8IKLjxE3hPR2KLx7RexbdvxHWsdaPHUejexk JJ2w== X-Gm-Message-State: AOJu0YyPfeAlSTCvMcswqqlDVV8FKvFyzTug30L7oZWViWLDZArhyrm8 ops0NTRkXSy0RNieebPKX6OARtGtsljEWymO+JnpDmHGrph6l8zj X-Google-Smtp-Source: AGHT+IEJstXUsC4BpkRjY9scs4NUCR+YL22P3xRiQhsTqcfdW5F99LeuFxkMM1kuf1dvP6YKKu38lg== X-Received: by 2002:a05:6a00:170c:b0:6db:6ed8:4e9f with SMTP id h12-20020a056a00170c00b006db6ed84e9fmr684987pfc.56.1705576348144; Thu, 18 Jan 2024 03:12:28 -0800 (PST) Received: from barry-desktop.. (143.122.224.49.dyn.cust.vf.net.nz. 
[49.224.122.143]) by smtp.gmail.com with ESMTPSA id t19-20020a056a0021d300b006d9be753ac7sm3039107pfj.108.2024.01.18.03.12.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 18 Jan 2024 03:12:27 -0800 (PST) From: Barry Song <21cnbao@gmail.com> To: ryan.roberts@arm.com, akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org, mhocko@suse.com, shy828301@gmail.com, wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org, ying.huang@intel.com, yuzhao@google.com, surenb@google.com, steven.price@arm.com, Barry Song , Chuanhua Han Subject: [PATCH RFC 5/6] mm: rmap: weaken the WARN_ON in __folio_add_anon_rmap() Date: Fri, 19 Jan 2024 00:10:35 +1300 Message-Id: <20240118111036.72641-6-21cnbao@gmail.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240118111036.72641-1-21cnbao@gmail.com> References: <20231025144546.577640-1-ryan.roberts@arm.com> <20240118111036.72641-1-21cnbao@gmail.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 431B6180018 X-Rspam-User: X-Stat-Signature: nsctgu94t6auxs7q36j6p4wghz16dcxp X-Rspamd-Server: rspam01 X-HE-Tag: 1705576349-427242 X-HE-Meta: 
U2FsdGVkX18Sj5kUu5PLqzyVje3/nycjOvDQui1SnLTcYI2ezqo15+yAeRP8Hyd1HeoS1bBEQPdv3mac9ZuvNxHUp3qxwu/6EHVvtxLJRg63iBgPLYY9H0ptWbO8b486umJW4Ffm3IKeuUQ5nKEs3OKw93LRin6TfJ5RncufhLIaVUM+jUDs8mtwkffr7MM4sSKR66BP8gjkyL1AANm8nk0c3x/9uhcSMtmTAs4mk+YNmRvue1JBk7EZAQiEi2Vu8hk9lQG9pvYndk4ZEOQYFLyTiT3lPinSqCS6B9YkZcBf5zYpl0q56HQhMLFAMvC21XkXuEp5XO3K/IYlHUWGt0kRBq93dIMlxgVx5rh1k6Ggg0xhYzfRv5aKHzYsuabZZrB7dP4lYC5EbKid+eB4CYvpJsP00VEukjzzUX90ofIApa+XTXAjeMH9WtfrPO498g4UI8F9A0xaUNnzWChmPuG8dzX014DcfufA7U55PtvyKIum9p51ID4JbmuxzVamxqxta3EOxCgCeLV1gxdj+dbA64fyD7Ku7uEc9w0wQ/3M26fpvoMVwEMpvva0Z8FI2d/Fd284XzsZ7muXsMCBGgGghddJieFrf5FDUNSGbY7aEnlHT/4YdZiJfQcdkOL+q67lWEn6XJi1kKUglGJ+8EJJ86scbQn9BpJ3f/sgnFjlm8EDX4kE85Y4Pp6Z4mI9gOjzNZGyWLEAXkBpLkNK6qvOxFBxcxru6LGWEwq3/cep4WsK1HK565MhUC2rmii34Dw5XJmnVyx33pdx6ojZiEnUUKLPz85P5UcJCu7X22REqFhHfzKGVR0HhVTyQXTnAQt/r+pLXJ5nX+1QoxemrGL0ucsea52KJ+0oWM3IdWni2WwpZm4y0+1ETZUmSWxPs51SQsC1csquLRVGaBTTsy16N/Q6bUysloZVG7v29jmZukN4O3ce5E3cxsLlS/bVDeaUR8c06Ut0fKlVa+K 3oEkA6HP Djg9xrwHHH9GCDGMO9rV9/zgagsXm/WAj2P/pqTCoQNiyr+cvWxkLxaaDLOovSay5qdb5hAubq30U0XpXEnTDGMF66ooU16WMyN32XUe0KqCFPSc+gDRFJhSon0iK3i1ue/A74YEN0ix8OoZpiNsvsP59wXnP9Y66LFdajlJkP1NEpAZRMg8KvEilRfjVoYN2OBYLbmtbTgmG61hpvnC3gd2Wh5vEaMbgRr6+meSkI4XkKiwRP4MFNBeWcdXExDPjGokjQGw1guXMNMtmHead8b7BE6LpYy7oKWtvnx4N/nQJuhWskyVAkaDyjUg97SwZcL6fqvyftciInSjzN/wgnV78766VwBx6s0CHXoYhFEMyv3NaQJhf1c//NzKpdFt+yY0kn2pndK3E+Qrh1bK8bBwJV8Tf311df3ISij2L7vLK2HpDFg8fVIeVa1Xp4TpcZLM9qEV8Tgu5usgPEOAGlzMpQKFuM4rNanhDk+aBeZgz1FYSl2xbIvP0uai2h0awshmkxXdrEYd3xL3a3zppNAawnw== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: From: Barry Song In do_swap_page(), while supporting large folio swap-in, we are using the helper folio_add_anon_rmap_ptes(). This triggers a WARN_ON in __folio_add_anon_rmap(). We can quiet the warning in two ways: 1.
in do_swap_page(), we call folio_add_new_anon_rmap() if we are sure the large folio is a newly allocated one; we call folio_add_anon_rmap_ptes() if we find the large folio in the swapcache. 2. we always call folio_add_anon_rmap_ptes() in do_swap_page() but weaken the WARN_ON in __folio_add_anon_rmap() by making it less sensitive. Option 2 seems to be better for do_swap_page() as it can use unified code for all cases. Signed-off-by: Barry Song Tested-by: Chuanhua Han --- mm/rmap.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/mm/rmap.c b/mm/rmap.c index f5d43edad529..469fcfd32317 100644 --- a/mm/rmap.c +++ b/mm/rmap.c @@ -1304,7 +1304,10 @@ static __always_inline void __folio_add_anon_rmap(struct folio *folio, * page. */ VM_WARN_ON_FOLIO(folio_test_large(folio) && - level != RMAP_LEVEL_PMD, folio); + level != RMAP_LEVEL_PMD && + (!IS_ALIGNED(address, nr_pages * PAGE_SIZE) || + (folio_test_swapcache(folio) && !IS_ALIGNED(folio->index, nr_pages)) || + page != &folio->page), folio); __folio_set_anon(folio, vma, address, !!(flags & RMAP_EXCLUSIVE)); } else if (likely(!folio_test_ksm(folio))) { From patchwork Thu Jan 18 11:10:36 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com> X-Patchwork-Id: 13522714 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9A026C4707B for ; Thu, 18 Jan 2024 11:12:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 304886B0093; Thu, 18 Jan 2024 06:12:48 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 2B4356B0095; Thu, 18 Jan 2024 06:12:48 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 156516B0096; Thu, 18 Jan 2024 06:12:48 -0500 (EST) X-Delivered-To:
linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 021976B0093 for ; Thu, 18 Jan 2024 06:12:48 -0500 (EST) Received: from smtpin17.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id C85A2A096A for ; Thu, 18 Jan 2024 11:12:47 +0000 (UTC) X-FDA: 81692169174.17.07734B3 Received: from mail-pf1-f172.google.com (mail-pf1-f172.google.com [209.85.210.172]) by imf05.hostedemail.com (Postfix) with ESMTP id 003E610001D for ; Thu, 18 Jan 2024 11:12:45 +0000 (UTC) Authentication-Results: imf05.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=dlnTSPVM; spf=pass (imf05.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.210.172 as permitted sender) smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1705576366; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=jPRbXml8IYd9XMbGDp/b9xmA+KUiJTFI45JekYQVHFk=; b=OfPRC16YL9nEShP2tZSn9FmIR5VP5Gjaiw8LCAVKGgjJWHKBZbWd9KtLj2iPz4TBgheFEI n4cepubEqZp1bizw8ZJiAB+sDTnZ9v3paCPrPM1LODBuu7E9TjWULXDT5bTrFrnKfCRaxn Zmfb5G2A8sHQyhRCFyKpSElxQuFJ8Wk= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1705576366; a=rsa-sha256; cv=none; b=Zk+PHU4/VBXbZeJdBMr3pC6JgI1NF9zAsdlAUvV2UiWG+o1RHX263uU9S4kOsqi/XZUISZ 0fU3apEz+EARWgIOH2cyxGXxIcvZ2Ziba8ZqklKuC2FIEGGZXutoTStDOflhRHjrem3EWX kpgu2DjgIt/7yGOkhh4+gPWjKbEI6ds= ARC-Authentication-Results: i=1; imf05.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=dlnTSPVM; spf=pass (imf05.hostedemail.com: domain of 21cnbao@gmail.com designates 209.85.210.172 as permitted sender) 
smtp.mailfrom=21cnbao@gmail.com; dmarc=pass (policy=none) header.from=gmail.com Received: by mail-pf1-f172.google.com with SMTP id d2e1a72fcca58-6d9b37f4804so393868b3a.1 for ; Thu, 18 Jan 2024 03:12:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1705576365; x=1706181165; darn=kvack.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=jPRbXml8IYd9XMbGDp/b9xmA+KUiJTFI45JekYQVHFk=; b=dlnTSPVMKHCZCbWoKgVAhaWXCGW/ZzIlLBQ+f6BJqm5fC6zk2s2rO/WabEs8FTvD9O W6bznuLLZ0f/gRMlvNmZq9uUxL4V4X8Fsqy3pyt43bh40SLSEsuaKiMg+p+R8En+98cC QU/HVwwyt1D5VULepZNq/TiEgrsZdw4APyzQ64MzraM8VArfEFIO4BSt3Xyf14PRURGk 2w4Hf/VtGrdT6khUDW91ZCVsscc6WFMpceXMdOeSYkexXKQT4t4sdR2YFjrHiTNv/bGr 7eaMyu3H9bpPyjwTQJ+L2D33ERTGmhA11rLV+aeURUfNPDYq8ZxcVtymMNwIHd2c5H3S njkg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1705576365; x=1706181165; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=jPRbXml8IYd9XMbGDp/b9xmA+KUiJTFI45JekYQVHFk=; b=rRZldn4PFo91q9x1A6pgwiIbcImuFbqXGsFQA+AQWKnbaAAu+A7if8NwFB2KDpGef1 MSwHUBtJSlE+/CQzEL3WSBe6unDFQDnci/kqcIa/u/4DbzJDupNsa7uQ8KY55THEkfEr hzsNdK9xzzpnbJAEVtLTeJR2cW7y14EKbJvBZMhqhekTC7/vYrkhbT6b0pS22ol1QSuY tS+pzNqiEazfR3+At3eznpDg+7t5wdSVG6qk/BqCp0h4ijpH/QxlP2NVspdtdzSkzF9b bRv0T/HOuo6uSmCTGT8MzcQcCtrxLxZ8pqHaADQu2PdTlRltGBv6h2W4J1ZYSyz+jhmb VOVw== X-Gm-Message-State: AOJu0Yzq5lOOUDLFWpP6ullLJdB1O6a9GfDmOd2WFSE7+FsM5T6gpHLm K3Us672o+nIYt0Lk0HQ+6ERKnjw4qc3bzQQWls3gcx31ED+ko4C3iFzAdVsr X-Google-Smtp-Source: AGHT+IHPb2V5o+Kp/DcFse2JrVkm3MDRb3SQdSn4vIY2cU1g3ZhVG+tEV+84Z29sP1LjBn1UyzwA9g== X-Received: by 2002:a05:6a20:548b:b0:19a:fbfa:b16a with SMTP id i11-20020a056a20548b00b0019afbfab16amr1121941pzk.30.1705576364885; Thu, 18 Jan 2024 03:12:44 -0800 (PST) Received: from 
barry-desktop.. (143.122.224.49.dyn.cust.vf.net.nz. [49.224.122.143]) by smtp.gmail.com with ESMTPSA id t19-20020a056a0021d300b006d9be753ac7sm3039107pfj.108.2024.01.18.03.12.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 18 Jan 2024 03:12:44 -0800 (PST) From: Barry Song <21cnbao@gmail.com> To: ryan.roberts@arm.com, akpm@linux-foundation.org, david@redhat.com, linux-mm@kvack.org Cc: linux-kernel@vger.kernel.org, mhocko@suse.com, shy828301@gmail.com, wangkefeng.wang@huawei.com, willy@infradead.org, xiang@kernel.org, ying.huang@intel.com, yuzhao@google.com, surenb@google.com, steven.price@arm.com, Chuanhua Han , Barry Song Subject: [PATCH RFC 6/6] mm: madvise: don't split mTHP for MADV_PAGEOUT Date: Fri, 19 Jan 2024 00:10:36 +1300 Message-Id: <20240118111036.72641-7-21cnbao@gmail.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240118111036.72641-1-21cnbao@gmail.com> References: <20231025144546.577640-1-ryan.roberts@arm.com> <20240118111036.72641-1-21cnbao@gmail.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 003E610001D X-Rspam-User: X-Rspamd-Server: rspam11 X-Stat-Signature: g1ecx1enpir9o7b9ajokk73k57mi9bzq X-HE-Tag: 1705576365-608926 X-HE-Meta: 
U2FsdGVkX18IoUpWhE6k6TJI8KYKcNNGX1gmlgIErglUDpaQRysrZ1Mk9gef6tfWpY2pMZsW7Ms0DM44wKNYCkRVK4zZX5TvvJgp+7cnR4biXBEJ9ssHznUfs4I1QvKeQ0H5BR0k69Ld1khQK9oKGBcA9LJs4P5lmp7PzwyQSQVuDsnXlcHIghzzBM1aVUYE72Mxft4OGasWCCMwhC5slQU3CPyjPauMEmhYPZ4bICCa89cwyW3ELWO0Rkleav0AuLLdd+gcX8JYlZJ35BNebAAHQG+UJR/c+qazXmUVR7DkiTYG7siAXqHTbDDL5Q/Dnl85BJnxWv+xXSOJYWs1jo2c1FOMW9xFR7iAZyohjTQe+bGlemb9/6xESMvNg6jj0M3a3KuAMEzw79/wC3RhOdl025CbQvyt3gKeUIYzQyMUTJrPM6qLjo6v0xszAG8N/tPOwUYOijU5xsAzsGDlPKEVfHhVjfkDuRSIGB7Wj4nkPqagzUO9wrUwu3KOK8gL9QedaIiGQNpaPNgQKOdRQRf7Qw54vtT0zObXaLmdhAzbTsdrsJEZi4sJfV+JkcPeRst6AQNMr5cNJZ75y1BgaQSu+at88yr7g89hMmJEsSqel4ayQs/0iR9+4pzWwPkzA2yVnUsRAzB5chDVWxUEXxNjd+nbxjq8dpFPmSzACKCzJsdXBcJqFGv/2LRET3KhgZPo5Q831/GUuTaPwGt9gakBuG20uSpy/dI4MysjYM1vkod4dL7j543qb0ucCWWvwN58Dwo9AWliYq1P17Opl3jhjdFCPIOdstax0eUzsyrdtqD5HOelKp+YcJMGJz2QeGvFHl05GApCHERRNG+lsUge352MUYar2aqWo4pkac1YK+FVwyv84Yl4qydhW/rnM01KDK3WOYWErNY6feAJgZ/0BmP8XmFkU25k93aZntHiCv8bsOasWWcOUHU58H8m9BHZbbyexZjvqKb3MU0 tfecW3D/ xR1gRMc/BR+8AJNiPcntcECpcXoVO0inixyC0EV7Huf7al6GK18d8vSJxMx5j7YWcxKzkjsz+fKYjNKOMS/r11qkU6HsTjDog4C4NjVLfSo4OqkgmOCWm1GkBv5ZsEsgKw5X6ajXOeZoJ/sIcbbd19PdrBAW2OPqAeUsbA62Z9zcSCLCct9jD/FrnXuMaNooBl3HUEDz0c8Sb58hMB0WoKfeu/TIvQCbiHq3LnYxw/XQ19donpNTTwdJvwVM0v76Ta8o9Kj4ZUgL6M0zeoOxyulvRHRk+z+I8/kfcCT7ldh9rRYaPl3Nq7YFiRB9VADd0uXn9dTgQQPVIrywFk7GRHFgIg6af3tIgwdNV9j8p4vDipqILuazUnIWbdVQ3Zt0wTuuGouIYCUhTREu+lY7srUKzPphqLxBjC8gybu8pUfqAUF7QWYByxUphkYGzMKNf07LBsehbn0rFl40FnxqCz1brcPpnRewB9qEjIL4vcwfSxXU2zgtuKFdj3GQkl9etJj9VL7gtpRgU1v8D8hWoLzsZ5A== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: From: Chuanhua Han MADV_PAGEOUT and MADV_FREE are common cases in Android. Ryan's patchset has supported swapping large folios out as a whole for vmscan case. This patch extends the feature to madvise. If madvised range covers the whole large folio, we don't split it. 
Otherwise, we still need to split it.

This patch doesn't depend on ARM64's CONT-PTE; instead, it defines a
helper named pte_range_cont_mapped() to check whether all PTEs are
contiguously mapped to a large folio.

Signed-off-by: Chuanhua Han
Co-developed-by: Barry Song
Signed-off-by: Barry Song
Signed-off-by: Barry Song
---
 include/asm-generic/tlb.h | 10 +++++++
 include/linux/pgtable.h   | 60 +++++++++++++++++++++++++++++++++++++++
 mm/madvise.c              | 48 +++++++++++++++++++++++++++++++
 3 files changed, 118 insertions(+)

diff --git a/include/asm-generic/tlb.h b/include/asm-generic/tlb.h
index 129a3a759976..f894e22da5d6 100644
--- a/include/asm-generic/tlb.h
+++ b/include/asm-generic/tlb.h
@@ -608,6 +608,16 @@ static inline void tlb_flush_p4d_range(struct mmu_gather *tlb,
 		__tlb_remove_tlb_entry(tlb, ptep, address);		\
 	} while (0)
 
+#define tlb_remove_nr_tlb_entry(tlb, ptep, address, nr)			\
+	do {								\
+		int i;							\
+		tlb_flush_pte_range(tlb, address,			\
+				PAGE_SIZE * nr);			\
+		for (i = 0; i < nr; i++)				\
+			__tlb_remove_tlb_entry(tlb, ptep + i,		\
+					address + i * PAGE_SIZE);	\
+	} while (0)
+
 #define tlb_remove_huge_tlb_entry(h, tlb, ptep, address)	\
 	do {							\
 		unsigned long _sz = huge_page_size(h);		\
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index 37fe83b0c358..da0c1cf447e3 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -320,6 +320,42 @@ static inline pgd_t pgdp_get(pgd_t *pgdp)
 }
 #endif
 
+#ifndef pte_range_cont_mapped
+static inline bool pte_range_cont_mapped(unsigned long start_pfn,
+					 pte_t *start_pte,
+					 unsigned long start_addr,
+					 int nr)
+{
+	int i;
+	pte_t pte_val;
+
+	for (i = 0; i < nr; i++) {
+		pte_val = ptep_get(start_pte + i);
+
+		if (pte_none(pte_val))
+			return false;
+
+		if (pte_pfn(pte_val) != (start_pfn + i))
+			return false;
+	}
+
+	return true;
+}
+#endif
+
+#ifndef pte_range_young
+static inline bool pte_range_young(pte_t *start_pte, int nr)
+{
+	int i;
+
+	for (i = 0; i < nr; i++)
+		if (pte_young(ptep_get(start_pte + i)))
+			return true;
+
+	return false;
+}
+#endif
+
 #ifndef __HAVE_ARCH_PTEP_TEST_AND_CLEAR_YOUNG
 static inline int ptep_test_and_clear_young(struct vm_area_struct *vma,
 					    unsigned long address,
@@ -580,6 +616,23 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
 }
 #endif
 
+#define __HAVE_ARCH_PTEP_GET_AND_CLEAR_RANGE_FULL
+static inline pte_t ptep_get_and_clear_range_full(struct mm_struct *mm,
+						  unsigned long start_addr,
+						  pte_t *start_pte,
+						  int nr, int full)
+{
+	int i;
+	pte_t pte;
+
+	pte = ptep_get_and_clear_full(mm, start_addr, start_pte, full);
+
+	for (i = 1; i < nr; i++)
+		ptep_get_and_clear_full(mm, start_addr + i * PAGE_SIZE,
+					start_pte + i, full);
+
+	return pte;
+}
 
 /*
  * If two threads concurrently fault at the same page, the thread that
@@ -995,6 +1048,13 @@ static inline void arch_swap_restore(swp_entry_t entry, struct folio *folio)
 })
 #endif
 
+#ifndef pte_nr_addr_end
+#define pte_nr_addr_end(addr, size, end)				\
+({	unsigned long __boundary = ((addr) + size) & (~(size - 1));	\
+	(__boundary - 1 < (end) - 1)? __boundary: (end);		\
+})
+#endif
+
 /*
  * When walking page tables, we usually want to skip any p?d_none entries;
  * and any p?d_bad entries - reporting the error before resetting to none.
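To illustrate the two pgtable.h helpers above, here is a minimal userspace sketch, not kernel code. It models a PTE as a hypothetical `struct model_pte` (present bit plus PFN) and assumes a 4 KiB page size; all `model_*` names are invented for this example. It shows how the contiguity check and the boundary computation combine to decide whether a madvise range covers a whole folio.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumption for illustration only: 4 KiB pages. */
#define MODEL_PAGE_SIZE 4096UL

/* Hypothetical stand-in for a PTE: a present bit and a page frame number. */
struct model_pte {
	bool present;
	unsigned long pfn;
};

/*
 * Mirrors the logic of pte_range_cont_mapped(): every entry must be
 * present and entry i must map start_pfn + i.
 */
static bool model_range_cont_mapped(unsigned long start_pfn,
				    const struct model_pte *start_pte, int nr)
{
	for (int i = 0; i < nr; i++) {
		if (!start_pte[i].present)
			return false;
		if (start_pte[i].pfn != start_pfn + i)
			return false;
	}
	return true;
}

/*
 * Mirrors the arithmetic of pte_nr_addr_end(): the next size-aligned
 * boundary above addr, clamped to end. size must be a power of two.
 */
static unsigned long model_nr_addr_end(unsigned long addr, unsigned long size,
				       unsigned long end)
{
	unsigned long boundary;

	assert(size != 0 && (size & (size - 1)) == 0);
	boundary = (addr + size) & ~(size - 1);
	return (boundary - 1 < end - 1) ? boundary : end;
}

/*
 * True when [addr, end) contains the whole folio starting at addr,
 * i.e. the "next - addr == folio_size" condition used in mm/madvise.c.
 */
static bool model_covers_whole_folio(unsigned long addr, unsigned long end,
				     int nr_pages)
{
	unsigned long folio_size = MODEL_PAGE_SIZE * (unsigned long)nr_pages;
	unsigned long next = model_nr_addr_end(addr, folio_size, end);

	return next - addr == folio_size;
}
```

With a 16-page (64 KiB) folio, a range starting at the folio's first page and extending at least 64 KiB passes the check. A range that starts mid-folio, or ends before the folio does, fails it and takes the split path.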
diff --git a/mm/madvise.c b/mm/madvise.c
index 912155a94ed5..262460ac4b2e 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -452,6 +452,54 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 		if (folio_test_large(folio)) {
 			int err;
 
+			if (!folio_test_pmd_mappable(folio)) {
+				int nr_pages = folio_nr_pages(folio);
+				unsigned long folio_size = PAGE_SIZE * nr_pages;
+				unsigned long start_addr = ALIGN_DOWN(addr, nr_pages * PAGE_SIZE);
+				unsigned long start_pfn = page_to_pfn(folio_page(folio, 0));
+				pte_t *start_pte = pte - (addr - start_addr) / PAGE_SIZE;
+				unsigned long next = pte_nr_addr_end(addr, folio_size, end);
+
+				if (!pte_range_cont_mapped(start_pfn, start_pte, start_addr, nr_pages))
+					goto split;
+
+				if (next - addr != folio_size) {
+					goto split;
+				} else {
+					/* Do not interfere with other mappings of this page */
+					if (folio_estimated_sharers(folio) != 1)
+						goto skip;
+
+					VM_BUG_ON(addr != start_addr || pte != start_pte);
+
+					if (pte_range_young(start_pte, nr_pages)) {
+						ptent = ptep_get_and_clear_range_full(mm, start_addr, start_pte,
+										      nr_pages, tlb->fullmm);
+						ptent = pte_mkold(ptent);
+
+						set_ptes(mm, start_addr, start_pte, ptent, nr_pages);
+						tlb_remove_nr_tlb_entry(tlb, start_pte, start_addr, nr_pages);
+					}
+
+					folio_clear_referenced(folio);
+					folio_test_clear_young(folio);
+					if (pageout) {
+						if (folio_isolate_lru(folio)) {
+							if (folio_test_unevictable(folio))
+								folio_putback_lru(folio);
+							else
+								list_add(&folio->lru, &folio_list);
+						}
+					} else
+						folio_deactivate(folio);
+				}
+skip:
+				pte += (next - PAGE_SIZE - (addr & PAGE_MASK))/PAGE_SIZE;
+				addr = next - PAGE_SIZE;
+				continue;
+
+			}
+split:
 			if (folio_estimated_sharers(folio) != 1)
 				break;
 			if (pageout_anon_only_filter && !folio_test_anon(folio))
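The cursor arithmetic in the hunk above (start_addr, start_pte, and the "skip:" step) can be checked in isolation. The following is a rough userspace sketch under stated assumptions: 4 KiB pages, a hypothetical `struct model_cursor` in place of the kernel's `pte`/`addr` pair, and an enclosing loop that advances both by one page per iteration, which the `continue` suggests but this hunk does not show. All `model_*` names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption for illustration only: 4 KiB pages. */
#define MODEL_PAGE_SIZE 4096UL
#define MODEL_PAGE_MASK (~(MODEL_PAGE_SIZE - 1))

/* Hypothetical cursor: an index into a PTE table and the address it maps. */
struct model_cursor {
	size_t pte_idx;
	unsigned long addr;
};

/* Round addr down to the folio start (folio_size must be a power of two). */
static unsigned long model_folio_start(unsigned long addr,
				       unsigned long folio_size)
{
	assert(folio_size != 0 && (folio_size & (folio_size - 1)) == 0);
	return addr & ~(folio_size - 1);
}

/* Index of the folio's first PTE, derived as start_pte is in the patch. */
static size_t model_start_pte_idx(size_t pte_idx, unsigned long addr,
				  unsigned long start_addr)
{
	return pte_idx - (addr - start_addr) / MODEL_PAGE_SIZE;
}

/* The "skip:" step: park the cursor on the last page before next. */
static struct model_cursor model_skip_to_last(struct model_cursor cur,
					      unsigned long next)
{
	cur.pte_idx += (next - MODEL_PAGE_SIZE - (cur.addr & MODEL_PAGE_MASK)) /
		       MODEL_PAGE_SIZE;
	cur.addr = next - MODEL_PAGE_SIZE;
	return cur;
}

/* Assumed per-iteration step of the enclosing loop: one PTE, one page. */
static struct model_cursor model_loop_step(struct model_cursor cur)
{
	cur.pte_idx++;
	cur.addr += MODEL_PAGE_SIZE;
	return cur;
}
```

Parking the cursor one page short of `next` means the loop's own increment lands exactly on `next`, so the following iteration starts at the first page after the folio rather than overshooting by one.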