From patchwork Mon Apr 7 23:42:05 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Nhat Pham
X-Patchwork-Id: 14041995
From: Nhat Pham <nphamcs@gmail.com>
To: linux-mm@kvack.org
Cc: akpm@linux-foundation.org, hannes@cmpxchg.org, hughd@google.com,
 yosry.ahmed@linux.dev, mhocko@kernel.org, roman.gushchin@linux.dev,
 shakeel.butt@linux.dev, muchun.song@linux.dev, len.brown@intel.com,
 chengming.zhou@linux.dev, kasong@tencent.com, chrisl@kernel.org,
 huang.ying.caritas@gmail.com, ryan.roberts@arm.com,
 viro@zeniv.linux.org.uk, baohua@kernel.org, osalvador@suse.de,
 lorenzo.stoakes@oracle.com, christophe.leroy@csgroup.eu,
 pavel@kernel.org, kernel-team@meta.com, linux-kernel@vger.kernel.org,
 cgroups@vger.kernel.org, linux-pm@vger.kernel.org
Subject: [RFC PATCH 04/14] mm: swap: swap cache support for virtualized swap
Date: Mon, 7 Apr 2025 16:42:05 -0700
Message-ID: <20250407234223.1059191-5-nphamcs@gmail.com>
X-Mailer: git-send-email 2.47.1
In-Reply-To: <20250407234223.1059191-1-nphamcs@gmail.com>
References: <20250407234223.1059191-1-nphamcs@gmail.com>

Currently, the swap cache code assumes that the swap space has a fixed
size. The virtual swap space is dynamically sized, so the existing
partitioning code cannot be easily reused. Dynamic partitioning is
planned, but for now keep the design simple and use a flat swap cache
for vswap.

Since the vswap implementation has begun to diverge from the old
implementation, also introduce a new build config
(CONFIG_VIRTUAL_SWAP). Users who do not select this config get the old
implementation, with no behavioral change.

Signed-off-by: Nhat Pham
---
 mm/Kconfig      | 13 ++++++++++
 mm/swap.h       | 22 ++++++++++------
 mm/swap_state.c | 68 +++++++++++++++++++++++++++++++++++++++++--------
 3 files changed, 85 insertions(+), 18 deletions(-)

diff --git a/mm/Kconfig b/mm/Kconfig
index 1b501db06417..1a6acdb64333 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -22,6 +22,19 @@ menuconfig SWAP
 	  used to provide more virtual memory than the actual RAM present
 	  in your computer. If unsure say Y.
 
+config VIRTUAL_SWAP
+	bool "Swap space virtualization"
+	depends on SWAP
+	default n
+	help
+	  When this is selected, the kernel is built with the new swap
+	  design. This will allow us to decouple the swap backends
+	  (zswap, on-disk swapfile, etc.), and save disk space when we
+	  use zswap (or the zero-filled swap page optimization).
+
+	  There might be more lock contentions with heavy swap use, since
+	  the swap cache is no longer range partitioned.
+
 config ZSWAP
 	bool "Compressed cache for swap pages"
 	depends on SWAP
diff --git a/mm/swap.h b/mm/swap.h
index d5f8effa8015..06e20b1d79c4 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -22,22 +22,27 @@ void swap_write_unplug(struct swap_iocb *sio);
 int swap_writepage(struct page *page, struct writeback_control *wbc);
 void __swap_writepage(struct folio *folio, struct writeback_control *wbc);
 
-/* linux/mm/swap_state.c */
-/* One swap address space for each 64M swap space */
+/* Return the swap device position of the swap slot. */
+static inline loff_t swap_slot_pos(swp_slot_t slot)
+{
+	return ((loff_t)swp_slot_offset(slot)) << PAGE_SHIFT;
+}
+
 #define SWAP_ADDRESS_SPACE_SHIFT	14
 #define SWAP_ADDRESS_SPACE_PAGES	(1 << SWAP_ADDRESS_SPACE_SHIFT)
 #define SWAP_ADDRESS_SPACE_MASK		(SWAP_ADDRESS_SPACE_PAGES - 1)
+
+/* linux/mm/swap_state.c */
+#ifdef CONFIG_VIRTUAL_SWAP
+extern struct address_space *swap_address_space(swp_entry_t entry);
+#define swap_cache_index(entry) entry.val
+#else
+/* One swap address space for each 64M swap space */
 extern struct address_space *swapper_spaces[];
 #define swap_address_space(entry)			    \
 	(&swapper_spaces[swp_type(entry)][swp_offset(entry) \
 		>> SWAP_ADDRESS_SPACE_SHIFT])
 
-/* Return the swap device position of the swap slot. */
-static inline loff_t swap_slot_pos(swp_slot_t slot)
-{
-	return ((loff_t)swp_slot_offset(slot)) << PAGE_SHIFT;
-}
-
 /*
  * Return the swap cache index of the swap entry.
  */
@@ -46,6 +51,7 @@ static inline pgoff_t swap_cache_index(swp_entry_t entry)
 	BUILD_BUG_ON((SWP_OFFSET_MASK | SWAP_ADDRESS_SPACE_MASK) != SWP_OFFSET_MASK);
 	return swp_offset(entry) & SWAP_ADDRESS_SPACE_MASK;
 }
+#endif
 
 void show_swap_cache_info(void);
 bool add_to_swap(struct folio *folio);
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 055e555d3382..268338a0ea57 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -38,10 +38,19 @@ static const struct address_space_operations swap_aops = {
 #endif
 };
 
+#ifdef CONFIG_VIRTUAL_SWAP
+static struct address_space swapper_space __read_mostly;
+
+struct address_space *swap_address_space(swp_entry_t entry)
+{
+	return &swapper_space;
+}
+#else
 struct address_space *swapper_spaces[MAX_SWAPFILES] __read_mostly;
 static unsigned int nr_swapper_spaces[MAX_SWAPFILES] __read_mostly;
-static bool enable_vma_readahead __read_mostly = true;
+#endif
 
+static bool enable_vma_readahead __read_mostly = true;
 #define SWAP_RA_ORDER_CEILING	5
 
 #define SWAP_RA_WIN_SHIFT	(PAGE_SHIFT / 2)
@@ -260,6 +269,28 @@ void delete_from_swap_cache(struct folio *folio)
 	folio_ref_sub(folio, folio_nr_pages(folio));
 }
 
+#ifdef CONFIG_VIRTUAL_SWAP
+void clear_shadow_from_swap_cache(int type, unsigned long begin,
+				unsigned long end)
+{
+	swp_slot_t slot = swp_slot(type, begin);
+	swp_entry_t entry = swp_slot_to_swp_entry(slot);
+	unsigned long index = swap_cache_index(entry);
+	struct address_space *address_space = swap_address_space(entry);
+	void *old;
+	XA_STATE(xas, &address_space->i_pages, index);
+
+	xas_set_update(&xas, workingset_update_node);
+
+	xa_lock_irq(&address_space->i_pages);
+	xas_for_each(&xas, old, entry.val + end - begin) {
+		if (!xa_is_value(old))
+			continue;
+		xas_store(&xas, NULL);
+	}
+	xa_unlock_irq(&address_space->i_pages);
+}
+#else
 void clear_shadow_from_swap_cache(int type, unsigned long begin,
 				unsigned long end)
 {
@@ -290,6 +321,7 @@ void clear_shadow_from_swap_cache(int type, unsigned long begin,
 			break;
 	}
 }
+#endif
 
 /*
  * If we are the only user, then try to free up the swap cache.
@@ -718,23 +750,34 @@ struct folio *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 	return folio;
 }
 
+static void init_swapper_space(struct address_space *space)
+{
+	xa_init_flags(&space->i_pages, XA_FLAGS_LOCK_IRQ);
+	atomic_set(&space->i_mmap_writable, 0);
+	space->a_ops = &swap_aops;
+	/* swap cache doesn't use writeback related tags */
+	mapping_set_no_writeback_tags(space);
+}
+
+#ifdef CONFIG_VIRTUAL_SWAP
 int init_swap_address_space(unsigned int type, unsigned long nr_pages)
 {
-	struct address_space *spaces, *space;
+	return 0;
+}
+
+void exit_swap_address_space(unsigned int type) {}
+#else
+int init_swap_address_space(unsigned int type, unsigned long nr_pages)
+{
+	struct address_space *spaces;
 	unsigned int i, nr;
 
 	nr = DIV_ROUND_UP(nr_pages, SWAP_ADDRESS_SPACE_PAGES);
 	spaces = kvcalloc(nr, sizeof(struct address_space), GFP_KERNEL);
 	if (!spaces)
 		return -ENOMEM;
-	for (i = 0; i < nr; i++) {
-		space = spaces + i;
-		xa_init_flags(&space->i_pages, XA_FLAGS_LOCK_IRQ);
-		atomic_set(&space->i_mmap_writable, 0);
-		space->a_ops = &swap_aops;
-		/* swap cache doesn't use writeback related tags */
-		mapping_set_no_writeback_tags(space);
-	}
+	for (i = 0; i < nr; i++)
+		init_swapper_space(spaces + i);
 	nr_swapper_spaces[type] = nr;
 	swapper_spaces[type] = spaces;
 
@@ -752,6 +795,7 @@ void exit_swap_address_space(unsigned int type)
 	nr_swapper_spaces[type] = 0;
 	swapper_spaces[type] = NULL;
 }
+#endif
 
 static int swap_vma_ra_win(struct vm_fault *vmf, unsigned long *start,
 			   unsigned long *end)
@@ -930,6 +974,10 @@ static int __init swap_init_sysfs(void)
 	int err;
 	struct kobject *swap_kobj;
 
+#ifdef CONFIG_VIRTUAL_SWAP
+	init_swapper_space(&swapper_space);
+#endif
+
 	swap_kobj = kobject_create_and_add("swap", mm_kobj);
 	if (!swap_kobj) {
 		pr_err("failed to create swap kobject\n");
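
Illustration (not part of the patch): the difference between the
range-partitioned swap cache and the flat one described in the commit
message can be modeled in plain userspace C. This is only a rough sketch.
The struct and function names (cache_pos, partitioned_pos, flat_pos,
partitioned_nr_spaces) are hypothetical and do not exist in the kernel;
only the shift value of 14 and the "index is the whole entry value" rule
are taken from the diff above. The per-device swap type, which the
partitioned scheme also uses to pick its array of trees, is left out for
brevity.

```c
#include <assert.h>

/* Same shift as SWAP_ADDRESS_SPACE_SHIFT in the patch; other names are made up. */
#define ADDRESS_SPACE_SHIFT 14
#define ADDRESS_SPACE_PAGES (1UL << ADDRESS_SPACE_SHIFT)
#define ADDRESS_SPACE_MASK  (ADDRESS_SPACE_PAGES - 1)

/* Where a swap cache lookup lands: which tree, and which index inside it. */
struct cache_pos {
	unsigned long space;	/* which address space (tree) is used */
	unsigned long index;	/* index within that tree */
};

/*
 * Partitioned scheme (hypothetical model): one tree per 2^14 pages of a
 * swap device, so the high bits of the offset pick the tree and the low
 * bits pick the index inside it.
 */
static struct cache_pos partitioned_pos(unsigned long offset)
{
	struct cache_pos pos = {
		.space = offset >> ADDRESS_SPACE_SHIFT,
		.index = offset & ADDRESS_SPACE_MASK,
	};
	return pos;
}

/*
 * Flat scheme (hypothetical model): a single tree, indexed by the whole
 * entry value, so no size needs to be known up front.
 */
static struct cache_pos flat_pos(unsigned long entry_val)
{
	struct cache_pos pos = { .space = 0, .index = entry_val };
	return pos;
}

/* Trees needed for a device of nr_pages pages (a DIV_ROUND_UP equivalent). */
static unsigned long partitioned_nr_spaces(unsigned long nr_pages)
{
	assert(nr_pages > 0);
	return (nr_pages + ADDRESS_SPACE_PAGES - 1) / ADDRESS_SPACE_PAGES;
}
```

In this model, partitioned_pos(0x12345) selects tree 4 at index 0x2345,
while flat_pos(0x12345) always selects tree 0 at index 0x12345. The
partitioned variant has to know nr_pages when the trees are allocated,
which is the fixed-size assumption the commit message refers to. The flat
variant avoids that at the cost of funnelling every lookup through one
tree and its lock.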