From patchwork Fri Feb 28 10:00:23 2025
From: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org, hannes@cmpxchg.org,
    yosry.ahmed@linux.dev, nphamcs@gmail.com, chengming.zhou@linux.dev,
    usamaarif642@gmail.com, ryan.roberts@arm.com, 21cnbao@gmail.com,
    ying.huang@linux.alibaba.com, akpm@linux-foundation.org,
    linux-crypto@vger.kernel.org, herbert@gondor.apana.org.au,
    davem@davemloft.net, clabbe@baylibre.com, ardb@kernel.org,
    ebiggers@google.com, surenb@google.com, kristen.c.accardi@intel.com
Cc: wajdi.k.feghali@intel.com, vinodh.gopal@intel.com, kanchana.p.sridhar@intel.com
Subject: [PATCH v7 14/15] mm: zswap: Restructure & simplify zswap_store() to make it amenable for batching.
Date: Fri, 28 Feb 2025 02:00:23 -0800
Message-Id: <20250228100024.332528-15-kanchana.p.sridhar@intel.com>
In-Reply-To: <20250228100024.332528-1-kanchana.p.sridhar@intel.com>
References: <20250228100024.332528-1-kanchana.p.sridhar@intel.com>

This patch introduces zswap_store_folio(), which carries out, for all the
pages in a folio, the work that zswap_store_page() previously did for a
single page. This allows us to move the loop over the folio's pages from
zswap_store() to zswap_store_folio().

zswap_store_folio() starts by allocating all zswap entries required to store
the folio. Next, it iterates over the folio's pages, and for each page, it
calls zswap_compress(), adds the zswap entry to the xarray and LRU, charges
zswap memory and increments zswap stats.

The error handling and cleanup required for all failure scenarios that can
occur while storing a folio in zswap are now consolidated to a
"store_folio_failed" label in zswap_store_folio().

These changes facilitate developing support for compress batching in
zswap_store_folio().
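To illustrate the error-handling idea outside of the kernel, below is a
minimal, self-contained userspace C sketch of the same pattern: allocate
every entry up front, mark each with a sentinel, and unwind from a single
failure label that frees only what was actually acquired. This sketch is
not part of the patch; the names (struct entry, fake_compress(), store_all())
are made up for illustration, and NULL plays the role that the
ERR_PTR(-EINVAL) sentinel handle plays in zswap_store_folio():

#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

struct entry {
        void *handle;           /* NULL == "no resource acquired yet" sentinel */
};

/* Stand-in for zswap_compress()/zpool_malloc(); pretend index 512 fails. */
static bool fake_compress(struct entry *e, long i)
{
        if (i == 512)
                return false;
        e->handle = malloc(64);
        return e->handle != NULL;
}

static bool store_all(long nr)
{
        struct entry **entries;
        long i;

        entries = calloc(nr, sizeof(*entries));  /* zeroed: unused slots stay NULL */
        if (!entries)
                return false;

        /* Phase 1: allocate every entry up front; handle keeps its sentinel value. */
        for (i = 0; i < nr; i++) {
                entries[i] = calloc(1, sizeof(**entries));
                if (!entries[i])
                        goto store_failed;
        }

        /* Phase 2: per-entry work; a non-NULL handle marks an acquired resource. */
        for (i = 0; i < nr; i++) {
                if (!fake_compress(entries[i], i))
                        goto store_failed;
        }

        free(entries);
        return true;

store_failed:
        /* One label unwinds everything: NULL slots and NULL handles are skipped. */
        for (i = 0; i < nr && entries[i]; i++) {
                free(entries[i]->handle);       /* free(NULL) is a no-op */
                free(entries[i]);
        }
        free(entries);
        return false;
}

int main(void)
{
        /* Fails at index 512, then cleans up from the single failure label. */
        printf("store_all(1024) -> %d\n", store_all(1024));
        return 0;
}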
Signed-off-by: Kanchana P Sridhar <kanchana.p.sridhar@intel.com>
---
 mm/zswap.c | 166 +++++++++++++++++++++++++++++++----------------------
 1 file changed, 98 insertions(+), 68 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 6aa602b8514e..ab9167220cb6 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1580,81 +1580,115 @@ static void shrink_worker(struct work_struct *w)
 * main API
 **********************************/

-static bool zswap_store_page(struct page *page,
-                             struct obj_cgroup *objcg,
-                             struct zswap_pool *pool)
+/*
+ * Store all pages in a folio.
+ *
+ * The error handling from all failure points is consolidated to the
+ * "store_folio_failed" label, based on the initialization of the zswap entries'
+ * handles to ERR_PTR(-EINVAL) at allocation time, and the fact that the
+ * entry's handle is subsequently modified only upon a successful zpool_malloc()
+ * after the page is compressed.
+ */
+static bool zswap_store_folio(struct folio *folio,
+                              struct obj_cgroup *objcg,
+                              struct zswap_pool *pool)
 {
-        swp_entry_t page_swpentry = page_swap_entry(page);
-        struct zswap_entry *entry, *old;
+        long index, from_index = 0, nr_pages = folio_nr_pages(folio);
+        struct zswap_entry **entries = NULL;
+        int node_id = folio_nid(folio);

-        /* allocate entry */
-        entry = zswap_entry_cache_alloc(GFP_KERNEL, page_to_nid(page));
-        if (!entry) {
-                zswap_reject_kmemcache_fail++;
+        entries = kmalloc(nr_pages * sizeof(*entries), GFP_KERNEL);
+        if (!entries)
                 return false;
-        }

-        if (!zswap_compress(page, entry, pool))
-                goto compress_failed;
+        for (index = from_index; index < nr_pages; ++index) {
+                entries[index] = zswap_entry_cache_alloc(GFP_KERNEL, node_id);

-        old = xa_store(swap_zswap_tree(page_swpentry),
-                       swp_offset(page_swpentry),
-                       entry, GFP_KERNEL);
-        if (xa_is_err(old)) {
-                int err = xa_err(old);
+                if (!entries[index]) {
+                        zswap_reject_kmemcache_fail++;
+                        nr_pages = index;
+                        goto store_folio_failed;
+                }

-                WARN_ONCE(err != -ENOMEM, "unexpected xarray error: %d\n", err);
-                zswap_reject_alloc_fail++;
-                goto store_failed;
+                entries[index]->handle = (unsigned long)ERR_PTR(-EINVAL);
         }

-        /*
-         * We may have had an existing entry that became stale when
-         * the folio was redirtied and now the new version is being
-         * swapped out. Get rid of the old.
-         */
-        if (old)
-                zswap_entry_free(old);
+        for (index = from_index; index < nr_pages; ++index) {
+                struct page *page = folio_page(folio, index);
+                swp_entry_t page_swpentry = page_swap_entry(page);
+                struct zswap_entry *old, *entry = entries[index];

-        /*
-         * The entry is successfully compressed and stored in the tree, there is
-         * no further possibility of failure. Grab refs to the pool and objcg,
-         * charge zswap memory, and increment zswap_stored_pages.
-         * The opposite actions will be performed by zswap_entry_free()
-         * when the entry is removed from the tree.
-         */
-        zswap_pool_get(pool);
-        if (objcg) {
-                obj_cgroup_get(objcg);
-                obj_cgroup_charge_zswap(objcg, entry->length);
-        }
-        atomic_long_inc(&zswap_stored_pages);
+                if (!zswap_compress(page, entry, pool)) {
+                        from_index = index;
+                        goto store_folio_failed;
+                }

-        /*
-         * We finish initializing the entry while it's already in xarray.
-         * This is safe because:
-         *
-         * 1. Concurrent stores and invalidations are excluded by folio lock.
-         *
-         * 2. Writeback is excluded by the entry not being on the LRU yet.
-         *    The publishing order matters to prevent writeback from seeing
-         *    an incoherent entry.
-         */
-        entry->pool = pool;
-        entry->swpentry = page_swpentry;
-        entry->objcg = objcg;
-        entry->referenced = true;
-        if (entry->length) {
-                INIT_LIST_HEAD(&entry->lru);
-                zswap_lru_add(&zswap_list_lru, entry);
+                old = xa_store(swap_zswap_tree(page_swpentry),
+                               swp_offset(page_swpentry),
+                               entry, GFP_KERNEL);
+                if (xa_is_err(old)) {
+                        int err = xa_err(old);
+
+                        WARN_ONCE(err != -ENOMEM, "unexpected xarray error: %d\n", err);
+                        zswap_reject_alloc_fail++;
+                        from_index = index;
+                        goto store_folio_failed;
+                }
+
+                /*
+                 * We may have had an existing entry that became stale when
+                 * the folio was redirtied and now the new version is being
+                 * swapped out. Get rid of the old.
+                 */
+                if (old)
+                        zswap_entry_free(old);
+
+                /*
+                 * The entry is successfully compressed and stored in the tree, there is
+                 * no further possibility of failure. Grab refs to the pool and objcg,
+                 * charge zswap memory, and increment zswap_stored_pages.
+                 * The opposite actions will be performed by zswap_entry_free()
+                 * when the entry is removed from the tree.
+                 */
+                zswap_pool_get(pool);
+                if (objcg) {
+                        obj_cgroup_get(objcg);
+                        obj_cgroup_charge_zswap(objcg, entry->length);
+                }
+                atomic_long_inc(&zswap_stored_pages);
+
+                /*
+                 * We finish initializing the entry while it's already in xarray.
+                 * This is safe because:
+                 *
+                 * 1. Concurrent stores and invalidations are excluded by folio lock.
+                 *
+                 * 2. Writeback is excluded by the entry not being on the LRU yet.
+                 *    The publishing order matters to prevent writeback from seeing
+                 *    an incoherent entry.
+                 */
+                entry->pool = pool;
+                entry->swpentry = page_swpentry;
+                entry->objcg = objcg;
+                entry->referenced = true;
+                if (entry->length) {
+                        INIT_LIST_HEAD(&entry->lru);
+                        zswap_lru_add(&zswap_list_lru, entry);
+                }
         }

+        kfree(entries);
         return true;

-store_failed:
-        zpool_free(pool->zpool, entry->handle);
-compress_failed:
-        zswap_entry_cache_free(entry);
+store_folio_failed:
+        for (index = from_index; index < nr_pages; ++index) {
+                if (!IS_ERR_VALUE(entries[index]->handle))
+                        zpool_free(pool->zpool, entries[index]->handle);
+
+                zswap_entry_cache_free(entries[index]);
+        }
+
+        kfree(entries);
         return false;
 }

@@ -1666,7 +1700,6 @@ bool zswap_store(struct folio *folio)
         struct mem_cgroup *memcg = NULL;
         struct zswap_pool *pool;
         bool ret = false;
-        long index;

         VM_WARN_ON_ONCE(!folio_test_locked(folio));
         VM_WARN_ON_ONCE(!folio_test_swapcache(folio));
@@ -1700,12 +1733,8 @@ bool zswap_store(struct folio *folio)
                 mem_cgroup_put(memcg);
         }

-        for (index = 0; index < nr_pages; ++index) {
-                struct page *page = folio_page(folio, index);
-
-                if (!zswap_store_page(page, objcg, pool))
-                        goto put_pool;
-        }
+        if (!zswap_store_folio(folio, objcg, pool))
+                goto put_pool;

         if (objcg)
                 count_objcg_events(objcg, ZSWPOUT, nr_pages);
@@ -1732,6 +1761,7 @@ bool zswap_store(struct folio *folio)
                 pgoff_t offset = swp_offset(swp);
                 struct zswap_entry *entry;
                 struct xarray *tree;
+                long index;

                 for (index = 0; index < nr_pages; ++index) {
                         tree = swap_zswap_tree(swp_entry(type, offset + index));