From patchwork Mon Mar 3 20:06:27 2025
X-Patchwork-Submitter: Nhat Pham
X-Patchwork-Id: 13999398
From: Nhat Pham <nphamcs@gmail.com>
To: akpm@linux-foundation.org
Cc: hannes@cmpxchg.org, yosryahmed@google.com, yosry.ahmed@linux.dev,
	chengming.zhou@linux.dev, linux-mm@kvack.org, kernel-team@meta.com,
	linux-kernel@vger.kernel.org
Subject: [PATCH v3] page_io: zswap: do not crash the kernel on decompression
 failure
Date: Mon, 3 Mar 2025 12:06:27 -0800
Message-ID: <20250303200627.2102890-1-nphamcs@gmail.com>
Currently, we crash the kernel when a decompression failure occurs in
zswap (either because of memory corruption, or a bug
in the compression algorithm). This is overkill. We should only SIGBUS
the unfortunate process asking for the zswap entry on zswap load, and
skip the corrupted entry in zswap writeback. See [1] for a recent
upstream discussion about this.

The zswap writeback case is relatively straightforward to fix. For the
zswap_load() case, we reorganize the swap read paths, having
swap_read_folio_zeromap() and zswap_load() return specific error codes:

* 0 indicates the backend owns the swapped out content, which was
  successfully loaded into the page.

* -ENOENT indicates the backend does not own the swapped out content.

* -EINVAL and -EIO indicate the backend owns the swapped out content,
  but the content was not successfully loaded into the page for some
  reason. The folio will not be marked up-to-date, which will
  eventually cause the process requesting the page to SIGBUS (see the
  handling of not-up-to-date folios in do_swap_page() in mm/memory.c).

zswap decompression failures on the zswap load path are treated as
-EIO errors, as described above, and will no longer crash the kernel.

As a side effect, we require one extra zswap tree traversal in the load
and writeback paths.
Quick benchmarking on a kernel build test shows no performance
difference:

With the new scheme:
real: mean: 125.1s, stdev: 0.12s
user: mean: 3265.23s, stdev: 9.62s
sys: mean: 2156.41s, stdev: 13.98s

The old scheme:
real: mean: 125.78s, stdev: 0.45s
user: mean: 3287.18s, stdev: 5.95s
sys: mean: 2177.08s, stdev: 26.52s

[1]: https://lore.kernel.org/all/ZsiLElTykamcYZ6J@casper.infradead.org/

Suggested-by: Matthew Wilcox
Suggested-by: Yosry Ahmed
Suggested-by: Johannes Weiner
Signed-off-by: Nhat Pham
---
 include/linux/zswap.h |   6 +++---
 mm/page_io.c          |  35 ++++++++----
 mm/zswap.c            | 130 ++++++++++++++++++++++++++++++------------
 3 files changed, 121 insertions(+), 50 deletions(-)

base-commit: 598d34afeca6bb10554846cf157a3ded8729516c

diff --git a/include/linux/zswap.h b/include/linux/zswap.h
index d961ead91bf1..9468cb3e0878 100644
--- a/include/linux/zswap.h
+++ b/include/linux/zswap.h
@@ -26,7 +26,7 @@ struct zswap_lruvec_state {
 
 unsigned long zswap_total_pages(void);
 bool zswap_store(struct folio *folio);
-bool zswap_load(struct folio *folio);
+int zswap_load(struct folio *folio);
 void zswap_invalidate(swp_entry_t swp);
 int zswap_swapon(int type, unsigned long nr_pages);
 void zswap_swapoff(int type);
@@ -46,7 +46,7 @@ static inline bool zswap_store(struct folio *folio)
 
-static inline bool zswap_load(struct folio *folio)
+static inline int zswap_load(struct folio *folio)
 {
-	return false;
+	return -ENOENT;
 }
 
 static inline void zswap_invalidate(swp_entry_t swp) {}

diff --git a/mm/page_io.c b/mm/page_io.c
index 9b983de351f9..8a44faec3f92 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -511,7 +511,21 @@ static void sio_read_complete(struct kiocb *iocb, long ret)
 	mempool_free(sio, sio_pool);
 }
 
-static bool swap_read_folio_zeromap(struct folio *folio)
+/*
+ * Return: one of the following error codes:
+ *
+ * 0: the folio is zero-filled (and was populated as such and marked
+ * up-to-date and unlocked).
+ *
+ * -ENOENT: the folio was not zero-filled.
+ *
+ * -EINVAL: some of the subpages in the folio are zero-filled, but not all of
+ * them. This is an error because we don't currently support a large folio
+ * that is partially in the zeromap. The folio is unlocked, but NOT marked
+ * up-to-date, so that an IO error is emitted (e.g. do_swap_page() will
+ * sigbus).
+ */
+static int swap_read_folio_zeromap(struct folio *folio)
 {
 	int nr_pages = folio_nr_pages(folio);
 	struct obj_cgroup *objcg;
@@ -523,11 +537,13 @@ static bool swap_read_folio_zeromap(struct folio *folio)
 	 * that an IO error is emitted (e.g. do_swap_page() will sigbus).
 	 */
 	if (WARN_ON_ONCE(swap_zeromap_batch(folio->swap, nr_pages,
-					    &is_zeromap) != nr_pages))
-		return true;
+					    &is_zeromap) != nr_pages)) {
+		folio_unlock(folio);
+		return -EINVAL;
+	}
 
 	if (!is_zeromap)
-		return false;
+		return -ENOENT;
 
 	objcg = get_obj_cgroup_from_folio(folio);
 	count_vm_events(SWPIN_ZERO, nr_pages);
@@ -538,7 +554,8 @@ static bool swap_read_folio_zeromap(struct folio *folio)
 
 	folio_zero_range(folio, 0, folio_size(folio));
 	folio_mark_uptodate(folio);
-	return true;
+	folio_unlock(folio);
+	return 0;
 }
 
 static void swap_read_folio_fs(struct folio *folio, struct swap_iocb **plug)
@@ -635,13 +652,11 @@ void swap_read_folio(struct folio *folio, struct swap_iocb **plug)
 	}
 
 	delayacct_swapin_start();
 
-	if (swap_read_folio_zeromap(folio)) {
-		folio_unlock(folio);
+	if (swap_read_folio_zeromap(folio) != -ENOENT)
 		goto finish;
-	} else if (zswap_load(folio)) {
-		folio_unlock(folio);
+
+	if (zswap_load(folio) != -ENOENT)
 		goto finish;
-	}
 
 	/* We have to read from slower devices. Increase zswap protection.
 	 */
 	zswap_folio_swapin(folio);

diff --git a/mm/zswap.c b/mm/zswap.c
index 6dbf31bd2218..b67481defc6c 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -62,6 +62,8 @@ static u64 zswap_reject_reclaim_fail;
 static u64 zswap_reject_compress_fail;
 /* Compressed page was too big for the allocator to (optimally) store */
 static u64 zswap_reject_compress_poor;
+/* Load or writeback failed due to decompression failure */
+static u64 zswap_decompress_fail;
 /* Store failed because underlying allocator could not get memory */
 static u64 zswap_reject_alloc_fail;
 /* Store failed because the entry metadata could not be allocated (rare) */
@@ -996,11 +998,12 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 	return comp_ret == 0 && alloc_ret == 0;
 }
 
-static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
+static bool zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 {
 	struct zpool *zpool = entry->pool->zpool;
 	struct scatterlist input, output;
 	struct crypto_acomp_ctx *acomp_ctx;
+	int decomp_ret, dlen;
 	u8 *src;
 
 	acomp_ctx = acomp_ctx_get_cpu_lock(entry->pool);
@@ -1025,12 +1028,31 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 	sg_init_table(&output, 1);
 	sg_set_folio(&output, folio, PAGE_SIZE, 0);
 	acomp_request_set_params(acomp_ctx->req, &input, &output, entry->length, PAGE_SIZE);
-	BUG_ON(crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait));
-	BUG_ON(acomp_ctx->req->dlen != PAGE_SIZE);
+	decomp_ret = crypto_wait_req(crypto_acomp_decompress(acomp_ctx->req), &acomp_ctx->wait);
+	dlen = acomp_ctx->req->dlen;
 
 	if (src != acomp_ctx->buffer)
 		zpool_unmap_handle(zpool, entry->handle);
 	acomp_ctx_put_unlock(acomp_ctx);
+
+	if (decomp_ret || dlen != PAGE_SIZE) {
+		zswap_decompress_fail++;
+		pr_alert_ratelimited(
+			"decompression failed with returned value %d on zswap entry with "
+			"swap entry value %08lx, swap type %d, and swap offset %lu. "
+			"compression algorithm is %s. compressed size is %u bytes, and "
+			"decompressed size is %u bytes.\n",
+			decomp_ret,
+			entry->swpentry.val,
+			swp_type(entry->swpentry),
+			swp_offset(entry->swpentry),
+			entry->pool->tfm_name,
+			entry->length,
+			dlen);
+		return false;
+	}
+	return true;
 }
 
 /*********************************
@@ -1060,6 +1082,7 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
 	struct writeback_control wbc = {
 		.sync_mode = WB_SYNC_NONE,
 	};
+	int ret = 0;
 
 	/* try to allocate swap cache folio */
 	si = get_swap_device(swpentry);
@@ -1081,8 +1104,8 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
 	 * and freed when invalidated by the concurrent shrinker anyway.
 	 */
 	if (!folio_was_allocated) {
-		folio_put(folio);
-		return -EEXIST;
+		ret = -EEXIST;
+		goto out;
 	}
 
 	/*
@@ -1095,14 +1118,17 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
 	 * be dereferenced.
 	 */
 	tree = swap_zswap_tree(swpentry);
-	if (entry != xa_cmpxchg(tree, offset, entry, NULL, GFP_KERNEL)) {
-		delete_from_swap_cache(folio);
-		folio_unlock(folio);
-		folio_put(folio);
-		return -ENOMEM;
+	if (entry != xa_load(tree, offset)) {
+		ret = -ENOMEM;
+		goto out;
 	}
 
-	zswap_decompress(entry, folio);
+	if (!zswap_decompress(entry, folio)) {
+		ret = -EIO;
+		goto out;
+	}
+
+	xa_erase(tree, offset);
 
 	count_vm_event(ZSWPWB);
 	if (entry->objcg)
@@ -1118,9 +1144,14 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
 
 	/* start writeback */
 	__swap_writepage(folio, &wbc);
-	folio_put(folio);
 
-	return 0;
+out:
+	if (ret && ret != -EEXIST) {
+		delete_from_swap_cache(folio);
+		folio_unlock(folio);
+	}
+	folio_put(folio);
+	return ret;
 }
 
 /*********************************
@@ -1620,7 +1651,29 @@ bool zswap_store(struct folio *folio)
 	return ret;
 }
 
-bool zswap_load(struct folio *folio)
+/**
+ * zswap_load() - load a page from zswap
+ * @folio: folio to load
+ *
+ * Return: one of the following error codes:
+ *
+ * 0: if the swapped out content was in zswap and was successfully loaded.
+ * The folio is unlocked and marked up-to-date.
+ *
+ * -EIO: if the swapped out content was in zswap, but could not be loaded
+ * into the page (e.g. because of memory corruption or a decompression
+ * bug). The folio is unlocked, but NOT marked up-to-date, so that an IO
+ * error is emitted (e.g. do_swap_page() will SIGBUS).
+ *
+ * -EINVAL: if the swapped out content was in zswap, but the page belongs
+ * to a large folio, which is not supported by zswap. The folio is unlocked,
+ * but NOT marked up-to-date, so that an IO error is emitted (e.g.
+ * do_swap_page() will SIGBUS).
+ *
+ * -ENOENT: if the swapped out content was not in zswap. The folio remains
+ * locked on return.
+ */
+int zswap_load(struct folio *folio)
 {
 	swp_entry_t swp = folio->swap;
 	pgoff_t offset = swp_offset(swp);
@@ -1631,18 +1684,32 @@ bool zswap_load(struct folio *folio)
 	VM_WARN_ON_ONCE(!folio_test_locked(folio));
 
 	if (zswap_never_enabled())
-		return false;
+		return -ENOENT;
 
 	/*
 	 * Large folios should not be swapped in while zswap is being used, as
 	 * they are not properly handled. Zswap does not properly load large
 	 * folios, and a large folio may only be partially in zswap.
-	 *
-	 * Return true without marking the folio uptodate so that an IO error is
-	 * emitted (e.g. do_swap_page() will sigbus).
 	 */
-	if (WARN_ON_ONCE(folio_test_large(folio)))
-		return true;
+	if (WARN_ON_ONCE(folio_test_large(folio))) {
+		folio_unlock(folio);
+		return -EINVAL;
+	}
+
+	entry = xa_load(tree, offset);
+	if (!entry)
+		return -ENOENT;
+
+	if (!zswap_decompress(entry, folio)) {
+		folio_unlock(folio);
+		return -EIO;
+	}
+
+	folio_mark_uptodate(folio);
+
+	count_vm_event(ZSWPIN);
+	if (entry->objcg)
+		count_objcg_events(entry->objcg, ZSWPIN, 1);
 
 	/*
 	 * When reading into the swapcache, invalidate our entry. The
@@ -1656,27 +1723,14 @@ bool zswap_load(struct folio *folio)
 	 * files, which reads into a private page and may free it if
 	 * the fault fails. We remain the primary owner of the entry.)
 	 */
-	if (swapcache)
-		entry = xa_erase(tree, offset);
-	else
-		entry = xa_load(tree, offset);
-
-	if (!entry)
-		return false;
-
-	zswap_decompress(entry, folio);
-
-	count_vm_event(ZSWPIN);
-	if (entry->objcg)
-		count_objcg_events(entry->objcg, ZSWPIN, 1);
-
 	if (swapcache) {
-		zswap_entry_free(entry);
 		folio_mark_dirty(folio);
+		xa_erase(tree, offset);
+		zswap_entry_free(entry);
 	}
 
-	folio_mark_uptodate(folio);
-	return true;
+	folio_unlock(folio);
+	return 0;
 }
 
 void zswap_invalidate(swp_entry_t swp)
@@ -1771,6 +1825,8 @@ static int zswap_debugfs_init(void)
 			   zswap_debugfs_root, &zswap_reject_compress_fail);
 	debugfs_create_u64("reject_compress_poor", 0444,
 			   zswap_debugfs_root, &zswap_reject_compress_poor);
+	debugfs_create_u64("decompress_fail", 0444,
+			   zswap_debugfs_root, &zswap_decompress_fail);
 	debugfs_create_u64("written_back_pages", 0444,
 			   zswap_debugfs_root, &zswap_written_back_pages);
 	debugfs_create_file("pool_total_size", 0444,
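As a usage note (not part of the patch): on a kernel carrying this change,
with CONFIG_ZSWAP enabled and debugfs mounted at its conventional location,
the new failure counter can be read directly. The fallback below is only so
the command degrades gracefully on machines without zswap or debugfs:

```shell
# Read the decompression-failure counter added by this patch; prints
# "unavailable" when zswap/debugfs is not present on this system.
cat /sys/kernel/debug/zswap/decompress_fail 2>/dev/null || echo unavailable
```

A steadily increasing value here points at compressed-data corruption or a
compressor bug, which previously would have manifested as a kernel crash.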