From patchwork Tue Jan 7 07:47:23 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yosry Ahmed
X-Patchwork-Id: 13928315
Date: Tue, 7 Jan 2025 07:47:23 +0000
X-Mailer: git-send-email 2.47.1.613.gc27f4b7a9f-goog
Message-ID: <20250107074724.1756696-1-yosryahmed@google.com>
Subject: [PATCH RESEND 1/2] Revert "mm: zswap: fix race between [de]compression and CPU hotunplug"
From: Yosry Ahmed
To: Andrew Morton
Cc: Johannes Weiner, Nhat Pham, Chengming Zhou, Vitaly Wool, Barry Song, Sam Sun, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Yosry Ahmed, syzbot
Sender: owner-linux-mm@kvack.org
Precedence: bulk

This reverts commit eaebeb93922ca6ab0dd92027b73d0112701706ef.

Commit eaebeb93922c ("mm: zswap: fix race between [de]compression and
CPU hotunplug") used the CPU hotplug lock in zswap compress/decompress
operations to protect against a race with CPU hotunplug making some
per-CPU resources go away.

However, zswap compress/decompress can be reached through reclaim while
the lock is held, resulting in a potential deadlock as reported by
syzbot:

======================================================
WARNING: possible circular locking dependency detected
6.13.0-rc6-syzkaller-00006-g5428dc1906dd #0 Not tainted
------------------------------------------------------
kswapd0/89 is trying to acquire lock:
 ffffffff8e7d2ed0 (cpu_hotplug_lock){++++}-{0:0}, at: acomp_ctx_get_cpu mm/zswap.c:886 [inline]
 ffffffff8e7d2ed0 (cpu_hotplug_lock){++++}-{0:0}, at: zswap_compress mm/zswap.c:908 [inline]
 ffffffff8e7d2ed0 (cpu_hotplug_lock){++++}-{0:0}, at: zswap_store_page mm/zswap.c:1439 [inline]
 ffffffff8e7d2ed0 (cpu_hotplug_lock){++++}-{0:0}, at: zswap_store+0xa74/0x1ba0 mm/zswap.c:1546

but task is already holding lock:
 ffffffff8ea355a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat mm/vmscan.c:6871 [inline]
 ffffffff8ea355a0 (fs_reclaim){+.+.}-{0:0}, at: kswapd+0xb58/0x2f30 mm/vmscan.c:7253

which lock already depends on the new lock.

the existing dependency chain (in reverse order) is:

-> #1 (fs_reclaim){+.+.}-{0:0}:
        lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
        __fs_reclaim_acquire mm/page_alloc.c:3853 [inline]
        fs_reclaim_acquire+0x88/0x130 mm/page_alloc.c:3867
        might_alloc include/linux/sched/mm.h:318 [inline]
        slab_pre_alloc_hook mm/slub.c:4070 [inline]
        slab_alloc_node mm/slub.c:4148 [inline]
        __kmalloc_cache_node_noprof+0x40/0x3a0 mm/slub.c:4337
        kmalloc_node_noprof include/linux/slab.h:924 [inline]
        alloc_worker kernel/workqueue.c:2638 [inline]
        create_worker+0x11b/0x720 kernel/workqueue.c:2781
        workqueue_prepare_cpu+0xe3/0x170 kernel/workqueue.c:6628
        cpuhp_invoke_callback+0x48d/0x830 kernel/cpu.c:194
        __cpuhp_invoke_callback_range kernel/cpu.c:965 [inline]
        cpuhp_invoke_callback_range kernel/cpu.c:989 [inline]
        cpuhp_up_callbacks kernel/cpu.c:1020 [inline]
        _cpu_up+0x2b3/0x580 kernel/cpu.c:1690
        cpu_up+0x184/0x230 kernel/cpu.c:1722
        cpuhp_bringup_mask+0xdf/0x260 kernel/cpu.c:1788
        cpuhp_bringup_cpus_parallel+0xf9/0x160 kernel/cpu.c:1878
        bringup_nonboot_cpus+0x2b/0x50 kernel/cpu.c:1892
        smp_init+0x34/0x150 kernel/smp.c:1009
        kernel_init_freeable+0x417/0x5d0 init/main.c:1569
        kernel_init+0x1d/0x2b0 init/main.c:1466
        ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
        ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

-> #0 (cpu_hotplug_lock){++++}-{0:0}:
        check_prev_add kernel/locking/lockdep.c:3161 [inline]
        check_prevs_add kernel/locking/lockdep.c:3280 [inline]
        validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
        __lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5226
        lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
        percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
        cpus_read_lock+0x42/0x150 kernel/cpu.c:490
        acomp_ctx_get_cpu mm/zswap.c:886 [inline]
        zswap_compress mm/zswap.c:908 [inline]
        zswap_store_page mm/zswap.c:1439 [inline]
        zswap_store+0xa74/0x1ba0 mm/zswap.c:1546
        swap_writepage+0x647/0xce0 mm/page_io.c:279
        shmem_writepage+0x1248/0x1610 mm/shmem.c:1579
        pageout mm/vmscan.c:696 [inline]
        shrink_folio_list+0x35ee/0x57e0 mm/vmscan.c:1374
        shrink_inactive_list mm/vmscan.c:1967 [inline]
        shrink_list mm/vmscan.c:2205 [inline]
        shrink_lruvec+0x16db/0x2f30 mm/vmscan.c:5734
        mem_cgroup_shrink_node+0x385/0x8e0 mm/vmscan.c:6575
        mem_cgroup_soft_reclaim mm/memcontrol-v1.c:312 [inline]
        memcg1_soft_limit_reclaim+0x346/0x810 mm/memcontrol-v1.c:362
        balance_pgdat mm/vmscan.c:6975 [inline]
        kswapd+0x17b3/0x2f30 mm/vmscan.c:7253
        kthread+0x2f0/0x390 kernel/kthread.c:389
        ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
        ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

other info that might help us debug this:

 Possible unsafe locking scenario:

       CPU0                    CPU1
       ----                    ----
  lock(fs_reclaim);
                               lock(cpu_hotplug_lock);
                               lock(fs_reclaim);
  rlock(cpu_hotplug_lock);

 *** DEADLOCK ***

1 lock held by kswapd0/89:
 #0: ffffffff8ea355a0 (fs_reclaim){+.+.}-{0:0}, at: balance_pgdat mm/vmscan.c:6871 [inline]
 #0: ffffffff8ea355a0 (fs_reclaim){+.+.}-{0:0}, at: kswapd+0xb58/0x2f30 mm/vmscan.c:7253

stack backtrace:
CPU: 0 UID: 0 PID: 89 Comm: kswapd0 Not tainted 6.13.0-rc6-syzkaller-00006-g5428dc1906dd #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 09/13/2024
Call Trace:
 __dump_stack lib/dump_stack.c:94 [inline]
 dump_stack_lvl+0x241/0x360 lib/dump_stack.c:120
 print_circular_bug+0x13a/0x1b0 kernel/locking/lockdep.c:2074
 check_noncircular+0x36a/0x4a0 kernel/locking/lockdep.c:2206
 check_prev_add kernel/locking/lockdep.c:3161 [inline]
 check_prevs_add kernel/locking/lockdep.c:3280 [inline]
 validate_chain+0x18ef/0x5920 kernel/locking/lockdep.c:3904
 __lock_acquire+0x1397/0x2100 kernel/locking/lockdep.c:5226
 lock_acquire+0x1ed/0x550 kernel/locking/lockdep.c:5849
 percpu_down_read include/linux/percpu-rwsem.h:51 [inline]
 cpus_read_lock+0x42/0x150 kernel/cpu.c:490
 acomp_ctx_get_cpu mm/zswap.c:886 [inline]
 zswap_compress mm/zswap.c:908 [inline]
 zswap_store_page mm/zswap.c:1439 [inline]
 zswap_store+0xa74/0x1ba0 mm/zswap.c:1546
 swap_writepage+0x647/0xce0 mm/page_io.c:279
 shmem_writepage+0x1248/0x1610 mm/shmem.c:1579
 pageout mm/vmscan.c:696 [inline]
 shrink_folio_list+0x35ee/0x57e0 mm/vmscan.c:1374
 shrink_inactive_list mm/vmscan.c:1967 [inline]
 shrink_list mm/vmscan.c:2205 [inline]
 shrink_lruvec+0x16db/0x2f30 mm/vmscan.c:5734
 mem_cgroup_shrink_node+0x385/0x8e0 mm/vmscan.c:6575
 mem_cgroup_soft_reclaim mm/memcontrol-v1.c:312 [inline]
 memcg1_soft_limit_reclaim+0x346/0x810 mm/memcontrol-v1.c:362
 balance_pgdat mm/vmscan.c:6975 [inline]
 kswapd+0x17b3/0x2f30 mm/vmscan.c:7253
 kthread+0x2f0/0x390 kernel/kthread.c:389
 ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
 ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

Revert the change. A different fix for the race with CPU hotunplug will
follow.

Reported-by: syzbot
Signed-off-by: Yosry Ahmed
---

This is a resend because I screwed up sending these two patches the
first time and included the patch getting reverted as well.

The patches apply on top of mm-hotfixes-unstable and are meant for
v6.13.

Andrew, I am not sure what's the best way to handle this. This fix is
already merged into Linus's tree and had CC:stable, so I thought it's
best to revert it and replace it with a separate fix that would be easy
to backport instead of the revert patch, especially that functionally
the new fix is completely different anyway.

Does the revert automatically signal the stable maintainers to drop it?
Do we need to add CC:stable to this revert as well?

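For readers following the lockdep report above, the inversion can be modeled in a few lines of userspace C. This is only an illustrative sketch, not how lockdep is implemented: the enum, `record_acquire()`, `reachable()` and `reset_deps()` are hypothetical names invented here. It records "B was taken while holding A" edges and refuses a new edge that would close a cycle, which is the situation syzbot hit once kswapd (holding fs_reclaim) called cpus_read_lock().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical lock classes for this sketch; names mirror the report. */
enum lock_class { FS_RECLAIM, CPU_HOTPLUG_LOCK, NR_LOCK_CLASSES };

/* dep[a][b] is true once some context acquired b while holding a. */
static bool dep[NR_LOCK_CLASSES][NR_LOCK_CLASSES];

/* Depth-first search: is 'to' reachable from 'from' via recorded edges? */
static bool reachable(enum lock_class from, enum lock_class to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < NR_LOCK_CLASSES; next++) {
		if (dep[from][next] && !seen[next] &&
		    reachable((enum lock_class)next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record "acquire 'want' while holding 'held'". Returns false, without
 * recording anything, if the new edge would close a dependency cycle.
 */
static bool record_acquire(enum lock_class held, enum lock_class want)
{
	bool seen[NR_LOCK_CLASSES] = { false };

	if (reachable(want, held, seen))
		return false;
	dep[held][want] = true;
	return true;
}

/* Forget all recorded dependencies. */
static void reset_deps(void)
{
	memset(dep, 0, sizeof(dep));
}
```

Calling `record_acquire(CPU_HOTPLUG_LOCK, FS_RECLAIM)` models the CPU bring-up path allocating memory under the hotplug lock (chain entry #1); a later `record_acquire(FS_RECLAIM, CPU_HOTPLUG_LOCK)` models the reverted zswap path (chain entry #0) and is rejected.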
---
 mm/zswap.c | 19 +++----------------
 1 file changed, 3 insertions(+), 16 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index 5a27af8d86ea9..f6316b66fb236 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -880,18 +880,6 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 	return 0;
 }
 
-/* Prevent CPU hotplug from freeing up the per-CPU acomp_ctx resources */
-static struct crypto_acomp_ctx *acomp_ctx_get_cpu(struct crypto_acomp_ctx __percpu *acomp_ctx)
-{
-	cpus_read_lock();
-	return raw_cpu_ptr(acomp_ctx);
-}
-
-static void acomp_ctx_put_cpu(void)
-{
-	cpus_read_unlock();
-}
-
 static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 			   struct zswap_pool *pool)
 {
@@ -905,7 +893,8 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 	gfp_t gfp;
 	u8 *dst;
 
-	acomp_ctx = acomp_ctx_get_cpu(pool->acomp_ctx);
+	acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);
+
 	mutex_lock(&acomp_ctx->mutex);
 
 	dst = acomp_ctx->buffer;
@@ -961,7 +950,6 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 		zswap_reject_alloc_fail++;
 
 	mutex_unlock(&acomp_ctx->mutex);
-	acomp_ctx_put_cpu();
 	return comp_ret == 0 && alloc_ret == 0;
 }
 
@@ -972,7 +960,7 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 	struct crypto_acomp_ctx *acomp_ctx;
 	u8 *src;
 
-	acomp_ctx = acomp_ctx_get_cpu(entry->pool->acomp_ctx);
+	acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
 	mutex_lock(&acomp_ctx->mutex);
 
 	src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
@@ -1002,7 +990,6 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 
 	if (src != acomp_ctx->buffer)
 		zpool_unmap_handle(zpool, entry->handle);
-	acomp_ctx_put_cpu();
 }
 
 /*********************************

From patchwork Tue Jan 7 07:47:24 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yosry Ahmed
X-Patchwork-Id: 13928316
Date: Tue, 7 Jan 2025 07:47:24 +0000
In-Reply-To: <20250107074724.1756696-1-yosryahmed@google.com>
References: <20250107074724.1756696-1-yosryahmed@google.com>
X-Mailer: git-send-email 2.47.1.613.gc27f4b7a9f-goog
Message-ID: <20250107074724.1756696-2-yosryahmed@google.com>
Subject: [PATCH RESEND 2/2] mm: zswap: use SRCU to synchronize with CPU hotunplug
From: Yosry Ahmed
To: Andrew Morton
Cc: Johannes Weiner, Nhat Pham, Chengming Zhou, Vitaly Wool, Barry Song, Sam Sun, linux-mm@kvack.org, linux-kernel@vger.kernel.org, Yosry Ahmed, stable@vger.kernel.org
Sender: owner-linux-mm@kvack.org
Precedence: bulk

In zswap_compress() and zswap_decompress(), the per-CPU acomp_ctx of the
current CPU at the beginning of the
operation is retrieved and used throughout. However, since neither
preemption nor migration is disabled, it is possible that the operation
continues on a different CPU.

If the original CPU is hotunplugged while the acomp_ctx is still in use,
we run into a UAF bug as the resources attached to the acomp_ctx are
freed during hotunplug in zswap_cpu_comp_dead().

The problem was introduced in commit 1ec3b5fe6eec ("mm/zswap: move to
use crypto_acomp API for hardware acceleration") when the switch to the
crypto_acomp API was made. Prior to that, the per-CPU crypto_comp was
retrieved using get_cpu_ptr() which disables preemption and makes sure
the CPU cannot go away from under us. Preemption cannot be disabled with
the crypto_acomp API as a sleepable context is needed.

Commit 8ba2f844f050 ("mm/zswap: change per-cpu mutex and buffer to
per-acomp_ctx") increased the UAF surface area by making the per-CPU
buffers dynamic, adding yet another resource that can be freed from
under zswap compression/decompression by CPU hotunplug.

There are a few ways to fix this:
(a) Add a refcount for acomp_ctx.
(b) Disable migration while using the per-CPU acomp_ctx.
(c) Use SRCU to wait for other CPUs using the acomp_ctx of the CPU being
    hotunplugged. Normal RCU cannot be used as a sleepable context is
    required.

Implement (c) since it's simpler than (a), and (b) involves using
migrate_disable() which is apparently undesired (see huge comment in
include/linux/preempt.h).

Fixes: 1ec3b5fe6eec ("mm/zswap: move to use crypto_acomp API for hardware acceleration")
Cc:
Signed-off-by: Yosry Ahmed
Reported-by: Johannes Weiner
Closes: https://lore.kernel.org/lkml/20241113213007.GB1564047@cmpxchg.org/
Reported-by: Sam Sun
Closes: https://lore.kernel.org/lkml/CAEkJfYMtSdM5HceNsXUDf5haghD5+o2e7Qv4OcuruL4tPg6OaQ@mail.gmail.com/
Signed-off-by: Yosry Ahmed
Reported-by: syzbot
Signed-off-by: Andrew Morton
---
 mm/zswap.c | 31 ++++++++++++++++++++++++++++---
 1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/mm/zswap.c b/mm/zswap.c
index f6316b66fb236..add1406d693b8 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -864,12 +864,22 @@ static int zswap_cpu_comp_prepare(unsigned int cpu, struct hlist_node *node)
 	return ret;
 }
 
+DEFINE_STATIC_SRCU(acomp_srcu);
+
 static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 {
 	struct zswap_pool *pool = hlist_entry(node, struct zswap_pool, node);
 	struct crypto_acomp_ctx *acomp_ctx = per_cpu_ptr(pool->acomp_ctx, cpu);
 
 	if (!IS_ERR_OR_NULL(acomp_ctx)) {
+		/*
+		 * Even though the acomp_ctx should not be currently in use on
+		 * @cpu, it may still be used by compress/decompress operations
+		 * that started on @cpu and migrated to a different CPU. Wait
+		 * for such usages to complete, any new usages would be a bug.
+		 */
+		synchronize_srcu(&acomp_srcu);
+
 		if (!IS_ERR_OR_NULL(acomp_ctx->req))
 			acomp_request_free(acomp_ctx->req);
 		if (!IS_ERR_OR_NULL(acomp_ctx->acomp))
@@ -880,6 +890,18 @@ static int zswap_cpu_comp_dead(unsigned int cpu, struct hlist_node *node)
 	return 0;
 }
 
+static struct crypto_acomp_ctx *acomp_ctx_get_cpu(struct crypto_acomp_ctx __percpu *acomp_ctx,
+						  int *srcu_idx)
+{
+	*srcu_idx = srcu_read_lock(&acomp_srcu);
+	return raw_cpu_ptr(acomp_ctx);
+}
+
+static void acomp_ctx_put_cpu(int srcu_idx)
+{
+	srcu_read_unlock(&acomp_srcu, srcu_idx);
+}
+
 static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 			   struct zswap_pool *pool)
 {
@@ -889,12 +911,12 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 	unsigned int dlen = PAGE_SIZE;
 	unsigned long handle;
 	struct zpool *zpool;
+	int srcu_idx;
 	char *buf;
 	gfp_t gfp;
 	u8 *dst;
 
-	acomp_ctx = raw_cpu_ptr(pool->acomp_ctx);
-
+	acomp_ctx = acomp_ctx_get_cpu(pool->acomp_ctx, &srcu_idx);
 	mutex_lock(&acomp_ctx->mutex);
 
 	dst = acomp_ctx->buffer;
@@ -950,6 +972,7 @@ static bool zswap_compress(struct page *page, struct zswap_entry *entry,
 		zswap_reject_alloc_fail++;
 
 	mutex_unlock(&acomp_ctx->mutex);
+	acomp_ctx_put_cpu(srcu_idx);
 	return comp_ret == 0 && alloc_ret == 0;
 }
 
@@ -958,9 +981,10 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 	struct zpool *zpool = entry->pool->zpool;
 	struct scatterlist input, output;
 	struct crypto_acomp_ctx *acomp_ctx;
+	int srcu_idx;
 	u8 *src;
 
-	acomp_ctx = raw_cpu_ptr(entry->pool->acomp_ctx);
+	acomp_ctx = acomp_ctx_get_cpu(entry->pool->acomp_ctx, &srcu_idx);
 	mutex_lock(&acomp_ctx->mutex);
 
 	src = zpool_map_handle(zpool, entry->handle, ZPOOL_MM_RO);
@@ -990,6 +1014,7 @@ static void zswap_decompress(struct zswap_entry *entry, struct folio *folio)
 
 	if (src != acomp_ctx->buffer)
 		zpool_unmap_handle(zpool, entry->handle);
+	acomp_ctx_put_cpu(srcu_idx);
 }
 
 /*********************************
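
The grace-period idea that the patch above relies on can be illustrated with a toy, single-threaded userspace model. This is a sketch under loud assumptions: `struct toy_srcu` and the `toy_*` helpers are hypothetical names made up here, the real SRCU uses per-CPU counters and memory barriers, and the real synchronize_srcu() sleeps until readers drain, whereas this model only lets the caller poll. It shows why a reader that took the index before hotunplug keeps the teardown waiting, while a reader that arrives later does not.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for an SRCU domain: one reader count per epoch. */
struct toy_srcu {
	int readers[2];	/* active readers that entered under each index */
	int idx;	/* index handed out to new readers */
};

/* Models srcu_read_lock(): join the current epoch, remember its index. */
static int toy_read_lock(struct toy_srcu *s)
{
	int idx = s->idx;

	s->readers[idx]++;
	return idx;
}

/* Models srcu_read_unlock(): leave the epoch this reader joined. */
static void toy_read_unlock(struct toy_srcu *s, int idx)
{
	s->readers[idx]--;
}

/*
 * Start a grace period: readers arriving from now on use the other
 * index. Returns the old index, whose readers must drain first.
 */
static int toy_start_grace_period(struct toy_srcu *s)
{
	int old = s->idx;

	s->idx = !old;
	return old;
}

/* The grace period is over once no reader of the old epoch remains. */
static bool toy_grace_period_done(const struct toy_srcu *s, int old)
{
	return s->readers[old] == 0;
}
```

In terms of the patch, `toy_read_lock()`/`toy_read_unlock()` play the role of acomp_ctx_get_cpu()/acomp_ctx_put_cpu(), and the pair `toy_start_grace_period()` plus polling `toy_grace_period_done()` stands in for the synchronize_srcu() call in zswap_cpu_comp_dead(); freeing the acomp_ctx resources is only safe once the poll returns true.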