From: Sergey Senozhatsky
To: Andrew Morton
Cc: Yosry Ahmed, Hillf Danton, Kairui Song, Sebastian Andrzej Siewior,
    Minchan Kim, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Sergey Senozhatsky
Subject: [PATCH v10 00/19] zsmalloc/zram: there be preemption
Date: Mon, 3 Mar 2025 11:03:09 +0900
Message-ID: <20250303022425.285971-1-senozhatsky@chromium.org>
X-Patchwork-Id: 13998050

Currently zram runs compression and decompression in non-preemptible
sections, e.g.

    zcomp_stream_get()     // grabs CPU local lock
    zcomp_compress()

or

    zram_slot_lock()       // grabs entry spin-lock
    zcomp_stream_get()     // grabs CPU local lock
    zs_map_object()        // grabs rwlock and CPU local lock
    zcomp_decompress()

Potentially a little troublesome for a number of reasons.

For instance, this makes it impossible to use async compression
algorithms and/or H/W compression algorithms, which can wait for OP
completion or resource availability.  This also restricts what
compression algorithms can do internally; for example, zstd can
allocate internal state memory for C/D dictionaries:

    do_fsync()
     do_writepages()
      zram_bio_write()
       zram_write_page()                      // become non-preemptible
        zcomp_compress()
         zstd_compress()
          ZSTD_compress_usingCDict()
           ZSTD_compressBegin_usingCDict_internal()
            ZSTD_resetCCtx_usingCDict()
             ZSTD_resetCCtx_internal()
              zstd_custom_alloc()             // memory allocation

Not to mention that the system can be configured to maximize
compression ratio at the cost of CPU/HW time (e.g. lz4hc or deflate
with a very high compression level), so zram can stay in a
non-preemptible section (even under a spin-lock and/or rwlock) for an
extended period of time.

Aside from compression algorithms, this also restricts what zram can
do.  One particular example is the zsmalloc handle allocation in
zram_write_page(), which has an optimistic allocation (disallowing
direct reclaim) and a pessimistic fallback path, which then forces
zram to compress the page one more time.

This series changes zram to not directly impose atomicity restrictions
on compression algorithms (and on itself), which makes zram write()
fully preemptible; zram read(), sadly, is not always preemptible yet.
There are still indirect atomicity restrictions imposed by zsmalloc().
One notable example is the object mapping API, which returns with:

a) local CPU lock held
b) zspage rwlock held

First, zsmalloc's zspage lock is converted from rwlock to a special
type of RW-lookalike lock with some extra guarantees/features.

Second, a new handle mapping is introduced which doesn't use per-CPU
buffers (and hence no local CPU lock), does fewer memcpy() calls, but
requires users to provide a pointer to a temp buffer for object
copy-in (when needed).

Third, zram is converted to the new zsmalloc mapping API and thus zram
read() becomes preemptible.

v9 -> v10
- moved to statically allocated lockdep lock classes in zram and
  zsmalloc (Sebastian)
- dropped lock_contended() because we can only call it under
  lock_acquire() and that's not the case for zram and zsmalloc trylock

Sergey Senozhatsky (19):
  zram: sleepable entry locking
  zram: permit preemption with active compression stream
  zram: remove unused crypto include
  zram: remove max_comp_streams device attr
  zram: remove second stage of handle allocation
  zram: add GFP_NOWARN to incompressible zsmalloc handle allocation
  zram: remove writestall zram_stats member
  zram: limit max recompress prio to num_active_comps
  zram: filter out recomp targets based on priority
  zram: rework recompression loop
  zram: move post-processing target allocation
  zsmalloc: rename pool lock
  zsmalloc: sleepable zspage reader-lock
  zsmalloc: introduce new object mapping API
  zram: switch to new zsmalloc object mapping API
  zram: permit reclaim in zstd custom allocator
  zram: do not leak page on recompress_store error path
  zram: do not leak page on writeback_store error path
  zram: add might_sleep to zcomp API

 Documentation/ABI/testing/sysfs-block-zram  |   8 -
 Documentation/admin-guide/blockdev/zram.rst |  36 +--
 drivers/block/zram/backend_zstd.c           |  11 +-
 drivers/block/zram/zcomp.c                  |  48 ++-
 drivers/block/zram/zcomp.h                  |   8 +-
 drivers/block/zram/zram_drv.c               | 330 +++++++++-----------
 drivers/block/zram/zram_drv.h               |  17 +-
 include/linux/zsmalloc.h                    |   8 +
 mm/zsmalloc.c                               | 329 ++++++++++++++-----
 9 files changed, 478 insertions(+), 317 deletions(-)
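
For readers who want a feel for the new handle mapping described above
(no per-CPU buffer, fewer memcpy() calls, caller-provided temp buffer
used only when needed), here is a small userspace model.  It is a
sketch under assumptions, not the zsmalloc code: toy_zspage,
toy_obj_read_begin(), toy_demo() and TOY_PAGE_SIZE are hypothetical
names invented for illustration, and the rule that a copy is needed
exactly when an object straddles two pages is assumed for this model.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 16	/* tiny "page" so offsets are easy to follow */

/* Hypothetical stand-in for a zspage: two separately allocated pages. */
struct toy_zspage {
	unsigned char *pages[2];
};

/*
 * Hypothetical read helper.  Returns a pointer to @len bytes of the
 * object stored at byte offset @off.  If the object sits entirely in
 * one page, the pointer refers directly into that page and @tmp is not
 * touched.  If it straddles the page boundary, both pieces are copied
 * into the caller-provided @tmp (at least @len bytes) and @tmp is
 * returned.
 */
static const void *toy_obj_read_begin(const struct toy_zspage *zs,
				      size_t off, size_t len, void *tmp)
{
	size_t first;

	assert(len > 0 && off + len <= 2 * TOY_PAGE_SIZE);

	if (off + len <= TOY_PAGE_SIZE)		/* fully in page 0 */
		return zs->pages[0] + off;
	if (off >= TOY_PAGE_SIZE)		/* fully in page 1 */
		return zs->pages[1] + (off - TOY_PAGE_SIZE);

	/* Spans both pages: copy-in to the caller's buffer. */
	first = TOY_PAGE_SIZE - off;
	memcpy(tmp, zs->pages[0] + off, first);
	memcpy((unsigned char *)tmp + first, zs->pages[1], len - first);
	return tmp;
}

/* Exercises all three cases; returns 0 on success. */
static int toy_demo(void)
{
	unsigned char p0[TOY_PAGE_SIZE], p1[TOY_PAGE_SIZE], tmp[8];
	struct toy_zspage zs = { { p0, p1 } };
	const unsigned char *obj;

	memset(p0, 'A', sizeof(p0));
	memset(p1, 'B', sizeof(p1));

	obj = toy_obj_read_begin(&zs, 4, 8, tmp);	/* inside page 0 */
	if (obj != p0 + 4)
		return 1;

	obj = toy_obj_read_begin(&zs, 16, 8, tmp);	/* inside page 1 */
	if (obj != p1)
		return 2;

	obj = toy_obj_read_begin(&zs, 12, 8, tmp);	/* 4 + 4 bytes */
	if (obj != tmp || memcmp(obj, "AAAABBBB", 8) != 0)
		return 3;

	return 0;
}
```

Because the temp buffer belongs to the caller rather than to a per-CPU
area, nothing in this model requires the caller to stay on one CPU
while it uses the returned pointer, which is the property the cover
letter is after.  Calling toy_demo() from a main() and checking for a
zero return is enough to try it out.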