From patchwork Wed Jan 18 00:52:09 2023
X-Patchwork-Submitter: Sergey Senozhatsky
X-Patchwork-Id: 13105310
From: Sergey Senozhatsky
To: Andrew Morton, Minchan Kim
Cc: Mike Kravetz, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Sergey Senozhatsky
Subject: [PATCHv3 3/4] zsmalloc: make zspage chain size configurable
Date: Wed, 18 Jan 2023 09:52:09 +0900
Message-Id: <20230118005210.2814763-4-senozhatsky@chromium.org>
In-Reply-To: <20230118005210.2814763-1-senozhatsky@chromium.org>
References: <20230118005210.2814763-1-senozhatsky@chromium.org>

Remove hard coded limit on the maximum number of physical pages
per-zspage.

This will allow tuning of zsmalloc pool as zspage chain size changes
`pages per-zspage` and `objects per-zspage` characteristics of size
classes which also affects size classes clustering (the way size
classes are merged).

Signed-off-by: Sergey Senozhatsky
Acked-by: Minchan Kim
---
 Documentation/mm/zsmalloc.rst | 168 ++++++++++++++++++++++++++++++++++
 mm/Kconfig                    |  19 ++++
 mm/zsmalloc.c                 |  12 +--
 3 files changed, 191 insertions(+), 8 deletions(-)

diff --git a/Documentation/mm/zsmalloc.rst b/Documentation/mm/zsmalloc.rst
index 6e79893d6132..40323c9b39d8 100644
--- a/Documentation/mm/zsmalloc.rst
+++ b/Documentation/mm/zsmalloc.rst
@@ -80,3 +80,171 @@ Similarly, we assign zspage to:
 * ZS_ALMOST_FULL  when n > N / f
 * ZS_EMPTY        when n == 0
 * ZS_FULL         when n == N
+
+
+Internals
+=========
+
+zsmalloc has 255 size classes, each of which can hold a number of zspages.
+Each zspage can contain up to ZSMALLOC_CHAIN_SIZE physical (0-order) pages.
+The optimal zspage chain size for each size class is calculated during the
+creation of the zsmalloc pool (see calculate_zspage_chain_size()).
+
+As an optimization, zsmalloc merges size classes that have similar
+characteristics in terms of the number of pages per zspage and the number
+of objects that each zspage can store.
+
+For instance, consider the following size classes::
+
+   class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
+    ...
+      94  1536           0            0             0          0          0                3        0
+     100  1632           0            0             0          0          0                2        0
+    ...
+
+
+Size classes #95-99 are merged with size class #100. This means that when we
+need to store an object of size, say, 1568 bytes, we end up using size class
+#100 instead of size class #96. Size class #100 is meant for objects of size
+1632 bytes, so each object of size 1568 bytes wastes 1632-1568=64 bytes.
+
+Size class #100 consists of zspages with 2 physical pages each, which can
+hold a total of 5 objects. If we need to store 13 objects of size 1568, we
+end up allocating three zspages, or 6 physical pages.
+
+However, if we take a closer look at size class #96 (which is meant for
+objects of size 1568 bytes) and trace `calculate_zspage_chain_size()`, we
+find that the optimal zspage configuration for this class is a chain
+of 5 physical pages::
+
+    pages per zspage      wasted bytes     used%
+           1                  960           76
+           2                  352           95
+           3                 1312           89
+           4                  704           95
+           5                   96           99
+
+This means that a class #96 configuration with 5 physical pages can store 13
+objects of size 1568 in a single zspage, using a total of 5 physical pages.
+This is more efficient than the class #100 configuration, which would use 6
+physical pages to store the same number of objects.
+
+As the zspage chain size for class #96 increases, its key characteristics
+such as pages per-zspage and objects per-zspage also change. This leads to
+fewer class mergers, resulting in a more compact grouping of classes, which
+reduces memory wastage.
+
+Let's take a closer look at the bottom of `/sys/kernel/debug/zsmalloc/zramX/classes`::
+
+   class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
+    ...
+     202  3264           0            0             0          0          0                4        0
+     254  4096           0            0             0          0          0                1        0
+    ...
+
+Size class #202 stores objects of size 3264 bytes and has a maximum of 4 pages
+per zspage. Any object larger than 3264 bytes is considered huge and belongs
+to size class #254, which stores each object in its own physical page (objects
+in huge classes do not share pages).
+
+Increasing the size of the chain of zspages also results in a higher watermark
+for the huge size class and fewer huge classes overall. This allows for more
+efficient storage of large objects.
+
+For zspage chain size of 8, huge class watermark becomes 3632 bytes::
+
+   class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
+    ...
+     202  3264           0            0             0          0          0                4        0
+     211  3408           0            0             0          0          0                5        0
+     217  3504           0            0             0          0          0                6        0
+     222  3584           0            0             0          0          0                7        0
+     225  3632           0            0             0          0          0                8        0
+     254  4096           0            0             0          0          0                1        0
+    ...
+
+For zspage chain size of 16, huge class watermark becomes 3840 bytes::
+
+   class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
+    ...
+     202  3264           0            0             0          0          0                4        0
+     206  3328           0            0             0          0          0               13        0
+     207  3344           0            0             0          0          0                9        0
+     208  3360           0            0             0          0          0               14        0
+     211  3408           0            0             0          0          0                5        0
+     212  3424           0            0             0          0          0               16        0
+     214  3456           0            0             0          0          0               11        0
+     217  3504           0            0             0          0          0                6        0
+     219  3536           0            0             0          0          0               13        0
+     222  3584           0            0             0          0          0                7        0
+     223  3600           0            0             0          0          0               15        0
+     225  3632           0            0             0          0          0                8        0
+     228  3680           0            0             0          0          0                9        0
+     230  3712           0            0             0          0          0               10        0
+     232  3744           0            0             0          0          0               11        0
+     234  3776           0            0             0          0          0               12        0
+     235  3792           0            0             0          0          0               13        0
+     236  3808           0            0             0          0          0               14        0
+     238  3840           0            0             0          0          0               15        0
+     254  4096           0            0             0          0          0                1        0
+    ...
+
+Overall the combined zspage chain size effect on zsmalloc pool configuration::
+
+  pages per zspage   number of size classes (clusters)   huge size class watermark
+          4                        69                               3264
+          5                        86                               3408
+          6                        93                               3504
+          7                       112                               3584
+          8                       123                               3632
+          9                       140                               3680
+         10                       143                               3712
+         11                       159                               3744
+         12                       164                               3776
+         13                       180                               3792
+         14                       183                               3808
+         15                       188                               3840
+         16                       191                               3840
+
+
+A synthetic test
+----------------
+
+zram as a build artifacts storage (Linux kernel compilation).
+
+* `CONFIG_ZSMALLOC_CHAIN_SIZE=4`
+
+  zsmalloc classes stats::
+
+     class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
+      ...
+     Total                13           51        413836     412973     159955                         3
+
+  zram mm_stat::
+
+     1691783168 628083717 655175680 0 655175680 60 0 34048 34049
+
+
+* `CONFIG_ZSMALLOC_CHAIN_SIZE=8`
+
+  zsmalloc classes stats::
+
+     class  size almost_full almost_empty obj_allocated   obj_used pages_used pages_per_zspage freeable
+      ...
+     Total                18           87        414852     412978     156666                         0
+
+  zram mm_stat::
+
+     1691803648 627793930 641703936 0 641703936 60 0 33591 33591
+
+Using larger zspage chains may result in using fewer physical pages, as seen
+in the example where the number of physical pages used decreased from 159955
+to 156666, at the same time maximum zsmalloc pool memory usage went down from
+655175680 to 641703936 bytes.
+
+However, this advantage may be offset by the potential for increased system
+memory pressure (as some zspages have larger chain sizes) in cases where there
+is heavy internal fragmentation and zspool compaction is unable to relocate
+objects and release zspages. In these cases, it is recommended to decrease
+the limit on the size of the zspage chains (as specified by the
+CONFIG_ZSMALLOC_CHAIN_SIZE option).
diff --git a/mm/Kconfig b/mm/Kconfig
index 4eb4afa53e6d..1cfc0ec4e35e 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -191,6 +191,25 @@ config ZSMALLOC_STAT
 	  information to userspace via debugfs.
 	  If unsure, say N.
 
+config ZSMALLOC_CHAIN_SIZE
+	int "Maximum number of physical pages per-zspage"
+	default 4
+	range 4 16
+	depends on ZSMALLOC
+	help
+	  This option sets the upper limit on the number of physical pages
+	  that a zsmalloc page (zspage) can consist of. The optimal zspage
+	  chain size is calculated for each size class during the
+	  initialization of the pool.
+
+	  Changing this option can alter the characteristics of size classes,
+	  such as the number of pages per zspage and the number of objects
+	  per zspage. This can also result in different configurations of
+	  the pool, as zsmalloc merges size classes with similar
+	  characteristics.
+
+	  For more information, see zsmalloc documentation.
+
 menu "SLAB allocator options"
 
 choice
diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index ee8431784998..1a7f68c46ccd 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -73,13 +73,6 @@
  */
 #define ZS_ALIGN 8
 
-/*
- * A single 'zspage' is composed of up to 2^N discontiguous 0-order (single)
- * pages. ZS_MAX_ZSPAGE_ORDER defines upper limit on N.
- */
-#define ZS_MAX_ZSPAGE_ORDER 2
-#define ZS_MAX_PAGES_PER_ZSPAGE (_AC(1, UL) << ZS_MAX_ZSPAGE_ORDER)
-
 #define ZS_HANDLE_SIZE (sizeof(unsigned long))
 
 /*
@@ -120,10 +113,13 @@
 #define HUGE_BITS 1
 #define FULLNESS_BITS 2
 #define CLASS_BITS 8
-#define ISOLATED_BITS 3
+#define ISOLATED_BITS 5
 #define MAGIC_VAL_BITS 8
 
 #define MAX(a, b) ((a) >= (b) ? (a) : (b))
+
+#define ZS_MAX_PAGES_PER_ZSPAGE (_AC(CONFIG_ZSMALLOC_CHAIN_SIZE, UL))
+
 /* ZS_MIN_ALLOC_SIZE must be multiple of ZS_ALIGN */
 #define ZS_MIN_ALLOC_SIZE \
 	MAX(32, (ZS_MAX_PAGES_PER_ZSPAGE << PAGE_SHIFT >> OBJ_INDEX_BITS))
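
The "wasted bytes" table for size class #96 in the documentation hunk can be reproduced with a small userspace sketch. It assumes 4096-byte pages and that the chain size is chosen by minimizing `(pages * PAGE_SIZE) % class_size`. That rule is consistent with the table but is not shown in this patch. The names `SKETCH_PAGE_SIZE`, `wasted_bytes()`, `objs_per_zspage()` and `best_chain_size()` are hypothetical and are not kernel symbols.

```c
#include <assert.h>
#include <limits.h>

/* Assumption: 4 KiB pages, matching the 4096-byte huge class in the tables. */
#define SKETCH_PAGE_SIZE 4096

/* Bytes left unused at the tail of a zspage built from `pages` pages. */
int wasted_bytes(int pages, int class_size)
{
	assert(pages >= 1 && class_size >= 1);
	return (pages * SKETCH_PAGE_SIZE) % class_size;
}

/* Number of class_size objects that fit in a zspage of `pages` pages. */
int objs_per_zspage(int pages, int class_size)
{
	assert(pages >= 1 && class_size >= 1);
	return (pages * SKETCH_PAGE_SIZE) / class_size;
}

/*
 * Pick the smallest chain length in 1..max_pages that minimizes the
 * wasted bytes. Hypothetical stand-in for calculate_zspage_chain_size(),
 * not the kernel implementation.
 */
int best_chain_size(int class_size, int max_pages)
{
	int best = 1, min_waste = INT_MAX;

	for (int pages = 1; pages <= max_pages; pages++) {
		int waste = wasted_bytes(pages, class_size);

		/* Strict comparison keeps the shortest chain on ties. */
		if (waste < min_waste) {
			min_waste = waste;
			best = pages;
		}
	}
	return best;
}
```

Under these assumptions, a limit of 4 pages gives the 1568-byte class a 2-page chain holding 5 objects, the same shape as the 1632-byte class, which is why the two can merge. Raising the limit to 5 selects a 5-page chain holding 13 objects with 96 bytes wasted.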