From patchwork Fri Oct 18 10:48:38 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Usama Arif
X-Patchwork-Id: 13841572
From: Usama Arif
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, david@redhat.com, willy@infradead.org,
 kanchana.p.sridhar@intel.com, yosryahmed@google.com, nphamcs@gmail.com,
 chengming.zhou@linux.dev, ryan.roberts@arm.com, ying.huang@intel.com,
 21cnbao@gmail.com, riel@surriel.com, shakeel.butt@linux.dev,
 kernel-team@meta.com, linux-kernel@vger.kernel.org,
 linux-doc@vger.kernel.org, Usama Arif
Subject: [RFC 0/4] mm: zswap: add support for zswapin of large folios
Date: Fri, 18 Oct 2024 11:48:38 +0100
Message-ID: <20241018105026.2521366-1-usamaarif642@gmail.com>
X-Mailer: git-send-email 2.43.5
Sender: owner-linux-mm@kvack.org
Precedence: bulk

After large folio zswapout support was added in [1], this series adds
support for zswapin of large folios to bring zswap on par with zram.
It makes sure that the benefits of large folios (fewer page faults,
batched PTE and rmap manipulation, a shorter LRU list, TLB coalescing
(for arm64 and AMD)) are not lost at swap out when using zswap.
It builds on top of [2], which added large folio swapin support for
zram, and provides the same level of large folio swapin support as
zram, i.e. only supporting swap count == 1.

Patch 1 skips the swapcache when swapping in zswap pages. This should
improve no-readahead swapin performance [3], and it also allows us to
build on the large folio swapin support added in [2], hence it is a
prerequisite for patch 3.

Patch 3 adds support for large folio zswapin. This patch does not add
support for hybrid backends (i.e. folios partly present in swap and
partly in zswap).

The main performance benefit comes from maintaining large folios *after*
swapin. Large folio performance improvements have been mentioned in
previous series posted on it [2],[4], so they have not been added here.
Below is a simple microbenchmark to measure the time needed *for* zswpin
of 1G memory (along with memory integrity check).

                                | no mTHP (ms) | 1M mTHP enabled (ms)
Base kernel                     |     1165     |        1163
Kernel with mTHP zswpin series  |     1203     |         738

The time measured was pretty consistent between runs (~1-2% variation).
There is a 36% improvement in zswapin time with 1M folios. The
percentage improvement is likely to be larger if the memcmp is removed.
[1] https://lore.kernel.org/all/20241001053222.6944-1-kanchana.p.sridhar@intel.com/
[2] https://lore.kernel.org/all/20240821074541.516249-1-hanchuanhua@oppo.com/
[3] https://lore.kernel.org/all/1505886205-9671-5-git-send-email-minchan@kernel.org/T/#u
[4] https://lwn.net/Articles/955575/

Usama Arif (4):
  mm/zswap: skip swapcache for swapping in zswap pages
  mm/zswap: modify zswap_decompress to accept page instead of folio
  mm/zswap: add support for large folio zswapin
  mm/zswap: count successful large folio zswap loads

 Documentation/admin-guide/mm/transhuge.rst |   3 +
 include/linux/huge_mm.h                    |   1 +
 include/linux/zswap.h                      |   6 ++
 mm/huge_memory.c                           |   3 +
 mm/memory.c                                |  16 +--
 mm/page_io.c                               |   2 +-
 mm/zswap.c                                 | 120 ++++++++++++++-------
 7 files changed, 99 insertions(+), 52 deletions(-)

diff --git a/tools/testing/selftests/cgroup/test_zswap.c b/tools/testing/selftests/cgroup/test_zswap.c
index 40de679248b8..77068c577c86 100644
--- a/tools/testing/selftests/cgroup/test_zswap.c
+++ b/tools/testing/selftests/cgroup/test_zswap.c
@@ -9,6 +9,8 @@
 #include
 #include
 #include
+#include
+#include
 
 #include "../kselftest.h"
 #include "cgroup_util.h"
@@ -407,6 +409,74 @@ static int test_zswap_writeback_disabled(const char *root)
 	return test_zswap_writeback(root, false);
 }
 
+static int zswapin_perf(const char *cgroup, void *arg)
+{
+	long pagesize = sysconf(_SC_PAGESIZE);
+	size_t memsize = MB(1*1024);
+	char buf[pagesize];
+	int ret = -1;
+	char *mem;
+	struct timeval start, end;
+
+	mem = (char *)memalign(2*1024*1024, memsize);
+	if (!mem)
+		return ret;
+
+	/*
+	 * Fill half of each page with increasing data, and keep other
+	 * half empty, this will result in data that is still compressible
+	 * and ends up in zswap, with material zswap usage.
+	 */
+	for (int i = 0; i < pagesize; i++)
+		buf[i] = i < pagesize/2 ? (char) i : 0;
+
+	for (int i = 0; i < memsize; i += pagesize)
+		memcpy(&mem[i], buf, pagesize);
+
+	/* Try and reclaim allocated memory */
+	if (cg_write_numeric(cgroup, "memory.reclaim", memsize)) {
+		ksft_print_msg("Failed to reclaim all of the requested memory\n");
+		goto out;
+	}
+
+	gettimeofday(&start, NULL);
+	/* zswpin */
+	for (int i = 0; i < memsize; i += pagesize) {
+		if (memcmp(&mem[i], buf, pagesize)) {
+			ksft_print_msg("invalid memory\n");
+			goto out;
+		}
+	}
+	gettimeofday(&end, NULL);
+	printf ("zswapin took %fms to run.\n", (end.tv_sec - start.tv_sec)*1000 +
+		(double)(end.tv_usec - start.tv_usec) / 1000);
+	ret = 0;
+out:
+	free(mem);
+	return ret;
+}
+
+static int test_zswapin_perf(const char *root)
+{
+	int ret = KSFT_FAIL;
+	char *test_group;
+
+	test_group = cg_name(root, "zswapin_perf_test");
+	if (!test_group)
+		goto out;
+	if (cg_create(test_group))
+		goto out;
+
+	if (cg_run(test_group, zswapin_perf, NULL))
+		goto out;
+
+	ret = KSFT_PASS;
+out:
+	cg_destroy(test_group);
+	free(test_group);
+	return ret;
+}
+
 /*
  * When trying to store a memcg page in zswap, if the memcg hits its memory
  * limit in zswap, writeback should affect only the zswapped pages of that
@@ -584,6 +654,7 @@ struct zswap_test {
 	T(test_zswapin),
 	T(test_zswap_writeback_enabled),
 	T(test_zswap_writeback_disabled),
+	T(test_zswapin_perf),
 	T(test_no_kmem_bypass),
 	T(test_no_invasive_cgroup_shrink),
 };