From patchwork Sat Nov 2 10:12:40 2024
X-Patchwork-Submitter: Barry Song <21cnbao@gmail.com>
X-Patchwork-Id: 13860071
From: Barry Song <21cnbao@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Barry Song, Usama Arif, Chengming Zhou,
 Yosry Ahmed, Nhat Pham, Johannes Weiner, David Hildenbrand, Hugh Dickins,
 Matthew Wilcox, Shakeel Butt, Andi Kleen, Baolin Wang, Chris Li,
 "Huang, Ying", Kairui Song, Ryan Roberts
Subject: [PATCH v2] mm: count zeromap read and set for swapout and swapin
Date: Sat, 2 Nov 2024 23:12:40 +1300
Message-Id: <20241102101240.35072-1-21cnbao@gmail.com>

From: Barry Song

When the
proportion of folios from the zero map is small, missing their accounting
may not significantly impact profiling. However, it’s easy to construct a
scenario where this becomes an issue: for example, allocating 1 GB of
memory, writing zeros from userspace, followed by MADV_PAGEOUT, and then
swapping it back in. In this case, the swap-out and swap-in counts seem to
vanish into a black hole, potentially causing semantic ambiguity.

We have two ways to address this:

1. Add a separate counter specifically for the zero map.
2. Continue using the current accounting, treating the zero map like a
   normal backend. (This aligns with the current behavior of zRAM when
   supporting same-page fills at the device level.)

This patch adopts option 1 because the pswpin/pswpout counters only apply
to I/O done directly to the backend device (as noted by Nhat Pham).

We can find these counters in /proc/vmstat (counters for the whole
system) and in memcg's memory.stat (counters for the memcg of interest).

For example:

$ grep -E 'swpin_zero|swpout_zero' /proc/vmstat
swpin_zero 1648
swpout_zero 33536

$ grep -E 'swpin_zero|swpout_zero' /sys/fs/cgroup/system.slice/memory.stat
swpin_zero 3905
swpout_zero 3985

Fixes: 0ca0c24e3211 ("mm: store zero pages to be swapped out in a bitmap")
Cc: Usama Arif
Cc: Chengming Zhou
Cc: Yosry Ahmed
Cc: Nhat Pham
Cc: Johannes Weiner
Cc: David Hildenbrand
Cc: Hugh Dickins
Cc: Matthew Wilcox (Oracle)
Cc: Shakeel Butt
Cc: Andi Kleen
Cc: Baolin Wang
Cc: Chris Li
Cc: "Huang, Ying"
Cc: Kairui Song
Cc: Ryan Roberts
Signed-off-by: Barry Song
Reviewed-by: Nhat Pham
---
-v2:
 * add separate counters rather than using pswpin/out; thanks for the
   comments from Usama, David, Yosry and Nhat;
 * Usama also suggested a new counter like swapped_zero; I prefer that one
   be separated as an enhancement patch, not a hotfix. Will probably
   handle it later on.

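A rough sketch of how the two counters could be compared before and after a workload such as the MADV_PAGEOUT scenario described above. It assumes a POSIX shell with awk. The helper names read_zero_counters and zero_counter_delta are hypothetical and not part of this patch. On a kernel without the patch the counters are absent and are reported as 0.

```shell
#!/bin/sh
# Hypothetical helpers, not part of the patch.

# Print "<swpin_zero> <swpout_zero>" from a vmstat-style "name value" file.
# Defaults to /proc/vmstat; a memcg memory.stat path works the same way.
# Counters missing from the file are printed as 0.
read_zero_counters() {
    awk '$1 == "swpin_zero"  { in_z = $2 }
         $1 == "swpout_zero" { out_z = $2 }
         END { print in_z + 0, out_z + 0 }' "${1:-/proc/vmstat}"
}

# Print how much each counter grew between two snapshots.
# $1 = "in out" taken before the workload, $2 = "in out" taken after.
zero_counter_delta() {
    # Split both snapshots into four positional parameters.
    set -- $1 $2
    echo "swpin_zero +$(( $3 - $1 )) swpout_zero +$(( $4 - $2 ))"
}
```

To use it, take one snapshot with before=$(read_zero_counters), run the workload, take a second snapshot into after, and call zero_counter_delta "$before" "$after". Passing a memory.stat path to read_zero_counters gives the per-memcg view instead of the system-wide one.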
 Documentation/admin-guide/cgroup-v2.rst | 10 ++++++++++
 include/linux/vm_event_item.h           |  2 ++
 mm/memcontrol.c                         |  4 ++++
 mm/page_io.c                            | 16 ++++++++++++++++
 mm/vmstat.c                             |  2 ++
 5 files changed, 34 insertions(+)

diff --git a/Documentation/admin-guide/cgroup-v2.rst b/Documentation/admin-guide/cgroup-v2.rst
index db3799f1483e..984eb3c9d05b 100644
--- a/Documentation/admin-guide/cgroup-v2.rst
+++ b/Documentation/admin-guide/cgroup-v2.rst
@@ -1599,6 +1599,16 @@ The following nested keys are defined.
 	  pglazyfreed (npn)
 		Amount of reclaimed lazyfree pages
 
+	  swpin_zero
+		Number of pages moved into memory with zero content, meaning no
+		copy exists in the backend swapfile, allowing swap-in to avoid
+		I/O read overhead.
+
+	  swpout_zero
+		Number of pages moved out of memory with zero content, meaning no
+		copy is needed in the backend swapfile, allowing swap-out to avoid
+		I/O write overhead.
+
 	  zswpin
 		Number of pages moved in to memory from zswap.
 
diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index aed952d04132..f70d0958095c 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -134,6 +134,8 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 #ifdef CONFIG_SWAP
 		SWAP_RA,
 		SWAP_RA_HIT,
+		SWPIN_ZERO,
+		SWPOUT_ZERO,
 #ifdef CONFIG_KSM
 		KSM_SWPIN_COPY,
 #endif
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5e44d6e7591e..7b3503d12aaf 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -441,6 +441,10 @@ static const unsigned int memcg_vm_event_stat[] = {
 	PGDEACTIVATE,
 	PGLAZYFREE,
 	PGLAZYFREED,
+#ifdef CONFIG_SWAP
+	SWPIN_ZERO,
+	SWPOUT_ZERO,
+#endif
 #ifdef CONFIG_ZSWAP
 	ZSWPIN,
 	ZSWPOUT,
diff --git a/mm/page_io.c b/mm/page_io.c
index 5d9b6e6cf96c..4b4ea8e49cf6 100644
--- a/mm/page_io.c
+++ b/mm/page_io.c
@@ -204,7 +204,9 @@ static bool is_folio_zero_filled(struct folio *folio)
 
 static void swap_zeromap_folio_set(struct folio *folio)
 {
+	struct obj_cgroup *objcg = get_obj_cgroup_from_folio(folio);
 	struct swap_info_struct *sis = swp_swap_info(folio->swap);
+	int nr_pages = folio_nr_pages(folio);
 	swp_entry_t entry;
 	unsigned int i;
 
@@ -212,6 +214,12 @@ static void swap_zeromap_folio_set(struct folio *folio)
 		entry = page_swap_entry(folio_page(folio, i));
 		set_bit(swp_offset(entry), sis->zeromap);
 	}
+
+	count_vm_events(SWPOUT_ZERO, nr_pages);
+	if (objcg) {
+		count_objcg_events(objcg, SWPOUT_ZERO, nr_pages);
+		obj_cgroup_put(objcg);
+	}
 }
 
 static void swap_zeromap_folio_clear(struct folio *folio)
@@ -507,6 +515,7 @@ static void sio_read_complete(struct kiocb *iocb, long ret)
 static bool swap_read_folio_zeromap(struct folio *folio)
 {
 	int nr_pages = folio_nr_pages(folio);
+	struct obj_cgroup *objcg;
 	bool is_zeromap;
 
 	/*
@@ -521,6 +530,13 @@ static bool swap_read_folio_zeromap(struct folio *folio)
 	if (!is_zeromap)
 		return false;
 
+	objcg = get_obj_cgroup_from_folio(folio);
+	count_vm_events(SWPIN_ZERO, nr_pages);
+	if (objcg) {
+		count_objcg_events(objcg, SWPIN_ZERO, nr_pages);
+		obj_cgroup_put(objcg);
+	}
+
 	folio_zero_range(folio, 0, folio_size(folio));
 	folio_mark_uptodate(folio);
 	return true;
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 22a294556b58..c8ef7352f9ed 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1418,6 +1418,8 @@ const char * const vmstat_text[] = {
 #ifdef CONFIG_SWAP
 	"swap_ra",
 	"swap_ra_hit",
+	"swpin_zero",
+	"swpout_zero",
 #ifdef CONFIG_KSM
 	"ksm_swpin_copy",
 #endif