From patchwork Sun Nov 19 19:47:21 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kairui Song
X-Patchwork-Id: 13460661
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, "Huang, Ying", David Hildenbrand, Hugh Dickins,
 Johannes Weiner, Matthew Wilcox, Michal Hocko,
 linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH 05/24] mm/swap: move readahead policy checking into swapin_readahead
Date: Mon, 20 Nov 2023 03:47:21 +0800
Message-ID: <20231119194740.94101-6-ryncsn@gmail.com>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20231119194740.94101-1-ryncsn@gmail.com>
References: <20231119194740.94101-1-ryncsn@gmail.com>
Reply-To: Kairui Song
From: Kairui Song

This makes swapin_readahead the main entry point for swapping in pages,
preparing for optimizations in later commits.
This also allows swapoff to make use of readahead checking based on the
swap entry. Swapping off a 10G ZRAM (lzo-rle) is faster:

Before:
time swapoff /dev/zram0
real    0m12.337s
user    0m0.001s
sys     0m12.329s

After:
time swapoff /dev/zram0
real    0m9.728s
user    0m0.001s
sys     0m9.719s

Furthermore, because swapoff now also uses the no-readahead swapin
helper, this fixes a bug in the no-readahead case (e.g. ZRAM): when a
process that previously swapped out some memory was moved to a new
cgroup, and the original cgroup is dead, swapping off the swap device
caused the swapped-in pages to be accounted to the process doing the
swapoff instead of the new cgroup the process was moved to.

This can be easily reproduced by:
- Setup a ramdisk (e.g. ZRAM) swap.
- Create memory cgroups A, B and C.
- Spawn process P1 in cgroup A and make it swap out some pages.
- Move process P1 to memory cgroup B.
- Destroy cgroup A.
- Do a swapoff in cgroup C.
- The swapped-in pages are accounted to cgroup C.

This patch fixes that: the swapped-in pages are accounted to cgroup B.
The same bug exists in the readahead path too; we'll fix it in later
commits.
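For illustration only, the accounting difference described above can be modeled as a tiny userspace C sketch. Everything here is a hypothetical stand-in, not kernel API: `charge_target`, the `mm_based` flag, and the cgroup names merely encode the rule that the mm-based swapin helper charges pages to the owner of the memory, while the old swapoff path charged them to the task running swapoff.

```c
#include <assert.h>
#include <string.h>

/*
 * Toy model of the swapoff accounting bug (illustrative only, not
 * kernel API). mm_based != 0 models the no-readahead swapin helper,
 * which charges to the cgroup of the process owning the memory;
 * mm_based == 0 models the old swapoff path, which effectively
 * charged pages to the task running swapoff.
 */
static const char *charge_target(int mm_based,
				 const char *mm_owner_cgroup,
				 const char *swapoff_task_cgroup)
{
	return mm_based ? mm_owner_cgroup : swapoff_task_cgroup;
}
```

With P1 moved from the dead cgroup A to B, and swapoff run from C, the old behavior charges C and the fixed behavior charges B.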
Signed-off-by: Kairui Song
---
 mm/memory.c     | 22 +++++++---------------
 mm/swap.h       |  6 ++----
 mm/swap_state.c | 33 ++++++++++++++++++++++++++-------
 mm/swapfile.c   |  2 +-
 4 files changed, 36 insertions(+), 27 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index fba4a5229163..f4237a2e3b93 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -3792,6 +3792,7 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	rmap_t rmap_flags = RMAP_NONE;
 	bool exclusive = false;
 	swp_entry_t entry;
+	bool swapcached;
 	pte_t pte;
 	vm_fault_t ret = 0;
 
@@ -3855,22 +3856,13 @@ vm_fault_t do_swap_page(struct vm_fault *vmf)
 	swapcache = folio;
 
 	if (!folio) {
-		if (data_race(si->flags & SWP_SYNCHRONOUS_IO) &&
-		    __swap_count(entry) == 1) {
-			/* skip swapcache and readahead */
-			page = swapin_no_readahead(entry, GFP_HIGHUSER_MOVABLE,
-						vmf);
-			if (page)
-				folio = page_folio(page);
+		page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
+					vmf, &swapcached);
+		if (page) {
+			folio = page_folio(page);
+			if (swapcached)
+				swapcache = folio;
 		} else {
-			page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
-						vmf);
-			if (page)
-				folio = page_folio(page);
-			swapcache = folio;
-		}
-
-		if (!folio) {
 			/*
 			 * Back out if somebody else faulted in this pte
 			 * while we released the pte lock.
diff --git a/mm/swap.h b/mm/swap.h
index ea4be4791394..f82d43d7b52a 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -55,9 +55,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t flag,
 				    struct mempolicy *mpol, pgoff_t ilx);
 struct page *swapin_readahead(swp_entry_t entry, gfp_t flag,
-			      struct vm_fault *vmf);
-struct page *swapin_no_readahead(swp_entry_t entry, gfp_t flag,
-			      struct vm_fault *vmf);
+			      struct vm_fault *vmf, bool *swapcached);
 
 static inline unsigned int folio_swap_flags(struct folio *folio)
 {
@@ -89,7 +87,7 @@ static inline struct page *swap_cluster_readahead(swp_entry_t entry,
 }
 
 static inline struct page *swapin_readahead(swp_entry_t swp, gfp_t gfp_mask,
-			struct vm_fault *vmf)
+			struct vm_fault *vmf, bool *swapcached)
 {
 	return NULL;
 }
diff --git a/mm/swap_state.c b/mm/swap_state.c
index 45dd8b7c195d..fd0047ae324e 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -316,6 +316,11 @@ void free_pages_and_swap_cache(struct encoded_page **pages, int nr)
 	release_pages(pages, nr);
 }
 
+static inline bool swap_use_no_readahead(struct swap_info_struct *si, swp_entry_t entry)
+{
+	return data_race(si->flags & SWP_SYNCHRONOUS_IO) && __swap_count(entry) == 1;
+}
+
 static inline bool swap_use_vma_readahead(void)
 {
 	return READ_ONCE(enable_vma_readahead) && !atomic_read(&nr_rotate_swap);
@@ -861,8 +866,8 @@ static struct page *swap_vma_readahead(swp_entry_t targ_entry, gfp_t gfp_mask,
  * Returns the struct page for entry and addr after the swap entry is read
  * in.
  */
-struct page *swapin_no_readahead(swp_entry_t entry, gfp_t gfp_mask,
-				 struct vm_fault *vmf)
+static struct page *swapin_no_readahead(swp_entry_t entry, gfp_t gfp_mask,
+					struct vm_fault *vmf)
 {
 	struct vm_area_struct *vma = vmf->vma;
 	struct page *page = NULL;
@@ -904,6 +909,8 @@ struct page *swapin_no_readahead(swp_entry_t entry, gfp_t gfp_mask,
  * @entry: swap entry of this memory
  * @gfp_mask: memory allocation flags
  * @vmf: fault information
+ * @swapcached: pointer to a bool used as indicator if the
+ *              page is swapped in through swapcache.
  *
  * Returns the struct page for entry and addr, after queueing swapin.
  *
@@ -912,17 +919,29 @@ struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
  * or vma-based(ie, virtual address based on faulty address) readahead.
  */
 struct page *swapin_readahead(swp_entry_t entry, gfp_t gfp_mask,
-			struct vm_fault *vmf)
+			struct vm_fault *vmf, bool *swapcached)
 {
 	struct mempolicy *mpol;
-	pgoff_t ilx;
 	struct page *page;
+	pgoff_t ilx;
+	bool cached;
 
 	mpol = get_vma_policy(vmf->vma, vmf->address, 0, &ilx);
-	page = swap_use_vma_readahead() ?
-		swap_vma_readahead(entry, gfp_mask, mpol, ilx, vmf) :
-		swap_cluster_readahead(entry, gfp_mask, mpol, ilx);
+	if (swap_use_no_readahead(swp_swap_info(entry), entry)) {
+		page = swapin_no_readahead(entry, gfp_mask, vmf);
+		cached = false;
+	} else if (swap_use_vma_readahead()) {
+		page = swap_vma_readahead(entry, gfp_mask, mpol, ilx, vmf);
+		cached = true;
+	} else {
+		page = swap_cluster_readahead(entry, gfp_mask, mpol, ilx);
+		cached = true;
+	}
 	mpol_cond_put(mpol);
+
+	if (swapcached)
+		*swapcached = cached;
+
 	return page;
 }
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 756104ebd585..0142bfc71b81 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1874,7 +1874,7 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 		};
 
 		page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE,
-					&vmf);
+					&vmf, NULL);
 		if (page)
 			folio = page_folio(page);
 	}
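The dispatch the patch moves into swapin_readahead() can be sketched as a userspace toy model. This is a sketch only: the enum, the flag constant, and `select_swapin_path` are illustrative stand-ins, not the kernel's types; the predicate mirrors swap_use_no_readahead() and the NULL-able out-parameter mirrors how unuse_pte_range() passes NULL when it does not care whether the page went through the swap cache.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the kernel's SWP_SYNCHRONOUS_IO flag bit (illustrative). */
#define SWP_SYNCHRONOUS_IO (1UL << 0)

enum swapin_path {
	SWAPIN_NO_READAHEAD,      /* skip swapcache and readahead */
	SWAPIN_VMA_READAHEAD,     /* virtual-address based readahead */
	SWAPIN_CLUSTER_READAHEAD, /* swap-offset based readahead */
};

/*
 * Mirrors swap_use_no_readahead(): a fast synchronous device whose
 * entry is mapped exactly once gains nothing from readahead.
 */
static bool use_no_readahead(unsigned long si_flags, int swap_count)
{
	return (si_flags & SWP_SYNCHRONOUS_IO) && swap_count == 1;
}

/*
 * Mirrors the if/else chain in the new swapin_readahead(): pick a
 * path, and report through *swapcached (which may be NULL) whether
 * the page would go through the swap cache.
 */
static enum swapin_path select_swapin_path(unsigned long si_flags,
					   int swap_count,
					   bool vma_readahead,
					   bool *swapcached)
{
	enum swapin_path path;
	bool cached;

	if (use_no_readahead(si_flags, swap_count)) {
		path = SWAPIN_NO_READAHEAD;
		cached = false;
	} else if (vma_readahead) {
		path = SWAPIN_VMA_READAHEAD;
		cached = true;
	} else {
		path = SWAPIN_CLUSTER_READAHEAD;
		cached = true;
	}

	if (swapcached)
		*swapcached = cached;
	return path;
}
```

Note how only the no-readahead path leaves `*swapcached` false, which is exactly why do_swap_page() can now set `swapcache` conditionally instead of branching on the policy itself.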