From patchwork Sun Nov 19 19:47:39 2023
From: Kairui Song
To: linux-mm@kvack.org
Cc: Andrew Morton, "Huang, Ying", David Hildenbrand, Hugh Dickins,
 Johannes Weiner, Matthew Wilcox, Michal Hocko,
 linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH 23/24] swap: fix multiple swap leaks after cgroup migration
Date: Mon, 20 Nov 2023 03:47:39 +0800
Message-ID: <20231119194740.94101-24-ryncsn@gmail.com>
X-Mailer: git-send-email 2.42.0
In-Reply-To: <20231119194740.94101-1-ryncsn@gmail.com>
References: <20231119194740.94101-1-ryncsn@gmail.com>
Reply-To: Kairui Song
MIME-Version: 1.0
From: Kairui Song

When a process that previously swapped out some memory is moved to
another cgroup, and the cgroup it used to be in is dead, pages swapped
back in are leaked into the root cgroup. Previous commits fixed this
bug for the non-readahead path; this commit fixes the same issue for
the readahead path.

This can be easily reproduced by:
- Set up an SSD or HDD swap.
- Create memory cgroups A, B and C.
- Spawn process P1 in cgroup A and make it swap out some pages.
- Move process P1 to memory cgroup B.
- Destroy cgroup A.
- Do a swapoff in cgroup C.
- The swapped-in pages are accounted to cgroup C.

This patch fixes it by making the swapped-in pages accounted to
cgroup B.

Signed-off-by: Kairui Song
---
 mm/swap.h       |  2 +-
 mm/swap_state.c | 19 ++++++++++---------
 mm/zswap.c      |  2 +-
 3 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/mm/swap.h b/mm/swap.h
index 795a25df87da..4374bf11ca41 100644
--- a/mm/swap.h
+++ b/mm/swap.h
@@ -55,7 +55,7 @@ struct page *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 				    struct swap_iocb **plug);
 struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 				     struct mempolicy *mpol, pgoff_t ilx,
-				     bool *new_page_allocated);
+				     struct mm_struct *mm, bool *new_page_allocated);
 struct page *swapin_readahead(swp_entry_t entry, gfp_t flag,
 			      struct vm_fault *vmf, enum swap_cache_result *result);
 struct page *swapin_page_non_fault(swp_entry_t entry, gfp_t gfp_mask,
diff --git a/mm/swap_state.c b/mm/swap_state.c
index b377e55cb850..362a6f674b36 100644
--- a/mm/swap_state.c
+++ b/mm/swap_state.c
@@ -416,7 +416,7 @@ struct folio *filemap_get_incore_folio(struct address_space *mapping,
 
 struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 		struct mempolicy *mpol, pgoff_t ilx,
-		bool *new_page_allocated)
+		struct mm_struct *mm, bool *new_page_allocated)
 {
 	struct swap_info_struct *si;
 	struct folio *folio;
@@ -462,7 +462,7 @@ struct page *__read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 						mpol, ilx, numa_node_id());
 		if (!folio)
 			goto fail_put_swap;
-		if (mem_cgroup_swapin_charge_folio(folio, NULL, gfp_mask, entry))
+		if (mem_cgroup_swapin_charge_folio(folio, mm, gfp_mask, entry))
 			goto fail_put_folio;
 
 		/*
@@ -540,7 +540,7 @@ struct page *read_swap_cache_async(swp_entry_t entry, gfp_t gfp_mask,
 
 	mpol = get_vma_policy(vma, addr, 0, &ilx);
 	page = __read_swap_cache_async(entry, gfp_mask, mpol, ilx,
-					&page_allocated);
+					vma->vm_mm, &page_allocated);
 	mpol_cond_put(mpol);
 
 	if (page_allocated)
@@ -628,7 +628,8 @@ static unsigned long swapin_nr_pages(unsigned long offset)
  * are fairly likely to have been swapped out from the same node.
  */
 static struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
-				struct mempolicy *mpol, pgoff_t ilx)
+				struct mempolicy *mpol, pgoff_t ilx,
+				struct mm_struct *mm)
 {
 	struct page *page;
 	unsigned long entry_offset = swp_offset(entry);
@@ -657,7 +658,7 @@ static struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 		/* Ok, do the async read-ahead now */
 		page = __read_swap_cache_async(
 				swp_entry(swp_type(entry), offset),
-				gfp_mask, mpol, ilx, &page_allocated);
+				gfp_mask, mpol, ilx, mm, &page_allocated);
 		if (!page)
 			continue;
 		if (page_allocated) {
@@ -675,7 +676,7 @@ static struct page *swap_cluster_readahead(swp_entry_t entry, gfp_t gfp_mask,
 skip:
 	/* The page was likely read above, so no need for plugging here */
 	page = __read_swap_cache_async(entry, gfp_mask, mpol, ilx,
-					&page_allocated);
+					mm, &page_allocated);
 	if (unlikely(page_allocated))
 		swap_readpage(page, false, NULL);
 	return page;
@@ -830,7 +831,7 @@ static struct page *swap_vma_readahead(swp_entry_t targ_entry, gfp_t gfp_mask,
 		pte_unmap(pte);
 		pte = NULL;
 		page = __read_swap_cache_async(entry, gfp_mask, mpol, ilx,
-						&page_allocated);
+						vmf->vma->vm_mm, &page_allocated);
 		if (!page)
 			continue;
 		if (page_allocated) {
@@ -850,7 +851,7 @@ static struct page *swap_vma_readahead(swp_entry_t targ_entry, gfp_t gfp_mask,
 skip:
 	/* The page was likely read above, so no need for plugging here */
 	page = __read_swap_cache_async(targ_entry, gfp_mask, mpol, targ_ilx,
-					&page_allocated);
+					vmf->vma->vm_mm, &page_allocated);
 	if (unlikely(page_allocated))
 		swap_readpage(page, false, NULL);
 	return page;
@@ -980,7 +981,7 @@ struct page *swapin_page_non_fault(swp_entry_t entry, gfp_t gfp_mask,
 			workingset_refault(page_folio(page), shadow);
 		cache_result = SWAP_CACHE_BYPASS;
 	} else {
-		page = swap_cluster_readahead(entry, gfp_mask, mpol, ilx);
+		page = swap_cluster_readahead(entry, gfp_mask, mpol, ilx, mm);
 		cache_result = SWAP_CACHE_MISS;
 	}
 done:
diff --git a/mm/zswap.c b/mm/zswap.c
index 030cc137138f..e2712ff169b1 100644
--- a/mm/zswap.c
+++ b/mm/zswap.c
@@ -1081,7 +1081,7 @@ static int zswap_writeback_entry(struct zswap_entry *entry,
 	/* try to allocate swap cache page */
 	mpol = get_task_policy(current);
 	page = __read_swap_cache_async(swpentry, GFP_KERNEL, mpol,
-				NO_INTERLEAVE_INDEX, &page_was_allocated);
+				NO_INTERLEAVE_INDEX, NULL, &page_was_allocated);
 	if (!page) {
 		ret = -ENOMEM;
 		goto fail;
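
Not part of the patch, just for illustration: a minimal sketch of how
the new 'mm' argument is expected to flow from a fault-path caller down
to the memcg charge, assuming only the signatures shown in the diff
above. swapin_example() is a hypothetical caller, modelled on
read_swap_cache_async().

/*
 * Hypothetical caller, for illustration only (not in the patch).
 * It mirrors read_swap_cache_async() as modified above.
 */
static struct page *swapin_example(swp_entry_t entry, gfp_t gfp_mask,
				   struct vm_fault *vmf)
{
	bool page_allocated;
	struct mempolicy *mpol;
	pgoff_t ilx;
	struct page *page;

	mpol = get_vma_policy(vmf->vma, vmf->address, 0, &ilx);
	/*
	 * Passing vmf->vma->vm_mm instead of NULL lets
	 * mem_cgroup_swapin_charge_folio() charge the newly allocated
	 * folio to the memcg that mm currently belongs to, so the page
	 * is accounted to cgroup B (where the task now lives) rather
	 * than to whoever happens to trigger the swap-in, e.g. a
	 * swapoff running in cgroup C.
	 */
	page = __read_swap_cache_async(entry, gfp_mask, mpol, ilx,
				       vmf->vma->vm_mm, &page_allocated);
	mpol_cond_put(mpol);

	if (page && page_allocated)
		swap_readpage(page, false, NULL);
	return page;
}

zswap_writeback_entry() has no faulting task to attribute the page to,
so it keeps passing NULL and retains the previous fallback behaviour.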