From patchwork Tue Oct 4 23:34:46 2022
X-Patchwork-Submitter: Yosry Ahmed
X-Patchwork-Id: 12998782
Date: Tue, 4 Oct 2022 23:34:46 +0000
Message-ID: <20221004233446.787056-1-yosryahmed@google.com>
Subject: [PATCH] mm/vmscan: check references from all memcgs for swapbacked memory
From: Yosry Ahmed
To: Andrew Morton, Johannes Weiner, Michal Hocko,
 Roman Gushchin, Shakeel Butt, Muchun Song
Cc: Greg Thelen, David Rientjes, cgroups@vger.kernel.org,
 linux-mm@kvack.org, Yosry Ahmed

During page/folio reclaim, we check whether a folio is referenced using
folio_referenced() to avoid reclaiming folios that have been recently
accessed (hot memory). The rationale is that this memory is likely to be
accessed again soon, and reclaiming it would therefore cause a refault.

For memcg reclaim, we pass sc->target_mem_cgroup to folio_referenced(),
which means we only check accesses to the folio coming from processes in
the subtree of the target memcg. This behavior was originally introduced
by commit bed7161a519a ("Memory controller: make page_referenced() cgroup
aware") a long time ago. Back then, refaulted pages would get charged to
the memcg of the process that was faulting them in, so it made sense to
only consider accesses coming from processes in the subtree of
target_mem_cgroup: if a page was charged to memcg A but only accessed by
a sibling memcg B, we would reclaim it when memcg A came under pressure,
and memcg B could then fault it back in and be charged for it
appropriately.

Today, this behavior still makes sense for file pages. However, unlike
file pages, swapbacked pages that are refaulted are charged to the memcg
that was originally charged for them at swapout. This means that if a
swapbacked page is charged to memcg A but only used by memcg B, and we
reclaim it when memcg A is under pressure, it is simply faulted back in
and charged to memcg A again as soon as memcg B accesses it.
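As a concrete illustration, here is a toy userspace model of the loop
described above. It is only a sketch of the charging rules as stated in
this message; toy_page, refault_charge() and the two-memcg setup are
made-up stand-ins, not the kernel's actual types or code paths:

  #include <stdbool.h>
  #include <stdio.h>

  enum memcg { MEMCG_A, MEMCG_B };

  struct toy_page {
          enum memcg charged_to;  /* memcg currently paying for the page */
          bool swapbacked;        /* anon/shmem rather than file-backed */
  };

  /* Simplified charging rule applied when a reclaimed page refaults. */
  static enum memcg refault_charge(const struct toy_page *page,
                                   enum memcg faulter)
  {
          /*
           * Swapbacked pages are recharged to the memcg that owned them
           * at swapout; file pages are charged to whoever faults them in.
           */
          return page->swapbacked ? page->charged_to : faulter;
  }

  int main(void)
  {
          /* A swapbacked page charged to memcg A, only ever touched by B. */
          struct toy_page page = { MEMCG_A, true };

          /*
           * Memcg A comes under pressure. Checking references only in A's
           * subtree finds none, so the (hot) page is reclaimed and memcg B
           * refaults it.
           */
          enum memcg payer = refault_charge(&page, MEMCG_B);

          /* Prints "A": the charge lands right back on the pressured memcg. */
          printf("refaulted by B, charged to memcg %s\n",
                 payer == MEMCG_A ? "A" : "B");
          return 0;
  }

Flipping swapbacked to false moves the charge to memcg B instead, which is
why file folios can keep the subtree-only reference check while swapbacked
folios cannot.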
In that sense, accesses from all memcgs matter equally when considering
whether a swapbacked page/folio is a viable reclaim target.

Add folio_referenced_memcg(), which decides which memcg we should pass to
folio_referenced() based on the folio type and includes an elaborate
comment explaining why. This should help reclaim make better decisions
and reduce refaults when reclaiming swapbacked memory that is used by
multiple memcgs.

Signed-off-by: Yosry Ahmed <yosryahmed@google.com>
---
 mm/vmscan.c | 38 ++++++++++++++++++++++++++++++++++----
 1 file changed, 34 insertions(+), 4 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index c5a4bff11da6..f9fa0f9287e5 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1443,14 +1443,43 @@ enum folio_references {
 	FOLIOREF_ACTIVATE,
 };
 
+/* What memcg should we pass to folio_referenced()? */
+static struct mem_cgroup *folio_referenced_memcg(struct folio *folio,
+						 struct mem_cgroup *target_memcg)
+{
+	/*
+	 * We check references to folios to make sure we don't reclaim hot
+	 * folios that are likely to be refaulted soon. We pass a memcg to
+	 * folio_referenced() to only check references coming from processes in
+	 * that memcg's subtree.
+	 *
+	 * For file folios, we only consider references from processes in the
+	 * subtree of the target memcg. If a folio is charged to
+	 * memcg A but is only referenced by processes in memcg B, we reclaim it
+	 * if memcg A is under pressure. If it is later accessed by memcg B it
+	 * will be faulted back in and charged to memcg B. For memcg A, this is
+	 * cold memory that should be reclaimed.
+	 *
+	 * On the other hand, when swapbacked folios are faulted in, they get
+	 * charged to the memcg that was originally charged for them at the time
+	 * of swapping out. This means that if a folio that is charged to
+	 * memcg A gets swapped out, it will get charged back to A when *any*
+	 * memcg accesses it. In that sense, we need to consider references from
+	 * *all* processes when considering whether to reclaim a swapbacked
+	 * folio.
+	 */
+	return folio_test_swapbacked(folio) ? NULL : target_memcg;
+}
+
 static enum folio_references folio_check_references(struct folio *folio,
 						  struct scan_control *sc)
 {
 	int referenced_ptes, referenced_folio;
 	unsigned long vm_flags;
+	struct mem_cgroup *memcg = folio_referenced_memcg(folio,
+						sc->target_mem_cgroup);
 
-	referenced_ptes = folio_referenced(folio, 1, sc->target_mem_cgroup,
-					   &vm_flags);
+	referenced_ptes = folio_referenced(folio, 1, memcg, &vm_flags);
 	referenced_folio = folio_test_clear_referenced(folio);
 
 	/*
@@ -2581,6 +2610,7 @@ static void shrink_active_list(unsigned long nr_to_scan,
 
 	while (!list_empty(&l_hold)) {
 		struct folio *folio;
+		struct mem_cgroup *memcg;
 
 		cond_resched();
 		folio = lru_to_folio(&l_hold);
@@ -2600,8 +2630,8 @@ static void shrink_active_list(unsigned long nr_to_scan,
 		}
 
 		/* Referenced or rmap lock contention: rotate */
-		if (folio_referenced(folio, 0, sc->target_mem_cgroup,
-				     &vm_flags) != 0) {
+		memcg = folio_referenced_memcg(folio, sc->target_mem_cgroup);
+		if (folio_referenced(folio, 0, memcg, &vm_flags) != 0) {
 			/*
 			 * Identify referenced, file-backed active folios and
 			 * give them one more trip around the active list. So