From patchwork Wed Aug 2 02:56:02 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kalesh Singh
X-Patchwork-Id: 13337538
Date: Tue, 1 Aug 2023 19:56:02 -0700
Message-ID: <20230802025606.346758-1-kaleshsingh@google.com>
Subject: [PATCH v2 1/3] mm-unstable: Multi-gen LRU: Fix per-zone reclaim
From: Kalesh Singh
To: yuzhao@google.com, akpm@linux-foundation.org
Cc: surenb@google.com, android-mm@google.com, kernel-team@android.com,
 Kalesh Singh, stable@vger.kernel.org, Charan Teja Kalla, Lecopzer Chen,
 Matthias Brugger, AngeloGioacchino Del Regno, Suleiman Souhlal,
 Oleksandr Natalenko, "Jan Alexander Steffens (heftig)", Qi Zheng,
 Steven Barrett, Brian Geffon, Barry Song, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
 linux-mediatek@lists.infradead.org

MGLRU has an LRU list for each zone for each type (anon/file) in each
generation:

	long nr_pages[MAX_NR_GENS][ANON_AND_FILE][MAX_NR_ZONES];

The min_seq (oldest generation) can progress independently for each
type but the max_seq (youngest generation) is shared for both anon and
file. This is to maintain a common frame of reference.

In order for eviction to advance the min_seq of a type, all the per-zone
lists in the oldest generation of that type must be empty.

The eviction logic only considers pages from eligible zones for
eviction or promotion.

    scan_folios() {
	...
	for (zone = sc->reclaim_idx; zone >= 0; zone--) {
	    ...
	    sort_folio();	// Promote
	    ...
	    isolate_folio();	// Evict
	}
	...
    }

Consider a system with the movable zone configured and the default 4
generations. The current state of the system is as shown below
(only illustrating one type for simplicity):

Type: ANON

	Zone    DMA32     Normal    Movable    Device

	Gen 0       0          0        4GB         0

	Gen 1       0        1GB        1MB         0

	Gen 2     1MB        4GB        1MB         0

	Gen 3     1MB        1MB        1MB         0

Now consider a GFP_KERNEL allocation request (eligible zone
index <= Normal): evict_folios() will return without doing any work
since there are no pages to scan in the eligible zones of the oldest
generation. Reclaim won't make progress until it is triggered from a
ZONE_MOVABLE allocation request, which may not happen soon if there is
a lot of free memory in the movable zone. This can lead to OOM kills,
although there is 1GB of pages in the Normal zone of Gen 1 that we have
not yet tried to reclaim.

This issue is not seen in the conventional active/inactive LRU since
there are no per-zone lists.

If there are no (not enough) folios to scan in the eligible zones, move
folios from ineligible zones (zone_index > reclaim_index) to the next
generation. This allows for the progression of min_seq and reclaiming
from the next generation (Gen 1).
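
The stall and the fix can be illustrated with a small userspace model of
the table above. This is only a sketch, not kernel code: nr_mb[][],
scan_old(), scan_new(), gen_empty() and the ZONE_* / NR_* constants are
hypothetical names made up for the illustration, sizes are in MB, and
every page in an eligible zone is simply treated as evicted. The zone
iteration in scan_new() mirrors the loop introduced by the diff below.

```c
#include <assert.h>

/* Hypothetical userspace model of the ANON table; not kernel code. */
enum { ZONE_DMA32, ZONE_NORMAL, ZONE_MOVABLE, ZONE_DEVICE, NR_ZONES };
enum { NR_GENS = 4 };

/* Size in MB of each per-zone list, indexed by [gen][zone]. */
long nr_mb[NR_GENS][NR_ZONES] = {
	{ 0,    0, 4096, 0 },	/* Gen 0 (oldest) */
	{ 0, 1024,    1, 0 },	/* Gen 1 */
	{ 1, 4096,    1, 0 },	/* Gen 2 */
	{ 1,    1,    1, 0 },	/* Gen 3 (youngest) */
};

/* Returns 1 if every per-zone list of @gen is empty, 0 otherwise. */
int gen_empty(int gen)
{
	for (int zone = 0; zone < NR_ZONES; zone++)
		if (nr_mb[gen][zone])
			return 0;
	return 1;
}

/* Old behaviour: only the eligible zones of @gen are visited. */
long scan_old(int gen, int reclaim_idx)
{
	long evicted = 0;

	for (int zone = reclaim_idx; zone >= 0; zone--) {
		evicted += nr_mb[gen][zone];	/* model: evict all of it */
		nr_mb[gen][zone] = 0;
	}
	return evicted;
}

/*
 * Patched behaviour: every zone is visited, eligible zones first
 * (reclaim_idx down to 0, then NR_ZONES - 1 down to reclaim_idx + 1).
 * Pages in ineligible zones are aged into the next generation
 * instead of being left behind in the oldest one.
 */
long scan_new(int gen, int reclaim_idx)
{
	long evicted = 0;

	assert(gen + 1 < NR_GENS);	/* the model needs a next generation */

	for (int i = NR_ZONES; i > 0; i--) {
		int zone = (reclaim_idx + i) % NR_ZONES;

		if (zone > reclaim_idx)
			nr_mb[gen + 1][zone] += nr_mb[gen][zone];
		else
			evicted += nr_mb[gen][zone];
		nr_mb[gen][zone] = 0;
	}
	return evicted;
}
```

In this model, scan_old(0, ZONE_NORMAL) evicts nothing and leaves Gen 0
non-empty, so min_seq cannot advance. scan_new(0, ZONE_NORMAL) also
evicts nothing, but it empties Gen 0 by moving the 4GB of Movable pages
into Gen 1; a following scan_new(1, ZONE_NORMAL) then reaches the 1GB
in the Normal zone.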

Qualcomm, Mediatek and raspberrypi [1] discovered this issue independently.

[1] https://github.com/raspberrypi/linux/issues/5395

Fixes: ac35a4902374 ("mm: multi-gen LRU: minimal implementation")
Cc: stable@vger.kernel.org
Cc: Yu Zhao
Cc: Andrew Morton
Reported-by: Charan Teja Kalla
Reported-by: Lecopzer Chen
Signed-off-by: Kalesh Singh
Tested-by: AngeloGioacchino Del Regno
Tested-by: Charan Teja Kalla
---

Changes in v2:
  - Add Fixes tag and cc stable

 mm/vmscan.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 4039620d30fe..489a4fc7d9b1 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4889,7 +4889,8 @@ static int lru_gen_memcg_seg(struct lruvec *lruvec)
  *                          the eviction
  ******************************************************************************/
 
-static bool sort_folio(struct lruvec *lruvec, struct folio *folio, int tier_idx)
+static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_control *sc,
+		       int tier_idx)
 {
 	bool success;
 	int gen = folio_lru_gen(folio);
@@ -4939,6 +4940,13 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, int tier_idx)
 		return true;
 	}
 
+	/* ineligible */
+	if (zone > sc->reclaim_idx) {
+		gen = folio_inc_gen(lruvec, folio, false);
+		list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
+		return true;
+	}
+
 	/* waiting for writeback */
 	if (folio_test_locked(folio) || folio_test_writeback(folio) ||
 	    (type == LRU_GEN_FILE && folio_test_dirty(folio))) {
@@ -4987,7 +4995,8 @@ static bool isolate_folio(struct lruvec *lruvec, struct folio *folio, struct sca
 static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
 		       int type, int tier, struct list_head *list)
 {
-	int gen, zone;
+	int i;
+	int gen;
 	enum vm_event_item item;
 	int sorted = 0;
 	int scanned = 0;
@@ -5003,9 +5012,10 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
 
 	gen = lru_gen_from_seq(lrugen->min_seq[type]);
 
-	for (zone = sc->reclaim_idx; zone >= 0; zone--) {
+	for (i = MAX_NR_ZONES; i > 0; i--) {
 		LIST_HEAD(moved);
 		int skipped = 0;
+		int zone = (sc->reclaim_idx + i) % MAX_NR_ZONES;
 		struct list_head *head = &lrugen->folios[gen][type][zone];
 
 		while (!list_empty(head)) {
@@ -5019,7 +5029,7 @@ static int scan_folios(struct lruvec *lruvec, struct scan_control *sc,
 
 			scanned += delta;
 
-			if (sort_folio(lruvec, folio, tier))
+			if (sort_folio(lruvec, folio, sc, tier))
 				sorted += delta;
 			else if (isolate_folio(lruvec, folio, sc)) {
 				list_add(&folio->lru, list);