From: Marcelo Tosatti
Date: Mon, 13 Nov 2023 20:34:22 -0300
Subject: [patch 2/2] mm: vmstat: use node_page_state_snapshot in too_many_isolated
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: Michal Hocko, Vlastimil Babka, Andrew Morton, David Hildenbrand, Peter Xu, Marcelo Tosatti
Message-ID: <20231113233502.587879658@redhat.com>
References: <20231113233420.446465795@redhat.com>
X-Patchwork-Id: 13454610

A customer reported processes hung in too_many_isolated(). Analysis
indicated that the problem was caused by out-of-sync per-CPU stats
(see below).

The fix is to use node_page_state_snapshot() to avoid reading stale
values.
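For illustration only (not part of the patch): a minimal userspace sketch of why the two reads can disagree. It assumes a simplified model in which a node counter is a global value plus small per-CPU deltas that have not yet been folded in, loosely following how node_page_state() and node_page_state_snapshot() behave. All names prefixed with model_ and the MODEL_NR_CPUS constant are hypothetical and do not exist in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_NR_CPUS 64	/* hypothetical CPU count for this model */

/*
 * Simplified stand-in for one node counter: a global value plus small
 * per-CPU deltas that have not been folded into the global value yet.
 */
struct model_node_stat {
	long global;			/* plays the role of vm_stat[item] */
	int8_t cpu_diff[MODEL_NR_CPUS];	/* plays the role of vm_node_stat_diff[item] */
};

/* Record a pending per-CPU delta (hypothetical helper). */
static void model_set_diff(struct model_node_stat *s, int cpu, int8_t diff)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	s->cpu_diff[cpu] = diff;
}

/* Cheap read: global counter only, clamped at zero. */
static unsigned long model_read(const struct model_node_stat *s)
{
	return s->global < 0 ? 0 : (unsigned long)s->global;
}

/* Snapshot read: global counter plus every pending per-CPU delta. */
static unsigned long model_read_snapshot(const struct model_node_stat *s)
{
	long x = s->global;

	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		x += s->cpu_diff[cpu];

	return x < 0 ? 0 : (unsigned long)x;
}

/* The file-LRU comparison from too_many_isolated(), on model values. */
static bool model_too_many_isolated(unsigned long isolated,
				    unsigned long inactive)
{
	return isolated > inactive;
}
```

Plugging in the values from the crash dump further down (inactive global = 1, isolated global = 2, pending deltas of -1 on CPUs 42 and 44), the cheap read compares 2 > 1 and reports too many isolated pages, while the snapshot read compares 0 > 1 and does not. This assumes the two deltas shown are the only pending ones.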

    static unsigned long
    shrink_inactive_list(unsigned long nr_to_scan, struct lruvec *lruvec,
                         struct scan_control *sc, enum lru_list lru)
    {
            :
            bool file = is_file_lru(lru);
            :
            struct pglist_data *pgdat = lruvec_pgdat(lruvec);
            :
            while (unlikely(too_many_isolated(pgdat, file, sc))) {
                    if (stalled)
                            return 0;

                    /* wait a bit for the reclaimer. */
                    msleep(100);   <--- some processes were sleeping here, with pending SIGKILL.
                    stalled = true;

                    /* We are about to die and free our memory. Return now. */
                    if (fatal_signal_pending(current))
                            return SWAP_CLUSTER_MAX;
            }

msleep() is supposed to be called only when there are too many
isolated pages:

    static int too_many_isolated(struct pglist_data *pgdat, int file,
                                 struct scan_control *sc)
    {
            :
            if (file) {
                    inactive = node_page_state(pgdat, NR_INACTIVE_FILE);
                    isolated = node_page_state(pgdat, NR_ISOLATED_FILE);
            } else {
            :
            return isolated > inactive;

The return value was true because the global counters read:

    crash> p ((struct pglist_data *) 0xffff00817fffe580)->vm_stat[NR_INACTIVE_FILE]
    $8 = {
      counter = 1
    }
    crash> p ((struct pglist_data *) 0xffff00817fffe580)->vm_stat[NR_ISOLATED_FILE]
    $9 = {
      counter = 2
    }

while the per-CPU stats still held pending deltas:

    crash> p ((struct pglist_data *) 0xffff00817fffe580)->per_cpu_nodestats
    $85 = (struct per_cpu_nodestat *) 0xffff8000118832e0
    crash> p/x 0xffff8000118832e0 + __per_cpu_offset[42]
    $86 = 0xffff00917fcc32e0
    crash> p ((struct per_cpu_nodestat *) 0xffff00917fcc32e0)->vm_node_stat_diff[NR_ISOLATED_FILE]
    $87 = -1 '\377'
    crash> p/x 0xffff8000118832e0 + __per_cpu_offset[44]
    $89 = 0xffff00917fe032e0
    crash> p ((struct per_cpu_nodestat *) 0xffff00917fe032e0)->vm_node_stat_diff[NR_ISOLATED_FILE]
    $91 = -1 '\377'

It seems that processes were trapped in a direct reclaim/compaction
loop because these nodes had very few free pages, at or below the min
watermark:

    crash> kmem -z | grep -A 3 Normal
    :
    NODE: 4  ZONE: 1  ADDR: ffff00817fffec40  NAME: "Normal"
      SIZE: 8454144  PRESENT: 98304  MIN/LOW/HIGH: 68/166/264
      VM_STAT:
        NR_FREE_PAGES: 68
    --
    NODE: 5  ZONE: 1  ADDR: ffff00897fffec40  NAME: "Normal"
      SIZE: 118784  MIN/LOW/HIGH: 82/200/318
      VM_STAT:
        NR_FREE_PAGES: 45
    --
    NODE: 6  ZONE: 1  ADDR: ffff00917fffec40  NAME: "Normal"
      SIZE: 118784  MIN/LOW/HIGH: 82/200/318
      VM_STAT:
        NR_FREE_PAGES: 53
    --
    NODE: 7  ZONE: 1  ADDR: ffff00997fbbec40  NAME: "Normal"
      SIZE: 118784  MIN/LOW/HIGH: 82/200/318
      VM_STAT:
        NR_FREE_PAGES: 52

Signed-off-by: Marcelo Tosatti

---
 mm/compaction.c |    6 +++---
 mm/vmscan.c     |    8 ++++----
 2 files changed, 7 insertions(+), 7 deletions(-)

Index: linux/mm/compaction.c
===================================================================
--- linux.orig/mm/compaction.c
+++ linux/mm/compaction.c
@@ -791,11 +791,11 @@ static bool too_many_isolated(struct com
 	unsigned long active, inactive, isolated;

-	inactive = node_page_state(pgdat, NR_INACTIVE_FILE) +
+	inactive = node_page_state_snapshot(pgdat, NR_INACTIVE_FILE) +
 			node_page_state(pgdat, NR_INACTIVE_ANON);
-	active = node_page_state(pgdat, NR_ACTIVE_FILE) +
+	active = node_page_state_snapshot(pgdat, NR_ACTIVE_FILE) +
 			node_page_state(pgdat, NR_ACTIVE_ANON);
-	isolated = node_page_state(pgdat, NR_ISOLATED_FILE) +
+	isolated = node_page_state_snapshot(pgdat, NR_ISOLATED_FILE) +
 			node_page_state(pgdat, NR_ISOLATED_ANON);

 	/*
Index: linux/mm/vmscan.c
===================================================================
--- linux.orig/mm/vmscan.c
+++ linux/mm/vmscan.c
@@ -1756,11 +1756,11 @@ static int too_many_isolated(struct pgli
 		return 0;

 	if (file) {
-		inactive = node_page_state(pgdat, NR_INACTIVE_FILE);
-		isolated = node_page_state(pgdat, NR_ISOLATED_FILE);
+		inactive = node_page_state_snapshot(pgdat, NR_INACTIVE_FILE);
+		isolated = node_page_state_snapshot(pgdat, NR_ISOLATED_FILE);
 	} else {
-		inactive = node_page_state(pgdat, NR_INACTIVE_ANON);
-		isolated = node_page_state(pgdat, NR_ISOLATED_ANON);
+		inactive = node_page_state_snapshot(pgdat, NR_INACTIVE_ANON);
+		isolated = node_page_state_snapshot(pgdat, NR_ISOLATED_ANON);
 	}

 	/*