From patchwork Wed Apr 16 13:45:39 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 14054012
From: Johannes Weiner
To: Andrew Morton
Cc: Vlastimil Babka, Brendan Jackman, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH 1/2] mm: vmscan: restore high-cpu watermark safety in kswapd
Date: Wed, 16 Apr 2025 09:45:39 -0400
Message-ID: <20250416135142.778933-2-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250416135142.778933-1-hannes@cmpxchg.org>
References: <20250416135142.778933-1-hannes@cmpxchg.org>

Vlastimil points out that commit a211c6550efc ("mm: page_alloc:
defrag_mode kswapd/kcompactd watermarks") switched kswapd from
zone_watermark_ok_safe() to the standard, percpu-cached version of reading
free pages, thus dropping the watermark safety precautions for systems
with high CPU counts (e.g. >212 cpus on 64G). Restore them.

Since zone_watermark_ok_safe() is no longer the right interface, and this
was the last caller of the function anyway, open-code the
zone_page_state_snapshot() conditional and delete the function.

Reported-by: Vlastimil Babka
Fixes: a211c6550efc ("mm: page_alloc: defrag_mode kswapd/kcompactd watermarks")
Signed-off-by: Johannes Weiner
Reviewed-by: Vlastimil Babka
---
 include/linux/mmzone.h |  2 --
 mm/page_alloc.c        | 12 ------------
 mm/vmscan.c            | 21 +++++++++++++++++++--
 3 files changed, 19 insertions(+), 16 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 4c95fcc9e9df..6ccec1bf2896 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1502,8 +1502,6 @@ bool __zone_watermark_ok(struct zone *z, unsigned int order, unsigned long mark,
 bool zone_watermark_ok(struct zone *z, unsigned int order,
 		unsigned long mark, int highest_zoneidx,
 		unsigned int alloc_flags);
-bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
-		unsigned long mark, int highest_zoneidx);
 /*
  * Memory initialization context, use to differentiate memory added by
  * the platform statically or via memory hotplug interface.
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index d7cfcfa2b077..928a81f67326 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -3470,18 +3470,6 @@ static inline bool zone_watermark_fast(struct zone *z, unsigned int order,
 	return false;
 }
 
-bool zone_watermark_ok_safe(struct zone *z, unsigned int order,
-			unsigned long mark, int highest_zoneidx)
-{
-	long free_pages = zone_page_state(z, NR_FREE_PAGES);
-
-	if (z->percpu_drift_mark && free_pages < z->percpu_drift_mark)
-		free_pages = zone_page_state_snapshot(z, NR_FREE_PAGES);
-
-	return __zone_watermark_ok(z, order, mark, highest_zoneidx, 0,
-								free_pages);
-}
-
 #ifdef CONFIG_NUMA
 int __read_mostly node_reclaim_distance = RECLAIM_DISTANCE;
 
diff --git a/mm/vmscan.c b/mm/vmscan.c
index b620d74b0f66..cc422ad830d6 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -6736,6 +6736,7 @@ static bool pgdat_balanced(pg_data_t *pgdat, int order, int highest_zoneidx)
 	 * meet watermarks.
 	 */
 	for_each_managed_zone_pgdat(zone, pgdat, i, highest_zoneidx) {
+		enum zone_stat_item item;
 		unsigned long free_pages;
 
 		if (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING)
@@ -6748,9 +6749,25 @@ static bool pgdat_balanced(pg_data_t *pgdat, int order, int highest_zoneidx)
 		 * blocks to avoid polluting allocator fallbacks.
 		 */
 		if (defrag_mode)
-			free_pages = zone_page_state(zone, NR_FREE_PAGES_BLOCKS);
+			item = NR_FREE_PAGES_BLOCKS;
 		else
-			free_pages = zone_page_state(zone, NR_FREE_PAGES);
+			item = NR_FREE_PAGES;
+
+		/*
+		 * When there is a high number of CPUs in the system,
+		 * the cumulative error from the vmstat per-cpu cache
+		 * can blur the line between the watermarks. In that
+		 * case, be safe and get an accurate snapshot.
+		 *
+		 * TODO: NR_FREE_PAGES_BLOCKS moves in steps of
+		 * pageblock_nr_pages, while the vmstat pcp threshold
+		 * is limited to 125. On many configurations that
+		 * counter won't actually be per-cpu cached. But keep
+		 * things simple for now; revisit when somebody cares.
+		 */
+		free_pages = zone_page_state(zone, item);
+		if (zone->percpu_drift_mark && free_pages < zone->percpu_drift_mark)
+			free_pages = zone_page_state_snapshot(zone, item);
 
 		if (__zone_watermark_ok(zone, order, mark, highest_zoneidx,
 					0, free_pages))
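
For readers unfamiliar with the per-cpu drift the comment above refers to, here is a minimal userspace sketch of the idea. It is not kernel code. The struct and every counter_*() name are hypothetical, NCPUS is an arbitrary choice, and the only figure taken from the patch is the 125 cap on the per-cpu threshold. It assumes a simplified model in which each CPU buffers changes locally and folds them into the shared count only once the buffered delta exceeds the threshold.

```c
#include <assert.h>

#define NCPUS     4    /* hypothetical CPU count for the illustration */
#define THRESHOLD 125  /* per-cpu delta cap; 125 is the limit named in the patch comment */

/* Hypothetical stand-in for one zone counter plus its per-cpu cache. */
struct counter {
	long global;            /* cheap to read, may lag behind reality */
	long pcp_delta[NCPUS];  /* per-cpu changes not yet folded into global */
};

/* Apply a change on one CPU; fold into global only once past the threshold. */
static void counter_mod(struct counter *c, int cpu, long delta)
{
	long d = c->pcp_delta[cpu] + delta;

	if (d > THRESHOLD || d < -THRESHOLD) {
		c->global += d;
		d = 0;
	}
	c->pcp_delta[cpu] = d;
}

/* Cheap read that ignores per-cpu deltas (models zone_page_state()). */
static long counter_read(const struct counter *c)
{
	return c->global;
}

/* Exact read that sums every per-cpu delta (models zone_page_state_snapshot()). */
static long counter_snapshot(const struct counter *c)
{
	long sum = c->global;

	for (int cpu = 0; cpu < NCPUS; cpu++)
		sum += c->pcp_delta[cpu];
	return sum;
}

/*
 * The pattern the patch open-codes: take the cheap read first and pay
 * for the exact one only when the cheap value is below the drift mark.
 * A drift mark of 0 means the drift is tolerable and no snapshot is needed.
 */
static long counter_read_safe(const struct counter *c, long drift_mark)
{
	long v = counter_read(c);

	if (drift_mark && v < drift_mark)
		v = counter_snapshot(c);
	return v;
}
```

In this model, if global starts at 1000 and each of the 4 CPUs records a change of -100, the cheap read still returns 1000 while the snapshot returns 600. The worst-case error grows as NCPUS * THRESHOLD, which is why the drift only matters on machines with many CPUs.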