From patchwork Tue Apr 18 19:13:07 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 13216108
From: Johannes Weiner
To: linux-mm@kvack.org
Cc: Kaiyang Zhao, Mel Gorman, Vlastimil Babka, David Rientjes,
 linux-kernel@vger.kernel.org, kernel-team@fb.com
Subject: [RFC PATCH 20/26] mm: vmscan: use compaction_suitable() check in kswapd
Date: Tue, 18 Apr 2023 15:13:07 -0400
Message-Id: <20230418191313.268131-21-hannes@cmpxchg.org>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20230418191313.268131-1-hannes@cmpxchg.org>
References: <20230418191313.268131-1-hannes@cmpxchg.org>

Kswapd currently bails on higher-order allocations with an open-coded
check for whether it's reclaimed the compaction gap.

compaction_suitable() is the customary interface to coordinate reclaim
with compaction.

Signed-off-by: Johannes Weiner
---
 mm/vmscan.c | 67 ++++++++++++++++++-----------------------------------
 1 file changed, 23 insertions(+), 44 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index ee8c8ca2e7b5..723705b9e4d9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -6872,12 +6872,18 @@ static bool pgdat_balanced(pg_data_t *pgdat, int order, int highest_zoneidx)
 		if (!managed_zone(zone))
 			continue;
 
+		/* Allocation can succeed in any zone, done */
 		if (sysctl_numa_balancing_mode & NUMA_BALANCING_MEMORY_TIERING)
 			mark = wmark_pages(zone, WMARK_PROMO);
 		else
 			mark = high_wmark_pages(zone);
 		if (zone_watermark_ok_safe(zone, order, mark, highest_zoneidx))
 			return true;
+
+		/* Allocation can't succeed, but enough order-0 to compact */
+		if (compaction_suitable(zone, order,
+					highest_zoneidx) == COMPACT_CONTINUE)
+			return true;
 	}
 
 	/*
@@ -6968,16 +6974,6 @@ static bool kswapd_shrink_node(pg_data_t *pgdat,
 	 */
 	shrink_node(pgdat, sc);
 
-	/*
-	 * Fragmentation may mean that the system cannot be rebalanced for
-	 * high-order allocations. If twice the allocation size has been
-	 * reclaimed then recheck watermarks only at order-0 to prevent
-	 * excessive reclaim. Assume that a process requested a high-order
-	 * can direct reclaim/compact.
-	 */
-	if (sc->order && sc->nr_reclaimed >= compact_gap(sc->order))
-		sc->order = 0;
-
 	return sc->nr_scanned >= sc->nr_to_reclaim;
 }
 
@@ -7018,15 +7014,13 @@ clear_reclaim_active(pg_data_t *pgdat, int highest_zoneidx)
  * that are eligible for use by the caller until at least one zone is
  * balanced.
  *
- * Returns the order kswapd finished reclaiming at.
- *
  * kswapd scans the zones in the highmem->normal->dma direction. It skips
  * zones which have free_pages > high_wmark_pages(zone), but once a zone is
  * found to have free_pages <= high_wmark_pages(zone), any page in that zone
  * or lower is eligible for reclaim until at least one usable zone is
  * balanced.
  */
-static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx)
+static void balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx)
 {
 	int i;
 	unsigned long nr_soft_reclaimed;
@@ -7226,14 +7220,6 @@ static int balance_pgdat(pg_data_t *pgdat, int order, int highest_zoneidx)
 	__fs_reclaim_release(_THIS_IP_);
 	psi_memstall_leave(&pflags);
 	set_task_reclaim_state(current, NULL);
-
-	/*
-	 * Return the order kswapd stopped reclaiming at as
-	 * prepare_kswapd_sleep() takes it into account. If another caller
-	 * entered the allocator slow path while kswapd was awake, order will
-	 * remain at the higher level.
-	 */
-	return sc.order;
 }
 
 /*
@@ -7251,7 +7237,7 @@ static enum zone_type kswapd_highest_zoneidx(pg_data_t *pgdat,
 	return curr_idx == MAX_NR_ZONES ? prev_highest_zoneidx : curr_idx;
 }
 
-static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_order,
+static void kswapd_try_to_sleep(pg_data_t *pgdat, int order,
 				unsigned int highest_zoneidx)
 {
 	long remaining = 0;
@@ -7269,7 +7255,7 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o
 	 * eligible zone balanced that it's also unlikely that compaction will
 	 * succeed.
 	 */
-	if (prepare_kswapd_sleep(pgdat, reclaim_order, highest_zoneidx)) {
+	if (prepare_kswapd_sleep(pgdat, order, highest_zoneidx)) {
 		/*
 		 * Compaction records what page blocks it recently failed to
 		 * isolate pages from and skips them in the future scanning.
@@ -7282,7 +7268,7 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o
 		 * We have freed the memory, now we should compact it to make
 		 * allocation of the requested order possible.
 		 */
-		wakeup_kcompactd(pgdat, alloc_order, highest_zoneidx);
+		wakeup_kcompactd(pgdat, order, highest_zoneidx);
 
 		remaining = schedule_timeout(HZ/10);
 
@@ -7296,8 +7282,8 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o
 					kswapd_highest_zoneidx(pgdat,
 							highest_zoneidx));
 
-			if (READ_ONCE(pgdat->kswapd_order) < reclaim_order)
-				WRITE_ONCE(pgdat->kswapd_order, reclaim_order);
+			if (READ_ONCE(pgdat->kswapd_order) < order)
+				WRITE_ONCE(pgdat->kswapd_order, order);
 		}
 
 		finish_wait(&pgdat->kswapd_wait, &wait);
@@ -7308,8 +7294,7 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o
 	 * After a short sleep, check if it was a premature sleep. If not, then
 	 * go fully to sleep until explicitly woken up.
 	 */
-	if (!remaining &&
-	    prepare_kswapd_sleep(pgdat, reclaim_order, highest_zoneidx)) {
+	if (!remaining && prepare_kswapd_sleep(pgdat, order, highest_zoneidx)) {
 		trace_mm_vmscan_kswapd_sleep(pgdat->node_id);
 
 		/*
@@ -7350,8 +7335,7 @@ static void kswapd_try_to_sleep(pg_data_t *pgdat, int alloc_order, int reclaim_o
  */
 static int kswapd(void *p)
 {
-	unsigned int alloc_order, reclaim_order;
-	unsigned int highest_zoneidx = MAX_NR_ZONES - 1;
+	unsigned int order, highest_zoneidx;
 	pg_data_t *pgdat = (pg_data_t *)p;
 	struct task_struct *tsk = current;
 	const struct cpumask *cpumask = cpumask_of_node(pgdat->node_id);
@@ -7374,22 +7358,20 @@ static int kswapd(void *p)
 	tsk->flags |= PF_MEMALLOC | PF_KSWAPD;
 	set_freezable();
 
-	WRITE_ONCE(pgdat->kswapd_order, 0);
+	order = 0;
+	highest_zoneidx = MAX_NR_ZONES - 1;
+	WRITE_ONCE(pgdat->kswapd_order, order);
 	WRITE_ONCE(pgdat->kswapd_highest_zoneidx, MAX_NR_ZONES);
+
 	atomic_set(&pgdat->nr_writeback_throttled, 0);
+
 	for ( ; ; ) {
 		bool ret;
 
-		alloc_order = reclaim_order = READ_ONCE(pgdat->kswapd_order);
-		highest_zoneidx = kswapd_highest_zoneidx(pgdat,
-							highest_zoneidx);
-
-kswapd_try_sleep:
-		kswapd_try_to_sleep(pgdat, alloc_order, reclaim_order,
-					highest_zoneidx);
+		kswapd_try_to_sleep(pgdat, order, highest_zoneidx);
 
 		/* Read the new order and highest_zoneidx */
-		alloc_order = READ_ONCE(pgdat->kswapd_order);
+		order = READ_ONCE(pgdat->kswapd_order);
 		highest_zoneidx = kswapd_highest_zoneidx(pgdat,
 							highest_zoneidx);
 		WRITE_ONCE(pgdat->kswapd_order, 0);
@@ -7415,11 +7397,8 @@ static int kswapd(void *p)
 		 * request (alloc_order).
 		 */
 		trace_mm_vmscan_kswapd_wake(pgdat->node_id, highest_zoneidx,
-						alloc_order);
-		reclaim_order = balance_pgdat(pgdat, alloc_order,
-						highest_zoneidx);
-		if (reclaim_order < alloc_order)
-			goto kswapd_try_sleep;
+						order);
+		balance_pgdat(pgdat, order, highest_zoneidx);
 	}
 
 	tsk->flags &= ~(PF_MEMALLOC | PF_KSWAPD);
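
For readers following the thread, the behavioural change in this patch can
be sketched in userspace. This is a toy model, not kernel code: struct
toy_zone and every toy_*() function are hypothetical names, a single
watermark stands in for the kernel's several, and the suitability test is
assumed to reduce to "order-0 free pages cover the watermark plus
compact_gap(order)", which simplifies what the real compaction_suitable()
checks. compact_gap() follows the removed comment's "twice the allocation
size".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct zone; all fields are simplifications. */
struct toy_zone {
	unsigned long free_pages;	/* free pages, counted at order 0 */
	unsigned long wmark;		/* single watermark, in pages */
	bool has_free_block;		/* a free block of the wanted order exists */
};

/* Twice the allocation size, in pages. */
static unsigned long compact_gap(unsigned int order)
{
	assert(order < 31);
	return 2UL << order;
}

/*
 * Old behaviour: once kswapd had reclaimed compact_gap(order) pages it
 * dropped to order 0, regardless of what the zones looked like.
 * Returns the order kswapd would keep checking at.
 */
static unsigned int toy_old_order(unsigned int order,
				  unsigned long nr_reclaimed)
{
	if (order && nr_reclaimed >= compact_gap(order))
		return 0;
	return order;
}

/* Assumed model of the watermark check at a given order. */
static bool toy_watermark_ok(const struct toy_zone *z, unsigned int order)
{
	if (z->free_pages < z->wmark)
		return false;
	return order == 0 || z->has_free_block;
}

/* Assumed model: enough order-0 pages for compaction to work with. */
static bool toy_compaction_suitable(const struct toy_zone *z,
				    unsigned int order)
{
	return z->free_pages >= z->wmark + compact_gap(order);
}

/*
 * New behaviour: the node counts as balanced if any zone can either
 * satisfy the allocation directly or has enough free memory to compact.
 */
static bool toy_pgdat_balanced(const struct toy_zone *zones, size_t nr,
			       unsigned int order)
{
	for (size_t i = 0; i < nr; i++) {
		if (toy_watermark_ok(&zones[i], order))
			return true;
		if (toy_compaction_suitable(&zones[i], order))
			return true;
	}
	return false;
}
```

At order 3, compact_gap() is 16 pages. The old check fell back to order 0
as soon as 16 pages had been reclaimed. In this model the new check reports
the node balanced only when some zone passes the watermark at that order or
holds the watermark plus 16 free pages.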