From patchwork Tue Jan 23 18:45:52 2024
From: Kairui Song <ryncsn@gmail.com>
To: linux-mm@kvack.org
Cc: Andrew Morton, Yu Zhao, Wei Xu, Chris Li, Matthew Wilcox,
 linux-kernel@vger.kernel.org, Kairui Song
Subject: [PATCH v3 3/3] mm, lru_gen: move pages in bulk when aging
Date: Wed, 24 Jan 2024 02:45:52 +0800
Message-ID: <20240123184552.59758-4-ryncsn@gmail.com>
In-Reply-To: <20240123184552.59758-1-ryncsn@gmail.com>
References: <20240123184552.59758-1-ryncsn@gmail.com>

From: Kairui Song <ryncsn@gmail.com>

Another overhead of aging is page moving.
Actually, in most cases, pages are moved to the same gen after
folio_inc_gen() is called, especially the protected pages, so it is
better to move them in bulk.

Batching also has a good effect on LRU order. Currently, when MGLRU
ages, it walks the LRU backwards and moves the protected pages to the
tail of the newer gen one by one, which actually reverses the order of
pages in the LRU. Moving them in batches helps preserve their order,
though only within a small scope, due to the scan limit of
MAX_LRU_BATCH pages.

After this commit, a slight performance gain can be observed (with
CONFIG_DEBUG_LIST=n):

Test 1: Ramdisk fio test in a 4G memcg on an EPYC 7K62:

  fio -name=mglru --numjobs=16 --directory=/mnt --size=960m \
    --buffered=1 --ioengine=io_uring --iodepth=128 \
    --iodepth_batch_submit=32 --iodepth_batch_complete=32 \
    --rw=randread --random_distribution=zipf:0.5 --norandommap \
    --time_based --ramp_time=1m --runtime=6m --group_reporting

Before:
  bw ( MiB/s): min= 8299, max= 9847, per=100.00%, avg=9388.23, stdev=16.25, samples=11488
  iops       : min=2124544, max=2521056, avg=2403385.82, stdev=4159.07, samples=11488
After (-0.2%):
  bw ( MiB/s): min= 8359, max= 9796, per=100.00%, avg=9367.29, stdev=15.75, samples=11488
  iops       : min=2140113, max=2507928, avg=2398024.65, stdev=4033.07, samples=11488

Test 2: Ramdisk fio hybrid test for 30m in a 4G memcg on an EPYC 7K62
(3 times):

  fio --buffered=1 --numjobs=8 --size=960m --directory=/mnt \
    --time_based --ramp_time=1m --runtime=30m \
    --ioengine=io_uring --iodepth=128 --iodepth_batch_submit=32 \
    --iodepth_batch_complete=32 --norandommap \
    --name=mglru-ro --rw=randread --random_distribution=zipf:0.7 \
    --name=mglru-rw --rw=randrw --random_distribution=zipf:0.7

Before this patch:
  READ: 6973.3 MiB/s, Stdev: 19.601587
  WRITE: 1302.3 MiB/s, Stdev: 4.988877
After this patch (+0.1%, +0.3%):
  READ: 6981.0 MiB/s, Stdev: 15.556349
  WRITE: 1305.7 MiB/s, Stdev: 2.357023

Test 3: 30m of MySQL test in a 6G memcg (12 times):

  echo 'set GLOBAL innodb_buffer_pool_size=16106127360;' | \
    mysql -u USER -h localhost --password=PASS

  sysbench /usr/share/sysbench/oltp_read_only.lua \
    --mysql-user=USER --mysql-password=PASS --mysql-db=DB \
    --tables=48 --table-size=2000000 --threads=16 --time=1800 run

Before this patch: Avg: 135310.868182 qps. Stdev: 379.200942
After this patch (-0.3%): Avg: 135099.210000 qps. Stdev: 351.488863

Test 4: Build the Linux kernel in a 2G memcg with make -j48 with SSD
swap (for memory stress, 18 times):

Before this patch: Average: 1467.813023. Stdev: 24.232886
After this patch (+0.0%): Average: 1464.178154. Stdev: 17.992974

Test 5: Memtier test in a 4G cgroup using brd as swap (20 times):

  memcached -u nobody -m 16384 -s /tmp/memcached.socket \
    -a 0766 -t 16 -B binary &

  memtier_benchmark -S /tmp/memcached.socket \
    -P memcache_binary -n allkeys \
    --key-minimum=1 --key-maximum=16000000 -d 1024 \
    --ratio=1:0 --key-pattern=P:P -c 1 -t 16 --pipeline 8 -x 3

Before this patch: Avg: 48389.282500 Ops/sec. Stdev: 3534.470933
After this patch (+1.2%): Avg: 48959.374118 Ops/sec. Stdev: 3488.559744

Signed-off-by: Kairui Song <ryncsn@gmail.com>
---
 mm/vmscan.c | 47 ++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 44 insertions(+), 3 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 8c701b34d757..373a70801db9 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3122,8 +3122,45 @@ static int folio_update_gen(struct folio *folio, int gen)
  */
 struct lru_gen_inc_batch {
 	int delta;
+	struct folio *head, *tail;
 };
 
+static inline void lru_gen_inc_bulk_done(struct lru_gen_folio *lrugen,
+					 int bulk_gen, bool type, int zone,
+					 struct lru_gen_inc_batch *batch)
+{
+	if (!batch->head)
+		return;
+
+	list_bulk_move_tail(&lrugen->folios[bulk_gen][type][zone],
+			    &batch->head->lru,
+			    &batch->tail->lru);
+
+	batch->head = NULL;
+}
+
+/*
+ * When aging, protected pages will go to the tail of the same higher
+ * gen, so they can be moved in batches. Besides reducing overhead, this
+ * also avoids changing their LRU order, though only in a small scope.
+ */
+static inline void lru_gen_try_bulk_move(struct lru_gen_folio *lrugen, struct folio *folio,
+					 int bulk_gen, int new_gen, bool type, int zone,
+					 struct lru_gen_inc_batch *batch)
+{
+	/*
+	 * If the folio is not moving to the bulk_gen, it raced with
+	 * promotion, so it needs to go to the head of another LRU.
+	 */
+	if (bulk_gen != new_gen) {
+		list_move(&folio->lru, &lrugen->folios[new_gen][type][zone]);
+		return;
+	}
+
+	if (!batch->head)
+		batch->tail = folio;
+
+	batch->head = folio;
+}
+
 static void lru_gen_inc_batch_done(struct lruvec *lruvec, int gen, int type, int zone,
 				   struct lru_gen_inc_batch *batch)
 {
@@ -3132,6 +3169,8 @@ static void lru_gen_inc_batch_done(struct lruvec *lruvec, int gen, int type, int
 	struct lru_gen_folio *lrugen = &lruvec->lrugen;
 	enum lru_list lru = type ? LRU_INACTIVE_FILE : LRU_INACTIVE_ANON;
 
+	lru_gen_inc_bulk_done(lrugen, new_gen, type, zone, batch);
+
 	if (!delta)
 		return;
 
@@ -3709,6 +3748,7 @@ static bool inc_min_seq(struct lruvec *lruvec, int type, bool can_swap)
 	struct lru_gen_inc_batch batch = { };
 	struct lru_gen_folio *lrugen = &lruvec->lrugen;
 	int new_gen, old_gen = lru_gen_from_seq(lrugen->min_seq[type]);
+	int bulk_gen = (old_gen + 1) % MAX_NR_GENS;
 
 	if (type == LRU_GEN_ANON && !can_swap)
 		goto done;
@@ -3737,7 +3777,7 @@ static bool inc_min_seq(struct lruvec *lruvec, int type, bool can_swap)
 		}
 
 		new_gen = folio_inc_gen(folio, old_gen, false, &batch);
-		list_move_tail(&folio->lru, &lrugen->folios[new_gen][type][zone]);
+		lru_gen_try_bulk_move(lrugen, folio, bulk_gen, new_gen, type, zone, &batch);
 
 		if (!--remaining) {
 			lru_gen_inc_batch_done(lruvec, old_gen, type, zone, &batch);
@@ -4275,6 +4315,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
 	int tier = lru_tier_from_refs(refs);
 	struct lru_gen_folio *lrugen = &lruvec->lrugen;
 	int old_gen = lru_gen_from_seq(lrugen->min_seq[type]);
+	int bulk_gen = (old_gen + 1) % MAX_NR_GENS;
 
 	VM_WARN_ON_ONCE_FOLIO(gen >= MAX_NR_GENS, folio);
 
@@ -4308,7 +4349,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
 		int hist = lru_hist_from_seq(lrugen->min_seq[type]);
 
 		gen = folio_inc_gen(folio, old_gen, false, batch);
-		list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
+		lru_gen_try_bulk_move(lrugen, folio, bulk_gen, gen, type, zone, batch);
 
 		WRITE_ONCE(lrugen->protected[hist][type][tier - 1],
 			   lrugen->protected[hist][type][tier - 1] + delta);
@@ -4318,7 +4359,7 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
 
 	/* ineligible */
 	if (zone > sc->reclaim_idx || skip_cma(folio, sc)) {
 		gen = folio_inc_gen(folio, old_gen, false, batch);
-		list_move_tail(&folio->lru, &lrugen->folios[gen][type][zone]);
+		lru_gen_try_bulk_move(lrugen, folio, bulk_gen, gen, type, zone, batch);
 		return true;
 	}