From patchwork Tue Aug 27 23:07:41 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kinsey Ho
X-Patchwork-Id: 13780211
Date: Tue, 27 Aug 2024 23:07:41 +0000
In-Reply-To: <20240827230753.2073580-1-kinseyho@google.com>
References: <20240827230753.2073580-1-kinseyho@google.com>
X-Mailer: git-send-email 2.46.0.295.g3b9ea8a38a-goog
Message-ID: <20240827230753.2073580-5-kinseyho@google.com>
Subject: [PATCH mm-unstable v3 4/5] mm: restart if multiple traversals raced
From: Kinsey Ho
To: Andrew Morton
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
 Yosry Ahmed, Roman Gushchin, Johannes Weiner, Michal Hocko, Shakeel Butt,
 Muchun Song, Tejun Heo, Zefan Li, mkoutny@suse.com, Kinsey Ho

Currently, if multiple reclaimers raced on the same position, the
reclaimers which detect the race will still reclaim from the same memcg.
Instead, the reclaimers which detect the race should move on to the next
memcg in the hierarchy.

So, in the case where multiple traversals race, jump back to the start of
the mem_cgroup_iter() function to find the next memcg in the hierarchy to
reclaim from.

Signed-off-by: Kinsey Ho
Reviewed-by: T.J. Mercier
Signed-off-by: Hugh Dickins
Acked-by: Kinsey Ho
Reported-by: syzbot+e099d407346c45275ce9@syzkaller.appspotmail.com
---
 include/linux/memcontrol.h |  4 ++--
 mm/memcontrol.c            | 22 ++++++++++++++--------
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index fe05fdb92779..2ef94c74847d 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -57,7 +57,7 @@ enum memcg_memory_event {
 
 struct mem_cgroup_reclaim_cookie {
 	pg_data_t *pgdat;
-	unsigned int generation;
+	int generation;
 };
 
 #ifdef CONFIG_MEMCG
@@ -78,7 +78,7 @@ struct lruvec_stats;
 struct mem_cgroup_reclaim_iter {
 	struct mem_cgroup *position;
 	/* scan generation, increased every round-trip */
-	unsigned int generation;
+	atomic_t generation;
 };
 
 /*
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 51b194a4c375..33bd379c738b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -986,7 +986,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 				   struct mem_cgroup_reclaim_cookie *reclaim)
 {
 	struct mem_cgroup_reclaim_iter *iter;
-	struct cgroup_subsys_state *css = NULL;
+	struct cgroup_subsys_state *css;
 	struct mem_cgroup *memcg = NULL;
 	struct mem_cgroup *pos = NULL;
 
@@ -999,18 +999,20 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 	rcu_read_lock();
 restart:
 	if (reclaim) {
+		int gen;
 		struct mem_cgroup_per_node *mz;
 
 		mz = root->nodeinfo[reclaim->pgdat->node_id];
 		iter = &mz->iter;
+		gen = atomic_read(&iter->generation);
 
 		/*
 		 * On start, join the current reclaim iteration cycle.
 		 * Exit when a concurrent walker completes it.
 		 */
 		if (!prev)
-			reclaim->generation = iter->generation;
-		else if (reclaim->generation != iter->generation)
+			reclaim->generation = gen;
+		else if (reclaim->generation != gen)
 			goto out_unlock;
 
 		pos = READ_ONCE(iter->position);
@@ -1018,8 +1020,7 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 		pos = prev;
 	}
 
-	if (pos)
-		css = &pos->css;
+	css = pos ? &pos->css : NULL;
 
 	for (;;) {
 		css = css_next_descendant_pre(css, &root->css);
@@ -1033,21 +1034,26 @@ struct mem_cgroup *mem_cgroup_iter(struct mem_cgroup *root,
 		 * and kicking, and don't take an extra reference.
 		 */
 		if (css == &root->css || css_tryget(css)) {
-			memcg = mem_cgroup_from_css(css);
 			break;
 		}
 	}
 
+	memcg = mem_cgroup_from_css(css);
+
 	if (reclaim) {
 		/*
 		 * The position could have already been updated by a competing
 		 * thread, so check that the value hasn't changed since we read
 		 * it to avoid reclaiming from the same cgroup twice.
 		 */
-		(void)cmpxchg(&iter->position, pos, memcg);
+		if (cmpxchg(&iter->position, pos, memcg) != pos) {
+			if (css && css != &root->css)
+				css_put(css);
+			goto restart;
+		}
 
 		if (!memcg) {
-			iter->generation++;
+			atomic_inc(&iter->generation);
 
 			/*
 			 * Reclaimers share the hierarchy walk, and a