From patchwork Fri Oct 25 01:22:58 2024
X-Patchwork-Submitter: Shakeel Butt
X-Patchwork-Id: 13849917
From: Shakeel Butt
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song, Hugh Dickins, Yosry Ahmed, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-doc@vger.kernel.org, Meta kernel team
Subject: [PATCH v1 1/6] memcg-v1: fully deprecate move_charge_at_immigrate
Date: Thu, 24 Oct 2024 18:22:58 -0700
Message-ID: <20241025012304.2473312-2-shakeel.butt@linux.dev>
In-Reply-To: <20241025012304.2473312-1-shakeel.butt@linux.dev>
References: <20241025012304.2473312-1-shakeel.butt@linux.dev>

Proceed with the complete deprecation of memcg v1's charge moving feature.
The deprecation warning has been in the kernel for almost two years and has
been ported to all stable kernels since. Now is the time to fully deprecate
this feature.

Signed-off-by: Shakeel Butt
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
Acked-by: Johannes Weiner
---
Changes since RFC:
- Writing 0 to memory.move_charge_at_immigrate is allowed.
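For illustration only (not part of the patch): a minimal userspace sketch
of the knob's semantics after this change, assuming the handlers behave as
in the mm/memcontrol-v1.c hunk of this patch. The function names
model_move_charge_read() and model_move_charge_write() are hypothetical
stand-ins for the kernel handlers and do not exist in the kernel.

```c
#include <errno.h>
#include <stdint.h>

/*
 * Hypothetical userspace model of the read handler after this patch:
 * the knob no longer stores any state, so a read always yields 0.
 */
static uint64_t model_move_charge_read(void)
{
	return 0;
}

/*
 * Hypothetical userspace model of the write handler after this patch:
 * writing 0 is accepted as a no-op, any other value is rejected.
 */
static int model_move_charge_write(uint64_t val)
{
	if (val != 0)
		return -EINVAL;
	return 0;
}
```

Under this model, a script that only ever writes 0 to the knob keeps
working, while an attempt to enable charge moving (for example writing 1
or 3) fails with -EINVAL.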
 .../admin-guide/cgroup-v1/memory.rst          | 82 +------------------
 mm/memcontrol-v1.c                            | 14 +---
 2 files changed, 5 insertions(+), 91 deletions(-)

diff --git a/Documentation/admin-guide/cgroup-v1/memory.rst b/Documentation/admin-guide/cgroup-v1/memory.rst
index 270501db9f4e..286d16fc22eb 100644
--- a/Documentation/admin-guide/cgroup-v1/memory.rst
+++ b/Documentation/admin-guide/cgroup-v1/memory.rst
@@ -90,9 +90,7 @@ Brief summary of control files.
                                      used.
  memory.swappiness                   set/show swappiness parameter of vmscan
                                      (See sysctl's vm.swappiness)
- memory.move_charge_at_immigrate     set/show controls of moving charges
-                                     This knob is deprecated and shouldn't be
-                                     used.
+ memory.move_charge_at_immigrate     This knob is deprecated.
  memory.oom_control                  set/show oom controls.
                                      This knob is deprecated and shouldn't be
                                      used.
@@ -243,10 +241,6 @@ behind this approach is that a cgroup that aggressively uses a shared
 page will eventually get charged for it (once it is uncharged from
 the cgroup that brought it in -- this will happen on memory pressure).
 
-But see :ref:`section 8.2 <cgroup-v1-memory-movable-charges>` when moving a
-task to another cgroup, its pages may be recharged to the new cgroup, if
-move_charge_at_immigrate has been chosen.
-
 2.4 Swap Extension
 --------------------------------------
 
@@ -756,78 +750,8 @@ If we want to change this to 1G, we can at any time use::
 
 THIS IS DEPRECATED!
 
-It's expensive and unreliable! It's better practice to launch workload
-tasks directly from inside their target cgroup. Use dedicated workload
-cgroups to allow fine-grained policy adjustments without having to
-move physical pages between control domains.
-
-Users can move charges associated with a task along with task migration, that
-is, uncharge task's pages from the old cgroup and charge them to the new cgroup.
-This feature is not supported in !CONFIG_MMU environments because of lack of
-page tables.
-
-8.1 Interface
--------------
-
-This feature is disabled by default. It can be enabled (and disabled again) by
-writing to memory.move_charge_at_immigrate of the destination cgroup.
-
-If you want to enable it::
-
-  # echo (some positive value) > memory.move_charge_at_immigrate
-
-.. note::
-      Each bits of move_charge_at_immigrate has its own meaning about what type
-      of charges should be moved. See :ref:`section 8.2
-      <cgroup-v1-memory-movable-charges>` for details.
-
-.. note::
-      Charges are moved only when you move mm->owner, in other words,
-      a leader of a thread group.
-
-.. note::
-      If we cannot find enough space for the task in the destination cgroup, we
-      try to make space by reclaiming memory. Task migration may fail if we
-      cannot make enough space.
-
-.. note::
-      It can take several seconds if you move charges much.
-
-And if you want disable it again::
-
-  # echo 0 > memory.move_charge_at_immigrate
-
-.. _cgroup-v1-memory-movable-charges:
-
-8.2 Type of charges which can be moved
---------------------------------------
-
-Each bit in move_charge_at_immigrate has its own meaning about what type of
-charges should be moved. But in any case, it must be noted that an account of
-a page or a swap can be moved only when it is charged to the task's current
-(old) memory cgroup.
-
-+---+--------------------------------------------------------------------------+
-|bit| what type of charges would be moved ?                                    |
-+===+==========================================================================+
-| 0 | A charge of an anonymous page (or swap of it) used by the target task.   |
-|   | You must enable Swap Extension (see 2.4) to enable move of swap charges. |
-+---+--------------------------------------------------------------------------+
-| 1 | A charge of file pages (normal file, tmpfs file (e.g. ipc shared memory) |
-|   | and swaps of tmpfs file) mmapped by the target task. Unlike the case of  |
-|   | anonymous pages, file pages (and swaps) in the range mmapped by the task |
-|   | will be moved even if the task hasn't done page fault, i.e. they might   |
-|   | not be the task's "RSS", but other task's "RSS" that maps the same file. |
-|   | The mapcount of the page is ignored (the page can be moved independent   |
-|   | of the mapcount). You must enable Swap Extension (see 2.4) to            |
-|   | enable move of swap charges.                                             |
-+---+--------------------------------------------------------------------------+
-
-8.3 TODO
---------
-
-- All of moving charge operations are done under cgroup_mutex. It's not good
-  behavior to hold the mutex too long, so we may need some trick.
+Reading memory.move_charge_at_immigrate will always return 0. Writing 0 to
+it is accepted; writing any other value will always return -EINVAL.
 
 9. Memory thresholds
 ====================
diff --git a/mm/memcontrol-v1.c b/mm/memcontrol-v1.c
index 81d8819f13cd..9b3b1a446c65 100644
--- a/mm/memcontrol-v1.c
+++ b/mm/memcontrol-v1.c
@@ -593,29 +593,19 @@ static inline int mem_cgroup_move_swap_account(swp_entry_t entry,
 static u64 mem_cgroup_move_charge_read(struct cgroup_subsys_state *css,
 				struct cftype *cft)
 {
-	return mem_cgroup_from_css(css)->move_charge_at_immigrate;
+	return 0;
 }
 
 #ifdef CONFIG_MMU
 static int mem_cgroup_move_charge_write(struct cgroup_subsys_state *css,
 				 struct cftype *cft, u64 val)
 {
-	struct mem_cgroup *memcg = mem_cgroup_from_css(css);
-
 	pr_warn_once("Cgroup memory moving (move_charge_at_immigrate) is deprecated. "
 		     "Please report your usecase to linux-mm@kvack.org if you "
 		     "depend on this functionality.\n");
 
-	if (val & ~MOVE_MASK)
+	if (val != 0)
 		return -EINVAL;
-
-	/*
-	 * No kind of locking is needed in here, because ->can_attach() will
-	 * check this value once in the beginning of the process, and then carry
-	 * on with stale data. This means that changes to this value will only
-	 * affect task migrations starting after the change.
-	 */
-	memcg->move_charge_at_immigrate = val;
 	return 0;
 }
 #else

From patchwork Fri Oct 25 01:22:59 2024
X-Patchwork-Submitter: Shakeel Butt
X-Patchwork-Id: 13849918
From: Shakeel Butt
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song, Hugh Dickins, Yosry Ahmed, linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-doc@vger.kernel.org, Meta kernel team, Michal Hocko
Subject: [PATCH v1 2/6] memcg-v1: remove charge move code
Date: Thu, 24 Oct 2024 18:22:59 -0700
Message-ID: <20241025012304.2473312-3-shakeel.butt@linux.dev>
In-Reply-To: <20241025012304.2473312-1-shakeel.butt@linux.dev>
References: <20241025012304.2473312-1-shakeel.butt@linux.dev>

The memcg-v1 charge move feature has been fully deprecated, so remove the
relevant code as well.

Signed-off-by: Shakeel Butt
Acked-by: Michal Hocko
Reviewed-by: Roman Gushchin
Acked-by: Johannes Weiner
---
 include/linux/memcontrol.h |   5 -
 mm/memcontrol-v1.c         | 862 -------------------------------------
 mm/memcontrol-v1.h         |   6 -
 mm/memcontrol.c            |   9 -
 4 files changed, 882 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 524006313b0d..798db70b0a30 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -299,11 +299,6 @@ struct mem_cgroup {
 	/* For oom notifier event fd */
 	struct list_head oom_notify;
 
-	/*
-	 * Should we move charges of a task when a task is moved into this
-	 * mem_cgroup ? And what type of charges should we move ?
-	 */
-	unsigned long move_charge_at_immigrate;
 	/* taken only while moving_account > 0 */
 	spinlock_t move_lock;
 	unsigned long move_lock_flags;
diff --git a/mm/memcontrol-v1.c b/mm/memcontrol-v1.c
index 9b3b1a446c65..9c0fba8c8a83 100644
--- a/mm/memcontrol-v1.c
+++ b/mm/memcontrol-v1.c
@@ -40,31 +40,6 @@ static struct mem_cgroup_tree soft_limit_tree __read_mostly;
 #define MEM_CGROUP_MAX_RECLAIM_LOOPS 100
 #define MEM_CGROUP_MAX_SOFT_LIMIT_RECLAIM_LOOPS 2
 
-/* Stuffs for move charges at task migration. */
-/*
- * Types of charges to be moved.
- */ -#define MOVE_ANON 0x1ULL -#define MOVE_FILE 0x2ULL -#define MOVE_MASK (MOVE_ANON | MOVE_FILE) - -/* "mc" and its members are protected by cgroup_mutex */ -static struct move_charge_struct { - spinlock_t lock; /* for from, to */ - struct mm_struct *mm; - struct mem_cgroup *from; - struct mem_cgroup *to; - unsigned long flags; - unsigned long precharge; - unsigned long moved_charge; - unsigned long moved_swap; - struct task_struct *moving_task; /* a task moving charges */ - wait_queue_head_t waitq; /* a waitq for other context */ -} mc = { - .lock = __SPIN_LOCK_UNLOCKED(mc.lock), - .waitq = __WAIT_QUEUE_HEAD_INITIALIZER(mc.waitq), -}; - /* for OOM */ struct mem_cgroup_eventfd_list { struct list_head list; @@ -426,51 +401,6 @@ unsigned long memcg1_soft_limit_reclaim(pg_data_t *pgdat, int order, return nr_reclaimed; } -/* - * A routine for checking "mem" is under move_account() or not. - * - * Checking a cgroup is mc.from or mc.to or under hierarchy of - * moving cgroups. This is for waiting at high-memory pressure - * caused by "move". - */ -static bool mem_cgroup_under_move(struct mem_cgroup *memcg) -{ - struct mem_cgroup *from; - struct mem_cgroup *to; - bool ret = false; - /* - * Unlike task_move routines, we access mc.to, mc.from not under - * mutual exclusion by cgroup_mutex. Here, we take spinlock instead. - */ - spin_lock(&mc.lock); - from = mc.from; - to = mc.to; - if (!from) - goto unlock; - - ret = mem_cgroup_is_descendant(from, memcg) || - mem_cgroup_is_descendant(to, memcg); -unlock: - spin_unlock(&mc.lock); - return ret; -} - -bool memcg1_wait_acct_move(struct mem_cgroup *memcg) -{ - if (mc.moving_task && current != mc.moving_task) { - if (mem_cgroup_under_move(memcg)) { - DEFINE_WAIT(wait); - prepare_to_wait(&mc.waitq, &wait, TASK_INTERRUPTIBLE); - /* moving charge context might have finished. 
*/ - if (mc.moving_task) - schedule(); - finish_wait(&mc.waitq, &wait); - return true; - } - } - return false; -} - /** * folio_memcg_lock - Bind a folio to its memcg. * @folio: The folio. @@ -552,44 +482,6 @@ void folio_memcg_unlock(struct folio *folio) __folio_memcg_unlock(folio_memcg(folio)); } -#ifdef CONFIG_SWAP -/** - * mem_cgroup_move_swap_account - move swap charge and swap_cgroup's record. - * @entry: swap entry to be moved - * @from: mem_cgroup which the entry is moved from - * @to: mem_cgroup which the entry is moved to - * - * It succeeds only when the swap_cgroup's record for this entry is the same - * as the mem_cgroup's id of @from. - * - * Returns 0 on success, -EINVAL on failure. - * - * The caller must have charged to @to, IOW, called page_counter_charge() about - * both res and memsw, and called css_get(). - */ -static int mem_cgroup_move_swap_account(swp_entry_t entry, - struct mem_cgroup *from, struct mem_cgroup *to) -{ - unsigned short old_id, new_id; - - old_id = mem_cgroup_id(from); - new_id = mem_cgroup_id(to); - - if (swap_cgroup_cmpxchg(entry, old_id, new_id) == old_id) { - mod_memcg_state(from, MEMCG_SWAP, -1); - mod_memcg_state(to, MEMCG_SWAP, 1); - return 0; - } - return -EINVAL; -} -#else -static inline int mem_cgroup_move_swap_account(swp_entry_t entry, - struct mem_cgroup *from, struct mem_cgroup *to) -{ - return -EINVAL; -} -#endif - static u64 mem_cgroup_move_charge_read(struct cgroup_subsys_state *css, struct cftype *cft) { @@ -616,760 +508,6 @@ static int mem_cgroup_move_charge_write(struct cgroup_subsys_state *css, } #endif -#ifdef CONFIG_MMU -/* Handlers for move charge at task migration. 
*/ -static int mem_cgroup_do_precharge(unsigned long count) -{ - int ret; - - /* Try a single bulk charge without reclaim first, kswapd may wake */ - ret = try_charge(mc.to, GFP_KERNEL & ~__GFP_DIRECT_RECLAIM, count); - if (!ret) { - mc.precharge += count; - return ret; - } - - /* Try charges one by one with reclaim, but do not retry */ - while (count--) { - ret = try_charge(mc.to, GFP_KERNEL | __GFP_NORETRY, 1); - if (ret) - return ret; - mc.precharge++; - cond_resched(); - } - return 0; -} - -union mc_target { - struct folio *folio; - swp_entry_t ent; -}; - -enum mc_target_type { - MC_TARGET_NONE = 0, - MC_TARGET_PAGE, - MC_TARGET_SWAP, - MC_TARGET_DEVICE, -}; - -static struct page *mc_handle_present_pte(struct vm_area_struct *vma, - unsigned long addr, pte_t ptent) -{ - struct page *page = vm_normal_page(vma, addr, ptent); - - if (!page) - return NULL; - if (PageAnon(page)) { - if (!(mc.flags & MOVE_ANON)) - return NULL; - } else { - if (!(mc.flags & MOVE_FILE)) - return NULL; - } - get_page(page); - - return page; -} - -#if defined(CONFIG_SWAP) || defined(CONFIG_DEVICE_PRIVATE) -static struct page *mc_handle_swap_pte(struct vm_area_struct *vma, - pte_t ptent, swp_entry_t *entry) -{ - struct page *page = NULL; - swp_entry_t ent = pte_to_swp_entry(ptent); - - if (!(mc.flags & MOVE_ANON)) - return NULL; - - /* - * Handle device private pages that are not accessible by the CPU, but - * stored as special swap entries in the page table. - */ - if (is_device_private_entry(ent)) { - page = pfn_swap_entry_to_page(ent); - if (!get_page_unless_zero(page)) - return NULL; - return page; - } - - if (non_swap_entry(ent)) - return NULL; - - /* - * Because swap_cache_get_folio() updates some statistics counter, - * we call find_get_page() with swapper_space directly. 
- */ - page = find_get_page(swap_address_space(ent), swap_cache_index(ent)); - entry->val = ent.val; - - return page; -} -#else -static struct page *mc_handle_swap_pte(struct vm_area_struct *vma, - pte_t ptent, swp_entry_t *entry) -{ - return NULL; -} -#endif - -static struct page *mc_handle_file_pte(struct vm_area_struct *vma, - unsigned long addr, pte_t ptent) -{ - unsigned long index; - struct folio *folio; - - if (!vma->vm_file) /* anonymous vma */ - return NULL; - if (!(mc.flags & MOVE_FILE)) - return NULL; - - /* folio is moved even if it's not RSS of this task(page-faulted). */ - /* shmem/tmpfs may report page out on swap: account for that too. */ - index = linear_page_index(vma, addr); - folio = filemap_get_incore_folio(vma->vm_file->f_mapping, index); - if (IS_ERR(folio)) - return NULL; - return folio_file_page(folio, index); -} - -static void memcg1_check_events(struct mem_cgroup *memcg, int nid); -static void memcg1_charge_statistics(struct mem_cgroup *memcg, int nr_pages); - -/** - * mem_cgroup_move_account - move account of the folio - * @folio: The folio. - * @compound: charge the page as compound or small page - * @from: mem_cgroup which the folio is moved from. - * @to: mem_cgroup which the folio is moved to. @from != @to. - * - * The folio must be locked and not on the LRU. - * - * This function doesn't do "charge" to new cgroup and doesn't do "uncharge" - * from old cgroup. - */ -static int mem_cgroup_move_account(struct folio *folio, - bool compound, - struct mem_cgroup *from, - struct mem_cgroup *to) -{ - struct lruvec *from_vec, *to_vec; - struct pglist_data *pgdat; - unsigned int nr_pages = compound ? 
folio_nr_pages(folio) : 1; - int nid, ret; - - VM_BUG_ON(from == to); - VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio); - VM_BUG_ON_FOLIO(folio_test_lru(folio), folio); - VM_BUG_ON(compound && !folio_test_large(folio)); - - ret = -EINVAL; - if (folio_memcg(folio) != from) - goto out; - - pgdat = folio_pgdat(folio); - from_vec = mem_cgroup_lruvec(from, pgdat); - to_vec = mem_cgroup_lruvec(to, pgdat); - - folio_memcg_lock(folio); - - if (folio_test_anon(folio)) { - if (folio_mapped(folio)) { - __mod_lruvec_state(from_vec, NR_ANON_MAPPED, -nr_pages); - __mod_lruvec_state(to_vec, NR_ANON_MAPPED, nr_pages); - if (folio_test_pmd_mappable(folio)) { - __mod_lruvec_state(from_vec, NR_ANON_THPS, - -nr_pages); - __mod_lruvec_state(to_vec, NR_ANON_THPS, - nr_pages); - } - } - } else { - __mod_lruvec_state(from_vec, NR_FILE_PAGES, -nr_pages); - __mod_lruvec_state(to_vec, NR_FILE_PAGES, nr_pages); - - if (folio_test_swapbacked(folio)) { - __mod_lruvec_state(from_vec, NR_SHMEM, -nr_pages); - __mod_lruvec_state(to_vec, NR_SHMEM, nr_pages); - } - - if (folio_mapped(folio)) { - __mod_lruvec_state(from_vec, NR_FILE_MAPPED, -nr_pages); - __mod_lruvec_state(to_vec, NR_FILE_MAPPED, nr_pages); - } - - if (folio_test_dirty(folio)) { - struct address_space *mapping = folio_mapping(folio); - - if (mapping_can_writeback(mapping)) { - __mod_lruvec_state(from_vec, NR_FILE_DIRTY, - -nr_pages); - __mod_lruvec_state(to_vec, NR_FILE_DIRTY, - nr_pages); - } - } - } - -#ifdef CONFIG_SWAP - if (folio_test_swapcache(folio)) { - __mod_lruvec_state(from_vec, NR_SWAPCACHE, -nr_pages); - __mod_lruvec_state(to_vec, NR_SWAPCACHE, nr_pages); - } -#endif - if (folio_test_writeback(folio)) { - __mod_lruvec_state(from_vec, NR_WRITEBACK, -nr_pages); - __mod_lruvec_state(to_vec, NR_WRITEBACK, nr_pages); - } - - /* - * All state has been migrated, let's switch to the new memcg. 
- * - * It is safe to change page's memcg here because the page - * is referenced, charged, isolated, and locked: we can't race - * with (un)charging, migration, LRU putback, or anything else - * that would rely on a stable page's memory cgroup. - * - * Note that folio_memcg_lock is a memcg lock, not a page lock, - * to save space. As soon as we switch page's memory cgroup to a - * new memcg that isn't locked, the above state can change - * concurrently again. Make sure we're truly done with it. - */ - smp_mb(); - - css_get(&to->css); - css_put(&from->css); - - folio->memcg_data = (unsigned long)to; - - __folio_memcg_unlock(from); - - ret = 0; - nid = folio_nid(folio); - - local_irq_disable(); - memcg1_charge_statistics(to, nr_pages); - memcg1_check_events(to, nid); - memcg1_charge_statistics(from, -nr_pages); - memcg1_check_events(from, nid); - local_irq_enable(); -out: - return ret; -} - -/** - * get_mctgt_type - get target type of moving charge - * @vma: the vma the pte to be checked belongs - * @addr: the address corresponding to the pte to be checked - * @ptent: the pte to be checked - * @target: the pointer the target page or swap ent will be stored(can be NULL) - * - * Context: Called with pte lock held. - * Return: - * * MC_TARGET_NONE - If the pte is not a target for move charge. - * * MC_TARGET_PAGE - If the page corresponding to this pte is a target for - * move charge. If @target is not NULL, the folio is stored in target->folio - * with extra refcnt taken (Caller should release it). - * * MC_TARGET_SWAP - If the swap entry corresponding to this pte is a - * target for charge migration. If @target is not NULL, the entry is - * stored in target->ent. - * * MC_TARGET_DEVICE - Like MC_TARGET_PAGE but page is device memory and - * thus not on the lru. For now such page is charged like a regular page - * would be as it is just special memory taking the place of a regular page. 
- * See Documentations/vm/hmm.txt and include/linux/hmm.h - */ -static enum mc_target_type get_mctgt_type(struct vm_area_struct *vma, - unsigned long addr, pte_t ptent, union mc_target *target) -{ - struct page *page = NULL; - struct folio *folio; - enum mc_target_type ret = MC_TARGET_NONE; - swp_entry_t ent = { .val = 0 }; - - if (pte_present(ptent)) - page = mc_handle_present_pte(vma, addr, ptent); - else if (pte_none_mostly(ptent)) - /* - * PTE markers should be treated as a none pte here, separated - * from other swap handling below. - */ - page = mc_handle_file_pte(vma, addr, ptent); - else if (is_swap_pte(ptent)) - page = mc_handle_swap_pte(vma, ptent, &ent); - - if (page) - folio = page_folio(page); - if (target && page) { - if (!folio_trylock(folio)) { - folio_put(folio); - return ret; - } - /* - * page_mapped() must be stable during the move. This - * pte is locked, so if it's present, the page cannot - * become unmapped. If it isn't, we have only partial - * control over the mapped state: the page lock will - * prevent new faults against pagecache and swapcache, - * so an unmapped page cannot become mapped. However, - * if the page is already mapped elsewhere, it can - * unmap, and there is nothing we can do about it. - * Alas, skip moving the page in this case. - */ - if (!pte_present(ptent) && page_mapped(page)) { - folio_unlock(folio); - folio_put(folio); - return ret; - } - } - - if (!page && !ent.val) - return ret; - if (page) { - /* - * Do only loose check w/o serialization. - * mem_cgroup_move_account() checks the page is valid or - * not under LRU exclusion. - */ - if (folio_memcg(folio) == mc.from) { - ret = MC_TARGET_PAGE; - if (folio_is_device_private(folio) || - folio_is_device_coherent(folio)) - ret = MC_TARGET_DEVICE; - if (target) - target->folio = folio; - } - if (!ret || !target) { - if (target) - folio_unlock(folio); - folio_put(folio); - } - } - /* - * There is a swap entry and a page doesn't exist or isn't charged. 
- * But we cannot move a tail-page in a THP. - */ - if (ent.val && !ret && (!page || !PageTransCompound(page)) && - mem_cgroup_id(mc.from) == lookup_swap_cgroup_id(ent)) { - ret = MC_TARGET_SWAP; - if (target) - target->ent = ent; - } - return ret; -} - -#ifdef CONFIG_TRANSPARENT_HUGEPAGE -/* - * We don't consider PMD mapped swapping or file mapped pages because THP does - * not support them for now. - * Caller should make sure that pmd_trans_huge(pmd) is true. - */ -static enum mc_target_type get_mctgt_type_thp(struct vm_area_struct *vma, - unsigned long addr, pmd_t pmd, union mc_target *target) -{ - struct page *page = NULL; - struct folio *folio; - enum mc_target_type ret = MC_TARGET_NONE; - - if (unlikely(is_swap_pmd(pmd))) { - VM_BUG_ON(thp_migration_supported() && - !is_pmd_migration_entry(pmd)); - return ret; - } - page = pmd_page(pmd); - VM_BUG_ON_PAGE(!page || !PageHead(page), page); - folio = page_folio(page); - if (!(mc.flags & MOVE_ANON)) - return ret; - if (folio_memcg(folio) == mc.from) { - ret = MC_TARGET_PAGE; - if (target) { - folio_get(folio); - if (!folio_trylock(folio)) { - folio_put(folio); - return MC_TARGET_NONE; - } - target->folio = folio; - } - } - return ret; -} -#else -static inline enum mc_target_type get_mctgt_type_thp(struct vm_area_struct *vma, - unsigned long addr, pmd_t pmd, union mc_target *target) -{ - return MC_TARGET_NONE; -} -#endif - -static int mem_cgroup_count_precharge_pte_range(pmd_t *pmd, - unsigned long addr, unsigned long end, - struct mm_walk *walk) -{ - struct vm_area_struct *vma = walk->vma; - pte_t *pte; - spinlock_t *ptl; - - ptl = pmd_trans_huge_lock(pmd, vma); - if (ptl) { - /* - * Note their can not be MC_TARGET_DEVICE for now as we do not - * support transparent huge page with MEMORY_DEVICE_PRIVATE but - * this might change. 
- */ - if (get_mctgt_type_thp(vma, addr, *pmd, NULL) == MC_TARGET_PAGE) - mc.precharge += HPAGE_PMD_NR; - spin_unlock(ptl); - return 0; - } - - pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl); - if (!pte) - return 0; - for (; addr != end; pte++, addr += PAGE_SIZE) - if (get_mctgt_type(vma, addr, ptep_get(pte), NULL)) - mc.precharge++; /* increment precharge temporarily */ - pte_unmap_unlock(pte - 1, ptl); - cond_resched(); - - return 0; -} - -static const struct mm_walk_ops precharge_walk_ops = { - .pmd_entry = mem_cgroup_count_precharge_pte_range, - .walk_lock = PGWALK_RDLOCK, -}; - -static unsigned long mem_cgroup_count_precharge(struct mm_struct *mm) -{ - unsigned long precharge; - - mmap_read_lock(mm); - walk_page_range(mm, 0, ULONG_MAX, &precharge_walk_ops, NULL); - mmap_read_unlock(mm); - - precharge = mc.precharge; - mc.precharge = 0; - - return precharge; -} - -static int mem_cgroup_precharge_mc(struct mm_struct *mm) -{ - unsigned long precharge = mem_cgroup_count_precharge(mm); - - VM_BUG_ON(mc.moving_task); - mc.moving_task = current; - return mem_cgroup_do_precharge(precharge); -} - -/* cancels all extra charges on mc.from and mc.to, and wakes up all waiters. */ -static void __mem_cgroup_clear_mc(void) -{ - struct mem_cgroup *from = mc.from; - struct mem_cgroup *to = mc.to; - - /* we must uncharge all the leftover precharges from mc.to */ - if (mc.precharge) { - mem_cgroup_cancel_charge(mc.to, mc.precharge); - mc.precharge = 0; - } - /* - * we didn't uncharge from mc.from at mem_cgroup_move_account(), so - * we must uncharge here. 
- */ - if (mc.moved_charge) { - mem_cgroup_cancel_charge(mc.from, mc.moved_charge); - mc.moved_charge = 0; - } - /* we must fixup refcnts and charges */ - if (mc.moved_swap) { - /* uncharge swap account from the old cgroup */ - if (!mem_cgroup_is_root(mc.from)) - page_counter_uncharge(&mc.from->memsw, mc.moved_swap); - - mem_cgroup_id_put_many(mc.from, mc.moved_swap); - - /* - * we charged both to->memory and to->memsw, so we - * should uncharge to->memory. - */ - if (!mem_cgroup_is_root(mc.to)) - page_counter_uncharge(&mc.to->memory, mc.moved_swap); - - mc.moved_swap = 0; - } - memcg1_oom_recover(from); - memcg1_oom_recover(to); - wake_up_all(&mc.waitq); -} - -static void mem_cgroup_clear_mc(void) -{ - struct mm_struct *mm = mc.mm; - - /* - * we must clear moving_task before waking up waiters at the end of - * task migration. - */ - mc.moving_task = NULL; - __mem_cgroup_clear_mc(); - spin_lock(&mc.lock); - mc.from = NULL; - mc.to = NULL; - mc.mm = NULL; - spin_unlock(&mc.lock); - - mmput(mm); -} - -int memcg1_can_attach(struct cgroup_taskset *tset) -{ - struct cgroup_subsys_state *css; - struct mem_cgroup *memcg = NULL; /* unneeded init to make gcc happy */ - struct mem_cgroup *from; - struct task_struct *leader, *p; - struct mm_struct *mm; - unsigned long move_flags; - int ret = 0; - - /* charge immigration isn't supported on the default hierarchy */ - if (cgroup_subsys_on_dfl(memory_cgrp_subsys)) - return 0; - - /* - * Multi-process migrations only happen on the default hierarchy - * where charge immigration is not used. Perform charge - * immigration if @tset contains a leader and whine if there are - * multiple. - */ - p = NULL; - cgroup_taskset_for_each_leader(leader, css, tset) { - WARN_ON_ONCE(p); - p = leader; - memcg = mem_cgroup_from_css(css); - } - if (!p) - return 0; - - /* - * We are now committed to this value whatever it is. Changes in this - * tunable will only affect upcoming migrations, not the current one. 
- * So we need to save it, and keep it going. - */ - move_flags = READ_ONCE(memcg->move_charge_at_immigrate); - if (!move_flags) - return 0; - - from = mem_cgroup_from_task(p); - - VM_BUG_ON(from == memcg); - - mm = get_task_mm(p); - if (!mm) - return 0; - /* We move charges only when we move a owner of the mm */ - if (mm->owner == p) { - VM_BUG_ON(mc.from); - VM_BUG_ON(mc.to); - VM_BUG_ON(mc.precharge); - VM_BUG_ON(mc.moved_charge); - VM_BUG_ON(mc.moved_swap); - - spin_lock(&mc.lock); - mc.mm = mm; - mc.from = from; - mc.to = memcg; - mc.flags = move_flags; - spin_unlock(&mc.lock); - /* We set mc.moving_task later */ - - ret = mem_cgroup_precharge_mc(mm); - if (ret) - mem_cgroup_clear_mc(); - } else { - mmput(mm); - } - return ret; -} - -void memcg1_cancel_attach(struct cgroup_taskset *tset) -{ - if (mc.to) - mem_cgroup_clear_mc(); -} - -static int mem_cgroup_move_charge_pte_range(pmd_t *pmd, - unsigned long addr, unsigned long end, - struct mm_walk *walk) -{ - int ret = 0; - struct vm_area_struct *vma = walk->vma; - pte_t *pte; - spinlock_t *ptl; - enum mc_target_type target_type; - union mc_target target; - struct folio *folio; - - ptl = pmd_trans_huge_lock(pmd, vma); - if (ptl) { - if (mc.precharge < HPAGE_PMD_NR) { - spin_unlock(ptl); - return 0; - } - target_type = get_mctgt_type_thp(vma, addr, *pmd, &target); - if (target_type == MC_TARGET_PAGE) { - folio = target.folio; - if (folio_isolate_lru(folio)) { - if (!mem_cgroup_move_account(folio, true, - mc.from, mc.to)) { - mc.precharge -= HPAGE_PMD_NR; - mc.moved_charge += HPAGE_PMD_NR; - } - folio_putback_lru(folio); - } - folio_unlock(folio); - folio_put(folio); - } else if (target_type == MC_TARGET_DEVICE) { - folio = target.folio; - if (!mem_cgroup_move_account(folio, true, - mc.from, mc.to)) { - mc.precharge -= HPAGE_PMD_NR; - mc.moved_charge += HPAGE_PMD_NR; - } - folio_unlock(folio); - folio_put(folio); - } - spin_unlock(ptl); - return 0; - } - -retry: - pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, 
&ptl); - if (!pte) - return 0; - for (; addr != end; addr += PAGE_SIZE) { - pte_t ptent = ptep_get(pte++); - bool device = false; - swp_entry_t ent; - - if (!mc.precharge) - break; - - switch (get_mctgt_type(vma, addr, ptent, &target)) { - case MC_TARGET_DEVICE: - device = true; - fallthrough; - case MC_TARGET_PAGE: - folio = target.folio; - /* - * We can have a part of the split pmd here. Moving it - * can be done but it would be too convoluted so simply - * ignore such a partial THP and keep it in original - * memcg. There should be somebody mapping the head. - */ - if (folio_test_large(folio)) - goto put; - if (!device && !folio_isolate_lru(folio)) - goto put; - if (!mem_cgroup_move_account(folio, false, - mc.from, mc.to)) { - mc.precharge--; - /* we uncharge from mc.from later. */ - mc.moved_charge++; - } - if (!device) - folio_putback_lru(folio); -put: /* get_mctgt_type() gets & locks the page */ - folio_unlock(folio); - folio_put(folio); - break; - case MC_TARGET_SWAP: - ent = target.ent; - if (!mem_cgroup_move_swap_account(ent, mc.from, mc.to)) { - mc.precharge--; - mem_cgroup_id_get_many(mc.to, 1); - /* we fixup other refcnts and charges later. */ - mc.moved_swap++; - } - break; - default: - break; - } - } - pte_unmap_unlock(pte - 1, ptl); - cond_resched(); - - if (addr != end) { - /* - * We have consumed all precharges we got in can_attach(). - * We try charge one by one, but don't do any additional - * charges to mc.to if we have failed in charge once in attach() - * phase. - */ - ret = mem_cgroup_do_precharge(1); - if (!ret) - goto retry; - } - - return ret; -} - -static const struct mm_walk_ops charge_walk_ops = { - .pmd_entry = mem_cgroup_move_charge_pte_range, - .walk_lock = PGWALK_RDLOCK, -}; - -static void mem_cgroup_move_charge(void) -{ - lru_add_drain_all(); - /* - * Signal folio_memcg_lock() to take the memcg's move_lock - * while we're moving its pages to another memcg. Then wait - * for already started RCU-only updates to finish. 
- */ - atomic_inc(&mc.from->moving_account); - synchronize_rcu(); -retry: - if (unlikely(!mmap_read_trylock(mc.mm))) { - /* - * Someone who are holding the mmap_lock might be waiting in - * waitq. So we cancel all extra charges, wake up all waiters, - * and retry. Because we cancel precharges, we might not be able - * to move enough charges, but moving charge is a best-effort - * feature anyway, so it wouldn't be a big problem. - */ - __mem_cgroup_clear_mc(); - cond_resched(); - goto retry; - } - /* - * When we have consumed all precharges and failed in doing - * additional charge, the page walk just aborts. - */ - walk_page_range(mc.mm, 0, ULONG_MAX, &charge_walk_ops, NULL); - mmap_read_unlock(mc.mm); - atomic_dec(&mc.from->moving_account); -} - -void memcg1_move_task(void) -{ - if (mc.to) { - mem_cgroup_move_charge(); - mem_cgroup_clear_mc(); - } -} - -#else /* !CONFIG_MMU */ -int memcg1_can_attach(struct cgroup_taskset *tset) -{ - return 0; -} -void memcg1_cancel_attach(struct cgroup_taskset *tset) -{ -} -void memcg1_move_task(void) -{ -} -#endif - static void __mem_cgroup_threshold(struct mem_cgroup *memcg, bool swap) { struct mem_cgroup_threshold_ary *t; diff --git a/mm/memcontrol-v1.h b/mm/memcontrol-v1.h index c0672e25bcdb..0e3b82951d91 100644 --- a/mm/memcontrol-v1.h +++ b/mm/memcontrol-v1.h @@ -80,12 +80,7 @@ static inline void memcg1_soft_limit_reset(struct mem_cgroup *memcg) WRITE_ONCE(memcg->soft_limit, PAGE_COUNTER_MAX); } -bool memcg1_wait_acct_move(struct mem_cgroup *memcg); - struct cgroup_taskset; -int memcg1_can_attach(struct cgroup_taskset *tset); -void memcg1_cancel_attach(struct cgroup_taskset *tset); -void memcg1_move_task(void); void memcg1_css_offline(struct mem_cgroup *memcg); /* for encoding cft->private value on file */ @@ -130,7 +125,6 @@ static inline void memcg1_free_events(struct mem_cgroup *memcg) {} static inline void memcg1_memcg_init(struct mem_cgroup *memcg) {} static inline void memcg1_remove_from_trees(struct mem_cgroup *memcg) 
{} static inline void memcg1_soft_limit_reset(struct mem_cgroup *memcg) {} -static inline bool memcg1_wait_acct_move(struct mem_cgroup *memcg) { return false; } static inline void memcg1_css_offline(struct mem_cgroup *memcg) {} static inline bool memcg1_oom_prepare(struct mem_cgroup *memcg, bool *locked) { return true; } diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 5c3a8629ef3e..94279b9c766a 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -2242,12 +2242,6 @@ int try_charge_memcg(struct mem_cgroup *memcg, gfp_t gfp_mask, */ if (nr_reclaimed && nr_pages <= (1 << PAGE_ALLOC_COSTLY_ORDER)) goto retry; - /* - * At task move, charge accounts can be doubly counted. So, it's - * better to wait until the end of task_move if something is going on. - */ - if (memcg1_wait_acct_move(mem_over_limit)) - goto retry; if (nr_retries--) goto retry; @@ -4439,9 +4433,6 @@ struct cgroup_subsys memory_cgrp_subsys = { .exit = mem_cgroup_exit, .dfl_cftypes = memory_files, #ifdef CONFIG_MEMCG_V1 - .can_attach = memcg1_can_attach, - .cancel_attach = memcg1_cancel_attach, - .post_attach = memcg1_move_task, .legacy_cftypes = mem_cgroup_legacy_files, #endif .early_init = 0, From patchwork Fri Oct 25 01:23:00 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shakeel Butt X-Patchwork-Id: 13849919 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7FF80D1039C for ; Fri, 25 Oct 2024 01:23:40 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 0D3036B00A0; Thu, 24 Oct 2024 21:23:40 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 0337D6B00A1; Thu, 24 Oct 2024 21:23:39 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D7A6A6B00A2; Thu, 24 Oct 2024 
21:23:39 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id B2F556B00A0 for ; Thu, 24 Oct 2024 21:23:39 -0400 (EDT) Received: from smtpin03.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 0C4B9C02AE for ; Fri, 25 Oct 2024 01:23:19 +0000 (UTC) X-FDA: 82710376434.03.4E9C451 Received: from out-187.mta1.migadu.com (out-187.mta1.migadu.com [95.215.58.187]) by imf03.hostedemail.com (Postfix) with ESMTP id C0D7020020 for ; Fri, 25 Oct 2024 01:23:28 +0000 (UTC) Authentication-Results: imf03.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b="Qnj7/ylT"; spf=pass (imf03.hostedemail.com: domain of shakeel.butt@linux.dev designates 95.215.58.187 as permitted sender) smtp.mailfrom=shakeel.butt@linux.dev; dmarc=pass (policy=none) header.from=linux.dev ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1729819365; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Ea/MjMCn/PUwsZr9oC0H1/BbJ/TXP7i5xCISmkGKoYY=; b=ounHDPJd/XTpbD4v5GmABGei4PMCMGeH4s4TtJGGNX1mKVn/A0rWfFsRFZTrGQFuQ96Frb bn0siJkHDSb9qtUV7SCiW+iSgXQxV8eAsbLGjesV4NzP2PtahybPetOHopiA5Xq63q3e8m CWIiXCqweWIwkHLJypkI73NyYYVh+sA= ARC-Authentication-Results: i=1; imf03.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b="Qnj7/ylT"; spf=pass (imf03.hostedemail.com: domain of shakeel.butt@linux.dev designates 95.215.58.187 as permitted sender) smtp.mailfrom=shakeel.butt@linux.dev; dmarc=pass (policy=none) header.from=linux.dev ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1729819365; a=rsa-sha256; cv=none; 
b=fyfVcA7eBDqRzcw6SniWEEuLdx6dW8CLksCP3DCrN+7WGoQSPRGDC2QY2hHxq3/ZNBu/jV E9mFpq7fvocoJXPMjQlKBrN4Za1qTnUcG4ZhV9OW+MxPlXxAFkw1yaSp6Dvg/noZ3DAO6Q 26+VctsKTt7X5B/7GqCNFHvnVfVp3BA= X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1729819416; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Ea/MjMCn/PUwsZr9oC0H1/BbJ/TXP7i5xCISmkGKoYY=; b=Qnj7/ylTgj6qu/raQvuKWAyO7g8JdotVpJCyFMYSo3WFE7xF7Z+WM55uL9HFpK1rmXhc2+ 0IN2sQ/XwvawwCMrgzk03NlVPJPW0VfZskrXjGkZycB05364Ktq7ebY8tvS9oZteVpPoB/ kixS1n6ATSFM4zxELJsS/AoqmrK8yDI= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Hugh Dickins , Yosry Ahmed , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-doc@vger.kernel.org, Meta kernel team Subject: [PATCH v1 3/6] memcg-v1: no need for memcg locking for dirty tracking Date: Thu, 24 Oct 2024 18:23:00 -0700 Message-ID: <20241025012304.2473312-4-shakeel.butt@linux.dev> In-Reply-To: <20241025012304.2473312-1-shakeel.butt@linux.dev> References: <20241025012304.2473312-1-shakeel.butt@linux.dev> MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Rspam-User: X-Stat-Signature: quc1otb88s3kog3n1aeb7mnnm1tf3obi X-Rspamd-Queue-Id: C0D7020020 X-Rspamd-Server: rspam11 X-HE-Tag: 1729819408-518809 X-HE-Meta: 
U2FsdGVkX19IR6CqWoAb928WGcuAGcH99xKsUUoSZqZzRqVed6nq5RWn7UXpILFZlpDBpaCWMsMTdKj30H1YpFiOKvPb9nT0V9p5fS+OfO5m+I+u1/xyhWkC7g+OYG3OuBupvXzL+3E0zEr+S7VrXSkc2lvHiOtWrifxHZ0LjwdDnUCqDgcJSGWGu4EoloPNLYLaTLW/2I7O8FhuEJK1/VLtanRdowPcXfXuMtuD/UttdSpZaYsvripVYuLVzGKkOaMtLZkWVZLNxVRAbZWuyFQWSAfyzLbXlfuw1RXz+qsBx76JUMwVKJAOugeqc/g4JbZD3oXqlzm/d5aHjB0ULOzXlla4euzMMMZPuTwX2n+9LtrZHAlBkofMI1uMFQ1dDrPO7TIkvuCFiFGCH4oXqoZKqUGhMmlAH6NeaBHR4NGe8IkcV9xIa9kYq+51D4X6E80GdSXWIOln5sXGunYoUG4o7ZA5SIlc+sqDSdVO2N+KgND8G6C0AkjC4nmvO+lAGstwyBdDaTDnNFaFybNigNoZs6opzbAvs7972iavDiu9SnmHcLERdrq2ITaiXGUXPF0bF5EHCrpr8uf3tr3RWG4KhqKzHHYYxtamny9VYW609nK0FnsQvxtF5TTbiz1NrS3d6VC8sJ/YGVcRJzAnlT+MmWtpD5DWY8AoPE56hx1d2x2hcAaQfyYilOV9d7JxA1tkkqpX9c/XVCb7eW4y9j0tzTiScy9MSuVjC4YS/VvnC9vBlYwDv1YR0xR1rkWp8+st+KkVV/PKToTjpeNaj6Ulcxi5cgsQQgxsPTNZhRIqnCr1fw1YSCBKN6t96xyeSGNOR/v+UHYBN8MVdgd36c+AD1R5V9HjUXKKEpx7Fm29XWLilhcTe9WFQf8BQK6sxh0JHwa0DUq/PspR6iwaT7iKIf8Ntg2eYLPZrCc6us35rEVxvbMNuyAeu5fBrMpmZghM281hD+0Mvc1YgU4 iZXMzAdW x492MOrqntiDNJ4cgLCHm1JL42rq2iNprGBnfqj73Jh4XK6pVFAmF91OEPrePC2mV3d4zyZsFGoupyK+UV8eKQwx7h0vL2lI78koUO2i9DdTdJk8sUB+k3SAFIlI4S232Cu1ArpZmu7RpfqU= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: During the era of memcg charge migration, the kernel had to make sure that the dirty stat updates did not race with the charge migration. Otherwise it might update the dirty stats of the wrong memcg. Now with the memcg charge migration deprecated, there is no more race for dirty stat updates and the previous locking can be removed. 
Signed-off-by: Shakeel Butt Acked-by: Michal Hocko Reviewed-by: Roman Gushchin Acked-by: Johannes Weiner --- fs/buffer.c | 5 ----- mm/page-writeback.c | 16 +++------------- 2 files changed, 3 insertions(+), 18 deletions(-) diff --git a/fs/buffer.c b/fs/buffer.c index 1fc9a50def0b..88e765b0699f 100644 --- a/fs/buffer.c +++ b/fs/buffer.c @@ -736,15 +736,12 @@ bool block_dirty_folio(struct address_space *mapping, struct folio *folio) * Lock out page's memcg migration to keep PageDirty * synchronized with per-memcg dirty page counters. */ - folio_memcg_lock(folio); newly_dirty = !folio_test_set_dirty(folio); spin_unlock(&mapping->i_private_lock); if (newly_dirty) __folio_mark_dirty(folio, mapping, 1); - folio_memcg_unlock(folio); - if (newly_dirty) __mark_inode_dirty(mapping->host, I_DIRTY_PAGES); @@ -1194,13 +1191,11 @@ void mark_buffer_dirty(struct buffer_head *bh) struct folio *folio = bh->b_folio; struct address_space *mapping = NULL; - folio_memcg_lock(folio); if (!folio_test_set_dirty(folio)) { mapping = folio->mapping; if (mapping) __folio_mark_dirty(folio, mapping, 0); } - folio_memcg_unlock(folio); if (mapping) __mark_inode_dirty(mapping->host, I_DIRTY_PAGES); } diff --git a/mm/page-writeback.c b/mm/page-writeback.c index 1d7179aba8e3..a76a73529fd9 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -2743,8 +2743,6 @@ EXPORT_SYMBOL(noop_dirty_folio); /* * Helper function for set_page_dirty family. * - * Caller must hold folio_memcg_lock(). - * * NOTE: This relies on being atomic wrt interrupts. */ static void folio_account_dirtied(struct folio *folio, @@ -2777,7 +2775,6 @@ static void folio_account_dirtied(struct folio *folio, /* * Helper function for deaccounting dirty page without writeback. * - * Caller must hold folio_memcg_lock(). 
*/ void folio_account_cleaned(struct folio *folio, struct bdi_writeback *wb) { @@ -2795,9 +2792,8 @@ void folio_account_cleaned(struct folio *folio, struct bdi_writeback *wb) * If warn is true, then emit a warning if the folio is not uptodate and has * not been truncated. * - * The caller must hold folio_memcg_lock(). It is the caller's - * responsibility to prevent the folio from being truncated while - * this function is in progress, although it may have been truncated + * It is the caller's responsibility to prevent the folio from being truncated + * while this function is in progress, although it may have been truncated * before this function is called. Most callers have the folio locked. * A few have the folio blocked from truncation through other means (e.g. * zap_vma_pages() has it mapped and is holding the page table lock). @@ -2841,14 +2837,10 @@ void __folio_mark_dirty(struct folio *folio, struct address_space *mapping, */ bool filemap_dirty_folio(struct address_space *mapping, struct folio *folio) { - folio_memcg_lock(folio); - if (folio_test_set_dirty(folio)) { - folio_memcg_unlock(folio); + if (folio_test_set_dirty(folio)) return false; - } __folio_mark_dirty(folio, mapping, !folio_test_private(folio)); - folio_memcg_unlock(folio); if (mapping->host) { /* !PageAnon && !swapper_space */ @@ -2975,14 +2967,12 @@ void __folio_cancel_dirty(struct folio *folio) struct bdi_writeback *wb; struct wb_lock_cookie cookie = {}; - folio_memcg_lock(folio); wb = unlocked_inode_to_wb_begin(inode, &cookie); if (folio_test_clear_dirty(folio)) folio_account_cleaned(folio, wb); unlocked_inode_to_wb_end(inode, &cookie); - folio_memcg_unlock(folio); } else { folio_clear_dirty(folio); } From patchwork Fri Oct 25 01:23:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shakeel Butt X-Patchwork-Id: 13849920 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6369DD1038E for ; Fri, 25 Oct 2024 01:23:43 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E684C6B00A1; Thu, 24 Oct 2024 21:23:42 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E164D6B00A2; Thu, 24 Oct 2024 21:23:42 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C424D6B00A3; Thu, 24 Oct 2024 21:23:42 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id A21836B00A1 for ; Thu, 24 Oct 2024 21:23:42 -0400 (EDT) Received: from smtpin23.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 45C9F1A02A7 for ; Fri, 25 Oct 2024 01:23:08 +0000 (UTC) X-FDA: 82710376896.23.339C403 Received: from out-186.mta1.migadu.com (out-186.mta1.migadu.com [95.215.58.186]) by imf16.hostedemail.com (Postfix) with ESMTP id C9ABA180005 for ; Fri, 25 Oct 2024 01:23:21 +0000 (UTC) Authentication-Results: imf16.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=FyQBMk5A; spf=pass (imf16.hostedemail.com: domain of shakeel.butt@linux.dev designates 95.215.58.186 as permitted sender) smtp.mailfrom=shakeel.butt@linux.dev; dmarc=pass (policy=none) header.from=linux.dev ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1729819268; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=7wiuyjy0zmQLWgqvTUEEYQB7YbC8EAAKeOqIvPyT1kA=; b=Ph7azDF0VeJ8frYa5n+OopR7iv5xJswrnxLXx8HpAkEruuYk0VmGuTnpYFzuBlXglYyRdR 
ZJJFsp5tPtgrA7sztuHFJbCCxcd8EGCUg66diqYtD8dMawyVGgHMawrPvgZ9z3I5sMAC1D 3LyFUgX796W1cVFW/wyYHrDnMbNgxk4= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1729819268; a=rsa-sha256; cv=none; b=Y4Bu3/4QbedFZrD4EDc0jwiMp71jr5cd+2T3+FMlGcUfFoN6id+gEcrJWiVic1mIIjG/wz t7ayOkB0/5Q+RD5obqqzl/1Z0gZYdCihY6EeZFUEYL02KqNp5RxL2gB8r+SW+T4TeBgtmr x18levDy91dEVMAPZbiqLT+M+6mJQQE= ARC-Authentication-Results: i=1; imf16.hostedemail.com; dkim=pass header.d=linux.dev header.s=key1 header.b=FyQBMk5A; spf=pass (imf16.hostedemail.com: domain of shakeel.butt@linux.dev designates 95.215.58.186 as permitted sender) smtp.mailfrom=shakeel.butt@linux.dev; dmarc=pass (policy=none) header.from=linux.dev X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1729819419; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7wiuyjy0zmQLWgqvTUEEYQB7YbC8EAAKeOqIvPyT1kA=; b=FyQBMk5AhTU+B75oVA+oWUlAvWNvk+tsSDX0Snyy95VoOAUeapV7M6ETpL7FGJzvOFBblq h1ob8GVon/ncRKjwN1/gbkrdP2Q6N/PGtjFTKB5tsUnTA8uQOkhNwestaeF4grzP4JmMTn PxF+k0emd7XdYjJE4En9YYVB5PEuewc= From: Shakeel Butt To: Andrew Morton Cc: Johannes Weiner , Michal Hocko , Roman Gushchin , Muchun Song , Hugh Dickins , Yosry Ahmed , linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-doc@vger.kernel.org, Meta kernel team Subject: [PATCH v1 4/6] memcg-v1: no need for memcg locking for writeback tracking Date: Thu, 24 Oct 2024 18:23:01 -0700 Message-ID: <20241025012304.2473312-5-shakeel.butt@linux.dev> In-Reply-To: <20241025012304.2473312-1-shakeel.butt@linux.dev> References: <20241025012304.2473312-1-shakeel.butt@linux.dev> MIME-Version: 1.0 X-Migadu-Flow: FLOW_OUT X-Rspamd-Queue-Id: C9ABA180005 X-Stat-Signature: 
xg31fu9xbgmagheyoyh5zgimsf69y1wi X-Rspamd-Server: rspam09 X-Rspam-User: X-HE-Tag: 1729819401-578340 X-HE-Meta: U2FsdGVkX190eqt+6wBL+oirRb7kwtc4AVGe380HQs+s4nIs3evP0EpnyCBsaRYyWQnS44zVMJdo7mwUGCvIREdnfzkRalEB09bnjNlsXFkkZrAR1UqHzfjVbIWQwGgYINKnNOVjOxl2E2qzsHboi3kFz+x8JVFhTwstUz50VL9kNrEBMmm0rJ7JJmcSdmaUp+ayna25P7eMYHf6bATx5/U4DH3pEiqA1f8HWqGtJcucOftvQ0KWH92Ir2M3OjESDkh2xqf8LP4g6C7FFJngEKw4cguoV2AYGE983mVg8gFhoaGeyv7vGjZsrCUw4u1yhoUnqPXyfrKFRogbaK4jso/JAiqwt2Q22Kc6tuzyGUs2byxX47l++nyZVeWHd30mDs6W6Y3coxkZQ4HoYG/4pGn6kWnG8JzspL/JzAIt+qeahPDH8164jVQfY/YrvVCByGywBtg7qXnh3h9LMvFsyZ00oXHdlhtHsPrtbodeyHKiEMJ62kqyB64gWoD6mMJ0HP0eJDp9q32aAsjMGNQBbOUgy4LIcwYnnYMidtQw5nqeKtyJAaIbfSwNcAz7wQimSsZkqliTlQIhKh/hXoDH8yxL522x4V7hYBqzLNO4QwChBOEGbswoT7hFO3gunaOfnGKdxKSARSSlEa/8x0W9sC2DfuND9gNYfjQbpDM3t2Svp8QPdYA0H/uEH9PdWpZUUX9YVW2b0QgkVqITmTacY3DLLy80jAL5IcEpKhfRNuFvtcMDJ7YqTJ31ugmBaZq+4nf1LjMzytzZ8nCaYI//vGAL4yM48NFjg8sEfINFkrAtLZtJZkgfNLVuio799v51JN9sunNEKfK6T+38H8h+zipKbIxRsELtB4Hd1iGbNi5lROBD05jaN13OJd0FBHmiijSFvEK0aGpyh+nXDiuBgqdVXHRw2FA6K8/w8tePRZL3KH2Uw7n+Yruro/3ky2hfQh72J1lsQE8bMi4C633 TgklBu+J 4ewxaJ1+Te83xn/AAQQNbTdqgsZ0si/ebTimLMjLORqyY/tnQoqFeX0ZVJ5lG8gsVFXxz9KLXq/7nAAx1z8uCvZTS/7Yp3nbmyoZO+YmtuHO2krd9DdzK476Qy6osINYZjzzHdQ90X2KjwIQ= X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: List-Subscribe: List-Unsubscribe: During the era of memcg charge migration, the kernel had to make sure that the writeback stat updates did not race with the charge migration. Otherwise it might update the writeback stats of the wrong memcg. Now with the memcg charge migration deprecated, there is no more race for writeback stat updates and the previous locking can be removed. 
Signed-off-by: Shakeel Butt Acked-by: Michal Hocko Reviewed-by: Roman Gushchin Acked-by: Johannes Weiner --- mm/page-writeback.c | 4 ---- 1 file changed, 4 deletions(-) diff --git a/mm/page-writeback.c b/mm/page-writeback.c index a76a73529fd9..9c3317c3a615 100644 --- a/mm/page-writeback.c +++ b/mm/page-writeback.c @@ -3083,7 +3083,6 @@ bool __folio_end_writeback(struct folio *folio) struct address_space *mapping = folio_mapping(folio); bool ret; - folio_memcg_lock(folio); if (mapping && mapping_use_writeback_tags(mapping)) { struct inode *inode = mapping->host; struct backing_dev_info *bdi = inode_to_bdi(inode); @@ -3114,7 +3113,6 @@ bool __folio_end_writeback(struct folio *folio) lruvec_stat_mod_folio(folio, NR_WRITEBACK, -nr); zone_stat_mod_folio(folio, NR_ZONE_WRITE_PENDING, -nr); node_stat_mod_folio(folio, NR_WRITTEN, nr); - folio_memcg_unlock(folio); return ret; } @@ -3127,7 +3125,6 @@ void __folio_start_writeback(struct folio *folio, bool keep_write) VM_BUG_ON_FOLIO(folio_test_writeback(folio), folio); - folio_memcg_lock(folio); if (mapping && mapping_use_writeback_tags(mapping)) { XA_STATE(xas, &mapping->i_pages, folio_index(folio)); struct inode *inode = mapping->host; @@ -3168,7 +3165,6 @@ void __folio_start_writeback(struct folio *folio, bool keep_write) lruvec_stat_mod_folio(folio, NR_WRITEBACK, nr); zone_stat_mod_folio(folio, NR_ZONE_WRITE_PENDING, nr); - folio_memcg_unlock(folio); access_ret = arch_make_folio_accessible(folio); /* From patchwork Fri Oct 25 01:23:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Shakeel Butt X-Patchwork-Id: 13849921 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6A815D1038E for ; Fri, 25 Oct 2024 01:23:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 
From: Shakeel Butt
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
 Hugh Dickins, Yosry Ahmed, linux-mm@kvack.org, cgroups@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-doc@vger.kernel.org, Meta kernel team
Subject: [PATCH v1 5/6] memcg-v1: no need for memcg locking for MGLRU
Date: Thu, 24 Oct 2024 18:23:02 -0700
Message-ID: <20241025012304.2473312-6-shakeel.butt@linux.dev>
In-Reply-To: <20241025012304.2473312-1-shakeel.butt@linux.dev>
References: <20241025012304.2473312-1-shakeel.butt@linux.dev>
MIME-Version: 1.0

While updating the generation of the folios, MGLRU requires that the
folio's memcg association remains stable. With the charge migration
deprecated, there is no need for MGLRU to acquire locks to keep the
folio and memcg association stable.

Signed-off-by: Shakeel Butt
Reviewed-by: Roman Gushchin
---
 mm/vmscan.c | 11 -----------
 1 file changed, 11 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 29c098790b01..fd7171658b63 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -3662,10 +3662,6 @@ static void walk_mm(struct mm_struct *mm, struct lru_gen_mm_walk *walk)
 		if (walk->seq != max_seq)
 			break;
 
-		/* folio_update_gen() requires stable folio_memcg() */
-		if (!mem_cgroup_trylock_pages(memcg))
-			break;
-
 		/* the caller might be holding the lock for write */
 		if (mmap_read_trylock(mm)) {
 			err = walk_page_range(mm, walk->next_addr, ULONG_MAX, &mm_walk_ops, walk);
@@ -3673,8 +3669,6 @@ static void walk_mm(struct mm_struct *mm, struct lru_gen_mm_walk *walk)
 			mmap_read_unlock(mm);
 		}
 
-		mem_cgroup_unlock_pages();
-
 		if (walk->batched) {
 			spin_lock_irq(&lruvec->lru_lock);
 			reset_batch_size(walk);
@@ -4096,10 +4090,6 @@ bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 		}
 	}
 
-	/* folio_update_gen() requires stable folio_memcg() */
-	if (!mem_cgroup_trylock_pages(memcg))
-		return true;
-
 	arch_enter_lazy_mmu_mode();
 
 	pte -= (addr - start) / PAGE_SIZE;
@@ -4144,7 +4134,6 @@ bool lru_gen_look_around(struct page_vma_mapped_walk *pvmw)
 	}
 
 	arch_leave_lazy_mmu_mode();
-	mem_cgroup_unlock_pages();
 
 	/* feedback from rmap walkers to page table walkers */
 	if (mm_state && suitable_to_scan(i, young))

From patchwork Fri Oct 25 01:23:03 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Shakeel Butt
X-Patchwork-Id: 13849922
From: Shakeel Butt
To: Andrew Morton
Cc: Johannes Weiner, Michal Hocko, Roman Gushchin, Muchun Song,
 Hugh Dickins, Yosry Ahmed, linux-mm@kvack.org, cgroups@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-doc@vger.kernel.org, Meta kernel team
Subject: [PATCH v1 6/6] memcg-v1: remove memcg move locking code
Date: Thu, 24 Oct 2024 18:23:03 -0700
Message-ID: <20241025012304.2473312-7-shakeel.butt@linux.dev>
In-Reply-To: <20241025012304.2473312-1-shakeel.butt@linux.dev>
References: <20241025012304.2473312-1-shakeel.butt@linux.dev>

The memcg v1 charge move feature has been deprecated. All the places
that used the memcg move lock have stopped using it, as they no longer
need the protection. Let's proceed to remove all the locking code
related to charge moving.

Signed-off-by: Shakeel Butt
Acked-by: Michal Hocko
Reviewed-by: Roman Gushchin
Acked-by: Johannes Weiner
---

Changes since RFC:
- Remove the memcg move locking in separate patches.

 include/linux/memcontrol.h | 54 -------------------------
 mm/filemap.c               |  1 -
 mm/memcontrol-v1.c         | 82 --------------------------------------
 mm/memcontrol.c            |  5 ---
 mm/rmap.c                  |  1 -
 5 files changed, 143 deletions(-)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 798db70b0a30..932534291ca2 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -299,20 +299,10 @@ struct mem_cgroup {
 	/* For oom notifier event fd */
 	struct list_head oom_notify;
 
-	/* taken only while moving_account > 0 */
-	spinlock_t move_lock;
-	unsigned long move_lock_flags;
-
 	/* Legacy tcp memory accounting */
 	bool tcpmem_active;
 	int tcpmem_pressure;
 
-	/*
-	 * set > 0 if pages under this cgroup are moving to other cgroup.
-	 */
-	atomic_t moving_account;
-	struct task_struct *move_lock_task;
-
 	/* List of events which userspace want to receive */
 	struct list_head event_list;
 	spinlock_t event_list_lock;
@@ -428,9 +418,7 @@ static inline struct obj_cgroup *__folio_objcg(struct folio *folio)
  *
  * - the folio lock
  * - LRU isolation
- * - folio_memcg_lock()
  * - exclusive reference
- * - mem_cgroup_trylock_pages()
  *
  * For a kmem folio a caller should hold an rcu read lock to protect memcg
  * associated with a kmem folio from being released.
@@ -499,9 +487,7 @@ static inline struct mem_cgroup *folio_memcg_rcu(struct folio *folio)
  *
  * - the folio lock
  * - LRU isolation
- * - lock_folio_memcg()
  * - exclusive reference
- * - mem_cgroup_trylock_pages()
  *
  * For a kmem folio a caller should hold an rcu read lock to protect memcg
  * associated with a kmem folio from being released.
@@ -1873,26 +1859,6 @@ static inline bool task_in_memcg_oom(struct task_struct *p)
 	return p->memcg_in_oom;
 }
 
-void folio_memcg_lock(struct folio *folio);
-void folio_memcg_unlock(struct folio *folio);
-
-/* try to stablize folio_memcg() for all the pages in a memcg */
-static inline bool mem_cgroup_trylock_pages(struct mem_cgroup *memcg)
-{
-	rcu_read_lock();
-
-	if (mem_cgroup_disabled() || !atomic_read(&memcg->moving_account))
-		return true;
-
-	rcu_read_unlock();
-	return false;
-}
-
-static inline void mem_cgroup_unlock_pages(void)
-{
-	rcu_read_unlock();
-}
-
 static inline void mem_cgroup_enter_user_fault(void)
 {
 	WARN_ON(current->in_user_fault);
@@ -1914,26 +1880,6 @@ unsigned long memcg1_soft_limit_reclaim(pg_data_t *pgdat, int order,
 	return 0;
 }
 
-static inline void folio_memcg_lock(struct folio *folio)
-{
-}
-
-static inline void folio_memcg_unlock(struct folio *folio)
-{
-}
-
-static inline bool mem_cgroup_trylock_pages(struct mem_cgroup *memcg)
-{
-	/* to match folio_memcg_rcu() */
-	rcu_read_lock();
-	return true;
-}
-
-static inline void mem_cgroup_unlock_pages(void)
-{
-	rcu_read_unlock();
-}
-
 static inline bool task_in_memcg_oom(struct task_struct *p)
 {
 	return false;
diff --git a/mm/filemap.c b/mm/filemap.c
index 630a1c431ea1..e582a1545d2a 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -119,7 +119,6 @@
  * ->i_pages lock (folio_remove_rmap_pte->set_page_dirty)
  * bdi.wb->list_lock (folio_remove_rmap_pte->set_page_dirty)
  * ->inode->i_lock (folio_remove_rmap_pte->set_page_dirty)
- * ->memcg->move_lock (folio_remove_rmap_pte->folio_memcg_lock)
  * bdi.wb->list_lock (zap_pte_range->set_page_dirty)
  * ->inode->i_lock (zap_pte_range->set_page_dirty)
  * ->private_lock (zap_pte_range->block_dirty_folio)
diff --git a/mm/memcontrol-v1.c b/mm/memcontrol-v1.c
index 9c0fba8c8a83..539ceefa9d2d 100644
--- a/mm/memcontrol-v1.c
+++ b/mm/memcontrol-v1.c
@@ -401,87 +401,6 @@ unsigned long memcg1_soft_limit_reclaim(pg_data_t *pgdat, int order,
 	return nr_reclaimed;
 }
 
-/**
- * folio_memcg_lock - Bind a folio to its memcg.
- * @folio: The folio.
- *
- * This function prevents unlocked LRU folios from being moved to
- * another cgroup.
- *
- * It ensures lifetime of the bound memcg. The caller is responsible
- * for the lifetime of the folio.
- */
-void folio_memcg_lock(struct folio *folio)
-{
-	struct mem_cgroup *memcg;
-	unsigned long flags;
-
-	/*
-	 * The RCU lock is held throughout the transaction. The fast
-	 * path can get away without acquiring the memcg->move_lock
-	 * because page moving starts with an RCU grace period.
-	 */
-	rcu_read_lock();
-
-	if (mem_cgroup_disabled())
-		return;
-again:
-	memcg = folio_memcg(folio);
-	if (unlikely(!memcg))
-		return;
-
-#ifdef CONFIG_PROVE_LOCKING
-	local_irq_save(flags);
-	might_lock(&memcg->move_lock);
-	local_irq_restore(flags);
-#endif
-
-	if (atomic_read(&memcg->moving_account) <= 0)
-		return;
-
-	spin_lock_irqsave(&memcg->move_lock, flags);
-	if (memcg != folio_memcg(folio)) {
-		spin_unlock_irqrestore(&memcg->move_lock, flags);
-		goto again;
-	}
-
-	/*
-	 * When charge migration first begins, we can have multiple
-	 * critical sections holding the fast-path RCU lock and one
-	 * holding the slowpath move_lock. Track the task who has the
-	 * move_lock for folio_memcg_unlock().
-	 */
-	memcg->move_lock_task = current;
-	memcg->move_lock_flags = flags;
-}
-
-static void __folio_memcg_unlock(struct mem_cgroup *memcg)
-{
-	if (memcg && memcg->move_lock_task == current) {
-		unsigned long flags = memcg->move_lock_flags;
-
-		memcg->move_lock_task = NULL;
-		memcg->move_lock_flags = 0;
-
-		spin_unlock_irqrestore(&memcg->move_lock, flags);
-	}
-
-	rcu_read_unlock();
-}
-
-/**
- * folio_memcg_unlock - Release the binding between a folio and its memcg.
- * @folio: The folio.
- *
- * This releases the binding created by folio_memcg_lock(). This does
- * not change the accounting of this folio to its memcg, but it does
- * permit others to change it.
- */
-void folio_memcg_unlock(struct folio *folio)
-{
-	__folio_memcg_unlock(folio_memcg(folio));
-}
-
 static u64 mem_cgroup_move_charge_read(struct cgroup_subsys_state *css,
 				struct cftype *cft)
 {
@@ -1189,7 +1108,6 @@ void memcg1_memcg_init(struct mem_cgroup *memcg)
 {
 	INIT_LIST_HEAD(&memcg->oom_notify);
 	mutex_init(&memcg->thresholds_lock);
-	spin_lock_init(&memcg->move_lock);
 	INIT_LIST_HEAD(&memcg->event_list);
 	spin_lock_init(&memcg->event_list_lock);
 }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 94279b9c766a..3c223aaeb6af 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -1189,7 +1189,6 @@ void lruvec_memcg_debug(struct lruvec *lruvec, struct folio *folio)
  * These functions are safe to use under any of the following conditions:
  * - folio locked
  * - folio_test_lru false
- * - folio_memcg_lock()
  * - folio frozen (refcount of 0)
  *
  * Return: The lruvec this folio is on with its lock held.
@@ -1211,7 +1210,6 @@ struct lruvec *folio_lruvec_lock(struct folio *folio)
  * These functions are safe to use under any of the following conditions:
  * - folio locked
  * - folio_test_lru false
- * - folio_memcg_lock()
  * - folio frozen (refcount of 0)
  *
  * Return: The lruvec this folio is on with its lock held and interrupts
@@ -1235,7 +1233,6 @@ struct lruvec *folio_lruvec_lock_irq(struct folio *folio)
  * These functions are safe to use under any of the following conditions:
  * - folio locked
  * - folio_test_lru false
- * - folio_memcg_lock()
  * - folio frozen (refcount of 0)
  *
  * Return: The lruvec this folio is on with its lock held and interrupts
@@ -2375,9 +2372,7 @@ static void commit_charge(struct folio *folio, struct mem_cgroup *memcg)
 	 *
 	 * - the page lock
 	 * - LRU isolation
-	 * - folio_memcg_lock()
 	 * - exclusive reference
-	 * - mem_cgroup_trylock_pages()
 	 */
 	folio->memcg_data = (unsigned long)memcg;
 }
diff --git a/mm/rmap.c b/mm/rmap.c
index 4785a693857a..c6c4d4ea29a7 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -32,7 +32,6 @@
  * swap_lock (in swap_duplicate, swap_info_get)
  * mmlist_lock (in mmput, drain_mmlist and others)
  * mapping->private_lock (in block_dirty_folio)
- * folio_lock_memcg move_lock (in block_dirty_folio)
  * i_pages lock (widely used)
  * lruvec->lru_lock (in folio_lruvec_lock_irq)
  * inode->i_lock (in set_page_dirty's __mark_inode_dirty)