From patchwork Tue Dec 6 17:13:39 2022
X-Patchwork-Submitter: Johannes Weiner
X-Patchwork-Id: 13066170
From: Johannes Weiner
To: Andrew Morton
Cc: Linus Torvalds, Hugh Dickins, Shakeel Butt, Michal Hocko,
    linux-mm@kvack.org, cgroups@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 1/3] mm: memcontrol: skip moving non-present pages that are mapped elsewhere
Date: Tue, 6 Dec 2022 18:13:39 +0100
Message-Id: <20221206171340.139790-2-hannes@cmpxchg.org>
In-Reply-To: <20221206171340.139790-1-hannes@cmpxchg.org>
References: <20221206171340.139790-1-hannes@cmpxchg.org>

During charge moving, the pte lock and the page lock cover nearly all
cases of stabilizing page_mapped(). The only exception is when we're
looking at a non-present pte and find a page in the page cache or in the
swapcache: if the page is mapped elsewhere, it can become unmapped
outside of our control. For this reason, rmap needs lock_page_memcg().
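To make the rule explicit, here is an illustrative sketch (not code from
this patch; the helper name is made up, but pte_present() and
page_mapped() are the real predicates the patch tests):

	/*
	 * When is page_mapped() stable under the pte lock plus the
	 * page lock?
	 *
	 * - present pte:     the pte lock pins this mapping, so the
	 *                    page cannot become unmapped; stable.
	 * - non-present pte: the page lock blocks new faults against
	 *                    pagecache and swapcache, so an unmapped
	 *                    page stays unmapped; but a page that is
	 *                    already mapped elsewhere can still be
	 *                    unmapped behind our back.
	 */
	static bool move_mapcount_stable(pte_t ptent, struct page *page)
	{
		if (pte_present(ptent))
			return true;
		return !page_mapped(page);
	}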
We don't like cgroup-specific locks in generic MM code - especially in
performance-critical MM code - and for a legacy feature that's unlikely
to have many users left - if any.

So remove the exception. Arguably that's better semantics anyway: the
page is shared, and another process seems to be the more active user.

Once we stop moving such pages, rmap doesn't need lock_page_memcg()
anymore. The next patch will remove it.

Suggested-by: Hugh Dickins
Signed-off-by: Johannes Weiner
Acked-by: Hugh Dickins
Acked-by: Shakeel Butt
---
 mm/memcontrol.c | 52 ++++++++++++++++++++++++++++++++++++-------------
 1 file changed, 38 insertions(+), 14 deletions(-)

diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 48c44229cf47..b696354c1b21 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -5681,7 +5681,7 @@ static struct page *mc_handle_file_pte(struct vm_area_struct *vma,
  * @from: mem_cgroup which the page is moved from.
  * @to: mem_cgroup which the page is moved to. @from != @to.
  *
- * The caller must make sure the page is not on LRU (isolate_page() is useful.)
+ * The page must be locked and not on the LRU.
  *
  * This function doesn't do "charge" to new cgroup and doesn't do "uncharge"
  * from old cgroup.
@@ -5698,20 +5698,13 @@ static int mem_cgroup_move_account(struct page *page,
 	int nid, ret;
 
 	VM_BUG_ON(from == to);
+	VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio);
 	VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
 	VM_BUG_ON(compound && !folio_test_large(folio));
 
-	/*
-	 * Prevent mem_cgroup_migrate() from looking at
-	 * page's memory cgroup of its source page while we change it.
-	 */
-	ret = -EBUSY;
-	if (!folio_trylock(folio))
-		goto out;
-
 	ret = -EINVAL;
 	if (folio_memcg(folio) != from)
-		goto out_unlock;
+		goto out;
 
 	pgdat = folio_pgdat(folio);
 	from_vec = mem_cgroup_lruvec(from, pgdat);
@@ -5798,8 +5791,6 @@ static int mem_cgroup_move_account(struct page *page,
 	mem_cgroup_charge_statistics(from, -nr_pages);
 	memcg_check_events(from, nid);
 	local_irq_enable();
-out_unlock:
-	folio_unlock(folio);
 out:
 	return ret;
 }
@@ -5848,6 +5839,29 @@ static enum mc_target_type get_mctgt_type(struct vm_area_struct *vma,
 	else if (is_swap_pte(ptent))
 		page = mc_handle_swap_pte(vma, ptent, &ent);
 
+	if (target && page) {
+		if (!trylock_page(page)) {
+			put_page(page);
+			return ret;
+		}
+		/*
+		 * page_mapped() must be stable during the move. This
+		 * pte is locked, so if it's present, the page cannot
+		 * become unmapped. If it isn't, we have only partial
+		 * control over the mapped state: the page lock will
+		 * prevent new faults against pagecache and swapcache,
+		 * so an unmapped page cannot become mapped. However,
+		 * if the page is already mapped elsewhere, it can
+		 * unmap, and there is nothing we can do about it.
+		 * Alas, skip moving the page in this case.
+		 */
+		if (!pte_present(ptent) && page_mapped(page)) {
+			unlock_page(page);
+			put_page(page);
+			return ret;
+		}
+	}
+
 	if (!page && !ent.val)
 		return ret;
 	if (page) {
@@ -5864,8 +5878,11 @@ static enum mc_target_type get_mctgt_type(struct vm_area_struct *vma,
 			if (target)
 				target->page = page;
 		}
-		if (!ret || !target)
+		if (!ret || !target) {
+			if (target)
+				unlock_page(page);
 			put_page(page);
+		}
 	}
 	/*
 	 * There is a swap entry and a page doesn't exist or isn't charged.
@@ -5905,6 +5922,10 @@ static enum mc_target_type get_mctgt_type_thp(struct vm_area_struct *vma,
 		ret = MC_TARGET_PAGE;
 		if (target) {
 			get_page(page);
+			if (!trylock_page(page)) {
+				put_page(page);
+				return MC_TARGET_NONE;
+			}
 			target->page = page;
 		}
 	}
@@ -6143,6 +6164,7 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
 				}
 				putback_lru_page(page);
 			}
+			unlock_page(page);
 			put_page(page);
 		} else if (target_type == MC_TARGET_DEVICE) {
 			page = target.page;
@@ -6151,6 +6173,7 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
 				mc.precharge -= HPAGE_PMD_NR;
 				mc.moved_charge += HPAGE_PMD_NR;
 			}
+			unlock_page(page);
 			put_page(page);
 		}
 		spin_unlock(ptl);
@@ -6193,7 +6216,8 @@ static int mem_cgroup_move_charge_pte_range(pmd_t *pmd,
 			}
 			if (!device)
 				putback_lru_page(page);
-put:			/* get_mctgt_type() gets the page */
+put:			/* get_mctgt_type() gets & locks the page */
+			unlock_page(page);
 			put_page(page);
 			break;
 		case MC_TARGET_SWAP:
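For readers tracking the reference/lock pairing, here is a condensed,
illustrative view of the caller side after this change. This is not the
kernel code verbatim: move_one_pte() is a hypothetical wrapper, and the
PageTransCompound and device branches are trimmed; get_mctgt_type(),
isolate_lru_page() and mem_cgroup_move_account() are the real functions.
The point is that get_mctgt_type() now hands back the page both
referenced and locked, so every exit path pairs unlock_page() with
put_page():

	static void move_one_pte(struct vm_area_struct *vma,
				 unsigned long addr, pte_t ptent)
	{
		union mc_target target;

		if (get_mctgt_type(vma, addr, ptent, &target) != MC_TARGET_PAGE)
			return;

		if (!isolate_lru_page(target.page)) {
			if (!mem_cgroup_move_account(target.page, false,
						     mc.from, mc.to)) {
				mc.precharge--;
				mc.moved_charge++;
			}
			putback_lru_page(target.page);
		}
		unlock_page(target.page);	/* get_mctgt_type() locked it */
		put_page(target.page);		/* ...and took a reference */
	}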