From patchwork Mon Aug 19 02:30:54 2024
X-Patchwork-Submitter: Usama Arif
X-Patchwork-Id: 13767728
From: Usama Arif <usamaarif642@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    baohua@kernel.org, ryan.roberts@arm.com, rppt@kernel.org,
    willy@infradead.org, cerasuolodomenico@gmail.com, ryncsn@gmail.com,
    corbet@lwn.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    kernel-team@meta.com, Shuang Zhai, Usama Arif <usamaarif642@gmail.com>
Subject: [PATCH v4 1/6] mm: free zapped tail pages when splitting isolated thp
Date: Mon, 19 Aug 2024 03:30:54 +0100
Message-ID: <20240819023145.2415299-2-usamaarif642@gmail.com>
In-Reply-To: <20240819023145.2415299-1-usamaarif642@gmail.com>
References: <20240819023145.2415299-1-usamaarif642@gmail.com>
From: Yu Zhao <yuzhao@google.com>

If a tail page has only two references left, one inherited from the
isolation of its head and the other from lru_add_page_tail() which we
are about to drop, it means this tail page was concurrently zapped.
Then we can safely free it and save page reclaim or migration the
trouble of trying it.

Signed-off-by: Yu Zhao <yuzhao@google.com>
Tested-by: Shuang Zhai
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
---
 mm/huge_memory.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 04ee8abd6475..147655821f09 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3059,7 +3059,9 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 	unsigned int new_nr = 1 << new_order;
 	int order = folio_order(folio);
 	unsigned int nr = 1 << order;
+	struct folio_batch free_folios;
 
+	folio_batch_init(&free_folios);
 	/* complete memcg works before add pages to LRU */
 	split_page_memcg(head, order, new_order);
 
@@ -3143,6 +3145,27 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 		if (subpage == page)
 			continue;
 		folio_unlock(new_folio);
+		/*
+		 * If a folio has only two references left, one inherited
+		 * from the isolation of its head and the other from
+		 * lru_add_page_tail() which we are about to drop, it means this
+		 * folio was concurrently zapped. Then we can safely free it
+		 * and save page reclaim or migration the trouble of trying it.
+		 */
+		if (list && folio_ref_freeze(new_folio, 2)) {
+			VM_WARN_ON_ONCE_FOLIO(folio_test_lru(new_folio), new_folio);
+			VM_WARN_ON_ONCE_FOLIO(folio_test_large(new_folio), new_folio);
+			VM_WARN_ON_ONCE_FOLIO(folio_mapped(new_folio), new_folio);
+
+			folio_clear_active(new_folio);
+			folio_clear_unevictable(new_folio);
+			list_del(&new_folio->lru);
+			if (!folio_batch_add(&free_folios, new_folio)) {
+				mem_cgroup_uncharge_folios(&free_folios);
+				free_unref_folios(&free_folios);
+			}
+			continue;
+		}
 
 		/*
 		 * Subpages may be freed if there wasn't any mapping
@@ -3153,6 +3176,11 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 		 */
 		free_page_and_swap_cache(subpage);
 	}
+
+	if (free_folios.nr) {
+		mem_cgroup_uncharge_folios(&free_folios);
+		free_unref_folios(&free_folios);
+	}
 }

 /* Racy check whether the huge page can be split */
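The gate for this fast-free path is folio_ref_freeze(new_folio, 2): the
tail page is only freed if its refcount is exactly two, i.e. the
isolation reference plus the lru_add_page_tail() reference being dropped
here. As a minimal userspace sketch of that semantic, using C11 atomics
in place of the kernel's page refcount machinery (illustrative only, not
the actual implementation):

	#include <stdatomic.h>
	#include <stdbool.h>

	/*
	 * Sketch only: succeed iff the refcount is exactly @count, and
	 * atomically drop it to zero so no new reference can be taken
	 * before the page is freed.
	 */
	static bool ref_freeze(atomic_int *refcount, int count)
	{
		int expected = count;

		/* CAS: *refcount == count ? *refcount = 0 : fail */
		return atomic_compare_exchange_strong(refcount, &expected, 0);
	}

If the freeze fails, some other path still holds a reference and the
subpage simply falls through to the existing free_page_and_swap_cache()
handling.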
From patchwork Mon Aug 19 02:30:55 2024
X-Patchwork-Submitter: Usama Arif
X-Patchwork-Id: 13767736
From: Usama Arif <usamaarif642@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    baohua@kernel.org, ryan.roberts@arm.com, rppt@kernel.org,
    willy@infradead.org, cerasuolodomenico@gmail.com, ryncsn@gmail.com,
    corbet@lwn.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    kernel-team@meta.com, Shuang Zhai, Usama Arif <usamaarif642@gmail.com>
Subject: [PATCH v4 2/6] mm: remap unused subpages to shared zeropage when splitting isolated thp
Date: Mon, 19 Aug 2024 03:30:55 +0100
Message-ID: <20240819023145.2415299-3-usamaarif642@gmail.com>
In-Reply-To: <20240819023145.2415299-1-usamaarif642@gmail.com>
References: <20240819023145.2415299-1-usamaarif642@gmail.com>

From: Yu Zhao <yuzhao@google.com>

Here, "unused" means containing only zeros and inaccessible to
userspace. When splitting an isolated thp under reclaim or migration,
mapping the unused subpages to the shared zeropage saves memory. This is
particularly helpful when the internal fragmentation of a thp is high,
i.e. it has many untouched subpages.

This is also a prerequisite for the THP low-utilization shrinker
introduced in later patches, where underutilized THPs are split and
their zero-filled pages freed, saving memory.
Signed-off-by: Yu Zhao <yuzhao@google.com>
Tested-by: Shuang Zhai
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
---
 include/linux/rmap.h |  7 ++++-
 mm/huge_memory.c     |  8 ++---
 mm/migrate.c         | 72 ++++++++++++++++++++++++++++++++++++++------
 mm/migrate_device.c  |  4 +--
 4 files changed, 75 insertions(+), 16 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 0978c64f49d8..07854d1f9ad6 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -745,7 +745,12 @@ int folio_mkclean(struct folio *);
 int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
 		      struct vm_area_struct *vma);
 
-void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked);
+enum rmp_flags {
+	RMP_LOCKED		= 1 << 0,
+	RMP_USE_SHARED_ZEROPAGE	= 1 << 1,
+};
+
+void remove_migration_ptes(struct folio *src, struct folio *dst, int flags);
 
 /*
  * rmap_walk_control: To control rmap traversing for specific needs
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 147655821f09..2d77b5d2291e 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2911,7 +2911,7 @@ bool unmap_huge_pmd_locked(struct vm_area_struct *vma, unsigned long addr,
 	return false;
 }
 
-static void remap_page(struct folio *folio, unsigned long nr)
+static void remap_page(struct folio *folio, unsigned long nr, int flags)
 {
 	int i = 0;
 
@@ -2919,7 +2919,7 @@ static void remap_page(struct folio *folio, unsigned long nr)
 	if (!folio_test_anon(folio))
 		return;
 	for (;;) {
-		remove_migration_ptes(folio, folio, true);
+		remove_migration_ptes(folio, folio, RMP_LOCKED | flags);
 		i += folio_nr_pages(folio);
 		if (i >= nr)
 			break;
@@ -3129,7 +3129,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 
 	if (nr_dropped)
 		shmem_uncharge(folio->mapping->host, nr_dropped);
-	remap_page(folio, nr);
+	remap_page(folio, nr, PageAnon(head) ? RMP_USE_SHARED_ZEROPAGE : 0);
 
 	/*
 	 * set page to its compound_head when split to non order-0 pages, so
@@ -3425,7 +3425,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 		if (mapping)
 			xas_unlock(&xas);
 		local_irq_enable();
-		remap_page(folio, folio_nr_pages(folio));
+		remap_page(folio, folio_nr_pages(folio), 0);
 		ret = -EAGAIN;
 	}
 
diff --git a/mm/migrate.c b/mm/migrate.c
index 66a5f73ebfdf..2d2e65d69427 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -178,13 +178,57 @@ void putback_movable_pages(struct list_head *l)
 	}
 }
 
+static bool try_to_map_unused_to_zeropage(struct page_vma_mapped_walk *pvmw,
+					  struct folio *folio,
+					  unsigned long idx)
+{
+	struct page *page = folio_page(folio, idx);
+	bool contains_data;
+	pte_t newpte;
+	void *addr;
+
+	VM_BUG_ON_PAGE(PageCompound(page), page);
+	VM_BUG_ON_PAGE(!PageAnon(page), page);
+	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	VM_BUG_ON_PAGE(pte_present(*pvmw->pte), page);
+
+	if (PageMlocked(page) || (pvmw->vma->vm_flags & VM_LOCKED) ||
+	    mm_forbids_zeropage(pvmw->vma->vm_mm))
+		return false;
+
+	/*
+	 * The pmd entry mapping the old thp was flushed and the pte mapping
+	 * this subpage has been non present. If the subpage is only zero-filled
+	 * then map it to the shared zeropage.
+	 */
+	addr = kmap_local_page(page);
+	contains_data = memchr_inv(addr, 0, PAGE_SIZE);
+	kunmap_local(addr);
+
+	if (contains_data)
+		return false;
+
+	newpte = pte_mkspecial(pfn_pte(my_zero_pfn(pvmw->address),
+					pvmw->vma->vm_page_prot));
+	set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
+
+	dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio));
+	return true;
+}
+
+struct rmap_walk_arg {
+	struct folio *folio;
+	bool map_unused_to_zeropage;
+};
+
 /*
  * Restore a potential migration pte to a working pte entry
  */
 static bool remove_migration_pte(struct folio *folio,
-		struct vm_area_struct *vma, unsigned long addr, void *old)
+		struct vm_area_struct *vma, unsigned long addr, void *arg)
 {
-	DEFINE_FOLIO_VMA_WALK(pvmw, old, vma, addr, PVMW_SYNC | PVMW_MIGRATION);
+	struct rmap_walk_arg *rmap_walk_arg = arg;
+	DEFINE_FOLIO_VMA_WALK(pvmw, rmap_walk_arg->folio, vma, addr, PVMW_SYNC | PVMW_MIGRATION);
 
 	while (page_vma_mapped_walk(&pvmw)) {
 		rmap_t rmap_flags = RMAP_NONE;
@@ -208,6 +252,9 @@ static bool remove_migration_pte(struct folio *folio,
 			continue;
 		}
 #endif
+		if (rmap_walk_arg->map_unused_to_zeropage &&
+		    try_to_map_unused_to_zeropage(&pvmw, folio, idx))
+			continue;
 
 		folio_get(folio);
 		pte = mk_pte(new, READ_ONCE(vma->vm_page_prot));
@@ -286,14 +333,21 @@ static bool remove_migration_pte(struct folio *folio,
  * Get rid of all migration entries and replace them by
  * references to the indicated page.
  */
-void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked)
+void remove_migration_ptes(struct folio *src, struct folio *dst, int flags)
 {
+	struct rmap_walk_arg rmap_walk_arg = {
+		.folio = src,
+		.map_unused_to_zeropage = flags & RMP_USE_SHARED_ZEROPAGE,
+	};
+
 	struct rmap_walk_control rwc = {
 		.rmap_one = remove_migration_pte,
-		.arg = src,
+		.arg = &rmap_walk_arg,
 	};
 
-	if (locked)
+	VM_BUG_ON_FOLIO((flags & RMP_USE_SHARED_ZEROPAGE) && (src != dst), src);
+
+	if (flags & RMP_LOCKED)
 		rmap_walk_locked(dst, &rwc);
 	else
 		rmap_walk(dst, &rwc);
@@ -903,7 +957,7 @@ static int writeout(struct address_space *mapping, struct folio *folio)
 	 * At this point we know that the migration attempt cannot
 	 * be successful.
 	 */
-	remove_migration_ptes(folio, folio, false);
+	remove_migration_ptes(folio, folio, 0);
 
 	rc = mapping->a_ops->writepage(&folio->page, &wbc);
 
@@ -1067,7 +1121,7 @@ static void migrate_folio_undo_src(struct folio *src,
 				   struct list_head *ret)
 {
 	if (page_was_mapped)
-		remove_migration_ptes(src, src, false);
+		remove_migration_ptes(src, src, 0);
 	/* Drop an anon_vma reference if we took one */
 	if (anon_vma)
 		put_anon_vma(anon_vma);
@@ -1305,7 +1359,7 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
 		lru_add_drain();
 
 	if (old_page_state & PAGE_WAS_MAPPED)
-		remove_migration_ptes(src, dst, false);
+		remove_migration_ptes(src, dst, 0);
 
 out_unlock_both:
 	folio_unlock(dst);
@@ -1443,7 +1497,7 @@ static int unmap_and_move_huge_page(new_folio_t get_new_folio,
 
 	if (page_was_mapped)
 		remove_migration_ptes(src,
-			rc == MIGRATEPAGE_SUCCESS ? dst : src, false);
+			rc == MIGRATEPAGE_SUCCESS ? dst : src, 0);
 
 unlock_put_anon:
 	folio_unlock(dst);
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 6d66dc1c6ffa..8f875636b35b 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -424,7 +424,7 @@ static unsigned long migrate_device_unmap(unsigned long *src_pfns,
 			continue;
 
 		folio = page_folio(page);
-		remove_migration_ptes(folio, folio, false);
+		remove_migration_ptes(folio, folio, 0);
 
 		src_pfns[i] = 0;
 		folio_unlock(folio);
@@ -837,7 +837,7 @@ void migrate_device_finalize(unsigned long *src_pfns,
 		src = page_folio(page);
 		dst = page_folio(newpage);
 
-		remove_migration_ptes(src, dst, false);
+		remove_migration_ptes(src, dst, 0);
 
 		folio_unlock(src);
 		if (is_zone_device_page(page))
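The decision point in try_to_map_unused_to_zeropage() is the
memchr_inv(addr, 0, PAGE_SIZE) test: a subpage is remapped to the shared
zeropage only if every byte in it is zero. A userspace analogue of that
check (illustrative only, not the kernel helper), assuming page points
at a full page of bytes:

	#include <stdbool.h>
	#include <stddef.h>
	#include <string.h>

	/*
	 * True iff the whole buffer is zero-filled: the first byte is
	 * zero and every byte equals the one before it.
	 */
	static bool page_is_zero_filled(const unsigned char *page, size_t size)
	{
		return page[0] == 0 && memcmp(page, page + 1, size - 1) == 0;
	}

On success, the kernel installs a special zeropage PTE via
pte_mkspecial(pfn_pte(my_zero_pfn(...), ...)) and decrements the per-mm
anon counter, after which the physical subpage can be freed.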
From patchwork Mon Aug 19 02:30:56 2024
X-Patchwork-Submitter: Usama Arif
X-Patchwork-Id: 13767729
From: Usama Arif <usamaarif642@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    baohua@kernel.org, ryan.roberts@arm.com, rppt@kernel.org,
    willy@infradead.org, cerasuolodomenico@gmail.com, ryncsn@gmail.com,
    corbet@lwn.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    kernel-team@meta.com, Alexander Zhu, Usama Arif <usamaarif642@gmail.com>
Subject: [PATCH v4 3/6] mm: selftest to verify zero-filled pages are mapped to zeropage
Date: Mon, 19 Aug 2024 03:30:56 +0100
Message-ID: <20240819023145.2415299-4-usamaarif642@gmail.com>
In-Reply-To: <20240819023145.2415299-1-usamaarif642@gmail.com>
References: <20240819023145.2415299-1-usamaarif642@gmail.com>

From: Alexander Zhu

When a THP is split, any subpage that is zero-filled will be mapped to
the shared zeropage, hence saving memory. Add a selftest to verify this
by allocating a zero-filled THP and comparing RssAnon before and after
the split.
Signed-off-by: Alexander Zhu
Acked-by: Rik van Riel <riel@surriel.com>
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
---
 .../selftests/mm/split_huge_page_test.c       | 71 +++++++++++++++++++
 tools/testing/selftests/mm/vm_util.c          | 22 ++++++
 tools/testing/selftests/mm/vm_util.h          |  1 +
 3 files changed, 94 insertions(+)

diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/testing/selftests/mm/split_huge_page_test.c
index e5e8dafc9d94..eb6d1b9fc362 100644
--- a/tools/testing/selftests/mm/split_huge_page_test.c
+++ b/tools/testing/selftests/mm/split_huge_page_test.c
@@ -84,6 +84,76 @@ static void write_debugfs(const char *fmt, ...)
 	write_file(SPLIT_DEBUGFS, input, ret + 1);
 }
 
+static char *allocate_zero_filled_hugepage(size_t len)
+{
+	char *result;
+	size_t i;
+
+	result = memalign(pmd_pagesize, len);
+	if (!result) {
+		printf("Fail to allocate memory\n");
+		exit(EXIT_FAILURE);
+	}
+
+	madvise(result, len, MADV_HUGEPAGE);
+
+	for (i = 0; i < len; i++)
+		result[i] = (char)0;
+
+	return result;
+}
+
+static void verify_rss_anon_split_huge_page_all_zeroes(char *one_page, int nr_hpages, size_t len)
+{
+	unsigned long rss_anon_before, rss_anon_after;
+	size_t i;
+
+	if (!check_huge_anon(one_page, 4, pmd_pagesize)) {
+		printf("No THP is allocated\n");
+		exit(EXIT_FAILURE);
+	}
+
+	rss_anon_before = rss_anon();
+	if (!rss_anon_before) {
+		printf("No RssAnon is allocated before split\n");
+		exit(EXIT_FAILURE);
+	}
+
+	/* split all THPs */
+	write_debugfs(PID_FMT, getpid(), (uint64_t)one_page,
+		      (uint64_t)one_page + len, 0);
+
+	for (i = 0; i < len; i++)
+		if (one_page[i] != (char)0) {
+			printf("%ld byte corrupted\n", i);
+			exit(EXIT_FAILURE);
+		}
+
+	if (!check_huge_anon(one_page, 0, pmd_pagesize)) {
+		printf("Still AnonHugePages not split\n");
+		exit(EXIT_FAILURE);
+	}
+
+	rss_anon_after = rss_anon();
+	if (rss_anon_after >= rss_anon_before) {
+		printf("Incorrect RssAnon value. Before: %ld After: %ld\n",
+		       rss_anon_before, rss_anon_after);
+		exit(EXIT_FAILURE);
+	}
+}
+
+void split_pmd_zero_pages(void)
+{
+	char *one_page;
+	int nr_hpages = 4;
+	size_t len = nr_hpages * pmd_pagesize;
+
+	one_page = allocate_zero_filled_hugepage(len);
+	verify_rss_anon_split_huge_page_all_zeroes(one_page, nr_hpages, len);
+	printf("Split zero filled huge pages successful\n");
+	free(one_page);
+}
+
 void split_pmd_thp(void)
 {
 	char *one_page;
@@ -431,6 +501,7 @@ int main(int argc, char **argv)
 
 	fd_size = 2 * pmd_pagesize;
 
+	split_pmd_zero_pages();
 	split_pmd_thp();
 	split_pte_mapped_thp();
 	split_file_backed_thp();
diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests/mm/vm_util.c
index 5a62530da3b5..d8d0cf04bb57 100644
--- a/tools/testing/selftests/mm/vm_util.c
+++ b/tools/testing/selftests/mm/vm_util.c
@@ -12,6 +12,7 @@
 
 #define PMD_SIZE_FILE_PATH "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size"
 #define SMAP_FILE_PATH "/proc/self/smaps"
+#define STATUS_FILE_PATH "/proc/self/status"
 #define MAX_LINE_LENGTH 500
 
 unsigned int __page_size;
@@ -171,6 +172,27 @@ uint64_t read_pmd_pagesize(void)
 	return strtoul(buf, NULL, 10);
 }
 
+unsigned long rss_anon(void)
+{
+	unsigned long rss_anon = 0;
+	FILE *fp;
+	char buffer[MAX_LINE_LENGTH];
+
+	fp = fopen(STATUS_FILE_PATH, "r");
+	if (!fp)
+		ksft_exit_fail_msg("%s: Failed to open file %s\n", __func__, STATUS_FILE_PATH);
+
+	if (!check_for_pattern(fp, "RssAnon:", buffer, sizeof(buffer)))
+		goto err_out;
+
+	if (sscanf(buffer, "RssAnon:%10lu kB", &rss_anon) != 1)
+		ksft_exit_fail_msg("Reading status error\n");
+
+err_out:
+	fclose(fp);
+	return rss_anon;
+}
+
 bool __check_huge(void *addr, char *pattern, int nr_hpages,
 		  uint64_t hpage_size)
 {
diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests/mm/vm_util.h
index 9007c420d52c..71b75429f4a5 100644
--- a/tools/testing/selftests/mm/vm_util.h
+++ b/tools/testing/selftests/mm/vm_util.h
@@ -39,6 +39,7 @@ unsigned long pagemap_get_pfn(int fd, char *start);
 void clear_softdirty(void);
 bool check_for_pattern(FILE *fp, const char *pattern, char *buf, size_t len);
 uint64_t read_pmd_pagesize(void);
+uint64_t rss_anon(void);
 bool check_huge_anon(void *addr, int nr_hpages, uint64_t hpage_size);
 bool check_huge_file(void *addr, int nr_hpages, uint64_t hpage_size);
 bool check_huge_shmem(void *addr, int nr_hpages, uint64_t hpage_size);
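The selftest's pass criterion is RssAnon in /proc/self/status dropping
after the split, since the zero-filled subpages stop being counted as
resident anonymous memory once they are backed by the shared zeropage.
A standalone sketch of that probe, equivalent in spirit to the
rss_anon() helper added above (assumes Linux procfs; written as a
self-contained program for quick experiments):

	#include <stdio.h>

	static unsigned long rss_anon_kb(void)
	{
		char line[256];
		unsigned long kb = 0;
		FILE *fp = fopen("/proc/self/status", "r");

		if (!fp)
			return 0;
		while (fgets(line, sizeof(line), fp))
			if (sscanf(line, "RssAnon: %lu kB", &kb) == 1)
				break;
		fclose(fp);
		return kb;
	}

	int main(void)
	{
		printf("RssAnon: %lu kB\n", rss_anon_kb());
		return 0;
	}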
From patchwork Mon Aug 19 02:30:57 2024
X-Patchwork-Submitter: Usama Arif
X-Patchwork-Id: 13767730
From: Usama Arif <usamaarif642@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    baohua@kernel.org, ryan.roberts@arm.com, rppt@kernel.org,
    willy@infradead.org, cerasuolodomenico@gmail.com, ryncsn@gmail.com,
    corbet@lwn.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    kernel-team@meta.com, Usama Arif <usamaarif642@gmail.com>
Subject: [PATCH v4 4/6] mm: Introduce a pageflag for partially mapped folios
Date: Mon, 19 Aug 2024 03:30:57 +0100
Message-ID: <20240819023145.2415299-5-usamaarif642@gmail.com>
In-Reply-To: <20240819023145.2415299-1-usamaarif642@gmail.com>
References: <20240819023145.2415299-1-usamaarif642@gmail.com>

Currently, folio->_deferred_list is used to keep track of partially
mapped folios that are going to be split under memory pressure. In the
next patch, all THPs that are faulted in and collapsed by khugepaged
will also be tracked using _deferred_list.

This patch introduces a pageflag to distinguish partially mapped folios
from others on the deferred list at split time in deferred_split_scan.
It is needed because __folio_remove_rmap decrements _mapcount,
_large_mapcount and _entire_mapcount, after which it is no longer
possible to tell partially mapped folios apart in deferred_split_scan.

Even though this introduces an extra flag to track whether the folio is
partially mapped, there is no functional change intended with this
patch; the flag is not useful in this patch itself and only becomes
useful in the next patch, when _deferred_list also holds non-partially
mapped folios.
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
---
 include/linux/huge_mm.h    |  4 ++--
 include/linux/page-flags.h | 11 +++++++++++
 mm/huge_memory.c           | 23 ++++++++++++++-------
 mm/internal.h              |  4 +++-
 mm/memcontrol.c            |  3 ++-
 mm/migrate.c               |  3 ++-
 mm/page_alloc.c            |  5 +++--
 mm/rmap.c                  |  5 +++--
 mm/vmscan.c                |  3 ++-
 9 files changed, 44 insertions(+), 17 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 4c32058cacfe..969f11f360d2 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -321,7 +321,7 @@ static inline int split_huge_page(struct page *page)
 {
 	return split_huge_page_to_list_to_order(page, NULL, 0);
 }
-void deferred_split_folio(struct folio *folio);
+void deferred_split_folio(struct folio *folio, bool partially_mapped);
 
 void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 		unsigned long address, bool freeze, struct folio *folio);
@@ -495,7 +495,7 @@ static inline int split_huge_page(struct page *page)
 {
 	return 0;
 }
-static inline void deferred_split_folio(struct folio *folio) {}
+static inline void deferred_split_folio(struct folio *folio, bool partially_mapped) {}
 #define split_huge_pmd(__vma, __pmd, __address)	\
 	do { } while (0)
 
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index a0a29bd092f8..c3bb0e0da581 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -182,6 +182,7 @@ enum pageflags {
 	/* At least one page in this folio has the hwpoison flag set */
 	PG_has_hwpoisoned = PG_active,
 	PG_large_rmappable = PG_workingset, /* anon or file-backed */
+	PG_partially_mapped = PG_reclaim, /* was identified to be partially mapped */
 };
 
 #define PAGEFLAGS_MASK		((1UL << NR_PAGEFLAGS) - 1)
@@ -861,8 +862,18 @@ static inline void ClearPageCompound(struct page *page)
 	ClearPageHead(page);
 }
 FOLIO_FLAG(large_rmappable, FOLIO_SECOND_PAGE)
+FOLIO_TEST_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
+/*
+ * PG_partially_mapped is protected by deferred_split split_queue_lock,
+ * so its safe to use non-atomic set/clear.
+ */
+__FOLIO_SET_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
+__FOLIO_CLEAR_FLAG(partially_mapped, FOLIO_SECOND_PAGE)
 #else
 FOLIO_FLAG_FALSE(large_rmappable)
+FOLIO_TEST_FLAG_FALSE(partially_mapped)
+__FOLIO_SET_FLAG_NOOP(partially_mapped)
+__FOLIO_CLEAR_FLAG_NOOP(partially_mapped)
 #endif
 
 #define PG_head_mask			((1UL << PG_head))
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2d77b5d2291e..70ee49dfeaad 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3398,6 +3398,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 			 * page_deferred_list.
 			 */
 			list_del_init(&folio->_deferred_list);
+			__folio_clear_partially_mapped(folio);
 		}
 		spin_unlock(&ds_queue->split_queue_lock);
 		if (mapping) {
@@ -3454,11 +3455,13 @@ void __folio_undo_large_rmappable(struct folio *folio)
 	if (!list_empty(&folio->_deferred_list)) {
 		ds_queue->split_queue_len--;
 		list_del_init(&folio->_deferred_list);
+		__folio_clear_partially_mapped(folio);
 	}
 	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
 }
 
-void deferred_split_folio(struct folio *folio)
+/* partially_mapped=false won't clear PG_partially_mapped folio flag */
+void deferred_split_folio(struct folio *folio, bool partially_mapped)
 {
 	struct deferred_split *ds_queue = get_deferred_split_queue(folio);
 #ifdef CONFIG_MEMCG
@@ -3486,14 +3489,19 @@ void deferred_split_folio(struct folio *folio)
 	if (folio_test_swapcache(folio))
 		return;
 
-	if (!list_empty(&folio->_deferred_list))
-		return;
-
 	spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
+	if (partially_mapped) {
+		if (!folio_test_partially_mapped(folio)) {
+			__folio_set_partially_mapped(folio);
+			if (folio_test_pmd_mappable(folio))
+				count_vm_event(THP_DEFERRED_SPLIT_PAGE);
+			count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
+		}
+	} else {
+		/* partially mapped folios cannot become non-partially mapped */
+		VM_WARN_ON_FOLIO(folio_test_partially_mapped(folio), folio);
+	}
 	if (list_empty(&folio->_deferred_list)) {
-		if (folio_test_pmd_mappable(folio))
-			count_vm_event(THP_DEFERRED_SPLIT_PAGE);
-		count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
 		list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
 		ds_queue->split_queue_len++;
 #ifdef CONFIG_MEMCG
@@ -3542,6 +3550,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 		} else {
 			/* We lost race with folio_put() */
 			list_del_init(&folio->_deferred_list);
+			__folio_clear_partially_mapped(folio);
 			ds_queue->split_queue_len--;
 		}
 		if (!--sc->nr_to_scan)
diff --git a/mm/internal.h b/mm/internal.h
index 52f7fc4e8ac3..27cbb5365841 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -662,8 +662,10 @@ static inline void prep_compound_head(struct page *page, unsigned int order)
 	atomic_set(&folio->_entire_mapcount, -1);
 	atomic_set(&folio->_nr_pages_mapped, 0);
 	atomic_set(&folio->_pincount, 0);
-	if (order > 1)
+	if (order > 1) {
 		INIT_LIST_HEAD(&folio->_deferred_list);
+		__folio_clear_partially_mapped(folio);
+	}
 }
 
 static inline void prep_compound_tail(struct page *head, int tail_idx)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index e1ffd2950393..0fd95daecf9a 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4669,7 +4669,8 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
 	VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
 	VM_BUG_ON_FOLIO(folio_order(folio) > 1 &&
 			!folio_test_hugetlb(folio) &&
-			!list_empty(&folio->_deferred_list), folio);
+			!list_empty(&folio->_deferred_list) &&
+			folio_test_partially_mapped(folio), folio);
 
 	/*
 	 * Nobody should be changing or seriously looking at
diff --git a/mm/migrate.c b/mm/migrate.c
index 2d2e65d69427..ef4a732f22b1 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1735,7 +1735,8 @@ static int migrate_pages_batch(struct list_head *from,
 			 * use _deferred_list.
 			 */
 			if (nr_pages > 2 &&
-			    !list_empty(&folio->_deferred_list)) {
+			    !list_empty(&folio->_deferred_list) &&
+			    folio_test_partially_mapped(folio)) {
 				if (!try_split_folio(folio, split_folios, mode)) {
 					nr_failed++;
 					stats->nr_thp_failed += is_thp;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 408ef3d25cf5..a145c550dd2a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -957,8 +957,9 @@ static int free_tail_page_prepare(struct page *head_page, struct page *page)
 		break;
 	case 2:
 		/* the second tail page: deferred_list overlaps ->mapping */
-		if (unlikely(!list_empty(&folio->_deferred_list))) {
-			bad_page(page, "on deferred list");
+		if (unlikely(!list_empty(&folio->_deferred_list) &&
+		    folio_test_partially_mapped(folio))) {
+			bad_page(page, "partially mapped folio on deferred list");
 			goto out;
 		}
 		break;
diff --git a/mm/rmap.c b/mm/rmap.c
index a6b9cd0b2b18..4c330635aa4e 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1578,8 +1578,9 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
 	 * Check partially_mapped first to ensure it is a large folio.
 	 */
 	if (partially_mapped && folio_test_anon(folio) &&
-	    list_empty(&folio->_deferred_list))
-		deferred_split_folio(folio);
+	    !folio_test_partially_mapped(folio))
+		deferred_split_folio(folio, true);
+
 	__folio_mod_stat(folio, -nr, -nr_pmdmapped);
 
 	/*
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 25e43bb3b574..25f4e8403f41 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1233,7 +1233,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 					 * Split partially mapped folios right away.
 					 * We can free the unmapped pages without IO.
 					 */
-					if (data_race(!list_empty(&folio->_deferred_list)) &&
+					if (data_race(!list_empty(&folio->_deferred_list) &&
+					    folio_test_partially_mapped(folio)) &&
 					    split_folio_to_list(folio, folio_list))
 						goto activate_locked;
 				}
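The double-underscore __folio_set_partially_mapped() and
__folio_clear_partially_mapped() helpers are non-atomic by design; as
the added page-flags.h comment notes, every access happens under
ds_queue->split_queue_lock. A userspace analogue of that locking
discipline (illustrative only, a pthread mutex standing in for the
spinlock):

	#include <pthread.h>
	#include <stdbool.h>

	struct entry {
		bool partially_mapped;	/* protected by queue_lock */
	};

	static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;

	static void mark_partially_mapped(struct entry *e)
	{
		pthread_mutex_lock(&queue_lock);
		if (!e->partially_mapped)
			e->partially_mapped = true; /* plain store is safe under the lock */
		pthread_mutex_unlock(&queue_lock);
	}

The same reasoning explains why call sites such as migrate_pages_batch()
and shrink_folio_list() now test both list membership and the flag:
after this series, being on _deferred_list alone no longer implies the
folio is partially mapped.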
permitted sender) smtp.mailfrom=usamaarif642@gmail.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1724034703; a=rsa-sha256; cv=none; b=Q0Z7yZ8niQIW+4gEnViT7PVU5yMXW0VCfw99odeKVI0OGWicdTwgwNPBMJVbmYzHbicStV DTtnwA/Gx2OPlMEGEqFZxhxb/VTiFH7WyzyUmEOOrGSkHK0NGQyY+GKsJxX6JvGC7+TMuv hdeJjqPV8ruIXnI6pCb3z8DKPMxJ0Os= ARC-Authentication-Results: i=1; imf01.hostedemail.com; dkim=pass header.d=gmail.com header.s=20230601 header.b=nOMkfXTX; dmarc=pass (policy=none) header.from=gmail.com; spf=pass (imf01.hostedemail.com: domain of usamaarif642@gmail.com designates 209.85.219.176 as permitted sender) smtp.mailfrom=usamaarif642@gmail.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1724034703; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=BDAWwS723hN3aXuX35eiOFn0OpaDg+7RbZaRdmjXne8=; b=11gz7donH7vzikJt2hqKI4+EawJtUyYrakq+1k8/8OLlnkkmgpHGL1Wyso3zZdK9Wv4FnA d4Ye54ye7SJnHt7gp/BT3mgoJCvMbSe7cVmz3TbDqCnG0Xa9F8C7YfnTooeH4z6Q/iEF5r KCWoXqgTyWwnjfGDKKGcJ4x+MGmF2U0= Received: by mail-yb1-f176.google.com with SMTP id 3f1490d57ef6-e05f25fb96eso3979409276.1 for ; Sun, 18 Aug 2024 19:31:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1724034717; x=1724639517; darn=kvack.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=BDAWwS723hN3aXuX35eiOFn0OpaDg+7RbZaRdmjXne8=; b=nOMkfXTXs/t9bxav4amvLr5JFktjvjLLPSwuiwZDLrKXSZbi815ZmPo4t9q9qdDLMi f2K6LkVIBc2BXuIReimzA7rSUn6WLvgCRZ6yviXkdGrTm5wbMX7Ld3Ej9FjPz+TQ616w weXNkQsoqyHuakTum5FqViRQFdO6oZiHW0unDxWRD6qAMxV3iMfrr9GtZSLzb8u5Z7XW DG/6YIBqYulD+x/ABst/a80O4C81ftwyx8nsvNPBGJzGr6e7A9wav6lZttdDhghIQ0GJ rr+Sga21fsDbNsfjHaDU9m+z5hs1rf0yu8N3NFBMVLg25FclnET9V5MRb7KFx9uPdf2g 1A9A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1724034717; x=1724639517; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=BDAWwS723hN3aXuX35eiOFn0OpaDg+7RbZaRdmjXne8=; b=E96ZT3oQcHZvedwHk4dbpkOCG3TIchDYsuHhuuywAg3L+9WQSr0Z0eIwP/rSaTeW85 TuM0eUaQ1ZeVnEHZuXy0YctMzqAnutj3a7Yh9fiJaji0f9xH4WnNhJIaxHkTeULCsCoQ UH1j34cD6/ua56K8ssUoTnUIbq2+qmIQIXCRDscVMvhDXJCu3VZl+1VL+Fqzs91dHfhe 3OPgTCGJ7/qRsi25hAZrzEazsTaPN162JBc5xtFuxBcnUr+Kyig0elOs9rGy7LUjeC60 5fEpQlkd3OiFDUxdbi/0p59lqILHh1mYVQ8yHKV/O1VAh3Xg/Jp/xp6DHi1VUEd0JoCA DT8g== X-Forwarded-Encrypted: i=1; AJvYcCUKjhRi1mDDyScfZI6u0R8Fl6Ec9FtFyH8HLso6khxX+B6OYHfIOvSZpMQ2vdcaJISP76Hm8+kbNg==@kvack.org X-Gm-Message-State: AOJu0YypoqlH5Fn3orH4z0I05oMobXdIGuqA/LL208KfiCTdt/ZxgTfV P8kOXLwJYTlrH50TOMR+uzp1JyhUYRWX33cI1vHG7t/hRu+hNmsb X-Google-Smtp-Source: AGHT+IHePXgHOJtzLLyUXdkdD/gbE32Y/UIS3Wv6y/4aGanspmXVzMlLaX5LmK42xXQm7XY5FnN/nA== X-Received: by 2002:a05:6902:18c3:b0:e13:d3ec:2b8f with SMTP id 3f1490d57ef6-e13d3ec34a5mr7085223276.52.1724034717319; Sun, 18 Aug 2024 19:31:57 -0700 (PDT) Received: from localhost (fwdproxy-ash-014.fbsv.net. 
From: Usama Arif
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    baohua@kernel.org, ryan.roberts@arm.com, rppt@kernel.org,
    willy@infradead.org, cerasuolodomenico@gmail.com, ryncsn@gmail.com,
    corbet@lwn.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    kernel-team@meta.com, Usama Arif
Subject: [PATCH v4 5/6] mm: split underused THPs
Date: Mon, 19 Aug 2024 03:30:58 +0100
Message-ID: <20240819023145.2415299-6-usamaarif642@gmail.com>
X-Mailer: git-send-email 2.43.5
In-Reply-To: <20240819023145.2415299-1-usamaarif642@gmail.com>
References: <20240819023145.2415299-1-usamaarif642@gmail.com>

This is an attempt to mitigate the issue of running out of memory when
THP is always enabled. At runtime, whenever a THP is faulted in
(__do_huge_pmd_anonymous_page) or collapsed by khugepaged
(collapse_huge_page), it is added to _deferred_list. Whenever memory
reclaim happens, the kernel runs the deferred_split shrinker, which
walks the _deferred_list.

If a folio is partially mapped, the shrinker attempts to split it. If it
is not partially mapped, the shrinker checks whether the THP is
underused, i.e. how many of the base 4K pages of the entire THP are
zero-filled. If this number exceeds a threshold (set via
/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none), the
shrinker attempts to split that THP. At remap time, the zero-filled
pages are then mapped to the shared zeropage, saving memory.
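For illustration, the zero-fill test the shrinker applies can be
mirrored in userspace. The following is a minimal sketch, not kernel
code: EX_PAGE_SIZE, EX_HPAGE_NR and region_underused() are invented for
this example, the constants assume x86-64 2M THPs, and memcmp() against
a zero page stands in for the kernel's memchr_inv():

	#include <stdbool.h>
	#include <stddef.h>
	#include <string.h>

	#define EX_PAGE_SIZE 4096
	#define EX_HPAGE_NR  512	/* 4K subpages per 2M THP (assumed) */

	/* Walk a PMD-sized region in 4K steps, counting all-zero pages. */
	static bool region_underused(const unsigned char *base, size_t max_none)
	{
		static const unsigned char zero_page[EX_PAGE_SIZE]; /* all zeroes */
		size_t zero = 0, filled = 0;

		for (size_t i = 0; i < EX_HPAGE_NR; i++) {
			if (!memcmp(base + i * EX_PAGE_SIZE, zero_page, EX_PAGE_SIZE)) {
				if (++zero > max_none)
					return true;	/* early exit, as in the kernel */
			} else {
				if (++filled >= EX_HPAGE_NR - max_none)
					return false;	/* threshold can no longer be crossed */
			}
		}
		return false;
	}

Both early exits mirror the ones in thp_underused() below: the scan
stops as soon as either the zero-page count or the filled-page count
settles the answer.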
Suggested-by: Rik van Riel
Co-authored-by: Johannes Weiner
Signed-off-by: Usama Arif
---
 Documentation/admin-guide/mm/transhuge.rst |  6 +++
 include/linux/khugepaged.h                 |  1 +
 include/linux/vm_event_item.h              |  1 +
 mm/huge_memory.c                           | 60 +++++++++++++++++++++-
 mm/khugepaged.c                            |  3 +-
 mm/vmstat.c                                |  1 +
 6 files changed, 69 insertions(+), 3 deletions(-)

diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index 058485daf186..40741b892aff 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -447,6 +447,12 @@ thp_deferred_split_page
 	splitting it would free up some memory. Pages on split queue are
 	going to be split under memory pressure.
 
+thp_underused_split_page
+	is incremented when a huge page on the split queue was split
+	because it was underused. A THP is underused if the number of
+	zero pages in the THP is above a certain threshold
+	(/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none).
+
 thp_split_pmd
 	is incremented every time a PMD split into table of PTEs. This
 	can happen, for instance, when application calls mprotect() or
diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index f68865e19b0b..30baae91b225 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -4,6 +4,7 @@
 
 #include <linux/sched/coredump.h> /* MMF_VM_HUGEPAGE */
 
+extern unsigned int khugepaged_max_ptes_none __read_mostly;
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern struct attribute_group khugepaged_attr_group;
diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index aae5c7c5cfb4..aed952d04132 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -105,6 +105,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		THP_SPLIT_PAGE,
 		THP_SPLIT_PAGE_FAILED,
 		THP_DEFERRED_SPLIT_PAGE,
+		THP_UNDERUSED_SPLIT_PAGE,
 		THP_SPLIT_PMD,
 		THP_SCAN_EXCEED_NONE_PTE,
 		THP_SCAN_EXCEED_SWAP_PTE,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 70ee49dfeaad..f5363cf900f9 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1087,6 +1087,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 		update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
 		add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR);
 		mm_inc_nr_ptes(vma->vm_mm);
+		deferred_split_folio(folio, false);
 		spin_unlock(vmf->ptl);
 		count_vm_event(THP_FAULT_ALLOC);
 		count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_ALLOC);
@@ -3526,6 +3527,39 @@ static unsigned long deferred_split_count(struct shrinker *shrink,
 	return READ_ONCE(ds_queue->split_queue_len);
 }
 
+static bool thp_underused(struct folio *folio)
+{
+	int num_zero_pages = 0, num_filled_pages = 0;
+	void *kaddr;
+	int i;
+
+	if (khugepaged_max_ptes_none == HPAGE_PMD_NR - 1)
+		return false;
+
+	for (i = 0; i < folio_nr_pages(folio); i++) {
+		kaddr = kmap_local_folio(folio, i * PAGE_SIZE);
+		if (!memchr_inv(kaddr, 0, PAGE_SIZE)) {
+			num_zero_pages++;
+			if (num_zero_pages > khugepaged_max_ptes_none) {
+				kunmap_local(kaddr);
+				return true;
+			}
+		} else {
+			/*
+			 * Another path for early exit once the number
+			 * of non-zero filled pages exceeds threshold.
+			 */
+			num_filled_pages++;
+			if (num_filled_pages >= HPAGE_PMD_NR - khugepaged_max_ptes_none) {
+				kunmap_local(kaddr);
+				return false;
+			}
+		}
+		kunmap_local(kaddr);
+	}
+	return false;
+}
+
 static unsigned long deferred_split_scan(struct shrinker *shrink,
 					 struct shrink_control *sc)
 {
@@ -3559,13 +3593,35 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
 
 	list_for_each_entry_safe(folio, next, &list, _deferred_list) {
+		bool did_split = false;
+		bool underused = false;
+
+		if (!folio_test_partially_mapped(folio)) {
+			underused = thp_underused(folio);
+			if (!underused)
+				goto next;
+		}
 		if (!folio_trylock(folio))
 			goto next;
-		/* split_huge_page() removes page from list on success */
-		if (!split_folio(folio))
+		if (!split_folio(folio)) {
+			did_split = true;
+			if (underused)
+				count_vm_event(THP_UNDERUSED_SPLIT_PAGE);
 			split++;
+		}
 		folio_unlock(folio);
 next:
+		/*
+		 * split_folio() removes folio from list on success.
+		 * Only add back to the queue if folio is partially mapped.
+		 * If thp_underused returns false, or if split_folio fails
+		 * in the case it was underused, then consider it used and
+		 * don't add it back to split_queue.
+		 */
+		if (!did_split && !folio_test_partially_mapped(folio)) {
+			list_del_init(&folio->_deferred_list);
+			ds_queue->split_queue_len--;
+		}
 		folio_put(folio);
 	}
 
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 6c42062478c1..2e138b22d939 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -85,7 +85,7 @@ static DECLARE_WAIT_QUEUE_HEAD(khugepaged_wait);
  *
  * Note that these are only respected if collapse was initiated by khugepaged.
  */
-static unsigned int khugepaged_max_ptes_none __read_mostly;
+unsigned int khugepaged_max_ptes_none __read_mostly;
 static unsigned int khugepaged_max_ptes_swap __read_mostly;
 static unsigned int khugepaged_max_ptes_shared __read_mostly;
 
@@ -1235,6 +1235,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
 	set_pmd_at(mm, address, pmd, _pmd);
 	update_mmu_cache_pmd(vma, address, pmd);
+	deferred_split_folio(folio, false);
 	spin_unlock(pmd_ptl);
 
 	folio = NULL;
diff --git a/mm/vmstat.c b/mm/vmstat.c
index c3a402ea91f0..6060bb7bbb44 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1384,6 +1384,7 @@ const char * const vmstat_text[] = {
 	"thp_split_page",
 	"thp_split_page_failed",
 	"thp_deferred_split_page",
+	"thp_underused_split_page",
 	"thp_split_pmd",
 	"thp_scan_exceed_none_pte",
 	"thp_scan_exceed_swap_pte",

From patchwork Mon Aug 19 02:30:59 2024
X-Patchwork-Submitter: Usama Arif
X-Patchwork-Id: 13767732
From: Usama Arif
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    baohua@kernel.org, ryan.roberts@arm.com, rppt@kernel.org,
    willy@infradead.org, cerasuolodomenico@gmail.com, ryncsn@gmail.com,
    corbet@lwn.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    kernel-team@meta.com, Usama Arif
Subject: [PATCH v4 6/6] mm: add sysfs entry to disable splitting underused THPs
Date: Mon, 19 Aug 2024 03:30:59 +0100
Message-ID: <20240819023145.2415299-7-usamaarif642@gmail.com>
X-Mailer: git-send-email 2.43.5
In-Reply-To: <20240819023145.2415299-1-usamaarif642@gmail.com>
References: <20240819023145.2415299-1-usamaarif642@gmail.com>

If disabled, THPs faulted in or collapsed will not be added to
_deferred_list, and therefore will not be considered for splitting under
memory pressure if underused.
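As a usage sketch (not part of the patch): the program below reads the
shrink_underused knob this patch adds and the thp_underused_split_page
counter from patch 5/6; on a kernel without the series, the sysfs open
simply fails. Paths are the ones described in this series.

	#include <stdio.h>
	#include <string.h>

	#define KNOB "/sys/kernel/mm/transparent_hugepage/shrink_underused"

	int main(void)
	{
		char line[256];
		FILE *f = fopen(KNOB, "r");

		if (!f) {
			perror(KNOB);	/* kernel predates this patch */
			return 1;
		}
		if (fgets(line, sizeof(line), f))
			printf("shrink_underused: %s", line);
		fclose(f);

		/* counter only exists with patch 5/6 applied */
		f = fopen("/proc/vmstat", "r");
		if (!f) {
			perror("/proc/vmstat");
			return 1;
		}
		while (fgets(line, sizeof(line), f))
			if (!strncmp(line, "thp_underused_split_page", 24))
				fputs(line, stdout);
		fclose(f);
		return 0;
	}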
Signed-off-by: Usama Arif
---
 Documentation/admin-guide/mm/transhuge.rst | 10 +++++++++
 mm/huge_memory.c                           | 26 ++++++++++++++++++++++
 2 files changed, 36 insertions(+)

diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index 40741b892aff..02ae7bc9efbd 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -202,6 +202,16 @@ PMD-mappable transparent hugepage::
 
 	cat /sys/kernel/mm/transparent_hugepage/hpage_pmd_size
 
+All THPs at fault and collapse time will be added to _deferred_list,
+and will therefore be split under memory pressure if they are considered
+"underused". A THP is underused if the number of zero-filled pages in
+the THP is above max_ptes_none (see below). It is possible to disable
+this behaviour by writing 0 to shrink_underused, and enable it by writing
+1 to it::
+
+	echo 0 > /sys/kernel/mm/transparent_hugepage/shrink_underused
+	echo 1 > /sys/kernel/mm/transparent_hugepage/shrink_underused
+
 khugepaged will be automatically started when PMD-sized THP is enabled
 (either of the per-size anon control or the top-level control are set
 to "always" or "madvise"), and it'll be automatically shutdown when
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f5363cf900f9..5d67d3b3c1b2 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -74,6 +74,7 @@ static unsigned long deferred_split_count(struct shrinker *shrink,
 					  struct shrink_control *sc);
 static unsigned long deferred_split_scan(struct shrinker *shrink,
 					 struct shrink_control *sc);
+static bool split_underused_thp = true;
 
 static atomic_t huge_zero_refcount;
 struct folio *huge_zero_folio __read_mostly;
@@ -439,6 +440,27 @@ static ssize_t hpage_pmd_size_show(struct kobject *kobj,
 static struct kobj_attribute hpage_pmd_size_attr =
 	__ATTR_RO(hpage_pmd_size);
 
+static ssize_t split_underused_thp_show(struct kobject *kobj,
+					struct kobj_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%d\n", split_underused_thp);
+}
+
+static ssize_t split_underused_thp_store(struct kobject *kobj,
+					 struct kobj_attribute *attr,
+					 const char *buf, size_t count)
+{
+	int err = kstrtobool(buf, &split_underused_thp);
+
+	if (err < 0)
+		return err;
+
+	return count;
+}
+
+static struct kobj_attribute split_underused_thp_attr = __ATTR(
+	shrink_underused, 0644, split_underused_thp_show, split_underused_thp_store);
+
 static struct attribute *hugepage_attr[] = {
 	&enabled_attr.attr,
 	&defrag_attr.attr,
@@ -447,6 +469,7 @@ static struct attribute *hugepage_attr[] = {
 #ifdef CONFIG_SHMEM
 	&shmem_enabled_attr.attr,
 #endif
+	&split_underused_thp_attr.attr,
 	NULL,
 };
 
@@ -3477,6 +3500,9 @@ void deferred_split_folio(struct folio *folio, bool partially_mapped)
 	if (folio_order(folio) <= 1)
 		return;
 
+	if (!partially_mapped && !split_underused_thp)
+		return;
+
 	/*
 	 * The try_to_unmap() in page reclaim path might reach here too,
 	 * this may cause a race condition to corrupt deferred split queue.