From patchwork Tue Jul 30 12:45:58 2024
X-Patchwork-Submitter: Usama Arif <usamaarif642@gmail.com>
X-Patchwork-Id: 13747379

From: Usama Arif <usamaarif642@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    baohua@kernel.org, ryan.roberts@arm.com, rppt@kernel.org,
    willy@infradead.org, cerasuolodomenico@gmail.com, corbet@lwn.net,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    kernel-team@meta.com, Usama Arif <usamaarif642@gmail.com>
Subject: [PATCH 1/6] Revert "memcg: remove mem_cgroup_uncharge_list()"
Date: Tue, 30 Jul 2024 13:45:58 +0100
Message-ID: <20240730125346.1580150-2-usamaarif642@gmail.com>
In-Reply-To: <20240730125346.1580150-1-usamaarif642@gmail.com>
References: <20240730125346.1580150-1-usamaarif642@gmail.com>
mem_cgroup_uncharge_list() will be needed in a later patch as an
optimization to free zapped tail pages when splitting an isolated thp.

Signed-off-by: Usama Arif <usamaarif642@gmail.com>
---
 include/linux/memcontrol.h | 12 ++++++++++++
 mm/memcontrol.c            | 19 +++++++++++++++++++
 2 files changed, 31 insertions(+)

diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 07eadf7ecbba..cbaf0ea1b217 100644
--- a/include/linux/memcontrol.h
+++ b/include/linux/memcontrol.h
@@ -713,6 +713,14 @@ static inline void mem_cgroup_uncharge(struct folio *folio)
 	__mem_cgroup_uncharge(folio);
 }
 
+void __mem_cgroup_uncharge_list(struct list_head *page_list);
+static inline void mem_cgroup_uncharge_list(struct list_head *page_list)
+{
+	if (mem_cgroup_disabled())
+		return;
+	__mem_cgroup_uncharge_list(page_list);
+}
+
 void __mem_cgroup_uncharge_folios(struct folio_batch *folios);
 static inline void mem_cgroup_uncharge_folios(struct folio_batch *folios)
 {
@@ -1203,6 +1211,10 @@ static inline void mem_cgroup_uncharge(struct folio *folio)
 {
 }
 
+static inline void mem_cgroup_uncharge_list(struct list_head *page_list)
+{
+}
+
 static inline void mem_cgroup_uncharge_folios(struct folio_batch *folios)
 {
 }
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 9b3ef3a70833..f568b9594c2b 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4717,6 +4717,25 @@ void __mem_cgroup_uncharge(struct folio *folio)
 	uncharge_batch(&ug);
 }
 
+/**
+ * __mem_cgroup_uncharge_list - uncharge a list of pages
+ * @page_list: list of pages to uncharge
+ *
+ * Uncharge a list of pages previously charged with
+ * __mem_cgroup_charge().
+ */
+void __mem_cgroup_uncharge_list(struct list_head *page_list)
+{
+	struct uncharge_gather ug;
+	struct folio *folio;
+
+	uncharge_gather_clear(&ug);
+	list_for_each_entry(folio, page_list, lru)
+		uncharge_folio(folio, &ug);
+	if (ug.memcg)
+		uncharge_batch(&ug);
+}
+
 void __mem_cgroup_uncharge_folios(struct folio_batch *folios)
 {
 	struct uncharge_gather ug;
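
For context, the pattern this revert re-enables is gather-then-batch; a
minimal caller sketch (the list-population step is elided, and
pages_to_free is an illustrative local, not part of this patch):

	LIST_HEAD(pages_to_free);

	/* ... move fully-zapped folios onto pages_to_free via folio->lru ... */

	/*
	 * One pass over the list: uncharge_folio() accumulates into the
	 * uncharge_gather, and uncharge_batch() flushes whenever the memcg
	 * changes, so consecutive folios charged to the same memcg are
	 * uncharged with a single set of counter updates. The whole call is
	 * a no-op when memcg is disabled.
	 */
	mem_cgroup_uncharge_list(&pages_to_free);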
From patchwork Tue Jul 30 12:45:59 2024
X-Patchwork-Submitter: Usama Arif <usamaarif642@gmail.com>
X-Patchwork-Id: 13747380
From: Usama Arif <usamaarif642@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    baohua@kernel.org, ryan.roberts@arm.com, rppt@kernel.org,
    willy@infradead.org, cerasuolodomenico@gmail.com, corbet@lwn.net,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    kernel-team@meta.com, Usama Arif <usamaarif642@gmail.com>
Subject: [PATCH 2/6] Revert "mm: remove free_unref_page_list()"
Date: Tue, 30 Jul 2024 13:45:59 +0100
Message-ID: <20240730125346.1580150-3-usamaarif642@gmail.com>
In-Reply-To: <20240730125346.1580150-1-usamaarif642@gmail.com>
References: <20240730125346.1580150-1-usamaarif642@gmail.com>

free_unref_page_list() will be needed in a later patch as an optimization
to free zapped tail pages when splitting an isolated thp.
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
---
 mm/internal.h   |  1 +
 mm/page_alloc.c | 18 ++++++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/mm/internal.h b/mm/internal.h
index 7a3bcc6d95e7..259afe44dc88 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -680,6 +680,7 @@ extern int user_min_free_kbytes;
 
 void free_unref_page(struct page *page, unsigned int order);
 void free_unref_folios(struct folio_batch *fbatch);
+void free_unref_page_list(struct list_head *list);
 
 extern void zone_pcp_reset(struct zone *zone);
 extern void zone_pcp_disable(struct zone *zone);
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index aae00ba3b3bd..38832e6b1e6c 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2774,6 +2774,24 @@ void free_unref_folios(struct folio_batch *folios)
 	folio_batch_reinit(folios);
 }
 
+void free_unref_page_list(struct list_head *list)
+{
+	struct folio_batch fbatch;
+
+	folio_batch_init(&fbatch);
+	while (!list_empty(list)) {
+		struct folio *folio = list_first_entry(list, struct folio, lru);
+
+		list_del(&folio->lru);
+		if (folio_batch_add(&fbatch, folio) > 0)
+			continue;
+		free_unref_folios(&fbatch);
+	}
+
+	if (fbatch.nr)
+		free_unref_folios(&fbatch);
+}
+
 /*
  * split_page takes a non-compound higher-order page, and splits it into
  * n (1<<order) sub-pages: page[0..n]
  */
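
Internally, folio_batch_add() returns the number of slots still free in
the batch, so the "> 0" check above flushes through free_unref_folios()
exactly when the batch fills, with a final flush for any partial batch. A
sketch of how the two restored helpers pair up, as a later patch in this
series does when discarding zapped tail pages (hypothetical caller; list
population elided):

	LIST_HEAD(pages_to_free);

	/* ... collect folios to discard, linked through folio->lru ... */

	mem_cgroup_uncharge_list(&pages_to_free);	/* drop memcg charges */
	free_unref_page_list(&pages_to_free);		/* return pages to the allocator */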
From patchwork Tue Jul 30 12:46:00 2024
X-Patchwork-Submitter: Usama Arif <usamaarif642@gmail.com>
X-Patchwork-Id: 13747381
From: Usama Arif <usamaarif642@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    baohua@kernel.org, ryan.roberts@arm.com, rppt@kernel.org,
    willy@infradead.org, cerasuolodomenico@gmail.com, corbet@lwn.net,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    kernel-team@meta.com, Shuang Zhai, Usama Arif <usamaarif642@gmail.com>
Subject: [PATCH 3/6] mm: free zapped tail pages when splitting isolated thp
Date: Tue, 30 Jul 2024 13:46:00 +0100
Message-ID: <20240730125346.1580150-4-usamaarif642@gmail.com>
In-Reply-To: <20240730125346.1580150-1-usamaarif642@gmail.com>
References: <20240730125346.1580150-1-usamaarif642@gmail.com>

From: Yu Zhao <yuzhao@google.com>

If a tail page has only two references left, one inherited from the
isolation of its head and the other from lru_add_page_tail(), which we
are about to drop, it means this tail page was concurrently zapped. We
can then safely free it, and save page reclaim or migration the trouble
of trying it.
Signed-off-by: Yu Zhao <yuzhao@google.com>
Tested-by: Shuang Zhai
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
---
 mm/huge_memory.c | 26 ++++++++++++++++++++++++++
 1 file changed, 26 insertions(+)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 0167dc27e365..76a3b6a2b796 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2923,6 +2923,8 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 	unsigned int new_nr = 1 << new_order;
 	int order = folio_order(folio);
 	unsigned int nr = 1 << order;
+	LIST_HEAD(pages_to_free);
+	int nr_pages_to_free = 0;
 
 	/* complete memcg works before add pages to LRU */
 	split_page_memcg(head, order, new_order);
@@ -3007,6 +3009,24 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 		if (subpage == page)
 			continue;
 		folio_unlock(new_folio);
 
+		/*
+		 * If a tail page has only two references left, one inherited
+		 * from the isolation of its head and the other from
+		 * lru_add_page_tail() which we are about to drop, it means this
+		 * tail page was concurrently zapped. Then we can safely free it
+		 * and save page reclaim or migration the trouble of trying it.
+		 */
+		if (list && page_ref_freeze(subpage, 2)) {
+			VM_BUG_ON_PAGE(PageLRU(subpage), subpage);
+			VM_BUG_ON_PAGE(PageCompound(subpage), subpage);
+			VM_BUG_ON_PAGE(page_mapped(subpage), subpage);
+
+			ClearPageActive(subpage);
+			ClearPageUnevictable(subpage);
+			list_move(&subpage->lru, &pages_to_free);
+			nr_pages_to_free++;
+			continue;
+		}
 
 		/*
 		 * Subpages may be freed if there wasn't any mapping
 		 * like if add_to_swap() is waiting on a lock
 		 */
 		free_page_and_swap_cache(subpage);
 	}
+
+	if (!nr_pages_to_free)
+		return;
+
+	mem_cgroup_uncharge_list(&pages_to_free);
+	free_unref_page_list(&pages_to_free);
 }
 
 /* Racy check whether the huge page can be split */
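
The safety argument rests on page_ref_freeze(), which atomically replaces
the refcount with zero only when it equals the expected value, and fails
otherwise. An annotated restatement of the new check (the commentary is
added here for clarity; it is not new code):

	/*
	 * Expected references on a concurrently zapped tail page at this
	 * point:
	 *   1. the reference inherited from the isolation of its head, and
	 *   2. the reference from lru_add_page_tail(), about to be dropped.
	 *
	 * page_ref_freeze(subpage, 2) succeeds only if the count is exactly
	 * 2, i.e. no page table entry, GUP pin, or speculative lookup still
	 * holds the page; once frozen, no new reference can be taken, so
	 * the page can go straight to the free path.
	 */
	if (list && page_ref_freeze(subpage, 2)) {
		...
	}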
From patchwork Tue Jul 30 12:46:01 2024
X-Patchwork-Submitter: Usama Arif <usamaarif642@gmail.com>
X-Patchwork-Id: 13747382
From: Usama Arif <usamaarif642@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    baohua@kernel.org, ryan.roberts@arm.com, rppt@kernel.org,
    willy@infradead.org, cerasuolodomenico@gmail.com, corbet@lwn.net,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    kernel-team@meta.com, Shuang Zhai, Usama Arif <usamaarif642@gmail.com>
Subject: [PATCH 4/6] mm: don't remap unused subpages when splitting isolated thp
Date: Tue, 30 Jul 2024 13:46:01 +0100
Message-ID: <20240730125346.1580150-5-usamaarif642@gmail.com>
In-Reply-To: <20240730125346.1580150-1-usamaarif642@gmail.com>
References: <20240730125346.1580150-1-usamaarif642@gmail.com>

From: Yu Zhao <yuzhao@google.com>

Here, being unused means containing only zeros and being inaccessible to
userspace. When splitting an isolated thp under reclaim or migration,
there is no need to remap its unused subpages because they can be faulted
in anew. Not remapping them avoids writeback or copying during reclaim or
migration. This is particularly helpful when the internal fragmentation
of a thp is high, i.e. it has many untouched subpages.

This is also a prerequisite for the THP low-utilization shrinker
introduced in later patches, where underutilized THPs are split and the
zero-filled split pages are freed, saving memory.
Signed-off-by: Yu Zhao <yuzhao@google.com>
Tested-by: Shuang Zhai
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
---
 include/linux/rmap.h |  2 +-
 mm/huge_memory.c     |  8 ++---
 mm/migrate.c         | 73 +++++++++++++++++++++++++++++++++++++++-----
 mm/migrate_device.c  |  4 +--
 4 files changed, 72 insertions(+), 15 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 0978c64f49d8..805ab09057ed 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -745,7 +745,7 @@ int folio_mkclean(struct folio *);
 int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
 		      struct vm_area_struct *vma);
 
-void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked);
+void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked, bool unmap_unused);
 
 /*
  * rmap_walk_control: To control rmap traversing for specific needs
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 76a3b6a2b796..892467d85f3a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2775,7 +2775,7 @@ bool unmap_huge_pmd_locked(struct vm_area_struct *vma, unsigned long addr,
 	return false;
 }
 
-static void remap_page(struct folio *folio, unsigned long nr)
+static void remap_page(struct folio *folio, unsigned long nr, bool unmap_unused)
 {
 	int i = 0;
 
@@ -2783,7 +2783,7 @@ static void remap_page(struct folio *folio, unsigned long nr)
 	if (!folio_test_anon(folio))
 		return;
 	for (;;) {
-		remove_migration_ptes(folio, folio, true);
+		remove_migration_ptes(folio, folio, true, unmap_unused);
 		i += folio_nr_pages(folio);
 		if (i >= nr)
 			break;
@@ -2993,7 +2993,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 
 	if (nr_dropped)
 		shmem_uncharge(folio->mapping->host, nr_dropped);
-	remap_page(folio, nr);
+	remap_page(folio, nr, PageAnon(head));
 
 	/*
 	 * set page to its compound_head when split to non order-0 pages, so
@@ -3286,7 +3286,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 		if (mapping)
 			xas_unlock(&xas);
 		local_irq_enable();
-		remap_page(folio, folio_nr_pages(folio));
+		remap_page(folio, folio_nr_pages(folio), false);
 		ret = -EAGAIN;
 	}
diff --git a/mm/migrate.c b/mm/migrate.c
index b273bac0d5ae..f4f06bdded70 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -177,13 +177,61 @@ void putback_movable_pages(struct list_head *l)
 	}
 }
 
+static bool try_to_unmap_unused(struct page_vma_mapped_walk *pvmw,
+				struct folio *folio,
+				unsigned long idx)
+{
+	struct page *page = folio_page(folio, idx);
+	void *addr;
+	bool dirty;
+	pte_t newpte;
+
+	VM_BUG_ON_PAGE(PageCompound(page), page);
+	VM_BUG_ON_PAGE(!PageAnon(page), page);
+	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	VM_BUG_ON_PAGE(pte_present(*pvmw->pte), page);
+
+	if (PageMlocked(page) || (pvmw->vma->vm_flags & VM_LOCKED))
+		return false;
+
+	/*
+	 * The pmd entry mapping the old thp was flushed and the pte mapping
+	 * this subpage has been non present. Therefore, this subpage is
+	 * inaccessible. We don't need to remap it if it contains only zeros.
+	 */
+	addr = kmap_local_page(page);
+	dirty = memchr_inv(addr, 0, PAGE_SIZE);
+	kunmap_local(addr);
+
+	if (dirty)
+		return false;
+
+	pte_clear_not_present_full(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, false);
+
+	if (userfaultfd_armed(pvmw->vma)) {
+		newpte = pte_mkspecial(pfn_pte(page_to_pfn(ZERO_PAGE(pvmw->address)),
+					       pvmw->vma->vm_page_prot));
+		ptep_clear_flush(pvmw->vma, pvmw->address, pvmw->pte);
+		set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
+	}
+
+	dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio));
+	return true;
+}
+
+struct rmap_walk_arg {
+	struct folio *folio;
+	bool unmap_unused;
+};
+
 /*
  * Restore a potential migration pte to a working pte entry
  */
 static bool remove_migration_pte(struct folio *folio,
-		struct vm_area_struct *vma, unsigned long addr, void *old)
+		struct vm_area_struct *vma, unsigned long addr, void *arg)
 {
-	DEFINE_FOLIO_VMA_WALK(pvmw, old, vma, addr, PVMW_SYNC | PVMW_MIGRATION);
+	struct rmap_walk_arg *rmap_walk_arg = arg;
+	DEFINE_FOLIO_VMA_WALK(pvmw, rmap_walk_arg->folio, vma, addr, PVMW_SYNC | PVMW_MIGRATION);
 
 	while (page_vma_mapped_walk(&pvmw)) {
 		rmap_t rmap_flags = RMAP_NONE;
@@ -207,6 +255,8 @@ static bool remove_migration_pte(struct folio *folio,
 			continue;
 		}
 #endif
+		if (rmap_walk_arg->unmap_unused && try_to_unmap_unused(&pvmw, folio, idx))
+			continue;
 
 		folio_get(folio);
 		pte = mk_pte(new, READ_ONCE(vma->vm_page_prot));
@@ -285,13 +335,20 @@ static bool remove_migration_pte(struct folio *folio,
  * Get rid of all migration entries and replace them by
  * references to the indicated page.
  */
-void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked)
+void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked, bool unmap_unused)
 {
+	struct rmap_walk_arg rmap_walk_arg = {
+		.folio = src,
+		.unmap_unused = unmap_unused,
+	};
+
 	struct rmap_walk_control rwc = {
 		.rmap_one = remove_migration_pte,
-		.arg = src,
+		.arg = &rmap_walk_arg,
 	};
 
+	VM_BUG_ON_FOLIO(unmap_unused && src != dst, src);
+
 	if (locked)
 		rmap_walk_locked(dst, &rwc);
 	else
@@ -904,7 +961,7 @@ static int writeout(struct address_space *mapping, struct folio *folio)
 	 * At this point we know that the migration attempt cannot
 	 * be successful.
 	 */
-	remove_migration_ptes(folio, folio, false);
+	remove_migration_ptes(folio, folio, false, false);
 
 	rc = mapping->a_ops->writepage(&folio->page, &wbc);
 
@@ -1068,7 +1125,7 @@ static void migrate_folio_undo_src(struct folio *src,
 				   struct list_head *ret)
 {
 	if (page_was_mapped)
-		remove_migration_ptes(src, src, false);
+		remove_migration_ptes(src, src, false, false);
 	/* Drop an anon_vma reference if we took one */
 	if (anon_vma)
 		put_anon_vma(anon_vma);
@@ -1306,7 +1363,7 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
 		lru_add_drain();
 
 	if (old_page_state & PAGE_WAS_MAPPED)
-		remove_migration_ptes(src, dst, false);
+		remove_migration_ptes(src, dst, false, false);
 
 out_unlock_both:
 	folio_unlock(dst);
@@ -1444,7 +1501,7 @@ static int unmap_and_move_huge_page(new_folio_t get_new_folio,
 
 	if (page_was_mapped)
 		remove_migration_ptes(src,
-			rc == MIGRATEPAGE_SUCCESS ? dst : src, false);
+			rc == MIGRATEPAGE_SUCCESS ? dst : src, false, false);
 
 unlock_put_anon:
 	folio_unlock(dst);
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 6d66dc1c6ffa..a1630d8e0d95 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -424,7 +424,7 @@ static unsigned long migrate_device_unmap(unsigned long *src_pfns,
 			continue;
 
 		folio = page_folio(page);
-		remove_migration_ptes(folio, folio, false);
+		remove_migration_ptes(folio, folio, false, false);
 
 		src_pfns[i] = 0;
 		folio_unlock(folio);
@@ -837,7 +837,7 @@ void migrate_device_finalize(unsigned long *src_pfns,
 		src = page_folio(page);
 		dst = page_folio(newpage);
 
-		remove_migration_ptes(src, dst, false);
+		remove_migration_ptes(src, dst, false, false);
 
 		folio_unlock(src);
 		if (is_zone_device_page(page))
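
The zero detection in try_to_unmap_unused() is the standard memchr_inv()
idiom, restated in isolation (these are the same calls as in the hunk
above):

	/*
	 * memchr_inv(addr, 0, PAGE_SIZE) returns a pointer to the first
	 * byte that differs from 0, or NULL when the whole page is zero,
	 * so a NULL result (!dirty) means the subpage carries no data.
	 */
	addr = kmap_local_page(page);
	dirty = memchr_inv(addr, 0, PAGE_SIZE);
	kunmap_local(addr);

The userfaultfd branch installs a special zero-page pte rather than
leaving the pte empty; the apparent intent, exercised by the selftests in
the next patch, is that uffd-armed ranges do not take a fresh missing
fault for memory the application still sees as mapped and zero-filled.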
From patchwork Tue Jul 30 12:46:02 2024
X-Patchwork-Submitter: Usama Arif <usamaarif642@gmail.com>
X-Patchwork-Id: 13747383
From: Usama Arif <usamaarif642@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    baohua@kernel.org, ryan.roberts@arm.com, rppt@kernel.org,
    willy@infradead.org, cerasuolodomenico@gmail.com, corbet@lwn.net,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    kernel-team@meta.com, Alexander Zhu, Usama Arif <usamaarif642@gmail.com>
Subject: [PATCH 5/6] mm: add selftests to split_huge_page() to verify unmap/zap of zero pages
Date: Tue, 30 Jul 2024 13:46:02 +0100
Message-ID: <20240730125346.1580150-6-usamaarif642@gmail.com>
In-Reply-To: <20240730125346.1580150-1-usamaarif642@gmail.com>
References: <20240730125346.1580150-1-usamaarif642@gmail.com>

From: Alexander Zhu

Add self tests that check the RssAnon value to make sure zero pages are
not remapped after a split, except in the case of userfaultfd. Also
include a self test for the userfaultfd use case.
Signed-off-by: Alexander Zhu
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
Acked-by: Rik van Riel <riel@surriel.com>
---
 .../selftests/mm/split_huge_page_test.c       | 113 ++++++++++++++++++
 tools/testing/selftests/mm/vm_util.c          |  22 ++++
 tools/testing/selftests/mm/vm_util.h          |   1 +
 3 files changed, 136 insertions(+)

diff --git a/tools/testing/selftests/mm/split_huge_page_test.c b/tools/testing/selftests/mm/split_huge_page_test.c
index e5e8dafc9d94..da271ad6ff11 100644
--- a/tools/testing/selftests/mm/split_huge_page_test.c
+++ b/tools/testing/selftests/mm/split_huge_page_test.c
@@ -17,6 +17,8 @@
 #include <malloc.h>
 #include <stdbool.h>
 #include <time.h>
+#include <sys/syscall.h>
+#include <linux/userfaultfd.h>
 #include "vm_util.h"
 #include "../kselftest.h"
 
@@ -84,6 +86,115 @@ static void write_debugfs(const char *fmt, ...)
 	write_file(SPLIT_DEBUGFS, input, ret + 1);
 }
 
+static char *allocate_zero_filled_hugepage(size_t len)
+{
+	char *result;
+	size_t i;
+
+	result = memalign(pmd_pagesize, len);
+	if (!result) {
+		printf("Fail to allocate memory\n");
+		exit(EXIT_FAILURE);
+	}
+
+	madvise(result, len, MADV_HUGEPAGE);
+
+	for (i = 0; i < len; i++)
+		result[i] = (char)0;
+
+	return result;
+}
+
+static void verify_rss_anon_split_huge_page_all_zeroes(char *one_page, int nr_hpages, size_t len)
+{
+	uint64_t rss_anon_before, rss_anon_after;
+	size_t i;
+
+	if (!check_huge_anon(one_page, 4, pmd_pagesize)) {
+		printf("No THP is allocated\n");
+		exit(EXIT_FAILURE);
+	}
+
+	rss_anon_before = rss_anon();
+	if (!rss_anon_before) {
+		printf("No RssAnon is allocated before split\n");
+		exit(EXIT_FAILURE);
+	}
+
+	/* split all THPs */
+	write_debugfs(PID_FMT, getpid(), (uint64_t)one_page,
+		      (uint64_t)one_page + len, 0);
+
+	for (i = 0; i < len; i++)
+		if (one_page[i] != (char)0) {
+			printf("%ld byte corrupted\n", i);
+			exit(EXIT_FAILURE);
+		}
+
+	if (!check_huge_anon(one_page, 0, pmd_pagesize)) {
+		printf("Still AnonHugePages not split\n");
+		exit(EXIT_FAILURE);
+	}
+
+	rss_anon_after = rss_anon();
+	if (rss_anon_after >= rss_anon_before) {
+		printf("Incorrect RssAnon value. Before: %ld After: %ld\n",
+		       rss_anon_before, rss_anon_after);
+		exit(EXIT_FAILURE);
+	}
+}
+
+void split_pmd_zero_pages(void)
+{
+	char *one_page;
+	int nr_hpages = 4;
+	size_t len = nr_hpages * pmd_pagesize;
+
+	one_page = allocate_zero_filled_hugepage(len);
+	verify_rss_anon_split_huge_page_all_zeroes(one_page, nr_hpages, len);
+	printf("Split zero filled huge pages successful\n");
+	free(one_page);
+}
+
+void split_pmd_zero_pages_uffd(void)
+{
+	char *one_page;
+	int nr_hpages = 4;
+	size_t len = nr_hpages * pmd_pagesize;
+	long uffd; /* userfaultfd file descriptor */
+	struct uffdio_api uffdio_api;
+	struct uffdio_register uffdio_register;
+
+	/* Create and enable userfaultfd object. */
+	uffd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
+	if (uffd == -1) {
+		perror("userfaultfd");
+		exit(1);
+	}
+
+	uffdio_api.api = UFFD_API;
+	uffdio_api.features = 0;
+	if (ioctl(uffd, UFFDIO_API, &uffdio_api) == -1) {
+		perror("ioctl-UFFDIO_API");
+		exit(1);
+	}
+
+	one_page = allocate_zero_filled_hugepage(len);
+
+	uffdio_register.range.start = (unsigned long)one_page;
+	uffdio_register.range.len = len;
+	uffdio_register.mode = UFFDIO_REGISTER_MODE_WP;
+	if (ioctl(uffd, UFFDIO_REGISTER, &uffdio_register) == -1) {
+		perror("ioctl-UFFDIO_REGISTER");
+		exit(1);
+	}
+
+	verify_rss_anon_split_huge_page_all_zeroes(one_page, nr_hpages, len);
+	printf("Split zero filled huge pages with uffd successful\n");
+	free(one_page);
+}
+
 void split_pmd_thp(void)
 {
 	char *one_page;
@@ -431,6 +542,8 @@ int main(int argc, char **argv)
 
 	fd_size = 2 * pmd_pagesize;
 
+	split_pmd_zero_pages();
+	split_pmd_zero_pages_uffd();
 	split_pmd_thp();
 	split_pte_mapped_thp();
 	split_file_backed_thp();
diff --git a/tools/testing/selftests/mm/vm_util.c b/tools/testing/selftests/mm/vm_util.c
index 5a62530da3b5..7b7e763ba8e3 100644
--- a/tools/testing/selftests/mm/vm_util.c
+++ b/tools/testing/selftests/mm/vm_util.c
@@ -12,6 +12,7 @@
 
 #define PMD_SIZE_FILE_PATH "/sys/kernel/mm/transparent_hugepage/hpage_pmd_size"
 #define SMAP_FILE_PATH "/proc/self/smaps"
+#define STATUS_FILE_PATH "/proc/self/status"
 #define MAX_LINE_LENGTH 500
 
 unsigned int __page_size;
@@ -171,6 +172,27 @@ uint64_t read_pmd_pagesize(void)
 	return strtoul(buf, NULL, 10);
 }
 
+uint64_t rss_anon(void)
+{
+	uint64_t rss_anon = 0;
+	FILE *fp;
+	char buffer[MAX_LINE_LENGTH];
+
+	fp = fopen(STATUS_FILE_PATH, "r");
+	if (!fp)
+		ksft_exit_fail_msg("%s: Failed to open file %s\n", __func__, STATUS_FILE_PATH);
+
+	if (!check_for_pattern(fp, "RssAnon:", buffer, sizeof(buffer)))
+		goto err_out;
+
+	if (sscanf(buffer, "RssAnon:%10ld kB", &rss_anon) != 1)
+		ksft_exit_fail_msg("Reading status error\n");
+
+err_out:
+	fclose(fp);
+	return rss_anon;
+}
+
 bool __check_huge(void *addr, char *pattern, int nr_hpages,
 		  uint64_t hpage_size)
 {
diff --git a/tools/testing/selftests/mm/vm_util.h b/tools/testing/selftests/mm/vm_util.h
index 9007c420d52c..71b75429f4a5 100644
--- a/tools/testing/selftests/mm/vm_util.h
+++ b/tools/testing/selftests/mm/vm_util.h
@@ -39,6 +39,7 @@ unsigned long pagemap_get_pfn(int fd, char *start);
 void clear_softdirty(void);
 bool check_for_pattern(FILE *fp, const char *pattern, char *buf, size_t len);
 uint64_t read_pmd_pagesize(void);
+uint64_t rss_anon(void);
 bool check_huge_anon(void *addr, int nr_hpages, uint64_t hpage_size);
 bool check_huge_file(void *addr, int nr_hpages, uint64_t hpage_size);
 bool check_huge_shmem(void *addr, int nr_hpages, uint64_t hpage_size);
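
To run the new cases, build and execute the test like the other mm
selftests; the commands below assume the usual kselftest layout, and the
success lines come from the printf()s added above:

	$ cd tools/testing/selftests/mm
	$ make split_huge_page_test
	$ sudo ./split_huge_page_test
	Split zero filled huge pages successful
	Split zero filled huge pages with uffd successful
	...

The RssAnon comparison works because splitting a fully zero-filled THP
now frees (or zero-page-maps) every subpage, so anonymous RSS must
strictly decrease across the split; if the zero subpages were still
remapped read-write, RssAnon would stay the same and the test would fail.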
From patchwork Tue Jul 30 12:46:03 2024
X-Patchwork-Submitter: Usama Arif
X-Patchwork-Id: 13747384
From: Usama Arif
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev, roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com, baohua@kernel.org, ryan.roberts@arm.com, rppt@kernel.org, willy@infradead.org, cerasuolodomenico@gmail.com, corbet@lwn.net, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, kernel-team@meta.com, Usama Arif
Subject: [PATCH 6/6] mm: split underutilized THPs
Date: Tue, 30 Jul 2024 13:46:03 +0100
Message-ID: <20240730125346.1580150-7-usamaarif642@gmail.com>
In-Reply-To: <20240730125346.1580150-1-usamaarif642@gmail.com>
References: <20240730125346.1580150-1-usamaarif642@gmail.com>
This is an attempt to mitigate the issue of running out of memory when THP is always enabled.
At runtime, whenever a THP is faulted in (__do_huge_pmd_anonymous_page) or collapsed by khugepaged (collapse_huge_page), it is added to _deferred_list. Whenever memory reclaim happens in Linux, the kernel runs the deferred_split shrinker, which walks the _deferred_list. If a folio was partially mapped, the shrinker attempts to split it.

A new boolean is added to distinguish partially mapped folios from the others on the deferred_list at split time in deferred_split_scan. It is needed because __folio_remove_rmap decrements the folio's mapcount counters, so without the boolean there would be no way to tell partially mapped folios apart in deferred_split_scan.

If folio->_partially_mapped is not set, the shrinker checks whether the THP is underutilized, i.e. how many of the base 4K pages of the entire THP are zero-filled. If this number exceeds a certain threshold (decided by /sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none), the shrinker attempts to split that THP. At remap time, the pages that were zero-filled are then not remapped, saving memory.

Suggested-by: Rik van Riel
Co-authored-by: Johannes Weiner
Signed-off-by: Usama Arif
---
 Documentation/admin-guide/mm/transhuge.rst |   6 ++
 include/linux/huge_mm.h                    |   4 +-
 include/linux/khugepaged.h                 |   1 +
 include/linux/mm_types.h                   |   2 +
 include/linux/vm_event_item.h              |   1 +
 mm/huge_memory.c                           | 118 ++++++++++++++++++---
 mm/hugetlb.c                               |   1 +
 mm/internal.h                              |   4 +-
 mm/khugepaged.c                            |   3 +-
 mm/memcontrol.c                            |   3 +-
 mm/migrate.c                               |   3 +-
 mm/rmap.c                                  |   2 +-
 mm/vmscan.c                                |   3 +-
 mm/vmstat.c                                |   1 +
 14 files changed, 130 insertions(+), 22 deletions(-)

diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index 058485daf186..24eec1c03ad8 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -447,6 +447,12 @@ thp_deferred_split_page
 	splitting it would free up some memory. Pages on split queue are
 	going to be split under memory pressure.
 
+thp_underutilized_split_page
+	is incremented when a huge page on the split queue was split
+	because it was underutilized. A THP is underutilized if the
+	number of zero pages in the THP is above a certain threshold
+	(/sys/kernel/mm/transparent_hugepage/khugepaged/max_ptes_none).
+
 thp_split_pmd
 	is incremented every time a PMD split into table of PTEs.
 	This can happen, for instance, when application calls mprotect() or
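
To make the threshold concrete, here is a rough userspace model of the underutilization check, not kernel code: memcmp() against a zero page stands in for the kernel's memchr_inv(), and nr_subpages/max_ptes_none are parameters (512 and the khugepaged tunable, respectively, for a 2M THP with 4K pages).

/*
 * Illustrative userspace model of the check, not part of this patch.
 */
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SUBPAGE_SIZE 4096

static bool is_zero_filled(const char *subpage)
{
	/* static => zero-initialized reference page */
	static const char zeroes[SUBPAGE_SIZE];

	return memcmp(subpage, zeroes, SUBPAGE_SIZE) == 0;
}

static bool buf_underutilized(const char *buf, int nr_subpages, int max_ptes_none)
{
	int num_zero = 0;

	/* A threshold of nr_subpages - 1 effectively disables the check */
	if (max_ptes_none == nr_subpages - 1)
		return false;

	for (int i = 0; i < nr_subpages; i++) {
		if (is_zero_filled(buf + (size_t)i * SUBPAGE_SIZE) &&
		    ++num_zero > max_ptes_none)
			return true;
	}
	return false;
}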
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index e25d9ebfdf89..00af84aa88ea 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -321,7 +321,7 @@ static inline int split_huge_page(struct page *page)
 {
 	return split_huge_page_to_list_to_order(page, NULL, 0);
 }
-void deferred_split_folio(struct folio *folio);
+void deferred_split_folio(struct folio *folio, bool partially_mapped);
 
 void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 		unsigned long address, bool freeze, struct folio *folio);
@@ -484,7 +484,7 @@ static inline int split_huge_page(struct page *page)
 {
 	return 0;
 }
-static inline void deferred_split_folio(struct folio *folio) {}
+static inline void deferred_split_folio(struct folio *folio, bool partially_mapped) {}
 #define split_huge_pmd(__vma, __pmd, __address)	\
 	do { } while (0)
diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index f68865e19b0b..30baae91b225 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -4,6 +4,7 @@
 
 #include <linux/sched/coredump.h> /* MMF_VM_HUGEPAGE */
 
+extern unsigned int khugepaged_max_ptes_none __read_mostly;
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern struct attribute_group khugepaged_attr_group;
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 485424979254..443026cf763e 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -311,6 +311,7 @@ typedef struct {
 * @_hugetlb_cgroup_rsvd: Do not use directly, use accessor in hugetlb_cgroup.h.
 * @_hugetlb_hwpoison: Do not use directly, call raw_hwp_list_head().
 * @_deferred_list: Folios to be split under memory pressure.
+ * @_partially_mapped: Folio was partially mapped.
 * @_unused_slab_obj_exts: Placeholder to match obj_exts in struct slab.
 *
 * A folio is a physically, virtually and logically contiguous set
@@ -393,6 +394,7 @@ struct folio {
 			unsigned long _head_2a;
 	/* public: */
 			struct list_head _deferred_list;
+			bool _partially_mapped;
 	/* private: the union with struct page is transitional */
 		};
 		struct page __page_2;
diff --git a/include/linux/vm_event_item.h b/include/linux/vm_event_item.h
index aae5c7c5cfb4..bf1470a7a737 100644
--- a/include/linux/vm_event_item.h
+++ b/include/linux/vm_event_item.h
@@ -105,6 +105,7 @@ enum vm_event_item { PGPGIN, PGPGOUT, PSWPIN, PSWPOUT,
 		THP_SPLIT_PAGE,
 		THP_SPLIT_PAGE_FAILED,
 		THP_DEFERRED_SPLIT_PAGE,
+		THP_UNDERUTILIZED_SPLIT_PAGE,
 		THP_SPLIT_PMD,
 		THP_SCAN_EXCEED_NONE_PTE,
 		THP_SCAN_EXCEED_SWAP_PTE,
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 892467d85f3a..3305e6d0b90e 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -73,6 +73,7 @@ static unsigned long deferred_split_count(struct shrinker *shrink,
 					  struct shrink_control *sc);
 static unsigned long deferred_split_scan(struct shrinker *shrink,
 					 struct shrink_control *sc);
+static bool split_underutilized_thp = true;
 
 static atomic_t huge_zero_refcount;
 struct folio *huge_zero_folio __read_mostly;
@@ -438,6 +439,27 @@ static ssize_t hpage_pmd_size_show(struct kobject *kobj,
 static struct kobj_attribute hpage_pmd_size_attr =
 	__ATTR_RO(hpage_pmd_size);
 
+static ssize_t split_underutilized_thp_show(struct kobject *kobj,
+				  struct kobj_attribute *attr, char *buf)
+{
+	return sysfs_emit(buf, "%d\n", split_underutilized_thp);
+}
+
+static ssize_t split_underutilized_thp_store(struct kobject *kobj,
+				   struct kobj_attribute *attr,
+				   const char *buf, size_t count)
+{
+	int err = kstrtobool(buf, &split_underutilized_thp);
+
+	if (err < 0)
+		return err;
+
+	return count;
+}
+
+static struct kobj_attribute split_underutilized_thp_attr = __ATTR(
+	thp_low_util_shrinker, 0644, split_underutilized_thp_show, split_underutilized_thp_store);
+
 static struct attribute *hugepage_attr[] = {
 	&enabled_attr.attr,
 	&defrag_attr.attr,
@@ -446,6 +468,7 @@ static struct attribute *hugepage_attr[] = {
 #ifdef CONFIG_SHMEM
 	&shmem_enabled_attr.attr,
 #endif
+	&split_underutilized_thp_attr.attr,
 	NULL,
 };
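
Since the attribute is registered in hugepage_attr[], it surfaces as /sys/kernel/mm/transparent_hugepage/thp_low_util_shrinker, and writes go through kstrtobool(). A minimal userspace sketch for flipping it, illustrative only and with error handling trimmed:

/*
 * Illustrative only: toggle the low-utilization shrinker via the sysfs
 * attribute registered above.
 */
#include <stdio.h>

#define KNOB "/sys/kernel/mm/transparent_hugepage/thp_low_util_shrinker"

static int set_low_util_shrinker(int enable)
{
	FILE *fp = fopen(KNOB, "w");

	if (!fp)
		return -1;
	fprintf(fp, "%d\n", enable);	/* "0" or "1", parsed by kstrtobool() */
	return fclose(fp);
}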
@@ -1002,6 +1025,7 @@ static vm_fault_t __do_huge_pmd_anonymous_page(struct vm_fault *vmf,
 		update_mmu_cache_pmd(vma, vmf->address, vmf->pmd);
 		add_mm_counter(vma->vm_mm, MM_ANONPAGES, HPAGE_PMD_NR);
 		mm_inc_nr_ptes(vma->vm_mm);
+		deferred_split_folio(folio, false);
 		spin_unlock(vmf->ptl);
 		count_vm_event(THP_FAULT_ALLOC);
 		count_mthp_stat(HPAGE_PMD_ORDER, MTHP_STAT_ANON_FAULT_ALLOC);
@@ -3259,6 +3283,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 		 * page_deferred_list.
 		 */
 		list_del_init(&folio->_deferred_list);
+		folio->_partially_mapped = false;
 	}
 	spin_unlock(&ds_queue->split_queue_lock);
 	if (mapping) {
@@ -3315,11 +3340,12 @@ void __folio_undo_large_rmappable(struct folio *folio)
 	if (!list_empty(&folio->_deferred_list)) {
 		ds_queue->split_queue_len--;
 		list_del_init(&folio->_deferred_list);
+		folio->_partially_mapped = false;
 	}
 	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
 }
 
-void deferred_split_folio(struct folio *folio)
+void deferred_split_folio(struct folio *folio, bool partially_mapped)
 {
 	struct deferred_split *ds_queue = get_deferred_split_queue(folio);
 #ifdef CONFIG_MEMCG
@@ -3334,6 +3360,9 @@ void deferred_split_folio(struct folio *folio)
 	if (folio_order(folio) <= 1)
 		return;
 
+	if (!partially_mapped && !split_underutilized_thp)
+		return;
+
 	/*
 	 * The try_to_unmap() in page reclaim path might reach here too,
 	 * this may cause a race condition to corrupt deferred split queue.
@@ -3347,14 +3376,14 @@ void deferred_split_folio(struct folio *folio)
 	if (folio_test_swapcache(folio))
 		return;
 
-	if (!list_empty(&folio->_deferred_list))
-		return;
-
 	spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
+	folio->_partially_mapped = partially_mapped;
 	if (list_empty(&folio->_deferred_list)) {
-		if (folio_test_pmd_mappable(folio))
-			count_vm_event(THP_DEFERRED_SPLIT_PAGE);
-		count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
+		if (partially_mapped) {
+			if (folio_test_pmd_mappable(folio))
+				count_vm_event(THP_DEFERRED_SPLIT_PAGE);
+			count_mthp_stat(folio_order(folio), MTHP_STAT_SPLIT_DEFERRED);
+		}
 		list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
 		ds_queue->split_queue_len++;
 #ifdef CONFIG_MEMCG
@@ -3379,6 +3408,39 @@ static unsigned long deferred_split_count(struct shrinker *shrink,
 	return READ_ONCE(ds_queue->split_queue_len);
 }
 
+static bool thp_underutilized(struct folio *folio)
+{
+	int num_zero_pages = 0, num_filled_pages = 0;
+	void *kaddr;
+	int i;
+
+	if (khugepaged_max_ptes_none == HPAGE_PMD_NR - 1)
+		return false;
+
+	for (i = 0; i < folio_nr_pages(folio); i++) {
+		kaddr = kmap_local_folio(folio, i * PAGE_SIZE);
+		if (memchr_inv(kaddr, 0, PAGE_SIZE) == NULL) {
+			num_zero_pages++;
+			if (num_zero_pages > khugepaged_max_ptes_none) {
+				kunmap_local(kaddr);
+				return true;
+			}
+		} else {
+			/*
+			 * Another path for early exit once the number
+			 * of non-zero filled pages exceeds threshold.
+			 */
+			num_filled_pages++;
+			if (num_filled_pages >= HPAGE_PMD_NR - khugepaged_max_ptes_none) {
+				kunmap_local(kaddr);
+				return false;
+			}
+		}
+		kunmap_local(kaddr);
+	}
+	return false;
+}
+
 static unsigned long deferred_split_scan(struct shrinker *shrink,
 					 struct shrink_control *sc)
 {
@@ -3403,6 +3465,7 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 		} else {
 			/* We lost race with folio_put() */
 			list_del_init(&folio->_deferred_list);
+			folio->_partially_mapped = false;
 			ds_queue->split_queue_len--;
 		}
 		if (!--sc->nr_to_scan)
@@ -3411,18 +3474,45 @@ static unsigned long deferred_split_scan(struct shrinker *shrink,
 	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
 
 	list_for_each_entry_safe(folio, next, &list, _deferred_list) {
+		bool did_split = false;
+		bool underutilized = false;
+
+		if (folio->_partially_mapped)
+			goto split;
+		underutilized = thp_underutilized(folio);
+		if (underutilized)
+			goto split;
+		continue;
+split:
 		if (!folio_trylock(folio))
-			goto next;
-		/* split_huge_page() removes page from list on success */
-		if (!split_folio(folio))
-			split++;
+			continue;
+		did_split = !split_folio(folio);
 		folio_unlock(folio);
-next:
-		folio_put(folio);
+		if (did_split) {
+			/* Splitting removed folio from the list, drop reference here */
+			folio_put(folio);
+			if (underutilized)
+				count_vm_event(THP_UNDERUTILIZED_SPLIT_PAGE);
+			split++;
+		}
 	}
 
 	spin_lock_irqsave(&ds_queue->split_queue_lock, flags);
-	list_splice_tail(&list, &ds_queue->split_queue);
+	/*
+	 * Only add back to the queue if folio->_partially_mapped is set.
+	 * If thp_underutilized returns false, or if split_folio fails in
+	 * the case it was underutilized, then consider it used and don't
+	 * add it back to split_queue.
+	 */
+	list_for_each_entry_safe(folio, next, &list, _deferred_list) {
+		if (folio->_partially_mapped)
+			list_move(&folio->_deferred_list, &ds_queue->split_queue);
+		else {
+			list_del_init(&folio->_deferred_list);
+			ds_queue->split_queue_len--;
+		}
+		folio_put(folio);
+	}
 	spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags);
 
 	/*
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 5a32157ca309..df2da47d0637 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -1758,6 +1758,7 @@ static void __update_and_free_hugetlb_folio(struct hstate *h,
 		free_gigantic_folio(folio, huge_page_order(h));
 	} else {
 		INIT_LIST_HEAD(&folio->_deferred_list);
+		folio->_partially_mapped = false;
 		folio_put(folio);
 	}
 }
diff --git a/mm/internal.h b/mm/internal.h
index 259afe44dc88..8fc072cc3023 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -657,8 +657,10 @@ static inline void prep_compound_head(struct page *page, unsigned int order)
 	atomic_set(&folio->_entire_mapcount, -1);
 	atomic_set(&folio->_nr_pages_mapped, 0);
 	atomic_set(&folio->_pincount, 0);
-	if (order > 1)
+	if (order > 1) {
 		INIT_LIST_HEAD(&folio->_deferred_list);
+		folio->_partially_mapped = false;
+	}
 }
 
 static inline void prep_compound_tail(struct page *head, int tail_idx)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index f3b3db104615..5a434fdbc1ef 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -85,7 +85,7 @@ static DECLARE_WAIT_QUEUE_HEAD(khugepaged_wait);
 *
 * Note that these are only respected if collapse was initiated by khugepaged.
 */
-static unsigned int khugepaged_max_ptes_none __read_mostly;
+unsigned int khugepaged_max_ptes_none __read_mostly;
 static unsigned int khugepaged_max_ptes_swap __read_mostly;
 static unsigned int khugepaged_max_ptes_shared __read_mostly;
 
@@ -1235,6 +1235,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	pgtable_trans_huge_deposit(mm, pmd, pgtable);
 	set_pmd_at(mm, address, pmd, _pmd);
 	update_mmu_cache_pmd(vma, address, pmd);
+	deferred_split_folio(folio, false);
 	spin_unlock(pmd_ptl);
 
 	folio = NULL;
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index f568b9594c2b..2ee61d619d86 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4651,7 +4651,8 @@ static void uncharge_folio(struct folio *folio, struct uncharge_gather *ug)
 	VM_BUG_ON_FOLIO(folio_test_lru(folio), folio);
 	VM_BUG_ON_FOLIO(folio_order(folio) > 1 &&
 			!folio_test_hugetlb(folio) &&
-			!list_empty(&folio->_deferred_list), folio);
+			!list_empty(&folio->_deferred_list) &&
+			folio->_partially_mapped, folio);
 
 	/*
 	 * Nobody should be changing or seriously looking at
diff --git a/mm/migrate.c b/mm/migrate.c
index f4f06bdded70..2731ac20ff33 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1734,7 +1734,8 @@ static int migrate_pages_batch(struct list_head *from,
 			 * use _deferred_list.
 			 */
 			if (nr_pages > 2 &&
-			    !list_empty(&folio->_deferred_list)) {
+			    !list_empty(&folio->_deferred_list) &&
+			    folio->_partially_mapped) {
 				if (try_split_folio(folio, split_folios) == 0) {
 					nr_failed++;
 					stats->nr_thp_failed += is_thp;
diff --git a/mm/rmap.c b/mm/rmap.c
index 2630bde38640..1b5418121965 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1582,7 +1582,7 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
 	 */
 	if (folio_test_anon(folio) && partially_mapped &&
 	    list_empty(&folio->_deferred_list))
-		deferred_split_folio(folio);
+		deferred_split_folio(folio, true);
 
 	__folio_mod_stat(folio, -nr, -nr_pmdmapped);
diff --git a/mm/vmscan.c b/mm/vmscan.c
index c89d0551655e..1bee9b1262f6 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1233,7 +1233,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
 					 * Split partially mapped folios right away.
 					 * We can free the unmapped pages without IO.
 					 */
-					if (data_race(!list_empty(&folio->_deferred_list)) &&
+					if (data_race(!list_empty(&folio->_deferred_list) &&
+					    folio->_partially_mapped) &&
 					    split_folio_to_list(folio, folio_list))
 						goto activate_locked;
 				}
diff --git a/mm/vmstat.c b/mm/vmstat.c
index 5082431dad28..525fad4a1d6d 100644
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1367,6 +1367,7 @@ const char * const vmstat_text[] = {
 	"thp_split_page",
 	"thp_split_page_failed",
 	"thp_deferred_split_page",
+	"thp_underutilized_split_page",
 	"thp_split_pmd",
 	"thp_scan_exceed_none_pte",
 	"thp_scan_exceed_swap_pte",
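
For completeness, an illustrative userspace reader (not part of the series) for the new counter, which the vmstat hunk above exposes as thp_underutilized_split_page in /proc/vmstat. Sampling it before and after inducing memory pressure shows whether the shrinker split any underutilized THPs:

/*
 * Illustrative only: read the thp_underutilized_split_page counter
 * added to /proc/vmstat by this patch. Returns -1 if absent.
 */
#include <stdio.h>

static long read_underutilized_splits(void)
{
	char line[256];
	long val = -1;
	FILE *fp = fopen("/proc/vmstat", "r");

	if (!fp)
		return -1;
	while (fgets(line, sizeof(line), fp)) {
		if (sscanf(line, "thp_underutilized_split_page %ld", &val) == 1)
			break;
	}
	fclose(fp);
	return val;
}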