From patchwork Sat Apr 19 07:32:14 2025
X-Patchwork-Submitter: Jinjiang Tu <tujinjiang@huawei.com>
X-Patchwork-Id: 14057813
From: Jinjiang Tu <tujinjiang@huawei.com>
Subject: [RFC PATCH] docs: hugetlbpage.rst: add free surplus huge pages description
Date: Sat, 19 Apr 2025 15:32:14 +0800
Message-ID: <20250419073214.2688926-1-tujinjiang@huawei.com>
When "echo 0 > /proc/sys/vm/nr_hugepages" runs concurrently with freeing
in-use huge pages back to the huge page pool, some of the freed huge pages
may fail to be destroyed and are instead accounted as surplus.  The counters
then look like this:

HugePages_Total:    1024
HugePages_Free:     1024
HugePages_Surp:     1024

When set_max_huge_pages() decreases the pool size, it first returns free
pages to the buddy allocator and then accounts the remaining pages as
surplus.  Between the two steps, hugetlb_lock is dropped to free the memory
and then reacquired.  If another process frees huge pages back to the pool
between the two steps, those free huge pages are accounted as surplus.

Besides, free surplus huge pages can also come from a failure to restore the
vmemmap of a huge page.

Once either situation occurs, users cannot directly shrink the huge page
pool via "echo 0 > nr_hugepages" and should use one of the following two
ways to destroy these free surplus huge pages:

1) echo $nr_surplus > nr_hugepages to first convert the surplus free huge
   pages into persistent free huge pages, and then echo 0 > nr_hugepages to
   destroy them.

2) Allocate these free surplus huge pages; the kernel will try to destroy
   them when they are freed.

However, there is no documentation describing this, so users may be confused
and not know how to handle such a case.  Update the documentation
accordingly.
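As an illustration only (not part of the patch), way 1) could look like the
following shell sketch; the hugepages-2048kB directory assumes the default
2 MB huge page size on x86:

  # Observe the free surplus huge pages (HugePages_Free == HugePages_Surp).
  grep -E 'HugePages_(Total|Free|Surp)' /proc/meminfo

  # Number of surplus huge pages for the default huge page size (assumed 2 MB).
  nr_surplus=$(cat /sys/kernel/mm/hugepages/hugepages-2048kB/surplus_hugepages)

  # Way 1): convert the surplus free huge pages into persistent huge pages...
  echo "$nr_surplus" > /proc/sys/vm/nr_hugepages

  # ...and then shrink the pool to zero to actually destroy them.
  echo 0 > /proc/sys/vm/nr_hugepages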
Signed-off-by: Jinjiang Tu <tujinjiang@huawei.com>
---
 Documentation/admin-guide/mm/hugetlbpage.rst | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/Documentation/admin-guide/mm/hugetlbpage.rst b/Documentation/admin-guide/mm/hugetlbpage.rst
index 67a941903fd2..0456cefae039 100644
--- a/Documentation/admin-guide/mm/hugetlbpage.rst
+++ b/Documentation/admin-guide/mm/hugetlbpage.rst
@@ -239,6 +239,17 @@ this condition holds--that is, until ``nr_hugepages+nr_overcommit_hugepages`` is
 increased sufficiently, or the surplus huge pages go out of use and are
 freed-- no more surplus huge pages will be allowed to be allocated.
 
+Caveat: Shrinking the persistent huge page pool via ``nr_hugepages`` may race
+with freeing in-use huge pages back to the pool, leaving some free huge pages
+in the pool accounted as surplus.  In addition, when the feature of freeing
+unused vmemmap pages associated with each hugetlb page is enabled, a free huge
+page may be accounted as surplus if its vmemmap cannot be restored.  In either
+case, the pool cannot be shrunk directly by echoing 0 to ``nr_hugepages``.
+Instead, echo the number of surplus pages to ``nr_hugepages`` first to convert
+the surplus free huge pages into persistent free huge pages, and then echo 0
+to ``nr_hugepages`` to destroy them.  Another way is to allocate the free
+surplus huge pages; they will be destroyed when they are freed.
+
 With support for multiple huge page pools at run-time available, much of
 the huge page userspace interface in ``/proc/sys/vm`` has been duplicated in
 sysfs.
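As a side note, the recovery described in the added paragraph can also be
expressed per huge page size through the sysfs interface mentioned right
after it; a minimal sketch, again assuming the 2 MB hstate directory:

  # Per-size counters; free surplus pages show up as a non-zero
  # surplus_hugepages alongside a non-zero free_hugepages.
  cd /sys/kernel/mm/hugepages/hugepages-2048kB
  grep . nr_hugepages free_hugepages surplus_hugepages

  # Convert the surplus free huge pages to persistent ones, then destroy them.
  cat surplus_hugepages > nr_hugepages
  echo 0 > nr_hugepages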