From patchwork Thu Jul 11 07:29:24 2024
X-Patchwork-Submitter: Ryan Roberts <ryan.roberts@arm.com>
X-Patchwork-Id: 13730096
From: Ryan Roberts <ryan.roberts@arm.com>
To: Andrew Morton, Hugh Dickins, Jonathan Corbet,
	"Matthew Wilcox (Oracle)", David Hildenbrand, Barry Song,
	Lance Yang, Baolin Wang
Cc: Ryan Roberts <ryan.roberts@arm.com>, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: [PATCH v1 1/2] mm: Cleanup count_mthp_stat() definition
Date: Thu, 11 Jul 2024 08:29:24 +0100
Message-ID: <20240711072929.3590000-2-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240711072929.3590000-1-ryan.roberts@arm.com>
References: <20240711072929.3590000-1-ryan.roberts@arm.com>

Let's move count_mthp_stat() so that it's always defined, even when THP
is disabled. Previously, uses of the function in files such as shmem.c,
which are compiled even when THP is disabled, required ugly THP
ifdeffery. With this cleanup, we can remove those ifdefs and the
function resolves to a nop when THP is disabled.

I shortly plan to call count_mthp_stat() from more THP-invariant source
files.
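For illustration only (not part of this patch): with the stub always
defined, a caller in a file that is compiled regardless of
CONFIG_TRANSPARENT_HUGEPAGE can now count unconditionally. The function
below is hypothetical; only count_mthp_stat() and the stat item come
from this series.

	#include <linux/huge_mm.h>

	/* Hypothetical caller in a file built even with THP disabled. */
	static void record_swpout_stat(struct folio *folio)
	{
		/*
		 * No #ifdef CONFIG_TRANSPARENT_HUGEPAGE needed any more:
		 * with THP disabled this resolves to an empty inline and
		 * the compiler discards the call entirely.
		 */
		count_mthp_stat(folio_order(folio), MTHP_STAT_SWPOUT);
	}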
Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
Acked-by: Barry Song
Reviewed-by: Baolin Wang
Reviewed-by: Lance Yang
---
 include/linux/huge_mm.h | 70 ++++++++++++++++++++---------------------
 mm/memory.c             |  2 --
 mm/shmem.c              |  6 ----
 3 files changed, 35 insertions(+), 43 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index cff002be83eb..cb93b9009ce4 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -108,6 +108,41 @@ extern struct kobj_attribute thpsize_shmem_enabled_attr;
 #define HPAGE_PUD_MASK	(~(HPAGE_PUD_SIZE - 1))
 #define HPAGE_PUD_SIZE	((1UL) << HPAGE_PUD_SHIFT)
 
+enum mthp_stat_item {
+	MTHP_STAT_ANON_FAULT_ALLOC,
+	MTHP_STAT_ANON_FAULT_FALLBACK,
+	MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE,
+	MTHP_STAT_SWPOUT,
+	MTHP_STAT_SWPOUT_FALLBACK,
+	MTHP_STAT_SHMEM_ALLOC,
+	MTHP_STAT_SHMEM_FALLBACK,
+	MTHP_STAT_SHMEM_FALLBACK_CHARGE,
+	MTHP_STAT_SPLIT,
+	MTHP_STAT_SPLIT_FAILED,
+	MTHP_STAT_SPLIT_DEFERRED,
+	__MTHP_STAT_COUNT
+};
+
+#if defined(CONFIG_TRANSPARENT_HUGEPAGE) && defined(CONFIG_SYSFS)
+struct mthp_stat {
+	unsigned long stats[ilog2(MAX_PTRS_PER_PTE) + 1][__MTHP_STAT_COUNT];
+};
+
+DECLARE_PER_CPU(struct mthp_stat, mthp_stats);
+
+static inline void count_mthp_stat(int order, enum mthp_stat_item item)
+{
+	if (order <= 0 || order > PMD_ORDER)
+		return;
+
+	this_cpu_inc(mthp_stats.stats[order][item]);
+}
+#else
+static inline void count_mthp_stat(int order, enum mthp_stat_item item)
+{
+}
+#endif
+
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
 extern unsigned long transparent_hugepage_flags;
 
@@ -263,41 +298,6 @@ struct thpsize {
 
 #define to_thpsize(kobj) container_of(kobj, struct thpsize, kobj)
 
-enum mthp_stat_item {
-	MTHP_STAT_ANON_FAULT_ALLOC,
-	MTHP_STAT_ANON_FAULT_FALLBACK,
-	MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE,
-	MTHP_STAT_SWPOUT,
-	MTHP_STAT_SWPOUT_FALLBACK,
-	MTHP_STAT_SHMEM_ALLOC,
-	MTHP_STAT_SHMEM_FALLBACK,
-	MTHP_STAT_SHMEM_FALLBACK_CHARGE,
-	MTHP_STAT_SPLIT,
-	MTHP_STAT_SPLIT_FAILED,
-	MTHP_STAT_SPLIT_DEFERRED,
-	__MTHP_STAT_COUNT
-};
-
-struct mthp_stat {
-	unsigned long stats[ilog2(MAX_PTRS_PER_PTE) + 1][__MTHP_STAT_COUNT];
-};
-
-#ifdef CONFIG_SYSFS
-DECLARE_PER_CPU(struct mthp_stat, mthp_stats);
-
-static inline void count_mthp_stat(int order, enum mthp_stat_item item)
-{
-	if (order <= 0 || order > PMD_ORDER)
-		return;
-
-	this_cpu_inc(mthp_stats.stats[order][item]);
-}
-#else
-static inline void count_mthp_stat(int order, enum mthp_stat_item item)
-{
-}
-#endif
-
 #define transparent_hugepage_use_zero_page()				\
 	(transparent_hugepage_flags &					\
 	 (1<<TRANSPARENT_HUGEPAGE_USE_ZERO_PAGE_FLAG))
diff --git a/mm/memory.c b/mm/memory.c
index ...
--- a/mm/memory.c
+++ b/mm/memory.c
@@ ... @@
 	add_mm_counter(vma->vm_mm, MM_ANONPAGES, nr_pages);
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 	count_mthp_stat(folio_order(folio), MTHP_STAT_ANON_FAULT_ALLOC);
-#endif
 	folio_add_new_anon_rmap(folio, vma, addr, RMAP_EXCLUSIVE);
 	folio_add_lru_vma(folio, vma);
 setpte:
diff --git a/mm/shmem.c b/mm/shmem.c
index f24dfbd387ba..fce1343f44e6 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1776,9 +1776,7 @@ static struct folio *shmem_alloc_and_add_folio(struct vm_fault *vmf,
 
 			if (pages == HPAGE_PMD_NR)
 				count_vm_event(THP_FILE_FALLBACK);
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 			count_mthp_stat(order, MTHP_STAT_SHMEM_FALLBACK);
-#endif
 			order = next_order(&suitable_orders, order);
 		}
 	} else {
@@ -1803,10 +1801,8 @@ static struct folio *shmem_alloc_and_add_folio(struct vm_fault *vmf,
 				count_vm_event(THP_FILE_FALLBACK);
 				count_vm_event(THP_FILE_FALLBACK_CHARGE);
 			}
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 			count_mthp_stat(folio_order(folio), MTHP_STAT_SHMEM_FALLBACK);
 			count_mthp_stat(folio_order(folio), MTHP_STAT_SHMEM_FALLBACK_CHARGE);
-#endif
 		}
 		goto unlock;
 	}
@@ -2180,9 +2176,7 @@ static int shmem_get_folio_gfp(struct inode *inode, pgoff_t index,
 		if (!IS_ERR(folio)) {
 			if (folio_test_pmd_mappable(folio))
 				count_vm_event(THP_FILE_ALLOC);
-#ifdef CONFIG_TRANSPARENT_HUGEPAGE
 			count_mthp_stat(folio_order(folio), MTHP_STAT_SHMEM_ALLOC);
-#endif
 			goto alloced;
 		}
 		if (PTR_ERR(folio) == -EEXIST)

From patchwork Thu Jul 11 07:29:25 2024
X-Patchwork-Submitter: Ryan Roberts <ryan.roberts@arm.com>
X-Patchwork-Id: 13730097
From: Ryan Roberts <ryan.roberts@arm.com>
To: Andrew Morton, Hugh Dickins, Jonathan Corbet,
	"Matthew Wilcox (Oracle)", David Hildenbrand, Barry Song,
	Lance Yang, Baolin Wang
Cc: Ryan Roberts <ryan.roberts@arm.com>, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: [PATCH v1 2/2] mm: mTHP stats for pagecache folio allocations
Date: Thu, 11 Jul 2024 08:29:25 +0100
Message-ID: <20240711072929.3590000-3-ryan.roberts@arm.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240711072929.3590000-1-ryan.roberts@arm.com>
References: <20240711072929.3590000-1-ryan.roberts@arm.com>

Expose 3 new mTHP stats for file (pagecache) folio allocations:

  /sys/kernel/mm/transparent_hugepage/hugepages-*kB/stats/file_alloc
  /sys/kernel/mm/transparent_hugepage/hugepages-*kB/stats/file_fallback
  /sys/kernel/mm/transparent_hugepage/hugepages-*kB/stats/file_fallback_charge

This will provide some insight into the sizes of large folios being
allocated for file-backed memory, and how often allocation fails.

All non-order-0 (and most order-0) folio allocations are currently done
through filemap_alloc_folio(), and folios are charged in a subsequent
call to filemap_add_folio(). So count file_fallback when allocation
fails in filemap_alloc_folio(), and count file_alloc or
file_fallback_charge in filemap_add_folio(), based on whether charging
succeeded or not. Some users of filemap_add_folio() allocate their own
order-0 folio by other means, so we would not count an allocation
failure in that case, but we also don't care about order-0 allocations.
This approach feels like it should be good enough and doesn't require
any (impractically large) refactoring.

The existing mTHP stats interface is reused to provide consistency to
users, and because we are reusing the same interface, we can reuse the
same infrastructure on the kernel side. The one small wrinkle is that
the set of folio sizes supported by the pagecache is not identical to
the set supported by anon and shmem: the pagecache supports order-1,
unlike anon and shmem, and its maximum order may be less than PMD-order
(e.g. arm64 with 64K base pages), again unlike anon and shmem.
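To make that union of orders concrete, here is a small userspace sketch
(not kernel code; the PMD_ORDER and MAX_PAGECACHE_ORDER values below are
assumed for a 4K base-page config, and ORDERS_ANON/ORDERS_FILE merely
mirror the kernel's THP_ORDERS_ALL_ANON and the PAGECACHE_LARGE_ORDERS
macro added by this patch):

	#include <stdio.h>

	#define PMD_ORDER		9	/* assumed: 4K pages, 2M PMD */
	#define MAX_PAGECACHE_ORDER	9	/* assumed for this config */
	#define BIT(n)			(1UL << (n))

	/* Anon/shmem orders: 2..PMD_ORDER, order-1 excluded. */
	#define ORDERS_ANON	((BIT(PMD_ORDER + 1) - 1) & ~(BIT(1) | BIT(0)))
	/* Pagecache orders: 1..MAX_PAGECACHE_ORDER. */
	#define ORDERS_FILE	((BIT(MAX_PAGECACHE_ORDER + 1) - 1) & ~BIT(0))

	int main(void)
	{
		unsigned long union_orders = ORDERS_ANON | ORDERS_FILE;
		int order;

		/* Print the hugepages-*kB directory for each order in the union. */
		for (order = 1; order <= PMD_ORDER; order++)
			if (union_orders & BIT(order))
				printf("hugepages-%lukB\n", (4096UL << order) / 1024);
		return 0;
	}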
So we now create a hugepages-*kB directory for the union of the sizes
supported by all 3 memory types, and populate it with the relevant
stats and controls.

Signed-off-by: Ryan Roberts <ryan.roberts@arm.com>
---
 Documentation/admin-guide/mm/transhuge.rst |  13 +++
 include/linux/huge_mm.h                    |   6 +-
 include/linux/pagemap.h                    |  17 ++-
 mm/filemap.c                               |   6 +-
 mm/huge_memory.c                           | 117 ++++++++++++++++-----
 5 files changed, 128 insertions(+), 31 deletions(-)

diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
index 058485daf186..d4857e457add 100644
--- a/Documentation/admin-guide/mm/transhuge.rst
+++ b/Documentation/admin-guide/mm/transhuge.rst
@@ -512,6 +512,19 @@ shmem_fallback_charge
 	falls back to using small pages even though the allocation was
 	successful.
 
+file_alloc
+	is incremented every time a file huge page is successfully
+	allocated.
+
+file_fallback
+	is incremented if a file huge page is attempted to be allocated
+	but fails and instead falls back to using small pages.
+
+file_fallback_charge
+	is incremented if a file huge page cannot be charged and instead
+	falls back to using small pages even though the allocation was
+	successful.
+
 split
 	is incremented every time a huge page is successfully split into
 	smaller orders. This can happen for a variety of reasons but a
diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index cb93b9009ce4..b4fba11976f2 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -117,6 +117,9 @@ enum mthp_stat_item {
 	MTHP_STAT_SHMEM_ALLOC,
 	MTHP_STAT_SHMEM_FALLBACK,
 	MTHP_STAT_SHMEM_FALLBACK_CHARGE,
+	MTHP_STAT_FILE_ALLOC,
+	MTHP_STAT_FILE_FALLBACK,
+	MTHP_STAT_FILE_FALLBACK_CHARGE,
 	MTHP_STAT_SPLIT,
 	MTHP_STAT_SPLIT_FAILED,
 	MTHP_STAT_SPLIT_DEFERRED,
@@ -292,11 +295,10 @@ unsigned long thp_vma_allowable_orders(struct vm_area_struct *vma,
 
 struct thpsize {
 	struct kobject kobj;
-	struct list_head node;
 	int order;
 };
 
-#define to_thpsize(kobj) container_of(kobj, struct thpsize, kobj)
+#define to_thpsize(_kobj) container_of(_kobj, struct thpsize, kobj)
 
 #define transparent_hugepage_use_zero_page()				\
 	(transparent_hugepage_flags &					\
diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index 6e2f72d03176..f45a1ba6d9b6 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -365,6 +365,7 @@ static inline void mapping_set_gfp_mask(struct address_space *m, gfp_t mask)
  */
 #define MAX_XAS_ORDER		(XA_CHUNK_SHIFT * 2 - 1)
 #define MAX_PAGECACHE_ORDER	min(MAX_XAS_ORDER, PREFERRED_MAX_PAGECACHE_ORDER)
+#define PAGECACHE_LARGE_ORDERS	((BIT(MAX_PAGECACHE_ORDER + 1) - 1) & ~BIT(0))
 
 /**
  * mapping_set_large_folios() - Indicate the file supports large folios.
@@ -562,14 +563,26 @@ static inline void *detach_page_private(struct page *page)
 }
 
 #ifdef CONFIG_NUMA
-struct folio *filemap_alloc_folio_noprof(gfp_t gfp, unsigned int order);
+struct folio *__filemap_alloc_folio_noprof(gfp_t gfp, unsigned int order);
 #else
-static inline struct folio *filemap_alloc_folio_noprof(gfp_t gfp, unsigned int order)
+static inline struct folio *__filemap_alloc_folio_noprof(gfp_t gfp, unsigned int order)
 {
 	return folio_alloc_noprof(gfp, order);
 }
 #endif
 
+static inline struct folio *filemap_alloc_folio_noprof(gfp_t gfp, unsigned int order)
+{
+	struct folio *folio;
+
+	folio = __filemap_alloc_folio_noprof(gfp, order);
+
+	if (!folio)
+		count_mthp_stat(order, MTHP_STAT_FILE_FALLBACK);
+
+	return folio;
+}
+
 #define filemap_alloc_folio(...)				\
	alloc_hooks(filemap_alloc_folio_noprof(__VA_ARGS__))
 
diff --git a/mm/filemap.c b/mm/filemap.c
index 53d5d0410b51..131d514fca29 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -963,6 +963,8 @@ int filemap_add_folio(struct address_space *mapping, struct folio *folio,
 	int ret;
 
 	ret = mem_cgroup_charge(folio, NULL, gfp);
+	count_mthp_stat(folio_order(folio),
+		ret ? MTHP_STAT_FILE_FALLBACK_CHARGE : MTHP_STAT_FILE_ALLOC);
 	if (ret)
 		return ret;
 
@@ -990,7 +992,7 @@ int filemap_add_folio(struct address_space *mapping, struct folio *folio,
 EXPORT_SYMBOL_GPL(filemap_add_folio);
 
 #ifdef CONFIG_NUMA
-struct folio *filemap_alloc_folio_noprof(gfp_t gfp, unsigned int order)
+struct folio *__filemap_alloc_folio_noprof(gfp_t gfp, unsigned int order)
 {
 	int n;
 	struct folio *folio;
@@ -1007,7 +1009,7 @@ struct folio *filemap_alloc_folio_noprof(gfp_t gfp, unsigned int order)
 	}
 	return folio_alloc_noprof(gfp, order);
 }
-EXPORT_SYMBOL(filemap_alloc_folio_noprof);
+EXPORT_SYMBOL(__filemap_alloc_folio_noprof);
 #endif
 
 /*
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f9696c94e211..559553e2a662 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -452,8 +452,9 @@ static const struct attribute_group hugepage_attr_group = {
 
 static void hugepage_exit_sysfs(struct kobject *hugepage_kobj);
 static void thpsize_release(struct kobject *kobj);
+static void thpsize_child_release(struct kobject *kobj);
 static DEFINE_SPINLOCK(huge_anon_orders_lock);
-static LIST_HEAD(thpsize_list);
+static LIST_HEAD(thpsize_child_list);
 
 static ssize_t thpsize_enabled_show(struct kobject *kobj,
 				    struct kobj_attribute *attr, char *buf)
@@ -537,6 +538,18 @@ static const struct kobj_type thpsize_ktype = {
 	.sysfs_ops = &kobj_sysfs_ops,
 };
 
+static const struct kobj_type thpsize_child_ktype = {
+	.release = &thpsize_child_release,
+	.sysfs_ops = &kobj_sysfs_ops,
+};
+
+struct thpsize_child {
+	struct kobject kobj;
+	struct list_head node;
+};
+
+#define to_thpsize_child(_kobj) container_of(_kobj, struct thpsize_child, kobj)
+
 DEFINE_PER_CPU(struct mthp_stat, mthp_stats) = {{{0}}};
 
 static unsigned long sum_mthp_stat(int order, enum mthp_stat_item item)
@@ -557,7 +570,7 @@ static unsigned long sum_mthp_stat(int order, enum mthp_stat_item item)
 static ssize_t _name##_show(struct kobject *kobj,			\
 			struct kobj_attribute *attr, char *buf)		\
 {									\
-	int order = to_thpsize(kobj)->order;				\
+	int order = to_thpsize(kobj->parent)->order;			\
 									\
 	return sysfs_emit(buf, "%lu\n", sum_mthp_stat(order, _index));	\
 }									\
@@ -591,41 +604,93 @@ static struct attribute *stats_attrs[] = {
 };
 
 static struct attribute_group stats_attr_group = {
-	.name = "stats",
 	.attrs = stats_attrs,
 };
 
-static struct thpsize *thpsize_create(int order, struct kobject *parent)
+DEFINE_MTHP_STAT_ATTR(file_alloc, MTHP_STAT_FILE_ALLOC);
+DEFINE_MTHP_STAT_ATTR(file_fallback, MTHP_STAT_FILE_FALLBACK);
+DEFINE_MTHP_STAT_ATTR(file_fallback_charge, MTHP_STAT_FILE_FALLBACK_CHARGE);
+
+static struct attribute *file_stats_attrs[] = {
+	&file_alloc_attr.attr,
+	&file_fallback_attr.attr,
+	&file_fallback_charge_attr.attr,
+	NULL,
+};
+
+static struct attribute_group file_stats_attr_group = {
+	.attrs = file_stats_attrs,
+};
+
+static int thpsize_create(int order, struct kobject *parent)
 {
 	unsigned long size = (PAGE_SIZE << order) / SZ_1K;
+	struct thpsize_child *stats;
 	struct thpsize *thpsize;
 	int ret;
 
+	/*
+	 * Each child object (currently only the "stats" directory) holds a
+	 * reference to the top-level thpsize object, so we can drop our ref
+	 * to the top-level once stats is setup.
+	 * Then we just need to drop a reference on any children to clean
+	 * everything up. We can't just use the attr group name for the stats
+	 * subdirectory because there may be multiple attribute groups to
+	 * populate inside stats and overlaying using the name property isn't
+	 * supported in that way; each attr group name, if provided, must be
+	 * unique in the parent directory.
+	 */
+
 	thpsize = kzalloc(sizeof(*thpsize), GFP_KERNEL);
-	if (!thpsize)
-		return ERR_PTR(-ENOMEM);
+	if (!thpsize) {
+		ret = -ENOMEM;
+		goto err;
+	}
+
+	thpsize->order = order;
 
 	ret = kobject_init_and_add(&thpsize->kobj, &thpsize_ktype, parent,
 				   "hugepages-%lukB", size);
 	if (ret) {
 		kfree(thpsize);
-		return ERR_PTR(ret);
+		goto err;
 	}
 
-	ret = sysfs_create_group(&thpsize->kobj, &thpsize_attr_group);
-	if (ret) {
+	stats = kzalloc(sizeof(*stats), GFP_KERNEL);
+	if (!stats) {
 		kobject_put(&thpsize->kobj);
-		return ERR_PTR(ret);
+		ret = -ENOMEM;
+		goto err;
 	}
 
-	ret = sysfs_create_group(&thpsize->kobj, &stats_attr_group);
+	ret = kobject_init_and_add(&stats->kobj, &thpsize_child_ktype,
+				   &thpsize->kobj, "stats");
+	kobject_put(&thpsize->kobj);
 	if (ret) {
-		kobject_put(&thpsize->kobj);
-		return ERR_PTR(ret);
+		kfree(stats);
+		goto err;
 	}
 
-	thpsize->order = order;
-	return thpsize;
+	if (BIT(order) & THP_ORDERS_ALL_ANON) {
+		ret = sysfs_create_group(&thpsize->kobj, &thpsize_attr_group);
+		if (ret)
+			goto err_put;
+
+		ret = sysfs_create_group(&stats->kobj, &stats_attr_group);
+		if (ret)
+			goto err_put;
+	}
+
+	if (BIT(order) & PAGECACHE_LARGE_ORDERS) {
+		ret = sysfs_create_group(&stats->kobj, &file_stats_attr_group);
+		if (ret)
+			goto err_put;
+	}
+
+	list_add(&stats->node, &thpsize_child_list);
+	return 0;
+err_put:
+	kobject_put(&stats->kobj);
+err:
+	return ret;
 }
 
 static void thpsize_release(struct kobject *kobj)
@@ -633,10 +698,14 @@ static void thpsize_release(struct kobject *kobj)
 	kfree(to_thpsize(kobj));
 }
 
+static void thpsize_child_release(struct kobject *kobj)
+{
+	kfree(to_thpsize_child(kobj));
+}
+
 static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj)
 {
 	int err;
-	struct thpsize *thpsize;
 	unsigned long orders;
 	int order;
 
@@ -665,16 +734,14 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj)
 		goto remove_hp_group;
 	}
 
-	orders = THP_ORDERS_ALL_ANON;
+	orders = THP_ORDERS_ALL_ANON | PAGECACHE_LARGE_ORDERS;
 	order = highest_order(orders);
 	while (orders) {
-		thpsize = thpsize_create(order, *hugepage_kobj);
-		if (IS_ERR(thpsize)) {
+		err = thpsize_create(order, *hugepage_kobj);
+		if (err) {
 			pr_err("failed to create thpsize for order %d\n", order);
-			err = PTR_ERR(thpsize);
 			goto remove_all;
 		}
-		list_add(&thpsize->node, &thpsize_list);
 		order = next_order(&orders, order);
 	}
 
@@ -692,11 +759,11 @@ static int __init hugepage_init_sysfs(struct kobject **hugepage_kobj)
 
 static void __init hugepage_exit_sysfs(struct kobject *hugepage_kobj)
 {
-	struct thpsize *thpsize, *tmp;
+	struct thpsize_child *child, *tmp;
 
-	list_for_each_entry_safe(thpsize, tmp, &thpsize_list, node) {
-		list_del(&thpsize->node);
-		kobject_put(&thpsize->kobj);
+	list_for_each_entry_safe(child, tmp, &thpsize_child_list, node) {
+		list_del(&child->node);
+		kobject_put(&child->kobj);
 	}
 
 	sysfs_remove_group(hugepage_kobj, &khugepaged_attr_group);
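For completeness, a minimal userspace example of consuming one of the
new stats files (illustrative only; "hugepages-64kB" is just an example
directory, and which hugepages-*kB directories exist depends on the base
page size and kernel config):

	#include <stdio.h>

	int main(void)
	{
		const char *path = "/sys/kernel/mm/transparent_hugepage/"
				   "hugepages-64kB/stats/file_alloc";
		unsigned long long count;
		FILE *f = fopen(path, "r");

		if (!f) {
			perror(path);
			return 1;
		}
		/* Each stats file contains a single decimal counter. */
		if (fscanf(f, "%llu", &count) != 1) {
			fclose(f);
			return 1;
		}
		fclose(f);
		printf("64K pagecache folios allocated: %llu\n", count);
		return 0;
	}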