From patchwork Thu Apr 17 00:02:32 2025
X-Patchwork-Submitter: Nico Pache
X-Patchwork-Id: 14054628
From: Nico Pache
To: linux-mm@kvack.org, linux-doc@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-trace-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, corbet@lwn.net, rostedt@goodmis.org,
 mhiramat@kernel.org, mathieu.desnoyers@efficios.com, david@redhat.com,
 baohua@kernel.org, baolin.wang@linux.alibaba.com, ryan.roberts@arm.com,
 willy@infradead.org, peterx@redhat.com, ziy@nvidia.com,
 wangkefeng.wang@huawei.com, usamaarif642@gmail.com, sunnanyong@huawei.com,
 vishal.moola@gmail.com, thomas.hellstrom@linux.intel.com,
 yang@os.amperecomputing.com, kirill.shutemov@linux.intel.com,
 aarcange@redhat.com, raquini@redhat.com, dev.jain@arm.com,
 anshuman.khandual@arm.com, catalin.marinas@arm.com, tiwai@suse.de,
 will@kernel.org, dave.hansen@linux.intel.com, jack@suse.cz, cl@gentwo.org,
 jglisse@google.com, surenb@google.com, zokeefe@google.com,
 hannes@cmpxchg.org, rientjes@google.com, mhocko@suse.com,
 rdunlap@infradead.org
Subject: [PATCH v4 06/12] khugepaged: introduce khugepaged_scan_bitmap for
 mTHP support
Date: Wed, 16 Apr 2025 18:02:32 -0600
Message-ID: <20250417000238.74567-7-npache@redhat.com>
In-Reply-To: <20250417000238.74567-1-npache@redhat.com>
References: <20250417000238.74567-1-npache@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

khugepaged scans PMD ranges for potential collapse to a hugepage. To add
mTHP support we use this scan to instead record chunks of utilized
sections of the PMD.

khugepaged_scan_bitmap uses an explicit stack to walk, without function
recursion, a bitmap that represents chunks of utilized regions. We can
then determine which mTHP size fits best. The following patch sets this
bitmap while scanning the PMD.

max_ptes_none is scaled by order to determine how "full" a region must
be before it is considered for collapse.

When an order is set to "always", collapse to that order greedily,
without considering the number of bits set.

Signed-off-by: Nico Pache
---
 include/linux/khugepaged.h |  4 ++
 mm/khugepaged.c            | 94 ++++++++++++++++++++++++++++++++++----
 2 files changed, 89 insertions(+), 9 deletions(-)

diff --git a/include/linux/khugepaged.h b/include/linux/khugepaged.h
index 1f46046080f5..18fe6eb5051d 100644
--- a/include/linux/khugepaged.h
+++ b/include/linux/khugepaged.h
@@ -1,6 +1,10 @@
 /* SPDX-License-Identifier: GPL-2.0 */
 #ifndef _LINUX_KHUGEPAGED_H
 #define _LINUX_KHUGEPAGED_H
+#define KHUGEPAGED_MIN_MTHP_ORDER 2
+#define KHUGEPAGED_MIN_MTHP_NR (1<mthp_bitmap_stack[++top] = (struct scan_bit_state)
+		{ HPAGE_PMD_ORDER - KHUGEPAGED_MIN_MTHP_ORDER, 0 };
+
+	while (top >= 0) {
+		state = cc->mthp_bitmap_stack[top--];
+		order = state.order + KHUGEPAGED_MIN_MTHP_ORDER;
+		offset = state.offset;
+		num_chunks = 1 << (state.order);
+		// Skip mTHP orders that are not enabled
+		if (!test_bit(order, &enabled_orders))
+			goto next;
+
+		// Copy the relevant section to a new bitmap
+		bitmap_shift_right(cc->mthp_bitmap_temp, cc->mthp_bitmap, offset,
+				  MTHP_BITMAP_SIZE);
+
+		bits_set = bitmap_weight(cc->mthp_bitmap_temp, num_chunks);
+		threshold_bits = (HPAGE_PMD_NR - khugepaged_max_ptes_none - 1)
+				>> (HPAGE_PMD_ORDER - state.order);
+
+		// Check if the region is "almost full" based on the threshold
+		if (bits_set > threshold_bits || is_pmd_only
+			|| test_bit(order, &huge_anon_orders_always)) {
+			ret = collapse_huge_page(mm, address, referenced, unmapped, cc,
+					mmap_locked, order, offset * KHUGEPAGED_MIN_MTHP_NR);
+			if (ret == SCAN_SUCCEED) {
+				collapsed += (1 << order);
+				continue;
+			}
+		}
+
+next:
+		if (state.order > 0) {
+			next_order = state.order - 1;
+			mid_offset = offset + (num_chunks / 2);
+			cc->mthp_bitmap_stack[++top] = (struct scan_bit_state)
+				{ next_order, mid_offset };
+			cc->mthp_bitmap_stack[++top] = (struct scan_bit_state)
+				{ next_order, offset };
+		}
+	}
+	return collapsed;
+}
+
 static int khugepaged_scan_pmd(struct mm_struct *mm,
 				   struct vm_area_struct *vma,
 				   unsigned long address, bool *mmap_locked,
@@ -1445,9 +1523,7 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 	pte_unmap_unlock(pte, ptl);
 	if (result == SCAN_SUCCEED) {
 		result = collapse_huge_page(mm, address, referenced,
-					    unmapped, cc);
-		/* collapse_huge_page will return with the mmap_lock released */
-		*mmap_locked = false;
+					    unmapped, cc, mmap_locked, HPAGE_PMD_ORDER, 0);
 	}
 out:
 	trace_mm_khugepaged_scan_pmd(mm, &folio->page, writable, referenced,
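
Note for readers: the stack walk and the max_ptes_none scaling can be tried
outside the kernel. The following is a minimal userspace sketch, not the
patch's code. It assumes 4 KiB base pages with a PMD order of 9, uses a byte
array in place of the kernel bitmap, treats every collapse as successful, and
leaves out the is_pmd_only and "always" overrides. The names PMD_ORDER,
MIN_ORDER, NR_CHUNKS, scan_state, weight, threshold_bits and scan_bitmap are
hypothetical stand-ins chosen for illustration.

```c
/* Assumed geometry: 512 PTEs per PMD, one bitmap chunk per 4 PTEs. */
#define PMD_ORDER  9
#define PMD_NR     (1 << PMD_ORDER)
#define MIN_ORDER  2                                /* mirrors KHUGEPAGED_MIN_MTHP_ORDER */
#define NR_CHUNKS  (1 << (PMD_ORDER - MIN_ORDER))   /* 128 chunks */

/* order is relative to MIN_ORDER; offset is counted in chunks. */
struct scan_state { int order; int offset; };

/* Count populated chunks in [offset, offset + n). */
static int weight(const unsigned char *chunks, int offset, int n)
{
	int set = 0;

	for (int i = 0; i < n; i++)
		set += chunks[offset + i] != 0;
	return set;
}

/* Scale the PMD-level max_ptes_none limit down to a region of rel_order. */
static int threshold_bits(int max_ptes_none, int rel_order)
{
	return (PMD_NR - max_ptes_none - 1) >> (PMD_ORDER - rel_order);
}

/*
 * Walk the chunk map from the PMD-sized region downwards. A region that is
 * enabled and full enough is recorded in picked[]; otherwise it is split in
 * half and both halves are pushed, lower half on top so it is visited first.
 * Returns the number of base pages covered by the picked regions.
 * picked[] must have room for NR_CHUNKS entries.
 */
static int scan_bitmap(const unsigned char *chunks, unsigned long enabled_orders,
		       int max_ptes_none, struct scan_state *picked, int *npicked)
{
	struct scan_state stack[NR_CHUNKS];
	int top = -1, collapsed = 0;

	*npicked = 0;
	stack[++top] = (struct scan_state){ PMD_ORDER - MIN_ORDER, 0 };

	while (top >= 0) {
		struct scan_state s = stack[top--];
		int order = s.order + MIN_ORDER;
		int num_chunks = 1 << s.order;

		if (enabled_orders & (1UL << order)) {
			int bits_set = weight(chunks, s.offset, num_chunks);

			if (bits_set > threshold_bits(max_ptes_none, s.order)) {
				picked[(*npicked)++] = s;   /* simulated collapse */
				collapsed += 1 << order;
				continue;
			}
		}
		if (s.order > 0) {
			stack[++top] = (struct scan_state){ s.order - 1,
							    s.offset + num_chunks / 2 };
			stack[++top] = (struct scan_state){ s.order - 1, s.offset };
		}
	}
	return collapsed;
}
```

With only the first four chunks populated and orders 2 through 9 enabled,
max_ptes_none = 0 makes the walk descend until it picks a single order-4
region (16 pages), while max_ptes_none = 511 picks the whole PMD (512 pages)
on the first iteration.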