From patchwork Tue Feb 18 18:16:45 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Frank van der Linden <fvdl@google.com>
X-Patchwork-Id: 13980416
Date: Tue, 18 Feb 2025 18:16:45 +0000
In-Reply-To: <20250218181656.207178-1-fvdl@google.com>
Mime-Version: 1.0
References: <20250218181656.207178-1-fvdl@google.com>
X-Mailer: git-send-email 2.48.1.601.g30ceb7b040-goog
Message-ID: <20250218181656.207178-18-fvdl@google.com>
Subject: [PATCH v4 17/27] mm/hugetlb: add pre-HVO framework
From: Frank van der Linden <fvdl@google.com>
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, Frank van der Linden <fvdl@google.com>

Define flags for pre-HVOed bootmem hugetlb pages, and act on them.

The most important flag is the HVO flag, signalling that a bootmem
allocated gigantic page has already been HVO-ed. If this flag is seen
by the hugetlb bootmem gather code, the page is marked as HVO
optimized. The HVO code will then not try to optimize it again.
Instead, it will just map the tail page mirror pages read-only,
completing the HVO steps.

No functional change, as nothing sets the flags yet.

Signed-off-by: Frank van der Linden <fvdl@google.com>
---
 arch/powerpc/mm/hugetlbpage.c |  1 +
 include/linux/hugetlb.h       |  4 +++
 mm/hugetlb.c                  | 24 ++++++++++++++++-
 mm/hugetlb_vmemmap.c          | 50 +++++++++++++++++++++++++++++++++--
 mm/hugetlb_vmemmap.h          |  7 +++++
 5 files changed, 83 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 6b043180220a..d3c1b749dcfc 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -113,6 +113,7 @@ static int __init pseries_alloc_bootmem_huge_page(struct hstate *hstate)
 	gpage_freearray[nr_gpages] = 0;
 	list_add(&m->list, &huge_boot_pages[0]);
 	m->hstate = hstate;
+	m->flags = 0;
 
 	return 1;
 }
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 5061279e5f73..10a7ce2b95e1 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -681,8 +681,12 @@ struct hstate {
 struct huge_bootmem_page {
 	struct list_head list;
 	struct hstate *hstate;
+	unsigned long flags;
 };
 
+#define HUGE_BOOTMEM_HVO		0x0001
+#define HUGE_BOOTMEM_ZONES_VALID	0x0002
+
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 0f14a7736875..40c88c46b34f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3215,6 +3215,7 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	INIT_LIST_HEAD(&m->list);
 	list_add(&m->list, &huge_boot_pages[node]);
 	m->hstate = h;
+	m->flags = 0;
 
 	return 1;
 }
@@ -3282,7 +3283,7 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 	struct folio *folio, *tmp_f;
 
 	/* Send list for bulk vmemmap optimization processing */
-	hugetlb_vmemmap_optimize_folios(h, folio_list);
+	hugetlb_vmemmap_optimize_bootmem_folios(h, folio_list);
 
 	list_for_each_entry_safe(folio, tmp_f, folio_list, lru) {
 		if (!folio_test_hugetlb_vmemmap_optimized(folio)) {
@@ -3311,6 +3312,13 @@ static bool __init hugetlb_bootmem_page_zones_valid(int nid,
 	unsigned long start_pfn;
 	bool valid;
 
+	if (m->flags & HUGE_BOOTMEM_ZONES_VALID) {
+		/*
+		 * Already validated, skip check.
+		 */
+		return true;
+	}
+
 	start_pfn = virt_to_phys(m) >> PAGE_SHIFT;
 
 	valid = !pfn_range_intersects_zones(nid, start_pfn,
@@ -3343,6 +3351,11 @@ static void __init hugetlb_bootmem_free_invalid_page(int nid, struct page *page,
 	}
 }
 
+static bool __init hugetlb_bootmem_page_prehvo(struct huge_bootmem_page *m)
+{
+	return (m->flags & HUGE_BOOTMEM_HVO);
+}
+
 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages.
@@ -3383,6 +3396,15 @@ static void __init gather_bootmem_prealloc_node(unsigned long nid)
 		hugetlb_folio_init_vmemmap(folio, h,
 					   HUGETLB_VMEMMAP_RESERVE_PAGES);
 		init_new_hugetlb_folio(h, folio);
+
+		if (hugetlb_bootmem_page_prehvo(m))
+			/*
+			 * If pre-HVO was done, just set the
+			 * flag, the HVO code will then skip
+			 * this folio.
+			 */
+			folio_set_hugetlb_vmemmap_optimized(folio);
+
 		list_add(&folio->lru, &folio_list);
 
 		/*
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 5b484758f813..be6b33ecbc8e 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -649,14 +649,39 @@ static int hugetlb_vmemmap_split_folio(const struct hstate *h, struct folio *fol
 	return vmemmap_remap_split(vmemmap_start, vmemmap_end, vmemmap_reuse);
 }
 
-void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
+static void __hugetlb_vmemmap_optimize_folios(struct hstate *h,
+					      struct list_head *folio_list,
+					      bool boot)
 {
 	struct folio *folio;
+	int nr_to_optimize;
 	LIST_HEAD(vmemmap_pages);
 	unsigned long flags = VMEMMAP_REMAP_NO_TLB_FLUSH | VMEMMAP_SYNCHRONIZE_RCU;
 
+	nr_to_optimize = 0;
 	list_for_each_entry(folio, folio_list, lru) {
-		int ret = hugetlb_vmemmap_split_folio(h, folio);
+		int ret;
+		unsigned long spfn, epfn;
+
+		if (boot && folio_test_hugetlb_vmemmap_optimized(folio)) {
+			/*
+			 * Already optimized by pre-HVO, just map the
+			 * mirrored tail page structs RO.
+			 */
+			spfn = (unsigned long)&folio->page;
+			epfn = spfn + pages_per_huge_page(h);
+			vmemmap_wrprotect_hvo(spfn, epfn, folio_nid(folio),
+					HUGETLB_VMEMMAP_RESERVE_SIZE);
+			register_page_bootmem_memmap(pfn_to_section_nr(spfn),
+					&folio->page,
+					HUGETLB_VMEMMAP_RESERVE_SIZE);
+			static_branch_inc(&hugetlb_optimize_vmemmap_key);
+			continue;
+		}
+
+		nr_to_optimize++;
+
+		ret = hugetlb_vmemmap_split_folio(h, folio);
 
 		/*
 		 * Spliting the PMD requires allocating a page, thus lets fail
@@ -668,6 +693,16 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 			break;
 	}
 
+	if (!nr_to_optimize)
+		/*
+		 * All pre-HVO folios, nothing left to do. It's ok if
+		 * there is a mix of pre-HVO and not yet HVO-ed folios
+		 * here, as __hugetlb_vmemmap_optimize_folio() will
+		 * skip any folios that already have the optimized flag
+		 * set, see vmemmap_should_optimize_folio().
+		 */
+		goto out;
+
 	flush_tlb_all();
 
 	list_for_each_entry(folio, folio_list, lru) {
@@ -693,10 +728,21 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 		}
 	}
 
+out:
 	flush_tlb_all();
 	free_vmemmap_page_list(&vmemmap_pages);
 }
 
+void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
+{
+	__hugetlb_vmemmap_optimize_folios(h, folio_list, false);
+}
+
+void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list)
+{
+	__hugetlb_vmemmap_optimize_folios(h, folio_list, true);
+}
+
 static const struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{
 		.procname	= "hugetlb_optimize_vmemmap",
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 2fcae92d3359..71110a90275f 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -24,6 +24,8 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 					struct list_head *non_hvo_folios);
 void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio);
 void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list);
+void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list);
+
 
 static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h)
 {
@@ -64,6 +66,11 @@ static inline void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list
 {
 }
 
+static inline void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h,
+						struct list_head *folio_list)
+{
+}
+
 static inline unsigned int hugetlb_vmemmap_optimizable_size(const struct hstate *h)
 {
 	return 0;
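
Not part of the patch above: since nothing sets the new flags yet, a
minimal sketch of how a later patch in this series might consume the
framework at memblock time. The function name below is hypothetical;
the struct, flags, and helpers it references are the ones introduced
here.

static void __init hugetlb_bootmem_prehvo_sketch(struct huge_bootmem_page *m)
{
	/* ... HVO the page's vmemmap here, before gather time ... */

	/*
	 * gather_bootmem_prealloc_node() tests this flag via
	 * hugetlb_bootmem_page_prehvo() and calls
	 * folio_set_hugetlb_vmemmap_optimized(), so the bulk HVO pass
	 * only maps the mirrored tail page structs read-only instead
	 * of optimizing the folio again.
	 */
	m->flags |= HUGE_BOOTMEM_HVO;

	/*
	 * If zone boundaries were already checked when the page was
	 * allocated, this lets hugetlb_bootmem_page_zones_valid()
	 * return early instead of re-running
	 * pfn_range_intersects_zones().
	 */
	m->flags |= HUGE_BOOTMEM_ZONES_VALID;
}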