From patchwork Fri Feb 28 18:29:02 2025
Date: Fri, 28 Feb 2025 18:29:02 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-2-fvdl@google.com>
Subject: [PATCH v5 01/27] mm/cma: export total and free number of pages for CMA areas
From: Frank van der Linden <fvdl@google.com>
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com, Frank van der Linden, Oscar Salvador
In addition to the number of allocations and releases, system management software may want to know the size of CMA areas, and how many pages are available in them. This information is currently not exported, so export it in the sysfs files total_pages and available_pages, respectively.

The name 'available_pages' was picked over 'free_pages' because 'free' implies that the pages are unused. They might not be; they just haven't been used by cma_alloc() yet.

The number of available pages is tracked regardless of CONFIG_CMA_SYSFS, allowing for a few minor shortcuts in the code, avoiding bitmap operations.

Reviewed-by: Oscar Salvador
Signed-off-by: Frank van der Linden <fvdl@google.com>
---
 Documentation/ABI/testing/sysfs-kernel-mm-cma | 13 +++++++++++
 mm/cma.c                                      | 22 ++++++++++++++-----
 mm/cma.h                                      |  1 +
 mm/cma_debug.c                                |  5 +----
 mm/cma_sysfs.c                                | 20 +++++++++++++++++
 5 files changed, 51 insertions(+), 10 deletions(-)

diff --git a/Documentation/ABI/testing/sysfs-kernel-mm-cma b/Documentation/ABI/testing/sysfs-kernel-mm-cma
index dfd755201142..aaf2a5d8b13b 100644
--- a/Documentation/ABI/testing/sysfs-kernel-mm-cma
+++ b/Documentation/ABI/testing/sysfs-kernel-mm-cma
@@ -29,3 +29,16 @@ Date: Feb 2024
 Contact: Anshuman Khandual
 Description:
 		the number of pages CMA API succeeded to release
+
+What:		/sys/kernel/mm/cma/<cma-name>/total_pages
+Date:		Jun 2024
+Contact:	Frank van der Linden <fvdl@google.com>
+Description:
+		The size of the CMA area in pages.
+
+What:		/sys/kernel/mm/cma/<cma-name>/available_pages
+Date:		Jun 2024
+Contact:	Frank van der Linden <fvdl@google.com>
+Description:
+		The number of pages in the CMA area that are still
+		available for CMA allocation.
diff --git a/mm/cma.c b/mm/cma.c index de5bc0c81fc2..95a8788e54d3 100644 --- a/mm/cma.c +++ b/mm/cma.c @@ -86,6 +86,7 @@ static void cma_clear_bitmap(struct cma *cma, unsigned long pfn, spin_lock_irqsave(&cma->lock, flags); bitmap_clear(cma->bitmap, bitmap_no, bitmap_count); + cma->available_count += count; spin_unlock_irqrestore(&cma->lock, flags); } @@ -133,7 +134,7 @@ static void __init cma_activate_area(struct cma *cma) free_reserved_page(pfn_to_page(pfn)); } totalcma_pages -= cma->count; - cma->count = 0; + cma->available_count = cma->count = 0; pr_err("CMA area %s could not be activated\n", cma->name); } @@ -206,7 +207,7 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size, snprintf(cma->name, CMA_MAX_NAME, "cma%d\n", cma_area_count); cma->base_pfn = PFN_DOWN(base); - cma->count = size >> PAGE_SHIFT; + cma->available_count = cma->count = size >> PAGE_SHIFT; cma->order_per_bit = order_per_bit; *res_cma = cma; cma_area_count++; @@ -390,7 +391,7 @@ static void cma_debug_show_areas(struct cma *cma) { unsigned long next_zero_bit, next_set_bit, nr_zero; unsigned long start = 0; - unsigned long nr_part, nr_total = 0; + unsigned long nr_part; unsigned long nbits = cma_bitmap_maxno(cma); spin_lock_irq(&cma->lock); @@ -402,12 +403,12 @@ static void cma_debug_show_areas(struct cma *cma) next_set_bit = find_next_bit(cma->bitmap, nbits, next_zero_bit); nr_zero = next_set_bit - next_zero_bit; nr_part = nr_zero << cma->order_per_bit; - pr_cont("%s%lu@%lu", nr_total ? "+" : "", nr_part, + pr_cont("%s%lu@%lu", start ? "+" : "", nr_part, next_zero_bit); - nr_total += nr_part; start = next_zero_bit + nr_zero; } - pr_cont("=> %lu free of %lu total pages\n", nr_total, cma->count); + pr_cont("=> %lu free of %lu total pages\n", cma->available_count, + cma->count); spin_unlock_irq(&cma->lock); } @@ -444,6 +445,14 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count, for (;;) { spin_lock_irq(&cma->lock); + /* + * If the request is larger than the available number + * of pages, stop right away. + */ + if (count > cma->available_count) { + spin_unlock_irq(&cma->lock); + break; + } bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap, bitmap_maxno, start, bitmap_count, mask, offset); @@ -452,6 +461,7 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count, break; } bitmap_set(cma->bitmap, bitmap_no, bitmap_count); + cma->available_count -= count; /* * It's safe to drop the lock here. We've marked this region for * our exclusive use. 
If the migration fails we will take the

diff --git a/mm/cma.h b/mm/cma.h
index 8485ef893e99..3dd3376ae980 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -13,6 +13,7 @@ struct cma_kobject {
 struct cma {
 	unsigned long   base_pfn;
 	unsigned long   count;
+	unsigned long   available_count;
 	unsigned long   *bitmap;
 	unsigned int order_per_bit; /* Order of pages represented by one bit */
 	spinlock_t	lock;
diff --git a/mm/cma_debug.c b/mm/cma_debug.c
index 602fff89b15f..89236f22230a 100644
--- a/mm/cma_debug.c
+++ b/mm/cma_debug.c
@@ -34,13 +34,10 @@ DEFINE_DEBUGFS_ATTRIBUTE(cma_debugfs_fops, cma_debugfs_get, NULL, "%llu\n");
 static int cma_used_get(void *data, u64 *val)
 {
 	struct cma *cma = data;
-	unsigned long used;
 
 	spin_lock_irq(&cma->lock);
-	/* pages counter is smaller than sizeof(int) */
-	used = bitmap_weight(cma->bitmap, (int)cma_bitmap_maxno(cma));
+	*val = cma->count - cma->available_count;
 	spin_unlock_irq(&cma->lock);
-	*val = (u64)used << cma->order_per_bit;
 
 	return 0;
 }
diff --git a/mm/cma_sysfs.c b/mm/cma_sysfs.c
index f50db3973171..97acd3e5a6a5 100644
--- a/mm/cma_sysfs.c
+++ b/mm/cma_sysfs.c
@@ -62,6 +62,24 @@ static ssize_t release_pages_success_show(struct kobject *kobj,
 }
 CMA_ATTR_RO(release_pages_success);
 
+static ssize_t total_pages_show(struct kobject *kobj,
+			struct kobj_attribute *attr, char *buf)
+{
+	struct cma *cma = cma_from_kobj(kobj);
+
+	return sysfs_emit(buf, "%lu\n", cma->count);
+}
+CMA_ATTR_RO(total_pages);
+
+static ssize_t available_pages_show(struct kobject *kobj,
+			struct kobj_attribute *attr, char *buf)
+{
+	struct cma *cma = cma_from_kobj(kobj);
+
+	return sysfs_emit(buf, "%lu\n", cma->available_count);
+}
+CMA_ATTR_RO(available_pages);
+
 static void cma_kobj_release(struct kobject *kobj)
 {
 	struct cma *cma = cma_from_kobj(kobj);
@@ -75,6 +93,8 @@ static struct attribute *cma_attrs[] = {
 	&alloc_pages_success_attr.attr,
 	&alloc_pages_fail_attr.attr,
 	&release_pages_success_attr.attr,
+	&total_pages_attr.attr,
+	&available_pages_attr.attr,
 	NULL,
 };
 ATTRIBUTE_GROUPS(cma);
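For illustration, and not part of the patch itself: a minimal user-space sketch of how management software might consume the two new sysfs files. The CMA area name "hugetlb" is an assumption; the actual names on a system can be listed under /sys/kernel/mm/cma/.

	#include <stdio.h>

	/* Read a single unsigned long from a sysfs file; returns 0 on any error. */
	static unsigned long read_ul(const char *path)
	{
		FILE *f = fopen(path, "r");
		unsigned long val = 0;

		if (f) {
			if (fscanf(f, "%lu", &val) != 1)
				val = 0;
			fclose(f);
		}
		return val;
	}

	int main(void)
	{
		unsigned long total, avail;

		/* "hugetlb" is an example area name, not guaranteed to exist. */
		total = read_ul("/sys/kernel/mm/cma/hugetlb/total_pages");
		avail = read_ul("/sys/kernel/mm/cma/hugetlb/available_pages");
		printf("hugetlb CMA: %lu of %lu pages available\n", avail, total);
		return 0;
	}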
From patchwork Fri Feb 28 18:29:03 2025
Date: Fri, 28 Feb 2025 18:29:03 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-3-fvdl@google.com>
Subject: [PATCH v5 02/27] mm, cma: support multiple contiguous ranges, if requested
From: Frank van der Linden <fvdl@google.com>
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com, Frank van der Linden, Arnd Bergmann

Currently, CMA manages one range of physically contiguous memory. Creating larger CMA areas with hugetlb_cma may run into gaps in physical memory, making it impossible to allocate one contiguous physical range from memblock when creating the CMA area.

This can happen, for example, on an AMD system with more than 1 TB of memory, where there will be a gap just below the 1 TB (40-bit DMA) line. If most of memory has been set aside for potential hugetlb CMA allocation, cma_declare_contiguous_nid will fail.

hugetlb_cma doesn't need the entire area to be one physically contiguous range. It just cares about being able to get physically contiguous chunks of a certain size (e.g. 1G), and it is fine to have the CMA area backed by multiple physical ranges, as long as it gets 1G contiguous allocations.
Multi-range support is implemented by introducing an array of ranges, instead of just one big one. Each range has its own bitmap. Effectively, the allocate and release operations work as before, just per range: instead of going through one large bitmap, they now go through a number of smaller ones. The maximum number of supported ranges is 8, as defined in CMA_MAX_RANGES.

Since some current users of CMA expect a CMA area to use just one physically contiguous range, only allow multiple ranges if the new interface, cma_declare_contiguous_multi, is used. The other interfaces work as before, creating only CMA areas with one range.

cma_declare_contiguous_multi works as follows, mimicking the default "bottom-up, above 4G" reservation approach:

0) Try cma_declare_contiguous_nid, which will use only one region. If this succeeds, return. This makes sure that for all the cases that currently work, the behavior remains unchanged even if the caller switches from cma_declare_contiguous_nid to cma_declare_contiguous_multi.

1) Select the largest free memblock ranges above 4G, up to a maximum of CMA_MAX_RANGES ranges.

2) If the selected ranges (at most CMA_MAX_RANGES of them) do not add up to the total size requested, return -ENOMEM.

3) Sort the selected ranges by base address.

4) Reserve them bottom-up until we get what we wanted.

Cc: Arnd Bergmann
Signed-off-by: Frank van der Linden <fvdl@google.com>
---
 Documentation/admin-guide/mm/cma_debugfs.rst |  10 +-
 include/linux/cma.h                          |   3 +
 mm/cma.c                                     | 594 +++++++++++++++----
 mm/cma.h                                     |  27 +-
 mm/cma_debug.c                               |  56 +-
 5 files changed, 550 insertions(+), 140 deletions(-)

diff --git a/Documentation/admin-guide/mm/cma_debugfs.rst b/Documentation/admin-guide/mm/cma_debugfs.rst
index 7367e6294ef6..4120e9cb0cd5 100644
--- a/Documentation/admin-guide/mm/cma_debugfs.rst
+++ b/Documentation/admin-guide/mm/cma_debugfs.rst
@@ -12,10 +12,16 @@ its CMA name like below:
 
 The structure of the files created under that directory is as follows:
 
- - [RO] base_pfn: The base PFN (Page Frame Number) of the zone.
+ - [RO] base_pfn: The base PFN (Page Frame Number) of the CMA area.
+        This is the same as ranges/0/base_pfn.
  - [RO] count: Amount of memory in the CMA area.
  - [RO] order_per_bit: Order of pages represented by one bit.
- - [RO] bitmap: The bitmap of page states in the zone.
+ - [RO] bitmap: The bitmap of allocated pages in the area.
+        This is the same as ranges/0/bitmap.
+ - [RO] ranges/N/base_pfn: The base PFN of contiguous range N
+        in the CMA area.
+ - [RO] ranges/N/bitmap: The bitmap of allocated pages in
+        range N in the CMA area.
  - [WO] alloc: Allocate N pages from that CMA area.
For example:: echo 5 > /cma//alloc diff --git a/include/linux/cma.h b/include/linux/cma.h index d15b64f51336..863427c27dc2 100644 --- a/include/linux/cma.h +++ b/include/linux/cma.h @@ -40,6 +40,9 @@ static inline int __init cma_declare_contiguous(phys_addr_t base, return cma_declare_contiguous_nid(base, size, limit, alignment, order_per_bit, fixed, name, res_cma, NUMA_NO_NODE); } +extern int __init cma_declare_contiguous_multi(phys_addr_t size, + phys_addr_t align, unsigned int order_per_bit, + const char *name, struct cma **res_cma, int nid); extern int cma_init_reserved_mem(phys_addr_t base, phys_addr_t size, unsigned int order_per_bit, const char *name, diff --git a/mm/cma.c b/mm/cma.c index 95a8788e54d3..34caa6b29c99 100644 --- a/mm/cma.c +++ b/mm/cma.c @@ -18,6 +18,7 @@ #include #include +#include #include #include #include @@ -35,9 +36,16 @@ struct cma cma_areas[MAX_CMA_AREAS]; unsigned int cma_area_count; static DEFINE_MUTEX(cma_mutex); +static int __init __cma_declare_contiguous_nid(phys_addr_t base, + phys_addr_t size, phys_addr_t limit, + phys_addr_t alignment, unsigned int order_per_bit, + bool fixed, const char *name, struct cma **res_cma, + int nid); + phys_addr_t cma_get_base(const struct cma *cma) { - return PFN_PHYS(cma->base_pfn); + WARN_ON_ONCE(cma->nranges != 1); + return PFN_PHYS(cma->ranges[0].base_pfn); } unsigned long cma_get_size(const struct cma *cma) @@ -63,9 +71,10 @@ static unsigned long cma_bitmap_aligned_mask(const struct cma *cma, * The value returned is represented in order_per_bits. */ static unsigned long cma_bitmap_aligned_offset(const struct cma *cma, + const struct cma_memrange *cmr, unsigned int align_order) { - return (cma->base_pfn & ((1UL << align_order) - 1)) + return (cmr->base_pfn & ((1UL << align_order) - 1)) >> cma->order_per_bit; } @@ -75,46 +84,57 @@ static unsigned long cma_bitmap_pages_to_bits(const struct cma *cma, return ALIGN(pages, 1UL << cma->order_per_bit) >> cma->order_per_bit; } -static void cma_clear_bitmap(struct cma *cma, unsigned long pfn, - unsigned long count) +static void cma_clear_bitmap(struct cma *cma, const struct cma_memrange *cmr, + unsigned long pfn, unsigned long count) { unsigned long bitmap_no, bitmap_count; unsigned long flags; - bitmap_no = (pfn - cma->base_pfn) >> cma->order_per_bit; + bitmap_no = (pfn - cmr->base_pfn) >> cma->order_per_bit; bitmap_count = cma_bitmap_pages_to_bits(cma, count); spin_lock_irqsave(&cma->lock, flags); - bitmap_clear(cma->bitmap, bitmap_no, bitmap_count); + bitmap_clear(cmr->bitmap, bitmap_no, bitmap_count); cma->available_count += count; spin_unlock_irqrestore(&cma->lock, flags); } static void __init cma_activate_area(struct cma *cma) { - unsigned long base_pfn = cma->base_pfn, pfn; + unsigned long pfn, base_pfn; + int allocrange, r; struct zone *zone; + struct cma_memrange *cmr; + + for (allocrange = 0; allocrange < cma->nranges; allocrange++) { + cmr = &cma->ranges[allocrange]; + cmr->bitmap = bitmap_zalloc(cma_bitmap_maxno(cma, cmr), + GFP_KERNEL); + if (!cmr->bitmap) + goto cleanup; + } - cma->bitmap = bitmap_zalloc(cma_bitmap_maxno(cma), GFP_KERNEL); - if (!cma->bitmap) - goto out_error; + for (r = 0; r < cma->nranges; r++) { + cmr = &cma->ranges[r]; + base_pfn = cmr->base_pfn; - /* - * alloc_contig_range() requires the pfn range specified to be in the - * same zone. Simplify by forcing the entire CMA resv range to be in the - * same zone. 
- */ - WARN_ON_ONCE(!pfn_valid(base_pfn)); - zone = page_zone(pfn_to_page(base_pfn)); - for (pfn = base_pfn + 1; pfn < base_pfn + cma->count; pfn++) { - WARN_ON_ONCE(!pfn_valid(pfn)); - if (page_zone(pfn_to_page(pfn)) != zone) - goto not_in_zone; - } + /* + * alloc_contig_range() requires the pfn range specified + * to be in the same zone. Simplify by forcing the entire + * CMA resv range to be in the same zone. + */ + WARN_ON_ONCE(!pfn_valid(base_pfn)); + zone = page_zone(pfn_to_page(base_pfn)); + for (pfn = base_pfn + 1; pfn < base_pfn + cmr->count; pfn++) { + WARN_ON_ONCE(!pfn_valid(pfn)); + if (page_zone(pfn_to_page(pfn)) != zone) + goto cleanup; + } - for (pfn = base_pfn; pfn < base_pfn + cma->count; - pfn += pageblock_nr_pages) - init_cma_reserved_pageblock(pfn_to_page(pfn)); + for (pfn = base_pfn; pfn < base_pfn + cmr->count; + pfn += pageblock_nr_pages) + init_cma_reserved_pageblock(pfn_to_page(pfn)); + } spin_lock_init(&cma->lock); @@ -125,13 +145,19 @@ static void __init cma_activate_area(struct cma *cma) return; -not_in_zone: - bitmap_free(cma->bitmap); -out_error: +cleanup: + for (r = 0; r < allocrange; r++) + bitmap_free(cma->ranges[r].bitmap); + /* Expose all pages to the buddy, they are useless for CMA. */ if (!cma->reserve_pages_on_error) { - for (pfn = base_pfn; pfn < base_pfn + cma->count; pfn++) - free_reserved_page(pfn_to_page(pfn)); + for (r = 0; r < allocrange; r++) { + cmr = &cma->ranges[r]; + for (pfn = cmr->base_pfn; + pfn < cmr->base_pfn + cmr->count; + pfn++) + free_reserved_page(pfn_to_page(pfn)); + } } totalcma_pages -= cma->count; cma->available_count = cma->count = 0; @@ -154,6 +180,43 @@ void __init cma_reserve_pages_on_error(struct cma *cma) cma->reserve_pages_on_error = true; } +static int __init cma_new_area(const char *name, phys_addr_t size, + unsigned int order_per_bit, + struct cma **res_cma) +{ + struct cma *cma; + + if (cma_area_count == ARRAY_SIZE(cma_areas)) { + pr_err("Not enough slots for CMA reserved regions!\n"); + return -ENOSPC; + } + + /* + * Each reserved area must be initialised later, when more kernel + * subsystems (like slab allocator) are available. 
+ */ + cma = &cma_areas[cma_area_count]; + cma_area_count++; + + if (name) + snprintf(cma->name, CMA_MAX_NAME, "%s", name); + else + snprintf(cma->name, CMA_MAX_NAME, "cma%d\n", cma_area_count); + + cma->available_count = cma->count = size >> PAGE_SHIFT; + cma->order_per_bit = order_per_bit; + *res_cma = cma; + totalcma_pages += cma->count; + + return 0; +} + +static void __init cma_drop_area(struct cma *cma) +{ + totalcma_pages -= cma->count; + cma_area_count--; +} + /** * cma_init_reserved_mem() - create custom contiguous area from reserved memory * @base: Base address of the reserved area @@ -172,13 +235,9 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size, struct cma **res_cma) { struct cma *cma; + int ret; /* Sanity checks */ - if (cma_area_count == ARRAY_SIZE(cma_areas)) { - pr_err("Not enough slots for CMA reserved regions!\n"); - return -ENOSPC; - } - if (!size || !memblock_is_region_reserved(base, size)) return -EINVAL; @@ -195,25 +254,261 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size, if (!IS_ALIGNED(base | size, CMA_MIN_ALIGNMENT_BYTES)) return -EINVAL; + ret = cma_new_area(name, size, order_per_bit, &cma); + if (ret != 0) + return ret; + + cma->ranges[0].base_pfn = PFN_DOWN(base); + cma->ranges[0].count = cma->count; + cma->nranges = 1; + + *res_cma = cma; + + return 0; +} + +/* + * Structure used while walking physical memory ranges and finding out + * which one(s) to use for a CMA area. + */ +struct cma_init_memrange { + phys_addr_t base; + phys_addr_t size; + struct list_head list; +}; + +/* + * Work array used during CMA initialization. + */ +static struct cma_init_memrange memranges[CMA_MAX_RANGES] __initdata; + +static bool __init revsizecmp(struct cma_init_memrange *mlp, + struct cma_init_memrange *mrp) +{ + return mlp->size > mrp->size; +} + +static bool __init basecmp(struct cma_init_memrange *mlp, + struct cma_init_memrange *mrp) +{ + return mlp->base < mrp->base; +} + +/* + * Helper function to create sorted lists. + */ +static void __init list_insert_sorted( + struct list_head *ranges, + struct cma_init_memrange *mrp, + bool (*cmp)(struct cma_init_memrange *lh, struct cma_init_memrange *rh)) +{ + struct list_head *mp; + struct cma_init_memrange *mlp; + + if (list_empty(ranges)) + list_add(&mrp->list, ranges); + else { + list_for_each(mp, ranges) { + mlp = list_entry(mp, struct cma_init_memrange, list); + if (cmp(mlp, mrp)) + break; + } + __list_add(&mrp->list, mlp->list.prev, &mlp->list); + } +} + +/* + * Create CMA areas with a total size of @total_size. A normal allocation + * for one area is tried first. If that fails, the biggest memblock + * ranges above 4G are selected, and allocated bottom up. + * + * The complexity here is not great, but this function will only be + * called during boot, and the lists operated on have fewer than + * CMA_MAX_RANGES elements (default value: 8). + */ +int __init cma_declare_contiguous_multi(phys_addr_t total_size, + phys_addr_t align, unsigned int order_per_bit, + const char *name, struct cma **res_cma, int nid) +{ + phys_addr_t start, end; + phys_addr_t size, sizesum, sizeleft; + struct cma_init_memrange *mrp, *mlp, *failed; + struct cma_memrange *cmrp; + LIST_HEAD(ranges); + LIST_HEAD(final_ranges); + struct list_head *mp, *next; + int ret, nr = 1; + u64 i; + struct cma *cma; + /* - * Each reserved area must be initialised later, when more kernel - * subsystems (like slab allocator) are available. + * First, try it the normal way, producing just one range. 
*/ - cma = &cma_areas[cma_area_count]; + ret = __cma_declare_contiguous_nid(0, total_size, 0, align, + order_per_bit, false, name, res_cma, nid); + if (ret != -ENOMEM) + goto out; - if (name) - snprintf(cma->name, CMA_MAX_NAME, name); - else - snprintf(cma->name, CMA_MAX_NAME, "cma%d\n", cma_area_count); + /* + * Couldn't find one range that fits our needs, so try multiple + * ranges. + * + * No need to do the alignment checks here, the call to + * cma_declare_contiguous_nid above would have caught + * any issues. With the checks, we know that: + * + * - @align is a power of 2 + * - @align is >= pageblock alignment + * - @size is aligned to @align and to @order_per_bit + * + * So, as long as we create ranges that have a base + * aligned to @align, and a size that is aligned to + * both @align and @order_to_bit, things will work out. + */ + nr = 0; + sizesum = 0; + failed = NULL; - cma->base_pfn = PFN_DOWN(base); - cma->available_count = cma->count = size >> PAGE_SHIFT; - cma->order_per_bit = order_per_bit; + ret = cma_new_area(name, total_size, order_per_bit, &cma); + if (ret != 0) + goto out; + + align = max_t(phys_addr_t, align, CMA_MIN_ALIGNMENT_BYTES); + /* + * Create a list of ranges above 4G, largest range first. + */ + for_each_free_mem_range(i, nid, MEMBLOCK_NONE, &start, &end, NULL) { + if (upper_32_bits(start) == 0) + continue; + + start = ALIGN(start, align); + if (start >= end) + continue; + + end = ALIGN_DOWN(end, align); + if (end <= start) + continue; + + size = end - start; + size = ALIGN_DOWN(size, (PAGE_SIZE << order_per_bit)); + if (!size) + continue; + sizesum += size; + + pr_debug("consider %016llx - %016llx\n", (u64)start, (u64)end); + + /* + * If we don't yet have used the maximum number of + * areas, grab a new one. + * + * If we can't use anymore, see if this range is not + * smaller than the smallest one already recorded. If + * not, re-use the smallest element. + */ + if (nr < CMA_MAX_RANGES) + mrp = &memranges[nr++]; + else { + mrp = list_last_entry(&ranges, + struct cma_init_memrange, list); + if (size < mrp->size) + continue; + list_del(&mrp->list); + sizesum -= mrp->size; + pr_debug("deleted %016llx - %016llx from the list\n", + (u64)mrp->base, (u64)mrp->base + size); + } + mrp->base = start; + mrp->size = size; + + /* + * Now do a sorted insert. + */ + list_insert_sorted(&ranges, mrp, revsizecmp); + pr_debug("added %016llx - %016llx to the list\n", + (u64)mrp->base, (u64)mrp->base + size); + pr_debug("total size now %llu\n", (u64)sizesum); + } + + /* + * There is not enough room in the CMA_MAX_RANGES largest + * ranges, so bail out. + */ + if (sizesum < total_size) { + cma_drop_area(cma); + ret = -ENOMEM; + goto out; + } + + /* + * Found ranges that provide enough combined space. + * Now, sorted them by address, smallest first, because we + * want to mimic a bottom-up memblock allocation. + */ + sizesum = 0; + list_for_each_safe(mp, next, &ranges) { + mlp = list_entry(mp, struct cma_init_memrange, list); + list_del(mp); + list_insert_sorted(&final_ranges, mlp, basecmp); + sizesum += mlp->size; + if (sizesum >= total_size) + break; + } + + /* + * Walk the final list, and add a CMA range for + * each range, possibly not using the last one fully. + */ + nr = 0; + sizeleft = total_size; + list_for_each(mp, &final_ranges) { + mlp = list_entry(mp, struct cma_init_memrange, list); + size = min(sizeleft, mlp->size); + if (memblock_reserve(mlp->base, size)) { + /* + * Unexpected error. Could go on to + * the next one, but just abort to + * be safe. 
+ */ + failed = mlp; + break; + } + + pr_debug("created region %d: %016llx - %016llx\n", + nr, (u64)mlp->base, (u64)mlp->base + size); + cmrp = &cma->ranges[nr++]; + cmrp->base_pfn = PHYS_PFN(mlp->base); + cmrp->count = size >> PAGE_SHIFT; + + sizeleft -= size; + if (sizeleft == 0) + break; + } + + if (failed) { + list_for_each(mp, &final_ranges) { + mlp = list_entry(mp, struct cma_init_memrange, list); + if (mlp == failed) + break; + memblock_phys_free(mlp->base, mlp->size); + } + cma_drop_area(cma); + ret = -ENOMEM; + goto out; + } + + cma->nranges = nr; *res_cma = cma; - cma_area_count++; - totalcma_pages += cma->count; - return 0; +out: + if (ret != 0) + pr_err("Failed to reserve %lu MiB\n", + (unsigned long)total_size / SZ_1M); + else + pr_info("Reserved %lu MiB in %d range%s\n", + (unsigned long)total_size / SZ_1M, nr, + nr > 1 ? "s" : ""); + + return ret; } /** @@ -241,6 +536,26 @@ int __init cma_declare_contiguous_nid(phys_addr_t base, phys_addr_t alignment, unsigned int order_per_bit, bool fixed, const char *name, struct cma **res_cma, int nid) +{ + int ret; + + ret = __cma_declare_contiguous_nid(base, size, limit, alignment, + order_per_bit, fixed, name, res_cma, nid); + if (ret != 0) + pr_err("Failed to reserve %ld MiB\n", + (unsigned long)size / SZ_1M); + else + pr_info("Reserved %ld MiB at %pa\n", + (unsigned long)size / SZ_1M, &base); + + return ret; +} + +static int __init __cma_declare_contiguous_nid(phys_addr_t base, + phys_addr_t size, phys_addr_t limit, + phys_addr_t alignment, unsigned int order_per_bit, + bool fixed, const char *name, struct cma **res_cma, + int nid) { phys_addr_t memblock_end = memblock_end_of_DRAM(); phys_addr_t highmem_start; @@ -273,10 +588,9 @@ int __init cma_declare_contiguous_nid(phys_addr_t base, /* Sanitise input arguments. */ alignment = max_t(phys_addr_t, alignment, CMA_MIN_ALIGNMENT_BYTES); if (fixed && base & (alignment - 1)) { - ret = -EINVAL; pr_err("Region at %pa must be aligned to %pa bytes\n", &base, &alignment); - goto err; + return -EINVAL; } base = ALIGN(base, alignment); size = ALIGN(size, alignment); @@ -294,10 +608,9 @@ int __init cma_declare_contiguous_nid(phys_addr_t base, * low/high memory boundary. 
*/ if (fixed && base < highmem_start && base + size > highmem_start) { - ret = -EINVAL; pr_err("Region at %pa defined on low/high memory boundary (%pa)\n", &base, &highmem_start); - goto err; + return -EINVAL; } /* @@ -309,18 +622,16 @@ int __init cma_declare_contiguous_nid(phys_addr_t base, limit = memblock_end; if (base + size > limit) { - ret = -EINVAL; pr_err("Size (%pa) of region at %pa exceeds limit (%pa)\n", &size, &base, &limit); - goto err; + return -EINVAL; } /* Reserve memory */ if (fixed) { if (memblock_is_region_reserved(base, size) || memblock_reserve(base, size) < 0) { - ret = -EBUSY; - goto err; + return -EBUSY; } } else { phys_addr_t addr = 0; @@ -357,10 +668,8 @@ int __init cma_declare_contiguous_nid(phys_addr_t base, if (!addr) { addr = memblock_alloc_range_nid(size, alignment, base, limit, nid, true); - if (!addr) { - ret = -ENOMEM; - goto err; - } + if (!addr) + return -ENOMEM; } /* @@ -373,75 +682,67 @@ int __init cma_declare_contiguous_nid(phys_addr_t base, ret = cma_init_reserved_mem(base, size, order_per_bit, name, res_cma); if (ret) - goto free_mem; - - pr_info("Reserved %ld MiB at %pa on node %d\n", (unsigned long)size / SZ_1M, - &base, nid); - return 0; + memblock_phys_free(base, size); -free_mem: - memblock_phys_free(base, size); -err: - pr_err("Failed to reserve %ld MiB on node %d\n", (unsigned long)size / SZ_1M, - nid); return ret; } static void cma_debug_show_areas(struct cma *cma) { unsigned long next_zero_bit, next_set_bit, nr_zero; - unsigned long start = 0; + unsigned long start; unsigned long nr_part; - unsigned long nbits = cma_bitmap_maxno(cma); + unsigned long nbits; + int r; + struct cma_memrange *cmr; spin_lock_irq(&cma->lock); pr_info("number of available pages: "); - for (;;) { - next_zero_bit = find_next_zero_bit(cma->bitmap, nbits, start); - if (next_zero_bit >= nbits) - break; - next_set_bit = find_next_bit(cma->bitmap, nbits, next_zero_bit); - nr_zero = next_set_bit - next_zero_bit; - nr_part = nr_zero << cma->order_per_bit; - pr_cont("%s%lu@%lu", start ? "+" : "", nr_part, - next_zero_bit); - start = next_zero_bit + nr_zero; + for (r = 0; r < cma->nranges; r++) { + cmr = &cma->ranges[r]; + + start = 0; + nbits = cma_bitmap_maxno(cma, cmr); + + pr_info("range %d: ", r); + for (;;) { + next_zero_bit = find_next_zero_bit(cmr->bitmap, + nbits, start); + if (next_zero_bit >= nbits) + break; + next_set_bit = find_next_bit(cmr->bitmap, nbits, + next_zero_bit); + nr_zero = next_set_bit - next_zero_bit; + nr_part = nr_zero << cma->order_per_bit; + pr_cont("%s%lu@%lu", start ? "+" : "", nr_part, + next_zero_bit); + start = next_zero_bit + nr_zero; + } + pr_info("\n"); } pr_cont("=> %lu free of %lu total pages\n", cma->available_count, cma->count); spin_unlock_irq(&cma->lock); } -static struct page *__cma_alloc(struct cma *cma, unsigned long count, - unsigned int align, gfp_t gfp) +static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr, + unsigned long count, unsigned int align, + struct page **pagep, gfp_t gfp) { unsigned long mask, offset; unsigned long pfn = -1; unsigned long start = 0; unsigned long bitmap_maxno, bitmap_no, bitmap_count; - unsigned long i; + int ret = -EBUSY; struct page *page = NULL; - int ret = -ENOMEM; - const char *name = cma ? 
cma->name : NULL; - - trace_cma_alloc_start(name, count, align); - - if (!cma || !cma->count || !cma->bitmap) - return page; - - pr_debug("%s(cma %p, name: %s, count %lu, align %d)\n", __func__, - (void *)cma, cma->name, count, align); - - if (!count) - return page; mask = cma_bitmap_aligned_mask(cma, align); - offset = cma_bitmap_aligned_offset(cma, align); - bitmap_maxno = cma_bitmap_maxno(cma); + offset = cma_bitmap_aligned_offset(cma, cmr, align); + bitmap_maxno = cma_bitmap_maxno(cma, cmr); bitmap_count = cma_bitmap_pages_to_bits(cma, count); if (bitmap_count > bitmap_maxno) - return page; + goto out; for (;;) { spin_lock_irq(&cma->lock); @@ -453,14 +754,14 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count, spin_unlock_irq(&cma->lock); break; } - bitmap_no = bitmap_find_next_zero_area_off(cma->bitmap, + bitmap_no = bitmap_find_next_zero_area_off(cmr->bitmap, bitmap_maxno, start, bitmap_count, mask, offset); if (bitmap_no >= bitmap_maxno) { spin_unlock_irq(&cma->lock); break; } - bitmap_set(cma->bitmap, bitmap_no, bitmap_count); + bitmap_set(cmr->bitmap, bitmap_no, bitmap_count); cma->available_count -= count; /* * It's safe to drop the lock here. We've marked this region for @@ -469,7 +770,7 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count, */ spin_unlock_irq(&cma->lock); - pfn = cma->base_pfn + (bitmap_no << cma->order_per_bit); + pfn = cmr->base_pfn + (bitmap_no << cma->order_per_bit); mutex_lock(&cma_mutex); ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, gfp); mutex_unlock(&cma_mutex); @@ -478,7 +779,7 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count, break; } - cma_clear_bitmap(cma, pfn, count); + cma_clear_bitmap(cma, cmr, pfn, count); if (ret != -EBUSY) break; @@ -490,6 +791,38 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count, /* try again with a bit different memory target */ start = bitmap_no + mask + 1; } +out: + *pagep = page; + return ret; +} + +static struct page *__cma_alloc(struct cma *cma, unsigned long count, + unsigned int align, gfp_t gfp) +{ + struct page *page = NULL; + int ret = -ENOMEM, r; + unsigned long i; + const char *name = cma ? cma->name : NULL; + + trace_cma_alloc_start(name, count, align); + + if (!cma || !cma->count) + return page; + + pr_debug("%s(cma %p, name: %s, count %lu, align %d)\n", __func__, + (void *)cma, cma->name, count, align); + + if (!count) + return page; + + for (r = 0; r < cma->nranges; r++) { + page = NULL; + + ret = cma_range_alloc(cma, &cma->ranges[r], count, align, + &page, gfp); + if (ret != -EBUSY || page) + break; + } /* * CMA can allocate multiple page blocks, which results in different @@ -508,7 +841,8 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count, } pr_debug("%s(): returned %p\n", __func__, page); - trace_cma_alloc_finish(name, pfn, page, count, align, ret); + trace_cma_alloc_finish(name, page ? 
page_to_pfn(page) : 0, + page, count, align, ret); if (page) { count_vm_event(CMA_ALLOC_SUCCESS); cma_sysfs_account_success_pages(cma, count); @@ -551,20 +885,31 @@ struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp) bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned long count) { - unsigned long pfn; + unsigned long pfn, end; + int r; + struct cma_memrange *cmr; + bool ret; - if (!cma || !pages) + if (!cma || !pages || count > cma->count) return false; pfn = page_to_pfn(pages); + ret = false; - if (pfn < cma->base_pfn || pfn >= cma->base_pfn + cma->count) { - pr_debug("%s(page %p, count %lu)\n", __func__, - (void *)pages, count); - return false; + for (r = 0; r < cma->nranges; r++) { + cmr = &cma->ranges[r]; + end = cmr->base_pfn + cmr->count; + if (pfn >= cmr->base_pfn && pfn < end) { + ret = pfn + count <= end; + break; + } } - return true; + if (!ret) + pr_debug("%s(page %p, count %lu)\n", + __func__, (void *)pages, count); + + return ret; } /** @@ -580,19 +925,32 @@ bool cma_pages_valid(struct cma *cma, const struct page *pages, bool cma_release(struct cma *cma, const struct page *pages, unsigned long count) { - unsigned long pfn; + struct cma_memrange *cmr; + unsigned long pfn, end_pfn; + int r; + + pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages, count); if (!cma_pages_valid(cma, pages, count)) return false; - pr_debug("%s(page %p, count %lu)\n", __func__, (void *)pages, count); - pfn = page_to_pfn(pages); + end_pfn = pfn + count; + + for (r = 0; r < cma->nranges; r++) { + cmr = &cma->ranges[r]; + if (pfn >= cmr->base_pfn && + pfn < (cmr->base_pfn + cmr->count)) { + VM_BUG_ON(end_pfn > cmr->base_pfn + cmr->count); + break; + } + } - VM_BUG_ON(pfn + count > cma->base_pfn + cma->count); + if (r == cma->nranges) + return false; free_contig_range(pfn, count); - cma_clear_bitmap(cma, pfn, count); + cma_clear_bitmap(cma, cmr, pfn, count); cma_sysfs_account_release_pages(cma, count); trace_cma_release(cma->name, pfn, pages, count); diff --git a/mm/cma.h b/mm/cma.h index 3dd3376ae980..5f39dd1aac91 100644 --- a/mm/cma.h +++ b/mm/cma.h @@ -10,19 +10,35 @@ struct cma_kobject { struct cma *cma; }; +/* + * Multi-range support. This can be useful if the size of the allocation + * is not expected to be larger than the alignment (like with hugetlb_cma), + * and the total amount of memory requested, while smaller than the total + * amount of memory available, is large enough that it doesn't fit in a + * single physical memory range because of memory holes. 
+ */ +struct cma_memrange { + unsigned long base_pfn; + unsigned long count; + unsigned long *bitmap; +#ifdef CONFIG_CMA_DEBUGFS + struct debugfs_u32_array dfs_bitmap; +#endif +}; +#define CMA_MAX_RANGES 8 + struct cma { - unsigned long base_pfn; unsigned long count; unsigned long available_count; - unsigned long *bitmap; unsigned int order_per_bit; /* Order of pages represented by one bit */ spinlock_t lock; #ifdef CONFIG_CMA_DEBUGFS struct hlist_head mem_head; spinlock_t mem_head_lock; - struct debugfs_u32_array dfs_bitmap; #endif char name[CMA_MAX_NAME]; + int nranges; + struct cma_memrange ranges[CMA_MAX_RANGES]; #ifdef CONFIG_CMA_SYSFS /* the number of CMA page successful allocations */ atomic64_t nr_pages_succeeded; @@ -39,9 +55,10 @@ struct cma { extern struct cma cma_areas[MAX_CMA_AREAS]; extern unsigned int cma_area_count; -static inline unsigned long cma_bitmap_maxno(struct cma *cma) +static inline unsigned long cma_bitmap_maxno(struct cma *cma, + struct cma_memrange *cmr) { - return cma->count >> cma->order_per_bit; + return cmr->count >> cma->order_per_bit; } #ifdef CONFIG_CMA_SYSFS diff --git a/mm/cma_debug.c b/mm/cma_debug.c index 89236f22230a..fdf899532ca0 100644 --- a/mm/cma_debug.c +++ b/mm/cma_debug.c @@ -46,17 +46,26 @@ DEFINE_DEBUGFS_ATTRIBUTE(cma_used_fops, cma_used_get, NULL, "%llu\n"); static int cma_maxchunk_get(void *data, u64 *val) { struct cma *cma = data; + struct cma_memrange *cmr; unsigned long maxchunk = 0; - unsigned long start, end = 0; - unsigned long bitmap_maxno = cma_bitmap_maxno(cma); + unsigned long start, end; + unsigned long bitmap_maxno; + int r; spin_lock_irq(&cma->lock); - for (;;) { - start = find_next_zero_bit(cma->bitmap, bitmap_maxno, end); - if (start >= bitmap_maxno) - break; - end = find_next_bit(cma->bitmap, bitmap_maxno, start); - maxchunk = max(end - start, maxchunk); + for (r = 0; r < cma->nranges; r++) { + cmr = &cma->ranges[r]; + bitmap_maxno = cma_bitmap_maxno(cma, cmr); + end = 0; + for (;;) { + start = find_next_zero_bit(cmr->bitmap, + bitmap_maxno, end); + if (start >= bitmap_maxno) + break; + end = find_next_bit(cmr->bitmap, bitmap_maxno, + start); + maxchunk = max(end - start, maxchunk); + } } spin_unlock_irq(&cma->lock); *val = (u64)maxchunk << cma->order_per_bit; @@ -159,24 +168,41 @@ DEFINE_DEBUGFS_ATTRIBUTE(cma_alloc_fops, NULL, cma_alloc_write, "%llu\n"); static void cma_debugfs_add_one(struct cma *cma, struct dentry *root_dentry) { - struct dentry *tmp; + struct dentry *tmp, *dir, *rangedir; + int r; + char rdirname[12]; + struct cma_memrange *cmr; tmp = debugfs_create_dir(cma->name, root_dentry); debugfs_create_file("alloc", 0200, tmp, cma, &cma_alloc_fops); debugfs_create_file("free", 0200, tmp, cma, &cma_free_fops); - debugfs_create_file("base_pfn", 0444, tmp, - &cma->base_pfn, &cma_debugfs_fops); debugfs_create_file("count", 0444, tmp, &cma->count, &cma_debugfs_fops); debugfs_create_file("order_per_bit", 0444, tmp, &cma->order_per_bit, &cma_debugfs_fops); debugfs_create_file("used", 0444, tmp, cma, &cma_used_fops); debugfs_create_file("maxchunk", 0444, tmp, cma, &cma_maxchunk_fops); - cma->dfs_bitmap.array = (u32 *)cma->bitmap; - cma->dfs_bitmap.n_elements = DIV_ROUND_UP(cma_bitmap_maxno(cma), - BITS_PER_BYTE * sizeof(u32)); - debugfs_create_u32_array("bitmap", 0444, tmp, &cma->dfs_bitmap); + rangedir = debugfs_create_dir("ranges", tmp); + for (r = 0; r < cma->nranges; r++) { + cmr = &cma->ranges[r]; + snprintf(rdirname, sizeof(rdirname), "%d", r); + dir = debugfs_create_dir(rdirname, rangedir); + 
debugfs_create_file("base_pfn", 0444, dir,
+				    &cmr->base_pfn, &cma_debugfs_fops);
+		cmr->dfs_bitmap.array = (u32 *)cmr->bitmap;
+		cmr->dfs_bitmap.n_elements =
+			DIV_ROUND_UP(cma_bitmap_maxno(cma, cmr),
+				     BITS_PER_BYTE * sizeof(u32));
+		debugfs_create_u32_array("bitmap", 0444, dir,
+					 &cmr->dfs_bitmap);
+	}
+
+	/*
+	 * Backward compatible symlinks to range 0 for base_pfn and bitmap.
+	 */
+	debugfs_create_symlink("base_pfn", tmp, "ranges/0/base_pfn");
+	debugfs_create_symlink("bitmap", tmp, "ranges/0/bitmap");
 }
 
 static int __init cma_debugfs_init(void)
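To make the new interface concrete, a hypothetical early-boot caller might look like the sketch below. Everything here is an assumption for illustration: the "demo" name, the 64 GiB size, and the choice of one bitmap bit per 1 GiB chunk (PUD_SHIFT - PAGE_SHIFT on x86_64, following the hugetlb_cma convention). Only the cma_declare_contiguous_multi signature comes from the patch above.

	#include <linux/cma.h>
	#include <linux/numa.h>
	#include <linux/pgtable.h>
	#include <linux/printk.h>

	static struct cma *demo_cma;

	static void __init demo_cma_reserve(void)
	{
		phys_addr_t size = 64ULL << 30;			/* 64 GiB, hypothetical */
		unsigned int order = PUD_SHIFT - PAGE_SHIFT;	/* 1 GiB granularity */

		/* An alignment of PAGE_SIZE << order keeps each range 1 GiB aligned. */
		if (cma_declare_contiguous_multi(size, PAGE_SIZE << order, order,
						 "demo", &demo_cma, NUMA_NO_NODE))
			pr_warn("demo: multi-range CMA reservation failed\n");
	}

On success, the area behaves like any other CMA area: __cma_alloc simply tries each range in turn, so a caller that never asks for more than the per-range contiguous size in one allocation is unaffected by the splitting.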
From patchwork Fri Feb 28 18:29:04 2025
Date: Fri, 28 Feb 2025 18:29:04 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-4-fvdl@google.com>
Subject: [PATCH v5 03/27] mm/cma: introduce cma_intersects function
From: Frank van der Linden <fvdl@google.com>
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com, Frank van der Linden, Heiko Carstens, Vasily Gorbik, Alexander Gordeev, linux-s390@vger.kernel.org
Now that CMA areas can have multiple physical ranges, code can't assume
a CMA struct represents a base_pfn plus a size, as returned from
cma_get_base.

Most cases are ok though, since they all explicitly refer to CMA areas
that were created using existing interfaces (cma_declare_contiguous_nid
or cma_init_reserved_mem), which guarantees they have just one physical
range.

An exception is the s390 code, which walks all CMA ranges to see if they
intersect with a range of memory that is about to be hotremoved. So, in
the future, it might run into multi-range areas. To keep this check
working, define a cma_intersects function. This just checks if a
physaddr range intersects any of the ranges. Use it in the s390 check.
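As an aside, cma_intersects is the usual interval-overlap test, written
to keep the boundary behavior of the old s390 check. A standalone sketch
of the predicate (hypothetical helper name, not part of the patch):

#include <stdbool.h>

/*
 * A physaddr range intersects a CMA range unless it lies entirely
 * below or entirely above it.
 */
static bool range_intersects(unsigned long start, unsigned long end,
			     unsigned long rstart, unsigned long rend)
{
	if (end < rstart)	/* entirely below this range */
		return false;
	if (start >= rend)	/* entirely above this range */
		return false;
	return true;
}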
Cc: Heiko Carstens
Cc: Vasily Gorbik
Cc: Alexander Gordeev
Cc: linux-s390@vger.kernel.org
Acked-by: Alexander Gordeev
Signed-off-by: Frank van der Linden
---
 arch/s390/mm/init.c | 13 +++++--------
 include/linux/cma.h |  1 +
 mm/cma.c            | 21 +++++++++++++++++++++
 3 files changed, 27 insertions(+), 8 deletions(-)

diff --git a/arch/s390/mm/init.c b/arch/s390/mm/init.c
index f2298f7a3f21..d88cb1c13f7d 100644
--- a/arch/s390/mm/init.c
+++ b/arch/s390/mm/init.c
@@ -239,16 +239,13 @@ struct s390_cma_mem_data {
 static int s390_cma_check_range(struct cma *cma, void *data)
 {
 	struct s390_cma_mem_data *mem_data;
-	unsigned long start, end;
 
 	mem_data = data;
-	start = cma_get_base(cma);
-	end = start + cma_get_size(cma);
-	if (end < mem_data->start)
-		return 0;
-	if (start >= mem_data->end)
-		return 0;
-	return -EBUSY;
+
+	if (cma_intersects(cma, mem_data->start, mem_data->end))
+		return -EBUSY;
+
+	return 0;
 }
 
 static int s390_cma_mem_notifier(struct notifier_block *nb,
diff --git a/include/linux/cma.h b/include/linux/cma.h
index 863427c27dc2..03d85c100dcc 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -53,6 +53,7 @@ extern bool cma_pages_valid(struct cma *cma, const struct page *pages, unsigned
 extern bool cma_release(struct cma *cma, const struct page *pages, unsigned long count);
 
 extern int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data);
+extern bool cma_intersects(struct cma *cma, unsigned long start, unsigned long end);
 
 extern void cma_reserve_pages_on_error(struct cma *cma);
 
diff --git a/mm/cma.c b/mm/cma.c
index 34caa6b29c99..8dc46bfa3819 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -978,3 +978,24 @@ int cma_for_each_area(int (*it)(struct cma *cma, void *data), void *data)
 
 	return 0;
 }
+
+bool cma_intersects(struct cma *cma, unsigned long start, unsigned long end)
+{
+	int r;
+	struct cma_memrange *cmr;
+	unsigned long rstart, rend;
+
+	for (r = 0; r < cma->nranges; r++) {
+		cmr = &cma->ranges[r];
+
+		rstart = PFN_PHYS(cmr->base_pfn);
+		rend = PFN_PHYS(cmr->base_pfn + cmr->count);
+		if (end < rstart)
+			continue;
+		if (start >= rend)
+			continue;
+		return true;
+	}
+
+	return false;
+}

From patchwork Fri Feb 28 18:29:05 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13996903
Date: Fri, 28 Feb 2025 18:29:05 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-5-fvdl@google.com>
Subject: [PATCH v5 04/27] mm, hugetlb: use cma_declare_contiguous_multi
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com,
 Frank van der Linden

hugetlb_cma is fine with using multiple CMA ranges, as long as it can
get its gigantic pages allocated from them. So, use
cma_declare_contiguous_multi to allow for multiple ranges, increasing
the chances of getting what we want on systems with gaps in physical
memory.
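To make the motivation concrete, here is a small userspace model (all
names and numbers invented) of satisfying one large reservation from
several disjoint free ranges, roughly the strategy a multi-range
declaration can use when no single gap-free region is big enough:

#include <stdio.h>

struct mem_range { unsigned long long start, end; };

/* Greedy split of one request across free ranges (illustration only). */
static unsigned long long reserve_multi(struct mem_range *ranges, int n,
					unsigned long long need,
					unsigned long long align)
{
	unsigned long long got = 0;

	for (int i = 0; i < n && got < need; i++) {
		unsigned long long start = (ranges[i].start + align - 1) & ~(align - 1);

		if (start >= ranges[i].end)
			continue;
		unsigned long long avail = (ranges[i].end - start) & ~(align - 1);
		unsigned long long take = avail < need - got ? avail : need - got;

		got += take;
		printf("range %d: take %llu MiB at %llu MiB\n",
		       i, take >> 20, start >> 20);
	}
	return got;
}

int main(void)
{
	/* Hypothetical layout: usable 0-3G and 4-6G, hole at 3-4G. */
	struct mem_range ranges[] = {
		{ 0, 3ULL << 30 },
		{ 4ULL << 30, 6ULL << 30 },
	};
	unsigned long long need = 4ULL << 30;	/* one 4G request */
	unsigned long long got = reserve_multi(ranges, 2, need, 1ULL << 30);

	/* A single contiguous 4G area would fail; 3G + 1G succeeds. */
	printf("satisfied %llu of %llu MiB\n", got >> 20, need >> 20);
	return 0;
}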
Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 163190e89ea1..fadfacf56066 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -7738,9 +7738,8 @@ void __init hugetlb_cma_reserve(int order)
 	 * may be returned to CMA allocator in the case of
 	 * huge page demotion.
 	 */
-	res = cma_declare_contiguous_nid(0, size, 0,
-					 PAGE_SIZE << order,
-					 HUGETLB_PAGE_ORDER, false, name,
+	res = cma_declare_contiguous_multi(size, PAGE_SIZE << order,
+					   HUGETLB_PAGE_ORDER, name,
 					 &hugetlb_cma[nid], nid);
 	if (res) {
 		pr_warn("hugetlb_cma: reservation failed: err %d, node %d",

From patchwork Fri Feb 28 18:29:06 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13996904
Date: Fri, 28 Feb 2025 18:29:06 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-6-fvdl@google.com>
Subject: [PATCH v5 05/27] mm/hugetlb: remove redundant __ClearPageReserved
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com,
 Frank van der Linden, Oscar Salvador
In hugetlb_folio_init_tail_vmemmap, the reserved flag is cleared for
the tail page just before it is zeroed out, which is redundant. Remove
the __ClearPageReserved call.
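A simplified model of why the call is dead code: __init_single_page()
reinitializes page->flags wholesale, so a flag cleared immediately
beforehand is overwritten anyway (stand-in types, not kernel code):

#include <stdio.h>

#define PG_RESERVED	(1UL << 0)	/* stand-in flag bit */

struct page_model { unsigned long flags; };

/* Models __init_single_page(): the struct is wiped before flags are set. */
static void init_single_page_model(struct page_model *p)
{
	p->flags = 0;	/* everything cleared, including "reserved" */
	/* ... zone/node/section links would be set here ... */
}

int main(void)
{
	struct page_model tail = { .flags = PG_RESERVED };

	/* The removed call, modeled: tail.flags &= ~PG_RESERVED; */
	init_single_page_model(&tail);	/* clears it regardless */
	printf("flags after init: %#lx\n", tail.flags);
	return 0;
}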
Reviewed-by: Oscar Salvador
Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index fadfacf56066..d6d7ebc75b86 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3198,7 +3198,6 @@ static void __init hugetlb_folio_init_tail_vmemmap(struct folio *folio,
 	for (pfn = head_pfn + start_page_number; pfn < end_pfn; pfn++) {
 		struct page *page = pfn_to_page(pfn);
 
-		__ClearPageReserved(folio_page(folio, pfn - head_pfn));
 		__init_single_page(page, pfn, zone, nid);
 		prep_compound_tail((struct page *)folio, pfn - head_pfn);
 		ret = page_ref_freeze(page, 1);

From patchwork Fri Feb 28 18:29:07 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13996905
Date: Fri, 28 Feb 2025 18:29:07 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-7-fvdl@google.com>
Subject: [PATCH v5 06/27] mm/hugetlb: use online nodes for bootmem allocation
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com,
 Frank van der Linden
Later commits will move hugetlb bootmem allocation to earlier in init,
when N_MEMORY has not yet been set on nodes. Use online nodes instead.
At most, this wastes just a few cycles once during boot (and most
likely none).
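A toy model of the round-robin walk this changes (hypothetical 4-node
box with node 3 online but memoryless), showing that widening the mask
from N_MEMORY to N_ONLINE only adds a cheap skip:

#include <stdbool.h>
#include <stdio.h>

#define NR_NODES 4

static bool node_online[NR_NODES]  = { true, true, true, true };
static bool node_has_mem[NR_NODES] = { true, true, true, false };

/* Try each online node once, starting after the previous success. */
static int alloc_round_robin(int *next_nid)
{
	for (int tries = 0; tries < NR_NODES; tries++) {
		int nid = (*next_nid + tries) % NR_NODES;

		if (!node_online[nid])
			continue;
		if (!node_has_mem[nid])	/* the "few wasted cycles" */
			continue;
		*next_nid = (nid + 1) % NR_NODES;
		return nid;
	}
	return -1;
}

int main(void)
{
	int next = 0;

	/* Pages interleave over nodes 0-2; node 3 costs one failed check. */
	for (int i = 0; i < 6; i++)
		printf("page %d -> node %d\n", i, alloc_round_robin(&next));
	return 0;
}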
Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index d6d7ebc75b86..0592c076cd36 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3152,7 +3152,7 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 		goto found;
 	}
 	/* allocate from next node when distributing huge pages */
-	for_each_node_mask_to_alloc(&h->next_nid_to_alloc, nr_nodes, node, &node_states[N_MEMORY]) {
+	for_each_node_mask_to_alloc(&h->next_nid_to_alloc, nr_nodes, node, &node_states[N_ONLINE]) {
 		m = memblock_alloc_try_nid_raw(
 				huge_page_size(h), huge_page_size(h),
 				0, MEMBLOCK_ALLOC_ACCESSIBLE, node);
@@ -4546,8 +4546,8 @@ void __init hugetlb_add_hstate(unsigned int order)
 	for (i = 0; i < MAX_NUMNODES; ++i)
 		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
 	INIT_LIST_HEAD(&h->hugepage_activelist);
-	h->next_nid_to_alloc = first_memory_node;
-	h->next_nid_to_free = first_memory_node;
+	h->next_nid_to_alloc = first_online_node;
+	h->next_nid_to_free = first_online_node;
 	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
 					huge_page_size(h)/SZ_1K);

From patchwork Fri Feb 28 18:29:08 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13996906
Date: Fri, 28 Feb 2025 18:29:08 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-8-fvdl@google.com>
Subject: [PATCH v5 07/27] mm/hugetlb: convert cmdline parameters from setup to early
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com,
 Frank van der Linden
Convert the cmdline parameters (hugepagesz, hugepages,
default_hugepagesz and hugetlb_free_vmemmap) to early parameters.

Since parse_early_param might run before MMU setup on some platforms
(powerpc), validation of huge page sizes as specified in command line
parameters would fail. So instead, for the hstate-related values, just
record them and parse them on demand, from hugetlb_bootmem_alloc.

The allocation of hugetlb bootmem pages is now done in
hugetlb_bootmem_alloc, which is called explicitly at the start of
mm_core_init(). core_initcall would be too late, as that happens with
memblock already torn down.

This change will allow earlier allocation and initialization of
bootmem hugetlb pages later on.

No functional change intended.
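The record-now-parse-later pattern used here, reduced to a standalone
userspace sketch (names and sizes invented; the real code keeps the
strings in hstate_cmdline_buf and replays them from
hugetlb_parse_params()):

#include <stdio.h>
#include <string.h>

#define MAX_ARGS 8
#define BUF_SIZE 256

/* Recorded option: the saved string plus the handler to run later. */
struct saved_param {
	char *val;
	int (*setup)(char *val);
};

static char buf[BUF_SIZE];
static int buf_used;
static struct saved_param params[MAX_ARGS];
static int nparams;

/* Called at "early param" time: only copy the string, don't parse. */
static int add_param(char *s, int (*setup)(char *))
{
	size_t len = strlen(s) + 1;

	if (nparams >= MAX_ARGS || buf_used + len > BUF_SIZE)
		return -1;
	params[nparams].val = memcpy(buf + buf_used, s, len);
	params[nparams].setup = setup;
	buf_used += len;
	nparams++;
	return 0;
}

/* Called later, once parsing is actually safe. */
static void parse_params(void)
{
	for (int i = 0; i < nparams; i++)
		params[i].setup(params[i].val);
}

static int hugepagesz_handler(char *val)
{
	printf("now validating hugepagesz=%s\n", val);
	return 0;
}

int main(void)
{
	add_param("1G", hugepagesz_handler);	/* boot: record only */
	parse_params();				/* later: replay */
	return 0;
}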
Signed-off-by: Frank van der Linden
---
 .../admin-guide/kernel-parameters.txt |  14 +-
 include/linux/hugetlb.h               |   6 +
 mm/hugetlb.c                          | 133 ++++++++++++++----
 mm/hugetlb_vmemmap.c                  |   6 +-
 mm/mm_init.c                          |   3 +
 5 files changed, 126 insertions(+), 36 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index fb8752b42ec8..ae21d911d1c7 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1861,7 +1861,7 @@
 	hpet_mmap=	[X86, HPET_MMAP] Allow userspace to mmap HPET
 			registers.  Default set by CONFIG_HPET_MMAP_DEFAULT.
 
-	hugepages=	[HW] Number of HugeTLB pages to allocate at boot.
+	hugepages=	[HW,EARLY] Number of HugeTLB pages to allocate at boot.
 			If this follows hugepagesz (below), it specifies
 			the number of pages of hugepagesz to be allocated.
 			If this is the first HugeTLB parameter on the command
@@ -1873,12 +1873,12 @@
 			:[,:]
 
 	hugepagesz=
-			[HW] The size of the HugeTLB pages. This is used in
-			conjunction with hugepages (above) to allocate huge
-			pages of a specific size at boot. The pair
-			hugepagesz=X hugepages=Y can be specified once for
-			each supported huge page size. Huge page sizes are
-			architecture dependent. See also
+			[HW,EARLY] The size of the HugeTLB pages. This is
+			used in conjunction with hugepages (above) to
+			allocate huge pages of a specific size at boot. The
+			pair hugepagesz=X hugepages=Y can be specified once
+			for each supported huge page size. Huge page sizes
+			are architecture dependent. See also
 			Documentation/admin-guide/mm/hugetlbpage.rst.
 			Format: size[KMG]
 
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index ec8c0ccc8f95..9cd7c9dacb88 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -174,6 +174,8 @@ struct address_space *hugetlb_folio_mapping_lock_write(struct folio *folio);
 extern int sysctl_hugetlb_shm_group;
 extern struct list_head huge_boot_pages[MAX_NUMNODES];
 
+void hugetlb_bootmem_alloc(void);
+
 /* arch callbacks */
 
 #ifndef CONFIG_HIGHPTE
@@ -1250,6 +1252,10 @@ static inline bool hugetlbfs_pagecache_present(
 {
 	return false;
 }
+
+static inline void hugetlb_bootmem_alloc(void)
+{
+}
 #endif /* CONFIG_HUGETLB_PAGE */
 
 static inline spinlock_t *huge_pte_lock(struct hstate *h,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 0592c076cd36..1a200f89e21a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -40,6 +40,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
@@ -62,6 +63,24 @@ static unsigned long hugetlb_cma_size __initdata;
 
 __initdata struct list_head huge_boot_pages[MAX_NUMNODES];
 
+/*
+ * Due to ordering constraints across the init code for various
+ * architectures, hugetlb hstate cmdline parameters can't simply
+ * be early_param. early_param might call the setup function
+ * before valid hugetlb page sizes are determined, leading to
+ * incorrect rejection of valid hugepagesz= options.
+ *
+ * So, record the parameters early and consume them whenever the
+ * init code is ready for them, by calling hugetlb_parse_params().
+ */
+
+/* one (hugepagesz=,hugepages=) pair per hstate, one default_hugepagesz */
+#define HUGE_MAX_CMDLINE_ARGS	(2 * HUGE_MAX_HSTATE + 1)
+struct hugetlb_cmdline {
+	char *val;
+	int (*setup)(char *val);
+};
+
 /* for command line parsing */
 static struct hstate * __initdata parsed_hstate;
 static unsigned long __initdata default_hstate_max_huge_pages;
@@ -69,6 +88,20 @@ static bool __initdata parsed_valid_hugepagesz = true;
 static bool __initdata parsed_default_hugepagesz;
 static unsigned int default_hugepages_in_node[MAX_NUMNODES] __initdata;
 
+static char hstate_cmdline_buf[COMMAND_LINE_SIZE] __initdata;
+static int hstate_cmdline_index __initdata;
+static struct hugetlb_cmdline hugetlb_params[HUGE_MAX_CMDLINE_ARGS] __initdata;
+static int hugetlb_param_index __initdata;
+static __init int hugetlb_add_param(char *s, int (*setup)(char *val));
+static __init void hugetlb_parse_params(void);
+
+#define hugetlb_early_param(str, func)					\
+static __init int func##args(char *s)					\
+{									\
+	return hugetlb_add_param(s, func);				\
+}									\
+early_param(str, func##args)
+
 /*
  * Protects updates to hugepage_freelists, hugepage_activelist, nr_huge_pages,
  * free_huge_pages, and surplus_huge_pages.
@@ -3484,6 +3517,8 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 
 		for (i = 0; i < MAX_NUMNODES; i++)
 			INIT_LIST_HEAD(&huge_boot_pages[i]);
+		h->next_nid_to_alloc = first_online_node;
+		h->next_nid_to_free = first_online_node;
 		initialized = true;
 	}
 
@@ -4546,8 +4581,6 @@ void __init hugetlb_add_hstate(unsigned int order)
 	for (i = 0; i < MAX_NUMNODES; ++i)
 		INIT_LIST_HEAD(&h->hugepage_freelists[i]);
 	INIT_LIST_HEAD(&h->hugepage_activelist);
-	h->next_nid_to_alloc = first_online_node;
-	h->next_nid_to_free = first_online_node;
 	snprintf(h->name, HSTATE_NAME_LEN, "hugepages-%lukB",
 					huge_page_size(h)/SZ_1K);
 
@@ -4572,6 +4605,42 @@ static void __init hugepages_clear_pages_in_node(void)
 	}
 }
 
+static __init int hugetlb_add_param(char *s, int (*setup)(char *))
+{
+	size_t len;
+	char *p;
+
+	if (hugetlb_param_index >= HUGE_MAX_CMDLINE_ARGS)
+		return -EINVAL;
+
+	len = strlen(s) + 1;
+	if (len + hstate_cmdline_index > sizeof(hstate_cmdline_buf))
+		return -EINVAL;
+
+	p = &hstate_cmdline_buf[hstate_cmdline_index];
+	memcpy(p, s, len);
+	hstate_cmdline_index += len;
+
+	hugetlb_params[hugetlb_param_index].val = p;
+	hugetlb_params[hugetlb_param_index].setup = setup;
+
+	hugetlb_param_index++;
+
+	return 0;
+}
+
+static __init void hugetlb_parse_params(void)
+{
+	int i;
+	struct hugetlb_cmdline *hcp;
+
+	for (i = 0; i < hugetlb_param_index; i++) {
+		hcp = &hugetlb_params[i];
+
+		hcp->setup(hcp->val);
+	}
+}
+
 /*
  * hugepages command line processing
  * hugepages normally follows a valid hugepagsz or default_hugepagsz
@@ -4591,7 +4660,7 @@ static int __init hugepages_setup(char *s)
 	if (!parsed_valid_hugepagesz) {
 		pr_warn("HugeTLB: hugepages=%s does not follow a valid hugepagesz, ignoring\n", s);
 		parsed_valid_hugepagesz = true;
-		return 1;
+		return -EINVAL;
 	}
 
 	/*
@@ -4645,24 +4714,16 @@
 		}
 	}
 
-	/*
-	 * Global state is always initialized later in hugetlb_init.
-	 * But we need to allocate gigantic hstates here early to still
-	 * use the bootmem allocator.
-	 */
-	if (hugetlb_max_hstate && hstate_is_gigantic(parsed_hstate))
-		hugetlb_hstate_alloc_pages(parsed_hstate);
 	last_mhp = mhp;
 
-	return 1;
+	return 0;
 
 invalid:
 	pr_warn("HugeTLB: Invalid hugepages parameter %s\n", p);
 	hugepages_clear_pages_in_node();
-	return 1;
+	return -EINVAL;
 }
-__setup("hugepages=", hugepages_setup);
+hugetlb_early_param("hugepages", hugepages_setup);
 
 /*
  * hugepagesz command line processing
@@ -4681,7 +4742,7 @@ static int __init hugepagesz_setup(char *s)
 
 	if (!arch_hugetlb_valid_size(size)) {
 		pr_err("HugeTLB: unsupported hugepagesz=%s\n", s);
-		return 1;
+		return -EINVAL;
 	}
 
 	h = size_to_hstate(size);
@@ -4696,7 +4757,7 @@
 		if (!parsed_default_hugepagesz ||  h != &default_hstate ||
 		    default_hstate.max_huge_pages) {
 			pr_warn("HugeTLB: hugepagesz=%s specified twice, ignoring\n", s);
-			return 1;
+			return -EINVAL;
 		}
 
 		/*
@@ -4706,14 +4767,14 @@
 		 */
 		parsed_hstate = h;
 		parsed_valid_hugepagesz = true;
-		return 1;
+		return 0;
 	}
 
 	hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
 	parsed_valid_hugepagesz = true;
-	return 1;
+	return 0;
 }
-__setup("hugepagesz=", hugepagesz_setup);
+hugetlb_early_param("hugepagesz", hugepagesz_setup);
 
 /*
  * default_hugepagesz command line input
@@ -4727,14 +4788,14 @@ static int __init default_hugepagesz_setup(char *s)
 	parsed_valid_hugepagesz = false;
 	if (parsed_default_hugepagesz) {
 		pr_err("HugeTLB: default_hugepagesz previously specified, ignoring %s\n", s);
-		return 1;
+		return -EINVAL;
 	}
 
 	size = (unsigned long)memparse(s, NULL);
 
 	if (!arch_hugetlb_valid_size(size)) {
 		pr_err("HugeTLB: unsupported default_hugepagesz=%s\n", s);
-		return 1;
+		return -EINVAL;
 	}
 
 	hugetlb_add_hstate(ilog2(size) - PAGE_SHIFT);
@@ -4751,17 +4812,33 @@
 	 */
 	if (default_hstate_max_huge_pages) {
 		default_hstate.max_huge_pages = default_hstate_max_huge_pages;
-		for_each_online_node(i)
-			default_hstate.max_huge_pages_node[i] =
-				default_hugepages_in_node[i];
-		if (hstate_is_gigantic(&default_hstate))
-			hugetlb_hstate_alloc_pages(&default_hstate);
+		/*
+		 * Since this is an early parameter, we can't check
+		 * NUMA node state yet, so loop through MAX_NUMNODES.
+		 */
+		for (i = 0; i < MAX_NUMNODES; i++) {
+			if (default_hugepages_in_node[i] != 0)
+				default_hstate.max_huge_pages_node[i] =
+					default_hugepages_in_node[i];
+		}
 		default_hstate_max_huge_pages = 0;
 	}
 
-	return 1;
+	return 0;
+}
+hugetlb_early_param("default_hugepagesz", default_hugepagesz_setup);
+
+void __init hugetlb_bootmem_alloc(void)
+{
+	struct hstate *h;
+
+	hugetlb_parse_params();
+
+	for_each_hstate(h) {
+		if (hstate_is_gigantic(h))
+			hugetlb_hstate_alloc_pages(h);
+	}
 }
-__setup("default_hugepagesz=", default_hugepagesz_setup);
 
 static unsigned int allowed_mems_nr(struct hstate *h)
 {
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 7735972add01..5b484758f813 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -444,7 +444,11 @@ DEFINE_STATIC_KEY_FALSE(hugetlb_optimize_vmemmap_key);
 EXPORT_SYMBOL(hugetlb_optimize_vmemmap_key);
 
 static bool vmemmap_optimize_enabled = IS_ENABLED(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP_DEFAULT_ON);
-core_param(hugetlb_free_vmemmap, vmemmap_optimize_enabled, bool, 0);
+static int __init hugetlb_vmemmap_optimize_param(char *buf)
+{
+	return kstrtobool(buf, &vmemmap_optimize_enabled);
+}
+early_param("hugetlb_free_vmemmap", hugetlb_vmemmap_optimize_param);
 
 static int __hugetlb_vmemmap_restore_folio(const struct hstate *h,
 					   struct folio *folio, unsigned long flags)
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 2630cc30147e..d2dee53e95dd 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -30,6 +30,7 @@
 #include
 #include
 #include
+#include
 #include "internal.h"
 #include "slab.h"
 #include "shuffle.h"
@@ -2641,6 +2642,8 @@ static void __init mem_init_print_info(void)
 */
 void __init mm_core_init(void)
 {
+	hugetlb_bootmem_alloc();
+
 	/* Initializations relying on SMP setup */
 	BUILD_BUG_ON(MAX_ZONELISTS > 2);
 	build_all_zonelists(NULL);

From patchwork Fri Feb 28 18:29:09 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13996907
Date: Fri, 28 Feb 2025 18:29:09 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-9-fvdl@google.com>
Subject: [PATCH v5 08/27] x86/mm: make register_page_bootmem_memmap handle PTE mappings
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
 roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com,
 Frank van der Linden, Dave Hansen, Andy Lutomirski, Peter Zijlstra,
 Dan Carpenter

register_page_bootmem_memmap expects that vmemmap pages handed to it
are PMD-mapped, and that the number of pages to call get_page_bootmem
on is PMD-aligned. This is currently a correct assumption, but will no
longer be true once pre-HVO of hugetlb pages is implemented.

Make it handle PTE-mapped vmemmap pages and a nr_pages argument that
is not necessarily PAGES_PER_SECTION.
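In outline, the reworked walk classifies each PMD as absent, PTE-mapped,
or a leaf, and for a leaf accounts exactly the pages it maps rather than
a fixed PMD's worth. A simplified standalone model of that control flow
(invented types and helpers; the real code uses pmd_offset()/pmd_leaf()
and calls get_page_bootmem()):

#include <stdio.h>

#define PAGE_SIZE	4096UL
#define PMD_SIZE	(512 * PAGE_SIZE)

enum pmd_kind { PMD_NONE, PMD_PTE_TABLE, PMD_LEAF };

/* Stand-in for pmd_offset()/pmd_leaf() on a made-up address space. */
static enum pmd_kind pmd_kind_at(unsigned long addr)
{
	return (addr / PMD_SIZE) % 2 ? PMD_PTE_TABLE : PMD_LEAF;
}

static unsigned long pmd_addr_end_model(unsigned long addr, unsigned long end)
{
	unsigned long next = (addr + PMD_SIZE) & ~(PMD_SIZE - 1);
	return next < end ? next : end;
}

static void walk(unsigned long addr, unsigned long end)
{
	unsigned long next;

	for (; addr < end; addr = next) {
		enum pmd_kind k = pmd_kind_at(addr);

		if (k != PMD_LEAF) {
			/* Absent or PTE-mapped: advance a single page. */
			next = addr + PAGE_SIZE;
			if (k == PMD_PTE_TABLE)
				printf("pte: 1 page at %#lx\n", addr);
		} else {
			/* Leaf: account every page this PMD maps; the
			 * last chunk may be shorter than a full PMD. */
			next = pmd_addr_end_model(addr, end);
			printf("leaf: %lu pages at %#lx\n",
			       (next - addr) / PAGE_SIZE, addr);
		}
	}
}

int main(void)
{
	/* One full leaf PMD, then three PTE-mapped pages. */
	walk(0, PMD_SIZE + 3 * PAGE_SIZE);
	return 0;
}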
Cc: Dave Hansen
Cc: Andy Lutomirski
Cc: Peter Zijlstra
Cc: Dan Carpenter
Signed-off-by: Frank van der Linden
---
 arch/x86/mm/init_64.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 01ea7c6df303..6e8e4ef5312a 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1599,11 +1599,14 @@ void register_page_bootmem_memmap(unsigned long section_nr,
 		}
 		get_page_bootmem(section_nr, pud_page(*pud), MIX_SECTION_INFO);

-		if (!boot_cpu_has(X86_FEATURE_PSE)) {
+		pmd = pmd_offset(pud, addr);
+		if (pmd_none(*pmd)) {
+			next = (addr + PAGE_SIZE) & PAGE_MASK;
+			continue;
+		}
+
+		if (!boot_cpu_has(X86_FEATURE_PSE) || !pmd_leaf(*pmd)) {
 			next = (addr + PAGE_SIZE) & PAGE_MASK;
-			pmd = pmd_offset(pud, addr);
-			if (pmd_none(*pmd))
-				continue;

 			get_page_bootmem(section_nr, pmd_page(*pmd),
 					 MIX_SECTION_INFO);
@@ -1614,12 +1617,7 @@ void register_page_bootmem_memmap(unsigned long section_nr,
 					 SECTION_INFO);
 		} else {
 			next = pmd_addr_end(addr, end);
-
-			pmd = pmd_offset(pud, addr);
-			if (pmd_none(*pmd))
-				continue;
-
-			nr_pmd_pages = 1 << get_order(PMD_SIZE);
+			nr_pmd_pages = (next - addr) >> PAGE_SHIFT;
 			page = pmd_page(*pmd);
 			while (nr_pmd_pages--)
 				get_page_bootmem(section_nr, page++,

From patchwork Fri Feb 28 18:29:10 2025
Date: Fri, 28 Feb 2025 18:29:10 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-10-fvdl@google.com>
Subject: [PATCH v5 09/27] mm/bootmem_info: export register_page_bootmem_memmap
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
    roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com,
    Frank van der Linden
If other mm code wants to use this function for early memmap
initialization (on the platforms that have it), it should be made
available properly, not just declared unconditionally in mm.h.

Make this function available for such cases.
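For a caller, the effect is that the declaration now comes from the
bootmem_info header, with a no-op stub when the feature is compiled out.
A hypothetical user (not part of this patch) would look roughly like:

	#include <linux/bootmem_info.h>

	/*
	 * Hypothetical example: register the vmemmap pages backing one
	 * section. With CONFIG_HAVE_BOOTMEM_INFO_NODE disabled, this
	 * compiles to nothing via the static inline stub added below.
	 */
	static void __init example_register_section(unsigned long section_nr,
						    struct page *map)
	{
		register_page_bootmem_memmap(section_nr, map, PAGES_PER_SECTION);
	}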
Signed-off-by: Frank van der Linden
---
 arch/powerpc/mm/init_64.c    | 4 ++++
 include/linux/bootmem_info.h | 7 +++++++
 include/linux/mm.h           | 3 ---
 3 files changed, 11 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/init_64.c b/arch/powerpc/mm/init_64.c
index d96bbc001e73..b6f3ae03ca9e 100644
--- a/arch/powerpc/mm/init_64.c
+++ b/arch/powerpc/mm/init_64.c
@@ -41,6 +41,7 @@
 #include
 #include
 #include
+#include <linux/bootmem_info.h>
 #include
 #include
@@ -386,10 +387,13 @@ void __ref vmemmap_free(unsigned long start, unsigned long end,
 }
 #endif
+
+#ifdef CONFIG_HAVE_BOOTMEM_INFO_NODE
 void register_page_bootmem_memmap(unsigned long section_nr,
 				  struct page *start_page, unsigned long size)
 {
 }
+#endif /* CONFIG_HAVE_BOOTMEM_INFO_NODE */
 #endif /* CONFIG_SPARSEMEM_VMEMMAP */
diff --git a/include/linux/bootmem_info.h b/include/linux/bootmem_info.h
index d8a8d245824a..4c506e76a808 100644
--- a/include/linux/bootmem_info.h
+++ b/include/linux/bootmem_info.h
@@ -18,6 +18,8 @@ enum bootmem_type {

 #ifdef CONFIG_HAVE_BOOTMEM_INFO_NODE
 void __init register_page_bootmem_info_node(struct pglist_data *pgdat);
+void register_page_bootmem_memmap(unsigned long section_nr, struct page *map,
+				  unsigned long nr_pages);
 void get_page_bootmem(unsigned long info, struct page *page,
 		      enum bootmem_type type);
@@ -58,6 +60,11 @@ static inline void register_page_bootmem_info_node(struct pglist_data *pgdat)
 {
 }

+static inline void register_page_bootmem_memmap(unsigned long section_nr,
+					struct page *map, unsigned long nr_pages)
+{
+}
+
 static inline void put_page_bootmem(struct page *page)
 {
 }
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 7b1068ddcbb7..6dfc41b461af 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3918,9 +3918,6 @@ static inline bool vmemmap_can_optimize(struct vmem_altmap *altmap,
 }
 #endif

-void register_page_bootmem_memmap(unsigned long section_nr, struct page *map,
-				  unsigned long nr_pages);
-
 enum mf_flags {
 	MF_COUNT_INCREASED = 1 << 0,
 	MF_ACTION_REQUIRED = 1 << 1,

From patchwork Fri Feb 28 18:29:11 2025
Date: Fri, 28 Feb 2025 18:29:11 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-11-fvdl@google.com>
Subject: [PATCH v5 10/27] mm/sparse: allow for alternate vmemmap section init at boot
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
    roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com,
    Frank van der Linden, Johannes Weiner

Add functions that are called just before the per-section memmap is
initialized, and just before the memmap page structures are initialized.
They are called sparse_vmemmap_init_nid_early and
sparse_vmemmap_init_nid_late, respectively.

This allows for mm subsystems to add calls to initialize memmap and page
structures in a specific way, if using SPARSEMEM_VMEMMAP. Specifically,
hugetlb can pre-HVO bootmem-allocated pages that way, so that no time
and resources are wasted on allocating vmemmap pages, only to free them
later (and possibly unnecessarily running the system out of memory in
the process).

Refactor some code and export a few convenience functions for external
use.

In sparse_init_nid, skip any sections that are already initialized, e.g.
because they have been initialized by sparse_vmemmap_init_nid_early
already.

The hugetlb code to use these functions will be added in a later commit.

Export section_map_size, as any alternate memmap init code will want to
use it.

The internal config option to enable this is SPARSEMEM_VMEMMAP_PREINIT,
which is selected if an architecture-specific option,
ARCH_WANT_HUGETLB_VMEMMAP_PREINIT, is set. In the future, if other
subsystems want to do preinit too, they can do it in a similar fashion.

The internal config option is there because a section flag is used, and
the number of flags available is architecture-dependent (see mmzone.h).
Architectures can decide if there is room for the flag when enabling
options that select SPARSEMEM_VMEMMAP_PREINIT. Fortunately, as of right
now, all architectures that use sparse vmemmap do have room.

Cc: Johannes Weiner
Signed-off-by: Frank van der Linden
---
 fs/Kconfig             |  1 +
 include/linux/mm.h     |  1 +
 include/linux/mmzone.h | 35 +++++++++++++++++
 mm/Kconfig             |  6 +++
 mm/bootmem_info.c      |  4 +-
 mm/mm_init.c           |  3 ++
 mm/sparse-vmemmap.c    | 23 +++++++++++
 mm/sparse.c            | 87 ++++++++++++++++++++++++++++++++----------
 8 files changed, 138 insertions(+), 22 deletions(-)

diff --git a/fs/Kconfig b/fs/Kconfig
index 64d420e3c475..8bcd3a6f80ab 100644
--- a/fs/Kconfig
+++ b/fs/Kconfig
@@ -286,6 +286,7 @@ config HUGETLB_PAGE_OPTIMIZE_VMEMMAP
 	def_bool HUGETLB_PAGE
 	depends on ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP
 	depends on SPARSEMEM_VMEMMAP
+	select SPARSEMEM_VMEMMAP_PREINIT if ARCH_WANT_HUGETLB_VMEMMAP_PREINIT

 config HUGETLB_PMD_PAGE_TABLE_SHARING
 	def_bool HUGETLB_PAGE
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6dfc41b461af..df83653ed6e3 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3828,6 +3828,7 @@ static inline void print_vma_addr(char *prefix, unsigned long rip)
 #endif

 void *sparse_buffer_alloc(unsigned long size);
+unsigned long section_map_size(void);
 struct page * __populate_section_memmap(unsigned long pfn,
 		unsigned long nr_pages, int nid, struct vmem_altmap *altmap,
 		struct dev_pagemap *pgmap);
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 9540b41894da..44ecb2f90db4 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1933,6 +1933,9 @@ enum {
 	SECTION_IS_EARLY_BIT,
 #ifdef CONFIG_ZONE_DEVICE
 	SECTION_TAINT_ZONE_DEVICE_BIT,
+#endif
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+	SECTION_IS_VMEMMAP_PREINIT_BIT,
 #endif
 	SECTION_MAP_LAST_BIT,
 };
@@ -1944,6 +1947,9 @@ enum {
 #ifdef CONFIG_ZONE_DEVICE
 #define SECTION_TAINT_ZONE_DEVICE	BIT(SECTION_TAINT_ZONE_DEVICE_BIT)
 #endif
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+#define SECTION_IS_VMEMMAP_PREINIT	BIT(SECTION_IS_VMEMMAP_PREINIT_BIT)
+#endif
 #define SECTION_MAP_MASK	(~(BIT(SECTION_MAP_LAST_BIT) - 1))
 #define SECTION_NID_SHIFT	SECTION_MAP_LAST_BIT
@@ -1998,6 +2004,30 @@ static inline int online_device_section(struct mem_section *section)
 }
 #endif

+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+static inline int preinited_vmemmap_section(struct mem_section *section)
+{
+	return (section &&
+		(section->section_mem_map & SECTION_IS_VMEMMAP_PREINIT));
+}
+
+void sparse_vmemmap_init_nid_early(int nid);
+void sparse_vmemmap_init_nid_late(int nid);
+
+#else
+static inline int preinited_vmemmap_section(struct mem_section *section)
+{
+	return 0;
+}
+static inline void sparse_vmemmap_init_nid_early(int nid)
+{
+}
+
+static inline void sparse_vmemmap_init_nid_late(int nid)
+{
+}
+#endif
+
 static inline int online_section_nr(unsigned long nr)
 {
 	return online_section(__nr_to_section(nr));
@@ -2035,6 +2065,9 @@ static inline int pfn_section_valid(struct mem_section *ms, unsigned long pfn)
 }
 #endif

+void sparse_init_early_section(int nid, struct page *map, unsigned long pnum,
+			       unsigned long flags);
+
 #ifndef CONFIG_HAVE_ARCH_PFN_VALID
 /**
  * pfn_valid - check if there is a valid memory map entry for a PFN
@@ -2116,6 +2149,8 @@ void sparse_init(void);
 #else
 #define sparse_init()	do {} while (0)
 #define sparse_index_init(_sec, _nid)  do {} while (0)
+#define sparse_vmemmap_init_nid_early(_nid, _use)  do {} while (0)
+#define sparse_vmemmap_init_nid_late(_nid)  do {} while (0)
 #define pfn_in_present_section pfn_valid
 #define subsection_map_init(_pfn, _nr_pages) do {} while (0)
 #endif /* CONFIG_SPARSEMEM */
diff --git a/mm/Kconfig b/mm/Kconfig
index 1b501db06417..0837f989a2dc 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -489,6 +489,9 @@ config SPARSEMEM_VMEMMAP
 	  SPARSEMEM_VMEMMAP uses a virtually mapped memmap to optimise
 	  pfn_to_page and page_to_pfn operations.  This is the most
 	  efficient option when sufficient kernel resources are available.
+
+config SPARSEMEM_VMEMMAP_PREINIT
+	bool
 #
 # Select this config option from the architecture Kconfig, if it is preferred
 # to enable the feature of HugeTLB/dev_dax vmemmap optimization.
@@ -499,6 +502,9 @@ config ARCH_WANT_OPTIMIZE_DAX_VMEMMAP
 config ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP
 	bool

+config ARCH_WANT_HUGETLB_VMEMMAP_PREINIT
+	bool
+
 config HAVE_MEMBLOCK_PHYS_MAP
 	bool
diff --git a/mm/bootmem_info.c b/mm/bootmem_info.c
index 95f288169a38..b0e2a9fa641f 100644
--- a/mm/bootmem_info.c
+++ b/mm/bootmem_info.c
@@ -88,7 +88,9 @@ static void __init register_page_bootmem_info_section(unsigned long start_pfn)

 	memmap = sparse_decode_mem_map(ms->section_mem_map, section_nr);

-	register_page_bootmem_memmap(section_nr, memmap, PAGES_PER_SECTION);
+	if (!preinited_vmemmap_section(ms))
+		register_page_bootmem_memmap(section_nr, memmap,
+					     PAGES_PER_SECTION);

 	usage = ms->usage;
 	page = virt_to_page(usage);
diff --git a/mm/mm_init.c b/mm/mm_init.c
index d2dee53e95dd..9f1e41c3dde6 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1862,6 +1862,9 @@ void __init free_area_init(unsigned long *max_zone_pfn)
 		}
 	}

+	for_each_node_state(nid, N_MEMORY)
+		sparse_vmemmap_init_nid_late(nid);
+
 	calc_nr_kernel_pages();
 	memmap_init();
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 3287ebadd167..8751c46c35e4 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -470,3 +470,26 @@ struct page * __meminit __populate_section_memmap(unsigned long pfn,

 	return pfn_to_page(pfn);
 }
+
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+/*
+ * This is called just before initializing sections for a NUMA node.
+ * Any special initialization that needs to be done before the
+ * generic initialization can be done from here. Sections that
+ * are initialized in hooks called from here will be skipped by
+ * the generic initialization.
+ */
+void __init sparse_vmemmap_init_nid_early(int nid)
+{
+}
+
+/*
+ * This is called just before the initialization of page structures
+ * through memmap_init. Zones are now initialized, so any work that
+ * needs to be done that needs zone information can be done from
+ * here.
+ */
+void __init sparse_vmemmap_init_nid_late(int nid)
+{
+}
+#endif
diff --git a/mm/sparse.c b/mm/sparse.c
index 133b033d0cba..ee0234a77c7f 100644
--- a/mm/sparse.c
+++ b/mm/sparse.c
@@ -408,13 +408,13 @@ static void __init check_usemap_section_nr(int nid,
 #endif /* CONFIG_MEMORY_HOTREMOVE */

 #ifdef CONFIG_SPARSEMEM_VMEMMAP
-static unsigned long __init section_map_size(void)
+unsigned long __init section_map_size(void)
 {
 	return ALIGN(sizeof(struct page) * PAGES_PER_SECTION, PMD_SIZE);
 }

 #else
-static unsigned long __init section_map_size(void)
+unsigned long __init section_map_size(void)
 {
 	return PAGE_ALIGN(sizeof(struct page) * PAGES_PER_SECTION);
 }
@@ -495,6 +495,44 @@ void __weak __meminit vmemmap_populate_print_last(void)
 {
 }

+static void *sparse_usagebuf __meminitdata;
+static void *sparse_usagebuf_end __meminitdata;
+
+/*
+ * Helper function that is used for generic section initialization, and
+ * can also be used by any hooks added above.
+ */
+void __init sparse_init_early_section(int nid, struct page *map,
+				      unsigned long pnum, unsigned long flags)
+{
+	BUG_ON(!sparse_usagebuf || sparse_usagebuf >= sparse_usagebuf_end);
+	check_usemap_section_nr(nid, sparse_usagebuf);
+	sparse_init_one_section(__nr_to_section(pnum), pnum, map,
+				sparse_usagebuf, SECTION_IS_EARLY | flags);
+	sparse_usagebuf = (void *)sparse_usagebuf + mem_section_usage_size();
+}
+
+static int __init sparse_usage_init(int nid, unsigned long map_count)
+{
+	unsigned long size;
+
+	size = mem_section_usage_size() * map_count;
+	sparse_usagebuf = sparse_early_usemaps_alloc_pgdat_section(
+			NODE_DATA(nid), size);
+	if (!sparse_usagebuf) {
+		sparse_usagebuf_end = NULL;
+		return -ENOMEM;
+	}
+
+	sparse_usagebuf_end = sparse_usagebuf + size;
+	return 0;
+}
+
+static void __init sparse_usage_fini(void)
+{
+	sparse_usagebuf = sparse_usagebuf_end = NULL;
+}
+
 /*
  * Initialize sparse on a specific node. The node spans [pnum_begin, pnum_end)
  * And number of present sections in this node is map_count.
@@ -503,47 +541,54 @@ static void __init sparse_init_nid(int nid, unsigned long pnum_begin,
 				   unsigned long pnum_end,
 				   unsigned long map_count)
 {
-	struct mem_section_usage *usage;
 	unsigned long pnum;
 	struct page *map;
+	struct mem_section *ms;

-	usage = sparse_early_usemaps_alloc_pgdat_section(NODE_DATA(nid),
-			mem_section_usage_size() * map_count);
-	if (!usage) {
+	if (sparse_usage_init(nid, map_count)) {
 		pr_err("%s: node[%d] usemap allocation failed", __func__, nid);
 		goto failed;
 	}
+
 	sparse_buffer_init(map_count * section_map_size(), nid);
+
+	sparse_vmemmap_init_nid_early(nid);
+
 	for_each_present_section_nr(pnum_begin, pnum) {
 		unsigned long pfn = section_nr_to_pfn(pnum);

 		if (pnum >= pnum_end)
 			break;

-		map = __populate_section_memmap(pfn, PAGES_PER_SECTION,
-				nid, NULL, NULL);
-		if (!map) {
-			pr_err("%s: node[%d] memory map backing failed. Some memory will not be available.",
-			       __func__, nid);
-			pnum_begin = pnum;
-			sparse_buffer_fini();
-			goto failed;
+		ms = __nr_to_section(pnum);
+		if (!preinited_vmemmap_section(ms)) {
+			map = __populate_section_memmap(pfn, PAGES_PER_SECTION,
+					nid, NULL, NULL);
+			if (!map) {
+				pr_err("%s: node[%d] memory map backing failed. Some memory will not be available.",
+				       __func__, nid);
+				pnum_begin = pnum;
+				sparse_usage_fini();
+				sparse_buffer_fini();
+				goto failed;
+			}
+			sparse_init_early_section(nid, map, pnum, 0);
 		}
-		check_usemap_section_nr(nid, usage);
-		sparse_init_one_section(__nr_to_section(pnum), pnum, map, usage,
-				SECTION_IS_EARLY);
-		usage = (void *) usage + mem_section_usage_size();
 	}
+
+	sparse_usage_fini();
 	sparse_buffer_fini();
 	return;
 failed:
-	/* We failed to allocate, mark all the following pnums as not present */
+	/*
+	 * We failed to allocate, mark all the following pnums as not present,
+	 * except the ones already initialized earlier.
+	 */
 	for_each_present_section_nr(pnum_begin, pnum) {
-		struct mem_section *ms;
-
 		if (pnum >= pnum_end)
 			break;
 		ms = __nr_to_section(pnum);
-		ms->section_mem_map = 0;
+		if (!preinited_vmemmap_section(ms))
+			ms->section_mem_map = 0;
 	}
 }
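To make the intended use concrete: a subsystem hook called from
sparse_vmemmap_init_nid_early could populate the memmap for a section it
owns and register it as pre-initialized, roughly like the hypothetical
helper below (example_preinit_section is not part of this patch, and
error handling is omitted; the hugetlb user arrives in a later commit):

	/*
	 * Hypothetical sketch: pre-populate one section's memmap and
	 * mark it SECTION_IS_VMEMMAP_PREINIT, so that the generic loop
	 * in sparse_init_nid() will skip it.
	 */
	static void __init example_preinit_section(int nid, unsigned long pnum)
	{
		struct page *map;

		map = __populate_section_memmap(section_nr_to_pfn(pnum),
						PAGES_PER_SECTION, nid,
						NULL, NULL);
		if (map)
			sparse_init_early_section(nid, map, pnum,
						  SECTION_IS_VMEMMAP_PREINIT);
	}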
From patchwork Fri Feb 28 18:29:12 2025

Date: Fri, 28 Feb 2025 18:29:12 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-12-fvdl@google.com>
Subject: [PATCH v5 11/27] mm/hugetlb: set migratetype for bootmem folios
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
    roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com,
    Frank van der Linden
The pageblocks that back memblock-allocated hugetlb folios might not
have the migrate type set, in the CONFIG_DEFERRED_STRUCT_PAGE_INIT case.

memblock-allocated hugetlb folios might be given to the buddy allocator
eventually (if nr_hugepages is lowered), so make sure that the migrate
type for the pageblocks contained in them is set when initializing them.
Set it to the default that memmap init also uses (MIGRATE_MOVABLE).

Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 21 +++++++++++++++++++++
 1 file changed, 21 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 1a200f89e21a..19a7a795a388 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3254,6 +3254,26 @@ static void __init hugetlb_folio_init_vmemmap(struct folio *folio,
 	prep_compound_head((struct page *)folio, huge_page_order(h));
 }

+/*
+ * memblock-allocated pageblocks might not have the migrate type set
+ * if marked with the 'noinit' flag. Set it to the default
+ * (MIGRATE_MOVABLE) here.
+ *
+ * Note that this will not write the page struct, it is ok (and
+ * necessary) to do this on vmemmap optimized folios.
+ */
+static void __init hugetlb_bootmem_init_migratetype(struct folio *folio,
+					struct hstate *h)
+{
+	unsigned long nr_pages = pages_per_huge_page(h), i;
+
+	WARN_ON_ONCE(!pageblock_aligned(folio_pfn(folio)));
+
+	for (i = 0; i < nr_pages; i += pageblock_nr_pages)
+		set_pageblock_migratetype(folio_page(folio, i),
+					  MIGRATE_MOVABLE);
+}
+
 static void __init prep_and_add_bootmem_folios(struct hstate *h,
 		struct list_head *folio_list)
 {
@@ -3275,6 +3295,7 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 					HUGETLB_VMEMMAP_RESERVE_PAGES,
 					pages_per_huge_page(h));
 		}
+		hugetlb_bootmem_init_migratetype(folio, h);
 		/* Subdivide locks to achieve better parallel performance */
 		spin_lock_irqsave(&hugetlb_lock, flags);
 		__prep_account_new_huge_page(h, folio_nid(folio));
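As a quick sanity check of the loop above, using typical x86-64 values
(illustration only): a 1 GiB gigantic page has pages_per_huge_page ==
262144 and a 2 MiB pageblock has pageblock_nr_pages == 512, so the loop
marks 262144 / 512 = 512 pageblocks MIGRATE_MOVABLE; a 2 MiB hugetlb
page consists of exactly one pageblock, so it takes a single iteration.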
From patchwork Fri Feb 28 18:29:13 2025

Date: Fri, 28 Feb 2025 18:29:13 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-13-fvdl@google.com>
Subject: [PATCH v5 12/27] mm: define __init_reserved_page_zone function
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
    roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com,
    Frank van der Linden
Sometimes page structs must be unconditionally initialized as reserved,
regardless of DEFERRED_STRUCT_PAGE_INIT.

Define a function, __init_reserved_page_zone, containing code that
already did all of the work in init_reserved_page, and make it available
for use.

Signed-off-by: Frank van der Linden
---
 mm/internal.h |  1 +
 mm/mm_init.c  | 38 +++++++++++++++++++++++---------------
 2 files changed, 24 insertions(+), 15 deletions(-)

diff --git a/mm/internal.h b/mm/internal.h
index 109ef30fee11..57662141930e 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1448,6 +1448,7 @@ static inline bool pte_needs_soft_dirty_wp(struct vm_area_struct *vma, pte_t pte

 void __meminit __init_single_page(struct page *page, unsigned long pfn,
 				unsigned long zone, int nid);
+void __meminit __init_reserved_page_zone(unsigned long pfn, int nid);

 /* shrinker related functions */
 unsigned long shrink_slab(gfp_t gfp_mask, int nid, struct mem_cgroup *memcg,
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 9f1e41c3dde6..925ed6564572 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -650,6 +650,28 @@ static inline void fixup_hashdist(void)
 static inline void fixup_hashdist(void) {}
 #endif /* CONFIG_NUMA */

+/*
+ * Initialize a reserved page unconditionally, finding its zone first.
+ */
+void __meminit __init_reserved_page_zone(unsigned long pfn, int nid)
+{
+	pg_data_t *pgdat;
+	int zid;
+
+	pgdat = NODE_DATA(nid);
+
+	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
+		struct zone *zone = &pgdat->node_zones[zid];
+
+		if (zone_spans_pfn(zone, pfn))
+			break;
+	}
+	__init_single_page(pfn_to_page(pfn), pfn, zid, nid);
+
+	if (pageblock_aligned(pfn))
+		set_pageblock_migratetype(pfn_to_page(pfn), MIGRATE_MOVABLE);
+}
+
 #ifdef CONFIG_DEFERRED_STRUCT_PAGE_INIT
 static inline void pgdat_set_deferred_range(pg_data_t *pgdat)
 {
@@ -708,24 +730,10 @@ defer_init(int nid, unsigned long pfn, unsigned long end_pfn)

 static void __meminit init_reserved_page(unsigned long pfn, int nid)
 {
-	pg_data_t *pgdat;
-	int zid;
-
 	if (early_page_initialised(pfn, nid))
 		return;

-	pgdat = NODE_DATA(nid);
-
-	for (zid = 0; zid < MAX_NR_ZONES; zid++) {
-		struct zone *zone = &pgdat->node_zones[zid];
-
-		if (zone_spans_pfn(zone, pfn))
-			break;
-	}
-	__init_single_page(pfn_to_page(pfn), pfn, zid, nid);
-
-	if (pageblock_aligned(pfn))
-		set_pageblock_migratetype(pfn_to_page(pfn), MIGRATE_MOVABLE);
+	__init_reserved_page_zone(pfn, nid);
 }
 #else
 static inline void pgdat_set_deferred_range(pg_data_t *pgdat) {}
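As a usage sketch (hypothetical caller, mirroring how the next patch in
this series uses the helper): handing a reserved, memblock-allocated
range back to the buddy allocator first requires each page struct to be
initialized, which this function now allows unconditionally:

	/*
	 * Hypothetical illustration: free a reserved range back to the
	 * page allocator. Each page struct must be initialized first,
	 * regardless of DEFERRED_STRUCT_PAGE_INIT.
	 */
	static void __init example_free_reserved_range(unsigned long pfn,
					unsigned long nr_pages, int nid)
	{
		while (nr_pages--) {
			__init_reserved_page_zone(pfn, nid);
			free_reserved_page(pfn_to_page(pfn));
			pfn++;
		}
	}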
From patchwork Fri Feb 28 18:29:14 2025

Date: Fri, 28 Feb 2025 18:29:14 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-14-fvdl@google.com>
Subject: [PATCH v5 13/27] mm/hugetlb: check bootmem pages for zone intersections
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
    roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com,
    Frank van der Linden
Bootmem hugetlb pages are allocated using memblock, which isn't (and
mostly can't be) aware of zones. So, they may end up crossing zone
boundaries. This would create confusion: a hugetlb page that is part of
multiple zones is bad. Worse, HVO might then end up stealthily
re-assigning pages to a different zone when a hugetlb page is freed,
since the tail page structures beyond the first vmemmap page would
inherit the zone of the first page structures. While the chance of this
happening is low, you can definitely create a configuration where this
happens (especially using ZONE_MOVABLE).

To avoid this issue, check if bootmem hugetlb pages intersect with
multiple zones during the gather phase, and discard them, handing them
to the page allocator, if they do. Record the number of invalid bootmem
pages per node and subtract them from the number of available pages at
the end, making it easier to do these checks in multiple places later
on.
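To make the failure mode concrete (illustration only, assuming the
common x86-64 layout where ZONE_DMA32 ends at 4 GiB): a 1 GiB gigantic
page that memblock happens to place at physical address 3.5 GiB covers
PFNs 0xE0000 through 0x11FFFF, crossing the DMA32/Normal boundary at
PFN 0x100000. For that range, pfn_range_intersects_zones() finds two
intersecting zones and returns true, so the page is discarded and its
memory is handed back to the page allocator.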
Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c  | 61 +++++++++++++++++++++++++++++++++++++++++++++++++--
 mm/internal.h |  2 ++
 mm/mm_init.c  | 25 +++++++++++++++++++++
 3 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 19a7a795a388..f9704a0e62de 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -62,6 +62,7 @@ static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
 static unsigned long hugetlb_cma_size __initdata;
 
 __initdata struct list_head huge_boot_pages[MAX_NUMNODES];
+static unsigned long hstate_boot_nrinvalid[HUGE_MAX_HSTATE] __initdata;
 
 /*
  * Due to ordering constraints across the init code for various
@@ -3304,6 +3305,44 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 	}
 }
 
+static bool __init hugetlb_bootmem_page_zones_valid(int nid,
+						    struct huge_bootmem_page *m)
+{
+	unsigned long start_pfn;
+	bool valid;
+
+	start_pfn = virt_to_phys(m) >> PAGE_SHIFT;
+
+	valid = !pfn_range_intersects_zones(nid, start_pfn,
+			pages_per_huge_page(m->hstate));
+	if (!valid)
+		hstate_boot_nrinvalid[hstate_index(m->hstate)]++;
+
+	return valid;
+}
+
+/*
+ * Free a bootmem page that was found to be invalid (intersecting with
+ * multiple zones).
+ *
+ * Since it intersects with multiple zones, we can't just do a free
+ * operation on all pages at once, but instead have to walk all
+ * pages, freeing them one by one.
+ */
+static void __init hugetlb_bootmem_free_invalid_page(int nid, struct page *page,
+					     struct hstate *h)
+{
+	unsigned long npages = pages_per_huge_page(h);
+	unsigned long pfn;
+
+	while (npages--) {
+		pfn = page_to_pfn(page);
+		__init_reserved_page_zone(pfn, nid);
+		free_reserved_page(page);
+		page++;
+	}
+}
+
 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages.
@@ -3311,14 +3350,25 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 static void __init gather_bootmem_prealloc_node(unsigned long nid)
 {
 	LIST_HEAD(folio_list);
-	struct huge_bootmem_page *m;
+	struct huge_bootmem_page *m, *tm;
 	struct hstate *h = NULL, *prev_h = NULL;
 
-	list_for_each_entry(m, &huge_boot_pages[nid], list) {
+	list_for_each_entry_safe(m, tm, &huge_boot_pages[nid], list) {
 		struct page *page = virt_to_page(m);
 		struct folio *folio = (void *)page;
 
 		h = m->hstate;
+		if (!hugetlb_bootmem_page_zones_valid(nid, m)) {
+			/*
+			 * Can't use this page. Initialize the
+			 * page structures if that hasn't already
+			 * been done, and give them to the page
+			 * allocator.
+			 */
+			hugetlb_bootmem_free_invalid_page(nid, page, h);
+			continue;
+		}
+
 		/*
 		 * It is possible to have multiple huge page sizes (hstates)
 		 * in this list. If so, process each size separately.
@@ -3590,13 +3640,20 @@ static void __init hugetlb_init_hstates(void)
 static void __init report_hugepages(void)
 {
 	struct hstate *h;
+	unsigned long nrinvalid;
 
 	for_each_hstate(h) {
 		char buf[32];
 
+		nrinvalid = hstate_boot_nrinvalid[hstate_index(h)];
+		h->max_huge_pages -= nrinvalid;
+
 		string_get_size(huge_page_size(h), 1, STRING_UNITS_2, buf, 32);
 		pr_info("HugeTLB: registered %s page size, pre-allocated %ld pages\n",
 			buf, h->free_huge_pages);
+		if (nrinvalid)
+			pr_info("HugeTLB: %s page size: %lu invalid page%s discarded\n",
+					buf, nrinvalid, nrinvalid > 1 ? "s" : "");
"s" : ""); pr_info("HugeTLB: %d KiB vmemmap can be freed for a %s page\n", hugetlb_vmemmap_optimizable_size(h) / SZ_1K, buf); } diff --git a/mm/internal.h b/mm/internal.h index 57662141930e..63fda9bb9426 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -658,6 +658,8 @@ static inline struct page *pageblock_pfn_to_page(unsigned long start_pfn, } void set_zone_contiguous(struct zone *zone); +bool pfn_range_intersects_zones(int nid, unsigned long start_pfn, + unsigned long nr_pages); static inline void clear_zone_contiguous(struct zone *zone) { diff --git a/mm/mm_init.c b/mm/mm_init.c index 925ed6564572..f7d5b4fe1ae9 100644 --- a/mm/mm_init.c +++ b/mm/mm_init.c @@ -2287,6 +2287,31 @@ void set_zone_contiguous(struct zone *zone) zone->contiguous = true; } +/* + * Check if a PFN range intersects multiple zones on one or more + * NUMA nodes. Specify the @nid argument if it is known that this + * PFN range is on one node, NUMA_NO_NODE otherwise. + */ +bool pfn_range_intersects_zones(int nid, unsigned long start_pfn, + unsigned long nr_pages) +{ + struct zone *zone, *izone = NULL; + + for_each_zone(zone) { + if (nid != NUMA_NO_NODE && zone_to_nid(zone) != nid) + continue; + + if (zone_intersects(zone, start_pfn, nr_pages)) { + if (izone != NULL) + return true; + izone = zone; + } + + } + + return false; +} + static void __init mem_init_print_info(void); void __init page_alloc_init_late(void) { From patchwork Fri Feb 28 18:29:15 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Frank van der Linden X-Patchwork-Id: 13996913 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 529B8C282D0 for ; Fri, 28 Feb 2025 18:30:27 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 91FA2280010; Fri, 28 Feb 2025 13:30:15 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 882A3280001; Fri, 28 Feb 2025 13:30:15 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 72144280010; Fri, 28 Feb 2025 13:30:15 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 4CB52280001 for ; Fri, 28 Feb 2025 13:30:15 -0500 (EST) Received: from smtpin15.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id 0C052810AF for ; Fri, 28 Feb 2025 18:30:15 +0000 (UTC) X-FDA: 83170193190.15.7EDE29B Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) by imf25.hostedemail.com (Postfix) with ESMTP id 25E18A001A for ; Fri, 28 Feb 2025 18:30:12 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=google.com header.s=20230601 header.b=BTqXKzy4; spf=pass (imf25.hostedemail.com: domain of 3swDCZwQKCAIhxfniqqing.eqonkpwz-oomxcem.qti@flex--fvdl.bounces.google.com designates 209.85.216.73 as permitted sender) smtp.mailfrom=3swDCZwQKCAIhxfniqqing.eqonkpwz-oomxcem.qti@flex--fvdl.bounces.google.com; dmarc=pass (policy=reject) header.from=google.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1740767413; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: 
Date: Fri, 28 Feb 2025 18:29:15 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-15-fvdl@google.com>
Subject: [PATCH v5 14/27] mm/sparse: add vmemmap_*_hvo functions
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com, Frank van der Linden

Add a few functions to enable early HVO:

	vmemmap_populate_hvo
	vmemmap_undo_hvo
	vmemmap_wrprotect_hvo

The populate and undo functions are expected to be used in early init, from the sparse_init_nid_early() function. The wrprotect function is to be used, potentially, later.

To implement these functions, mostly re-use the existing compound pages vmemmap logic used by DAX. vmemmap_populate_address has its argument changed a bit in this commit: the page structure passed in to be reused in the mapping is replaced by a PFN and a flag. The flag indicates whether an extra ref should be taken on the vmemmap page containing the head page structure. Taking the ref is appropriate for DAX / ZONE_DEVICE, but not for HugeTLB HVO.

The HugeTLB vmemmap optimization maps tail page structure pages read-only. The vmemmap_wrprotect_hvo function that does this is implemented separately, because it cannot be guaranteed that reserved page structures will not be write-accessed during memory initialization. Even with CONFIG_DEFERRED_STRUCT_PAGE_INIT, they might still be written to (if they are at the bottom of a zone). So, vmemmap_populate_hvo leaves the tail page structure pages RW initially, and then, later during initialization, after memmap init is fully done, vmemmap_wrprotect_hvo must be called to finish the job.

Subsequent commits will use these functions for early HugeTLB HVO.
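For orientation, here is a sketch of the mapping state these functions create, assuming x86-64 parameters (4K base pages, 64-byte struct page) and a headsize of one base page, for a single 1G hugetlb page:

	/*
	 * vmemmap for one 1G page = 262144 struct pages * 64 bytes
	 * = 16M = 4096 base pages.
	 *
	 * After vmemmap_populate_hvo(start, end, node, PAGE_SIZE):
	 *
	 *   page 0:        real, memblock-allocated; holds the head
	 *                  struct page and the first tail struct pages
	 *   pages 1..4095: PTEs all map the pfn of page 0, read-write
	 *
	 * After vmemmap_wrprotect_hvo(start, end, node, PAGE_SIZE):
	 *
	 *   pages 1..4095: same mappings, now read-only
	 *
	 * vmemmap_undo_hvo(start, end, node, PAGE_SIZE) clears all of
	 * the above, frees page 0 back to memblock, and re-populates
	 * the range with individually allocated vmemmap pages.
	 */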
Signed-off-by: Frank van der Linden
---
 include/linux/mm.h  |   9 ++-
 mm/sparse-vmemmap.c | 141 +++++++++++++++++++++++++++++++++++++++-----
 2 files changed, 135 insertions(+), 15 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index df83653ed6e3..0463c062fd7a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -3837,7 +3837,8 @@ p4d_t *vmemmap_p4d_populate(pgd_t *pgd, unsigned long addr, int node);
 pud_t *vmemmap_pud_populate(p4d_t *p4d, unsigned long addr, int node);
 pmd_t *vmemmap_pmd_populate(pud_t *pud, unsigned long addr, int node);
 pte_t *vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
-			    struct vmem_altmap *altmap, struct page *reuse);
+			    struct vmem_altmap *altmap, unsigned long ptpfn,
+			    unsigned long flags);
 void *vmemmap_alloc_block(unsigned long size, int node);
 struct vmem_altmap;
 void *vmemmap_alloc_block_buf(unsigned long size, int node,
@@ -3853,6 +3854,12 @@ int vmemmap_populate_hugepages(unsigned long start, unsigned long end,
 			       int node, struct vmem_altmap *altmap);
 int vmemmap_populate(unsigned long start, unsigned long end, int node,
 		     struct vmem_altmap *altmap);
+int vmemmap_populate_hvo(unsigned long start, unsigned long end, int node,
+			 unsigned long headsize);
+int vmemmap_undo_hvo(unsigned long start, unsigned long end, int node,
+		     unsigned long headsize);
+void vmemmap_wrprotect_hvo(unsigned long start, unsigned long end, int node,
+			   unsigned long headsize);
 void vmemmap_populate_print_last(void);
 #ifdef CONFIG_MEMORY_HOTPLUG
 void vmemmap_free(unsigned long start, unsigned long end,
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 8751c46c35e4..8cc848c4b17c 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -30,6 +30,13 @@
 #include
 #include
+#include
+
+/*
+ * Flags for vmemmap_populate_range and friends.
+ */
+/* Get a ref on the head page struct page, for ZONE_DEVICE compound pages */
+#define VMEMMAP_POPULATE_PAGEREF	0x0001
 
 #include "internal.h"
 
@@ -144,17 +151,18 @@ void __meminit vmemmap_verify(pte_t *pte, int node,
 
 pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
 				       struct vmem_altmap *altmap,
-				       struct page *reuse)
+				       unsigned long ptpfn, unsigned long flags)
 {
 	pte_t *pte = pte_offset_kernel(pmd, addr);
 
 	if (pte_none(ptep_get(pte))) {
 		pte_t entry;
 		void *p;
 
-		if (!reuse) {
+		if (ptpfn == (unsigned long)-1) {
 			p = vmemmap_alloc_block_buf(PAGE_SIZE, node, altmap);
 			if (!p)
 				return NULL;
+			ptpfn = PHYS_PFN(__pa(p));
 		} else {
 			/*
 			 * When a PTE/PMD entry is freed from the init_mm
@@ -165,10 +173,10 @@ pte_t * __meminit vmemmap_pte_populate(pmd_t *pmd, unsigned long addr, int node,
 			 * and through vmemmap_populate_compound_pages() when
 			 * slab is available.
 			 */
-			get_page(reuse);
-			p = page_to_virt(reuse);
+			if (flags & VMEMMAP_POPULATE_PAGEREF)
+				get_page(pfn_to_page(ptpfn));
 		}
-		entry = pfn_pte(__pa(p) >> PAGE_SHIFT, PAGE_KERNEL);
+		entry = pfn_pte(ptpfn, PAGE_KERNEL);
 		set_pte_at(&init_mm, addr, pte, entry);
 	}
 	return pte;
@@ -238,7 +246,8 @@ pgd_t * __meminit vmemmap_pgd_populate(unsigned long addr, int node)
 
 static pte_t * __meminit vmemmap_populate_address(unsigned long addr, int node,
 						  struct vmem_altmap *altmap,
-						  struct page *reuse)
+						  unsigned long ptpfn,
+						  unsigned long flags)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
@@ -258,7 +267,7 @@ static pte_t * __meminit vmemmap_populate_address(unsigned long addr, int node,
 	pmd = vmemmap_pmd_populate(pud, addr, node);
 	if (!pmd)
 		return NULL;
-	pte = vmemmap_pte_populate(pmd, addr, node, altmap, reuse);
+	pte = vmemmap_pte_populate(pmd, addr, node, altmap, ptpfn, flags);
 	if (!pte)
 		return NULL;
 	vmemmap_verify(pte, node, addr, addr + PAGE_SIZE);
@@ -269,13 +278,15 @@ static pte_t * __meminit vmemmap_populate_address(unsigned long addr, int node,
 static int __meminit vmemmap_populate_range(unsigned long start,
 					    unsigned long end, int node,
 					    struct vmem_altmap *altmap,
-					    struct page *reuse)
+					    unsigned long ptpfn,
+					    unsigned long flags)
 {
 	unsigned long addr = start;
 	pte_t *pte;
 
 	for (; addr < end; addr += PAGE_SIZE) {
-		pte = vmemmap_populate_address(addr, node, altmap, reuse);
+		pte = vmemmap_populate_address(addr, node, altmap,
+					       ptpfn, flags);
 		if (!pte)
 			return -ENOMEM;
 	}
@@ -286,7 +297,107 @@ static int __meminit vmemmap_populate_range(unsigned long start,
 int __meminit vmemmap_populate_basepages(unsigned long start, unsigned long end,
 					 int node, struct vmem_altmap *altmap)
 {
-	return vmemmap_populate_range(start, end, node, altmap, NULL);
+	return vmemmap_populate_range(start, end, node, altmap, -1, 0);
+}
+
+/*
+ * Undo populate_hvo, and replace it with a normal base page mapping.
+ * Used in memory init in case a HVO mapping needs to be undone.
+ *
+ * This can happen when it is discovered that a memblock allocated
+ * hugetlb page spans multiple zones, which can only be verified
+ * after zones have been initialized.
+ *
+ * We know that:
+ * 1) The first @headsize / PAGE_SIZE vmemmap pages were individually
+ *    allocated through memblock, and mapped.
+ *
+ * 2) The rest of the vmemmap pages are mirrors of the last head page.
+ */
+int __meminit vmemmap_undo_hvo(unsigned long addr, unsigned long end,
+			       int node, unsigned long headsize)
+{
+	unsigned long maddr, pfn;
+	pte_t *pte;
+	int headpages;
+
+	/*
+	 * Should only be called early in boot, so nothing will
+	 * be accessing these page structures.
+	 */
+	WARN_ON(!early_boot_irqs_disabled);
+
+	headpages = headsize >> PAGE_SHIFT;
+
+	/*
+	 * Clear mirrored mappings for tail page structs.
+	 */
+	for (maddr = addr + headsize; maddr < end; maddr += PAGE_SIZE) {
+		pte = virt_to_kpte(maddr);
+		pte_clear(&init_mm, maddr, pte);
+	}
+
+	/*
+	 * Clear and free mappings for head page and first tail page
+	 * structs.
+	 */
+	for (maddr = addr; headpages-- > 0; maddr += PAGE_SIZE) {
+		pte = virt_to_kpte(maddr);
+		pfn = pte_pfn(ptep_get(pte));
+		pte_clear(&init_mm, maddr, pte);
+		memblock_phys_free(PFN_PHYS(pfn), PAGE_SIZE);
+	}
+
+	flush_tlb_kernel_range(addr, end);
+
+	return vmemmap_populate(addr, end, node, NULL);
+}
+
+/*
+ * Write protect the mirrored tail page structs for HVO. This will be
+ * called from the hugetlb code when gathering and initializing the
+ * memblock allocated gigantic pages.
+ * The write protect can't be done earlier, since it can't be
+ * guaranteed that the reserved page structures will not be written
+ * to during initialization, even if CONFIG_DEFERRED_STRUCT_PAGE_INIT
+ * is enabled.
+ *
+ * The PTEs are known to exist, and nothing else should be touching
+ * these pages. The caller is responsible for any TLB flushing.
+ */
+void vmemmap_wrprotect_hvo(unsigned long addr, unsigned long end,
+			   int node, unsigned long headsize)
+{
+	unsigned long maddr;
+	pte_t *pte;
+
+	for (maddr = addr + headsize; maddr < end; maddr += PAGE_SIZE) {
+		pte = virt_to_kpte(maddr);
+		ptep_set_wrprotect(&init_mm, maddr, pte);
+	}
+}
+
+/*
+ * Populate vmemmap pages HVO-style. The first page contains the head
+ * page and needed tail pages, the other ones are mirrors of the first
+ * page.
+ */
+int __meminit vmemmap_populate_hvo(unsigned long addr, unsigned long end,
+				   int node, unsigned long headsize)
+{
+	pte_t *pte;
+	unsigned long maddr;
+
+	for (maddr = addr; maddr < addr + headsize; maddr += PAGE_SIZE) {
+		pte = vmemmap_populate_address(maddr, node, NULL, -1, 0);
+		if (!pte)
+			return -ENOMEM;
+	}
+
+	/*
+	 * Reuse the last page struct page mapped above for the rest.
+	 */
+	return vmemmap_populate_range(maddr, end, node, NULL,
+				      pte_pfn(ptep_get(pte)), 0);
 }
 
 void __weak __meminit vmemmap_set_pmd(pmd_t *pmd, void *p, int node,
@@ -409,7 +520,8 @@ static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
 		 * with just tail struct pages.
 		 */
 		return vmemmap_populate_range(start, end, node, NULL,
-					      pte_page(ptep_get(pte)));
+					      pte_pfn(ptep_get(pte)),
+					      VMEMMAP_POPULATE_PAGEREF);
 	}
 
 	size = min(end - start, pgmap_vmemmap_nr(pgmap) * sizeof(struct page));
@@ -417,13 +529,13 @@ static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
 		unsigned long next, last = addr + size;
 
 		/* Populate the head page vmemmap page */
-		pte = vmemmap_populate_address(addr, node, NULL, NULL);
+		pte = vmemmap_populate_address(addr, node, NULL, -1, 0);
 		if (!pte)
 			return -ENOMEM;
 
 		/* Populate the tail pages vmemmap page */
 		next = addr + PAGE_SIZE;
-		pte = vmemmap_populate_address(next, node, NULL, NULL);
+		pte = vmemmap_populate_address(next, node, NULL, -1, 0);
 		if (!pte)
 			return -ENOMEM;
 
@@ -433,7 +545,8 @@ static int __meminit vmemmap_populate_compound_pages(unsigned long start_pfn,
 		 */
 		next += PAGE_SIZE;
 		rc = vmemmap_populate_range(next, last, node, NULL,
-					    pte_page(ptep_get(pte)));
+					    pte_pfn(ptep_get(pte)),
+					    VMEMMAP_POPULATE_PAGEREF);
 		if (rc)
 			return -ENOMEM;
 	}

From patchwork Fri Feb 28 18:29:16 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13996914
Date: Fri, 28 Feb 2025 18:29:16 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-16-fvdl@google.com>
Subject: [PATCH v5 15/27] mm/hugetlb: deal with multiple calls to hugetlb_bootmem_alloc
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com, Frank van der Linden

Architectures that want pre-HVO of hugetlb vmemmap pages will need to call hugetlb_bootmem_alloc from an earlier spot in boot (before sparse_init). To facilitate some architectures doing this, protect hugetlb_bootmem_alloc against multiple calls.
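For example, an architecture could then do something like this in its early setup code (hypothetical call site, not part of this patch):

	void __init arch_early_mm_init(void)	/* hypothetical arch hook */
	{
		/*
		 * First call parses the hugetlb parameters and allocates
		 * the gigantic pages from memblock; the later call from
		 * the generic MM init path becomes a harmless no-op.
		 */
		hugetlb_bootmem_alloc();

		sparse_init();
	}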
Also provide a helper function to check if it's been called, so that the early HVO code, to be added later, can see if there is anything to do.

Signed-off-by: Frank van der Linden
---
 include/linux/hugetlb.h |  6 ++++++
 mm/hugetlb.c            | 12 ++++++++++++
 2 files changed, 18 insertions(+)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 9cd7c9dacb88..5061279e5f73 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -175,6 +175,7 @@ extern int sysctl_hugetlb_shm_group;
 extern struct list_head huge_boot_pages[MAX_NUMNODES];
 
 void hugetlb_bootmem_alloc(void);
+bool hugetlb_bootmem_allocated(void);
 
 /* arch callbacks */
 
@@ -1256,6 +1257,11 @@ static inline bool hugetlbfs_pagecache_present(
 static inline void hugetlb_bootmem_alloc(void)
 {
 }
+
+static inline bool hugetlb_bootmem_allocated(void)
+{
+	return false;
+}
 #endif /* CONFIG_HUGETLB_PAGE */
 
 static inline spinlock_t *huge_pte_lock(struct hstate *h,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index f9704a0e62de..ea5f22182c6e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4906,16 +4906,28 @@ static int __init default_hugepagesz_setup(char *s)
 }
 hugetlb_early_param("default_hugepagesz", default_hugepagesz_setup);
 
+static bool __hugetlb_bootmem_allocated __initdata;
+
+bool __init hugetlb_bootmem_allocated(void)
+{
+	return __hugetlb_bootmem_allocated;
+}
+
 void __init hugetlb_bootmem_alloc(void)
 {
 	struct hstate *h;
 
+	if (__hugetlb_bootmem_allocated)
+		return;
+
 	hugetlb_parse_params();
 
 	for_each_hstate(h) {
 		if (hstate_is_gigantic(h))
 			hugetlb_hstate_alloc_pages(h);
 	}
+
+	__hugetlb_bootmem_allocated = true;
 }
 
 static unsigned int allowed_mems_nr(struct hstate *h)

From patchwork Fri Feb 28 18:29:17 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13996915
Date: Fri, 28 Feb 2025 18:29:17 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-17-fvdl@google.com>
Subject: [PATCH v5 16/27] mm/hugetlb: move huge_boot_pages list init to hugetlb_bootmem_alloc
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com, Frank van der Linden

Instead of initializing the per-node hugetlb bootmem pages list from the alloc function, we can now do it in a somewhat cleaner way, since there is an explicit hugetlb_bootmem_alloc function. Initialize the lists there.
Signed-off-by: Frank van der Linden
---
 mm/hugetlb.c | 19 +++++++------------
 1 file changed, 7 insertions(+), 12 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index ea5f22182c6e..0f14a7736875 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3574,7 +3574,6 @@ static unsigned long __init hugetlb_pages_alloc_boot(struct hstate *h)
 static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 {
 	unsigned long allocated;
-	static bool initialized __initdata;
 
 	/* skip gigantic hugepages allocation if hugetlb_cma enabled */
 	if (hstate_is_gigantic(h) && hugetlb_cma_size) {
@@ -3582,17 +3581,6 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 		return;
 	}
 
-	/* hugetlb_hstate_alloc_pages will be called many times, initialize huge_boot_pages once */
-	if (!initialized) {
-		int i = 0;
-
-		for (i = 0; i < MAX_NUMNODES; i++)
-			INIT_LIST_HEAD(&huge_boot_pages[i]);
-		h->next_nid_to_alloc = first_online_node;
-		h->next_nid_to_free = first_online_node;
-		initialized = true;
-	}
-
 	/* do node specific alloc */
 	if (hugetlb_hstate_alloc_pages_specific_nodes(h))
 		return;
@@ -4916,13 +4904,20 @@ bool __init hugetlb_bootmem_allocated(void)
 void __init hugetlb_bootmem_alloc(void)
 {
 	struct hstate *h;
+	int i;
 
 	if (__hugetlb_bootmem_allocated)
 		return;
 
+	for (i = 0; i < MAX_NUMNODES; i++)
+		INIT_LIST_HEAD(&huge_boot_pages[i]);
+
 	hugetlb_parse_params();
 
 	for_each_hstate(h) {
+		h->next_nid_to_alloc = first_online_node;
+		h->next_nid_to_free = first_online_node;
+
 		if (hstate_is_gigantic(h))
 			hugetlb_hstate_alloc_pages(h);
 	}

From patchwork Fri Feb 28 18:29:18 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13996916
Date: Fri, 28 Feb 2025 18:29:18 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-18-fvdl@google.com>
Subject: [PATCH v5 17/27] mm/hugetlb: add pre-HVO framework
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com, Frank van der Linden

Define flags for pre-HVOed bootmem hugetlb pages, and act on them. The most important flag is the HVO flag, signalling that a bootmem-allocated gigantic page has already been HVO-ed. If this flag is seen by the hugetlb bootmem gather code, the page is marked as HVO-optimized. The HVO code will then not try to optimize it again; instead, it will just map the tail page mirror pages read-only, completing the HVO steps.

No functional change, as nothing sets the flags yet.
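For illustration, a later patch in this series that pre-HVOes a bootmem page would record that roughly like this (a sketch under that assumption, not code from this patch):

	/*
	 * After successfully pre-HVOing the vmemmap of a memblock-
	 * allocated gigantic page, record the fact so the gather code
	 * only needs to wrprotect the mirrored tail page structs.
	 */
	if (!vmemmap_populate_hvo(vmemmap_start, vmemmap_end, nid,
				  HUGETLB_VMEMMAP_RESERVE_SIZE))
		m->flags |= HUGE_BOOTMEM_HVO;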
Signed-off-by: Frank van der Linden
---
 arch/powerpc/mm/hugetlbpage.c |  1 +
 include/linux/hugetlb.h       |  4 +++
 mm/hugetlb.c                  | 24 ++++++++++++++++-
 mm/hugetlb_vmemmap.c          | 50 +++++++++++++++++++++++++++++++++--
 mm/hugetlb_vmemmap.h          |  7 +++++
 5 files changed, 83 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/mm/hugetlbpage.c b/arch/powerpc/mm/hugetlbpage.c
index 6b043180220a..d3c1b749dcfc 100644
--- a/arch/powerpc/mm/hugetlbpage.c
+++ b/arch/powerpc/mm/hugetlbpage.c
@@ -113,6 +113,7 @@ static int __init pseries_alloc_bootmem_huge_page(struct hstate *hstate)
 	gpage_freearray[nr_gpages] = 0;
 	list_add(&m->list, &huge_boot_pages[0]);
 	m->hstate = hstate;
+	m->flags = 0;
 	return 1;
 }
 
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 5061279e5f73..10a7ce2b95e1 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -681,8 +681,12 @@ struct hstate {
 struct huge_bootmem_page {
 	struct list_head list;
 	struct hstate *hstate;
+	unsigned long flags;
 };
 
+#define HUGE_BOOTMEM_HVO		0x0001
+#define HUGE_BOOTMEM_ZONES_VALID	0x0002
+
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 0f14a7736875..40c88c46b34f 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3215,6 +3215,7 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	INIT_LIST_HEAD(&m->list);
 	list_add(&m->list, &huge_boot_pages[node]);
 	m->hstate = h;
+	m->flags = 0;
 	return 1;
 }
 
@@ -3282,7 +3283,7 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 	struct folio *folio, *tmp_f;
 
 	/* Send list for bulk vmemmap optimization processing */
-	hugetlb_vmemmap_optimize_folios(h, folio_list);
+	hugetlb_vmemmap_optimize_bootmem_folios(h, folio_list);
 
 	list_for_each_entry_safe(folio, tmp_f, folio_list, lru) {
 		if (!folio_test_hugetlb_vmemmap_optimized(folio)) {
@@ -3311,6 +3312,13 @@ static bool __init hugetlb_bootmem_page_zones_valid(int nid,
 	unsigned long start_pfn;
 	bool valid;
 
+	if (m->flags & HUGE_BOOTMEM_ZONES_VALID) {
+		/*
+		 * Already validated, skip check.
+		 */
+		return true;
+	}
+
 	start_pfn = virt_to_phys(m) >> PAGE_SHIFT;
 
 	valid = !pfn_range_intersects_zones(nid, start_pfn,
@@ -3343,6 +3351,11 @@ static void __init hugetlb_bootmem_free_invalid_page(int nid, struct page *page,
 	}
 }
 
+static bool __init hugetlb_bootmem_page_prehvo(struct huge_bootmem_page *m)
+{
+	return (m->flags & HUGE_BOOTMEM_HVO);
+}
+
 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages.
@@ -3383,6 +3396,15 @@ static void __init gather_bootmem_prealloc_node(unsigned long nid)
 		hugetlb_folio_init_vmemmap(folio, h,
 					   HUGETLB_VMEMMAP_RESERVE_PAGES);
 		init_new_hugetlb_folio(h, folio);
+
+		if (hugetlb_bootmem_page_prehvo(m))
+			/*
+			 * If pre-HVO was done, just set the
+			 * flag, the HVO code will then skip
+			 * this folio.
+			 */
+			folio_set_hugetlb_vmemmap_optimized(folio);
+
 		list_add(&folio->lru, &folio_list);
 
 		/*
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 5b484758f813..be6b33ecbc8e 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -649,14 +649,39 @@ static int hugetlb_vmemmap_split_folio(const struct hstate *h, struct folio *folio)
 	return vmemmap_remap_split(vmemmap_start, vmemmap_end, vmemmap_reuse);
 }
 
-void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
+static void __hugetlb_vmemmap_optimize_folios(struct hstate *h,
+					      struct list_head *folio_list,
+					      bool boot)
 {
 	struct folio *folio;
+	int nr_to_optimize;
 	LIST_HEAD(vmemmap_pages);
 	unsigned long flags = VMEMMAP_REMAP_NO_TLB_FLUSH | VMEMMAP_SYNCHRONIZE_RCU;
 
+	nr_to_optimize = 0;
 	list_for_each_entry(folio, folio_list, lru) {
-		int ret = hugetlb_vmemmap_split_folio(h, folio);
+		int ret;
+		unsigned long spfn, epfn;
+
+		if (boot && folio_test_hugetlb_vmemmap_optimized(folio)) {
+			/*
+			 * Already optimized by pre-HVO, just map the
+			 * mirrored tail page structs RO.
+			 */
+			spfn = (unsigned long)&folio->page;
+			epfn = spfn + pages_per_huge_page(h);
+			vmemmap_wrprotect_hvo(spfn, epfn, folio_nid(folio),
+					      HUGETLB_VMEMMAP_RESERVE_SIZE);
+			register_page_bootmem_memmap(pfn_to_section_nr(spfn),
+						     &folio->page,
+						     HUGETLB_VMEMMAP_RESERVE_SIZE);
+			static_branch_inc(&hugetlb_optimize_vmemmap_key);
+			continue;
+		}
+
+		nr_to_optimize++;
+
+		ret = hugetlb_vmemmap_split_folio(h, folio);
 
 		/*
 		 * Spliting the PMD requires allocating a page, thus lets fail
@@ -668,6 +693,16 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
 			break;
 	}
 
+	if (!nr_to_optimize)
+		/*
+		 * All pre-HVO folios, nothing left to do. It's ok if
+		 * there is a mix of pre-HVO and not yet HVO-ed folios
+		 * here, as __hugetlb_vmemmap_optimize_folio() will
+		 * skip any folios that already have the optimized flag
+		 * set, see vmemmap_should_optimize_folio().
+		 */
+		goto out;
+
 	flush_tlb_all();
 
 	list_for_each_entry(folio, folio_list, lru) {
@@ -693,10 +728,21 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
 		}
 	}
 
+out:
 	flush_tlb_all();
 	free_vmemmap_page_list(&vmemmap_pages);
 }
 
+void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
+{
+	__hugetlb_vmemmap_optimize_folios(h, folio_list, false);
+}
+
+void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list)
+{
+	__hugetlb_vmemmap_optimize_folios(h, folio_list, true);
+}
+
 static const struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{
 		.procname = "hugetlb_optimize_vmemmap",
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 2fcae92d3359..71110a90275f 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -24,6 +24,8 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 					struct list_head *non_hvo_folios);
 void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio);
 void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list);
+void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list);
+
 
 static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h)
 {
@@ -64,6 +66,11 @@ static inline void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
 {
 }
 
+static inline void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h,
+						struct list_head *folio_list)
+{
+}
+
 static inline unsigned int hugetlb_vmemmap_optimizable_size(const struct hstate *h)
 {
 	return 0;

From patchwork Fri Feb 28 18:29:19 2025
X-Patchwork-Submitter: Frank van der Linden
X-Patchwork-Id: 13996917
From: Frank van der Linden
Date: Fri, 28 Feb 2025 18:29:19 +0000
Message-ID: <20250228182928.2645936-19-fvdl@google.com>
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
Subject: [PATCH v5 18/27] mm/hugetlb_vmemmap: fix hugetlb_vmemmap_restore_folios definition

Make the hugetlb_vmemmap_restore_folios definition inline for the
!CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP case, so that including this
file in files other than hugetlb_vmemmap.c will work.

Fixes: cfb8c75099db ("hugetlb: perform vmemmap restoration on a list of pages")
Signed-off-by: Frank van der Linden
---
 mm/hugetlb_vmemmap.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 71110a90275f..62d3d645a793 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -50,7 +50,7 @@ static inline int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *folio)
 	return 0;
 }
 
-static long hugetlb_vmemmap_restore_folios(const struct hstate *h,
+static inline long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 					struct list_head *folio_list,
 					struct list_head *non_hvo_folios)
 {
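
[Editor's note] The sketch below is not part of the patch; it illustrates, with hypothetical names, why the missing "inline" matters. A plain static definition in a shared header produces a "defined but not used" warning in every file that includes the header without calling it, while a static inline definition is silently discarded when unused:

/* stub.h -- hypothetical header, for illustration only */
#ifndef STUB_H
#define STUB_H

/*
 * If some .c file includes this header but never calls the function,
 * gcc -Wall warns: "'restore_stub' defined but not used".
 */
static long restore_stub(void)
{
	return 0;
}

/*
 * The inline variant generates no warning and no object code in
 * translation units that never call it, which is what a header
 * stub for a disabled config option wants.
 */
static inline long restore_stub_inline(void)
{
	return 0;
}

#endif /* STUB_H */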
s=arc-20220608; d=hostedemail.com; t=1740767421; a=rsa-sha256; cv=none; b=uOrqdmznbShBFXRUEZ7lpPSG/M89CEb/XLKoxctVRs7mK1a/8WEwEDEqom8gD3JTc9o0eZ dUmScF+6xiJA5e48sT4e9Boe0jshqvJQPnX1/Vh8NlyCS9Lap7giN/CwWCIQxWhMOA7dBR VvTFIg4Xf6g7aydm9WP4CT5l+gtzJw8= Received: by mail-pj1-f74.google.com with SMTP id 98e67ed59e1d1-2fea8e4a655so6940025a91.0 for ; Fri, 28 Feb 2025 10:30:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1740767420; x=1741372220; darn=kvack.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=bilRBQrQ392Fbxoju7upbE9+OCkgy4821gbuoMPJZmQ=; b=AY3x37Of1VySVjuTF17RnNJNLv37ypeLvSsTDEZ6lZCvy4APLNFL8CfM4Qgwki+SL4 lXGxEzvuDe7Nv9VbdGpbWbLvotF45kH+2u5U9Od0i9DjIVSiIW/6Rr/Q7OOWG9OSmoa2 92FWOayGqOJnRWvfxOODlkr1oChbpk/+inE/0tbiYCgLbrIWh8nbgB4qviAL04HRCb9+ 6LfKT2d+JOW+/jUxRmnebiuMQTBBI7vNqhpUD+a5HSM+Dhx8nr3YuRBtvoXFr0zMhZtC DZVzT4pIlDBj/LJMRwmO9QlcwhpECXFVjcPXqkacCQ5rQQXv072THsEdrL24eyd1JE8E TyEQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1740767420; x=1741372220; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=bilRBQrQ392Fbxoju7upbE9+OCkgy4821gbuoMPJZmQ=; b=k7476f3ifU4ne5Ie3ZxgMu+qash/te/UjVGq2pgIKSg0/hYi0nHwX2YIaxxGE8evoP KkYd5XXnZ4ovgZSQMgK3rtoS0RnmbLU2iCAz1L9T8x6Q/aY74g3Bks3dfJ8G1B68zV+t 4++36JRx0Ne8cXltxI8K22xRW4qV9+01fMB3djx1oq/ui4k8vFdzqxLJuSzCY/EWbo6v EjUEJ2Xpk6lcCM55PzLXVQDWff1BACfooYdQiESLxobiG6PStlFsEDFUTe/mqdY/xJiY joJG9Jyz7Z0TvhC+VufTD9983XimLd2PzCeczXv++vH++s0QRR2NdPbZ7K9Zu0QeOMWw AKMA== X-Forwarded-Encrypted: i=1; AJvYcCWKrIvsWCFF3wHOa4MK41htC0iEHDJVz2DXjUfUmpT5p2vcemlWtoTV1DyQTyj3YeuMWOgeAiNbow==@kvack.org X-Gm-Message-State: AOJu0YwGsiJJDXmuC92iBLQdQwt6SzUnj6WxnA/QD+0tm4GFEymHM+cg I1DW+ot83YNvZfahVhT3cfQBB4qzz/lFgge/2DC3ozcSlW1ZclDRRb7wTk/tuwctWfXsmQ== X-Google-Smtp-Source: AGHT+IHZH9jgpRWIq6i/MFnGQz3hWlzoaLJhHTlF5N5208y0nznHeVTLTF4A3tr7Hamk1dVEU9yqERiZ X-Received: from pjbqa8.prod.google.com ([2002:a17:90b:4fc8:b0:2fa:210c:d068]) (user=fvdl job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:4fd0:b0:2ee:aed6:9ec2 with SMTP id 98e67ed59e1d1-2febab5e11dmr8140944a91.14.1740767420048; Fri, 28 Feb 2025 10:30:20 -0800 (PST) Date: Fri, 28 Feb 2025 18:29:20 +0000 In-Reply-To: <20250228182928.2645936-1-fvdl@google.com> Mime-Version: 1.0 References: <20250228182928.2645936-1-fvdl@google.com> X-Mailer: git-send-email 2.48.1.711.g2feabab25a-goog Message-ID: <20250228182928.2645936-20-fvdl@google.com> Subject: [PATCH v5 19/27] mm/hugetlb: do pre-HVO for bootmem allocated pages From: Frank van der Linden To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com, roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com, Frank van der Linden X-Rspam-User: X-Stat-Signature: n8dggiwauqxo4srhk6cw38f6ohufc8s9 X-Rspamd-Queue-Id: 17AB34001E X-Rspamd-Server: rspam07 X-HE-Tag: 1740767420-82476 X-HE-Meta: 

For large systems, the overhead of vmemmap pages for hugetlb is
substantial: about 1.5% of memory, which is about 45G for a 3T system.
If you want to configure most of that system for hugetlb (e.g. to use
as backing memory for VMs), there is a chance of running out of memory
on boot, even though you know that the 45G will become available later.

To avoid this scenario, and since it's a waste to first allocate and
then free that 45G during boot, do pre-HVO for hugetlb bootmem
allocated pages ('gigantic' pages).

pre-HVO is done by adding functions that are called from
sparse_vmemmap_init_nid_early and sparse_vmemmap_init_nid_late. The
first is called before memmap allocation, so it takes care of
allocating memmap HVO-style. The second verifies that all bootmem
pages look good; specifically, it checks that they do not intersect
with multiple zones. This can only be done from the
sparse_vmemmap_init_nid_late path, when zones have been initialized.

The hugetlb page size must be aligned to the section size, and aligned
to the size of memory described by the number of page structures
contained in one PMD (since pre-HVO is not prepared to split PMDs).
This should be true for most 'gigantic' pages; it is for 1G pages on
x86, where both of these alignment requirements are 128M.

This will only have an effect if hugetlb_bootmem_alloc was called
early in boot. If not, it won't do anything, and HVO for bootmem
hugetlb pages works as before.
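
[Editor's note] To make the figures above concrete, here is the arithmetic as a small standalone C program. The constants are the usual x86_64 defaults (4K base pages, 64-byte struct page, 2M PMDs, a section shift of 27) and are assumptions of this sketch, not values taken from the patch:

#include <stdio.h>

int main(void)
{
	unsigned long page_size    = 4096;		/* 4K base pages */
	unsigned long struct_page  = 64;		/* sizeof(struct page) */
	unsigned long pmd_size     = 2UL << 20;		/* one PMD maps 2M */
	unsigned long section_size = 1UL << 27;		/* 128M sections */
	unsigned long three_tb     = 3UL << 40;

	/* vmemmap overhead: one struct page per base page */
	printf("overhead: %.2f%%, %luG on a 3T system\n",
	       100.0 * struct_page / page_size,
	       (three_tb / page_size * struct_page) >> 30);

	/* memory described by the page structs inside one vmemmap PMD */
	unsigned long pmd_align = pmd_size / struct_page * page_size;

	printf("PMD-based alignment:     %luM\n", pmd_align >> 20);
	printf("section-based alignment: %luM\n", section_size >> 20);
	return 0;
}

This prints an overhead of 1.56% and 48G (which the commit message rounds to "about 1.5%" and "about 45G"), and 128M for both alignment requirements, matching the 1G-page case described above.
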
Signed-off-by: Frank van der Linden
---
 include/linux/hugetlb.h |   2 +
 mm/hugetlb.c            |  17 ++++-
 mm/hugetlb_vmemmap.c    | 143 ++++++++++++++++++++++++++++++++++++++++
 mm/hugetlb_vmemmap.h    |  14 ++++
 mm/sparse-vmemmap.c     |   4 ++
 5 files changed, 177 insertions(+), 3 deletions(-)

diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 10a7ce2b95e1..2512463bca49 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -687,6 +687,8 @@ struct huge_bootmem_page {
 #define HUGE_BOOTMEM_HVO		0x0001
 #define HUGE_BOOTMEM_ZONES_VALID	0x0002
 
+bool hugetlb_bootmem_page_zones_valid(int nid, struct huge_bootmem_page *m);
+
 int isolate_or_dissolve_huge_page(struct page *page, struct list_head *list);
 int replace_free_hugepage_folios(unsigned long start_pfn, unsigned long end_pfn);
 struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 40c88c46b34f..634dc53f1e3e 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3211,7 +3211,18 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 	 */
 	memblock_reserved_mark_noinit(virt_to_phys((void *)m + PAGE_SIZE),
 				      huge_page_size(h) - PAGE_SIZE);
-	/* Put them into a private list first because mem_map is not up yet */
+
+	/*
+	 * Put them into a private list first because mem_map is not up yet.
+	 *
+	 * For pre-HVO to work correctly, pages need to be on the list for
+	 * the node they were actually allocated from. That node may be
+	 * different in the case of fallback by memblock_alloc_try_nid_raw.
+	 * So, extract the actual node first.
+	 */
+	if (nid == NUMA_NO_NODE)
+		node = early_pfn_to_nid(PHYS_PFN(virt_to_phys(m)));
+
 	INIT_LIST_HEAD(&m->list);
 	list_add(&m->list, &huge_boot_pages[node]);
 	m->hstate = h;
@@ -3306,8 +3317,8 @@ static void __init prep_and_add_bootmem_folios(struct hstate *h,
 	}
 }
 
-static bool __init hugetlb_bootmem_page_zones_valid(int nid,
-						    struct huge_bootmem_page *m)
+bool __init hugetlb_bootmem_page_zones_valid(int nid,
+					     struct huge_bootmem_page *m)
 {
 	unsigned long start_pfn;
 	bool valid;
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index be6b33ecbc8e..9a99dfa3c495 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -743,6 +743,149 @@ void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list)
 	__hugetlb_vmemmap_optimize_folios(h, folio_list, true);
 }
 
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+
+/* Return true if a bootmem allocated HugeTLB page should be pre-HVO-ed */
+static bool vmemmap_should_optimize_bootmem_page(struct huge_bootmem_page *m)
+{
+	unsigned long section_size, psize, pmd_vmemmap_size;
+	phys_addr_t paddr;
+
+	if (!READ_ONCE(vmemmap_optimize_enabled))
+		return false;
+
+	if (!hugetlb_vmemmap_optimizable(m->hstate))
+		return false;
+
+	psize = huge_page_size(m->hstate);
+	paddr = virt_to_phys(m);
+
+	/*
+	 * Pre-HVO only works if the bootmem huge page
+	 * is aligned to the section size.
+	 */
+	section_size = (1UL << PA_SECTION_SHIFT);
+	if (!IS_ALIGNED(paddr, section_size) ||
+	    !IS_ALIGNED(psize, section_size))
+		return false;
+
+	/*
+	 * The pre-HVO code does not deal with splitting PMDs,
+	 * so the bootmem page must be aligned to the number
+	 * of base pages that can be mapped with one vmemmap PMD.
+	 */
+	pmd_vmemmap_size = (PMD_SIZE / (sizeof(struct page))) << PAGE_SHIFT;
+	if (!IS_ALIGNED(paddr, pmd_vmemmap_size) ||
+	    !IS_ALIGNED(psize, pmd_vmemmap_size))
+		return false;
+
+	return true;
+}
+
+/*
+ * Initialize memmap section for a gigantic page, HVO-style.
+ */
+void __init hugetlb_vmemmap_init_early(int nid)
+{
+	unsigned long psize, paddr, section_size;
+	unsigned long ns, i, pnum, pfn, nr_pages;
+	unsigned long start, end;
+	struct huge_bootmem_page *m = NULL;
+	void *map;
+
+	/*
+	 * Nothing to do if bootmem pages were not allocated
+	 * early in boot, or if HVO wasn't enabled in the
+	 * first place.
+	 */
+	if (!hugetlb_bootmem_allocated())
+		return;
+
+	if (!READ_ONCE(vmemmap_optimize_enabled))
+		return;
+
+	section_size = (1UL << PA_SECTION_SHIFT);
+
+	list_for_each_entry(m, &huge_boot_pages[nid], list) {
+		if (!vmemmap_should_optimize_bootmem_page(m))
+			continue;
+
+		nr_pages = pages_per_huge_page(m->hstate);
+		psize = nr_pages << PAGE_SHIFT;
+		paddr = virt_to_phys(m);
+		pfn = PHYS_PFN(paddr);
+		map = pfn_to_page(pfn);
+		start = (unsigned long)map;
+		end = start + nr_pages * sizeof(struct page);
+
+		if (vmemmap_populate_hvo(start, end, nid,
+					 HUGETLB_VMEMMAP_RESERVE_SIZE) < 0)
+			continue;
+
+		memmap_boot_pages_add(HUGETLB_VMEMMAP_RESERVE_SIZE / PAGE_SIZE);
+
+		pnum = pfn_to_section_nr(pfn);
+		ns = psize / section_size;
+
+		for (i = 0; i < ns; i++) {
+			sparse_init_early_section(nid, map, pnum,
+						  SECTION_IS_VMEMMAP_PREINIT);
+			map += section_map_size();
+			pnum++;
+		}
+
+		m->flags |= HUGE_BOOTMEM_HVO;
+	}
+}
+
+void __init hugetlb_vmemmap_init_late(int nid)
+{
+	struct huge_bootmem_page *m, *tm;
+	unsigned long phys, nr_pages, start, end;
+	unsigned long pfn, nr_mmap;
+	struct hstate *h;
+	void *map;
+
+	if (!hugetlb_bootmem_allocated())
+		return;
+
+	if (!READ_ONCE(vmemmap_optimize_enabled))
+		return;
+
+	list_for_each_entry_safe(m, tm, &huge_boot_pages[nid], list) {
+		if (!(m->flags & HUGE_BOOTMEM_HVO))
+			continue;
+
+		phys = virt_to_phys(m);
+		h = m->hstate;
+		pfn = PHYS_PFN(phys);
+		nr_pages = pages_per_huge_page(h);
+
+		if (!hugetlb_bootmem_page_zones_valid(nid, m)) {
+			/*
+			 * Oops, the hugetlb page spans multiple zones.
+			 * Remove it from the list, and undo HVO.
+			 */
+			list_del(&m->list);
+
+			map = pfn_to_page(pfn);
+
+			start = (unsigned long)map;
+			end = start + nr_pages * sizeof(struct page);
+
+			vmemmap_undo_hvo(start, end, nid,
+					 HUGETLB_VMEMMAP_RESERVE_SIZE);
+			nr_mmap = end - start - HUGETLB_VMEMMAP_RESERVE_SIZE;
+			memmap_boot_pages_add(DIV_ROUND_UP(nr_mmap, PAGE_SIZE));
+
+			memblock_phys_free(phys, huge_page_size(h));
+			continue;
+		} else
+			m->flags |= HUGE_BOOTMEM_ZONES_VALID;
+	}
+}
+#endif
+
 static const struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{
 		.procname	= "hugetlb_optimize_vmemmap",
diff --git a/mm/hugetlb_vmemmap.h b/mm/hugetlb_vmemmap.h
index 62d3d645a793..18b490825215 100644
--- a/mm/hugetlb_vmemmap.h
+++ b/mm/hugetlb_vmemmap.h
@@ -9,6 +9,8 @@
 #ifndef _LINUX_HUGETLB_VMEMMAP_H
 #define _LINUX_HUGETLB_VMEMMAP_H
 #include
+#include
+#include
 
 /*
  * Reserve one vmemmap page, all vmemmap addresses are mapped to it. See
@@ -25,6 +27,10 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio);
 void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list);
 void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h, struct list_head *folio_list);
+#ifdef CONFIG_SPARSEMEM_VMEMMAP_PREINIT
+void hugetlb_vmemmap_init_early(int nid);
+void hugetlb_vmemmap_init_late(int nid);
+#endif
 
 static inline unsigned int hugetlb_vmemmap_size(const struct hstate *h)
@@ -71,6 +77,14 @@ static inline void hugetlb_vmemmap_optimize_bootmem_folios(struct hstate *h,
 {
 }
 
+static inline void hugetlb_vmemmap_init_early(int nid)
+{
+}
+
+static inline void hugetlb_vmemmap_init_late(int nid)
+{
+}
+
 static inline unsigned int hugetlb_vmemmap_optimizable_size(const struct hstate *h)
 {
 	return 0;
diff --git a/mm/sparse-vmemmap.c b/mm/sparse-vmemmap.c
index 8cc848c4b17c..fd2ab5118e13 100644
--- a/mm/sparse-vmemmap.c
+++ b/mm/sparse-vmemmap.c
@@ -32,6 +32,8 @@
 #include
 #include
 
+#include "hugetlb_vmemmap.h"
+
 /*
  * Flags for vmemmap_populate_range and friends.
  */
@@ -594,6 +596,7 @@ struct page * __meminit __populate_section_memmap(unsigned long pfn,
  */
 void __init sparse_vmemmap_init_nid_early(int nid)
 {
+	hugetlb_vmemmap_init_early(nid);
 }
 
 /*
@@ -604,5 +607,6 @@ void __init sparse_vmemmap_init_nid_early(int nid)
  */
 void __init sparse_vmemmap_init_nid_late(int nid)
 {
+	hugetlb_vmemmap_init_late(nid);
 }
 #endif
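
[Editor's note] hugetlb_vmemmap_init_early/late above follow a common two-phase boot pattern: prepare optimistically while information is incomplete, then validate and roll back once the full state (here: zones) is known. A generic, self-contained sketch of that pattern; all names are hypothetical, nothing here is kernel API:

#include <stdbool.h>
#include <stdio.h>

#define NR_ITEMS 4

struct item {
	bool prepared;	/* phase 1: set up optimistically (cf. HUGE_BOOTMEM_HVO) */
	bool validated;	/* phase 2: confirmed good (cf. HUGE_BOOTMEM_ZONES_VALID) */
};

static struct item items[NR_ITEMS];

static bool can_prepare(int i) { return i != 1; }  /* e.g. an alignment check */
static bool still_valid(int i) { return i != 2; }  /* e.g. a zone check, known late */

/* Runs before full system state is available. */
static void init_early(void)
{
	for (int i = 0; i < NR_ITEMS; i++)
		if (can_prepare(i))
			items[i].prepared = true;
}

/* Runs later: validate, and undo whatever turned out to be bad. */
static void init_late(void)
{
	for (int i = 0; i < NR_ITEMS; i++) {
		if (!items[i].prepared)
			continue;
		if (still_valid(i))
			items[i].validated = true;
		else
			items[i].prepared = false;	/* roll back */
	}
}

int main(void)
{
	init_early();
	init_late();
	for (int i = 0; i < NR_ITEMS; i++)
		printf("item %d: prepared=%d validated=%d\n",
		       i, items[i].prepared, items[i].validated);
	return 0;
}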

From patchwork Fri Feb 28 18:29:21 2025
From: Frank van der Linden
Date: Fri, 28 Feb 2025 18:29:21 +0000
Message-ID: <20250228182928.2645936-21-fvdl@google.com>
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
Subject: [PATCH v5 20/27] x86/setup: call hugetlb_bootmem_alloc early

Call hugetlb_bootmem_alloc in an earlier spot in setup, after
hugetlb_cma_reserve. This will make vmemmap preinit of the sections
covered by the allocated hugetlb pages possible.

Cc: Dave Hansen
Cc: Andy Lutomirski
Cc: Peter Zijlstra
Signed-off-by: Frank van der Linden
---
 arch/x86/kernel/setup.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index cebee310e200..ff8604007b08 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -1108,8 +1108,10 @@ void __init setup_arch(char **cmdline_p)
 	initmem_init();
 	dma_contiguous_reserve(max_pfn_mapped << PAGE_SHIFT);
 
-	if (boot_cpu_has(X86_FEATURE_GBPAGES))
+	if (boot_cpu_has(X86_FEATURE_GBPAGES)) {
 		hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
+		hugetlb_bootmem_alloc();
+	}
 
 	/*
 	 * Reserve memory for crash kernel after SRAT is parsed so that it

From patchwork Fri Feb 28 18:29:22 2025
From: Frank van der Linden
Date: Fri, 28 Feb 2025 18:29:22 +0000
Message-ID: <20250228182928.2645936-22-fvdl@google.com>
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
Subject: [PATCH v5 21/27] x86/mm: set ARCH_WANT_HUGETLB_VMEMMAP_PREINIT

Now that hugetlb bootmem pages are allocated earlier, and available
for section preinit (HVO-style), set ARCH_WANT_HUGETLB_VMEMMAP_PREINIT
for x86_64, so that it can be done. This enables pre-HVO on x86_64.

Cc: Johannes Weiner
Signed-off-by: Frank van der Linden
---
 arch/x86/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index be2c311f5118..384e54b23d50 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -146,6 +146,7 @@ config X86
 	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_WANT_OPTIMIZE_DAX_VMEMMAP	if X86_64
 	select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP	if X86_64
+	select ARCH_WANT_HUGETLB_VMEMMAP_PREINIT if X86_64
 	select ARCH_WANTS_THP_SWAP	if X86_64
 	select ARCH_HAS_PARANOID_L1D_FLUSH
 	select BUILDTIME_TABLE_SORT

From patchwork Fri Feb 28 18:29:23 2025
From: Frank van der Linden
Date: Fri, 28 Feb 2025 18:29:23 +0000
Message-ID: <20250228182928.2645936-23-fvdl@google.com>
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
Subject: [PATCH v5 22/27] mm/cma: simplify zone intersection check

cma_activate_area walks all pages in the area, checking their zone
individually to see if the area resides in more than one zone. Make
this a little more efficient by using the recently introduced
pfn_range_intersects_zones() function. Store the NUMA node id (if any)
in the cma structure to facilitate this.

Signed-off-by: Frank van der Linden
---
 mm/cma.c | 13 ++++++-------
 mm/cma.h |  2 ++
 2 files changed, 8 insertions(+), 7 deletions(-)

diff --git a/mm/cma.c b/mm/cma.c
index 8dc46bfa3819..61ad4fd2f62d 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -103,7 +103,6 @@ static void __init cma_activate_area(struct cma *cma)
 {
 	unsigned long pfn, base_pfn;
 	int allocrange, r;
-	struct zone *zone;
 	struct cma_memrange *cmr;
 
 	for (allocrange = 0; allocrange < cma->nranges; allocrange++) {
@@ -124,12 +123,8 @@ static void __init cma_activate_area(struct cma *cma)
 		 * CMA resv range to be in the same zone.
		 */
 		WARN_ON_ONCE(!pfn_valid(base_pfn));
-		zone = page_zone(pfn_to_page(base_pfn));
-		for (pfn = base_pfn + 1; pfn < base_pfn + cmr->count; pfn++) {
-			WARN_ON_ONCE(!pfn_valid(pfn));
-			if (page_zone(pfn_to_page(pfn)) != zone)
-				goto cleanup;
-		}
+		if (pfn_range_intersects_zones(cma->nid, base_pfn, cmr->count))
+			goto cleanup;
 
 		for (pfn = base_pfn; pfn < base_pfn + cmr->count;
 		     pfn += pageblock_nr_pages)
@@ -261,6 +256,7 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
 	cma->ranges[0].base_pfn = PFN_DOWN(base);
 	cma->ranges[0].count = cma->count;
 	cma->nranges = 1;
+	cma->nid = NUMA_NO_NODE;
 
 	*res_cma = cma;
 
@@ -497,6 +493,7 @@
 	}
 
 	cma->nranges = nr;
+	cma->nid = nid;
 	*res_cma = cma;
 
 out:
@@ -684,6 +681,8 @@ static int __init __cma_declare_contiguous_nid(phys_addr_t base,
 	if (ret)
 		memblock_phys_free(base, size);
 
+	(*res_cma)->nid = nid;
+
 	return ret;
 }
 
diff --git a/mm/cma.h b/mm/cma.h
index 5f39dd1aac91..ff79dba5508c 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -50,6 +50,8 @@ struct cma {
 	struct cma_kobject *cma_kobj;
 #endif
 	bool reserve_pages_on_error;
+	/* NUMA node (NUMA_NO_NODE if unspecified) */
+	int nid;
 };
 
 extern struct cma cma_areas[MAX_CMA_AREAS];
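
[Editor's note] The gain is algorithmic: the removed loop did one page_zone() lookup per pfn, while pfn_range_intersects_zones() answers the same question per range. A self-contained sketch of the two approaches against a made-up zone table; this is an illustration only, not the kernel implementation:

#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for the zone spans on one node. */
struct zone_span { unsigned long start_pfn, end_pfn; };

static const struct zone_span zones[] = {
	{ 0,       0x40000 },	/* e.g. DMA32  */
	{ 0x40000, 0x80000 },	/* e.g. Normal */
};

/* O(pages): what cma_activate_area used to do, one lookup per pfn. */
static bool crosses_zone_per_pfn(unsigned long base, unsigned long count)
{
	int first = -1;

	for (unsigned long pfn = base; pfn < base + count; pfn++)
		for (int z = 0; z < 2; z++)
			if (pfn >= zones[z].start_pfn && pfn < zones[z].end_pfn) {
				if (first < 0)
					first = z;
				else if (z != first)
					return true;
			}
	return false;
}

/* O(zones): a pfn_range_intersects_zones()-style range check. */
static bool crosses_zone_per_range(unsigned long base, unsigned long count)
{
	for (int z = 0; z < 2; z++)
		if (base >= zones[z].start_pfn &&
		    base + count <= zones[z].end_pfn)
			return false;	/* fully inside one zone */
	return true;
}

int main(void)
{
	/* straddles the 0x40000 boundary: both report a crossing */
	printf("%d %d\n", crosses_zone_per_pfn(0x3ff00, 0x200),
	       crosses_zone_per_range(0x3ff00, 0x200));
	/* fully inside the first zone: both report no crossing */
	printf("%d %d\n", crosses_zone_per_pfn(0x1000, 0x200),
	       crosses_zone_per_range(0x1000, 0x200));
	return 0;
}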

From patchwork Fri Feb 28 18:29:24 2025
From: Frank van der Linden
Date: Fri, 28 Feb 2025 18:29:24 +0000
Message-ID: <20250228182928.2645936-24-fvdl@google.com>
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
Subject: [PATCH v5 23/27] mm/cma: introduce a cma validate function

Define a function to check if a CMA area is valid, which means that
its ranges do not cross any zone boundaries. Store the result in the
newly created flags for each CMA area, so that multiple calls are
dealt with.

This allows for checking the validity of a CMA area early, which is
needed later in order to be able to allocate hugetlb bootmem pages
from it with pre-HVO.

Signed-off-by: Frank van der Linden
---
 include/linux/cma.h |  5 ++++
 mm/cma.c            | 60 ++++++++++++++++++++++++++++++++++++---------
 mm/cma.h            |  8 +++++-
 3 files changed, 60 insertions(+), 13 deletions(-)

diff --git a/include/linux/cma.h b/include/linux/cma.h
index 03d85c100dcc..62d9c1cf6326 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -60,6 +60,7 @@ extern void cma_reserve_pages_on_error(struct cma *cma);
 #ifdef CONFIG_CMA
 struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
 bool cma_free_folio(struct cma *cma, const struct folio *folio);
+bool cma_validate_zones(struct cma *cma);
 #else
 static inline struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
 {
@@ -70,6 +71,10 @@ static inline bool cma_free_folio(struct cma *cma, const struct folio *folio)
 {
 	return false;
 }
+static inline bool cma_validate_zones(struct cma *cma)
+{
+	return false;
+}
 #endif
 
 #endif
diff --git a/mm/cma.c b/mm/cma.c
index 61ad4fd2f62d..5e1d169e24fa 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -99,6 +99,49 @@ static void cma_clear_bitmap(struct cma *cma, const struct cma_memrange *cmr,
 	spin_unlock_irqrestore(&cma->lock, flags);
 }
 
+/*
+ * Check if a CMA area contains no ranges that intersect with
+ * multiple zones.
 include/linux/cma.h |  5 ++++
 mm/cma.c            | 60 ++++++++++++++++++++++++++++++++++++---
 mm/cma.h            |  8 +++++-
 3 files changed, 60 insertions(+), 13 deletions(-)

diff --git a/include/linux/cma.h b/include/linux/cma.h
index 03d85c100dcc..62d9c1cf6326 100644
--- a/include/linux/cma.h
+++ b/include/linux/cma.h
@@ -60,6 +60,7 @@ extern void cma_reserve_pages_on_error(struct cma *cma);
 #ifdef CONFIG_CMA
 struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp);
 bool cma_free_folio(struct cma *cma, const struct folio *folio);
+bool cma_validate_zones(struct cma *cma);
 #else
 static inline struct folio *cma_alloc_folio(struct cma *cma, int order, gfp_t gfp)
 {
@@ -70,6 +71,10 @@ static inline bool cma_free_folio(struct cma *cma, const struct folio *folio)
 {
     return false;
 }
+static inline bool cma_validate_zones(struct cma *cma)
+{
+    return false;
+}
 #endif
 
 #endif
diff --git a/mm/cma.c b/mm/cma.c
index 61ad4fd2f62d..5e1d169e24fa 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -99,6 +99,49 @@ static void cma_clear_bitmap(struct cma *cma, const struct cma_memrange *cmr,
     spin_unlock_irqrestore(&cma->lock, flags);
 }
 
+/*
+ * Check if a CMA area contains no ranges that intersect with
+ * multiple zones. Store the result in the flags in case
+ * this gets called more than once.
+ */
+bool cma_validate_zones(struct cma *cma)
+{
+    int r;
+    unsigned long base_pfn;
+    struct cma_memrange *cmr;
+    bool valid_bit_set;
+
+    /*
+     * If already validated, return result of previous check.
+     * Either the valid or invalid bit will be set if this
+     * check has already been done. If neither is set, the
+     * check has not been performed yet.
+     */
+    valid_bit_set = test_bit(CMA_ZONES_VALID, &cma->flags);
+    if (valid_bit_set || test_bit(CMA_ZONES_INVALID, &cma->flags))
+        return valid_bit_set;
+
+    for (r = 0; r < cma->nranges; r++) {
+        cmr = &cma->ranges[r];
+        base_pfn = cmr->base_pfn;
+
+        /*
+         * alloc_contig_range() requires the pfn range specified
+         * to be in the same zone. Simplify by forcing the entire
+         * CMA resv range to be in the same zone.
+         */
+        WARN_ON_ONCE(!pfn_valid(base_pfn));
+        if (pfn_range_intersects_zones(cma->nid, base_pfn, cmr->count)) {
+            set_bit(CMA_ZONES_INVALID, &cma->flags);
+            return false;
+        }
+    }
+
+    set_bit(CMA_ZONES_VALID, &cma->flags);
+
+    return true;
+}
+
 static void __init cma_activate_area(struct cma *cma)
 {
     unsigned long pfn, base_pfn;
@@ -113,19 +156,12 @@ static void __init cma_activate_area(struct cma *cma)
         goto cleanup;
     }
 
+    if (!cma_validate_zones(cma))
+        goto cleanup;
+
     for (r = 0; r < cma->nranges; r++) {
         cmr = &cma->ranges[r];
         base_pfn = cmr->base_pfn;
-
-        /*
-         * alloc_contig_range() requires the pfn range specified
-         * to be in the same zone. Simplify by forcing the entire
-         * CMA resv range to be in the same zone.
-         */
-        WARN_ON_ONCE(!pfn_valid(base_pfn));
-        if (pfn_range_intersects_zones(cma->nid, base_pfn, cmr->count))
-            goto cleanup;
-
         for (pfn = base_pfn; pfn < base_pfn + cmr->count;
              pfn += pageblock_nr_pages)
             init_cma_reserved_pageblock(pfn_to_page(pfn));
@@ -145,7 +181,7 @@ static void __init cma_activate_area(struct cma *cma)
         bitmap_free(cma->ranges[r].bitmap);
     /* Expose all pages to the buddy, they are useless for CMA. */
-    if (!cma->reserve_pages_on_error) {
+    if (!test_bit(CMA_RESERVE_PAGES_ON_ERROR, &cma->flags)) {
         for (r = 0; r < allocrange; r++) {
             cmr = &cma->ranges[r];
             for (pfn = cmr->base_pfn;
@@ -172,7 +208,7 @@ core_initcall(cma_init_reserved_areas);
 
 void __init cma_reserve_pages_on_error(struct cma *cma)
 {
-    cma->reserve_pages_on_error = true;
+    set_bit(CMA_RESERVE_PAGES_ON_ERROR, &cma->flags);
 }
 
 static int __init cma_new_area(const char *name, phys_addr_t size,
diff --git a/mm/cma.h b/mm/cma.h
index ff79dba5508c..bddc84b3cd96 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -49,11 +49,17 @@ struct cma {
     /* kobject requires dynamic object */
     struct cma_kobject *cma_kobj;
 #endif
-    bool reserve_pages_on_error;
+    unsigned long flags;
     /* NUMA node (NUMA_NO_NODE if unspecified) */
     int nid;
 };
 
+enum cma_flags {
+    CMA_RESERVE_PAGES_ON_ERROR,
+    CMA_ZONES_VALID,
+    CMA_ZONES_INVALID,
+};
+
 extern struct cma cma_areas[MAX_CMA_AREAS];
 extern unsigned int cma_area_count;

From patchwork Fri Feb 28 18:29:25 2025
Date: Fri, 28 Feb 2025 18:29:25 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-25-fvdl@google.com>
Subject: [PATCH v5 24/27] mm/cma: introduce interface for early reservations
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
    roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com,
    Frank van der Linden
It can be desirable to reserve memory in a CMA area before it is
activated, early in boot. Such reservations would effectively be
memblock allocations, but they can be returned to the CMA area later.
This functionality can be used to allow hugetlb bootmem allocations
from a hugetlb CMA area.

A new interface, cma_reserve_early(), is introduced. It allows for
pageblock-aligned reservations. These reservations are skipped during
the initial handoff of pages in a CMA area to the buddy allocator. The
caller is responsible for making sure that the page structures are set
up, and that the migrate type is set correctly, as with other memblock
allocations that stick around.

If the CMA area fails to activate (because it intersects with multiple
zones), the reserved memory is not given to the buddy allocator; the
caller needs to take care of that.

Signed-off-by: Frank van der Linden
---
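The reservation scheme is, in effect, a bump-pointer allocator over the
area's ranges: each range keeps an early_pfn cursor that only moves
forward, and a request is carved out of the first range with enough room
left. A self-contained user-space sketch of the idea (the field names
mirror the patch, but the sizes and the helper are illustrative only):

#include <stdio.h>

struct memrange {
    unsigned long base_pfn;
    unsigned long early_pfn;  /* first pfn not yet handed out */
    unsigned long count;      /* size of the range in pages */
};

/* Carve 'pages' out of the first range that still has enough room. */
static long reserve_early(struct memrange *r, int nranges, unsigned long pages)
{
    for (int i = 0; i < nranges; i++) {
        unsigned long used = r[i].early_pfn - r[i].base_pfn;

        if (pages <= r[i].count - used) {
            long pfn = (long)r[i].early_pfn;

            r[i].early_pfn += pages;  /* bump the cursor, never rewind */
            return pfn;
        }
    }
    return -1;  /* no single range has enough room left */
}

int main(void)
{
    struct memrange r[2] = {
        { .base_pfn = 0x1000, .early_pfn = 0x1000, .count = 512 },
        { .base_pfn = 0x4000, .early_pfn = 0x4000, .count = 1024 },
    };
    long pfn;

    pfn = reserve_early(r, 2, 512);   /* exhausts range 0 */
    printf("first:  %#lx\n", (unsigned long)pfn);
    pfn = reserve_early(r, 2, 512);   /* falls through to range 1 */
    printf("second: %#lx\n", (unsigned long)pfn);
    return 0;
}

Like cma_reserve_early() below, the sketch has no way to unreserve: that
is what keeps the bookkeeping down to a single PFN cursor per range.
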
 mm/cma.c      | 83 ++++++++++++++++++++++++++++++++++++++++++++++-----
 mm/cma.h      |  8 +++++
 mm/internal.h | 16 ++++++++++
 mm/mm_init.c  |  9 ++++++
 4 files changed, 109 insertions(+), 7 deletions(-)

diff --git a/mm/cma.c b/mm/cma.c
index 5e1d169e24fa..09322b8284bd 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -144,9 +144,10 @@ bool cma_validate_zones(struct cma *cma)
 
 static void __init cma_activate_area(struct cma *cma)
 {
-    unsigned long pfn, base_pfn;
+    unsigned long pfn, end_pfn;
     int allocrange, r;
     struct cma_memrange *cmr;
+    unsigned long bitmap_count, count;
 
     for (allocrange = 0; allocrange < cma->nranges; allocrange++) {
         cmr = &cma->ranges[allocrange];
@@ -161,8 +162,13 @@ static void __init cma_activate_area(struct cma *cma)
 
     for (r = 0; r < cma->nranges; r++) {
         cmr = &cma->ranges[r];
-        base_pfn = cmr->base_pfn;
-        for (pfn = base_pfn; pfn < base_pfn + cmr->count;
+        if (cmr->early_pfn != cmr->base_pfn) {
+            count = cmr->early_pfn - cmr->base_pfn;
+            bitmap_count = cma_bitmap_pages_to_bits(cma, count);
+            bitmap_set(cmr->bitmap, 0, bitmap_count);
+        }
+
+        for (pfn = cmr->early_pfn; pfn < cmr->base_pfn + cmr->count;
              pfn += pageblock_nr_pages)
             init_cma_reserved_pageblock(pfn_to_page(pfn));
     }
@@ -173,6 +179,7 @@ static void __init cma_activate_area(struct cma *cma)
     INIT_HLIST_HEAD(&cma->mem_head);
     spin_lock_init(&cma->mem_head_lock);
 #endif
+    set_bit(CMA_ACTIVATED, &cma->flags);
 
     return;
 
@@ -184,9 +191,8 @@ static void __init cma_activate_area(struct cma *cma)
     if (!test_bit(CMA_RESERVE_PAGES_ON_ERROR, &cma->flags)) {
         for (r = 0; r < allocrange; r++) {
             cmr = &cma->ranges[r];
-            for (pfn = cmr->base_pfn;
-                 pfn < cmr->base_pfn + cmr->count;
-                 pfn++)
+            end_pfn = cmr->base_pfn + cmr->count;
+            for (pfn = cmr->early_pfn; pfn < end_pfn; pfn++)
                 free_reserved_page(pfn_to_page(pfn));
         }
     }
@@ -290,6 +296,7 @@ int __init cma_init_reserved_mem(phys_addr_t base, phys_addr_t size,
         return ret;
 
     cma->ranges[0].base_pfn = PFN_DOWN(base);
+    cma->ranges[0].early_pfn = PFN_DOWN(base);
     cma->ranges[0].count = cma->count;
     cma->nranges = 1;
     cma->nid = NUMA_NO_NODE;
@@ -509,6 +516,7 @@ int __init cma_declare_contiguous_multi(phys_addr_t total_size,
             nr, (u64)mlp->base, (u64)mlp->base + size);
         cmrp = &cma->ranges[nr++];
         cmrp->base_pfn = PHYS_PFN(mlp->base);
+        cmrp->early_pfn = cmrp->base_pfn;
         cmrp->count = size >> PAGE_SHIFT;
 
         sizeleft -= size;
@@ -540,7 +548,6 @@ int __init cma_declare_contiguous_multi(phys_addr_t total_size,
     pr_info("Reserved %lu MiB in %d range%s\n",
         (unsigned long)total_size / SZ_1M, nr, nr > 1 ? "s" : "");
-
     return ret;
 }
@@ -1034,3 +1041,65 @@ bool cma_intersects(struct cma *cma, unsigned long start, unsigned long end)
 
     return false;
 }
+
+/*
+ * Very basic function to reserve memory from a CMA area that has not
+ * yet been activated. This is expected to be called early, when the
+ * system is single-threaded, so there is no locking. The alignment
+ * checking is restrictive - only pageblock-aligned areas
+ * (CMA_MIN_ALIGNMENT_BYTES) may be reserved through this function.
+ * This keeps things simple, and is enough for the current use case.
+ *
+ * The CMA bitmaps have not yet been allocated, so just start
+ * reserving from the bottom up, using a PFN to keep track
+ * of what has been reserved. Unreserving is not possible.
+ *
+ * The caller is responsible for initializing the page structures
+ * in the area properly, since this just points to memblock-allocated
+ * memory.
+ * The caller should subsequently use init_cma_pageblock to
+ * set the migrate type and CMA stats for the pageblocks that were
+ * reserved.
+ *
+ * If the CMA area fails to activate later, memory obtained through
+ * this interface is not handed to the page allocator; this is
+ * the responsibility of the caller (e.g. like normal memblock-allocated
+ * memory).
+ */
+void __init *cma_reserve_early(struct cma *cma, unsigned long size)
+{
+    int r;
+    struct cma_memrange *cmr;
+    unsigned long available;
+    void *ret = NULL;
+
+    if (!cma || !cma->count)
+        return NULL;
+    /*
+     * Can only be called early in init.
+     */
+    if (test_bit(CMA_ACTIVATED, &cma->flags))
+        return NULL;
+
+    if (!IS_ALIGNED(size, CMA_MIN_ALIGNMENT_BYTES))
+        return NULL;
+
+    if (!IS_ALIGNED(size, (PAGE_SIZE << cma->order_per_bit)))
+        return NULL;
+
+    size >>= PAGE_SHIFT;
+
+    if (size > cma->available_count)
+        return NULL;
+
+    for (r = 0; r < cma->nranges; r++) {
+        cmr = &cma->ranges[r];
+        available = cmr->count - (cmr->early_pfn - cmr->base_pfn);
+        if (size <= available) {
+            ret = phys_to_virt(PFN_PHYS(cmr->early_pfn));
+            cmr->early_pfn += size;
+            cma->available_count -= size;
+            return ret;
+        }
+    }
+
+    return ret;
+}
diff --git a/mm/cma.h b/mm/cma.h
index bddc84b3cd96..df7fc623b7a6 100644
--- a/mm/cma.h
+++ b/mm/cma.h
@@ -16,9 +16,16 @@ struct cma_kobject {
  * and the total amount of memory requested, while smaller than the total
  * amount of memory available, is large enough that it doesn't fit in a
  * single physical memory range because of memory holes.
+ *
+ * Fields:
+ * @base_pfn:  physical address of range
+ * @early_pfn: first PFN not reserved through cma_reserve_early
+ * @count:     size of range
+ * @bitmap:    bitmap of allocated (1 << order_per_bit)-sized chunks.
  */
 struct cma_memrange {
     unsigned long base_pfn;
+    unsigned long early_pfn;
     unsigned long count;
     unsigned long *bitmap;
 #ifdef CONFIG_CMA_DEBUGFS
@@ -58,6 +65,7 @@ enum cma_flags {
     CMA_RESERVE_PAGES_ON_ERROR,
     CMA_ZONES_VALID,
     CMA_ZONES_INVALID,
+    CMA_ACTIVATED,
 };
 
 extern struct cma cma_areas[MAX_CMA_AREAS];
diff --git a/mm/internal.h b/mm/internal.h
index 63fda9bb9426..8318c8e6e589 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -848,6 +848,22 @@ void init_cma_reserved_pageblock(struct page *page);
 
 #endif /* CONFIG_COMPACTION || CONFIG_CMA */
 
+struct cma;
+
+#ifdef CONFIG_CMA
+void *cma_reserve_early(struct cma *cma, unsigned long size);
+void init_cma_pageblock(struct page *page);
+#else
+static inline void *cma_reserve_early(struct cma *cma, unsigned long size)
+{
+    return NULL;
+}
+static inline void init_cma_pageblock(struct page *page)
+{
+}
+#endif
+
+
 int find_suitable_fallback(struct free_area *area, unsigned int order,
                int migratetype, bool only_stealable, bool *can_steal);
diff --git a/mm/mm_init.c b/mm/mm_init.c
index f7d5b4fe1ae9..f31260fd393e 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -2263,6 +2263,15 @@ void __init init_cma_reserved_pageblock(struct page *page)
     adjust_managed_page_count(page, pageblock_nr_pages);
     page_zone(page)->cma_pages += pageblock_nr_pages;
 }
+
+/*
+ * Similar to above, but only set the migrate type and stats.
+ */
+void __init init_cma_pageblock(struct page *page)
+{
+    set_pageblock_migratetype(page, MIGRATE_CMA);
+    adjust_managed_page_count(page, pageblock_nr_pages);
+    page_zone(page)->cma_pages += pageblock_nr_pages;
+}
 #endif
 
 void set_zone_contiguous(struct zone *zone)

From patchwork Fri Feb 28 18:29:26 2025
Date: Fri, 28 Feb 2025 18:29:26 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-26-fvdl@google.com>
Subject: [PATCH v5 25/27] mm/hugetlb: add hugetlb_cma_only cmdline option
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
    roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com,
    Frank van der Linden
Add an option to force hugetlb gigantic pages to be allocated using CMA
only (if hugetlb_cma is enabled). This avoids a fallback to allocation
from the rest of system memory if the CMA allocation fails, and makes
the size of hugetlb_cma a hard upper boundary for gigantic hugetlb page
allocations.

This is useful because, with a large CMA area, the kernel's unmovable
allocations will have less room to work with, and it is undesirable for
new gigantic page allocations to be done from the remaining area: they
would eat into the space available for unmovable allocations, leading
to unwanted system behavior (OOMs because the kernel fails to do
unmovable allocations).

So, with this enabled, an administrator can force a hard upper bound
for runtime gigantic page allocations, and have more predictable system
behavior.

Signed-off-by: Frank van der Linden
---
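For illustration, an administrator who wants all runtime 1G page
allocations confined to a per-boot CMA reservation could combine the new
option with the existing hugetlb parameters on the kernel command line
like this (the sizes here are made up for the example):

    hugetlb_cma=4G hugetlb_cma_only=on default_hugepagesz=1G hugepagesz=1G hugepages=2

hugetlb_cma_only= is parsed with kstrtobool(), so the usual boolean
spellings (1/0, y/n, on/off) work. As the documentation hunk below
notes, it has no effect unless hugetlb_cma= is also given; the
hugetlb_parse_params() change below resets it to false in that case.
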
 Documentation/admin-guide/kernel-parameters.txt |  7 +++++++
 mm/hugetlb.c                                    | 14 ++++++++++++++
 2 files changed, 21 insertions(+)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index ae21d911d1c7..491628ac071a 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -1892,6 +1892,13 @@
             hugepages using the CMA allocator. If enabled, the
             boot-time allocation of gigantic hugepages is skipped.
 
+    hugetlb_cma_only=
+            [HW,CMA,EARLY] When allocating new HugeTLB pages, only
+            try to allocate from the CMA areas.
+
+            This option does nothing if hugetlb_cma= is not also
+            specified.
+
     hugetlb_free_vmemmap=
             [KNL] Requires CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
             enabled.
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 634dc53f1e3e..0b483c466656 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -59,6 +59,7 @@ struct hstate hstates[HUGE_MAX_HSTATE];
 static struct cma *hugetlb_cma[MAX_NUMNODES];
 static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
 #endif
+static bool hugetlb_cma_only;
 static unsigned long hugetlb_cma_size __initdata;
 
 __initdata struct list_head huge_boot_pages[MAX_NUMNODES];
@@ -1510,6 +1511,9 @@ static struct folio *alloc_gigantic_folio(struct hstate *h, gfp_t gfp_mask,
     }
 #endif
     if (!folio) {
+        if (hugetlb_cma_only)
+            return NULL;
+
         folio = folio_alloc_gigantic(order, gfp_mask, nid, nodemask);
         if (!folio)
             return NULL;
@@ -4738,6 +4742,9 @@ static __init void hugetlb_parse_params(void)
 
         hcp->setup(hcp->val);
     }
+
+    if (!hugetlb_cma_size)
+        hugetlb_cma_only = false;
 }
 
 /*
@@ -7850,6 +7857,13 @@ static int __init cmdline_parse_hugetlb_cma(char *p)
 
 early_param("hugetlb_cma", cmdline_parse_hugetlb_cma);
 
+static int __init cmdline_parse_hugetlb_cma_only(char *p)
+{
+    return kstrtobool(p, &hugetlb_cma_only);
+}
+
+early_param("hugetlb_cma_only", cmdline_parse_hugetlb_cma_only);
+
 void __init hugetlb_cma_reserve(int order)
 {
     unsigned long size, reserved, per_node;

From patchwork Fri Feb 28 18:29:27 2025
Date: Fri, 28 Feb 2025 18:29:27 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-27-fvdl@google.com>
Subject: [PATCH v5 26/27] mm/hugetlb: enable bootmem allocation from CMA areas
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
    roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com,
    Frank van der Linden, Madhavan
    Srinivasan, Michael Ellerman, linuxppc-dev@lists.ozlabs.org

If hugetlb_cma_only is enabled, we know that hugetlb pages can only be
allocated from CMA. Now that there is an interface to do early
reservations from a CMA area (returning memblock memory), it can be
used to allocate hugetlb pages from CMA.

This also allows for doing pre-HVO on these pages (if enabled).

Make sure to initialize the page structures and associated data
correctly. Create a flag to signal that a hugetlb page has been
allocated from CMA to make things a little easier.

Some configurations of powerpc have a special hugetlb bootmem
allocator, so introduce a boolean arch_has_huge_bootmem_alloc that
returns true if such an allocator is present. In that case, CMA bootmem
allocations can't be used, so check that function before trying.

Cc: Madhavan Srinivasan
Cc: Michael Ellerman
Cc: linuxppc-dev@lists.ozlabs.org
Signed-off-by: Frank van der Linden
---
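The bootmem path added here tries the preferred node's CMA area first
and, when that fails and an exact node was not demanded, sweeps the
remaining nodes, recording which node the memory actually came from. A
self-contained sketch of just that control flow (try_node() is a made-up
stand-in for cma_reserve_early() on a node's CMA area, and the node
count is arbitrary):

#include <stdbool.h>
#include <stdio.h>

#define NR_NODES 4

/* Stand-in allocator; pretend only node 2 still has room. */
static void *try_node(int node)
{
    static char backing[64];

    return node == 2 ? backing : NULL;
}

static void *alloc_pref_node(int nid, bool node_exact, int *listnode)
{
    void *m = try_node(nid);

    *listnode = nid;
    if (!m && !node_exact) {
        for (int node = 0; node < NR_NODES; node++) {
            if (node == nid)
                continue;  /* the preferred node was already tried */
            m = try_node(node);
            if (m) {
                *listnode = node;  /* remember the real source node */
                break;
            }
        }
    }
    return m;
}

int main(void)
{
    int listnode;
    void *m = alloc_pref_node(0, false, &listnode);

    printf("allocated: %s (node %d)\n", m ? "yes" : "no", listnode);
    return 0;
}

Tracking the real source node matters for the same reason as in the
memblock fallback path: the page must end up on the bootmem list of the
node it was actually allocated from.
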
 arch/powerpc/include/asm/book3s/64/hugetlb.h |   6 +
 include/linux/hugetlb.h                      |  17 ++
 mm/hugetlb.c                                 | 168 ++++++++++++++-----
 3 files changed, 152 insertions(+), 39 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/hugetlb.h b/arch/powerpc/include/asm/book3s/64/hugetlb.h
index f0bba9c5f9c3..bb786694dd26 100644
--- a/arch/powerpc/include/asm/book3s/64/hugetlb.h
+++ b/arch/powerpc/include/asm/book3s/64/hugetlb.h
@@ -94,4 +94,10 @@ static inline int check_and_get_huge_psize(int shift)
     return mmu_psize;
 }
 
+#define arch_has_huge_bootmem_alloc arch_has_huge_bootmem_alloc
+
+static inline bool arch_has_huge_bootmem_alloc(void)
+{
+    return (firmware_has_feature(FW_FEATURE_LPAR) && !radix_enabled());
+}
 #endif
diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h
index 2512463bca49..6c6546b54934 100644
--- a/include/linux/hugetlb.h
+++ b/include/linux/hugetlb.h
@@ -591,6 +591,7 @@ enum hugetlb_page_flags {
     HPG_freed,
     HPG_vmemmap_optimized,
     HPG_raw_hwp_unreliable,
+    HPG_cma,
     __NR_HPAGEFLAGS,
 };
 
@@ -650,6 +651,7 @@ HPAGEFLAG(Temporary, temporary)
 HPAGEFLAG(Freed, freed)
 HPAGEFLAG(VmemmapOptimized, vmemmap_optimized)
 HPAGEFLAG(RawHwpUnreliable, raw_hwp_unreliable)
+HPAGEFLAG(Cma, cma)
 
 #ifdef CONFIG_HUGETLB_PAGE
 
@@ -678,14 +680,18 @@ struct hstate {
     char name[HSTATE_NAME_LEN];
 };
 
+struct cma;
+
 struct huge_bootmem_page {
     struct list_head list;
     struct hstate *hstate;
     unsigned long flags;
+    struct cma *cma;
 };
 
 #define HUGE_BOOTMEM_HVO        0x0001
 #define HUGE_BOOTMEM_ZONES_VALID    0x0002
+#define HUGE_BOOTMEM_CMA        0x0004
 
 bool hugetlb_bootmem_page_zones_valid(int nid, struct huge_bootmem_page *m);
 
@@ -823,6 +829,17 @@ static inline pte_t arch_make_huge_pte(pte_t entry, unsigned int shift,
 }
 #endif
 
+#ifndef arch_has_huge_bootmem_alloc
+/*
+ * Some architectures do their own bootmem allocation, so they can't use
+ * early CMA allocation.
+ */
+static inline bool arch_has_huge_bootmem_alloc(void)
+{
+    return false;
+}
+#endif
+
 static inline struct hstate *folio_hstate(struct folio *folio)
 {
     VM_BUG_ON_FOLIO(!folio_test_hugetlb(folio), folio);
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 0b483c466656..664ccaaa717a 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -131,8 +131,10 @@ static void hugetlb_free_folio(struct folio *folio)
 #ifdef CONFIG_CMA
     int nid = folio_nid(folio);
 
-    if (cma_free_folio(hugetlb_cma[nid], folio))
+    if (folio_test_hugetlb_cma(folio)) {
+        WARN_ON_ONCE(!cma_free_folio(hugetlb_cma[nid], folio));
         return;
+    }
 #endif
     folio_put(folio);
 }
@@ -1508,6 +1510,9 @@ static struct folio *alloc_gigantic_folio(struct hstate *h, gfp_t gfp_mask,
                 break;
             }
         }
+
+        if (folio)
+            folio_set_hugetlb_cma(folio);
     }
 #endif
     if (!folio) {
@@ -3174,6 +3179,86 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
     return ERR_PTR(-ENOSPC);
 }
 
+static bool __init hugetlb_early_cma(struct hstate *h)
+{
+    if (arch_has_huge_bootmem_alloc())
+        return false;
+
+    return (hstate_is_gigantic(h) && hugetlb_cma_only);
+}
+
+static __init void *alloc_bootmem(struct hstate *h, int nid, bool node_exact)
+{
+    struct huge_bootmem_page *m;
+    unsigned long flags;
+    struct cma *cma;
+    int listnode = nid;
+
+#ifdef CONFIG_CMA
+    if (hugetlb_early_cma(h)) {
+        flags = HUGE_BOOTMEM_CMA;
+        cma = hugetlb_cma[nid];
+        m = cma_reserve_early(cma, huge_page_size(h));
+        if (!m) {
+            int node;
+
+            if (node_exact)
+                return NULL;
+            for_each_online_node(node) {
+                cma = hugetlb_cma[node];
+                if (!cma || node == nid)
+                    continue;
+                m = cma_reserve_early(cma, huge_page_size(h));
+                if (m) {
+                    listnode = node;
+                    break;
+                }
+            }
+        }
+    } else
+#endif
+    {
+        flags = 0;
+        cma = NULL;
+        if (node_exact)
+            m = memblock_alloc_exact_nid_raw(huge_page_size(h),
+                             huge_page_size(h), 0,
+                             MEMBLOCK_ALLOC_ACCESSIBLE, nid);
+        else {
+            m = memblock_alloc_try_nid_raw(huge_page_size(h),
+                           huge_page_size(h), 0,
+                           MEMBLOCK_ALLOC_ACCESSIBLE, nid);
+            /*
+             * For pre-HVO to work correctly, pages need to be on
+             * the list for the node they were actually allocated
+             * from. That node may be different in the case of
+             * fallback by memblock_alloc_try_nid_raw. So,
+             * extract the actual node first.
+             */
+            if (m)
+                listnode = early_pfn_to_nid(PHYS_PFN(virt_to_phys(m)));
+        }
+    }
+
+    if (m) {
+        /*
+         * Use the beginning of the huge page to store the
+         * huge_bootmem_page struct (until gather_bootmem
+         * puts them into the mem_map).
+         *
+         * Put them into a private list first because mem_map
+         * is not up yet.
+         */
+        INIT_LIST_HEAD(&m->list);
+        list_add(&m->list, &huge_boot_pages[listnode]);
+        m->hstate = h;
+        m->flags = flags;
+        m->cma = cma;
+    }
+
+    return m;
+}
+
 int alloc_bootmem_huge_page(struct hstate *h, int nid)
     __attribute__ ((weak, alias("__alloc_bootmem_huge_page")));
 int __alloc_bootmem_huge_page(struct hstate *h, int nid)
@@ -3183,22 +3268,15 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
 
     /* do node specific alloc */
     if (nid != NUMA_NO_NODE) {
-        m = memblock_alloc_exact_nid_raw(huge_page_size(h), huge_page_size(h),
-                0, MEMBLOCK_ALLOC_ACCESSIBLE, nid);
+        m = alloc_bootmem(h, node, true);
         if (!m)
             return 0;
         goto found;
     }
+
     /* allocate from next node when distributing huge pages */
     for_each_node_mask_to_alloc(&h->next_nid_to_alloc, nr_nodes, node, &node_states[N_ONLINE]) {
-        m = memblock_alloc_try_nid_raw(
-                huge_page_size(h), huge_page_size(h),
-                0, MEMBLOCK_ALLOC_ACCESSIBLE, node);
-        /*
-         * Use the beginning of the huge page to store the
-         * huge_bootmem_page struct (until gather_bootmem
-         * puts them into the mem_map).
-         */
+        m = alloc_bootmem(h, node, false);
         if (!m)
             return 0;
         goto found;
@@ -3216,21 +3294,6 @@ int __alloc_bootmem_huge_page(struct hstate *h, int nid)
     memblock_reserved_mark_noinit(virt_to_phys((void *)m + PAGE_SIZE),
                       huge_page_size(h) - PAGE_SIZE);
-    /*
-     * Put them into a private list first because mem_map is not up yet.
-     *
-     * For pre-HVO to work correctly, pages need to be on the list for
-     * the node they were actually allocated from. That node may be
-     * different in the case of fallback by memblock_alloc_try_nid_raw.
-     * So, extract the actual node first.
-     */
-    if (nid == NUMA_NO_NODE)
-        node = early_pfn_to_nid(PHYS_PFN(virt_to_phys(m)));
-
-    INIT_LIST_HEAD(&m->list);
-    list_add(&m->list, &huge_boot_pages[node]);
-    m->hstate = h;
-    m->flags = 0;
     return 1;
 }
@@ -3271,13 +3334,25 @@ static void __init hugetlb_folio_init_vmemmap(struct folio *folio,
     prep_compound_head((struct page *)folio, huge_page_order(h));
 }
 
+static bool __init hugetlb_bootmem_page_prehvo(struct huge_bootmem_page *m)
+{
+    return m->flags & HUGE_BOOTMEM_HVO;
+}
+
+static bool __init hugetlb_bootmem_page_earlycma(struct huge_bootmem_page *m)
+{
+    return m->flags & HUGE_BOOTMEM_CMA;
+}
+
 /*
  * memblock-allocated pageblocks might not have the migrate type set
  * if marked with the 'noinit' flag. Set it to the default (MIGRATE_MOVABLE)
- * here.
+ * here, or MIGRATE_CMA if this was a page allocated through an early CMA
+ * reservation.
  *
- * Note that this will not write the page struct, it is ok (and necessary)
- * to do this on vmemmap optimized folios.
+ * In case of vmemmap optimized folios, the tail vmemmap pages are mapped
+ * read-only, but that's ok - for sparse vmemmap this does not write to
+ * the page structure.
  */
 static void __init hugetlb_bootmem_init_migratetype(struct folio *folio,
                     struct hstate *h)
@@ -3286,9 +3361,13 @@ static void __init hugetlb_bootmem_init_migratetype(struct folio *folio,
 
     WARN_ON_ONCE(!pageblock_aligned(folio_pfn(folio)));
 
-    for (i = 0; i < nr_pages; i += pageblock_nr_pages)
-        set_pageblock_migratetype(folio_page(folio, i),
+    for (i = 0; i < nr_pages; i += pageblock_nr_pages) {
+        if (folio_test_hugetlb_cma(folio))
+            init_cma_pageblock(folio_page(folio, i));
+        else
+            set_pageblock_migratetype(folio_page(folio, i),
                       MIGRATE_MOVABLE);
+    }
 }
 
 static void __init prep_and_add_bootmem_folios(struct hstate *h,
@@ -3334,10 +3413,16 @@ bool __init hugetlb_bootmem_page_zones_valid(int nid,
         return true;
     }
 
+    if (hugetlb_bootmem_page_earlycma(m)) {
+        valid = cma_validate_zones(m->cma);
+        goto out;
+    }
+
     start_pfn = virt_to_phys(m) >> PAGE_SHIFT;
 
     valid = !pfn_range_intersects_zones(nid, start_pfn,
             pages_per_huge_page(m->hstate));
+out:
     if (!valid)
         hstate_boot_nrinvalid[hstate_index(m->hstate)]++;
 
@@ -3366,11 +3451,6 @@ static void __init hugetlb_bootmem_free_invalid_page(int nid, struct page *page,
     }
 }
 
-static bool __init hugetlb_bootmem_page_prehvo(struct huge_bootmem_page *m)
-{
-    return (m->flags & HUGE_BOOTMEM_HVO);
-}
-
 /*
  * Put bootmem huge pages into the standard lists after mem_map is up.
  * Note: This only applies to gigantic (order > MAX_PAGE_ORDER) pages.
@@ -3420,14 +3500,21 @@ static void __init gather_bootmem_prealloc_node(unsigned long nid)
          */
         folio_set_hugetlb_vmemmap_optimized(folio);
 
+        if (hugetlb_bootmem_page_earlycma(m))
+            folio_set_hugetlb_cma(folio);
+
         list_add(&folio->lru, &folio_list);
 
         /*
         * We need to restore the 'stolen' pages to totalram_pages
         * in order to fix confusing memory reports from free(1) and
         * other side-effects, like CommitLimit going negative.
+        *
+        * For CMA pages, this is done in init_cma_pageblock
+        * (via hugetlb_bootmem_init_migratetype), so skip it here.
         */
-        adjust_managed_page_count(page, pages_per_huge_page(h));
+        if (!folio_test_hugetlb_cma(folio))
+            adjust_managed_page_count(page, pages_per_huge_page(h));
 
         cond_resched();
     }
@@ -3612,8 +3699,11 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h)
 {
     unsigned long allocated;
 
-    /* skip gigantic hugepages allocation if hugetlb_cma enabled */
-    if (hstate_is_gigantic(h) && hugetlb_cma_size) {
+    /*
+     * Skip gigantic hugepages allocation if early CMA
+     * reservations are not available.
+     */
+    if (hstate_is_gigantic(h) && hugetlb_cma_size && !hugetlb_early_cma(h)) {
         pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n");
         return;
     }

From patchwork Fri Feb 28 18:29:28 2025
Date: Fri, 28 Feb 2025 18:29:28 +0000
In-Reply-To: <20250228182928.2645936-1-fvdl@google.com>
References: <20250228182928.2645936-1-fvdl@google.com>
Message-ID: <20250228182928.2645936-28-fvdl@google.com>
Subject: [PATCH v5 27/27] mm/hugetlb: move hugetlb CMA code in to its own file
From: Frank van der Linden
To: akpm@linux-foundation.org, muchun.song@linux.dev, linux-mm@kvack.org,
    linux-kernel@vger.kernel.org
Cc: yuzhao@google.com, usamaarif642@gmail.com, joao.m.martins@oracle.com,
    roman.gushchin@linux.dev, ziy@nvidia.com, david@redhat.com,
    Frank van der Linden
hugetlb.c contained a number of CONFIG_CMA ifdefs, and the code inside
them was large enough to merit being in its own file, so move it there,
cleaning things up a bit.

Hide some direct variable access behind functions to accommodate the
move.

No functional change intended.

Signed-off-by: Frank van der Linden
---
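The "hide direct variable access behind functions" part follows a common
pattern: the variable becomes private to the new file, callers go
through an accessor such as hugetlb_cma_total_size(), and a static
inline stub takes its place when the feature is compiled out, so callers
need no ifdefs of their own. A single-file sketch of that pattern
(CONFIG_CMA_EXAMPLE is a made-up stand-in for CONFIG_CMA, and the
header/source split is collapsed for brevity):

#include <stdio.h>

#ifdef CONFIG_CMA_EXAMPLE
/* would live in hugetlb_cma.c */
static unsigned long hugetlb_cma_size = 4UL << 20;

static unsigned long hugetlb_cma_total_size(void)
{
    return hugetlb_cma_size;
}
#else
/* would live in hugetlb_cma.h as the compiled-out stub */
static inline unsigned long hugetlb_cma_total_size(void)
{
    return 0;
}
#endif

int main(void)
{
    /* callers never need their own #ifdef */
    if (hugetlb_cma_total_size())
        printf("CMA reserved: %lu bytes\n", hugetlb_cma_total_size());
    else
        printf("no CMA reservation\n");
    return 0;
}

Compile with -DCONFIG_CMA_EXAMPLE to exercise the "feature on" branch;
either way, main() stays unchanged, which is the point of the move.
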
 MAINTAINERS      |   2 +
 mm/Makefile      |   3 +
 mm/hugetlb.c     | 269 +++------------------------------------------
 mm/hugetlb_cma.c | 275 +++++++++++++++++++++++++++++++++++++++++++++++
 mm/hugetlb_cma.h |  57 ++++++++++
 5 files changed, 354 insertions(+), 252 deletions(-)
 create mode 100644 mm/hugetlb_cma.c
 create mode 100644 mm/hugetlb_cma.h

diff --git a/MAINTAINERS b/MAINTAINERS
index 8e0736dc2ee0..7d083b653b69 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -10710,6 +10710,8 @@ F:	fs/hugetlbfs/
 F:	include/linux/hugetlb.h
 F:	include/trace/events/hugetlbfs.h
 F:	mm/hugetlb.c
+F:	mm/hugetlb_cma.c
+F:	mm/hugetlb_cma.h
 F:	mm/hugetlb_vmemmap.c
 F:	mm/hugetlb_vmemmap.h
 F:	tools/testing/selftests/cgroup/test_hugetlb_memcg.c
diff --git a/mm/Makefile b/mm/Makefile
index 850386a67b3e..810ccd45d270 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -79,6 +79,9 @@ obj-$(CONFIG_SWAP) += page_io.o swap_state.o swapfile.o swap_slots.o
 obj-$(CONFIG_ZSWAP) += zswap.o
 obj-$(CONFIG_HAS_DMA) += dmapool.o
 obj-$(CONFIG_HUGETLBFS) += hugetlb.o
+ifdef CONFIG_CMA
+obj-$(CONFIG_HUGETLBFS) += hugetlb_cma.o
+endif
 obj-$(CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP) += hugetlb_vmemmap.o
 obj-$(CONFIG_NUMA) += mempolicy.o
 obj-$(CONFIG_SPARSEMEM) += sparse.o
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 664ccaaa717a..3ee98f612137 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -49,19 +49,13 @@
 #include
 #include "internal.h"
 #include "hugetlb_vmemmap.h"
+#include "hugetlb_cma.h"
 #include
 
 int hugetlb_max_hstate __read_mostly;
 unsigned int default_hstate_idx;
 struct hstate hstates[HUGE_MAX_HSTATE];
 
-#ifdef CONFIG_CMA
-static struct cma *hugetlb_cma[MAX_NUMNODES];
-static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
-#endif
-static bool hugetlb_cma_only;
-static unsigned long hugetlb_cma_size __initdata;
-
 __initdata struct list_head huge_boot_pages[MAX_NUMNODES];
 static unsigned long hstate_boot_nrinvalid[HUGE_MAX_HSTATE] __initdata;
@@ -128,14 +122,11 @@ static struct resv_map *vma_resv_map(struct vm_area_struct *vma);
 
 static void hugetlb_free_folio(struct folio *folio)
 {
-#ifdef CONFIG_CMA
-    int nid = folio_nid(folio);
-
     if (folio_test_hugetlb_cma(folio)) {
-        WARN_ON_ONCE(!cma_free_folio(hugetlb_cma[nid], folio));
+        hugetlb_cma_free_folio(folio);
         return;
     }
-#endif
+
     folio_put(folio);
 }
@@ -1492,31 +1483,9 @@ static struct folio *alloc_gigantic_folio(struct hstate *h, gfp_t gfp_mask,
     if (nid == NUMA_NO_NODE)
         nid = numa_mem_id();
 retry:
-    folio = NULL;
-#ifdef CONFIG_CMA
-    {
-        int node;
-
-        if (hugetlb_cma[nid])
-            folio = cma_alloc_folio(hugetlb_cma[nid], order, gfp_mask);
-
-        if (!folio && !(gfp_mask & __GFP_THISNODE)) {
-            for_each_node_mask(node, *nodemask) {
-                if (node == nid || !hugetlb_cma[node])
-                    continue;
-
-                folio = cma_alloc_folio(hugetlb_cma[node], order, gfp_mask);
-                if (folio)
-                    break;
-            }
-        }
-
-        if (folio)
-            folio_set_hugetlb_cma(folio);
-    }
-#endif
+    folio = hugetlb_cma_alloc_folio(h, gfp_mask, nid, nodemask);
     if (!folio) {
-        if (hugetlb_cma_only)
+        if (hugetlb_cma_exclusive_alloc())
             return NULL;
 
         folio = folio_alloc_gigantic(order, gfp_mask, nid, nodemask);
@@ -3179,47 +3148,14 @@ struct folio *alloc_hugetlb_folio(struct vm_area_struct *vma,
     return ERR_PTR(-ENOSPC);
 }
 
-static bool __init hugetlb_early_cma(struct hstate *h)
-{
-    if (arch_has_huge_bootmem_alloc())
-        return false;
-
-    return (hstate_is_gigantic(h) && hugetlb_cma_only);
-}
-
 static __init void *alloc_bootmem(struct hstate *h, int nid, bool node_exact)
 {
     struct huge_bootmem_page *m;
-    unsigned long flags;
cma *cma; int listnode = nid; -#ifdef CONFIG_CMA - if (hugetlb_early_cma(h)) { - flags = HUGE_BOOTMEM_CMA; - cma = hugetlb_cma[nid]; - m = cma_reserve_early(cma, huge_page_size(h)); - if (!m) { - int node; - - if (node_exact) - return NULL; - for_each_online_node(node) { - cma = hugetlb_cma[node]; - if (!cma || node == nid) - continue; - m = cma_reserve_early(cma, huge_page_size(h)); - if (m) { - listnode = node; - break; - } - } - } - } else -#endif - { - flags = 0; - cma = NULL; + if (hugetlb_early_cma(h)) + m = hugetlb_cma_alloc_bootmem(h, &listnode, node_exact); + else { if (node_exact) m = memblock_alloc_exact_nid_raw(huge_page_size(h), huge_page_size(h), 0, @@ -3238,6 +3174,11 @@ static __init void *alloc_bootmem(struct hstate *h, int nid, bool node_exact) if (m) listnode = early_pfn_to_nid(PHYS_PFN(virt_to_phys(m))); } + + if (m) { + m->flags = 0; + m->cma = NULL; + } } if (m) { @@ -3252,8 +3193,6 @@ static __init void *alloc_bootmem(struct hstate *h, int nid, bool node_exact) INIT_LIST_HEAD(&m->list); list_add(&m->list, &huge_boot_pages[listnode]); m->hstate = h; - m->flags = flags; - m->cma = cma; } return m; @@ -3703,7 +3642,8 @@ static void __init hugetlb_hstate_alloc_pages(struct hstate *h) * Skip gigantic hugepages allocation if early CMA * reservations are not available. */ - if (hstate_is_gigantic(h) && hugetlb_cma_size && !hugetlb_early_cma(h)) { + if (hstate_is_gigantic(h) && hugetlb_cma_total_size() && + !hugetlb_early_cma(h)) { pr_warn_once("HugeTLB: hugetlb_cma is enabled, skip boot time allocation\n"); return; } @@ -3740,7 +3680,7 @@ static void __init hugetlb_init_hstates(void) */ if (hstate_is_gigantic(h) && !gigantic_page_runtime_supported()) continue; - if (hugetlb_cma_size && h->order <= HUGETLB_PAGE_ORDER) + if (hugetlb_cma_total_size() && h->order <= HUGETLB_PAGE_ORDER) continue; for_each_hstate(h2) { if (h2 == h) @@ -4642,14 +4582,6 @@ static void hugetlb_register_all_nodes(void) { } #endif -#ifdef CONFIG_CMA -static void __init hugetlb_cma_check(void); -#else -static inline __init void hugetlb_cma_check(void) -{ -} -#endif - static void __init hugetlb_sysfs_init(void) { struct hstate *h; @@ -4833,8 +4765,7 @@ static __init void hugetlb_parse_params(void) hcp->setup(hcp->val); } - if (!hugetlb_cma_size) - hugetlb_cma_only = false; + hugetlb_cma_validate_params(); } /* @@ -7904,169 +7835,3 @@ void hugetlb_unshare_all_pmds(struct vm_area_struct *vma) hugetlb_unshare_pmds(vma, ALIGN(vma->vm_start, PUD_SIZE), ALIGN_DOWN(vma->vm_end, PUD_SIZE)); } - -#ifdef CONFIG_CMA -static bool cma_reserve_called __initdata; - -static int __init cmdline_parse_hugetlb_cma(char *p) -{ - int nid, count = 0; - unsigned long tmp; - char *s = p; - - while (*s) { - if (sscanf(s, "%lu%n", &tmp, &count) != 1) - break; - - if (s[count] == ':') { - if (tmp >= MAX_NUMNODES) - break; - nid = array_index_nospec(tmp, MAX_NUMNODES); - - s += count + 1; - tmp = memparse(s, &s); - hugetlb_cma_size_in_node[nid] = tmp; - hugetlb_cma_size += tmp; - - /* - * Skip the separator if have one, otherwise - * break the parsing. 
-			 */
-			if (*s == ',')
-				s++;
-			else
-				break;
-		} else {
-			hugetlb_cma_size = memparse(p, &p);
-			break;
-		}
-	}
-
-	return 0;
-}
-
-early_param("hugetlb_cma", cmdline_parse_hugetlb_cma);
-
-static int __init cmdline_parse_hugetlb_cma_only(char *p)
-{
-	return kstrtobool(p, &hugetlb_cma_only);
-}
-
-early_param("hugetlb_cma_only", cmdline_parse_hugetlb_cma_only);
-
-void __init hugetlb_cma_reserve(int order)
-{
-	unsigned long size, reserved, per_node;
-	bool node_specific_cma_alloc = false;
-	int nid;
-
-	/*
-	 * HugeTLB CMA reservation is required for gigantic
-	 * huge pages which could not be allocated via the
-	 * page allocator. Just warn if there is any change
-	 * breaking this assumption.
-	 */
-	VM_WARN_ON(order <= MAX_PAGE_ORDER);
-	cma_reserve_called = true;
-
-	if (!hugetlb_cma_size)
-		return;
-
-	for (nid = 0; nid < MAX_NUMNODES; nid++) {
-		if (hugetlb_cma_size_in_node[nid] == 0)
-			continue;
-
-		if (!node_online(nid)) {
-			pr_warn("hugetlb_cma: invalid node %d specified\n", nid);
-			hugetlb_cma_size -= hugetlb_cma_size_in_node[nid];
-			hugetlb_cma_size_in_node[nid] = 0;
-			continue;
-		}
-
-		if (hugetlb_cma_size_in_node[nid] < (PAGE_SIZE << order)) {
-			pr_warn("hugetlb_cma: cma area of node %d should be at least %lu MiB\n",
-				nid, (PAGE_SIZE << order) / SZ_1M);
-			hugetlb_cma_size -= hugetlb_cma_size_in_node[nid];
-			hugetlb_cma_size_in_node[nid] = 0;
-		} else {
-			node_specific_cma_alloc = true;
-		}
-	}
-
-	/* Validate the CMA size again in case some invalid nodes specified. */
-	if (!hugetlb_cma_size)
-		return;
-
-	if (hugetlb_cma_size < (PAGE_SIZE << order)) {
-		pr_warn("hugetlb_cma: cma area should be at least %lu MiB\n",
-			(PAGE_SIZE << order) / SZ_1M);
-		hugetlb_cma_size = 0;
-		return;
-	}
-
-	if (!node_specific_cma_alloc) {
-		/*
-		 * If 3 GB area is requested on a machine with 4 numa nodes,
-		 * let's allocate 1 GB on first three nodes and ignore the last one.
-		 */
-		per_node = DIV_ROUND_UP(hugetlb_cma_size, nr_online_nodes);
-		pr_info("hugetlb_cma: reserve %lu MiB, up to %lu MiB per node\n",
-			hugetlb_cma_size / SZ_1M, per_node / SZ_1M);
-	}
-
-	reserved = 0;
-	for_each_online_node(nid) {
-		int res;
-		char name[CMA_MAX_NAME];
-
-		if (node_specific_cma_alloc) {
-			if (hugetlb_cma_size_in_node[nid] == 0)
-				continue;
-
-			size = hugetlb_cma_size_in_node[nid];
-		} else {
-			size = min(per_node, hugetlb_cma_size - reserved);
-		}
-
-		size = round_up(size, PAGE_SIZE << order);
-
-		snprintf(name, sizeof(name), "hugetlb%d", nid);
-		/*
-		 * Note that 'order per bit' is based on smallest size that
-		 * may be returned to CMA allocator in the case of
-		 * huge page demotion.
-		 */
-		res = cma_declare_contiguous_multi(size, PAGE_SIZE << order,
-					HUGETLB_PAGE_ORDER, name,
-					&hugetlb_cma[nid], nid);
-		if (res) {
-			pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
-				res, nid);
-			continue;
-		}
-
-		reserved += size;
-		pr_info("hugetlb_cma: reserved %lu MiB on node %d\n",
-			size / SZ_1M, nid);
-
-		if (reserved >= hugetlb_cma_size)
-			break;
-	}
-
-	if (!reserved)
-		/*
-		 * hugetlb_cma_size is used to determine if allocations from
-		 * cma are possible. Set to zero if no cma regions are set up.
-		 */
-		hugetlb_cma_size = 0;
-}
-
-static void __init hugetlb_cma_check(void)
-{
-	if (!hugetlb_cma_size || cma_reserve_called)
-		return;
-
-	pr_warn("hugetlb_cma: the option isn't supported by current arch\n");
-}
-
-#endif /* CONFIG_CMA */
diff --git a/mm/hugetlb_cma.c b/mm/hugetlb_cma.c
new file mode 100644
index 000000000000..e0f2d5c3a84c
--- /dev/null
+++ b/mm/hugetlb_cma.c
@@ -0,0 +1,275 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include
+#include
+#include
+#include
+
+#include
+#include
+
+#include
+#include "internal.h"
+#include "hugetlb_cma.h"
+
+
+static struct cma *hugetlb_cma[MAX_NUMNODES];
+static unsigned long hugetlb_cma_size_in_node[MAX_NUMNODES] __initdata;
+static bool hugetlb_cma_only;
+static unsigned long hugetlb_cma_size __initdata;
+
+void hugetlb_cma_free_folio(struct folio *folio)
+{
+	int nid = folio_nid(folio);
+
+	WARN_ON_ONCE(!cma_free_folio(hugetlb_cma[nid], folio));
+}
+
+
+struct folio *hugetlb_cma_alloc_folio(struct hstate *h, gfp_t gfp_mask,
+				      int nid, nodemask_t *nodemask)
+{
+	int node;
+	int order = huge_page_order(h);
+	struct folio *folio = NULL;
+
+	if (hugetlb_cma[nid])
+		folio = cma_alloc_folio(hugetlb_cma[nid], order, gfp_mask);
+
+	if (!folio && !(gfp_mask & __GFP_THISNODE)) {
+		for_each_node_mask(node, *nodemask) {
+			if (node == nid || !hugetlb_cma[node])
+				continue;
+
+			folio = cma_alloc_folio(hugetlb_cma[node], order, gfp_mask);
+			if (folio)
+				break;
+		}
+	}
+
+	if (folio)
+		folio_set_hugetlb_cma(folio);
+
+	return folio;
+}
+
+struct huge_bootmem_page * __init
+hugetlb_cma_alloc_bootmem(struct hstate *h, int *nid, bool node_exact)
+{
+	struct cma *cma;
+	struct huge_bootmem_page *m;
+	int node = *nid;
+
+	cma = hugetlb_cma[*nid];
+	m = cma_reserve_early(cma, huge_page_size(h));
+	if (!m) {
+		if (node_exact)
+			return NULL;
+
+		for_each_online_node(node) {
+			cma = hugetlb_cma[node];
+			if (!cma || node == *nid)
+				continue;
+			m = cma_reserve_early(cma, huge_page_size(h));
+			if (m) {
+				*nid = node;
+				break;
+			}
+		}
+	}
+
+	if (m) {
+		m->flags = HUGE_BOOTMEM_CMA;
+		m->cma = cma;
+	}
+
+	return m;
+}
+
+
+static bool cma_reserve_called __initdata;
+
+static int __init cmdline_parse_hugetlb_cma(char *p)
+{
+	int nid, count = 0;
+	unsigned long tmp;
+	char *s = p;
+
+	while (*s) {
+		if (sscanf(s, "%lu%n", &tmp, &count) != 1)
+			break;
+
+		if (s[count] == ':') {
+			if (tmp >= MAX_NUMNODES)
+				break;
+			nid = array_index_nospec(tmp, MAX_NUMNODES);
+
+			s += count + 1;
+			tmp = memparse(s, &s);
+			hugetlb_cma_size_in_node[nid] = tmp;
+			hugetlb_cma_size += tmp;
+
+			/*
+			 * Skip the separator if have one, otherwise
+			 * break the parsing.
+			 */
+			if (*s == ',')
+				s++;
+			else
+				break;
+		} else {
+			hugetlb_cma_size = memparse(p, &p);
+			break;
+		}
+	}
+
+	return 0;
+}
+
+early_param("hugetlb_cma", cmdline_parse_hugetlb_cma);
+
+static int __init cmdline_parse_hugetlb_cma_only(char *p)
+{
+	return kstrtobool(p, &hugetlb_cma_only);
+}
+
+early_param("hugetlb_cma_only", cmdline_parse_hugetlb_cma_only);
+
+void __init hugetlb_cma_reserve(int order)
+{
+	unsigned long size, reserved, per_node;
+	bool node_specific_cma_alloc = false;
+	int nid;
+
+	/*
+	 * HugeTLB CMA reservation is required for gigantic
+	 * huge pages which could not be allocated via the
+	 * page allocator. Just warn if there is any change
+	 * breaking this assumption.
+	 */
+	VM_WARN_ON(order <= MAX_PAGE_ORDER);
+	cma_reserve_called = true;
+
+	if (!hugetlb_cma_size)
+		return;
+
+	for (nid = 0; nid < MAX_NUMNODES; nid++) {
+		if (hugetlb_cma_size_in_node[nid] == 0)
+			continue;
+
+		if (!node_online(nid)) {
+			pr_warn("hugetlb_cma: invalid node %d specified\n", nid);
+			hugetlb_cma_size -= hugetlb_cma_size_in_node[nid];
+			hugetlb_cma_size_in_node[nid] = 0;
+			continue;
+		}
+
+		if (hugetlb_cma_size_in_node[nid] < (PAGE_SIZE << order)) {
+			pr_warn("hugetlb_cma: cma area of node %d should be at least %lu MiB\n",
+				nid, (PAGE_SIZE << order) / SZ_1M);
+			hugetlb_cma_size -= hugetlb_cma_size_in_node[nid];
+			hugetlb_cma_size_in_node[nid] = 0;
+		} else {
+			node_specific_cma_alloc = true;
+		}
+	}
+
+	/* Validate the CMA size again in case some invalid nodes specified. */
+	if (!hugetlb_cma_size)
+		return;
+
+	if (hugetlb_cma_size < (PAGE_SIZE << order)) {
+		pr_warn("hugetlb_cma: cma area should be at least %lu MiB\n",
+			(PAGE_SIZE << order) / SZ_1M);
+		hugetlb_cma_size = 0;
+		return;
+	}
+
+	if (!node_specific_cma_alloc) {
+		/*
+		 * If 3 GB area is requested on a machine with 4 numa nodes,
+		 * let's allocate 1 GB on first three nodes and ignore the last one.
+		 */
+		per_node = DIV_ROUND_UP(hugetlb_cma_size, nr_online_nodes);
+		pr_info("hugetlb_cma: reserve %lu MiB, up to %lu MiB per node\n",
+			hugetlb_cma_size / SZ_1M, per_node / SZ_1M);
+	}
+
+	reserved = 0;
+	for_each_online_node(nid) {
+		int res;
+		char name[CMA_MAX_NAME];
+
+		if (node_specific_cma_alloc) {
+			if (hugetlb_cma_size_in_node[nid] == 0)
+				continue;
+
+			size = hugetlb_cma_size_in_node[nid];
+		} else {
+			size = min(per_node, hugetlb_cma_size - reserved);
+		}
+
+		size = round_up(size, PAGE_SIZE << order);
+
+		snprintf(name, sizeof(name), "hugetlb%d", nid);
+		/*
+		 * Note that 'order per bit' is based on smallest size that
+		 * may be returned to CMA allocator in the case of
+		 * huge page demotion.
+		 */
+		res = cma_declare_contiguous_multi(size, PAGE_SIZE << order,
+					HUGETLB_PAGE_ORDER, name,
+					&hugetlb_cma[nid], nid);
+		if (res) {
+			pr_warn("hugetlb_cma: reservation failed: err %d, node %d",
+				res, nid);
+			continue;
+		}
+
+		reserved += size;
+		pr_info("hugetlb_cma: reserved %lu MiB on node %d\n",
+			size / SZ_1M, nid);
+
+		if (reserved >= hugetlb_cma_size)
+			break;
+	}
+
+	if (!reserved)
+		/*
+		 * hugetlb_cma_size is used to determine if allocations from
+		 * cma are possible. Set to zero if no cma regions are set up.
+		 */
+		hugetlb_cma_size = 0;
+}
+
+void __init hugetlb_cma_check(void)
+{
+	if (!hugetlb_cma_size || cma_reserve_called)
+		return;
+
+	pr_warn("hugetlb_cma: the option isn't supported by current arch\n");
+}
+
+bool hugetlb_cma_exclusive_alloc(void)
+{
+	return hugetlb_cma_only;
+}
+
+unsigned long __init hugetlb_cma_total_size(void)
+{
+	return hugetlb_cma_size;
+}
+
+void __init hugetlb_cma_validate_params(void)
+{
+	if (!hugetlb_cma_size)
+		hugetlb_cma_only = false;
+}
+
+bool __init hugetlb_early_cma(struct hstate *h)
+{
+	if (arch_has_huge_bootmem_alloc())
+		return false;
+
+	return hstate_is_gigantic(h) && hugetlb_cma_only;
+}
diff --git a/mm/hugetlb_cma.h b/mm/hugetlb_cma.h
new file mode 100644
index 000000000000..f7d7fb9880a2
--- /dev/null
+++ b/mm/hugetlb_cma.h
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_HUGETLB_CMA_H
+#define _LINUX_HUGETLB_CMA_H
+
+#ifdef CONFIG_CMA
+void hugetlb_cma_free_folio(struct folio *folio);
+struct folio *hugetlb_cma_alloc_folio(struct hstate *h, gfp_t gfp_mask,
+				      int nid, nodemask_t *nodemask);
+struct huge_bootmem_page *hugetlb_cma_alloc_bootmem(struct hstate *h, int *nid,
+						    bool node_exact);
+void hugetlb_cma_check(void);
+bool hugetlb_cma_exclusive_alloc(void);
+unsigned long hugetlb_cma_total_size(void);
+void hugetlb_cma_validate_params(void);
+bool hugetlb_early_cma(struct hstate *h);
+#else
+static inline void hugetlb_cma_free_folio(struct folio *folio)
+{
+}
+
+static inline struct folio *hugetlb_cma_alloc_folio(struct hstate *h,
+		gfp_t gfp_mask, int nid, nodemask_t *nodemask)
+{
+	return NULL;
+}
+
+static inline
+struct huge_bootmem_page *hugetlb_cma_alloc_bootmem(struct hstate *h, int *nid,
+						    bool node_exact)
+{
+	return NULL;
+}
+
+static inline void hugetlb_cma_check(void)
+{
+}
+
+static inline bool hugetlb_cma_exclusive_alloc(void)
+{
+	return false;
+}
+
+static inline unsigned long hugetlb_cma_total_size(void)
+{
+	return 0;
+}
+
+static inline void hugetlb_cma_validate_params(void)
+{
+}
+
+static inline bool hugetlb_early_cma(struct hstate *h)
+{
+	return false;
+}
+#endif
+#endif
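
For reference, an example of the boot parameters that cmdline_parse_hugetlb_cma()
and cmdline_parse_hugetlb_cma_only() above accept, on a hypothetical two-node
machine (illustrative values, not taken from the patch):

	hugetlb_cma=4G		# one total size, spread across online nodes
	hugetlb_cma=0:2G,1:2G	# per-node "node:size" entries, comma-separated
	hugetlb_cma_only=on	# gigantic folios then come only from the CMA areas

With hugetlb_cma_only=on but no hugetlb_cma= area actually set up,
hugetlb_cma_validate_params() clears the flag again, so the option only takes
effect when a CMA size was given.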