From patchwork Fri Apr 11 05:37:43 2025
X-Patchwork-Submitter: Changyuan Lyu
X-Patchwork-Id: 14047572
Date: Thu, 10 Apr 2025 22:37:43 -0700
Message-ID: <20250411053745.1817356-13-changyuanl@google.com>
In-Reply-To: <20250411053745.1817356-1-changyuanl@google.com>
References: <20250411053745.1817356-1-changyuanl@google.com>
Subject: [PATCH v6 12/14] memblock: add KHO support for reserve_mem
From: Changyuan Lyu
To: linux-kernel@vger.kernel.org
Cc: akpm@linux-foundation.org, anthony.yznaga@oracle.com, arnd@arndb.de,
    ashish.kalra@amd.com, benh@kernel.crashing.org, bp@alien8.de,
    catalin.marinas@arm.com, corbet@lwn.net, dave.hansen@linux.intel.com,
    devicetree@vger.kernel.org, dwmw2@infradead.org, ebiederm@xmission.com,
    graf@amazon.com, hpa@zytor.com, jgowans@amazon.com,
    kexec@lists.infradead.org, krzk@kernel.org,
    linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, luto@kernel.org, mark.rutland@arm.com,
    mingo@redhat.com, pasha.tatashin@soleen.com, pbonzini@redhat.com,
    peterz@infradead.org, ptyadav@amazon.de, robh@kernel.org,
    rostedt@goodmis.org, rppt@kernel.org, saravanak@google.com,
    skinsburskii@linux.microsoft.com, tglx@linutronix.de,
    thomas.lendacky@amd.com, will@kernel.org, x86@kernel.org,
    Changyuan Lyu

From: Alexander Graf

Linux has recently gained support for "reserve_mem": a mechanism to
allocate a region of memory early enough in boot that we can cross our
fingers and hope it stays at the same location during most boots, so we
can store for example ftrace buffers into it.

Thanks to KASLR, we can never be really sure that "reserve_mem"
allocations are static across kexec. Let's teach it KHO awareness so
that it serializes its reservations on kexec exit and deserializes them
again on boot, preserving the exact same mapping across kexec.

This is an example user for KHO in the KHO patch set to ensure we have
at least one (not very controversial) user in the tree before extending
KHO's use to more subsystems.
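As a concrete usage sketch (the size, alignment, and the names "trace"
and "boot_map" below are illustrative, borrowed from the ftrace
persistent-buffer use case, not mandated by this patch), a kernel booted
with:

```
reserve_mem=12M:4096:trace trace_instance=boot_map@trace
```

reserves a 12 MiB region named "trace" and maps a boot-time trace
instance onto it. With KHO enabled, this patch makes the "trace" region
reappear at the same physical address after kexec, so the buffer
contents survive the transition.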
Signed-off-by: Alexander Graf
Co-developed-by: Mike Rapoport (Microsoft)
Signed-off-by: Mike Rapoport (Microsoft)
Co-developed-by: Changyuan Lyu
Signed-off-by: Changyuan Lyu
---
 mm/memblock.c | 205 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 205 insertions(+)

diff --git a/mm/memblock.c b/mm/memblock.c
index 456689cb73e20..3571a859f2fe1 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -18,6 +18,11 @@
 #include
 #include

+#ifdef CONFIG_KEXEC_HANDOVER
+#include <linux/libfdt.h>
+#include <linux/kexec_handover.h>
+#endif /* CONFIG_KEXEC_HANDOVER */
+
 #include
 #include
@@ -2475,6 +2480,201 @@ int reserve_mem_release_by_name(const char *name)
 	return 1;
 }

+#ifdef CONFIG_KEXEC_HANDOVER
+#define MEMBLOCK_KHO_FDT "memblock"
+#define MEMBLOCK_KHO_NODE_COMPATIBLE "memblock-v1"
+#define RESERVE_MEM_KHO_NODE_COMPATIBLE "reserve-mem-v1"
+static struct page *kho_fdt;
+
+static int reserve_mem_kho_finalize(struct kho_serialization *ser)
+{
+	int err = 0, i;
+
+	if (!reserved_mem_count)
+		return NOTIFY_DONE;
+
+	if (IS_ERR(kho_fdt)) {
+		err = PTR_ERR(kho_fdt);
+		pr_err("memblock FDT was not prepared successfully: %d\n", err);
+		return notifier_from_errno(err);
+	}
+
+	for (i = 0; i < reserved_mem_count; i++) {
+		struct reserve_mem_table *map = &reserved_mem_table[i];
+
+		err |= kho_preserve_phys(ser, map->start, map->size);
+	}
+
+	err |= kho_preserve_folio(ser, page_folio(kho_fdt));
+	err |= kho_add_subtree(ser, MEMBLOCK_KHO_FDT, page_to_virt(kho_fdt));
+
+	return notifier_from_errno(err);
+}
+
+static int reserve_mem_kho_notifier(struct notifier_block *self,
+				    unsigned long cmd, void *v)
+{
+	switch (cmd) {
+	case KEXEC_KHO_FINALIZE:
+		return reserve_mem_kho_finalize((struct kho_serialization *)v);
+	case KEXEC_KHO_ABORT:
+		return NOTIFY_DONE;
+	default:
+		return NOTIFY_BAD;
+	}
+}
+
+static struct notifier_block reserve_mem_kho_nb = {
+	.notifier_call = reserve_mem_kho_notifier,
+};
+
+static void __init prepare_kho_fdt(void)
+{
+	int err = 0, i;
+	void *fdt;
+
+	if (!reserved_mem_count)
+		return;
+
+	kho_fdt = alloc_page(GFP_KERNEL);
+	if (!kho_fdt) {
+		kho_fdt = ERR_PTR(-ENOMEM);
+		return;
+	}
+
+	fdt = page_to_virt(kho_fdt);
+
+	err |= fdt_create(fdt, PAGE_SIZE);
+	err |= fdt_finish_reservemap(fdt);
+
+	err |= fdt_begin_node(fdt, "");
+	err |= fdt_property_string(fdt, "compatible", MEMBLOCK_KHO_NODE_COMPATIBLE);
+	for (i = 0; i < reserved_mem_count; i++) {
+		struct reserve_mem_table *map = &reserved_mem_table[i];
+
+		err |= fdt_begin_node(fdt, map->name);
+		err |= fdt_property_string(fdt, "compatible", RESERVE_MEM_KHO_NODE_COMPATIBLE);
+		err |= fdt_property(fdt, "start", &map->start, sizeof(map->start));
+		err |= fdt_property(fdt, "size", &map->size, sizeof(map->size));
+		err |= fdt_end_node(fdt);
+	}
+	err |= fdt_end_node(fdt);
+
+	err |= fdt_finish(fdt);
+
+	if (err) {
+		pr_err("failed to prepare memblock FDT for KHO: %d\n", err);
+		put_page(kho_fdt);
+		kho_fdt = ERR_PTR(-EINVAL);
+	}
+}
+
+static int __init reserve_mem_init(void)
+{
+	if (!kho_is_enabled())
+		return 0;
+
+	prepare_kho_fdt();
+
+	return register_kho_notifier(&reserve_mem_kho_nb);
+}
+late_initcall(reserve_mem_init);
+
+static void *kho_fdt_in __initdata;
+
+static void *__init reserve_mem_kho_retrieve_fdt(void)
+{
+	phys_addr_t fdt_phys;
+	struct folio *fdt_folio;
+	void *fdt;
+	int err;
+
+	err = kho_retrieve_subtree(MEMBLOCK_KHO_FDT, &fdt_phys);
+	if (err) {
+		if (err != -ENOENT)
+			pr_warn("failed to retrieve FDT '%s' from KHO: %d\n",
+				MEMBLOCK_KHO_FDT, err);
+		return ERR_PTR(err);
+	}
+
+	fdt_folio = kho_restore_folio(fdt_phys);
+	if (!fdt_folio) {
+		pr_warn("failed to restore memblock KHO FDT (0x%llx)\n", fdt_phys);
+		return ERR_PTR(-EFAULT);
+	}
+
+	fdt = page_to_virt(folio_page(fdt_folio, 0));
+
+	err = fdt_node_check_compatible(fdt, 0, MEMBLOCK_KHO_NODE_COMPATIBLE);
+	if (err) {
+		pr_warn("FDT '%s' is incompatible with '%s': %d\n",
+			MEMBLOCK_KHO_FDT, MEMBLOCK_KHO_NODE_COMPATIBLE, err);
+		return ERR_PTR(-EINVAL);
+	}
+
+	return fdt;
+}
+
+static bool __init reserve_mem_kho_revive(const char *name, phys_addr_t size,
+					  phys_addr_t align)
+{
+	int err, len_start, len_size, offset;
+	const phys_addr_t *p_start, *p_size;
+	const void *fdt;
+
+	if (!kho_fdt_in)
+		kho_fdt_in = reserve_mem_kho_retrieve_fdt();
+
+	if (IS_ERR(kho_fdt_in))
+		return false;
+
+	fdt = kho_fdt_in;
+
+	offset = fdt_subnode_offset(fdt, 0, name);
+	if (offset < 0) {
+		pr_warn("FDT '%s' has no child '%s': %d\n",
+			MEMBLOCK_KHO_FDT, name, offset);
+		return false;
+	}
+	err = fdt_node_check_compatible(fdt, offset, RESERVE_MEM_KHO_NODE_COMPATIBLE);
+	if (err) {
+		pr_warn("Node '%s' is incompatible with '%s': %d\n",
+			name, RESERVE_MEM_KHO_NODE_COMPATIBLE, err);
+		return false;
+	}
+
+	p_start = fdt_getprop(fdt, offset, "start", &len_start);
+	p_size = fdt_getprop(fdt, offset, "size", &len_size);
+	if (!p_start || len_start != sizeof(*p_start) || !p_size ||
+	    len_size != sizeof(*p_size)) {
+		return false;
+	}
+
+	if (*p_start & (align - 1)) {
+		pr_warn("KHO reserve-mem '%s' has wrong alignment (0x%lx, 0x%lx)\n",
+			name, (long)align, (long)*p_start);
+		return false;
+	}
+
+	if (*p_size != size) {
+		pr_warn("KHO reserve-mem '%s' has wrong size (0x%lx != 0x%lx)\n",
+			name, (long)*p_size, (long)size);
+		return false;
+	}
+
+	reserved_mem_add(*p_start, size, name);
+	pr_info("Revived memory reservation '%s' from KHO\n", name);
+
+	return true;
+}
+#else
+static bool __init reserve_mem_kho_revive(const char *name, phys_addr_t size,
+					  phys_addr_t align)
+{
+	return false;
+}
+#endif /* CONFIG_KEXEC_HANDOVER */
+
 /*
  * Parse reserve_mem=nn:align:name
  */
@@ -2530,6 +2730,11 @@ static int __init reserve_mem(char *p)
 	if (reserve_mem_find_by_name(name, &start, &tmp))
 		return -EBUSY;

+	/* Pick previous allocations up from KHO if available */
+	if (reserve_mem_kho_revive(name, size, align))
+		return 1;
+
+	/* TODO: Allocation must be outside of scratch region */
 	start = memblock_phys_alloc(size, align);
 	if (!start)
 		return -ENOMEM;
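
For reference, the "memblock" FDT subtree that prepare_kho_fdt() builds
has the shape sketched below in DTS form. The node name "trace" is a
hypothetical example (child nodes are named after each reserve_mem=
region), and "start"/"size" are stored by fdt_property() as raw
phys_addr_t byte blobs, not as devicetree cells:

```dts
/ {
	compatible = "memblock-v1";

	trace {	/* one child node per reserved region, named after it */
		compatible = "reserve-mem-v1";
		start = [ /* raw phys_addr_t: region base */ ];
		size = [ /* raw phys_addr_t: region length */ ];
	};
};
```

On the next boot, reserve_mem_kho_revive() looks up the child node
matching the reserve_mem= name, verifies its "reserve-mem-v1"
compatible string plus the recorded size and alignment, and re-adds the
reservation at the preserved start address instead of allocating a
fresh one.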