From patchwork Thu Apr 27 00:08:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225039
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
    hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
    peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
    ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
    jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
    fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 11/21] mm: PKRAM: reserve preserved memory at boot
Date: Wed, 26 Apr 2023 17:08:47 -0700
Message-Id: <1682554137-13938-12-git-send-email-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 1.9.4
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Keep preserved pages from being recycled during boot by adding them to the
memblock reserved list during early boot. If memory reservation fails (e.g.
a region has already been reserved), all preserved pages are dropped.

Signed-off-by: Anthony Yznaga
---
 arch/x86/kernel/setup.c |  3 ++
 arch/x86/mm/init_64.c   |  2 ++
 include/linux/pkram.h   |  8 +++++
 mm/pkram.c              | 84 ++++++++++++++++++++++++++++++++++++++++++++++---
 4 files changed, 92 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 16babff771bd..2806b21236d0 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1221,6 +1222,8 @@ void __init setup_arch(char **cmdline_p)
 	initmem_init();
 	dma_contiguous_reserve(max_pfn_mapped << PAGE_SHIFT);

+	pkram_reserve();
+
 	if (boot_cpu_has(X86_FEATURE_GBPAGES))
 		hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index a190aae8ceaf..a46ffb434f39 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -34,6 +34,7 @@
 #include
 #include
 #include
+#include

 #include
 #include
@@ -1339,6 +1340,7 @@ void __init mem_init(void)
 	after_bootmem = 1;
 	x86_init.hyper.init_after_bootmem();

+	totalram_pages_add(pkram_reserved_pages);
 	/*
 	 * Must be done after boot memory is put on freelist, because here we
 	 * might set fields in deferred struct pages that have not yet been
diff --git a/include/linux/pkram.h b/include/linux/pkram.h
index b614e9059bba..53d5a1ec42ff 100644
--- a/include/linux/pkram.h
+++ b/include/linux/pkram.h
@@ -99,4 +99,12 @@ int pkram_prepare_save(struct pkram_stream *ps, const char *name,
 ssize_t pkram_write(struct pkram_access *pa, const void *buf, size_t count);
 size_t pkram_read(struct pkram_access *pa, void *buf, size_t count);

+#ifdef CONFIG_PKRAM
+extern unsigned long pkram_reserved_pages;
+void pkram_reserve(void);
+#else
+#define pkram_reserved_pages 0UL
+static inline void pkram_reserve(void) { }
+#endif
+
 #endif /* _LINUX_PKRAM_H */
diff --git a/mm/pkram.c b/mm/pkram.c
index c649504fa1fa..b711f94dbef4 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -134,6 +134,8 @@ extern void pkram_find_preserved(unsigned long start, unsigned long end, void *p
 static LIST_HEAD(pkram_nodes);		/* linked through page::lru */
 static DEFINE_MUTEX(pkram_mutex);	/* serializes open/close */

+unsigned long __initdata pkram_reserved_pages;
+
 /*
  * The PKRAM super block pfn, see above.
  */
@@ -143,6 +145,59 @@ static int __init parse_pkram_sb_pfn(char *arg)
 }
 early_param("pkram", parse_pkram_sb_pfn);

+static void * __init pkram_map_meta(unsigned long pfn)
+{
+	if (pfn >= max_low_pfn)
+		return ERR_PTR(-EINVAL);
+	return pfn_to_kaddr(pfn);
+}
+
+int pkram_merge_with_reserved(void);
+/*
+ * Reserve pages that belong to preserved memory.
+ *
+ * This function should be called at boot time as early as possible to prevent
+ * preserved memory from being recycled.
+ */
+void __init pkram_reserve(void)
+{
+	int err = 0;
+
+	if (!pkram_sb_pfn)
+		return;
+
+	pr_info("PKRAM: Examining preserved memory...\n");
+
+	/* Verify that nothing else has reserved the pkram_sb page */
+	if (memblock_is_region_reserved(PFN_PHYS(pkram_sb_pfn), PAGE_SIZE)) {
+		err = -EBUSY;
+		goto out;
+	}
+
+	pkram_sb = pkram_map_meta(pkram_sb_pfn);
+	if (IS_ERR(pkram_sb)) {
+		err = PTR_ERR(pkram_sb);
+		goto out;
+	}
+	/* An empty pkram_sb is not an error */
+	if (!pkram_sb->node_pfn) {
+		pkram_sb = NULL;
+		goto done;
+	}
+
+	err = pkram_merge_with_reserved();
+out:
+	if (err) {
+		pr_err("PKRAM: Reservation failed: %d\n", err);
+		WARN_ON(pkram_reserved_pages > 0);
+		pkram_sb = NULL;
+		return;
+	}
+
+done:
+	pr_info("PKRAM: %lu pages reserved\n", pkram_reserved_pages);
+}
+
 static inline struct page *pkram_alloc_page(gfp_t gfp_mask)
 {
 	struct page *page;
@@ -162,6 +217,7 @@ static inline struct page *pkram_alloc_page(gfp_t gfp_mask)

 static inline void pkram_free_page(void *addr)
 {
+	__ClearPageReserved(virt_to_page(addr));
 	pkram_remove_identity_map(virt_to_page(addr));
 	free_page((unsigned long)addr);
 }
@@ -193,13 +249,23 @@ static void pkram_truncate_link(struct pkram_link *link)
 {
 	struct page *page;
 	pkram_entry_t p;
-	int i;
+	int i, j, order;

 	for (i = 0; i < PKRAM_LINK_ENTRIES_MAX; i++) {
 		p = link->entry[i];
 		if (!p)
 			continue;
+		order = p & PKRAM_ENTRY_ORDER_MASK;
+		if (order >= MAX_ORDER) {
+			pr_err("PKRAM: attempted truncate of invalid page\n");
+			return;
+		}
 		page = pfn_to_page(PHYS_PFN(p));
+		for (j = 0; j < (1 << order); j++) {
+			struct page *pg = page + j;
+
+			__ClearPageReserved(pg);
+		}
 		pkram_remove_identity_map(page);
 		put_page(page);
 	}
@@ -680,7 +746,7 @@ static int __pkram_bytes_save_page(struct pkram_access *pa, struct page *page)
 static struct page *__pkram_prep_load_page(pkram_entry_t p)
 {
 	struct page *page;
-	int order;
+	int i, order;
 	short flags;

 	flags = (p >> PKRAM_ENTRY_FLAGS_SHIFT) & PKRAM_ENTRY_FLAGS_MASK;
@@ -690,9 +756,16 @@ static struct page *__pkram_prep_load_page(pkram_entry_t p)

 	page = pfn_to_page(PHYS_PFN(p));

-	if (!page_ref_freeze(pg, 1)) {
-		pr_err("PKRAM preserved page has unexpected inflated ref count\n");
-		goto out_error;
+	for (i = 0; i < (1 << order); i++) {
+		struct page *pg = page + i;
+		int was_rsvd;
+
+		was_rsvd = PageReserved(pg);
+		__ClearPageReserved(pg);
+		if ((was_rsvd || i == 0) && !page_ref_freeze(pg, 1)) {
+			pr_err("PKRAM preserved page has unexpected inflated ref count\n");
+			goto out_error;
+		}
 	}

 	if (order) {
@@ -1331,6 +1404,7 @@ int __init pkram_create_merged_reserved(struct memblock_type *new)
 	}

 	WARN_ON(cnt_a + cnt_b != k);
+	pkram_reserved_pages = nr_preserved;
 	new->cnt = cnt_a + cnt_b;
 	new->total_size = total_size;
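
The changelog states that if reservation fails (e.g. a region has already
been reserved), all preserved pages are dropped. That all-or-nothing behavior
can be modeled in user space roughly as follows. This is only a sketch under
assumptions: struct region, struct reserved_list, reserve_preserved() and the
4 KiB SKETCH_PAGE_SIZE are hypothetical stand-ins, not the memblock API, and
the sketch does not claim to mirror pkram_merge_with_reserved(), whose body
is not part of this patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096UL	/* assumption: 4 KiB pages */
#define MAX_REGIONS 16		/* assumption: small fixed-size toy list */

/* Hypothetical stand-in for a reserved range: [base, base + size). */
struct region {
	unsigned long base;
	unsigned long size;
};

/* Toy model of a reserved list; not the kernel's struct memblock_type. */
struct reserved_list {
	struct region r[MAX_REGIONS];
	size_t cnt;
};

static bool overlaps(const struct region *a, const struct region *b)
{
	return a->base < b->base + b->size && b->base < a->base + a->size;
}

/*
 * Reserve every preserved range in p[0..n) or none of them.
 *
 * All ranges are validated before the list is modified, so a failure
 * leaves the list untouched and reports zero reserved pages.
 * Returns 0 on success, -ENOMEM if the list is too small, or -EBUSY if
 * a range is empty, overlaps an existing reservation, or overlaps
 * another preserved range.
 */
static int reserve_preserved(struct reserved_list *l, const struct region *p,
			     size_t n, unsigned long *nr_pages)
{
	size_t i, j;

	assert(l->cnt <= MAX_REGIONS);
	*nr_pages = 0;

	if (n > MAX_REGIONS - l->cnt)
		return -ENOMEM;

	/* Pass 1: validate without modifying anything. */
	for (i = 0; i < n; i++) {
		if (p[i].size == 0)
			return -EBUSY;
		for (j = 0; j < l->cnt; j++)
			if (overlaps(&p[i], &l->r[j]))
				return -EBUSY;
		for (j = 0; j < i; j++)
			if (overlaps(&p[i], &p[j]))
				return -EBUSY;
	}

	/* Pass 2: commit; this cannot fail after validation. */
	for (i = 0; i < n; i++) {
		l->r[l->cnt++] = p[i];
		*nr_pages += p[i].size / SKETCH_PAGE_SIZE;
	}
	return 0;
}
```

Validating first and committing afterwards is one simple way to get the
"drop everything on failure" property without an undo path; the page count
returned through nr_pages plays the role that pkram_reserved_pages plays in
the patch.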