From patchwork Thu Apr 27 00:08:37 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225033
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
 hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
 ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
 jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
 fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 01/21] mm: add PKRAM API stubs and Kconfig
Date: Wed, 26 Apr 2023 17:08:37 -0700
Message-Id: <1682554137-13938-2-git-send-email-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 1.9.4
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>

Preserved-across-kexec memory or PKRAM is a method for saving memory
pages of the currently executing kernel and restoring them after kexec
boot into a new one.
This can be used to preserve guest VM state, large in-memory databases,
process memory, etc. across a reboot. While DRAM-as-PMEM or actual
persistent memory could be used to accomplish these things, PKRAM
provides the latency of DRAM with the flexibility of dynamically
determining the amount of memory to preserve.

The proposed API:

 * Preserved memory is divided into nodes which can be saved or loaded
   independently of each other. The nodes are identified by unique name
   strings. A PKRAM node is created when a save is initiated by calling
   pkram_prepare_save(). A PKRAM node is removed when a load is
   initiated by calling pkram_prepare_load(). See the operation
   sequences below.

 * A node is further divided into objects. An object represents closely
   coupled data in the form of a grouping of folios and/or a stream of
   byte data, for example, the folios and attributes of a file. After
   initiating an operation on a PKRAM node, PKRAM objects are
   initialized for saving or loading by calling pkram_prepare_save_obj()
   or pkram_prepare_load_obj().

 * For saving data to or loading data from a PKRAM node/object,
   instances of the pkram_stream and pkram_access structs are used.
   pkram_stream tracks the node and object being operated on, while
   pkram_access tracks the data type and position within an object.

   The pkram_stream struct is initialized by calling
   pkram_prepare_save() or pkram_prepare_load() and then
   pkram_prepare_save_obj() or pkram_prepare_load_obj().

   Once a pkram_stream is fully initialized, a pkram_access struct is
   initialized for each data type associated with the object. After the
   save or load of a data type for the object is complete,
   pkram_finish_access() is called.

   After the save or load is complete for the object,
   pkram_finish_save_obj() or pkram_finish_load_obj() must be called,
   followed by pkram_finish_save() or pkram_finish_load() when the save
   or load is completed for the node.
   If an error occurs during a save, the saved data and the PKRAM node
   may be freed by calling pkram_discard_save() instead of
   pkram_finish_save().

 * Both folio data and byte data can separately be streamed to a PKRAM
   object. pkram_save_folio() and pkram_load_folio() are used to stream
   folio data while pkram_write() and pkram_read() are used to stream
   byte data.

A sequence of operations for saving data to or loading data from PKRAM
would look like:

  * For saving data to PKRAM:

    /* create a PKRAM node and do initial stream setup */
    pkram_prepare_save()

    /* create a PKRAM object associated with the PKRAM node and
     * complete stream initialization */
    pkram_prepare_save_obj()

    /* save data to the node/object */
    PKRAM_ACCESS(pa_folios,...)
    PKRAM_ACCESS(pa_bytes,...)
    pkram_save_folio(pa_folios,...)[,...]  /* for file folios */
    pkram_write(pa_bytes,...)[,...]        /* for a byte stream */

    pkram_finish_access(pa_folios)
    pkram_finish_access(pa_bytes)

    pkram_finish_save_obj()

    /* commit the save or discard and delete the node */
    pkram_finish_save()          /* on success, or
    pkram_discard_save()          * ... in case of error */

  * For loading data from PKRAM:

    /* remove a PKRAM node from the list and do initial stream setup */
    pkram_prepare_load()

    /* remove a PKRAM object from the node and complete stream
     * initialization for loading data from it */
    pkram_prepare_load_obj()

    /* load data from the node/object */
    PKRAM_ACCESS(pa_folios,...)
    PKRAM_ACCESS(pa_bytes,...)
    pkram_load_folio(pa_folios,...)[,...]  /* for file folios */
    pkram_read(pa_bytes,...)[,...]         /* for a byte stream */

    pkram_finish_access(pa_folios)
    pkram_finish_access(pa_bytes)

    /* free the object */
    pkram_finish_load_obj()

    /* free the node */
    pkram_finish_load()

Originally-by: Vladimir Davydov
Signed-off-by: Anthony Yznaga
---
 include/linux/pkram.h |  47 +++++++++++++
 mm/Kconfig            |   9 +++
 mm/Makefile           |   1 +
 mm/pkram.c            | 179 ++++++++++++++++++++++++++++++++++++++++++++++++++
 4 files changed, 236 insertions(+)
 create mode 100644 include/linux/pkram.h
 create mode 100644 mm/pkram.c

diff --git a/include/linux/pkram.h b/include/linux/pkram.h
new file mode 100644
index 000000000000..57b8db4229a4
--- /dev/null
+++ b/include/linux/pkram.h
@@ -0,0 +1,47 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_PKRAM_H
+#define _LINUX_PKRAM_H
+
+#include
+#include
+#include
+
+/**
+ * enum pkram_data_flags - definition of data types contained in a pkram obj
+ * @PKRAM_DATA_none: No data types configured
+ */
+enum pkram_data_flags {
+	PKRAM_DATA_none = 0x0,	/* No data types configured */
+};
+
+struct pkram_stream;
+struct pkram_access;
+
+#define PKRAM_NAME_MAX	256	/* including nul */
+
+int pkram_prepare_save(struct pkram_stream *ps, const char *name,
+		       gfp_t gfp_mask);
+int pkram_prepare_save_obj(struct pkram_stream *ps, enum pkram_data_flags flags);
+
+void pkram_finish_save(struct pkram_stream *ps);
+void pkram_finish_save_obj(struct pkram_stream *ps);
+void pkram_discard_save(struct pkram_stream *ps);
+
+int pkram_prepare_load(struct pkram_stream *ps, const char *name);
+int pkram_prepare_load_obj(struct pkram_stream *ps);
+
+void pkram_finish_load(struct pkram_stream *ps);
+void pkram_finish_load_obj(struct pkram_stream *ps);
+
+#define PKRAM_ACCESS(name, stream, type)			\
+	struct pkram_access name
+
+void pkram_finish_access(struct pkram_access *pa, bool status_ok);
+
+int pkram_save_folio(struct pkram_access *pa, struct folio *folio);
+struct folio *pkram_load_folio(struct pkram_access *pa, unsigned long *index);
+
+ssize_t pkram_write(struct pkram_access *pa, const void *buf, size_t count);
+size_t pkram_read(struct pkram_access *pa, void *buf, size_t count);
+
+#endif /* _LINUX_PKRAM_H */
diff --git a/mm/Kconfig b/mm/Kconfig
index 4751031f3f05..10f089f4a181 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1202,6 +1202,15 @@ config LRU_GEN_STATS
 	  This option has a per-memcg and per-node memory overhead.
 # }
 
+config PKRAM
+	bool "Preserved-over-kexec memory storage"
+	default n
+	help
+	  This option adds the kernel API that enables saving memory pages of
+	  the currently executing kernel and restoring them after a kexec in
+	  the newly booted one. This can be utilized for speeding up reboot by
+	  leaving process memory and/or FS caches in-place.
+
 source "mm/damon/Kconfig"
 
 endmenu
diff --git a/mm/Makefile b/mm/Makefile
index 8e105e5b3e29..7a8d5a286d48 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -138,3 +138,4 @@ obj-$(CONFIG_IO_MAPPING) += io-mapping.o
 obj-$(CONFIG_HAVE_BOOTMEM_INFO_NODE) += bootmem_info.o
 obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
 obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
+obj-$(CONFIG_PKRAM) += pkram.o
diff --git a/mm/pkram.c b/mm/pkram.c
new file mode 100644
index 000000000000..421de8211e05
--- /dev/null
+++ b/mm/pkram.c
@@ -0,0 +1,179 @@
+// SPDX-License-Identifier: GPL-2.0
+#include
+#include
+#include
+#include
+#include
+#include
+
+/**
+ * Create a preserved memory node with name @name and initialize stream @ps
+ * for saving data to it.
+ *
+ * @gfp_mask specifies the memory allocation mask to be used when saving data.
+ *
+ * Returns 0 on success, -errno on failure.
+ *
+ * After the save has finished, pkram_finish_save() (or pkram_discard_save() in
+ * case of failure) is to be called.
+ */
+int pkram_prepare_save(struct pkram_stream *ps, const char *name, gfp_t gfp_mask)
+{
+	return -EINVAL;
+}
+
+/**
+ * Create a preserved memory object and initialize stream @ps for saving data
+ * to it.
+ *
+ * Returns 0 on success, -errno on failure.
+ *
+ * After the save has finished, pkram_finish_save_obj() (or pkram_discard_save()
+ * in case of failure) is to be called.
+ */
+int pkram_prepare_save_obj(struct pkram_stream *ps, enum pkram_data_flags flags)
+{
+	return -EINVAL;
+}
+
+/**
+ * Commit the object started with pkram_prepare_save_obj() to preserved memory.
+ */
+void pkram_finish_save_obj(struct pkram_stream *ps)
+{
+	WARN_ON_ONCE(1);
+}
+
+/**
+ * Commit the save to preserved memory started with pkram_prepare_save().
+ * After the call, the stream may not be used any more.
+ */
+void pkram_finish_save(struct pkram_stream *ps)
+{
+	WARN_ON_ONCE(1);
+}
+
+/**
+ * Cancel the save to preserved memory started with pkram_prepare_save() and
+ * destroy the corresponding preserved memory node freeing any data already
+ * saved to it.
+ */
+void pkram_discard_save(struct pkram_stream *ps)
+{
+	WARN_ON_ONCE(1);
+}
+
+/**
+ * Remove the preserved memory node with name @name and initialize stream @ps
+ * for loading data from it.
+ *
+ * Returns 0 on success, -errno on failure.
+ *
+ * After the load has finished, pkram_finish_load() is to be called.
+ */
+int pkram_prepare_load(struct pkram_stream *ps, const char *name)
+{
+	return -EINVAL;
+}
+
+/**
+ * Remove the next preserved memory object from the stream @ps and
+ * initialize stream @ps for loading data from it.
+ *
+ * Returns 0 on success, -errno on failure.
+ *
+ * After the load has finished, pkram_finish_load_obj() is to be called.
+ */
+int pkram_prepare_load_obj(struct pkram_stream *ps)
+{
+	return -EINVAL;
+}
+
+/**
+ * Finish the load of a preserved memory object started with
+ * pkram_prepare_load_obj() freeing the object and any data that has not
+ * been loaded from it.
+ */
+void pkram_finish_load_obj(struct pkram_stream *ps)
+{
+	WARN_ON_ONCE(1);
+}
+
+/**
+ * Finish the load from preserved memory started with pkram_prepare_load()
+ * freeing the corresponding preserved memory node and any data that has
+ * not been loaded from it.
+ */
+void pkram_finish_load(struct pkram_stream *ps)
+{
+	WARN_ON_ONCE(1);
+}
+
+/**
+ * Finish the data access to or from the preserved memory node and object
+ * associated with pkram stream access @pa. The access must have been
+ * initialized with PKRAM_ACCESS().
+ */
+void pkram_finish_access(struct pkram_access *pa, bool status_ok)
+{
+	WARN_ON_ONCE(1);
+}
+
+/**
+ * Save folio @folio to the preserved memory node and object associated
+ * with pkram stream access @pa. The stream must have been initialized with
+ * pkram_prepare_save() and pkram_prepare_save_obj() and access initialized
+ * with PKRAM_ACCESS().
+ *
+ * Returns 0 on success, -errno on failure.
+ */
+int pkram_save_folio(struct pkram_access *pa, struct folio *folio)
+{
+	return -EINVAL;
+}
+
+/**
+ * Load the next folio from the preserved memory node and object associated
+ * with pkram stream access @pa. The stream must have been initialized with
+ * pkram_prepare_load() and pkram_prepare_load_obj() and access initialized
+ * with PKRAM_ACCESS().
+ *
+ * If not NULL, @index is initialized with the preserved mapping offset of the
+ * folio loaded.
+ *
+ * Returns the folio loaded or NULL if the node is empty.
+ *
+ * The folio loaded has its refcount incremented.
+ */
+struct folio *pkram_load_folio(struct pkram_access *pa, unsigned long *index)
+{
+	return NULL;
+}
+
+/**
+ * Copy @count bytes from @buf to the preserved memory node and object
+ * associated with pkram stream access @pa. The stream must have been
+ * initialized with pkram_prepare_save() and pkram_prepare_save_obj()
+ * and access initialized with PKRAM_ACCESS().
+ *
+ * On success, returns the number of bytes written, which is always equal to
+ * @count. On failure, -errno is returned.
+ */
+ssize_t pkram_write(struct pkram_access *pa, const void *buf, size_t count)
+{
+	return -EINVAL;
+}
+
+/**
+ * Copy up to @count bytes from the preserved memory node and object
+ * associated with pkram stream access @pa to @buf.
+ * The stream must have been initialized with pkram_prepare_load() and
+ * pkram_prepare_load_obj() and access initialized with PKRAM_ACCESS().
+ *
+ * Returns the number of bytes read, which may be less than @count if the node
+ * has fewer bytes available.
+ */
+size_t pkram_read(struct pkram_access *pa, void *buf, size_t count)
+{
+	return 0;
+}

From patchwork Thu Apr 27 00:08:38 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225029
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
 hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
 ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
 jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
 fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 02/21] mm: PKRAM: implement node load and save functions
Date: Wed, 26 Apr 2023 17:08:38 -0700
Message-Id: <1682554137-13938-3-git-send-email-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 1.9.4
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>

Preserved memory is divided into nodes which can be saved and loaded
independently of each other. PKRAM nodes are kept on a list and
identified by unique names. Whenever a save operation is initiated by
calling pkram_prepare_save(), a new node is created and linked to the
list. When the save operation has been committed by calling
pkram_finish_save(), the node becomes loadable. A load operation can
then be initiated by calling pkram_prepare_load(), which deletes the
node from the list and prepares the corresponding stream for loading
data from it. After the load has finished, pkram_finish_load() must be
called to free the node. Nodes are also deleted when a save operation is
discarded, i.e. when pkram_discard_save() is called instead of
pkram_finish_save().

Originally-by: Vladimir Davydov
Signed-off-by: Anthony Yznaga
---
 include/linux/pkram.h |   8 ++-
 mm/pkram.c            | 147 ++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 149 insertions(+), 6 deletions(-)

diff --git a/include/linux/pkram.h b/include/linux/pkram.h
index 57b8db4229a4..8def9017b16a 100644
--- a/include/linux/pkram.h
+++ b/include/linux/pkram.h
@@ -6,6 +6,8 @@
 #include
 #include
 
+struct pkram_node;
+
 /**
  * enum pkram_data_flags - definition of data types contained in a pkram obj
  * @PKRAM_DATA_none: No data types configured
@@ -14,7 +16,11 @@ enum pkram_data_flags {
 	PKRAM_DATA_none = 0x0,	/* No data types configured */
 };
 
-struct pkram_stream;
+struct pkram_stream {
+	gfp_t gfp_mask;
+	struct pkram_node *node;
+};
+
 struct pkram_access;
 
 #define PKRAM_NAME_MAX	256	/* including nul */
diff --git a/mm/pkram.c b/mm/pkram.c
index 421de8211e05..bbfd8df0874e 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -2,16 +2,85 @@
 #include
 #include
 #include
+#include
 #include
+#include
 #include
+#include
 #include
 
+/*
+ * Preserved memory is divided into nodes that can be saved or loaded
+ * independently of each other. The nodes are identified by unique name
+ * strings.
+ *
+ * The structure occupies a memory page.
+ */
+struct pkram_node {
+	__u32 flags;
+
+	__u8 name[PKRAM_NAME_MAX];
+};
+
+#define PKRAM_SAVE		1
+#define PKRAM_LOAD		2
+#define PKRAM_ACCMODE_MASK	3
+
+static LIST_HEAD(pkram_nodes);		/* linked through page::lru */
+static DEFINE_MUTEX(pkram_mutex);	/* serializes open/close */
+
+static inline struct page *pkram_alloc_page(gfp_t gfp_mask)
+{
+	return alloc_page(gfp_mask);
+}
+
+static inline void pkram_free_page(void *addr)
+{
+	free_page((unsigned long)addr);
+}
+
+static inline void pkram_insert_node(struct pkram_node *node)
+{
+	list_add(&virt_to_page(node)->lru, &pkram_nodes);
+}
+
+static inline void pkram_delete_node(struct pkram_node *node)
+{
+	list_del(&virt_to_page(node)->lru);
+}
+
+static struct pkram_node *pkram_find_node(const char *name)
+{
+	struct page *page;
+	struct pkram_node *node;
+
+	list_for_each_entry(page, &pkram_nodes, lru) {
+		node = page_address(page);
+		if (strcmp(node->name, name) == 0)
+			return node;
+	}
+	return NULL;
+}
+
+static void pkram_stream_init(struct pkram_stream *ps,
+			     struct pkram_node *node, gfp_t gfp_mask)
+{
+	memset(ps, 0, sizeof(*ps));
+	ps->gfp_mask = gfp_mask;
+	ps->node = node;
+}
+
 /**
  * Create a preserved memory node with name @name and initialize stream @ps
  * for saving data to it.
  *
  * @gfp_mask specifies the memory allocation mask to be used when saving data.
  *
+ * Error values:
+ *	%ENAMETOOLONG: name len >= PKRAM_NAME_MAX
+ *	%ENOMEM: insufficient memory available
+ *	%EEXIST: node with specified name already exists
+ *
  * Returns 0 on success, -errno on failure.
  *
  * After the save has finished, pkram_finish_save() (or pkram_discard_save() in
@@ -19,7 +88,34 @@
  */
 int pkram_prepare_save(struct pkram_stream *ps, const char *name, gfp_t gfp_mask)
 {
-	return -EINVAL;
+	struct page *page;
+	struct pkram_node *node;
+	int err = 0;
+
+	if (strlen(name) >= PKRAM_NAME_MAX)
+		return -ENAMETOOLONG;
+
+	page = pkram_alloc_page(gfp_mask | __GFP_ZERO);
+	if (!page)
+		return -ENOMEM;
+	node = page_address(page);
+
+	node->flags = PKRAM_SAVE;
+	strcpy(node->name, name);
+
+	mutex_lock(&pkram_mutex);
+	if (!pkram_find_node(name))
+		pkram_insert_node(node);
+	else
+		err = -EEXIST;
+	mutex_unlock(&pkram_mutex);
+	if (err) {
+		pkram_free_page(node);
+		return err;
+	}
+
+	pkram_stream_init(ps, node, gfp_mask);
+	return 0;
 }
 
 /**
@@ -50,7 +146,11 @@ void pkram_finish_save_obj(struct pkram_stream *ps)
  */
 void pkram_finish_save(struct pkram_stream *ps)
 {
-	WARN_ON_ONCE(1);
+	struct pkram_node *node = ps->node;
+
+	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_SAVE);
+
+	node->flags &= ~PKRAM_ACCMODE_MASK;
 }
 
 /**
@@ -60,7 +160,15 @@ void pkram_finish_save(struct pkram_stream *ps)
  */
 void pkram_discard_save(struct pkram_stream *ps)
 {
-	WARN_ON_ONCE(1);
+	struct pkram_node *node = ps->node;
+
+	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_SAVE);
+
+	mutex_lock(&pkram_mutex);
+	pkram_delete_node(node);
+	mutex_unlock(&pkram_mutex);
+
+	pkram_free_page(node);
 }
 
 /**
@@ -69,11 +177,36 @@ void pkram_discard_save(struct pkram_stream *ps)
  *
  * Returns 0 on success, -errno on failure.
  *
+ * Error values:
+ *	%ENOENT: node with specified name does not exist
+ *	%EBUSY: save to required node has not finished yet
+ *
  * After the load has finished, pkram_finish_load() is to be called.
  */
 int pkram_prepare_load(struct pkram_stream *ps, const char *name)
 {
-	return -EINVAL;
+	struct pkram_node *node;
+	int err = 0;
+
+	mutex_lock(&pkram_mutex);
+	node = pkram_find_node(name);
+	if (!node) {
+		err = -ENOENT;
+		goto out_unlock;
+	}
+	if (node->flags & PKRAM_ACCMODE_MASK) {
+		err = -EBUSY;
+		goto out_unlock;
+	}
+	pkram_delete_node(node);
+out_unlock:
+	mutex_unlock(&pkram_mutex);
+	if (err)
+		return err;
+
+	node->flags |= PKRAM_LOAD;
+	pkram_stream_init(ps, node, 0);
+	return 0;
 }
 
 /**
@@ -106,7 +239,11 @@ void pkram_finish_load_obj(struct pkram_stream *ps)
  */
 void pkram_finish_load(struct pkram_stream *ps)
 {
-	WARN_ON_ONCE(1);
+	struct pkram_node *node = ps->node;
+
+	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_LOAD);
+
+	pkram_free_page(node);
 }
 
 /**

From patchwork Thu Apr 27 00:08:39 2023
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225112
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
 hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
 ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
 jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
 fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 03/21] mm: PKRAM: implement object load and save functions
Date: Wed, 26 Apr 2023 17:08:39 -0700
Message-Id: <1682554137-13938-4-git-send-email-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 1.9.4
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>

PKRAM nodes are further divided into a list of objects. After a save
operation has been initiated for a node, a save operation for an object
associated with the node is initiated by calling pkram_prepare_save_obj().
A new object is created and linked to the node. The save operation for
the object is committed by calling pkram_finish_save_obj().

After a load operation has been initiated, pkram_prepare_load_obj() is
called to delete the next object from the node and prepare the
corresponding stream for loading data from it. After the load of the
object has finished, pkram_finish_load_obj() is called to free the
object. Objects are also deleted when a save operation is discarded.
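
For illustration only, not part of the patch: a minimal userspace sketch
of the object list described above. Plain pointers stand in for the pfn
links (pkram_node::obj_pfn and pkram_obj::obj_pfn), and the names obj,
node, payload, node_save_obj(), node_load_obj() and node_truncate() are
hypothetical. It assumes, as the code in this patch does, that a new
object is linked at the head of the node's list, so objects are loaded
in the reverse of the order in which they were saved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace stand-ins: the kernel code links pages by pfn, not by pointer. */
struct obj {
	struct obj *next;	/* models pkram_obj::obj_pfn */
	int payload;		/* hypothetical, for illustration only */
};

struct node {
	struct obj *first;	/* models pkram_node::obj_pfn */
};

/* Models pkram_prepare_save_obj(): allocate an object, link it at the head. */
static int node_save_obj(struct node *n, int payload)
{
	struct obj *o = calloc(1, sizeof(*o));

	if (!o)
		return -1;	/* the patch returns -ENOMEM here */
	o->payload = payload;
	o->next = n->first;
	n->first = o;
	return 0;
}

/* Models pkram_prepare_load_obj(): unlink the head object, NULL when empty. */
static struct obj *node_load_obj(struct node *n)
{
	struct obj *o = n->first;

	if (o)
		n->first = o->next;
	return o;
}

/* Models pkram_truncate_node(): free every object that is still linked. */
static void node_truncate(struct node *n)
{
	struct obj *o;

	while ((o = node_load_obj(n)) != NULL)
		free(o);
	assert(n->first == NULL);
}
```

Saving payloads 1 and then 2 and loading twice hands back 2 first and
then 1; a third load finds the list empty, which is where the patch
returns -ENODATA.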

Signed-off-by: Anthony Yznaga
---
 include/linux/pkram.h |  2 ++
 mm/pkram.c            | 72 ++++++++++++++++++++++++++++++++++++++++++++++++---
 2 files changed, 70 insertions(+), 4 deletions(-)

diff --git a/include/linux/pkram.h b/include/linux/pkram.h
index 8def9017b16a..83718ad0e416 100644
--- a/include/linux/pkram.h
+++ b/include/linux/pkram.h
@@ -7,6 +7,7 @@
 #include
 
 struct pkram_node;
+struct pkram_obj;
 
 /**
  * enum pkram_data_flags - definition of data types contained in a pkram obj
@@ -19,6 +20,7 @@ enum pkram_data_flags {
 struct pkram_stream {
 	gfp_t gfp_mask;
 	struct pkram_node *node;
+	struct pkram_obj *obj;
 };
 
 struct pkram_access;
diff --git a/mm/pkram.c b/mm/pkram.c
index bbfd8df0874e..6e3895cb9872 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -6,9 +6,14 @@
 #include
 #include
 #include
+#include
 #include
 #include
 
+struct pkram_obj {
+	__u64 obj_pfn; /* points to the next object in the list */
+};
+
 /*
  * Preserved memory is divided into nodes that can be saved or loaded
  * independently of each other. The nodes are identified by unique name
@@ -18,6 +23,7 @@
  */
 struct pkram_node {
 	__u32 flags;
+	__u64 obj_pfn; /* points to the first obj of the node */
 
 	__u8 name[PKRAM_NAME_MAX];
 };
@@ -62,6 +68,21 @@ static struct pkram_node *pkram_find_node(const char *name)
 	return NULL;
 }
 
+static void pkram_truncate_node(struct pkram_node *node)
+{
+	unsigned long obj_pfn;
+	struct pkram_obj *obj;
+
+	obj_pfn = node->obj_pfn;
+	while (obj_pfn) {
+		obj = pfn_to_kaddr(obj_pfn);
+		obj_pfn = obj->obj_pfn;
+		pkram_free_page(obj);
+		cond_resched();
+	}
+	node->obj_pfn = 0;
+}
+
 static void pkram_stream_init(struct pkram_stream *ps,
 			     struct pkram_node *node, gfp_t gfp_mask)
 {
@@ -124,12 +145,31 @@ int pkram_prepare_save(struct pkram_stream *ps, const char *name, gfp_t gfp_mask
  *
  * Returns 0 on success, -errno on failure.
  *
+ * Error values:
+ *	%ENOMEM: insufficient memory available
+ *
  * After the save has finished, pkram_finish_save_obj() (or pkram_discard_save()
  * in case of failure) is to be called.
  */
 int pkram_prepare_save_obj(struct pkram_stream *ps, enum pkram_data_flags flags)
 {
-	return -EINVAL;
+	struct pkram_node *node = ps->node;
+	struct pkram_obj *obj;
+	struct page *page;
+
+	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_SAVE);
+
+	page = pkram_alloc_page(ps->gfp_mask | __GFP_ZERO);
+	if (!page)
+		return -ENOMEM;
+	obj = page_address(page);
+
+	if (node->obj_pfn)
+		obj->obj_pfn = node->obj_pfn;
+	node->obj_pfn = page_to_pfn(page);
+
+	ps->obj = obj;
+	return 0;
 }
 
 /**
@@ -137,7 +177,9 @@ int pkram_prepare_save_obj(struct pkram_stream *ps, enum pkram_data_flags flags)
  */
 void pkram_finish_save_obj(struct pkram_stream *ps)
 {
-	WARN_ON_ONCE(1);
+	struct pkram_node *node = ps->node;
+
+	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_SAVE);
 }
 
 /**
@@ -168,6 +210,7 @@ void pkram_discard_save(struct pkram_stream *ps)
 	pkram_delete_node(node);
 	mutex_unlock(&pkram_mutex);
 
+	pkram_truncate_node(node);
 	pkram_free_page(node);
 }
 
@@ -215,11 +258,26 @@ int pkram_prepare_load(struct pkram_stream *ps, const char *name)
  *
  * Returns 0 on success, -errno on failure.
  *
+ * Error values:
+ *	%ENODATA: Stream @ps has no preserved memory objects
+ *
  * After the load has finished, pkram_finish_load_obj() is to be called.
  */
 int pkram_prepare_load_obj(struct pkram_stream *ps)
 {
-	return -EINVAL;
+	struct pkram_node *node = ps->node;
+	struct pkram_obj *obj;
+
+	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_LOAD);
+
+	if (!node->obj_pfn)
+		return -ENODATA;
+
+	obj = pfn_to_kaddr(node->obj_pfn);
+	node->obj_pfn = obj->obj_pfn;
+
+	ps->obj = obj;
+	return 0;
 }
 
 /**
@@ -229,7 +287,12 @@ int pkram_prepare_load_obj(struct pkram_stream *ps)
  */
 void pkram_finish_load_obj(struct pkram_stream *ps)
 {
-	WARN_ON_ONCE(1);
+	struct pkram_node *node = ps->node;
+	struct pkram_obj *obj = ps->obj;
+
+	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_LOAD);
+
+	pkram_free_page(obj);
 }
 
 /**
@@ -243,6 +306,7 @@ void pkram_finish_load(struct pkram_stream *ps)
 
 	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_LOAD);
 
+	pkram_truncate_node(node);
 	pkram_free_page(node);
 }
 

From patchwork Thu Apr 27 00:08:40 2023
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225059
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
 hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
 ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
 jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
 fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 04/21] mm: PKRAM: implement folio stream operations
Date: Wed, 26 Apr 2023 17:08:40 -0700
Message-Id: <1682554137-13938-5-git-send-email-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 1.9.4
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>

Implement pkram_save_folio() to populate a PKRAM object with in-memory
folios and pkram_load_folio() to load folios from a PKRAM object.

Saving a folio to PKRAM is accomplished by recording its pfn, order,
and mapping index and incrementing its refcount so that it will not
be freed after the last user puts it.
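
For illustration only, not part of the patch: a userspace sketch of how a
pfn, an order and flags can share one 64-bit entry using the shift and
masks this patch defines. It assumes 4 KiB pages (a PAGE_SHIFT of 12), so
that the 5 order bits and 7 flag bits fit below the page-aligned physical
address. entry_t, entry_encode() and the entry_*() accessors are
hypothetical names. The mapping index is not part of an entry; the patch
records it once per link, in pkram_link::index.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SHIFT		12	/* assumption: 4 KiB pages */
#define PKRAM_ENTRY_FLAGS_SHIFT	0x5
#define PKRAM_ENTRY_FLAGS_MASK	0x7f
#define PKRAM_ENTRY_ORDER_MASK	0x1f

typedef uint64_t entry_t;	/* stands in for pkram_entry_t */

/*
 * Pack a pfn, order and flags the way pkram_add_link_entry() does:
 * physical address in the high bits, flags in bits 5..11, order in bits 0..4.
 * Unlike the patch, this sketch also masks the order defensively.
 */
static entry_t entry_encode(uint64_t pfn, unsigned int order, unsigned int flags)
{
	entry_t p = pfn << PAGE_SHIFT;

	p |= order & PKRAM_ENTRY_ORDER_MASK;
	p |= (entry_t)(flags & PKRAM_ENTRY_FLAGS_MASK) << PKRAM_ENTRY_FLAGS_SHIFT;
	return p;
}

/* Recover the pfn, as PHYS_PFN() does for a physical address. */
static uint64_t entry_pfn(entry_t p)
{
	return p >> PAGE_SHIFT;
}

/* Recover the order, as __pkram_prep_load_page() does. */
static unsigned int entry_order(entry_t p)
{
	return p & PKRAM_ENTRY_ORDER_MASK;
}

/* Recover the flags, as __pkram_prep_load_page() does. */
static unsigned int entry_flags(entry_t p)
{
	return (p >> PKRAM_ENTRY_FLAGS_SHIFT) & PKRAM_ENTRY_FLAGS_MASK;
}
```

Because a valid entry always has a nonzero physical address, a zero
entry can mark an unused slot, which is what pkram_truncate_link()
relies on when it skips empty entries.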

Originally-by: Vladimir Davydov
Signed-off-by: Anthony Yznaga
---
 include/linux/pkram.h |  42 ++++++-
 mm/pkram.c            | 311 +++++++++++++++++++++++++++++++++++++++++++++++++-
 2 files changed, 346 insertions(+), 7 deletions(-)

diff --git a/include/linux/pkram.h b/include/linux/pkram.h
index 83718ad0e416..130ab5c2d94a 100644
--- a/include/linux/pkram.h
+++ b/include/linux/pkram.h
@@ -8,22 +8,47 @@
 
 struct pkram_node;
 struct pkram_obj;
+struct pkram_link;
 
 /**
  * enum pkram_data_flags - definition of data types contained in a pkram obj
  * @PKRAM_DATA_none: No data types configured
+ * @PKRAM_DATA_folios: obj contains folio data
  */
 enum pkram_data_flags {
-	PKRAM_DATA_none = 0x0, /* No data types configured */
+	PKRAM_DATA_none = 0x0, /* No data types configured */
+	PKRAM_DATA_folios = 0x1, /* Contains folio data */
+};
+
+struct pkram_data_stream {
+	/* List of link pages to add/remove from */
+	__u64 *head_link_pfnp;
+	__u64 *tail_link_pfnp;
+
+	struct pkram_link *link; /* current link */
+	unsigned int entry_idx; /* next entry in link */
 };
 
 struct pkram_stream {
 	gfp_t gfp_mask;
 	struct pkram_node *node;
 	struct pkram_obj *obj;
+
+	__u64 *folios_head_link_pfnp;
+	__u64 *folios_tail_link_pfnp;
+};
+
+struct pkram_folios_access {
+	unsigned long next_index;
 };
 
-struct pkram_access;
+struct pkram_access {
+	enum pkram_data_flags dtype;
+	struct pkram_stream *ps;
+	struct pkram_data_stream pds;
+
+	struct pkram_folios_access folios;
+};
 
 #define PKRAM_NAME_MAX 256 /* including nul */
 
@@ -41,8 +66,19 @@ int pkram_prepare_save(struct pkram_stream *ps, const char *name,
 void pkram_finish_load(struct pkram_stream *ps);
 void pkram_finish_load_obj(struct pkram_stream *ps);
 
+#define PKRAM_PDS_INIT(name, stream, type) { \
+	.head_link_pfnp = (stream)->type##_head_link_pfnp, \
+	.tail_link_pfnp = (stream)->type##_tail_link_pfnp, \
+	}
+
+#define PKRAM_ACCESS_INIT(name, stream, type) { \
+	.dtype = PKRAM_DATA_##type, \
+	.ps = (stream), \
+	.pds = PKRAM_PDS_INIT(name, stream, type), \
+	}
+
 #define PKRAM_ACCESS(name, stream, type) \
-	struct pkram_access name
+	struct pkram_access name = PKRAM_ACCESS_INIT(name, stream, type)
 
 void pkram_finish_access(struct pkram_access *pa, bool status_ok);
 
diff --git a/mm/pkram.c b/mm/pkram.c
index 6e3895cb9872..610ff7a88c98 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 #include
 #include
+#include
 #include
 #include
 #include
@@ -10,8 +11,40 @@
 #include
 #include
 
+#include "internal.h"
+
+
+/*
+ * Represents a reference to a data page saved to PKRAM.
+ */
+typedef __u64 pkram_entry_t;
+
+#define PKRAM_ENTRY_FLAGS_SHIFT 0x5
+#define PKRAM_ENTRY_FLAGS_MASK 0x7f
+#define PKRAM_ENTRY_ORDER_MASK 0x1f
+
+/*
+ * Keeps references to folios saved to PKRAM.
+ * The structure occupies a memory page.
+ */
+struct pkram_link {
+	__u64 link_pfn; /* points to the next link of the object */
+	__u64 index; /* mapping index of first pkram_entry_t */
+
+	/*
+	 * the array occupies the rest of the link page; if the link is not
+	 * full, the rest of the array must be filled with zeros
+	 */
+	pkram_entry_t entry[];
+};
+
+#define PKRAM_LINK_ENTRIES_MAX \
+	((PAGE_SIZE-sizeof(struct pkram_link))/sizeof(pkram_entry_t))
+
 struct pkram_obj {
-	__u64 obj_pfn; /* points to the next object in the list */
+	__u64 folios_head_link_pfn; /* the first folios link of the object */
+	__u64 folios_tail_link_pfn; /* the last folios link of the object */
+	__u64 obj_pfn; /* points to the next object in the list */
 };
 
 /*
@@ -19,6 +52,10 @@ struct pkram_obj {
  * independently of each other. The nodes are identified by unique name
  * strings.
  *
+ * References to folios saved to a preserved memory node are kept in a
+ * singly-linked list of PKRAM link structures (see above); the node has a
+ * pointer to the head of that list.
+ *
  * The structure occupies a memory page.
  */
 struct pkram_node {
@@ -68,6 +105,41 @@ static struct pkram_node *pkram_find_node(const char *name)
 	return NULL;
 }
 
+static void pkram_truncate_link(struct pkram_link *link)
+{
+	struct page *page;
+	pkram_entry_t p;
+	int i;
+
+	for (i = 0; i < PKRAM_LINK_ENTRIES_MAX; i++) {
+		p = link->entry[i];
+		if (!p)
+			continue;
+		page = pfn_to_page(PHYS_PFN(p));
+		put_page(page);
+	}
+}
+
+static void pkram_truncate_links(unsigned long link_pfn)
+{
+	struct pkram_link *link;
+
+	while (link_pfn) {
+		link = pfn_to_kaddr(link_pfn);
+		pkram_truncate_link(link);
+		link_pfn = link->link_pfn;
+		pkram_free_page(link);
+		cond_resched();
+	}
+}
+
+static void pkram_truncate_obj(struct pkram_obj *obj)
+{
+	pkram_truncate_links(obj->folios_head_link_pfn);
+	obj->folios_head_link_pfn = 0;
+	obj->folios_tail_link_pfn = 0;
+}
+
 static void pkram_truncate_node(struct pkram_node *node)
 {
 	unsigned long obj_pfn;
@@ -76,6 +148,7 @@ static void pkram_truncate_node(struct pkram_node *node)
 	obj_pfn = node->obj_pfn;
 	while (obj_pfn) {
 		obj = pfn_to_kaddr(obj_pfn);
+		pkram_truncate_obj(obj);
 		obj_pfn = obj->obj_pfn;
 		pkram_free_page(obj);
 		cond_resched();
@@ -83,6 +156,84 @@ static void pkram_truncate_node(struct pkram_node *node)
 	node->obj_pfn = 0;
 }
 
+static void pkram_add_link(struct pkram_link *link, struct pkram_data_stream *pds)
+{
+	__u64 link_pfn = page_to_pfn(virt_to_page(link));
+
+	if (!*pds->head_link_pfnp) {
+		*pds->head_link_pfnp = link_pfn;
+		*pds->tail_link_pfnp = link_pfn;
+	} else {
+		struct pkram_link *tail = pfn_to_kaddr(*pds->tail_link_pfnp);
+
+		tail->link_pfn = link_pfn;
+		*pds->tail_link_pfnp = link_pfn;
+	}
+}
+
+static struct pkram_link *pkram_remove_link(struct pkram_data_stream *pds)
+{
+	struct pkram_link *link;
+
+	if (!*pds->head_link_pfnp)
+		return NULL;
+
+	link = pfn_to_kaddr(*pds->head_link_pfnp);
+	*pds->head_link_pfnp = link->link_pfn;
+	if (!*pds->head_link_pfnp)
+		*pds->tail_link_pfnp = 0;
+	else
+		link->link_pfn = 0;
+
+	return link;
+}
+
+static struct pkram_link *pkram_new_link(struct pkram_data_stream *pds, gfp_t gfp_mask)
+{
+	struct pkram_link *link;
+	struct page *link_page;
+
+	link_page = pkram_alloc_page((gfp_mask & GFP_RECLAIM_MASK) |
+				    __GFP_ZERO);
+	if (!link_page)
+		return NULL;
+
+	link = page_address(link_page);
+	pkram_add_link(link, pds);
+	pds->link = link;
+	pds->entry_idx = 0;
+
+	return link;
+}
+
+static void pkram_add_link_entry(struct pkram_data_stream *pds, struct page *page)
+{
+	struct pkram_link *link = pds->link;
+	pkram_entry_t p;
+	short flags = 0;
+
+	p = page_to_phys(page);
+	p |= compound_order(page);
+	p |= ((flags & PKRAM_ENTRY_FLAGS_MASK) << PKRAM_ENTRY_FLAGS_SHIFT);
+	link->entry[pds->entry_idx] = p;
+	pds->entry_idx++;
+}
+
+static int pkram_next_link(struct pkram_data_stream *pds, struct pkram_link **linkp)
+{
+	struct pkram_link *link;
+
+	link = pkram_remove_link(pds);
+	if (!link)
+		return -ENODATA;
+
+	pds->link = link;
+	pds->entry_idx = 0;
+	*linkp = link;
+
+	return 0;
+}
+
 static void pkram_stream_init(struct pkram_stream *ps,
 			     struct pkram_node *node, gfp_t gfp_mask)
 {
@@ -159,6 +310,9 @@ int pkram_prepare_save_obj(struct pkram_stream *ps, enum pkram_data_flags flags)
 
 	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_SAVE);
 
+	if (flags & ~PKRAM_DATA_folios)
+		return -EINVAL;
+
 	page = pkram_alloc_page(ps->gfp_mask | __GFP_ZERO);
 	if (!page)
 		return -ENOMEM;
@@ -168,6 +322,10 @@ int pkram_prepare_save_obj(struct pkram_stream *ps, enum pkram_data_flags flags)
 		obj->obj_pfn = node->obj_pfn;
 	node->obj_pfn = page_to_pfn(page);
 
+	if (flags & PKRAM_DATA_folios) {
+		ps->folios_head_link_pfnp = &obj->folios_head_link_pfn;
+		ps->folios_tail_link_pfnp = &obj->folios_tail_link_pfn;
+	}
 	ps->obj = obj;
 	return 0;
 }
@@ -274,8 +432,17 @@ int pkram_prepare_load_obj(struct pkram_stream *ps)
 		return -ENODATA;
 
 	obj = pfn_to_kaddr(node->obj_pfn);
+	if (!obj->folios_head_link_pfn) {
+		WARN_ON(1);
+		return -EINVAL;
+	}
+
 	node->obj_pfn = obj->obj_pfn;
 
+	if (obj->folios_head_link_pfn) {
+		ps->folios_head_link_pfnp = &obj->folios_head_link_pfn;
+		ps->folios_tail_link_pfnp = &obj->folios_tail_link_pfn;
+	}
 	ps->obj = obj;
 	return 0;
 }
@@ -292,6 +459,7 @@ void pkram_finish_load_obj(struct pkram_stream *ps)
 
 	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_LOAD);
 
+	pkram_truncate_obj(obj);
 	pkram_free_page(obj);
 }
 
@@ -317,7 +485,41 @@ void pkram_finish_load(struct pkram_stream *ps)
  */
 void pkram_finish_access(struct pkram_access *pa, bool status_ok)
 {
-	WARN_ON_ONCE(1);
+	if (status_ok)
+		return;
+
+	if (pa->ps->node->flags == PKRAM_SAVE)
+		return;
+
+	if (pa->pds.link)
+		pkram_truncate_link(pa->pds.link);
+}
+
+/*
+ * Add a page to a PKRAM obj allocating a new PKRAM link if necessary.
+ */
+static int __pkram_save_page(struct pkram_access *pa, struct page *page,
+			     unsigned long index)
+{
+	struct pkram_data_stream *pds = &pa->pds;
+	struct pkram_link *link = pds->link;
+
+	if (!link || pds->entry_idx >= PKRAM_LINK_ENTRIES_MAX ||
+	    index != pa->folios.next_index) {
+		link = pkram_new_link(pds, pa->ps->gfp_mask);
+		if (!link)
+			return -ENOMEM;
+
+		pa->folios.next_index = link->index = index;
+	}
+
+	get_page(page);
+
+	pkram_add_link_entry(pds, page);
+
+	pa->folios.next_index += compound_nr(page);
+
+	return 0;
 }
 
 /**
@@ -327,10 +529,102 @@ void pkram_finish_access(struct pkram_access *pa, bool status_ok)
  * with PKRAM_ACCESS().
  *
  * Returns 0 on success, -errno on failure.
+ *
+ * Error values:
+ *	%ENOMEM: insufficient amount of memory available
+ *
+ * Saving a folio to preserved memory is simply incrementing its refcount so
+ * that it will not get freed after the last user puts it. That means it is
+ * safe to use the folio as usual after it has been saved.
  */
 int pkram_save_folio(struct pkram_access *pa, struct folio *folio)
 {
-	return -EINVAL;
+	struct pkram_node *node = pa->ps->node;
+	struct page *page = folio_page(folio, 0);
+
+	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_SAVE);
+
+	return __pkram_save_page(pa, page, page->index);
+}
+
+static struct page *__pkram_prep_load_page(pkram_entry_t p)
+{
+	struct page *page;
+	int order;
+	short flags;
+
+	flags = (p >> PKRAM_ENTRY_FLAGS_SHIFT) & PKRAM_ENTRY_FLAGS_MASK;
+	order = p & PKRAM_ENTRY_ORDER_MASK;
+	if (order >= MAX_ORDER)
+		goto out_error;
+
+	page = pfn_to_page(PHYS_PFN(p));
+
+	if (!page_ref_freeze(page, 1)) {
+		pr_err("PKRAM preserved page has unexpected inflated ref count\n");
+		goto out_error;
+	}
+
+	if (order) {
+		prep_compound_page(page, order);
+		if (order > 1)
+			prep_transhuge_page(page);
+	}
+
+	page_ref_unfreeze(page, 1);
+
+	return page;
+
+out_error:
+	return ERR_PTR(-EINVAL);
+}
+
+/*
+ * Extract the next page from preserved memory freeing a PKRAM link if it
+ * becomes empty.
+ */
+static struct page *__pkram_load_page(struct pkram_access *pa, unsigned long *index)
+{
+	struct pkram_data_stream *pds = &pa->pds;
+	struct pkram_link *link = pds->link;
+	struct page *page;
+	pkram_entry_t p;
+	int ret;
+
+	if (!link) {
+		ret = pkram_next_link(pds, &link);
+		if (ret)
+			return NULL;
+
+		if (index)
+			pa->folios.next_index = link->index;
+	}
+
+	BUG_ON(pds->entry_idx >= PKRAM_LINK_ENTRIES_MAX);
+
+	p = link->entry[pds->entry_idx];
+	BUG_ON(!p);
+
+	page = __pkram_prep_load_page(p);
+	if (IS_ERR(page))
+		return page;
+
+	if (index) {
+		*index = pa->folios.next_index;
+		pa->folios.next_index += compound_nr(page);
+	}
+
+	/* clear to avoid double free (see pkram_truncate_link()) */
+	link->entry[pds->entry_idx] = 0;
+
+	pds->entry_idx++;
+	if (pds->entry_idx >= PKRAM_LINK_ENTRIES_MAX ||
+	    !link->entry[pds->entry_idx]) {
+		pds->link = NULL;
+		pkram_free_page(link);
+	}
+
+	return page;
 }
 
 /**
@@ -348,7 +642,16 @@ int pkram_save_folio(struct pkram_access *pa, struct folio *folio)
  */
 struct folio *pkram_load_folio(struct pkram_access *pa, unsigned long *index)
 {
-	return NULL;
+	struct pkram_node *node = pa->ps->node;
+	struct page *page;
+
+	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_LOAD);
+
+	page = __pkram_load_page(pa, index);
+	if (IS_ERR_OR_NULL(page))
+		return (struct folio *)page;
+	else
+		return page_folio(page);
 }
 
 /**

From patchwork Thu Apr 27 00:08:41 2023
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225034
40) id 134866B007D; Wed, 26 Apr 2023 20:09:50 -0400 (EDT)
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
 hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
 ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
 jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
 fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 05/21] mm: PKRAM: implement byte stream operations
Date: Wed, 26 Apr 2023 17:08:41 -0700
Message-Id: <1682554137-13938-6-git-send-email-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 1.9.4
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

This patch adds the ability to save arbitrary byte streams to a PKRAM
object using pkram_write(), to be restored later using pkram_read().
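The page-at-a-time copying that this description implies can be sketched in userspace. The following is only a model under stated assumptions, not the kernel code: model_stream, model_write(), model_read(), MODEL_PAGE_SIZE and MODEL_MAX_PAGES are hypothetical names, pages are plain calloc() buffers held in an array rather than PKRAM links, and all data is written before any of it is read.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_PAGE_SIZE 64	/* hypothetical; deliberately tiny */
#define MODEL_MAX_PAGES 16	/* hypothetical cap standing in for ENOMEM */

/* Hypothetical userspace stand-in for a PKRAM object holding byte data. */
struct model_stream {
	unsigned char *pages[MODEL_MAX_PAGES];
	size_t nr_pages;	/* pages appended so far */
	size_t data_len;	/* bytes written and not yet read */
	size_t wr_off;		/* write offset into the last page */
	size_t rd_page;		/* index of the page being read */
	size_t rd_off;		/* read offset into that page */
};

/* Append count bytes, starting a new page whenever the current one is full. */
static long model_write(struct model_stream *s, const void *buf, size_t count)
{
	const unsigned char *src = buf;
	size_t written = 0;

	while (count > 0) {
		size_t n;

		if (s->nr_pages == 0 || s->wr_off >= MODEL_PAGE_SIZE) {
			unsigned char *page;

			if (s->nr_pages == MODEL_MAX_PAGES)
				return -1;
			page = calloc(1, MODEL_PAGE_SIZE);
			if (!page)
				return -1;
			s->pages[s->nr_pages++] = page;
			s->wr_off = 0;
		}

		n = MODEL_PAGE_SIZE - s->wr_off;
		if (n > count)
			n = count;
		memcpy(s->pages[s->nr_pages - 1] + s->wr_off, src, n);

		src += n;
		s->wr_off += n;
		s->data_len += n;
		written += n;
		count -= n;
	}
	return (long)written;
}

/* Read up to count bytes; a page is released once it is fully consumed. */
static size_t model_read(struct model_stream *s, void *buf, size_t count)
{
	unsigned char *dst = buf;
	size_t nread = 0;

	while (count > 0 && s->data_len > 0) {
		size_t n = MODEL_PAGE_SIZE - s->rd_off;

		assert(s->rd_page < s->nr_pages);
		if (n > count)
			n = count;
		if (n > s->data_len)
			n = s->data_len;
		memcpy(dst, s->pages[s->rd_page] + s->rd_off, n);

		dst += n;
		s->rd_off += n;
		s->data_len -= n;
		nread += n;
		count -= n;

		if (s->rd_off >= MODEL_PAGE_SIZE || s->data_len == 0) {
			free(s->pages[s->rd_page]);
			s->pages[s->rd_page] = NULL;
			s->rd_page++;
			s->rd_off = 0;
		}
	}
	return nread;
}
```

As in the patch, the stored length is what bounds a read, so a caller may ask for more than is available and simply receives a short count.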
Originally-by: Vladimir Davydov
Signed-off-by: Anthony Yznaga
---
 include/linux/pkram.h |  11 +++++
 mm/pkram.c            | 123 ++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 130 insertions(+), 4 deletions(-)

diff --git a/include/linux/pkram.h b/include/linux/pkram.h
index 130ab5c2d94a..b614e9059bba 100644
--- a/include/linux/pkram.h
+++ b/include/linux/pkram.h
@@ -14,10 +14,12 @@
  * enum pkram_data_flags - definition of data types contained in a pkram obj
  * @PKRAM_DATA_none: No data types configured
  * @PKRAM_DATA_folios: obj contains folio data
+ * @PKRAM_DATA_bytes: obj contains byte data
  */
 enum pkram_data_flags {
 	PKRAM_DATA_none = 0x0, /* No data types configured */
 	PKRAM_DATA_folios = 0x1, /* Contains folio data */
+	PKRAM_DATA_bytes = 0x2, /* Contains byte data */
 };

 struct pkram_data_stream {
@@ -36,18 +38,27 @@ struct pkram_stream {

 	__u64 *folios_head_link_pfnp;
 	__u64 *folios_tail_link_pfnp;
+
+	__u64 *bytes_head_link_pfnp;
+	__u64 *bytes_tail_link_pfnp;
 };

 struct pkram_folios_access {
 	unsigned long next_index;
 };

+struct pkram_bytes_access {
+	struct page *data_page; /* current page */
+	unsigned int data_offset; /* offset into current page */
+};
+
 struct pkram_access {
 	enum pkram_data_flags dtype;
 	struct pkram_stream *ps;
 	struct pkram_data_stream pds;

 	struct pkram_folios_access folios;
+	struct pkram_bytes_access bytes;
 };

 #define PKRAM_NAME_MAX 256 /* including nul */
diff --git a/mm/pkram.c b/mm/pkram.c
index 610ff7a88c98..eac8cf6b0cdf 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -1,6 +1,7 @@
 // SPDX-License-Identifier: GPL-2.0
 #include
 #include
+#include
 #include
 #include
 #include
@@ -44,6 +45,9 @@ struct pkram_link {
 struct pkram_obj {
 	__u64 folios_head_link_pfn; /* the first folios link of the object */
 	__u64 folios_tail_link_pfn; /* the last folios link of the object */
+	__u64 bytes_head_link_pfn; /* the first bytes link of the object */
+	__u64 bytes_tail_link_pfn; /* the last bytes link of the object */
+	__u64 data_len; /* byte data size */
 	__u64 obj_pfn; /* points to the next object in the list */
 };

@@ -138,6 +142,11 @@ static void pkram_truncate_obj(struct pkram_obj *obj)
 	pkram_truncate_links(obj->folios_head_link_pfn);
 	obj->folios_head_link_pfn = 0;
 	obj->folios_tail_link_pfn = 0;
+
+	pkram_truncate_links(obj->bytes_head_link_pfn);
+	obj->bytes_head_link_pfn = 0;
+	obj->bytes_tail_link_pfn = 0;
+	obj->data_len = 0;
 }

 static void pkram_truncate_node(struct pkram_node *node)
@@ -310,7 +319,7 @@ int pkram_prepare_save_obj(struct pkram_stream *ps, enum pkram_data_flags flags)

 	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_SAVE);

-	if (flags & ~PKRAM_DATA_folios)
+	if (flags & ~(PKRAM_DATA_folios | PKRAM_DATA_bytes))
 		return -EINVAL;

 	page = pkram_alloc_page(ps->gfp_mask | __GFP_ZERO);
@@ -326,6 +335,10 @@ int pkram_prepare_save_obj(struct pkram_stream *ps, enum pkram_data_flags flags)
 		ps->folios_head_link_pfnp = &obj->folios_head_link_pfn;
 		ps->folios_tail_link_pfnp = &obj->folios_tail_link_pfn;
 	}
+	if (flags & PKRAM_DATA_bytes) {
+		ps->bytes_head_link_pfnp = &obj->bytes_head_link_pfn;
+		ps->bytes_tail_link_pfnp = &obj->bytes_tail_link_pfn;
+	}
 	ps->obj = obj;
 	return 0;
 }
@@ -432,7 +445,7 @@ int pkram_prepare_load_obj(struct pkram_stream *ps)
 		return -ENODATA;

 	obj = pfn_to_kaddr(node->obj_pfn);
-	if (!obj->folios_head_link_pfn) {
+	if (!obj->folios_head_link_pfn && !obj->bytes_head_link_pfn) {
 		WARN_ON(1);
 		return -EINVAL;
 	}
@@ -443,6 +456,10 @@ int pkram_prepare_load_obj(struct pkram_stream *ps)
 		ps->folios_head_link_pfnp = &obj->folios_head_link_pfn;
 		ps->folios_tail_link_pfnp = &obj->folios_tail_link_pfn;
 	}
+	if (obj->bytes_head_link_pfn) {
+		ps->bytes_head_link_pfnp = &obj->bytes_head_link_pfn;
+		ps->bytes_tail_link_pfnp = &obj->bytes_tail_link_pfn;
+	}
 	ps->obj = obj;
 	return 0;
 }
@@ -493,6 +510,9 @@ void pkram_finish_access(struct pkram_access *pa, bool status_ok)

 	if (pa->pds.link)
 		pkram_truncate_link(pa->pds.link);
+
+	if ((pa->dtype == PKRAM_DATA_bytes) && (pa->bytes.data_page))
+		pkram_free_page(page_address(pa->bytes.data_page));
 }

 /*
@@ -547,6 +567,22 @@ int pkram_save_folio(struct pkram_access *pa, struct folio *folio)
 	return __pkram_save_page(pa, page, page->index);
 }

+static int __pkram_bytes_save_page(struct pkram_access *pa, struct page *page)
+{
+	struct pkram_data_stream *pds = &pa->pds;
+	struct pkram_link *link = pds->link;
+
+	if (!link || pds->entry_idx >= PKRAM_LINK_ENTRIES_MAX) {
+		link = pkram_new_link(pds, pa->ps->gfp_mask);
+		if (!link)
+			return -ENOMEM;
+	}
+
+	pkram_add_link_entry(pds, page);
+
+	return 0;
+}
+
 static struct page *__pkram_prep_load_page(pkram_entry_t p)
 {
 	struct page *page;
@@ -662,10 +698,53 @@ struct folio *pkram_load_folio(struct pkram_access *pa, unsigned long *index)
  *
  * On success, returns the number of bytes written, which is always equal to
  * @count. On failure, -errno is returned.
+ *
+ * Error values:
+ *	%ENOMEM: insufficient amount of memory available
  */
 ssize_t pkram_write(struct pkram_access *pa, const void *buf, size_t count)
 {
-	return -EINVAL;
+	struct pkram_node *node = pa->ps->node;
+	struct pkram_obj *obj = pa->ps->obj;
+	size_t copy_count, write_count = 0;
+	void *addr;
+
+	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_SAVE);
+
+	while (count > 0) {
+		if (!pa->bytes.data_page) {
+			gfp_t gfp_mask = pa->ps->gfp_mask;
+			struct page *page;
+			int err;
+
+			page = pkram_alloc_page((gfp_mask & GFP_RECLAIM_MASK) |
+					       __GFP_HIGHMEM | __GFP_ZERO);
+			if (!page)
+				return -ENOMEM;
+			err = __pkram_bytes_save_page(pa, page);
+			if (err) {
+				pkram_free_page(page_address(page));
+				return err;
+			}
+			pa->bytes.data_page = page;
+			pa->bytes.data_offset = 0;
+		}
+
+		copy_count = min_t(size_t, count, PAGE_SIZE - pa->bytes.data_offset);
+		addr = kmap_local_page(pa->bytes.data_page);
+		memcpy(addr + pa->bytes.data_offset, buf, copy_count);
+		kunmap_local(addr);
+
+		buf += copy_count;
+		obj->data_len += copy_count;
+		pa->bytes.data_offset += copy_count;
+		if (pa->bytes.data_offset >= PAGE_SIZE)
+			pa->bytes.data_page = NULL;
+
+		write_count += copy_count;
+		count -= copy_count;
+	}
+	return write_count;
 }

 /**
@@ -679,5 +758,41 @@ ssize_t pkram_write(struct pkram_access *pa, const void *buf, size_t count)
  */
 size_t pkram_read(struct pkram_access *pa, void *buf, size_t count)
 {
-	return 0;
+	struct pkram_node *node = pa->ps->node;
+	struct pkram_obj *obj = pa->ps->obj;
+	size_t copy_count, read_count = 0;
+	char *addr;
+
+	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_LOAD);
+
+	while (count > 0 && obj->data_len > 0) {
+		if (!pa->bytes.data_page) {
+			struct page *page;
+
+			page = __pkram_load_page(pa, NULL);
+			if (IS_ERR_OR_NULL(page))
+				break;
+			pa->bytes.data_page = page;
+			pa->bytes.data_offset = 0;
+		}
+
+		copy_count = min_t(size_t, count, PAGE_SIZE - pa->bytes.data_offset);
+		if (copy_count > obj->data_len)
+			copy_count = obj->data_len;
+		addr = kmap_local_page(pa->bytes.data_page);
+		memcpy(buf, addr + pa->bytes.data_offset, copy_count);
+		kunmap_local(addr);
+
+		buf += copy_count;
+		obj->data_len -= copy_count;
+		pa->bytes.data_offset += copy_count;
+		if (pa->bytes.data_offset >= PAGE_SIZE || !obj->data_len) {
+			put_page(pa->bytes.data_page);
+			pa->bytes.data_page = NULL;
+		}
+
+		read_count += copy_count;
+		count -= copy_count;
+	}
+	return read_count;
 }

From patchwork Thu Apr 27 00:08:42 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225030
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id DCCD1C77B7C for ; Thu, 27 Apr 2023 00:09:47 +0000 (UTC)
Received: by kanga.kvack.org (Postfix) id 357A66B0071; Wed, 26 Apr 2023 20:09:46 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40) id 307FD6B0074; Wed, 26 Apr 2023 20:09:46 -0400 (EDT)
X-Delivered-To:
int-list-linux-mm@kvack.org
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
 hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
 ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
 jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
 fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 06/21] mm: PKRAM: link nodes by pfn before reboot
Date: Wed, 26 Apr 2023 17:08:42 -0700
Message-Id: <1682554137-13938-7-git-send-email-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 1.9.4
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Since page structs are used for linking PKRAM nodes and cleared on boot,
organize all PKRAM nodes into a list singly-linked by pfns before reboot
to facilitate restoring the node list in the new kernel.
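The linking step described here can be pictured with a small userspace sketch. It is a model under assumptions rather than the kernel code: model_node and model_link_nodes() are hypothetical names, an array stands in for the page::lru list, the pfns are made-up numbers supplied by the caller, and returning the head is a convenience of the model. It walks the nodes in reverse so that each node ends up pointing at the one that follows it, and it skips nodes that are still open for save or load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the fields of a PKRAM node used when linking. */
struct model_node {
	uint32_t flags;		/* nonzero models a node still open for save/load */
	uint64_t node_pfn;	/* "pfn" of the next node; 0 terminates the list */
};

/*
 * pfns[i] is the made-up pfn of nodes[i]. Walking backwards means the
 * pfn remembered from the previous step is always the successor of the
 * node being visited. Returns the pfn of the first linked node, or 0
 * if no node was linked.
 */
static uint64_t model_link_nodes(struct model_node *nodes,
				 const uint64_t *pfns, size_t n)
{
	uint64_t next = 0;

	for (size_t i = n; i-- > 0; ) {
		/* 0 is reserved as the list terminator in this model */
		assert(pfns[i] != 0);
		if (nodes[i].flags)
			continue;	/* busy node: left out of the list */
		nodes[i].node_pfn = next;
		next = pfns[i];
	}
	return next;
}
```

Linking by pfn rather than by pointer is what lets the list survive the reboot, since a pfn stays meaningful after the page structs have been reinitialized.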
Originally-by: Vladimir Davydov
Signed-off-by: Anthony Yznaga
---
 mm/pkram.c | 50 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/mm/pkram.c b/mm/pkram.c
index eac8cf6b0cdf..da166cb6afb7 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -2,12 +2,16 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
 #include
+#include
 #include
+#include
 #include
+#include
 #include
 #include
 #include
@@ -60,11 +64,15 @@ struct pkram_obj {
  * singly-linked list of PKRAM link structures (see above), the node has a
  * pointer to the head of.
  *
+ * To facilitate data restore in the new kernel, before reboot all PKRAM nodes
+ * are organized into a list singly-linked by pfn's (see pkram_reboot()).
+ *
  * The structure occupies a memory page.
  */
 struct pkram_node {
 	__u32 flags;
 	__u64 obj_pfn; /* points to the first obj of the node */
+	__u64 node_pfn; /* points to the next node in the node list */

 	__u8 name[PKRAM_NAME_MAX];
 };
@@ -73,6 +81,10 @@ struct pkram_node {
 #define PKRAM_LOAD 2
 #define PKRAM_ACCMODE_MASK 3

+/*
+ * For convenience sake PKRAM nodes are kept in an auxiliary doubly-linked list
+ * connected through the lru field of the page struct.
+ */
 static LIST_HEAD(pkram_nodes); /* linked through page::lru */
 static DEFINE_MUTEX(pkram_mutex); /* serializes open/close */

@@ -796,3 +808,41 @@ size_t pkram_read(struct pkram_access *pa, void *buf, size_t count)
 	}
 	return read_count;
 }
+
+/*
+ * Build the list of PKRAM nodes.
+ */
+static void __pkram_reboot(void)
+{
+	struct page *page;
+	struct pkram_node *node;
+	unsigned long node_pfn = 0;
+
+	list_for_each_entry_reverse(page, &pkram_nodes, lru) {
+		node = page_address(page);
+		if (WARN_ON(node->flags & PKRAM_ACCMODE_MASK))
+			continue;
+		node->node_pfn = node_pfn;
+		node_pfn = page_to_pfn(page);
+	}
+}
+
+static int pkram_reboot(struct notifier_block *notifier,
+		       unsigned long val, void *v)
+{
+	if (val != SYS_RESTART)
+		return NOTIFY_DONE;
+	__pkram_reboot();
+	return NOTIFY_OK;
+}
+
+static struct notifier_block pkram_reboot_notifier = {
+	.notifier_call = pkram_reboot,
+};
+
+static int __init pkram_init(void)
+{
+	register_reboot_notifier(&pkram_reboot_notifier);
+	return 0;
+}
+module_init(pkram_init);

From patchwork Thu Apr 27 00:08:43 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225031
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9DA1BC77B60 for ; Thu, 27 Apr 2023 00:09:49 +0000 (UTC)
Received: by kanga.kvack.org (Postfix) id 2BD496B0074; Wed, 26 Apr 2023 20:09:47 -0400 (EDT)
Received: by kanga.kvack.org (Postfix, from userid 40) id 26C076B0075; Wed, 26 Apr 2023 20:09:47 -0400 (EDT)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id 0BEEE6B0078; Wed, 26 Apr 2023 20:09:47 -0400 (EDT)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id ED3516B0074 for ; Wed, 26 Apr 2023 20:09:46 -0400 (EDT)
Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay05.hostedemail.com (Postfix) with ESMTP id BF95A40531 for ; Thu, 27 Apr 2023 00:09:46 +0000 (UTC)
X-FDA:
80725237572.02.284907F
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
 hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
 ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
 jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
 fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 07/21] mm: PKRAM: introduce super block
Date: Wed, 26 Apr 2023 17:08:43 -0700
Message-Id: <1682554137-13938-8-git-send-email-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 1.9.4
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

The PKRAM super block is the starting point for restoring preserved
memory. By providing the super block to the new kernel at boot time,
preserved memory can be reserved and made available to be restored.
To point the kernel to the location of the super block, one passes
its pfn via the 'pkram' boot param. For that purpose, the pkram super
block pfn is exported via /sys/kernel/pkram. If none is passed, any
preserved memory will not be kept, and a new super block will be
allocated.
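Because the pfn is handed over as text on the command line, the receiving side has to parse it as a hexadecimal number. A rough userspace analogue is sketched below as an illustration only: model_parse_sb_pfn() is a hypothetical name, and strtoul() is used here as a stand-in whose edge-case behaviour (leading whitespace, a sign) is more lenient than the kernel helper the patch calls.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical userspace analogue of parsing "pkram=<hex pfn>".
 * Returns 0 and stores the value on success, a negative errno otherwise.
 */
static int model_parse_sb_pfn(const char *arg, unsigned long *pfn)
{
	char *end;
	unsigned long val;

	assert(arg && pfn);

	if (*arg == '\0')
		return -EINVAL;

	errno = 0;
	val = strtoul(arg, &end, 16);	/* base 16; an optional 0x prefix is accepted */
	if (errno == ERANGE)
		return -ERANGE;
	if (end == arg || *end != '\0')
		return -EINVAL;		/* no digits, or trailing garbage */

	*pfn = val;
	return 0;
}
```

Rejecting trailing characters keeps a mistyped parameter from silently selecting some unrelated page as the super block.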
Originally-by: Vladimir Davydov
Signed-off-by: Anthony Yznaga
---
 mm/pkram.c | 102 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 100 insertions(+), 2 deletions(-)

diff --git a/mm/pkram.c b/mm/pkram.c
index da166cb6afb7..c66b2ae4d520 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -5,15 +5,18 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
 #include
 #include
+#include
 #include
 #include
 #include
 #include
+#include
 #include
 
 #include "internal.h"
@@ -82,12 +85,38 @@ struct pkram_node {
 #define PKRAM_ACCMODE_MASK 3
 
 /*
+ * The PKRAM super block contains data needed to restore the preserved memory
+ * structure on boot. The pointer to it (pfn) should be passed via the 'pkram'
+ * boot param if one wants to restore preserved data saved by the previously
+ * executing kernel. For that purpose the kernel exports the pfn via
+ * /sys/kernel/pkram. If none is passed, preserved memory if any will not be
+ * preserved and a new clean page will be allocated for the super block.
+ *
+ * The structure occupies a memory page.
+ */
+struct pkram_super_block {
+	__u64 node_pfn; /* first element of the node list */
+};
+
+static unsigned long pkram_sb_pfn __initdata;
+static struct pkram_super_block *pkram_sb;
+
+/*
  * For convenience sake PKRAM nodes are kept in an auxiliary doubly-linked list
  * connected through the lru field of the page struct.
  */
 static LIST_HEAD(pkram_nodes); /* linked through page::lru */
 static DEFINE_MUTEX(pkram_mutex); /* serializes open/close */
 
+/*
+ * The PKRAM super block pfn, see above.
+ */
+static int __init parse_pkram_sb_pfn(char *arg)
+{
+	return kstrtoul(arg, 16, &pkram_sb_pfn);
+}
+early_param("pkram", parse_pkram_sb_pfn);
+
 static inline struct page *pkram_alloc_page(gfp_t gfp_mask)
 {
 	return alloc_page(gfp_mask);
@@ -270,6 +299,7 @@ static void pkram_stream_init(struct pkram_stream *ps,
  * @gfp_mask specifies the memory allocation mask to be used when saving data.
  *
  * Error values:
+ *	%ENODEV: PKRAM not available
  *	%ENAMETOOLONG: name len >= PKRAM_NAME_MAX
  *	%ENOMEM: insufficient memory available
  *	%EEXIST: node with specified name already exists
@@ -285,6 +315,9 @@ int pkram_prepare_save(struct pkram_stream *ps, const char *name, gfp_t gfp_mask
 	struct pkram_node *node;
 	int err = 0;
 
+	if (!pkram_sb)
+		return -ENODEV;
+
 	if (strlen(name) >= PKRAM_NAME_MAX)
 		return -ENAMETOOLONG;
 
@@ -404,6 +437,7 @@ void pkram_discard_save(struct pkram_stream *ps)
  * Returns 0 on success, -errno on failure.
  *
  * Error values:
+ *	%ENODEV: PKRAM not available
  *	%ENOENT: node with specified name does not exist
  *	%EBUSY: save to required node has not finished yet
  *
@@ -414,6 +448,9 @@ int pkram_prepare_load(struct pkram_stream *ps, const char *name)
 	struct pkram_node *node;
 	int err = 0;
 
+	if (!pkram_sb)
+		return -ENODEV;
+
 	mutex_lock(&pkram_mutex);
 	node = pkram_find_node(name);
 	if (!node) {
@@ -825,6 +862,13 @@ static void __pkram_reboot(void)
 		node->node_pfn = node_pfn;
 		node_pfn = page_to_pfn(page);
 	}
+
+	/*
+	 * Zero out pkram_sb completely since it may have been passed from
+	 * the previous boot.
+	 */
+	memset(pkram_sb, 0, PAGE_SIZE);
+	pkram_sb->node_pfn = node_pfn;
 }
 
 static int pkram_reboot(struct notifier_block *notifier,
@@ -832,7 +876,8 @@ static int pkram_reboot(struct notifier_block *notifier,
 {
 	if (val != SYS_RESTART)
 		return NOTIFY_DONE;
-	__pkram_reboot();
+	if (pkram_sb)
+		__pkram_reboot();
 	return NOTIFY_OK;
 }
 
@@ -840,9 +885,62 @@ static int pkram_reboot(struct notifier_block *notifier,
 	.notifier_call = pkram_reboot,
 };
 
+static ssize_t show_pkram_sb_pfn(struct kobject *kobj,
+		struct kobj_attribute *attr, char *buf)
+{
+	unsigned long pfn = pkram_sb ? PFN_DOWN(__pa(pkram_sb)) : 0;
+
+	return sprintf(buf, "%lx\n", pfn);
+}
+
+static struct kobj_attribute pkram_sb_pfn_attr =
+	__ATTR(pkram, 0444, show_pkram_sb_pfn, NULL);
+
+static struct attribute *pkram_attrs[] = {
+	&pkram_sb_pfn_attr.attr,
+	NULL,
+};
+
+static struct attribute_group pkram_attr_group = {
+	.attrs = pkram_attrs,
+};
+
+/* returns non-zero on success */
+static int __init pkram_init_sb(void)
+{
+	unsigned long pfn;
+	struct pkram_node *node;
+
+	if (!pkram_sb) {
+		struct page *page;
+
+		page = pkram_alloc_page(GFP_KERNEL | __GFP_ZERO);
+		if (!page) {
+			pr_err("PKRAM: Failed to allocate super block\n");
+			return 0;
+		}
+		pkram_sb = page_address(page);
+	}
+
+	/*
+	 * Build auxiliary doubly-linked list of nodes connected through
+	 * page::lru for convenience sake.
+	 */
+	pfn = pkram_sb->node_pfn;
+	while (pfn) {
+		node = pfn_to_kaddr(pfn);
+		pkram_insert_node(node);
+		pfn = node->node_pfn;
+	}
+	return 1;
+}
+
 static int __init pkram_init(void)
 {
-	register_reboot_notifier(&pkram_reboot_notifier);
+	if (pkram_init_sb()) {
+		register_reboot_notifier(&pkram_reboot_notifier);
+		sysfs_update_group(kernel_kobj, &pkram_attr_group);
+	}
 	return 0;
 }
 module_init(pkram_init);

From patchwork Thu Apr 27 00:08:44 2023
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225037
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org, ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com, jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com, fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 08/21] PKRAM: track preserved pages in a physical mapping pagetable
Date: Wed, 26 Apr 2023 17:08:44 -0700
Message-Id: <1682554137-13938-9-git-send-email-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 1.9.4
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>

Later patches in this series will need a way to efficiently identify
physically contiguous ranges of preserved pages independent of their
virtual addresses. To facilitate this, all pages to be preserved across
kexec are added to a pseudo identity mapping pagetable.

The pagetable makes use of the existing architecture definitions for
building a memory mapping pagetable except that a bitmap is used to
represent the presence or absence of preserved pages at the PTE level.

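The PTE-level bitmap idea can be sketched in userspace C as follows.
This is an illustration only, not the patch's code: the sketch_* names
are hypothetical, 4 KiB pages with 512 entries per PTE page are assumed
(as on x86-64), and the upper pagetable levels are left out entirely.
It shows one bit per page and a walk that reports each run of set bits
as a single physically contiguous (base, size) range through a callback.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE	4096UL
#define SKETCH_PTRS_PER_PTE	512	/* assumption: x86-64, 4 KiB pages */
#define SKETCH_BITS_PER_LONG	(sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BM_LONGS \
	((SKETCH_PTRS_PER_PTE + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* One bit per page in the region a single PMD entry would cover. */
struct sketch_pte_bitmap {
	unsigned long bits[SKETCH_BM_LONGS];
};

/* Accumulator used by the example callback below. */
struct sketch_totals {
	unsigned int nr_ranges;
	unsigned long bytes;
	unsigned long first_base;
};

typedef int (*sketch_range_cb)(unsigned long base, unsigned long size,
			       void *private);

/* Mark the page at the given index as preserved. */
void sketch_mark_page(struct sketch_pte_bitmap *bm, unsigned int index)
{
	assert(index < SKETCH_PTRS_PER_PTE);
	bm->bits[index / SKETCH_BITS_PER_LONG] |=
		1UL << (index % SKETCH_BITS_PER_LONG);
}

/* Return 1 if the page at the given index is marked, 0 otherwise. */
int sketch_test_page(const struct sketch_pte_bitmap *bm, unsigned int index)
{
	return (bm->bits[index / SKETCH_BITS_PER_LONG] >>
		(index % SKETCH_BITS_PER_LONG)) & 1UL;
}

/*
 * Report each maximal run of set bits as one (base, size) range.
 * region_base is the physical address that bit 0 stands for.
 * Stops early and returns the callback's value if it is non-zero.
 */
int sketch_walk_bitmap(const struct sketch_pte_bitmap *bm,
		       unsigned long region_base,
		       sketch_range_cb cb, void *private)
{
	unsigned int i, run_start = 0;
	int tracking = 0, ret;

	/* Iterate one past the end so a run ending at the last bit is flushed. */
	for (i = 0; i <= SKETCH_PTRS_PER_PTE; i++) {
		int present = i < SKETCH_PTRS_PER_PTE && sketch_test_page(bm, i);

		if (present && !tracking) {
			run_start = i;
			tracking = 1;
		} else if (!present && tracking) {
			tracking = 0;
			ret = cb(region_base + run_start * SKETCH_PAGE_SIZE,
				 (i - run_start) * SKETCH_PAGE_SIZE, private);
			if (ret)
				return ret;
		}
	}
	return 0;
}

/* Example callback: count the ranges and sum their sizes. */
int sketch_sum_range(unsigned long base, unsigned long size, void *private)
{
	struct sketch_totals *t = private;

	if (t->nr_ranges == 0)
		t->first_base = base;
	t->nr_ranges++;
	t->bytes += size;
	return 0;
}
```

Marking pages 3, 4, 5 and 511 and walking the bitmap would invoke the
callback twice, once for the three-page run and once for the last page.
Keeping only a bit per page makes the leaf level compact while still
letting a walker coalesce neighbours into ranges.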
Signed-off-by: Anthony Yznaga
---
 mm/Makefile          |   4 +-
 mm/pkram.c           |  30 ++++-
 mm/pkram_pagetable.c | 375 +++++++++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 404 insertions(+), 5 deletions(-)
 create mode 100644 mm/pkram_pagetable.c

diff --git a/mm/Makefile b/mm/Makefile
index 7a8d5a286d48..7a1a33b67de6 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -138,5 +138,5 @@ obj-$(CONFIG_IO_MAPPING) += io-mapping.o
 obj-$(CONFIG_HAVE_BOOTMEM_INFO_NODE) += bootmem_info.o
 obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
 obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
-obj-$(CONFIG_PKRAM) += pkram.o
->>>>>>> mm: add PKRAM API stubs and Kconfig
+obj-$(CONFIG_PKRAM) += pkram.o pkram_pagetable.o
+>>>>>>> PKRAM: track preserved pages in a physical mapping pagetable
diff --git a/mm/pkram.c b/mm/pkram.c
index c66b2ae4d520..e6c0f3c52465 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -101,6 +101,9 @@ struct pkram_super_block {
 static unsigned long pkram_sb_pfn __initdata;
 static struct pkram_super_block *pkram_sb;
 
+extern int pkram_add_identity_map(struct page *page);
+extern void pkram_remove_identity_map(struct page *page);
+
 /*
  * For convenience sake PKRAM nodes are kept in an auxiliary doubly-linked list
  * connected through the lru field of the page struct.
@@ -119,11 +122,24 @@ static int __init parse_pkram_sb_pfn(char *arg)
 
 static inline struct page *pkram_alloc_page(gfp_t gfp_mask)
 {
-	return alloc_page(gfp_mask);
+	struct page *page;
+	int err;
+
+	page = alloc_page(gfp_mask);
+	if (page) {
+		err = pkram_add_identity_map(page);
+		if (err) {
+			__free_page(page);
+			page = NULL;
+		}
+	}
+
+	return page;
 }
 
 static inline void pkram_free_page(void *addr)
 {
+	pkram_remove_identity_map(virt_to_page(addr));
 	free_page((unsigned long)addr);
 }
 
@@ -161,6 +177,7 @@ static void pkram_truncate_link(struct pkram_link *link)
 		if (!p)
 			continue;
 		page = pfn_to_page(PHYS_PFN(p));
+		pkram_remove_identity_map(page);
 		put_page(page);
 	}
 }
@@ -610,10 +627,15 @@ int pkram_save_folio(struct pkram_access *pa, struct folio *folio)
 {
 	struct pkram_node *node = pa->ps->node;
 	struct page *page = folio_page(folio, 0);
+	int err;
 
 	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_SAVE);
 
-	return __pkram_save_page(pa, page, page->index);
+	err = __pkram_save_page(pa, page, page->index);
+	if (!err)
+		err = pkram_add_identity_map(page);
+
+	return err;
 }
 
 static int __pkram_bytes_save_page(struct pkram_access *pa, struct page *page)
@@ -658,6 +680,8 @@ static struct page *__pkram_prep_load_page(pkram_entry_t p)
 
 	page_ref_unfreeze(page, 1);
 
+	pkram_remove_identity_map(page);
+
 	return page;
 
 out_error:
@@ -914,7 +938,7 @@ static int __init pkram_init_sb(void)
 	if (!pkram_sb) {
 		struct page *page;
 
-		page = pkram_alloc_page(GFP_KERNEL | __GFP_ZERO);
+		page = alloc_page(GFP_KERNEL | __GFP_ZERO);
 		if (!page) {
 			pr_err("PKRAM: Failed to allocate super block\n");
 			return 0;
diff --git a/mm/pkram_pagetable.c b/mm/pkram_pagetable.c
new file mode 100644
index 000000000000..85e34301ef1e
--- /dev/null
+++ b/mm/pkram_pagetable.c
@@ -0,0 +1,375 @@
+// SPDX-License-Identifier: GPL-2.0
+#include
+#include
+
+static pgd_t *pkram_pgd;
+static DEFINE_SPINLOCK(pkram_pgd_lock);
+
+#define set_p4d(p4dp, p4d) WRITE_ONCE(*(p4dp), (p4d))
+
+#define PKRAM_PTE_BM_BYTES (PTRS_PER_PTE / BITS_PER_BYTE)
+#define PKRAM_PTE_BM_MASK (PAGE_SIZE / PKRAM_PTE_BM_BYTES - 1)
+
+static pmd_t make_bitmap_pmd(unsigned long *bitmap)
+{
+	unsigned long val;
+
+	val = __pa(ALIGN_DOWN((unsigned long)bitmap, PAGE_SIZE));
+	val |= (((unsigned long)bitmap & ~PAGE_MASK) / PKRAM_PTE_BM_BYTES);
+
+	return __pmd(val);
+}
+
+static unsigned long *get_bitmap_addr(pmd_t pmd)
+{
+	unsigned long val, off;
+
+	val = pmd_val(pmd);
+	off = (val & PKRAM_PTE_BM_MASK) * PKRAM_PTE_BM_BYTES;
+
+	val = (val & PAGE_MASK) + off;
+
+	return __va(val);
+}
+
+int pkram_add_identity_map(struct page *page)
+{
+	unsigned long paddr;
+	unsigned long *bitmap;
+	unsigned int index;
+	struct page *pg;
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud;
+	pmd_t *pmd;
+
+	if (!pkram_pgd) {
+		spin_lock(&pkram_pgd_lock);
+		if (!pkram_pgd) {
+			pg = alloc_page(GFP_ATOMIC|__GFP_ZERO);
+			if (!pg)
+				goto nomem;
+			pkram_pgd = page_address(pg);
+		}
+		spin_unlock(&pkram_pgd_lock);
+	}
+
+	paddr = __pa(page_address(page));
+	pgd = pkram_pgd;
+	pgd += pgd_index(paddr);
+	if (pgd_none(*pgd)) {
+		spin_lock(&pkram_pgd_lock);
+		if (pgd_none(*pgd)) {
+			pg = alloc_page(GFP_ATOMIC|__GFP_ZERO);
+			if (!pg)
+				goto nomem;
+			p4d = page_address(pg);
+			set_pgd(pgd, __pgd(__pa(p4d)));
+		}
+		spin_unlock(&pkram_pgd_lock);
+	}
+	p4d = p4d_offset(pgd, paddr);
+	if (p4d_none(*p4d)) {
+		spin_lock(&pkram_pgd_lock);
+		if (p4d_none(*p4d)) {
+			pg = alloc_page(GFP_ATOMIC|__GFP_ZERO);
+			if (!pg)
+				goto nomem;
+			pud = page_address(pg);
+			set_p4d(p4d, __p4d(__pa(pud)));
+		}
+		spin_unlock(&pkram_pgd_lock);
+	}
+	pud = pud_offset(p4d, paddr);
+	if (pud_none(*pud)) {
+		spin_lock(&pkram_pgd_lock);
+		if (pud_none(*pud)) {
+			pg = alloc_page(GFP_ATOMIC|__GFP_ZERO);
+			if (!pg)
+				goto nomem;
+			pmd = page_address(pg);
+			set_pud(pud, __pud(__pa(pmd)));
+		}
+		spin_unlock(&pkram_pgd_lock);
+	}
+	pmd = pmd_offset(pud, paddr);
+	if (pmd_none(*pmd)) {
+		spin_lock(&pkram_pgd_lock);
+		if (pmd_none(*pmd)) {
+			if (PageTransHuge(page)) {
+				set_pmd(pmd, pmd_mkhuge(*pmd));
+				spin_unlock(&pkram_pgd_lock);
+				goto done;
+			}
+			bitmap = bitmap_zalloc(PTRS_PER_PTE, GFP_ATOMIC);
+			if (!bitmap)
+				goto nomem;
+			set_pmd(pmd, make_bitmap_pmd(bitmap));
+		} else {
+			BUG_ON(pmd_leaf(*pmd));
+			bitmap = get_bitmap_addr(*pmd);
+		}
+		spin_unlock(&pkram_pgd_lock);
+	} else {
+		BUG_ON(pmd_leaf(*pmd));
+		bitmap = get_bitmap_addr(*pmd);
+	}
+
+	index = pte_index(paddr);
+	BUG_ON(test_bit(index, bitmap));
+	set_bit(index, bitmap);
+	smp_mb__after_atomic();
+	if (bitmap_full(bitmap, PTRS_PER_PTE))
+		set_pmd(pmd, pmd_mkhuge(*pmd));
+done:
+	return 0;
+nomem:
+	return -ENOMEM;
+}
+
+void pkram_remove_identity_map(struct page *page)
+{
+	unsigned long *bitmap;
+	unsigned long paddr;
+	unsigned int index;
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud;
+	pmd_t *pmd;
+
+	/*
+	 * pkram_pgd will be null when freeing metadata pages after a reboot
+	 */
+	if (!pkram_pgd)
+		return;
+
+	paddr = __pa(page_address(page));
+	pgd = pkram_pgd;
+	pgd += pgd_index(paddr);
+	if (pgd_none(*pgd)) {
+		WARN_ONCE(1, "PKRAM: %s: no pgd for 0x%lx\n", __func__, paddr);
+		return;
+	}
+	p4d = p4d_offset(pgd, paddr);
+	if (p4d_none(*p4d)) {
+		WARN_ONCE(1, "PKRAM: %s: no p4d for 0x%lx\n", __func__, paddr);
+		return;
+	}
+	pud = pud_offset(p4d, paddr);
+	if (pud_none(*pud)) {
+		WARN_ONCE(1, "PKRAM: %s: no pud for 0x%lx\n", __func__, paddr);
+		return;
+	}
+	pmd = pmd_offset(pud, paddr);
+	if (pmd_none(*pmd)) {
+		WARN_ONCE(1, "PKRAM: %s: no pmd for 0x%lx\n", __func__, paddr);
+		return;
+	}
+	if (PageTransHuge(page)) {
+		BUG_ON(!pmd_leaf(*pmd));
+		pmd_clear(pmd);
+		return;
+	}
+
+	if (pmd_leaf(*pmd)) {
+		spin_lock(&pkram_pgd_lock);
+		if (pmd_leaf(*pmd))
+			set_pmd(pmd, __pmd(pte_val(pte_clrhuge(*(pte_t *)pmd))));
+		spin_unlock(&pkram_pgd_lock);
+	}
+
+	bitmap = get_bitmap_addr(*pmd);
+	index = pte_index(paddr);
+	clear_bit(index, bitmap);
+	smp_mb__after_atomic();
+
+	spin_lock(&pkram_pgd_lock);
+	if (!pmd_none(*pmd) && bitmap_empty(bitmap, PTRS_PER_PTE)) {
+		pmd_clear(pmd);
+		spin_unlock(&pkram_pgd_lock);
+		bitmap_free(bitmap);
+	} else {
+		spin_unlock(&pkram_pgd_lock);
+	}
+}
+
+struct pkram_pg_state {
+	int (*range_cb)(unsigned long base, unsigned long size, void *private);
+	unsigned long start_addr;
+	unsigned long curr_addr;
+	unsigned long min_addr;
+	unsigned long max_addr;
+	void *private;
+	bool tracking;
+};
+
+#define pgd_none(a) (pgtable_l5_enabled() ? pgd_none(a) : p4d_none(__p4d(pgd_val(a))))
+
+static int note_page(struct pkram_pg_state *st, unsigned long addr, bool present)
+{
+	if (!st->tracking && present) {
+		if (addr >= st->max_addr)
+			return 1;
+		/*
+		 * addr can be < min_addr if the page straddles the
+		 * boundary
+		 */
+		st->start_addr = max(addr, st->min_addr);
+		st->tracking = true;
+	} else if (st->tracking) {
+		unsigned long base, size;
+		int ret;
+
+		/* Continue tracking if upper bound has not been reached */
+		if (present && addr < st->max_addr)
+			return 0;
+
+		addr = min(addr, st->max_addr);
+
+		base = st->start_addr;
+		size = addr - st->start_addr;
+		st->tracking = false;
+
+		ret = st->range_cb(base, size, st->private);
+
+		if (addr == st->max_addr)
+			return 1;
+		else
+			return ret;
+	}
+
+	return 0;
+}
+
+static int walk_pte_level(struct pkram_pg_state *st, pmd_t addr, unsigned long P)
+{
+	unsigned long *bitmap;
+	int present;
+	int i, ret;
+
+	bitmap = get_bitmap_addr(addr);
+	for (i = 0; i < PTRS_PER_PTE; i++) {
+		unsigned long curr_addr = P + i * PAGE_SIZE;
+
+		if (curr_addr < st->min_addr)
+			continue;
+		present = test_bit(i, bitmap);
+		ret = note_page(st, curr_addr, present);
+		if (ret)
+			break;
+	}
+
+	return ret;
+}
+
+static int walk_pmd_level(struct pkram_pg_state *st, pud_t addr, unsigned long P)
+{
+	pmd_t *start;
+	int i, ret;
+
+	start = pud_pgtable(addr);
+	for (i = 0; i < PTRS_PER_PMD; i++, start++) {
+		unsigned long curr_addr = P + i * PMD_SIZE;
+
+		if (curr_addr + PMD_SIZE <= st->min_addr)
+			continue;
+		if (!pmd_none(*start)) {
+			if (pmd_leaf(*start))
+				ret = note_page(st, curr_addr, true);
+			else
+				ret = walk_pte_level(st, *start, curr_addr);
+		} else
+			ret = note_page(st, curr_addr, false);
+		if (ret)
+			break;
+	}
+
+	return ret;
+}
+
+static int walk_pud_level(struct pkram_pg_state *st, p4d_t addr, unsigned long P)
+{
+	pud_t *start;
+	int i, ret;
+
+	start = p4d_pgtable(addr);
+	for (i = 0; i < PTRS_PER_PUD; i++, start++) {
+		unsigned long curr_addr = P + i * PUD_SIZE;
+
+		if (curr_addr + PUD_SIZE <= st->min_addr)
+			continue;
+		if (!pud_none(*start)) {
+			if (pud_leaf(*start))
+				ret = note_page(st, curr_addr, true);
+			else
+				ret = walk_pmd_level(st, *start, curr_addr);
+		} else
+			ret = note_page(st, curr_addr, false);
+		if (ret)
+			break;
+	}
+
+	return ret;
+}
+
+static int walk_p4d_level(struct pkram_pg_state *st, pgd_t addr, unsigned long P)
+{
+	p4d_t *start;
+	int i, ret;
+
+	if (PTRS_PER_P4D == 1)
+		return walk_pud_level(st, __p4d(pgd_val(addr)), P);
+
+	start = (p4d_t *)pgd_page_vaddr(addr);
+	for (i = 0; i < PTRS_PER_P4D; i++, start++) {
+		unsigned long curr_addr = P + i * P4D_SIZE;
+
+		if (curr_addr + P4D_SIZE <= st->min_addr)
+			continue;
+		if (!p4d_none(*start)) {
+			if (p4d_leaf(*start))
+				ret = note_page(st, curr_addr, true);
+			else
+				ret = walk_pud_level(st, *start, curr_addr);
+		} else
+			ret = note_page(st, curr_addr, false);
+		if (ret)
+			break;
+	}
+
+	return ret;
+}
+
+void pkram_walk_pgt(struct pkram_pg_state *st, pgd_t *pgd)
+{
+	pgd_t *start = pgd;
+	int i, ret = 0;
+
+	for (i = 0; i < PTRS_PER_PGD; i++, start++) {
+		unsigned long curr_addr = i * PGDIR_SIZE;
+
+		if (curr_addr + PGDIR_SIZE <= st->min_addr)
+			continue;
+		if (!pgd_none(*start))
+			ret = walk_p4d_level(st, *start, curr_addr);
+		else
+			ret = note_page(st, curr_addr, false);
+		if (ret)
+			break;
+	}
+}
+
+void pkram_find_preserved(unsigned long start, unsigned long end, void *private, int (*callback)(unsigned long base, unsigned long size, void *private))
+{
+	struct pkram_pg_state st = {
+		.range_cb = callback,
+		.min_addr = start,
+		.max_addr = end,
+		.private = private,
+ }; + + if (!pkram_pgd) + return; + + pkram_walk_pgt(&st, pkram_pgd); +} From patchwork Thu Apr 27 00:08:45 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Yznaga X-Patchwork-Id: 13225043 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 909C2C77B60 for ; Thu, 27 Apr 2023 00:10:21 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id C33DF6B0088; Wed, 26 Apr 2023 20:10:01 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id B6ED46B008A; Wed, 26 Apr 2023 20:10:01 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 971116B008C; Wed, 26 Apr 2023 20:10:01 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id 84E976B0088 for ; Wed, 26 Apr 2023 20:10:01 -0400 (EDT) Received: from smtpin28.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id 5F48E1C6B08 for ; Thu, 27 Apr 2023 00:10:01 +0000 (UTC) X-FDA: 80725238202.28.A44D034 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) by imf04.hostedemail.com (Postfix) with ESMTP id 6F08E40003 for ; Thu, 27 Apr 2023 00:09:59 +0000 (UTC) Authentication-Results: imf04.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2023-03-30 header.b=Zggh1E8c; spf=pass (imf04.hostedemail.com: domain of anthony.yznaga@oracle.com designates 205.220.177.32 as permitted sender) smtp.mailfrom=anthony.yznaga@oracle.com; dmarc=pass (policy=none) header.from=oracle.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1682554199; a=rsa-sha256; cv=none; 
b=5dSCzg4NQv2vhtqJhAEyWfKsKEsK2WaaN8vmKF/U31tAqFoDkbeuw3Nc3bUqwQcgYyOT/V G1pA3nEZkLAOcl9LEFwILEUdD4jl6CQx3z3HfGs0k1EQv3wkUc2UKiU9hK2YwE8KAiiKuK bUMWFNDwBZeA8mPCWW0vQPjm4WHghsQ= ARC-Authentication-Results: i=1; imf04.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2023-03-30 header.b=Zggh1E8c; spf=pass (imf04.hostedemail.com: domain of anthony.yznaga@oracle.com designates 205.220.177.32 as permitted sender) smtp.mailfrom=anthony.yznaga@oracle.com; dmarc=pass (policy=none) header.from=oracle.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1682554199; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:content-type: content-transfer-encoding:in-reply-to:in-reply-to: references:references:dkim-signature; bh=TFcTuHsIv/plzZmdlp1y8kigeEeyXo4KMEAxl9EnHkE=; b=HwiBaIri1UClqJe1LcH6cBpivOYux0BNy0FvL5/GQnk8uc+9DVzP31amjykQ44hrZnOxXM Hh6fJG5dACzValg1vjHWx5ehKNNapwJGZaCi8fS7u7luT5DjQrIG2quHDmHlxya0rB3I3u 7sEOFbAx65hgSYdNStcHNtch85W7NNE= Received: from pps.filterd (m0246632.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 33QGwraY017018; Thu, 27 Apr 2023 00:09:19 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2023-03-30; bh=TFcTuHsIv/plzZmdlp1y8kigeEeyXo4KMEAxl9EnHkE=; b=Zggh1E8czE4x4ifpjvsDspPPPFrTZtGY0vjQy73xEQSQJCKBnE4cJTj4k5BJ5EXzz1Su KsiwjlUDFJNM0uU2OJB/zM5Bm9iqCJeuDWrpQ/MKoHxEXp0zMW1mIqKa3m5N4CQormOS z7FEU3qe7EeyswcxhsWF8wXz9MgbSVFiSArjA8dRyMlNb/JnPJC7N6VrgImA1wYTkyxp ctCaI/TG+W5xh1cBKNImbW8wT7qmLf4zR/nm8RB1KqjFdQRqFLyQoSuKmPAtJLhdvsAQ FQE42ee6cVT5V1dU/kk3KMCHIgvH2DN3bPq0HhRluiybIMW2fDFh36w+3T5LamXb1skG hA== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 3q476u2ng3-1 (version=TLSv1.2 
cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 27 Apr 2023 00:09:18 +0000 Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 33QLgiYb007654; Thu, 27 Apr 2023 00:09:18 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 3q4618mpgv-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 27 Apr 2023 00:09:17 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 33R0938e013888; Thu, 27 Apr 2023 00:09:17 GMT Received: from ca-qasparc-x86-2.us.oracle.com (ca-qasparc-x86-2.us.oracle.com [10.147.24.103]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 3q4618mp42-10; Thu, 27 Apr 2023 00:09:17 +0000 From: Anthony Yznaga To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org, ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com, jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com, fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org Subject: [RFC v3 09/21] PKRAM: pass a list of preserved ranges to the next kernel Date: Wed, 26 Apr 2023 17:08:45 -0700 Message-Id: <1682554137-13938-10-git-send-email-anthony.yznaga@oracle.com> X-Mailer: git-send-email 1.9.4 In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com> References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 
definitions=2023-04-26_10,2023-04-26_03,2023-02-09_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 phishscore=0 bulkscore=0 mlxlogscore=999 malwarescore=0 mlxscore=0 spamscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2303200000 definitions=main-2304270000 X-Proofpoint-GUID: xTHpC9CzAHAd5dZprMjOHVznQrKD_GfJ X-Proofpoint-ORIG-GUID: xTHpC9CzAHAd5dZprMjOHVznQrKD_GfJ X-Rspam-User: X-Rspamd-Queue-Id: 6F08E40003 X-Rspamd-Server: rspam01 X-Stat-Signature: o4ki9t5uc84bqkmktm8xhqc8skixoonc X-HE-Tag: 1682554199-534567 X-HE-Meta: U2FsdGVkX18hsWRXMLdWyh+m9mS+Lej/uGpuehkW1QSmiYQkd2GPWkPed9eCsVYlJOa2r7UPQQOcJjCeqWyqLvkbSkt8qwdm3L4GC9v+gXfb9GVCQWHY5RN3U6ZP7JG3pKC7WDUdu9/G/+FFed7Qxsje6+olE2QHHOUc8Dn+2RuZ/zqFjWxkc1po54Ov3A6DFHWjegE2LEIKEPje1jTM7aJz0UXOYDoCSNV68k3SLoh1lMH+TQu6ce/pY7/0OswhQkjDh5hQAzoay3coBZGJkZKIfxFKg33UhUqN3VX5QpU8cPzQa1B6F8DQ2Ii1KPSkBB6X3PMSp2Z3wL20fnt2sZ6w2m9/SQ5hyJXMmbWOfAK3LBp/DBrGtGODiedj6kOQjHR/iWfRfxeoBZKWwJAeSWiNP/2X8vbOAz0wD9xwKwsavWdJPwb1XkIH2Z+deE/pwtB/Yo4K8GkSbz0/mYXHWtUW6l+pAoTzwR0VrBuG6FI50TN094g2a/MhEyoDReIbJ4jJIC0Zt9X+aVzDN0gfsBoy9uin6uRWmlo6LBmui+8j+ysk5tVQ/U7ZHsgLQZlPSYdCR7Y9zkHAUKzrSMSTQ3iqNj+BdAfQG38e7CrH9zSUS+nyAZT1A05rQGFq6xx/5kmYZtY2cxruzPpL7hvtYYiLNi7vOmUQSntneHA+lp9CXI/nhV9dwNLTSzOYmj/mIbMVtjcE8t/aZyYqlB2Y4QslBuZm+OXdDZLK0ojrCSu76XERL0XCVSI2IRc2bKSjHJb6838y0F5fVweiZ4c5wBIZ1SOz6mIi2XKpJA6jdVYqzwzQ6qbAIIqez02h+vzvIm16B6g/O/BRG0SrAV7T35+Igb8BAOFZPbp+UFEN47yKVqonuWrdCeSbSDXzCr8xGCgwF6GMvEvDf0vfGeRoEC19n625X+ODZ1pvl3AwSqBl+pkJjd6QVYZMVj+16GgDTj+0eXHT4Pq+ciHBcUV C/wIm5QO ibqH0E8c7EXd6u55JWvrdghd1Ly+mGJYss+ridiKFirr9ydooOayH4XmiQxIWWxIvjClhxyjdVJv0fdtGKOwJZowmK9oGpAZSH0PPfhn2muiq+8XIC5O4WAhaeQa7wCY+95bI X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: In order to build a new memblock reserved list during boot that includes ranges preserved by the previous kernel, a list 
of preserved ranges is passed to the next kernel via the pkram
superblock. The ranges are stored in ascending order in a linked list of
pages. A more complete memblock list is not prepared to avoid possible
conflicts with changes in a newer kernel and to avoid having to allocate
a contiguous range larger than a page.

Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
---
 mm/pkram.c | 184 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 177 insertions(+), 7 deletions(-)

diff --git a/mm/pkram.c b/mm/pkram.c
index e6c0f3c52465..3790e5180feb 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -84,6 +84,20 @@ struct pkram_node {
 #define PKRAM_LOAD 2
 #define PKRAM_ACCMODE_MASK 3
 
+struct pkram_region {
+	phys_addr_t base;
+	phys_addr_t size;
+};
+
+struct pkram_region_list {
+	__u64 prev_pfn;
+	__u64 next_pfn;
+
+	struct pkram_region regions[0];
+};
+
+#define PKRAM_REGIONS_LIST_MAX \
+	((PAGE_SIZE-sizeof(struct pkram_region_list))/sizeof(struct pkram_region))
 /*
  * The PKRAM super block contains data needed to restore the preserved memory
  * structure on boot. The pointer to it (pfn) should be passed via the 'pkram'
@@ -96,13 +110,21 @@ struct pkram_node {
  */
 struct pkram_super_block {
 	__u64 node_pfn; /* first element of the node list */
+	__u64 region_list_pfn;
+	__u64 nr_regions;
 };
 
+static struct pkram_region_list *pkram_regions_list;
+static int pkram_init_regions_list(void);
+static unsigned long pkram_populate_regions_list(void);
+
 static unsigned long pkram_sb_pfn __initdata;
 static struct pkram_super_block *pkram_sb;
 
 extern int pkram_add_identity_map(struct page *page);
 extern void pkram_remove_identity_map(struct page *page);
+extern void pkram_find_preserved(unsigned long start, unsigned long end, void *private,
+			int (*callback)(unsigned long base, unsigned long size, void *private));
 
 /*
  * For convenience sake PKRAM nodes are kept in an auxiliary doubly-linked list
@@ -878,21 +900,48 @@ static void __pkram_reboot(void)
 	struct page *page;
 	struct pkram_node *node;
 	unsigned long node_pfn = 0;
+	unsigned long rl_pfn = 0;
+	unsigned long nr_regions = 0;
+	int err = 0;
 
-	list_for_each_entry_reverse(page, &pkram_nodes, lru) {
-		node = page_address(page);
-		if (WARN_ON(node->flags & PKRAM_ACCMODE_MASK))
-			continue;
-		node->node_pfn = node_pfn;
-		node_pfn = page_to_pfn(page);
+	if (!list_empty(&pkram_nodes)) {
+		err = pkram_add_identity_map(virt_to_page(pkram_sb));
+		if (err) {
+			pr_err("PKRAM: failed to add super block to pagetable\n");
+			goto done;
+		}
+		list_for_each_entry_reverse(page, &pkram_nodes, lru) {
+			node = page_address(page);
+			if (WARN_ON(node->flags & PKRAM_ACCMODE_MASK))
+				continue;
+			node->node_pfn = node_pfn;
+			node_pfn = page_to_pfn(page);
+		}
+		err = pkram_init_regions_list();
+		if (err) {
+			pr_err("PKRAM: failed to init regions list\n");
+			goto done;
+		}
+		nr_regions = pkram_populate_regions_list();
+		if (IS_ERR_VALUE(nr_regions)) {
+			err = nr_regions;
+			pr_err("PKRAM: failed to populate regions list\n");
+			goto done;
+		}
+		rl_pfn = page_to_pfn(virt_to_page(pkram_regions_list));
 	}
 
+done:
 	/*
 	 * Zero out pkram_sb completely since it may have been passed from
 	 * the previous boot.
 	 */
 	memset(pkram_sb, 0, PAGE_SIZE);
-	pkram_sb->node_pfn = node_pfn;
+	if (!err && node_pfn) {
+		pkram_sb->node_pfn = node_pfn;
+		pkram_sb->region_list_pfn = rl_pfn;
+		pkram_sb->nr_regions = nr_regions;
+	}
 }
 
 static int pkram_reboot(struct notifier_block *notifier,
@@ -968,3 +1017,124 @@ static int __init pkram_init(void)
 	return 0;
 }
 module_init(pkram_init);
+
+static int count_region_cb(unsigned long base, unsigned long size, void *private)
+{
+	unsigned long *nr_regions = (unsigned long *)private;
+
+	(*nr_regions)++;
+	return 0;
+}
+
+static unsigned long pkram_count_regions(void)
+{
+	unsigned long nr_regions = 0;
+
+	pkram_find_preserved(0, PHYS_ADDR_MAX, &nr_regions, count_region_cb);
+
+	return nr_regions;
+}
+
+/*
+ * To facilitate rapidly building a new memblock reserved list during boot
+ * with the addition of preserved memory ranges a regions list is built
+ * before reboot.
+ * The regions list is a linked list of pages with each page containing an
+ * array of preserved memory ranges. The ranges are stored in each page
+ * and across the list in address order. A linked list is used rather than
+ * a single contiguous range to mitigate against the possibility that a
+ * larger, contiguous allocation may fail due to fragmentation.
+ *
+ * Since the pages of the regions list must be preserved and the pkram
+ * pagetable is used to determine what ranges are preserved, the list pages
+ * must be allocated and represented in the pkram pagetable before they can
+ * be populated. Rather than recounting the number of regions after
+ * allocating pages and repeating until a precise number of pages are
+ * allocated, the number of pages needed is estimated.
+ */
+static int pkram_init_regions_list(void)
+{
+	struct pkram_region_list *rl;
+	unsigned long nr_regions;
+	unsigned long nr_lpages;
+	struct page *page;
+
+	nr_regions = pkram_count_regions();
+
+	nr_lpages = DIV_ROUND_UP(nr_regions, PKRAM_REGIONS_LIST_MAX);
+	nr_regions += nr_lpages;
+	nr_lpages = DIV_ROUND_UP(nr_regions, PKRAM_REGIONS_LIST_MAX);
+
+	for (; nr_lpages; nr_lpages--) {
+		page = pkram_alloc_page(GFP_KERNEL | __GFP_ZERO);
+		if (!page)
+			return -ENOMEM;
+		rl = page_address(page);
+		if (pkram_regions_list) {
+			rl->next_pfn = page_to_pfn(virt_to_page(pkram_regions_list));
+			pkram_regions_list->prev_pfn = page_to_pfn(page);
+		}
+		pkram_regions_list = rl;
+	}
+
+	return 0;
+}
+
+struct pkram_regions_priv {
+	struct pkram_region_list *curr;
+	struct pkram_region_list *last;
+	unsigned long nr_regions;
+	int idx;
+};
+
+static int add_region_cb(unsigned long base, unsigned long size, void *private)
+{
+	struct pkram_regions_priv *priv;
+	struct pkram_region_list *rl;
+	int i;
+
+	priv = (struct pkram_regions_priv *)private;
+	rl = priv->curr;
+	i = priv->idx;
+
+	if (!rl) {
+		WARN_ON(1);
+		return 1;
+	}
+
+	if (!i)
+		priv->last = priv->curr;
+
+	rl->regions[i].base = base;
+	rl->regions[i].size = size;
+
+	priv->nr_regions++;
+	i++;
+	if (i == PKRAM_REGIONS_LIST_MAX) {
+		u64 next_pfn = rl->next_pfn;
+
+		if (next_pfn)
+			priv->curr = pfn_to_kaddr(next_pfn);
+		else
+			priv->curr = NULL;
+
+		i = 0;
+	}
+	priv->idx = i;
+
+	return 0;
+}
+
+static unsigned long pkram_populate_regions_list(void)
+{
+	struct pkram_regions_priv priv = { .curr = pkram_regions_list };
+
+	pkram_find_preserved(0, PHYS_ADDR_MAX, &priv, add_region_cb);
+
+	/*
+	 * Link the first node to the last populated one.
+	 */
+	pkram_regions_list->prev_pfn = page_to_pfn(virt_to_page(priv.last));
+
+	return priv.nr_regions;
+}

From patchwork Thu Apr 27 00:08:46 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony Yznaga <anthony.yznaga@oracle.com>
X-Patchwork-Id: 13225032
From: Anthony Yznaga <anthony.yznaga@oracle.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
 hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
 ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
 jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
 fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 10/21] PKRAM: prepare for adding preserved ranges to memblock reserved
Date: Wed, 26 Apr 2023 17:08:46 -0700
Message-Id: <1682554137-13938-11-git-send-email-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 1.9.4
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>

Calling memblock_reserve() repeatedly to add preserved ranges is inefficient
and risks clobbering preserved memory if the
memblock reserved regions array must be resized. Instead, calculate the
size needed to accommodate the preserved ranges, find a suitable range for
a new reserved regions array that does not overlap any preserved range,
and populate it with a new, merged regions array.

Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
---
 mm/pkram.c | 244 +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 244 insertions(+)

diff --git a/mm/pkram.c b/mm/pkram.c
index 3790e5180feb..c649504fa1fa 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -7,6 +7,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1138,3 +1139,246 @@ static unsigned long pkram_populate_regions_list(void)
 
 	return priv.nr_regions;
 }
+
+struct pkram_region *pkram_first_region(struct pkram_super_block *sb,
+					struct pkram_region_list **rlp, int *idx)
+{
+	WARN_ON(!sb);
+	WARN_ON(!sb->region_list_pfn);
+
+	if (!sb || !sb->region_list_pfn)
+		return NULL;
+
+	*rlp = pfn_to_kaddr(sb->region_list_pfn);
+	*idx = 0;
+
+	return &(*rlp)->regions[0];
+}
+
+struct pkram_region *pkram_next_region(struct pkram_region_list **rlp, int *idx)
+{
+	struct pkram_region_list *rl = *rlp;
+	int i = *idx;
+
+	i++;
+	if (i >= PKRAM_REGIONS_LIST_MAX) {
+		if (!rl->next_pfn) {
+			pr_err("PKRAM: %s: no more pkram_region_list pages\n", __func__);
+			return NULL;
+		}
+		rl = pfn_to_kaddr(rl->next_pfn);
+		*rlp = rl;
+		i = 0;
+	}
+	*idx = i;
+
+	if (rl->regions[i].size == 0)
+		return NULL;
+
+	return &rl->regions[i];
+}
+
+struct pkram_region *pkram_first_region_topdown(struct pkram_super_block *sb,
+						struct pkram_region_list **rlp, int *idx)
+{
+	struct pkram_region_list *rl;
+
+	WARN_ON(!sb);
+	WARN_ON(!sb->region_list_pfn);
+
+	if (!sb || !sb->region_list_pfn)
+		return NULL;
+
+	rl = pfn_to_kaddr(sb->region_list_pfn);
+	if (!rl->prev_pfn) {
+		WARN_ON(1);
+		return NULL;
+	}
+	rl = pfn_to_kaddr(rl->prev_pfn);
+
+	*rlp = rl;
+
+	*idx = (sb->nr_regions - 1) % PKRAM_REGIONS_LIST_MAX;
+
+	return &rl->regions[*idx];
+}
+
+struct pkram_region *pkram_next_region_topdown(struct pkram_region_list **rlp, int *idx)
+{
+	struct pkram_region_list *rl = *rlp;
+	int i = *idx;
+
+	if (i == 0) {
+		if (!rl->prev_pfn)
+			return NULL;
+		rl = pfn_to_kaddr(rl->prev_pfn);
+		*rlp = rl;
+		i = PKRAM_REGIONS_LIST_MAX - 1;
+	} else
+		i--;
+
+	*idx = i;
+
+	return &rl->regions[i];
+}
+
+/*
+ * Use the pkram regions list to allocate a block of memory that does
+ * not overlap with preserved pages.
+ */
+phys_addr_t __init alloc_topdown(phys_addr_t size)
+{
+	phys_addr_t hole_start, hole_end, hole_size;
+	struct pkram_region_list *rl;
+	struct pkram_region *r;
+	phys_addr_t addr = 0;
+	int idx;
+
+	hole_end = memblock.current_limit;
+	r = pkram_first_region_topdown(pkram_sb, &rl, &idx);
+
+	while (r) {
+		hole_start = r->base + r->size;
+		hole_size = hole_end - hole_start;
+
+		if (hole_size >= size) {
+			addr = memblock_phys_alloc_range(size, PAGE_SIZE,
+							hole_start, hole_end);
+			if (addr)
+				break;
+		}
+
+		hole_end = r->base;
+		r = pkram_next_region_topdown(&rl, &idx);
+	}
+
+	if (!addr)
+		addr = memblock_phys_alloc_range(size, PAGE_SIZE, 0, hole_end);
+
+	return addr;
+}
+
+int __init pkram_create_merged_reserved(struct memblock_type *new)
+{
+	unsigned long cnt_a;
+	unsigned long cnt_b;
+	long i, j, k;
+	struct memblock_region *r;
+	struct memblock_region *rgn;
+	struct pkram_region *pkr;
+	struct pkram_region_list *rl;
+	int idx;
+	unsigned long total_size = 0;
+	unsigned long nr_preserved = 0;
+
+	cnt_a = memblock.reserved.cnt;
+	cnt_b = pkram_sb->nr_regions;
+
+	i = 0;
+	j = 0;
+	k = 0;
+
+	pkr = pkram_first_region(pkram_sb, &rl, &idx);
+	if (!pkr)
+		return -EINVAL;
+	while (i < cnt_a && j < cnt_b && pkr) {
+		r = &memblock.reserved.regions[i];
+		rgn = &new->regions[k];
+
+		if (r->base + r->size <= pkr->base) {
+			*rgn = *r;
+			i++;
+		} else if (pkr->base + pkr->size <= r->base) {
+			rgn->base = pkr->base;
+			rgn->size = pkr->size;
+			memblock_set_region_node(rgn, MAX_NUMNODES);
+
+			nr_preserved += (rgn->size >> PAGE_SHIFT);
+			pkr = pkram_next_region(&rl, &idx);
+			j++;
+		} else {
+			pr_err("PKRAM: unexpected overlap:\n");
+			pr_err("PKRAM: reserved: base=%pa,size=%pa,flags=0x%x\n", &r->base,
+				&r->size, (int)r->flags);
+			pr_err("PKRAM: pkram: base=%pa,size=%pa\n", &pkr->base, &pkr->size);
+			return -EBUSY;
+		}
+		total_size += rgn->size;
+		k++;
+	}
+
+	while (i < cnt_a) {
+		r = &memblock.reserved.regions[i];
+		rgn = &new->regions[k];
+
+		*rgn = *r;
+
+		total_size += rgn->size;
+		i++;
+		k++;
+	}
+	while (j < cnt_b && pkr) {
+		rgn = &new->regions[k];
+		rgn->base = pkr->base;
+		rgn->size = pkr->size;
+		memblock_set_region_node(rgn, MAX_NUMNODES);
+
+		nr_preserved += (rgn->size >> PAGE_SHIFT);
+		total_size += rgn->size;
+		pkr = pkram_next_region(&rl, &idx);
+		j++;
+		k++;
+	}
+
+	WARN_ON(cnt_a + cnt_b != k);
+	new->cnt = cnt_a + cnt_b;
+	new->total_size = total_size;
+
+	return 0;
+}
+
+/*
+ * Reserve pages that belong to preserved memory. This is accomplished by
+ * merging the existing reserved ranges with the preserved ranges into
+ * a new, sufficiently sized memblock reserved array.
+ *
+ * This function should be called at boot time as early as possible to prevent
+ * preserved memory from being recycled.
+ */
+int __init pkram_merge_with_reserved(void)
+{
+	struct memblock_type new;
+	unsigned long new_max;
+	phys_addr_t new_size;
+	phys_addr_t addr;
+	int err;
+
+	/*
+	 * Need space to insert one more range into memblock.reserved
+	 * without memblock_double_array() being called.
+	 */
+	if (memblock.reserved.cnt == memblock.reserved.max) {
+		WARN_ONCE(1, "PKRAM: no space for new memblock list\n");
+		return -ENOMEM;
+	}
+
+	new_max = memblock.reserved.max + pkram_sb->nr_regions;
+	new_size = PAGE_ALIGN(sizeof(struct memblock_region) * new_max);
+
+	addr = alloc_topdown(new_size);
+	if (!addr)
+		return -ENOMEM;
+
+	new.regions = __va(addr);
+	new.max = new_max;
+	err = pkram_create_merged_reserved(&new);
+	if (err)
+		return err;
+
+	memblock.reserved.cnt = new.cnt;
+	memblock.reserved.max = new.max;
+	memblock.reserved.total_size = new.total_size;
+	memblock.reserved.regions = new.regions;
+
+	return 0;
+}

From patchwork Thu Apr 27 00:08:47 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony Yznaga <anthony.yznaga@oracle.com>
X-Patchwork-Id: 13225039
From: Anthony Yznaga <anthony.yznaga@oracle.com>
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
 hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
 ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
 jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
 fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 11/21] mm: PKRAM: reserve preserved memory at boot
Date: Wed, 26 Apr 2023 17:08:47 -0700
Message-Id: <1682554137-13938-12-git-send-email-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 1.9.4
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>

Keep preserved pages from being recycled during boot by adding them
to the memblock reserved list during early boot. If memory reservation
fails (e.g. a region has already been reserved), all preserved pages
are dropped.

Signed-off-by: Anthony Yznaga <anthony.yznaga@oracle.com>
---
 arch/x86/kernel/setup.c |  3 ++
 arch/x86/mm/init_64.c   |  2 ++
 include/linux/pkram.h   |  8 +++++
 mm/pkram.c              | 84 ++++++++++++++++++++++++++++++++++++++++++++++---
 4 files changed, 92 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c
index 16babff771bd..2806b21236d0 100644
--- a/arch/x86/kernel/setup.c
+++ b/arch/x86/kernel/setup.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1221,6 +1222,8 @@ void __init setup_arch(char **cmdline_p)
 	initmem_init();
 	dma_contiguous_reserve(max_pfn_mapped << PAGE_SHIFT);
 
+	pkram_reserve();
+
 	if (boot_cpu_has(X86_FEATURE_GBPAGES))
 		hugetlb_cma_reserve(PUD_SHIFT - PAGE_SHIFT);
 
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index a190aae8ceaf..a46ffb434f39 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -34,6 +34,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -1339,6 +1340,7 @@ void __init mem_init(void)
 	after_bootmem = 1;
 	x86_init.hyper.init_after_bootmem();
 
+	totalram_pages_add(pkram_reserved_pages);
 	/*
 	 * Must be done after boot memory is put on freelist, because here we
 	 * might set fields in deferred struct pages that have not yet been
diff --git a/include/linux/pkram.h b/include/linux/pkram.h
index b614e9059bba..53d5a1ec42ff 100644
--- a/include/linux/pkram.h
+++ b/include/linux/pkram.h
@@ -99,4 +99,12 @@ int pkram_prepare_save(struct pkram_stream *ps, const char *name,
 ssize_t pkram_write(struct pkram_access *pa, const void *buf, size_t count);
 size_t pkram_read(struct pkram_access *pa, void *buf, size_t count);
 
+#ifdef CONFIG_PKRAM
+extern unsigned long pkram_reserved_pages;
+void pkram_reserve(void);
+#else
+#define pkram_reserved_pages 0UL
+static inline void pkram_reserve(void) { }
+#endif
+
 #endif /* _LINUX_PKRAM_H */
diff --git a/mm/pkram.c b/mm/pkram.c
index c649504fa1fa..b711f94dbef4 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -134,6 +134,8 @@ extern void pkram_find_preserved(unsigned long start, unsigned long end, void *p
 static LIST_HEAD(pkram_nodes); /* linked through page::lru */
 static DEFINE_MUTEX(pkram_mutex); /* serializes open/close */
 
+unsigned long __initdata pkram_reserved_pages;
+
 /*
  * The PKRAM super block pfn, see above.
  */
@@ -143,6 +145,59 @@ static int __init parse_pkram_sb_pfn(char *arg)
 }
 early_param("pkram", parse_pkram_sb_pfn);
 
+static void * __init pkram_map_meta(unsigned long pfn)
+{
+	if (pfn >= max_low_pfn)
+		return ERR_PTR(-EINVAL);
+	return pfn_to_kaddr(pfn);
+}
+
+int pkram_merge_with_reserved(void);
+/*
+ * Reserve pages that belong to preserved memory.
+ *
+ * This function should be called at boot time as early as possible to prevent
+ * preserved memory from being recycled.
+ */
+void __init pkram_reserve(void)
+{
+	int err = 0;
+
+	if (!pkram_sb_pfn)
+		return;
+
+	pr_info("PKRAM: Examining preserved memory...\n");
+
+	/* Verify that nothing else has reserved the pkram_sb page */
+	if (memblock_is_region_reserved(PFN_PHYS(pkram_sb_pfn), PAGE_SIZE)) {
+		err = -EBUSY;
+		goto out;
+	}
+
+	pkram_sb = pkram_map_meta(pkram_sb_pfn);
+	if (IS_ERR(pkram_sb)) {
+		err = PTR_ERR(pkram_sb);
+		goto out;
+	}
+	/* An empty pkram_sb is not an error */
+	if (!pkram_sb->node_pfn) {
+		pkram_sb = NULL;
+		goto done;
+	}
+
+	err = pkram_merge_with_reserved();
+out:
+	if (err) {
+		pr_err("PKRAM: Reservation failed: %d\n", err);
+		WARN_ON(pkram_reserved_pages > 0);
+		pkram_sb = NULL;
+		return;
+	}
+
+done:
+	pr_info("PKRAM: %lu pages reserved\n", pkram_reserved_pages);
+}
+
 static inline struct page *pkram_alloc_page(gfp_t gfp_mask)
 {
 	struct page *page;
@@ -162,6 +217,7 @@ static inline struct page *pkram_alloc_page(gfp_t gfp_mask)
 
 static inline void pkram_free_page(void *addr)
 {
+	__ClearPageReserved(virt_to_page(addr));
 	pkram_remove_identity_map(virt_to_page(addr));
 	free_page((unsigned long)addr);
 }
@@ -193,13 +249,23 @@ static void pkram_truncate_link(struct pkram_link *link)
 {
 	struct page *page;
pkram_entry_t p; - int i; + int i, j, order; for (i = 0; i < PKRAM_LINK_ENTRIES_MAX; i++) { p = link->entry[i]; if (!p) continue; + order = p & PKRAM_ENTRY_ORDER_MASK; + if (order >= MAX_ORDER) { + pr_err("PKRAM: attempted truncate of invalid page\n"); + return; + } page = pfn_to_page(PHYS_PFN(p)); + for (j = 0; j < (1 << order); j++) { + struct page *pg = page + j; + + __ClearPageReserved(pg); + } pkram_remove_identity_map(page); put_page(page); } @@ -680,7 +746,7 @@ static int __pkram_bytes_save_page(struct pkram_access *pa, struct page *page) static struct page *__pkram_prep_load_page(pkram_entry_t p) { struct page *page; - int order; + int i, order; short flags; flags = (p >> PKRAM_ENTRY_FLAGS_SHIFT) & PKRAM_ENTRY_FLAGS_MASK; @@ -690,9 +756,16 @@ static struct page *__pkram_prep_load_page(pkram_entry_t p) page = pfn_to_page(PHYS_PFN(p)); - if (!page_ref_freeze(pg, 1)) { - pr_err("PKRAM preserved page has unexpected inflated ref count\n"); - goto out_error; + for (i = 0; i < (1 << order); i++) { + struct page *pg = page + i; + int was_rsvd; + + was_rsvd = PageReserved(pg); + __ClearPageReserved(pg); + if ((was_rsvd || i == 0) && !page_ref_freeze(pg, 1)) { + pr_err("PKRAM preserved page has unexpected inflated ref count\n"); + goto out_error; + } } if (order) { @@ -1331,6 +1404,7 @@ int __init pkram_create_merged_reserved(struct memblock_type *new) } WARN_ON(cnt_a + cnt_b != k); + pkram_reserved_pages = nr_preserved; new->cnt = cnt_a + cnt_b; new->total_size = total_size; From patchwork Thu Apr 27 00:08:48 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Yznaga X-Patchwork-Id: 13225040 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 35FEEC7EE25 for ; Thu, 27 Apr 2023 00:10:15 +0000 (UTC) Received: by 
kanga.kvack.org (Postfix) id F34436B0083; Wed, 26 Apr 2023 20:09:54 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E58C06B0085; Wed, 26 Apr 2023 20:09:54 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id B45BE6B0085; Wed, 26 Apr 2023 20:09:54 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id A37ED6B0083 for ; Wed, 26 Apr 2023 20:09:54 -0400 (EDT) Received: from smtpin11.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 79C4F1404CC for ; Thu, 27 Apr 2023 00:09:54 +0000 (UTC) X-FDA: 80725237908.11.0C8BD68 Received: from mx0a-00069f02.pphosted.com (mx0a-00069f02.pphosted.com [205.220.165.32]) by imf25.hostedemail.com (Postfix) with ESMTP id 51B31A001F for ; Thu, 27 Apr 2023 00:09:52 +0000 (UTC) Authentication-Results: imf25.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2023-03-30 header.b=IZlqAoUK; spf=pass (imf25.hostedemail.com: domain of anthony.yznaga@oracle.com designates 205.220.165.32 as permitted sender) smtp.mailfrom=anthony.yznaga@oracle.com; dmarc=pass (policy=none) header.from=oracle.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1682554192; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:content-type: content-transfer-encoding:in-reply-to:in-reply-to: references:references:dkim-signature; bh=G4RP12t8eWRPBRfP51I1pAfPsGdNJr7M6XikXc8e+iY=; b=ofTc+OLPmt7nDpgiwSZnA0oPN+AmGo7bBslRrqfy9N2WHgT0FAgT/8IWlUSfnB5iibbnLB IvD7KMeWWsBFvM2hgeb7SJJtQY22PeiYkN+hCxIe9k4joZfF6DP3i3OxJm6p846keRAF9O OrTpEIpVIVvcOvjiVjGDCCBWe21djl4= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1682554192; a=rsa-sha256; cv=none; b=ND4RgEX6AqVGm9HcCQrtSbtN/6pFHRKr5TZHrc6tzAs+pAs1DkZtCaXFZwfE66IZzZYohM 
LXJ+9CQzX7GC1+6ynv1T1YcvLoL1yyvSliQ8j+rtEhKsCulaCWRwvupbOyVres40Web6fb 6F3CLNEf4nJHNtUC0o737XVgRehZsgU= ARC-Authentication-Results: i=1; imf25.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2023-03-30 header.b=IZlqAoUK; spf=pass (imf25.hostedemail.com: domain of anthony.yznaga@oracle.com designates 205.220.165.32 as permitted sender) smtp.mailfrom=anthony.yznaga@oracle.com; dmarc=pass (policy=none) header.from=oracle.com Received: from pps.filterd (m0333521.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 33QGx3sS004984; Thu, 27 Apr 2023 00:09:23 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2023-03-30; bh=G4RP12t8eWRPBRfP51I1pAfPsGdNJr7M6XikXc8e+iY=; b=IZlqAoUK9fHXAG95V5pJiDT0eeJVAQiL96nlqaKKbncGPOq4+NjDLrwT+LKmBt73JFkD 3mcPIMqgzlzA47ShALrJPkkxu7pusb4qi26KHoPAGbSpX/M1JkOMZO4JOtVmpSLR0jX9 v6EeFv0n1eY20Q2PLxjvtu1oxZk6jQhbzO6B+YZAzZEf1Dt7fxFNdTu0cCovcI4krNkC YCzNxBI2NikR170m1E32WLPrnbIeT83z/r8rH2LuL9SE8ib9UzhQQAVzwkNu+NT3zLlP AuUK3EEshPKJCjhjeWrWKljfngifihSrxos/AK80RytyAn6rR1izL4yHZkPJbhQxVjDF RQ== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 3q46gbtshv-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 27 Apr 2023 00:09:23 +0000 Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 33R00niN007445; Thu, 27 Apr 2023 00:09:22 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 3q4618mpmt-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 27 Apr 2023 00:09:22 +0000 Received: from 
phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 33R0938k013888; Thu, 27 Apr 2023 00:09:21 GMT Received: from ca-qasparc-x86-2.us.oracle.com (ca-qasparc-x86-2.us.oracle.com [10.147.24.103]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 3q4618mp42-13; Thu, 27 Apr 2023 00:09:21 +0000 From: Anthony Yznaga To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org, ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com, jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com, fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org Subject: [RFC v3 12/21] PKRAM: free the preserved ranges list Date: Wed, 26 Apr 2023 17:08:48 -0700 Message-Id: <1682554137-13938-13-git-send-email-anthony.yznaga@oracle.com> X-Mailer: git-send-email 1.9.4 In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com> References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-04-26_10,2023-04-26_03,2023-02-09_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 phishscore=0 bulkscore=0 mlxlogscore=999 malwarescore=0 mlxscore=0 spamscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2303200000 definitions=main-2304270000 X-Proofpoint-GUID: PNCavCd6ue3p59Kx26J25bbv7bWbtwP3 X-Proofpoint-ORIG-GUID: PNCavCd6ue3p59Kx26J25bbv7bWbtwP3 X-Rspam-User: X-Rspamd-Queue-Id: 51B31A001F X-Rspamd-Server: rspam09 X-Stat-Signature: im1ehjwok4pqrawusoxgsr184eb7e1un X-HE-Tag: 1682554192-284039 X-HE-Meta: 
U2FsdGVkX1+n8engTlZJmYPJMEx0yGsoSEy2Hk2qYdjRMC7ha8Ulb2ZgbPWwA71Ht1T6dvbkKtg3xoSAFancyEibUZeD5h8Tynkxcz+HHsII8SF2LteMbGOZR+tDz8tqrNWplDwr3Dc+vtQIPbQ7SlgWyLV33XNHFWmq8Z5RRq/viOGC8LQ9VyIptAhHSHBJbpcsJe9E1IoFbhmCmyDcoWWoE8MPY3TGag0Bk01RB/CPDShFgCvGa6+NufQo6ZzVk5BOjzT43+bNUplwPQhQMXGON8uzisJrxfQtAIWimKjeXeP9mAHDPGxHcplNc/QIOYYlV4UfyTM1YakQlPwMxJtR0Yrjd+hrQVZW/TzI1qjpiWqPhw+PDn7OX+Ev0rJdzbRroAbv+MFj+ouQcQo0kRNCmuuxzvJlkiO7+/xU+aJ7mxgVU0h5nWbfw1P18QQCdTZf1ridqBHxEttYWOk6x5vYvBYWoS7umjPHtOkODuf1yKhHX98LHUdM6Tf2otwArsmXD+v30a1Qu0uWiYP1jA91cpbpC8kiAGlwHxQiUh+X3iJ11Y0Uim8FfKcYHxr0AaRJwdprFLAAdSVCEgkabHLM8w3FCsVFclOk1Uzt1FmYO4UmP/Tjc98j3YpPYLSctQLyoqjS+ng98p+NV7JYCWLD8YlAe3h+QAg5Rk5RU7xR3/D9WeecOz/RZiD7gFSOwWkQAcyStq8Yx36FjTP7gCDy1RuIZDIcBdUDNAkv0t+3KBjvaEtLS9V4P1LBYKONK1HY1XpwMSJgTDHQEE4hHEGnj1qU03OaS+vjFAGA9U+UhEbyoBxNKy4vYcmEpf0gvssFzlPtHbcWikei34zSom6oOAc/PLJs9Gz9L1Fzyhlj5pfiVFQAI1iHtl9vb3uWwbMaaeV0y84Coy07KbIKlkicI5no5kygsxSH/dyFZEwkjyKzGyZ+fPU7bwEBVhywUgPkdpfooDdyH7sU5VC NvGLq5q/ hD3DIneAF6Obs2OCKyzw2u1kM2aQzta5yPL4EgePgq0ZkLUwHn/38ePQTKUtNd/yy4qpgS+BBM/hDKzEQHLR60qPbcvMqpJNQbzy1/skAdM1mumeiJvDlcp1baOecYRp95j3u X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Free the pages used to pass the preserved ranges to the new boot. 
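
As a rough illustration of the list walk this implies, here is a userspace
sketch. It is not the kernel code: struct region_page, region_list_build()
and region_list_free() are hypothetical names, malloc()/free() stand in for
the page allocator, and a pointer stands in for the next-pfn link. The one
point it is meant to show is that the link to the next element lives inside
the page being released, so it has to be read before the page is freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one page of the preserved ranges list. */
struct region_page {
	struct region_page *next;	/* models the next-pfn link; NULL ends the list */
};

/* Build a chain of up to n pages (shorter if malloc() fails). */
static struct region_page *region_list_build(unsigned long n)
{
	struct region_page *head = NULL;

	while (n--) {
		struct region_page *rp = malloc(sizeof(*rp));

		if (!rp)
			break;
		rp->next = head;
		head = rp;
	}
	return head;
}

/*
 * Free every page in the chain and return the number freed. The next
 * link is read before the page is released because it is stored in the
 * page itself. *reserved models a reserved-page counter that included
 * the list pages.
 */
static unsigned long region_list_free(struct region_page *head,
				      unsigned long *reserved)
{
	unsigned long freed = 0;

	while (head) {
		struct region_page *next = head->next;

		free(head);
		assert(*reserved > 0);	/* each list page was counted as reserved */
		(*reserved)--;
		freed++;
		head = next;
	}
	return freed;
}
```

Calling region_list_free() on a three-element chain with a counter of 10
frees three pages and leaves the counter at 7; an empty chain is a no-op.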
Signed-off-by: Anthony Yznaga
---
 arch/x86/mm/init_64.c |  1 +
 include/linux/pkram.h |  2 ++
 mm/pkram.c            | 20 ++++++++++++++++++++
 3 files changed, 23 insertions(+)

diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index a46ffb434f39..9e68f07367fa 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -1340,6 +1340,7 @@ void __init mem_init(void)
 	after_bootmem = 1;
 	x86_init.hyper.init_after_bootmem();

+	pkram_cleanup();
 	totalram_pages_add(pkram_reserved_pages);
 	/*
 	 * Must be done after boot memory is put on freelist, because here we
diff --git a/include/linux/pkram.h b/include/linux/pkram.h
index 53d5a1ec42ff..c909aa299fc4 100644
--- a/include/linux/pkram.h
+++ b/include/linux/pkram.h
@@ -102,9 +102,11 @@ int pkram_prepare_save(struct pkram_stream *ps, const char *name,
 #ifdef CONFIG_PKRAM
 extern unsigned long pkram_reserved_pages;
 void pkram_reserve(void);
+void pkram_cleanup(void);
 #else
 #define pkram_reserved_pages 0UL
 static inline void pkram_reserve(void) { }
+static inline void pkram_cleanup(void) { }
 #endif

 #endif /* _LINUX_PKRAM_H */
diff --git a/mm/pkram.c b/mm/pkram.c
index b711f94dbef4..c63b27bb711b 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -1456,3 +1456,23 @@ int __init pkram_merge_with_reserved(void)

 	return 0;
 }
+
+void __init pkram_cleanup(void)
+{
+	struct pkram_region_list *rl;
+	unsigned long next_pfn;
+
+	if (!pkram_sb || !pkram_reserved_pages)
+		return;
+
+	next_pfn = pkram_sb->region_list_pfn;
+
+	while (next_pfn) {
+		struct page *page = pfn_to_page(next_pfn);
+
+		rl = pfn_to_kaddr(next_pfn);
+		next_pfn = rl->next_pfn;
+		__free_pages_core(page, 0);
+		pkram_reserved_pages--;
+	}
+}

From patchwork Thu Apr 27 00:08:49 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225122
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
    hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
    peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
    ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
    jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
    fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 13/21] PKRAM: prevent inadvertent use of a stale superblock
Date: Wed, 26 Apr 2023 17:08:49 -0700
Message-Id: <1682554137-13938-14-git-send-email-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 1.9.4
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>

When pages have been saved to be preserved by the current boot, set
a magic number on the super block to be validated by the next kernel.

Signed-off-by: Anthony Yznaga
---
 mm/pkram.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/mm/pkram.c b/mm/pkram.c
index c63b27bb711b..befdffc76940 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -22,6 +22,7 @@

 #include "internal.h"

+#define PKRAM_MAGIC		0x706B726D

 /*
  * Represents a reference to a data page saved to PKRAM.
@@ -110,6 +111,8 @@ struct pkram_region_list {
  * The structure occupies a memory page.
  */
 struct pkram_super_block {
+	__u32	magic;
+
 	__u64	node_pfn;		/* first element of the node list */
 	__u64	region_list_pfn;
 	__u64	nr_regions;
@@ -179,6 +182,11 @@ void __init pkram_reserve(void)
 		err = PTR_ERR(pkram_sb);
 		goto out;
 	}
+	if (pkram_sb->magic != PKRAM_MAGIC) {
+		pr_err("PKRAM: invalid super block\n");
+		err = -EINVAL;
+		goto out;
+	}
 	/* An empty pkram_sb is not an error */
 	if (!pkram_sb->node_pfn) {
 		pkram_sb = NULL;
@@ -1012,6 +1020,7 @@ static void __pkram_reboot(void)
 	 */
 	memset(pkram_sb, 0, PAGE_SIZE);
 	if (!err && node_pfn) {
+		pkram_sb->magic = PKRAM_MAGIC;
 		pkram_sb->node_pfn = node_pfn;
 		pkram_sb->region_list_pfn = rl_pfn;
 		pkram_sb->nr_regions = nr_regions;

From patchwork Thu Apr 27 00:08:50 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225036
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
    hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
    peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
    ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
    jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
    fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 14/21] PKRAM: provide a way to ban pages from use by PKRAM
Date: Wed, 26 Apr 2023 17:08:50 -0700
Message-Id: <1682554137-13938-15-git-send-email-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 1.9.4
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>

Not all memory ranges can be used for saving data preserved across
kexec. For example, a kexec kernel may be loaded before pages are
preserved. The memory regions that the kexec segments will be copied
to on kexec must not contain preserved pages, or else those pages will
be clobbered.
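
To make the bookkeeping concrete, here is a simplified userspace model of a
banned-region table. It is a sketch under stated assumptions, not the code
in this patch: ban_region(), range_banned() and MAX_BANNED are hypothetical
names, the merge uses a forward scan rather than the backward scan in the
diff, and pfns are assumed to be well below ULONG_MAX so that "end + 1"
cannot wrap. Regions are inclusive pfn ranges kept in ascending order,
overlapping or adjacent ranges are merged on insert, and lookup is a binary
search.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_BANNED 16	/* hypothetical capacity for this sketch */

struct range {
	unsigned long start, end;	/* pfn, inclusive */
};

static struct range banned[MAX_BANNED];	/* ascending, non-overlapping */
static unsigned int nr_banned;

/* Ban [start..end]; returns false if the table is full. */
static bool ban_region(unsigned long start, unsigned long end)
{
	unsigned int lo = 0, hi;

	assert(start <= end);

	/* skip ranges that lie entirely below the new one and do not touch it */
	while (lo < nr_banned && banned[lo].end + 1 < start)
		lo++;

	/* absorb every range that overlaps or is adjacent to the new one */
	hi = lo;
	while (hi < nr_banned && banned[hi].start <= end + 1) {
		if (banned[hi].start < start)
			start = banned[hi].start;
		if (banned[hi].end > end)
			end = banned[hi].end;
		hi++;
	}

	/* nothing absorbed: one more slot is needed */
	if (hi == lo && nr_banned == MAX_BANNED)
		return false;

	/* close the gap left by absorbed ranges, or open one slot */
	memmove(&banned[lo + 1], &banned[hi],
		sizeof(banned[0]) * (nr_banned - hi));
	nr_banned = nr_banned - (hi - lo) + 1;
	banned[lo].start = start;
	banned[lo].end = end;
	return true;
}

/* True if any pfn in [pfn..epfn] falls inside a banned range. */
static bool range_banned(unsigned long pfn, unsigned long epfn)
{
	int l = 0, r = (int)nr_banned - 1;

	while (l <= r) {
		int m = l + (r - l) / 2;

		if (epfn < banned[m].start)
			r = m - 1;
		else if (pfn > banned[m].end)
			l = m + 1;
		else
			return true;
	}
	return false;
}
```

Because the table stays sorted and non-overlapping, the lookup only has to
compare a candidate page range against one entry per step; a page that
tests as banned would then be copied to a freshly allocated page rather
than preserved in place.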

Originally-by: Vladimir Davydov
Signed-off-by: Anthony Yznaga
---
 include/linux/pkram.h |   2 +
 mm/pkram.c            | 205 ++++++++++++++++++++++++++++++++++++++++++++++++++
 2 files changed, 207 insertions(+)

diff --git a/include/linux/pkram.h b/include/linux/pkram.h
index c909aa299fc4..29109e875604 100644
--- a/include/linux/pkram.h
+++ b/include/linux/pkram.h
@@ -103,10 +103,12 @@ int pkram_prepare_save(struct pkram_stream *ps, const char *name,
 extern unsigned long pkram_reserved_pages;
 void pkram_reserve(void);
 void pkram_cleanup(void);
+void pkram_ban_region(unsigned long start, unsigned long end);
 #else
 #define pkram_reserved_pages 0UL
 static inline void pkram_reserve(void) { }
 static inline void pkram_cleanup(void) { }
+static inline void pkram_ban_region(unsigned long start, unsigned long end) { }
 #endif

 #endif /* _LINUX_PKRAM_H */
diff --git a/mm/pkram.c b/mm/pkram.c
index befdffc76940..cef75bd8ba99 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -140,6 +140,28 @@ extern void pkram_find_preserved(unsigned long start, unsigned long end, void *p
 unsigned long __initdata pkram_reserved_pages;

 /*
+ * For tracking a region of memory that PKRAM is not allowed to use.
+ */
+struct banned_region {
+	unsigned long start, end;		/* pfn, inclusive */
+};
+
+#define MAX_NR_BANNED		(32 + MAX_NUMNODES * 2)
+
+static unsigned int nr_banned;			/* number of banned regions */
+
+/* banned regions; arranged in ascending order, do not overlap */
+static struct banned_region banned[MAX_NR_BANNED];
+/*
+ * If a page allocated for PKRAM turns out to belong to a banned region,
+ * it is placed on the banned_pages list so subsequent allocation attempts
+ * do not encounter it again. The list is shrunk when system memory is low.
+ */
+static LIST_HEAD(banned_pages);			/* linked through page::lru */
+static DEFINE_SPINLOCK(banned_pages_lock);
+static unsigned long nr_banned_pages;
+
+/*
  * The PKRAM super block pfn, see above.
  */
 static int __init parse_pkram_sb_pfn(char *arg)
@@ -206,12 +228,116 @@ void __init pkram_reserve(void)
 	pr_info("PKRAM: %lu pages reserved\n", pkram_reserved_pages);
 }

+/*
+ * Ban pfn range [start..end] (inclusive) from use in PKRAM.
+ */
+void pkram_ban_region(unsigned long start, unsigned long end)
+{
+	int i, merged = -1;
+
+	/* first try to merge the region with an existing one */
+	for (i = nr_banned - 1; i >= 0 && start <= banned[i].end + 1; i--) {
+		if (end + 1 >= banned[i].start) {
+			start = min(banned[i].start, start);
+			end = max(banned[i].end, end);
+			if (merged < 0)
+				merged = i;
+		} else
+			/*
+			 * Regions are arranged in ascending order and do not
+			 * intersect so the merged region cannot jump over its
+			 * predecessors.
+			 */
+			BUG_ON(merged >= 0);
+	}
+
+	i++;
+
+	if (merged >= 0) {
+		banned[i].start = start;
+		banned[i].end = end;
+		/* shift if merged with more than one region */
+		memmove(banned + i + 1, banned + merged + 1,
+			sizeof(*banned) * (nr_banned - merged - 1));
+		nr_banned -= merged - i;
+		return;
+	}
+
+	/*
+	 * The region does not intersect with an existing one;
+	 * try to create a new one.
+	 */
+	if (nr_banned == MAX_NR_BANNED) {
+		pr_err("PKRAM: Failed to ban %lu-%lu: Too many banned regions\n",
+			start, end);
+		return;
+	}
+
+	memmove(banned + i + 1, banned + i,
+		sizeof(*banned) * (nr_banned - i));
+	banned[i].start = start;
+	banned[i].end = end;
+	nr_banned++;
+}
+
+static void pkram_show_banned(void)
+{
+	int i;
+	unsigned long n, total = 0;
+
+	pr_info("PKRAM: banned regions:\n");
+	for (i = 0; i < nr_banned; i++) {
+		n = banned[i].end - banned[i].start + 1;
+		pr_info("%4d: [%08lx - %08lx] %ld pages\n",
+			i, banned[i].start, banned[i].end, n);
+		total += n;
+	}
+	pr_info("Total banned: %ld pages in %d regions\n",
+		total, nr_banned);
+}
+
+/*
+ * Returns true if the page may not be used for storing preserved data.
+ */
+static bool pkram_page_banned(struct page *page)
+{
+	unsigned long epfn, pfn = page_to_pfn(page);
+	int l = 0, r = nr_banned - 1, m;
+
+	epfn = pfn + compound_nr(page) - 1;
+
+	/* do binary search */
+	while (l <= r) {
+		m = (l + r) / 2;
+		if (epfn < banned[m].start)
+			r = m - 1;
+		else if (pfn > banned[m].end)
+			l = m + 1;
+		else
+			return true;
+	}
+	return false;
+}
+
 static inline struct page *pkram_alloc_page(gfp_t gfp_mask)
 {
 	struct page *page;
+	LIST_HEAD(list);
+	unsigned long len = 0;
 	int err;

 	page = alloc_page(gfp_mask);
+	while (page && pkram_page_banned(page)) {
+		len++;
+		list_add(&page->lru, &list);
+		page = alloc_page(gfp_mask);
+	}
+	if (len > 0) {
+		spin_lock(&banned_pages_lock);
+		nr_banned_pages += len;
+		list_splice(&list, &banned_pages);
+		spin_unlock(&banned_pages_lock);
+	}
 	if (page) {
 		err = pkram_add_identity_map(page);
 		if (err) {
@@ -230,6 +356,53 @@ static inline void pkram_free_page(void *addr)
 	free_page((unsigned long)addr);
 }

+static void __banned_pages_shrink(unsigned long nr_to_scan)
+{
+	struct page *page;
+
+	if (nr_to_scan <= 0)
+		return;
+
+	while (nr_banned_pages > 0) {
+		BUG_ON(list_empty(&banned_pages));
+		page = list_first_entry(&banned_pages, struct page, lru);
+		list_del(&page->lru);
+		__free_page(page);
+		nr_banned_pages--;
+		nr_to_scan--;
+		if (!nr_to_scan)
+			break;
+	}
+}
+
+static unsigned long
+banned_pages_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+	return nr_banned_pages;
+}
+
+static unsigned long
+banned_pages_scan(struct shrinker *shrink, struct shrink_control *sc)
+{
+	int nr_left = nr_banned_pages;
+
+	if (!sc->nr_to_scan || !nr_left)
+		return nr_left;
+
+	spin_lock(&banned_pages_lock);
+	__banned_pages_shrink(sc->nr_to_scan);
+	nr_left = nr_banned_pages;
+	spin_unlock(&banned_pages_lock);
+
+	return nr_left;
+}
+
+static struct shrinker banned_pages_shrinker = {
+	.count_objects = banned_pages_count,
+	.scan_objects = banned_pages_scan,
+	.seeks = DEFAULT_SEEKS,
+};
+
 static inline void pkram_insert_node(struct pkram_node *node)
 {
 	list_add(&virt_to_page(node)->lru, &pkram_nodes);
@@ -705,6 +878,31 @@ static int __pkram_save_page(struct pkram_access *pa, struct page *page,
 	return 0;
 }

+static int __pkram_save_page_copy(struct pkram_access *pa, struct page *page)
+{
+	int nr_pages = compound_nr(page);
+	pgoff_t index = page->index;
+	int i, err;
+
+	for (i = 0; i < nr_pages; i++, index++) {
+		struct page *p = page + i;
+		struct page *new;
+
+		new = pkram_alloc_page(pa->ps->gfp_mask);
+		if (!new)
+			return -ENOMEM;
+
+		copy_highpage(new, p);
+		err = __pkram_save_page(pa, new, index);
+		if (err) {
+			pkram_free_page(page_address(new));
+			return err;
+		}
+	}
+
+	return 0;
+}
+
 /**
  * Save folio @folio to the preserved memory node and object associated
  * with pkram stream access @pa. The stream must have been initialized with
@@ -728,6 +926,10 @@ int pkram_save_folio(struct pkram_access *pa, struct folio *folio)

 	BUG_ON((node->flags & PKRAM_ACCMODE_MASK) != PKRAM_SAVE);

+	/* if page is banned, relocate it */
+	if (pkram_page_banned(page))
+		return __pkram_save_page_copy(pa, page);
+
 	err = __pkram_save_page(pa, page, page->index);
 	if (!err)
 		err = pkram_add_identity_map(page);
@@ -987,6 +1189,7 @@ static void __pkram_reboot(void)
 	int err = 0;

 	if (!list_empty(&pkram_nodes)) {
+		pkram_show_banned();
 		err = pkram_add_identity_map(virt_to_page(pkram_sb));
 		if (err) {
 			pr_err("PKRAM: failed to add super block to pagetable\n");
@@ -1073,6 +1276,7 @@ static int __init pkram_init_sb(void)
 		page = alloc_page(GFP_KERNEL | __GFP_ZERO);
 		if (!page) {
 			pr_err("PKRAM: Failed to allocate super block\n");
+			__banned_pages_shrink(ULONG_MAX);
 			return 0;
 		}
 		pkram_sb = page_address(page);
@@ -1095,6 +1299,7 @@ static int __init pkram_init(void)
 {
 	if (pkram_init_sb()) {
 		register_reboot_notifier(&pkram_reboot_notifier);
+		register_shrinker(&banned_pages_shrinker, "pkram");
 		sysfs_update_group(kernel_kobj, &pkram_attr_group);
 	}
 	return 0;

From patchwork Thu Apr 27 00:08:51 2023
Content-Type:
 text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225035
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
 hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
 ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
 jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
 fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 15/21] kexec: PKRAM: prevent kexec clobbering preserved pages in some cases
Date: Wed, 26 Apr 2023 17:08:51 -0700
Message-Id: <1682554137-13938-16-git-send-email-anthony.yznaga@oracle.com>
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>

When loading a kernel for kexec, dynamically update the list of physical
ranges that are not to be used for storing preserved pages with the ranges
where kexec segments will be copied to on reboot. This ensures no pages
preserved after the new kernel has been loaded will reside in these ranges
on reboot.

Not yet handled is the case where pages have been preserved before a
kexec kernel is loaded. This will be covered by a later patch.

Signed-off-by: Anthony Yznaga
---
 kernel/kexec.c      |  9 +++++++++
 kernel/kexec_file.c | 10 ++++++++++
 2 files changed, 19 insertions(+)

diff --git a/kernel/kexec.c b/kernel/kexec.c
index 92d301f98776..cd871fc07c65 100644
--- a/kernel/kexec.c
+++ b/kernel/kexec.c
@@ -16,6 +16,7 @@
 #include
 #include
 #include
+#include
 
 #include "kexec_internal.h"
 
@@ -153,6 +154,14 @@ static int do_kexec_load(unsigned long entry, unsigned long nr_segments,
 	if (ret)
 		goto out;
 
+	for (i = 0; i < nr_segments; i++) {
+		unsigned long mem = image->segment[i].mem;
+		size_t memsz = image->segment[i].memsz;
+
+		if (memsz)
+			pkram_ban_region(PFN_DOWN(mem), PFN_UP(mem + memsz) - 1);
+	}
+
 	/* Install the new kernel and uninstall the old */
 	image = xchg(dest_image, image);
 
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index f1a0e4e3fb5c..ca2aa2d61955 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -27,6 +27,8 @@
 #include
 #include
 #include
+#include
+
 #include "kexec_internal.h"
 
 #ifdef CONFIG_KEXEC_SIG
@@ -403,6 +405,14 @@ static int kexec_image_verify_sig(struct kimage *image, void *buf,
 	if (ret)
 		goto out;
 
+	for (i = 0; i < image->nr_segments; i++) {
+		unsigned long mem = image->segment[i].mem;
+		size_t memsz = image->segment[i].memsz;
+
+		if (memsz)
+			pkram_ban_region(PFN_DOWN(mem), PFN_UP(mem + memsz) - 1);
+	}
+
 	/*
 	 * Free up any temporary buffers allocated which are not needed
 	 * after image has been loaded

From patchwork Thu Apr 27 00:08:52 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225113
Received: from
 kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 15282C77B60 for ; Thu, 27 Apr 2023 02:16:18 +0000 (UTC)
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
 hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
 ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
 jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
 fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 16/21] PKRAM: provide a way to check if a memory range has preserved pages
Date: Wed, 26 Apr 2023 17:08:52 -0700
Message-Id: <1682554137-13938-17-git-send-email-anthony.yznaga@oracle.com>
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>

When a kernel is loaded for kexec the address ranges where the kexec
segments will be copied to may conflict with pages already set to be
preserved. Provide a way to determine if preserved pages exist in a
specified range.

Signed-off-by: Anthony Yznaga
---
 include/linux/pkram.h |  2 ++
 mm/pkram.c            | 20 ++++++++++++++++++++
 2 files changed, 22 insertions(+)

diff --git a/include/linux/pkram.h b/include/linux/pkram.h
index 29109e875604..bec9ae75e802 100644
--- a/include/linux/pkram.h
+++ b/include/linux/pkram.h
@@ -104,11 +104,13 @@ int pkram_prepare_save(struct pkram_stream *ps, const char *name,
 void pkram_reserve(void);
 void pkram_cleanup(void);
 void pkram_ban_region(unsigned long start, unsigned long end);
+int pkram_has_preserved_pages(unsigned long start, unsigned long end);
 #else
 #define pkram_reserved_pages 0UL
 static inline void pkram_reserve(void) { }
 static inline void pkram_cleanup(void) { }
 static inline void pkram_ban_region(unsigned long start, unsigned long end) { }
+static inline int pkram_has_preserved_pages(unsigned long start, unsigned long end) { return 0; }
 #endif
 
 #endif /* _LINUX_PKRAM_H */
diff --git a/mm/pkram.c b/mm/pkram.c
index cef75bd8ba99..474fb6fc8355 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -1690,3 +1690,23 @@ void __init pkram_cleanup(void)
 		pkram_reserved_pages--;
 	}
 }
+
+static int has_preserved_pages_cb(unsigned long base, unsigned long size, void *private)
+{
+	int *has_preserved = (int *)private;
+
+	*has_preserved = 1;
+	return 1;
+}
+
+/*
+ * Check whether the memory range [start, end) contains preserved pages.
+ */
+int pkram_has_preserved_pages(unsigned long start, unsigned long end)
+{
+	int has_preserved = 0;
+
+	pkram_find_preserved(start, end, &has_preserved, has_preserved_pages_cb);
+
+	return has_preserved;
+}

From patchwork Thu Apr 27 00:08:53 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225038
Authentication-Results: imf21.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2023-03-30 header.b=Fr50Imkc; spf=pass (imf21.hostedemail.com: domain of anthony.yznaga@oracle.com designates 205.220.165.32 as permitted sender) smtp.mailfrom=anthony.yznaga@oracle.com; dmarc=pass (policy=none)
 header.from=oracle.com
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
 hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
 ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
 jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
 fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 17/21] kexec: PKRAM: avoid clobbering already preserved pages
Date: Wed, 26 Apr 2023 17:08:53 -0700
Message-Id: <1682554137-13938-18-git-send-email-anthony.yznaga@oracle.com>
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>

Ensure destination ranges of the kexec segments do not overlap with any
kernel pages marked to be preserved across kexec.

For kexec_load, return EADDRNOTAVAIL if overlap is detected.

For kexec_file_load, skip ranges containing preserved pages when searching
for available ranges to use.

Signed-off-by: Anthony Yznaga
---
 kernel/kexec_core.c | 3 +++
 kernel/kexec_file.c | 5 +++++
 2 files changed, 8 insertions(+)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index 3d578c6fefee..e0d52f70cb48 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -40,6 +40,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -178,6 +179,8 @@ int sanity_check_segment_list(struct kimage *image)
 			return -EADDRNOTAVAIL;
 		if (mend >= KEXEC_DESTINATION_MEMORY_LIMIT)
 			return -EADDRNOTAVAIL;
+		if (pkram_has_preserved_pages(mstart, mend))
+			return -EADDRNOTAVAIL;
 	}
 
 	/* Verify our destination addresses do not overlap.
diff --git a/kernel/kexec_file.c b/kernel/kexec_file.c
index ca2aa2d61955..8bca01060d32 100644
--- a/kernel/kexec_file.c
+++ b/kernel/kexec_file.c
@@ -490,6 +490,11 @@ static int locate_mem_hole_bottom_up(unsigned long start, unsigned long end,
 			continue;
 		}
 
+		if (pkram_has_preserved_pages(temp_start, temp_end + 1)) {
+			temp_start = temp_start - PAGE_SIZE;
+			continue;
+		}
+
 		/* We found a suitable memory range */
 		break;
 	} while (1);

From patchwork Thu Apr 27 00:08:54 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225058
Received: by kanga.kvack.org (Postfix, from userid 40) id 62D336B0072;
 Wed, 26 Apr 2023 20:49:37 -0400 (EDT)
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
 hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
 ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
 jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
 fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 18/21] mm: PKRAM: allow preserved memory to be freed from userspace
Date: Wed, 26 Apr 2023 17:08:54 -0700
Message-Id: <1682554137-13938-19-git-send-email-anthony.yznaga@oracle.com>
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
X-HE-Meta:
U2FsdGVkX18yG0jjZP9fu1GTIktCztRXmzbB5fJG+fod1yM6Qh3/vNHWjYdgzfxGUnBK3xg1frPuRRbAa1gJkEsguSPTM7s8gtJJlT+9CrY9wt7/JMgK7/EwPYUhitWKaD6AUosw1lC5AVozs1jKYfULZzwUzzw34/FQN4Nfgiyj+hB+DoGe9QjFxI9ztUhvJamyj8bsfCgK/spgtTdalDb27HAtRSKDIaUuy3crbF9AC0RNqeTbkZnyru9XyEMFE0i0sngDklbmMKtYqMIF09wIz7/UD6UEo6y7aad/iXI7wNGP95N5sIEhQH5dOOKqtTNV3VBJPJu/Kodhf5BiDqYVC8dSrX/1M1J7pKklHxIrXDkkEyEc2RPHV6jrcICyeecgw984y2uXDj5o9nnyyma26oqkUzEbK4a+I0RoZb61bPs1uGpJF6Bh50yep3vuJpQz3PGxPgqq4rFK5I7IBtAfKgMzYRKMK1J7dqe4htLBskrpp9fri5xA1NszSNMJ3IplwENnCKSu1tDh1KSOm2wVeHlVC5ypZ0xVx3jFRBU6XS4I7KaJbvYU1zF7a3XzR9kdNce1rCGuXv9pfaSKzIBCp0cEdVoNskE0/K2u5ZBeti+cnJWJSBLydCsEASVxIEGTYmbEDrn1kOm6feueTVYqJMqcgK64SqGCZ1exhZyqOdlXA7Mlr+ymW8Mycl8cDLkUx5NeD++Ta5k8h3qW9dG2yfM557/fPWLTjefj13zRhTRop+b4ZhI7dxo4EVtL2nNSgLovnzkkTqhQCRWfwmhQdVptowdI0gXfICjP5z1Z/Gb+HrEik+/Zm4vZ6V/hBqa4YsMlhTq/z7NF0DioGAhV2itAUbAU5RvZFoVXK5H7WCe14FVPodkL3zfKpBh9QinIN3eP+uy7lNmVw5v6ld2lhMy91QFt6pfC3hTfBVB6emDxWYqSo0sZVCNTtk4VwjTDQNcA3pU1c6G4Q79 /plhOQL+ Z9U/46bvnNh9ye06cEMisIyz7FmpG8buUo2iliGqpkkGjIIofnXKWC8XVKf16/Jzm2BwsBtc6y3wqg/dtMjwvXYgOhC/uuQOcu6BrcATiGtXUsFZJjy1y9yl9ouyrt11U+M8S9myTqBvil/+bKAqovmb2R6q/JXgARf3S X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: To free all space utilized for preserved memory, one can write 0 to /sys/kernel/pkram. This will destroy all PKRAM nodes that are not currently being read or written. Originally-by: Vladimir Davydov Signed-off-by: Anthony Yznaga --- mm/pkram.c | 39 ++++++++++++++++++++++++++++++++++++++- 1 file changed, 38 insertions(+), 1 deletion(-) diff --git a/mm/pkram.c b/mm/pkram.c index 474fb6fc8355..d404e415f3cb 100644 --- a/mm/pkram.c +++ b/mm/pkram.c @@ -493,6 +493,32 @@ static void pkram_truncate_node(struct pkram_node *node) node->obj_pfn = 0; } +/* + * Free all nodes that are not under operation. 
+ */ +static void pkram_truncate(void) +{ + struct page *page, *tmp; + struct pkram_node *node; + LIST_HEAD(dispose); + + mutex_lock(&pkram_mutex); + list_for_each_entry_safe(page, tmp, &pkram_nodes, lru) { + node = page_address(page); + if (!(node->flags & PKRAM_ACCMODE_MASK)) + list_move(&page->lru, &dispose); + } + mutex_unlock(&pkram_mutex); + + while (!list_empty(&dispose)) { + page = list_first_entry(&dispose, struct page, lru); + list_del(&page->lru); + node = page_address(page); + pkram_truncate_node(node); + pkram_free_page(node); + } +} + static void pkram_add_link(struct pkram_link *link, struct pkram_data_stream *pds) { __u64 link_pfn = page_to_pfn(virt_to_page(link)); @@ -1252,8 +1278,19 @@ static ssize_t show_pkram_sb_pfn(struct kobject *kobj, return sprintf(buf, "%lx\n", pfn); } +static ssize_t store_pkram_sb_pfn(struct kobject *kobj, + struct kobj_attribute *attr, const char *buf, size_t count) +{ + int val; + + if (kstrtoint(buf, 0, &val) || val) + return -EINVAL; + pkram_truncate(); + return count; +} + static struct kobj_attribute pkram_sb_pfn_attr = - __ATTR(pkram, 0444, show_pkram_sb_pfn, NULL); + __ATTR(pkram, 0644, show_pkram_sb_pfn, store_pkram_sb_pfn); static struct attribute *pkram_attrs[] = { &pkram_sb_pfn_attr.attr, From patchwork Thu Apr 27 00:08:55 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Yznaga X-Patchwork-Id: 13225084 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id E1ED2C77B60 for ; Thu, 27 Apr 2023 01:27:01 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 55D0B6B0078; Wed, 26 Apr 2023 21:27:01 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 50D7D6B007B; Wed, 26 Apr 2023 21:27:01 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org 
Received: by kanga.kvack.org (Postfix, from userid 63042) id 3D5266B007D; Wed, 26 Apr 2023 21:27:01 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 2DE926B0078 for ; Wed, 26 Apr 2023 21:27:01 -0400 (EDT) Received: from smtpin23.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 05389120124 for ; Thu, 27 Apr 2023 01:27:00 +0000 (UTC) X-FDA: 80725432242.23.AB36060 Received: from mx0a-00069f02.pphosted.com (mx0a-00069f02.pphosted.com [205.220.165.32]) by imf24.hostedemail.com (Postfix) with ESMTP id E1BA3180013 for ; Thu, 27 Apr 2023 01:26:58 +0000 (UTC) Authentication-Results: imf24.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2023-03-30 header.b=J9766iWR; spf=pass (imf24.hostedemail.com: domain of anthony.yznaga@oracle.com designates 205.220.165.32 as permitted sender) smtp.mailfrom=anthony.yznaga@oracle.com; dmarc=pass (policy=none) header.from=oracle.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1682558819; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:content-type: content-transfer-encoding:in-reply-to:in-reply-to: references:references:dkim-signature; bh=VQwaugMMgK/MS0E/urOkz6I4ukGs9iD4CWV4YSx/1xc=; b=vFMMZUbnsDma+MNdohI4jz+GdTi6nE4HKpcJNLguyylecrvHUFIMWv2uPhcxNgbi+dBJPP 1Rp+MhqO4pa3ZaDETA3e/JFZD8g5/2p4z0WOdnf9kBI3reziysYEWVZZ6Gc+PEjp0eSUkz qk6o2TFM4jyfBr6ro5jut3F7fZqmouU= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1682558819; a=rsa-sha256; cv=none; b=U3s3EmaUKTjsTFzYn5O+lEZQ2Pg2XyHUZASQqUppJ/5cQ7qn2iEzOn1GZvY+pgwDUiPWcJ P89VIBHHVR1kaCNErI9YGID30mcG1hLnuzSMhPTmIZ+pJ1JNrpyKauuuWzvZVRMN3EIWNF ZrC1WLfsotLOLK+28lu9GJEL/dFefRY= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2023-03-30 header.b=J9766iWR; 
spf=pass (imf24.hostedemail.com: domain of anthony.yznaga@oracle.com designates 205.220.165.32 as permitted sender) smtp.mailfrom=anthony.yznaga@oracle.com; dmarc=pass (policy=none) header.from=oracle.com Received: from pps.filterd (m0246627.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 33QGxFQg025361; Thu, 27 Apr 2023 00:09:33 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2023-03-30; bh=VQwaugMMgK/MS0E/urOkz6I4ukGs9iD4CWV4YSx/1xc=; b=J9766iWR8G77hZ7IvzmoZgr4vzzvhIrhZGyputyh0qhGelOn0BZzz+TYbnctnLfB2tXh 00taSjRcT8LFpZEslBFSSAM+7RpAZLoDdPLhJX6aat3Xus05BlbcMz45pXDselmAKTQL 4j9iCnvsVigM9pq/Hn6cRChdsGBc3RcS+agLBtC0Ryp+vhjYvDipYwOzgG+fZ0YtUBKh ywDBGNDhlrFvWYIx2fSMBZzPveVn06sAanl2zckyI7jdur2dQ1+8SRVOAnEI4OnwJAyu qzNFfHr6Mb0eJ2sQWEFZDyiF5ZYwCvt6ZBGbB2qNbzLkGSSfha9z8223RFFTKeqKIbV2 AQ== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 3q46622tym-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 27 Apr 2023 00:09:33 +0000 Received: from pps.filterd (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (8.17.1.19/8.17.1.19) with ESMTP id 33QNUZGN007329; Thu, 27 Apr 2023 00:09:32 GMT Received: from pps.reinject (localhost [127.0.0.1]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTPS id 3q4618mpux-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Thu, 27 Apr 2023 00:09:32 +0000 Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com [127.0.0.1]) by pps.reinject (8.17.1.5/8.17.1.5) with ESMTP id 33R09390013888; Thu, 27 Apr 2023 00:09:32 GMT Received: from ca-qasparc-x86-2.us.oracle.com 
(ca-qasparc-x86-2.us.oracle.com [10.147.24.103]) by phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (PPS) with ESMTP id 3q4618mp42-20; Thu, 27 Apr 2023 00:09:31 +0000 From: Anthony Yznaga To: linux-mm@kvack.org, linux-kernel@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org, hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org, ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com, jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com, fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org Subject: [RFC v3 19/21] PKRAM: disable feature when running the kdump kernel Date: Wed, 26 Apr 2023 17:08:55 -0700 Message-Id: <1682554137-13938-20-git-send-email-anthony.yznaga@oracle.com> X-Mailer: git-send-email 1.9.4 In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com> References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com> X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.254,Aquarius:18.0.942,Hydra:6.0.573,FMLib:17.11.170.22 definitions=2023-04-26_10,2023-04-26_03,2023-02-09_01 X-Proofpoint-Spam-Details: rule=notspam policy=default score=0 suspectscore=0 phishscore=0 bulkscore=0 mlxlogscore=999 malwarescore=0 mlxscore=0 spamscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.12.0-2303200000 definitions=main-2304270000 X-Proofpoint-ORIG-GUID: jPzpZYiuahk2b6x6CL1T4-5GRvhXf1gU X-Proofpoint-GUID: jPzpZYiuahk2b6x6CL1T4-5GRvhXf1gU X-Rspam-User: X-Rspamd-Queue-Id: E1BA3180013 X-Rspamd-Server: rspam09 X-Stat-Signature: 3pgmki6bc35qugs7tzkszc64ujk1bcb7 X-HE-Tag: 1682558818-950794 X-HE-Meta: 
U2FsdGVkX1/qFpKMMxf8xvc7tclLiQHugcx/sbYfQvv2TVvh9/ysaY9qXGWyrsaavo6vVr191SFM4CxZRFeBr6Us9t1s6+5/WObe9M93fN4Xi8O79+0A6OMjwUVEmn7766OfbgimiZk+RcyJPM5RpY4pw1/ZIIPmx5XJ0wzyORhvWwooQCUILlFr9ZQgVCR/PVswU+Z2iExZjst5IYpAtg+6Mf+5GaYNT0dEqqZlKxahug/Y8YzAdj0BOl2OxaVvcVYUsasQz/NesJ2eZ3ONtd6R246apY6yw5G3HRqlpLd5CPWcwVC7siHYFUhiJgsn1IOe7eqxdvgJS2Y4Y88IvCZov45jC/wCL6NgOvRfUN+Z35PWuAS2Y/rzVHJH7zqqQfhxKC401DJqgjaislozLxdHHX2QkC84PK9KNMh92+kdFJ+Z691CyvZaa8NdiMPjb+YS+QbyUOS8RZGWR//ffLihiEA+QPEh3t3i6WNHEVL+UuShUag/XqNTLdc1IkS2DUrZvwnXsYxJUYPdKpGSb3RvWtbrsCQIXJbtouTg5fkY7gntZUbRScJdSIZW7YHtYzkgTL1fGdshQvVLEB5VVsTfuStUtu60A3exrHZai6Zmg/gDvjbGoXEknvPD5+UUNYq2xcys3/uXjlQJaGKakmY93dyN4zdU6nbZ5OeBGapw83xHkk0mcSRLDl9eBZDdUjLF2h6U92AkrbbeEmxtN+SO4NcVnfZT+pyodN3ODEPupAw+5NxqbASMXzdZmt+TVclEAcpgHRVgb1JlInYFQT73XvTS6mywXV3bWR+GQdU9EtOXUImext/he/rEyeiVv2SRsCELCeQwxbcI1nKdqIcNKvKkJnXwupczwH/ig01Hw/nmyPAGBQMC/gLCGLMd+IzzlUFPF4EczcxCgqbhZKZko4K9HQpZy1Zr5SL5FXpAf9XN1KQUVCPJqQOlXkaW7je5RyQ6g/7odY6DXqP NkIz0imK tK2IpLdLCR4X4UT5453F92OW0nBedClreqlF+/XP7MpajRubEi+3kjRbv1TgzhByYFgNwmjduTULQV21Es1U5kB39uMkmt8VE8l+akQDLSKpeULYHWN432SiBXtiK7WyKKSmu X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The kdump kernel should not preserve or restore pages. 

Signed-off-by: Anthony Yznaga
---
 mm/pkram.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/mm/pkram.c b/mm/pkram.c
index d404e415f3cb..f38236e5d836 100644
--- a/mm/pkram.c
+++ b/mm/pkram.c
@@ -1,4 +1,5 @@
 // SPDX-License-Identifier: GPL-2.0
+#include
 #include
 #include
 #include
@@ -188,7 +189,7 @@ void __init pkram_reserve(void)
 {
 	int err = 0;
 
-	if (!pkram_sb_pfn)
+	if (!pkram_sb_pfn || is_kdump_kernel())
 		return;
 
 	pr_info("PKRAM: Examining preserved memory...\n");
@@ -285,6 +286,9 @@ static void pkram_show_banned(void)
 	int i;
 	unsigned long n, total = 0;
 
+	if (is_kdump_kernel())
+		return;
+
 	pr_info("PKRAM: banned regions:\n");
 	for (i = 0; i < nr_banned; i++) {
 		n = banned[i].end - banned[i].start + 1;
@@ -1334,7 +1338,7 @@ static int __init pkram_init_sb(void)
 
 static int __init pkram_init(void)
 {
-	if (pkram_init_sb()) {
+	if (!is_kdump_kernel() && pkram_init_sb()) {
 		register_reboot_notifier(&pkram_reboot_notifier);
 		register_shrinker(&banned_pages_shrinker, "pkram");
 		sysfs_update_group(kernel_kobj, &pkram_attr_group);

From patchwork Thu Apr 27 00:08:56 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Anthony Yznaga
X-Patchwork-Id: 13225041
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
 hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
 ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
 jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
 fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 20/21] x86/KASLR: PKRAM: support physical kaslr
Date: Wed, 26 Apr 2023 17:08:56 -0700
Message-Id: <1682554137-13938-21-git-send-email-anthony.yznaga@oracle.com>
X-Mailer: git-send-email 1.9.4
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>

Avoid regions of memory that contain preserved pages when computing
slots used to select where to put the decompressed kernel.
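
As a rough illustration of the kind of check this implies, here is a
standalone userspace sketch. It is not the code from this patch. It
assumes the preserved regions are kept sorted by ascending base address
and do not overlap one another, which lets the scan stop early. The
names struct range and first_overlap are hypothetical and exist only
for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a preserved region or a candidate image range. */
struct range {
	uint64_t base;
	uint64_t size;
};

/*
 * Return the first entry of regs[] that overlaps img, or NULL if none does.
 * Assumes regs[] is sorted by ascending base and its entries are disjoint,
 * so the scan can stop at the first region that starts at or past the end
 * of the image.
 */
const struct range *first_overlap(const struct range *img,
				  const struct range *regs, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		/* Region ends at or before the image starts: keep looking. */
		if (regs[i].base + regs[i].size <= img->base)
			continue;
		/* Region starts at or after the image ends: nothing overlaps. */
		if (regs[i].base >= img->base + img->size)
			return NULL;
		return &regs[i];
	}
	return NULL;
}
```

A caller would pass the candidate image range together with the array
of preserved regions. A non-NULL result is the region that the slot
computation has to step past before trying the next candidate.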
Signed-off-by: Anthony Yznaga --- arch/x86/boot/compressed/Makefile | 3 ++ arch/x86/boot/compressed/kaslr.c | 10 +++- arch/x86/boot/compressed/misc.h | 10 ++++ arch/x86/boot/compressed/pkram.c | 110 ++++++++++++++++++++++++++++++++++++++ mm/pkram.c | 2 +- 5 files changed, 132 insertions(+), 3 deletions(-) create mode 100644 arch/x86/boot/compressed/pkram.c diff --git a/arch/x86/boot/compressed/Makefile b/arch/x86/boot/compressed/Makefile index 6b6cfe607bdb..d9a5af94a797 100644 --- a/arch/x86/boot/compressed/Makefile +++ b/arch/x86/boot/compressed/Makefile @@ -103,6 +103,9 @@ ifdef CONFIG_X86_64 vmlinux-objs-$(CONFIG_AMD_MEM_ENCRYPT) += $(obj)/mem_encrypt.o vmlinux-objs-y += $(obj)/pgtable_64.o vmlinux-objs-$(CONFIG_AMD_MEM_ENCRYPT) += $(obj)/sev.o +ifdef CONFIG_RANDOMIZE_BASE + vmlinux-objs-$(CONFIG_PKRAM) += $(obj)/pkram.o +endif endif vmlinux-objs-$(CONFIG_ACPI) += $(obj)/acpi.o diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c index 454757fbdfe5..047b8b9a0799 100644 --- a/arch/x86/boot/compressed/kaslr.c +++ b/arch/x86/boot/compressed/kaslr.c @@ -436,6 +436,7 @@ static bool mem_avoid_overlap(struct mem_vector *img, struct setup_data *ptr; u64 earliest = img->start + img->size; bool is_overlapping = false; + struct mem_vector avoid; for (i = 0; i < MEM_AVOID_MAX; i++) { if (mem_overlaps(img, &mem_avoid[i]) && @@ -449,8 +450,6 @@ static bool mem_avoid_overlap(struct mem_vector *img, /* Avoid all entries in the setup_data linked list. 
*/ ptr = (struct setup_data *)(unsigned long)boot_params->hdr.setup_data; while (ptr) { - struct mem_vector avoid; - avoid.start = (unsigned long)ptr; avoid.size = sizeof(*ptr) + ptr->len; @@ -475,6 +474,12 @@ static bool mem_avoid_overlap(struct mem_vector *img, ptr = (struct setup_data *)(unsigned long)ptr->next; } + if (pkram_has_overlap(img, &avoid) && (avoid.start < earliest)) { + *overlap = avoid; + earliest = overlap->start; + is_overlapping = true; + } + return is_overlapping; } @@ -836,6 +841,7 @@ void choose_random_location(unsigned long input, return; } + pkram_init(); boot_params->hdr.loadflags |= KASLR_FLAG; if (IS_ENABLED(CONFIG_X86_32)) diff --git a/arch/x86/boot/compressed/misc.h b/arch/x86/boot/compressed/misc.h index 20118fb7c53b..01ff5e507064 100644 --- a/arch/x86/boot/compressed/misc.h +++ b/arch/x86/boot/compressed/misc.h @@ -124,6 +124,16 @@ static inline void console_init(void) { } #endif +#ifdef CONFIG_PKRAM +void pkram_init(void); +int pkram_has_overlap(struct mem_vector *entry, struct mem_vector *overlap); +#else +static inline void pkram_init(void) { } +static inline int pkram_has_overlap(struct mem_vector *entry, + struct mem_vector *overlap) +{ return 0; } +#endif + #ifdef CONFIG_AMD_MEM_ENCRYPT void sev_enable(struct boot_params *bp); void snp_check_features(void); diff --git a/arch/x86/boot/compressed/pkram.c b/arch/x86/boot/compressed/pkram.c new file mode 100644 index 000000000000..19267ca2ce8e --- /dev/null +++ b/arch/x86/boot/compressed/pkram.c @@ -0,0 +1,110 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include "misc.h" + +#define PKRAM_MAGIC 0x706B726D + +struct pkram_super_block { + __u32 magic; + + __u64 node_pfn; + __u64 region_list_pfn; + __u64 nr_regions; +}; + +struct pkram_region { + phys_addr_t base; + phys_addr_t size; +}; + +struct pkram_region_list { + __u64 prev_pfn; + __u64 next_pfn; + + struct pkram_region regions[0]; +}; + +#define PKRAM_REGIONS_LIST_MAX \ + ((PAGE_SIZE-sizeof(struct 
pkram_region_list))/sizeof(struct pkram_region)) + +static u64 pkram_sb_pfn; +static struct pkram_super_block *pkram_sb; + +void pkram_init(void) +{ + struct pkram_super_block *sb; + char arg[32]; + + if (cmdline_find_option("pkram", arg, sizeof(arg)) > 0) { + if (kstrtoull(arg, 16, &pkram_sb_pfn) != 0) + return; + } else + return; + + sb = (struct pkram_super_block *)(pkram_sb_pfn << PAGE_SHIFT); + if (sb->magic != PKRAM_MAGIC) { + debug_putstr("PKRAM: invalid super block\n"); + return; + } + + pkram_sb = sb; +} + +static struct pkram_region *pkram_first_region(struct pkram_super_block *sb, + struct pkram_region_list **rlp, int *idx) +{ + if (!sb || !sb->region_list_pfn) + return NULL; + + *rlp = (struct pkram_region_list *)(sb->region_list_pfn << PAGE_SHIFT); + *idx = 0; + + return &(*rlp)->regions[0]; +} + +static struct pkram_region *pkram_next_region(struct pkram_region_list **rlp, int *idx) +{ + struct pkram_region_list *rl = *rlp; + int i = *idx; + + i++; + if (i >= PKRAM_REGIONS_LIST_MAX) { + if (!rl->next_pfn) { + debug_putstr("PKRAM: no more pkram_region_list pages\n"); + return NULL; + } + rl = (struct pkram_region_list *)(rl->next_pfn << PAGE_SHIFT); + *rlp = rl; + i = 0; + } + *idx = i; + + if (rl->regions[i].size == 0) + return NULL; + + return &rl->regions[i]; +} + +int pkram_has_overlap(struct mem_vector *entry, struct mem_vector *overlap) +{ + struct pkram_region_list *rl; + struct pkram_region *r; + int idx; + + r = pkram_first_region(pkram_sb, &rl, &idx); + + while (r) { + if (r->base + r->size <= entry->start) { + r = pkram_next_region(&rl, &idx); + continue; + } + if (r->base >= entry->start + entry->size) + return 0; + + overlap->start = r->base; + overlap->size = r->size; + return 1; + } + + return 0; +} diff --git a/mm/pkram.c b/mm/pkram.c index f38236e5d836..a3e045b8dfe4 100644 --- a/mm/pkram.c +++ b/mm/pkram.c @@ -96,7 +96,7 @@ struct pkram_region_list { __u64 prev_pfn; __u64 next_pfn; - struct pkram_region regions[0]; + struct 
pkram_region regions[]; }; #define PKRAM_REGIONS_LIST_MAX \ From patchwork Thu Apr 27 00:08:57 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Anthony Yznaga X-Patchwork-Id: 13225042 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8C20EC7EE26 for ; Thu, 27 Apr 2023 00:10:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id CCFA66B0087; Wed, 26 Apr 2023 20:09:59 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id C57AC6B0088; Wed, 26 Apr 2023 20:09:59 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id AF7F86B0089; Wed, 26 Apr 2023 20:09:59 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 9A7056B0087 for ; Wed, 26 Apr 2023 20:09:59 -0400 (EDT) Received: from smtpin14.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 72344120460 for ; Thu, 27 Apr 2023 00:09:59 +0000 (UTC) X-FDA: 80725238118.14.5C47A29 Received: from mx0b-00069f02.pphosted.com (mx0b-00069f02.pphosted.com [205.220.177.32]) by imf01.hostedemail.com (Postfix) with ESMTP id 7B5DA4001A for ; Thu, 27 Apr 2023 00:09:56 +0000 (UTC) Authentication-Results: imf01.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2023-03-30 header.b=MOXjt11P; spf=pass (imf01.hostedemail.com: domain of anthony.yznaga@oracle.com designates 205.220.177.32 as permitted sender) smtp.mailfrom=anthony.yznaga@oracle.com; dmarc=pass (policy=none) header.from=oracle.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1682554196; 
h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:content-type: content-transfer-encoding:in-reply-to:in-reply-to: references:references:dkim-signature; bh=75Jn2zmafPDZhmtzI/Kq6AU5iLzwZOe7CoJGlihTNLk=; b=ixh2IWa/dNHkzy5rTnlfHzVBpU3gHiJZgEH8p7r+IMqK5WVR7KY95kB+nmH6WoBTXxIOxA p4J1KJWrp9PpNR/khPy6467FL8Nmt0CYj0gKaAXo3ZzI7S9YBjeiXGVVMh03DcLbudDg1S k/ZR0Ora1EyQXEpxrmNCFHS+cYZrTVg= ARC-Authentication-Results: i=1; imf01.hostedemail.com; dkim=pass header.d=oracle.com header.s=corp-2023-03-30 header.b=MOXjt11P; spf=pass (imf01.hostedemail.com: domain of anthony.yznaga@oracle.com designates 205.220.177.32 as permitted sender) smtp.mailfrom=anthony.yznaga@oracle.com; dmarc=pass (policy=none) header.from=oracle.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1682554196; a=rsa-sha256; cv=none; b=G1SzpgEvL3FIaRZX8xNgWI+LfKYvMJvhyN2O+8Qlld/WOt1rlbvc2O+3EoU0rBd8bx4muH T2VgZYB2noKz7f1WynMNKQtQgoOXwYbQ1cVpkhIHbEHnCinjy8Imbn++DPvBmiLTGrp+2E L+gyKaRW2gxurDDrQ3FQ5xn2qDDbOFQ= Received: from pps.filterd (m0246631.ppops.net [127.0.0.1]) by mx0b-00069f02.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 33QGwqlV017095; Thu, 27 Apr 2023 00:09:37 GMT DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=oracle.com; h=from : to : cc : subject : date : message-id : in-reply-to : references; s=corp-2023-03-30; bh=75Jn2zmafPDZhmtzI/Kq6AU5iLzwZOe7CoJGlihTNLk=; b=MOXjt11P2QQBa/6vgt8DMX+0+ZgqCUaivtCph8kNnTLFgBTbUFLihY8agUGoQkhqCOXL 0r3HEqV+4zssxzMNDVvjT0WWpU/tglDW6uKbJjRCvwYi199DE6+ohLNKqSrT7E5STFoX Jw9AKL4Zay68MyY1E2qTcYNTMMhyg/I5Ydd/AJyxqffV1b140ubkZsm9S9ROeOCfNvJy M3Zen9xmKgd3oPhKdrXmbDlq0nHMqHdvv6wJphA8J14HaVDW+6xqRy2BukjvxwL1tWMV OM2ISdxQsb1f1iT79Xjrs0hdyaPvf1E7CzCZ7AE3hZdsxWNpSI8UrQ6+k8mIlW6u0G98 Dw== Received: from phxpaimrmta01.imrmtpd1.prodappphxaev1.oraclevcn.com (phxpaimrmta01.appoci.oracle.com [138.1.114.2]) by mx0b-00069f02.pphosted.com (PPS) with ESMTPS id 3q46c4arvx-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 
From: Anthony Yznaga
To: linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, x86@kernel.org,
 hpa@zytor.com, dave.hansen@linux.intel.com, luto@kernel.org,
 peterz@infradead.org, rppt@kernel.org, akpm@linux-foundation.org,
 ebiederm@xmission.com, keescook@chromium.org, graf@amazon.com,
 jason.zeng@intel.com, lei.l.li@intel.com, steven.sistare@oracle.com,
 fam.zheng@bytedance.com, mgalaxy@akamai.com, kexec@lists.infradead.org
Subject: [RFC v3 21/21] x86/boot/compressed/64: use 1GB pages for mappings
Date: Wed, 26 Apr 2023 17:08:57 -0700
Message-Id: <1682554137-13938-22-git-send-email-anthony.yznaga@oracle.com>
In-Reply-To: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>
References: <1682554137-13938-1-git-send-email-anthony.yznaga@oracle.com>

When called via mem_avoid_overlap(), the pkram kaslr code can incur
multiple page faults as it walks its preserved ranges list.
These faults can easily exhaust the small number of pages available
for page table allocations. This patch hacks things so that mappings
are 1GB, which requires far fewer page table pages. As is, this breaks
AMD SEV-ES, which expects the mappings to be 2M. This could possibly be
fixed by updating the split code to split 1GB pages, if there aren't
any other issues with using 1GB mappings.

Signed-off-by: Anthony Yznaga
---
 arch/x86/boot/compressed/ident_map_64.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/boot/compressed/ident_map_64.c b/arch/x86/boot/compressed/ident_map_64.c
index 321a5011042d..1e02cf6dda3c 100644
--- a/arch/x86/boot/compressed/ident_map_64.c
+++ b/arch/x86/boot/compressed/ident_map_64.c
@@ -95,8 +95,8 @@ void kernel_add_identity_map(unsigned long start, unsigned long end)
 	int ret;
 
 	/* Align boundary to 2M. */
-	start = round_down(start, PMD_SIZE);
-	end = round_up(end, PMD_SIZE);
+	start = round_down(start, PUD_SIZE);
+	end = round_up(end, PUD_SIZE);
 	if (start >= end)
 		return;
 
@@ -120,6 +120,7 @@ void initialize_identity_maps(void *rmode)
 	mapping_info.context = &pgt_data;
 	mapping_info.page_flag = __PAGE_KERNEL_LARGE_EXEC | sme_me_mask;
 	mapping_info.kernpg_flag = _KERNPG_TABLE;
+	mapping_info.direct_gbpages = true;
 
 	/*
 	 * It should be impossible for this not to already be true,
@@ -365,8 +366,8 @@ void do_boot_page_fault(struct pt_regs *regs, unsigned long error_code)
 
 	ghcb_fault = sev_es_check_ghcb_fault(address);
 
-	address &= PMD_MASK;
-	end = address + PMD_SIZE;
+	address &= PUD_MASK;
+	end = address + PUD_SIZE;
 
 	/*
 	 * Check for unexpected error codes. Unexpected are: