From patchwork Fri Dec 22 19:51:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Graf X-Patchwork-Id: 13503779 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 90F81C41535 for ; Fri, 22 Dec 2023 19:52:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:CC:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=NCDzEYyoDjyqdBFMgWxrj6HYgxmMv7COcx05TlNHd5s=; b=dBTRROztcospL8 COjpjR9bDaF9f7QMqydc/3+Em5vwY6b5oY/r6pya9ytePfJIR2Q2EP8fVSHan8X2+lu0wYbylGA4E ovmG7/qo55tuU+1BK/RbPScYTNBP7pzRNtzb7dFPlawRuL+KuDTNgVGsB268CziXPqheYjl5XC/xV Zfr0CO5Ak817XJTSF1pv1ZC02Ny1GO6DnLcfXQ8yULs1uRFXq+oDHYtxEBv4CsDyf2X4Ei9ReBiHm Cx8nC8ZLCapyJwFgS94bTPB3g4fF2K3Urif8hp2fZlQvot7vIO+PT19ovCRVXnbFvI2bwms7DB4Jz EFhUFk/y1zhZAzu4mV2Q==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rGlZ3-006kdr-2E; Fri, 22 Dec 2023 19:52:09 +0000 Received: from smtp-fw-52002.amazon.com ([52.119.213.150]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rGlZ0-006kZh-0t; Fri, 22 Dec 2023 19:52:08 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1703274726; x=1734810726; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=ZLYCMQzZdYknax/wRIALOVvHrvD35sO3OPXkV3K/85M=; b=RfJ/NO0hqDGmxqMcyh8oWXjCg6Xq/O5ElXQukb5LhBwkgL9VvIkKm/tw 5Ulg0KVf3CIQBgmAR7rqTlrXxBinC8g/Mm2nLmECkllmQFRpoN36nisy1 8R2zxf290trWIbcCK+7eKOEDOKpr8ZfZ54AfoOfvMxEsQmZuVw89gLpSH s=; X-IronPort-AV: E=Sophos;i="6.04,297,1695686400"; d="scan'208";a="602627198" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-iad-1d-m6i4x-d8e96288.us-east-1.amazon.com) ([10.43.8.6]) by smtp-border-fw-52002.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Dec 2023 19:51:58 +0000 Received: from smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev (iad7-ws-svc-p70-lb3-vlan3.iad.amazon.com [10.32.235.38]) by email-inbound-relay-iad-1d-m6i4x-d8e96288.us-east-1.amazon.com (Postfix) with ESMTPS id 5710B803B7; Fri, 22 Dec 2023 19:51:51 +0000 (UTC) Received: from EX19MTAUWC002.ant.amazon.com [10.0.7.35:35926] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.56.23:2525] with esmtp (Farcaster) id 82a6dd69-7016-4d64-b557-324d614fe9ef; Fri, 22 Dec 2023 19:51:50 +0000 (UTC) X-Farcaster-Flow-ID: 82a6dd69-7016-4d64-b557-324d614fe9ef Received: from EX19D020UWC004.ant.amazon.com (10.13.138.149) by EX19MTAUWC002.ant.amazon.com (10.250.64.143) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:51:49 +0000 Received: from dev-dsk-graf-1a-5ce218e4.eu-west-1.amazon.com (10.253.83.51) by EX19D020UWC004.ant.amazon.com (10.13.138.149) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:51:45 +0000 From: Alexander Graf To: CC: , , , , , , , Eric Biederman , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , "Rob Herring" , Steven Rostedt , "Andrew Morton" , Mark Rutland , "Tom Lendacky" , Ashish Kalra , James Gowans , Stanislav Kinsburskii , , , , Anthony Yznaga , Usama Arif , David Woodhouse , Benjamin Herrenschmidt Subject: [PATCH v2 06/17] kexec: Add config option for KHO Date: Fri, 22 Dec 2023 19:51:33 +0000 Message-ID: <20231222195144.24532-1-graf@amazon.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20231222193607.15474-1-graf@amazon.com> References: <20231222193607.15474-1-graf@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.253.83.51] X-ClientProxiedBy: EX19D032UWA002.ant.amazon.com (10.13.139.81) To EX19D020UWC004.ant.amazon.com (10.13.138.149) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231222_115206_442859_62394748 X-CRM114-Status: GOOD ( 13.88 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org We have all generic code in place now to support Kexec with KHO. This patch adds a config option that depends on architecture support to enable KHO support. Signed-off-by: Alexander Graf --- kernel/Kconfig.kexec | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/kernel/Kconfig.kexec b/kernel/Kconfig.kexec index 2fd510256604..909ab28f1341 100644 --- a/kernel/Kconfig.kexec +++ b/kernel/Kconfig.kexec @@ -91,6 +91,19 @@ config KEXEC_JUMP Jump between original kernel and kexeced kernel and invoke code in physical address mode via KEXEC +config KEXEC_KHO + bool "kexec handover" + depends on ARCH_SUPPORTS_KEXEC_KHO + depends on KEXEC + select MEMBLOCK_SCRATCH + select LIBFDT + select CMA + help + Allow kexec to hand over state across kernels by generating and + passing additional metadata to the target kernel. This is useful + to keep data or state alive across the kexec. For this to work, + both source and target kernels need to have this option enabled. + config CRASH_DUMP bool "kernel crash dumps" depends on ARCH_SUPPORTS_CRASH_DUMP From patchwork Fri Dec 22 19:51:34 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Graf X-Patchwork-Id: 13503781 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C5504C41535 for ; Fri, 22 Dec 2023 19:52:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:CC:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=t/HoNvYtVnUHYbr/gqUTWzQx+//j78PRIUS868a7Zt8=; b=0J0QJ5PbjiDA30 Uv6741g2U16IT3TdAHIUi+4popkY32xr0U4HfRka059OkxozzKEz55Pv24cUgtW9vWhRqwpCRQiC6 aIlGFSvhxs5tt3IZXETZbtUpVOHy3w0gSiZgI7F5kIiJosqfu6n+iiaKg7+A77MM6xoll2XocOdnE zuQD6tS9wpWQeU+HGU8iiePEoOCQmCLT5BALqzrAI0NYxXUgCjhZZ18sKzcA0Tno8RVvZJsgxKJw9 BgdNsd3GFBgY8mgd377iydyw+4PXJnYlD3wt4gw4WOVtlzKY/92Aj89f/IrxRSao/g16v89oXdggp kLgFIZejNb7BZT7Pw64w==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rGlZC-006kj2-0a; Fri, 22 Dec 2023 19:52:18 +0000 Received: from smtp-fw-52004.amazon.com ([52.119.213.154]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rGlZ1-006kZq-0V; Fri, 22 Dec 2023 19:52:09 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1703274727; x=1734810727; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Wy3QrJqbxk6W3Bwc5BKomfQATEZZd0nBtswCXV4R7vM=; b=aAH88/pOd4gXdkIpSRrKXAAMbr44818mzOTvUqfid033YTcH5d0sr06G mb0sWaeT3f3TbbM+2OldcgqvQkkafV2k5T6wfFaSPV66kqkWD4Bvrr8s4 khbaiYjao8voEO3mHUNxEk75s4d5mv+PmAVl2/s3ga0YWn+SQvAzX60PX 8=; X-IronPort-AV: E=Sophos;i="6.04,297,1695686400"; d="scan'208";a="174097101" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO email-inbound-relay-iad-1d-m6i4x-e651a362.us-east-1.amazon.com) ([10.43.8.2]) by smtp-border-fw-52004.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Dec 2023 19:52:01 +0000 Received: from smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev (iad7-ws-svc-p70-lb3-vlan3.iad.amazon.com [10.32.235.38]) by email-inbound-relay-iad-1d-m6i4x-e651a362.us-east-1.amazon.com (Postfix) with ESMTPS id 6EDEC8053C; Fri, 22 Dec 2023 19:51:54 +0000 (UTC) Received: from EX19MTAUWA001.ant.amazon.com [10.0.7.35:11667] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.56.23:2525] with esmtp (Farcaster) id 7869ce95-383d-43dc-8bf7-e44da388af62; Fri, 22 Dec 2023 19:51:53 +0000 (UTC) X-Farcaster-Flow-ID: 7869ce95-383d-43dc-8bf7-e44da388af62 Received: from EX19D020UWC004.ant.amazon.com (10.13.138.149) by EX19MTAUWA001.ant.amazon.com (10.250.64.204) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:51:53 +0000 Received: from dev-dsk-graf-1a-5ce218e4.eu-west-1.amazon.com (10.253.83.51) by EX19D020UWC004.ant.amazon.com (10.13.138.149) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:51:49 +0000 From: Alexander Graf To: CC: , , , , , , , Eric Biederman , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , "Rob Herring" , Steven Rostedt , "Andrew Morton" , Mark Rutland , "Tom Lendacky" , Ashish Kalra , James Gowans , Stanislav Kinsburskii , , , , Anthony Yznaga , Usama Arif , David Woodhouse , Benjamin Herrenschmidt Subject: [PATCH v2 07/17] kexec: Add documentation for KHO Date: Fri, 22 Dec 2023 19:51:34 +0000 Message-ID: <20231222195144.24532-2-graf@amazon.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20231222195144.24532-1-graf@amazon.com> References: <20231222193607.15474-1-graf@amazon.com> <20231222195144.24532-1-graf@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.253.83.51] X-ClientProxiedBy: EX19D032UWA002.ant.amazon.com (10.13.139.81) To EX19D020UWC004.ant.amazon.com (10.13.138.149) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231222_115207_323209_807CB0BA X-CRM114-Status: GOOD ( 35.79 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org With KHO in place, let's add documentation that describes what it is and how to use it. Signed-off-by: Alexander Graf --- Documentation/kho/concepts.rst | 88 ++++++++++++++++++++++++++++++++ Documentation/kho/index.rst | 19 +++++++ Documentation/kho/usage.rst | 57 +++++++++++++++++++++ Documentation/subsystem-apis.rst | 1 + 4 files changed, 165 insertions(+) create mode 100644 Documentation/kho/concepts.rst create mode 100644 Documentation/kho/index.rst create mode 100644 Documentation/kho/usage.rst diff --git a/Documentation/kho/concepts.rst b/Documentation/kho/concepts.rst new file mode 100644 index 000000000000..8e4fe8c57865 --- /dev/null +++ b/Documentation/kho/concepts.rst @@ -0,0 +1,88 @@ +.. SPDX-License-Identifier: GPL-2.0-or-later + +======================= +Kexec Handover Concepts +======================= + +Kexec HandOver (KHO) is a mechanism that allows Linux to preserve state - +arbitrary properties as well as memory locations - across kexec. + +It introduces multiple concepts: + +KHO Device Tree +--------------- + +Every KHO kexec carries a KHO specific flattened device tree blob that +describes the state of the system. Device drivers can register to KHO to +serialize their state before kexec. After KHO, device drivers can read +the device tree and extract previous state. + +KHO only uses the fdt container format and libfdt library, but does not +adhere to the same property semantics that normal device trees do: Properties +are passed in native endianness and standardized properties like ``regs`` and +``ranges`` do not exist, hence there are no ``#...-cells`` properties. + +KHO introduces a new concept to its device tree: ``mem`` properties. A +``mem`` property can inside any subnode in the device tree. When present, +it contains an array of physical memory ranges that the new kernel must mark +as reserved on boot. It is recommended, but not required, to make these ranges +as physically contiguous as possible to reduce the number of array elements :: + + struct kho_mem { + __u64 addr; + __u64 len; + }; + +After boot, drivers can call the kho subsystem to transfer ownership of memory +that was reserved via a ``mem`` property to themselves to continue using memory +from the previous execution. + +The KHO device tree follows the in-Linux schema requirements. Any element in +the device tree is documented via device tree schema yamls that explain what +data gets transferred. + +Mem cache +--------- + +The new kernel needs to know about all memory reservations, but is unable to +parse the device tree yet in early bootup code because of memory limitations. +To simplify the initial memory reservation flow, the old kernel passes a +preprocessed array of physically contiguous reserved ranges to the new kernel. + +These reservations have to be separate from architectural memory maps and +reservations because they differ on every kexec, while the architectural ones +get passed directly between invocations. + +The less entries this cache contains, the faster the new kernel will boot. + +Scratch Region +-------------- + +To boot into kexec, we need to have a physically contiguous memory range that +contains no handed over memory. Kexec then places the target kernel and initrd +into that region. The new kernel exclusively uses this region for memory +allocations before it ingests the mem cache. + +We guarantee that we always have such a region through the scratch region: On +first boot, you can pass the ``kho_scratch`` kernel command line option. When +it is set, Linux allocates a CMA region of the given size. CMA gives us the +guarantee that no handover pages land in that region, because handover +pages must be at a static physical memory location and CMA enforces that +only movable pages can be located inside. + +After KHO kexec, we ignore the ``kho_scratch`` kernel command line option and +instead reuse the exact same region that was originally allocated. This allows +us to recursively execute any amount of KHO kexecs. Because we used this region +for boot memory allocations and as target memory for kexec blobs, some parts +of that memory region may be reserved. These reservations are irrenevant for +the next KHO, because kexec can overwrite even the original kernel. + +KHO active phase +---------------- + +To enable user space based kexec file loader, the kernel needs to be able to +provide the device tree that describes the previous kernel's state before +performing the actual kexec. The process of generating that device tree is +called serialization. When the device tree is generated, some properties +of the system may become immutable because they are already written down +in the device tree. That state is called the KHO active phase. diff --git a/Documentation/kho/index.rst b/Documentation/kho/index.rst new file mode 100644 index 000000000000..5e7eeeca8520 --- /dev/null +++ b/Documentation/kho/index.rst @@ -0,0 +1,19 @@ +.. SPDX-License-Identifier: GPL-2.0-or-later + +======================== +Kexec Handover Subsystem +======================== + +.. toctree:: + :maxdepth: 1 + + concepts + usage + +.. only:: subproject and html + + + Indices + ======= + + * :ref:`genindex` diff --git a/Documentation/kho/usage.rst b/Documentation/kho/usage.rst new file mode 100644 index 000000000000..5efa2a58f9c3 --- /dev/null +++ b/Documentation/kho/usage.rst @@ -0,0 +1,57 @@ +.. SPDX-License-Identifier: GPL-2.0-or-later + +==================== +Kexec Handover Usage +==================== + +Kexec HandOver (KHO) is a mechanism that allows Linux to preserve state - +arbitrary properties as well as memory locations - across kexec. + +This document expects that you are familiar with the base KHO +:ref:`Documentation/kho/concepts.rst `. If you have not read +them yet, please do so now. + +Prerequisites +------------- + +KHO is available when the ``CONFIG_KEXEC_KHO`` config option is set to y +at compile team. Every KHO producer has its own config option that you +need to enable if you would like to preserve their respective state across +kexec. + +To use KHO, please boot the kernel with the ``kho_scratch`` command +line parameter set to allocate a scratch region. For example +``kho_scratch=512M`` will reserve a 512 MiB scratch region on boot. + +Perform a KHO kexec +------------------- + +Before you can perform a KHO kexec, you need to move the system into the +:ref:`Documentation/kho/concepts.rst ` :: + + $ echo 1 > /sys/kernel/kho/active + +After this command, the KHO device tree is available in ``/sys/kernel/kho/dt``. + +Next, load the target payload and kexec into it. It is important that you +use the ``-s`` parameter to use the in-kernel kexec file loader, as user +space kexec tooling currently has no support for KHO with the user space +based file loader :: + + # kexec -l Image --initrd=initrd -s + # kexec -e + +The new kernel will boot up and contain some of the previous kernel's state. + +For example, if you enabled ``CONFIG_FTRACE_KHO``, the new kernel will contain +the old kernel's trace buffers in ``/sys/kernel/debug/tracing/trace``. + +Abort a KHO exec +---------------- + +You can move the system out of KHO active phase again by calling :: + + $ echo 1 > /sys/kernel/kho/active + +After this command, the KHO device tree is no longer available in +``/sys/kernel/kho/dt``. diff --git a/Documentation/subsystem-apis.rst b/Documentation/subsystem-apis.rst index 930dc23998a0..8207b6514d87 100644 --- a/Documentation/subsystem-apis.rst +++ b/Documentation/subsystem-apis.rst @@ -86,3 +86,4 @@ Storage interfaces misc-devices/index peci/index wmi/index + kho/index From patchwork Fri Dec 22 19:51:35 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Graf X-Patchwork-Id: 13503780 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id F3044C41535 for ; Fri, 22 Dec 2023 19:52:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:CC:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=S8+v/wuUvg3VMbi+flJz3CFux0D9mq/ffK4BEBYiZAY=; b=4aaVhqPIVIU3gf EaJXLsdAbAAre/idA8628Y5fgs+m1TO9xWMVzIPsN4MS5zvSiBPPuVDimPOFIGkxndMzOhtevscVX ZusD2xqwkzEXstk2nfQx2zTt87+t6TB+mkr6DzoCKAHZ1md4jLFXayy5SXE9KQyFZ1DmeilwFjZ/Z nNv/IwlUc+w0VNJ4faD/dmBdS12BealCH3SMOjoqPGvDz4eVzqTNZ65Ngowcl8kcYBSK7rI9jmMZP pELqyxMElQK24vERSG0U8sRxL5jnznbWyPDF9oQhUXsY7kAjvwG/isH41THCv5KwbxQqnPvh30b0B v4PgVocWMQ3ALNi0v+lQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rGlZC-006kjc-2n; Fri, 22 Dec 2023 19:52:18 +0000 Received: from smtp-fw-52004.amazon.com ([52.119.213.154]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rGlZ1-006kaC-0V; Fri, 22 Dec 2023 19:52:09 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1703274727; x=1734810727; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4hs41nrvxXplJ/QyH6HSlHLbtIDrbQgqX7R8fKuobQ4=; b=GSgDevbQB4A6OU5u44sgzH+MP7sYGJv+DEC+nvLiC0DHrmcmHu9nZWvG PHW3VqX4XjpyBSIzYq0jSjVghL3dXv4x/NxZnxRQO40UtYqGsTqIJvxJw YUAI52FQm8zbhvL4iFDlEJjGCJGdHBwQFMgiCAAymvPMdedQZsbd/DsUv w=; X-IronPort-AV: E=Sophos;i="6.04,297,1695686400"; d="scan'208";a="174097104" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO email-inbound-relay-iad-1a-m6i4x-9fe6ad2f.us-east-1.amazon.com) ([10.43.8.2]) by smtp-border-fw-52004.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Dec 2023 19:52:05 +0000 Received: from smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev (iad7-ws-svc-p70-lb3-vlan3.iad.amazon.com [10.32.235.38]) by email-inbound-relay-iad-1a-m6i4x-9fe6ad2f.us-east-1.amazon.com (Postfix) with ESMTPS id 19E84808C4; Fri, 22 Dec 2023 19:51:57 +0000 (UTC) Received: from EX19MTAUWC001.ant.amazon.com [10.0.7.35:38232] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.16.72:2525] with esmtp (Farcaster) id 11d71bf8-2bc3-45a5-aa9e-d1953eed09b2; Fri, 22 Dec 2023 19:51:57 +0000 (UTC) X-Farcaster-Flow-ID: 11d71bf8-2bc3-45a5-aa9e-d1953eed09b2 Received: from EX19D020UWC004.ant.amazon.com (10.13.138.149) by EX19MTAUWC001.ant.amazon.com (10.250.64.174) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:51:57 +0000 Received: from dev-dsk-graf-1a-5ce218e4.eu-west-1.amazon.com (10.253.83.51) by EX19D020UWC004.ant.amazon.com (10.13.138.149) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:51:53 +0000 From: Alexander Graf To: CC: , , , , , , , Eric Biederman , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , "Rob Herring" , Steven Rostedt , "Andrew Morton" , Mark Rutland , "Tom Lendacky" , Ashish Kalra , James Gowans , Stanislav Kinsburskii , , , , Anthony Yznaga , Usama Arif , David Woodhouse , Benjamin Herrenschmidt Subject: [PATCH v2 08/17] arm64: Add KHO support Date: Fri, 22 Dec 2023 19:51:35 +0000 Message-ID: <20231222195144.24532-3-graf@amazon.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20231222195144.24532-1-graf@amazon.com> References: <20231222193607.15474-1-graf@amazon.com> <20231222195144.24532-1-graf@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.253.83.51] X-ClientProxiedBy: EX19D032UWA002.ant.amazon.com (10.13.139.81) To EX19D020UWC004.ant.amazon.com (10.13.138.149) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231222_115207_353990_FF439A5A X-CRM114-Status: GOOD ( 21.24 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org We now have all bits in place to support KHO kexecs. This patch adds awareness of KHO in the kexec file as well as boot path for arm64 and adds the respective kconfig option to the architecture so that it can use KHO successfully. Signed-off-by: Alexander Graf --- v1 -> v2: - test bot warning fix - Change kconfig option to ARCH_SUPPORTS_KEXEC_KHO - s/kho_reserve_mem/kho_reserve_previous_mem/g - s/kho_reserve/kho_reserve_scratch/g - Remove / reduce ifdefs for kho fdt code --- arch/arm64/Kconfig | 3 +++ arch/arm64/kernel/setup.c | 2 ++ arch/arm64/mm/init.c | 8 ++++++ drivers/of/fdt.c | 39 ++++++++++++++++++++++++++++ drivers/of/kexec.c | 54 +++++++++++++++++++++++++++++++++++++++ 5 files changed, 106 insertions(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index 7b071a00425d..4a2fd3deaa16 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -1495,6 +1495,9 @@ config ARCH_SUPPORTS_KEXEC_IMAGE_VERIFY_SIG config ARCH_DEFAULT_KEXEC_IMAGE_VERIFY_SIG def_bool y +config ARCH_SUPPORTS_KEXEC_KHO + def_bool y + config ARCH_SUPPORTS_CRASH_DUMP def_bool y diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c index 417a8a86b2db..9aa05b84d202 100644 --- a/arch/arm64/kernel/setup.c +++ b/arch/arm64/kernel/setup.c @@ -346,6 +346,8 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p) paging_init(); + kho_reserve_previous_mem(); + acpi_table_upgrade(); /* Parse the ACPI tables for possible boot-time configuration */ diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c index 74c1db8ce271..1a8fc91509af 100644 --- a/arch/arm64/mm/init.c +++ b/arch/arm64/mm/init.c @@ -358,6 +358,8 @@ void __init bootmem_init(void) */ arch_reserve_crashkernel(); + kho_reserve_scratch(); + memblock_dump_all(); } @@ -386,6 +388,12 @@ void __init mem_init(void) /* this will put all unused low memory onto the freelists */ memblock_free_all(); + /* + * Now that all KHO pages are marked as reserved, let's flip them back + * to normal pages with accurate refcount. + */ + kho_populate_refcount(); + /* * Check boundaries twice: Some fundamental inconsistencies can be * detected at build time already. diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c index bf502ba8da95..f9b9a36fb722 100644 --- a/drivers/of/fdt.c +++ b/drivers/of/fdt.c @@ -1006,6 +1006,42 @@ void __init early_init_dt_check_for_usable_mem_range(void) memblock_add(rgn[i].base, rgn[i].size); } +/** + * early_init_dt_check_kho - Decode info required for kexec handover from DT + */ +static void __init early_init_dt_check_kho(void) +{ + unsigned long node = chosen_node_offset; + u64 kho_start, scratch_start, scratch_size, mem_start, mem_size; + const __be32 *p; + int l; + + if (!IS_ENABLED(CONFIG_KEXEC_KHO) || (long)node < 0) + return; + + p = of_get_flat_dt_prop(node, "linux,kho-dt", &l); + if (l != (dt_root_addr_cells + dt_root_size_cells) * sizeof(__be32)) + return; + + kho_start = dt_mem_next_cell(dt_root_addr_cells, &p); + + p = of_get_flat_dt_prop(node, "linux,kho-scratch", &l); + if (l != (dt_root_addr_cells + dt_root_size_cells) * sizeof(__be32)) + return; + + scratch_start = dt_mem_next_cell(dt_root_addr_cells, &p); + scratch_size = dt_mem_next_cell(dt_root_addr_cells, &p); + + p = of_get_flat_dt_prop(node, "linux,kho-mem", &l); + if (l != (dt_root_addr_cells + dt_root_size_cells) * sizeof(__be32)) + return; + + mem_start = dt_mem_next_cell(dt_root_addr_cells, &p); + mem_size = dt_mem_next_cell(dt_root_addr_cells, &p); + + kho_populate(kho_start, scratch_start, scratch_size, mem_start, mem_size); +} + #ifdef CONFIG_SERIAL_EARLYCON int __init early_init_dt_scan_chosen_stdout(void) @@ -1304,6 +1340,9 @@ void __init early_init_dt_scan_nodes(void) /* Handle linux,usable-memory-range property */ early_init_dt_check_for_usable_mem_range(); + + /* Handle kexec handover */ + early_init_dt_check_kho(); } bool __init early_init_dt_scan(void *params) diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c index 68278340cecf..59070b09ad45 100644 --- a/drivers/of/kexec.c +++ b/drivers/of/kexec.c @@ -264,6 +264,55 @@ static inline int setup_ima_buffer(const struct kimage *image, void *fdt, } #endif /* CONFIG_IMA_KEXEC */ +static int kho_add_chosen(const struct kimage *image, void *fdt, int chosen_node) +{ + void *dt = NULL; + phys_addr_t dt_mem = 0; + phys_addr_t dt_len = 0; + phys_addr_t scratch_mem = 0; + phys_addr_t scratch_len = 0; + void *mem_cache = NULL; + phys_addr_t mem_cache_mem = 0; + phys_addr_t mem_cache_len = 0; + int ret = 0; + +#ifdef CONFIG_KEXEC_KHO + dt = image->kho.dt.buffer; + dt_mem = image->kho.dt.mem; + dt_len = image->kho.dt.bufsz; + + scratch_mem = kho_scratch_phys; + scratch_len = kho_scratch_len; + + mem_cache = image->kho.mem_cache.buffer; + mem_cache_mem = image->kho.mem_cache.mem; + mem_cache_len = image->kho.mem_cache.bufsz; +#endif + + if (!dt || !mem_cache) + goto out; + + pr_debug("Adding kho metadata to DT"); + + ret = fdt_appendprop_addrrange(fdt, 0, chosen_node, "linux,kho-dt", + dt_mem, dt_len); + if (ret) + goto out; + + ret = fdt_appendprop_addrrange(fdt, 0, chosen_node, "linux,kho-scratch", + scratch_mem, scratch_len); + if (ret) + goto out; + + ret = fdt_appendprop_addrrange(fdt, 0, chosen_node, "linux,kho-mem", + mem_cache_mem, mem_cache_len); + if (ret) + goto out; + +out: + return ret; +} + /* * of_kexec_alloc_and_setup_fdt - Alloc and setup a new Flattened Device Tree * @@ -412,6 +461,11 @@ void *of_kexec_alloc_and_setup_fdt(const struct kimage *image, } } + /* Add kho metadata if this is a KHO image */ + ret = kho_add_chosen(image, fdt, chosen_node); + if (ret) + goto out; + /* add bootargs */ if (cmdline) { ret = fdt_setprop_string(fdt, chosen_node, "bootargs", cmdline); From patchwork Fri Dec 22 19:51:36 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Graf X-Patchwork-Id: 13503783 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id C774BC41535 for ; Fri, 22 Dec 2023 19:53:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:CC:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=M2mF3JkJ2jzu6mW7KEbNIM45qG4sWmoOGBcpx54duJ8=; b=LDLsYpH9eUdUhV fAt2ikjEfaXZ60L+jYmPwRDfUd3NWZRUqlnLJP+8/nh4Hgaoy7YOzO+EBOM/0kXm1jHK+rk9m180S 7fLTsNOBcpglbBOtOCo6R6XH6VFrds0Qve2ueKMA0Fz4SXg1kANhU5ELDbgU6z1f9+75VeFjBLwvA j1OA/YcrQx66S/5RkG+IKgQmWpnfmXHoD2lYxP0oYejZYvocTYbDJoymTBd59ONEvr5ZpsOfzCWRM NYwIUUq9slABZq9aUudk8nB93PdNdKdFgieGUpl8CFzHIZtWzRWo2GoOamiRPtRTJhnvGL26wdaCt sO+auMbrXbI3AHuJhSqQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rGlZb-006l1k-1s; Fri, 22 Dec 2023 19:52:43 +0000 Received: from smtp-fw-9102.amazon.com ([207.171.184.29]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rGlZR-006kpb-2e; Fri, 22 Dec 2023 19:52:40 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1703274754; x=1734810754; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=GaYoJtOeVSzgmYS5gukWExN0RDseJvcoWD78p448UjE=; b=bd598Tu5flRZ0Sg4V98pvl6ug/VdK1jyANeUR4yp9EYw8SNSJz9s8jiU 3cpuV8eBp1EWhRQPzWQ7Utb7iv2qlFVJCLi2eKDSPlxtXcpYM0LgSh6Tn 0DWb+Mvm3Quj8pICKCorjZVVvmJNWe7N4X2XIC930FE9RFxu4td+LP0F5 c=; X-IronPort-AV: E=Sophos;i="6.04,297,1695686400"; d="scan'208";a="385308504" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO email-inbound-relay-iad-1a-m6i4x-617e30c2.us-east-1.amazon.com) ([10.25.36.214]) by smtp-border-fw-9102.sea19.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Dec 2023 19:52:32 +0000 Received: from smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev (iad7-ws-svc-p70-lb3-vlan2.iad.amazon.com [10.32.235.34]) by email-inbound-relay-iad-1a-m6i4x-617e30c2.us-east-1.amazon.com (Postfix) with ESMTPS id 15BEA697D5; Fri, 22 Dec 2023 19:52:23 +0000 (UTC) Received: from EX19MTAUWB002.ant.amazon.com [10.0.7.35:63687] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.27.188:2525] with esmtp (Farcaster) id cb4e22c4-a183-404a-86f3-2ce51b4d82d5; Fri, 22 Dec 2023 19:52:23 +0000 (UTC) X-Farcaster-Flow-ID: cb4e22c4-a183-404a-86f3-2ce51b4d82d5 Received: from EX19D020UWC004.ant.amazon.com (10.13.138.149) by EX19MTAUWB002.ant.amazon.com (10.250.64.231) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:52:23 +0000 Received: from dev-dsk-graf-1a-5ce218e4.eu-west-1.amazon.com (10.253.83.51) by EX19D020UWC004.ant.amazon.com (10.13.138.149) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:52:19 +0000 From: Alexander Graf To: CC: , , , , , , , Eric Biederman , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , "Rob Herring" , Steven Rostedt , "Andrew Morton" , Mark Rutland , "Tom Lendacky" , Ashish Kalra , James Gowans , Stanislav Kinsburskii , , , , Anthony Yznaga , Usama Arif , David Woodhouse , Benjamin Herrenschmidt Subject: [PATCH v2 09/17] x86: Add KHO support Date: Fri, 22 Dec 2023 19:51:36 +0000 Message-ID: <20231222195144.24532-4-graf@amazon.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20231222195144.24532-1-graf@amazon.com> References: <20231222193607.15474-1-graf@amazon.com> <20231222195144.24532-1-graf@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.253.83.51] X-ClientProxiedBy: EX19D042UWA003.ant.amazon.com (10.13.139.44) To EX19D020UWC004.ant.amazon.com (10.13.138.149) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231222_115233_947980_735F5A4B X-CRM114-Status: GOOD ( 31.23 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org We now have all bits in place to support KHO kexecs. This patch adds awareness of KHO in the kexec file as well as boot path for x86 and adds the respective kconfig option to the architecture so that it can use KHO successfully. In addition, it enlightens it decompression code with KHO so that its KASLR location finder only considers memory regions that are not already occupied by KHO memory. Signed-off-by: Alexander Graf --- v1 -> v2: - Change kconfig option to ARCH_SUPPORTS_KEXEC_KHO - s/kho_reserve_mem/kho_reserve_previous_mem/g - s/kho_reserve/kho_reserve_scratch/g --- arch/x86/Kconfig | 3 ++ arch/x86/boot/compressed/kaslr.c | 55 +++++++++++++++++++++++++++ arch/x86/include/uapi/asm/bootparam.h | 15 +++++++- arch/x86/kernel/e820.c | 9 +++++ arch/x86/kernel/kexec-bzimage64.c | 39 +++++++++++++++++++ arch/x86/kernel/setup.c | 46 ++++++++++++++++++++++ arch/x86/mm/init_32.c | 7 ++++ arch/x86/mm/init_64.c | 7 ++++ 8 files changed, 180 insertions(+), 1 deletion(-) diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig index 3762f41bb092..9aa31b3dcebc 100644 --- a/arch/x86/Kconfig +++ b/arch/x86/Kconfig @@ -2094,6 +2094,9 @@ config ARCH_SUPPORTS_KEXEC_BZIMAGE_VERIFY_SIG config ARCH_SUPPORTS_KEXEC_JUMP def_bool y +config ARCH_SUPPORTS_KEXEC_KHO + def_bool y + config ARCH_SUPPORTS_CRASH_DUMP def_bool X86_64 || (X86_32 && HIGHMEM) diff --git a/arch/x86/boot/compressed/kaslr.c b/arch/x86/boot/compressed/kaslr.c index dec961c6d16a..93ea292e4c18 100644 --- a/arch/x86/boot/compressed/kaslr.c +++ b/arch/x86/boot/compressed/kaslr.c @@ -29,6 +29,7 @@ #include #include #include +#include #include #include @@ -472,6 +473,60 @@ static bool mem_avoid_overlap(struct mem_vector *img, } } +#ifdef CONFIG_KEXEC_KHO + if (ptr->type == SETUP_KEXEC_KHO) { + struct kho_data *kho = (struct kho_data *)ptr->data; + struct kho_mem *mems = (void *)kho->mem_cache_addr; + int nr_mems = kho->mem_cache_size / sizeof(*mems); + int i; + + /* Avoid the mem cache */ + avoid = (struct mem_vector) { + .start = kho->mem_cache_addr, + .size = kho->mem_cache_size, + }; + + if (mem_overlaps(img, &avoid) && (avoid.start < earliest)) { + *overlap = avoid; + earliest = overlap->start; + is_overlapping = true; + } + + /* And the KHO DT */ + avoid = (struct mem_vector) { + .start = kho->dt_addr, + .size = kho->dt_size, + }; + + if (mem_overlaps(img, &avoid) && (avoid.start < earliest)) { + *overlap = avoid; + earliest = overlap->start; + is_overlapping = true; + } + + /* As well as any other KHO memory reservations */ + for (i = 0; i < nr_mems; i++) { + avoid = (struct mem_vector) { + .start = mems[i].addr, + .size = mems[i].len, + }; + + /* + * This mem starts after our current break. + * The array is sorted, so we're done. + */ + if (avoid.start >= earliest) + break; + + if (mem_overlaps(img, &avoid)) { + *overlap = avoid; + earliest = overlap->start; + is_overlapping = true; + } + } + } +#endif + ptr = (struct setup_data *)(unsigned long)ptr->next; } diff --git a/arch/x86/include/uapi/asm/bootparam.h b/arch/x86/include/uapi/asm/bootparam.h index 01d19fc22346..013af38a9673 100644 --- a/arch/x86/include/uapi/asm/bootparam.h +++ b/arch/x86/include/uapi/asm/bootparam.h @@ -13,7 +13,8 @@ #define SETUP_CC_BLOB 7 #define SETUP_IMA 8 #define SETUP_RNG_SEED 9 -#define SETUP_ENUM_MAX SETUP_RNG_SEED +#define SETUP_KEXEC_KHO 10 +#define SETUP_ENUM_MAX SETUP_KEXEC_KHO #define SETUP_INDIRECT (1<<31) #define SETUP_TYPE_MAX (SETUP_ENUM_MAX | SETUP_INDIRECT) @@ -181,6 +182,18 @@ struct ima_setup_data { __u64 size; } __attribute__((packed)); +/* + * Locations of kexec handover metadata + */ +struct kho_data { + __u64 dt_addr; + __u64 dt_size; + __u64 scratch_addr; + __u64 scratch_size; + __u64 mem_cache_addr; + __u64 mem_cache_size; +} __attribute__((packed)); + /* The so-called "zeropage" */ struct boot_params { struct screen_info screen_info; /* 0x000 */ diff --git a/arch/x86/kernel/e820.c b/arch/x86/kernel/e820.c index fb8cf953380d..c891b83f5b1c 100644 --- a/arch/x86/kernel/e820.c +++ b/arch/x86/kernel/e820.c @@ -1341,6 +1341,15 @@ void __init e820__memblock_setup(void) continue; memblock_add(entry->addr, entry->size); + + /* + * At this point with KHO we only allocate from scratch memory + * and only from memory below ISA_END_ADDRESS. Make sure that + * when we add memory for the eligible range, we add it as + * scratch memory so that we can resize the memblocks array. + */ + if (is_kho_boot() && (end <= ISA_END_ADDRESS)) + memblock_mark_scratch(entry->addr, end); } /* Throw away partial pages: */ diff --git a/arch/x86/kernel/kexec-bzimage64.c b/arch/x86/kernel/kexec-bzimage64.c index a61c12c01270..0cb8d0650a02 100644 --- a/arch/x86/kernel/kexec-bzimage64.c +++ b/arch/x86/kernel/kexec-bzimage64.c @@ -15,6 +15,7 @@ #include #include #include +#include #include #include #include @@ -233,6 +234,33 @@ setup_ima_state(const struct kimage *image, struct boot_params *params, #endif /* CONFIG_IMA_KEXEC */ } +static void setup_kho(const struct kimage *image, struct boot_params *params, + unsigned long params_load_addr, + unsigned int setup_data_offset) +{ +#ifdef CONFIG_KEXEC_KHO + struct setup_data *sd = (void *)params + setup_data_offset; + struct kho_data *kho = (void *)sd + sizeof(*sd); + + sd->type = SETUP_KEXEC_KHO; + sd->len = sizeof(struct kho_data); + + /* Only add if we have all KHO images in place */ + if (!image->kho.dt.buffer || !image->kho.mem_cache.buffer) + return; + + /* Add setup data */ + kho->dt_addr = image->kho.dt.mem; + kho->dt_size = image->kho.dt.bufsz; + kho->scratch_addr = kho_scratch_phys; + kho->scratch_size = kho_scratch_len; + kho->mem_cache_addr = image->kho.mem_cache.mem; + kho->mem_cache_size = image->kho.mem_cache.bufsz; + sd->next = params->hdr.setup_data; + params->hdr.setup_data = params_load_addr + setup_data_offset; +#endif /* CONFIG_KEXEC_KHO */ +} + static int setup_boot_parameters(struct kimage *image, struct boot_params *params, unsigned long params_load_addr, @@ -305,6 +333,13 @@ setup_boot_parameters(struct kimage *image, struct boot_params *params, sizeof(struct ima_setup_data); } + if (IS_ENABLED(CONFIG_KEXEC_KHO)) { + /* Setup space to store preservation metadata */ + setup_kho(image, params, params_load_addr, setup_data_offset); + setup_data_offset += sizeof(struct setup_data) + + sizeof(struct kho_data); + } + /* Setup RNG seed */ setup_rng_seed(params, params_load_addr, setup_data_offset); @@ -470,6 +505,10 @@ static void *bzImage64_load(struct kimage *image, char *kernel, kbuf.bufsz += sizeof(struct setup_data) + sizeof(struct ima_setup_data); + if (IS_ENABLED(CONFIG_KEXEC_KHO)) + kbuf.bufsz += sizeof(struct setup_data) + + sizeof(struct kho_data); + params = kzalloc(kbuf.bufsz, GFP_KERNEL); if (!params) return ERR_PTR(-ENOMEM); diff --git a/arch/x86/kernel/setup.c b/arch/x86/kernel/setup.c index 1526747bedf2..bd21f9a601a2 100644 --- a/arch/x86/kernel/setup.c +++ b/arch/x86/kernel/setup.c @@ -382,6 +382,29 @@ int __init ima_get_kexec_buffer(void **addr, size_t *size) } #endif +static void __init add_kho(u64 phys_addr, u32 data_len) +{ +#ifdef CONFIG_KEXEC_KHO + struct kho_data *kho; + u64 addr = phys_addr + sizeof(struct setup_data); + u64 size = data_len - sizeof(struct setup_data); + + kho = early_memremap(addr, size); + if (!kho) { + pr_warn("setup: failed to memremap kho data (0x%llx, 0x%llx)\n", + addr, size); + return; + } + + kho_populate(kho->dt_addr, kho->scratch_addr, kho->scratch_size, + kho->mem_cache_addr, kho->mem_cache_size); + + early_memunmap(kho, size); +#else + pr_warn("Passed KHO data, but CONFIG_KEXEC_KHO not set. Ignoring.\n"); +#endif +} + static void __init parse_setup_data(void) { struct setup_data *data; @@ -410,6 +433,9 @@ static void __init parse_setup_data(void) case SETUP_IMA: add_early_ima_buffer(pa_data); break; + case SETUP_KEXEC_KHO: + add_kho(pa_data, data_len); + break; case SETUP_RNG_SEED: data = early_memremap(pa_data, data_len); add_bootloader_randomness(data->data, data->len); @@ -989,8 +1015,26 @@ void __init setup_arch(char **cmdline_p) cleanup_highmap(); memblock_set_current_limit(ISA_END_ADDRESS); + e820__memblock_setup(); + /* + * We can resize memblocks at this point, let's dump all KHO + * reservations in and switch from scratch-only to normal allocations + */ + kho_reserve_previous_mem(); + + /* Allocations now skip scratch mem, return low 1M to the pool */ + if (is_kho_boot()) { + u64 i; + phys_addr_t base, end; + + __for_each_mem_range(i, &memblock.memory, NULL, NUMA_NO_NODE, + MEMBLOCK_SCRATCH, &base, &end, NULL) + if (end <= ISA_END_ADDRESS) + memblock_clear_scratch(base, end - base); + } + /* * Needs to run after memblock setup because it needs the physical * memory size. @@ -1106,6 +1150,8 @@ void __init setup_arch(char **cmdline_p) */ arch_reserve_crashkernel(); + kho_reserve_scratch(); + memblock_find_dma_reserve(); if (!early_xdbc_setup_hardware()) diff --git a/arch/x86/mm/init_32.c b/arch/x86/mm/init_32.c index b63403d7179d..6c3810afed04 100644 --- a/arch/x86/mm/init_32.c +++ b/arch/x86/mm/init_32.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include #include @@ -738,6 +739,12 @@ void __init mem_init(void) after_bootmem = 1; x86_init.hyper.init_after_bootmem(); + /* + * Now that all KHO pages are marked as reserved, let's flip them back + * to normal pages with accurate refcount. + */ + kho_populate_refcount(); + /* * Check boundaries twice: Some fundamental inconsistencies can * be detected at build time already. diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c index a190aae8ceaf..3ce1a4767610 100644 --- a/arch/x86/mm/init_64.c +++ b/arch/x86/mm/init_64.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include #include @@ -1339,6 +1340,12 @@ void __init mem_init(void) after_bootmem = 1; x86_init.hyper.init_after_bootmem(); + /* + * Now that all KHO pages are marked as reserved, let's flip them back + * to normal pages with accurate refcount. + */ + kho_populate_refcount(); + /* * Must be done after boot memory is put on freelist, because here we * might set fields in deferred struct pages that have not yet been From patchwork Fri Dec 22 19:51:37 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Graf X-Patchwork-Id: 13503782 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 95571C41535 for ; Fri, 22 Dec 2023 19:52:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:CC:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=/Gmt3CTyUCpHQkU6t2UCJXe5+hg4IDKfq6OHJk29Sxg=; b=ol3rL3hEsfVrL5 Q+z/xPhl3q8Q99x2MBlCw1fiNfZO/tudO+VK2cI1SpecrYbfqD4Fqxxb4ZyqNueNYY7BO5yF0CjH0 z/YE/DDj2CbCh/W0ugKk/YWNEzuji3DkGOFZo4NLC5D92zMOMRZK07xMKun3VYrjslJIvTjlaKPBK jTc74QouUvz1Z/eCTU+hPkra0Kdh96P8D0BQCPBefbGKBBCp29aJOG9fmJlyX22Si8oU9D0RvpgCI rAeXcNuGnVpF25F8VziO2MI6bC/3N6J8FKvhbpe2/wZO/fTASkEMxq/aLvf5Nibvv4LBeKHYTDofW OCVTUM82SiCGMzWhNRWg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rGlZS-006kub-2h; Fri, 22 Dec 2023 19:52:34 +0000 Received: from smtp-fw-9102.amazon.com ([207.171.184.29]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rGlZO-006kpb-0K; Fri, 22 Dec 2023 19:52:31 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1703274750; x=1734810750; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=dwhP9TX+qhNuxIBvRPP4G1COkpJHtM0yxRM6jOIEYyk=; b=sRto3Z+s4uFZNT4q6iaUE9hxv/Ha3doRB3IbVmHcpBKI4TUjbWEbXHuh BBKK28pnADTPd+1qSzobsjKNq2YucdkPVgc+Lh1EfF944yZTdoltcxLcg DA1QArXT2JStR0kqY2R2l4picJsJHR51xgimL5Gi37+pvzpUnp4Gnc+Sx Q=; X-IronPort-AV: E=Sophos;i="6.04,297,1695686400"; d="scan'208";a="385308500" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO email-inbound-relay-pdx-2c-m6i4x-f7c754c9.us-west-2.amazon.com) ([10.25.36.214]) by smtp-border-fw-9102.sea19.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Dec 2023 19:52:30 +0000 Received: from smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev (pdx2-ws-svc-p26-lb5-vlan3.pdx.amazon.com [10.39.38.70]) by email-inbound-relay-pdx-2c-m6i4x-f7c754c9.us-west-2.amazon.com (Postfix) with ESMTPS id EBCEB410B4; Fri, 22 Dec 2023 19:52:27 +0000 (UTC) Received: from EX19MTAUWA002.ant.amazon.com [10.0.38.20:56802] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.60.146:2525] with esmtp (Farcaster) id a9d24c67-a15d-4b10-a683-30cfdafa3ac6; Fri, 22 Dec 2023 19:52:27 +0000 (UTC) X-Farcaster-Flow-ID: a9d24c67-a15d-4b10-a683-30cfdafa3ac6 Received: from EX19D020UWC004.ant.amazon.com (10.13.138.149) by EX19MTAUWA002.ant.amazon.com (10.250.64.202) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:52:26 +0000 Received: from dev-dsk-graf-1a-5ce218e4.eu-west-1.amazon.com (10.253.83.51) by EX19D020UWC004.ant.amazon.com (10.13.138.149) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:52:22 +0000 From: Alexander Graf To: CC: , , , , , , , Eric Biederman , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , "Rob Herring" , Steven Rostedt , "Andrew Morton" , Mark Rutland , "Tom Lendacky" , Ashish Kalra , James Gowans , Stanislav Kinsburskii , , , , Anthony Yznaga , Usama Arif , David Woodhouse , Benjamin Herrenschmidt Subject: [PATCH v2 10/17] tracing: Initialize fields before registering Date: Fri, 22 Dec 2023 19:51:37 +0000 Message-ID: <20231222195144.24532-5-graf@amazon.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20231222195144.24532-1-graf@amazon.com> References: <20231222193607.15474-1-graf@amazon.com> <20231222195144.24532-1-graf@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.253.83.51] X-ClientProxiedBy: EX19D042UWA003.ant.amazon.com (10.13.139.44) To EX19D020UWC004.ant.amazon.com (10.13.138.149) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231222_115230_184774_233C56E1 X-CRM114-Status: GOOD ( 20.32 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org With KHO, we need to know all event fields before we allocate an event type for a trace event so that we can recover it based on a previous execution context. Before this patch, fields were only initialized after we allocated a type id. After this patch, we try to allocate it early as well. This patch leaves the old late initialization logic in place. The field init code already validates whether there are any fields present, which means it's legal to call it multiple times. This way we're sure we don't miss any call sites. Signed-off-by: Alexander Graf --- include/linux/trace_events.h | 1 + kernel/trace/trace_events.c | 14 +++++++++----- kernel/trace/trace_events_synth.c | 14 +++++++++----- kernel/trace/trace_events_user.c | 4 ++++ kernel/trace/trace_probe.c | 4 ++++ 5 files changed, 27 insertions(+), 10 deletions(-) diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h index d68ff9b1247f..8fe8970b48e3 100644 --- a/include/linux/trace_events.h +++ b/include/linux/trace_events.h @@ -842,6 +842,7 @@ extern int trace_define_field(struct trace_event_call *call, const char *type, extern int trace_add_event_call(struct trace_event_call *call); extern int trace_remove_event_call(struct trace_event_call *call); extern int trace_event_get_offsets(struct trace_event_call *call); +extern int trace_event_define_fields(struct trace_event_call *call); int ftrace_set_clr_event(struct trace_array *tr, char *buf, int set); int trace_set_clr_event(const char *system, const char *event, int set); diff --git a/kernel/trace/trace_events.c b/kernel/trace/trace_events.c index f29e815ca5b2..fbf8be1d2806 100644 --- a/kernel/trace/trace_events.c +++ b/kernel/trace/trace_events.c @@ -462,6 +462,11 @@ static void test_event_printk(struct trace_event_call *call) int trace_event_raw_init(struct trace_event_call *call) { int id; + int ret; + + ret = trace_event_define_fields(call); + if (ret) + return ret; id = register_trace_event(&call->event); if (!id) @@ -2402,8 +2407,7 @@ event_subsystem_dir(struct trace_array *tr, const char *name, return NULL; } -static int -event_define_fields(struct trace_event_call *call) +int trace_event_define_fields(struct trace_event_call *call) { struct list_head *head; int ret = 0; @@ -2592,7 +2596,7 @@ event_create_dir(struct eventfs_inode *parent, struct trace_event_file *file) file->ei = ei; - ret = event_define_fields(call); + ret = trace_event_define_fields(call); if (ret < 0) { pr_warn("Could not initialize trace point events/%s\n", name); return ret; @@ -2978,7 +2982,7 @@ __trace_add_new_event(struct trace_event_call *call, struct trace_array *tr) if (eventdir_initialized) return event_create_dir(tr->event_dir, file); else - return event_define_fields(call); + return trace_event_define_fields(call); } static void trace_early_triggers(struct trace_event_file *file, const char *name) @@ -3015,7 +3019,7 @@ __trace_early_add_new_event(struct trace_event_call *call, if (!file) return -ENOMEM; - ret = event_define_fields(call); + ret = trace_event_define_fields(call); if (ret) return ret; diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c index 846e02c0fb59..4db41218ccf7 100644 --- a/kernel/trace/trace_events_synth.c +++ b/kernel/trace/trace_events_synth.c @@ -880,17 +880,21 @@ static int register_synth_event(struct synth_event *event) INIT_LIST_HEAD(&call->class->fields); call->event.funcs = &synth_event_funcs; call->class->fields_array = synth_event_fields_array; + call->flags = TRACE_EVENT_FL_TRACEPOINT; + call->class->reg = trace_event_reg; + call->class->probe = trace_event_raw_event_synth; + call->data = event; + call->tp = event->tp; + + ret = trace_event_define_fields(call); + if (ret) + goto out; ret = register_trace_event(&call->event); if (!ret) { ret = -ENODEV; goto out; } - call->flags = TRACE_EVENT_FL_TRACEPOINT; - call->class->reg = trace_event_reg; - call->class->probe = trace_event_raw_event_synth; - call->data = event; - call->tp = event->tp; ret = trace_add_event_call(call); if (ret) { diff --git a/kernel/trace/trace_events_user.c b/kernel/trace/trace_events_user.c index 9365ce407426..b9837e987525 100644 --- a/kernel/trace/trace_events_user.c +++ b/kernel/trace/trace_events_user.c @@ -1900,6 +1900,10 @@ static int user_event_trace_register(struct user_event *user) { int ret; + ret = trace_event_define_fields(&user->call); + if (ret) + return ret; + ret = register_trace_event(&user->call.event); if (!ret) diff --git a/kernel/trace/trace_probe.c b/kernel/trace/trace_probe.c index 4dc74d73fc1d..da73a02246d8 100644 --- a/kernel/trace/trace_probe.c +++ b/kernel/trace/trace_probe.c @@ -1835,6 +1835,10 @@ int trace_probe_register_event_call(struct trace_probe *tp) trace_probe_name(tp))) return -EEXIST; + ret = trace_event_define_fields(call); + if (ret) + return ret; + ret = register_trace_event(&call->event); if (!ret) return -ENODEV; From patchwork Fri Dec 22 19:51:38 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Graf X-Patchwork-Id: 13503784 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E4AF7C4706C for ; Fri, 22 Dec 2023 19:53:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:CC:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=GVLZeu5lBNyNjnrTBAj9im9YaUX+LMyboufQKEF0BuQ=; b=2NnJoS+4bINIuj kqHb0nDhMESE9RVcxYycngh11eKhL9HK8+xduskWH5/qMz683nFH1znbW9nJ04+Mt29xiV30JPdIJ TFFdalC6wp4Fw/1UGiDXTfaHoljV2PaJ7JcAu39HJL2cgY2tHyFCnhGRh7ZuU+thscxkZEKki0uUC WPQ0OGTs5V+pJG1u2R2AGgkl9i8FT5JioChXVdovnkA17tfrzSa2ijjl9ggQI5H7n6EF9Muy9ICz2 4h2b3DeeviaV4wYtAmsgeFhfdCDPBGg3H9bb90P45BGwPNnXHgIlwyICoKlaxE7OfOKPM+QmpMzL2 +xJLzmL/YSuiV4qxZCug==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rGlZl-006l9d-0q; Fri, 22 Dec 2023 19:52:53 +0000 Received: from smtp-fw-52002.amazon.com ([52.119.213.150]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rGlZY-006kyT-04; Fri, 22 Dec 2023 19:52:43 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1703274760; x=1734810760; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=qmhUZjf5F0irRMFPH4YUunFpNZODTBv7SKMl/vOJmCU=; b=DZRtWwrHDbeJZ7Kiu3qm0w1oqMzmDwUgILe68nCq3rnvP8iIAmvFKR7g EeCqvMM14rOzGHaBMcpBn4nfvfAGp7++DRi8C78/ATDQ6gamcz7BDsceo J8yb88i+4wsUT07oREBsaqzbz7bGNmKVBS0BRC7/94nIRfgXzSlHZ8aD2 I=; X-IronPort-AV: E=Sophos;i="6.04,297,1695686400"; d="scan'208";a="602627243" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-iad-1d-m6i4x-25ac6bd5.us-east-1.amazon.com) ([10.43.8.6]) by smtp-border-fw-52002.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Dec 2023 19:52:39 +0000 Received: from smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev (iad7-ws-svc-p70-lb3-vlan2.iad.amazon.com [10.32.235.34]) by email-inbound-relay-iad-1d-m6i4x-25ac6bd5.us-east-1.amazon.com (Postfix) with ESMTPS id E5434498C1; Fri, 22 Dec 2023 19:52:31 +0000 (UTC) Received: from EX19MTAUWB001.ant.amazon.com [10.0.7.35:18611] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.6.30:2525] with esmtp (Farcaster) id 881e61d8-d7c5-43dd-85b6-cb73d6682b26; Fri, 22 Dec 2023 19:52:30 +0000 (UTC) X-Farcaster-Flow-ID: 881e61d8-d7c5-43dd-85b6-cb73d6682b26 Received: from EX19D020UWC004.ant.amazon.com (10.13.138.149) by EX19MTAUWB001.ant.amazon.com (10.250.64.248) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:52:30 +0000 Received: from dev-dsk-graf-1a-5ce218e4.eu-west-1.amazon.com (10.253.83.51) by EX19D020UWC004.ant.amazon.com (10.13.138.149) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:52:26 +0000 From: Alexander Graf To: CC: , , , , , , , Eric Biederman , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , "Rob Herring" , Steven Rostedt , "Andrew Morton" , Mark Rutland , "Tom Lendacky" , Ashish Kalra , James Gowans , Stanislav Kinsburskii , , , , Anthony Yznaga , Usama Arif , David Woodhouse , Benjamin Herrenschmidt Subject: [PATCH v2 11/17] tracing: Introduce kho serialization Date: Fri, 22 Dec 2023 19:51:38 +0000 Message-ID: <20231222195144.24532-6-graf@amazon.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20231222195144.24532-1-graf@amazon.com> References: <20231222193607.15474-1-graf@amazon.com> <20231222195144.24532-1-graf@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.253.83.51] X-ClientProxiedBy: EX19D042UWA003.ant.amazon.com (10.13.139.44) To EX19D020UWC004.ant.amazon.com (10.13.138.149) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231222_115240_197882_36AB0304 X-CRM114-Status: GOOD ( 15.89 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org We want to be able to transfer ftrace state from one kernel to the next. To start off with, let's establish all the boiler plate to get a write hook when KHO wants to serialize and fill out basic data. Follow-up patches will fill in serialization of ring buffers and events. Signed-off-by: Alexander Graf --- v1 -> v2: - Remove ifdefs --- kernel/trace/trace.c | 47 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 47 insertions(+) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 199df497db07..6ec31879b4eb 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -32,6 +32,7 @@ #include #include #include +#include #include #include #include @@ -866,6 +867,8 @@ static struct tracer *trace_types __read_mostly; */ DEFINE_MUTEX(trace_types_lock); +static bool trace_in_kho; + /* * serialize the access of the ring buffer * @@ -10560,12 +10563,56 @@ void __init early_trace_init(void) init_events(); } +static int trace_kho_notifier(struct notifier_block *self, + unsigned long cmd, + void *v) +{ + const char compatible[] = "ftrace-v1"; + void *fdt = v; + int err = 0; + + switch (cmd) { + case KEXEC_KHO_ABORT: + if (trace_in_kho) + mutex_unlock(&trace_types_lock); + trace_in_kho = false; + return NOTIFY_DONE; + case KEXEC_KHO_DUMP: + /* Handled below */ + break; + default: + return NOTIFY_BAD; + } + + if (unlikely(tracing_disabled)) + return NOTIFY_DONE; + + err |= fdt_begin_node(fdt, "ftrace"); + err |= fdt_property(fdt, "compatible", compatible, sizeof(compatible)); + err |= fdt_end_node(fdt); + + if (!err) { + /* Hold all future allocations */ + mutex_lock(&trace_types_lock); + trace_in_kho = true; + } + + return err ? NOTIFY_BAD : NOTIFY_DONE; +} + +static struct notifier_block trace_kho_nb = { + .notifier_call = trace_kho_notifier, +}; + void __init trace_init(void) { trace_event_init(); if (boot_instance_index) enable_instances(); + + if (IS_ENABLED(CONFIG_FTRACE_KHO)) + register_kho_notifier(&trace_kho_nb); } __init static void clear_boot_tracer(void) From patchwork Fri Dec 22 19:51:39 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Graf X-Patchwork-Id: 13503785 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 9A92CC41535 for ; Fri, 22 Dec 2023 19:53:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:CC:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=4U/lzHPINpr839oSccwDzen4qbF7s5mD/KU0dmw2xeU=; b=VkXHwhXkiKwBco +m0kauvUjFS0cWILAL3zUiNs46bj/VEoYsgSxlgG14FDj/N+aEnoVw8ZdScwSmmozl8v8wYsNCr0X FydA5y8wiTVkc30nYWDWxHyc5YnHpVRf3OaPkB+5G8DEy4IHID75cSdkOnV9l4lkkoGJESqs6OYzE OOLQKnqEcwn/MZQWJ+Y9RQmp0d/GzvCJAe9CTTK+9Mrfyq4UkvafxXuk31U/fuUnNRFIk16QXfQns 5Y7PtX7uqxXfNgLAInk+LmTOetLjXD2CXdGxHIIaeCLTz5squ5lNlQO9fx5gBfaI4BbJpQCDeQdQ3 CBBZ6eTT5F565Nz1Uzlg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rGla5-006lNL-0s; Fri, 22 Dec 2023 19:53:13 +0000 Received: from smtp-fw-6001.amazon.com ([52.95.48.154]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rGla0-006lIs-0v; Fri, 22 Dec 2023 19:53:10 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1703274789; x=1734810789; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=L4TnKEXfb2ccBb1Wze720VBcN3l/JR7BbWfBBtzRgXo=; b=t5a8J9QYWyPRjd2Kcsb17r9s6xQt5cQIunmuTQe9oJt8V4NUWRnS1ShO /sykVc/2oUnY7dy4ZEArPR9H/4SpN7/LxcO7lO3WzLO73SaO45WLmrKrn DChRq7o+/6HeWIDlGDOnO5s54W0RvqRgFcqv2zbsHago5CUrVLBlOk0kr E=; X-IronPort-AV: E=Sophos;i="6.04,297,1695686400"; d="scan'208";a="378271995" Received: from iad12-co-svc-p1-lb1-vlan2.amazon.com (HELO email-inbound-relay-iad-1e-m6i4x-0aba4706.us-east-1.amazon.com) ([10.43.8.2]) by smtp-border-fw-6001.iad6.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Dec 2023 19:53:06 +0000 Received: from smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev (iad7-ws-svc-p70-lb3-vlan2.iad.amazon.com [10.32.235.34]) by email-inbound-relay-iad-1e-m6i4x-0aba4706.us-east-1.amazon.com (Postfix) with ESMTPS id 60A32A3F3F; Fri, 22 Dec 2023 19:52:58 +0000 (UTC) Received: from EX19MTAUWC001.ant.amazon.com [10.0.7.35:7922] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.60.146:2525] with esmtp (Farcaster) id e9c6b21e-270d-43aa-aac6-432f3a7148e1; Fri, 22 Dec 2023 19:52:57 +0000 (UTC) X-Farcaster-Flow-ID: e9c6b21e-270d-43aa-aac6-432f3a7148e1 Received: from EX19D020UWC004.ant.amazon.com (10.13.138.149) by EX19MTAUWC001.ant.amazon.com (10.250.64.174) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:52:56 +0000 Received: from dev-dsk-graf-1a-5ce218e4.eu-west-1.amazon.com (10.253.83.51) by EX19D020UWC004.ant.amazon.com (10.13.138.149) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:52:52 +0000 From: Alexander Graf To: CC: , , , , , , , Eric Biederman , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , "Rob Herring" , Steven Rostedt , "Andrew Morton" , Mark Rutland , "Tom Lendacky" , Ashish Kalra , James Gowans , Stanislav Kinsburskii , , , , Anthony Yznaga , Usama Arif , David Woodhouse , Benjamin Herrenschmidt Subject: [PATCH v2 12/17] tracing: Add kho serialization of trace buffers Date: Fri, 22 Dec 2023 19:51:39 +0000 Message-ID: <20231222195144.24532-7-graf@amazon.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20231222195144.24532-1-graf@amazon.com> References: <20231222193607.15474-1-graf@amazon.com> <20231222195144.24532-1-graf@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.253.83.51] X-ClientProxiedBy: EX19D036UWB004.ant.amazon.com (10.13.139.170) To EX19D020UWC004.ant.amazon.com (10.13.138.149) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231222_115308_499110_E2A12D8D X-CRM114-Status: GOOD ( 24.57 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org When we do a kexec handover, we want to preserve previous ftrace data into the new kernel. At the point when we write out the handover data, ftrace may still be running and recording new events and we want to capture all of those too. To allow the new kernel to revive all trace data up to reboot, we store all locations of trace buffers as well as their linked list metadata. We can then later reuse the linked list to reconstruct the head pointer. This patch implements the write-out logic for trace buffers. Signed-off-by: Alexander Graf --- v1 -> v2: - Leave the node generation code that needs to know the name in trace.c so that ring buffers can stay anonymous --- include/linux/ring_buffer.h | 2 + kernel/trace/ring_buffer.c | 76 +++++++++++++++++++++++++++++++++++++ kernel/trace/trace.c | 16 ++++++++ 3 files changed, 94 insertions(+) diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h index 782e14f62201..1c5eb33f0cb5 100644 --- a/include/linux/ring_buffer.h +++ b/include/linux/ring_buffer.h @@ -211,4 +211,6 @@ int trace_rb_cpu_prepare(unsigned int cpu, struct hlist_node *node); #define trace_rb_cpu_prepare NULL #endif +int ring_buffer_kho_write(void *fdt, struct trace_buffer *buffer); + #endif /* _LINUX_RING_BUFFER_H */ diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 83eab547f1d1..971af7ee35da 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -20,6 +20,7 @@ #include #include #include +#include #include #include #include @@ -5853,6 +5854,81 @@ int trace_rb_cpu_prepare(unsigned int cpu, struct hlist_node *node) return 0; } +#ifdef CONFIG_FTRACE_KHO +static int rb_kho_write_cpu(void *fdt, struct trace_buffer *buffer, int cpu) +{ + int i = 0; + int err = 0; + struct list_head *tmp; + const char compatible[] = "ftrace,cpu-v1"; + char name[] = "cpuffffffff"; + int nr_pages; + struct ring_buffer_per_cpu *cpu_buffer; + bool first_loop = true; + struct kho_mem *mem; + uint64_t mem_len; + + if (!cpumask_test_cpu(cpu, buffer->cpumask)) + return 0; + + cpu_buffer = buffer->buffers[cpu]; + + nr_pages = cpu_buffer->nr_pages; + mem_len = sizeof(*mem) * nr_pages * 2; + mem = vmalloc(mem_len); + + snprintf(name, sizeof(name), "cpu%x", cpu); + + err |= fdt_begin_node(fdt, name); + err |= fdt_property(fdt, "compatible", compatible, sizeof(compatible)); + err |= fdt_property(fdt, "cpu", &cpu, sizeof(cpu)); + + for (tmp = rb_list_head(cpu_buffer->pages); + tmp != rb_list_head(cpu_buffer->pages) || first_loop; + tmp = rb_list_head(tmp->next), first_loop = false) { + struct buffer_page *bpage = (struct buffer_page *)tmp; + + /* Ring is larger than it should be? */ + if (i >= (nr_pages * 2)) { + pr_err("ftrace ring has more pages than nr_pages (%d / %d)", i, nr_pages); + err = -EINVAL; + break; + } + + /* First describe the bpage */ + mem[i++] = (struct kho_mem) { + .addr = __pa(bpage), + .len = sizeof(*bpage) + }; + + /* Then the data page */ + mem[i++] = (struct kho_mem) { + .addr = __pa(bpage->page), + .len = PAGE_SIZE + }; + } + + err |= fdt_property(fdt, "mem", mem, mem_len); + err |= fdt_end_node(fdt); + + vfree(mem); + return err; +} + +int ring_buffer_kho_write(void *fdt, struct trace_buffer *buffer) +{ + int err, i; + + for (i = 0; i < buffer->cpus; i++) { + err = rb_kho_write_cpu(fdt, buffer, i); + if (err) + return err; + } + + return 0; +} +#endif + #ifdef CONFIG_RING_BUFFER_STARTUP_TEST /* * This is a basic integrity check of the ring buffer. diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 6ec31879b4eb..2ccea4c1965b 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -10563,6 +10563,21 @@ void __init early_trace_init(void) init_events(); } +static int trace_kho_write_trace_array(void *fdt, struct trace_array *tr) +{ + const char *name = tr->name ? tr->name : "global_trace"; + const char compatible[] = "ftrace,array-v1"; + int err = 0; + + err |= fdt_begin_node(fdt, name); + err |= fdt_property(fdt, "compatible", compatible, sizeof(compatible)); + err |= fdt_property(fdt, "trace_flags", &tr->trace_flags, sizeof(tr->trace_flags)); + err |= ring_buffer_kho_write(fdt, tr->array_buffer.buffer); + err |= fdt_end_node(fdt); + + return err; +} + static int trace_kho_notifier(struct notifier_block *self, unsigned long cmd, void *v) @@ -10589,6 +10604,7 @@ static int trace_kho_notifier(struct notifier_block *self, err |= fdt_begin_node(fdt, "ftrace"); err |= fdt_property(fdt, "compatible", compatible, sizeof(compatible)); + err |= trace_kho_write_trace_array(fdt, &global_trace); err |= fdt_end_node(fdt); if (!err) { From patchwork Fri Dec 22 19:51:40 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Graf X-Patchwork-Id: 13503786 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 2C67DC4706C for ; Fri, 22 Dec 2023 19:53:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:CC:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=/benSkV9Bl/sqfG8U2PZ+SxxnS1jJSTY7d2vi0qxMhI=; b=chwJfgC3NgrnH9 mWwAFkfij0/s6CP2aChtSO6sXbt7dSKgpV3mtjXIUlI+KaFPQky4yujTnPNCL+YSzKrNXFoBkmeiJ VqelfQE97Ruvd6VSGqYTAosFu8GdKj6EpBG/0d4qLMQus2ATJA20PEiFHvQPtLBAnhjDlrwt/VZha 2wViphDZy/mZ3fqWZ335s014SmbqlQ5M6fM6OfAd0jcYcxZ4J5XTdPPn4V09NvmDC4SURThVqAQWN 0JOeILQh1CefpplXgVg16lAmhjTNrsaD/ZE9XjeNb4XN4iAqLlETGFzFvoJGcs2oz2Uk19XkaP0yt jJGI084zczLk/WSqSB/Q==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rGlaE-006lUg-1q; Fri, 22 Dec 2023 19:53:22 +0000 Received: from smtp-fw-80008.amazon.com ([99.78.197.219]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rGla4-006lLw-0a; Fri, 22 Dec 2023 19:53:17 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1703274792; x=1734810792; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=DmWS8dZ9jK84BAKLfSbbK2PJywlEcHQnIlXSuHEMw/A=; b=ZA+2na76H4K98Wh849AfON1CAQAh2x0CQtlavv44fhS9nIZer6Wqe4cZ s2Bpsf2ndfz0jbNbEIbuh+33M605ESB/TeJQ4ScjkBGl28a4lOIYn44rB or2l48WRLATk8QofGfC7G4kEKScMJjh4z4XlgC93ghEYwhNOK/URDqi+G w=; X-IronPort-AV: E=Sophos;i="6.04,297,1695686400"; d="scan'208";a="53453679" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO email-inbound-relay-iad-1e-m6i4x-9694bb9e.us-east-1.amazon.com) ([10.25.36.214]) by smtp-border-fw-80008.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Dec 2023 19:53:09 +0000 Received: from smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev (iad7-ws-svc-p70-lb3-vlan2.iad.amazon.com [10.32.235.34]) by email-inbound-relay-iad-1e-m6i4x-9694bb9e.us-east-1.amazon.com (Postfix) with ESMTPS id 55927804E1; Fri, 22 Dec 2023 19:53:02 +0000 (UTC) Received: from EX19MTAUWB002.ant.amazon.com [10.0.7.35:14717] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.6.30:2525] with esmtp (Farcaster) id a9f26b86-9fc7-4cad-9223-cfbee1a06d30; Fri, 22 Dec 2023 19:53:01 +0000 (UTC) X-Farcaster-Flow-ID: a9f26b86-9fc7-4cad-9223-cfbee1a06d30 Received: from EX19D020UWC004.ant.amazon.com (10.13.138.149) by EX19MTAUWB002.ant.amazon.com (10.250.64.231) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:53:00 +0000 Received: from dev-dsk-graf-1a-5ce218e4.eu-west-1.amazon.com (10.253.83.51) by EX19D020UWC004.ant.amazon.com (10.13.138.149) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:52:56 +0000 From: Alexander Graf To: CC: , , , , , , , Eric Biederman , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , "Rob Herring" , Steven Rostedt , "Andrew Morton" , Mark Rutland , "Tom Lendacky" , Ashish Kalra , James Gowans , Stanislav Kinsburskii , , , , Anthony Yznaga , Usama Arif , David Woodhouse , Benjamin Herrenschmidt Subject: [PATCH v2 13/17] tracing: Recover trace buffers from kexec handover Date: Fri, 22 Dec 2023 19:51:40 +0000 Message-ID: <20231222195144.24532-8-graf@amazon.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20231222195144.24532-1-graf@amazon.com> References: <20231222193607.15474-1-graf@amazon.com> <20231222195144.24532-1-graf@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.253.83.51] X-ClientProxiedBy: EX19D036UWB004.ant.amazon.com (10.13.139.170) To EX19D020UWC004.ant.amazon.com (10.13.138.149) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231222_115312_314144_E09E4A45 X-CRM114-Status: GOOD ( 30.84 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org When kexec handover is in place, we now know the location of all previous buffers for ftrace rings. With this patch applied, ftrace reassembles any new trace buffer that carries the same name as a previous one with the same data pages that the previous buffer had. That way, a buffer that we had in place before kexec becomes readable after kexec again as soon as it gets initialized with the same name. Signed-off-by: Alexander Graf --- v1 -> v2: - Move from names to fdt offsets. That way, trace.c can find the trace array offset and then the ring buffer code only needs to read out its per-CPU data. That way it can stay oblivient to its name. - Make kho_get_fdt() const - Remove ifdefs --- include/linux/ring_buffer.h | 15 ++-- kernel/trace/ring_buffer.c | 171 ++++++++++++++++++++++++++++++++++-- kernel/trace/trace.c | 32 ++++++- 3 files changed, 206 insertions(+), 12 deletions(-) diff --git a/include/linux/ring_buffer.h b/include/linux/ring_buffer.h index 1c5eb33f0cb5..f6d6ce441890 100644 --- a/include/linux/ring_buffer.h +++ b/include/linux/ring_buffer.h @@ -84,20 +84,23 @@ void ring_buffer_discard_commit(struct trace_buffer *buffer, /* * size is in bytes for each per CPU buffer. */ -struct trace_buffer * -__ring_buffer_alloc(unsigned long size, unsigned flags, struct lock_class_key *key); +struct trace_buffer *__ring_buffer_alloc(unsigned long size, unsigned flags, + struct lock_class_key *key, + int tr_off); /* * Because the ring buffer is generic, if other users of the ring buffer get * traced by ftrace, it can produce lockdep warnings. We need to keep each * ring buffer's lock class separate. */ -#define ring_buffer_alloc(size, flags) \ -({ \ - static struct lock_class_key __key; \ - __ring_buffer_alloc((size), (flags), &__key); \ +#define ring_buffer_alloc_kho(size, flags, tr_off) \ +({ \ + static struct lock_class_key __key; \ + __ring_buffer_alloc((size), (flags), &__key, tr_off); \ }) +#define ring_buffer_alloc(size, flags) ring_buffer_alloc_kho(size, flags, 0) + int ring_buffer_wait(struct trace_buffer *buffer, int cpu, int full); __poll_t ring_buffer_poll_wait(struct trace_buffer *buffer, int cpu, struct file *filp, poll_table *poll_table, int full); diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c index 971af7ee35da..4c62a66068a7 100644 --- a/kernel/trace/ring_buffer.c +++ b/kernel/trace/ring_buffer.c @@ -558,6 +558,7 @@ struct trace_buffer { struct rb_irq_work irq_work; bool time_stamp_abs; + int tr_off; }; struct ring_buffer_iter { @@ -574,6 +575,15 @@ struct ring_buffer_iter { int missed_events; }; +struct rb_kho_cpu { + const struct kho_mem *mem; + uint32_t nr_mems; +}; + +static int rb_kho_replace_buffers(struct ring_buffer_per_cpu *cpu_buffer, + struct rb_kho_cpu *kho); +static int rb_kho_read_cpu(int tr_off, int cpu, struct rb_kho_cpu *kho); + #ifdef RB_TIME_32 /* @@ -1762,12 +1772,15 @@ static void rb_free_cpu_buffer(struct ring_buffer_per_cpu *cpu_buffer) * drop data when the tail hits the head. */ struct trace_buffer *__ring_buffer_alloc(unsigned long size, unsigned flags, - struct lock_class_key *key) + struct lock_class_key *key, + int tr_off) { + int cpu = raw_smp_processor_id(); + struct rb_kho_cpu kho = {}; struct trace_buffer *buffer; + bool use_kho = false; long nr_pages; int bsize; - int cpu; int ret; /* keep it in its own cache line */ @@ -1780,9 +1793,16 @@ struct trace_buffer *__ring_buffer_alloc(unsigned long size, unsigned flags, goto fail_free_buffer; nr_pages = DIV_ROUND_UP(size, BUF_PAGE_SIZE); + if (!rb_kho_read_cpu(tr_off, cpu, &kho) && kho.nr_mems > 4) { + nr_pages = kho.nr_mems / 2; + use_kho = true; + pr_debug("Using kho on CPU [%03d]", cpu); + } + buffer->flags = flags; buffer->clock = trace_clock_local; buffer->reader_lock_key = key; + buffer->tr_off = tr_off; init_irq_work(&buffer->irq_work.work, rb_wake_up_waiters); init_waitqueue_head(&buffer->irq_work.waiters); @@ -1799,12 +1819,14 @@ struct trace_buffer *__ring_buffer_alloc(unsigned long size, unsigned flags, if (!buffer->buffers) goto fail_free_cpumask; - cpu = raw_smp_processor_id(); cpumask_set_cpu(cpu, buffer->cpumask); buffer->buffers[cpu] = rb_allocate_cpu_buffer(buffer, nr_pages, cpu); if (!buffer->buffers[cpu]) goto fail_free_buffers; + if (use_kho && rb_kho_replace_buffers(buffer->buffers[cpu], &kho)) + pr_warn("Could not revive all previous trace data"); + ret = cpuhp_state_add_instance(CPUHP_TRACE_RB_PREPARE, &buffer->node); if (ret < 0) goto fail_free_buffers; @@ -5818,7 +5840,9 @@ EXPORT_SYMBOL_GPL(ring_buffer_read_page); */ int trace_rb_cpu_prepare(unsigned int cpu, struct hlist_node *node) { + struct rb_kho_cpu kho = {}; struct trace_buffer *buffer; + bool use_kho = false; long nr_pages_same; int cpu_i; unsigned long nr_pages; @@ -5842,6 +5866,12 @@ int trace_rb_cpu_prepare(unsigned int cpu, struct hlist_node *node) /* allocate minimum pages, user can later expand it */ if (!nr_pages_same) nr_pages = 2; + + if (!rb_kho_read_cpu(buffer->tr_off, cpu, &kho) && kho.nr_mems > 4) { + nr_pages = kho.nr_mems / 2; + use_kho = true; + } + buffer->buffers[cpu] = rb_allocate_cpu_buffer(buffer, nr_pages, cpu); if (!buffer->buffers[cpu]) { @@ -5849,13 +5879,143 @@ int trace_rb_cpu_prepare(unsigned int cpu, struct hlist_node *node) cpu); return -ENOMEM; } + + if (use_kho && rb_kho_replace_buffers(buffer->buffers[cpu], &kho)) + pr_warn("Could not revive all previous trace data"); + smp_wmb(); cpumask_set_cpu(cpu, buffer->cpumask); return 0; } -#ifdef CONFIG_FTRACE_KHO -static int rb_kho_write_cpu(void *fdt, struct trace_buffer *buffer, int cpu) +static int rb_kho_replace_buffers(struct ring_buffer_per_cpu *cpu_buffer, + struct rb_kho_cpu *kho) +{ + bool first_loop = true; + struct list_head *tmp; + int err = 0; + int i = 0; + + if (!IS_ENABLED(CONFIG_FTRACE_KHO)) + return -EINVAL; + + if (kho->nr_mems != cpu_buffer->nr_pages * 2) + return -EINVAL; + + for (tmp = rb_list_head(cpu_buffer->pages); + tmp != rb_list_head(cpu_buffer->pages) || first_loop; + tmp = rb_list_head(tmp->next), first_loop = false) { + struct buffer_page *bpage = (struct buffer_page *)tmp; + const struct kho_mem *mem_bpage = &kho->mem[i++]; + const struct kho_mem *mem_page = &kho->mem[i++]; + const uint64_t rb_page_head = 1; + struct buffer_page *old_bpage; + void *old_page; + + old_bpage = __va(mem_bpage->addr); + if (!bpage) + goto out; + + if ((ulong)old_bpage->list.next & rb_page_head) { + struct list_head *new_lhead; + struct buffer_page *new_head; + + new_lhead = rb_list_head(bpage->list.next); + new_head = (struct buffer_page *)new_lhead; + + /* Assume the buffer is completely full */ + cpu_buffer->tail_page = bpage; + cpu_buffer->commit_page = bpage; + /* Set the head pointers to what they were before */ + cpu_buffer->head_page->list.prev->next = (struct list_head *) + ((ulong)cpu_buffer->head_page->list.prev->next & ~rb_page_head); + cpu_buffer->head_page = new_head; + bpage->list.next = (struct list_head *)((ulong)new_lhead | rb_page_head); + } + + if (rb_page_entries(old_bpage) || rb_page_write(old_bpage)) { + /* + * We want to recycle the pre-kho page, it contains + * trace data. To do so, we unreserve it and swap the + * current data page with the pre-kho one + */ + old_page = kho_claim_mem(mem_page); + + /* Recycle the old page, it contains data */ + free_page((ulong)bpage->page); + bpage->page = old_page; + + bpage->write = old_bpage->write; + bpage->entries = old_bpage->entries; + bpage->real_end = old_bpage->real_end; + + local_inc(&cpu_buffer->pages_touched); + } else { + kho_return_mem(mem_page); + } + + kho_return_mem(mem_bpage); + } + +out: + return err; +} + +static int rb_kho_read_cpu(int tr_off, int cpu, struct rb_kho_cpu *kho) +{ + const void *fdt = kho_get_fdt(); + int mem_len; + int err = 0; + char *path; + int off; + + if (!IS_ENABLED(CONFIG_FTRACE_KHO)) + return -EINVAL; + + if (!tr_off || !fdt || !kho) + return -EINVAL; + + path = kasprintf(GFP_KERNEL, "cpu%x", cpu); + if (!path) + return -ENOMEM; + + pr_debug("Trying to revive trace cpu '%s'", path); + + off = fdt_subnode_offset(fdt, tr_off, path); + if (off < 0) { + pr_debug("Could not find '%s' in DT", path); + err = -ENOENT; + goto out; + } + + err = fdt_node_check_compatible(fdt, off, "ftrace,cpu-v1"); + if (err) { + pr_warn("Node '%s' has invalid compatible", path); + err = -EINVAL; + goto out; + } + + kho->mem = fdt_getprop(fdt, off, "mem", &mem_len); + if (!kho->mem) { + pr_warn("Node '%s' has invalid mem property", path); + err = -EINVAL; + goto out; + } + + kho->nr_mems = mem_len / sizeof(*kho->mem); + + /* Should follow "bpage 0, page 0, bpage 1, page 1, ..." pattern */ + if ((kho->nr_mems & 1)) { + err = -EINVAL; + goto out; + } + +out: + kfree(path); + return err; +} + +static int __maybe_unused rb_kho_write_cpu(void *fdt, struct trace_buffer *buffer, int cpu) { int i = 0; int err = 0; @@ -5915,6 +6075,7 @@ static int rb_kho_write_cpu(void *fdt, struct trace_buffer *buffer, int cpu) return err; } +#ifdef CONFIG_FTRACE_KHO int ring_buffer_kho_write(void *fdt, struct trace_buffer *buffer) { int err, i; diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 2ccea4c1965b..94e30dfacfd1 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -9348,16 +9348,46 @@ static struct dentry *trace_instance_dir; static void init_tracer_tracefs(struct trace_array *tr, struct dentry *d_tracer); +static int trace_kho_off_tr(struct trace_array *tr) +{ + const char *name = tr->name ? tr->name : "global_trace"; + const void *fdt = kho_get_fdt(); + char *path; + int off; + + if (!IS_ENABLED(CONFIG_FTRACE_KHO)) + return 0; + + if (!fdt) + return 0; + + path = kasprintf(GFP_KERNEL, "/ftrace/%s", name); + if (!path) + return -ENOMEM; + + pr_debug("Trying to revive trace buffer '%s'", path); + + off = fdt_path_offset(fdt, path); + if (off < 0) { + pr_debug("Could not find '%s' in DT", path); + off = 0; + } + + kfree(path); + return off; +} + static int allocate_trace_buffer(struct trace_array *tr, struct array_buffer *buf, int size) { + int tr_off = trace_kho_off_tr(tr); enum ring_buffer_flags rb_flags; rb_flags = tr->trace_flags & TRACE_ITER_OVERWRITE ? RB_FL_OVERWRITE : 0; buf->tr = tr; - buf->buffer = ring_buffer_alloc(size, rb_flags); + buf->buffer = ring_buffer_alloc_kho(size, rb_flags, tr_off); if (!buf->buffer) return -ENOMEM; From patchwork Fri Dec 22 19:51:41 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Graf X-Patchwork-Id: 13503787 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 203ECC41535 for ; Fri, 22 Dec 2023 19:53:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:CC:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=5OhvV6usZ4zdymFUiSsb8Kd7O/mgjffyvLxwHJtGNzQ=; b=0VAug2MZGFQfXx 5BmgYsML8+FOS4jddNSimisTGsu4j9gG6CClagZZRkCP/jhQ/6mVn7qzHj0UcbvTnWrcbzhL1edxr /EaUdXkHVnFfsqH878Mz6feX9NXRA42KoMYd9FU8boSv2EAQMxg1s0VSO0q/BQveOVzgZaNbAmjkx Viu2oZi+Ald4ptjkl8AKZ3c9S9nRrdXutb9rPBSAAlaiXoro+SbahQvHj7vOeQ+FBNuivUHhK1Pfi /8etpDNSuz//IwidrVhQERmtmEbysilaK+z641q0lz4CAmQz6fdpK7qF3ccEmp3OV0wAbAyReW3qw FhBKG2uHJI4RAc6dUkSQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rGlaM-006lb7-2v; Fri, 22 Dec 2023 19:53:30 +0000 Received: from smtp-fw-80006.amazon.com ([99.78.197.217]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rGla8-006lPY-14; Fri, 22 Dec 2023 19:53:21 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1703274796; x=1734810796; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=MGZgj7MpJhCqNq0rnds5z2tkCyQdgvCc34iYycu1T5U=; b=Sz5pvVqw2jRMXcD7bK9fBooL/swI6tGg7WHUWgQaTm0j95V/Do9hB4uT rXaPm6HaD8ygNZHEoanX49YY6L8qR749qUMVN6FfodOOiAdElW4PUCCBh TDtM+L65xiPrSKgsSSw5DJWNVZgCKUgeAeRKC06prUuznm/+XMzMLO3om A=; X-IronPort-AV: E=Sophos;i="6.04,297,1695686400"; d="scan'208";a="261447211" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO email-inbound-relay-iad-1e-m6i4x-7dc0ecf1.us-east-1.amazon.com) ([10.25.36.214]) by smtp-border-fw-80006.pdx80.corp.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Dec 2023 19:53:13 +0000 Received: from smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev (iad7-ws-svc-p70-lb3-vlan2.iad.amazon.com [10.32.235.34]) by email-inbound-relay-iad-1e-m6i4x-7dc0ecf1.us-east-1.amazon.com (Postfix) with ESMTPS id 506F080944; Fri, 22 Dec 2023 19:53:06 +0000 (UTC) Received: from EX19MTAUWA002.ant.amazon.com [10.0.38.20:4780] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.47.55:2525] with esmtp (Farcaster) id d8dd3a1b-7891-4d9e-b0b4-b1661fc60c63; Fri, 22 Dec 2023 19:53:05 +0000 (UTC) X-Farcaster-Flow-ID: d8dd3a1b-7891-4d9e-b0b4-b1661fc60c63 Received: from EX19D020UWC004.ant.amazon.com (10.13.138.149) by EX19MTAUWA002.ant.amazon.com (10.250.64.202) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:53:04 +0000 Received: from dev-dsk-graf-1a-5ce218e4.eu-west-1.amazon.com (10.253.83.51) by EX19D020UWC004.ant.amazon.com (10.13.138.149) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:53:00 +0000 From: Alexander Graf To: CC: , , , , , , , Eric Biederman , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , "Rob Herring" , Steven Rostedt , "Andrew Morton" , Mark Rutland , "Tom Lendacky" , Ashish Kalra , James Gowans , Stanislav Kinsburskii , , , , Anthony Yznaga , Usama Arif , David Woodhouse , Benjamin Herrenschmidt Subject: [PATCH v2 14/17] tracing: Add kho serialization of trace events Date: Fri, 22 Dec 2023 19:51:41 +0000 Message-ID: <20231222195144.24532-9-graf@amazon.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20231222195144.24532-1-graf@amazon.com> References: <20231222193607.15474-1-graf@amazon.com> <20231222195144.24532-1-graf@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.253.83.51] X-ClientProxiedBy: EX19D036UWB004.ant.amazon.com (10.13.139.170) To EX19D020UWC004.ant.amazon.com (10.13.138.149) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231222_115316_435682_C7534194 X-CRM114-Status: GOOD ( 21.67 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Events and thus their parsing handle in ftrace have dynamic IDs that get assigned whenever the event is added to the system. If we want to parse trace events after kexec, we need to link event IDs back to the original trace event that existed before we kexec'ed. There are broadly 2 paths we could take for that: 1) Save full event description across KHO, restore after kexec, merge identical trace events into a single identifier. 2) Recover the ID of post-kexec added events so they get the same ID after kexec that they had before kexec This patch implements the second option. It's simpler and thus less intrusive. However, it means we can not fully parse affected events when the kernel removes or modifies trace events across a kho kexec. Signed-off-by: Alexander Graf --- v1 -> v2: - Leave anything that requires a name in trace.c to keep buffers unnamed entities - Put events as array into a property, use fingerprint instead of names to identify them - Reduce footprint without CONFIG_FTRACE_KHO --- kernel/trace/trace.c | 1 + kernel/trace/trace_output.c | 89 +++++++++++++++++++++++++++++++++++++ kernel/trace/trace_output.h | 5 +++ 3 files changed, 95 insertions(+) diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c index 94e30dfacfd1..b9ce8cf24d02 100644 --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -10634,6 +10634,7 @@ static int trace_kho_notifier(struct notifier_block *self, err |= fdt_begin_node(fdt, "ftrace"); err |= fdt_property(fdt, "compatible", compatible, sizeof(compatible)); + err |= trace_kho_write_events(fdt); err |= trace_kho_write_trace_array(fdt, &global_trace); err |= fdt_end_node(fdt); diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c index 3e7fa44dc2b2..7d8815352e20 100644 --- a/kernel/trace/trace_output.c +++ b/kernel/trace/trace_output.c @@ -12,6 +12,8 @@ #include #include #include +#include +#include #include "trace_output.h" @@ -669,6 +671,93 @@ int trace_print_lat_context(struct trace_iterator *iter) return !trace_seq_has_overflowed(s); } +/** + * event2fp - Return fingerprint of an event + * @event: The event to fingerprint + * + * For KHO, we need to match events before and after kexec to recover its type + * id. This function returns a hash that combines an event's name, and all of + * its fields' lengths. + */ +static u32 event2fp(struct trace_event *event) +{ + struct ftrace_event_field *field; + struct trace_event_call *call; + struct list_head *head; + const char *name; + u32 crc32 = ~0; + + /* Low type numbers are static, nothing to checksum */ + if (event->type && event->type < __TRACE_LAST_TYPE) + return event->type; + + call = container_of(event, struct trace_event_call, event); + name = trace_event_name(call); + if (name) + crc32 = crc32_le(crc32, name, strlen(name)); + + head = trace_get_fields(call); + list_for_each_entry(field, head, link) + crc32 = crc32_le(crc32, (char *)&field->size, sizeof(field->size)); + + return crc32; +} + +struct trace_event_map { + u32 crc32; + u32 type; +}; + +static int __maybe_unused _trace_kho_write_events(void *fdt) +{ + struct trace_event_call *call; + int count = __TRACE_LAST_TYPE - 1; + struct trace_event_map *map; + int err = 0; + int i; + + down_read(&trace_event_sem); + /* Allocate an array that we can place all maps into */ + list_for_each_entry(call, &ftrace_events, list) + count++; + + map = vmalloc(count * sizeof(*map)); + if (!map) + return -ENOMEM; + + /* Then fill the array with all crc32 values */ + count = 0; + for (i = 1; i < __TRACE_LAST_TYPE; i++) + map[count++] = (struct trace_event_map) { + .crc32 = count, + .type = count, + }; + + list_for_each_entry(call, &ftrace_events, list) { + struct trace_event *event = &call->event; + + map[count++] = (struct trace_event_map) { + .crc32 = event2fp(event), + .type = event->type, + }; + } + up_read(&trace_event_sem); + + /* And finally write it into a DT variable */ + err |= fdt_property(fdt, "events", map, count * sizeof(*map)); + + vfree(map); + return err; +} + +#ifdef CONFIG_FTRACE_KHO +int trace_kho_write_events(void *fdt) +{ + return _trace_kho_write_events(fdt); +} +#endif + + /** * ftrace_find_event - find a registered event * @type: the type of event to look for diff --git a/kernel/trace/trace_output.h b/kernel/trace/trace_output.h index dca40f1f1da4..07481f295436 100644 --- a/kernel/trace/trace_output.h +++ b/kernel/trace/trace_output.h @@ -25,6 +25,11 @@ extern enum print_line_t print_event_fields(struct trace_iterator *iter, extern void trace_event_read_lock(void); extern void trace_event_read_unlock(void); extern struct trace_event *ftrace_find_event(int type); +#ifdef CONFIG_FTRACE_KHO +extern int trace_kho_write_events(void *fdt); +#else +static inline int trace_kho_write_events(void *fdt) { return -EINVAL; } +#endif extern enum print_line_t trace_nop_print(struct trace_iterator *iter, int flags, struct trace_event *event); From patchwork Fri Dec 22 19:51:42 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Graf X-Patchwork-Id: 13503788 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 0DAE0C41535 for ; Fri, 22 Dec 2023 19:54:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:CC:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=esfVGvJonC5eJf5tTshBjJ62oCGMd/cwG5udw+Va1BM=; b=kdtF5wBhXrZouu 1mUJHeZcXit+GMmcVSMMFFd4G/BJGJwIYew9zvjq0Frjy14y/IxKue5kuuRPFqDLfvh6YBKwek4Sd 4WShr1aDD/wnZTjD79UEuet9pXyOUCJou+3EIrLGL/3xuC0P2hqcYYdPwi3Ayflb/BFpYzt8IZ0RR QBleST6zIMSVqzUu4/v1c6eEO7S+fpWthBzceyYa4McOoKX9uoM5mw8QnMdW6o83Jc1nKkh3x6saS YRjjMwwsLQzzogt95FJE8GNMy4Dr4O6k+GsR3iWOaEp8XYO7Qm2wU+MYfxKwL52VYZNXD5uB3scW6 VZD08lvEVMukhMNY0VwQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rGlaf-006lpU-38; Fri, 22 Dec 2023 19:53:50 +0000 Received: from smtp-fw-9105.amazon.com ([207.171.188.204]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rGlaZ-006ljh-0B; Fri, 22 Dec 2023 19:53:47 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1703274824; x=1734810824; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=V7RkoctXcKrPHqjafH3kK9YvW0BnbS/G0GKRcf3+doU=; b=TPTOQNRaCECE6Kjd8dFr5GKv4yVAPY1hC1qIQhiCNQbljfTgbIj+XFRj CTzDwNbjoJqtTTKqg5n1tJ53VZtZ1K1WenNilG3+QgQfz50bpNZffJMNm Id7XQepzigXlmTf4ygiDtnhUgTpaCtMv58RfKfsQTeODfGeeacbFX5+7s 4=; X-IronPort-AV: E=Sophos;i="6.04,297,1695686400"; d="scan'208";a="693396640" Received: from pdx4-co-svc-p1-lb2-vlan2.amazon.com (HELO email-inbound-relay-iad-1d-m6i4x-153b24bc.us-east-1.amazon.com) ([10.25.36.210]) by smtp-border-fw-9105.sea19.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Dec 2023 19:53:40 +0000 Received: from smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev (iad7-ws-svc-p70-lb3-vlan3.iad.amazon.com [10.32.235.38]) by email-inbound-relay-iad-1d-m6i4x-153b24bc.us-east-1.amazon.com (Postfix) with ESMTPS id 72469C1DD1; Fri, 22 Dec 2023 19:53:31 +0000 (UTC) Received: from EX19MTAUWB001.ant.amazon.com [10.0.7.35:12400] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.27.188:2525] with esmtp (Farcaster) id dfbf3b6e-c104-498b-b8dd-3bacfde49b32; Fri, 22 Dec 2023 19:53:30 +0000 (UTC) X-Farcaster-Flow-ID: dfbf3b6e-c104-498b-b8dd-3bacfde49b32 Received: from EX19D020UWC004.ant.amazon.com (10.13.138.149) by EX19MTAUWB001.ant.amazon.com (10.250.64.248) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:53:30 +0000 Received: from dev-dsk-graf-1a-5ce218e4.eu-west-1.amazon.com (10.253.83.51) by EX19D020UWC004.ant.amazon.com (10.13.138.149) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:53:26 +0000 From: Alexander Graf To: CC: , , , , , , , Eric Biederman , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , "Rob Herring" , Steven Rostedt , "Andrew Morton" , Mark Rutland , "Tom Lendacky" , Ashish Kalra , James Gowans , Stanislav Kinsburskii , , , , Anthony Yznaga , Usama Arif , David Woodhouse , Benjamin Herrenschmidt Subject: [PATCH v2 15/17] tracing: Recover trace events from kexec handover Date: Fri, 22 Dec 2023 19:51:42 +0000 Message-ID: <20231222195144.24532-10-graf@amazon.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20231222195144.24532-1-graf@amazon.com> References: <20231222193607.15474-1-graf@amazon.com> <20231222195144.24532-1-graf@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.253.83.51] X-ClientProxiedBy: EX19D035UWB003.ant.amazon.com (10.13.138.85) To EX19D020UWC004.ant.amazon.com (10.13.138.149) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231222_115343_183996_7D1A91EA X-CRM114-Status: GOOD ( 28.36 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org This patch implements all logic necessary to match a new trace event that we add against preserved trace events from kho. If we find a match, we give the new trace event the old event's identifier. That way, trace read-outs are able to make sense of buffer contents again because the parsing code for events looks at the same identifiers. Signed-off-by: Alexander Graf --- v1 -> v2: - make kho_get_fdt() const - Get events as array from a property, use fingerprint instead of names to identify events - Remove ifdefs --- kernel/trace/trace_output.c | 158 +++++++++++++++++++++++++++++++++++- 1 file changed, 156 insertions(+), 2 deletions(-) diff --git a/kernel/trace/trace_output.c b/kernel/trace/trace_output.c index 7d8815352e20..937002a204e1 100644 --- a/kernel/trace/trace_output.c +++ b/kernel/trace/trace_output.c @@ -24,6 +24,8 @@ DECLARE_RWSEM(trace_event_sem); static struct hlist_head event_hash[EVENT_HASHSIZE] __read_mostly; +static bool trace_is_kho_event(int type); + enum print_line_t trace_print_bputs_msg_only(struct trace_iterator *iter) { struct trace_seq *s = &iter->seq; @@ -784,7 +786,7 @@ static DEFINE_IDA(trace_event_ida); static void free_trace_event_type(int type) { - if (type >= __TRACE_LAST_TYPE) + if (type >= __TRACE_LAST_TYPE && !trace_is_kho_event(type)) ida_free(&trace_event_ida, type); } @@ -810,6 +812,156 @@ void trace_event_read_unlock(void) up_read(&trace_event_sem); } + +/** + * trace_kho_get_map - Return the KHO event map + * @pmap: Pointer to a trace map array. Will be filled on success. + * @plen: Pointer to the length of the map. Will be filled on success. + * @unallocated: True if the event does not have an ID yet + * + * Event types are semi-dynamically generated. To ensure that + * their identifiers match before and after kexec with KHO, + * we store an event map in the KHO DT. Whenever we need the + * map, this function provides it. + * + * The first time we request a map, it also walks through it and + * reserves all identifiers so later event registration has find their + * identifier already reserved. + */ +static int trace_kho_get_map(const struct trace_event_map **pmap, int *plen, + bool unallocated) +{ + static const struct trace_event_map *event_map; + static int event_map_len; + static bool event_map_reserved; + const struct trace_event_map *map = NULL; + const void *fdt = kho_get_fdt(); + const char *path = "/ftrace"; + int off, err, len = 0; + int i; + + if (!IS_ENABLED(CONFIG_FTRACE_KHO) || !fdt) + return -EINVAL; + + if (event_map) { + map = event_map; + len = event_map_len; + } + + if (!map) { + off = fdt_path_offset(fdt, path); + + if (off < 0) { + pr_debug("Could not find '%s' in DT", path); + return -EINVAL; + } + + err = fdt_node_check_compatible(fdt, off, "ftrace-v1"); + if (err) { + pr_warn("Node '%s' has invalid compatible", path); + return -EINVAL; + } + + map = fdt_getprop(fdt, off, "events", &len); + if (!map) + return -EINVAL; + + event_map = map; + event_map_len = len; + } + + if (unallocated && !event_map_reserved) { + /* + * Reserve all IDs in our IDA. We only have a working IDA later + * in boot, so restrict it to when we allocate a dynamic type id + * for an event. + */ + for (i = 0; i < len; i += sizeof(*map)) { + const struct trace_event_map *imap = (void *)map + i; + + if (imap->type < __TRACE_LAST_TYPE) + continue; + if (ida_alloc_range(&trace_event_ida, imap->type, imap->type, + GFP_KERNEL) != imap->type) { + pr_warn("Unable to reserve id %d", imap->type); + return -EINVAL; + } + } + + event_map_reserved = true; + } + + *pmap = map; + *plen = len; + + return 0; +} + +/** + * trace_is_kho_event - returns true if the event type is KHO reserved + * @event: the event type to enumerate + * + * With KHO, we reserve all previous kernel's trace event types in the + * KHO DT. Then, when we allocate a type, we just reuse the previous + * kernel's value. However, that means we have to keep these type identifiers + * reserved across the lifetime of the system, because we may get a new event + * that matches the old kernel's event fingerprint. This function is a small + * helper that allows us to check whether a type ID is in use by KHO. + */ +static bool trace_is_kho_event(int type) +{ + const struct trace_event_map *map = NULL; + int len, i; + + if (trace_kho_get_map(&map, &len, false)) + return false; + + if (!map) + return false; + + for (i = 0; i < len; i += sizeof(*map), map++) + if (map->type == type) + return true; + + return false; +} + +/** + * trace_kho_fill_event_type - restore event type info from KHO + * @event: the event to enumerate + * + * Event types are semi-dynamically generated. To ensure that + * their identifiers match before and after kexec with KHO, + * let's match up unique fingerprint - either their predetermined + * type or their crc32 value - and fill in the respective type + * information if we booted with KHO. + */ +static bool trace_kho_fill_event_type(struct trace_event *event) +{ + const struct trace_event_map *map = NULL; + int len = 0, i; + u32 crc32; + + if (trace_kho_get_map(&map, &len, !event->type)) + return false; + + crc32 = event2fp(event); + + for (i = 0; i < len; i += sizeof(*map), map++) { + if (map->crc32 == crc32) { + if (!map->type) + return false; + + event->type = map->type; + return true; + } + } + + pr_debug("Could not find event"); + + return false; +} + /** * register_trace_event - register output for an event type * @event: the event type to register @@ -838,7 +990,9 @@ int register_trace_event(struct trace_event *event) if (WARN_ON(!event->funcs)) goto out; - if (!event->type) { + if (trace_kho_fill_event_type(event)) { + pr_debug("Recovered id=%d", event->type); + } else if (!event->type) { event->type = alloc_trace_event_type(); if (!event->type) goto out; From patchwork Fri Dec 22 19:51:43 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Graf X-Patchwork-Id: 13503789 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 078F0C46CD8 for ; Fri, 22 Dec 2023 19:54:26 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:CC:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=yMBXyCk7L2APCuOUo2w+XmqDLyCws/15Ab1H/yzUgF0=; b=R5ukOIi63HPXpI Vh4XrhruW69Qb2gIN5ppXrMKAPRf+1YsPwLVPki0wPIAl/VDjF3kgYGX2Fb/7W5mbwBnsWEpEsP2k PcmiwYov0NOTdytYndADiScpsU/Y5RG8+Ual4np7hrPCa6x1u5H1BacJvdV/m5VRffX1iduD1aLxV hRvXuJM8MsfD9FeKka/IPYWoJEa1M9vqqDy4P/u/lCmLkdAvmeFsIkAyAWkRF2Z8Q9Sfe9lgCTQs1 ISEAgHc/DvuW8F3gyiLSA/WrEPX9SrD337ZDYgI+pQ8Z4C5Z3md3nfH++hdJHhI8qx5AXd8GLH3kE 20D1yeSHrOxq/9fCz4Cg==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rGlao-006lxb-30; Fri, 22 Dec 2023 19:53:58 +0000 Received: from desiato.infradead.org ([2001:8b0:10b:1:d65d:64ff:fe57:4e05]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rGlai-006lrP-2C; Fri, 22 Dec 2023 19:53:52 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=desiato.20200630; h=Content-Transfer-Encoding:Content-Type :MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:CC:To:From: Sender:Reply-To:Content-ID:Content-Description; bh=K22H13Upw8ZJvq6V1e5pgkkFku3zAKlZ1ZPbskJJqpo=; b=dNMTykfrYuqu9FitmygJeTEoYO 5cnfZhAQHXMbQWDNZeRTOHWOZWnOMUw3QExnz1fRXvp5tPx1C0vfZbaKNAU+OazljHXoeohYzXyn8 BoOqzYyQ1652Tkny2BMaWp2P0NbTo225CrQ6JMbNlisQJbIA55YPuq6E79R8yyjc3TMzVuN03CfCs DrnvvNaeY90dedHyQWA0fmZgegPiKRuBJg0oCvcOcnJAqTRbjiqSbpWDCZQZvFRLqAxxQTADRzAY3 5OH5aovYUEzRWXGTBAHa5lbhybgya086NAuDLsECPd1GHQsqs1RtSa2OFMktrbi081tvalxL4WWTG 3eo3mOWw==; Received: from smtp-fw-9102.amazon.com ([207.171.184.29]) by desiato.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rGlaf-00FNbr-0A; Fri, 22 Dec 2023 19:53:51 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1703274829; x=1734810829; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=K22H13Upw8ZJvq6V1e5pgkkFku3zAKlZ1ZPbskJJqpo=; b=Qw9xexZzeQRqqR/Lc87YkFYc+ZsqgLcrmrtjaoiifI2pDLDgcnLTSOdh W2hIbuBJmDHMeBYGp7hmfQr0d9K2dQsuCOXv10OYlIt2w/z96UcXogEmf 6Qo2W2/8xFYkmP2rxPFaRqsUJ5CJpIScT5tcPHyVChRGG1Qcc0iM6Vw/H Y=; X-IronPort-AV: E=Sophos;i="6.04,297,1695686400"; d="scan'208";a="385308754" Received: from pdx4-co-svc-p1-lb2-vlan3.amazon.com (HELO email-inbound-relay-iad-1a-m6i4x-bbc6e425.us-east-1.amazon.com) ([10.25.36.214]) by smtp-border-fw-9102.sea19.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Dec 2023 19:53:45 +0000 Received: from smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev (iad7-ws-svc-p70-lb3-vlan3.iad.amazon.com [10.32.235.38]) by email-inbound-relay-iad-1a-m6i4x-bbc6e425.us-east-1.amazon.com (Postfix) with ESMTPS id 3C35E80CB5; Fri, 22 Dec 2023 19:53:34 +0000 (UTC) Received: from EX19MTAUWC002.ant.amazon.com [10.0.7.35:50263] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.56.23:2525] with esmtp (Farcaster) id d873e908-4dd5-450d-9721-5ebc91d6108f; Fri, 22 Dec 2023 19:53:34 +0000 (UTC) X-Farcaster-Flow-ID: d873e908-4dd5-450d-9721-5ebc91d6108f Received: from EX19D020UWC004.ant.amazon.com (10.13.138.149) by EX19MTAUWC002.ant.amazon.com (10.250.64.143) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:53:34 +0000 Received: from dev-dsk-graf-1a-5ce218e4.eu-west-1.amazon.com (10.253.83.51) by EX19D020UWC004.ant.amazon.com (10.13.138.149) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:53:30 +0000 From: Alexander Graf To: CC: , , , , , , , Eric Biederman , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , "Rob Herring" , Steven Rostedt , "Andrew Morton" , Mark Rutland , "Tom Lendacky" , Ashish Kalra , James Gowans , Stanislav Kinsburskii , , , , Anthony Yznaga , Usama Arif , David Woodhouse , Benjamin Herrenschmidt Subject: [PATCH v2 16/17] tracing: Add config option for kexec handover Date: Fri, 22 Dec 2023 19:51:43 +0000 Message-ID: <20231222195144.24532-11-graf@amazon.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20231222195144.24532-1-graf@amazon.com> References: <20231222193607.15474-1-graf@amazon.com> <20231222195144.24532-1-graf@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.253.83.51] X-ClientProxiedBy: EX19D035UWB003.ant.amazon.com (10.13.138.85) To EX19D020UWC004.ant.amazon.com (10.13.138.149) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231222_195349_747185_68A01CE8 X-CRM114-Status: GOOD ( 13.59 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org Now that all bits are in place to allow ftrace to pass its trace data into the next kernel on kexec, let's give users a kconfig option to enable the functionality. Signed-off-by: Alexander Graf --- v1 -> v2: - Select crc32 --- kernel/trace/Kconfig | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/kernel/trace/Kconfig b/kernel/trace/Kconfig index 61c541c36596..418a5ae11aac 100644 --- a/kernel/trace/Kconfig +++ b/kernel/trace/Kconfig @@ -1169,6 +1169,20 @@ config HIST_TRIGGERS_DEBUG If unsure, say N. +config FTRACE_KHO + bool "Ftrace Kexec handover support" + depends on KEXEC_KHO + select CRC32 + help + Enable support for ftrace to pass metadata across kexec so the new + kernel continues to use the previous kernel's trace buffers. + + This can be useful when debugging kexec performance or correctness + issues: The new kernel can dump the old kernel's trace buffer which + contains all events until reboot. + + If unsure, say N. + source "kernel/trace/rv/Kconfig" endif # FTRACE From patchwork Fri Dec 22 19:51:44 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Alexander Graf X-Patchwork-Id: 13503790 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id E113BC41535 for ; Fri, 22 Dec 2023 19:54:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender: Content-Transfer-Encoding:Content-Type:List-Subscribe:List-Help:List-Post: List-Archive:List-Unsubscribe:List-Id:MIME-Version:References:In-Reply-To: Message-ID:Date:Subject:CC:To:From:Reply-To:Content-ID:Content-Description: Resent-Date:Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID: List-Owner; bh=V5pIbdFcUK9nBjV0MXMsmxKhBPsiK6qkVT47QNnnHTw=; b=bjOEUexCBleJYN 4qx6m/wDnBbT0BI5EIDWvWdnXG8ne54lsWBdSrARDKxAQre6p2AtyDkqlnhxm9MF0PU5TWDqGP8vE 048S2kbP9g8jOKWXSEF2UF08ZTCUVimcjmpSn33G79V7EWxBOIaddtvrPz53to+JFeQ+4D1+n26gd eUb2kt+QjTJwrJS4SGc5zHG/6eypEOIipGxVk4GYhg9WBFfA4GTYeoXm4D6aBWTsHtKyTky2KEpjH h+3/Z1pd/gu3zk36svPM6CAAsUpNQbPPJDJLHDEz1upSay7szElLOhj7HxCYVuWpoxxIuNUNfLdbd BaeOghv7gjeUZ4l4zftA==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.96 #2 (Red Hat Linux)) id 1rGlax-006m3w-1E; Fri, 22 Dec 2023 19:54:07 +0000 Received: from smtp-fw-52002.amazon.com ([52.119.213.150]) by bombadil.infradead.org with esmtps (Exim 4.96 #2 (Red Hat Linux)) id 1rGlam-006luJ-0N; Fri, 22 Dec 2023 19:54:02 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=amazon.com; i=@amazon.com; q=dns/txt; s=amazon201209; t=1703274836; x=1734810836; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=a+3i4loybtPdN0BnSihZonArbdMyz7IERtq4xLgkTQM=; b=UnmFO7dgnpQoIvHDXRPrXy5s8+CTuS4fPU/9ATOSqq6mp73kNjrpWCz8 CESTzsIfNwxpXTW15n1Q+pzr2YSdobCEDEZCLmDepUHXcnw3HsB3+E4a4 P+PBBh/icwJlUjldS+vV69tpyCTs7cMc8n2FSvcdSnDmDQSDDYZ7hLARW U=; X-IronPort-AV: E=Sophos;i="6.04,297,1695686400"; d="scan'208";a="602627379" Received: from iad12-co-svc-p1-lb1-vlan3.amazon.com (HELO email-inbound-relay-pdx-2b-m6i4x-f323d91c.us-west-2.amazon.com) ([10.43.8.6]) by smtp-border-fw-52002.iad7.amazon.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 22 Dec 2023 19:53:53 +0000 Received: from smtpout.prod.us-west-2.prod.farcaster.email.amazon.dev (pdx2-ws-svc-p26-lb5-vlan2.pdx.amazon.com [10.39.38.66]) by email-inbound-relay-pdx-2b-m6i4x-f323d91c.us-west-2.amazon.com (Postfix) with ESMTPS id 69E8040DFE; Fri, 22 Dec 2023 19:53:38 +0000 (UTC) Received: from EX19MTAUWA001.ant.amazon.com [10.0.7.35:18182] by smtpin.naws.us-west-2.prod.farcaster.email.amazon.dev [10.0.15.218:2525] with esmtp (Farcaster) id bf3bafa8-c33b-453a-b6f8-9e29248b4425; Fri, 22 Dec 2023 19:53:38 +0000 (UTC) X-Farcaster-Flow-ID: bf3bafa8-c33b-453a-b6f8-9e29248b4425 Received: from EX19D020UWC004.ant.amazon.com (10.13.138.149) by EX19MTAUWA001.ant.amazon.com (10.250.64.204) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:53:37 +0000 Received: from dev-dsk-graf-1a-5ce218e4.eu-west-1.amazon.com (10.253.83.51) by EX19D020UWC004.ant.amazon.com (10.13.138.149) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.2.1118.40; Fri, 22 Dec 2023 19:53:33 +0000 From: Alexander Graf To: CC: , , , , , , , Eric Biederman , "H. Peter Anvin" , Andy Lutomirski , Peter Zijlstra , "Rob Herring" , Steven Rostedt , "Andrew Morton" , Mark Rutland , "Tom Lendacky" , Ashish Kalra , James Gowans , Stanislav Kinsburskii , , , , Anthony Yznaga , Usama Arif , David Woodhouse , Benjamin Herrenschmidt Subject: [PATCH v2 17/17] devicetree: Add bindings for ftrace KHO Date: Fri, 22 Dec 2023 19:51:44 +0000 Message-ID: <20231222195144.24532-12-graf@amazon.com> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20231222195144.24532-1-graf@amazon.com> References: <20231222193607.15474-1-graf@amazon.com> <20231222195144.24532-1-graf@amazon.com> MIME-Version: 1.0 X-Originating-IP: [10.253.83.51] X-ClientProxiedBy: EX19D035UWB003.ant.amazon.com (10.13.138.85) To EX19D020UWC004.ant.amazon.com (10.13.138.149) X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20231222_115356_384666_C8D741B9 X-CRM114-Status: GOOD ( 20.70 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org With ftrace in KHO, we are creating an ABI between old kernel and new kernel about the state that they transfer. To ensure that we document that state and catch any breaking change, let's add its schema to the common devicetree bindings. This way, we can quickly reason about the state that gets passed. Signed-off-by: Alexander Graf --- .../bindings/kho/ftrace/ftrace-array.yaml | 46 +++++++++++++++ .../bindings/kho/ftrace/ftrace-cpu.yaml | 56 +++++++++++++++++++ .../bindings/kho/ftrace/ftrace.yaml | 48 ++++++++++++++++ 3 files changed, 150 insertions(+) create mode 100644 Documentation/devicetree/bindings/kho/ftrace/ftrace-array.yaml create mode 100644 Documentation/devicetree/bindings/kho/ftrace/ftrace-cpu.yaml create mode 100644 Documentation/devicetree/bindings/kho/ftrace/ftrace.yaml diff --git a/Documentation/devicetree/bindings/kho/ftrace/ftrace-array.yaml b/Documentation/devicetree/bindings/kho/ftrace/ftrace-array.yaml new file mode 100644 index 000000000000..9960fefc292d --- /dev/null +++ b/Documentation/devicetree/bindings/kho/ftrace/ftrace-array.yaml @@ -0,0 +1,46 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/kho/ftrace/ftrace-array.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Ftrace trace array + +maintainers: + - Alexander Graf + +properties: + compatible: + enum: + - ftrace,array-v1 + + trace_flags: + $ref: /schemas/types.yaml#/definitions/uint32 + description: + Bitmap of all the trace flags that were enabled in the trace array at the + point of serialization. + +# Subnodes will be of type "ftrace,cpu-v1", one each per CPU +additionalProperties: true + +required: + - compatible + - trace_flags + +examples: + - | + ftrace { + compatible = "ftrace-v1"; + events = <1 1 2 2 3 3>; + + global_trace { + compatible = "ftrace,array-v1"; + trace_flags = < 0x3354601 >; + + cpu0 { + compatible = "ftrace,cpu-v1"; + cpu = < 0x00 >; + mem = < 0x101000000ULL 0x38ULL 0x101000100ULL 0x1000ULL 0x101000038ULL 0x38ULL 0x101002000ULL 0x1000ULL>; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/kho/ftrace/ftrace-cpu.yaml b/Documentation/devicetree/bindings/kho/ftrace/ftrace-cpu.yaml new file mode 100644 index 000000000000..58c715e93f37 --- /dev/null +++ b/Documentation/devicetree/bindings/kho/ftrace/ftrace-cpu.yaml @@ -0,0 +1,56 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/kho/ftrace/ftrace-cpu.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Ftrace per-CPU ring buffer contents + +maintainers: + - Alexander Graf + +properties: + compatible: + enum: + - ftrace,cpu-v1 + + cpu: + $ref: /schemas/types.yaml#/definitions/uint32 + description: + CPU number of the CPU that this ring buffer belonged to when it was + serialized. + + mem: + $ref: /schemas/types.yaml#/definitions/uint32-array + description: + Array of { u64 phys_addr, u64 len } elements that describe a list of ring + buffer pages. Each page consists of two elements. The first element + describes the location of the struct buffer_page that contains metadata + for a given ring buffer page, such as the ring's head indicator. The + second element points to the ring buffer data page which contains the raw + trace data. + +additionalProperties: false + +required: + - compatible + - cpu + - mem + +examples: + - | + ftrace { + compatible = "ftrace-v1"; + events = <1 1 2 2 3 3>; + + global_trace { + compatible = "ftrace,array-v1"; + trace_flags = < 0x3354601 >; + + cpu0 { + compatible = "ftrace,cpu-v1"; + cpu = < 0x00 >; + mem = < 0x101000000ULL 0x38ULL 0x101000100ULL 0x1000ULL 0x101000038ULL 0x38ULL 0x101002000ULL 0x1000ULL>; + }; + }; + }; diff --git a/Documentation/devicetree/bindings/kho/ftrace/ftrace.yaml b/Documentation/devicetree/bindings/kho/ftrace/ftrace.yaml new file mode 100644 index 000000000000..b87a64843af3 --- /dev/null +++ b/Documentation/devicetree/bindings/kho/ftrace/ftrace.yaml @@ -0,0 +1,48 @@ +# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) +%YAML 1.2 +--- +$id: http://devicetree.org/schemas/kho/ftrace/ftrace.yaml# +$schema: http://devicetree.org/meta-schemas/core.yaml# + +title: Ftrace core data + +maintainers: + - Alexander Graf + +properties: + compatible: + enum: + - ftrace-v1 + + events: + $ref: /schemas/types.yaml#/definitions/uint32-array + description: + Array of { u32 crc, u32 type } elements. Each element contains a unique + identifier for an event, followed by the identifier that this event had + in the previous kernel's trace buffers. + +# Other child nodes will be of type "ftrace,array-v1". Each of which describe +# a trace buffer +additionalProperties: true + +required: + - compatible + - events + +examples: + - | + ftrace { + compatible = "ftrace-v1"; + events = <1 1 2 2 3 3>; + + global_trace { + compatible = "ftrace,array-v1"; + trace_flags = < 0x3354601 >; + + cpu0 { + compatible = "ftrace,cpu-v1"; + cpu = < 0x00 >; + mem = < 0x101000000ULL 0x38ULL 0x101000100ULL 0x1000ULL 0x101000038ULL 0x38ULL 0x101002000ULL 0x1000ULL>; + }; + }; + };