@@ -2698,6 +2698,31 @@
kgdbwait [KGDB,EARLY] Stop kernel execution and enter the
kernel debugger at the earliest opportunity.
+ kho= [KEXEC,EARLY]
+ Format: { "0" | "1" | "off" | "on" | "y" | "n" }
+ Enables or disables Kexec HandOver.
+ "0" | "off" | "n" - kexec handover is disabled
+ "1" | "on" | "y" - kexec handover is enabled
+
+ kho_scratch= [KEXEC,EARLY]
+ Format: ll[KMG],mm[KMG],nn[KMG] | nn%
+ Defines the size of the KHO scratch region. The KHO
+ scratch regions are physically contiguous memory
+ ranges that can only be used for non-kernel
+ allocations. That way, even when memory is heavily
+ fragmented with handed over memory, the kexeced
+ kernel will always have enough contiguous ranges to
+ bootstrap itself.
+
+ It is possible to specify the exact amount of
+ memory in the form of "ll[KMG],mm[KMG],nn[KMG]"
+ where the first parameter defines the size of a low
+ memory scratch area, the second parameter defines
+ the size of a global scratch area and the third
+ parameter defines the size of additional per-node
+ scratch areas. The form "nn%" defines a scale factor
+ (in percent) of the memory that was used during boot.
+
kmac= [MIPS] Korina ethernet MAC address.
Configure the RouterBoard 532 series on-chip
Ethernet adapter MAC address.
new file mode 100644
@@ -0,0 +1,70 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+.. _concepts:
+
+=======================
+Kexec Handover Concepts
+=======================
+
+Kexec HandOver (KHO) is a mechanism that allows Linux to preserve state -
+arbitrary properties as well as memory locations - across kexec.
+
+It introduces multiple concepts:
+
+KHO State tree
+==============
+
+Every KHO kexec carries a state tree, in the format of flattened device tree
+(FDT), that describes the state of the system. Device drivers can register
+with KHO to serialize their state before kexec. After a KHO kexec, device
+drivers can read the FDT and extract the previous state.
+
+KHO only uses the FDT container format and libfdt library, but does not
+adhere to the same property semantics that normal device trees do: Properties
+are passed in native endianness and standardized properties like ``reg`` and
+``ranges`` do not exist, hence there are no ``#...-cells`` properties.
+
+Scratch Regions
+===============
+
+To boot into kexec, we need to have a physically contiguous memory range that
+contains no handed over memory. Kexec then places the target kernel and initrd
+into that region. The new kernel exclusively uses this region for memory
+allocations during boot, up to the initialization of the page allocator.
+
+We guarantee that we always have such regions through the scratch regions: On
+first boot KHO allocates several physically contiguous memory regions. Since
+after kexec these regions will be used by early memory allocations, there is a
+scratch region per NUMA node plus a scratch region to satisfy allocation
+requests that do not require a particular NUMA node assignment.
+By default, the size of the scratch region is calculated based on the amount
+of memory allocated during boot. The ``kho_scratch`` kernel command line
+option may be used to explicitly define the size of the scratch regions.
+The scratch regions are declared as CMA when the page allocator is
+initialized so that their memory can be used during system lifetime. CMA
+gives us the guarantee that no handover pages land in that region, because
+handover pages must be at a static physical memory location and CMA enforces
+that only movable pages can be located inside.
+
+After KHO kexec, we ignore the ``kho_scratch`` kernel command line option and
+instead reuse the exact same region that was originally allocated. This allows
+us to recursively execute any number of KHO kexecs. Because we used this region
+for boot memory allocations and as target memory for kexec blobs, some parts
+of that memory region may be reserved. These reservations are irrelevant for
+the next KHO, because kexec can overwrite even the original kernel.
+
+.. _finalization_phase:
+
+KHO finalization phase
+======================
+
+To enable a user space based kexec file loader, the kernel needs to be able to
+provide the FDT that describes the previous kernel's state before
+performing the actual kexec. The process of generating that FDT is
+called serialization. When the FDT is generated, some properties
+of the system may become immutable because they are already written down
+in the FDT. That state is called the KHO finalization phase.
+
+With the in-kernel kexec file loader, i.e., using the syscall
+``kexec_file_load``, the KHO FDT is not created until the actual kexec. Thus
+the finalization phase is much shorter. User space can optionally choose to
+generate the FDT early using the debugfs interface.
new file mode 100644
@@ -0,0 +1,62 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+=======
+KHO FDT
+=======
+
+KHO uses the flattened device tree (FDT) container format and libfdt
+library to create and parse the data that is passed between the
+kernels. The properties in KHO FDT are stored in native format and can
+include any data KHO users need to preserve. Parsing of FDT subnodes is the
+responsibility of KHO users, except for nodes and properties defined by
+KHO itself.
+
+KHO nodes and properties
+========================
+
+Node ``preserved-memory``
+-------------------------
+
+KHO saves a special node named ``preserved-memory`` under the root node.
+This node contains the metadata for KHO to preserve pages across kexec.
+
+Property ``compatible``
+-----------------------
+
+The ``compatible`` property determines compatibility between the kernel
+that created the KHO FDT and the kernel that attempts to load it.
+If the kernel that loads the KHO FDT is not compatible with it, the entire
+KHO process will be bypassed.
+
+Examples
+========
+
+The following example demonstrates a KHO FDT that preserves two memory
+regions created with the ``reserve_mem`` kernel command line parameter::
+
+ /dts-v1/;
+
+ / {
+ compatible = "kho-v1";
+
+ memblock {
+ compatible = "memblock-v1";
+
+ region1 {
+ compatible = "reserve-mem-v1";
+ start = <0xc07a 0x4000000>;
+ size = <0x01 0x00>;
+ };
+
+ region2 {
+ compatible = "reserve-mem-v1";
+ start = <0xc07b 0x4000000>;
+ size = <0x8000 0x00>;
+ };
+
+ };
+
+ preserved-memory {
+ metadata = <0x00 0x00>;
+ };
+ };
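Because KHO stores properties in native endianness, a tool such as ``dtc`` that prints them as big-endian 32-bit cells shows values that look scrambled. The following Python sketch re-reads the cell pairs from the example above as native 64-bit integers. It assumes that the dump came from a little-endian machine and that ``start`` and ``size`` are 64-bit values; neither is stated in the example, and the helper name ``cells_to_native_u64`` is made up for illustration.

```python
import struct


def cells_to_native_u64(first_cell, second_cell, byteorder="little"):
    """Reinterpret two big-endian 32-bit cells as one native 64-bit value."""
    # dtc prints property bytes as big-endian cells, so packing the cells
    # back as big-endian recovers the raw bytes stored in the FDT.
    raw = struct.pack(">II", first_cell, second_cell)
    # Assumption: the producing kernel was little-endian.
    return int.from_bytes(raw, byteorder)


# Cell pairs copied from the example FDT above.
regions = {
    "region1": {"start": (0xC07A, 0x4000000), "size": (0x01, 0x00)},
    "region2": {"start": (0xC07B, 0x4000000), "size": (0x8000, 0x00)},
}

decoded = {
    name: {prop: cells_to_native_u64(*cells) for prop, cells in props.items()}
    for name, props in regions.items()
}

for name, props in decoded.items():
    print(f"{name}: start={props['start']:#x} size={props['size'] >> 20} MiB")
```

Under these assumptions the two regions decode to 16 MiB and 8 MiB, with ``region2`` starting exactly where ``region1`` ends, which suggests that the little-endian reading is the intended one.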
new file mode 100644
@@ -0,0 +1,14 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+========================
+Kexec Handover Subsystem
+========================
+
+.. toctree::
+ :maxdepth: 1
+
+ concepts
+ usage
+ fdt
+
+.. only:: subproject and html
new file mode 100644
@@ -0,0 +1,118 @@
+.. SPDX-License-Identifier: GPL-2.0-or-later
+
+====================
+Kexec Handover Usage
+====================
+
+Kexec HandOver (KHO) is a mechanism that allows Linux to preserve state -
+arbitrary properties as well as memory locations - across kexec.
+
+This document expects that you are familiar with the base KHO
+:ref:`concepts <concepts>`. If you have not read
+them yet, please do so now.
+
+Prerequisites
+=============
+
+KHO is available when the ``CONFIG_KEXEC_HANDOVER`` config option is set to y
+at compile time. Every KHO producer may have its own config option that you
+need to enable if you would like to preserve their respective state across
+kexec.
+
+To use KHO, please boot the kernel with the ``kho=on`` command line
+parameter. You may use the ``kho_scratch`` parameter to define the size of
+the scratch regions. For example, ``kho_scratch=16M,512M,256M`` will reserve
+a 16 MiB low memory scratch area, a 512 MiB global scratch region, and
+256 MiB per NUMA node scratch regions on boot.
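To make the two documented ``kho_scratch`` forms concrete, here is a minimal Python sketch that splits a value into its parts. It is an illustration only, not the kernel's parser: it accepts only decimal numbers with an optional uppercase ``K``, ``M`` or ``G`` suffix, treated as binary multiples, and the function names and returned dictionary keys are invented for this example.

```python
import re

# Binary multipliers for the size suffixes (K = 2^10, M = 2^20, G = 2^30).
_UNITS = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}
_SIZE_RE = re.compile(r"^(\d+)([KMG]?)$")


def parse_size(text):
    """Convert a string such as '16M' into a number of bytes."""
    match = _SIZE_RE.match(text.strip())
    if match is None:
        raise ValueError(f"bad size: {text!r}")
    return int(match.group(1)) * _UNITS[match.group(2)]


def parse_kho_scratch(value):
    """Parse 'll[KMG],mm[KMG],nn[KMG]' or 'nn%' (hypothetical helper)."""
    value = value.strip()
    if value.endswith("%"):
        # Scale factor relative to the memory used during boot.
        return {"percent": int(value[:-1])}
    parts = value.split(",")
    if len(parts) != 3:
        raise ValueError("expected three sizes: lowmem,global,per-node")
    lowmem, global_size, pernode = (parse_size(part) for part in parts)
    return {"lowmem": lowmem, "global": global_size, "pernode": pernode}


if __name__ == "__main__":
    print(parse_kho_scratch("16M,512M,256M"))
    print(parse_kho_scratch("150%"))
```

For ``16M,512M,256M`` the sketch yields the low memory, global and per-node sizes in bytes, in that order, matching the order in which the parameter lists them.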
+
+Perform a KHO kexec
+===================
+
+First, before you perform a KHO kexec, you can optionally move the system into
+the :ref:`KHO finalization phase <finalization_phase>` ::
+
+ # echo 1 > /sys/kernel/debug/kho/out/finalize
+
+After this command, the KHO FDT is available in
+``/sys/kernel/debug/kho/out/fdt``.
+
+Next, load the target payload and kexec into it. It is important that you
+use the ``-s`` parameter to use the in-kernel kexec file loader, as user
+space kexec tooling currently has no support for KHO with the user space
+based file loader ::
+
+ # kexec -l Image --initrd=initrd -s
+ # kexec -e
+
+If you skipped finalization in the first step, ``kexec -e`` triggers
+FDT finalization automatically. The new kernel will boot up and contain
+some of the previous kernel's state.
+
+For example, if you used the ``reserve_mem`` command line parameter to create
+an early memory reservation, the new kernel will have that memory at the
+same physical address as the old kernel.
+
+Unfreeze KHO FDT data
+=====================
+
+You can move the system out of the KHO finalization phase by calling ::
+
+ # echo 0 > /sys/kernel/debug/kho/out/finalize
+
+After this command, the KHO FDT is no longer available in
+``/sys/kernel/debug/kho/out/fdt``, and the states kept in KHO can be
+modified by other kernel subsystems again.
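The finalize and unfreeze steps can also be scripted. The Python sketch below wraps the documented ``finalize`` file in two helpers; the helper names are hypothetical, and the demo at the end runs against a temporary stand-in file so that it works without KHO or root privileges. On a real system, the default path would be used instead and debugfs must be mounted.

```python
import tempfile
from pathlib import Path

# Path documented for the KHO debugfs interface.
KHO_FINALIZE = Path("/sys/kernel/debug/kho/out/finalize")


def set_finalized(enabled, path=KHO_FINALIZE):
    """Write "1" to enter the finalization phase, "0" to leave it."""
    Path(path).write_text("1\n" if enabled else "0\n")


def is_finalized(path=KHO_FINALIZE):
    """Return True when the file reports the finalized state ("1")."""
    return Path(path).read_text().strip() == "1"


# Demo on a temporary stand-in file (assumption: a regular file behaves
# closely enough to the debugfs file for illustration purposes).
with tempfile.TemporaryDirectory() as tmp:
    fake_finalize = Path(tmp) / "finalize"
    set_finalized(True, fake_finalize)
    state_after_enable = is_finalized(fake_finalize)
    set_finalized(False, fake_finalize)
    state_after_disable = is_finalized(fake_finalize)
```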
+
+debugfs Interfaces
+==================
+
+Currently KHO creates the following debugfs interfaces. Note that these
+interfaces may change in the future. They will be moved to sysfs once KHO is
+stabilized.
+
+``/sys/kernel/debug/kho/out/finalize``
+ Kexec HandOver (KHO) allows Linux to transition the state of
+ compatible drivers into the next kexec'ed kernel. To do so,
+ device drivers will serialize their current state into an FDT.
+ While the state is serialized, they are unable to perform
+ any modifications to state that was serialized, such as
+ handed over memory allocations.
+
+ When this file contains "1", the system is in the transition
+ state. When it contains "0", it is not. To switch between the
+ two states, echo the respective number into this file.
+
+``/sys/kernel/debug/kho/out/fdt_max``
+ KHO needs to allocate a buffer for the FDT that gets
+ generated before it knows the final size. By default, it
+ will allocate 10 MiB for it. You can write to this file
+ to modify the size of that allocation.
+
+``/sys/kernel/debug/kho/out/fdt``
+ When the KHO state tree is finalized, the kernel exposes the
+ flattened device tree blob that carries its current KHO
+ state in this file. Kexec user space tooling can use this
+ as the input file for the KHO payload image.
+
+``/sys/kernel/debug/kho/out/scratch_len``
+ To support continuous KHO kexecs, we need to reserve
+ physically contiguous memory regions that will always stay
+ available for future kexec allocations. This file describes
+ the length of these memory regions. Kexec user space tooling
+ can use this to determine where it should place its payload
+ images.
+
+``/sys/kernel/debug/kho/out/scratch_phys``
+ To support continuous KHO kexecs, we need to reserve
+ physically contiguous memory regions that will always stay
+ available for future kexec allocations. This file describes
+ the physical location of these memory regions. Kexec user space
+ tooling can use this to determine where it should place its
+ payload images.
+
+``/sys/kernel/debug/kho/in/fdt``
+ When the kernel was booted with Kexec HandOver (KHO),
+ the state tree that carries metadata about the previous
+ kernel's state is in this file in the format of flattened
+ device tree. This file may disappear once all of its
+ consumers have finished interpreting their metadata.
@@ -90,3 +90,4 @@ Other subsystems
peci/index
wmi/index
tee/index
+ kho/index
@@ -12828,6 +12828,7 @@ F: include/linux/kernfs.h
KEXEC
L: kexec@lists.infradead.org
W: http://kernel.org/pub/linux/utils/kernel/kexec/
+F: Documentation/kho/
F: include/linux/kexec*.h
F: include/uapi/linux/kexec.h
F: kernel/kexec*