[v9,05/39] arm64/gcs: Document the ABI for Guarded Control Stacks

Message ID	20240625-arm64-gcs-v9-5-0f634469b8f0@kernel.org (mailing list archive)
State	New
Headers
Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4CEBD178CEA;
	Tue, 25 Jun 2024 15:01:17 +0000 (UTC)
From: Mark Brown <broonie@kernel.org>
Date: Tue, 25 Jun 2024 15:57:33 +0100
Subject: [PATCH v9 05/39] arm64/gcs: Document the ABI for Guarded Control Stacks
Precedence: bulk
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
Message-Id: <20240625-arm64-gcs-v9-5-0f634469b8f0@kernel.org>
References: <20240625-arm64-gcs-v9-0-0f634469b8f0@kernel.org>
In-Reply-To: <20240625-arm64-gcs-v9-0-0f634469b8f0@kernel.org>
To: Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>, Jonathan Corbet <corbet@lwn.net>, Andrew Morton <akpm@linux-foundation.org>, Marc Zyngier <maz@kernel.org>, Oliver Upton <oliver.upton@linux.dev>, James Morse <james.morse@arm.com>, Suzuki K Poulose <suzuki.poulose@arm.com>, Arnd Bergmann <arnd@arndb.de>, Oleg Nesterov <oleg@redhat.com>, Eric Biederman <ebiederm@xmission.com>, Shuah Khan <shuah@kernel.org>, "Rick P. Edgecombe" <rick.p.edgecombe@intel.com>, Deepak Gupta <debug@rivosinc.com>, Ard Biesheuvel <ardb@kernel.org>, Szabolcs Nagy <Szabolcs.Nagy@arm.com>, Kees Cook <kees@kernel.org>
Cc: "H.J. Lu" <hjl.tools@gmail.com>, Paul Walmsley <paul.walmsley@sifive.com>, Palmer Dabbelt <palmer@dabbelt.com>, Albert Ou <aou@eecs.berkeley.edu>, Florian Weimer <fweimer@redhat.com>, Christian Brauner <brauner@kernel.org>, Thiago Jung Bauermann <thiago.bauermann@linaro.org>, Ross Burton <ross.burton@arm.com>, linux-arm-kernel@lists.infradead.org, linux-doc@vger.kernel.org, kvmarm@lists.linux.dev, linux-fsdevel@vger.kernel.org, linux-arch@vger.kernel.org, linux-mm@kvack.org, linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org, linux-riscv@lists.infradead.org, Mark Brown <broonie@kernel.org>
Series	arm64/gcs: Provide support for GCS in userspace

On 6/25/24 7:57 AM, Mark Brown wrote:
> Add some documentation of the userspace ABI for Guarded Control Stacks.
>
> Reviewed-by: Thiago Jung Bauermann <thiago.bauermann@linaro.org>
> Signed-off-by: Mark Brown <broonie@kernel.org>
> ---
>  Documentation/arch/arm64/gcs.rst   | 233 +++++++++++++++++++++++++++++++++++++
>  Documentation/arch/arm64/index.rst |   1 +
>  2 files changed, 234 insertions(+)
>
> diff --git a/Documentation/arch/arm64/gcs.rst b/Documentation/arch/arm64/gcs.rst
> new file mode 100644
> index 000000000000..c45c0326836a
> --- /dev/null
> +++ b/Documentation/arch/arm64/gcs.rst
> @@ -0,0 +1,233 @@
> +===============================================
> +Guarded Control Stack support for AArch64 Linux
> +===============================================
> +
> +This document outlines briefly the interface provided to userspace by Linux in
> +order to support use of the ARM Guarded Control Stack (GCS) feature.
> +
> +This is an outline of the most important features and issues only and not
> +intended to be exhaustive.
> +
> +
> +
> +1. General
> +-----------
> +
> +* GCS is an architecture feature intended to provide greater protection
> +  against return oriented programming (ROP) attacks and to simplify the
> +  implementation of features that need to collect stack traces such as
> +  profiling.
> +
> +* When GCS is enabled a separate guarded control stack is maintained by the
> +  PE which is writeable only through specific GCS operations. This
> +  stores the call stack only, when a procedure call instruction is

only. When

> +  performed the current PC is pushed onto the GCS and on RET the
> +  address in the LR is verified against that on the top of the GCS.
> +
> +* When active current GCS pointer is stored in the system register

Cannot parse this incomplete sentence...

> +  GCSPR_EL0. This is readable by userspace but can only be updated
> +  via specific GCS instructions.
> +
> +* The architecture provides instructions for switching between guarded
> +  control stacks with checks to ensure that the new stack is a valid
> +  target for switching.
> +
> +* The functionality of GCS is similar to that provided by the x86 Shadow
> +  Stack feature, due to sharing of userspace interfaces the ABI refers to

feature. Due to

> +  shadow stacks rather than GCS.
> +
> +* Support for GCS is reported to userspace via HWCAP2_GCS in the aux vector
> +  AT_HWCAP2 entry.
> +
> +* GCS is enabled per thread. While there is support for disabling GCS
> +  at runtime this should be done with great care.
> +
> +* GCS memory access faults are reported as normal memory access faults.
> +
> +* GCS specific errors (those reported with EC 0x2d) will be reported as
> +  SIGSEGV with a si_code of SEGV_CPERR (control protection error).
> +
> +* GCS is supported only for AArch64.
> +
> +* On systems where GCS is supported GCSPR_EL0 is always readable by EL0
> +  regardless of the GCS configuration for the thread.
> +
> +* The architecture supports enabling GCS without verifying that return values
> +  in LR match those in the GCS, the LR will be ignored. This is not supported

GCS; the LR

> +  by Linux.
> +
> +* EL0 GCS entries with bit 63 set are reserved for use, one such use is defined

for use. One such

> +  below for signals and should be ignored when parsing the stack if not
> +  understood.
> +
> +
> +2. Enabling and disabling Guarded Control Stacks
> +-------------------------------------------------
> +
> +* GCS is enabled and disabled for a thread via the PR_SET_SHADOW_STACK_STATUS
> +  prctl(), this takes a single flags argument specifying which GCS features

prctl(). This takes

> +  should be used.
> +
> +* When set PR_SHADOW_STACK_ENABLE flag allocates a Guarded Control Stack
> +  and enables GCS for the thread, enabling the functionality controlled by
> +  GCSCRE0_EL1.{nTR, RVCHKEN, PCRSEL}.
> +
> +* When set the PR_SHADOW_STACK_PUSH flag enables the functionality controlled
> +  by GCSCRE0_EL1.PUSHMEn, allowing explicit GCS pushes.
> +
> +* When set the PR_SHADOW_STACK_WRITE flag enables the functionality controlled
> +  by GCSCRE0_EL1.STREn, allowing explicit stores to the Guarded Control Stack.
> +
> +* Any unknown flags will cause PR_SET_SHADOW_STACK_STATUS to return -EINVAL.
> +
> +* PR_LOCK_SHADOW_STACK_STATUS is passed a bitmask of features with the same
> +  values as used for PR_SET_SHADOW_STACK_STATUS. Any future changes to the
> +  status of the specified GCS mode bits will be rejected.
> +
> +* PR_LOCK_SHADOW_STACK_STATUS allows any bit to be locked, this allows

locked; this allows

> +  userspace to prevent changes to any future features.
> +
> +* There is no support for a process to remove a lock that has been set for
> +  it.
> +
> +* PR_SET_SHADOW_STACK_STATUS and PR_LOCK_SHADOW_STACK_STATUS affect only the
> +  thread that called them, any other running threads will be unaffected.

them; any other

> +
> +* New threads inherit the GCS configuration of the thread that created them.
> +
> +* GCS is disabled on exec().
> +
> +* The current GCS configuration for a thread may be read with the
> +  PR_GET_SHADOW_STACK_STATUS prctl(), this returns the same flags that

prctl(). This

> +  are passed to PR_SET_SHADOW_STACK_STATUS.
> +
> +* If GCS is disabled for a thread after having previously been enabled then
> +  the stack will remain allocated for the lifetime of the thread. At present
> +  any attempt to reenable GCS for the thread will be rejected, this may be

rejected; this

> +  revisited in future.
> +
> +* It should be noted that since enabling GCS will result in GCS becoming
> +  active immediately it is not normally possible to return from the function
> +  that invoked the prctl() that enabled GCS. It is expected that the normal
> +  usage will be that GCS is enabled very early in execution of a program.
> +
> +
> +
> +3. Allocation of Guarded Control Stacks
> +----------------------------------------
> +
> +* When GCS is enabled for a thread a new Guarded Control Stack will be
> +  allocated for it of size RLIMIT_STACK or 4 gigabytes, whichever is
> +  smaller.
> +
> +* When a new thread is created by a thread which has GCS enabled then a
> +  new Guarded Control Stack will be allocated for the new thread with
> +  half the size of the standard stack.
> +
> +* When a stack is allocated by enabling GCS or during thread creation then
> +  the top 8 bytes of the stack will be initialised to 0 and GCSPR_EL0 will
> +  be set to point to the address of this 0 value, this can be used to

value. This can be

> +  detect the top of the stack.
> +
> +* Additional Guarded Control Stacks can be allocated using the
> +  map_shadow_stack() system call.
> +
> +* Stacks allocated using map_shadow_stack() can optionally have an end of
> +  stack marker and cap placed at the top of the stack. If the flag
> +  SHADOW_STACK_SET_TOKEN is specified a cap will be placed on the stack,

stack;

> +  if SHADOW_STACK_SET_MARKER is not specified the cap will be the top 8
> +  bytes of the stack and if it is specified then the cap will be the next
> +  8 bytes. While specifying just SHADOW_STACK_SET_MARKER by itself is
> +  valid since the marker is all bits 0 it has no observable effect.
> +
> +* Stacks allocated using map_shadow_stack() must have a size which is a
> +  multiple of 8 bytes larger than 8 bytes and must be 8 bytes aligned.
> +
> +* An address can be specified to map_shadow_stack(), if one is provided then

map_shadow_stack(). If one

> +  it must be aligned to a page boundary.
> +
> +* When a thread is freed the Guarded Control Stack initially allocated for
> +  that thread will be freed. Note carefully that if the stack has been
> +  switched this may not be the stack currently in use by the thread.
> +
> +
> +4. Signal handling
> +--------------------
> +
> +* A new signal frame record gcs_context encodes the current GCS mode and
> +  pointer for the interrupted context on signal delivery. This will always
> +  be present on systems that support GCS.
> +
> +* The record contains a flag field which reports the current GCS configuration
> +  for the interrupted context as PR_GET_SHADOW_STACK_STATUS would.
> +
> +* The signal handler is run with the same GCS configuration as the interrupted
> +  context.
> +
> +* When GCS is enabled for the interrupted thread a signal handling specific
> +  GCS cap token will be written to the GCS, this is an architectural GCS cap

GCS. This is

> +  token with bit 63 set and the token type (bits 0..11) all clear. The
> +  GCSPR_EL0 reported in the signal frame will point to this cap token.
> +
> +* The signal handler will use the same GCS as the interrupted context.
> +
> +* When GCS is enabled on signal entry a frame with the address of the signal
> +  return handler will be pushed onto the GCS, allowing return from the signal
> +  handler via RET as normal. This will not be reported in the gcs_context in
> +  the signal frame.
> +
> +
> +5. Signal return
> +-----------------
> +
> +When returning from a signal handler:
> +
> +* If there is a gcs_context record in the signal frame then the GCS flags
> +  and GCSPR_EL0 will be restored from that context prior to further
> +  validation.
> +
> +* If there is no gcs_context record in the signal frame then the GCS
> +  configuration will be unchanged.
> +
> +* If GCS is enabled on return from a signal handler then GCSPR_EL0 must
> +  point to a valid GCS signal cap record, this will be popped from the

record; this will be

> +  GCS prior to signal return.
> +
> +* If the GCS configuration is locked when returning from a signal then any
> +  attempt to change the GCS configuration will be treated as an error. This
> +  is true even if GCS was not enabled prior to signal entry.
> +
> +* GCS may be disabled via signal return but any attempt to enable GCS via
> +  signal return will be rejected.
> +
> +
> +6. ptrace extensions
> +---------------------
> +
> +* A new regset NT_ARM_GCS is defined for use with PTRACE_GETREGSET and
> +  PTRACE_SETREGSET.
> +
> +* Due to the complexity surrounding allocation and deallocation of stacks and
> +  lack of practical application it is not possible to enable GCS via ptrace.
> +  GCS may be disabled via the ptrace interface.
> +
> +* Other GCS modes may be configured via ptrace.
> +
> +* Configuration via ptrace ignores locking of GCS mode bits.
> +
> +
> +7. ELF coredump extensions
> +---------------------------
> +
> +* NT_ARM_GCS notes will be added to each coredump for each thread of the
> +  dumped process. The contents will be equivalent to the data that would
> +  have been read if a PTRACE_GETREGSET of the corresponding type were
> +  executed for each thread when the coredump was generated.
> +
> +
> +
> +8. /proc extensions
> +--------------------
> +
> +* Guarded Control Stack pages will include "ss" in their VmFlags in
> +  /proc/<pid>/smaps.

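As an illustration of two rules in the quoted document (entries with bit 63 set are reserved and should be ignored if not understood, and the all-zero value at the top of a kernel-allocated stack marks its end), a stack walker might look roughly like the sketch below. The function name `gcs_collect_returns` and the macro `GCS_RESERVED_BIT` are hypothetical, not kernel or libc API. The code works on a plain array holding a snapshot of GCS entries rather than on live GCS memory, and it does not try to recognise other token types such as the cap written by map_shadow_stack().

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical name: bit 63 marks a reserved EL0 GCS entry. */
#define GCS_RESERVED_BIT (UINT64_C(1) << 63)

/*
 * Walk a snapshot of GCS entries. entries[0] is the entry GCSPR_EL0
 * points to; higher indices are closer to the top of the stack.
 * Copies up to max_out return addresses into out[] and returns how
 * many were stored. The walk stops at the all-zero top-of-stack value
 * or after n_entries, whichever comes first.
 */
size_t gcs_collect_returns(const uint64_t *entries, size_t n_entries,
                           uint64_t *out, size_t max_out)
{
    size_t found = 0;

    for (size_t i = 0; i < n_entries && found < max_out; i++) {
        uint64_t e = entries[i];

        if (e == 0)
            break;      /* top-of-stack value written at allocation */
        if (e & GCS_RESERVED_BIT)
            continue;   /* reserved entry that is not understood: skip */
        out[found++] = e;
    }
    return found;
}
```

A profiler could feed this a copy of the memory between GCSPR_EL0 and the end of the mapping. Whether a real unwinder should stop or skip on other token types is a design choice the document leaves open.
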
diff --git a/Documentation/arch/arm64/gcs.rst b/Documentation/arch/arm64/gcs.rst
new file mode 100644
index 000000000000..c45c0326836a
--- /dev/null
+++ b/Documentation/arch/arm64/gcs.rst
@@ -0,0 +1,233 @@
+===============================================
+Guarded Control Stack support for AArch64 Linux
+===============================================
+
+This document outlines briefly the interface provided to userspace by Linux in
+order to support use of the ARM Guarded Control Stack (GCS) feature.
+
+This is an outline of the most important features and issues only and not
+intended to be exhaustive.
+
+
+
+1. General
+-----------
+
+* GCS is an architecture feature intended to provide greater protection
+  against return oriented programming (ROP) attacks and to simplify the
+  implementation of features that need to collect stack traces such as
+  profiling.
+
+* When GCS is enabled a separate guarded control stack is maintained by the
+  PE which is writeable only through specific GCS operations. This
+  stores the call stack only, when a procedure call instruction is
+  performed the current PC is pushed onto the GCS and on RET the
+  address in the LR is verified against that on the top of the GCS.
+
+* When active current GCS pointer is stored in the system register
+  GCSPR_EL0. This is readable by userspace but can only be updated
+  via specific GCS instructions.
+
+* The architecture provides instructions for switching between guarded
+  control stacks with checks to ensure that the new stack is a valid
+  target for switching.
+
+* The functionality of GCS is similar to that provided by the x86 Shadow
+  Stack feature, due to sharing of userspace interfaces the ABI refers to
+  shadow stacks rather than GCS.
+
+* Support for GCS is reported to userspace via HWCAP2_GCS in the aux vector
+  AT_HWCAP2 entry.
+
+* GCS is enabled per thread. While there is support for disabling GCS
+  at runtime this should be done with great care.
+
+* GCS memory access faults are reported as normal memory access faults.
+
+* GCS specific errors (those reported with EC 0x2d) will be reported as
+  SIGSEGV with a si_code of SEGV_CPERR (control protection error).
+
+* GCS is supported only for AArch64.
+
+* On systems where GCS is supported GCSPR_EL0 is always readable by EL0
+  regardless of the GCS configuration for the thread.
+
+* The architecture supports enabling GCS without verifying that return values
+  in LR match those in the GCS, the LR will be ignored. This is not supported
+  by Linux.
+
+* EL0 GCS entries with bit 63 set are reserved for use, one such use is defined
+  below for signals and should be ignored when parsing the stack if not
+  understood.
+
+
+2. Enabling and disabling Guarded Control Stacks
+-------------------------------------------------
+
+* GCS is enabled and disabled for a thread via the PR_SET_SHADOW_STACK_STATUS
+  prctl(), this takes a single flags argument specifying which GCS features
+  should be used.
+
+* When set PR_SHADOW_STACK_ENABLE flag allocates a Guarded Control Stack
+  and enables GCS for the thread, enabling the functionality controlled by
+  GCSCRE0_EL1.{nTR, RVCHKEN, PCRSEL}.
+
+* When set the PR_SHADOW_STACK_PUSH flag enables the functionality controlled
+  by GCSCRE0_EL1.PUSHMEn, allowing explicit GCS pushes.
+
+* When set the PR_SHADOW_STACK_WRITE flag enables the functionality controlled
+  by GCSCRE0_EL1.STREn, allowing explicit stores to the Guarded Control Stack.
+
+* Any unknown flags will cause PR_SET_SHADOW_STACK_STATUS to return -EINVAL.
+
+* PR_LOCK_SHADOW_STACK_STATUS is passed a bitmask of features with the same
+  values as used for PR_SET_SHADOW_STACK_STATUS. Any future changes to the
+  status of the specified GCS mode bits will be rejected.
+
+* PR_LOCK_SHADOW_STACK_STATUS allows any bit to be locked, this allows
+  userspace to prevent changes to any future features.
+
+* There is no support for a process to remove a lock that has been set for
+  it.
+
+* PR_SET_SHADOW_STACK_STATUS and PR_LOCK_SHADOW_STACK_STATUS affect only the
+  thread that called them, any other running threads will be unaffected.
+
+* New threads inherit the GCS configuration of the thread that created them.
+
+* GCS is disabled on exec().
+
+* The current GCS configuration for a thread may be read with the
+  PR_GET_SHADOW_STACK_STATUS prctl(), this returns the same flags that
+  are passed to PR_SET_SHADOW_STACK_STATUS.
+
+* If GCS is disabled for a thread after having previously been enabled then
+  the stack will remain allocated for the lifetime of the thread. At present
+  any attempt to reenable GCS for the thread will be rejected, this may be
+  revisited in future.
+
+* It should be noted that since enabling GCS will result in GCS becoming
+  active immediately it is not normally possible to return from the function
+  that invoked the prctl() that enabled GCS. It is expected that the normal
+  usage will be that GCS is enabled very early in execution of a program.
+
+
+
+3. Allocation of Guarded Control Stacks
+----------------------------------------
+
+* When GCS is enabled for a thread a new Guarded Control Stack will be
+  allocated for it of size RLIMIT_STACK or 4 gigabytes, whichever is
+  smaller.
+
+* When a new thread is created by a thread which has GCS enabled then a
+  new Guarded Control Stack will be allocated for the new thread with
+  half the size of the standard stack.
+
+* When a stack is allocated by enabling GCS or during thread creation then
+  the top 8 bytes of the stack will be initialised to 0 and GCSPR_EL0 will
+  be set to point to the address of this 0 value, this can be used to
+  detect the top of the stack.
+
+* Additional Guarded Control Stacks can be allocated using the
+  map_shadow_stack() system call.
+
+* Stacks allocated using map_shadow_stack() can optionally have an end of
+  stack marker and cap placed at the top of the stack. If the flag
+  SHADOW_STACK_SET_TOKEN is specified a cap will be placed on the stack,
+  if SHADOW_STACK_SET_MARKER is not specified the cap will be the top 8
+  bytes of the stack and if it is specified then the cap will be the next
+  8 bytes. While specifying just SHADOW_STACK_SET_MARKER by itself is
+  valid since the marker is all bits 0 it has no observable effect.
+
+* Stacks allocated using map_shadow_stack() must have a size which is a
+  multiple of 8 bytes larger than 8 bytes and must be 8 bytes aligned.
+
+* An address can be specified to map_shadow_stack(), if one is provided then
+  it must be aligned to a page boundary.
+
+* When a thread is freed the Guarded Control Stack initially allocated for
+  that thread will be freed. Note carefully that if the stack has been
+  switched this may not be the stack currently in use by the thread.
+
+
+4. Signal handling
+--------------------
+
+* A new signal frame record gcs_context encodes the current GCS mode and
+  pointer for the interrupted context on signal delivery. This will always
+  be present on systems that support GCS.
+
+* The record contains a flag field which reports the current GCS configuration
+  for the interrupted context as PR_GET_SHADOW_STACK_STATUS would.
+
+* The signal handler is run with the same GCS configuration as the interrupted
+  context.
+
+* When GCS is enabled for the interrupted thread a signal handling specific
+  GCS cap token will be written to the GCS, this is an architectural GCS cap
+  token with bit 63 set and the token type (bits 0..11) all clear. The
+  GCSPR_EL0 reported in the signal frame will point to this cap token.
+
+* The signal handler will use the same GCS as the interrupted context.
+
+* When GCS is enabled on signal entry a frame with the address of the signal
+  return handler will be pushed onto the GCS, allowing return from the signal
+  handler via RET as normal. This will not be reported in the gcs_context in
+  the signal frame.
+
+
+5. Signal return
+-----------------
+
+When returning from a signal handler:
+
+* If there is a gcs_context record in the signal frame then the GCS flags
+  and GCSPR_EL0 will be restored from that context prior to further
+  validation.
+
+* If there is no gcs_context record in the signal frame then the GCS
+  configuration will be unchanged.
+
+* If GCS is enabled on return from a signal handler then GCSPR_EL0 must
+  point to a valid GCS signal cap record, this will be popped from the
+  GCS prior to signal return.
+
+* If the GCS configuration is locked when returning from a signal then any
+  attempt to change the GCS configuration will be treated as an error. This
+  is true even if GCS was not enabled prior to signal entry.
+
+* GCS may be disabled via signal return but any attempt to enable GCS via
+  signal return will be rejected.
+
+
+6. ptrace extensions
+---------------------
+
+* A new regset NT_ARM_GCS is defined for use with PTRACE_GETREGSET and
+  PTRACE_SETREGSET.
+
+* Due to the complexity surrounding allocation and deallocation of stacks and
+  lack of practical application it is not possible to enable GCS via ptrace.
+  GCS may be disabled via the ptrace interface.
+
+* Other GCS modes may be configured via ptrace.
+
+* Configuration via ptrace ignores locking of GCS mode bits.
+
+
+7. ELF coredump extensions
+---------------------------
+
+* NT_ARM_GCS notes will be added to each coredump for each thread of the
+  dumped process. The contents will be equivalent to the data that would
+  have been read if a PTRACE_GETREGSET of the corresponding type were
+  executed for each thread when the coredump was generated.
+
+
+
+8. /proc extensions
+--------------------
+
+* Guarded Control Stack pages will include "ss" in their VmFlags in
+  /proc/<pid>/smaps.
diff --git a/Documentation/arch/arm64/index.rst b/Documentation/arch/arm64/index.rst
index d08e924204bf..dcf3ee3eb8c0 100644
--- a/Documentation/arch/arm64/index.rst
+++ b/Documentation/arch/arm64/index.rst
@@ -14,6 +14,7 @@ ARM64 Architecture
     booting
     cpu-feature-registers
     elf_hwcaps
+    gcs
     hugetlbpage
     kdump
     legacy_instructions
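
The map_shadow_stack() rules in section 3 of gcs.rst (size a multiple of 8 and larger than 8 bytes, a supplied address page aligned, cap in the top 8 bytes or in the next 8 bytes when a marker is also requested) can be sketched as plain arithmetic. The helper names `gcs_map_args_ok` and `gcs_cap_offset` are hypothetical. The numeric flag values are an assumption taken to match the uapi headers and should be checked against the headers of the kernel actually in use. Nothing here issues the system call itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed flag values; verify against the kernel's uapi mman headers. */
#ifndef SHADOW_STACK_SET_TOKEN
#define SHADOW_STACK_SET_TOKEN  (UINT64_C(1) << 0)
#endif
#ifndef SHADOW_STACK_SET_MARKER
#define SHADOW_STACK_SET_MARKER (UINT64_C(1) << 1)
#endif

/*
 * Check the documented constraints on map_shadow_stack() arguments.
 * addr == 0 means "no address supplied".
 */
bool gcs_map_args_ok(uint64_t addr, uint64_t size, uint64_t page_size)
{
    if (page_size == 0)
        return false;
    if (size <= 8 || size % 8 != 0)
        return false;   /* multiple of 8 bytes, larger than 8 bytes */
    if (addr != 0 && addr % page_size != 0)
        return false;   /* a supplied address must be page aligned */
    return true;
}

/*
 * Byte offset of the cap from the base of the stack, or -1 when no cap
 * was requested. With a marker the top 8 bytes hold the marker and the
 * cap sits in the 8 bytes below it.
 */
int64_t gcs_cap_offset(uint64_t size, uint64_t flags)
{
    if (!(flags & SHADOW_STACK_SET_TOKEN))
        return -1;
    if (flags & SHADOW_STACK_SET_MARKER)
        return (int64_t)size - 16;
    return (int64_t)size - 8;
}
```

For a 4096-byte stack this places the cap at offset 4088 with SHADOW_STACK_SET_TOKEN alone and at 4080 when SHADOW_STACK_SET_MARKER is also set. The kernel performs its own validation, so a check like this is only useful for failing early in userspace.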
