Message ID | 20220929222936.14584-2-rick.p.edgecombe@intel.com |
---|---|
State | New |
Series | Shadowstacks for userspace |
On Thu, Sep 29, 2022 at 03:28:58PM -0700, Rick Edgecombe wrote: > +.. SPDX-License-Identifier: GPL-2.0 > + > +========================================= > +Control-flow Enforcement Technology (CET) > +========================================= > + > +Overview > +======== > + > +Control-flow Enforcement Technology (CET) is term referring to several > +related x86 processor features that provides protection against control > +flow hijacking attacks. The HW feature itself can be set up to protect > +both applications and the kernel. Only user-mode protection is implemented > +in the 64-bit kernel. > + > +CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is > +a secondary stack allocated from memory and cannot be directly modified by > +applications. When executing a CALL instruction, the processor pushes the > +return address to both the normal stack and the shadow stack. Upon > +function return, the processor pops the shadow stack copy and compares it > +to the normal stack copy. If the two differ, the processor raises a > +control-protection fault. Indirect branch tracking verifies indirect > +CALL/JMP targets are intended as marked by the compiler with 'ENDBR' > +opcodes. Not all CPU's have both Shadow Stack and Indirect Branch Tracking > +and only Shadow Stack is currently supported in the kernel. > + > +The Kconfig options is X86_SHADOW_STACK, and it can be disabled with > +the kernel parameter clearcpuid, like this: "clearcpuid=shstk". > + > +To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM v10.0.1 > +or later are required. To build a CET-enabled application, GLIBC v2.28 or > +later is also required. > + > +At run time, /proc/cpuinfo shows CET features if the processor supports > +CET. 
> + > +Application Enabling > +==================== > + > +An application's CET capability is marked in its ELF header and can be > +verified from readelf/llvm-readelf output: > + > + readelf -n <application> | grep -a SHSTK > + properties: x86 feature: SHSTK > + > +The kernel does not process these applications directly. Applications must > +enable them using the interface descriped in section 4. Typically this > +would be done in dynamic loader or static runtime objects, as is the case > +in glibc. > + > +Backward Compatibility > +====================== > + > +GLIBC provides a few CET tunables via the GLIBC_TUNABLES environment > +variable: > + > +GLIBC_TUNABLES=glibc.tune.hwcaps=-SHSTK,-WRSS > + Turn off SHSTK/WRSS. > + > +GLIBC_TUNABLES=glibc.tune.x86_shstk=<on, permissive> > + This controls how dlopen() handles SHSTK legacy libraries:: > + > + on - continue with SHSTK enabled; > + permissive - continue with SHSTK off. > + > +Details can be found in the GLIBC manual pages. > + > +CET arch_prctl()'s > +================== > + > +Elf features should be enabled by the loader using the below arch_prctl's. > + > +arch_prctl(ARCH_CET_ENABLE, unsigned int feature) > + Enable a single feature specified in 'feature'. Can only operate on > + one feature at a time. > + > +arch_prctl(ARCH_CET_DISABLE, unsigned int feature) > + Disable features specified in 'feature'. Can only operate on > + one feature at a time. > + > +arch_prctl(ARCH_CET_LOCK, unsigned int features) > + Lock in features at their current enabled or disabled status. > + > +The return values are as following: > + On success, return 0. On error, errno can be:: > + > + -EPERM if any of the passed feature are locked. > + -EOPNOTSUPP if the feature is not supported by the hardware or > + disabled by kernel parameter. > + -EINVAL arguments (non existing feature, etc) > + > +Currently shadow stack and WRSS are supported via this interface. 
WRSS > +can only be enabled with shadow stack, and is automatically disabled > +if shadow stack is disabled. > + > +Proc status > +=========== > +To check if an application is actually running with shadow stack, the > +user can read the /proc/$PID/arch_status. It will report "wrss" or > +"shstk" depending on what is enabled. > + > +The implementation of the Shadow Stack > +====================================== > + > +Shadow Stack size > +----------------- > + > +A task's shadow stack is allocated from memory to a fixed size of > +MIN(RLIMIT_STACK, 4 GB). In other words, the shadow stack is allocated to > +the maximum size of the normal stack, but capped to 4 GB. However, > +a compat-mode application's address space is smaller, each of its thread's > +shadow stack size is MIN(1/4 RLIMIT_STACK, 4 GB). > + > +Signal > +------ > + > +By default, the main program and its signal handlers use the same shadow > +stack. Because the shadow stack stores only return addresses, a large > +shadow stack covers the condition that both the program stack and the > +signal alternate stack run out. > + > +The kernel creates a restore token for the shadow stack and pushes the > +restorer address to the shadow stack. Then verifies that token when > +restoring from the signal handler. > + > +Fork > +---- > + > +The shadow stack's vma has VM_SHADOW_STACK flag set; its PTEs are required > +to be read-only and dirty. When a shadow stack PTE is not RO and dirty, a > +shadow access triggers a page fault with the shadow stack access bit set > +in the page fault error code. > + > +When a task forks a child, its shadow stack PTEs are copied and both the > +parent's and the child's shadow stack PTEs are cleared of the dirty bit. > +Upon the next shadow stack access, the resulting shadow stack page fault > +is handled by page copy/re-use. > + > +When a pthread child is created, the kernel allocates a new shadow stack > +for the new thread. 
The documentation above can be improved (both grammar and formatting): ---- >8 ---- diff --git a/Documentation/x86/cet.rst b/Documentation/x86/cet.rst index 6b270a24ebc3a2..f691f7995cf088 100644 --- a/Documentation/x86/cet.rst +++ b/Documentation/x86/cet.rst @@ -15,92 +15,101 @@ in the 64-bit kernel. CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is a secondary stack allocated from memory and cannot be directly modified by -applications. When executing a CALL instruction, the processor pushes the +applications. When executing a ``CALL`` instruction, the processor pushes the return address to both the normal stack and the shadow stack. Upon function return, the processor pops the shadow stack copy and compares it to the normal stack copy. If the two differ, the processor raises a control-protection fault. Indirect branch tracking verifies indirect -CALL/JMP targets are intended as marked by the compiler with 'ENDBR' -opcodes. Not all CPU's have both Shadow Stack and Indirect Branch Tracking -and only Shadow Stack is currently supported in the kernel. +``CALL``/``JMP`` targets are intended as marked by the compiler with ``ENDBR`` +opcodes. Not all CPUs have both Shadow Stack and Indirect Branch Tracking +and only Shadow Stack is currently supported by the kernel. -The Kconfig options is X86_SHADOW_STACK, and it can be disabled with -the kernel parameter clearcpuid, like this: "clearcpuid=shstk". +The Kconfig options is ``X86_SHADOW_STACK`` and it can be overridden with +the kernel command-line parameter ``clearcpuid`` (for example +``clearcpuid=shstk``). To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM v10.0.1 -or later are required. To build a CET-enabled application, GLIBC v2.28 or +or later are required. To build a CET-enabled application, glibc v2.28 or later is also required. -At run time, /proc/cpuinfo shows CET features if the processor supports -CET. 
+At run time, ``/proc/cpuinfo`` shows CET features if the processor supports +them -Application Enabling -==================== +Enabling CET in applications +============================ -An application's CET capability is marked in its ELF header and can be -verified from readelf/llvm-readelf output: +The CET capability of an application is marked in its ELF header and can be +verified from ``readelf``/``llvm-readelf`` output:: readelf -n <application> | grep -a SHSTK properties: x86 feature: SHSTK The kernel does not process these applications directly. Applications must -enable them using the interface descriped in section 4. Typically this +enable them using :ref:`cet-arch_prctl`. Typically this would be done in dynamic loader or static runtime objects, as is the case in glibc. Backward Compatibility ====================== -GLIBC provides a few CET tunables via the GLIBC_TUNABLES environment +glibc provides a few CET tunables via the ``GLIBC_TUNABLES`` environment variable: -GLIBC_TUNABLES=glibc.tune.hwcaps=-SHSTK,-WRSS + * ``GLIBC_TUNABLES=glibc.tune.hwcaps=-SHSTK,-WRSS`` + Turn off SHSTK/WRSS. -GLIBC_TUNABLES=glibc.tune.x86_shstk=<on, permissive> - This controls how dlopen() handles SHSTK legacy libraries:: + * ``GLIBC_TUNABLES=glibc.tune.x86_shstk=<on, permissive>`` - on - continue with SHSTK enabled; - permissive - continue with SHSTK off. + This controls how :manpage:`dlopen(3)` handles SHSTK legacy libraries. + Possible values are: -Details can be found in the GLIBC manual pages. + * ``on`` - continue with SHSTK enabled; + * ``permissive`` - continue with SHSTK off. -CET arch_prctl()'s -================== +.. _cet-arch_prctl: -Elf features should be enabled by the loader using the below arch_prctl's. +CET arch_prctl() interface +========================== -arch_prctl(ARCH_CET_ENABLE, unsigned int feature) - Enable a single feature specified in 'feature'. 
Can only operate on +ELF features should be enabled by the loader using the following +:manpage:`arch_prctl(2)` subfunctions: + + * ``arch_prctl(ARCH_CET_ENABLE, unsigned int feature)`` + + Enable a single feature specified in ``feature``. Can only operate on one feature at a time. -arch_prctl(ARCH_CET_DISABLE, unsigned int feature) - Disable features specified in 'feature'. Can only operate on + * ``arch_prctl(ARCH_CET_DISABLE, unsigned int feature)`` + + Disable features specified in ``feature``. Can only operate on one feature at a time. -arch_prctl(ARCH_CET_LOCK, unsigned int features) - Lock in features at their current enabled or disabled status. + * ``arch_prctl(ARCH_CET_LOCK, unsigned int features)`` + + Lock in features at their current status. + + * ``arch_prctl(ARCH_CET_UNLOCK, unsigned int features)`` -arch_prctl(ARCH_CET_UNLOCK, unsigned int features) Unlock features. -The return values are as following: - On success, return 0. On error, errno can be:: +On success, :manpage:`arch_prctl(2)` returns 0, otherwise the errno +can be: - -EPERM if any of the passed feature are locked. - -EOPNOTSUPP if the feature is not supported by the hardware or - disabled by kernel parameter. - -EINVAL arguments (non existing feature, etc) + - ``EPERM`` if any of the passed feature are locked. + - ``EOPNOTSUPP`` if the feature is not supported by the hardware or + disabled by the kernel command-line parameter. + - ``EINVAL`` if the arguments are invalid (non existing feature, etc). Currently shadow stack and WRSS are supported via this interface. WRSS can only be enabled with shadow stack, and is automatically disabled if shadow stack is disabled. -Proc status +proc status =========== -To check if an application is actually running with shadow stack, the -user can read the /proc/$PID/arch_status. It will report "wrss" or -"shstk" depending on what is enabled. +To check if an application is actually running with shadow stack, users can +read ``/proc/$PID/arch_status``. 
It will report ``wrss`` or +``shstk`` depending on what is enabled. The implementation of the Shadow Stack ====================================== @@ -108,11 +117,11 @@ The implementation of the Shadow Stack Shadow Stack size ----------------- -A task's shadow stack is allocated from memory to a fixed size of -MIN(RLIMIT_STACK, 4 GB). In other words, the shadow stack is allocated to +The shadow stack of a task is allocated from memory to a fixed size of +``MIN(RLIMIT_STACK, 4 GB)``. In other words, the shadow stack is allocated to the maximum size of the normal stack, but capped to 4 GB. However, -a compat-mode application's address space is smaller, each of its thread's -shadow stack size is MIN(1/4 RLIMIT_STACK, 4 GB). +the address space of a compat-mode application is smaller; the shadow stack +size of each of its thread is ``MIN(1/4 RLIMIT_STACK, 4 GB)``. Signal ------ @@ -123,19 +132,19 @@ shadow stack covers the condition that both the program stack and the signal alternate stack run out. The kernel creates a restore token for the shadow stack and pushes the -restorer address to the shadow stack. Then verifies that token when -restoring from the signal handler. +restorer address to it. Then the kernel verifies that token when restoring +from the signal handler. Fork ---- -The shadow stack's vma has VM_SHADOW_STACK flag set; its PTEs are required -to be read-only and dirty. When a shadow stack PTE is not RO and dirty, a +The shadow stack vma has ``VM_SHADOW_STACK`` flag set; its PTEs are required +to be read-only and dirty. When a shadow stack PTE is read-write and dirty, a shadow access triggers a page fault with the shadow stack access bit set in the page fault error code. When a task forks a child, its shadow stack PTEs are copied and both the -parent's and the child's shadow stack PTEs are cleared of the dirty bit. +shadow stack PTEs of parent and child are cleared of the dirty bit. 
Upon the next shadow stack access, the resulting shadow stack page fault is handled by page copy/re-use. Thanks.
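The sizing rule from the "Shadow Stack size" section can be sketched in a few lines of C. This is only a model of the rule as worded in the patch, assuming "4 GB" means 2^32 bytes and that the 1/4 factor applies to compat-mode tasks; shstk_size_model() and SHSTK_SIZE_CAP are hypothetical names for illustration, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the "4 GB" cap in the text is 2^32 bytes. */
#define SHSTK_SIZE_CAP (4ULL << 30)

/*
 * Hypothetical helper, not a kernel function: returns
 * MIN(RLIMIT_STACK, 4 GB), or MIN(1/4 RLIMIT_STACK, 4 GB) for a
 * compat-mode task, as described in the documentation text.
 */
static uint64_t shstk_size_model(uint64_t rlimit_stack, int compat_mode)
{
    uint64_t size = compat_mode ? rlimit_stack / 4 : rlimit_stack;

    return size < SHSTK_SIZE_CAP ? size : SHSTK_SIZE_CAP;
}

/* Usage examples: an 8 MiB stack limit, and an effectively unlimited one. */
static void shstk_size_model_examples(void)
{
    assert(shstk_size_model(8ULL << 20, 0) == (8ULL << 20));
    assert(shstk_size_model(8ULL << 20, 1) == (2ULL << 20));
    assert(shstk_size_model(UINT64_MAX, 0) == SHSTK_SIZE_CAP);
}
```

With a typical 8 MiB RLIMIT_STACK the cap never applies; it only matters when the stack limit is very large or unlimited.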
Bagas Sanjaya <bagasdotme@gmail.com> writes:
> The documentation above can be improved (both grammar and formatting):
>
> ---- >8 ----
>
> diff --git a/Documentation/x86/cet.rst b/Documentation/x86/cet.rst
> index 6b270a24ebc3a2..f691f7995cf088 100644
> --- a/Documentation/x86/cet.rst
> +++ b/Documentation/x86/cet.rst
> @@ -15,92 +15,101 @@ in the 64-bit kernel.
>
> CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is
> a secondary stack allocated from memory and cannot be directly modified by
> -applications. When executing a CALL instruction, the processor pushes the
> +applications. When executing a ``CALL`` instruction, the processor pushes the

Just to be clear, not everybody is fond of sprinkling lots of ``literal text`` throughout the documentation in this way. Heavy use of it will certainly clutter the plain-text file and can be a net negative overall.

Thanks,

jon
On 9/30/22 20:33, Jonathan Corbet wrote:
>> CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is
>> a secondary stack allocated from memory and cannot be directly modified by
>> -applications. When executing a CALL instruction, the processor pushes the
>> +applications. When executing a ``CALL`` instruction, the processor pushes the
>
> Just to be clear, not everybody is fond of sprinkling lots of ``literal
> text`` throughout the documentation in this way. Heavy use of it will
> certainly clutter the plain-text file and can be a net negative overall.
>

Actually there is a trade-off between semantic correctness and plain-text clarity. With regard to inline code samples (like identifiers), I fall into the former camp. But when I'm reviewing patches whose surrounding documentation follows the latter camp (code samples left alone without markup), I can adapt to that style as long as it causes no warnings whatsoever.
On Fri, 2022-09-30 at 20:41 +0700, Bagas Sanjaya wrote: > On 9/30/22 20:33, Jonathan Corbet wrote: > > > CET introduces Shadow Stack and Indirect Branch Tracking. > > > Shadow stack is > > > a secondary stack allocated from memory and cannot be directly > > > modified by > > > -applications. When executing a CALL instruction, the processor > > > pushes the > > > +applications. When executing a ``CALL`` instruction, the > > > processor pushes the > > > > Just to be clear, not everybody is fond of sprinkling lots of > > ``literal > > text`` throughout the documentation in this way. Heavy use of it > > will > > certainly clutter the plain-text file and can be a net negative > > overall. > > > > Actually there is a trade-off between semantic correctness and plain- > text > clarity. With regards to inline code samples (like identifiers), I > fall > into the former camp. But when I'm reviewing patches for which the > surrounding documentation go latter camp (leave code samples alone > without > markup), I can adapt to that style as long as it causes no warnings > whatsover. Thanks. Unless anyone has any objections, I think I'll take all these changes, except for the literal-izing of the instructions. They are not really being used as code samples in this case. Bagas, can you reply with your sign-off and I'll just apply it?
On Thu, Sep 29, 2022 at 03:28:58PM -0700, Rick Edgecombe wrote:
> [...]
> +Overview
> +========
> +
> +Control-flow Enforcement Technology (CET) is term referring to several
> +related x86 processor features that provides protection against control
> +flow hijacking attacks. The HW feature itself can be set up to protect
> +both applications and the kernel. Only user-mode protection is implemented
> +in the 64-bit kernel.

This likely needs rewording, since it's not strictly true any more: IBT is supported in kernel-mode now (CONFIG_X86_IBT).

> +CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is
> +a secondary stack allocated from memory and cannot be directly modified by
> +applications. When executing a CALL instruction, the processor pushes the
> +return address to both the normal stack and the shadow stack. Upon
> +function return, the processor pops the shadow stack copy and compares it
> +to the normal stack copy. If the two differ, the processor raises a
> +control-protection fault. Indirect branch tracking verifies indirect
> +CALL/JMP targets are intended as marked by the compiler with 'ENDBR'
> +opcodes. Not all CPU's have both Shadow Stack and Indirect Branch Tracking
> +and only Shadow Stack is currently supported in the kernel.
> +
> +The Kconfig options is X86_SHADOW_STACK, and it can be disabled with
> +the kernel parameter clearcpuid, like this: "clearcpuid=shstk".
> +
> +To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM v10.0.1
> +or later are required. To build a CET-enabled application, GLIBC v2.28 or
> +later is also required.
> +
> +At run time, /proc/cpuinfo shows CET features if the processor supports
> +CET.

Maybe call them out by name: shstk ibt

> +CET arch_prctl()'s
> +==================
> +
> +Elf features should be enabled by the loader using the below arch_prctl's.
> +
> +arch_prctl(ARCH_CET_ENABLE, unsigned int feature)
> + Enable a single feature specified in 'feature'. Can only operate on
> + one feature at a time.

Does this mean only 1 bit out of the 32 may be specified?

> +
> +arch_prctl(ARCH_CET_DISABLE, unsigned int feature)
> + Disable features specified in 'feature'. Can only operate on
> + one feature at a time.
> +
> +arch_prctl(ARCH_CET_LOCK, unsigned int features)
> + Lock in features at their current enabled or disabled status.

How is the "features" argument processed here?

> [...]
> +Proc status
> +===========
> +To check if an application is actually running with shadow stack, the
> +user can read the /proc/$PID/arch_status. It will report "wrss" or
> +"shstk" depending on what is enabled.

TIL about "arch_status". :) Why is this a separate file? "status" already has unique field names.

> +Fork
> +----
> +
> +The shadow stack's vma has VM_SHADOW_STACK flag set; its PTEs are required
> +to be read-only and dirty. When a shadow stack PTE is not RO and dirty, a
> +shadow access triggers a page fault with the shadow stack access bit set
> +in the page fault error code.
> +
> +When a task forks a child, its shadow stack PTEs are copied and both the
> +parent's and the child's shadow stack PTEs are cleared of the dirty bit.
> +Upon the next shadow stack access, the resulting shadow stack page fault
> +is handled by page copy/re-use.
> +
> +When a pthread child is created, the kernel allocates a new shadow stack
> +for the new thread.

Perhaps speak to the ASLR characteristics of the shstk here?

Also, it seems if there is a "Fork" section, there should be an "Exec" section? I suspect it would be short: shstk is disabled when execve() is called and must be re-enabled from userspace, yes?

-Kees
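For readers who want to check for the two flag names mentioned above, here is a minimal sketch of a whole-token match against a "flags" line as it would appear in /proc/cpuinfo. It assumes the flags are separated by spaces or tabs; cpuinfo_line_has_flag() is a hypothetical helper for illustration, and the sample line in the usage note is made up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return true if `flag` appears in `flags_line`
 * as a complete whitespace-delimited token, so that "shstk" does not
 * match inside a longer name.
 */
static bool cpuinfo_line_has_flag(const char *flags_line, const char *flag)
{
    assert(flags_line != NULL && flag != NULL);

    size_t len = strlen(flag);
    if (len == 0)
        return false;

    const char *p = flags_line;
    while ((p = strstr(p, flag)) != NULL) {
        /* The match must start at the line start or after whitespace... */
        bool starts = (p == flags_line) || p[-1] == ' ' || p[-1] == '\t';
        /* ...and must end at whitespace, a newline, or the end of the string. */
        bool ends = p[len] == '\0' || p[len] == ' ' ||
                    p[len] == '\t' || p[len] == '\n';
        if (starts && ends)
            return true;
        p += 1; /* keep scanning after a partial match */
    }
    return false;
}
```

Calling cpuinfo_line_has_flag("flags : fpu vme ibt shstk\n", "shstk") on such a line would report the flag as present. As noted later in the thread, seeing the flag does not by itself say whether user or kernel enablement is active.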
On 9/29/22 20:41, Bagas Sanjaya wrote:
...
> The documentation above can be improved (both grammar and formatting):
>
> ---- >8 ----
>
> diff --git a/Documentation/x86/cet.rst b/Documentation/x86/cet.rst
> index 6b270a24ebc3a2..f691f7995cf088 100644
> --- a/Documentation/x86/cet.rst
> +++ b/Documentation/x86/cet.rst
> @@ -15,92 +15,101 @@ in the 64-bit kernel.
>
> CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is
> a secondary stack allocated from memory and cannot be directly modified by
> -applications. When executing a CALL instruction, the processor pushes the
> +applications. When executing a ``CALL`` instruction, the processor pushes the

It's always a judgment call, as to whether to use something like ``CALL`` or just plain CALL. Here, I'd like to opine that the benefits of ``CALL`` are very small, whereas plain text in cet.rst has been made significantly worse. So the result is, "this is not worth it".

The same is true of pretty much all of the other literalizing changes below, IMHO.

Just so you have some additional input on this. I tend to spend time fussing a lot (too much, yes) over readability issues, so this jumps right out at me. :)

thanks,
On 10/3/22 12:35, John Hubbard wrote:
> It's always a judgment call, as to whether to use something like ``CALL``
> or just plain CALL. Here, I'd like to opine that the benefits of
> ``CALL`` are very small, whereas plain text in cet.rst has been made
> significantly worse. So the result is, "this is not worth it".

I'm definitely in this camp as well. Unless the markup *really* adds to readability, just leave it alone.
On Mon, 2022-10-03 at 10:18 -0700, Kees Cook wrote:
> On Thu, Sep 29, 2022 at 03:28:58PM -0700, Rick Edgecombe wrote:
> > [...]
> > +Overview
> > +========
> > +
> > +Control-flow Enforcement Technology (CET) is term referring to
> > several
> > +related x86 processor features that provides protection against
> > control
> > +flow hijacking attacks. The HW feature itself can be set up to
> > protect
> > +both applications and the kernel. Only user-mode protection is
> > implemented
> > +in the 64-bit kernel.
>
> This likely needs rewording, since it's not strictly true any more:
> IBT is supported in kernel-mode now (CONFIG_X86_IBT).

Yep, thanks.

>
> > +CET introduces Shadow Stack and Indirect Branch Tracking. Shadow
> > stack is
> > +a secondary stack allocated from memory and cannot be directly
> > modified by
> > +applications. When executing a CALL instruction, the processor
> > pushes the
> > +return address to both the normal stack and the shadow stack. Upon
> > +function return, the processor pops the shadow stack copy and
> > compares it
> > +to the normal stack copy. If the two differ, the processor raises
> > a
> > +control-protection fault. Indirect branch tracking verifies
> > indirect
> > +CALL/JMP targets are intended as marked by the compiler with
> > 'ENDBR'
> > +opcodes. Not all CPU's have both Shadow Stack and Indirect Branch
> > Tracking
> > +and only Shadow Stack is currently supported in the kernel.
> > +
> > +The Kconfig options is X86_SHADOW_STACK, and it can be disabled
> > with
> > +the kernel parameter clearcpuid, like this: "clearcpuid=shstk".
> > +
> > +To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM
> > v10.0.1
> > +or later are required. To build a CET-enabled application, GLIBC
> > v2.28 or
> > +later is also required.
> > +
> > +At run time, /proc/cpuinfo shows CET features if the processor
> > supports
> > +CET.
>
> Maybe call them out by name: shstk ibt

Ok.

>
> > +CET arch_prctl()'s
> > +==================
> > +
> > +Elf features should be enabled by the loader using the below
> > arch_prctl's.
> > +
> > +arch_prctl(ARCH_CET_ENABLE, unsigned int feature)
> > + Enable a single feature specified in 'feature'. Can only
> > operate on
> > + one feature at a time.
>
> Does this mean only 1 bit out of the 32 may be specified?

Yes, exactly.

>
> > +
> > +arch_prctl(ARCH_CET_DISABLE, unsigned int feature)
> > + Disable features specified in 'feature'. Can only operate on
> > + one feature at a time.
> > +
> > +arch_prctl(ARCH_CET_LOCK, unsigned int features)
> > + Lock in features at their current enabled or disabled status.
>
> How is the "features" argument processed here?

Yes, this should have more info. The kernel keeps a mask of features that are "locked". The mask passed in is ORed with the existing value, so any bits set here cannot be enabled or disabled afterwards. Bits unset in the mask passed are ignored.

>
> > [...]
> > +Proc status
> > +===========
> > +To check if an application is actually running with shadow stack,
> > the
> > +user can read the /proc/$PID/arch_status. It will report "wrss" or
> > +"shstk" depending on what is enabled.
>
> TIL about "arch_status". :) Why is this a separate file? "status"
> already has unique field names.

It looks like "status" only has arch-agnostic feature status today. Maybe that's the reason? CET seems to fit there though.

>
> > +Fork
> > +----
> > +
> > +The shadow stack's vma has VM_SHADOW_STACK flag set; its PTEs are
> > required
> > +to be read-only and dirty. When a shadow stack PTE is not RO and
> > dirty, a
> > +shadow access triggers a page fault with the shadow stack access
> > bit set
> > +in the page fault error code.
> > +
> > +When a task forks a child, its shadow stack PTEs are copied and
> > both the
> > +parent's and the child's shadow stack PTEs are cleared of the
> > dirty bit.
> > +Upon the next shadow stack access, the resulting shadow stack page
> > fault
> > +is handled by page copy/re-use.
> > +
> > +When a pthread child is created, the kernel allocates a new shadow
> > stack
> > +for the new thread.
>
> Perhaps speak to the ASLR characteristics of the shstk here?

It behaves just like mmap(). I can add some info.

>
> Also, it seems if there is a "Fork" section, there should be an
> "Exec"
> section? I suspect it would be short: shstk is disabled when execve()
> is
> called and must be re-enabled from userspace, yes?

Sure, I can add some info.
On 10/4/22 02:35, John Hubbard wrote:
> It's always a judgment call, as to whether to use something like ``CALL``
> or just plain CALL. Here, I'd like to opine that the benefits of
> ``CALL`` are very small, whereas plain text in cet.rst has been made
> significantly worse. So the result is, "this is not worth it".
>

Hmm, seems like neither CALL nor ``CALL`` is better, right?
On 10/3/22 23:56, Edgecombe, Rick P wrote: > On Fri, 2022-09-30 at 20:41 +0700, Bagas Sanjaya wrote: >> On 9/30/22 20:33, Jonathan Corbet wrote: >>>> CET introduces Shadow Stack and Indirect Branch Tracking. >>>> Shadow stack is >>>> a secondary stack allocated from memory and cannot be directly >>>> modified by >>>> -applications. When executing a CALL instruction, the processor >>>> pushes the >>>> +applications. When executing a ``CALL`` instruction, the >>>> processor pushes the >>> >>> Just to be clear, not everybody is fond of sprinkling lots of >>> ``literal >>> text`` throughout the documentation in this way. Heavy use of it >>> will >>> certainly clutter the plain-text file and can be a net negative >>> overall. >>> >> >> Actually there is a trade-off between semantic correctness and plain- >> text >> clarity. With regards to inline code samples (like identifiers), I >> fall >> into the former camp. But when I'm reviewing patches for which the >> surrounding documentation go latter camp (leave code samples alone >> without >> markup), I can adapt to that style as long as it causes no warnings >> whatsover. > > Thanks. Unless anyone has any objections, I think I'll take all these > changes, except for the literal-izing of the instructions. They are not > really being used as code samples in this case. > > Bagas, can you reply with your sign-off and I'll just apply it? OK, here goes... Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
On 29/09/2022 23:28, Rick Edgecombe wrote:
> diff --git a/Documentation/x86/cet.rst b/Documentation/x86/cet.rst
> new file mode 100644
> index 000000000000..4a0dfb6830f9
> --- /dev/null
> +++ b/Documentation/x86/cet.rst
> @@ -0,0 +1,140 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=========================================
> +Control-flow Enforcement Technology (CET)
> +=========================================
> +
> +Overview
> +========
> +
> +Control-flow Enforcement Technology (CET) is term referring to several
> +related x86 processor features that provides protection against control
> +flow hijacking attacks. The HW feature itself can be set up to protect
> +both applications and the kernel. Only user-mode protection is implemented
> +in the 64-bit kernel.
> +
> +CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is
> +a secondary stack allocated from memory and cannot be directly modified by
> +applications. When executing a CALL instruction, the processor pushes the
> +return address to both the normal stack and the shadow stack. Upon
> +function return, the processor pops the shadow stack copy and compares it
> +to the normal stack copy. If the two differ, the processor raises a
> +control-protection fault. Indirect branch tracking verifies indirect
> +CALL/JMP targets are intended as marked by the compiler with 'ENDBR'
> +opcodes. Not all CPU's have both Shadow Stack and Indirect Branch Tracking
> +and only Shadow Stack is currently supported in the kernel.

This paragraph is stale, isn't it? AIUI, by the end of this series, what is supported is in-kernel self-protection using CET-IBT, and userspace shadow stacks.

It is probably worth keeping the implementation-agnostic bits separate from the "what is currently supported" matrix.

I'm not certain if it's worth splitting into cet.rst, cet-kernel.rst and cet-user.rst at this point, but it's something to consider.

> +The Kconfig options is X86_SHADOW_STACK, and it can be disabled with
> +the kernel parameter clearcpuid, like this: "clearcpuid=shstk".

What about namespacing? For the CPUID features themselves, yes they're shstk and ibt. But for the Kconfig options, the user and kernel implementations are wildly different for both shstk and ibt. Are they going to want to share the same Kconfig option from the get-go?

Independent of the Kconfig symbol, user and kernel have separate enablement criteria. For example, kernel shstk is likely going to be dependent on the FRED feature, and simply looking at `shstk` in /proc/cpuinfo doesn't necessarily tell you all you want to know.

> +
> +To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM v10.0.1

What are the other dependencies here? In principle shstk only needs assembler support for the new instructions, and that's Binutils 2.29 / LLVM 6 from my notes.

It's IBT which needs compiler support (and then, even only kernel IBT), and that work is already done.

> +or later are required. To build a CET-enabled application, GLIBC v2.28 or
> +later is also required.
> +
> +At run time, /proc/cpuinfo shows CET features if the processor supports
> +CET.

Probably helpful to state what these are.

> +
> +Application Enabling
> +====================
> +
> +An application's CET capability is marked in its ELF header and can be

Technically it's in an ELF note, not the ELF header.

~Andrew
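To make the ELF note point concrete: the SHSTK marking that readelf -n prints comes from a GNU property note, whose descriptor is a sequence of (type, datasz, data, padding) records. The sketch below walks such a descriptor that has already been extracted into memory and returns the x86 feature bits. The three constants use the values that <elf.h> gives for GNU_PROPERTY_X86_FEATURE_1_AND and its IBT and SHSTK bits; the function name is hypothetical, locating the note in the file is left out, and the code assumes the buffer is in host byte order (true when inspecting x86 binaries on x86).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Same values as the corresponding GNU_PROPERTY_* macros in <elf.h>. */
#define PROP_X86_FEATURE_1_AND   0xc0000002u
#define PROP_X86_FEATURE_1_IBT   (1u << 0)
#define PROP_X86_FEATURE_1_SHSTK (1u << 1)

/*
 * Hypothetical helper: scan the descriptor of a GNU property note and
 * return the X86_FEATURE_1_AND bits, or 0 if the property is absent or
 * the buffer is truncated. `align` is the property alignment (8 for
 * 64-bit objects).
 */
static uint32_t gnu_property_x86_features(const unsigned char *desc,
                                          size_t descsz, size_t align)
{
    assert(desc != NULL && align != 0);

    size_t off = 0;
    while (descsz - off >= 8) {
        uint32_t type, datasz;

        /* Each record starts with two 32-bit fields: type and data size. */
        memcpy(&type, desc + off, 4);
        memcpy(&datasz, desc + off + 4, 4);
        off += 8;
        if (datasz > descsz - off)
            return 0; /* truncated record */

        if (type == PROP_X86_FEATURE_1_AND && datasz == 4) {
            uint32_t bits;

            memcpy(&bits, desc + off, 4);
            return bits;
        }

        /* Skip the data, rounded up to the property alignment. */
        size_t padded = (datasz + align - 1) / align * align;
        if (padded > descsz - off)
            break;
        off += padded;
    }
    return 0;
}
```

A caller would test the result against PROP_X86_FEATURE_1_SHSTK to decide whether the object is marked shadow-stack compatible, which is the same information the readelf example in the patch greps for.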
On Mon, Oct 03, 2022 at 04:56:10PM +0000, Edgecombe, Rick P wrote:
> Thanks. Unless anyone has any objections
Well, I'll object. I still feel rst should burn in hell. Plain text FTW.
On 10/5/22 16:10, Peter Zijlstra wrote:
> On Mon, Oct 03, 2022 at 04:56:10PM +0000, Edgecombe, Rick P wrote:
>> Thanks. Unless anyone has any objections
>
> Well, I'll object. I still feel rst should burn in hell. Plain text FTW.
>
>

.txt maybe?
On Wed, Oct 05, 2022 at 04:25:39PM +0700, Bagas Sanjaya wrote:
> On 10/5/22 16:10, Peter Zijlstra wrote:
> > On Mon, Oct 03, 2022 at 04:56:10PM +0000, Edgecombe, Rick P wrote:
> >> Thanks. Unless anyone has any objections
> >
> > Well, I'll object. I still feel rst should burn in hell. Plain text FTW.
> >
> >
>
> .txt maybe?

We had that, but some idiots went and converted the lot to .rst :-(
* Rick Edgecombe:

> +To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM v10.0.1
> +or later are required. To build a CET-enabled application, GLIBC v2.28 or
> +later is also required.

Uhm, I think we are using binutils 2.30 with extra fixes. I hope that
these binaries are still valid.

More importantly, glibc needs to be configured with --enable-cet
explicitly (unless the compiler defaults to CET). The default glibc
build with a default GCC will produce dynamically-linked executables
that disable CET (when running on later/differently configured glibc
builds). The statically linked object files are not marked up for CET
in that case.

I think the goal is to support the new kernel interface for actually
switching on SHSTK in glibc 2.37. But at that point, hopefully all
those existing binaries can start enjoying the SHSTK benefits.

Thanks,
Florian
On Mon, 2022-10-10 at 14:19 +0200, Florian Weimer wrote:
> Uhm, I think we are using binutils 2.30 with extra fixes. I hope
> that
> these binaries are still valid.

Yea, you're right. Andrew Cooper pointed out it has been supported
since 2.29, so 2.30 should be fine.

>
> More importantly, glibc needs to be configured with --enable-cet
> explicitly (unless the compiler defaults to CET). The default glibc
> build with a default GCC will produce dynamically-linked executables
> that disable CET (when running on later/differently configured glibc
> builds). The statically linked object files are not marked up for
> CET
> in that case.

Thanks, that's a good point. I'll add a blurb noting that glibc needs
to be compiled with CET support.

>
> I think the goal is to support the new kernel interface for actually
> switching on SHSTK in glibc 2.37. But at that point, hopefully all
> those existing binaries can start enjoying the STSTK benefits.

Can you share more about this plan? HJ was previously planning to wait
until the kernel support was upstream before making any more glibc
changes. Hopefully this will be in time for that, but I'd really rather
not repeat what happened last time where we had to design the kernel
interface around not breaking old glibc's with mismatched CET
enablement.

What did you think of the proposal to disable existing binaries and
start from scratch? Elaborated in the cover letter in the section
"Compatibility of Existing Binaries/Enabling Interface".
On Mon, Oct 10, 2022 at 9:44 AM Edgecombe, Rick P <rick.p.edgecombe@intel.com> wrote:
>
> On Mon, 2022-10-10 at 14:19 +0200, Florian Weimer wrote:
> > Uhm, I think we are using binutils 2.30 with extra fixes. I hope
> > that
> > these binaries are still valid.
>
> Yea, you're right. Andrew Cooper pointed out it has been supported
> since 2.29, so 2.30 should be fine.
>
> >
> > More importantly, glibc needs to be configured with --enable-cet
> > explicitly (unless the compiler defaults to CET). The default glibc
> > build with a default GCC will produce dynamically-linked executables
> > that disable CET (when running on later/differently configured glibc
> > builds). The statically linked object files are not marked up for
> > CET
> > in that case.
>
> Thanks, that's a good point. I'll add a blurb about glibc needs to be
> compiled with CET support.
>
> >
> > I think the goal is to support the new kernel interface for actually
> > switching on SHSTK in glibc 2.37. But at that point, hopefully all
> > those existing binaries can start enjoying the STSTK benefits.
>
> Can you share more about this plan? HJ was previously planning to wait
> until the kernel support was upstream before making any more glibc
> changes. Hopefully this will be in time for that, but I'd really rather
> not repeat what happened last time where we had to design the kernel
> interface around not breaking old glibc's with mismatched CET
> enablement.
>
> What did you think of the proposal to disable existing binaries and
> start from scratch? Elaborated in the coverletter in the section
> "Compatibility of Existing Binaries/Enabling Interface".

My current glibc plan is that the kernel won't enable CET automatically
and glibc will issue a syscall to enable CET at early startup time. All
existing CET enabled dynamic executables will have CET enabled under the
CET kernel and the updated CET glibc.
* Rick P. Edgecombe:

>> I think the goal is to support the new kernel interface for actually
>> switching on SHSTK in glibc 2.37. But at that point, hopefully all
>> those existing binaries can start enjoying the STSTK benefits.
>
> Can you share more about this plan? HJ was previously planning to wait
> until the kernel support was upstream before making any more glibc
> changes. Hopefully this will be in time for that, but I'd really rather
> not repeat what happened last time where we had to design the kernel
> interface around not breaking old glibc's with mismatched CET
> enablement.

You're still doing that (keeping that gap in this constant), and this is
appreciated and very much necessary.

> What did you think of the proposal to disable existing binaries and
> start from scratch? Elaborated in the coverletter in the section
> "Compatibility of Existing Binaries/Enabling Interface".

The ABI was finalized around four years ago, and we have shipped several
Fedora and Red Hat Enterprise Linux versions with it. Other
distributions did as well. It's a bit late to make changes now, and
certainly not for such trivialities. In the case of the IBT ABI, it may
be tempting to start over in a less trivial way, to radically reduce the
amount of ENDBR instructions. But that doesn't concern SHSTK, and
there's no actual implementation anyway.

But as H.J. implied, you would have to do rather nasty things in the
kernel to prevent us from achieving ABI compatibility in userspace, like
parsing property notes on the main executable and disabling the new
arch_prctl calls if you see something there that you don't like. 8-)
Of course no one is going to implement that.

(We are fine with swapping out glibc and its dynamic loader to enable
CET with the appropriate kernel mechanism, but we wouldn't want to
change the way all other binaries are marked up.)

Thanks,
Florian
On 10/12/22 05:29, Florian Weimer wrote:
>> What did you think of the proposal to disable existing binaries and
>> start from scratch? Elaborated in the coverletter in the section
>> "Compatibility of Existing Binaries/Enabling Interface".
> The ABI was finalized around four years ago, and we have shipped several
> Fedora and Red Hat Enterprise Linux versions with it. Other
> distributions did as well. It's a bit late to make changes now, and
> certainly not for such trivialities.

Just to be clear: You're saying that a user/kernel ABI was "finalized"
by glibc shipping the user side of it, before there being an upstream
kernel implementation?
* Dave Hansen:

> On 10/12/22 05:29, Florian Weimer wrote:
>>> What did you think of the proposal to disable existing binaries and
>>> start from scratch? Elaborated in the coverletter in the section
>>> "Compatibility of Existing Binaries/Enabling Interface".
>> The ABI was finalized around four years ago, and we have shipped several
>> Fedora and Red Hat Enterprise Linux versions with it. Other
>> distributions did as well. It's a bit late to make changes now, and
>> certainly not for such trivialities.
>
> Just to be clear: You're saying that a user/kernel ABI was "finalized"
> by glibc shipping the user side of it, before there being an upstream
> kernel implementation?

Sorry for being unclear. I was referring to the x86-64 ELF psABI
supplement for CET, not the kernel/userspace interface, which still does
not exist in its final form as of today, as far as I understand it.

Thanks,
Florian
On Wed, 2022-10-12 at 14:29 +0200, Florian Weimer wrote:
> The ABI was finalized around four years ago, and we have shipped
> several
> Fedora and Red Hat Enterprise Linux versions with it. Other
> distributions did as well. It's a bit late to make changes now, and
> certainly not for such trivialities. In the case of the IBT ABI, it
> may
> be tempting to start over in a less trivial way, to radically reduce
> the
> amount of ENDBR instructions. But that doesn't concern SHSTK, and
> there's no actual implementation anyway.
>
> But as H.J. implied, you would have to do rather nasty things in the
> kernel to prevent us from achieving ABI compatibility in userspace,
> like
> parsing property notes on the main executable and disabling the new
> arch_prctl calls if you see something there that you don't like. 8-)
> Of course no one is going to implement that.
>
> (We are fine with swapping out glibc and its dynamic loader to enable
> CET with the appropriate kernel mechanism, but we wouldn't want to
> change the way all other binaries are marked up.)

So we have these compatibility issues with existing binaries. We know
some apps are totally broken. It sounds like you are proposing to
ignore this and let people who hit the issues work through it
themselves. This was also proposed by other glibc developers as a
solution for past CET compatibility issues that broke boot on kernel
upgrade. I have to say, as the person pushing these patches, I’m
uncomfortable with this approach. I don’t think users will like the
results. Basically, do they want to upgrade and run a bunch of untested
integration with known failures? I also don’t want to get this feature
reverted and I’m not exactly sure how this scenario would be taken.

But I hear the point about it not being ideal to abandon the existing
CET userspace. I think there is also a point about how userspace chose
to do this optimistic and early wide enabling, even if it was a bad
idea, and so how much should the kernel try to save userspace from
itself. So what do you think about this instead:

The current psABI spec talks about the binary being compatible with
shadow stack. It doesn’t say much about what should happen after the
loader. Since the greater ecosystem has used this bit with a more
cavalier attitude, glibc could treat it as a request for a warn and
continue mode. In the meantime we could have a new bit shstk_strict,
that requests behavior like these patches implement, and kills the
process on violation. Glibc/tools could add support for this strict bit
and anyone that wants to more carefully compile with it could finally
get shadow stack today. Then the implementation of the warn and
continue mode could follow that, and glibc could map the original shstk
bit to that kernel mode. So the old binaries would get there
eventually, which is better than the nothing they continue to have
today.

And speaking of having nothing today, there are people that really want
to use shadow stack and do not care at all about having CET support for
existing binaries. Neither glibc nor ELF bits are required to use kernel
shadow stack support. So if it comes to it, I don’t want to hold
support back for other users because the ELF note bit enabling path
grew some issues.

Please let me know what you think of that plan.
On Thu, Oct 13, 2022 at 2:28 PM Edgecombe, Rick P <rick.p.edgecombe@intel.com> wrote:
>
> On Wed, 2022-10-12 at 14:29 +0200, Florian Weimer wrote:
> > The ABI was finalized around four years ago, and we have shipped
> > several
> > Fedora and Red Hat Enterprise Linux versions with it. Other
> > distributions did as well. It's a bit late to make changes now, and
> > certainly not for such trivialities. In the case of the IBT ABI, it
> > may
> > be tempting to start over in a less trivial way, to radically reduce
> > the
> > amount of ENDBR instructions. But that doesn't concern SHSTK, and
> > there's no actual implementation anyway.
> >
> > But as H.J. implied, you would have to do rather nasty things in the
> > kernel to prevent us from achieving ABI compatibility in userspace,
> > like
> > parsing property notes on the main executable and disabling the new
> > arch_prctl calls if you see something there that you don't like. 8-)
> > Of course no one is going to implement that.
> >
> > (We are fine with swapping out glibc and its dynamic loader to enable
> > CET with the appropriate kernel mechanism, but we wouldn't want to
> > change the way all other binaries are marked up.)
>
> So we have these compatibility issues with existing binaries. We know
> some apps are totally broken. It sounds like you are proposing to
> ignore this and let people who hit the issues work through it
> themselves. This was also proposed by other glibc developers as a
> solution for past CET compatibility issues that broke boot on kernel
> upgrade. I have to say, as the person pushing these patches, I’m
> uncomfortable with this approach. I don’t think users will like the
> results. Basically, do they want to upgrade and run a bunch of untested
> integration with known failures? I also don’t want to get this feature
> reverted and I’m not exactly sure how this scenario would be taken.
>
> But I hear the point about it not being ideal to abandon the existing
> CET userspace. I think there is also a point about how userspace chose
> to do this optimistic and early wide enabling, even if it was a bad
> idea, and so how much should the kernel try to save userspace from
> itself. So what do you think about this instead:
>
> The current psABI spec talks about the binary being compatible with
> shadow stack. It doesn’t say much about what should happen after the
> loader. Since the greater ecosystem has used this bit with a more
> cavalier attitude, glibc could treat it as a request for a warn and
> continue mode. In the meantime we could have a new bit shstk_strict,
> that requests behavior like these patches implement, and kills the
> process on violation. Glibc/tools could add support for this strict bit
> and anyone that wants to more carefully compile with it could finally
> get shadow stack today. Then the implementation of the warn and
> continue mode could follow that, and glibc could map the original shstk
> bit to that kernel mode. So the old binaries would get there
> eventually, which is better than the continuing nothing they have
> today.
>
> And speaking of having nothing today, there are people that really want
> to use shadow stack and do not care at all about having CET support for
> existing binaries. Neither glibc or elf bits are required to use kernel
> shadow stack support. So if it comes to it, I don’t want to hold
> support back for other users because the elf note bit enabling path
> grew some issues.
>
> Please let me know about what you think of that plan.

The kernel CET description

+The kernel does not process these applications directly. Applications must
+enable them using the interface descriped in section 4. Typically this
+would be done in dynamic loader or static runtime objects, as is the case
+in glibc.

may leave an impression that each application needs to use the kernel
interface to enable CET itself. This is an option. But the updated glibc
will enable CET automatically on behalf of the CET enabled application.
If glibc isn't updated to use the new CET kernel interface, the existing
CET enabled binaries will still run correctly under the new CET enabled
kernel, just without CET enabled.
On Thu, 2022-10-13 at 14:28 -0700, Rick Edgecombe wrote:
> In the meantime we could have a new bit shstk_strict,
> that requests behavior like these patches implement, and kills the
> process on violation. Glibc/tools could add support for this strict
> bit
> and anyone that wants to more carefully compile with it could finally
> get shadow stack today. Then the implementation of the warn and
> continue mode could follow that, and glibc could map the original
> shstk
> bit to that kernel mode. So the old binaries would get there
> eventually, which is better than the continuing nothing they have
> today.

Hi,

Any thoughts on this proposal?

Thanks,

Rick
diff --git a/Documentation/x86/cet.rst b/Documentation/x86/cet.rst
new file mode 100644
index 000000000000..4a0dfb6830f9
--- /dev/null
+++ b/Documentation/x86/cet.rst
@@ -0,0 +1,140 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================================
+Control-flow Enforcement Technology (CET)
+=========================================
+
+Overview
+========
+
+Control-flow Enforcement Technology (CET) is term referring to several
+related x86 processor features that provides protection against control
+flow hijacking attacks. The HW feature itself can be set up to protect
+both applications and the kernel. Only user-mode protection is implemented
+in the 64-bit kernel.
+
+CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is
+a secondary stack allocated from memory and cannot be directly modified by
+applications. When executing a CALL instruction, the processor pushes the
+return address to both the normal stack and the shadow stack. Upon
+function return, the processor pops the shadow stack copy and compares it
+to the normal stack copy. If the two differ, the processor raises a
+control-protection fault. Indirect branch tracking verifies indirect
+CALL/JMP targets are intended as marked by the compiler with 'ENDBR'
+opcodes. Not all CPU's have both Shadow Stack and Indirect Branch Tracking
+and only Shadow Stack is currently supported in the kernel.
+
+The Kconfig options is X86_SHADOW_STACK, and it can be disabled with
+the kernel parameter clearcpuid, like this: "clearcpuid=shstk".
+
+To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM v10.0.1
+or later are required. To build a CET-enabled application, GLIBC v2.28 or
+later is also required.
+
+At run time, /proc/cpuinfo shows CET features if the processor supports
+CET.
+
+Application Enabling
+====================
+
+An application's CET capability is marked in its ELF header and can be
+verified from readelf/llvm-readelf output:
+
+    readelf -n <application> | grep -a SHSTK
+        properties: x86 feature: SHSTK
+
+The kernel does not process these applications directly. Applications must
+enable them using the interface descriped in section 4. Typically this
+would be done in dynamic loader or static runtime objects, as is the case
+in glibc.
+
+Backward Compatibility
+======================
+
+GLIBC provides a few CET tunables via the GLIBC_TUNABLES environment
+variable:
+
+GLIBC_TUNABLES=glibc.tune.hwcaps=-SHSTK,-WRSS
+    Turn off SHSTK/WRSS.
+
+GLIBC_TUNABLES=glibc.tune.x86_shstk=<on, permissive>
+    This controls how dlopen() handles SHSTK legacy libraries::
+
+        on         - continue with SHSTK enabled;
+        permissive - continue with SHSTK off.
+
+Details can be found in the GLIBC manual pages.
+
+CET arch_prctl()'s
+==================
+
+Elf features should be enabled by the loader using the below arch_prctl's.
+
+arch_prctl(ARCH_CET_ENABLE, unsigned int feature)
+    Enable a single feature specified in 'feature'. Can only operate on
+    one feature at a time.
+
+arch_prctl(ARCH_CET_DISABLE, unsigned int feature)
+    Disable features specified in 'feature'. Can only operate on
+    one feature at a time.
+
+arch_prctl(ARCH_CET_LOCK, unsigned int features)
+    Lock in features at their current enabled or disabled status.
+
+The return values are as following:
+    On success, return 0. On error, errno can be::
+
+        -EPERM if any of the passed feature are locked.
+        -EOPNOTSUPP if the feature is not supported by the hardware or
+         disabled by kernel parameter.
+        -EINVAL arguments (non existing feature, etc)
+
+Currently shadow stack and WRSS are supported via this interface. WRSS
+can only be enabled with shadow stack, and is automatically disabled
+if shadow stack is disabled.
+
+Proc status
+===========
+To check if an application is actually running with shadow stack, the
+user can read the /proc/$PID/arch_status. It will report "wrss" or
+"shstk" depending on what is enabled.
+
+The implementation of the Shadow Stack
+======================================
+
+Shadow Stack size
+-----------------
+
+A task's shadow stack is allocated from memory to a fixed size of
+MIN(RLIMIT_STACK, 4 GB). In other words, the shadow stack is allocated to
+the maximum size of the normal stack, but capped to 4 GB. However,
+a compat-mode application's address space is smaller, each of its thread's
+shadow stack size is MIN(1/4 RLIMIT_STACK, 4 GB).
+
+Signal
+------
+
+By default, the main program and its signal handlers use the same shadow
+stack. Because the shadow stack stores only return addresses, a large
+shadow stack covers the condition that both the program stack and the
+signal alternate stack run out.
+
+The kernel creates a restore token for the shadow stack and pushes the
+restorer address to the shadow stack. Then verifies that token when
+restoring from the signal handler.
+
+Fork
+----
+
+The shadow stack's vma has VM_SHADOW_STACK flag set; its PTEs are required
+to be read-only and dirty. When a shadow stack PTE is not RO and dirty, a
+shadow access triggers a page fault with the shadow stack access bit set
+in the page fault error code.
+
+When a task forks a child, its shadow stack PTEs are copied and both the
+parent's and the child's shadow stack PTEs are cleared of the dirty bit.
+Upon the next shadow stack access, the resulting shadow stack page fault
+is handled by page copy/re-use.
+
+When a pthread child is created, the kernel allocates a new shadow stack
+for the new thread.
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index c73d133fd37c..9ac03055c4b5 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -22,6 +22,7 @@ x86-specific Documentation
    mtrr
    pat
    intel-hfi
+   cet
    iommu
    intel_txt
    amd-memory-encryption
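
The sizing rule from the "Shadow Stack size" section of the patch can be
written out as a small helper. This is only a sketch of the rule as
documented there, not the kernel's implementation: shstk_size() and
SHSTK_SIZE_CAP are hypothetical names, and the stack limit is passed in
as a plain 64-bit value rather than read with getrlimit().

```c
#include <stdint.h>

/* The 4 GB cap quoted in the documentation text. */
#define SHSTK_SIZE_CAP (4ULL << 30)

/*
 * Hypothetical helper: shadow stack size for a given RLIMIT_STACK value.
 * Native tasks get MIN(RLIMIT_STACK, 4 GB); compat-mode tasks get
 * MIN(RLIMIT_STACK / 4, 4 GB), as described in the patch text.
 */
static uint64_t shstk_size(uint64_t rlimit_stack, int compat)
{
	uint64_t size = compat ? rlimit_stack / 4 : rlimit_stack;

	return size < SHSTK_SIZE_CAP ? size : SHSTK_SIZE_CAP;
}
```

With a stack limit of 8 MB this gives an 8 MB shadow stack for a native
task and 2 MB for a compat-mode one; an unlimited stack (all bits set)
is capped at 4 GB.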