[v2,01/39] Documentation/x86: Add CET description

Message ID 20220929222936.14584-2-rick.p.edgecombe@intel.com (mailing list archive)
State New
Series Shadowstacks for userspace

Commit Message

Rick Edgecombe Sept. 29, 2022, 10:28 p.m. UTC
From: Yu-cheng Yu <yu-cheng.yu@intel.com>

Introduce a new document on Control-flow Enforcement Technology (CET).

Signed-off-by: Yu-cheng Yu <yu-cheng.yu@intel.com>
Co-developed-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Signed-off-by: Rick Edgecombe <rick.p.edgecombe@intel.com>
Cc: Kees Cook <keescook@chromium.org>

---

v2:
 - Updated to new arch_prctl() API
 - Add bit about new proc status

v1:
 - Update and clarify the docs.
 - Moved kernel parameters documentation to other patch.

 Documentation/x86/cet.rst   | 140 ++++++++++++++++++++++++++++++++++++
 Documentation/x86/index.rst |   1 +
 2 files changed, 141 insertions(+)
 create mode 100644 Documentation/x86/cet.rst

Comments

Bagas Sanjaya Sept. 30, 2022, 3:41 a.m. UTC | #1
On Thu, Sep 29, 2022 at 03:28:58PM -0700, Rick Edgecombe wrote:
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=========================================
> +Control-flow Enforcement Technology (CET)
> +=========================================
> +
> +Overview
> +========
> +
> +Control-flow Enforcement Technology (CET) is term referring to several
> +related x86 processor features that provides protection against control
> +flow hijacking attacks. The HW feature itself can be set up to protect
> +both applications and the kernel. Only user-mode protection is implemented
> +in the 64-bit kernel.
> +
> +CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is
> +a secondary stack allocated from memory and cannot be directly modified by
> +applications. When executing a CALL instruction, the processor pushes the
> +return address to both the normal stack and the shadow stack. Upon
> +function return, the processor pops the shadow stack copy and compares it
> +to the normal stack copy. If the two differ, the processor raises a
> +control-protection fault. Indirect branch tracking verifies indirect
> +CALL/JMP targets are intended as marked by the compiler with 'ENDBR'
> +opcodes. Not all CPU's have both Shadow Stack and Indirect Branch Tracking
> +and only Shadow Stack is currently supported in the kernel.
> +
> +The Kconfig options is X86_SHADOW_STACK, and it can be disabled with
> +the kernel parameter clearcpuid, like this: "clearcpuid=shstk".
> +
> +To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM v10.0.1
> +or later are required. To build a CET-enabled application, GLIBC v2.28 or
> +later is also required.
> +
> +At run time, /proc/cpuinfo shows CET features if the processor supports
> +CET.
> +
> +Application Enabling
> +====================
> +
> +An application's CET capability is marked in its ELF header and can be
> +verified from readelf/llvm-readelf output:
> +
> +    readelf -n <application> | grep -a SHSTK
> +        properties: x86 feature: SHSTK
> +
> +The kernel does not process these applications directly. Applications must
> +enable them using the interface descriped in section 4. Typically this
> +would be done in dynamic loader or static runtime objects, as is the case
> +in glibc.
> +
> +Backward Compatibility
> +======================
> +
> +GLIBC provides a few CET tunables via the GLIBC_TUNABLES environment
> +variable:
> +
> +GLIBC_TUNABLES=glibc.tune.hwcaps=-SHSTK,-WRSS
> +    Turn off SHSTK/WRSS.
> +
> +GLIBC_TUNABLES=glibc.tune.x86_shstk=<on, permissive>
> +    This controls how dlopen() handles SHSTK legacy libraries::
> +
> +        on         - continue with SHSTK enabled;
> +        permissive - continue with SHSTK off.
> +
> +Details can be found in the GLIBC manual pages.
> +
> +CET arch_prctl()'s
> +==================
> +
> +Elf features should be enabled by the loader using the below arch_prctl's.
> +
> +arch_prctl(ARCH_CET_ENABLE, unsigned int feature)
> +    Enable a single feature specified in 'feature'. Can only operate on
> +    one feature at a time.
> +
> +arch_prctl(ARCH_CET_DISABLE, unsigned int feature)
> +    Disable features specified in 'feature'. Can only operate on
> +    one feature at a time.
> +
> +arch_prctl(ARCH_CET_LOCK, unsigned int features)
> +    Lock in features at their current enabled or disabled status.
> +
> +The return values are as following:
> +    On success, return 0. On error, errno can be::
> +
> +        -EPERM if any of the passed feature are locked.
> +        -EOPNOTSUPP if the feature is not supported by the hardware or
> +         disabled by kernel parameter.
> +        -EINVAL arguments (non existing feature, etc)
> +
> +Currently shadow stack and WRSS are supported via this interface. WRSS
> +can only be enabled with shadow stack, and is automatically disabled
> +if shadow stack is disabled.
> +
> +Proc status
> +===========
> +To check if an application is actually running with shadow stack, the
> +user can read the /proc/$PID/arch_status. It will report "wrss" or
> +"shstk" depending on what is enabled.
> +
> +The implementation of the Shadow Stack
> +======================================
> +
> +Shadow Stack size
> +-----------------
> +
> +A task's shadow stack is allocated from memory to a fixed size of
> +MIN(RLIMIT_STACK, 4 GB). In other words, the shadow stack is allocated to
> +the maximum size of the normal stack, but capped to 4 GB. However,
> +a compat-mode application's address space is smaller, each of its thread's
> +shadow stack size is MIN(1/4 RLIMIT_STACK, 4 GB).
> +
> +Signal
> +------
> +
> +By default, the main program and its signal handlers use the same shadow
> +stack. Because the shadow stack stores only return addresses, a large
> +shadow stack covers the condition that both the program stack and the
> +signal alternate stack run out.
> +
> +The kernel creates a restore token for the shadow stack and pushes the
> +restorer address to the shadow stack. Then verifies that token when
> +restoring from the signal handler.
> +
> +Fork
> +----
> +
> +The shadow stack's vma has VM_SHADOW_STACK flag set; its PTEs are required
> +to be read-only and dirty. When a shadow stack PTE is not RO and dirty, a
> +shadow access triggers a page fault with the shadow stack access bit set
> +in the page fault error code.
> +
> +When a task forks a child, its shadow stack PTEs are copied and both the
> +parent's and the child's shadow stack PTEs are cleared of the dirty bit.
> +Upon the next shadow stack access, the resulting shadow stack page fault
> +is handled by page copy/re-use.
> +
> +When a pthread child is created, the kernel allocates a new shadow stack
> +for the new thread.
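
As a quick sanity check of the sizing rule quoted above, the arithmetic can be sketched in Python. The function name is hypothetical, and both the handling of an unlimited RLIMIT_STACK (treated here as hitting the 4 GB cap) and the use of the soft limit are assumptions, not something the patch states.

```python
import resource

FOUR_GB = 4 << 30


def shadow_stack_size(rlimit_stack, compat=False):
    """Size rule from the quoted text: MIN(RLIMIT_STACK, 4 GB), or
    MIN(1/4 RLIMIT_STACK, 4 GB) for compat-mode applications."""
    # Assumption: an unlimited stack rlimit simply ends up at the 4 GB cap.
    if rlimit_stack == resource.RLIM_INFINITY:
        return FOUR_GB
    base = rlimit_stack // 4 if compat else rlimit_stack
    return min(base, FOUR_GB)


if __name__ == "__main__":
    # Assumption: the soft limit is the value that matters here.
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    print("64-bit:", shadow_stack_size(soft))
    print("compat:", shadow_stack_size(soft, compat=True))
```

With the common 8 MB stack limit this gives an 8 MB shadow stack for a 64-bit task and 2 MB per thread for a compat-mode task.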

The documentation above can be improved (both grammar and formatting):

---- >8 ----

diff --git a/Documentation/x86/cet.rst b/Documentation/x86/cet.rst
index 6b270a24ebc3a2..f691f7995cf088 100644
--- a/Documentation/x86/cet.rst
+++ b/Documentation/x86/cet.rst
@@ -15,92 +15,101 @@ in the 64-bit kernel.
 
 CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is
 a secondary stack allocated from memory and cannot be directly modified by
-applications. When executing a CALL instruction, the processor pushes the
+applications. When executing a ``CALL`` instruction, the processor pushes the
 return address to both the normal stack and the shadow stack. Upon
 function return, the processor pops the shadow stack copy and compares it
 to the normal stack copy. If the two differ, the processor raises a
 control-protection fault. Indirect branch tracking verifies indirect
-CALL/JMP targets are intended as marked by the compiler with 'ENDBR'
-opcodes. Not all CPU's have both Shadow Stack and Indirect Branch Tracking
-and only Shadow Stack is currently supported in the kernel.
+``CALL``/``JMP`` targets are intended as marked by the compiler with ``ENDBR``
+opcodes. Not all CPUs have both Shadow Stack and Indirect Branch Tracking
+and only Shadow Stack is currently supported by the kernel.
 
-The Kconfig options is X86_SHADOW_STACK, and it can be disabled with
-the kernel parameter clearcpuid, like this: "clearcpuid=shstk".
+The Kconfig option is ``X86_SHADOW_STACK`` and it can be overridden with
+the kernel command-line parameter ``clearcpuid`` (for example
+``clearcpuid=shstk``).
 
 To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM v10.0.1
-or later are required. To build a CET-enabled application, GLIBC v2.28 or
+or later are required. To build a CET-enabled application, glibc v2.28 or
 later is also required.
 
-At run time, /proc/cpuinfo shows CET features if the processor supports
-CET.
+At run time, ``/proc/cpuinfo`` shows CET features if the processor supports
+them.
 
-Application Enabling
-====================
+Enabling CET in applications
+============================
 
-An application's CET capability is marked in its ELF header and can be
-verified from readelf/llvm-readelf output:
+The CET capability of an application is marked in its ELF header and can be
+verified from ``readelf``/``llvm-readelf`` output::
 
     readelf -n <application> | grep -a SHSTK
         properties: x86 feature: SHSTK
 
 The kernel does not process these applications directly. Applications must
-enable them using the interface descriped in section 4. Typically this
+enable them using :ref:`cet-arch_prctl`. Typically this
 would be done in dynamic loader or static runtime objects, as is the case
 in glibc.
 
 Backward Compatibility
 ======================
 
-GLIBC provides a few CET tunables via the GLIBC_TUNABLES environment
+glibc provides a few CET tunables via the ``GLIBC_TUNABLES`` environment
 variable:
 
-GLIBC_TUNABLES=glibc.tune.hwcaps=-SHSTK,-WRSS
+  * ``GLIBC_TUNABLES=glibc.tune.hwcaps=-SHSTK,-WRSS``
+
     Turn off SHSTK/WRSS.
 
-GLIBC_TUNABLES=glibc.tune.x86_shstk=<on, permissive>
-    This controls how dlopen() handles SHSTK legacy libraries::
+  * ``GLIBC_TUNABLES=glibc.tune.x86_shstk=<on, permissive>``
 
-        on         - continue with SHSTK enabled;
-        permissive - continue with SHSTK off.
+    This controls how :manpage:`dlopen(3)` handles SHSTK legacy libraries.
+    Possible values are:
 
-Details can be found in the GLIBC manual pages.
+    * ``on``         - continue with SHSTK enabled;
+    * ``permissive`` - continue with SHSTK off.
 
-CET arch_prctl()'s
-==================
+.. _cet-arch_prctl:
 
-Elf features should be enabled by the loader using the below arch_prctl's.
+CET arch_prctl() interface
+==========================
 
-arch_prctl(ARCH_CET_ENABLE, unsigned int feature)
-    Enable a single feature specified in 'feature'. Can only operate on
+ELF features should be enabled by the loader using the following
+:manpage:`arch_prctl(2)` subfunctions:
+
+  * ``arch_prctl(ARCH_CET_ENABLE, unsigned int feature)``
+
+    Enable a single feature specified in ``feature``. Can only operate on
     one feature at a time.
 
-arch_prctl(ARCH_CET_DISABLE, unsigned int feature)
-    Disable features specified in 'feature'. Can only operate on
+  * ``arch_prctl(ARCH_CET_DISABLE, unsigned int feature)``
+
+    Disable features specified in ``feature``. Can only operate on
     one feature at a time.
 
-arch_prctl(ARCH_CET_LOCK, unsigned int features)
-    Lock in features at their current enabled or disabled status.
+  * ``arch_prctl(ARCH_CET_LOCK, unsigned int features)``
+
+    Lock in features at their current status.
+
+  * ``arch_prctl(ARCH_CET_UNLOCK, unsigned int features)``
 
-arch_prctl(ARCH_CET_UNLOCK, unsigned int features)
     Unlock features.
 
-The return values are as following:
-    On success, return 0. On error, errno can be::
+On success, :manpage:`arch_prctl(2)` returns 0, otherwise the errno
+can be:
 
-        -EPERM if any of the passed feature are locked.
-        -EOPNOTSUPP if the feature is not supported by the hardware or
-         disabled by kernel parameter.
-        -EINVAL arguments (non existing feature, etc)
+  - ``EPERM`` if any of the passed features are locked.
+  - ``EOPNOTSUPP`` if the feature is not supported by the hardware or
+    disabled by the kernel command-line parameter.
+  - ``EINVAL`` if the arguments are invalid (non existing feature, etc).
 
 Currently shadow stack and WRSS are supported via this interface. WRSS
 can only be enabled with shadow stack, and is automatically disabled
 if shadow stack is disabled.
 
-Proc status
+proc status
 ===========
-To check if an application is actually running with shadow stack, the
-user can read the /proc/$PID/arch_status. It will report "wrss" or
-"shstk" depending on what is enabled.
+To check if an application is actually running with shadow stack, users can
+read ``/proc/$PID/arch_status``. It will report ``wrss`` or
+``shstk`` depending on what is enabled.
 
 The implementation of the Shadow Stack
 ======================================
@@ -108,11 +117,11 @@ The implementation of the Shadow Stack
 Shadow Stack size
 -----------------
 
-A task's shadow stack is allocated from memory to a fixed size of
-MIN(RLIMIT_STACK, 4 GB). In other words, the shadow stack is allocated to
+The shadow stack of a task is allocated from memory to a fixed size of
+``MIN(RLIMIT_STACK, 4 GB)``. In other words, the shadow stack is allocated to
 the maximum size of the normal stack, but capped to 4 GB. However,
-a compat-mode application's address space is smaller, each of its thread's
-shadow stack size is MIN(1/4 RLIMIT_STACK, 4 GB).
+the address space of a compat-mode application is smaller; the shadow stack
+size of each of its threads is ``MIN(1/4 RLIMIT_STACK, 4 GB)``.
 
 Signal
 ------
@@ -123,19 +132,19 @@ shadow stack covers the condition that both the program stack and the
 signal alternate stack run out.
 
 The kernel creates a restore token for the shadow stack and pushes the
-restorer address to the shadow stack. Then verifies that token when
-restoring from the signal handler.
+restorer address to it. Then the kernel verifies that token when restoring
+from the signal handler.
 
 Fork
 ----
 
-The shadow stack's vma has VM_SHADOW_STACK flag set; its PTEs are required
-to be read-only and dirty. When a shadow stack PTE is not RO and dirty, a
+The shadow stack vma has ``VM_SHADOW_STACK`` flag set; its PTEs are required
+to be read-only and dirty. When a shadow stack PTE is not both read-only and dirty, a
 shadow access triggers a page fault with the shadow stack access bit set
 in the page fault error code.
 
 When a task forks a child, its shadow stack PTEs are copied and both the
-parent's and the child's shadow stack PTEs are cleared of the dirty bit.
+shadow stack PTEs of parent and child are cleared of the dirty bit.
 Upon the next shadow stack access, the resulting shadow stack page fault
 is handled by page copy/re-use.
 
Thanks.
Jonathan Corbet Sept. 30, 2022, 1:33 p.m. UTC | #2
Bagas Sanjaya <bagasdotme@gmail.com> writes:

> The documentation above can be improved (both grammar and formatting):
>
> ---- >8 ----
>
> diff --git a/Documentation/x86/cet.rst b/Documentation/x86/cet.rst
> index 6b270a24ebc3a2..f691f7995cf088 100644
> --- a/Documentation/x86/cet.rst
> +++ b/Documentation/x86/cet.rst
> @@ -15,92 +15,101 @@ in the 64-bit kernel.
>  
>  CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is
>  a secondary stack allocated from memory and cannot be directly modified by
> -applications. When executing a CALL instruction, the processor pushes the
> +applications. When executing a ``CALL`` instruction, the processor pushes the

Just to be clear, not everybody is fond of sprinkling lots of ``literal
text`` throughout the documentation in this way.  Heavy use of it will
certainly clutter the plain-text file and can be a net negative overall.

Thanks,

jon
Bagas Sanjaya Sept. 30, 2022, 1:41 p.m. UTC | #3
On 9/30/22 20:33, Jonathan Corbet wrote:
>>  CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is
>>  a secondary stack allocated from memory and cannot be directly modified by
>> -applications. When executing a CALL instruction, the processor pushes the
>> +applications. When executing a ``CALL`` instruction, the processor pushes the
> 
> Just to be clear, not everybody is fond of sprinkling lots of ``literal
> text`` throughout the documentation in this way.  Heavy use of it will
> certainly clutter the plain-text file and can be a net negative overall.
> 

Actually there is a trade-off between semantic correctness and plain-text
clarity. With regard to inline code samples (like identifiers), I fall
into the former camp. But when I'm reviewing patches for which the
surrounding documentation falls into the latter camp (leaving code samples
without markup), I can adapt to that style as long as it causes no warnings
whatsoever.
Rick Edgecombe Oct. 3, 2022, 4:56 p.m. UTC | #4
On Fri, 2022-09-30 at 20:41 +0700, Bagas Sanjaya wrote:
> On 9/30/22 20:33, Jonathan Corbet wrote:
> > >   CET introduces Shadow Stack and Indirect Branch Tracking.
> > > Shadow stack is
> > >   a secondary stack allocated from memory and cannot be directly
> > > modified by
> > > -applications. When executing a CALL instruction, the processor
> > > pushes the
> > > +applications. When executing a ``CALL`` instruction, the
> > > processor pushes the
> > 
> > Just to be clear, not everybody is fond of sprinkling lots of
> > ``literal
> > text`` throughout the documentation in this way.  Heavy use of it
> > will
> > certainly clutter the plain-text file and can be a net negative
> > overall.
> > 
> 
> Actually there is a trade-off between semantic correctness and
> plain-text clarity. With regard to inline code samples (like
> identifiers), I fall into the former camp. But when I'm reviewing
> patches for which the surrounding documentation falls into the latter
> camp (leaving code samples without markup), I can adapt to that style
> as long as it causes no warnings whatsoever.

Thanks. Unless anyone has any objections, I think I'll take all these
changes, except for the literal-izing of the instructions. They are not
really being used as code samples in this case.

Bagas, can you reply with your sign-off and I'll just apply it?
Kees Cook Oct. 3, 2022, 5:18 p.m. UTC | #5
On Thu, Sep 29, 2022 at 03:28:58PM -0700, Rick Edgecombe wrote:
> [...]
> +Overview
> +========
> +
> +Control-flow Enforcement Technology (CET) is term referring to several
> +related x86 processor features that provides protection against control
> +flow hijacking attacks. The HW feature itself can be set up to protect
> +both applications and the kernel. Only user-mode protection is implemented
> +in the 64-bit kernel.

This likely needs rewording, since it's not strictly true any more:
IBT is supported in kernel-mode now (CONFIG_X86_IBT).

> +CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is
> +a secondary stack allocated from memory and cannot be directly modified by
> +applications. When executing a CALL instruction, the processor pushes the
> +return address to both the normal stack and the shadow stack. Upon
> +function return, the processor pops the shadow stack copy and compares it
> +to the normal stack copy. If the two differ, the processor raises a
> +control-protection fault. Indirect branch tracking verifies indirect
> +CALL/JMP targets are intended as marked by the compiler with 'ENDBR'
> +opcodes. Not all CPU's have both Shadow Stack and Indirect Branch Tracking
> +and only Shadow Stack is currently supported in the kernel.
> +
> +The Kconfig options is X86_SHADOW_STACK, and it can be disabled with
> +the kernel parameter clearcpuid, like this: "clearcpuid=shstk".
> +
> +To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM v10.0.1
> +or later are required. To build a CET-enabled application, GLIBC v2.28 or
> +later is also required.
> +
> +At run time, /proc/cpuinfo shows CET features if the processor supports
> +CET.

Maybe call them out by name: shstk ibt

> +CET arch_prctl()'s
> +==================
> +
> +Elf features should be enabled by the loader using the below arch_prctl's.
> +
> +arch_prctl(ARCH_CET_ENABLE, unsigned int feature)
> +    Enable a single feature specified in 'feature'. Can only operate on
> +    one feature at a time.

Does this mean only 1 bit out of the 32 may be specified?

> +
> +arch_prctl(ARCH_CET_DISABLE, unsigned int feature)
> +    Disable features specified in 'feature'. Can only operate on
> +    one feature at a time.
> +
> +arch_prctl(ARCH_CET_LOCK, unsigned int features)
> +    Lock in features at their current enabled or disabled status.

How is the "features" argument processed here?

> [...]
> +Proc status
> +===========
> +To check if an application is actually running with shadow stack, the
> +user can read the /proc/$PID/arch_status. It will report "wrss" or
> +"shstk" depending on what is enabled.

TIL about "arch_status". :) Why is this a separate file? "status"
already has unique field names.

> +Fork
> +----
> +
> +The shadow stack's vma has VM_SHADOW_STACK flag set; its PTEs are required
> +to be read-only and dirty. When a shadow stack PTE is not RO and dirty, a
> +shadow access triggers a page fault with the shadow stack access bit set
> +in the page fault error code.
> +
> +When a task forks a child, its shadow stack PTEs are copied and both the
> +parent's and the child's shadow stack PTEs are cleared of the dirty bit.
> +Upon the next shadow stack access, the resulting shadow stack page fault
> +is handled by page copy/re-use.
> +
> +When a pthread child is created, the kernel allocates a new shadow stack
> +for the new thread.

Perhaps speak to the ASLR characteristics of the shstk here?

Also, it seems if there is a "Fork" section, there should be an "Exec"
section? I suspect it would be short: shstk is disabled when execve() is
called and must be re-enabled from userspace, yes?

-Kees
John Hubbard Oct. 3, 2022, 7:35 p.m. UTC | #6
On 9/29/22 20:41, Bagas Sanjaya wrote:
...
> The documentation above can be improved (both grammar and formatting):
> 
> ---- >8 ----
> 
> diff --git a/Documentation/x86/cet.rst b/Documentation/x86/cet.rst
> index 6b270a24ebc3a2..f691f7995cf088 100644
> --- a/Documentation/x86/cet.rst
> +++ b/Documentation/x86/cet.rst
> @@ -15,92 +15,101 @@ in the 64-bit kernel.
>   
>   CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is
>   a secondary stack allocated from memory and cannot be directly modified by
> -applications. When executing a CALL instruction, the processor pushes the
> +applications. When executing a ``CALL`` instruction, the processor pushes the

It's always a judgment call, as to whether to use something like ``CALL``
or just plain CALL. Here, I'd like to opine that the benefits of
``CALL`` are very small, whereas plain text in cet.rst has been made
significantly worse. So the result is, "this is not worth it".

The same is true of pretty much all of the other literalizing changes
below, IMHO.

Just so you have some additional input on this. I tend to spend time
fussing a lot (too much, yes) over readability issues, so this jumps
right out at me. :)

thanks,
Dave Hansen Oct. 3, 2022, 7:39 p.m. UTC | #7
On 10/3/22 12:35, John Hubbard wrote:
> It's always a judgment call, as to whether to use something like ``CALL``
> or just plain CALL. Here, I'd like to opine that the benefits of
> ``CALL`` are very small, whereas plain text in cet.rst has been made
> significantly worse. So the result is, "this is not worth it".

I'm definitely in this camp as well.  Unless the markup *really* adds to
readability, just leave it alone.
Rick Edgecombe Oct. 3, 2022, 7:46 p.m. UTC | #8
On Mon, 2022-10-03 at 10:18 -0700, Kees Cook wrote:
> On Thu, Sep 29, 2022 at 03:28:58PM -0700, Rick Edgecombe wrote:
> > [...]
> > +Overview
> > +========
> > +
> > +Control-flow Enforcement Technology (CET) is term referring to
> > several
> > +related x86 processor features that provides protection against
> > control
> > +flow hijacking attacks. The HW feature itself can be set up to
> > protect
> > +both applications and the kernel. Only user-mode protection is
> > implemented
> > +in the 64-bit kernel.
> 
> This likely needs rewording, since it's not strictly true any more:
> IBT is supported in kernel-mode now (CONFIG_X86_IBT).

Yep, thanks.

> 
> > +CET introduces Shadow Stack and Indirect Branch Tracking. Shadow
> > stack is
> > +a secondary stack allocated from memory and cannot be directly
> > modified by
> > +applications. When executing a CALL instruction, the processor
> > pushes the
> > +return address to both the normal stack and the shadow stack. Upon
> > +function return, the processor pops the shadow stack copy and
> > compares it
> > +to the normal stack copy. If the two differ, the processor raises
> > a
> > +control-protection fault. Indirect branch tracking verifies
> > indirect
> > +CALL/JMP targets are intended as marked by the compiler with
> > 'ENDBR'
> > +opcodes. Not all CPU's have both Shadow Stack and Indirect Branch
> > Tracking
> > +and only Shadow Stack is currently supported in the kernel.
> > +
> > +The Kconfig options is X86_SHADOW_STACK, and it can be disabled
> > with
> > +the kernel parameter clearcpuid, like this: "clearcpuid=shstk".
> > +
> > +To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM
> > v10.0.1
> > +or later are required. To build a CET-enabled application, GLIBC
> > v2.28 or
> > +later is also required.
> > +
> > +At run time, /proc/cpuinfo shows CET features if the processor
> > supports
> > +CET.
> 
> Maybe call them out by name: shstk ibt

Ok.

> 
> > +CET arch_prctl()'s
> > +==================
> > +
> > +Elf features should be enabled by the loader using the below
> > arch_prctl's.
> > +
> > +arch_prctl(ARCH_CET_ENABLE, unsigned int feature)
> > +    Enable a single feature specified in 'feature'. Can only
> > operate on
> > +    one feature at a time.
> 
> Does this mean only 1 bit out of the 32 may be specified?

Yes, exactly.

> 
> > +
> > +arch_prctl(ARCH_CET_DISABLE, unsigned int feature)
> > +    Disable features specified in 'feature'. Can only operate on
> > +    one feature at a time.
> > +
> > +arch_prctl(ARCH_CET_LOCK, unsigned int features)
> > +    Lock in features at their current enabled or disabled status.
> 
> How is the "features" argument processed here?

Yes, this should have more info. The kernel keeps a mask of features
that are "locked". The mask passed in is ORed with the existing value, so
any bits set here cannot be enabled or disabled afterwards. Bits unset in
the mask passed are ignored.
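
To make the semantics above concrete, here is a rough userspace model (not kernel code) of how the single-bit rule and the lock mask could interact. The bit values, the class and method names, and the ordering of the EINVAL and EPERM checks are assumptions made for illustration; only the one-feature-at-a-time rule, the OR-ing of the lock mask, and the error codes themselves come from this thread.

```python
import errno

# Hypothetical bit assignments, for illustration only. The real values
# live in the kernel uapi headers and are not given in this thread.
CET_SHSTK = 1 << 0
CET_WRSS = 1 << 1
KNOWN_FEATURES = CET_SHSTK | CET_WRSS


class CetModel:
    """Toy model of the enable/disable/lock behaviour described above."""

    def __init__(self):
        self.enabled = 0
        self.locked = 0

    def _check(self, feature):
        # Enable/disable accept exactly one known bit.
        single_bit = feature != 0 and (feature & (feature - 1)) == 0
        if not single_bit or feature & ~KNOWN_FEATURES:
            return -errno.EINVAL
        # A locked feature can no longer be changed.
        if feature & self.locked:
            return -errno.EPERM
        return 0

    def enable(self, feature):
        err = self._check(feature)
        if err == 0:
            self.enabled |= feature
        return err

    def disable(self, feature):
        err = self._check(feature)
        if err == 0:
            self.enabled &= ~feature
        return err

    def lock(self, features):
        # The passed mask is ORed into the existing lock mask, so bits
        # that are unset in 'features' leave the lock state untouched.
        self.locked |= features
        return 0


if __name__ == "__main__":
    m = CetModel()
    print(m.enable(CET_SHSTK))             # accepted
    print(m.enable(CET_SHSTK | CET_WRSS))  # rejected: more than one bit
    m.lock(CET_SHSTK)
    print(m.disable(CET_SHSTK))            # rejected: feature is locked
```

Passing a mask of 0 to lock() changes nothing, which matches the "unset bits are ignored" rule.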

> 
> > [...]
> > +Proc status
> > +===========
> > +To check if an application is actually running with shadow stack,
> > the
> > +user can read the /proc/$PID/arch_status. It will report "wrss" or
> > +"shstk" depending on what is enabled.
> 
> TIL about "arch_status". :) Why is this a separate file? "status"
> already has unique field names.

It looks like "status" only has arch-agnostic feature status today.
Maybe that's the reason? CET seems to fit there though.
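
For anyone wanting to script a check against the file, a minimal Python sketch follows. The exact layout of arch_status is not spelled out in this thread, so the parser only assumes that the words shstk and wrss appear as separate tokens; the function names are made up for illustration.

```python
from pathlib import Path

CET_WORDS = {"shstk", "wrss"}


def parse_features(text):
    """Return which of 'shstk'/'wrss' occur as tokens in the given text."""
    # Assumption: tokens are separated by whitespace, ':' or ','.
    tokens = text.replace(":", " ").replace(",", " ").split()
    return set(tokens) & CET_WORDS


def cet_features(pid="self"):
    """Read /proc/<pid>/arch_status and report the CET words found in it."""
    try:
        text = Path(f"/proc/{pid}/arch_status").read_text()
    except OSError:
        # The file does not exist on every kernel or architecture.
        return set()
    return parse_features(text)


if __name__ == "__main__":
    print(sorted(cet_features()))
```

An empty result means either that nothing is enabled for the task or that the file is not available, so the two cases need to be told apart separately if that matters.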

> 
> > +Fork
> > +----
> > +
> > +The shadow stack's vma has VM_SHADOW_STACK flag set; its PTEs are
> > required
> > +to be read-only and dirty. When a shadow stack PTE is not RO and
> > dirty, a
> > +shadow access triggers a page fault with the shadow stack access
> > bit set
> > +in the page fault error code.
> > +
> > +When a task forks a child, its shadow stack PTEs are copied and
> > both the
> > +parent's and the child's shadow stack PTEs are cleared of the
> > dirty bit.
> > +Upon the next shadow stack access, the resulting shadow stack page
> > fault
> > +is handled by page copy/re-use.
> > +
> > +When a pthread child is created, the kernel allocates a new shadow
> > stack
> > +for the new thread.
> 
> Perhaps speak to the ASLR characteristics of the shstk here?

It behaves just like mmap(). I can add some info.

> 
> Also, it seems if there is a "Fork" section, there should be an
> "Exec"
> section? I suspect it would be short: shstk is disabled when execve()
> is
> called and must be re-enabled from userspace, yes?

Sure, I can add some info.
Bagas Sanjaya Oct. 4, 2022, 2:13 a.m. UTC | #9
On 10/4/22 02:35, John Hubbard wrote:
> It's always a judgment call, as to whether to use something like ``CALL``
> or just plain CALL. Here, I'd like to opine that the benefits of
> ``CALL`` are very small, whereas plain text in cet.rst has been made
> significantly worse. So the result is, "this is not worth it".
> 

Hmm, seems like neither CALL nor ``CALL`` is better, right?
Bagas Sanjaya Oct. 4, 2022, 2:16 a.m. UTC | #10
On 10/3/22 23:56, Edgecombe, Rick P wrote:
> On Fri, 2022-09-30 at 20:41 +0700, Bagas Sanjaya wrote:
>> On 9/30/22 20:33, Jonathan Corbet wrote:
>>>>   CET introduces Shadow Stack and Indirect Branch Tracking.
>>>> Shadow stack is
>>>>   a secondary stack allocated from memory and cannot be directly
>>>> modified by
>>>> -applications. When executing a CALL instruction, the processor
>>>> pushes the
>>>> +applications. When executing a ``CALL`` instruction, the
>>>> processor pushes the
>>>
>>> Just to be clear, not everybody is fond of sprinkling lots of
>>> ``literal
>>> text`` throughout the documentation in this way.  Heavy use of it
>>> will
>>> certainly clutter the plain-text file and can be a net negative
>>> overall.
>>>
>>
>> Actually there is a trade-off between semantic correctness and
>> plain-text clarity. With regard to inline code samples (like
>> identifiers), I fall into the former camp. But when I'm reviewing
>> patches for which the surrounding documentation falls into the latter
>> camp (leaving code samples without markup), I can adapt to that style
>> as long as it causes no warnings whatsoever.
> 
> Thanks. Unless anyone has any objections, I think I'll take all these
> changes, except for the literal-izing of the instructions. They are not
> really being used as code samples in this case.
> 
> Bagas, can you reply with your sign-off and I'll just apply it?

OK, here goes...

Signed-off-by: Bagas Sanjaya <bagasdotme@gmail.com>
Andrew Cooper Oct. 5, 2022, 12:02 a.m. UTC | #11
On 29/09/2022 23:28, Rick Edgecombe wrote:
> diff --git a/Documentation/x86/cet.rst b/Documentation/x86/cet.rst
> new file mode 100644
> index 000000000000..4a0dfb6830f9
> --- /dev/null
> +++ b/Documentation/x86/cet.rst
> @@ -0,0 +1,140 @@
> +.. SPDX-License-Identifier: GPL-2.0
> +
> +=========================================
> +Control-flow Enforcement Technology (CET)
> +=========================================
> +
> +Overview
> +========
> +
> +Control-flow Enforcement Technology (CET) is term referring to several
> +related x86 processor features that provides protection against control
> +flow hijacking attacks. The HW feature itself can be set up to protect
> +both applications and the kernel. Only user-mode protection is implemented
> +in the 64-bit kernel.
> +
> +CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is
> +a secondary stack allocated from memory and cannot be directly modified by
> +applications. When executing a CALL instruction, the processor pushes the
> +return address to both the normal stack and the shadow stack. Upon
> +function return, the processor pops the shadow stack copy and compares it
> +to the normal stack copy. If the two differ, the processor raises a
> +control-protection fault. Indirect branch tracking verifies indirect
> +CALL/JMP targets are intended as marked by the compiler with 'ENDBR'
> +opcodes. Not all CPU's have both Shadow Stack and Indirect Branch Tracking
> +and only Shadow Stack is currently supported in the kernel.

This paragraph is stale, isn't it?

AIUI, by the end of this series, what is supported is in-kernel
self-protection using CET-IBT, and userspace shadow stacks.

It is probably worth keeping the implementation-agnostic bits separate
from the "what is currently supported" matrix.  I'm not certain if it's
worth splitting into cet.rst, cet-kernel.rst and cet-user.rst at this
point, but it's something to consider.

> +The Kconfig option is X86_SHADOW_STACK, and it can be disabled with
> +the kernel parameter clearcpuid, like this: "clearcpuid=shstk".

What about namespacing?  For the CPUID features themselves, yes they're
shstk and ibt.

But for the Kconfig options, the user and kernel implementations are
wildly different for both shstk and ibt.  Are they going to want to
share the same Kconfig option from the get-go?

Independent of the Kconfig symbol, user and kernel have separate
enablement criteria.  e.g. kernel shstk is likely going to be dependent
on the FRED feature, and simply looking at `shstk` in /proc/cpuinfo
doesn't necessarily tell you all you want to know.
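
As a rough illustration of that last point, here is a minimal Python sketch that pulls the `shstk` and `ibt` flag names out of /proc/cpuinfo-style text. The helper name `cet_cpu_flags` is hypothetical; only the two flag names come from this discussion. A flag being present says only that the CPU advertises the feature, not that the user or kernel enablement criteria are met.

```python
def cet_cpu_flags(cpuinfo_text):
    """Report which CET-related CPUID flags appear in /proc/cpuinfo text.

    Hypothetical helper for illustration; it only looks at "flags" lines
    and does not say anything about kernel or userspace enablement.
    """
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return {name: name in flags for name in ("shstk", "ibt")}
```

On a Linux system this could be fed with `open("/proc/cpuinfo").read()`.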

> +
> +To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM v10.0.1

What are the other dependencies here?

In principle shstk only needs assembler support for the new
instructions, and that's Binutils 2.29 / LLVM 6 from my notes.

It's IBT which needs compiler support (and then, even only kernel IBT),
and that work is already done.

> +or later are required. To build a CET-enabled application, GLIBC v2.28 or
> +later is also required.
> +
> +At run time, /proc/cpuinfo shows CET features if the processor supports
> +CET.

Probably helpful to state what these are.

> +
> +Application Enabling
> +====================
> +
> +An application's CET capability is marked in its ELF header and can be

Technically it's in an ELF note, not the ELF header.
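
For illustration, a small Python sketch that checks `readelf -n` output for the SHSTK property, mirroring the `grep` in the quoted patch. The helper name `has_shstk_note` is hypothetical, and the spelling of the properties line is assumed from the sample output in the patch; real output may list several features on one line, which the sketch allows for.

```python
def has_shstk_note(readelf_notes):
    """Return True if `readelf -n` text lists SHSTK as an x86 feature.

    Hypothetical helper; the "x86 feature:" line format is an assumption
    taken from the sample output quoted in the patch.
    """
    for line in readelf_notes.splitlines():
        _, sep, features = line.partition("x86 feature:")
        if sep:
            names = [name.strip() for name in features.split(",")]
            if "SHSTK" in names:
                return True
    return False
```

The input text could come from `subprocess.run(["readelf", "-n", path], capture_output=True, text=True).stdout`.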

~Andrew
Peter Zijlstra Oct. 5, 2022, 9:10 a.m. UTC | #12
On Mon, Oct 03, 2022 at 04:56:10PM +0000, Edgecombe, Rick P wrote:
> Thanks. Unless anyone has any objections

Well, I'll object. I still feel rst should burn in hell. Plain text FTW.
Bagas Sanjaya Oct. 5, 2022, 9:25 a.m. UTC | #13
On 10/5/22 16:10, Peter Zijlstra wrote:
> On Mon, Oct 03, 2022 at 04:56:10PM +0000, Edgecombe, Rick P wrote:
>> Thanks. Unless anyone has any objections
> 
> Well, I'll object. I still feel rst should burn in hell. Plain text FTW.
> 
> 

.txt maybe?
Peter Zijlstra Oct. 5, 2022, 9:46 a.m. UTC | #14
On Wed, Oct 05, 2022 at 04:25:39PM +0700, Bagas Sanjaya wrote:
> On 10/5/22 16:10, Peter Zijlstra wrote:
> > On Mon, Oct 03, 2022 at 04:56:10PM +0000, Edgecombe, Rick P wrote:
> >> Thanks. Unless anyone has any objections
> > 
> > Well, I'll object. I still feel rst should burn in hell. Plain text FTW.
> > 
> > 
> 
> .txt maybe?

We had that, but some idiots went and converted the lot to .rst :-(
Florian Weimer Oct. 10, 2022, 12:19 p.m. UTC | #15
* Rick Edgecombe:

> +To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM v10.0.1
> +or later are required. To build a CET-enabled application, GLIBC v2.28 or
> +later is also required.

Uhm, I think we are using binutils 2.30 with extra fixes.  I hope that
these binaries are still valid.

More importantly, glibc needs to be configured with --enable-cet
explicitly (unless the compiler defaults to CET).  The default glibc
build with a default GCC will produce dynamically-linked executables
that disable CET (when running on later/differently configured glibc
builds).  The statically linked object files are not marked up for CET
in that case.

I think the goal is to support the new kernel interface for actually
switching on SHSTK in glibc 2.37.  But at that point, hopefully all
those existing binaries can start enjoying the SHSTK benefits.

Thanks,
Florian
Rick Edgecombe Oct. 10, 2022, 4:44 p.m. UTC | #16
On Mon, 2022-10-10 at 14:19 +0200, Florian Weimer wrote:
> Uhm, I think we are using binutils 2.30 with extra fixes.  I hope
> that
> these binaries are still valid.

Yea, you're right. Andrew Cooper pointed out it has been supported
since 2.29, so 2.30 should be fine.

> 
> More importantly, glibc needs to be configured with --enable-cet
> explicitly (unless the compiler defaults to CET).  The default glibc
> build with a default GCC will produce dynamically-linked executables
> that disable CET (when running on later/differently configured glibc
> builds).  The statically linked object files are not marked up for
> CET
> in that case.

Thanks, that's a good point. I'll add a blurb about glibc needing to be
compiled with CET support.

> 
> I think the goal is to support the new kernel interface for actually
> switching on SHSTK in glibc 2.37.  But at that point, hopefully all
> those existing binaries can start enjoying the SHSTK benefits.

Can you share more about this plan? HJ was previously planning to wait
until the kernel support was upstream before making any more glibc
changes. Hopefully this will be in time for that, but I'd really rather
not repeat what happened last time where we had to design the kernel
interface around not breaking old glibc's with mismatched CET
enablement.

What did you think of the proposal to disable existing binaries and
start from scratch? Elaborated in the coverletter in the section
"Compatibility of Existing Binaries/Enabling Interface".
H.J. Lu Oct. 10, 2022, 4:51 p.m. UTC | #17
On Mon, Oct 10, 2022 at 9:44 AM Edgecombe, Rick P
<rick.p.edgecombe@intel.com> wrote:
>
> On Mon, 2022-10-10 at 14:19 +0200, Florian Weimer wrote:
> > Uhm, I think we are using binutils 2.30 with extra fixes.  I hope
> > that
> > these binaries are still valid.
>
> Yea, you're right. Andrew Cooper pointed out it has been supported
> since 2.29, so 2.30 should be fine.
>
> >
> > More importantly, glibc needs to be configured with --enable-cet
> > explicitly (unless the compiler defaults to CET).  The default glibc
> > build with a default GCC will produce dynamically-linked executables
> > that disable CET (when running on later/differently configured glibc
> > builds).  The statically linked object files are not marked up for
> > CET
> > in that case.
>
> Thanks, that's a good point. I'll add a blurb about glibc needing to be
> compiled with CET support.
>
> >
> > I think the goal is to support the new kernel interface for actually
> > switching on SHSTK in glibc 2.37.  But at that point, hopefully all
> > those existing binaries can start enjoying the SHSTK benefits.
>
> Can you share more about this plan? HJ was previously planning to wait
> until the kernel support was upstream before making any more glibc
> changes. Hopefully this will be in time for that, but I'd really rather
> not repeat what happened last time where we had to design the kernel
> interface around not breaking old glibc's with mismatched CET
> enablement.
>
> What did you think of the proposal to disable existing binaries and
> start from scratch? Elaborated in the coverletter in the section
> "Compatibility of Existing Binaries/Enabling Interface".

My current glibc plan is that the kernel won't enable CET automatically
and glibc will issue a syscall to enable CET at early startup time.  All
existing CET-enabled dynamic executables will have CET enabled
under the CET kernel and the updated CET glibc.
Florian Weimer Oct. 12, 2022, 12:29 p.m. UTC | #18
* Rick P. Edgecombe:

>> I think the goal is to support the new kernel interface for actually
>> switching on SHSTK in glibc 2.37.  But at that point, hopefully all
>> those existing binaries can start enjoying the SHSTK benefits.
>
> Can you share more about this plan? HJ was previously planning to wait
> until the kernel support was upstream before making any more glibc
> changes. Hopefully this will be in time for that, but I'd really rather
> not repeat what happened last time where we had to design the kernel
> interface around not breaking old glibc's with mismatched CET
> enablement.

You're still doing that (keeping that gap in this constant), and this is
appreciated and very much necessary.

> What did you think of the proposal to disable existing binaries and
> start from scratch? Elaborated in the coverletter in the section
> "Compatibility of Existing Binaries/Enabling Interface".

The ABI was finalized around four years ago, and we have shipped several
Fedora and Red Hat Enterprise Linux versions with it.  Other
distributions did as well.  It's a bit late to make changes now, and
certainly not for such trivialities.  In the case of the IBT ABI, it may
be tempting to start over in a less trivial way, to radically reduce the
amount of ENDBR instructions.  But that doesn't concern SHSTK, and
there's no actual implementation anyway.

But as H.J. implied, you would have to do rather nasty things in the
kernel to prevent us from achieving ABI compatibility in userspace, like
parsing property notes on the main executable and disabling the new
arch_prctl calls if you see something there that you don't like. 8-)
Of course no one is going to implement that.

(We are fine with swapping out glibc and its dynamic loader to enable
CET with the appropriate kernel mechanism, but we wouldn't want to
change the way all other binaries are marked up.)

Thanks,
Florian
Dave Hansen Oct. 12, 2022, 3:59 p.m. UTC | #19
On 10/12/22 05:29, Florian Weimer wrote:
>> What did you think of the proposal to disable existing binaries and
>> start from scratch? Elaborated in the coverletter in the section
>> "Compatibility of Existing Binaries/Enabling Interface".
> The ABI was finalized around four years ago, and we have shipped several
> Fedora and Red Hat Enterprise Linux versions with it.  Other
> distributions did as well.  It's a bit late to make changes now, and
> certainly not for such trivialities. 

Just to be clear: You're saying that a user/kernel ABI was "finalized"
by glibc shipping the user side of it, before there being an upstream
kernel implementation?
Florian Weimer Oct. 12, 2022, 4:54 p.m. UTC | #20
* Dave Hansen:

> On 10/12/22 05:29, Florian Weimer wrote:
>>> What did you think of the proposal to disable existing binaries and
>>> start from scratch? Elaborated in the coverletter in the section
>>> "Compatibility of Existing Binaries/Enabling Interface".
>> The ABI was finalized around four years ago, and we have shipped several
>> Fedora and Red Hat Enterprise Linux versions with it.  Other
>> distributions did as well.  It's a bit late to make changes now, and
>> certainly not for such trivialities. 
>
> Just to be clear: You're saying that a user/kernel ABI was "finalized"
> by glibc shipping the user side of it, before there being an upstream
> kernel implementation?

Sorry for being unclear.  I was referring to the x86-64 ELF psABI
supplement for CET, not the kernel/userspace interface, which still does
not exist in its final form as of today, as far as I understand it.

Thanks,
Florian
Rick Edgecombe Oct. 13, 2022, 9:28 p.m. UTC | #21
On Wed, 2022-10-12 at 14:29 +0200, Florian Weimer wrote:
> The ABI was finalized around four years ago, and we have shipped
> several
> Fedora and Red Hat Enterprise Linux versions with it.  Other
> distributions did as well.  It's a bit late to make changes now, and
> certainly not for such trivialities.  In the case of the IBT ABI, it
> may
> be tempting to start over in a less trivial way, to radically reduce
> the
> amount of ENDBR instructions.  But that doesn't concern SHSTK, and
> there's no actual implementation anyway.
> 
> But as H.J. implied, you would have to do rather nasty things in the
> kernel to prevent us from achieving ABI compatibility in userspace,
> like
> parsing property notes on the main executable and disabling the new
> arch_prctl calls if you see something there that you don't like. 8-)
> Of course no one is going to implement that.
> 
> (We are fine with swapping out glibc and its dynamic loader to enable
> CET with the appropriate kernel mechanism, but we wouldn't want to
> change the way all other binaries are marked up.)

So we have these compatibility issues with existing binaries. We know
some apps are totally broken. It sounds like you are proposing to
ignore this and let people who hit the issues work through it
themselves. This was also proposed by other glibc developers as a
solution for past CET compatibility issues that broke boot on kernel
upgrade. I have to say, as the person pushing these patches, I’m
uncomfortable with this approach. I don’t think users will like the
results. Basically, do they want to upgrade and run a bunch of untested
integration with known failures? I also don’t want to get this feature
reverted and I’m not exactly sure how this scenario would be taken.

But I hear the point about it not being ideal to abandon the existing
CET userspace. I think there is also a point about how userspace chose
to do this optimistic and early wide enabling, even if it was a bad
idea, and so how much should the kernel try to save userspace from
itself. So what do you think about this instead:

The current psABI spec talks about the binary being compatible with
shadow stack. It doesn’t say much about what should happen after the
loader. Since the greater ecosystem has used this bit with a more
cavalier attitude, glibc could treat it as a request for a warn and
continue mode. In the meantime we could have a new bit shstk_strict,
that requests behavior like these patches implement, and kills the
process on violation. Glibc/tools could add support for this strict bit
and anyone that wants to more carefully compile with it could finally
get shadow stack today. Then the implementation of the warn and
continue mode could follow that, and glibc could map the original shstk
bit to that kernel mode. So the old binaries would get there
eventually, which is better than the continuing nothing they have
today.

And speaking of having nothing today, there are people that really want
to use shadow stack and do not care at all about having CET support for
existing binaries. Neither glibc nor ELF bits are required to use kernel
shadow stack support. So if it comes to it, I don’t want to hold
support back for other users because the elf note bit enabling path
grew some issues.

Please let me know about what you think of that plan.
H.J. Lu Oct. 13, 2022, 10:15 p.m. UTC | #22
On Thu, Oct 13, 2022 at 2:28 PM Edgecombe, Rick P
<rick.p.edgecombe@intel.com> wrote:
>
> On Wed, 2022-10-12 at 14:29 +0200, Florian Weimer wrote:
> > The ABI was finalized around four years ago, and we have shipped
> > several
> > Fedora and Red Hat Enterprise Linux versions with it.  Other
> > distributions did as well.  It's a bit late to make changes now, and
> > certainly not for such trivialities.  In the case of the IBT ABI, it
> > may
> > be tempting to start over in a less trivial way, to radically reduce
> > the
> > amount of ENDBR instructions.  But that doesn't concern SHSTK, and
> > there's no actual implementation anyway.
> >
> > But as H.J. implied, you would have to do rather nasty things in the
> > kernel to prevent us from achieving ABI compatibility in userspace,
> > like
> > parsing property notes on the main executable and disabling the new
> > arch_prctl calls if you see something there that you don't like. 8-)
> > Of course no one is going to implement that.
> >
> > (We are fine with swapping out glibc and its dynamic loader to enable
> > CET with the appropriate kernel mechanism, but we wouldn't want to
> > change the way all other binaries are marked up.)
>
> So we have these compatibility issues with existing binaries. We know
> some apps are totally broken. It sounds like you are proposing to
> ignore this and let people who hit the issues work through it
> themselves. This was also proposed by other glibc developers as a
> solution for past CET compatibility issues that broke boot on kernel
> upgrade. I have to say, as the person pushing these patches, I’m
> uncomfortable with this approach. I don’t think users will like the
> results. Basically, do they want to upgrade and run a bunch of untested
> integration with known failures? I also don’t want to get this feature
> reverted and I’m not exactly sure how this scenario would be taken.
>
> But I hear the point about it not being ideal to abandon the existing
> CET userspace. I think there is also a point about how userspace chose
> to do this optimistic and early wide enabling, even if it was a bad
> idea, and so how much should the kernel try to save userspace from
> itself. So what do you think about this instead:
>
> The current psABI spec talks about the binary being compatible with
> shadow stack. It doesn’t say much about what should happen after the
> loader. Since the greater ecosystem has used this bit with a more
> cavalier attitude, glibc could treat it as a request for a warn and
> continue mode. In the meantime we could have a new bit shstk_strict,
> that requests behavior like these patches implement, and kills the
> process on violation. Glibc/tools could add support for this strict bit
> and anyone that wants to more carefully compile with it could finally
> get shadow stack today. Then the implementation of the warn and
> continue mode could follow that, and glibc could map the original shstk
> bit to that kernel mode. So the old binaries would get there
> eventually, which is better than the continuing nothing they have
> today.
>
> And speaking of having nothing today, there are people that really want
> to use shadow stack and do not care at all about having CET support for
> existing binaries. Neither glibc nor ELF bits are required to use kernel
> shadow stack support. So if it comes to it, I don’t want to hold
> support back for other users because the elf note bit enabling path
> grew some issues.
>
> Please let me know about what you think of that plan.

The kernel CET description

+The kernel does not process these applications directly. Applications must
+enable them using the interface described in section 4. Typically this
+would be done in dynamic loader or static runtime objects, as is the case
+in glibc.

may leave an impression that each application needs to use the kernel
interface to enable CET itself.  This is an option.  But the updated glibc
will enable CET automatically on behalf of the CET enabled application.
If glibc isn't updated to use the new CET kernel interface, the existing
CET-enabled binaries will run correctly under the new CET-enabled
kernel, just without CET enabled.
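
Whoever ends up issuing the calls (the application itself or glibc on its behalf), the enable/disable/lock rules the patch documents for this interface can be modelled roughly as below. This is a toy Python model, not kernel code: the class name, the string feature names, and the order in which errors are checked are assumptions, and the errno returned when WRSS is requested without shadow stack is a guess because the patch text does not say.

```python
import errno

SHSTK = "shstk"
WRSS = "wrss"
KNOWN = {SHSTK, WRSS}


class CetModel:
    """Toy model of the ENABLE/DISABLE/LOCK rules described in cet.rst."""

    def __init__(self, supported=KNOWN):
        self.supported = set(supported)  # features the "hardware" offers
        self.enabled = set()
        self.locked = set()

    def _check(self, feature):
        # Assumption: the relative order of these checks is not specified.
        if feature not in KNOWN:
            return -errno.EINVAL
        if feature not in self.supported:
            return -errno.EOPNOTSUPP
        if feature in self.locked:
            return -errno.EPERM
        return 0

    def enable(self, feature):
        err = self._check(feature)
        if err:
            return err
        if feature == WRSS and SHSTK not in self.enabled:
            # Assumption: the text only says WRSS needs shadow stack,
            # not which errno is returned when it is missing.
            return -errno.EINVAL
        self.enabled.add(feature)
        return 0

    def disable(self, feature):
        err = self._check(feature)
        if err:
            return err
        self.enabled.discard(feature)
        if feature == SHSTK:
            # WRSS is automatically disabled along with shadow stack.
            self.enabled.discard(WRSS)
        return 0

    def lock(self, features):
        if not set(features) <= KNOWN:
            return -errno.EINVAL
        self.locked |= set(features)
        return 0
```

A loader following the plan above would, in this model, call `enable(SHSTK)` early and then `lock([SHSTK])`; an old loader that never calls `enable()` simply leaves the set of enabled features empty.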
Rick Edgecombe Oct. 26, 2022, 9:59 p.m. UTC | #23
On Thu, 2022-10-13 at 14:28 -0700, Rick Edgecombe wrote:
> In the meantime we could have a new bit shstk_strict,
> that requests behavior like these patches implement, and kills the
> process on violation. Glibc/tools could add support for this strict
> bit
> and anyone that wants to more carefully compile with it could finally
> get shadow stack today. Then the implementation of the warn and
> continue mode could follow that, and glibc could map the original
> shstk
> bit to that kernel mode. So the old binaries would get there
> eventually, which is better than the continuing nothing they have
> today.

Hi,

Any thoughts on this proposal?

Thanks,

Rick

Patch

diff --git a/Documentation/x86/cet.rst b/Documentation/x86/cet.rst
new file mode 100644
index 000000000000..4a0dfb6830f9
--- /dev/null
+++ b/Documentation/x86/cet.rst
@@ -0,0 +1,140 @@ 
+.. SPDX-License-Identifier: GPL-2.0
+
+=========================================
+Control-flow Enforcement Technology (CET)
+=========================================
+
+Overview
+========
+
+Control-flow Enforcement Technology (CET) is a term referring to several
+related x86 processor features that provide protection against control
+flow hijacking attacks. The HW feature itself can be set up to protect
+both applications and the kernel. Only user-mode protection is implemented
+in the 64-bit kernel.
+
+CET introduces Shadow Stack and Indirect Branch Tracking. Shadow stack is
+a secondary stack allocated from memory and cannot be directly modified by
+applications. When executing a CALL instruction, the processor pushes the
+return address to both the normal stack and the shadow stack. Upon
+function return, the processor pops the shadow stack copy and compares it
+to the normal stack copy. If the two differ, the processor raises a
+control-protection fault. Indirect branch tracking verifies indirect
+CALL/JMP targets are intended as marked by the compiler with 'ENDBR'
+opcodes. Not all CPUs have both Shadow Stack and Indirect Branch Tracking
+and only Shadow Stack is currently supported in the kernel.
+
+The Kconfig option is X86_SHADOW_STACK, and it can be disabled with
+the kernel parameter clearcpuid, like this: "clearcpuid=shstk".
+
+To build a CET-enabled kernel, Binutils v2.31 and GCC v8.1 or LLVM v10.0.1
+or later are required. To build a CET-enabled application, GLIBC v2.28 or
+later is also required.
+
+At run time, /proc/cpuinfo shows CET features if the processor supports
+CET.
+
+Application Enabling
+====================
+
+An application's CET capability is marked in its ELF header and can be
+verified from readelf/llvm-readelf output:
+
+    readelf -n <application> | grep -a SHSTK
+        properties: x86 feature: SHSTK
+
+The kernel does not process these applications directly. Applications must
+enable them using the interface described in section 4. Typically this
+would be done in dynamic loader or static runtime objects, as is the case
+in glibc.
+
+Backward Compatibility
+======================
+
+GLIBC provides a few CET tunables via the GLIBC_TUNABLES environment
+variable:
+
+GLIBC_TUNABLES=glibc.tune.hwcaps=-SHSTK,-WRSS
+    Turn off SHSTK/WRSS.
+
+GLIBC_TUNABLES=glibc.tune.x86_shstk=<on, permissive>
+    This controls how dlopen() handles SHSTK legacy libraries::
+
+        on         - continue with SHSTK enabled;
+        permissive - continue with SHSTK off.
+
+Details can be found in the GLIBC manual pages.
+
+CET arch_prctl()'s
+==================
+
+ELF features should be enabled by the loader using the arch_prctl() calls below.
+
+arch_prctl(ARCH_CET_ENABLE, unsigned int feature)
+    Enable a single feature specified in 'feature'. Can only operate on
+    one feature at a time.
+
+arch_prctl(ARCH_CET_DISABLE, unsigned int feature)
+    Disable features specified in 'feature'. Can only operate on
+    one feature at a time.
+
+arch_prctl(ARCH_CET_LOCK, unsigned int features)
+    Lock in features at their current enabled or disabled status.
+
+The return values are as follows:
+    On success, return 0. On error, errno can be::
+
+        -EPERM if any of the passed features are locked.
+        -EOPNOTSUPP if the feature is not supported by the hardware or
+         disabled by kernel parameter.
+        -EINVAL for invalid arguments (non-existent feature, etc.)
+
+Currently shadow stack and WRSS are supported via this interface. WRSS
+can only be enabled with shadow stack, and is automatically disabled
+if shadow stack is disabled.
+
+Proc status
+===========
+To check if an application is actually running with shadow stack, the
+user can read /proc/$PID/arch_status. It will report "wrss" or
+"shstk" depending on what is enabled.
+
+The implementation of the Shadow Stack
+======================================
+
+Shadow Stack size
+-----------------
+
+A task's shadow stack is allocated from memory to a fixed size of
+MIN(RLIMIT_STACK, 4 GB). In other words, the shadow stack is allocated to
+the maximum size of the normal stack, but capped to 4 GB. However, because
+a compat-mode application's address space is smaller, the shadow stack
+size for each of its threads is MIN(1/4 RLIMIT_STACK, 4 GB).
+
+Signal
+------
+
+By default, the main program and its signal handlers use the same shadow
+stack. Because the shadow stack stores only return addresses, a large
+shadow stack covers the condition that both the program stack and the
+signal alternate stack run out.
+
+The kernel creates a restore token for the shadow stack and pushes the
+restorer address to the shadow stack. It then verifies that token when
+restoring from the signal handler.
+
+Fork
+----
+
+The shadow stack's vma has VM_SHADOW_STACK flag set; its PTEs are required
+to be read-only and dirty. When a shadow stack PTE is not RO and dirty, a
+shadow access triggers a page fault with the shadow stack access bit set
+in the page fault error code.
+
+When a task forks a child, its shadow stack PTEs are copied and both the
+parent's and the child's shadow stack PTEs are cleared of the dirty bit.
+Upon the next shadow stack access, the resulting shadow stack page fault
+is handled by page copy/re-use.
+
+When a pthread child is created, the kernel allocates a new shadow stack
+for the new thread.
diff --git a/Documentation/x86/index.rst b/Documentation/x86/index.rst
index c73d133fd37c..9ac03055c4b5 100644
--- a/Documentation/x86/index.rst
+++ b/Documentation/x86/index.rst
@@ -22,6 +22,7 @@  x86-specific Documentation
    mtrr
    pat
    intel-hfi
+   cet
    iommu
    intel_txt
    amd-memory-encryption
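
The sizing rule from the "Shadow Stack size" section of the cet.rst patch above can be worked through with a short Python sketch. It follows the documented formula literally; the function name `shadow_stack_size` and the treatment of an unlimited RLIMIT_STACK as "use the 4 GB cap" are assumptions, not taken from the kernel code.

```python
import resource

GIB = 1 << 30


def shadow_stack_size(rlimit_stack, compat=False):
    """Size in bytes per the documented MIN(RLIMIT_STACK, 4 GB) rule.

    Hypothetical helper: compat-mode tasks use a quarter of RLIMIT_STACK,
    as the text states. An unlimited rlimit is assumed to hit the cap.
    """
    cap = 4 * GIB
    if rlimit_stack == resource.RLIM_INFINITY:
        return cap
    if compat:
        rlimit_stack //= 4
    return min(rlimit_stack, cap)
```

For the current process, the soft limit can be obtained with `resource.getrlimit(resource.RLIMIT_STACK)[0]`; with a common 8 MiB limit the sketch gives 8 MiB, or 2 MiB in compat mode.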