mbox series

[RFC,RFT,0/5] fork: Support shadow stacks in clone3()

Message ID 20231023-clone3-shadow-stack-v1-0-d867d0b5d4d0@kernel.org (mailing list archive)
Headers show
Series fork: Support shadow stacks in clone3() | expand

Message

Mark Brown Oct. 23, 2023, 1:20 p.m. UTC
The kernel has recently added support for shadow stacks, currently
x86 only using their CET feature but both arm64 and RISC-V have
equivalent features (GCS and Zisslpcfi respectively), I am actively
working on GCS[1].  With shadow stacks the hardware maintains an
additional stack containing only the return addresses for branch
instructions which is not generally writeable by userspace and ensures
that any returns are to the recorded addresses.  This provides some
protection against ROP attacks and making it easier to collect call
stacks.  These shadow stacks are allocated in the address space of the
userspace process.

Our API for shadow stacks does not currently offer userspace any
flexiblity for managing the allocation of shadow stacks for newly
created threads, instead the kernel allocates a new shadow stack with
the same size as the normal stack whenever a thread is created with the
feature enabled.  The stacks allocated in this way are freed by the
kernel when the thread exits or shadow stacks are disabled for the
thread.  This lack of flexibility and control isn't ideal, in the vast
majority of cases the shadow stack will be over allocated and the
implicit allocation and deallocation is not consistent with other
interfaces.  As far as I can tell the interface is done in this manner
mainly because the shadow stack patches were in development since before
clone3() was implemented.

Since clone3() is readily extensible let's add support for specifying a
shadow stack when creating a new thread or process in a similar manner
to how the normal stack is specified, keeping the current implicit
allocation behaviour if one is not specified either with clone3() or
through the use of clone().  When the shadow stack is specified
explicitly the kernel will not free it, the inconsistency with
implicitly allocated shadow stacks is a bit awkward but that's existing
ABI so we can't change it.

The memory provided must have been allocated for use as a shadow stack,
the expectation is that this will be done using the map_shadow_stack()
syscall.  I opted not to add validation for this in clone3() since it
will be enforced by hardware anyway.

Please note that the x86 portions of this code are build tested only, I
don't appear to have a system that can run CET avaible to me, I have
done testing with an integration into my pending work for GCS.  There is
some possibility that the arm64 implementation may require the use of
clone3() and explicit userspace allocation of shadow stacks, this is
still under discussion.

A new architecture feature Kconfig option for shadow stacks is added as
here, this was suggested as part of the review comments for the arm64
GCS series and since we need to detect if shadow stacks are supported it
seemed sensible to roll it in here.

The selftest portions of this depend on 34dce23f7e40 ("selftests/clone3:
Report descriptive test names") in -next[2].

[1] https://lore.kernel.org/r/20231009-arm64-gcs-v6-0-78e55deaa4dd@kernel.org/
[2] https://lore.kernel.org/r/20231018-kselftest-clone3-output-v1-1-12b7c50ea2cf@kernel.org

Signed-off-by: Mark Brown <broonie@kernel.org>
---
Mark Brown (5):
      mm: Introduce ARCH_HAS_USER_SHADOW_STACK
      fork: Add shadow stack support to clone3()
      selftests/clone3: Factor more of main loop into test_clone3()
      selftests/clone3: Allow tests to flag if -E2BIG is a valid error code
      kselftest/clone3: Test shadow stack support

 arch/x86/Kconfig                                  |   1 +
 arch/x86/include/asm/shstk.h                      |  11 +-
 arch/x86/kernel/process.c                         |   2 +-
 arch/x86/kernel/shstk.c                           |  36 ++++-
 fs/proc/task_mmu.c                                |   2 +-
 include/linux/mm.h                                |   2 +-
 include/linux/sched/task.h                        |   2 +
 include/uapi/linux/sched.h                        |  17 +-
 kernel/fork.c                                     |  40 ++++-
 mm/Kconfig                                        |   6 +
 tools/testing/selftests/clone3/clone3.c           | 180 +++++++++++++++++-----
 tools/testing/selftests/clone3/clone3_selftests.h |   5 +
 12 files changed, 247 insertions(+), 57 deletions(-)
---
base-commit: 80ab9b52e8d4add7735abdfb935877354b69edb6
change-id: 20231019-clone3-shadow-stack-15d40d2bf536

Best regards,