[v4,0/3] arm64: dynamic shadow call stack support

Message ID 20220701152724.3343599-1-ardb@kernel.org (mailing list archive)

Message

Ard Biesheuvel July 1, 2022, 3:27 p.m. UTC
Generic kernel images such as Android's GKI usually enable all available
security features, which are typically implemented in such a way that
they only take effect if the underlying hardware can support them, but
don't interfere with correct and efficient operation otherwise.

For shadow call stack, which is always supported by the hardware, this
means it will be enabled even if pointer authentication is also
supported and enabled for signing return addresses stored on the stack.
The additional security provided by shadow call stack is only marginal
in this case, whereas the performance overhead is not.

Given that return address signing is based on PACIASP/AUTIASP
instructions that implicitly operate on the return address register
(X30) and are not idempotent (i.e., each needs to be emitted exactly
once before the return address is stored on the ordinary stack and after
it has been retrieved from it), we can convert these instructions 1:1
into shadow call stack pushes and pops involving the register X30.
As this is something that can be done at runtime rather than build time,
we can do this conditionally based on whether or not return address
signing is supported on the underlying hardware.
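
To make the 1:1 conversion concrete, here is a minimal user-space sketch
in C. The four opcodes are the standard A64 encodings of the instructions
named in the comments. The helper names scs_patch_insn() and
scs_patch_all(), and the idea of passing a list of instruction slot
indices, are illustrative assumptions and not the code in this series.

```c
#include <stddef.h>
#include <stdint.h>

/* A64 encodings; PACIASP/AUTIASP live in the HINT space (#25 and #29). */
#define PACIASP   0xd503233fu  /* paciasp */
#define AUTIASP   0xd50323bfu  /* autiasp */
#define SCS_PUSH  0xf800865eu  /* str x30, [x18], #8 */
#define SCS_POP   0xf85f8e5eu  /* ldr x30, [x18, #-8]! */

/*
 * Hypothetical helper: rewrite one instruction slot in place.
 * Returns 1 if the slot was patched, 0 if it held something else.
 */
static int scs_patch_insn(uint32_t *insn)
{
	switch (*insn) {
	case PACIASP:
		*insn = SCS_PUSH;	/* sign -> push X30 to the shadow stack */
		return 1;
	case AUTIASP:
		*insn = SCS_POP;	/* authenticate -> pop X30 from it */
		return 1;
	default:
		return 0;		/* leave anything else untouched */
	}
}

/* Hypothetical driver: patch every slot whose index is listed in locs[]. */
static size_t scs_patch_all(uint32_t *text, const size_t *locs, size_t nlocs)
{
	size_t patched = 0;

	for (size_t i = 0; i < nlocs; i++)
		patched += scs_patch_insn(&text[locs[i]]);
	return patched;
}
```

Because each replacement is a single 32-bit instruction, no code has to
move. The sketch leaves out the I-cache maintenance that is needed after
patching live code.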

In order to be able to unwind call stacks that involve return address
signing, whether or not the return address is currently signed is
tracked by DWARF CFI directives in the unwinding metadata. This means we
can use this information to locate all PACIASP/AUTIASP instructions in
the binary, instead of having to use brute force and go over all
instructions in the entire program.
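
As a rough illustration of the lookup, the sketch below walks the CFA
program of a single FDE and records where the return address signing
state is toggled. The DW_CFA_* values are the standard DWARF opcodes plus
the AArch64 vendor opcode DW_CFA_AARCH64_negate_ra_state (0x2d). The
function name find_pac_insns(), its parameters, and the assumption that
the directive is attached to the address just past the PAC instruction
are mine. It decodes only a small subset of opcodes and gives up on
anything else, so it is not a usable unwind table parser.

```c
#include <stddef.h>
#include <stdint.h>

#define DW_CFA_nop              0x00
#define DW_CFA_advance_loc1     0x02
#define DW_CFA_advance_loc2     0x03
#define DW_CFA_negate_ra_state  0x2d  /* AArch64 vendor opcode */
#define DW_CFA_advance_loc      0x40  /* top 2 bits; low 6 bits hold the delta */

/*
 * Hypothetical helper: scan one FDE's CFA program and store the byte
 * offset (relative to the function start) of each instruction that
 * toggled the RA signing state. 'caf' is the code alignment factor
 * taken from the CIE. Returns the number of offsets stored, or -1 on
 * truncated input, a full output array, or an opcode not decoded here.
 */
static int find_pac_insns(const uint8_t *cfa, size_t len, unsigned int caf,
			  uint32_t *out, size_t max_out)
{
	uint32_t loc = 0;
	size_t n = 0;
	size_t i = 0;

	while (i < len) {
		uint8_t op = cfa[i++];

		if ((op & 0xc0) == DW_CFA_advance_loc) {
			loc += (op & 0x3f) * caf;
			continue;
		}
		switch (op) {
		case DW_CFA_nop:
			break;
		case DW_CFA_advance_loc1:
			if (i + 1 > len)
				return -1;
			loc += cfa[i] * caf;
			i += 1;
			break;
		case DW_CFA_advance_loc2:
			if (i + 2 > len)
				return -1;
			loc += (cfa[i] | (cfa[i + 1] << 8)) * caf;
			i += 2;
			break;
		case DW_CFA_negate_ra_state:
			/* Assumed: the directive follows the insn, so step back 4. */
			if (loc < 4 || n == max_out)
				return -1;
			out[n++] = loc - 4;
			break;
		default:
			return -1;	/* opcode outside this sketch's subset */
		}
	}
	return (int)n;
}
```

Each offset found this way identifies one candidate PACIASP or AUTIASP,
so only those locations need to be inspected rather than every
instruction in the image.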

This series implements this approach for Clang, which has recently been
fixed to emit all these CFI directives correctly. This series is based
on an older PoC sent out last year [0] that targeted GCC only (due to
this issue). This version targets Clang only, as GCC has its own issues
with CFI accuracy.

Changes since v3 [1]:
- rebase onto arm64/for-next/core
- fix init value of dynamic_scs_enabled static key
- don't discard .eh_frame sections (to work around a bug in an older
  Clang version) if we are keeping them for dynamic SCS patching,
- print a diagnostic if dynamic SCS patching is enabled,
- apply build fix suggested by Sami and add his ack to patch #2

Changes since v2 [2]:
- don't enable unwind table generation for nVHE code - it cannot be
  patched anyway so it has no use for it;
- drop checks for ID reg overrides
- fix some remaining TODOs regarding augmentation data and the code
  alignment factor
- disable PAC for leaf functions when dynamic SCS is configured, so that
  we don't end up with SCS pushes and pops in all leaf functions too;
- add I-cache maintenance after code patching
- add Rb's from Nick and Kees.

Changes since RFC v1:
- implement boot time check for PAC/BTI support, and only enable dynamic
  SCS if neither is supported;
- implement module patching as well;
- switch to Clang, and drop workaround for GCC bug;

[0] https://lore.kernel.org/linux-arm-kernel/20211013152243.2216899-1-ardb@kernel.org/
[1] https://lore.kernel.org/linux-arm-kernel/20220613134008.3760481-1-ardb@kernel.org/
[2] https://lore.kernel.org/linux-arm-kernel/20220505161011.1801596-1-ardb@kernel.org/

Ard Biesheuvel (3):
  arm64: unwind: add asynchronous unwind tables to kernel and modules
  scs: add support for dynamic shadow call stacks
  arm64: implement dynamic shadow call stack for Clang

 Makefile                              |   2 +
 arch/Kconfig                          |   7 +
 arch/arm64/Kconfig                    |  12 +
 arch/arm64/Makefile                   |  15 +-
 arch/arm64/include/asm/module.lds.h   |   8 +
 arch/arm64/include/asm/scs.h          |  47 ++++
 arch/arm64/kernel/Makefile            |   2 +
 arch/arm64/kernel/head.S              |   3 +
 arch/arm64/kernel/irq.c               |   2 +-
 arch/arm64/kernel/module.c            |   8 +
 arch/arm64/kernel/patch-scs.c         | 257 ++++++++++++++++++++
 arch/arm64/kernel/pi/Makefile         |   1 +
 arch/arm64/kernel/sdei.c              |   2 +-
 arch/arm64/kernel/setup.c             |   4 +
 arch/arm64/kernel/vmlinux.lds.S       |  13 +
 arch/arm64/kvm/hyp/nvhe/Makefile      |   1 +
 drivers/firmware/efi/libstub/Makefile |   1 +
 include/asm-generic/vmlinux.lds.h     |   9 +-
 include/linux/scs.h                   |  18 ++
 kernel/scs.c                          |  14 +-
 scripts/module.lds.S                  |   8 +-
 21 files changed, 425 insertions(+), 9 deletions(-)
 create mode 100644 arch/arm64/kernel/patch-scs.c

Comments

Sami Tolvanen July 7, 2022, 7:35 p.m. UTC | #1
Hi Ard,

On Fri, Jul 01, 2022 at 05:27:21PM +0200, Ard Biesheuvel wrote:
> Generic kernel images such as Android's GKI usually enable all available
> security features, which are typically implemented in such a way that
> they only take effect if the underlying hardware can support them, but
> don't interfere with correct and efficient operation otherwise.
> 
> For shadow call stack, which is always supported by the hardware, this
> means it will be enabled even if pointer authentication is also
> supported and enabled for signing return addresses stored on the stack.
> The additional security provided by shadow call stack is only marginal
> in this case, whereas the performance overhead is not.
> 
> Given that return address signing is based on PACIASP/AUTIASP
> instructions that implicitly operate on the return address register
> (X30) and are not idempotent (i.e., each needs to be emitted exactly
> once before the return address is stored on the ordinary stack and after
> it has been retrieved from it), we can convert these instructions 1:1
> into shadow call stack pushes and pops involving the register X30.
> As this is something that can be done at runtime rather than build time,
> we can do this conditionally based on whether or not return address
> signing is supported on the underlying hardware.
> 
> In order to be able to unwind call stacks that involve return address
> signing, whether or not the return address is currently signed is
> tracked by DWARF CFI directives in the unwinding metadata. This means we
> can use this information to locate all PACIASP/AUTIASP instructions in
> the binary, instead of having to use brute force and go over all
> instructions in the entire program.
> 
> This series implements this approach for Clang, which has recently been
> fixed to emit all these CFI directives correctly. This series is based
> on an older PoC sent out last year [0] that targeted GCC only (due to
> this issue). This version targets Clang only, as GCC has its own issues
> with CFI accuracy.
> 
> Changes since v3 [1]:
> - rebase onto arm64/for-next/core

Btw, this no longer seems to apply cleanly to for-next/core. I've found
using git format-patch --base=auto helpful when sending patches against
trees that change more frequently.

> - fix init value of dynamic_scs_enabled static key
> - don't discard .eh_frame sections (to work around a bug in an older
>   Clang version) if we are keeping them for dynamic SCS patching,
> - print a diagnostic if dynamic SCS patching is enabled,
> - apply build fix suggested by Sami and add his ack to patch #2

Nevertheless, the patches look good to me, and SCS was correctly enabled
on CPUs without PAC support in my testing. For the series:

Reviewed-by: Sami Tolvanen <samitolvanen@google.com>
Tested-by: Sami Tolvanen <samitolvanen@google.com>

Sami