mbox series

[RESEND,bpf-next,0/4] Reduce overhead of LSMs with static calls

Message ID 20230120000818.1324170-1-kpsingh@kernel.org (mailing list archive)
Headers show
Series Reduce overhead of LSMs with static calls | expand

Message

KP Singh Jan. 20, 2023, 12:08 a.m. UTC
# Background

LSM hooks (callbacks) are currently invoked as indirect function calls. These
callbacks are registered into a linked list at boot time as the order of the
LSMs can be configured on the kernel command line with the "lsm=" command line
parameter.

Indirect function calls have a high overhead due to retpoline mitigation for
various speculative execution attacks.

Retpolines remain relevant even with newer generation CPUs as recently
discovered speculative attacks, like Spectre BHB need Retpolines to mitigate
against branch history injection and still need to be used in combination with
newer mitigation features like eIBRS.

This overhead is especially significant for the "bpf" LSM which allows the user
to implement LSM functionality with eBPF program. In order to facilitate this
the "bpf" LSM provides a default callback for all LSM hooks. When enabled,
the "bpf" LSM incurs an unnecessary / avoidable indirect call. This is
especially bad in OS hot paths (e.g. in the networking stack).
This overhead prevents the adoption of bpf LSM on performance critical
systems, and also, in general, slows down all LSMs.

Since we know the address of the enabled LSM callbacks at compile time and only
the order is determined at boot time, the LSM framework can allocate static
calls for each of the possible LSM callbacks and these calls can be updated once
the order is determined at boot.

This series is a respin of the RFC proposed by Paul Renauld (renauld@google.com)
and Brendan Jackman (jackmanb@google.com) [1]

# Performance improvement

With this patch-set some syscalls with lots of LSM hooks in their path
benefitted at an average of ~3%. Here are the results of the relevant Unixbench
system benchmarks with BPF LSM and a major LSM (in this case apparmor) enabled
with and without the series.

Benchmark                                               Delta(%): (+ is better)
===============================================================================
Execl Throughput                                             +2.9015
File Write 1024 bufsize 2000 maxblocks                       +5.4196
Pipe Throughput                                              +7.7434
Pipe-based Context Switching                                 +3.5118
Process Creation                                             +0.3552
Shell Scripts (1 concurrent)                                 +1.7106
System Call Overhead                                         +3.0067
System Benchmarks Index Score (Partial Only):                +3.1809

In the best case, some syscalls like eventfd_create benefitted to about ~10%.
The full analysis can be viewed at https://kpsingh.ch/lsm-perf

[1] https://lore.kernel.org/linux-security-module/20200820164753.3256899-1-jackmanb@chromium.org/

KP Singh (4):
  kernel: Add helper macros for loop unrolling
  security: Generate a header with the count of enabled LSMs
  security: Replace indirect LSM hook calls with static calls
  bpf: Only enable BPF LSM hooks when an LSM program is attached

 include/linux/bpf.h              |   1 +
 include/linux/bpf_lsm.h          |   1 +
 include/linux/lsm_hooks.h        |  94 +++++++++++--
 include/linux/unroll.h           |  35 +++++
 kernel/bpf/trampoline.c          |  29 ++++-
 scripts/Makefile                 |   1 +
 scripts/security/.gitignore      |   1 +
 scripts/security/Makefile        |   4 +
 scripts/security/gen_lsm_count.c |  57 ++++++++
 security/Makefile                |  11 ++
 security/bpf/hooks.c             |  26 +++-
 security/security.c              | 217 ++++++++++++++++++++-----------
 12 files changed, 386 insertions(+), 91 deletions(-)
 create mode 100644 include/linux/unroll.h
 create mode 100644 scripts/security/.gitignore
 create mode 100644 scripts/security/Makefile
 create mode 100644 scripts/security/gen_lsm_count.c