Message ID | 20230928202410.3765062-1-kpsingh@kernel.org (mailing list archive) |
---|---|
Series | Reduce overhead of LSMs with static calls |
On Thu, Sep 28, 2023 at 10:24:05PM +0200, KP Singh wrote:
> # Performance improvement
>
> With this patch-set some syscalls with lots of LSM hooks in their path
> benefitted at an average of ~3% and I/O and Pipe based system calls benefitting
> the most.

Paul,

FWIW, I think this series is ready to land in -next. I'd like it to get some
bake time there just to see if anything unexpected shows up. It's quite happy
in all my local testing, though.

-Kees
On Thu, 2023-09-28 at 22:24 +0200, KP Singh wrote:
> # Background
>
> LSM hooks (callbacks) are currently invoked as indirect function calls. These
> callbacks are registered into a linked list at boot time as the order of the
> LSMs can be configured on the kernel command line with the "lsm=" command line
> parameter.
>
> Indirect function calls have a high overhead due to retpoline mitigation for
> various speculative execution attacks.
>
> Retpolines remain relevant even with newer generation CPUs as recently
> discovered speculative attacks, like Spectre BHB need Retpolines to mitigate
> against branch history injection and still need to be used in combination with
> newer mitigation features like eIBRS.
>
> This overhead is especially significant for the "bpf" LSM which allows the user
> to implement LSM functionality with eBPF program. In order to facilitate this
> the "bpf" LSM provides a default callback for all LSM hooks. When enabled,
> the "bpf" LSM incurs an unnecessary / avoidable indirect call. This is
> especially bad in OS hot paths (e.g. in the networking stack).
> This overhead prevents the adoption of bpf LSM on performance critical
> systems, and also, in general, slows down all LSMs.
>
> Since we know the address of the enabled LSM callbacks at compile time and only
> the order is determined at boot time, the LSM framework can allocate static
> calls for each of the possible LSM callbacks and these calls can be updated once
> the order is determined at boot.
>
> This series is a respin of the RFC proposed by Paul Renauld (renauld@google.com)
> and Brendan Jackman (jackmanb@google.com) [1]
>
> # Performance improvement
>
> With this patch-set some syscalls with lots of LSM hooks in their path
> benefitted at an average of ~3% and I/O and Pipe based system calls benefitting
> the most.
>
> Here are the results of the relevant Unixbench system benchmarks with BPF LSM
> and SELinux enabled with default policies enabled with and without these
> patches.
>
> Benchmark                                               Delta(%): (+ is better)
> ===============================================================================
> Execl Throughput                                             +1.9356
> File Write 1024 bufsize 2000 maxblocks                       +6.5953
> Pipe Throughput                                              +9.5499
> Pipe-based Context Switching                                 +3.0209
> Process Creation                                             +2.3246
> Shell Scripts (1 concurrent)                                 +1.4975
> System Call Overhead                                         +2.7815
> System Benchmarks Index Score (Partial Only):                +3.4859

FTR, I also measure a ~3% tput improvement in UDP stream test over loopback.

@KP Singh, I would have appreciated being cc-ed here, since I provided feedback
on a previous revision (as soon as I learned of this effort).

Cheers,

Paolo
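The approach in the quoted Background section (a fixed set of call slots allocated at build time, filled in once the LSM order is known at boot) can be sketched in user-space C. This is only an analogy and not the code from the series: plain C cannot patch call instructions, so the slots below still hold function pointers, whereas the kernel's static calls rewrite each call site into a direct call. MAX_LSM_SLOTS, struct hook_slot, register_file_open_hook(), call_file_open() and the two lsm_*_file_open() callbacks are hypothetical names chosen for illustration.

```c
/*
 * User-space analogy of "one fixed slot per possible LSM callback".
 * All names here are hypothetical; none come from the patch series.
 */
#include <assert.h>
#include <stddef.h>

#define MAX_LSM_SLOTS 3 /* assumed upper bound, known at build time */

typedef int (*file_open_hook)(const char *path);

struct hook_slot {
	file_open_hook fn; /* callback address, known at build time */
	int active;        /* stands in for a per-slot enable check */
};

/* Slots are allocated statically; only their contents change at "boot". */
static struct hook_slot file_open_slots[MAX_LSM_SLOTS];
static size_t nr_registered;

/* Two stand-in LSM callbacks: the first allows everything. */
int lsm_a_file_open(const char *path)
{
	(void)path;
	return 0;
}

/* The second rejects anything that is not an absolute path. */
int lsm_b_file_open(const char *path)
{
	return path[0] == '/' ? 0 : -1;
}

/* "Boot time": fill the next free slot, in the configured LSM order. */
void register_file_open_hook(file_open_hook fn)
{
	assert(nr_registered < MAX_LSM_SLOTS);
	file_open_slots[nr_registered].fn = fn;
	file_open_slots[nr_registered].active = 1;
	nr_registered++;
}

/*
 * Hook invocation: visit a fixed number of slots instead of walking a
 * linked list, and stop at the first callback that returns non-zero.
 */
int call_file_open(const char *path)
{
	for (size_t i = 0; i < MAX_LSM_SLOTS; i++) {
		if (!file_open_slots[i].active)
			continue;
		int rc = file_open_slots[i].fn(path);
		if (rc != 0)
			return rc;
	}
	return 0;
}
```

Calling register_file_open_hook(lsm_a_file_open) and then register_file_open_hook(lsm_b_file_open) fixes the order; call_file_open() then visits slot 0 before slot 1 and returns the first non-zero result. Because the number of slots is a build-time constant, the loop has a fixed trip count, which is what would let each iteration become its own patched call site in a real static-call implementation.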
On Mon, Oct 2, 2023 at 1:06 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Thu, 2023-09-28 at 22:24 +0200, KP Singh wrote:

[...]

> > System Benchmarks Index Score (Partial Only):                +3.4859
>
> FTR, I also measure a ~3% tput improvement in UDP stream test over
> loopback.
>

Thanks for running the numbers and testing these patches, greatly appreciated!

> @KP Singh, I would have appreciated being cc-ed here, since I provided

Definitely, a miss on my part. Will keep you Cc'ed in any future revisions.

I think we can also add a Tested-by: tag on the main patch and add
your performance numbers to the commit as well.

- KP

> feedback on a previous revision (as soon as I learned of this effort).
>
> Cheers,
>
> Paolo
>
On Mon, 2023-10-02 at 13:09 +0200, KP Singh wrote:
> On Mon, Oct 2, 2023 at 1:06 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > On Thu, 2023-09-28 at 22:24 +0200, KP Singh wrote:

[...]

> > FTR, I also measure a ~3% tput improvement in UDP stream test over
> > loopback.
> >
>
> Thanks for running the numbers and testing these patches, greatly appreciated!
>
> > @KP Singh, I would have appreciated being cc-ed here, since I provided
>
> Definitely, a miss on my part. Will keep you Cc'ed in any future revisions.

Thanks!

> I think we can also add a Tested-by: tag on the main patch and add
> your performance numbers to the commit as well.

Feel free to include that, even if my testing is limited to the
performance test described above.

Cheers,

Paolo