mbox series

[RFC,0/2] landlock: Multithreaded policy enforcement

Message ID 20250221184417.27954-2-gnoack3000@gmail.com (mailing list archive)
Headers show
Series landlock: Multithreaded policy enforcement | expand

Message

Günther Noack Feb. 21, 2025, 6:44 p.m. UTC
Hello!

This patch set adds the LANDLOCK_RESTRICT_SELF_TSYNC flag to
landlock_restrict_self().  With this flag, the passed Landlock ruleset
will not only be applied to the calling thread, but to all threads
which belong to the same process.

I am sending this intentionally early.  At this point, I am mostly
looking for high-level comments to check whether the general approach
is feasible at all and whether I am using appropriate concurrency
mechanisms for it.

Motivation
==========

TL;DR: The libpsx/nptl(7) signal hack which we use in user space for
multi-threaded Landlock enforcement is incompatible with Landlock's
signal scoping support.  Landlock can restrict the use of signals
across Landlock domains, but we need signals ourselves in user space
in ways that are not permitted any more under these restrictions.

Enabling Landlock proves to be difficult in processes that are already
multi-threaded at the time of enforcement:

* Enforcement in only one thread is usually a mistake because threads
  do not normally have proper security boundaries between them.

* Also, multithreading is unavoidable in some circumstances, such as
  when using Landlock from a Go program.  Go programs are already
  multithreaded by the time that they enter the "func main()".

So far, the approach in Go[1] was to use libpsx[2].  This library
implements the mechanism described in nptl(7) [3]: It keeps track of
all threads with a linker hack and then makes all threads do the same
syscall by registering a signal handler for them and invoking it.

With commit 54a6e6bbf3be ("landlock: Add signal scoping"), Landlock
gained the ability to restrict the use of signals across different
Landlock domains.

Landlock's signal scoping support is incompatible with the libpsx
approach of enabling Landlock:

(1) With libpsx, although all threads enforce the same ruleset object,
    they technically do the operation separately and end up in
    distinct Landlock domains.  When the ruleset uses
    LANDLOCK_SCOPE_SIGNAL, it becomes impossible to use cross-thread
    signals in that process.

(2) Cross-thread Signals are themselves needed to enforce further
    nested Landlock domains across multiple threads.  So nested
    Landlock policies become impossible there.

In addition to Landlock itself, cross-thread signals are also needed
for other seemingly-harmless API calls like the setuid(2) [4] and for
the use of libcap (co-developed with libpsx), which have the same
problem where the underlying syscall only applies to the calling
thread.

Implementation approach
=======================

The implementation is inspired by Jann Horn's earlier patch set for
cred_transfer() [5] as well as by the libpsx approach described in [2]
and [3].  In some way, it is a libpsx-like implementation in the
kernel:

When the flag is provided, the calling thread adds a task_work to all
threads to be executed through a pseudo-signal and makes them run it.
The logic is in the function restrict_one_thread().  The threads
execute the following phases in lock step (with a barrier
synchronization in between):

* Preparation Phase
  * Check that the thread meets all prerequisites
  * Create a new struct cred with prepare_creds()
  * Manipulate the struct cred as needed for policy enforcement
  * Write back possible errors to a shared memory location
* BARRIER: Wait for all threads to catch up to this point
* Commit Phase
  * Check that none of the threads had an error.
  * Invoke commit_creds() (or abort_creds() in case of error)

None of the functions used in the commit_phase() can return errors, so
at the time where each thread decides to commit, we are already sure
that it will work across all threads.

At the end, the calling thread waits for all task_work to finish
before returning from the syscall.

(The calling thread itself obviously does not schedule a task_work for
itself, but executes the same restrict_one_thread() logic inline.)

Open questions
==============

The patch set is in early stages, and there are some open questions
for which I am seeking feedback:

Conceptual open questions
~~~~~~~~~~~~~~~~~~~~~~~~~

* To what extent should we expect the threads to be in a known state?

  The single-threaded variant currently requires the
  PR_SET_NO_NEW_PRIVS flag or CAP_SYS_ADMIN as prerequisite for
  enforcing a Landlock ruleset.

  When we are in the multi-threaded variant, there are two main approaches:

  a) Expect same state on all threads: The simplest implementation
     would be that we expect all threads to also have
     PR_SET_NO_NEW_PRIVS or CAP_SYS_ADMIN, and to already be part of
     the same Landlock domain.  Otherwise, the operation is aborted on
     all threads.
  
  b) Pull all threads towards the same state: The 'synchronization'
     option would be that we implicitly establish the same
     PR_SET_NO_NEW_PRIVS and Landlock domain configuration on all
     threads.

     Weird case: If the calling thread has CAP_SYS_ADMIN but not
     PR_SET_NO_NEW_PRIVS, does this mean that a Landlock domain can be
     enabled on a thread without PR_SET_NO_NEW_PRIVS?  (We surely
     should not implicitly grant CAP_SYS_ADMIN to another thread?)
  
  Solutions in the middle between these two might also be possible.
  Depending on the approach, we might want to change the flag name to
  say something else but ..._TSYNC. (That name is just carried over
  from the similarly-named SECCOMP_FILTER_FLAG_TSYNC)

Implementation open questions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

* Are there better synchronization mechanisms for this purpose than
  task_work with barriers?

* Need to double check deadlock scenarios.

* Need to double check that we are not racing with a concurrent thread
  creation.

* Use the multi-threaded code for the single-threaded variant as well?

  At the current stage, I left the single-threaded code intact and
  copied some of its logic into the multi-threaded variant.  It is
  technically possible to unify this, if we are OK with the single
  threaded code using atomic operations and struct completion all by
  itself (or by interleaving many if-checks, but not sure whether
  that's worth it).

* Does landlock_restrict_self() need to be interruptible?

  (For reference, Jann's patch [5] used
  wait_for_completion_interruptible())


[1] https://github.com/landlock-lsm/go-landlock
[2] https://sites.google.com/site/fullycapable/who-ordered-libpsx
[3] https://man.gnoack.org/7/nptl
[4] https://man.gnoack.org/2/setuid#VERSIONS
[5] https://lore.kernel.org/all/20240805-remove-cred-transfer-v2-0-a2aa1d45e6b8@google.com/

Günther Noack (2):
  landlock: Multithreading support for landlock_restrict_self()
  landlock: selftests for LANDLOCK_RESTRICT_SELF_TSYNC

 include/uapi/linux/landlock.h                 |  10 +
 security/landlock/syscalls.c                  | 232 +++++++++++++++++-
 tools/testing/selftests/landlock/tsync_test.c |  94 +++++++
 3 files changed, 331 insertions(+), 5 deletions(-)
 create mode 100644 tools/testing/selftests/landlock/tsync_test.c


base-commit: 69e858e0b8b2ea07759e995aa383e8780d9d140c