mbox series

[v3,0/4] Introduce security_create_user_ns()

Message ID 20220721172808.585539-1-fred@cloudflare.com (mailing list archive)
Headers show
Series Introduce security_create_user_ns() | expand

Message

Frederick Lawler July 21, 2022, 5:28 p.m. UTC
While creating a LSM BPF MAC policy to block user namespace creation, we
used the LSM cred_prepare hook because that is the closest hook to prevent
a call to create_user_ns().

The calls look something like this:

    cred = prepare_creds()
        security_prepare_creds()
            call_int_hook(cred_prepare, ...
    if (cred)
        create_user_ns(cred)

We noticed that error codes were not propagated from this hook and
introduced a patch [1] to propagate those errors.

The discussion notes that security_prepare_creds()
is not appropriate for MAC policies, and instead the hook is
meant for LSM authors to prepare credentials for mutation. [2]

Ultimately, we concluded that a better course of action is to introduce
a new security hook for LSM authors. [3]

This patch set first introduces a new security_create_user_ns() function
and userns_create LSM hook, then marks the hook as sleepable in BPF.

Links:
1. https://lore.kernel.org/all/20220608150942.776446-1-fred@cloudflare.com/
2. https://lore.kernel.org/all/87y1xzyhub.fsf@email.froward.int.ebiederm.org/
3. https://lore.kernel.org/all/9fe9cd9f-1ded-a179-8ded-5fde8960a586@cloudflare.com/

Past discussions:
V2: https://lore.kernel.org/all/20220707223228.1940249-1-fred@cloudflare.com/
V1: https://lore.kernel.org/all/20220621233939.993579-1-fred@cloudflare.com/

Changes since v2:
- Rename create_user_ns hook to userns_create
- Use user_namespace as an object opposed to a generic namespace object
- s/domB_t/domA_t in commit message
Changes since v1:
- Add selftests/bpf: Add tests verifying bpf lsm create_user_ns hook patch
- Add selinux: Implement create_user_ns hook patch
- Change function signature of security_create_user_ns() to only take
  struct cred
- Move security_create_user_ns() call after id mapping check in
  create_user_ns()
- Update documentation to reflect changes

Frederick Lawler (4):
  security, lsm: Introduce security_create_user_ns()
  bpf-lsm: Make bpf_lsm_userns_create() sleepable
  selftests/bpf: Add tests verifying bpf lsm userns_create hook
  selinux: Implement userns_create hook

 include/linux/lsm_hook_defs.h                 |  1 +
 include/linux/lsm_hooks.h                     |  4 +
 include/linux/security.h                      |  6 ++
 kernel/bpf/bpf_lsm.c                          |  1 +
 kernel/user_namespace.c                       |  5 ++
 security/security.c                           |  5 ++
 security/selinux/hooks.c                      |  9 ++
 security/selinux/include/classmap.h           |  2 +
 .../selftests/bpf/prog_tests/deny_namespace.c | 88 +++++++++++++++++++
 .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
 10 files changed, 160 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/deny_namespace.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_deny_namespace.c

--
2.30.2

Comments

Martin KaFai Lau July 22, 2022, 6:11 a.m. UTC | #1
On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
> While creating a LSM BPF MAC policy to block user namespace creation, we
> used the LSM cred_prepare hook because that is the closest hook to prevent
> a call to create_user_ns().
> 
> The calls look something like this:
> 
>     cred = prepare_creds()
>         security_prepare_creds()
>             call_int_hook(cred_prepare, ...
>     if (cred)
>         create_user_ns(cred)
> 
> We noticed that error codes were not propagated from this hook and
> introduced a patch [1] to propagate those errors.
> 
> The discussion notes that security_prepare_creds()
> is not appropriate for MAC policies, and instead the hook is
> meant for LSM authors to prepare credentials for mutation. [2]
> 
> Ultimately, we concluded that a better course of action is to introduce
> a new security hook for LSM authors. [3]
> 
> This patch set first introduces a new security_create_user_ns() function
> and userns_create LSM hook, then marks the hook as sleepable in BPF.
Patch 1 and 4 still need review from the lsm/security side.
Paul Moore July 22, 2022, 12:20 p.m. UTC | #2
On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:

> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
>> While creating a LSM BPF MAC policy to block user namespace creation, we
>> used the LSM cred_prepare hook because that is the closest hook to prevent
>> a call to create_user_ns().
>>
>> The calls look something like this:
>>
>> cred = prepare_creds()
>> security_prepare_creds()
>> call_int_hook(cred_prepare, ...
>> if (cred)
>> create_user_ns(cred)
>>
>> We noticed that error codes were not propagated from this hook and
>> introduced a patch [1] to propagate those errors.
>>
>> The discussion notes that security_prepare_creds()
>> is not appropriate for MAC policies, and instead the hook is
>> meant for LSM authors to prepare credentials for mutation. [2]
>>
>> Ultimately, we concluded that a better course of action is to introduce
>> a new security hook for LSM authors. [3]
>>
>> This patch set first introduces a new security_create_user_ns() function
>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
> Patch 1 and 4 still need review from the lsm/security side.


This patchset is in my review queue and assuming everything checks out, I expect to merge it after the upcoming merge window closes.

I would also need an ACK from the BPF LSM folks, but they're CC'd on this patchset.

--
paul-moore.com
Eric W. Biederman July 22, 2022, 5:05 p.m. UTC | #3
Frederick Lawler <fred@cloudflare.com> writes:

> While creating a LSM BPF MAC policy to block user namespace creation, we
> used the LSM cred_prepare hook because that is the closest hook to prevent
> a call to create_user_ns().

That description is wrong.  Your goal his is not to limit access to
the user namespace.  Your goal is to reduce the attack surface of the
kernel by not allowing some processes access to a user namespace.

You have already said that you don't have concerns about the
fundamentals of the user namespace, and what it enables only that
it allows access to exploitable code.

Achieving the protection you seek requires talking and thinking clearly
about the goal.




I have a couple of deep and fundamental problems with this approach,
to limiting access to potentially exploitable code.

1) The first is that unless there is a high probability (say 90%) that at
   any time the only exploitable code in the kernel can only be accessed
   by an unprivileged user with the help of user namespaces, attackers
   will just route around this restriction and so it will achieve
   nothing in practice, while at the same time incur an extra
   maintenance burden.

2) The second is that there is a long standing problem with code that
   gets added to the kernel.  Many times new kernel code because it has
   the potential to confuse suid root executables that code has been
   made root only.  Over time that results in more and more code running
   as root to be able to make use of the useful features of the linux
   kernel.

   One of the goals of the user namespace is to avoid more and more code
   migrating to running as root.  To achieve that goal ordinary
   application developers need to be able to assume that typically user
   namespaces will be available on linux.

   An assumption that ordinary applications like chromium make today.

   Your intentions seem to be to place a capability check so that only
   root can use user namespaces or something of the sort.  Thus breaking
   the general availability of user namespaces for ordinary applications
   on your systems.
   

My apologies if this has been addressed somewhere in the conversation
already.  I don't see these issues addressed in the descriptions of your
patches.

Until these issues are firmly addressed and you are not proposing a
patch that can only cause regressions in userspace applications.

Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>

>
> The calls look something like this:
>
>     cred = prepare_creds()
>         security_prepare_creds()
>             call_int_hook(cred_prepare, ...
>     if (cred)
>         create_user_ns(cred)
>
> We noticed that error codes were not propagated from this hook and
> introduced a patch [1] to propagate those errors.
>
> The discussion notes that security_prepare_creds()
> is not appropriate for MAC policies, and instead the hook is
> meant for LSM authors to prepare credentials for mutation. [2]
>
> Ultimately, we concluded that a better course of action is to introduce
> a new security hook for LSM authors. [3]
>
> This patch set first introduces a new security_create_user_ns() function
> and userns_create LSM hook, then marks the hook as sleepable in BPF.
>
> Links:
> 1. https://lore.kernel.org/all/20220608150942.776446-1-fred@cloudflare.com/
> 2. https://lore.kernel.org/all/87y1xzyhub.fsf@email.froward.int.ebiederm.org/
> 3. https://lore.kernel.org/all/9fe9cd9f-1ded-a179-8ded-5fde8960a586@cloudflare.com/
>
> Past discussions:
> V2: https://lore.kernel.org/all/20220707223228.1940249-1-fred@cloudflare.com/
> V1: https://lore.kernel.org/all/20220621233939.993579-1-fred@cloudflare.com/
>
> Changes since v2:
> - Rename create_user_ns hook to userns_create
> - Use user_namespace as an object opposed to a generic namespace object
> - s/domB_t/domA_t in commit message
> Changes since v1:
> - Add selftests/bpf: Add tests verifying bpf lsm create_user_ns hook patch
> - Add selinux: Implement create_user_ns hook patch
> - Change function signature of security_create_user_ns() to only take
>   struct cred
> - Move security_create_user_ns() call after id mapping check in
>   create_user_ns()
> - Update documentation to reflect changes
>
> Frederick Lawler (4):
>   security, lsm: Introduce security_create_user_ns()
>   bpf-lsm: Make bpf_lsm_userns_create() sleepable
>   selftests/bpf: Add tests verifying bpf lsm userns_create hook
>   selinux: Implement userns_create hook
>
>  include/linux/lsm_hook_defs.h                 |  1 +
>  include/linux/lsm_hooks.h                     |  4 +
>  include/linux/security.h                      |  6 ++
>  kernel/bpf/bpf_lsm.c                          |  1 +
>  kernel/user_namespace.c                       |  5 ++
>  security/security.c                           |  5 ++
>  security/selinux/hooks.c                      |  9 ++
>  security/selinux/include/classmap.h           |  2 +
>  .../selftests/bpf/prog_tests/deny_namespace.c | 88 +++++++++++++++++++
>  .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
>  10 files changed, 160 insertions(+)
>  create mode 100644 tools/testing/selftests/bpf/prog_tests/deny_namespace.c
>  create mode 100644 tools/testing/selftests/bpf/progs/test_deny_namespace.c
>
> --
> 2.30.2

Eric
Paul Moore July 25, 2022, 10:53 p.m. UTC | #4
On Fri, Jul 22, 2022 at 1:05 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
> Frederick Lawler <fred@cloudflare.com> writes:
>
> > While creating a LSM BPF MAC policy to block user namespace creation, we
> > used the LSM cred_prepare hook because that is the closest hook to prevent
> > a call to create_user_ns().
>
> That description is wrong.  Your goal his is not to limit access to
> the user namespace.  Your goal is to reduce the attack surface of the
> kernel by not allowing some processes access to a user namespace.
>
> You have already said that you don't have concerns about the
> fundamentals of the user namespace, and what it enables only that
> it allows access to exploitable code.
>
> Achieving the protection you seek requires talking and thinking clearly
> about the goal.

Providing a single concrete goal for a LSM hook is always going to be
a challenge due to the nature of the LSM layer and the great unknown
of all the different LSMs that are implemented underneath the LSM
abstraction.  However, we can make some very general statements such
that up to this point the LSMs that have been merged into mainline
generally provide some level of access control, observability, or
both.  While that may change in the future (the LSM layer does not
attempt to restrict LSMs to just these two ideas), I think they are
"good enough" goals for this discussion.

In addition to thinking about these goals, I think it also important
to take a step back and think about the original motivation for the
LSM and why it, and Linux itself, has proven to be popular with
everything from the phone in your hand to the datacenter servers
powering ... pretty much everything :)  Arguably Linux has seen such
profound success because of its malleability; the open nature of the
kernel development process has allowed the Linux Kernel to adopt
capabilities well beyond what any one dev team could produce, and as
Linux continues to grow in adoption, its ability to flex into new use
cases only increases.  The kernel namespace concept is an excellent
example of this: virtualizing core kernel ideas, such as user
credentials, to provide better, safer solutions.  It is my belief that
the LSM layer is very much built around this same idea of abstracting
and extending core kernel concepts, in this case security controls, to
provide better solutions.  Integrating the LSM into the kernel's
namespaces is a natural fit, and one that is long overdue.

If we can't find a way to make everyone happy here, let's at least try
to find a way to make everyone "okay" with adding a LSM hook to the
user namespace.  If you want to NACK this approach Eric, that's okay,
but please provide some alternative paths forward that we can discuss.
Djalal Harouni July 26, 2022, 12:02 p.m. UTC | #5
Hi Eric,

On Fri, Jul 22, 2022 at 7:07 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Frederick Lawler <fred@cloudflare.com> writes:
>
> > While creating a LSM BPF MAC policy to block user namespace creation, we
> > used the LSM cred_prepare hook because that is the closest hook to prevent
> > a call to create_user_ns().
>
> That description is wrong.  Your goal his is not to limit access to
> the user namespace.  Your goal is to reduce the attack surface of the
> kernel by not allowing some processes access to a user namespace.
>
> You have already said that you don't have concerns about the
> fundamentals of the user namespace, and what it enables only that
> it allows access to exploitable code.
>
> Achieving the protection you seek requires talking and thinking clearly
> about the goal.
>

We have valid use cases not specifically related to the attack surface,
but go into the middle from bpf observability to enforcement. As we want
to track namespace creation, changes, nesting and per task creds context
depending on the nature of the workload.

Obvious example is nesting as we want to track namespace creations
not necessarily user namespace but all to report hierarchies to dashboards,
then from kubernetes namespace view, we would like some applications to
setup namespaces privileged or not, but deny other apps creation of nested
pidns, userns, etc, it depends on users how they setup their kubernetes
namespaces and labels...

...
>
> 2) The second is that there is a long standing problem with code that
>    gets added to the kernel.  Many times new kernel code because it has
>    the potential to confuse suid root executables that code has been
>    made root only.  Over time that results in more and more code running
>    as root to be able to make use of the useful features of the linux
>    kernel.
>
>    One of the goals of the user namespace is to avoid more and more code
>    migrating to running as root.  To achieve that goal ordinary
>    application developers need to be able to assume that typically user
>    namespaces will be available on linux.
>
>    An assumption that ordinary applications like chromium make today.

I don't necessarily disagree with statement 2. and in a perfect world yes.
But practically as noted by Paul in his email, Linux is flexible and
speaking about kubernetes world we have multiple workload per
namespaces, and we would like a solution that we can support in the
next months.

Also these are features that some user space may use, some may not, we
will never be able to dictate to all user space applications how to do things.

From bpf side observability or bpf-lsm enforcement it allows to escalate how
to respond to the task and *make lsm and bpf (bpf-lsm) have a consistent
design* where they both follow the same path.

It is unfortunate that the security_task_alloc() [1] hook is  _late_ and can't
be used for context initialization as the credentials and even user namespace
have already been created. Strictly speaking we have a context that has
been already created and applied and we can't properly catch it !

There is no way to do that from user space as most bpf based tools
(observability and enforcement) do not and should not mess up at the
user space level with the namespace configuration of tasks (/proc...), they
are external programs to the running tasks, they do not set up the
environment. Having the hook before the namespaces and creds copying
allows to properly track this and construct the _right_ context. From lsm
and bpf-lsm this will definitely offer a better interface that is not prone to
errors.

We would like an answer here or an alternative hook that is placed before
the creation/setting of any namespace, credentials or creating new keyring.
So we can provide bpf-based transparent solutions that work.

[1] https://elixir.bootlin.com/linux/v5.18.13/source/kernel/fork.c#L2216



> My apologies if this has been addressed somewhere in the conversation
> already.  I don't see these issues addressed in the descriptions of your
> patches.
>
> Until these issues are firmly addressed and you are not proposing a
> patch that can only cause regressions in userspace applications.
>
> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>
>
> >
> > The calls look something like this:
> >
> >     cred = prepare_creds()
> >         security_prepare_creds()
> >             call_int_hook(cred_prepare, ...
> >     if (cred)
> >         create_user_ns(cred)
> >
> > We noticed that error codes were not propagated from this hook and
> > introduced a patch [1] to propagate those errors.
> >
> > The discussion notes that security_prepare_creds()
> > is not appropriate for MAC policies, and instead the hook is
> > meant for LSM authors to prepare credentials for mutation. [2]
> >
> > Ultimately, we concluded that a better course of action is to introduce
> > a new security hook for LSM authors. [3]
> >
> > This patch set first introduces a new security_create_user_ns() function
> > and userns_create LSM hook, then marks the hook as sleepable in BPF.
> >
> > Links:
> > 1. https://lore.kernel.org/all/20220608150942.776446-1-fred@cloudflare.com/
> > 2. https://lore.kernel.org/all/87y1xzyhub.fsf@email.froward.int.ebiederm.org/
> > 3. https://lore.kernel.org/all/9fe9cd9f-1ded-a179-8ded-5fde8960a586@cloudflare.com/
> >
> > Past discussions:
> > V2: https://lore.kernel.org/all/20220707223228.1940249-1-fred@cloudflare.com/
> > V1: https://lore.kernel.org/all/20220621233939.993579-1-fred@cloudflare.com/
> >
> > Changes since v2:
> > - Rename create_user_ns hook to userns_create
> > - Use user_namespace as an object opposed to a generic namespace object
> > - s/domB_t/domA_t in commit message
> > Changes since v1:
> > - Add selftests/bpf: Add tests verifying bpf lsm create_user_ns hook patch
> > - Add selinux: Implement create_user_ns hook patch
> > - Change function signature of security_create_user_ns() to only take
> >   struct cred
> > - Move security_create_user_ns() call after id mapping check in
> >   create_user_ns()
> > - Update documentation to reflect changes
> >
> > Frederick Lawler (4):
> >   security, lsm: Introduce security_create_user_ns()
> >   bpf-lsm: Make bpf_lsm_userns_create() sleepable
> >   selftests/bpf: Add tests verifying bpf lsm userns_create hook
> >   selinux: Implement userns_create hook
> >
> >  include/linux/lsm_hook_defs.h                 |  1 +
> >  include/linux/lsm_hooks.h                     |  4 +
> >  include/linux/security.h                      |  6 ++
> >  kernel/bpf/bpf_lsm.c                          |  1 +
> >  kernel/user_namespace.c                       |  5 ++
> >  security/security.c                           |  5 ++
> >  security/selinux/hooks.c                      |  9 ++
> >  security/selinux/include/classmap.h           |  2 +
> >  .../selftests/bpf/prog_tests/deny_namespace.c | 88 +++++++++++++++++++
> >  .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
> >  10 files changed, 160 insertions(+)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/deny_namespace.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_deny_namespace.c
> >
> > --
> > 2.30.2
>
> Eric
Ignat Korchagin July 26, 2022, 2:41 p.m. UTC | #6
On Fri, Jul 22, 2022 at 6:05 PM Eric W. Biederman <ebiederm@xmission.com> wrote:
>
> Frederick Lawler <fred@cloudflare.com> writes:
>
> > While creating a LSM BPF MAC policy to block user namespace creation, we
> > used the LSM cred_prepare hook because that is the closest hook to prevent
> > a call to create_user_ns().
>
> That description is wrong.  Your goal his is not to limit access to
> the user namespace.  Your goal is to reduce the attack surface of the
> kernel by not allowing some processes access to a user namespace.
>
> You have already said that you don't have concerns about the
> fundamentals of the user namespace, and what it enables only that
> it allows access to exploitable code.
>
> Achieving the protection you seek requires talking and thinking clearly
> about the goal.
>
>
>
>
> I have a couple of deep and fundamental problems with this approach,
> to limiting access to potentially exploitable code.
>
> 1) The first is that unless there is a high probability (say 90%) that at
>    any time the only exploitable code in the kernel can only be accessed
>    by an unprivileged user with the help of user namespaces, attackers
>    will just route around this restriction and so it will achieve
>    nothing in practice, while at the same time incur an extra
>    maintenance burden.
>
> 2) The second is that there is a long standing problem with code that
>    gets added to the kernel.  Many times new kernel code because it has
>    the potential to confuse suid root executables that code has been
>    made root only.  Over time that results in more and more code running
>    as root to be able to make use of the useful features of the linux
>    kernel.
>
>    One of the goals of the user namespace is to avoid more and more code
>    migrating to running as root.  To achieve that goal ordinary
>    application developers need to be able to assume that typically user
>    namespaces will be available on linux.
>
>    An assumption that ordinary applications like chromium make today.
>
>    Your intentions seem to be to place a capability check so that only
>    root can use user namespaces or something of the sort.  Thus breaking
>    the general availability of user namespaces for ordinary applications
>    on your systems.

I would like to comment here that our intention with the hook is quite
the opposite:
we do want to embrace user namespaces in our code and some of our workloads
already depend on it. Hence we didn't agree to Debian's approach of just
having a global sysctl. But there is "our code" and there is "third
party" code, which
might not even be open source due to various reasons. And while the path exists
for that code to do something bad - we want to block it.

So in a way, I think this hook allows better adoption of user
namespaces in the first
place and gives distros and other system maintainers a reasonable
alternative than
just providing a global "kill" sysctl (which is de-facto is used by
many, thus actually
limiting userspace applications accessing the user namespace functionality)

>
> My apologies if this has been addressed somewhere in the conversation
> already.  I don't see these issues addressed in the descriptions of your
> patches.
>
> Until these issues are firmly addressed and you are not proposing a
> patch that can only cause regressions in userspace applications.
>
> Nacked-by: "Eric W. Biederman" <ebiederm@xmission.com>
>
> >
> > The calls look something like this:
> >
> >     cred = prepare_creds()
> >         security_prepare_creds()
> >             call_int_hook(cred_prepare, ...
> >     if (cred)
> >         create_user_ns(cred)
> >
> > We noticed that error codes were not propagated from this hook and
> > introduced a patch [1] to propagate those errors.
> >
> > The discussion notes that security_prepare_creds()
> > is not appropriate for MAC policies, and instead the hook is
> > meant for LSM authors to prepare credentials for mutation. [2]
> >
> > Ultimately, we concluded that a better course of action is to introduce
> > a new security hook for LSM authors. [3]
> >
> > This patch set first introduces a new security_create_user_ns() function
> > and userns_create LSM hook, then marks the hook as sleepable in BPF.
> >
> > Links:
> > 1. https://lore.kernel.org/all/20220608150942.776446-1-fred@cloudflare.com/
> > 2. https://lore.kernel.org/all/87y1xzyhub.fsf@email.froward.int.ebiederm.org/
> > 3. https://lore.kernel.org/all/9fe9cd9f-1ded-a179-8ded-5fde8960a586@cloudflare.com/
> >
> > Past discussions:
> > V2: https://lore.kernel.org/all/20220707223228.1940249-1-fred@cloudflare.com/
> > V1: https://lore.kernel.org/all/20220621233939.993579-1-fred@cloudflare.com/
> >
> > Changes since v2:
> > - Rename create_user_ns hook to userns_create
> > - Use user_namespace as an object opposed to a generic namespace object
> > - s/domB_t/domA_t in commit message
> > Changes since v1:
> > - Add selftests/bpf: Add tests verifying bpf lsm create_user_ns hook patch
> > - Add selinux: Implement create_user_ns hook patch
> > - Change function signature of security_create_user_ns() to only take
> >   struct cred
> > - Move security_create_user_ns() call after id mapping check in
> >   create_user_ns()
> > - Update documentation to reflect changes
> >
> > Frederick Lawler (4):
> >   security, lsm: Introduce security_create_user_ns()
> >   bpf-lsm: Make bpf_lsm_userns_create() sleepable
> >   selftests/bpf: Add tests verifying bpf lsm userns_create hook
> >   selinux: Implement userns_create hook
> >
> >  include/linux/lsm_hook_defs.h                 |  1 +
> >  include/linux/lsm_hooks.h                     |  4 +
> >  include/linux/security.h                      |  6 ++
> >  kernel/bpf/bpf_lsm.c                          |  1 +
> >  kernel/user_namespace.c                       |  5 ++
> >  security/security.c                           |  5 ++
> >  security/selinux/hooks.c                      |  9 ++
> >  security/selinux/include/classmap.h           |  2 +
> >  .../selftests/bpf/prog_tests/deny_namespace.c | 88 +++++++++++++++++++
> >  .../selftests/bpf/progs/test_deny_namespace.c | 39 ++++++++
> >  10 files changed, 160 insertions(+)
> >  create mode 100644 tools/testing/selftests/bpf/prog_tests/deny_namespace.c
> >  create mode 100644 tools/testing/selftests/bpf/progs/test_deny_namespace.c
> >
> > --
> > 2.30.2
>
> Eric
Frederick Lawler Aug. 1, 2022, 1:13 p.m. UTC | #7
On 7/22/22 7:20 AM, Paul Moore wrote:
> On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
> 
>> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
>>> While creating a LSM BPF MAC policy to block user namespace creation, we
>>> used the LSM cred_prepare hook because that is the closest hook to prevent
>>> a call to create_user_ns().
>>>
>>> The calls look something like this:
>>>
>>> cred = prepare_creds()
>>> security_prepare_creds()
>>> call_int_hook(cred_prepare, ...
>>> if (cred)
>>> create_user_ns(cred)
>>>
>>> We noticed that error codes were not propagated from this hook and
>>> introduced a patch [1] to propagate those errors.
>>>
>>> The discussion notes that security_prepare_creds()
>>> is not appropriate for MAC policies, and instead the hook is
>>> meant for LSM authors to prepare credentials for mutation. [2]
>>>
>>> Ultimately, we concluded that a better course of action is to introduce
>>> a new security hook for LSM authors. [3]
>>>
>>> This patch set first introduces a new security_create_user_ns() function
>>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
>> Patch 1 and 4 still need review from the lsm/security side.
> 
> 
> This patchset is in my review queue and assuming everything checks out, I expect to merge it after the upcoming merge window closes.
> 
> I would also need an ACK from the BPF LSM folks, but they're CC'd on this patchset.
> 

Based on last weeks comments, should I go ahead and put up v4 for 
5.20-rc1 when that drops, or do I need to wait for more feedback?

> --
> paul-moore.com
> 
>
Paul Moore Aug. 1, 2022, 3:19 p.m. UTC | #8
On Mon, Aug 1, 2022 at 9:13 AM Frederick Lawler <fred@cloudflare.com> wrote:
> On 7/22/22 7:20 AM, Paul Moore wrote:
> > On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
> >
> >> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
> >>> While creating a LSM BPF MAC policy to block user namespace creation, we
> >>> used the LSM cred_prepare hook because that is the closest hook to prevent
> >>> a call to create_user_ns().
> >>>
> >>> The calls look something like this:
> >>>
> >>> cred = prepare_creds()
> >>> security_prepare_creds()
> >>> call_int_hook(cred_prepare, ...
> >>> if (cred)
> >>> create_user_ns(cred)
> >>>
> >>> We noticed that error codes were not propagated from this hook and
> >>> introduced a patch [1] to propagate those errors.
> >>>
> >>> The discussion notes that security_prepare_creds()
> >>> is not appropriate for MAC policies, and instead the hook is
> >>> meant for LSM authors to prepare credentials for mutation. [2]
> >>>
> >>> Ultimately, we concluded that a better course of action is to introduce
> >>> a new security hook for LSM authors. [3]
> >>>
> >>> This patch set first introduces a new security_create_user_ns() function
> >>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
> >> Patch 1 and 4 still need review from the lsm/security side.
> >
> > This patchset is in my review queue and assuming everything checks out, I expect to merge it after the upcoming merge window closes.
> >
> > I would also need an ACK from the BPF LSM folks, but they're CC'd on this patchset.
>
> Based on last weeks comments, should I go ahead and put up v4 for
> 5.20-rc1 when that drops, or do I need to wait for more feedback?

In general it rarely hurts to make another revision, and I think
you've gotten some decent feedback on this draft, especially around
the BPF LSM tests; I think rebasing on Linus tree after the upcoming
io_uring changes are merged would be a good idea.  Although as a
reminder to the BPF LSM folks - I'm looking at you KP Singh :) - I
need an ACK from you guys before I merge the BPF related patches
(patches {2,3}/4).  For the record, I think the SELinux portion of
this patchset (path 4/4) is fine.

There is the issue of Eric's NACK, but I believe the responses that
followed his comment sufficiently addressed those concerns and it has
now been a week with no further comment from Eric; we should continue
to move forward with this.
Casey Schaufler Aug. 1, 2022, 3:25 p.m. UTC | #9
On 8/1/2022 6:13 AM, Frederick Lawler wrote:
> On 7/22/22 7:20 AM, Paul Moore wrote:
>> On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
>>
>>> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
>>>> While creating a LSM BPF MAC policy to block user namespace
>>>> creation, we
>>>> used the LSM cred_prepare hook because that is the closest hook to
>>>> prevent
>>>> a call to create_user_ns().
>>>>
>>>> The calls look something like this:
>>>>
>>>> cred = prepare_creds()
>>>> security_prepare_creds()
>>>> call_int_hook(cred_prepare, ...
>>>> if (cred)
>>>> create_user_ns(cred)
>>>>
>>>> We noticed that error codes were not propagated from this hook and
>>>> introduced a patch [1] to propagate those errors.
>>>>
>>>> The discussion notes that security_prepare_creds()
>>>> is not appropriate for MAC policies, and instead the hook is
>>>> meant for LSM authors to prepare credentials for mutation. [2]
>>>>
>>>> Ultimately, we concluded that a better course of action is to
>>>> introduce
>>>> a new security hook for LSM authors. [3]
>>>>
>>>> This patch set first introduces a new security_create_user_ns()
>>>> function
>>>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
>>> Patch 1 and 4 still need review from the lsm/security side.
>>
>>
>> This patchset is in my review queue and assuming everything checks
>> out, I expect to merge it after the upcoming merge window closes.
>>
>> I would also need an ACK from the BPF LSM folks, but they're CC'd on
>> this patchset.
>>
>
> Based on last weeks comments, should I go ahead and put up v4 for
> 5.20-rc1 when that drops, or do I need to wait for more feedback?

As the primary consumer of this hook is BPF I would really expect their
reviewed-by before accepting this. 

>
>> -- 
>> paul-moore.com
>>
>>
>
Paul Moore Aug. 1, 2022, 4:34 p.m. UTC | #10
On Mon, Aug 1, 2022 at 11:25 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> On 8/1/2022 6:13 AM, Frederick Lawler wrote:
> > On 7/22/22 7:20 AM, Paul Moore wrote:
> >> On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
> >>
> >>> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
> >>>> While creating a LSM BPF MAC policy to block user namespace
> >>>> creation, we
> >>>> used the LSM cred_prepare hook because that is the closest hook to
> >>>> prevent
> >>>> a call to create_user_ns().
> >>>>
> >>>> The calls look something like this:
> >>>>
> >>>> cred = prepare_creds()
> >>>> security_prepare_creds()
> >>>> call_int_hook(cred_prepare, ...
> >>>> if (cred)
> >>>> create_user_ns(cred)
> >>>>
> >>>> We noticed that error codes were not propagated from this hook and
> >>>> introduced a patch [1] to propagate those errors.
> >>>>
> >>>> The discussion notes that security_prepare_creds()
> >>>> is not appropriate for MAC policies, and instead the hook is
> >>>> meant for LSM authors to prepare credentials for mutation. [2]
> >>>>
> >>>> Ultimately, we concluded that a better course of action is to
> >>>> introduce
> >>>> a new security hook for LSM authors. [3]
> >>>>
> >>>> This patch set first introduces a new security_create_user_ns()
> >>>> function
> >>>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
> >>> Patch 1 and 4 still need review from the lsm/security side.
> >>
> >>
> >> This patchset is in my review queue and assuming everything checks
> >> out, I expect to merge it after the upcoming merge window closes.
> >>
> >> I would also need an ACK from the BPF LSM folks, but they're CC'd on
> >> this patchset.
> >
> > Based on last weeks comments, should I go ahead and put up v4 for
> > 5.20-rc1 when that drops, or do I need to wait for more feedback?
>
> As the primary consumer of this hook is BPF I would really expect their
> reviewed-by before accepting this.

We love all our in-tree LSMs equally.  As long as there is at least
one LSM which provides an implementation and has ACK'd the hook, and
no other LSMs have NACK'd the hook, then I have no problem merging it.
I doubt it will be necessary in this case, but if we need to tweak the
hook in the future we can definitely do that; we've done this in the
past when it has made sense.

As a reminder, the LSM hooks are *not* part of the "don't break
userspace" promise.  I know it gets a little muddy with the way the
BPF LSM works, but just as we don't want to allow one LSM to impact
the runtime controls on another, we don't want to allow one LSM to
freeze the hooks for everyone.
KP Singh Aug. 2, 2022, 9:24 p.m. UTC | #11
On Mon, Aug 1, 2022 at 5:19 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Mon, Aug 1, 2022 at 9:13 AM Frederick Lawler <fred@cloudflare.com> wrote:
> > On 7/22/22 7:20 AM, Paul Moore wrote:
> > > On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
> > >
> > >> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
> > >>> While creating a LSM BPF MAC policy to block user namespace creation, we
> > >>> used the LSM cred_prepare hook because that is the closest hook to prevent
> > >>> a call to create_user_ns().
> > >>>
> > >>> The calls look something like this:
> > >>>
> > >>> cred = prepare_creds()
> > >>> security_prepare_creds()
> > >>> call_int_hook(cred_prepare, ...
> > >>> if (cred)
> > >>> create_user_ns(cred)
> > >>>
> > >>> We noticed that error codes were not propagated from this hook and
> > >>> introduced a patch [1] to propagate those errors.
> > >>>
> > >>> The discussion notes that security_prepare_creds()
> > >>> is not appropriate for MAC policies, and instead the hook is
> > >>> meant for LSM authors to prepare credentials for mutation. [2]
> > >>>
> > >>> Ultimately, we concluded that a better course of action is to introduce
> > >>> a new security hook for LSM authors. [3]
> > >>>
> > >>> This patch set first introduces a new security_create_user_ns() function
> > >>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
> > >> Patch 1 and 4 still need review from the lsm/security side.
> > >
> > > This patchset is in my review queue and assuming everything checks out, I expect to merge it after the upcoming merge window closes.
> > >
> > > I would also need an ACK from the BPF LSM folks, but they're CC'd on this patchset.
> >
> > Based on last weeks comments, should I go ahead and put up v4 for
> > 5.20-rc1 when that drops, or do I need to wait for more feedback?
>
> In general it rarely hurts to make another revision, and I think
> you've gotten some decent feedback on this draft, especially around
> the BPF LSM tests; I think rebasing on Linus tree after the upcoming
> io_uring changes are merged would be a good idea.  Although as a
> reminder to the BPF LSM folks - I'm looking at you KP Singh :) - I
> need an ACK from you guys before I merge the BPF related patches

Apologies, I was on vacation. I am looking at the patches now.
Reviews and acks coming soon :)

- KP

> (patches {2,3}/4).  For the record, I think the SELinux portion of
> this patchset (path 4/4) is fine.
>

[...]

>
> --
> paul-moore.com
KP Singh Aug. 2, 2022, 9:27 p.m. UTC | #12
On Mon, Aug 1, 2022 at 6:35 PM Paul Moore <paul@paul-moore.com> wrote:
>
> On Mon, Aug 1, 2022 at 11:25 AM Casey Schaufler <casey@schaufler-ca.com> wrote:
> > On 8/1/2022 6:13 AM, Frederick Lawler wrote:
> > > On 7/22/22 7:20 AM, Paul Moore wrote:
> > >> On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
> > >>
> > >>> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
> > >>>> While creating a LSM BPF MAC policy to block user namespace
> > >>>> creation, we
> > >>>> used the LSM cred_prepare hook because that is the closest hook to
> > >>>> prevent
> > >>>> a call to create_user_ns().
> > >>>>
> > >>>> The calls look something like this:
> > >>>>
> > >>>> cred = prepare_creds()
> > >>>> security_prepare_creds()
> > >>>> call_int_hook(cred_prepare, ...
> > >>>> if (cred)
> > >>>> create_user_ns(cred)
> > >>>>
> > >>>> We noticed that error codes were not propagated from this hook and
> > >>>> introduced a patch [1] to propagate those errors.
> > >>>>
> > >>>> The discussion notes that security_prepare_creds()
> > >>>> is not appropriate for MAC policies, and instead the hook is
> > >>>> meant for LSM authors to prepare credentials for mutation. [2]
> > >>>>
> > >>>> Ultimately, we concluded that a better course of action is to
> > >>>> introduce
> > >>>> a new security hook for LSM authors. [3]
> > >>>>
> > >>>> This patch set first introduces a new security_create_user_ns()
> > >>>> function
> > >>>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
> > >>> Patch 1 and 4 still need review from the lsm/security side.
> > >>
> > >>
> > >> This patchset is in my review queue and assuming everything checks
> > >> out, I expect to merge it after the upcoming merge window closes.
> > >>
> > >> I would also need an ACK from the BPF LSM folks, but they're CC'd on
> > >> this patchset.
> > >
> > > Based on last weeks comments, should I go ahead and put up v4 for
> > > 5.20-rc1 when that drops, or do I need to wait for more feedback?
> >
> > As the primary consumer of this hook is BPF I would really expect their
> > reviewed-by before accepting this.
>
> We love all our in-tree LSMs equally.  As long as there is at least
> one LSM which provides an implementation and has ACK'd the hook, and
> no other LSMs have NACK'd the hook, then I have no problem merging it.
> I doubt it will be necessary in this case, but if we need to tweak the
> hook in the future we can definitely do that; we've done this in the
> past when it has made sense.
>
> As a reminder, the LSM hooks are *not* part of the "don't break
> userspace" promise.  I know it gets a little muddy with the way the

That's correct. Also, with BPF LSM, we encourage users to
build the application / bpf program logic to be resilient to changes
in the LSM hooks.

I am happy to share how we've done it, if folks are interested.

- KP

> BPF LSM works, but just as we don't want to allow one LSM to impact
> the runtime controls on another, we don't want to allow one LSM to
> freeze the hooks for everyone.
>
> --
> paul-moore.com
Eric W. Biederman Aug. 2, 2022, 9:33 p.m. UTC | #13
Paul Moore <paul@paul-moore.com> writes:

> On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
>
>> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
>>> While creating a LSM BPF MAC policy to block user namespace creation, we
>>> used the LSM cred_prepare hook because that is the closest hook to prevent
>>> a call to create_user_ns().
>>>
>>> The calls look something like this:
>>>
>>> cred = prepare_creds()
>>> security_prepare_creds()
>>> call_int_hook(cred_prepare, ...
>>> if (cred)
>>> create_user_ns(cred)
>>>
>>> We noticed that error codes were not propagated from this hook and
>>> introduced a patch [1] to propagate those errors.
>>>
>>> The discussion notes that security_prepare_creds()
>>> is not appropriate for MAC policies, and instead the hook is
>>> meant for LSM authors to prepare credentials for mutation. [2]
>>>
>>> Ultimately, we concluded that a better course of action is to introduce
>>> a new security hook for LSM authors. [3]
>>>
>>> This patch set first introduces a new security_create_user_ns() function
>>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
>> Patch 1 and 4 still need review from the lsm/security side.
>
>
> This patchset is in my review queue and assuming everything checks
> out, I expect to merge it after the upcoming merge window closes.

It doesn't even address my issues with the last patchset.

So it has my NACK.

Eric
Paul Moore Aug. 3, 2022, 1:49 a.m. UTC | #14
On Tue, Aug 2, 2022 at 5:25 PM KP Singh <kpsingh@kernel.org> wrote:
> On Mon, Aug 1, 2022 at 5:19 PM Paul Moore <paul@paul-moore.com> wrote:
> > On Mon, Aug 1, 2022 at 9:13 AM Frederick Lawler <fred@cloudflare.com> wrote:
> > > On 7/22/22 7:20 AM, Paul Moore wrote:
> > > > On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
> > > >
> > > >> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
> > > >>> While creating a LSM BPF MAC policy to block user namespace creation, we
> > > >>> used the LSM cred_prepare hook because that is the closest hook to prevent
> > > >>> a call to create_user_ns().
> > > >>>
> > > >>> The calls look something like this:
> > > >>>
> > > >>> cred = prepare_creds()
> > > >>> security_prepare_creds()
> > > >>> call_int_hook(cred_prepare, ...
> > > >>> if (cred)
> > > >>> create_user_ns(cred)
> > > >>>
> > > >>> We noticed that error codes were not propagated from this hook and
> > > >>> introduced a patch [1] to propagate those errors.
> > > >>>
> > > >>> The discussion notes that security_prepare_creds()
> > > >>> is not appropriate for MAC policies, and instead the hook is
> > > >>> meant for LSM authors to prepare credentials for mutation. [2]
> > > >>>
> > > >>> Ultimately, we concluded that a better course of action is to introduce
> > > >>> a new security hook for LSM authors. [3]
> > > >>>
> > > >>> This patch set first introduces a new security_create_user_ns() function
> > > >>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
> > > >> Patch 1 and 4 still need review from the lsm/security side.
> > > >
> > > > This patchset is in my review queue and assuming everything checks out, I expect to merge it after the upcoming merge window closes.
> > > >
> > > > I would also need an ACK from the BPF LSM folks, but they're CC'd on this patchset.
> > >
> > > Based on last weeks comments, should I go ahead and put up v4 for
> > > 5.20-rc1 when that drops, or do I need to wait for more feedback?
> >
> > In general it rarely hurts to make another revision, and I think
> > you've gotten some decent feedback on this draft, especially around
> > the BPF LSM tests; I think rebasing on Linus tree after the upcoming
> > io_uring changes are merged would be a good idea.

As I was typing up my reply I realized I mistakenly mentioned the
io_uring changes that Linus just merged today - oops!  If you haven't
figured it out already, you can disregard that comment, that's a
completely different problem and a completely different set of patches
:)

> > Although as a
> > reminder to the BPF LSM folks - I'm looking at you KP Singh :) - I
> > need an ACK from you guys before I merge the BPF related patches
>
> Apologies, I was on vacation. I am looking at the patches now.
> Reviews and acks coming soon :)

No worries, we've still got the two weeks of the merge window before I
can do anything into linux-next - thanks KP!
Frederick Lawler Aug. 3, 2022, 3:20 p.m. UTC | #15
On 8/2/22 4:33 PM, Eric W. Biederman wrote:
> Paul Moore <paul@paul-moore.com> writes:
> 
>> On July 22, 2022 2:12:03 AM Martin KaFai Lau <kafai@fb.com> wrote:
>>
>>> On Thu, Jul 21, 2022 at 12:28:04PM -0500, Frederick Lawler wrote:
>>>> While creating a LSM BPF MAC policy to block user namespace creation, we
>>>> used the LSM cred_prepare hook because that is the closest hook to prevent
>>>> a call to create_user_ns().
>>>>
>>>> The calls look something like this:
>>>>
>>>> cred = prepare_creds()
>>>> security_prepare_creds()
>>>> call_int_hook(cred_prepare, ...
>>>> if (cred)
>>>> create_user_ns(cred)
>>>>
>>>> We noticed that error codes were not propagated from this hook and
>>>> introduced a patch [1] to propagate those errors.
>>>>
>>>> The discussion notes that security_prepare_creds()
>>>> is not appropriate for MAC policies, and instead the hook is
>>>> meant for LSM authors to prepare credentials for mutation. [2]
>>>>
>>>> Ultimately, we concluded that a better course of action is to introduce
>>>> a new security hook for LSM authors. [3]
>>>>
>>>> This patch set first introduces a new security_create_user_ns() function
>>>> and userns_create LSM hook, then marks the hook as sleepable in BPF.
>>> Patch 1 and 4 still need review from the lsm/security side.
>>
>>
>> This patchset is in my review queue and assuming everything checks
>> out, I expect to merge it after the upcoming merge window closes.
> 
> It doesn't even address my issues with the last patchset.

Are you referring to [1], and with regards to [2], is the issue that the 
wording could be improved for both the cover letter and patch 1/4?

Ultimately, the goal of CF is to leverage and use user namespaces and 
block tasks whose meta information do not align with our allow list 
criteria. Yes, there is a higher goal of restricting our attack surface. 
Yes, people will find ways around security. The point is to have 
multiple levels of security, and this patch series allows people to add 
another level.

Calling this hook a regression is not true since there's no actual 
regression in the code. What would constitute a perceived regression is 
an admin imposing such a SELinux or BPF restriction within their 
company, but developers in that company ideally would try to work with 
the admin to enable user namespaces for certain use cases, or 
alternatively do what you don't want given current tooling: always run 
code as root. That's where this hook comes in: let people observe and 
enforce how they see fit. The average enthusiasts would see no impact.

I was requested to add _some_ test to BPF and to add a SELinux 
implementation. The low hanging fruit for a test to prove that the hook 
is capable of doing _something_ was to simply just block outright, and 
provide _some example_ of use. It doesn't make sense for us to write a 
test that outlines specifically what CF or others are doing because that 
would put too much emphasis on an implementation detail that doesn't 
matter to prove that the hook works.

Without Djalal's comment, I can't defend an observability use case that 
we're not currently leveraging. We have it now, so therefore I'll defend 
it per KP's suggestion[3] in v5.

By not responding to the email discussions, we can't accurately gauge 
what should or should not be in the descriptions. No one here 
necessarily disagrees with some of the points you made, and others have 
appropriately responded. As others have also wrote, you're not proposing 
alternatives. How do you expect us to work with that?

Please, let us know which bits and pieces ought to be included in the 
descriptions, and let us know what things we should call out caveats to 
that would satisfy your concerns.

Links:
1. 
https://lore.kernel.org/all/01368386-521f-230b-1d49-de19377c27d1@cloudflare.com/
2. 
https://lore.kernel.org/all/877d45kri4.fsf@email.froward.int.ebiederm.org/#t
3. 
https://lore.kernel.org/all/CACYkzJ4x90DamdN4dRCn1gZuAHLqJNy4MoP=qTX+44Bqx1uxSQ@mail.gmail.com/
4. 
https://lore.kernel.org/all/CAEiveUdPhEPAk7Y0ZXjPsD=Vb5hn453CHzS9aG-tkyRa8bf_eg@mail.gmail.com/#t

> 
> So it has my NACK.
> 
> Eric