diff mbox series

Simplify ambient capability dropping in iproute2:ip tool.

Message ID CABCx3R0QbN2anNX5mO1iPGZNgS=wdWr+Rb=bYGwf24o6jxjnaQ@mail.gmail.com (mailing list archive)
State New, archived
Delegated to: David Ahern
Headers show
Series Simplify ambient capability dropping in iproute2:ip tool. | expand

Checks

Context Check Description
netdev/tree_selection success Not a local patch

Commit Message

Tinkerer One April 29, 2022, 3:17 a.m. UTC
Hi,

This is expanded from https://github.com/shemminger/iproute2/issues/62
which I'm told is not the way to report issues and offer fixes to
iproute2 etc.

[I'm not subscribed to the netdev list, so please cc: me if you need more info.]

The original change that added the drop_cap() code was:

https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/?id=ba2fc55b99f8363c80ce36681bc1ec97690b66f5

In an attempt to address some user feedback, the code was further
complicated by:

https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/?id=9b13cc98f5952f62b825461727c8170d37a4037d

Another user issue was asked about here (a couple days ago):

https://stackoverflow.com/questions/72015197/allow-non-root-user-of-container-to-execute-binaries-that-need-capabilities

I looked into what was going on and found that lib/utils.c contains
some complicated code that seems to be trying to prevent Ambient
capabilities from being inherited except in specific cases
(ip/ip.c:main() calls drop_cap() except in the ip vrf exec case.). The
code clears all capabilities in order to prevent Ambient capabilities
from being available. The following change achieves suppression of
Ambient capabilities much more precisely. It also permits ip to not
need to be setuid-root or executed under sudo since it can now be
optionally empowered by file capabilities:

Comments

Luca Boccassi April 29, 2022, 9:56 a.m. UTC | #1
On Thu, 2022-04-28 at 20:17 -0700, Tinkerer One wrote:
> Hi,
> 
> This is expanded from https://github.com/shemminger/iproute2/issues/62
> which I'm told is not the way to report issues and offer fixes to
> iproute2 etc.
> 
> [I'm not subscribed to the netdev list, so please cc: me if you need more info.]
> 
> The original change that added the drop_cap() code was:
> 
> https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/?id=ba2fc55b99f8363c80ce36681bc1ec97690b66f5
> 
> In an attempt to address some user feedback, the code was further
> complicated by:
> 
> https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/?id=9b13cc98f5952f62b825461727c8170d37a4037d
> 
> Another user issue was asked about here (a couple days ago):
> 
> https://stackoverflow.com/questions/72015197/allow-non-root-user-of-container-to-execute-binaries-that-need-capabilities
> 
> I looked into what was going on and found that lib/utils.c contains
> some complicated code that seems to be trying to prevent Ambient
> capabilities from being inherited except in specific cases
> (ip/ip.c:main() calls drop_cap() except in the ip vrf exec case.). The
> code clears all capabilities in order to prevent Ambient capabilities
> from being available. The following change achieves suppression of
> Ambient capabilities much more precisely. It also permits ip to not
> need to be setuid-root or executed under sudo since it can now be
> optionally empowered by file capabilities:
> 
> diff --git a/lib/utils.c b/lib/utils.c
> index 53d31006..681e4aee 100644
> --- a/lib/utils.c
> +++ b/lib/utils.c
> @@ -1555,25 +1555,10 @@ void drop_cap(void)
>  #ifdef HAVE_LIBCAP
>         /* don't harmstring root/sudo */
>         if (getuid() != 0 && geteuid() != 0) {
> -               cap_t capabilities;
> -               cap_value_t net_admin = CAP_NET_ADMIN;
> -               cap_flag_t inheritable = CAP_INHERITABLE;
> -               cap_flag_value_t is_set;
> -
> -               capabilities = cap_get_proc();
> -               if (!capabilities)
> -                       exit(EXIT_FAILURE);
> -               if (cap_get_flag(capabilities, net_admin, inheritable,
> -                   &is_set) != 0)
> +               /* prevent any ambient capabilities from being inheritable */
> +               if (cap_reset_ambient() != 0) {
>                         exit(EXIT_FAILURE);
> -               /* apps with ambient caps can fork and call ip */
> -               if (is_set == CAP_CLEAR) {
> -                       if (cap_clear(capabilities) != 0)
> -                               exit(EXIT_FAILURE);
> -                       if (cap_set_proc(capabilities) != 0)
> -                               exit(EXIT_FAILURE);
>                 }
> -               cap_free(capabilities);
>         }
>  #endif
>  }

The current setup is necessary, as the commit message says:

"Users have reported a regression due to ip now dropping capabilities
unconditionally.
zerotier-one VPN and VirtualBox use ambient capabilities in their
binary and then fork out to ip to set routes and links, and this
does not work anymore.

As a workaround, do not drop caps if CAP_NET_ADMIN (the most common
capability used by ip) is set with the INHERITABLE flag.
Users that want ip vrf exec to work do not need to set INHERITABLE,
which will then only set when the calling program had privileges to
give itself the ambient capability."

Besides, giving setuid to ip itself would be very dangerous, and should
definitely not be supported. I am not aware of any distribution that
does it. If there is any, it should be removed. Even for the vrf exec
case, on Debian/Ubuntu I've set it up so that the caps are not
configured by default, but require admin action at install time to
enable, with a clear warning about the possible risks and tradeoffs.
Tinkerer One April 30, 2022, 4:26 a.m. UTC | #2
On Fri, Apr 29, 2022 at 2:56 AM Luca Boccassi <bluca@debian.org> wrote:
>
> On Thu, 2022-04-28 at 20:17 -0700, Tinkerer One wrote:
> > Hi,
> >
> > This is expanded from https://github.com/shemminger/iproute2/issues/62
> > which I'm told is not the way to report issues and offer fixes to
> > iproute2 etc.
> >
> > [I'm not subscribed to the netdev list, so please cc: me if you need more info.]
> >
> > The original change that added the drop_cap() code was:
> >
> > https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/?id=ba2fc55b99f8363c80ce36681bc1ec97690b66f5
> >
> > In an attempt to address some user feedback, the code was further
> > complicated by:
> >
> > https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/?id=9b13cc98f5952f62b825461727c8170d37a4037d
> >
> > Another user issue was asked about here (a couple days ago):
> >
> > https://stackoverflow.com/questions/72015197/allow-non-root-user-of-container-to-execute-binaries-that-need-capabilities
> >
> > I looked into what was going on and found that lib/utils.c contains
> > some complicated code that seems to be trying to prevent Ambient
> > capabilities from being inherited except in specific cases
> > (ip/ip.c:main() calls drop_cap() except in the ip vrf exec case.). The
> > code clears all capabilities in order to prevent Ambient capabilities
> > from being available. The following change achieves suppression of
> > Ambient capabilities much more precisely. It also permits ip to not
> > need to be setuid-root or executed under sudo since it can now be
> > optionally empowered by file capabilities:
> >
> > diff --git a/lib/utils.c b/lib/utils.c
> > index 53d31006..681e4aee 100644
> > --- a/lib/utils.c
> > +++ b/lib/utils.c
> > @@ -1555,25 +1555,10 @@ void drop_cap(void)
> >  #ifdef HAVE_LIBCAP
> >         /* don't harmstring root/sudo */
> >         if (getuid() != 0 && geteuid() != 0) {
> > -               cap_t capabilities;
> > -               cap_value_t net_admin = CAP_NET_ADMIN;
> > -               cap_flag_t inheritable = CAP_INHERITABLE;
> > -               cap_flag_value_t is_set;
> > -
> > -               capabilities = cap_get_proc();
> > -               if (!capabilities)
> > -                       exit(EXIT_FAILURE);
> > -               if (cap_get_flag(capabilities, net_admin, inheritable,
> > -                   &is_set) != 0)
> > +               /* prevent any ambient capabilities from being inheritable */
> > +               if (cap_reset_ambient() != 0) {
> >                         exit(EXIT_FAILURE);
> > -               /* apps with ambient caps can fork and call ip */
> > -               if (is_set == CAP_CLEAR) {
> > -                       if (cap_clear(capabilities) != 0)
> > -                               exit(EXIT_FAILURE);
> > -                       if (cap_set_proc(capabilities) != 0)
> > -                               exit(EXIT_FAILURE);
> >                 }
> > -               cap_free(capabilities);
> >         }
> >  #endif
> >  }
>
> The current setup is necessary, as the commit message says:
>
> "Users have reported a regression due to ip now dropping capabilities
> unconditionally.
> zerotier-one VPN and VirtualBox use ambient capabilities in their
> binary and then fork out to ip to set routes and links, and this
> does not work anymore.
>
> As a workaround, do not drop caps if CAP_NET_ADMIN (the most common
> capability used by ip) is set with the INHERITABLE flag.
> Users that want ip vrf exec to work do not need to set INHERITABLE,
> which will then only set when the calling program had privileges to
> give itself the ambient capability."

That doesn't explain why my simplification isn't an improvement.

As I see it, there are 4 different ways 'ip' can get invoked that
could potentially relate to capabilities:

- the uid != 0 && euid !=0 test means the code is perfectly happy to
run via sudo or if the 'ip' program is setuid-root. The setuid-root
way is the workaround used in the stackoverflow post I referenced. In
this case, if you try it with sudo, or via setuid-root you'll find the
'ip' program runs without any Inheritable process capabilities at all.
I'm guessing this is why the code needs that if () { .. } protection
to not "harmstring root/sudo".

- the drop_cap() function isn't even called in the case of 'ip vrf
exec' so in that one case ambient capabilities (or any other form of
capability) are not dropped and ambient capabilities can be passed on
to any invoked child.

- should drop_cap() be called for a non-root user it can inherit
capabilities in one of two ways: via the ambient setup referred to in
that commit message (manually achievable by:

    $ sudo capsh --user=`whoami` --inh=cap_net_admin --addamb=cap_net_admin --
    $ ./ip ...
    $ exit       # needed to escape capsh's ambient setup after the test

), or if the 'ip' program is given a file-inheritable capabilities and
the invoker of 'ip' has the corresponding process-inheritable
capabilities (like this:

    $ sudo setcap cap_net_admin=ie ./ip
    $ sudo capsh --inh=cap_net_admin --user=`whoami` --
    $ ./ip ...
    $ exit      # needed to escape capsh's inheritable setup after the test

).

All three of the above are preserved by my simplification because
cap_reset_ambient() doesn't drop permitted or effective capabilities
from the running program, ip.

The fourth case, the one the upstream code doesn't support (which was
the case the stackoverflow poster cares about) is if the admin sets
permitted+effective file capabilities on their copy of 'ip' (inside
their docker container, or outside such a container for that matter).
In this case, the 'ip' program when run doesn't have any inheritable
capabilities, so in spite of the fact the 'ip' program starts running
with permitted and effective process capabilities (directly obtained
from those file capabilities) of CAP_NET_ADMIN, this particular code
inside drop_cap() causes the program to drop all capabilities and not
work.

What the code I am attempting to simplify does is permits ip to be set
via this fourth method with:

    $ sudo setcap cap_net_admin=ep ip

and it will then, just like the setuid-root case, run with permitted
and effective capabilities - which are the "real capabilities" after
all.

Dropping ambient capabilities with the cap_reset_ambient() call,
preserves the permitted and effective capabilities, but prevents any
process exec'd by 'ip' itself from passing any non-root real
capabilities on to such a child. I was thinking this was why that
capability dropping code was there at all. However, using this
simplification, "sudo setcap cap_net_admin=ep ip" can be used to make
ip work.

Is it really the intention that this fourth capability setup is not be
supported?

Thanks for helping me understand.

>
> Besides, giving setuid to ip itself would be very dangerous, and should
> definitely not be supported. I am not aware of any distribution that
> does it. If there is any, it should be removed. Even for the vrf exec
> case, on Debian/Ubuntu I've set it up so that the caps are not
> configured by default, but require admin action at install time to
> enable, with a clear warning about the possible risks and tradeoffs.
>
> --
> Kind regards,
> Luca Boccassi
Tinkerer One May 14, 2022, 5:10 a.m. UTC | #3
Any further thoughts?

Thanks

On Fri, Apr 29, 2022 at 9:26 PM Tinkerer One <tinkerer@zappem.net> wrote:
>
> On Fri, Apr 29, 2022 at 2:56 AM Luca Boccassi <bluca@debian.org> wrote:
> >
> > On Thu, 2022-04-28 at 20:17 -0700, Tinkerer One wrote:
> > > Hi,
> > >
> > > This is expanded from https://github.com/shemminger/iproute2/issues/62
> > > which I'm told is not the way to report issues and offer fixes to
> > > iproute2 etc.
> > >
> > > [I'm not subscribed to the netdev list, so please cc: me if you need more info.]
> > >
> > > The original change that added the drop_cap() code was:
> > >
> > > https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/?id=ba2fc55b99f8363c80ce36681bc1ec97690b66f5
> > >
> > > In an attempt to address some user feedback, the code was further
> > > complicated by:
> > >
> > > https://git.kernel.org/pub/scm/network/iproute2/iproute2.git/commit/?id=9b13cc98f5952f62b825461727c8170d37a4037d
> > >
> > > Another user issue was asked about here (a couple days ago):
> > >
> > > https://stackoverflow.com/questions/72015197/allow-non-root-user-of-container-to-execute-binaries-that-need-capabilities
> > >
> > > I looked into what was going on and found that lib/utils.c contains
> > > some complicated code that seems to be trying to prevent Ambient
> > > capabilities from being inherited except in specific cases
> > > (ip/ip.c:main() calls drop_cap() except in the ip vrf exec case.). The
> > > code clears all capabilities in order to prevent Ambient capabilities
> > > from being available. The following change achieves suppression of
> > > Ambient capabilities much more precisely. It also permits ip to not
> > > need to be setuid-root or executed under sudo since it can now be
> > > optionally empowered by file capabilities:
> > >
> > > diff --git a/lib/utils.c b/lib/utils.c
> > > index 53d31006..681e4aee 100644
> > > --- a/lib/utils.c
> > > +++ b/lib/utils.c
> > > @@ -1555,25 +1555,10 @@ void drop_cap(void)
> > >  #ifdef HAVE_LIBCAP
> > >         /* don't harmstring root/sudo */
> > >         if (getuid() != 0 && geteuid() != 0) {
> > > -               cap_t capabilities;
> > > -               cap_value_t net_admin = CAP_NET_ADMIN;
> > > -               cap_flag_t inheritable = CAP_INHERITABLE;
> > > -               cap_flag_value_t is_set;
> > > -
> > > -               capabilities = cap_get_proc();
> > > -               if (!capabilities)
> > > -                       exit(EXIT_FAILURE);
> > > -               if (cap_get_flag(capabilities, net_admin, inheritable,
> > > -                   &is_set) != 0)
> > > +               /* prevent any ambient capabilities from being inheritable */
> > > +               if (cap_reset_ambient() != 0) {
> > >                         exit(EXIT_FAILURE);
> > > -               /* apps with ambient caps can fork and call ip */
> > > -               if (is_set == CAP_CLEAR) {
> > > -                       if (cap_clear(capabilities) != 0)
> > > -                               exit(EXIT_FAILURE);
> > > -                       if (cap_set_proc(capabilities) != 0)
> > > -                               exit(EXIT_FAILURE);
> > >                 }
> > > -               cap_free(capabilities);
> > >         }
> > >  #endif
> > >  }
> >
> > The current setup is necessary, as the commit message says:
> >
> > "Users have reported a regression due to ip now dropping capabilities
> > unconditionally.
> > zerotier-one VPN and VirtualBox use ambient capabilities in their
> > binary and then fork out to ip to set routes and links, and this
> > does not work anymore.
> >
> > As a workaround, do not drop caps if CAP_NET_ADMIN (the most common
> > capability used by ip) is set with the INHERITABLE flag.
> > Users that want ip vrf exec to work do not need to set INHERITABLE,
> > which will then only set when the calling program had privileges to
> > give itself the ambient capability."
>
> That doesn't explain why my simplification isn't an improvement.
>
> As I see it, there are 4 different ways 'ip' can get invoked that
> could potentially relate to capabilities:
>
> - the uid != 0 && euid !=0 test means the code is perfectly happy to
> run via sudo or if the 'ip' program is setuid-root. The setuid-root
> way is the workaround used in the stackoverflow post I referenced. In
> this case, if you try it with sudo, or via setuid-root you'll find the
> 'ip' program runs without any Inheritable process capabilities at all.
> I'm guessing this is why the code needs that if () { .. } protection
> to not "harmstring root/sudo".
>
> - the drop_cap() function isn't even called in the case of 'ip vrf
> exec' so in that one case ambient capabilities (or any other form of
> capability) are not dropped and ambient capabilities can be passed on
> to any invoked child.
>
> - should drop_cap() be called for a non-root user it can inherit
> capabilities in one of two ways: via the ambient setup referred to in
> that commit message (manually achievable by:
>
>     $ sudo capsh --user=`whoami` --inh=cap_net_admin --addamb=cap_net_admin --
>     $ ./ip ...
>     $ exit       # needed to escape capsh's ambient setup after the test
>
> ), or if the 'ip' program is given a file-inheritable capabilities and
> the invoker of 'ip' has the corresponding process-inheritable
> capabilities (like this:
>
>     $ sudo setcap cap_net_admin=ie ./ip
>     $ sudo capsh --inh=cap_net_admin --user=`whoami` --
>     $ ./ip ...
>     $ exit      # needed to escape capsh's inheritable setup after the test
>
> ).
>
> All three of the above are preserved by my simplification because
> cap_reset_ambient() doesn't drop permitted or effective capabilities
> from the running program, ip.
>
> The fourth case, the one the upstream code doesn't support (which was
> the case the stackoverflow poster cares about) is if the admin sets
> permitted+effective file capabilities on their copy of 'ip' (inside
> their docker container, or outside such a container for that matter).
> In this case, the 'ip' program when run doesn't have any inheritable
> capabilities, so in spite of the fact the 'ip' program starts running
> with permitted and effective process capabilities (directly obtained
> from those file capabilities) of CAP_NET_ADMIN, this particular code
> inside drop_cap() causes the program to drop all capabilities and not
> work.
>
> What the code I am attempting to simplify does is permits ip to be set
> via this fourth method with:
>
>     $ sudo setcap cap_net_admin=ep ip
>
> and it will then, just like the setuid-root case, run with permitted
> and effective capabilities - which are the "real capabilities" after
> all.
>
> Dropping ambient capabilities with the cap_reset_ambient() call,
> preserves the permitted and effective capabilities, but prevents any
> process exec'd by 'ip' itself from passing any non-root real
> capabilities on to such a child. I was thinking this was why that
> capability dropping code was there at all. However, using this
> simplification, "sudo setcap cap_net_admin=ep ip" can be used to make
> ip work.
>
> Is it really the intention that this fourth capability setup is not be
> supported?
>
> Thanks for helping me understand.
>
> >
> > Besides, giving setuid to ip itself would be very dangerous, and should
> > definitely not be supported. I am not aware of any distribution that
> > does it. If there is any, it should be removed. Even for the vrf exec
> > case, on Debian/Ubuntu I've set it up so that the caps are not
> > configured by default, but require admin action at install time to
> > enable, with a clear warning about the possible risks and tradeoffs.
> >
> > --
> > Kind regards,
> > Luca Boccassi
diff mbox series

Patch

diff --git a/lib/utils.c b/lib/utils.c
index 53d31006..681e4aee 100644
--- a/lib/utils.c
+++ b/lib/utils.c
@@ -1555,25 +1555,10 @@  void drop_cap(void)
 #ifdef HAVE_LIBCAP
        /* don't harmstring root/sudo */
        if (getuid() != 0 && geteuid() != 0) {
-               cap_t capabilities;
-               cap_value_t net_admin = CAP_NET_ADMIN;
-               cap_flag_t inheritable = CAP_INHERITABLE;
-               cap_flag_value_t is_set;
-
-               capabilities = cap_get_proc();
-               if (!capabilities)
-                       exit(EXIT_FAILURE);
-               if (cap_get_flag(capabilities, net_admin, inheritable,
-                   &is_set) != 0)
+               /* prevent any ambient capabilities from being inheritable */
+               if (cap_reset_ambient() != 0) {
                        exit(EXIT_FAILURE);
-               /* apps with ambient caps can fork and call ip */
-               if (is_set == CAP_CLEAR) {
-                       if (cap_clear(capabilities) != 0)
-                               exit(EXIT_FAILURE);
-                       if (cap_set_proc(capabilities) != 0)
-                               exit(EXIT_FAILURE);
                }
-               cap_free(capabilities);
        }
 #endif
 }