mbox series

[RFC,v1,0/2] Add capabilities file to sysfs

Message ID 20211227205500.214777-1-flaniel@linux.microsoft.com (mailing list archive)
Headers show
Series Add capabilities file to sysfs | expand

Message

Francis Laniel Dec. 27, 2021, 8:54 p.m. UTC
Hi.


First, I hope you are fine and the same for your relatives.

Capabilities are used to check if a thread has the right to perform a given
action [1].
For example, a thread with CAP_BPF set can use the bpf() syscall.

Capabilities are used in the container world.
In terms of code, several projects related to container maintain code where the
capabilities are written alike include/uapi/linux/capability.h [2][3][4][5].
For these projects, their codebase should be updated when a new capability is
added to the kernel.
Some other projects rely on <sys/capability.h> [6].
In this case, this header file should reflect the capabilities offered by the
kernel.

So, in this series, I added a new file to sysfs: /sys/kernel/capabilities.
The goal of this file is to be used by "container world" software to know kernel
capabilities at run time instead of compile time.

The underlying kernel attribute is read-only and its content is the capability
number associated with the capability name:
root@vm-amd64:~# cat /sys/kernel/capabilities
0       CAP_CHOWN
1       CAP_DAC_OVERRIDE
...
39      CAP_BPF

The kernel already exposes the last capability number under:
/proc/sys/kernel/cap_last_cap
So, I think there should not be any issue exposing all the capabilities it
offers.
If there is any, please share it as I do not want to introduce issue with this
series.

Also, if you see any way to improve this series please share it as it would
increase this contribution quality.

Francis Laniel (2):
  capability: Add cap_strings.
  kernel/ksysfs.c: Add capabilities attribute.

 include/uapi/linux/capability.h |  1 +
 kernel/capability.c             | 45 +++++++++++++++++++++++++++++++++
 kernel/ksysfs.c                 | 18 +++++++++++++
 3 files changed, 64 insertions(+)


Best regards and thank you in advance for your reviews.
---
[1] man capabilities
[2] https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a5664fab21d6f7d1e/pkg/cap/cap_linux.go#L135
[3] https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902aabc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c1
moby relies on containerd code.
[4] https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0b7f94e3a0ffb/capability/enum.go#L47
[5] https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880ba0e6380519a/libcontainer/capabilities/capabilities.go#L12
runc relies on syndtr package.
[6] https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b880c089f1/src/libcrun/linux.c#L35

Comments

Casey Schaufler Dec. 27, 2021, 10:22 p.m. UTC | #1
On 12/27/2021 12:54 PM, Francis Laniel wrote:
> Hi.
>
>
> First, I hope you are fine and the same for your relatives.
>
> Capabilities are used to check if a thread has the right to perform a given
> action [1].
> For example, a thread with CAP_BPF set can use the bpf() syscall.
>
> Capabilities are used in the container world.
> In terms of code, several projects related to container maintain code where the
> capabilities are written alike include/uapi/linux/capability.h [2][3][4][5].
> For these projects, their codebase should be updated when a new capability is
> added to the kernel.
> Some other projects rely on <sys/capability.h> [6].
> In this case, this header file should reflect the capabilities offered by the
> kernel.
>
> So, in this series, I added a new file to sysfs: /sys/kernel/capabilities.

This should be /sys/kernel/security/capabilities.

> The goal of this file is to be used by "container world" software to know kernel
> capabilities at run time instead of compile time.
>
> The underlying kernel attribute is read-only and its content is the capability
> number associated with the capability name:
> root@vm-amd64:~# cat /sys/kernel/capabilities
> 0       CAP_CHOWN
> 1       CAP_DAC_OVERRIDE
> ...
> 39      CAP_BPF
>
> The kernel already exposes the last capability number under:
> /proc/sys/kernel/cap_last_cap
> So, I think there should not be any issue exposing all the capabilities it
> offers.
> If there is any, please share it as I do not want to introduce issue with this
> series.
>
> Also, if you see any way to improve this series please share it as it would
> increase this contribution quality.
>
> Francis Laniel (2):
>    capability: Add cap_strings.
>    kernel/ksysfs.c: Add capabilities attribute.
>
>   include/uapi/linux/capability.h |  1 +
>   kernel/capability.c             | 45 +++++++++++++++++++++++++++++++++
>   kernel/ksysfs.c                 | 18 +++++++++++++
>   3 files changed, 64 insertions(+)
>
>
> Best regards and thank you in advance for your reviews.
> ---
> [1] man capabilities
> [2] https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a5664fab21d6f7d1e/pkg/cap/cap_linux.go#L135
> [3] https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902aabc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c1
> moby relies on containerd code.
> [4] https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0b7f94e3a0ffb/capability/enum.go#L47
> [5] https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880ba0e6380519a/libcontainer/capabilities/capabilities.go#L12
> runc relies on syndtr package.
> [6] https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b880c089f1/src/libcrun/linux.c#L35
>
Francis Laniel Dec. 28, 2021, 1:34 p.m. UTC | #2
Hi.

Le lundi 27 décembre 2021, 23:22:41 CET Casey Schaufler a écrit :
> On 12/27/2021 12:54 PM, Francis Laniel wrote:
> > Hi.
> > 
> > 
> > First, I hope you are fine and the same for your relatives.
> > 
> > Capabilities are used to check if a thread has the right to perform a
> > given
> > action [1].
> > For example, a thread with CAP_BPF set can use the bpf() syscall.
> > 
> > Capabilities are used in the container world.
> > In terms of code, several projects related to container maintain code
> > where the capabilities are written alike include/uapi/linux/capability.h
> > [2][3][4][5]. For these projects, their codebase should be updated when a
> > new capability is added to the kernel.
> > Some other projects rely on <sys/capability.h> [6].
> > In this case, this header file should reflect the capabilities offered by
> > the kernel.
> > 
> > So, in this series, I added a new file to sysfs: /sys/kernel/capabilities.
> 
> This should be /sys/kernel/security/capabilities.

I began to write code to move this under /sys/kernel/security/capabilities but 
I realized this directory is linked to CONFIG_SECURITYFS.
This option is not required to be able to run container [1].
Also, kernel/capability.c is always compiled, so I think it is better if this 
file (i.e. the one which prints capabilities to user) does not depend on any 
CONFIG_.

What do you think of it? Does this sound acceptable for you?

> > The goal of this file is to be used by "container world" software to know
> > kernel capabilities at run time instead of compile time.
> > 
> > The underlying kernel attribute is read-only and its content is the
> > capability number associated with the capability name:
> > root@vm-amd64:~# cat /sys/kernel/capabilities
> > 0       CAP_CHOWN
> > 1       CAP_DAC_OVERRIDE
> > ...
> > 39      CAP_BPF
> > 
> > The kernel already exposes the last capability number under:
> > /proc/sys/kernel/cap_last_cap
> > So, I think there should not be any issue exposing all the capabilities it
> > offers.
> > If there is any, please share it as I do not want to introduce issue with
> > this series.
> > 
> > Also, if you see any way to improve this series please share it as it
> > would
> > increase this contribution quality.
> > 
> > Francis Laniel (2):
> >    capability: Add cap_strings.
> >    kernel/ksysfs.c: Add capabilities attribute.
> >   
> >   include/uapi/linux/capability.h |  1 +
> >   kernel/capability.c             | 45 +++++++++++++++++++++++++++++++++
> >   kernel/ksysfs.c                 | 18 +++++++++++++
> >   3 files changed, 64 insertions(+)
> > 
> > Best regards and thank you in advance for your reviews.
> > ---
> > [1] man capabilities
> > [2]
> > https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a566
> > 4fab21d6f7d1e/pkg/cap/cap_linux.go#L135 [3]
> > https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902a
> > abc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c1
> > moby relies on containerd code.
> > [4]
> > https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0b
> > 7f94e3a0ffb/capability/enum.go#L47 [5]
> > https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880b
> > a0e6380519a/libcontainer/capabilities/capabilities.go#L12 runc relies on
> > syndtr package.
> > [6]
> > https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b88
> > 0c089f1/src/libcrun/linux.c#L35


Best regards.
---
[1] https://github.com/moby/moby/blob/
10aecb0e652d346130a37e5b4383eca28f594c21/contrib/check-config.sh
Casey Schaufler Dec. 29, 2021, 1:26 a.m. UTC | #3
On 12/28/2021 5:34 AM, Francis Laniel wrote:
> Hi.
>
> Le lundi 27 décembre 2021, 23:22:41 CET Casey Schaufler a écrit :
>> On 12/27/2021 12:54 PM, Francis Laniel wrote:
>>> Hi.
>>>
>>>
>>> First, I hope you are fine and the same for your relatives.
>>>
>>> Capabilities are used to check if a thread has the right to perform a
>>> given
>>> action [1].
>>> For example, a thread with CAP_BPF set can use the bpf() syscall.
>>>
>>> Capabilities are used in the container world.
>>> In terms of code, several projects related to container maintain code
>>> where the capabilities are written alike include/uapi/linux/capability.h
>>> [2][3][4][5]. For these projects, their codebase should be updated when a
>>> new capability is added to the kernel.
>>> Some other projects rely on <sys/capability.h> [6].
>>> In this case, this header file should reflect the capabilities offered by
>>> the kernel.
>>>
>>> So, in this series, I added a new file to sysfs: /sys/kernel/capabilities.
>> This should be /sys/kernel/security/capabilities.
> I began to write code to move this under /sys/kernel/security/capabilities but
> I realized this directory is linked to CONFIG_SECURITYFS.
> This option is not required to be able to run container [1].

You're going to need to handle the case where the file is missing
regardless. It is hard to design a kernel feature based on what a
container expects when there are so many definitions of a container.

> Also, kernel/capability.c is always compiled, so I think it is better if this
> file (i.e. the one which prints capabilities to user) does not depend on any
> CONFIG_.

CONFIG_MULTUSER is going to be an issue if you really care.

> What do you think of it? Does this sound acceptable for you?

Meh. I'm not going to get worked up over it, but your rationale
is a little weak.

>
>>> The goal of this file is to be used by "container world" software to know
>>> kernel capabilities at run time instead of compile time.
>>>
>>> The underlying kernel attribute is read-only and its content is the
>>> capability number associated with the capability name:
>>> root@vm-amd64:~# cat /sys/kernel/capabilities
>>> 0       CAP_CHOWN
>>> 1       CAP_DAC_OVERRIDE
>>> ...
>>> 39      CAP_BPF
>>>
>>> The kernel already exposes the last capability number under:
>>> /proc/sys/kernel/cap_last_cap
>>> So, I think there should not be any issue exposing all the capabilities it
>>> offers.
>>> If there is any, please share it as I do not want to introduce issue with
>>> this series.
>>>
>>> Also, if you see any way to improve this series please share it as it
>>> would
>>> increase this contribution quality.
>>>
>>> Francis Laniel (2):
>>>     capability: Add cap_strings.
>>>     kernel/ksysfs.c: Add capabilities attribute.
>>>    
>>>    include/uapi/linux/capability.h |  1 +
>>>    kernel/capability.c             | 45 +++++++++++++++++++++++++++++++++
>>>    kernel/ksysfs.c                 | 18 +++++++++++++
>>>    3 files changed, 64 insertions(+)
>>>
>>> Best regards and thank you in advance for your reviews.
>>> ---
>>> [1] man capabilities
>>> [2]
>>> https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a566
>>> 4fab21d6f7d1e/pkg/cap/cap_linux.go#L135 [3]
>>> https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902a
>>> abc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c1
>>> moby relies on containerd code.
>>> [4]
>>> https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0b
>>> 7f94e3a0ffb/capability/enum.go#L47 [5]
>>> https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880b
>>> a0e6380519a/libcontainer/capabilities/capabilities.go#L12 runc relies on
>>> syndtr package.
>>> [6]
>>> https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b88
>>> 0c089f1/src/libcrun/linux.c#L35
>
> Best regards.
> ---
> [1] https://github.com/moby/moby/blob/
> 10aecb0e652d346130a37e5b4383eca28f594c21/contrib/check-config.sh
>
>
Francis Laniel Dec. 29, 2021, 8:56 p.m. UTC | #4
Le mercredi 29 décembre 2021, 02:26:20 CET Casey Schaufler a écrit :
> On 12/28/2021 5:34 AM, Francis Laniel wrote:
> > Hi.
> > 
> > Le lundi 27 décembre 2021, 23:22:41 CET Casey Schaufler a écrit :
> >> On 12/27/2021 12:54 PM, Francis Laniel wrote:
> >>> Hi.
> >>> 
> >>> 
> >>> First, I hope you are fine and the same for your relatives.
> >>> 
> >>> Capabilities are used to check if a thread has the right to perform a
> >>> given
> >>> action [1].
> >>> For example, a thread with CAP_BPF set can use the bpf() syscall.
> >>> 
> >>> Capabilities are used in the container world.
> >>> In terms of code, several projects related to container maintain code
> >>> where the capabilities are written alike include/uapi/linux/capability.h
> >>> [2][3][4][5]. For these projects, their codebase should be updated when
> >>> a
> >>> new capability is added to the kernel.
> >>> Some other projects rely on <sys/capability.h> [6].
> >>> In this case, this header file should reflect the capabilities offered
> >>> by
> >>> the kernel.
> >>> 
> >>> So, in this series, I added a new file to sysfs:
> >>> /sys/kernel/capabilities.
> >> 
> >> This should be /sys/kernel/security/capabilities.
> > 
> > I began to write code to move this under /sys/kernel/security/capabilities
> > but I realized this directory is linked to CONFIG_SECURITYFS.
> > This option is not required to be able to run container [1].
> 
> You're going to need to handle the case where the file is missing
> regardless. It is hard to design a kernel feature based on what a
> container expects when there are so many definitions of a container.

The goal would be to always have this file, as if I need to handle the case 
where it is missing, it surely means having hardcoded values for capabilities 
like today situation.
I think it should be always here if its definition lies in a source file marked 
as 'obj-y', right?

Nonetheless, I understand your point of having this "capabilities printing" 
file under /sys/kernel/security.
But, this would lead to add CONFIG_SECURITYFS as a needed CONFIG_ to "run 
container".
And, if "container stack" runs on a kernel which does not provide this option, 
then "container software" would need to rely on hardcoded capabilities (like 
today situation).

> > Also, kernel/capability.c is always compiled, so I think it is better if
> > this file (i.e. the one which prints capabilities to user) does not
> > depend on any CONFIG_.
> 
> CONFIG_MULTUSER is going to be an issue if you really care.

I did not know about CONFIG_MULTIUSER and thinking a bit about at it, it 
should be a needed CONFIG_ to run container.
Nonetheless, when CONFIG_MULTIUSER=n and if my understanding of 2813893f8b197 
is correct, calling capset() leads to sys_ni_syscall() which returns -ENOSYS.
Thus, an user trying to "run container" with specific capabilities on a kernel 
compiled with CONFIG_MULTIUSER=n will have trouble with all capabilities (and 
will then have to first fix its CONFIG_ issue instead of knowing which 
capabilities he/she can use).

> > What do you think of it? Does this sound acceptable for you?
> 
> Meh. I'm not going to get worked up over it, but your rationale
> is a little weak.

I agree this contribution will not revolutionize how user interact with the 
kernel.
But I see two advantages:
1. By removing hardcoded capabilities values from "container software", it 
will ease these software maintainability.
Indeed, someone will not need to add new values to their code base when the 
kernel get a new capability.
2. On the user side, without hardcoded capabilities values, you can use any 
version of "container software" on a recent kernel to be able to use all the 
capabilities the kernel offers.
For the anecdote, it was my case some time ago and I had to compile newer 
version of "container software" to use underlying kernel capabilities [1].
This was not a big deal, but if this "container software" were capabilities 
agnostic, it would have gain me a bit of time.
From this experience, I had the idea of this contribution.

> >>> The goal of this file is to be used by "container world" software to
> >>> know
> >>> kernel capabilities at run time instead of compile time.
> >>> 
> >>> The underlying kernel attribute is read-only and its content is the
> >>> capability number associated with the capability name:
> >>> root@vm-amd64:~# cat /sys/kernel/capabilities
> >>> 0       CAP_CHOWN
> >>> 1       CAP_DAC_OVERRIDE
> >>> ...
> >>> 39      CAP_BPF
> >>> 
> >>> The kernel already exposes the last capability number under:
> >>> /proc/sys/kernel/cap_last_cap
> >>> So, I think there should not be any issue exposing all the capabilities
> >>> it
> >>> offers.
> >>> If there is any, please share it as I do not want to introduce issue
> >>> with
> >>> this series.
> >>> 
> >>> Also, if you see any way to improve this series please share it as it
> >>> would
> >>> increase this contribution quality.
> >>> 
> >>> Francis Laniel (2):
> >>>     capability: Add cap_strings.
> >>>     kernel/ksysfs.c: Add capabilities attribute.
> >>>    
> >>>    include/uapi/linux/capability.h |  1 +
> >>>    kernel/capability.c             | 45
> >>>    +++++++++++++++++++++++++++++++++
> >>>    kernel/ksysfs.c                 | 18 +++++++++++++
> >>>    3 files changed, 64 insertions(+)
> >>> 
> >>> Best regards and thank you in advance for your reviews.
> >>> ---
> >>> [1] man capabilities
> >>> [2]
> >>> https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a56
> >>> 6
> >>> 4fab21d6f7d1e/pkg/cap/cap_linux.go#L135 [3]
> >>> https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902
> >>> a
> >>> abc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c
> >>> 1
> >>> moby relies on containerd code.
> >>> [4]
> >>> https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0
> >>> b
> >>> 7f94e3a0ffb/capability/enum.go#L47 [5]
> >>> https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880
> >>> b
> >>> a0e6380519a/libcontainer/capabilities/capabilities.go#L12 runc relies on
> >>> syndtr package.
> >>> [6]
> >>> https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b8
> >>> 8
> >>> 0c089f1/src/libcrun/linux.c#L35
> > 
> > Best regards.
> > ---
> > [1] https://github.com/moby/moby/blob/
> > 10aecb0e652d346130a37e5b4383eca28f594c21/contrib/check-config.sh
---
[1] https://github.com/kinvolk/minikube/commit/
51bf81c816c004ca0d0f3e3e368bc59f8a208387