Message ID | 20211227205500.214777-1-flaniel@linux.microsoft.com (mailing list archive) |
---|---|
Headers | show |
Series | Add capabilities file to sysfs | expand |
On 12/27/2021 12:54 PM, Francis Laniel wrote: > Hi. > > > First, I hope you are fine and the same for your relatives. > > Capabilities are used to check if a thread has the right to perform a given > action [1]. > For example, a thread with CAP_BPF set can use the bpf() syscall. > > Capabilities are used in the container world. > In terms of code, several projects related to container maintain code where the > capabilities are written alike include/uapi/linux/capability.h [2][3][4][5]. > For these projects, their codebase should be updated when a new capability is > added to the kernel. > Some other projects rely on <sys/capability.h> [6]. > In this case, this header file should reflect the capabilities offered by the > kernel. > > So, in this series, I added a new file to sysfs: /sys/kernel/capabilities. This should be /sys/kernel/security/capabilities. > The goal of this file is to be used by "container world" software to know kernel > capabilities at run time instead of compile time. > > The underlying kernel attribute is read-only and its content is the capability > number associated with the capability name: > root@vm-amd64:~# cat /sys/kernel/capabilities > 0 CAP_CHOWN > 1 CAP_DAC_OVERRIDE > ... > 39 CAP_BPF > > The kernel already exposes the last capability number under: > /proc/sys/kernel/cap_last_cap > So, I think there should not be any issue exposing all the capabilities it > offers. > If there is any, please share it as I do not want to introduce issue with this > series. > > Also, if you see any way to improve this series please share it as it would > increase this contribution quality. > > Francis Laniel (2): > capability: Add cap_strings. > kernel/ksysfs.c: Add capabilities attribute. > > include/uapi/linux/capability.h | 1 + > kernel/capability.c | 45 +++++++++++++++++++++++++++++++++ > kernel/ksysfs.c | 18 +++++++++++++ > 3 files changed, 64 insertions(+) > > > Best regards and thank you in advance for your reviews. > --- > [1] man capabilities > [2] https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a5664fab21d6f7d1e/pkg/cap/cap_linux.go#L135 > [3] https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902aabc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c1 > moby relies on containerd code. > [4] https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0b7f94e3a0ffb/capability/enum.go#L47 > [5] https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880ba0e6380519a/libcontainer/capabilities/capabilities.go#L12 > runc relies on syndtr package. > [6] https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b880c089f1/src/libcrun/linux.c#L35 >
Hi. Le lundi 27 décembre 2021, 23:22:41 CET Casey Schaufler a écrit : > On 12/27/2021 12:54 PM, Francis Laniel wrote: > > Hi. > > > > > > First, I hope you are fine and the same for your relatives. > > > > Capabilities are used to check if a thread has the right to perform a > > given > > action [1]. > > For example, a thread with CAP_BPF set can use the bpf() syscall. > > > > Capabilities are used in the container world. > > In terms of code, several projects related to container maintain code > > where the capabilities are written alike include/uapi/linux/capability.h > > [2][3][4][5]. For these projects, their codebase should be updated when a > > new capability is added to the kernel. > > Some other projects rely on <sys/capability.h> [6]. > > In this case, this header file should reflect the capabilities offered by > > the kernel. > > > > So, in this series, I added a new file to sysfs: /sys/kernel/capabilities. > > This should be /sys/kernel/security/capabilities. I began to write code to move this under /sys/kernel/security/capabilities but I realized this directory is linked to CONFIG_SECURITYFS. This option is not required to be able to run container [1]. Also, kernel/capability.c is always compiled, so I think it is better if this file (i.e. the one which prints capabilities to user) does not depend on any CONFIG_. What do you think of it? Does this sound acceptable for you? > > The goal of this file is to be used by "container world" software to know > > kernel capabilities at run time instead of compile time. > > > > The underlying kernel attribute is read-only and its content is the > > capability number associated with the capability name: > > root@vm-amd64:~# cat /sys/kernel/capabilities > > 0 CAP_CHOWN > > 1 CAP_DAC_OVERRIDE > > ... > > 39 CAP_BPF > > > > The kernel already exposes the last capability number under: > > /proc/sys/kernel/cap_last_cap > > So, I think there should not be any issue exposing all the capabilities it > > offers. > > If there is any, please share it as I do not want to introduce issue with > > this series. > > > > Also, if you see any way to improve this series please share it as it > > would > > increase this contribution quality. > > > > Francis Laniel (2): > > capability: Add cap_strings. > > kernel/ksysfs.c: Add capabilities attribute. > > > > include/uapi/linux/capability.h | 1 + > > kernel/capability.c | 45 +++++++++++++++++++++++++++++++++ > > kernel/ksysfs.c | 18 +++++++++++++ > > 3 files changed, 64 insertions(+) > > > > Best regards and thank you in advance for your reviews. > > --- > > [1] man capabilities > > [2] > > https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a566 > > 4fab21d6f7d1e/pkg/cap/cap_linux.go#L135 [3] > > https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902a > > abc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c1 > > moby relies on containerd code. > > [4] > > https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0b > > 7f94e3a0ffb/capability/enum.go#L47 [5] > > https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880b > > a0e6380519a/libcontainer/capabilities/capabilities.go#L12 runc relies on > > syndtr package. > > [6] > > https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b88 > > 0c089f1/src/libcrun/linux.c#L35 Best regards. --- [1] https://github.com/moby/moby/blob/ 10aecb0e652d346130a37e5b4383eca28f594c21/contrib/check-config.sh
On 12/28/2021 5:34 AM, Francis Laniel wrote: > Hi. > > Le lundi 27 décembre 2021, 23:22:41 CET Casey Schaufler a écrit : >> On 12/27/2021 12:54 PM, Francis Laniel wrote: >>> Hi. >>> >>> >>> First, I hope you are fine and the same for your relatives. >>> >>> Capabilities are used to check if a thread has the right to perform a >>> given >>> action [1]. >>> For example, a thread with CAP_BPF set can use the bpf() syscall. >>> >>> Capabilities are used in the container world. >>> In terms of code, several projects related to container maintain code >>> where the capabilities are written alike include/uapi/linux/capability.h >>> [2][3][4][5]. For these projects, their codebase should be updated when a >>> new capability is added to the kernel. >>> Some other projects rely on <sys/capability.h> [6]. >>> In this case, this header file should reflect the capabilities offered by >>> the kernel. >>> >>> So, in this series, I added a new file to sysfs: /sys/kernel/capabilities. >> This should be /sys/kernel/security/capabilities. > I began to write code to move this under /sys/kernel/security/capabilities but > I realized this directory is linked to CONFIG_SECURITYFS. > This option is not required to be able to run container [1]. You're going to need to handle the case where the file is missing regardless. It is hard to design a kernel feature based on what a container expects when there are so many definitions of a container. > Also, kernel/capability.c is always compiled, so I think it is better if this > file (i.e. the one which prints capabilities to user) does not depend on any > CONFIG_. CONFIG_MULTUSER is going to be an issue if you really care. > What do you think of it? Does this sound acceptable for you? Meh. I'm not going to get worked up over it, but your rationale is a little weak. > >>> The goal of this file is to be used by "container world" software to know >>> kernel capabilities at run time instead of compile time. >>> >>> The underlying kernel attribute is read-only and its content is the >>> capability number associated with the capability name: >>> root@vm-amd64:~# cat /sys/kernel/capabilities >>> 0 CAP_CHOWN >>> 1 CAP_DAC_OVERRIDE >>> ... >>> 39 CAP_BPF >>> >>> The kernel already exposes the last capability number under: >>> /proc/sys/kernel/cap_last_cap >>> So, I think there should not be any issue exposing all the capabilities it >>> offers. >>> If there is any, please share it as I do not want to introduce issue with >>> this series. >>> >>> Also, if you see any way to improve this series please share it as it >>> would >>> increase this contribution quality. >>> >>> Francis Laniel (2): >>> capability: Add cap_strings. >>> kernel/ksysfs.c: Add capabilities attribute. >>> >>> include/uapi/linux/capability.h | 1 + >>> kernel/capability.c | 45 +++++++++++++++++++++++++++++++++ >>> kernel/ksysfs.c | 18 +++++++++++++ >>> 3 files changed, 64 insertions(+) >>> >>> Best regards and thank you in advance for your reviews. >>> --- >>> [1] man capabilities >>> [2] >>> https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a566 >>> 4fab21d6f7d1e/pkg/cap/cap_linux.go#L135 [3] >>> https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902a >>> abc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c1 >>> moby relies on containerd code. >>> [4] >>> https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0b >>> 7f94e3a0ffb/capability/enum.go#L47 [5] >>> https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880b >>> a0e6380519a/libcontainer/capabilities/capabilities.go#L12 runc relies on >>> syndtr package. >>> [6] >>> https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b88 >>> 0c089f1/src/libcrun/linux.c#L35 > > Best regards. > --- > [1] https://github.com/moby/moby/blob/ > 10aecb0e652d346130a37e5b4383eca28f594c21/contrib/check-config.sh > >
Le mercredi 29 décembre 2021, 02:26:20 CET Casey Schaufler a écrit : > On 12/28/2021 5:34 AM, Francis Laniel wrote: > > Hi. > > > > Le lundi 27 décembre 2021, 23:22:41 CET Casey Schaufler a écrit : > >> On 12/27/2021 12:54 PM, Francis Laniel wrote: > >>> Hi. > >>> > >>> > >>> First, I hope you are fine and the same for your relatives. > >>> > >>> Capabilities are used to check if a thread has the right to perform a > >>> given > >>> action [1]. > >>> For example, a thread with CAP_BPF set can use the bpf() syscall. > >>> > >>> Capabilities are used in the container world. > >>> In terms of code, several projects related to container maintain code > >>> where the capabilities are written alike include/uapi/linux/capability.h > >>> [2][3][4][5]. For these projects, their codebase should be updated when > >>> a > >>> new capability is added to the kernel. > >>> Some other projects rely on <sys/capability.h> [6]. > >>> In this case, this header file should reflect the capabilities offered > >>> by > >>> the kernel. > >>> > >>> So, in this series, I added a new file to sysfs: > >>> /sys/kernel/capabilities. > >> > >> This should be /sys/kernel/security/capabilities. > > > > I began to write code to move this under /sys/kernel/security/capabilities > > but I realized this directory is linked to CONFIG_SECURITYFS. > > This option is not required to be able to run container [1]. > > You're going to need to handle the case where the file is missing > regardless. It is hard to design a kernel feature based on what a > container expects when there are so many definitions of a container. The goal would be to always have this file, as if I need to handle the case where it is missing, it surely means having hardcoded values for capabilities like today situation. I think it should be always here if its definition lies in a source file marked as 'obj-y', right? Nonetheless, I understand your point of having this "capabilities printing" file under /sys/kernel/security. But, this would lead to add CONFIG_SECURITYFS as a needed CONFIG_ to "run container". And, if "container stack" runs on a kernel which does not provide this option, then "container software" would need to rely on hardcoded capabilities (like today situation). > > Also, kernel/capability.c is always compiled, so I think it is better if > > this file (i.e. the one which prints capabilities to user) does not > > depend on any CONFIG_. > > CONFIG_MULTUSER is going to be an issue if you really care. I did not know about CONFIG_MULTIUSER and thinking a bit about at it, it should be a needed CONFIG_ to run container. Nonetheless, when CONFIG_MULTIUSER=n and if my understanding of 2813893f8b197 is correct, calling capset() leads to sys_ni_syscall() which returns -ENOSYS. Thus, an user trying to "run container" with specific capabilities on a kernel compiled with CONFIG_MULTIUSER=n will have trouble with all capabilities (and will then have to first fix its CONFIG_ issue instead of knowing which capabilities he/she can use). > > What do you think of it? Does this sound acceptable for you? > > Meh. I'm not going to get worked up over it, but your rationale > is a little weak. I agree this contribution will not revolutionize how user interact with the kernel. But I see two advantages: 1. By removing hardcoded capabilities values from "container software", it will ease these software maintainability. Indeed, someone will not need to add new values to their code base when the kernel get a new capability. 2. On the user side, without hardcoded capabilities values, you can use any version of "container software" on a recent kernel to be able to use all the capabilities the kernel offers. For the anecdote, it was my case some time ago and I had to compile newer version of "container software" to use underlying kernel capabilities [1]. This was not a big deal, but if this "container software" were capabilities agnostic, it would have gain me a bit of time. From this experience, I had the idea of this contribution. > >>> The goal of this file is to be used by "container world" software to > >>> know > >>> kernel capabilities at run time instead of compile time. > >>> > >>> The underlying kernel attribute is read-only and its content is the > >>> capability number associated with the capability name: > >>> root@vm-amd64:~# cat /sys/kernel/capabilities > >>> 0 CAP_CHOWN > >>> 1 CAP_DAC_OVERRIDE > >>> ... > >>> 39 CAP_BPF > >>> > >>> The kernel already exposes the last capability number under: > >>> /proc/sys/kernel/cap_last_cap > >>> So, I think there should not be any issue exposing all the capabilities > >>> it > >>> offers. > >>> If there is any, please share it as I do not want to introduce issue > >>> with > >>> this series. > >>> > >>> Also, if you see any way to improve this series please share it as it > >>> would > >>> increase this contribution quality. > >>> > >>> Francis Laniel (2): > >>> capability: Add cap_strings. > >>> kernel/ksysfs.c: Add capabilities attribute. > >>> > >>> include/uapi/linux/capability.h | 1 + > >>> kernel/capability.c | 45 > >>> +++++++++++++++++++++++++++++++++ > >>> kernel/ksysfs.c | 18 +++++++++++++ > >>> 3 files changed, 64 insertions(+) > >>> > >>> Best regards and thank you in advance for your reviews. > >>> --- > >>> [1] man capabilities > >>> [2] > >>> https://github.com/containerd/containerd/blob/1a078e6893d07fec10a4940a56 > >>> 6 > >>> 4fab21d6f7d1e/pkg/cap/cap_linux.go#L135 [3] > >>> https://github.com/moby/moby/commit/485cf38d48e7111b3d1f584d5e9eab46a902 > >>> a > >>> abc#diff-2e04625b209932e74c617de96682ed72fbd1bb0d0cb9fb7c709cf47a86b6f9c > >>> 1 > >>> moby relies on containerd code. > >>> [4] > >>> https://github.com/syndtr/gocapability/blob/42c35b4376354fd554efc7ad35e0 > >>> b > >>> 7f94e3a0ffb/capability/enum.go#L47 [5] > >>> https://github.com/opencontainers/runc/blob/00f56786bb220b55b41748231880 > >>> b > >>> a0e6380519a/libcontainer/capabilities/capabilities.go#L12 runc relies on > >>> syndtr package. > >>> [6] > >>> https://github.com/containers/crun/blob/fafb556f09e6ffd4690c452ff51856b8 > >>> 8 > >>> 0c089f1/src/libcrun/linux.c#L35 > > > > Best regards. > > --- > > [1] https://github.com/moby/moby/blob/ > > 10aecb0e652d346130a37e5b4383eca28f594c21/contrib/check-config.sh --- [1] https://github.com/kinvolk/minikube/commit/ 51bf81c816c004ca0d0f3e3e368bc59f8a208387