Message ID | 20220207121800.5079-2-mkoutny@suse.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | RLIMIT_NPROC in ucounts fixups | expand |
Hi Michal, On Mon, Feb 07, 2022 at 01:17:55PM +0100, Michal Koutný wrote: > The check is currently against the current->cred but since those are > going to change and we want to check RLIMIT_NPROC condition after the > switch, supply the capability check with the new cred. > But since we're checking new_user being INIT_USER any new cred's > capability-based allowance may be redundant when the check fails and the > alternative solution would be revert of the commit 2863643fb8b9 > ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds") > > Fixes: 2863643fb8b9 ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds") > > Cc: Solar Designer <solar@openwall.com> > Cc: Christian Brauner <christian.brauner@ubuntu.com> > Signed-off-by: Michal Koutný <mkoutny@suse.com> > --- > kernel/sys.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/kernel/sys.c b/kernel/sys.c > index 8ea20912103a..48c90dcceff3 100644 > --- a/kernel/sys.c > +++ b/kernel/sys.c > @@ -481,7 +481,8 @@ static int set_user(struct cred *new) > */ > if (ucounts_limit_cmp(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) >= 0 && > new_user != INIT_USER && > - !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) > + !security_capable(new, &init_user_ns, CAP_SYS_RESOURCE, CAP_OPT_NONE) && > + !security_capable(new, &init_user_ns, CAP_SYS_ADMIN, CAP_OPT_NONE)) > current->flags |= PF_NPROC_EXCEEDED; > else > current->flags &= ~PF_NPROC_EXCEEDED; Thank you for working on this and CC'ing me on it. This is related to the discussion Christian and I had in September: https://lore.kernel.org/all/20210913100140.bxqlg47pushoqa3r@wittgenstein/ Christian was going to revert 2863643fb8b9, but apparently that never happened. Back then, I also suggested: "Alternatively, we could postpone the set_user() calls until we're running with the new user's capabilities, but that's an invasive change that's likely to create its own issues." The change you propose above is similar to that, but is more limited and non-invasive. That looks good to me. However, I think you need to drop the negations of the return value from security_capable(). security_capable() returns 0 or -EPERM, while capable() returns a bool, in kernel/capability.c: ns_capable_common(): capable = security_capable(current_cred(), ns, cap, opts); if (capable == 0) { current->flags |= PF_SUPERPRIV; return true; } return false; Also, your change would result in this no longer setting PF_SUPERPRIV. This may be fine, but you could want to document it. On a related note, this comment in security/commoncap.c needs an update: * NOTE WELL: cap_has_capability() cannot be used like the kernel's capable() * and has_capability() functions. That is, it has the reverse semantics: * cap_has_capability() returns 0 when a task has a capability, but the * kernel's capable() and has_capability() returns 1 for this case. cap_has_capability() doesn't actually exist, and perhaps the comment should refer to cap_capable(). Alexander
Solar Designer <solar@openwall.com> writes: > Hi Michal, > > On Mon, Feb 07, 2022 at 01:17:55PM +0100, Michal Koutný wrote: >> The check is currently against the current->cred but since those are >> going to change and we want to check RLIMIT_NPROC condition after the >> switch, supply the capability check with the new cred. >> But since we're checking new_user being INIT_USER any new cred's >> capability-based allowance may be redundant when the check fails and the >> alternative solution would be revert of the commit 2863643fb8b9 >> ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds") >> >> Fixes: 2863643fb8b9 ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds") >> >> Cc: Solar Designer <solar@openwall.com> >> Cc: Christian Brauner <christian.brauner@ubuntu.com> >> Signed-off-by: Michal Koutný <mkoutny@suse.com> >> --- >> kernel/sys.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/kernel/sys.c b/kernel/sys.c >> index 8ea20912103a..48c90dcceff3 100644 >> --- a/kernel/sys.c >> +++ b/kernel/sys.c >> @@ -481,7 +481,8 @@ static int set_user(struct cred *new) >> */ >> if (ucounts_limit_cmp(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) >= 0 && >> new_user != INIT_USER && >> - !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) >> + !security_capable(new, &init_user_ns, CAP_SYS_RESOURCE, CAP_OPT_NONE) && >> + !security_capable(new, &init_user_ns, CAP_SYS_ADMIN, CAP_OPT_NONE)) >> current->flags |= PF_NPROC_EXCEEDED; >> else >> current->flags &= ~PF_NPROC_EXCEEDED; > > Thank you for working on this and CC'ing me on it. This is related to > the discussion Christian and I had in September: > > https://lore.kernel.org/all/20210913100140.bxqlg47pushoqa3r@wittgenstein/ > > Christian was going to revert 2863643fb8b9, but apparently that never > happened. Back then, I also suggested: > > "Alternatively, we could postpone the set_user() calls until we're > running with the new user's capabilities, but that's an invasive change > that's likely to create its own issues." I really think we need to do something like that. Probably just set a flag in commit_creds and test later. I was working on fixes that looked cleaner and I just recently realized that the test in fork is almost as bad. The function has_capability can be used but the same kind of problems exist. I thought I was very quickly going to have patches to post but I need to redo everything now that I have noticed the issue in fork, so it will be a day or so. Eric > The change you propose above is similar to that, but is more limited and > non-invasive. That looks good to me. > > However, I think you need to drop the negations of the return value from > security_capable(). security_capable() returns 0 or -EPERM, while > capable() returns a bool, in kernel/capability.c: ns_capable_common(): > > capable = security_capable(current_cred(), ns, cap, opts); > if (capable == 0) { > current->flags |= PF_SUPERPRIV; > return true; > } > return false; > > Also, your change would result in this no longer setting PF_SUPERPRIV. > This may be fine, but you could want to document it. > > On a related note, this comment in security/commoncap.c needs an update: > > * NOTE WELL: cap_has_capability() cannot be used like the kernel's capable() > * and has_capability() functions. That is, it has the reverse semantics: > * cap_has_capability() returns 0 when a task has a capability, but the > * kernel's capable() and has_capability() returns 1 for this case. > > cap_has_capability() doesn't actually exist, and perhaps the comment > should refer to cap_capable(). > > Alexander
Solar Designer <solar@openwall.com> writes: > Hi Michal, > > On Mon, Feb 07, 2022 at 01:17:55PM +0100, Michal Koutný wrote: >> The check is currently against the current->cred but since those are >> going to change and we want to check RLIMIT_NPROC condition after the >> switch, supply the capability check with the new cred. >> But since we're checking new_user being INIT_USER any new cred's >> capability-based allowance may be redundant when the check fails and the >> alternative solution would be revert of the commit 2863643fb8b9 >> ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds") >> >> Fixes: 2863643fb8b9 ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds") >> >> Cc: Solar Designer <solar@openwall.com> >> Cc: Christian Brauner <christian.brauner@ubuntu.com> >> Signed-off-by: Michal Koutný <mkoutny@suse.com> >> --- >> kernel/sys.c | 3 ++- >> 1 file changed, 2 insertions(+), 1 deletion(-) >> >> diff --git a/kernel/sys.c b/kernel/sys.c >> index 8ea20912103a..48c90dcceff3 100644 >> --- a/kernel/sys.c >> +++ b/kernel/sys.c >> @@ -481,7 +481,8 @@ static int set_user(struct cred *new) >> */ >> if (ucounts_limit_cmp(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) >= 0 && >> new_user != INIT_USER && >> - !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) >> + !security_capable(new, &init_user_ns, CAP_SYS_RESOURCE, CAP_OPT_NONE) && >> + !security_capable(new, &init_user_ns, CAP_SYS_ADMIN, CAP_OPT_NONE)) >> current->flags |= PF_NPROC_EXCEEDED; >> else >> current->flags &= ~PF_NPROC_EXCEEDED; > > Thank you for working on this and CC'ing me on it. This is related to > the discussion Christian and I had in September: > > https://lore.kernel.org/all/20210913100140.bxqlg47pushoqa3r@wittgenstein/ > > Christian was going to revert 2863643fb8b9, but apparently that never > happened. Back then, I also suggested: > > "Alternatively, we could postpone the set_user() calls until we're > running with the new user's capabilities, but that's an invasive change > that's likely to create its own issues." Back then you mentioned that apache suexec was broken. Do you have any more details? I would like to make certain the apache suexec issue is fixed but without a few details I can't do that. I tried looking but I can't find an public report about apache suexec being broken. My goal is to come up with a very careful and conservative set of patches that fix all of the known issues with RLIMIT_NPROC. Eric
On Fri, Feb 11, 2022 at 02:32:47PM -0600, Eric W. Biederman wrote: > Solar Designer <solar@openwall.com> writes: > > https://lore.kernel.org/all/20210913100140.bxqlg47pushoqa3r@wittgenstein/ > > > > Christian was going to revert 2863643fb8b9, but apparently that never > > happened. Back then, I also suggested: > > > > "Alternatively, we could postpone the set_user() calls until we're > > running with the new user's capabilities, but that's an invasive change > > that's likely to create its own issues." > > Back then you mentioned that apache suexec was broken. Do you have > any more details? > > I would like to make certain the apache suexec issue is fixed but > without a few details I can't do that. I tried looking but I can't > find an public report about apache suexec being broken. I'm not aware of anyone actually running into this issue and reporting it. The systems that I personally know use suexec along with rlimits still run older/distro kernels, so would not yet be affected. So my mention was based on my understanding of how suexec works, and code review. Specifically, Apache httpd has the setting RLimitNPROC, which makes it set RLIMIT_NPROC: https://httpd.apache.org/docs/2.4/mod/core.html#rlimitnproc The above documentation for it includes: "This applies to processes forked from Apache httpd children servicing requests, not the Apache httpd children themselves. This includes CGI scripts and SSI exec commands, but not any processes forked from the Apache httpd parent, such as piped logs." In code, there are: ./modules/generators/mod_cgid.c: ( (cgid_req.limits.limit_nproc_set) && ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC, ./modules/generators/mod_cgi.c: ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC, ./modules/filters/mod_ext_filter.c: rv = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC, conf->limit_nproc); For example, in mod_cgi.c this is in run_cgi_child(). I think this means an httpd child sets RLIMIT_NPROC shortly before it execs suexec, which is a SUID root program. suexec then switches to the target user and execs the CGI script. Before 2863643fb8b9, the setuid() in suexec would set the flag, and the target user's process count would be checked against RLIMIT_NPROC on execve(). After 2863643fb8b9, the setuid() in suexec wouldn't set the flag because setuid() is (naturally) called when the process is still running as root (thus, has those limits bypass capabilities), and accordingly execve() would not check the target user's process count against RLIMIT_NPROC. > My goal is to come up with a very careful and conservative set of > patches that fix all of the known issues with RLIMIT_NPROC. The most conservative fix for this one would be to revert 2863643fb8b9 (preserving other changes that were made on top of it). I think this commit did not fix a real issue - it attempted to fix what someone thought was a discrepancy, but actually made it worse. However, your recent patch trying to fix that commit looks like it'd also repair the behavior for suexec. Thanks, Alexander
On Thu, Feb 10, 2022 at 02:14:05AM +0100, Solar Designer <solar@openwall.com> wrote: > However, I think you need to drop the negations of the return value from > security_capable(). > security_capable() returns 0 or -EPERM, while capable() returns a > bool, in kernel/capability.c: ns_capable_common(): Oops. Yeah, I only blindly applied replacement with a predicate for (new) cred and overlooked this inverse semantics. Thanks for pointing that out to me! Nevertheless, this will likely be incorporated via Eric's series anyway. Michal
diff --git a/kernel/sys.c b/kernel/sys.c index 8ea20912103a..48c90dcceff3 100644 --- a/kernel/sys.c +++ b/kernel/sys.c @@ -481,7 +481,8 @@ static int set_user(struct cred *new) */ if (ucounts_limit_cmp(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) >= 0 && new_user != INIT_USER && - !capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN)) + !security_capable(new, &init_user_ns, CAP_SYS_RESOURCE, CAP_OPT_NONE) && + !security_capable(new, &init_user_ns, CAP_SYS_ADMIN, CAP_OPT_NONE)) current->flags |= PF_NPROC_EXCEEDED; else current->flags &= ~PF_NPROC_EXCEEDED;
The check is currently against the current->cred but since those are going to change and we want to check RLIMIT_NPROC condition after the switch, supply the capability check with the new cred. But since we're checking new_user being INIT_USER any new cred's capability-based allowance may be redundant when the check fails and the alternative solution would be revert of the commit 2863643fb8b9 ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds") Fixes: 2863643fb8b9 ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds") Cc: Solar Designer <solar@openwall.com> Cc: Christian Brauner <christian.brauner@ubuntu.com> Signed-off-by: Michal Koutný <mkoutny@suse.com> --- kernel/sys.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)