diff mbox series

[RFC,1/6] set_user: Perform RLIMIT_NPROC capability check against new user credentials

Message ID 20220207121800.5079-2-mkoutny@suse.com (mailing list archive)
State New
Headers show
Series RLIMIT_NPROC in ucounts fixups | expand

Commit Message

Michal Koutný Feb. 7, 2022, 12:17 p.m. UTC
The check is currently against the current->cred but since those are
going to change and we want to check RLIMIT_NPROC condition after the
switch, supply the capability check with the new cred.
But since we're checking new_user being INIT_USER any new cred's
capability-based allowance may be redundant when the check fails and the
alternative solution would be revert of the commit 2863643fb8b9
("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds")

Fixes: 2863643fb8b9 ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds")

Cc: Solar Designer <solar@openwall.com>
Cc: Christian Brauner <christian.brauner@ubuntu.com>
Signed-off-by: Michal Koutný <mkoutny@suse.com>
---
 kernel/sys.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Solar Designer Feb. 10, 2022, 1:14 a.m. UTC | #1
Hi Michal,

On Mon, Feb 07, 2022 at 01:17:55PM +0100, Michal Koutný wrote:
> The check is currently against the current->cred but since those are
> going to change and we want to check RLIMIT_NPROC condition after the
> switch, supply the capability check with the new cred.
> But since we're checking new_user being INIT_USER any new cred's
> capability-based allowance may be redundant when the check fails and the
> alternative solution would be revert of the commit 2863643fb8b9
> ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds")
> 
> Fixes: 2863643fb8b9 ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds")
> 
> Cc: Solar Designer <solar@openwall.com>
> Cc: Christian Brauner <christian.brauner@ubuntu.com>
> Signed-off-by: Michal Koutný <mkoutny@suse.com>
> ---
>  kernel/sys.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/kernel/sys.c b/kernel/sys.c
> index 8ea20912103a..48c90dcceff3 100644
> --- a/kernel/sys.c
> +++ b/kernel/sys.c
> @@ -481,7 +481,8 @@ static int set_user(struct cred *new)
>  	 */
>  	if (ucounts_limit_cmp(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) >= 0 &&
>  			new_user != INIT_USER &&
> -			!capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN))
> +			!security_capable(new, &init_user_ns, CAP_SYS_RESOURCE, CAP_OPT_NONE) &&
> +			!security_capable(new, &init_user_ns, CAP_SYS_ADMIN, CAP_OPT_NONE))
>  		current->flags |= PF_NPROC_EXCEEDED;
>  	else
>  		current->flags &= ~PF_NPROC_EXCEEDED;

Thank you for working on this and CC'ing me on it.  This is related to
the discussion Christian and I had in September:

https://lore.kernel.org/all/20210913100140.bxqlg47pushoqa3r@wittgenstein/

Christian was going to revert 2863643fb8b9, but apparently that never
happened.  Back then, I also suggested:

"Alternatively, we could postpone the set_user() calls until we're
running with the new user's capabilities, but that's an invasive change
that's likely to create its own issues."

The change you propose above is similar to that, but is more limited and
non-invasive.  That looks good to me.

However, I think you need to drop the negations of the return value from
security_capable().  security_capable() returns 0 or -EPERM, while
capable() returns a bool, in kernel/capability.c: ns_capable_common():

	capable = security_capable(current_cred(), ns, cap, opts);
	if (capable == 0) {
		current->flags |= PF_SUPERPRIV;
		return true;
	}
	return false;

Also, your change would result in this no longer setting PF_SUPERPRIV.
This may be fine, but you could want to document it.

On a related note, this comment in security/commoncap.c needs an update:

 * NOTE WELL: cap_has_capability() cannot be used like the kernel's capable()
 * and has_capability() functions.  That is, it has the reverse semantics:
 * cap_has_capability() returns 0 when a task has a capability, but the
 * kernel's capable() and has_capability() returns 1 for this case.

cap_has_capability() doesn't actually exist, and perhaps the comment
should refer to cap_capable().

Alexander
Eric W. Biederman Feb. 10, 2022, 1:57 a.m. UTC | #2
Solar Designer <solar@openwall.com> writes:

> Hi Michal,
>
> On Mon, Feb 07, 2022 at 01:17:55PM +0100, Michal Koutný wrote:
>> The check is currently against the current->cred but since those are
>> going to change and we want to check RLIMIT_NPROC condition after the
>> switch, supply the capability check with the new cred.
>> But since we're checking new_user being INIT_USER any new cred's
>> capability-based allowance may be redundant when the check fails and the
>> alternative solution would be revert of the commit 2863643fb8b9
>> ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds")
>> 
>> Fixes: 2863643fb8b9 ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds")
>> 
>> Cc: Solar Designer <solar@openwall.com>
>> Cc: Christian Brauner <christian.brauner@ubuntu.com>
>> Signed-off-by: Michal Koutný <mkoutny@suse.com>
>> ---
>>  kernel/sys.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>> 
>> diff --git a/kernel/sys.c b/kernel/sys.c
>> index 8ea20912103a..48c90dcceff3 100644
>> --- a/kernel/sys.c
>> +++ b/kernel/sys.c
>> @@ -481,7 +481,8 @@ static int set_user(struct cred *new)
>>  	 */
>>  	if (ucounts_limit_cmp(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) >= 0 &&
>>  			new_user != INIT_USER &&
>> -			!capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN))
>> +			!security_capable(new, &init_user_ns, CAP_SYS_RESOURCE, CAP_OPT_NONE) &&
>> +			!security_capable(new, &init_user_ns, CAP_SYS_ADMIN, CAP_OPT_NONE))
>>  		current->flags |= PF_NPROC_EXCEEDED;
>>  	else
>>  		current->flags &= ~PF_NPROC_EXCEEDED;
>
> Thank you for working on this and CC'ing me on it.  This is related to
> the discussion Christian and I had in September:
>
> https://lore.kernel.org/all/20210913100140.bxqlg47pushoqa3r@wittgenstein/
>
> Christian was going to revert 2863643fb8b9, but apparently that never
> happened.  Back then, I also suggested:
>
> "Alternatively, we could postpone the set_user() calls until we're
> running with the new user's capabilities, but that's an invasive change
> that's likely to create its own issues."

I really think we need to do something like that.  Probably just set a
flag in commit_creds and test later.

I was working on fixes that looked cleaner and I just recently realized
that the test in fork is almost as bad.  The function has_capability can
be used but the same kind of problems exist.

I thought I was very quickly going to have patches to post but I need
to redo everything now that I have noticed the issue in fork, so it will
be a day or so.

Eric


> The change you propose above is similar to that, but is more limited and
> non-invasive.  That looks good to me.
>
> However, I think you need to drop the negations of the return value from
> security_capable().  security_capable() returns 0 or -EPERM, while
> capable() returns a bool, in kernel/capability.c: ns_capable_common():
>
> 	capable = security_capable(current_cred(), ns, cap, opts);
> 	if (capable == 0) {
> 		current->flags |= PF_SUPERPRIV;
> 		return true;
> 	}
> 	return false;
>
> Also, your change would result in this no longer setting PF_SUPERPRIV.
> This may be fine, but you could want to document it.
>
> On a related note, this comment in security/commoncap.c needs an update:
>
>  * NOTE WELL: cap_has_capability() cannot be used like the kernel's capable()
>  * and has_capability() functions.  That is, it has the reverse semantics:
>  * cap_has_capability() returns 0 when a task has a capability, but the
>  * kernel's capable() and has_capability() returns 1 for this case.
>
> cap_has_capability() doesn't actually exist, and perhaps the comment
> should refer to cap_capable().
>
> Alexander
Eric W. Biederman Feb. 11, 2022, 8:32 p.m. UTC | #3
Solar Designer <solar@openwall.com> writes:

> Hi Michal,
>
> On Mon, Feb 07, 2022 at 01:17:55PM +0100, Michal Koutný wrote:
>> The check is currently against the current->cred but since those are
>> going to change and we want to check RLIMIT_NPROC condition after the
>> switch, supply the capability check with the new cred.
>> But since we're checking new_user being INIT_USER any new cred's
>> capability-based allowance may be redundant when the check fails and the
>> alternative solution would be revert of the commit 2863643fb8b9
>> ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds")
>> 
>> Fixes: 2863643fb8b9 ("set_user: add capability check when rlimit(RLIMIT_NPROC) exceeds")
>> 
>> Cc: Solar Designer <solar@openwall.com>
>> Cc: Christian Brauner <christian.brauner@ubuntu.com>
>> Signed-off-by: Michal Koutný <mkoutny@suse.com>
>> ---
>>  kernel/sys.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>> 
>> diff --git a/kernel/sys.c b/kernel/sys.c
>> index 8ea20912103a..48c90dcceff3 100644
>> --- a/kernel/sys.c
>> +++ b/kernel/sys.c
>> @@ -481,7 +481,8 @@ static int set_user(struct cred *new)
>>  	 */
>>  	if (ucounts_limit_cmp(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) >= 0 &&
>>  			new_user != INIT_USER &&
>> -			!capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN))
>> +			!security_capable(new, &init_user_ns, CAP_SYS_RESOURCE, CAP_OPT_NONE) &&
>> +			!security_capable(new, &init_user_ns, CAP_SYS_ADMIN, CAP_OPT_NONE))
>>  		current->flags |= PF_NPROC_EXCEEDED;
>>  	else
>>  		current->flags &= ~PF_NPROC_EXCEEDED;
>
> Thank you for working on this and CC'ing me on it.  This is related to
> the discussion Christian and I had in September:
>
> https://lore.kernel.org/all/20210913100140.bxqlg47pushoqa3r@wittgenstein/
>
> Christian was going to revert 2863643fb8b9, but apparently that never
> happened.  Back then, I also suggested:
>
> "Alternatively, we could postpone the set_user() calls until we're
> running with the new user's capabilities, but that's an invasive change
> that's likely to create its own issues."

Back then you mentioned that apache suexec was broken.  Do you have
any more details?

I would like to make certain the apache suexec issue is fixed but
without a few details I can't do that.  I tried looking but I can't
find an public report about apache suexec being broken.

My goal is to come up with a very careful and conservative set of
patches that fix all of the known issues with RLIMIT_NPROC.

Eric
Solar Designer Feb. 12, 2022, 10:14 p.m. UTC | #4
On Fri, Feb 11, 2022 at 02:32:47PM -0600, Eric W. Biederman wrote:
> Solar Designer <solar@openwall.com> writes:
> > https://lore.kernel.org/all/20210913100140.bxqlg47pushoqa3r@wittgenstein/
> >
> > Christian was going to revert 2863643fb8b9, but apparently that never
> > happened.  Back then, I also suggested:
> >
> > "Alternatively, we could postpone the set_user() calls until we're
> > running with the new user's capabilities, but that's an invasive change
> > that's likely to create its own issues."
> 
> Back then you mentioned that apache suexec was broken.  Do you have
> any more details?
> 
> I would like to make certain the apache suexec issue is fixed but
> without a few details I can't do that.  I tried looking but I can't
> find an public report about apache suexec being broken.

I'm not aware of anyone actually running into this issue and reporting
it.  The systems that I personally know use suexec along with rlimits
still run older/distro kernels, so would not yet be affected.

So my mention was based on my understanding of how suexec works, and
code review.  Specifically, Apache httpd has the setting RLimitNPROC,
which makes it set RLIMIT_NPROC:

https://httpd.apache.org/docs/2.4/mod/core.html#rlimitnproc

The above documentation for it includes:

"This applies to processes forked from Apache httpd children servicing
requests, not the Apache httpd children themselves. This includes CGI
scripts and SSI exec commands, but not any processes forked from the
Apache httpd parent, such as piped logs."

In code, there are:

./modules/generators/mod_cgid.c:        ( (cgid_req.limits.limit_nproc_set) && ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC,
./modules/generators/mod_cgi.c:        ((rc = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC,
./modules/filters/mod_ext_filter.c:    rv = apr_procattr_limit_set(procattr, APR_LIMIT_NPROC, conf->limit_nproc);

For example, in mod_cgi.c this is in run_cgi_child().

I think this means an httpd child sets RLIMIT_NPROC shortly before it
execs suexec, which is a SUID root program.  suexec then switches to the
target user and execs the CGI script.

Before 2863643fb8b9, the setuid() in suexec would set the flag, and the
target user's process count would be checked against RLIMIT_NPROC on
execve().  After 2863643fb8b9, the setuid() in suexec wouldn't set the
flag because setuid() is (naturally) called when the process is still
running as root (thus, has those limits bypass capabilities), and
accordingly execve() would not check the target user's process count
against RLIMIT_NPROC.

> My goal is to come up with a very careful and conservative set of
> patches that fix all of the known issues with RLIMIT_NPROC.

The most conservative fix for this one would be to revert 2863643fb8b9
(preserving other changes that were made on top of it).  I think this
commit did not fix a real issue - it attempted to fix what someone
thought was a discrepancy, but actually made it worse.

However, your recent patch trying to fix that commit looks like it'd
also repair the behavior for suexec.

Thanks,

Alexander
Michal Koutný Feb. 15, 2022, 11:55 a.m. UTC | #5
On Thu, Feb 10, 2022 at 02:14:05AM +0100, Solar Designer <solar@openwall.com> wrote:
> However, I think you need to drop the negations of the return value from
> security_capable().
> security_capable() returns 0 or -EPERM, while capable() returns a
> bool, in kernel/capability.c: ns_capable_common():

Oops. Yeah, I only blindly applied replacement with a predicate for
(new) cred and overlooked this inverse semantics. Thanks for pointing
that out to me!

Nevertheless, this will likely be incorporated via Eric's series
anyway.


Michal
diff mbox series

Patch

diff --git a/kernel/sys.c b/kernel/sys.c
index 8ea20912103a..48c90dcceff3 100644
--- a/kernel/sys.c
+++ b/kernel/sys.c
@@ -481,7 +481,8 @@  static int set_user(struct cred *new)
 	 */
 	if (ucounts_limit_cmp(new->ucounts, UCOUNT_RLIMIT_NPROC, rlimit(RLIMIT_NPROC)) >= 0 &&
 			new_user != INIT_USER &&
-			!capable(CAP_SYS_RESOURCE) && !capable(CAP_SYS_ADMIN))
+			!security_capable(new, &init_user_ns, CAP_SYS_RESOURCE, CAP_OPT_NONE) &&
+			!security_capable(new, &init_user_ns, CAP_SYS_ADMIN, CAP_OPT_NONE))
 		current->flags |= PF_NPROC_EXCEEDED;
 	else
 		current->flags &= ~PF_NPROC_EXCEEDED;