Message ID | 20210730112443.23245-8-will@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | Add support for 32-bit tasks on asymmetric AArch32 systems |
On Fri, Jul 30, 2021 at 12:24:34PM +0100, Will Deacon wrote:
> In preparation for replaying user affinity requests using a saved mask,
> split sched_setaffinity() up so that the initial task lookup and
> security checks are only performed when the request is coming directly
> from userspace.
>
> Reviewed-by: Valentin Schneider <Valentin.Schneider@arm.com>
> Signed-off-by: Will Deacon <will@kernel.org>

Should not sched_setaffinity() update user_cpus_ptr when it isn't NULL,
such that the upcoming relax_compatible_cpus_allowed_ptr() preserve the
full user mask?

On Tue, Aug 17, 2021 at 05:40:24PM +0200, Peter Zijlstra wrote:
> On Fri, Jul 30, 2021 at 12:24:34PM +0100, Will Deacon wrote:
> > In preparation for replaying user affinity requests using a saved mask,
> > split sched_setaffinity() up so that the initial task lookup and
> > security checks are only performed when the request is coming directly
> > from userspace.
> >
> > Reviewed-by: Valentin Schneider <Valentin.Schneider@arm.com>
> > Signed-off-by: Will Deacon <will@kernel.org>
>
> Should not sched_setaffinity() update user_cpus_ptr when it isn't NULL,
> such that the upcoming relax_compatible_cpus_allowed_ptr() preserve the
> full user mask?

The idea is that force_compatible_cpus_allowed_ptr() and
relax_compatible_cpus_allowed_ptr() are used as a pair, with the former
setting ->user_cpus_ptr and the latter restoring it. An intervening call
to sched_setaffinity() must _clear_ the saved mask, as we discussed
before at:

https://lore.kernel.org/r/YK53kDtczHIYumDC@hirez.programming.kicks-ass.net

Will

On Wed, Aug 18, 2021 at 11:50:30AM +0100, Will Deacon wrote:
> On Tue, Aug 17, 2021 at 05:40:24PM +0200, Peter Zijlstra wrote:
> > On Fri, Jul 30, 2021 at 12:24:34PM +0100, Will Deacon wrote:
> > > In preparation for replaying user affinity requests using a saved mask,
> > > split sched_setaffinity() up so that the initial task lookup and
> > > security checks are only performed when the request is coming directly
> > > from userspace.
> > >
> > > Reviewed-by: Valentin Schneider <Valentin.Schneider@arm.com>
> > > Signed-off-by: Will Deacon <will@kernel.org>
> >
> > Should not sched_setaffinity() update user_cpus_ptr when it isn't NULL,
> > such that the upcoming relax_compatible_cpus_allowed_ptr() preserve the
> > full user mask?
>
> The idea is that force_compatible_cpus_allowed_ptr() and
> relax_compatible_cpus_allowed_ptr() are used as a pair, with the former
> setting ->user_cpus_ptr and the latter restoring it. An intervening call
> to sched_setaffinity() must _clear_ the saved mask, as we discussed
> before at:
>
> https://lore.kernel.org/r/YK53kDtczHIYumDC@hirez.programming.kicks-ass.net

Clearly that deserves a comment somewhere, because I keep trying to make
it more consistent than it can be :/ I'll see if I can find a spot.

On Wed, Aug 18, 2021 at 12:56:24PM +0200, Peter Zijlstra wrote:
> On Wed, Aug 18, 2021 at 11:50:30AM +0100, Will Deacon wrote:
> > On Tue, Aug 17, 2021 at 05:40:24PM +0200, Peter Zijlstra wrote:
> > > On Fri, Jul 30, 2021 at 12:24:34PM +0100, Will Deacon wrote:
> > > > In preparation for replaying user affinity requests using a saved mask,
> > > > split sched_setaffinity() up so that the initial task lookup and
> > > > security checks are only performed when the request is coming directly
> > > > from userspace.
> > > >
> > > > Reviewed-by: Valentin Schneider <Valentin.Schneider@arm.com>
> > > > Signed-off-by: Will Deacon <will@kernel.org>
> > >
> > > Should not sched_setaffinity() update user_cpus_ptr when it isn't NULL,
> > > such that the upcoming relax_compatible_cpus_allowed_ptr() preserve the
> > > full user mask?
> >
> > The idea is that force_compatible_cpus_allowed_ptr() and
> > relax_compatible_cpus_allowed_ptr() are used as a pair, with the former
> > setting ->user_cpus_ptr and the latter restoring it. An intervening call
> > to sched_setaffinity() must _clear_ the saved mask, as we discussed
> > before at:
> >
> > https://lore.kernel.org/r/YK53kDtczHIYumDC@hirez.programming.kicks-ass.net
>
> Clearly that deserves a comment somewhere, because I keep trying to make
> it more consistent than it can be :/ I'll see if I can find a spot.

Agreed. The relax/force functions are already commented, so maybe
alongside SCA_USER?

Will

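The pairing discussed in this thread can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not the kernel code: `struct toy_task`, the `toy_*` functions, and the use of a 64-bit word as a CPU mask are all hypothetical stand-ins, and the fallback to the compatible mask on an empty intersection is a simplification. It shows the three behaviours named above: force saves the user mask, relax restores it, and an intervening explicit affinity request clears the saved mask so that a later relax has nothing to restore.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a task: each CPU is one bit in a 64-bit word. */
struct toy_task {
	uint64_t cpus_mask;     /* affinity currently in effect */
	uint64_t user_cpus;     /* saved user-requested mask */
	bool     has_user_cpus; /* models "user_cpus_ptr != NULL" */
};

/* Restrict the task to `compat`, saving the previous mask only once. */
static void toy_force_compatible(struct toy_task *t, uint64_t compat)
{
	uint64_t narrowed = t->cpus_mask & compat;

	if (!t->has_user_cpus) {
		t->user_cpus = t->cpus_mask;
		t->has_user_cpus = true;
	}
	/* Assumption: fall back to the whole compatible set if nothing overlaps. */
	t->cpus_mask = narrowed ? narrowed : compat;
}

/* Undo an earlier force, but only if a saved mask is still present. */
static void toy_relax_compatible(struct toy_task *t)
{
	if (!t->has_user_cpus)
		return;
	t->cpus_mask = t->user_cpus;
	t->has_user_cpus = false;
}

/* An explicit user request takes effect and discards the saved mask. */
static void toy_setaffinity(struct toy_task *t, uint64_t mask)
{
	t->cpus_mask = mask;
	t->has_user_cpus = false;
}
```

In this model, calling `toy_force_compatible()` and then `toy_relax_compatible()` returns the task to its original mask, whereas inserting `toy_setaffinity()` between the two leaves the explicitly requested mask in place after the relax.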
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index a139ed8be7e3..d4219d366103 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7578,53 +7578,22 @@ SYSCALL_DEFINE4(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
 	return retval;
 }
 
-long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
+static int
+__sched_setaffinity(struct task_struct *p, const struct cpumask *mask)
 {
-	cpumask_var_t cpus_allowed, new_mask;
-	struct task_struct *p;
 	int retval;
+	cpumask_var_t cpus_allowed, new_mask;
 
-	rcu_read_lock();
-
-	p = find_process_by_pid(pid);
-	if (!p) {
-		rcu_read_unlock();
-		return -ESRCH;
-	}
-
-	/* Prevent p going away */
-	get_task_struct(p);
-	rcu_read_unlock();
+	if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL))
+		return -ENOMEM;
 
-	if (p->flags & PF_NO_SETAFFINITY) {
-		retval = -EINVAL;
-		goto out_put_task;
-	}
-	if (!alloc_cpumask_var(&cpus_allowed, GFP_KERNEL)) {
-		retval = -ENOMEM;
-		goto out_put_task;
-	}
 	if (!alloc_cpumask_var(&new_mask, GFP_KERNEL)) {
 		retval = -ENOMEM;
 		goto out_free_cpus_allowed;
 	}
-	retval = -EPERM;
-	if (!check_same_owner(p)) {
-		rcu_read_lock();
-		if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
-			rcu_read_unlock();
-			goto out_free_new_mask;
-		}
-		rcu_read_unlock();
-	}
-
-	retval = security_task_setscheduler(p);
-	if (retval)
-		goto out_free_new_mask;
-
 
 	cpuset_cpus_allowed(p, cpus_allowed);
-	cpumask_and(new_mask, in_mask, cpus_allowed);
+	cpumask_and(new_mask, mask, cpus_allowed);
 
 	/*
 	 * Since bandwidth control happens on root_domain basis,
@@ -7645,23 +7614,63 @@ long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
 #endif
 again:
 	retval = __set_cpus_allowed_ptr(p, new_mask, SCA_CHECK);
+	if (retval)
+		goto out_free_new_mask;
 
-	if (!retval) {
-		cpuset_cpus_allowed(p, cpus_allowed);
-		if (!cpumask_subset(new_mask, cpus_allowed)) {
-			/*
-			 * We must have raced with a concurrent cpuset
-			 * update. Just reset the cpus_allowed to the
-			 * cpuset's cpus_allowed
-			 */
-			cpumask_copy(new_mask, cpus_allowed);
-			goto again;
-		}
+	cpuset_cpus_allowed(p, cpus_allowed);
+	if (!cpumask_subset(new_mask, cpus_allowed)) {
+		/*
+		 * We must have raced with a concurrent cpuset update.
+		 * Just reset the cpumask to the cpuset's cpus_allowed.
+		 */
+		cpumask_copy(new_mask, cpus_allowed);
+		goto again;
 	}
+
 out_free_new_mask:
 	free_cpumask_var(new_mask);
 out_free_cpus_allowed:
 	free_cpumask_var(cpus_allowed);
+	return retval;
+}
+
+long sched_setaffinity(pid_t pid, const struct cpumask *in_mask)
+{
+	struct task_struct *p;
+	int retval;
+
+	rcu_read_lock();
+
+	p = find_process_by_pid(pid);
+	if (!p) {
+		rcu_read_unlock();
+		return -ESRCH;
+	}
+
+	/* Prevent p going away */
+	get_task_struct(p);
+	rcu_read_unlock();
+
+	if (p->flags & PF_NO_SETAFFINITY) {
+		retval = -EINVAL;
+		goto out_put_task;
+	}
+
+	if (!check_same_owner(p)) {
+		rcu_read_lock();
+		if (!ns_capable(__task_cred(p)->user_ns, CAP_SYS_NICE)) {
+			rcu_read_unlock();
+			retval = -EPERM;
+			goto out_put_task;
+		}
+		rcu_read_unlock();
+	}
+
+	retval = security_task_setscheduler(p);
+	if (retval)
+		goto out_put_task;
+
+	retval = __sched_setaffinity(p, in_mask);
 out_put_task:
 	put_task_struct(p);
 	return retval;
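
The shape of the split in the diff above, with an outer entry point that does the lookup and permission checks and an inner helper that only applies the mask, can be sketched in user space as follows. This is an illustration under stated assumptions rather than the kernel implementation: `struct demo_task`, the `demo_*` functions, the fixed task table, and treating uid 0 as the privileged caller are all hypothetical, and locking, reference counting, and the cpuset race retry are left out.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical task record; none of these fields are kernel names. */
struct demo_task {
	int      pid;
	int      owner_uid;
	int      no_setaffinity; /* models PF_NO_SETAFFINITY */
	uint64_t allowed;        /* models the cpuset's allowed CPUs */
	uint64_t cpus_mask;      /* affinity currently in effect */
};

static struct demo_task demo_tasks[] = {
	{ .pid = 1, .owner_uid = 1000, .no_setaffinity = 0, .allowed = 0xF, .cpus_mask = 0xF },
	{ .pid = 2, .owner_uid = 1000, .no_setaffinity = 1, .allowed = 0xF, .cpus_mask = 0xF },
};

static struct demo_task *demo_find_task(int pid)
{
	for (size_t i = 0; i < sizeof(demo_tasks) / sizeof(demo_tasks[0]); i++)
		if (demo_tasks[i].pid == pid)
			return &demo_tasks[i];
	return NULL;
}

/* Inner helper: no lookup and no permission checks, so it can be reused
 * by an internal caller that already holds a valid task. */
static int demo_apply_affinity(struct demo_task *t, uint64_t mask)
{
	uint64_t new_mask = mask & t->allowed;

	if (!new_mask)
		return -EINVAL; /* nothing left after clamping to the allowed set */
	t->cpus_mask = new_mask;
	return 0;
}

/* Outer entry point: the only path that validates the caller. */
static int demo_setaffinity(int caller_uid, int pid, uint64_t mask)
{
	struct demo_task *t = demo_find_task(pid);

	if (!t)
		return -ESRCH;
	if (t->no_setaffinity)
		return -EINVAL;
	/* Assumption: uid 0 stands in for a caller holding CAP_SYS_NICE. */
	if (caller_uid != t->owner_uid && caller_uid != 0)
		return -EPERM;
	return demo_apply_affinity(t, mask);
}
```

Keeping the checks in the outer function means a replay of a saved mask can call the helper directly without repeating checks that only make sense for a request arriving from userspace.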