Message ID | 20230131154803.192530-1-longman@redhat.com (mailing list archive) |
---|---|
State | Accepted |
Commit | e5ae8803847b80fe9d744a3174abe2b7bfed222a |
Headers | show |
Series | cgroup/cpuset: Fix wrong check in update_parent_subparts_cpumask() | expand |
On Tue, Jan 31, 2023 at 10:48:03AM -0500, Waiman Long wrote: > It was found that the check to see if a partition could use up all > the cpus from the parent cpuset in update_parent_subparts_cpumask() > was incorrect. As a result, it is possible to leave parent with no > effective cpu left even if there are tasks in the parent cpuset. This > can lead to system panic as reported in [1]. > > Fix this probem by updating the check to fail the enabling the partition > if parent's effective_cpus is a subset of the child's cpus_allowed. > > Also record the error code when an error happens in update_prstate() > and add a test case where parent partition and child have the same cpu > list and parent has task. Enabling partition in the child will fail in > this case. > > [1] https://www.spinics.net/lists/cgroups/msg36254.html > > Fixes: f0af1bfc27b5 ("cgroup/cpuset: Relax constraints to partition & cpus changes") > Reported-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com> > Signed-off-by: Waiman Long <longman@redhat.com> Applied to cgroup/for-6.2-fixes w/ stable cc added. Thanks.
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index a29c0b13706b..205dc9edcaa9 100644 --- a/kernel/cgroup/cpuset.c +++ b/kernel/cgroup/cpuset.c @@ -1346,7 +1346,7 @@ static int update_parent_subparts_cpumask(struct cpuset *cs, int cmd, * A parent can be left with no CPU as long as there is no * task directly associated with the parent partition. */ - if (!cpumask_intersects(cs->cpus_allowed, parent->effective_cpus) && + if (cpumask_subset(parent->effective_cpus, cs->cpus_allowed) && partition_is_populated(parent, cs)) return PERR_NOCPUS; @@ -2324,6 +2324,7 @@ static int update_prstate(struct cpuset *cs, int new_prs) new_prs = -new_prs; spin_lock_irq(&callback_lock); cs->partition_root_state = new_prs; + WRITE_ONCE(cs->prs_err, err); spin_unlock_irq(&callback_lock); /* * Update child cpusets, if present. diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/selftests/cgroup/test_cpuset_prs.sh index 186e1c26867e..75c100de90ff 100755 --- a/tools/testing/selftests/cgroup/test_cpuset_prs.sh +++ b/tools/testing/selftests/cgroup/test_cpuset_prs.sh @@ -268,6 +268,7 @@ TEST_MATRIX=( # Taking away all CPUs from parent or itself if there are tasks # will make the partition invalid. " S+ C2-3:P1:S+ C3:P1 . . T C2-3 . . 0 A1:2-3,A2:2-3 A1:P1,A2:P-1" + " S+ C3:P1:S+ C3 . . T P1 . . 0 A1:3,A2:3 A1:P1,A2:P-1" " S+ $SETUP_A123_PARTITIONS . T:C2-3 . . . 0 A1:2-3,A2:2-3,A3:3 A1:P1,A2:P-1,A3:P-1" " S+ $SETUP_A123_PARTITIONS . T:C2-3:C1-3 . . . 0 A1:1,A2:2,A3:3 A1:P1,A2:P1,A3:P1"
It was found that the check to see if a partition could use up all the cpus from the parent cpuset in update_parent_subparts_cpumask() was incorrect. As a result, it is possible to leave parent with no effective cpu left even if there are tasks in the parent cpuset. This can lead to system panic as reported in [1]. Fix this probem by updating the check to fail the enabling the partition if parent's effective_cpus is a subset of the child's cpus_allowed. Also record the error code when an error happens in update_prstate() and add a test case where parent partition and child have the same cpu list and parent has task. Enabling partition in the child will fail in this case. [1] https://www.spinics.net/lists/cgroups/msg36254.html Fixes: f0af1bfc27b5 ("cgroup/cpuset: Relax constraints to partition & cpus changes") Reported-by: Srinivas Pandruvada <srinivas.pandruvada@intel.com> Signed-off-by: Waiman Long <longman@redhat.com> --- kernel/cgroup/cpuset.c | 3 ++- tools/testing/selftests/cgroup/test_cpuset_prs.sh | 1 + 2 files changed, 3 insertions(+), 1 deletion(-)