[v2,1/2] Tracking cgroup-level niced CPU time

Message ID	20240830141939.723729-2-joshua.hahnjy@gmail.com (mailing list archive)
State	New
Headers
Received: from mail-yw1-f179.google.com (mail-yw1-f179.google.com [209.85.128.179])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 04B7F1A3BCE;
	Fri, 30 Aug 2024 14:19:42 +0000 (UTC)
From: Joshua Hahn <joshua.hahnjy@gmail.com>
To: tj@kernel.org
Cc: cgroups@vger.kernel.org, hannes@cmpxchg.org, linux-kernel@vger.kernel.org,
	linux-kselftest@vger.kernel.org, lizefan.x@bytedance.com, mkoutny@suse.com,
	shuah@kernel.org
Subject: [PATCH v2 1/2] Tracking cgroup-level niced CPU time
Date: Fri, 30 Aug 2024 07:19:38 -0700
Message-ID: <20240830141939.723729-2-joshua.hahnjy@gmail.com>
In-Reply-To: <20240830141939.723729-1-joshua.hahnjy@gmail.com>
References: <20240830141939.723729-1-joshua.hahnjy@gmail.com>
Precedence: bulk
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Series	Exposing nice CPU usage to userspace
	[v2,0/2] Exposing nice CPU usage to userspace
	[v2,1/2] Tracking cgroup-level niced CPU time
	[v2,2/2] Selftests for niced CPU statistics

Commit Message

Joshua Hahn Aug. 30, 2024, 2:19 p.m. UTC

From: Joshua Hahn <joshua.hahn6@gmail.com>

Cgroup-level CPU statistics currently include time spent on
user/system processes, but do not include niced CPU time (despite
already being tracked). This patch exposes niced CPU time to
userspace, giving users a better understanding of their hardware
limits and facilitating more informed workload distribution.

A new field 'ntime' is added to struct cgroup_base_stat as opposed to
struct task_cputime to minimize footprint.
---
 include/linux/cgroup-defs.h |  1 +
 kernel/cgroup/rstat.c       | 16 +++++++++++++---
 2 files changed, 14 insertions(+), 3 deletions(-)
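For readers who want to consume the new field, the following is a minimal userspace sketch, not part of the patch. It assumes the `key value` line format that the `seq_printf()` call in this patch emits for cpu.stat. The helper names `cpu_stat_lookup()` and `cpu_stat_read()` are hypothetical, and the caller supplies the path to a cgroup's cpu.stat file. On a kernel without this patch the `nice_usec` key is simply absent, which the lookup reports as -1.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: find one "key value" line in cpu.stat-style text.
 * Returns 0 and stores the value in *out, or -1 if the key is missing
 * (e.g. "nice_usec" on a kernel that does not carry this patch).
 */
int cpu_stat_lookup(const char *text, const char *key, uint64_t *out)
{
	const char *line = text;
	size_t klen;

	assert(text && key && out);
	klen = strlen(key);

	while (*line) {
		/* Only match the key at the start of a line, followed by a space. */
		if (strncmp(line, key, klen) == 0 && line[klen] == ' ')
			return sscanf(line + klen + 1, "%" SCNu64, out) == 1 ? 0 : -1;

		/* Advance to the character after the next newline, if any. */
		line = strchr(line, '\n');
		if (!line)
			break;
		line++;
	}
	return -1;
}

/*
 * Hypothetical helper: read a cpu.stat file (path chosen by the caller)
 * and look up a single key. Returns -1 on I/O failure or missing key.
 */
int cpu_stat_read(const char *path, const char *key, uint64_t *out)
{
	char buf[4096];
	size_t n;
	FILE *f = fopen(path, "r");

	if (!f)
		return -1;
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';

	return cpu_stat_lookup(buf, key, out);
}
```

A caller would pass the cpu.stat path of the cgroup of interest together with the key "nice_usec", and treat a -1 return as "field not available" rather than as zero niced time.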

Comments

Tejun Heo Aug. 30, 2024, 7:41 p.m. UTC | #1

On Fri, Aug 30, 2024 at 07:19:38AM -0700, Joshua Hahn wrote:
> From: Joshua Hahn <joshua.hahn6@gmail.com>
> 
> Cgroup-level CPU statistics currently include time spent on
> user/system processes, but do not include niced CPU time (despite
> already being tracked). This patch exposes niced CPU time to
> userspace, giving users a better understanding of their hardware
> limits and facilitating more informed workload distribution.
> 
> A new field 'ntime' is added to struct cgroup_base_stat as opposed to
> struct task_cputime to minimize footprint.

Patch looks fine to me but can you please do the following?

- Add subsystem prefix to the patch titles. Look at other commits for examples.

- Add Signed-off-by to both.

Thanks.

Joshua Hahn Aug. 30, 2024, 8:06 p.m. UTC | #2

Hello, thank you for reviewing the v2.

> Patch looks fine to me but can you please do the following?
>
> - Add subsystem prefix to the patch titles. Look at other commits for examples.
> - Add Signed-off-by to both.
> --
> tejun

I will send out a v3 with the signed-off-by, and I will add
cgroup/rstat to the patch title.
Thank you again!

Joshua

diff --git a/include/linux/cgroup-defs.h b/include/linux/cgroup-defs.h
index ae04035b6cbe..a2fcb3db6c52 100644
--- a/include/linux/cgroup-defs.h
+++ b/include/linux/cgroup-defs.h
@@ -315,6 +315,7 @@  struct cgroup_base_stat {
 #ifdef CONFIG_SCHED_CORE
 	u64 forceidle_sum;
 #endif
+	u64 ntime;
 };
 
 /*
diff --git a/kernel/cgroup/rstat.c b/kernel/cgroup/rstat.c
index a06b45272411..a77ba9a83bab 100644
--- a/kernel/cgroup/rstat.c
+++ b/kernel/cgroup/rstat.c
@@ -444,6 +444,7 @@  static void cgroup_base_stat_add(struct cgroup_base_stat *dst_bstat,
 #ifdef CONFIG_SCHED_CORE
 	dst_bstat->forceidle_sum += src_bstat->forceidle_sum;
 #endif
+	dst_bstat->ntime += src_bstat->ntime;
 }
 
 static void cgroup_base_stat_sub(struct cgroup_base_stat *dst_bstat,
@@ -455,6 +456,7 @@  static void cgroup_base_stat_sub(struct cgroup_base_stat *dst_bstat,
 #ifdef CONFIG_SCHED_CORE
 	dst_bstat->forceidle_sum -= src_bstat->forceidle_sum;
 #endif
+	dst_bstat->ntime -= src_bstat->ntime;
 }
 
 static void cgroup_base_stat_flush(struct cgroup *cgrp, int cpu)
@@ -535,7 +537,10 @@  void __cgroup_account_cputime_field(struct cgroup *cgrp,
 
 	switch (index) {
 	case CPUTIME_USER:
+		rstatc->bstat.cputime.utime += delta_exec;
+		break;
 	case CPUTIME_NICE:
+		rstatc->bstat.ntime += delta_exec;
 		rstatc->bstat.cputime.utime += delta_exec;
 		break;
 	case CPUTIME_SYSTEM:
@@ -591,6 +596,7 @@  static void root_cgroup_cputime(struct cgroup_base_stat *bstat)
 #ifdef CONFIG_SCHED_CORE
 		bstat->forceidle_sum += cpustat[CPUTIME_FORCEIDLE];
 #endif
+		bstat->ntime += cpustat[CPUTIME_NICE];
 	}
 }
 
@@ -608,13 +614,14 @@  static void cgroup_force_idle_show(struct seq_file *seq, struct cgroup_base_stat
 void cgroup_base_stat_cputime_show(struct seq_file *seq)
 {
 	struct cgroup *cgrp = seq_css(seq)->cgroup;
-	u64 usage, utime, stime;
+	u64 usage, utime, stime, ntime;
 
 	if (cgroup_parent(cgrp)) {
 		cgroup_rstat_flush_hold(cgrp);
 		usage = cgrp->bstat.cputime.sum_exec_runtime;
 		cputime_adjust(&cgrp->bstat.cputime, &cgrp->prev_cputime,
 			       &utime, &stime);
+		ntime = cgrp->bstat.ntime;
 		cgroup_rstat_flush_release(cgrp);
 	} else {
 		/* cgrp->bstat of root is not actually used, reuse it */
@@ -622,16 +629,19 @@  void cgroup_base_stat_cputime_show(struct seq_file *seq)
 		usage = cgrp->bstat.cputime.sum_exec_runtime;
 		utime = cgrp->bstat.cputime.utime;
 		stime = cgrp->bstat.cputime.stime;
+		ntime = cgrp->bstat.ntime;
 	}
 
 	do_div(usage, NSEC_PER_USEC);
 	do_div(utime, NSEC_PER_USEC);
 	do_div(stime, NSEC_PER_USEC);
+	do_div(ntime, NSEC_PER_USEC);
 
 	seq_printf(seq, "usage_usec %llu\n"
 		   "user_usec %llu\n"
-		   "system_usec %llu\n",
-		   usage, utime, stime);
+			 "system_usec %llu\n"
+			 "nice_usec %llu\n",
+			 usage, utime, stime, ntime);
 
 	cgroup_force_idle_show(seq, &cgrp->bstat);
 }
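
One consequence of the `__cgroup_account_cputime_field()` hunk above is that niced time is added to both `ntime` and `utime`, so for non-root cgroups `nice_usec` reads as a subset of `user_usec` rather than a separate bucket. The sketch below is a simplified userspace model of that accounting, not kernel code: the `model_*` names and the reduced struct are illustrative assumptions, and only the user and nice cases shown in the hunk are modeled.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-ins; these are not the kernel's definitions. */
enum model_cputime_index {
	MODEL_CPUTIME_USER,
	MODEL_CPUTIME_NICE,
};

struct model_base_stat {
	uint64_t utime;	/* user time in ns, niced time included */
	uint64_t ntime;	/* niced user time in ns */
};

/* Mirrors the user/nice cases of the switch in the patch. */
void model_account(struct model_base_stat *bstat,
		   enum model_cputime_index index, uint64_t delta_ns)
{
	assert(bstat);

	switch (index) {
	case MODEL_CPUTIME_USER:
		bstat->utime += delta_ns;
		break;
	case MODEL_CPUTIME_NICE:
		/* Niced time counts toward both totals. */
		bstat->ntime += delta_ns;
		bstat->utime += delta_ns;
		break;
	}
}

/* Nanoseconds to microseconds, as the show function does before printing. */
uint64_t model_ns_to_usec(uint64_t ns)
{
	return ns / 1000;
}

/* User time that was not niced, derived from the two counters. */
uint64_t model_unniced_user_ns(const struct model_base_stat *bstat)
{
	assert(bstat && bstat->utime >= bstat->ntime);
	return bstat->utime - bstat->ntime;
}
```

Under this model, a consumer that wants non-niced user time would subtract `nice_usec` from `user_usec` instead of adding the two fields together.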