[v8,03/15] PM: Introduce an Energy Model management framework

Several subsystems in the kernel (task scheduler and/or thermal at the
time of writing) can benefit from knowing about the energy consumed by
CPUs. Yet, this information can come from different sources (DT or
firmware for example), in different formats, hence making it hard to
exploit without a standard API.

As an attempt to address this, introduce a centralized Energy Model
(EM) management framework which aggregates the power values provided
by drivers into a table for each performance domain in the system. The
power cost tables are made available to interested clients (e.g. task
scheduler or thermal) via platform-agnostic APIs. The overall design
is represented by the diagram below (focused on Arm-related drivers as
an example, but applicable to any architecture):

     +---------------+  +-----------------+  +-------------+
     | Thermal (IPA) |  | Scheduler (EAS) |  |    Other    |
     +---------------+  +-----------------+  +-------------+
             |                   | em_pd_energy()   |
             |                   | em_cpu_get()     |
             +-----------+       |         +--------+
                         |       |         |
                         v       v         v
                      +---------------------+
                      |                     |
                      |    Energy Model     |
                      |                     |
                      |     Framework       |
                      |                     |
                      +---------------------+
                         ^       ^       ^
                         |       |       | em_register_perf_domain()
              +----------+       |       +---------+
              |                  |                 |
      +---------------+  +---------------+  +--------------+
      |  cpufreq-dt   |  |   arm_scmi    |  |    Other     |
      +---------------+  +---------------+  +--------------+
              ^                  ^                 ^
              |                  |                 |
      +--------------+   +---------------+  +--------------+
      | Device Tree  |   |   Firmware    |  |      ?       |
      +--------------+   +---------------+  +--------------+

Drivers (typically, but not limited to, CPUFreq drivers) can register
data in the EM framework using the em_register_perf_domain() API. The
calling driver must provide a callback function with a standardized
signature that will be used by the EM framework to build the power
cost tables of the performance domain. This design should offer a lot of
flexibility to calling drivers, which are free to read information
from any location and to use any technique to compute power costs.
Moreover, the capacity states registered by drivers in the EM framework
are not required to match real performance states of the target. This
is particularly important on targets where the performance states are
not known by the OS.
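
The callback-driven registration described above can be illustrated
with a small user-space sketch. It is not the code of this patch: the
types (struct cap_state, struct perf_domain), the callback signature
and the helper names (register_perf_domain_sketch(),
demo_active_power()) are hypothetical stand-ins, and the power and
frequency values are made up. The assumed contract is that the
framework passes a frequency floor and the driver answers with the
first state at or above it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins, NOT the kernel's definitions. */
struct cap_state {
	unsigned long frequency;	/* kHz */
	unsigned long power;		/* milli-watts */
};

struct perf_domain {
	struct cap_state *table;
	int nr_cap_states;
};

/*
 * Assumed callback contract: report the first state whose frequency is
 * at or above *freq, update *freq and *power, return 0 on success.
 */
typedef int (*active_power_fn)(unsigned long *power, unsigned long *freq);

/* Made-up driver data; a real driver could read DT or query firmware. */
static const struct cap_state demo_states[] = {
	{  500000, 100 },
	{ 1000000, 300 },
	{ 1500000, 700 },
};

static int demo_active_power(unsigned long *power, unsigned long *freq)
{
	size_t i;

	for (i = 0; i < sizeof(demo_states) / sizeof(demo_states[0]); i++) {
		if (demo_states[i].frequency >= *freq) {
			*freq = demo_states[i].frequency;
			*power = demo_states[i].power;
			return 0;
		}
	}
	return -1;	/* no state at or above the requested frequency */
}

/* Build the table by calling the driver callback once per state. */
static struct perf_domain *
register_perf_domain_sketch(int nr_states, active_power_fn cb)
{
	struct perf_domain *pd;
	unsigned long power = 0, freq = 0, prev_freq = 0;
	int i;

	if (nr_states <= 0 || !cb)
		return NULL;

	pd = malloc(sizeof(*pd));
	if (!pd)
		return NULL;
	pd->table = calloc((size_t)nr_states, sizeof(*pd->table));
	if (!pd->table) {
		free(pd);
		return NULL;
	}

	for (i = 0; i < nr_states; i++, freq++) {
		/* Reject callback errors and non-increasing frequencies. */
		if (cb(&power, &freq) || (i > 0 && freq <= prev_freq)) {
			free(pd->table);
			free(pd);
			return NULL;
		}
		pd->table[i].frequency = prev_freq = freq;
		pd->table[i].power = power;
	}

	/* The last state recorded is the highest frequency seen. */
	assert(pd->table[nr_states - 1].frequency == prev_freq);
	pd->nr_cap_states = nr_states;
	return pd;
}
```

In this sketch the framework side never looks at where the numbers
come from, which is the flexibility the callback is meant to provide.
Asking for more states than the driver can describe makes the
registration fail instead of producing a partial table.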

The power cost coefficients managed by the EM framework are specified in
milli-watts. Although the two potential users of those coefficients (IPA
and EAS) only need relative correctness, IPA specifically needs to
compare the power of CPUs with the power of other components (GPUs, for
example), which are still expressed in absolute terms in their
respective subsystems. Hence, specifying the power of CPUs in
milli-watts should help transition IPA to the EM framework without
introducing new problems, by keeping units comparable across
subsystems.

In the longer term, the EM of devices other than CPUs could also be
managed by the EM framework, which would make it possible to drop the
absolute unit. However, this is not required as a first step, so this
extension of the EM framework is left for later.

On the client side, the EM framework offers APIs to access the power
cost tables of a CPU (em_cpu_get()), and to estimate the energy
consumed by the CPUs of a performance domain (em_pd_energy()). Clients
such as the task scheduler can then use these APIs to access the shared
data structures holding the Energy Model of CPUs.
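
As an illustration of what a client could do with a power cost table,
here is a user-space sketch of a domain energy estimate. It is an
assumption about one plausible method, not the implementation of
em_pd_energy(): the types, the pd_energy_sketch() helper, the 1024
capacity scale and the table values are all hypothetical. The sketch
picks the lowest state able to serve the busiest CPU, then scales that
state's power by the summed utilization of the domain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins, NOT the kernel's definitions. */
struct cap_state {
	unsigned long frequency;	/* kHz */
	unsigned long power;		/* milli-watts */
};

struct perf_domain {
	const struct cap_state *table;	/* sorted by ascending frequency */
	int nr_cap_states;
};

/* Assumed utilization scale: 1024 means a fully busy CPU at max freq. */
#define CPU_CAPACITY_SCALE 1024UL

/* Made-up example table. */
static const struct cap_state demo_table[] = {
	{  500000, 100 },
	{ 1000000, 300 },
	{ 1500000, 700 },
};

static const struct perf_domain demo_pd = { demo_table, 3 };

/*
 * Estimate the energy of a domain, in abstract units.
 * max_util: utilization of the busiest CPU, which drives the frequency.
 * sum_util: sum of the utilization of all CPUs in the domain.
 */
static unsigned long pd_energy_sketch(const struct perf_domain *pd,
				      unsigned long max_util,
				      unsigned long sum_util)
{
	const struct cap_state *cs = NULL;
	unsigned long max_freq, target_freq;
	int i;

	assert(pd && pd->nr_cap_states > 0);

	/* Frequency needed to serve the busiest CPU (no headroom added). */
	max_freq = pd->table[pd->nr_cap_states - 1].frequency;
	target_freq = max_freq * max_util / CPU_CAPACITY_SCALE;

	/* Lowest state at or above the target; falls back to the highest. */
	for (i = 0; i < pd->nr_cap_states; i++) {
		cs = &pd->table[i];
		if (cs->frequency >= target_freq)
			break;
	}

	/*
	 * Running slower means running longer for the same work, so the
	 * state's power is scaled by max_freq / frequency before being
	 * weighted by the total utilization.
	 */
	return cs->power * max_freq / cs->frequency * sum_util /
	       CPU_CAPACITY_SCALE;
}
```

With the made-up table above, pd_energy_sketch(&demo_pd, 512, 768)
selects the 1000000 kHz state and returns 337, while a fully loaded
single CPU (max_util = sum_util = 1024) yields 700, the power of the
highest state.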

Cc: Peter Zijlstra <peterz@infradead.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Signed-off-by: Quentin Perret <quentin.perret@arm.com>
---
 include/linux/energy_model.h | 187 ++++++++++++++++++++++++++++++++
 kernel/power/Kconfig         |  15 +++
 kernel/power/Makefile        |   2 +
 kernel/power/energy_model.c  | 201 +++++++++++++++++++++++++++++++++++
 4 files changed, 405 insertions(+)
 create mode 100644 include/linux/energy_model.h
 create mode 100644 kernel/power/energy_model.c

Message-ID: <20181016101513.26919-4-quentin.perret@arm.com>
State: Superseded, archived
Return-Path: <linux-pm-owner@kernel.org>
From: Quentin Perret <quentin.perret@arm.com>
To: peterz@infradead.org, rjw@rjwysocki.net,
    linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
Cc: gregkh@linuxfoundation.org, mingo@redhat.com,
    dietmar.eggemann@arm.com, morten.rasmussen@arm.com,
    chris.redpath@arm.com, patrick.bellasi@arm.com,
    valentin.schneider@arm.com, vincent.guittot@linaro.org,
    thara.gopinath@linaro.org, viresh.kumar@linaro.org,
    tkjos@google.com, joel@joelfernandes.org, smuckle@google.com,
    adharmap@codeaurora.org, skannan@codeaurora.org,
    pkondeti@codeaurora.org, juri.lelli@redhat.com,
    edubezval@gmail.com, srinivas.pandruvada@linux.intel.com,
    currojerez@riseup.net, javi.merino@kernel.org,
    quentin.perret@arm.com
Subject: [PATCH v8 03/15] PM: Introduce an Energy Model management framework
Date: Tue, 16 Oct 2018 11:15:01 +0100
In-Reply-To: <20181016101513.26919-1-quentin.perret@arm.com>
References: <20181016101513.26919-1-quentin.perret@arm.com>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Sender: linux-pm-owner@vger.kernel.org
Precedence: bulk
Series: Energy Aware Scheduling
 [v8,00/15] Energy Aware Scheduling
 [v8,01/15] sched: Relocate arch_scale_cpu_capacity
 [v8,02/15] sched/cpufreq: Prepare schedutil for Energy Aware Scheduling
 [v8,03/15] PM: Introduce an Energy Model management framework
 [v8,04/15] PM / EM: Expose the Energy Model in sysfs
 [v8,05/15] sched/topology: Reference the Energy Model of CPUs when available
 [v8,06/15] sched/topology: Lowest CPU asymmetry sched_domain level pointer
 [v8,07/15] sched/topology: Disable EAS on inappropriate platforms
 [v8,08/15] sched/topology: Make Energy Aware Scheduling depend on schedutil
 [v8,09/15] sched: Introduce sched_energy_present static key
 [v8,10/15] sched: Introduce a sysctl for Energy Aware Scheduling
 [v8,11/15] sched/fair: Clean-up update_sg_lb_stats parameters
 [v8,12/15] sched: Add over-utilization/tipping point indicator
 [v8,13/15] sched/fair: Introduce an energy estimation helper function
 [v8,14/15] sched/fair: Select an energy-efficient CPU on task wake-up
 [v8,15/15] OPTIONAL: cpufreq: dt: Register an Energy Model
