diff mbox

[RFC,v3,3/3] Documentation: arm: define DT idle states bindings

Message ID 1392128273-8614-4-git-send-email-lorenzo.pieralisi@arm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Lorenzo Pieralisi Feb. 11, 2014, 2:17 p.m. UTC
ARM based platforms implement a variety of power management schemes that
allow processors to enter idle states at run-time.
The parameters defining these idle states vary on a per-platform basis forcing
the OS to hardcode the state parameters in platform specific static tables
whose size grows as the number of platforms supported in the kernel increases
and hampers device drivers standardization.

Therefore, this patch aims at standardizing idle state device tree bindings for
ARM platforms. Bindings define idle state parameters inclusive of entry methods
and state latencies, to allow operating systems to retrieve the configuration
entries from the device tree and initialize the related power management
drivers, paving the way for common code in the kernel to deal with idle
states and removing the need for static data in current and previous kernel
versions.

Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
---
 Documentation/devicetree/bindings/arm/cpus.txt        |  10 +
 Documentation/devicetree/bindings/arm/idle-states.txt | 690 ++++++++++
 2 files changed, 700 insertions(+)

Comments

amit kachhap Feb. 12, 2014, 11:39 a.m. UTC | #1
Hi Lorenzo,

This patch series looks nice and explains the ARM C state DT bindings clearly.
Couples of thoughts below.

On 2/11/14, Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> wrote:
> ARM based platforms implement a variety of power management schemes that
> allow processors to enter idle states at run-time.
> The parameters defining these idle states vary on a per-platform basis
> forcing
> the OS to hardcode the state parameters in platform specific static tables
> whose size grows as the number of platforms supported in the kernel
> increases
> and hampers device drivers standardization.
>
> Therefore, this patch aims at standardizing idle state device tree bindings
> for
> ARM platforms. Bindings define idle state parameters inclusive of entry
> methods
> and state latencies, to allow operating systems to retrieve the
> configuration
> entries from the device tree and initialize the related power management
> drivers, paving the way for common code in the kernel to deal with idle
> states and removing the need for static data in current and previous kernel
> versions.
>
> Signed-off-by: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com>
> ---
>  Documentation/devicetree/bindings/arm/cpus.txt        |  10 +
>  Documentation/devicetree/bindings/arm/idle-states.txt | 690 ++++++++++
>  2 files changed, 700 insertions(+)
>
> diff --git a/Documentation/devicetree/bindings/arm/cpus.txt
> b/Documentation/devicetree/bindings/arm/cpus.txt
> index 9130435..fb7f008 100644
> --- a/Documentation/devicetree/bindings/arm/cpus.txt
> +++ b/Documentation/devicetree/bindings/arm/cpus.txt
> @@ -191,6 +191,13 @@ nodes to be present and contain the properties
> described below.
>  			  property identifying a 64-bit zero-initialised
>  			  memory location.
>
> +	- cpu-idle-states
> +		Usage: Optional
> +		Value type: <prop-encoded-array>
> +		Definition:
> +			# List of phandles to cpu idle state nodes supported
> +			  by this cpu [1].
> +
>  Example 1 (dual-cluster big.LITTLE system 32-bit):
>
>  	cpus {
> @@ -382,3 +389,6 @@ cpus {
>  		cpu-release-addr = <0 0x20000000>;
>  	};
>  };
> +
> +[1] ARM Linux kernel documentation - idle state bindings
> +    Documentation/devicetree/bindings/arm/idle-states.txt
> diff --git a/Documentation/devicetree/bindings/arm/idle-states.txt
> b/Documentation/devicetree/bindings/arm/idle-states.txt
> new file mode 100644
> index 0000000..f155e70
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/arm/idle-states.txt
> @@ -0,0 +1,690 @@
> +==========================================
> +ARM idle states binding description
> +==========================================
> +
> +==========================================
> +1 - Introduction
> +==========================================
> +
> +ARM systems contain HW capable of managing power consumption dynamically,
> +where cores can be put in different low-power states (ranging from simple
> +wfi to power gating) according to OSPM policies. The CPU states
> representing
> +the range of dynamic idle states that a processor can enter at run-time,
> can be
> +specified through device tree bindings representing the parameters
> required
> +to enter/exit specific idle states on a given processor.
> +
> +According to the Server Base System Architecture document (SBSA, [3]), the
> +power states an ARM CPU can be put into are identified by the following
> list:
> +
> +1 - Running
> +2 - Idle_standby
> +3 - Idle_retention
> +4 - Sleep
> +5 - Off
> +
> +ARM platforms implement the power states specified in the SBSA through
> power
> +management schemes that allow an OS PM implementation to put the processor
> in
> +different idle states (which include states 1 to 4 above; "off" state is
> not
> +an idle state since it does not have wake-up capabilities, hence it is not
> +considered in this document).
> +
> +Idle state parameters (eg entry latency) are platform specific and need to
> be
> +characterized with bindings that provide the required information to OSPM
> +code so that it can build the required tables and use them at runtime.
> +
> +The device tree binding definition for ARM idle states is the subject of
> this
> +document.
> +
> +===========================================
> +2 - cpu-idle-states node
> +===========================================
> +
> +ARM processor idle states are defined within the cpu-idle-states node,
> which is
> +a direct child of the cpus node and provides a container where the
> processor
> +states, defined as device tree nodes, are listed.
> +
> +- cpu-idle-states node
> +
> +	Usage: Optional - On ARM systems, is a container of processor idle
> +			  states nodes. If the system does not provide CPU
> +			  power management capabilities or the processor just
> +			  supports idle_standby a cpu-idle-states node is not
> +			  required.
> +
> +	Description: cpu-idle-states node is a container node, where its
> +		     subnodes describe the CPU idle states.
> +
> +	Node name must be "cpu-idle-states".
> +
> +	The cpu-idle-states node's parent node must be the cpus node.
> +
> +	The cpu-idle-states node's child nodes can be:
> +
> +	- one or more state nodes
> +
> +	Any other configuration is considered invalid.
> +
> +The nodes describing the idle states (state) can only be defined within
> the
> +cpu-idle-states node.
> +
> +Any other configuration is consider invalid and therefore must be ignored.
> +
> +===========================================
> +2 - state node
> +===========================================
> +
> +A state node represents an idle state description and must be defined as
> +follows:
> +
> +- state node
> +
> +	Description: must be child of either the cpu-idle-states node or
> +		     a state node.
> +
> +	The state node name shall be "stateN", where N = {0, 1, ...} is
> +	the node number; state nodes which are siblings within a single common
> +	parent node must be given a unique and sequential N value, starting
> +	from 0.
> +
> +	A state node can contain state child nodes. A state node with
> +	children represents a hierarchical state, which is a superset of
> +	the child states. Hierarchical states require all CPUs on which
> +	they are valid to request the state in order for it to be entered.
> +
> +	A state node defines the following properties:
> +
> +	- compatible
> +		Usage: Required
> +		Value type: <stringlist>
> +		Definition: Must be "arm,cpu-idle-state".
> +
> +	- index
> +		Usage: Required
> +		Value type: <u32>
> +		Definition: It represents an idle state index, starting from 2.
> +			    Index 0 represents the processor state "running"
> +			    and index 1 represents processor mode
> +			    "idle_standby", entered by executing a wfi
> +			    instruction (SBSA,[3]); indexes 0 and 1 are
> +			    standard ARM states that need not be described.
> +
> +	- entry-method
> +		Usage: Required
> +		Value type: <stringlist>
> +		Definition: Describes the method by which a CPU enters the
> +			    idle state. This property is required and must be
> +			    one of:
> +
> +			    - "arm,psci-cpu-suspend"
> +			      ARM PSCI firmware interface, CPU suspend
> +			      method[2].
> +
> +			    - "[vendor],[method]"
> +			      An implementation dependent string with
> +			      format "vendor,method", where vendor is a string
> +			      denoting the name of the manufacturer and
> +			      method is a string specifying the mechanism
> +			      used to enter the idle state.
> +
> +	- power-state
> +		Usage: See definition.
> +		Value type: <u32>
> +		Definition: Depends on the entry-method property value.
> +			    If entry-method is "arm,psci-cpu-suspend":
> +				# Property is required and represents
> +				  psci-power-state parameter. Please refer to
> +				  [2] for PSCI bindings definition.
> +
> +	- entry-latency
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing worst case latency
> +			    in microseconds required to enter the idle state.
I liked your V2 version of OPP based latency more. However in this way
also latency supplied can correspond to max OPP and based on the
current OPP, the cpuidle driver can compute the new latency
proportionately. In case of frequency this can be linear but may not
be linear for residency.
> +
> +	- exit-latency
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing worst case latency
> +			    in microseconds required to exit the idle state.
> +
> +	- min-residency
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing time in microseconds
> +			    required for the CPU to be in the idle state to
> +			    make up for the dynamic power consumed to
> +			    enter/exit the idle state in order to break even
> +			    in terms of power consumption compared to idle
> +			    state index 1 (idle_standby).
> +
> +	- power-domains
> +		Usage: Optional
> +		Value type: <prop-encoded-array>
> +		Definition: List of power domain specifiers ([1]) describing
> +			    the power domains that are affected by the idle
> +			    state entry.
I can see power-domains used in 2 places. First in C state definitions
and second in cpu node cache descriptions. I am not sure but if
possible than this can be put in one place.
> +
> +	- cache-state-retained
> +		Usage: See definition
> +		Value type: <none>
> +		Definition: if present cache memory is retained on power down,
> +			    otherwise it is lost.
Can the DT bindings support both retained and non-retained with
different C states? The example shown does not use the retained flag.
I guess than many combinations are possible.
Processor retained, cache non-retained = C state 2
Processor non retained , cache retained = C state 3
Processor non retained, cache non retained = C state 4
> +
> +	- processor-state-retained
> +		Usage: See definition
> +		Value type: <none>
> +		Definition: if present CPU processor logic is retained on
> +			    power down, otherwise it is lost.
> +
> +===========================================
> +3 - Examples
> +===========================================
> +
> +Example 1 (ARM 64-bit, 16-cpu system):
> +
> +pd_clusters: power-domain-clusters@80002000 {
> +	compatible = "arm,power-controller";
> +	reg = <0x0 0x80002000 0x0 0x1000>;
> +	#power-domain-cells = <1>;
> +	#address-cells = <2>;
> +	#size-cells = <2>;
> +
> +	pd_cores: power-domain-cores@80000000 {
> +		compatible = "arm,power-controller";
> +		reg = <0x0 0x80000000 0x0 0x1000>;
> +		#power-domain-cells = <1>;
> +	};
> +};
> +
> +cpus {
> +	#size-cells = <0>;
> +	#address-cells = <2>;
> +
> +	cpu-idle-states {
> +
> +		STATE0: state0 {
> +			compatible = "arm,cpu-idle-state";
> +			index = <3>;
> +			entry-method = "arm,psci-cpu-suspend";
> +			psci-power-state = <0x1010000>;
> +			entry-latency = <500>;
> +			exit-latency = <1000>;
> +			min-residency = <2500>;
> +			power-domains = <&pd_clusters 0>;
> +			STATE0_1: state0 {
> +				compatible = "arm,cpu-idle-state";
> +				index = <2>;
> +				entry-method = "arm,psci-cpu-suspend";
> +				psci-power-state = <0x0010000>;
> +				entry-latency = <200>;
> +				exit-latency = <400>;
> +				min-residency = <300>;
> +				power-domains = <&pd_cores 0>,
> +						<&pd_cores 1>,
> +						<&pd_cores 2>,
> +						<&pd_cores 3>,
> +						<&pd_cores 4>,
> +						<&pd_cores 5>,
> +						<&pd_cores 6>,
> +						<&pd_cores 7>;
> +			};
> +		};
> +
> +		STATE1: state1 {
> +			compatible = "arm,cpu-idle-state";
> +			index = <3>;
> +			entry-method = "arm,psci-cpu-suspend";
> +			psci-power-state = <0x1010000>;
> +			entry-latency = <1000>;
> +			exit-latency = <3000>;
> +			min-residency = <6500>;
> +			power-domains = <&pd_clusters 1>;
> +			STATE1_0: state0 {
> +				compatible = "arm,cpu-idle-state";
> +				index = <2>;
> +				entry-method = "arm,psci-cpu-suspend";
> +				psci-power-state = <0x0010000>;
> +				entry-latency = <200>;
> +				exit-latency = <400>;
> +				min-residency = <500>;
> +				power-domains = <&pd_cores 8>,
> +						<&pd_cores 9>,
> +						<&pd_cores 10>,
> +						<&pd_cores 11>,
> +						<&pd_cores 12>,
> +						<&pd_cores 13>,
> +						<&pd_cores 14>,
> +						<&pd_cores 15>;
> +			};
> +		};
> +	};
> +
> +	CPU0: cpu@0 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a57";
> +		reg = <0x0 0x0>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_0>;
> +		cpu-idle-states = <&STATE0_1 &STATE0>;
> +		L1_0: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_0>;
> +			power-domain = <&pd_cores 0>;
> +		};
> +		L2_0: l2-cache {
> +			compatible = "arm,arch-cache";
> +			power-domain = <&pd_clusters 0>;
> +		};
> +	};
> +
> +	CPU1: cpu@1 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a57";
> +		reg = <0x0 0x1>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_1>;
> +		cpu-idle-states = <&STATE0_1 &STATE0>;
> +		L1_1: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_0>;
> +			power-domain = <&pd_cores 1>;
> +		};
> +	};
> +
> +	CPU2: cpu@100 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a57";
> +		reg = <0x0 0x100>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_2>;
> +		cpu-idle-states = <&STATE0_1 &STATE0>;
> +		L1_2: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_0>;
> +			power-domain = <&pd_cores 2>;
> +		};
> +	};
> +
> +	CPU3: cpu@101 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a57";
> +		reg = <0x0 0x101>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_3>;
> +		cpu-idle-states = <&STATE0_1 &STATE0>;
> +		L1_3: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_0>;
> +			power-domain = <&pd_cores 3>;
> +		};
> +	};
> +
> +	CPU4: cpu@10000 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a57";
> +		reg = <0x0 0x10000>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_4>;
> +		cpu-idle-states = <&STATE0_1 &STATE0>;
> +		L1_4: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_0>;
> +			power-domain = <&pd_cores 4>;
> +		};
> +	};
> +
> +	CPU5: cpu@10001 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a57";
> +		reg = <0x0 0x10001>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_5>;
> +		cpu-idle-states = <&STATE0_1 &STATE0>;
> +		L1_5: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_0>;
> +			power-domain = <&pd_cores 5>;
> +		};
> +	};
> +
> +	CPU6: cpu@10100 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a57";
> +		reg = <0x0 0x10100>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_6>;
> +		cpu-idle-states = <&STATE0_1 &STATE0>;
> +		L1_6: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_0>;
> +			power-domain = <&pd_cores 6>;
> +		};
> +	};
> +
> +	CPU7: cpu@10101 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a57";
> +		reg = <0x0 0x10101>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_7>;
> +		cpu-idle-states = <&STATE0_1 &STATE0>;
> +		L1_7: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_0>;
> +			power-domain = <&pd_cores 7>;
> +		};
> +	};
> +
> +	CPU8: cpu@100000000 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a53";
> +		reg = <0x1 0x0>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_8>;
> +		cpu-idle-states = <&STATE1_0 &STATE1>;
> +		L1_8: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_1>;
> +			power-domain = <&pd_cores 8>;
> +		};
> +		L2_1: l2-cache {
> +			compatible = "arm,arch-cache";
> +			power-domain = <&pd_clusters 1>;
> +		};
> +	};
> +
> +	CPU9: cpu@100000001 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a53";
> +		reg = <0x1 0x1>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_9>;
> +		cpu-idle-states = <&STATE1_0 &STATE1>;
> +		L1_9: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_1>;
> +			power-domain = <&pd_cores 9>;
> +		};
> +	};
> +
> +	CPU10: cpu@100000100 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a53";
> +		reg = <0x1 0x100>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_10>;
> +		cpu-idle-states = <&STATE1_0 &STATE1>;
> +		L1_10: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_1>;
> +			power-domain = <&pd_cores 10>;
> +		};
> +	};
> +
> +	CPU11: cpu@100000101 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a53";
> +		reg = <0x1 0x101>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_11>;
> +		cpu-idle-states = <&STATE1_0 &STATE1>;
> +		L1_11: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_1>;
> +			power-domain = <&pd_cores 11>;
> +		};
> +	};
> +
> +	CPU12: cpu@100010000 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a53";
> +		reg = <0x1 0x10000>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_12>;
> +		cpu-idle-states = <&STATE1_0 &STATE1>;
> +		L1_12: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_1>;
> +			power-domain = <&pd_cores 12>;
> +		};
> +	};
> +
> +	CPU13: cpu@100010001 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a53";
> +		reg = <0x1 0x10001>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_13>;
> +		cpu-idle-states = <&STATE1_0 &STATE1>;
> +		L1_13: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_1>;
> +			power-domain = <&pd_cores 13>;
> +		};
> +	};
> +
> +	CPU14: cpu@100010100 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a53";
> +		reg = <0x1 0x10100>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_14>;
> +		cpu-idle-states = <&STATE1_0 &STATE1>;
> +		L1_14: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_1>;
> +			power-domain = <&pd_cores 14>;
> +		};
> +	};
> +
> +	CPU15: cpu@100010101 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a53";
> +		reg = <0x1 0x10101>;
> +		enable-method = "psci";
> +		next-level-cache = <&L1_15>;
> +		cpu-idle-states = <&STATE1_0 &STATE1>;
> +		L1_15: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_1>;
> +			power-domain = <&pd_cores 15>;
> +		};
> +	};
> +};
> +
> +Example 2 (ARM 32-bit, 8-cpu system, two clusters):
> +
> +pd_clusters: power-domain-clusters@80002000 {
> +	compatible = "arm,power-controller";
> +	reg = <0x80002000 0x1000>;
> +	#power-domain-cells = <1>;
> +	#address-cells = <1>;
> +	#size-cells = <1>;
> +
> +	pd_cores: power-domain-cores@80000000 {
> +		compatible = "arm,power-controller";
> +		reg = <0x80000000 0x1000>;
> +		#power-domain-cells = <1>;
> +	};
> +};
> +
> +cpus {
> +	#size-cells = <0>;
> +	#address-cells = <1>;
> +
> +	cpu-idle-states {
> +
> +		STATE0: state0 {
> +			compatible = "arm,cpu-idle-state";
> +			index = <3>;
> +			entry-method = "psci";
> +			psci-power-state = <0x1010000>;
> +			entry-latency = <1000>;
> +			exit-latency = <1500>;
> +			min-residency = <1500>;
> +			power-domains = <&pd_clusters 0>;
> +			STATE0_1: state0 {
> +				compatible = "arm,cpu-idle-state";
> +				index = <2>;
> +				entry-method = "psci";
> +				psci-power-state = <0x0010000>;
> +				entry-latency = <400>;
> +				exit-latency = <500>;
> +				min-residency = <300>;
> +				power-domains = <&pd_cores 0>,
> +						<&pd_cores 1>,
> +						<&pd_cores 2>,
> +						<&pd_cores 3>;
> +			};
> +		};
> +
> +		STATE1: state1 {
> +			compatible = "arm,cpu-idle-state";
> +			index = <3>;
> +			entry-method = "psci";
> +			psci-power-state = <0x1010000>;
> +			entry-latency = <800>;
> +			exit-latency = <2000>;
> +			min-residency = <6500>;
> +			power-domains = <&pd_clusters 1>;
> +			STATE1_0: state0 {
> +				compatible = "arm,cpu-idle-state";
> +				index = <2>;
> +				entry-method = "psci";
> +				psci-power-state = <0x0010000>;
> +				entry-latency = <300>;
> +				exit-latency = <500>;
> +				min-residency = <500>;
> +				power-domains = <&pd_cores 4>,
> +						<&pd_cores 5>,
> +						<&pd_cores 6>,
> +						<&pd_cores 7>;
> +			};
> +		};
> +	};
> +
> +	CPU0: cpu@0 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a15";
> +		reg = <0x0>;
> +		next-level-cache = <&L1_0>;
> +		cpu-idle-states = <&STATE0_1 &STATE0>;
> +		L1_0: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_0>;
> +			power-domain = <&pd_cores 0>;
> +		};
> +		L2_0: l2-cache {
> +			compatible = "arm,arch-cache";
> +			power-domain = <&pd_clusters 0>;
> +		};
> +	};
> +
> +	CPU1: cpu@1 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a15";
> +		reg = <0x1>;
> +		next-level-cache = <&L1_1>;
> +		cpu-idle-states = <&STATE0_1 &STATE0>;
> +		L1_1: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_0>;
> +			power-domain = <&pd_cores 1>;
> +		};
> +	};
> +
> +	CPU2: cpu@2 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a15";
> +		reg = <0x2>;
> +		next-level-cache = <&L1_2>;
> +		cpu-idle-states = <&STATE0_1 &STATE0>;
> +		L1_2: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_0>;
> +			power-domain = <&pd_cores 2>;
> +		};
> +	};
> +
> +	CPU3: cpu@3 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a15";
> +		reg = <0x3>;
> +		next-level-cache = <&L1_3>;
> +		cpu-idle-states = <&STATE0_1 &STATE0>;
> +		L1_3: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_0>;
> +			power-domain = <&pd_cores 3>;
> +		};
> +	};
> +
> +	CPU4: cpu@100 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a7";
> +		reg = <0x100>;
> +		next-level-cache = <&L1_4>;
> +		cpu-idle-states = <&STATE1_0 &STATE1>;
> +		L1_4: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_1>;
> +			power-domain = <&pd_cores 4>;
> +		};
> +		L2_1: l2-cache {
> +			compatible = "arm,arch-cache";
> +			power-domain = <&pd_clusters 1>;
> +		};
> +	};
> +
> +	CPU5: cpu@101 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a7";
> +		reg = <0x101>;
> +		next-level-cache = <&L1_5>;
> +		cpu-idle-states = <&STATE1_0 &STATE1>;
> +		L1_5: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_1>;
> +			power-domain = <&pd_cores 5>;
> +		};
> +	};
> +
> +	CPU6: cpu@102 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a7";
> +		reg = <0x102>;
> +		next-level-cache = <&L1_6>;
> +		cpu-idle-states = <&STATE1_0 &STATE1>;
> +		L1_6: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_1>;
> +			power-domain = <&pd_cores 6>;
> +		};
> +	};
> +
> +	CPU7: cpu@103 {
> +		device_type = "cpu";
> +		compatible = "arm,cortex-a7";
> +		reg = <0x103>;
> +		next-level-cache = <&L1_7>;
> +		cpu-idle-states = <&STATE1_0 &STATE1>;
> +		L1_7: l1-cache {
> +			compatible = "arm,arch-cache";
> +			next-level-cache = <&L2_1>;
> +			power-domain = <&pd_cores 7>;
> +		};
> +	};
> +};
> +
> +===========================================
> +4 - References
> +===========================================
> +
> +[1] ARM Linux Kernel documentation - power domain bindings
> +    Documentation/devicetree/bindings/power/power_domain.txt
> +
> +[2] ARM Linux Kernel documentation - PSCI bindings
> +    Documentation/devicetree/bindings/arm/psci.txt
> +
> +[3] ARM Server Base System Architecture (SBSA)
> +    http://infocenter.arm.com/help/index.jsp
> --
> 1.8.4
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pm" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
Lorenzo Pieralisi Feb. 12, 2014, 2:34 p.m. UTC | #2
Hi Amit,

On Wed, Feb 12, 2014 at 11:39:28AM +0000, Amit Kachhap wrote:
> Hi Lorenzo,
> 
> This patch series looks nice and explains the ARM C state DT bindings clearly.
Thank you.

[...]

> > +     - entry-latency
> > +             Usage: Required
> > +             Value type: <prop-encoded-array>
> > +             Definition: u32 value representing worst case latency
> > +                         in microseconds required to enter the idle state.
> I liked your V2 version of OPP based latency more. However in this way
> also latency supplied can correspond to max OPP and based on the
> current OPP, the cpuidle driver can compute the new latency
> proportionately. In case of frequency this can be linear but may not
> be linear for residency.

I understand that, but the bindings do not preclude having a list instead of
just worst case value. Let's go for the worst case and build on it
(when we have a decent method to express OPP dependencies in the DT bindings,
that is not the case at present). Your point is taken, I just want a first
version that provides the groundwork on top of which more complex
bindings can be defined.

> > +
> > +     - exit-latency
> > +             Usage: Required
> > +             Value type: <prop-encoded-array>
> > +             Definition: u32 value representing worst case latency
> > +                         in microseconds required to exit the idle state.
> > +
> > +     - min-residency
> > +             Usage: Required
> > +             Value type: <prop-encoded-array>
> > +             Definition: u32 value representing time in microseconds
> > +                         required for the CPU to be in the idle state to
> > +                         make up for the dynamic power consumed to
> > +                         enter/exit the idle state in order to break even
> > +                         in terms of power consumption compared to idle
> > +                         state index 1 (idle_standby).
> > +
> > +     - power-domains
> > +             Usage: Optional
> > +             Value type: <prop-encoded-array>
> > +             Definition: List of power domain specifiers ([1]) describing
> > +                         the power domains that are affected by the idle
> > +                         state entry.
> I can see power-domains used in 2 places. First in C state definitions
> and second in cpu node cache descriptions. I am not sure but if
> possible than this can be put in one place.

Power domain specifiers are there to link devices (ie caches) to the
idle states. We have to know which devices are affected by an idle state
entry, and that's done through the power-domain (ie all devices that
belong to a power domain affected by the idle state entry must be acted
upon).

All devices have to be linked to the power domain they belong to,
possibly by using a hierarchical representation (to group devices under
a power domain and avoid duplicating the power domain specifier).

> > +
> > +     - cache-state-retained
> > +             Usage: See definition
> > +             Value type: <none>
> > +             Definition: if present cache memory is retained on power down,
> > +                         otherwise it is lost.
> Can the DT bindings support both retained and non-retained with
> different C states? The example shown does not use the retained flag.
> I guess than many combinations are possible.
> Processor retained, cache non-retained = C state 2
> Processor non retained , cache retained = C state 3
> Processor non retained, cache non retained = C state 4

Yes. Those are per idle state properties so the can be used as you defined,
I cannot think of any issue regarding those boolean properties at the
moment, but if you do please flag it up.

Thanks !
Lorenzo
Sebastian Capella Feb. 13, 2014, 1:31 a.m. UTC | #3
Quoting Lorenzo Pieralisi (2014-02-11 06:17:53)
> +       - cpu-idle-states
> +               Usage: Optional
> +               Value type: <prop-encoded-array>
> +               Definition:
> +                       # List of phandles to cpu idle state nodes supported
> +                         by this cpu [1].
> +

Should cpu idle be hyphenated in the definition like:
  "List of phandles to cpu-idle state nodes supported"

Is anything implied in the ordering of this list?
Or is this a non-ordered array of phandles?

Would it be a good idea to select a different name for this property vs.
the node?  It seems to get confusing sometimes.


> +According to the Server Base System Architecture document (SBSA, [3]), the
> +power states an ARM CPU can be put into are identified by the following list:
> +
> +1 - Running
> +2 - Idle_standby
> +3 - Idle_retention
> +4 - Sleep
> +5 - Off

Are these states used in the state->index value?
Might it be helpful to number these starting with 0?

> +ARM platforms implement the power states specified in the SBSA through power
> +management schemes that allow an OS PM implementation to put the processor in
> +different idle states (which include states 1 to 4 above; "off" state is not
> +an idle state since it does not have wake-up capabilities, hence it is not
> +considered in this document).

Might an implementation have both sleep and off states where they have
different latencies in the case a cpu can wake itself vs. a coprocessor
waking the cpu?

> +
> +Idle state parameters (eg entry latency) are platform specific and need to be
> +characterized with bindings that provide the required information to OSPM
> +code so that it can build the required tables and use them at runtime.
> +
> +The device tree binding definition for ARM idle states is the subject of this
> +document.

During last connect, we'd discussed that the small set of
states here could apply to a single node, which can represent a cpu, a
cluster with cache etc.  Then the complexities of the system power state
would be represented using a heirarchy, with each node in the
tree having its own state from the list above.  This would allow
a fairly rich system state while having just a few power states defined
at each level.  Is this how you're intending these bindings to go?

> +===========================================
> +2 - state node
> +===========================================

should the section numbering be incremented here?  Or is this a
subsection?  2.1?

Also, would it be nice to have a name field for each state?

> +       A state node can contain state child nodes. A state node with
> +       children represents a hierarchical state, which is a superset of
> +       the child states. Hierarchical states require all CPUs on which
> +       they are valid to request the state in order for it to be entered.

Is it possible for a cpu to request a deeper state and unblock other cpus
from entering this state?

"all CPUs on which they are valid" is this determined by seeing which
state's phandle is in each cpu->cpu-idle-states?

> +
> +       A state node defines the following properties:
...
> +       - index
> +               Usage: Required
> +               Value type: <u32>
> +               Definition: It represents an idle state index, starting from 2.
> +                           Index 0 represents the processor state "running"
> +                           and index 1 represents processor mode
> +                           "idle_standby", entered by executing a wfi
> +                           instruction (SBSA,[3]); indexes 0 and 1 are
> +                           standard ARM states that need not be described.

Do you think maybe something like this might be clearer?

Definition: It represents an idle state index.

	Index 0 and 1 shall not be specified and are reserved for ARM
	states where index 0 is running, and index 1 is idle_standby 
	entered by executing a wfi instruction (SBSA,[3])

What mechanism is used to order the power states WRT power consumption?

> +       - entry-method
> +               Usage: Required
> +               Value type: <stringlist>
> +               Definition: Describes the method by which a CPU enters the
> +                           idle state. This property is required and must be
> +                           one of:
> +
> +                           - "arm,psci-cpu-suspend"
> +                             ARM PSCI firmware interface, CPU suspend
> +                             method[2].

Can psci-cpu-suspend be assumed if entry-method is omitted?

Can this field be used to combine both psci and non-psci states in any order?

> +       - power-state
> +               Usage: See definition.
> +               Value type: <u32>
> +               Definition: Depends on the entry-method property value.
> +                           If entry-method is "arm,psci-cpu-suspend":
> +                               # Property is required and represents
> +                                 psci-power-state parameter. Please refer to
> +                                 [2] for PSCI bindings definition.

Examples use psci-power-state..

If we call this something like entry-method-param rather than power-state,
would this allow the field to be more flexible?  Is flexibility here a goal?

	- power-state
		Usage: See definition.
		Value type: <u32>
		Definition:  Parameter to pass to the entry method when
			this state is being entered.
			If entry-method is "arm,psci-cpu-suspend",
			this parameter represents the psci-power-state
			parameter. Please refer to [2] for PSCI bindings
			definition.

> +       - power-domains
> +               Usage: Optional
> +               Value type: <prop-encoded-array>
> +               Definition: List of power domain specifiers ([1]) describing
> +                           the power domains that are affected by the idle
> +                           state entry.

How do you expect this information should be used?

I assume psci will be turning off the powerdomains not Linux right?
If so, is the structure above helpful for psci to associate the cpu
requesting a state to the specific power domain in the power-domains list?
Or is this all encoded in the parameter to PSCI suspend?  In that case,
what is the utility?

> +       - cache-state-retained
> +               Usage: See definition
> +               Value type: <none>
> +               Definition: if present cache memory is retained on power down,
> +                           otherwise it is lost.
> +
> +       - processor-state-retained
> +               Usage: See definition
> +               Value type: <none>
> +               Definition: if present CPU processor logic is retained on
> +                           power down, otherwise it is lost.

I don't see a good example of these two retained state properties.
Isn't this the purpose of idle_retained state?  In the explanations, is
the term 'power down' the same as sleep or off?

> +
> +cpus {
> +       #size-cells = <0>;
> +       #address-cells = <2>;
> +
> +       cpu-idle-states {
> +
> +               STATE0: state0 {
> +                       compatible = "arm,cpu-idle-state";
> +                       index = <3>;

Are the index fields of nested states independent of each other or
sequential?

ie: 
  -  does index=3 here mean pd_cluster is sleep state, and index=2
     below mean the cpu cluster is idle_retention? (both SBSA states)
  -  Or does index=3 here mean this state is the next cpu-idle state after
     STATE0_1 below, which has index=2?  In this case, are the indexes
     implied to be increasing in order of most power savings?

> +                       entry-method = "arm,psci-cpu-suspend";
> +                       psci-power-state = <0x1010000>;
> +                       entry-latency = <500>;
> +                       exit-latency = <1000>;
> +                       min-residency = <2500>;
> +                       power-domains = <&pd_clusters 0>;
> +                       STATE0_1: state0 {

Should this be STATE0_0?

> +                               compatible = "arm,cpu-idle-state";
> +                               index = <2>;
> +                               entry-method = "arm,psci-cpu-suspend";
> +                               psci-power-state = <0x0010000>;
> +                               entry-latency = <200>;
> +                               exit-latency = <400>;
> +                               min-residency = <300>;
> +                               power-domains = <&pd_cores 0>,
> +                                               <&pd_cores 1>,
                                                  ...
> +                                               <&pd_cores 7>;
> +                       };
> +               };

Would it be possible to add an example illustrating more
complex cluster/cpu power states?  Maybe where both the cpus and
cluster have multiple states (sleep/retention)?

The current example seems to show just the cpu idle_retention state,
and the cluster off state.

Maybe an example of how you'd represent something like this with the new
bindings:

(in increasing order of power saving)
+-----------------+--------------------+-----------------------------------+
|      cpu        |      cluster       |        Notes
+-----------------+--------------------+-----------------------------------+
|    running      |      running       |      not specified running
|  idle_standby   |      running       |      not specified WFI
| idle_retention  |      running       |
| idle_retention  |   idle_retention   |
|     sleep       |   idle_retention   |
|     sleep       |       sleep        |
+-----------------|--------------------+-----------------------------------+

The CPU has 4 states: running, idle standby, idle_retention and sleep.
the first two are not specified per the instructions.

Then the cluster has 3 states: running, retention and sleep.

Maybe a complex case like this would be helpful to understand how the
bindings should be used.

Thanks,

Sebastian
Lorenzo Pieralisi Feb. 13, 2014, 12:47 p.m. UTC | #4
Hi Sebastian,

thanks for having a look.

On Thu, Feb 13, 2014 at 01:31:53AM +0000, Sebastian Capella wrote:
> Quoting Lorenzo Pieralisi (2014-02-11 06:17:53)
> > +       - cpu-idle-states
> > +               Usage: Optional
> > +               Value type: <prop-encoded-array>
> > +               Definition:
> > +                       # List of phandles to cpu idle state nodes supported
> > +                         by this cpu [1].
> > +
> 
> Should cpu idle be hyphenated in the definition like:
>   "List of phandles to cpu-idle state nodes supported"

No, because it is not meant to be tied to the kernel CPU idle framework.
It is meant to be "List of phandles to idle state nodes supported by this cpu"
and that's what I will do.

On a side note, that's why I was reluctant to call them idle states, for
the records.

> Is anything implied in the ordering of this list?
> Or is this a non-ordered array of phandles?

Non-ordered.

> Would it be a good idea to select a different name for this property vs.
> the node?  It seems to get confusing sometimes.

Such as ? "idle-states" ?

> > +According to the Server Base System Architecture document (SBSA, [3]), the
> > +power states an ARM CPU can be put into are identified by the following list:
> > +
> > +1 - Running
> > +2 - Idle_standby
> > +3 - Idle_retention
> > +4 - Sleep
> > +5 - Off
> 
> Are these states used in the state->index value?
> Might it be helpful to number these starting with 0?

No, I will remove the numbers.

> > +ARM platforms implement the power states specified in the SBSA through power
> > +management schemes that allow an OS PM implementation to put the processor in
> > +different idle states (which include states 1 to 4 above; "off" state is not
> > +an idle state since it does not have wake-up capabilities, hence it is not
> > +considered in this document).
> 
> Might an implementation have both sleep and off states where they have
> different latencies in the case a cpu can wake itself vs. a coprocessor
> waking the cpu?

Yes, but not in SBSA nomenclature. off as in SBSA is not an idle state,
since it does not require IRQ wake up capabilities.

An idle state requires IRQ wake-up capabilities (either through return
from wfi or reset from coprocessor), how it is implemented it does not
matter.

> > +
> > +Idle state parameters (eg entry latency) are platform specific and need to be
> > +characterized with bindings that provide the required information to OSPM
> > +code so that it can build the required tables and use them at runtime.
> > +
> > +The device tree binding definition for ARM idle states is the subject of this
> > +document.
> 
> During last connect, we'd discussed that the small set of
> states here could apply to a single node, which can represent a cpu, a
> cluster with cache etc.  Then the complexities of the system power state
> would be represented using a heirarchy, with each node in the
> tree having its own state from the list above.  This would allow
> a fairly rich system state while having just a few power states defined
> at each level.  Is this how you're intending these bindings to go?

Yes, CPUs point at states that can be reached by that CPU (eg a CPU core
gating state is represented by a single node pointed at by all CPUs in the
system - it the same state data, different power domains though).

States are hierarchical, which means that a parent state implies entry on all
substates that's how cluster states are defined.

> > +===========================================
> > +2 - state node
> > +===========================================
> 
> should the section numbering be incremented here?  Or is this a
> subsection?  2.1?

Yes.

> Also, would it be nice to have a name field for each state?

There is:

"The state node name shall be "stateN", where N = {0, 1, ...} is
the node number; state nodes which are siblings within a single
common parent node must be given a unique and sequential N value,
starting from 0."

I can remove the rule and allow people to call states as they wish
since they already have a compatible property to match them.

Actually, states can be called with any name, provided it is unique.

> > +       A state node can contain state child nodes. A state node with
> > +       children represents a hierarchical state, which is a superset of
> > +       the child states. Hierarchical states require all CPUs on which
> > +       they are valid to request the state in order for it to be entered.
> 
> Is it possible for a cpu to request a deeper state and unblock other cpus
> from entering this state?

That's not DT bindings business, hierarchical states define states that
require all CPUs on which they are valid to enter that state for it to
be considered "enabled".

It is a hard nut to crack. In x86 this stuff does not exist and it is
managed in HW, basically an idle state is always per-cpu (but it might
end up becoming a package state when all CPUs in a package enter that
state). On ARM, we want to define hierarchical states explicitly to
link resources (ie caches) to them.

CPUs are not prevented from requesting a hierarchical state, but the
state only becomes "enabled" when all CPUs on which it is valid request
it.

I cannot think of any other way of to express this properly but still in
a compact way.

> "all CPUs on which they are valid" is this determined by seeing which
> state's phandle is in each cpu->cpu-idle-states?

Yes, does it make sense ?

> > +
> > +       A state node defines the following properties:
> ...
> > +       - index
> > +               Usage: Required
> > +               Value type: <u32>
> > +               Definition: It represents an idle state index, starting from 2.
> > +                           Index 0 represents the processor state "running"
> > +                           and index 1 represents processor mode
> > +                           "idle_standby", entered by executing a wfi
> > +                           instruction (SBSA,[3]); indexes 0 and 1 are
> > +                           standard ARM states that need not be described.
> 
> Do you think maybe something like this might be clearer?

Yes it is.

> Definition: It represents an idle state index.
> 
>         Index 0 and 1 shall not be specified and are reserved for ARM
>         states where index 0 is running, and index 1 is idle_standby
>         entered by executing a wfi instruction (SBSA,[3])
> 
> What mechanism is used to order the power states WRT power consumption?

I think we should use index for that. The higher the index the lower the
power consumption.

> > +       - entry-method
> > +               Usage: Required
> > +               Value type: <stringlist>
> > +               Definition: Describes the method by which a CPU enters the
> > +                           idle state. This property is required and must be
> > +                           one of:
> > +
> > +                           - "arm,psci-cpu-suspend"
> > +                             ARM PSCI firmware interface, CPU suspend
> > +                             method[2].
> 
> Can psci-cpu-suspend be assumed if entry-method is omitted?

No.

> Can this field be used to combine both psci and non-psci states in any order?

No. I will enforce a unique entry method.

> > +       - power-state
> > +               Usage: See definition.
> > +               Value type: <u32>
> > +               Definition: Depends on the entry-method property value.
> > +                           If entry-method is "arm,psci-cpu-suspend":
> > +                               # Property is required and represents
> > +                                 psci-power-state parameter. Please refer to
> > +                                 [2] for PSCI bindings definition.
> 
> Examples use psci-power-state..

Typo, sorry, it is not C unfortunately...

> If we call this something like entry-method-param rather than power-state,
> would this allow the field to be more flexible?  Is flexibility here a goal?

Yes, I can call it like that.

>         - power-state
>                 Usage: See definition.
>                 Value type: <u32>
>                 Definition:  Parameter to pass to the entry method when
>                         this state is being entered.
>                         If entry-method is "arm,psci-cpu-suspend",
>                         this parameter represents the psci-power-state
>                         parameter. Please refer to [2] for PSCI bindings
>                         definition.
> 
> > +       - power-domains
> > +               Usage: Optional
> > +               Value type: <prop-encoded-array>
> > +               Definition: List of power domain specifiers ([1]) describing
> > +                           the power domains that are affected by the idle
> > +                           state entry.
> 
> How do you expect this information should be used?

This defines all power domains that are affected by the state entry.
It allows us to understand what caches, devices, whatnots have to be
acted upon state entry.

> I assume psci will be turning off the powerdomains not Linux right?

This is not a PSCI only document, and even if it was, we still need to deal
with devices. Which means we need to know what we have to save/restore (PMU,
arch timer, GIC), and power domains help us detect that.

> If so, is the structure above helpful for psci to associate the cpu
> requesting a state to the specific power domain in the power-domains list?
> Or is this all encoded in the parameter to PSCI suspend?  In that case,
> what is the utility?

See above.

> > +       - cache-state-retained
> > +               Usage: See definition
> > +               Value type: <none>
> > +               Definition: if present cache memory is retained on power down,
> > +                           otherwise it is lost.
> > +
> > +       - processor-state-retained
> > +               Usage: See definition
> > +               Value type: <none>
> > +               Definition: if present CPU processor logic is retained on
> > +                           power down, otherwise it is lost.
> 
> I don't see a good example of these two retained state properties.
> Isn't this the purpose of idle_retained state?  In the explanations, is
> the term 'power down' the same as sleep or off?

Ok, I will remove "power down" and replace it with "state entry".

> > +
> > +cpus {
> > +       #size-cells = <0>;
> > +       #address-cells = <2>;
> > +
> > +       cpu-idle-states {
> > +
> > +               STATE0: state0 {
> > +                       compatible = "arm,cpu-idle-state";
> > +                       index = <3>;
> 
> Are the index fields of nested states independent of each other or
> sequential?
> 
> ie:
>   -  does index=3 here mean pd_cluster is sleep state, and index=2
>      below mean the cpu cluster is idle_retention? (both SBSA states)
>   -  Or does index=3 here mean this state is the next cpu-idle state after
>      STATE0_1 below, which has index=2?  In this case, are the indexes
>      implied to be increasing in order of most power savings?

Forget index as a link to SBSA states indexes above I should have never listed
them as numbers. I understand index is misleading and either I remove it, or I
leave it there to define power savings scale as you mentioned.

> > +                       entry-method = "arm,psci-cpu-suspend";
> > +                       psci-power-state = <0x1010000>;
> > +                       entry-latency = <500>;
> > +                       exit-latency = <1000>;
> > +                       min-residency = <2500>;
> > +                       power-domains = <&pd_clusters 0>;
> > +                       STATE0_1: state0 {
> 
> Should this be STATE0_0?

Well, yes, it is a tag though so it can be whatever we want as long as
it is unique.

> > +                               compatible = "arm,cpu-idle-state";
> > +                               index = <2>;
> > +                               entry-method = "arm,psci-cpu-suspend";
> > +                               psci-power-state = <0x0010000>;
> > +                               entry-latency = <200>;
> > +                               exit-latency = <400>;
> > +                               min-residency = <300>;
> > +                               power-domains = <&pd_cores 0>,
> > +                                               <&pd_cores 1>,
>                                                   ...
> > +                                               <&pd_cores 7>;
> > +                       };
> > +               };
> 
> Would it be possible to add an example illustrating more
> complex cluster/cpu power states?  Maybe where both the cpus and
> cluster have multiple states (sleep/retention)?
> 
> The current example seems to show just the cpu idle_retention state,
> and the cluster off state.
> 
> Maybe an example of how you'd represent something like this with the new
> bindings:
> 
> (in increasing order of power saving)
> +-----------------+--------------------+-----------------------------------+
> |      cpu        |      cluster       |        Notes
> +-----------------+--------------------+-----------------------------------+
> |    running      |      running       |      not specified running
> |  idle_standby   |      running       |      not specified WFI
> | idle_retention  |      running       |
> | idle_retention  |   idle_retention   |
> |     sleep       |   idle_retention   |
> |     sleep       |       sleep        |
> +-----------------|--------------------+-----------------------------------+
> 
> The CPU has 4 states: running, idle standby, idle_retention and sleep.
> the first two are not specified per the instructions.
> 
> Then the cluster has 3 states: running, retention and sleep.
> 
> Maybe a complex case like this would be helpful to understand how the
> bindings should be used.

Ok, I will do that and rework the bindings accordingly.

Thanks,
Lorenzo
Sebastian Capella Feb. 14, 2014, 12:29 a.m. UTC | #5
Quoting Lorenzo Pieralisi (2014-02-13 04:47:32)

> Such as ? "idle-states" ?

That sounds good to me.  Our preference is for idle-states to be used
for name of the idle-states.txt node.

> > During last connect, we'd discussed that the small set of
> > states here could apply to a single node, which can represent a cpu, a
> > cluster with cache etc.  Then the complexities of the system power state
> > would be represented using a heirarchy, with each node in the
> > tree having its own state from the list above.  This would allow
> > a fairly rich system state while having just a few power states defined
> > at each level.  Is this how you're intending these bindings to go?
>
> Yes, CPUs point at states that can be reached by that CPU (eg a CPU core
> gating state is represented by a single node pointed at by all CPUs in the
> system - it the same state data, different power domains though).
> 
> States are hierarchical, which means that a parent state implies entry on all
> substates that's how cluster states are defined.

The cpu 0 node, in its cpu-idle-state array is also pointing at a cluster
node right?  STATE0 -> power-domains refers to <pd_clusters 0>.  Is it
correct that if the cpu's workload is such that it tolerates the
500/1000/2500 entry/exit latency/min-residency, it will call the
entry-method with the psci-power-state of 0x10100000 matching that node?


> > Also, would it be nice to have a name field for each state?
> 
> There is:
> 
> "The state node name shall be "stateN", where N = {0, 1, ...} is
> the node number; state nodes which are siblings within a single
> common parent node must be given a unique and sequential N value,
> starting from 0."
> I can remove the rule and allow people to call states as they wish
> since they already have a compatible property to match them.
> 
> Actually, states can be called with any name, provided it is unique.

Sorry, I was referring to a descriptive name string property.  Something
that can be useful to help a human understand what's going on.  Naming
the nodes might work too.

> > > +       A state node can contain state child nodes. A state node with
> > > +       children represents a hierarchical state, which is a superset of
> > > +       the child states. Hierarchical states require all CPUs on which
> > > +       they are valid to request the state in order for it to be entered.
> > 
> > Is it possible for a cpu to request a deeper state and unblock other cpus
> > from entering this state?
> 
> That's not DT bindings business, hierarchical states define states that
> require all CPUs on which they are valid to enter that state for it to
> be considered "enabled".
> 
> It is a hard nut to crack. In x86 this stuff does not exist and it is
> managed in HW, basically an idle state is always per-cpu (but it might
> end up becoming a package state when all CPUs in a package enter that
> state). On ARM, we want to define hierarchical states explicitly to
> link resources (ie caches) to them.
> 
> CPUs are not prevented from requesting a hierarchical state, but the
> state only becomes "enabled" when all CPUs on which it is valid request
> it.

The way we've seen it work, is a cpu tries to enter the lowest state it can
tolerate (including for eg. cluster off).  The system level code then looks
at the allowable states from all the cpus and selects the lowest power
state permissible for the heirarchies.

Eg. this way, the last cpu preventing a cluster from going to sleep enters
cluster sleep state (where peer cpu's have already voted for cluster
sleep) and the cluster will sleep.  Alternately if the same cpu cannot
tolerate the additional latency of the cluster sleeping, it will vote
only for cpu-sleep, then the cluster remains in a state where all cpu's are
sleeping but the cluster remains up.

> I cannot think of any other way of to express this properly but still in
> a compact way.

You're doing a great job by the way!  Thanks! :)

> > "all CPUs on which they are valid" is this determined by seeing which
> > state's phandle is in each cpu->cpu-idle-states?
> 
> Yes, does it make sense ?

Yup, maybe adding a little parenthetical text like below may help others
as well?

Hierarchical states require all CPUs on which they are valid (ie. CPU nodes
containing cpu-idle-state arrays having a phandle reference to the state)
to request the state in order for it to be entered.

> 
> > Definition: It represents an idle state index.
> > 
> >         Index 0 and 1 shall not be specified and are reserved for ARM
> >         states where index 0 is running, and index 1 is idle_standby
> >         entered by executing a wfi instruction (SBSA,[3])
> > 
> > What mechanism is used to order the power states WRT power consumption?
> 
> I think we should use index for that. The higher the index the lower the
> power consumption.

Ok, I think I get it now.
If we decouple index and SBSA states, do we really need to reserve index
0 and 1 then?

> > Can this field be used to combine both psci and non-psci states in any order?
> 
> No. I will enforce a unique entry method.

So a single entry-method for all  states in idle-states node?
Should this be specified once only then?

Should we not allow more flexibility here to mix methods where some
states use one method and some use others?  Is this mostly a security
concern?  What if a vendor wants to introduce a state between wfi and
cpu-sleep?

> > I assume psci will be turning off the powerdomains not Linux right?
> 
> This is not a PSCI only document, and even if it was, we still need to deal
> with devices. Which means we need to know what we have to save/restore (PMU,
> arch timer, GIC), and power domains help us detect that.

> > > +
> > > +cpus {
> > > +       #size-cells = <0>;
> > > +       #address-cells = <2>;
> > > +
> > > +       cpu-idle-states {
> > > +
> > > +               STATE0: state0 {
> > > +                       compatible = "arm,cpu-idle-state";
> > > +                       index = <3>;
> > 
> > Are the index fields of nested states independent of each other or
> > sequential?
> ...
> I understand index is misleading and either I remove it, or I
> leave it there to define power savings scale as you mentioned.

I like index, but I was confused.  You cleared it up pretty quick with
your earlier comment.  Removing the numbering may help, but I think
keeping the index is useful.

Thanks!

Sebastian
Lorenzo Pieralisi Feb. 14, 2014, 11:27 a.m. UTC | #6
On Fri, Feb 14, 2014 at 12:29:26AM +0000, Sebastian Capella wrote:
> Quoting Lorenzo Pieralisi (2014-02-13 04:47:32)
> 
> > Such as ? "idle-states" ?
> 
> That sounds good to me.  Our preference is for idle-states to be used
> for name of the idle-states.txt node.

Ok, so s/cpu-idle-state/idle-state everywhere, inclusive of state nodes
compatible properties ("arm,cpu-idle-state" becomes "arm,idle-state") ?

> > > During last connect, we'd discussed that the small set of
> > > states here could apply to a single node, which can represent a cpu, a
> > > cluster with cache etc.  Then the complexities of the system power state
> > > would be represented using a heirarchy, with each node in the
> > > tree having its own state from the list above.  This would allow
> > > a fairly rich system state while having just a few power states defined
> > > at each level.  Is this how you're intending these bindings to go?
> >
> > Yes, CPUs point at states that can be reached by that CPU (eg a CPU core
> > gating state is represented by a single node pointed at by all CPUs in the
> > system - it the same state data, different power domains though).
> > 
> > States are hierarchical, which means that a parent state implies entry on all
> > substates that's how cluster states are defined.
> 
> The cpu 0 node, in its cpu-idle-state array is also pointing at a cluster
> node right?  STATE0 -> power-domains refers to <pd_clusters 0>.  Is it
> correct that if the cpu's workload is such that it tolerates the
> 500/1000/2500 entry/exit latency/min-residency, it will call the
> entry-method with the psci-power-state of 0x10100000 matching that node?

Yes. As we know, since it is a cluster state, that might happen or not
depending on the state of other CPUs.

> > > Also, would it be nice to have a name field for each state?
> > 
> > There is:
> > 
> > "The state node name shall be "stateN", where N = {0, 1, ...} is
> > the node number; state nodes which are siblings within a single
> > common parent node must be given a unique and sequential N value,
> > starting from 0."
> > I can remove the rule and allow people to call states as they wish
> > since they already have a compatible property to match them.
> > 
> > Actually, states can be called with any name, provided it is unique.
> 
> Sorry, I was referring to a descriptive name string property.  Something
> that can be useful to help a human understand what's going on.  Naming
> the nodes might work too.

I do not like that, but I can remove the naming scheme and let people
find a naming scheme that complies with DT rules (nodes within a parent
must have a unique name). Not sure this would make dts more readable, but
I do not think it is a problem either.

> > > > +       A state node can contain state child nodes. A state node with
> > > > +       children represents a hierarchical state, which is a superset of
> > > > +       the child states. Hierarchical states require all CPUs on which
> > > > +       they are valid to request the state in order for it to be entered.
> > > 
> > > Is it possible for a cpu to request a deeper state and unblock other cpus
> > > from entering this state?
> > 
> > That's not DT bindings business, hierarchical states define states that
> > require all CPUs on which they are valid to enter that state for it to
> > be considered "enabled".
> > 
> > It is a hard nut to crack. In x86 this stuff does not exist and it is
> > managed in HW, basically an idle state is always per-cpu (but it might
> > end up becoming a package state when all CPUs in a package enter that
> > state). On ARM, we want to define hierarchical states explicitly to
> > link resources (ie caches) to them.
> > 
> > CPUs are not prevented from requesting a hierarchical state, but the
> > state only becomes "enabled" when all CPUs on which it is valid request
> > it.
> 
> The way we've seen it work, is a cpu tries to enter the lowest state it can
> tolerate (including for eg. cluster off).  The system level code then looks
> at the allowable states from all the cpus and selects the lowest power
> state permissible for the heirarchies.
> 
> Eg. this way, the last cpu preventing a cluster from going to sleep enters
> cluster sleep state (where peer cpu's have already voted for cluster
> sleep) and the cluster will sleep.  Alternately if the same cpu cannot
> tolerate the additional latency of the cluster sleeping, it will vote
> only for cpu-sleep, then the cluster remains in a state where all cpu's are
> sleeping but the cluster remains up.

That's exactly what these bindings are meant to describe too and I think they
do. There is also loads of information that can be used for tuning (what
caches are affected by an idle state, what CPUs "share" an idle-state
and so on).

> > I cannot think of any other way of to express this properly but still in
> > a compact way.
> 
> You're doing a great job by the way!  Thanks! :)

Thank you, we are almost there.

> > > "all CPUs on which they are valid" is this determined by seeing which
> > > state's phandle is in each cpu->cpu-idle-states?
> > 
> > Yes, does it make sense ?
> 
> Yup, maybe adding a little parenthetical text like below may help others
> as well?
> 
> Hierarchical states require all CPUs on which they are valid (ie. CPU nodes
> containing cpu-idle-state arrays having a phandle reference to the state)
> to request the state in order for it to be entered.

OK, applied.

> > 
> > > Definition: It represents an idle state index.
> > > 
> > >         Index 0 and 1 shall not be specified and are reserved for ARM
> > >         states where index 0 is running, and index 1 is idle_standby
> > >         entered by executing a wfi instruction (SBSA,[3])
> > > 
> > > What mechanism is used to order the power states WRT power consumption?
> > 
> > I think we should use index for that. The higher the index the lower the
> > power consumption.
> 
> Ok, I think I get it now.
> If we decouple index and SBSA states, do we really need to reserve index
> 0 and 1 then?

What I will do: move the entry-method to top-level cpu-idle-states node
(soon to be idle-states node) and add a property there:

"arm,has-idlewfi"

which allows me to prevent people from adding an explicit state that just
executes the wfi instruction on entry.

This way we can have a *global* entry-method, and we can also detect if the
platform supports wfi in its bare form.

Yes, index can start from 0, disallowing 0 and 1 was a (odd) way to prevent
people from adding wfi to DT. If the platform supports simple idlestandby
(ie wfi) it has to add the boolean property above.

How does that sound ?

> > > Can this field be used to combine both psci and non-psci states in any order?
> > 
> > No. I will enforce a unique entry method.
> 
> So a single entry-method for all  states in idle-states node?
> Should this be specified once only then?

Yes, see above.

> Should we not allow more flexibility here to mix methods where some
> states use one method and some use others?  Is this mostly a security
> concern?  What if a vendor wants to introduce a state between wfi and
> cpu-sleep?

entry-method-param is the way to differentiate. One interface/entry-method
(PSCI or vendor specific), different state parameters.

We are not installing multiple back-ends to enter different idle states,
are we ? That would be awkward.

ACK ?

> > > I assume psci will be turning off the powerdomains not Linux right?
> > 
> > This is not a PSCI only document, and even if it was, we still need to deal
> > with devices. Which means we need to know what we have to save/restore (PMU,
> > arch timer, GIC), and power domains help us detect that.
> 
> > > > +
> > > > +cpus {
> > > > +       #size-cells = <0>;
> > > > +       #address-cells = <2>;
> > > > +
> > > > +       cpu-idle-states {
> > > > +
> > > > +               STATE0: state0 {
> > > > +                       compatible = "arm,cpu-idle-state";
> > > > +                       index = <3>;
> > > 
> > > Are the index fields of nested states independent of each other or
> > > sequential?
> > ...
> > I understand index is misleading and either I remove it, or I
> > leave it there to define power savings scale as you mentioned.
> 
> I like index, but I was confused.  You cleared it up pretty quick with
> your earlier comment.  Removing the numbering may help, but I think
> keeping the index is useful.

I will leave the index to sort the states according to power consumption.

Thanks !
Lorenzo
Sebastian Capella Feb. 15, 2014, 1:22 a.m. UTC | #7
Quoting Lorenzo Pieralisi (2014-02-14 03:27:56)
> On Fri, Feb 14, 2014 at 12:29:26AM +0000, Sebastian Capella wrote:
> > Quoting Lorenzo Pieralisi (2014-02-13 04:47:32)
> > 
> > > Such as ? "idle-states" ?
> > 
> > That sounds good to me.  Our preference is for idle-states to be used
> > for name of the idle-states.txt node.
> 
> Ok, so s/cpu-idle-state/idle-state everywhere, inclusive of state nodes
> compatible properties ("arm,cpu-idle-state" becomes "arm,idle-state") ?

No, we were looking to make sure that the cpu-idle-states defined in
cpus.txt:

- cpu node
   - cpu-idle-states

And the cpu-idle-states defined in idle-states.txt:

- cpu-idle-states node

Did not use the same name, and instead had different, unique names.

Our preference was to change only the idle-states.txt name.

> I do not like that, but I can remove the naming scheme and let people
> find a naming scheme that complies with DT rules (nodes within a parent
> must have a unique name). Not sure this would make dts more readable, but
> I do not think it is a problem either.

In the current implementation for cpuidle, we have a descriptive c-state
name.  As long as we can get this kind of functionality using the node
name this seems fine to me.

> That's exactly what these bindings are meant to describe too and I think they
> do. There is also loads of information that can be used for tuning (what
> caches are affected by an idle state, what CPUs "share" an idle-state
> and so on).

Great :)

> > > 
> > > > Definition: It represents an idle state index.
> > > > 
> > > >         Index 0 and 1 shall not be specified and are reserved for ARM
> > > >         states where index 0 is running, and index 1 is idle_standby
> > > >         entered by executing a wfi instruction (SBSA,[3])
> > > > 
> > > > What mechanism is used to order the power states WRT power consumption?
> > > 
> > > I think we should use index for that. The higher the index the lower the
> > > power consumption.
> > 
> > Ok, I think I get it now.
> > If we decouple index and SBSA states, do we really need to reserve index
> > 0 and 1 then?
> 
> What I will do: move the entry-method to top-level cpu-idle-states node
> (soon to be idle-states node) and add a property there:
> 
> "arm,has-idlewfi"
> 
> which allows me to prevent people from adding an explicit state that just
> executes the wfi instruction on entry.
> 
> This way we can have a *global* entry-method, and we can also detect if the
> platform supports wfi in its bare form.
> 
> Yes, index can start from 0, disallowing 0 and 1 was a (odd) way to prevent
> people from adding wfi to DT. If the platform supports simple idlestandby
> (ie wfi) it has to add the boolean property above.
> 
> How does that sound ?

In general I'm ok with indexing as you have it, even if only to specify
an ordering of states by power.  I even thought maybe you could call it
order or sort-order or something, to help people understand you won't
use it as an array index or name, but I don't think it's a big deal
either way.

Do platforms remove support for WFI?  If they do, is this detectible
somehow directly from the ARM without relying on DTS?  It seems like
a comment is enough to discourage people from defining a wfi state.
Then eventually implement a common code path for idle.  I'm fine not
specifying this flag, but if you feel it can be useful I don't object.


> > Should we not allow more flexibility here to mix methods where some
> > states use one method and some use others?  Is this mostly a security
> > concern?  What if a vendor wants to introduce a state between wfi and
> > cpu-sleep?
> 
> entry-method-param is the way to differentiate. One interface/entry-method
> (PSCI or vendor specific), different state parameters.
> 
> We are not installing multiple back-ends to enter different idle states,
> are we ? That would be awkward.
> 
> ACK ?

ACK, thanks!

Sebastian
Mark Brown Feb. 16, 2014, 1:39 a.m. UTC | #8
On Tue, Feb 11, 2014 at 02:17:53PM +0000, Lorenzo Pieralisi wrote:

> +According to the Server Base System Architecture document (SBSA, [3]), the
> +power states an ARM CPU can be put into are identified by the following list:

> +1 - Running
> +2 - Idle_standby
> +3 - Idle_retention
> +4 - Sleep
> +5 - Off

> +ARM platforms implement the power states specified in the SBSA through power
> +management schemes that allow an OS PM implementation to put the processor in
> +different idle states (which include states 1 to 4 above; "off" state is not
> +an idle state since it does not have wake-up capabilities, hence it is not
> +considered in this document).

This is explicitly talking about SBSA - is there any restriction with
regard to non-SBSA systems?  I can't think of any right now and this
seems purely informational but it might be worth mentioning that
non-SBSA systems might potentially have other states if the intention is
to allow that.

> +- state node
> +
> +	Description: must be child of either the cpu-idle-states node or
> +		     a state node.
> +
> +	The state node name shall be "stateN", where N = {0, 1, ...} is
> +	the node number; state nodes which are siblings within a single common
> +	parent node must be given a unique and sequential N value, starting
> +	from 0.

This came up with the CPU bindings as well but I'm really not convinced
that making the naming of the nodes important is great - it's not normal
for DT and it makes the usual enumeration code not work.  Can we not
just have a property for state number in the node instead and allow the
name to be anything?  It seems we even have the index property
already...

> +	- compatible
> +		Usage: Required
> +		Value type: <stringlist>
> +		Definition: Must be "arm,cpu-idle-state".

Do we really need this given that the location in the tree is fixed -
what would we extend it with in future?

> +	- index
> +		Usage: Required
> +		Value type: <u32>
> +		Definition: It represents an idle state index, starting from 2.
> +			    Index 0 represents the processor state "running"
> +			    and index 1 represents processor mode
> +			    "idle_standby", entered by executing a wfi
> +			    instruction (SBSA,[3]); indexes 0 and 1 are
> +			    standard ARM states that need not be described.

...but other numbers are valid.

> +	- entry-method
> +		Usage: Required
> +		Value type: <stringlist>
> +		Definition: Describes the method by which a CPU enters the
> +			    idle state. This property is required and must be
> +			    one of:
> +
> +			    - "arm,psci-cpu-suspend"
> +			      ARM PSCI firmware interface, CPU suspend
> +			      method[2].
> +
> +			    - "[vendor],[method]"
> +			      An implementation dependent string with
> +			      format "vendor,method", where vendor is a string
> +			      denoting the name of the manufacturer and
> +			      method is a string specifying the mechanism
> +			      used to enter the idle state.

Might be worth reversing these - arm,psci-cpu-suspend is just an example
of a (hopefully very widely used) vendor method.  Given that everyone is
supposed to be using PSCI might it even be worth making it the default
and the property optional?

> +	- power-state
> +		Usage: See definition.
> +		Value type: <u32>
> +		Definition: Depends on the entry-method property value.
> +			    If entry-method is "arm,psci-cpu-suspend":
> +				# Property is required and represents
> +				  psci-power-state parameter. Please refer to
> +				  [2] for PSCI bindings definition.

Should we just have the entry method nodes define their own properties
for this stuff?  I guess this goes back to the comment I made on some
other binding that it might be cleaner to have an explicit PSCI binding
rather than put PSCI explicitly in indiidual bindings.

> +	- entry-latency
> +		Usage: Required
> +		Value type: <prop-encoded-array>
> +		Definition: u32 value representing worst case latency
> +			    in microseconds required to enter the idle state.

Why is this defined as an array?

> +	- cache-state-retained
> +		Usage: See definition
> +		Value type: <none>
> +		Definition: if present cache memory is retained on power down,
> +			    otherwise it is lost.

Might be better to define which caches?

> +		STATE1: state1 {
> +			compatible = "arm,cpu-idle-state";

Even if we stick with the node name being meaningful it'd be nice to
replace the STATEN here with a meaningful state name to improve
legibility.
Lorenzo Pieralisi Feb. 17, 2014, 10:11 a.m. UTC | #9
On Sat, Feb 15, 2014 at 01:22:15AM +0000, Sebastian Capella wrote:
> Quoting Lorenzo Pieralisi (2014-02-14 03:27:56)
> > On Fri, Feb 14, 2014 at 12:29:26AM +0000, Sebastian Capella wrote:
> > > Quoting Lorenzo Pieralisi (2014-02-13 04:47:32)
> > > 
> > > > Such as ? "idle-states" ?
> > > 
> > > That sounds good to me.  Our preference is for idle-states to be used
> > > for name of the idle-states.txt node.
> > 
> > Ok, so s/cpu-idle-state/idle-state everywhere, inclusive of state nodes
> > compatible properties ("arm,cpu-idle-state" becomes "arm,idle-state") ?
> 
> No, we were looking to make sure that the cpu-idle-states defined in
> cpus.txt:
> 
> - cpu node
>    - cpu-idle-states
> 
> And the cpu-idle-states defined in idle-states.txt:
> 
> - cpu-idle-states node
> 
> Did not use the same name, and instead had different, unique names.
> 
> Our preference was to change only the idle-states.txt name.

Ok, done.

> > I do not like that, but I can remove the naming scheme and let people
> > find a naming scheme that complies with DT rules (nodes within a parent
> > must have a unique name). Not sure this would make dts more readable, but
> > I do not think it is a problem either.
> 
> In the current implementation for cpuidle, we have a descriptive c-state
> name.  As long as we can get this kind of functionality using the node
> name this seems fine to me.

Ok, now the naming scheme follows standard device tree naming, so it is
up to platforms to find descriptive names.

[...]

> > What I will do: move the entry-method to top-level cpu-idle-states node
> > (soon to be idle-states node) and add a property there:
> > 
> > "arm,has-idlewfi"
> > 
> > which allows me to prevent people from adding an explicit state that just
> > executes the wfi instruction on entry.
> > 
> > This way we can have a *global* entry-method, and we can also detect if the
> > platform supports wfi in its bare form.
> > 
> > Yes, index can start from 0, disallowing 0 and 1 was a (odd) way to prevent
> > people from adding wfi to DT. If the platform supports simple idlestandby
> > (ie wfi) it has to add the boolean property above.
> > 
> > How does that sound ?
> 
> In general I'm ok with indexing as you have it, even if only to specify
> an ordering of states by power.  I even thought maybe you could call it
> order or sort-order or something, to help people understand you won't
> use it as an array index or name, but I don't think it's a big deal
> either way.
> 
> Do platforms remove support for WFI?  If they do, is this detectible
> somehow directly from the ARM without relying on DTS?  It seems like
> a comment is enough to discourage people from defining a wfi state.
> Then eventually implement a common code path for idle.  I'm fine not
> specifying this flag, but if you feel it can be useful I don't object.

Yeah, wfi flag won't be there. If we ever need it we will add it later.

index seems fine to me, it is well defined and as long as we know what
it means it is all sorted, unless someone has strong opinion against it.

Thanks !!
Lorenzo
Lorenzo Pieralisi Feb. 17, 2014, 10:40 a.m. UTC | #10
On Sun, Feb 16, 2014 at 01:39:56AM +0000, Mark Brown wrote:
> On Tue, Feb 11, 2014 at 02:17:53PM +0000, Lorenzo Pieralisi wrote:
> 
> > +According to the Server Base System Architecture document (SBSA, [3]), the
> > +power states an ARM CPU can be put into are identified by the following list:
> 
> > +1 - Running
> > +2 - Idle_standby
> > +3 - Idle_retention
> > +4 - Sleep
> > +5 - Off
> 
> > +ARM platforms implement the power states specified in the SBSA through power
> > +management schemes that allow an OS PM implementation to put the processor in
> > +different idle states (which include states 1 to 4 above; "off" state is not
> > +an idle state since it does not have wake-up capabilities, hence it is not
> > +considered in this document).
> 
> This is explicitly talking about SBSA - is there any restriction with
> regard to non-SBSA systems?  I can't think of any right now and this
> seems purely informational but it might be worth mentioning that
> non-SBSA systems might potentially have other states if the intention is
> to allow that.

It is informational, for nomenclature reasons. I do not think we have a
problem here, I will check if rewording is necessary.

> > +- state node
> > +
> > +	Description: must be child of either the cpu-idle-states node or
> > +		     a state node.
> > +
> > +	The state node name shall be "stateN", where N = {0, 1, ...} is
> > +	the node number; state nodes which are siblings within a single common
> > +	parent node must be given a unique and sequential N value, starting
> > +	from 0.
> 
> This came up with the CPU bindings as well but I'm really not convinced
> that making the naming of the nodes important is great - it's not normal
> for DT and it makes the usual enumeration code not work.  Can we not
> just have a property for state number in the node instead and allow the
> name to be anything?  It seems we even have the index property
> already...

Name can be anything now, acked.
 
> > +	- compatible
> > +		Usage: Required
> > +		Value type: <stringlist>
> > +		Definition: Must be "arm,cpu-idle-state".
> 
> Do we really need this given that the location in the tree is fixed -
> what would we extend it with in future?

I do not think it hurts either, honestly. Unless there is a strong
reason to remove it I would leave it there.

> > +	- index
> > +		Usage: Required
> > +		Value type: <u32>
> > +		Definition: It represents an idle state index, starting from 2.
> > +			    Index 0 represents the processor state "running"
> > +			    and index 1 represents processor mode
> > +			    "idle_standby", entered by executing a wfi
> > +			    instruction (SBSA,[3]); indexes 0 and 1 are
> > +			    standard ARM states that need not be described.
> 
> ...but other numbers are valid.

Now it is sequential {0,1,....} and I defined what it means in terms of
power consumption (ordering).

> > +	- entry-method
> > +		Usage: Required
> > +		Value type: <stringlist>
> > +		Definition: Describes the method by which a CPU enters the
> > +			    idle state. This property is required and must be
> > +			    one of:
> > +
> > +			    - "arm,psci-cpu-suspend"
> > +			      ARM PSCI firmware interface, CPU suspend
> > +			      method[2].
> > +
> > +			    - "[vendor],[method]"
> > +			      An implementation dependent string with
> > +			      format "vendor,method", where vendor is a string
> > +			      denoting the name of the manufacturer and
> > +			      method is a string specifying the mechanism
> > +			      used to enter the idle state.
> 
> Might be worth reversing these - arm,psci-cpu-suspend is just an example
> of a (hopefully very widely used) vendor method.  Given that everyone is
> supposed to be using PSCI might it even be worth making it the default
> and the property optional?

I think it is ok as it is, honestly. It is certainly not through this
property that psci will be mandated, it is just a way to describe an
entry method and it seems fine to me.

> > +	- power-state
> > +		Usage: See definition.
> > +		Value type: <u32>
> > +		Definition: Depends on the entry-method property value.
> > +			    If entry-method is "arm,psci-cpu-suspend":
> > +				# Property is required and represents
> > +				  psci-power-state parameter. Please refer to
> > +				  [2] for PSCI bindings definition.
> 
> Should we just have the entry method nodes define their own properties
> for this stuff?  I guess this goes back to the comment I made on some
> other binding that it might be cleaner to have an explicit PSCI binding
> rather than put PSCI explicitly in indiidual bindings.

I added to the PSCI bindings in this series. OK, I can add something like:
"Refer to entry-method bindings for the property value definition".

> > +	- entry-latency
> > +		Usage: Required
> > +		Value type: <prop-encoded-array>
> > +		Definition: u32 value representing worst case latency
> > +			    in microseconds required to enter the idle state.
> 
> Why is this defined as an array?

For possible extensions.

> > +	- cache-state-retained
> > +		Usage: See definition
> > +		Value type: <none>
> > +		Definition: if present cache memory is retained on power down,
> > +			    otherwise it is lost.
> 
> Might be better to define which caches?
> 
> > +		STATE1: state1 {
> > +			compatible = "arm,cpu-idle-state";
> 
> Even if we stick with the node name being meaningful it'd be nice to
> replace the STATEN here with a meaningful state name to improve
> legibility.  

Ok I will add this to v4.

Thanks,
Lorenzo
Lorenzo Pieralisi Feb. 17, 2014, 4:34 p.m. UTC | #11
On Sun, Feb 16, 2014 at 01:39:56AM +0000, Mark Brown wrote:

[...]

> > +	- cache-state-retained
> > +		Usage: See definition
> > +		Value type: <none>
> > +		Definition: if present cache memory is retained on power down,
> > +			    otherwise it is lost.
> 
> Might be better to define which caches?

I do not expect caches in the same power domain to have different retention
capabilities, so a flag per-state should be enough. If anyone is unhappy
about this please flag it up. List of caches affected can be retrieved by
walking the power-domain specifiers and check those against the caches power
domains.

Thanks,
Lorenzo
Mark Brown Feb. 17, 2014, 11:48 p.m. UTC | #12
On Mon, Feb 17, 2014 at 04:34:21PM +0000, Lorenzo Pieralisi wrote:
> On Sun, Feb 16, 2014 at 01:39:56AM +0000, Mark Brown wrote:

> > > +	- cache-state-retained
> > > +		Usage: See definition
> > > +		Value type: <none>
> > > +		Definition: if present cache memory is retained on power down,
> > > +			    otherwise it is lost.

> > Might be better to define which caches?

> I do not expect caches in the same power domain to have different retention
> capabilities, so a flag per-state should be enough. If anyone is unhappy
> about this please flag it up. List of caches affected can be retrieved by
> walking the power-domain specifiers and check those against the caches power
> domains.

OK, so it's caches located within the power domains referenced by the
state node?  Might be good to say that say that just for clarity since
the power domains are indirected.
diff mbox

Patch

diff --git a/Documentation/devicetree/bindings/arm/cpus.txt b/Documentation/devicetree/bindings/arm/cpus.txt
index 9130435..fb7f008 100644
--- a/Documentation/devicetree/bindings/arm/cpus.txt
+++ b/Documentation/devicetree/bindings/arm/cpus.txt
@@ -191,6 +191,13 @@  nodes to be present and contain the properties described below.
 			  property identifying a 64-bit zero-initialised
 			  memory location.
 
+	- cpu-idle-states
+		Usage: Optional
+		Value type: <prop-encoded-array>
+		Definition:
+			# List of phandles to cpu idle state nodes supported
+			  by this cpu [1].
+
 Example 1 (dual-cluster big.LITTLE system 32-bit):
 
 	cpus {
@@ -382,3 +389,6 @@  cpus {
 		cpu-release-addr = <0 0x20000000>;
 	};
 };
+
+[1] ARM Linux kernel documentation - idle state bindings
+    Documentation/devicetree/bindings/arm/idle-states.txt
diff --git a/Documentation/devicetree/bindings/arm/idle-states.txt b/Documentation/devicetree/bindings/arm/idle-states.txt
new file mode 100644
index 0000000..f155e70
--- /dev/null
+++ b/Documentation/devicetree/bindings/arm/idle-states.txt
@@ -0,0 +1,690 @@ 
+==========================================
+ARM idle states binding description
+==========================================
+
+==========================================
+1 - Introduction
+==========================================
+
+ARM systems contain HW capable of managing power consumption dynamically,
+where cores can be put in different low-power states (ranging from simple
+wfi to power gating) according to OSPM policies. The CPU states representing
+the range of dynamic idle states that a processor can enter at run-time, can be
+specified through device tree bindings representing the parameters required
+to enter/exit specific idle states on a given processor.
+
+According to the Server Base System Architecture document (SBSA, [3]), the
+power states an ARM CPU can be put into are identified by the following list:
+
+1 - Running
+2 - Idle_standby
+3 - Idle_retention
+4 - Sleep
+5 - Off
+
+ARM platforms implement the power states specified in the SBSA through power
+management schemes that allow an OS PM implementation to put the processor in
+different idle states (which include states 1 to 4 above; "off" state is not
+an idle state since it does not have wake-up capabilities, hence it is not
+considered in this document).
+
+Idle state parameters (eg entry latency) are platform specific and need to be
+characterized with bindings that provide the required information to OSPM
+code so that it can build the required tables and use them at runtime.
+
+The device tree binding definition for ARM idle states is the subject of this
+document.
+
+===========================================
+2 - cpu-idle-states node
+===========================================
+
+ARM processor idle states are defined within the cpu-idle-states node, which is
+a direct child of the cpus node and provides a container where the processor
+states, defined as device tree nodes, are listed.
+
+- cpu-idle-states node
+
+	Usage: Optional - On ARM systems, is a container of processor idle
+			  states nodes. If the system does not provide CPU
+			  power management capabilities or the processor just
+			  supports idle_standby a cpu-idle-states node is not
+			  required.
+
+	Description: cpu-idle-states node is a container node, where its
+		     subnodes describe the CPU idle states.
+
+	Node name must be "cpu-idle-states".
+
+	The cpu-idle-states node's parent node must be the cpus node.
+
+	The cpu-idle-states node's child nodes can be:
+
+	- one or more state nodes
+
+	Any other configuration is considered invalid.
+
+The nodes describing the idle states (state) can only be defined within the
+cpu-idle-states node.
+
+Any other configuration is consider invalid and therefore must be ignored.
+
+===========================================
+2 - state node
+===========================================
+
+A state node represents an idle state description and must be defined as
+follows:
+
+- state node
+
+	Description: must be child of either the cpu-idle-states node or
+		     a state node.
+
+	The state node name shall be "stateN", where N = {0, 1, ...} is
+	the node number; state nodes which are siblings within a single common
+	parent node must be given a unique and sequential N value, starting
+	from 0.
+
+	A state node can contain state child nodes. A state node with
+	children represents a hierarchical state, which is a superset of
+	the child states. Hierarchical states require all CPUs on which
+	they are valid to request the state in order for it to be entered.
+
+	A state node defines the following properties:
+
+	- compatible
+		Usage: Required
+		Value type: <stringlist>
+		Definition: Must be "arm,cpu-idle-state".
+
+	- index
+		Usage: Required
+		Value type: <u32>
+		Definition: It represents an idle state index, starting from 2.
+			    Index 0 represents the processor state "running"
+			    and index 1 represents processor mode
+			    "idle_standby", entered by executing a wfi
+			    instruction (SBSA,[3]); indexes 0 and 1 are
+			    standard ARM states that need not be described.
+
+	- entry-method
+		Usage: Required
+		Value type: <stringlist>
+		Definition: Describes the method by which a CPU enters the
+			    idle state. This property is required and must be
+			    one of:
+
+			    - "arm,psci-cpu-suspend"
+			      ARM PSCI firmware interface, CPU suspend
+			      method[2].
+
+			    - "[vendor],[method]"
+			      An implementation dependent string with
+			      format "vendor,method", where vendor is a string
+			      denoting the name of the manufacturer and
+			      method is a string specifying the mechanism
+			      used to enter the idle state.
+
+	- power-state
+		Usage: See definition.
+		Value type: <u32>
+		Definition: Depends on the entry-method property value.
+			    If entry-method is "arm,psci-cpu-suspend":
+				# Property is required and represents
+				  psci-power-state parameter. Please refer to
+				  [2] for PSCI bindings definition.
+
+	- entry-latency
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing worst case latency
+			    in microseconds required to enter the idle state.
+
+	- exit-latency
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing worst case latency
+			    in microseconds required to exit the idle state.
+
+	- min-residency
+		Usage: Required
+		Value type: <prop-encoded-array>
+		Definition: u32 value representing time in microseconds
+			    required for the CPU to be in the idle state to
+			    make up for the dynamic power consumed to
+			    enter/exit the idle state in order to break even
+			    in terms of power consumption compared to idle
+			    state index 1 (idle_standby).
+
+	- power-domains
+		Usage: Optional
+		Value type: <prop-encoded-array>
+		Definition: List of power domain specifiers ([1]) describing
+			    the power domains that are affected by the idle
+			    state entry.
+
+	- cache-state-retained
+		Usage: See definition
+		Value type: <none>
+		Definition: if present cache memory is retained on power down,
+			    otherwise it is lost.
+
+	- processor-state-retained
+		Usage: See definition
+		Value type: <none>
+		Definition: if present CPU processor logic is retained on
+			    power down, otherwise it is lost.
+
+===========================================
+3 - Examples
+===========================================
+
+Example 1 (ARM 64-bit, 16-cpu system):
+
+pd_clusters: power-domain-clusters@80002000 {
+	compatible = "arm,power-controller";
+	reg = <0x0 0x80002000 0x0 0x1000>;
+	#power-domain-cells = <1>;
+	#address-cells = <2>;
+	#size-cells = <2>;
+
+	pd_cores: power-domain-cores@80000000 {
+		compatible = "arm,power-controller";
+		reg = <0x0 0x80000000 0x0 0x1000>;
+		#power-domain-cells = <1>;
+	};
+};
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <2>;
+
+	cpu-idle-states {
+
+		STATE0: state0 {
+			compatible = "arm,cpu-idle-state";
+			index = <3>;
+			entry-method = "arm,psci-cpu-suspend";
+			psci-power-state = <0x1010000>;
+			entry-latency = <500>;
+			exit-latency = <1000>;
+			min-residency = <2500>;
+			power-domains = <&pd_clusters 0>;
+			STATE0_1: state0 {
+				compatible = "arm,cpu-idle-state";
+				index = <2>;
+				entry-method = "arm,psci-cpu-suspend";
+				psci-power-state = <0x0010000>;
+				entry-latency = <200>;
+				exit-latency = <400>;
+				min-residency = <300>;
+				power-domains = <&pd_cores 0>,
+						<&pd_cores 1>,
+						<&pd_cores 2>,
+						<&pd_cores 3>,
+						<&pd_cores 4>,
+						<&pd_cores 5>,
+						<&pd_cores 6>,
+						<&pd_cores 7>;
+			};
+		};
+
+		STATE1: state1 {
+			compatible = "arm,cpu-idle-state";
+			index = <3>;
+			entry-method = "arm,psci-cpu-suspend";
+			psci-power-state = <0x1010000>;
+			entry-latency = <1000>;
+			exit-latency = <3000>;
+			min-residency = <6500>;
+			power-domains = <&pd_clusters 1>;
+			STATE1_0: state0 {
+				compatible = "arm,cpu-idle-state";
+				index = <2>;
+				entry-method = "arm,psci-cpu-suspend";
+				psci-power-state = <0x0010000>;
+				entry-latency = <200>;
+				exit-latency = <400>;
+				min-residency = <500>;
+				power-domains = <&pd_cores 8>,
+						<&pd_cores 9>,
+						<&pd_cores 10>,
+						<&pd_cores 11>,
+						<&pd_cores 12>,
+						<&pd_cores 13>,
+						<&pd_cores 14>,
+						<&pd_cores 15>;
+			};
+		};
+	};
+
+	CPU0: cpu@0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x0>;
+		enable-method = "psci";
+		next-level-cache = <&L1_0>;
+		cpu-idle-states = <&STATE0_1 &STATE0>;
+		L1_0: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_0>;
+			power-domain = <&pd_cores 0>;
+		};
+		L2_0: l2-cache {
+			compatible = "arm,arch-cache";
+			power-domain = <&pd_clusters 0>;
+		};
+	};
+
+	CPU1: cpu@1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x1>;
+		enable-method = "psci";
+		next-level-cache = <&L1_1>;
+		cpu-idle-states = <&STATE0_1 &STATE0>;
+		L1_1: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_0>;
+			power-domain = <&pd_cores 1>;
+		};
+	};
+
+	CPU2: cpu@100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x100>;
+		enable-method = "psci";
+		next-level-cache = <&L1_2>;
+		cpu-idle-states = <&STATE0_1 &STATE0>;
+		L1_2: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_0>;
+			power-domain = <&pd_cores 2>;
+		};
+	};
+
+	CPU3: cpu@101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x101>;
+		enable-method = "psci";
+		next-level-cache = <&L1_3>;
+		cpu-idle-states = <&STATE0_1 &STATE0>;
+		L1_3: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_0>;
+			power-domain = <&pd_cores 3>;
+		};
+	};
+
+	CPU4: cpu@10000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10000>;
+		enable-method = "psci";
+		next-level-cache = <&L1_4>;
+		cpu-idle-states = <&STATE0_1 &STATE0>;
+		L1_4: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_0>;
+			power-domain = <&pd_cores 4>;
+		};
+	};
+
+	CPU5: cpu@10001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10001>;
+		enable-method = "psci";
+		next-level-cache = <&L1_5>;
+		cpu-idle-states = <&STATE0_1 &STATE0>;
+		L1_5: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_0>;
+			power-domain = <&pd_cores 5>;
+		};
+	};
+
+	CPU6: cpu@10100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10100>;
+		enable-method = "psci";
+		next-level-cache = <&L1_6>;
+		cpu-idle-states = <&STATE0_1 &STATE0>;
+		L1_6: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_0>;
+			power-domain = <&pd_cores 6>;
+		};
+	};
+
+	CPU7: cpu@10101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a57";
+		reg = <0x0 0x10101>;
+		enable-method = "psci";
+		next-level-cache = <&L1_7>;
+		cpu-idle-states = <&STATE0_1 &STATE0>;
+		L1_7: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_0>;
+			power-domain = <&pd_cores 7>;
+		};
+	};
+
+	CPU8: cpu@100000000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x0>;
+		enable-method = "psci";
+		next-level-cache = <&L1_8>;
+		cpu-idle-states = <&STATE1_0 &STATE1>;
+		L1_8: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_1>;
+			power-domain = <&pd_cores 8>;
+		};
+		L2_1: l2-cache {
+			compatible = "arm,arch-cache";
+			power-domain = <&pd_clusters 1>;
+		};
+	};
+
+	CPU9: cpu@100000001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x1>;
+		enable-method = "psci";
+		next-level-cache = <&L1_9>;
+		cpu-idle-states = <&STATE1_0 &STATE1>;
+		L1_9: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_1>;
+			power-domain = <&pd_cores 9>;
+		};
+	};
+
+	CPU10: cpu@100000100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x100>;
+		enable-method = "psci";
+		next-level-cache = <&L1_10>;
+		cpu-idle-states = <&STATE1_0 &STATE1>;
+		L1_10: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_1>;
+			power-domain = <&pd_cores 10>;
+		};
+	};
+
+	CPU11: cpu@100000101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x101>;
+		enable-method = "psci";
+		next-level-cache = <&L1_11>;
+		cpu-idle-states = <&STATE1_0 &STATE1>;
+		L1_11: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_1>;
+			power-domain = <&pd_cores 11>;
+		};
+	};
+
+	CPU12: cpu@100010000 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10000>;
+		enable-method = "psci";
+		next-level-cache = <&L1_12>;
+		cpu-idle-states = <&STATE1_0 &STATE1>;
+		L1_12: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_1>;
+			power-domain = <&pd_cores 12>;
+		};
+	};
+
+	CPU13: cpu@100010001 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10001>;
+		enable-method = "psci";
+		next-level-cache = <&L1_13>;
+		cpu-idle-states = <&STATE1_0 &STATE1>;
+		L1_13: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_1>;
+			power-domain = <&pd_cores 13>;
+		};
+	};
+
+	CPU14: cpu@100010100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10100>;
+		enable-method = "psci";
+		next-level-cache = <&L1_14>;
+		cpu-idle-states = <&STATE1_0 &STATE1>;
+		L1_14: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_1>;
+			power-domain = <&pd_cores 14>;
+		};
+	};
+
+	CPU15: cpu@100010101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a53";
+		reg = <0x1 0x10101>;
+		enable-method = "psci";
+		next-level-cache = <&L1_15>;
+		cpu-idle-states = <&STATE1_0 &STATE1>;
+		L1_15: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_1>;
+			power-domain = <&pd_cores 15>;
+		};
+	};
+};
+
+Example 2 (ARM 32-bit, 8-cpu system, two clusters):
+
+pd_clusters: power-domain-clusters@80002000 {
+	compatible = "arm,power-controller";
+	reg = <0x80002000 0x1000>;
+	#power-domain-cells = <1>;
+	#address-cells = <1>;
+	#size-cells = <1>;
+
+	pd_cores: power-domain-cores@80000000 {
+		compatible = "arm,power-controller";
+		reg = <0x80000000 0x1000>;
+		#power-domain-cells = <1>;
+	};
+};
+
+cpus {
+	#size-cells = <0>;
+	#address-cells = <1>;
+
+	cpu-idle-states {
+
+		STATE0: state0 {
+			compatible = "arm,cpu-idle-state";
+			index = <3>;
+			entry-method = "psci";
+			psci-power-state = <0x1010000>;
+			entry-latency = <1000>;
+			exit-latency = <1500>;
+			min-residency = <1500>;
+			power-domains = <&pd_clusters 0>;
+			STATE0_1: state0 {
+				compatible = "arm,cpu-idle-state";
+				index = <2>;
+				entry-method = "psci";
+				psci-power-state = <0x0010000>;
+				entry-latency = <400>;
+				exit-latency = <500>;
+				min-residency = <300>;
+				power-domains = <&pd_cores 0>,
+						<&pd_cores 1>,
+						<&pd_cores 2>,
+						<&pd_cores 3>;
+			};
+		};
+
+		STATE1: state1 {
+			compatible = "arm,cpu-idle-state";
+			index = <3>;
+			entry-method = "psci";
+			psci-power-state = <0x1010000>;
+			entry-latency = <800>;
+			exit-latency = <2000>;
+			min-residency = <6500>;
+			power-domains = <&pd_clusters 1>;
+			STATE1_0: state0 {
+				compatible = "arm,cpu-idle-state";
+				index = <2>;
+				entry-method = "psci";
+				psci-power-state = <0x0010000>;
+				entry-latency = <300>;
+				exit-latency = <500>;
+				min-residency = <500>;
+				power-domains = <&pd_cores 4>,
+						<&pd_cores 5>,
+						<&pd_cores 6>,
+						<&pd_cores 7>;
+			};
+		};
+	};
+
+	CPU0: cpu@0 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x0>;
+		next-level-cache = <&L1_0>;
+		cpu-idle-states = <&STATE0_1 &STATE0>;
+		L1_0: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_0>;
+			power-domain = <&pd_cores 0>;
+		};
+		L2_0: l2-cache {
+			compatible = "arm,arch-cache";
+			power-domain = <&pd_clusters 0>;
+		};
+	};
+
+	CPU1: cpu@1 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x1>;
+		next-level-cache = <&L1_1>;
+		cpu-idle-states = <&STATE0_1 &STATE0>;
+		L1_1: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_0>;
+			power-domain = <&pd_cores 1>;
+		};
+	};
+
+	CPU2: cpu@2 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x2>;
+		next-level-cache = <&L1_2>;
+		cpu-idle-states = <&STATE0_1 &STATE0>;
+		L1_2: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_0>;
+			power-domain = <&pd_cores 2>;
+		};
+	};
+
+	CPU3: cpu@3 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a15";
+		reg = <0x3>;
+		next-level-cache = <&L1_3>;
+		cpu-idle-states = <&STATE0_1 &STATE0>;
+		L1_3: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_0>;
+			power-domain = <&pd_cores 3>;
+		};
+	};
+
+	CPU4: cpu@100 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x100>;
+		next-level-cache = <&L1_4>;
+		cpu-idle-states = <&STATE1_0 &STATE1>;
+		L1_4: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_1>;
+			power-domain = <&pd_cores 4>;
+		};
+		L2_1: l2-cache {
+			compatible = "arm,arch-cache";
+			power-domain = <&pd_clusters 1>;
+		};
+	};
+
+	CPU5: cpu@101 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x101>;
+		next-level-cache = <&L1_5>;
+		cpu-idle-states = <&STATE1_0 &STATE1>;
+		L1_5: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_1>;
+			power-domain = <&pd_cores 5>;
+		};
+	};
+
+	CPU6: cpu@102 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x102>;
+		next-level-cache = <&L1_6>;
+		cpu-idle-states = <&STATE1_0 &STATE1>;
+		L1_6: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_1>;
+			power-domain = <&pd_cores 6>;
+		};
+	};
+
+	CPU7: cpu@103 {
+		device_type = "cpu";
+		compatible = "arm,cortex-a7";
+		reg = <0x103>;
+		next-level-cache = <&L1_7>;
+		cpu-idle-states = <&STATE1_0 &STATE1>;
+		L1_7: l1-cache {
+			compatible = "arm,arch-cache";
+			next-level-cache = <&L2_1>;
+			power-domain = <&pd_cores 7>;
+		};
+	};
+};
+
+===========================================
+4 - References
+===========================================
+
+[1] ARM Linux Kernel documentation - power domain bindings
+    Documentation/devicetree/bindings/power/power_domain.txt
+
+[2] ARM Linux Kernel documentation - PSCI bindings
+    Documentation/devicetree/bindings/arm/psci.txt
+
+[3] ARM Server Base System Architecture (SBSA)
+    http://infocenter.arm.com/help/index.jsp