diff mbox

[1/6] arm64: dts: amlogic: Add missing cooling device properties for CPUs

Message ID 2a2eb28da9fecf129f6bc0ab3d3748d9f4d25a29.1527225682.git.viresh.kumar@linaro.org (mailing list archive)
State New, archived
Headers show

Commit Message

Viresh Kumar May 25, 2018, 5:40 a.m. UTC
The cooling device properties, like "#cooling-cells" and
"dynamic-power-coefficient", should either be present for all the CPUs
of a cluster or none. If these are present only for a subset of CPUs of
a cluster then things will start falling apart as soon as the CPUs are
brought online in a different order. For example, this will happen
because the operating system looks for such properties in the CPU node
it is trying to bring up, so that it can register a cooling device.

Add such missing properties.

Signed-off-by: Viresh Kumar <viresh.kumar@linaro.org>
---
 .../boot/dts/amlogic/meson-gxm-khadas-vim2.dts     | 24 ++++++++++++++++++++++
 1 file changed, 24 insertions(+)

Comments

Olof Johansson May 25, 2018, 9:10 p.m. UTC | #1
On Fri, May 25, 2018 at 11:10:01AM +0530, Viresh Kumar wrote:
> The cooling device properties, like "#cooling-cells" and
> "dynamic-power-coefficient", should either be present for all the CPUs
> of a cluster or none. If these are present only for a subset of CPUs of
> a cluster then things will start falling apart as soon as the CPUs are
> brought online in a different order. For example, this will happen
> because the operating system looks for such properties in the CPU node
> it is trying to bring up, so that it can register a cooling device.
> 
> Add such missing properties.

This seems awkward compared to just having one cooling-cells in the /cpus node
instead.

What's it used for? I don't see any properties in the device nodes on meson-gxm
that have any cooling-foo cells in them? So why should #cooling-cells be
needed?


-Olof
Neil Armstrong May 26, 2018, 8:37 a.m. UTC | #2
Hi,

On 25/05/2018 23:10, Olof Johansson wrote:
> On Fri, May 25, 2018 at 11:10:01AM +0530, Viresh Kumar wrote:
>> The cooling device properties, like "#cooling-cells" and
>> "dynamic-power-coefficient", should either be present for all the CPUs
>> of a cluster or none. If these are present only for a subset of CPUs of
>> a cluster then things will start falling apart as soon as the CPUs are
>> brought online in a different order. For example, this will happen
>> because the operating system looks for such properties in the CPU node
>> it is trying to bring up, so that it can register a cooling device.
>>
>> Add such missing properties.
> 
> This seems awkward compared to just having one cooling-cells in the /cpus node
> instead.
> 
> What's it used for? I don't see any properties in the device nodes on meson-gxm
> that have any cooling-foo cells in them? So why should #cooling-cells be
> needed?


There is no reason to have the cooling-cells on these other CPUs, the DVFS is
controlled on the first CPU of each cluster, here cpu0 and cpu4 and only
cpu0 and cpu4 are used as cooling-cells.

Neil

> 
> 
> -Olof
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
>
Viresh Kumar May 28, 2018, 11:13 a.m. UTC | #3
On 25-05-18, 14:10, Olof Johansson wrote:
> On Fri, May 25, 2018 at 11:10:01AM +0530, Viresh Kumar wrote:
> > The cooling device properties, like "#cooling-cells" and
> > "dynamic-power-coefficient", should either be present for all the CPUs
> > of a cluster or none. If these are present only for a subset of CPUs of
> > a cluster then things will start falling apart as soon as the CPUs are
> > brought online in a different order. For example, this will happen
> > because the operating system looks for such properties in the CPU node
> > it is trying to bring up, so that it can register a cooling device.
> > 
> > Add such missing properties.
> 
> This seems awkward compared to just having one cooling-cells in the /cpus node
> instead.

Well, we don't allow that property to be present in /cpus node right
now and it is per device. And then we may not want all the CPUs to be
cooling devices really.

> What's it used for? I don't see any properties in the device nodes on meson-gxm
> that have any cooling-foo cells in them? So why should #cooling-cells be
> needed?

This property is required to declare a device as a cooling-device and
the device here is CPU. We use it as a cooling device by limiting its
higher range of frequencies, so that it doesn't generate too much
heat.

It is already there for CPU0 and CPU4, but it should really be there
for all the CPUs, like we have clock, supply, caches, etc.
Viresh Kumar May 28, 2018, 11:16 a.m. UTC | #4
On 26-05-18, 10:37, Neil Armstrong wrote:
> Hi,
> 
> On 25/05/2018 23:10, Olof Johansson wrote:
> > On Fri, May 25, 2018 at 11:10:01AM +0530, Viresh Kumar wrote:
> >> The cooling device properties, like "#cooling-cells" and
> >> "dynamic-power-coefficient", should either be present for all the CPUs
> >> of a cluster or none. If these are present only for a subset of CPUs of
> >> a cluster then things will start falling apart as soon as the CPUs are
> >> brought online in a different order. For example, this will happen
> >> because the operating system looks for such properties in the CPU node
> >> it is trying to bring up, so that it can register a cooling device.
> >>
> >> Add such missing properties.
> > 
> > This seems awkward compared to just having one cooling-cells in the /cpus node
> > instead.
> > 
> > What's it used for? I don't see any properties in the device nodes on meson-gxm
> > that have any cooling-foo cells in them? So why should #cooling-cells be
> > needed?
> 
> 
> There is no reason to have the cooling-cells on these other CPUs, the DVFS is
> controlled on the first CPU of each cluster, here cpu0 and cpu4 and only
> cpu0 and cpu4 are used as cooling-cells.

First, this is an incomplete definition of the hardware as all the
CPUs are cooling-devices here and DT shouldn't be written assuming how
OS will interpret it.

And then it is broken right now. You can offline your second cluster
(4567 CPUs) and bring CPU5 up first. You will see things breaking.

I have explained more in detail here.

https://marc.info/?l=linux-kernel&m=152750569414761
Olof Johansson June 2, 2018, 8:14 a.m. UTC | #5
On Mon, May 28, 2018 at 04:43:58PM +0530, Viresh Kumar wrote:
> On 25-05-18, 14:10, Olof Johansson wrote:
> > On Fri, May 25, 2018 at 11:10:01AM +0530, Viresh Kumar wrote:
> > > The cooling device properties, like "#cooling-cells" and
> > > "dynamic-power-coefficient", should either be present for all the CPUs
> > > of a cluster or none. If these are present only for a subset of CPUs of
> > > a cluster then things will start falling apart as soon as the CPUs are
> > > brought online in a different order. For example, this will happen
> > > because the operating system looks for such properties in the CPU node
> > > it is trying to bring up, so that it can register a cooling device.
> > > 
> > > Add such missing properties.
> > 
> > This seems awkward compared to just having one cooling-cells in the /cpus node
> > instead.
> 
> Well, we don't allow that property to be present in /cpus node right
> now and it is per device. And then we may not want all the CPUs to be
> cooling devices really.

And what I am saying is that it sounds like a broken binding if you don't allow
that, especially since it'll be a super common case that all CPUs will specify
the same cooling-device specifier.

> > What's it used for? I don't see any properties in the device nodes on meson-gxm
> > that have any cooling-foo cells in them? So why should #cooling-cells be
> > needed?
> 
> This property is required to declare a device as a cooling-device and
> the device here is CPU. We use it as a cooling device by limiting its
> higher range of frequencies, so that it doesn't generate too much
> heat.
> 
> It is already there for CPU0 and CPU4, but it should really be there
> for all the CPUs, like we have clock, supply, caches, etc.

You have #cooling-cells in the cpu node, but the actual data is in the
thermal-zones nodes. Why isn't #cooling-cells under thermal-zones, next to
cooling-maps?


-Olof
Viresh Kumar June 5, 2018, 4:37 a.m. UTC | #6
On 02-06-18, 01:14, Olof Johansson wrote:
> And what I am saying is that it sounds like a broken binding if you don't allow
> that, especially since it'll be a super common case that all CPUs will specify
> the same cooling-device specifier.

I am fine with allowing the #cooling-cells property in the cpus node if the DT
maintainers are fine with it.

@Rob: comments ?

@Olof: What about other properties which are still going to be duplicated for
the most common cases today, like: clocks, supply information, cache
information, cpu-idle-states and others. When we can duplicate these properties,
why not keep following the same for #cpu-cooling property ?

Note that the OPP table doesn't really need to get duplicated (for new
platforms) as the platforms use the v2 bindings now which just duplicates a
phandle assignment for all CPUs. Its a mess with older platforms which use the
earlier version of OPP table.

> > This property is required to declare a device as a cooling-device and
> > the device here is CPU. We use it as a cooling device by limiting its
> > higher range of frequencies, so that it doesn't generate too much
> > heat.
> > 
> > It is already there for CPU0 and CPU4, but it should really be there
> > for all the CPUs, like we have clock, supply, caches, etc.
> 
> You have #cooling-cells in the cpu node, but the actual data is in the
> thermal-zones nodes. Why isn't #cooling-cells under thermal-zones, next to
> cooling-maps?

Actually I thought about that when I worked on these patches initially and this
is why I felt convinced that the CPU nodes are the right place for this.

We add #interrupt-cells to an Interrupt controller's DT node, #gpio-cells to a
GPIO controller's DT node, #clock-cells to a clock controller's DT node and
that's exactly why we should (and we do) add #cooling-cells property to a
cooling device's DT node. This information is used in two ways, first it enables
the OS to know that the device is capable of being a cooling device and second
it tells us how many arguments will be required with a phandle of this device.

And so the cooling-maps always contain two arguments with the cooling device's
phandle (which is mostly a CPU or a gpio fan) as the #cooling-cells currently
is fixed to 2.

And so I am not really sure if thermal-zones is the right place to define this
thing. Is my understanding correct ?
diff mbox

Patch

diff --git a/arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dts b/arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dts
index 0868da476e41..313f88f8759e 100644
--- a/arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dts
+++ b/arch/arm64/boot/dts/amlogic/meson-gxm-khadas-vim2.dts
@@ -209,10 +209,34 @@ 
 	#cooling-cells = <2>;
 };
 
+&cpu1 {
+	#cooling-cells = <2>;
+};
+
+&cpu2 {
+	#cooling-cells = <2>;
+};
+
+&cpu3 {
+	#cooling-cells = <2>;
+};
+
 &cpu4 {
 	#cooling-cells = <2>;
 };
 
+&cpu5 {
+	#cooling-cells = <2>;
+};
+
+&cpu6 {
+	#cooling-cells = <2>;
+};
+
+&cpu7 {
+	#cooling-cells = <2>;
+};
+
 &ethmac {
 	pinctrl-0 = <&eth_pins>;
 	pinctrl-names = "default";