Message ID | 1375207217-4433-2-git-send-email-Sudeep.KarkadaNagesha@arm.com (mailing list archive) |
---|---|
State | RFC, archived |
Headers | show |
On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: > From: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> > > If more than one similar devices share the same OPPs, currently we > need to replicate the OPP entries in all the nodes. > > Few drivers like cpufreq depend on physical cpu0 node to specify the > OPPs and only that node is referred irrespective of the logical cpu > accessing it. Alternatively to support cpuhotplug path, few drivers > parse all the cpu nodes for OPPs. Instead we can specify the phandle > of the node with which the current node shares the operating points. > > This patch adds support to specify the phandle in the operating points > of any device node, where the node specified by the phandle holds the > actual OPPs. > diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/power/opp.txt > +Optional properties: > +- operating-points-phandle: phandle to the device node with which this That's a funny name. Bikeshedding a bit, how about shared-operating-points? I haven't thought at all about whether this change conceptually makes sense. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/30/2013 01:34 PM, Stephen Warren wrote: > On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: >> From: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> >> >> If more than one similar devices share the same OPPs, currently we >> need to replicate the OPP entries in all the nodes. >> >> Few drivers like cpufreq depend on physical cpu0 node to specify the >> OPPs and only that node is referred irrespective of the logical cpu >> accessing it. Alternatively to support cpuhotplug path, few drivers >> parse all the cpu nodes for OPPs. Instead we can specify the phandle >> of the node with which the current node shares the operating points. >> >> This patch adds support to specify the phandle in the operating points >> of any device node, where the node specified by the phandle holds the >> actual OPPs. > >> diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/power/opp.txt > >> +Optional properties: >> +- operating-points-phandle: phandle to the device node with which this > > That's a funny name. Bikeshedding a bit, how about shared-operating-points? > > I haven't thought at all about whether this change conceptually makes sense. > They may not really be shared- we could have phandle list even. one might have optional OPP sets for a chip family that one may - I was about to suggest something similar to pinctrl operating-points-names = "default", "performance", "cheapboard-config" ;) operating-points-0 = <&...> operating-points-1 = <&...> operating-points-2 = <&...> + wanted also to consider how we might have a single definition to scale across to what Mike is attempting to do with a generic clock framework support for DVFS. for compatibility sake, if operating-points is defined, we continue to expect old style definition, else we have options to pick from. This should setup stage for many of the work we have been trying to figure out on AM/OMAP and few other processors which has to depend on few sets of OPPs which may not be supported on various platforms. I am hoping our phandle solution could scale to all needed.
On 07/30/2013 02:48 PM, Nishanth Menon wrote: > On 07/30/2013 01:34 PM, Stephen Warren wrote: >> On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: >>> From: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> >>> >>> If more than one similar devices share the same OPPs, currently we >>> need to replicate the OPP entries in all the nodes. >>> >>> Few drivers like cpufreq depend on physical cpu0 node to specify the >>> OPPs and only that node is referred irrespective of the logical cpu >>> accessing it. Alternatively to support cpuhotplug path, few drivers >>> parse all the cpu nodes for OPPs. Instead we can specify the phandle >>> of the node with which the current node shares the operating points. >>> >>> This patch adds support to specify the phandle in the operating points >>> of any device node, where the node specified by the phandle holds the >>> actual OPPs. >> >>> diff --git a/Documentation/devicetree/bindings/power/opp.txt >>> b/Documentation/devicetree/bindings/power/opp.txt >> >>> +Optional properties: >>> +- operating-points-phandle: phandle to the device node with which this >> >> That's a funny name. Bikeshedding a bit, how about >> shared-operating-points? >> >> I haven't thought at all about whether this change conceptually makes >> sense. >> > > They may not really be shared- we could have phandle list even. Well, they are shared, or you wouldn't have one node pointing at another node and hence sharing the same property... > one > might have optional OPP sets for a chip family that one may - I was > about to suggest something similar to pinctrl > > operating-points-names = "default", "performance", "cheapboard-config" ;) > operating-points-0 = <&...> > operating-points-1 = <&...> > operating-points-2 = <&...> There is an assertion that DT should only represent the absolute max limits for things like this, and not policy-oriented data such as different performance profiles. I don't expect you'll see anything like the above in DT, since it's more policy than HW description. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 30/07/13 19:34, Stephen Warren wrote: > On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: >> From: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> >> >> If more than one similar devices share the same OPPs, currently we >> need to replicate the OPP entries in all the nodes. >> >> Few drivers like cpufreq depend on physical cpu0 node to specify the >> OPPs and only that node is referred irrespective of the logical cpu >> accessing it. Alternatively to support cpuhotplug path, few drivers >> parse all the cpu nodes for OPPs. Instead we can specify the phandle >> of the node with which the current node shares the operating points. >> >> This patch adds support to specify the phandle in the operating points >> of any device node, where the node specified by the phandle holds the >> actual OPPs. > >> diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/power/opp.txt > >> +Optional properties: >> +- operating-points-phandle: phandle to the device node with which this > > That's a funny name. Bikeshedding a bit, how about shared-operating-points? shared-operating-points makes sense, but I was thinking that name should indicate it's reference to the device that it shares OPP with. I agree 'operating-points-phandle' is no good, is 'shared-opp-device' any better ? Regards, Sudeep -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 30/07/13 21:48, Nishanth Menon wrote: > On 07/30/2013 01:34 PM, Stephen Warren wrote: >> On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: >>> From: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> >>> >>> If more than one similar devices share the same OPPs, currently we >>> need to replicate the OPP entries in all the nodes. >>> >>> Few drivers like cpufreq depend on physical cpu0 node to specify the >>> OPPs and only that node is referred irrespective of the logical cpu >>> accessing it. Alternatively to support cpuhotplug path, few drivers >>> parse all the cpu nodes for OPPs. Instead we can specify the phandle >>> of the node with which the current node shares the operating points. >>> >>> This patch adds support to specify the phandle in the operating points >>> of any device node, where the node specified by the phandle holds the >>> actual OPPs. >> >>> diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/power/opp.txt >> >>> +Optional properties: >>> +- operating-points-phandle: phandle to the device node with which this >> >> That's a funny name. Bikeshedding a bit, how about shared-operating-points? >> >> I haven't thought at all about whether this change conceptually makes sense. >> > > They may not really be shared- we could have phandle list even. one > might have optional OPP sets for a chip family that one may - I was > about to suggest something similar to pinctrl > I am not sure if I follow you here, if each chip family has its unique set of OPPs, why do we need to represent all of them together ? IIUC you are thinking about having these in include dts file, used by multiple chip/board dts. > operating-points-names = "default", "performance", "cheapboard-config" ;) > operating-points-0 = <&...> > operating-points-1 = <&...> > operating-points-2 = <&...> > This looks more like a PM policy. > + wanted also to consider how we might have a single definition to scale > across to what Mike is attempting to do with a generic clock framework > support for DVFS. > I don't quite follow what you are trying to say. In fact, following Mike's consolidation I had a thought that OPP must be part of clock node as multiple devices in the same clock domain refer to the same clock node. > for compatibility sake, if operating-points is defined, we continue to > expect old style definition, else we have options to pick from. > Yes we can do that, but we need to agree on where these OPPs need to present in DTS. > This should setup stage for many of the work we have been trying to > figure out on AM/OMAP and few other processors which has to depend on > few sets of OPPs which may not be supported on various platforms. > I still don't get the point why you would publish some OPP in the DT when the hardware which it describes doesn't support it. This may be already discussed when DT support was added to OPP library, IMO if for some reason the firmware/boot entity disables some of the OPPs, then it can append OPPs in DT with the state(enabled/disabled). But this needs extension of current binding. Regards, Sudeep -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: > On 30/07/13 21:48, Nishanth Menon wrote: >> On 07/30/2013 01:34 PM, Stephen Warren wrote: >>> On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: >>>> From: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> >>>> >>>> If more than one similar devices share the same OPPs, currently we >>>> need to replicate the OPP entries in all the nodes. >>>> >>>> Few drivers like cpufreq depend on physical cpu0 node to specify the >>>> OPPs and only that node is referred irrespective of the logical cpu >>>> accessing it. Alternatively to support cpuhotplug path, few drivers >>>> parse all the cpu nodes for OPPs. Instead we can specify the phandle >>>> of the node with which the current node shares the operating points. >>>> >>>> This patch adds support to specify the phandle in the operating points >>>> of any device node, where the node specified by the phandle holds the >>>> actual OPPs. >>> >>>> diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/power/opp.txt >>> >>>> +Optional properties: >>>> +- operating-points-phandle: phandle to the device node with which this >>> >>> That's a funny name. Bikeshedding a bit, how about shared-operating-points? >>> >>> I haven't thought at all about whether this change conceptually makes sense. >>> >> >> They may not really be shared- we could have phandle list even. one >> might have optional OPP sets for a chip family that one may - I was >> about to suggest something similar to pinctrl >> > I am not sure if I follow you here, if each chip family has its unique > set of OPPs, why do we need to represent all of them together ? > IIUC you are thinking about having these in include dts file, used by > multiple chip/board dts. > >> operating-points-names = "default", "performance", "cheapboard-config" ;) >> operating-points-0 = <&...> >> operating-points-1 = <&...> >> operating-points-2 = <&...> >> > This looks more like a PM policy. Let me try to explain since SoCs such as OMAP/AM family dont make life trivial :).. An legacy example[1][2] SoC DM explains that the chip is capable of X opps: opp1, 2 - for all devices opp1,2, 3 - if efuse bit X@y is set opp1,2,3,4 - if efuse bit X@y is set AND Board design meets SoC vendors requirements (including additional features A, B is enabled). So, the same chip family has a hardware feature - not just as a pm policy of selecting among a set of OPPs which opp to work on, but the actual set of OPPs are actually options in themselves that is selected based on board's SoC selection. this you could in theory, compare to selecting an pinctrl configuration option based on certain hardware selection criteria. > >> + wanted also to consider how we might have a single definition to scale >> across to what Mike is attempting to do with a generic clock framework >> support for DVFS. >> > I don't quite follow what you are trying to say. > > In fact, following Mike's consolidation I had a thought that OPP must be > part of clock node as multiple devices in the same clock domain refer to > the same clock node. just that the configuration and option we select *must* think beyond just CPU with a generic clock framework triggered dvfs - the strategy must work for devfreq/other frameworks as well. > >> for compatibility sake, if operating-points is defined, we continue to >> expect old style definition, else we have options to pick from. >> > Yes we can do that, but we need to agree on where these OPPs need to > present in DTS. I see Grant is in alignment, and I personally like the idea as well - as long as we dont break backward compatibility. > >> This should setup stage for many of the work we have been trying to >> figure out on AM/OMAP and few other processors which has to depend on >> few sets of OPPs which may not be supported on various platforms. >> > I still don't get the point why you would publish some OPP in the DT > when the hardware which it describes doesn't support it. > > This may be already discussed when DT support was added to OPP library, > IMO if for some reason the firmware/boot entity disables some of the > OPPs, then it can append OPPs in DT with the state(enabled/disabled). > But this needs extension of current binding. you could also have reduced OPP set which needs to be invoked, appending wont really work if cpufreq table is built as part of probe - it kind of creates all kind of races which I would really like to avoid. [1] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/arm/mach-omap2/opp4xxx_data.c#n134 [2] http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/tree/arch/arm/mach-omap2/id.c#n290
On 31/07/13 15:46, Nishanth Menon wrote: > On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: >> On 30/07/13 21:48, Nishanth Menon wrote: [...] >>> This should setup stage for many of the work we have been trying to >>> figure out on AM/OMAP and few other processors which has to depend on >>> few sets of OPPs which may not be supported on various platforms. >>> >> I still don't get the point why you would publish some OPP in the DT >> when the hardware which it describes doesn't support it. >> >> This may be already discussed when DT support was added to OPP library, >> IMO if for some reason the firmware/boot entity disables some of the >> OPPs, then it can append OPPs in DT with the state(enabled/disabled). >> But this needs extension of current binding. > > you could also have reduced OPP set which needs to be invoked, appending > wont really work if cpufreq table is built as part of probe - it kind of > creates all kind of races which I would really like to avoid. > IIUC opp_set_availability(opp_enable/opp_disable) is designed for such use-case ? Currently there are no users of this API but I see it fits your use case. Even with multiple OPP sets listed in DT as you described, you need to read those fuses and chose the right set of OPPs. Instead you can use opp_en(/dis)able methods to do that ? Regards, Sudeep -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Jul 31, 2013 at 03:46:34PM +0100, Nishanth Menon wrote: > On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: > > On 30/07/13 21:48, Nishanth Menon wrote: > >> On 07/30/2013 01:34 PM, Stephen Warren wrote: > >>> On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: > >>>> From: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> > >>>> > >>>> If more than one similar devices share the same OPPs, currently we > >>>> need to replicate the OPP entries in all the nodes. > >>>> > >>>> Few drivers like cpufreq depend on physical cpu0 node to specify the > >>>> OPPs and only that node is referred irrespective of the logical cpu > >>>> accessing it. Alternatively to support cpuhotplug path, few drivers > >>>> parse all the cpu nodes for OPPs. Instead we can specify the phandle > >>>> of the node with which the current node shares the operating points. > >>>> > >>>> This patch adds support to specify the phandle in the operating points > >>>> of any device node, where the node specified by the phandle holds the > >>>> actual OPPs. > >>> > >>>> diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/power/opp.txt > >>> > >>>> +Optional properties: > >>>> +- operating-points-phandle: phandle to the device node with which this > >>> > >>> That's a funny name. Bikeshedding a bit, how about shared-operating-points? > >>> > >>> I haven't thought at all about whether this change conceptually makes sense. > >>> > >> > >> They may not really be shared- we could have phandle list even. one > >> might have optional OPP sets for a chip family that one may - I was > >> about to suggest something similar to pinctrl > >> > > I am not sure if I follow you here, if each chip family has its unique > > set of OPPs, why do we need to represent all of them together ? > > IIUC you are thinking about having these in include dts file, used by > > multiple chip/board dts. > > > >> operating-points-names = "default", "performance", "cheapboard-config" ;) > >> operating-points-0 = <&...> > >> operating-points-1 = <&...> > >> operating-points-2 = <&...> > >> > > This looks more like a PM policy. > > Let me try to explain since SoCs such as OMAP/AM family dont make life > trivial :).. > > An legacy example[1][2] > > SoC DM explains that the chip is capable of X opps: > opp1, 2 - for all devices > opp1,2, 3 - if efuse bit X@y is set > opp1,2,3,4 - if efuse bit X@y is set AND Board design meets SoC vendors > requirements (including additional features A, B is enabled). > > So, the same chip family has a hardware feature - not just as a pm > policy of selecting among a set of OPPs which opp to work on, but the > actual set of OPPs are actually options in themselves that is selected > based on board's SoC selection. This sounds like we're describing a set of features not applicable to the device, then removing them, rather than only describing those features applicable to the device. If you have to probe to figure out which values in the dt are applicable, I'm not sure I see the benefit of describing said values in dt. Thanks, Mark. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/31/2013 10:28 AM, Sudeep KarkadaNagesha wrote: > On 31/07/13 15:46, Nishanth Menon wrote: >> On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: >>> On 30/07/13 21:48, Nishanth Menon wrote: > [...] >>>> This should setup stage for many of the work we have been trying to >>>> figure out on AM/OMAP and few other processors which has to depend on >>>> few sets of OPPs which may not be supported on various platforms. >>>> >>> I still don't get the point why you would publish some OPP in the DT >>> when the hardware which it describes doesn't support it. >>> >>> This may be already discussed when DT support was added to OPP library, >>> IMO if for some reason the firmware/boot entity disables some of the >>> OPPs, then it can append OPPs in DT with the state(enabled/disabled). >>> But this needs extension of current binding. >> >> you could also have reduced OPP set which needs to be invoked, appending >> wont really work if cpufreq table is built as part of probe - it kind of >> creates all kind of races which I would really like to avoid. >> > IIUC opp_set_availability(opp_enable/opp_disable) is designed for such > use-case ? Currently there are no users of this API but I see it fits > your use case. > > Even with multiple OPP sets listed in DT as you described, you need to > read those fuses and chose the right set of OPPs. Instead you can use > opp_en(/dis)able methods to do that ? > yes when the efuse data is present, but look at the other case I had also pointed at: Lets take an example: SoC X has OPPs 1,2,3,4 Same SoC is used on Board A and B. Board A meets with all SoC vendor requirements for routing, IR drop limits etc Board B *does not* meet with all SoC vendor requirements for routing, IR drop limits etc we no longer have board files, board will have to have a mechanism to "state it is not optimal configuration". A real life example is BeagleBoard Xm and another product board(which I cannot mention) -both use OMAP3630 1GHz part. 1GHz requirements are met on BeagleBoard Xm, but on the product board it is not. Chip used is exactly the same, we dont have "dts property" to mention "yes, this board meets SoC data manual and associated documentation requirement" - instead what we do have is what is the chip capable of doing. opp_enable/disable wont work here unless there is board specific "properties" we introduce. However, board file could choose "low performance" option of OPPs. the opp_enable/disable wont scale there. Further, opp_add is done enmasse by cpufreq-cpu0 and and cpufreq table is built off it, there is no option of SoC specific modification to the table (opportunity to do opp_enable/disable) there - not something that cannot be fixed, and eventually will be, but not there right now.
On 07/31/2013 10:29 AM, Mark Rutland wrote: > On Wed, Jul 31, 2013 at 03:46:34PM +0100, Nishanth Menon wrote: >> On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: >>> On 30/07/13 21:48, Nishanth Menon wrote: >>>> On 07/30/2013 01:34 PM, Stephen Warren wrote: >>>>> On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: >>>>>> From: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> >>>>>> >>>>>> If more than one similar devices share the same OPPs, currently we >>>>>> need to replicate the OPP entries in all the nodes. >>>>>> >>>>>> Few drivers like cpufreq depend on physical cpu0 node to specify the >>>>>> OPPs and only that node is referred irrespective of the logical cpu >>>>>> accessing it. Alternatively to support cpuhotplug path, few drivers >>>>>> parse all the cpu nodes for OPPs. Instead we can specify the phandle >>>>>> of the node with which the current node shares the operating points. >>>>>> >>>>>> This patch adds support to specify the phandle in the operating points >>>>>> of any device node, where the node specified by the phandle holds the >>>>>> actual OPPs. >>>>> >>>>>> diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/power/opp.txt >>>>> >>>>>> +Optional properties: >>>>>> +- operating-points-phandle: phandle to the device node with which this >>>>> >>>>> That's a funny name. Bikeshedding a bit, how about shared-operating-points? >>>>> >>>>> I haven't thought at all about whether this change conceptually makes sense. >>>>> >>>> >>>> They may not really be shared- we could have phandle list even. one >>>> might have optional OPP sets for a chip family that one may - I was >>>> about to suggest something similar to pinctrl >>>> >>> I am not sure if I follow you here, if each chip family has its unique >>> set of OPPs, why do we need to represent all of them together ? >>> IIUC you are thinking about having these in include dts file, used by >>> multiple chip/board dts. >>> >>>> operating-points-names = "default", "performance", "cheapboard-config" ;) >>>> operating-points-0 = <&...> >>>> operating-points-1 = <&...> >>>> operating-points-2 = <&...> >>>> >>> This looks more like a PM policy. >> >> Let me try to explain since SoCs such as OMAP/AM family dont make life >> trivial :).. >> >> An legacy example[1][2] >> >> SoC DM explains that the chip is capable of X opps: >> opp1, 2 - for all devices >> opp1,2, 3 - if efuse bit X@y is set >> opp1,2,3,4 - if efuse bit X@y is set AND Board design meets SoC vendors >> requirements (including additional features A, B is enabled). >> >> So, the same chip family has a hardware feature - not just as a pm >> policy of selecting among a set of OPPs which opp to work on, but the >> actual set of OPPs are actually options in themselves that is selected >> based on board's SoC selection. > > This sounds like we're describing a set of features not applicable to > the device, then removing them, rather than only describing those > features applicable to the device. If you have to probe to figure out > which values in the dt are applicable, I'm not sure I see the benefit of > describing said values in dt. Device has *options* of operating points sets it can operate at. It is not like "these are not applicable" for the device. DT does have to describe the hardware capability - that was it's entire intent. operating points are valid configurations where it can be operated at - and when you have options of configurations you need to choose from based on the board you are using it on, it still retains "hardware behavior" aspect. Hope that explains.
On Wed, Jul 31, 2013 at 04:58:22PM +0100, Nishanth Menon wrote: > On 07/31/2013 10:29 AM, Mark Rutland wrote: > > On Wed, Jul 31, 2013 at 03:46:34PM +0100, Nishanth Menon wrote: > >> On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: > >>> On 30/07/13 21:48, Nishanth Menon wrote: > >>>> On 07/30/2013 01:34 PM, Stephen Warren wrote: > >>>>> On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: > >>>>>> From: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> > >>>>>> > >>>>>> If more than one similar devices share the same OPPs, currently we > >>>>>> need to replicate the OPP entries in all the nodes. > >>>>>> > >>>>>> Few drivers like cpufreq depend on physical cpu0 node to specify the > >>>>>> OPPs and only that node is referred irrespective of the logical cpu > >>>>>> accessing it. Alternatively to support cpuhotplug path, few drivers > >>>>>> parse all the cpu nodes for OPPs. Instead we can specify the phandle > >>>>>> of the node with which the current node shares the operating points. > >>>>>> > >>>>>> This patch adds support to specify the phandle in the operating points > >>>>>> of any device node, where the node specified by the phandle holds the > >>>>>> actual OPPs. > >>>>> > >>>>>> diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/power/opp.txt > >>>>> > >>>>>> +Optional properties: > >>>>>> +- operating-points-phandle: phandle to the device node with which this > >>>>> > >>>>> That's a funny name. Bikeshedding a bit, how about shared-operating-points? > >>>>> > >>>>> I haven't thought at all about whether this change conceptually makes sense. > >>>>> > >>>> > >>>> They may not really be shared- we could have phandle list even. one > >>>> might have optional OPP sets for a chip family that one may - I was > >>>> about to suggest something similar to pinctrl > >>>> > >>> I am not sure if I follow you here, if each chip family has its unique > >>> set of OPPs, why do we need to represent all of them together ? > >>> IIUC you are thinking about having these in include dts file, used by > >>> multiple chip/board dts. > >>> > >>>> operating-points-names = "default", "performance", "cheapboard-config" ;) > >>>> operating-points-0 = <&...> > >>>> operating-points-1 = <&...> > >>>> operating-points-2 = <&...> > >>>> > >>> This looks more like a PM policy. > >> > >> Let me try to explain since SoCs such as OMAP/AM family dont make life > >> trivial :).. > >> > >> An legacy example[1][2] > >> > >> SoC DM explains that the chip is capable of X opps: > >> opp1, 2 - for all devices > >> opp1,2, 3 - if efuse bit X@y is set > >> opp1,2,3,4 - if efuse bit X@y is set AND Board design meets SoC vendors > >> requirements (including additional features A, B is enabled). > >> > >> So, the same chip family has a hardware feature - not just as a pm > >> policy of selecting among a set of OPPs which opp to work on, but the > >> actual set of OPPs are actually options in themselves that is selected > >> based on board's SoC selection. > > > > This sounds like we're describing a set of features not applicable to > > the device, then removing them, rather than only describing those > > features applicable to the device. If you have to probe to figure out > > which values in the dt are applicable, I'm not sure I see the benefit of > > describing said values in dt. > > Device has *options* of operating points sets it can operate at. It is > not like "these are not applicable" for the device. I don't follow. In the example above, if efuse bit X@y is not set, opp3 is not applicable, but we're describing it in dt. It's not an option for the particular device, yet it appears in the device's dt. For opp4, it's even worse, as you're suggesting we describe an option for the device that requires the driver to use some additional platform knowledge to come to the conclusion that it cannot use. That sounds like device knowledge internal to a driver, not how you describe an instance of a device to an OS. Have I misunderstood something here? > > DT does have to describe the hardware capability - that was it's entire > intent. operating points are valid configurations where it can be > operated at - and when you have options of configurations you need to > choose from based on the board you are using it on, it still retains > "hardware behavior" aspect. The dt should describe the particular board you're running on. As I see it what you're suggesting is equivalent to describing some hardware in the dt that isn't actually present, then relying on the OS to poke around somewhere else, figure out that the hardware isn't present, and then forget that the dt described it. Thanks, Mark. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/31/2013 11:11 AM, Mark Rutland wrote: > On Wed, Jul 31, 2013 at 04:58:22PM +0100, Nishanth Menon wrote: >> On 07/31/2013 10:29 AM, Mark Rutland wrote: >>> On Wed, Jul 31, 2013 at 03:46:34PM +0100, Nishanth Menon wrote: >>>> On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: >>>>> On 30/07/13 21:48, Nishanth Menon wrote: >>>>>> On 07/30/2013 01:34 PM, Stephen Warren wrote: >>>>>>> On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: >>>>>>>> From: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> >>>>>>>> >>>>>>>> If more than one similar devices share the same OPPs, currently we >>>>>>>> need to replicate the OPP entries in all the nodes. >>>>>>>> >>>>>>>> Few drivers like cpufreq depend on physical cpu0 node to specify the >>>>>>>> OPPs and only that node is referred irrespective of the logical cpu >>>>>>>> accessing it. Alternatively to support cpuhotplug path, few drivers >>>>>>>> parse all the cpu nodes for OPPs. Instead we can specify the phandle >>>>>>>> of the node with which the current node shares the operating points. >>>>>>>> >>>>>>>> This patch adds support to specify the phandle in the operating points >>>>>>>> of any device node, where the node specified by the phandle holds the >>>>>>>> actual OPPs. >>>>>>> >>>>>>>> diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/power/opp.txt >>>>>>> >>>>>>>> +Optional properties: >>>>>>>> +- operating-points-phandle: phandle to the device node with which this >>>>>>> >>>>>>> That's a funny name. Bikeshedding a bit, how about shared-operating-points? >>>>>>> >>>>>>> I haven't thought at all about whether this change conceptually makes sense. >>>>>>> >>>>>> >>>>>> They may not really be shared- we could have phandle list even. one >>>>>> might have optional OPP sets for a chip family that one may - I was >>>>>> about to suggest something similar to pinctrl >>>>>> >>>>> I am not sure if I follow you here, if each chip family has its unique >>>>> set of OPPs, why do we need to represent all of them together ? >>>>> IIUC you are thinking about having these in include dts file, used by >>>>> multiple chip/board dts. >>>>> >>>>>> operating-points-names = "default", "performance", "cheapboard-config" ;) >>>>>> operating-points-0 = <&...> >>>>>> operating-points-1 = <&...> >>>>>> operating-points-2 = <&...> >>>>>> >>>>> This looks more like a PM policy. >>>> >>>> Let me try to explain since SoCs such as OMAP/AM family dont make life >>>> trivial :).. >>>> >>>> An legacy example[1][2] >>>> >>>> SoC DM explains that the chip is capable of X opps: >>>> opp1, 2 - for all devices >>>> opp1,2, 3 - if efuse bit X@y is set >>>> opp1,2,3,4 - if efuse bit X@y is set AND Board design meets SoC vendors >>>> requirements (including additional features A, B is enabled). >>>> >>>> So, the same chip family has a hardware feature - not just as a pm >>>> policy of selecting among a set of OPPs which opp to work on, but the >>>> actual set of OPPs are actually options in themselves that is selected >>>> based on board's SoC selection. >>> >>> This sounds like we're describing a set of features not applicable to >>> the device, then removing them, rather than only describing those >>> features applicable to the device. If you have to probe to figure out >>> which values in the dt are applicable, I'm not sure I see the benefit of >>> describing said values in dt. >> >> Device has *options* of operating points sets it can operate at. It is >> not like "these are not applicable" for the device. > > I don't follow. > > In the example above, if efuse bit X@y is not set, opp3 is not > applicable, but we're describing it in dt. It's not an option for the > particular device, yet it appears in the device's dt. This one is easy - opp_enable/disable as discussed in http://marc.info/?l=linux-pm&m=137528631125365&w=2 should probably help. > > For opp4, it's even worse, as you're suggesting we describe an option > for the device that requires the driver to use some additional platform > knowledge to come to the conclusion that it cannot use. That sounds like Precisely.. it wont have that knowledge and should not need that knowledge. See explanation above. Specific examples: SoC vendors try to squeeze the max out of the chip, when voltage values are defined, they need to consider board markets that they try to address, pricepoints etc.. too many vectors.. not all board manufacturers care to meet SoC vendor requirements as they may not care about picking up the full potential of the chip - example - usecases on OMAP where ARM is seldom used and max DSP is used (video usecases) and others so they use a high performance chip, refuse to optimize vdd_mpu rail, dont care too much about higher ARM OPPs. Yeah, I could always tell them to hand edit the OPP entries and maintain kernel forks, but that is never the right thing to do. > device knowledge internal to a driver, not how you describe an instance > of a device to an OS. OPP has never been a device - it is a performance point at which to operate a device. I am not sure if we are discussing about phandle definition of OPP is an issue or options of operating-point sets is an issue now. > > Have I misunderstood something here? Are you suggesting we have OPP tables per board? > >> >> DT does have to describe the hardware capability - that was it's entire >> intent. operating points are valid configurations where it can be >> operated at - and when you have options of configurations you need to >> choose from based on the board you are using it on, it still retains >> "hardware behavior" aspect. > > The dt should describe the particular board you're running on. As I see > it what you're suggesting is equivalent to describing some hardware in > the dt that isn't actually present, then relying on the OS to poke > around somewhere else, figure out that the hardware isn't present, and > then forget that the dt described it. I will buy that eventual dtb should contain some way to choose the OPP that the particular board can operate on. SoC dtsi is what we define, this allows multiple board dts to use them. the moment we start defining OPPs per board, all mayhem breaks loose. SoC dtsi provides options for the SoC to be operated upon, it is like saying I have 10 Uarts, but board dts chooses to enable the ones it uses. pinctrl we do the same. why cant we do with operating-points as well?
On 31/07/13 16:53, Nishanth Menon wrote: > On 07/31/2013 10:28 AM, Sudeep KarkadaNagesha wrote: >> On 31/07/13 15:46, Nishanth Menon wrote: >>> On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: >>>> On 30/07/13 21:48, Nishanth Menon wrote: >> [...] >>>>> This should setup stage for many of the work we have been trying to >>>>> figure out on AM/OMAP and few other processors which has to depend on >>>>> few sets of OPPs which may not be supported on various platforms. >>>>> >>>> I still don't get the point why you would publish some OPP in the DT >>>> when the hardware which it describes doesn't support it. >>>> >>>> This may be already discussed when DT support was added to OPP library, >>>> IMO if for some reason the firmware/boot entity disables some of the >>>> OPPs, then it can append OPPs in DT with the state(enabled/disabled). >>>> But this needs extension of current binding. >>> >>> you could also have reduced OPP set which needs to be invoked, appending >>> wont really work if cpufreq table is built as part of probe - it kind of >>> creates all kind of races which I would really like to avoid. >>> >> IIUC opp_set_availability(opp_enable/opp_disable) is designed for such >> use-case ? Currently there are no users of this API but I see it fits >> your use case. >> >> Even with multiple OPP sets listed in DT as you described, you need to >> read those fuses and chose the right set of OPPs. Instead you can use >> opp_en(/dis)able methods to do that ? >> > yes when the efuse data is present, but look at the other case I had > also pointed at: > > Lets take an example: SoC X has OPPs 1,2,3,4 > > Same SoC is used on Board A and B. > Board A meets with all SoC vendor requirements for routing, IR drop > limits etc > Board B *does not* meet with all SoC vendor requirements for routing, IR > drop limits etc > > we no longer have board files, board will have to have a mechanism to > "state it is not optimal configuration". > But we still have DTS per board(I assume each board will at-least different in IRQ/GPIO assignments or even different pin-mux configuration) > A real life example is BeagleBoard Xm and another product board(which I > cannot mention) -both use OMAP3630 1GHz part. 1GHz requirements are met > on BeagleBoard Xm, but on the product board it is not. Chip used is > exactly the same, we dont have "dts property" to mention "yes, this > board meets SoC data manual and associated documentation requirement" - > instead what we do have is what is the chip capable of doing. > If SoC gives configuration w.r.t OPPs, then its board property like pin-mux. Why is it not possible to have 1GHz only in BeagleBoard Xm and not in product board X ? > opp_enable/disable wont work here unless there is board specific > "properties" we introduce. However, board file could choose "low > performance" option of OPPs. > Again board file choice of selecting OPPs is policy and DT must describe all the features board supports. > the opp_enable/disable wont scale there. Further, opp_add is done > enmasse by cpufreq-cpu0 and and cpufreq table is built off it, there is > no option of SoC specific modification to the table (opportunity to do > opp_enable/disable) there - not something that cannot be fixed, and > eventually will be, but not there right now. > Freescale iMX6 seems to be using it, not use if its SoC level or board level.(imx6q_opp_check_1p2ghz in arch/arm/mach-imx/mach-imx6q.c) I just had a look @arch/arm/boot/dts/omap36xx.dtsi and omap3-beagle-xm.dts. This is my opinion(if you can't handle this dynamically): Now you have omap36xx SoC on 2 products X and BeagleBoard Xm. omap36xx can support up to 1GHz but depends on actual products. So its better not to publish OPPs in omap36xx.dtsi but leave to product X and BeagleBoard Xm to describe what that hardware supports. Regards, Sudeep -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 17:40-20130731, Sudeep KarkadaNagesha wrote: > On 31/07/13 16:53, Nishanth Menon wrote: > > On 07/31/2013 10:28 AM, Sudeep KarkadaNagesha wrote: > >> On 31/07/13 15:46, Nishanth Menon wrote: > >>> On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: > >>>> On 30/07/13 21:48, Nishanth Menon wrote: > >> [...] > >>>>> This should setup stage for many of the work we have been trying to > >>>>> figure out on AM/OMAP and few other processors which has to depend on > >>>>> few sets of OPPs which may not be supported on various platforms. > >>>>> > >>>> I still don't get the point why you would publish some OPP in the DT > >>>> when the hardware which it describes doesn't support it. > >>>> > >>>> This may be already discussed when DT support was added to OPP library, > >>>> IMO if for some reason the firmware/boot entity disables some of the > >>>> OPPs, then it can append OPPs in DT with the state(enabled/disabled). > >>>> But this needs extension of current binding. > >>> > >>> you could also have reduced OPP set which needs to be invoked, appending > >>> wont really work if cpufreq table is built as part of probe - it kind of > >>> creates all kind of races which I would really like to avoid. > >>> > >> IIUC opp_set_availability(opp_enable/opp_disable) is designed for such > >> use-case ? Currently there are no users of this API but I see it fits > >> your use case. > >> > >> Even with multiple OPP sets listed in DT as you described, you need to > >> read those fuses and chose the right set of OPPs. Instead you can use > >> opp_en(/dis)able methods to do that ? > >> > > yes when the efuse data is present, but look at the other case I had > > also pointed at: > > > > Lets take an example: SoC X has OPPs 1,2,3,4 > > > > Same SoC is used on Board A and B. > > Board A meets with all SoC vendor requirements for routing, IR drop > > limits etc > > Board B *does not* meet with all SoC vendor requirements for routing, IR > > drop limits etc > > > > we no longer have board files, board will have to have a mechanism to > > "state it is not optimal configuration". > > > But we still have DTS per board(I assume each board will at-least > different in IRQ/GPIO assignments or even different pin-mux configuration) yes, of course. > > > A real life example is BeagleBoard Xm and another product board(which I > > cannot mention) -both use OMAP3630 1GHz part. 1GHz requirements are met > > on BeagleBoard Xm, but on the product board it is not. Chip used is > > exactly the same, we dont have "dts property" to mention "yes, this > > board meets SoC data manual and associated documentation requirement" - > > instead what we do have is what is the chip capable of doing. > > > If SoC gives configuration w.r.t OPPs, then its board property like > pin-mux. Why is it not possible to have 1GHz only in BeagleBoard Xm and > not in product board X ? Unless we have two "phandles", we wont be able to do the same. Then you'd want to standardize how we do that which is why I made the proposal. > > > opp_enable/disable wont work here unless there is board specific > > "properties" we introduce. However, board file could choose "low > > performance" option of OPPs. > > > Again board file choice of selecting OPPs is policy and DT must describe > all the features board supports. Umm... I would probably call it "capability" based on selection of SoC type, board design etc. as against "policy" of making a decision - example, this device is too hot, so lets reduce the number of OPPs that cpufreq chooses to operate on. > > > the opp_enable/disable wont scale there. Further, opp_add is done > > enmasse by cpufreq-cpu0 and and cpufreq table is built off it, there is > > no option of SoC specific modification to the table (opportunity to do > > opp_enable/disable) there - not something that cannot be fixed, and > > eventually will be, but not there right now. > > > Freescale iMX6 seems to be using it, not use if its SoC level or board > level.(imx6q_opp_check_1p2ghz in arch/arm/mach-imx/mach-imx6q.c) Yep, this is similar to the "efuse case" I had stated, and given no other option, that is the only way I will probably have to go w.r.t OMAP, but given OMAP, iMX have similar challenges, I am hoping we can have a definition that is generic enough for various SoCs to use. (similar to beagle_opp_init in arch/arm/mach-omap2/board-omap3beagle.c - which is pre-dts days usage) > > > I just had a look @arch/arm/boot/dts/omap36xx.dtsi and > omap3-beagle-xm.dts. This is my opinion(if you can't handle this > dynamically): > > Now you have omap36xx SoC on 2 products X and BeagleBoard Xm. > omap36xx can support up to 1GHz but depends on actual products. > So its better not to publish OPPs in omap36xx.dtsi but leave to product > X and BeagleBoard Xm to describe what that hardware supports. That is the pain I was trying to explain, having configuration options defined in SoC dtsi providing all valid options available is more sensible. It also solves the issue of multiple devices/clocks using same definitions Speaking as an SoC vendor, allowing board files define their own opp tables is basically an invitation for disaster from a production risk perspective, as, there are too many board variants out there and not all developers are, umm... "OPP-aware" and end up with a maintenance burden for rest of us (e.g. board X, board Y defining 1GHz differently) etc.
On 07/31/2013 02:13 PM, Nishanth Menon wrote: [...] > Unless we have two "phandles", we wont be able to do the same. Then > you'd want to standardize how we do that which is why I made the > proposal. > Let me try a slightly detailed proposal of what I am trying to suggest: Usage option #1: Legacy support. cpu@0 { compatible = "arm,cortex-a9"; reg = <0>; next-level-cache = <&L2>; operating-points = < /* kHz uV */ 792000 1100000 396000 950000 198000 850000 >; }; Usage option #2: Maintain only deltas in options from a base. cpu@0 { compatible = "arm,cortex-a9"; reg = <0>; next-level-cache = <&L2>; operating-points-names = "base", "high-performance"; operating-points-0 = < /* kHz uV */ 792000 1100000 396000 950000 198000 850000 >; operating-points-1 = < /* kHz uV */ 1000000 1200000 >; }; Usage option #3: (not compatible definition to #2) cpu@0 { compatible = "arm,cortex-a9"; reg = <0>; next-level-cache = <&L2>; operating-points-names = "default", "high-performance"; operating-points-0 = < /* kHz uV */ 792000 1100000 396000 950000 198000 850000 >; operating-points-1 = < /* kHz uV */ 1000000 1200000 792000 1100000 396000 950000 198000 850000 >; }; Usage option #4 (along with option 3 or 2): cpu@1 { compatible = "arm,cortex-a9"; reg = <0>; next-level-cache = <&L2>; operating-points-device = <&cpu0 high-performance>; }; Usage option #5 (along with option 1): This is the step we are attempting to do in this patch as far as I understand. cpu@1 { compatible = "arm,cortex-a9"; reg = <0>; next-level-cache = <&L2>; operating-points-device = <&cpu0>; }; board file override option: &cpu0 { operating-points-select = "default"; } This will prevent selection of high-performance even if efuse is set etc.. or force selection of high-performance independent of what efuse says. This allows us: a) To maintain dts in a separate repository without being dependent on frequencies in kernel code for opp_enable/disable. b) reasonably proceed towards complete SoC entitlement c) not have to deal with multiple OPP definitions per board file. Does that make sense? or do we see concerns?
On 07/31/2013 08:46 AM, Nishanth Menon wrote: ... > Let me try to explain since SoCs such as OMAP/AM family dont make life > trivial :).. > > An legacy example[1][2] > > SoC DM explains that the chip is capable of X opps: > opp1, 2 - for all devices > opp1,2, 3 - if efuse bit X@y is set > opp1,2,3,4 - if efuse bit X@y is set AND Board design meets SoC vendors > requirements (including additional features A, B is enabled). Hopefully the text "board design meets SoC vendors requirements" means something like "the board has a big fan capable of dissipating a lot of heat" and not "the board manufacturer paid us a lot of money to license the 'go faster' feature". The former could well be suitable to represent in DT, the latter not. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/31/2013 10:11 AM, Mark Rutland wrote: > On Wed, Jul 31, 2013 at 04:58:22PM +0100, Nishanth Menon wrote: >> On 07/31/2013 10:29 AM, Mark Rutland wrote: >>> On Wed, Jul 31, 2013 at 03:46:34PM +0100, Nishanth Menon wrote: ... >>>> Let me try to explain since SoCs such as OMAP/AM family dont make life >>>> trivial :).. >>>> >>>> An legacy example[1][2] >>>> >>>> SoC DM explains that the chip is capable of X opps: >>>> opp1, 2 - for all devices >>>> opp1,2, 3 - if efuse bit X@y is set >>>> opp1,2,3,4 - if efuse bit X@y is set AND Board design meets SoC vendors >>>> requirements (including additional features A, B is enabled). >>>> >>>> So, the same chip family has a hardware feature - not just as a pm >>>> policy of selecting among a set of OPPs which opp to work on, but the >>>> actual set of OPPs are actually options in themselves that is selected >>>> based on board's SoC selection. >>> >>> This sounds like we're describing a set of features not applicable to >>> the device, then removing them, rather than only describing those >>> features applicable to the device. If you have to probe to figure out >>> which values in the dt are applicable, I'm not sure I see the benefit of >>> describing said values in dt. >> >> Device has *options* of operating points sets it can operate at. It is >> not like "these are not applicable" for the device. > > I don't follow. > > In the example above, if efuse bit X@y is not set, opp3 is not > applicable, but we're describing it in dt. It's not an option for the > particular device, yet it appears in the device's dt. > > For opp4, it's even worse, as you're suggesting we describe an option > for the device that requires the driver to use some additional platform > knowledge to come to the conclusion that it cannot use. That sounds like > device knowledge internal to a driver, not how you describe an instance > of a device to an OS. > > Have I misunderstood something here? From the kernel's perspective here, a simple approach would be to say the the DT passed to the kernel must contain exactly the set of OPPs that the kernel should use. If this requires probing HW (fuses, board knowledge, etc.), then the bootloader/... must do that and fix up the DT to correctly represent the board at boot time (or choose the correct DT from a set of options in flash) or whoever flashed the board must have simply put the correct DT onto the board. Of course, that simply pushes back the issue onto the flashing utility or bootloader. Perhaps pushing stuff back there isn't a good holistic solution, since we only get a clean kernel by making life suck for someone else, and we'd perhaps better optimize the entire SW stack. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 15:51-20130731, Stephen Warren wrote: > On 07/31/2013 08:46 AM, Nishanth Menon wrote: > ... > > Let me try to explain since SoCs such as OMAP/AM family dont make life > > trivial :).. > > > > An legacy example[1][2] > > > > SoC DM explains that the chip is capable of X opps: > > opp1, 2 - for all devices > > opp1,2, 3 - if efuse bit X@y is set > > opp1,2,3,4 - if efuse bit X@y is set AND Board design meets SoC vendors > > requirements (including additional features A, B is enabled). > > Hopefully the text "board design meets SoC vendors requirements" means > something like "the board has a big fan capable of dissipating a lot of > heat" and not "the board manufacturer paid us a lot of money to license > the 'go faster' feature". The former could well be suitable to represent > in DT, the latter not. Nope, these are technical requirements. In my company, these guidelines are called Power Distribution Network guideline - which control the IRDrop within reasonable limit, running DPLLs are varied frequencies have different current draw characteristics, such that noise limits, cleanup capacitors etc. For boards that dont care too much about higher frequencies, they tend to skimp a little on caps and board routing constraints to save on BOM (Bill Of Materials) cost. I am sure similar constraints do exist in other SoC vendors as well.
On Wed, Jul 31, 2013 at 05:27:39PM +0100, Nishanth Menon wrote: > On 07/31/2013 11:11 AM, Mark Rutland wrote: > > On Wed, Jul 31, 2013 at 04:58:22PM +0100, Nishanth Menon wrote: > >> On 07/31/2013 10:29 AM, Mark Rutland wrote: > >>> On Wed, Jul 31, 2013 at 03:46:34PM +0100, Nishanth Menon wrote: > >>>> On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: > >>>>> On 30/07/13 21:48, Nishanth Menon wrote: > >>>>>> On 07/30/2013 01:34 PM, Stephen Warren wrote: > >>>>>>> On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: > >>>>>>>> From: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> > >>>>>>>> > >>>>>>>> If more than one similar devices share the same OPPs, currently we > >>>>>>>> need to replicate the OPP entries in all the nodes. > >>>>>>>> > >>>>>>>> Few drivers like cpufreq depend on physical cpu0 node to specify the > >>>>>>>> OPPs and only that node is referred irrespective of the logical cpu > >>>>>>>> accessing it. Alternatively to support cpuhotplug path, few drivers > >>>>>>>> parse all the cpu nodes for OPPs. Instead we can specify the phandle > >>>>>>>> of the node with which the current node shares the operating points. > >>>>>>>> > >>>>>>>> This patch adds support to specify the phandle in the operating points > >>>>>>>> of any device node, where the node specified by the phandle holds the > >>>>>>>> actual OPPs. > >>>>>>> > >>>>>>>> diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/power/opp.txt > >>>>>>> > >>>>>>>> +Optional properties: > >>>>>>>> +- operating-points-phandle: phandle to the device node with which this > >>>>>>> > >>>>>>> That's a funny name. Bikeshedding a bit, how about shared-operating-points? > >>>>>>> > >>>>>>> I haven't thought at all about whether this change conceptually makes sense. > >>>>>>> > >>>>>> > >>>>>> They may not really be shared- we could have phandle list even. one > >>>>>> might have optional OPP sets for a chip family that one may - I was > >>>>>> about to suggest something similar to pinctrl > >>>>>> > >>>>> I am not sure if I follow you here, if each chip family has its unique > >>>>> set of OPPs, why do we need to represent all of them together ? > >>>>> IIUC you are thinking about having these in include dts file, used by > >>>>> multiple chip/board dts. > >>>>> > >>>>>> operating-points-names = "default", "performance", "cheapboard-config" ;) > >>>>>> operating-points-0 = <&...> > >>>>>> operating-points-1 = <&...> > >>>>>> operating-points-2 = <&...> > >>>>>> > >>>>> This looks more like a PM policy. > >>>> > >>>> Let me try to explain since SoCs such as OMAP/AM family dont make life > >>>> trivial :).. > >>>> > >>>> An legacy example[1][2] > >>>> > >>>> SoC DM explains that the chip is capable of X opps: > >>>> opp1, 2 - for all devices > >>>> opp1,2, 3 - if efuse bit X@y is set > >>>> opp1,2,3,4 - if efuse bit X@y is set AND Board design meets SoC vendors > >>>> requirements (including additional features A, B is enabled). > >>>> > >>>> So, the same chip family has a hardware feature - not just as a pm > >>>> policy of selecting among a set of OPPs which opp to work on, but the > >>>> actual set of OPPs are actually options in themselves that is selected > >>>> based on board's SoC selection. > >>> > >>> This sounds like we're describing a set of features not applicable to > >>> the device, then removing them, rather than only describing those > >>> features applicable to the device. If you have to probe to figure out > >>> which values in the dt are applicable, I'm not sure I see the benefit of > >>> describing said values in dt. > >> > >> Device has *options* of operating points sets it can operate at. It is > >> not like "these are not applicable" for the device. > > > > I don't follow. > > > > In the example above, if efuse bit X@y is not set, opp3 is not > > applicable, but we're describing it in dt. It's not an option for the > > particular device, yet it appears in the device's dt. > This one is easy - opp_enable/disable as discussed in > http://marc.info/?l=linux-pm&m=137528631125365&w=2 should probably help. Wrong link? There's no reference to opp_enable or opp_disable... > > > > For opp4, it's even worse, as you're suggesting we describe an option > > for the device that requires the driver to use some additional platform > > knowledge to come to the conclusion that it cannot use. That sounds like > > Precisely.. it wont have that knowledge and should not need that > knowledge. See explanation above. > > Specific examples: SoC vendors try to squeeze the max out of the chip, > when voltage values are defined, they need to consider board markets > that they try to address, pricepoints etc.. too many vectors.. not all > board manufacturers care to meet SoC vendor requirements as they may not > care about picking up the full potential of the chip - example - > usecases on OMAP where ARM is seldom used and max DSP is used (video > usecases) and others so they use a high performance chip, refuse to > optimize vdd_mpu rail, dont care too much about higher ARM OPPs. Yeah, I > could always tell them to hand edit the OPP entries and maintain kernel > forks, but that is never the right thing to do. Sorry, but I still don't follow. We seem to be going over two cases, which both feel wrong to me: * One SoC used in multiple boards, where on some boards an OPP cannot be used because some requirement is not met. In this case, the board's dts (by including the SoC's dtsi) describes something that's not necessarily usable, and we seem to have no way to describe in the OPP table that the OPP is not usable for that board. * Performance profiles, in which you have a set of OPP tables for "performance, "low-power", and whatever else. This arbitrary split seems like a configuration decision rather than a hardware description unless there is some hard limit that cannot be detected (e.g. the processor can function at some arbitrary high speed, but becomes hot enough to melt something, and there's no temperature sensor to handle this case dynamically). Have I've misunderstood something? > > > > device knowledge internal to a driver, not how you describe an instance > > of a device to an OS. > > OPP has never been a device - it is a performance point at which to > operate a device. I am not sure if we are discussing about phandle > definition of OPP is an issue or options of operating-point sets is an > issue now. > > > > > Have I misunderstood something here? > > Are you suggesting we have OPP tables per board? Yes, for the reasons I give above. Common OPP tabless can easily be factored into separate include files to allow for arbitrary composition. > > > > >> > >> DT does have to describe the hardware capability - that was it's entire > >> intent. operating points are valid configurations where it can be > >> operated at - and when you have options of configurations you need to > >> choose from based on the board you are using it on, it still retains > >> "hardware behavior" aspect. > > > > The dt should describe the particular board you're running on. As I see > > it what you're suggesting is equivalent to describing some hardware in > > the dt that isn't actually present, then relying on the OS to poke > > around somewhere else, figure out that the hardware isn't present, and > > then forget that the dt described it. > I will buy that eventual dtb should contain some way to choose the OPP > that the particular board can operate on. > > SoC dtsi is what we define, this allows multiple board dts to use them. > the moment we start defining OPPs per board, all mayhem breaks loose. > > SoC dtsi provides options for the SoC to be operated upon, it is like > saying I have 10 Uarts, but board dts chooses to enable the ones it > uses. pinctrl we do the same. why cant we do with operating-points as well? That's a possibility if we define a standard mechanism for stating OPPs are unusable (rather than having to probe the device to figure that out). Thanks, Mark. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/01/2013 08:54 AM, Mark Rutland wrote: > On Wed, Jul 31, 2013 at 05:27:39PM +0100, Nishanth Menon wrote: >> On 07/31/2013 11:11 AM, Mark Rutland wrote: >>> On Wed, Jul 31, 2013 at 04:58:22PM +0100, Nishanth Menon wrote: >>>> On 07/31/2013 10:29 AM, Mark Rutland wrote: >>>>> On Wed, Jul 31, 2013 at 03:46:34PM +0100, Nishanth Menon wrote: >>>>>> On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: >>>>>>> On 30/07/13 21:48, Nishanth Menon wrote: >>>>>>>> On 07/30/2013 01:34 PM, Stephen Warren wrote: >>>>>>>>> On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: >>>>>>>>>> From: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> >>>>>>>>>> >>>>>>>>>> If more than one similar devices share the same OPPs, currently we >>>>>>>>>> need to replicate the OPP entries in all the nodes. >>>>>>>>>> >>>>>>>>>> Few drivers like cpufreq depend on physical cpu0 node to specify the >>>>>>>>>> OPPs and only that node is referred irrespective of the logical cpu >>>>>>>>>> accessing it. Alternatively to support cpuhotplug path, few drivers >>>>>>>>>> parse all the cpu nodes for OPPs. Instead we can specify the phandle >>>>>>>>>> of the node with which the current node shares the operating points. >>>>>>>>>> >>>>>>>>>> This patch adds support to specify the phandle in the operating points >>>>>>>>>> of any device node, where the node specified by the phandle holds the >>>>>>>>>> actual OPPs. >>>>>>>>> >>>>>>>>>> diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/power/opp.txt >>>>>>>>> >>>>>>>>>> +Optional properties: >>>>>>>>>> +- operating-points-phandle: phandle to the device node with which this >>>>>>>>> >>>>>>>>> That's a funny name. Bikeshedding a bit, how about shared-operating-points? >>>>>>>>> >>>>>>>>> I haven't thought at all about whether this change conceptually makes sense. >>>>>>>>> >>>>>>>> >>>>>>>> They may not really be shared- we could have phandle list even. one >>>>>>>> might have optional OPP sets for a chip family that one may - I was >>>>>>>> about to suggest something similar to pinctrl >>>>>>>> >>>>>>> I am not sure if I follow you here, if each chip family has its unique >>>>>>> set of OPPs, why do we need to represent all of them together ? >>>>>>> IIUC you are thinking about having these in include dts file, used by >>>>>>> multiple chip/board dts. >>>>>>> >>>>>>>> operating-points-names = "default", "performance", "cheapboard-config" ;) >>>>>>>> operating-points-0 = <&...> >>>>>>>> operating-points-1 = <&...> >>>>>>>> operating-points-2 = <&...> >>>>>>>> >>>>>>> This looks more like a PM policy. >>>>>> >>>>>> Let me try to explain since SoCs such as OMAP/AM family dont make life >>>>>> trivial :).. >>>>>> >>>>>> An legacy example[1][2] >>>>>> >>>>>> SoC DM explains that the chip is capable of X opps: >>>>>> opp1, 2 - for all devices >>>>>> opp1,2, 3 - if efuse bit X@y is set >>>>>> opp1,2,3,4 - if efuse bit X@y is set AND Board design meets SoC vendors >>>>>> requirements (including additional features A, B is enabled). >>>>>> >>>>>> So, the same chip family has a hardware feature - not just as a pm >>>>>> policy of selecting among a set of OPPs which opp to work on, but the >>>>>> actual set of OPPs are actually options in themselves that is selected >>>>>> based on board's SoC selection. >>>>> >>>>> This sounds like we're describing a set of features not applicable to >>>>> the device, then removing them, rather than only describing those >>>>> features applicable to the device. If you have to probe to figure out >>>>> which values in the dt are applicable, I'm not sure I see the benefit of >>>>> describing said values in dt. >>>> >>>> Device has *options* of operating points sets it can operate at. It is >>>> not like "these are not applicable" for the device. >>> >>> I don't follow. >>> >>> In the example above, if efuse bit X@y is not set, opp3 is not >>> applicable, but we're describing it in dt. It's not an option for the >>> particular device, yet it appears in the device's dt. >> This one is easy - opp_enable/disable as discussed in >> http://marc.info/?l=linux-pm&m=137528631125365&w=2 should probably help. > > Wrong link? There's no reference to opp_enable or opp_disable... Darned! sorry about that.. I think this is what I wanted to point yesterday. http://marc.info/?l=linux-pm&m=137528603225295&w=2 > >>> >>> For opp4, it's even worse, as you're suggesting we describe an option >>> for the device that requires the driver to use some additional platform >>> knowledge to come to the conclusion that it cannot use. That sounds like >> >> Precisely.. it wont have that knowledge and should not need that >> knowledge. See explanation above. >> >> Specific examples: SoC vendors try to squeeze the max out of the chip, >> when voltage values are defined, they need to consider board markets >> that they try to address, pricepoints etc.. too many vectors.. not all >> board manufacturers care to meet SoC vendor requirements as they may not >> care about picking up the full potential of the chip - example - >> usecases on OMAP where ARM is seldom used and max DSP is used (video >> usecases) and others so they use a high performance chip, refuse to >> optimize vdd_mpu rail, dont care too much about higher ARM OPPs. Yeah, I >> could always tell them to hand edit the OPP entries and maintain kernel >> forks, but that is never the right thing to do. > > Sorry, but I still don't follow. > > We seem to be going over two cases, which both feel wrong to me: > > * One SoC used in multiple boards, where on some boards an OPP cannot be > used because some requirement is not met. In this case, the board's > dts (by including the SoC's dtsi) describes something that's not > necessarily usable, and we seem to have no way to describe in the OPP > table that the OPP is not usable for that board. not at the moment at least - at least in the way we have described the OPP in dts today. > > * Performance profiles, in which you have a set of OPP tables for > "performance, "low-power", and whatever else. This arbitrary split > seems like a configuration decision rather than a hardware description > unless there is some hard limit that cannot be detected (e.g. the > processor can function at some arbitrary high speed, but becomes hot > enough to melt something, and there's no temperature sensor to handle > this case dynamically). precisely -> I think I point this out in this thread: http://marc.info/?l=devicetree&m=137535932402560&w=2 > > Have I've misunderstood something? > >> >> >>> device knowledge internal to a driver, not how you describe an instance >>> of a device to an OS. >> >> OPP has never been a device - it is a performance point at which to >> operate a device. I am not sure if we are discussing about phandle >> definition of OPP is an issue or options of operating-point sets is an >> issue now. >> >>> >>> Have I misunderstood something here? >> >> Are you suggesting we have OPP tables per board? > > Yes, for the reasons I give above. Common OPP tabless can easily be > factored into separate include files to allow for arbitrary composition. Hmm.. that could be one other way to do it.. > >> >>> >>>> >>>> DT does have to describe the hardware capability - that was it's entire >>>> intent. operating points are valid configurations where it can be >>>> operated at - and when you have options of configurations you need to >>>> choose from based on the board you are using it on, it still retains >>>> "hardware behavior" aspect. >>> >>> The dt should describe the particular board you're running on. As I see >>> it what you're suggesting is equivalent to describing some hardware in >>> the dt that isn't actually present, then relying on the OS to poke >>> around somewhere else, figure out that the hardware isn't present, and >>> then forget that the dt described it. >> I will buy that eventual dtb should contain some way to choose the OPP >> that the particular board can operate on. >> >> SoC dtsi is what we define, this allows multiple board dts to use them. >> the moment we start defining OPPs per board, all mayhem breaks loose. >> >> SoC dtsi provides options for the SoC to be operated upon, it is like >> saying I have 10 Uarts, but board dts chooses to enable the ones it >> uses. pinctrl we do the same. why cant we do with operating-points as well? > > That's a possibility if we define a standard mechanism for stating OPPs > are unusable (rather than having to probe the device to figure that > out). > I did make a proposal here: http://marc.info/?l=devicetree&m=137530056230441&w=2 Do you see it making sense, If yes, I can help flesh out the idea with code.
On 08/01/2013 06:15 AM, Nishanth Menon wrote: > On 15:51-20130731, Stephen Warren wrote: >> On 07/31/2013 08:46 AM, Nishanth Menon wrote: >> ... >>> Let me try to explain since SoCs such as OMAP/AM family dont make life >>> trivial :).. >>> >>> An legacy example[1][2] >>> >>> SoC DM explains that the chip is capable of X opps: >>> opp1, 2 - for all devices >>> opp1,2, 3 - if efuse bit X@y is set >>> opp1,2,3,4 - if efuse bit X@y is set AND Board design meets SoC vendors >>> requirements (including additional features A, B is enabled). >> >> Hopefully the text "board design meets SoC vendors requirements" means >> something like "the board has a big fan capable of dissipating a lot of >> heat" and not "the board manufacturer paid us a lot of money to license >> the 'go faster' feature". The former could well be suitable to represent >> in DT, the latter not. > > Nope, these are technical requirements. In my company, these > guidelines are called Power Distribution Network guideline - which > control the IRDrop within reasonable limit, running DPLLs are varied > frequencies have different current draw characteristics, such that noise > limits, cleanup capacitors etc. For boards that dont care too much about > higher frequencies, they tend to skimp a little on caps and board > routing constraints to save on BOM (Bill Of Materials) cost. > > I am sure similar constraints do exist in other SoC vendors as well. Great, sounds good then. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/01/2013 07:54 AM, Mark Rutland wrote: ... > We seem to be going over two cases, which both feel wrong to me: > > * One SoC used in multiple boards, where on some boards an OPP cannot be > used because some requirement is not met. In this case, the board's > dts (by including the SoC's dtsi) describes something that's not > necessarily usable, and we seem to have no way to describe in the OPP > table that the OPP is not usable for that board. There are probably a lot of examples of this already. For example, for pinctrl, people often want the SoC .dtsi file to include "pin configuration nodes" (see Documentation/devicetree/bindings/pinctrl/pinctrl-bindings.txt) for many common pinmux configurations in the SoC .dtsi file, so that board files can simply refer to the already-existing nodes rather than having to write everything from scratch. Obviously, not all common configurations are used by every board. ... >> Are you suggesting we have OPP tables per board? > > Yes, for the reasons I give above. Common OPP tabless can easily be > factored into separate include files to allow for arbitrary composition. Using separate include files sounds like a reasonable idea. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 01, 2013 at 05:25:06PM +0100, Nishanth Menon wrote: > On 08/01/2013 08:54 AM, Mark Rutland wrote: > > On Wed, Jul 31, 2013 at 05:27:39PM +0100, Nishanth Menon wrote: > >> On 07/31/2013 11:11 AM, Mark Rutland wrote: > >>> On Wed, Jul 31, 2013 at 04:58:22PM +0100, Nishanth Menon wrote: > >>>> On 07/31/2013 10:29 AM, Mark Rutland wrote: > >>>>> On Wed, Jul 31, 2013 at 03:46:34PM +0100, Nishanth Menon wrote: > >>>>>> On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: > >>>>>>> On 30/07/13 21:48, Nishanth Menon wrote: > >>>>>>>> On 07/30/2013 01:34 PM, Stephen Warren wrote: > >>>>>>>>> On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: > >>>>>>>>>> From: Sudeep KarkadaNagesha <sudeep.karkadanagesha@arm.com> > >>>>>>>>>> > >>>>>>>>>> If more than one similar devices share the same OPPs, currently we > >>>>>>>>>> need to replicate the OPP entries in all the nodes. > >>>>>>>>>> > >>>>>>>>>> Few drivers like cpufreq depend on physical cpu0 node to specify the > >>>>>>>>>> OPPs and only that node is referred irrespective of the logical cpu > >>>>>>>>>> accessing it. Alternatively to support cpuhotplug path, few drivers > >>>>>>>>>> parse all the cpu nodes for OPPs. Instead we can specify the phandle > >>>>>>>>>> of the node with which the current node shares the operating points. > >>>>>>>>>> > >>>>>>>>>> This patch adds support to specify the phandle in the operating points > >>>>>>>>>> of any device node, where the node specified by the phandle holds the > >>>>>>>>>> actual OPPs. > >>>>>>>>> > >>>>>>>>>> diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/power/opp.txt > >>>>>>>>> > >>>>>>>>>> +Optional properties: > >>>>>>>>>> +- operating-points-phandle: phandle to the device node with which this > >>>>>>>>> > >>>>>>>>> That's a funny name. Bikeshedding a bit, how about shared-operating-points? > >>>>>>>>> > >>>>>>>>> I haven't thought at all about whether this change conceptually makes sense. > >>>>>>>>> > >>>>>>>> > >>>>>>>> They may not really be shared- we could have phandle list even. one > >>>>>>>> might have optional OPP sets for a chip family that one may - I was > >>>>>>>> about to suggest something similar to pinctrl > >>>>>>>> > >>>>>>> I am not sure if I follow you here, if each chip family has its unique > >>>>>>> set of OPPs, why do we need to represent all of them together ? > >>>>>>> IIUC you are thinking about having these in include dts file, used by > >>>>>>> multiple chip/board dts. > >>>>>>> > >>>>>>>> operating-points-names = "default", "performance", "cheapboard-config" ;) > >>>>>>>> operating-points-0 = <&...> > >>>>>>>> operating-points-1 = <&...> > >>>>>>>> operating-points-2 = <&...> > >>>>>>>> > >>>>>>> This looks more like a PM policy. > >>>>>> > >>>>>> Let me try to explain since SoCs such as OMAP/AM family dont make life > >>>>>> trivial :).. > >>>>>> > >>>>>> An legacy example[1][2] > >>>>>> > >>>>>> SoC DM explains that the chip is capable of X opps: > >>>>>> opp1, 2 - for all devices > >>>>>> opp1,2, 3 - if efuse bit X@y is set > >>>>>> opp1,2,3,4 - if efuse bit X@y is set AND Board design meets SoC vendors > >>>>>> requirements (including additional features A, B is enabled). > >>>>>> > >>>>>> So, the same chip family has a hardware feature - not just as a pm > >>>>>> policy of selecting among a set of OPPs which opp to work on, but the > >>>>>> actual set of OPPs are actually options in themselves that is selected > >>>>>> based on board's SoC selection. > >>>>> > >>>>> This sounds like we're describing a set of features not applicable to > >>>>> the device, then removing them, rather than only describing those > >>>>> features applicable to the device. If you have to probe to figure out > >>>>> which values in the dt are applicable, I'm not sure I see the benefit of > >>>>> describing said values in dt. > >>>> > >>>> Device has *options* of operating points sets it can operate at. It is > >>>> not like "these are not applicable" for the device. > >>> > >>> I don't follow. > >>> > >>> In the example above, if efuse bit X@y is not set, opp3 is not > >>> applicable, but we're describing it in dt. It's not an option for the > >>> particular device, yet it appears in the device's dt. > >> This one is easy - opp_enable/disable as discussed in > >> http://marc.info/?l=linux-pm&m=137528631125365&w=2 should probably help. > > > > Wrong link? There's no reference to opp_enable or opp_disable... > > Darned! sorry about that.. I think this is what I wanted to point yesterday. > > http://marc.info/?l=linux-pm&m=137528603225295&w=2 Cheers. > > > > >>> > >>> For opp4, it's even worse, as you're suggesting we describe an option > >>> for the device that requires the driver to use some additional platform > >>> knowledge to come to the conclusion that it cannot use. That sounds like > >> > >> Precisely.. it wont have that knowledge and should not need that > >> knowledge. See explanation above. > >> > >> Specific examples: SoC vendors try to squeeze the max out of the chip, > >> when voltage values are defined, they need to consider board markets > >> that they try to address, pricepoints etc.. too many vectors.. not all > >> board manufacturers care to meet SoC vendor requirements as they may not > >> care about picking up the full potential of the chip - example - > >> usecases on OMAP where ARM is seldom used and max DSP is used (video > >> usecases) and others so they use a high performance chip, refuse to > >> optimize vdd_mpu rail, dont care too much about higher ARM OPPs. Yeah, I > >> could always tell them to hand edit the OPP entries and maintain kernel > >> forks, but that is never the right thing to do. > > > > Sorry, but I still don't follow. > > > > We seem to be going over two cases, which both feel wrong to me: > > > > * One SoC used in multiple boards, where on some boards an OPP cannot be > > used because some requirement is not met. In this case, the board's > > dts (by including the SoC's dtsi) describes something that's not > > necessarily usable, and we seem to have no way to describe in the OPP > > table that the OPP is not usable for that board. > > not at the moment at least - at least in the way we have described the > OPP in dts today. Ok. > > > > > * Performance profiles, in which you have a set of OPP tables for > > "performance, "low-power", and whatever else. This arbitrary split > > seems like a configuration decision rather than a hardware description > > unless there is some hard limit that cannot be detected (e.g. the > > processor can function at some arbitrary high speed, but becomes hot > > enough to melt something, and there's no temperature sensor to handle > > this case dynamically). > > precisely -> I think I point this out in this thread: > http://marc.info/?l=devicetree&m=137535932402560&w=2 The one thing I don't like is the arbitrary grouping into profiles, as the division is entirely a configuration decision. The operating points themselves are a hardware capability, and it may make sense to describe the feasible points for a device in the dt, but I don't want to have different profiles exported because it straddles the line of the dt telling us how to use the hardware rather than what the hardware is, and will come back to bite us later if we want to handle cpu frequency decisions differently. I also have a suspicion that this will end up having a subset of sane values, and Linux won't necessarily be able to do any interpolation of values without additional platform knowledge. > > > > > Have I've misunderstood something? > > > >> > >> > >>> device knowledge internal to a driver, not how you describe an instance > >>> of a device to an OS. > >> > >> OPP has never been a device - it is a performance point at which to > >> operate a device. I am not sure if we are discussing about phandle > >> definition of OPP is an issue or options of operating-point sets is an > >> issue now. > >> > >>> > >>> Have I misunderstood something here? > >> > >> Are you suggesting we have OPP tables per board? > > > > Yes, for the reasons I give above. Common OPP tabless can easily be > > factored into separate include files to allow for arbitrary composition. > > Hmm.. that could be one other way to do it.. > > > > > >> > >>> > >>>> > >>>> DT does have to describe the hardware capability - that was it's entire > >>>> intent. operating points are valid configurations where it can be > >>>> operated at - and when you have options of configurations you need to > >>>> choose from based on the board you are using it on, it still retains > >>>> "hardware behavior" aspect. > >>> > >>> The dt should describe the particular board you're running on. As I see > >>> it what you're suggesting is equivalent to describing some hardware in > >>> the dt that isn't actually present, then relying on the OS to poke > >>> around somewhere else, figure out that the hardware isn't present, and > >>> then forget that the dt described it. > >> I will buy that eventual dtb should contain some way to choose the OPP > >> that the particular board can operate on. > >> > >> SoC dtsi is what we define, this allows multiple board dts to use them. > >> the moment we start defining OPPs per board, all mayhem breaks loose. > >> > >> SoC dtsi provides options for the SoC to be operated upon, it is like > >> saying I have 10 Uarts, but board dts chooses to enable the ones it > >> uses. pinctrl we do the same. why cant we do with operating-points as well? > > > > That's a possibility if we define a standard mechanism for stating OPPs > > are unusable (rather than having to probe the device to figure that > > out). > > > > I did make a proposal here: > http://marc.info/?l=devicetree&m=137530056230441&w=2 > > Do you see it making sense, If yes, I can help flesh out the idea with code. I can see one of the example mechanisms working for describing OPPs being usable, but I'm still concerned about the division into profiles. Thanks, Mark. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 01/08/13 17:49, Stephen Warren wrote: > On 08/01/2013 07:54 AM, Mark Rutland wrote: > ... >> We seem to be going over two cases, which both feel wrong to me: >> >> * One SoC used in multiple boards, where on some boards an OPP cannot be >> used because some requirement is not met. In this case, the board's >> dts (by including the SoC's dtsi) describes something that's not >> necessarily usable, and we seem to have no way to describe in the OPP >> table that the OPP is not usable for that board. > > There are probably a lot of examples of this already. For example, for > pinctrl, people often want the SoC .dtsi file to include "pin > configuration nodes" (see > Documentation/devicetree/bindings/pinctrl/pinctrl-bindings.txt) for many > common pinmux configurations in the SoC .dtsi file, so that board files > can simply refer to the already-existing nodes rather than having to > write everything from scratch. Obviously, not all common configurations > are used by every board. > > ... Agreed, but I am not convinced with the comparison(pinmux and OPPs). The main concern I have is that if some developer wants to experiment with various configurations provided by SoC(e.g. I have seen some SoC where the pinmux have multiple functions and you can chose one of them) But that's not true with OPPs, if someone experiments with wrong OPP profile, then it might damage the board permanently. Regards, Sudeep -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 14:43-20130802, Sudeep KarkadaNagesha wrote: > On 01/08/13 17:49, Stephen Warren wrote: > > On 08/01/2013 07:54 AM, Mark Rutland wrote: > > ... > >> We seem to be going over two cases, which both feel wrong to me: > >> > >> * One SoC used in multiple boards, where on some boards an OPP cannot be > >> used because some requirement is not met. In this case, the board's > >> dts (by including the SoC's dtsi) describes something that's not > >> necessarily usable, and we seem to have no way to describe in the OPP > >> table that the OPP is not usable for that board. > > > > There are probably a lot of examples of this already. For example, for > > pinctrl, people often want the SoC .dtsi file to include "pin > > configuration nodes" (see > > Documentation/devicetree/bindings/pinctrl/pinctrl-bindings.txt) for many > > common pinmux configurations in the SoC .dtsi file, so that board files > > can simply refer to the already-existing nodes rather than having to > > write everything from scratch. Obviously, not all common configurations > > are used by every board. > > > > ... > Agreed, but I am not convinced with the comparison(pinmux and OPPs). > The main concern I have is that if some developer wants to experiment > with various configurations provided by SoC(e.g. I have seen some SoC > where the pinmux have multiple functions and you can chose one of them) > But that's not true with OPPs, if someone experiments with wrong OPP > profile, then it might damage the board permanently. Even today, nothing prevents folks from "adding custom OPPs" at their own personal risk here - We have seen folks do this as part of board files - Now, for that matter, there is nothing that prevents folks linking the wrong LDO or setting wrong LDO voltage and damaging the board either. I mean, at the level where we "describe" the hardware and it's operation, you cannot be idiot or experimenter-proof - If these were to be considered paramount and prevent us from choosing the right concept that is needed for as many SoCs as possible, it'd be a shame.
On 14:15-20130802, Mark Rutland wrote: > On Thu, Aug 01, 2013 at 05:25:06PM +0100, Nishanth Menon wrote: > > On 08/01/2013 08:54 AM, Mark Rutland wrote: > > > On Wed, Jul 31, 2013 at 05:27:39PM +0100, Nishanth Menon wrote: > > >> On 07/31/2013 11:11 AM, Mark Rutland wrote: > > >>> On Wed, Jul 31, 2013 at 04:58:22PM +0100, Nishanth Menon wrote: > > >>>> On 07/31/2013 10:29 AM, Mark Rutland wrote: > > >>>>> On Wed, Jul 31, 2013 at 03:46:34PM +0100, Nishanth Menon wrote: > > >>>>>> On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: > > >>>>>>> On 30/07/13 21:48, Nishanth Menon wrote: > > >>>>>>>> On 07/30/2013 01:34 PM, Stephen Warren wrote: > > >>>>>>>>> On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: [...] > > > > > > > > * Performance profiles, in which you have a set of OPP tables for > > > "performance, "low-power", and whatever else. This arbitrary split > > > seems like a configuration decision rather than a hardware description > > > unless there is some hard limit that cannot be detected (e.g. the > > > processor can function at some arbitrary high speed, but becomes hot > > > enough to melt something, and there's no temperature sensor to handle > > > this case dynamically). > > > > precisely -> I think I point this out in this thread: > > http://marc.info/?l=devicetree&m=137535932402560&w=2 > > The one thing I don't like is the arbitrary grouping into profiles, as > the division is entirely a configuration decision. The operating points > themselves are a hardware capability, and it may make sense to describe > the feasible points for a device in the dt, but I don't want to have > different profiles exported because it straddles the line of the dt > telling us how to use the hardware rather than what the hardware is, and > will come back to bite us later if we want to handle cpu frequency > decisions differently. I can understand why it seems to wrongly indicate *how* to use the hardware, rather than *what the hardware is* - Lets try it this way: - if Bit X is set in efuse, one cannot use high performance mode - If PDN (Power Distribution Network) guidelines are not met, one cannot use high performance mode. These constrain *hardware capability* you can do on that SoC+Board combination - that is exactly what we have been struggling to describe here. These are not *how to use hardware* profiles, but *hardware capability* profiles whose selection is upto to the System in discussion - example - SoC x will decide on bit based decision and forbid Board file overrides while an SoC y family might choose another path.. Framework and dts should not dictate policy and we dont try to do that here. How to use the hardware within the *capability costraints* is upto drivers, there is no attempt to define that in my proposal. > I also have a suspicion that this will end up having a subset of sane > values, and Linux won't necessarily be able to do any interpolation of > values without additional platform knowledge. These may be SoC dependent. An fixed regulator might be acceptable on SoC x, but not on y (as leakage angle might make it infeasible). [...] > > I did make a proposal here: > > http://marc.info/?l=devicetree&m=137530056230441&w=2 > > > > Do you see it making sense, If yes, I can help flesh out the idea with code. > > I can see one of the example mechanisms working for describing OPPs > being usable, but I'm still concerned about the division into profiles. Above. Does this make sense? I understand and share the concern about potential misuse, but I am all open ears of how we'd like to ensure data is completely separated out from kernel source.
On Tue, Aug 06, 2013 at 02:45:34PM +0100, Nishanth Menon wrote: > On 14:15-20130802, Mark Rutland wrote: > > On Thu, Aug 01, 2013 at 05:25:06PM +0100, Nishanth Menon wrote: > > > On 08/01/2013 08:54 AM, Mark Rutland wrote: > > > > On Wed, Jul 31, 2013 at 05:27:39PM +0100, Nishanth Menon wrote: > > > >> On 07/31/2013 11:11 AM, Mark Rutland wrote: > > > >>> On Wed, Jul 31, 2013 at 04:58:22PM +0100, Nishanth Menon wrote: > > > >>>> On 07/31/2013 10:29 AM, Mark Rutland wrote: > > > >>>>> On Wed, Jul 31, 2013 at 03:46:34PM +0100, Nishanth Menon wrote: > > > >>>>>> On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: > > > >>>>>>> On 30/07/13 21:48, Nishanth Menon wrote: > > > >>>>>>>> On 07/30/2013 01:34 PM, Stephen Warren wrote: > > > >>>>>>>>> On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: > [...] > > > > > > > > > > > * Performance profiles, in which you have a set of OPP tables for > > > > "performance, "low-power", and whatever else. This arbitrary split > > > > seems like a configuration decision rather than a hardware description > > > > unless there is some hard limit that cannot be detected (e.g. the > > > > processor can function at some arbitrary high speed, but becomes hot > > > > enough to melt something, and there's no temperature sensor to handle > > > > this case dynamically). > > > > > > precisely -> I think I point this out in this thread: > > > http://marc.info/?l=devicetree&m=137535932402560&w=2 > > > > The one thing I don't like is the arbitrary grouping into profiles, as > > the division is entirely a configuration decision. The operating points > > themselves are a hardware capability, and it may make sense to describe > > the feasible points for a device in the dt, but I don't want to have > > different profiles exported because it straddles the line of the dt > > telling us how to use the hardware rather than what the hardware is, and > > will come back to bite us later if we want to handle cpu frequency > > decisions differently. > > I can understand why it seems to wrongly indicate *how* to use the > hardware, rather than *what the hardware is* - Lets try it this way: > - if Bit X is set in efuse, one cannot use high performance mode > - If PDN (Power Distribution Network) guidelines are not met, one cannot > use high performance mode. > > These constrain *hardware capability* you can do on that SoC+Board > combination - that is exactly what we have been struggling to describe > here. These are not *how to use hardware* profiles, but *hardware > capability* profiles whose selection is upto to the System in > discussion - example - SoC x will decide on bit based decision and > forbid Board file overrides while an SoC y family might choose another > path.. Framework and dts should not dictate policy and we dont try to > do that here. > > How to use the hardware within the *capability costraints* is upto > drivers, there is no attempt to define that in my proposal. I'm happy to have the OPPs, as your arguments certainly make sense. My only concern is that if we have them grouped in some fashion in dt (e.g. profiles), people will use this as configuration, treating the groups of OPPs differnetly (prefering a 'performance' or 'low-power' profile). I'd prefer that any decision on how to use the provided OPP values were done in the kernel dynamically. I suspect even if we remove profile names, people will attempt to read some semantics into the groupings. For that reason, I'd prefer to have a single OPP table for any device (though this table could be shared by devices). Thanks, Mark. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 07/08/13 17:17, Mark Rutland wrote: > On Tue, Aug 06, 2013 at 02:45:34PM +0100, Nishanth Menon wrote: >> On 14:15-20130802, Mark Rutland wrote: >>> On Thu, Aug 01, 2013 at 05:25:06PM +0100, Nishanth Menon wrote: >>>> On 08/01/2013 08:54 AM, Mark Rutland wrote: >>>>> On Wed, Jul 31, 2013 at 05:27:39PM +0100, Nishanth Menon wrote: >>>>>> On 07/31/2013 11:11 AM, Mark Rutland wrote: >>>>>>> On Wed, Jul 31, 2013 at 04:58:22PM +0100, Nishanth Menon wrote: >>>>>>>> On 07/31/2013 10:29 AM, Mark Rutland wrote: >>>>>>>>> On Wed, Jul 31, 2013 at 03:46:34PM +0100, Nishanth Menon wrote: >>>>>>>>>> On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: >>>>>>>>>>> On 30/07/13 21:48, Nishanth Menon wrote: >>>>>>>>>>>> On 07/30/2013 01:34 PM, Stephen Warren wrote: >>>>>>>>>>>>> On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: >> [...] >>>> >>>>> >>>>> * Performance profiles, in which you have a set of OPP tables for >>>>> "performance, "low-power", and whatever else. This arbitrary split >>>>> seems like a configuration decision rather than a hardware description >>>>> unless there is some hard limit that cannot be detected (e.g. the >>>>> processor can function at some arbitrary high speed, but becomes hot >>>>> enough to melt something, and there's no temperature sensor to handle >>>>> this case dynamically). >>>> >>>> precisely -> I think I point this out in this thread: >>>> http://marc.info/?l=devicetree&m=137535932402560&w=2 >>> >>> The one thing I don't like is the arbitrary grouping into profiles, as >>> the division is entirely a configuration decision. The operating points >>> themselves are a hardware capability, and it may make sense to describe >>> the feasible points for a device in the dt, but I don't want to have >>> different profiles exported because it straddles the line of the dt >>> telling us how to use the hardware rather than what the hardware is, and >>> will come back to bite us later if we want to handle cpu frequency >>> decisions differently. >> >> I can understand why it seems to wrongly indicate *how* to use the >> hardware, rather than *what the hardware is* - Lets try it this way: >> - if Bit X is set in efuse, one cannot use high performance mode >> - If PDN (Power Distribution Network) guidelines are not met, one cannot >> use high performance mode. >> >> These constrain *hardware capability* you can do on that SoC+Board >> combination - that is exactly what we have been struggling to describe >> here. These are not *how to use hardware* profiles, but *hardware >> capability* profiles whose selection is upto to the System in >> discussion - example - SoC x will decide on bit based decision and >> forbid Board file overrides while an SoC y family might choose another >> path.. Framework and dts should not dictate policy and we dont try to >> do that here. >> >> How to use the hardware within the *capability costraints* is upto >> drivers, there is no attempt to define that in my proposal. > > I'm happy to have the OPPs, as your arguments certainly make sense. My > only concern is that if we have them grouped in some fashion in dt (e.g. > profiles), people will use this as configuration, treating the groups of > OPPs differnetly (prefering a 'performance' or 'low-power' profile). I'd > prefer that any decision on how to use the provided OPP values were done > in the kernel dynamically. > > I suspect even if we remove profile names, people will attempt to read > some semantics into the groupings. For that reason, I'd prefer to have a > single OPP table for any device (though this table could be shared by > devices). > Until we get more feedback and agreement on new proposal can we have this simple amendment in this patch to the existing binding ? Since the new proposal[1] is backward compatible(this patch adding support for option#5 to existing option#1), we will have to add support for other binding options in [1] later. This is needed to support shared OPPs with simple/single OPP profile and also to fix the broken and unused binding @Documentation/devicetree/bindings/cpufreq/arm_big_little_dt.txt Regards, Sudeep [1] http://www.spinics.net/lists/cpufreq/msg06563.html -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/20/2013 05:00 AM, Sudeep KarkadaNagesha wrote: > On 07/08/13 17:17, Mark Rutland wrote: >> On Tue, Aug 06, 2013 at 02:45:34PM +0100, Nishanth Menon wrote: >>> On 14:15-20130802, Mark Rutland wrote: >>>> On Thu, Aug 01, 2013 at 05:25:06PM +0100, Nishanth Menon wrote: >>>>> On 08/01/2013 08:54 AM, Mark Rutland wrote: >>>>>> On Wed, Jul 31, 2013 at 05:27:39PM +0100, Nishanth Menon wrote: >>>>>>> On 07/31/2013 11:11 AM, Mark Rutland wrote: >>>>>>>> On Wed, Jul 31, 2013 at 04:58:22PM +0100, Nishanth Menon wrote: >>>>>>>>> On 07/31/2013 10:29 AM, Mark Rutland wrote: >>>>>>>>>> On Wed, Jul 31, 2013 at 03:46:34PM +0100, Nishanth Menon wrote: >>>>>>>>>>> On 07/31/2013 06:14 AM, Sudeep KarkadaNagesha wrote: >>>>>>>>>>>> On 30/07/13 21:48, Nishanth Menon wrote: >>>>>>>>>>>>> On 07/30/2013 01:34 PM, Stephen Warren wrote: >>>>>>>>>>>>>> On 07/30/2013 12:00 PM, Sudeep KarkadaNagesha wrote: >>> [...] >>>>> >>>>>> >>>>>> * Performance profiles, in which you have a set of OPP tables for >>>>>> "performance, "low-power", and whatever else. This arbitrary split >>>>>> seems like a configuration decision rather than a hardware description >>>>>> unless there is some hard limit that cannot be detected (e.g. the >>>>>> processor can function at some arbitrary high speed, but becomes hot >>>>>> enough to melt something, and there's no temperature sensor to handle >>>>>> this case dynamically). >>>>> >>>>> precisely -> I think I point this out in this thread: >>>>> http://marc.info/?l=devicetree&m=137535932402560&w=2 >>>> >>>> The one thing I don't like is the arbitrary grouping into profiles, as >>>> the division is entirely a configuration decision. The operating points >>>> themselves are a hardware capability, and it may make sense to describe >>>> the feasible points for a device in the dt, but I don't want to have >>>> different profiles exported because it straddles the line of the dt >>>> telling us how to use the hardware rather than what the hardware is, and >>>> will come back to bite us later if we want to handle cpu frequency >>>> decisions differently. >>> >>> I can understand why it seems to wrongly indicate *how* to use the >>> hardware, rather than *what the hardware is* - Lets try it this way: >>> - if Bit X is set in efuse, one cannot use high performance mode >>> - If PDN (Power Distribution Network) guidelines are not met, one cannot >>> use high performance mode. >>> >>> These constrain *hardware capability* you can do on that SoC+Board >>> combination - that is exactly what we have been struggling to describe >>> here. These are not *how to use hardware* profiles, but *hardware >>> capability* profiles whose selection is upto to the System in >>> discussion - example - SoC x will decide on bit based decision and >>> forbid Board file overrides while an SoC y family might choose another >>> path.. Framework and dts should not dictate policy and we dont try to >>> do that here. >>> >>> How to use the hardware within the *capability costraints* is upto >>> drivers, there is no attempt to define that in my proposal. >> >> I'm happy to have the OPPs, as your arguments certainly make sense. My >> only concern is that if we have them grouped in some fashion in dt (e.g. >> profiles), people will use this as configuration, treating the groups of >> OPPs differnetly (prefering a 'performance' or 'low-power' profile). I'd >> prefer that any decision on how to use the provided OPP values were done >> in the kernel dynamically. >> >> I suspect even if we remove profile names, people will attempt to read >> some semantics into the groupings. For that reason, I'd prefer to have a >> single OPP table for any device (though this table could be shared by >> devices). >> > > Until we get more feedback and agreement on new proposal can we have > this simple amendment in this patch to the existing binding ? Since the > new proposal[1] is backward compatible(this patch adding support for > option#5 to existing option#1), we will have to add support for other > binding options in [1] later. > > This is needed to support shared OPPs with simple/single OPP profile > and also to fix the broken and unused binding > @Documentation/devicetree/bindings/cpufreq/arm_big_little_dt.txt > > Regards, > Sudeep > > [1] http://www.spinics.net/lists/cpufreq/msg06563.html Could you post a non-RFC version of this series? As I had mentioned earlier in the thread, I dont mind having this pulled in as stage 1 of the transition to a more elaborate solution.
On 20/08/13 15:01, Nishanth Menon wrote: > On 08/20/2013 05:00 AM, Sudeep KarkadaNagesha wrote: [...] >> Until we get more feedback and agreement on new proposal can we have >> this simple amendment in this patch to the existing binding ? Since the >> new proposal[1] is backward compatible(this patch adding support for >> option#5 to existing option#1), we will have to add support for other >> binding options in [1] later. >> >> This is needed to support shared OPPs with simple/single OPP profile >> and also to fix the broken and unused binding >> @Documentation/devicetree/bindings/cpufreq/arm_big_little_dt.txt >> >> Regards, >> Sudeep >> >> [1] http://www.spinics.net/lists/cpufreq/msg06563.html > > > Could you post a non-RFC version of this series? As I had mentioned > earlier in the thread, I dont mind having this pulled in as stage 1 of > the transition to a more elaborate solution. > Thanks Nishanth. I would wait till Mark and Stephen agree for this approach. Regards, Sudeep -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 08/20/2013 04:00 AM, Sudeep KarkadaNagesha wrote: ... > Until we get more feedback and agreement on new proposal can we have > this simple amendment in this patch to the existing binding ? Since the > new proposal[1] is backward compatible(this patch adding support for > option#5 to existing option#1), we will have to add support for other > binding options in [1] later. > > This is needed to support shared OPPs with simple/single OPP profile > and also to fix the broken and unused binding > @Documentation/devicetree/bindings/cpufreq/arm_big_little_dt.txt > > Regards, > Sudeep > > [1] http://www.spinics.net/lists/cpufreq/msg06563.html Presumably the desire for cpu1's node to say "go look at cpu0's node for OPP" is because they share OPPs. Don't they share OPPs because they are parts of the same device - that device being the CPU complex. As such, why not define the OPPs in /cpus rather than in each of /cpus/cpuN? Of course, that doesn't help if there are separate CPU and GPU nodes that just happen to have the same set of OPPs and you want to share them to save DT space. Is that at all likely? I'd suggest/bike-shed that operating-points-device is not the correct property name; it somehow implies that the other device actively defines the OPPs for this device, rather than just happening to have the same OPPs. Perhaps "operating-points-identical-to"? -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Aug 21, 2013 at 11:48:12PM +0100, Stephen Warren wrote: > On 08/20/2013 04:00 AM, Sudeep KarkadaNagesha wrote: > ... > > Until we get more feedback and agreement on new proposal can we have > > this simple amendment in this patch to the existing binding ? Since the > > new proposal[1] is backward compatible(this patch adding support for > > option#5 to existing option#1), we will have to add support for other > > binding options in [1] later. > > > > This is needed to support shared OPPs with simple/single OPP profile > > and also to fix the broken and unused binding > > @Documentation/devicetree/bindings/cpufreq/arm_big_little_dt.txt > > > > Regards, > > Sudeep > > > > [1] http://www.spinics.net/lists/cpufreq/msg06563.html > > Presumably the desire for cpu1's node to say "go look at cpu0's node for > OPP" is because they share OPPs. Don't they share OPPs because they are > parts of the same device - that device being the CPU complex. As such, > why not define the OPPs in /cpus rather than in each of /cpus/cpuN? I'd very much like for it to be possible to factor out common properties into the /cpus node, but it should follow the ePAPR recommendation fo being treated as a fallback if not present in a particular /cpus/cpu@N node -- that way we can handle clusters with differing OPPs. The property might just be a phandle to a table node, but it should be possible to make it common. > > Of course, that doesn't help if there are separate CPU and GPU nodes > that just happen to have the same set of OPPs and you want to share them > to save DT space. Is that at all likely? I suspect that the OPPs for CPUs and GPUs are likely to be quite distict, and they are logically separate regardless. I'm not averse to sharing of tables if we can handle them in a standard fashion. > > I'd suggest/bike-shed that operating-points-device is not the correct > property name; it somehow implies that the other device actively defines > the OPPs for this device, rather than just happening to have the same > OPPs. Perhaps "operating-points-identical-to"? > I'd rather not have properties that point elsewhere and say "treat me the same as this node". I'd rather we have common properties as described above. Thanks, Mark. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 22/08/13 12:59, Mark Rutland wrote: > On Wed, Aug 21, 2013 at 11:48:12PM +0100, Stephen Warren wrote: >> On 08/20/2013 04:00 AM, Sudeep KarkadaNagesha wrote: >> ... >>> Until we get more feedback and agreement on new proposal can we have >>> this simple amendment in this patch to the existing binding ? Since the >>> new proposal[1] is backward compatible(this patch adding support for >>> option#5 to existing option#1), we will have to add support for other >>> binding options in [1] later. >>> >>> This is needed to support shared OPPs with simple/single OPP profile >>> and also to fix the broken and unused binding >>> @Documentation/devicetree/bindings/cpufreq/arm_big_little_dt.txt >>> >>> Regards, >>> Sudeep >>> >>> [1] http://www.spinics.net/lists/cpufreq/msg06563.html >> >> Presumably the desire for cpu1's node to say "go look at cpu0's node for >> OPP" is because they share OPPs. Don't they share OPPs because they are >> parts of the same device - that device being the CPU complex. As such, >> why not define the OPPs in /cpus rather than in each of /cpus/cpuN? > > I'd very much like for it to be possible to factor out common properties > into the /cpus node, but it should follow the ePAPR recommendation fo > being treated as a fallback if not present in a particular /cpus/cpu@N > node -- that way we can handle clusters with differing OPPs. The > property might just be a phandle to a table node, but it should be > possible to make it common. > Yes we can have this fallback mechanism, but only from cpu devices(OPP library handles non-cpu devices too). Referring the table node, I have a generic question on DT nodes. Does each DT node have to represent a unique device ? If so having a property common to one/more devices in a separate node doesn't sound correct. >> >> Of course, that doesn't help if there are separate CPU and GPU nodes >> that just happen to have the same set of OPPs and you want to share them >> to save DT space. Is that at all likely? > > I suspect that the OPPs for CPUs and GPUs are likely to be quite > distict, and they are logically separate regardless. I'm not averse to > sharing of tables if we can handle them in a standard fashion. > IMO sharing OPPs just for saving DT space might lead to confusions(no strong opinion though). >> >> I'd suggest/bike-shed that operating-points-device is not the correct >> property name; it somehow implies that the other device actively defines >> the OPPs for this device, rather than just happening to have the same >> OPPs. Perhaps "operating-points-identical-to"? >> > > I'd rather not have properties that point elsewhere and say "treat me > the same as this node". I'd rather we have common properties as > described above. > Agreed, but for platforms with multiple CPU clusters, since we have only one /cpus node, we ned to have table node which is arguable if node has represent a device(as mentioned above) Regards, Sudeep -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 22, 2013 at 04:32:10PM +0100, Sudeep KarkadaNagesha wrote: > On 22/08/13 12:59, Mark Rutland wrote: > > On Wed, Aug 21, 2013 at 11:48:12PM +0100, Stephen Warren wrote: > >> On 08/20/2013 04:00 AM, Sudeep KarkadaNagesha wrote: > >> ... > >>> Until we get more feedback and agreement on new proposal can we have > >>> this simple amendment in this patch to the existing binding ? Since the > >>> new proposal[1] is backward compatible(this patch adding support for > >>> option#5 to existing option#1), we will have to add support for other > >>> binding options in [1] later. > >>> > >>> This is needed to support shared OPPs with simple/single OPP profile > >>> and also to fix the broken and unused binding > >>> @Documentation/devicetree/bindings/cpufreq/arm_big_little_dt.txt > >>> > >>> Regards, > >>> Sudeep > >>> > >>> [1] http://www.spinics.net/lists/cpufreq/msg06563.html > >> > >> Presumably the desire for cpu1's node to say "go look at cpu0's node for > >> OPP" is because they share OPPs. Don't they share OPPs because they are > >> parts of the same device - that device being the CPU complex. As such, > >> why not define the OPPs in /cpus rather than in each of /cpus/cpuN? > > > > I'd very much like for it to be possible to factor out common properties > > into the /cpus node, but it should follow the ePAPR recommendation fo > > being treated as a fallback if not present in a particular /cpus/cpu@N > > node -- that way we can handle clusters with differing OPPs. The > > property might just be a phandle to a table node, but it should be > > possible to make it common. > > > Yes we can have this fallback mechanism, but only from cpu devices(OPP > library handles non-cpu devices too). Ok. > > Referring the table node, I have a generic question on DT nodes. > Does each DT node have to represent a unique device ? If so having a > property common to one/more devices in a separate node doesn't sound > correct. I believe the answer is almost always yes. There are devices with subnodes that represent some logical portion of the device, but I couldn't think of any free-standing nodes that aren't a device or unique firmware interface. > > >> > >> Of course, that doesn't help if there are separate CPU and GPU nodes > >> that just happen to have the same set of OPPs and you want to share them > >> to save DT space. Is that at all likely? > > > > I suspect that the OPPs for CPUs and GPUs are likely to be quite > > distict, and they are logically separate regardless. I'm not averse to > > sharing of tables if we can handle them in a standard fashion. > > > IMO sharing OPPs just for saving DT space might lead to confusions(no > strong opinion though). Agreed. > > >> > >> I'd suggest/bike-shed that operating-points-device is not the correct > >> property name; it somehow implies that the other device actively defines > >> the OPPs for this device, rather than just happening to have the same > >> OPPs. Perhaps "operating-points-identical-to"? > >> > > > > I'd rather not have properties that point elsewhere and say "treat me > > the same as this node". I'd rather we have common properties as > > described above. > > > Agreed, but for platforms with multiple CPU clusters, since we have only > one /cpus node, we ned to have table node which is arguable if node has > represent a device(as mentioned above) I agree that this seems wasteful of space, but I really don't think that pointing at another device you want the OPPs of is the best way of describing the linkage, and I suspect we'll get all sorts of stupid bugs resulting from that style of binding. Consider the following (properties trimmed for brevity): cpus { cpu0: cpu@0 { operating-points-identical-to = <&cpu1>; }; cpu1: cpu@1 { operating-points = <0 100>, <23 47>, <62 970>; } }; If we boot a UP kernel on the above, I assume we won't read the info for cpu1, and thus we won't get operating points info for cpu0. Worse, what if cpu1 has status="disabled"? Does that make its OPP table invalid? What if the bootloader drops cpu1? I really don't like indirecting linkage to a property through an arbitrary node, simply because it happened to have the same property. It creates an artificial depdendency that will lead to problems. Thanks, Mark. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 22/08/13 16:50, Mark Rutland wrote: > On Thu, Aug 22, 2013 at 04:32:10PM +0100, Sudeep KarkadaNagesha wrote: >> On 22/08/13 12:59, Mark Rutland wrote: >>> On Wed, Aug 21, 2013 at 11:48:12PM +0100, Stephen Warren wrote: [...] >>>> >>>> I'd suggest/bike-shed that operating-points-device is not the correct >>>> property name; it somehow implies that the other device actively defines >>>> the OPPs for this device, rather than just happening to have the same >>>> OPPs. Perhaps "operating-points-identical-to"? >>>> >>> >>> I'd rather not have properties that point elsewhere and say "treat me >>> the same as this node". I'd rather we have common properties as >>> described above. >>> >> Agreed, but for platforms with multiple CPU clusters, since we have only >> one /cpus node, we ned to have table node which is arguable if node has >> represent a device(as mentioned above) > > I agree that this seems wasteful of space, but I really don't think that > pointing at another device you want the OPPs of is the best way of > describing the linkage, and I suspect we'll get all sorts of stupid bugs > resulting from that style of binding. > > Consider the following (properties trimmed for brevity): > > cpus { > cpu0: cpu@0 { > operating-points-identical-to = <&cpu1>; > }; > cpu1: cpu@1 { > operating-points = <0 100>, > <23 47>, > <62 970>; > } > }; > > If we boot a UP kernel on the above, I assume we won't read the info for > cpu1, and thus we won't get operating points info for cpu0. Worse, what > if cpu1 has status="disabled"? Does that make its OPP table invalid? > What if the bootloader drops cpu1? > Another question :), does status property dictate the validity of the other properties in the node(not specific to above example, in general)? But I agree that this indirect linkage is broken as boot-loaders can drop the cpu node holding OPP. IMO fallback method would be reasonable(and much cleaner compared to a separate node) if it can even accommodate multiple cpu clusters. > I really don't like indirecting linkage to a property through an > arbitrary node, simply because it happened to have the same property. It > creates an artificial depdendency that will lead to problems. > I totally agree on this. Regards, Sudeep -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Aug 22, 2013 at 05:28:10PM +0100, Sudeep KarkadaNagesha wrote: > On 22/08/13 16:50, Mark Rutland wrote: > > On Thu, Aug 22, 2013 at 04:32:10PM +0100, Sudeep KarkadaNagesha wrote: > >> On 22/08/13 12:59, Mark Rutland wrote: > >>> On Wed, Aug 21, 2013 at 11:48:12PM +0100, Stephen Warren wrote: > [...] > >>>> > >>>> I'd suggest/bike-shed that operating-points-device is not the correct > >>>> property name; it somehow implies that the other device actively defines > >>>> the OPPs for this device, rather than just happening to have the same > >>>> OPPs. Perhaps "operating-points-identical-to"? > >>>> > >>> > >>> I'd rather not have properties that point elsewhere and say "treat me > >>> the same as this node". I'd rather we have common properties as > >>> described above. > >>> > >> Agreed, but for platforms with multiple CPU clusters, since we have only > >> one /cpus node, we ned to have table node which is arguable if node has > >> represent a device(as mentioned above) > > > > I agree that this seems wasteful of space, but I really don't think that > > pointing at another device you want the OPPs of is the best way of > > describing the linkage, and I suspect we'll get all sorts of stupid bugs > > resulting from that style of binding. > > > > Consider the following (properties trimmed for brevity): > > > > cpus { > > cpu0: cpu@0 { > > operating-points-identical-to = <&cpu1>; > > }; > > cpu1: cpu@1 { > > operating-points = <0 100>, > > <23 47>, > > <62 970>; > > } > > }; > > > > If we boot a UP kernel on the above, I assume we won't read the info for > > cpu1, and thus we won't get operating points info for cpu0. Worse, what > > if cpu1 has status="disabled"? Does that make its OPP table invalid? > > What if the bootloader drops cpu1? > > > Another question :), does status property dictate the validity of the > other properties in the node(not specific to above example, in general)? That depends on what the binding in question describes for a disabled status. In general, if a node is disabled I would imagine that attempting to derive any knowledge from it is not a good idea. > > But I agree that this indirect linkage is broken as boot-loaders can > drop the cpu node holding OPP. > > IMO fallback method would be reasonable(and much cleaner compared to a > separate node) if it can even accommodate multiple cpu clusters. Well, it may be that for separate clusters we just have to describe the information per-cpu... > > > I really don't like indirecting linkage to a property through an > > arbitrary node, simply because it happened to have the same property. It > > creates an artificial depdendency that will lead to problems. > > > I totally agree on this. Cool. Thanks, Mark. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/Documentation/devicetree/bindings/power/opp.txt b/Documentation/devicetree/bindings/power/opp.txt index 74499e5..f66fd65 100644 --- a/Documentation/devicetree/bindings/power/opp.txt +++ b/Documentation/devicetree/bindings/power/opp.txt @@ -4,22 +4,112 @@ SoCs have a standard set of tuples consisting of frequency and voltage pairs that the device will support per voltage domain. These are called Operating Performance Points or OPPs. -Properties: +Required Properties: - operating-points: An array of 2-tuples items, and each item consists of frequency and voltage like <freq-kHz vol-uV>. freq: clock frequency in kHz vol: voltage in microvolt +Optional properties: +- operating-points-phandle: phandle to the device node with which this + device shares the operating points(recommended if multiple + devices are in the same clock domain and share OPPs, as it + avoids replication of OPPs) + Examples: -cpu@0 { - compatible = "arm,cortex-a9"; - reg = <0>; - next-level-cache = <&L2>; - operating-points = < - /* kHz uV */ - 792000 1100000 - 396000 950000 - 198000 850000 - >; -}; +1. A uniprocessor system + + cpu@0 { + compatible = "arm,cortex-a9"; + reg = <0>; + next-level-cache = <&L2>; + operating-points = < + /* kHz uV */ + 792000 1100000 + 396000 950000 + 198000 850000 + >; + }; + +If more than one device of same type share the same OPPs, for example +all the CPUs on a SoC or in a single cluster on a SoC, then we need to +avoid replicating the OPPs in all the nodes. We can specify the phandle +of the node with which the current node shares the operating points instead. + +2. Consider a SMP system with 4 CPUs in the same clock domain. + + cpu0: cpu@0 { + compatible = "arm,cortex-a9"; + reg = <0>; + next-level-cache = <&L2>; + operating-points = < + /* kHz uV */ + 792000 1100000 + 396000 950000 + 198000 850000 + >; + }; + + cpu1: cpu@1 { + compatible = "arm,cortex-a9"; + reg = <1>; + next-level-cache = <&L2>; + operating-points-phandle = <&cpu0>; + }; + + cpu2: cpu@2 { + compatible = "arm,cortex-a9"; + reg = <2>; + next-level-cache = <&L2>; + operating-points-phandle = <&cpu0>; + }; + + cpu3: cpu@3 { + compatible = "arm,cortex-a9"; + reg = <3>; + next-level-cache = <&L2>; + operating-points-phandle = <&cpu0>; + }; + +3. Consider an AMP(asymmetric multi-processor) sytem with 2 clusters of CPUs. + Each cluster has 2 CPUs and all the CPUs within the cluster share the clock + domain. + + cpu0: cpu@0 { + compatible = "arm,cortex-a15"; + reg = <0>; + operating-points = < + /* kHz uV */ + 792000 1100000 + 396000 950000 + 198000 850000 + >; + clock-latency = <100000>; /* 100us */ + }; + + cpu1: cpu@1 { + compatible = "arm,cortex-a15"; + reg = <1>; + clock-latency = <100000>; /* 100us */ + operating-points-phandle = <&cpu0>; + }; + + cpu2: cpu@100 { + compatible = "arm,cortex-a7"; + reg = <100>; + operating-points = < + /* kHz uV */ + 792000 950000 + 396000 750000 + 198000 450000 + >; + clock-latency = <100000>; /* 100us */ + }; + + cpu3: cpu@101 { + compatible = "arm,cortex-a7"; + reg = <101>; + operating-points-phandle = <&cpu2>; + clock-latency = <100000>; /* 100us */ + }; diff --git a/drivers/base/power/opp.c b/drivers/base/power/opp.c index c8ec186..9ac3c93 100644 --- a/drivers/base/power/opp.c +++ b/drivers/base/power/opp.c @@ -707,12 +707,20 @@ struct srcu_notifier_head *opp_get_notifier(struct device *dev) int of_init_opp_table(struct device *dev) { const struct property *prop; + struct device_node *opp_node; const __be32 *val; int nr; - prop = of_find_property(dev->of_node, "operating-points", NULL); - if (!prop) + opp_node = of_parse_phandle(dev->of_node, + "operating-points-phandle", 0); + if (!opp_node) /* if no OPP phandle, search for OPPs in current node */ + opp_node = dev->of_node; + prop = of_find_property(opp_node, "operating-points", NULL); + if (!prop) { + dev_warn(dev, "node %s missing operating-points property\n", + opp_node->full_name); return -ENODEV; + } if (!prop->value) return -ENODATA;