Message ID | 20190705095726.21433-12-niklas.cassel@linaro.org (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | Add support for QCOM Core Power Reduction | expand |
On 05-07-19, 11:57, Niklas Cassel wrote: > diff --git a/arch/arm64/boot/dts/qcom/qcs404.dtsi b/arch/arm64/boot/dts/qcom/qcs404.dtsi > cpu_opp_table: cpu-opp-table { > - compatible = "operating-points-v2"; > + compatible = "operating-points-v2-kryo-cpu"; > opp-shared; > > opp-1094400000 { > opp-hz = /bits/ 64 <1094400000>; > - opp-microvolt = <1224000 1224000 1224000>; > + required-opps = <&cpr_opp1>; > }; > opp-1248000000 { > opp-hz = /bits/ 64 <1248000000>; > - opp-microvolt = <1288000 1288000 1288000>; > + required-opps = <&cpr_opp2>; > }; > opp-1401600000 { > opp-hz = /bits/ 64 <1401600000>; > - opp-microvolt = <1384000 1384000 1384000>; > + required-opps = <&cpr_opp3>; > + }; > + }; > + > + cpr_opp_table: cpr-opp-table { > + compatible = "operating-points-v2-qcom-level"; > + > + cpr_opp1: opp1 { > + opp-level = <1>; > + qcom,opp-fuse-level = <1>; > + opp-hz = /bits/ 64 <1094400000>; > + }; > + cpr_opp2: opp2 { > + opp-level = <2>; > + qcom,opp-fuse-level = <2>; > + opp-hz = /bits/ 64 <1248000000>; > + }; > + cpr_opp3: opp3 { > + opp-level = <3>; > + qcom,opp-fuse-level = <3>; > + opp-hz = /bits/ 64 <1401600000>; > }; > }; - Do we ever have cases more complex than this for this version of CPR ? - What about multiple devices with same CPR table, not big LITTLE CPUs, but other devices like two different type of IO devices ? What will we do with opp-hz in those cases ? - If there are no such cases, can we live without opp-hz being used here and reverse-engineer the highest frequency by looking directly at CPUs OPP table ? i.e. by looking at required-opps field.
On Wed, Jul 10, 2019 at 02:33:03PM +0530, Viresh Kumar wrote: > On 05-07-19, 11:57, Niklas Cassel wrote: > > diff --git a/arch/arm64/boot/dts/qcom/qcs404.dtsi b/arch/arm64/boot/dts/qcom/qcs404.dtsi > > cpu_opp_table: cpu-opp-table { > > - compatible = "operating-points-v2"; > > + compatible = "operating-points-v2-kryo-cpu"; > > opp-shared; > > > > opp-1094400000 { > > opp-hz = /bits/ 64 <1094400000>; > > - opp-microvolt = <1224000 1224000 1224000>; > > + required-opps = <&cpr_opp1>; > > }; > > opp-1248000000 { > > opp-hz = /bits/ 64 <1248000000>; > > - opp-microvolt = <1288000 1288000 1288000>; > > + required-opps = <&cpr_opp2>; > > }; > > opp-1401600000 { > > opp-hz = /bits/ 64 <1401600000>; > > - opp-microvolt = <1384000 1384000 1384000>; > > + required-opps = <&cpr_opp3>; > > + }; > > + }; > > + > > + cpr_opp_table: cpr-opp-table { > > + compatible = "operating-points-v2-qcom-level"; > > + > > + cpr_opp1: opp1 { > > + opp-level = <1>; > > + qcom,opp-fuse-level = <1>; > > + opp-hz = /bits/ 64 <1094400000>; > > + }; > > + cpr_opp2: opp2 { > > + opp-level = <2>; > > + qcom,opp-fuse-level = <2>; > > + opp-hz = /bits/ 64 <1248000000>; > > + }; > > + cpr_opp3: opp3 { > > + opp-level = <3>; > > + qcom,opp-fuse-level = <3>; > > + opp-hz = /bits/ 64 <1401600000>; > > }; > > }; > > - Do we ever have cases more complex than this for this version of CPR ? For e.g. CPR on msm8916, we will have 7 different frequencies in the CPU OPP table, but only 3 OPPs in the CPR OPP table. Each of the 7 OPPs in the CPU OPP table will have a required-opps that points to an OPP in the CPR OPP table. On certain msm8916:s, the speedbin efuse will limit us to only have 6 OPPs in the CPU OPP table, but the required-opps are still the same. So I would say that it is just slightly more complex.. > > - What about multiple devices with same CPR table, not big LITTLE > CPUs, but other devices like two different type of IO devices ? What > will we do with opp-hz in those cases ? On all SoCs where there is a CPR for e.g. GPU, there is an additional CPR hardware block, so then there will also be an additional CPR OPP table. So I don't think that this will be a problem. > > - If there are no such cases, can we live without opp-hz being used > here and reverse-engineer the highest frequency by looking directly > at CPUs OPP table ? i.e. by looking at required-opps field. This was actually my initial thought when talking to you 6+ months ago. However, the problem was that, from the CPR drivers' perspective, it only sees the CPR OPP table. So this is the order things are called, from qcom-cpufreq-nvmem.c perspective: 1) dev_pm_opp_set_supported_hw() 2) dev_pm_opp_attach_genpd() -> which results in int cpr_pd_attach_dev(struct generic_pm_domain *domain, struct device *dev) being called. This callback is inside the CPR driver, and here we have the CPU's (genpd virtual) struct device, and this is where we would like to know the opp-hz. The problem here is that: [ 3.114979] cpr_pd_attach_dev: dev_pm_opp_get_opp_count for dev: genpd:0:cpu0: -19 [ 3.119610] cpr_pd_attach_dev: dev_pm_opp_get_opp_count for dev: cpu0: 0 [ 3.126489] cpr_pd_attach_dev: dev_pm_opp_get_opp_count for dev: cpr@b018000: 3 While we have the CPR OPP table in the attach callback, we don't have the CPU OPP table, neither in the CPU struct device or the genpd virtual struct device. Since we have called dev_pm_opp_attach_genpd(.., .., &virt_devs) which attaches an OPP table to the CPU, I would have expected one of them to be >= 0. Especially since dev_name(virt_devs[0]) == genpd:0:cpu0 I guess it should still be possible to parse the required-opps manually here, by iterating the OF nodes, however, we won't be able to use the CPU's struct opp_table (which is the nice representation of the OF nodes). Any suggestions? Kind regards, Niklas
On 15-07-19, 15:24, Niklas Cassel wrote: > This was actually my initial thought when talking to you 6+ months ago. > However, the problem was that, from the CPR drivers' perspective, it > only sees the CPR OPP table. > > > So this is the order things are called, > from qcom-cpufreq-nvmem.c perspective: > > 1) dev_pm_opp_set_supported_hw() > > 2) dev_pm_opp_attach_genpd() -> > which results in > int cpr_pd_attach_dev(struct generic_pm_domain *domain, > struct device *dev) > being called. > This callback is inside the CPR driver, and here we have the > CPU's (genpd virtual) struct device, and this is where we would like to > know the opp-hz. > The problem here is that: > [ 3.114979] cpr_pd_attach_dev: dev_pm_opp_get_opp_count for dev: genpd:0:cpu0: -19 > [ 3.119610] cpr_pd_attach_dev: dev_pm_opp_get_opp_count for dev: cpu0: 0 > [ 3.126489] cpr_pd_attach_dev: dev_pm_opp_get_opp_count for dev: cpr@b018000: 3 > > While we have the CPR OPP table in the attach callback, we don't > have the CPU OPP table, neither in the CPU struct device or the genpd virtual > struct device. If you can find CPU's physical number from the virtual device, then you can do get_cpu_device(X) and then life will be easy ? > Since we have called dev_pm_opp_attach_genpd(.., .., &virt_devs) which > attaches an OPP table to the CPU, I would have expected one of them to > be >= 0. > Especially since dev_name(virt_devs[0]) == genpd:0:cpu0 > > I guess it should still be possible to parse the required-opps manually here, > by iterating the OF nodes, however, we won't be able to use the CPU's struct > opp_table (which is the nice representation of the OF nodes). > > Any suggestions?
On Tue, Jul 16, 2019 at 04:04:36PM +0530, Viresh Kumar wrote: > On 15-07-19, 15:24, Niklas Cassel wrote: > > This was actually my initial thought when talking to you 6+ months ago. > > However, the problem was that, from the CPR drivers' perspective, it > > only sees the CPR OPP table. > > > > > > So this is the order things are called, > > from qcom-cpufreq-nvmem.c perspective: > > > > 1) dev_pm_opp_set_supported_hw() > > > > 2) dev_pm_opp_attach_genpd() -> > > which results in > > int cpr_pd_attach_dev(struct generic_pm_domain *domain, > > struct device *dev) > > being called. > > This callback is inside the CPR driver, and here we have the > > CPU's (genpd virtual) struct device, and this is where we would like to > > know the opp-hz. > > The problem here is that: > > [ 3.114979] cpr_pd_attach_dev: dev_pm_opp_get_opp_count for dev: genpd:0:cpu0: -19 > > [ 3.119610] cpr_pd_attach_dev: dev_pm_opp_get_opp_count for dev: cpu0: 0 Here I cheated and simply used get_cpu_device(0). Since I cheated, I used get_cpu_device(0) always, so even when CPU1,CPU2,CPU3 is attached, dev_pm_opp_get_opp_count(cpu0) is still 0. I added a print in [ 3.836533] cpr_set_performance: number of OPPs for dev: cpu0: 3 And there I can see that OPP count is 3, so it appears that with the current code, we need to wait until cpufreq-dt.c:cpufreq_init() has been called, maybe dev_pm_opp_of_cpumask_add_table() needs to be called before dev_pm_opp_get_opp_count(cpu0) actually returns 3. cpufreq_init() is called by platform_device_register_simple("cpufreq-dt", -1, NULL, 0); which is called after dev_pm_opp_attach_genpd(). What I don't understand is that dev_pm_opp_attach_genpd() actually returns a OPP table. So why do we need to wait for dev_pm_opp_of_cpumask_add_table(), before either dev_pm_opp_get_opp_count(cpu0) or dev_pm_opp_get_opp_count(genpd_virtdev_for_cpu0) returns 3? > > [ 3.126489] cpr_pd_attach_dev: dev_pm_opp_get_opp_count for dev: cpr@b018000: 3 > > > > While we have the CPR OPP table in the attach callback, we don't > > have the CPU OPP table, neither in the CPU struct device or the genpd virtual > > struct device. > > If you can find CPU's physical number from the virtual device, then > you can do get_cpu_device(X) and then life will be easy ? > > > Since we have called dev_pm_opp_attach_genpd(.., .., &virt_devs) which > > attaches an OPP table to the CPU, I would have expected one of them to > > be >= 0. > > Especially since dev_name(virt_devs[0]) == genpd:0:cpu0 > > > > I guess it should still be possible to parse the required-opps manually here, > > by iterating the OF nodes, however, we won't be able to use the CPU's struct > > opp_table (which is the nice representation of the OF nodes). > > > > Any suggestions? > > -- > viresh
On 16-07-19, 12:53, Niklas Cassel wrote: > Here I cheated and simply used get_cpu_device(0). > > Since I cheated, I used get_cpu_device(0) always, > so even when CPU1,CPU2,CPU3 is attached, dev_pm_opp_get_opp_count(cpu0) is > still 0. > > I added a print in > [ 3.836533] cpr_set_performance: number of OPPs for dev: cpu0: 3 > > And there I can see that OPP count is 3, so it appears that with the > current code, we need to wait until cpufreq-dt.c:cpufreq_init() > has been called, maybe dev_pm_opp_of_cpumask_add_table() needs > to be called before dev_pm_opp_get_opp_count(cpu0) actually returns 3. > > cpufreq_init() is called by platform_device_register_simple("cpufreq-dt", -1, > NULL, 0); > which is called after dev_pm_opp_attach_genpd(). > > What I don't understand is that dev_pm_opp_attach_genpd() actually returns > a OPP table. So why do we need to wait for dev_pm_opp_of_cpumask_add_table(), > before either dev_pm_opp_get_opp_count(cpu0) or > dev_pm_opp_get_opp_count(genpd_virtdev_for_cpu0) returns 3? Ah, I see the problems now. No, cpufreq table can't be available at this point of time and we aren't going to change that. It is the right thing to do. Now, even if the kernel isn't written in a way which works for you, it isn't right to put more things in DT than required. DT is (should be) very much independent of the Linux kernel. So we have to parse DT to find highest frequency for each required-opp. Best is to put that code in the OPP core and use it from your driver.
On Wed, Jul 17, 2019 at 10:19:23AM +0530, Viresh Kumar wrote: > On 16-07-19, 12:53, Niklas Cassel wrote: > > Here I cheated and simply used get_cpu_device(0). > > > > Since I cheated, I used get_cpu_device(0) always, > > so even when CPU1,CPU2,CPU3 is attached, dev_pm_opp_get_opp_count(cpu0) is > > still 0. > > > > I added a print in > > [ 3.836533] cpr_set_performance: number of OPPs for dev: cpu0: 3 > > > > And there I can see that OPP count is 3, so it appears that with the > > current code, we need to wait until cpufreq-dt.c:cpufreq_init() > > has been called, maybe dev_pm_opp_of_cpumask_add_table() needs > > to be called before dev_pm_opp_get_opp_count(cpu0) actually returns 3. > > > > cpufreq_init() is called by platform_device_register_simple("cpufreq-dt", -1, > > NULL, 0); > > which is called after dev_pm_opp_attach_genpd(). > > > > What I don't understand is that dev_pm_opp_attach_genpd() actually returns > > a OPP table. So why do we need to wait for dev_pm_opp_of_cpumask_add_table(), > > before either dev_pm_opp_get_opp_count(cpu0) or > > dev_pm_opp_get_opp_count(genpd_virtdev_for_cpu0) returns 3? > > Ah, I see the problems now. No, cpufreq table can't be available at > this point of time and we aren't going to change that. It is the right > thing to do. > > Now, even if the kernel isn't written in a way which works for you, it > isn't right to put more things in DT than required. DT is (should be) > very much independent of the Linux kernel. > > So we have to parse DT to find highest frequency for each > required-opp. Best is to put that code in the OPP core and use it from > your driver. Hello Viresh, Could you please have a look at the last two patches here: https://git.linaro.org/people/niklas.cassel/kernel.git/log/?h=cpr-opp-hz If you like my proposal then I could send out the first patch (the one to OPP core) as a real patch (with an improved commit message), and incorporate the second patch into my CPR patch series when I send out a V2. Kind regards, Niklas
On 19-07-19, 17:45, Niklas Cassel wrote: > Hello Viresh, > > Could you please have a look at the last two patches here: > https://git.linaro.org/people/niklas.cassel/kernel.git/log/?h=cpr-opp-hz There is no sane way of providing review comments with a link to the git tree :) I still had a look and I see that you don't search for max frequency but just any OPP that has required-opps set to the level u want. Also, can't there be multiple phandles in required-opps in your case ? > If you like my proposal then I could send out the first patch (the one to > OPP core) as a real patch (with an improved commit message), and > incorporate the second patch into my CPR patch series when I send out a V2. Send them both in your series only, otherwise the first one is useless anyway.
On Tue, Jul 23, 2019 at 07:26:35AM +0530, Viresh Kumar wrote: > On 19-07-19, 17:45, Niklas Cassel wrote: > > Hello Viresh, > > > > Could you please have a look at the last two patches here: > > https://git.linaro.org/people/niklas.cassel/kernel.git/log/?h=cpr-opp-hz > > There is no sane way of providing review comments with a link to the > git tree :) > > I still had a look and I see that you don't search for max frequency > but just any OPP that has required-opps set to the level u want. Also, > can't there be multiple phandles in required-opps in your case ? For each OPP in the CPR OPP table, we need three things, opp-level, qcom,fuse-level and opp-hz. The first two can simply be parsed from the OPP node itself while iterating the CPR OPP table. The opp-hz has to be fetched from the CPU OPP table. Several OPPs might have the same qcom,fuse-level value. However, they will have unique opp-level values and unique opp-hz values. Each opp-level has a matching opp-hz. required-opps is basically a connection between a opp-hz (CPU OPP table) and and a opp-level (CPR OPP table). So there will be only one match. No need to search for max frequency. I think you are confusing this with something else. The CPR hardware has to be programmed with the highest frequency for each qcom,fuse-corner. This is done here: https://git.linaro.org/people/niklas.cassel/kernel.git/tree/drivers/power/avs/qcom-cpr.c?h=cpr-full#n1219 by saving the highest frequency for each fuse level while iterating the OPP table. There can be only one phandle in the required-opps in my case, this is one of the reasons why I prefer implementing it in the CPR driver. If it were to be implemented in OPP core, it probably has to handle multiple phandles. > > > If you like my proposal then I could send out the first patch (the one to > > OPP core) as a real patch (with an improved commit message), and > > incorporate the second patch into my CPR patch series when I send out a V2. > > Send them both in your series only, otherwise the first one is useless > anyway. Ok, will do. Kind regards, Niklas
diff --git a/arch/arm64/boot/dts/qcom/qcs404.dtsi b/arch/arm64/boot/dts/qcom/qcs404.dtsi index ff9198740431..5b6276c3ec42 100644 --- a/arch/arm64/boot/dts/qcom/qcs404.dtsi +++ b/arch/arm64/boot/dts/qcom/qcs404.dtsi @@ -38,7 +38,8 @@ #cooling-cells = <2>; clocks = <&apcs_glb>; operating-points-v2 = <&cpu_opp_table>; - cpu-supply = <&pms405_s3>; + power-domains = <&cprpd>; + power-domain-names = "cpr"; }; CPU1: cpu@101 { @@ -51,7 +52,8 @@ #cooling-cells = <2>; clocks = <&apcs_glb>; operating-points-v2 = <&cpu_opp_table>; - cpu-supply = <&pms405_s3>; + power-domains = <&cprpd>; + power-domain-names = "cpr"; }; CPU2: cpu@102 { @@ -64,7 +66,8 @@ #cooling-cells = <2>; clocks = <&apcs_glb>; operating-points-v2 = <&cpu_opp_table>; - cpu-supply = <&pms405_s3>; + power-domains = <&cprpd>; + power-domain-names = "cpr"; }; CPU3: cpu@103 { @@ -77,7 +80,8 @@ #cooling-cells = <2>; clocks = <&apcs_glb>; operating-points-v2 = <&cpu_opp_table>; - cpu-supply = <&pms405_s3>; + power-domains = <&cprpd>; + power-domain-names = "cpr"; }; L2_0: l2-cache { @@ -101,20 +105,40 @@ }; cpu_opp_table: cpu-opp-table { - compatible = "operating-points-v2"; + compatible = "operating-points-v2-kryo-cpu"; opp-shared; opp-1094400000 { opp-hz = /bits/ 64 <1094400000>; - opp-microvolt = <1224000 1224000 1224000>; + required-opps = <&cpr_opp1>; }; opp-1248000000 { opp-hz = /bits/ 64 <1248000000>; - opp-microvolt = <1288000 1288000 1288000>; + required-opps = <&cpr_opp2>; }; opp-1401600000 { opp-hz = /bits/ 64 <1401600000>; - opp-microvolt = <1384000 1384000 1384000>; + required-opps = <&cpr_opp3>; + }; + }; + + cpr_opp_table: cpr-opp-table { + compatible = "operating-points-v2-qcom-level"; + + cpr_opp1: opp1 { + opp-level = <1>; + qcom,opp-fuse-level = <1>; + opp-hz = /bits/ 64 <1094400000>; + }; + cpr_opp2: opp2 { + opp-level = <2>; + qcom,opp-fuse-level = <2>; + opp-hz = /bits/ 64 <1248000000>; + }; + cpr_opp3: opp3 { + opp-level = <3>; + qcom,opp-fuse-level = <3>; + opp-hz = /bits/ 64 <1401600000>; }; }; @@ -294,6 +318,62 @@ tsens_caldata: caldata@d0 { reg = <0x1f8 0x14>; }; + cpr_efuse_speedbin: speedbin@13c { + reg = <0x13c 0x4>; + bits = <2 3>; + }; + cpr_efuse_quot_offset1: qoffset1@231 { + reg = <0x231 0x4>; + bits = <4 7>; + }; + cpr_efuse_quot_offset2: qoffset2@232 { + reg = <0x232 0x4>; + bits = <3 7>; + }; + cpr_efuse_quot_offset3: qoffset3@233 { + reg = <0x233 0x4>; + bits = <2 7>; + }; + cpr_efuse_init_voltage1: ivoltage1@229 { + reg = <0x229 0x4>; + bits = <4 6>; + }; + cpr_efuse_init_voltage2: ivoltage2@22a { + reg = <0x22a 0x4>; + bits = <2 6>; + }; + cpr_efuse_init_voltage3: ivoltage3@22b { + reg = <0x22b 0x4>; + bits = <0 6>; + }; + cpr_efuse_quot1: quot1@22b { + reg = <0x22b 0x4>; + bits = <6 12>; + }; + cpr_efuse_quot2: quot2@22d { + reg = <0x22d 0x4>; + bits = <2 12>; + }; + cpr_efuse_quot3: quot3@230 { + reg = <0x230 0x4>; + bits = <0 12>; + }; + cpr_efuse_ring1: ring1@228 { + reg = <0x228 0x4>; + bits = <0 3>; + }; + cpr_efuse_ring2: ring2@228 { + reg = <0x228 0x4>; + bits = <4 3>; + }; + cpr_efuse_ring3: ring3@229 { + reg = <0x229 0x4>; + bits = <0 3>; + }; + cpr_efuse_revision: revision@218 { + reg = <0x218 0x4>; + bits = <3 3>; + }; }; rng: rng@e3000 { @@ -901,6 +981,55 @@ clock-names = "xo"; }; + cprpd: cpr@b018000 { + compatible = "qcom,qcs404-cpr", "qcom,cpr"; + reg = <0x0b018000 0x1000>; + interrupts = <0 15 IRQ_TYPE_EDGE_RISING>; + clocks = <&xo_board>; + clock-names = "ref"; + vdd-apc-supply = <&pms405_s3>; + #power-domain-cells = <0>; + operating-points-v2 = <&cpr_opp_table>; + acc-syscon = <&tcsr>; + + nvmem-cells = <&cpr_efuse_quot_offset1>, + <&cpr_efuse_quot_offset2>, + <&cpr_efuse_quot_offset3>, + <&cpr_efuse_init_voltage1>, + <&cpr_efuse_init_voltage2>, + <&cpr_efuse_init_voltage3>, + <&cpr_efuse_quot1>, + <&cpr_efuse_quot2>, + <&cpr_efuse_quot3>, + <&cpr_efuse_ring1>, + <&cpr_efuse_ring2>, + <&cpr_efuse_ring3>, + <&cpr_efuse_revision>; + nvmem-cell-names = "cpr_quotient_offset1", + "cpr_quotient_offset2", + "cpr_quotient_offset3", + "cpr_init_voltage1", + "cpr_init_voltage2", + "cpr_init_voltage3", + "cpr_quotient1", + "cpr_quotient2", + "cpr_quotient3", + "cpr_ring_osc1", + "cpr_ring_osc2", + "cpr_ring_osc3", + "cpr_fuse_revision"; + + qcom,cpr-timer-delay-us = <5000>; + qcom,cpr-timer-cons-up = <0>; + qcom,cpr-timer-cons-down = <2>; + qcom,cpr-up-threshold = <1>; + qcom,cpr-down-threshold = <3>; + qcom,cpr-idle-clocks = <15>; + qcom,cpr-gcnt-us = <1>; + qcom,vdd-apc-step-up-limit = <1>; + qcom,vdd-apc-step-down-limit = <1>; + }; + timer@b120000 { #address-cells = <1>; #size-cells = <1>;