Message ID | 1425316643-31991-6-git-send-email-javi.merino@arm.com (mailing list archive) |
---|---|
State | Superseded, archived |
Delegated to: | Eduardo Valentin |
Headers | show |
On Mon, Mar 2, 2015 at 10:47 PM, Javi Merino <javi.merino@arm.com> wrote: > From: Kapileshwar Singh <kapileshwar.singh@arm.com> > > When cpufreq changes the policy cpu, we need to update our cached cpu > device accordingly. > > Cc: Zhang Rui <rui.zhang@intel.com> > Cc: Eduardo Valentin <edubezval@gmail.com> > Signed-off-by: Kapileshwar Singh <kapileshwar.singh@arm.com> > --- > drivers/thermal/cpu_cooling.c | 3 +++ > 1 file changed, 3 insertions(+) > > diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c > index c4974144c787..e306d6bc3cf1 100644 > --- a/drivers/thermal/cpu_cooling.c > +++ b/drivers/thermal/cpu_cooling.c > @@ -269,6 +269,9 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb, > mutex_unlock(&cooling_cpufreq_lock); > break; > > + case CPUFREQ_UPDATE_POLICY_CPU: > + update_cpu_device(policy->cpu); > + break; > case CPUFREQ_CREATE_POLICY: > update_cpu_device(policy->cpu); > break; First of all, I wasn't able to find 3/5 on LKML and I looked at 3/7 from an earlier version to look at how update_cpu_device() looks like. What I couldn't understand is why do you need to update things if policy->cpu is changing ? I am expecting a detailed answer here according to your design, and we may be able to work out without such updates. Lets see.. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Viresh! On 03/03/15 04:03, Viresh Kumar wrote: > On Mon, Mar 2, 2015 at 10:47 PM, Javi Merino <javi.merino@arm.com> wrote: >> From: Kapileshwar Singh <kapileshwar.singh@arm.com> >> >> When cpufreq changes the policy cpu, we need to update our cached cpu >> device accordingly. >> >> Cc: Zhang Rui <rui.zhang@intel.com> >> Cc: Eduardo Valentin <edubezval@gmail.com> >> Signed-off-by: Kapileshwar Singh <kapileshwar.singh@arm.com> >> --- >> drivers/thermal/cpu_cooling.c | 3 +++ >> 1 file changed, 3 insertions(+) >> >> diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c >> index c4974144c787..e306d6bc3cf1 100644 >> --- a/drivers/thermal/cpu_cooling.c >> +++ b/drivers/thermal/cpu_cooling.c >> @@ -269,6 +269,9 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb, >> mutex_unlock(&cooling_cpufreq_lock); >> break; >> >> + case CPUFREQ_UPDATE_POLICY_CPU: >> + update_cpu_device(policy->cpu); >> + break; >> case CPUFREQ_CREATE_POLICY: >> update_cpu_device(policy->cpu); >> break; > > First of all, I wasn't able to find 3/5 on LKML and I looked at 3/7 > from an earlier > version to look at how update_cpu_device() looks like. > > What I couldn't understand is why do you need to update things if policy->cpu > is changing ? > We store the device pointer of the lead CPU (policy CPU) in a cpufreq domain as a part of the cpufreq_cooling_device data structure. There is one cpufreq_cooling_device per cpufreq domain. We need the device to find out the current OPP for the cpufreq_cooling_device for our static power calculation. opp = opp_find_freq_exact(cpu_dev, freq_hz, true); voltage = dev_pm_opp_get_voltage(opp); The problem we are trying to solve here is: When this lead CPU gets hotplugged out, the device pointer becomes stale and the policy cpu for the cpufreq domain changes. We then store the new policy CPU's device pointer for the in cpufreq_cooling_device on the reception of a notification from cpufreq. Being open to your suggestions for any other possible ways to solve the problem.. Regards, KP > I am expecting a detailed answer here according to your design, and we may > be able to work out without such updates. Lets see.. > -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3 March 2015 at 16:29, Kapileshwar Singh <kapileshwar.singh@arm.com> wrote: > We store the device pointer of the lead CPU (policy CPU) in a cpufreq domain as a part of the > cpufreq_cooling_device data structure. There is one cpufreq_cooling_device per > cpufreq domain. > > We need the device to find out the current OPP for the cpufreq_cooling_device for our static power calculation. > > opp = opp_find_freq_exact(cpu_dev, freq_hz, true); > voltage = dev_pm_opp_get_voltage(opp); > > > The problem we are trying to solve here is: > > When this lead CPU gets hotplugged out, the device pointer becomes stale and the policy > cpu for the cpufreq domain changes. We then store the new policy CPU's device pointer for the > in cpufreq_cooling_device on the reception of a notification from cpufreq. Being open to your > suggestions for any other possible ways to solve the problem.. I would have loved that if life was that simple :) So, the OPP library today isn't that perfect and so is this doing rounds [1]. The problem is the OPPs are initialized per device today and even if they are shared by multiple CPUs, OPP library doesn't know about it. So, if the policy->cpu goes away, OPP APIs on the new CPU will not work as OPPs are only initialized for one CPU and not for others within the same policy :) The way cpufreq-dt is taking care of this is by saving cpu_dev of the first CPU for which OPPs are initialized and always using that even if the CPU goes away. And you need to do exactly that. And please, do test such scenario before sending the patches again. As it would have simply failed in this case, have you given it a try .. Once my patchset [1] is applied, life would be very simple and we can call OPP library for any CPU, but that is going to take some time. -- viresh [1] https://www.marc.info/?l=linaro-kernel&m=142364262800650&w=3 -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/03/15 11:19, Viresh Kumar wrote: > On 3 March 2015 at 16:29, Kapileshwar Singh <kapileshwar.singh@arm.com> wrote: >> We store the device pointer of the lead CPU (policy CPU) in a cpufreq domain as a part of the >> cpufreq_cooling_device data structure. There is one cpufreq_cooling_device per >> cpufreq domain. >> >> We need the device to find out the current OPP for the cpufreq_cooling_device for our static power calculation. >> >> opp = opp_find_freq_exact(cpu_dev, freq_hz, true); >> voltage = dev_pm_opp_get_voltage(opp); >> >> >> The problem we are trying to solve here is: >> >> When this lead CPU gets hotplugged out, the device pointer becomes stale and the policy >> cpu for the cpufreq domain changes. We then store the new policy CPU's device pointer for the >> in cpufreq_cooling_device on the reception of a notification from cpufreq. Being open to your >> suggestions for any other possible ways to solve the problem.. > > I would have loved that if life was that simple :) > > So, the OPP library today isn't that perfect and so is this doing rounds [1]. > The problem is the OPPs are initialized per device today and even if they > are shared by multiple CPUs, OPP library doesn't know about it. > > So, if the policy->cpu goes away, OPP APIs on the new CPU will not work > as OPPs are only initialized for one CPU and not for others within the same > policy :) > > The way cpufreq-dt is taking care of this is by saving cpu_dev of the first > CPU for which OPPs are initialized and always using that even if the CPU > goes away. And you need to do exactly that. > > And please, do test such scenario before sending the patches again. As > it would have simply failed in this case, have you given it a try .. Yes I indeed tested the case where we cache the device pointer of the CPU for which the OPP's are populated. When this CPU is hotplugged out, it invalidates the device pointer itself. Here are the error we get in dmesg: .. <3>[67203.216774] opp_get_voltage: Invalid parameters <3>[67203.326774] opp_get_voltage: Invalid parameters <3>[67203.326774] opp_get_voltage: Invalid parameters .. Which happens because: unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp) { .. tmp_opp = rcu_dereference(opp); if (unlikely(IS_ERR_OR_NULL(tmp_opp)) || !tmp_opp->available) pr_err("%s: Invalid parameters\n", __func__); else .. Which happens when opp = dev_pm_opp_find_freq_exact(cpufreq_device->cpu_dev, freq_hz, true); returns a an erroneous or NULL OPP or the opp is unavailable (in the above condition) Regards, KP > > Once my patchset [1] is applied, life would be very simple and we can > call OPP library for any CPU, but that is going to take some time. > > -- > viresh > > [1] https://www.marc.info/?l=linaro-kernel&m=142364262800650&w=3 > -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3 March 2015 at 17:11, Kapileshwar Singh <kapileshwar.singh@arm.com> wrote: > Yes I indeed tested the case where we cache the device pointer of the CPU for which the OPP's are populated. > When this CPU is hotplugged out, it invalidates the device pointer itself. Here are the error we get in dmesg: What do you mean by 'invalidates the device pointer' ? that cpu_dev is NULL ? > <3>[67203.216774] opp_get_voltage: Invalid parameters > <3>[67203.326774] opp_get_voltage: Invalid parameters > <3>[67203.326774] opp_get_voltage: Invalid parameters Have you handwritten them ? Why don't they precede with dev_pm_* ?? > > Which happens because: > > unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp) > { > .. > tmp_opp = rcu_dereference(opp); > if (unlikely(IS_ERR_OR_NULL(tmp_opp)) || !tmp_opp->available) > pr_err("%s: Invalid parameters\n", __func__); This %s should print routine name .. > else > .. > > Which happens when > > opp = dev_pm_opp_find_freq_exact(cpufreq_device->cpu_dev, freq_hz, > true); > > returns a an erroneous or NULL OPP or the opp is unavailable (in the above condition) Please goto the depth of this thing, as I don't think it should happen. Over that I was asking you if you have tested the solution Javi gave, because OPPs wouldn't have been initialized for other CPUs once policy->cpu goes down. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/03/15 13:07, Viresh Kumar wrote: > On 3 March 2015 at 17:11, Kapileshwar Singh <kapileshwar.singh@arm.com> wrote: >> Yes I indeed tested the case where we cache the device pointer of the CPU for which the OPP's are populated. >> When this CPU is hotplugged out, it invalidates the device pointer itself. Here are the error we get in dmesg: > > What do you mean by 'invalidates the device pointer' ? that cpu_dev is NULL ? The cpu_dev is not NULL but we get an erroneous OPP back. We found the problem lies in the way we calculate the frequency for the cluster. >> <3>[67203.216774] opp_get_voltage: Invalid parameters >> <3>[67203.326774] opp_get_voltage: Invalid parameters >> <3>[67203.326774] opp_get_voltage: Invalid parameters > > Have you handwritten them ? Why don't they precede with dev_pm_* ?? I have not handwritten them, It was from a Linaro 3.10 based kernel when I first noticed this issue but the same problem exists in mainline. Apologies for this I sent you an older trace which I had saved when I found the bug. Here is the trace I get from mainline [ 5680.135339] dev_pm_opp_get_voltage: Invalid parameters [ 5680.245528] dev_pm_opp_get_voltage: Invalid parameters [ 5680.355432] dev_pm_opp_get_voltage: Invalid parameters [ 5680.465521] dev_pm_opp_get_voltage: Invalid parameters [ 5680.575599] dev_pm_opp_get_voltage: Invalid parameters [ 5680.685817] dev_pm_opp_get_voltage: Invalid parameters [ 5680.795556] dev_pm_opp_get_voltage: Invalid parameters [ 5680.905598] dev_pm_opp_get_voltage: Invalid parameters > >> >> Which happens because: >> >> unsigned long dev_pm_opp_get_voltage(struct dev_pm_opp *opp) >> { >> .. >> tmp_opp = rcu_dereference(opp); >> if (unlikely(IS_ERR_OR_NULL(tmp_opp)) || !tmp_opp->available) >> pr_err("%s: Invalid parameters\n", __func__); > > This %s should print routine name .. > >> else >> .. >> >> Which happens when >> >> opp = dev_pm_opp_find_freq_exact(cpufreq_device->cpu_dev, freq_hz, >> true); >> >> returns a an erroneous or NULL OPP or the opp is unavailable (in the above condition) > Update: This returns an erroneous OPP > Please goto the depth of this thing, as I don't think it should happen. > > Over that I was asking you if you have tested the solution Javi gave, > because OPPs > wouldn't have been initialized for other CPUs once policy->cpu goes down. I did test this but we were working with the assumption that OPPs should be populated for all the CPUs and also that OPPs are lost for a hotplugged CPU which I see is not the case. We have looked at this more closely and found that problem lies in: freq = cpufreq_quick_get(cpumask_any(&cpufreq_device->allowed_cpus)); which returns a NULL frequency as we are not checking for online CPUs here. We shall come up with a fix for this. Many thanks for helping us with the investigation. Regards, KP -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Mar 3, 2015 at 3:09 PM, Kapileshwar Singh <kapileshwar.singh@arm.com> wrote: > On 03/03/15 13:07, Viresh Kumar wrote: [...] >> Please goto the depth of this thing, as I don't think it should happen. >> >> Over that I was asking you if you have tested the solution Javi gave, >> because OPPs >> wouldn't have been initialized for other CPUs once policy->cpu goes down. > I did test this but we were working with the assumption that OPPs should be populated for all the CPUs and also that OPPs are lost for a hotplugged CPU which I see is not the case. > > We have looked at this more closely and found that problem lies in: > > freq = cpufreq_quick_get(cpumask_any(&cpufreq_device->allowed_cpus)); > > which returns a NULL frequency as we are not checking for online CPUs here. We shall come up with a fix for this. Many thanks for helping us with the investigation. > You can use any_online_cpu(..) instead of cpumask_any IMO Regards, Sudeep -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3 March 2015 at 20:39, Kapileshwar Singh <kapileshwar.singh@arm.com> wrote: > I did test this but we were working with the assumption that OPPs should be populated for all the CPUs and also that OPPs are lost for a hotplugged CPU which I see is not the case. Then what did you test? My point here is, even with the latest patches that you have sent, you wouldn't be able to get the OPPs once policy->cpu goes down. So, how did this worked for you ? > We have looked at this more closely and found that problem lies in: > > freq = cpufreq_quick_get(cpumask_any(&cpufreq_device->allowed_cpus)); > > which returns a NULL frequency as we are not checking for online CPUs here. We shall come up with a fix for this. Many thanks for helping us with the investigation. Right. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 3 March 2015 at 20:56, Sudeep Holla <sudeep.holla@arm.com> wrote:
> You can use any_online_cpu(..) instead of cpumask_any IMO
What about policy->cpu ? :)
--
To unsubscribe from this list: send the line "unsubscribe linux-pm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/03/15 15:30, Viresh Kumar wrote: > On 3 March 2015 at 20:56, Sudeep Holla <sudeep.holla@arm.com> wrote: >> You can use any_online_cpu(..) instead of cpumask_any IMO > > What about policy->cpu ? :) > We can, was not sure if they have access to policy in their thermal driver. Just the first thought seeing one line of the code posted :) Regards, Sudeep -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/03/15 15:29, Viresh Kumar wrote: > On 3 March 2015 at 20:39, Kapileshwar Singh <kapileshwar.singh@arm.com> wrote: > >> I did test this but we were working with the assumption that OPPs should be populated for all the CPUs and also that OPPs are lost for a hotplugged CPU which I see is not the case. > > Then what did you test? My point here is, even with the latest patches > that you have > sent, you wouldn't be able to get the OPPs once policy->cpu goes down. So, how > did this worked for you ? We were basing our fix on possibility of having OPPs for all the CPUs and we incorrectly attributed the erroneous OPP we got from dev_pm_opp_find_freq_exact to the missing OPPs in the other CPUs. > >> We have looked at this more closely and found that problem lies in: >> >> freq = cpufreq_quick_get(cpumask_any(&cpufreq_device->allowed_cpus)); >> >> which returns a NULL frequency as we are not checking for online CPUs here. We shall come up with a fix for this. Many thanks for helping us with the investigation. > > Right. > -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/thermal/cpu_cooling.c b/drivers/thermal/cpu_cooling.c index c4974144c787..e306d6bc3cf1 100644 --- a/drivers/thermal/cpu_cooling.c +++ b/drivers/thermal/cpu_cooling.c @@ -269,6 +269,9 @@ static int cpufreq_thermal_notifier(struct notifier_block *nb, mutex_unlock(&cooling_cpufreq_lock); break; + case CPUFREQ_UPDATE_POLICY_CPU: + update_cpu_device(policy->cpu); + break; case CPUFREQ_CREATE_POLICY: update_cpu_device(policy->cpu); break;