Message ID | 1544687394-331-3-git-send-email-tdas@codeaurora.org (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | cpufreq: qcom-hw: Add support for QCOM cpufreq HW | expand |
On 13-12-18, 13:19, Taniya Das wrote: > The CPUfreq HW present in some QCOM chipsets offloads the steps necessary > for changing the frequency of CPUs. The driver implements the cpufreq > driver interface for this hardware engine. > > Signed-off-by: Saravana Kannan <skannan@codeaurora.org> > Signed-off-by: Stephen Boyd <swboyd@chromium.org> > Signed-off-by: Taniya Das <tdas@codeaurora.org> > --- > drivers/cpufreq/Kconfig.arm | 11 ++ > drivers/cpufreq/Makefile | 1 + > drivers/cpufreq/qcom-cpufreq-hw.c | 305 ++++++++++++++++++++++++++++++++++++++ > 3 files changed, 317 insertions(+) > create mode 100644 drivers/cpufreq/qcom-cpufreq-hw.c Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Quoting Taniya Das (2018-12-12 23:49:54) > The CPUfreq HW present in some QCOM chipsets offloads the steps necessary > for changing the frequency of CPUs. The driver implements the cpufreq > driver interface for this hardware engine. > > Signed-off-by: Saravana Kannan <skannan@codeaurora.org> > Signed-off-by: Stephen Boyd <swboyd@chromium.org> > Signed-off-by: Taniya Das <tdas@codeaurora.org> > --- Reviewed-by: Stephen Boyd <swboyd@chromium.org> But I noticed that we don't release the I/O region anymore so hotplug and replug of a whole clk domain fails. I guess devm_ioremap_resource() was just too much magic so how about we downgrade to devm_ioremap() instead? BTW, Viresh, I see a lockdep splat when cpufreq_init returns an error upon bringing the policy online the second time. I guess cpufreq_stats aren't able to be freed from there because they take locks in different order vs. the normal path? -----8<------- diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c index fce7a1162e87..0e1105151478 100644 --- a/drivers/cpufreq/qcom-cpufreq-hw.c +++ b/drivers/cpufreq/qcom-cpufreq-hw.c @@ -182,9 +182,12 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy) index = args.args[0]; res = platform_get_resource(global_pdev, IORESOURCE_MEM, index); - base = devm_ioremap_resource(dev, res); - if (IS_ERR(base)) - return PTR_ERR(base); + if (!res) + return -ENODEV; + + base = devm_ioremap(dev, res->start, resource_size(res)); + if (!base) + return -ENOMEM; /* HW should be in enabled state to proceed */ if (!(readl_relaxed(base + REG_ENABLE) & 0x1)) {
On 13-12-18, 01:58, Stephen Boyd wrote: > BTW, Viresh, I see a lockdep splat when cpufreq_init returns an error > upon bringing the policy online the second time. I guess cpufreq_stats > aren't able to be freed from there because they take locks in different > order vs. the normal path? Please share the lockdep report and the steps to reproduce it. I will see if I can simulate the failure forcefully..
Quoting Viresh Kumar (2018-12-13 02:05:06) > On 13-12-18, 01:58, Stephen Boyd wrote: > > BTW, Viresh, I see a lockdep splat when cpufreq_init returns an error > > upon bringing the policy online the second time. I guess cpufreq_stats > > aren't able to be freed from there because they take locks in different > > order vs. the normal path? > > Please share the lockdep report and the steps to reproduce it. I will > see if I can simulate the failure forcefully.. > It's on a v4.19 kernel with this cpufreq hw driver backported to it. I think all it takes is to return an error the second time the policy is initialized when cpufreq_online() calls into the cpufreq driver. ====================================================== WARNING: possible circular locking dependency detected 4.19.8 #61 Tainted: G W ------------------------------------------------------ cpuhp/5/36 is trying to acquire lock: 000000003e901e8a (kn->count#326){++++}, at: kernfs_remove_by_name_ns+0x44/0x80 but task is already holding lock: 00000000dd7f52c3 (&policy->rwsem){++++}, at: cpufreq_policy_free+0x17c/0x1cc which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (&policy->rwsem){++++}: down_read+0x50/0xcc show+0x30/0x78 sysfs_kf_seq_show+0x17c/0x25c kernfs_seq_show+0xb4/0xf8 seq_read+0x4a8/0x8f0 kernfs_fop_read+0xe0/0x360 __vfs_read+0x80/0x328 vfs_read+0xd0/0x1d4 ksys_read+0x88/0x118 __arm64_sys_read+0x4c/0x5c el0_svc_common+0x124/0x1c4 el0_svc_compat_handler+0x64/0x8c el0_svc_compat+0x8/0x18 -> #0 (kn->count#326){++++}: lock_acquire+0x244/0x360 __kernfs_remove+0x324/0x4fc kernfs_remove_by_name_ns+0x44/0x80 remove_files+0x5c/0xd8 sysfs_remove_group+0xb4/0xec cpufreq_stats_free_table+0x50/0x9c cpufreq_policy_free+0x184/0x1cc cpufreq_online+0xa44/0xc4c cpuhp_cpufreq_online+0x1c/0x2c cpuhp_invoke_callback+0x608/0x1328 cpuhp_thread_fun+0x1dc/0x440 smpboot_thread_fn+0x46c/0x7c0 kthread+0x248/0x260 ret_from_fork+0x10/0x18 other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(&policy->rwsem); lock(kn->count#326); lock(&policy->rwsem); lock(kn->count#326); *** DEADLOCK *** 2 locks held by cpuhp/5/36: #0: 00000000549a28c3 (cpuhp_state-up){+.+.}, at: cpuhp_lock_acquire+0x8/0x48 #1: 00000000dd7f52c3 (&policy->rwsem){++++}, at: cpufreq_policy_free+0x17c/0x1cc stack backtrace: CPU: 5 PID: 36 Comm: cpuhp/5 Tainted: G W 4.19.8 #61 Call trace: dump_backtrace+0x0/0x2f8 show_stack+0x20/0x2c __dump_stack+0x20/0x28 dump_stack+0xcc/0x10c print_circular_bug+0x2c0/0x328 __lock_acquire+0x22b0/0x2708 lock_acquire+0x244/0x360 __kernfs_remove+0x324/0x4fc kernfs_remove_by_name_ns+0x44/0x80 remove_files+0x5c/0xd8 sysfs_remove_group+0xb4/0xec cpufreq_stats_free_table+0x50/0x9c cpufreq_policy_free+0x184/0x1cc cpufreq_online+0xa44/0xc4c cpuhp_cpufreq_online+0x1c/0x2c cpuhp_invoke_callback+0x608/0x1328 cpuhp_thread_fun+0x1dc/0x440 smpboot_thread_fn+0x46c/0x7c0 kthread+0x248/0x260 ret_from_fork+0x10/0x18
On 13-12-18, 02:12, Stephen Boyd wrote: > It's on a v4.19 kernel with this cpufreq hw driver backported to it. I > think all it takes is to return an error the second time the policy is > initialized when cpufreq_online() calls into the cpufreq driver. What do you mean by "the second time the policy is initialized" ? We call cpufreq_online() only once for each policy.
Quoting Viresh Kumar (2018-12-13 02:14:17) > On 13-12-18, 02:12, Stephen Boyd wrote: > > It's on a v4.19 kernel with this cpufreq hw driver backported to it. I > > think all it takes is to return an error the second time the policy is > > initialized when cpufreq_online() calls into the cpufreq driver. > > What do you mean by "the second time the policy is initialized" ? > > We call cpufreq_online() only once for each policy. > I have one policy for four CPUs. So take down all four of those CPUs by writing a 0 to the online file for each CPU, and then bring them back online. That should make cpufreq_driver->init() be called twice, once during boot when the CPUs are bound to the cpufreq devices, and second from the sysfs write when the user brings the first CPU in that policy online again by writing a 1 to the online file. If that second time it fails I suspect we hit the lockdep warning. I may have also catted the policy*/stat/* files before offlining any CPUs. If you can't trigger it I can try to reproduce again with a static counter in cpufreq_online() that fails when set to 1 or something.
On 13-12-18, 02:32, Stephen Boyd wrote: > I have one policy for four CPUs. So take down all four of those CPUs by > writing a 0 to the online file for each CPU, and then bring them back > online. That should make cpufreq_driver->init() be called twice, once > during boot when the CPUs are bound to the cpufreq devices, and second > from the sysfs write when the user brings the first CPU in that policy > online again by writing a 1 to the online file. If that second time it > fails I suspect we hit the lockdep warning. Okay, that is a special sequence. I will try to reproduce that locally. Thanks.
On Thu, Dec 13, 2018 at 01:19:54PM +0530, Taniya Das wrote: > The CPUfreq HW present in some QCOM chipsets offloads the steps necessary > for changing the frequency of CPUs. The driver implements the cpufreq > driver interface for this hardware engine. > > Signed-off-by: Saravana Kannan <skannan@codeaurora.org> > Signed-off-by: Stephen Boyd <swboyd@chromium.org> > Signed-off-by: Taniya Das <tdas@codeaurora.org> > --- > drivers/cpufreq/Kconfig.arm | 11 ++ > drivers/cpufreq/Makefile | 1 + > drivers/cpufreq/qcom-cpufreq-hw.c | 305 ++++++++++++++++++++++++++++++++++++++ > 3 files changed, 317 insertions(+) > create mode 100644 drivers/cpufreq/qcom-cpufreq-hw.c > > diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm > index 4e1131e..688f102 100644 > --- a/drivers/cpufreq/Kconfig.arm > +++ b/drivers/cpufreq/Kconfig.arm > @@ -114,6 +114,17 @@ config ARM_QCOM_CPUFREQ_KRYO > > If in doubt, say N. > > +config ARM_QCOM_CPUFREQ_HW > + tristate "QCOM CPUFreq HW driver" > + depends on ARCH_QCOM || COMPILE_TEST > + help > + Support for the CPUFreq HW driver. > + Some QCOM chipsets have a HW engine to offload the steps > + necessary for changing the frequency of the CPUs. Firmware loaded > + in this engine exposes a programming interface to the OS. > + The driver implements the cpufreq interface for this HW engine. > + Say Y if you want to support CPUFreq HW. > + > config ARM_S3C_CPUFREQ > bool > help > diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile > index d5ee456..08c071b 100644 > --- a/drivers/cpufreq/Makefile > +++ b/drivers/cpufreq/Makefile > @@ -61,6 +61,7 @@ obj-$(CONFIG_MACH_MVEBU_V7) += mvebu-cpufreq.o > obj-$(CONFIG_ARM_OMAP2PLUS_CPUFREQ) += omap-cpufreq.o > obj-$(CONFIG_ARM_PXA2xx_CPUFREQ) += pxa2xx-cpufreq.o > obj-$(CONFIG_PXA3xx) += pxa3xx-cpufreq.o > +obj-$(CONFIG_ARM_QCOM_CPUFREQ_HW) += qcom-cpufreq-hw.o > obj-$(CONFIG_ARM_QCOM_CPUFREQ_KRYO) += qcom-cpufreq-kryo.o > obj-$(CONFIG_ARM_S3C2410_CPUFREQ) += s3c2410-cpufreq.o > obj-$(CONFIG_ARM_S3C2412_CPUFREQ) += s3c2412-cpufreq.o > diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c > new file mode 100644 > index 0000000..a65d0ae > --- /dev/null > +++ b/drivers/cpufreq/qcom-cpufreq-hw.c > @@ -0,0 +1,305 @@ > +// SPDX-License-Identifier: GPL-2.0 > +/* > + * Copyright (c) 2018, The Linux Foundation. All rights reserved. > + */ > + > +#include <linux/bitfield.h> > +#include <linux/cpufreq.h> > +#include <linux/init.h> > +#include <linux/kernel.h> > +#include <linux/module.h> > +#include <linux/of_address.h> > +#include <linux/of_platform.h> > +#include <linux/slab.h> > + > +#define LUT_MAX_ENTRIES 40U > +#define LUT_SRC GENMASK(31, 30) > +#define LUT_L_VAL GENMASK(7, 0) > +#define LUT_CORE_COUNT GENMASK(18, 16) > +#define LUT_ROW_SIZE 32 > +#define CLK_HW_DIV 2 > + > +/* Register offsets */ > +#define REG_ENABLE 0x0 > +#define REG_LUT_TABLE 0x110 > +#define REG_PERF_STATE 0x920 > + > +static unsigned long cpu_hw_rate, xo_rate; > +static struct platform_device *global_pdev; > + > +static int qcom_cpufreq_hw_target_index(struct cpufreq_policy *policy, > + unsigned int index) > +{ > + void __iomem *perf_state_reg = policy->driver_data; > + > + writel_relaxed(index, perf_state_reg); > + > + return 0; > +} > + > +static unsigned int qcom_cpufreq_hw_get(unsigned int cpu) > +{ > + void __iomem *perf_state_reg; > + struct cpufreq_policy *policy; > + unsigned int index; > + > + policy = cpufreq_cpu_get_raw(cpu); > + if (!policy) > + return 0; > + > + perf_state_reg = policy->driver_data; > + > + index = readl_relaxed(perf_state_reg); > + index = min(index, LUT_MAX_ENTRIES - 1); > + > + return policy->freq_table[index].frequency; > +} > + > +static unsigned int qcom_cpufreq_hw_fast_switch(struct cpufreq_policy *policy, > + unsigned int target_freq) > +{ > + void __iomem *perf_state_reg = policy->driver_data; > + int index; > + > + index = policy->cached_resolved_idx; > + if (index < 0) > + return 0; > + > + writel_relaxed(index, perf_state_reg); > + > + return policy->freq_table[index].frequency; > +} > + > +static int qcom_cpufreq_hw_read_lut(struct device *dev, > + struct cpufreq_policy *policy, > + void __iomem *base) > +{ > + u32 data, src, lval, i, core_count, prev_cc = 0, prev_freq = 0, freq; > + unsigned int max_cores = cpumask_weight(policy->cpus); > + struct cpufreq_frequency_table *table; > + > + table = kcalloc(LUT_MAX_ENTRIES + 1, sizeof(*table), GFP_KERNEL); > + if (!table) > + return -ENOMEM; > + > + for (i = 0; i < LUT_MAX_ENTRIES; i++) { > + data = readl_relaxed(base + REG_LUT_TABLE + i * LUT_ROW_SIZE); > + src = FIELD_GET(LUT_SRC, data); > + lval = FIELD_GET(LUT_L_VAL, data); > + core_count = FIELD_GET(LUT_CORE_COUNT, data); > + > + if (src) > + freq = xo_rate * lval / 1000; > + else > + freq = cpu_hw_rate / 1000; > + > + /* Ignore boosts in the middle of the table */ > + if (core_count != max_cores) { > + table[i].frequency = CPUFREQ_ENTRY_INVALID; > + } else { > + table[i].frequency = freq; > + dev_dbg(dev, "index=%d freq=%d, core_count %d\n", i, > + freq, core_count); > + } > + > + /* > + * Two of the same frequencies with the same core counts means > + * end of table > + */ > + if (i > 0 && prev_freq == freq && prev_cc == core_count) { > + struct cpufreq_frequency_table *prev = &table[i - 1]; > + > + /* > + * Only treat the last frequency that might be a boost > + * as the boost frequency > + */ > + if (prev_cc != max_cores) { > + prev->frequency = prev_freq; > + prev->flags = CPUFREQ_BOOST_FREQ; > + } > + > + break; > + } > + > + prev_cc = core_count; > + prev_freq = freq; > + } > + > + table[i].frequency = CPUFREQ_TABLE_END; > + policy->freq_table = table; > + > + return 0; > +} > + > +static void qcom_get_related_cpus(int index, struct cpumask *m) > +{ > + struct device_node *cpu_np; > + struct of_phandle_args args; > + int cpu, ret; > + > + for_each_possible_cpu(cpu) { > + cpu_np = of_cpu_device_node_get(cpu); > + if (!cpu_np) > + continue; > + > + ret = of_parse_phandle_with_args(cpu_np, "qcom,freq-domain", > + "#freq-domain-cells", 0, > + &args); > + of_node_put(cpu_np); > + if (ret < 0) > + continue; > + > + if (index == args.args[0]) > + cpumask_set_cpu(cpu, m); > + } > +} > + > +static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy) > +{ > + struct device *dev = &global_pdev->dev; > + struct of_phandle_args args; > + struct device_node *cpu_np; > + struct resource *res; > + void __iomem *base; > + int ret, index; > + > + cpu_np = of_cpu_device_node_get(policy->cpu); > + if (!cpu_np) > + return -EINVAL; > + > + ret = of_parse_phandle_with_args(cpu_np, "qcom,freq-domain", > + "#freq-domain-cells", 0, &args); > + of_node_put(cpu_np); > + if (ret) > + return ret; > + > + index = args.args[0]; > + > + res = platform_get_resource(global_pdev, IORESOURCE_MEM, index); > + base = devm_ioremap_resource(dev, res); > + if (IS_ERR(base)) > + return PTR_ERR(base); > + > + /* HW should be in enabled state to proceed */ > + if (!(readl_relaxed(base + REG_ENABLE) & 0x1)) { > + dev_err(dev, "Domain-%d cpufreq hardware not enabled\n", index); > + ret = -ENODEV; > + goto error; > + } > + > + qcom_get_related_cpus(index, policy->cpus); > + if (!cpumask_weight(policy->cpus)) { > + dev_err(dev, "Domain-%d failed to get related CPUs\n", index); > + ret = -ENOENT; > + goto error; > + } > + > + policy->driver_data = base + REG_PERF_STATE; > + > + ret = qcom_cpufreq_hw_read_lut(dev, policy, base); > + if (ret) { > + dev_err(dev, "Domain-%d failed to read LUT\n", index); > + goto error; > + } > + > + policy->fast_switch_possible = true; > + > + return 0; > +error: > + devm_iounmap(dev, base); > + return ret; > +} > + > +static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy) > +{ > + void __iomem *base = policy->driver_data - REG_PERF_STATE; > + > + kfree(policy->freq_table); > + devm_iounmap(&global_pdev->dev, base); > + > + return 0; > +} > + > +static struct freq_attr *qcom_cpufreq_hw_attr[] = { > + &cpufreq_freq_attr_scaling_available_freqs, > + &cpufreq_freq_attr_scaling_boost_freqs, > + NULL > +}; > + > +static struct cpufreq_driver cpufreq_qcom_hw_driver = { > + .flags = CPUFREQ_STICKY | CPUFREQ_NEED_INITIAL_FREQ_CHECK | > + CPUFREQ_HAVE_GOVERNOR_PER_POLICY, > + .verify = cpufreq_generic_frequency_table_verify, > + .target_index = qcom_cpufreq_hw_target_index, > + .get = qcom_cpufreq_hw_get, > + .init = qcom_cpufreq_hw_cpu_init, > + .exit = qcom_cpufreq_hw_cpu_exit, > + .fast_switch = qcom_cpufreq_hw_fast_switch, > + .name = "qcom-cpufreq-hw", > + .attr = qcom_cpufreq_hw_attr, > +}; > + > +static int qcom_cpufreq_hw_driver_probe(struct platform_device *pdev) > +{ > + struct clk *clk; > + int ret; > + > + clk = clk_get(&pdev->dev, "xo"); > + if (IS_ERR(clk)) > + return PTR_ERR(clk); > + > + xo_rate = clk_get_rate(clk); > + clk_put(clk); > + > + clk = clk_get(&pdev->dev, "alternate"); > + if (IS_ERR(clk)) > + return PTR_ERR(clk); > + > + cpu_hw_rate = clk_get_rate(clk) / CLK_HW_DIV; > + clk_put(clk); > + > + global_pdev = pdev; > + > + ret = cpufreq_register_driver(&cpufreq_qcom_hw_driver); > + if (ret) > + dev_err(&pdev->dev, "CPUFreq HW driver failed to register\n"); > + else > + dev_dbg(&pdev->dev, "QCOM CPUFreq HW driver initialized\n"); > + > + return ret; > +} > + > +static int qcom_cpufreq_hw_driver_remove(struct platform_device *pdev) > +{ > + return cpufreq_unregister_driver(&cpufreq_qcom_hw_driver); > +} > + > +static const struct of_device_id qcom_cpufreq_hw_match[] = { > + { .compatible = "qcom,cpufreq-hw" }, > + {} > +}; > +MODULE_DEVICE_TABLE(of, qcom_cpufreq_hw_match); > + > +static struct platform_driver qcom_cpufreq_hw_driver = { > + .probe = qcom_cpufreq_hw_driver_probe, > + .remove = qcom_cpufreq_hw_driver_remove, > + .driver = { > + .name = "qcom-cpufreq-hw", > + .of_match_table = qcom_cpufreq_hw_match, > + }, > +}; > + > +static int __init qcom_cpufreq_hw_init(void) > +{ > + return platform_driver_register(&qcom_cpufreq_hw_driver); > +} > +subsys_initcall(qcom_cpufreq_hw_init); > + > +static void __exit qcom_cpufreq_hw_exit(void) > +{ > + platform_driver_unregister(&qcom_cpufreq_hw_driver); > +} > +module_exit(qcom_cpufreq_hw_exit); > + > +MODULE_DESCRIPTION("QCOM CPUFREQ HW Driver"); > +MODULE_LICENSE("GPL v2"); FWIW: Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Hello Stephen, Viresh, On 12/13/2018 3:28 PM, Stephen Boyd wrote: > Quoting Taniya Das (2018-12-12 23:49:54) >> The CPUfreq HW present in some QCOM chipsets offloads the steps necessary >> for changing the frequency of CPUs. The driver implements the cpufreq >> driver interface for this hardware engine. >> >> Signed-off-by: Saravana Kannan <skannan@codeaurora.org> >> Signed-off-by: Stephen Boyd <swboyd@chromium.org> >> Signed-off-by: Taniya Das <tdas@codeaurora.org> >> --- > > Reviewed-by: Stephen Boyd <swboyd@chromium.org> > > But I noticed that we don't release the I/O region anymore so hotplug > and replug of a whole clk domain fails. I guess devm_ioremap_resource() > was just too much magic so how about we downgrade to devm_ioremap() > instead? > > BTW, Viresh, I see a lockdep splat when cpufreq_init returns an error > upon bringing the policy online the second time. I guess cpufreq_stats > aren't able to be freed from there because they take locks in different > order vs. the normal path? > > -----8<------- > diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c > index fce7a1162e87..0e1105151478 100644 > --- a/drivers/cpufreq/qcom-cpufreq-hw.c > +++ b/drivers/cpufreq/qcom-cpufreq-hw.c > @@ -182,9 +182,12 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy) > index = args.args[0]; > > res = platform_get_resource(global_pdev, IORESOURCE_MEM, index); > - base = devm_ioremap_resource(dev, res); > - if (IS_ERR(base)) > - return PTR_ERR(base); > + if (!res) > + return -ENODEV; > + > + base = devm_ioremap(dev, res->start, resource_size(res)); > + if (!base) > + return -ENOMEM; > Updated the above in the next series. > /* HW should be in enabled state to proceed */ > if (!(readl_relaxed(base + REG_ENABLE) & 0x1)) { >
Hi Stephen, On 13-12-18, 02:12, Stephen Boyd wrote: > Quoting Viresh Kumar (2018-12-13 02:05:06) > > On 13-12-18, 01:58, Stephen Boyd wrote: > > > BTW, Viresh, I see a lockdep splat when cpufreq_init returns an error > > > upon bringing the policy online the second time. I guess cpufreq_stats > > > aren't able to be freed from there because they take locks in different > > > order vs. the normal path? > > > > Please share the lockdep report and the steps to reproduce it. I will > > see if I can simulate the failure forcefully.. > > > > It's on a v4.19 kernel with this cpufreq hw driver backported to it. I > think all it takes is to return an error the second time the policy is > initialized when cpufreq_online() calls into the cpufreq driver. > > ====================================================== > WARNING: possible circular locking dependency detected > 4.19.8 #61 Tainted: G W > ------------------------------------------------------ > cpuhp/5/36 is trying to acquire lock: > 000000003e901e8a (kn->count#326){++++}, at: kernfs_remove_by_name_ns+0x44/0x80 > > but task is already holding lock: > 00000000dd7f52c3 (&policy->rwsem){++++}, at: cpufreq_policy_free+0x17c/0x1cc > > which lock already depends on the new lock. > > > the existing dependency chain (in reverse order) is: > > -> #1 (&policy->rwsem){++++}: > down_read+0x50/0xcc > show+0x30/0x78 > sysfs_kf_seq_show+0x17c/0x25c > kernfs_seq_show+0xb4/0xf8 > seq_read+0x4a8/0x8f0 > kernfs_fop_read+0xe0/0x360 > __vfs_read+0x80/0x328 > vfs_read+0xd0/0x1d4 > ksys_read+0x88/0x118 > __arm64_sys_read+0x4c/0x5c > el0_svc_common+0x124/0x1c4 > el0_svc_compat_handler+0x64/0x8c > el0_svc_compat+0x8/0x18 I failed to reproduce it over linux/next. I had the following changes over linux/next: https://pastebin.ubuntu.com/p/zkVm77PGdY/ I also did savedefconfig to show what all I changed in it. I faked multiple clusters on my hikey960 board, which is not big little.. And here is the command list from history that I ran after boot. 501 grep . /sys/devices/system/cpu/cpufreq/*/* 502 grep . /sys/devices/system/cpu/cpufreq/*/*/* 503 grep . /sys/devices/system/cpu/cpufreq/*/*/* 504 grep . /sys/devices/system/cpu/cpufreq/*/*/* 505 grep . /sys/devices/system/cpu/cpufreq/*/*/* 506 grep . /sys/devices/system/cpu/cpufreq/*/* 507 grep . /sys/devices/system/cpu/cpufreq/*/* 508 echo 0 > /sys/devices/system/cpu/cpu4/online 509 echo 0 > /sys/devices/system/cpu/cpu5/online 510 echo 0 > /sys/devices/system/cpu/cpu6/online 511 echo 0 > /sys/devices/system/cpu/cpu7/online 512 grep . /sys/devices/system/cpu/cpufreq/*/* 513 grep . /sys/devices/system/cpu/cpufreq/*/*/* 514 grep . /sys/devices/system/cpu/cpufreq/*/* 515 echo 1 > /sys/devices/system/cpu/cpu4/online 516 grep . /sys/devices/system/cpu/cpufreq/*/* 517 grep . /sys/devices/system/cpu/cpufreq/*/*/* 518 dmesg
Quoting Viresh Kumar (2018-12-17 21:45:45) > Hi Stephen, > > On 13-12-18, 02:12, Stephen Boyd wrote: > > Quoting Viresh Kumar (2018-12-13 02:05:06) > > > On 13-12-18, 01:58, Stephen Boyd wrote: > > > > BTW, Viresh, I see a lockdep splat when cpufreq_init returns an error > > > > upon bringing the policy online the second time. I guess cpufreq_stats > > > > aren't able to be freed from there because they take locks in different > > > > order vs. the normal path? > > > > > > Please share the lockdep report and the steps to reproduce it. I will > > > see if I can simulate the failure forcefully.. > > > > > > > It's on a v4.19 kernel with this cpufreq hw driver backported to it. I > > think all it takes is to return an error the second time the policy is > > initialized when cpufreq_online() calls into the cpufreq driver. > > > > ====================================================== > > WARNING: possible circular locking dependency detected > > 4.19.8 #61 Tainted: G W > > ------------------------------------------------------ > > cpuhp/5/36 is trying to acquire lock: > > 000000003e901e8a (kn->count#326){++++}, at: kernfs_remove_by_name_ns+0x44/0x80 > > > > but task is already holding lock: > > 00000000dd7f52c3 (&policy->rwsem){++++}, at: cpufreq_policy_free+0x17c/0x1cc > > > > which lock already depends on the new lock. > > > > > > the existing dependency chain (in reverse order) is: > > > > -> #1 (&policy->rwsem){++++}: > > down_read+0x50/0xcc > > show+0x30/0x78 > > sysfs_kf_seq_show+0x17c/0x25c > > kernfs_seq_show+0xb4/0xf8 > > seq_read+0x4a8/0x8f0 > > kernfs_fop_read+0xe0/0x360 > > __vfs_read+0x80/0x328 > > vfs_read+0xd0/0x1d4 > > ksys_read+0x88/0x118 > > __arm64_sys_read+0x4c/0x5c > > el0_svc_common+0x124/0x1c4 > > el0_svc_compat_handler+0x64/0x8c > > el0_svc_compat+0x8/0x18 > > I failed to reproduce it over linux/next. > > I had the following changes over linux/next: > https://pastebin.ubuntu.com/p/zkVm77PGdY/ I don't see any failure returned from cpufreq_dt's cpufreq_init() function. Maybe put a static int counter = 0 and then fail cpufreq_init() the second time that it's called for the same policy pointer? I have a system with two policies, so I made it fail and return -EINVAL when the counter == 2 and I see the lockdep splat. > > I also did savedefconfig to show what all I changed in it. I faked multiple > clusters on my hikey960 board, which is not big little.. > > And here is the command list from history that I ran after boot. > > 501 grep . /sys/devices/system/cpu/cpufreq/*/* > 502 grep . /sys/devices/system/cpu/cpufreq/*/*/* > 503 grep . /sys/devices/system/cpu/cpufreq/*/*/* > 504 grep . /sys/devices/system/cpu/cpufreq/*/*/* > 505 grep . /sys/devices/system/cpu/cpufreq/*/*/* > 506 grep . /sys/devices/system/cpu/cpufreq/*/* > 507 grep . /sys/devices/system/cpu/cpufreq/*/* > 508 echo 0 > /sys/devices/system/cpu/cpu4/online > 509 echo 0 > /sys/devices/system/cpu/cpu5/online > 510 echo 0 > /sys/devices/system/cpu/cpu6/online > 511 echo 0 > /sys/devices/system/cpu/cpu7/online > 512 grep . /sys/devices/system/cpu/cpufreq/*/* > 513 grep . /sys/devices/system/cpu/cpufreq/*/*/* > 514 grep . /sys/devices/system/cpu/cpufreq/*/* > 515 echo 1 > /sys/devices/system/cpu/cpu4/online > 516 grep . /sys/devices/system/cpu/cpufreq/*/* > 517 grep . /sys/devices/system/cpu/cpufreq/*/*/* > 518 dmesg > I did the following: grep . /sys/devices/system/cpu/cpufreq/*/* >/dev/null echo 0 > /sys/devices/system/cpu/cpu4/online echo 0 > /sys/devices/system/cpu/cpu5/online echo 0 > /sys/devices/system/cpu/cpu6/online echo 0 > /sys/devices/system/cpu/cpu7/online echo 1 > /sys/devices/system/cpu/cpu4/online dmesg And boom, lockdep splat.
On 18-12-18, 11:13, Stephen Boyd wrote: > I don't see any failure returned from cpufreq_dt's cpufreq_init() > function. Maybe put a static int counter = 0 and then fail > cpufreq_init() the second time that it's called for the same policy > pointer? I have a system with two policies, so I made it fail and return > -EINVAL when the counter == 2 and I see the lockdep splat. Yuck. You were so clear on this earlier, how can I forget it. :( Here is the additional diff. And I do get this print message on my screen while I try to online CPU4. diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c index 639feca22d27..b836e93fd87d 100644 --- a/drivers/cpufreq/cpufreq-dt.c +++ b/drivers/cpufreq/cpufreq-dt.c @@ -151,6 +151,11 @@ static int cpufreq_init(struct cpufreq_policy *policy) const char *name; int ret; + if (policy->count++) { + pr_info("%s: %d\n", __func__, __LINE__); + return -EINVAL; + } + cpu_dev = get_cpu_device(policy->cpu); if (!cpu_dev) { pr_err("failed to get cpu%d device\n", policy->cpu); diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h index 882a9b9e34bc..643141a2013f 100644 --- a/include/linux/cpufreq.h +++ b/include/linux/cpufreq.h @@ -151,6 +151,7 @@ struct cpufreq_policy { /* For cpufreq driver's internal use */ void *driver_data; + int count; }; > I did the following: > > grep . /sys/devices/system/cpu/cpufreq/*/* >/dev/null > echo 0 > /sys/devices/system/cpu/cpu4/online > echo 0 > /sys/devices/system/cpu/cpu5/online > echo 0 > /sys/devices/system/cpu/cpu6/online > echo 0 > /sys/devices/system/cpu/cpu7/online > echo 1 > /sys/devices/system/cpu/cpu4/online > dmesg > > And boom, lockdep splat. 502 grep . /sys/devices/system/cpu/cpufreq/*/* 503 echo 0 > /sys/devices/system/cpu/cpu4/online 504 echo 0 > /sys/devices/system/cpu/cpu5/online 505 echo 0 > /sys/devices/system/cpu/cpu6/online 506 echo 0 > /sys/devices/system/cpu/cpu7/online 507 grep . /sys/devices/system/cpu/cpufreq/*/* 508 echo 1 > /sys/devices/system/cpu/cpu4/online 509 dmesg But still no lockdep :( Can you try that on mainline as well ?
diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm index 4e1131e..688f102 100644 --- a/drivers/cpufreq/Kconfig.arm +++ b/drivers/cpufreq/Kconfig.arm @@ -114,6 +114,17 @@ config ARM_QCOM_CPUFREQ_KRYO If in doubt, say N. +config ARM_QCOM_CPUFREQ_HW + tristate "QCOM CPUFreq HW driver" + depends on ARCH_QCOM || COMPILE_TEST + help + Support for the CPUFreq HW driver. + Some QCOM chipsets have a HW engine to offload the steps + necessary for changing the frequency of the CPUs. Firmware loaded + in this engine exposes a programming interface to the OS. + The driver implements the cpufreq interface for this HW engine. + Say Y if you want to support CPUFreq HW. + config ARM_S3C_CPUFREQ bool help diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile index d5ee456..08c071b 100644 --- a/drivers/cpufreq/Makefile +++ b/drivers/cpufreq/Makefile @@ -61,6 +61,7 @@ obj-$(CONFIG_MACH_MVEBU_V7) += mvebu-cpufreq.o obj-$(CONFIG_ARM_OMAP2PLUS_CPUFREQ) += omap-cpufreq.o obj-$(CONFIG_ARM_PXA2xx_CPUFREQ) += pxa2xx-cpufreq.o obj-$(CONFIG_PXA3xx) += pxa3xx-cpufreq.o +obj-$(CONFIG_ARM_QCOM_CPUFREQ_HW) += qcom-cpufreq-hw.o obj-$(CONFIG_ARM_QCOM_CPUFREQ_KRYO) += qcom-cpufreq-kryo.o obj-$(CONFIG_ARM_S3C2410_CPUFREQ) += s3c2410-cpufreq.o obj-$(CONFIG_ARM_S3C2412_CPUFREQ) += s3c2412-cpufreq.o diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c new file mode 100644 index 0000000..a65d0ae --- /dev/null +++ b/drivers/cpufreq/qcom-cpufreq-hw.c @@ -0,0 +1,305 @@ +// SPDX-License-Identifier: GPL-2.0 +/* + * Copyright (c) 2018, The Linux Foundation. All rights reserved. + */ + +#include <linux/bitfield.h> +#include <linux/cpufreq.h> +#include <linux/init.h> +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/of_address.h> +#include <linux/of_platform.h> +#include <linux/slab.h> + +#define LUT_MAX_ENTRIES 40U +#define LUT_SRC GENMASK(31, 30) +#define LUT_L_VAL GENMASK(7, 0) +#define LUT_CORE_COUNT GENMASK(18, 16) +#define LUT_ROW_SIZE 32 +#define CLK_HW_DIV 2 + +/* Register offsets */ +#define REG_ENABLE 0x0 +#define REG_LUT_TABLE 0x110 +#define REG_PERF_STATE 0x920 + +static unsigned long cpu_hw_rate, xo_rate; +static struct platform_device *global_pdev; + +static int qcom_cpufreq_hw_target_index(struct cpufreq_policy *policy, + unsigned int index) +{ + void __iomem *perf_state_reg = policy->driver_data; + + writel_relaxed(index, perf_state_reg); + + return 0; +} + +static unsigned int qcom_cpufreq_hw_get(unsigned int cpu) +{ + void __iomem *perf_state_reg; + struct cpufreq_policy *policy; + unsigned int index; + + policy = cpufreq_cpu_get_raw(cpu); + if (!policy) + return 0; + + perf_state_reg = policy->driver_data; + + index = readl_relaxed(perf_state_reg); + index = min(index, LUT_MAX_ENTRIES - 1); + + return policy->freq_table[index].frequency; +} + +static unsigned int qcom_cpufreq_hw_fast_switch(struct cpufreq_policy *policy, + unsigned int target_freq) +{ + void __iomem *perf_state_reg = policy->driver_data; + int index; + + index = policy->cached_resolved_idx; + if (index < 0) + return 0; + + writel_relaxed(index, perf_state_reg); + + return policy->freq_table[index].frequency; +} + +static int qcom_cpufreq_hw_read_lut(struct device *dev, + struct cpufreq_policy *policy, + void __iomem *base) +{ + u32 data, src, lval, i, core_count, prev_cc = 0, prev_freq = 0, freq; + unsigned int max_cores = cpumask_weight(policy->cpus); + struct cpufreq_frequency_table *table; + + table = kcalloc(LUT_MAX_ENTRIES + 1, sizeof(*table), GFP_KERNEL); + if (!table) + return -ENOMEM; + + for (i = 0; i < LUT_MAX_ENTRIES; i++) { + data = readl_relaxed(base + REG_LUT_TABLE + i * LUT_ROW_SIZE); + src = FIELD_GET(LUT_SRC, data); + lval = FIELD_GET(LUT_L_VAL, data); + core_count = FIELD_GET(LUT_CORE_COUNT, data); + + if (src) + freq = xo_rate * lval / 1000; + else + freq = cpu_hw_rate / 1000; + + /* Ignore boosts in the middle of the table */ + if (core_count != max_cores) { + table[i].frequency = CPUFREQ_ENTRY_INVALID; + } else { + table[i].frequency = freq; + dev_dbg(dev, "index=%d freq=%d, core_count %d\n", i, + freq, core_count); + } + + /* + * Two of the same frequencies with the same core counts means + * end of table + */ + if (i > 0 && prev_freq == freq && prev_cc == core_count) { + struct cpufreq_frequency_table *prev = &table[i - 1]; + + /* + * Only treat the last frequency that might be a boost + * as the boost frequency + */ + if (prev_cc != max_cores) { + prev->frequency = prev_freq; + prev->flags = CPUFREQ_BOOST_FREQ; + } + + break; + } + + prev_cc = core_count; + prev_freq = freq; + } + + table[i].frequency = CPUFREQ_TABLE_END; + policy->freq_table = table; + + return 0; +} + +static void qcom_get_related_cpus(int index, struct cpumask *m) +{ + struct device_node *cpu_np; + struct of_phandle_args args; + int cpu, ret; + + for_each_possible_cpu(cpu) { + cpu_np = of_cpu_device_node_get(cpu); + if (!cpu_np) + continue; + + ret = of_parse_phandle_with_args(cpu_np, "qcom,freq-domain", + "#freq-domain-cells", 0, + &args); + of_node_put(cpu_np); + if (ret < 0) + continue; + + if (index == args.args[0]) + cpumask_set_cpu(cpu, m); + } +} + +static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy) +{ + struct device *dev = &global_pdev->dev; + struct of_phandle_args args; + struct device_node *cpu_np; + struct resource *res; + void __iomem *base; + int ret, index; + + cpu_np = of_cpu_device_node_get(policy->cpu); + if (!cpu_np) + return -EINVAL; + + ret = of_parse_phandle_with_args(cpu_np, "qcom,freq-domain", + "#freq-domain-cells", 0, &args); + of_node_put(cpu_np); + if (ret) + return ret; + + index = args.args[0]; + + res = platform_get_resource(global_pdev, IORESOURCE_MEM, index); + base = devm_ioremap_resource(dev, res); + if (IS_ERR(base)) + return PTR_ERR(base); + + /* HW should be in enabled state to proceed */ + if (!(readl_relaxed(base + REG_ENABLE) & 0x1)) { + dev_err(dev, "Domain-%d cpufreq hardware not enabled\n", index); + ret = -ENODEV; + goto error; + } + + qcom_get_related_cpus(index, policy->cpus); + if (!cpumask_weight(policy->cpus)) { + dev_err(dev, "Domain-%d failed to get related CPUs\n", index); + ret = -ENOENT; + goto error; + } + + policy->driver_data = base + REG_PERF_STATE; + + ret = qcom_cpufreq_hw_read_lut(dev, policy, base); + if (ret) { + dev_err(dev, "Domain-%d failed to read LUT\n", index); + goto error; + } + + policy->fast_switch_possible = true; + + return 0; +error: + devm_iounmap(dev, base); + return ret; +} + +static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy) +{ + void __iomem *base = policy->driver_data - REG_PERF_STATE; + + kfree(policy->freq_table); + devm_iounmap(&global_pdev->dev, base); + + return 0; +} + +static struct freq_attr *qcom_cpufreq_hw_attr[] = { + &cpufreq_freq_attr_scaling_available_freqs, + &cpufreq_freq_attr_scaling_boost_freqs, + NULL +}; + +static struct cpufreq_driver cpufreq_qcom_hw_driver = { + .flags = CPUFREQ_STICKY | CPUFREQ_NEED_INITIAL_FREQ_CHECK | + CPUFREQ_HAVE_GOVERNOR_PER_POLICY, + .verify = cpufreq_generic_frequency_table_verify, + .target_index = qcom_cpufreq_hw_target_index, + .get = qcom_cpufreq_hw_get, + .init = qcom_cpufreq_hw_cpu_init, + .exit = qcom_cpufreq_hw_cpu_exit, + .fast_switch = qcom_cpufreq_hw_fast_switch, + .name = "qcom-cpufreq-hw", + .attr = qcom_cpufreq_hw_attr, +}; + +static int qcom_cpufreq_hw_driver_probe(struct platform_device *pdev) +{ + struct clk *clk; + int ret; + + clk = clk_get(&pdev->dev, "xo"); + if (IS_ERR(clk)) + return PTR_ERR(clk); + + xo_rate = clk_get_rate(clk); + clk_put(clk); + + clk = clk_get(&pdev->dev, "alternate"); + if (IS_ERR(clk)) + return PTR_ERR(clk); + + cpu_hw_rate = clk_get_rate(clk) / CLK_HW_DIV; + clk_put(clk); + + global_pdev = pdev; + + ret = cpufreq_register_driver(&cpufreq_qcom_hw_driver); + if (ret) + dev_err(&pdev->dev, "CPUFreq HW driver failed to register\n"); + else + dev_dbg(&pdev->dev, "QCOM CPUFreq HW driver initialized\n"); + + return ret; +} + +static int qcom_cpufreq_hw_driver_remove(struct platform_device *pdev) +{ + return cpufreq_unregister_driver(&cpufreq_qcom_hw_driver); +} + +static const struct of_device_id qcom_cpufreq_hw_match[] = { + { .compatible = "qcom,cpufreq-hw" }, + {} +}; +MODULE_DEVICE_TABLE(of, qcom_cpufreq_hw_match); + +static struct platform_driver qcom_cpufreq_hw_driver = { + .probe = qcom_cpufreq_hw_driver_probe, + .remove = qcom_cpufreq_hw_driver_remove, + .driver = { + .name = "qcom-cpufreq-hw", + .of_match_table = qcom_cpufreq_hw_match, + }, +}; + +static int __init qcom_cpufreq_hw_init(void) +{ + return platform_driver_register(&qcom_cpufreq_hw_driver); +} +subsys_initcall(qcom_cpufreq_hw_init); + +static void __exit qcom_cpufreq_hw_exit(void) +{ + platform_driver_unregister(&qcom_cpufreq_hw_driver); +} +module_exit(qcom_cpufreq_hw_exit); + +MODULE_DESCRIPTION("QCOM CPUFREQ HW Driver"); +MODULE_LICENSE("GPL v2");