diff mbox series

[v12,2/2] cpufreq: qcom-hw: Add support for QCOM cpufreq HW driver

Message ID 1544687394-331-3-git-send-email-tdas@codeaurora.org (mailing list archive)
State Superseded, archived
Headers show
Series cpufreq: qcom-hw: Add support for QCOM cpufreq HW | expand

Commit Message

Taniya Das Dec. 13, 2018, 7:49 a.m. UTC
The CPUfreq HW present in some QCOM chipsets offloads the steps necessary
for changing the frequency of CPUs. The driver implements the cpufreq
driver interface for this hardware engine.

Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
Signed-off-by: Stephen Boyd <swboyd@chromium.org>
Signed-off-by: Taniya Das <tdas@codeaurora.org>
---
 drivers/cpufreq/Kconfig.arm       |  11 ++
 drivers/cpufreq/Makefile          |   1 +
 drivers/cpufreq/qcom-cpufreq-hw.c | 305 ++++++++++++++++++++++++++++++++++++++
 3 files changed, 317 insertions(+)
 create mode 100644 drivers/cpufreq/qcom-cpufreq-hw.c

--
Qualcomm INDIA, on behalf of Qualcomm Innovation Center, Inc.is a member
of the Code Aurora Forum, hosted by the  Linux Foundation.

Comments

Viresh Kumar Dec. 13, 2018, 9:13 a.m. UTC | #1
On 13-12-18, 13:19, Taniya Das wrote:
> The CPUfreq HW present in some QCOM chipsets offloads the steps necessary
> for changing the frequency of CPUs. The driver implements the cpufreq
> driver interface for this hardware engine.
> 
> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> Signed-off-by: Stephen Boyd <swboyd@chromium.org>
> Signed-off-by: Taniya Das <tdas@codeaurora.org>
> ---
>  drivers/cpufreq/Kconfig.arm       |  11 ++
>  drivers/cpufreq/Makefile          |   1 +
>  drivers/cpufreq/qcom-cpufreq-hw.c | 305 ++++++++++++++++++++++++++++++++++++++
>  3 files changed, 317 insertions(+)
>  create mode 100644 drivers/cpufreq/qcom-cpufreq-hw.c

Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
Stephen Boyd Dec. 13, 2018, 9:58 a.m. UTC | #2
Quoting Taniya Das (2018-12-12 23:49:54)
> The CPUfreq HW present in some QCOM chipsets offloads the steps necessary
> for changing the frequency of CPUs. The driver implements the cpufreq
> driver interface for this hardware engine.
> 
> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> Signed-off-by: Stephen Boyd <swboyd@chromium.org>
> Signed-off-by: Taniya Das <tdas@codeaurora.org>
> ---

Reviewed-by: Stephen Boyd <swboyd@chromium.org>

But I noticed that we don't release the I/O region anymore so hotplug
and replug of a whole clk domain fails. I guess devm_ioremap_resource()
was just too much magic so how about we downgrade to devm_ioremap()
instead?

BTW, Viresh, I see a lockdep splat when cpufreq_init returns an error
upon bringing the policy online the second time. I guess cpufreq_stats
aren't able to be freed from there because they take locks in different
order vs. the normal path?

-----8<-------
diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
index fce7a1162e87..0e1105151478 100644
--- a/drivers/cpufreq/qcom-cpufreq-hw.c
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -182,9 +182,12 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
 	index = args.args[0];
 
 	res = platform_get_resource(global_pdev, IORESOURCE_MEM, index);
-	base = devm_ioremap_resource(dev, res);
-	if (IS_ERR(base))
-		return PTR_ERR(base);
+	if (!res)
+		return -ENODEV;
+
+	base = devm_ioremap(dev, res->start, resource_size(res));
+	if (!base)
+		return -ENOMEM;
 
 	/* HW should be in enabled state to proceed */
 	if (!(readl_relaxed(base + REG_ENABLE) & 0x1)) {
Viresh Kumar Dec. 13, 2018, 10:05 a.m. UTC | #3
On 13-12-18, 01:58, Stephen Boyd wrote:
> BTW, Viresh, I see a lockdep splat when cpufreq_init returns an error
> upon bringing the policy online the second time. I guess cpufreq_stats
> aren't able to be freed from there because they take locks in different
> order vs. the normal path?

Please share the lockdep report and the steps to reproduce it. I will
see if I can simulate the failure forcefully..
Stephen Boyd Dec. 13, 2018, 10:12 a.m. UTC | #4
Quoting Viresh Kumar (2018-12-13 02:05:06)
> On 13-12-18, 01:58, Stephen Boyd wrote:
> > BTW, Viresh, I see a lockdep splat when cpufreq_init returns an error
> > upon bringing the policy online the second time. I guess cpufreq_stats
> > aren't able to be freed from there because they take locks in different
> > order vs. the normal path?
> 
> Please share the lockdep report and the steps to reproduce it. I will
> see if I can simulate the failure forcefully..
> 

It's on a v4.19 kernel with this cpufreq hw driver backported to it. I
think all it takes is to return an error the second time the policy is
initialized when cpufreq_online() calls into the cpufreq driver.

 ======================================================
 WARNING: possible circular locking dependency detected
 4.19.8 #61 Tainted: G        W
 ------------------------------------------------------
 cpuhp/5/36 is trying to acquire lock:
 000000003e901e8a (kn->count#326){++++}, at: kernfs_remove_by_name_ns+0x44/0x80

 but task is already holding lock:
 00000000dd7f52c3 (&policy->rwsem){++++}, at: cpufreq_policy_free+0x17c/0x1cc

 which lock already depends on the new lock.


 the existing dependency chain (in reverse order) is:

 -> #1 (&policy->rwsem){++++}:
        down_read+0x50/0xcc
        show+0x30/0x78
        sysfs_kf_seq_show+0x17c/0x25c
        kernfs_seq_show+0xb4/0xf8
        seq_read+0x4a8/0x8f0
        kernfs_fop_read+0xe0/0x360
        __vfs_read+0x80/0x328
        vfs_read+0xd0/0x1d4
        ksys_read+0x88/0x118
        __arm64_sys_read+0x4c/0x5c
        el0_svc_common+0x124/0x1c4
        el0_svc_compat_handler+0x64/0x8c
        el0_svc_compat+0x8/0x18

 -> #0 (kn->count#326){++++}:
        lock_acquire+0x244/0x360
        __kernfs_remove+0x324/0x4fc
        kernfs_remove_by_name_ns+0x44/0x80
        remove_files+0x5c/0xd8
        sysfs_remove_group+0xb4/0xec
        cpufreq_stats_free_table+0x50/0x9c
        cpufreq_policy_free+0x184/0x1cc
        cpufreq_online+0xa44/0xc4c
        cpuhp_cpufreq_online+0x1c/0x2c
        cpuhp_invoke_callback+0x608/0x1328
        cpuhp_thread_fun+0x1dc/0x440
        smpboot_thread_fn+0x46c/0x7c0
        kthread+0x248/0x260
        ret_from_fork+0x10/0x18

 other info that might help us debug this:

  Possible unsafe locking scenario:

        CPU0                    CPU1
        ----                    ----
   lock(&policy->rwsem);
                                lock(kn->count#326);
                                lock(&policy->rwsem);
   lock(kn->count#326);

  *** DEADLOCK ***

 2 locks held by cpuhp/5/36:
  #0: 00000000549a28c3 (cpuhp_state-up){+.+.}, at: cpuhp_lock_acquire+0x8/0x48
  #1: 00000000dd7f52c3 (&policy->rwsem){++++}, at: cpufreq_policy_free+0x17c/0x1cc

 stack backtrace:
 CPU: 5 PID: 36 Comm: cpuhp/5 Tainted: G        W         4.19.8 #61
 Call trace:
  dump_backtrace+0x0/0x2f8
  show_stack+0x20/0x2c
  __dump_stack+0x20/0x28
  dump_stack+0xcc/0x10c
  print_circular_bug+0x2c0/0x328
  __lock_acquire+0x22b0/0x2708
  lock_acquire+0x244/0x360
  __kernfs_remove+0x324/0x4fc
  kernfs_remove_by_name_ns+0x44/0x80
  remove_files+0x5c/0xd8
  sysfs_remove_group+0xb4/0xec
  cpufreq_stats_free_table+0x50/0x9c
  cpufreq_policy_free+0x184/0x1cc
  cpufreq_online+0xa44/0xc4c
  cpuhp_cpufreq_online+0x1c/0x2c
  cpuhp_invoke_callback+0x608/0x1328
  cpuhp_thread_fun+0x1dc/0x440
  smpboot_thread_fn+0x46c/0x7c0
  kthread+0x248/0x260
  ret_from_fork+0x10/0x18
Viresh Kumar Dec. 13, 2018, 10:14 a.m. UTC | #5
On 13-12-18, 02:12, Stephen Boyd wrote:
> It's on a v4.19 kernel with this cpufreq hw driver backported to it. I
> think all it takes is to return an error the second time the policy is
> initialized when cpufreq_online() calls into the cpufreq driver.

What do you mean by "the second time the policy is initialized" ?

We call cpufreq_online() only once for each policy.
Stephen Boyd Dec. 13, 2018, 10:32 a.m. UTC | #6
Quoting Viresh Kumar (2018-12-13 02:14:17)
> On 13-12-18, 02:12, Stephen Boyd wrote:
> > It's on a v4.19 kernel with this cpufreq hw driver backported to it. I
> > think all it takes is to return an error the second time the policy is
> > initialized when cpufreq_online() calls into the cpufreq driver.
> 
> What do you mean by "the second time the policy is initialized" ?
> 
> We call cpufreq_online() only once for each policy.
> 

I have one policy for four CPUs. So take down all four of those CPUs by
writing a 0 to the online file for each CPU, and then bring them back
online. That should make cpufreq_driver->init() be called twice, once
during boot when the CPUs are bound to the cpufreq devices, and second
from the sysfs write when the user brings the first CPU in that policy
online again by writing a 1 to the online file. If that second time it
fails I suspect we hit the lockdep warning.

I may have also catted the policy*/stat/* files before offlining any
CPUs. If you can't trigger it I can try to reproduce again with a static
counter in cpufreq_online() that fails when set to 1 or something.
Viresh Kumar Dec. 13, 2018, 10:36 a.m. UTC | #7
On 13-12-18, 02:32, Stephen Boyd wrote:
> I have one policy for four CPUs. So take down all four of those CPUs by
> writing a 0 to the online file for each CPU, and then bring them back
> online. That should make cpufreq_driver->init() be called twice, once
> during boot when the CPUs are bound to the cpufreq devices, and second
> from the sysfs write when the user brings the first CPU in that policy
> online again by writing a 1 to the online file. If that second time it
> fails I suspect we hit the lockdep warning.

Okay, that is a special sequence. I will try to reproduce that
locally. Thanks.
Matthias Kaehlcke Dec. 13, 2018, 7:43 p.m. UTC | #8
On Thu, Dec 13, 2018 at 01:19:54PM +0530, Taniya Das wrote:
> The CPUfreq HW present in some QCOM chipsets offloads the steps necessary
> for changing the frequency of CPUs. The driver implements the cpufreq
> driver interface for this hardware engine.
> 
> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
> Signed-off-by: Stephen Boyd <swboyd@chromium.org>
> Signed-off-by: Taniya Das <tdas@codeaurora.org>
> ---
>  drivers/cpufreq/Kconfig.arm       |  11 ++
>  drivers/cpufreq/Makefile          |   1 +
>  drivers/cpufreq/qcom-cpufreq-hw.c | 305 ++++++++++++++++++++++++++++++++++++++
>  3 files changed, 317 insertions(+)
>  create mode 100644 drivers/cpufreq/qcom-cpufreq-hw.c
> 
> diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
> index 4e1131e..688f102 100644
> --- a/drivers/cpufreq/Kconfig.arm
> +++ b/drivers/cpufreq/Kconfig.arm
> @@ -114,6 +114,17 @@ config ARM_QCOM_CPUFREQ_KRYO
> 
>  	  If in doubt, say N.
> 
> +config ARM_QCOM_CPUFREQ_HW
> +	tristate "QCOM CPUFreq HW driver"
> +	depends on ARCH_QCOM || COMPILE_TEST
> +	help
> +	  Support for the CPUFreq HW driver.
> +	  Some QCOM chipsets have a HW engine to offload the steps
> +	  necessary for changing the frequency of the CPUs. Firmware loaded
> +	  in this engine exposes a programming interface to the OS.
> +	  The driver implements the cpufreq interface for this HW engine.
> +	  Say Y if you want to support CPUFreq HW.
> +
>  config ARM_S3C_CPUFREQ
>  	bool
>  	help
> diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
> index d5ee456..08c071b 100644
> --- a/drivers/cpufreq/Makefile
> +++ b/drivers/cpufreq/Makefile
> @@ -61,6 +61,7 @@ obj-$(CONFIG_MACH_MVEBU_V7)		+= mvebu-cpufreq.o
>  obj-$(CONFIG_ARM_OMAP2PLUS_CPUFREQ)	+= omap-cpufreq.o
>  obj-$(CONFIG_ARM_PXA2xx_CPUFREQ)	+= pxa2xx-cpufreq.o
>  obj-$(CONFIG_PXA3xx)			+= pxa3xx-cpufreq.o
> +obj-$(CONFIG_ARM_QCOM_CPUFREQ_HW)	+= qcom-cpufreq-hw.o
>  obj-$(CONFIG_ARM_QCOM_CPUFREQ_KRYO)	+= qcom-cpufreq-kryo.o
>  obj-$(CONFIG_ARM_S3C2410_CPUFREQ)	+= s3c2410-cpufreq.o
>  obj-$(CONFIG_ARM_S3C2412_CPUFREQ)	+= s3c2412-cpufreq.o
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> new file mode 100644
> index 0000000..a65d0ae
> --- /dev/null
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -0,0 +1,305 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (c) 2018, The Linux Foundation. All rights reserved.
> + */
> +
> +#include <linux/bitfield.h>
> +#include <linux/cpufreq.h>
> +#include <linux/init.h>
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +#include <linux/of_address.h>
> +#include <linux/of_platform.h>
> +#include <linux/slab.h>
> +
> +#define LUT_MAX_ENTRIES			40U
> +#define LUT_SRC				GENMASK(31, 30)
> +#define LUT_L_VAL			GENMASK(7, 0)
> +#define LUT_CORE_COUNT			GENMASK(18, 16)
> +#define LUT_ROW_SIZE			32
> +#define CLK_HW_DIV			2
> +
> +/* Register offsets */
> +#define REG_ENABLE			0x0
> +#define REG_LUT_TABLE			0x110
> +#define REG_PERF_STATE			0x920
> +
> +static unsigned long cpu_hw_rate, xo_rate;
> +static struct platform_device *global_pdev;
> +
> +static int qcom_cpufreq_hw_target_index(struct cpufreq_policy *policy,
> +					unsigned int index)
> +{
> +	void __iomem *perf_state_reg = policy->driver_data;
> +
> +	writel_relaxed(index, perf_state_reg);
> +
> +	return 0;
> +}
> +
> +static unsigned int qcom_cpufreq_hw_get(unsigned int cpu)
> +{
> +	void __iomem *perf_state_reg;
> +	struct cpufreq_policy *policy;
> +	unsigned int index;
> +
> +	policy = cpufreq_cpu_get_raw(cpu);
> +	if (!policy)
> +		return 0;
> +
> +	perf_state_reg = policy->driver_data;
> +
> +	index = readl_relaxed(perf_state_reg);
> +	index = min(index, LUT_MAX_ENTRIES - 1);
> +
> +	return policy->freq_table[index].frequency;
> +}
> +
> +static unsigned int qcom_cpufreq_hw_fast_switch(struct cpufreq_policy *policy,
> +						unsigned int target_freq)
> +{
> +	void __iomem *perf_state_reg = policy->driver_data;
> +	int index;
> +
> +	index = policy->cached_resolved_idx;
> +	if (index < 0)
> +		return 0;
> +
> +	writel_relaxed(index, perf_state_reg);
> +
> +	return policy->freq_table[index].frequency;
> +}
> +
> +static int qcom_cpufreq_hw_read_lut(struct device *dev,
> +				    struct cpufreq_policy *policy,
> +				    void __iomem *base)
> +{
> +	u32 data, src, lval, i, core_count, prev_cc = 0, prev_freq = 0, freq;
> +	unsigned int max_cores = cpumask_weight(policy->cpus);
> +	struct cpufreq_frequency_table	*table;
> +
> +	table = kcalloc(LUT_MAX_ENTRIES + 1, sizeof(*table), GFP_KERNEL);
> +	if (!table)
> +		return -ENOMEM;
> +
> +	for (i = 0; i < LUT_MAX_ENTRIES; i++) {
> +		data = readl_relaxed(base + REG_LUT_TABLE + i * LUT_ROW_SIZE);
> +		src = FIELD_GET(LUT_SRC, data);
> +		lval = FIELD_GET(LUT_L_VAL, data);
> +		core_count = FIELD_GET(LUT_CORE_COUNT, data);
> +
> +		if (src)
> +			freq = xo_rate * lval / 1000;
> +		else
> +			freq = cpu_hw_rate / 1000;
> +
> +		/* Ignore boosts in the middle of the table */
> +		if (core_count != max_cores) {
> +			table[i].frequency = CPUFREQ_ENTRY_INVALID;
> +		} else {
> +			table[i].frequency = freq;
> +			dev_dbg(dev, "index=%d freq=%d, core_count %d\n", i,
> +				freq, core_count);
> +		}
> +
> +		/*
> +		 * Two of the same frequencies with the same core counts means
> +		 * end of table
> +		 */
> +		if (i > 0 && prev_freq == freq && prev_cc == core_count) {
> +			struct cpufreq_frequency_table *prev = &table[i - 1];
> +
> +			/*
> +			 * Only treat the last frequency that might be a boost
> +			 * as the boost frequency
> +			 */
> +			if (prev_cc != max_cores) {
> +				prev->frequency = prev_freq;
> +				prev->flags = CPUFREQ_BOOST_FREQ;
> +			}
> +
> +			break;
> +		}
> +
> +		prev_cc = core_count;
> +		prev_freq = freq;
> +	}
> +
> +	table[i].frequency = CPUFREQ_TABLE_END;
> +	policy->freq_table = table;
> +
> +	return 0;
> +}
> +
> +static void qcom_get_related_cpus(int index, struct cpumask *m)
> +{
> +	struct device_node *cpu_np;
> +	struct of_phandle_args args;
> +	int cpu, ret;
> +
> +	for_each_possible_cpu(cpu) {
> +		cpu_np = of_cpu_device_node_get(cpu);
> +		if (!cpu_np)
> +			continue;
> +
> +		ret = of_parse_phandle_with_args(cpu_np, "qcom,freq-domain",
> +						 "#freq-domain-cells", 0,
> +						 &args);
> +		of_node_put(cpu_np);
> +		if (ret < 0)
> +			continue;
> +
> +		if (index == args.args[0])
> +			cpumask_set_cpu(cpu, m);
> +	}
> +}
> +
> +static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
> +{
> +	struct device *dev = &global_pdev->dev;
> +	struct of_phandle_args args;
> +	struct device_node *cpu_np;
> +	struct resource *res;
> +	void __iomem *base;
> +	int ret, index;
> +
> +	cpu_np = of_cpu_device_node_get(policy->cpu);
> +	if (!cpu_np)
> +		return -EINVAL;
> +
> +	ret = of_parse_phandle_with_args(cpu_np, "qcom,freq-domain",
> +					 "#freq-domain-cells", 0, &args);
> +	of_node_put(cpu_np);
> +	if (ret)
> +		return ret;
> +
> +	index = args.args[0];
> +
> +	res = platform_get_resource(global_pdev, IORESOURCE_MEM, index);
> +	base = devm_ioremap_resource(dev, res);
> +	if (IS_ERR(base))
> +		return PTR_ERR(base);
> +
> +	/* HW should be in enabled state to proceed */
> +	if (!(readl_relaxed(base + REG_ENABLE) & 0x1)) {
> +		dev_err(dev, "Domain-%d cpufreq hardware not enabled\n", index);
> +		ret = -ENODEV;
> +		goto error;
> +	}
> +
> +	qcom_get_related_cpus(index, policy->cpus);
> +	if (!cpumask_weight(policy->cpus)) {
> +		dev_err(dev, "Domain-%d failed to get related CPUs\n", index);
> +		ret = -ENOENT;
> +		goto error;
> +	}
> +
> +	policy->driver_data = base + REG_PERF_STATE;
> +
> +	ret = qcom_cpufreq_hw_read_lut(dev, policy, base);
> +	if (ret) {
> +		dev_err(dev, "Domain-%d failed to read LUT\n", index);
> +		goto error;
> +	}
> +
> +	policy->fast_switch_possible = true;
> +
> +	return 0;
> +error:
> +	devm_iounmap(dev, base);
> +	return ret;
> +}
> +
> +static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
> +{
> +	void __iomem *base = policy->driver_data - REG_PERF_STATE;
> +
> +	kfree(policy->freq_table);
> +	devm_iounmap(&global_pdev->dev, base);
> +
> +	return 0;
> +}
> +
> +static struct freq_attr *qcom_cpufreq_hw_attr[] = {
> +	&cpufreq_freq_attr_scaling_available_freqs,
> +	&cpufreq_freq_attr_scaling_boost_freqs,
> +	NULL
> +};
> +
> +static struct cpufreq_driver cpufreq_qcom_hw_driver = {
> +	.flags		= CPUFREQ_STICKY | CPUFREQ_NEED_INITIAL_FREQ_CHECK |
> +			  CPUFREQ_HAVE_GOVERNOR_PER_POLICY,
> +	.verify		= cpufreq_generic_frequency_table_verify,
> +	.target_index	= qcom_cpufreq_hw_target_index,
> +	.get		= qcom_cpufreq_hw_get,
> +	.init		= qcom_cpufreq_hw_cpu_init,
> +	.exit		= qcom_cpufreq_hw_cpu_exit,
> +	.fast_switch    = qcom_cpufreq_hw_fast_switch,
> +	.name		= "qcom-cpufreq-hw",
> +	.attr		= qcom_cpufreq_hw_attr,
> +};
> +
> +static int qcom_cpufreq_hw_driver_probe(struct platform_device *pdev)
> +{
> +	struct clk *clk;
> +	int ret;
> +
> +	clk = clk_get(&pdev->dev, "xo");
> +	if (IS_ERR(clk))
> +		return PTR_ERR(clk);
> +
> +	xo_rate = clk_get_rate(clk);
> +	clk_put(clk);
> +
> +	clk = clk_get(&pdev->dev, "alternate");
> +	if (IS_ERR(clk))
> +		return PTR_ERR(clk);
> +
> +	cpu_hw_rate = clk_get_rate(clk) / CLK_HW_DIV;
> +	clk_put(clk);
> +
> +	global_pdev = pdev;
> +
> +	ret = cpufreq_register_driver(&cpufreq_qcom_hw_driver);
> +	if (ret)
> +		dev_err(&pdev->dev, "CPUFreq HW driver failed to register\n");
> +	else
> +		dev_dbg(&pdev->dev, "QCOM CPUFreq HW driver initialized\n");
> +
> +	return ret;
> +}
> +
> +static int qcom_cpufreq_hw_driver_remove(struct platform_device *pdev)
> +{
> +	return cpufreq_unregister_driver(&cpufreq_qcom_hw_driver);
> +}
> +
> +static const struct of_device_id qcom_cpufreq_hw_match[] = {
> +	{ .compatible = "qcom,cpufreq-hw" },
> +	{}
> +};
> +MODULE_DEVICE_TABLE(of, qcom_cpufreq_hw_match);
> +
> +static struct platform_driver qcom_cpufreq_hw_driver = {
> +	.probe = qcom_cpufreq_hw_driver_probe,
> +	.remove = qcom_cpufreq_hw_driver_remove,
> +	.driver = {
> +		.name = "qcom-cpufreq-hw",
> +		.of_match_table = qcom_cpufreq_hw_match,
> +	},
> +};
> +
> +static int __init qcom_cpufreq_hw_init(void)
> +{
> +	return platform_driver_register(&qcom_cpufreq_hw_driver);
> +}
> +subsys_initcall(qcom_cpufreq_hw_init);
> +
> +static void __exit qcom_cpufreq_hw_exit(void)
> +{
> +	platform_driver_unregister(&qcom_cpufreq_hw_driver);
> +}
> +module_exit(qcom_cpufreq_hw_exit);
> +
> +MODULE_DESCRIPTION("QCOM CPUFREQ HW Driver");
> +MODULE_LICENSE("GPL v2");

FWIW:

Reviewed-by: Matthias Kaehlcke <mka@chromium.org>
Taniya Das Dec. 14, 2018, 4:10 a.m. UTC | #9
Hello Stephen, Viresh,

On 12/13/2018 3:28 PM, Stephen Boyd wrote:
> Quoting Taniya Das (2018-12-12 23:49:54)
>> The CPUfreq HW present in some QCOM chipsets offloads the steps necessary
>> for changing the frequency of CPUs. The driver implements the cpufreq
>> driver interface for this hardware engine.
>>
>> Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
>> Signed-off-by: Stephen Boyd <swboyd@chromium.org>
>> Signed-off-by: Taniya Das <tdas@codeaurora.org>
>> ---
> 
> Reviewed-by: Stephen Boyd <swboyd@chromium.org>
> 
> But I noticed that we don't release the I/O region anymore so hotplug
> and replug of a whole clk domain fails. I guess devm_ioremap_resource()
> was just too much magic so how about we downgrade to devm_ioremap()
> instead?
> 
> BTW, Viresh, I see a lockdep splat when cpufreq_init returns an error
> upon bringing the policy online the second time. I guess cpufreq_stats
> aren't able to be freed from there because they take locks in different
> order vs. the normal path?
> 
> -----8<-------
> diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
> index fce7a1162e87..0e1105151478 100644
> --- a/drivers/cpufreq/qcom-cpufreq-hw.c
> +++ b/drivers/cpufreq/qcom-cpufreq-hw.c
> @@ -182,9 +182,12 @@ static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
>   	index = args.args[0];
>   
>   	res = platform_get_resource(global_pdev, IORESOURCE_MEM, index);
> -	base = devm_ioremap_resource(dev, res);
> -	if (IS_ERR(base))
> -		return PTR_ERR(base);
> +	if (!res)
> +		return -ENODEV;
> +
> +	base = devm_ioremap(dev, res->start, resource_size(res));
> +	if (!base)
> +		return -ENOMEM;
>   

Updated the above in the next series.

>   	/* HW should be in enabled state to proceed */
>   	if (!(readl_relaxed(base + REG_ENABLE) & 0x1)) {
>
Viresh Kumar Dec. 18, 2018, 5:45 a.m. UTC | #10
Hi Stephen,

On 13-12-18, 02:12, Stephen Boyd wrote:
> Quoting Viresh Kumar (2018-12-13 02:05:06)
> > On 13-12-18, 01:58, Stephen Boyd wrote:
> > > BTW, Viresh, I see a lockdep splat when cpufreq_init returns an error
> > > upon bringing the policy online the second time. I guess cpufreq_stats
> > > aren't able to be freed from there because they take locks in different
> > > order vs. the normal path?
> > 
> > Please share the lockdep report and the steps to reproduce it. I will
> > see if I can simulate the failure forcefully..
> > 
> 
> It's on a v4.19 kernel with this cpufreq hw driver backported to it. I
> think all it takes is to return an error the second time the policy is
> initialized when cpufreq_online() calls into the cpufreq driver.
> 
>  ======================================================
>  WARNING: possible circular locking dependency detected
>  4.19.8 #61 Tainted: G        W
>  ------------------------------------------------------
>  cpuhp/5/36 is trying to acquire lock:
>  000000003e901e8a (kn->count#326){++++}, at: kernfs_remove_by_name_ns+0x44/0x80
> 
>  but task is already holding lock:
>  00000000dd7f52c3 (&policy->rwsem){++++}, at: cpufreq_policy_free+0x17c/0x1cc
> 
>  which lock already depends on the new lock.
> 
> 
>  the existing dependency chain (in reverse order) is:
> 
>  -> #1 (&policy->rwsem){++++}:
>         down_read+0x50/0xcc
>         show+0x30/0x78
>         sysfs_kf_seq_show+0x17c/0x25c
>         kernfs_seq_show+0xb4/0xf8
>         seq_read+0x4a8/0x8f0
>         kernfs_fop_read+0xe0/0x360
>         __vfs_read+0x80/0x328
>         vfs_read+0xd0/0x1d4
>         ksys_read+0x88/0x118
>         __arm64_sys_read+0x4c/0x5c
>         el0_svc_common+0x124/0x1c4
>         el0_svc_compat_handler+0x64/0x8c
>         el0_svc_compat+0x8/0x18

I failed to reproduce it over linux/next.

I had the following changes over linux/next:
https://pastebin.ubuntu.com/p/zkVm77PGdY/

I also did savedefconfig to show what all I changed in it. I faked multiple
clusters on my hikey960 board, which is not big little..

And here is the command list from history that I ran after boot.

  501  grep . /sys/devices/system/cpu/cpufreq/*/*
  502  grep . /sys/devices/system/cpu/cpufreq/*/*/*
  503  grep . /sys/devices/system/cpu/cpufreq/*/*/*
  504  grep . /sys/devices/system/cpu/cpufreq/*/*/*
  505  grep . /sys/devices/system/cpu/cpufreq/*/*/*
  506  grep . /sys/devices/system/cpu/cpufreq/*/*
  507  grep . /sys/devices/system/cpu/cpufreq/*/*
  508  echo 0 > /sys/devices/system/cpu/cpu4/online 
  509  echo 0 > /sys/devices/system/cpu/cpu5/online 
  510  echo 0 > /sys/devices/system/cpu/cpu6/online 
  511  echo 0 > /sys/devices/system/cpu/cpu7/online 
  512  grep . /sys/devices/system/cpu/cpufreq/*/*
  513  grep . /sys/devices/system/cpu/cpufreq/*/*/*
  514  grep . /sys/devices/system/cpu/cpufreq/*/*
  515  echo 1 > /sys/devices/system/cpu/cpu4/online 
  516  grep . /sys/devices/system/cpu/cpufreq/*/*
  517  grep . /sys/devices/system/cpu/cpufreq/*/*/*
  518  dmesg
Stephen Boyd Dec. 18, 2018, 7:13 p.m. UTC | #11
Quoting Viresh Kumar (2018-12-17 21:45:45)
> Hi Stephen,
> 
> On 13-12-18, 02:12, Stephen Boyd wrote:
> > Quoting Viresh Kumar (2018-12-13 02:05:06)
> > > On 13-12-18, 01:58, Stephen Boyd wrote:
> > > > BTW, Viresh, I see a lockdep splat when cpufreq_init returns an error
> > > > upon bringing the policy online the second time. I guess cpufreq_stats
> > > > aren't able to be freed from there because they take locks in different
> > > > order vs. the normal path?
> > > 
> > > Please share the lockdep report and the steps to reproduce it. I will
> > > see if I can simulate the failure forcefully..
> > > 
> > 
> > It's on a v4.19 kernel with this cpufreq hw driver backported to it. I
> > think all it takes is to return an error the second time the policy is
> > initialized when cpufreq_online() calls into the cpufreq driver.
> > 
> >  ======================================================
> >  WARNING: possible circular locking dependency detected
> >  4.19.8 #61 Tainted: G        W
> >  ------------------------------------------------------
> >  cpuhp/5/36 is trying to acquire lock:
> >  000000003e901e8a (kn->count#326){++++}, at: kernfs_remove_by_name_ns+0x44/0x80
> > 
> >  but task is already holding lock:
> >  00000000dd7f52c3 (&policy->rwsem){++++}, at: cpufreq_policy_free+0x17c/0x1cc
> > 
> >  which lock already depends on the new lock.
> > 
> > 
> >  the existing dependency chain (in reverse order) is:
> > 
> >  -> #1 (&policy->rwsem){++++}:
> >         down_read+0x50/0xcc
> >         show+0x30/0x78
> >         sysfs_kf_seq_show+0x17c/0x25c
> >         kernfs_seq_show+0xb4/0xf8
> >         seq_read+0x4a8/0x8f0
> >         kernfs_fop_read+0xe0/0x360
> >         __vfs_read+0x80/0x328
> >         vfs_read+0xd0/0x1d4
> >         ksys_read+0x88/0x118
> >         __arm64_sys_read+0x4c/0x5c
> >         el0_svc_common+0x124/0x1c4
> >         el0_svc_compat_handler+0x64/0x8c
> >         el0_svc_compat+0x8/0x18
> 
> I failed to reproduce it over linux/next.
> 
> I had the following changes over linux/next:
> https://pastebin.ubuntu.com/p/zkVm77PGdY/

I don't see any failure returned from cpufreq_dt's cpufreq_init()
function. Maybe put a static int counter = 0 and then fail
cpufreq_init() the second time that it's called for the same policy
pointer? I have a system with two policies, so I made it fail and return
-EINVAL when the counter == 2 and I see the lockdep splat.

> 
> I also did savedefconfig to show what all I changed in it. I faked multiple
> clusters on my hikey960 board, which is not big little..
> 
> And here is the command list from history that I ran after boot.
> 
>   501  grep . /sys/devices/system/cpu/cpufreq/*/*
>   502  grep . /sys/devices/system/cpu/cpufreq/*/*/*
>   503  grep . /sys/devices/system/cpu/cpufreq/*/*/*
>   504  grep . /sys/devices/system/cpu/cpufreq/*/*/*
>   505  grep . /sys/devices/system/cpu/cpufreq/*/*/*
>   506  grep . /sys/devices/system/cpu/cpufreq/*/*
>   507  grep . /sys/devices/system/cpu/cpufreq/*/*
>   508  echo 0 > /sys/devices/system/cpu/cpu4/online 
>   509  echo 0 > /sys/devices/system/cpu/cpu5/online 
>   510  echo 0 > /sys/devices/system/cpu/cpu6/online 
>   511  echo 0 > /sys/devices/system/cpu/cpu7/online 
>   512  grep . /sys/devices/system/cpu/cpufreq/*/*
>   513  grep . /sys/devices/system/cpu/cpufreq/*/*/*
>   514  grep . /sys/devices/system/cpu/cpufreq/*/*
>   515  echo 1 > /sys/devices/system/cpu/cpu4/online 
>   516  grep . /sys/devices/system/cpu/cpufreq/*/*
>   517  grep . /sys/devices/system/cpu/cpufreq/*/*/*
>   518  dmesg 
> 

I did the following:

  grep . /sys/devices/system/cpu/cpufreq/*/* >/dev/null
  echo 0 > /sys/devices/system/cpu/cpu4/online 
  echo 0 > /sys/devices/system/cpu/cpu5/online 
  echo 0 > /sys/devices/system/cpu/cpu6/online 
  echo 0 > /sys/devices/system/cpu/cpu7/online 
  echo 1 > /sys/devices/system/cpu/cpu4/online 
  dmesg

And boom, lockdep splat.
Viresh Kumar Dec. 19, 2018, 5:44 a.m. UTC | #12
On 18-12-18, 11:13, Stephen Boyd wrote:
> I don't see any failure returned from cpufreq_dt's cpufreq_init()
> function. Maybe put a static int counter = 0 and then fail
> cpufreq_init() the second time that it's called for the same policy
> pointer? I have a system with two policies, so I made it fail and return
> -EINVAL when the counter == 2 and I see the lockdep splat.

Yuck. You were so clear on this earlier, how can I forget it. :(

Here is the additional diff. And I do get this print message on my screen while
I try to online CPU4.

diff --git a/drivers/cpufreq/cpufreq-dt.c b/drivers/cpufreq/cpufreq-dt.c
index 639feca22d27..b836e93fd87d 100644
--- a/drivers/cpufreq/cpufreq-dt.c
+++ b/drivers/cpufreq/cpufreq-dt.c
@@ -151,6 +151,11 @@ static int cpufreq_init(struct cpufreq_policy *policy)
        const char *name;
        int ret;
 
+       if (policy->count++) {
+               pr_info("%s: %d\n", __func__, __LINE__);
+               return -EINVAL;
+       }
+
        cpu_dev = get_cpu_device(policy->cpu);
        if (!cpu_dev) {
                pr_err("failed to get cpu%d device\n", policy->cpu);
diff --git a/include/linux/cpufreq.h b/include/linux/cpufreq.h
index 882a9b9e34bc..643141a2013f 100644
--- a/include/linux/cpufreq.h
+++ b/include/linux/cpufreq.h
@@ -151,6 +151,7 @@ struct cpufreq_policy {
 
        /* For cpufreq driver's internal use */
        void                    *driver_data;
+       int                     count;
 };

> I did the following:
> 
>   grep . /sys/devices/system/cpu/cpufreq/*/* >/dev/null
>   echo 0 > /sys/devices/system/cpu/cpu4/online 
>   echo 0 > /sys/devices/system/cpu/cpu5/online 
>   echo 0 > /sys/devices/system/cpu/cpu6/online 
>   echo 0 > /sys/devices/system/cpu/cpu7/online 
>   echo 1 > /sys/devices/system/cpu/cpu4/online 
>   dmesg
> 
> And boom, lockdep splat.

  502  grep . /sys/devices/system/cpu/cpufreq/*/*
  503  echo 0 > /sys/devices/system/cpu/cpu4/online
  504  echo 0 > /sys/devices/system/cpu/cpu5/online
  505  echo 0 > /sys/devices/system/cpu/cpu6/online
  506  echo 0 > /sys/devices/system/cpu/cpu7/online
  507  grep . /sys/devices/system/cpu/cpufreq/*/*
  508  echo 1 > /sys/devices/system/cpu/cpu4/online
  509  dmesg 


But still no lockdep :(

Can you try that on mainline as well ?
diff mbox series

Patch

diff --git a/drivers/cpufreq/Kconfig.arm b/drivers/cpufreq/Kconfig.arm
index 4e1131e..688f102 100644
--- a/drivers/cpufreq/Kconfig.arm
+++ b/drivers/cpufreq/Kconfig.arm
@@ -114,6 +114,17 @@  config ARM_QCOM_CPUFREQ_KRYO

 	  If in doubt, say N.

+config ARM_QCOM_CPUFREQ_HW
+	tristate "QCOM CPUFreq HW driver"
+	depends on ARCH_QCOM || COMPILE_TEST
+	help
+	  Support for the CPUFreq HW driver.
+	  Some QCOM chipsets have a HW engine to offload the steps
+	  necessary for changing the frequency of the CPUs. Firmware loaded
+	  in this engine exposes a programming interface to the OS.
+	  The driver implements the cpufreq interface for this HW engine.
+	  Say Y if you want to support CPUFreq HW.
+
 config ARM_S3C_CPUFREQ
 	bool
 	help
diff --git a/drivers/cpufreq/Makefile b/drivers/cpufreq/Makefile
index d5ee456..08c071b 100644
--- a/drivers/cpufreq/Makefile
+++ b/drivers/cpufreq/Makefile
@@ -61,6 +61,7 @@  obj-$(CONFIG_MACH_MVEBU_V7)		+= mvebu-cpufreq.o
 obj-$(CONFIG_ARM_OMAP2PLUS_CPUFREQ)	+= omap-cpufreq.o
 obj-$(CONFIG_ARM_PXA2xx_CPUFREQ)	+= pxa2xx-cpufreq.o
 obj-$(CONFIG_PXA3xx)			+= pxa3xx-cpufreq.o
+obj-$(CONFIG_ARM_QCOM_CPUFREQ_HW)	+= qcom-cpufreq-hw.o
 obj-$(CONFIG_ARM_QCOM_CPUFREQ_KRYO)	+= qcom-cpufreq-kryo.o
 obj-$(CONFIG_ARM_S3C2410_CPUFREQ)	+= s3c2410-cpufreq.o
 obj-$(CONFIG_ARM_S3C2412_CPUFREQ)	+= s3c2412-cpufreq.o
diff --git a/drivers/cpufreq/qcom-cpufreq-hw.c b/drivers/cpufreq/qcom-cpufreq-hw.c
new file mode 100644
index 0000000..a65d0ae
--- /dev/null
+++ b/drivers/cpufreq/qcom-cpufreq-hw.c
@@ -0,0 +1,305 @@ 
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (c) 2018, The Linux Foundation. All rights reserved.
+ */
+
+#include <linux/bitfield.h>
+#include <linux/cpufreq.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/of_address.h>
+#include <linux/of_platform.h>
+#include <linux/slab.h>
+
+#define LUT_MAX_ENTRIES			40U
+#define LUT_SRC				GENMASK(31, 30)
+#define LUT_L_VAL			GENMASK(7, 0)
+#define LUT_CORE_COUNT			GENMASK(18, 16)
+#define LUT_ROW_SIZE			32
+#define CLK_HW_DIV			2
+
+/* Register offsets */
+#define REG_ENABLE			0x0
+#define REG_LUT_TABLE			0x110
+#define REG_PERF_STATE			0x920
+
+static unsigned long cpu_hw_rate, xo_rate;
+static struct platform_device *global_pdev;
+
+static int qcom_cpufreq_hw_target_index(struct cpufreq_policy *policy,
+					unsigned int index)
+{
+	void __iomem *perf_state_reg = policy->driver_data;
+
+	writel_relaxed(index, perf_state_reg);
+
+	return 0;
+}
+
+static unsigned int qcom_cpufreq_hw_get(unsigned int cpu)
+{
+	void __iomem *perf_state_reg;
+	struct cpufreq_policy *policy;
+	unsigned int index;
+
+	policy = cpufreq_cpu_get_raw(cpu);
+	if (!policy)
+		return 0;
+
+	perf_state_reg = policy->driver_data;
+
+	index = readl_relaxed(perf_state_reg);
+	index = min(index, LUT_MAX_ENTRIES - 1);
+
+	return policy->freq_table[index].frequency;
+}
+
+static unsigned int qcom_cpufreq_hw_fast_switch(struct cpufreq_policy *policy,
+						unsigned int target_freq)
+{
+	void __iomem *perf_state_reg = policy->driver_data;
+	int index;
+
+	index = policy->cached_resolved_idx;
+	if (index < 0)
+		return 0;
+
+	writel_relaxed(index, perf_state_reg);
+
+	return policy->freq_table[index].frequency;
+}
+
+static int qcom_cpufreq_hw_read_lut(struct device *dev,
+				    struct cpufreq_policy *policy,
+				    void __iomem *base)
+{
+	u32 data, src, lval, i, core_count, prev_cc = 0, prev_freq = 0, freq;
+	unsigned int max_cores = cpumask_weight(policy->cpus);
+	struct cpufreq_frequency_table	*table;
+
+	table = kcalloc(LUT_MAX_ENTRIES + 1, sizeof(*table), GFP_KERNEL);
+	if (!table)
+		return -ENOMEM;
+
+	for (i = 0; i < LUT_MAX_ENTRIES; i++) {
+		data = readl_relaxed(base + REG_LUT_TABLE + i * LUT_ROW_SIZE);
+		src = FIELD_GET(LUT_SRC, data);
+		lval = FIELD_GET(LUT_L_VAL, data);
+		core_count = FIELD_GET(LUT_CORE_COUNT, data);
+
+		if (src)
+			freq = xo_rate * lval / 1000;
+		else
+			freq = cpu_hw_rate / 1000;
+
+		/* Ignore boosts in the middle of the table */
+		if (core_count != max_cores) {
+			table[i].frequency = CPUFREQ_ENTRY_INVALID;
+		} else {
+			table[i].frequency = freq;
+			dev_dbg(dev, "index=%d freq=%d, core_count %d\n", i,
+				freq, core_count);
+		}
+
+		/*
+		 * Two of the same frequencies with the same core counts means
+		 * end of table
+		 */
+		if (i > 0 && prev_freq == freq && prev_cc == core_count) {
+			struct cpufreq_frequency_table *prev = &table[i - 1];
+
+			/*
+			 * Only treat the last frequency that might be a boost
+			 * as the boost frequency
+			 */
+			if (prev_cc != max_cores) {
+				prev->frequency = prev_freq;
+				prev->flags = CPUFREQ_BOOST_FREQ;
+			}
+
+			break;
+		}
+
+		prev_cc = core_count;
+		prev_freq = freq;
+	}
+
+	table[i].frequency = CPUFREQ_TABLE_END;
+	policy->freq_table = table;
+
+	return 0;
+}
+
+static void qcom_get_related_cpus(int index, struct cpumask *m)
+{
+	struct device_node *cpu_np;
+	struct of_phandle_args args;
+	int cpu, ret;
+
+	for_each_possible_cpu(cpu) {
+		cpu_np = of_cpu_device_node_get(cpu);
+		if (!cpu_np)
+			continue;
+
+		ret = of_parse_phandle_with_args(cpu_np, "qcom,freq-domain",
+						 "#freq-domain-cells", 0,
+						 &args);
+		of_node_put(cpu_np);
+		if (ret < 0)
+			continue;
+
+		if (index == args.args[0])
+			cpumask_set_cpu(cpu, m);
+	}
+}
+
+static int qcom_cpufreq_hw_cpu_init(struct cpufreq_policy *policy)
+{
+	struct device *dev = &global_pdev->dev;
+	struct of_phandle_args args;
+	struct device_node *cpu_np;
+	struct resource *res;
+	void __iomem *base;
+	int ret, index;
+
+	cpu_np = of_cpu_device_node_get(policy->cpu);
+	if (!cpu_np)
+		return -EINVAL;
+
+	ret = of_parse_phandle_with_args(cpu_np, "qcom,freq-domain",
+					 "#freq-domain-cells", 0, &args);
+	of_node_put(cpu_np);
+	if (ret)
+		return ret;
+
+	index = args.args[0];
+
+	res = platform_get_resource(global_pdev, IORESOURCE_MEM, index);
+	base = devm_ioremap_resource(dev, res);
+	if (IS_ERR(base))
+		return PTR_ERR(base);
+
+	/* HW should be in enabled state to proceed */
+	if (!(readl_relaxed(base + REG_ENABLE) & 0x1)) {
+		dev_err(dev, "Domain-%d cpufreq hardware not enabled\n", index);
+		ret = -ENODEV;
+		goto error;
+	}
+
+	qcom_get_related_cpus(index, policy->cpus);
+	if (!cpumask_weight(policy->cpus)) {
+		dev_err(dev, "Domain-%d failed to get related CPUs\n", index);
+		ret = -ENOENT;
+		goto error;
+	}
+
+	policy->driver_data = base + REG_PERF_STATE;
+
+	ret = qcom_cpufreq_hw_read_lut(dev, policy, base);
+	if (ret) {
+		dev_err(dev, "Domain-%d failed to read LUT\n", index);
+		goto error;
+	}
+
+	policy->fast_switch_possible = true;
+
+	return 0;
+error:
+	devm_iounmap(dev, base);
+	return ret;
+}
+
+static int qcom_cpufreq_hw_cpu_exit(struct cpufreq_policy *policy)
+{
+	void __iomem *base = policy->driver_data - REG_PERF_STATE;
+
+	kfree(policy->freq_table);
+	devm_iounmap(&global_pdev->dev, base);
+
+	return 0;
+}
+
+static struct freq_attr *qcom_cpufreq_hw_attr[] = {
+	&cpufreq_freq_attr_scaling_available_freqs,
+	&cpufreq_freq_attr_scaling_boost_freqs,
+	NULL
+};
+
+static struct cpufreq_driver cpufreq_qcom_hw_driver = {
+	.flags		= CPUFREQ_STICKY | CPUFREQ_NEED_INITIAL_FREQ_CHECK |
+			  CPUFREQ_HAVE_GOVERNOR_PER_POLICY,
+	.verify		= cpufreq_generic_frequency_table_verify,
+	.target_index	= qcom_cpufreq_hw_target_index,
+	.get		= qcom_cpufreq_hw_get,
+	.init		= qcom_cpufreq_hw_cpu_init,
+	.exit		= qcom_cpufreq_hw_cpu_exit,
+	.fast_switch    = qcom_cpufreq_hw_fast_switch,
+	.name		= "qcom-cpufreq-hw",
+	.attr		= qcom_cpufreq_hw_attr,
+};
+
+static int qcom_cpufreq_hw_driver_probe(struct platform_device *pdev)
+{
+	struct clk *clk;
+	int ret;
+
+	clk = clk_get(&pdev->dev, "xo");
+	if (IS_ERR(clk))
+		return PTR_ERR(clk);
+
+	xo_rate = clk_get_rate(clk);
+	clk_put(clk);
+
+	clk = clk_get(&pdev->dev, "alternate");
+	if (IS_ERR(clk))
+		return PTR_ERR(clk);
+
+	cpu_hw_rate = clk_get_rate(clk) / CLK_HW_DIV;
+	clk_put(clk);
+
+	global_pdev = pdev;
+
+	ret = cpufreq_register_driver(&cpufreq_qcom_hw_driver);
+	if (ret)
+		dev_err(&pdev->dev, "CPUFreq HW driver failed to register\n");
+	else
+		dev_dbg(&pdev->dev, "QCOM CPUFreq HW driver initialized\n");
+
+	return ret;
+}
+
+static int qcom_cpufreq_hw_driver_remove(struct platform_device *pdev)
+{
+	return cpufreq_unregister_driver(&cpufreq_qcom_hw_driver);
+}
+
+static const struct of_device_id qcom_cpufreq_hw_match[] = {
+	{ .compatible = "qcom,cpufreq-hw" },
+	{}
+};
+MODULE_DEVICE_TABLE(of, qcom_cpufreq_hw_match);
+
+static struct platform_driver qcom_cpufreq_hw_driver = {
+	.probe = qcom_cpufreq_hw_driver_probe,
+	.remove = qcom_cpufreq_hw_driver_remove,
+	.driver = {
+		.name = "qcom-cpufreq-hw",
+		.of_match_table = qcom_cpufreq_hw_match,
+	},
+};
+
+static int __init qcom_cpufreq_hw_init(void)
+{
+	return platform_driver_register(&qcom_cpufreq_hw_driver);
+}
+subsys_initcall(qcom_cpufreq_hw_init);
+
+static void __exit qcom_cpufreq_hw_exit(void)
+{
+	platform_driver_unregister(&qcom_cpufreq_hw_driver);
+}
+module_exit(qcom_cpufreq_hw_exit);
+
+MODULE_DESCRIPTION("QCOM CPUFREQ HW Driver");
+MODULE_LICENSE("GPL v2");