Message ID | 20231009171839.12267-3-sumitg@nvidia.com (mailing list archive) |
---|---|
State | Superseded, archived |
Series | Add support for _TFP and change throttle pctg |
Hi Sumit,
kernel test robot noticed the following build errors:
[auto build test ERROR on rafael-pm/linux-next]
[also build test ERROR on next-20231009]
[cannot apply to linus/master v6.6-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Sumit-Gupta/ACPI-thermal-Add-Thermal-fast-Sampling-Period-_TFP-support/20231010-012229
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20231009171839.12267-3-sumitg%40nvidia.com
patch subject: [Patch v4 2/2] ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241
config: i386-tinyconfig (https://download.01.org/0day-ci/archive/20231010/202310100219.lpVzbckv-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231010/202310100219.lpVzbckv-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310100219.lpVzbckv-lkp@intel.com/
All errors (new ones prefixed by >>):
ld: arch/x86/kernel/setup.o: in function `acpi_thermal_cpufreq_pctg':
>> setup.c:(.text+0x3): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
ld: arch/x86/kernel/x86_init.o: in function `acpi_thermal_cpufreq_pctg':
x86_init.c:(.text+0x44): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
ld: arch/x86/kernel/i8259.o: in function `acpi_thermal_cpufreq_pctg':
i8259.c:(.text+0x2dd): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
ld: arch/x86/kernel/irqinit.o: in function `acpi_thermal_cpufreq_pctg':
irqinit.c:(.text+0x0): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
ld: arch/x86/kernel/bootflag.o: in function `acpi_thermal_cpufreq_pctg':
bootflag.c:(.text+0x0): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
ld: arch/x86/kernel/e820.o: in function `acpi_thermal_cpufreq_pctg':
e820.c:(.text+0x144): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
ld: arch/x86/kernel/pci-dma.o: in function `acpi_thermal_cpufreq_pctg':
pci-dma.c:(.text+0x0): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
ld: arch/x86/kernel/process.o: in function `acpi_thermal_cpufreq_pctg':
process.c:(.text+0xe5): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
ld: kernel/sysctl.o: in function `acpi_thermal_cpufreq_pctg':
sysctl.c:(.text+0x48): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
ld: kernel/dma/mapping.o: in function `acpi_thermal_cpufreq_pctg':
mapping.c:(.text+0x5ba): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
ld: drivers/base/core.o: in function `acpi_thermal_cpufreq_pctg':
core.c:(.text+0x13e9): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
ld: drivers/base/platform.o: in function `acpi_thermal_cpufreq_pctg':
platform.c:(.text+0x8b8): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
ld: drivers/base/cpu.o: in function `acpi_thermal_cpufreq_pctg':
cpu.c:(.text+0x128): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
ld: drivers/base/property.o: in function `acpi_thermal_cpufreq_pctg':
property.c:(.text+0xa87): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
ld: drivers/base/cacheinfo.o: in function `acpi_thermal_cpufreq_pctg':
cacheinfo.c:(.text+0x231): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
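
The errors above are what a linker reports when a function with external linkage is defined in a header that many source files include. A likely explanation, assuming the kernel's usual GNU inline semantics (where a plain `inline` definition still emits an externally visible copy in each object file), is that the `#else` stub in `include/linux/acpi.h` lacks `static`. The userspace sketch below shows the conventional `static inline` stub pattern. `effective_pctg()` is a hypothetical caller added only for illustration, and the Kconfig symbol is left undefined to select the stub branch, as on the i386 configs in this report.

```c
#include <assert.h>

/*
 * Assumption: CONFIG_HAVE_ARM_SMCCC_DISCOVERY is not defined here, which
 * mimics a non-arm64 build such as i386-tinyconfig.
 */
#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
/* Declaration only: the single real definition lives in one .c file. */
int acpi_thermal_cpufreq_pctg(void);
#else
/*
 * 'static' gives the stub internal linkage, so every file that includes
 * the header gets a private copy and no global symbol is exported.
 */
static inline int acpi_thermal_cpufreq_pctg(void)
{
	return 0;
}
#endif

/* Hypothetical caller: fall back to a default when the hook returns 0. */
static int effective_pctg(int default_pctg)
{
	int pctg = acpi_thermal_cpufreq_pctg();

	assert(default_pctg > 0);
	return pctg ? pctg : default_pctg;
}
```

With the stub selected, `effective_pctg(20)` returns the default of 20, which matches the patch's behaviour of leaving the 20% step untouched when the platform hook returns 0.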
> Hi Sumit,
>
> kernel test robot noticed the following build errors:

Thank you for the report.

>
> [auto build test ERROR on rafael-pm/linux-next]
> [also build test ERROR on next-20231009]
> [cannot apply to linus/master v6.6-rc5]
[...]
> All errors (new ones prefixed by >>):
>
> ld: arch/x86/kernel/setup.o: in function `acpi_thermal_cpufreq_pctg':
>>> setup.c:(.text+0x3): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
> ld: arch/x86/kernel/x86_init.o: in function `acpi_thermal_cpufreq_pctg':
[...]
> cpu.c:(.text+0x128): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
> ld: drivers/base/property.o: in function `acpi_thermal_cpufreq_pctg':
> property.c:(.text+0xa87): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here
> ld: drivers/base/cacheinfo.o: in function `acpi_thermal_cpufreq_pctg':
> cacheinfo.c:(.text+0x231): multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:main.c:(.text+0x32): first defined here

Below change fixes the issue for me.

+++ b/include/linux/acpi.h
@@ -1542,7 +1542,7 @@ static inline void acpi_device_notify_remove(struct device *dev) { }
 #ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
 int acpi_thermal_cpufreq_pctg(void);
 #else
-inline int acpi_thermal_cpufreq_pctg(void)
+static inline int acpi_thermal_cpufreq_pctg(void)
 {
 	return 0;
 }

Best Regards,
Sumit Gupta
On 09/10/23 22:48, Sumit Gupta wrote:
> From: Srikar Srimath Tirumala <srikars@nvidia.com>
>
> Current implementation of processor_thermal performs software throttling
> in fixed steps of "20%" which can be too coarse for some platforms.
> We observed some performance gain after reducing the throttle percentage.
> Change the CPUFREQ thermal reduction percentage and maximum thermal steps
> to be configurable. Also, update the default values of both for Nvidia
> Tegra241 (Grace) SoC. The thermal reduction percentage is reduced to "5%"
> and accordingly the maximum number of thermal steps are increased as they
> are derived from the reduction percentage.
>
> Signed-off-by: Srikar Srimath Tirumala <srikars@nvidia.com>
> Co-developed-by: Sumit Gupta <sumitg@nvidia.com>
> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
> ---
>  drivers/acpi/arm64/Makefile          |  1 +
>  drivers/acpi/arm64/thermal_cpufreq.c | 20 ++++++++++++++++
>  drivers/acpi/processor_thermal.c     | 35 +++++++++++++++++++++++++---
>  include/linux/acpi.h                 |  9 +++++++
>  4 files changed, 62 insertions(+), 3 deletions(-)
>  create mode 100644 drivers/acpi/arm64/thermal_cpufreq.c
>
> diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
> index 143debc1ba4a..3f181d8156cc 100644
> --- a/drivers/acpi/arm64/Makefile
> +++ b/drivers/acpi/arm64/Makefile
> @@ -5,3 +5,4 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o
>  obj-$(CONFIG_ACPI_APMT) += apmt.o
>  obj-$(CONFIG_ARM_AMBA) += amba.o
>  obj-y += dma.o init.o
> +obj-$(CONFIG_ACPI) += thermal_cpufreq.o
> diff --git a/drivers/acpi/arm64/thermal_cpufreq.c b/drivers/acpi/arm64/thermal_cpufreq.c
> new file mode 100644
> index 000000000000..de834fb013e7
> --- /dev/null
> +++ b/drivers/acpi/arm64/thermal_cpufreq.c
> @@ -0,0 +1,20 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +#include <linux/acpi.h>
> +
> +#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
> +#define SMCCC_SOC_ID_T241 0x036b0241
> +
> +int acpi_thermal_cpufreq_pctg(void)
> +{
> +	s32 soc_id = arm_smccc_get_soc_id_version();
> +
> +	/*
> +	 * Check JEP106 code for NVIDIA Tegra241 chip (036b:0241) and
> +	 * reduce the CPUFREQ Thermal reduction percentage to 5%.
> +	 */
> +	if (soc_id == SMCCC_SOC_ID_T241)
> +		return 5;
> +
> +	return 0;
> +}
> +#endif
> diff --git a/drivers/acpi/processor_thermal.c b/drivers/acpi/processor_thermal.c
> index b7c6287eccca..52f316e4e260 100644
> --- a/drivers/acpi/processor_thermal.c
> +++ b/drivers/acpi/processor_thermal.c
> @@ -26,7 +26,16 @@
>   */
>
>  #define CPUFREQ_THERMAL_MIN_STEP 0
> -#define CPUFREQ_THERMAL_MAX_STEP 3
> +
> +static int cpufreq_thermal_max_step __read_mostly = 3;
> +
> +/*
> + * Minimum throttle percentage for processor_thermal cooling device.
> + * The processor_thermal driver uses it to calculate the percentage amount by
> + * which cpu frequency must be reduced for each cooling state. This is also used
> + * to calculate the maximum number of throttling steps or cooling states.
> + */
> +static int cpufreq_thermal_pctg __read_mostly = 20;
>
>  static DEFINE_PER_CPU(unsigned int, cpufreq_thermal_reduction_pctg);
>
> @@ -71,7 +80,7 @@ static int cpufreq_get_max_state(unsigned int cpu)
>  	if (!cpu_has_cpufreq(cpu))
>  		return 0;
>
> -	return CPUFREQ_THERMAL_MAX_STEP;
> +	return cpufreq_thermal_max_step;
>  }
>
>  static int cpufreq_get_cur_state(unsigned int cpu)
> @@ -113,7 +122,8 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
>  		if (!policy)
>  			return -EINVAL;
>
> -		max_freq = (policy->cpuinfo.max_freq * (100 - reduction_pctg(i) * 20)) / 100;
> +		max_freq = (policy->cpuinfo.max_freq *
> +			    (100 - reduction_pctg(i) * cpufreq_thermal_pctg)) / 100;
>
>  		cpufreq_cpu_put(policy);
>
> @@ -126,10 +136,29 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
>  	return 0;
>  }
>
> +static void acpi_thermal_cpufreq_config(void)
> +{
> +	int cpufreq_pctg = acpi_thermal_cpufreq_pctg();
> +
> +	if (!cpufreq_pctg)
> +		return;
> +
> +	cpufreq_thermal_pctg = cpufreq_pctg;
> +
> +	/*
> +	 * Derive the MAX_STEP from minimum throttle percentage so that the reduction
> +	 * percentage doesn't end up becoming negative. Also, cap the MAX_STEP so that
> +	 * the CPU performance doesn't become 0.
> +	 */
> +	cpufreq_thermal_max_step = (100 / cpufreq_thermal_pctg) - 1;
> +}
> +
>  void acpi_thermal_cpufreq_init(struct cpufreq_policy *policy)
>  {
>  	unsigned int cpu;
>
> +	acpi_thermal_cpufreq_config();
> +
>  	for_each_cpu(cpu, policy->related_cpus) {
>  		struct acpi_processor *pr = per_cpu(processors, cpu);
>  		int ret;
> diff --git a/include/linux/acpi.h b/include/linux/acpi.h
> index ba3f601b6e3d..052c90d7479a 100644
> --- a/include/linux/acpi.h
> +++ b/include/linux/acpi.h
> @@ -1541,4 +1541,13 @@ static inline void acpi_device_notify(struct device *dev) { }
>  static inline void acpi_device_notify_remove(struct device *dev) { }
>  #endif
>
> +#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
> +int acpi_thermal_cpufreq_pctg(void);
> +#else
> +inline int acpi_thermal_cpufreq_pctg(void)
> +{
> +	return 0;
> +}
> +#endif
> +
>  #endif /*_LINUX_ACPI_H*/

Hi Rafael,

The below change fixes the issue reported by the kernel test robot.
If there are no further comments, then I can send a v5 with this change?

+++ b/include/linux/acpi.h
@@ -1542,7 +1542,7 @@ static inline void acpi_device_notify_remove(struct device *dev) { }
 #ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
 int acpi_thermal_cpufreq_pctg(void);
 #else
-inline int acpi_thermal_cpufreq_pctg(void)
+static inline int acpi_thermal_cpufreq_pctg(void)
 {
 	return 0;
 }

Best Regards,
Sumit Gupta
Hi Sumit,
kernel test robot noticed the following build errors:
[auto build test ERROR on rafael-pm/linux-next]
[also build test ERROR on next-20231010]
[cannot apply to linus/master v6.6-rc5]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Sumit-Gupta/ACPI-thermal-Add-Thermal-fast-Sampling-Period-_TFP-support/20231010-012229
base: https://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git linux-next
patch link: https://lore.kernel.org/r/20231009171839.12267-3-sumitg%40nvidia.com
patch subject: [Patch v4 2/2] ACPI: processor: reduce CPUFREQ thermal reduction pctg for Tegra241
config: um-i386_defconfig (https://download.01.org/0day-ci/archive/20231010/202310101725.z21V2yNG-lkp@intel.com/config)
compiler: gcc-7 (Ubuntu 7.5.0-6ubuntu2) 7.5.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20231010/202310101725.z21V2yNG-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202310101725.z21V2yNG-lkp@intel.com/
All errors (new ones prefixed by >>):
ld: kernel/sysctl.o: in function `acpi_thermal_cpufreq_pctg':
>> include/linux/acpi.h:1546: multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:include/linux/acpi.h:1546: first defined here
ld: kernel/power/main.o: in function `acpi_thermal_cpufreq_pctg':
>> include/linux/acpi.h:1546: multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:include/linux/acpi.h:1546: first defined here
ld: drivers/base/core.o: in function `acpi_thermal_cpufreq_pctg':
>> include/linux/acpi.h:1546: multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:include/linux/acpi.h:1546: first defined here
ld: drivers/base/platform.o: in function `acpi_thermal_cpufreq_pctg':
>> include/linux/acpi.h:1546: multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:include/linux/acpi.h:1546: first defined here
ld: drivers/base/cpu.o: in function `acpi_thermal_cpufreq_pctg':
>> include/linux/acpi.h:1546: multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:include/linux/acpi.h:1546: first defined here
ld: drivers/base/property.o: in function `acpi_thermal_cpufreq_pctg':
>> include/linux/acpi.h:1546: multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:include/linux/acpi.h:1546: first defined here
ld: drivers/base/cacheinfo.o: in function `acpi_thermal_cpufreq_pctg':
>> include/linux/acpi.h:1546: multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:include/linux/acpi.h:1546: first defined here
ld: drivers/base/power/common.o: in function `acpi_thermal_cpufreq_pctg':
>> include/linux/acpi.h:1546: multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:include/linux/acpi.h:1546: first defined here
ld: drivers/input/mouse/synaptics.o: in function `acpi_thermal_cpufreq_pctg':
>> include/linux/acpi.h:1546: multiple definition of `acpi_thermal_cpufreq_pctg'; init/main.o:include/linux/acpi.h:1546: first defined here

vim +1546 include/linux/acpi.h

  1541
  1542	#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
  1543	int acpi_thermal_cpufreq_pctg(void);
  1544	#else
  1545	inline int acpi_thermal_cpufreq_pctg(void)
> 1546	{
  1547		return 0;
  1548	}
  1549	#endif
  1550
diff --git a/drivers/acpi/arm64/Makefile b/drivers/acpi/arm64/Makefile
index 143debc1ba4a..3f181d8156cc 100644
--- a/drivers/acpi/arm64/Makefile
+++ b/drivers/acpi/arm64/Makefile
@@ -5,3 +5,4 @@ obj-$(CONFIG_ACPI_GTDT) += gtdt.o
 obj-$(CONFIG_ACPI_APMT) += apmt.o
 obj-$(CONFIG_ARM_AMBA) += amba.o
 obj-y += dma.o init.o
+obj-$(CONFIG_ACPI) += thermal_cpufreq.o
diff --git a/drivers/acpi/arm64/thermal_cpufreq.c b/drivers/acpi/arm64/thermal_cpufreq.c
new file mode 100644
index 000000000000..de834fb013e7
--- /dev/null
+++ b/drivers/acpi/arm64/thermal_cpufreq.c
@@ -0,0 +1,20 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/acpi.h>
+
+#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
+#define SMCCC_SOC_ID_T241 0x036b0241
+
+int acpi_thermal_cpufreq_pctg(void)
+{
+	s32 soc_id = arm_smccc_get_soc_id_version();
+
+	/*
+	 * Check JEP106 code for NVIDIA Tegra241 chip (036b:0241) and
+	 * reduce the CPUFREQ Thermal reduction percentage to 5%.
+	 */
+	if (soc_id == SMCCC_SOC_ID_T241)
+		return 5;
+
+	return 0;
+}
+#endif
diff --git a/drivers/acpi/processor_thermal.c b/drivers/acpi/processor_thermal.c
index b7c6287eccca..52f316e4e260 100644
--- a/drivers/acpi/processor_thermal.c
+++ b/drivers/acpi/processor_thermal.c
@@ -26,7 +26,16 @@
  */
 
 #define CPUFREQ_THERMAL_MIN_STEP 0
-#define CPUFREQ_THERMAL_MAX_STEP 3
+
+static int cpufreq_thermal_max_step __read_mostly = 3;
+
+/*
+ * Minimum throttle percentage for processor_thermal cooling device.
+ * The processor_thermal driver uses it to calculate the percentage amount by
+ * which cpu frequency must be reduced for each cooling state. This is also used
+ * to calculate the maximum number of throttling steps or cooling states.
+ */
+static int cpufreq_thermal_pctg __read_mostly = 20;
 
 static DEFINE_PER_CPU(unsigned int, cpufreq_thermal_reduction_pctg);
 
@@ -71,7 +80,7 @@ static int cpufreq_get_max_state(unsigned int cpu)
 	if (!cpu_has_cpufreq(cpu))
 		return 0;
 
-	return CPUFREQ_THERMAL_MAX_STEP;
+	return cpufreq_thermal_max_step;
 }
 
 static int cpufreq_get_cur_state(unsigned int cpu)
@@ -113,7 +122,8 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
 		if (!policy)
 			return -EINVAL;
 
-		max_freq = (policy->cpuinfo.max_freq * (100 - reduction_pctg(i) * 20)) / 100;
+		max_freq = (policy->cpuinfo.max_freq *
+			    (100 - reduction_pctg(i) * cpufreq_thermal_pctg)) / 100;
 
 		cpufreq_cpu_put(policy);
 
@@ -126,10 +136,29 @@ static int cpufreq_set_cur_state(unsigned int cpu, int state)
 	return 0;
 }
 
+static void acpi_thermal_cpufreq_config(void)
+{
+	int cpufreq_pctg = acpi_thermal_cpufreq_pctg();
+
+	if (!cpufreq_pctg)
+		return;
+
+	cpufreq_thermal_pctg = cpufreq_pctg;
+
+	/*
+	 * Derive the MAX_STEP from minimum throttle percentage so that the reduction
+	 * percentage doesn't end up becoming negative. Also, cap the MAX_STEP so that
+	 * the CPU performance doesn't become 0.
+	 */
+	cpufreq_thermal_max_step = (100 / cpufreq_thermal_pctg) - 1;
+}
+
 void acpi_thermal_cpufreq_init(struct cpufreq_policy *policy)
 {
 	unsigned int cpu;
 
+	acpi_thermal_cpufreq_config();
+
 	for_each_cpu(cpu, policy->related_cpus) {
 		struct acpi_processor *pr = per_cpu(processors, cpu);
 		int ret;
diff --git a/include/linux/acpi.h b/include/linux/acpi.h
index ba3f601b6e3d..052c90d7479a 100644
--- a/include/linux/acpi.h
+++ b/include/linux/acpi.h
@@ -1541,4 +1541,13 @@ static inline void acpi_device_notify(struct device *dev) { }
 static inline void acpi_device_notify_remove(struct device *dev) { }
 #endif
 
+#ifdef CONFIG_HAVE_ARM_SMCCC_DISCOVERY
+int acpi_thermal_cpufreq_pctg(void);
+#else
+inline int acpi_thermal_cpufreq_pctg(void)
+{
+	return 0;
+}
+#endif
+
 #endif /*_LINUX_ACPI_H*/
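
To make the step arithmetic in the patch concrete, here is a small userspace sketch of the two expressions it introduces. The helper names `max_step_for_pctg()` and `throttled_max_freq()` are hypothetical, and `state` stands in for the per-CPU `reduction_pctg(i)` value; the formulas themselves are copied from `acpi_thermal_cpufreq_config()` and `cpufreq_set_cur_state()` in the diff.

```c
#include <assert.h>

/* Same derivation as the patch: (100 / pctg) - 1, so 100% is never reached. */
static int max_step_for_pctg(int pctg)
{
	assert(pctg > 0 && pctg <= 100);
	return (100 / pctg) - 1;
}

/*
 * Same expression as the patched max_freq computation, with 'state' in place
 * of reduction_pctg(i). Frequencies are in kHz, as in cpufreq.
 */
static unsigned int throttled_max_freq(unsigned int cpuinfo_max_freq,
				       int state, int pctg)
{
	assert(state >= 0 && state * pctg <= 100);
	return (cpuinfo_max_freq * (unsigned int)(100 - state * pctg)) / 100;
}
```

For the Tegra241 value of 5%, the derivation gives 19 steps, and at step 19 a 3,000,000 kHz CPU would be capped at 150,000 kHz (5% of maximum). Note that the formula would give 4 steps for 20%, but the patch only applies it when `acpi_thermal_cpufreq_pctg()` returns non-zero, so the default of 3 steps is unchanged on other platforms.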