Message ID | 1360041732-17936-2-git-send-email-nicolas.pitre@linaro.org (mailing list archive)
---|---
State | New, archived
On Tue, Feb 05, 2013 at 12:21:58AM -0500, Nicolas Pitre wrote:
> +#include <asm/mcpm_entry.h>
> +#include <asm/barrier.h>
> +#include <asm/proc-fns.h>
> +#include <asm/cacheflush.h>
> +
> +extern volatile unsigned long mcpm_entry_vectors[MAX_NR_CLUSTERS][MAX_CPUS_PER_CLUSTER];

This should not be volatile. You should know by now the stance in the
Linux community against using volatile on data declarations. See
Documentation/volatile-considered-harmful.txt to remind yourself of
the reasoning.

> +
> +void mcpm_set_entry_vector(unsigned cpu, unsigned cluster, void *ptr)
> +{
> +	unsigned long val = ptr ? virt_to_phys(ptr) : 0;
> +	mcpm_entry_vectors[cluster][cpu] = val;
> +	__cpuc_flush_dcache_area((void *)&mcpm_entry_vectors[cluster][cpu], 4);
> +	outer_clean_range(__pa(&mcpm_entry_vectors[cluster][cpu]),
> +			  __pa(&mcpm_entry_vectors[cluster][cpu + 1]));

And really, if the write hasn't been done by the compiler prior to calling
__cpuc_flush_dcache_area() then we're into really bad problems.
On Tue, 23 Apr 2013, Russell King - ARM Linux wrote:

> On Tue, Feb 05, 2013 at 12:21:58AM -0500, Nicolas Pitre wrote:
> > +#include <asm/mcpm_entry.h>
> > +#include <asm/barrier.h>
> > +#include <asm/proc-fns.h>
> > +#include <asm/cacheflush.h>
> > +
> > +extern volatile unsigned long mcpm_entry_vectors[MAX_NR_CLUSTERS][MAX_CPUS_PER_CLUSTER];
>
> This should not be volatile. You should know by now the stance in the
> Linux community against using volatile on data declarations. See
> Documentation/volatile-considered-harmful.txt to remind yourself of
> the reasoning.

That document says:

|The key point to understand with regard to volatile is that its purpose
|is to suppress optimization, which is almost never what one really
|wants to do.

Turns out that this is exactly what we want here: suppress optimization.

However ...

> > +
> > +void mcpm_set_entry_vector(unsigned cpu, unsigned cluster, void *ptr)
> > +{
> > +	unsigned long val = ptr ? virt_to_phys(ptr) : 0;
> > +	mcpm_entry_vectors[cluster][cpu] = val;
> > +	__cpuc_flush_dcache_area((void *)&mcpm_entry_vectors[cluster][cpu], 4);
> > +	outer_clean_range(__pa(&mcpm_entry_vectors[cluster][cpu]),
> > +			  __pa(&mcpm_entry_vectors[cluster][cpu + 1]));
>
> And really, if the write hasn't been done by the compiler prior to calling
> __cpuc_flush_dcache_area() then we're into really bad problems.

That is indeed true. The memory might have been uncacheable at some
point and then the volatile was necessary in that case. I removed it
in my tree.

Nicolas
On Tue, Apr 23, 2013 at 03:34:08PM -0400, Nicolas Pitre wrote:
> On Tue, 23 Apr 2013, Russell King - ARM Linux wrote:
>
> > On Tue, Feb 05, 2013 at 12:21:58AM -0500, Nicolas Pitre wrote:
> > > +#include <asm/mcpm_entry.h>
> > > +#include <asm/barrier.h>
> > > +#include <asm/proc-fns.h>
> > > +#include <asm/cacheflush.h>
> > > +
> > > +extern volatile unsigned long mcpm_entry_vectors[MAX_NR_CLUSTERS][MAX_CPUS_PER_CLUSTER];
> >
> > This should not be volatile. You should know by now the stance in the
> > Linux community against using volatile on data declarations. See
> > Documentation/volatile-considered-harmful.txt to remind yourself of
> > the reasoning.
>
> That document says:
>
> |The key point to understand with regard to volatile is that its purpose
> |is to suppress optimization, which is almost never what one really
> |wants to do.
>
> Turns out that this is exactly what we want here: suppress optimization.

What optimization are you trying to suppress in the function below?

> However ...
>
> > > +
> > > +void mcpm_set_entry_vector(unsigned cpu, unsigned cluster, void *ptr)
> > > +{
> > > +	unsigned long val = ptr ? virt_to_phys(ptr) : 0;
> > > +	mcpm_entry_vectors[cluster][cpu] = val;
> > > +	__cpuc_flush_dcache_area((void *)&mcpm_entry_vectors[cluster][cpu], 4);
> > > +	outer_clean_range(__pa(&mcpm_entry_vectors[cluster][cpu]),
> > > +			  __pa(&mcpm_entry_vectors[cluster][cpu + 1]));
> >
> > And really, if the write hasn't been done by the compiler prior to calling
> > __cpuc_flush_dcache_area() then we're into really bad problems.
>
> That is indeed true. The memory might have been uncacheable at some
> point and then the volatile was necessary in that case.

I don't buy the argument about it being uncachable - the compiler doesn't
know that, and the cacheability of the memory really doesn't change the
code that the compiler produces.

Moreover, the compiler can't really reorder the store to
mcpm_entry_vectors[cluster][cpu] with the calls to the cache flushing
anyway - the compiler has no clue what those calls are doing, so it has
to ensure that the results of the preceding code are visible to that
"unknown code" which it can't see. Therefore, it practically has no
option but to issue the store before calling those cache flush functions.

Also... consider using sizeof() rather than constant 4.
On Tue, 23 Apr 2013, Russell King - ARM Linux wrote:

> On Tue, Apr 23, 2013 at 03:34:08PM -0400, Nicolas Pitre wrote:
> > On Tue, 23 Apr 2013, Russell King - ARM Linux wrote:
> >
> > > On Tue, Feb 05, 2013 at 12:21:58AM -0500, Nicolas Pitre wrote:
> > > > +#include <asm/mcpm_entry.h>
> > > > +#include <asm/barrier.h>
> > > > +#include <asm/proc-fns.h>
> > > > +#include <asm/cacheflush.h>
> > > > +
> > > > +extern volatile unsigned long mcpm_entry_vectors[MAX_NR_CLUSTERS][MAX_CPUS_PER_CLUSTER];
> > >
> > > This should not be volatile. You should know by now the stance in the
> > > Linux community against using volatile on data declarations. See
> > > Documentation/volatile-considered-harmful.txt to remind yourself of
> > > the reasoning.
> >
> > That document says:
> >
> > |The key point to understand with regard to volatile is that its purpose
> > |is to suppress optimization, which is almost never what one really
> > |wants to do.
> >
> > Turns out that this is exactly what we want here: suppress optimization.
>
> What optimization are you trying to suppress in the function below?

Ordering of writes to this memory wrt other code in case the above
function was inlined. But as I said this is a moot point now.

> > However ...
> >
> > > > +
> > > > +void mcpm_set_entry_vector(unsigned cpu, unsigned cluster, void *ptr)
> > > > +{
> > > > +	unsigned long val = ptr ? virt_to_phys(ptr) : 0;
> > > > +	mcpm_entry_vectors[cluster][cpu] = val;
> > > > +	__cpuc_flush_dcache_area((void *)&mcpm_entry_vectors[cluster][cpu], 4);
> > > > +	outer_clean_range(__pa(&mcpm_entry_vectors[cluster][cpu]),
> > > > +			  __pa(&mcpm_entry_vectors[cluster][cpu + 1]));
> > >
> > > And really, if the write hasn't been done by the compiler prior to calling
> > > __cpuc_flush_dcache_area() then we're into really bad problems.
> >
> > That is indeed true. The memory might have been uncacheable at some
> > point and then the volatile was necessary in that case.
>
> I don't buy the argument about it being uncachable - the compiler doesn't
> know that, and the cacheability of the memory really doesn't change the
> code that the compiler produces.

My point is that the memory is cacheable these days, which requires
explicit cache maintenance for the writes to hit main memory. That cache
maintenance call now acts as the barrier that the volatile used to
provide before the call was there.

> Also... consider using sizeof() rather than constant 4.

Sure.

Nicolas
diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 67874b82a4..200f559c1c 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -1584,6 +1584,14 @@ config HAVE_ARM_TWD
 	help
 	  This options enables support for the ARM timer and watchdog unit
 
+config CLUSTER_PM
+	bool "Cluster Power Management Infrastructure"
+	depends on CPU_V7 && SMP
+	help
+	  This option provides the common power management infrastructure
+	  for (multi-)cluster based systems, such as big.LITTLE based
+	  systems.
+
 choice
 	prompt "Memory split"
 	default VMSPLIT_3G
diff --git a/arch/arm/common/Makefile b/arch/arm/common/Makefile
index e8a4e58f1b..23e85b1fae 100644
--- a/arch/arm/common/Makefile
+++ b/arch/arm/common/Makefile
@@ -13,3 +13,4 @@ obj-$(CONFIG_SHARP_PARAM) += sharpsl_param.o
 obj-$(CONFIG_SHARP_SCOOP)	+= scoop.o
 obj-$(CONFIG_PCI_HOST_ITE8152)	+= it8152.o
 obj-$(CONFIG_ARM_TIMER_SP804)	+= timer-sp.o
+obj-$(CONFIG_CLUSTER_PM)	+= mcpm_head.o mcpm_entry.o
diff --git a/arch/arm/common/mcpm_entry.c b/arch/arm/common/mcpm_entry.c
new file mode 100644
index 0000000000..3a6d7e70fd
--- /dev/null
+++ b/arch/arm/common/mcpm_entry.c
@@ -0,0 +1,29 @@
+/*
+ * arch/arm/common/mcpm_entry.c -- entry point for multi-cluster PM
+ *
+ * Created by: Nicolas Pitre, March 2012
+ * Copyright:  (C) 2012-2013  Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/kernel.h>
+#include <linux/init.h>
+
+#include <asm/mcpm_entry.h>
+#include <asm/barrier.h>
+#include <asm/proc-fns.h>
+#include <asm/cacheflush.h>
+
+extern volatile unsigned long mcpm_entry_vectors[MAX_NR_CLUSTERS][MAX_CPUS_PER_CLUSTER];
+
+void mcpm_set_entry_vector(unsigned cpu, unsigned cluster, void *ptr)
+{
+	unsigned long val = ptr ? virt_to_phys(ptr) : 0;
+	mcpm_entry_vectors[cluster][cpu] = val;
+	__cpuc_flush_dcache_area((void *)&mcpm_entry_vectors[cluster][cpu], 4);
+	outer_clean_range(__pa(&mcpm_entry_vectors[cluster][cpu]),
+			  __pa(&mcpm_entry_vectors[cluster][cpu + 1]));
+}
diff --git a/arch/arm/common/mcpm_head.S b/arch/arm/common/mcpm_head.S
new file mode 100644
index 0000000000..cda65f200b
--- /dev/null
+++ b/arch/arm/common/mcpm_head.S
@@ -0,0 +1,86 @@
+/*
+ * arch/arm/common/mcpm_head.S -- kernel entry point for multi-cluster PM
+ *
+ * Created by: Nicolas Pitre, March 2012
+ * Copyright:  (C) 2012-2013  Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#include <linux/linkage.h>
+#include <asm/mcpm_entry.h>
+
+	.macro	pr_dbg	string
+#if defined(CONFIG_DEBUG_LL) && defined(DEBUG)
+	b	1901f
+1902:	.asciz	"CPU"
+1903:	.asciz	" cluster"
+1904:	.asciz	": \string"
+	.align
+1901:	adr	r0, 1902b
+	bl	printascii
+	mov	r0, r9
+	bl	printhex8
+	adr	r0, 1903b
+	bl	printascii
+	mov	r0, r10
+	bl	printhex8
+	adr	r0, 1904b
+	bl	printascii
+#endif
+	.endm
+
+	.arm
+	.align
+
+ENTRY(mcpm_entry_point)
+
+ THUMB(	adr	r12, BSYM(1f)	)
+ THUMB(	bx	r12	)
+ THUMB(	.thumb	)
+1:
+	mrc	p15, 0, r0, c0, c0, 5		@ MPIDR
+	ubfx	r9, r0, #0, #8			@ r9 = cpu
+	ubfx	r10, r0, #8, #8			@ r10 = cluster
+	mov	r3, #MAX_CPUS_PER_CLUSTER
+	mla	r4, r3, r10, r9			@ r4 = canonical CPU index
+	cmp	r4, #(MAX_CPUS_PER_CLUSTER * MAX_NR_CLUSTERS)
+	blo	2f
+
+	/* We didn't expect this CPU.  Try to cheaply make it quiet. */
+1:	wfi
+	wfe
+	b	1b
+
+2:	pr_dbg	"kernel mcpm_entry_point\n"
+
+	/*
+	 * MMU is off so we need to get to mcpm_entry_vectors in a
+	 * position independent way.
+	 */
+	adr	r5, 3f
+	ldr	r6, [r5]
+	add	r6, r5, r6			@ r6 = mcpm_entry_vectors
+
+mcpm_entry_gated:
+	ldr	r5, [r6, r4, lsl #2]		@ r5 = CPU entry vector
+	cmp	r5, #0
+	wfeeq
+	beq	mcpm_entry_gated
+	pr_dbg	"released\n"
+	bx	r5
+
+	.align	2
+
+3:	.word	mcpm_entry_vectors - .
+
+ENDPROC(mcpm_entry_point)
+
+	.bss
+	.align	5
+
+	.type	mcpm_entry_vectors, #object
+ENTRY(mcpm_entry_vectors)
+	.space	4 * MAX_NR_CLUSTERS * MAX_CPUS_PER_CLUSTER
diff --git a/arch/arm/include/asm/mcpm_entry.h b/arch/arm/include/asm/mcpm_entry.h
new file mode 100644
index 0000000000..cc10ebbd2e
--- /dev/null
+++ b/arch/arm/include/asm/mcpm_entry.h
@@ -0,0 +1,35 @@
+/*
+ * arch/arm/include/asm/mcpm_entry.h
+ *
+ * Created by: Nicolas Pitre, April 2012
+ * Copyright:  (C) 2012-2013  Linaro Limited
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License version 2 as
+ * published by the Free Software Foundation.
+ */
+
+#ifndef MCPM_ENTRY_H
+#define MCPM_ENTRY_H
+
+#define MAX_CPUS_PER_CLUSTER	4
+#define MAX_NR_CLUSTERS		2
+
+#ifndef __ASSEMBLY__
+
+/*
+ * Platform specific code should use this symbol to set up secondary
+ * entry location for processors to use when released from reset.
+ */
+extern void mcpm_entry_point(void);
+
+/*
+ * This is used to indicate where the given CPU from given cluster should
+ * branch once it is ready to re-enter the kernel using ptr, or NULL if it
+ * should be gated.  A gated CPU is held in a WFE loop until its vector
+ * becomes non NULL.
+ */
+void mcpm_set_entry_vector(unsigned cpu, unsigned cluster, void *ptr);
+
+#endif /* ! __ASSEMBLY__ */
+#endif