Message ID | 20110706140832.GA15946@oksana.dev.rtsoft.ru (mailing list archive) |
---|---|
State | New, archived |
On Wednesday 06 July 2011, Anton Vorontsov wrote:
> CNS3xxx SOCs have L310-compatible cache controller, so let's use it.
>
> With this patch benchmarking with 'gzip' shows that performance is
> doubled, and I'm still able to boot full-fledged userland over NFS
> (using PCIe NIC), so the support should be pretty robust.
>
> Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
> ---
>
> I'm going to apply it to linux-cns3xxx.git tree and push it (via Arnd)
> for v3.1, if there are no complaints, of course.

I think there is a small problem you should fix first, but otherwise
it's ok.

The problem is that CONFIG_CACHE_L2X0 is a compile-time option that can
be disabled. Your code will not link correctly if it's turned off, so
you need to conditionalize it on that Kconfig symbol.

	Arnd

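One common way to conditionalize a call like this is to provide an empty inline stub when the option is off, so callers need no #ifdef of their own. The following is only a standalone sketch of that pattern, not the fix that was eventually applied: the #define stands in for the Kconfig symbol (a real kernel build takes it from the generated configuration), and l2_enabled and board_init() are hypothetical names used purely for illustration.

```c
#include <stdbool.h>

/*
 * Assumption: stand-in for the Kconfig symbol. Remove this line to see
 * the stub variant being compiled instead.
 */
#define CONFIG_CACHE_L2X0 1

/* Hypothetical flag recording whether the cache setup ran. */
static bool l2_enabled;

#ifdef CONFIG_CACHE_L2X0
/* Option on: the real function would map the controller and set it up. */
static void cns3xxx_l2x0_init(void)
{
	l2_enabled = true;
}
#else
/* Option off: an empty stub keeps every caller compiling and linking. */
static inline void cns3xxx_l2x0_init(void)
{
}
#endif

/* Hypothetical board init: calls the function unconditionally. */
static bool board_init(void)
{
	cns3xxx_l2x0_init();
	return l2_enabled;
}
```

With the symbol defined, board_init() reports that the cache setup ran; with it undefined, the same caller still builds and simply does nothing.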
On 07/06/2011 09:08 AM, Anton Vorontsov wrote:
> CNS3xxx SOCs have L310-compatible cache controller, so let's use it.
>
> With this patch benchmarking with 'gzip' shows that performance is
> doubled, and I'm still able to boot full-fledged userland over NFS
> (using PCIe NIC), so the support should be pretty robust.
>
> Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
> ---
>
> I'm going to apply it to linux-cns3xxx.git tree and push it (via Arnd)
> for v3.1, if there are no complaints, of course.
>
> Thanks,
>
>  arch/arm/mach-cns3xxx/cns3420vb.c |    2 +
>  arch/arm/mach-cns3xxx/core.c      |   39 +++++++++++++++++++++++++++++++++++++
>  arch/arm/mach-cns3xxx/core.h      |    1 +
>  arch/arm/mm/Kconfig               |    2 +-
>  4 files changed, 43 insertions(+), 1 deletions(-)
>
> diff --git a/arch/arm/mach-cns3xxx/cns3420vb.c b/arch/arm/mach-cns3xxx/cns3420vb.c
> index 08e5c87..4b804ba 100644
> --- a/arch/arm/mach-cns3xxx/cns3420vb.c
> +++ b/arch/arm/mach-cns3xxx/cns3420vb.c
> @@ -170,6 +170,8 @@ static struct platform_device *cns3420_pdevs[] __initdata = {
>
>  static void __init cns3420_init(void)
>  {
> +	cns3xxx_l2x0_init();
> +
>  	platform_add_devices(cns3420_pdevs, ARRAY_SIZE(cns3420_pdevs));
>
>  	cns3xxx_ahci_init();
> diff --git a/arch/arm/mach-cns3xxx/core.c b/arch/arm/mach-cns3xxx/core.c
> index da30078..49f3a51 100644
> --- a/arch/arm/mach-cns3xxx/core.c
> +++ b/arch/arm/mach-cns3xxx/core.c
> @@ -16,6 +16,7 @@
>  #include <asm/mach/time.h>
>  #include <asm/mach/irq.h>
>  #include <asm/hardware/gic.h>
> +#include <asm/hardware/cache-l2x0.h>
>  #include <mach/cns3xxx.h>
>  #include "core.h"
>
> @@ -244,3 +245,41 @@ static void __init cns3xxx_timer_init(void)
>  struct sys_timer cns3xxx_timer = {
>  	.init = cns3xxx_timer_init,
>  };
> +
> +void __init cns3xxx_l2x0_init(void)
> +{
> +	void __iomem *base = ioremap(CNS3XXX_L2C_BASE, SZ_4K);
> +	u32 val;
> +
> +	if (WARN_ON(!base))
> +		return;
> +
> +	/*
> +	 * Tag RAM Control register
> +	 *
> +	 * bit[10:8]	- 1 cycle of write accesses latency
> +	 * bit[6:4]	- 1 cycle of read accesses latency
> +	 * bit[3:0]	- 1 cycle of setup latency
> +	 *
> +	 * 1 cycle of latency for setup, read and write accesses
> +	 */
> +	val = readl(base + L2X0_TAG_LATENCY_CTRL);
> +	val &= 0xfffff888;
> +	writel(val, base + L2X0_TAG_LATENCY_CTRL);
> +
> +	/*
> +	 * Data RAM Control register
> +	 *
> +	 * bit[10:8]	- 1 cycles of write accesses latency
> +	 * bit[6:4]	- 1 cycles of read accesses latency
> +	 * bit[3:0]	- 1 cycle of setup latency
> +	 *
> +	 * 1 cycle of setup latency, 2 cycles of read and write accesses latency
> +	 */
> +	val = readl(base + L2X0_DATA_LATENCY_CTRL);
> +	val &= 0xfffff888;

You're missing a "val |= 0x110" or your comment is wrong.

Rob

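To spell out the arithmetic behind this remark: the mask 0xfffff888 clears the three latency fields at bits [10:8], [6:4] and [2:0], and the patch comments treat a cleared field as 1 cycle, so a field value of 1 would mean 2 cycles. The sketch below is a standalone illustration under that assumption; the macro and function names are hypothetical and are not part of the patch or of cache-l2x0.h.

```c
#include <stdint.h>

/* Clears the write [10:8], read [6:4] and setup [2:0] latency fields. */
#define LATENCY_CLEAR_MASK	0xfffff888u

/* Field value 1 in the read and write fields, i.e. 2 cycles each. */
#define DATA_RW_TWO_CYCLES	0x00000110u

/* Tag RAM: all three fields cleared, 1 cycle for setup, read and write. */
static uint32_t tag_latency_value(uint32_t val)
{
	return val & LATENCY_CLEAR_MASK;
}

/*
 * Data RAM: clear the fields, then set read and write to 2 cycles.
 * Without the OR step the result matches the Tag RAM case, which is
 * the mismatch between code and comment pointed out above.
 */
static uint32_t data_latency_value(uint32_t val)
{
	val &= LATENCY_CLEAR_MASK;
	val |= DATA_RW_TWO_CYCLES;
	return val;
}
```

For an all-ones register value the Tag RAM result is 0xfffff888 and the Data RAM result is 0xfffff998; the two differ only in the read and write fields.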
2011/7/6 Anton Vorontsov <avorontsov@mvista.com>:
> CNS3xxx SOCs have L310-compatible cache controller, so let's use it.
>
> With this patch benchmarking with 'gzip' shows that performance is
> doubled, and I'm still able to boot full-fledged userland over NFS
> (using PCIe NIC), so the support should be pretty robust.
>
> Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>

CNS3xxx has a PL310. Would you mind enabling CONFIG_CACHE_PL310 by
default as well? It is disabled by default because of the !CPU_V6
condition on CACHE_PL310.

@@ -795,6 +795,7 @@ config CACHE_L2X0
 	default y
 	select OUTER_CACHE
 	select OUTER_CACHE_SYNC
+	select CACHE_PL310 if ARCH_CNS3XXX
 	help
 	  This option enables the L2x0 PrimeCell.

Best Regards,
Mac Lin

On Thursday 07 July 2011 01:57:11 Lin Mac wrote:
> 2011/7/6 Anton Vorontsov <avorontsov@mvista.com>:
> > CNS3xxx SOCs have L310-compatible cache controller, so let's use it.
> >
> > With this patch benchmarking with 'gzip' shows that performance is
> > doubled, and I'm still able to boot full-fledged userland over NFS
> > (using PCIe NIC), so the support should be pretty robust.
> >
> > Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
>
> CNS3xxx has a PL310. Would you mind enabling CONFIG_CACHE_PL310 by
> default as well? It is disabled by default because of the !CPU_V6
> condition on CACHE_PL310.
>
> @@ -795,6 +795,7 @@ config CACHE_L2X0
>  	default y
>  	select OUTER_CACHE
>  	select OUTER_CACHE_SYNC
> +	select CACHE_PL310 if ARCH_CNS3XXX
>  	help
>  	  This option enables the L2x0 PrimeCell.
>

I think it's better to keep such things local to the platform that
needs it and add 'select CACHE_PL310 if CACHE_L2X0' to the ARCH_CNS3XXX
config. The result is the same, but we don't clutter the main Kconfig.

In the light of the move to cross-platform zImage builds, however, this
would still be wrong: you must not select CACHE_PL310 if any target
machine has an L2X0.

A more correct but also more complex solution would be

 config CACHE_PL310
 	bool
 	depends on CACHE_L2X0
-	default y if CPU_V7 && !(CPU_V6 || CPU_V6K)
+	default y if CPU_V7 && (!(CPU_V6 || CPU_V6K) || ARCH_CNS3XXX)
 	help
 	  This option enables optimisations for the PL310 cache
 	  controller.

If we get more of these, we might want to turn around the logic.

	Arnd

On Thu, Jul 07, 2011 at 09:16:20AM +0200, Arnd Bergmann wrote:
> A more correct but also more complex solution would be
>
>  config CACHE_PL310
>  	bool
>  	depends on CACHE_L2X0
> -	default y if CPU_V7 && !(CPU_V6 || CPU_V6K)
> +	default y if CPU_V7 && (!(CPU_V6 || CPU_V6K) || ARCH_CNS3XXX)
>  	help
>  	  This option enables optimisations for the PL310 cache
>  	  controller.
>
> If we get more of these, we might want to turn around the logic.

Or we actually want to fix cache-l2x0.c to detect the cache type at
runtime and decide what to do.

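As a rough illustration of what runtime detection could look like: the L2x0 family exposes a cache ID register whose part-number field distinguishes the controllers. The sketch below is standalone and only decodes an ID value that has already been read; the constants are assumed to mirror the part-number encoding (field at bits [9:6], 1 for L210, 3 for L310), and the enum and function names are hypothetical rather than taken from cache-l2x0.c.

```c
#include <stdint.h>

/* Assumed layout of the part-number field in the cache ID register. */
#define CACHE_ID_PART_MASK	(0xfu << 6)
#define CACHE_ID_PART_L210	(1u << 6)
#define CACHE_ID_PART_L310	(3u << 6)

/* Hypothetical classification used only in this sketch. */
enum l2_part {
	L2_PART_L210,
	L2_PART_L310,
	L2_PART_OTHER,
};

/*
 * Decode the controller type from a cache ID value. A driver would read
 * the value from the controller once at init time and then choose the
 * PL310-specific paths at runtime instead of via a Kconfig option.
 */
static enum l2_part l2_part_from_id(uint32_t cache_id)
{
	switch (cache_id & CACHE_ID_PART_MASK) {
	case CACHE_ID_PART_L310:
		return L2_PART_L310;
	case CACHE_ID_PART_L210:
		return L2_PART_L210;
	default:
		return L2_PART_OTHER;
	}
}
```

The appeal of this approach for multi-platform images is that one kernel binary could serve machines with different controllers, since the decision is made from the hardware rather than from the build configuration.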
diff --git a/arch/arm/mach-cns3xxx/cns3420vb.c b/arch/arm/mach-cns3xxx/cns3420vb.c
index 08e5c87..4b804ba 100644
--- a/arch/arm/mach-cns3xxx/cns3420vb.c
+++ b/arch/arm/mach-cns3xxx/cns3420vb.c
@@ -170,6 +170,8 @@ static struct platform_device *cns3420_pdevs[] __initdata = {
 
 static void __init cns3420_init(void)
 {
+	cns3xxx_l2x0_init();
+
 	platform_add_devices(cns3420_pdevs, ARRAY_SIZE(cns3420_pdevs));
 
 	cns3xxx_ahci_init();
diff --git a/arch/arm/mach-cns3xxx/core.c b/arch/arm/mach-cns3xxx/core.c
index da30078..49f3a51 100644
--- a/arch/arm/mach-cns3xxx/core.c
+++ b/arch/arm/mach-cns3xxx/core.c
@@ -16,6 +16,7 @@
 #include <asm/mach/time.h>
 #include <asm/mach/irq.h>
 #include <asm/hardware/gic.h>
+#include <asm/hardware/cache-l2x0.h>
 #include <mach/cns3xxx.h>
 #include "core.h"
 
@@ -244,3 +245,41 @@ static void __init cns3xxx_timer_init(void)
 struct sys_timer cns3xxx_timer = {
 	.init = cns3xxx_timer_init,
 };
+
+void __init cns3xxx_l2x0_init(void)
+{
+	void __iomem *base = ioremap(CNS3XXX_L2C_BASE, SZ_4K);
+	u32 val;
+
+	if (WARN_ON(!base))
+		return;
+
+	/*
+	 * Tag RAM Control register
+	 *
+	 * bit[10:8]	- 1 cycle of write accesses latency
+	 * bit[6:4]	- 1 cycle of read accesses latency
+	 * bit[3:0]	- 1 cycle of setup latency
+	 *
+	 * 1 cycle of latency for setup, read and write accesses
+	 */
+	val = readl(base + L2X0_TAG_LATENCY_CTRL);
+	val &= 0xfffff888;
+	writel(val, base + L2X0_TAG_LATENCY_CTRL);
+
+	/*
+	 * Data RAM Control register
+	 *
+	 * bit[10:8]	- 1 cycles of write accesses latency
+	 * bit[6:4]	- 1 cycles of read accesses latency
+	 * bit[3:0]	- 1 cycle of setup latency
+	 *
+	 * 1 cycle of setup latency, 2 cycles of read and write accesses latency
+	 */
+	val = readl(base + L2X0_DATA_LATENCY_CTRL);
+	val &= 0xfffff888;
+	writel(val, base + L2X0_DATA_LATENCY_CTRL);
+
+	/* 32 KiB, 8-way, parity disable */
+	l2x0_init(base, 0x00540000, 0xfe000fff);
+}
diff --git a/arch/arm/mach-cns3xxx/core.h b/arch/arm/mach-cns3xxx/core.h
index ffeb3a8..13635ca 100644
--- a/arch/arm/mach-cns3xxx/core.h
+++ b/arch/arm/mach-cns3xxx/core.h
@@ -14,6 +14,7 @@
 extern struct sys_timer cns3xxx_timer;
 
 void __init cns3xxx_map_io(void);
+void __init cns3xxx_l2x0_init(void);
 void __init cns3xxx_init_irq(void);
 void cns3xxx_power_off(void);
 
diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig
index 0074b8d..cb26d49 100644
--- a/arch/arm/mm/Kconfig
+++ b/arch/arm/mm/Kconfig
@@ -821,7 +821,7 @@ config CACHE_L2X0
 	depends on REALVIEW_EB_ARM11MP || MACH_REALVIEW_PB11MP || MACH_REALVIEW_PB1176 || \
 		   REALVIEW_EB_A9MP || SOC_IMX35 || SOC_IMX31 || MACH_REALVIEW_PBX || \
 		   ARCH_NOMADIK || ARCH_OMAP4 || ARCH_EXYNOS4 || ARCH_TEGRA || \
-		   ARCH_U8500 || ARCH_VEXPRESS_CA9X4 || ARCH_SHMOBILE
+		   ARCH_U8500 || ARCH_VEXPRESS_CA9X4 || ARCH_SHMOBILE || ARCH_CNS3XXX
 	default y
 	select OUTER_CACHE
 	select OUTER_CACHE_SYNC

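For readers unfamiliar with the last call in the patch: the second and third arguments of l2x0_init() are a value and a mask for the auxiliary control register. The sketch below assumes the usual read-modify-write convention (keep the bits selected by the mask, then OR in the value); that convention is my understanding of the driver and is not stated in the patch itself. The function name is hypothetical, and the input register values in the usage note are made up for illustration, not real CNS3xxx reset values.

```c
#include <stdint.h>

/*
 * Hypothetical helper: combine the current auxiliary control value with
 * the board-supplied mask and value, under the assumed convention
 * new = (current & aux_mask) | aux_val.
 */
static uint32_t l2x0_aux_value(uint32_t current, uint32_t aux_val,
			       uint32_t aux_mask)
{
	return (current & aux_mask) | aux_val;
}
```

With the patch's arguments (value 0x00540000, mask 0xfe000fff), an all-ones register would become 0xfe540fff and an all-zeros register would become 0x00540000: bits [24:12] are fully determined by the board code, while the remaining bits keep whatever the hardware or bootloader left there.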
CNS3xxx SOCs have L310-compatible cache controller, so let's use it.

With this patch benchmarking with 'gzip' shows that performance is
doubled, and I'm still able to boot full-fledged userland over NFS
(using PCIe NIC), so the support should be pretty robust.

Signed-off-by: Anton Vorontsov <avorontsov@mvista.com>
---

I'm going to apply it to linux-cns3xxx.git tree and push it (via Arnd)
for v3.1, if there are no complaints, of course.

Thanks,

 arch/arm/mach-cns3xxx/cns3420vb.c |    2 +
 arch/arm/mach-cns3xxx/core.c      |   39 +++++++++++++++++++++++++++++++++++++
 arch/arm/mach-cns3xxx/core.h      |    1 +
 arch/arm/mm/Kconfig               |    2 +-
 4 files changed, 43 insertions(+), 1 deletions(-)