From patchwork Sat Mar 7 00:54:50 2015 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Florian Fainelli X-Patchwork-Id: 5958241 Return-Path: X-Original-To: patchwork-linux-arm@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork2.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.136]) by patchwork2.web.kernel.org (Postfix) with ESMTP id E245ABF440 for ; Sat, 7 Mar 2015 01:00:51 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id 7812720392 for ; Sat, 7 Mar 2015 01:00:48 +0000 (UTC) Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.9]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id F0E9D20390 for ; Sat, 7 Mar 2015 01:00:46 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1YU33y-0003ns-9y; Sat, 07 Mar 2015 00:57:54 +0000 Received: from mail-pa0-x236.google.com ([2607:f8b0:400e:c03::236]) by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1YU322-00034I-5D for linux-arm-kernel@lists.infradead.org; Sat, 07 Mar 2015 00:55:56 +0000 Received: by padet14 with SMTP id et14so31745561pad.11 for ; Fri, 06 Mar 2015 16:55:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=ZSzzqUvG8IIhVxqCMpSRzXkQ4dKalwTavhyP0BMC95Y=; b=ztBZ+MOCjUDL+2T+yN9CDq8MOali680/RHzKPn2+iHMcixuVwrec2XA9pSxY2tyRU4 iqYgsPr520HCWrTPLlcIRPrGeWokDi18hUtbNG8HSx3Iu7WQlJHa7Ctk9DTIULI/4fV2 R71THVMOxg81UtJhol67Xw+UWSdhmfLYWj0QcNYCrFmlWFLnimQYjp3vREGf66dHvYM/ t0DW+ngi7KHSA6qJEN/S5/sATHt93AplhtR40lAn/FFLcN42XvmOSlrSrUob0Wejw6sB kFMpPjWba3Rnm4JnicPncSj1J4cehnDcVmBlAraaaCvt4jkOjSMEJQpwFMNNBBGQY13x 1V0Q== X-Received: by 10.70.134.8 with SMTP id pg8mr30521468pdb.138.1425689731886; Fri, 06 Mar 2015 16:55:31 -0800 (PST) Received: from fainelli-desktop.broadcom.com (5520-maca-inet1-outside.broadcom.com. [216.31.211.11]) by mx.google.com with ESMTPSA id kd9sm10732871pab.0.2015.03.06.16.55.30 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Fri, 06 Mar 2015 16:55:31 -0800 (PST) From: Florian Fainelli To: linux-arm-kernel@lists.infradead.org Subject: [PATCH 2/5] ARM: Add Broadcom Brahma-B15 readahead cache support Date: Fri, 6 Mar 2015 16:54:50 -0800 Message-Id: <1425689693-31034-3-git-send-email-f.fainelli@gmail.com> X-Mailer: git-send-email 2.1.0 In-Reply-To: <1425689693-31034-1-git-send-email-f.fainelli@gmail.com> References: <1425689693-31034-1-git-send-email-f.fainelli@gmail.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20150306_165554_308432_5DED67C9 X-CRM114-Status: GOOD ( 25.44 ) X-Spam-Score: -0.8 (/) Cc: mark.rutland@arm.com, Florian Fainelli , linux@arm.linux.org.uk, rngun@broadcom.com, arnd@arndb.de, nico@linaro.org, cernekee@gmail.com, will.deacon@arm.com, Alamy Liu , bcm-kernel-feedback-list@broadcom.com, gregory.0xf0@gmail.com, olof@lixom.net, computersforpeace@gmail.com X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.18-1 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Spam-Status: No, score=-4.1 required=5.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_MED, T_DKIM_INVALID, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch adds support for the Broadcom Brahma-B15 CPU readahead cache controller. This cache controller sits between the L2 and the memory bus and its purpose is to provide a friendler burst size towards the DDR interface than the native cache line size. The readahead cache is mostly transparent, except for flush_kern_cache_all, flush_kern_cache_louis and flush_icache_all, which is precisely what we are overriding here. The readahead cache only intercepts reads, not writes, as such, some data can remain stale in any of its buffers, such that we need to flush it, which is an operation that needs to happen in a particular order: - disable the readahead cache - flush it - call the appropriate cache-v7.S function - re-enable This patch tries to minimize the impact to the cache-v7.S file by only providing a stub in case CONFIG_CACHE_B15_RAC is enabled (default for ARCH_BRCMSTB since it is the current user). Signed-off-by: Alamy Liu Signed-off-by: Florian Fainelli --- arch/arm/include/asm/cacheflush.h | 2 +- arch/arm/include/asm/glue-cache.h | 4 + arch/arm/include/asm/hardware/cache-b15-rac.h | 12 ++ arch/arm/mm/Kconfig | 8 ++ arch/arm/mm/Makefile | 1 + arch/arm/mm/cache-b15-rac.c | 181 ++++++++++++++++++++++++++ 6 files changed, 207 insertions(+), 1 deletion(-) create mode 100644 arch/arm/include/asm/hardware/cache-b15-rac.h create mode 100644 arch/arm/mm/cache-b15-rac.c diff --git a/arch/arm/include/asm/cacheflush.h b/arch/arm/include/asm/cacheflush.h index 2d46862e7bef..4d847e185cf6 100644 --- a/arch/arm/include/asm/cacheflush.h +++ b/arch/arm/include/asm/cacheflush.h @@ -199,7 +199,7 @@ extern void copy_to_user_page(struct vm_area_struct *, struct page *, */ #if (defined(CONFIG_CPU_V7) && \ (defined(CONFIG_CPU_V6) || defined(CONFIG_CPU_V6K))) || \ - defined(CONFIG_SMP_ON_UP) + defined(CONFIG_SMP_ON_UP) || defined(CONFIG_CACHE_B15_RAC) #define __flush_icache_preferred __cpuc_flush_icache_all #elif __LINUX_ARM_ARCH__ >= 7 && defined(CONFIG_SMP) #define __flush_icache_preferred __flush_icache_all_v7_smp diff --git a/arch/arm/include/asm/glue-cache.h b/arch/arm/include/asm/glue-cache.h index a3c24cd5b7c8..11f33b5f9284 100644 --- a/arch/arm/include/asm/glue-cache.h +++ b/arch/arm/include/asm/glue-cache.h @@ -117,6 +117,10 @@ # endif #endif +#if defined(CONFIG_CACHE_B15_RAC) +# define MULTI_CACHE 1 +#endif + #if defined(CONFIG_CPU_V7M) # ifdef _CACHE # define MULTI_CACHE 1 diff --git a/arch/arm/include/asm/hardware/cache-b15-rac.h b/arch/arm/include/asm/hardware/cache-b15-rac.h new file mode 100644 index 000000000000..76b888f53f90 --- /dev/null +++ b/arch/arm/include/asm/hardware/cache-b15-rac.h @@ -0,0 +1,12 @@ +#ifndef __ASM_ARM_HARDWARE_CACHE_B15_RAC_H +#define __ASM_ARM_HARDWARE_CACHE_B15_RAC_H + +#ifndef __ASSEMBLY__ + +void b15_flush_kern_cache_all(void); +void b15_flush_kern_cache_louis(void); +void b15_flush_icache_all(void); + +#endif + +#endif diff --git a/arch/arm/mm/Kconfig b/arch/arm/mm/Kconfig index 9b4f29e595a4..4d5652a39304 100644 --- a/arch/arm/mm/Kconfig +++ b/arch/arm/mm/Kconfig @@ -853,6 +853,14 @@ config OUTER_CACHE_SYNC The outer cache has a outer_cache_fns.sync function pointer that can be used to drain the write buffer of the outer cache. +config CACHE_B15_RAC + bool "Enable the Broadcom Brahma-B15 read-ahead cache controller" + depends on ARCH_BRCMSTB + default y + help + This option enables the Broadcom Brahma-B15 read-ahead cache + controller. If disabled, the read-ahead cache remains off. + config CACHE_FEROCEON_L2 bool "Enable the Feroceon L2 cache controller" depends on ARCH_MV78XX0 || ARCH_MVEBU diff --git a/arch/arm/mm/Makefile b/arch/arm/mm/Makefile index d3afdf9eb65a..a6797fdb6721 100644 --- a/arch/arm/mm/Makefile +++ b/arch/arm/mm/Makefile @@ -96,6 +96,7 @@ AFLAGS_proc-v6.o :=-Wa,-march=armv6 AFLAGS_proc-v7.o :=-Wa,-march=armv7-a obj-$(CONFIG_OUTER_CACHE) += l2c-common.o +obj-$(CONFIG_CACHE_B15_RAC) += cache-b15-rac.o obj-$(CONFIG_CACHE_FEROCEON_L2) += cache-feroceon-l2.o obj-$(CONFIG_CACHE_L2X0) += cache-l2x0.o l2c-l2x0-resume.o obj-$(CONFIG_CACHE_XSC3L2) += cache-xsc3l2.o diff --git a/arch/arm/mm/cache-b15-rac.c b/arch/arm/mm/cache-b15-rac.c new file mode 100644 index 000000000000..1c5bca6e906b --- /dev/null +++ b/arch/arm/mm/cache-b15-rac.c @@ -0,0 +1,181 @@ +/* + * Broadcom Brahma-B15 CPU read-ahead cache management functions + * + * Copyright (C) 2015, Broadcom Corporation + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#include +#include +#include +#include +#include + +#include +#include + +extern void v7_flush_kern_cache_all(void); +extern void v7_flush_kern_cache_louis(void); +extern void v7_flush_icache_all(void); + +/* RAC register offsets, relative to the HIF_CPU_BIUCTRL register base */ +#define RAC_CONFIG0_REG (0x78) +#define RACENPREF_MASK (0x3) +#define RACPREFINST_SHIFT (0) +#define RACENINST_SHIFT (2) +#define RACPREFDATA_SHIFT (4) +#define RACENDATA_SHIFT (6) +#define RAC_CPU_SHIFT (8) +#define RACCFG_MASK (0xff) +#define RAC_CONFIG1_REG (0x7c) +#define RAC_FLUSH_REG (0x80) +#define FLUSH_RAC (1 << 0) + +/* Bitmask to enable instruction and data prefetching with a 256-bytes stride */ +#define RAC_DATA_INST_EN_MASK (1 << RACPREFINST_SHIFT | \ + RACENPREF_MASK << RACENINST_SHIFT | \ + 1 << RACPREFDATA_SHIFT | \ + RACENPREF_MASK << RACENDATA_SHIFT) + +#define RAC_ENABLED (1 << 0) + +static void __iomem *b15_rac_base; +static DEFINE_SPINLOCK(rac_lock); + +/* Initialization flag to avoid checking for b15_rac_base, and to prevent + * multi-platform kernels from crashing here as well. + */ +static unsigned long b15_rac_flags; + +static inline u32 __b15_rac_disable(void) +{ + u32 val = __raw_readl(b15_rac_base + RAC_CONFIG0_REG); + __raw_writel(0, b15_rac_base + RAC_CONFIG0_REG); + dmb(); + return val; +} + +static inline void __b15_rac_flush(void) +{ + u32 reg; + + __raw_writel(FLUSH_RAC, b15_rac_base + RAC_FLUSH_REG); + do { + /* This dmb() is required to force the Bus Interface Unit + * to clean oustanding writes, and forces an idle cycle + * to be inserted. + */ + dmb(); + reg = __raw_readl(b15_rac_base + RAC_FLUSH_REG); + } while (reg & RAC_FLUSH_REG); +} + +static inline u32 b15_rac_disable_and_flush(void) +{ + u32 reg; + + reg = __b15_rac_disable(); + __b15_rac_flush(); + return reg; +} + +static inline void __b15_rac_enable(u32 val) +{ + __raw_writel(val, b15_rac_base + RAC_CONFIG0_REG); + /* dsb() is required here to be consistent with __flush_icache_all() */ + dsb(); +} + +#define BUILD_RAC_CACHE_OP(name, bar) \ +void b15_flush_##name(void) \ +{ \ + unsigned int do_flush; \ + u32 val = 0; \ + \ + spin_lock(&rac_lock); \ + do_flush = test_bit(RAC_ENABLED, &b15_rac_flags); \ + if (do_flush) \ + val = b15_rac_disable_and_flush(); \ + v7_flush_##name(); \ + if (!do_flush) \ + bar; \ + else \ + __b15_rac_enable(val); \ + spin_unlock(&rac_lock); \ +} + +#define nobarrier + +/* The readahead cache present in the Brahma-B15 CPU is a special piece of + * hardware after the integrated L2 cache of the B15 CPU complex whose purpose + * is to prefetch instruction and/or data with a line size of either 64 bytes + * or 256 bytes. The rationale is that the data-bus of the CPU interface is + * optimized for 256-bytes transactions, and enabling the readahead cache + * provides a significant performance boost we want it enabled (typically + * twice the performance for a memcpy benchmark application). + * + * The readahead cache is transparent for Modified Virtual Addresses + * cache maintenance operations: ICIMVAU, DCIMVAC, DCCMVAC, DCCMVAU and + * DCCIMVAC. + * + * It is however not transparent for the following cache maintenance + * operations: DCISW, DCCSW, DCCISW, ICIALLUIS and ICIALLU which is precisely + * what we are patching here with our BUILD_RAC_CACHE_OP here. + */ + +BUILD_RAC_CACHE_OP(kern_cache_all, nobarrier); +BUILD_RAC_CACHE_OP(kern_cache_louis, nobarrier); +BUILD_RAC_CACHE_OP(icache_all, dsb()); + +static void b15_rac_enable(void) +{ + unsigned int cpu; + u32 enable = 0; + + for_each_possible_cpu(cpu) + enable |= (RAC_DATA_INST_EN_MASK << (cpu * RAC_CPU_SHIFT)); + + b15_rac_disable_and_flush(); + __b15_rac_enable(enable); +} + +static int __init b15_rac_init(void) +{ + struct device_node *dn; + int ret = 0, cpu; + u32 reg, en_mask = 0; + + dn = of_find_compatible_node(NULL, NULL, "brcm,brcmstb-cpu-biu-ctrl"); + if (!dn) + return -ENODEV; + + WARN(num_possible_cpus() > 4, "RAC only supports 4 CPUs\n"); + + b15_rac_base = of_iomap(dn, 0); + if (!b15_rac_base) { + pr_err("failed to remap BIU control base\n"); + ret = -ENOMEM; + goto out; + } + + spin_lock(&rac_lock); + reg = __raw_readl(b15_rac_base + RAC_CONFIG0_REG); + for_each_possible_cpu(cpu) + en_mask |= ((1 << RACPREFDATA_SHIFT) << (cpu * RAC_CPU_SHIFT)); + WARN(reg & en_mask, "Read-ahead cache not previously disabled\n"); + + b15_rac_enable(); + set_bit(RAC_ENABLED, &b15_rac_flags); + spin_unlock(&rac_lock); + + pr_info("Broadcom Brahma-B15 readahead cache at: 0x%p\n", + b15_rac_base + RAC_CONFIG0_REG); + +out: + of_node_put(dn); + return ret; +} +arch_initcall(b15_rac_init);