From patchwork Fri Jul  6 12:26:11 2018
X-Patchwork-Submitter: Will Deacon
X-Patchwork-Id: 10511483
Date: Fri, 6 Jul 2018 13:26:11 +0100
From: Will Deacon
To: Lucas Stach
Subject: Re: Clarifying dma_wmb behavior in presence of non-coherent masters and outer caches
Message-ID: <20180706122610.GA6993@arm.com>
References: <1530275290.22468.69.camel@pengutronix.de>
 <20180629142539.GH17271@n2100.armlinux.org.uk>
 <20180629162248.GB20010@arm.com>
 <20180629164801.GJ17271@n2100.armlinux.org.uk>
 <20180629174321.GA20909@arm.com>
 <1530539366.22468.89.camel@pengutronix.de>
 <20180702174552.GD23687@arm.com>
In-Reply-To: <20180702174552.GD23687@arm.com>
Cc: 
 linux-arm-kernel@lists.infradead.org, Oleksij Rempel,
 Russell King - ARM Linux, kernel@pengutronix.de

Hi Lucas,

On Mon, Jul 02, 2018 at 06:45:52PM +0100, Will Deacon wrote:
> On Mon, Jul 02, 2018 at 03:49:26PM +0200, Lucas Stach wrote:
> > Also the section "External caches", "Before AMBA4 ACE" seems to suggest
> > that the barriers aren't fully propagated to the PL310 write-caches and
> > master ports. I'm unsure if this is just an artifact of the mentioned
> > MMIO access, i.e. the handling of Normal vs. Device memory transactions
> > in the PL310.
>
> I'll check with the hardware folks about the PL310, but I don't see how
> you could propagate barriers over AXI3 anyway, because I don't think this
> stuff existed back then.

Damn. As I feared, you can't propagate the barrier to the PL310, so the
A9 is unable to enforce ordering beyond the responses it gets back. The
PL310 will then re-order Normal memory accesses (including non-cacheable
ones) in its store buffer, so we do actually need a bloody CACHE SYNC
here.

Patch below. Does it help?

Will

--->8

From 5a2769b45e94b212d21bc44f5940ed5f3bde5cc2 Mon Sep 17 00:00:00 2001
From: Will Deacon
Date: Wed, 4 Jul 2018 10:56:22 +0100
Subject: [PATCH] ARM: barrier: Kick outer-cache and SoC fabric in dma_wmb()

Prior to AMBA4-ACE, the AXI bus protocol had no support for barrier
transactions or shareability domains and was therefore unable to enforce
ordering between memory accesses either side of a barrier instruction,
requiring the CPU to handle all of the ordering itself.

Unfortunately, IP such as the venerable PL310 will re-order stores to
Normal memory inside its store-buffer, meaning that accesses to a
coherent DMA buffer (i.e.
one allocated by dma_alloc_coherent()) can be observed out-of-order by a
non-coherent DMA engine unless the PL310 is kicked with a CACHE_SYNC
command between buffering the accesses.

Whilst this is already the case for mb() and wmb(), dma_wmb() just
expands to a DMB OSHST, leading to sequences such as:

	STR	Rdata,  [Rbuf, #DATA_OFFSET]
	DMB	OSHST
	STR	Rvalid, [Rbuf, #VALID_OFFSET]

being reordered such that the DMA device sees the write of valid but not
the write of data.

This patch ensures that the SoC fabric and the outer-cache are kicked by
dma_wmb() if they are able to buffer transactions outside of the CPU but
before the Point of Coherency.

Reported-by: Lucas Stach
Signed-off-by: Will Deacon
---
 arch/arm/include/asm/barrier.h    | 14 ++++++++++----
 arch/arm/include/asm/outercache.h |  6 ++++++
 arch/arm/mm/flush.c               | 24 +++++++++++++++++++++++-
 3 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/arch/arm/include/asm/barrier.h b/arch/arm/include/asm/barrier.h
index 69772e742a0a..e31833f72b69 100644
--- a/arch/arm/include/asm/barrier.h
+++ b/arch/arm/include/asm/barrier.h
@@ -53,17 +53,23 @@
 #ifdef CONFIG_ARM_HEAVY_MB
 extern void (*soc_mb)(void);
 extern void arm_heavy_mb(void);
-#define __arm_heavy_mb(x...)	do { dsb(x); arm_heavy_mb(); } while (0)
+extern void arm_heavy_wmb(void);
+extern void arm_heavy_dma_wmb(void);
+#define __arm_heavy_mb()	arm_heavy_mb()
+#define __arm_heavy_wmb()	arm_heavy_wmb()
+#define __arm_heavy_dma_wmb()	arm_heavy_dma_wmb()
 #else
-#define __arm_heavy_mb(x...)	dsb(x)
+#define __arm_heavy_mb()	dsb()
+#define __arm_heavy_wmb()	dsb(st)
+#define __arm_heavy_dma_wmb()	dmb(oshst)
 #endif
 
 #if defined(CONFIG_ARM_DMA_MEM_BUFFERABLE) || defined(CONFIG_SMP)
 #define mb()		__arm_heavy_mb()
 #define rmb()		dsb()
-#define wmb()		__arm_heavy_mb(st)
+#define wmb()		__arm_heavy_wmb()
 #define dma_rmb()	dmb(osh)
-#define dma_wmb()	dmb(oshst)
+#define dma_wmb()	__arm_heavy_dma_wmb()
 #else
 #define mb()		barrier()
 #define rmb()		barrier()
diff --git a/arch/arm/include/asm/outercache.h b/arch/arm/include/asm/outercache.h
index c2bf24f40177..e744efe6e2a7 100644
--- a/arch/arm/include/asm/outercache.h
+++ b/arch/arm/include/asm/outercache.h
@@ -43,6 +43,12 @@ struct outer_cache_fns {
 
 extern struct outer_cache_fns outer_cache;
 
+#ifdef CONFIG_OUTER_CACHE_SYNC
+static inline bool outer_cache_has_sync(void) { return outer_cache.sync; }
+#else
+static inline bool outer_cache_has_sync(void) { return 0; }
+#endif
+
 #ifdef CONFIG_OUTER_CACHE
 /**
  * outer_inv_range - invalidate range of outer cache lines
diff --git a/arch/arm/mm/flush.c b/arch/arm/mm/flush.c
index 58469623b015..1252689362d8 100644
--- a/arch/arm/mm/flush.c
+++ b/arch/arm/mm/flush.c
@@ -24,7 +24,7 @@
 #ifdef CONFIG_ARM_HEAVY_MB
 void (*soc_mb)(void);
 
-void arm_heavy_mb(void)
+static void __arm_heavy_sync(void)
 {
 #ifdef CONFIG_OUTER_CACHE_SYNC
 	if (outer_cache.sync)
@@ -33,7 +33,29 @@ void arm_heavy_mb(void)
 	if (soc_mb)
 		soc_mb();
 }
+
+void arm_heavy_mb(void)
+{
+	dsb();
+	__arm_heavy_sync();
+}
 EXPORT_SYMBOL(arm_heavy_mb);
+
+void arm_heavy_wmb(void)
+{
+	dsb(st);
+	__arm_heavy_sync();
+}
+EXPORT_SYMBOL(arm_heavy_wmb);
+
+void arm_heavy_dma_wmb(void)
+{
+	if (outer_cache_has_sync() || soc_mb)
+		arm_heavy_wmb();
+	else
+		dmb(oshst);
+}
+EXPORT_SYMBOL(arm_heavy_dma_wmb);
 #endif
 
 #ifdef CONFIG_CPU_CACHE_VIPT