From patchwork Tue Feb 13 00:26:57 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: John David Anglin X-Patchwork-Id: 10214961 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 6530360329 for ; Tue, 13 Feb 2018 00:27:01 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 4F74028E14 for ; Tue, 13 Feb 2018 00:27:01 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4409F28E16; Tue, 13 Feb 2018 00:27:01 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, T_TVD_MIME_EPI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 7DC7328E14 for ; Tue, 13 Feb 2018 00:27:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932779AbeBMA1A (ORCPT ); Mon, 12 Feb 2018 19:27:00 -0500 Received: from belmont80srvr.owm.bell.net ([184.150.200.80]:52714 "EHLO mtlfep02.bell.net" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S932769AbeBMA07 (ORCPT ); Mon, 12 Feb 2018 19:26:59 -0500 Received: from bell.net mtlfep02 184.150.200.30 by mtlfep02.bell.net with ESMTP id <20180213002657.EXOW22355.mtlfep02.bell.net@mtlspm02.bell.net> for ; Mon, 12 Feb 2018 19:26:57 -0500 Received: from [192.168.2.49] (really [70.31.74.127]) by mtlspm02.bell.net with ESMTP id <20180213002657.BDZU6415.mtlspm02.bell.net@[192.168.2.49]>; Mon, 12 Feb 2018 19:26:57 -0500 To: linux-parisc@vger.kernel.org Cc: Helge Deller , James Bottomley From: John David Anglin Subject: [PATCH] parisc: Fix ordering of cache and TLB flushes Message-ID: <7ac084e0-75e7-8356-7839-3403fa53a5e0@bell.net> Date: Mon, 12 Feb 2018 19:26:57 -0500 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 Content-Language: en-US X-Cloudmark-Analysis: v=2.2 cv=OZx3NlbY c=1 sm=0 tr=0 a=exAxv0+L8MI61oIUChyxXA==:17 a=Op4juWPpsa0A:10 a=r77TgQKjGQsHNAKrUKIA:9 a=FBHGMhGWAAAA:8 a=joGbh3AQvw-7eJtGqr4A:9 a=QEXdDO2ut3YA:10 a=rJsPX2TPGC3wyYekd9MA:9 a=ichUEFZdvV4A:10 a=9gvnlMMaQFpL9xblJ6ne:22 Sender: linux-parisc-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-parisc@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP The attached change fixes the ordering of cache and TLB flushes in several cases.  When we flush the cache using the existing PTE/TLB entries, we need to flush the TLB after doing the cache flush.  We don't need to do this when we flush the entire instruction and data caches as these flushes don't use the existing TLB entries.  The same is true for tmpalias region flushes. The flush_kernel_vmap_range() and invalidate_kernel_vmap_range() routines have been updated.   Helge reported that using  flush_data_cache() on PA 1.1 causes a deadlock in the dynamic DMA routines in pci-dma.c in SMP kernels.  So, we have disabled using the whole data cache flush in this case.  This fixes PA 1.1 SMP support. Secondly, we added a new purge_kernel_dcache_range_asm() routine to pacache.S and use it in invalidate_kernel_vmap_range().  Nominally, purges are faster than flushes as the cache lines don't have to be written back to memory. Hopefully, this is sufficient to resolve the remaining problems due to cache speculation.  So far, testing indicates that this is the case.  I did work up a patch using tmpalias flushes, but there is a performance hit because we need the physical address for each page, and we also need to sequence access to the tmpalias flush code.  This increases the probability of stalls. Signed-off-by: John David Anglin  diff --git a/arch/parisc/include/asm/cacheflush.h b/arch/parisc/include/asm/cacheflush.h index 3742508cc534..bd5ce31936f5 100644 --- a/arch/parisc/include/asm/cacheflush.h +++ b/arch/parisc/include/asm/cacheflush.h @@ -26,6 +26,7 @@ void flush_user_icache_range_asm(unsigned long, unsigned long); void flush_kernel_icache_range_asm(unsigned long, unsigned long); void flush_user_dcache_range_asm(unsigned long, unsigned long); void flush_kernel_dcache_range_asm(unsigned long, unsigned long); +void purge_kernel_dcache_range_asm(unsigned long, unsigned long); void flush_kernel_dcache_page_asm(void *); void flush_kernel_icache_page(void *); diff --git a/arch/parisc/kernel/cache.c b/arch/parisc/kernel/cache.c index 19c0c141bc3f..af2cc4302a42 100644 --- a/arch/parisc/kernel/cache.c +++ b/arch/parisc/kernel/cache.c @@ -539,13 +539,10 @@ void flush_cache_mm(struct mm_struct *mm) struct vm_area_struct *vma; pgd_t *pgd; - /* Flush the TLB to avoid speculation if coherency is required. */ - if (parisc_requires_coherency()) - flush_tlb_all(); - /* Flushing the whole cache on each cpu takes forever on rp3440, etc. So, avoid it if the mm isn't too big. */ if (mm_total_size(mm) >= parisc_cache_flush_threshold) { + flush_tlb_all(); flush_cache_all(); return; } @@ -556,6 +553,7 @@ void flush_cache_mm(struct mm_struct *mm) if ((vma->vm_flags & VM_EXEC) == 0) continue; flush_user_icache_range_asm(vma->vm_start, vma->vm_end); + flush_tlb_range(vma, vma->vm_start, vma->vm_end); } return; } @@ -581,21 +579,18 @@ void flush_cache_mm(struct mm_struct *mm) void flush_cache_range(struct vm_area_struct *vma, unsigned long start, unsigned long end) { - BUG_ON(!vma->vm_mm->context); - - /* Flush the TLB to avoid speculation if coherency is required. */ - if (parisc_requires_coherency()) + if ((end - start) >= parisc_cache_flush_threshold) { flush_tlb_range(vma, start, end); - - if ((end - start) >= parisc_cache_flush_threshold - || vma->vm_mm->context != mfsp(3)) { flush_cache_all(); return; } + BUG_ON(vma->vm_mm->context != mfsp(3)); + flush_user_dcache_range_asm(start, end); if (vma->vm_flags & VM_EXEC) flush_user_icache_range_asm(start, end); + flush_tlb_range(vma, start, end); } void @@ -604,8 +599,7 @@ flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long BUG_ON(!vma->vm_mm->context); if (pfn_valid(pfn)) { - if (parisc_requires_coherency()) - flush_tlb_page(vma, vmaddr); + flush_tlb_page(vma, vmaddr); __flush_cache_page(vma, vmaddr, PFN_PHYS(pfn)); } } @@ -613,21 +607,35 @@ flush_cache_page(struct vm_area_struct *vma, unsigned long vmaddr, unsigned long void flush_kernel_vmap_range(void *vaddr, int size) { unsigned long start = (unsigned long)vaddr; + unsigned long end = start + size; - if ((unsigned long)size > parisc_cache_flush_threshold) +#if defined(CONFIG_PA20) || !defined(CONFIG_SMP) + /* The IPI interrupts used by flush_data_cache() cause a stall + in the PA 1.1 dynamic DMA mapping code in SMP kernels. */ + if ((unsigned long)size >= parisc_cache_flush_threshold) { + flush_tlb_kernel_range (start, end); flush_data_cache(); - else - flush_kernel_dcache_range_asm(start, start + size); + return; + } +#endif + + flush_kernel_dcache_range_asm(start, end); + flush_tlb_kernel_range (start, end); } EXPORT_SYMBOL(flush_kernel_vmap_range); void invalidate_kernel_vmap_range(void *vaddr, int size) { unsigned long start = (unsigned long)vaddr; + unsigned long end = start + size; - if ((unsigned long)size > parisc_cache_flush_threshold) + if ((unsigned long)size >= parisc_cache_flush_threshold) { + flush_tlb_kernel_range (start, end); flush_data_cache(); - else - flush_kernel_dcache_range_asm(start, start + size); + return; + } + + purge_kernel_dcache_range_asm(start, end); + flush_tlb_kernel_range (start, end); } EXPORT_SYMBOL(invalidate_kernel_vmap_range); diff --git a/arch/parisc/kernel/pacache.S b/arch/parisc/kernel/pacache.S index 2d40c4ff3f69..67b0f7532e83 100644 --- a/arch/parisc/kernel/pacache.S +++ b/arch/parisc/kernel/pacache.S @@ -1110,6 +1110,28 @@ ENTRY_CFI(flush_kernel_dcache_range_asm) .procend ENDPROC_CFI(flush_kernel_dcache_range_asm) +ENTRY_CFI(purge_kernel_dcache_range_asm) + .proc + .callinfo NO_CALLS + .entry + + ldil L%dcache_stride, %r1 + ldw R%dcache_stride(%r1), %r23 + ldo -1(%r23), %r21 + ANDCM %r26, %r21, %r26 + +1: cmpb,COND(<<),n %r26, %r25,1b + pdc,m %r23(%r26) + + sync + syncdma + bv %r0(%r2) + nop + .exit + + .procend +ENDPROC_CFI(purge_kernel_dcache_range_asm) + ENTRY_CFI(flush_user_icache_range_asm) .proc .callinfo NO_CALLS