From patchwork Wed Jun 8 22:35:16 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Yinghai Lu X-Patchwork-Id: 9165865 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id E1A2560572 for ; Wed, 8 Jun 2016 22:37:31 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id CE9D825D91 for ; Wed, 8 Jun 2016 22:37:31 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id BFAE228307; Wed, 8 Jun 2016 22:37:31 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-4.1 required=2.0 tests=BAYES_00,DKIM_SIGNED, RCVD_IN_DNSWL_MED, T_DKIM_INVALID autolearn=unavailable version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.9]) (using TLSv1.2 with cipher AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 1860B25D91 for ; Wed, 8 Jun 2016 22:37:31 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.80.1 #2 (Red Hat Linux)) id 1bAm4f-0007c1-NJ; Wed, 08 Jun 2016 22:35:45 +0000 Received: from mail-vk0-x241.google.com ([2607:f8b0:400c:c05::241]) by bombadil.infradead.org with esmtps (Exim 4.80.1 #2 (Red Hat Linux)) id 1bAm4d-0007Y3-26 for linux-arm-kernel@lists.infradead.org; Wed, 08 Jun 2016 22:35:44 +0000 Received: by mail-vk0-x241.google.com with SMTP id a126so3666781vkb.1 for ; Wed, 08 Jun 2016 15:35:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:sender:in-reply-to:references:from:date:message-id :subject:to:cc; bh=o/PXBdUhtKAtNx0wPxKmA9We/6gUPkkLCMCYkSZINhI=; b=l2HqcPJo6dhJ/xOogP8FJXPI6ENPpiY6UwWZRoy9TYdf3qMFcJ/981FFT7q6LDSO58 240+gFAhkD1/HTFWvzCRqfCdQ+zCl2u4DTGk6JR5xFfyEnk0HmSdvfCypzmfQ1uXjxMV TzkSUalAhWcf2PkFpnctFWgc0GTe+1rcfdhLpU5I3IHwp7xvTn7OyI8lZfEQj1uzgffK xZHI4etAfwJzZ+6P3cA2BO3gQmUbA7ZscnvJAu5KhHq/Cs8nSw4jHlo+m7/8z1ZqxTrH AAV5ROYR54GjBFXpPHSZF247bYkWafATA26EXkhFP4voMqq8STmiKf/dFMnWPg5gj2Fc HavA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:mime-version:sender:in-reply-to:references:from :date:message-id:subject:to:cc; bh=o/PXBdUhtKAtNx0wPxKmA9We/6gUPkkLCMCYkSZINhI=; b=d8QUie8g/TjoVRvGyHJhkwvyueLKw4ALAMePw7vbyYqFPzDJRXEEd6JakRRB9g5Uum MOD8entgF8ELDqp7BHwfErh3GIkcEqzD0SkILjWSoPmday0Kqipki4dVSAMK58tMsCa5 MEzAnRliLdXuu6yWdiJhvf4uNokN6ukE3dO7SzvOiDZ+q0hty+bgVRSFF1Nojijc99j6 XvG0D1FFUttsdy9/HZ+QPQtEVhXqJ++SFO/fIbb2uzi1qapU3eMC7D7ISI2mscQQBDux 0n6wT0efTSIdE9msdUkfoknYDaXAYYL9M94iF1f4eXKc1kF1PtUIBzeu+wz5am1aqx+q 5aXA== X-Gm-Message-State: ALyK8tLl27ul8TrR/1tBM58nqlpHohVXWx2A9q5txwom4K+OEU/HAw5oR8f1iTcxqQ9t+Hvy20muizu05a038w== X-Received: by 10.159.35.72 with SMTP id 66mr3452083uae.55.1465425317579; Wed, 08 Jun 2016 15:35:17 -0700 (PDT) MIME-Version: 1.0 Received: by 10.103.81.11 with HTTP; Wed, 8 Jun 2016 15:35:16 -0700 (PDT) In-Reply-To: <20160608210322.GA4248@localhost> References: <20160604000642.28162-1-yinghai@kernel.org> <20160604000642.28162-2-yinghai@kernel.org> <20160608210322.GA4248@localhost> From: Yinghai Lu Date: Wed, 8 Jun 2016 15:35:16 -0700 X-Google-Sender-Auth: 2EorLEpjbky_9VCUtCXyavcyL18 Message-ID: Subject: Re: [PATCH v12 01/15] PCI: Let pci_mmap_page_range() take extra resource pointer To: Bjorn Helgaas X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20160608_153543_272804_A9C0157C X-CRM114-Status: GOOD ( 36.16 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.20 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: "linux-mips@linux-mips.org" , Wei Yang , linux-am33-list@redhat.com, "linux-ia64@vger.kernel.org" , linux-parisc@vger.kernel.org, linux-cris-kernel@axis.com, linux-sh@vger.kernel.org, Benjamin Herrenschmidt , linux-xtensa@linux-xtensa.org, linuxppc-dev , Linux Kernel Mailing List , "sparclinux@vger.kernel.org" , Khalid Aziz , "linux-pci@vger.kernel.org" , Bjorn Helgaas , Linus Torvalds , David Miller , "linux-arm-kernel@lists.infradead.org" Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP On Wed, Jun 8, 2016 at 2:03 PM, Bjorn Helgaas wrote: > > Microblaze does look up the resource in pci_mmap_page_range(), but it > never actually uses it. It *looks* like it uses it, but that code is > actually dead and I think we should apply the first patch below. Good one. > > That leaves powerpc as the only arch that would use this extra > resource pointer. It uses it in __pci_mmap_set_pgprot() to help > decide whether to make a normal uncacheable mapping or a write- > combining one. There's nothing here that's specific to the powerpc > architecture, and I don't think we should add this parameter just to > cater to powerpc. > > There are two cases where __pci_mmap_set_pgprot() on powerpc does > something based on the resource: > > 1) We're using procfs to mmap I/O port space after we requested > write-combining, e.g., we did this: > > ioctl(fd, PCIIOC_MMAP_IS_IO); # request I/O port space > ioctl(fd, PCIIOC_WRITE_COMBINE, 1); # request write-combining > mmap(fd, ...) > > On powerpc, we ignore the write-combining request in this case. > > I think we can handle this case by applying the second patch > below to ignore write-combining on I/O space for all arches, not > just powerpc. > > 2) We're using sysfs to mmap resourceN (not resourceN_wc), and > the resource is prefetchable. On powerpc, we turn *on* > write-combining, even though the user didn't ask for it. > > I'm not sure this case is actually safe, because it changes the > ordering properties. If it *is* safe, we could enable write- > combining in pci_mmap_resource(), where we already have the > resource and it could be done for all arches. > > This case is not strictly necessary, except to avoid a > performance regression, because the user could have mapped > resourceN_wc to explicitly request write-combining. > Agreed. > > commit 4e712b691abc5b579e3e4327f56b0b7988bdd1cb > Author: Bjorn Helgaas > Date: Wed Jun 8 14:00:14 2016 -0500 > > microblaze/PCI: Remove useless __pci_mmap_set_pgprot() > > The microblaze __pci_mmap_set_pgprot() was apparently copied from powerpc, > where it computes either an uncacheable pgprot_t or a write-combining one. > But on microblaze, we always use the regular uncacheable pgprot_t. > > Remove the useless code in __pci_mmap_set_pgprot() and inline the > pgprot_noncached() at the only caller. > > Signed-off-by: Bjorn Helgaas > > diff --git a/arch/microblaze/pci/pci-common.c b/arch/microblaze/pci/pci-common.c > index 14cba60..1974567 100644 > --- a/arch/microblaze/pci/pci-common.c > +++ b/arch/microblaze/pci/pci-common.c > @@ -219,33 +219,6 @@ static struct resource *__pci_mmap_make_offset(struct pci_dev *dev, > } > > /* > - * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci > - * device mapping. > - */ > -static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp, > - pgprot_t protection, > - enum pci_mmap_state mmap_state, > - int write_combine) > -{ > - pgprot_t prot = protection; > - > - /* Write combine is always 0 on non-memory space mappings. On > - * memory space, if the user didn't pass 1, we check for a > - * "prefetchable" resource. This is a bit hackish, but we use > - * this to workaround the inability of /sysfs to provide a write > - * combine bit > - */ > - if (mmap_state != pci_mmap_mem) > - write_combine = 0; > - else if (write_combine == 0) { > - if (rp->flags & IORESOURCE_PREFETCH) > - write_combine = 1; > - } > - > - return pgprot_noncached(prot); > -} > - > -/* > * This one is used by /dev/mem and fbdev who have no clue about the > * PCI device, it tries to find the PCI device first and calls the > * above routine > @@ -317,9 +290,7 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma, > return -EINVAL; > > vma->vm_pgoff = offset >> PAGE_SHIFT; > - vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp, > - vma->vm_page_prot, > - mmap_state, write_combine); > + vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot); > > ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff, > vma->vm_end - vma->vm_start, vma->vm_page_prot); > Acked-by: Yinghai Lu > > > commit 962972ee5e0ba6ceb680cb182bad65f8886586a6 > Author: Bjorn Helgaas > Date: Wed Jun 8 14:46:54 2016 -0500 > > PCI: Ignore write-combining when mapping I/O port space > > PCI exposes files like /proc/bus/pci/00/00.0 in procfs. These files > support operations like this: > > ioctl(fd, PCIIOC_MMAP_IS_IO); # request I/O port space > ioctl(fd, PCIIOC_WRITE_COMBINE, 1); # request write-combining > mmap(fd, ...) > > Many architectures don't allow mmap of I/O port space at all, but I don't > think it makes sense to do a write-combining mapping on the ones that do. > We could change proc_bus_pci_ioctl() so the user could never enable write- > combining for I/O port space, but that would break the following sequence, > which is currently legal: > > mmap(fd, ...) # default is I/O, non-combining > ioctl(fd, PCIIOC_WRITE_COMBINE, 1); # request write-combining > ioctl(fd, PCIIOC_MMAP_IS_MEM); # request memory space > mmap(fd, ...) > > Ignore the write-combining flag when mapping I/O port space. > > Signed-off-by: Bjorn Helgaas > > diff --git a/drivers/pci/proc.c b/drivers/pci/proc.c > index 3f155e7..21f8d613 100644 > --- a/drivers/pci/proc.c > +++ b/drivers/pci/proc.c > @@ -247,7 +247,8 @@ static int proc_bus_pci_mmap(struct file *file, struct vm_area_struct *vma) > > ret = pci_mmap_page_range(dev, vma, > fpriv->mmap_state, > - fpriv->write_combine); > + (fpriv->mmap_state == pci_mmap_mem) ? > + fpriv->write_combine : 0); > if (ret < 0) > return ret; > ok to me. At the same time, can you kill __pci_mmap_set_pgprot() for powerpc. * above routine @@ -458,9 +428,10 @@ int pci_mmap_page_range(struct pci_dev *dev, struct vm_area_struct *vma, return -EINVAL; vma->vm_pgoff = offset >> PAGE_SHIFT; - vma->vm_page_prot = __pci_mmap_set_pgprot(dev, rp, - vma->vm_page_prot, - mmap_state, write_combine); + if (write_combine) + vma->vm_page_prot = pgprot_noncached_wc(protection); + else + vma->vm_page_prot = pgprot_noncached(protection); ret = remap_pfn_range(vma, vma->vm_start, vma->vm_pgoff, vma->vm_end - vma->vm_start, vma->vm_page_prot); diff --git a/arch/powerpc/kernel/pci-common.c b/arch/powerpc/kernel/pci-common.c index 0f7a60f..0d0148d 100644 --- a/arch/powerpc/kernel/pci-common.c +++ b/arch/powerpc/kernel/pci-common.c @@ -356,36 +356,6 @@ static struct resource *__pci_mmap_make_offset(struct pci_dev *dev, } /* - * Set vm_page_prot of VMA, as appropriate for this architecture, for a pci - * device mapping. - */ -static pgprot_t __pci_mmap_set_pgprot(struct pci_dev *dev, struct resource *rp, - pgprot_t protection, - enum pci_mmap_state mmap_state, - int write_combine) -{ - - /* Write combine is always 0 on non-memory space mappings. On - * memory space, if the user didn't pass 1, we check for a - * "prefetchable" resource. This is a bit hackish, but we use - * this to workaround the inability of /sysfs to provide a write - * combine bit - */ - if (mmap_state != pci_mmap_mem) - write_combine = 0; - else if (write_combine == 0) { - if (rp->flags & IORESOURCE_PREFETCH) - write_combine = 1; - } - - /* XXX would be nice to have a way to ask for write-through */ - if (write_combine) - return pgprot_noncached_wc(protection); - else - return pgprot_noncached(protection); -} - -/* * This one is used by /dev/mem and fbdev who have no clue about the * PCI device, it tries to find the PCI device first and calls the