Message ID | 1460074573-7481-5-git-send-email-yinghai@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Headers | show |
[+cc Ben, Michael] On Thu, Apr 07, 2016 at 05:15:17PM -0700, Yinghai Lu wrote: > After we added 64bit mmio parsing, we got some "no compatible bridge window" > warning on anther new model that support 64bit resource. > > It turns out that we can not use mem_space.start as 64bit mem space > offset, aka there is mem_space.start != offset. > > Use child_phys_addr to calculate exact offset and record offset in > pbm. > > After patch we get correct offset. > > /pci@305: PCI IO [io 0x2007e00000000-0x2007e0fffffff] offset 2007e00000000 > /pci@305: PCI MEM [mem 0x2000000100000-0x200007effffff] offset 2000000000000 > /pci@305: PCI MEM64 [mem 0x2000100000000-0x2000dffffffff] offset 2000000000000 > ... > pci_sun4v f02ae7f8: PCI host bridge to bus 0000:00 > pci_bus 0000:00: root bus resource [io 0x2007e00000000-0x2007e0fffffff] (bus address [0x0000-0xfffffff]) > pci_bus 0000:00: root bus resource [mem 0x2000000100000-0x200007effffff] (bus address [0x00100000-0x7effffff]) > pci_bus 0000:00: root bus resource [mem 0x2000100000000-0x2000dffffffff] (bus address [0x100000000-0xdffffffff]) > ... > @@ -733,30 +733,32 @@ int pcibios_enable_device(struct pci_dev *dev, int mask) > static int __pci_mmap_make_offset_bus(struct pci_dev *pdev, struct vm_area_struct *vma, > enum pci_mmap_state mmap_state) > { > - struct pci_pbm_info *pbm = pdev->dev.archdata.host_controller; > - unsigned long space_size, user_offset, user_size; > - > - if (mmap_state == pci_mmap_io) { > - space_size = resource_size(&pbm->io_space); > - } else { > - space_size = resource_size(&pbm->mem_space); > - } > + unsigned long user_offset, user_size; > + struct resource res, *root_bus_res; > + struct pci_bus_region region; > + struct pci_bus *bus; > > /* Make sure the request is in range. */ > user_offset = vma->vm_pgoff << PAGE_SHIFT; > user_size = vma->vm_end - vma->vm_start; > > - if (user_offset >= space_size || > - (user_offset + user_size) > space_size) > + region.start = user_offset; > + region.end = user_offset + user_size - 1; > + memset(&res, 0, sizeof(res)); > + if (mmap_state == pci_mmap_io) > + res.flags = IORESOURCE_IO; > + else > + res.flags = IORESOURCE_MEM; > + > + pcibios_bus_to_resource(pdev->bus, &res, ®ion); > + bus = pdev->bus; > + while (bus->parent) > + bus = bus->parent; > + root_bus_res = pci_find_bus_resource(bus, &res); > + if (!root_bus_res) > return -EINVAL; > > - if (mmap_state == pci_mmap_io) { > - vma->vm_pgoff = (pbm->io_space.start + > - user_offset) >> PAGE_SHIFT; > - } else { > - vma->vm_pgoff = (pbm->mem_space.start + > - user_offset) >> PAGE_SHIFT; > - } > + vma->vm_pgoff = res.start >> PAGE_SHIFT; > > return 0; > } I'm kind of confused here. There are two ways to mmap PCI BARs: /proc/bus/pci/00/02.0 (proc_bus_pci_mmap()): all BARs in one file; MEM/IO determined by ioctl() mmap offset is a CPU physical address in the PCI resource /sys/devices/pci0000:00/0000:00:02.0/resource0 (pci_mmap_resource()): one file per BAR; MEM/IO determined by BAR type mmap offset is between 0 and BAR size Both proc_bus_pci_mmap() and pci_mmap_resource() validate the requested area with pci_mmap_fits() before calling pci_mmap_page_range(). In the proc_bus_pci_mmap() path, the offset in vma->vm_pgoff must be within the pdev->resource[], so the user must be supplying a CPU physical address (not an address obtained from pci_resource_to_user()). That vma->vm_pgoff is passed unchanged to pci_mmap_page_range(). In the pci_mmap_resource() path, vma->vm_pgoff must be between 0 and the BAR size. Then we add in the pci_resource_to_user() information before passing it to pci_mmap_page_range(). The comment in pci_mmap_resource() says pci_mmap_page_range() expects a "user visible" address, but I don't really believe that based on how proc_bus_pci_mmap() works. Do both proc_bus_pci_mmap() and pci_mmap_resource() work on sparc? It looks like they call pci_mmap_page_range() with different assumptions, so I don't see how they can both work. In any case, pci_mmap_page_range() on sparc converts that vma->vm_pgoff back to a CPU physical address, so there's an awful lot of work going on in the pci_mmap_resource() path to convert the mmap offset to a "user" address and then convert it back again. There's also quite a bit of validation done in the arch code (in __pci_mmap_make_offset() and __pci_mmap_make_offset_bus()) that looks partly redundant with pci_mmap_fits() and not necessarily arch-specific. As far as I can see, pci_mmap_resource() doesn't need to have any connection at all with pci_resource_to_user(). All it needs is the pdev->resource[] (which has the CPU physical address of the BAR) and vma->vm_pgoff (the offset into the BAR). I don't think pci_mmap_resource() should call pci_resource_to_user(), and I think pci_mmap_page_range() should expect a normal VMA that contains a valid CPU PFN in vm_pgoff (or a valid CPU I/O port number, in the case of an I/O port mmap). The original pci_resource_to_user() was added for powerpc by 2311b1f2bbd3 ("[PATCH] PCI: fix-pci-mmap-on-ppc-and-ppc64.patch"), but I couldn't find the linux-pci discussion it mentions. > @@ -977,16 +979,12 @@ void pci_resource_to_user(const struct pci_dev *pdev, int bar, > const struct resource *rp, resource_size_t *start, > resource_size_t *end) > { > - struct pci_pbm_info *pbm = pdev->dev.archdata.host_controller; > - unsigned long offset; > + struct pci_bus_region region; > > - if (rp->flags & IORESOURCE_IO) > - offset = pbm->io_space.start; > - else > - offset = pbm->mem_space.start; > + pcibios_resource_to_bus(pdev->bus, ®ion, (struct resource *)rp); > > - *start = rp->start - offset; > - *end = rp->end - offset; > + *start = region.start; > + *end = region.end; > } -- To unsubscribe from this list: send the line "unsubscribe linux-pci" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/sparc/kernel/pci.c b/arch/sparc/kernel/pci.c index badf095..4606dc1 100644 --- a/arch/sparc/kernel/pci.c +++ b/arch/sparc/kernel/pci.c @@ -654,12 +654,12 @@ struct pci_bus *pci_scan_one_pbm(struct pci_pbm_info *pbm, printk("PCI: Scanning PBM %s\n", node->full_name); pci_add_resource_offset(&resources, &pbm->io_space, - pbm->io_space.start); + pbm->io_offset); pci_add_resource_offset(&resources, &pbm->mem_space, - pbm->mem_space.start); + pbm->mem_offset); if (pbm->mem64_space.flags) pci_add_resource_offset(&resources, &pbm->mem64_space, - pbm->mem_space.start); + pbm->mem64_offset); pbm->busn.start = pbm->pci_first_busno; pbm->busn.end = pbm->pci_last_busno; pbm->busn.flags = IORESOURCE_BUS; @@ -733,30 +733,32 @@ int pcibios_enable_device(struct pci_dev *dev, int mask) static int __pci_mmap_make_offset_bus(struct pci_dev *pdev, struct vm_area_struct *vma, enum pci_mmap_state mmap_state) { - struct pci_pbm_info *pbm = pdev->dev.archdata.host_controller; - unsigned long space_size, user_offset, user_size; - - if (mmap_state == pci_mmap_io) { - space_size = resource_size(&pbm->io_space); - } else { - space_size = resource_size(&pbm->mem_space); - } + unsigned long user_offset, user_size; + struct resource res, *root_bus_res; + struct pci_bus_region region; + struct pci_bus *bus; /* Make sure the request is in range. */ user_offset = vma->vm_pgoff << PAGE_SHIFT; user_size = vma->vm_end - vma->vm_start; - if (user_offset >= space_size || - (user_offset + user_size) > space_size) + region.start = user_offset; + region.end = user_offset + user_size - 1; + memset(&res, 0, sizeof(res)); + if (mmap_state == pci_mmap_io) + res.flags = IORESOURCE_IO; + else + res.flags = IORESOURCE_MEM; + + pcibios_bus_to_resource(pdev->bus, &res, ®ion); + bus = pdev->bus; + while (bus->parent) + bus = bus->parent; + root_bus_res = pci_find_bus_resource(bus, &res); + if (!root_bus_res) return -EINVAL; - if (mmap_state == pci_mmap_io) { - vma->vm_pgoff = (pbm->io_space.start + - user_offset) >> PAGE_SHIFT; - } else { - vma->vm_pgoff = (pbm->mem_space.start + - user_offset) >> PAGE_SHIFT; - } + vma->vm_pgoff = res.start >> PAGE_SHIFT; return 0; } @@ -977,16 +979,12 @@ void pci_resource_to_user(const struct pci_dev *pdev, int bar, const struct resource *rp, resource_size_t *start, resource_size_t *end) { - struct pci_pbm_info *pbm = pdev->dev.archdata.host_controller; - unsigned long offset; + struct pci_bus_region region; - if (rp->flags & IORESOURCE_IO) - offset = pbm->io_space.start; - else - offset = pbm->mem_space.start; + pcibios_resource_to_bus(pdev->bus, ®ion, (struct resource *)rp); - *start = rp->start - offset; - *end = rp->end - offset; + *start = region.start; + *end = region.end; } void pcibios_set_master(struct pci_dev *dev) diff --git a/arch/sparc/kernel/pci_common.c b/arch/sparc/kernel/pci_common.c index 33524c1..76998f8 100644 --- a/arch/sparc/kernel/pci_common.c +++ b/arch/sparc/kernel/pci_common.c @@ -410,13 +410,16 @@ void pci_determine_mem_io_space(struct pci_pbm_info *pbm) for (i = 0; i < num_pbm_ranges; i++) { const struct linux_prom_pci_ranges *pr = &pbm_ranges[i]; - unsigned long a, size; + unsigned long a, size, region_a; u32 parent_phys_hi, parent_phys_lo; + u32 child_phys_mid, child_phys_lo; u32 size_hi, size_lo; int type; parent_phys_hi = pr->parent_phys_hi; parent_phys_lo = pr->parent_phys_lo; + child_phys_mid = pr->child_phys_mid; + child_phys_lo = pr->child_phys_lo; if (tlb_type == hypervisor) parent_phys_hi &= 0x0fffffff; @@ -426,6 +429,8 @@ void pci_determine_mem_io_space(struct pci_pbm_info *pbm) type = (pr->child_phys_hi >> 24) & 0x3; a = (((unsigned long)parent_phys_hi << 32UL) | ((unsigned long)parent_phys_lo << 0UL)); + region_a = (((unsigned long)child_phys_mid << 32UL) | + ((unsigned long)child_phys_lo << 0UL)); size = (((unsigned long)size_hi << 32UL) | ((unsigned long)size_lo << 0UL)); @@ -440,6 +445,7 @@ void pci_determine_mem_io_space(struct pci_pbm_info *pbm) pbm->io_space.start = a; pbm->io_space.end = a + size - 1UL; pbm->io_space.flags = IORESOURCE_IO; + pbm->io_offset = a - region_a; saw_io = 1; break; @@ -448,6 +454,7 @@ void pci_determine_mem_io_space(struct pci_pbm_info *pbm) pbm->mem_space.start = a; pbm->mem_space.end = a + size - 1UL; pbm->mem_space.flags = IORESOURCE_MEM; + pbm->mem_offset = a - region_a; saw_mem = 1; break; @@ -456,6 +463,7 @@ void pci_determine_mem_io_space(struct pci_pbm_info *pbm) pbm->mem64_space.start = a; pbm->mem64_space.end = a + size - 1UL; pbm->mem64_space.flags = IORESOURCE_MEM; + pbm->mem64_offset = a - region_a; saw_mem = 1; break; @@ -471,14 +479,22 @@ void pci_determine_mem_io_space(struct pci_pbm_info *pbm) prom_halt(); } - printk("%s: PCI IO[%llx] MEM[%llx]", - pbm->name, - pbm->io_space.start, - pbm->mem_space.start); + if (pbm->io_space.flags) + printk("%s: PCI IO %pR offset %llx\n", + pbm->name, &pbm->io_space, pbm->io_offset); + if (pbm->mem_space.flags) + printk("%s: PCI MEM %pR offset %llx\n", + pbm->name, &pbm->mem_space, pbm->mem_offset); + if (pbm->mem64_space.flags && pbm->mem_space.flags) { + if (pbm->mem64_space.start <= pbm->mem_space.end) + pbm->mem64_space.start = pbm->mem_space.end + 1; + if (pbm->mem64_space.start > pbm->mem64_space.end) + pbm->mem64_space.flags = 0; + } + if (pbm->mem64_space.flags) - printk(" MEM64[%llx]", - pbm->mem64_space.start); - printk("\n"); + printk("%s: PCI MEM64 %pR offset %llx\n", + pbm->name, &pbm->mem64_space, pbm->mem64_offset); pbm->io_space.name = pbm->mem_space.name = pbm->name; pbm->mem64_space.name = pbm->name; diff --git a/arch/sparc/kernel/pci_impl.h b/arch/sparc/kernel/pci_impl.h index 37222ca..2853af7 100644 --- a/arch/sparc/kernel/pci_impl.h +++ b/arch/sparc/kernel/pci_impl.h @@ -99,6 +99,10 @@ struct pci_pbm_info { struct resource mem_space; struct resource mem64_space; struct resource busn; + /* offset */ + resource_size_t io_offset; + resource_size_t mem_offset; + resource_size_t mem64_offset; /* Base of PCI Config space, can be per-PBM or shared. */ unsigned long config_space;