Message ID | 20161222202803.GA16855@bhelgaas-glaptop.roam.corp.google.com (mailing list archive) |
---|---|
State | New, archived |
Delegated to: | Bjorn Helgaas |
Hi Bjorn

On Thu, Dec 22, 2016 at 02:28:03PM -0600, Bjorn Helgaas wrote:
> On Thu, Dec 22, 2016 at 05:27:14PM +0100, Joerg Roedel wrote:
> > Hi Bjorn,
> >
> > On Mon, Dec 19, 2016 at 03:20:44PM -0600, Bjorn Helgaas wrote:
> > > I have some questions about dmar_init_reserved_ranges().  On systems
> > > where CPU physical address space is not identity-mapped to PCI bus
> > > address space, e.g., where the PCI host bridge windows have _TRA
> > > offsets, I'm not sure we're doing the right thing.
> > >
> > > Assume we have a PCI host bridge with _TRA that maps CPU addresses
> > > 0x80000000-0x9fffffff to PCI bus addresses 0x00000000-0x1fffffff, with
> > > two PCI devices below it:

This is the first time I'm hearing about it too, and tracked it to 2002, one of Bjorn's patches from a past life :-)

> > >
> > >   PCI host bridge domain 0000 [bus 00-3f]
> > >   PCI host bridge window [mem 0x80000000-0x9fffffff] (bus 0x00000000-0x1fffffff)
> > >     00:00.0: BAR 0 [mem 0x80000000-0x8fffffff] (0x00000000-0x0fffffff on bus)
> > >     00:01.0: BAR 0 [mem 0x90000000-0x9fffffff] (0x10000000-0x1fffffff on bus)
> > >
> > > The IOMMU init code in dmar_init_reserved_ranges() reserves the PCI
> > > MMIO space for all devices:
> > >
> > >   pci_iommu_init()
> > >     intel_iommu_init()
> > >       dmar_init_reserved_ranges()
> > >         reserve_iova(0x80000000-0x8fffffff)
> > >         reserve_iova(0x90000000-0x9fffffff)
> > >
> > > This looks odd because we're reserving CPU physical addresses, but
> > > the IOVA space contains *PCI bus* addresses.  On most x86 systems they
> > > would be the same, but not on all.
> >
> > Interesting, I wasn't aware of that. Looks like we are not doing the
> > right thing in dmar_init_reserved_ranges(). How is that handled without
> > an IOMMU, when the bus-addresses overlap with ram addresses?

I'm not sure if there are platforms that I'm aware of that do _TRA. I'm checking internally if others have come across something like that.

>
> I don't know enough about these systems to answer that.  One way would
> be to avoid overlaps, e.g., by using bus addresses
> 0x80000000-0xffffffff and not putting RAM at those addresses.  Or
> maybe the host bridge could apply a constant offset to bus addresses
> before forwarding transactions up to the system bus.
>
> > > Assume the driver for 00:00.0 maps a page of main memory for DMA.  It
> > > may receive a dma_addr_t of 0x10000000:
> > >
> > >   00:00.0: intel_map_page() returns dma_addr_t 0x10000000
> > >   00:00.0: issues DMA to 0x10000000
> > >
> > > What happens here?  The DMA access should go to main memory.  In
> > > conventional PCI it would be a peer-to-peer access to device 00:01.0.
> > > Is there enough PCIe smarts (ACS or something?) to do otherwise?
> >
> > If there is a bridge doing ACS between the devices, the IOMMU will see
> > the request and re-map it to its RAM address.

True, if it's all ACS enabled, we don't need this, probably true for legacy. But it doesn't matter in the big scheme of things to reserve.

> >
> > > The dmar_init_reserved_ranges() comment says "Reserve all PCI MMIO to
> > > avoid peer-to-peer access."  Without _TRA, CPU addresses and PCI bus
> > > addresses would be identical, and I think these reserve_iova() calls
> > > *would* prevent this situation.  So maybe we're just missing a
> > > pcibios_resource_to_bus() here?
> >
> > I'll have a look, the AMD IOMMU driver implements this too, so it needs
> > also be fixed there. Do you know which x86 systems are configured like
> > this?
>

Let me check and keep you posted if we have such platforms, to make sure whether we need these considerations for _TRA.

Cheers,
Ashok
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
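For readers unfamiliar with _TRA, the CPU-to-bus translation in Bjorn's example can be sketched in plain user-space C. This is only an illustrative model, not kernel code: `struct window_model`, `cpu_to_bus()` and `example_window` are hypothetical names invented for this sketch, and it assumes the bus address is simply the CPU address minus a constant per-window offset, which is my understanding of what pcibios_resource_to_bus() does for a host bridge window.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one host bridge window; not a kernel type. */
struct window_model {
    uint64_t cpu_start; /* first CPU physical address in the window */
    uint64_t cpu_end;   /* last CPU physical address (inclusive) */
    uint64_t offset;    /* CPU address minus bus address (from _TRA) */
};

/* The window from the example: CPU 0x80000000-0x9fffffff -> bus 0x0-0x1fffffff. */
static const struct window_model example_window = {
    .cpu_start = 0x80000000ULL,
    .cpu_end   = 0x9fffffffULL,
    .offset    = 0x80000000ULL,
};

/* Translate a CPU physical address inside the window to a PCI bus address. */
static uint64_t cpu_to_bus(const struct window_model *w, uint64_t cpu_addr)
{
    /* The address must lie inside the window for the offset to apply. */
    assert(cpu_addr >= w->cpu_start && cpu_addr <= w->cpu_end);
    return cpu_addr - w->offset;
}
```

With this model, the BAR of 00:01.0 at CPU address 0x90000000 translates to bus address 0x10000000, which is the same value as the dma_addr_t in the scenario described in the message.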
Hi Bjorn

None in the platform group say they know about this. So I'm fairly sure we don't do that on Intel hardware (x86). I'm not sure about the usage; it appears maybe it was a hack pre-virtualization for some direct access? (just wild guessing)

On Thu, Dec 22, 2016 at 03:32:38PM -0800, Raj, Ashok wrote:
> Let me check and keep you posted if we have such platforms to make sure if
> we need this considerations for _TRA.

Cheers,
Ashok
Hi Ashok,

On Thu, Dec 22, 2016 at 03:45:08PM -0800, Raj, Ashok wrote:
> Hi Bjorn
>
> None in the platform group say they know about this. So i'm fairly sure
> we don't do that on Intel hardware (x86).

I'm pretty sure there was once an x86 prototype for which PCI bus addresses were not identical to CPU physical addresses, but I have no idea whether it shipped that way.

Even if such a system never shipped, the x86 arch code supports _TRA, and there's no reason to make the unnecessary assumption in this code that _TRA is always zero.

If we didn't want to use pcibios_resource_to_bus() here for some reason, we should at least add a comment about why we think it's OK to use a CPU physical address as an IOVA.

Bjorn
On Thu, Dec 22, 2016 at 06:48:01PM -0600, Bjorn Helgaas wrote:
> If we didn't want to use pcibios_resource_to_bus() here for some
> reason, we should at least add a comment about why we think it's OK to
> use a CPU physical address as an IOVA.

Even if there are no such x86 systems out there, I think it doesn't hurt to handle the possibility correctly in the IOMMU drivers.

Joerg
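To illustrate why the choice between CPU and bus addresses matters for the reservation, here is a small user-space sketch using the numbers from the thread. It is a toy model, not the kernel IOVA allocator: `struct pfn_range`, `MODEL_PFN()`, `addr_is_reserved()` and the two tables are hypothetical names, and 4 KiB pages are assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT 12 /* assumption: 4 KiB pages */
#define MODEL_PFN(addr)  ((uint64_t)(addr) >> MODEL_PAGE_SHIFT)

/* Inclusive range of page frame numbers that must not be handed out as IOVAs. */
struct pfn_range {
    uint64_t lo;
    uint64_t hi;
};

/* Reservation built from CPU physical addresses of the two BARs. */
static const struct pfn_range reserved_by_cpu_addr[] = {
    { MODEL_PFN(0x80000000ULL), MODEL_PFN(0x8fffffffULL) },
    { MODEL_PFN(0x90000000ULL), MODEL_PFN(0x9fffffffULL) },
};

/* Reservation built from the corresponding PCI bus addresses. */
static const struct pfn_range reserved_by_bus_addr[] = {
    { MODEL_PFN(0x00000000ULL), MODEL_PFN(0x0fffffffULL) },
    { MODEL_PFN(0x10000000ULL), MODEL_PFN(0x1fffffffULL) },
};

/* Return true if addr falls inside any of the n reserved ranges. */
static bool addr_is_reserved(const struct pfn_range *r, size_t n, uint64_t addr)
{
    uint64_t pfn = MODEL_PFN(addr);

    assert(r != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (pfn >= r[i].lo && pfn <= r[i].hi)
            return true;
    return false;
}
```

In this model, the DMA address 0x10000000 is not covered when the table is built from CPU addresses, so an allocator could hand it out even though it aliases the BAR of 00:01.0 on the bus. With the table built from bus addresses, the same address is reserved.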
diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c
index c66c273..be78ab7 100644
--- a/drivers/iommu/intel-iommu.c
+++ b/drivers/iommu/intel-iommu.c
@@ -1865,6 +1865,7 @@ static struct lock_class_key reserved_rbtree_key;
 static int dmar_init_reserved_ranges(void)
 {
 	struct pci_dev *pdev = NULL;
+	struct pci_bus_region region;
 	struct iova *iova;
 	int i;
 
@@ -1890,9 +1891,11 @@ static int dmar_init_reserved_ranges(void)
 			r = &pdev->resource[i];
 			if (!r->flags || !(r->flags & IORESOURCE_MEM))
 				continue;
+
+			pcibios_resource_to_bus(pdev->bus, &region, r);
 			iova = reserve_iova(&reserved_iova_list,
-					    IOVA_PFN(r->start),
-					    IOVA_PFN(r->end));
+					    IOVA_PFN(region.start),
+					    IOVA_PFN(region.end));
 			if (!iova) {
 				pr_err("Reserve iova failed\n");
 				return -ENODEV;