| Message ID | 1359399397-29729-6-git-send-email-thomas.petazzoni@free-electrons.com (mailing list archive) |
|---|---|
| State | New, archived |
| Delegated to | Bjorn Helgaas |
Russell, As the arch/arm/kernel/ maintainer, what is your position regarding the below patch? Thanks for your review, Thomas On Mon, 28 Jan 2013 19:56:14 +0100, Thomas Petazzoni wrote: > The PCI specifications says that an I/O region must be aligned on a 4 > KB boundary, and a memory region aligned on a 1 MB boundary. > > However, the Marvell PCIe interfaces rely on address decoding windows > (which allow to associate a range of physical addresses with a given > device). For PCIe memory windows, those windows are defined with a 1 > MB granularity (which matches the PCI specs), but PCIe I/O windows can > only be defined with a 64 KB granularity, so they have to be 64 KB > aligned. We therefore need to tell the PCI core about this special > alignement requirement. > > The PCI core already calls pcibios_align_resource() in the ARM PCI > core, specifically for such purposes. So this patch extends the ARM > PCI core so that it calls a ->align_resource() hook registered by the > PCI driver, exactly like the existing ->map_irq() and ->swizzle() > hooks. > > A particular PCI driver can register a align_resource() hook, and do > its own specific alignement, depending on the specific constraints of > the underlying hardware. > > Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com> > Cc: Russell King <linux@arm.linux.org.uk> > --- > arch/arm/include/asm/mach/pci.h | 11 +++++++++++ > arch/arm/kernel/bios32.c | 6 ++++++ > 2 files changed, 17 insertions(+) > > diff --git a/arch/arm/include/asm/mach/pci.h b/arch/arm/include/asm/mach/pci.h > index 5cf2e97..7d2c3c8 100644 > --- a/arch/arm/include/asm/mach/pci.h > +++ b/arch/arm/include/asm/mach/pci.h > @@ -30,6 +30,11 @@ struct hw_pci { > void (*postinit)(void); > u8 (*swizzle)(struct pci_dev *dev, u8 *pin); > int (*map_irq)(const struct pci_dev *dev, u8 slot, u8 pin); > + resource_size_t (*align_resource)(struct pci_dev *dev, > + const struct resource *res, > + resource_size_t start, > + resource_size_t size, > + resource_size_t align); > }; > > /* > @@ -51,6 +56,12 @@ struct pci_sys_data { > u8 (*swizzle)(struct pci_dev *, u8 *); > /* IRQ mapping */ > int (*map_irq)(const struct pci_dev *, u8, u8); > + /* Resource alignement requirements */ > + resource_size_t (*align_resource)(struct pci_dev *dev, > + const struct resource *res, > + resource_size_t start, > + resource_size_t size, > + resource_size_t align); > void *private_data; /* platform controller private data */ > }; > > diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c > index 5401645..be2e6c9 100644 > --- a/arch/arm/kernel/bios32.c > +++ b/arch/arm/kernel/bios32.c > @@ -462,6 +462,7 @@ static void __init pcibios_init_hw(struct hw_pci *hw, struct list_head *head) > sys->busnr = busnr; > sys->swizzle = hw->swizzle; > sys->map_irq = hw->map_irq; > + sys->align_resource = hw->align_resource; > INIT_LIST_HEAD(&sys->resources); > > if (hw->private_data) > @@ -574,6 +575,8 @@ char * __init pcibios_setup(char *str) > resource_size_t pcibios_align_resource(void *data, const struct resource *res, > resource_size_t size, resource_size_t align) > { > + struct pci_dev *dev = data; > + struct pci_sys_data *sys = dev->sysdata; > resource_size_t start = res->start; > > if (res->flags & IORESOURCE_IO && start & 0x300) > @@ -581,6 +584,9 @@ resource_size_t pcibios_align_resource(void *data, const struct resource *res, > > start = (start + align - 1) & ~(align - 1); > > + if (sys->align_resource) > + return sys->align_resource(dev, res, start, size, align); > + > return start; > } >
On Tue, Jan 29, 2013 at 04:12:11PM +0100, Thomas Petazzoni wrote:
> Russell,
>
> As the arch/arm/kernel/ maintainer, what is your position regarding the
> below patch?

Given the description, I'd feel much happier with this if we specified
the alignment numerically rather than allowing "some random code" to do
something with the passed values.
Russell,

Thanks for your quick feedback!

On Tue, 29 Jan 2013 15:15:01 +0000, Russell King - ARM Linux wrote:

> Given the description, I'd feel much happier with this if we specified
> the alignment numerically rather than allowing "some random code" to do
> something with the passed values.

So, you'd prefer to have two new members added in the hw_pci structure
to give the alignment requirements for I/O regions and memory regions?
Something like:

	struct hw_pci {
		[...]
		unsigned long io_align;
		unsigned long mem_align;
	};

If that's fine with you, I'll go ahead and change the implementation in
this direction. As long as I can express my special 64 KB alignment
requirement for I/O regions, I'm just fine :-)

Thanks again,

Thomas
On Tue, Jan 29, 2013 at 04:23:37PM +0100, Thomas Petazzoni wrote:
> Russell,
>
> Thanks for your quick feedback!
>
> On Tue, 29 Jan 2013 15:15:01 +0000, Russell King - ARM Linux wrote:
>
> > Given the description, I'd feel much happier with this if we specified
> > the alignment numerically rather than allowing "some random code" to do
> > something with the passed values.
>
> So, you'd prefer to have two new members added in the hw_pci structure
> to give the alignment requirements for I/O regions and memory regions?

Yep, otherwise we'll have yet more code to review rather than one
algorithm with a set of numbers...

I work on the principle that if something can be expressed numerically,
that's always better than expressing it with code.
Russell,

On Tue, 29 Jan 2013 15:25:18 +0000, Russell King - ARM Linux wrote:

> Yep, otherwise we'll have yet more code to review rather than one
> algorithm with a set of numbers...
>
> I work on the principle that if something can be expressed numerically,
> that's always better than expressing it with code.

Having a hook allows for more flexibility (for example having special
alignment requirements depending on the device or something like that),
but I don't need this flexibility for the specific case I'm interested
in. So as you suggest, let's not over-engineer this, and use numerical
values. I'll rework my patch according to this suggestion.

Thanks,

Thomas
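To make the alternative being agreed on here concrete, the following is a minimal sketch of the purely numeric approach, under the assumption of hypothetical io_align/mem_align fields; it is illustration only, not code from any posted patch.

```c
/*
 * Hypothetical sketch of the "two numbers instead of a hook" variant
 * discussed above. The io_align/mem_align names are assumptions.
 */
#include <linux/ioport.h>
#include <linux/kernel.h>
#include <linux/sizes.h>
#include <linux/types.h>

struct pci_align_params {
	resource_size_t io_align;	/* e.g. SZ_64K for Marvell PCIe I/O windows */
	resource_size_t mem_align;	/* e.g. SZ_1M for PCIe memory windows */
};

/* What pcibios_align_resource() could do with plain numbers. */
static resource_size_t apply_numeric_align(const struct pci_align_params *p,
					   const struct resource *res,
					   resource_size_t start)
{
	if ((res->flags & IORESOURCE_IO) && p->io_align)
		return ALIGN(start, p->io_align);
	if ((res->flags & IORESOURCE_MEM) && p->mem_align)
		return ALIGN(start, p->mem_align);
	return start;
}
```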
On Tue, Jan 29, 2013 at 04:12:11PM +0100, Thomas Petazzoni wrote:
> On Mon, 28 Jan 2013 19:56:14 +0100, Thomas Petazzoni wrote:
> > The PCI specifications says that an I/O region must be aligned on a 4
> > KB boundary, and a memory region aligned on a 1 MB boundary.

BTW, this, as a general statement, is wrong - though it really depends
what you mean by "region".

Remember that BARs can be set where-ever provided that they satisfy
their _individual_ alignment requirements. So, an IO BAR which occupies
16 bytes must be set to a 16-byte boundary.

Now, there's an additional complication there which occurs if you have
ISA devices sharing the PCI IO space: ISA devices used to only decode 10
bits of IO space, which means that their registers repeat throughout the
IO space. Therefore, it is generally accepted that within any 1K block,
only the first 256 locations are usable. Moreover, some PCI cards have
taken advantage of this, particularly VGA cards. For example, S3 VGA
cards put different registers on 1K multiples of the standard PC VGA IO
addresses...

Also, another reason why I suspect your statement is wrong, if I were to
interpret "region" as "BAR", is this: consider a bunch of PCI
peripherals behind a PCI bridge, where the total number of IO BARs on
the peripherals is 16. If you allocate each of those IO BARs to be 4K
aligned, then you consume all 64K of IO space behind one bridge, which
leaves no space for any other IO peripherals elsewhere in the bus
structure.

The last reason I think that interpretation is wrong is that, on this
PC, I see:

	Region 4: I/O ports at 1800 [size=8]
	Region 0: I/O ports at 1830 [size=8]
	Region 2: I/O ports at 1840 [size=32]
	Region 4: I/O ports at 1860 [size=32]
	Region 4: I/O ports at 1880 [size=32]
	Region 4: I/O ports at 18a0 [size=32]
	Region 4: I/O ports at 18c0 [size=32]
	Region 4: I/O ports at 18e0 [size=32]
	Region 4: I/O ports at 1c00 [size=32]
	Region 0: I/O ports at 1c48 [size=8]
	Region 1: I/O ports at 183c [size=4]
	Region 2: I/O ports at 1c40 [size=8]
	Region 3: I/O ports at 1838 [size=4]
	Region 4: I/O ports at 1c20 [size=32]
	Region 4: I/O ports at 1c60 [size=32]

which doesn't follow any 4K alignment - and I can pull out other
examples on other x86 platforms where IO BARs aren't aligned to 4K...
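For readers less familiar with the sizing rule being invoked here: a BAR's required alignment is its own power-of-two size, which software discovers by writing all-ones to the BAR and reading it back. A small stand-alone illustration of that rule (not kernel code, values are made up to match the lspci output above):

```c
/*
 * Standard BAR sizing: the decoded size (and therefore the required
 * alignment) falls out of the bits the device leaves writable.
 */
#include <stdint.h>
#include <stdio.h>

#define PCI_BAR_IO		0x1
#define PCI_BAR_IO_MASK		0xfffffffcu
#define PCI_BAR_MEM_MASK	0xfffffff0u

/* 'readback' is the value read from the BAR after writing 0xffffffff. */
static uint32_t bar_size_and_alignment(uint32_t readback)
{
	uint32_t mask = (readback & PCI_BAR_IO) ? PCI_BAR_IO_MASK : PCI_BAR_MEM_MASK;
	return ~(readback & mask) + 1;
}

int main(void)
{
	/* A 16-byte I/O BAR reads back as 0xfffffff1: it only needs
	 * 16-byte alignment, not 4 KB. */
	printf("%u\n", bar_size_and_alignment(0xfffffff1));	/* 16 */
	printf("%u\n", bar_size_and_alignment(0xffffffe1));	/* 32 */
	return 0;
}
```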
Russell,

On Tue, 29 Jan 2013 15:58:20 +0000, Russell King - ARM Linux wrote:

> On Tue, Jan 29, 2013 at 04:12:11PM +0100, Thomas Petazzoni wrote:
> > On Mon, 28 Jan 2013 19:56:14 +0100, Thomas Petazzoni wrote:
> > > The PCI specifications says that an I/O region must be aligned on a 4
> > > KB boundary, and a memory region aligned on a 1 MB boundary.
>
> BTW, this, as a general statement, is wrong - though it really depends
> what you mean by "region".

Yes, sorry, my statement does not correctly reflect reality. My
knowledge of the PCI terminology is still quite fuzzy (as you found
out).

What I am referring to is that the PCI standard requires the I/O base
register of a PCI-to-PCI bridge to contain a 4 KB aligned address. From
the PCI-to-PCI Bridge Architecture Specification, Revision 1.1, section
3.2.5.6, I/O Base Register and I/O Limit Register:

"""
If a bridge implements an I/O address range, the upper 4 bits of both
the I/O Base and I/O Limit registers are writable and correspond to
address bits AD[15::12]. For the purpose of address decoding, the bridge
assumes that the lower 12 address bits, AD[11::00], of the I/O base
address (not implemented in the I/O Base register) are zero. Similarly,
the bridge assumes that the lower 12 address bits, AD[11::00], of the
I/O limit address (not implemented in the I/O Limit register) are FFFh.
Thus, the bottom of the defined I/O address range will be aligned to a 4
KB boundary and the top of the defined I/O address range will be one
less than a 4 KB boundary.
"""

And the Linux PCI resource allocation code complies with this, so that
if I have two PCI-to-PCI bridges (each having downstream a device with
an I/O BAR), then the first PCI-to-PCI bridge gets its I/O base address
register set to ADDR + 0x0, and the second bridge gets its I/O base
address set to ADDR + 0x1000. And this doesn't play well with the
requirements of Marvell address decoding windows for PCIe I/O regions,
which must be 64 KB aligned.

So I guess I should simply rewrite the commit log to make it clear that
I'm referring to the I/O base address register of PCI-to-PCI bridges.
Would this be more correct?

In that case, maybe in fact I really need a hook, so that this alignment
requirement is only applied to the resources allocated to bridges, and
not to their downstream devices?

Thanks,

Thomas
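The 4 KB granularity quoted above falls directly out of the register layout: only AD[15:12] of the I/O Base and I/O Limit are programmable. A small stand-alone illustration of that decoding (16-bit I/O addressing case only, not code from the series under review):

```c
/*
 * Decoding of a PCI-to-PCI bridge's 8-bit I/O Base/Limit registers into
 * a 4 KB-aligned window, as described by the spec text quoted above.
 */
#include <stdint.h>
#include <stdio.h>

static void decode_io_window(uint8_t io_base_reg, uint8_t io_limit_reg,
			     uint16_t *base, uint16_t *limit)
{
	/* Upper 4 bits map to AD[15:12]; the low 12 bits are implied. */
	*base  = (uint16_t)(io_base_reg  & 0xf0) << 8;            /* AD[11:0] = 0x000 */
	*limit = ((uint16_t)(io_limit_reg & 0xf0) << 8) | 0xfff;  /* AD[11:0] = 0xfff */
}

int main(void)
{
	uint16_t base, limit;

	/* Example register values: base 0x10, limit 0x20. */
	decode_io_window(0x10, 0x20, &base, &limit);
	printf("I/O window: 0x%04x - 0x%04x\n", base, limit);	/* 0x1000 - 0x2fff */
	return 0;
}
```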
On Tuesday 29 January 2013, Thomas Petazzoni wrote:
> And the Linux PCI resource allocation code complies with this, so that
> if I have two PCI-to-PCI bridges (each having downstream a device with
> an I/O BAR), then the first PCI-to-PCI bridge gets its I/O base address
> register set to ADDR + 0x0, and the second bridge gets its I/O base
> address set to ADDR + 0x1000. And this doesn't play well with the
> requirements of Marvell address decoding windows for PCIe I/O regions,
> which must be 64 KB aligned.

But we normally only assign a 64 KB I/O window to each PCI host bridge.
Requiring PCI bridges to be spaced 64 KB apart would mean that we cannot
actually support bridges at all.

Is this just about your "virtual" bridges? If each one has its own 64 KB
I/O range and its own configuration space, that sounds a lot like you
should make them appear as individual domains instead.

	Arnd
Dear Arnd Bergmann,

On Tue, 29 Jan 2013 16:45:07 +0000, Arnd Bergmann wrote:

> On Tuesday 29 January 2013, Thomas Petazzoni wrote:
> > And the Linux PCI resource allocation code complies with this, so
> > that if I have two PCI-to-PCI bridges (each having downstream a
> > device with an I/O BAR), then the first PCI-to-PCI bridge gets its
> > I/O base address register set to ADDR + 0x0, and the second bridge
> > gets its I/O base address set to ADDR + 0x1000. And this doesn't
> > play well with the requirements of Marvell address decoding windows
> > for PCIe I/O regions, which must be 64 KB aligned.
>
> But we normally only assign a 64 KB I/O window to each PCI host
> bridge. Requiring PCI bridges to be space 64 KB apart would mean that
> we cannot actually support bridges at all.
>
> Is this just about your "virtual" bridges? If each one has its
> own 64 KB I/O range and its own configuration space, that sounds
> a lot like you should make them appear as individual domains instead.

Yes, it is about the emulated PCI-to-PCI bridges. Each emulated
PCI-to-PCI bridge corresponds to one hardware PCIe interface, and I need
the I/O base address assigned to each PCIe interface to be aligned on a
64 KB boundary. I am not sure I understand why you think this is a
problem.

Also, what do you mean exactly by making them appear as individual
domains?

Remember that the very reason to use emulated PCI-to-PCI bridges is that
we want to assign a global range of addresses of I/O regions and a
global range of addresses of memory regions, and let the Linux PCI core
allocate from those two ranges to the different devices connected
downstream of the PCI-to-PCI bridges. This gives us for free the rather
complex allocation of addresses we need to set up our address decoding
windows.

If we have separate domains for each of our hardware PCIe interfaces,
can we still benefit from this allocation of resources from a globally
defined range of I/O addresses and memory addresses?

Thanks,

Thomas
On Tuesday 29 January 2013, Thomas Petazzoni wrote:
> Yes, it is about the emulated PCI-to-PCI bridges. Each
> emulated PCI-to-PCI bridge corresponds to one hardware PCIe interface,
> and I need the I/O base address assigned to each PCIe interface to be
> aligned on a 64 KB boundary. I am not sure to understand why you think
> this is a problem.
>
> Also, what do you mean exactly by making them appear as individual
> domains?

I mean you could make each root port look like a separate host bridge
that is not related to the others, and not have any emulated PCI-to-PCI
bridges at all.

> Remember that the very reason to use emulated PCI-to-PCI bridges is
> that we want to assign a global range of addresses of I/O regions and a
> global range of addresses of memory regions, and let the Linux PCI core
> allocate from those two ranges to the different devices connected
> downstream of the PCI-to-PCI bridges. This gives us for free the rather
> complex allocation of addresses we need to set up our address decoding
> windows.
>
> If we have have separate domains for each of our hardware PCIe
> interface, can we still benefit from this allocation of resources from
> a globally defined range of I/O addresses and memory addresses?

My interpretation of what you told me in the previous mail is that
each root port has

* A separate configuration space
* A separate 64KB I/O window that is not shared with the other ports,
  or potentially multiple 64KB windows, which we would not want to use
* A configurable range of the memory space that does not overlap
  with the other ports

Is the above a correct description?

If so, I think it would be most sensible to not try to put all ports
into the same domain, but give each port the full view of its own
256 buses, and 64KB I/O space. The memory space can still be directly
mapped, if you only set up the physical address window for that after
the bus scan is complete.

	Arnd
Dear Arnd Bergmann,

On Tue, 29 Jan 2013 20:15:21 +0000, Arnd Bergmann wrote:

> I mean you could make each root port look like a separate host
> bridge that is not related to the others, and not have any
> emulated PCI-to-PCI bridges at all.

Ok.

> > Remember that the very reason to use emulated PCI-to-PCI bridges is
> > that we want to assign a global range of addresses of I/O regions
> > and a global range of addresses of memory regions, and let the
> > Linux PCI core allocate from those two ranges to the different
> > devices connected downstream of the PCI-to-PCI bridges. This gives
> > us for free the rather complex allocation of addresses we need to
> > set up our address decoding windows.
> >
> > If we have have separate domains for each of our hardware PCIe
> > interface, can we still benefit from this allocation of resources
> > from a globally defined range of I/O addresses and memory addresses?
>
> My interpretation of what you told me in the previous mail is that
> each root port has
>
> * A separate configuration space
> * A separate 64KB I/O window that is not shared with the other ports,
>   or potentially multiple 64KB windows, which we would not want to use
> * A configurable range of the memory space that does not overlap
>   with the other ports
>
> Is the above a correct description?
>
> If so, I think it would be most sensible to not try to put all ports
> into the same domain, but give each port the full view of its own
> 256 buses, and 64KB I/O space. The memory space can still be directly
> mapped, if you only set up the physical address window for that after
> the bus scan is complete.

Does this still allow me to give the Linux PCI core *one* global range
of addresses for I/O space, and *one* global range of addresses for
memory space, and let the Linux PCI core assign ranges, within those
global ranges, to each host bridge?

This is absolutely essential for me, as I then read those allocated
ranges to configure the address decoding windows.

Basically, I have currently two suggestions:

* From Jason Gunthorpe, to not use any host bridge, and instead use
  only PCI-to-PCI bridges, one per PCIe interface.

* From you, to not use any PCI-to-PCI bridge, and use only host
  bridges, one per PCIe interface.

Would it be possible to get some consensus on this? In the review of
RFCv1, I was already told to use one global host bridge, and then one
PCI-to-PCI bridge per PCIe interface, and now we're talking about doing
something different. I'd like to avoid having to try gazillions of
different possible implementations :-)

Best regards,

Thomas
Arnd,

On Tue, 29 Jan 2013 21:33:08 +0100, Thomas Petazzoni wrote:

> Basically, I have currently two suggestions:
>
> * From Jason Gunthorpe, to not use any host bridge, and instead use
>   only PCI-to-PCI bridges, one per PCIe interface.
>
> * From you, to not use any PCI-to-PCI bridge, and use only host
>   bridges, one per PCIe interface.

Thinking more about this, this solution (using one emulated host bridge
per PCIe interface) would cause one problem: the PCIe device itself
would no longer be in slot 0.

If I'm correct, with one host bridge per PCIe interface, we would have
the following topology:

	bus 0, slot 0: emulated host bridge 0
	bus 0, slot 1: PCIe device connected to PCIe interface 0
	bus 1, slot 0: emulated host bridge 1
	bus 1, slot 1: PCIe device connected to PCIe interface 1
	bus 2, slot 0: emulated host bridge 2
	bus 2, slot 1: PCIe device connected to PCIe interface 2
	etc.

However, one of the reasons to use a PCI-to-PCI bridge was to ensure
that the PCIe devices were all listed in slot 0. According to the
Marvell engineers who work on the PCIe stuff, some new PCIe devices have
this requirement. I don't have a lot of details about this, but I was
told that most of the new Intel NICs require this, for example the
Intel X520 fiber NIC. Maybe PCIe experts (Jason?) could provide more
details about this, and confirm or refute this statement.

The usage of PCI-to-PCI bridges allows each PCIe device to be on its own
bus, at slot 0, which also solves this problem.

Best regards,

Thomas
On 01/29/2013 02:59 PM, Thomas Petazzoni wrote:
> Arnd,
>
> On Tue, 29 Jan 2013 21:33:08 +0100, Thomas Petazzoni wrote:
>
>> Basically, I have currently two suggestions:
>>
>> * From Jason Gunthorpe, to not use any host bridge, and instead use
>>   only PCI-to-PCI bridges, one per PCIe interface.
>>
>> * From you, to not use any PCI-to-PCI bridge, and use only host
>>   bridges, one per PCIe interface.
>
> Thinking more about this, this solution (using one emulated host bridge
> per PCIe interface) would cause one problem: the PCIe device itself
> would no longer be in slot 0.

I think that's device 0, not slot 0, right?

> If I'm correct, with one host bridge per PCIe interface, we would have
> the following topology:
>
> bus 0, slot 0: emulated host bridge 0
> bus 0, slot 1: PCIe device connected to PCIe interface 0

I /think/ the bus that the root port itself is on is different from the
bus that the downstream device is on, so wouldn't you end up with:

	bus 0, slot 0: emulated host bridge 0
	bus 1, slot 0: PCIe device connected to PCIe interface 0

(and isn't that "root port" not "host bridge" in the first line above?)
On Tuesday 29 January 2013, Thomas Petazzoni wrote:
> Does this still allows me to give the Linux PCI one global range of
> addresses for I/O space, and one global range of addresses for memory
> space, and the the Linux PCI core assign ranges, within those global
> ranges, to each host bridge?
>
> This is absolutely essential for me, as I then read those allocated
> ranges to configure the address decoding windows.
>
> Basically, I have currently two suggestions:
>
> * From Jason Gunthorpe, to not use any host bridge, and instead use
>   only PCI-to-PCI bridges, one per PCIe interface.
>
> * From you, to not use any PCI-to-PCI bridge, and use only host
>   bridges, one per PCIe interface.
>
> Would it be possible to get some consensus on this? In the review of
> RFCv1, I was already told to use one global host bridge, and then one
> PCI-to-PCI bridge per PCIe interface, and now we're talking about doing
> something different. I'd like to avoid having to try gazillions of
> different possible implementations :-)

I'm actually fine with either of the two suggestions you mentioned
above; whichever is easier to implement and/or more closely matches what
the hardware actually implements is better IMHO.

The part that I did not like about having emulated PCI-to-PCI bridges is
that it seems to just work around a (perceived or real) limitation in
the Linux kernel by adding a piece of infrastructure, rather than
lifting that limitation by making the kernel deal with what the hardware
provides. That reminded me of the original mach-vt8500 PCI
implementation that faked a complete PCI host bridge and a bunch of PCI
devices on it, in order to use the via-velocity ethernet controller,
instead of adding a simple 'platform_driver' struct to that driver.

	Arnd
On Tue, Jan 29, 2013 at 10:54:00PM +0000, Arnd Bergmann wrote:
> I'm actually fine with either of the two suggestions you mentioned above,
> whichever is easier to implement and/or more closely matches what the
> hardware actually implements is better IMHO.
>
> The part that I did not like about having emulated PCI-to-PCI bridges
> is that it seems to just work around a (percieved or real) limitation
> in the Linux kernel by adding a piece of infrastructure, rather than
> lifting that limitation by making the kernel deal with what the
> hardware provides. That reminded me of the original mach-vt8500

Well.. in this case there is a standard - PCI-E - for what HW vendors
are supposed to do. The kernel core code follows it and works with
compliant hardware.

Marvell HW is not compliant.

So..

Should the kernel core PCI code support this particular non-compliance?
Should the driver work around the non-compliance and present a compliant
interface to the kernel and userspace?

My take is that the kernel core PCI code is fine, and I hope this will
be an isolated issue with one family of Marvell IP. So working around
the HW problem in the driver seems best.

If we learn of many more instances like this then, yah, update the core
code and rip out this driver workaround...

Jason
On Tue, Jan 29, 2013 at 10:59:32PM +0100, Thomas Petazzoni wrote:
> Arnd,
>
> On Tue, 29 Jan 2013 21:33:08 +0100, Thomas Petazzoni wrote:
>
> > Basically, I have currently two suggestions:
> >
> > * From Jason Gunthorpe, to not use any host bridge, and instead use
> >   only PCI-to-PCI bridges, one per PCIe interface.
> >
> > * From you, to not use any PCI-to-PCI bridge, and use only host
> >   bridges, one per PCIe interface.

Arnd is suggesting to use multiple *linux* host bridges (i.e. host
drivers); there is never any need for a 'host bridge config space' as in
patch #7, in either case.

> However, one of the reason to use a PCI-to-PCI bridge was to ensure
> that the PCIe devices were all listed in slot 0. According to the
> Marvell engineers who work on the PCIe stuff, some new PCIe devices
> have this requirement. I don't have a lot of details about this, but I
> was told that most of the new Intel NICs require this, for example the
> Intel X520 fiber NIC. Maybe PCIe experts (Jason?) could provide more
> details about this, and confirm/infirm this statement.

I'm not sure what this is referring to.. I don't recall any specific
requirements in PCI-E for the device number; I think the spec requires
it to be learned based on the config TLPs received. There might be a
device number sensitivity in INTx translation, but that is defined by
the spec.

That said, if your root complex is PCI-E compliant then all downstream
end ports attached to a root port should have a device number of 0.

> The usage of PCI-to-PCI bridge allows to have each PCIe device on its
> own bus, at slot 0, which also solves this problem.

Hrm.... Looking at the docs, you will also need to change the internal
device number (probably reg 41a04 again) to something other than 0,
otherwise the Marvell itself will claim device number 0 and the
downstream end port will be device number 1. You should see this happen
today??

You should set the Marvell internal device number to something like all
ones, and then deny any Linux config register access to the all-ones
device number on the subordinate bus, to hide the Marvell end port
config space registers from Linux.

As far as this process goes, it doesn't matter which approach you take.
If you use multiple PCI domains then you'd still be able to arrange
things via the above so that the downstream device could always claim
device number 0.

Jason
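A rough sketch of the config-access filtering described above, hiding the controller's own (relocated) device number from the kernel; the function name, the choice of device number, and the bus check are assumptions, not the actual Marvell driver code.

```c
/*
 * Sketch only: a pci_ops read op that makes the controller's own device
 * number appear empty, as suggested above. Names and the chosen device
 * number are hypothetical.
 */
#include <linux/pci.h>

#define HIDDEN_DEVFN	PCI_DEVFN(31, 0)	/* assumed "all ones" device number */

static int example_rd_conf(struct pci_bus *bus, unsigned int devfn,
			   int where, int size, u32 *val)
{
	/* Pretend nothing is present at the hidden device number on the
	 * subordinate bus. */
	if (bus->number != 0 && devfn == HIDDEN_DEVFN) {
		*val = 0xffffffff;
		return PCIBIOS_DEVICE_NOT_FOUND;
	}

	/* ... otherwise forward the access to the real config space ... */
	return PCIBIOS_SUCCESSFUL;
}
```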
On Tue, Jan 29, 2013 at 04:45:07PM +0000, Arnd Bergmann wrote:
> On Tuesday 29 January 2013, Thomas Petazzoni wrote:
> > And the Linux PCI resource allocation code complies with this, so that
> > if I have two PCI-to-PCI bridges (each having downstream a device with
> > an I/O BAR), then the first PCI-to-PCI bridge gets its I/O base address
> > register set to ADDR + 0x0, and the second bridge gets its I/O base
> > address set to ADDR + 0x1000. And this doesn't play well with the
> > requirements of Marvell address decoding windows for PCIe I/O regions,
> > which must be 64 KB aligned.
>
> But we normally only assign a 64 KB I/O window to each PCI host bridge.
> Requiring PCI bridges to be space 64 KB apart would mean that we cannot
> actually support bridges at all.

The PCI resource code uses full 32 bit integers when it handles IO
addresses, so this actually does sort of work out.

However, Thomas, how did you recover the high bits of the IO window
address from the bridge configuration? Are you reading the struct
resource directly? That probably causes problems with hotplug/etc...

If you look back in your old emails I outlined a solution to this using
the MMU.

Jason
Dear Jason Gunthorpe,

On Tue, 29 Jan 2013 21:56:55 -0700, Jason Gunthorpe wrote:

> However, Thomas how did you recover the high bits of the
> IO window address from the bridge configuration? Are you reading the
> struct resource directly? That probably causes problems with
> hotplug/etc...

The PCI-to-PCI bridge configuration space has a register with the high
bits of the I/O window address. If you look at the PCI-to-PCI emulation
code, I set the bit that says "I'm a bridge capable of 32 bits
addressing of I/O addresses", and then when setting up the windows, I
reconstruct the full 32 bits address by reading the two I/O address
registers. See 3.2.5.6 in the PCI-to-PCI bridge specification:

"""
If the low four bits of the I/O Base and I/O Limit registers are 01h,
then the bridge supports 32-bit I/O address decoding, and the I/O Base
Upper 16 Bits and the I/O Limit Upper 16 Bits hold the upper 16 bits,
corresponding to AD[31::16], of the 32-bit I/O Base and I/O Limit
addresses respectively. In this case, system configuration software is
permitted to locate the I/O address range supported by the bridge
anywhere in the 4-GB I/O Space. Note that the 4-KB alignment and
granularity restrictions still apply when the bridge supports 32-bit
I/O addressing.
"""

(And my code does ensure that the low four bits of the I/O Base and I/O
Limit registers are 01h.)

Best regards,

Thomas
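For completeness, this is the reconstruction described above reduced to its arithmetic; illustrative stand-alone C, not the emulation code itself.

```c
/*
 * Rebuild a bridge's 32-bit I/O base from the 8-bit I/O Base register
 * and the I/O Base Upper 16 Bits register (PCI-to-PCI bridge spec
 * section 3.2.5.6).
 */
#include <stdint.h>
#include <stdio.h>

static uint32_t bridge_io_base_32(uint8_t io_base_reg, uint16_t io_base_upper16)
{
	/* Low 4 bits of the I/O Base register read 0x1 for 32-bit decoding. */
	if ((io_base_reg & 0x0f) != 0x01)
		return (uint32_t)(io_base_reg & 0xf0) << 8;	/* 16-bit decoding only */

	return ((uint32_t)io_base_upper16 << 16) |
	       ((uint32_t)(io_base_reg & 0xf0) << 8);
}

int main(void)
{
	/* e.g. upper 16 bits = 0xc001, base register = 0x01 -> 0xc0010000 */
	printf("0x%08x\n", bridge_io_base_32(0x01, 0xc001));
	return 0;
}
```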
On Wednesday 30 January 2013, Jason Gunthorpe wrote:
> > But we normally only assign a 64 KB I/O window to each PCI host bridge.
> > Requiring PCI bridges to be space 64 KB apart would mean that we cannot
> > actually support bridges at all.
>
> The PCI resource code uses full 32 bit integers when it handles IO
> addresses, so this actually does sort of work out.

However, we only reserve a 1 MB (I think) virtual address window for all
I/O spaces of all PCI domains combined, at a fixed location (0xfee00000).
This means we can have at most 16 such windows at run-time. That can be
changed if necessary, but it seems like overkill when in practice you
only need a few bytes at most.

	Arnd
Dear Arnd Bergmann,

On Wed, 30 Jan 2013 09:46:53 +0000, Arnd Bergmann wrote:

> On Wednesday 30 January 2013, Jason Gunthorpe wrote:
> > > But we normally only assign a 64 KB I/O window to each PCI host bridge.
> > > Requiring PCI bridges to be space 64 KB apart would mean that we cannot
> > > actually support bridges at all.
> >
> > The PCI resource code uses full 32 bit integers when it handles IO
> > addresses, so this actually does sort of work out.
>
> However, we only reserve 1 MB (I think) virtual address window for all
> I/O spaces of all PCI domains combined, at a fixed location (0xfee00000).
> This means we can have at most 16 such windows at run-time. That can
> be changed if necessary, but it seems like overkill when in practice
> you only need a few bytes at most.

I am not sure where this 0xfee00000 address comes from, but in my case
(and I think in the Tegra PCI driver as well), we tell the Linux PCI
core from which addresses the I/O ranges should be allocated. In my DT,
I have:

	ranges = <0x00000800 0 0xd0040000 0xd0040000 0 0x00002000   /* port 0.0 registers */
		  0x00004800 0 0xd0042000 0xd0042000 0 0x00002000   /* port 2.0 registers */
		  0x00001000 0 0xd0044000 0xd0044000 0 0x00002000   /* port 0.1 registers */
		  0x00001800 0 0xd0048000 0xd0048000 0 0x00002000   /* port 0.2 registers */
		  0x00002000 0 0xd004C000 0xd004C000 0 0x00002000   /* port 0.3 registers */
		  0x00002800 0 0xd0080000 0xd0080000 0 0x00002000   /* port 1.0 registers */
		  0x00005000 0 0xd0082000 0xd0082000 0 0x00002000   /* port 3.0 registers */
		  0x00003000 0 0xd0084000 0xd0084000 0 0x00002000   /* port 1.1 registers */
		  0x00003800 0 0xd0088000 0xd0088000 0 0x00002000   /* port 1.2 registers */
		  0x00004000 0 0xd008C000 0xd008C000 0 0x00002000   /* port 1.3 registers */
		  0x81000000 0 0          0xc0000000 0 0x00100000   /* downstream I/O */
		  0x82000000 0 0          0xc1000000 0 0x08000000>; /* non-prefetchable memory */

And then, the Marvell PCI driver gets the "downstream I/O" range, parses
it into a "struct resource", and then does (where &pcie->io is the
struct resource into which we parsed the "downstream I/O" range):

	pci_add_resource_offset(&sys->resources, &pcie->io, sys->io_offset);
	[...]
	pci_ioremap_io(nr * SZ_64K, pcie->io.start);

And it works just fine: I get my I/O ranges allocated at 0xc0000000 for
the first device, 0xc0010000 (i.e. base address + 64 KB) for the second
device, etc. The Tegra PCI driver does exactly the same (I shamelessly
copied what Thierry has done).

I somehow have the feeling that we are looking for problems that simply
don't exist...

Best regards,

Thomas
On Wednesday 30 January 2013, Jason Gunthorpe wrote:
> On Tue, Jan 29, 2013 at 10:54:00PM +0000, Arnd Bergmann wrote:
> > The part that I did not like about having emulated PCI-to-PCI bridges
> > is that it seems to just work around a (percieved or real) limitation
> > in the Linux kernel by adding a piece of infrastructure, rather than
> > lifting that limitation by making the kernel deal with what the
> > hardware provides. That reminded me of the original mach-vt8500
>
> Well.. in this case there is a standard - PCI-E for what HW vendors
> are supposed to do. The kernel core code follows it and works with
> compliant hardware.
>
> Marvell HW is not compliant.
>
> So..
>
> Should the kernel core PCI code support this particular non-compliance?
> Should the driver work around the non-compliance and present a
> compliant interface to the kernel and userspace?
>
> My take is the kernel core PCI code is fine, and I hope
> this will be an isolated issue with one family of Marvell IP. So
> working around the HW problem in the driver seems best.

I don't remember the kernel ever caring about whether hardware complies
with a standard or not. The kernel's job is to make hardware work, based
on the actual implementation of that hardware. In a lot of cases that
means taking the standard document as a reference, and adding quirks for
the devices that are different.

In the end, it comes down to the impact on the code complexity, and the
run-time overhead for whatever hardware is most common when adding those
quirks.

Can you (or someone else) describe what kind of changes to the core code
we would actually need to make it work without emulating the bridge?

> If we learn of many more instances like this then, yah, update the
> core code and rip out this driver work around...

But the code was specifically written to be reusable, which is normally
a good thing.

	Arnd
On Wednesday 30 January 2013, Thomas Petazzoni wrote:
> I am not sure where this 0xfee00000 address comes from, but in my case
> (and I think in the Tegra PCI driver as well), we tell the Linux PCI
> core from which addresses the I/O ranges should be allocated. In my DT,
> I have:
>
> ranges = <0x00000800 0 0xd0040000 0xd0040000 0 0x00002000 /* port 0.0 registers */
>           0x00004800 0 0xd0042000 0xd0042000 0 0x00002000 /* port 2.0 registers */
>           0x00001000 0 0xd0044000 0xd0044000 0 0x00002000 /* port 0.1 registers */
>           0x00001800 0 0xd0048000 0xd0048000 0 0x00002000 /* port 0.2 registers */
>           0x00002000 0 0xd004C000 0xd004C000 0 0x00002000 /* port 0.3 registers */
>           0x00002800 0 0xd0080000 0xd0080000 0 0x00002000 /* port 1.0 registers */
>           0x00005000 0 0xd0082000 0xd0082000 0 0x00002000 /* port 3.0 registers */
>           0x00003000 0 0xd0084000 0xd0084000 0 0x00002000 /* port 1.1 registers */
>           0x00003800 0 0xd0088000 0xd0088000 0 0x00002000 /* port 1.2 registers */
>           0x00004000 0 0xd008C000 0xd008C000 0 0x00002000 /* port 1.3 registers */
>           0x81000000 0 0          0xc0000000 0 0x00100000 /* downstream I/O */
>           0x82000000 0 0          0xc1000000 0 0x08000000>; /* non-prefetchable memory */
>
> And then, the Marvell PCI driver gets the "downstream I/O" range,
> parses it into a "struct resource", and then does (where &pcie->io is
> the struct resource into which we parsed the "downstream I/O" range):
>
> pci_add_resource_offset(&sys->resources, &pcie->io, sys->io_offset);
> [...]
> pci_ioremap_io(nr * SZ_64K, pcie->io.start);

0xfee00000 is the platform independent virtual address that
pci_ioremap_io maps your platform specific physical address (from
pcie->io.start) to. It's defined (in the kernel I am looking at) in
asm/io.h as

	#define PCI_IO_VIRT_BASE	0xfee00000

and used by pci_ioremap_io as

	return ioremap_page_range(PCI_IO_VIRT_BASE + offset,
				  PCI_IO_VIRT_BASE + offset + SZ_64K,
				  phys_addr,
				  __pgprot(get_mem_type(MT_DEVICE)->prot_pte));

> And it works just fine, I get my I/O ranges allocated at 0xc0000000 for
> the first device, 0xc0010000 (i.e base address + 64KB) for the second
> device, etc.

(void*)0xc0000000 is the normal PAGE_OFFSET. If you map your I/O space
there, you are in big trouble because that is supposed to have the start
of your physical memory mapping.

	Arnd
Dear Arnd Bergmann,

On Wed, 30 Jan 2013 10:03:43 +0000, Arnd Bergmann wrote:

> 0xfee00000 is the platform independent virtual address that pci_ioremap_io
> maps your platform specific physical address (from pcie->io.start) to. It's
> defined (in the kernel I am looking at) in asm/io.h as
>
> #define PCI_IO_VIRT_BASE	0xfee00000
>
> and used by pci_ioremap_io as
>
> 	return ioremap_page_range(PCI_IO_VIRT_BASE + offset,
> 				  PCI_IO_VIRT_BASE + offset + SZ_64K,
> 				  phys_addr,
> 				  __pgprot(get_mem_type(MT_DEVICE)->prot_pte));
>
> > And it works just fine, I get my I/O ranges allocated at 0xc0000000 for
> > the first device, 0xc0010000 (i.e base address + 64KB) for the second
> > device, etc.
>
> (void*)0xc0000000 is the normal PAGE_OFFSET. If you map your I/O space there,
> you are in big trouble because that is supposed to have the start of your
> physical memory mapping.

Aaah, I know where the confusion comes from. You are talking about
virtual addresses, while I am talking about physical addresses.
0xC0000000 in my DT is a *physical* address.

Basically, with Marvell SoCs, we have the following behavior:

	                    -------                         --------------------
	Virtual address --> | MMU | --> Physical address --> | Address decoding | --> real hardware
	                    -------                         --------------------

The MMU is the usual stuff everybody knows about. What's more special
about Marvell SoCs is this "Address decoding" thing. Basically, instead
of having the physical addresses of things fully hardcoded and mentioned
in the datasheets, they are configurable. So for each PCIe interface,
you have to set up an address decoding window for the I/O accesses and
another address decoding window for the memory accesses. And the
physical address associated to each of these "address decoding windows"
can be freely chosen, so they must be "assigned" for each PCIe
interface.

So, my 0xC0000000 is a *physical* address in the diagram above. The fact
that it gets mapped at 0xfee00000 as a virtual address doesn't really
matter for me, I'm just fine with that.

Does that clarify things?

Thomas
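Putting the two mails together: the virtual side is fixed by the ARM core (PCI_IO_VIRT_BASE plus one 64 KB slot per pci_ioremap_io call), while the physical side is whatever the driver chose for its decoding window. A sketch, with the per-port physical layout assumed purely for illustration:

```c
/*
 * Sketch: one 64 KB I/O slot per port in the fixed virtual area, backed
 * by a per-port physical decoding window. The physical layout below is
 * an assumption for illustration, not the actual driver's choice.
 */
#include <asm/io.h>		/* PCI_IO_VIRT_BASE, pci_ioremap_io() on ARM */
#include <linux/sizes.h>

#define EXAMPLE_IO_PHYS_BASE	0xc0000000UL	/* "downstream I/O" base from the DT above */

static int example_map_port_io(unsigned int nr)
{
	/* Virtual: PCI_IO_VIRT_BASE + nr * 64K; physical: this port's window. */
	return pci_ioremap_io(nr * SZ_64K, EXAMPLE_IO_PHYS_BASE + nr * SZ_64K);
}
```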
Dear Arnd Bergmann,

On Wed, 30 Jan 2013 09:55:49 +0000, Arnd Bergmann wrote:

> I don't remember the kernel ever caring about whether hardware complies
> to a standard or not. The kernel's job is to make hardware work, based
> on the actual implementation of that hardware. In a lot of cases that
> means taking the standard document as a reference, and adding quirks
> for the devices that are different.
>
> In the end, it comes down to the impact on the code complexity, and
> the run-time overhead for whatever hardware is most common when adding
> those quirks.

This is not only about standards, it is also about re-using the PCI
resource allocation code.

In my RFCv1, sent December 7th, I wasn't using any emulated PCI-to-PCI
bridge. So it *can* perfectly work without it. However, one major
drawback of my RFCv1 version is that since I didn't know how much I/O
space and memory space was needed for each PCIe device, I had to
oversize the address decoding windows. And also, I had to have a special
allocator (certainly simple, but still) to find an available physical
address to set up each address decoding window.

Emulating a PCI-to-PCI bridge very nicely allows re-using the PCI core
resource allocation code. I think that's really the main reason for
emulating those PCI-to-PCI bridges, rather than a wish to comply with
some standard.

So what I'm going to do now is rework my patch series by removing the
emulated host bridge (which is normally mandated by the PCIe standard,
but Linux doesn't need it, so we don't care), but I'll keep the emulated
PCI-to-PCI bridges in order to benefit from the PCI core resource
allocation mechanisms.

Is this ok for you? I'd like to settle on the strategy to follow,
because we're really going down a funny road here: on December 7th, I
submitted a series that doesn't use any PCI-to-PCI bridge, and I was
told that I should emulate some. I spent a long time working on an
implementation that uses emulated PCI-to-PCI bridges, which I submitted
on Monday, only to be told that I should work really hard not to use
PCI-to-PCI bridges. I hope you can feel my little embarrassment here...

Thanks,

Thomas
On Wednesday 30 January 2013, Thomas Petazzoni wrote:
> Dear Arnd Bergmann,
>
> On Wed, 30 Jan 2013 09:55:49 +0000, Arnd Bergmann wrote:
>
> > I don't remember the kernel ever caring about whether hardware complies
> > to a standard or not. The kernel's job is to make hardware work, based
> > on the actual implementation of that hardware. In a lot of cases that
> > means taking the standard document as a reference, and adding quirks
> > for the devices that are different.
> >
> > In the end, it comes down to the impact on the code complexity, and
> > the run-time overhead for whatever hardware is most common when adding
> > those quirks.
>
> This is not only about standards, it is also about re-using the PCI
> resource allocation code.
>
> In my RFCv1, sent December, 7th, I wasn't using any emulated PCI-to-PCI
> bridge. So it *can* perfectly work without it.

Ok, I see.

> However, one major drawback of my RFCv1 version is that since I didn't
> know how much I/O space and memory space was needed for each PCIe
> device, I had to oversize the address decoding windows. And also, I had
> to have a special allocator (certainly simple, but still) to find an
> available physical address to set up each address decoding window.

Well, for the I/O space, there is no oversizing because either way you
end up with exactly 64KB per root port, right?

> Emulating a PCI-to-PCI bridge very nicely allows to re-use the PCI core
> resource allocation code. I think it's really the main reason for
> emulated those PCI-to-PCI bridges, rather than willing to comply to
> some standards.
>
> So what I'm going to do now is rework my patch series by removing the
> emulated host bridge (which is normally mandatory by PCIe standard, but
> Linux doesn't need it, so we don't care), but I'll keep the emulated
> PCI-to-PCI bridges in order to benefit for the PCI core resource
> allocation mechanisms.
>
> Is this ok for you?

Using the Linux allocator for memory resources does sound useful, so if
that requires using the emulated PCI-to-PCI bridges, I guess it's the
best compromise.

> I'd like to settle on the strategy to follow, because we're really
> going a funny road here: on December 7th, I submit a series that
> doesn't use any PCI-to-PCI bridge, and I'm told that I should emulate
> some. I spent a long time working on an implementation that uses
> emumlated PCI-to-PCI bridges, which I submitted on Monday, now to be
> told that I should work really hard not to use PCI-to-PCI bridges. I
> hope you can feel my little embarrassment here...

I'm sorry about this situation. Unfortunately the way that such
decisions are made is not always straightforward, and what seems like a
good idea at one point turns out to be a mistake or more complex than
anticipated later. With the description of the first patch set, I did
not think it would be necessary to fake a bridge device and their config
space.

What I had guessed you could do was to call pci_scan_root_bus on each of
the ports, and then set up the memory space window for the bus including
all of its child devices.

	Arnd
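For reference, a rough sketch of the per-port scan being suggested here; the helper and its parameter names are placeholders, not code from the series under review.

```c
/*
 * Sketch of the "one root bus per port" idea: scan each port as its own
 * root bus with its own resource list, then size the port's memory
 * decoding window from what was assigned. Hypothetical helper.
 */
#include <linux/pci.h>

static struct pci_bus *example_scan_port(struct device *parent,
					 struct pci_ops *port_ops,
					 void *sysdata,
					 struct list_head *port_resources)
{
	struct pci_bus *bus;

	bus = pci_scan_root_bus(parent, 0, port_ops, sysdata, port_resources);
	if (!bus)
		return NULL;

	pci_bus_assign_resources(bus);

	/*
	 * ...then program this port's address decoding window to cover
	 * the memory range that ended up assigned to the bus...
	 */
	return bus;
}
```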
Dear Arnd Bergmann,

On Wed, 30 Jan 2013 16:17:38 +0000, Arnd Bergmann wrote:

> > However, one major drawback of my RFCv1 version is that since I didn't
> > know how much I/O space and memory space was needed for each PCIe
> > device, I had to oversize the address decoding windows. And also, I had
> > to have a special allocator (certainly simple, but still) to find an
> > available physical address to set up each address decoding window.
>
> Well, for the I/O space, there is no oversizing because either way you
> end up with exactly 64KB per root port, right?

Correct, I/O space is not an issue, of course. Only the memory windows
are an issue (in terms of quantity of address space used).

That said, the PCI-to-PCI bridge solution doesn't solve the fact that
I/O addresses get assigned even though the device drivers will most
likely not use them. This means that one I/O window is consumed for each
PCIe interface, even though it is not being used in practice. And you
see, I have 10 PCIe interfaces in this SoC, and only 20 windows
available globally (not only for PCIe, but also for NAND, NOR, etc.).
But for now, I'd like to leave this potential problem on the side, and
get something working. If it seems useful to remove this problem later,
we'll work on it.

> > Emulating a PCI-to-PCI bridge very nicely allows to re-use the PCI core
> > resource allocation code. I think it's really the main reason for
> > emulated those PCI-to-PCI bridges, rather than willing to comply to
> > some standards.
> >
> > So what I'm going to do now is rework my patch series by removing the
> > emulated host bridge (which is normally mandatory by PCIe standard, but
> > Linux doesn't need it, so we don't care), but I'll keep the emulated
> > PCI-to-PCI bridges in order to benefit for the PCI core resource
> > allocation mechanisms.
> >
> > Is this ok for you?
>
> Using the Linux allocator for memory resources does sound useful,
> so if that requires using the emulated PCI-to-PCI bridges, I guess
> it's the best compromise.

Yes, that was Jason's original idea when he suggested to use PCI-to-PCI
bridges. And when I did the implementation, it really worked nicely. And
this PCI-to-PCI bridge emulation stuff is really not a big deal, look at
drivers/pci/sw-pci-pci-bridge.c: 185 lines in total, including 10 lines
of comment header at the top.

> I'm sorry about this situation. Unfortunatly the way that such decisions
> are made is not always straightforward, and what seems like a good idea
> at one point turns out to be a mistake or more complex than anticipated
> later. With the description of the first patch set, I did not think
> it would be necessary to fake a bridge device and their config space.

Sure, I understand this. I guess you also understand my slight
frustration when I propose A, I'm told to do B, I propose B, and I'm
then suggested to do A again :-) But I agree, it's part of the technical
discussion, and we can't get it right on the first shot.

> What I had guessed you could do was to call pci_scan_root_bus on
> each of the ports, and then set up the memory space window for
> the bus including all of its child devices.

But where would I read how much space is needed for the I/O and memory
regions of each bus?

Thanks,

Thomas
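The pay-off being described here is that, once the core has assigned windows to the emulated bridges, the driver only has to decode each bridge's memory base/limit registers to size its decoding window. A sketch of that decode (the register layout is from the PCI-to-PCI bridge spec; the surrounding helper is hypothetical):

```c
/*
 * Turn an emulated bridge's 16-bit Memory Base/Limit registers into a
 * base/size pair for an address decoding window. Upper 12 bits of each
 * register correspond to AD[31:20], hence the 1 MB granularity.
 */
#include <linux/pci.h>

static void example_window_from_bridge(u16 mem_base_reg, u16 mem_limit_reg,
				       phys_addr_t *base, size_t *size)
{
	phys_addr_t start = (phys_addr_t)(mem_base_reg  & 0xfff0) << 16;
	phys_addr_t end   = ((phys_addr_t)(mem_limit_reg & 0xfff0) << 16) | 0xfffff;

	*base = start;
	*size = end - start + 1;	/* always a multiple of 1 MB */
}
```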
On Wed, Jan 30, 2013 at 4:47 AM, Thomas Petazzoni
<thomas.petazzoni@free-electrons.com> wrote:
> So what I'm going to do now is rework my patch series by removing the
> emulated host bridge (which is normally mandatory by PCIe standard, but
> Linux doesn't need it, so we don't care), ...

This is a tangent since you're removing the emulated host bridge anyway,
but it's been mentioned a couple of times, and I'd like to understand
this. Jason mentioned earlier in the [07/27] emulated host bridge thread
that the PCIe spec requires a host bridge at 00:00.0. I've never seen
that mentioned in the spec; can somebody point me to the actual
requirement that host bridges appear in config space?

My understanding has been that host bridges, whether PCI or PCIe, are
required to *exist*, but that the way you enumerate them and configure
them is outside the scope of the PCI/PCIe specs. I know that many chips,
especially for x86, *do* make the host bridge appear in config space,
but I've never seen a requirement for that.

Bjorn
On Wed, Jan 30, 2013 at 01:48:33PM -0700, Bjorn Helgaas wrote:
> This is a tangent since you're removing the emulated host bridge
> anyway, but it's been mentioned a couple of times, and I'd like to
> understand this. Jason mentioned earlier in the [07/27] emulated host
> bridge thread that the PCIe spec requires a host bridge at 00:00.0.
> I've never seen that mentioned in the spec; can somebody point me to
> the actual requirement that host bridges appear in config space?

Hum, a more careful search/reading brings up this:

7.2.2.1. Host Bridge Requirements

[...] The use of Host Bridge PCI class code is reserved for backwards
compatibility; host Bridge configuration space is opaque to standard PCI
Express software and may be implemented in an implementation specific
manner that is compatible with PCI Host Bridge Type 0 configuration
space. A PCI Express Host Bridge is not required to signal errors
through a Root Complex Event Collector. This support is optional for PCI
Express Host Bridges.

So, if it is present it is required to be compatible with the 'PCI Host
Bridge' stuff, but it is not mandatory.

My bad, I believe I also got confused with the spec language regarding a
'host bridge' vs a 'host bridge configuration space'.

Regards,
Jason
diff --git a/arch/arm/include/asm/mach/pci.h b/arch/arm/include/asm/mach/pci.h
index 5cf2e97..7d2c3c8 100644
--- a/arch/arm/include/asm/mach/pci.h
+++ b/arch/arm/include/asm/mach/pci.h
@@ -30,6 +30,11 @@ struct hw_pci {
 	void		(*postinit)(void);
 	u8		(*swizzle)(struct pci_dev *dev, u8 *pin);
 	int		(*map_irq)(const struct pci_dev *dev, u8 slot, u8 pin);
+	resource_size_t (*align_resource)(struct pci_dev *dev,
+					  const struct resource *res,
+					  resource_size_t start,
+					  resource_size_t size,
+					  resource_size_t align);
 };
 
 /*
@@ -51,6 +56,12 @@ struct pci_sys_data {
 	u8		(*swizzle)(struct pci_dev *, u8 *);
 					/* IRQ mapping				*/
 	int		(*map_irq)(const struct pci_dev *, u8, u8);
+					/* Resource alignment requirements	*/
+	resource_size_t (*align_resource)(struct pci_dev *dev,
+					  const struct resource *res,
+					  resource_size_t start,
+					  resource_size_t size,
+					  resource_size_t align);
 	void		*private_data;	/* platform controller private data	*/
 };
 
diff --git a/arch/arm/kernel/bios32.c b/arch/arm/kernel/bios32.c
index 5401645..be2e6c9 100644
--- a/arch/arm/kernel/bios32.c
+++ b/arch/arm/kernel/bios32.c
@@ -462,6 +462,7 @@ static void __init pcibios_init_hw(struct hw_pci *hw, struct list_head *head)
 		sys->busnr   = busnr;
 		sys->swizzle = hw->swizzle;
 		sys->map_irq = hw->map_irq;
+		sys->align_resource = hw->align_resource;
 		INIT_LIST_HEAD(&sys->resources);
 
 		if (hw->private_data)
@@ -574,6 +575,8 @@ char * __init pcibios_setup(char *str)
 resource_size_t pcibios_align_resource(void *data, const struct resource *res,
 				resource_size_t size, resource_size_t align)
 {
+	struct pci_dev *dev = data;
+	struct pci_sys_data *sys = dev->sysdata;
 	resource_size_t start = res->start;
 
 	if (res->flags & IORESOURCE_IO && start & 0x300)
@@ -581,6 +584,9 @@ resource_size_t pcibios_align_resource(void *data, const struct resource *res,
 
 	start = (start + align - 1) & ~(align - 1);
 
+	if (sys->align_resource)
+		return sys->align_resource(dev, res, start, size, align);
+
 	return start;
 }
 
The PCI specification says that an I/O region must be aligned on a 4 KB
boundary, and a memory region aligned on a 1 MB boundary.

However, the Marvell PCIe interfaces rely on address decoding windows
(which associate a range of physical addresses with a given device). For
PCIe memory windows, those windows are defined with a 1 MB granularity
(which matches the PCI specs), but PCIe I/O windows can only be defined
with a 64 KB granularity, so they have to be 64 KB aligned. We therefore
need to tell the PCI core about this special alignment requirement.

The PCI core already calls pcibios_align_resource() in the ARM PCI core,
specifically for such purposes. So this patch extends the ARM PCI core
so that it calls an ->align_resource() hook registered by the PCI
driver, exactly like the existing ->map_irq() and ->swizzle() hooks.

A particular PCI driver can register an align_resource() hook, and do
its own specific alignment, depending on the specific constraints of the
underlying hardware.

Signed-off-by: Thomas Petazzoni <thomas.petazzoni@free-electrons.com>
Cc: Russell King <linux@arm.linux.org.uk>
---
 arch/arm/include/asm/mach/pci.h | 11 +++++++++++
 arch/arm/kernel/bios32.c        |  6 ++++++
 2 files changed, 17 insertions(+)
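To show how a controller driver could consume the new hook, here is a hedged sketch of the Marvell use case described in the commit log (64 KB I/O alignment, 1 MB memory alignment); the names are illustrative and this is not the actual mvebu driver code.

```c
/*
 * Sketch of a driver-side user of the new hw_pci->align_resource hook.
 * Function and structure names are illustrative only.
 */
#include <linux/init.h>
#include <linux/ioport.h>
#include <linux/kernel.h>
#include <linux/pci.h>
#include <linux/sizes.h>
#include <asm/mach/pci.h>

static resource_size_t example_pcie_align_resource(struct pci_dev *dev,
						   const struct resource *res,
						   resource_size_t start,
						   resource_size_t size,
						   resource_size_t align)
{
	if (res->flags & IORESOURCE_IO)
		return round_up(start, SZ_64K);	/* I/O decoding windows: 64 KB granular */
	if (res->flags & IORESOURCE_MEM)
		return round_up(start, SZ_1M);	/* memory decoding windows: 1 MB granular */
	return start;
}

static struct hw_pci example_pci __initdata = {
	/* .nr_controllers, .setup, .map_irq, ... */
	.align_resource	= example_pcie_align_resource,
};
```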