Message ID | 1415470392-28229-1-git-send-email-hauke@hauke-m.de (mailing list archive) |
---|---|
State | New, archived |
On Saturday 08 November 2014 19:13:12 Hauke Mehrtens wrote:

> diff --git a/drivers/pci/host/pci-host-bcm5301x.c b/drivers/pci/host/pci-host-bcm5301x.c
> new file mode 100644
> index 0000000..8b7ba62
> --- /dev/null
> +++ b/drivers/pci/host/pci-host-bcm5301x.c
> @@ -0,0 +1,483 @@
> +/*
> + * Northstar PCI-Express driver
> + * Only supports Root-Complex (RC) mode
> + *
> + * Notes:
> + * PCI Domains are being used to identify the PCIe port 1:1.
> + *
> + * Only MEM access is supported, PAX does not support IO.

What is PAX?

> +static int bcma_pcie2_map_irq(const struct pci_dev *pdev, u8 slot, u8 pin)
> +{
> +	struct pci_sys_data *sys = pdev->sysdata;
> +	struct bcma_device *bdev = sys->private_data;
> +
> +	/*
> +	 * Every PCIe controller has 5 IRQ number and the last one is
> +	 * triggered every time, use that one
> +	 */
> +	if (bdev && bdev->dev.of_node)
> +		return irq_of_parse_and_map(bdev->dev.of_node, 4);
> +
> +	return bdev->irq;
> +}

If you have an OF device node for the PCI host, you don't need this,
there should already be an interrupt-map property that correctly
maps this into the interrupt domain of the bcma bus, so you can use
the default of_irq_parse_and_map_pci function.

> +static u32 bcma_pcie2_cfg_base(struct bcma_device *bdev, int busno,
> +			       unsigned int devfn, int where)
> +{
> +	int slot = PCI_SLOT(devfn);
> +	int fn = PCI_FUNC(devfn);
> +	u32 addr_reg;
> +
> +	if (busno == 0) {
> +		if (slot >= 1)
> +			return 0;
> +		bcma_write32(bdev, BCMA_CORE_PCIE2_CONFIGINDADDR,
> +			     where & 0xffc);
> +		return BCMA_CORE_PCIE2_CONFIGINDDATA;
> +	}
> +	if (fn > 1)
> +		return 0;
> +	addr_reg = (busno & 0xff) << 20 | (slot << 15) | (fn << 12) |
> +		   (where & 0xffc) | (1 & 0x3);
> +
> +	bcma_write32(bdev, BCMA_CORE_PCIE2_CFG_ADDR, addr_reg);
> +	return BCMA_CORE_PCIE2_CFG_DATA;
> +}

The normal buses seem to be ECAM compliant, I wonder if we can have the
code to access those shared between this driver, pci-host-generic.c and
the ACPI PCI implementation.

The root bus is unfortunately not compliant, which for some reason
is common on a lot of controllers.

> +static u32 bcma_pcie2_read_config(struct bcma_device *bdev, int busno,
> +				  unsigned int devfn, int where, int size)
> +{
> +	u32 base;
> +	u32 data_reg;
> +	u32 mask;
> +	int shift;
> +
> +	base = bcma_pcie2_cfg_base(bdev, busno, devfn, where);
> +
> +	if (!base)
> +		return ~0UL;
> +
> +	data_reg = bcma_read32(bdev, base);
> +
> +	/* NS: CLASS field is R/O, and set to wrong 0x200 value */
> +	if (busno == 0 && devfn == 0) {
> +		/*
> +		 * RC's class is 0x0280, but Linux PCI driver needs 0x604
> +		 * for a PCIe bridge. So we must fixup the class code
> +		 * to 0x604 here.
> +		 */
> +		if ((where & 0xffc) == PCI_CLASS_REVISION) {
> +			data_reg &= 0xff;
> +			data_reg |= 0x604 << 16;
> +		}
> +	}
> +	/* HEADER_TYPE=00 indicates the port in EP mode */
> +
> +	if (size == 4)
> +		return data_reg;
> +
> +	mask = (1 << (size * 8)) - 1;
> +	shift = (where % 4) * 8;
> +	return (data_reg >> shift) & mask;
> +}

Interesting. 0x0280 is PCI_CLASS_NETWORK_OTHER, I guess this is the
same PCI core that is used on broadcom network adapters in endpoint
mode, and when someone tweaked the core to do root complex mode,
they forgot to fix this.

Which code depends on this? I think I've seen something similar in
other host drivers, so maybe we can change the core code instead.

> +
> +static u8 bcma_pcie2_read_config8(struct bcma_device *bdev, int busno,
> +				  unsigned int devfn, int where)
> +{
> +	return bcma_pcie2_read_config(bdev, busno, devfn, where, 1);
> +}
> +
> +static u16 bcma_pcie2_read_config16(struct bcma_device *bdev, int busno,
> +				    unsigned int devfn, int where)
> +{
> +	return bcma_pcie2_read_config(bdev, busno, devfn, where, 2);
> +}
> +
> +static u32 bcma_pcie2_read_config32(struct bcma_device *bdev, int busno,
> +				    unsigned int devfn, int where)
> +{
> +	return bcma_pcie2_read_config(bdev, busno, devfn, where, 4);
> +}

These all seem to be relatively pointless, I'd just open-code the
bcma_pcie2_read_config call in the calling function.

> +
> +/*
> + * Initializte the PCIe controller
> + */
> +static void bcma_pcie2_hw_init(struct bcma_device *bdev)
> +{
> +	u32 tmp32;
> +	u16 tmp16;
> +
> +	/* Change MPS and MRRS to 512 */
> +	tmp16 = bcma_pcie2_read_config16(bdev, 0, 0, 0x4d4);
> +	tmp16 &= ~7;
> +	tmp16 |= 2;
> +	bcma_pcie2_write_config16(bdev, 0, 0, 0x4d4, tmp16);
> +
> +	tmp32 = bcma_pcie2_read_config32(bdev, 0, 0, 0xb4);
> +	tmp32 &= ~((7 << 12) | (7 << 5));
> +	tmp32 |= (2 << 12) | (2 << 5);
> +	bcma_pcie2_write_config32(bdev, 0, 0, 0xb4, tmp32);
> +
> +	/*
> +	 * Turn-on Root-Complex (RC) mode, from reset default of EP
> +	 * The mode is set by straps, can be overwritten via DMU
> +	 * register <cru_straps_control> bit 5, "1" means RC
> +	 */
> +
> +	/* Send a downstream reset */
> +	bcma_write32(bdev, BCMA_CORE_PCIE2_CLK_CONTROL,
> +		     PCIE2_CLKC_RST_OE | PCIE2_CLKC_RST);
> +	udelay(250);
> +	bcma_write32(bdev, BCMA_CORE_PCIE2_CLK_CONTROL, PCIE2_CLKC_RST_OE);
> +	mdelay(250);
> +
> +	/* TBD: take care of PM, check we're on */
> +}

Can you turn the two delays into msleep()? Wasting 250ms of CPU time is
really nasty.

> +	/*
> +	 * Inbound address translation setup
> +	 * Northstar only maps up to 128 MiB inbound, DRAM could be up to 1 GiB.
> +	 *
> +	 * For now allow access to entire DRAM, assuming it is less than 128MiB,
> +	 * otherwise DMA bouncing mechanism may be required.
> +	 * Also consider DMA mask to limit DMA physical address
> +	 */
> +	/* 64-bit LE regs, write low word, high is 0 at reset */
> +	bcma_write32(bdev, BCMA_CORE_PCIE2_FUNC0_IMAP1, PHYS_OFFSET | 0x1);
> +	bcma_write32(bdev, BCMA_CORE_PCIE2_IARR1_LOWER,
> +		     PHYS_OFFSET | ((SZ_128M >> 20) & 0xff));

Maybe I should bully you into enabling swiotlb on arm32 ;-)

Do you have any machines with more than 128MB of RAM that use this?
According to wikidevi.com [http://tinyurl.com/pdtw2h4], most machines
with bcm4708/81/9 have 256MB or 512MB. I think it shouldn't be too hard
to get swiotlb to work on arm32, given that we already use it on arm64.

	Arnd
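Editorial sketch on the ECAM remark above: the value the patch writes to BCMA_CORE_PCIE2_CFG_ADDR for non-root buses uses the same bit positions as an ECAM offset (bus at bit 20, device at bit 15, function at bit 12), with the low bits carrying the `(1 & 0x3)` value from the patch instead of a byte offset. The user-space helpers below only model that arithmetic. `cfg_addr()` and `ecam_offset()` are hypothetical names, the explicit masks on slot and function are an added assumption, and the meaning of the low two bits is not stated anywhere in the thread.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical model of the address word built in the non-root-bus
 * branch of bcma_pcie2_cfg_base(). The slot/fn masks are an assumption;
 * the posted patch does not mask them.
 */
static uint32_t cfg_addr(unsigned int busno, unsigned int slot,
			 unsigned int fn, unsigned int where)
{
	return ((uint32_t)(busno & 0xff) << 20) |
	       ((uint32_t)(slot & 0x1f) << 15) |
	       ((uint32_t)(fn & 0x7) << 12) |
	       (where & 0xffc) |
	       (1 & 0x3);	/* low bits as in the patch; purpose not documented */
}

/* Standard ECAM offset layout: bus[27:20] device[19:15] function[14:12] reg[11:0] */
static uint32_t ecam_offset(unsigned int busno, unsigned int slot,
			    unsigned int fn, unsigned int where)
{
	return ((uint32_t)(busno & 0xff) << 20) |
	       ((uint32_t)(slot & 0x1f) << 15) |
	       ((uint32_t)(fn & 0x7) << 12) |
	       (where & 0xfff);
}

/* Small self-check: the two encodings agree once the low two bits are dropped. */
static void check_cfg_addr(void)
{
	assert(cfg_addr(2, 3, 1, 0x44) == 0x219045);
	assert((cfg_addr(2, 3, 1, 0x44) & ~0x3u) == ecam_offset(2, 3, 1, 0x44));
}
```

If the layouts really do match on hardware, that is what would make sharing the accessor with an ECAM-based host driver plausible; the indirect address/data register pair would still need driver-specific handling.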
On 11/08/2014 08:47 PM, Arnd Bergmann wrote:
> On Saturday 08 November 2014 19:13:12 Hauke Mehrtens wrote:
>
>> diff --git a/drivers/pci/host/pci-host-bcm5301x.c b/drivers/pci/host/pci-host-bcm5301x.c
>> new file mode 100644
>> index 0000000..8b7ba62
>> --- /dev/null
>> +++ b/drivers/pci/host/pci-host-bcm5301x.c
>> @@ -0,0 +1,483 @@
>> +/*
>> + * Northstar PCI-Express driver
>> + * Only supports Root-Complex (RC) mode
>> + *
>> + * Notes:
>> + * PCI Domains are being used to identify the PCIe port 1:1.
>> + *
>> + * Only MEM access is supported, PAX does not support IO.
>
> What is PAX?

This is the name of the PCIe controller I think, This was copied from
the vendor driver, I hope Florian knows more details.

>> +static int bcma_pcie2_map_irq(const struct pci_dev *pdev, u8 slot, u8 pin)
>> +{
>> +	struct pci_sys_data *sys = pdev->sysdata;
>> +	struct bcma_device *bdev = sys->private_data;
>> +
>> +	/*
>> +	 * Every PCIe controller has 5 IRQ number and the last one is
>> +	 * triggered every time, use that one
>> +	 */
>> +	if (bdev && bdev->dev.of_node)
>> +		return irq_of_parse_and_map(bdev->dev.of_node, 4);
>> +
>> +	return bdev->irq;
>> +}
>
> If you have an OF device node for the PCI host, you don't need this,
> there should already be an interrupt-map property that correctly
> maps this into the interrupt domain of the bcma bus, so you can use
> the default of_irq_parse_and_map_pci function.

So I just have to define an interrupt-map property for the device this
driver gets registered for and then I can use the default implementation?

>> +static u32 bcma_pcie2_cfg_base(struct bcma_device *bdev, int busno,
>> +			       unsigned int devfn, int where)
>> +{
>> +	int slot = PCI_SLOT(devfn);
>> +	int fn = PCI_FUNC(devfn);
>> +	u32 addr_reg;
>> +
>> +	if (busno == 0) {
>> +		if (slot >= 1)
>> +			return 0;
>> +		bcma_write32(bdev, BCMA_CORE_PCIE2_CONFIGINDADDR,
>> +			     where & 0xffc);
>> +		return BCMA_CORE_PCIE2_CONFIGINDDATA;
>> +	}
>> +	if (fn > 1)
>> +		return 0;
>> +	addr_reg = (busno & 0xff) << 20 | (slot << 15) | (fn << 12) |
>> +		   (where & 0xffc) | (1 & 0x3);
>> +
>> +	bcma_write32(bdev, BCMA_CORE_PCIE2_CFG_ADDR, addr_reg);
>> +	return BCMA_CORE_PCIE2_CFG_DATA;
>> +}
>
> The normal buses seem to be ECAM compliant, I wonder if we can have the
> code to access those shared between this driver, pci-host-generic.c and
> the ACPI PCI implementation.

There are just a few lines we can share I do not think that would be
worth the effort.

> The root bus is unfortunately not compliant, which for some reason
> is common on a lot of controllers.
>
>> +static u32 bcma_pcie2_read_config(struct bcma_device *bdev, int busno,
>> +				  unsigned int devfn, int where, int size)
>> +{
>> +	u32 base;
>> +	u32 data_reg;
>> +	u32 mask;
>> +	int shift;
>> +
>> +	base = bcma_pcie2_cfg_base(bdev, busno, devfn, where);
>> +
>> +	if (!base)
>> +		return ~0UL;
>> +
>> +	data_reg = bcma_read32(bdev, base);
>> +
>> +	/* NS: CLASS field is R/O, and set to wrong 0x200 value */
>> +	if (busno == 0 && devfn == 0) {
>> +		/*
>> +		 * RC's class is 0x0280, but Linux PCI driver needs 0x604
>> +		 * for a PCIe bridge. So we must fixup the class code
>> +		 * to 0x604 here.
>> +		 */
>> +		if ((where & 0xffc) == PCI_CLASS_REVISION) {
>> +			data_reg &= 0xff;
>> +			data_reg |= 0x604 << 16;
>> +		}
>> +	}
>> +	/* HEADER_TYPE=00 indicates the port in EP mode */
>> +
>> +	if (size == 4)
>> +		return data_reg;
>> +
>> +	mask = (1 << (size * 8)) - 1;
>> +	shift = (where % 4) * 8;
>> +	return (data_reg >> shift) & mask;
>> +}
>
> Interesting. 0x0280 is PCI_CLASS_NETWORK_OTHER, I guess this is the
> same PCI core that is used on broadcom network adapters in endpoint
> mode, and when someone tweaked the core to do root complex mode,
> they forgot to fix this.

They did this with the old PCIe controller used on MIPS based SoCs,
there it was even the same core and one could only identify if it is
running in root complex mode with some strange register read and
checking the exception. In this generation they gave the core a
different number than the PCI client core used on wifi cards.

> Which code depends on this? I think I've seen something similar in
> other host drivers, so maybe we can change the core code instead.

It is used in many places in the core PCI code, I think it is better to
change this to the correct value, but the pci-tegra.c driver does it in
a better way, by adding a DECLARE_PCI_FIXUP_EARLY only changing the
struct data, I will use that version.

>> +
>> +static u8 bcma_pcie2_read_config8(struct bcma_device *bdev, int busno,
>> +				  unsigned int devfn, int where)
>> +{
>> +	return bcma_pcie2_read_config(bdev, busno, devfn, where, 1);
>> +}
>> +
>> +static u16 bcma_pcie2_read_config16(struct bcma_device *bdev, int busno,
>> +				    unsigned int devfn, int where)
>> +{
>> +	return bcma_pcie2_read_config(bdev, busno, devfn, where, 2);
>> +}
>> +
>> +static u32 bcma_pcie2_read_config32(struct bcma_device *bdev, int busno,
>> +				    unsigned int devfn, int where)
>> +{
>> +	return bcma_pcie2_read_config(bdev, busno, devfn, where, 4);
>> +}
>
> These all seem to be relatively pointless, I'd just open-code the
> bcma_pcie2_read_config call in the calling function.

I will inline them.

>> +
>> +/*
>> + * Initializte the PCIe controller
>> + */
>> +static void bcma_pcie2_hw_init(struct bcma_device *bdev)
>> +{
>> +	u32 tmp32;
>> +	u16 tmp16;
>> +
>> +	/* Change MPS and MRRS to 512 */
>> +	tmp16 = bcma_pcie2_read_config16(bdev, 0, 0, 0x4d4);
>> +	tmp16 &= ~7;
>> +	tmp16 |= 2;
>> +	bcma_pcie2_write_config16(bdev, 0, 0, 0x4d4, tmp16);
>> +
>> +	tmp32 = bcma_pcie2_read_config32(bdev, 0, 0, 0xb4);
>> +	tmp32 &= ~((7 << 12) | (7 << 5));
>> +	tmp32 |= (2 << 12) | (2 << 5);
>> +	bcma_pcie2_write_config32(bdev, 0, 0, 0xb4, tmp32);
>> +
>> +	/*
>> +	 * Turn-on Root-Complex (RC) mode, from reset default of EP
>> +	 * The mode is set by straps, can be overwritten via DMU
>> +	 * register <cru_straps_control> bit 5, "1" means RC
>> +	 */
>> +
>> +	/* Send a downstream reset */
>> +	bcma_write32(bdev, BCMA_CORE_PCIE2_CLK_CONTROL,
>> +		     PCIE2_CLKC_RST_OE | PCIE2_CLKC_RST);
>> +	udelay(250);
>> +	bcma_write32(bdev, BCMA_CORE_PCIE2_CLK_CONTROL, PCIE2_CLKC_RST_OE);
>> +	mdelay(250);
>> +
>> +	/* TBD: take care of PM, check we're on */
>> +}
>
> Can you turn the two delays into msleep()? Wasting 250ms of CPU time is
> really nasty.

Yes I will do that.

>> +	/*
>> +	 * Inbound address translation setup
>> +	 * Northstar only maps up to 128 MiB inbound, DRAM could be up to 1 GiB.
>> +	 *
>> +	 * For now allow access to entire DRAM, assuming it is less than 128MiB,
>> +	 * otherwise DMA bouncing mechanism may be required.
>> +	 * Also consider DMA mask to limit DMA physical address
>> +	 */
>> +	/* 64-bit LE regs, write low word, high is 0 at reset */
>> +	bcma_write32(bdev, BCMA_CORE_PCIE2_FUNC0_IMAP1, PHYS_OFFSET | 0x1);
>> +	bcma_write32(bdev, BCMA_CORE_PCIE2_IARR1_LOWER,
>> +		     PHYS_OFFSET | ((SZ_128M >> 20) & 0xff));
>
> Maybe I should bully you into enabling swiotlb on arm32 ;-)

This sounds complicated, I hope I can avoid it. ;-)

> Do you have any machines with more than 128MB of RAM that use this?

Yes, the device I use mostly has 256 MB ram, but I have only activated
the first 128MB because the last time I tried activating the rest caused
some problems and I still had to fix some other stuff. ;-)

> According to wikidevi.com [http://tinyurl.com/pdtw2h4], most machines
> with bcm4708/81/9 have 256MB or 512MB. I think it shouldn't be too hard
> to get swiotlb to work on arm32, given that we already use it on arm64.

Hauke
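Editorial sketch on the class-code discussion above: the fixup in bcma_pcie2_read_config() rewrites the 32-bit PCI_CLASS_REVISION word so that its upper 16 bits read 0x0604 (the kernel's PCI_CLASS_BRIDGE_PCI value) while the revision ID in the low byte is preserved. The user-space model below only reproduces that arithmetic; `fixup_class_revision()` and `class_device()` are hypothetical names, and the input value is an invented example, not a register dump from real hardware.

```c
#include <assert.h>
#include <stdint.h>

#define CLASS_BRIDGE_PCI 0x0604u	/* same value as the kernel's PCI_CLASS_BRIDGE_PCI */

/* Keep the revision ID (bits 7:0) and force class/subclass to PCI-to-PCI bridge. */
static uint32_t fixup_class_revision(uint32_t data_reg)
{
	data_reg &= 0xff;
	data_reg |= (uint32_t)CLASS_BRIDGE_PCI << 16;
	return data_reg;
}

/* Upper 16 bits of the word are what a 16-bit read of PCI_CLASS_DEVICE returns. */
static uint16_t class_device(uint32_t class_revision)
{
	return (uint16_t)(class_revision >> 16);
}

/* Example: a word reporting class 0x0280 (PCI_CLASS_NETWORK_OTHER), revision 0x01. */
static void check_class_fixup(void)
{
	uint32_t fixed = fixup_class_revision(0x02800001u);

	assert(fixed == 0x06040001u);
	assert(class_device(fixed) == CLASS_BRIDGE_PCI);
}
```

Note that the fixup also clears the programming-interface byte (bits 15:8). An early PCI fixup that adjusts only the class field of the device structure, as proposed in the reply above, would avoid patching every config read.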
On Saturday 08 November 2014 22:26:58 Hauke Mehrtens wrote:
> On 11/08/2014 08:47 PM, Arnd Bergmann wrote:
> > On Saturday 08 November 2014 19:13:12 Hauke Mehrtens wrote:
> >
> >> diff --git a/drivers/pci/host/pci-host-bcm5301x.c b/drivers/pci/host/pci-host-bcm5301x.c
> >> new file mode 100644
> >> index 0000000..8b7ba62
> >> --- /dev/null
> >> +++ b/drivers/pci/host/pci-host-bcm5301x.c
> >> @@ -0,0 +1,483 @@
> >> +/*
> >> + * Northstar PCI-Express driver
> >> + * Only supports Root-Complex (RC) mode
> >> + *
> >> + * Notes:
> >> + * PCI Domains are being used to identify the PCIe port 1:1.
> >> + *
> >> + * Only MEM access is supported, PAX does not support IO.
> >
> > What is PAX?
>
> This is the name of the PCIe controller I think, This was copied from
> the vendor driver, I hope Florian knows more details.

Just clarify this in the comment.

> >> +static int bcma_pcie2_map_irq(const struct pci_dev *pdev, u8 slot, u8 pin)
> >> +{
> >> +	struct pci_sys_data *sys = pdev->sysdata;
> >> +	struct bcma_device *bdev = sys->private_data;
> >> +
> >> +	/*
> >> +	 * Every PCIe controller has 5 IRQ number and the last one is
> >> +	 * triggered every time, use that one
> >> +	 */
> >> +	if (bdev && bdev->dev.of_node)
> >> +		return irq_of_parse_and_map(bdev->dev.of_node, 4);
> >> +
> >> +	return bdev->irq;
> >> +}
> >
> > If you have an OF device node for the PCI host, you don't need this,
> > there should already be an interrupt-map property that correctly
> > maps this into the interrupt domain of the bcma bus, so you can use
> > the default of_irq_parse_and_map_pci function.
>
> So I just have to define an interrupt-map property for the device this
> driver gets registered for and then I can use the default implementation?

It depends on what you want to cover. For all I know, you normally
don't need a DT entry for this PCI host, as it's covered by the
upstream bus. However adding a DT node would let you do some things
that you can't currently do:

- specify the inbound and outbound windows to configure
- list different interrupts, e.g. if there is a PCIe-to-PCI bridge
  and the device connected to it is routed to a GPIO pin or
  another input of the main interrupt controller.

> >> +static u32 bcma_pcie2_cfg_base(struct bcma_device *bdev, int busno,
> >> +			       unsigned int devfn, int where)
> >> +{
> >> +	int slot = PCI_SLOT(devfn);
> >> +	int fn = PCI_FUNC(devfn);
> >> +	u32 addr_reg;
> >> +
> >> +	if (busno == 0) {
> >> +		if (slot >= 1)
> >> +			return 0;
> >> +		bcma_write32(bdev, BCMA_CORE_PCIE2_CONFIGINDADDR,
> >> +			     where & 0xffc);
> >> +		return BCMA_CORE_PCIE2_CONFIGINDDATA;
> >> +	}
> >> +	if (fn > 1)
> >> +		return 0;
> >> +	addr_reg = (busno & 0xff) << 20 | (slot << 15) | (fn << 12) |
> >> +		   (where & 0xffc) | (1 & 0x3);
> >> +
> >> +	bcma_write32(bdev, BCMA_CORE_PCIE2_CFG_ADDR, addr_reg);
> >> +	return BCMA_CORE_PCIE2_CFG_DATA;
> >> +}
> >
> > The normal buses seem to be ECAM compliant, I wonder if we can have the
> > code to access those shared between this driver, pci-host-generic.c and
> > the ACPI PCI implementation.
>
> There are just a few lines we can share I do not think that would be
> worth the effort.

Well, the idea would be that for the root bus, you use simplified
pci_ops that just do BCMA_CORE_PCIE2_CONFIGINDADDR/DATA, while
for the other buses you use the standard pci ops.

> > Which code depends on this? I think I've seen something similar in
> > other host drivers, so maybe we can change the core code instead.
>
> It is used in many places in the core PCI code, I think it is better to
> change this to the correct value, but the pci-tegra.c driver does it in
> a better way, by adding a DECLARE_PCI_FIXUP_EARLY only changing the
> struct data, I will use that version.

Yes, sounds good.

> >> +	/*
> >> +	 * Inbound address translation setup
> >> +	 * Northstar only maps up to 128 MiB inbound, DRAM could be up to 1 GiB.
> >> +	 *
> >> +	 * For now allow access to entire DRAM, assuming it is less than 128MiB,
> >> +	 * otherwise DMA bouncing mechanism may be required.
> >> +	 * Also consider DMA mask to limit DMA physical address
> >> +	 */
> >> +	/* 64-bit LE regs, write low word, high is 0 at reset */
> >> +	bcma_write32(bdev, BCMA_CORE_PCIE2_FUNC0_IMAP1, PHYS_OFFSET | 0x1);
> >> +	bcma_write32(bdev, BCMA_CORE_PCIE2_IARR1_LOWER,
> >> +		     PHYS_OFFSET | ((SZ_128M >> 20) & 0xff));
> >
> > Maybe I should bully you into enabling swiotlb on arm32 ;-)
>
> This sounds complicated, I hope I can avoid it. ;-)

I'd really hope that it's not that hard. We basically just need a copy of
coherent_swiotlb_dma_ops/noncoherent_swiotlb_dma_ops from arm64 and
use those on bcma devices, with the right dma_mask set.

	Arnd
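Editorial sketch on the bounce-buffer discussion above: if the controller can only reach a 128 MiB inbound window, as the patch comment claims, any DMA buffer lying partly or wholly outside that window would need to be bounced through memory inside it. The user-space model below shows only that range check. It is not the swiotlb API, `dma_needs_bounce()` is a hypothetical name, and the window base is passed in as a parameter because the thread does not say where RAM starts on a given board.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define INBOUND_WINDOW_SIZE ((uint64_t)128 << 20)	/* 128 MiB, per the patch comment */

/*
 * Return true when [addr, addr + len) is not fully contained in the
 * inbound window that starts at window_base. Purely illustrative.
 */
static bool dma_needs_bounce(uint64_t window_base, uint64_t addr, uint64_t len)
{
	uint64_t window_end = window_base + INBOUND_WINDOW_SIZE;	/* exclusive */

	if (len == 0)
		return false;
	if (addr < window_base || addr >= window_end)
		return true;
	/* addr is inside the window; check that the tail fits as well */
	return len > window_end - addr;
}

static void check_bounce(void)
{
	/* last page of a window starting at 0 fits exactly */
	assert(!dma_needs_bounce(0, 0x07fff000, 0x1000));
	/* one byte past the end of the window does not */
	assert(dma_needs_bounce(0, 0x07fff000, 0x1001));
	/* buffer starting right at the window end is outside */
	assert(dma_needs_bounce(0, 0x08000000, 1));
}
```

On a board with 256 MB of RAM, half of memory would fail this check under the 128 MiB assumption, which is the case the swiotlb suggestion is meant to cover.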
(reviving an old thread)

On Sunday 09 November 2014 21:27:40 Arnd Bergmann wrote:
>
> > >> +	/*
> > >> +	 * Inbound address translation setup
> > >> +	 * Northstar only maps up to 128 MiB inbound, DRAM could be up to 1 GiB.
> > >> +	 *
> > >> +	 * For now allow access to entire DRAM, assuming it is less than 128MiB,
> > >> +	 * otherwise DMA bouncing mechanism may be required.
> > >> +	 * Also consider DMA mask to limit DMA physical address
> > >> +	 */
> > >> +	/* 64-bit LE regs, write low word, high is 0 at reset */
> > >> +	bcma_write32(bdev, BCMA_CORE_PCIE2_FUNC0_IMAP1, PHYS_OFFSET | 0x1);
> > >> +	bcma_write32(bdev, BCMA_CORE_PCIE2_IARR1_LOWER,
> > >> +		     PHYS_OFFSET | ((SZ_128M >> 20) & 0xff));
> > >
> > > Maybe I should bully you into enabling swiotlb on arm32
> >
> > This sounds complicated, I hope I can avoid it.
>
> I'd really hope that it's not that hard. We basically just need a copy of
> coherent_swiotlb_dma_ops/noncoherent_swiotlb_dma_ops from arm64 and
> use those on bcma devices, with the right dma_mask set.

Peter Senna has tested this on bcm4708a0 (Buffalo WZR-1750DHP) and found
that all RAM is DMA capable, as tested with CONFIG_VMSPLIT_1G. Could
the code comment here be incorrect, or is it possible that it was fixed
in later chip versions?

If this works, using CONFIG_VMSPLIT_1G should result in noticeably
better I/O performance on this chip.

I have created a patch that lets him simulate the broken behavior on his
machine, so he can work on implementing swiotlb, but it would certainly
be best to understand which machines are really affected.

Note that the driver that was merged as drivers/pci/host/pcie-iproc.c
does not seem to touch the BCMA_CORE_PCIE2_IARR1_LOWER (offset 0xd08)
register, and presumably the power-on default for this register
maps all of RAM correctly.

	Arnd
On 4/17/2015 7:09 AM, Arnd Bergmann wrote:
> (reviving an old thread)
>
> On Sunday 09 November 2014 21:27:40 Arnd Bergmann wrote:
>>
>>>>> +	/*
>>>>> +	 * Inbound address translation setup
>>>>> +	 * Northstar only maps up to 128 MiB inbound, DRAM could be up to 1 GiB.
>>>>> +	 *
>>>>> +	 * For now allow access to entire DRAM, assuming it is less than 128MiB,
>>>>> +	 * otherwise DMA bouncing mechanism may be required.
>>>>> +	 * Also consider DMA mask to limit DMA physical address
>>>>> +	 */
>>>>> +	/* 64-bit LE regs, write low word, high is 0 at reset */
>>>>> +	bcma_write32(bdev, BCMA_CORE_PCIE2_FUNC0_IMAP1, PHYS_OFFSET | 0x1);
>>>>> +	bcma_write32(bdev, BCMA_CORE_PCIE2_IARR1_LOWER,
>>>>> +		     PHYS_OFFSET | ((SZ_128M >> 20) & 0xff));
>>>>
>>>> Maybe I should bully you into enabling swiotlb on arm32
>>>
>>> This sounds complicated, I hope I can avoid it.
>>
>> I'd really hope that it's not that hard. We basically just need a copy of
>> coherent_swiotlb_dma_ops/noncoherent_swiotlb_dma_ops from arm64 and
>> use those on bcma devices, with the right dma_mask set.
>
> Peter Senna has tested this on bcm4708a0 (Buffalo WZR-1750DHP) and found
> that all RAM is DMA capable, as tested with CONFIG_VMSPLIT_1G. Could
> the code comment here be incorrect, or is it possible that it was fixed
> in later chip versions?
>
> If this works, using CONFIG_VMSPLIT_1G should result in noticeably
> better I/O performance on this chip.
>
> I have created a patch that lets him simulate the broken behavior on his
> machine, so he can work on implementing swiotlb, but it would certainly
> be best to understand which machines are really affected.
>
> Note that the driver that was merged as drivers/pci/host/pcie-iproc.c
> does not seem to touch the BCMA_CORE_PCIE2_IARR1_LOWER (offset 0xd08)
> register, and presumably the power-on default for this register
> maps all of RAM correctly.

That appears to be the case for Cygnus and NS+ but I just realized I
should not make this assumption since this is going to be the driver for
a lot of other iProc based chips. I need to check with our ASIC team on
limitation of PCIe on each chip on inbound mapping and do more
investigation and experiment myself.

Also note I won't have time to work on this in the short-term, but I
will eventually get to it.

Thanks,

Ray

>
> 	Arnd
>
diff --git a/arch/arm/mach-bcm/Kconfig b/arch/arm/mach-bcm/Kconfig index 2abad74..a3d3ede 100644 --- a/arch/arm/mach-bcm/Kconfig +++ b/arch/arm/mach-bcm/Kconfig @@ -86,6 +86,7 @@ config ARCH_BCM_5301X select HAVE_ARM_TWD if SMP select ARM_GLOBAL_TIMER select CLKSRC_ARM_GLOBAL_TIMER_SCHED_CLOCK + select PCI_DOMAINS if PCI help Support for Broadcom BCM470X and BCM5301X SoCs with ARM CPU cores. diff --git a/drivers/pci/host/Kconfig b/drivers/pci/host/Kconfig index 3dc25fa..ead54e9 100644 --- a/drivers/pci/host/Kconfig +++ b/drivers/pci/host/Kconfig @@ -91,4 +91,11 @@ config PCI_XGENE There are 5 internal PCIe ports available. Each port is GEN3 capable and have varied lanes from x1 to x8. +config PCI_BCM5301X + bool "BCM5301X PCIe2 host controller" + depends on BCMA && OF && ARM && PCI_DOMAINS + help + Say Y here if you want to support the PCIe host controller found + on Broadcom BCM5301X and BCM470X (Northstar) SoCs. + endmenu diff --git a/drivers/pci/host/Makefile b/drivers/pci/host/Makefile index 26b3461..036f1a0 100644 --- a/drivers/pci/host/Makefile +++ b/drivers/pci/host/Makefile @@ -11,3 +11,4 @@ obj-$(CONFIG_PCIE_SPEAR13XX) += pcie-spear13xx.o obj-$(CONFIG_PCI_KEYSTONE) += pci-keystone-dw.o pci-keystone.o obj-$(CONFIG_PCIE_XILINX) += pcie-xilinx.o obj-$(CONFIG_PCI_XGENE) += pci-xgene.o +obj-$(CONFIG_PCI_BCM5301X) += pci-host-bcm5301x.o diff --git a/drivers/pci/host/pci-host-bcm5301x.c b/drivers/pci/host/pci-host-bcm5301x.c new file mode 100644 index 0000000..8b7ba62 --- /dev/null +++ b/drivers/pci/host/pci-host-bcm5301x.c @@ -0,0 +1,483 @@ +/* + * Northstar PCI-Express driver + * Only supports Root-Complex (RC) mode + * + * Notes: + * PCI Domains are being used to identify the PCIe port 1:1. + * + * Only MEM access is supported, PAX does not support IO. + * + * Copyright 2012-2014, Broadcom Corporation + * Copyright 2014, Hauke Mehrtens <hauke@hauke-m.de> + * + * Licensed under the GNU/GPL. See COPYING for details. 
+ */ + +#include <linux/kernel.h> +#include <linux/module.h> +#include <linux/bug.h> +#include <linux/delay.h> +#include <linux/pci.h> +#include <linux/io.h> +#include <linux/ioport.h> +#include <linux/interrupt.h> +#include <linux/bcma/bcma.h> +#include <linux/bcma/bcma_driver_pcie2.h> +#include <linux/of_irq.h> + +#define SOC_PCIE_HDR_OFF 0x400 /* 256 bytes per function */ + +static int bcma_pcie2_map_irq(const struct pci_dev *pdev, u8 slot, u8 pin) +{ + struct pci_sys_data *sys = pdev->sysdata; + struct bcma_device *bdev = sys->private_data; + + /* + * Every PCIe controller has 5 IRQ number and the last one is + * triggered every time, use that one + */ + if (bdev && bdev->dev.of_node) + return irq_of_parse_and_map(bdev->dev.of_node, 4); + + return bdev->irq; +} + +static u32 bcma_pcie2_cfg_base(struct bcma_device *bdev, int busno, + unsigned int devfn, int where) +{ + int slot = PCI_SLOT(devfn); + int fn = PCI_FUNC(devfn); + u32 addr_reg; + + if (busno == 0) { + if (slot >= 1) + return 0; + bcma_write32(bdev, BCMA_CORE_PCIE2_CONFIGINDADDR, + where & 0xffc); + return BCMA_CORE_PCIE2_CONFIGINDDATA; + } + if (fn > 1) + return 0; + addr_reg = (busno & 0xff) << 20 | (slot << 15) | (fn << 12) | + (where & 0xffc) | (1 & 0x3); + + bcma_write32(bdev, BCMA_CORE_PCIE2_CFG_ADDR, addr_reg); + return BCMA_CORE_PCIE2_CFG_DATA; +} + +static u32 bcma_pcie2_read_config(struct bcma_device *bdev, int busno, + unsigned int devfn, int where, int size) +{ + u32 base; + u32 data_reg; + u32 mask; + int shift; + + base = bcma_pcie2_cfg_base(bdev, busno, devfn, where); + + if (!base) + return ~0UL; + + data_reg = bcma_read32(bdev, base); + + /* NS: CLASS field is R/O, and set to wrong 0x200 value */ + if (busno == 0 && devfn == 0) { + /* + * RC's class is 0x0280, but Linux PCI driver needs 0x604 + * for a PCIe bridge. So we must fixup the class code + * to 0x604 here. 
+ */ + if ((where & 0xffc) == PCI_CLASS_REVISION) { + data_reg &= 0xff; + data_reg |= 0x604 << 16; + } + } + /* HEADER_TYPE=00 indicates the port in EP mode */ + + if (size == 4) + return data_reg; + + mask = (1 << (size * 8)) - 1; + shift = (where % 4) * 8; + return (data_reg >> shift) & mask; +} + +static void bcma_pcie2_write_config(struct bcma_device *bdev, int busno, + unsigned int devfn, int where, int size, + u32 val) +{ + u32 base; + u32 data_reg; + + base = bcma_pcie2_cfg_base(bdev, busno, devfn, where); + + if (!base) + return; + + if (size < 4) { + u32 mask = (1 << (size * 8)) - 1; + int shift = (where % 4) * 8; + + data_reg = bcma_read32(bdev, base); + data_reg &= ~(mask << shift); + data_reg |= (val & mask) << shift; + } else { + data_reg = val; + } + + bcma_write32(bdev, base, data_reg); +} + +static u8 bcma_pcie2_read_config8(struct bcma_device *bdev, int busno, + unsigned int devfn, int where) +{ + return bcma_pcie2_read_config(bdev, busno, devfn, where, 1); +} + +static u16 bcma_pcie2_read_config16(struct bcma_device *bdev, int busno, + unsigned int devfn, int where) +{ + return bcma_pcie2_read_config(bdev, busno, devfn, where, 2); +} + +static u32 bcma_pcie2_read_config32(struct bcma_device *bdev, int busno, + unsigned int devfn, int where) +{ + return bcma_pcie2_read_config(bdev, busno, devfn, where, 4); +} + +static void bcma_pcie2_write_config8(struct bcma_device *bdev, int busno, + unsigned int devfn, int where, u8 val) +{ + return bcma_pcie2_write_config(bdev, busno, devfn, where, 1, val); +} + +static void bcma_pcie2_write_config16(struct bcma_device *bdev, int busno, + unsigned int devfn, int where, u16 val) +{ + return bcma_pcie2_write_config(bdev, busno, devfn, where, 2, val); +} + +static void bcma_pcie2_write_config32(struct bcma_device *bdev, int busno, + unsigned int devfn, int where, u32 val) +{ + return bcma_pcie2_write_config(bdev, busno, devfn, where, 4, val); +} + +static int bcma_pcie2_read_config_pci(struct pci_bus *bus, 
unsigned int devfn, + int where, int size, u32 *val) +{ + struct pci_sys_data *sys = bus->sysdata; + struct bcma_device *bdev = sys->private_data; + + *val = bcma_pcie2_read_config(bdev, bus->number, devfn, where, size); + + return PCIBIOS_SUCCESSFUL; +} + +static int bcma_pcie2_write_config_pci(struct pci_bus *bus, unsigned int devfn, + int where, int size, u32 val) +{ + struct pci_sys_data *sys = bus->sysdata; + struct bcma_device *bdev = sys->private_data; + + bcma_pcie2_write_config(bdev, bus->number, devfn, where, size, val); + + return PCIBIOS_SUCCESSFUL; +} + +/* + * Methods for accessing configuration registers + */ +static struct pci_ops bcma_pcie2_ops = { + .read = bcma_pcie2_read_config_pci, + .write = bcma_pcie2_write_config_pci, +}; + +/* + * Check link status, return 0 if link is up in RC mode, + * otherwise return non-zero + */ +static int bcma_pcie2_check_link(struct bcma_device *bdev, bool allow_gen2) +{ + u32 devfn = 0; + u32 tmp32; + u8 tmp8; + + tmp32 = bcma_pcie2_read_config32(bdev, 0, devfn, 0xdc); + tmp32 &= ~0xf; + if (allow_gen2) + tmp32 |= 2; + else { + /* force PCIE GEN1 */ + tmp32 |= 1; + } + bcma_pcie2_write_config32(bdev, 0, devfn, 0xdc, tmp32); + + /* See if the port is in EP mode, indicated by header type 00 */ + tmp8 = bcma_pcie2_read_config8(bdev, 0, devfn, PCI_HEADER_TYPE); + if (tmp8 != PCI_HEADER_TYPE_BRIDGE) { + dev_info(&bdev->dev, "Port %d in End-Point mode - ignored\n", + bdev->core_unit); + return -ENODEV; + } + + return 0; +} + +/* + * Initializte the PCIe controller + */ +static void bcma_pcie2_hw_init(struct bcma_device *bdev) +{ + u32 tmp32; + u16 tmp16; + + /* Change MPS and MRRS to 512 */ + tmp16 = bcma_pcie2_read_config16(bdev, 0, 0, 0x4d4); + tmp16 &= ~7; + tmp16 |= 2; + bcma_pcie2_write_config16(bdev, 0, 0, 0x4d4, tmp16); + + tmp32 = bcma_pcie2_read_config32(bdev, 0, 0, 0xb4); + tmp32 &= ~((7 << 12) | (7 << 5)); + tmp32 |= (2 << 12) | (2 << 5); + bcma_pcie2_write_config32(bdev, 0, 0, 0xb4, tmp32); + + /* + * 
Turn-on Root-Complex (RC) mode, from reset default of EP + * The mode is set by straps, can be overwritten via DMU + * register <cru_straps_control> bit 5, "1" means RC + */ + + /* Send a downstream reset */ + bcma_write32(bdev, BCMA_CORE_PCIE2_CLK_CONTROL, + PCIE2_CLKC_RST_OE | PCIE2_CLKC_RST); + udelay(250); + bcma_write32(bdev, BCMA_CORE_PCIE2_CLK_CONTROL, PCIE2_CLKC_RST_OE); + mdelay(250); + + /* TBD: take care of PM, check we're on */ +} + +/* + * Setup the address translation + * + * NOTE: All PCI-to-CPU address mapping are 1:1 for simplicity + */ +static int bcma_pcie2_map_init(struct bcma_device *bdev, u32 addr) +{ + /* 64MB alignment */ + if (!addr || (addr & (SZ_64M - 1))) + return -EINVAL; + + bcma_write32(bdev, BCMA_CORE_PCIE2_OMAP0_LOWER, addr); + bcma_write32(bdev, BCMA_CORE_PCIE2_OARR0, addr | 0x01); + + bcma_write32(bdev, BCMA_CORE_PCIE2_OMAP1_LOWER, addr + SZ_64M); + bcma_write32(bdev, BCMA_CORE_PCIE2_OARR1, (addr + SZ_64M) | 0x01); + + /* + * Inbound address translation setup + * Northstar only maps up to 128 MiB inbound, DRAM could be up to 1 GiB. + * + * For now allow access to entire DRAM, assuming it is less than 128MiB, + * otherwise DMA bouncing mechanism may be required. 
+	 * Also consider DMA mask to limit DMA physical address
+	 */
+	/* 64-bit LE regs, write low word, high is 0 at reset */
+	bcma_write32(bdev, BCMA_CORE_PCIE2_FUNC0_IMAP1, PHYS_OFFSET | 0x1);
+	bcma_write32(bdev, BCMA_CORE_PCIE2_IARR1_LOWER,
+		     PHYS_OFFSET | ((SZ_128M >> 20) & 0xff));
+	return 0;
+}
+
+/*
+ * Setup PCIE Host bridge
+ */
+static int bcma_pcie2_bridge_init(struct bcma_device *bdev, u32 addr, u32 size)
+{
+	bcma_pcie2_write_config8(bdev, 0, 0, PCI_PRIMARY_BUS, 0);
+	bcma_pcie2_write_config8(bdev, 0, 0, PCI_SECONDARY_BUS, 1);
+	bcma_pcie2_write_config8(bdev, 0, 0, PCI_SUBORDINATE_BUS, 4);
+
+	bcma_pcie2_read_config8(bdev, 0, 0, PCI_PRIMARY_BUS);
+	bcma_pcie2_read_config8(bdev, 0, 0, PCI_SECONDARY_BUS);
+	bcma_pcie2_read_config8(bdev, 0, 0, PCI_SUBORDINATE_BUS);
+
+	/* MEM_BASE, MEM_LIM require 1MB alignment */
+	if (((addr >> 16) & 0xf) || (((addr + size) >> 16) & 0xf))
+		return -EINVAL;
+
+	bcma_pcie2_write_config16(bdev, 0, 0, PCI_MEMORY_BASE, addr >> 16);
+	bcma_pcie2_write_config16(bdev, 0, 0, PCI_MEMORY_LIMIT,
+				  (addr + size) >> 16);
+
+	/* These registers are not supported on the NS */
+	bcma_pcie2_write_config16(bdev, 0, 0, PCI_IO_BASE_UPPER16, 0);
+	bcma_pcie2_write_config16(bdev, 0, 0, PCI_IO_LIMIT_UPPER16, 0);
+
+	/* Force class to that of a Bridge */
+	bcma_pcie2_write_config16(bdev, 0, 0, PCI_CLASS_DEVICE,
+				  PCI_CLASS_BRIDGE_PCI);
+
+	bcma_pcie2_read_config16(bdev, 0, 0, PCI_CLASS_DEVICE);
+	bcma_pcie2_read_config16(bdev, 0, 0, PCI_MEMORY_BASE);
+	bcma_pcie2_read_config16(bdev, 0, 0, PCI_MEMORY_LIMIT);
+	return 0;
+}
+
+static void bcma_pcie2_3rd_init(struct bcma_bus *bus)
+{
+	/* PCIE PLL block register (base 0x8000) */
+	bcma_chipco_b_mii_write(&bus->drv_cc_b, 0x00000088, 0x57fe8000);
+	/* Check PCIE PLL lock status */
+	bcma_chipco_b_mii_write(&bus->drv_cc_b, 0x00000088, 0x67c60000);
+}
+
+/* To improve PCIE phy jitter */
+static void bcma_pcie2_improve_phy_jitter(struct bcma_bus *bus, int phyaddr)
+{
+	u32 val;
+
+	/* Change blkaddr */
+	val = (1 << 30) | (1 << 28) | (phyaddr << 23) | (0x1f << 18) |
+	      (2 << 16) | (0x863 << 4);
+	bcma_chipco_b_mii_write(&bus->drv_cc_b, 0x0000009a, val);
+
+	/* Write 0x0190 to 0x13 regaddr */
+	val = (1 << 30) | (1 << 28) | (phyaddr << 23) | (0x13 << 18) |
+	      (2 << 16) | 0x0190;
+	bcma_chipco_b_mii_write(&bus->drv_cc_b, 0x0000009a, val);
+
+	/* Write 0x0191 to 0x19 regaddr */
+	val = (1 << 30) | (1 << 28) | (phyaddr << 23) | (0x19 << 18) |
+	      (2 << 16) | 0x0191;
+	bcma_chipco_b_mii_write(&bus->drv_cc_b, 0x0000009a, val);
+}
+
+static int bcma_pcie2_setup(int nr, struct pci_sys_data *sys)
+{
+	struct bcma_device *bdev = sys->private_data;
+	struct bcma_bus *bus = bdev->bus;
+	struct resource *res;
+	struct bcma_device *arm_core;
+	u32 cru_straps_ctrl;
+	int ret;
+	int phyaddr;
+
+	if (bdev->core_unit == 2) {
+		arm_core = bcma_find_core(bus, BCMA_CORE_ARMCA9);
+		cru_straps_ctrl = bcma_read32(arm_core, 0x2a0);
+
+		/* 3rd PCIE is not selected */
+		if (cru_straps_ctrl & 0x10)
+			return -ENODEV;
+
+		bcma_pcie2_3rd_init(bus);
+		phyaddr = 0xf;
+	} else {
+		phyaddr = bdev->core_unit;
+	}
+	bcma_pcie2_improve_phy_jitter(bus, phyaddr);
+
+	/* create mem resource */
+	res = devm_kzalloc(&bdev->dev, sizeof(*res), GFP_KERNEL);
+	if (!res)
+		return -ENOMEM;
+
+	res->start = bdev->addr_s[0];
+	res->end = res->start + SZ_128M - 1;
+	res->name = "PCIe configuration mem space";
+	res->flags = IORESOURCE_MEM;
+
+	ret = devm_request_resource(&bdev->dev, &iomem_resource, res);
+	if (ret)
+		return ret;
+
+	pci_add_resource(&sys->resources, res);
+
+	/* This PCIe controller does not support IO Mem, so use a dummy one.
+	 */
+	res = devm_kzalloc(&bdev->dev, sizeof(*res), GFP_KERNEL);
+	if (!res)
+		return -ENOMEM;
+
+	res->start = 0;
+	res->end = 0;
+	res->name = "PCIe dummy IO space";
+	res->flags = IORESOURCE_IO;
+
+	pci_add_resource(&sys->resources, res);
+
+	bcma_pcie2_hw_init(bdev);
+	ret = bcma_pcie2_map_init(bdev, bdev->addr_s[0]);
+	if (ret)
+		return ret;
+
+	/*
+	 * Skip inactive ports -
+	 * will need to change this for hot-plugging
+	 */
+	ret = bcma_pcie2_check_link(bdev, true);
+	if (ret)
+		return ret;
+
+	ret = bcma_pcie2_bridge_init(bdev, bdev->addr_s[0], SZ_128M);
+	if (ret)
+		return ret;
+
+	return 1;
+}
+
+static int bcma_pcie2_probe(struct bcma_device *bdev)
+{
+	struct hw_pci hw = {
+		.nr_controllers = 1,
+		.domain = bdev->core_unit,
+		.private_data = (void **)&bdev,
+		.setup = bcma_pcie2_setup,
+		.map_irq = bcma_pcie2_map_irq,
+		.ops = &bcma_pcie2_ops,
+	};
+
+	dev_info(&bdev->dev, "initializing PCIe controller\n");
+
+	/* Announce this port to ARM/PCI common code */
+	pci_common_init_dev(&bdev->dev, &hw);
+
+	/* Setup virtual-wire interrupts */
+	bcma_write32(bdev, BCMA_CORE_PCIE2_SYS_RC_INTX_EN, 0xf);
+
+	/* Enable memory and bus master */
+	bcma_write32(bdev, SOC_PCIE_HDR_OFF + 4, 0x6);
+
+	return 0;
+}
+
+static const struct bcma_device_id bcma_pcie2_table[] = {
+	BCMA_CORE(BCMA_MANUF_BCM, BCMA_CORE_NS_PCIEG2, BCMA_ANY_REV, BCMA_ANY_CLASS),
+	BCMA_CORETABLE_END
+};
+MODULE_DEVICE_TABLE(bcma, bcma_pcie2_table);
+
+static struct bcma_driver bcma_pcie2_driver = {
+	.name = KBUILD_MODNAME,
+	.id_table = bcma_pcie2_table,
+	.probe = bcma_pcie2_probe,
+};
+
+static int __init bcma_pcie2_init(void)
+{
+	return bcma_driver_register(&bcma_pcie2_driver);
+}
+module_init(bcma_pcie2_init);
+
+static void __exit bcma_pcie2_exit(void)
+{
+	bcma_driver_unregister(&bcma_pcie2_driver);
+}
+module_exit(bcma_pcie2_exit);
+
+MODULE_AUTHOR("Hauke Mehrtens");
+MODULE_DESCRIPTION("PCIe Gen2 driver for BCMA");
+MODULE_LICENSE("GPL v2");
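For reference, the alignment check that bcma_pcie2_bridge_init() applies to the memory window can be tried out in isolation. The following is a user-space sketch, not driver code: mem_window_regs() is a hypothetical helper that mirrors the check and the two 16-bit values the patch writes to PCI_MEMORY_BASE and PCI_MEMORY_LIMIT, and the SZ_128M definition is repeated here only so the example compiles on its own.

```c
#include <assert.h>
#include <stdint.h>

#define SZ_128M 0x08000000u

/*
 * Hypothetical user-space mirror of the window check in
 * bcma_pcie2_bridge_init(). Returns 0 and fills in the values the
 * patch writes to PCI_MEMORY_BASE / PCI_MEMORY_LIMIT, or -1 if bits
 * 19:16 of either end of the window are non-zero. As in the patch,
 * bits 15:0 are not examined.
 */
static int mem_window_regs(uint32_t addr, uint32_t size,
			   uint16_t *base, uint16_t *limit)
{
	assert(base && limit);

	if (((addr >> 16) & 0xf) || (((addr + size) >> 16) & 0xf))
		return -1;

	/* Address bits 31:20 end up in register bits 15:4. */
	*base = (uint16_t)(addr >> 16);
	*limit = (uint16_t)((addr + size) >> 16);
	return 0;
}
```

With an example window at addr = 0x08000000 and size = SZ_128M, this gives base 0x0800 and limit 0x1000; a window starting at 0x08010000 is rejected. The sketch keeps the patch's use of addr + size for the limit value as posted.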
This driver supports the PCIe controller found on the BCM4708 and
similar SoCs. The controller itself is automatically detected by bcma.

This controller is found on SoCs usually used in SOHO routers to
connect the wifi cards to the SoC. All of the BCM5301X SoCs I know of
have 2 or 3 of these controllers in the SoC.

I had to use PCI domains, otherwise the pci_create_root_bus() function
in drivers/pci/probe.c would fail for the second controller being
registered, because pci_find_bus() would find the same PCIe bus again
and assume it is already registered. This ends up in a kernel panic in
pcibios_init_hw() in arch/arm/kernel/bios32.c.

The ARM PCI code assumes that every controller has an I/O space and
adds a dummy area if the driver does not specify one. This works for
the first controller, but registering the second one results in an
error. To prevent this problem we add an empty I/O space.

Currently I have problems with probing the devices on the bus, because
pci_bus_add_devices() is called too early in pci_scan_root_bus() in
drivers/pci/probe.c, before pci_bus_assign_resources() was called in
pci_common_init_dev() in arch/arm/kernel/bios32.c. When the devices are
added too early they do not have any resources and adding fails. I have
to remove the call to pci_bus_add_devices() in pci_scan_root_bus() to
make registration work; calling pci_bus_add_devices() again later does
not fix this problem.

Signed-off-by: Hauke Mehrtens <hauke@hauke-m.de>
---
 arch/arm/mach-bcm/Kconfig            |   1 +
 drivers/pci/host/Kconfig             |   7 +
 drivers/pci/host/Makefile            |   1 +
 drivers/pci/host/pci-host-bcm5301x.c | 483 +++++++++++++++++++++++++++++++++++
 4 files changed, 492 insertions(+)
 create mode 100644 drivers/pci/host/pci-host-bcm5301x.c
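The root-bus collision described in the commit message can be illustrated with a toy model. This is a user-space sketch under the assumption that a root bus is identified by its (domain, bus number) pair; find_bus() and create_root_bus() are hypothetical stand-ins and not the kernel's pci_find_bus() and pci_create_root_bus().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ROOTS 8

struct root_bus {
	int domain;
	int busnr;
};

static struct root_bus roots[MAX_ROOTS];
static size_t nr_roots;

/* Toy lookup: is a root bus with this (domain, busnr) pair registered? */
static bool find_bus(int domain, int busnr)
{
	size_t i;

	for (i = 0; i < nr_roots; i++)
		if (roots[i].domain == domain && roots[i].busnr == busnr)
			return true;
	return false;
}

/* Toy registration: refuse a root bus whose pair is already taken. */
static int create_root_bus(int domain, int busnr)
{
	assert(busnr >= 0 && busnr <= 255);

	if (find_bus(domain, busnr) || nr_roots == MAX_ROOTS)
		return -1;

	roots[nr_roots].domain = domain;
	roots[nr_roots].busnr = busnr;
	nr_roots++;
	return 0;
}
```

In this model, registering bus 0 twice in domain 0 fails on the second call, while registering bus 0 once per domain (0, 1, 2) succeeds each time. That corresponds to giving every controller its own domain through .domain = bdev->core_unit.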