diff mbox

[1/1] pci: xgene: Enable huge outbound bar support

Message ID 1435280756-24455-1-git-send-email-dhdang@apm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Duc Dang June 26, 2015, 1:05 a.m. UTC
X-Gene PCIe controllers support huge outbound BARs (with size upto
64GB). This patch configures additional 1 outbound BAR for X-Gene
PCIe controllers with size larger than 4GB. This is required to
support devices that request huge outbound memory (nVidia K40 as an
example)

Signed-off-by: Duc Dang <dhdang@apm.com>
Signed-off-by: Tanmay Inamdar <tinamdar@apm.com>
---
 arch/arm64/boot/dts/apm/apm-storm.dtsi | 33 +++++++++++++++++++--------------
 drivers/pci/host/pci-xgene.c           |  6 +++++-
 2 files changed, 24 insertions(+), 15 deletions(-)

Comments

Arnd Bergmann June 26, 2015, 7:59 a.m. UTC | #1
On Thursday 25 June 2015 18:05:56 Duc Dang wrote:
> X-Gene PCIe controllers support huge outbound BARs (with size upto
> 64GB). This patch configures additional 1 outbound BAR for X-Gene
> PCIe controllers with size larger than 4GB. This is required to
> support devices that request huge outbound memory (nVidia K40 as an
> example)
> 
> Signed-off-by: Duc Dang <dhdang@apm.com>
> Signed-off-by: Tanmay Inamdar <tinamdar@apm.com>
> ---
>  arch/arm64/boot/dts/apm/apm-storm.dtsi | 33 +++++++++++++++++++--------------
>  drivers/pci/host/pci-xgene.c           |  6 +++++-
>  2 files changed, 24 insertions(+), 15 deletions(-)

It seems you do multiple things here:

- add an entry in the ranges
- move the config space
- fix driver to handle multiple memory ranges

but your description only mentions the first one. Please submit separate
patches for these and explain for each patch why it is required, so we can
pick them up into the arm-soc and pci git repositories separately, and make
sure that each patch works by itself to guarantee we don't get incompatible
binding changes.

> diff --git a/arch/arm64/boot/dts/apm/apm-storm.dtsi b/arch/arm64/boot/dts/apm/apm-storm.dtsi
> index d8f3a1c..039206b 100644
> --- a/arch/arm64/boot/dts/apm/apm-storm.dtsi
> +++ b/arch/arm64/boot/dts/apm/apm-storm.dtsi
> @@ -404,10 +404,11 @@
>  			#size-cells = <2>;
>  			#address-cells = <3>;
>  			reg = < 0x00 0x1f2b0000 0x0 0x00010000   /* Controller registers */
> -				0xe0 0xd0000000 0x0 0x00040000>; /* PCI config space */
> +				0xe0 0x00000000 0x0 0x00040000>; /* PCI config space */
>  			reg-names = "csr", "cfg";
>  			ranges = <0x01000000 0x00 0x00000000 0xe0 0x10000000 0x00 0x00010000   /* io */
> -				  0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000>; /* mem */
> +				  0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000   /* mem */
> +				  0x02000000 0x10 0x00000000 0xf0 0x00000000 0x10 0x00000000>; /* mem */
>  			dma-ranges = <0x42000000 0x80 0x00000000 0x80 0x00000000 0x00 0x80000000
>  				      0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>;
>  			interrupt-map-mask = <0x0 0x0 0x0 0x7>;

What is the reason for picking the 0x10.00000000 address? We normally try
to use an identity mapping (mem_offset=0) on the bus, or fall back to starting
at zero (mem_offset=cpu_offset), but you use seemingly random mem_offset values.

Could you get the same effect by extending the 0x80000000 mapping to a length
of 58GB (just before the start of the second window)?

Can you configure one of the two windows as prefetchable for improved performance?

I also notice that the 0x80000000-0xffffffff bus range is list both in
ranges and dma-ranges. Does that mean that devices on this host cannot
access MMIO ranges of other devices, or should you exclude this range from
the dma-ranges property?

Style-wise, it would be nice to submit an extra patch that groups entries
in the ranges and reg lists like

			ranges = <0x01000000 0x00 0x00000000 0xe0 0x10000000 0x00 0x00010000>, /* io */
				 <0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000>, /* mem */
				 <0x02000000 0x10 0x00000000 0xf0 0x00000000 0x10 0x00000000>; /* mem */

but that is just a cosmetic change and should be kept separate.

> @@ -321,8 +322,11 @@ static int xgene_pcie_map_ranges(struct xgene_pcie_port *port,
>  				return ret;
>  			break;
>  		case IORESOURCE_MEM:
> -			xgene_pcie_setup_ob_reg(port, res, OMR1BARL, res->start,
> +			xgene_pcie_setup_ob_reg(port, res,
> +						OMR1BARL + (omr_idx * 0x18),
> +						res->start,
>  						res->start - window->offset);
> +			omr_idx++;
>  			break;
>  		case IORESOURCE_BUS:
>  			break;

Can you describe what happens when you boot an older kernel with the new
dt file, or vice versa? Will that simply ignore the first ranges entries
and use only the last one, or does it break in interesting ways?

	Arnd
Bjorn Helgaas June 26, 2015, 2:26 p.m. UTC | #2
Hi Duc,

On Thu, Jun 25, 2015 at 8:05 PM, Duc Dang <dhdang@apm.com> wrote:
> X-Gene PCIe controllers support huge outbound BARs (with size upto
> 64GB). This patch configures additional 1 outbound BAR for X-Gene
> PCIe controllers with size larger than 4GB. This is required to
> support devices that request huge outbound memory (nVidia K40 as an
> example)

The PCI specs don't use the "outbound BAR" terminology.  I assume this
is what we normally call "windows" or "apertures" for MMIO access by
the CPU?  (The specs rarely use those terms either, but they're
commonly used in Linux, and the PCIe spec does mention them a couple
times.)

By "request huge outbound memory," I assume you simply mean the device
has a huge BAR, right?

Bjorn
Duc Dang June 26, 2015, 5:26 p.m. UTC | #3
On Fri, Jun 26, 2015 at 7:26 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> Hi Duc,
>
> On Thu, Jun 25, 2015 at 8:05 PM, Duc Dang <dhdang@apm.com> wrote:
>> X-Gene PCIe controllers support huge outbound BARs (with size upto
>> 64GB). This patch configures additional 1 outbound BAR for X-Gene
>> PCIe controllers with size larger than 4GB. This is required to
>> support devices that request huge outbound memory (nVidia K40 as an
>> example)
>
> The PCI specs don't use the "outbound BAR" terminology.  I assume this
> is what we normally call "windows" or "apertures" for MMIO access by
> the CPU?  (The specs rarely use those terms either, but they're
> commonly used in Linux, and the PCIe spec does mention them a couple
> times.)

Yes, Bjorn. The "outbound BAR" that I mentioned is the MMIO window. I
will change these terms in next patch.

>
> By "request huge outbound memory," I assume you simply mean the device
> has a huge BAR, right?
Yes, I meant "the device has a huge BAR". I will change the
description for this in next patch as well

>
> Bjorn

Regards,
Duc Dang.
Duc Dang June 26, 2015, 6:56 p.m. UTC | #4
Hi Arnd,

On Fri, Jun 26, 2015 at 12:59 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Thursday 25 June 2015 18:05:56 Duc Dang wrote:
>> X-Gene PCIe controllers support huge outbound BARs (with size upto
>> 64GB). This patch configures additional 1 outbound BAR for X-Gene
>> PCIe controllers with size larger than 4GB. This is required to
>> support devices that request huge outbound memory (nVidia K40 as an
>> example)
>>
>> Signed-off-by: Duc Dang <dhdang@apm.com>
>> Signed-off-by: Tanmay Inamdar <tinamdar@apm.com>
>> ---
>>  arch/arm64/boot/dts/apm/apm-storm.dtsi | 33 +++++++++++++++++++--------------
>>  drivers/pci/host/pci-xgene.c           |  6 +++++-
>>  2 files changed, 24 insertions(+), 15 deletions(-)
>
> It seems you do multiple things here:
>
> - add an entry in the ranges
> - move the config space
> - fix driver to handle multiple memory ranges
>
> but your description only mentions the first one. Please submit separate
> patches for these and explain for each patch why it is required, so we can
> pick them up into the arm-soc and pci git repositories separately, and make
> sure that each patch works by itself to guarantee we don't get incompatible
> binding changes.

Move config space can be in a separate patch. But it is not absolutely
necessary to move the configuration space. The reason I change it is
to have config space starting from the beginning of SoC PCIe physical
address space. I will drop it in next patch.

But add 'an entry in the ranges' and 'fix driver to handle multiple
memory ranges' should go together as single patch because the old
driver will break when it sees multiple ranges.

>
>> diff --git a/arch/arm64/boot/dts/apm/apm-storm.dtsi b/arch/arm64/boot/dts/apm/apm-storm.dtsi
>> index d8f3a1c..039206b 100644
>> --- a/arch/arm64/boot/dts/apm/apm-storm.dtsi
>> +++ b/arch/arm64/boot/dts/apm/apm-storm.dtsi
>> @@ -404,10 +404,11 @@
>>                       #size-cells = <2>;
>>                       #address-cells = <3>;
>>                       reg = < 0x00 0x1f2b0000 0x0 0x00010000   /* Controller registers */
>> -                             0xe0 0xd0000000 0x0 0x00040000>; /* PCI config space */
>> +                             0xe0 0x00000000 0x0 0x00040000>; /* PCI config space */
>>                       reg-names = "csr", "cfg";
>>                       ranges = <0x01000000 0x00 0x00000000 0xe0 0x10000000 0x00 0x00010000   /* io */
>> -                               0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000>; /* mem */
>> +                               0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000   /* mem */
>> +                               0x02000000 0x10 0x00000000 0xf0 0x00000000 0x10 0x00000000>; /* mem */
>>                       dma-ranges = <0x42000000 0x80 0x00000000 0x80 0x00000000 0x00 0x80000000
>>                                     0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>;
>>                       interrupt-map-mask = <0x0 0x0 0x0 0x7>;
>
> What is the reason for picking the 0x10.00000000 address? We normally try
> to use an identity mapping (mem_offset=0) on the bus, or fall back to starting
> at zero (mem_offset=cpu_offset), but you use seemingly random mem_offset values.
>

I just randomly pick 0x10 00000000 as it is not used. I will change to
use 0xf0 00000000:
0x02000000 0xf0 0x00000000 0xf0 0x00000000 0x10 0x00000000

> Could you get the same effect by extending the 0x80000000 mapping to a length
> of 58GB (just before the start of the second window)?

I don't think so. It will break with the PCIe device that requires 32-bit BAR.

>
> Can you configure one of the two windows as prefetchable for improved performance?
>
The new huge window will be prefetchable.

> I also notice that the 0x80000000-0xffffffff bus range is list both in
> ranges and dma-ranges. Does that mean that devices on this host cannot
> access MMIO ranges of other devices, or should you exclude this range from
> the dma-ranges property?
>

As I understand, the mem entries in 'ranges' are for controller to
access remote PCIe devices (EP)  BAR; while the entries in
'dma-ranges'
are for remote PCIe devices (EP) to access controller memory. So they
are different. Or are you asking about something else?


> Style-wise, it would be nice to submit an extra patch that groups entries
> in the ranges and reg lists like
>
>                         ranges = <0x01000000 0x00 0x00000000 0xe0 0x10000000 0x00 0x00010000>, /* io */
>                                  <0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000>, /* mem */
>                                  <0x02000000 0x10 0x00000000 0xf0 0x00000000 0x10 0x00000000>; /* mem */
>
> but that is just a cosmetic change and should be kept separate.

Would you mind if I make this change as well and include in next patch?

>
>> @@ -321,8 +322,11 @@ static int xgene_pcie_map_ranges(struct xgene_pcie_port *port,
>>                               return ret;
>>                       break;
>>               case IORESOURCE_MEM:
>> -                     xgene_pcie_setup_ob_reg(port, res, OMR1BARL, res->start,
>> +                     xgene_pcie_setup_ob_reg(port, res,
>> +                                             OMR1BARL + (omr_idx * 0x18),
>> +                                             res->start,
>>                                               res->start - window->offset);
>> +                     omr_idx++;
>>                       break;
>>               case IORESOURCE_BUS:
>>                       break;
>
> Can you describe what happens when you boot an older kernel with the new
> dt file, or vice versa? Will that simply ignore the first ranges entries
> and use only the last one, or does it break in interesting ways?

Older kernel and new dtb: the new entry in 'ranges' will overwrite the
old entry, so device that requires 32-bit BAR will break.
New kernel and old dtb: works fine but without huge bar support.

>
>         Arnd
Arnd Bergmann June 26, 2015, 8:35 p.m. UTC | #5
On Friday 26 June 2015 11:56:47 Duc Dang wrote:
> Hi Arnd,
> 
> On Fri, Jun 26, 2015 at 12:59 AM, Arnd Bergmann <arnd@arndb.de> wrote:
> > On Thursday 25 June 2015 18:05:56 Duc Dang wrote:
> >> X-Gene PCIe controllers support huge outbound BARs (with size upto
> >> 64GB). This patch configures additional 1 outbound BAR for X-Gene
> >> PCIe controllers with size larger than 4GB. This is required to
> >> support devices that request huge outbound memory (nVidia K40 as an
> >> example)
> >>
> >> Signed-off-by: Duc Dang <dhdang@apm.com>
> >> Signed-off-by: Tanmay Inamdar <tinamdar@apm.com>
> >> ---
> >>  arch/arm64/boot/dts/apm/apm-storm.dtsi | 33 +++++++++++++++++++--------------
> >>  drivers/pci/host/pci-xgene.c           |  6 +++++-
> >>  2 files changed, 24 insertions(+), 15 deletions(-)
> >
> > It seems you do multiple things here:
> >
> > - add an entry in the ranges
> > - move the config space
> > - fix driver to handle multiple memory ranges
> >
> > but your description only mentions the first one. Please submit separate
> > patches for these and explain for each patch why it is required, so we can
> > pick them up into the arm-soc and pci git repositories separately, and make
> > sure that each patch works by itself to guarantee we don't get incompatible
> > binding changes.
> 
> Move config space can be in a separate patch. But it is not absolutely
> necessary to move the configuration space. The reason I change it is
> to have config space starting from the beginning of SoC PCIe physical
> address space. I will drop it in next patch.
> 
> But add 'an entry in the ranges' and 'fix driver to handle multiple
> memory ranges' should go together as single patch because the old
> driver will break when it sees multiple ranges.
> 
> >
> >> diff --git a/arch/arm64/boot/dts/apm/apm-storm.dtsi b/arch/arm64/boot/dts/apm/apm-storm.dtsi
> >> index d8f3a1c..039206b 100644
> >> --- a/arch/arm64/boot/dts/apm/apm-storm.dtsi
> >> +++ b/arch/arm64/boot/dts/apm/apm-storm.dtsi
> >> @@ -404,10 +404,11 @@
> >>                       #size-cells = <2>;
> >>                       #address-cells = <3>;
> >>                       reg = < 0x00 0x1f2b0000 0x0 0x00010000   /* Controller registers */
> >> -                             0xe0 0xd0000000 0x0 0x00040000>; /* PCI config space */
> >> +                             0xe0 0x00000000 0x0 0x00040000>; /* PCI config space */
> >>                       reg-names = "csr", "cfg";
> >>                       ranges = <0x01000000 0x00 0x00000000 0xe0 0x10000000 0x00 0x00010000   /* io */
> >> -                               0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000>; /* mem */
> >> +                               0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000   /* mem */
> >> +                               0x02000000 0x10 0x00000000 0xf0 0x00000000 0x10 0x00000000>; /* mem */
> >>                       dma-ranges = <0x42000000 0x80 0x00000000 0x80 0x00000000 0x00 0x80000000
> >>                                     0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>;
> >>                       interrupt-map-mask = <0x0 0x0 0x0 0x7>;
> >
> > What is the reason for picking the 0x10.00000000 address? We normally try
> > to use an identity mapping (mem_offset=0) on the bus, or fall back to starting
> > at zero (mem_offset=cpu_offset), but you use seemingly random mem_offset values.
> >
> 
> I just randomly pick 0x10 00000000 as it is not used. I will change to
> use 0xf0 00000000:
> 0x02000000 0xf0 0x00000000 0xf0 0x00000000 0x10 0x00000000

Ok

> > Could you get the same effect by extending the 0x80000000 mapping to a length
> > of 58GB (just before the start of the second window)?
> 
> I don't think so. It will break with the PCIe device that requires 32-bit BAR.

Are you sure? If this is true, maybe this should be fixed in the PCI core.

> >
> > Can you configure one of the two windows as prefetchable for improved performance?
> >
> The new huge window will be prefetchable.

The DT marks it as non-prefetchable, so the kernel will now try to allocate
space for non-prefetchable BARs from this area, which is a bug. If the second
memory window is always prefetchable, you have to mark it accordingly in DT,
to prevent it from being used for normal 64 bit BARs.

> > I also notice that the 0x80000000-0xffffffff bus range is list both in
> > ranges and dma-ranges. Does that mean that devices on this host cannot
> > access MMIO ranges of other devices, or should you exclude this range from
> > the dma-ranges property?
> >
> 
> As I understand, the mem entries in 'ranges' are for controller to
> access remote PCIe devices (EP)  BAR; while the entries in
> 'dma-ranges'
> are for remote PCIe devices (EP) to access controller memory. So they
> are different. Or are you asking about something else?

Your understanding is correct, but from the perspective of a PCI device
the two are the same: when a device issues a bus master transaction,
the target can either be an MMIO BAR on the same PCI host, or it can
get turned into a DMA that ends up in RAM. The way you list the ranges,
it is a bit ambiguous about what happens when a transaction is targetted
at address 0x80000000: does your PCI host send that to a device with a
BAR at that address (CPU address 0xe1.80000000), or upstream to the
parent bus for CPU address 0x00.80000000?
> 
> > Style-wise, it would be nice to submit an extra patch that groups entries
> > in the ranges and reg lists like
> >
> >                         ranges = <0x01000000 0x00 0x00000000 0xe0 0x10000000 0x00 0x00010000>, /* io */
> >                                  <0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000>, /* mem */
> >                                  <0x02000000 0x10 0x00000000 0xf0 0x00000000 0x10 0x00000000>; /* mem */
> >
> > but that is just a cosmetic change and should be kept separate.
> 
> Would you mind if I make this change as well and include in next patch?

It's normally better to keep cleanups and functional changes separate, so
better do two patches.

> >> @@ -321,8 +322,11 @@ static int xgene_pcie_map_ranges(struct xgene_pcie_port *port,
> >>                               return ret;
> >>                       break;
> >>               case IORESOURCE_MEM:
> >> -                     xgene_pcie_setup_ob_reg(port, res, OMR1BARL, res->start,
> >> +                     xgene_pcie_setup_ob_reg(port, res,
> >> +                                             OMR1BARL + (omr_idx * 0x18),
> >> +                                             res->start,
> >>                                               res->start - window->offset);
> >> +                     omr_idx++;
> >>                       break;
> >>               case IORESOURCE_BUS:
> >>                       break;
> >
> > Can you describe what happens when you boot an older kernel with the new
> > dt file, or vice versa? Will that simply ignore the first ranges entries
> > and use only the last one, or does it break in interesting ways?
> 
> Older kernel and new dtb: the new entry in 'ranges' will overwrite the
> old entry, so device that requires 32-bit BAR will break.
> New kernel and old dtb: works fine but without huge bar support.

Ok, so the patches cannot trivially get merged through separate trees,
but have to be ordered. Please split up the driver change from the
DT change anyway, and then we can decide whether to take both through
arm-soc or the PCI tree.

Regarding the functional change, I suppose you should take prefetchable/
non-prefetchable settings into account here. My guess is that the first
window is always non-prefetchable, while the second one is always
prefetchable. If that is the case, it would be more robust to replace
the counter with another conditional that looks at the IORESOURCE_PREFETCH
flag to decide which register to program.

If you have more than two windows, better do both and use a total of
four windows for all combinations of 32/64 bit addressing and prefetch.

	Arnd
Arnd Bergmann June 26, 2015, 8:43 p.m. UTC | #6
On Friday 26 June 2015 22:35:08 Arnd Bergmann wrote:
> > >
> > > Can you configure one of the two windows as prefetchable for improved performance?
> > >
> > The new huge window will be prefetchable.
> 
> The DT marks it as non-prefetchable, so the kernel will now try to allocate
> space for non-prefetchable BARs from this area, which is a bug. If the second
> memory window is always prefetchable, you have to mark it accordingly in DT,
> to prevent it from being used for normal 64 bit BARs.

Forget what I wrote here, I just remembered that 64-bit BARs are by definition
prefetchable. You should still mark the range accordingly in DT for correct
operation, but it won't be used for non-prefetchable 64-bit BARs if you don't
because they do not exist.

	Arnd
Duc Dang June 26, 2015, 9:56 p.m. UTC | #7
On Fri, Jun 26, 2015 at 1:35 PM, Arnd Bergmann <arnd@arndb.de> wrote:
> On Friday 26 June 2015 11:56:47 Duc Dang wrote:
>> Hi Arnd,
>>
>> On Fri, Jun 26, 2015 at 12:59 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>> > On Thursday 25 June 2015 18:05:56 Duc Dang wrote:
>> >> X-Gene PCIe controllers support huge outbound BARs (with size upto
>> >> 64GB). This patch configures additional 1 outbound BAR for X-Gene
>> >> PCIe controllers with size larger than 4GB. This is required to
>> >> support devices that request huge outbound memory (nVidia K40 as an
>> >> example)
>> >>
>> >> Signed-off-by: Duc Dang <dhdang@apm.com>
>> >> Signed-off-by: Tanmay Inamdar <tinamdar@apm.com>
>> >> ---
>> >>  arch/arm64/boot/dts/apm/apm-storm.dtsi | 33 +++++++++++++++++++--------------
>> >>  drivers/pci/host/pci-xgene.c           |  6 +++++-
>> >>  2 files changed, 24 insertions(+), 15 deletions(-)
>> >
>> > It seems you do multiple things here:
>> >
>> > - add an entry in the ranges
>> > - move the config space
>> > - fix driver to handle multiple memory ranges
>> >
>> > but your description only mentions the first one. Please submit separate
>> > patches for these and explain for each patch why it is required, so we can
>> > pick them up into the arm-soc and pci git repositories separately, and make
>> > sure that each patch works by itself to guarantee we don't get incompatible
>> > binding changes.
>>
>> Move config space can be in a separate patch. But it is not absolutely
>> necessary to move the configuration space. The reason I change it is
>> to have config space starting from the beginning of SoC PCIe physical
>> address space. I will drop it in next patch.
>>
>> But add 'an entry in the ranges' and 'fix driver to handle multiple
>> memory ranges' should go together as single patch because the old
>> driver will break when it sees multiple ranges.
>>
>> >
>> >> diff --git a/arch/arm64/boot/dts/apm/apm-storm.dtsi b/arch/arm64/boot/dts/apm/apm-storm.dtsi
>> >> index d8f3a1c..039206b 100644
>> >> --- a/arch/arm64/boot/dts/apm/apm-storm.dtsi
>> >> +++ b/arch/arm64/boot/dts/apm/apm-storm.dtsi
>> >> @@ -404,10 +404,11 @@
>> >>                       #size-cells = <2>;
>> >>                       #address-cells = <3>;
>> >>                       reg = < 0x00 0x1f2b0000 0x0 0x00010000   /* Controller registers */
>> >> -                             0xe0 0xd0000000 0x0 0x00040000>; /* PCI config space */
>> >> +                             0xe0 0x00000000 0x0 0x00040000>; /* PCI config space */
>> >>                       reg-names = "csr", "cfg";
>> >>                       ranges = <0x01000000 0x00 0x00000000 0xe0 0x10000000 0x00 0x00010000   /* io */
>> >> -                               0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000>; /* mem */
>> >> +                               0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000   /* mem */
>> >> +                               0x02000000 0x10 0x00000000 0xf0 0x00000000 0x10 0x00000000>; /* mem */
>> >>                       dma-ranges = <0x42000000 0x80 0x00000000 0x80 0x00000000 0x00 0x80000000
>> >>                                     0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>;
>> >>                       interrupt-map-mask = <0x0 0x0 0x0 0x7>;
>> >
>> > What is the reason for picking the 0x10.00000000 address? We normally try
>> > to use an identity mapping (mem_offset=0) on the bus, or fall back to starting
>> > at zero (mem_offset=cpu_offset), but you use seemingly random mem_offset values.
>> >
>>
>> I just randomly pick 0x10 00000000 as it is not used. I will change to
>> use 0xf0 00000000:
>> 0x02000000 0xf0 0x00000000 0xf0 0x00000000 0x10 0x00000000
>
> Ok
>
>> > Could you get the same effect by extending the 0x80000000 mapping to a length
>> > of 58GB (just before the start of the second window)?
>>
>> I don't think so. It will break with the PCIe device that requires 32-bit BAR.
>
> Are you sure? If this is true, maybe this should be fixed in the PCI core.

The controller requires the size of windows must be in power of 2, so
programing the size as 58GB will cause
 transaction failures.

>
>> >
>> > Can you configure one of the two windows as prefetchable for improved performance?
>> >
>> The new huge window will be prefetchable.
>
> The DT marks it as non-prefetchable, so the kernel will now try to allocate
> space for non-prefetchable BARs from this area, which is a bug. If the second
> memory window is always prefetchable, you have to mark it accordingly in DT,
> to prevent it from being used for normal 64 bit BARs.
>
>> > I also notice that the 0x80000000-0xffffffff bus range is list both in
>> > ranges and dma-ranges. Does that mean that devices on this host cannot
>> > access MMIO ranges of other devices, or should you exclude this range from
>> > the dma-ranges property?
>> >
>>
>> As I understand, the mem entries in 'ranges' are for controller to
>> access remote PCIe devices (EP)  BAR; while the entries in
>> 'dma-ranges'
>> are for remote PCIe devices (EP) to access controller memory. So they
>> are different. Or are you asking about something else?
>
> Your understanding is correct, but from the perspective of a PCI device
> the two are the same: when a device issues a bus master transaction,
> the target can either be an MMIO BAR on the same PCI host, or it can
> get turned into a DMA that ends up in RAM. The way you list the ranges,
> it is a bit ambiguous about what happens when a transaction is targetted
> at address 0x80000000: does your PCI host send that to a device with a
> BAR at that address (CPU address 0xe1.80000000), or upstream to the
> parent bus for CPU address 0x00.80000000?

The PCIe devices connecting to this host cannot access MMIO ranges on
other devices, so when the host receives transaction
that targets address  0x80000000 from device, it will translate it to
CPU address 0x80000000.

>>
>> > Style-wise, it would be nice to submit an extra patch that groups entries
>> > in the ranges and reg lists like
>> >
>> >                         ranges = <0x01000000 0x00 0x00000000 0xe0 0x10000000 0x00 0x00010000>, /* io */
>> >                                  <0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000>, /* mem */
>> >                                  <0x02000000 0x10 0x00000000 0xf0 0x00000000 0x10 0x00000000>; /* mem */
>> >
>> > but that is just a cosmetic change and should be kept separate.
>>
>> Would you mind if I make this change as well and include in next patch?
>
> It's normally better to keep cleanups and functional changes separate, so
> better do two patches.

Sure, I will post a cleanup patch later.

>
>> >> @@ -321,8 +322,11 @@ static int xgene_pcie_map_ranges(struct xgene_pcie_port *port,
>> >>                               return ret;
>> >>                       break;
>> >>               case IORESOURCE_MEM:
>> >> -                     xgene_pcie_setup_ob_reg(port, res, OMR1BARL, res->start,
>> >> +                     xgene_pcie_setup_ob_reg(port, res,
>> >> +                                             OMR1BARL + (omr_idx * 0x18),
>> >> +                                             res->start,
>> >>                                               res->start - window->offset);
>> >> +                     omr_idx++;
>> >>                       break;
>> >>               case IORESOURCE_BUS:
>> >>                       break;
>> >
>> > Can you describe what happens when you boot an older kernel with the new
>> > dt file, or vice versa? Will that simply ignore the first ranges entries
>> > and use only the last one, or does it break in interesting ways?
>>
>> Older kernel and new dtb: the new entry in 'ranges' will overwrite the
>> old entry, so device that requires 32-bit BAR will break.
>> New kernel and old dtb: works fine but without huge bar support.
>
> Ok, so the patches cannot trivially get merged through separate trees,
> but have to be ordered. Please split up the driver change from the
> DT change anyway, and then we can decide whether to take both through
> arm-soc or the PCI tree.

I will do that.
>
> Regarding the functional change, I suppose you should take prefetchable/
> non-prefetchable settings into account here. My guess is that the first
> window is always non-prefetchable, while the second one is always
> prefetchable. If that is the case, it would be more robust to replace
> the counter with another conditional that looks at the IORESOURCE_PREFETCH
> flag to decide which register to program.
>
I will look into using IORESOURCE_PREFETCH flag in next patch.

> If you have more than two windows, better do both and use a total of
> four windows for all combinations of 32/64 bit addressing and prefetch.
>
>         Arnd
Duc Dang June 30, 2015, 6:22 p.m. UTC | #8
This patch set adds 1 large (up to 64GB) memory window for each PCIe
controller nodes in X-Gene device tree and fix PCIe controller driver
to handle multiple memory ranges correctly. These changes are required
to support PCIe devices that has huge BAR.

v2 changes:
	1. Separate device-tree changes and driver changes into different
	patches
	2. Explicitly define new large window as 64-bit prefetchable in dts
	3. Use IORESOURCE_PREFETCH flag to determine which PCIe controller
	register to be used to configure the memory ranges.
 
 arch/arm64/boot/dts/apm/apm-storm.dtsi | 23 ++++++++++++++---------
 drivers/pci/host/pci-xgene.c           | 12 ++++++++++--
 2 files changed, 24 insertions(+), 11 deletions(-)
Duc Dang July 6, 2015, 11:28 p.m. UTC | #9
On Tue, Jun 30, 2015 at 11:22 AM, Duc Dang <dhdang@apm.com> wrote:
> This patch set adds 1 large (up to 64GB) memory window for each PCIe
> controller nodes in X-Gene device tree and fix PCIe controller driver
> to handle multiple memory ranges correctly. These changes are required
> to support PCIe devices that has huge BAR.
>
> v2 changes:
>         1. Separate device-tree changes and driver changes into different
>         patches
>         2. Explicitly define new large window as 64-bit prefetchable in dts
>         3. Use IORESOURCE_PREFETCH flag to determine which PCIe controller
>         register to be used to configure the memory ranges.
>
>  arch/arm64/boot/dts/apm/apm-storm.dtsi | 23 ++++++++++++++---------
>  drivers/pci/host/pci-xgene.c           | 12 ++++++++++--
>  2 files changed, 24 insertions(+), 11 deletions(-)

Hi Arnd, Bjorn,

Do you have additional comment on this v2 patch set?

>
> --
> 1.9.1
>
Arnd Bergmann July 9, 2015, 11:47 a.m. UTC | #10
On Monday 06 July 2015 16:28:43 Duc Dang wrote:
> On Tue, Jun 30, 2015 at 11:22 AM, Duc Dang <dhdang@apm.com> wrote:
> > This patch set adds 1 large (up to 64GB) memory window for each PCIe
> > controller nodes in X-Gene device tree and fix PCIe controller driver
> > to handle multiple memory ranges correctly. These changes are required
> > to support PCIe devices that has huge BAR.
> >
> > v2 changes:
> >         1. Separate device-tree changes and driver changes into different
> >         patches
> >         2. Explicitly define new large window as 64-bit prefetchable in dts
> >         3. Use IORESOURCE_PREFETCH flag to determine which PCIe controller
> >         register to be used to configure the memory ranges.
> >
> >  arch/arm64/boot/dts/apm/apm-storm.dtsi | 23 ++++++++++++++---------
> >  drivers/pci/host/pci-xgene.c           | 12 ++++++++++--
> >  2 files changed, 24 insertions(+), 11 deletions(-)
> 
> Hi Arnd, Bjorn,
> 
> Do you have additional comment on this v2 patch set?
> 
> 

The changes look ok to me now, but I'd mention in the changelog about
the fact that one of the windows is prefetchable and the other one
is not, and that only prefetchable BARs are now using the 64-bit
window.

	Arnd
Duc Dang July 9, 2015, 9:20 p.m. UTC | #11
This patch set adds 1 large (up to 64GB) memory window for each PCIe
controller nodes in X-Gene device tree and fix PCIe controller driver
to handle multiple memory ranges correctly. These changes are required
to support PCIe devices that have huge BAR.

v3 changes:
	1. Explicitly mention in change log that: Each PCIe node in dts 
	will have 1 32-bit non-prefetchable memory window and 1 64-bit 
	prefetchable memory window.

v2 changes:
        1. Separate device-tree changes and driver changes into different
        patches
        2. Explicitly define new large window as 64-bit prefetchable in dts.
        3. Use IORESOURCE_PREFETCH flag to determine which PCIe controller
        register to be used to configure the memory ranges.
 
 arch/arm64/boot/dts/apm/apm-storm.dtsi | 23 ++++++++++++++---------
 drivers/pci/host/pci-xgene.c           | 12 ++++++++++--
 2 files changed, 24 insertions(+), 11 deletions(-)
Duc Dang July 9, 2015, 9:24 p.m. UTC | #12
On Thu, Jul 9, 2015 at 4:47 AM, Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Monday 06 July 2015 16:28:43 Duc Dang wrote:
> > On Tue, Jun 30, 2015 at 11:22 AM, Duc Dang <dhdang@apm.com> wrote:
> > > This patch set adds 1 large (up to 64GB) memory window for each PCIe
> > > controller nodes in X-Gene device tree and fix PCIe controller driver
> > > to handle multiple memory ranges correctly. These changes are required
> > > to support PCIe devices that has huge BAR.
> > >
> > > v2 changes:
> > >         1. Separate device-tree changes and driver changes into different
> > >         patches
> > >         2. Explicitly define new large window as 64-bit prefetchable in dts
> > >         3. Use IORESOURCE_PREFETCH flag to determine which PCIe controller
> > >         register to be used to configure the memory ranges.
> > >
> > >  arch/arm64/boot/dts/apm/apm-storm.dtsi | 23 ++++++++++++++---------
> > >  drivers/pci/host/pci-xgene.c           | 12 ++++++++++--
> > >  2 files changed, 24 insertions(+), 11 deletions(-)
> >
> > Hi Arnd, Bjorn,
> >
> > Do you have additional comment on this v2 patch set?
> >
> >
>
> The changes look ok to me now, but I'd mention in the changelog about
> the fact that one of the windows is prefetchable and the other one
> is not, and that only prefetchable BARs are now using the 64-bit
> window.

Thanks, Arnd.

I posted v3 patch to clarify each PCIe node will have 2 memory windows:
1 non-prefetchable 32-bit memory window and
1 prefetchable 64-bit window

>
>         Arnd
> --
> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas July 21, 2015, 4:14 p.m. UTC | #13
On Thu, Jul 09, 2015 at 02:20:10PM -0700, Duc Dang wrote:
> This patch set adds 1 large (up to 64GB) memory window for each PCIe
> controller nodes in X-Gene device tree and fix PCIe controller driver
> to handle multiple memory ranges correctly. These changes are required
> to support PCIe devices that have huge BAR.
> 
> v3 changes:
> 	1. Explicitly mention in change log that: Each PCIe node in dts 
> 	will have 1 32-bit non-prefetchable memory window and 1 64-bit 
> 	prefetchable memory window.
> 
> v2 changes:
>         1. Separate device-tree changes and driver changes into different
>         patches
>         2. Explicitly define new large window as 64-bit prefetchable in dts.
>         3. Use IORESOURCE_PREFETCH flag to determine which PCIe controller
>         register to be used to configure the memory ranges.
>  
>  arch/arm64/boot/dts/apm/apm-storm.dtsi | 23 ++++++++++++++---------
>  drivers/pci/host/pci-xgene.c           | 12 ++++++++++--
>  2 files changed, 24 insertions(+), 11 deletions(-)

Applied to pci/host-xgene for v4.3, with changelogs as follows, thanks!


commit 80bb3eda7475eb38b688d2e915c191ce6ad20df1
Author: Duc Dang <dhdang@apm.com>
Date:   Thu Jul 9 14:20:11 2015 -0700

    arm64: dts: Add APM X-Gene PCIe 64-bit prefetchable window
    
    Add a large window (up to 64GB) for X-Gene PCIe nodes to support devices
    that require huge BARs.
    
    Each X-Gene PCIe node will now have two memory windows: a 32-bit
    non-prefetchable window and a 64-bit prefetchable window.
    
    [bhelgaas: changelog]
    Signed-off-by: Duc Dang <dhdang@apm.com>
    Signed-off-by: Tanmay Inamdar <tinamdar@apm.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

commit 8ef54f27f6f425fa8deedc237d99b9daf41d68d2
Author: Duc Dang <dhdang@apm.com>
Date:   Thu Jul 9 14:20:12 2015 -0700

    PCI: xgene: Add support for a 64-bit prefetchable memory window
    
    X-Gene PCIe controller has registers to support multiple memory ranges.
    
    Add support for a 64-bit prefetchable memory window.
    
    [bhelgaas: changelog]
    Signed-off-by: Duc Dang <dhdang@apm.com>
    Signed-off-by: Tanmay Inamdar <tinamdar@apm.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
Duc Dang July 21, 2015, 6:11 p.m. UTC | #14
On Tue, Jul 21, 2015 at 9:14 AM, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Thu, Jul 09, 2015 at 02:20:10PM -0700, Duc Dang wrote:
>> This patch set adds 1 large (up to 64GB) memory window for each PCIe
>> controller nodes in X-Gene device tree and fix PCIe controller driver
>> to handle multiple memory ranges correctly. These changes are required
>> to support PCIe devices that have huge BAR.
>>
>> v3 changes:
>>       1. Explicitly mention in change log that: Each PCIe node in dts
>>       will have 1 32-bit non-prefetchable memory window and 1 64-bit
>>       prefetchable memory window.
>>
>> v2 changes:
>>         1. Separate device-tree changes and driver changes into different
>>         patches
>>         2. Explicitly define new large window as 64-bit prefetchable in dts.
>>         3. Use IORESOURCE_PREFETCH flag to determine which PCIe controller
>>         register to be used to configure the memory ranges.
>>
>>  arch/arm64/boot/dts/apm/apm-storm.dtsi | 23 ++++++++++++++---------
>>  drivers/pci/host/pci-xgene.c           | 12 ++++++++++--
>>  2 files changed, 24 insertions(+), 11 deletions(-)
>
> Applied to pci/host-xgene for v4.3, with changelogs as follows, thanks!

Thanks a lot, Bjorn.

>
>
> commit 80bb3eda7475eb38b688d2e915c191ce6ad20df1
> Author: Duc Dang <dhdang@apm.com>
> Date:   Thu Jul 9 14:20:11 2015 -0700
>
>     arm64: dts: Add APM X-Gene PCIe 64-bit prefetchable window
>
>     Add a large window (up to 64GB) for X-Gene PCIe nodes to support devices
>     that require huge BARs.
>
>     Each X-Gene PCIe node will now have two memory windows: a 32-bit
>     non-prefetchable window and a 64-bit prefetchable window.
>
>     [bhelgaas: changelog]
>     Signed-off-by: Duc Dang <dhdang@apm.com>
>     Signed-off-by: Tanmay Inamdar <tinamdar@apm.com>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
>
> commit 8ef54f27f6f425fa8deedc237d99b9daf41d68d2
> Author: Duc Dang <dhdang@apm.com>
> Date:   Thu Jul 9 14:20:12 2015 -0700
>
>     PCI: xgene: Add support for a 64-bit prefetchable memory window
>
>     X-Gene PCIe controller has registers to support multiple memory ranges.
>
>     Add support for a 64-bit prefetchable memory window.
>
>     [bhelgaas: changelog]
>     Signed-off-by: Duc Dang <dhdang@apm.com>
>     Signed-off-by: Tanmay Inamdar <tinamdar@apm.com>
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
diff mbox

Patch

diff --git a/arch/arm64/boot/dts/apm/apm-storm.dtsi b/arch/arm64/boot/dts/apm/apm-storm.dtsi
index d8f3a1c..039206b 100644
--- a/arch/arm64/boot/dts/apm/apm-storm.dtsi
+++ b/arch/arm64/boot/dts/apm/apm-storm.dtsi
@@ -404,10 +404,11 @@ 
 			#size-cells = <2>;
 			#address-cells = <3>;
 			reg = < 0x00 0x1f2b0000 0x0 0x00010000   /* Controller registers */
-				0xe0 0xd0000000 0x0 0x00040000>; /* PCI config space */
+				0xe0 0x00000000 0x0 0x00040000>; /* PCI config space */
 			reg-names = "csr", "cfg";
 			ranges = <0x01000000 0x00 0x00000000 0xe0 0x10000000 0x00 0x00010000   /* io */
-				  0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000>; /* mem */
+				  0x02000000 0x00 0x80000000 0xe1 0x80000000 0x00 0x80000000   /* mem */
+				  0x02000000 0x10 0x00000000 0xf0 0x00000000 0x10 0x00000000>; /* mem */
 			dma-ranges = <0x42000000 0x80 0x00000000 0x80 0x00000000 0x00 0x80000000
 				      0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>;
 			interrupt-map-mask = <0x0 0x0 0x0 0x7>;
@@ -428,10 +429,11 @@ 
 			#size-cells = <2>;
 			#address-cells = <3>;
 			reg = < 0x00 0x1f2c0000 0x0 0x00010000   /* Controller registers */
-				0xd0 0xd0000000 0x0 0x00040000>; /* PCI config space */
+				0xd0 0x00000000 0x0 0x00040000>; /* PCI config space */
 			reg-names = "csr", "cfg";
-			ranges = <0x01000000 0x0 0x00000000 0xd0 0x10000000 0x00 0x00010000   /* io  */
-				  0x02000000 0x0 0x80000000 0xd1 0x80000000 0x00 0x80000000>; /* mem */
+			ranges = <0x01000000 0x00 0x00000000 0xd0 0x10000000 0x00 0x00010000   /* io  */
+				  0x02000000 0x00 0x80000000 0xd1 0x80000000 0x00 0x80000000   /* mem */
+				  0x02000000 0x08 0x00000000 0xd8 0x00000000 0x08 0x00000000>; /* mem */
 			dma-ranges = <0x42000000 0x80 0x00000000 0x80 0x00000000 0x00 0x80000000
 				      0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>;
 			interrupt-map-mask = <0x0 0x0 0x0 0x7>;
@@ -452,10 +454,11 @@ 
 			#size-cells = <2>;
 			#address-cells = <3>;
 			reg =  < 0x00 0x1f2d0000 0x0 0x00010000   /* Controller registers */
-				 0x90 0xd0000000 0x0 0x00040000>; /* PCI config space */
+				 0x90 0x00000000 0x0 0x00040000>; /* PCI config space */
 			reg-names = "csr", "cfg";
-			ranges = <0x01000000 0x0 0x00000000 0x90 0x10000000 0x0 0x00010000   /* io  */
-				  0x02000000 0x0 0x80000000 0x91 0x80000000 0x0 0x80000000>; /* mem */
+			ranges = <0x01000000 0x00 0x00000000 0x90 0x10000000 0x00 0x00010000   /* io  */
+				  0x02000000 0x00 0x80000000 0x91 0x80000000 0x00 0x80000000   /* mem */
+				  0x02000000 0x04 0x00000000 0x94 0x00000000 0x04 0x00000000>; /* mem */
 			dma-ranges = <0x42000000 0x80 0x00000000 0x80 0x00000000 0x00 0x80000000
 				      0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>;
 			interrupt-map-mask = <0x0 0x0 0x0 0x7>;
@@ -476,10 +479,11 @@ 
 			#size-cells = <2>;
 			#address-cells = <3>;
 			reg = < 0x00 0x1f500000 0x0 0x00010000   /* Controller registers */
-				0xa0 0xd0000000 0x0 0x00040000>; /* PCI config space */
+				0xa0 0x00000000 0x0 0x00040000>; /* PCI config space */
 			reg-names = "csr", "cfg";
-			ranges = <0x01000000 0x0 0x00000000 0xa0 0x10000000 0x0 0x00010000   /* io   */
-				  0x02000000 0x0 0x80000000 0xa1 0x80000000 0x0 0x80000000>; /* mem  */
+			ranges = <0x01000000 0x00 0x00000000 0xa0 0x10000000 0x00 0x00010000   /* io  */
+				  0x02000000 0x00 0x80000000 0xa1 0x80000000 0x00 0x80000000   /* mem */
+				  0x02000000 0x10 0x00000000 0xb0 0x00000000 0x10 0x00000000>; /* mem */
 			dma-ranges = <0x42000000 0x80 0x00000000 0x80 0x00000000 0x00 0x80000000
 				      0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>;
 			interrupt-map-mask = <0x0 0x0 0x0 0x7>;
@@ -500,10 +504,11 @@ 
 			#size-cells = <2>;
 			#address-cells = <3>;
 			reg = < 0x00 0x1f510000 0x0 0x00010000   /* Controller registers */
-				0xc0 0xd0000000 0x0 0x00200000>; /* PCI config space */
+				0xc0 0x00000000 0x0 0x00040000>; /* PCI config space */
 			reg-names = "csr", "cfg";
-			ranges = <0x01000000 0x0 0x00000000 0xc0 0x10000000 0x0 0x00010000   /* io  */
-				  0x02000000 0x0 0x80000000 0xc1 0x80000000 0x0 0x80000000>; /* mem */
+			ranges = <0x01000000 0x00 0x00000000 0xc0 0x10000000 0x00 0x00010000   /* io  */
+				  0x02000000 0x00 0x80000000 0xc1 0x80000000 0x00 0x80000000   /* mem */
+				  0x02000000 0x08 0x00000000 0xc8 0x00000000 0x08 0x00000000>; /* mem */
 			dma-ranges = <0x42000000 0x80 0x00000000 0x80 0x00000000 0x00 0x80000000
 				      0x42000000 0x00 0x00000000 0x00 0x00000000 0x80 0x00000000>;
 			interrupt-map-mask = <0x0 0x0 0x0 0x7>;
diff --git a/drivers/pci/host/pci-xgene.c b/drivers/pci/host/pci-xgene.c
index a9dfb70..62d7843 100644
--- a/drivers/pci/host/pci-xgene.c
+++ b/drivers/pci/host/pci-xgene.c
@@ -305,6 +305,7 @@  static int xgene_pcie_map_ranges(struct xgene_pcie_port *port,
 	struct resource_entry *window;
 	struct device *dev = port->dev;
 	int ret;
+	u32 omr_idx = 0;
 
 	resource_list_for_each_entry(window, res) {
 		struct resource *res = window->res;
@@ -321,8 +322,11 @@  static int xgene_pcie_map_ranges(struct xgene_pcie_port *port,
 				return ret;
 			break;
 		case IORESOURCE_MEM:
-			xgene_pcie_setup_ob_reg(port, res, OMR1BARL, res->start,
+			xgene_pcie_setup_ob_reg(port, res,
+						OMR1BARL + (omr_idx * 0x18),
+						res->start,
 						res->start - window->offset);
+			omr_idx++;
 			break;
 		case IORESOURCE_BUS:
 			break;