[9/9,DNI] ARM: multi_v7_defconfig: Enable CONFIG_ARM_LPAE for multi_v7_config

Message ID 20230119144236.3541751-10-alexander.stein@ew.tq-group.com (mailing list archive)
State Superseded
Series TQMLS1021A support

Commit Message

Alexander Stein Jan. 19, 2023, 2:42 p.m. UTC
This is necessary to support PCIe on LS1021A.

Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
---
 arch/arm/configs/multi_v7_defconfig | 1 +
 1 file changed, 1 insertion(+)

Comments

Arnd Bergmann Jan. 19, 2023, 3:09 p.m. UTC | #1
On Thu, Jan 19, 2023, at 15:42, Alexander Stein wrote:
> This is necessary to support PCIe on LS1021A.
>
> Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>

Can you explain why this is actually required? I can see that the
ranges in the PCIe device point to a high address (0x4000000000,
2^40), but I can't tell if this is hardwired in the SoC or a
setting that is applied by software (either the bootloader or
the PCIe driver).

If you can reprogram the memory map, I would expect this to fit
easily into the 32-bit address space, with 1GB for DDR3 memory
and 1GB for PCIe BARs.

I don't mind having a defconfig with LPAE enabled, I think this
can be done using a Makefile target that applies a config
fragment on top of the normal multi_v7_defconfig, you can find
some examples in arch/powerpc/configs/*.config.
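The idea of a config fragment applied on top of a defconfig can be sketched as follows. This is only an illustration of the "fragment wins" semantics, not the kernel's actual merge tooling; the function names are hypothetical, and a real build would still need Kconfig to resolve dependencies afterwards.

```python
import re

# Matches "CONFIG_FOO=value" and "# CONFIG_FOO is not set" lines.
_SET = re.compile(r"^(CONFIG_\w+)=")
_UNSET = re.compile(r"^# (CONFIG_\w+) is not set$")


def option_name(line):
    """Return the option a config line refers to, or None."""
    m = _SET.match(line) or _UNSET.match(line)
    return m.group(1) if m else None


def apply_fragment(base, fragment):
    """Return base config text with every option in fragment overriding it."""
    overrides = {}
    for line in fragment.splitlines():
        name = option_name(line)
        if name:
            overrides[name] = line
    kept = [l for l in base.splitlines() if option_name(l) not in overrides]
    return "\n".join(kept + list(overrides.values())) + "\n"


if __name__ == "__main__":
    base = "CONFIG_SMP=y\n# CONFIG_ARM_LPAE is not set\n"
    print(apply_fragment(base, "CONFIG_ARM_LPAE=y\n"), end="")
```

The base defconfig stays untouched for CPUs without LPAE, while the fragment carries only the delta.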

   Arnd
Alexander Stein Jan. 19, 2023, 3:27 p.m. UTC | #2
Hi Arnd,

thanks for the fast response.

On Thursday, 19 January 2023, 16:09:05 CET, Arnd Bergmann wrote:
> On Thu, Jan 19, 2023, at 15:42, Alexander Stein wrote:
> > This is necessary to support PCIe on LS1021A.
> > 
> > Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
> 
> Can you explain why this is actually required? I can see that the
> ranges in the PCIe device point to a high address (0x4000000000,
> 2^40), but I can't tell if this is hardwired in the SoC or a
> setting that is applied by software (either the bootloader or
> the PCIe driver).

The RM ([1]) memory map (Table 2-1) says that 'PCI Express 1' is located at 
'400000_0000', 'PCI Express 2' at '480000_0000', so I assume this is hardcoded 
in SoC.
It also explicitly lists in that table PCIe 1&2 is only accessible with 40-bit 
addressing.
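A quick arithmetic check of the two base addresses quoted from the memory map shows why they cannot be reached with 32-bit physical addressing. This is a minimal sketch; the dictionary and helper names are illustrative, and the addresses are the ones given in this thread.

```python
# Base addresses as quoted from the reference manual's memory map.
PCIE_WINDOWS = {
    "PCI Express 1": 0x40_0000_0000,
    "PCI Express 2": 0x48_0000_0000,
}


def bits_needed(addr):
    """Number of address bits required to express addr."""
    return addr.bit_length()


def fits_in(addr, width):
    """True if addr is representable in a width-bit physical address."""
    return addr < (1 << width)


if __name__ == "__main__":
    for name, base in PCIE_WINDOWS.items():
        print(f"{name}: base={base:#x} bits={bits_needed(base)} "
              f"fits32={fits_in(base, 32)} fits40={fits_in(base, 40)}")
```

Both bases need more than 32 address bits but fit within the 40-bit physical address space that LPAE provides.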

> If you can reprogram the memory map, I would expect this to fit
> easily into the 32-bit address space, with 1GB for DDR3 memory
> and 1GB for PCIe BARs.

I'm not sure which part of memory map you can reprogram and where, but I guess 
this is fixed on this SoC.

> I don't mind having a defconfig with LPAE enabled, I think this
> can be done using a Makefile target that applies a config
> fragment on top of the normal multi_v7_defconfig, you can find
> some examples in arch/powerpc/configs/*.config.

Ah, nice. This can be a good starter. Thanks.

Best regards,
Alexander

[1] https://www.nxp.com/webapp/Download?colCode=LS1021ARM
Russell King (Oracle) Jan. 19, 2023, 4 p.m. UTC | #3
On Thu, Jan 19, 2023 at 03:42:36PM +0100, Alexander Stein wrote:
> This is necessary to support PCIe on LS1021A.
> 
> Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
> ---
>  arch/arm/configs/multi_v7_defconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/arch/arm/configs/multi_v7_defconfig b/arch/arm/configs/multi_v7_defconfig
> index 441a449172368..f0757f05ec2c2 100644
> --- a/arch/arm/configs/multi_v7_defconfig
> +++ b/arch/arm/configs/multi_v7_defconfig
> @@ -105,6 +105,7 @@ CONFIG_ARCH_VEXPRESS=y
>  CONFIG_ARCH_VEXPRESS_TC2_PM=y
>  CONFIG_ARCH_WM8850=y
>  CONFIG_ARCH_ZYNQ=y
> +CONFIG_ARM_LPAE=y
>  CONFIG_SMP=y
>  CONFIG_NR_CPUS=16
>  CONFIG_ARM_APPENDED_DTB=y

Enabling LPAE will break multi_v7 on CPUs that do not support LPAE,
such as Cortex A9, rendering iMX6 platforms unbootable with this
defconfig.
Arnd Bergmann Jan. 19, 2023, 4:07 p.m. UTC | #4
On Thu, Jan 19, 2023, at 16:27, Alexander Stein wrote:
> On Thursday, 19 January 2023, 16:09:05 CET, Arnd Bergmann wrote:
>> On Thu, Jan 19, 2023, at 15:42, Alexander Stein wrote:
>> > This is necessary to support PCIe on LS1021A.
>> > 
>> > Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
>> 
>> Can you explain why this is actually required? I can see that the
>> ranges in the PCIe device point to a high address (0x4000000000,
>> 2^40), but I can't tell if this is hardwired in the SoC or a
>> setting that is applied by software (either the bootloader or
>> the PCIe driver).
>
> The RM ([1]) memory map (Table 2-1) says that 'PCI Express 1' is located at 
> '400000_0000', 'PCI Express 2' at '480000_0000', so I assume this is hardcoded 
> in SoC.
> It also explicitly lists in that table PCIe 1&2 is only accessible with 40-bit 
> addressing.
>
>> If you can reprogram the memory map, I would expect this to fit
>> easily into the 32-bit address space, with 1GB for DDR3 memory
>> and 1GB for PCIe BARs.
>
> I'm not sure which part of memory map you can reprogram and where, but I guess 
> this is fixed on this SoC.

Ok, I see it now. It looks like they fell victim to the 
cursed "Principles of ARM® Memory Maps White Paper"
document and messed it up even further ;-)

In particular, it seems that the memory map of the PCI address
spaces is configurable, but only within that area you listed.
I see that section "28.4.2 PEX register descriptions" does list
a 64-bit prefetchable address space in addition to the 32-bit
non-prefetchable memory space, but the 64-bit space is not
listed in the DT. It would be a good idea to configure that
as well in order for devices to work that need a larger BAR,
such as a GPU, but it wouldn't help with fitting the PCIe
into non-LPAE 32-bit CPU address space.

In the datasheet I also see that the chip theoretically
supports 8GB of DDR4, which would definitely put it beyond
the highmem limit, even with the 4G:4G memory split. Do you
know if there are ls1021a devices with more than 4GB of
installed memory?

    Arnd
Alexander Stein Jan. 20, 2023, 12:43 p.m. UTC | #5
Hi Arnd,

On Thursday, 19 January 2023, 17:07:30 CET, Arnd Bergmann wrote:
> On Thu, Jan 19, 2023, at 16:27, Alexander Stein wrote:
> > On Thursday, 19 January 2023, 16:09:05 CET, Arnd Bergmann wrote:
> >> On Thu, Jan 19, 2023, at 15:42, Alexander Stein wrote:
> >> > This is necessary to support PCIe on LS1021A.
> >> > 
> >> > Signed-off-by: Alexander Stein <alexander.stein@ew.tq-group.com>
> >> 
> >> Can you explain why this is actually required? I can see that the
> >> ranges in the PCIe device point to a high address (0x4000000000,
> >> 2^40), but I can't tell if this is hardwired in the SoC or a
> >> setting that is applied by software (either the bootloader or
> >> the PCIe driver).
> > 
> > The RM ([1]) memory map (Table 2-1) says that 'PCI Express 1' is located
> > at
> > '400000_0000', 'PCI Express 2' at '480000_0000', so I assume this is
> > hardcoded in SoC.
> > It also explicitly lists in that table PCIe 1&2 is only accessible with
> > 40-bit addressing.
> > 
> >> If you can reprogram the memory map, I would expect this to fit
> >> easily into the 32-bit address space, with 1GB for DDR3 memory
> >> and 1GB for PCIe BARs.
> > 
> > I'm not sure which part of memory map you can reprogram and where, but I
> > guess this is fixed on this SoC.
> 
> Ok, I see it now. It looks like they fell victim to the
> cursed "Principles of ARM® Memory Maps White Paper"
> document and messed it up even further ;-)
> 
> In particular, it seems that the memory map of the PCI address
> spaces is configurable, but only within that area you listed.
> I see that section "28.4.2 PEX register descriptions" does list
> a 64-bit prefetchable address space in addition to the 32-bit
> non-prefetchable memory space, but the 64-bit space is not
> listed in the DT. It would be a good idea to configure that
> as well in order for devices to work that need a larger BAR,
> such as a GPU, but it wouldn't help with fitting the PCIe
> into non-LPAE 32-bit CPU address space.

I'm not sure if I can follow you here. Do you have some keywords of what's 
missing there?

> In the datasheet I also see that the chip theoretically
> supports 8GB of DDR4, which would definitely put it beyond
> the highmem limit, even with the 4G:4G memory split. Do you
> know if there are ls1021a devices with more than 4GB of
> installed memory?

Where did you find those 8GB? Section 16.2 mentions it supports up to 4 banks/
chip-selects which I would assume is much more. Also the memory map has a DRAM 
region 2 for memory region 2-32GB. But yes this exceeds 32bit addressing.
I'm not aware of ls1021 devices with more than 4GB memory. Our modules only 
support up to 2GB.

Best regards,
Alexander
Arnd Bergmann Jan. 20, 2023, 2 p.m. UTC | #6
On Fri, Jan 20, 2023, at 13:43, Alexander Stein wrote:
> On Thursday, 19 January 2023, 17:07:30 CET, Arnd Bergmann wrote:
>> On Thu, Jan 19, 2023, at 16:27, Alexander Stein wrote:
>> 
>> In particular, it seems that the memory map of the PCI address
>> spaces is configurable, but only within that area you listed.
>> I see that section "28.4.2 PEX register descriptions" does list
>> a 64-bit prefetchable address space in addition to the 32-bit
>> non-prefetchable memory space, but the 64-bit space is not
>> listed in the DT. It would be a good idea to configure that
>> as well in order for devices to work that need a larger BAR,
>> such as a GPU, but it wouldn't help with fitting the PCIe
>> into non-LPAE 32-bit CPU address space.
>
> I'm not sure if I can follow you here. Do you have some keywords of what's 
> missing there?

Prefetchable_Memory_Base_Register, section 28.4.2.20 in the
document you pointed me to. 

PCIe addressing is usually split up into I/O space (kilobytes of
registers), non-prefetchable memory space (megabytes of registers
and memory) and prefetchable 64-bit memory space (gigabytes of
device memory).

The prefetchable space is indicated by bit '30' of the first
word in the ranges property, so if that is configured, you
would see a third line there starting with 0xc2000000 or
0x42000000. Without this, PCIe cards that have prefetchable
BARs fall back to the non-prefetchable one, which may be
too small or less efficient. This is usually only relevant
for framebuffers on a GPU, but there are probably other
devices as well.
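The flag bits mentioned above can be decoded with a few shifts. This is a minimal sketch that assumes the standard PCI bus binding layout of the first ranges cell (bit 31 non-relocatable, bit 30 prefetchable, bits 25:24 space code); the function and key names are illustrative.

```python
# Space codes from bits 25:24 of the first cell (phys.hi) of a PCI address.
SPACE_NAMES = {0b00: "config", 0b01: "io", 0b10: "mem32", 0b11: "mem64"}


def decode_phys_hi(cell):
    """Decode the flag bits of the first cell of a PCI 'ranges' entry."""
    return {
        "non_relocatable": bool((cell >> 31) & 1),
        "prefetchable": bool((cell >> 30) & 1),
        "space": SPACE_NAMES[(cell >> 24) & 0x3],
    }


if __name__ == "__main__":
    for cell in (0x82000000, 0xC2000000, 0x42000000):
        print(f"{cell:#010x}: {decode_phys_hi(cell)}")
```

Under this encoding both 0xc2000000 and 0x42000000 have the prefetchable bit set with space code 0b10 (32-bit memory); a window in 64-bit memory space would carry space code 0b11, for example 0x43000000.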

>> In the datasheet I also see that the chip theoretically
>> supports 8GB of DDR4, which would definitely put it beyond
>> the highmem limit, even with the 4G:4G memory split. Do you
>> know if there are ls1021a devices with more than 4GB of
>> installed memory?
>
> Where did you find those 8GB? Section 16.2 mentions it supports up to 4 banks/
> chip-selects which I would assume is much more. Also the memory map has a DRAM 
> region 2 for memory region 2-32GB. But yes this exceeds 32bit addressing.
> I'm not aware of ls1021 devices with more than 4GB memory. Our modules only 
> support up to 2GB.

I think I misread this, as section 2.2 mentions you can have
four chip-selects that are limited to either 2GB or 8GB each,
for a theoretical maximum of 26GB. As long as the practical
limit is 4GB or less, I think we're fine here. Linus Walleij
is working on a prototype for changing the memory
management code to handle up to 4GB of contiguous RAM without
highmem, which will become relevant in the future as we get
rid of highmem support. On this chip, the first 4GB of
installed memory are not contiguous in the physical address
space, so this will need another set of patches on top.

As long as you only use the first chip-select with 2GB
of installed memory, very little will change for you.

It might be worthwhile to check if your system works
correctly with ARM_LPAE=y, VMSPLIT_2G=y and HIGHMEM=n,
which should be the best configuration for your system
anyway and will keep working after highmem gets removed.
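One way to confirm that a built kernel actually ended up with the suggested combination is to scan the generated .config. The following is a rough sketch; the helper names are hypothetical, and an option missing from the file is treated as disabled.

```python
import re

# The combination suggested above; "n" means "not set".
WANTED = {
    "CONFIG_ARM_LPAE": "y",
    "CONFIG_VMSPLIT_2G": "y",
    "CONFIG_HIGHMEM": "n",
}


def parse_config(text):
    """Parse .config text into {option: value}, mapping 'is not set' to 'n'."""
    opts = {}
    for line in text.splitlines():
        m = re.match(r"^(CONFIG_\w+)=(.*)$", line)
        if m:
            opts[m.group(1)] = m.group(2)
            continue
        m = re.match(r"^# (CONFIG_\w+) is not set$", line)
        if m:
            opts[m.group(1)] = "n"
    return opts


def mismatches(opts, wanted=WANTED):
    """Return {option: actual value} for every option that differs from wanted."""
    return {k: opts.get(k, "n") for k, v in wanted.items() if opts.get(k, "n") != v}


if __name__ == "__main__":
    sample = "CONFIG_ARM_LPAE=y\nCONFIG_VMSPLIT_2G=y\n# CONFIG_HIGHMEM is not set\n"
    print(mismatches(parse_config(sample)) or "configuration matches")
```

In practice the text would come from the .config in the build directory rather than an inline sample.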

    Arnd
Alexander Stein Jan. 24, 2023, 10:30 a.m. UTC | #7
Hi Arnd,

On Friday, 20 January 2023, 15:00:35 CET, Arnd Bergmann wrote:
> On Fri, Jan 20, 2023, at 13:43, Alexander Stein wrote:
> > On Thursday, 19 January 2023, 17:07:30 CET, Arnd Bergmann wrote:
> >> On Thu, Jan 19, 2023, at 16:27, Alexander Stein wrote:
> >> 
> >> In particular, it seems that the memory map of the PCI address
> >> spaces is configurable, but only within that area you listed.
> >> I see that section "28.4.2 PEX register descriptions" does list
> >> a 64-bit prefetchable address space in addition to the 32-bit
> >> non-prefetchable memory space, but the 64-bit space is not
> >> listed in the DT. It would be a good idea to configure that
> >> as well in order for devices to work that need a larger BAR,
> >> such as a GPU, but it wouldn't help with fitting the PCIe
> >> into non-LPAE 32-bit CPU address space.
> > 
> > I'm not sure if I can follow you here. Do you have some keywords of what's
> > missing there?
> 
> Prefetchable_Memory_Base_Register, section 28.4.2.20 in the
> document you pointed me to.
> 
> PCIe addressing is usually split up into I/O space (kilobytes of
> registers), non-prefetchable memory space (megabytes of registers
> and memory) and prefetchable 64-bit memory space (gigabytes of
> device memory).
> 
> The prefetchable space is indicated by bit '30' of the first
> word in the ranges property, so if that is configured, you
> would see a third line there starting with 0xc2000000 or
> 0x42000000. Without this, PCIe cards that have prefetchable
> BARs fall back to the non-prefetchable one, which may be
> too small or less efficient. This is usually only relevant
> for framebuffers on a GPU, but there are probably other
> devices as well.

Thanks for the explanation, although I'm still lacking deeper knowledge how to 
configure PCIe properly.
I tried adding the following line in the 'ranges' property:
> <0xc2000000 0x0 0x20000000 0x40 0x20000000 0x0 0x20000000>, /* prefetchable memory */
which was taken from the old example in
Documentation/devicetree/bindings/pci/layerscape-pci.txt, removed in commit
a3b18f5f1d42e ("dt-bindings: pci: layerscape-pci: define AER/PME interrupts",
2022-03-11).
But I couldn't detect any difference, maybe it's just due to my PCIe devices I 
have available.
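To see what the tried entry describes, its seven cells can be split into PCI address, CPU address and size. This is a sketch assuming three cells for the PCI address and two cells each for the parent address and the size; those cell counts and the function name are assumptions, not taken from the LS1021A device tree.

```python
def parse_pci_range(cells, parent_addr_cells=2, size_cells=2):
    """Split one PCI 'ranges' entry into (phys_hi, pci_addr, cpu_addr, size)."""
    def join(words):
        # Combine 32-bit cells, most significant first, into one integer.
        value = 0
        for w in words:
            value = (value << 32) | w
        return value

    if len(cells) != 3 + parent_addr_cells + size_cells:
        raise ValueError("unexpected number of cells in ranges entry")
    cpu_end = 3 + parent_addr_cells
    return cells[0], join(cells[1:3]), join(cells[3:cpu_end]), join(cells[cpu_end:])


if __name__ == "__main__":
    entry = [0xC2000000, 0x0, 0x20000000, 0x40, 0x20000000, 0x0, 0x20000000]
    phys_hi, pci_addr, cpu_addr, size = parse_pci_range(entry)
    print(f"flags={phys_hi:#x} pci={pci_addr:#x} cpu={cpu_addr:#x} "
          f"size={size >> 20} MiB")
```

With these assumptions the entry maps 512 MiB of prefetchable PCI memory at bus address 0x20000000 to CPU address 0x4020000000, which lies inside the PCI Express 1 region discussed earlier.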

> >> In the datasheet I also see that the chip theoretically
> >> supports 8GB of DDR4, which would definitely put it beyond
> >> the highmem limit, even with the 4G:4G memory split. Do you
> >> know if there are ls1021a devices with more than 4GB of
> >> installed memory?
> > 
> > Where did you find those 8GB? Section 16.2 mentions it supports up to 4
> > banks/ chip-selects which I would assume is much more. Also the memory
> > map has a DRAM region 2 for memory region 2-32GB. But yes this exceeds
> > 32bit addressing. I'm not aware of ls1021 devices with more than 4GB
> > memory. Our modules only support up to 2GB.
> 
> I think I misread this, as section 2.2 mentions you can have
> four chip-selects that are limited to either 2GB or 8GB each,
> for a theoretical maximum of 26GB. As long as the practical
> limit is 4GB or less, I think we're fine here. Linus Walleij
> is working on a prototype for changing the memory
> management code to handle up to 4GB of contiguous RAM without
> highmem, which will become relevant in the future as we get
> rid of highmem support. On this chip, the first 4GB of
> installed memory are not contiguous in the physical address
> space, so this will need another set of patches on top.
> 
> As long as you only use the first chip-select with 2GB
> of installed memory, very little will change for you.
> 
> It might be worthwhile to check if your system works
> correctly with ARM_LPAE=y, VMSPLIT_2G=y and HIGHMEM=n,
> which should be the best configuration for your system
> anyway and will keep working after highmem gets removed.

Thanks for that hint. With these settings the board still seems to run as it
should.

Best regards,
Alexander
Arnd Bergmann Jan. 24, 2023, 11:37 a.m. UTC | #8
On Tue, Jan 24, 2023, at 11:30, Alexander Stein wrote:
> On Friday, 20 January 2023, 15:00:35 CET, Arnd Bergmann wrote:
>> On Fri, Jan 20, 2023, at 13:43, Alexander Stein wrote:
>
> Thanks for the explanation, although I'm still lacking deeper knowledge how to 
> configure PCIe properly.
> I tried adding the following line in the 'ranges' property:
>> <0xc2000000 0x0 0x20000000 0x40 0x20000000 0x0 0x20000000>, /* prefetchable memory */
> which was taken from the old example in
> Documentation/devicetree/bindings/pci/layerscape-pci.txt, removed in commit
> a3b18f5f1d42e ("dt-bindings: pci: layerscape-pci: define AER/PME interrupts",
> 2022-03-11).
> But I couldn't detect any difference, maybe it's just due to my PCIe devices I 
> have available.

Right, you need to have a device that actually wants to use prefetchable
memory, which is something that 'lspci -v' tells you. I'm also not
sure how this particular controller needs to be configured. Some
drivers read the 'ranges' properties and program the windows in
the PCI controller registers, while others expect the firmware to
have set up the hardware windows in the way they are described in DT.

>> It might be worthwhile to check if your system works
>> correctly with ARM_LPAE=y, VMSPLIT_2G=y and HIGHMEM=n,
>> which should be the best configuration for your system
>> anyway and will keep working after highmem gets removed.
>
> Thanks for that hint. With these settings the board still seems to run as it
> should.

Ok, good.

   Arnd
Patch

diff --git a/arch/arm/configs/multi_v7_defconfig b/arch/arm/configs/multi_v7_defconfig
index 441a449172368..f0757f05ec2c2 100644
--- a/arch/arm/configs/multi_v7_defconfig
+++ b/arch/arm/configs/multi_v7_defconfig
@@ -105,6 +105,7 @@  CONFIG_ARCH_VEXPRESS=y
 CONFIG_ARCH_VEXPRESS_TC2_PM=y
 CONFIG_ARCH_WM8850=y
 CONFIG_ARCH_ZYNQ=y
+CONFIG_ARM_LPAE=y
 CONFIG_SMP=y
 CONFIG_NR_CPUS=16
 CONFIG_ARM_APPENDED_DTB=y