Message ID | 20240703210707.1986816-1-scott@os.amperecomputing.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64: limit MAX_PHYSMEM_BITS based on vmemmap | expand |
Hello Scott, On 7/4/24 02:37, D Scott Phillips wrote: > Prior to the memory map adjustments in v6.9-rc1, the amdgpu driver could > trip over the warning of: > > `WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));` Could you please provide the mainline commit ID for the mentioned memory adjustment changes. > > in vmemmap_populate()[1]. After the adjustments, it becomes a translation > fault and panic. Probably caused by accessing memory which does not exist on that address. > > The cause is that the amdgpu driver allocates some unused space from > iomem_resource and claims it as MEMORY_DEVICE_PRIVATE and > devm_memremap_pages() it. An address above those backed by the vmemmap is > used. > > Adjust MAX_PHYSMEM_BITS so that addresses not backed by the vmemmap will > not be chosen as device private addresses. > > [1]: Call trace: > vmemmap_populate+0x30/0x48 > __populate_section_memmap+0x40/0x90 > sparse_add_section+0xfc/0x3e8 > __add_pages+0xb4/0x168 > pagemap_range+0x300/0x410 > memremap_pages+0x184/0x2d8 > devm_memremap_pages+0x30/0x90 > kgd2kfd_init_zone_device+0xe0/0x1f0 [amdgpu] > amdgpu_device_ip_init+0x674/0x888 [amdgpu] > amdgpu_device_init+0x7bc/0xed8 [amdgpu] > amdgpu_driver_load_kms+0x28/0x1c0 [amdgpu] > amdgpu_pci_probe+0x194/0x580 [amdgpu] > local_pci_probe+0x48/0xb8 > work_for_cpu_fn+0x24/0x40 > process_one_work+0x170/0x3e0 > worker_thread+0x2ac/0x3e0 > kthread+0xf4/0x108 > ret_from_fork+0x10/0x20 > > Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com> > --- > arch/arm64/include/asm/sparsemem.h | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/arm64/include/asm/sparsemem.h b/arch/arm64/include/asm/sparsemem.h > index 8a8acc220371c..8387301f2e206 100644 > --- a/arch/arm64/include/asm/sparsemem.h > +++ b/arch/arm64/include/asm/sparsemem.h > @@ -5,7 +5,7 @@ > #ifndef __ASM_SPARSEMEM_H > #define __ASM_SPARSEMEM_H > > -#define MAX_PHYSMEM_BITS CONFIG_ARM64_PA_BITS > +#define MAX_PHYSMEM_BITS ilog2(VMEMMAP_RANGE) Just wondering if there is another method, which avoids selecting physical memory ranges not backed with vmemmap. Also will reducing MAX_PHYSMEM_BITS below ARM64_PA_BITS have other side effects ? Do other platforms have this exact same co-relation between MAX_PHYSMEM_BITS and vmemmap range ? > > /* > * Section size must be at least 512MB for 64K base
On Thu, Jul 04, 2024 at 08:42:52AM +0530, Anshuman Khandual wrote: > On 7/4/24 02:37, D Scott Phillips wrote: > > diff --git a/arch/arm64/include/asm/sparsemem.h b/arch/arm64/include/asm/sparsemem.h > > index 8a8acc220371c..8387301f2e206 100644 > > --- a/arch/arm64/include/asm/sparsemem.h > > +++ b/arch/arm64/include/asm/sparsemem.h > > @@ -5,7 +5,7 @@ > > #ifndef __ASM_SPARSEMEM_H > > #define __ASM_SPARSEMEM_H > > > > -#define MAX_PHYSMEM_BITS CONFIG_ARM64_PA_BITS > > +#define MAX_PHYSMEM_BITS ilog2(VMEMMAP_RANGE) > > Just wondering if there is another method, which avoids selecting physical > memory ranges not backed with vmemmap. Also will reducing MAX_PHYSMEM_BITS > below ARM64_PA_BITS have other side effects ? Do other platforms have this > exact same co-relation between MAX_PHYSMEM_BITS and vmemmap range ? That's indeed a pretty weird workaround. MAX_PHYSMEM_BITS, as the name implies, is about the physical bits supported for memory while VMEMMAP_RANGE tells us the virtual address range. There is a correlation between them but they are different things conceptually. The memory hotplug code uses arch_get_mappable_range(). This should be called from the amdgpu code rather than changing MAX_PHYSMEM_BITS.
Anshuman Khandual <anshuman.khandual@arm.com> writes: > Hello Scott, > > On 7/4/24 02:37, D Scott Phillips wrote: > > Prior to the memory map adjustments in v6.9-rc1, the amdgpu driver could > > trip over the warning of: > > > > `WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));` > > Could you please provide the mainline commit ID for the mentioned memory > adjustment changes. Yes, it's commit 32697ff38287 ("arm64: vmemmap: Avoid base2 order of struct page size to dimension region)" > > > > in vmemmap_populate()[1]. After the adjustments, it becomes a translation > > fault and panic. > > Probably caused by accessing memory which does not exist on that address. It seems the changed VMEMMAP_START causes the out-of-range address to wrap around to the low virtual address space. | [ 22.177085] Unable to handle kernel paging request at virtual address 000001ffa6000034 | [ 22.186641] Mem abort info: | [ 22.192201] ESR = 0x0000000096000044 | [ 22.198720] EC = 0x25: DABT (current EL), IL = 32 bits | [ 22.206798] SET = 0, FnV = 0 | [ 22.212624] EA = 0, S1PTW = 0 | [ 22.218612] FSC = 0x04: level 0 translation fault | [ 22.226343] Data abort info: | [ 22.231992] ISV = 0, ISS = 0x00000044, ISS2 = 0x00000000 | [ 22.240245] CM = 0, WnR = 1, TnD = 0, TagAccess = 0 | [ 22.248064] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0 | [ 22.256145] user pgtable: 4k pages, 48-bit VAs, pgdp=000008000287c000 | [ 22.265357] [000001ffa6000034] pgd=0000000000000000, p4d=0000000000000000 | [ 22.265362] Internal error: Oops: 0000000096000044 [#1] SMP | [ 22.265365] Modules linked in: amdgpu(+) drm_ttm_helper ttm drm_exec drm_suballoc_helper amdxcp drm_buddy gpu_sched drm_display_helper cec | [ 22.292879] CPU: 0 PID: 564 Comm: kworker/0:2 Tainted: G W 6.8.0-rc3+ #2 | [ 22.300957] Hardware name: ADLINK Ampere Altra Developer Platform/Ampere Altra Developer Platform, BIOS TianoCore 2.04.100.11 (SYS: 2.06.20220308) 11/06/2 | [ 22.314763] Workqueue: events work_for_cpu_fn | [ 22.319112] pstate: 80400009 (Nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--) | [ 22.326062] pc : __init_zone_device_page.constprop.0+0x2c/0xa8 | [ 22.331886] lr : memmap_init_zone_device+0xf0/0x210 | [ 22.336751] sp : ffff80008359b9f0 | [ 22.340053] x29: ffff80008359b9f0 x28: 0000000fffa00000 x27: ffff07ffe3e3a7c8 | [ 22.347177] x26: 000001ffa6000000 x25: 0000000000000001 x24: fffffdffc0000000 | [ 22.354300] x23: ffffcb2dc0b21dd0 x22: 0000001000000000 x21: 0000000000000000 | [ 22.361424] x20: 0000000000000000 x19: 0000000000000001 x18: 0000000000001fb8 | [ 22.368547] x17: ffffcb2d80f67db4 x16: ffffcb2dc00f7090 x15: ffffcb2d80f5cc50 | [ 22.375671] x14: ffffcb2d80f6aff8 x13: ffffcb2dbea78a4c x12: ffffcb2dbea6d82c | [ 22.382794] x11: ffffcb2dc1661008 x10: ffffcb2dbea78a4c x9 : ffffcb2dc00eef40 | [ 22.389918] x8 : ffff081f6fdcd2c0 x7 : 00000000ffffffff x6 : 0000ffff00004000 | [ 22.397041] x5 : 0000000000000001 x4 : 000001ffa6000008 x3 : ffff07ffe3e3a7c8 | [ 22.404164] x2 : 0000000000000000 x1 : 0000000fffa00000 x0 : 000001ffa6000000 | [ 22.411287] Call trace: | [ 22.413721] __init_zone_device_page.constprop.0+0x2c/0xa8 | [ 22.419194] memmap_init_zone_device+0xf0/0x210 | [ 22.423713] pagemap_range+0x1e0/0x410 | [ 22.427452] memremap_pages+0x18c/0x2e0 | [ 22.431275] devm_memremap_pages+0x30/0x90 | [ 22.435360] kgd2kfd_init_zone_device+0xf0/0x200 [amdgpu] | [ 22.441301] amdgpu_device_ip_init+0x674/0x888 [amdgpu] | [ 22.446989] amdgpu_device_init+0x7a4/0xea0 [amdgpu] | [ 22.452415] amdgpu_driver_load_kms+0x28/0x1c0 [amdgpu] | [ 22.458101] amdgpu_pci_probe+0x1a0/0x560 [amdgpu] | [ 22.463353] local_pci_probe+0x48/0xb8 | [ 22.467093] work_for_cpu_fn+0x24/0x40 | [ 22.470831] process_one_work+0x170/0x3e0 | [ 22.474829] worker_thread+0x2ac/0x3e0 | [ 22.478567] kthread+0xf4/0x108 | [ 22.481696] ret_from_fork+0x10/0x20 | [ 22.485261] Code: d2880006 a90153f3 91002004 f2dfffe6 (b9003405) | [ 22.491342] ---[ end trace 0000000000000000 ]--- > > > > The cause is that the amdgpu driver allocates some unused space from > > iomem_resource and claims it as MEMORY_DEVICE_PRIVATE and > > devm_memremap_pages() it. An address above those backed by the vmemmap is > > used. > > > > Adjust MAX_PHYSMEM_BITS so that addresses not backed by the vmemmap will > > not be chosen as device private addresses. > > > > [1]: Call trace: > > vmemmap_populate+0x30/0x48 > > __populate_section_memmap+0x40/0x90 > > sparse_add_section+0xfc/0x3e8 > > __add_pages+0xb4/0x168 > > pagemap_range+0x300/0x410 > > memremap_pages+0x184/0x2d8 > > devm_memremap_pages+0x30/0x90 > > kgd2kfd_init_zone_device+0xe0/0x1f0 [amdgpu] > > amdgpu_device_ip_init+0x674/0x888 [amdgpu] > > amdgpu_device_init+0x7bc/0xed8 [amdgpu] > > amdgpu_driver_load_kms+0x28/0x1c0 [amdgpu] > > amdgpu_pci_probe+0x194/0x580 [amdgpu] > > local_pci_probe+0x48/0xb8 > > work_for_cpu_fn+0x24/0x40 > > process_one_work+0x170/0x3e0 > > worker_thread+0x2ac/0x3e0 > > kthread+0xf4/0x108 > > ret_from_fork+0x10/0x20 > > > > Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com> > > --- > > arch/arm64/include/asm/sparsemem.h | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/arch/arm64/include/asm/sparsemem.h b/arch/arm64/include/asm/sparsemem.h > > index 8a8acc220371c..8387301f2e206 100644 > > --- a/arch/arm64/include/asm/sparsemem.h > > +++ b/arch/arm64/include/asm/sparsemem.h > > @@ -5,7 +5,7 @@ > > #ifndef __ASM_SPARSEMEM_H > > #define __ASM_SPARSEMEM_H > > > > -#define MAX_PHYSMEM_BITS CONFIG_ARM64_PA_BITS > > +#define MAX_PHYSMEM_BITS ilog2(VMEMMAP_RANGE) > > Just wondering if there is another method, which avoids selecting physical > memory ranges not backed with vmemmap. Also will reducing MAX_PHYSMEM_BITS > below ARM64_PA_BITS have other side effects ? Do other platforms have this > exact same co-relation between MAX_PHYSMEM_BITS and vmemmap range ? I'm not that familiar with other architectures, but I assume that high addresses of MAX_PHYSMEM_BITS work properly for hmm as the code to allocate a free range allocates from high addresses, descending.
Catalin Marinas <catalin.marinas@arm.com> writes: > On Thu, Jul 04, 2024 at 08:42:52AM +0530, Anshuman Khandual wrote: > > On 7/4/24 02:37, D Scott Phillips wrote: > > > diff --git a/arch/arm64/include/asm/sparsemem.h b/arch/arm64/include/asm/sparsemem.h > > > index 8a8acc220371c..8387301f2e206 100644 > > > --- a/arch/arm64/include/asm/sparsemem.h > > > +++ b/arch/arm64/include/asm/sparsemem.h > > > @@ -5,7 +5,7 @@ > > > #ifndef __ASM_SPARSEMEM_H > > > #define __ASM_SPARSEMEM_H > > > > > > -#define MAX_PHYSMEM_BITS CONFIG_ARM64_PA_BITS > > > +#define MAX_PHYSMEM_BITS ilog2(VMEMMAP_RANGE) > > > > Just wondering if there is another method, which avoids selecting > > physical memory ranges not backed with vmemmap. Also will reducing > > MAX_PHYSMEM_BITS below ARM64_PA_BITS have other side effects ? Do > > other platforms have this exact same co-relation between > > MAX_PHYSMEM_BITS and vmemmap range ? > > That's indeed a pretty weird workaround. MAX_PHYSMEM_BITS, as the name > implies, is about the physical bits supported for memory while > VMEMMAP_RANGE tells us the virtual address range. There is a > correlation between them but they are different things conceptually. > > The memory hotplug code uses arch_get_mappable_range(). This should be > called from the amdgpu code rather than changing MAX_PHYSMEM_BITS. OK, thanks I'll pursue that approach.
diff --git a/arch/arm64/include/asm/sparsemem.h b/arch/arm64/include/asm/sparsemem.h index 8a8acc220371c..8387301f2e206 100644 --- a/arch/arm64/include/asm/sparsemem.h +++ b/arch/arm64/include/asm/sparsemem.h @@ -5,7 +5,7 @@ #ifndef __ASM_SPARSEMEM_H #define __ASM_SPARSEMEM_H -#define MAX_PHYSMEM_BITS CONFIG_ARM64_PA_BITS +#define MAX_PHYSMEM_BITS ilog2(VMEMMAP_RANGE) /* * Section size must be at least 512MB for 64K base
Prior to the memory map adjustments in v6.9-rc1, the amdgpu driver could trip over the warning of: `WARN_ON((start < VMEMMAP_START) || (end > VMEMMAP_END));` in vmemmap_populate()[1]. After the adjustments, it becomes a translation fault and panic. The cause is that the amdgpu driver allocates some unused space from iomem_resource and claims it as MEMORY_DEVICE_PRIVATE and devm_memremap_pages() it. An address above those backed by the vmemmap is used. Adjust MAX_PHYSMEM_BITS so that addresses not backed by the vmemmap will not be chosen as device private addresses. [1]: Call trace: vmemmap_populate+0x30/0x48 __populate_section_memmap+0x40/0x90 sparse_add_section+0xfc/0x3e8 __add_pages+0xb4/0x168 pagemap_range+0x300/0x410 memremap_pages+0x184/0x2d8 devm_memremap_pages+0x30/0x90 kgd2kfd_init_zone_device+0xe0/0x1f0 [amdgpu] amdgpu_device_ip_init+0x674/0x888 [amdgpu] amdgpu_device_init+0x7bc/0xed8 [amdgpu] amdgpu_driver_load_kms+0x28/0x1c0 [amdgpu] amdgpu_pci_probe+0x194/0x580 [amdgpu] local_pci_probe+0x48/0xb8 work_for_cpu_fn+0x24/0x40 process_one_work+0x170/0x3e0 worker_thread+0x2ac/0x3e0 kthread+0xf4/0x108 ret_from_fork+0x10/0x20 Signed-off-by: D Scott Phillips <scott@os.amperecomputing.com> --- arch/arm64/include/asm/sparsemem.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)