Message ID | 20210910053354.26721-1-wangkefeng.wang@huawei.com (mailing list archive) |
---|---|
Series | arm64: support page mapping percpu first chunk allocator |
Hi Greg and Andrew,

As Catalin said, the series touches drivers/ and mm/ but is missing acks from both of you. Could you take a look at this patchset? (Patch 1 changes mm/vmalloc.c and patch 2 changes drivers/base/arch_numa.c.)

Catalin, are there any other comments? I hope this can be merged in the next version.

Many thanks to all of you.

On 2021/9/10 13:33, Kefeng Wang wrote:
> The percpu embedded first chunk allocator is the first choice, but it
> can fail on ARM64, e.g.:
> "percpu: max_distance=0x5fcfdc640000 too large for vmalloc space 0x781fefff0000"
> "percpu: max_distance=0x600000540000 too large for vmalloc space 0x7dffb7ff0000"
> "percpu: max_distance=0x5fff9adb0000 too large for vmalloc space 0x5dffb7ff0000"
>
> We can then hit "WARNING: CPU: 15 PID: 461 at vmalloc.c:3087 pcpu_get_vm_areas+0x488/0x838",
> and the system may even fail to boot.
>
> Let's implement the page mapping percpu first chunk allocator as a fallback
> to the embedding allocator to increase the robustness of the system.
>
> Also fix a crash when both NEED_PER_CPU_PAGE_FIRST_CHUNK and KASAN_VMALLOC are enabled.
>
> Tested on ARM64 qemu with cmdline "percpu_alloc=page", based on v5.14.
>
> V4:
> - add ACK/RB
> - address comments about patch1 from Catalin
> - add Greg and Andrew to the list, as suggested by Catalin
>
> v3:
> - search for a range that fits instead of always picking the end of the
>   vmalloc area, as suggested by Catalin.
> - use NUMA_NO_NODE to avoid the "virt_to_phys used for non-linear address:"
>   issue in arm64 kasan_populate_early_vm_area_shadow().
> - add Acked-by: Marco Elver <elver@google.com> to patch v3
>
> V2:
> - fix build error when CONFIG_KASAN is disabled, found by lkp@intel.com
> - drop wrong __weak comment from kasan_populate_early_vm_area_shadow(),
>   found by Marco Elver <elver@google.com>
>
> Kefeng Wang (3):
>   vmalloc: Choose a better start address in vm_area_register_early()
>   arm64: Support page mapping percpu first chunk allocator
>   kasan: arm64: Fix pcpu_page_first_chunk crash with KASAN_VMALLOC
>
>  arch/arm64/Kconfig         |  4 ++
>  arch/arm64/mm/kasan_init.c | 16 ++++++++
>  drivers/base/arch_numa.c   | 82 +++++++++++++++++++++++++++++++++-----
>  include/linux/kasan.h      |  6 +++
>  mm/kasan/init.c            |  5 +++
>  mm/vmalloc.c               | 19 ++++++---
>  6 files changed, 116 insertions(+), 16 deletions(-)
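
The "max_distance ... too large for vmalloc space" warnings quoted in the cover letter come from a size check on the embedded layout. What follows is a minimal user-space sketch of that decision, not kernel code. It assumes the threshold is 75% of the vmalloc space, which is how I understand the check in mm/percpu.c. The names `embed_distance_ok`, `choose_first_chunk` and `enum first_chunk_kind` are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names: a user-space model of the decision, not kernel code. */
enum first_chunk_kind { FIRST_CHUNK_EMBED, FIRST_CHUNK_PAGE };

/*
 * Assumed rule: the embedded layout is acceptable only while the distance
 * between the lowest and highest per-CPU unit stays within 75% of the
 * vmalloc space. Dividing before multiplying avoids 64-bit overflow.
 */
static bool embed_distance_ok(uint64_t max_distance, uint64_t vmalloc_total)
{
	assert(vmalloc_total != 0);
	return max_distance <= vmalloc_total / 4 * 3;
}

/*
 * Fall back to the page mapping allocator only when the embedded layout
 * is rejected AND a fallback is available (the series enables it on arm64
 * via NEED_PER_CPU_PAGE_FIRST_CHUNK). Without a fallback the embedded
 * layout is kept, which is the case where the later warning can appear.
 */
static enum first_chunk_kind choose_first_chunk(uint64_t max_distance,
						uint64_t vmalloc_total,
						bool have_page_fallback)
{
	if (!embed_distance_ok(max_distance, vmalloc_total) && have_page_fallback)
		return FIRST_CHUNK_PAGE;
	return FIRST_CHUNK_EMBED;
}
```

Under this assumed rule, all three value pairs from the quoted warnings exceed the threshold, so each would select the page mapping allocator once the fallback exists.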
On Wed, Sep 15, 2021 at 04:33:09PM +0800, Kefeng Wang wrote:
> Hi Greg and Andrew, as Catalin said, the series touches drivers/ and mm/
> but is missing acks from both of you, could you take a look at this
> patchset (patch1 changes mm/vmalloc.c

What patchset?

> and patch2 changes drivers/base/arch_numa.c).

that file is not really owned by anyone it seems :(

Can you provide a link to the real patch please?

thanks,

greg k-h
On 2021/9/16 23:41, Greg KH wrote:
> On Wed, Sep 15, 2021 at 04:33:09PM +0800, Kefeng Wang wrote:
>> Hi Greg and Andrew, as Catalin said, the series touches drivers/ and mm/
>> but is missing acks from both of you, could you take a look at this
>> patchset (patch1 changes mm/vmalloc.c
> What patchset?

[PATCH v4 1/3] vmalloc: Choose a better start address in vm_area_register_early()
<https://lore.kernel.org/linux-arm-kernel/20210910053354.26721-2-wangkefeng.wang@huawei.com/>
[PATCH v4 2/3] arm64: Support page mapping percpu first chunk allocator
<https://lore.kernel.org/linux-arm-kernel/20210910053354.26721-3-wangkefeng.wang@huawei.com/>
[PATCH v4 3/3] kasan: arm64: Fix pcpu_page_first_chunk crash with KASAN_VMALLOC
<https://lore.kernel.org/linux-arm-kernel/20210910053354.26721-4-wangkefeng.wang@huawei.com/>
[PATCH v4 0/3] arm64: support page mapping percpu first chunk allocator
<https://lore.kernel.org/linux-arm-kernel/c06faf6c-3d21-04f2-6855-95c86e96cf5a@huawei.com/>

>> and patch2 changes drivers/base/arch_numa.c).

patch2:
[PATCH v4 2/3] arm64: Support page mapping percpu first chunk allocator
<https://lore.kernel.org/linux-arm-kernel/20210910053354.26721-3-wangkefeng.wang@huawei.com/#r>

> that file is not really owned by anyone it seems :(
>
> Can you provide a link to the real patch please?

Yes, arch_numa.c was moved into drivers/base to support RISC-V NUMA, and it is shared by arm64 and riscv. My changes (patch2) only support NEED_PER_CPU_PAGE_FIRST_CHUNK on ARM64.

Here is the link:

https://lore.kernel.org/linux-arm-kernel/20210910053354.26721-1-wangkefeng.wang@huawei.com/

Thanks.

> thanks,
>
> greg k-h
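
The v3 changelog describes patch 1 as searching for a range that fits instead of always picking the end of the vmalloc area. As a rough illustration of that idea only, here is a user-space first-fit search over a sorted list of reserved areas. It is not the code from the patch: `struct area`, `align_up` and `first_fit` are hypothetical names, and the sketch assumes the areas are sorted by start address, do not overlap, and that `align` is a power of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an already-registered early area. */
struct area {
	uint64_t start;
	uint64_t size;
};

/* Round x up to the next multiple of align (align must be a power of two). */
static uint64_t align_up(uint64_t x, uint64_t align)
{
	return (x + align - 1) & ~(align - 1);
}

/*
 * Return the lowest aligned address in [lo, hi) where `size` bytes fit
 * without touching any existing area, or 0 if no gap is large enough.
 * `areas` must be sorted by start address and must not overlap.
 */
static uint64_t first_fit(const struct area *areas, size_t n,
			  uint64_t lo, uint64_t hi,
			  uint64_t size, uint64_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);

	uint64_t addr = align_up(lo, align);

	for (size_t i = 0; i < n; i++) {
		/* Gap between the candidate address and the next area. */
		if (areas[i].start >= addr && areas[i].start - addr >= size)
			return addr;

		/* Otherwise move the candidate just past this area. */
		uint64_t next = align_up(areas[i].start + areas[i].size, align);
		if (next > addr)
			addr = next;
	}

	/* Finally, try the space after the last area. */
	if (addr <= hi && hi - addr >= size)
		return addr;
	return 0;
}
```

With areas at 0x1000 and 0x4000 (each 0x1000 bytes) in the range [0x1000, 0x10000), a 0x2000-byte request lands in the gap at 0x2000, while a 0x3000-byte request skips that gap and lands at 0x5000.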
On Fri, Sep 17, 2021 at 09:11:38AM +0800, Kefeng Wang wrote:
> On 2021/9/16 23:41, Greg KH wrote:
> [...]
> > Can you provide a link to the real patch please?
>
> Yes, arch_numa.c was moved into drivers/base to support RISC-V NUMA, and
> it is shared by arm64 and riscv. My changes (patch2) only support
> NEED_PER_CPU_PAGE_FIRST_CHUNK on ARM64.
>
> Here is the link:
>
> https://lore.kernel.org/linux-arm-kernel/20210910053354.26721-1-wangkefeng.wang@huawei.com/

Now reviewed.
Hi Catalin and Andrew,

Kindly pinging again: are there any comments? Thanks.

On 2021/9/10 13:33, Kefeng Wang wrote:
> [...]
On 2021/9/28 15:48, Kefeng Wang wrote:
> Hi Catalin and Andrew, kindly ping again, any comments, thanks.

It looks like there are no more comments. Catalin and Andrew, ping again: could either of you merge this patchset? Many thanks.

> On 2021/9/10 13:33, Kefeng Wang wrote:
>> [...]
On Fri, 10 Sep 2021 13:33:51 +0800 Kefeng Wang <wangkefeng.wang@huawei.com> wrote:

> The percpu embedded first chunk allocator is the first choice, but it
> can fail on ARM64, e.g.:
> "percpu: max_distance=0x5fcfdc640000 too large for vmalloc space 0x781fefff0000"
> "percpu: max_distance=0x600000540000 too large for vmalloc space 0x7dffb7ff0000"
> "percpu: max_distance=0x5fff9adb0000 too large for vmalloc space 0x5dffb7ff0000"
>
> We can then hit "WARNING: CPU: 15 PID: 461 at vmalloc.c:3087 pcpu_get_vm_areas+0x488/0x838",
> and the system may even fail to boot.
>
> Let's implement the page mapping percpu first chunk allocator as a fallback
> to the embedding allocator to increase the robustness of the system.
>
> Also fix a crash when both NEED_PER_CPU_PAGE_FIRST_CHUNK and KASAN_VMALLOC are enabled.

How serious are these problems in real-world situations? Do people
feel that a -stable backport is needed, or is a 5.16-rc1 merge
sufficient?
On 2021/10/11 5:36, Andrew Morton wrote:
> On Fri, 10 Sep 2021 13:33:51 +0800 Kefeng Wang <wangkefeng.wang@huawei.com> wrote:
>
>> [...]
>
> How serious are these problems in real-world situations? Do people
> feel that a -stable backport is needed, or is a 5.16-rc1 merge
> sufficient?

Thanks Andrew.

A specific memory layout is required (along with KASAN being enabled). We hit this issue on qemu and on real hardware because KASAN was enabled, so I think 5.16-rc1 is sufficient.