diff mbox series

arm64: mm: move dma_contiguous_reserve() to be after paging_init()

Message ID 20200916085933.25220-1-song.bao.hua@hisilicon.com (mailing list archive)
State New, archived
Headers show
Series arm64: mm: move dma_contiguous_reserve() to be after paging_init() | expand

Commit Message

Song Bao Hua (Barry Song) Sept. 16, 2020, 8:59 a.m. UTC
Recent CMA change "cma: make number of CMA areas dynamic, remove
CONFIG_CMA_AREAS" breaks the boot of arm64 kernel in linux-next.
Knic is like:

Unable to handle kernel paging request at virtual address ffff0000438fff70
Mem abort info:
  ESR = 0x96000044
  EC = 0x25: DABT (current EL), IL = 32 bits
  SET = 0, FnV = 0
  EA = 0, S1PTW = 0
Data abort info:
  ISV = 0, ISS = 0x00000044
  CM = 0, WnR = 1
swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000041f61000
[ffff0000438fff70] pgd=0000000000000000, p4d=0000000000000000
Internal error: Oops: 96000044 [#1] PREEMPT SMP
Modules linked in:
CPU: 0 PID: 0 Comm: swapper Not tainted 5.9.0-rc3-00020-ge1bce3d64c48 #2
Hardware name: linux,dummy-virt (DT)
pstate: 20000085 (nzCv daIf -PAN -UAO BTYPE=--)
pc : __memset+0x148/0x188
lr : memblock_alloc_try_nid+0xbc/0xd4
sp : ffff800011ab3d10
x29: ffff800011ab3d10 x28: 0000000041710018
x27: 0000000040000000 x26: ffff8000115d1000
x25: 0000000000000000 x24: ffff800011300428
x23: ffff800011d1bd60 x22: 0000000000000000
x21: 00000000ffffffff x20: ffff0000438fff70
x19: 0000000000000090 x18: 0000000000000010
x17: 0000000000001400 x16: 0000000000001c00
x15: ffff800011ac3530 x14: ffff800011ac3530
x13: fffffdfffe600000 x12: ffff800011ab3e44
x11: 0000000000000004 x10: 0000000000000018
x9 : 0000000000000000 x8 : ffff0000438fff70
x7 : 0000000000000000 x6 : 000000000000003f
x5 : 0000000000000040 x4 : 0000000000000010
x3 : 0000000000000080 x2 : 0000000000000080
x1 : 0000000000000000 x0 : ffff0000438fff70
Call trace:
 __memset+0x148/0x188
 cma_init_reserved_mem+0x94/0x154
 cma_declare_contiguous_nid+0x240/0x2bc
 dma_contiguous_reserve_area+0x48/0x78
 dma_contiguous_reserve+0x78/0x88
 arm64_memblock_init+0x424/0x45c
 setup_arch+0x270/0x5f0
 start_kernel+0x84/0x4dc
Code: f101007f fa45a068 54fffc0b aa0303e2 (a9001d07)
random: get_random_bytes called from print_oops_end_marker+0x2c/0x68 with crng_init=0
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Attempted to kill the idle task!
---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---

The virtual address returned from memblock_alloc() is not ready till
paging_init() is done.

Cc: Roman Gushchin <guro@fb.com>
Cc: Mike Kravetz <mike.kravetz@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Will Deacon <will@kernel.org>
Cc: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
---
 -v1: to fix the knic during boot after applying Mike's patch:
 https://lore.kernel.org/linux-mm/20200915205703.34572-1-mike.kravetz@oracle.com/

 arch/arm64/mm/init.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

Comments

Anders Roxell Sept. 16, 2020, 3:12 p.m. UTC | #1
> Recent CMA change "cma: make number of CMA areas dynamic, remove
> CONFIG_CMA_AREAS" breaks the boot of arm64 kernel in linux-next.
> Knic is like:
> 
> Unable to handle kernel paging request at virtual address ffff0000438fff70
> Mem abort info:
>   ESR = 0x96000044
>   EC = 0x25: DABT (current EL), IL = 32 bits
>   SET = 0, FnV = 0
>   EA = 0, S1PTW = 0
> Data abort info:
>   ISV = 0, ISS = 0x00000044
>   CM = 0, WnR = 1
> swapper pgtable: 4k pages, 48-bit VAs, pgdp=0000000041f61000
> [ffff0000438fff70] pgd=0000000000000000, p4d=0000000000000000
> Internal error: Oops: 96000044 [#1] PREEMPT SMP
> Modules linked in:
> CPU: 0 PID: 0 Comm: swapper Not tainted 5.9.0-rc3-00020-ge1bce3d64c48 #2
> Hardware name: linux,dummy-virt (DT)
> pstate: 20000085 (nzCv daIf -PAN -UAO BTYPE=--)
> pc : __memset+0x148/0x188
> lr : memblock_alloc_try_nid+0xbc/0xd4
> sp : ffff800011ab3d10
> x29: ffff800011ab3d10 x28: 0000000041710018
> x27: 0000000040000000 x26: ffff8000115d1000
> x25: 0000000000000000 x24: ffff800011300428
> x23: ffff800011d1bd60 x22: 0000000000000000
> x21: 00000000ffffffff x20: ffff0000438fff70
> x19: 0000000000000090 x18: 0000000000000010
> x17: 0000000000001400 x16: 0000000000001c00
> x15: ffff800011ac3530 x14: ffff800011ac3530
> x13: fffffdfffe600000 x12: ffff800011ab3e44
> x11: 0000000000000004 x10: 0000000000000018
> x9 : 0000000000000000 x8 : ffff0000438fff70
> x7 : 0000000000000000 x6 : 000000000000003f
> x5 : 0000000000000040 x4 : 0000000000000010
> x3 : 0000000000000080 x2 : 0000000000000080
> x1 : 0000000000000000 x0 : ffff0000438fff70
> Call trace:
>  __memset+0x148/0x188
>  cma_init_reserved_mem+0x94/0x154
>  cma_declare_contiguous_nid+0x240/0x2bc
>  dma_contiguous_reserve_area+0x48/0x78
>  dma_contiguous_reserve+0x78/0x88
>  arm64_memblock_init+0x424/0x45c
>  setup_arch+0x270/0x5f0
>  start_kernel+0x84/0x4dc
> Code: f101007f fa45a068 54fffc0b aa0303e2 (a9001d07)
> random: get_random_bytes called from print_oops_end_marker+0x2c/0x68 with crng_init=0
> ---[ end trace 0000000000000000 ]---
> Kernel panic - not syncing: Attempted to kill the idle task!
> ---[ end Kernel panic - not syncing: Attempted to kill the idle task! ]---
> 
> The virtual address returned from memblock_alloc() is not ready till
> paging_init() is done.
> 
> Signed-off-by: Barry Song <song.bao.hua@hisilicon.com>
> Cc: Roman Gushchin <guro@fb.com>
> Cc: Mike Kravetz <mike.kravetz@oracle.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Will Deacon <will@kernel.org>
> Cc: Stephen Rothwell <sfr@canb.auug.org.au>

Tested-by: Anders Roxell <anders.roxell@linaro.org>

Cheers,
Anders
Nick Desaulniers Sept. 17, 2020, 12:19 a.m. UTC | #2
It looks like that change referenced may also break arm32 boots with today's
next?

The following allows me to boot, but I have no idea if it's incorrect or not.

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 45f9d5ec2360..7118b98c1f5f 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -226,9 +226,6 @@ void __init arm_memblock_init(const struct machine_desc *mdesc)
        early_init_fdt_reserve_self();
        early_init_fdt_scan_reserved_mem();
 
-       /* reserve memory for DMA contiguous allocations */
-       dma_contiguous_reserve(arm_dma_limit);
-
        arm_memblock_steal_permitted = false;
        memblock_dump_all();
 }
@@ -248,6 +245,9 @@ void __init bootmem_init(void)
         */
        sparse_init();
 
+       /* reserve memory for DMA contiguous allocations */
+       dma_contiguous_reserve(arm_dma_limit);
+
        /*
         * Now free the memory - free_area_init needs
         * the sparse mem_map arrays initialized by sparse_init()
Mike Kravetz Sept. 17, 2020, 12:27 a.m. UTC | #3
On 9/16/20 5:19 PM, Nick Desaulniers wrote:
> It looks like that change referenced may also break arm32 boots with today's
> next?
> 
> The following allows me to boot, but I have no idea if it's incorrect or not.

Thanks Nick,

The referenced commit was pulled from Andrew's tree and subsequent versions
of next.

I'm looking into other architectures as this is dependent on where in arch
specific boot process first cma call is made.  Hopefully, there is a some
way to do this without potentially touching a bunch of arch code.
Song Bao Hua (Barry Song) Sept. 17, 2020, 12:57 a.m. UTC | #4
> -----Original Message-----
> From: ndesaulniers via sendgmr
> [mailto:ndesaulniers@ndesaulniers1.mtv.corp.google.com] On Behalf Of Nick
> Desaulniers
> Sent: Thursday, September 17, 2020 12:20 PM
> To: Song Bao Hua (Barry Song) <song.bao.hua@hisilicon.com>
> Cc: akpm@linux-foundation.org; guro@fb.com;
> linux-arm-kernel@lists.infradead.org; linux-mm@kvack.org; Linuxarm
> <linuxarm@huawei.com>; mike.kravetz@oracle.com; sfr@canb.auug.org.au;
> will@kernel.org; ardb@kernel.org; clang-built-linux@googlegroups.com
> Subject: Re: arm64: mm: move dma_contiguous_reserve() to be after
> paging_init()
> 
> It looks like that change referenced may also break arm32 boots with today's
> next?
> 
> The following allows me to boot, but I have no idea if it's incorrect or not.
> 

This is probably incorrect on arm32 as dma_contiguous_early_fixup() is done after
dma_contiguous_remap() in paging_init(). That means dma_contiguous_remap() gets an
empty list.
For arm64, there isn't early_fixup() and remap().

> diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
> index 45f9d5ec2360..7118b98c1f5f 100644
> --- a/arch/arm/mm/init.c
> +++ b/arch/arm/mm/init.c
> @@ -226,9 +226,6 @@ void __init arm_memblock_init(const struct
> machine_desc *mdesc)
>         early_init_fdt_reserve_self();
>         early_init_fdt_scan_reserved_mem();
> 
> -       /* reserve memory for DMA contiguous allocations */
> -       dma_contiguous_reserve(arm_dma_limit);
> -
>         arm_memblock_steal_permitted = false;
>         memblock_dump_all();
>  }
> @@ -248,6 +245,9 @@ void __init bootmem_init(void)
>          */
>         sparse_init();
> 
> +       /* reserve memory for DMA contiguous allocations */
> +       dma_contiguous_reserve(arm_dma_limit);
> +
>         /*
>          * Now free the memory - free_area_init needs
>          * the sparse mem_map arrays initialized by sparse_init()

Thanks
Barry
Song Bao Hua (Barry Song) Sept. 17, 2020, 1:35 a.m. UTC | #5
> -----Original Message-----
> From: Mike Kravetz [mailto:mike.kravetz@oracle.com]
> Sent: Thursday, September 17, 2020 12:27 PM
> To: Nick Desaulniers <ndesaulniers@google.com>; Song Bao Hua (Barry Song)
> <song.bao.hua@hisilicon.com>
> Cc: akpm@linux-foundation.org; guro@fb.com;
> linux-arm-kernel@lists.infradead.org; linux-mm@kvack.org; Linuxarm
> <linuxarm@huawei.com>; sfr@canb.auug.org.au; will@kernel.org;
> ardb@kernel.org; clang-built-linux@googlegroups.com
> Subject: Re: arm64: mm: move dma_contiguous_reserve() to be after
> paging_init()
> 
> On 9/16/20 5:19 PM, Nick Desaulniers wrote:
> > It looks like that change referenced may also break arm32 boots with
> > today's next?
> >
> > The following allows me to boot, but I have no idea if it's incorrect or not.
> 
> Thanks Nick,
> 
> The referenced commit was pulled from Andrew's tree and subsequent
> versions of next.
> 
> I'm looking into other architectures as this is dependent on where in arch
> specific boot process first cma call is made.  Hopefully, there is a some way to
> do this without potentially touching a bunch of arch code.

The default cma is probably not the only who has been broken. I think the patch to enable the 
dynamic number of cma areas has also broken fdt-based CMA. For example, 
early_init_fdt_scan_reserved_mem() is called before paging_init() in arm64.

Considering many arch are calling early_init_fdt_scan_reserved_mem(), it seems using
memblock_alloc() for cma_area will require lots of test on different platforms.
1. arc
2.arm
3.arm64
4.csky
5.h8300
6.mips
7.nsd32
8.nios32
9.openrise
10.powerpc
11.riscv
12.sh
13.xtensa

> --
> Mike Kravetz

Thanks
Barry
Will Deacon Sept. 17, 2020, 9:02 a.m. UTC | #6
Hi Mike,

On Wed, Sep 16, 2020 at 05:27:18PM -0700, Mike Kravetz wrote:
> On 9/16/20 5:19 PM, Nick Desaulniers wrote:
> > It looks like that change referenced may also break arm32 boots with today's
> > next?
> > 
> > The following allows me to boot, but I have no idea if it's incorrect or not.
> 
> Thanks Nick,
> 
> The referenced commit was pulled from Andrew's tree and subsequent versions
> of next.

By "pulled" do you mean removed? (potential terminology clash with git pull
is confusing me here!).

> I'm looking into other architectures as this is dependent on where in arch
> specific boot process first cma call is made.  Hopefully, there is a some
> way to do this without potentially touching a bunch of arch code.

Ok, for now I won't queue this arm64 patch, then.

Cheers,

Will
Stephen Rothwell Sept. 17, 2020, 9:13 a.m. UTC | #7
Hi Will,

On Thu, 17 Sep 2020 10:02:35 +0100 Will Deacon <will@kernel.org> wrote:
>
> By "pulled" do you mean removed? (potential terminology clash with git pull
> is confusing me here!).

"removed"  It will not be in today's linux-next.
Will Deacon Sept. 17, 2020, 9:36 a.m. UTC | #8
On Thu, Sep 17, 2020 at 07:13:16PM +1000, Stephen Rothwell wrote:
> On Thu, 17 Sep 2020 10:02:35 +0100 Will Deacon <will@kernel.org> wrote:
> >
> > By "pulled" do you mean removed? (potential terminology clash with git pull
> > is confusing me here!).
> 
> "removed"  It will not be in today's linux-next.

Perfect, thanks Stephen.

Will
diff mbox series

Patch

diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
index f1c75957ff3c..8dd61d07fff5 100644
--- a/arch/arm64/mm/init.c
+++ b/arch/arm64/mm/init.c
@@ -402,8 +402,6 @@  void __init arm64_memblock_init(void)
 	reserve_elfcorehdr();
 
 	high_memory = __va(memblock_end_of_DRAM() - 1) + 1;
-
-	dma_contiguous_reserve(arm64_dma32_phys_limit);
 }
 
 void __init bootmem_init(void)
@@ -415,6 +413,13 @@  void __init bootmem_init(void)
 
 	early_memtest(min << PAGE_SHIFT, max << PAGE_SHIFT);
 
+	/*
+	 * CMA is using memblock_alloc(), the virtual address returned
+	 * from memblock_alloc() isn't ready till paging_init().
+	 * So this has to happen after paging_init()
+	 */
+	dma_contiguous_reserve(arm64_dma32_phys_limit);
+
 	max_pfn = max_low_pfn = max;
 	min_low_pfn = min;