| Message ID | 20230328115150.2700016-2-chenjiahao16@huawei.com (mailing list archive) |
|---|---|
| State | Superseded |
| Series | support allocating crashkernel above 4G explicitly on riscv |
Context | Check | Description |
---|---|---|
conchuod/cover_letter | success | Series has a cover letter |
conchuod/tree_selection | success | Guessed tree name to be for-next at HEAD d34a6b715a23 |
conchuod/fixes_present | success | Fixes tag not required for -next series |
conchuod/maintainers_pattern | success | MAINTAINERS pattern errors before the patch: 1 and now 1 |
conchuod/verify_signedoff | success | Signed-off-by tag matches author and committer |
conchuod/kdoc | success | Errors and warnings before: 0 this patch: 0 |
conchuod/build_rv64_clang_allmodconfig | success | Errors and warnings before: 18 this patch: 18 |
conchuod/module_param | success | Was 0 now: 0 |
conchuod/build_rv64_gcc_allmodconfig | success | Errors and warnings before: 19 this patch: 19 |
conchuod/build_rv32_defconfig | success | Build OK |
conchuod/dtb_warn_rv64 | success | Errors and warnings before: 3 this patch: 3 |
conchuod/header_inline | success | No static functions without inline keyword in header files |
conchuod/checkpatch | warning | WARNING: line length of 106 exceeds 100 columns |
conchuod/source_inline | success | Was 0 now: 0 |
conchuod/build_rv64_nommu_k210_defconfig | success | Build OK |
conchuod/verify_fixes | success | No Fixes tag |
conchuod/build_rv64_nommu_virt_defconfig | success | Build OK |
On 03/28/23 at 07:51pm, Chen Jiahao wrote:
> On riscv, the current crash kernel allocation logic is trying to
> allocate within 32bit addressible memory region by default, if
> failed, try to allocate without 4G restriction.
>
> In need of saving DMA zone memory while allocating a relatively large
> crash kernel region, allocating the reserved memory top down in
> high memory, without overlapping the DMA zone, is a mature solution.
> Here introduce the parameter option crashkernel=X,[high,low].
>
> One can reserve the crash kernel from high memory above DMA zone range
> by explicitly passing "crashkernel=X,high"; or reserve a memory range
> below 4G with "crashkernel=X,low".
>
> Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
> ---
>  arch/riscv/kernel/setup.c |  5 ++++
>  arch/riscv/mm/init.c      | 63 ++++++++++++++++++++++++++++++++++++---
>  2 files changed, 64 insertions(+), 4 deletions(-)
>
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index 5d3184cbf518..ea84e5047c23 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -176,6 +176,11 @@ static void __init init_resources(void)
>              if (ret < 0)
>                  goto error;
>          }
> +        if (crashk_low_res.start != crashk_low_res.end) {
> +            ret = add_resource(&iomem_resource, &crashk_low_res);
> +            if (ret < 0)
> +                goto error;
> +        }
>  #endif
>
>  #ifdef CONFIG_CRASH_DUMP
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 478d6763a01a..b7708cc467fa 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -1152,6 +1152,28 @@ static inline void setup_vm_final(void)
>  }
>  #endif /* CONFIG_MMU */
>
> +/* Reserve 128M low memory by default for swiotlb buffer */
> +#define DEFAULT_CRASH_KERNEL_LOW_SIZE (128UL << 20)
> +
> +static int __init reserve_crashkernel_low(unsigned long long low_size)
> +{
> +    unsigned long long low_base;
> +
> +    low_base = memblock_phys_alloc_range(low_size, PMD_SIZE, 0, dma32_phys_limit);
> +    if (!low_base) {
> +        pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
> +        return -ENOMEM;
> +    }
> +
> +    pr_info("crashkernel low memory reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
> +        low_base, low_base + low_size, low_size >> 20);
> +
> +    crashk_low_res.start = low_base;
> +    crashk_low_res.end = low_base + low_size - 1;
> +
> +    return 0;
> +}
> +
>  /*
>   * reserve_crashkernel() - reserves memory for crash kernel
>   *
> @@ -1163,6 +1185,7 @@ static void __init reserve_crashkernel(void)
>  {
>      unsigned long long crash_base = 0;
>      unsigned long long crash_size = 0;
> +    unsigned long long crash_low_size = 0;
>      unsigned long search_start = memblock_start_of_DRAM();
>      unsigned long search_end = memblock_end_of_DRAM();
>
> @@ -1182,8 +1205,30 @@ static void __init reserve_crashkernel(void)
>
>      ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>                  &crash_size, &crash_base);
> -    if (ret || !crash_size)
> +    if (ret == -ENOENT) {
> +        /*
> +         * crashkernel=X,[high,low] can be specified or not, but
> +         * invalid value is not allowed.
> +         */
> +        ret = parse_crashkernel_high(boot_command_line, 0, &crash_size, &crash_base);

I would add a local variable to assign boot_command_line to it just like
arm64 does. Then these lines could be shorter.

char *cmdline = boot_command_line;

> +        if (ret || !crash_size)
> +            return;
> +
> +        /*
> +         * crashkernel=Y,low is valid only when crashkernel=X,high
> +         * is passed and high memory is reserved successful.
> +         */
> +        ret = parse_crashkernel_low(boot_command_line, 0, &crash_low_size, &crash_base);
> +        if (ret == -ENOENT)
> +            crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> +        else if (ret)
> +            return;
> +
> +        search_start = dma32_phys_limit;
> +    } else if (ret || !crash_size) {
> +        /* Invalid argument value specified */
>          return;
> +    }
>
>      crash_size = PAGE_ALIGN(crash_size);
>
> @@ -1201,16 +1246,26 @@ static void __init reserve_crashkernel(void)
>       */
>      crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
>                             search_start,
> -                           min(search_end, (unsigned long) SZ_4G));
> +                           min(search_end, (unsigned long)dma32_phys_limit));
>      if (crash_base == 0) {

The above conditional check isn't right. If crashkernel=size@offset
specified, the reservation failure won't trigger retry. This seems to be
originally introduced by old commit, while this need be fixed firstly.

> -        /* Try again without restricting region to 32bit addressible memory */
> +        /* Try again above the region of 32bit addressible memory */
>          crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
> -                               search_start, search_end);
> +                               max(search_start, (unsigned long)dma32_phys_limit),
> +                               search_end);
>          if (crash_base == 0) {
>              pr_warn("crashkernel: couldn't allocate %lldKB\n",
>                  crash_size >> 10);
>              return;
>          }
> +
> +        if (!crash_low_size)
> +            crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> +    }
> +
> +    if ((crash_base > dma32_phys_limit - crash_low_size) &&
> +         crash_low_size && reserve_crashkernel_low(crash_low_size)) {
> +        memblock_phys_free(crash_base, crash_size);
> +        return;
>      }
>
>      pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld MB)\n",
> --
> 2.31.1
>
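For reference, a minimal sketch of the cleanup suggested above — only the local `cmdline` variable is introduced and the parse_*() call sites are shortened; everything else in reserve_crashkernel() is meant to stay as in the patch, and the exact layout is illustrative rather than the final v2 code:

```c
static void __init reserve_crashkernel(void)
{
	char *cmdline = boot_command_line;	/* local alias, as arm64 does */
	unsigned long long crash_base = 0;
	unsigned long long crash_size = 0;
	unsigned long long crash_low_size = 0;
	int ret;

	ret = parse_crashkernel(cmdline, memblock_phys_mem_size(),
				&crash_size, &crash_base);
	if (ret == -ENOENT) {
		/* crashkernel=X,high with an optional crashkernel=Y,low */
		ret = parse_crashkernel_high(cmdline, 0, &crash_size, &crash_base);
		if (ret || !crash_size)
			return;

		ret = parse_crashkernel_low(cmdline, 0, &crash_low_size, &crash_base);
		if (ret == -ENOENT)
			crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
		else if (ret)
			return;
	}

	/* ... reservation logic unchanged from the patch ... */
}
```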
On 2023/3/29 19:19, Baoquan He wrote:
> On 03/28/23 at 07:51pm, Chen Jiahao wrote:

Thanks for reviewing.

>> On riscv, the current crash kernel allocation logic is trying to
>> allocate within 32bit addressible memory region by default, if
>> failed, try to allocate without 4G restriction.
>>
>> In need of saving DMA zone memory while allocating a relatively large
>> crash kernel region, allocating the reserved memory top down in
>> high memory, without overlapping the DMA zone, is a mature solution.
>> Here introduce the parameter option crashkernel=X,[high,low].
>>
>> One can reserve the crash kernel from high memory above DMA zone range
>> by explicitly passing "crashkernel=X,high"; or reserve a memory range
>> below 4G with "crashkernel=X,low".
>>
>> Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
>> ---
>>  arch/riscv/kernel/setup.c |  5 ++++
>>  arch/riscv/mm/init.c      | 63 ++++++++++++++++++++++++++++++++++++---
>>  2 files changed, 64 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
>> index 5d3184cbf518..ea84e5047c23 100644
>> --- a/arch/riscv/kernel/setup.c
>> +++ b/arch/riscv/kernel/setup.c
>> @@ -176,6 +176,11 @@ static void __init init_resources(void)
>>              if (ret < 0)
>>                  goto error;
>>          }
>> +        if (crashk_low_res.start != crashk_low_res.end) {
>> +            ret = add_resource(&iomem_resource, &crashk_low_res);
>> +            if (ret < 0)
>> +                goto error;
>> +        }
>>  #endif
>>
>>  #ifdef CONFIG_CRASH_DUMP
>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>> index 478d6763a01a..b7708cc467fa 100644
>> --- a/arch/riscv/mm/init.c
>> +++ b/arch/riscv/mm/init.c
>> @@ -1152,6 +1152,28 @@ static inline void setup_vm_final(void)
>>  }
>>  #endif /* CONFIG_MMU */
>>
>> +/* Reserve 128M low memory by default for swiotlb buffer */
>> +#define DEFAULT_CRASH_KERNEL_LOW_SIZE (128UL << 20)
>> +
>> +static int __init reserve_crashkernel_low(unsigned long long low_size)
>> +{
>> +    unsigned long long low_base;
>> +
>> +    low_base = memblock_phys_alloc_range(low_size, PMD_SIZE, 0, dma32_phys_limit);
>> +    if (!low_base) {
>> +        pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
>> +        return -ENOMEM;
>> +    }
>> +
>> +    pr_info("crashkernel low memory reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
>> +        low_base, low_base + low_size, low_size >> 20);
>> +
>> +    crashk_low_res.start = low_base;
>> +    crashk_low_res.end = low_base + low_size - 1;
>> +
>> +    return 0;
>> +}
>> +
>>  /*
>>   * reserve_crashkernel() - reserves memory for crash kernel
>>   *
>> @@ -1163,6 +1185,7 @@ static void __init reserve_crashkernel(void)
>>  {
>>      unsigned long long crash_base = 0;
>>      unsigned long long crash_size = 0;
>> +    unsigned long long crash_low_size = 0;
>>      unsigned long search_start = memblock_start_of_DRAM();
>>      unsigned long search_end = memblock_end_of_DRAM();
>>
>> @@ -1182,8 +1205,30 @@ static void __init reserve_crashkernel(void)
>>
>>      ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
>>                  &crash_size, &crash_base);
>> -    if (ret || !crash_size)
>> +    if (ret == -ENOENT) {
>> +        /*
>> +         * crashkernel=X,[high,low] can be specified or not, but
>> +         * invalid value is not allowed.
>> +         */
>> +        ret = parse_crashkernel_high(boot_command_line, 0, &crash_size, &crash_base);
> I would add a local variable to assign boot_command_line to it just like
> arm64 does. Then these lines could be shorter.
>
> char *cmdline = boot_command_line;

Agreed, I will clean this up later in next version.

>> +        if (ret || !crash_size)
>> +            return;
>> +
>> +        /*
>> +         * crashkernel=Y,low is valid only when crashkernel=X,high
>> +         * is passed and high memory is reserved successful.
>> +         */
>> +        ret = parse_crashkernel_low(boot_command_line, 0, &crash_low_size, &crash_base);
>> +        if (ret == -ENOENT)
>> +            crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>> +        else if (ret)
>> +            return;
>> +
>> +        search_start = dma32_phys_limit;
>> +    } else if (ret || !crash_size) {
>> +        /* Invalid argument value specified */
>>          return;
>> +    }
>>
>>      crash_size = PAGE_ALIGN(crash_size);
>>
>> @@ -1201,16 +1246,26 @@ static void __init reserve_crashkernel(void)
>>       */
>>      crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
>>                             search_start,
>> -                           min(search_end, (unsigned long) SZ_4G));
>> +                           min(search_end, (unsigned long)dma32_phys_limit));
>>      if (crash_base == 0) {
> The above conditional check isn't right. If crashkernel=size@offset
> specified, the reservation failure won't trigger retry. This seems to be
> originally introduced by old commit, while this need be fixed firstly.

Just a little curious about the rule to cope with this specific case. If
"crashkernel=size@offset" was passed but reserve failed, should try again
to allocate in high memory, regardless the specified size@offset, or just
throw a warning and return? Since I noticed the current logic here on
Arm64 is to check if !fixed_base first before retrying.

Or have I missed anything else?

>> -        /* Try again without restricting region to 32bit addressible memory */
>> +        /* Try again above the region of 32bit addressible memory */
>>          crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
>> -                               search_start, search_end);
>> +                               max(search_start, (unsigned long)dma32_phys_limit),
>> +                               search_end);
>>          if (crash_base == 0) {
>>              pr_warn("crashkernel: couldn't allocate %lldKB\n",
>>                  crash_size >> 10);
>>              return;
>>          }
>> +
>> +        if (!crash_low_size)
>> +            crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>> +    }
>> +
>> +    if ((crash_base > dma32_phys_limit - crash_low_size) &&
>> +         crash_low_size && reserve_crashkernel_low(crash_low_size)) {
>> +        memblock_phys_free(crash_base, crash_size);
>> +        return;
>>      }
>>
>>      pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld MB)\n",
>> --
>> 2.31.1
>>

BR,
Jiahao
On 03/30/23 at 09:40pm, chenjiahao (C) wrote:
......
> Agreed, I will clean this up later in next version.
> > > +        if (ret || !crash_size)
> > > +            return;
> > > +
> > > +        /*
> > > +         * crashkernel=Y,low is valid only when crashkernel=X,high
> > > +         * is passed and high memory is reserved successful.
> > > +         */
> > > +        ret = parse_crashkernel_low(boot_command_line, 0, &crash_low_size, &crash_base);
> > > +        if (ret == -ENOENT)
> > > +            crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
> > > +        else if (ret)
> > > +            return;
> > > +
> > > +        search_start = dma32_phys_limit;
> > > +    } else if (ret || !crash_size) {
> > > +        /* Invalid argument value specified */
> > >          return;
> > > +    }
> > >      crash_size = PAGE_ALIGN(crash_size);
> > > @@ -1201,16 +1246,26 @@ static void __init reserve_crashkernel(void)
> > >       */
> > >      crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
> > >                             search_start,
> > > -                           min(search_end, (unsigned long) SZ_4G));
> > > +                           min(search_end, (unsigned long)dma32_phys_limit));
> > >      if (crash_base == 0) {
> > The above conditional check isn't right. If crashkernel=size@offset
> > specified, the reservation failure won't trigger retry. This seems to be
> > originally introduced by old commit, while this need be fixed firstly.
>
> Just a little curious about the rule to cope with this specific case. If
> "crashkernel=size@offset" was passed
>
> but reserve failed, should try again to allocate in high memory, regardless
> the specified size@offset,
>
> or just throw a warning and return? Since I noticed the current logic here
> on Arm64 is to check if !fixed_base first

Yeah, we need mark the "crashkernel=size@offset" case and avoid to
retry. Because you won't succeed if memblock has already failed to
reserve an unavailable memory region, retry is meaningless. This has
been done in x86, arm64.
On 2023/3/31 7:32, Baoquan He wrote:
> On 03/30/23 at 09:40pm, chenjiahao (C) wrote:
> ......
>> Agreed, I will clean this up later in next version.
>>>> +        if (ret || !crash_size)
>>>> +            return;
>>>> +
>>>> +        /*
>>>> +         * crashkernel=Y,low is valid only when crashkernel=X,high
>>>> +         * is passed and high memory is reserved successful.
>>>> +         */
>>>> +        ret = parse_crashkernel_low(boot_command_line, 0, &crash_low_size, &crash_base);
>>>> +        if (ret == -ENOENT)
>>>> +            crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
>>>> +        else if (ret)
>>>> +            return;
>>>> +
>>>> +        search_start = dma32_phys_limit;
>>>> +    } else if (ret || !crash_size) {
>>>> +        /* Invalid argument value specified */
>>>>          return;
>>>> +    }
>>>>      crash_size = PAGE_ALIGN(crash_size);
>>>> @@ -1201,16 +1246,26 @@ static void __init reserve_crashkernel(void)
>>>>       */
>>>>      crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
>>>>                             search_start,
>>>> -                           min(search_end, (unsigned long) SZ_4G));
>>>> +                           min(search_end, (unsigned long)dma32_phys_limit));
>>>>      if (crash_base == 0) {
>>> The above conditional check isn't right. If crashkernel=size@offset
>>> specified, the reservation failure won't trigger retry. This seems to be
>>> originally introduced by old commit, while this need be fixed firstly.
>> Just a little curious about the rule to cope with this specific case. If
>> "crashkernel=size@offset" was passed
>>
>> but reserve failed, should try again to allocate in high memory, regardless
>> the specified size@offset,
>>
>> or just throw a warning and return? Since I noticed the current logic here
>> on Arm64 is to check if !fixed_base first
> Yeah, we need mark the "crashkernel=size@offset" case and avoid to
> retry. Because you won't succeed if memblock has already failed to
> reserve an unavailable memory region, retry is meaningless. This has
> been done in x86, arm64.

Make sense, thanks.

Actually, in my previous tests, the result in this case is the same as
expectation, i.e. when allocating "crashkernel=size@offset" failed on low
memory, it would retry but return on failure. Since the search_end is
assigned with offset + size, which is lower than DMA32 limit, the second
allocation is definitely invalid.

But for sure, to make the code easy to read and eradicate other possible
corner cases, I will check if !fixed_base first on retry.

>
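A rough sketch of the retry guard agreed on in this exchange, loosely modeled on what x86/arm64 do. The `fixed_base` flag name, the warning text, and the exact placement inside reserve_crashkernel() are assumptions for illustration, not the final v2 code:

```c
	bool fixed_base = false;

	/* crashkernel=size@offset pins both the base and the search window */
	if (crash_base) {
		fixed_base = true;
		search_start = crash_base;
		search_end = crash_base + crash_size;
	}

	crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
					       search_start,
					       min(search_end, (unsigned long)dma32_phys_limit));
	if (crash_base == 0) {
		/*
		 * A fixed base that memblock already failed to reserve will
		 * not succeed on a retry either, so give up instead of
		 * falling back to high memory.
		 */
		if (fixed_base) {
			pr_warn("crashkernel: allocating failed with given size@offset\n");
			return;
		}

		/* Otherwise retry above the 32bit addressable region */
		crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
						       max(search_start, (unsigned long)dma32_phys_limit),
						       search_end);
		/* ... */
	}
```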
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 5d3184cbf518..ea84e5047c23 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -176,6 +176,11 @@ static void __init init_resources(void)
             if (ret < 0)
                 goto error;
         }
+        if (crashk_low_res.start != crashk_low_res.end) {
+            ret = add_resource(&iomem_resource, &crashk_low_res);
+            if (ret < 0)
+                goto error;
+        }
 #endif
 
 #ifdef CONFIG_CRASH_DUMP
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 478d6763a01a..b7708cc467fa 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -1152,6 +1152,28 @@ static inline void setup_vm_final(void)
 }
 #endif /* CONFIG_MMU */
 
+/* Reserve 128M low memory by default for swiotlb buffer */
+#define DEFAULT_CRASH_KERNEL_LOW_SIZE (128UL << 20)
+
+static int __init reserve_crashkernel_low(unsigned long long low_size)
+{
+    unsigned long long low_base;
+
+    low_base = memblock_phys_alloc_range(low_size, PMD_SIZE, 0, dma32_phys_limit);
+    if (!low_base) {
+        pr_err("cannot allocate crashkernel low memory (size:0x%llx).\n", low_size);
+        return -ENOMEM;
+    }
+
+    pr_info("crashkernel low memory reserved: 0x%016llx - 0x%016llx (%lld MB)\n",
+        low_base, low_base + low_size, low_size >> 20);
+
+    crashk_low_res.start = low_base;
+    crashk_low_res.end = low_base + low_size - 1;
+
+    return 0;
+}
+
 /*
  * reserve_crashkernel() - reserves memory for crash kernel
  *
@@ -1163,6 +1185,7 @@ static void __init reserve_crashkernel(void)
 {
     unsigned long long crash_base = 0;
     unsigned long long crash_size = 0;
+    unsigned long long crash_low_size = 0;
     unsigned long search_start = memblock_start_of_DRAM();
     unsigned long search_end = memblock_end_of_DRAM();
 
@@ -1182,8 +1205,30 @@ static void __init reserve_crashkernel(void)
 
     ret = parse_crashkernel(boot_command_line, memblock_phys_mem_size(),
                 &crash_size, &crash_base);
-    if (ret || !crash_size)
+    if (ret == -ENOENT) {
+        /*
+         * crashkernel=X,[high,low] can be specified or not, but
+         * invalid value is not allowed.
+         */
+        ret = parse_crashkernel_high(boot_command_line, 0, &crash_size, &crash_base);
+        if (ret || !crash_size)
+            return;
+
+        /*
+         * crashkernel=Y,low is valid only when crashkernel=X,high
+         * is passed and high memory is reserved successful.
+         */
+        ret = parse_crashkernel_low(boot_command_line, 0, &crash_low_size, &crash_base);
+        if (ret == -ENOENT)
+            crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
+        else if (ret)
+            return;
+
+        search_start = dma32_phys_limit;
+    } else if (ret || !crash_size) {
+        /* Invalid argument value specified */
         return;
+    }
 
     crash_size = PAGE_ALIGN(crash_size);
 
@@ -1201,16 +1246,26 @@ static void __init reserve_crashkernel(void)
      */
     crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
                            search_start,
-                           min(search_end, (unsigned long) SZ_4G));
+                           min(search_end, (unsigned long)dma32_phys_limit));
     if (crash_base == 0) {
-        /* Try again without restricting region to 32bit addressible memory */
+        /* Try again above the region of 32bit addressible memory */
         crash_base = memblock_phys_alloc_range(crash_size, PMD_SIZE,
-                               search_start, search_end);
+                               max(search_start, (unsigned long)dma32_phys_limit),
+                               search_end);
         if (crash_base == 0) {
             pr_warn("crashkernel: couldn't allocate %lldKB\n",
                 crash_size >> 10);
             return;
         }
+
+        if (!crash_low_size)
+            crash_low_size = DEFAULT_CRASH_KERNEL_LOW_SIZE;
+    }
+
+    if ((crash_base > dma32_phys_limit - crash_low_size) &&
+         crash_low_size && reserve_crashkernel_low(crash_low_size)) {
+        memblock_phys_free(crash_base, crash_size);
+        return;
     }
 
     pr_info("crashkernel: reserved 0x%016llx - 0x%016llx (%lld MB)\n",
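To make the final hunk above concrete, this is the arithmetic the `crash_base > dma32_phys_limit - crash_low_size` check encodes; the numbers below are purely illustrative, not taken from the patch:

```c
/*
 * Assume dma32_phys_limit = 0x100000000 (4G) and crash_low_size = 128M,
 * so dma32_phys_limit - crash_low_size = 0xf8000000.
 *
 *  - crash_base = 0x200000000 (main region above 4G):
 *      0x200000000 > 0xf8000000, so reserve_crashkernel_low() is called
 *      and an extra 128M below 4G is set aside for swiotlb/DMA buffers.
 *
 *  - crash_base = 0x80000000 (main region fits in low memory):
 *      0x80000000 < 0xf8000000, the condition is false and no separate
 *      low region is reserved.
 */
```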
On riscv, the current crash kernel allocation logic tries to allocate
within the 32bit addressable memory region by default, and only falls
back to allocating without the 4G restriction if that fails.

To save DMA zone memory when a relatively large crash kernel region is
needed, allocating the reserved memory top down in high memory, without
overlapping the DMA zone, is a mature solution. Introduce the parameter
option crashkernel=X,[high,low] for this.

One can reserve the crash kernel from high memory above the DMA zone range
by explicitly passing "crashkernel=X,high", or reserve a memory range
below 4G with "crashkernel=X,low".

Signed-off-by: Chen Jiahao <chenjiahao16@huawei.com>
---
 arch/riscv/kernel/setup.c |  5 ++++
 arch/riscv/mm/init.c      | 63 ++++++++++++++++++++++++++++++++++++---
 2 files changed, 64 insertions(+), 4 deletions(-)
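For context, the reservation modes described above would be selected from the kernel command line roughly as follows. The sizes are only examples, and the 128M default low region is my reading of DEFAULT_CRASH_KERNEL_LOW_SIZE in this patch:

```
crashkernel=256M                            # existing behavior: try below the DMA32 limit, fall back above it
crashkernel=2G,high                         # reserve above the DMA32 limit; a 128M low region is added by default
crashkernel=2G,high crashkernel=256M,low    # as above, but with an explicitly sized low region
```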