[-next,v6,0/2] support allocating crashkernel above 4G explicitly on riscv

Message ID: 20230701171138.1491206-1-chenjiahao16@huawei.com

Chen Jiahao July 1, 2023, 5:11 p.m. UTC
On riscv, the current crash kernel allocation logic tries to allocate
within the 32-bit addressable memory region by default; if that fails,
it retries without the 4G restriction.

To save DMA zone memory when allocating a relatively large crash kernel
region, allocating the reserved memory top-down in high memory, without
overlapping the DMA zone, is a mature solution. Hence this patchset
introduces the parameter option crashkernel=X,[high,low].

One can reserve the crash kernel from high memory, above the DMA zone
range, by explicitly passing "crashkernel=X,high", or reserve a memory
range below 4G with "crashkernel=X,low". A few rules apply:
1. "crashkernel=X,[high,low]" will be ignored if "crashkernel=size"
   is specified.
2. "crashkernel=X,low" is valid only when "crashkernel=X,high" is passed
   and there is enough memory to be allocated under 4G.
3. When allocating crashkernel above 4G and no "crashkernel=X,low" is
   specified, 128M of low memory is allocated automatically for the
   swiotlb bounce buffer.
See Documentation/admin-guide/kernel-parameters.txt for more information.
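
The three rules above can be sketched as a small decision function. This
is only a toy model in Python, not the kernel code: the function name
resolve(), the free_below_4g parameter and its 2 GiB default are
assumptions made for illustration, and the model assumes that memory
above 4G is always large enough. It also assumes that a failed low
reservation releases the high one, which is how test case 9 in this
cover letter reads.

```python
from typing import Optional, Tuple

MiB = 1 << 20
GiB = 1 << 30
DEFAULT_LOW = 128 * MiB  # rule 3: automatic low reservation for swiotlb


def resolve(size: Optional[int] = None,
            high: Optional[int] = None,
            low: Optional[int] = None,
            free_below_4g: int = 2 * GiB) -> Tuple[int, int]:
    """Return (high_bytes, low_bytes) reserved under this toy model.

    size -> crashkernel=size
    high -> crashkernel=X,high
    low  -> crashkernel=X,low
    free_below_4g is a hypothetical amount of free memory under 4G.
    """
    if size is not None:
        # Rule 1: a plain crashkernel=size makes ,high and ,low ignored.
        if size <= free_below_4g:
            return (0, size)        # fits in 32-bit addressable memory
        return (size, DEFAULT_LOW)  # lands above 4G, rule 3 adds low memory
    if high is None:
        # Rule 2: ,low is only valid together with ,high.
        return (0, 0)
    want_low = DEFAULT_LOW if low is None else low  # rule 3 default
    if want_low > free_below_4g:
        # Rule 2: not enough memory under 4G, so nothing stays reserved.
        return (0, 0)
    return (high, want_low)
```

With the assumed 2 GiB of free memory below 4G, resolve(size=4 * GiB)
gives a 4 GiB high region plus the 128 MiB default low region, and
resolve(low=256 * MiB) gives no reservation at all.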

To verify loading the crashkernel, an adapted kexec-tools is available at:
https://github.com/chenjh005/kexec-tools/tree/build-test-riscv-v2

The following test cases have been run and behave as expected:
1) crashkernel=256M                          //low=256M
2) crashkernel=1G                            //low=1G
3) crashkernel=4G                            //high=4G, low=128M(default)
4) crashkernel=4G crashkernel=256M,high      //high=4G, low=128M(default), high is ignored
5) crashkernel=4G crashkernel=256M,low       //high=4G, low=128M(default), low is ignored
6) crashkernel=4G,high                       //high=4G, low=128M(default)
7) crashkernel=256M,low                      //low=0M, invalid
8) crashkernel=4G,high crashkernel=256M,low  //high=4G, low=256M
9) crashkernel=4G,high crashkernel=4G,low    //high=0M, low=0M, invalid
10) crashkernel=512M@0xd0000000              //low=512M
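
For readers unfamiliar with the syntax used in the cases above, here is
a minimal Python sketch that splits one crashkernel= argument into its
size, its kind and an optional offset. It is a simplified stand-in, not
the kernel's parser: parse_crashkernel() and CrashkernelArg are
hypothetical names, and only the forms that appear in this cover letter
(size, size,high, size,low and size@offset with K/M/G suffixes) are
accepted.

```python
import re
from typing import NamedTuple, Optional

_UNIT = {"": 1, "K": 1 << 10, "M": 1 << 20, "G": 1 << 30}

_PATTERN = re.compile(
    r"^crashkernel=([0-9]+)([KMG]?)"                        # size
    r"(?:,(high|low)|@(0x[0-9a-fA-F]+|[0-9]+)([KMG]?))?$"   # ,flag or @offset
)


class CrashkernelArg(NamedTuple):
    size: int              # bytes
    kind: str              # "plain", "high", "low" or "fixed"
    offset: Optional[int]  # bytes, only set for size@offset


def parse_crashkernel(arg: str) -> CrashkernelArg:
    """Parse a single crashkernel= argument (simplified syntax)."""
    m = _PATTERN.match(arg)
    if m is None:
        raise ValueError(f"unsupported crashkernel argument: {arg!r}")
    num, unit, flag, off, off_unit = m.groups()
    size = int(num) * _UNIT[unit]
    if flag is not None:
        return CrashkernelArg(size, flag, None)
    if off is not None:
        # Hexadecimal offsets carry a 0x prefix, anything else is decimal.
        base = int(off, 16) if off.startswith("0x") else int(off)
        return CrashkernelArg(size, "fixed", base * _UNIT[off_unit])
    return CrashkernelArg(size, "plain", None)
```

For case 10, the parsed region is 0xd0000000 plus 512M, which ends at
0xf0000000 and therefore lies entirely below 4G, matching the
"low=512M" annotation.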

Changes since [v6]:
1. Introduce the "high" flag to mark whether "crashkernel=X,high"
   is passed. Fix the retry logic so that the "crashkernel=X,high"
   case is distinguished from the others when the first allocation
   attempt fails.

Changes since [v5]:
1. Update the crashkernel allocation logic when crashkernel=X,high
   is specified. In this case, a region above 4G is reserved directly
   as crashkernel, rather than trying a 32-bit (below 4G) allocation
   first.

Changes since [v4]:
1. Update some imprecise code comments for cmdline parsing.

Changes since [v3]:
1. Print a warning and return explicitly on failure when
   crashkernel=size@offset is specified. This does not change the
   result in this case, but makes the logic more straightforward.
2. Some minor cleanup.

Changes since [v2]:
1. Update the allocation logic to ensure the high crashkernel
   region is reserved strictly above dma32_phys_limit.
2. Clean up some minor format problems.
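
The allocation order described in the change notes above can be sketched
as an ordered list of search windows. This is a toy Python model, not
the code in arch/riscv/mm/init.c: search_windows(), alloc_top_down() and
reserve() are hypothetical names, the 2 MiB alignment is assumed for
illustration, and the fallback from high to low memory for
"crashkernel=X,high" is an assumption, since the notes only say that the
retry logic was fixed. The model also treats DRAM as one free contiguous
range and ignores regions that are already reserved.

```python
from typing import Callable, List, Optional, Tuple

Window = Tuple[int, int]  # half-open physical range [start, end)
ALIGN = 2 << 20           # assumed 2 MiB alignment, illustration only


def search_windows(dram_start: int, dram_end: int, dma32_limit: int,
                   high: bool = False,
                   fixed: Optional[Window] = None) -> List[Window]:
    """Ordered windows to try for the main crash kernel region."""
    low_win = (dram_start, min(dram_end, dma32_limit))
    high_win = (dma32_limit, dram_end)  # strictly above the DMA32 limit
    if fixed is not None:
        # crashkernel=size@offset: a single attempt, no retry elsewhere.
        return [fixed]
    if high:
        # crashkernel=X,high: go above the limit first. The fallback to
        # low memory is an assumption of this sketch.
        return [high_win, low_win]
    # crashkernel=X: 32-bit addressable memory first, then above the limit.
    return [low_win, high_win]


def alloc_top_down(size: int, window: Window) -> Optional[int]:
    """Highest aligned base so that [base, base + size) fits in window."""
    start, end = window
    if end - start < size:  # also covers an empty window (end <= start)
        return None
    base = (end - size) // ALIGN * ALIGN
    return base if base >= start else None


def reserve(size: int, windows: List[Window],
            alloc: Callable[[int, Window], Optional[int]] = alloc_top_down
            ) -> Optional[int]:
    """Try each window in order and return the first base that fits."""
    for window in windows:
        base = alloc(size, window)
        if base is not None:
            return base
    return None
```

With a hypothetical machine that has DRAM from 0x80000000 to 0x280000000
and a DMA32 limit of 4G, a plain 256M request lands below 4G, a plain 4G
request falls through to the window above 4G, and a 256M request with
high=True goes above 4G on the first attempt.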

Chen Jiahao (2):
  riscv: kdump: Implement crashkernel=X,[high,low]
  docs: kdump: Update the crashkernel description for riscv

 .../admin-guide/kernel-parameters.txt         | 15 ++--
 arch/riscv/kernel/setup.c                     |  5 ++
 arch/riscv/mm/init.c                          | 84 +++++++++++++++++--
 3 files changed, 90 insertions(+), 14 deletions(-)

Guo Ren July 1, 2023, 1:45 p.m. UTC | #1
On Sat, Jul 1, 2023 at 5:12 PM Chen Jiahao <chenjiahao16@huawei.com> wrote:
> Following test cases have been performed as expected:
> 1) crashkernel=256M                          //low=256M
> 2) crashkernel=1G                            //low=1G
Have you tried 1GB memory? We found a PUD mapping problem with kexec on Sv39. See:
https://lore.kernel.org/linux-riscv/20230629082032.3481237-1-guoren@kernel.org/

Chen Jiahao July 3, 2023, 1:07 p.m. UTC | #2
On 2023/7/1 21:45, Guo Ren wrote:
> On Sat, Jul 1, 2023 at 5:12 PM Chen Jiahao <chenjiahao16@huawei.com> wrote:
>> Following test cases have been performed as expected:
>> 1) crashkernel=256M                          //low=256M
>> 2) crashkernel=1G                            //low=1G
> Have you tried 1GB memory? we found a pud mapping problem on Sv39 of kexec, See:
> https://lore.kernel.org/linux-riscv/20230629082032.3481237-1-guoren@kernel.org/

I tested on QEMU with an Sv57 MMU, so it seems the synchronization problem
did not reproduce when reserving 1G of memory and loading the capture kernel.


Thanks,
Jiahao

Guo Ren July 4, 2023, 2:39 a.m. UTC | #3
On Mon, Jul 3, 2023 at 9:07 PM chenjiahao (C) <chenjiahao16@huawei.com> wrote:
>
>
> On 2023/7/1 21:45, Guo Ren wrote:
> > On Sat, Jul 1, 2023 at 5:12 PM Chen Jiahao <chenjiahao16@huawei.com> wrote:
> >> Following test cases have been performed as expected:
> >> 1) crashkernel=256M                          //low=256M
> >> 2) crashkernel=1G                            //low=1G
> > Have you tried 1GB memory? we found a pud mapping problem on Sv39 of kexec, See:
> > https://lore.kernel.org/linux-riscv/20230629082032.3481237-1-guoren@kernel.org/
>
> I have tested on QEMU with sv57 mmu, so it seems the synchronization problem
> was not reproduce when reserving 1G memory and loading the capture kernel.
Yes, on Sv57 the PUD entries are not PGD entries, so you didn't hit the problem.
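
The level arithmetic behind this point can be checked with a short
Python sketch. It assumes 4 KiB base pages and 9 translation bits per
level, as in the RISC-V Sv39/Sv48/Sv57 formats. The helper names are
hypothetical, and the sketch only shows at which table level a 1 GiB
mapping sits; it says nothing about the kexec fix itself.

```python
PAGE_SHIFT = 12       # 4 KiB base pages
BITS_PER_LEVEL = 9    # 512 entries per table on RV64
LEVELS = {"sv39": 3, "sv48": 4, "sv57": 5}


def entry_span(level: int) -> int:
    """Bytes mapped by one entry at `level` (0 is the leaf PTE level)."""
    return 1 << (PAGE_SHIFT + BITS_PER_LEVEL * level)


def level_for(size: int) -> int:
    """Level whose entries each map exactly `size` bytes."""
    for level in range(max(LEVELS.values())):
        if entry_span(level) == size:
            return level
    raise ValueError(f"{size:#x} is not the span of any table level")


def is_top_level(mode: str, size: int) -> bool:
    """True if a `size` mapping is an entry of the root table in `mode`."""
    return level_for(size) == LEVELS[mode] - 1
```

A 1 GiB mapping is a level-2 entry in every mode. With Sv39 that level
is the root table, so is_top_level("sv39", 1 << 30) is true, while with
Sv48 and Sv57 the same mapping sits below the root table.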
