Message ID | 1575057559-25496-3-git-send-email-bhsharma@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Append new variables to vmcoreinfo (TCR_EL1.T1SZ for arm64 and MAX_PHYSMEM_BITS for all archs) | expand |
Hi Bhupesh, On 29/11/2019 19:59, Bhupesh Sharma wrote: > vabits_actual variable on arm64 indicates the actual VA space size, > and allows a single binary to support both 48-bit and 52-bit VA > spaces. > > If the ARMv8.2-LVA optional feature is present, and we are running > with a 64KB page size; then it is possible to use 52-bits of address > space for both userspace and kernel addresses. However, any kernel > binary that supports 52-bit must also be able to fall back to 48-bit > at early boot time if the hardware feature is not present. > > Since TCR_EL1.T1SZ indicates the size offset of the memory region > addressed by TTBR1_EL1 (and hence can be used for determining the > vabits_actual value) it makes more sense to export the same in > vmcoreinfo rather than vabits_actual variable, as the name of the > variable can change in future kernel versions, but the architectural > constructs like TCR_EL1.T1SZ can be used better to indicate intended > specific fields to user-space. > > User-space utilities like makedumpfile and crash-utility, need to > read/write this value from/to vmcoreinfo (write?) > for determining if a virtual address lies in the linear map range. I think this is a fragile example. The debugger shouldn't need to know this. > The user-space computation for determining whether an address lies in > the linear map range is the same as we have in kernel-space: > > #define __is_lm_address(addr) (!(((u64)addr) & BIT(vabits_actual - 1))) This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA space"). If user-space tools rely on 'knowing' the kernel memory layout, they must have to constantly be fixed and updated. This is a poor argument for adding this to something that ends up as ABI. I think a better argument is walking the kernel page tables from the core dump. Core code's vmcoreinfo exports the location of the kernel page tables, but in the example above you can't walk them without knowing how T1SZ was configured. On older kernels, user-space that needs this would have to assume the value it computes from VA_BITs (also in vmcoreinfo) is the value in use. ---%<--- > I have sent out user-space patches for makedumpfile and crash-utility > to add features for obtaining vabits_actual value from TCR_EL1.T1SZ (see > [0] and [1]). > > Akashi reported that he was able to use this patchset and the user-space > changes to get user-space working fine with the 52-bit kernel VA > changes (see [2]). > > [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023966.html > [1]. http://lists.infradead.org/pipermail/kexec/2019-November/024006.html > [2]. http://lists.infradead.org/pipermail/kexec/2019-November/023992.html ---%<--- This probably belongs in the cover letter instead of the commit log. (From-memory: one of vmcore/kcore is virtually addressed, the other physically. Does this fix your poblem in both cases?) > diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c > index ca4c3e12d8c5..f78310ba65ea 100644 > --- a/arch/arm64/kernel/crash_core.c > +++ b/arch/arm64/kernel/crash_core.c > @@ -7,6 +7,13 @@ > #include <linux/crash_core.h> > #include <asm/memory.h> You need to include asm/sysreg.h for read_sysreg(), and asm/pgtable-hwdef.h for the macros you added. > +static inline u64 get_tcr_el1_t1sz(void); Why do you need to do this? > +static inline u64 get_tcr_el1_t1sz(void) > +{ > + return (read_sysreg(tcr_el1) & TCR_T1SZ_MASK) >> TCR_T1SZ_OFFSET; > +} (We don't modify this one, and its always the same one very CPU, so this is fine. This function is only called once when the stringy vmcoreinfo elf_note is created...) > void arch_crash_save_vmcoreinfo(void) > { > VMCOREINFO_NUMBER(VA_BITS); > @@ -15,5 +22,7 @@ void arch_crash_save_vmcoreinfo(void) > kimage_voffset); > vmcoreinfo_append_str("NUMBER(PHYS_OFFSET)=0x%llx\n", > PHYS_OFFSET); > + vmcoreinfo_append_str("NUMBER(tcr_el1_t1sz)=0x%llx\n", > + get_tcr_el1_t1sz()); You document the name as being upper-case. The two values either values either side are upper-case. > vmcoreinfo_append_str("KERNELOFFSET=%lx\n", kaslr_offset()); > } Thanks, James
Hi James, On 12/12/2019 04:02 PM, James Morse wrote: > Hi Bhupesh, > > On 29/11/2019 19:59, Bhupesh Sharma wrote: >> vabits_actual variable on arm64 indicates the actual VA space size, >> and allows a single binary to support both 48-bit and 52-bit VA >> spaces. >> >> If the ARMv8.2-LVA optional feature is present, and we are running >> with a 64KB page size; then it is possible to use 52-bits of address >> space for both userspace and kernel addresses. However, any kernel >> binary that supports 52-bit must also be able to fall back to 48-bit >> at early boot time if the hardware feature is not present. >> >> Since TCR_EL1.T1SZ indicates the size offset of the memory region >> addressed by TTBR1_EL1 (and hence can be used for determining the >> vabits_actual value) it makes more sense to export the same in >> vmcoreinfo rather than vabits_actual variable, as the name of the >> variable can change in future kernel versions, but the architectural >> constructs like TCR_EL1.T1SZ can be used better to indicate intended >> specific fields to user-space. >> >> User-space utilities like makedumpfile and crash-utility, need to >> read/write this value from/to vmcoreinfo > > (write?) Yes, also write so that the vmcoreinfo from an (crashing) arm64 system can be used for analysis of the root-cause of panic/crash on say an x86_64 host using utilities like crash-utility/gdb. >> for determining if a virtual address lies in the linear map range. > > I think this is a fragile example. The debugger shouldn't need to know this. Well that the current user-space utility design, so I am not sure we can tweak that too much. >> The user-space computation for determining whether an address lies in >> the linear map range is the same as we have in kernel-space: >> >> #define __is_lm_address(addr) (!(((u64)addr) & BIT(vabits_actual - 1))) > > This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA space"). If user-space > tools rely on 'knowing' the kernel memory layout, they must have to constantly be fixed > and updated. This is a poor argument for adding this to something that ends up as ABI. See above. The user-space has to rely on some ABI/guaranteed hardware-symbols which can be used for 'determining' the kernel memory layout. > I think a better argument is walking the kernel page tables from the core dump. > Core code's vmcoreinfo exports the location of the kernel page tables, but in the example > above you can't walk them without knowing how T1SZ was configured. Sure, both makedumpfile and crash-utility (which walks the kernel page tables from the core dump) use this (and similar) information currently in the user-space. > On older kernels, user-space that needs this would have to assume the value it computes > from VA_BITs (also in vmcoreinfo) is the value in use. Yes, backward compatibility has been handled in the user-space already. > ---%<--- >> I have sent out user-space patches for makedumpfile and crash-utility >> to add features for obtaining vabits_actual value from TCR_EL1.T1SZ (see >> [0] and [1]). >> >> Akashi reported that he was able to use this patchset and the user-space >> changes to get user-space working fine with the 52-bit kernel VA >> changes (see [2]). >> >> [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023966.html >> [1]. http://lists.infradead.org/pipermail/kexec/2019-November/024006.html >> [2]. http://lists.infradead.org/pipermail/kexec/2019-November/023992.html > ---%<--- > > This probably belongs in the cover letter instead of the commit log. Ok. > (From-memory: one of vmcore/kcore is virtually addressed, the other physically. Does this > fix your poblem in both cases?) > > >> diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c >> index ca4c3e12d8c5..f78310ba65ea 100644 >> --- a/arch/arm64/kernel/crash_core.c >> +++ b/arch/arm64/kernel/crash_core.c >> @@ -7,6 +7,13 @@ >> #include <linux/crash_core.h> >> #include <asm/memory.h> > > You need to include asm/sysreg.h for read_sysreg(), and asm/pgtable-hwdef.h for the macros > you added. Ok. Will check as I did not get any compilation errors without the same and build-bot also did not raise a flag for the missing include files. >> +static inline u64 get_tcr_el1_t1sz(void); > Why do you need to do this? Without this I was getting a missing declaration error, while compiling the code. >> +static inline u64 get_tcr_el1_t1sz(void) >> +{ >> + return (read_sysreg(tcr_el1) & TCR_T1SZ_MASK) >> TCR_T1SZ_OFFSET; >> +} > > (We don't modify this one, and its always the same one very CPU, so this is fine. > This function is only called once when the stringy vmcoreinfo elf_note is created...) Right. >> void arch_crash_save_vmcoreinfo(void) >> { >> VMCOREINFO_NUMBER(VA_BITS); >> @@ -15,5 +22,7 @@ void arch_crash_save_vmcoreinfo(void) >> kimage_voffset); >> vmcoreinfo_append_str("NUMBER(PHYS_OFFSET)=0x%llx\n", >> PHYS_OFFSET); >> + vmcoreinfo_append_str("NUMBER(tcr_el1_t1sz)=0x%llx\n", >> + get_tcr_el1_t1sz()); > > You document the name as being upper-case. > The two values either values either side are upper-case. Ok, will fix this in v6. Thanks for your inputs. >> vmcoreinfo_append_str("KERNELOFFSET=%lx\n", kaslr_offset()); >> } Thanks, Bhupesh
Hi Bhupesh, On 25/12/2019 19:01, Bhupesh Sharma wrote: > On 12/12/2019 04:02 PM, James Morse wrote: >> On 29/11/2019 19:59, Bhupesh Sharma wrote: >>> vabits_actual variable on arm64 indicates the actual VA space size, >>> and allows a single binary to support both 48-bit and 52-bit VA >>> spaces. >>> >>> If the ARMv8.2-LVA optional feature is present, and we are running >>> with a 64KB page size; then it is possible to use 52-bits of address >>> space for both userspace and kernel addresses. However, any kernel >>> binary that supports 52-bit must also be able to fall back to 48-bit >>> at early boot time if the hardware feature is not present. >>> >>> Since TCR_EL1.T1SZ indicates the size offset of the memory region >>> addressed by TTBR1_EL1 (and hence can be used for determining the >>> vabits_actual value) it makes more sense to export the same in >>> vmcoreinfo rather than vabits_actual variable, as the name of the >>> variable can change in future kernel versions, but the architectural >>> constructs like TCR_EL1.T1SZ can be used better to indicate intended >>> specific fields to user-space. >>> >>> User-space utilities like makedumpfile and crash-utility, need to >>> read/write this value from/to vmcoreinfo >> >> (write?) > > Yes, also write so that the vmcoreinfo from an (crashing) arm64 system can be used for > analysis of the root-cause of panic/crash on say an x86_64 host using utilities like > crash-utility/gdb. I read this as as "User-space [...] needs to write to vmcoreinfo". >>> for determining if a virtual address lies in the linear map range. >> >> I think this is a fragile example. The debugger shouldn't need to know this. > > Well that the current user-space utility design, so I am not sure we can tweak that too much. > >>> The user-space computation for determining whether an address lies in >>> the linear map range is the same as we have in kernel-space: >>> >>> #define __is_lm_address(addr) (!(((u64)addr) & BIT(vabits_actual - 1))) >> >> This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA space"). If user-space >> tools rely on 'knowing' the kernel memory layout, they must have to constantly be fixed >> and updated. This is a poor argument for adding this to something that ends up as ABI. > > See above. The user-space has to rely on some ABI/guaranteed hardware-symbols which can be > used for 'determining' the kernel memory layout. I disagree. Everything and anything in the kernel will change. The ABI rules apply to stuff exposed via syscalls and kernel filesystems. It does not apply to kernel internals, like the memory layout we used yesterday. 14c127c957c1 is a case in point. A debugger trying to rely on this sort of thing would have to play catchup whenever it changes. I'm looking for a justification that isn't paper-thin. Putting 'for guessing the memory map' in the commit message gives the impression we can support that. >> I think a better argument is walking the kernel page tables from the core dump. >> Core code's vmcoreinfo exports the location of the kernel page tables, but in the example >> above you can't walk them without knowing how T1SZ was configured. > > Sure, both makedumpfile and crash-utility (which walks the kernel page tables from the > core dump) use this (and similar) information currently in the user-space. [...] >> (From-memory: one of vmcore/kcore is virtually addressed, the other physically. Does this >> fix your poblem in both cases?) >> >> >>> diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c >>> index ca4c3e12d8c5..f78310ba65ea 100644 >>> --- a/arch/arm64/kernel/crash_core.c >>> +++ b/arch/arm64/kernel/crash_core.c >>> @@ -7,6 +7,13 @@ >>> #include <linux/crash_core.h> >>> #include <asm/memory.h> >> >> You need to include asm/sysreg.h for read_sysreg(), and asm/pgtable-hwdef.h for the macros >> you added. > > Ok. Will check as I did not get any compilation errors without the same and build-bot also > did not raise a flag for the missing include files. Don't trust the header jungle! >>> +static inline u64 get_tcr_el1_t1sz(void); > >> Why do you need to do this? > > Without this I was getting a missing declaration error, while compiling the code. Missing declaration? >>> +static inline u64 get_tcr_el1_t1sz(void) >>> +{ >>> + return (read_sysreg(tcr_el1) & TCR_T1SZ_MASK) >> TCR_T1SZ_OFFSET; >>> +} Here it is! (I must be going mad...) Thanks, James
----- Original Message ----- > Hi Bhupesh, > > On 25/12/2019 19:01, Bhupesh Sharma wrote: > > On 12/12/2019 04:02 PM, James Morse wrote: > >> On 29/11/2019 19:59, Bhupesh Sharma wrote: > >>> vabits_actual variable on arm64 indicates the actual VA space size, > >>> and allows a single binary to support both 48-bit and 52-bit VA > >>> spaces. > >>> > >>> If the ARMv8.2-LVA optional feature is present, and we are running > >>> with a 64KB page size; then it is possible to use 52-bits of address > >>> space for both userspace and kernel addresses. However, any kernel > >>> binary that supports 52-bit must also be able to fall back to 48-bit > >>> at early boot time if the hardware feature is not present. > >>> > >>> Since TCR_EL1.T1SZ indicates the size offset of the memory region > >>> addressed by TTBR1_EL1 (and hence can be used for determining the > >>> vabits_actual value) it makes more sense to export the same in > >>> vmcoreinfo rather than vabits_actual variable, as the name of the > >>> variable can change in future kernel versions, but the architectural > >>> constructs like TCR_EL1.T1SZ can be used better to indicate intended > >>> specific fields to user-space. > >>> > >>> User-space utilities like makedumpfile and crash-utility, need to > >>> read/write this value from/to vmcoreinfo > >> > >> (write?) > > > > Yes, also write so that the vmcoreinfo from an (crashing) arm64 system can > > be used for > > analysis of the root-cause of panic/crash on say an x86_64 host using > > utilities like > > crash-utility/gdb. > > I read this as as "User-space [...] needs to write to vmcoreinfo". > > > >>> for determining if a virtual address lies in the linear map range. > >> > >> I think this is a fragile example. The debugger shouldn't need to know > >> this. > > > > Well that the current user-space utility design, so I am not sure we can > > tweak that too much. > > > >>> The user-space computation for determining whether an address lies in > >>> the linear map range is the same as we have in kernel-space: > >>> > >>> #define __is_lm_address(addr) (!(((u64)addr) & BIT(vabits_actual - > >>> 1))) > >> > >> This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA space"). If > >> user-space > >> tools rely on 'knowing' the kernel memory layout, they must have to > >> constantly be fixed > >> and updated. This is a poor argument for adding this to something that > >> ends up as ABI. > > > > See above. The user-space has to rely on some ABI/guaranteed > > hardware-symbols which can be > > used for 'determining' the kernel memory layout. > > I disagree. Everything and anything in the kernel will change. The ABI rules apply to > stuff exposed via syscalls and kernel filesystems. It does not apply to kernel internals, > like the memory layout we used yesterday. 14c127c957c1 is a case in point. > > A debugger trying to rely on this sort of thing would have to play catchup whenever it > changes. Exactly. That's the whole point. The crash utility and makedumpfile are not in the same league as other user-space tools. They have always had to "play catchup" precisely because they depend upon kernel internals, which constantly change. Dave
Hi James, On 01/11/2020 12:30 AM, Dave Anderson wrote: > > ----- Original Message ----- >> Hi Bhupesh, >> >> On 25/12/2019 19:01, Bhupesh Sharma wrote: >>> On 12/12/2019 04:02 PM, James Morse wrote: >>>> On 29/11/2019 19:59, Bhupesh Sharma wrote: >>>>> vabits_actual variable on arm64 indicates the actual VA space size, >>>>> and allows a single binary to support both 48-bit and 52-bit VA >>>>> spaces. >>>>> >>>>> If the ARMv8.2-LVA optional feature is present, and we are running >>>>> with a 64KB page size; then it is possible to use 52-bits of address >>>>> space for both userspace and kernel addresses. However, any kernel >>>>> binary that supports 52-bit must also be able to fall back to 48-bit >>>>> at early boot time if the hardware feature is not present. >>>>> >>>>> Since TCR_EL1.T1SZ indicates the size offset of the memory region >>>>> addressed by TTBR1_EL1 (and hence can be used for determining the >>>>> vabits_actual value) it makes more sense to export the same in >>>>> vmcoreinfo rather than vabits_actual variable, as the name of the >>>>> variable can change in future kernel versions, but the architectural >>>>> constructs like TCR_EL1.T1SZ can be used better to indicate intended >>>>> specific fields to user-space. >>>>> >>>>> User-space utilities like makedumpfile and crash-utility, need to >>>>> read/write this value from/to vmcoreinfo >>>> >>>> (write?) >>> >>> Yes, also write so that the vmcoreinfo from an (crashing) arm64 system can >>> be used for >>> analysis of the root-cause of panic/crash on say an x86_64 host using >>> utilities like >>> crash-utility/gdb. >> >> I read this as as "User-space [...] needs to write to vmcoreinfo". That's correct. But for writing to vmcore dump in the kdump kernel, we need to read the symbols from the vmcoreinfo in the primary kernel. >>>>> for determining if a virtual address lies in the linear map range. >>>> >>>> I think this is a fragile example. The debugger shouldn't need to know >>>> this. >>> >>> Well that the current user-space utility design, so I am not sure we can >>> tweak that too much. >>> >>>>> The user-space computation for determining whether an address lies in >>>>> the linear map range is the same as we have in kernel-space: >>>>> >>>>> #define __is_lm_address(addr) (!(((u64)addr) & BIT(vabits_actual - >>>>> 1))) >>>> >>>> This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA space"). If >>>> user-space >>>> tools rely on 'knowing' the kernel memory layout, they must have to >>>> constantly be fixed >>>> and updated. This is a poor argument for adding this to something that >>>> ends up as ABI. >>> >>> See above. The user-space has to rely on some ABI/guaranteed >>> hardware-symbols which can be >>> used for 'determining' the kernel memory layout. >> >> I disagree. Everything and anything in the kernel will change. The ABI rules apply to >> stuff exposed via syscalls and kernel filesystems. It does not apply to kernel internals, >> like the memory layout we used yesterday. 14c127c957c1 is a case in point. >> >> A debugger trying to rely on this sort of thing would have to play catchup whenever it >> changes. > > Exactly. That's the whole point. > > The crash utility and makedumpfile are not in the same league as other user-space tools. > They have always had to "play catchup" precisely because they depend upon kernel internals, > which constantly change. I agree with you and DaveA here. Software user-space debuggers are dependent on kernel internals (which can change from time-to-time) and will have to play catch-up (which has been the case since the very start). Unfortunately we don't have any clear ABI for software debugging tools - may be something to look for in future. A case in point is gdb/kgdb, which still needs to run with KASLR turned-off (nokaslr) for debugging, as it confuses gdb which resolve kernel symbol address from symbol table of vmlinux. But we can work-around the same in makedumpfile/crash by reading the 'kaslr_offset' value. And I have several users telling me now they cannot use gdb on KASLR enabled kernel to debug panics, but can makedumpfile + crash combination to achieve the same. So, we should be looking to fix these utilities which are broken since the 52-bit changes for arm64. Accordingly, I will try to send the v6 soon while incorporating the comments posted on the v5. Thanks, Bhupesh
Hi Bhupesh, On 1/13/20 5:44 PM, Bhupesh Sharma wrote: > Hi James, > > On 01/11/2020 12:30 AM, Dave Anderson wrote: >> >> ----- Original Message ----- >>> Hi Bhupesh, >>> >>> On 25/12/2019 19:01, Bhupesh Sharma wrote: >>>> On 12/12/2019 04:02 PM, James Morse wrote: >>>>> On 29/11/2019 19:59, Bhupesh Sharma wrote: >>>>>> vabits_actual variable on arm64 indicates the actual VA space size, >>>>>> and allows a single binary to support both 48-bit and 52-bit VA >>>>>> spaces. >>>>>> >>>>>> If the ARMv8.2-LVA optional feature is present, and we are running >>>>>> with a 64KB page size; then it is possible to use 52-bits of address >>>>>> space for both userspace and kernel addresses. However, any kernel >>>>>> binary that supports 52-bit must also be able to fall back to 48-bit >>>>>> at early boot time if the hardware feature is not present. >>>>>> >>>>>> Since TCR_EL1.T1SZ indicates the size offset of the memory region >>>>>> addressed by TTBR1_EL1 (and hence can be used for determining the >>>>>> vabits_actual value) it makes more sense to export the same in >>>>>> vmcoreinfo rather than vabits_actual variable, as the name of the >>>>>> variable can change in future kernel versions, but the architectural >>>>>> constructs like TCR_EL1.T1SZ can be used better to indicate intended >>>>>> specific fields to user-space. >>>>>> >>>>>> User-space utilities like makedumpfile and crash-utility, need to >>>>>> read/write this value from/to vmcoreinfo >>>>> >>>>> (write?) >>>> >>>> Yes, also write so that the vmcoreinfo from an (crashing) arm64 >>>> system can >>>> be used for >>>> analysis of the root-cause of panic/crash on say an x86_64 host using >>>> utilities like >>>> crash-utility/gdb. >>> >>> I read this as as "User-space [...] needs to write to vmcoreinfo". > > That's correct. But for writing to vmcore dump in the kdump kernel, we > need to read the symbols from the vmcoreinfo in the primary kernel. > >>>>>> for determining if a virtual address lies in the linear map range. >>>>> >>>>> I think this is a fragile example. The debugger shouldn't need to know >>>>> this. >>>> >>>> Well that the current user-space utility design, so I am not sure we >>>> can >>>> tweak that too much. >>>> >>>>>> The user-space computation for determining whether an address lies in >>>>>> the linear map range is the same as we have in kernel-space: >>>>>> >>>>>> #define __is_lm_address(addr) (!(((u64)addr) & >>>>>> BIT(vabits_actual - >>>>>> 1))) >>>>> >>>>> This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA >>>>> space"). If >>>>> user-space >>>>> tools rely on 'knowing' the kernel memory layout, they must have to >>>>> constantly be fixed >>>>> and updated. This is a poor argument for adding this to something that >>>>> ends up as ABI. >>>> >>>> See above. The user-space has to rely on some ABI/guaranteed >>>> hardware-symbols which can be >>>> used for 'determining' the kernel memory layout. >>> >>> I disagree. Everything and anything in the kernel will change. The >>> ABI rules apply to >>> stuff exposed via syscalls and kernel filesystems. It does not apply >>> to kernel internals, >>> like the memory layout we used yesterday. 14c127c957c1 is a case in >>> point. >>> >>> A debugger trying to rely on this sort of thing would have to play >>> catchup whenever it >>> changes. >> >> Exactly. That's the whole point. >> >> The crash utility and makedumpfile are not in the same league as other >> user-space tools. >> They have always had to "play catchup" precisely because they depend >> upon kernel internals, >> which constantly change. > > I agree with you and DaveA here. Software user-space debuggers are > dependent on kernel internals (which can change from time-to-time) and > will have to play catch-up (which has been the case since the very start). > > Unfortunately we don't have any clear ABI for software debugging tools - > may be something to look for in future. > > A case in point is gdb/kgdb, which still needs to run with KASLR > turned-off (nokaslr) for debugging, as it confuses gdb which resolve > kernel symbol address from symbol table of vmlinux. But we can > work-around the same in makedumpfile/crash by reading the 'kaslr_offset' > value. And I have several users telling me now they cannot use gdb on > KASLR enabled kernel to debug panics, but can makedumpfile + crash > combination to achieve the same. > > So, we should be looking to fix these utilities which are broken since > the 52-bit changes for arm64. Accordingly, I will try to send the v6 > soon while incorporating the comments posted on the v5. Any update on the next v6 version. Since this patch series is fixing the current broken kdump so need this series to add some more fields in vmcoreinfo for Pointer Authentication work. Thanks, Amit Daniel > > Thanks, > Bhupesh > > > > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Hi Amit, On Fri, Feb 21, 2020 at 2:36 PM Amit Kachhap <amit.kachhap@arm.com> wrote: > > Hi Bhupesh, > > On 1/13/20 5:44 PM, Bhupesh Sharma wrote: > > Hi James, > > > > On 01/11/2020 12:30 AM, Dave Anderson wrote: > >> > >> ----- Original Message ----- > >>> Hi Bhupesh, > >>> > >>> On 25/12/2019 19:01, Bhupesh Sharma wrote: > >>>> On 12/12/2019 04:02 PM, James Morse wrote: > >>>>> On 29/11/2019 19:59, Bhupesh Sharma wrote: > >>>>>> vabits_actual variable on arm64 indicates the actual VA space size, > >>>>>> and allows a single binary to support both 48-bit and 52-bit VA > >>>>>> spaces. > >>>>>> > >>>>>> If the ARMv8.2-LVA optional feature is present, and we are running > >>>>>> with a 64KB page size; then it is possible to use 52-bits of address > >>>>>> space for both userspace and kernel addresses. However, any kernel > >>>>>> binary that supports 52-bit must also be able to fall back to 48-bit > >>>>>> at early boot time if the hardware feature is not present. > >>>>>> > >>>>>> Since TCR_EL1.T1SZ indicates the size offset of the memory region > >>>>>> addressed by TTBR1_EL1 (and hence can be used for determining the > >>>>>> vabits_actual value) it makes more sense to export the same in > >>>>>> vmcoreinfo rather than vabits_actual variable, as the name of the > >>>>>> variable can change in future kernel versions, but the architectural > >>>>>> constructs like TCR_EL1.T1SZ can be used better to indicate intended > >>>>>> specific fields to user-space. > >>>>>> > >>>>>> User-space utilities like makedumpfile and crash-utility, need to > >>>>>> read/write this value from/to vmcoreinfo > >>>>> > >>>>> (write?) > >>>> > >>>> Yes, also write so that the vmcoreinfo from an (crashing) arm64 > >>>> system can > >>>> be used for > >>>> analysis of the root-cause of panic/crash on say an x86_64 host using > >>>> utilities like > >>>> crash-utility/gdb. > >>> > >>> I read this as as "User-space [...] needs to write to vmcoreinfo". > > > > That's correct. But for writing to vmcore dump in the kdump kernel, we > > need to read the symbols from the vmcoreinfo in the primary kernel. > > > >>>>>> for determining if a virtual address lies in the linear map range. > >>>>> > >>>>> I think this is a fragile example. The debugger shouldn't need to know > >>>>> this. > >>>> > >>>> Well that the current user-space utility design, so I am not sure we > >>>> can > >>>> tweak that too much. > >>>> > >>>>>> The user-space computation for determining whether an address lies in > >>>>>> the linear map range is the same as we have in kernel-space: > >>>>>> > >>>>>> #define __is_lm_address(addr) (!(((u64)addr) & > >>>>>> BIT(vabits_actual - > >>>>>> 1))) > >>>>> > >>>>> This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA > >>>>> space"). If > >>>>> user-space > >>>>> tools rely on 'knowing' the kernel memory layout, they must have to > >>>>> constantly be fixed > >>>>> and updated. This is a poor argument for adding this to something that > >>>>> ends up as ABI. > >>>> > >>>> See above. The user-space has to rely on some ABI/guaranteed > >>>> hardware-symbols which can be > >>>> used for 'determining' the kernel memory layout. > >>> > >>> I disagree. Everything and anything in the kernel will change. The > >>> ABI rules apply to > >>> stuff exposed via syscalls and kernel filesystems. It does not apply > >>> to kernel internals, > >>> like the memory layout we used yesterday. 14c127c957c1 is a case in > >>> point. > >>> > >>> A debugger trying to rely on this sort of thing would have to play > >>> catchup whenever it > >>> changes. > >> > >> Exactly. That's the whole point. > >> > >> The crash utility and makedumpfile are not in the same league as other > >> user-space tools. > >> They have always had to "play catchup" precisely because they depend > >> upon kernel internals, > >> which constantly change. > > > > I agree with you and DaveA here. Software user-space debuggers are > > dependent on kernel internals (which can change from time-to-time) and > > will have to play catch-up (which has been the case since the very start). > > > > Unfortunately we don't have any clear ABI for software debugging tools - > > may be something to look for in future. > > > > A case in point is gdb/kgdb, which still needs to run with KASLR > > turned-off (nokaslr) for debugging, as it confuses gdb which resolve > > kernel symbol address from symbol table of vmlinux. But we can > > work-around the same in makedumpfile/crash by reading the 'kaslr_offset' > > value. And I have several users telling me now they cannot use gdb on > > KASLR enabled kernel to debug panics, but can makedumpfile + crash > > combination to achieve the same. > > > > So, we should be looking to fix these utilities which are broken since > > the 52-bit changes for arm64. Accordingly, I will try to send the v6 > > soon while incorporating the comments posted on the v5. > > Any update on the next v6 version. Since this patch series is fixing the > current broken kdump so need this series to add some more fields in > vmcoreinfo for Pointer Authentication work. Sorry for the delay. I was caught up in some other urgent arm64 user-space issues. I am preparing the v6 now and hopefully will be able to post it out for review later today. Thanks, Bhupesh
Hi Bhupesh, On 2020-02-23 10:25 p.m., Bhupesh Sharma wrote: > Hi Amit, > > On Fri, Feb 21, 2020 at 2:36 PM Amit Kachhap <amit.kachhap@arm.com> wrote: >> Hi Bhupesh, >> >> On 1/13/20 5:44 PM, Bhupesh Sharma wrote: >>> Hi James, >>> >>> On 01/11/2020 12:30 AM, Dave Anderson wrote: >>>> ----- Original Message ----- >>>>> Hi Bhupesh, >>>>> >>>>> On 25/12/2019 19:01, Bhupesh Sharma wrote: >>>>>> On 12/12/2019 04:02 PM, James Morse wrote: >>>>>>> On 29/11/2019 19:59, Bhupesh Sharma wrote: >>>>>>>> vabits_actual variable on arm64 indicates the actual VA space size, >>>>>>>> and allows a single binary to support both 48-bit and 52-bit VA >>>>>>>> spaces. >>>>>>>> >>>>>>>> If the ARMv8.2-LVA optional feature is present, and we are running >>>>>>>> with a 64KB page size; then it is possible to use 52-bits of address >>>>>>>> space for both userspace and kernel addresses. However, any kernel >>>>>>>> binary that supports 52-bit must also be able to fall back to 48-bit >>>>>>>> at early boot time if the hardware feature is not present. >>>>>>>> >>>>>>>> Since TCR_EL1.T1SZ indicates the size offset of the memory region >>>>>>>> addressed by TTBR1_EL1 (and hence can be used for determining the >>>>>>>> vabits_actual value) it makes more sense to export the same in >>>>>>>> vmcoreinfo rather than vabits_actual variable, as the name of the >>>>>>>> variable can change in future kernel versions, but the architectural >>>>>>>> constructs like TCR_EL1.T1SZ can be used better to indicate intended >>>>>>>> specific fields to user-space. >>>>>>>> >>>>>>>> User-space utilities like makedumpfile and crash-utility, need to >>>>>>>> read/write this value from/to vmcoreinfo >>>>>>> (write?) >>>>>> Yes, also write so that the vmcoreinfo from an (crashing) arm64 >>>>>> system can >>>>>> be used for >>>>>> analysis of the root-cause of panic/crash on say an x86_64 host using >>>>>> utilities like >>>>>> crash-utility/gdb. >>>>> I read this as as "User-space [...] needs to write to vmcoreinfo". >>> That's correct. But for writing to vmcore dump in the kdump kernel, we >>> need to read the symbols from the vmcoreinfo in the primary kernel. >>> >>>>>>>> for determining if a virtual address lies in the linear map range. >>>>>>> I think this is a fragile example. The debugger shouldn't need to know >>>>>>> this. >>>>>> Well that the current user-space utility design, so I am not sure we >>>>>> can >>>>>> tweak that too much. >>>>>> >>>>>>>> The user-space computation for determining whether an address lies in >>>>>>>> the linear map range is the same as we have in kernel-space: >>>>>>>> >>>>>>>> #define __is_lm_address(addr) (!(((u64)addr) & >>>>>>>> BIT(vabits_actual - >>>>>>>> 1))) >>>>>>> This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA >>>>>>> space"). If >>>>>>> user-space >>>>>>> tools rely on 'knowing' the kernel memory layout, they must have to >>>>>>> constantly be fixed >>>>>>> and updated. This is a poor argument for adding this to something that >>>>>>> ends up as ABI. >>>>>> See above. The user-space has to rely on some ABI/guaranteed >>>>>> hardware-symbols which can be >>>>>> used for 'determining' the kernel memory layout. >>>>> I disagree. Everything and anything in the kernel will change. The >>>>> ABI rules apply to >>>>> stuff exposed via syscalls and kernel filesystems. It does not apply >>>>> to kernel internals, >>>>> like the memory layout we used yesterday. 14c127c957c1 is a case in >>>>> point. >>>>> >>>>> A debugger trying to rely on this sort of thing would have to play >>>>> catchup whenever it >>>>> changes. >>>> Exactly. That's the whole point. >>>> >>>> The crash utility and makedumpfile are not in the same league as other >>>> user-space tools. >>>> They have always had to "play catchup" precisely because they depend >>>> upon kernel internals, >>>> which constantly change. >>> I agree with you and DaveA here. Software user-space debuggers are >>> dependent on kernel internals (which can change from time-to-time) and >>> will have to play catch-up (which has been the case since the very start). >>> >>> Unfortunately we don't have any clear ABI for software debugging tools - >>> may be something to look for in future. >>> >>> A case in point is gdb/kgdb, which still needs to run with KASLR >>> turned-off (nokaslr) for debugging, as it confuses gdb which resolve >>> kernel symbol address from symbol table of vmlinux. But we can >>> work-around the same in makedumpfile/crash by reading the 'kaslr_offset' >>> value. And I have several users telling me now they cannot use gdb on >>> KASLR enabled kernel to debug panics, but can makedumpfile + crash >>> combination to achieve the same. >>> >>> So, we should be looking to fix these utilities which are broken since >>> the 52-bit changes for arm64. Accordingly, I will try to send the v6 >>> soon while incorporating the comments posted on the v5. >> Any update on the next v6 version. Since this patch series is fixing the >> current broken kdump so need this series to add some more fields in >> vmcoreinfo for Pointer Authentication work. > Sorry for the delay. I was caught up in some other urgent arm64 > user-space issues. > I am preparing the v6 now and hopefully will be able to post it out > for review later today. Did v6 get sent out? > > Thanks, > Bhupesh > > Regards, Scott
Hello Bhupesh, V6 patch set on Linux 5.7, did not help. I have applied makedump file http://lists.infradead.org/pipermail/kexec/2019-November/023963.html changes also (makedump-1.6.6). Tried to apply it on makedumpfile 1.6.7. Patch set_2 failed. Would like to know, if you have V5 patch set for makedump file changes. With makedump 1.6.6, able to collect the vmore file. I used latest crash utility (https://www.redhat.com/archives/crash-utility/2019-November/msg00014.html changes are present) When I used crash utility, following is the error: Thanks, -Bharat -----Original Message----- From: Scott Branden [mailto:scott.branden@broadcom.com] Sent: Thursday, April 30, 2020 4:34 AM To: Bhupesh Sharma; Amit Kachhap Cc: Mark Rutland; x86@kernel.org; Will Deacon; Linux Doc Mailing List; Catalin Marinas; Ard Biesheuvel; kexec mailing list; Linux Kernel Mailing List; Kazuhito Hagio; James Morse; Dave Anderson; bhupesh linux; linuxppc-dev@lists.ozlabs.org; linux-arm-kernel; Steve Capper; Ray Jui; Bharat Gooty Subject: Re: Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo Hi Bhupesh, On 2020-02-23 10:25 p.m., Bhupesh Sharma wrote: > Hi Amit, > > On Fri, Feb 21, 2020 at 2:36 PM Amit Kachhap <amit.kachhap@arm.com> wrote: >> Hi Bhupesh, >> >> On 1/13/20 5:44 PM, Bhupesh Sharma wrote: >>> Hi James, >>> >>> On 01/11/2020 12:30 AM, Dave Anderson wrote: >>>> ----- Original Message ----- >>>>> Hi Bhupesh, >>>>> >>>>> On 25/12/2019 19:01, Bhupesh Sharma wrote: >>>>>> On 12/12/2019 04:02 PM, James Morse wrote: >>>>>>> On 29/11/2019 19:59, Bhupesh Sharma wrote: >>>>>>>> vabits_actual variable on arm64 indicates the actual VA space size, >>>>>>>> and allows a single binary to support both 48-bit and 52-bit VA >>>>>>>> spaces. >>>>>>>> >>>>>>>> If the ARMv8.2-LVA optional feature is present, and we are running >>>>>>>> with a 64KB page size; then it is possible to use 52-bits of >>>>>>>> address >>>>>>>> space for both userspace and kernel addresses. However, any kernel >>>>>>>> binary that supports 52-bit must also be able to fall back to >>>>>>>> 48-bit >>>>>>>> at early boot time if the hardware feature is not present. >>>>>>>> >>>>>>>> Since TCR_EL1.T1SZ indicates the size offset of the memory region >>>>>>>> addressed by TTBR1_EL1 (and hence can be used for determining the >>>>>>>> vabits_actual value) it makes more sense to export the same in >>>>>>>> vmcoreinfo rather than vabits_actual variable, as the name of the >>>>>>>> variable can change in future kernel versions, but the >>>>>>>> architectural >>>>>>>> constructs like TCR_EL1.T1SZ can be used better to indicate >>>>>>>> intended >>>>>>>> specific fields to user-space. >>>>>>>> >>>>>>>> User-space utilities like makedumpfile and crash-utility, need to >>>>>>>> read/write this value from/to vmcoreinfo >>>>>>> (write?) >>>>>> Yes, also write so that the vmcoreinfo from an (crashing) arm64 >>>>>> system can >>>>>> be used for >>>>>> analysis of the root-cause of panic/crash on say an x86_64 host using >>>>>> utilities like >>>>>> crash-utility/gdb. >>>>> I read this as as "User-space [...] needs to write to vmcoreinfo". >>> That's correct. But for writing to vmcore dump in the kdump kernel, we >>> need to read the symbols from the vmcoreinfo in the primary kernel. >>> >>>>>>>> for determining if a virtual address lies in the linear map range. >>>>>>> I think this is a fragile example. The debugger shouldn't need to >>>>>>> know >>>>>>> this. >>>>>> Well that the current user-space utility design, so I am not sure we >>>>>> can >>>>>> tweak that too much. >>>>>> >>>>>>>> The user-space computation for determining whether an address lies >>>>>>>> in >>>>>>>> the linear map range is the same as we have in kernel-space: >>>>>>>> >>>>>>>> #define __is_lm_address(addr) (!(((u64)addr) & >>>>>>>> BIT(vabits_actual - >>>>>>>> 1))) >>>>>>> This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA >>>>>>> space"). If >>>>>>> user-space >>>>>>> tools rely on 'knowing' the kernel memory layout, they must have to >>>>>>> constantly be fixed >>>>>>> and updated. This is a poor argument for adding this to something >>>>>>> that >>>>>>> ends up as ABI. >>>>>> See above. The user-space has to rely on some ABI/guaranteed >>>>>> hardware-symbols which can be >>>>>> used for 'determining' the kernel memory layout. >>>>> I disagree. Everything and anything in the kernel will change. The >>>>> ABI rules apply to >>>>> stuff exposed via syscalls and kernel filesystems. It does not apply >>>>> to kernel internals, >>>>> like the memory layout we used yesterday. 14c127c957c1 is a case in >>>>> point. >>>>> >>>>> A debugger trying to rely on this sort of thing would have to play >>>>> catchup whenever it >>>>> changes. >>>> Exactly. That's the whole point. >>>> >>>> The crash utility and makedumpfile are not in the same league as other >>>> user-space tools. >>>> They have always had to "play catchup" precisely because they depend >>>> upon kernel internals, >>>> which constantly change. >>> I agree with you and DaveA here. Software user-space debuggers are >>> dependent on kernel internals (which can change from time-to-time) and >>> will have to play catch-up (which has been the case since the very >>> start). >>> >>> Unfortunately we don't have any clear ABI for software debugging tools - >>> may be something to look for in future. >>> >>> A case in point is gdb/kgdb, which still needs to run with KASLR >>> turned-off (nokaslr) for debugging, as it confuses gdb which resolve >>> kernel symbol address from symbol table of vmlinux. But we can >>> work-around the same in makedumpfile/crash by reading the 'kaslr_offset' >>> value. And I have several users telling me now they cannot use gdb on >>> KASLR enabled kernel to debug panics, but can makedumpfile + crash >>> combination to achieve the same. >>> >>> So, we should be looking to fix these utilities which are broken since >>> the 52-bit changes for arm64. Accordingly, I will try to send the v6 >>> soon while incorporating the comments posted on the v5. >> Any update on the next v6 version. Since this patch series is fixing the >> current broken kdump so need this series to add some more fields in >> vmcoreinfo for Pointer Authentication work. > Sorry for the delay. I was caught up in some other urgent arm64 > user-space issues. > I am preparing the v6 now and hopefully will be able to post it out > for review later today. Did v6 get sent out? > > Thanks, > Bhupesh > > Regards, Scott
Sorry, error message was not posted. Following is the error message crash: cannot determine VA_BITS_ACTUAL -----Original Message----- From: Bharat Gooty [mailto:bharat.gooty@broadcom.com] Sent: Wednesday, June 10, 2020 10:18 PM To: Scott Branden; 'Bhupesh Sharma'; 'Amit Kachhap' Cc: 'Mark Rutland'; 'x86@kernel.org'; 'Will Deacon'; 'Linux Doc Mailing List'; 'Catalin Marinas'; 'Ard Biesheuvel'; 'kexec mailing list'; 'Linux Kernel Mailing List'; 'Kazuhito Hagio'; 'James Morse'; 'Dave Anderson'; 'bhupesh linux'; 'linuxppc-dev@lists.ozlabs.org'; 'linux-arm-kernel'; 'Steve Capper'; Ray Jui Subject: RE: Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo Hello Bhupesh, V6 patch set on Linux 5.7, did not help. I have applied makedump file http://lists.infradead.org/pipermail/kexec/2019-November/023963.html changes also (makedump-1.6.6). Tried to apply it on makedumpfile 1.6.7. Patch set_2 failed. Would like to know, if you have V5 patch set for makedump file changes. With makedump 1.6.6, able to collect the vmore file. I used latest crash utility (https://www.redhat.com/archives/crash-utility/2019-November/msg00014.html changes are present) When I used crash utility, following is the error: Thanks, -Bharat -----Original Message----- From: Scott Branden [mailto:scott.branden@broadcom.com] Sent: Thursday, April 30, 2020 4:34 AM To: Bhupesh Sharma; Amit Kachhap Cc: Mark Rutland; x86@kernel.org; Will Deacon; Linux Doc Mailing List; Catalin Marinas; Ard Biesheuvel; kexec mailing list; Linux Kernel Mailing List; Kazuhito Hagio; James Morse; Dave Anderson; bhupesh linux; linuxppc-dev@lists.ozlabs.org; linux-arm-kernel; Steve Capper; Ray Jui; Bharat Gooty Subject: Re: Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ in vmcoreinfo Hi Bhupesh, On 2020-02-23 10:25 p.m., Bhupesh Sharma wrote: > Hi Amit, > > On Fri, Feb 21, 2020 at 2:36 PM Amit Kachhap <amit.kachhap@arm.com> wrote: >> Hi Bhupesh, >> >> On 1/13/20 5:44 PM, Bhupesh Sharma wrote: >>> Hi James, >>> >>> On 01/11/2020 12:30 AM, Dave Anderson wrote: >>>> ----- Original Message ----- >>>>> Hi Bhupesh, >>>>> >>>>> On 25/12/2019 19:01, Bhupesh Sharma wrote: >>>>>> On 12/12/2019 04:02 PM, James Morse wrote: >>>>>>> On 29/11/2019 19:59, Bhupesh Sharma wrote: >>>>>>>> vabits_actual variable on arm64 indicates the actual VA space size, >>>>>>>> and allows a single binary to support both 48-bit and 52-bit VA >>>>>>>> spaces. >>>>>>>> >>>>>>>> If the ARMv8.2-LVA optional feature is present, and we are running >>>>>>>> with a 64KB page size; then it is possible to use 52-bits of >>>>>>>> address >>>>>>>> space for both userspace and kernel addresses. However, any kernel >>>>>>>> binary that supports 52-bit must also be able to fall back to >>>>>>>> 48-bit >>>>>>>> at early boot time if the hardware feature is not present. >>>>>>>> >>>>>>>> Since TCR_EL1.T1SZ indicates the size offset of the memory region >>>>>>>> addressed by TTBR1_EL1 (and hence can be used for determining the >>>>>>>> vabits_actual value) it makes more sense to export the same in >>>>>>>> vmcoreinfo rather than vabits_actual variable, as the name of the >>>>>>>> variable can change in future kernel versions, but the >>>>>>>> architectural >>>>>>>> constructs like TCR_EL1.T1SZ can be used better to indicate >>>>>>>> intended >>>>>>>> specific fields to user-space. >>>>>>>> >>>>>>>> User-space utilities like makedumpfile and crash-utility, need to >>>>>>>> read/write this value from/to vmcoreinfo >>>>>>> (write?) >>>>>> Yes, also write so that the vmcoreinfo from an (crashing) arm64 >>>>>> system can >>>>>> be used for >>>>>> analysis of the root-cause of panic/crash on say an x86_64 host using >>>>>> utilities like >>>>>> crash-utility/gdb. >>>>> I read this as as "User-space [...] needs to write to vmcoreinfo". >>> That's correct. But for writing to vmcore dump in the kdump kernel, we >>> need to read the symbols from the vmcoreinfo in the primary kernel. >>> >>>>>>>> for determining if a virtual address lies in the linear map range. >>>>>>> I think this is a fragile example. The debugger shouldn't need to >>>>>>> know >>>>>>> this. >>>>>> Well that the current user-space utility design, so I am not sure we >>>>>> can >>>>>> tweak that too much. >>>>>> >>>>>>>> The user-space computation for determining whether an address lies >>>>>>>> in >>>>>>>> the linear map range is the same as we have in kernel-space: >>>>>>>> >>>>>>>> #define __is_lm_address(addr) (!(((u64)addr) & >>>>>>>> BIT(vabits_actual - >>>>>>>> 1))) >>>>>>> This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA >>>>>>> space"). If >>>>>>> user-space >>>>>>> tools rely on 'knowing' the kernel memory layout, they must have to >>>>>>> constantly be fixed >>>>>>> and updated. This is a poor argument for adding this to something >>>>>>> that >>>>>>> ends up as ABI. >>>>>> See above. The user-space has to rely on some ABI/guaranteed >>>>>> hardware-symbols which can be >>>>>> used for 'determining' the kernel memory layout. >>>>> I disagree. Everything and anything in the kernel will change. The >>>>> ABI rules apply to >>>>> stuff exposed via syscalls and kernel filesystems. It does not apply >>>>> to kernel internals, >>>>> like the memory layout we used yesterday. 14c127c957c1 is a case in >>>>> point. >>>>> >>>>> A debugger trying to rely on this sort of thing would have to play >>>>> catchup whenever it >>>>> changes. >>>> Exactly. That's the whole point. >>>> >>>> The crash utility and makedumpfile are not in the same league as other >>>> user-space tools. >>>> They have always had to "play catchup" precisely because they depend >>>> upon kernel internals, >>>> which constantly change. >>> I agree with you and DaveA here. Software user-space debuggers are >>> dependent on kernel internals (which can change from time-to-time) and >>> will have to play catch-up (which has been the case since the very >>> start). >>> >>> Unfortunately we don't have any clear ABI for software debugging tools - >>> may be something to look for in future. >>> >>> A case in point is gdb/kgdb, which still needs to run with KASLR >>> turned-off (nokaslr) for debugging, as it confuses gdb which resolve >>> kernel symbol address from symbol table of vmlinux. But we can >>> work-around the same in makedumpfile/crash by reading the 'kaslr_offset' >>> value. And I have several users telling me now they cannot use gdb on >>> KASLR enabled kernel to debug panics, but can makedumpfile + crash >>> combination to achieve the same. >>> >>> So, we should be looking to fix these utilities which are broken since >>> the 52-bit changes for arm64. Accordingly, I will try to send the v6 >>> soon while incorporating the comments posted on the v5. >> Any update on the next v6 version. Since this patch series is fixing the >> current broken kdump so need this series to add some more fields in >> vmcoreinfo for Pointer Authentication work. > Sorry for the delay. I was caught up in some other urgent arm64 > user-space issues. > I am preparing the v6 now and hopefully will be able to post it out > for review later today. Did v6 get sent out? > > Thanks, > Bhupesh > > Regards, Scott
Hello Bharat, On Wed, Jun 10, 2020 at 10:17 PM Bharat Gooty <bharat.gooty@broadcom.com> wrote: > > Hello Bhupesh, > V6 patch set on Linux 5.7, did not help. > I have applied makedump file > http://lists.infradead.org/pipermail/kexec/2019-November/023963.html changes > also (makedump-1.6.6). Tried to apply it on makedumpfile 1.6.7. Patch set_2 > failed. Would like to know, if you have V5 patch set for makedump file > changes. With makedump 1.6.6, able to collect the vmore file. > I used latest crash utility > (https://www.redhat.com/archives/crash-utility/2019-November/msg00014.html > changes are present) > When I used crash utility, following is the error: > > Thanks, > -Bharat > > > -----Original Message----- > From: Scott Branden [mailto:scott.branden@broadcom.com] > Sent: Thursday, April 30, 2020 4:34 AM > To: Bhupesh Sharma; Amit Kachhap > Cc: Mark Rutland; x86@kernel.org; Will Deacon; Linux Doc Mailing List; > Catalin Marinas; Ard Biesheuvel; kexec mailing list; Linux Kernel Mailing > List; Kazuhito Hagio; James Morse; Dave Anderson; bhupesh linux; > linuxppc-dev@lists.ozlabs.org; linux-arm-kernel; Steve Capper; Ray Jui; > Bharat Gooty > Subject: Re: Re: [RESEND PATCH v5 2/5] arm64/crash_core: Export TCR_EL1.T1SZ > in vmcoreinfo > > Hi Bhupesh, > > On 2020-02-23 10:25 p.m., Bhupesh Sharma wrote: > > Hi Amit, > > > > On Fri, Feb 21, 2020 at 2:36 PM Amit Kachhap <amit.kachhap@arm.com> wrote: > >> Hi Bhupesh, > >> > >> On 1/13/20 5:44 PM, Bhupesh Sharma wrote: > >>> Hi James, > >>> > >>> On 01/11/2020 12:30 AM, Dave Anderson wrote: > >>>> ----- Original Message ----- > >>>>> Hi Bhupesh, > >>>>> > >>>>> On 25/12/2019 19:01, Bhupesh Sharma wrote: > >>>>>> On 12/12/2019 04:02 PM, James Morse wrote: > >>>>>>> On 29/11/2019 19:59, Bhupesh Sharma wrote: > >>>>>>>> vabits_actual variable on arm64 indicates the actual VA space size, > >>>>>>>> and allows a single binary to support both 48-bit and 52-bit VA > >>>>>>>> spaces. > >>>>>>>> > >>>>>>>> If the ARMv8.2-LVA optional feature is present, and we are running > >>>>>>>> with a 64KB page size; then it is possible to use 52-bits of > >>>>>>>> address > >>>>>>>> space for both userspace and kernel addresses. However, any kernel > >>>>>>>> binary that supports 52-bit must also be able to fall back to > >>>>>>>> 48-bit > >>>>>>>> at early boot time if the hardware feature is not present. > >>>>>>>> > >>>>>>>> Since TCR_EL1.T1SZ indicates the size offset of the memory region > >>>>>>>> addressed by TTBR1_EL1 (and hence can be used for determining the > >>>>>>>> vabits_actual value) it makes more sense to export the same in > >>>>>>>> vmcoreinfo rather than vabits_actual variable, as the name of the > >>>>>>>> variable can change in future kernel versions, but the > >>>>>>>> architectural > >>>>>>>> constructs like TCR_EL1.T1SZ can be used better to indicate > >>>>>>>> intended > >>>>>>>> specific fields to user-space. > >>>>>>>> > >>>>>>>> User-space utilities like makedumpfile and crash-utility, need to > >>>>>>>> read/write this value from/to vmcoreinfo > >>>>>>> (write?) > >>>>>> Yes, also write so that the vmcoreinfo from an (crashing) arm64 > >>>>>> system can > >>>>>> be used for > >>>>>> analysis of the root-cause of panic/crash on say an x86_64 host using > >>>>>> utilities like > >>>>>> crash-utility/gdb. > >>>>> I read this as as "User-space [...] needs to write to vmcoreinfo". > >>> That's correct. But for writing to vmcore dump in the kdump kernel, we > >>> need to read the symbols from the vmcoreinfo in the primary kernel. > >>> > >>>>>>>> for determining if a virtual address lies in the linear map range. > >>>>>>> I think this is a fragile example. The debugger shouldn't need to > >>>>>>> know > >>>>>>> this. > >>>>>> Well that the current user-space utility design, so I am not sure we > >>>>>> can > >>>>>> tweak that too much. > >>>>>> > >>>>>>>> The user-space computation for determining whether an address lies > >>>>>>>> in > >>>>>>>> the linear map range is the same as we have in kernel-space: > >>>>>>>> > >>>>>>>> #define __is_lm_address(addr) (!(((u64)addr) & > >>>>>>>> BIT(vabits_actual - > >>>>>>>> 1))) > >>>>>>> This was changed with 14c127c957c1 ("arm64: mm: Flip kernel VA > >>>>>>> space"). If > >>>>>>> user-space > >>>>>>> tools rely on 'knowing' the kernel memory layout, they must have to > >>>>>>> constantly be fixed > >>>>>>> and updated. This is a poor argument for adding this to something > >>>>>>> that > >>>>>>> ends up as ABI. > >>>>>> See above. The user-space has to rely on some ABI/guaranteed > >>>>>> hardware-symbols which can be > >>>>>> used for 'determining' the kernel memory layout. > >>>>> I disagree. Everything and anything in the kernel will change. The > >>>>> ABI rules apply to > >>>>> stuff exposed via syscalls and kernel filesystems. It does not apply > >>>>> to kernel internals, > >>>>> like the memory layout we used yesterday. 14c127c957c1 is a case in > >>>>> point. > >>>>> > >>>>> A debugger trying to rely on this sort of thing would have to play > >>>>> catchup whenever it > >>>>> changes. > >>>> Exactly. That's the whole point. > >>>> > >>>> The crash utility and makedumpfile are not in the same league as other > >>>> user-space tools. > >>>> They have always had to "play catchup" precisely because they depend > >>>> upon kernel internals, > >>>> which constantly change. > >>> I agree with you and DaveA here. Software user-space debuggers are > >>> dependent on kernel internals (which can change from time-to-time) and > >>> will have to play catch-up (which has been the case since the very > >>> start). > >>> > >>> Unfortunately we don't have any clear ABI for software debugging tools - > >>> may be something to look for in future. > >>> > >>> A case in point is gdb/kgdb, which still needs to run with KASLR > >>> turned-off (nokaslr) for debugging, as it confuses gdb which resolve > >>> kernel symbol address from symbol table of vmlinux. But we can > >>> work-around the same in makedumpfile/crash by reading the 'kaslr_offset' > >>> value. And I have several users telling me now they cannot use gdb on > >>> KASLR enabled kernel to debug panics, but can makedumpfile + crash > >>> combination to achieve the same. > >>> > >>> So, we should be looking to fix these utilities which are broken since > >>> the 52-bit changes for arm64. Accordingly, I will try to send the v6 > >>> soon while incorporating the comments posted on the v5. > >> Any update on the next v6 version. Since this patch series is fixing the > >> current broken kdump so need this series to add some more fields in > >> vmcoreinfo for Pointer Authentication work. > > Sorry for the delay. I was caught up in some other urgent arm64 > > user-space issues. > > I am preparing the v6 now and hopefully will be able to post it out > > for review later today. > > Did v6 get sent out? Like I mentioned in a different thread reply, I did not put out the user-space changes just yet, since we are waiting for the kernel patches to be accepted first. In the last review cycle (v5) we had inconsistencies between the user-space and kernel as user-space utilities like crash accepted the v5 patches while we had to respin the v6 for the kernel side. But since a few other Red Hat arm partners have asked for the same, please find below my public github trees (with prescribed branches), which you can use for testing the v6 kernel patchset: 1. makedumpfile: <https://github.com/bhupesh-sharma/makedumpfile/tree/52-bit-va-support-via-vmcore-upstream-v6-rebase> 2. crash-utility: <https://github.com/bhupesh-sharma/crash/tree/52-bit-va-support-via-vmcore-upstream-v6-rebase> Hope this helps. Thanks, Bhupesh
diff --git a/arch/arm64/include/asm/pgtable-hwdef.h b/arch/arm64/include/asm/pgtable-hwdef.h index d9fbd433cc17..d2e7aff5821e 100644 --- a/arch/arm64/include/asm/pgtable-hwdef.h +++ b/arch/arm64/include/asm/pgtable-hwdef.h @@ -215,6 +215,7 @@ #define TCR_TxSZ(x) (TCR_T0SZ(x) | TCR_T1SZ(x)) #define TCR_TxSZ_WIDTH 6 #define TCR_T0SZ_MASK (((UL(1) << TCR_TxSZ_WIDTH) - 1) << TCR_T0SZ_OFFSET) +#define TCR_T1SZ_MASK (((UL(1) << TCR_TxSZ_WIDTH) - 1) << TCR_T1SZ_OFFSET) #define TCR_EPD0_SHIFT 7 #define TCR_EPD0_MASK (UL(1) << TCR_EPD0_SHIFT) diff --git a/arch/arm64/kernel/crash_core.c b/arch/arm64/kernel/crash_core.c index ca4c3e12d8c5..f78310ba65ea 100644 --- a/arch/arm64/kernel/crash_core.c +++ b/arch/arm64/kernel/crash_core.c @@ -7,6 +7,13 @@ #include <linux/crash_core.h> #include <asm/memory.h> +static inline u64 get_tcr_el1_t1sz(void); + +static inline u64 get_tcr_el1_t1sz(void) +{ + return (read_sysreg(tcr_el1) & TCR_T1SZ_MASK) >> TCR_T1SZ_OFFSET; +} + void arch_crash_save_vmcoreinfo(void) { VMCOREINFO_NUMBER(VA_BITS); @@ -15,5 +22,7 @@ void arch_crash_save_vmcoreinfo(void) kimage_voffset); vmcoreinfo_append_str("NUMBER(PHYS_OFFSET)=0x%llx\n", PHYS_OFFSET); + vmcoreinfo_append_str("NUMBER(tcr_el1_t1sz)=0x%llx\n", + get_tcr_el1_t1sz()); vmcoreinfo_append_str("KERNELOFFSET=%lx\n", kaslr_offset()); }
vabits_actual variable on arm64 indicates the actual VA space size, and allows a single binary to support both 48-bit and 52-bit VA spaces. If the ARMv8.2-LVA optional feature is present, and we are running with a 64KB page size; then it is possible to use 52-bits of address space for both userspace and kernel addresses. However, any kernel binary that supports 52-bit must also be able to fall back to 48-bit at early boot time if the hardware feature is not present. Since TCR_EL1.T1SZ indicates the size offset of the memory region addressed by TTBR1_EL1 (and hence can be used for determining the vabits_actual value) it makes more sense to export the same in vmcoreinfo rather than vabits_actual variable, as the name of the variable can change in future kernel versions, but the architectural constructs like TCR_EL1.T1SZ can be used better to indicate intended specific fields to user-space. User-space utilities like makedumpfile and crash-utility, need to read/write this value from/to vmcoreinfo for determining if a virtual address lies in the linear map range. The user-space computation for determining whether an address lies in the linear map range is the same as we have in kernel-space: #define __is_lm_address(addr) (!(((u64)addr) & BIT(vabits_actual - 1))) I have sent out user-space patches for makedumpfile and crash-utility to add features for obtaining vabits_actual value from TCR_EL1.T1SZ (see [0] and [1]). Akashi reported that he was able to use this patchset and the user-space changes to get user-space working fine with the 52-bit kernel VA changes (see [2]). [0]. http://lists.infradead.org/pipermail/kexec/2019-November/023966.html [1]. http://lists.infradead.org/pipermail/kexec/2019-November/024006.html [2]. http://lists.infradead.org/pipermail/kexec/2019-November/023992.html Cc: James Morse <james.morse@arm.com> Cc: Mark Rutland <mark.rutland@arm.com> Cc: Will Deacon <will@kernel.org> Cc: Steve Capper <steve.capper@arm.com> Cc: Catalin Marinas <catalin.marinas@arm.com> Cc: Ard Biesheuvel <ard.biesheuvel@linaro.org> Cc: Dave Anderson <anderson@redhat.com> Cc: Kazuhito Hagio <k-hagio@ab.jp.nec.com> Cc: linux-arm-kernel@lists.infradead.org Cc: linux-kernel@vger.kernel.org Cc: kexec@lists.infradead.org Signed-off-by: Bhupesh Sharma <bhsharma@redhat.com> --- arch/arm64/include/asm/pgtable-hwdef.h | 1 + arch/arm64/kernel/crash_core.c | 9 +++++++++ 2 files changed, 10 insertions(+)