Message ID | 20240312022030.320789-1-samuel.holland@sifive.com (mailing list archive) |
---|---|
State | Accepted |
Commit | d080a08b06b6266cc3e0e86c5acfd80db937cb6b |
Headers | show |
Series | riscv: Fix spurious errors from __get/put_kernel_nofault | expand |
On Mon, Mar 11, 2024 at 07:19:13PM -0700, Samuel Holland wrote: > These macros did not initialize __kr_err, so they could fail even if > the access did not fault. > > Cc: stable@vger.kernel.org > Fixes: d464118cdc41 ("riscv: implement __get_kernel_nofault and __put_user_nofault") > Signed-off-by: Samuel Holland <samuel.holland@sifive.com> > --- > Found while testing the unaligned access speed series[1]. The observed > behavior was that with RISCV_EFFICIENT_UNALIGNED_ACCESS=y, the > copy_from_kernel_nofault() in prepend_copy() failed every time when > filling out /proc/self/mounts, so all of the mount points were "xxx". > > I'm surprised this hasn't been seen before. For reference, I'm compiling > with clang 18. > > [1]: https://lore.kernel.org/linux-riscv/20240308-disable_misaligned_probe_config-v9-0-a388770ba0ce@rivosinc.com/ > > arch/riscv/include/asm/uaccess.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h > index ec0cab9fbddd..72ec1d9bd3f3 100644 > --- a/arch/riscv/include/asm/uaccess.h > +++ b/arch/riscv/include/asm/uaccess.h > @@ -319,7 +319,7 @@ unsigned long __must_check clear_user(void __user *to, unsigned long n) > > #define __get_kernel_nofault(dst, src, type, err_label) \ > do { \ > - long __kr_err; \ > + long __kr_err = 0; \ > \ > __get_user_nocheck(*((type *)(dst)), (type *)(src), __kr_err); \ > if (unlikely(__kr_err)) \ > @@ -328,7 +328,7 @@ do { \ > > #define __put_kernel_nofault(dst, src, type, err_label) \ > do { \ > - long __kr_err; \ > + long __kr_err = 0; \ > \ > __put_user_nocheck(*((type *)(src)), (type *)(dst), __kr_err); \ > if (unlikely(__kr_err)) \ > -- > 2.43.1 > > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv I am not able to reproduce this using Clang 18 with RISCV_EFFICIENT_UNALIGNED_ACCESS=y on 6.8. However I can see how this could be an issue. Going down the rabbit hold of macros here, I end up at arch/riscv/include/asm/asm-extable.h where the register that hold 'err' is written into the __ex_table section: #define EX_DATA_REG(reg, gpr) \ "((.L__gpr_num_" #gpr ") << " __stringify(EX_DATA_REG_##reg##_SHIFT) ")" #define _ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, err, zero) \ __DEFINE_ASM_GPR_NUMS \ __ASM_EXTABLE_RAW(#insn, #fixup, \ __stringify(EX_TYPE_UACCESS_ERR_ZERO), \ "(" \ EX_DATA_REG(ERR, err) " | " \ EX_DATA_REG(ZERO, zero) \ ")") I am wondering if setting this value to zero solves the problem by hiding another issue. It seems like this shouldn't need to be initialized to zero, however I am lost as to how this extable setup works so perhaps this is the proper solution. - Charlie
On 2024-03-12 9:53 PM, Charlie Jenkins wrote: > On Mon, Mar 11, 2024 at 07:19:13PM -0700, Samuel Holland wrote: >> These macros did not initialize __kr_err, so they could fail even if >> the access did not fault. >> >> Cc: stable@vger.kernel.org >> Fixes: d464118cdc41 ("riscv: implement __get_kernel_nofault and __put_user_nofault") >> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> >> --- >> Found while testing the unaligned access speed series[1]. The observed >> behavior was that with RISCV_EFFICIENT_UNALIGNED_ACCESS=y, the >> copy_from_kernel_nofault() in prepend_copy() failed every time when >> filling out /proc/self/mounts, so all of the mount points were "xxx". >> >> I'm surprised this hasn't been seen before. For reference, I'm compiling >> with clang 18. >> >> [1]: https://lore.kernel.org/linux-riscv/20240308-disable_misaligned_probe_config-v9-0-a388770ba0ce@rivosinc.com/ >> >> arch/riscv/include/asm/uaccess.h | 4 ++-- >> 1 file changed, 2 insertions(+), 2 deletions(-) >> >> diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h >> index ec0cab9fbddd..72ec1d9bd3f3 100644 >> --- a/arch/riscv/include/asm/uaccess.h >> +++ b/arch/riscv/include/asm/uaccess.h >> @@ -319,7 +319,7 @@ unsigned long __must_check clear_user(void __user *to, unsigned long n) >> >> #define __get_kernel_nofault(dst, src, type, err_label) \ >> do { \ >> - long __kr_err; \ >> + long __kr_err = 0; \ >> \ >> __get_user_nocheck(*((type *)(dst)), (type *)(src), __kr_err); \ >> if (unlikely(__kr_err)) \ >> @@ -328,7 +328,7 @@ do { \ >> >> #define __put_kernel_nofault(dst, src, type, err_label) \ >> do { \ >> - long __kr_err; \ >> + long __kr_err = 0; \ >> \ >> __put_user_nocheck(*((type *)(src)), (type *)(dst), __kr_err); \ >> if (unlikely(__kr_err)) \ >> -- >> 2.43.1 >> >> >> _______________________________________________ >> linux-riscv mailing list >> linux-riscv@lists.infradead.org >> http://lists.infradead.org/mailman/listinfo/linux-riscv > > I am not able to reproduce this using Clang 18 with > RISCV_EFFICIENT_UNALIGNED_ACCESS=y on 6.8. However I can see how this > could be an issue. > > Going down the rabbit hold of macros here, I end up at > arch/riscv/include/asm/asm-extable.h where the register that hold 'err' > is written into the __ex_table section: > > #define EX_DATA_REG(reg, gpr) \ > "((.L__gpr_num_" #gpr ") << " __stringify(EX_DATA_REG_##reg##_SHIFT) ")" > > #define _ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, err, zero) \ > __DEFINE_ASM_GPR_NUMS \ > __ASM_EXTABLE_RAW(#insn, #fixup, \ > __stringify(EX_TYPE_UACCESS_ERR_ZERO), \ > "(" \ > EX_DATA_REG(ERR, err) " | " \ > EX_DATA_REG(ZERO, zero) \ > ")") > > I am wondering if setting this value to zero solves the problem by > hiding another issue. It seems like this shouldn't need to be > initialized to zero, however I am lost as to how this extable setup > works so perhaps this is the proper solution. extable works by running the handler (selected by EX_TYPE_*) if some exception occurs while executing that instruction -- see the calls to fixup_exception() in fault.c and traps.c. If there is no exception, then the handler does not run, and the err register is not written by ex_handler_uaccess_err_zero(). If you look at __get_user_asm(), you can see that the err register is not touched by the assembly code at all -- the only reference to %0 is in the extable entry. So if the macro that declares the error variable doesn't initialize it, nothing will. Compare __get_user() and __put_user() which do initialize their error variable. Regards, Samuel
On Tue, Mar 12, 2024 at 10:05:37PM -0500, Samuel Holland wrote: > On 2024-03-12 9:53 PM, Charlie Jenkins wrote: > > On Mon, Mar 11, 2024 at 07:19:13PM -0700, Samuel Holland wrote: > >> These macros did not initialize __kr_err, so they could fail even if > >> the access did not fault. > >> > >> Cc: stable@vger.kernel.org > >> Fixes: d464118cdc41 ("riscv: implement __get_kernel_nofault and __put_user_nofault") > >> Signed-off-by: Samuel Holland <samuel.holland@sifive.com> > >> --- > >> Found while testing the unaligned access speed series[1]. The observed > >> behavior was that with RISCV_EFFICIENT_UNALIGNED_ACCESS=y, the > >> copy_from_kernel_nofault() in prepend_copy() failed every time when > >> filling out /proc/self/mounts, so all of the mount points were "xxx". > >> > >> I'm surprised this hasn't been seen before. For reference, I'm compiling > >> with clang 18. > >> > >> [1]: https://lore.kernel.org/linux-riscv/20240308-disable_misaligned_probe_config-v9-0-a388770ba0ce@rivosinc.com/ > >> > >> arch/riscv/include/asm/uaccess.h | 4 ++-- > >> 1 file changed, 2 insertions(+), 2 deletions(-) > >> > >> diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h > >> index ec0cab9fbddd..72ec1d9bd3f3 100644 > >> --- a/arch/riscv/include/asm/uaccess.h > >> +++ b/arch/riscv/include/asm/uaccess.h > >> @@ -319,7 +319,7 @@ unsigned long __must_check clear_user(void __user *to, unsigned long n) > >> > >> #define __get_kernel_nofault(dst, src, type, err_label) \ > >> do { \ > >> - long __kr_err; \ > >> + long __kr_err = 0; \ > >> \ > >> __get_user_nocheck(*((type *)(dst)), (type *)(src), __kr_err); \ > >> if (unlikely(__kr_err)) \ > >> @@ -328,7 +328,7 @@ do { \ > >> > >> #define __put_kernel_nofault(dst, src, type, err_label) \ > >> do { \ > >> - long __kr_err; \ > >> + long __kr_err = 0; \ > >> \ > >> __put_user_nocheck(*((type *)(src)), (type *)(dst), __kr_err); \ > >> if (unlikely(__kr_err)) \ > >> -- > >> 2.43.1 > >> > >> > >> _______________________________________________ > >> linux-riscv mailing list > >> linux-riscv@lists.infradead.org > >> http://lists.infradead.org/mailman/listinfo/linux-riscv > > > > I am not able to reproduce this using Clang 18 with > > RISCV_EFFICIENT_UNALIGNED_ACCESS=y on 6.8. However I can see how this > > could be an issue. > > > > Going down the rabbit hold of macros here, I end up at > > arch/riscv/include/asm/asm-extable.h where the register that hold 'err' > > is written into the __ex_table section: > > > > #define EX_DATA_REG(reg, gpr) \ > > "((.L__gpr_num_" #gpr ") << " __stringify(EX_DATA_REG_##reg##_SHIFT) ")" > > > > #define _ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, err, zero) \ > > __DEFINE_ASM_GPR_NUMS \ > > __ASM_EXTABLE_RAW(#insn, #fixup, \ > > __stringify(EX_TYPE_UACCESS_ERR_ZERO), \ > > "(" \ > > EX_DATA_REG(ERR, err) " | " \ > > EX_DATA_REG(ZERO, zero) \ > > ")") > > > > I am wondering if setting this value to zero solves the problem by > > hiding another issue. It seems like this shouldn't need to be > > initialized to zero, however I am lost as to how this extable setup > > works so perhaps this is the proper solution. > > extable works by running the handler (selected by EX_TYPE_*) if some exception > occurs while executing that instruction -- see the calls to fixup_exception() in > fault.c and traps.c. If there is no exception, then the handler does not run, > and the err register is not written by ex_handler_uaccess_err_zero(). Hmm okay I understand thank you for explaining that. It's interesting to me that in __get_user_asm 'err' is set as a read/write variable even though __get_user_asm doesn't write to it. However, it seems like changing it to a write-only variable the compiler incorrectly optimizes err, and the kernel fails to boot. > > If you look at __get_user_asm(), you can see that the err register is not > touched by the assembly code at all -- the only reference to %0 is in the > extable entry. So if the macro that declares the error variable doesn't > initialize it, nothing will. Reviewed-by: Charlie Jenkins <charlie@rivosinc.com> > > Compare __get_user() and __put_user() which do initialize their error variable. > > Regards, > Samuel >
Hi Samuel, On 12/03/2024 03:19, Samuel Holland wrote: > These macros did not initialize __kr_err, so they could fail even if > the access did not fault. > > Cc: stable@vger.kernel.org > Fixes: d464118cdc41 ("riscv: implement __get_kernel_nofault and __put_user_nofault") > Signed-off-by: Samuel Holland <samuel.holland@sifive.com> > --- > Found while testing the unaligned access speed series[1]. The observed > behavior was that with RISCV_EFFICIENT_UNALIGNED_ACCESS=y, the > copy_from_kernel_nofault() in prepend_copy() failed every time when > filling out /proc/self/mounts, so all of the mount points were "xxx". > > I'm surprised this hasn't been seen before. For reference, I'm compiling > with clang 18. > > [1]: https://lore.kernel.org/linux-riscv/20240308-disable_misaligned_probe_config-v9-0-a388770ba0ce@rivosinc.com/ > > arch/riscv/include/asm/uaccess.h | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h > index ec0cab9fbddd..72ec1d9bd3f3 100644 > --- a/arch/riscv/include/asm/uaccess.h > +++ b/arch/riscv/include/asm/uaccess.h > @@ -319,7 +319,7 @@ unsigned long __must_check clear_user(void __user *to, unsigned long n) > > #define __get_kernel_nofault(dst, src, type, err_label) \ > do { \ > - long __kr_err; \ > + long __kr_err = 0; \ > \ > __get_user_nocheck(*((type *)(dst)), (type *)(src), __kr_err); \ > if (unlikely(__kr_err)) \ > @@ -328,7 +328,7 @@ do { \ > > #define __put_kernel_nofault(dst, src, type, err_label) \ > do { \ > - long __kr_err; \ > + long __kr_err = 0; \ > \ > __put_user_nocheck(*((type *)(src)), (type *)(dst), __kr_err); \ > if (unlikely(__kr_err)) \ That's a good a catch! Surprising indeed that we did not find this before, that's the second bug in the uaccess routines in the last 3 (or 4) releases, maybe we should improve our coverage here, I'll see what we can do. You can add: Reviewed-by: Alexandre Ghiti <alexghiti@rivosinc.com> Thanks, Alex
Hello: This patch was applied to riscv/linux.git (fixes) by Palmer Dabbelt <palmer@rivosinc.com>: On Mon, 11 Mar 2024 19:19:13 -0700 you wrote: > These macros did not initialize __kr_err, so they could fail even if > the access did not fault. > > Cc: stable@vger.kernel.org > Fixes: d464118cdc41 ("riscv: implement __get_kernel_nofault and __put_user_nofault") > Signed-off-by: Samuel Holland <samuel.holland@sifive.com> > > [...] Here is the summary with links: - riscv: Fix spurious errors from __get/put_kernel_nofault https://git.kernel.org/riscv/c/d080a08b06b6 You are awesome, thank you!
diff --git a/arch/riscv/include/asm/uaccess.h b/arch/riscv/include/asm/uaccess.h index ec0cab9fbddd..72ec1d9bd3f3 100644 --- a/arch/riscv/include/asm/uaccess.h +++ b/arch/riscv/include/asm/uaccess.h @@ -319,7 +319,7 @@ unsigned long __must_check clear_user(void __user *to, unsigned long n) #define __get_kernel_nofault(dst, src, type, err_label) \ do { \ - long __kr_err; \ + long __kr_err = 0; \ \ __get_user_nocheck(*((type *)(dst)), (type *)(src), __kr_err); \ if (unlikely(__kr_err)) \ @@ -328,7 +328,7 @@ do { \ #define __put_kernel_nofault(dst, src, type, err_label) \ do { \ - long __kr_err; \ + long __kr_err = 0; \ \ __put_user_nocheck(*((type *)(src)), (type *)(dst), __kr_err); \ if (unlikely(__kr_err)) \
These macros did not initialize __kr_err, so they could fail even if the access did not fault. Cc: stable@vger.kernel.org Fixes: d464118cdc41 ("riscv: implement __get_kernel_nofault and __put_user_nofault") Signed-off-by: Samuel Holland <samuel.holland@sifive.com> --- Found while testing the unaligned access speed series[1]. The observed behavior was that with RISCV_EFFICIENT_UNALIGNED_ACCESS=y, the copy_from_kernel_nofault() in prepend_copy() failed every time when filling out /proc/self/mounts, so all of the mount points were "xxx". I'm surprised this hasn't been seen before. For reference, I'm compiling with clang 18. [1]: https://lore.kernel.org/linux-riscv/20240308-disable_misaligned_probe_config-v9-0-a388770ba0ce@rivosinc.com/ arch/riscv/include/asm/uaccess.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)