Message ID | 20220117154433.3124-1-changbin.du@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | riscv: eliminate unreliable __builtin_frame_address(1) | expand |
On Jan 17 2022, Changbin Du wrote: > I tried different pieces of code which uses __builtin_frame_address(1) > (with both gcc version 7.5.0 and 10.3.0) to verify whether it works as > expected on riscv64. The result is negative. > > What the compiler had generated is as below: > 31 fp = (unsigned long)__builtin_frame_address(1); > 0xffffffff80006024 <+200>: ld s1,0(s0) > > It takes '0(s0)' as the address of frame 1 (caller), but the actual address > should be '-16(s0)'. > > | ... | <-+ > +-----------------+ | > | return address | | > | previous fp | | > | saved registers | | > | local variables | | > $fp --> | ... | | > +-----------------+ | > | return address | | > | previous fp --------+ > | saved registers | > $sp --> | local variables | > +-----------------+ > > This leads the kernel can not dump the full stack trace on riscv. > > [ 7.222126][ T1] Call Trace: > [ 7.222804][ T1] [<ffffffff80006058>] dump_backtrace+0x2c/0x3a > > This problem is not exposed on most riscv builds just because the '0(s0)' > occasionally is the address frame 2 (caller's caller), if only ra and fp > are stored in frame 1 (caller). > > | ... | <-+ > +-----------------+ | > | return address | | > $fp --> | previous fp | | > +-----------------+ | > | return address | | > | previous fp --------+ > | saved registers | > $sp --> | local variables | > +-----------------+ > > This could be a *bug* of gcc that should be fixed. Yes, it would be nice to get this fixed. The riscv target does not override DYNAMIC_CHAIN_ADDRESS, thus the default is used, which has the noted effect.
On 17 Jan 2022, at 15:44, Changbin Du <changbin.du@gmail.com> wrote: > > I tried different pieces of code which uses __builtin_frame_address(1) > (with both gcc version 7.5.0 and 10.3.0) to verify whether it works as > expected on riscv64. The result is negative. > > What the compiler had generated is as below: > 31 fp = (unsigned long)__builtin_frame_address(1); > 0xffffffff80006024 <+200>: ld s1,0(s0) > > It takes '0(s0)' as the address of frame 1 (caller), but the actual address > should be '-16(s0)'. > > | ... | <-+ > +-----------------+ | > | return address | | > | previous fp | | > | saved registers | | > | local variables | | > $fp --> | ... | | > +-----------------+ | > | return address | | > | previous fp --------+ > | saved registers | > $sp --> | local variables | > +-----------------+ > > This leads the kernel can not dump the full stack trace on riscv. > > [ 7.222126][ T1] Call Trace: > [ 7.222804][ T1] [<ffffffff80006058>] dump_backtrace+0x2c/0x3a > > This problem is not exposed on most riscv builds just because the '0(s0)' > occasionally is the address frame 2 (caller's caller), if only ra and fp > are stored in frame 1 (caller). > > | ... | <-+ > +-----------------+ | > | return address | | > $fp --> | previous fp | | > +-----------------+ | > | return address | | > | previous fp --------+ > | saved registers | > $sp --> | local variables | > +-----------------+ > > This could be a *bug* of gcc that should be fixed. But as noted in gcc > manual "Calling this function with a nonzero argument can have > unpredictable effects, including crashing the calling program.", let's > remove the '__builtin_frame_address(1)' in backtrace code. Yes, this is a bug, that is always wrong. LLVM gets this right. https://godbolt.org/z/MrhsoPPM6 Jess
On Jan 17 2022, Jessica Clarke wrote:
> Yes, this is a bug, that is always wrong. LLVM gets this right.
Is that an ABI requirement? In case of a leaf function, gcc saves the
caller's frame pointer in the first slot, not the second (it doesn't
save the return address).
On 19 Jan 2022, at 10:58, Andreas Schwab <schwab@linux-m68k.org> wrote: > > On Jan 17 2022, Jessica Clarke wrote: > >> Yes, this is a bug, that is always wrong. LLVM gets this right. > > Is that an ABI requirement? In case of a leaf function, gcc saves the > caller's frame pointer in the first slot, not the second (it doesn't > save the return address). Leaf functions by definition don’t have callees that are trying to read their frame pointer so aren’t relevant here. The stack frame layout isn’t specified by the ABI, only that the in-memory outgoing arguments be at the bottom when calling other functions. However, GCC knows what layout it uses, so it should be consistent and follow that layout for walking back up frames. Especially for __builtin_frame_address(1), that just pertains to the current function’s frame, which it clearly knows without a doubt, so there’s no reason to get that wrong. Accessing 0(s0) is just straight up wrong, that’s accessing past the top of the stack frame, which is never going to make any sense. Jess
On Jan 19 2022, Jessica Clarke wrote: > Leaf functions by definition don’t have callees that are trying to read > their frame pointer so aren’t relevant here. ??? __builtin_frame_address(1) is about the caller, not the callee.
On 19 Jan 2022, at 20:44, Andreas Schwab <schwab@linux-m68k.org> wrote: > > On Jan 19 2022, Jessica Clarke wrote: > >> Leaf functions by definition don’t have callees that are trying to read >> their frame pointer so aren’t relevant here. > > ??? __builtin_frame_address(1) is about the caller, not the callee. My point is that the only thing that can possibly read the incoming frame pointer of a leaf function is the leaf function itself, and since it knows where it’s putting it then there is no ABI issue, it just remembers where it put it and loads it from there. The issue of whether it’s part of the ABI or not only arises when you’re trying to read the incoming frame pointer from a caller, which by definition is not a leaf function. Jess
On Jan 19 2022, Jessica Clarke wrote: > My point is that the only thing that can possibly read the incoming > frame pointer of a leaf function is the leaf function itself, and since > it knows where it’s putting it then there is no ABI issue, it just > remembers where it put it and loads it from there. llvm sidesteps that issue by always saving ra when creating a frame, even in a leaf function, so it can use a constant offset.
On 19 Jan 2022, at 21:07, Andreas Schwab <schwab@linux-m68k.org> wrote: > > On Jan 19 2022, Jessica Clarke wrote: > >> My point is that the only thing that can possibly read the incoming >> frame pointer of a leaf function is the leaf function itself, and since >> it knows where it’s putting it then there is no ABI issue, it just >> remembers where it put it and loads it from there. > > llvm sidesteps that issue by always saving ra when creating a frame, > even in a leaf function, so it can use a constant offset. What’s your point? That’s a correct implementation, just a simple one. If it wanted to RISCVFrameLowering::spillCalleeSavedRegisters could easily save that information, or recompute it when trying to load s0, it just doesn’t because there’s no need. Also saving s0 alongside ra is deliberate, it aids debugging when calling noreturn functions (e.g. panic in an OS kernel). So yes, we avoid complexity in LLVM by not doing things we don’t need to; there’s nothing wrong with that and it doesn’t mean other toolchains that do need that to be correct should just do something wrong. Jess
On Jan 19 2022, Jessica Clarke wrote: > What’s your point? LLVM doesn't have to deal with the extra complexity. > doesn’t mean other toolchains that do need that to be correct should > just do something wrong. __builtin_frame_address with count > 0 is considered bad. Nobody should use it. You don't have to be arrogant.
On Wed, 19 Jan 2022 15:53:07 PST (-0800), schwab@linux-m68k.org wrote: > On Jan 19 2022, Jessica Clarke wrote: > >> What’s your point? > > LLVM doesn't have to deal with the extra complexity. > >> doesn’t mean other toolchains that do need that to be correct should >> just do something wrong. > > __builtin_frame_address with count > 0 is considered bad. Nobody should > use it. The documentation is very clear about this. I don't really see anything to argue about here: our code violates the spec and is producing results we don't like, though the spec allows for much worse. We shouldn't have had that code in the first place, but it slipped through as these things sometimes do. This is just a regular old bug that deserves to be fixed. Just because one compiler produces answers we like doesn't mean it's valid code, that's the whole point of having a spec in the first place. > You don't have to be arrogant. This has been a persistent problem, it's really just not productive. We're still trying to dig out from the last two rounds of silliness, let's not have another one. I don't see anything wrong with the patch in question, but these "stack trace without debug info" things are always tricky and thus warrant a proper look. I'm in the middle of juggling some patches right now, I'll try to take a look but it's fairly far down the queue. Always happy to have help looking these things over, let's try to keep things constructive, though. We've already got enough work to do.
On Mon, 17 Jan 2022 07:44:33 PST (-0800), changbin.du@gmail.com wrote: > I tried different pieces of code which uses __builtin_frame_address(1) > (with both gcc version 7.5.0 and 10.3.0) to verify whether it works as > expected on riscv64. The result is negative. > > What the compiler had generated is as below: > 31 fp = (unsigned long)__builtin_frame_address(1); > 0xffffffff80006024 <+200>: ld s1,0(s0) > > It takes '0(s0)' as the address of frame 1 (caller), but the actual address > should be '-16(s0)'. > > | ... | <-+ > +-----------------+ | > | return address | | > | previous fp | | > | saved registers | | > | local variables | | > $fp --> | ... | | > +-----------------+ | > | return address | | > | previous fp --------+ > | saved registers | > $sp --> | local variables | > +-----------------+ > > This leads the kernel can not dump the full stack trace on riscv. > > [ 7.222126][ T1] Call Trace: > [ 7.222804][ T1] [<ffffffff80006058>] dump_backtrace+0x2c/0x3a > > This problem is not exposed on most riscv builds just because the '0(s0)' > occasionally is the address frame 2 (caller's caller), if only ra and fp > are stored in frame 1 (caller). > > | ... | <-+ > +-----------------+ | > | return address | | > $fp --> | previous fp | | > +-----------------+ | > | return address | | > | previous fp --------+ > | saved registers | > $sp --> | local variables | > +-----------------+ > > This could be a *bug* of gcc that should be fixed. But as noted in gcc > manual "Calling this function with a nonzero argument can have > unpredictable effects, including crashing the calling program.", let's > remove the '__builtin_frame_address(1)' in backtrace code. > > With this fix now it can show full stack trace: > [ 10.444838][ T1] Call Trace: > [ 10.446199][ T1] [<ffffffff8000606c>] dump_backtrace+0x2c/0x3a > [ 10.447711][ T1] [<ffffffff800060ac>] show_stack+0x32/0x3e > [ 10.448710][ T1] [<ffffffff80a005c0>] dump_stack_lvl+0x58/0x7a > [ 10.449941][ T1] [<ffffffff80a005f6>] dump_stack+0x14/0x1c > [ 10.450929][ T1] [<ffffffff804c04ee>] ubsan_epilogue+0x10/0x5a > [ 10.451869][ T1] [<ffffffff804c092e>] __ubsan_handle_load_invalid_value+0x6c/0x78 > [ 10.453049][ T1] [<ffffffff8018f834>] __pagevec_release+0x62/0x64 > [ 10.455476][ T1] [<ffffffff80190830>] truncate_inode_pages_range+0x132/0x5be > [ 10.456798][ T1] [<ffffffff80190ce0>] truncate_inode_pages+0x24/0x30 > [ 10.457853][ T1] [<ffffffff8045bb04>] kill_bdev+0x32/0x3c > ... > > Signed-off-by: Changbin Du <changbin.du@gmail.com> > --- > arch/riscv/kernel/stacktrace.c | 9 +++++---- > 1 file changed, 5 insertions(+), 4 deletions(-) > > diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c > index 201ee206fb57..14d2b53ec322 100644 > --- a/arch/riscv/kernel/stacktrace.c > +++ b/arch/riscv/kernel/stacktrace.c > @@ -22,15 +22,16 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, > bool (*fn)(void *, unsigned long), void *arg) > { > unsigned long fp, sp, pc; > + int level = 0; > > if (regs) { > fp = frame_pointer(regs); > sp = user_stack_pointer(regs); > pc = instruction_pointer(regs); > } else if (task == NULL || task == current) { > - fp = (unsigned long)__builtin_frame_address(1); > - sp = (unsigned long)__builtin_frame_address(0); > - pc = (unsigned long)__builtin_return_address(0); > + fp = (unsigned long)__builtin_frame_address(0); > + sp = sp_in_global; > + pc = (unsigned long)walk_stackframe; > } else { > /* task blocked in __switch_to */ > fp = task->thread.s[0]; > @@ -42,7 +43,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, > unsigned long low, high; > struct stackframe *frame; > > - if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc))) > + if (unlikely(!__kernel_text_address(pc) || (level++ >= 1 && !fn(arg, pc)))) > break; > > /* Validate frame pointer */ Thanks, this is on fixes.
diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c index 201ee206fb57..14d2b53ec322 100644 --- a/arch/riscv/kernel/stacktrace.c +++ b/arch/riscv/kernel/stacktrace.c @@ -22,15 +22,16 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, bool (*fn)(void *, unsigned long), void *arg) { unsigned long fp, sp, pc; + int level = 0; if (regs) { fp = frame_pointer(regs); sp = user_stack_pointer(regs); pc = instruction_pointer(regs); } else if (task == NULL || task == current) { - fp = (unsigned long)__builtin_frame_address(1); - sp = (unsigned long)__builtin_frame_address(0); - pc = (unsigned long)__builtin_return_address(0); + fp = (unsigned long)__builtin_frame_address(0); + sp = sp_in_global; + pc = (unsigned long)walk_stackframe; } else { /* task blocked in __switch_to */ fp = task->thread.s[0]; @@ -42,7 +43,7 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs, unsigned long low, high; struct stackframe *frame; - if (unlikely(!__kernel_text_address(pc) || !fn(arg, pc))) + if (unlikely(!__kernel_text_address(pc) || (level++ >= 1 && !fn(arg, pc)))) break; /* Validate frame pointer */
I tried different pieces of code which uses __builtin_frame_address(1) (with both gcc version 7.5.0 and 10.3.0) to verify whether it works as expected on riscv64. The result is negative. What the compiler had generated is as below: 31 fp = (unsigned long)__builtin_frame_address(1); 0xffffffff80006024 <+200>: ld s1,0(s0) It takes '0(s0)' as the address of frame 1 (caller), but the actual address should be '-16(s0)'. | ... | <-+ +-----------------+ | | return address | | | previous fp | | | saved registers | | | local variables | | $fp --> | ... | | +-----------------+ | | return address | | | previous fp --------+ | saved registers | $sp --> | local variables | +-----------------+ This leads the kernel can not dump the full stack trace on riscv. [ 7.222126][ T1] Call Trace: [ 7.222804][ T1] [<ffffffff80006058>] dump_backtrace+0x2c/0x3a This problem is not exposed on most riscv builds just because the '0(s0)' occasionally is the address frame 2 (caller's caller), if only ra and fp are stored in frame 1 (caller). | ... | <-+ +-----------------+ | | return address | | $fp --> | previous fp | | +-----------------+ | | return address | | | previous fp --------+ | saved registers | $sp --> | local variables | +-----------------+ This could be a *bug* of gcc that should be fixed. But as noted in gcc manual "Calling this function with a nonzero argument can have unpredictable effects, including crashing the calling program.", let's remove the '__builtin_frame_address(1)' in backtrace code. With this fix now it can show full stack trace: [ 10.444838][ T1] Call Trace: [ 10.446199][ T1] [<ffffffff8000606c>] dump_backtrace+0x2c/0x3a [ 10.447711][ T1] [<ffffffff800060ac>] show_stack+0x32/0x3e [ 10.448710][ T1] [<ffffffff80a005c0>] dump_stack_lvl+0x58/0x7a [ 10.449941][ T1] [<ffffffff80a005f6>] dump_stack+0x14/0x1c [ 10.450929][ T1] [<ffffffff804c04ee>] ubsan_epilogue+0x10/0x5a [ 10.451869][ T1] [<ffffffff804c092e>] __ubsan_handle_load_invalid_value+0x6c/0x78 [ 10.453049][ T1] [<ffffffff8018f834>] __pagevec_release+0x62/0x64 [ 10.455476][ T1] [<ffffffff80190830>] truncate_inode_pages_range+0x132/0x5be [ 10.456798][ T1] [<ffffffff80190ce0>] truncate_inode_pages+0x24/0x30 [ 10.457853][ T1] [<ffffffff8045bb04>] kill_bdev+0x32/0x3c ... Signed-off-by: Changbin Du <changbin.du@gmail.com> --- arch/riscv/kernel/stacktrace.c | 9 +++++---- 1 file changed, 5 insertions(+), 4 deletions(-)