Message ID | 5CF0F2440200007800233D84@prv1-mh.provo.novell.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | : x86/traps: improve show_trace()'s top-of-stack handling | expand |
On 31/05/2019 10:22, Jan Beulich wrote: > Despite -fno-omit-frame-pointer the compiler may omit the frame pointer, > often for relatively simple leaf functions. (To give a specific example, > the case I've run into this with is _pci_hide_device() and gcc 8. > Interestingly the even more simple neighboring iommu_has_feature() does > get a frame pointer set up, around just a single instruction. But this > may be a result of the size-of-asm() effects discussed elsewhere.) > > Log the top-of-stack value if it looks valid _or_ if RIP looks invalid. > > Also annotate non-frame-pointer-based stack trace entries with a > question mark, to signal clearly that any one of them may not actually > be part of the call stack. I very specifically didn't do that before, because it give the false impression that when a question mark isn't present, the logging line is accurate. This is not true for %rbp corruption, asm blocks which don't respect the frame pointer ABI (arguably also corruption), any fault raised from within an EFI call. Porting Xen to use objtool would be a definite improvement, but cannot guard against unexpected %rbp corruption and the EFI case. ~Andrew
>>> On 07.06.19 at 20:01, <andrew.cooper3@citrix.com> wrote: > On 31/05/2019 10:22, Jan Beulich wrote: >> Despite -fno-omit-frame-pointer the compiler may omit the frame pointer, >> often for relatively simple leaf functions. (To give a specific example, >> the case I've run into this with is _pci_hide_device() and gcc 8. >> Interestingly the even more simple neighboring iommu_has_feature() does >> get a frame pointer set up, around just a single instruction. But this >> may be a result of the size-of-asm() effects discussed elsewhere.) >> >> Log the top-of-stack value if it looks valid _or_ if RIP looks invalid. >> >> Also annotate non-frame-pointer-based stack trace entries with a >> question mark, to signal clearly that any one of them may not actually >> be part of the call stack. > > I very specifically didn't do that before, because it give the false > impression that when a question mark isn't present, the logging line is > accurate. > > This is not true for %rbp corruption, asm blocks which don't respect the > frame pointer ABI (arguably also corruption), any fault raised from > within an EFI call. So what do you suggest instead? Somehow we should mark slots that are more guesses than actually derived. > Porting Xen to use objtool would be a definite improvement, but cannot > guard against unexpected %rbp corruption and the EFI case. I'm not sure about "definite", but I think I see your point. Personally I continue to believe that programmer (assembly code) and compiler (C code) attached unwind annotations are the better model. Jan
--- a/xen/arch/x86/traps.c +++ b/xen/arch/x86/traps.c @@ -431,7 +431,7 @@ static void _show_trace(unsigned long sp { addr = *stack++; if ( is_active_kernel_text(addr) ) - printk(" [<%p>] %pS\n", _p(addr), _p(addr)); + printk(" [<%p>] ? %pS\n", _p(addr), _p(addr)); } } @@ -502,21 +502,25 @@ static void show_trace(const struct cpu_ if ( is_active_kernel_text(regs->rip) || !is_active_kernel_text(tos) ) printk(" [<%p>] %pS\n", _p(regs->rip), _p(regs->rip)); + + if ( !sp ) + return; + /* - * Else RIP looks bad but the top of the stack looks good. Perhaps we - * followed a wild function pointer? Lets assume the top of the stack is a + * If RIP looks bad or the top of the stack looks good, log the top of + * stack as well. Perhaps we followed a wild function pointer, or we're + * in a function without frame pointer, or in a function prologue before + * the frame pointer gets set up? Lets assume the top of the stack is a * return address; print it and skip past so _show_trace() doesn't print * it again. */ - else if ( sp ) + if ( !is_active_kernel_text(regs->rip) || + is_active_kernel_text(tos) ) { - printk(" [<%p>] %pS\n", _p(tos), _p(tos)); + printk(" [<%p>] ? %pS\n", _p(tos), _p(tos)); sp++; } - if ( !sp ) - return; - _show_trace((unsigned long)sp, regs->rbp); printk("\n");
Despite -fno-omit-frame-pointer the compiler may omit the frame pointer, often for relatively simple leaf functions. (To give a specific example, the case I've run into this with is _pci_hide_device() and gcc 8. Interestingly the even more simple neighboring iommu_has_feature() does get a frame pointer set up, around just a single instruction. But this may be a result of the size-of-asm() effects discussed elsewhere.) Log the top-of-stack value if it looks valid _or_ if RIP looks invalid. Also annotate non-frame-pointer-based stack trace entries with a question mark, to signal clearly that any one of them may not actually be part of the call stack. Signed-off-by: Jan Beulich <jbeulich@suse.com>