From patchwork Thu Apr 25 09:45:14 2019 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Thomas Gleixner X-Patchwork-Id: 10917055 X-Patchwork-Delegate: snitzer@redhat.com Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C51951515 for ; Thu, 25 Apr 2019 13:29:59 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id B23A128994 for ; Thu, 25 Apr 2019 13:29:59 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id A50612899D; Thu, 25 Apr 2019 13:29:59 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.9 required=2.0 tests=BAYES_00,MAILING_LIST_MULTI, RCVD_IN_DNSWL_HI autolearn=unavailable version=3.3.1 Received: from mx1.redhat.com (mx1.redhat.com [209.132.183.28]) (using TLSv1.2 with cipher DHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 370D728994 for ; Thu, 25 Apr 2019 13:29:59 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx07.intmail.prod.int.phx2.redhat.com [10.5.11.22]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 7AE9B3076C6B; Thu, 25 Apr 2019 13:29:58 +0000 (UTC) Received: from colo-mx.corp.redhat.com (colo-mx01.intmail.prod.int.phx2.redhat.com [10.5.11.20]) by smtp.corp.redhat.com (Postfix) with ESMTPS id 5A2F91001DC0; Thu, 25 Apr 2019 13:29:58 +0000 (UTC) Received: from lists01.pubmisc.prod.ext.phx2.redhat.com (lists01.pubmisc.prod.ext.phx2.redhat.com [10.5.19.33]) by colo-mx.corp.redhat.com (Postfix) with ESMTP id 27F54181B9EE; Thu, 25 Apr 2019 13:29:58 +0000 (UTC) Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) by lists01.pubmisc.prod.ext.phx2.redhat.com (8.13.8/8.13.8) with ESMTP id x3PAXlEI009251 for ; Thu, 25 Apr 2019 06:33:47 -0400 Received: by smtp.corp.redhat.com (Postfix) id 1E66C5D9CD; Thu, 25 Apr 2019 10:33:47 +0000 (UTC) Delivered-To: dm-devel@redhat.com Received: from mx1.redhat.com (ext-mx20.extmail.prod.ext.phx2.redhat.com [10.5.110.49]) by smtp.corp.redhat.com (Postfix) with ESMTPS id D9B6E5D9C5; Thu, 25 Apr 2019 10:33:42 +0000 (UTC) Received: from Galois.linutronix.de (Galois.linutronix.de [146.0.238.70]) (using TLSv1.2 with cipher DHE-RSA-AES128-SHA (128/128 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 877EE313434B; Thu, 25 Apr 2019 10:33:41 +0000 (UTC) Received: from localhost ([127.0.0.1] helo=nanos.tec.linutronix.de) by Galois.linutronix.de with esmtp (Exim 4.80) (envelope-from ) id 1hJbAD-0001wo-Fc; Thu, 25 Apr 2019 11:59:33 +0200 Message-Id: <20190425094803.066064076@linutronix.de> User-Agent: quilt/0.65 Date: Thu, 25 Apr 2019 11:45:14 +0200 From: Thomas Gleixner To: LKML References: <20190425094453.875139013@linutronix.de> MIME-Version: 1.0 X-Greylist: IP, sender and recipient auto-whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Thu, 25 Apr 2019 10:33:42 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.49]); Thu, 25 Apr 2019 10:33:42 +0000 (UTC) for IP:'146.0.238.70' DOMAIN:'Galois.linutronix.de' HELO:'Galois.linutronix.de' FROM:'tglx@linutronix.de' RCPT:'' X-RedHat-Spam-Score: 0 () 146.0.238.70 Galois.linutronix.de 146.0.238.70 Galois.linutronix.de X-Scanned-By: MIMEDefang 2.84 on 10.5.110.49 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-loop: dm-devel@redhat.com X-Mailman-Approved-At: Thu, 25 Apr 2019 09:29:28 -0400 Cc: Mike Snitzer , David Airlie , Catalin Marinas , Joonas Lahtinen , dri-devel@lists.freedesktop.org, linux-mm@kvack.org, dm-devel@redhat.com, Alexander Potapenko , Christoph Lameter , Miroslav Benes , Christoph Hellwig , Alasdair Kergon , Marek Szyprowski , linux-arch@vger.kernel.org, x86@kernel.org, kasan-dev@googlegroups.com, Johannes Thumshirn , Andrey Ryabinin , Alexey Dobriyan , intel-gfx@lists.freedesktop.org, David Rientjes , Maarten Lankhorst , Akinobu Mita , Steven Rostedt , Josef Bacik , Rodrigo Vivi , Mike Rapoport , Jani Nikula , Andy Lutomirski , Josh Poimboeuf , David Sterba , Dmitry Vyukov , Tom Zanussi , Chris Mason , Pekka Enberg , iommu@lists.linux-foundation.org, Daniel Vetter , Andrew Morton , Robin Murphy , linux-btrfs@vger.kernel.org Subject: [dm-devel] [patch V3 21/29] tracing: Use percpu stack trace buffer more intelligently X-BeenThere: dm-devel@redhat.com X-Mailman-Version: 2.1.12 Precedence: junk List-Id: device-mapper development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: dm-devel-bounces@redhat.com Errors-To: dm-devel-bounces@redhat.com X-Scanned-By: MIMEDefang 2.84 on 10.5.11.22 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.48]); Thu, 25 Apr 2019 13:29:58 +0000 (UTC) X-Virus-Scanned: ClamAV using ClamSMTP The per cpu stack trace buffer usage pattern is odd at best. The buffer has place for 512 stack trace entries on 64-bit and 1024 on 32-bit. When interrupts or exceptions nest after the per cpu buffer was acquired the stacktrace length is hardcoded to 8 entries. 512/1024 stack trace entries in kernel stacks are unrealistic so the buffer is a complete waste. Split the buffer into 4 nest levels, which are 128/256 entries per level. This allows nesting contexts (interrupts, exceptions) to utilize the cpu buffer for stack retrieval and avoids the fixed length allocation along with the conditional execution pathes. Signed-off-by: Thomas Gleixner Cc: Steven Rostedt --- V3: Limit to 4 nest levels and increase size per level. --- kernel/trace/trace.c | 77 +++++++++++++++++++++++++-------------------------- 1 file changed, 39 insertions(+), 38 deletions(-) -- dm-devel mailing list dm-devel@redhat.com https://www.redhat.com/mailman/listinfo/dm-devel --- a/kernel/trace/trace.c +++ b/kernel/trace/trace.c @@ -2749,12 +2749,21 @@ trace_function(struct trace_array *tr, #ifdef CONFIG_STACKTRACE -#define FTRACE_STACK_MAX_ENTRIES (PAGE_SIZE / sizeof(unsigned long)) +/* Allow 4 levels of nesting: normal, softirq, irq, NMI */ +#define FTRACE_KSTACK_NESTING 4 + +#define FTRACE_KSTACK_ENTRIES (PAGE_SIZE / FTRACE_KSTACK_NESTING) + struct ftrace_stack { - unsigned long calls[FTRACE_STACK_MAX_ENTRIES]; + unsigned long calls[FTRACE_KSTACK_ENTRIES]; +}; + + +struct ftrace_stacks { + struct ftrace_stack stacks[FTRACE_KSTACK_NESTING]; }; -static DEFINE_PER_CPU(struct ftrace_stack, ftrace_stack); +static DEFINE_PER_CPU(struct ftrace_stacks, ftrace_stacks); static DEFINE_PER_CPU(int, ftrace_stack_reserve); static void __ftrace_trace_stack(struct ring_buffer *buffer, @@ -2763,10 +2772,11 @@ static void __ftrace_trace_stack(struct { struct trace_event_call *call = &event_kernel_stack; struct ring_buffer_event *event; + struct ftrace_stack *fstack; struct stack_entry *entry; struct stack_trace trace; - int use_stack; - int size = FTRACE_STACK_ENTRIES; + int size = FTRACE_KSTACK_ENTRIES; + int stackidx; trace.nr_entries = 0; trace.skip = skip; @@ -2788,29 +2798,32 @@ static void __ftrace_trace_stack(struct */ preempt_disable_notrace(); - use_stack = __this_cpu_inc_return(ftrace_stack_reserve); + stackidx = __this_cpu_inc_return(ftrace_stack_reserve); + + /* This should never happen. If it does, yell once and skip */ + if (WARN_ON_ONCE(stackidx >= FTRACE_KSTACK_NESTING)) + goto out; + /* - * We don't need any atomic variables, just a barrier. - * If an interrupt comes in, we don't care, because it would - * have exited and put the counter back to what we want. - * We just need a barrier to keep gcc from moving things - * around. + * The above __this_cpu_inc_return() is 'atomic' cpu local. An + * interrupt will either see the value pre increment or post + * increment. If the interrupt happens pre increment it will have + * restored the counter when it returns. We just need a barrier to + * keep gcc from moving things around. */ barrier(); - if (use_stack == 1) { - trace.entries = this_cpu_ptr(ftrace_stack.calls); - trace.max_entries = FTRACE_STACK_MAX_ENTRIES; - - if (regs) - save_stack_trace_regs(regs, &trace); - else - save_stack_trace(&trace); - - if (trace.nr_entries > size) - size = trace.nr_entries; - } else - /* From now on, use_stack is a boolean */ - use_stack = 0; + + fstack = this_cpu_ptr(ftrace_stacks.stacks) + (stackidx - 1); + trace.entries = fstack->calls; + trace.max_entries = FTRACE_KSTACK_ENTRIES; + + if (regs) + save_stack_trace_regs(regs, &trace); + else + save_stack_trace(&trace); + + if (trace.nr_entries > size) + size = trace.nr_entries; size *= sizeof(unsigned long); @@ -2820,19 +2833,7 @@ static void __ftrace_trace_stack(struct goto out; entry = ring_buffer_event_data(event); - memset(&entry->caller, 0, size); - - if (use_stack) - memcpy(&entry->caller, trace.entries, - trace.nr_entries * sizeof(unsigned long)); - else { - trace.max_entries = FTRACE_STACK_ENTRIES; - trace.entries = entry->caller; - if (regs) - save_stack_trace_regs(regs, &trace); - else - save_stack_trace(&trace); - } + memcpy(&entry->caller, trace.entries, size); entry->size = trace.nr_entries;