From patchwork Thu May 4 15:46:03 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Petr Mladek X-Patchwork-Id: 9712223 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 4AC1160362 for ; Thu, 4 May 2017 15:47:13 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3A3D62863C for ; Thu, 4 May 2017 15:47:13 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2C9D528652; Thu, 4 May 2017 15:47:13 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-1.9 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID autolearn=ham version=3.3.1 Received: from bombadil.infradead.org (bombadil.infradead.org [65.50.211.133]) (using TLSv1.2 with cipher AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.wl.linuxfoundation.org (Postfix) with ESMTPS id 963112863C for ; Thu, 4 May 2017 15:47:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20170209; h=Sender: Content-Transfer-Encoding:Content-Type:MIME-Version:Cc:List-Subscribe: List-Help:List-Post:List-Archive:List-Unsubscribe:List-Id:Message-Id:Date: Subject:To:From:Reply-To:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:In-Reply-To: References:List-Owner; bh=8fsKHh5Bi7ZagTPwSD8dwzhnQKxUFXqnC+09KSNF7Sk=; b=ZgS WIYzn7kGaldgzjOAnmS0qOB4+O8J/H9/XYhV2E4CSHFRShEsUTyVRUfN3kuIKx+B4L79jc/uBZcOE xwj5OBvtDhkK/CmIrUzIPmKPhDjjXJ5JFdaYqqPdRUzjg2vAESkPYXpeGBP5w55x7Ong4jNEoXBaO bEKEjWtNgwBZIHF5vPD8f2I4afZ5tcytN9sjNZqhS8/0KOhlFviRkeCVBwbG3m1I2gpntUq2GtINM ECU05tU2PZfDIjr0JKZmILPxNNQDoyFojl/oSc/2mYey7oLQCZZ5yIPxW0QTNB/zxZRwpWaGicYO9 xo8Yz3D3QMmBtl8g19iDcm9YGc5uJ3A==; Received: from localhost ([127.0.0.1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.87 #1 (Red Hat Linux)) id 1d6IyE-0006Wy-H4; Thu, 04 May 2017 15:47:10 +0000 Received: from mx2.suse.de ([195.135.220.15] helo=mx1.suse.de) by bombadil.infradead.org with esmtps (Exim 4.87 #1 (Red Hat Linux)) id 1d6IyA-0006Rw-6u for linux-arm-kernel@lists.infradead.org; Thu, 04 May 2017 15:47:08 +0000 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (charybdis-ext.suse.de [195.135.220.254]) by mx1.suse.de (Postfix) with ESMTP id 26C00ADA5; Thu, 4 May 2017 15:46:43 +0000 (UTC) From: Petr Mladek To: Sergey Senozhatsky , Steven Rostedt Subject: [PATCH v2] printk: Use the main logbuf in NMI when logbuf_lock is available Date: Thu, 4 May 2017 17:46:03 +0200 Message-Id: <1493912763-24873-1-git-send-email-pmladek@suse.com> X-Mailer: git-send-email 1.8.5.6 X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20170504_084706_572985_64CB171C X-CRM114-Status: GOOD ( 17.50 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.21 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: Petr Mladek , Daniel Thompson , Sergey Senozhatsky , Peter Zijlstra , x86@kernel.org, linux-kernel@vger.kernel.org, Chris Metcalf , Ingo Molnar , Russell King , Andrew Morton , Thomas Gleixner , linux-arm-kernel@lists.infradead.org MIME-Version: 1.0 Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+patchwork-linux-arm=patchwork.kernel.org@lists.infradead.org X-Virus-Scanned: ClamAV using ClamSMTP The commit 42a0bb3f71383b457a7d ("printk/nmi: generic solution for safe printk in NMI") caused that printk stores messages into a temporary buffer in NMI context. The buffer is per-CPU and therefore the size is rather limited. It works quite well for NMI backtraces. But there are longer logs that might get printed in NMI context, for example, lockdep warnings, ftrace_dump_on_oops. The temporary buffer is used to avoid deadlocks caused by logbuf_lock. Also it is needed to avoid races with the other temporary buffer that is used when PRINTK_SAFE_CONTEXT is entered. But the main buffer can be used in NMI if the lock is available and we did not interrupt PRINTK_SAFE_CONTEXT. The lock is checked using raw_spin_is_locked(). It might cause false negatives when the lock is taken on another CPU and this CPU is in the safe context from other reasons. Note that the safe context is used also to get console semaphore or when calling console drivers. For this reason, we do the check in printk_nmi_enter(). It makes the handling consistent for the entire NMI handler and avoids reshuffling of the messages. The patch also defines special printk context that allows to use printk_deferred() in NMI. Note that we could not flush the messages to the consoles because console drivers might use many other internal locks. The newly created vprintk_deferred() disables the preemption only around the irq work handling. It is needed there to keep the consistency between the two per-CPU variables. But there is no reason to disable preemption around vprintk_emit(). Finally, the patch puts back explicit serialization of the NMI backtraces from different CPUs. It was removed by the commit a9edc88093287183ac934b ("x86/nmi: Perform a safe NMI stack trace on all CPUs"). It was not needed because the flushing of the temporary per-CPU buffers was serialized. Suggested-by: Sergey Senozhatsky Signed-off-by: Petr Mladek Acked-by: Sergey Senozhatsky --- Changes against v1: + improved the check of logbuf_lock state to reduce false negatives; updated the related comment and the commit message accordingly The discussion about the previous version started at https://lkml.kernel.org/r/20170420131154.GL3452@pathway.suse.cz kernel/printk/internal.h | 6 ++++-- kernel/printk/printk.c | 19 ++++++++++++++----- kernel/printk/printk_safe.c | 26 ++++++++++++++++++++++++-- lib/nmi_backtrace.c | 3 +++ 4 files changed, 45 insertions(+), 9 deletions(-) diff --git a/kernel/printk/internal.h b/kernel/printk/internal.h index 1db044f808b7..2a7d04049af4 100644 --- a/kernel/printk/internal.h +++ b/kernel/printk/internal.h @@ -18,12 +18,14 @@ #ifdef CONFIG_PRINTK -#define PRINTK_SAFE_CONTEXT_MASK 0x7fffffff -#define PRINTK_NMI_CONTEXT_MASK 0x80000000 +#define PRINTK_SAFE_CONTEXT_MASK 0x3fffffff +#define PRINTK_NMI_DEFERRED_CONTEXT_MASK 0x40000000 +#define PRINTK_NMI_CONTEXT_MASK 0x80000000 extern raw_spinlock_t logbuf_lock; __printf(1, 0) int vprintk_default(const char *fmt, va_list args); +__printf(1, 0) int vprintk_deferred(const char *fmt, va_list args); __printf(1, 0) int vprintk_func(const char *fmt, va_list args); void __printk_safe_enter(void); void __printk_safe_exit(void); diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c index 2984fb0f0257..16b519927d35 100644 --- a/kernel/printk/printk.c +++ b/kernel/printk/printk.c @@ -2691,16 +2691,13 @@ void wake_up_klogd(void) preempt_enable(); } -int printk_deferred(const char *fmt, ...) +int vprintk_deferred(const char *fmt, va_list args) { - va_list args; int r; - preempt_disable(); - va_start(args, fmt); r = vprintk_emit(0, LOGLEVEL_SCHED, NULL, 0, fmt, args); - va_end(args); + preempt_disable(); __this_cpu_or(printk_pending, PRINTK_PENDING_OUTPUT); irq_work_queue(this_cpu_ptr(&wake_up_klogd_work)); preempt_enable(); @@ -2708,6 +2705,18 @@ int printk_deferred(const char *fmt, ...) return r; } +int printk_deferred(const char *fmt, ...) +{ + va_list args; + int r; + + va_start(args, fmt); + r = vprintk_deferred(fmt, args); + va_end(args); + + return r; +} + /* * printk rate limiting, lifted from the networking subsystem. * diff --git a/kernel/printk/printk_safe.c b/kernel/printk/printk_safe.c index 033e50a7d706..6b43dda2b71b 100644 --- a/kernel/printk/printk_safe.c +++ b/kernel/printk/printk_safe.c @@ -308,12 +308,24 @@ static int vprintk_nmi(const char *fmt, va_list args) void printk_nmi_enter(void) { - this_cpu_or(printk_context, PRINTK_NMI_CONTEXT_MASK); + /* + * The size of the extra per-CPU buffer is limited. Use it only when + * the main one is locked. If this CPU is not in the safe context, + * the lock must be taken on another CPU and we could wait for it. + */ + if (raw_spin_is_locked(&logbuf_lock) && + this_cpu_read(printk_context) & PRINTK_SAFE_CONTEXT_MASK) { + this_cpu_or(printk_context, PRINTK_NMI_CONTEXT_MASK); + } else { + this_cpu_or(printk_context, PRINTK_NMI_DEFERRED_CONTEXT_MASK); + } } void printk_nmi_exit(void) { - this_cpu_and(printk_context, ~PRINTK_NMI_CONTEXT_MASK); + this_cpu_and(printk_context, + ~(PRINTK_NMI_CONTEXT_MASK || + PRINTK_NMI_DEFERRED_CONTEXT_MASK)); } #else @@ -351,12 +363,22 @@ void __printk_safe_exit(void) __printf(1, 0) int vprintk_func(const char *fmt, va_list args) { + /* Use extra buffer in NMI when logbuf_lock is taken or in safe mode. */ if (this_cpu_read(printk_context) & PRINTK_NMI_CONTEXT_MASK) return vprintk_nmi(fmt, args); + /* Use extra buffer to prevent a recursion deadlock in safe mode. */ if (this_cpu_read(printk_context) & PRINTK_SAFE_CONTEXT_MASK) return vprintk_safe(fmt, args); + /* + * Use the main logbuf when logbuf_lock is available in NMI. + * But avoid calling console drivers that might have their own locks. + */ + if (this_cpu_read(printk_context) & PRINTK_NMI_DEFERRED_CONTEXT_MASK) + return vprintk_deferred(fmt, args); + + /* No obstacles. */ return vprintk_default(fmt, args); } diff --git a/lib/nmi_backtrace.c b/lib/nmi_backtrace.c index 4e8a30d1c22f..0bc0a3535a8a 100644 --- a/lib/nmi_backtrace.c +++ b/lib/nmi_backtrace.c @@ -86,9 +86,11 @@ void nmi_trigger_cpumask_backtrace(const cpumask_t *mask, bool nmi_cpu_backtrace(struct pt_regs *regs) { + static arch_spinlock_t lock = __ARCH_SPIN_LOCK_UNLOCKED; int cpu = smp_processor_id(); if (cpumask_test_cpu(cpu, to_cpumask(backtrace_mask))) { + arch_spin_lock(&lock); if (regs && cpu_in_idle(instruction_pointer(regs))) { pr_warn("NMI backtrace for cpu %d skipped: idling at pc %#lx\n", cpu, instruction_pointer(regs)); @@ -99,6 +101,7 @@ bool nmi_cpu_backtrace(struct pt_regs *regs) else dump_stack(); } + arch_spin_unlock(&lock); cpumask_clear_cpu(cpu, to_cpumask(backtrace_mask)); return true; }