From patchwork Mon Sep 17 23:03:32 2012 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Daniel Vetter X-Patchwork-Id: 1470001 Return-Path: X-Original-To: patchwork-dri-devel@patchwork.kernel.org Delivered-To: patchwork-process-083081@patchwork1.kernel.org Received: from gabe.freedesktop.org (gabe.freedesktop.org [131.252.210.177]) by patchwork1.kernel.org (Postfix) with ESMTP id 351893FE79 for ; Mon, 17 Sep 2012 23:05:47 +0000 (UTC) Received: from gabe.freedesktop.org (localhost [127.0.0.1]) by gabe.freedesktop.org (Postfix) with ESMTP id 164369E93D for ; Mon, 17 Sep 2012 16:05:47 -0700 (PDT) X-Original-To: dri-devel@lists.freedesktop.org Delivered-To: dri-devel@lists.freedesktop.org Received: from mail-we0-f177.google.com (mail-we0-f177.google.com [74.125.82.177]) by gabe.freedesktop.org (Postfix) with ESMTP id B5DE29E7BA for ; Mon, 17 Sep 2012 16:03:55 -0700 (PDT) Received: by weyr3 with SMTP id r3so4379692wey.36 for ; Mon, 17 Sep 2012 16:03:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ffwll.ch; s=google; h=from:to:cc:subject:date:message-id:x-mailer:in-reply-to:references; bh=vuR+tX/4zAHHNbwkJXKutqhAuxfuLl6l+x5qEHhs8q8=; b=W4+AJlYppPBY7EwG02Uf1e8BBDRHlRk1dDaEw5981gdW7oaYc8Yv3G/vZbKK+9O7wI kibZzd4ivzUpeFVvCH42eZXjqBf0Acoz++RYZ3FunyxZJTW/EQb32tE9uTm6BnVXredz JStFL63bgAl8Waa68RNswf3TDjBtWHu7nr3ks= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20120113; h=from:to:cc:subject:date:message-id:x-mailer:in-reply-to:references :x-gm-message-state; bh=vuR+tX/4zAHHNbwkJXKutqhAuxfuLl6l+x5qEHhs8q8=; b=RhH0tOLDIYgtIAerImmH/2hzd/TWrcX4T85s5V7gpn5av5w1y6bLE7MIRTATqhWTpl 3HHG69dD9L0mkFrIdDbQeMm30cyAMU7RugQaX4Y2k02LSildmuPO3cD9wcZXla1MwZWR AkEbd9/S2gLnEgzCoZggbGuKYnDpROgksdUSAhf1H41C01SqwOKBErHagrkJhglywde/ B2zsBPySSDSGo4tVpI0VgBZ8omzdcDRYX8tnYjF1hhjivPVBPeg3TXKz2PzqOo9+GFuV BgXZDH9fDUFOOYtb8xMFrHB60Ag4jDC/1YFI0L0hXvGl2dqK8qzFeQK78moy4x5q1MlY b7Eg== Received: by 10.216.136.167 with SMTP id w39mr7397363wei.76.1347923034811; Mon, 17 Sep 2012 16:03:54 -0700 (PDT) Received: from aaron.ffwll.local (178-83-130-250.dynamic.hispeed.ch. [178.83.130.250]) by mx.google.com with ESMTPS id o2sm28635397wiz.11.2012.09.17.16.03.53 (version=TLSv1/SSLv3 cipher=OTHER); Mon, 17 Sep 2012 16:03:54 -0700 (PDT) From: Daniel Vetter To: LKML Subject: [PATCH 2/2] [RESEND] console: implement lockdep support for console_lock Date: Tue, 18 Sep 2012 01:03:32 +0200 Message-Id: <1347923013-4253-2-git-send-email-daniel.vetter@ffwll.ch> X-Mailer: git-send-email 1.7.10.4 In-Reply-To: <1347923013-4253-1-git-send-email-daniel.vetter@ffwll.ch> References: <1347923013-4253-1-git-send-email-daniel.vetter@ffwll.ch> X-Gm-Message-State: ALoCoQl3vQk5AT8GSUitTmAd7NERh4qh/iOvvarlhpGrEl1HdYEUI/hbz/zIA7jE+jdnscfp82kC Cc: Peter Zijlstra , Daniel Vetter , Intel Graphics Development , DRI Development , Thomas Gleixner X-BeenThere: dri-devel@lists.freedesktop.org X-Mailman-Version: 2.1.13 Precedence: list List-Id: Direct Rendering Infrastructure - Development List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , MIME-Version: 1.0 Sender: dri-devel-bounces+patchwork-dri-devel=patchwork.kernel.org@lists.freedesktop.org Errors-To: dri-devel-bounces+patchwork-dri-devel=patchwork.kernel.org@lists.freedesktop.org Dave Airlie recently discovered a locking bug in the fbcon layer, where a timer_del_sync (for the blinking cursor) deadlocks with the timer itself, since both (want to) hold the console_lock: https://lkml.org/lkml/2012/8/21/36 Unfortunately the console_lock isn't a plain mutex and hence has no lockdep support. Which resulted in a few days wasted of tracking down this bug (complicated by the fact that printk doesn't show anything when the console is locked) instead of noticing the bug much earlier with the lockdep splat. Hence I've figured I need to fix that for the next deadlock involving console_lock - and with kms/drm growing ever more complex locking that'll eventually happen. Now the console_lock has rather funky semantics, so after a quick irc discussion with Thomas Gleixner and Dave Airlie I've quickly ditched the original idead of switching to a real mutex (since it won't work) and instead opted to annotate the console_lock with lockdep information manually. There are a few special cases: - The console_lock state is protected by the console_sem, and usually grabbed/dropped at _lock/_unlock time. But the suspend/resume code drops the semaphore without dropping the console_lock (see suspend_console/resume_console). But since the same thread that did the suspend will do the resume, we don't need to fix up anything. - In the printk code there's a special trylock, only used to kick off the logbuffer printk'ing in console_unlock. But all that happens while lockdep is disable (since printk does a few other evil tricks). So no issue there, either. - The console_lock can also be acquired form irq context (but only with a trylock). lockdep already handles that. This all leaves us with annotating the normal console_lock, _unlock and _trylock functions. And yes, it works - simply unloading a drm kms driver resulted in lockdep complaining about the deadlock in fbcon_deinit: ====================================================== [ INFO: possible circular locking dependency detected ] 3.6.0-rc2+ #552 Not tainted ------------------------------------------------------- kms-reload/3577 is trying to acquire lock: ((&info->queue)){+.+...}, at: [] wait_on_work+0x0/0xa7 but task is already holding lock: (console_lock){+.+.+.}, at: [] bind_con_driver+0x38/0x263 which lock already depends on the new lock. the existing dependency chain (in reverse order) is: -> #1 (console_lock){+.+.+.}: [] lock_acquire+0x95/0x105 [] console_lock+0x59/0x5b [] fb_flashcursor+0x2e/0x12c [] process_one_work+0x1d9/0x3b4 [] worker_thread+0x1a7/0x24b [] kthread+0x7f/0x87 [] kernel_thread_helper+0x4/0x10 -> #0 ((&info->queue)){+.+...}: [] __lock_acquire+0x999/0xcf6 [] lock_acquire+0x95/0x105 [] wait_on_work+0x3b/0xa7 [] __cancel_work_timer+0xbf/0x102 [] cancel_work_sync+0xb/0xd [] fbcon_deinit+0x11c/0x1dc [] bind_con_driver+0x145/0x263 [] unbind_con_driver+0x14f/0x195 [] store_bind+0x1ad/0x1c1 [] dev_attr_store+0x13/0x1f [] sysfs_write_file+0xe9/0x121 [] vfs_write+0x9b/0xfd [] sys_write+0x3e/0x6b [] system_call_fastpath+0x16/0x1b other info that might help us debug this: Possible unsafe locking scenario: CPU0 CPU1 ---- ---- lock(console_lock); lock((&info->queue)); lock(console_lock); lock((&info->queue)); *** DEADLOCK *** Cc: Dave Airlie Cc: Thomas Gleixner Cc: Alan Cox Cc: Peter Zijlstra Signed-off-by: Daniel Vetter --- kernel/printk.c | 9 +++++++++ 1 file changed, 9 insertions(+) diff --git a/kernel/printk.c b/kernel/printk.c index ed9af6a..ab2ab24 100644 --- a/kernel/printk.c +++ b/kernel/printk.c @@ -87,6 +87,12 @@ static DEFINE_SEMAPHORE(console_sem); struct console *console_drivers; EXPORT_SYMBOL_GPL(console_drivers); +#ifdef CONFIG_LOCKDEP +struct lockdep_map console_lock_dep_map = { + .name = "console_lock" +}; +#endif + /* * This is used for debugging the mess that is the VT code by * keeping track if we have the console semaphore held. It's @@ -1916,6 +1922,7 @@ void console_lock(void) return; console_locked = 1; console_may_schedule = 1; + mutex_acquire(&console_lock_dep_map, 0, 0, _RET_IP_); } EXPORT_SYMBOL(console_lock); @@ -1937,6 +1944,7 @@ int console_trylock(void) } console_locked = 1; console_may_schedule = 0; + mutex_acquire(&console_lock_dep_map, 0, 1, _RET_IP_); return 1; } EXPORT_SYMBOL(console_trylock); @@ -2097,6 +2105,7 @@ skip: local_irq_restore(flags); } console_locked = 0; + mutex_release(&console_lock_dep_map, 1, _RET_IP_); /* Release the exclusive_console once it is used */ if (unlikely(exclusive_console))