From patchwork Tue Jan 27 23:06:42 2015
X-Patchwork-Submitter: Filipe Manana
X-Patchwork-Id: 5721241
From: Filipe Manana
To: linux-btrfs@vger.kernel.org
Cc: Filipe Manana
Subject: [PATCH v2] Btrfs: fix scrub race leading to use-after-free
Date: Tue, 27 Jan 2015 23:06:42 +0000
Message-Id: <1422400002-19602-1-git-send-email-fdmanana@suse.com>
X-Mailer: git-send-email 2.1.3
In-Reply-To: <1422375078-6916-1-git-send-email-fdmanana@suse.com>
References: <1422375078-6916-1-git-send-email-fdmanana@suse.com>
X-Mailing-List: linux-btrfs@vger.kernel.org

While running a scrub on a kernel with CONFIG_DEBUG_PAGEALLOC=y, I got
the following trace:

[68127.807663] BUG: unable to handle kernel paging request at ffff8803f8947a50
[68127.807663] IP: [] do_raw_spin_lock+0x94/0x122
[68127.807663] PGD 3003067 PUD 43e1f5067 PMD 43e030067 PTE 80000003f8947060
[68127.807663] Oops: 0000 [#1] SMP DEBUG_PAGEALLOC
[68127.807663] Modules linked in: dm_flakey dm_mod crc32c_generic btrfs xor raid6_pq nfsd auth_rpcgss oid_registry nfs_acl nfs lockd grace fscache sunrpc loop parport_pc processor parpo
[68127.807663] CPU: 2 PID: 3081 Comm: kworker/u8:5 Not tainted 3.18.0-rc6-btrfs-next-3+ #4
[68127.807663] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014
[68127.807663] Workqueue: btrfs-btrfs-scrub btrfs_scrub_helper [btrfs]
[68127.807663] task: ffff880101fc5250 ti: ffff8803f097c000 task.ti: ffff8803f097c000
[68127.807663] RIP: 0010:[] [] do_raw_spin_lock+0x94/0x122
[68127.807663] RSP: 0018:ffff8803f097fbb8 EFLAGS: 00010093
[68127.807663] RAX: 0000000028dd386c RBX: ffff8803f8947a50 RCX: 0000000028dd3854
[68127.807663] RDX: 0000000000000018 RSI: 0000000000000002 RDI: 0000000000000001
[68127.807663] RBP: ffff8803f097fbd8 R08: 0000000000000004 R09: 0000000000000001
[68127.807663] R10: ffff880102620980 R11: ffff8801f3e8c900 R12: 000000000001d390
[68127.807663] R13: 00000000cabd13c8 R14: ffff8803f8947800 R15: ffff88037c574f00
[68127.807663] FS: 0000000000000000(0000) GS:ffff88043dd00000(0000) knlGS:0000000000000000
[68127.807663] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[68127.807663] CR2: ffff8803f8947a50 CR3: 00000000b6481000 CR4: 00000000000006e0
[68127.807663] Stack:
[68127.807663]  ffffffff823942a8 ffff8803f8947a50 ffff8802a3416f80 0000000000000000
[68127.807663]  ffff8803f097fc18 ffffffff8141e7c0 ffffffff81072948 000000000034f314
[68127.807663]  ffff8803f097fc08 0000000000000292 ffff8803f097fc48 ffff8803f8947a50
[68127.807663] Call Trace:
[68127.807663]  [] _raw_spin_lock_irqsave+0x4b/0x55
[68127.807663]  [] ? __wake_up+0x22/0x4b
[68127.807663]  [] __wake_up+0x22/0x4b
[68127.807663]  [] scrub_pending_bio_dec+0x32/0x36 [btrfs]
[68127.807663]  [] scrub_bio_end_io_worker+0x5a3/0x5c9 [btrfs]
[68127.807663]  [] ? time_hardirqs_off+0x15/0x28
[68127.807663]  [] ? trace_hardirqs_off_caller+0x4c/0xb9
[68127.807663]  [] normal_work_helper+0xf1/0x238 [btrfs]
[68127.807663]  [] btrfs_scrub_helper+0x12/0x14 [btrfs]
[68127.807663]  [] process_one_work+0x1e4/0x3b6
[68127.807663]  [] ? trace_hardirqs_off+0xd/0xf
[68127.807663]  [] worker_thread+0x1fb/0x2a8
[68127.807663]  [] ? rescuer_thread+0x219/0x219
[68127.807663]  [] kthread+0xdb/0xe3
[68127.807663]  [] ? __kthread_parkme+0x67/0x67
[68127.807663]  [] ret_from_fork+0x7c/0xb0
[68127.807663]  [] ? __kthread_parkme+0x67/0x67
[68127.807663] Code: 39 c2 75 14 8d 8a 00 00 01 00 89 d0 f0 0f b1 0b 39 d0 0f 84 81 00 00 00 4c 69 2d 27 86 99 00 fa 00 00 00 45 31 e4 4d 39 ec 74 2b <8b> 13 89 d0 c1 e8 10 66 39 c2 75
[68127.807663] RIP [] do_raw_spin_lock+0x94/0x122
[68127.807663] RSP
[68127.807663] CR2: ffff8803f8947a50
[68127.807663] ---[ end trace d7045aac00a66cd8 ]---

This is due to a race that can happen in a very small time window and
is illustrated by the following sequence diagram:

        CPU 1                                      CPU 2

                                                   btrfs_scrub_dev()
scrub_bio_end_io_worker()
  scrub_pending_bio_dec()
    atomic_dec(&sctx->bios_in_flight)
                                                   wait sctx->bios_in_flight == 0
                                                   wait sctx->workers_pending == 0
                                                   mutex_lock(&fs_info->scrub_lock)
                                                   (...)
                                                   mutex_unlock(&fs_info->scrub_lock)
                                                   scrub_free_ctx(sctx)
                                                     kfree(sctx)
    wake_up(&sctx->list_wait)
      __wake_up()
        spin_lock_irqsave(&sctx->list_wait->lock, flags)

Another variation of this scenario that results in the same
use-after-free issue is:

        CPU 1                                      CPU 2

                                                   btrfs_scrub_dev()
                                                   wait sctx->bios_in_flight == 0
scrub_bio_end_io_worker()
  scrub_pending_bio_dec()
    __wake_up(&sctx->list_wait)
      spin_lock_irqsave(&sctx->list_wait->lock, flags)
      default_wake_function()
        wake up task at CPU 2
                                                   wait sctx->workers_pending == 0
                                                   mutex_lock(&fs_info->scrub_lock)
                                                   (...)
                                                   mutex_unlock(&fs_info->scrub_lock)
                                                   scrub_free_ctx(sctx)
                                                     kfree(sctx)
      spin_unlock_irqrestore(&sctx->list_wait->lock, flags)

Fix this by holding the scrub lock while doing the wakeup. Since
btrfs_scrub_dev() takes the same lock before calling scrub_free_ctx(),
it can no longer free the sctx while a wakeup is still in progress.

This isn't a recent regression; the issue has been around since the
scrub feature was added (2011, commit
a2de733c78fa7af51ba9670482fa7d392aa67c57).

Signed-off-by: Filipe Manana
---

V2: No code changes. Updated only the commit message's sequence diagram
    and added a second one for a slightly different case.

 fs/btrfs/scrub.c | 16 +++++++++++++++-
 1 file changed, 15 insertions(+), 1 deletion(-)

diff --git a/fs/btrfs/scrub.c b/fs/btrfs/scrub.c
index 8f75b99..8252a5e 100644
--- a/fs/btrfs/scrub.c
+++ b/fs/btrfs/scrub.c
@@ -306,8 +306,17 @@ static void scrub_pending_bio_inc(struct scrub_ctx *sctx)
 
 static void scrub_pending_bio_dec(struct scrub_ctx *sctx)
 {
+	struct btrfs_fs_info *fs_info = sctx->dev_root->fs_info;
+
+	/*
+	 * Hold the scrub_lock while doing the wakeup to ensure the
+	 * sctx (and its wait queue list_wait) isn't destroyed/freed
+	 * during the wakeup.
+	 */
+	mutex_lock(&fs_info->scrub_lock);
 	atomic_dec(&sctx->bios_in_flight);
 	wake_up(&sctx->list_wait);
+	mutex_unlock(&fs_info->scrub_lock);
 }
 
 static void __scrub_blocked_if_needed(struct btrfs_fs_info *fs_info)
@@ -379,10 +388,15 @@ static void scrub_pending_trans_workers_dec(struct scrub_ctx *sctx)
 	mutex_lock(&fs_info->scrub_lock);
 	atomic_dec(&fs_info->scrubs_running);
 	atomic_dec(&fs_info->scrubs_paused);
-	mutex_unlock(&fs_info->scrub_lock);
 	atomic_dec(&sctx->workers_pending);
 	wake_up(&fs_info->scrub_pause_wait);
+	/*
+	 * Hold the scrub_lock while doing the wakeup to ensure the
+	 * sctx (and its wait queue list_wait) isn't destroyed/freed
+	 * during the wakeup.
+	 */
 	wake_up(&sctx->list_wait);
+	mutex_unlock(&fs_info->scrub_lock);
 }
 
 static void scrub_free_csums(struct scrub_ctx *sctx)
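
The locking pattern used by the fix can be tried outside the kernel with a rough userspace analogue built on POSIX threads. This is only a sketch under stated assumptions, not the btrfs code: struct ctx, outer_lock, pending_dec(), wait_and_free() and run_demo() are hypothetical names. outer_lock stands in for fs_info->scrub_lock, and the condition variable together with its mutex stands in for sctx->list_wait and its internal spinlock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct scrub_ctx. */
struct ctx {
	atomic_int in_flight;      /* plays the role of bios_in_flight */
	pthread_mutex_t wait_lock; /* plays the role of list_wait's lock */
	pthread_cond_t wait;       /* plays the role of list_wait */
};

/* Hypothetical stand-in for fs_info->scrub_lock. */
static pthread_mutex_t outer_lock = PTHREAD_MUTEX_INITIALIZER;
static atomic_int completed;

/* Worker side: decrement and wake up while holding the outer lock. */
static void pending_dec(struct ctx *c)
{
	pthread_mutex_lock(&outer_lock);
	atomic_fetch_sub(&c->in_flight, 1);
	pthread_mutex_lock(&c->wait_lock);
	pthread_cond_broadcast(&c->wait);
	pthread_mutex_unlock(&c->wait_lock);
	pthread_mutex_unlock(&outer_lock);
	/* c must not be touched past this point: it may already be freed. */
}

/* Waiter side: wait for zero, pass through the outer lock, then free. */
static void wait_and_free(struct ctx *c)
{
	pthread_mutex_lock(&c->wait_lock);
	while (atomic_load(&c->in_flight) != 0)
		pthread_cond_wait(&c->wait, &c->wait_lock);
	pthread_mutex_unlock(&c->wait_lock);

	/*
	 * Taking the outer lock blocks until any worker that is still
	 * between its decrement and the end of its wakeup has finished.
	 */
	pthread_mutex_lock(&outer_lock);
	pthread_mutex_unlock(&outer_lock);

	pthread_cond_destroy(&c->wait);
	pthread_mutex_destroy(&c->wait_lock);
	free(c);
}

static void *worker(void *arg)
{
	pending_dec(arg);
	atomic_fetch_add(&completed, 1);
	return NULL;
}

/* Starts nworkers threads; returns how many ran, or -1 on allocation failure. */
int run_demo(int nworkers)
{
	assert(nworkers > 0);
	pthread_t *tids = malloc(sizeof(*tids) * (size_t)nworkers);
	struct ctx *c = malloc(sizeof(*c));
	int started = 0;

	if (!tids || !c) {
		free(tids);
		free(c);
		return -1;
	}
	atomic_init(&c->in_flight, nworkers);
	pthread_mutex_init(&c->wait_lock, NULL);
	pthread_cond_init(&c->wait, NULL);
	atomic_store(&completed, 0);

	for (int i = 0; i < nworkers; i++) {
		if (pthread_create(&tids[started], NULL, worker, c) == 0)
			started++;
		else
			pending_dec(c); /* account for a worker that never started */
	}
	wait_and_free(c);
	for (int i = 0; i < started; i++)
		pthread_join(tids[i], NULL);
	free(tids);
	return atomic_load(&completed);
}
```

Calling run_demo(4) starts four workers and returns how many of them finished. Removing the outer_lock calls from pending_dec() reopens the window from the first sequence diagram: the waiter can observe the counter at zero and free the context while a worker is still about to lock and signal it.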