From patchwork Wed Nov 26 15:28:52 2014 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Filipe Manana X-Patchwork-Id: 5386351 Return-Path: X-Original-To: patchwork-linux-btrfs@patchwork.kernel.org Delivered-To: patchwork-parsemail@patchwork1.web.kernel.org Received: from mail.kernel.org (mail.kernel.org [198.145.19.201]) by patchwork1.web.kernel.org (Postfix) with ESMTP id 71CC89F319 for ; Wed, 26 Nov 2014 15:29:43 +0000 (UTC) Received: from mail.kernel.org (localhost [127.0.0.1]) by mail.kernel.org (Postfix) with ESMTP id C8139201EF for ; Wed, 26 Nov 2014 15:29:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5DBDD201C0 for ; Wed, 26 Nov 2014 15:29:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753805AbaKZP3h (ORCPT ); Wed, 26 Nov 2014 10:29:37 -0500 Received: from victor.provo.novell.com ([137.65.250.26]:55536 "EHLO prv3-mh.provo.novell.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1753582AbaKZP3e (ORCPT ); Wed, 26 Nov 2014 10:29:34 -0500 Received: from debian3.lan (prv-ext-foundry1int.gns.novell.com [137.65.251.240]) by prv3-mh.provo.novell.com with ESMTP (NOT encrypted); Wed, 26 Nov 2014 08:29:23 -0700 From: Filipe Manana To: linux-btrfs@vger.kernel.org Cc: Filipe Manana Subject: [PATCH 3/6] Btrfs: fix freeing used extents after removing empty block group Date: Wed, 26 Nov 2014 15:28:52 +0000 Message-Id: <1417015735-8581-4-git-send-email-fdmanana@suse.com> X-Mailer: git-send-email 2.1.3 In-Reply-To: <1417015735-8581-1-git-send-email-fdmanana@suse.com> References: <1417015735-8581-1-git-send-email-fdmanana@suse.com> Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Spam-Status: No, score=-6.9 required=5.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, T_RP_MATCHES_RCVD, UNPARSEABLE_RELAY autolearn=unavailable version=3.3.1 X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on mail.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP There's a race between adding a block group to the list of the unused block groups and removing an unused block group (cleaner kthread) that leads to freeing extents that are in use or a crash during transaction commmit. Basically the cleaner kthread, when executing btrfs_delete_unused_bgs(), might catch the newly added block group to the list fs_info->unused_bgs and clear the range representing the whole group from fs_info->freed_extents[] before the task that added the block group to the list (running update_block_group()) marked the last freed extent as dirty in fs_info->freed_extents (pinned_extents). That is: CPU 1 CPU 2 btrfs_delete_unused_bgs() update_block_group() add block group to fs_info->unused_bgs got block group from the list clear_extent_bits for the whole block group range in freed_extents[] set_extent_dirty for the range covering the freed extent in freed_extents[] (fs_info->pinned_extents) block group deleted, and a new block group with the same logical address is created reserve space from the new block group for new data or metadata - the reserved space overlaps the range specified by CPU 1 for set_extent_dirty() commit transaction find all ranges marked as dirty in fs_info->pinned_extents, clear them and add them to the free space cache Alternatively, if CPU 2 doesn't create a new block group with the same logical address, we get a crash/BUG_ON at transaction commit when unpining extent ranges because we can't find a block group for the range marked as dirty by CPU 1. Sample trace: [ 2163.426462] invalid opcode: 0000 [#1] SMP DEBUG_PAGEALLOC [ 2163.426640] Modules linked in: btrfs xor raid6_pq dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio crc32c_generic libcrc32c dm_mod nfsd auth_rpc gss oid_registry nfs_acl nfs lockd fscache sunrpc loop psmouse parport_pc parport i2c_piix4 processor thermal_sys i2ccore evdev button pcspkr microcode serio_raw ext4 crc16 jbd2 mbcache sg sr_mod cdrom sd_mod crc_t10dif crct10dif_generic crct10dif_common ata_generic virtio_scsi floppy ata_piix libata e1000 scsi_mod virtio_pci virtio_ring virtio [ 2163.428209] CPU: 0 PID: 11858 Comm: btrfs-transacti Tainted: G W 3.17.0-rc5-btrfs-next-1+ #1 [ 2163.428519] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.7.5-0-ge51488c-20140602_164612-nilsson.home.kraxel.org 04/01/2014 [ 2163.428875] task: ffff88009f2c0650 ti: ffff8801356bc000 task.ti: ffff8801356bc000 [ 2163.429157] RIP: 0010:[] [] unpin_extent_range.isra.58+0x62/0x192 [btrfs] [ 2163.429562] RSP: 0018:ffff8801356bfda8 EFLAGS: 00010246 [ 2163.429802] RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000 [ 2163.429990] RDX: 0000000041bfffff RSI: 0000000001c00000 RDI: ffff880024307080 [ 2163.430042] RBP: ffff8801356bfde8 R08: 0000000000000068 R09: ffff88003734f118 [ 2163.430042] R10: ffff8801356bfcb8 R11: fffffffffffffb69 R12: ffff8800243070d0 [ 2163.430042] R13: 0000000083c04000 R14: ffff8800751b0f00 R15: ffff880024307000 [ 2163.430042] FS: 0000000000000000(0000) GS:ffff88013f400000(0000) knlGS:0000000000000000 [ 2163.430042] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b [ 2163.430042] CR2: 00007ff10eb43fc0 CR3: 0000000004cb8000 CR4: 00000000000006f0 [ 2163.430042] Stack: [ 2163.430042] ffff8800243070d0 0000000083c08000 0000000083c07fff ffff88012d6bc800 [ 2163.430042] ffff8800243070d0 ffff8800751b0f18 ffff8800751b0f00 0000000000000000 [ 2163.430042] ffff8801356bfe18 ffffffffa037a481 0000000083c04000 0000000083c07fff [ 2163.430042] Call Trace: [ 2163.430042] [] btrfs_finish_extent_commit+0xac/0xbf [btrfs] [ 2163.430042] [] btrfs_commit_transaction+0x6ee/0x882 [btrfs] [ 2163.430042] [] transaction_kthread+0xf2/0x1a4 [btrfs] [ 2163.430042] [] ? btrfs_cleanup_transaction+0x3d8/0x3d8 [btrfs] [ 2163.430042] [] kthread+0xb7/0xbf [ 2163.430042] [] ? __kthread_parkme+0x67/0x67 [ 2163.430042] [] ret_from_fork+0x7c/0xb0 [ 2163.430042] [] ? __kthread_parkme+0x67/0x67 So fix this by making update_block_group() first set the range as dirty in pinned_extents before adding the block group to the unused_bgs list. Signed-off-by: Filipe Manana Reviewed-by: Josef Bacik --- fs/btrfs/extent-tree.c | 21 ++++++++++----------- 1 file changed, 10 insertions(+), 11 deletions(-) diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c index b7e40ef..92f61f2 100644 --- a/fs/btrfs/extent-tree.c +++ b/fs/btrfs/extent-tree.c @@ -5439,7 +5439,17 @@ static int update_block_group(struct btrfs_root *root, spin_unlock(&cache->space_info->lock); } else { old_val -= num_bytes; + btrfs_set_block_group_used(&cache->item, old_val); + cache->pinned += num_bytes; + cache->space_info->bytes_pinned += num_bytes; + cache->space_info->bytes_used -= num_bytes; + cache->space_info->disk_used -= num_bytes * factor; + spin_unlock(&cache->lock); + spin_unlock(&cache->space_info->lock); + set_extent_dirty(info->pinned_extents, + bytenr, bytenr + num_bytes - 1, + GFP_NOFS | __GFP_NOFAIL); /* * No longer have used bytes in this block group, queue * it for deletion. @@ -5453,17 +5463,6 @@ static int update_block_group(struct btrfs_root *root, } spin_unlock(&info->unused_bgs_lock); } - btrfs_set_block_group_used(&cache->item, old_val); - cache->pinned += num_bytes; - cache->space_info->bytes_pinned += num_bytes; - cache->space_info->bytes_used -= num_bytes; - cache->space_info->disk_used -= num_bytes * factor; - spin_unlock(&cache->lock); - spin_unlock(&cache->space_info->lock); - - set_extent_dirty(info->pinned_extents, - bytenr, bytenr + num_bytes - 1, - GFP_NOFS | __GFP_NOFAIL); } btrfs_put_block_group(cache); total -= num_bytes;