From patchwork Sat Feb 16 06:47:45 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Liu Bo
X-Patchwork-Id: 2151111
Return-Path:
X-Original-To: patchwork-linux-btrfs@patchwork.kernel.org
Delivered-To: patchwork-process-083081@patchwork1.kernel.org
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by patchwork1.kernel.org (Postfix) with ESMTP id 0DE723FDF1
	for ; Sat, 16 Feb 2013 06:50:54 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751270Ab3BPGuu (ORCPT );
	Sat, 16 Feb 2013 01:50:50 -0500
Received: from aserp1040.oracle.com ([141.146.126.69]:24633 "EHLO
	aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1750828Ab3BPGuu (ORCPT );
	Sat, 16 Feb 2013 01:50:50 -0500
Received: from ucsinet22.oracle.com (ucsinet22.oracle.com [156.151.31.94])
	by aserp1040.oracle.com (Sentrion-MTA-4.3.1/Sentrion-MTA-4.3.1) with
	ESMTP id r1G6okYo004995
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK);
	Sat, 16 Feb 2013 06:50:47 GMT
Received: from acsmt356.oracle.com (acsmt356.oracle.com [141.146.40.156])
	by ucsinet22.oracle.com (8.14.4+Sun/8.14.4) with ESMTP id
	r1G6ojmO016175
	(version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO);
	Sat, 16 Feb 2013 06:50:46 GMT
Received: from abhmt103.oracle.com (abhmt103.oracle.com [141.146.116.55])
	by acsmt356.oracle.com (8.12.11.20060308/8.12.11) with ESMTP id
	r1G6oipN014939; Sat, 16 Feb 2013 00:50:44 -0600
Received: from liubo.jp.oracle.com (/10.191.7.103)
	by default (Oracle Beehive Gateway v4.0)
	with ESMTP ; Fri, 15 Feb 2013 22:50:44 -0800
Date: Sat, 16 Feb 2013 14:47:45 +0800
From: Liu Bo
To: Stefan Behrens
Cc: linux-btrfs@vger.kernel.org
Subject: Re: [PATCH V5] Btrfs: snapshot-aware defrag
Message-ID: <20130216064743.GA3124@liubo.jp.oracle.com>
Reply-To: bo.li.liu@oracle.com
References: <1358339768-2314-1-git-send-email-bo.li.liu@oracle.com>
	<20130123075155.GE17162@liubo.jp.oracle.com>
	<20130124005221.GA28406@liubo>
	<5102A76C.5050706@giantdisaster.de>
	<20130127131952.GB16722@liubo>
	<5106AD9D.5020906@giantdisaster.de>
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To: <5106AD9D.5020906@giantdisaster.de>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Source-IP: ucsinet22.oracle.com [156.151.31.94]
Sender: linux-btrfs-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-btrfs@vger.kernel.org

On Mon, Jan 28, 2013 at 05:55:57PM +0100, Stefan Behrens wrote:
> [CC list reduced (my initial statement was that such dead_list
> corruptions happen without the snapshot-aware defrag patch, by now the
> content is not related to the snapshot-aware defrag patch anymore)]
>
> [...]
>
> No, this did not fix the problem (and I changed the patch and replaced
> "root" with "gang[0]" for the compiler's satisfaction). Same stack trace
> as before.
>
> This happens without scrub or defrag running in parallel. The mount
> options are compress=lzo,space_cache,inode_cache. I mount the
> filesystem, create about 1000 subvols and snapshots, fill some data in
> the subvolumes, delete all subvolumes, wait until "btrfs subvol list ...
> | wc -l" prints 0, then immediately unmount the filesystem and then it
> crashes.
>
> Disabling the inode_cache mount option eliminates the crash.

Hi Stefan,

What about this patch (UNTESTED)?

thanks,
liubo

>
> BTW, when I reproduced this crash with 6600 outstanding subvolume
> deletions, the next mount command took 40 minutes to return back to user
> mode. The btrfs-cleaner thread was executing btrfs_clean_old_snapshots()
> and was writing the superblocks every time I looked on its stack. The
> mount process was executing btrfs_find_orphan_roots() the first half of
> the time and afterwards btrfs_orphan_cleanup() for the rest of the 40
> minutes.
>
>
> >> BUG: unable to handle kernel paging request at ffff88042503b830
> >> IP: [] __list_add+0x17/0xd0
> >> PGD 1e0c063 PUD bf58e067 PMD bf6b7067 PTE 800000042503b160
> >> Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
> >> Modules linked in: btrfs bonding raid1 mpt2sas scsi_transport_sas raid_class
> >> CPU 2
> >> Pid: 10259, comm: umount Not tainted 3.8.0-rc4+ #16 Supermicro X8SIL/X8SIL
> >> RIP: 0010:[] [] __list_add+0x17/0xd0
> >> RSP: 0018:ffff8802f67a1bd8 EFLAGS: 00010286
> >> RAX: ffff880425b7c560 RBX: ffff880423ca2828 RCX: 0000000000000001
> >> RDX: ffff88042503b828 RSI: ffff8804257794c0 RDI: ffff880423ca2828
> >> RBP: ffff8802f67a1bf8 R08: 0000000000077850 R09: 0000000000000000
> >> R10: 0000000000000000 R11: 0000000000000001 R12: ffff880423ca2000
> >> R13: ffff880423ca2898 R14: 0000000000000000 R15: ffff8802f67a1d30
> >> FS: 00007f6e89bba740(0000) GS:ffff88042ea00000(0000) knlGS:0000000000000000
> >> CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> >> CR2: ffff88042503b830 CR3: 000000029a56c000 CR4: 00000000000007e0
> >> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> >> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> >> Process umount (pid: 10259, threadinfo ffff8802f67a0000, task ffff880425b7c560)
> >> Stack:
> >> ffffffffa00a414f ffff880423ca2000 ffff880423ca2000 ffff880423ca2898
> >> ffff8802f67a1c18 ffffffffa00a4170 ffff88042a60c1f8 ffff88042a60c1f8
> >> ffff8802f67a1c48 ffffffffa00b3180 ffff88042a60c1f8 ffff88042a60c280
> >> Call Trace:
> >> [] ? btrfs_add_dead_root+0x1f/0x60 [btrfs]
> >> [] btrfs_add_dead_root+0x40/0x60 [btrfs]
> >> [] btrfs_destroy_inode+0x1d0/0x2d0 [btrfs]
> >> [] destroy_inode+0x37/0x60
> >> [] evict+0x10d/0x1a0
> >> [] iput+0x105/0x190
> >> [] free_fs_root+0x18/0x90 [btrfs]
> >> [] btrfs_free_fs_root+0x7b/0x90 [btrfs]
> >> [] del_fs_roots+0xaf/0xf0 [btrfs]
> >> [] close_ctree+0x1c6/0x300 [btrfs]
> >> [] ? evict_inodes+0xec/0x100
> >> [] btrfs_put_super+0x14/0x20 [btrfs]
> >> [] generic_shutdown_super+0x5c/0xe0
> >> [] kill_anon_super+0x11/0x20
> >> [] btrfs_kill_super+0x15/0x90 [btrfs]
> >> [] ? deactivate_super+0x41/0x70
> >> [] deactivate_locked_super+0x3d/0x70
> >> [] deactivate_super+0x49/0x70
> >> [] mntput_no_expire+0xd2/0x130
> >> [] sys_umount+0x71/0x390
> >> [] system_call_fastpath+0x16/0x1b
> >> Code: 48 83 c4 08 5b 5d c3 66 66 66 66 2e 0f 1f 84 00 00 00 00 00 55 48 89 e5 48 83 ec 20 48 89 5d e8 4c 89 65 f0 48 89 fb 4c 89 6d f8 <4c> 8b 42 08 49 89 f5 49 89 d4 49 39 f0 75 31 4d 8b 45 00 4d 39
> >> RIP [] __list_add+0x17/0xd0
> >> RSP
> >> CR2: ffff88042503b830
> >> ---[ end trace 5e44f1afc74751aa ]---
>

---
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index ca7ace7..dac9d4b 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -4142,9 +4142,14 @@ static void inode_tree_del(struct inode *inode)
 	 * root_refs of 0, so this could end up dropping the tree root as a
 	 * snapshot, so we need the extra !root->fs_info->tree_root check to
 	 * make sure we don't drop it.
+	 *
+	 * Inode cache's inodes may be iput and add root back to dead roots
+	 * list during killing super, which leads to use-after-free, so
+	 * we need to check fs_info->closing to keep us from use-after-free.
 	 */
 	if (empty && btrfs_root_refs(&root->root_item) == 0 &&
-	    root != root->fs_info->tree_root) {
+	    root != root->fs_info->tree_root &&
+	    btrfs_fs_closing(root->fs_info) > 1) {
 		synchronize_srcu(&root->fs_info->subvol_srcu);
 		spin_lock(&root->inode_lock);
 		empty = RB_EMPTY_ROOT(&root->inode_tree);
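
Illustration only, not part of the patch: a minimal user-space sketch of the idea described in the new comment, namely that a root must not be put back on the dead roots list once teardown has started. All names here (`model_fs`, `model_root`, `model_add_dead_root`, `tearing_down`) are hypothetical, and the boolean `tearing_down` is an assumed simplification of `fs_info->closing`; the sketch deliberately does not reproduce the exact `btrfs_fs_closing()` comparison from the diff above, nor any kernel locking.

```c
/* User-space model only; none of these names exist in the kernel. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_root {
	struct model_root *next;	/* link in the dead roots list */
	int refs;			/* stands in for btrfs_root_refs() */
};

struct model_fs {
	struct model_root *dead_roots;	/* head of the dead roots list */
	bool tearing_down;		/* assumed stand-in for fs_info->closing */
};

/*
 * Queue a root for cleanup unless it is still referenced or the
 * filesystem is being torn down. Returns true if the root was queued.
 */
static bool model_add_dead_root(struct model_fs *fs, struct model_root *root)
{
	assert(fs != NULL && root != NULL);

	if (root->refs != 0)
		return false;	/* still referenced, nothing to clean up */
	if (fs->tearing_down)
		return false;	/* the list is going away, do not touch it */

	root->next = fs->dead_roots;
	fs->dead_roots = root;
	return true;
}
```

In this model a late iput during unmount ends in a refused request instead of a list insertion into memory that is about to be freed, which is the failure the stack trace above shows in __list_add.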