From patchwork Sun Jul 14 16:20:04 2013
X-Patchwork-Submitter: Alex Lyakas
X-Patchwork-Id: 2827228
Subject: Re: [PATCH v3] btrfs: clean snapshots one by one
From: Alex Lyakas
To: dsterba@suse.cz, linux-btrfs
Cc: Josef Bacik
Date: Sun, 14 Jul 2013 19:20:04 +0300
References: <1363101208-30184-1-git-send-email-dsterba@suse.cz>
 <20130704170337.GU18204@suse.cz>

Hi,

On Thu, Jul 4, 2013 at 10:52 PM, Alex Lyakas wrote:
> Hi David,
>
> On Thu, Jul 4, 2013 at 8:03 PM, David Sterba wrote:
>> On Thu, Jul 04, 2013 at 06:29:23PM +0300, Alex Lyakas wrote:
>>> > @@ -7363,6 +7365,12 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
>>> >  	wc->reada_count = BTRFS_NODEPTRS_PER_BLOCK(root);
>>> >
>>> >  	while (1) {
>>> > +		if (!for_reloc && btrfs_fs_closing(root->fs_info)) {
>>> > +			pr_debug("btrfs: drop snapshot early exit\n");
>>> > +			err = -EAGAIN;
>>> > +			goto out_end_trans;
>>> > +		}
>>> Here you exit the loop, but the "drop_progress" in the root item is
>>> incorrect.
>>> When the system is remounted and snapshot deletion
>>> resumes, it seems that it tries to resume from the EXTENT_ITEM that
>>> does not exist anymore, and [1] shows that btrfs_lookup_extent_info()
>>> simply does not find the needed extent.
>>> So then I hit a panic in walk_down_tree():
>>> BUG: wc->refs[level - 1] == 0
>>>
>>> I fixed it as follows:
>>> There is a place where btrfs_drop_snapshot() checks if it needs to
>>> detach from the transaction and re-attach. So I moved the exit point
>>> there, and the code is like this:
>>>
>>>         if (btrfs_should_end_transaction(trans, tree_root) ||
>>>             (!for_reloc && btrfs_need_cleaner_sleep(root))) {
>>>                 ret = btrfs_update_root(trans, tree_root,
>>>                                         &root->root_key,
>>>                                         root_item);
>>>                 if (ret) {
>>>                         btrfs_abort_transaction(trans, tree_root, ret);
>>>                         err = ret;
>>>                         goto out_end_trans;
>>>                 }
>>>
>>>                 btrfs_end_transaction_throttle(trans, tree_root);
>>>                 if (!for_reloc && btrfs_need_cleaner_sleep(root)) {
>>>                         err = -EAGAIN;
>>>                         goto out_free;
>>>                 }
>>>                 trans = btrfs_start_transaction(tree_root, 0);
>>>                 ...
>>>
>>> With this fix, I do not hit the panic, and snapshot deletion proceeds
>>> and completes alright after mount.
>>>
>>> Do you agree with my analysis, or am I missing something? It seems
>>> that Josef's btrfs-next still has this issue (as does Chris's
>>> for-linus).
>>
>> Sound analysis and I agree with the fix. The clean-by-one patch has
>> been merged into 3.10 so we need a stable fix for that.
> Thanks for confirming, David!
>
> From more testing, I have two more notes:
>
> # After applying the fix, whenever snapshot deletion is resumed after
> mount and successfully completes, and I then unmount again and rmmod
> btrfs, Linux complains about losing a few "struct extent_buffer"
> objects during kmem_cache_destroy().
> So somewhere on this path:
>         if (btrfs_disk_key_objectid(&root_item->drop_progress) == 0) {
>                 ...
>         } else {
>                 ===> HERE
>
> we perhaps somehow later overwrite the contents of the "struct
> btrfs_path" that is used throughout the function, because at the end
> of the function we always do btrfs_free_path(), which internally does
> btrfs_release_path(). I was not able to determine where the leak
> happens; do you have any hint? No other activity happens in the system
> except the resumed snap deletion, and this problem only happens when
> resuming.
>
I found where the memory leak happens. When we abort snapshot deletion
in the middle, this btrfs_root is basically left hanging in the air: it
is already off the "dead_roots" list, so when del_fs_roots() is called
during unmount, it will not free this root and its root->node (which is
what triggers the memory-leak warning on kmem_cache_destroy()), and
perhaps other things as well. The issue still exists in btrfs-next.
The simplest fix I came up with is the diff at the end of this message.
With this fix, I don't see any memleak warnings (also with LEAK_DEBUG
enabled) while aborting and resuming snapshot deletion.

> # This is for Josef: after I unmount the fs with an ongoing snap
> deletion (after applying my fix) and run the latest btrfsck, it
> complains a lot about problems in the extent tree :( But after I mount
> again, snap deletion resumes and then completes, then I unmount and
> btrfsck is happy again. So it probably does not account for orphan
> roots properly?
>
> David, will you provide a fixed patch, if possible?
>
Josef, David,
I feel that I am not being helpful enough by pinpointing the problem
and suggesting a fix, but not providing an actual patch that fixes it
and can be applied.
The reason is that it is difficult for me to test the fix thoroughly on
the latest upstream kernel (like btrfs-next), for reasons I'm sure you
understand. So I would appreciate it if you could post these two fixes
to the upstream kernel; otherwise, I will try to work on and test them
against the latest kernel myself.

Thanks,
Alex.

> Thanks!
> Alex.
>
>>
>> thanks,
>> david

---

diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
index d275681..52a2c54 100644
--- a/fs/btrfs/extent-tree.c
+++ b/fs/btrfs/extent-tree.c
@@ -7468,6 +7468,7 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
 	int err = 0;
 	int ret;
 	int level;
+	bool root_freed = false;

 	path = btrfs_alloc_path();
 	if (!path) {
@@ -7641,6 +7642,8 @@ int btrfs_drop_snapshot(struct btrfs_root *root,
 		free_extent_buffer(root->commit_root);
 		btrfs_put_fs_root(root);
 	}
+	root_freed = true;
+
 out_end_trans:
 	btrfs_end_transaction_throttle(trans, tree_root);
 out_free:
@@ -7649,6 +7652,18 @@ out_free:
 out:
 	if (err)
 		btrfs_std_error(root->fs_info, err);
+
+	/*
+	 * If the root was not freed for any reason, this means that the FS
+	 * had a problem and will probably be unmounted soon.
+	 * But we need to put the root back into the 'dead_roots' list,
+	 * so that it will be properly freed during unmount.
+	 */
+	if (!root_freed) {
+		WARN_ON(err == 0);
+		btrfs_add_dead_root(root);
+	}
+
 	return err;
 }
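
For anyone picking this up against a newer tree, here is a rough,
untested sketch of how the two fixes discussed above fit together
inside btrfs_drop_snapshot(). It is written against the 3.10-era
fs/btrfs/extent-tree.c that the snippets in this thread come from,
with unrelated code elided as "..."; treat it as an illustration of
where the changes land, not as an applyable patch. "Fix #1" is the
early-exit check moved to the point where drop_progress has just been
persisted via btrfs_update_root(), and "fix #2" is the root_freed /
btrfs_add_dead_root() safety net from the diff above.

int btrfs_drop_snapshot(struct btrfs_root *root,
			struct btrfs_block_rsv *block_rsv, int update_ref,
			int for_reloc)
{
	...
	bool root_freed = false;	/* fix #2: did we really drop the root? */
	...
	while (1) {
		/*
		 * Fix #1: no early exit at the top of the loop. Here the
		 * on-disk drop_progress may not yet cover extents already
		 * freed in this transaction, and resuming from that stale
		 * key later hits "BUG: wc->refs[level - 1] == 0".
		 */
		ret = walk_down_tree(trans, root, path, wc);
		...
		ret = walk_up_tree(trans, root, path, wc, BTRFS_MAX_LEVEL);
		...
		if (btrfs_should_end_transaction(trans, tree_root) ||
		    (!for_reloc && btrfs_need_cleaner_sleep(root))) {
			/* drop_progress in root_item was just updated above */
			ret = btrfs_update_root(trans, tree_root,
						&root->root_key, root_item);
			if (ret) {
				btrfs_abort_transaction(trans, tree_root, ret);
				err = ret;
				goto out_end_trans;
			}

			btrfs_end_transaction_throttle(trans, tree_root);
			/*
			 * Fix #1: exit here instead, now that the recorded
			 * drop_progress is consistent; deletion resumes
			 * cleanly after the next mount.
			 */
			if (!for_reloc && btrfs_need_cleaner_sleep(root)) {
				err = -EAGAIN;
				goto out_free;
			}
			trans = btrfs_start_transaction(tree_root, 0);
			...
		}
	}
	...
	/* only set once the root and its extent buffers were released */
	root_freed = true;

out_end_trans:
	btrfs_end_transaction_throttle(trans, tree_root);
out_free:
	kfree(wc);
	btrfs_free_path(path);
out:
	if (err)
		btrfs_std_error(root->fs_info, err);
	/*
	 * Fix #2: if we bailed out early, the root is no longer on
	 * dead_roots, so del_fs_roots() at unmount will never free it
	 * (leaking root->node and more). Put it back so unmount can
	 * clean it up.
	 */
	if (!root_freed) {
		WARN_ON(err == 0);
		btrfs_add_dead_root(root);
	}
	return err;
}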