From patchwork Tue Jul 12 18:24:21 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Liu Bo X-Patchwork-Id: 9225915 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id BD23760868 for ; Tue, 12 Jul 2016 18:21:38 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id AF27327BFF for ; Tue, 12 Jul 2016 18:21:38 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 9F12527D29; Tue, 12 Jul 2016 18:21:38 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C22A127BFF for ; Tue, 12 Jul 2016 18:21:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751079AbcGLSVe (ORCPT ); Tue, 12 Jul 2016 14:21:34 -0400 Received: from aserp1040.oracle.com ([141.146.126.69]:35910 "EHLO aserp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750780AbcGLSVd (ORCPT ); Tue, 12 Jul 2016 14:21:33 -0400 Received: from aserv0021.oracle.com (aserv0021.oracle.com [141.146.126.233]) by aserp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id u6CILNBp020387 (version=TLSv1 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Tue, 12 Jul 2016 18:21:28 GMT Received: from userv0121.oracle.com (userv0121.oracle.com [156.151.31.72]) by aserv0021.oracle.com (8.13.8/8.13.8) with ESMTP id u6CILMTW006345 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Tue, 12 Jul 2016 18:21:23 GMT Received: from abhmp0005.oracle.com (abhmp0005.oracle.com [141.146.116.11]) by userv0121.oracle.com (8.13.8/8.13.8) with ESMTP id u6CILGfC014720; Tue, 12 Jul 2016 18:21:22 GMT Received: from localhost.us.oracle.com (/10.211.47.181) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Tue, 12 Jul 2016 11:21:16 -0700 From: Liu Bo To: linux-btrfs@vger.kernel.org Cc: David Sterba , Chris Mason , Filipe Manana Subject: [PATCH v3] Btrfs: fix unexpected balance crash due to BUG_ON Date: Tue, 12 Jul 2016 11:24:21 -0700 Message-Id: <1468347861-16930-1-git-send-email-bo.li.liu@oracle.com> X-Mailer: git-send-email 2.5.5 In-Reply-To: <1468257695-22445-1-git-send-email-bo.li.liu@oracle.com> References: <1468257695-22445-1-git-send-email-bo.li.liu@oracle.com> X-Source-IP: aserv0021.oracle.com [141.146.126.233] Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Mounting a btrfs can resume previous balance operations asynchronously. An user got a crash when one drive has some corrupt sectors. Since balance can cancel itself in case of any error, we can gracefully return errors to upper layers and let balance do the cancel job. Reported-by: sash Signed-off-by: Liu Bo Reviewed-by: David Sterba --- v2: - Initialize path with NULL. - Show more information when we bail out. v3: - Fix typo - Use rcu version btrfs_info - Remove __func__ and __LINE__ fs/btrfs/volumes.c | 28 ++++++++++++++++++++++++---- 1 file changed, 24 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 589f128..6040c26 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -3421,7 +3421,7 @@ static int __btrfs_balance(struct btrfs_fs_info *fs_info) u64 size_to_free; u64 chunk_type; struct btrfs_chunk *chunk; - struct btrfs_path *path; + struct btrfs_path *path = NULL; struct btrfs_key key; struct btrfs_key found_key; struct btrfs_trans_handle *trans; @@ -3455,13 +3455,33 @@ static int __btrfs_balance(struct btrfs_fs_info *fs_info) ret = btrfs_shrink_device(device, old_size - size_to_free); if (ret == -ENOSPC) break; - BUG_ON(ret); + if (ret) { + /* btrfs_shrink_device never returns ret > 0 */ + WARN_ON(ret > 0); + goto error; + } trans = btrfs_start_transaction(dev_root, 0); - BUG_ON(IS_ERR(trans)); + if (IS_ERR(trans)) { + ret = PTR_ERR(trans); + btrfs_info_in_rcu(fs_info, + "resize: unable to start transaction after shrinking device %s (error %d), old size %llu, new size %llu", + rcu_str_deref(device->name), ret, + old_size, old_size - size_to_free); + goto error; + } ret = btrfs_grow_device(trans, device, old_size); - BUG_ON(ret); + if (ret) { + btrfs_end_transaction(trans, dev_root); + /* btrfs_grow_device never returns ret > 0 */ + WARN_ON(ret > 0); + btrfs_info_in_rcu(fs_info, + "resize: unable to grow device after shrinking device %s (error %d), old size %llu, new size %llu", + rcu_str_deref(device->name), ret, + old_size, old_size - size_to_free); + goto error; + } btrfs_end_transaction(trans, dev_root); }