From patchwork Mon Jul 11 17:21:35 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Liu Bo X-Patchwork-Id: 9223849 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 39A0960572 for ; Mon, 11 Jul 2016 17:18:58 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 280DA27CDF for ; Mon, 11 Jul 2016 17:18:58 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 1CFD027DE0; Mon, 11 Jul 2016 17:18:58 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00, RCVD_IN_DNSWL_HI, UNPARSEABLE_RELAY autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 536CA27CDF for ; Mon, 11 Jul 2016 17:18:57 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755517AbcGKRSk (ORCPT ); Mon, 11 Jul 2016 13:18:40 -0400 Received: from userp1040.oracle.com ([156.151.31.81]:32204 "EHLO userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753768AbcGKRSi (ORCPT ); Mon, 11 Jul 2016 13:18:38 -0400 Received: from userv0022.oracle.com (userv0022.oracle.com [156.151.31.74]) by userp1040.oracle.com (Sentrion-MTA-4.3.2/Sentrion-MTA-4.3.2) with ESMTP id u6BHIYun027272 (version=TLSv1.2 cipher=ECDHE-RSA-AES256-GCM-SHA384 bits=256 verify=OK); Mon, 11 Jul 2016 17:18:35 GMT Received: from aserv0122.oracle.com (aserv0122.oracle.com [141.146.126.236]) by userv0022.oracle.com (8.14.4/8.13.8) with ESMTP id u6BHIXMi002825 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=OK); Mon, 11 Jul 2016 17:18:34 GMT Received: from abhmp0015.oracle.com (abhmp0015.oracle.com [141.146.116.21]) by aserv0122.oracle.com (8.13.8/8.13.8) with ESMTP id u6BHIVLZ011340; Mon, 11 Jul 2016 17:18:32 GMT Received: from localhost.us.oracle.com (/10.211.47.181) by default (Oracle Beehive Gateway v4.0) with ESMTP ; Mon, 11 Jul 2016 10:18:31 -0700 From: Liu Bo To: linux-btrfs@vger.kernel.org Cc: David Sterba Subject: [PATCH v2] Btrfs: fix unexpected balance crash due to BUG_ON Date: Mon, 11 Jul 2016 10:21:35 -0700 Message-Id: <1468257695-22445-1-git-send-email-bo.li.liu@oracle.com> X-Mailer: git-send-email 2.5.5 In-Reply-To: <1462230062-8053-1-git-send-email-bo.li.liu@oracle.com> References: <1462230062-8053-1-git-send-email-bo.li.liu@oracle.com> X-Source-IP: userv0022.oracle.com [156.151.31.74] Sender: linux-btrfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-btrfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Mounting a btrfs can resume previous balance operations asynchronously. An user got a crash when one drive has some corrupt sectors. Since balance can cancel itself in case of any error, we can gracefully return errors to upper layers and let balance do the cancel job. Reported-by: sash Signed-off-by: Liu Bo Signed-off-by: Chris Mason --- v2: - Initialize path with NULL. - Show more information when we bail out. fs/btrfs/volumes.c | 30 ++++++++++++++++++++++++++---- 1 file changed, 26 insertions(+), 4 deletions(-) diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c index 589f128..348a183 100644 --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -3421,7 +3421,7 @@ static int __btrfs_balance(struct btrfs_fs_info *fs_info) u64 size_to_free; u64 chunk_type; struct btrfs_chunk *chunk; - struct btrfs_path *path; + struct btrfs_path *path = NULL; struct btrfs_key key; struct btrfs_key found_key; struct btrfs_trans_handle *trans; @@ -3455,13 +3455,35 @@ static int __btrfs_balance(struct btrfs_fs_info *fs_info) ret = btrfs_shrink_device(device, old_size - size_to_free); if (ret == -ENOSPC) break; - BUG_ON(ret); + if (ret) { + /* btrfs_shrink_device never returns ret > 0 */ + WARN_ON(ret > 0); + goto error; + } trans = btrfs_start_transaction(dev_root, 0); - BUG_ON(IS_ERR(trans)); + if (IS_ERR(trans)) { + ret = PTR_ERR(trans); + btrfs_info(fs_info, + "%s:%d fails on btrfs_start_transaction() right after shrinking devivce %s (original size is %llu new size is %llu", + __func__, __LINE__, + rcu_str_deref(device->name), old_size, + old_size - size_to_free); + goto error; + } ret = btrfs_grow_device(trans, device, old_size); - BUG_ON(ret); + if (ret) { + btrfs_end_transaction(trans, dev_root); + /* btrfs_grow_device never returns ret > 0 */ + WARN_ON(ret > 0); + btrfs_info(fs_info, + "%s:%d fails on btrfs_grow_device() right after shrinking devivce %s (original size is %llu new size is %llu", + __func__, __LINE__, + rcu_str_deref(device->name), old_size, + old_size - size_to_free); + goto error; + } btrfs_end_transaction(trans, dev_root); }