
Btrfs: fix crash of starting balance

Message ID 1358261277-3566-1-git-send-email-bo.li.liu@oracle.com (mailing list archive)
State New, archived

Commit Message

Liu Bo Jan. 15, 2013, 2:47 p.m. UTC
We will crash on BUG_ON(ret == -EEXIST) when we do not resume the existing
balance but attempt to start a new one.

The steps can be:
1. start balance
2. pause balance
3. start balance

Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
---
 fs/btrfs/volumes.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

Comments

Ilya Dryomov Jan. 15, 2013, 4:59 p.m. UTC | #1
On Tue, Jan 15, 2013 at 10:47:57PM +0800, Liu Bo wrote:
> We will crash on BUG_ON(ret == -EEXIST) when we do not resume the existing
> balance but attempt to start a new one.
> 
> The steps can be:
> 1. start balance
> 2. pause balance
> 3. start balance
> 
> Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> ---
>  fs/btrfs/volumes.c |    7 ++++++-
>  1 files changed, 6 insertions(+), 1 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index 5cce6aa..3901654 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -3100,7 +3100,12 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
>  		goto out;
>  
>  	if (!(bctl->flags & BTRFS_BALANCE_RESUME)) {
> -		BUG_ON(ret == -EEXIST);
> +		/*
> +		 * This can happen when we do not resume the existing balance
> +		 * but try to start a new one instead.
> +		 */
> +		if (ret == -EEXIST)
> +			goto out;
>  		set_balance_control(bctl);
>  	} else {
>  		BUG_ON(ret != -EEXIST);

OK, it seems balance pause/resume logic got broken by dev-replace code
(5ac00addc7ac09110995fe967071d191b5981cc1), which went into v3.8-rc1.
This is most certainly not the right way to fix it, that BUG_ON is there
for a reason.  I'll send a fix in a couple of days.

Thanks,

		Ilya

> -- 
> 1.7.7.6
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liu Bo Jan. 16, 2013, 1:05 a.m. UTC | #2
On Tue, Jan 15, 2013 at 06:59:04PM +0200, Ilya Dryomov wrote:
> On Tue, Jan 15, 2013 at 10:47:57PM +0800, Liu Bo wrote:
> > We will crash on BUG_ON(ret == -EEXIST) when we do not resume the existing
> > balance but attempt to start a new one.
> > 
> > The steps can be:
> > 1. start balance
> > 2. pause balance
> > 3. start balance
> > 
> > Signed-off-by: Liu Bo <bo.li.liu@oracle.com>
> > ---
> >  fs/btrfs/volumes.c |    7 ++++++-
> >  1 files changed, 6 insertions(+), 1 deletions(-)
> > 
> > diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> > index 5cce6aa..3901654 100644
> > --- a/fs/btrfs/volumes.c
> > +++ b/fs/btrfs/volumes.c
> > @@ -3100,7 +3100,12 @@ int btrfs_balance(struct btrfs_balance_control *bctl,
> >  		goto out;
> >  
> >  	if (!(bctl->flags & BTRFS_BALANCE_RESUME)) {
> > -		BUG_ON(ret == -EEXIST);
> > +		/*
> > +		 * This can happen when we do not resume the existing balance
> > +		 * but try to start a new one instead.
> > +		 */
> > +		if (ret == -EEXIST)
> > +			goto out;
> >  		set_balance_control(bctl);
> >  	} else {
> >  		BUG_ON(ret != -EEXIST);
> 
> OK, it seems balance pause/resume logic got broken by dev-replace code
> (5ac00addc7ac09110995fe967071d191b5981cc1), which went into v3.8-rc1.
> This is most certainly not the right way to fix it, that BUG_ON is there
> for a reason.  I'll send a fix in a couple of days.

Okay, right here waiting for test ;)

thanks,
liubo

Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index 5cce6aa..3901654 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -3100,7 +3100,12 @@  int btrfs_balance(struct btrfs_balance_control *bctl,
 		goto out;
 
 	if (!(bctl->flags & BTRFS_BALANCE_RESUME)) {
-		BUG_ON(ret == -EEXIST);
+		/*
+		 * This can happen when we do not resume the existing balance
+		 * but try to start a new one instead.
+		 */
+		if (ret == -EEXIST)
+			goto out;
 		set_balance_control(bctl);
 	} else {
 		BUG_ON(ret != -EEXIST);
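
The branch touched by this patch can be modeled in user space to show what the start, pause, start sequence from the commit message runs into. This is only a sketch, not kernel code: struct model_fs, model_insert_balance_item(), model_balance() and MODEL_BALANCE_RESUME are hypothetical stand-ins for the btrfs structures, the balance item insertion, btrfs_balance() and BTRFS_BALANCE_RESUME. It assumes, as the reported crash implies, that the item of a paused balance is still present when a new balance is started. The -EINVAL return in the resume branch is also an assumption of the model; the kernel code shown above has BUG_ON(ret != -EEXIST) there.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for BTRFS_BALANCE_RESUME. */
#define MODEL_BALANCE_RESUME 0x1u

/* Hypothetical, heavily reduced model of the filesystem state. */
struct model_fs {
	bool item_exists;  /* an item from an earlier balance is present */
	bool control_set;  /* models set_balance_control() having run */
};

/* Models the item insertion: fails with -EEXIST if an item is already there. */
static int model_insert_balance_item(struct model_fs *fs)
{
	assert(fs != NULL);
	if (fs->item_exists)
		return -EEXIST;
	fs->item_exists = true;
	return 0;
}

/* Models the patched branch: return -EEXIST to the caller instead of BUG_ON. */
static int model_balance(struct model_fs *fs, unsigned int flags)
{
	int ret = model_insert_balance_item(fs);

	if (ret && ret != -EEXIST)
		return ret;

	if (!(flags & MODEL_BALANCE_RESUME)) {
		/* New balance requested while an old item exists. */
		if (ret == -EEXIST)
			return ret;
		fs->control_set = true;
	} else {
		/* Resume only makes sense if an item exists (model assumption). */
		if (ret != -EEXIST)
			return -EINVAL;
	}
	return 0;
}
```

In this model, the first model_balance(&fs, 0) returns 0, a second model_balance(&fs, 0) after a pause returns -EEXIST where the unpatched code would hit the BUG_ON, and model_balance(&fs, MODEL_BALANCE_RESUME) returns 0. Whether returning the error is the right fix is exactly what the review above disputes; the sketch only shows the control flow.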