Btrfs: don't bother updating the inode when evicting

Message ID 1355863917-16932-1-git-send-email-jbacik@fusionio.com (mailing list archive)
State New, archived

Commit Message

Josef Bacik Dec. 18, 2012, 8:51 p.m. UTC
We're deleting the stupid thing, no sense in updating the inode for the new
size.  We're running into having 50-100 orphans left over with xfstests 83
because of ENOSPC when trying to start the transaction for the inode update.
This patch fixes this problem.  Thanks,

Signed-off-by: Josef Bacik <jbacik@fusionio.com>
---
 fs/btrfs/inode.c |    6 +-----
 1 files changed, 1 insertions(+), 5 deletions(-)
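
The ENOSPC failure described above comes from reserving metadata space when the transaction is started. A minimal userspace sketch of that difference follows. It is an illustration under assumptions, not kernel code: every toy_* name is hypothetical, and the reservation pool is reduced to a single counter.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the metadata reservation pool. */
struct toy_space_info {
	size_t free_items;	/* metadata items that can still be reserved */
};

/*
 * Models a reserving start such as btrfs_start_transaction_lflush(root, 1):
 * room for num_items is reserved up front, and the call fails with -ENOSPC
 * when the pool cannot cover the request.
 */
int toy_start_transaction(struct toy_space_info *space, size_t num_items)
{
	if (space->free_items < num_items)
		return -ENOSPC;
	space->free_items -= num_items;
	return 0;
}

/*
 * Models btrfs_join_transaction(root): nothing is reserved up front, so a
 * full pool does not make the call fail.
 */
int toy_join_transaction(struct toy_space_info *space)
{
	(void)space;	/* the pool is deliberately not consulted */
	return 0;
}
```

With a pool that has no free items, toy_start_transaction() returns -ENOSPC while toy_join_transaction() still succeeds, which is the behavior the patch relies on.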

Comments

Miao Xie Dec. 19, 2012, 1:58 a.m. UTC | #1
On Tue, 18 Dec 2012 15:51:57 -0500, Josef Bacik wrote:
> We're deleting the stupid thing, no sense in updating the inode for the new
> size.  We're running into having 50-100 orphans left over with xfstests 83
> because of ENOSPC when trying to start the transaction for the inode update.
> This patch fixes this problem.  Thanks,

This patch is wrong: it will introduce inconsistent metadata in the snapshot
tree. The reason is as follows:

commit 8407aa464331556e4f6784f974030b83fc7585ed
Author: Miao Xie <miaox@cn.fujitsu.com>
Date:   Fri Sep 7 01:43:32 2012 -0600

    Btrfs: fix corrupted metadata in the snapshot
    
    When we delete a inode, we will remove all the delayed items including delayed
    inode update, and then truncate all the relative metadata. If there is lots of
    metadata, we will end the current transaction, and start a new transaction to
    truncate the left metadata. In this way, we will leave a inode item that its
    link counter is > 0, and also may leave some directory index items in fs/file tree
    after the current transaction ends. In other words, the metadata in this fs/file tree
    is inconsistent. If we create a snapshot for this tree now, we will find a inode with
    corrupted metadata in the new snapshot, and we won't continue to drop the left metadata,
    because its link counter is not 0.
    
    We fix this problem by updating the inode item before the current transaction ends.
    
    Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>


I will write a new patch to fix the problem you described above.

Thanks
Miao
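
The scenario in the commit message quoted above can be sketched as a small userspace model. This is only an illustration under assumptions: the toy_* types and functions are hypothetical, the tree is reduced to a single inode item, and a "snapshot" is just a copy of that item taken when the transaction ends.

```c
#include <stdbool.h>

/* Hypothetical inode item as stored in the fs/file tree. */
struct toy_inode_item {
	unsigned int nlink;
	unsigned long long size;
};

/* Hypothetical in-memory inode; after unlink its nlink is already 0. */
struct toy_inode {
	unsigned int nlink;
	unsigned long long size;
	struct toy_inode_item on_disk;	/* what the tree currently holds */
};

/* Models btrfs_update_inode(): copy the in-memory fields into the tree item. */
void toy_update_inode(struct toy_inode *inode)
{
	inode->on_disk.nlink = inode->nlink;
	inode->on_disk.size = inode->size;
}

/*
 * One pass of the eviction loop that stopped partway through truncation.
 * The in-memory size shrinks, but the tree item only follows when
 * update_before_end is set. The return value is what a snapshot taken
 * right after the transaction ends would contain.
 */
struct toy_inode_item toy_partial_truncate(struct toy_inode *inode,
					   unsigned long long new_size,
					   bool update_before_end)
{
	inode->size = new_size;
	if (update_before_end)
		toy_update_inode(inode);
	return inode->on_disk;	/* the snapshot copies the tree item */
}

/* The snapshot only keeps dropping leftover metadata when nlink is 0. */
bool toy_snapshot_can_finish_delete(const struct toy_inode_item *item)
{
	return item->nlink == 0;
}
```

Starting from an inode whose in-memory nlink is 0 while the tree item still says 1, calling toy_partial_truncate() with update_before_end set to false yields a snapshot item with nlink 1, so the deletion never completes there. With it set to true the snapshot sees nlink 0.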

> 
> Signed-off-by: Josef Bacik <jbacik@fusionio.com>
> ---
>  fs/btrfs/inode.c |    6 +-----
>  1 files changed, 1 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index f33269a..ac7f471 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -3898,7 +3898,7 @@ void btrfs_evict_inode(struct inode *inode)
>  			goto no_delete;
>  		}
>  
> -		trans = btrfs_start_transaction_lflush(root, 1);
> +		trans = btrfs_join_transaction(root);
>  		if (IS_ERR(trans)) {
>  			btrfs_orphan_del(NULL, inode);
>  			btrfs_free_block_rsv(root, rsv);
> @@ -3911,10 +3911,6 @@ void btrfs_evict_inode(struct inode *inode)
>  		if (ret != -ENOSPC)
>  			break;
>  
> -		trans->block_rsv = &root->fs_info->trans_block_rsv;
> -		ret = btrfs_update_inode(trans, root, inode);
> -		BUG_ON(ret);
> -
>  		btrfs_end_transaction(trans, root);
>  		trans = NULL;
>  		btrfs_btree_balance_dirty(root);
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Josef Bacik Dec. 19, 2012, 3:02 p.m. UTC | #2
On Tue, Dec 18, 2012 at 06:58:33PM -0700, Miao Xie wrote:
> On Tue, 18 Dec 2012 15:51:57 -0500, Josef Bacik wrote:
> > We're deleting the stupid thing, no sense in updating the inode for the new
> > size.  We're running into having 50-100 orphans left over with xfstests 83
> > because of ENOSPC when trying to start the transaction for the inode update.
> > This patch fixes this problem.  Thanks,
> 
> This patch is wrong: it will introduce inconsistent metadata in the snapshot
> tree. The reason is as follows:
> 
> commit 8407aa464331556e4f6784f974030b83fc7585ed
> Author: Miao Xie <miaox@cn.fujitsu.com>
> Date:   Fri Sep 7 01:43:32 2012 -0600
> 
>     Btrfs: fix corrupted metadata in the snapshot
>     
>     When we delete a inode, we will remove all the delayed items including delayed
>     inode update, and then truncate all the relative metadata. If there is lots of
>     metadata, we will end the current transaction, and start a new transaction to
>     truncate the left metadata. In this way, we will leave a inode item that its
>     link counter is > 0, and also may leave some directory index items in fs/file tree
>     after the current transaction ends. In other words, the metadata in this fs/file tree
>     is inconsistent. If we create a snapshot for this tree now, we will find a inode with
>     corrupted metadata in the new snapshot, and we won't continue to drop the left metadata,
>     because its link counter is not 0.
>     
>     We fix this problem by updating the inode item before the current transaction ends.
>     
>     Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
> 

So why don't we fix unlink to call btrfs_update_inode_item so that the nlink
counter is set to 0?  The orphan item will be carried over into the snapshot if
we don't actually evict the inode before we do the snapshot and then the orphan
cleanup will take care of the rest?  Thanks,

Josef
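
The alternative proposed above (persist nlink = 0 at unlink time and let orphan cleanup finish the job in the snapshot) can be sketched as follows. This is a rough userspace model under assumptions, not the kernel's orphan handling: the toy_* names are hypothetical, and a snapshot is modeled as a plain copy of the items.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-inode state stored in a tree. */
struct toy_tree_inode {
	unsigned int nlink;
	bool has_orphan_item;	/* an orphan item references this inode */
	bool deleted;		/* metadata has been fully dropped */
};

/* Proposed unlink: write nlink = 0 to the tree right away, add the orphan item. */
void toy_unlink(struct toy_tree_inode *item)
{
	item->nlink = 0;
	item->has_orphan_item = true;
}

/*
 * Orphan cleanup over a tree (for example a fresh snapshot). Inodes that
 * carry an orphan item and have nlink == 0 are dropped. Returns how many
 * inodes were removed.
 */
size_t toy_orphan_cleanup(struct toy_tree_inode *items, size_t count)
{
	size_t removed = 0;

	for (size_t i = 0; i < count; i++) {
		if (items[i].deleted || !items[i].has_orphan_item ||
		    items[i].nlink != 0)
			continue;
		items[i].deleted = true;
		items[i].has_orphan_item = false;
		removed++;
	}
	return removed;
}
```

If an unlinked inode is copied into a snapshot before eviction runs, the copy carries both nlink == 0 and the orphan item, so toy_orphan_cleanup() on the snapshot removes it. The model says nothing about the extra cost of updating the inode item during unlink.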
Miao Xie Dec. 20, 2012, 3:04 a.m. UTC | #3
On Wed, 19 Dec 2012 10:02:59 -0500, Josef Bacik wrote:
> On Tue, Dec 18, 2012 at 06:58:33PM -0700, Miao Xie wrote:
>> On Tue, 18 Dec 2012 15:51:57 -0500, Josef Bacik wrote:
>>> We're deleting the stupid thing, no sense in updating the inode for the new
>>> size.  We're running into having 50-100 orphans left over with xfstests 83
>>> because of ENOSPC when trying to start the transaction for the inode update.
>>> This patch fixes this problem.  Thanks,
>>
>> This patch is wrong: it will introduce inconsistent metadata in the snapshot
>> tree. The reason is as follows:
>>
>> commit 8407aa464331556e4f6784f974030b83fc7585ed
>> Author: Miao Xie <miaox@cn.fujitsu.com>
>> Date:   Fri Sep 7 01:43:32 2012 -0600
>>
>>     Btrfs: fix corrupted metadata in the snapshot
>>     
>>     When we delete a inode, we will remove all the delayed items including delayed
>>     inode update, and then truncate all the relative metadata. If there is lots of
>>     metadata, we will end the current transaction, and start a new transaction to
>>     truncate the left metadata. In this way, we will leave a inode item that its
>>     link counter is > 0, and also may leave some directory index items in fs/file tree
>>     after the current transaction ends. In other words, the metadata in this fs/file tree
>>     is inconsistent. If we create a snapshot for this tree now, we will find a inode with
>>     corrupted metadata in the new snapshot, and we won't continue to drop the left metadata,
>>     because its link counter is not 0.
>>     
>>     We fix this problem by updating the inode item before the current transaction ends.
>>     
>>     Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
>>
> 
> So why don't we fix unlink to call btrfs_update_inode_item so that the nlink
> counter is set to 0?  The orphan item will be carried over into the snapshot if
> we don't actually evict the inode before we do the snapshot and then the orphan
> cleanup will take care of the rest?  Thanks,

But that would hurt file deletion performance.

Thanks
Miao

> 
> Josef
> 


Patch

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index f33269a..ac7f471 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -3898,7 +3898,7 @@  void btrfs_evict_inode(struct inode *inode)
 			goto no_delete;
 		}
 
-		trans = btrfs_start_transaction_lflush(root, 1);
+		trans = btrfs_join_transaction(root);
 		if (IS_ERR(trans)) {
 			btrfs_orphan_del(NULL, inode);
 			btrfs_free_block_rsv(root, rsv);
@@ -3911,10 +3911,6 @@  void btrfs_evict_inode(struct inode *inode)
 		if (ret != -ENOSPC)
 			break;
 
-		trans->block_rsv = &root->fs_info->trans_block_rsv;
-		ret = btrfs_update_inode(trans, root, inode);
-		BUG_ON(ret);
-
 		btrfs_end_transaction(trans, root);
 		trans = NULL;
 		btrfs_btree_balance_dirty(root);