diff mbox

kernel BUG at fs/btrfs/delayed-inode.c:1301!

Message ID 4DFF1CFA.7060201@cn.fujitsu.com (mailing list archive)
State New, archived
Headers show

Commit Message

Miao Xie June 20, 2011, 10:12 a.m. UTC
Hi, Jim

Could you test the attached patch for me? 
I have done some quick tests, it worked well. But I'm not sure if it can fix
the bug you reported or not, so I need your help!

Thanks
Miao

On fri, 17 Jun 2011 10:10:31 -0600, Jim Schutt wrote:
> Hi,
> 
> I've hit this delayed-inode BUG several times.  I'm using btrfs
> as the data store for Ceph OSDs, and testing a heavy write load.
> The kernel I'm running is a recent commit (f8f44f09eaa) from
> Linus' tree with the for-chris branch (commit ed0ca14021e5) of
>   git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git
> merged in.
> 
> Please let me know what I can do to help resolve this.
> 
> [ 5447.554187] err add delayed dir index item(name: pglog_0.965_0) into the insertion tree of the delayed node(root id: 262, inode id: 258, errno: -17)
> [ 5447.569766] ------------[ cut here ]------------
> [ 5447.575361] kernel BUG at fs/btrfs/delayed-inode.c:1301!
> [ 5447.580672] invalid opcode: 0000 [#1] SMP
> [ 5447.584806] CPU 2
> [ 5447.586646] Modules linked in: loop btrfs zlib_deflate lzo_compress ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp i2c_dev i2c_core ext3 jbd scsi_transport_iscsi rds ib_ipoib rdma_ucm rdma_cm ib_ucm ib_uverbs ib_umad ib_cm iw_cm ib_addr ipv6 ib_sa dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod video sbs sbshc pci_slot battery acpi_pad ac kvm sg ses enclosure sd_mod megaraid_sas ide_cd_mod cdrom qla2xxx ib_mthca scsi_transport_fc ib_mad scsi_tgt ib_core button serio_raw ata_piix i5k_amb libata hwmon i5000_edac scsi_mod tpm_tis edac_core ehci_hcd pcspkr iTCO_wdt tpm dcdbas tpm_bios iTCO_vendor_support uhci_hcd rtc nfs nfs_acl auth_rpcgss fscache lockd sunrpc tg3 bnx2 e1000 [last unloaded: freq_table]
> [ 5447.660248]
> [ 5447.661744] Pid: 7622, comm: cosd Not tainted 3.0.0-rc3-00178-gbfc8ccb #34 Dell Inc. PowerEdge 1950/0DT097
> [ 5447.671421] RIP: 0010:[<ffffffffa068a7fc>]  [<ffffffffa068a7fc>] btrfs_insert_delayed_dir_index+0x124/0x14c [btrfs]
> [ 5447.681922] RSP: 0018:ffff88021c0edaf8  EFLAGS: 00010292
> [ 5447.687351] RAX: 000000000000009e RBX: ffff880085bf0480 RCX: 0000000000012e0f
> [ 5447.694487] RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffffffff819aed98
> [ 5447.701631] RBP: ffff88021c0edb48 R08: ffff88021c0ed908 R09: ffffffff8189ef98
> [ 5447.708783] R10: 0000000000000000 R11: 0000000000000006 R12: 00000000ffffffef
> [ 5447.715923] R13: ffff880072e64240 R14: ffff880072e64288 R15: 000000000000000d
> [ 5447.723065] FS:  00007fefc66a9940(0000) GS:ffff88022fc80000(0000) knlGS:0000000000000000
> [ 5447.731178] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [ 5447.736934] CR2: 00000000004042b0 CR3: 00000001ca4ef000 CR4: 00000000000006e0
> [ 5447.744087] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 5447.751218] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 5447.758343] Process cosd (pid: 7622, threadinfo ffff88021c0ec000, task ffff8801d92996b0)
> [ 5447.766422] Stack:
> [ 5447.768429]  00000001d2da3700 ffff88021c0edb88 ffff8802238d50f8 ffff880225627000
> [ 5447.775866]  ffff8801e32f5e58 ffff8801d2da3700 0000000000000000 ffff8801d9aec510
> [ 5447.783291]  000000000000000d ffff8802238d50f8 ffff88021c0edbf8 ffffffffa0641c4e
> [ 5447.790721] Call Trace:
> [ 5447.793191]  [<ffffffffa0641c4e>] btrfs_insert_dir_item+0x189/0x1bb [btrfs]
> [ 5447.800156]  [<ffffffffa0651a45>] btrfs_add_link+0x12b/0x191 [btrfs]
> [ 5447.806517]  [<ffffffffa0651adc>] btrfs_add_nondir+0x31/0x58 [btrfs]
> [ 5447.812876]  [<ffffffffa0651d6a>] btrfs_create+0xf9/0x197 [btrfs]
> [ 5447.818961]  [<ffffffff8111f840>] vfs_create+0x72/0x92
> [ 5447.824090]  [<ffffffff8111fa8c>] do_last+0x22c/0x40b
> [ 5447.829133]  [<ffffffff8112076a>] path_openat+0xc0/0x2ef
> [ 5447.834438]  [<ffffffff810c58e2>] ? __perf_event_task_sched_out+0x24/0x44
> [ 5447.841216]  [<ffffffff8103ecdd>] ? perf_event_task_sched_out+0x59/0x67
> [ 5447.847846]  [<ffffffff81121a79>] do_filp_open+0x3d/0x87
> [ 5447.853156]  [<ffffffff811e126c>] ? strncpy_from_user+0x43/0x4d
> [ 5447.859072]  [<ffffffff8111f1f5>] ? getname_flags+0x2e/0x80
> [ 5447.864636]  [<ffffffff8111f179>] ? do_getname+0x14b/0x173
> [ 5447.870112]  [<ffffffff8111f1b7>] ? audit_getname+0x16/0x26
> [ 5447.875682]  [<ffffffff8112b1ab>] ? spin_lock+0xe/0x10
> [ 5447.880882]  [<ffffffff81112d39>] do_sys_open+0x69/0xae
> [ 5447.886153]  [<ffffffff81112db1>] sys_open+0x20/0x22
> [ 5447.891114]  [<ffffffff813b9aab>] system_call_fastpath+0x16/0x1b
> [ 5447.897124] Code: 85 c0 41 89 c4 74 28 49 8b 45 10 49 8b 4d 00 45 89 e0 48 8b 75 c0 48 c7 c7 bb 43 69 a0 48 8b 90 e8 02 00 00 31 c0 e8 ce fb 9b e0 <0f> 0b eb fe 4c 89 f7 e8 81 7b d2 e0 4c 89 ef e8 d4 e4 ff ff 48
> [ 5447.916562] RIP  [<ffffffffa068a7fc>] btrfs_insert_delayed_dir_index+0x124/0x14c [btrfs]
> [ 5447.924683]  RSP <ffff88021c0edaf8>
> [ 5447.928514] ---[ end trace 461a7f9887994fe0 ]---
> 
> -- Jim
> 
> 
> -- 
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>
From 457f39393b2e3d475fbba029b90b6a4e17b94d43 Mon Sep 17 00:00:00 2001
From: Miao Xie <miaox@cn.fujitsu.com>
Date: Mon, 20 Jun 2011 17:21:51 +0800
Subject: [PATCH] btrfs: fix inconsonant inode information

When iputting the inode, We may leave the delayed nodes if they have some
delayed items that have not been dealt with. So when the inode is read again,
we must look up the relative delayed node, and use the information in it to
initialize the inode. Or we will get inconsonant inode information.

Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
---
 fs/btrfs/delayed-inode.c |  104 +++++++++++++++++++++++++++++++++++-----------
 fs/btrfs/delayed-inode.h |    1 +
 fs/btrfs/inode.c         |   12 ++++-
 3 files changed, 91 insertions(+), 26 deletions(-)

Comments

Jim Schutt June 20, 2011, 10:30 p.m. UTC | #1
Hi Miao,

Miao Xie wrote:
> Hi, Jim
> 
> Could you test the attached patch for me? 
> I have done some quick tests, it worked well. But I'm not sure if it can fix
> the bug you reported or not, so I need your help!

So far I haven't been able to reproduce with your patch
applied.  I'd like to test for a few more days, though,
before calling it good.

Thanks for the patch -- I'll let you know what more
testing brings.

-- Jim

> 
> Thanks
> Miao
> 
> On fri, 17 Jun 2011 10:10:31 -0600, Jim Schutt wrote:
>> Hi,
>>
>> I've hit this delayed-inode BUG several times.  I'm using btrfs
>> as the data store for Ceph OSDs, and testing a heavy write load.
>> The kernel I'm running is a recent commit (f8f44f09eaa) from
>> Linus' tree with the for-chris branch (commit ed0ca14021e5) of
>>   git://git.kernel.org/pub/scm/linux/kernel/git/josef/btrfs-work.git
>> merged in.
>>
>> Please let me know what I can do to help resolve this.
>>
>> [ 5447.554187] err add delayed dir index item(name: pglog_0.965_0) into the insertion tree of the delayed node(root id: 262, inode id: 258, errno: -17)
>> [ 5447.569766] ------------[ cut here ]------------
>> [ 5447.575361] kernel BUG at fs/btrfs/delayed-inode.c:1301!
>> [ 5447.580672] invalid opcode: 0000 [#1] SMP
>> [ 5447.584806] CPU 2
>> [ 5447.586646] Modules linked in: loop btrfs zlib_deflate lzo_compress ipt_MASQUERADE iptable_nat nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_state nf_conntrack ipt_REJECT xt_tcpudp iptable_filter ip_tables x_tables bridge stp i2c_dev i2c_core ext3 jbd scsi_transport_iscsi rds ib_ipoib rdma_ucm rdma_cm ib_ucm ib_uverbs ib_umad ib_cm iw_cm ib_addr ipv6 ib_sa dm_mirror dm_region_hash dm_log dm_multipath scsi_dh dm_mod video sbs sbshc pci_slot battery acpi_pad ac kvm sg ses enclosure sd_mod megaraid_sas ide_cd_mod cdrom qla2xxx ib_mthca scsi_transport_fc ib_mad scsi_tgt ib_core button serio_raw ata_piix i5k_amb libata hwmon i5000_edac scsi_mod tpm_tis edac_core ehci_hcd pcspkr iTCO_wdt tpm dcdbas tpm_bios iTCO_vendor_support uhci_hcd rtc nfs nfs_acl auth_rpcgss fscache lockd sunrpc tg3 bnx2 e1000 [last unloaded: freq_table]
>> [ 5447.660248]
>> [ 5447.661744] Pid: 7622, comm: cosd Not tainted 3.0.0-rc3-00178-gbfc8ccb #34 Dell Inc. PowerEdge 1950/0DT097
>> [ 5447.671421] RIP: 0010:[<ffffffffa068a7fc>]  [<ffffffffa068a7fc>] btrfs_insert_delayed_dir_index+0x124/0x14c [btrfs]
>> [ 5447.681922] RSP: 0018:ffff88021c0edaf8  EFLAGS: 00010292
>> [ 5447.687351] RAX: 000000000000009e RBX: ffff880085bf0480 RCX: 0000000000012e0f
>> [ 5447.694487] RDX: 0000000000000000 RSI: 0000000000000046 RDI: ffffffff819aed98
>> [ 5447.701631] RBP: ffff88021c0edb48 R08: ffff88021c0ed908 R09: ffffffff8189ef98
>> [ 5447.708783] R10: 0000000000000000 R11: 0000000000000006 R12: 00000000ffffffef
>> [ 5447.715923] R13: ffff880072e64240 R14: ffff880072e64288 R15: 000000000000000d
>> [ 5447.723065] FS:  00007fefc66a9940(0000) GS:ffff88022fc80000(0000) knlGS:0000000000000000
>> [ 5447.731178] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [ 5447.736934] CR2: 00000000004042b0 CR3: 00000001ca4ef000 CR4: 00000000000006e0
>> [ 5447.744087] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> [ 5447.751218] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
>> [ 5447.758343] Process cosd (pid: 7622, threadinfo ffff88021c0ec000, task ffff8801d92996b0)
>> [ 5447.766422] Stack:
>> [ 5447.768429]  00000001d2da3700 ffff88021c0edb88 ffff8802238d50f8 ffff880225627000
>> [ 5447.775866]  ffff8801e32f5e58 ffff8801d2da3700 0000000000000000 ffff8801d9aec510
>> [ 5447.783291]  000000000000000d ffff8802238d50f8 ffff88021c0edbf8 ffffffffa0641c4e
>> [ 5447.790721] Call Trace:
>> [ 5447.793191]  [<ffffffffa0641c4e>] btrfs_insert_dir_item+0x189/0x1bb [btrfs]
>> [ 5447.800156]  [<ffffffffa0651a45>] btrfs_add_link+0x12b/0x191 [btrfs]
>> [ 5447.806517]  [<ffffffffa0651adc>] btrfs_add_nondir+0x31/0x58 [btrfs]
>> [ 5447.812876]  [<ffffffffa0651d6a>] btrfs_create+0xf9/0x197 [btrfs]
>> [ 5447.818961]  [<ffffffff8111f840>] vfs_create+0x72/0x92
>> [ 5447.824090]  [<ffffffff8111fa8c>] do_last+0x22c/0x40b
>> [ 5447.829133]  [<ffffffff8112076a>] path_openat+0xc0/0x2ef
>> [ 5447.834438]  [<ffffffff810c58e2>] ? __perf_event_task_sched_out+0x24/0x44
>> [ 5447.841216]  [<ffffffff8103ecdd>] ? perf_event_task_sched_out+0x59/0x67
>> [ 5447.847846]  [<ffffffff81121a79>] do_filp_open+0x3d/0x87
>> [ 5447.853156]  [<ffffffff811e126c>] ? strncpy_from_user+0x43/0x4d
>> [ 5447.859072]  [<ffffffff8111f1f5>] ? getname_flags+0x2e/0x80
>> [ 5447.864636]  [<ffffffff8111f179>] ? do_getname+0x14b/0x173
>> [ 5447.870112]  [<ffffffff8111f1b7>] ? audit_getname+0x16/0x26
>> [ 5447.875682]  [<ffffffff8112b1ab>] ? spin_lock+0xe/0x10
>> [ 5447.880882]  [<ffffffff81112d39>] do_sys_open+0x69/0xae
>> [ 5447.886153]  [<ffffffff81112db1>] sys_open+0x20/0x22
>> [ 5447.891114]  [<ffffffff813b9aab>] system_call_fastpath+0x16/0x1b
>> [ 5447.897124] Code: 85 c0 41 89 c4 74 28 49 8b 45 10 49 8b 4d 00 45 89 e0 48 8b 75 c0 48 c7 c7 bb 43 69 a0 48 8b 90 e8 02 00 00 31 c0 e8 ce fb 9b e0 <0f> 0b eb fe 4c 89 f7 e8 81 7b d2 e0 4c 89 ef e8 d4 e4 ff ff 48
>> [ 5447.916562] RIP  [<ffffffffa068a7fc>] btrfs_insert_delayed_dir_index+0x124/0x14c [btrfs]
>> [ 5447.924683]  RSP <ffff88021c0edaf8>
>> [ 5447.928514] ---[ end trace 461a7f9887994fe0 ]---
>>
>> -- Jim
>>
>>
>> -- 
>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>> the body of a message to majordomo@vger.kernel.org
>> More majordomo info at  http://vger.kernel.org/majordomo-info.html
>>
> 


--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
David Sterba June 21, 2011, 12:08 a.m. UTC | #2
On Mon, Jun 20, 2011 at 06:12:10PM +0800, Miao Xie wrote:
> >From 457f39393b2e3d475fbba029b90b6a4e17b94d43 Mon Sep 17 00:00:00 2001
> From: Miao Xie <miaox@cn.fujitsu.com>
> Date: Mon, 20 Jun 2011 17:21:51 +0800
> Subject: [PATCH] btrfs: fix inconsonant inode information
> 
> When iputting the inode, We may leave the delayed nodes if they have some
> delayed items that have not been dealt with. So when the inode is read again,
> we must look up the relative delayed node, and use the information in it to
> initialize the inode. Or we will get inconsonant inode information.
> 
> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
> ---
>  fs/btrfs/delayed-inode.c |  104 +++++++++++++++++++++++++++++++++++-----------
>  fs/btrfs/delayed-inode.h |    1 +
>  fs/btrfs/inode.c         |   12 ++++-
>  3 files changed, 91 insertions(+), 26 deletions(-)
> 
> diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
> index f1cbd02..280755e 100644
> --- a/fs/btrfs/delayed-inode.c
> +++ b/fs/btrfs/delayed-inode.c
> @@ -82,19 +82,16 @@ static inline struct btrfs_delayed_root *btrfs_get_delayed_root(
>  	return root->fs_info->delayed_root;
>  }
>  
> -static struct btrfs_delayed_node *btrfs_get_or_create_delayed_node(
> -							struct inode *inode)
> +static struct btrfs_delayed_node *btrfs_get_delayed_node(struct inode *inode)
>  {
> -	struct btrfs_delayed_node *node;
>  	struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
>  	struct btrfs_root *root = btrfs_inode->root;
>  	u64 ino = btrfs_ino(inode);
> -	int ret;
> +	struct btrfs_delayed_node *node;
>  
> -again:
>  	node = ACCESS_ONCE(btrfs_inode->delayed_node);

do you still need the volatile access here, after the again: label has
been removed? it does not break things if it's there, but it raises
questions ...

>  	if (node) {
> -		atomic_inc(&node->refs);	/* can be accessed */
> +		atomic_inc(&node->refs);
>  		return node;
>  	}
>  
> @@ -103,7 +100,9 @@ again:
>  	if (node) {
>  		if (btrfs_inode->delayed_node) {
>  			spin_unlock(&root->inode_lock);
> -			goto again;
> +			BUG_ON(btrfs_inode->delayed_node != node);
> +			atomic_inc(&node->refs);	/* can be accessed */
> +			return node;
>  		}
>  		btrfs_inode->delayed_node = node;
>  		atomic_inc(&node->refs);	/* can be accessed */
> @@ -113,6 +112,23 @@ again:
>  	}
>  	spin_unlock(&root->inode_lock);
>  
> +	return NULL;
> +}
> +
> +static struct btrfs_delayed_node *btrfs_get_or_create_delayed_node(
> +							struct inode *inode)
> +{
> +	struct btrfs_delayed_node *node;
> +	struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
> +	struct btrfs_root *root = btrfs_inode->root;
> +	u64 ino = btrfs_ino(inode);
> +	int ret;
> +
> +again:
> +	node = btrfs_get_delayed_node(inode);

... aha, it's been somehow moved here, which copies the original logic.
Now reading inode->delayed_node is inside a function and I do not think
that compiler could optimize reading value of btrfs_inode->delayed_node
that it would require ACCESS_ONCE.

And there is another ACCESS_ONCE in btrfs_remove_delayed_node. I wonder
what's the reason for that. Sorry to abuse this thread, but I'd like to
be sure about protection of the ->delayed_node members inside
btrfs_inode. Can you please comment on that?

If you recall one message from the original report:

[ 5447.554187] err add delayed dir index item(name: pglog_0.965_0)
into the insertion tree of the delayed node(root id:
262, inode id: 258, errno: -17)

(-17 == -EEXIST)

a printk after return from __btrfs_add_delayed_item (which is
able to return -EEXIST) in btrfs_insert_delayed_dir_index. I haven't
looked farther, but it seems that the item is being inserted (at least)
twice and I suspect missing locking or other type of protection.


thanks,
david

> +	if (node)
> +		return node;
> +
>  	node = kmem_cache_alloc(delayed_node_cache, GFP_NOFS);
>  	if (!node)
>  		return ERR_PTR(-ENOMEM);
> @@ -548,19 +564,6 @@ struct btrfs_delayed_item *__btrfs_next_delayed_item(
>  	return next;
>  }
>  
> -static inline struct btrfs_delayed_node *btrfs_get_delayed_node(
> -							struct inode *inode)
> -{
> -	struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
> -	struct btrfs_delayed_node *delayed_node;
> -
> -	delayed_node = btrfs_inode->delayed_node;
> -	if (delayed_node)
> -		atomic_inc(&delayed_node->refs);
> -
> -	return delayed_node;
> -}
> -
>  static inline struct btrfs_root *btrfs_get_fs_root(struct btrfs_root *root,
>  						   u64 root_id)
>  {
> @@ -1404,8 +1407,7 @@ end:
>  
>  int btrfs_inode_delayed_dir_index_count(struct inode *inode)
>  {
> -	struct btrfs_delayed_node *delayed_node = BTRFS_I(inode)->delayed_node;
> -	int ret = 0;
> +	struct btrfs_delayed_node *delayed_node = btrfs_get_delayed_node(inode);
>  
>  	if (!delayed_node)
>  		return -ENOENT;
> @@ -1415,11 +1417,14 @@ int btrfs_inode_delayed_dir_index_count(struct inode *inode)
>  	 * a new directory index is added into the delayed node and index_cnt
>  	 * is updated now. So we needn't lock the delayed node.
>  	 */
> -	if (!delayed_node->index_cnt)
> +	if (!delayed_node->index_cnt) {
> +		btrfs_release_delayed_node(delayed_node);
>  		return -EINVAL;
> +	}
>  
>  	BTRFS_I(inode)->index_cnt = delayed_node->index_cnt;
> -	return ret;
> +	btrfs_release_delayed_node(delayed_node);
> +	return 0;
>  }
>  
>  void btrfs_get_delayed_items(struct inode *inode, struct list_head *ins_list,
> @@ -1613,6 +1618,57 @@ static void fill_stack_inode_item(struct btrfs_trans_handle *trans,
>  				      inode->i_ctime.tv_nsec);
>  }
>  
> +int btrfs_fill_inode(struct inode *inode, u32 *rdev)
> +{
> +	struct btrfs_delayed_node *delayed_node;
> +	struct btrfs_inode_item *inode_item;
> +	struct btrfs_timespec *tspec;
> +
> +	delayed_node = btrfs_get_delayed_node(inode);
> +	if (!delayed_node)
> +		return -ENOENT;
> +
> +	mutex_lock(&delayed_node->mutex);
> +	if (!delayed_node->inode_dirty) {
> +		mutex_unlock(&delayed_node->mutex);
> +		btrfs_release_delayed_node(delayed_node);
> +		return -ENOENT;
> +	}
> +
> +	inode_item = &delayed_node->inode_item;
> +
> +	inode->i_uid = btrfs_stack_inode_uid(inode_item);
> +	inode->i_gid = btrfs_stack_inode_gid(inode_item);
> +	btrfs_i_size_write(inode, btrfs_stack_inode_size(inode_item));
> +	inode->i_mode = btrfs_stack_inode_mode(inode_item);
> +	inode->i_nlink = btrfs_stack_inode_nlink(inode_item);
> +	inode_set_bytes(inode, btrfs_stack_inode_nbytes(inode_item));
> +	BTRFS_I(inode)->generation = btrfs_stack_inode_generation(inode_item);
> +	BTRFS_I(inode)->sequence = btrfs_stack_inode_sequence(inode_item);
> +	inode->i_rdev = 0;
> +	*rdev = btrfs_stack_inode_rdev(inode_item);
> +	BTRFS_I(inode)->flags = btrfs_stack_inode_flags(inode_item);
> +
> +	tspec = btrfs_inode_atime(inode_item);
> +	inode->i_atime.tv_sec = btrfs_stack_timespec_sec(tspec);
> +	inode->i_atime.tv_nsec = btrfs_stack_timespec_nsec(tspec);
> +
> +	tspec = btrfs_inode_mtime(inode_item);
> +	inode->i_mtime.tv_sec = btrfs_stack_timespec_sec(tspec);
> +	inode->i_mtime.tv_nsec = btrfs_stack_timespec_nsec(tspec);
> +
> +	tspec = btrfs_inode_ctime(inode_item);
> +	inode->i_ctime.tv_sec = btrfs_stack_timespec_sec(tspec);
> +	inode->i_ctime.tv_nsec = btrfs_stack_timespec_nsec(tspec);
> +
> +	inode->i_generation = BTRFS_I(inode)->generation;
> +	BTRFS_I(inode)->index_cnt = (u64)-1;
> +
> +	mutex_unlock(&delayed_node->mutex);
> +	btrfs_release_delayed_node(delayed_node);
> +	return 0;
> +}
> +
>  int btrfs_delayed_update_inode(struct btrfs_trans_handle *trans,
>  			       struct btrfs_root *root, struct inode *inode)
>  {
> diff --git a/fs/btrfs/delayed-inode.h b/fs/btrfs/delayed-inode.h
> index d1a6a29..8d27af4 100644
> --- a/fs/btrfs/delayed-inode.h
> +++ b/fs/btrfs/delayed-inode.h
> @@ -119,6 +119,7 @@ void btrfs_kill_delayed_inode_items(struct inode *inode);
>  
>  int btrfs_delayed_update_inode(struct btrfs_trans_handle *trans,
>  			       struct btrfs_root *root, struct inode *inode);
> +int btrfs_fill_inode(struct inode *inode, u32 *rdev);
>  
>  /* Used for drop dead root */
>  void btrfs_kill_all_delayed_nodes(struct btrfs_root *root);
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 5813dec..1d25a04 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -2509,6 +2509,11 @@ static void btrfs_read_locked_inode(struct inode *inode)
>  	int maybe_acls;
>  	u32 rdev;
>  	int ret;
> +	bool filled = false;
> +
> +	ret = btrfs_fill_inode(inode, &rdev);
> +	if (!ret)
> +		filled = true;
>  
>  	path = btrfs_alloc_path();
>  	BUG_ON(!path);
> @@ -2520,6 +2525,10 @@ static void btrfs_read_locked_inode(struct inode *inode)
>  		goto make_bad;
>  
>  	leaf = path->nodes[0];
> +
> +	if (filled)
> +		goto cache_acl;
> +
>  	inode_item = btrfs_item_ptr(leaf, path->slots[0],
>  				    struct btrfs_inode_item);
>  	if (!leaf->map_token)
> @@ -2556,7 +2565,7 @@ static void btrfs_read_locked_inode(struct inode *inode)
>  
>  	BTRFS_I(inode)->index_cnt = (u64)-1;
>  	BTRFS_I(inode)->flags = btrfs_inode_flags(leaf, inode_item);
> -
> +cache_acl:
>  	/*
>  	 * try to precache a NULL acl entry for files that don't have
>  	 * any xattrs or acls
> @@ -2572,7 +2581,6 @@ static void btrfs_read_locked_inode(struct inode *inode)
>  	}
>  
>  	btrfs_free_path(path);
> -	inode_item = NULL;
>  
>  	switch (inode->i_mode & S_IFMT) {
>  	case S_IFREG:
> -- 
> 1.7.4
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Miao Xie June 22, 2011, 4:35 a.m. UTC | #3
On Tue, 21 Jun 2011 02:08:54 +0200, David Sterba wrote:
> On Mon, Jun 20, 2011 at 06:12:10PM +0800, Miao Xie wrote:
>> >From 457f39393b2e3d475fbba029b90b6a4e17b94d43 Mon Sep 17 00:00:00 2001
>> From: Miao Xie <miaox@cn.fujitsu.com>
>> Date: Mon, 20 Jun 2011 17:21:51 +0800
>> Subject: [PATCH] btrfs: fix inconsonant inode information
>>
>> When iputting the inode, We may leave the delayed nodes if they have some
>> delayed items that have not been dealt with. So when the inode is read again,
>> we must look up the relative delayed node, and use the information in it to
>> initialize the inode. Or we will get inconsonant inode information.
>>
>> Signed-off-by: Miao Xie <miaox@cn.fujitsu.com>
>> ---
>>  fs/btrfs/delayed-inode.c |  104 +++++++++++++++++++++++++++++++++++-----------
>>  fs/btrfs/delayed-inode.h |    1 +
>>  fs/btrfs/inode.c         |   12 ++++-
>>  3 files changed, 91 insertions(+), 26 deletions(-)
>>
>> diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
>> index f1cbd02..280755e 100644
>> --- a/fs/btrfs/delayed-inode.c
>> +++ b/fs/btrfs/delayed-inode.c
>> @@ -82,19 +82,16 @@ static inline struct btrfs_delayed_root *btrfs_get_delayed_root(
>>  	return root->fs_info->delayed_root;
>>  }
>>  
>> -static struct btrfs_delayed_node *btrfs_get_or_create_delayed_node(
>> -							struct inode *inode)
>> +static struct btrfs_delayed_node *btrfs_get_delayed_node(struct inode *inode)
>>  {
>> -	struct btrfs_delayed_node *node;
>>  	struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
>>  	struct btrfs_root *root = btrfs_inode->root;
>>  	u64 ino = btrfs_ino(inode);
>> -	int ret;
>> +	struct btrfs_delayed_node *node;
>>  
>> -again:
>>  	node = ACCESS_ONCE(btrfs_inode->delayed_node);
> 
> do you still need the volatile access here, after the again: label has
> been removed? it does not break things if it's there, but it raises
> questions ...

The rule of ACCESS_ONCE said by Linus is:

  if you access unlocked values, you use ACCESS_ONCE().

(See: http://yarchive.net/comp/linux/ACCESS_ONCE.html)
So I think it is still needed.
> 
>>  	if (node) {
>> -		atomic_inc(&node->refs);	/* can be accessed */
>> +		atomic_inc(&node->refs);
>>  		return node;
>>  	}
>>  
>> @@ -103,7 +100,9 @@ again:
>>  	if (node) {
>>  		if (btrfs_inode->delayed_node) {
>>  			spin_unlock(&root->inode_lock);
>> -			goto again;
>> +			BUG_ON(btrfs_inode->delayed_node != node);
>> +			atomic_inc(&node->refs);	/* can be accessed */
>> +			return node;
>>  		}
>>  		btrfs_inode->delayed_node = node;
>>  		atomic_inc(&node->refs);	/* can be accessed */
>> @@ -113,6 +112,23 @@ again:
>>  	}
>>  	spin_unlock(&root->inode_lock);
>>  
>> +	return NULL;
>> +}
>> +
>> +static struct btrfs_delayed_node *btrfs_get_or_create_delayed_node(
>> +							struct inode *inode)
>> +{
>> +	struct btrfs_delayed_node *node;
>> +	struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
>> +	struct btrfs_root *root = btrfs_inode->root;
>> +	u64 ino = btrfs_ino(inode);
>> +	int ret;
>> +
>> +again:
>> +	node = btrfs_get_delayed_node(inode);
> 
> ... aha, it's been somehow moved here, which copies the original logic.
> Now reading inode->delayed_node is inside a function and I do not think
> that compiler could optimize reading value of btrfs_inode->delayed_node
> that it would require ACCESS_ONCE.

The reason that I rewrote btrfs_get_delayed_node() is:
The old btrfs_get_delayed_node() may ignore the old delayed node which holds the
up-to-date information (such as: new directory indexes, up-to-date inode information),
considers there is no relative delayed node. And then btrfs will use the max index number
in the fs tree to initialize ->index_cnt, but in fact, this number is not right,
perhaps there are some directory indexes in the delayed node, the index number of them
is greater than the one in the fs tree. In this way, the same index number may be allocated
twice, and hit EEXIST error.

> 
> And there is another ACCESS_ONCE in btrfs_remove_delayed_node. I wonder
> what's the reason for that. Sorry to abuse this thread, but I'd like to
> be sure about protection of the ->delayed_node members inside
> btrfs_inode. Can you please comment on that?

OK, I'll write some comment.

> If you recall one message from the original report:
> 
> [ 5447.554187] err add delayed dir index item(name: pglog_0.965_0)
> into the insertion tree of the delayed node(root id:
> 262, inode id: 258, errno: -17)
> 
> (-17 == -EEXIST)
> 
> a printk after return from __btrfs_add_delayed_item (which is
> able to return -EEXIST) in btrfs_insert_delayed_dir_index. I haven't
> looked farther, but it seems that the item is being inserted (at least)
> twice and I suspect missing locking or other type of protection.

I don't think.
The reason is above.

Thanks
Miao

> 
> 
> thanks,
> david
> 
>> +	if (node)
>> +		return node;
>> +
>>  	node = kmem_cache_alloc(delayed_node_cache, GFP_NOFS);
>>  	if (!node)
>>  		return ERR_PTR(-ENOMEM);
>> @@ -548,19 +564,6 @@ struct btrfs_delayed_item *__btrfs_next_delayed_item(
>>  	return next;
>>  }
>>  
>> -static inline struct btrfs_delayed_node *btrfs_get_delayed_node(
>> -							struct inode *inode)
>> -{
>> -	struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
>> -	struct btrfs_delayed_node *delayed_node;
>> -
>> -	delayed_node = btrfs_inode->delayed_node;
>> -	if (delayed_node)
>> -		atomic_inc(&delayed_node->refs);
>> -
>> -	return delayed_node;
>> -}
>> -
>>  static inline struct btrfs_root *btrfs_get_fs_root(struct btrfs_root *root,
>>  						   u64 root_id)
>>  {
>> @@ -1404,8 +1407,7 @@ end:
>>  
>>  int btrfs_inode_delayed_dir_index_count(struct inode *inode)
>>  {
>> -	struct btrfs_delayed_node *delayed_node = BTRFS_I(inode)->delayed_node;
>> -	int ret = 0;
>> +	struct btrfs_delayed_node *delayed_node = btrfs_get_delayed_node(inode);
>>  
>>  	if (!delayed_node)
>>  		return -ENOENT;
>> @@ -1415,11 +1417,14 @@ int btrfs_inode_delayed_dir_index_count(struct inode *inode)
>>  	 * a new directory index is added into the delayed node and index_cnt
>>  	 * is updated now. So we needn't lock the delayed node.
>>  	 */
>> -	if (!delayed_node->index_cnt)
>> +	if (!delayed_node->index_cnt) {
>> +		btrfs_release_delayed_node(delayed_node);
>>  		return -EINVAL;
>> +	}
>>  
>>  	BTRFS_I(inode)->index_cnt = delayed_node->index_cnt;
>> -	return ret;
>> +	btrfs_release_delayed_node(delayed_node);
>> +	return 0;
>>  }
>>  
>>  void btrfs_get_delayed_items(struct inode *inode, struct list_head *ins_list,
>> @@ -1613,6 +1618,57 @@ static void fill_stack_inode_item(struct btrfs_trans_handle *trans,
>>  				      inode->i_ctime.tv_nsec);
>>  }
>>  
>> +int btrfs_fill_inode(struct inode *inode, u32 *rdev)
>> +{
>> +	struct btrfs_delayed_node *delayed_node;
>> +	struct btrfs_inode_item *inode_item;
>> +	struct btrfs_timespec *tspec;
>> +
>> +	delayed_node = btrfs_get_delayed_node(inode);
>> +	if (!delayed_node)
>> +		return -ENOENT;
>> +
>> +	mutex_lock(&delayed_node->mutex);
>> +	if (!delayed_node->inode_dirty) {
>> +		mutex_unlock(&delayed_node->mutex);
>> +		btrfs_release_delayed_node(delayed_node);
>> +		return -ENOENT;
>> +	}
>> +
>> +	inode_item = &delayed_node->inode_item;
>> +
>> +	inode->i_uid = btrfs_stack_inode_uid(inode_item);
>> +	inode->i_gid = btrfs_stack_inode_gid(inode_item);
>> +	btrfs_i_size_write(inode, btrfs_stack_inode_size(inode_item));
>> +	inode->i_mode = btrfs_stack_inode_mode(inode_item);
>> +	inode->i_nlink = btrfs_stack_inode_nlink(inode_item);
>> +	inode_set_bytes(inode, btrfs_stack_inode_nbytes(inode_item));
>> +	BTRFS_I(inode)->generation = btrfs_stack_inode_generation(inode_item);
>> +	BTRFS_I(inode)->sequence = btrfs_stack_inode_sequence(inode_item);
>> +	inode->i_rdev = 0;
>> +	*rdev = btrfs_stack_inode_rdev(inode_item);
>> +	BTRFS_I(inode)->flags = btrfs_stack_inode_flags(inode_item);
>> +
>> +	tspec = btrfs_inode_atime(inode_item);
>> +	inode->i_atime.tv_sec = btrfs_stack_timespec_sec(tspec);
>> +	inode->i_atime.tv_nsec = btrfs_stack_timespec_nsec(tspec);
>> +
>> +	tspec = btrfs_inode_mtime(inode_item);
>> +	inode->i_mtime.tv_sec = btrfs_stack_timespec_sec(tspec);
>> +	inode->i_mtime.tv_nsec = btrfs_stack_timespec_nsec(tspec);
>> +
>> +	tspec = btrfs_inode_ctime(inode_item);
>> +	inode->i_ctime.tv_sec = btrfs_stack_timespec_sec(tspec);
>> +	inode->i_ctime.tv_nsec = btrfs_stack_timespec_nsec(tspec);
>> +
>> +	inode->i_generation = BTRFS_I(inode)->generation;
>> +	BTRFS_I(inode)->index_cnt = (u64)-1;
>> +
>> +	mutex_unlock(&delayed_node->mutex);
>> +	btrfs_release_delayed_node(delayed_node);
>> +	return 0;
>> +}
>> +
>>  int btrfs_delayed_update_inode(struct btrfs_trans_handle *trans,
>>  			       struct btrfs_root *root, struct inode *inode)
>>  {
>> diff --git a/fs/btrfs/delayed-inode.h b/fs/btrfs/delayed-inode.h
>> index d1a6a29..8d27af4 100644
>> --- a/fs/btrfs/delayed-inode.h
>> +++ b/fs/btrfs/delayed-inode.h
>> @@ -119,6 +119,7 @@ void btrfs_kill_delayed_inode_items(struct inode *inode);
>>  
>>  int btrfs_delayed_update_inode(struct btrfs_trans_handle *trans,
>>  			       struct btrfs_root *root, struct inode *inode);
>> +int btrfs_fill_inode(struct inode *inode, u32 *rdev);
>>  
>>  /* Used for drop dead root */
>>  void btrfs_kill_all_delayed_nodes(struct btrfs_root *root);
>> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
>> index 5813dec..1d25a04 100644
>> --- a/fs/btrfs/inode.c
>> +++ b/fs/btrfs/inode.c
>> @@ -2509,6 +2509,11 @@ static void btrfs_read_locked_inode(struct inode *inode)
>>  	int maybe_acls;
>>  	u32 rdev;
>>  	int ret;
>> +	bool filled = false;
>> +
>> +	ret = btrfs_fill_inode(inode, &rdev);
>> +	if (!ret)
>> +		filled = true;
>>  
>>  	path = btrfs_alloc_path();
>>  	BUG_ON(!path);
>> @@ -2520,6 +2525,10 @@ static void btrfs_read_locked_inode(struct inode *inode)
>>  		goto make_bad;
>>  
>>  	leaf = path->nodes[0];
>> +
>> +	if (filled)
>> +		goto cache_acl;
>> +
>>  	inode_item = btrfs_item_ptr(leaf, path->slots[0],
>>  				    struct btrfs_inode_item);
>>  	if (!leaf->map_token)
>> @@ -2556,7 +2565,7 @@ static void btrfs_read_locked_inode(struct inode *inode)
>>  
>>  	BTRFS_I(inode)->index_cnt = (u64)-1;
>>  	BTRFS_I(inode)->flags = btrfs_inode_flags(leaf, inode_item);
>> -
>> +cache_acl:
>>  	/*
>>  	 * try to precache a NULL acl entry for files that don't have
>>  	 * any xattrs or acls
>> @@ -2572,7 +2581,6 @@ static void btrfs_read_locked_inode(struct inode *inode)
>>  	}
>>  
>>  	btrfs_free_path(path);
>> -	inode_item = NULL;
>>  
>>  	switch (inode->i_mode & S_IFMT) {
>>  	case S_IFREG:
>> -- 
>> 1.7.4
>>
> 
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jim Schutt June 22, 2011, 6:14 p.m. UTC | #4
Jim Schutt wrote:
> Hi Miao,
> 
> Miao Xie wrote:
>> Hi, Jim
>>
>> Could you test the attached patch for me? I have done some quick 
>> tests, it worked well. But I'm not sure if it can fix
>> the bug you reported or not, so I need your help!
> 
> So far I haven't been able to reproduce with your patch
> applied.  I'd like to test for a few more days, though,
> before calling it good.
> 
> Thanks for the patch -- I'll let you know what more
> testing brings.

I've been running this patch on top of 3.0-rc4 for
the last couple days, and have been unable to reproduce
the BUG.

Thanks -- Jim

> 
> -- Jim
> 

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/fs/btrfs/delayed-inode.c b/fs/btrfs/delayed-inode.c
index f1cbd02..280755e 100644
--- a/fs/btrfs/delayed-inode.c
+++ b/fs/btrfs/delayed-inode.c
@@ -82,19 +82,16 @@  static inline struct btrfs_delayed_root *btrfs_get_delayed_root(
 	return root->fs_info->delayed_root;
 }
 
-static struct btrfs_delayed_node *btrfs_get_or_create_delayed_node(
-							struct inode *inode)
+static struct btrfs_delayed_node *btrfs_get_delayed_node(struct inode *inode)
 {
-	struct btrfs_delayed_node *node;
 	struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
 	struct btrfs_root *root = btrfs_inode->root;
 	u64 ino = btrfs_ino(inode);
-	int ret;
+	struct btrfs_delayed_node *node;
 
-again:
 	node = ACCESS_ONCE(btrfs_inode->delayed_node);
 	if (node) {
-		atomic_inc(&node->refs);	/* can be accessed */
+		atomic_inc(&node->refs);
 		return node;
 	}
 
@@ -103,7 +100,9 @@  again:
 	if (node) {
 		if (btrfs_inode->delayed_node) {
 			spin_unlock(&root->inode_lock);
-			goto again;
+			BUG_ON(btrfs_inode->delayed_node != node);
+			atomic_inc(&node->refs);	/* can be accessed */
+			return node;
 		}
 		btrfs_inode->delayed_node = node;
 		atomic_inc(&node->refs);	/* can be accessed */
@@ -113,6 +112,23 @@  again:
 	}
 	spin_unlock(&root->inode_lock);
 
+	return NULL;
+}
+
+static struct btrfs_delayed_node *btrfs_get_or_create_delayed_node(
+							struct inode *inode)
+{
+	struct btrfs_delayed_node *node;
+	struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
+	struct btrfs_root *root = btrfs_inode->root;
+	u64 ino = btrfs_ino(inode);
+	int ret;
+
+again:
+	node = btrfs_get_delayed_node(inode);
+	if (node)
+		return node;
+
 	node = kmem_cache_alloc(delayed_node_cache, GFP_NOFS);
 	if (!node)
 		return ERR_PTR(-ENOMEM);
@@ -548,19 +564,6 @@  struct btrfs_delayed_item *__btrfs_next_delayed_item(
 	return next;
 }
 
-static inline struct btrfs_delayed_node *btrfs_get_delayed_node(
-							struct inode *inode)
-{
-	struct btrfs_inode *btrfs_inode = BTRFS_I(inode);
-	struct btrfs_delayed_node *delayed_node;
-
-	delayed_node = btrfs_inode->delayed_node;
-	if (delayed_node)
-		atomic_inc(&delayed_node->refs);
-
-	return delayed_node;
-}
-
 static inline struct btrfs_root *btrfs_get_fs_root(struct btrfs_root *root,
 						   u64 root_id)
 {
@@ -1404,8 +1407,7 @@  end:
 
 int btrfs_inode_delayed_dir_index_count(struct inode *inode)
 {
-	struct btrfs_delayed_node *delayed_node = BTRFS_I(inode)->delayed_node;
-	int ret = 0;
+	struct btrfs_delayed_node *delayed_node = btrfs_get_delayed_node(inode);
 
 	if (!delayed_node)
 		return -ENOENT;
@@ -1415,11 +1417,14 @@  int btrfs_inode_delayed_dir_index_count(struct inode *inode)
 	 * a new directory index is added into the delayed node and index_cnt
 	 * is updated now. So we needn't lock the delayed node.
 	 */
-	if (!delayed_node->index_cnt)
+	if (!delayed_node->index_cnt) {
+		btrfs_release_delayed_node(delayed_node);
 		return -EINVAL;
+	}
 
 	BTRFS_I(inode)->index_cnt = delayed_node->index_cnt;
-	return ret;
+	btrfs_release_delayed_node(delayed_node);
+	return 0;
 }
 
 void btrfs_get_delayed_items(struct inode *inode, struct list_head *ins_list,
@@ -1613,6 +1618,57 @@  static void fill_stack_inode_item(struct btrfs_trans_handle *trans,
 				      inode->i_ctime.tv_nsec);
 }
 
+int btrfs_fill_inode(struct inode *inode, u32 *rdev)
+{
+	struct btrfs_delayed_node *delayed_node;
+	struct btrfs_inode_item *inode_item;
+	struct btrfs_timespec *tspec;
+
+	delayed_node = btrfs_get_delayed_node(inode);
+	if (!delayed_node)
+		return -ENOENT;
+
+	mutex_lock(&delayed_node->mutex);
+	if (!delayed_node->inode_dirty) {
+		mutex_unlock(&delayed_node->mutex);
+		btrfs_release_delayed_node(delayed_node);
+		return -ENOENT;
+	}
+
+	inode_item = &delayed_node->inode_item;
+
+	inode->i_uid = btrfs_stack_inode_uid(inode_item);
+	inode->i_gid = btrfs_stack_inode_gid(inode_item);
+	btrfs_i_size_write(inode, btrfs_stack_inode_size(inode_item));
+	inode->i_mode = btrfs_stack_inode_mode(inode_item);
+	inode->i_nlink = btrfs_stack_inode_nlink(inode_item);
+	inode_set_bytes(inode, btrfs_stack_inode_nbytes(inode_item));
+	BTRFS_I(inode)->generation = btrfs_stack_inode_generation(inode_item);
+	BTRFS_I(inode)->sequence = btrfs_stack_inode_sequence(inode_item);
+	inode->i_rdev = 0;
+	*rdev = btrfs_stack_inode_rdev(inode_item);
+	BTRFS_I(inode)->flags = btrfs_stack_inode_flags(inode_item);
+
+	tspec = btrfs_inode_atime(inode_item);
+	inode->i_atime.tv_sec = btrfs_stack_timespec_sec(tspec);
+	inode->i_atime.tv_nsec = btrfs_stack_timespec_nsec(tspec);
+
+	tspec = btrfs_inode_mtime(inode_item);
+	inode->i_mtime.tv_sec = btrfs_stack_timespec_sec(tspec);
+	inode->i_mtime.tv_nsec = btrfs_stack_timespec_nsec(tspec);
+
+	tspec = btrfs_inode_ctime(inode_item);
+	inode->i_ctime.tv_sec = btrfs_stack_timespec_sec(tspec);
+	inode->i_ctime.tv_nsec = btrfs_stack_timespec_nsec(tspec);
+
+	inode->i_generation = BTRFS_I(inode)->generation;
+	BTRFS_I(inode)->index_cnt = (u64)-1;
+
+	mutex_unlock(&delayed_node->mutex);
+	btrfs_release_delayed_node(delayed_node);
+	return 0;
+}
+
 int btrfs_delayed_update_inode(struct btrfs_trans_handle *trans,
 			       struct btrfs_root *root, struct inode *inode)
 {
diff --git a/fs/btrfs/delayed-inode.h b/fs/btrfs/delayed-inode.h
index d1a6a29..8d27af4 100644
--- a/fs/btrfs/delayed-inode.h
+++ b/fs/btrfs/delayed-inode.h
@@ -119,6 +119,7 @@  void btrfs_kill_delayed_inode_items(struct inode *inode);
 
 int btrfs_delayed_update_inode(struct btrfs_trans_handle *trans,
 			       struct btrfs_root *root, struct inode *inode);
+int btrfs_fill_inode(struct inode *inode, u32 *rdev);
 
 /* Used for drop dead root */
 void btrfs_kill_all_delayed_nodes(struct btrfs_root *root);
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 5813dec..1d25a04 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -2509,6 +2509,11 @@  static void btrfs_read_locked_inode(struct inode *inode)
 	int maybe_acls;
 	u32 rdev;
 	int ret;
+	bool filled = false;
+
+	ret = btrfs_fill_inode(inode, &rdev);
+	if (!ret)
+		filled = true;
 
 	path = btrfs_alloc_path();
 	BUG_ON(!path);
@@ -2520,6 +2525,10 @@  static void btrfs_read_locked_inode(struct inode *inode)
 		goto make_bad;
 
 	leaf = path->nodes[0];
+
+	if (filled)
+		goto cache_acl;
+
 	inode_item = btrfs_item_ptr(leaf, path->slots[0],
 				    struct btrfs_inode_item);
 	if (!leaf->map_token)
@@ -2556,7 +2565,7 @@  static void btrfs_read_locked_inode(struct inode *inode)
 
 	BTRFS_I(inode)->index_cnt = (u64)-1;
 	BTRFS_I(inode)->flags = btrfs_inode_flags(leaf, inode_item);
-
+cache_acl:
 	/*
 	 * try to precache a NULL acl entry for files that don't have
 	 * any xattrs or acls
@@ -2572,7 +2581,6 @@  static void btrfs_read_locked_inode(struct inode *inode)
 	}
 
 	btrfs_free_path(path);
-	inode_item = NULL;
 
 	switch (inode->i_mode & S_IFMT) {
 	case S_IFREG: