Message ID | 20130129145429.GD3660@localhost.localdomain (mailing list archive) |
---|---|
State | New, archived |
Well I found this, so I think it's likely:

root@gwboss2:~# dmesg |grep bitten
[ 3196.193238] this would have bitten us in the ass
[ 3196.193784] this would have bitten us in the ass

On Jan 29, 2013, at 9:54 AM, Josef Bacik <jbacik@fusionio.com> wrote:

> On Mon, Jan 28, 2013 at 05:12:12PM -0700, Sage Weil wrote:
>> A ceph user observed an incorrect i_size on btrfs.  The pattern looks like
>> this:
>>
>> - some writes at low file offsets
>> - a write to 4185600 len 8704 (i_size should be 4MB)
>> - more writes to low offsets
>> - a write to 4181504 len 4096 (abuts the write above)
>> - a bit of time goes by...
>> - stat returns 4186112 (4MB - 8192)
>>    - that's a few bytes to the right of the top write above.
>>
>> There are some logs showing the full read/write activity to the file at
>>
>> 	http://tracker.newdream.net/attachments/658/object_log.txt
>>
>> on issue
>>
>> 	http://tracker.newdream.net/issues/3810
>>
>> The kernel was 3.7.0-030700-generic (and probably also observed on 3.7.1).
>>
>> Is this a known bug?
>
> Not known but I took a long hard look at our ordered i size updating and I think
> I spotted the bug.  Could you run this patch and see if you get the printk?  If
> you do then that was the problem and you should be good to go.  It definitely
> needs to be fixed, hopefully it's also your bug.  Thanks,
>
> Josef
>
>
> diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
> index cbd4838..dbd4905 100644
> --- a/fs/btrfs/ordered-data.c
> +++ b/fs/btrfs/ordered-data.c
> @@ -895,8 +895,14 @@ int btrfs_ordered_update_i_size(struct inode *inode, u64 offset,
>  	 * if the disk i_size is already at the inode->i_size, or
>  	 * this ordered extent is inside the disk i_size, we're done
>  	 */
> -	if (disk_i_size == i_size || offset <= disk_i_size) {
> +	if (disk_i_size == i_size)
>  		goto out;
> +
> +	if (offset <= disk_i_size) {
> +		if (ordered && ordered->outstanding_isize > disk_i_size)
> +			printk(KERN_ERR "this would have bitten us in the ass\n");
> +		else
> +			goto out;
>  	}
>
>  	/*

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

On Wed, Jan 30, 2013 at 11:17:25AM -0700, Mike Lowe wrote:
> Well I found this, so I think it's likely:
>
> root@gwboss2:~# dmesg |grep bitten
> [ 3196.193238] this would have bitten us in the ass
> [ 3196.193784] this would have bitten us in the ass
>

Well that makes me happy since I had almost talked myself out of this being a
possibility.  How long did it take you to hit this problem before and how long
have you been running with this patch?  Thanks,

Josef

I've been running rsync against an rbd device backed by btrfs filesystems
that are about 11% full for about 45 minutes before I checked and noticed
the printk message.  That was the first go with the patch.  Seems like I was
able to get by without any problems until the btrfs filesystems got some use
and filled up a little bit.

On Jan 30, 2013, at 1:22 PM, Josef Bacik <jbacik@fusionio.com> wrote:

> On Wed, Jan 30, 2013 at 11:17:25AM -0700, Mike Lowe wrote:
>> Well I found this, so I think it's likely:
>>
>> root@gwboss2:~# dmesg |grep bitten
>> [ 3196.193238] this would have bitten us in the ass
>> [ 3196.193784] this would have bitten us in the ass
>>
>
> Well that makes me happy since I had almost talked myself out of this being a
> possibility.  How long did it take you to hit this problem before and how long
> have you been running with this patch?  Thanks,
>
> Josef

On Wed, Jan 30, 2013 at 11:30:49AM -0700, Mike Lowe wrote:
> I've been running rsync against an rbd device backed by btrfs filesystems
> that are about 11% full for about 45 minutes before I checked and noticed
> the printk message.  That was the first go with the patch.  Seems like I was
> able to get by without any problems until the btrfs filesystems got some use
> and filled up a little bit.
>

Ok since you are seeing the message I'll go ahead and post the patch and get it
moving along, let me know if you still see the problem.  Thanks,

Josef

On Wed, 30 Jan 2013, Josef Bacik wrote:
> On Wed, Jan 30, 2013 at 11:30:49AM -0700, Mike Lowe wrote:
> > I've been running rsync against an rbd device backed by btrfs filesystems
> > that are about 11% full for about 45 minutes before I checked and noticed
> > the printk message.  That was the first go with the patch.  Seems like I was
> > able to get by without any problems until the btrfs filesystems got some use
> > and filled up a little bit.
> >
>
> Ok since you are seeing the message I'll go ahead and post the patch and
> get it moving along, let me know if you still see the problem.  Thanks,

Awesome.  Mike still hasn't seen a recurrence, so it's looking like the
patch is good.

Thanks so much!
sage

diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c
index cbd4838..dbd4905 100644
--- a/fs/btrfs/ordered-data.c
+++ b/fs/btrfs/ordered-data.c
@@ -895,8 +895,14 @@ int btrfs_ordered_update_i_size(struct inode *inode, u64 offset,
 	 * if the disk i_size is already at the inode->i_size, or
 	 * this ordered extent is inside the disk i_size, we're done
 	 */
-	if (disk_i_size == i_size || offset <= disk_i_size) {
+	if (disk_i_size == i_size)
 		goto out;
+
+	if (offset <= disk_i_size) {
+		if (ordered && ordered->outstanding_isize > disk_i_size)
+			printk(KERN_ERR "this would have bitten us in the ass\n");
+		else
+			goto out;
 	}
 
 	/*