Message ID | 1405584516-16881-1-git-send-email-bo.li.liu@oracle.com (mailing list archive) |
---|---|
State | Accepted |
Headers | show |
On 07/17/2014 04:08 AM, Liu Bo wrote: > xfstests generic/127 detected this problem. > > With commit 7fc34a62ca4434a79c68e23e70ed26111b7a4cf8, now fsync will only flush > data within the passed range. This is the cause of the above problem, > -- btrfs's fsync has a stage called 'sync log' which will wait for all the > ordered extents it've recorded to finish. > > In xfstests/generic/127, with mixed operations such as truncate, fallocate, > punch hole, and mapwrite, we get some pre-allocated extents, and mapwrite will > mmap, and then msync. And I find that msync will wait for quite a long time > (about 20s in my case), thanks to ftrace, it turns out that the previous > fallocate calls 'btrfs_wait_ordered_range()' to flush dirty pages, but as the > range of dirty pages may be larger than 'btrfs_wait_ordered_range()' wants, > there can be some ordered extents created but not getting corresponding pages > flushed, then they're left in memory until we fsync which runs into the > stage 'sync log', and fsync will just wait for the system writeback thread > to flush those pages and get ordered extents finished, so the latency is > inevitable. > > This adds a flush similar to btrfs_start_ordered_extent() in > btrfs_wait_logged_extents() to fix that. I was able to trigger the stalls with plain fsx as well. Thanks! -chris -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/btrfs/ordered-data.c b/fs/btrfs/ordered-data.c index e12441c..7187b14 100644 --- a/fs/btrfs/ordered-data.c +++ b/fs/btrfs/ordered-data.c @@ -484,8 +484,19 @@ void btrfs_wait_logged_extents(struct btrfs_root *log, u64 transid) log_list); list_del_init(&ordered->log_list); spin_unlock_irq(&log->log_extents_lock[index]); + + if (!test_bit(BTRFS_ORDERED_IO_DONE, &ordered->flags) && + !test_bit(BTRFS_ORDERED_DIRECT, &ordered->flags)) { + struct inode *inode = ordered->inode; + u64 start = ordered->file_offset; + u64 end = ordered->file_offset + ordered->len - 1; + + WARN_ON(!inode); + filemap_fdatawrite_range(inode->i_mapping, start, end); + } wait_event(ordered->wait, test_bit(BTRFS_ORDERED_IO_DONE, &ordered->flags)); + btrfs_put_ordered_extent(ordered); spin_lock_irq(&log->log_extents_lock[index]); }