
[v2,1/2] filemap: skip write and wait if end offset precedes start

Message ID 20221128155632.3950447-2-bfoster@redhat.com (mailing list archive)
State New
Series filemap: skip write and wait if end offset precedes start

Commit Message

Brian Foster Nov. 28, 2022, 3:56 p.m. UTC
A call to file[map]_write_and_wait_range() with an end offset that
precedes the start offset but happens to land in the same page can
trigger writeback submission, yet never waits on the submitted
page. Writeback submission occurs because
__filemap_fdatawrite_range() passes both offsets down into
write_cache_pages(), which rounds down to page indexes before it
starts processing writeback. However, __filemap_fdatawait_range()
immediately returns if the byte-granular end offset precedes the
start offset.
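
The mismatch can be illustrated with a small userspace sketch. It is
not kernel code: the sketch_* names are hypothetical, a 4 KiB page
size is assumed, and the write side is reduced to the page-index
comparison that bounds the writeback walk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB pages. The real value is PAGE_SHIFT in the kernel. */
#define SKETCH_PAGE_SHIFT 12

/* Round a non-negative byte offset down to the index of its page. */
static inline uint64_t sketch_page_index(int64_t byte_off)
{
	assert(byte_off >= 0);
	return (uint64_t)byte_off >> SKETCH_PAGE_SHIFT;
}

/*
 * Simplified model of the write side: offsets are rounded down to page
 * indexes first, so the walk runs whenever start index <= end index.
 */
static inline bool sketch_write_visits_page(int64_t lstart, int64_t lend)
{
	return sketch_page_index(lstart) <= sketch_page_index(lend);
}

/*
 * Simplified model of the wait side: the byte-granular check returns
 * early whenever the end offset precedes the start offset.
 */
static inline bool sketch_wait_runs(int64_t lstart, int64_t lend)
{
	return !(lend < lstart);
}
```

With lstart = 4200 and lend = 4100, both offsets round down to page
index 1, so the write-side model visits the page while the wait-side
model returns early.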

This behavior was observed in the form of unpredictable latency from
a frequent write and wait call with incorrect parameters. The
behavior gave the impression that the fdatawait path might
occasionally fail to wait on writeback, but further investigation
showed the latency was from write_cache_pages() waiting on writeback
state to clear for a page already under writeback. In other words,
fdatawait never actually waits on writeback in this particular
situation.

The byte-granular check in __filemap_fdatawait_range() goes all the
way back to the old wait_on_page_writeback_range() helper. It originally
used page offsets and so would have waited in this problematic case.
That changed to byte-granular file offsets in commit 94004ed726f3
("kill wait_on_page_writeback_range"), which subtly changed this
behavior. The check itself has become somewhat redundant since the
error checking code that used to follow the wait loop (at the time
of the aforementioned commit) has now been removed and lifted into
the higher level callers.

We could therefore restore historical fdatawait behavior by simply
removing the check. However, since the current fdatawait behavior has
been in place for quite some time and is consistent with other
interfaces that use file offsets, instead lift the check into the
file[map]_write_and_wait_range() helpers so that the write and the
wait behave consistently.
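
The effect of a guard at the top of the helper can be sketched in
userspace as follows. This is a simplified model, not the kernel
implementation: the sketch_* functions and counters are hypothetical
stand-ins for the write and wait steps, and error propagation from
the wait step is omitted.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical counters that record what the stubs were asked to do. */
static int sketch_writes_submitted;
static int sketch_waits_done;

/* Stand-in for the write step; the guard ensures the range is ordered. */
static int sketch_fdatawrite_range(int64_t lstart, int64_t lend)
{
	assert(lstart <= lend);
	sketch_writes_submitted++;
	return 0;
}

/* Stand-in for the wait step; it sees the same ordered range. */
static void sketch_fdatawait_range(int64_t lstart, int64_t lend)
{
	assert(lstart <= lend);
	sketch_waits_done++;
}

/* Model of a write-and-wait helper with the range check lifted to the top. */
static int sketch_write_and_wait_range(int64_t lstart, int64_t lend)
{
	int err;

	/* Inverted range: skip both the write and the wait. */
	if (lend < lstart)
		return 0;

	err = sketch_fdatawrite_range(lstart, lend);
	sketch_fdatawait_range(lstart, lend);
	return err;
}
```

In this model, an inverted range such as (4200, 4100) touches neither
stub, while an ordered range always reaches both, so the write and
the wait can no longer disagree about whether the range is valid.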

Signed-off-by: Brian Foster <bfoster@redhat.com>
---
 mm/filemap.c | 9 ++++++---
 1 file changed, 6 insertions(+), 3 deletions(-)

Comments

Christoph Hellwig Nov. 30, 2022, 7:44 a.m. UTC | #1
Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

Patch

diff --git a/mm/filemap.c b/mm/filemap.c
index 08341616ae7a..e7711b5a3f4c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -506,9 +506,6 @@  static void __filemap_fdatawait_range(struct address_space *mapping,
 	struct pagevec pvec;
 	int nr_pages;
 
-	if (end_byte < start_byte)
-		return;
-
 	pagevec_init(&pvec);
 	while (index <= end) {
 		unsigned i;
@@ -670,6 +667,9 @@  int filemap_write_and_wait_range(struct address_space *mapping,
 {
 	int err = 0, err2;
 
+	if (lend < lstart)
+		return 0;
+
 	if (mapping_needs_writeback(mapping)) {
 		err = __filemap_fdatawrite_range(mapping, lstart, lend,
 						 WB_SYNC_ALL);
@@ -770,6 +770,9 @@  int file_write_and_wait_range(struct file *file, loff_t lstart, loff_t lend)
 	int err = 0, err2;
 	struct address_space *mapping = file->f_mapping;
 
+	if (lend < lstart)
+		return 0;
+
 	if (mapping_needs_writeback(mapping)) {
 		err = __filemap_fdatawrite_range(mapping, lstart, lend,
 						 WB_SYNC_ALL);