diff mbox

[v2,4/4] gfs2: convert to errseq_t based writeback error reporting for fsync

Message ID 20170726175538.13885-5-jlayton@kernel.org (mailing list archive)
State New, archived
Headers show

Commit Message

Jeff Layton July 26, 2017, 5:55 p.m. UTC
From: Jeff Layton <jlayton@redhat.com>

This means that we need to export the new file_fdatawait_range symbol.

Also, fix a place where a writeback error might get dropped in the
gfs2_is_jdata case.

Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/gfs2/file.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Matthew Wilcox July 26, 2017, 7:21 p.m. UTC | #1
On Wed, Jul 26, 2017 at 01:55:38PM -0400, Jeff Layton wrote:
> @@ -668,12 +668,14 @@ static int gfs2_fsync(struct file *file, loff_t start, loff_t end,
>  		if (ret)
>  			return ret;
>  		if (gfs2_is_jdata(ip))
> -			filemap_write_and_wait(mapping);
> +			ret = file_write_and_wait(file);
> +		if (ret)
> +			return ret;
>  		gfs2_ail_flush(ip->i_gl, 1);
>  	}

Do we want to skip flushing the AIL if there was an error (possibly
previously encountered)?  I'd think we'd want to flush the AIL then report
the error, like this:

 		if (gfs2_is_jdata(ip))
-			filemap_write_and_wait(mapping);
+			ret = file_write_and_wait(file);
 		gfs2_ail_flush(ip->i_gl, 1);
+		if (ret)
+			return ret;
 	}
Jeff Layton July 26, 2017, 10:22 p.m. UTC | #2
On Wed, 2017-07-26 at 12:21 -0700, Matthew Wilcox wrote:
> On Wed, Jul 26, 2017 at 01:55:38PM -0400, Jeff Layton wrote:
> > @@ -668,12 +668,14 @@ static int gfs2_fsync(struct file *file, loff_t start, loff_t end,
> >  		if (ret)
> >  			return ret;
> >  		if (gfs2_is_jdata(ip))
> > -			filemap_write_and_wait(mapping);
> > +			ret = file_write_and_wait(file);
> > +		if (ret)
> > +			return ret;
> >  		gfs2_ail_flush(ip->i_gl, 1);
> >  	}
> 
> Do we want to skip flushing the AIL if there was an error (possibly
> previously encountered)?  I'd think we'd want to flush the AIL then report
> the error, like this:
> 

I wondered about that. Note that earlier in the function, we also bail
out without flushing the AIL if sync_inode_metadata fails, so I assumed
that we'd want to do the same here. 

I could definitely be wrong and am fine with changing it if so.
Discarding the error like we do today seems wrong though.

Bob, thoughts?


>  		if (gfs2_is_jdata(ip))
> -			filemap_write_and_wait(mapping);
> +			ret = file_write_and_wait(file);
>  		gfs2_ail_flush(ip->i_gl, 1);
> +		if (ret)
> +			return ret;
>  	}
Bob Peterson July 27, 2017, 12:47 p.m. UTC | #3
----- Original Message -----
| On Wed, 2017-07-26 at 12:21 -0700, Matthew Wilcox wrote:
| > On Wed, Jul 26, 2017 at 01:55:38PM -0400, Jeff Layton wrote:
| > > @@ -668,12 +668,14 @@ static int gfs2_fsync(struct file *file, loff_t
| > > start, loff_t end,
| > >  		if (ret)
| > >  			return ret;
| > >  		if (gfs2_is_jdata(ip))
| > > -			filemap_write_and_wait(mapping);
| > > +			ret = file_write_and_wait(file);
| > > +		if (ret)
| > > +			return ret;
| > >  		gfs2_ail_flush(ip->i_gl, 1);
| > >  	}
| > 
| > Do we want to skip flushing the AIL if there was an error (possibly
| > previously encountered)?  I'd think we'd want to flush the AIL then report
| > the error, like this:
| > 
| 
| I wondered about that. Note that earlier in the function, we also bail
| out without flushing the AIL if sync_inode_metadata fails, so I assumed
| that we'd want to do the same here.
| 
| I could definitely be wrong and am fine with changing it if so.
| Discarding the error like we do today seems wrong though.
| 
| Bob, thoughts?

Hi Jeff, Matthew,

I'm not sure there's a right or wrong answer here. I don't know what's
best from a "correctness" point of view.

I guess I'm leaning toward Jeff's original solution where we don't
call gfs2_ail_flush() on error. The main purpose of ail_flush is to
go through buffer descriptors (bds) attached to the glock and generate
revokes for them in a new transaction. If there's an error condition,
trying to go through more hoops will probably just get us into more
trouble. If the error is -ENOMEM, we don't want to allocate new memory
for the new transaction. If the error is -EIO, we probably don't
want to encourage more writing either.

So on the one hand, it might be good to get rid of the buffer descriptors
so we don't leak memory, but that's probably also done elsewhere.
I have not chased down what happens in that case, but the same thing
would happen in the existing -EIO case a few lines above.

On the other hand, we probably don't want to start a new transaction
and start adding revokes to it, and such, due to the error.

Perhaps Steve Whitehouse can weigh in?

Regards,

Bob Peterson
Red Hat File Systems
Steven Whitehouse July 28, 2017, 12:37 p.m. UTC | #4
Hi,


On 27/07/17 13:47, Bob Peterson wrote:
> ----- Original Message -----
> | On Wed, 2017-07-26 at 12:21 -0700, Matthew Wilcox wrote:
> | > On Wed, Jul 26, 2017 at 01:55:38PM -0400, Jeff Layton wrote:
> | > > @@ -668,12 +668,14 @@ static int gfs2_fsync(struct file *file, loff_t
> | > > start, loff_t end,
> | > >  		if (ret)
> | > >  			return ret;
> | > >  		if (gfs2_is_jdata(ip))
> | > > -			filemap_write_and_wait(mapping);
> | > > +			ret = file_write_and_wait(file);
> | > > +		if (ret)
> | > > +			return ret;
> | > >  		gfs2_ail_flush(ip->i_gl, 1);
> | > >  	}
> | >
> | > Do we want to skip flushing the AIL if there was an error (possibly
> | > previously encountered)?  I'd think we'd want to flush the AIL then report
> | > the error, like this:
> | >
> |
> | I wondered about that. Note that earlier in the function, we also bail
> | out without flushing the AIL if sync_inode_metadata fails, so I assumed
> | that we'd want to do the same here.
> |
> | I could definitely be wrong and am fine with changing it if so.
> | Discarding the error like we do today seems wrong though.
> |
> | Bob, thoughts?
>
> Hi Jeff, Matthew,
>
> I'm not sure there's a right or wrong answer here. I don't know what's
> best from a "correctness" point of view.
>
> I guess I'm leaning toward Jeff's original solution where we don't
> call gfs2_ail_flush() on error. The main purpose of ail_flush is to
> go through buffer descriptors (bds) attached to the glock and generate
> revokes for them in a new transaction. If there's an error condition,
> trying to go through more hoops will probably just get us into more
> trouble. If the error is -ENOMEM, we don't want to allocate new memory
> for the new transaction. If the error is -EIO, we probably don't
> want to encourage more writing either.
>
> So on the one hand, it might be good to get rid of the buffer descriptors
> so we don't leak memory, but that's probably also done elsewhere.
> I have not chased down what happens in that case, but the same thing
> would happen in the existing -EIO case a few lines above.
>
> On the other hand, we probably don't want to start a new transaction
> and start adding revokes to it, and such, due to the error.
>
> Perhaps Steve Whitehouse can weigh in?
>
> Regards,
>
> Bob Peterson
> Red Hat File Systems

Yes, we probably do want to skip the ail flush if there is an error. We 
don't know whether the error is permanent or transient at that stage. If 
a previous stage of the fsync has failed, then there may be nothing for 
the next stage to do anyway, so it is probably not a big deal either 
way. So long as the error is reported to the caller, then we should be ok,

Steve.
Jeff Layton July 28, 2017, 12:47 p.m. UTC | #5
On Fri, 2017-07-28 at 13:37 +0100, Steven Whitehouse wrote:
> Hi,
> 
> 
> On 27/07/17 13:47, Bob Peterson wrote:
> > ----- Original Message -----
> > > On Wed, 2017-07-26 at 12:21 -0700, Matthew Wilcox wrote:
> > > > On Wed, Jul 26, 2017 at 01:55:38PM -0400, Jeff Layton wrote:
> > > > > @@ -668,12 +668,14 @@ static int gfs2_fsync(struct file *file, loff_t
> > > > > start, loff_t end,
> > > > >  		if (ret)
> > > > >  			return ret;
> > > > >  		if (gfs2_is_jdata(ip))
> > > > > -			filemap_write_and_wait(mapping);
> > > > > +			ret = file_write_and_wait(file);
> > > > > +		if (ret)
> > > > > +			return ret;
> > > > >  		gfs2_ail_flush(ip->i_gl, 1);
> > > > >  	}
> > > > 
> > > > Do we want to skip flushing the AIL if there was an error (possibly
> > > > previously encountered)?  I'd think we'd want to flush the AIL then report
> > > > the error, like this:
> > > > 
> > > 
> > > I wondered about that. Note that earlier in the function, we also bail
> > > out without flushing the AIL if sync_inode_metadata fails, so I assumed
> > > that we'd want to do the same here.
> > > 
> > > I could definitely be wrong and am fine with changing it if so.
> > > Discarding the error like we do today seems wrong though.
> > > 
> > > Bob, thoughts?
> > 
> > Hi Jeff, Matthew,
> > 
> > I'm not sure there's a right or wrong answer here. I don't know what's
> > best from a "correctness" point of view.
> > 
> > I guess I'm leaning toward Jeff's original solution where we don't
> > call gfs2_ail_flush() on error. The main purpose of ail_flush is to
> > go through buffer descriptors (bds) attached to the glock and generate
> > revokes for them in a new transaction. If there's an error condition,
> > trying to go through more hoops will probably just get us into more
> > trouble. If the error is -ENOMEM, we don't want to allocate new memory
> > for the new transaction. If the error is -EIO, we probably don't
> > want to encourage more writing either.
> > 
> > So on the one hand, it might be good to get rid of the buffer descriptors
> > so we don't leak memory, but that's probably also done elsewhere.
> > I have not chased down what happens in that case, but the same thing
> > would happen in the existing -EIO case a few lines above.
> > 
> > On the other hand, we probably don't want to start a new transaction
> > and start adding revokes to it, and such, due to the error.
> > 
> > Perhaps Steve Whitehouse can weigh in?
> > 
> > Regards,
> > 
> > Bob Peterson
> > Red Hat File Systems
> 
> Yes, we probably do want to skip the ail flush if there is an error. We 
> don't know whether the error is permanent or transient at that stage. If 
> a previous stage of the fsync has failed, then there may be nothing for 
> the next stage to do anyway, so it is probably not a big deal either 
> way. So long as the error is reported to the caller, then we should be ok,
> 

Ok, cool. I'll plan to carry this patch for now as it depends on an
earlier one in the series. One more question though:

Is it correct in the gfs2_is_jdata case to ignore the range that was
passed in from the caller? ->fsync gets start and end arguments, but
this will always write back the whole range. Is that necessary in this
case?
Steven Whitehouse July 28, 2017, 12:54 p.m. UTC | #6
Hi,


On 28/07/17 13:47, Jeff Layton wrote:
> On Fri, 2017-07-28 at 13:37 +0100, Steven Whitehouse wrote:
>> Hi,
>>
>>
>> On 27/07/17 13:47, Bob Peterson wrote:
>>> ----- Original Message -----
>>>> On Wed, 2017-07-26 at 12:21 -0700, Matthew Wilcox wrote:
>>>>> On Wed, Jul 26, 2017 at 01:55:38PM -0400, Jeff Layton wrote:
>>>>>> @@ -668,12 +668,14 @@ static int gfs2_fsync(struct file *file, loff_t
>>>>>> start, loff_t end,
>>>>>>   		if (ret)
>>>>>>   			return ret;
>>>>>>   		if (gfs2_is_jdata(ip))
>>>>>> -			filemap_write_and_wait(mapping);
>>>>>> +			ret = file_write_and_wait(file);
>>>>>> +		if (ret)
>>>>>> +			return ret;
>>>>>>   		gfs2_ail_flush(ip->i_gl, 1);
>>>>>>   	}
>>>>> Do we want to skip flushing the AIL if there was an error (possibly
>>>>> previously encountered)?  I'd think we'd want to flush the AIL then report
>>>>> the error, like this:
>>>>>
>>>> I wondered about that. Note that earlier in the function, we also bail
>>>> out without flushing the AIL if sync_inode_metadata fails, so I assumed
>>>> that we'd want to do the same here.
>>>>
>>>> I could definitely be wrong and am fine with changing it if so.
>>>> Discarding the error like we do today seems wrong though.
>>>>
>>>> Bob, thoughts?
>>> Hi Jeff, Matthew,
>>>
>>> I'm not sure there's a right or wrong answer here. I don't know what's
>>> best from a "correctness" point of view.
>>>
>>> I guess I'm leaning toward Jeff's original solution where we don't
>>> call gfs2_ail_flush() on error. The main purpose of ail_flush is to
>>> go through buffer descriptors (bds) attached to the glock and generate
>>> revokes for them in a new transaction. If there's an error condition,
>>> trying to go through more hoops will probably just get us into more
>>> trouble. If the error is -ENOMEM, we don't want to allocate new memory
>>> for the new transaction. If the error is -EIO, we probably don't
>>> want to encourage more writing either.
>>>
>>> So on the one hand, it might be good to get rid of the buffer descriptors
>>> so we don't leak memory, but that's probably also done elsewhere.
>>> I have not chased down what happens in that case, but the same thing
>>> would happen in the existing -EIO case a few lines above.
>>>
>>> On the other hand, we probably don't want to start a new transaction
>>> and start adding revokes to it, and such, due to the error.
>>>
>>> Perhaps Steve Whitehouse can weigh in?
>>>
>>> Regards,
>>>
>>> Bob Peterson
>>> Red Hat File Systems
>> Yes, we probably do want to skip the ail flush if there is an error. We
>> don't know whether the error is permanent or transient at that stage. If
>> a previous stage of the fsync has failed, then there may be nothing for
>> the next stage to do anyway, so it is probably not a big deal either
>> way. So long as the error is reported to the caller, then we should be ok,
>>
> Ok, cool. I'll plan to carry this patch for now as it depends on an
> earlier one in the series. One more question though:
>
> Is it correct in the gfs2_is_jdata case to ignore the range that was
> passed in from the caller? ->fsync gets start and end arguments, but
> this will always write back the whole range. Is that necessary in this
> case?
>
It probably doesn't matter really. We try to discourage the use of jdata 
from userspace. There are a few internal files that use it still, and it 
is there for backwards compatibility more than anything. So performance 
is generally not a problem for that. The ordered write mode is the 
important one.

So you are right that it might be better to add the range into that call 
too, but it is not likely that anybody will notice the performance 
improvement,

Steve.
diff mbox

Patch

diff --git a/fs/gfs2/file.c b/fs/gfs2/file.c
index c2062a108d19..c53ac6efd04c 100644
--- a/fs/gfs2/file.c
+++ b/fs/gfs2/file.c
@@ -668,12 +668,14 @@  static int gfs2_fsync(struct file *file, loff_t start, loff_t end,
 		if (ret)
 			return ret;
 		if (gfs2_is_jdata(ip))
-			filemap_write_and_wait(mapping);
+			ret = file_write_and_wait(file);
+		if (ret)
+			return ret;
 		gfs2_ail_flush(ip->i_gl, 1);
 	}
 
 	if (mapping->nrpages)
-		ret = filemap_fdatawait_range(mapping, start, end);
+		ret = file_fdatawait_range(file, start, end);
 
 	return ret ? ret : ret1;
 }