diff mbox series

[1/2] writeback: avoid double-writing the inode on a lazytime expiration

Message ID 20200320025255.1705972-1-tytso@mit.edu (mailing list archive)
State Deferred, archived
Headers show
Series [1/2] writeback: avoid double-writing the inode on a lazytime expiration | expand

Commit Message

Theodore Ts'o March 20, 2020, 2:52 a.m. UTC
In the case that an inode has dirty timestamp for longer than the
lazytime expiration timeout (or if all such inodes are being flushed
out due to a sync or syncfs system call), we need to inform the file
system that the inode is dirty so that the inode's timestamps can be
copied out to the on-disk data structures.  That's because if the file
system supports lazytime, it will have ignored the dirty_inode(inode,
I_DIRTY_TIME) notification when the timestamp was modified in memory.q

Previously, this was accomplished by calling mark_inode_dirty_sync(),
but that has the unfortunate side effect of also putting the inode the
writeback list, and that's not necessary in this case, since we will
immediately call write_inode() afterwards.

Eric Biggers noticed that this was causing problems for fscrypt after
the key was removed[1].

[1] https://lore.kernel.org/r/20200306004555.GB225345@gmail.com

Reported-by: Eric Biggers <ebiggers@kernel.org>
Signed-off-by: Theodore Ts'o <tytso@mit.edu>
---
 fs/fs-writeback.c | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Christoph Hellwig March 25, 2020, 9:20 a.m. UTC | #1
>  	spin_unlock(&inode->i_lock);
>  
> -	if (dirty & I_DIRTY_TIME)
> -		mark_inode_dirty_sync(inode);
> +	/* This was a lazytime expiration; we need to tell the file system */
> +	if (dirty & I_DIRTY_TIME_EXPIRED && inode->i_sb->s_op->dirty_inode)
> +		inode->i_sb->s_op->dirty_inode(inode, I_DIRTY_SYNC);

I think this needs a very clear comment explaining why we don't go
through __mark_inode_dirty.

But as said before I'd rather have a new lazytime_expired operation that
makes it very clear what is happening.  We currenly have 4 file systems
(ext4, f2fs, ubifs and xfs) that support lazytime, so this won't really
be a major churn.
Theodore Ts'o March 25, 2020, 3:21 p.m. UTC | #2
On Wed, Mar 25, 2020 at 02:20:57AM -0700, Christoph Hellwig wrote:
> >  	spin_unlock(&inode->i_lock);
> >  
> > -	if (dirty & I_DIRTY_TIME)
> > -		mark_inode_dirty_sync(inode);
> > +	/* This was a lazytime expiration; we need to tell the file system */
> > +	if (dirty & I_DIRTY_TIME_EXPIRED && inode->i_sb->s_op->dirty_inode)
> > +		inode->i_sb->s_op->dirty_inode(inode, I_DIRTY_SYNC);
> 
> I think this needs a very clear comment explaining why we don't go
> through __mark_inode_dirty.

I can take the explanation which is in the git commit description and
move it into the comment.

> But as said before I'd rather have a new lazytime_expired operation that
> makes it very clear what is happening.  We currenly have 4 file systems
> (ext4, f2fs, ubifs and xfs) that support lazytime, so this won't really
> be a major churn.

Again, I believe patch #2 does what you want; if it doesn't can you
explain why passing I_DIRTY_TIME_EXPIRED to s_op->dirty_inode() isn't
"a new lazytime expired operation that makes very clear what is
happening"?

I separated out patch #1 and patch #2 because patch #1 preserves
current behavior, and patch #2 modifies XFS code, which I don't want
to push Linus without an XFS reviewed-by.

N.b.  None of the other file systems required a change for patch #2,
so if you want, we can have the XFS tree carry patch #2, and/or
combine that with whatever other simplifying changes that you want.
Or I can combine patch #1 and patch #2, with an XFS Reviewed-by, and
send it through the ext4 tree.

What's your pleasure?

					- Ted
Darrick J. Wong March 25, 2020, 3:47 p.m. UTC | #3
On Wed, Mar 25, 2020 at 11:21:13AM -0400, Theodore Y. Ts'o wrote:
> On Wed, Mar 25, 2020 at 02:20:57AM -0700, Christoph Hellwig wrote:
> > >  	spin_unlock(&inode->i_lock);
> > >  
> > > -	if (dirty & I_DIRTY_TIME)
> > > -		mark_inode_dirty_sync(inode);
> > > +	/* This was a lazytime expiration; we need to tell the file system */
> > > +	if (dirty & I_DIRTY_TIME_EXPIRED && inode->i_sb->s_op->dirty_inode)
> > > +		inode->i_sb->s_op->dirty_inode(inode, I_DIRTY_SYNC);
> > 
> > I think this needs a very clear comment explaining why we don't go
> > through __mark_inode_dirty.
> 
> I can take the explanation which is in the git commit description and
> move it into the comment.
> 
> > But as said before I'd rather have a new lazytime_expired operation that
> > makes it very clear what is happening.  We currenly have 4 file systems
> > (ext4, f2fs, ubifs and xfs) that support lazytime, so this won't really
> > be a major churn.
> 
> Again, I believe patch #2 does what you want; if it doesn't can you
> explain why passing I_DIRTY_TIME_EXPIRED to s_op->dirty_inode() isn't
> "a new lazytime expired operation that makes very clear what is
> happening"?
> 
> I separated out patch #1 and patch #2 because patch #1 preserves
> current behavior, and patch #2 modifies XFS code, which I don't want
> to push Linus without an XFS reviewed-by.
> 
> N.b.  None of the other file systems required a change for patch #2,
> so if you want, we can have the XFS tree carry patch #2, and/or
> combine that with whatever other simplifying changes that you want.
> Or I can combine patch #1 and patch #2, with an XFS Reviewed-by, and
> send it through the ext4 tree.
> 
> What's your pleasure?

TBH while I'm pretty sure this does actually maintain more or less the
same behavior on xfs, I prefer Christoph's explicit ->lazytime_expired
approach[1] over squinting at bitflag manipulations.

(It also took me a while to realize that this patch duo even existed, as
it was kinda buried in its parent thread...)

--D

[1] https://lore.kernel.org/linux-fsdevel/20200325122825.1086872-1-hch@lst.de/T/#t

> 
> 					- Ted
>
diff mbox series

Patch

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 76ac9c7d32ec..867454997c9d 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1504,8 +1504,9 @@  __writeback_single_inode(struct inode *inode, struct writeback_control *wbc)
 
 	spin_unlock(&inode->i_lock);
 
-	if (dirty & I_DIRTY_TIME)
-		mark_inode_dirty_sync(inode);
+	/* This was a lazytime expiration; we need to tell the file system */
+	if (dirty & I_DIRTY_TIME_EXPIRED && inode->i_sb->s_op->dirty_inode)
+		inode->i_sb->s_op->dirty_inode(inode, I_DIRTY_SYNC);
 	/* Don't write the inode if only I_DIRTY_PAGES was set */
 	if (dirty & ~I_DIRTY_PAGES) {
 		int err = write_inode(inode, wbc);