diff mbox series

[v4,16/16] xfs: add pre-content fsnotify hook for write faults

Message ID 631039816bbac737db351e3067520e85a8774ba1.1723670362.git.josef@toxicpanda.com (mailing list archive)
State New
Headers show
Series fanotify: add pre-content hooks | expand

Commit Message

Josef Bacik Aug. 14, 2024, 9:25 p.m. UTC
xfs has it's own handling for write faults, so we need to add the
pre-content fsnotify hook for this case.  Reads go through filemap_fault
so they're handled properly there.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/xfs/xfs_file.c | 4 ++++
 1 file changed, 4 insertions(+)

Comments

Jan Kara Aug. 29, 2024, 11:17 a.m. UTC | #1
On Wed 14-08-24 17:25:34, Josef Bacik wrote:
> xfs has it's own handling for write faults, so we need to add the
> pre-content fsnotify hook for this case.  Reads go through filemap_fault
> so they're handled properly there.
> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Looks good to me but it would be great to get explicit ack from some XFS
guy...  Some selection CCed :)

								Honza

> ---
>  fs/xfs/xfs_file.c | 4 ++++
>  1 file changed, 4 insertions(+)
> 
> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> index 4cdc54dc9686..e61c4c389d7d 100644
> --- a/fs/xfs/xfs_file.c
> +++ b/fs/xfs/xfs_file.c
> @@ -1283,6 +1283,10 @@ xfs_write_fault(
>  	unsigned int		lock_mode = XFS_MMAPLOCK_SHARED;
>  	vm_fault_t		ret;
>  
> +	ret = filemap_maybe_emit_fsnotify_event(vmf);
> +	if (unlikely(ret))
> +		return ret;
> +
>  	sb_start_pagefault(inode->i_sb);
>  	file_update_time(vmf->vma->vm_file);
>  
> -- 
> 2.43.0
>
Darrick J. Wong Aug. 30, 2024, 11:28 p.m. UTC | #2
On Thu, Aug 29, 2024 at 01:17:53PM +0200, Jan Kara wrote:
> On Wed 14-08-24 17:25:34, Josef Bacik wrote:
> > xfs has it's own handling for write faults, so we need to add the
> > pre-content fsnotify hook for this case.  Reads go through filemap_fault
> > so they're handled properly there.
> > 
> > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> 
> Looks good to me but it would be great to get explicit ack from some XFS
> guy...  Some selection CCed :)

Looks decent to me, but I wonder why xfs_write_fault has to invoke
filemap_maybe_emit_fsnotify_event itself?  Can that be done from
whatever calls ->page_mkwrite and friends?

--D

> 
> 								Honza
> 
> > ---
> >  fs/xfs/xfs_file.c | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
> > index 4cdc54dc9686..e61c4c389d7d 100644
> > --- a/fs/xfs/xfs_file.c
> > +++ b/fs/xfs/xfs_file.c
> > @@ -1283,6 +1283,10 @@ xfs_write_fault(
> >  	unsigned int		lock_mode = XFS_MMAPLOCK_SHARED;
> >  	vm_fault_t		ret;
> >  
> > +	ret = filemap_maybe_emit_fsnotify_event(vmf);
> > +	if (unlikely(ret))
> > +		return ret;
> > +
> >  	sb_start_pagefault(inode->i_sb);
> >  	file_update_time(vmf->vma->vm_file);
> >  
> > -- 
> > 2.43.0
> > 
> -- 
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
>
Jan Kara Sept. 2, 2024, 10:23 a.m. UTC | #3
On Fri 30-08-24 16:28:33, Darrick J. Wong wrote:
> On Thu, Aug 29, 2024 at 01:17:53PM +0200, Jan Kara wrote:
> > On Wed 14-08-24 17:25:34, Josef Bacik wrote:
> > > xfs has it's own handling for write faults, so we need to add the
> > > pre-content fsnotify hook for this case.  Reads go through filemap_fault
> > > so they're handled properly there.
> > > 
> > > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > 
> > Looks good to me but it would be great to get explicit ack from some XFS
> > guy...  Some selection CCed :)
> 
> Looks decent to me, but I wonder why xfs_write_fault has to invoke
> filemap_maybe_emit_fsnotify_event itself?  Can that be done from
> whatever calls ->page_mkwrite and friends?

So we were discussing this already here [1]. The options we have:

1) Call filemap_maybe_emit_fsnotify_event() from filesystem hooks
(filemap_fault() for those who use it). This is a bit ugly because it
requires modification to filesystems with their own fault handlers.

2) Call filemap_maybe_emit_fsnotify_event() from generic code before
calling ->fault() and if we needed to send event (and thus dropped
mmap_lock), we will retry the fault. This requires no special fs awareness
but the ->fault hook will then be called after retry most of the times on
HSM managed fs and thus without possibility to drop mmap_lock making
contention there possibly worse.

3) (I don't think we've discussed this option yet): Call
filemap_maybe_emit_fsnotify_event() in generic code before calling ->fault
and then continue to call ->fault even if we've dropped mmap_lock. This
will require changing calling convention for ->fault as vmf->vma must not
be touched after we've dropped mmap_lock and practically all users end up
using it to get vmf->vma->vm_file. With per-fs opt in flag to enable HSM
events this could be manageable but frankly I'm not convinced the
complicated calling convention would be better outcome than 1). But I'm
open for discussion.
								Honza

[1] https://lore.kernel.org/all/CAOQ4uxgXEzT=Buwu8SOkQG+2qcObmdH4NgsGme8bECObiobfTQ@mail.gmail.com
Christian Brauner Sept. 2, 2024, 11:19 a.m. UTC | #4
On Mon, Sep 02, 2024 at 12:23:44PM GMT, Jan Kara wrote:
> On Fri 30-08-24 16:28:33, Darrick J. Wong wrote:
> > On Thu, Aug 29, 2024 at 01:17:53PM +0200, Jan Kara wrote:
> > > On Wed 14-08-24 17:25:34, Josef Bacik wrote:
> > > > xfs has it's own handling for write faults, so we need to add the
> > > > pre-content fsnotify hook for this case.  Reads go through filemap_fault
> > > > so they're handled properly there.
> > > > 
> > > > Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> > > 
> > > Looks good to me but it would be great to get explicit ack from some XFS
> > > guy...  Some selection CCed :)
> > 
> > Looks decent to me, but I wonder why xfs_write_fault has to invoke
> > filemap_maybe_emit_fsnotify_event itself?  Can that be done from
> > whatever calls ->page_mkwrite and friends?
> 
> So we were discussing this already here [1]. The options we have:
> 
> 1) Call filemap_maybe_emit_fsnotify_event() from filesystem hooks

Sidenote: Can that be renamed to filemap_fsnotify() or something
similar. Especially that "maybe" in there really doesn't add value imho.
diff mbox series

Patch

diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
index 4cdc54dc9686..e61c4c389d7d 100644
--- a/fs/xfs/xfs_file.c
+++ b/fs/xfs/xfs_file.c
@@ -1283,6 +1283,10 @@  xfs_write_fault(
 	unsigned int		lock_mode = XFS_MMAPLOCK_SHARED;
 	vm_fault_t		ret;
 
+	ret = filemap_maybe_emit_fsnotify_event(vmf);
+	if (unlikely(ret))
+		return ret;
+
 	sb_start_pagefault(inode->i_sb);
 	file_update_time(vmf->vma->vm_file);