diff mbox series

fanotify: allow creating FAN_PRE_ACCESS events on directories

Message ID 20250402062707.1637811-1-amir73il@gmail.com (mailing list archive)
State New
Headers show
Series fanotify: allow creating FAN_PRE_ACCESS events on directories | expand

Commit Message

Amir Goldstein April 2, 2025, 6:27 a.m. UTC
Like files, a FAN_PRE_ACCESS event will be generated before every
read access to directory, that is on readdir(3).

Unlike files, there will be no range info record following a
FAN_PRE_ACCESS event, because the range of access on a directory
is not well defined.

FAN_PRE_ACCESS events on readdir are only generated when user opts-in
with FAN_ONDIR request in event mask and the FAN_PRE_ACCESS events on
readdir report the FAN_ONDIR flag, so user can differentiate them from
event on read.

An HSM service is expected to use those events to populate directories
from slower tier on first readdir access. Having to range info means
that the entire directory will need to be populated on the first
readdir() call.

Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---

Jan,

IIRC, the reason we did not allow FAN_ONDIR with FAN_PRE_ACCESS event
in initial API version was due to uncertainty around reporting range info.

Circling back to this, I do not see any better options other than not
reporting range info and reporting the FAN_ONDIR flag.

HSM only option is to populate the entire directory on first access.
Doing a partial range populate for directories does not seem practical
with exising POSIX semantics.

If you accept this claim, please consider fast tracking this change into
6.14.y.

LTP tests pushed to my fan_hsm branch.

Thanks,
Amir.

 fs/notify/fanotify/fanotify.c      | 11 +++++++----
 fs/notify/fanotify/fanotify_user.c |  9 ---------
 2 files changed, 7 insertions(+), 13 deletions(-)

Comments

Jan Kara April 3, 2025, 5:10 p.m. UTC | #1
On Wed 02-04-25 08:27:07, Amir Goldstein wrote:
> Like files, a FAN_PRE_ACCESS event will be generated before every
> read access to directory, that is on readdir(3).
> 
> Unlike files, there will be no range info record following a
> FAN_PRE_ACCESS event, because the range of access on a directory
> is not well defined.
> 
> FAN_PRE_ACCESS events on readdir are only generated when user opts-in
> with FAN_ONDIR request in event mask and the FAN_PRE_ACCESS events on
> readdir report the FAN_ONDIR flag, so user can differentiate them from
> event on read.
> 
> An HSM service is expected to use those events to populate directories
> from slower tier on first readdir access. Having to range info means
> that the entire directory will need to be populated on the first
> readdir() call.
> 
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
> 
> Jan,
> 
> IIRC, the reason we did not allow FAN_ONDIR with FAN_PRE_ACCESS event
> in initial API version was due to uncertainty around reporting range info.
> 
> Circling back to this, I do not see any better options other than not
> reporting range info and reporting the FAN_ONDIR flag.
> 
> HSM only option is to populate the entire directory on first access.
> Doing a partial range populate for directories does not seem practical
> with exising POSIX semantics.

I agree that range info for directory events doesn't make sense (or better
there's no way to have a generic implementation since everything is pretty
fs specific). If I remember our past discussion, filling in directory
content on open has unnecessarily high overhead because the user may then
just do e.g. lookup in the opened directory and not full readdir. That's
why you want to generate it on readdir. Correct?

> If you accept this claim, please consider fast tracking this change into
> 6.14.y.

Hum, why the rush? It is just additional feature to allow more efficient
filling in of directory entries...

> LTP tests pushed to my fan_hsm branch.

Thanks.

Otherwise the patch itself looks good so if my understanding is correct,
I'll pick it to my tree.

								Honza

>  fs/notify/fanotify/fanotify.c      | 11 +++++++----
>  fs/notify/fanotify/fanotify_user.c |  9 ---------
>  2 files changed, 7 insertions(+), 13 deletions(-)
> 
> diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
> index 6c877b3646ec..531c038eed7c 100644
> --- a/fs/notify/fanotify/fanotify.c
> +++ b/fs/notify/fanotify/fanotify.c
> @@ -303,8 +303,7 @@ static u32 fanotify_group_event_mask(struct fsnotify_group *group,
>  				     struct inode *dir)
>  {
>  	__u32 marks_mask = 0, marks_ignore_mask = 0;
> -	__u32 test_mask, user_mask = FANOTIFY_OUTGOING_EVENTS |
> -				     FANOTIFY_EVENT_FLAGS;
> +	__u32 test_mask, user_mask = FANOTIFY_OUTGOING_EVENTS;
>  	const struct path *path = fsnotify_data_path(data, data_type);
>  	unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
>  	struct fsnotify_mark *mark;
> @@ -356,6 +355,9 @@ static u32 fanotify_group_event_mask(struct fsnotify_group *group,
>  	 * the child entry name information, we report FAN_ONDIR for mkdir/rmdir
>  	 * so user can differentiate them from creat/unlink.
>  	 *
> +	 * For pre-content events we report FAN_ONDIR for readdir, so user can
> +	 * differentiate them from read.
> +	 *
>  	 * For backward compatibility and consistency, do not report FAN_ONDIR
>  	 * to user in legacy fanotify mode (reporting fd) and report FAN_ONDIR
>  	 * to user in fid mode for all event types.
> @@ -368,8 +370,9 @@ static u32 fanotify_group_event_mask(struct fsnotify_group *group,
>  		/* Do not report event flags without any event */
>  		if (!(test_mask & ~FANOTIFY_EVENT_FLAGS))
>  			return 0;
> -	} else {
> -		user_mask &= ~FANOTIFY_EVENT_FLAGS;
> +		user_mask |= FANOTIFY_EVENT_FLAGS;
> +	} else if (test_mask & FANOTIFY_PRE_CONTENT_EVENTS) {
> +		user_mask |= FAN_ONDIR;
>  	}
>  
>  	return test_mask & user_mask;
> diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
> index f2d840ae4ded..38cb9ba54842 100644
> --- a/fs/notify/fanotify/fanotify_user.c
> +++ b/fs/notify/fanotify/fanotify_user.c
> @@ -1410,11 +1410,6 @@ static int fanotify_may_update_existing_mark(struct fsnotify_mark *fsn_mark,
>  	    fsn_mark->flags & FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY)
>  		return -EEXIST;
>  
> -	/* For now pre-content events are not generated for directories */
> -	mask |= fsn_mark->mask;
> -	if (mask & FANOTIFY_PRE_CONTENT_EVENTS && mask & FAN_ONDIR)
> -		return -EEXIST;
> -
>  	return 0;
>  }
>  
> @@ -1956,10 +1951,6 @@ static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
>  	if (mask & FAN_RENAME && !(fid_mode & FAN_REPORT_NAME))
>  		return -EINVAL;
>  
> -	/* Pre-content events are not currently generated for directories. */
> -	if (mask & FANOTIFY_PRE_CONTENT_EVENTS && mask & FAN_ONDIR)
> -		return -EINVAL;
> -
>  	if (mark_cmd == FAN_MARK_FLUSH) {
>  		if (mark_type == FAN_MARK_MOUNT)
>  			fsnotify_clear_vfsmount_marks_by_group(group);
> -- 
> 2.34.1
>
Amir Goldstein April 3, 2025, 5:24 p.m. UTC | #2
On Thu, Apr 3, 2025 at 7:10 PM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 02-04-25 08:27:07, Amir Goldstein wrote:
> > Like files, a FAN_PRE_ACCESS event will be generated before every
> > read access to directory, that is on readdir(3).
> >
> > Unlike files, there will be no range info record following a
> > FAN_PRE_ACCESS event, because the range of access on a directory
> > is not well defined.
> >
> > FAN_PRE_ACCESS events on readdir are only generated when user opts-in
> > with FAN_ONDIR request in event mask and the FAN_PRE_ACCESS events on
> > readdir report the FAN_ONDIR flag, so user can differentiate them from
> > event on read.
> >
> > An HSM service is expected to use those events to populate directories
> > from slower tier on first readdir access. Having to range info means
> > that the entire directory will need to be populated on the first
> > readdir() call.
> >
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> >
> > Jan,
> >
> > IIRC, the reason we did not allow FAN_ONDIR with FAN_PRE_ACCESS event
> > in initial API version was due to uncertainty around reporting range info.
> >
> > Circling back to this, I do not see any better options other than not
> > reporting range info and reporting the FAN_ONDIR flag.
> >
> > HSM only option is to populate the entire directory on first access.
> > Doing a partial range populate for directories does not seem practical
> > with exising POSIX semantics.
>
> I agree that range info for directory events doesn't make sense (or better
> there's no way to have a generic implementation since everything is pretty
> fs specific). If I remember our past discussion, filling in directory
> content on open has unnecessarily high overhead because the user may then
> just do e.g. lookup in the opened directory and not full readdir. That's
> why you want to generate it on readdir. Correct?
>

Right.

> > If you accept this claim, please consider fast tracking this change into
> > 6.14.y.
>
> Hum, why the rush? It is just additional feature to allow more efficient
> filling in of directory entries...
>

Well, no rush really.

My incentive is not having to confuse users with documentation that
version X supports FAN_PRE_ACCESS but only version Y supports
it with FAN_ONDIR.

It's not a big deal, but if we have no reason to delay this, I'd just
treat it as a fix to the new api (removing unneeded limitations).

I would point out that FAN_ACCESS_PERM already works
for directories and in effect provides (almost) the exact same
functionality as FAN_PRE_ACCESS without range info.

But in order to get the FAN_ACCESS_PERM events on directories
listener would also be forced to get FAN_ACCESS_PERM on
special files and regular files
and assuming that this user is an HSM, it cannot request
FAN_ACCESS_PERM|FAN_ONDIR in the same mask as
FAN_PRE_ACCESS (which it needs for files) so it will need to
open another group for populating directories.

So that's why I would maybe consider this a last minute fix to the new API.

Thanks,
Amir.
diff mbox series

Patch

diff --git a/fs/notify/fanotify/fanotify.c b/fs/notify/fanotify/fanotify.c
index 6c877b3646ec..531c038eed7c 100644
--- a/fs/notify/fanotify/fanotify.c
+++ b/fs/notify/fanotify/fanotify.c
@@ -303,8 +303,7 @@  static u32 fanotify_group_event_mask(struct fsnotify_group *group,
 				     struct inode *dir)
 {
 	__u32 marks_mask = 0, marks_ignore_mask = 0;
-	__u32 test_mask, user_mask = FANOTIFY_OUTGOING_EVENTS |
-				     FANOTIFY_EVENT_FLAGS;
+	__u32 test_mask, user_mask = FANOTIFY_OUTGOING_EVENTS;
 	const struct path *path = fsnotify_data_path(data, data_type);
 	unsigned int fid_mode = FAN_GROUP_FLAG(group, FANOTIFY_FID_BITS);
 	struct fsnotify_mark *mark;
@@ -356,6 +355,9 @@  static u32 fanotify_group_event_mask(struct fsnotify_group *group,
 	 * the child entry name information, we report FAN_ONDIR for mkdir/rmdir
 	 * so user can differentiate them from creat/unlink.
 	 *
+	 * For pre-content events we report FAN_ONDIR for readdir, so user can
+	 * differentiate them from read.
+	 *
 	 * For backward compatibility and consistency, do not report FAN_ONDIR
 	 * to user in legacy fanotify mode (reporting fd) and report FAN_ONDIR
 	 * to user in fid mode for all event types.
@@ -368,8 +370,9 @@  static u32 fanotify_group_event_mask(struct fsnotify_group *group,
 		/* Do not report event flags without any event */
 		if (!(test_mask & ~FANOTIFY_EVENT_FLAGS))
 			return 0;
-	} else {
-		user_mask &= ~FANOTIFY_EVENT_FLAGS;
+		user_mask |= FANOTIFY_EVENT_FLAGS;
+	} else if (test_mask & FANOTIFY_PRE_CONTENT_EVENTS) {
+		user_mask |= FAN_ONDIR;
 	}
 
 	return test_mask & user_mask;
diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index f2d840ae4ded..38cb9ba54842 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -1410,11 +1410,6 @@  static int fanotify_may_update_existing_mark(struct fsnotify_mark *fsn_mark,
 	    fsn_mark->flags & FSNOTIFY_MARK_FLAG_IGNORED_SURV_MODIFY)
 		return -EEXIST;
 
-	/* For now pre-content events are not generated for directories */
-	mask |= fsn_mark->mask;
-	if (mask & FANOTIFY_PRE_CONTENT_EVENTS && mask & FAN_ONDIR)
-		return -EEXIST;
-
 	return 0;
 }
 
@@ -1956,10 +1951,6 @@  static int do_fanotify_mark(int fanotify_fd, unsigned int flags, __u64 mask,
 	if (mask & FAN_RENAME && !(fid_mode & FAN_REPORT_NAME))
 		return -EINVAL;
 
-	/* Pre-content events are not currently generated for directories. */
-	if (mask & FANOTIFY_PRE_CONTENT_EVENTS && mask & FAN_ONDIR)
-		return -EINVAL;
-
 	if (mark_cmd == FAN_MARK_FLUSH) {
 		if (mark_type == FAN_MARK_MOUNT)
 			fsnotify_clear_vfsmount_marks_by_group(group);