diff mbox series

fanotify: allow reporting errors on failure to open fd

Message ID 20240927125624.2198202-1-amir73il@gmail.com (mailing list archive)
State New
Headers show
Series fanotify: allow reporting errors on failure to open fd | expand

Commit Message

Amir Goldstein Sept. 27, 2024, 12:56 p.m. UTC
When working in "fd mode", fanotify_read() needs to open an fd
from a dentry to report event->fd to userspace.

Opening an fd from dentry can fail for several reasons.
For example, when tasks are gone and we try to open their
/proc files or we try to open a WRONLY file like in sysfs
or when trying to open a file that was deleted on the
remote network server.

Add a new flag FAN_REPORT_FD_ERROR for fanotify_init().
For a group with FAN_REPORT_FD_ERROR, we will send the
event with the error instead of the open fd, otherwise
userspace may not get the error at all.

In any case, userspace will not know which file failed to
open, so leave a warning in ksmg for further investigation.

Reported-by: Krishna Vivek Vitta <kvitta@microsoft.com>
Closes: https://lore.kernel.org/linux-fsdevel/SI2P153MB07182F3424619EDDD1F393EED46D2@SI2P153MB0718.APCP153.PROD.OUTLOOK.COM/
Signed-off-by: Amir Goldstein <amir73il@gmail.com>
---

Jan,

This is my proposal for a slightly better UAPI for error reporting.
I have a vague memory that we discussed this before and that you preferred
to report errno in an extra info field (?), but I have a strong repulsion
from this altenative, which seems like way over design for the case.

Here is what it looks like with an enhanced fanotify_example [1]
and the reproducer of the 9p open unlinked file issue [2]:

$ ./fanotify_example /vtmp/
Press enter key to terminate.
Listening for events.
FAN_OPEN_PERM: File /vtmp/config.lock
FAN_CLOSE_WRITE: fd open failed: No such file or directory

And the warning in kmsg:
[ 1836.619957] fanotify: create_fd(/config.lock) failed err=-2

fanotify_read() can still return an error with FAN_REPORT_FD_ERROR,
but not for a failure to open an fd.

WDYT?

Thanks,
Amir.

[1] https://github.com/amir73il/fsnotify-utils/blob/fan_report_fd_error/src/test/fanotify_example.c
[2] https://lore.kernel.org/linux-fsdevel/CAOQ4uxgRnzB0E2ESeqgZBHW++zyRj8-VmvB38Vxm5OXgr=EM9g@mail.gmail.com/

 fs/notify/fanotify/fanotify_user.c | 42 ++++++++++++++++++------------
 include/linux/fanotify.h           |  1 +
 include/uapi/linux/fanotify.h      |  1 +
 3 files changed, 28 insertions(+), 16 deletions(-)

Comments

Jan Kara Sept. 30, 2024, 3:42 p.m. UTC | #1
On Fri 27-09-24 14:56:24, Amir Goldstein wrote:
> When working in "fd mode", fanotify_read() needs to open an fd
> from a dentry to report event->fd to userspace.
> 
> Opening an fd from dentry can fail for several reasons.
> For example, when tasks are gone and we try to open their
> /proc files or we try to open a WRONLY file like in sysfs
> or when trying to open a file that was deleted on the
> remote network server.
> 
> Add a new flag FAN_REPORT_FD_ERROR for fanotify_init().
> For a group with FAN_REPORT_FD_ERROR, we will send the
> event with the error instead of the open fd, otherwise
> userspace may not get the error at all.
> 
> In any case, userspace will not know which file failed to
> open, so leave a warning in ksmg for further investigation.
> 
> Reported-by: Krishna Vivek Vitta <kvitta@microsoft.com>
> Closes: https://lore.kernel.org/linux-fsdevel/SI2P153MB07182F3424619EDDD1F393EED46D2@SI2P153MB0718.APCP153.PROD.OUTLOOK.COM/
> Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> ---
> 
> Jan,
> 
> This is my proposal for a slightly better UAPI for error reporting.
> I have a vague memory that we discussed this before and that you preferred
> to report errno in an extra info field (?), but I have a strong repulsion
> from this altenative, which seems like way over design for the case.

Hum, I don't remember a proposal for extra info field to hold errno. What I
rather think we talked about was that we would return only the successfully
formatted events, push back the problematic one and on next read from
fanotify group the first event will be the one with error so that will get
returned to userspace. Now this would work but I agree that from userspace
it is kind of difficult to know what went wrong when the read failed (were
the arguments somehow wrong, is this temporary or permanent problem, is it
the fd or something else in the event, etc.) so reporting the error in
place of fd looks like a more convenient option.

But I wonder: Do we really need to report the error code? We already have
FAN_NOFD with -1 value (which corresponds to EPERM), with pidfd we are
reporting FAN_EPIDFD when its open fails so here we could have FAN_EFD ==
-2 in case opening of fd fails for whatever reason?

>  	if (!FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
>  	    path && path->mnt && path->dentry) {
>  		fd = create_fd(group, path, &f);
> -		if (fd < 0)
> -			return fd;
> +		/*
> +		 * Opening an fd from dentry can fail for several reasons.
> +		 * For example, when tasks are gone and we try to open their
> +		 * /proc files or we try to open a WRONLY file like in sysfs
> +		 * or when trying to open a file that was deleted on the
> +		 * remote network server.
> +		 *
> +		 * For a group with FAN_REPORT_FD_ERROR, we will send the
> +		 * event with the error instead of the open fd, otherwise
> +		 * Userspace may not get the error at all.
> +		 * In any case, userspace will not know which file failed to
> +		 * open, so leave a warning in ksmg for further investigation.
> +		 */
> +		if (fd < 0) {
> +			pr_warn_ratelimited("fanotify: create_fd(%pd2) failed err=%d\n",
> +					    path->dentry, fd);

This is triggerable only by priviledged user so it is not a huge issue but
it still seems wrong that we spam kernel logs with warnings on more or less
normal operation. It is unrealistic that userspace would scrape the logs to
extract these names and furthermove without full path they are not even
telling much. If anything, I'd be willing to accept pr_debug() here which
sysadmin can selectively enable to ease debugging.

								Honza
Amir Goldstein Sept. 30, 2024, 4:14 p.m. UTC | #2
On Mon, Sep 30, 2024 at 5:42 PM Jan Kara <jack@suse.cz> wrote:
>
> On Fri 27-09-24 14:56:24, Amir Goldstein wrote:
> > When working in "fd mode", fanotify_read() needs to open an fd
> > from a dentry to report event->fd to userspace.
> >
> > Opening an fd from dentry can fail for several reasons.
> > For example, when tasks are gone and we try to open their
> > /proc files or we try to open a WRONLY file like in sysfs
> > or when trying to open a file that was deleted on the
> > remote network server.
> >
> > Add a new flag FAN_REPORT_FD_ERROR for fanotify_init().
> > For a group with FAN_REPORT_FD_ERROR, we will send the
> > event with the error instead of the open fd, otherwise
> > userspace may not get the error at all.
> >
> > In any case, userspace will not know which file failed to
> > open, so leave a warning in ksmg for further investigation.
> >
> > Reported-by: Krishna Vivek Vitta <kvitta@microsoft.com>
> > Closes: https://lore.kernel.org/linux-fsdevel/SI2P153MB07182F3424619EDDD1F393EED46D2@SI2P153MB0718.APCP153.PROD.OUTLOOK.COM/
> > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > ---
> >
> > Jan,
> >
> > This is my proposal for a slightly better UAPI for error reporting.
> > I have a vague memory that we discussed this before and that you preferred
> > to report errno in an extra info field (?), but I have a strong repulsion
> > from this altenative, which seems like way over design for the case.
>
> Hum, I don't remember a proposal for extra info field to hold errno. What I
> rather think we talked about was that we would return only the successfully
> formatted events, push back the problematic one and on next read from
> fanotify group the first event will be the one with error so that will get
> returned to userspace. Now this would work but I agree that from userspace
> it is kind of difficult to know what went wrong when the read failed (were
> the arguments somehow wrong, is this temporary or permanent problem, is it
> the fd or something else in the event, etc.) so reporting the error in
> place of fd looks like a more convenient option.
>
> But I wonder: Do we really need to report the error code? We already have
> FAN_NOFD with -1 value (which corresponds to EPERM), with pidfd we are
> reporting FAN_EPIDFD when its open fails so here we could have FAN_EFD ==
> -2 in case opening of fd fails for whatever reason?
>

Well it is hard as it is to understand that went wrong, so the error
codes provide some clues for the bug report.
ENOENT, ENXIO, EROFS kind of point to the likely reason of
failures, so it does not make sense for me to hide this information,
which is available.

> >       if (!FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
> >           path && path->mnt && path->dentry) {
> >               fd = create_fd(group, path, &f);
> > -             if (fd < 0)
> > -                     return fd;
> > +             /*
> > +              * Opening an fd from dentry can fail for several reasons.
> > +              * For example, when tasks are gone and we try to open their
> > +              * /proc files or we try to open a WRONLY file like in sysfs
> > +              * or when trying to open a file that was deleted on the
> > +              * remote network server.
> > +              *
> > +              * For a group with FAN_REPORT_FD_ERROR, we will send the
> > +              * event with the error instead of the open fd, otherwise
> > +              * Userspace may not get the error at all.
> > +              * In any case, userspace will not know which file failed to
> > +              * open, so leave a warning in ksmg for further investigation.
> > +              */
> > +             if (fd < 0) {
> > +                     pr_warn_ratelimited("fanotify: create_fd(%pd2) failed err=%d\n",
> > +                                         path->dentry, fd);
>
> This is triggerable only by priviledged user so it is not a huge issue but
> it still seems wrong that we spam kernel logs with warnings on more or less
> normal operation. It is unrealistic that userspace would scrape the logs to
> extract these names and furthermove without full path they are not even
> telling much. If anything, I'd be willing to accept pr_debug() here which
> sysadmin can selectively enable to ease debugging.

Even without the full path I could easily understand which file was
failing the event in git clone, but sure, pr_debug is a decent compromise.

Thanks,
Amir.
Jan Kara Oct. 2, 2024, 1:01 p.m. UTC | #3
On Mon 30-09-24 18:14:33, Amir Goldstein wrote:
> On Mon, Sep 30, 2024 at 5:42 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Fri 27-09-24 14:56:24, Amir Goldstein wrote:
> > > When working in "fd mode", fanotify_read() needs to open an fd
> > > from a dentry to report event->fd to userspace.
> > >
> > > Opening an fd from dentry can fail for several reasons.
> > > For example, when tasks are gone and we try to open their
> > > /proc files or we try to open a WRONLY file like in sysfs
> > > or when trying to open a file that was deleted on the
> > > remote network server.
> > >
> > > Add a new flag FAN_REPORT_FD_ERROR for fanotify_init().
> > > For a group with FAN_REPORT_FD_ERROR, we will send the
> > > event with the error instead of the open fd, otherwise
> > > userspace may not get the error at all.
> > >
> > > In any case, userspace will not know which file failed to
> > > open, so leave a warning in ksmg for further investigation.
> > >
> > > Reported-by: Krishna Vivek Vitta <kvitta@microsoft.com>
> > > Closes: https://lore.kernel.org/linux-fsdevel/SI2P153MB07182F3424619EDDD1F393EED46D2@SI2P153MB0718.APCP153.PROD.OUTLOOK.COM/
> > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > ---
> > >
> > > Jan,
> > >
> > > This is my proposal for a slightly better UAPI for error reporting.
> > > I have a vague memory that we discussed this before and that you preferred
> > > to report errno in an extra info field (?), but I have a strong repulsion
> > > from this altenative, which seems like way over design for the case.
> >
> > Hum, I don't remember a proposal for extra info field to hold errno. What I
> > rather think we talked about was that we would return only the successfully
> > formatted events, push back the problematic one and on next read from
> > fanotify group the first event will be the one with error so that will get
> > returned to userspace. Now this would work but I agree that from userspace
> > it is kind of difficult to know what went wrong when the read failed (were
> > the arguments somehow wrong, is this temporary or permanent problem, is it
> > the fd or something else in the event, etc.) so reporting the error in
> > place of fd looks like a more convenient option.
> >
> > But I wonder: Do we really need to report the error code? We already have
> > FAN_NOFD with -1 value (which corresponds to EPERM), with pidfd we are
> > reporting FAN_EPIDFD when its open fails so here we could have FAN_EFD ==
> > -2 in case opening of fd fails for whatever reason?
> >
> 
> Well it is hard as it is to understand that went wrong, so the error
> codes provide some clues for the bug report.
> ENOENT, ENXIO, EROFS kind of point to the likely reason of
> failures, so it does not make sense for me to hide this information,
> which is available.

OK, fair enough. I was kind of hoping we could avoid the feature flag but
probably we cannot even if we added just FAN_EFD. But I still have a bit of
problem with FAN_NOFD overlapping with -EPERM. I guess it kind of makes
sense to return -EPERM in that field for unpriviledged groups but we return
FAN_NOFD also for events without path attached and there it gets
somewhat confusing... Perhaps we should refuse FAN_REPORT_FD_ERROR for
groups in fid mode? That would still leave overflow events so instead of
setting fd to FAN_NOFD, we could set it to -EINVAL to preserve the property
that fd is either -errno or fd number?

And then I have a second question about pidfd. Should FAN_REPORT_FD_ERROR
influence it in the same way? Something like -ESRCH if the process already
exited and otherwise pass back the errno?

								Honza

> > >       if (!FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
> > >           path && path->mnt && path->dentry) {
> > >               fd = create_fd(group, path, &f);
> > > -             if (fd < 0)
> > > -                     return fd;
> > > +             /*
> > > +              * Opening an fd from dentry can fail for several reasons.
> > > +              * For example, when tasks are gone and we try to open their
> > > +              * /proc files or we try to open a WRONLY file like in sysfs
> > > +              * or when trying to open a file that was deleted on the
> > > +              * remote network server.
> > > +              *
> > > +              * For a group with FAN_REPORT_FD_ERROR, we will send the
> > > +              * event with the error instead of the open fd, otherwise
> > > +              * Userspace may not get the error at all.
> > > +              * In any case, userspace will not know which file failed to
> > > +              * open, so leave a warning in ksmg for further investigation.
> > > +              */
> > > +             if (fd < 0) {
> > > +                     pr_warn_ratelimited("fanotify: create_fd(%pd2) failed err=%d\n",
> > > +                                         path->dentry, fd);
> >
> > This is triggerable only by priviledged user so it is not a huge issue but
> > it still seems wrong that we spam kernel logs with warnings on more or less
> > normal operation. It is unrealistic that userspace would scrape the logs to
> > extract these names and furthermove without full path they are not even
> > telling much. If anything, I'd be willing to accept pr_debug() here which
> > sysadmin can selectively enable to ease debugging.
> 
> Even without the full path I could easily understand which file was
> failing the event in git clone, but sure, pr_debug is a decent compromise.
> 
> Thanks,
> Amir.
Amir Goldstein Oct. 2, 2024, 1:54 p.m. UTC | #4
On Wed, Oct 2, 2024 at 3:01 PM Jan Kara <jack@suse.cz> wrote:
>
> On Mon 30-09-24 18:14:33, Amir Goldstein wrote:
> > On Mon, Sep 30, 2024 at 5:42 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Fri 27-09-24 14:56:24, Amir Goldstein wrote:
> > > > When working in "fd mode", fanotify_read() needs to open an fd
> > > > from a dentry to report event->fd to userspace.
> > > >
> > > > Opening an fd from dentry can fail for several reasons.
> > > > For example, when tasks are gone and we try to open their
> > > > /proc files or we try to open a WRONLY file like in sysfs
> > > > or when trying to open a file that was deleted on the
> > > > remote network server.
> > > >
> > > > Add a new flag FAN_REPORT_FD_ERROR for fanotify_init().
> > > > For a group with FAN_REPORT_FD_ERROR, we will send the
> > > > event with the error instead of the open fd, otherwise
> > > > userspace may not get the error at all.
> > > >
> > > > In any case, userspace will not know which file failed to
> > > > open, so leave a warning in ksmg for further investigation.
> > > >
> > > > Reported-by: Krishna Vivek Vitta <kvitta@microsoft.com>
> > > > Closes: https://lore.kernel.org/linux-fsdevel/SI2P153MB07182F3424619EDDD1F393EED46D2@SI2P153MB0718.APCP153.PROD.OUTLOOK.COM/
> > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > > ---
> > > >
> > > > Jan,
> > > >
> > > > This is my proposal for a slightly better UAPI for error reporting.
> > > > I have a vague memory that we discussed this before and that you preferred
> > > > to report errno in an extra info field (?), but I have a strong repulsion
> > > > from this altenative, which seems like way over design for the case.
> > >
> > > Hum, I don't remember a proposal for extra info field to hold errno. What I
> > > rather think we talked about was that we would return only the successfully
> > > formatted events, push back the problematic one and on next read from
> > > fanotify group the first event will be the one with error so that will get
> > > returned to userspace. Now this would work but I agree that from userspace
> > > it is kind of difficult to know what went wrong when the read failed (were
> > > the arguments somehow wrong, is this temporary or permanent problem, is it
> > > the fd or something else in the event, etc.) so reporting the error in
> > > place of fd looks like a more convenient option.
> > >
> > > But I wonder: Do we really need to report the error code? We already have
> > > FAN_NOFD with -1 value (which corresponds to EPERM), with pidfd we are
> > > reporting FAN_EPIDFD when its open fails so here we could have FAN_EFD ==
> > > -2 in case opening of fd fails for whatever reason?
> > >
> >
> > Well it is hard as it is to understand that went wrong, so the error
> > codes provide some clues for the bug report.
> > ENOENT, ENXIO, EROFS kind of point to the likely reason of
> > failures, so it does not make sense for me to hide this information,
> > which is available.
>
> OK, fair enough. I was kind of hoping we could avoid the feature flag but
> probably we cannot even if we added just FAN_EFD. But I still have a bit of
> problem with FAN_NOFD overlapping with -EPERM. I guess it kind of makes
> sense to return -EPERM in that field for unpriviledged groups but we return
> FAN_NOFD also for events without path attached and there it gets
> somewhat confusing... Perhaps we should refuse FAN_REPORT_FD_ERROR for
> groups in fid mode?

Makes sense.

> That would still leave overflow events so instead of
> setting fd to FAN_NOFD, we could set it to -EINVAL to preserve the property
> that fd is either -errno or fd number?
>

EOVERFLOW? nah, witty but irrelevant.
I think EBADF would be a good substitute for FAN_NOFD,
but I can live with EINVAL as well.

> And then I have a second question about pidfd. Should FAN_REPORT_FD_ERROR
> influence it in the same way? Something like -ESRCH if the process already
> exited and otherwise pass back the errno?

Yeh that sounds useful.

Thanks,
Amir.
Jan Kara Oct. 2, 2024, 2:47 p.m. UTC | #5
On Wed 02-10-24 15:54:02, Amir Goldstein wrote:
> On Wed, Oct 2, 2024 at 3:01 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Mon 30-09-24 18:14:33, Amir Goldstein wrote:
> > > On Mon, Sep 30, 2024 at 5:42 PM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > On Fri 27-09-24 14:56:24, Amir Goldstein wrote:
> > > > > When working in "fd mode", fanotify_read() needs to open an fd
> > > > > from a dentry to report event->fd to userspace.
> > > > >
> > > > > Opening an fd from dentry can fail for several reasons.
> > > > > For example, when tasks are gone and we try to open their
> > > > > /proc files or we try to open a WRONLY file like in sysfs
> > > > > or when trying to open a file that was deleted on the
> > > > > remote network server.
> > > > >
> > > > > Add a new flag FAN_REPORT_FD_ERROR for fanotify_init().
> > > > > For a group with FAN_REPORT_FD_ERROR, we will send the
> > > > > event with the error instead of the open fd, otherwise
> > > > > userspace may not get the error at all.
> > > > >
> > > > > In any case, userspace will not know which file failed to
> > > > > open, so leave a warning in ksmg for further investigation.
> > > > >
> > > > > Reported-by: Krishna Vivek Vitta <kvitta@microsoft.com>
> > > > > Closes: https://lore.kernel.org/linux-fsdevel/SI2P153MB07182F3424619EDDD1F393EED46D2@SI2P153MB0718.APCP153.PROD.OUTLOOK.COM/
> > > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > > > ---
> > > > >
> > > > > Jan,
> > > > >
> > > > > This is my proposal for a slightly better UAPI for error reporting.
> > > > > I have a vague memory that we discussed this before and that you preferred
> > > > > to report errno in an extra info field (?), but I have a strong repulsion
> > > > > from this altenative, which seems like way over design for the case.
> > > >
> > > > Hum, I don't remember a proposal for extra info field to hold errno. What I
> > > > rather think we talked about was that we would return only the successfully
> > > > formatted events, push back the problematic one and on next read from
> > > > fanotify group the first event will be the one with error so that will get
> > > > returned to userspace. Now this would work but I agree that from userspace
> > > > it is kind of difficult to know what went wrong when the read failed (were
> > > > the arguments somehow wrong, is this temporary or permanent problem, is it
> > > > the fd or something else in the event, etc.) so reporting the error in
> > > > place of fd looks like a more convenient option.
> > > >
> > > > But I wonder: Do we really need to report the error code? We already have
> > > > FAN_NOFD with -1 value (which corresponds to EPERM), with pidfd we are
> > > > reporting FAN_EPIDFD when its open fails so here we could have FAN_EFD ==
> > > > -2 in case opening of fd fails for whatever reason?
> > > >
> > >
> > > Well it is hard as it is to understand that went wrong, so the error
> > > codes provide some clues for the bug report.
> > > ENOENT, ENXIO, EROFS kind of point to the likely reason of
> > > failures, so it does not make sense for me to hide this information,
> > > which is available.
> >
> > OK, fair enough. I was kind of hoping we could avoid the feature flag but
> > probably we cannot even if we added just FAN_EFD. But I still have a bit of
> > problem with FAN_NOFD overlapping with -EPERM. I guess it kind of makes
> > sense to return -EPERM in that field for unpriviledged groups but we return
> > FAN_NOFD also for events without path attached and there it gets
> > somewhat confusing... Perhaps we should refuse FAN_REPORT_FD_ERROR for
> > groups in fid mode?
> 
> Makes sense.
> 
> > That would still leave overflow events so instead of
> > setting fd to FAN_NOFD, we could set it to -EINVAL to preserve the property
> > that fd is either -errno or fd number?
> >
> 
> EOVERFLOW? nah, witty but irrelevant.
> I think EBADF would be a good substitute for FAN_NOFD,
> but I can live with EINVAL as well.

EBADF is fine with me, probably even better.

> > And then I have a second question about pidfd. Should FAN_REPORT_FD_ERROR
> > influence it in the same way? Something like -ESRCH if the process already
> > exited and otherwise pass back the errno?
> 
> Yeh that sounds useful.

OK, I guess we have an agreement :) Will you send an updated patch please?

								Honza
Amir Goldstein Oct. 3, 2024, 9:18 a.m. UTC | #6
On Wed, Oct 2, 2024 at 4:47 PM Jan Kara <jack@suse.cz> wrote:
>
> On Wed 02-10-24 15:54:02, Amir Goldstein wrote:
> > On Wed, Oct 2, 2024 at 3:01 PM Jan Kara <jack@suse.cz> wrote:
> > >
> > > On Mon 30-09-24 18:14:33, Amir Goldstein wrote:
> > > > On Mon, Sep 30, 2024 at 5:42 PM Jan Kara <jack@suse.cz> wrote:
> > > > >
> > > > > On Fri 27-09-24 14:56:24, Amir Goldstein wrote:
> > > > > > When working in "fd mode", fanotify_read() needs to open an fd
> > > > > > from a dentry to report event->fd to userspace.
> > > > > >
> > > > > > Opening an fd from dentry can fail for several reasons.
> > > > > > For example, when tasks are gone and we try to open their
> > > > > > /proc files or we try to open a WRONLY file like in sysfs
> > > > > > or when trying to open a file that was deleted on the
> > > > > > remote network server.
> > > > > >
> > > > > > Add a new flag FAN_REPORT_FD_ERROR for fanotify_init().
> > > > > > For a group with FAN_REPORT_FD_ERROR, we will send the
> > > > > > event with the error instead of the open fd, otherwise
> > > > > > userspace may not get the error at all.
> > > > > >
> > > > > > In any case, userspace will not know which file failed to
> > > > > > open, so leave a warning in ksmg for further investigation.
> > > > > >
> > > > > > Reported-by: Krishna Vivek Vitta <kvitta@microsoft.com>
> > > > > > Closes: https://lore.kernel.org/linux-fsdevel/SI2P153MB07182F3424619EDDD1F393EED46D2@SI2P153MB0718.APCP153.PROD.OUTLOOK.COM/
> > > > > > Signed-off-by: Amir Goldstein <amir73il@gmail.com>
> > > > > > ---
> > > > > >
> > > > > > Jan,
> > > > > >
> > > > > > This is my proposal for a slightly better UAPI for error reporting.
> > > > > > I have a vague memory that we discussed this before and that you preferred
> > > > > > to report errno in an extra info field (?), but I have a strong repulsion
> > > > > > from this altenative, which seems like way over design for the case.
> > > > >
> > > > > Hum, I don't remember a proposal for extra info field to hold errno. What I
> > > > > rather think we talked about was that we would return only the successfully
> > > > > formatted events, push back the problematic one and on next read from
> > > > > fanotify group the first event will be the one with error so that will get
> > > > > returned to userspace. Now this would work but I agree that from userspace
> > > > > it is kind of difficult to know what went wrong when the read failed (were
> > > > > the arguments somehow wrong, is this temporary or permanent problem, is it
> > > > > the fd or something else in the event, etc.) so reporting the error in
> > > > > place of fd looks like a more convenient option.
> > > > >
> > > > > But I wonder: Do we really need to report the error code? We already have
> > > > > FAN_NOFD with -1 value (which corresponds to EPERM), with pidfd we are
> > > > > reporting FAN_EPIDFD when its open fails so here we could have FAN_EFD ==
> > > > > -2 in case opening of fd fails for whatever reason?
> > > > >
> > > >
> > > > Well it is hard as it is to understand that went wrong, so the error
> > > > codes provide some clues for the bug report.
> > > > ENOENT, ENXIO, EROFS kind of point to the likely reason of
> > > > failures, so it does not make sense for me to hide this information,
> > > > which is available.
> > >
> > > OK, fair enough. I was kind of hoping we could avoid the feature flag but
> > > probably we cannot even if we added just FAN_EFD. But I still have a bit of
> > > problem with FAN_NOFD overlapping with -EPERM. I guess it kind of makes
> > > sense to return -EPERM in that field for unpriviledged groups but we return
> > > FAN_NOFD also for events without path attached and there it gets
> > > somewhat confusing... Perhaps we should refuse FAN_REPORT_FD_ERROR for
> > > groups in fid mode?
> >
> > Makes sense.
> >
> > > That would still leave overflow events so instead of
> > > setting fd to FAN_NOFD, we could set it to -EINVAL to preserve the property
> > > that fd is either -errno or fd number?
> > >
> >
> > EOVERFLOW? nah, witty but irrelevant.
> > I think EBADF would be a good substitute for FAN_NOFD,
> > but I can live with EINVAL as well.
>
> EBADF is fine with me, probably even better.
>
> > > And then I have a second question about pidfd. Should FAN_REPORT_FD_ERROR
> > > influence it in the same way? Something like -ESRCH if the process already
> > > exited and otherwise pass back the errno?
> >
> > Yeh that sounds useful.
>
> OK, I guess we have an agreement :) Will you send an updated patch please?
>

Sure. Will do.

Please note that I intend to funnel this API change to stable.
That is why I have annotated the patch with Reported-by and Closes.
While this is not a traditional stable backport candidate, this API can
really be used by the user that reported the issue to solve a problem
or get better insights once the distro picks up this API.

Also, considering the recent backport of the entire fanotify API from
v6.6 back to v5.10, I no longer feel compelled to obey no the legacy
no new features rule for the stable trees.

Thanks,
Amir.
diff mbox series

Patch

diff --git a/fs/notify/fanotify/fanotify_user.c b/fs/notify/fanotify/fanotify_user.c
index 13454e5fd3fb..80917814981c 100644
--- a/fs/notify/fanotify/fanotify_user.c
+++ b/fs/notify/fanotify/fanotify_user.c
@@ -266,13 +266,6 @@  static int create_fd(struct fsnotify_group *group, const struct path *path,
 			       group->fanotify_data.f_flags | __FMODE_NONOTIFY,
 			       current_cred());
 	if (IS_ERR(new_file)) {
-		/*
-		 * we still send an event even if we can't open the file.  this
-		 * can happen when say tasks are gone and we try to open their
-		 * /proc files or we try to open a WRONLY file like in sysfs
-		 * we just send the errno to userspace since there isn't much
-		 * else we can do.
-		 */
 		put_unused_fd(client_fd);
 		client_fd = PTR_ERR(new_file);
 	} else {
@@ -691,8 +684,25 @@  static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	if (!FAN_GROUP_FLAG(group, FANOTIFY_UNPRIV) &&
 	    path && path->mnt && path->dentry) {
 		fd = create_fd(group, path, &f);
-		if (fd < 0)
-			return fd;
+		/*
+		 * Opening an fd from dentry can fail for several reasons.
+		 * For example, when tasks are gone and we try to open their
+		 * /proc files or we try to open a WRONLY file like in sysfs
+		 * or when trying to open a file that was deleted on the
+		 * remote network server.
+		 *
+		 * For a group with FAN_REPORT_FD_ERROR, we will send the
+		 * event with the error instead of the open fd, otherwise
+		 * Userspace may not get the error at all.
+		 * In any case, userspace will not know which file failed to
+		 * open, so leave a warning in ksmg for further investigation.
+		 */
+		if (fd < 0) {
+			pr_warn_ratelimited("fanotify: create_fd(%pd2) failed err=%d\n",
+					    path->dentry, fd);
+			if (!FAN_GROUP_FLAG(group, FAN_REPORT_FD_ERROR))
+				return fd;
+		}
 	}
 	metadata.fd = fd;
 
@@ -737,9 +747,6 @@  static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	buf += FAN_EVENT_METADATA_LEN;
 	count -= FAN_EVENT_METADATA_LEN;
 
-	if (fanotify_is_perm_event(event->mask))
-		FANOTIFY_PERM(event)->fd = fd;
-
 	if (info_mode) {
 		ret = copy_info_records_to_user(event, info, info_mode, pidfd,
 						buf, count);
@@ -753,15 +760,18 @@  static ssize_t copy_event_to_user(struct fsnotify_group *group,
 	if (pidfd_file)
 		fd_install(pidfd, pidfd_file);
 
+	if (fanotify_is_perm_event(event->mask))
+		FANOTIFY_PERM(event)->fd = fd;
+
 	return metadata.event_len;
 
 out_close_fd:
-	if (fd != FAN_NOFD) {
+	if (f) {
 		put_unused_fd(fd);
 		fput(f);
 	}
 
-	if (pidfd >= 0) {
+	if (pidfd_file) {
 		put_unused_fd(pidfd);
 		fput(pidfd_file);
 	}
@@ -845,7 +855,7 @@  static ssize_t fanotify_read(struct file *file, char __user *buf,
 		if (!fanotify_is_perm_event(event->mask)) {
 			fsnotify_destroy_event(group, &event->fse);
 		} else {
-			if (ret <= 0) {
+			if (ret <= 0 || FANOTIFY_PERM(event)->fd < 0) {
 				spin_lock(&group->notification_lock);
 				finish_permission_event(group,
 					FANOTIFY_PERM(event), FAN_DENY, NULL);
@@ -1954,7 +1964,7 @@  static int __init fanotify_user_setup(void)
 				     FANOTIFY_DEFAULT_MAX_USER_MARKS);
 
 	BUILD_BUG_ON(FANOTIFY_INIT_FLAGS & FANOTIFY_INTERNAL_GROUP_FLAGS);
-	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 12);
+	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_INIT_FLAGS) != 13);
 	BUILD_BUG_ON(HWEIGHT32(FANOTIFY_MARK_FLAGS) != 11);
 
 	fanotify_mark_cache = KMEM_CACHE(fanotify_mark,
diff --git a/include/linux/fanotify.h b/include/linux/fanotify.h
index 4f1c4f603118..89ff45bd6f01 100644
--- a/include/linux/fanotify.h
+++ b/include/linux/fanotify.h
@@ -36,6 +36,7 @@ 
 #define FANOTIFY_ADMIN_INIT_FLAGS	(FANOTIFY_PERM_CLASSES | \
 					 FAN_REPORT_TID | \
 					 FAN_REPORT_PIDFD | \
+					 FAN_REPORT_FD_ERROR | \
 					 FAN_UNLIMITED_QUEUE | \
 					 FAN_UNLIMITED_MARKS)
 
diff --git a/include/uapi/linux/fanotify.h b/include/uapi/linux/fanotify.h
index a37de58ca571..34f221d3a1b9 100644
--- a/include/uapi/linux/fanotify.h
+++ b/include/uapi/linux/fanotify.h
@@ -60,6 +60,7 @@ 
 #define FAN_REPORT_DIR_FID	0x00000400	/* Report unique directory id */
 #define FAN_REPORT_NAME		0x00000800	/* Report events with name */
 #define FAN_REPORT_TARGET_FID	0x00001000	/* Report dirent target id  */
+#define FAN_REPORT_FD_ERROR	0x00002000	/* event->fd can report error */
 
 /* Convenience macro - FAN_REPORT_NAME requires FAN_REPORT_DIR_FID */
 #define FAN_REPORT_DFID_NAME	(FAN_REPORT_DIR_FID | FAN_REPORT_NAME)