Message ID | 20240123093701.94166-1-jefflexu@linux.alibaba.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] fuse: disable support for file handle when FUSE_EXPORT_SUPPORT not configured | expand |
On Tue, Jan 23, 2024 at 11:37 AM Jingbo Xu <jefflexu@linux.alibaba.com> wrote: > > I think this is more of an issue reporter. > > I'm not sure if it's a known issue, but we found that following a > successful name_to_handle_at(2), open_by_handle_at(2) fails (-ESTALE, > Stale file handle) with the given file handle when the fuse daemon is in > "cache= none" mode. > > It can be reproduced by the examples from the man page of > name_to_handle_at(2) and open_by_handle_at(2) [1], along with the > virtiofsd daemon (C implementation) in "cache= none" mode. > > ``` > ./t_name_to_handle_at t_open_by_handle_at.c > /tmp/fh > ./t_open_by_handle_at < /tmp/fh > t_open_by_handle_at: open_by_handle_at: Stale file handle > ``` > > After investigation into this issue, I found the root cause is that, > when virtiofsd is in "cache= none" mode, the entry_valid_timeout is > configured as 0. Thus the dput() called when name_to_handle_at(2) > finishes will trigger iput -> evict(), in which FUSE_FORGET will be sent > to the daemon. The following open_by_handle_at(2) will trigger a new > FUSE_LOOKUP request when no cached inode is found with the given file > handle. And then the fuse daemon fails the FUSE_LOOKUP request with > -ENOENT as the cached metadata of the requested inode has already been > cleaned up among the previous FUSE_FORGET. > > This indeed confuses the application, as open_by_handle_at(2) fails in > the condition of the previous name_to_handle_at(2) succeeds, given the > requested file is not deleted and ready there. It is acceptable for the > application folks to fail name_to_handle_at(2) early in this case, in > which they will fallback to open(2) to access files. > > > As for this RFC patch, the idea is that if the fuse daemon is configured > with "cache=none" mode, FUSE_EXPORT_SUPPORT should also be explicitly > disabled and the following name_to_handle_at(2) will all fail as a > workaround of this issue. This will probably regress NFS export of (many) fuse servers that do not have FUSE_EXPORT_SUPPORT, even though you are right to point out that those NFS exports are of dubious quality. Not only can an NFS client get ESTALE for evicted fuse inodes, but it can also get a completely different object for the same file handle if that fuse server was restarted and re-exported to NFS. > > [1] https://man7.org/linux/man-pages/man2/open_by_handle_at.2.html > > Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> > --- > fs/fuse/inode.c | 4 ++++ > 1 file changed, 4 insertions(+) > > diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c > index 2a6d44f91729..9fed63be60fe 100644 > --- a/fs/fuse/inode.c > +++ b/fs/fuse/inode.c > @@ -1025,6 +1025,7 @@ static struct dentry *fuse_get_dentry(struct super_block *sb, > static int fuse_encode_fh(struct inode *inode, u32 *fh, int *max_len, > struct inode *parent) > { > + struct fuse_conn *fc = get_fuse_conn(inode); > int len = parent ? 6 : 3; > u64 nodeid; > u32 generation; > @@ -1034,6 +1035,9 @@ static int fuse_encode_fh(struct inode *inode, u32 *fh, int *max_len, > return FILEID_INVALID; > } > > + if (!fc->export_support) > + return -EOPNOTSUPP; > + > nodeid = get_fuse_inode(inode)->nodeid; > generation = inode->i_generation; > If you somehow find a way to mitigate the regression for NFS export of old fuse servers (maybe an opt-in Kconfig?), your patch is also going to regress AT_HANDLE_FID functionality, which can be used by fanotify to monitor fuse. AT_HANDLE_FID flag to name_to_handle_at(2) means that open_by_handle_at(2) is not supposed to be called on that fh. The correct way to deal with that would be something like this: +static const struct export_operations fuse_fid_operations = { + .encode_fh = fuse_encode_fh, +}; + static const struct export_operations fuse_export_operations = { .fh_to_dentry = fuse_fh_to_dentry, .fh_to_parent = fuse_fh_to_parent, @@ -1529,12 +1533,16 @@ static void fuse_fill_attr_from_inode(struct fuse_attr *attr, static void fuse_sb_defaults(struct super_block *sb) { + struct fuse_mount *fm = get_fuse_mount_super(sb); + sb->s_magic = FUSE_SUPER_MAGIC; sb->s_op = &fuse_super_operations; sb->s_xattr = fuse_xattr_handlers; sb->s_maxbytes = MAX_LFS_FILESIZE; sb->s_time_gran = 1; - sb->s_export_op = &fuse_export_operations; + if (fm->fc->export_support) + sb->s_export_op = &fuse_export_operations; + else + sb->s_export_op = &fuse_fid_operations; sb->s_iflags |= SB_I_IMA_UNVERIFIABLE_SIGNATURE; if (sb->s_user_ns != &init_user_ns) sb->s_iflags |= SB_I_UNTRUSTED_MOUNTER; --- This would make name_to_handle_at() without AT_HANDLE_FID fail and name_to_handle_at() with AT_HANDLE_FID to succeed as it should. Thanks, Amir.
On 1/23/24 6:17 PM, Amir Goldstein wrote: > On Tue, Jan 23, 2024 at 11:37 AM Jingbo Xu <jefflexu@linux.alibaba.com> wrote: >> >> I think this is more of an issue reporter. >> >> I'm not sure if it's a known issue, but we found that following a >> successful name_to_handle_at(2), open_by_handle_at(2) fails (-ESTALE, >> Stale file handle) with the given file handle when the fuse daemon is in >> "cache= none" mode. >> >> It can be reproduced by the examples from the man page of >> name_to_handle_at(2) and open_by_handle_at(2) [1], along with the >> virtiofsd daemon (C implementation) in "cache= none" mode. >> >> ``` >> ./t_name_to_handle_at t_open_by_handle_at.c > /tmp/fh >> ./t_open_by_handle_at < /tmp/fh >> t_open_by_handle_at: open_by_handle_at: Stale file handle >> ``` >> >> After investigation into this issue, I found the root cause is that, >> when virtiofsd is in "cache= none" mode, the entry_valid_timeout is >> configured as 0. Thus the dput() called when name_to_handle_at(2) >> finishes will trigger iput -> evict(), in which FUSE_FORGET will be sent >> to the daemon. The following open_by_handle_at(2) will trigger a new >> FUSE_LOOKUP request when no cached inode is found with the given file >> handle. And then the fuse daemon fails the FUSE_LOOKUP request with >> -ENOENT as the cached metadata of the requested inode has already been >> cleaned up among the previous FUSE_FORGET. >> >> This indeed confuses the application, as open_by_handle_at(2) fails in >> the condition of the previous name_to_handle_at(2) succeeds, given the >> requested file is not deleted and ready there. It is acceptable for the >> application folks to fail name_to_handle_at(2) early in this case, in >> which they will fallback to open(2) to access files. >> >> >> As for this RFC patch, the idea is that if the fuse daemon is configured >> with "cache=none" mode, FUSE_EXPORT_SUPPORT should also be explicitly >> disabled and the following name_to_handle_at(2) will all fail as a >> workaround of this issue. > > This will probably regress NFS export of (many) fuse servers that do > not have FUSE_EXPORT_SUPPORT, even though you are right to point > out that those NFS exports are of dubious quality. Yeah, the RFC itself is just for describing the problem, while the final fix (if any) needs further discussion. We even add an extra optional mount option, e.g "-o no_file_handle" to explicitly disable support for file handle in our internal product. > > Not only can an NFS client get ESTALE for evicted fuse inodes, but it > can also get a completely different object for the same file handle > if that fuse server was restarted and re-exported to NFS. > >> >> [1] https://man7.org/linux/man-pages/man2/open_by_handle_at.2.html >> >> Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> >> --- >> fs/fuse/inode.c | 4 ++++ >> 1 file changed, 4 insertions(+) >> >> diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c >> index 2a6d44f91729..9fed63be60fe 100644 >> --- a/fs/fuse/inode.c >> +++ b/fs/fuse/inode.c >> @@ -1025,6 +1025,7 @@ static struct dentry *fuse_get_dentry(struct super_block *sb, >> static int fuse_encode_fh(struct inode *inode, u32 *fh, int *max_len, >> struct inode *parent) >> { >> + struct fuse_conn *fc = get_fuse_conn(inode); >> int len = parent ? 6 : 3; >> u64 nodeid; >> u32 generation; >> @@ -1034,6 +1035,9 @@ static int fuse_encode_fh(struct inode *inode, u32 *fh, int *max_len, >> return FILEID_INVALID; >> } >> >> + if (!fc->export_support) >> + return -EOPNOTSUPP; >> + >> nodeid = get_fuse_inode(inode)->nodeid; >> generation = inode->i_generation; >> > > If you somehow find a way to mitigate the regression for NFS export of > old fuse servers (maybe an opt-in Kconfig?), your patch is also going to > regress AT_HANDLE_FID functionality, which can be used by fanotify to > monitor fuse. > > AT_HANDLE_FID flag to name_to_handle_at(2) means that > open_by_handle_at(2) is not supposed to be called on that fh. > > The correct way to deal with that would be something like this: > > +static const struct export_operations fuse_fid_operations = { > + .encode_fh = fuse_encode_fh, > +}; > + > static const struct export_operations fuse_export_operations = { > .fh_to_dentry = fuse_fh_to_dentry, > .fh_to_parent = fuse_fh_to_parent, > @@ -1529,12 +1533,16 @@ static void fuse_fill_attr_from_inode(struct > fuse_attr *attr, > > static void fuse_sb_defaults(struct super_block *sb) > { > + struct fuse_mount *fm = get_fuse_mount_super(sb); > + > sb->s_magic = FUSE_SUPER_MAGIC; > sb->s_op = &fuse_super_operations; > sb->s_xattr = fuse_xattr_handlers; > sb->s_maxbytes = MAX_LFS_FILESIZE; > sb->s_time_gran = 1; > - sb->s_export_op = &fuse_export_operations; > + if (fm->fc->export_support) > + sb->s_export_op = &fuse_export_operations; > + else > + sb->s_export_op = &fuse_fid_operations; > sb->s_iflags |= SB_I_IMA_UNVERIFIABLE_SIGNATURE; > if (sb->s_user_ns != &init_user_ns) > sb->s_iflags |= SB_I_UNTRUSTED_MOUNTER; > > --- > > This would make name_to_handle_at() without AT_HANDLE_FID fail > and name_to_handle_at() with AT_HANDLE_FID to succeed as it should. > > Thanks, > Amir.
On 1/23/24 6:17 PM, Amir Goldstein wrote: > If you somehow find a way to mitigate the regression for NFS export of > old fuse servers (maybe an opt-in Kconfig?), your patch is also going to > regress AT_HANDLE_FID functionality, which can be used by fanotify to > monitor fuse. > > AT_HANDLE_FID flag to name_to_handle_at(2) means that > open_by_handle_at(2) is not supposed to be called on that fh. > > The correct way to deal with that would be something like this: > > +static const struct export_operations fuse_fid_operations = { > + .encode_fh = fuse_encode_fh, > +}; > + > static const struct export_operations fuse_export_operations = { > .fh_to_dentry = fuse_fh_to_dentry, > .fh_to_parent = fuse_fh_to_parent, > @@ -1529,12 +1533,16 @@ static void fuse_fill_attr_from_inode(struct > fuse_attr *attr, > > static void fuse_sb_defaults(struct super_block *sb) > { > + struct fuse_mount *fm = get_fuse_mount_super(sb); > + > sb->s_magic = FUSE_SUPER_MAGIC; > sb->s_op = &fuse_super_operations; > sb->s_xattr = fuse_xattr_handlers; > sb->s_maxbytes = MAX_LFS_FILESIZE; > sb->s_time_gran = 1; > - sb->s_export_op = &fuse_export_operations; > + if (fm->fc->export_support) > + sb->s_export_op = &fuse_export_operations; > + else > + sb->s_export_op = &fuse_fid_operations; > sb->s_iflags |= SB_I_IMA_UNVERIFIABLE_SIGNATURE; > if (sb->s_user_ns != &init_user_ns) > sb->s_iflags |= SB_I_UNTRUSTED_MOUNTER; > > --- > > This would make name_to_handle_at() without AT_HANDLE_FID fail > and name_to_handle_at() with AT_HANDLE_FID to succeed as it should. > Oh I didn't notice this. Many thanks!
On Tue, 23 Jan 2024 at 11:40, Jingbo Xu <jefflexu@linux.alibaba.com> wrote: > > > > On 1/23/24 6:17 PM, Amir Goldstein wrote: > > If you somehow find a way to mitigate the regression for NFS export of > > old fuse servers (maybe an opt-in Kconfig?), Better would be if the server explicitly disabled export support with an INIT flag (FUSE_NO_EXPORT). Thanks, Miklos
On 1/23/24 6:46 PM, Miklos Szeredi wrote: > On Tue, 23 Jan 2024 at 11:40, Jingbo Xu <jefflexu@linux.alibaba.com> wrote: >> >> >> >> On 1/23/24 6:17 PM, Amir Goldstein wrote: >>> If you somehow find a way to mitigate the regression for NFS export of >>> old fuse servers (maybe an opt-in Kconfig?), > > Better would be if the server explicitly disabled export support with > an INIT flag (FUSE_NO_EXPORT). > I would give it a try (as a mitigation) if it's on the right direction.
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c index 2a6d44f91729..9fed63be60fe 100644 --- a/fs/fuse/inode.c +++ b/fs/fuse/inode.c @@ -1025,6 +1025,7 @@ static struct dentry *fuse_get_dentry(struct super_block *sb, static int fuse_encode_fh(struct inode *inode, u32 *fh, int *max_len, struct inode *parent) { + struct fuse_conn *fc = get_fuse_conn(inode); int len = parent ? 6 : 3; u64 nodeid; u32 generation; @@ -1034,6 +1035,9 @@ static int fuse_encode_fh(struct inode *inode, u32 *fh, int *max_len, return FILEID_INVALID; } + if (!fc->export_support) + return -EOPNOTSUPP; + nodeid = get_fuse_inode(inode)->nodeid; generation = inode->i_generation;
I think this is more of an issue reporter. I'm not sure if it's a known issue, but we found that following a successful name_to_handle_at(2), open_by_handle_at(2) fails (-ESTALE, Stale file handle) with the given file handle when the fuse daemon is in "cache= none" mode. It can be reproduced by the examples from the man page of name_to_handle_at(2) and open_by_handle_at(2) [1], along with the virtiofsd daemon (C implementation) in "cache= none" mode. ``` ./t_name_to_handle_at t_open_by_handle_at.c > /tmp/fh ./t_open_by_handle_at < /tmp/fh t_open_by_handle_at: open_by_handle_at: Stale file handle ``` After investigation into this issue, I found the root cause is that, when virtiofsd is in "cache= none" mode, the entry_valid_timeout is configured as 0. Thus the dput() called when name_to_handle_at(2) finishes will trigger iput -> evict(), in which FUSE_FORGET will be sent to the daemon. The following open_by_handle_at(2) will trigger a new FUSE_LOOKUP request when no cached inode is found with the given file handle. And then the fuse daemon fails the FUSE_LOOKUP request with -ENOENT as the cached metadata of the requested inode has already been cleaned up among the previous FUSE_FORGET. This indeed confuses the application, as open_by_handle_at(2) fails in the condition of the previous name_to_handle_at(2) succeeds, given the requested file is not deleted and ready there. It is acceptable for the application folks to fail name_to_handle_at(2) early in this case, in which they will fallback to open(2) to access files. As for this RFC patch, the idea is that if the fuse daemon is configured with "cache=none" mode, FUSE_EXPORT_SUPPORT should also be explicitly disabled and the following name_to_handle_at(2) will all fail as a workaround of this issue. [1] https://man7.org/linux/man-pages/man2/open_by_handle_at.2.html Signed-off-by: Jingbo Xu <jefflexu@linux.alibaba.com> --- fs/fuse/inode.c | 4 ++++ 1 file changed, 4 insertions(+)