Message ID | 20250403015946.5283-1-superman.xpt@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | None | expand |
On Wed, Apr 02, 2025 at 06:59:46PM -0700, Penglei Jiang wrote: > may_open() > { > ... > switch (inode->i_mode & S_IFMT) { > case S_IFLNK: > case S_IFDIR: > case S_IFBLK: > case S_IFCHR: > case S_IFIFO: > case S_IFSOCK: > case S_IFREG: > default: > VFS_BUG_ON_INODE(1, inode); > } > ... > } > > Since some anonymous inodes do not have S_IFLNK, S_IFDIR, S_IFBLK, > S_IFCHR, S_IFIFO, S_IFSOCK, or S_IFREG flags set when created, they > end up in the default case branch. > > When creating some anon_inode instances, the i_mode in the switch > statement is not properly set, which causes the open operation to > follow the default branch when opening anon_inode. > > We could check whether the inode is an anon_inode before VFS_BUG_ON_INODE > and trigger the assertion only if it's not. However, a more reasonable > approach might be to set a flag during creation in alloc_anon_inode(), > such as inode->i_flags |= S_ANON, to explicitly mark anonymous inodes. > I think this is the right step, but there is a missing bit. The assert was added because of a prior bug report involving ntfs, where a misconstructed inode did not have an i_mode and was passed to execve. That eventually landed in may_open() which did not check for MAY_EXEC as the inode did not fit any of the modes. It was only caught by a hardening check later. Note all other cases handle MAY_EXEC. So I think the safest way forward has 2 steps in the default case: - detect inodes which purposefully don't have a valid mode and elide the assert if so, for example like in your patch - add the permission check: if (acc_mode & MAY_EXEC) return -EACCES; The hardening check comes with a WARN_ON if the inode is not S_ISREG. No need to hope none of the anon inodes reach it, this should just be short-circuited.
On Fri, Apr 4, 2025 at 5:30 AM Penglei Jiang <superman.xpt@gmail.com> wrote: > > Some anon_inodes do not set the S_IFLNK, S_IFDIR, S_IFBLK, S_IFCHR, > S_IFIFO, S_IFSOCK, or S_IFREG flags after initialization. As a result, > opening these files triggers VFS_BUG_ON_INODE in the may_open() function. > > Here is the relevant code snippet: > > static int may_open(struct mnt_idmap *idmap, const struct path *path, > int acc_mode, int flag) > { > ... > switch (inode->i_mode & S_IFMT) { > case S_IFLNK: > case S_IFDIR: > case S_IFBLK: > case S_IFCHR: > case S_IFIFO: > case S_IFSOCK: > case S_IFREG: > default: > VFS_BUG_ON_INODE(1, inode); > ~~~~~~~~~~~~~~~~~~~~~~~~~~ > } > ... > } > > To address this, we modify the code so that only non-anon_inodes trigger > VFS_BUG_ON_INODE, and we also check MAY_EXEC. > > To determine if an inode is an anon_inode, we consider two cases: > > 1. If the inode is the same as anon_inode_inode, it is the default > anon_inode. > 2. Anonymous inodes created with alloc_anon_inode() set the S_PRIVATE > flag. If S_PRIVATE is set, we consider it an anonymous inode. > > The bug can be reproduced with the following code: > > #include <stdio.h> > #include <unistd.h> > #include <fcntl.h> > #include <sys/timerfd.h> > int main(int argc, char **argv) { > int fd = timerfd_create(CLOCK_MONOTONIC, 0); > if (fd != -1) { > char path[256]; > sprintf(path, "/proc/self/fd/%d", fd); > open(path, O_RDONLY); > } > return 0; > } > > Finally, after testing, anon_inodes no longer trigger VFS_BUG_ON_INODE. > > Fixes: af153bb63a336 ("vfs: catch invalid modes in may_open()") > Reported-by: syzbot+5d8e79d323a13aa0b248@syzkaller.appspotmail.com > Closes: https://lore.kernel.org/all/67ed3fb3.050a0220.14623d.0009.GAE@google.com" > Signed-off-by: Penglei Jiang <superman.xpt@gmail.com> > --- > V1 -> V2: added MAY_EXEC check, added some comments > > fs/anon_inodes.c | 12 ++++++++++++ > fs/namei.c | 8 +++++++- > include/linux/anon_inodes.h | 1 + > 3 files changed, 20 insertions(+), 1 deletion(-) > > diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c > index 583ac81669c2..bf124d53973e 100644 > --- a/fs/anon_inodes.c > +++ b/fs/anon_inodes.c > @@ -303,6 +303,18 @@ int anon_inode_create_getfd(const char *name, const struct file_operations *fops > return __anon_inode_getfd(name, fops, priv, flags, context_inode, true); > } > > +/** > + * is_default_anon_inode - Checks if the given inode is the default > + * anonymous inode (anon_inode_inode) > + * > + * @inode: [in] the inode to be checked > + * > + * Returns true if the given inode is anon_inode_inode, otherwise returns false. > + */ > +inline bool is_default_anon_inode(const struct inode *inode) > +{ > + return anon_inode_inode == inode; > +} > I would drop the inline. > static int __init anon_inode_init(void) > { > diff --git a/fs/namei.c b/fs/namei.c > index 360a86ca1f02..e8cc00a7c31a 100644 > --- a/fs/namei.c > +++ b/fs/namei.c > @@ -40,6 +40,7 @@ > #include <linux/bitops.h> > #include <linux/init_task.h> > #include <linux/uaccess.h> > +#include <linux/anon_inodes.h> > > #include "internal.h" > #include "mount.h" > @@ -3429,7 +3430,12 @@ static int may_open(struct mnt_idmap *idmap, const struct path *path, > return -EACCES; > break; > default: > - VFS_BUG_ON_INODE(1, inode); > + if (!is_default_anon_inode(inode) > + && !(inode->i_flags & S_PRIVATE)) > + VFS_BUG_ON_INODE(1, inode); > + if (acc_mode & MAY_EXEC) > + return -EACCES; > + break; > } Semantically this looks ok to me. It may be this happens to still be too restrictive, but then we are erroring on the side of some test suite blowing up, as opposed to something in production, so I think it's fine. I would fold these conditions into the debug macro though, as in: VFS_BUG_ON_INODE(!is_default_anon_inode(inode) && !(inode->i_flags & S_PRIVATE)), inode); > > error = inode_permission(idmap, inode, MAY_OPEN | acc_mode); > diff --git a/include/linux/anon_inodes.h b/include/linux/anon_inodes.h > index edef565c2a1a..eca4a3913ba7 100644 > --- a/include/linux/anon_inodes.h > +++ b/include/linux/anon_inodes.h > @@ -30,6 +30,7 @@ int anon_inode_create_getfd(const char *name, > const struct file_operations *fops, > void *priv, int flags, > const struct inode *context_inode); > +bool is_default_anon_inode(const struct inode *inode); > > #endif /* _LINUX_ANON_INODES_H */ > > -- > 2.17.1 >
Please make sure anon inodes have a valid mode set. Leaving this landmine around where we have inodes without a valid mode is just going to create more problems in the future.
On Fri, Apr 04, 2025 at 01:31:58AM -0700, Christoph Hellwig wrote: > Please make sure anon inodes have a valid mode set. Leaving this > landmine around where we have inodes without a valid mode is just going > to create more problems in the future. We've tried to change it, it immediately leads to userspace tools regressions as they've started to rely on that behavior because it's been like that since the dawn of time. We even had to work around that crap in pidfs because lsof regressed. So now, we can't do this. >
On Fri, Apr 04, 2025 at 10:45:43AM +0200, Christian Brauner wrote: > On Fri, Apr 04, 2025 at 01:31:58AM -0700, Christoph Hellwig wrote: > > Please make sure anon inodes have a valid mode set. Leaving this > > landmine around where we have inodes without a valid mode is just going > > to create more problems in the future. > > We've tried to change it, it immediately leads to userspace tools > regressions as they've started to rely on that behavior because it's been > like that since the dawn of time. We even had to work around that crap > in pidfs because lsof regressed. So now, we can't do this. Just because i_mode has something useful, we don't need to report that to userspace. We can still clear kstat.mode (with a big fat comment explaining why).
On Fri, Apr 04, 2025 at 01:47:12AM -0700, Christoph Hellwig wrote: > On Fri, Apr 04, 2025 at 10:45:43AM +0200, Christian Brauner wrote: > > On Fri, Apr 04, 2025 at 01:31:58AM -0700, Christoph Hellwig wrote: > > > Please make sure anon inodes have a valid mode set. Leaving this > > > landmine around where we have inodes without a valid mode is just going > > > to create more problems in the future. > > > > We've tried to change it, it immediately leads to userspace tools > > regressions as they've started to rely on that behavior because it's been > > like that since the dawn of time. We even had to work around that crap > > in pidfs because lsof regressed. So now, we can't do this. > > Just because i_mode has something useful, we don't need to report that > to userspace. We can still clear kstat.mode (with a big fat comment > explaining why). Yes, that's what we do in pidfs. I can do this for anon_inode as well. >
diff --git a/fs/anon_inodes.c b/fs/anon_inodes.c index 583ac81669c2..c29eca6106d2 100644 --- a/fs/anon_inodes.c +++ b/fs/anon_inodes.c @@ -303,6 +303,10 @@ int anon_inode_create_getfd(const char *name, const struct file_operations *fops return __anon_inode_getfd(name, fops, priv, flags, context_inode, true); } +inline bool is_default_anon_inode(const struct inode *inode) +{ + return anon_inode_inode == inode; +} static int __init anon_inode_init(void) { diff --git a/fs/namei.c b/fs/namei.c index 360a86ca1f02..81cea4901a2f 100644 --- a/fs/namei.c +++ b/fs/namei.c @@ -40,6 +40,7 @@ #include <linux/bitops.h> #include <linux/init_task.h> #include <linux/uaccess.h> +#include <linux/anon_inodes.h> #include "internal.h" #include "mount.h" @@ -3429,7 +3430,9 @@ static int may_open(struct mnt_idmap *idmap, const struct path *path, return -EACCES; break; default: - VFS_BUG_ON_INODE(1, inode); + if (!is_default_anon_inode(inode) + && !(inode->i_flags & S_PRIVATE)) + VFS_BUG_ON_INODE(1, inode); } error = inode_permission(idmap, inode, MAY_OPEN | acc_mode); diff --git a/include/linux/anon_inodes.h b/include/linux/anon_inodes.h index edef565c2a1a..eca4a3913ba7 100644 --- a/include/linux/anon_inodes.h +++ b/include/linux/anon_inodes.h @@ -30,6 +30,7 @@ int anon_inode_create_getfd(const char *name, const struct file_operations *fops, void *priv, int flags, const struct inode *context_inode); +bool is_default_anon_inode(const struct inode *inode); #endif /* _LINUX_ANON_INODES_H */
may_open() { ... switch (inode->i_mode & S_IFMT) { case S_IFLNK: case S_IFDIR: case S_IFBLK: case S_IFCHR: case S_IFIFO: case S_IFSOCK: case S_IFREG: default: VFS_BUG_ON_INODE(1, inode); } ... } Since some anonymous inodes do not have S_IFLNK, S_IFDIR, S_IFBLK, S_IFCHR, S_IFIFO, S_IFSOCK, or S_IFREG flags set when created, they end up in the default case branch. When creating some anon_inode instances, the i_mode in the switch statement is not properly set, which causes the open operation to follow the default branch when opening anon_inode. We could check whether the inode is an anon_inode before VFS_BUG_ON_INODE and trigger the assertion only if it's not. However, a more reasonable approach might be to set a flag during creation in alloc_anon_inode(), such as inode->i_flags |= S_ANON, to explicitly mark anonymous inodes. The code that triggers the BUG: #include <stdio.h> #include <unistd.h> #include <fcntl.h> #include <sys/timerfd.h> int main(int argc, char **argv) { int fd = timerfd_create(CLOCK_MONOTONIC, 0); if (fd != -1) { char path[256]; sprintf(path, "/proc/self/fd/%d", fd); open(path, O_RDONLY); } return 0; } Thank you! Fixes: af153bb63a336 ("vfs: catch invalid modes in may_open()") Reported-by: syzbot+5d8e79d323a13aa0b248@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/67ed3fb3.050a0220.14623d.0009.GAE@google.com" Signed-off-by: Penglei Jiang <superman.xpt@gmail.com> --- fs/anon_inodes.c | 4 ++++ fs/namei.c | 5 ++++- include/linux/anon_inodes.h | 1 + 3 files changed, 9 insertions(+), 1 deletion(-)