Message ID | 87tws010r2.fsf_-_@x220.int.ebiederm.org (mailing list archive) |
---|---|
State | New, archived |
On Sat, 15 Aug 2015 20:27:13 -0500 ebiederm@xmission.com (Eric W. Biederman) wrote:

>
> In rare cases a directory can be renamed out from under a bind mount.
> In those cases without special handling it becomes possible to walk up
> the directory tree to the root dentry of the filesystem and down
> from the root dentry to every other file or directory on the filesystem.
>
> Like division by zero .. from an unconnected path can not be given
> a useful semantic as there is no predicting at which path component
> the code will realize it is unconnected. We certainly can not match
> the current behavior as the current behavior is a security hole.
>
> Therefore when encounting .. when following an unconnected path
> return -ENOENT.
>
> - Add a function path_connected to verify path->dentry is reachable
>   from path->mnt.mnt_root. AKA to validate that rename did not do
>   something nasty to the bind mount.
>
>   To avoid races path_connected must be called after following a path
>   component to it's next path component.
>
> Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
> ---
>
> This is the simple version that needs no extra vfs support.
>
> My availability is likely to be a bit spotty for the next while
> as I am travelling to and then attending Linux Plumbers Conference.
>
>  fs/namei.c | 27 +++++++++++++++++++++++++--
>  1 file changed, 25 insertions(+), 2 deletions(-)
>
> diff --git a/fs/namei.c b/fs/namei.c
> index ae4e4c18b2ac..5303e994f8d6 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -560,6 +560,24 @@ static int __nd_alloc_stack(struct nameidata *nd)
>  	return 0;
>  }
>
> +/**
> + * path_connected - Verify that a path->dentry is below path->mnt.mnt_root
> + * @path: nameidate to verify
> + *
> + * Rename can sometimes move a file or directory outside of a bind
> + * mount, path_connected allows those cases to be detected.

While it is obviously true that a rename can move a file outside of a
bind mount, it doesn't seem relevant and so could be confusing.
This is only ever used for directories, and a file could already be
outside a bind mount even while it is inside.

I would stick with "Rename can sometimes move a directory outside..."

> + */
> +static bool path_connected(const struct path *path)
> +{
> +	struct vfsmount *mnt = path->mnt;
> +
> +	/* Only bind mounts can have disconnected paths */
> +	if (mnt->mnt_root == mnt->mnt_sb->s_root)
> +		return true;
> +
> +	return is_subdir(path->dentry, mnt->mnt_root);
> +}
> +
>  static inline int nd_alloc_stack(struct nameidata *nd)
>  {
>  	if (likely(nd->depth != EMBEDDED_LEVELS))
> @@ -1296,6 +1314,8 @@ static int follow_dotdot_rcu(struct nameidata *nd)
>  				return -ECHILD;
>  			nd->path.dentry = parent;
>  			nd->seq = seq;
> +			if (unlikely(!path_connected(&nd->path)))
> +				return -ENOENT;
>  			break;
>  		} else {
>  			struct mount *mnt = real_mount(nd->path.mnt);
> @@ -1396,7 +1416,7 @@ static void follow_mount(struct path *path)
>  	}
>  }
>
> -static void follow_dotdot(struct nameidata *nd)
> +static int follow_dotdot(struct nameidata *nd)
>  {
>  	if (!nd->root.mnt)
>  		set_root(nd);
> @@ -1412,6 +1432,8 @@ static void follow_dotdot(struct nameidata *nd)
>  			/* rare case of legitimate dget_parent()... */
>  			nd->path.dentry = dget_parent(nd->path.dentry);
>  			dput(old);
> +			if (unlikely(!path_connected(&nd->path)))
> +				return -ENOENT;
>  			break;
>  		}
>  		if (!follow_up(&nd->path))
> @@ -1419,6 +1441,7 @@ static void follow_dotdot(struct nameidata *nd)
>  	}
>  	follow_mount(&nd->path);
>  	nd->inode = nd->path.dentry->d_inode;
> +	return 0;
>  }
>
>  /*
> @@ -1634,7 +1657,7 @@ static inline int handle_dots(struct nameidata *nd, int type)
>  		if (nd->flags & LOOKUP_RCU) {
>  			return follow_dotdot_rcu(nd);
>  		} else
> -			follow_dotdot(nd);
> +			return follow_dotdot(nd);
>  	}
>  	return 0;
>  }

I really like this patch, particularly from the standpoint of
backporting to -stable and enterprise kernels.
I suspect that all the tracking of which mounts might have been escaped
from should be classified as premature optimisation, until measurements
show otherwise.

path_connected() adds no locks or even atomics. The path that it walks
will be well-trodden and so very likely most of it will be in the CPU
cache.

And I particularly like that follow_dotdot() and follow_dotdot_rcu() now
both have the same signature :-)

What's not to like?

Reviewed-by: NeilBrown <neilb@suse.com>

in case it helps.

NeilBrown
diff --git a/fs/namei.c b/fs/namei.c
index ae4e4c18b2ac..5303e994f8d6 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -560,6 +560,24 @@ static int __nd_alloc_stack(struct nameidata *nd)
 	return 0;
 }
 
+/**
+ * path_connected - Verify that a path->dentry is below path->mnt.mnt_root
+ * @path: nameidate to verify
+ *
+ * Rename can sometimes move a file or directory outside of a bind
+ * mount, path_connected allows those cases to be detected.
+ */
+static bool path_connected(const struct path *path)
+{
+	struct vfsmount *mnt = path->mnt;
+
+	/* Only bind mounts can have disconnected paths */
+	if (mnt->mnt_root == mnt->mnt_sb->s_root)
+		return true;
+
+	return is_subdir(path->dentry, mnt->mnt_root);
+}
+
 static inline int nd_alloc_stack(struct nameidata *nd)
 {
 	if (likely(nd->depth != EMBEDDED_LEVELS))
@@ -1296,6 +1314,8 @@ static int follow_dotdot_rcu(struct nameidata *nd)
 				return -ECHILD;
 			nd->path.dentry = parent;
 			nd->seq = seq;
+			if (unlikely(!path_connected(&nd->path)))
+				return -ENOENT;
 			break;
 		} else {
 			struct mount *mnt = real_mount(nd->path.mnt);
@@ -1396,7 +1416,7 @@ static void follow_mount(struct path *path)
 	}
 }
 
-static void follow_dotdot(struct nameidata *nd)
+static int follow_dotdot(struct nameidata *nd)
 {
 	if (!nd->root.mnt)
 		set_root(nd);
@@ -1412,6 +1432,8 @@ static void follow_dotdot(struct nameidata *nd)
 			/* rare case of legitimate dget_parent()... */
 			nd->path.dentry = dget_parent(nd->path.dentry);
 			dput(old);
+			if (unlikely(!path_connected(&nd->path)))
+				return -ENOENT;
 			break;
 		}
 		if (!follow_up(&nd->path))
@@ -1419,6 +1441,7 @@ static void follow_dotdot(struct nameidata *nd)
 	}
 	follow_mount(&nd->path);
 	nd->inode = nd->path.dentry->d_inode;
+	return 0;
 }
 
 /*
@@ -1634,7 +1657,7 @@ static inline int handle_dots(struct nameidata *nd, int type)
 		if (nd->flags & LOOKUP_RCU) {
 			return follow_dotdot_rcu(nd);
 		} else
-			follow_dotdot(nd);
+			return follow_dotdot(nd);
 	}
 	return 0;
 }
In rare cases a directory can be renamed out from under a bind mount.
In those cases, without special handling, it becomes possible to walk up
the directory tree to the root dentry of the filesystem and down
from the root dentry to every other file or directory on the filesystem.

Like division by zero, .. from an unconnected path can not be given
a useful semantic, as there is no predicting at which path component
the code will realize it is unconnected. We certainly can not match
the current behavior, as the current behavior is a security hole.

Therefore when encountering .. when following an unconnected path
return -ENOENT.

- Add a function path_connected to verify path->dentry is reachable
  from path->mnt.mnt_root. AKA to validate that rename did not do
  something nasty to the bind mount.

  To avoid races path_connected must be called after following a path
  component to its next path component.

Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com>
---

This is the simple version that needs no extra vfs support.

My availability is likely to be a bit spotty for the next while
as I am travelling to and then attending Linux Plumbers Conference.

 fs/namei.c | 27 +++++++++++++++++++++++++--
 1 file changed, 25 insertions(+), 2 deletions(-)
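
For readers who want to see the shape of the check outside the kernel, here is a minimal userspace sketch of the idea described in the changelog. It is only a model under stated assumptions: struct toy_dentry, struct toy_mount and the toy_* functions are hypothetical stand-ins invented for illustration, there is no locking, RCU or refcounting, and stepping ".." at the mount root simply stays put rather than crossing to the parent mount as follow_up() would.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for dentry and mount; NOT the kernel structures. */
struct toy_dentry {
	struct toy_dentry *parent;	/* the filesystem root points to itself */
};

struct toy_mount {
	struct toy_dentry *mnt_root;	/* directory this mount exposes */
	struct toy_dentry *sb_root;	/* root of the underlying filesystem */
};

/* Walk parent pointers: is d equal to, or a descendant of, ancestor? */
bool toy_is_subdir(const struct toy_dentry *d,
		   const struct toy_dentry *ancestor)
{
	assert(d != NULL && ancestor != NULL);
	for (;;) {
		if (d == ancestor)
			return true;
		if (d->parent == d)	/* reached the filesystem root */
			return false;
		d = d->parent;
	}
}

/* Mirrors the structure of path_connected() from the patch. */
bool toy_path_connected(const struct toy_mount *mnt,
			const struct toy_dentry *d)
{
	/* Only a bind mount of a subdirectory can be disconnected. */
	if (mnt->mnt_root == mnt->sb_root)
		return true;
	return toy_is_subdir(d, mnt->mnt_root);
}

/*
 * Step to the parent for "..". The connectivity check runs after the
 * step, as the changelog requires, and fails with -ENOENT if the new
 * position is no longer below the mount root.
 */
int toy_follow_dotdot(const struct toy_mount *mnt, struct toy_dentry **pos)
{
	if (*pos == mnt->mnt_root)
		return 0;	/* simplification: stay at the mount root */
	*pos = (*pos)->parent;
	if (!toy_path_connected(mnt, *pos))
		return -ENOENT;
	return 0;
}
```

In this model, with a tree root/a/b and root/c and a bind mount rooted at a, ".." from b lands on a and succeeds. Once b is reparented under c (the rename), ".." from b lands on c, which is not below a, so the walk stops with -ENOENT instead of continuing up to the filesystem root.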