Message ID | 87bnhedhw5.fsf_-_@x220.int.ebiederm.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, May 20, 2015 at 05:05:30PM -0500, Eric W. Biederman wrote: > This works for me, can you confirm it works for you folks as well? > > Thank you, > Eric Yup, neither reproducer produces the warning anymore. Reported-and-Tested-by: Omar Sandoval <osandov@osandov.com> > From: "Eric W. Biederman" <ebiederm@xmission.com> > Date: Wed, 20 May 2015 16:39:00 -0500 > Subject: [PATCH] vfs: Fix, simpliy and stop using d_dname for the /proc/*/ns/* files. > > Now that these files are no longer in proc we can stop playing games > with d_alloc_pseudo and d_dname. This change causes a couple of user > visible changes: > > - Opening /proc/*/ns/* and then readlink on /proc/self/fd/N now sees a > prepended / in the filename. > > - /proc/mountinfo now gives useful information on what is mounted. > > - Bind mounting /proc/*/ns/*, opening the mounted file, removing the > bind mount, and readlink on /proc/self/fd/N now sees / (as it should) What's the rationale for "/" in this case? (I'm not disagreeing, just looking to learn something.) Thanks,
On Wed, May 20, 2015 at 05:05:30PM -0500, Eric W. Biederman wrote: > Omar Sandoval <osandov@osandov.com> writes: >> I'm attaching a minimal script that reproduces this on 4.1-rc4. I'm >> taking a look, but Eric or Al will probably figure this out before I get >> the chance :) > > This works for me, can you confirm it works for you folks as well? Yeah, your patch works with our setup and when running our initial test-case as well. Thanks a lot Eric. > From: "Eric W. Biederman" <ebiederm@xmission.com> > Date: Wed, 20 May 2015 16:39:00 -0500 > Subject: [PATCH] vfs: Fix, simpliy and stop using d_dname for the /proc/*/ns/* files. > > Now that these files are no longer in proc we can stop playing games > with d_alloc_pseudo and d_dname. This change causes a couple of user > visible changes: > > - Opening /proc/*/ns/* and then readlink on /proc/self/fd/N now sees a > prepended / in the filename. > > - /proc/mountinfo now gives useful information on what is mounted. > > - Bind mounting /proc/*/ns/*, opening the mounted file, removing the > bind mount, and readlink on /proc/self/fd/N now sees / (as it should) > instead of triggering a warning and a kernel stack backtrace from > prepend_path. > > Cc: stable@vger.kernel.org > Reported-by: Ivan Delalande <colona@arista.com> > Reported-by: Omar Sandoval <osandov@osandov.com> > Signed-off-by: "Eric W. Biederman" <ebiederm@xmission.com> Tested-by: Ivan Delalande <colona@arista.com>
On Wed, May 20, 2015 at 05:05:30PM -0500, Eric W. Biederman wrote: > - dentry = d_alloc_pseudo(mnt->mnt_sb, &qname); > + dentry = d_alloc_name(mnt->mnt_root, name); > if (!dentry) { > iput(inode); > mntput(mnt); > return ERR_PTR(-ENOMEM); > } > - d_instantiate(dentry, inode); > + d_add(dentry, inode); Careful - that might have non-trivial effects. Namely, you are making the root dentry of that sucker a contention point and adding to hash pollution... It's probably not going to cause visible problems, but it's worth profiling just to be sure. Besides, you are violating a bunch of rules here - several hashed children of the same directory with the same name all at once... not nice. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, May 21, 2015 at 12:23:25AM +0100, Al Viro wrote: > On Wed, May 20, 2015 at 05:05:30PM -0500, Eric W. Biederman wrote: > > - dentry = d_alloc_pseudo(mnt->mnt_sb, &qname); > > + dentry = d_alloc_name(mnt->mnt_root, name); > > if (!dentry) { > > iput(inode); > > mntput(mnt); > > return ERR_PTR(-ENOMEM); > > } > > - d_instantiate(dentry, inode); > > + d_add(dentry, inode); > > Careful - that might have non-trivial effects. Namely, you are making > the root dentry of that sucker a contention point and adding to hash > pollution... It's probably not going to cause visible problems, but > it's worth profiling just to be sure. > > Besides, you are violating a bunch of rules here - several hashed > children of the same directory with the same name all at once... > not nice. Hey Eric, did you have any thought about Al’s concerns? Thanks,
Ivan Delalande <colona@arista.com> writes: > On Thu, May 21, 2015 at 12:23:25AM +0100, Al Viro wrote: >> On Wed, May 20, 2015 at 05:05:30PM -0500, Eric W. Biederman wrote: >> > - dentry = d_alloc_pseudo(mnt->mnt_sb, &qname); >> > + dentry = d_alloc_name(mnt->mnt_root, name); >> > if (!dentry) { >> > iput(inode); >> > mntput(mnt); >> > return ERR_PTR(-ENOMEM); >> > } >> > - d_instantiate(dentry, inode); >> > + d_add(dentry, inode); >> >> Careful - that might have non-trivial effects. Namely, you are making >> the root dentry of that sucker a contention point and adding to hash >> pollution... It's probably not going to cause visible problems, but >> it's worth profiling just to be sure. >> >> Besides, you are violating a bunch of rules here - several hashed >> children of the same directory with the same name all at once... >> not nice. > > Hey Eric, did you have any thought about Al’s concerns? Massive difference in perspective for the most part. It did cause me to step back and really look at what that code is doing and why. For the immediate problem the issue is that the WARN_ON is warning about something nothing in the kernel has done for 5 years and in practice as you have seen is actually wrong. So deleting the warning message appears the best way to handle the situation you are seeing. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sat, Jun 06, 2015 at 02:27:24PM -0500, Eric W. Biederman wrote: > Ivan Delalande <colona@arista.com> writes: > > > On Thu, May 21, 2015 at 12:23:25AM +0100, Al Viro wrote: > >> On Wed, May 20, 2015 at 05:05:30PM -0500, Eric W. Biederman wrote: > >> > - dentry = d_alloc_pseudo(mnt->mnt_sb, &qname); > >> > + dentry = d_alloc_name(mnt->mnt_root, name); > >> > if (!dentry) { > >> > iput(inode); > >> > mntput(mnt); > >> > return ERR_PTR(-ENOMEM); > >> > } > >> > - d_instantiate(dentry, inode); > >> > + d_add(dentry, inode); > >> > >> Careful - that might have non-trivial effects. Namely, you are making > >> the root dentry of that sucker a contention point and adding to hash > >> pollution... It's probably not going to cause visible problems, but > >> it's worth profiling just to be sure. > >> > >> Besides, you are violating a bunch of rules here - several hashed > >> children of the same directory with the same name all at once... > >> not nice. > > > > Hey Eric, did you have any thought about Al’s concerns? > > Massive difference in perspective for the most part. It did cause me to > step back and really look at what that code is doing and why. > > For the immediate problem the issue is that the WARN_ON is warning about > something nothing in the kernel has done for 5 years and in practice as > you have seen is actually wrong. So deleting the warning message > appears the best way to handle the situation you are seeing. Ok, great. Thanks for all the info and the new patch, Eric.
diff --git a/fs/dcache.c b/fs/dcache.c index 656ce522a218..e0f0967ef24c 100644 --- a/fs/dcache.c +++ b/fs/dcache.c @@ -3087,13 +3087,8 @@ char *d_path(const struct path *path, char *buf, int buflen) * thus don't need to be hashed. They also don't need a name until a * user wants to identify the object in /proc/pid/fd/. The little hack * below allows us to generate a name for these objects on demand: - * - * Some pseudo inodes are mountable. When they are mounted - * path->dentry == path->mnt->mnt_root. In that case don't call d_dname - * and instead have d_path return the mounted path. */ - if (path->dentry->d_op && path->dentry->d_op->d_dname && - (!IS_ROOT(path->dentry) || path->dentry != path->mnt->mnt_root)) + if (path->dentry->d_op && path->dentry->d_op->d_dname) return path->dentry->d_op->d_dname(path->dentry, buf, buflen); rcu_read_lock(); diff --git a/fs/nsfs.c b/fs/nsfs.c index 99521e7c492b..b9ecfa6cb46d 100644 --- a/fs/nsfs.c +++ b/fs/nsfs.c @@ -11,15 +11,6 @@ static const struct file_operations ns_file_operations = { .llseek = no_llseek, }; -static char *ns_dname(struct dentry *dentry, char *buffer, int buflen) -{ - struct inode *inode = d_inode(dentry); - const struct proc_ns_operations *ns_ops = dentry->d_fsdata; - - return dynamic_dname(dentry, buffer, buflen, "%s:[%lu]", - ns_ops->name, inode->i_ino); -} - static void ns_prune_dentry(struct dentry *dentry) { struct inode *inode = d_inode(dentry); @@ -33,7 +24,6 @@ const struct dentry_operations ns_dentry_operations = { .d_prune = ns_prune_dentry, .d_delete = always_delete_dentry, - .d_dname = ns_dname, }; static void nsfs_evict(struct inode *inode) @@ -47,7 +37,7 @@ void *ns_get_path(struct path *path, struct task_struct *task, const struct proc_ns_operations *ns_ops) { struct vfsmount *mnt = mntget(nsfs_mnt); - struct qstr qname = { .name = "", }; + char name[DNAME_INLINE_LEN]; struct dentry *dentry; struct inode *inode; struct ns_common *ns; @@ -80,6 +70,7 @@ slow: mntput(mnt); return ERR_PTR(-ENOMEM); } + snprintf(name, sizeof(name), "%s:[%u]", ns_ops->name, ns->inum); inode->i_ino = ns->inum; inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME; inode->i_flags |= S_IMMUTABLE; @@ -87,13 +78,13 @@ slow: inode->i_fop = &ns_file_operations; inode->i_private = ns; - dentry = d_alloc_pseudo(mnt->mnt_sb, &qname); + dentry = d_alloc_name(mnt->mnt_root, name); if (!dentry) { iput(inode); mntput(mnt); return ERR_PTR(-ENOMEM); } - d_instantiate(dentry, inode); + d_add(dentry, inode); dentry->d_fsdata = (void *)ns_ops; d = atomic_long_cmpxchg(&ns->stashed, 0, (unsigned long)dentry); if (d) { @@ -143,7 +134,7 @@ static const struct super_operations nsfs_ops = { static struct dentry *nsfs_mount(struct file_system_type *fs_type, int flags, const char *dev_name, void *data) { - return mount_pseudo(fs_type, "nsfs:", &nsfs_ops, + return mount_pseudo(fs_type, "/", &nsfs_ops, &ns_dentry_operations, NSFS_MAGIC); } static struct file_system_type nsfs = {