Message ID | 161793090597.10062.4954029445418116308.stgit@mickey.themaw.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | kernfs: proposed locking and concurrency improvement | expand |
On Fri, Apr 09, 2021 at 09:15:06AM +0800, Ian Kent wrote: > + parent = kernfs_dentry_node(dentry->d_parent); > + if (parent) { > + const void *ns = NULL; > + > + if (kernfs_ns_enabled(parent)) > + ns = kernfs_info(dentry->d_parent->d_sb)->ns; For any dentry d, we have d->d_parent->d_sb == d->d_sb. All the time. If you ever run into the case where that would not be true, you've found a critical bug. > + kn = kernfs_find_ns(parent, dentry->d_name.name, ns); > + if (kn) > + goto out_bad; > + } Umm... What's to prevent a race with successful rename(2)? IOW, what's there to stabilize ->d_parent and ->d_name while we are in that function?
On Fri, 2021-04-09 at 01:35 +0000, Al Viro wrote: > On Fri, Apr 09, 2021 at 09:15:06AM +0800, Ian Kent wrote: > > + parent = kernfs_dentry_node(dentry->d_parent); > > + if (parent) { > > + const void *ns = NULL; > > + > > + if (kernfs_ns_enabled(parent)) > > + ns = kernfs_info(dentry->d_parent- > > >d_sb)->ns; > > For any dentry d, we have d->d_parent->d_sb == d->d_sb. All > the time. > If you ever run into the case where that would not be true, you've > found > a critical bug. Right, yes. > > > + kn = kernfs_find_ns(parent, dentry- > > >d_name.name, ns); > > + if (kn) > > + goto out_bad; > > + } > > Umm... What's to prevent a race with successful rename(2)? IOW, > what's > there to stabilize ->d_parent and ->d_name while we are in that > function? Indeed, glad you looked at this. Now I'm wondering how kerfs_iop_rename() protects itself from concurrent kernfs_rename_ns() ...
On Fri, 2021-04-09 at 16:26 +0800, Ian Kent wrote: > On Fri, 2021-04-09 at 01:35 +0000, Al Viro wrote: > > On Fri, Apr 09, 2021 at 09:15:06AM +0800, Ian Kent wrote: > > > + parent = kernfs_dentry_node(dentry->d_parent); > > > + if (parent) { > > > + const void *ns = NULL; > > > + > > > + if (kernfs_ns_enabled(parent)) > > > + ns = kernfs_info(dentry->d_parent- > > > > d_sb)->ns; > > > > For any dentry d, we have d->d_parent->d_sb == d->d_sb. All > > the time. > > If you ever run into the case where that would not be true, you've > > found > > a critical bug. > > Right, yes. > > > > + kn = kernfs_find_ns(parent, dentry- > > > > d_name.name, ns); > > > + if (kn) > > > + goto out_bad; > > > + } > > > > Umm... What's to prevent a race with successful rename(2)? IOW, > > what's > > there to stabilize ->d_parent and ->d_name while we are in that > > function? > > Indeed, glad you looked at this. > > Now I'm wondering how kerfs_iop_rename() protects itself from > concurrent kernfs_rename_ns() ... As I thought ... I haven't done an exhaustive search but I can't find any file system that doesn't call back into kernfs from kernfs_syscall_ops (if provided at kernfs root creation). I don't see anything that uses kernfs that defines a .rename() op but if there was one it would be expected to call back into kernfs at which point it would block on kernfs_mutex (kernfs_rwsem) until it's released. So I don't think there can be changes in this case due to the lock taken just above the code your questioning. I need to think a bit about whether the dentry being negative (ie. not having kernfs node) could allow bad things to happen ... Or am I misunderstanding the race your pointing out here? Ian
diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c index 4c69e2af82dac..edfeee1bf38ec 100644 --- a/fs/kernfs/dir.c +++ b/fs/kernfs/dir.c @@ -1037,12 +1037,33 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags) if (flags & LOOKUP_RCU) return -ECHILD; - /* Always perform fresh lookup for negatives */ - if (d_really_is_negative(dentry)) - goto out_bad_unlocked; + mutex_lock(&kernfs_mutex); kn = kernfs_dentry_node(dentry); - mutex_lock(&kernfs_mutex); + + /* Negative hashed dentry? */ + if (!kn) { + struct kernfs_node *parent; + + /* If the kernfs node can be found this is a stale negative + * hashed dentry so it must be discarded and the lookup redone. + */ + parent = kernfs_dentry_node(dentry->d_parent); + if (parent) { + const void *ns = NULL; + + if (kernfs_ns_enabled(parent)) + ns = kernfs_info(dentry->d_parent->d_sb)->ns; + kn = kernfs_find_ns(parent, dentry->d_name.name, ns); + if (kn) + goto out_bad; + } + + /* The kernfs node doesn't exist, leave the dentry negative + * and return success. + */ + goto out; + } /* The kernfs node has been deactivated */ if (!kernfs_active_read(kn)) @@ -1060,12 +1081,11 @@ static int kernfs_dop_revalidate(struct dentry *dentry, unsigned int flags) if (kn->parent && kernfs_ns_enabled(kn->parent) && kernfs_info(dentry->d_sb)->ns != kn->ns) goto out_bad; - +out: mutex_unlock(&kernfs_mutex); return 1; out_bad: mutex_unlock(&kernfs_mutex); -out_bad_unlocked: return 0; } @@ -1080,33 +1100,24 @@ static struct dentry *kernfs_iop_lookup(struct inode *dir, struct dentry *ret; struct kernfs_node *parent = dir->i_private; struct kernfs_node *kn; - struct inode *inode; + struct inode *inode = NULL; const void *ns = NULL; mutex_lock(&kernfs_mutex); - if (kernfs_ns_enabled(parent)) ns = kernfs_info(dir->i_sb)->ns; kn = kernfs_find_ns(parent, dentry->d_name.name, ns); - - /* no such entry */ - if (!kn || !kernfs_active(kn)) { - ret = NULL; - goto out_unlock; - } - /* attach dentry and inode */ - inode = kernfs_get_inode(dir->i_sb, kn); - if (!inode) { - ret = ERR_PTR(-ENOMEM); - goto out_unlock; + if (kn && kernfs_active(kn)) { + inode = kernfs_get_inode(dir->i_sb, kn); + if (!inode) + inode = ERR_PTR(-ENOMEM); } - - /* instantiate and hash dentry */ + /* instantiate and hash (possibly negative) dentry */ ret = d_splice_alias(inode, dentry); - out_unlock: mutex_unlock(&kernfs_mutex); + return ret; }
If there are many lookups for non-existent paths these negative lookups can lead to a lot of overhead during path walks. The VFS allows dentries to be created as negative and hashed, and caches them so they can be used to reduce the fairly high overhead alloc/free cycle that occurs during these lookups. Signed-off-by: Ian Kent <raven@themaw.net> --- fs/kernfs/dir.c | 55 +++++++++++++++++++++++++++++++++---------------------- 1 file changed, 33 insertions(+), 22 deletions(-)