Message ID | 1433424586-7771-3-git-send-email-miklos@szeredi.hu (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Jun 04, 2015 at 03:29:46PM +0200, Miklos Szeredi wrote: > From: Miklos Szeredi <mszeredi@suse.cz> > > Allow filesystems with .d_revalidate as lower layer(s), but not as upper > layer. > > For local filesystems the rule was that modifications on the layers > directly while being part of the overlay results in undefined behavior. > > This can easily be extended to distributed filesystems: we assume the tree > used as lower layer is static, which means ->d_revalidate() should always > return "1". If that is not the case, return -ESTALE, don't try to work > around the modification. Umm... Cosmetical point is that this > +static bool ovl_remote(struct dentry *root) > +{ > + const struct dentry_operations *dop = root->d_op; > + > + return dop && (dop->d_revalidate || dop->d_weak_revalidate); > +} is better done as root->d_flags & (DCACHE_OP_REVALIDATE | DCACHE_OP_WEAK_REVALIDATE) More interesting question is whether anything in the system relies on existing behaviour that follows ->d_revalidate() returning 0. Have you tried to mount e.g. procfs as underlying layer and torture it for a while? -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Fri, Jun 05, 2015 at 01:07:15AM +0100, Al Viro wrote: > Umm... Cosmetical point is that this > > > +static bool ovl_remote(struct dentry *root) > > +{ > > + const struct dentry_operations *dop = root->d_op; > > + > > + return dop && (dop->d_revalidate || dop->d_weak_revalidate); > > +} > > is better done as > root->d_flags & (DCACHE_OP_REVALIDATE | DCACHE_OP_WEAK_REVALIDATE) Okay. > > More interesting question is whether anything in the system relies on > existing behaviour that follows ->d_revalidate() returning 0. Hmm, d_invalidate() almost always follows ->d_revalidate(). Almost, becuase RCU lookup can get aborted at that point. We can easily stick d_invalidate() in there for the non-RCU case. Regular lookup also almost always follows ->d_revalidate(). Except if allocation of new dentry fails. So relying on this would be buggy (which is not to say nobody does it). > Have you tried to mount e.g. procfs as underlying layer and torture it for a > while? I did try now. Nothing bad happened during the test (parallel stat(1) of the whole overlayed proc tree). My laptop froze while trying to write this mail. But it's 8 years old and when the fan starts to make noises and the weather is hot, it does this sometimes. I don't think that has anything to do with overlayfs, but will do more testing... Thanks, Miklos -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Miklos Szeredi <miklos@szeredi.hu> writes: > On Fri, Jun 05, 2015 at 01:07:15AM +0100, Al Viro wrote: > >> Umm... Cosmetical point is that this >> >> > +static bool ovl_remote(struct dentry *root) >> > +{ >> > + const struct dentry_operations *dop = root->d_op; >> > + >> > + return dop && (dop->d_revalidate || dop->d_weak_revalidate); >> > +} >> >> is better done as >> root->d_flags & (DCACHE_OP_REVALIDATE | DCACHE_OP_WEAK_REVALIDATE) > > Okay. > >> >> More interesting question is whether anything in the system relies on >> existing behaviour that follows ->d_revalidate() returning 0. > > Hmm, d_invalidate() almost always follows ->d_revalidate(). Almost, becuase RCU > lookup can get aborted at that point. We can easily stick d_invalidate() in > there for the non-RCU case. > > Regular lookup also almost always follows ->d_revalidate(). Except if > allocation of new dentry fails. So relying on this would be buggy (which is not > to say nobody does it). > >> Have you tried to mount e.g. procfs as underlying layer and torture it for a >> while? > > I did try now. Nothing bad happened during the test (parallel stat(1) of the > whole overlayed proc tree). > > My laptop froze while trying to write this mail. But it's 8 years old and when > the fan starts to make noises and the weather is hot, it does this sometimes. I > don't think that has anything to do with overlayfs, but will do more > testing... A nasty corner case to be aware of (and I think this is part of what Al was warning about). /proc/sys/net is different depending upon which current->nsproxy->net_ns. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Jun 7, 2015 at 3:02 AM, Eric W. Biederman <ebiederm@xmission.com> wrote: > A nasty corner case to be aware of (and I think this is part of what Al > was warning about). /proc/sys/net is different depending upon which > current->nsproxy->net_ns. Ah, I'm beginning to grasp what's going on there: mulitple dentries with the same name but belonging to different namespaces, ->d_compare() being used to select the right one. Is that it? Overlayfs checks for d_compare() on the root of the lower and upper trees, but here it only set on a subdirectory of proc, not on every dentry. So overlayfs should be careful and check for DCACHE_OP_COMPARE | DCACHE_OP_HASH and reject going down such a dentry. Thanks, Miklos -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Miklos Szeredi <miklos@szeredi.hu> writes: > On Sun, Jun 7, 2015 at 3:02 AM, Eric W. Biederman <ebiederm@xmission.com> wrote: >> A nasty corner case to be aware of (and I think this is part of what Al >> was warning about). /proc/sys/net is different depending upon which >> current->nsproxy->net_ns. > > Ah, I'm beginning to grasp what's going on there: mulitple dentries > with the same name but belonging to different namespaces, > ->d_compare() being used to select the right one. Is that it? Yes. The whole sysctl_is_seen magic. I am not proud of it, and I keep thinking I should create /proc/<pid>/sys/... making /proc/sys a symlink to /proc/<pid>/sys/ so that case could go away. Although at this point tomoyo or apparmor probably has rules that would make that impossible (despite no applications actually caring). *sigh* > Overlayfs checks for d_compare() on the root of the lower and upper > trees, but here it only set on a subdirectory of proc, not on every > dentry. So overlayfs should be careful and check for > DCACHE_OP_COMPARE | DCACHE_OP_HASH and reject going down such a > dentry. That sound about right. Eric -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/overlayfs/super.c b/fs/overlayfs/super.c index de9d2ee68ccf..822e6bbe7bf7 100644 --- a/fs/overlayfs/super.c +++ b/fs/overlayfs/super.c @@ -273,10 +273,52 @@ static void ovl_dentry_release(struct dentry *dentry) } } +static int ovl_dentry_revalidate(struct dentry *dentry, unsigned int flags) +{ + struct ovl_entry *oe = dentry->d_fsdata; + unsigned int i; + int ret = 1; + + for (i = 0; i < oe->numlower; i++) { + struct dentry *d = oe->lowerstack[i].dentry; + + if (d->d_flags & DCACHE_OP_REVALIDATE) { + ret = d->d_op->d_revalidate(d, flags); + if (ret <= 0) + return ret ? ret : -ESTALE; + } + } + return 1; +} + +static int ovl_dentry_weak_revalidate(struct dentry *dentry, unsigned int flags) +{ + struct ovl_entry *oe = dentry->d_fsdata; + unsigned int i; + int ret = 1; + + for (i = 0; i < oe->numlower; i++) { + struct dentry *d = oe->lowerstack[i].dentry; + + if (d->d_flags & DCACHE_OP_WEAK_REVALIDATE) { + ret = d->d_op->d_weak_revalidate(d, flags); + if (ret <= 0) + return ret ? ret : -ESTALE; + } + } + return 1; +} + static const struct dentry_operations ovl_dentry_operations = { .d_release = ovl_dentry_release, }; +static const struct dentry_operations ovl_reval_dentry_operations = { + .d_release = ovl_dentry_release, + .d_revalidate = ovl_dentry_revalidate, + .d_weak_revalidate = ovl_dentry_weak_revalidate, +}; + static struct ovl_entry *ovl_alloc_entry(unsigned int numlower) { size_t size = offsetof(struct ovl_entry, lowerstack[numlower]); @@ -705,18 +747,23 @@ static bool ovl_is_allowed_fs_type(struct dentry *root) /* * We don't support: * - autofs - * - filesystems with revalidate (FIXME for lower layer) * - filesystems with case insensitive names */ if (dop && (dop->d_manage || - dop->d_revalidate || dop->d_weak_revalidate || dop->d_compare || dop->d_hash)) { return false; } return true; } +static bool ovl_remote(struct dentry *root) +{ + const struct dentry_operations *dop = root->d_op; + + return dop && (dop->d_revalidate || dop->d_weak_revalidate); +} + static int ovl_mount_dir_noesc(const char *name, struct path *path) { int err = -EINVAL; @@ -755,13 +802,21 @@ static int ovl_mount_dir(const char *name, struct path *path) if (tmp) { ovl_unescape(tmp); err = ovl_mount_dir_noesc(tmp, path); + + if (!err) + if (ovl_remote(path->dentry)) { + pr_err("overlayfs: filesystem on '%s' not supported as upperdir\n", + tmp); + path_put(path); + err = -EINVAL; + } kfree(tmp); } return err; } static int ovl_lower_dir(const char *name, struct path *path, long *namelen, - int *stack_depth) + int *stack_depth, bool *remote) { int err; struct kstatfs statfs; @@ -778,6 +833,9 @@ static int ovl_lower_dir(const char *name, struct path *path, long *namelen, *namelen = max(*namelen, statfs.f_namelen); *stack_depth = max(*stack_depth, path->mnt->mnt_sb->s_stack_depth); + if (ovl_remote(path->dentry)) + *remote = true; + return 0; out_put: @@ -831,6 +889,7 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent) unsigned int numlower; unsigned int stacklen = 0; unsigned int i; + bool remote = false; int err; err = -ENOMEM; @@ -904,7 +963,8 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent) lower = lowertmp; for (numlower = 0; numlower < stacklen; numlower++) { err = ovl_lower_dir(lower, &stack[numlower], - &ufs->lower_namelen, &sb->s_stack_depth); + &ufs->lower_namelen, &sb->s_stack_depth, + &remote); if (err) goto out_put_lowerpath; @@ -962,7 +1022,10 @@ static int ovl_fill_super(struct super_block *sb, void *data, int silent) if (!ufs->upper_mnt) sb->s_flags |= MS_RDONLY; - sb->s_d_op = &ovl_dentry_operations; + if (remote) + sb->s_d_op = &ovl_reval_dentry_operations; + else + sb->s_d_op = &ovl_dentry_operations; err = -ENOMEM; oe = ovl_alloc_entry(numlower);