Autofs and mount namespaces

From: Ian Kent <ikent@redhat.com>

On Thu, 2016-04-07 at 11:19 -0700, Omar Sandoval wrote:
> Hi, Ian,
> 
> I wanted to bring up the issue of autofs and mount namespaces again,
> which recently resurfaced in the thread here [1]. In that thread, you
> mentioned that you had some patches to make autofs namespace-aware
> that
> you were holding on to. Do you think you could put those up so we can
> all work out the issues you mentioned?

I can but I haven't tested them at all.

Possibly I could post them to the autofs list if you want to test and
probably make changes to get them working.

Would that be ok .... umm, I think I need to post the patch now anyway
as it's the easiest way to demonstrate what I think is a better approach
than the patch below ....

I'm struggling to get back to work on this but I'm getting there and
hope to return to it soon.

> 
> For some background, we've also been bitten by the lack of
> namespace-awareness in autofs many times. The simple case that breaks
> can be reproduced as follows, assuming that /mnt/foo is an indirect
> mount:
> 
> ls /mnt/foo            # do the initial automount
> unshare -m sleep 10 &  # hold the automount in a new namespace
> umount /mnt/foo        # pretend the mount timed out
> ls /mnt/foo            # try to access it again
> ls: cannot open directory '/mnt/foo': Too many levels of symbolic
> links
> 
> For us, the unshare step is actually setting up a container. Our
> workaround was to make sure that we clean up all of the base system's
> mounts when setting up the container, but in the general case this is
> still broken.

Yeah, that looks like a simple test I can use to test my change.
I'll use this.

> 
> In this particular case, I believe the problem is that we're using
> `d_mountpoint()` (or `have_submounts()`) in several places to check
> whether something is already mounted, and that's not namespace-aware.
> This hack mitigates the problem I saw above, although it probably
> ignores edge cases that the old code was handling:
> 
> ----
> diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
> index 7ab923940d18..a620822342c8 100644
> --- a/fs/autofs4/root.c
> +++ b/fs/autofs4/root.c
> @@ -378,39 +378,29 @@ static struct vfsmount
> *autofs4_d_automount(struct path *path)
>  		goto done;
>  	}
>  
> -	if (!d_mountpoint(dentry)) {
> -		/*
> -		 * It's possible that user space hasn't removed
> directories
> -		 * after umounting a rootless multi-mount, although
> it
> -		 * should. For v5 have_submounts() is sufficient to
> handle
> -		 * this because the leaves of the directory tree
> under the
> -		 * mount never trigger mounts themselves (they have
> an autofs
> -		 * trigger mount mounted on them). But v4 pseudo
> direct mounts
> -		 * do need the leaves to trigger mounts. In this case
> we
> -		 * have no choice but to use the list_empty() check
> and
> -		 * require user space behave.
> -		 */
> -		if (sbi->version > 4) {
> -			if (have_submounts(dentry)) {
> -				spin_unlock(&sbi->fs_lock);
> -				goto done;
> -			}
> -		} else {
> -			if (!simple_empty(dentry)) {
> -				spin_unlock(&sbi->fs_lock);
> -				goto done;
> -			}
> -		}
> -		ino->flags |= AUTOFS_INF_PENDING;
> -		spin_unlock(&sbi->fs_lock);
> -		status = autofs4_mount_wait(dentry, 0);
> -		spin_lock(&sbi->fs_lock);
> -		ino->flags &= ~AUTOFS_INF_PENDING;
> -		if (status) {
> +	if (d_mountpoint(dentry)) {
> +		/* Make sure this is a mountpoint in this namespace.
> */
> +		struct vfsmount *mounted = lookup_mnt(path);
> +		if (mounted) {
> +			mntput(mounted);
>  			spin_unlock(&sbi->fs_lock);
> -			return ERR_PTR(status);
> +			goto done;
>  		}
>  	}
> +
> +	if (!simple_empty(dentry)) {
> +		spin_unlock(&sbi->fs_lock);
> +		goto done;
> +	}
> +	ino->flags |= AUTOFS_INF_PENDING;
> +	spin_unlock(&sbi->fs_lock);
> +	status = autofs4_mount_wait(dentry, 0);
> +	spin_lock(&sbi->fs_lock);
> +	ino->flags &= ~AUTOFS_INF_PENDING;
> +	if (status) {
> +		spin_unlock(&sbi->fs_lock);
> +		return ERR_PTR(status);
> +	}
>  	spin_unlock(&sbi->fs_lock);
>  done:
>  	/* Mount succeeded, check if we ended up with a new dentry */
> diff --git a/fs/autofs4/waitq.c b/fs/autofs4/waitq.c
> index 0146d911f468..b4e901d2aa6b 100644
> --- a/fs/autofs4/waitq.c
> +++ b/fs/autofs4/waitq.c
> @@ -332,8 +332,6 @@ static int validate_request(struct
> autofs_wait_queue **wait,
>  					dentry = new;
>  			}
>  		}
> -		if (have_submounts(dentry))
> -			valid = 0;
>  
>  		if (new)
>  			dput(new);
> diff --git a/fs/internal.h b/fs/internal.h
> index b71deeecea17..c26ba59eec1b 100644
> --- a/fs/internal.h
> +++ b/fs/internal.h
> @@ -58,7 +58,6 @@ extern int vfs_path_lookup(struct dentry *, struct
> vfsmount *,
>  extern void *copy_mount_options(const void __user *);
>  extern char *copy_mount_string(const void __user *);
>  
> -extern struct vfsmount *lookup_mnt(struct path *);
>  extern int finish_automount(struct vfsmount *, struct path *);
>  
>  extern int sb_prepare_remount_readonly(struct super_block *);
> diff --git a/include/linux/mount.h b/include/linux/mount.h
> index f822c3c11377..ff971448152f 100644
> --- a/include/linux/mount.h
> +++ b/include/linux/mount.h
> @@ -72,6 +72,7 @@ struct vfsmount {
>  struct file; /* forward dec */
>  struct path;
>  
> +extern struct vfsmount *lookup_mnt(struct path *);

Can't say I like this.

If I thought making lookup_mnt() visible was acceptable (from a VFS POV)
that would have made a bunch of things much easier over the years.

There's also expiration and the have_submounts() VFS check to consider.
The have_submounts() check is used when validating a request before
sending it. I'm not certain about the expire, but consider independent
automount daemons in different containers using overlapping automount
name spaces (ie. paths or mounts with the same super block).

In any case I don't think you don't need lookup_mnt(), see blow.

>  extern int mnt_want_write(struct vfsmount *mnt);
>  extern int mnt_want_write_file(struct file *file);
>  extern int mnt_clone_write(struct vfsmount *mnt);
> ----

I think this is a better approach (beware, not even compile tested so it
almost certainly doesn't work properly) and I'm not even sure that
exposing __is_local_mountpoint() is ok from a VFS POV but something
needs to be done:

autofs4 - make mountpoint checks namespace aware

From: Ian Kent <ikent@redhat.com>

If an automount mount is clone(2)ed into a file system that is
propagation private, when it later expires the mount subsequent
calls to autofs ->d_automount() for in that dentry in that
namespace will return ELOOP until the mount is manually umounted.

In the same way, if an autofs mount is triggered by automount(8)
within a container the dentry will be seen as mounted in the
root init namespace and calls to ->d_automount() in that namespace
will return ELOOP until the mount is umounted within the container.

Also, have_submounts() can return an incorect result when a mount
exists in a namespace other than the one being checked.

Signed-off-by: Ian Kent <ikent@redhat.com>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
---
 fs/autofs4/expire.c   |    4 ++--
 fs/autofs4/root.c     |   14 +++++++-------
 fs/dcache.c           |    2 +-
 fs/mount.h            |    9 ---------
 fs/namespace.c        |    1 +
 include/linux/mount.h |    9 +++++++++
 6 files changed, 20 insertions(+), 19 deletions(-)

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Autofs and mount namespaces

Commit Message

Comments

Patch