diff mbox

[REVIEW,1/2] mnt: In propgate_umount handle visiting mounts in any order

Message ID 20170524204253.GB5554@ram.oc3035372033.ibm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ram Pai May 24, 2017, 8:42 p.m. UTC
On Wed, May 17, 2017 at 12:54:34AM -0500, Eric W. Biederman wrote:
> 
> While investigating some poor umount performance I realized that in
> the case of overlapping mount trees where some of the mounts are locked
> the code has been failing to unmount all of the mounts it should
> have been unmounting.
> 
> This failure to unmount all of the necessary
> mounts can be reproduced with:
> 
> $ cat locked_mounts_test.sh
> 
> mount -t tmpfs test-base /mnt
> mount --make-shared /mnt
> mkdir -p /mnt/b
> 
> mount -t tmpfs test1 /mnt/b
> mount --make-shared /mnt/b
> mkdir -p /mnt/b/10
> 
> mount -t tmpfs test2 /mnt/b/10
> mount --make-shared /mnt/b/10
> mkdir -p /mnt/b/10/20
> 
> mount --rbind /mnt/b /mnt/b/10/20
> 
> unshare -Urm --propagation unchaged /bin/sh -c 'sleep 5; if [ $(grep test /proc/self/mountinfo | wc -l) -eq 1 ] ; then echo SUCCESS ; else echo FAILURE ; fi'
> sleep 1
> umount -l /mnt/b
> wait %%
> 
> $ unshare -Urm ./locked_mounts_test.sh
> 
> This failure is corrected by removing the prepass that marks mounts
> that may be umounted.
> 
> A first pass is added that umounts mounts if possible and if not sets
> mount mark if they could be unmounted if they weren't locked and adds
> them to a list to umount possibilities.  This first pass reconsiders
> the mounts parent if it is on the list of umount possibilities, ensuring
> that information of umoutability will pass from child to mount parent.
> 
> A second pass then walks through all mounts that are umounted and processes
> their children unmounting them or marking them for reparenting.
> 
> A last pass cleans up the state on the mounts that could not be umounted
> and if applicable reparents them to their first parent that remained
> mounted.
> 
> While a bit longer than the old code this code is much more robust
> as it allows information to flow up from the leaves and down
> from the trunk making the order in which mounts are encountered
> in the umount propgation tree irrelevant.

Eric,
	I think we can accomplish what you want in a much simpler way.
       	Would the patch below; UNTESTED BUT COMPILED, resolve your
	issue?

	Its a two pass unmount. First pass marks mounts that can
	be unmounted, and second pass does the neccessary unlinks.
	It does mark TUCKED mounts, and uses that information
	to peel off the correct mounts. Key points are

	a) a tucked mount never entertain any unmount propagation
	 	on its root dentry.

	b) when the child on the root dentry of a tucked mount is
	   unmounted, the tucked mount is not a tucked mount anymore.

	c) if the child is a tucked mount, than its child is reparented
	   to the parent.


Signed-off-by: "Ram Pai" <linuxram@us.ibm.com>

fs/namespace.c        |    4 ++-
fs/pnode.c            |   53 +++++++++++++++++++++++++++++++++++++-------------
fs/pnode.h            |    3 ++
include/linux/mount.h |    1

Comments

Eric W. Biederman May 24, 2017, 9:54 p.m. UTC | #1
Ram Pai <linuxram@us.ibm.com> writes:

> On Wed, May 17, 2017 at 12:54:34AM -0500, Eric W. Biederman wrote:
>> 
>> While investigating some poor umount performance I realized that in
>> the case of overlapping mount trees where some of the mounts are locked
>> the code has been failing to unmount all of the mounts it should
>> have been unmounting.
>> 
>> This failure to unmount all of the necessary
>> mounts can be reproduced with:
>> 
>> $ cat locked_mounts_test.sh
>> 
>> mount -t tmpfs test-base /mnt
>> mount --make-shared /mnt
>> mkdir -p /mnt/b
>> 
>> mount -t tmpfs test1 /mnt/b
>> mount --make-shared /mnt/b
>> mkdir -p /mnt/b/10
>> 
>> mount -t tmpfs test2 /mnt/b/10
>> mount --make-shared /mnt/b/10
>> mkdir -p /mnt/b/10/20
>> 
>> mount --rbind /mnt/b /mnt/b/10/20
>> 
>> unshare -Urm --propagation unchaged /bin/sh -c 'sleep 5; if [ $(grep test /proc/self/mountinfo | wc -l) -eq 1 ] ; then echo SUCCESS ; else echo FAILURE ; fi'
>> sleep 1
>> umount -l /mnt/b
>> wait %%
>> 
>> $ unshare -Urm ./locked_mounts_test.sh
>> 
>> This failure is corrected by removing the prepass that marks mounts
>> that may be umounted.
>> 
>> A first pass is added that umounts mounts if possible and if not sets
>> mount mark if they could be unmounted if they weren't locked and adds
>> them to a list to umount possibilities.  This first pass reconsiders
>> the mounts parent if it is on the list of umount possibilities, ensuring
>> that information of umoutability will pass from child to mount parent.
>> 
>> A second pass then walks through all mounts that are umounted and processes
>> their children unmounting them or marking them for reparenting.
>> 
>> A last pass cleans up the state on the mounts that could not be umounted
>> and if applicable reparents them to their first parent that remained
>> mounted.
>> 
>> While a bit longer than the old code this code is much more robust
>> as it allows information to flow up from the leaves and down
>> from the trunk making the order in which mounts are encountered
>> in the umount propgation tree irrelevant.
>
> Eric,
> 	I think we can accomplish what you want in a much simpler way.
>        	Would the patch below; UNTESTED BUT COMPILED, resolve your
> 	issue?

The reason I came up with an algorithm where the information flows
both directions is that especially in the case of umount -l
but even in some rare cases of a simple umount, the ordering
of the mount propagation tree can result in parent mounts being
visited before the child mounts.

This case shows in in the case of a mount or a set of mounts
being mounted below itself.

So no.   Irregardless of tucked mount state we can't do this.

I see this also doesn't have the change to move mnt_change_mountpoint
into another pass.  That one is quite important from a practical
point of view as that means the way the mount tree changes in umount
is the same irrespective of the number of times a mount shows
up in the mount propagation trees.  Which is a very important
property to have for optimizing umount -l.  Which in
the worst case allows reduces umount from O(N^2+) to roughly O(N).

All of what I am doing should have not effect on an implementation of
MNT_TUCKED.

That said your code to deal with MNT_TUCKED seems reasonable.

Eric

>
> 	Its a two pass unmount. First pass marks mounts that can
> 	be unmounted, and second pass does the neccessary unlinks.
> 	It does mark TUCKED mounts, and uses that information
> 	to peel off the correct mounts. Key points are
>
> 	a) a tucked mount never entertain any unmount propagation
> 	 	on its root dentry.
>
> 	b) when the child on the root dentry of a tucked mount is
> 	   unmounted, the tucked mount is not a tucked mount anymore.
>
> 	c) if the child is a tucked mount, than its child is reparented
> 	   to the parent.
>
>
> Signed-off-by: "Ram Pai" <linuxram@us.ibm.com>
>
> fs/namespace.c        |    4 ++-
> fs/pnode.c            |   53 +++++++++++++++++++++++++++++++++++++-------------
> fs/pnode.h            |    3 ++
> include/linux/mount.h |    1 
>
> diff --git a/fs/namespace.c b/fs/namespace.c
> index cc1375ef..ff3ec90 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -2050,8 +2050,10 @@ static int attach_recursive_mnt(struct mount *source_mnt,
>  		hlist_del_init(&child->mnt_hash);
>  		q = __lookup_mnt(&child->mnt_parent->mnt,
>  				 child->mnt_mountpoint);
> -		if (q)
> +		if (q) {
>  			mnt_change_mountpoint(child, smp, q);
> +			SET_MNT_TUCKED(child);
> +		}
>  		commit_tree(child);
>  	}
>  	put_mountpoint(smp);
> diff --git a/fs/pnode.c b/fs/pnode.c
> index 5bc7896..b44a544 100644
> --- a/fs/pnode.c
> +++ b/fs/pnode.c
> @@ -448,31 +448,58 @@ static void __propagate_umount(struct mount *mnt)
>  
>  	for (m = propagation_next(parent, parent); m;
>  			m = propagation_next(m, parent)) {
> -		struct mount *topper;
> -		struct mount *child = __lookup_mnt(&m->mnt,
> -						mnt->mnt_mountpoint);
> -		/*
> -		 * umount the child only if the child has no children
> -		 * and the child is marked safe to unmount.
> +		struct mount *topper, *child;
> +
> +		/* Tucked mount must drop umount propagation events on
> +		 * its **root dentry**.
> +		 * The tucked mount did not exist when that child came
> +		 * into existence. It never received that mount propagation.
> +		 * Hence it should never entertain the umount propagation
> +		 * aswell.
>  		 */
> +		if (IS_MNT_TUCKED(m) && list_is_singular(&mnt->mnt_mounts))
> +			continue;
> +
> +
> +		child = __lookup_mnt(&m->mnt, mnt->mnt_mountpoint);
> +
>  		if (!child || !IS_MNT_MARKED(child))
>  			continue;
> +
>  		CLEAR_MNT_MARK(child);
>  
> -		/* If there is exactly one mount covering all of child
> -		 * replace child with that mount.
> -		 */
> -		topper = find_topper(child);
> -		if (topper)
> -			mnt_change_mountpoint(child->mnt_parent, child->mnt_mp,
> -					      topper);
> +		if (IS_MNT_TUCKED(child) &&
> +			(list_is_singular(&child->mnt_mounts))) {
> +			topper = find_topper(child);
> +			if (topper) {
> +				mnt_change_mountpoint(child->mnt_parent,
> +					child->mnt_mp, topper);
> +				CLEAR_MNT_TUCKED(child); /*lets be precise*/
> +			}
> +		}
>  
>  		if (list_empty(&child->mnt_mounts)) {
>  			list_del_init(&child->mnt_child);
>  			child->mnt.mnt_flags |= MNT_UMOUNT;
>  			list_move_tail(&child->mnt_list, &mnt->mnt_list);
>  		}
> +#if 0
> +	       	else {
> +			mntput(child); /* mark it for deletion. It will
> +				       	  be deleted whenever it looses
> +					  all its remaining references.
> +					  TODO: some more thought
> +					  needed, please validate */
> +		}
> +#endif
>  	}
> +
> +	/*
> +	 * This explicit umount operation is exposing the parent.
> +	 * In case the parent was a 'tucked' mount, it cannot be so
> +	 * anymore.
> +	 */
> +	CLEAR_MNT_TUCKED(parent);
>  }
>  
>  /*
> diff --git a/fs/pnode.h b/fs/pnode.h
> index dc87e65..9ebd1a8 100644
> --- a/fs/pnode.h
> +++ b/fs/pnode.h
> @@ -18,8 +18,11 @@
>  #define IS_MNT_UNBINDABLE(m) ((m)->mnt.mnt_flags & MNT_UNBINDABLE)
>  #define IS_MNT_MARKED(m) ((m)->mnt.mnt_flags & MNT_MARKED)
>  #define SET_MNT_MARK(m) ((m)->mnt.mnt_flags |= MNT_MARKED)
> +#define SET_MNT_TUCKED(m) ((m)->mnt.mnt_flags |= MNT_TUCKED)
>  #define CLEAR_MNT_MARK(m) ((m)->mnt.mnt_flags &= ~MNT_MARKED)
> +#define CLEAR_MNT_TUCKED(m) ((m)->mnt.mnt_flags &= ~MNT_TUCKED)
>  #define IS_MNT_LOCKED(m) ((m)->mnt.mnt_flags & MNT_LOCKED)
> +#define IS_MNT_TUCKED(m) ((m)->mnt.mnt_flags & MNT_TUCKED)
>  
>  #define CL_EXPIRE    		0x01
>  #define CL_SLAVE     		0x02
> diff --git a/include/linux/mount.h b/include/linux/mount.h
> index 8e0352a..41674e7 100644
> --- a/include/linux/mount.h
> +++ b/include/linux/mount.h
> @@ -62,6 +62,7 @@
>  #define MNT_SYNC_UMOUNT		0x2000000
>  #define MNT_MARKED		0x4000000
>  #define MNT_UMOUNT		0x8000000
> +#define MNT_TUCKED		0x10000000
>  
>  struct vfsmount {
>  	struct dentry *mnt_root;	/* root of the mounted tree */
Ram Pai May 24, 2017, 10:35 p.m. UTC | #2
On Wed, May 24, 2017 at 04:54:34PM -0500, Eric W. Biederman wrote:
> Ram Pai <linuxram@us.ibm.com> writes:
> 
> > On Wed, May 17, 2017 at 12:54:34AM -0500, Eric W. Biederman wrote:
> >> 
> >> While investigating some poor umount performance I realized that in
> >> the case of overlapping mount trees where some of the mounts are locked
> >> the code has been failing to unmount all of the mounts it should
> >> have been unmounting.
> >> 
> >> This failure to unmount all of the necessary
> >> mounts can be reproduced with:
> >> 
> >> $ cat locked_mounts_test.sh
> >> 
> >> mount -t tmpfs test-base /mnt
> >> mount --make-shared /mnt
> >> mkdir -p /mnt/b
> >> 
> >> mount -t tmpfs test1 /mnt/b
> >> mount --make-shared /mnt/b
> >> mkdir -p /mnt/b/10
> >> 
> >> mount -t tmpfs test2 /mnt/b/10
> >> mount --make-shared /mnt/b/10
> >> mkdir -p /mnt/b/10/20
> >> 
> >> mount --rbind /mnt/b /mnt/b/10/20
> >> 
> >> unshare -Urm --propagation unchaged /bin/sh -c 'sleep 5; if [ $(grep test /proc/self/mountinfo | wc -l) -eq 1 ] ; then echo SUCCESS ; else echo FAILURE ; fi'
> >> sleep 1
> >> umount -l /mnt/b
> >> wait %%
> >> 
> >> $ unshare -Urm ./locked_mounts_test.sh
> >> 
> >> This failure is corrected by removing the prepass that marks mounts
> >> that may be umounted.
> >> 
> >> A first pass is added that umounts mounts if possible and if not sets
> >> mount mark if they could be unmounted if they weren't locked and adds
> >> them to a list to umount possibilities.  This first pass reconsiders
> >> the mounts parent if it is on the list of umount possibilities, ensuring
> >> that information of umoutability will pass from child to mount parent.
> >> 
> >> A second pass then walks through all mounts that are umounted and processes
> >> their children unmounting them or marking them for reparenting.
> >> 
> >> A last pass cleans up the state on the mounts that could not be umounted
> >> and if applicable reparents them to their first parent that remained
> >> mounted.
> >> 
> >> While a bit longer than the old code this code is much more robust
> >> as it allows information to flow up from the leaves and down
> >> from the trunk making the order in which mounts are encountered
> >> in the umount propgation tree irrelevant.
> >
> > Eric,
> > 	I think we can accomplish what you want in a much simpler way.
> >        	Would the patch below; UNTESTED BUT COMPILED, resolve your
> > 	issue?
> 
> The reason I came up with an algorithm where the information flows
> both directions is that especially in the case of umount -l
> but even in some rare cases of a simple umount, the ordering
> of the mount propagation tree can result in parent mounts being
> visited before the child mounts.
> 
> This case shows in in the case of a mount or a set of mounts
> being mounted below itself.
> 
> So no.   Irregardless of tucked mount state we can't do this.

Ok. I thought I had taken care, regardles of the order in which the mounts
were encountered. I need to understand your patch better. Will relook at it later
tonight.

RP

> 
> I see this also doesn't have the change to move mnt_change_mountpoint
> into another pass.  That one is quite important from a practical
> point of view as that means the way the mount tree changes in umount
> is the same irrespective of the number of times a mount shows
> up in the mount propagation trees.  Which is a very important
> property to have for optimizing umount -l.  Which in
> the worst case allows reduces umount from O(N^2+) to roughly O(N).
> 
> All of what I am doing should have not effect on an implementation of
> MNT_TUCKED.
> 
> That said your code to deal with MNT_TUCKED seems reasonable.
> 
> Eric
> 
> >
> > 	Its a two pass unmount. First pass marks mounts that can
> > 	be unmounted, and second pass does the neccessary unlinks.
> > 	It does mark TUCKED mounts, and uses that information
> > 	to peel off the correct mounts. Key points are
> >
> > 	a) a tucked mount never entertain any unmount propagation
> > 	 	on its root dentry.
> >
> > 	b) when the child on the root dentry of a tucked mount is
> > 	   unmounted, the tucked mount is not a tucked mount anymore.
> >
> > 	c) if the child is a tucked mount, than its child is reparented
> > 	   to the parent.
> >
> >
> > Signed-off-by: "Ram Pai" <linuxram@us.ibm.com>
> >
> > fs/namespace.c        |    4 ++-
> > fs/pnode.c            |   53 +++++++++++++++++++++++++++++++++++++-------------
> > fs/pnode.h            |    3 ++
> > include/linux/mount.h |    1 
> >
> > diff --git a/fs/namespace.c b/fs/namespace.c
> > index cc1375ef..ff3ec90 100644
> > --- a/fs/namespace.c
> > +++ b/fs/namespace.c
> > @@ -2050,8 +2050,10 @@ static int attach_recursive_mnt(struct mount *source_mnt,
> >  		hlist_del_init(&child->mnt_hash);
> >  		q = __lookup_mnt(&child->mnt_parent->mnt,
> >  				 child->mnt_mountpoint);
> > -		if (q)
> > +		if (q) {
> >  			mnt_change_mountpoint(child, smp, q);
> > +			SET_MNT_TUCKED(child);
> > +		}
> >  		commit_tree(child);
> >  	}
> >  	put_mountpoint(smp);
> > diff --git a/fs/pnode.c b/fs/pnode.c
> > index 5bc7896..b44a544 100644
> > --- a/fs/pnode.c
> > +++ b/fs/pnode.c
> > @@ -448,31 +448,58 @@ static void __propagate_umount(struct mount *mnt)
> >  
> >  	for (m = propagation_next(parent, parent); m;
> >  			m = propagation_next(m, parent)) {
> > -		struct mount *topper;
> > -		struct mount *child = __lookup_mnt(&m->mnt,
> > -						mnt->mnt_mountpoint);
> > -		/*
> > -		 * umount the child only if the child has no children
> > -		 * and the child is marked safe to unmount.
> > +		struct mount *topper, *child;
> > +
> > +		/* Tucked mount must drop umount propagation events on
> > +		 * its **root dentry**.
> > +		 * The tucked mount did not exist when that child came
> > +		 * into existence. It never received that mount propagation.
> > +		 * Hence it should never entertain the umount propagation
> > +		 * aswell.
> >  		 */
> > +		if (IS_MNT_TUCKED(m) && list_is_singular(&mnt->mnt_mounts))
> > +			continue;
> > +
> > +
> > +		child = __lookup_mnt(&m->mnt, mnt->mnt_mountpoint);
> > +
> >  		if (!child || !IS_MNT_MARKED(child))
> >  			continue;
> > +
> >  		CLEAR_MNT_MARK(child);
> >  
> > -		/* If there is exactly one mount covering all of child
> > -		 * replace child with that mount.
> > -		 */
> > -		topper = find_topper(child);
> > -		if (topper)
> > -			mnt_change_mountpoint(child->mnt_parent, child->mnt_mp,
> > -					      topper);
> > +		if (IS_MNT_TUCKED(child) &&
> > +			(list_is_singular(&child->mnt_mounts))) {
> > +			topper = find_topper(child);
> > +			if (topper) {
> > +				mnt_change_mountpoint(child->mnt_parent,
> > +					child->mnt_mp, topper);
> > +				CLEAR_MNT_TUCKED(child); /*lets be precise*/
> > +			}
> > +		}
> >  
> >  		if (list_empty(&child->mnt_mounts)) {
> >  			list_del_init(&child->mnt_child);
> >  			child->mnt.mnt_flags |= MNT_UMOUNT;
> >  			list_move_tail(&child->mnt_list, &mnt->mnt_list);
> >  		}
> > +#if 0
> > +	       	else {
> > +			mntput(child); /* mark it for deletion. It will
> > +				       	  be deleted whenever it looses
> > +					  all its remaining references.
> > +					  TODO: some more thought
> > +					  needed, please validate */
> > +		}
> > +#endif
> >  	}
> > +
> > +	/*
> > +	 * This explicit umount operation is exposing the parent.
> > +	 * In case the parent was a 'tucked' mount, it cannot be so
> > +	 * anymore.
> > +	 */
> > +	CLEAR_MNT_TUCKED(parent);
> >  }
> >  
> >  /*
> > diff --git a/fs/pnode.h b/fs/pnode.h
> > index dc87e65..9ebd1a8 100644
> > --- a/fs/pnode.h
> > +++ b/fs/pnode.h
> > @@ -18,8 +18,11 @@
> >  #define IS_MNT_UNBINDABLE(m) ((m)->mnt.mnt_flags & MNT_UNBINDABLE)
> >  #define IS_MNT_MARKED(m) ((m)->mnt.mnt_flags & MNT_MARKED)
> >  #define SET_MNT_MARK(m) ((m)->mnt.mnt_flags |= MNT_MARKED)
> > +#define SET_MNT_TUCKED(m) ((m)->mnt.mnt_flags |= MNT_TUCKED)
> >  #define CLEAR_MNT_MARK(m) ((m)->mnt.mnt_flags &= ~MNT_MARKED)
> > +#define CLEAR_MNT_TUCKED(m) ((m)->mnt.mnt_flags &= ~MNT_TUCKED)
> >  #define IS_MNT_LOCKED(m) ((m)->mnt.mnt_flags & MNT_LOCKED)
> > +#define IS_MNT_TUCKED(m) ((m)->mnt.mnt_flags & MNT_TUCKED)
> >  
> >  #define CL_EXPIRE    		0x01
> >  #define CL_SLAVE     		0x02
> > diff --git a/include/linux/mount.h b/include/linux/mount.h
> > index 8e0352a..41674e7 100644
> > --- a/include/linux/mount.h
> > +++ b/include/linux/mount.h
> > @@ -62,6 +62,7 @@
> >  #define MNT_SYNC_UMOUNT		0x2000000
> >  #define MNT_MARKED		0x4000000
> >  #define MNT_UMOUNT		0x8000000
> > +#define MNT_TUCKED		0x10000000
> >  
> >  struct vfsmount {
> >  	struct dentry *mnt_root;	/* root of the mounted tree */
diff mbox

Patch

diff --git a/fs/namespace.c b/fs/namespace.c
index cc1375ef..ff3ec90 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -2050,8 +2050,10 @@  static int attach_recursive_mnt(struct mount *source_mnt,
 		hlist_del_init(&child->mnt_hash);
 		q = __lookup_mnt(&child->mnt_parent->mnt,
 				 child->mnt_mountpoint);
-		if (q)
+		if (q) {
 			mnt_change_mountpoint(child, smp, q);
+			SET_MNT_TUCKED(child);
+		}
 		commit_tree(child);
 	}
 	put_mountpoint(smp);
diff --git a/fs/pnode.c b/fs/pnode.c
index 5bc7896..b44a544 100644
--- a/fs/pnode.c
+++ b/fs/pnode.c
@@ -448,31 +448,58 @@  static void __propagate_umount(struct mount *mnt)
 
 	for (m = propagation_next(parent, parent); m;
 			m = propagation_next(m, parent)) {
-		struct mount *topper;
-		struct mount *child = __lookup_mnt(&m->mnt,
-						mnt->mnt_mountpoint);
-		/*
-		 * umount the child only if the child has no children
-		 * and the child is marked safe to unmount.
+		struct mount *topper, *child;
+
+		/* Tucked mount must drop umount propagation events on
+		 * its **root dentry**.
+		 * The tucked mount did not exist when that child came
+		 * into existence. It never received that mount propagation.
+		 * Hence it should never entertain the umount propagation
+		 * aswell.
 		 */
+		if (IS_MNT_TUCKED(m) && list_is_singular(&mnt->mnt_mounts))
+			continue;
+
+
+		child = __lookup_mnt(&m->mnt, mnt->mnt_mountpoint);
+
 		if (!child || !IS_MNT_MARKED(child))
 			continue;
+
 		CLEAR_MNT_MARK(child);
 
-		/* If there is exactly one mount covering all of child
-		 * replace child with that mount.
-		 */
-		topper = find_topper(child);
-		if (topper)
-			mnt_change_mountpoint(child->mnt_parent, child->mnt_mp,
-					      topper);
+		if (IS_MNT_TUCKED(child) &&
+			(list_is_singular(&child->mnt_mounts))) {
+			topper = find_topper(child);
+			if (topper) {
+				mnt_change_mountpoint(child->mnt_parent,
+					child->mnt_mp, topper);
+				CLEAR_MNT_TUCKED(child); /*lets be precise*/
+			}
+		}
 
 		if (list_empty(&child->mnt_mounts)) {
 			list_del_init(&child->mnt_child);
 			child->mnt.mnt_flags |= MNT_UMOUNT;
 			list_move_tail(&child->mnt_list, &mnt->mnt_list);
 		}
+#if 0
+	       	else {
+			mntput(child); /* mark it for deletion. It will
+				       	  be deleted whenever it looses
+					  all its remaining references.
+					  TODO: some more thought
+					  needed, please validate */
+		}
+#endif
 	}
+
+	/*
+	 * This explicit umount operation is exposing the parent.
+	 * In case the parent was a 'tucked' mount, it cannot be so
+	 * anymore.
+	 */
+	CLEAR_MNT_TUCKED(parent);
 }
 
 /*
diff --git a/fs/pnode.h b/fs/pnode.h
index dc87e65..9ebd1a8 100644
--- a/fs/pnode.h
+++ b/fs/pnode.h
@@ -18,8 +18,11 @@ 
 #define IS_MNT_UNBINDABLE(m) ((m)->mnt.mnt_flags & MNT_UNBINDABLE)
 #define IS_MNT_MARKED(m) ((m)->mnt.mnt_flags & MNT_MARKED)
 #define SET_MNT_MARK(m) ((m)->mnt.mnt_flags |= MNT_MARKED)
+#define SET_MNT_TUCKED(m) ((m)->mnt.mnt_flags |= MNT_TUCKED)
 #define CLEAR_MNT_MARK(m) ((m)->mnt.mnt_flags &= ~MNT_MARKED)
+#define CLEAR_MNT_TUCKED(m) ((m)->mnt.mnt_flags &= ~MNT_TUCKED)
 #define IS_MNT_LOCKED(m) ((m)->mnt.mnt_flags & MNT_LOCKED)
+#define IS_MNT_TUCKED(m) ((m)->mnt.mnt_flags & MNT_TUCKED)
 
 #define CL_EXPIRE    		0x01
 #define CL_SLAVE     		0x02
diff --git a/include/linux/mount.h b/include/linux/mount.h
index 8e0352a..41674e7 100644
--- a/include/linux/mount.h
+++ b/include/linux/mount.h
@@ -62,6 +62,7 @@ 
 #define MNT_SYNC_UMOUNT		0x2000000
 #define MNT_MARKED		0x4000000
 #define MNT_UMOUNT		0x8000000
+#define MNT_TUCKED		0x10000000
 
 struct vfsmount {
 	struct dentry *mnt_root;	/* root of the mounted tree */