From patchwork Wed May 24 20:42:53 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ram Pai X-Patchwork-Id: 9746897 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 0B91D60210 for ; Wed, 24 May 2017 20:43:25 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EFC8E2849A for ; Wed, 24 May 2017 20:43:24 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id E263F2852A; Wed, 24 May 2017 20:43:24 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 45FBF2849A for ; Wed, 24 May 2017 20:43:24 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1035820AbdEXUnI (ORCPT ); Wed, 24 May 2017 16:43:08 -0400 Received: from mx0b-001b2d01.pphosted.com ([148.163.158.5]:44419 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1035807AbdEXUnC (ORCPT ); Wed, 24 May 2017 16:43:02 -0400 Received: from pps.filterd (m0098419.ppops.net [127.0.0.1]) by mx0b-001b2d01.pphosted.com (8.16.0.20/8.16.0.20) with SMTP id v4OKcfab088930 for ; Wed, 24 May 2017 16:43:01 -0400 Received: from e16.ny.us.ibm.com (e16.ny.us.ibm.com [129.33.205.206]) by mx0b-001b2d01.pphosted.com with ESMTP id 2angqm1nbr-1 (version=TLSv1.2 cipher=AES256-SHA bits=256 verify=NOT) for ; Wed, 24 May 2017 16:43:01 -0400 Received: from localhost by e16.ny.us.ibm.com with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted for from ; Wed, 24 May 2017 16:43:00 -0400 Received: from b01cxnp22035.gho.pok.ibm.com (9.57.198.25) by e16.ny.us.ibm.com (146.89.104.203) with IBM ESMTP SMTP Gateway: Authorized Use Only! Violators will be prosecuted; Wed, 24 May 2017 16:42:57 -0400 Received: from b01ledav004.gho.pok.ibm.com (b01ledav004.gho.pok.ibm.com [9.57.199.109]) by b01cxnp22035.gho.pok.ibm.com (8.14.9/8.14.9/NCO v10.0) with ESMTP id v4OKguNn64880798; Wed, 24 May 2017 20:42:56 GMT Received: from b01ledav004.gho.pok.ibm.com (unknown [127.0.0.1]) by IMSVA (Postfix) with ESMTP id 30965112040; Wed, 24 May 2017 16:42:58 -0400 (EDT) Received: from ram.oc3035372033.ibm.com (unknown [9.80.212.173]) by b01ledav004.gho.pok.ibm.com (Postfix) with ESMTPS id 88925112047; Wed, 24 May 2017 16:42:57 -0400 (EDT) Date: Wed, 24 May 2017 13:42:53 -0700 From: Ram Pai To: "Eric W. Biederman" Cc: Andrei Vagin , Al Viro , linux-fsdevel@vger.kernel.org Subject: Re: [REVIEW][PATCH 1/2] mnt: In propgate_umount handle visiting mounts in any order Reply-To: Ram Pai References: <87y3yr32ig.fsf@xmission.com> <87shoz32g8.fsf_-_@xmission.com> <87a8b6r0z5.fsf_-_@xmission.com> <20170514021504.GA19495@outlook.office365.com> <87inl4ozg2.fsf@xmission.com> <87y3tzoklh.fsf@xmission.com> <20170515182704.GA15539@outlook.office365.com> <87a86dj49b.fsf@xmission.com> <871srpj2yp.fsf_-_@xmission.com> <87efvodo4l.fsf_-_@xmission.com> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <87efvodo4l.fsf_-_@xmission.com> User-Agent: Mutt/1.5.20 (2009-12-10) X-TM-AS-GCONF: 00 x-cbid: 17052420-0024-0000-0000-0000027A0C9D X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00007112; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000212; SDB=6.00865114; UDB=6.00429530; IPR=6.00644895; BA=6.00005372; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00015569; XFM=3.00000015; UTC=2017-05-24 20:42:58 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17052420-0025-0000-0000-0000441F4036 Message-Id: <20170524204253.GB5554@ram.oc3035372033.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:, , definitions=2017-05-24_13:, , signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0 bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1703280000 definitions=main-1705240098 Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Wed, May 17, 2017 at 12:54:34AM -0500, Eric W. Biederman wrote: > > While investigating some poor umount performance I realized that in > the case of overlapping mount trees where some of the mounts are locked > the code has been failing to unmount all of the mounts it should > have been unmounting. > > This failure to unmount all of the necessary > mounts can be reproduced with: > > $ cat locked_mounts_test.sh > > mount -t tmpfs test-base /mnt > mount --make-shared /mnt > mkdir -p /mnt/b > > mount -t tmpfs test1 /mnt/b > mount --make-shared /mnt/b > mkdir -p /mnt/b/10 > > mount -t tmpfs test2 /mnt/b/10 > mount --make-shared /mnt/b/10 > mkdir -p /mnt/b/10/20 > > mount --rbind /mnt/b /mnt/b/10/20 > > unshare -Urm --propagation unchaged /bin/sh -c 'sleep 5; if [ $(grep test /proc/self/mountinfo | wc -l) -eq 1 ] ; then echo SUCCESS ; else echo FAILURE ; fi' > sleep 1 > umount -l /mnt/b > wait %% > > $ unshare -Urm ./locked_mounts_test.sh > > This failure is corrected by removing the prepass that marks mounts > that may be umounted. > > A first pass is added that umounts mounts if possible and if not sets > mount mark if they could be unmounted if they weren't locked and adds > them to a list to umount possibilities. This first pass reconsiders > the mounts parent if it is on the list of umount possibilities, ensuring > that information of umoutability will pass from child to mount parent. > > A second pass then walks through all mounts that are umounted and processes > their children unmounting them or marking them for reparenting. > > A last pass cleans up the state on the mounts that could not be umounted > and if applicable reparents them to their first parent that remained > mounted. > > While a bit longer than the old code this code is much more robust > as it allows information to flow up from the leaves and down > from the trunk making the order in which mounts are encountered > in the umount propgation tree irrelevant. Eric, I think we can accomplish what you want in a much simpler way. Would the patch below; UNTESTED BUT COMPILED, resolve your issue? Its a two pass unmount. First pass marks mounts that can be unmounted, and second pass does the neccessary unlinks. It does mark TUCKED mounts, and uses that information to peel off the correct mounts. Key points are a) a tucked mount never entertain any unmount propagation on its root dentry. b) when the child on the root dentry of a tucked mount is unmounted, the tucked mount is not a tucked mount anymore. c) if the child is a tucked mount, than its child is reparented to the parent. Signed-off-by: "Ram Pai" fs/namespace.c | 4 ++- fs/pnode.c | 53 +++++++++++++++++++++++++++++++++++++------------- fs/pnode.h | 3 ++ include/linux/mount.h | 1 diff --git a/fs/namespace.c b/fs/namespace.c index cc1375ef..ff3ec90 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2050,8 +2050,10 @@ static int attach_recursive_mnt(struct mount *source_mnt, hlist_del_init(&child->mnt_hash); q = __lookup_mnt(&child->mnt_parent->mnt, child->mnt_mountpoint); - if (q) + if (q) { mnt_change_mountpoint(child, smp, q); + SET_MNT_TUCKED(child); + } commit_tree(child); } put_mountpoint(smp); diff --git a/fs/pnode.c b/fs/pnode.c index 5bc7896..b44a544 100644 --- a/fs/pnode.c +++ b/fs/pnode.c @@ -448,31 +448,58 @@ static void __propagate_umount(struct mount *mnt) for (m = propagation_next(parent, parent); m; m = propagation_next(m, parent)) { - struct mount *topper; - struct mount *child = __lookup_mnt(&m->mnt, - mnt->mnt_mountpoint); - /* - * umount the child only if the child has no children - * and the child is marked safe to unmount. + struct mount *topper, *child; + + /* Tucked mount must drop umount propagation events on + * its **root dentry**. + * The tucked mount did not exist when that child came + * into existence. It never received that mount propagation. + * Hence it should never entertain the umount propagation + * aswell. */ + if (IS_MNT_TUCKED(m) && list_is_singular(&mnt->mnt_mounts)) + continue; + + + child = __lookup_mnt(&m->mnt, mnt->mnt_mountpoint); + if (!child || !IS_MNT_MARKED(child)) continue; + CLEAR_MNT_MARK(child); - /* If there is exactly one mount covering all of child - * replace child with that mount. - */ - topper = find_topper(child); - if (topper) - mnt_change_mountpoint(child->mnt_parent, child->mnt_mp, - topper); + if (IS_MNT_TUCKED(child) && + (list_is_singular(&child->mnt_mounts))) { + topper = find_topper(child); + if (topper) { + mnt_change_mountpoint(child->mnt_parent, + child->mnt_mp, topper); + CLEAR_MNT_TUCKED(child); /*lets be precise*/ + } + } if (list_empty(&child->mnt_mounts)) { list_del_init(&child->mnt_child); child->mnt.mnt_flags |= MNT_UMOUNT; list_move_tail(&child->mnt_list, &mnt->mnt_list); } +#if 0 + else { + mntput(child); /* mark it for deletion. It will + be deleted whenever it looses + all its remaining references. + TODO: some more thought + needed, please validate */ + } +#endif } + + /* + * This explicit umount operation is exposing the parent. + * In case the parent was a 'tucked' mount, it cannot be so + * anymore. + */ + CLEAR_MNT_TUCKED(parent); } /* diff --git a/fs/pnode.h b/fs/pnode.h index dc87e65..9ebd1a8 100644 --- a/fs/pnode.h +++ b/fs/pnode.h @@ -18,8 +18,11 @@ #define IS_MNT_UNBINDABLE(m) ((m)->mnt.mnt_flags & MNT_UNBINDABLE) #define IS_MNT_MARKED(m) ((m)->mnt.mnt_flags & MNT_MARKED) #define SET_MNT_MARK(m) ((m)->mnt.mnt_flags |= MNT_MARKED) +#define SET_MNT_TUCKED(m) ((m)->mnt.mnt_flags |= MNT_TUCKED) #define CLEAR_MNT_MARK(m) ((m)->mnt.mnt_flags &= ~MNT_MARKED) +#define CLEAR_MNT_TUCKED(m) ((m)->mnt.mnt_flags &= ~MNT_TUCKED) #define IS_MNT_LOCKED(m) ((m)->mnt.mnt_flags & MNT_LOCKED) +#define IS_MNT_TUCKED(m) ((m)->mnt.mnt_flags & MNT_TUCKED) #define CL_EXPIRE 0x01 #define CL_SLAVE 0x02 diff --git a/include/linux/mount.h b/include/linux/mount.h index 8e0352a..41674e7 100644 --- a/include/linux/mount.h +++ b/include/linux/mount.h @@ -62,6 +62,7 @@ #define MNT_SYNC_UMOUNT 0x2000000 #define MNT_MARKED 0x4000000 #define MNT_UMOUNT 0x8000000 +#define MNT_TUCKED 0x10000000 struct vfsmount { struct dentry *mnt_root; /* root of the mounted tree */