From patchwork Fri Feb 3 18:26:20 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 9554943 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 1B81E602B7 for ; Fri, 3 Feb 2017 18:30:55 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 06A74262FF for ; Fri, 3 Feb 2017 18:30:55 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EDB0B27FAD; Fri, 3 Feb 2017 18:30:54 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 189C6262FF for ; Fri, 3 Feb 2017 18:30:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751348AbdBCSaw (ORCPT ); Fri, 3 Feb 2017 13:30:52 -0500 Received: from out01.mta.xmission.com ([166.70.13.231]:53325 "EHLO out01.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751213AbdBCSav (ORCPT ); Fri, 3 Feb 2017 13:30:51 -0500 Received: from in01.mta.xmission.com ([166.70.13.51]) by out01.mta.xmission.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.87) (envelope-from ) id 1cZidG-0004Bd-QJ; Fri, 03 Feb 2017 11:30:50 -0700 Received: from [101.100.131.98] (helo=x220.xmission.com) by in01.mta.xmission.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.87) (envelope-from ) id 1cZidA-0006HK-Cn; Fri, 03 Feb 2017 11:30:50 -0700 From: ebiederm@xmission.com (Eric W. Biederman) To: Ram Pai Cc: Al Viro , linux-fsdevel@vger.kernel.org, Andrei Vagin References: <87inplinxd.fsf@xmission.com> <87inplh8or.fsf_-_@xmission.com> <87d1fth8mh.fsf_-_@xmission.com> <20170112054548.GT1555@ZenIV.linux.org.uk> <87tw8u42fq.fsf_-_@xmission.com> <20170121035827.GA5657@ram.oc3035372033.ibm.com> <87efzxt5em.fsf@xmission.com> <20170123190201.GC5657@ram.oc3035372033.ibm.com> <87d1fd1fem.fsf@xmission.com> <87k2977deq.fsf@xmission.com> <20170203171019.GC5705@ram.oc3035372033.ibm.com> Date: Sat, 04 Feb 2017 07:26:20 +1300 In-Reply-To: <20170203171019.GC5705@ram.oc3035372033.ibm.com> (Ram Pai's message of "Fri, 3 Feb 2017 09:10:19 -0800") Message-ID: <87lgtn167n.fsf@xmission.com> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/25.1 (gnu/linux) MIME-Version: 1.0 X-XM-SPF: eid=1cZidA-0006HK-Cn; ; ; mid=<87lgtn167n.fsf@xmission.com>; ; ; hst=in01.mta.xmission.com; ; ; ip=101.100.131.98; ; ; frm=ebiederm@xmission.com; ; ; spf=neutral X-XM-AID: U2FsdGVkX1+Z24cZo0jjk1Ts8JWVjqbBU4hBRrT00dY= X-SA-Exim-Connect-IP: 101.100.131.98 X-SA-Exim-Mail-From: ebiederm@xmission.com Subject: Re: [PATCH v5] mnt: Tuck mounts under others instead of creating shadow/side mounts. X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600) X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Ram Pai writes: > On Fri, Feb 03, 2017 at 11:54:21PM +1300, Eric W. Biederman wrote: >> ebiederm@xmission.com (Eric W. Biederman) writes: >> >> > Ram Pai writes: >> > >> >> On Sat, Jan 21, 2017 at 05:15:29PM +1300, Eric W. Biederman wrote: >> >>> Ram Pai writes: >> >>> >> >>> >> @@ -359,12 +373,24 @@ int propagate_mount_busy(struct mount *mnt, int refcnt) >> >>> >> >> >>> >> for (m = propagation_next(parent, parent); m; >> >>> >> m = propagation_next(m, parent)) { >> >>> >> - child = __lookup_mnt_last(&m->mnt, mnt->mnt_mountpoint); >> >>> >> - if (child && list_empty(&child->mnt_mounts) && >> >>> >> - (ret = do_refcount_check(child, 1))) >> >>> >> - break; >> >>> >> + int count = 1; >> >>> >> + child = __lookup_mnt(&m->mnt, mnt->mnt_mountpoint); >> >>> >> + if (!child) >> >>> >> + continue; >> >>> >> + >> >>> >> + /* Is there exactly one mount on the child that covers >> >>> >> + * it completely whose reference should be ignored? >> >>> >> + */ >> >>> >> + topper = find_topper(child); >> >>> > >> >>> > This is tricky. I understand it is trying to identify the case where a >> >>> > mount got tucked-in because of propagation. But this will not >> >>> > distinguish the case where a mount got over-mounted genuinely, not because of >> >>> > propagation, but because of explicit user action. >> >>> > >> >>> > >> >>> > example: >> >>> > >> >>> > case 1: (explicit user action) >> >>> > B is a slave of A >> >>> > mount something on A/a , it will propagate to B/a >> >>> > and than mount something on B/a >> >>> > >> >>> > case 2: (tucked mount) >> >>> > B is a slave of A >> >>> > mount something on B/a >> >>> > and than mount something on A/a >> >>> > >> >>> > Both case 1 and case 2 lead to the same mount configuration. >> >>> > >> >>> > >> >>> > however 'umount A/a' in case 1 should fail. >> >>> > and 'umount A/a' in case 2 should pass. >> >>> > >> >>> > Right? in other words, umounts of 'tucked mounts' should pass(case 2). >> >>> > whereas umounts of mounts on which overmounts exist should >> >>> > fail.(case 1) >> >>> >> >>> Looking at your example. I agree that case 1 will fail today. >> >> >> >> And should continue to fail. right? Your semantics change will pass it. >> > >> > I don't see why it should continue to fail. >> > >> >>> However my actual expectation would be for both mount configurations >> >>> to behave the same. In both cases something has been explicitly mounted >> >>> on B/a and something has propagated to B/a. In both cases the mount >> >>> on top is what was explicitly mounted, and the mount below is what was >> >>> propagated to B/a. >> >>> >> >>> I don't see why the order of operations should matter. >> >> >> >> One of the subtle expectation is reversibility. >> >> >> >> Mount followed immediately by unmount has always passed and that is the >> >> standard expectation always. Your proposed code will ensure that. >> >> >> >> However there is one other subtle expectaton. >> >> >> >> A mount cannot disappear if a user has explicitly mounted on top of it. >> >> >> >> your proposed code will not meet that expectation. >> >> >> >> In other words, these two expectations make it behave differently even >> >> when; arguably, they feel like the same configuration. >> > >> > I am not seeing that. >> > >> > >> > >> >>> >> >>> > maybe we need a flag to identify tucked mounts? >> >>> >> >>> To preserve our exact current semantics yes. >> >>> >> >>> The mount configurations that are delibearately constructed that I am >> >>> aware of are comparatively simple. I don't think anyone has even taken >> >>> advantage of the shadow/side mounts at this point. I made a reasonable >> >>> effort to find out and no one was even aware they existed. Much less >> >>> what they were. And certainly no one I talked to could find code that >> >>> used them. >> >> >> >> But someday; even if its after a decade, someone ;) will >> >> stumble into this semantics and wonder 'why?'. Its better to get it right >> >> sooner. Sorry, I am blaming myself; for keeping some of the problems >> >> open thinking no one will bump into them. >> > >> > Oh definitely. If we have people ready to talk it through I am happy to >> > dot as many i's and cross as many t's as we productively can. >> > >> > I was just pointing out that I don't have any reason to expect that any >> > one depends on the subtle details of the implementation today so we >> > still have some wiggle room to fix them. Even if they are visible to >> > user space. >> >> So I haven't seen a reply, and we are getting awfully close to the merge >> window. Is there anything concrete we can do to ease concerns? >> >> Right now I am thinking my last version of the patch is the likely the >> best we have time and energy to manage and it would be good to merge it >> before the code bit rots. > > I was waiting for some other opinions on the behavior, since I > continue to think that 'one should not be able to unmount mounts on > which a user has explicitly mounted upon'. I am happy to be overruled, > since your patch significantly improves the rest of the semantics. > > Viro? Ram Pai, just to be clear you were hoping to add the logic below to my patch? My objections to the snippet below are: - It makes it hard for the CRIU folks (yet more state they have to find and restore). - It feels subjectively worse to me. - We already have cases where mounts are unmounted transparently (umount on rmdir). - Al Viro claims that the side/shadow mounts are ordinary mounts and maintaining this extra logic that remembers if we tucked one mount under another seems to make this them less ordinary. - The symmetry for unmounting exists for a tucked mount. We can unmount it via propagation or we can unmount the mount above it, and then we can unmount the new underlying mount. So I don't see why we don't want symmetry in the other case just because we mounted on top of the mount and rather than had the mount tucked under us. Eric diff --git a/fs/namespace.c b/fs/namespace.c index 8bfad42c1ccf..8b00e0548438 100644 --- a/fs/namespace.c +++ b/fs/namespace.c @@ -2047,8 +2047,10 @@ static int attach_recursive_mnt(struct mount *source_mnt, hlist_del_init(&child->mnt_hash); q = __lookup_mnt(&child->mnt_parent->mnt, child->mnt_mountpoint); - if (q) + if (q) { mnt_change_mountpoint(child, smp, q); + child->mnt.mnt_flags |= MNT_TUCKED; + } commit_tree(child); } put_mountpoint(smp); diff --git a/fs/pnode.c b/fs/pnode.c index 5bc7896d122a..e2a6ac68feb9 100644 --- a/fs/pnode.c +++ b/fs/pnode.c @@ -327,6 +327,9 @@ static struct mount *find_topper(struct mount *mnt) /* If there is exactly one mount covering mnt completely return it. */ struct mount *child; + if (!(mnt->mnt.mnt_flags & MNT_TUCKED)) + return NULL; + if (!list_is_singular(&mnt->mnt_mounts)) return NULL; diff --git a/include/linux/mount.h b/include/linux/mount.h index 8e0352af06b7..25ca398b19b3 100644 --- a/include/linux/mount.h +++ b/include/linux/mount.h @@ -52,6 +52,7 @@ struct mnt_namespace; #define MNT_INTERNAL 0x4000 +#define MNT_TUCKED 0x020000 #define MNT_LOCK_ATIME 0x040000 #define MNT_LOCK_NOEXEC 0x080000 #define MNT_LOCK_NOSUID 0x100000