From patchwork Mon Aug 29 03:36:02 2016
X-Patchwork-Submitter: Andrey Vagin
X-Patchwork-Id: 9303041
From: Andrei Vagin
To: Alexander Viro
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, Andrei Vagin
Subject: [PATCH] mount: dont execute propagate_umount() many times for same mounts
Date: Sun, 28 Aug 2016 20:36:02 -0700
Message-Id: <1472441762-28068-1-git-send-email-avagin@openvz.org>
X-Mailer: git-send-email 2.5.5
X-Mailing-List: linux-fsdevel@vger.kernel.org

In the worst case, the current complexity of umount_tree() is O(n^3):
* Enumerate all mounts in the target tree (propagate_umount)
* Enumerate mounts to find where these changes have to be propagated
  (mark_umount_candidates)
* Enumerate mounts to find the required mount by parent and dentry
  (__lookup_mnt_last)

The worst case is when all mounts in the tree live in the same shared
group. In that case we have to enumerate all mounts at each step.

The second step can be optimized: we don't need to repeat it for mounts
that we have already met while doing this step for previous mounts.
This reduces the complexity of umount_tree() to O(n^2).

Here is a script to generate such a mount tree:

$ cat run.sh
mount -t tmpfs xxx /mnt
mount --make-shared /mnt
for i in `seq $1`; do
	mount --bind /mnt `mktemp -d /mnt/test.XXXXXX`
done
time umount -l /mnt
$ for i in `seq 10 16`; do echo $i; unshare -Urm bash ./run.sh $i; done

Here are performance measurements with and without this patch:

mounts | after (sec) | before (sec)
-----------------------------------
  1024 |       0.024 |        0.084
  2048 |       0.041 |        0.39
  4096 |       0.059 |        3.198
  8192 |       0.227 |       50.794
 16384 |       1.015 |      810

This patch is a first step to fix CVE-2016-6213. The next step will be
to add ucount (user namespace limit) for mounts.

Signed-off-by: Andrei Vagin
---
 fs/mount.h     |  2 ++
 fs/namespace.c | 19 ++++++++++++++++---
 fs/pnode.c     | 23 +++++++++++++++++++++--
 3 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/fs/mount.h b/fs/mount.h
index 14db05d..b5631bd 100644
--- a/fs/mount.h
+++ b/fs/mount.h
@@ -87,6 +87,8 @@ static inline int is_mounted(struct vfsmount *mnt)
 
 extern struct mount *__lookup_mnt(struct vfsmount *, struct dentry *);
 extern struct mount *__lookup_mnt_last(struct vfsmount *, struct dentry *);
+extern struct mount *__lookup_mnt_cont(struct mount *,
+				struct vfsmount *, struct dentry *);
 
 extern int __legitimize_mnt(struct vfsmount *, unsigned);
 extern bool legitimize_mnt(struct vfsmount *, unsigned);
diff --git a/fs/namespace.c b/fs/namespace.c
index 7bb2cda..924cea7 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -649,9 +649,7 @@ struct mount *__lookup_mnt_last(struct vfsmount *mnt, struct dentry *dentry)
 		goto out;
 	if (!(p->mnt.mnt_flags & MNT_UMOUNT))
 		res = p;
-	hlist_for_each_entry_continue(p, mnt_hash) {
-		if (&p->mnt_parent->mnt != mnt || p->mnt_mountpoint != dentry)
-			break;
+	for (; p != NULL; p = __lookup_mnt_cont(p, mnt, dentry)) {
 		if (!(p->mnt.mnt_flags & MNT_UMOUNT))
 			res = p;
 	}
@@ -659,6 +657,21 @@ out:
 	return res;
 }
 
+struct mount *__lookup_mnt_cont(struct mount *p,
+				struct vfsmount *mnt, struct dentry *dentry)
+{
+	struct hlist_node *node = p->mnt_hash.next;
+
+	if (!node)
+		return NULL;
+
+	p = hlist_entry(node, struct mount, mnt_hash);
+	if (&p->mnt_parent->mnt != mnt || p->mnt_mountpoint != dentry)
+		return NULL;
+
+	return p;
+}
+
 /*
  * lookup_mnt - Return the first child mount mounted at path
  *
diff --git a/fs/pnode.c b/fs/pnode.c
index 9989970..2242aad 100644
--- a/fs/pnode.c
+++ b/fs/pnode.c
@@ -399,10 +399,24 @@ static void mark_umount_candidates(struct mount *mnt)
 
 	BUG_ON(parent == mnt);
 
+	if (IS_MNT_MARKED(mnt))
+		return;
+
 	for (m = propagation_next(parent, parent); m;
 			m = propagation_next(m, parent)) {
-		struct mount *child = __lookup_mnt_last(&m->mnt,
+		struct mount *child = __lookup_mnt(&m->mnt,
 						mnt->mnt_mountpoint);
+
+		while (child && child->mnt.mnt_flags & MNT_UMOUNT) {
+			/*
+			 * Mark umounted mounts to not call
+			 * __propagate_umount for them again.
+			 */
+			SET_MNT_MARK(child);
+			child = __lookup_mnt_cont(child, &m->mnt,
+							mnt->mnt_mountpoint);
+		}
+
 		if (child && (!IS_MNT_LOCKED(child) || IS_MNT_MARKED(m))) {
 			SET_MNT_MARK(child);
 		}
@@ -420,6 +434,9 @@ static void __propagate_umount(struct mount *mnt)
 
 	BUG_ON(parent == mnt);
 
+	if (IS_MNT_MARKED(mnt))
+		return;
+
 	for (m = propagation_next(parent, parent); m;
 			m = propagation_next(m, parent)) {
 
@@ -431,6 +448,8 @@ static void __propagate_umount(struct mount *mnt)
 		 */
 		if (!child || !IS_MNT_MARKED(child))
 			continue;
+		if (child->mnt.mnt_flags & MNT_UMOUNT)
+			continue;
 		CLEAR_MNT_MARK(child);
 		if (list_empty(&child->mnt_mounts)) {
 			list_del_init(&child->mnt_child);
@@ -454,7 +473,7 @@ int propagate_umount(struct list_head *list)
 	list_for_each_entry_reverse(mnt, list, mnt_list)
 		mark_umount_candidates(mnt);
 
-	list_for_each_entry(mnt, list, mnt_list)
+	list_for_each_entry_reverse(mnt, list, mnt_list)
 		__propagate_umount(mnt);
 	return 0;
 }