From patchwork Thu Feb 20 20:47:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11395003 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 8A53F92A for ; Thu, 20 Feb 2020 20:49:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 6A3F124654 for ; Thu, 20 Feb 2020 20:49:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728993AbgBTUtu (ORCPT ); Thu, 20 Feb 2020 15:49:50 -0500 Received: from out03.mta.xmission.com ([166.70.13.233]:33366 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728618AbgBTUtu (ORCPT ); Thu, 20 Feb 2020 15:49:50 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out03.mta.xmission.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1j4slY-0006CP-Pt; Thu, 20 Feb 2020 13:49:48 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.xmission.com) by in02.mta.xmission.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.87) (envelope-from ) id 1j4slX-0008I9-Sf; Thu, 20 Feb 2020 13:49:48 -0700 From: ebiederm@xmission.com (Eric W. Biederman) To: Linus Torvalds Cc: Al Viro , LKML , Kernel Hardening , Linux API , Linux FS Devel , Linux Security Module , Akinobu Mita , Alexey Dobriyan , Andrew Morton , Andy Lutomirski , Daniel Micay , Djalal Harouni , "Dmitry V . Levin" , Greg Kroah-Hartman , Ingo Molnar , "J . Bruce Fields" , Jeff Layton , Jonathan Corbet , Kees Cook , Oleg Nesterov , Solar Designer References: <20200210150519.538333-8-gladkov.alexey@gmail.com> <87v9odlxbr.fsf@x220.int.ebiederm.org> <20200212144921.sykucj4mekcziicz@comp-core-i7-2640m-0182e6> <87tv3vkg1a.fsf@x220.int.ebiederm.org> <87v9obipk9.fsf@x220.int.ebiederm.org> <20200212200335.GO23230@ZenIV.linux.org.uk> <20200212203833.GQ23230@ZenIV.linux.org.uk> <20200212204124.GR23230@ZenIV.linux.org.uk> <87lfp7h422.fsf@x220.int.ebiederm.org> <87pnejf6fz.fsf@x220.int.ebiederm.org> <871rqpaswu.fsf_-_@x220.int.ebiederm.org> Date: Thu, 20 Feb 2020 14:47:47 -0600 In-Reply-To: <871rqpaswu.fsf_-_@x220.int.ebiederm.org> (Eric W. Biederman's message of "Thu, 20 Feb 2020 14:46:25 -0600") Message-ID: <87tv3l9ea4.fsf_-_@x220.int.ebiederm.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 X-XM-SPF: eid=1j4slX-0008I9-Sf;;;mid=<87tv3l9ea4.fsf_-_@x220.int.ebiederm.org>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX18FKktgO7yeajhH6EwOZZcjcqkyfTCGXi4= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on sa06.xmission.com X-Spam-Level: *** X-Spam-Status: No, score=3.2 required=8.0 tests=ALL_TRUSTED,BAYES_50, DCC_CHECK_NEGATIVE,LotsOfNums_01,T_TooManySym_01,XMNoVowels,XMSubLong autolearn=disabled version=3.4.2 X-Spam-Report: * -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP * 0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60% * [score: 0.4664] * 0.7 XMSubLong Long Subject * 1.5 XMNoVowels Alpha-numberic number with no vowels * 1.2 LotsOfNums_01 BODY: Lots of long strings of numbers * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC * [sa06 1397; Body=1 Fuz1=1 Fuz2=1] * 0.0 T_TooManySym_01 4+ unique symbols in subject X-Spam-DCC: XMission; sa06 1397; Body=1 Fuz1=1 Fuz2=1 X-Spam-Combo: ***;Linus Torvalds X-Spam-Relay-Country: X-Spam-Timing: total 494 ms - load_scoreonly_sql: 0.06 (0.0%), signal_user_changed: 2.9 (0.6%), b_tie_ro: 2.1 (0.4%), parse: 1.01 (0.2%), extract_message_metadata: 11 (2.3%), get_uri_detail_list: 1.65 (0.3%), tests_pri_-1000: 14 (2.9%), tests_pri_-950: 1.29 (0.3%), tests_pri_-900: 1.10 (0.2%), tests_pri_-90: 27 (5.4%), check_bayes: 25 (5.1%), b_tokenize: 10 (2.0%), b_tok_get_all: 7 (1.5%), b_comp_prob: 2.2 (0.5%), b_tok_touch_all: 3.6 (0.7%), b_finish: 0.59 (0.1%), tests_pri_0: 424 (85.8%), check_dkim_signature: 0.54 (0.1%), check_dkim_adsp: 2.6 (0.5%), poll_dns_idle: 0.89 (0.2%), tests_pri_10: 2.2 (0.4%), tests_pri_500: 6 (1.3%), rewrite_mail: 0.00 (0.0%) Subject: [PATCH 1/7] proc: Rename in proc_inode rename sysctl_inodes sibling_inodes X-Spam-Flag: No X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org I about to need and use the same functionality for pid based inodes and there is no point in adding a second field when this field is already here and serving the same purporse. Just give the field a generic name so it is clear that it is no longer sysctl specific. Also for good measure initialize sibling_inodes when proc_inode is initialized. Signed-off-by: Eric W. Biederman --- fs/proc/inode.c | 1 + fs/proc/internal.h | 2 +- fs/proc/proc_sysctl.c | 8 ++++---- 3 files changed, 6 insertions(+), 5 deletions(-) diff --git a/fs/proc/inode.c b/fs/proc/inode.c index 6da18316d209..bdae442d5262 100644 --- a/fs/proc/inode.c +++ b/fs/proc/inode.c @@ -68,6 +68,7 @@ static struct inode *proc_alloc_inode(struct super_block *sb) ei->pde = NULL; ei->sysctl = NULL; ei->sysctl_entry = NULL; + INIT_HLIST_NODE(&ei->sibling_inodes); ei->ns_ops = NULL; return &ei->vfs_inode; } diff --git a/fs/proc/internal.h b/fs/proc/internal.h index 41587276798e..366cd3aa690b 100644 --- a/fs/proc/internal.h +++ b/fs/proc/internal.h @@ -91,7 +91,7 @@ struct proc_inode { struct proc_dir_entry *pde; struct ctl_table_header *sysctl; struct ctl_table *sysctl_entry; - struct hlist_node sysctl_inodes; + struct hlist_node sibling_inodes; const struct proc_ns_operations *ns_ops; struct inode vfs_inode; } __randomize_layout; diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c index c75bb4632ed1..42fbb7f3c587 100644 --- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -279,9 +279,9 @@ static void proc_sys_prune_dcache(struct ctl_table_header *head) node = hlist_first_rcu(&head->inodes); if (!node) break; - ei = hlist_entry(node, struct proc_inode, sysctl_inodes); + ei = hlist_entry(node, struct proc_inode, sibling_inodes); spin_lock(&sysctl_lock); - hlist_del_init_rcu(&ei->sysctl_inodes); + hlist_del_init_rcu(&ei->sibling_inodes); spin_unlock(&sysctl_lock); inode = &ei->vfs_inode; @@ -483,7 +483,7 @@ static struct inode *proc_sys_make_inode(struct super_block *sb, } ei->sysctl = head; ei->sysctl_entry = table; - hlist_add_head_rcu(&ei->sysctl_inodes, &head->inodes); + hlist_add_head_rcu(&ei->sibling_inodes, &head->inodes); head->count++; spin_unlock(&sysctl_lock); @@ -514,7 +514,7 @@ static struct inode *proc_sys_make_inode(struct super_block *sb, void proc_sys_evict_inode(struct inode *inode, struct ctl_table_header *head) { spin_lock(&sysctl_lock); - hlist_del_init_rcu(&PROC_I(inode)->sysctl_inodes); + hlist_del_init_rcu(&PROC_I(inode)->sibling_inodes); if (!--head->count) kfree_rcu(head, rcu); spin_unlock(&sysctl_lock); From patchwork Thu Feb 20 20:48:28 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11395009 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id F31E2109A for ; Thu, 20 Feb 2020 20:50:34 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D3E19222C4 for ; Thu, 20 Feb 2020 20:50:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729140AbgBTUub (ORCPT ); Thu, 20 Feb 2020 15:50:31 -0500 Received: from out03.mta.xmission.com ([166.70.13.233]:33652 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728618AbgBTUua (ORCPT ); Thu, 20 Feb 2020 15:50:30 -0500 Received: from in02.mta.xmission.com ([166.70.13.52]) by out03.mta.xmission.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1j4smD-0006HC-9P; Thu, 20 Feb 2020 13:50:29 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.xmission.com) by in02.mta.xmission.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.87) (envelope-from ) id 1j4smC-0008QU-FZ; Thu, 20 Feb 2020 13:50:29 -0700 From: ebiederm@xmission.com (Eric W. Biederman) To: Linus Torvalds Cc: Al Viro , LKML , Kernel Hardening , Linux API , Linux FS Devel , Linux Security Module , Akinobu Mita , Alexey Dobriyan , Andrew Morton , Andy Lutomirski , Daniel Micay , Djalal Harouni , "Dmitry V . Levin" , Greg Kroah-Hartman , Ingo Molnar , "J . Bruce Fields" , Jeff Layton , Jonathan Corbet , Kees Cook , Oleg Nesterov , Solar Designer References: <20200210150519.538333-8-gladkov.alexey@gmail.com> <87v9odlxbr.fsf@x220.int.ebiederm.org> <20200212144921.sykucj4mekcziicz@comp-core-i7-2640m-0182e6> <87tv3vkg1a.fsf@x220.int.ebiederm.org> <87v9obipk9.fsf@x220.int.ebiederm.org> <20200212200335.GO23230@ZenIV.linux.org.uk> <20200212203833.GQ23230@ZenIV.linux.org.uk> <20200212204124.GR23230@ZenIV.linux.org.uk> <87lfp7h422.fsf@x220.int.ebiederm.org> <87pnejf6fz.fsf@x220.int.ebiederm.org> <871rqpaswu.fsf_-_@x220.int.ebiederm.org> Date: Thu, 20 Feb 2020 14:48:28 -0600 In-Reply-To: <871rqpaswu.fsf_-_@x220.int.ebiederm.org> (Eric W. Biederman's message of "Thu, 20 Feb 2020 14:46:25 -0600") Message-ID: <87mu9d9e8z.fsf_-_@x220.int.ebiederm.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 X-XM-SPF: eid=1j4smC-0008QU-FZ;;;mid=<87mu9d9e8z.fsf_-_@x220.int.ebiederm.org>;;;hst=in02.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX19U9wTJM2ZK4ftSyO914HE9nIIpH9k5Pfg= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on sa05.xmission.com X-Spam-Level: ** X-Spam-Status: No, score=2.5 required=8.0 tests=ALL_TRUSTED,BAYES_50, DCC_CHECK_NEGATIVE,T_TooManySym_01,XMGappySubj_01,XMNoVowels,XMSubLong autolearn=disabled version=3.4.2 X-Spam-Report: * -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP * 0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60% * [score: 0.4998] * 0.7 XMSubLong Long Subject * 0.5 XMGappySubj_01 Very gappy subject * 1.5 XMNoVowels Alpha-numberic number with no vowels * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC * [sa05 1397; Body=1 Fuz1=1] * 0.0 T_TooManySym_01 4+ unique symbols in subject X-Spam-DCC: XMission; sa05 1397; Body=1 Fuz1=1 X-Spam-Combo: **;Linus Torvalds X-Spam-Relay-Country: X-Spam-Timing: total 394 ms - load_scoreonly_sql: 0.06 (0.0%), signal_user_changed: 3.1 (0.8%), b_tie_ro: 2.2 (0.6%), parse: 1.45 (0.4%), extract_message_metadata: 13 (3.2%), get_uri_detail_list: 1.86 (0.5%), tests_pri_-1000: 16 (4.0%), tests_pri_-950: 1.30 (0.3%), tests_pri_-900: 1.08 (0.3%), tests_pri_-90: 33 (8.3%), check_bayes: 31 (7.9%), b_tokenize: 11 (2.9%), b_tok_get_all: 8 (2.0%), b_comp_prob: 4.2 (1.1%), b_tok_touch_all: 5 (1.3%), b_finish: 0.66 (0.2%), tests_pri_0: 313 (79.4%), check_dkim_signature: 0.95 (0.2%), check_dkim_adsp: 2.5 (0.6%), poll_dns_idle: 0.47 (0.1%), tests_pri_10: 2.2 (0.5%), tests_pri_500: 7 (1.8%), rewrite_mail: 0.00 (0.0%) Subject: [PATCH 2/7] proc: Generalize proc_sys_prune_dcache into proc_prune_siblings_dcache X-Spam-Flag: No X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600) X-SA-Exim-Scanned: Yes (on in02.mta.xmission.com) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This prepares the way for allowing the pid part of proc to use this dcache pruning code as well. Signed-off-by: Eric W. Biederman --- fs/proc/inode.c | 38 ++++++++++++++++++++++++++++++++++++++ fs/proc/internal.h | 1 + fs/proc/proc_sysctl.c | 35 +---------------------------------- 3 files changed, 40 insertions(+), 34 deletions(-) diff --git a/fs/proc/inode.c b/fs/proc/inode.c index bdae442d5262..74ce4a8d05eb 100644 --- a/fs/proc/inode.c +++ b/fs/proc/inode.c @@ -103,6 +103,44 @@ void __init proc_init_kmemcache(void) BUILD_BUG_ON(sizeof(struct proc_dir_entry) >= SIZEOF_PDE); } +void proc_prune_siblings_dcache(struct hlist_head *inodes, spinlock_t *lock) +{ + struct inode *inode; + struct proc_inode *ei; + struct hlist_node *node; + struct super_block *sb; + + rcu_read_lock(); + for (;;) { + node = hlist_first_rcu(inodes); + if (!node) + break; + ei = hlist_entry(node, struct proc_inode, sibling_inodes); + spin_lock(lock); + hlist_del_init_rcu(&ei->sibling_inodes); + spin_unlock(lock); + + inode = &ei->vfs_inode; + sb = inode->i_sb; + if (!atomic_inc_not_zero(&sb->s_active)) + continue; + inode = igrab(inode); + rcu_read_unlock(); + if (unlikely(!inode)) { + deactivate_super(sb); + rcu_read_lock(); + continue; + } + + d_prune_aliases(inode); + iput(inode); + deactivate_super(sb); + + rcu_read_lock(); + } + rcu_read_unlock(); +} + static int proc_show_options(struct seq_file *seq, struct dentry *root) { struct super_block *sb = root->d_sb; diff --git a/fs/proc/internal.h b/fs/proc/internal.h index 366cd3aa690b..ba9a991824a5 100644 --- a/fs/proc/internal.h +++ b/fs/proc/internal.h @@ -210,6 +210,7 @@ extern const struct inode_operations proc_pid_link_inode_operations; extern const struct super_operations proc_sops; void proc_init_kmemcache(void); +void proc_prune_siblings_dcache(struct hlist_head *inodes, spinlock_t *lock); void set_proc_pid_nlink(void); extern struct inode *proc_get_inode(struct super_block *, struct proc_dir_entry *); extern void proc_entry_rundown(struct proc_dir_entry *); diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c index 42fbb7f3c587..5da9d7f7ae34 100644 --- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -269,40 +269,7 @@ static void unuse_table(struct ctl_table_header *p) static void proc_sys_prune_dcache(struct ctl_table_header *head) { - struct inode *inode; - struct proc_inode *ei; - struct hlist_node *node; - struct super_block *sb; - - rcu_read_lock(); - for (;;) { - node = hlist_first_rcu(&head->inodes); - if (!node) - break; - ei = hlist_entry(node, struct proc_inode, sibling_inodes); - spin_lock(&sysctl_lock); - hlist_del_init_rcu(&ei->sibling_inodes); - spin_unlock(&sysctl_lock); - - inode = &ei->vfs_inode; - sb = inode->i_sb; - if (!atomic_inc_not_zero(&sb->s_active)) - continue; - inode = igrab(inode); - rcu_read_unlock(); - if (unlikely(!inode)) { - deactivate_super(sb); - rcu_read_lock(); - continue; - } - - d_prune_aliases(inode); - iput(inode); - deactivate_super(sb); - - rcu_read_lock(); - } - rcu_read_unlock(); + proc_prune_siblings_dcache(&head->inodes, &sysctl_lock); } /* called under sysctl_lock, will reacquire if has to wait */ From patchwork Thu Feb 20 20:49:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11395015 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D181D92A for ; Thu, 20 Feb 2020 20:51:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id B221E208CD for ; Thu, 20 Feb 2020 20:51:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729055AbgBTUvY (ORCPT ); Thu, 20 Feb 2020 15:51:24 -0500 Received: from out02.mta.xmission.com ([166.70.13.232]:48810 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728618AbgBTUvX (ORCPT ); Thu, 20 Feb 2020 15:51:23 -0500 Received: from in01.mta.xmission.com ([166.70.13.51]) by out02.mta.xmission.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1j4sn4-0001ko-94; Thu, 20 Feb 2020 13:51:22 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.xmission.com) by in01.mta.xmission.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.87) (envelope-from ) id 1j4smw-0005SC-BW; Thu, 20 Feb 2020 13:51:21 -0700 From: ebiederm@xmission.com (Eric W. Biederman) To: Linus Torvalds Cc: Al Viro , LKML , Kernel Hardening , Linux API , Linux FS Devel , Linux Security Module , Akinobu Mita , Alexey Dobriyan , Andrew Morton , Andy Lutomirski , Daniel Micay , Djalal Harouni , "Dmitry V . Levin" , Greg Kroah-Hartman , Ingo Molnar , "J . Bruce Fields" , Jeff Layton , Jonathan Corbet , Kees Cook , Oleg Nesterov , Solar Designer References: <20200210150519.538333-8-gladkov.alexey@gmail.com> <87v9odlxbr.fsf@x220.int.ebiederm.org> <20200212144921.sykucj4mekcziicz@comp-core-i7-2640m-0182e6> <87tv3vkg1a.fsf@x220.int.ebiederm.org> <87v9obipk9.fsf@x220.int.ebiederm.org> <20200212200335.GO23230@ZenIV.linux.org.uk> <20200212203833.GQ23230@ZenIV.linux.org.uk> <20200212204124.GR23230@ZenIV.linux.org.uk> <87lfp7h422.fsf@x220.int.ebiederm.org> <87pnejf6fz.fsf@x220.int.ebiederm.org> <871rqpaswu.fsf_-_@x220.int.ebiederm.org> Date: Thu, 20 Feb 2020 14:49:09 -0600 In-Reply-To: <871rqpaswu.fsf_-_@x220.int.ebiederm.org> (Eric W. Biederman's message of "Thu, 20 Feb 2020 14:46:25 -0600") Message-ID: <87h7zl9e7u.fsf_-_@x220.int.ebiederm.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 X-XM-SPF: eid=1j4smw-0005SC-BW;;;mid=<87h7zl9e7u.fsf_-_@x220.int.ebiederm.org>;;;hst=in01.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX18yxsTIN6rEo+POZRFDqs4bpwfG8Bf38/Y= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on sa01.xmission.com X-Spam-Level: **** X-Spam-Status: No, score=4.0 required=8.0 tests=ALL_TRUSTED,BAYES_50, DCC_CHECK_NEGATIVE,NO_DNS_FOR_FROM,TR_Symld_Words,T_TooManySym_01, T_TooManySym_02,T_TooManySym_03,T_TooManySym_04,XMGappySubj_01, XMNoVowels,XMSubLong autolearn=disabled version=3.4.2 X-Spam-Virus: No X-Spam-Report: * -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP * 0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60% * [score: 0.5000] * 0.7 XMSubLong Long Subject * 0.5 XMGappySubj_01 Very gappy subject * 1.5 TR_Symld_Words too many words that have symbols inside * 1.5 XMNoVowels Alpha-numberic number with no vowels * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC * [sa01 1397; Body=1 Fuz1=1 Fuz2=1] * 0.0 NO_DNS_FOR_FROM DNS: Envelope sender has no MX or A DNS records * 0.0 T_TooManySym_02 5+ unique symbols in subject * 0.0 T_TooManySym_01 4+ unique symbols in subject * 0.0 T_TooManySym_03 6+ unique symbols in subject * 0.0 T_TooManySym_04 7+ unique symbols in subject X-Spam-DCC: XMission; sa01 1397; Body=1 Fuz1=1 Fuz2=1 X-Spam-Combo: ****;Linus Torvalds X-Spam-Relay-Country: X-Spam-Timing: total 7271 ms - load_scoreonly_sql: 0.03 (0.0%), signal_user_changed: 2.9 (0.0%), b_tie_ro: 2.1 (0.0%), parse: 1.16 (0.0%), extract_message_metadata: 12 (0.2%), get_uri_detail_list: 1.32 (0.0%), tests_pri_-1000: 3.4 (0.0%), tests_pri_-950: 0.98 (0.0%), tests_pri_-900: 0.84 (0.0%), tests_pri_-90: 21 (0.3%), check_bayes: 20 (0.3%), b_tokenize: 6 (0.1%), b_tok_get_all: 7 (0.1%), b_comp_prob: 1.55 (0.0%), b_tok_touch_all: 3.9 (0.1%), b_finish: 0.62 (0.0%), tests_pri_0: 6184 (85.0%), check_dkim_signature: 0.38 (0.0%), check_dkim_adsp: 6008 (82.6%), poll_dns_idle: 7038 (96.8%), tests_pri_10: 1.66 (0.0%), tests_pri_500: 1041 (14.3%), rewrite_mail: 0.00 (0.0%) Subject: [PATCH 3/7] proc: Mov rcu_read_(lock|unlock) in proc_prune_siblings_dcache X-Spam-Flag: No X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600) X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Don't make it look like rcu_read_lock is held over the entire loop instead just take the rcu_read_lock over the part of the loop that matters. This makes the intent of the code a little clearer. Signed-off-by: "Eric W. Biederman" --- fs/proc/inode.c | 14 +++++++------- 1 file changed, 7 insertions(+), 7 deletions(-) diff --git a/fs/proc/inode.c b/fs/proc/inode.c index 74ce4a8d05eb..38a7baa41aba 100644 --- a/fs/proc/inode.c +++ b/fs/proc/inode.c @@ -110,11 +110,13 @@ void proc_prune_siblings_dcache(struct hlist_head *inodes, spinlock_t *lock) struct hlist_node *node; struct super_block *sb; - rcu_read_lock(); for (;;) { + rcu_read_lock(); node = hlist_first_rcu(inodes); - if (!node) + if (!node) { + rcu_read_unlock(); break; + } ei = hlist_entry(node, struct proc_inode, sibling_inodes); spin_lock(lock); hlist_del_init_rcu(&ei->sibling_inodes); @@ -122,23 +124,21 @@ void proc_prune_siblings_dcache(struct hlist_head *inodes, spinlock_t *lock) inode = &ei->vfs_inode; sb = inode->i_sb; - if (!atomic_inc_not_zero(&sb->s_active)) + if (!atomic_inc_not_zero(&sb->s_active)) { + rcu_read_unlock(); continue; + } inode = igrab(inode); rcu_read_unlock(); if (unlikely(!inode)) { deactivate_super(sb); - rcu_read_lock(); continue; } d_prune_aliases(inode); iput(inode); deactivate_super(sb); - - rcu_read_lock(); } - rcu_read_unlock(); } static int proc_show_options(struct seq_file *seq, struct dentry *root) From patchwork Thu Feb 20 20:49:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11395019 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DF2A0109A for ; Thu, 20 Feb 2020 20:51:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id BFF34208E4 for ; Thu, 20 Feb 2020 20:51:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729100AbgBTUvy (ORCPT ); Thu, 20 Feb 2020 15:51:54 -0500 Received: from out03.mta.xmission.com ([166.70.13.233]:34144 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728936AbgBTUvy (ORCPT ); Thu, 20 Feb 2020 15:51:54 -0500 Received: from in01.mta.xmission.com ([166.70.13.51]) by out03.mta.xmission.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1j4snZ-0006Pe-Ka; Thu, 20 Feb 2020 13:51:53 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.xmission.com) by in01.mta.xmission.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.87) (envelope-from ) id 1j4snY-0005Ze-Lt; Thu, 20 Feb 2020 13:51:53 -0700 From: ebiederm@xmission.com (Eric W. Biederman) To: Linus Torvalds Cc: Al Viro , LKML , Kernel Hardening , Linux API , Linux FS Devel , Linux Security Module , Akinobu Mita , Alexey Dobriyan , Andrew Morton , Andy Lutomirski , Daniel Micay , Djalal Harouni , "Dmitry V . Levin" , Greg Kroah-Hartman , Ingo Molnar , "J . Bruce Fields" , Jeff Layton , Jonathan Corbet , Kees Cook , Oleg Nesterov , Solar Designer References: <20200210150519.538333-8-gladkov.alexey@gmail.com> <87v9odlxbr.fsf@x220.int.ebiederm.org> <20200212144921.sykucj4mekcziicz@comp-core-i7-2640m-0182e6> <87tv3vkg1a.fsf@x220.int.ebiederm.org> <87v9obipk9.fsf@x220.int.ebiederm.org> <20200212200335.GO23230@ZenIV.linux.org.uk> <20200212203833.GQ23230@ZenIV.linux.org.uk> <20200212204124.GR23230@ZenIV.linux.org.uk> <87lfp7h422.fsf@x220.int.ebiederm.org> <87pnejf6fz.fsf@x220.int.ebiederm.org> <871rqpaswu.fsf_-_@x220.int.ebiederm.org> Date: Thu, 20 Feb 2020 14:49:53 -0600 In-Reply-To: <871rqpaswu.fsf_-_@x220.int.ebiederm.org> (Eric W. Biederman's message of "Thu, 20 Feb 2020 14:46:25 -0600") Message-ID: <87blpt9e6m.fsf_-_@x220.int.ebiederm.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 X-XM-SPF: eid=1j4snY-0005Ze-Lt;;;mid=<87blpt9e6m.fsf_-_@x220.int.ebiederm.org>;;;hst=in01.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1+0jzXheB+9rojm9OHQfM0H+L/hnF6cm6M= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on sa06.xmission.com X-Spam-Level: ** X-Spam-Status: No, score=2.5 required=8.0 tests=ALL_TRUSTED,BAYES_50, DCC_CHECK_NEGATIVE,T_TooManySym_01,XMGappySubj_01,XMNoVowels,XMSubLong autolearn=disabled version=3.4.2 X-Spam-Report: * -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP * 0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60% * [score: 0.4982] * 0.7 XMSubLong Long Subject * 1.5 XMNoVowels Alpha-numberic number with no vowels * 0.5 XMGappySubj_01 Very gappy subject * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC * [sa06 1397; Body=1 Fuz1=1 Fuz2=1] * 0.0 T_TooManySym_01 4+ unique symbols in subject X-Spam-DCC: XMission; sa06 1397; Body=1 Fuz1=1 Fuz2=1 X-Spam-Combo: **;Linus Torvalds X-Spam-Relay-Country: X-Spam-Timing: total 538 ms - load_scoreonly_sql: 0.05 (0.0%), signal_user_changed: 3.3 (0.6%), b_tie_ro: 2.3 (0.4%), parse: 1.02 (0.2%), extract_message_metadata: 11 (2.1%), get_uri_detail_list: 1.95 (0.4%), tests_pri_-1000: 14 (2.7%), tests_pri_-950: 1.26 (0.2%), tests_pri_-900: 1.04 (0.2%), tests_pri_-90: 29 (5.4%), check_bayes: 28 (5.2%), b_tokenize: 11 (2.0%), b_tok_get_all: 8 (1.5%), b_comp_prob: 2.5 (0.5%), b_tok_touch_all: 4.2 (0.8%), b_finish: 0.66 (0.1%), tests_pri_0: 466 (86.5%), check_dkim_signature: 0.60 (0.1%), check_dkim_adsp: 2.5 (0.5%), poll_dns_idle: 0.79 (0.1%), tests_pri_10: 2.1 (0.4%), tests_pri_500: 6 (1.2%), rewrite_mail: 0.00 (0.0%) Subject: [PATCH 4/7] proc: Use d_invalidate in proc_prune_siblings_dcache X-Spam-Flag: No X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600) X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org The function d_prune_aliases has the problem that it will only prune aliases thare are completely unused. It will not remove aliases for the dcache or even think of removing mounts from the dcache. For that behavior d_invalidate is needed. To use d_invalidate replace d_prune_aliases with d_find_alias followed by d_invalidate and dput. This is safe and complete because no inode in proc has any hardlinks or aliases. To make this behavior change clear rename proc_prune_siblings_dache proc_invalidate_siblings_dcache, and rename proc_sys_prune_dcache proc_sys_invalidate_dcache. Signed-off-by: "Eric W. Biederman" --- fs/proc/inode.c | 9 +++++++-- fs/proc/internal.h | 2 +- fs/proc/proc_sysctl.c | 8 ++++---- 3 files changed, 12 insertions(+), 7 deletions(-) diff --git a/fs/proc/inode.c b/fs/proc/inode.c index 38a7baa41aba..c4528c419876 100644 --- a/fs/proc/inode.c +++ b/fs/proc/inode.c @@ -103,12 +103,13 @@ void __init proc_init_kmemcache(void) BUILD_BUG_ON(sizeof(struct proc_dir_entry) >= SIZEOF_PDE); } -void proc_prune_siblings_dcache(struct hlist_head *inodes, spinlock_t *lock) +void proc_invalidate_siblings_dcache(struct hlist_head *inodes, spinlock_t *lock) { struct inode *inode; struct proc_inode *ei; struct hlist_node *node; struct super_block *sb; + struct dentry *dentry; for (;;) { rcu_read_lock(); @@ -135,7 +136,11 @@ void proc_prune_siblings_dcache(struct hlist_head *inodes, spinlock_t *lock) continue; } - d_prune_aliases(inode); + dentry = d_find_alias(inode); + if (dentry) { + d_invalidate(dentry); + dput(dentry); + } iput(inode); deactivate_super(sb); } diff --git a/fs/proc/internal.h b/fs/proc/internal.h index ba9a991824a5..fd470172675f 100644 --- a/fs/proc/internal.h +++ b/fs/proc/internal.h @@ -210,7 +210,7 @@ extern const struct inode_operations proc_pid_link_inode_operations; extern const struct super_operations proc_sops; void proc_init_kmemcache(void); -void proc_prune_siblings_dcache(struct hlist_head *inodes, spinlock_t *lock); +void proc_invalidate_siblings_dcache(struct hlist_head *inodes, spinlock_t *lock); void set_proc_pid_nlink(void); extern struct inode *proc_get_inode(struct super_block *, struct proc_dir_entry *); extern void proc_entry_rundown(struct proc_dir_entry *); diff --git a/fs/proc/proc_sysctl.c b/fs/proc/proc_sysctl.c index 5da9d7f7ae34..b6f5d459b087 100644 --- a/fs/proc/proc_sysctl.c +++ b/fs/proc/proc_sysctl.c @@ -267,9 +267,9 @@ static void unuse_table(struct ctl_table_header *p) complete(p->unregistering); } -static void proc_sys_prune_dcache(struct ctl_table_header *head) +static void proc_sys_invalidate_dcache(struct ctl_table_header *head) { - proc_prune_siblings_dcache(&head->inodes, &sysctl_lock); + proc_invalidate_siblings_dcache(&head->inodes, &sysctl_lock); } /* called under sysctl_lock, will reacquire if has to wait */ @@ -291,10 +291,10 @@ static void start_unregistering(struct ctl_table_header *p) spin_unlock(&sysctl_lock); } /* - * Prune dentries for unregistered sysctls: namespaced sysctls + * Invalidate dentries for unregistered sysctls: namespaced sysctls * can have duplicate names and contaminate dcache very badly. */ - proc_sys_prune_dcache(p); + proc_sys_invalidate_dcache(p); /* * do not remove from the list until nobody holds it; walking the * list in do_sysctl() relies on that. From patchwork Thu Feb 20 20:51:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11395027 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 50FBA109A for ; Thu, 20 Feb 2020 20:53:45 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3038720679 for ; Thu, 20 Feb 2020 20:53:45 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728993AbgBTUxl (ORCPT ); Thu, 20 Feb 2020 15:53:41 -0500 Received: from out02.mta.xmission.com ([166.70.13.232]:49616 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728688AbgBTUxk (ORCPT ); Thu, 20 Feb 2020 15:53:40 -0500 Received: from in01.mta.xmission.com ([166.70.13.51]) by out02.mta.xmission.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1j4spH-00027H-L2; Thu, 20 Feb 2020 13:53:39 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.xmission.com) by in01.mta.xmission.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.87) (envelope-from ) id 1j4spG-0005x1-BZ; Thu, 20 Feb 2020 13:53:39 -0700 From: ebiederm@xmission.com (Eric W. Biederman) To: Linus Torvalds Cc: Al Viro , LKML , Kernel Hardening , Linux API , Linux FS Devel , Linux Security Module , Akinobu Mita , Alexey Dobriyan , Andrew Morton , Andy Lutomirski , Daniel Micay , Djalal Harouni , "Dmitry V . Levin" , Greg Kroah-Hartman , Ingo Molnar , "J . Bruce Fields" , Jeff Layton , Jonathan Corbet , Kees Cook , Oleg Nesterov , Solar Designer References: <20200210150519.538333-8-gladkov.alexey@gmail.com> <87v9odlxbr.fsf@x220.int.ebiederm.org> <20200212144921.sykucj4mekcziicz@comp-core-i7-2640m-0182e6> <87tv3vkg1a.fsf@x220.int.ebiederm.org> <87v9obipk9.fsf@x220.int.ebiederm.org> <20200212200335.GO23230@ZenIV.linux.org.uk> <20200212203833.GQ23230@ZenIV.linux.org.uk> <20200212204124.GR23230@ZenIV.linux.org.uk> <87lfp7h422.fsf@x220.int.ebiederm.org> <87pnejf6fz.fsf@x220.int.ebiederm.org> <871rqpaswu.fsf_-_@x220.int.ebiederm.org> Date: Thu, 20 Feb 2020 14:51:38 -0600 In-Reply-To: <871rqpaswu.fsf_-_@x220.int.ebiederm.org> (Eric W. Biederman's message of "Thu, 20 Feb 2020 14:46:25 -0600") Message-ID: <8736b59e3p.fsf_-_@x220.int.ebiederm.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 X-XM-SPF: eid=1j4spG-0005x1-BZ;;;mid=<8736b59e3p.fsf_-_@x220.int.ebiederm.org>;;;hst=in01.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX19DTetfpqNyVK1RVE7gBNaoSNIVSGPjvlE= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on sa04.xmission.com X-Spam-Level: ** X-Spam-Status: No, score=2.0 required=8.0 tests=ALL_TRUSTED,BAYES_50, DCC_CHECK_NEGATIVE,T_TooManySym_01,XMNoVowels,XMSubLong autolearn=disabled version=3.4.2 X-Spam-Report: * -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP * 0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60% * [score: 0.4636] * 0.7 XMSubLong Long Subject * 1.5 XMNoVowels Alpha-numberic number with no vowels * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC * [sa04 1397; Body=1 Fuz1=1 Fuz2=1] * 0.0 T_TooManySym_01 4+ unique symbols in subject X-Spam-DCC: XMission; sa04 1397; Body=1 Fuz1=1 Fuz2=1 X-Spam-Combo: **;Linus Torvalds X-Spam-Relay-Country: X-Spam-Timing: total 553 ms - load_scoreonly_sql: 0.03 (0.0%), signal_user_changed: 2.3 (0.4%), b_tie_ro: 1.60 (0.3%), parse: 0.78 (0.1%), extract_message_metadata: 9 (1.6%), get_uri_detail_list: 0.90 (0.2%), tests_pri_-1000: 12 (2.2%), tests_pri_-950: 0.98 (0.2%), tests_pri_-900: 0.80 (0.1%), tests_pri_-90: 22 (4.0%), check_bayes: 21 (3.8%), b_tokenize: 7 (1.3%), b_tok_get_all: 7 (1.3%), b_comp_prob: 1.82 (0.3%), b_tok_touch_all: 3.2 (0.6%), b_finish: 0.54 (0.1%), tests_pri_0: 494 (89.4%), check_dkim_signature: 0.43 (0.1%), check_dkim_adsp: 2.4 (0.4%), poll_dns_idle: 1.01 (0.2%), tests_pri_10: 2.0 (0.4%), tests_pri_500: 6 (1.0%), rewrite_mail: 0.00 (0.0%) Subject: [PATCH 5/7] proc: Clear the pieces of proc_inode that proc_evict_inode cares about X-Spam-Flag: No X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600) X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org This just keeps everything tidier, and allows for using flags like SLAB_TYPESAFE_BY_RCU where slabs are not always cleared before reuse. I don't see reuse without reinitializing happening with the proc_inode but I had a false alarm while reworking flushing of proc dentries and indoes when a process dies that caused me to tidy this up. The code is a little easier to follow and reason about this way so I figured the changes might as well be kept. Signed-off-by: "Eric W. Biederman" --- fs/proc/inode.c | 16 +++++++++++----- 1 file changed, 11 insertions(+), 5 deletions(-) diff --git a/fs/proc/inode.c b/fs/proc/inode.c index c4528c419876..3c9082cd257b 100644 --- a/fs/proc/inode.c +++ b/fs/proc/inode.c @@ -33,21 +33,27 @@ static void proc_evict_inode(struct inode *inode) { struct proc_dir_entry *de; struct ctl_table_header *head; + struct proc_inode *ei = PROC_I(inode); truncate_inode_pages_final(&inode->i_data); clear_inode(inode); /* Stop tracking associated processes */ - put_pid(PROC_I(inode)->pid); + if (ei->pid) { + put_pid(ei->pid); + ei->pid = NULL; + } /* Let go of any associated proc directory entry */ - de = PDE(inode); - if (de) + de = ei->pde; + if (de) { pde_put(de); + ei->pde = NULL; + } - head = PROC_I(inode)->sysctl; + head = ei->sysctl; if (head) { - RCU_INIT_POINTER(PROC_I(inode)->sysctl, NULL); + RCU_INIT_POINTER(ei->sysctl, NULL); proc_sys_evict_inode(inode, head); } } From patchwork Thu Feb 20 20:52:12 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11395031 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 66D7592A for ; Thu, 20 Feb 2020 20:54:19 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 3E1CE222C4 for ; Thu, 20 Feb 2020 20:54:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729071AbgBTUyP (ORCPT ); Thu, 20 Feb 2020 15:54:15 -0500 Received: from out03.mta.xmission.com ([166.70.13.233]:34984 "EHLO out03.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728400AbgBTUyP (ORCPT ); Thu, 20 Feb 2020 15:54:15 -0500 Received: from in01.mta.xmission.com ([166.70.13.51]) by out03.mta.xmission.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1j4spp-0006m8-FF; Thu, 20 Feb 2020 13:54:13 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.xmission.com) by in01.mta.xmission.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.87) (envelope-from ) id 1j4spo-00063X-DX; Thu, 20 Feb 2020 13:54:13 -0700 From: ebiederm@xmission.com (Eric W. Biederman) To: Linus Torvalds Cc: Al Viro , LKML , Kernel Hardening , Linux API , Linux FS Devel , Linux Security Module , Akinobu Mita , Alexey Dobriyan , Andrew Morton , Andy Lutomirski , Daniel Micay , Djalal Harouni , "Dmitry V . Levin" , Greg Kroah-Hartman , Ingo Molnar , "J . Bruce Fields" , Jeff Layton , Jonathan Corbet , Kees Cook , Oleg Nesterov , Solar Designer References: <20200210150519.538333-8-gladkov.alexey@gmail.com> <87v9odlxbr.fsf@x220.int.ebiederm.org> <20200212144921.sykucj4mekcziicz@comp-core-i7-2640m-0182e6> <87tv3vkg1a.fsf@x220.int.ebiederm.org> <87v9obipk9.fsf@x220.int.ebiederm.org> <20200212200335.GO23230@ZenIV.linux.org.uk> <20200212203833.GQ23230@ZenIV.linux.org.uk> <20200212204124.GR23230@ZenIV.linux.org.uk> <87lfp7h422.fsf@x220.int.ebiederm.org> <87pnejf6fz.fsf@x220.int.ebiederm.org> <871rqpaswu.fsf_-_@x220.int.ebiederm.org> Date: Thu, 20 Feb 2020 14:52:12 -0600 In-Reply-To: <871rqpaswu.fsf_-_@x220.int.ebiederm.org> (Eric W. Biederman's message of "Thu, 20 Feb 2020 14:46:25 -0600") Message-ID: <87wo8h7zib.fsf_-_@x220.int.ebiederm.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 X-XM-SPF: eid=1j4spo-00063X-DX;;;mid=<87wo8h7zib.fsf_-_@x220.int.ebiederm.org>;;;hst=in01.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1/OMqs9VehhvIb+4bZWbETbEYL0SNii37Q= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on sa07.xmission.com X-Spam-Level: *** X-Spam-Status: No, score=3.5 required=8.0 tests=ALL_TRUSTED,BAYES_50, DCC_CHECK_NEGATIVE,LotsOfNums_01,T_FILL_THIS_FORM_SHORT, T_XMDrugObfuBody_08,XMNoVowels autolearn=disabled version=3.4.2 X-Spam-Report: * -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP * 0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60% * [score: 0.5000] * 1.5 XMNoVowels Alpha-numberic number with no vowels * 1.2 LotsOfNums_01 BODY: Lots of long strings of numbers * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC * [sa07 1397; Body=1 Fuz1=1 Fuz2=1] * 1.0 T_XMDrugObfuBody_08 obfuscated drug references * 0.0 T_FILL_THIS_FORM_SHORT Fill in a short form with personal * information X-Spam-DCC: XMission; sa07 1397; Body=1 Fuz1=1 Fuz2=1 X-Spam-Combo: ***;Linus Torvalds X-Spam-Relay-Country: X-Spam-Timing: total 651 ms - load_scoreonly_sql: 0.06 (0.0%), signal_user_changed: 6 (1.0%), b_tie_ro: 3.9 (0.6%), parse: 1.97 (0.3%), extract_message_metadata: 20 (3.1%), get_uri_detail_list: 6 (0.9%), tests_pri_-1000: 16 (2.5%), tests_pri_-950: 1.35 (0.2%), tests_pri_-900: 1.15 (0.2%), tests_pri_-90: 48 (7.4%), check_bayes: 47 (7.1%), b_tokenize: 21 (3.2%), b_tok_get_all: 14 (2.1%), b_comp_prob: 4.1 (0.6%), b_tok_touch_all: 5 (0.8%), b_finish: 0.66 (0.1%), tests_pri_0: 542 (83.2%), check_dkim_signature: 0.94 (0.1%), check_dkim_adsp: 2.9 (0.4%), poll_dns_idle: 0.85 (0.1%), tests_pri_10: 2.2 (0.3%), tests_pri_500: 7 (1.1%), rewrite_mail: 0.00 (0.0%) Subject: [PATCH 6/7] proc: Use a list of inodes to flush from proc X-Spam-Flag: No X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600) X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Rework the flushing of proc to use a list of directory inodes that need to be flushed. The list is kept on struct pid not on struct task_struct, as there is a fixed connection between proc inodes and pids but at least for the case of de_thread the pid of a task_struct changes. This removes the dependency on proc_mnt which allows for different mounts of proc having different mount options even in the same pid namespace and this allows for the removal of proc_mnt which will trivially the first mount of proc to honor it's mount options. This flushing remains an optimization. The functions pid_delete_dentry and pid_revalidate ensure that ordinary dcache management will not attempt to use dentries past the point their respective task has died. When unused the shrinker will eventually be able to remove these dentries. There is a case in de_thread where proc_flush_pid can be called early for a given pid. Which winds up being safe (if suboptimal) as this is just an optiimization. Only pid directories are put on the list as the other per pid files are children of those directories and d_invalidate on the directory will get them as well. So that the pid can be used during flushing it's reference count is taken in release_task and dropped in proc_flush_pid. Further the call of proc_flush_pid is moved after the tasklist_lock is released in release_task so that it is certain that the pid has already been unhashed when flushing it taking place. This removes a small race where a dentry could recreated. As struct pid is supposed to be small and I need a per pid lock I reuse the only lock that currently exists in struct pid the the wait_pidfd.lock. The net result is that this adds all of this functionality with just a little extra list management overhead and a single extra pointer in struct pid. Signed-off-by: Eric W. Biederman --- fs/proc/base.c | 111 +++++++++++++--------------------------- fs/proc/inode.c | 2 +- fs/proc/internal.h | 1 + include/linux/pid.h | 1 + include/linux/proc_fs.h | 4 +- kernel/exit.c | 4 +- 6 files changed, 44 insertions(+), 79 deletions(-) diff --git a/fs/proc/base.c b/fs/proc/base.c index c7c64272b0fa..e7efe9d6f3d6 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -1834,11 +1834,25 @@ void task_dump_owner(struct task_struct *task, umode_t mode, *rgid = gid; } +void proc_pid_evict_inode(struct proc_inode *ei) +{ + struct pid *pid = ei->pid; + + if (S_ISDIR(ei->vfs_inode.i_mode)) { + spin_lock(&pid->wait_pidfd.lock); + hlist_del_init_rcu(&ei->sibling_inodes); + spin_unlock(&pid->wait_pidfd.lock); + } + + put_pid(pid); +} + struct inode *proc_pid_make_inode(struct super_block * sb, struct task_struct *task, umode_t mode) { struct inode * inode; struct proc_inode *ei; + struct pid *pid; /* We need a new inode */ @@ -1856,10 +1870,18 @@ struct inode *proc_pid_make_inode(struct super_block * sb, /* * grab the reference to task. */ - ei->pid = get_task_pid(task, PIDTYPE_PID); - if (!ei->pid) + pid = get_task_pid(task, PIDTYPE_PID); + if (!pid) goto out_unlock; + /* Let the pid remember us for quick removal */ + ei->pid = pid; + if (S_ISDIR(mode)) { + spin_lock(&pid->wait_pidfd.lock); + hlist_add_head_rcu(&ei->sibling_inodes, &pid->inodes); + spin_unlock(&pid->wait_pidfd.lock); + } + task_dump_owner(task, 0, &inode->i_uid, &inode->i_gid); security_task_to_inode(task, inode); @@ -3230,90 +3252,29 @@ static const struct inode_operations proc_tgid_base_inode_operations = { .permission = proc_pid_permission, }; -static void proc_flush_task_mnt(struct vfsmount *mnt, pid_t pid, pid_t tgid) -{ - struct dentry *dentry, *leader, *dir; - char buf[10 + 1]; - struct qstr name; - - name.name = buf; - name.len = snprintf(buf, sizeof(buf), "%u", pid); - /* no ->d_hash() rejects on procfs */ - dentry = d_hash_and_lookup(mnt->mnt_root, &name); - if (dentry) { - d_invalidate(dentry); - dput(dentry); - } - - if (pid == tgid) - return; - - name.name = buf; - name.len = snprintf(buf, sizeof(buf), "%u", tgid); - leader = d_hash_and_lookup(mnt->mnt_root, &name); - if (!leader) - goto out; - - name.name = "task"; - name.len = strlen(name.name); - dir = d_hash_and_lookup(leader, &name); - if (!dir) - goto out_put_leader; - - name.name = buf; - name.len = snprintf(buf, sizeof(buf), "%u", pid); - dentry = d_hash_and_lookup(dir, &name); - if (dentry) { - d_invalidate(dentry); - dput(dentry); - } - - dput(dir); -out_put_leader: - dput(leader); -out: - return; -} - /** - * proc_flush_task - Remove dcache entries for @task from the /proc dcache. - * @task: task that should be flushed. + * proc_flush_pid - Remove dcache entries for @pid from the /proc dcache. + * @pid: pid that should be flushed. * - * When flushing dentries from proc, one needs to flush them from global - * proc (proc_mnt) and from all the namespaces' procs this task was seen - * in. This call is supposed to do all of this job. - * - * Looks in the dcache for - * /proc/@pid - * /proc/@tgid/task/@pid - * if either directory is present flushes it and all of it'ts children - * from the dcache. + * This function walks a list of inodes (that belong to any proc + * filesystem) that are attached to the pid and flushes them from + * the dentry cache. * * It is safe and reasonable to cache /proc entries for a task until * that task exits. After that they just clog up the dcache with * useless entries, possibly causing useful dcache entries to be - * flushed instead. This routine is proved to flush those useless - * dcache entries at process exit time. + * flushed instead. This routine is provided to flush those useless + * dcache entries when a process is reaped. * * NOTE: This routine is just an optimization so it does not guarantee - * that no dcache entries will exist at process exit time it - * just makes it very unlikely that any will persist. + * that no dcache entries will exist after a process is reaped + * it just makes it very unlikely that any will persist. */ -void proc_flush_task(struct task_struct *task) +void proc_flush_pid(struct pid *pid) { - int i; - struct pid *pid, *tgid; - struct upid *upid; - - pid = task_pid(task); - tgid = task_tgid(task); - - for (i = 0; i <= pid->level; i++) { - upid = &pid->numbers[i]; - proc_flush_task_mnt(upid->ns->proc_mnt, upid->nr, - tgid->numbers[i].nr); - } + proc_invalidate_siblings_dcache(&pid->inodes, &pid->wait_pidfd.lock); + put_pid(pid); } static struct dentry *proc_pid_instantiate(struct dentry * dentry, diff --git a/fs/proc/inode.c b/fs/proc/inode.c index 3c9082cd257b..979e867be3e5 100644 --- a/fs/proc/inode.c +++ b/fs/proc/inode.c @@ -40,7 +40,7 @@ static void proc_evict_inode(struct inode *inode) /* Stop tracking associated processes */ if (ei->pid) { - put_pid(ei->pid); + proc_pid_evict_inode(ei); ei->pid = NULL; } diff --git a/fs/proc/internal.h b/fs/proc/internal.h index fd470172675f..9e294f0290e5 100644 --- a/fs/proc/internal.h +++ b/fs/proc/internal.h @@ -158,6 +158,7 @@ extern int proc_pid_statm(struct seq_file *, struct pid_namespace *, extern const struct dentry_operations pid_dentry_operations; extern int pid_getattr(const struct path *, struct kstat *, u32, unsigned int); extern int proc_setattr(struct dentry *, struct iattr *); +extern void proc_pid_evict_inode(struct proc_inode *); extern struct inode *proc_pid_make_inode(struct super_block *, struct task_struct *, umode_t); extern void pid_update_inode(struct task_struct *, struct inode *); extern int pid_delete_dentry(const struct dentry *); diff --git a/include/linux/pid.h b/include/linux/pid.h index 998ae7d24450..01a0d4e28506 100644 --- a/include/linux/pid.h +++ b/include/linux/pid.h @@ -62,6 +62,7 @@ struct pid unsigned int level; /* lists of tasks that use this pid */ struct hlist_head tasks[PIDTYPE_MAX]; + struct hlist_head inodes; /* wait queue for pidfd notifications */ wait_queue_head_t wait_pidfd; struct rcu_head rcu; diff --git a/include/linux/proc_fs.h b/include/linux/proc_fs.h index 3dfa92633af3..40a7982b7285 100644 --- a/include/linux/proc_fs.h +++ b/include/linux/proc_fs.h @@ -32,7 +32,7 @@ struct proc_ops { typedef int (*proc_write_t)(struct file *, char *, size_t); extern void proc_root_init(void); -extern void proc_flush_task(struct task_struct *); +extern void proc_flush_pid(struct pid *); extern struct proc_dir_entry *proc_symlink(const char *, struct proc_dir_entry *, const char *); @@ -105,7 +105,7 @@ static inline void proc_root_init(void) { } -static inline void proc_flush_task(struct task_struct *task) +static inline void proc_flush_pid(struct pid *pid) { } diff --git a/kernel/exit.c b/kernel/exit.c index 2833ffb0c211..502b4995b688 100644 --- a/kernel/exit.c +++ b/kernel/exit.c @@ -191,6 +191,7 @@ void put_task_struct_rcu_user(struct task_struct *task) void release_task(struct task_struct *p) { struct task_struct *leader; + struct pid *thread_pid; int zap_leader; repeat: /* don't need to get the RCU readlock here - the process is dead and @@ -199,11 +200,11 @@ void release_task(struct task_struct *p) atomic_dec(&__task_cred(p)->user->processes); rcu_read_unlock(); - proc_flush_task(p); cgroup_release(p); write_lock_irq(&tasklist_lock); ptrace_release_task(p); + thread_pid = get_pid(p->thread_pid); __exit_signal(p); /* @@ -226,6 +227,7 @@ void release_task(struct task_struct *p) } write_unlock_irq(&tasklist_lock); + proc_flush_pid(thread_pid); release_thread(p); put_task_struct_rcu_user(p); From patchwork Thu Feb 20 20:52:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Eric W. Biederman" X-Patchwork-Id: 11395037 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 0D6DC109A for ; Thu, 20 Feb 2020 20:54:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id E1FF924654 for ; Thu, 20 Feb 2020 20:54:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729064AbgBTUyu (ORCPT ); Thu, 20 Feb 2020 15:54:50 -0500 Received: from out02.mta.xmission.com ([166.70.13.232]:50204 "EHLO out02.mta.xmission.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728400AbgBTUyu (ORCPT ); Thu, 20 Feb 2020 15:54:50 -0500 Received: from in01.mta.xmission.com ([166.70.13.51]) by out02.mta.xmission.com with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1j4sqP-0002HF-0g; Thu, 20 Feb 2020 13:54:49 -0700 Received: from ip68-227-160-95.om.om.cox.net ([68.227.160.95] helo=x220.xmission.com) by in01.mta.xmission.com with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.87) (envelope-from ) id 1j4sqN-00069Y-Au; Thu, 20 Feb 2020 13:54:48 -0700 From: ebiederm@xmission.com (Eric W. Biederman) To: Linus Torvalds Cc: Al Viro , LKML , Kernel Hardening , Linux API , Linux FS Devel , Linux Security Module , Akinobu Mita , Alexey Dobriyan , Andrew Morton , Andy Lutomirski , Daniel Micay , Djalal Harouni , "Dmitry V . Levin" , Greg Kroah-Hartman , Ingo Molnar , "J . Bruce Fields" , Jeff Layton , Jonathan Corbet , Kees Cook , Oleg Nesterov , Solar Designer References: <20200210150519.538333-8-gladkov.alexey@gmail.com> <87v9odlxbr.fsf@x220.int.ebiederm.org> <20200212144921.sykucj4mekcziicz@comp-core-i7-2640m-0182e6> <87tv3vkg1a.fsf@x220.int.ebiederm.org> <87v9obipk9.fsf@x220.int.ebiederm.org> <20200212200335.GO23230@ZenIV.linux.org.uk> <20200212203833.GQ23230@ZenIV.linux.org.uk> <20200212204124.GR23230@ZenIV.linux.org.uk> <87lfp7h422.fsf@x220.int.ebiederm.org> <87pnejf6fz.fsf@x220.int.ebiederm.org> <871rqpaswu.fsf_-_@x220.int.ebiederm.org> Date: Thu, 20 Feb 2020 14:52:47 -0600 In-Reply-To: <871rqpaswu.fsf_-_@x220.int.ebiederm.org> (Eric W. Biederman's message of "Thu, 20 Feb 2020 14:46:25 -0600") Message-ID: <87r1yp7zhc.fsf_-_@x220.int.ebiederm.org> User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/26.1 (gnu/linux) MIME-Version: 1.0 X-XM-SPF: eid=1j4sqN-00069Y-Au;;;mid=<87r1yp7zhc.fsf_-_@x220.int.ebiederm.org>;;;hst=in01.mta.xmission.com;;;ip=68.227.160.95;;;frm=ebiederm@xmission.com;;;spf=neutral X-XM-AID: U2FsdGVkX1/YvfDt9LJUlANZXt5NnLd7OKfeCre6VOM= X-SA-Exim-Connect-IP: 68.227.160.95 X-SA-Exim-Mail-From: ebiederm@xmission.com X-Spam-Checker-Version: SpamAssassin 3.4.2 (2018-09-13) on sa01.xmission.com X-Spam-Level: ** X-Spam-Status: No, score=2.0 required=8.0 tests=ALL_TRUSTED,BAYES_50, DCC_CHECK_NEGATIVE,NO_DNS_FOR_FROM,T_TM2_M_HEADER_IN_MSG,XMNoVowels, XMSubLong autolearn=disabled version=3.4.2 X-Spam-Virus: No X-Spam-Report: * -1.0 ALL_TRUSTED Passed through trusted hosts only via SMTP * 0.8 BAYES_50 BODY: Bayes spam probability is 40 to 60% * [score: 0.4998] * 0.7 XMSubLong Long Subject * 1.5 XMNoVowels Alpha-numberic number with no vowels * 0.0 NO_DNS_FOR_FROM DNS: Envelope sender has no MX or A DNS records * 0.0 T_TM2_M_HEADER_IN_MSG BODY: No description available. * -0.0 DCC_CHECK_NEGATIVE Not listed in DCC * [sa01 1397; Body=1 Fuz1=1 Fuz2=1] X-Spam-DCC: XMission; sa01 1397; Body=1 Fuz1=1 Fuz2=1 X-Spam-Combo: **;Linus Torvalds X-Spam-Relay-Country: X-Spam-Timing: total 1246 ms - load_scoreonly_sql: 0.04 (0.0%), signal_user_changed: 2.8 (0.2%), b_tie_ro: 2.0 (0.2%), parse: 1.28 (0.1%), extract_message_metadata: 12 (1.0%), get_uri_detail_list: 2.9 (0.2%), tests_pri_-1000: 3.4 (0.3%), tests_pri_-950: 0.99 (0.1%), tests_pri_-900: 0.86 (0.1%), tests_pri_-90: 26 (2.1%), check_bayes: 25 (2.0%), b_tokenize: 9 (0.7%), b_tok_get_all: 9 (0.7%), b_comp_prob: 1.68 (0.1%), b_tok_touch_all: 3.9 (0.3%), b_finish: 0.66 (0.1%), tests_pri_0: 1186 (95.2%), check_dkim_signature: 0.39 (0.0%), check_dkim_adsp: 553 (44.4%), poll_dns_idle: 549 (44.1%), tests_pri_10: 2.4 (0.2%), tests_pri_500: 7 (0.6%), rewrite_mail: 0.00 (0.0%) Subject: [PATCH 7/7] proc: Ensure we see the exit of each process tid exactly once X-Spam-Flag: No X-SA-Exim-Version: 4.2.1 (built Thu, 05 May 2016 13:38:54 -0600) X-SA-Exim-Scanned: Yes (on in01.mta.xmission.com) Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org When the thread group leader changes during exec and the old leaders thread is reaped proc_flush_pid will flush the dentries for the entire process because the leader still has it's original pid. Fix this by exchanging the pids in an rcu safe manner, and wrapping the code to do that up in a helper exchange_tids. When I removed switch_exec_pids and introduced this behavior in d73d65293e3e ("[PATCH] pidhash: kill switch_exec_pids") there really was nothing that cared as flushing happened with the cached dentry and de_thread flushed both of them on exec. This lack of fully exchanging pids became a problem a few months later when I introduced 48e6484d4902 ("[PATCH] proc: Rewrite the proc dentry flush on exit optimization"). Which overlooked the de_thread case was no longer swapping pids, and I was looking up proc dentries by task->pid. The current behavior isn't properly a bug as everything in proc will continue to work correctly just a little bit less efficiently. Fix this just so there are no little surprise corner cases waiting to bite people. Fixes: 48e6484d4902 ("[PATCH] proc: Rewrite the proc dentry flush on exit optimization"). Signed-off-by: Eric W. Biederman --- fs/exec.c | 5 +---- include/linux/pid.h | 1 + kernel/pid.c | 16 ++++++++++++++++ 3 files changed, 18 insertions(+), 4 deletions(-) diff --git a/fs/exec.c b/fs/exec.c index db17be51b112..3f0bc293442e 100644 --- a/fs/exec.c +++ b/fs/exec.c @@ -1148,11 +1148,8 @@ static int de_thread(struct task_struct *tsk) /* Become a process group leader with the old leader's pid. * The old leader becomes a thread of the this thread group. - * Note: The old leader also uses this pid until release_task - * is called. Odd but simple and correct. */ - tsk->pid = leader->pid; - change_pid(tsk, PIDTYPE_PID, task_pid(leader)); + exchange_tids(tsk, leader); transfer_pid(leader, tsk, PIDTYPE_TGID); transfer_pid(leader, tsk, PIDTYPE_PGID); transfer_pid(leader, tsk, PIDTYPE_SID); diff --git a/include/linux/pid.h b/include/linux/pid.h index 01a0d4e28506..0f40b5f1c32c 100644 --- a/include/linux/pid.h +++ b/include/linux/pid.h @@ -101,6 +101,7 @@ extern void attach_pid(struct task_struct *task, enum pid_type); extern void detach_pid(struct task_struct *task, enum pid_type); extern void change_pid(struct task_struct *task, enum pid_type, struct pid *pid); +extern void exchange_tids(struct task_struct *task, struct task_struct *old); extern void transfer_pid(struct task_struct *old, struct task_struct *new, enum pid_type); diff --git a/kernel/pid.c b/kernel/pid.c index 0f4ecb57214c..0085b15478fb 100644 --- a/kernel/pid.c +++ b/kernel/pid.c @@ -359,6 +359,22 @@ void change_pid(struct task_struct *task, enum pid_type type, attach_pid(task, type); } +void exchange_tids(struct task_struct *ntask, struct task_struct *otask) +{ + /* pid_links[PIDTYPE_PID].next is always NULL */ + struct pid *npid = READ_ONCE(ntask->thread_pid); + struct pid *opid = READ_ONCE(otask->thread_pid); + + rcu_assign_pointer(opid->tasks[PIDTYPE_PID].first, &ntask->pid_links[PIDTYPE_PID]); + rcu_assign_pointer(npid->tasks[PIDTYPE_PID].first, &otask->pid_links[PIDTYPE_PID]); + rcu_assign_pointer(ntask->thread_pid, opid); + rcu_assign_pointer(otask->thread_pid, npid); + WRITE_ONCE(ntask->pid_links[PIDTYPE_PID].pprev, &opid->tasks[PIDTYPE_PID].first); + WRITE_ONCE(otask->pid_links[PIDTYPE_PID].pprev, &npid->tasks[PIDTYPE_PID].first); + WRITE_ONCE(ntask->pid, pid_nr(opid)); + WRITE_ONCE(otask->pid, pid_nr(npid)); +} + /* transfer_pid is an optimization of attach_pid(new), detach_pid(old) */ void transfer_pid(struct task_struct *old, struct task_struct *new, enum pid_type type)