From patchwork Fri Dec 2 17:16:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 13062997 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id D3C04C4321E for ; Fri, 2 Dec 2022 17:16:19 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 6116A6B0071; Fri, 2 Dec 2022 12:16:19 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 5C1646B0073; Fri, 2 Dec 2022 12:16:19 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 48BF86B0074; Fri, 2 Dec 2022 12:16:19 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 39D996B0071 for ; Fri, 2 Dec 2022 12:16:19 -0500 (EST) Received: from smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id D4D90A058D for ; Fri, 2 Dec 2022 17:16:18 +0000 (UTC) X-FDA: 80198019636.26.2F33B40 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf11.hostedemail.com (Postfix) with ESMTP id E4E224000A for ; Fri, 2 Dec 2022 17:16:17 +0000 (UTC) Authentication-Results: imf11.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b="Lg/WieSf"; spf=pass (imf11.hostedemail.com: domain of bfoster@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=bfoster@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1670001378; h=from:from:sender:reply-to:subject:subject:date:date: 
message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=fMasqR+dnAYUNzNvfPkl6FDxt06ngAqgCShDgQ/MNk4=; b=MsaUMoYIVcsr9pHwALrdXFGXxvXtepmubedEgjUGneWb/ITfIfWBa93dw6onO6AhjjGpDN MfnZNfVxA/vD3JeLpwnLAhpj7MFmxFfpENk06KAU6jayxPzRc/F28FErSnO7SQu7IwY1J2 t3FyHbDW4u6ep+1B6fj2vE96+GxZKtY= ARC-Authentication-Results: i=1; imf11.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b="Lg/WieSf"; spf=pass (imf11.hostedemail.com: domain of bfoster@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=bfoster@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1670001378; a=rsa-sha256; cv=none; b=gah8Y9c+Pb1zEM27mOnue75lCT8H2tmZWYSZqbPg4RPUKsLhaMyV4NsCyB5WPDu+s4+QPx i6W9ZHjau13ZYYt9JYvtli0DC9hHdy5iNaZfFPLWwiFEyGhi1ooQ3vkZAszGCmOD1+qRc3 q2cfg47nygsXsOyKCxbJxWHCeCSmbs8= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1670001377; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=fMasqR+dnAYUNzNvfPkl6FDxt06ngAqgCShDgQ/MNk4=; b=Lg/WieSf454xyqXg2lrm0s/IPKYGrnGTaAHcBzta0BQx5earo+cNdHmvtUnH4gUEAZ8QWi s7cPNV1tbI0T66eI2ACKg/FYXnM05g6Zp5SSIiNdeCbSDdXE6zIAAosY82UUebZbSgJORG XQ+SkPpTx4pa1VNvXU5xCDJH4VlYyDw= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-558-rkrXDgJWMlaM3o3biIrs3g-1; Fri, 02 Dec 2022 12:16:15 -0500 X-MC-Unique: rkrXDgJWMlaM3o3biIrs3g-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with 
cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4BAD1882823; Fri, 2 Dec 2022 17:16:15 +0000 (UTC) Received: from bfoster.redhat.com (unknown [10.22.8.52]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0990E40C94AA; Fri, 2 Dec 2022 17:16:15 +0000 (UTC) From: Brian Foster To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: ikent@redhat.com, onestero@redhat.com, willy@infradead.org, ebiederm@redhat.com Subject: [PATCH v3 1/5] pid: replace pidmap_lock with xarray lock Date: Fri, 2 Dec 2022 12:16:16 -0500 Message-Id: <20221202171620.509140-2-bfoster@redhat.com> In-Reply-To: <20221202171620.509140-1-bfoster@redhat.com> References: <20221202171620.509140-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Rspamd-Server: rspam07 X-Rspamd-Queue-Id: E4E224000A X-Rspam-User: X-Stat-Signature: ziub9ijzzzto361rww81dkmmm4bxc66n X-Spamd-Result: default: False [-2.36 / 9.00]; BAYES_HAM(-5.96)[99.91%]; R_MISSING_CHARSET(2.50)[]; MID_CONTAINS_FROM(1.00)[]; SUBJECT_HAS_UNDERSCORES(1.00)[]; DMARC_POLICY_ALLOW(-0.50)[redhat.com,none]; R_SPF_ALLOW(-0.20)[+ip4:170.10.133.0/24]; R_DKIM_ALLOW(-0.20)[redhat.com:s=mimecast20190719]; RCVD_NO_TLS_LAST(0.10)[]; MIME_GOOD(-0.10)[text/plain]; RCPT_COUNT_SEVEN(0.00)[7]; FROM_EQ_ENVFROM(0.00)[]; DKIM_TRACE(0.00)[redhat.com:+]; RCVD_COUNT_THREE(0.00)[4]; MIME_TRACE(0.00)[0:+]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_NONE(0.00)[]; ARC_SIGNED(0.00)[hostedemail.com:s=arc-20220608:i=1]; ARC_NA(0.00)[] X-HE-Tag: 1670001377-388485 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: As a first step to changing the struct pid tracking code from the idr over to the xarray, replace the custom pidmap_lock spinlock with the internal lock associated with the underlying xarray. 
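To picture the locking change, here is a minimal userspace sketch, not kernel code. It assumes a toy spinlock built on a C11 atomic_flag in place of the xarray's internal lock, and every name in it (toy_ns, toy_pid, toy_free_pid and friends) is hypothetical. Interrupt disabling is not modelled. The only point is the shape of the loop: each namespace level is updated under that namespace's own lock rather than under one global lock held across all levels.

```c
#include <assert.h>
#include <stdatomic.h>

#define TOY_MAX_LEVEL 4

/* Hypothetical stand-in for the spinlock embedded in each namespace's xarray. */
struct toy_xa_lock {
	atomic_flag held;
	unsigned int acquisitions;	/* demo bookkeeping only */
};

/* Hypothetical, heavily reduced model of struct pid_namespace. */
struct toy_ns {
	struct toy_xa_lock lock;
	unsigned int pid_allocated;
};

/* Hypothetical model of struct pid: one namespace pointer per level. */
struct toy_pid {
	int level;
	struct toy_ns *ns[TOY_MAX_LEVEL];
};

static void toy_ns_init(struct toy_ns *ns, unsigned int allocated)
{
	atomic_flag_clear(&ns->lock.held);
	ns->lock.acquisitions = 0;
	ns->pid_allocated = allocated;
}

static void toy_lock_acquire(struct toy_xa_lock *l)
{
	/* Spin until the flag was previously clear. */
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		;
	l->acquisitions++;
}

static void toy_lock_release(struct toy_xa_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/*
 * Same shape as the reworked free_pid() loop: take and drop the lock of
 * each namespace in turn instead of holding one global lock throughout.
 */
static void toy_free_pid(struct toy_pid *pid)
{
	for (int i = 0; i <= pid->level; i++) {
		struct toy_ns *ns = pid->ns[i];

		toy_lock_acquire(&ns->lock);
		assert(ns->pid_allocated > 0);
		ns->pid_allocated--;
		toy_lock_release(&ns->lock);
	}
}
```

One consequence of this design, visible in the sketch, is that state belonging to one namespace is only ever serialized against users of that same namespace.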
This change is effectively equivalent to using idr_lock() and friends, but
since the goal is to disentangle from the idr, move directly to the
underlying xarray api.

Signed-off-by: Matthew Wilcox
Signed-off-by: Brian Foster
Reviewed-by: Ian Kent
---
 kernel/pid.c | 79 ++++++++++++++++++++++++++--------------------------
 1 file changed, 40 insertions(+), 39 deletions(-)

diff --git a/kernel/pid.c b/kernel/pid.c
index 3fbc5e46b721..3622f8b13143 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -86,22 +86,6 @@ struct pid_namespace init_pid_ns = {
 };
 EXPORT_SYMBOL_GPL(init_pid_ns);
 
-/*
- * Note: disable interrupts while the pidmap_lock is held as an
- * interrupt might come in and do read_lock(&tasklist_lock).
- *
- * If we don't disable interrupts there is a nasty deadlock between
- * detach_pid()->free_pid() and another cpu that does
- * spin_lock(&pidmap_lock) followed by an interrupt routine that does
- * read_lock(&tasklist_lock);
- *
- * After we clean up the tasklist_lock and know there are no
- * irq handlers that take it we can leave the interrupts enabled.
- * For now it is easier to be safe than to prove it can't happen.
- */
-
-static __cacheline_aligned_in_smp DEFINE_SPINLOCK(pidmap_lock);
-
 void put_pid(struct pid *pid)
 {
 	struct pid_namespace *ns;
@@ -129,10 +113,11 @@ void free_pid(struct pid *pid)
 	int i;
 	unsigned long flags;
 
-	spin_lock_irqsave(&pidmap_lock, flags);
 	for (i = 0; i <= pid->level; i++) {
 		struct upid *upid = pid->numbers + i;
 		struct pid_namespace *ns = upid->ns;
+
+		xa_lock_irqsave(&ns->idr.idr_rt, flags);
 		switch (--ns->pid_allocated) {
 		case 2:
 		case 1:
@@ -150,8 +135,8 @@ void free_pid(struct pid *pid)
 		}
 
 		idr_remove(&ns->idr, upid->nr);
+		xa_unlock_irqrestore(&ns->idr.idr_rt, flags);
 	}
-	spin_unlock_irqrestore(&pidmap_lock, flags);
 
 	call_rcu(&pid->rcu, delayed_put_pid);
 }
@@ -206,7 +191,7 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid,
 		}
 
 		idr_preload(GFP_KERNEL);
-		spin_lock_irq(&pidmap_lock);
+		xa_lock_irq(&tmp->idr.idr_rt);
 
 		if (tid) {
 			nr = idr_alloc(&tmp->idr, NULL, tid,
@@ -233,7 +218,7 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid,
 			nr = idr_alloc_cyclic(&tmp->idr, NULL, pid_min,
 					      pid_max, GFP_ATOMIC);
 		}
-		spin_unlock_irq(&pidmap_lock);
+		xa_unlock_irq(&tmp->idr.idr_rt);
 		idr_preload_end();
 
 		if (nr < 0) {
@@ -266,34 +251,38 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid,
 	INIT_HLIST_HEAD(&pid->inodes);
 
 	upid = pid->numbers + ns->level;
-	spin_lock_irq(&pidmap_lock);
-	if (!(ns->pid_allocated & PIDNS_ADDING))
-		goto out_unlock;
 	for ( ; upid >= pid->numbers; --upid) {
+		tmp = upid->ns;
+
+		xa_lock_irq(&tmp->idr.idr_rt);
+		if (tmp == ns && !(tmp->pid_allocated & PIDNS_ADDING)) {
+			xa_unlock_irq(&tmp->idr.idr_rt);
+			put_pid_ns(ns);
+			goto out_free;
+		}
+
 		/* Make the PID visible to find_pid_ns. */
-		idr_replace(&upid->ns->idr, pid, upid->nr);
-		upid->ns->pid_allocated++;
+		idr_replace(&tmp->idr, pid, upid->nr);
+		tmp->pid_allocated++;
+		xa_unlock_irq(&tmp->idr.idr_rt);
 	}
-	spin_unlock_irq(&pidmap_lock);
 
 	return pid;
 
-out_unlock:
-	spin_unlock_irq(&pidmap_lock);
-	put_pid_ns(ns);
-
 out_free:
-	spin_lock_irq(&pidmap_lock);
 	while (++i <= ns->level) {
 		upid = pid->numbers + i;
-		idr_remove(&upid->ns->idr, upid->nr);
-	}
+		tmp = upid->ns;
 
-	/* On failure to allocate the first pid, reset the state */
-	if (ns->pid_allocated == PIDNS_ADDING)
-		idr_set_cursor(&ns->idr, 0);
+		xa_lock_irq(&tmp->idr.idr_rt);
 
-	spin_unlock_irq(&pidmap_lock);
+		/* On failure to allocate the first pid, reset the state */
+		if (tmp == ns && tmp->pid_allocated == PIDNS_ADDING)
+			idr_set_cursor(&ns->idr, 0);
+
+		idr_remove(&tmp->idr, upid->nr);
+		xa_unlock_irq(&tmp->idr.idr_rt);
+	}
 
 	kmem_cache_free(ns->pid_cachep, pid);
 	return ERR_PTR(retval);
@@ -301,9 +290,9 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid,
 
 void disable_pid_allocation(struct pid_namespace *ns)
 {
-	spin_lock_irq(&pidmap_lock);
+	xa_lock_irq(&ns->idr.idr_rt);
 	ns->pid_allocated &= ~PIDNS_ADDING;
-	spin_unlock_irq(&pidmap_lock);
+	xa_unlock_irq(&ns->idr.idr_rt);
 }
 
 struct pid *find_pid_ns(int nr, struct pid_namespace *ns)
@@ -647,6 +636,18 @@ SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags)
 	return fd;
 }
 
+/*
+ * Note: disable interrupts while the xarray lock is held as an interrupt might
+ * come in and do read_lock(&tasklist_lock).
+ *
+ * If we don't disable interrupts there is a nasty deadlock between
+ * detach_pid()->free_pid() and another cpu that does xa_lock() followed by an
+ * interrupt routine that does read_lock(&tasklist_lock);
+ *
+ * After we clean up the tasklist_lock and know there are no irq handlers that
+ * take it we can leave the interrupts enabled. For now it is easier to be safe
+ * than to prove it can't happen.
+ */
 void __init pid_idr_init(void)
 {
 	/* Verify no one has done anything silly: */

From patchwork Fri Dec 2 17:16:17 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Brian Foster
X-Patchwork-Id: 13063002
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 415A9C4321E for ; Fri, 2 Dec 2022 17:16:26 +0000 (UTC)
Received: by kanga.kvack.org (Postfix) id 970958E0002; Fri, 2 Dec 2022 12:16:22 -0500 (EST)
Received: by kanga.kvack.org (Postfix, from userid 40) id 80DD38E0001; Fri, 2 Dec 2022 12:16:22 -0500 (EST)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id 63B038E0002; Fri, 2 Dec 2022 12:16:22 -0500 (EST)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 461378E0001 for ; Fri, 2 Dec 2022 12:16:22 -0500 (EST)
Received: from smtpin20.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id E7CDC1C6C34 for ; Fri, 2 Dec 2022 17:16:21 +0000 (UTC)
X-FDA: 80198019762.20.8A52D7D
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf02.hostedemail.com (Postfix) with ESMTP id 8F1E480017 for ; Fri, 2 Dec 2022 17:16:21 +0000 (UTC)
Authentication-Results: imf02.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=APwBGI3E; spf=pass (imf02.hostedemail.com: domain of bfoster@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=bfoster@redhat.com; dmarc=pass (policy=none) header.from=redhat.com
ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1670001381; a=rsa-sha256; cv=none;
b=IKw/720+v3G+igq5e2F4Ylaclx2s4yqC1TVMYVNdnffu10pdtcY3KGTUFWNU5pqVWrsRvU sU3aneps/wSuXr5jcYBYyuAmSNxqb12BmCAa0OPYoVoSwvwktunSJVH1i5LXGAoHERiGjC 8EhHM54nb6aYoDreZUEFjY6ln2tbQmU= ARC-Authentication-Results: i=1; imf02.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=APwBGI3E; spf=pass (imf02.hostedemail.com: domain of bfoster@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=bfoster@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1670001381; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=E+1/kxt1BcrGWlRNRbifBPUUK2viMqVoDBsQbTYSfG4=; b=O/JAPO1hSBAmxV9BX/lK55Zw2CHIOC3UfUg+vr5PyCBRhSBCqMU64MlI80jmPP014OZtOo 4KYytIN3fHRNBLSyzdFzQkd/f0v2+TO1BTos0T0e87WWpRWlWTXCOn71tEY+ajMKnQuebP GI/V+bvCTNunHse4FS43RRRkkDfDpAU= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1670001381; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=E+1/kxt1BcrGWlRNRbifBPUUK2viMqVoDBsQbTYSfG4=; b=APwBGI3EpSM2YHc4BfSDksqF8KElryuI9aNPrThQNR2g3lMGcv0JGCmE0xF7bH8HhOD+7p Ftdk7qDSzgGgS5cW5Gu5mxDWKFVr6WYWKRZ4Y3QqT0/7MP0wNuSosU6fnbBiwPEmzlmzQ5 troSnadU8FDAwiqIDtBh8H9SPGGw3ws= Received: from mimecast-mx02.redhat.com (mx3-rdu2.redhat.com [66.187.233.73]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-210-p9TcCX5SMzOk8TRsi3DIrA-1; Fri, 02 Dec 2022 12:16:19 -0500 X-MC-Unique: p9TcCX5SMzOk8TRsi3DIrA-1 Received: from smtp.corp.redhat.com 
(int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 9DCE23C0F7F6; Fri, 2 Dec 2022 17:16:15 +0000 (UTC) Received: from bfoster.redhat.com (unknown [10.22.8.52]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5B0DD40C947B; Fri, 2 Dec 2022 17:16:15 +0000 (UTC) From: Brian Foster To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: ikent@redhat.com, onestero@redhat.com, willy@infradead.org, ebiederm@redhat.com Subject: [PATCH v3 2/5] pid: split cyclic id allocation cursor from idr Date: Fri, 2 Dec 2022 12:16:17 -0500 Message-Id: <20221202171620.509140-3-bfoster@redhat.com> In-Reply-To: <20221202171620.509140-1-bfoster@redhat.com> References: <20221202171620.509140-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spamd-Result: default: False [-3.40 / 9.00]; BAYES_HAM(-6.00)[99.99%]; R_MISSING_CHARSET(2.50)[]; MID_CONTAINS_FROM(1.00)[]; DMARC_POLICY_ALLOW(-0.50)[redhat.com,none]; R_DKIM_ALLOW(-0.20)[redhat.com:s=mimecast20190719]; R_SPF_ALLOW(-0.20)[+ip4:170.10.133.0/24]; RCVD_NO_TLS_LAST(0.10)[]; MIME_GOOD(-0.10)[text/plain]; RCPT_COUNT_SEVEN(0.00)[7]; FROM_EQ_ENVFROM(0.00)[]; DKIM_TRACE(0.00)[redhat.com:+]; RCVD_COUNT_THREE(0.00)[4]; MIME_TRACE(0.00)[0:+]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_NONE(0.00)[]; ARC_SIGNED(0.00)[hostedemail.com:s=arc-20220608:i=1]; ARC_NA(0.00)[] X-Rspam-User: X-Rspamd-Server: rspam03 X-Rspamd-Queue-Id: 8F1E480017 X-Stat-Signature: cuzb6414mj1tfznn5r4cigw4i4asb3rm X-HE-Tag: 1670001381-155272 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: As a next step in separating pid allocation from the idr, split off the cyclic pid allocation cursor from the idr. 
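As an illustration of what a namespace-owned cursor means, the following is a small userspace sketch, not the kernel implementation. It assumes a tiny boolean array in place of the idr, a toy id space of 8, and a toy reserved threshold of 3 standing in for RESERVED_PIDS. All names (toy_cursor_ns, toy_alloc_cyclic, toy_alloc_pid) are hypothetical. Unlike the open-coded update in this patch, the sketch only advances the cursor when an allocation succeeds.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_PID_MAX	8	/* tiny id space, demo only */
#define TOY_RESERVED	3	/* hypothetical stand-in for RESERVED_PIDS */

/* Hypothetical, heavily reduced model of struct pid_namespace. */
struct toy_cursor_ns {
	bool used[TOY_PID_MAX];	/* stands in for the id store (idr) */
	unsigned int pid_next;	/* cyclic cursor, owned by the namespace */
};

/*
 * Search [min, max) for a free id, starting at the cursor and wrapping
 * around to min. Returns the id, or -1 if the range is full.
 */
static int toy_alloc_cyclic(struct toy_cursor_ns *ns, int min, int max)
{
	int span = max - min;
	int start = (int)ns->pid_next;

	assert(min >= 0 && min < max && max <= TOY_PID_MAX);

	if (start < min || start >= max)
		start = min;

	for (int n = 0; n < span; n++) {
		int id = min + (start - min + n) % span;

		if (!ns->used[id]) {
			ns->used[id] = true;
			/* Cursor update lives next to the namespace, not the id store. */
			ns->pid_next = (unsigned int)id + 1;
			return id;
		}
	}
	return -1;
}

/* Mirrors the pid_min selection in alloc_pid(): wrap back to the reserved floor. */
static int toy_alloc_pid(struct toy_cursor_ns *ns)
{
	int pid_min = 1;

	if (ns->pid_next > TOY_RESERVED)
		pid_min = TOY_RESERVED;

	return toy_alloc_cyclic(ns, pid_min, TOY_PID_MAX);
}
```

Keeping the cursor as a plain field is what lets readers such as the loadavg code sample it without going through the id store's API.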
Lift the cursor value into the struct pid_namespace. Note that this
involves temporarily open-coding the cursor increment on allocation, but
this is cleaned up in the subsequent patch.

Signed-off-by: Matthew Wilcox
Signed-off-by: Brian Foster
Reviewed-by: Ian Kent
---
 arch/powerpc/platforms/cell/spufs/sched.c | 2 +-
 fs/proc/loadavg.c                         | 2 +-
 include/linux/pid_namespace.h             | 1 +
 kernel/pid.c                              | 6 ++++--
 kernel/pid_namespace.c                    | 4 ++--
 5 files changed, 9 insertions(+), 6 deletions(-)

diff --git a/arch/powerpc/platforms/cell/spufs/sched.c b/arch/powerpc/platforms/cell/spufs/sched.c
index 99bd027a7f7c..a2ed928d7658 100644
--- a/arch/powerpc/platforms/cell/spufs/sched.c
+++ b/arch/powerpc/platforms/cell/spufs/sched.c
@@ -1072,7 +1072,7 @@ static int show_spu_loadavg(struct seq_file *s, void *private)
 		LOAD_INT(c), LOAD_FRAC(c),
 		count_active_contexts(),
 		atomic_read(&nr_spu_contexts),
-		idr_get_cursor(&task_active_pid_ns(current)->idr) - 1);
+		READ_ONCE(task_active_pid_ns(current)->pid_next) - 1);
 	return 0;
 }
 #endif
diff --git a/fs/proc/loadavg.c b/fs/proc/loadavg.c
index 817981e57223..2740b31b6461 100644
--- a/fs/proc/loadavg.c
+++ b/fs/proc/loadavg.c
@@ -22,7 +22,7 @@ static int loadavg_proc_show(struct seq_file *m, void *v)
 		LOAD_INT(avnrun[1]), LOAD_FRAC(avnrun[1]),
 		LOAD_INT(avnrun[2]), LOAD_FRAC(avnrun[2]),
 		nr_running(), nr_threads,
-		idr_get_cursor(&task_active_pid_ns(current)->idr) - 1);
+		READ_ONCE(task_active_pid_ns(current)->pid_next) - 1);
 	return 0;
 }
 
diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
index 07481bb87d4e..82c72482019d 100644
--- a/include/linux/pid_namespace.h
+++ b/include/linux/pid_namespace.h
@@ -18,6 +18,7 @@ struct fs_pin;
 
 struct pid_namespace {
 	struct idr idr;
+	unsigned int pid_next;
 	struct rcu_head rcu;
 	unsigned int pid_allocated;
 	struct task_struct *child_reaper;
diff --git a/kernel/pid.c b/kernel/pid.c
index 3622f8b13143..2e2d33273c8e 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -75,6 +75,7 @@ int pid_max_max = PID_MAX_LIMIT;
 struct pid_namespace init_pid_ns = {
 	.ns.count = REFCOUNT_INIT(2),
 	.idr = IDR_INIT(init_pid_ns.idr),
+	.pid_next = 0,
 	.pid_allocated = PIDNS_ADDING,
 	.level = 0,
 	.child_reaper = &init_task,
@@ -208,7 +209,7 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid,
 			 * init really needs pid 1, but after reaching the
 			 * maximum wrap back to RESERVED_PIDS
 			 */
-			if (idr_get_cursor(&tmp->idr) > RESERVED_PIDS)
+			if (tmp->pid_next > RESERVED_PIDS)
 				pid_min = RESERVED_PIDS;
 
 			/*
@@ -217,6 +218,7 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid,
 			 */
 			nr = idr_alloc_cyclic(&tmp->idr, NULL, pid_min,
 					      pid_max, GFP_ATOMIC);
+			tmp->pid_next = nr + 1;
 		}
 		xa_unlock_irq(&tmp->idr.idr_rt);
 		idr_preload_end();
@@ -278,7 +280,7 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid,
 
 		/* On failure to allocate the first pid, reset the state */
 		if (tmp == ns && tmp->pid_allocated == PIDNS_ADDING)
-			idr_set_cursor(&ns->idr, 0);
+			ns->pid_next = 0;
 
 		idr_remove(&tmp->idr, upid->nr);
 		xa_unlock_irq(&tmp->idr.idr_rt);
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index f4f8cb0435b4..a53d20c5c85e 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -272,12 +272,12 @@ static int pid_ns_ctl_handler(struct ctl_table *table, int write,
 	 * it should synchronize its usage with external means.
 	 */
 
-	next = idr_get_cursor(&pid_ns->idr) - 1;
+	next = READ_ONCE(pid_ns->pid_next) - 1;
 
 	tmp.data = &next;
 	ret = proc_dointvec_minmax(&tmp, write, buffer, lenp, ppos);
 	if (!ret && write)
-		idr_set_cursor(&pid_ns->idr, next + 1);
+		WRITE_ONCE(pid_ns->pid_next, next + 1);
 
 	return ret;
 }

From patchwork Fri Dec 2 17:16:18 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Brian Foster
X-Patchwork-Id: 13063000
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 88BBBC4321E for ; Fri, 2 Dec 2022 17:16:23 +0000 (UTC)
Received: by kanga.kvack.org (Postfix) id CF1F36B0075; Fri, 2 Dec 2022 12:16:21 -0500 (EST)
Received: by kanga.kvack.org (Postfix, from userid 40) id C51A48E0001; Fri, 2 Dec 2022 12:16:21 -0500 (EST)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id A7CE46B007B; Fri, 2 Dec 2022 12:16:21 -0500 (EST)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id 87B8C6B0075 for ; Fri, 2 Dec 2022 12:16:21 -0500 (EST)
Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay02.hostedemail.com (Postfix) with ESMTP id 37B88120B7D for ; Fri, 2 Dec 2022 17:16:21 +0000 (UTC)
X-FDA: 80198019762.30.B716AE5
Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf07.hostedemail.com (Postfix) with ESMTP id CD4524000C for ; Fri, 2 Dec 2022 17:16:20 +0000 (UTC)
Authentication-Results: imf07.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=NMlIalVV; spf=pass (imf07.hostedemail.com: domain of bfoster@redhat.com designates 170.10.133.124 as permitted sender)
smtp.mailfrom=bfoster@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1670001380; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=SLnLNVhg+V2QM+5uaqJbdcO5eHog3WerK+Tm3KrAKi4=; b=1iJgRkFsa7DwUgYyz1PEY430+OOp6uiHAd2UeDFIFiOGP66mT44w/LRm6yBQr5k+oJhqmE uS6BGWwJMYwABCyWUMI7Ub8CcW9jPwENCsCCUHd8TDH4j5zBmkJxgfYeEOUTzBx/393YMH UZuWE6emFSrfUtE8Wn9tGxPTINDR818= ARC-Authentication-Results: i=1; imf07.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=NMlIalVV; spf=pass (imf07.hostedemail.com: domain of bfoster@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=bfoster@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1670001380; a=rsa-sha256; cv=none; b=k13ny2JPHoHSLYlqLjoqHxFMny6j0YnYeFeFZX/SQ1D79VbseXCZqvM1+tCl8w5b6GpSpu JDH6QL898p5UTsIYbGHIVkqBd8TwM7g/rIvrOdTq44Fty53Fuw3hAL6cWvJ7lIlZijWicU XwGHhQxWBxTyj7/nMHn4S8BBqOJuefs= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1670001380; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=SLnLNVhg+V2QM+5uaqJbdcO5eHog3WerK+Tm3KrAKi4=; b=NMlIalVVazjio9KbJIuygdSgL3+ZUu9dqWnxIxjxxTDR1cMn8u9gaGYsLXvFEuO8dY7C/k yuAcrQrzwrGP4tYYv6zXDbKOkObDRZaoU0172xKHCmXYH9NF0XnF3wX3WO/8pMFQFje93O Z+tHdvUiOmFBcxdHXzlKIgVKGX9gB/Y= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, 
cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-280-uXxH9e1cPB2HTGHGtZO37g-1; Fri, 02 Dec 2022 12:16:17 -0500 X-MC-Unique: uXxH9e1cPB2HTGHGtZO37g-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id F09FC882828; Fri, 2 Dec 2022 17:16:15 +0000 (UTC) Received: from bfoster.redhat.com (unknown [10.22.8.52]) by smtp.corp.redhat.com (Postfix) with ESMTP id AA86140C94AA; Fri, 2 Dec 2022 17:16:15 +0000 (UTC) From: Brian Foster To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: ikent@redhat.com, onestero@redhat.com, willy@infradead.org, ebiederm@redhat.com Subject: [PATCH v3 3/5] pid: switch pid_namespace from idr to xarray Date: Fri, 2 Dec 2022 12:16:18 -0500 Message-Id: <20221202171620.509140-4-bfoster@redhat.com> In-Reply-To: <20221202171620.509140-1-bfoster@redhat.com> References: <20221202171620.509140-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Stat-Signature: 5ws8rstj5qfo1b58ikba75ouw9utscm9 X-Rspam-User: X-Spamd-Result: default: False [-2.40 / 9.00]; BAYES_HAM(-6.00)[100.00%]; R_MISSING_CHARSET(2.50)[]; MID_CONTAINS_FROM(1.00)[]; SUBJECT_HAS_UNDERSCORES(1.00)[]; DMARC_POLICY_ALLOW(-0.50)[redhat.com,none]; R_SPF_ALLOW(-0.20)[+ip4:170.10.133.0/24]; R_DKIM_ALLOW(-0.20)[redhat.com:s=mimecast20190719]; RCVD_NO_TLS_LAST(0.10)[]; MIME_GOOD(-0.10)[text/plain]; RCPT_COUNT_SEVEN(0.00)[7]; FROM_EQ_ENVFROM(0.00)[]; DKIM_TRACE(0.00)[redhat.com:+]; RCVD_COUNT_THREE(0.00)[4]; MIME_TRACE(0.00)[0:+]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_NONE(0.00)[]; ARC_SIGNED(0.00)[hostedemail.com:s=arc-20220608:i=1]; ARC_NA(0.00)[] X-Rspamd-Queue-Id: CD4524000C X-Rspamd-Server: rspam06 X-HE-Tag: 1670001380-658981 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: 
owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

Switch struct pid[_namespace] management over to use the xarray api
directly instead of the idr. The underlying data structure used by both
interfaces is the same. The difference is that the idr api relies on the
old, idr-custom radix-tree implementation for things like efficient
tracking/allocation of free ids. The xarray already supports this, so
most of this is a direct switchover from the old api to the new.

Signed-off-by: Matthew Wilcox
Signed-off-by: Brian Foster
Reviewed-by: Ian Kent
---
 include/linux/pid_namespace.h |  8 ++--
 include/linux/threads.h       |  2 +-
 init/main.c                   |  3 +-
 kernel/pid.c                  | 78 ++++++++++++++++-------------------
 kernel/pid_namespace.c        | 19 ++++-----
 5 files changed, 51 insertions(+), 59 deletions(-)

diff --git a/include/linux/pid_namespace.h b/include/linux/pid_namespace.h
index 82c72482019d..e4f5979b482b 100644
--- a/include/linux/pid_namespace.h
+++ b/include/linux/pid_namespace.h
@@ -9,7 +9,7 @@
 #include
 #include
 #include
-#include
+#include
 
 /* MAX_PID_NS_LEVEL is needed for limiting size of 'struct pid' */
 #define MAX_PID_NS_LEVEL 32
@@ -17,7 +17,7 @@
 struct fs_pin;
 
 struct pid_namespace {
-	struct idr idr;
+	struct xarray xa;
 	unsigned int pid_next;
 	struct rcu_head rcu;
 	unsigned int pid_allocated;
@@ -38,6 +38,8 @@ extern struct pid_namespace init_pid_ns;
 
 #define PIDNS_ADDING (1U << 31)
 
+#define PID_XA_FLAGS (XA_FLAGS_TRACK_FREE | XA_FLAGS_LOCK_IRQ)
+
 #ifdef CONFIG_PID_NS
 static inline struct pid_namespace *get_pid_ns(struct pid_namespace *ns)
 {
@@ -85,7 +87,7 @@ static inline int reboot_pid_ns(struct pid_namespace *pid_ns, int cmd)
 
 extern struct pid_namespace *task_active_pid_ns(struct task_struct *tsk);
 void pidhash_init(void);
-void pid_idr_init(void);
+void pid_init(void);
 
 static inline bool task_is_in_init_pid_ns(struct task_struct *tsk)
 {
diff --git a/include/linux/threads.h b/include/linux/threads.h
index c34173e6c5f1..37e4391ee89f 100644
--- a/include/linux/threads.h
+++ b/include/linux/threads.h
@@ -38,7 +38,7 @@
  * Define a minimum number of pids per cpu. Heuristically based
  * on original pid max of 32k for 32 cpus. Also, increase the
  * minimum settable value for pid_max on the running system based
- * on similar defaults. See kernel/pid.c:pid_idr_init() for details.
+ * on similar defaults. See kernel/pid.c:pid_init() for details.
  */
 #define PIDS_PER_CPU_DEFAULT	1024
 #define PIDS_PER_CPU_MIN	8
diff --git a/init/main.c b/init/main.c
index aa21add5f7c5..7dd8888036c7 100644
--- a/init/main.c
+++ b/init/main.c
@@ -74,7 +74,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -1108,7 +1107,7 @@ asmlinkage __visible void __init __no_sanitize_address start_kernel(void)
 	late_time_init();
 	sched_clock_init();
 	calibrate_delay();
-	pid_idr_init();
+	pid_init();
 	anon_vma_init();
 #ifdef CONFIG_X86
 	if (efi_enabled(EFI_RUNTIME_SERVICES))
diff --git a/kernel/pid.c b/kernel/pid.c
index 2e2d33273c8e..53db06f9882d 100644
--- a/kernel/pid.c
+++ b/kernel/pid.c
@@ -41,7 +41,7 @@
 #include
 #include
 #include
-#include
+#include
 #include
 #include
 
@@ -66,15 +66,9 @@ int pid_max = PID_MAX_DEFAULT;
 int pid_max_min = RESERVED_PIDS + 1;
 int pid_max_max = PID_MAX_LIMIT;
 
-/*
- * PID-map pages start out as NULL, they get allocated upon
- * first use and are never deallocated. This way a low pid_max
- * value does not cause lots of bitmaps to be allocated, but
- * the scheme scales to up to 4 million PIDs, runtime.
- */
 struct pid_namespace init_pid_ns = {
 	.ns.count = REFCOUNT_INIT(2),
-	.idr = IDR_INIT(init_pid_ns.idr),
+	.xa = XARRAY_INIT(init_pid_ns.xa, PID_XA_FLAGS),
 	.pid_next = 0,
 	.pid_allocated = PIDNS_ADDING,
 	.level = 0,
@@ -118,7 +112,7 @@ void free_pid(struct pid *pid)
 		struct upid *upid = pid->numbers + i;
 		struct pid_namespace *ns = upid->ns;
 
-		xa_lock_irqsave(&ns->idr.idr_rt, flags);
+		xa_lock_irqsave(&ns->xa, flags);
 		switch (--ns->pid_allocated) {
 		case 2:
 		case 1:
@@ -135,8 +129,8 @@ void free_pid(struct pid *pid)
 			break;
 		}
 
-		idr_remove(&ns->idr, upid->nr);
-		xa_unlock_irqrestore(&ns->idr.idr_rt, flags);
+		__xa_erase(&ns->xa, upid->nr);
+		xa_unlock_irqrestore(&ns->xa, flags);
 	}
 
 	call_rcu(&pid->rcu, delayed_put_pid);
@@ -147,7 +141,7 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid,
 {
 	struct pid *pid;
 	enum pid_type type;
-	int i, nr;
+	int i;
 	struct pid_namespace *tmp;
 	struct upid *upid;
 	int retval = -ENOMEM;
@@ -191,18 +185,17 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid,
 			set_tid_size--;
 		}
 
-		idr_preload(GFP_KERNEL);
-		xa_lock_irq(&tmp->idr.idr_rt);
+		xa_lock_irq(&tmp->xa);
 
 		if (tid) {
-			nr = idr_alloc(&tmp->idr, NULL, tid,
-				       tid + 1, GFP_ATOMIC);
+			retval = __xa_insert(&tmp->xa, tid, NULL, GFP_KERNEL);
+
 			/*
-			 * If ENOSPC is returned it means that the PID is
-			 * alreay in use. Return EEXIST in that case.
+			 * If EBUSY is returned it means that the PID is already
+			 * in use. Return EEXIST in that case.
 			 */
-			if (nr == -ENOSPC)
-				nr = -EEXIST;
+			if (retval == -EBUSY)
+				retval = -EEXIST;
 		} else {
 			int pid_min = 1;
 			/*
@@ -216,19 +209,18 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid,
 			 * Store a null pointer so find_pid_ns does not find
 			 * a partially initialized PID (see below).
 			 */
-			nr = idr_alloc_cyclic(&tmp->idr, NULL, pid_min,
-					      pid_max, GFP_ATOMIC);
-			tmp->pid_next = nr + 1;
+			retval = __xa_alloc_cyclic(&tmp->xa, &tid, NULL,
+						   XA_LIMIT(pid_min, pid_max),
+						   &tmp->pid_next, GFP_KERNEL);
+			if (retval == -EBUSY)
+				retval = -EAGAIN;
 		}
-		xa_unlock_irq(&tmp->idr.idr_rt);
-		idr_preload_end();
+		xa_unlock_irq(&tmp->xa);
 
-		if (nr < 0) {
-			retval = (nr == -ENOSPC) ? -EAGAIN : nr;
+		if (retval < 0)
 			goto out_free;
-		}
 
-		pid->numbers[i].nr = nr;
+		pid->numbers[i].nr = tid;
 		pid->numbers[i].ns = tmp;
 		tmp = tmp->parent;
 	}
@@ -256,17 +248,17 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid,
 	for ( ; upid >= pid->numbers; --upid) {
 		tmp = upid->ns;
 
-		xa_lock_irq(&tmp->idr.idr_rt);
+		xa_lock_irq(&tmp->xa);
 		if (tmp == ns && !(tmp->pid_allocated & PIDNS_ADDING)) {
-			xa_unlock_irq(&tmp->idr.idr_rt);
+			xa_unlock_irq(&tmp->xa);
 			put_pid_ns(ns);
 			goto out_free;
 		}
 
 		/* Make the PID visible to find_pid_ns. */
-		idr_replace(&tmp->idr, pid, upid->nr);
+		__xa_store(&tmp->xa, upid->nr, pid, 0);
 		tmp->pid_allocated++;
-		xa_unlock_irq(&tmp->idr.idr_rt);
+		xa_unlock_irq(&tmp->xa);
 	}
 
 	return pid;
@@ -276,14 +268,14 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid,
 		upid = pid->numbers + i;
 		tmp = upid->ns;
 
-		xa_lock_irq(&tmp->idr.idr_rt);
+		xa_lock_irq(&tmp->xa);
 
 		/* On failure to allocate the first pid, reset the state */
 		if (tmp == ns && tmp->pid_allocated == PIDNS_ADDING)
 			ns->pid_next = 0;
 
-		idr_remove(&tmp->idr, upid->nr);
-		xa_unlock_irq(&tmp->idr.idr_rt);
+		__xa_erase(&tmp->xa, upid->nr);
+		xa_unlock_irq(&tmp->xa);
 	}
 
 	kmem_cache_free(ns->pid_cachep, pid);
@@ -292,14 +284,14 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid,
 
 void disable_pid_allocation(struct pid_namespace *ns)
 {
-	xa_lock_irq(&ns->idr.idr_rt);
+	xa_lock_irq(&ns->xa);
 	ns->pid_allocated &= ~PIDNS_ADDING;
-	xa_unlock_irq(&ns->idr.idr_rt);
+	xa_unlock_irq(&ns->xa);
 }
 
 struct pid *find_pid_ns(int nr, struct pid_namespace *ns)
 {
-	return idr_find(&ns->idr, nr);
+	return xa_load(&ns->xa, nr);
 }
 EXPORT_SYMBOL_GPL(find_pid_ns);
 
@@ -508,7 +500,9 @@ EXPORT_SYMBOL_GPL(task_active_pid_ns);
  */
 struct pid *find_ge_pid(int nr, struct pid_namespace *ns)
 {
-	return idr_get_next(&ns->idr, &nr);
+	unsigned long index = nr;
+
+	return xa_find(&ns->xa, &index, ULONG_MAX, XA_PRESENT);
 }
 EXPORT_SYMBOL_GPL(find_ge_pid);
 
@@ -650,7 +644,7 @@ SYSCALL_DEFINE2(pidfd_open, pid_t, pid, unsigned int, flags)
  * take it we can leave the interrupts enabled. For now it is easier to be safe
  * than to prove it can't happen.
  */
-void __init pid_idr_init(void)
+void __init pid_init(void)
 {
 	/* Verify no one has done anything silly: */
 	BUILD_BUG_ON(PID_MAX_LIMIT >= PIDNS_ADDING);
@@ -662,8 +656,6 @@ void __init pid_idr_init(void)
 				PIDS_PER_CPU_MIN * num_possible_cpus());
 	pr_info("pid_max: default: %u minimum: %u\n", pid_max, pid_max_min);
 
-	idr_init(&init_pid_ns.idr);
-
 	init_pid_ns.pid_cachep = KMEM_CACHE(pid,
 			SLAB_HWCACHE_ALIGN | SLAB_PANIC | SLAB_ACCOUNT);
 }
diff --git a/kernel/pid_namespace.c b/kernel/pid_namespace.c
index a53d20c5c85e..8561e01e2d01 100644
--- a/kernel/pid_namespace.c
+++ b/kernel/pid_namespace.c
@@ -22,7 +22,7 @@
 #include
 #include
 #include
-#include
+#include
 
 static DEFINE_MUTEX(pid_caches_mutex);
 static struct kmem_cache *pid_ns_cachep;
@@ -92,15 +92,15 @@ static struct pid_namespace *create_pid_namespace(struct user_namespace *user_ns
 	if (ns == NULL)
 		goto out_dec;
 
-	idr_init(&ns->idr);
+	xa_init_flags(&ns->xa, PID_XA_FLAGS);
 
 	ns->pid_cachep = create_pid_cachep(level);
 	if (ns->pid_cachep == NULL)
-		goto out_free_idr;
+		goto out_free_xa;
 
 	err = ns_alloc_inum(&ns->ns);
 	if (err)
-		goto out_free_idr;
+		goto out_free_xa;
 	ns->ns.ops = &pidns_operations;
 
 	refcount_set(&ns->ns.count, 1);
@@ -112,8 +112,8 @@ static struct pid_namespace *create_pid_namespace(struct user_namespace *user_ns
 
 	return ns;
 
-out_free_idr:
-	idr_destroy(&ns->idr);
+out_free_xa:
+	xa_destroy(&ns->xa);
 	kmem_cache_free(pid_ns_cachep, ns);
 out_dec:
 	dec_pid_namespaces(ucounts);
@@ -135,7 +135,7 @@ static void destroy_pid_namespace(struct pid_namespace *ns)
 {
 	ns_free_inum(&ns->ns);
 
-	idr_destroy(&ns->idr);
+	xa_destroy(&ns->xa);
 	call_rcu(&ns->rcu, delayed_free_pidns);
 }
 
@@ -165,7 +165,7 @@ EXPORT_SYMBOL_GPL(put_pid_ns);
 
 void zap_pid_ns_processes(struct pid_namespace *pid_ns)
 {
-	int nr;
+	long nr;
 	int rc;
 	struct task_struct *task, *me = current;
 	int init_pids = thread_group_leader(me) ? 1 : 2;
@@ -198,8 +198,7 @@ void zap_pid_ns_processes(struct pid_namespace *pid_ns)
 	 */
 	rcu_read_lock();
 	read_lock(&tasklist_lock);
-	nr = 2;
-	idr_for_each_entry_continue(&pid_ns->idr, pid, nr) {
+	xa_for_each_range(&pid_ns->xa, nr, pid, 2, ULONG_MAX) {
 		task = pid_task(pid, PIDTYPE_PID);
 		if (task && !__fatal_signal_pending(task))
 			group_send_sig_info(SIGKILL, SEND_SIG_PRIV, task, PIDTYPE_MAX);

From patchwork Fri Dec 2 17:16:19 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Brian Foster
X-Patchwork-Id: 13062999
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4BD3EC47088 for ; Fri, 2 Dec 2022 17:16:22 +0000 (UTC)
Received: by kanga.kvack.org (Postfix) id 551396B0074; Fri, 2 Dec 2022 12:16:21 -0500 (EST)
Received: by kanga.kvack.org (Postfix, from userid 40) id 4B0866B0075; Fri, 2 Dec 2022 12:16:21 -0500 (EST)
X-Delivered-To: int-list-linux-mm@kvack.org
Received: by kanga.kvack.org (Postfix, from userid 63042) id 2DC4D6B0078; Fri, 2 Dec 2022 12:16:21 -0500 (EST)
X-Delivered-To: linux-mm@kvack.org
Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id 18A376B0074 for ; Fri, 2 Dec 2022 12:16:21 -0500 (EST)
Received: from smtpin03.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id DC329A12C0
for ; Fri, 2 Dec 2022 17:16:20 +0000 (UTC) X-FDA: 80198019720.03.B4306C6 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by imf23.hostedemail.com (Postfix) with ESMTP id 704DC140014 for ; Fri, 2 Dec 2022 17:16:20 +0000 (UTC) Authentication-Results: imf23.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=gGhrHqfT; spf=pass (imf23.hostedemail.com: domain of bfoster@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=bfoster@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1670001380; a=rsa-sha256; cv=none; b=ihfGaR69YCuddV0TP4rRWnxlrStIzKmAy88EbO4ZEysyJGvc+SK+crxGjLxfh70jYo9Bg4 CxhPPdB10aCREgKkUDpmK2jeT3GpMP2Lktdq9yryc6xMZl/ks2gEI3kQyWHzyxtrtfbSjV aCHRLW0xfqBgAjakWuw9Yvymt3dqQwc= ARC-Authentication-Results: i=1; imf23.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=gGhrHqfT; spf=pass (imf23.hostedemail.com: domain of bfoster@redhat.com designates 170.10.129.124 as permitted sender) smtp.mailfrom=bfoster@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1670001380; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=Jz5L8aDW4bvVNOnmSsD8fmQ8XtEZs9lbM2YmlOnfn7I=; b=jXmOBExjvQrI23k9q9plecXD/qqfBGftdCT2rmpkqdgis4aUSWJ4zkc7ISwUuEWJnX2TWn Z5s9FhV3a6XL7DtvADv7ZWeVnhGhVIolECR1grBHb2MIYWTCPDHOW9ynC0WRopCR0GXUgb UQQGyBxslmLNsj4lMja0mnVO9eDt4cY= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1670001379; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: 
to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Jz5L8aDW4bvVNOnmSsD8fmQ8XtEZs9lbM2YmlOnfn7I=; b=gGhrHqfTYBLE/wym0TeXLrnMnpE/hj+L5L2+6mupDBAMPyVP4On5hUKA81htxtXZSlYmPM gaO7bFVF9oG/e0Va1UqUf7bLyUyUWnxP1Lh4PWAI4JbVQoSKpYyaKzJSDeEI49OJEMqGo0 USL0GNtmKXQIuSvPVHlP3GfrsPRpvNU= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-417-stNBeUb6OA2jAZQjw_-VnA-1; Fri, 02 Dec 2022 12:16:16 -0500 X-MC-Unique: stNBeUb6OA2jAZQjw_-VnA-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 4BD33101A5AD; Fri, 2 Dec 2022 17:16:16 +0000 (UTC) Received: from bfoster.redhat.com (unknown [10.22.8.52]) by smtp.corp.redhat.com (Postfix) with ESMTP id 090C840C94AA; Fri, 2 Dec 2022 17:16:16 +0000 (UTC) From: Brian Foster To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: ikent@redhat.com, onestero@redhat.com, willy@infradead.org, ebiederm@redhat.com Subject: [PATCH v3 4/5] pid: mark pids associated with group leader tasks Date: Fri, 2 Dec 2022 12:16:19 -0500 Message-Id: <20221202171620.509140-5-bfoster@redhat.com> In-Reply-To: <20221202171620.509140-1-bfoster@redhat.com> References: <20221202171620.509140-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Rspam-User: X-Spamd-Result: default: False [-3.40 / 9.00]; BAYES_HAM(-6.00)[100.00%]; R_MISSING_CHARSET(2.50)[]; MID_CONTAINS_FROM(1.00)[]; DMARC_POLICY_ALLOW(-0.50)[redhat.com,none]; R_DKIM_ALLOW(-0.20)[redhat.com:s=mimecast20190719]; R_SPF_ALLOW(-0.20)[+ip4:170.10.129.0/24]; RCVD_NO_TLS_LAST(0.10)[]; 
MIME_GOOD(-0.10)[text/plain]; RCPT_COUNT_SEVEN(0.00)[7]; FROM_EQ_ENVFROM(0.00)[]; DKIM_TRACE(0.00)[redhat.com:+]; RCVD_COUNT_THREE(0.00)[4]; MIME_TRACE(0.00)[0:+]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_NONE(0.00)[]; ARC_SIGNED(0.00)[hostedemail.com:s=arc-20220608:i=1]; ARC_NA(0.00)[] X-Rspamd-Queue-Id: 704DC140014 X-Rspamd-Server: rspam01 X-Stat-Signature: j1d6eg4iq3rjs8skqu4ia1yfnp1awo98 X-HE-Tag: 1670001380-892614 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Searching the pid_namespace for group leader tasks is a fairly inefficient operation. Listing the root directory of a procfs mount performs a linear scan of allocated pids, checking each entry for an associated PIDTYPE_TGID task to determine whether to populate a directory entry. This can cause a significant increase in readdir() syscall latency when run in namespaces that might have one or more processes with significant thread counts. To facilitate improved TGID pid searches, mark the ids of pid entries that are likely to have an associated PIDTYPE_TGID task. To keep the code simple and avoid having to maintain synchronization between mark state and post-fork pid-task association changes, the mark is applied to all pids allocated for tasks cloned without CLONE_THREAD. This means that it is possible for a pid to remain marked in the xarray after being disassociated from the group leader task. For example, a process that does a setsid() followed by fork() and exit() (to daemonize) will remain associated with the original pid for the session, but link with the child pid as the group leader. OTOH, the only place other than fork() where a tgid association occurs is in the exec() path, which kills all other tasks in the group and associates the current task with the preexisting leader pid. 
Therefore, the semantics of the mark are that false positives (marked pids without PIDTYPE_TGID tasks) are possible, but false negatives (unmarked pids with PIDTYPE_TGID tasks) should never occur. This is an effective optimization because false positives are fairly uncommon and don't add overhead (i.e. we already have to check pid_task() for marked entries), while the mark still filters out thread pids that are guaranteed not to have a TGID task association. Mark entries in the pid allocation path when the caller specifies that the pid associates with a new thread group. Since false negatives are not allowed, warn in the event that a PIDTYPE_TGID task is ever attached to an unmarked pid. Finally, create a helper to implement the task search based on the mark semantics defined above (based on search logic currently implemented by next_tgid() in procfs). Signed-off-by: Brian Foster Reviewed-by: Ian Kent --- include/linux/pid.h | 3 ++- kernel/fork.c | 2 +- kernel/pid.c | 44 +++++++++++++++++++++++++++++++++++++++++++- 3 files changed, 46 insertions(+), 3 deletions(-) diff --git a/include/linux/pid.h b/include/linux/pid.h index 343abf22092e..64caf21be256 100644 --- a/include/linux/pid.h +++ b/include/linux/pid.h @@ -132,9 +132,10 @@ extern struct pid *find_vpid(int nr); */ extern struct pid *find_get_pid(int nr); extern struct pid *find_ge_pid(int nr, struct pid_namespace *); +struct task_struct *find_get_tgid_task(int *id, struct pid_namespace *); extern struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid, - size_t set_tid_size); + size_t set_tid_size, bool group_leader); extern void free_pid(struct pid *pid); extern void disable_pid_allocation(struct pid_namespace *ns); diff --git a/kernel/fork.c b/kernel/fork.c index 08969f5aa38d..1cf2644c642e 100644 --- a/kernel/fork.c +++ b/kernel/fork.c @@ -2267,7 +2267,7 @@ static __latent_entropy struct task_struct *copy_process( if (pid != &init_struct_pid) { pid = alloc_pid(p->nsproxy->pid_ns_for_children, args->set_tid, - 
args->set_tid_size); + args->set_tid_size, !(clone_flags & CLONE_THREAD)); if (IS_ERR(pid)) { retval = PTR_ERR(pid); goto bad_fork_cleanup_thread; diff --git a/kernel/pid.c b/kernel/pid.c index 53db06f9882d..d65f74c6186c 100644 --- a/kernel/pid.c +++ b/kernel/pid.c @@ -66,6 +66,9 @@ int pid_max = PID_MAX_DEFAULT; int pid_max_min = RESERVED_PIDS + 1; int pid_max_max = PID_MAX_LIMIT; +/* MARK_0 used by XA_FREE_MARK */ +#define TGID_MARK XA_MARK_1 + struct pid_namespace init_pid_ns = { .ns.count = REFCOUNT_INIT(2), .xa = XARRAY_INIT(init_pid_ns.xa, PID_XA_FLAGS), @@ -137,7 +140,7 @@ void free_pid(struct pid *pid) } struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid, - size_t set_tid_size) + size_t set_tid_size, bool group_leader) { struct pid *pid; enum pid_type type; @@ -257,6 +260,8 @@ struct pid *alloc_pid(struct pid_namespace *ns, pid_t *set_tid, /* Make the PID visible to find_pid_ns. */ __xa_store(&tmp->xa, upid->nr, pid, 0); + if (group_leader) + __xa_set_mark(&tmp->xa, upid->nr, TGID_MARK); tmp->pid_allocated++; xa_unlock_irq(&tmp->xa); } @@ -314,6 +319,11 @@ static struct pid **task_pid_ptr(struct task_struct *task, enum pid_type type) void attach_pid(struct task_struct *task, enum pid_type type) { struct pid *pid = *task_pid_ptr(task, type); + struct pid_namespace *pid_ns = ns_of_pid(pid); + pid_t pid_nr = pid_nr_ns(pid, pid_ns); + + WARN_ON(type == PIDTYPE_TGID && + !xa_get_mark(&pid_ns->xa, pid_nr, TGID_MARK)); hlist_add_head_rcu(&task->pid_links[type], &pid->tasks[type]); } @@ -506,6 +516,38 @@ struct pid *find_ge_pid(int nr, struct pid_namespace *ns) } EXPORT_SYMBOL_GPL(find_ge_pid); +/* + * Used by proc to find the first thread group leader task with an id greater + * than or equal to *id. + * + * Use the xarray mark as a hint to find the next best pid. The mark does not + * guarantee a linked group leader task exists, so retry until a suitable entry + * is found. 
+ */ +struct task_struct *find_get_tgid_task(int *id, struct pid_namespace *ns) +{ + struct pid *pid; + struct task_struct *t; + unsigned long nr = *id; + + rcu_read_lock(); + do { + pid = xa_find(&ns->xa, &nr, ULONG_MAX, TGID_MARK); + if (!pid) { + rcu_read_unlock(); + return NULL; + } + t = pid_task(pid, PIDTYPE_TGID); + nr++; + } while (!t); + + *id = pid_nr_ns(pid, ns); + get_task_struct(t); + rcu_read_unlock(); + + return t; +} + struct pid *pidfd_get_pid(unsigned int fd, unsigned int *flags) { struct fd f; From patchwork Fri Dec 2 17:16:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 13063001 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 02955C47088 for ; Fri, 2 Dec 2022 17:16:24 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 307436B007D; Fri, 2 Dec 2022 12:16:22 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 2B5C86B007B; Fri, 2 Dec 2022 12:16:22 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F18226B007D; Fri, 2 Dec 2022 12:16:21 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id DD04B6B0078 for ; Fri, 2 Dec 2022 12:16:21 -0500 (EST) Received: from smtpin27.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 96811ABB6A for ; Fri, 2 Dec 2022 17:16:21 +0000 (UTC) X-FDA: 80198019762.27.DC59139 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.133.124]) by imf08.hostedemail.com (Postfix) with ESMTP id CD0CF160015 for ; Fri, 2 Dec 2022 17:16:20 +0000 (UTC) 
Authentication-Results: imf08.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=eqfzDeRW; spf=pass (imf08.hostedemail.com: domain of bfoster@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=bfoster@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1670001381; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=CW4axpY/WkASdhmPKJthM+tNRg8V7lNskJpRuDs78M8=; b=UrjtpXNZV6201T0TlJ6oETJxuVJ5N2aZQWmaPaWx/eUKE/A+Gkcj4V/KzXusHE+nZXegTl +tMr/AIKB6eMYOEFBXocZtIp3LavLe51RdrRWWf7GZZpVxAPGQVX/k70fXkNMyMcE1GmDn JBbqQQaWyNpE6yE7bST53cJe7pbiJL4= ARC-Authentication-Results: i=1; imf08.hostedemail.com; dkim=pass header.d=redhat.com header.s=mimecast20190719 header.b=eqfzDeRW; spf=pass (imf08.hostedemail.com: domain of bfoster@redhat.com designates 170.10.133.124 as permitted sender) smtp.mailfrom=bfoster@redhat.com; dmarc=pass (policy=none) header.from=redhat.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1670001381; a=rsa-sha256; cv=none; b=WOPR5RPEUiy4Rc6QQyRKijcaUEukRZ3buNppHBe8li1yJzLsGnl/oLlu+N7/6AYlSfEnRi +x+VY9N2DDY0ETvSXhYCOq7VIVd8Fpr6Fpyq6GpCGVIM/ZnPojr/Q51SubrXpfhkxrN0SH rflPr+FOP1YQZMrLQT7E9GuEpWzJgZg= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1670001380; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=CW4axpY/WkASdhmPKJthM+tNRg8V7lNskJpRuDs78M8=; b=eqfzDeRWM//lLTF0odW4wYdBFYeBT9yjeRbqXjY6+D+6xLMlogqVo475OvVRw650h2t8OH 
wRT+j+mKqa0JdHYliR1rna4nkA+vUPkmBUuGfT5Uj0PqCpFlx+DXzdS7BwrmguAuhhPyY5 v3XfkKhYxoheGJ949Bph7oy7RZ/XRLs= Received: from mimecast-mx02.redhat.com (mimecast-mx02.redhat.com [66.187.233.88]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-590-6ydDHLUSNQaMhMi-06NKSQ-1; Fri, 02 Dec 2022 12:16:16 -0500 X-MC-Unique: 6ydDHLUSNQaMhMi-06NKSQ-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.rdu2.redhat.com [10.11.54.2]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx02.redhat.com (Postfix) with ESMTPS id 9C351894E83; Fri, 2 Dec 2022 17:16:16 +0000 (UTC) Received: from bfoster.redhat.com (unknown [10.22.8.52]) by smtp.corp.redhat.com (Postfix) with ESMTP id 580D240C947B; Fri, 2 Dec 2022 17:16:16 +0000 (UTC) From: Brian Foster To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Cc: ikent@redhat.com, onestero@redhat.com, willy@infradead.org, ebiederm@redhat.com Subject: [PATCH v3 5/5] procfs: use efficient tgid pid search on root readdir Date: Fri, 2 Dec 2022 12:16:20 -0500 Message-Id: <20221202171620.509140-6-bfoster@redhat.com> In-Reply-To: <20221202171620.509140-1-bfoster@redhat.com> References: <20221202171620.509140-1-bfoster@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 3.1 on 10.11.54.2 X-Spamd-Result: default: False [2.59 / 9.00]; R_MISSING_CHARSET(2.50)[]; MID_CONTAINS_FROM(1.00)[]; DMARC_POLICY_ALLOW(-0.50)[redhat.com,none]; R_DKIM_ALLOW(-0.20)[redhat.com:s=mimecast20190719]; R_SPF_ALLOW(-0.20)[+ip4:170.10.133.0/24]; MIME_GOOD(-0.10)[text/plain]; RCVD_NO_TLS_LAST(0.10)[]; BAYES_HAM(-0.01)[46.87%]; RCPT_COUNT_SEVEN(0.00)[7]; FROM_EQ_ENVFROM(0.00)[]; DKIM_TRACE(0.00)[redhat.com:+]; RCVD_COUNT_THREE(0.00)[4]; MIME_TRACE(0.00)[0:+]; FROM_HAS_DN(0.00)[]; TO_MATCH_ENVRCPT_SOME(0.00)[]; TO_DN_NONE(0.00)[]; ARC_SIGNED(0.00)[hostedemail.com:s=arc-20220608:i=1]; ARC_NA(0.00)[] X-Rspam-User: 
X-Rspamd-Server: rspam04 X-Rspamd-Queue-Id: CD0CF160015 X-Stat-Signature: 4k39nhmqizbsbqgypm5fry34qj7nprrh X-HE-Tag: 1670001380-543244 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: find_ge_pid() walks every allocated id and checks every associated pid in the namespace for a link to a PIDTYPE_TGID task. If the pid namespace contains processes with large numbers of threads, this search doesn't scale and can notably increase getdents() syscall latency. For example, on a mostly idle 2.4GHz Intel Xeon running Fedora on 5.19.0-rc2, 'strace -T xfs_io -c readdir /proc' shows the following: getdents64(... /* 814 entries */, 32768) = 20624 <0.000568> With the addition of a running dummy (i.e. idle) process that creates an additional 100k threads, that latency increases to: getdents64(... /* 815 entries */, 32768) = 20656 <0.011315> While this may not be noticeable to users in one-off /proc scans or simple usage of ps or top, we have users who report problems caused by this latency increase in these sorts of scaled environments with custom tooling that makes heavier use of task monitoring. Optimize the tgid task scanning in proc_pid_readdir() by using the more efficient find_get_tgid_task() helper. This significantly improves readdir() latency when the pid namespace is populated with processes with very large thread counts. For example, the above 100k idle task test against a patched kernel now results in the following: Idle: getdents64(... /* 861 entries */, 32768) = 21048 <0.000670> "" + 100k threads: getdents64(... /* 862 entries */, 32768) = 21096 <0.000959> ... which is a much smaller latency hit after the high thread count task is started.
Signed-off-by: Brian Foster Reviewed-by: Ian Kent --- fs/proc/base.c | 17 +---------------- 1 file changed, 1 insertion(+), 16 deletions(-) diff --git a/fs/proc/base.c b/fs/proc/base.c index 9e479d7d202b..ac34b6bb7249 100644 --- a/fs/proc/base.c +++ b/fs/proc/base.c @@ -3475,24 +3475,9 @@ struct tgid_iter { }; static struct tgid_iter next_tgid(struct pid_namespace *ns, struct tgid_iter iter) { - struct pid *pid; - if (iter.task) put_task_struct(iter.task); - rcu_read_lock(); -retry: - iter.task = NULL; - pid = find_ge_pid(iter.tgid, ns); - if (pid) { - iter.tgid = pid_nr_ns(pid, ns); - iter.task = pid_task(pid, PIDTYPE_TGID); - if (!iter.task) { - iter.tgid += 1; - goto retry; - } - get_task_struct(iter.task); - } - rcu_read_unlock(); + iter.task = find_get_tgid_task(&iter.tgid, ns); return iter; }
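For readers who want to poke at the mark-as-hint semantics outside the kernel, the retry loop in find_get_tgid_task() can be modeled roughly as below. This is only a userspace sketch and not part of the series: struct slot, its fields and find_tgid_from() are hypothetical names, and a flat array stands in for the xarray, so it shows the filtering logic only. It says nothing about the efficiency gained from the xarray skipping unmarked ranges.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for one pid slot; not a kernel type. */
struct slot {
	bool allocated;   /* an entry is stored at this id */
	bool tgid_mark;   /* hint: the id was allocated for a group leader */
	bool has_leader;  /* a PIDTYPE_TGID task is actually attached */
};

/*
 * Return the first id >= start whose slot is marked and still has a
 * leader attached, or -1 if there is none.
 *
 * Unmarked slots (plain threads) are skipped without looking at
 * has_leader. Marked slots without a leader are the false positives
 * described in the commit log, so the loop moves on to the next
 * candidate instead of giving up.
 */
static int find_tgid_from(const struct slot *tbl, size_t n, int start)
{
	assert(start >= 0);

	for (size_t id = (size_t)start; id < n; id++) {
		if (!tbl[id].allocated || !tbl[id].tgid_mark)
			continue;
		if (tbl[id].has_leader)
			return (int)id;
	}
	return -1;
}
```

A caller would walk the table much as proc_pid_readdir() walks tgids: start at 0, and after each hit restart the search at the returned id plus one until -1 comes back.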