From patchwork Wed Jun 5 00:24:49 2024
X-Patchwork-Submitter: Andrii Nakryiko
X-Patchwork-Id: 13686019
From: Andrii Nakryiko
To: linux-fsdevel@vger.kernel.org, brauner@kernel.org,
 viro@zeniv.linux.org.uk, akpm@linux-foundation.org
Cc: linux-kernel@vger.kernel.org, bpf@vger.kernel.org,
 gregkh@linuxfoundation.org, linux-mm@kvack.org,
 liam.howlett@oracle.com, surenb@google.com, rppt@kernel.org,
 Andrii Nakryiko
Subject: [PATCH v3 4/9] fs/procfs: use per-VMA RCU-protected locking in
 PROCMAP_QUERY API
Date: Tue, 4 Jun 2024 17:24:49 -0700
Message-ID: <20240605002459.4091285-5-andrii@kernel.org>
In-Reply-To: <20240605002459.4091285-1-andrii@kernel.org>
References: <20240605002459.4091285-1-andrii@kernel.org>

Attempt to use the RCU-protected per-VMA lock when looking up the
requested VMA as much as possible, only falling back to mmap_lock if
taking the per-VMA lock failed. This is done so that querying of VMAs
doesn't interfere with other critical tasks, like page fault handling.

This has been suggested by mm folks, and we make use of a newly added
internal API that works like find_vma(), but tries to use the per-VMA
lock.

We have two sets of setup/query/teardown helper functions with
different implementations depending on availability of the per-VMA lock
(conditioned on CONFIG_PER_VMA_LOCK) to abstract per-VMA lock
subtleties.

When the per-VMA lock is available, lookup is done under RCU, attempting
to take the per-VMA lock. If that fails, we fall back to mmap_lock, but
then proceed to unconditionally grab the per-VMA lock again, dropping
mmap_lock immediately. In this configuration mmap_lock is never held for
long, minimizing disruptions while querying.

When the per-VMA lock is compiled out, we take mmap_lock once, query
VMAs using the find_vma() API, and then unlock mmap_lock once at the
very end as well. In this setup we avoid locking/unlocking mmap_lock on
every looked up VMA (depending on query parameters we might need to
iterate over a few of them).

Signed-off-by: Andrii Nakryiko
---
 fs/proc/task_mmu.c | 46 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)

diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c
index 614fbe5d0667..140032ffc551 100644
--- a/fs/proc/task_mmu.c
+++ b/fs/proc/task_mmu.c
@@ -388,6 +388,49 @@ static int pid_maps_open(struct inode *inode, struct file *file)
 		PROCMAP_QUERY_VMA_FLAGS				\
 )
 
+#ifdef CONFIG_PER_VMA_LOCK
+static int query_vma_setup(struct mm_struct *mm)
+{
+	/* in the presence of per-VMA lock we don't need any setup/teardown */
+	return 0;
+}
+
+static void query_vma_teardown(struct mm_struct *mm, struct vm_area_struct *vma)
+{
+	/* in the presence of per-VMA lock we need to unlock vma, if present */
+	if (vma)
+		vma_end_read(vma);
+}
+
+static struct vm_area_struct *query_vma_find_by_addr(struct mm_struct *mm, unsigned long addr)
+{
+	struct vm_area_struct *vma;
+
+	/* try to use less disruptive per-VMA lock */
+	vma = find_and_lock_vma_rcu(mm, addr);
+	if (IS_ERR(vma)) {
+		/* failed to take per-VMA lock, fallback to mmap_lock */
+		if (mmap_read_lock_killable(mm))
+			return ERR_PTR(-EINTR);
+
+		vma = find_vma(mm, addr);
+		if (vma) {
+			/*
+			 * We cannot use vma_start_read() as it may fail due to
+			 * false locked (see comment in vma_start_read()). We
+			 * can avoid that by directly locking vm_lock under
+			 * mmap_lock, which guarantees that nobody can lock the
+			 * vma for write (vma_start_write()) under us.
+			 */
+			down_read(&vma->vm_lock->lock);
+		}
+
+		mmap_read_unlock(mm);
+	}
+
+	return vma;
+}
+#else
 static int query_vma_setup(struct mm_struct *mm)
 {
 	return mmap_read_lock_killable(mm);
@@ -402,6 +445,7 @@ static struct vm_area_struct *query_vma_find_by_addr(struct mm_struct *mm, unsig
 {
 	return find_vma(mm, addr);
 }
+#endif
 
 static struct vm_area_struct *query_matching_vma(struct mm_struct *mm,
 						 unsigned long addr, u32 flags)
@@ -441,8 +485,10 @@ static struct vm_area_struct *query_matching_vma(struct mm_struct *mm,
 skip_vma:
 	/*
 	 * If the user needs closest matching VMA, keep iterating.
+	 * But before we proceed we might need to unlock current VMA.
 	 */
 	addr = vma->vm_end;
+	vma_end_read(vma); /* no-op under !CONFIG_PER_VMA_LOCK */
 	if (flags & PROCMAP_QUERY_COVERING_OR_NEXT_VMA)
 		goto next_vma;
 no_vma: