From patchwork Thu Oct 10 20:56:41 2024
X-Patchwork-Submitter: Andrii Nakryiko
X-Patchwork-Id: 13831110
From: Andrii Nakryiko
To: linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org, peterz@infradead.org
Cc: oleg@redhat.com, rostedt@goodmis.org, mhiramat@kernel.org,
    bpf@vger.kernel.org, linux-kernel@vger.kernel.org, jolsa@kernel.org,
    paulmck@kernel.org, willy@infradead.org, surenb@google.com,
    akpm@linux-foundation.org, mjguzik@gmail.com, brauner@kernel.org,
    jannh@google.com, mhocko@kernel.org, vbabka@suse.cz,
    shakeel.butt@linux.dev, hannes@cmpxchg.org, Liam.Howlett@oracle.com,
    lorenzo.stoakes@oracle.com, Andrii Nakryiko
Subject: [PATCH v3 tip/perf/core 1/4] mm: introduce mmap_lock_speculation_{start|end}
Date: Thu, 10 Oct 2024 13:56:41 -0700
Message-ID: <20241010205644.3831427-2-andrii@kernel.org>
In-Reply-To: <20241010205644.3831427-1-andrii@kernel.org>
References: <20241010205644.3831427-1-andrii@kernel.org>

From: Suren Baghdasaryan

Add helper functions to speculatively perform operations without
read-locking mmap_lock, expecting that mmap_lock will not be
write-locked and mm is not modified from under us.
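For orientation, the intended read-side pattern for these helpers (mirrored by
find_active_uprobe_speculative() in patch 4 of this series) looks roughly like
the sketch below; the speculative section and the fallback label are
placeholders for whatever locked slow path the caller uses:

	int seq;	/* patch 2 of this series widens this to long */

	/* callers typically enter an RCU read-side section first (see patch 4) */
	if (!mmap_lock_speculation_start(mm, &seq))
		goto fallback;	/* mmap_lock is currently write-locked */

	/* ... speculatively read mm/VMA state, tolerating stale or garbage values ... */

	if (!mmap_lock_speculation_end(mm, seq))
		goto fallback;	/* mmap_lock was write-locked meanwhile; discard results */

	/* speculation succeeded: the values read above were consistent */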
Suggested-by: Peter Zijlstra
Signed-off-by: Suren Baghdasaryan
Signed-off-by: Andrii Nakryiko
Link: https://lore.kernel.org/bpf/20240912210222.186542-1-surenb@google.com
Reviewed-by: Shakeel Butt
---
 include/linux/mm_types.h  |  3 ++
 include/linux/mmap_lock.h | 72 ++++++++++++++++++++++++++++++++-------
 kernel/fork.c             |  3 --
 3 files changed, 63 insertions(+), 15 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6e3bdf8e38bc..5d8cdebd42bc 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -887,6 +887,9 @@ struct mm_struct {
 		 * Roughly speaking, incrementing the sequence number is
 		 * equivalent to releasing locks on VMAs; reading the sequence
 		 * number can be part of taking a read lock on a VMA.
+		 * Incremented every time mmap_lock is write-locked/unlocked.
+		 * Initialized to 0, therefore odd values indicate mmap_lock
+		 * is write-locked and even values that it's released.
 		 *
 		 * Can be modified under write mmap_lock using RELEASE
 		 * semantics.
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index de9dc20b01ba..9d23635bc701 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -71,39 +71,84 @@ static inline void mmap_assert_write_locked(const struct mm_struct *mm)
 }
 
 #ifdef CONFIG_PER_VMA_LOCK
+static inline void init_mm_lock_seq(struct mm_struct *mm)
+{
+	mm->mm_lock_seq = 0;
+}
+
 /*
- * Drop all currently-held per-VMA locks.
- * This is called from the mmap_lock implementation directly before releasing
- * a write-locked mmap_lock (or downgrading it to read-locked).
- * This should normally NOT be called manually from other places.
- * If you want to call this manually anyway, keep in mind that this will release
- * *all* VMA write locks, including ones from further up the stack.
+ * Increment mm->mm_lock_seq when mmap_lock is write-locked (ACQUIRE semantics)
+ * or write-unlocked (RELEASE semantics).
  */
-static inline void vma_end_write_all(struct mm_struct *mm)
+static inline void inc_mm_lock_seq(struct mm_struct *mm, bool acquire)
 {
 	mmap_assert_write_locked(mm);
 	/*
 	 * Nobody can concurrently modify mm->mm_lock_seq due to exclusive
 	 * mmap_lock being held.
-	 * We need RELEASE semantics here to ensure that preceding stores into
-	 * the VMA take effect before we unlock it with this store.
-	 * Pairs with ACQUIRE semantics in vma_start_read().
 	 */
-	smp_store_release(&mm->mm_lock_seq, mm->mm_lock_seq + 1);
+
+	if (acquire) {
+		WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1);
+		/*
+		 * For ACQUIRE semantics we should ensure no following stores are
+		 * reordered to appear before the mm->mm_lock_seq modification.
+		 */
+		smp_wmb();
+	} else {
+		/*
+		 * We need RELEASE semantics here to ensure that preceding stores
+		 * into the VMA take effect before we unlock it with this store.
+		 * Pairs with ACQUIRE semantics in vma_start_read().
+		 */
+		smp_store_release(&mm->mm_lock_seq, mm->mm_lock_seq + 1);
+	}
+}
+
+static inline bool mmap_lock_speculation_start(struct mm_struct *mm, int *seq)
+{
+	/* Pairs with RELEASE semantics in inc_mm_lock_seq(). */
+	*seq = smp_load_acquire(&mm->mm_lock_seq);
+	/* Allow speculation if mmap_lock is not write-locked */
+	return (*seq & 1) == 0;
+}
+
+static inline bool mmap_lock_speculation_end(struct mm_struct *mm, int seq)
+{
+	/* Pairs with ACQUIRE semantics in inc_mm_lock_seq(). */
+	smp_rmb();
+	return seq == READ_ONCE(mm->mm_lock_seq);
 }
+
 #else
-static inline void vma_end_write_all(struct mm_struct *mm) {}
+static inline void init_mm_lock_seq(struct mm_struct *mm) {}
+static inline void inc_mm_lock_seq(struct mm_struct *mm, bool acquire) {}
+static inline bool mmap_lock_speculation_start(struct mm_struct *mm, int *seq) { return false; }
+static inline bool mmap_lock_speculation_end(struct mm_struct *mm, int seq) { return false; }
 #endif
 
+/*
+ * Drop all currently-held per-VMA locks.
+ * This is called from the mmap_lock implementation directly before releasing
+ * a write-locked mmap_lock (or downgrading it to read-locked).
+ * This should NOT be called manually from other places.
+ */
+static inline void vma_end_write_all(struct mm_struct *mm)
+{
+	inc_mm_lock_seq(mm, false);
+}
+
 static inline void mmap_init_lock(struct mm_struct *mm)
 {
 	init_rwsem(&mm->mmap_lock);
+	init_mm_lock_seq(mm);
 }
 
 static inline void mmap_write_lock(struct mm_struct *mm)
 {
 	__mmap_lock_trace_start_locking(mm, true);
 	down_write(&mm->mmap_lock);
+	inc_mm_lock_seq(mm, true);
 	__mmap_lock_trace_acquire_returned(mm, true, true);
 }
 
@@ -111,6 +156,7 @@ static inline void mmap_write_lock_nested(struct mm_struct *mm, int subclass)
 {
 	__mmap_lock_trace_start_locking(mm, true);
 	down_write_nested(&mm->mmap_lock, subclass);
+	inc_mm_lock_seq(mm, true);
 	__mmap_lock_trace_acquire_returned(mm, true, true);
 }
 
@@ -120,6 +166,8 @@ static inline int mmap_write_lock_killable(struct mm_struct *mm)
 
 	__mmap_lock_trace_start_locking(mm, true);
 	ret = down_write_killable(&mm->mmap_lock);
+	if (!ret)
+		inc_mm_lock_seq(mm, true);
 	__mmap_lock_trace_acquire_returned(mm, true, ret == 0);
 	return ret;
 }
diff --git a/kernel/fork.c b/kernel/fork.c
index 89ceb4a68af2..dd1bded0294d 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -1261,9 +1261,6 @@ static struct mm_struct *mm_init(struct mm_struct *mm, struct task_struct *p,
 	seqcount_init(&mm->write_protect_seq);
 	mmap_init_lock(mm);
 	INIT_LIST_HEAD(&mm->mmlist);
-#ifdef CONFIG_PER_VMA_LOCK
-	mm->mm_lock_seq = 0;
-#endif
 	mm_pgtables_bytes_init(mm);
 	mm->map_count = 0;
 	mm->locked_vm = 0;

From patchwork Thu Oct 10 20:56:42 2024
X-Patchwork-Submitter: Andrii Nakryiko
X-Patchwork-Id: 13831111
From: Andrii Nakryiko
To: linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org, peterz@infradead.org
Cc: oleg@redhat.com, rostedt@goodmis.org, mhiramat@kernel.org,
    bpf@vger.kernel.org, linux-kernel@vger.kernel.org, jolsa@kernel.org,
    paulmck@kernel.org, willy@infradead.org, surenb@google.com,
    akpm@linux-foundation.org, mjguzik@gmail.com, brauner@kernel.org,
    jannh@google.com, mhocko@kernel.org, vbabka@suse.cz,
    shakeel.butt@linux.dev, hannes@cmpxchg.org, Liam.Howlett@oracle.com,
    lorenzo.stoakes@oracle.com, Andrii Nakryiko
Subject: [PATCH v3 tip/perf/core 2/4] mm: switch to 64-bit mm_lock_seq/vm_lock_seq on 64-bit architectures
Date: Thu, 10 Oct 2024 13:56:42 -0700
Message-ID: <20241010205644.3831427-3-andrii@kernel.org>
In-Reply-To: <20241010205644.3831427-1-andrii@kernel.org>
References: <20241010205644.3831427-1-andrii@kernel.org>
To increase mm->mm_lock_seq robustness, switch it from int to long, so
that it's a 64-bit counter on 64-bit systems and we can stop worrying
about it wrapping around in just ~4 billion iterations. Same goes for
VMA's matching vm_lock_seq, which is derived from mm_lock_seq.

I didn't use __u64 outright to keep 32-bit architectures unaffected, but
if it seems important enough, I have nothing against using __u64.

Suggested-by: Jann Horn
Signed-off-by: Andrii Nakryiko
Reviewed-by: Shakeel Butt
Reviewed-by: Suren Baghdasaryan
---
 include/linux/mm.h        | 6 +++---
 include/linux/mm_types.h  | 4 ++--
 include/linux/mmap_lock.h | 8 ++++----
 3 files changed, 9 insertions(+), 9 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index ecf63d2b0582..97819437832e 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -730,7 +730,7 @@ static inline void vma_end_read(struct vm_area_struct *vma)
 }
 
 /* WARNING! Can only be used if mmap_lock is expected to be write-locked */
-static bool __is_vma_write_locked(struct vm_area_struct *vma, int *mm_lock_seq)
+static bool __is_vma_write_locked(struct vm_area_struct *vma, long *mm_lock_seq)
 {
 	mmap_assert_write_locked(vma->vm_mm);
 
@@ -749,7 +749,7 @@ static bool __is_vma_write_locked(struct vm_area_struct *vma, int *mm_lock_seq)
  */
 static inline void vma_start_write(struct vm_area_struct *vma)
 {
-	int mm_lock_seq;
+	long mm_lock_seq;
 
 	if (__is_vma_write_locked(vma, &mm_lock_seq))
 		return;
@@ -767,7 +767,7 @@ static inline void vma_start_write(struct vm_area_struct *vma)
 
 static inline void vma_assert_write_locked(struct vm_area_struct *vma)
 {
-	int mm_lock_seq;
+	long mm_lock_seq;
 
 	VM_BUG_ON_VMA(!__is_vma_write_locked(vma, &mm_lock_seq), vma);
 }
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 5d8cdebd42bc..0dc57d6cfe38 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -715,7 +715,7 @@ struct vm_area_struct {
 	 * counter reuse can only lead to occasional unnecessary use of the
 	 * slowpath.
	 */
-	int vm_lock_seq;
+	long vm_lock_seq;
 	/* Unstable RCU readers are allowed to read this. */
 	struct vma_lock *vm_lock;
 #endif
@@ -898,7 +898,7 @@ struct mm_struct {
 		 * Can be read with ACQUIRE semantics if not holding write
 		 * mmap_lock.
 		 */
-		int mm_lock_seq;
+		long mm_lock_seq;
 #endif
 
diff --git a/include/linux/mmap_lock.h b/include/linux/mmap_lock.h
index 9d23635bc701..f8fd6d879aa9 100644
--- a/include/linux/mmap_lock.h
+++ b/include/linux/mmap_lock.h
@@ -105,7 +105,7 @@ static inline void inc_mm_lock_seq(struct mm_struct *mm, bool acquire)
 	}
 }
 
-static inline bool mmap_lock_speculation_start(struct mm_struct *mm, int *seq)
+static inline bool mmap_lock_speculation_start(struct mm_struct *mm, long *seq)
 {
 	/* Pairs with RELEASE semantics in inc_mm_lock_seq(). */
 	*seq = smp_load_acquire(&mm->mm_lock_seq);
@@ -113,7 +113,7 @@ static inline bool mmap_lock_speculation_start(struct mm_struct *mm, int *seq)
 	return (*seq & 1) == 0;
 }
 
-static inline bool mmap_lock_speculation_end(struct mm_struct *mm, int seq)
+static inline bool mmap_lock_speculation_end(struct mm_struct *mm, long seq)
 {
 	/* Pairs with ACQUIRE semantics in inc_mm_lock_seq(). */
 	smp_rmb();
@@ -123,8 +123,8 @@ static inline bool mmap_lock_speculation_end(struct mm_struct *mm, int seq)
 #else
 static inline void init_mm_lock_seq(struct mm_struct *mm) {}
 static inline void inc_mm_lock_seq(struct mm_struct *mm, bool acquire) {}
-static inline bool mmap_lock_speculation_start(struct mm_struct *mm, int *seq) { return false; }
-static inline bool mmap_lock_speculation_end(struct mm_struct *mm, int seq) { return false; }
+static inline bool mmap_lock_speculation_start(struct mm_struct *mm, long *seq) { return false; }
+static inline bool mmap_lock_speculation_end(struct mm_struct *mm, long seq) { return false; }
 #endif
 
 /*

From patchwork Thu Oct 10 20:56:43 2024
X-Patchwork-Submitter: Andrii Nakryiko
X-Patchwork-Id: 13831112
From: Andrii Nakryiko
To: linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org, peterz@infradead.org
Cc: oleg@redhat.com, rostedt@goodmis.org, mhiramat@kernel.org,
    bpf@vger.kernel.org, linux-kernel@vger.kernel.org, jolsa@kernel.org,
    paulmck@kernel.org, willy@infradead.org, surenb@google.com,
    akpm@linux-foundation.org, mjguzik@gmail.com, brauner@kernel.org,
    jannh@google.com, mhocko@kernel.org, vbabka@suse.cz,
    shakeel.butt@linux.dev, hannes@cmpxchg.org, Liam.Howlett@oracle.com,
    lorenzo.stoakes@oracle.com, Andrii Nakryiko
Subject: [PATCH v3 tip/perf/core 3/4] uprobes: simplify find_active_uprobe_rcu() VMA checks
Date: Thu, 10 Oct 2024 13:56:43 -0700
Message-ID: <20241010205644.3831427-4-andrii@kernel.org>
In-Reply-To: <20241010205644.3831427-1-andrii@kernel.org>
References: <20241010205644.3831427-1-andrii@kernel.org>
At the point where find_active_uprobe_rcu() is used we know that the
VMA in question has triggered a software breakpoint, so we don't need
to validate vma->vm_flags. Keep only the vma->vm_file NULL check.

Acked-by: Oleg Nesterov
Suggested-by: Oleg Nesterov
Signed-off-by: Andrii Nakryiko
---
 kernel/events/uprobes.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index 2a0059464383..fa1024aad6c4 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -2057,7 +2057,7 @@ static struct uprobe *find_active_uprobe_rcu(unsigned long bp_vaddr, int *is_swb
 	mmap_read_lock(mm);
 	vma = vma_lookup(mm, bp_vaddr);
 	if (vma) {
-		if (valid_vma(vma, false)) {
+		if (vma->vm_file) {
 			struct inode *inode = file_inode(vma->vm_file);
 			loff_t offset = vaddr_to_offset(vma, bp_vaddr);

From patchwork Thu Oct 10 20:56:44 2024
X-Patchwork-Submitter: Andrii Nakryiko
X-Patchwork-Id: 13831113
From: Andrii Nakryiko
To: linux-trace-kernel@vger.kernel.org, linux-mm@kvack.org, peterz@infradead.org
Cc: oleg@redhat.com, rostedt@goodmis.org, mhiramat@kernel.org,
    bpf@vger.kernel.org, linux-kernel@vger.kernel.org, jolsa@kernel.org,
    paulmck@kernel.org, willy@infradead.org, surenb@google.com,
    akpm@linux-foundation.org, mjguzik@gmail.com, brauner@kernel.org,
    jannh@google.com, mhocko@kernel.org, vbabka@suse.cz,
    shakeel.butt@linux.dev, hannes@cmpxchg.org, Liam.Howlett@oracle.com,
    lorenzo.stoakes@oracle.com, Andrii Nakryiko
Subject: [PATCH v3 tip/perf/core 4/4] uprobes: add speculative lockless VMA-to-inode-to-uprobe resolution
Date: Thu, 10 Oct 2024 13:56:44 -0700
Message-ID: <20241010205644.3831427-5-andrii@kernel.org>
In-Reply-To: <20241010205644.3831427-1-andrii@kernel.org>
References: <20241010205644.3831427-1-andrii@kernel.org>
Given filp_cachep is marked SLAB_TYPESAFE_BY_RCU (and FMODE_BACKING
files, a special case, now go through RCU-delayed freeing), we can
safely access the vma->vm_file->f_inode field locklessly under just
rcu_read_lock() protection, which enables looking up a uprobe in
uprobes_tree completely locklessly and speculatively, without the need
to acquire mmap_lock for reads. This holds in most cases, assuming
there are no parallel mm and/or VMA modifications. The underlying
struct file's memory won't go away from under us (even if the struct
file itself can be reused in the meantime).

We rely on the newly added mmap_lock_speculation_{start,end}() helpers
to validate that the mm_struct stays intact for the entire duration of
this speculation. If it does not, we fall back to the
mmap_lock-protected lookup. The speculative logic is written in such a
way that it will safely handle any garbage values that might be read
from the vma or file structs.

Benchmarking results speak for themselves.
BEFORE (latest tip/perf/core)
=============================
uprobe-nop      ( 1 cpus):    3.384 ± 0.004M/s  (  3.384M/s/cpu)
uprobe-nop      ( 2 cpus):    5.456 ± 0.005M/s  (  2.728M/s/cpu)
uprobe-nop      ( 3 cpus):    7.863 ± 0.015M/s  (  2.621M/s/cpu)
uprobe-nop      ( 4 cpus):    9.442 ± 0.008M/s  (  2.360M/s/cpu)
uprobe-nop      ( 5 cpus):   11.036 ± 0.013M/s  (  2.207M/s/cpu)
uprobe-nop      ( 6 cpus):   10.884 ± 0.019M/s  (  1.814M/s/cpu)
uprobe-nop      ( 7 cpus):    7.897 ± 0.145M/s  (  1.128M/s/cpu)
uprobe-nop      ( 8 cpus):   10.021 ± 0.128M/s  (  1.253M/s/cpu)
uprobe-nop      (10 cpus):    9.932 ± 0.170M/s  (  0.993M/s/cpu)
uprobe-nop      (12 cpus):    8.369 ± 0.056M/s  (  0.697M/s/cpu)
uprobe-nop      (14 cpus):    8.678 ± 0.017M/s  (  0.620M/s/cpu)
uprobe-nop      (16 cpus):    7.392 ± 0.003M/s  (  0.462M/s/cpu)
uprobe-nop      (24 cpus):    5.326 ± 0.178M/s  (  0.222M/s/cpu)
uprobe-nop      (32 cpus):    5.426 ± 0.059M/s  (  0.170M/s/cpu)
uprobe-nop      (40 cpus):    5.262 ± 0.070M/s  (  0.132M/s/cpu)
uprobe-nop      (48 cpus):    6.121 ± 0.010M/s  (  0.128M/s/cpu)
uprobe-nop      (56 cpus):    6.252 ± 0.035M/s  (  0.112M/s/cpu)
uprobe-nop      (64 cpus):    7.644 ± 0.023M/s  (  0.119M/s/cpu)
uprobe-nop      (72 cpus):    7.781 ± 0.001M/s  (  0.108M/s/cpu)
uprobe-nop      (80 cpus):    8.992 ± 0.048M/s  (  0.112M/s/cpu)

AFTER
=====
uprobe-nop      ( 1 cpus):    3.534 ± 0.033M/s  (  3.534M/s/cpu)
uprobe-nop      ( 2 cpus):    6.701 ± 0.007M/s  (  3.351M/s/cpu)
uprobe-nop      ( 3 cpus):   10.031 ± 0.007M/s  (  3.344M/s/cpu)
uprobe-nop      ( 4 cpus):   13.003 ± 0.012M/s  (  3.251M/s/cpu)
uprobe-nop      ( 5 cpus):   16.274 ± 0.006M/s  (  3.255M/s/cpu)
uprobe-nop      ( 6 cpus):   19.563 ± 0.024M/s  (  3.261M/s/cpu)
uprobe-nop      ( 7 cpus):   22.696 ± 0.054M/s  (  3.242M/s/cpu)
uprobe-nop      ( 8 cpus):   24.534 ± 0.010M/s  (  3.067M/s/cpu)
uprobe-nop      (10 cpus):   30.475 ± 0.117M/s  (  3.047M/s/cpu)
uprobe-nop      (12 cpus):   33.371 ± 0.017M/s  (  2.781M/s/cpu)
uprobe-nop      (14 cpus):   38.864 ± 0.004M/s  (  2.776M/s/cpu)
uprobe-nop      (16 cpus):   41.476 ± 0.020M/s  (  2.592M/s/cpu)
uprobe-nop      (24 cpus):   64.696 ± 0.021M/s  (  2.696M/s/cpu)
uprobe-nop      (32 cpus):   85.054 ± 0.027M/s  (  2.658M/s/cpu)
uprobe-nop      (40 cpus):  101.979 ± 0.032M/s  (  2.549M/s/cpu)
uprobe-nop      (48 cpus):  110.518 ± 0.056M/s  (  2.302M/s/cpu)
uprobe-nop      (56 cpus):  117.737 ± 0.020M/s  (  2.102M/s/cpu)
uprobe-nop      (64 cpus):  124.613 ± 0.079M/s  (  1.947M/s/cpu)
uprobe-nop      (72 cpus):  133.239 ± 0.032M/s  (  1.851M/s/cpu)
uprobe-nop      (80 cpus):  142.037 ± 0.138M/s  (  1.775M/s/cpu)

Previously total throughput was maxing out at 11mln/s, and gradually
declining past 8 cores. With this change, it now keeps growing with
each added CPU, reaching 142mln/s at 80 CPUs (this was measured on an
80-core Intel(R) Xeon(R) Gold 6138 CPU @ 2.00GHz).
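Condensed, the fast path added by the diff below has the following shape;
declarations and the garbage-tolerant field reads are abbreviated here, and the
helper names are the ones introduced earlier in this series:

	guard(rcu)();	/* keeps struct file and uprobe memory from being freed under us */

	if (!mmap_lock_speculation_start(mm, &seq))
		return NULL;	/* mmap_lock is write-locked, use the locked slow path */

	vma = vma_lookup(mm, bp_vaddr);
	/* ... READ_ONCE()/data_race() reads of vm_file, vm_pgoff, vm_start, f_inode ... */
	uprobe = find_uprobe_rcu(vm_inode, offset);

	if (!mmap_lock_speculation_end(mm, seq))
		return NULL;	/* mm changed underneath us; caller falls back to mmap_read_lock() */

	return uprobe;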
Suggested-by: Matthew Wilcox
Signed-off-by: Andrii Nakryiko
Reviewed-by: Oleg Nesterov
---
 kernel/events/uprobes.c | 50 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 50 insertions(+)

diff --git a/kernel/events/uprobes.c b/kernel/events/uprobes.c
index fa1024aad6c4..9dc6e78975c9 100644
--- a/kernel/events/uprobes.c
+++ b/kernel/events/uprobes.c
@@ -2047,6 +2047,52 @@ static int is_trap_at_addr(struct mm_struct *mm, unsigned long vaddr)
 	return is_trap_insn(&opcode);
 }
 
+static struct uprobe *find_active_uprobe_speculative(unsigned long bp_vaddr)
+{
+	struct mm_struct *mm = current->mm;
+	struct uprobe *uprobe = NULL;
+	struct vm_area_struct *vma;
+	struct file *vm_file;
+	struct inode *vm_inode;
+	unsigned long vm_pgoff, vm_start;
+	loff_t offset;
+	long seq;
+
+	guard(rcu)();
+
+	if (!mmap_lock_speculation_start(mm, &seq))
+		return NULL;
+
+	vma = vma_lookup(mm, bp_vaddr);
+	if (!vma)
+		return NULL;
+
+	/* vm_file memory can be reused for another instance of struct file,
+	 * but can't be freed from under us, so it's safe to read fields from
+	 * it, even if the values are some garbage values; ultimately
+	 * find_uprobe_rcu() + mmap_lock_speculation_end() check will ensure
+	 * that whatever we speculatively found is correct
+	 */
+	vm_file = READ_ONCE(vma->vm_file);
+	if (!vm_file)
+		return NULL;
+
+	vm_pgoff = data_race(vma->vm_pgoff);
+	vm_start = data_race(vma->vm_start);
+	vm_inode = data_race(vm_file->f_inode);
+
+	offset = (loff_t)(vm_pgoff << PAGE_SHIFT) + (bp_vaddr - vm_start);
+	uprobe = find_uprobe_rcu(vm_inode, offset);
+	if (!uprobe)
+		return NULL;
+
+	/* now double check that nothing about MM changed */
+	if (!mmap_lock_speculation_end(mm, seq))
+		return NULL;
+
+	return uprobe;
+}
+
 /* assumes being inside RCU protected region */
 static struct uprobe *find_active_uprobe_rcu(unsigned long bp_vaddr, int *is_swbp)
 {
@@ -2054,6 +2100,10 @@ static struct uprobe *find_active_uprobe_rcu(unsigned long bp_vaddr, int *is_swb
 	struct uprobe *uprobe = NULL;
 	struct vm_area_struct *vma;
 
+	uprobe = find_active_uprobe_speculative(bp_vaddr);
+	if (uprobe)
+		return uprobe;
+
 	mmap_read_lock(mm);
 	vma = vma_lookup(mm, bp_vaddr);
 	if (vma) {