From patchwork Thu Dec 26 17:06:52 2024
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13921381
Message-ID: <20241226170710.1159679-1-surenb@google.com>
Date: Thu, 26 Dec 2024 09:06:52 -0800
Subject: [PATCH v7 00/17] move per-vma lock into vm_area_struct
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com,
    lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
    hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
    mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
    oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
    brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
    hughd@google.com, lokeshgidra@google.com, minchan@google.com,
    jannh@google.com, shakeel.butt@linux.dev,
    souravpanda@google.com, pasha.tatashin@soleen.com,
    klarasmodin@gmail.com, corbet@lwn.net, linux-doc@vger.kernel.org,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    kernel-team@android.com, surenb@google.com
Back when per-vma locks were introduced, vm_lock was moved out of
vm_area_struct in [1] because of the performance regression caused by
false cacheline sharing. Recent investigation [2] revealed that the
regression is limited to a rather old Broadwell microarchitecture and
even there it can be mitigated by disabling adjacent cacheline
prefetching, see [3].

Splitting a single logical structure into multiple ones leads to more
complicated management, extra pointer dereferences and overall less
maintainable code. When that split-away part is a lock, it complicates
things even further. With no performance benefits, there are no reasons
for this split. Merging the vm_lock back into vm_area_struct also
allows vm_area_struct to use SLAB_TYPESAFE_BY_RCU later in this
patchset.

This patchset:
1. moves vm_lock back into vm_area_struct, aligning it at the cacheline
   boundary and changing the cache to be cacheline-aligned to minimize
   cacheline sharing;
2. changes vm_area_struct initialization to mark a new vma as detached
   until it is inserted into the vma tree;
3. replaces vm_lock and the vma->detached flag with a reference counter;
4. changes the vm_area_struct cache to SLAB_TYPESAFE_BY_RCU to allow
   object reuse and to minimize call_rcu() calls.

Pagefault microbenchmarks show performance improvement:

Hmean   faults/cpu-1    507926.5547 (   0.00%)   506519.3692 *  -0.28%*
Hmean   faults/cpu-4    479119.7051 (   0.00%)   481333.6802 *   0.46%*
Hmean   faults/cpu-7    452880.2961 (   0.00%)   455845.6211 *   0.65%*
Hmean   faults/cpu-12   347639.1021 (   0.00%)   352004.2254 *   1.26%*
Hmean   faults/cpu-21   200061.2238 (   0.00%)   229597.0317 *  14.76%*
Hmean   faults/cpu-30   145251.2001 (   0.00%)   164202.5067 *  13.05%*
Hmean   faults/cpu-48   106848.4434 (   0.00%)   120641.5504 *  12.91%*
Hmean   faults/cpu-56    92472.3835 (   0.00%)   103464.7916 *  11.89%*
Hmean   faults/sec-1    507566.1468 (   0.00%)   506139.0811 *  -0.28%*
Hmean   faults/sec-4   1880478.2402 (   0.00%)  1886795.6329 *   0.34%*
Hmean   faults/sec-7   3106394.3438 (   0.00%)  3140550.7485 *   1.10%*
Hmean   faults/sec-12  4061358.4795 (   0.00%)  4112477.0206 *   1.26%*
Hmean   faults/sec-21  3988619.1169 (   0.00%)  4577747.1436 *  14.77%*
Hmean   faults/sec-30  3909839.5449 (   0.00%)  4311052.2787 *  10.26%*
Hmean   faults/sec-48  4761108.4691 (   0.00%)  5283790.5026 *  10.98%*
Hmean   faults/sec-56  4885561.4590 (   0.00%)  5415839.4045 *  10.85%*

Changes since v6 [4]:
- Fixed vma_start_read_locked() condition in uffd_move_lock(), per
  Lokesh Gidra
- Moved more conditions into unlikely() in vma_start_read(), per Peter
- Renamed VMA_LOCK_LOCKED into VMA_LOCK_OFFSET, removed
  VMA_STATE_{A|DE}TACHED, introduced VMA_REF_LIMIT, per Peter
- Made sure no re-attach or re-detach operation is happening, added
  assertions to catch such cases, per Peter
- Added a parameter to vma_iter_store{_gfp} to indicate whether a new
  vma is being added or an existing one modified, to avoid re-attaching
  an existing vma
- Refactored patches to implement detached guarantees in a single patch

What I did not include in this patchset:
- Changing of vma locking patterns;
- Changing do_vmi_align_munmap() to avoid reattach_vmas()
This cleanup needs more discussion and can be done independently
as this patchset is already quite large.

Patchset applies over mm-unstable.

[1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@google.com/
[2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/
[3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbfP_pR+-2g@mail.gmail.com/
[4] https://lore.kernel.org/all/20241216192419.2970941-1-surenb@google.com/

Suren Baghdasaryan (17):
  mm: introduce vma_start_read_locked{_nested} helpers
  mm: move per-vma lock into vm_area_struct
  mm: mark vma as detached until it's added into vma tree
  mm: modify vma_iter_store{_gfp} to indicate if it's storing a new vma
  mm: mark vmas detached upon exit
  mm/nommu: fix the last places where vma is not locked before being
    attached
  types: move struct rcuwait into types.h
  mm: allow vma_start_read_locked/vma_start_read_locked_nested to fail
  mm: move mmap_init_lock() out of the header file
  mm: uninline the main body of vma_start_write()
  refcount: introduce __refcount_{add|inc}_not_zero_limited
  mm: replace vm_lock and detached flag with a reference count
  mm/debug: print vm_refcnt state when dumping the vma
  mm: remove extra vma_numab_state_init() call
  mm: prepare lock_vma_under_rcu() for vma reuse possibility
  mm: make vma cache SLAB_TYPESAFE_BY_RCU
  docs/mm: document latest changes to vm_lock

 Documentation/mm/process_addrs.rst |  44 +++++----
 include/linux/mm.h                 | 152 ++++++++++++++++++++++-------
 include/linux/mm_types.h           |  36 ++++---
 include/linux/mmap_lock.h          |   6 --
 include/linux/rcuwait.h            |  13 +--
 include/linux/refcount.h           |  20 +++-
 include/linux/slab.h               |   6 --
 include/linux/types.h              |  12 +++
 kernel/fork.c                      |  87 +++++------------
 mm/debug.c                         |   4 +-
 mm/init-mm.c                       |   1 +
 mm/memory.c                        |  85 +++++++++++++---
 mm/mmap.c                          |   3 +-
 mm/nommu.c                         |   6 +-
 mm/userfaultfd.c                   |  31 +++---
 mm/vma.c                           |  31 +++---
 mm/vma.h                           |  13 ++-
 tools/testing/vma/linux/atomic.h   |   5 +
 tools/testing/vma/vma_internal.h   |  93 ++++++++----------
 19 files changed, 385 insertions(+), 263 deletions(-)

base-commit: 431614f1580a03c1a653340c55ea76bd12a9403f