From patchwork Thu Jan 9 02:30:09 2025
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13931807
Date: Wed, 8 Jan 2025 18:30:09 -0800
Message-ID: <20250109023025.2242447-1-surenb@google.com>
Subject: [PATCH v8 00/16] move per-vma lock into vm_area_struct
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: peterz@infradead.org, willy@infradead.org, liam.howlett@oracle.com,
 lorenzo.stoakes@oracle.com, mhocko@suse.com, vbabka@suse.cz,
 hannes@cmpxchg.org, mjguzik@gmail.com, oliver.sang@intel.com,
 mgorman@techsingularity.net, david@redhat.com, peterx@redhat.com,
 oleg@redhat.com, dave@stgolabs.net, paulmck@kernel.org,
 brauner@kernel.org, dhowells@redhat.com, hdanton@sina.com,
 hughd@google.com, lokeshgidra@google.com, minchan@google.com,
 jannh@google.com, shakeel.butt@linux.dev, souravpanda@google.com,
 pasha.tatashin@soleen.com, klarasmodin@gmail.com,
 richard.weiyang@gmail.com, corbet@lwn.net, linux-doc@vger.kernel.org,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 kernel-team@android.com, surenb@google.com

Back when per-vma locks were introduced, vm_lock was moved out of
vm_area_struct in [1] because of the performance regression caused by
false cacheline sharing. Recent investigation [2] revealed that the
regression is limited to a rather old Broadwell microarchitecture and
even there it can be mitigated by disabling adjacent cacheline
prefetching, see [3].

Splitting a single logical structure into multiple ones leads to more
complicated management, extra pointer dereferences and overall less
maintainable code. When that split-away part is a lock, it complicates
things even further. With no performance benefits, there are no reasons
for this split. Merging the vm_lock back into vm_area_struct also allows
vm_area_struct to use SLAB_TYPESAFE_BY_RCU later in this patchset.

This patchset:
1. moves vm_lock back into vm_area_struct, aligning it at the cacheline
   boundary and changing the cache to be cacheline-aligned to minimize
   cacheline sharing;
2. changes vm_area_struct initialization to mark a new vma as detached
   until it is inserted into the vma tree;
3. replaces vm_lock and the vma->detached flag with a reference counter;
4. changes the vm_area_struct cache to SLAB_TYPESAFE_BY_RCU to allow
   vmas to be reused and to minimize call_rcu() calls.

Pagefault microbenchmarks show a performance improvement:

Hmean     faults/cpu-1    507926.5547 (   0.00%)   506519.3692 *  -0.28%*
Hmean     faults/cpu-4    479119.7051 (   0.00%)   481333.6802 *   0.46%*
Hmean     faults/cpu-7    452880.2961 (   0.00%)   455845.6211 *   0.65%*
Hmean     faults/cpu-12   347639.1021 (   0.00%)   352004.2254 *   1.26%*
Hmean     faults/cpu-21   200061.2238 (   0.00%)   229597.0317 *  14.76%*
Hmean     faults/cpu-30   145251.2001 (   0.00%)   164202.5067 *  13.05%*
Hmean     faults/cpu-48   106848.4434 (   0.00%)   120641.5504 *  12.91%*
Hmean     faults/cpu-56    92472.3835 (   0.00%)   103464.7916 *  11.89%*
Hmean     faults/sec-1    507566.1468 (   0.00%)   506139.0811 *  -0.28%*
Hmean     faults/sec-4   1880478.2402 (   0.00%)  1886795.6329 *   0.34%*
Hmean     faults/sec-7   3106394.3438 (   0.00%)  3140550.7485 *   1.10%*
Hmean     faults/sec-12  4061358.4795 (   0.00%)  4112477.0206 *   1.26%*
Hmean     faults/sec-21  3988619.1169 (   0.00%)  4577747.1436 *  14.77%*
Hmean     faults/sec-30  3909839.5449 (   0.00%)  4311052.2787 *  10.26%*
Hmean     faults/sec-48  4761108.4691 (   0.00%)  5283790.5026 *  10.98%*
Hmean     faults/sec-56  4885561.4590 (   0.00%)  5415839.4045 *  10.85%*

Changes since v7 [4]:
- Removed additional parameter for vma_iter_store() and introduced
  vma_iter_store_attached() instead, per Vlastimil Babka and
  Liam R. Howlett
- Fixed coding style nits, per Vlastimil Babka
- Added Reviewed-bys and Acked-bys, per Vlastimil Babka
- Added Reviewed-bys and Acked-bys, per Liam R. Howlett
- Added Acked-by, per Davidlohr Bueso
- Removed unnecessary patch changing nommu.c
- Folded a fixup patch [5] into the patch it was fixing
- Changed calculation in __refcount_add_not_zero_limited() to avoid
  overflow, to change the limit to be inclusive and to use INT_MAX to
  indicate no limits, per Vlastimil Babka and Matthew Wilcox
- Folded a fixup patch [6] into the patch it was fixing
- Added vm_refcnt rules summary in the changelog, per Liam R. Howlett
- Changed writers to not increment vm_refcnt and adjusted VMA_REF_LIMIT
  to not reserve one count for a writer, per Liam R. Howlett
- Changed vma_refcount_put() to wake up writers only when the last
  reader is leaving, per Liam R. Howlett
- Fixed rwsem_acquire_read() parameters when read-locking a vma to match
  the way down_read_trylock() does lockdep, per Vlastimil Babka
- Folded vma_lockdep_init() into vma_lock_init() for simplicity
- Brought back vma_copy() to keep vm_refcnt at 0 during reuse,
  per Vlastimil Babka

What I did not include in this patchset:
- Liam's suggestion to change dump_vma() output, since it's unclear to
  me what it should look like. The patch is for debug only and not
  critical for the rest of the series; we can change the output later or
  even drop it if necessary.
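
For readers who have not followed the earlier revisions, the limit check
described in the changelog above (overflow-safe, inclusive limit,
INT_MAX meaning "no limit") can be sketched in userspace C11 roughly as
follows. This is only an illustration under stated assumptions, not the
code from the series: struct sketch_refcount and
sketch_add_not_zero_limited() are hypothetical names, the counter is
assumed to be non-negative, and relaxed memory ordering is used purely
to keep the example short.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for a kernel refcount; not refcount_t. */
struct sketch_refcount {
	atomic_int refs;
};

/*
 * Add i to the counter unless it is zero or the result would exceed
 * limit. The limit is inclusive; passing INT_MAX means "no limit"
 * beyond what an int can hold.
 * Assumptions: i > 0, limit > 0, and the counter is never negative.
 */
static bool sketch_add_not_zero_limited(int i, struct sketch_refcount *r,
					int limit)
{
	int old = atomic_load_explicit(&r->refs, memory_order_relaxed);

	assert(i > 0 && limit > 0);

	do {
		/* A zero count means the object is not live: do not revive it. */
		if (old == 0)
			return false;

		/*
		 * Compare against (limit - old) instead of computing
		 * (old + i), so the check itself cannot overflow.
		 */
		if (i > limit - old)
			return false;

		/* On failure the CAS reloads old and the checks are redone. */
	} while (!atomic_compare_exchange_weak_explicit(&r->refs, &old, old + i,
							memory_order_relaxed,
							memory_order_relaxed));
	return true;
}
```

With a counter at 1 and a limit of 2, the first call with i = 1 succeeds
and a second one fails, which is what "inclusive" means here. With
limit = INT_MAX the only rejection left is the one that prevents integer
overflow.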

[1] https://lore.kernel.org/all/20230227173632.3292573-34-surenb@google.com/
[2] https://lore.kernel.org/all/ZsQyI%2F087V34JoIt@xsang-OptiPlex-9020/
[3] https://lore.kernel.org/all/CAJuCfpEisU8Lfe96AYJDZ+OM4NoPmnw9bP53cT_kbfP_pR+-2g@mail.gmail.com/
[4] https://lore.kernel.org/all/20241226170710.1159679-1-surenb@google.com/
[5] https://lore.kernel.org/all/20250107030415.721474-1-surenb@google.com/
[6] https://lore.kernel.org/all/20241226200335.1250078-1-surenb@google.com/

Patchset applies over mm-unstable after reverting v7
(current SHA range: 588f0086398e - fb2270654630)

Suren Baghdasaryan (16):
  mm: introduce vma_start_read_locked{_nested} helpers
  mm: move per-vma lock into vm_area_struct
  mm: mark vma as detached until it's added into vma tree
  mm: introduce vma_iter_store_attached() to use with attached vmas
  mm: mark vmas detached upon exit
  types: move struct rcuwait into types.h
  mm: allow vma_start_read_locked/vma_start_read_locked_nested to fail
  mm: move mmap_init_lock() out of the header file
  mm: uninline the main body of vma_start_write()
  refcount: introduce __refcount_{add|inc}_not_zero_limited
  mm: replace vm_lock and detached flag with a reference count
  mm/debug: print vm_refcnt state when dumping the vma
  mm: remove extra vma_numab_state_init() call
  mm: prepare lock_vma_under_rcu() for vma reuse possibility
  mm: make vma cache SLAB_TYPESAFE_BY_RCU
  docs/mm: document latest changes to vm_lock

 Documentation/mm/process_addrs.rst |  44 +++++----
 include/linux/mm.h                 | 152 ++++++++++++++++++++++-------
 include/linux/mm_types.h           |  36 ++++---
 include/linux/mmap_lock.h          |   6 --
 include/linux/rcuwait.h            |  13 +--
 include/linux/refcount.h           |  20 +++-
 include/linux/slab.h               |   6 --
 include/linux/types.h              |  12 +++
 kernel/fork.c                      | 128 +++++++++++-------------
 mm/debug.c                         |  12 +++
 mm/init-mm.c                       |   1 +
 mm/memory.c                        |  94 +++++++++++++++---
 mm/mmap.c                          |   3 +-
 mm/userfaultfd.c                   |  32 +++---
 mm/vma.c                           |  23 ++---
 mm/vma.h                           |  15 ++-
 tools/testing/vma/linux/atomic.h   |   5 +
 tools/testing/vma/vma_internal.h   |  93 ++++++++----------
 18 files changed, 435 insertions(+), 260 deletions(-)
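
As a footnote to item 1 of the patchset summary, the idea of starting
the lock on its own cacheline inside the structure can be illustrated
with standard C11 alignment. This is a sketch only: struct sketch_vma
and its fields are hypothetical and do not mirror the real
vm_area_struct layout, and the 64-byte cacheline size is an assumption
(the kernel uses its own per-architecture constant).

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>

/* Assumption for illustration: 64-byte cachelines. */
#define SKETCH_CACHELINE 64

/* Hypothetical stand-in; not the real vm_area_struct. */
struct sketch_vma {
	/* Fields that page-fault paths mostly read. */
	unsigned long start;
	unsigned long end;
	unsigned long flags;

	/*
	 * Frequently written lock state starts on its own cacheline so
	 * updates to it do not invalidate the line holding the fields
	 * above.
	 */
	alignas(SKETCH_CACHELINE) int lock_state;
};

/* The lock state begins exactly on a cacheline boundary. */
static_assert(offsetof(struct sketch_vma, lock_state) % SKETCH_CACHELINE == 0,
	      "lock_state must start a cacheline");

/*
 * The member alignment propagates to the struct, so its size is a
 * multiple of the cacheline and adjacent objects in a suitably aligned
 * array or cache never share a line.
 */
static_assert(sizeof(struct sketch_vma) % SKETCH_CACHELINE == 0,
	      "objects must not straddle into a neighbour's cacheline");
```

The second assertion corresponds to the "cache to be cacheline-aligned"
half of item 1: aligning the member alone is not enough if the allocator
may place objects at arbitrary offsets.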