From patchwork Thu Sep 1 17:34:48 2022
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 12962995
Date: Thu, 1 Sep 2022 10:34:48 -0700
Message-ID: <20220901173516.702122-1-surenb@google.com>
Subject: [RFC PATCH RESEND 00/28] per-VMA locks proposal
From: Suren Baghdasaryan <surenb@google.com>
To: akpm@linux-foundation.org
Cc: michel@lespinasse.org, jglisse@google.com, mhocko@suse.com,
 vbabka@suse.cz, hannes@cmpxchg.org, mgorman@suse.de, dave@stgolabs.net,
 willy@infradead.org, liam.howlett@oracle.com, peterz@infradead.org,
 ldufour@linux.ibm.com, laurent.dufour@fr.ibm.com, paulmck@kernel.org,
 luto@kernel.org, songliubraving@fb.com, peterx@redhat.com,
 david@redhat.com, dhowells@redhat.com, hughd@google.com,
 bigeasy@linutronix.de, kent.overstreet@linux.dev, rientjes@google.com,
 axelrasmussen@google.com, joelaf@google.com, minchan@google.com,
 surenb@google.com, kernel-team@android.com, linux-mm@kvack.org,
 linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org,
 x86@kernel.org, linux-kernel@vger.kernel.org

Resending to fix the issue with the In-Reply-To tag in the original
submission at [4].

This is a proof of concept for the per-VMA locks idea that was discussed
during the SPF [1] discussion at LSF/MM this year [2], which concluded
with the suggestion that “a reader/writer semaphore could be put into
the VMA itself; that would have the effect of using the VMA as a sort of
range lock. There would still be contention at the VMA level, but it
would be an improvement.” This patchset implements that suggested
approach. When handling page faults we look up the VMA that contains the
faulting page under RCU protection and try to acquire its lock. If that
fails we fall back to using mmap_lock, similar to how SPF handled this
situation. A simplified sketch of this fault path is included after the
cover text below.

One notable way the implementation deviates from the proposal is in how
VMAs are marked as locked. During some mm updates, multiple VMAs need to
stay locked until the end of the update (e.g. vma_merge, split_vma,
etc.), and tracking all the locked VMAs, avoiding recursive locks and
handling other complications would make the code more complex. Therefore
we provide a way to "mark" VMAs as locked and then unmark all locked
VMAs at once. This is done using two sequence numbers - one in the
vm_area_struct and one in the mm_struct. A VMA is considered locked when
these sequence numbers are equal. To mark a VMA as locked we set the
sequence number in its vm_area_struct equal to the sequence number in
the mm_struct; to unlock all VMAs we increment the mm_struct's sequence
number. This gives an efficient way to track locked VMAs and to drop the
locks on all of them at the end of the update; a sketch of the scheme
also follows below.

The patchset implements per-VMA locking only for anonymous pages which
are not in swap. If the initial proposal is considered acceptable,
support for swapped and file-backed page faults will be added later.

Performance benchmarks show similar, although slightly smaller, benefits
than the SPF patchset (~75% of SPF's benefits). Still, with its lower
complexity, this approach might be more desirable.
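For illustration, the fault-side logic is roughly the following. This is
only a sketch: it reuses the __find_vma() and find_and_lock_anon_vma()
names introduced by this series, but the bodies are approximations of
the actual patches and omit the anonymous-only checks and the statistics
accounting.

/*
 * Sketch: try to read-lock a VMA for page fault handling. Fails if a
 * writer has marked the VMA locked (vm_lock_seq == mm_lock_seq).
 */
static bool vma_read_trylock(struct vm_area_struct *vma)
{
	if (!down_read_trylock(&vma->lock))
		return false;
	/* A writer holding mmap_lock has marked this VMA; back off. */
	if (vma->vm_lock_seq == READ_ONCE(vma->vm_mm->mm_lock_seq)) {
		up_read(&vma->lock);
		return false;
	}
	return true;
}

/* Sketch: look up and read-lock a VMA under RCU, without mmap_lock. */
struct vm_area_struct *find_and_lock_anon_vma(struct mm_struct *mm,
					      unsigned long address)
{
	struct vm_area_struct *vma;

	rcu_read_lock();
	vma = __find_vma(mm, address);	/* RCU-safe lookup, no mmap_lock */
	if (vma && !vma_read_trylock(vma))
		vma = NULL;
	rcu_read_unlock();

	return vma;
}

When this returns NULL, the arch fault handler simply retries with the
usual mmap_lock path.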
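The mark/unmark scheme described above can be sketched as follows.
Again, this is illustrative: the field and helper names (vma->lock,
vm_lock_seq, mm_lock_seq, vma_mark_locked(), vma_mark_unlocked_all())
follow the description in this cover letter, not necessarily the exact
code in the patches, and the writer is assumed to hold mmap_lock for
write so neither sequence number can change underneath it.

/*
 * Sketch: mark a VMA as locked for the duration of an mm update.
 * Writing vm_lock_seq under the VMA's rwsem excludes page fault
 * readers; re-marking an already-marked VMA is a cheap no-op, which
 * avoids recursive-locking complications in paths like vma_merge().
 */
static void vma_mark_locked(struct vm_area_struct *vma)
{
	int mm_lock_seq = READ_ONCE(vma->vm_mm->mm_lock_seq);

	mmap_assert_write_locked(vma->vm_mm);
	if (vma->vm_lock_seq == mm_lock_seq)
		return;	/* already marked during this update */

	down_write(&vma->lock);
	vma->vm_lock_seq = mm_lock_seq;
	up_write(&vma->lock);
}

/* Sketch: unmark every locked VMA at once at the end of the update. */
static void vma_mark_unlocked_all(struct mm_struct *mm)
{
	mmap_assert_write_locked(mm);
	WRITE_ONCE(mm->mm_lock_seq, mm->mm_lock_seq + 1);
}

Incrementing mm_lock_seq makes every previously-set vm_lock_seq stale,
so no per-VMA bookkeeping is needed to find and unlock the marked VMAs.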
The patchset applies cleanly over 6.0-rc3. The tree for testing is
posted at [3].

[1] https://lore.kernel.org/all/20220128131006.67712-1-michel@lespinasse.org/
[2] https://lwn.net/Articles/893906/
[3] https://github.com/surenbaghdasaryan/linux/tree/per_vma_lock_rfc
[4] https://lore.kernel.org/all/20220829212531.3184856-1-surenb@google.com/

Laurent Dufour (2):
  powerpc/mm: try VMA lock-based page fault handling first
  powerpc/mm: define ARCH_SUPPORTS_PER_VMA_LOCK

Michel Lespinasse (1):
  mm: rcu safe VMA freeing

Suren Baghdasaryan (25):
  mm: introduce CONFIG_PER_VMA_LOCK
  mm: introduce __find_vma to be used without mmap_lock protection
  mm: move mmap_lock assert function definitions
  mm: add per-VMA lock and helper functions to control it
  mm: mark VMA as locked whenever vma->vm_flags are modified
  kernel/fork: mark VMAs as locked before copying pages during fork
  mm/khugepaged: mark VMA as locked while collapsing a hugepage
  mm/mempolicy: mark VMA as locked when changing protection policy
  mm/mmap: mark VMAs as locked in vma_adjust
  mm/mmap: mark VMAs as locked before merging or splitting them
  mm/mremap: mark VMA as locked while remapping it to a new address range
  mm: conditionally mark VMA as locked in free_pgtables and unmap_page_range
  mm: mark VMAs as locked before isolating them
  mm/mmap: mark adjacent VMAs as locked if they can grow into unmapped area
  kernel/fork: assert no VMA readers during its destruction
  mm/mmap: prevent pagefault handler from racing with mmu_notifier registration
  mm: add FAULT_FLAG_VMA_LOCK flag
  mm: disallow do_swap_page to handle page faults under VMA lock
  mm: introduce per-VMA lock statistics
  mm: introduce find_and_lock_anon_vma to be used from arch-specific code
  x86/mm: try VMA lock-based page fault handling first
  x86/mm: define ARCH_SUPPORTS_PER_VMA_LOCK
  arm64/mm: try VMA lock-based page fault handling first
  arm64/mm: define ARCH_SUPPORTS_PER_VMA_LOCK
  kernel/fork: throttle call_rcu() calls in vm_area_free

 arch/arm64/Kconfig                     |   1 +
 arch/arm64/mm/fault.c                  |  36 +++++++++
 arch/powerpc/mm/fault.c                |  41 ++++++++++
 arch/powerpc/platforms/powernv/Kconfig |   1 +
 arch/powerpc/platforms/pseries/Kconfig |   1 +
 arch/x86/Kconfig                       |   1 +
 arch/x86/mm/fault.c                    |  36 +++++++++
 drivers/gpu/drm/i915/i915_gpu_error.c  |   4 +-
 fs/proc/task_mmu.c                     |   1 +
 fs/userfaultfd.c                       |   6 ++
 include/linux/mm.h                     | 104 ++++++++++++++++++++++++-
 include/linux/mm_types.h               |  33 ++++++--
 include/linux/mmap_lock.h              |  37 ++++++---
 include/linux/vm_event_item.h          |   6 ++
 include/linux/vmstat.h                 |   6 ++
 kernel/fork.c                          |  75 +++++++++++++++++-
 mm/Kconfig                             |  13 ++++
 mm/Kconfig.debug                       |   8 ++
 mm/init-mm.c                           |   6 ++
 mm/internal.h                          |   4 +-
 mm/khugepaged.c                        |   1 +
 mm/madvise.c                           |   1 +
 mm/memory.c                            |  82 ++++++++++++++++---
 mm/mempolicy.c                         |   6 +-
 mm/mlock.c                             |   2 +
 mm/mmap.c                              |  60 ++++++++++----
 mm/mprotect.c                          |   1 +
 mm/mremap.c                            |   1 +
 mm/nommu.c                             |   2 +
 mm/oom_kill.c                          |   3 +-
 mm/vmstat.c                            |   6 ++
 31 files changed, 531 insertions(+), 54 deletions(-)