From patchwork Wed Dec 8 21:22:09 2021
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 12665267
Date: Wed, 8 Dec 2021 13:22:09 -0800
Message-Id: <20211208212211.2860249-1-surenb@google.com>
Subject: [PATCH v4 1/3] mm: protect free_pgtables with mmap_lock write lock in exit_mmap
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: mhocko@kernel.org, mhocko@suse.com, rientjes@google.com,
 willy@infradead.org, hannes@cmpxchg.org, guro@fb.com, riel@surriel.com,
 minchan@kernel.org, kirill@shutemov.name, aarcange@redhat.com,
 christian@brauner.io, hch@infradead.org, oleg@redhat.com, david@redhat.com,
 jannh@google.com, shakeelb@google.com, luto@kernel.org,
 christian.brauner@ubuntu.com, fweimer@redhat.com, jengelh@inai.de,
 timmurray@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 kernel-team@android.com, surenb@google.com

The oom-reaper and the process_mrelease system call must protect against
races with exit_mmap, which can destroy page tables while they walk the
VMA tree. The oom-reaper protects against that race by setting
MMF_OOM_VICTIM and by relying on exit_mmap to set MMF_OOM_SKIP before
taking and releasing mmap_write_lock. process_mrelease has to elevate
mm->mm_users to prevent such a race. Both the oom-reaper and
process_mrelease hold mmap_read_lock when walking the VMA tree. The
locking rules and mechanisms could be simpler if exit_mmap took
mmap_write_lock while executing destructive operations such as
free_pgtables.

Change exit_mmap to hold the mmap_write_lock when calling free_pgtables
and remove_vma. Operations like unmap_vmas and unlock_range are not
destructive and could run under mmap_read_lock, but for simplicity we take
one mmap_write_lock during almost the entire operation. Note also that
because the oom-reaper checks the VM_LOCKED flag, unlock_range() should
not be allowed to race with it.

Before this patch, remove_vma was called with no locks held. However, with
fput being executed asynchronously and vm_ops->close not being allowed to
take mmap_lock (it is called from __split_vma with mmap_lock held for
write), changing that should be fine.

In most cases this lock should be uncontended.
Previously, Kirill reported a ~4% regression caused by a similar change [1].
We reran the same test and, although the individual results are quite
noisy, the percentiles show a lower regression, with 1.6% being the worst
case [2].

The change allows the oom-reaper and process_mrelease to execute safely
under mmap_read_lock without worrying that exit_mmap might destroy page
tables from under them.

[1] https://lore.kernel.org/all/20170725141723.ivukwhddk2voyhuc@node.shutemov.name/
[2] https://lore.kernel.org/all/CAJuCfpGC9-c9P40x7oy=jy5SphMcd0o0G_6U1-+JAziGKG6dGA@mail.gmail.com/

Signed-off-by: Suren Baghdasaryan
Acked-by: Michal Hocko
---
changes in v4
- Separated comments describing vm_operations_struct::close locking
  requirements into a separate patch, per Matthew Wilcox

 mm/mmap.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index bfb0ea164a90..f4e09d390a07 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3142,25 +3142,27 @@ void exit_mmap(struct mm_struct *mm)
 		 * to mmu_notifier_release(mm) ensures mmu notifier callbacks in
 		 * __oom_reap_task_mm() will not block.
 		 *
-		 * This needs to be done before calling munlock_vma_pages_all(),
+		 * This needs to be done before calling unlock_range(),
 		 * which clears VM_LOCKED, otherwise the oom reaper cannot
 		 * reliably test it.
 		 */
 		(void)__oom_reap_task_mm(mm);

 		set_bit(MMF_OOM_SKIP, &mm->flags);
-		mmap_write_lock(mm);
-		mmap_write_unlock(mm);
 	}

+	mmap_write_lock(mm);
 	if (mm->locked_vm)
 		unlock_range(mm->mmap, ULONG_MAX);

 	arch_exit_mmap(mm);

 	vma = mm->mmap;
-	if (!vma)	/* Can happen if dup_mmap() received an OOM */
+	if (!vma) {
+		/* Can happen if dup_mmap() received an OOM */
+		mmap_write_unlock(mm);
 		return;
+	}

 	lru_add_drain();
 	flush_cache_mm(mm);
@@ -3171,16 +3173,14 @@ void exit_mmap(struct mm_struct *mm)
 	free_pgtables(&tlb, vma, FIRST_USER_ADDRESS, USER_PGTABLES_CEILING);
 	tlb_finish_mmu(&tlb);

-	/*
-	 * Walk the list again, actually closing and freeing it,
-	 * with preemption enabled, without holding any MM locks.
-	 */
+	/* Walk the list again, actually closing and freeing it. */
 	while (vma) {
 		if (vma->vm_flags & VM_ACCOUNT)
 			nr_accounted += vma_pages(vma);
 		vma = remove_vma(vma);
 		cond_resched();
 	}
+	mmap_write_unlock(mm);
 	vm_unacct_memory(nr_accounted);
 }

From patchwork Wed Dec 8 21:22:10 2021
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 12665269
Date: Wed, 8 Dec 2021 13:22:10 -0800
In-Reply-To: <20211208212211.2860249-1-surenb@google.com>
Message-Id: <20211208212211.2860249-2-surenb@google.com>
References: <20211208212211.2860249-1-surenb@google.com>
Subject: [PATCH v4 2/3] mm: document locking restrictions for vm_operations_struct::close
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: mhocko@kernel.org, mhocko@suse.com, rientjes@google.com,
 willy@infradead.org, hannes@cmpxchg.org, guro@fb.com, riel@surriel.com,
 minchan@kernel.org, kirill@shutemov.name, aarcange@redhat.com,
 christian@brauner.io, hch@infradead.org, oleg@redhat.com, david@redhat.com,
 jannh@google.com, shakeelb@google.com, luto@kernel.org,
 christian.brauner@ubuntu.com, fweimer@redhat.com, jengelh@inai.de,
 timmurray@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 kernel-team@android.com, surenb@google.com

Add comments for vm_operations_struct::close documenting the locking
requirements for this callback and its callers.
Signed-off-by: Suren Baghdasaryan
Acked-by: Michal Hocko
---
 include/linux/mm.h | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index a7e4a9e7d807..b9b88ba7564b 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -577,6 +577,10 @@ enum page_entry_size {
  */
 struct vm_operations_struct {
 	void (*open)(struct vm_area_struct * area);
+	/**
+	 * @close: Called when the VMA is being removed from the MM.
+	 * Context: User context.  May sleep.  Caller holds mmap_lock.
+	 */
 	void (*close)(struct vm_area_struct * area);
 	/* Called any time before splitting to check if it's allowed */
 	int (*may_split)(struct vm_area_struct *area, unsigned long addr);

From patchwork Wed Dec 8 21:22:11 2021
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 12665271
Date: Wed, 8 Dec 2021 13:22:11 -0800
In-Reply-To: <20211208212211.2860249-1-surenb@google.com>
Message-Id: <20211208212211.2860249-3-surenb@google.com>
References: <20211208212211.2860249-1-surenb@google.com>
Subject: [PATCH v4 3/3] mm/oom_kill: allow process_mrelease to run under mmap_lock protection
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: mhocko@kernel.org, mhocko@suse.com, rientjes@google.com,
 willy@infradead.org, hannes@cmpxchg.org, guro@fb.com, riel@surriel.com,
 minchan@kernel.org, kirill@shutemov.name, aarcange@redhat.com,
 christian@brauner.io, hch@infradead.org, oleg@redhat.com, david@redhat.com,
 jannh@google.com, shakeelb@google.com, luto@kernel.org,
 christian.brauner@ubuntu.com, fweimer@redhat.com, jengelh@inai.de,
 timmurray@google.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 kernel-team@android.com, surenb@google.com

With exit_mmap holding mmap_write_lock during the free_pgtables call,
process_mrelease does not need to elevate mm->mm_users in order to prevent
exit_mmap from destroying page tables while __oom_reap_task_mm is walking
the VMA tree. The change prevents process_mrelease from calling the last
mmput, which can lead to waiting for IO completion in exit_aio.
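The difference between the two reference types can be pictured with a small userspace model. This is only a sketch under stated assumptions: every toy_* name below is hypothetical and none of this is kernel code. `users` loosely models mm->mm_users, `count` loosely models mm->mm_count, and the teardown flag stands in for the expensive work (exit_aio, exit_mmap) done by whoever drops the last user reference.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

enum { TOY_OWNER = 1, TOY_RELEASER = 2 };

/* Hypothetical userspace model; NOT the kernel's struct mm_struct. */
struct toy_mm {
	int users;       /* models mm_users: users of the address space */
	int count;       /* models mm_count: references to the struct itself */
	int teardown_by; /* who ran the expensive teardown, 0 if nobody yet */
};

static struct toy_mm *toy_mm_alloc(void)
{
	struct toy_mm *mm = calloc(1, sizeof(*mm));

	if (!mm)
		return NULL;
	mm->users = 1; /* the owning task */
	mm->count = 1; /* all users together pin the struct once */
	return mm;
}

/* Models mmgrab(): pin the struct only, not the address space. */
static void toy_mmgrab(struct toy_mm *mm)
{
	mm->count++;
}

/* Models mmdrop(): the last struct reference frees the struct. */
static void toy_mmdrop(struct toy_mm *mm)
{
	assert(mm->count > 0);
	if (--mm->count == 0)
		free(mm);
}

/* Models mmget_not_zero(): take a user reference unless users hit zero. */
static bool toy_mmget_not_zero(struct toy_mm *mm)
{
	if (mm->users == 0)
		return false;
	mm->users++;
	return true;
}

/* Models mmput(): whoever drops the last user reference runs the teardown. */
static void toy_mmput(struct toy_mm *mm, int caller)
{
	assert(mm->users > 0);
	if (--mm->users == 0) {
		mm->teardown_by = caller; /* expensive work would happen here */
		toy_mmdrop(mm);           /* drop the reference held for users */
	}
}

/* Returns who ended up running the teardown in each scheme. */
static int toy_teardown_runner(bool releaser_uses_mmget)
{
	struct toy_mm *mm = toy_mm_alloc();
	int who;

	if (!mm)
		return -1;
	toy_mmgrab(mm); /* extra pin so the result can be read below */

	if (releaser_uses_mmget) {
		toy_mmget_not_zero(mm);      /* old scheme: user reference */
		toy_mmput(mm, TOY_OWNER);    /* owner exits, not the last user */
		toy_mmput(mm, TOY_RELEASER); /* releaser is last: runs teardown */
	} else {
		toy_mmgrab(mm);              /* new scheme: struct reference */
		toy_mmput(mm, TOY_OWNER);    /* owner is last user: runs teardown */
		toy_mmdrop(mm);              /* releaser only unpins the struct */
	}

	who = mm->teardown_by;
	toy_mmdrop(mm);
	return who;
}
```

In this model, toy_teardown_runner(true) returns TOY_RELEASER and toy_teardown_runner(false) returns TOY_OWNER, which mirrors the point above: with only a struct reference, the releasing side never becomes the one that performs the last mmput.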
Fixes: 337546e83fc7 ("mm/oom_kill.c: prevent a race between process_mrelease and exit_mmap")
Signed-off-by: Suren Baghdasaryan
Acked-by: Michal Hocko
---
 mm/oom_kill.c | 27 +++++++++++++++------------
 1 file changed, 15 insertions(+), 12 deletions(-)

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 1ddabefcfb5a..67780386f478 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -1169,15 +1169,15 @@ SYSCALL_DEFINE2(process_mrelease, int, pidfd, unsigned int, flags)
 		goto put_task;
 	}

-	if (mmget_not_zero(p->mm)) {
-		mm = p->mm;
-		if (task_will_free_mem(p))
-			reap = true;
-		else {
-			/* Error only if the work has not been done already */
-			if (!test_bit(MMF_OOM_SKIP, &mm->flags))
-				ret = -EINVAL;
-		}
+	mm = p->mm;
+	mmgrab(mm);
+
+	if (task_will_free_mem(p))
+		reap = true;
+	else {
+		/* Error only if the work has not been done already */
+		if (!test_bit(MMF_OOM_SKIP, &mm->flags))
+			ret = -EINVAL;
 	}
 	task_unlock(p);

@@ -1188,13 +1188,16 @@ SYSCALL_DEFINE2(process_mrelease, int, pidfd, unsigned int, flags)
 		ret = -EINTR;
 		goto drop_mm;
 	}
-	if (!__oom_reap_task_mm(mm))
+	/*
+	 * Check MMF_OOM_SKIP again under mmap_read_lock protection to ensure
+	 * possible change in exit_mmap is seen
+	 */
+	if (!test_bit(MMF_OOM_SKIP, &mm->flags) && !__oom_reap_task_mm(mm))
 		ret = -EAGAIN;
 	mmap_read_unlock(mm);

 drop_mm:
-	if (mm)
-		mmput(mm);
+	mmdrop(mm);
 put_task:
 	put_task_struct(task);
 	return ret;

From patchwork Thu Dec 9 09:12:10 2021
X-Patchwork-Submitter: Michal Hocko
X-Patchwork-Id: 12666189
Date: Thu, 9 Dec 2021 10:12:10 +0100
From: Michal Hocko
To: Suren Baghdasaryan
Cc: akpm@linux-foundation.org, rientjes@google.com, willy@infradead.org,
 hannes@cmpxchg.org, guro@fb.com, riel@surriel.com, minchan@kernel.org,
 kirill@shutemov.name, aarcange@redhat.com, christian@brauner.io,
 hch@infradead.org, oleg@redhat.com, david@redhat.com, jannh@google.com,
 shakeelb@google.com, luto@kernel.org, christian.brauner@ubuntu.com,
 fweimer@redhat.com, jengelh@inai.de, timmurray@google.com,
 linux-mm@kvack.org, linux-kernel@vger.kernel.org, kernel-team@android.com
Subject: [PATCH 4/3] mm: drop MMF_OOM_SKIP from exit_mmap
References: <20211208212211.2860249-1-surenb@google.com>
In-Reply-To: <20211208212211.2860249-1-surenb@google.com>

Do we want this on top?
----
From 58b04ae6dc97b0105ea2651daca55cf2386f69b4 Mon Sep 17 00:00:00 2001
From: Michal Hocko
Date: Thu, 9 Dec 2021 10:07:51 +0100
Subject: [PATCH] mm: drop MMF_OOM_SKIP from exit_mmap

MMF_OOM_SKIP used to play a synchronization role between exit_mmap and
the oom reaper. Since the exclusive mmap_lock is now held in exit_mmap to
cover all destructive operations, the flag synchronization is no longer
needed and we can safely drop it. Just make sure that mm->mmap is set to
NULL so that nobody will access the freed vma list.
Signed-off-by: Michal Hocko
---
 mm/mmap.c | 23 +----------------------
 1 file changed, 1 insertion(+), 22 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index f4e09d390a07..0d6af9d89aa8 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -3129,28 +3129,6 @@ void exit_mmap(struct mm_struct *mm)
 	/* mm's last user has gone, and its about to be pulled down */
 	mmu_notifier_release(mm);

-	if (unlikely(mm_is_oom_victim(mm))) {
-		/*
-		 * Manually reap the mm to free as much memory as possible.
-		 * Then, as the oom reaper does, set MMF_OOM_SKIP to disregard
-		 * this mm from further consideration.  Taking mm->mmap_lock for
-		 * write after setting MMF_OOM_SKIP will guarantee that the oom
-		 * reaper will not run on this mm again after mmap_lock is
-		 * dropped.
-		 *
-		 * Nothing can be holding mm->mmap_lock here and the above call
-		 * to mmu_notifier_release(mm) ensures mmu notifier callbacks in
-		 * __oom_reap_task_mm() will not block.
-		 *
-		 * This needs to be done before calling unlock_range(),
-		 * which clears VM_LOCKED, otherwise the oom reaper cannot
-		 * reliably test it.
-		 */
-		(void)__oom_reap_task_mm(mm);
-
-		set_bit(MMF_OOM_SKIP, &mm->flags);
-	}
-
 	mmap_write_lock(mm);
 	if (mm->locked_vm)
 		unlock_range(mm->mmap, ULONG_MAX);
@@ -3180,6 +3158,7 @@ void exit_mmap(struct mm_struct *mm)
 		vma = remove_vma(vma);
 		cond_resched();
 	}
+	mm->mmap = NULL;
 	mmap_write_unlock(mm);
 	vm_unacct_memory(nr_accounted);
 }