From patchwork Sat Jul 8 19:12:10 2023
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13305706
Date: Sat, 8 Jul 2023 12:12:10 -0700
Message-ID: <20230708191212.4147700-1-surenb@google.com>
Subject: [PATCH v2 1/3] mm: lock a vma before stack expansion
From: Suren Baghdasaryan
To: torvalds@linux-foundation.org
Cc: akpm@linux-foundation.org, regressions@leemhuis.info,
 bagasdotme@gmail.com, jacobly.alt@gmail.com, willy@infradead.org,
 liam.howlett@oracle.com, david@redhat.com, peterx@redhat.com,
 ldufour@linux.ibm.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linuxppc-dev@lists.ozlabs.org, linux-arm-kernel@lists.infradead.org,
 gregkh@linuxfoundation.org, regressions@lists.linux.dev,
 Suren Baghdasaryan, stable@vger.kernel.org

With recent changes necessitating mmap_lock to be held for write while
expanding a stack, per-VMA locks should follow the same rules and be
write-locked to prevent page faults into the VMA being expanded. Add
the necessary locking.

Cc: stable@vger.kernel.org
Signed-off-by: Suren Baghdasaryan
---
 mm/mmap.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index 204ddcd52625..c66e4622a557 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1977,6 +1977,8 @@ static int expand_upwards(struct vm_area_struct *vma, unsigned long address)
 		return -ENOMEM;
 	}
 
+	/* Lock the VMA before expanding to prevent concurrent page faults */
+	vma_start_write(vma);
 	/*
 	 * vma->vm_start/vm_end cannot change under us because the caller
 	 * is required to hold the mmap_lock in read mode. We need the
@@ -2064,6 +2066,8 @@ int expand_downwards(struct vm_area_struct *vma, unsigned long address)
 		return -ENOMEM;
 	}
 
+	/* Lock the VMA before expanding to prevent concurrent page faults */
+	vma_start_write(vma);
 	/*
 	 * vma->vm_start/vm_end cannot change under us because the caller
 	 * is required to hold the mmap_lock in read mode. We need the

From patchwork Sat Jul 8 19:12:11 2023
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13305705
Date: Sat, 8 Jul 2023 12:12:11 -0700
In-Reply-To: <20230708191212.4147700-1-surenb@google.com>
References: <20230708191212.4147700-1-surenb@google.com>
Message-ID: <20230708191212.4147700-2-surenb@google.com>
Subject: [PATCH v2 2/3] mm: lock newly mapped VMA which can be modified
 after it becomes visible
From: Suren Baghdasaryan
To: torvalds@linux-foundation.org
Cc: akpm@linux-foundation.org, regressions@leemhuis.info,
 bagasdotme@gmail.com, jacobly.alt@gmail.com, willy@infradead.org,
 liam.howlett@oracle.com, david@redhat.com, peterx@redhat.com,
 ldufour@linux.ibm.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linuxppc-dev@lists.ozlabs.org, linux-arm-kernel@lists.infradead.org,
 gregkh@linuxfoundation.org, regressions@lists.linux.dev,
 Suren Baghdasaryan, stable@vger.kernel.org

mmap_region adds a newly created VMA into VMA tree and might modify it
afterwards before dropping the mmap_lock. This poses a problem for page
faults handled under per-VMA locks because they don't take the mmap_lock
and can stumble on this VMA while it's still being modified. Currently
this does not pose a problem since post-addition modifications are done
only for file-backed VMAs, which are not handled under per-VMA lock.
However, once support for handling file-backed page faults with per-VMA
locks is added, this will become a race.
Fix this by write-locking the VMA before inserting it into the VMA tree.
Other places where a new VMA is added into VMA tree do not modify it
after the insertion, so do not need the same locking.

Cc: stable@vger.kernel.org
Signed-off-by: Suren Baghdasaryan
---
 mm/mmap.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/mmap.c b/mm/mmap.c
index c66e4622a557..84c71431a527 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -2812,6 +2812,8 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	if (vma->vm_file)
 		i_mmap_lock_write(vma->vm_file->f_mapping);
 
+	/* Lock the VMA since it is modified after insertion into VMA tree */
+	vma_start_write(vma);
 	vma_iter_store(&vmi, vma);
 	mm->map_count++;
 	if (vma->vm_file) {

From patchwork Sat Jul 8 19:12:12 2023
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13305707
Date: Sat, 8 Jul 2023 12:12:12 -0700
In-Reply-To: <20230708191212.4147700-1-surenb@google.com>
References: <20230708191212.4147700-1-surenb@google.com>
Message-ID: <20230708191212.4147700-3-surenb@google.com>
Subject: [PATCH v2 3/3] fork: lock VMAs of the parent process when forking
From: Suren Baghdasaryan
To: torvalds@linux-foundation.org
Cc: akpm@linux-foundation.org, regressions@leemhuis.info,
 bagasdotme@gmail.com, jacobly.alt@gmail.com, willy@infradead.org,
 liam.howlett@oracle.com, david@redhat.com, peterx@redhat.com,
 ldufour@linux.ibm.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 linuxppc-dev@lists.ozlabs.org, linux-arm-kernel@lists.infradead.org,
 gregkh@linuxfoundation.org, regressions@lists.linux.dev,
 Suren Baghdasaryan, Jiri Slaby, Holger Hoffstätte,
 stable@vger.kernel.org

When forking a child process, parent write-protects an anonymous page
and COW-shares it with the child being forked using copy_present_pte().
Parent's TLB is flushed right before we drop the parent's mmap_lock in
dup_mmap(). If we get a write-fault before that TLB flush in the parent,
and we end up replacing that anonymous page in the parent process in
do_wp_page() (because, COW-shared with the child), this might lead to
some stale writable TLB entries targeting the wrong (old) page.
Similar issue happened in the past with userfaultfd (see flush_tlb_page()
call inside do_wp_page()).
Lock VMAs of the parent process when forking a child, which prevents
concurrent page faults during fork operation and avoids this issue.
This fix can potentially regress some fork-heavy workloads. Kernel build
time did not show noticeable regression on a 56-core machine while a
stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~5% regression. If such fork time regression is unacceptable,
disabling CONFIG_PER_VMA_LOCK should restore its performance. Further
optimizations are possible if this regression proves to be problematic.

Suggested-by: David Hildenbrand
Reported-by: Jiri Slaby
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Holger Hoffstätte
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/
Reported-by: Jacob Young
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Cc: stable@vger.kernel.org
Signed-off-by: Suren Baghdasaryan
---
 kernel/fork.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/fork.c b/kernel/fork.c
index b85814e614a5..d2e12b6d2b18 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -686,6 +686,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 	for_each_vma(old_vmi, mpnt) {
 		struct file *file;
 
+		vma_start_write(mpnt);
 		if (mpnt->vm_flags & VM_DONTCOPY) {
 			vm_stat_account(mm, mpnt->vm_flags, -vma_pages(mpnt));
 			continue;
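
The commit message for patch 3/3 quotes a stress test that maps 10000 VMAs and forks 5000 times in a tight loop, but the test itself is not part of the series. The following is only a rough userspace sketch of what such a test might look like, not the author's program. The helper names (map_vmas, fork_loop, run_fork_stress) are hypothetical, and alternating page protections is an assumed way to keep neighbouring anonymous mappings from being merged into a single VMA.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/*
 * Map nr single-page anonymous regions and return how many succeeded.
 * Assumption: alternating protections give neighbouring mappings
 * different flags, so the kernel should not merge them into one VMA.
 */
static int map_vmas(int nr)
{
	long page = sysconf(_SC_PAGESIZE);
	int i;

	for (i = 0; i < nr; i++) {
		int prot = (i & 1) ? PROT_READ : (PROT_READ | PROT_WRITE);
		char *p = mmap(NULL, (size_t)page, prot,
			       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

		if (p == MAP_FAILED)
			break;
		if (prot & PROT_WRITE)
			p[0] = 1;	/* populate writable pages before forking */
	}
	return i;
}

/* Fork nr children that exit immediately; return how many were reaped. */
static int fork_loop(int nr)
{
	int i;

	for (i = 0; i < nr; i++) {
		pid_t pid = fork();

		if (pid < 0)
			break;
		if (pid == 0)
			_exit(0);
		if (waitpid(pid, NULL, 0) < 0)
			break;
	}
	return i;
}

/*
 * Hypothetical driver: create the mappings, then time only the fork loop.
 * Returns 0 and stores the elapsed seconds in *secs on success, -1 if
 * fewer mappings or forks than requested could be completed.
 */
int run_fork_stress(int nr_vmas, int nr_forks, double *secs)
{
	struct timespec t0, t1;

	assert(nr_vmas >= 0 && nr_forks >= 0 && secs != NULL);

	if (map_vmas(nr_vmas) != nr_vmas)
		return -1;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	if (fork_loop(nr_forks) != nr_forks)
		return -1;
	clock_gettime(CLOCK_MONOTONIC, &t1);

	*secs = (double)(t1.tv_sec - t0.tv_sec) +
		(double)(t1.tv_nsec - t0.tv_nsec) / 1e9;
	return 0;
}
```

Calling run_fork_stress(10000, 5000, &secs) from a small main() and comparing the reported time on kernels built with and without CONFIG_PER_VMA_LOCK would roughly approximate the comparison described in the commit message. Absolute numbers depend on the machine, and this sketch makes no claim to reproduce the ~5% figure.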