From patchwork Wed Jul 5 17:12:11 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Suren Baghdasaryan
X-Patchwork-Id: 13302420
Date: Wed, 5 Jul 2023 10:12:11 -0700
In-Reply-To: <20230705171213.2843068-1-surenb@google.com>
References: <20230705171213.2843068-1-surenb@google.com>
X-Mailer: git-send-email 2.41.0.255.g8b1d071c50-goog
Message-ID: <20230705171213.2843068-2-surenb@google.com>
Subject: [PATCH v3 1/2] fork: lock VMAs of the parent process when forking
From: Suren Baghdasaryan
To: akpm@linux-foundation.org
Cc: jirislaby@kernel.org, jacobly.alt@gmail.com,
 holger@applied-asynchrony.com, hdegoede@redhat.com, michel@lespinasse.org,
 jglisse@google.com, mhocko@suse.com, vbabka@suse.cz, hannes@cmpxchg.org,
 mgorman@techsingularity.net, dave@stgolabs.net, willy@infradead.org,
 liam.howlett@oracle.com, peterz@infradead.org, ldufour@linux.ibm.com,
 paulmck@kernel.org, mingo@redhat.com, will@kernel.org, luto@kernel.org,
 songliubraving@fb.com, peterx@redhat.com, david@redhat.com,
 dhowells@redhat.com, hughd@google.com, bigeasy@linutronix.de,
 kent.overstreet@linux.dev, punit.agrawal@bytedance.com, lstoakes@gmail.com,
 peterjung1337@gmail.com, rientjes@google.com, chriscli@google.com,
 axelrasmussen@google.com, joelaf@google.com, minchan@google.com,
 rppt@kernel.org, jannh@google.com, shakeelb@google.com, tatashin@google.com,
 edumazet@google.com, gthelen@google.com, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, stable@vger.kernel.org, Suren Baghdasaryan
Sender: owner-linux-mm@kvack.org
Precedence: bulk

When forking a child process, the parent write-protects an anonymous page
and COW-shares it with the child being forked using copy_present_pte().
The parent's TLB is flushed right before we drop the parent's mmap_lock in
dup_mmap(). If we get a write-fault in the parent before that TLB flush,
and we end up replacing that anonymous page in the parent process in
do_wp_page() (because it is COW-shared with the child), this might leave
some stale writable TLB entries targeting the wrong (old) page.
A similar issue happened in the past with userfaultfd (see the
flush_tlb_page() call inside do_wp_page()).

Lock the VMAs of the parent process when forking a child, which prevents
concurrent page faults during the fork operation and avoids this issue.

This fix can potentially regress some fork-heavy workloads. Kernel build
time did not show a noticeable regression on a 56-core machine, while a
stress test mapping 10000 VMAs and forking 5000 times in a tight loop
shows ~5% regression.
If such fork time regression is unacceptable, disabling
CONFIG_PER_VMA_LOCK should restore its performance. Further
optimizations are possible if this regression proves to be problematic.

Suggested-by: David Hildenbrand
Reported-by: Jiri Slaby
Closes: https://lore.kernel.org/all/dbdef34c-3a07-5951-e1ae-e9c6e3cdf51b@kernel.org/
Reported-by: Holger Hoffstätte
Closes: https://lore.kernel.org/all/b198d649-f4bf-b971-31d0-e8433ec2a34c@applied-asynchrony.com/
Reported-by: Jacob Young
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217624
Fixes: 0bff0aaea03e ("x86/mm: try VMA lock-based page fault handling first")
Cc: stable@vger.kernel.org
Signed-off-by: Suren Baghdasaryan
Acked-by: David Hildenbrand
---
 kernel/fork.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/kernel/fork.c b/kernel/fork.c
index b85814e614a5..403bc2b72301 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -658,6 +658,12 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 		retval = -EINTR;
 		goto fail_uprobe_end;
 	}
+#ifdef CONFIG_PER_VMA_LOCK
+	/* Disallow any page faults before calling flush_cache_dup_mm */
+	for_each_vma(old_vmi, mpnt)
+		vma_start_write(mpnt);
+	vma_iter_init(&old_vmi, oldmm, 0);
+#endif
 	flush_cache_dup_mm(oldmm);
 	uprobe_dup_mmap(oldmm, mm);
 	/*