From patchwork Thu Sep 26 06:46:13 2024
X-Patchwork-Submitter: Qi Zheng
X-Patchwork-Id: 13812871
From: Qi Zheng
To: david@redhat.com, hughd@google.com, willy@infradead.org, muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org, rppt@kernel.org, vishal.moola@gmail.com, peterx@redhat.com, ryan.roberts@arm.com, christophe.leroy2@cs-soprasteria.com
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, linuxppc-dev@lists.ozlabs.org, Qi Zheng
Subject: [PATCH v5 00/13] introduce pte_offset_map_{ro|rw}_nolock()
Date: Thu, 26 Sep 2024 14:46:13 +0800

Changes in v5:
- directly pass pmdvalp to __pte_offset_map() in pte_offset_map_rw_nolock() (Muchun Song)
- fix the handling of the folio reference and the mm counter in [PATCH v4 07/13] (Muchun Song)
- directly pass pmdvalp to pte_offset_map_rw_nolock() in map_pte() (Muchun Song)
- collect the Acked-bys and Reviewed-bys
- rebase onto the next-20240926

Changes in v4:
- arm: adjust_pte() use pte_offset_map_rw_nolock() (use ptl != vmf->ptl to check if we are using split PTE locks)
- mm: khugepaged: collapse_pte_mapped_thp() use pte_offset_map_rw_nolock() (move the pte_unmap() backward)
- mm: copy_pte_range() use pte_offset_map_rw_nolock() (remove the pmd_same() check)
- mm: mremap: move_ptes() use pte_offset_map_rw_nolock() (remove the pmd_same() check)
- mm: page_vma_mapped_walk: map_pte() use pte_offset_map_rw_nolock() (move the assignment to pvmw->ptl backward)
- remove [PATCH v3 14/14] (will be sent as a separate patch)
- reorder patches
- collect the Reviewed-bys
- rebase onto the next-20240923

Changes in v3:
- change to use VM_WARN_ON_ONCE() instead of BUG_ON() in pte_offset_map_rw_nolock() (David Hildenbrand)
- modify the comment above pte_offset_map_lock() in [PATCH v2 01/14] (David Hildenbrand and Muchun Song)
- modify the comment above pte_offset_map_rw_nolock() in [PATCH v2 06/14] (David Hildenbrand and Muchun Song)
- also perform a pmd_same() check in [PATCH v2 08/14] and [PATCH v2 09/14] (since we may free the PTE page in retract_page_tables() without holding the read lock of mmap_lock)
- collect the Acked-bys and Reviewed-bys
- rebase onto the next-20240904

Changes in v2:
- rename pte_offset_map_{readonly|maywrite}_nolock() to pte_offset_map_{ro|rw}_nolock() (LEROY Christophe)
- make pte_offset_map_rw_nolock() not accept NULL parameters (David Hildenbrand)
- rebase onto the next-20240822

Hi all,

As proposed by David Hildenbrand [1], this series introduces the following two new helper functions to replace pte_offset_map_nolock():

1. pte_offset_map_ro_nolock()
2. pte_offset_map_rw_nolock()

As the name suggests, pte_offset_map_ro_nolock() is used for the read-only case. In this case, only read-only operations will be performed on the PTE page after the PTL is held. The RCU read lock taken in pte_offset_map_nolock() already ensures that the PTE page will not be freed, so there is no need to worry about whether the pmd entry has been modified. Therefore, pte_offset_map_ro_nolock() is just a renamed version of pte_offset_map_nolock().

pte_offset_map_rw_nolock() is used for the may-write case. In this case, the pte or pmd entry may be modified after the PTL is held, so we need to ensure that the pmd entry has not been modified concurrently. So in addition to the name change, it also outputs the pmdval on success.
Callers should make sure the page table is stable, by checking pte_same() or by checking pmd_same() against the output pmdval, before performing any write operations.

This series converts all pte_offset_map_nolock() callers to the above two helper functions one by one, and finally deletes pte_offset_map_nolock() completely. This is also a preparation for reclaiming empty user PTE page table pages.

This series is based on next-20240926.

Comments and suggestions are welcome!

Thanks,
Qi

Qi Zheng (13):
  mm: pgtable: introduce pte_offset_map_{ro|rw}_nolock()
  powerpc: assert_pte_locked() use pte_offset_map_ro_nolock()
  mm: filemap: filemap_fault_recheck_pte_none() use pte_offset_map_ro_nolock()
  mm: khugepaged: __collapse_huge_page_swapin() use pte_offset_map_ro_nolock()
  arm: adjust_pte() use pte_offset_map_rw_nolock()
  mm: handle_pte_fault() use pte_offset_map_rw_nolock()
  mm: khugepaged: collapse_pte_mapped_thp() use pte_offset_map_rw_nolock()
  mm: copy_pte_range() use pte_offset_map_rw_nolock()
  mm: mremap: move_ptes() use pte_offset_map_rw_nolock()
  mm: page_vma_mapped_walk: map_pte() use pte_offset_map_rw_nolock()
  mm: userfaultfd: move_pages_pte() use pte_offset_map_rw_nolock()
  mm: multi-gen LRU: walk_pte_range() use pte_offset_map_rw_nolock()
  mm: pgtable: remove pte_offset_map_nolock()

 Documentation/mm/split_page_table_lock.rst |  6 ++-
 arch/arm/mm/fault-armv.c                   | 53 +++++++++-------------
 arch/powerpc/mm/pgtable.c                  |  2 +-
 include/linux/mm.h                         |  7 ++-
 mm/filemap.c                               |  4 +-
 mm/khugepaged.c                            | 24 +++++++---
 mm/memory.c                                | 25 ++++++++--
 mm/mremap.c                                | 11 ++++-
 mm/page_vma_mapped.c                       | 24 +++++++---
 mm/pgtable-generic.c                       | 41 ++++++++++++---
 mm/userfaultfd.c                           | 15 ++++--
 mm/vmscan.c                                |  9 +++-
 12 files changed, 157 insertions(+), 64 deletions(-)
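For readers following along, here is a rough sketch of the may-write pattern the cover letter describes. It is kernel-context code and will not compile on its own; the caller function, error handling, and the way the pmd pointer is obtained are hypothetical, and only the pte_offset_map_rw_nolock() signature follows this series:

```c
/*
 * Hypothetical caller illustrating the may-write pattern:
 * pte_offset_map_rw_nolock() outputs pmdval, and after taking the
 * PTL the caller revalidates with pmd_same(), since the PTE page
 * could have been freed or replaced before the lock was acquired.
 */
static int example_write_pte(struct mm_struct *mm, unsigned long addr,
			     pte_t newpte)
{
	pmd_t *pmdp = pmd_off(mm, addr);	/* assumed lookup */
	spinlock_t *ptl;
	pmd_t pmdval;
	pte_t *pte;

	pte = pte_offset_map_rw_nolock(mm, pmdp, addr, &pmdval, &ptl);
	if (!pte)
		return -EAGAIN;

	spin_lock(ptl);
	/* Has the pmd entry changed since the PTE page was mapped? */
	if (unlikely(!pmd_same(pmdval, pmdp_get_lockless(pmdp)))) {
		pte_unmap_unlock(pte, ptl);
		return -EAGAIN;
	}

	set_pte_at(mm, addr, pte, newpte);	/* now safe to write */
	pte_unmap_unlock(pte, ptl);
	return 0;
}
```

The read-only variant skips the pmdval output and the pmd_same() recheck entirely, which is why pte_offset_map_ro_nolock() can be a pure rename of pte_offset_map_nolock().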