From patchwork Sun Apr 3 05:39:52 2022
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 12799480
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song,
    Christoph Hellwig
Subject: [PATCH v7 1/6] mm: rmap: fix cache flush on THP pages
Date: Sun, 3 Apr 2022 13:39:52 +0800
Message-Id: <20220403053957.10770-2-songmuchun@bytedance.com>
In-Reply-To: <20220403053957.10770-1-songmuchun@bytedance.com>
References: <20220403053957.10770-1-songmuchun@bytedance.com>

flush_cache_page() only removes a PAGE_SIZE sized range from the cache.
However, it does not cover the full pages of a THP except for the head
page. Replace it with flush_cache_range() to fix this issue. At least,
no problems were found due to this in practice, perhaps because few
architectures have virtually indexed caches.

Fixes: f27176cfc363 ("mm: convert page_mkclean_one() to use page_vma_mapped_walk()")
Signed-off-by: Muchun Song
Reviewed-by: Yang Shi
Reviewed-by: Dan Williams
Reviewed-by: Christoph Hellwig
---
 mm/rmap.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index fc46a3d7b704..723682ddb9e8 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -970,7 +970,8 @@ static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
 			if (!pmd_dirty(*pmd) && !pmd_write(*pmd))
 				continue;

-			flush_cache_page(vma, address, folio_pfn(folio));
+			flush_cache_range(vma, address,
+					  address + HPAGE_PMD_SIZE);
 			entry = pmdp_invalidate(vma, address, pmd);
 			entry = pmd_wrprotect(entry);
 			entry = pmd_mkclean(entry);
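For context on the fix above: flush_cache_page() flushes a single
PAGE_SIZE range, so on architectures with virtually indexed caches the
old code left every subpage of the THP other than the head page
unflushed. A minimal sketch of what full coverage means
(flush_cache_thp() is a hypothetical helper named here for illustration,
not a kernel API; it assumes a PMD-mapped THP of HPAGE_PMD_NR subpages):

	/*
	 * Illustration only: flushing each of the HPAGE_PMD_NR subpages
	 * covers the same span as the single flush_cache_range() call
	 * that the patch now uses.
	 */
	static void flush_cache_thp(struct vm_area_struct *vma,
				    unsigned long address, unsigned long pfn)
	{
		unsigned long i;

		for (i = 0; i < HPAGE_PMD_NR; i++)
			flush_cache_page(vma, address + i * PAGE_SIZE,
					 pfn + i);
	}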
From patchwork Sun Apr 3 05:39:53 2022
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 12799482
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song,
    Christoph Hellwig
Subject: [PATCH v7 2/6] dax: fix cache flush on PMD-mapped pages
Date: Sun, 3 Apr 2022 13:39:53 +0800
Message-Id: <20220403053957.10770-3-songmuchun@bytedance.com>
In-Reply-To: <20220403053957.10770-1-songmuchun@bytedance.com>
References: <20220403053957.10770-1-songmuchun@bytedance.com>

flush_cache_page() only removes a PAGE_SIZE sized range from the cache.
However, it does not cover the full pages of a THP except for the head
page. Replace it with flush_cache_range() to fix this issue. This is
just a documentation issue with respect to properly documenting the
expected usage of cache flushing before modifying the pmd.
However, in practice this is not a problem, because DAX is not available
on architectures with virtually indexed caches, per:

  commit d92576f1167c ("dax: does not work correctly with virtual aliasing caches")

Fixes: f729c8c9b24f ("dax: wrprotect pmd_t in dax_mapping_entry_mkclean")
Signed-off-by: Muchun Song
Reviewed-by: Dan Williams
Reviewed-by: Christoph Hellwig
---
 fs/dax.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/fs/dax.c b/fs/dax.c
index 67a08a32fccb..a372304c9695 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -845,7 +845,8 @@ static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
 			if (!pmd_dirty(*pmdp) && !pmd_write(*pmdp))
 				goto unlock_pmd;

-			flush_cache_page(vma, address, pfn);
+			flush_cache_range(vma, address,
+					  address + HPAGE_PMD_SIZE);
 			pmd = pmdp_invalidate(vma, address, pmdp);
 			pmd = pmd_wrprotect(pmd);
 			pmd = pmd_mkclean(pmd);
From patchwork Sun Apr 3 05:39:54 2022
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 12799481
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song
Subject: [PATCH v7 3/6] mm: rmap: introduce pfn_mkclean_range() to clean PTEs
Date: Sun, 3 Apr 2022 13:39:54 +0800
Message-Id: <20220403053957.10770-4-songmuchun@bytedance.com>
In-Reply-To: <20220403053957.10770-1-songmuchun@bytedance.com>
References: <20220403053957.10770-1-songmuchun@bytedance.com>

page_mkclean_one() is supposed to be used with a pfn that has an
associated struct page, but not all pfns (e.g. DAX) have one. Introduce
a new function, pfn_mkclean_range(), to clean the PTEs (including PMDs)
mapped with a range of pfns that have no struct page associated with
them. This helper will be used by the DAX device in the next patch to
make pfns clean.

Signed-off-by: Muchun Song
---
 include/linux/rmap.h |  3 +++
 mm/internal.h        | 26 +++++++++++++--------
 mm/rmap.c            | 65 +++++++++++++++++++++++++++++++++++++++++++---------
 3 files changed, 74 insertions(+), 20 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index b58ddb8b2220..a6ec0d3e40c1 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -263,6 +263,9 @@ unsigned long page_address_in_vma(struct page *, struct vm_area_struct *);
  */
 int folio_mkclean(struct folio *);

+int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
+		      struct vm_area_struct *vma);
+
 void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked);

 /*
diff --git a/mm/internal.h b/mm/internal.h
index f45292dc4ef5..664e6d48607c 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -516,26 +516,22 @@ void mlock_page_drain(int cpu);
 extern pmd_t maybe_pmd_mkwrite(pmd_t pmd, struct vm_area_struct *vma);

 /*
- * At what user virtual address is page expected in vma?
- * Returns -EFAULT if all of the page is outside the range of vma.
- * If page is a compound head, the entire compound page is considered.
+ * Return the start of user virtual address at the specific offset within
+ * a vma.
  */
 static inline unsigned long
-vma_address(struct page *page, struct vm_area_struct *vma)
+vma_pgoff_address(pgoff_t pgoff, unsigned long nr_pages,
+		  struct vm_area_struct *vma)
 {
-	pgoff_t pgoff;
 	unsigned long address;

-	VM_BUG_ON_PAGE(PageKsm(page), page);	/* KSM page->index unusable */
-	pgoff = page_to_pgoff(page);
 	if (pgoff >= vma->vm_pgoff) {
 		address = vma->vm_start +
 			((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
 		/* Check for address beyond vma (or wrapped through 0?) */
 		if (address < vma->vm_start || address >= vma->vm_end)
 			address = -EFAULT;
-	} else if (PageHead(page) &&
-		   pgoff + compound_nr(page) - 1 >= vma->vm_pgoff) {
+	} else if (pgoff + nr_pages - 1 >= vma->vm_pgoff) {
 		/* Test above avoids possibility of wrap to 0 on 32-bit */
 		address = vma->vm_start;
 	} else {
@@ -545,6 +541,18 @@ vma_address(struct page *page, struct vm_area_struct *vma)
 }

 /*
+ * Return the start of user virtual address of a page within a vma.
+ * Returns -EFAULT if all of the page is outside the range of vma.
+ * If page is a compound head, the entire compound page is considered.
+ */
+static inline unsigned long
+vma_address(struct page *page, struct vm_area_struct *vma)
+{
+	VM_BUG_ON_PAGE(PageKsm(page), page);	/* KSM page->index unusable */
+	return vma_pgoff_address(page_to_pgoff(page), compound_nr(page), vma);
+}
+
+/*
  * Then at what user virtual address will none of the range be found in vma?
  * Assumes that vma_address() already returned a good starting address.
  */
diff --git a/mm/rmap.c b/mm/rmap.c
index 723682ddb9e8..ad5cf0e45a73 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -929,12 +929,12 @@ int folio_referenced(struct folio *folio, int is_locked,
 	return pra.referenced;
 }

-static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
-			     unsigned long address, void *arg)
+static int page_vma_mkclean_one(struct page_vma_mapped_walk *pvmw)
 {
-	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, PVMW_SYNC);
+	int cleaned = 0;
+	struct vm_area_struct *vma = pvmw->vma;
 	struct mmu_notifier_range range;
-	int *cleaned = arg;
+	unsigned long address = pvmw->address;

 	/*
 	 * We have to assume the worse case ie pmd for invalidation. Note that
@@ -942,16 +942,16 @@ static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
 	 */
 	mmu_notifier_range_init(&range, MMU_NOTIFY_PROTECTION_PAGE, 0, vma,
 				vma->vm_mm, address,
-				vma_address_end(&pvmw));
+				vma_address_end(pvmw));
 	mmu_notifier_invalidate_range_start(&range);

-	while (page_vma_mapped_walk(&pvmw)) {
+	while (page_vma_mapped_walk(pvmw)) {
 		int ret = 0;

-		address = pvmw.address;
-		if (pvmw.pte) {
+		address = pvmw->address;
+		if (pvmw->pte) {
 			pte_t entry;
-			pte_t *pte = pvmw.pte;
+			pte_t *pte = pvmw->pte;

 			if (!pte_dirty(*pte) && !pte_write(*pte))
 				continue;
@@ -964,7 +964,7 @@ static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
 			ret = 1;
 		} else {
 #ifdef CONFIG_TRANSPARENT_HUGEPAGE
-			pmd_t *pmd = pvmw.pmd;
+			pmd_t *pmd = pvmw->pmd;
 			pmd_t entry;

 			if (!pmd_dirty(*pmd) && !pmd_write(*pmd))
@@ -991,11 +991,22 @@ static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
 		 * See Documentation/vm/mmu_notifier.rst
 		 */
 		if (ret)
-			(*cleaned)++;
+			cleaned++;
 	}

 	mmu_notifier_invalidate_range_end(&range);

+	return cleaned;
+}
+
+static bool page_mkclean_one(struct folio *folio, struct vm_area_struct *vma,
+			     unsigned long address, void *arg)
+{
+	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, PVMW_SYNC);
+	int *cleaned = arg;
+
+	*cleaned += page_vma_mkclean_one(&pvmw);
+
 	return true;
 }

@@ -1033,6 +1044,38 @@ int folio_mkclean(struct folio *folio)
 EXPORT_SYMBOL_GPL(folio_mkclean);

 /**
+ * pfn_mkclean_range - Cleans the PTEs (including PMDs) mapped with range of
+ *                     [@pfn, @pfn + @nr_pages) at the specific offset (@pgoff)
+ *                     within the @vma of shared mappings. And since clean PTEs
+ *                     should also be readonly, write protects them too.
+ * @pfn: start pfn.
+ * @nr_pages: number of physically contiguous pages starting with @pfn.
+ * @pgoff: page offset that the @pfn mapped with.
+ * @vma: vma that @pfn mapped within.
+ *
+ * Returns the number of cleaned PTEs (including PMDs).
+ */
+int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
+		      struct vm_area_struct *vma)
+{
+	struct page_vma_mapped_walk pvmw = {
+		.pfn		= pfn,
+		.nr_pages	= nr_pages,
+		.pgoff		= pgoff,
+		.vma		= vma,
+		.flags		= PVMW_SYNC,
+	};
+
+	if (invalid_mkclean_vma(vma, NULL))
+		return 0;
+
+	pvmw.address = vma_pgoff_address(pgoff, nr_pages, vma);
+	VM_BUG_ON_VMA(pvmw.address == -EFAULT, vma);
+
+	return page_vma_mkclean_one(&pvmw);
+}
+
+/**
  * page_move_anon_rmap - move a page to our anon_vma
  * @page: the page to move to our anon_vma
  * @vma: the vma the page belongs to
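For reference, a usage sketch of the new helper: patch 5 of this series
calls it for every shared mapping of a file range, roughly as below
(mkclean_file_range() is a hypothetical wrapper name used only for this
sketch; the real caller is dax_writeback_one() in patch 5):

	/*
	 * Sketch: write-protect and clean all PTEs/PMDs of the pfn range
	 * [pfn, pfn + count) mapped at file offsets [index, end].
	 */
	static void mkclean_file_range(struct address_space *mapping,
				       unsigned long pfn, unsigned long count,
				       pgoff_t index, pgoff_t end)
	{
		struct vm_area_struct *vma;

		i_mmap_lock_read(mapping);
		vma_interval_tree_foreach(vma, &mapping->i_mmap, index, end) {
			pfn_mkclean_range(pfn, count, index, vma);
			cond_resched();
		}
		i_mmap_unlock_read(mapping);
	}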
From patchwork Sun Apr 3 05:39:55 2022
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 12799483
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song
Subject: [PATCH v7 4/6] mm: pvmw: add support for walking devmap pages
Date: Sun, 3 Apr 2022 13:39:55 +0800
Message-Id: <20220403053957.10770-5-songmuchun@bytedance.com>
In-Reply-To: <20220403053957.10770-1-songmuchun@bytedance.com>
References: <20220403053957.10770-1-songmuchun@bytedance.com>

Devmap pages cannot currently use page_vma_mapped_walk() to check
whether a huge devmap page is mapped into a vma. Add support for walking
huge devmap pages so that DAX can use it in the next patch.

Signed-off-by: Muchun Song
---
 mm/page_vma_mapped.c | 17 +++++++++--------
 1 file changed, 9 insertions(+), 8 deletions(-)

diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 1187f9c1ec5b..3da82bf65de8 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -210,16 +210,10 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 		 */
 		pmde = READ_ONCE(*pvmw->pmd);

-		if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde)) {
+		if (pmd_trans_huge(pmde) || is_pmd_migration_entry(pmde) ||
+		    (pmd_present(pmde) && pmd_devmap(pmde))) {
 			pvmw->ptl = pmd_lock(mm, pvmw->pmd);
 			pmde = *pvmw->pmd;
-			if (likely(pmd_trans_huge(pmde))) {
-				if (pvmw->flags & PVMW_MIGRATION)
-					return not_found(pvmw);
-				if (!check_pmd(pmd_pfn(pmde), pvmw))
-					return not_found(pvmw);
-				return true;
-			}
 			if (!pmd_present(pmde)) {
 				swp_entry_t entry;

@@ -232,6 +226,13 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 					return not_found(pvmw);
 				return true;
 			}
+			if (likely(pmd_trans_huge(pmde) || pmd_devmap(pmde))) {
+				if (pvmw->flags & PVMW_MIGRATION)
+					return not_found(pvmw);
+				if (!check_pmd(pmd_pfn(pmde), pvmw))
+					return not_found(pvmw);
+				return true;
+			}
 			/* THP pmd was split under us: handle on pte level */
 			spin_unlock(pvmw->ptl);
 			pvmw->ptl = NULL;
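After this change, the dispatch under the pmd lock distinguishes three
cases, with the non-present check done first. In outline (a sketch of
the control flow above, with the branch bodies replaced by comments; it
is not compilable on its own):

	pvmw->ptl = pmd_lock(mm, pvmw->pmd);
	pmde = *pvmw->pmd;
	if (!pmd_present(pmde)) {
		/* migration entry: match only for PVMW_MIGRATION walks */
	} else if (pmd_trans_huge(pmde) || pmd_devmap(pmde)) {
		/*
		 * THP or huge devmap mapping: compare pmd_pfn() against
		 * the pfn range being walked via check_pmd()
		 */
	} else {
		/* pmd was split under us: drop ptl, walk at pte level */
	}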
From patchwork Sun Apr 3 05:39:56 2022
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 12799484
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song,
    Christoph Hellwig
Subject: [PATCH v7 5/6] dax: fix missing writeprotect of the pte entry
Date: Sun, 3 Apr 2022 13:39:56 +0800
Message-Id: <20220403053957.10770-6-songmuchun@bytedance.com>
In-Reply-To: <20220403053957.10770-1-songmuchun@bytedance.com>
References: <20220403053957.10770-1-songmuchun@bytedance.com>

Currently dax_entry_mkclean() fails to clean and write protect the pte
entry within a DAX PMD entry during an *sync operation. This can result
in data loss in the following sequence:

1) process A mmap write to DAX PMD, dirtying PMD radix tree entry and
   making the pmd entry dirty and writeable.
2) process B mmap with the @offset (e.g. 4K) and @length (e.g. 4K)
   write to the same file, dirtying PMD radix tree entry (already done
   in 1)) and making the pte entry dirty and writeable.
3) fsync, flushing out PMD data and cleaning the radix tree entry.
   We currently fail to mark the pte entry as clean and write protected
   since the vma of process B is not covered in dax_entry_mkclean().
4) process B writes to the pte. These writes do not cause any page
   faults since the pte entry is dirty and writeable. The radix tree
   entry remains clean.
5) fsync, which fails to flush the dirty PMD data because the radix
   tree entry was clean.
6) crash - dirty data that should have been fsync'd as part of 5)
   could still have been in the processor cache, and is lost.

Use pfn_mkclean_range() to clean the pfns to fix this issue.

Fixes: 4b4bb46d00b3 ("dax: clear dirty entry tags on cache flush")
Signed-off-by: Muchun Song
Reviewed-by: Christoph Hellwig
---
 fs/dax.c | 99 ++++++++--------------------------------------------------------
 1 file changed, 12 insertions(+), 87 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index a372304c9695..1ac12e877f4f 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -24,6 +24,7 @@
 #include <linux/sizes.h>
 #include <linux/mmu_notifier.h>
 #include <linux/iomap.h>
+#include <linux/rmap.h>
 #include <asm/pgalloc.h>

 #define CREATE_TRACE_POINTS
@@ -789,96 +790,12 @@ static void *dax_insert_entry(struct xa_state *xas,
 	return entry;
 }

-static inline
-unsigned long pgoff_address(pgoff_t pgoff, struct vm_area_struct *vma)
-{
-	unsigned long address;
-
-	address = vma->vm_start + ((pgoff - vma->vm_pgoff) << PAGE_SHIFT);
-	VM_BUG_ON_VMA(address < vma->vm_start || address >= vma->vm_end, vma);
-	return address;
-}
-
-/* Walk all mappings of a given index of a file and writeprotect them */
-static void dax_entry_mkclean(struct address_space *mapping, pgoff_t index,
-		unsigned long pfn)
-{
-	struct vm_area_struct *vma;
-	pte_t pte, *ptep = NULL;
-	pmd_t *pmdp = NULL;
-	spinlock_t *ptl;
-
-	i_mmap_lock_read(mapping);
-	vma_interval_tree_foreach(vma, &mapping->i_mmap, index, index) {
-		struct mmu_notifier_range range;
-		unsigned long address;
-
-		cond_resched();
-
-		if (!(vma->vm_flags & VM_SHARED))
-			continue;
-
-		address = pgoff_address(index, vma);
-
-		/*
-		 * follow_invalidate_pte() will use the range to call
-		 * mmu_notifier_invalidate_range_start() on our behalf before
-		 * taking any lock.
-		 */
-		if (follow_invalidate_pte(vma->vm_mm, address, &range, &ptep,
-					  &pmdp, &ptl))
-			continue;
-
-		/*
-		 * No need to call mmu_notifier_invalidate_range() as we are
-		 * downgrading page table protection not changing it to point
-		 * to a new page.
-		 *
-		 * See Documentation/vm/mmu_notifier.rst
-		 */
-		if (pmdp) {
-#ifdef CONFIG_FS_DAX_PMD
-			pmd_t pmd;
-
-			if (pfn != pmd_pfn(*pmdp))
-				goto unlock_pmd;
-			if (!pmd_dirty(*pmdp) && !pmd_write(*pmdp))
-				goto unlock_pmd;
-
-			flush_cache_range(vma, address,
-					  address + HPAGE_PMD_SIZE);
-			pmd = pmdp_invalidate(vma, address, pmdp);
-			pmd = pmd_wrprotect(pmd);
-			pmd = pmd_mkclean(pmd);
-			set_pmd_at(vma->vm_mm, address, pmdp, pmd);
-unlock_pmd:
-#endif
-			spin_unlock(ptl);
-		} else {
-			if (pfn != pte_pfn(*ptep))
-				goto unlock_pte;
-			if (!pte_dirty(*ptep) && !pte_write(*ptep))
-				goto unlock_pte;
-
-			flush_cache_page(vma, address, pfn);
-			pte = ptep_clear_flush(vma, address, ptep);
-			pte = pte_wrprotect(pte);
-			pte = pte_mkclean(pte);
-			set_pte_at(vma->vm_mm, address, ptep, pte);
-unlock_pte:
-			pte_unmap_unlock(ptep, ptl);
-		}
-
-		mmu_notifier_invalidate_range_end(&range);
-	}
-	i_mmap_unlock_read(mapping);
-}
-
 static int dax_writeback_one(struct xa_state *xas, struct dax_device *dax_dev,
 		struct address_space *mapping, void *entry)
 {
-	unsigned long pfn, index, count;
+	unsigned long pfn, index, count, end;
 	long ret = 0;
+	struct vm_area_struct *vma;

 	/*
 	 * A page got tagged dirty in DAX mapping? Something is seriously
@@ -936,8 +853,16 @@ static int dax_writeback_one(struct xa_state *xas, struct dax_device *dax_dev,
 	pfn = dax_to_pfn(entry);
 	count = 1UL << dax_entry_order(entry);
 	index = xas->xa_index & ~(count - 1);
+	end = index + count - 1;
+
+	/* Walk all mappings of a given index of a file and writeprotect them */
+	i_mmap_lock_read(mapping);
+	vma_interval_tree_foreach(vma, &mapping->i_mmap, index, end) {
+		pfn_mkclean_range(pfn, count, index, vma);
+		cond_resched();
+	}
+	i_mmap_unlock_read(mapping);

-	dax_entry_mkclean(mapping, index, pfn);
 	dax_flush(dax_dev, page_address(pfn_to_page(pfn)), count * PAGE_SIZE);
 	/*
 	 * After we have flushed the cache, we can clear the dirty tag. There
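The data-loss sequence in the commit message above can be sketched from
userspace. A hedged reproducer outline for process B (the path and
mapping size are made up; it assumes a file on a DAX filesystem that
process A has already PMD-mapped and dirtied, error handling is omitted,
and the "crash" of step 6 would have to occur after the second fsync()):

	#include <fcntl.h>
	#include <sys/mman.h>
	#include <unistd.h>

	int main(void)
	{
		int fd = open("/mnt/dax/file", O_RDWR);	/* hypothetical path */
		char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
			       MAP_SHARED, fd, 0);

		p[0] = 1;	/* 2) pte becomes dirty and writeable */
		fsync(fd);	/* 3) radix tree entry cleaned, but the pte
				 *    was left dirty and writeable */
		p[0] = 2;	/* 4) no page fault; entry stays clean */
		fsync(fd);	/* 5) sees a clean entry, flushes nothing */
		/* 6) a crash at this point can lose the second write */
		return 0;
	}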
From patchwork Sun Apr 3 05:39:57 2022
X-Patchwork-Submitter: Muchun Song
X-Patchwork-Id: 12799485
From: Muchun Song
To: dan.j.williams@intel.com, willy@infradead.org, jack@suse.cz,
    viro@zeniv.linux.org.uk, akpm@linux-foundation.org, apopple@nvidia.com,
    shy828301@gmail.com, rcampbell@nvidia.com, hughd@google.com,
    xiyuyang19@fudan.edu.cn, kirill.shutemov@linux.intel.com,
    zwisler@kernel.org, hch@infradead.org
Cc: linux-fsdevel@vger.kernel.org, nvdimm@lists.linux.dev,
    linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    duanxiongchun@bytedance.com, smuchun@gmail.com, Muchun Song,
    Christoph Hellwig
Subject: [PATCH v7 6/6] mm: simplify follow_invalidate_pte()
Date: Sun, 3 Apr 2022 13:39:57 +0800
Message-Id: <20220403053957.10770-7-songmuchun@bytedance.com>
In-Reply-To: <20220403053957.10770-1-songmuchun@bytedance.com>
References: <20220403053957.10770-1-songmuchun@bytedance.com>

The only user (DAX) of the range and pmdpp parameters of
follow_invalidate_pte() is gone, so it is safe to remove them and
simplify the code. This partially reverts the following commits:

  097963959594 ("mm: add follow_pte_pmd()")
  a4d1a8852513 ("dax: update to new mmu_notifier semantic")

There is now only one caller of follow_invalidate_pte(), so just fold it
into follow_pte() and remove it.

Signed-off-by: Muchun Song
Reviewed-by: Christoph Hellwig
---
 include/linux/mm.h |  3 ---
 mm/memory.c        | 81 ++++++++++++++++--------------------------------------
 2 files changed, 23 insertions(+), 61 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index c9bada4096ac..be7ec4c37ebe 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1871,9 +1871,6 @@ void free_pgd_range(struct mmu_gather *tlb, unsigned long addr,
 		unsigned long end, unsigned long floor, unsigned long ceiling);
 int copy_page_range(struct vm_area_struct *dst_vma,
 		    struct vm_area_struct *src_vma);
-int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
-			  struct mmu_notifier_range *range, pte_t **ptepp,
-			  pmd_t **pmdpp, spinlock_t **ptlp);
 int follow_pte(struct mm_struct *mm, unsigned long address,
 	       pte_t **ptepp, spinlock_t **ptlp);
 int follow_pfn(struct vm_area_struct *vma, unsigned long address,
diff --git a/mm/memory.c b/mm/memory.c
index cc6968dc8e4e..84f7250e6cd1 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -4964,9 +4964,29 @@ int __pmd_alloc(struct mm_struct *mm, pud_t *pud, unsigned long address)
 }
 #endif /* __PAGETABLE_PMD_FOLDED */

-int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
-			  struct mmu_notifier_range *range, pte_t **ptepp,
-			  pmd_t **pmdpp, spinlock_t **ptlp)
+/**
+ * follow_pte - look up PTE at a user virtual address
+ * @mm: the mm_struct of the target address space
+ * @address: user virtual address
+ * @ptepp: location to store found PTE
+ * @ptlp: location to store the lock for the PTE
+ *
+ * On a successful return, the pointer to the PTE is stored in @ptepp;
+ * the corresponding lock is taken and its location is stored in @ptlp.
+ * The contents of the PTE are only stable until @ptlp is released;
+ * any further use, if any, must be protected against invalidation
+ * with MMU notifiers.
+ *
+ * Only IO mappings and raw PFN mappings are allowed. The mmap semaphore
+ * should be taken for read.
+ *
+ * KVM uses this function. While it is arguably less bad than ``follow_pfn``,
+ * it is not a good general-purpose API.
+ *
+ * Return: zero on success, -ve otherwise.
+ */
+int follow_pte(struct mm_struct *mm, unsigned long address,
+	       pte_t **ptepp, spinlock_t **ptlp)
 {
 	pgd_t *pgd;
 	p4d_t *p4d;
@@ -4989,35 +5009,9 @@ int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
 	pmd = pmd_offset(pud, address);
 	VM_BUG_ON(pmd_trans_huge(*pmd));

-	if (pmd_huge(*pmd)) {
-		if (!pmdpp)
-			goto out;
-
-		if (range) {
-			mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0,
-						NULL, mm, address & PMD_MASK,
-						(address & PMD_MASK) + PMD_SIZE);
-			mmu_notifier_invalidate_range_start(range);
-		}
-		*ptlp = pmd_lock(mm, pmd);
-		if (pmd_huge(*pmd)) {
-			*pmdpp = pmd;
-			return 0;
-		}
-		spin_unlock(*ptlp);
-		if (range)
-			mmu_notifier_invalidate_range_end(range);
-	}
-
 	if (pmd_none(*pmd) || unlikely(pmd_bad(*pmd)))
 		goto out;

-	if (range) {
-		mmu_notifier_range_init(range, MMU_NOTIFY_CLEAR, 0, NULL, mm,
-					address & PAGE_MASK,
-					(address & PAGE_MASK) + PAGE_SIZE);
-		mmu_notifier_invalidate_range_start(range);
-	}
 	ptep = pte_offset_map_lock(mm, pmd, address, ptlp);
 	if (!pte_present(*ptep))
 		goto unlock;
@@ -5025,38 +5019,9 @@ int follow_invalidate_pte(struct mm_struct *mm, unsigned long address,
 	return 0;
 unlock:
 	pte_unmap_unlock(ptep, *ptlp);
-	if (range)
-		mmu_notifier_invalidate_range_end(range);
 out:
 	return -EINVAL;
 }
-
-/**
- * follow_pte - look up PTE at a user virtual address
- * @mm: the mm_struct of the target address space
- * @address: user virtual address
- * @ptepp: location to store found PTE
- * @ptlp: location to store the lock for the PTE
- *
- * On a successful return, the pointer to the PTE is stored in @ptepp;
- * the corresponding lock is taken and its location is stored in @ptlp.
- * The contents of the PTE are only stable until @ptlp is released;
- * any further use, if any, must be protected against invalidation
- * with MMU notifiers.
- *
- * Only IO mappings and raw PFN mappings are allowed. The mmap semaphore
- * should be taken for read.
- *
- * KVM uses this function. While it is arguably less bad than ``follow_pfn``,
- * it is not a good general-purpose API.
- *
- * Return: zero on success, -ve otherwise.
- */
-int follow_pte(struct mm_struct *mm, unsigned long address,
-	       pte_t **ptepp, spinlock_t **ptlp)
-{
-	return follow_invalidate_pte(mm, address, NULL, ptepp, NULL, ptlp);
-}
 EXPORT_SYMBOL_GPL(follow_pte);

 /**
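For reference, the kernel-doc above implies the usual calling pattern
for follow_pte(). A minimal sketch (pfn_of_io_mapping() is a
hypothetical caller, not part of this patch; per the kernel-doc, the
caller must hold mmap_lock for read and the mapping must be an IO or
raw PFN mapping):

	/* Sketch: look up the pfn backing a user virtual address. */
	static unsigned long pfn_of_io_mapping(struct mm_struct *mm,
					       unsigned long address)
	{
		unsigned long pfn;
		spinlock_t *ptl;
		pte_t *ptep;

		if (follow_pte(mm, address, &ptep, &ptl))
			return 0;
		pfn = pte_pfn(*ptep);	/* stable only while ptl is held */
		pte_unmap_unlock(ptep, ptl);
		return pfn;
	}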