From patchwork Wed Jan 20 17:36:05 2021
X-Patchwork-Submitter: Will Deacon
X-Patchwork-Id: 12033031
From: Will Deacon
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, Will Deacon , Catalin Marinas , Jan Kara , Minchan Kim , Andrew Morton , "Kirill A. Shutemov" , Linus Torvalds , Vinayak Menon , Hugh Dickins , Nick Desaulniers , kernel-team@android.com
Subject: [PATCH v4 1/8] mm: Cleanup faultaround and finish_fault() codepaths
Date: Wed, 20 Jan 2021 17:36:05 +0000
Message-Id: <20210120173612.20913-2-will@kernel.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20210120173612.20913-1-will@kernel.org>
References: <20210120173612.20913-1-will@kernel.org>
MIME-Version: 1.0
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

From: "Kirill A. Shutemov"

alloc_set_pte() has two users with different requirements: in the faultaround code it is called from an atomic context, and the PTE page table has to be preallocated. finish_fault() can sleep and allocate a page table as needed.

The PTL locking rules are also strange, hard to follow, and overkill for finish_fault().

Let's untangle the mess. alloc_set_pte() is gone now. All locking is explicit. The price is some code duplication to handle huge pages in the faultaround path, but that should be fine given the overall improvement in readability.

Link: https://lore.kernel.org/r/20201229132819.najtavneutnf7ajp@box
Signed-off-by: Kirill A. Shutemov
[will: s/from from/from/ in comment; spotted by willy]
Signed-off-by: Will Deacon
---
fs/xfs/xfs_file.c | 6 +- include/linux/mm.h | 12 ++- include/linux/pgtable.h | 11 +++ mm/filemap.c | 177 ++++++++++++++++++++++++++--------- mm/memory.c | 199 ++++++++++++---------------------------- 5 files changed, 213 insertions(+), 192 deletions(-)
diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c index 5b0f93f73837..111fe73bb8a7 100644 --- a/fs/xfs/xfs_file.c +++ b/fs/xfs/xfs_file.c @@ -1319,17 +1319,19 @@ xfs_filemap_pfn_mkwrite( return __xfs_filemap_fault(vmf, PE_SIZE_PTE, true); } -static void +static vm_fault_t xfs_filemap_map_pages( struct vm_fault *vmf, pgoff_t start_pgoff, pgoff_t end_pgoff) { struct inode *inode = file_inode(vmf->vma->vm_file); + vm_fault_t ret; xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED); - filemap_map_pages(vmf, start_pgoff, end_pgoff); + ret = filemap_map_pages(vmf, start_pgoff, end_pgoff); xfs_iunlock(XFS_I(inode), XFS_MMAPLOCK_SHARED); + return ret; } static const struct vm_operations_struct xfs_file_vm_ops = { diff --git a/include/linux/mm.h b/include/linux/mm.h index ecdf8a8cd6ae..4572a9bc5862 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -542,8 +542,8 @@ struct vm_fault { * is not NULL, otherwise pmd. */ pgtable_t prealloc_pte; /* Pre-allocated pte page table. - * vm_ops->map_pages() calls - * alloc_set_pte() from atomic context. + * vm_ops->map_pages() sets up a page + * table from atomic context. * do_fault_around() pre-allocates * page table to avoid allocation from * atomic context.
@@ -578,7 +578,7 @@ struct vm_operations_struct { vm_fault_t (*fault)(struct vm_fault *vmf); vm_fault_t (*huge_fault)(struct vm_fault *vmf, enum page_entry_size pe_size); - void (*map_pages)(struct vm_fault *vmf, + vm_fault_t (*map_pages)(struct vm_fault *vmf, pgoff_t start_pgoff, pgoff_t end_pgoff); unsigned long (*pagesize)(struct vm_area_struct * area); @@ -988,7 +988,9 @@ static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma) return pte; } -vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct page *page); +vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page); +void do_set_pte(struct vm_fault *vmf, struct page *page); + vm_fault_t finish_fault(struct vm_fault *vmf); vm_fault_t finish_mkwrite_fault(struct vm_fault *vmf); #endif @@ -2622,7 +2624,7 @@ extern void truncate_inode_pages_final(struct address_space *); /* generic vm_area_ops exported for stackable file systems */ extern vm_fault_t filemap_fault(struct vm_fault *vmf); -extern void filemap_map_pages(struct vm_fault *vmf, +extern vm_fault_t filemap_map_pages(struct vm_fault *vmf, pgoff_t start_pgoff, pgoff_t end_pgoff); extern vm_fault_t filemap_page_mkwrite(struct vm_fault *vmf); diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h index 8fcdfa52eb4b..36eb748f3c97 100644 --- a/include/linux/pgtable.h +++ b/include/linux/pgtable.h @@ -1314,6 +1314,17 @@ static inline int pmd_trans_unstable(pmd_t *pmd) #endif } +/* + * the ordering of these checks is important for pmds with _page_devmap set. + * if we check pmd_trans_unstable() first we will trip the bad_pmd() check + * inside of pmd_none_or_trans_huge_or_clear_bad(). this will end up correctly + * returning 1 but not before it spams dmesg with the pmd_clear_bad() output. + */ +static inline int pmd_devmap_trans_unstable(pmd_t *pmd) +{ + return pmd_devmap(*pmd) || pmd_trans_unstable(pmd); +} + #ifndef CONFIG_NUMA_BALANCING /* * Technically a PTE can be PROTNONE even when not doing NUMA balancing but diff --git a/mm/filemap.c b/mm/filemap.c index 5c9d564317a5..c1f2dc89b8a7 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -42,6 +42,7 @@ #include #include #include +#include #include "internal.h" #define CREATE_TRACE_POINTS @@ -2911,74 +2912,164 @@ vm_fault_t filemap_fault(struct vm_fault *vmf) } EXPORT_SYMBOL(filemap_fault); -void filemap_map_pages(struct vm_fault *vmf, - pgoff_t start_pgoff, pgoff_t end_pgoff) +static bool filemap_map_pmd(struct vm_fault *vmf, struct page *page) { - struct file *file = vmf->vma->vm_file; + struct mm_struct *mm = vmf->vma->vm_mm; + + /* Huge page is mapped? No need to proceed. */ + if (pmd_trans_huge(*vmf->pmd)) { + unlock_page(page); + put_page(page); + return true; + } + + if (pmd_none(*vmf->pmd) && PageTransHuge(page)) { + vm_fault_t ret = do_set_pmd(vmf, page); + if (!ret) { + /* The page is mapped successfully, reference consumed. 
*/ + unlock_page(page); + return true; + } + } + + if (pmd_none(*vmf->pmd)) { + vmf->ptl = pmd_lock(mm, vmf->pmd); + if (likely(pmd_none(*vmf->pmd))) { + mm_inc_nr_ptes(mm); + pmd_populate(mm, vmf->pmd, vmf->prealloc_pte); + vmf->prealloc_pte = NULL; + } + spin_unlock(vmf->ptl); + } + + /* See comment in handle_pte_fault() */ + if (pmd_devmap_trans_unstable(vmf->pmd)) { + unlock_page(page); + put_page(page); + return true; + } + + return false; +} + +static struct page *next_uptodate_page(struct page *page, + struct address_space *mapping, + struct xa_state *xas, pgoff_t end_pgoff) +{ + unsigned long max_idx; + + do { + if (!page) + return NULL; + if (xas_retry(xas, page)) + continue; + if (xa_is_value(page)) + continue; + if (PageLocked(page)) + continue; + if (!page_cache_get_speculative(page)) + continue; + /* Has the page moved or been split? */ + if (unlikely(page != xas_reload(xas))) + goto skip; + if (!PageUptodate(page) || PageReadahead(page)) + goto skip; + if (PageHWPoison(page)) + goto skip; + if (!trylock_page(page)) + goto skip; + if (page->mapping != mapping) + goto unlock; + if (!PageUptodate(page)) + goto unlock; + max_idx = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE); + if (xas->xa_index >= max_idx) + goto unlock; + return page; +unlock: + unlock_page(page); +skip: + put_page(page); + } while ((page = xas_next_entry(xas, end_pgoff)) != NULL); + + return NULL; +} + +static inline struct page *first_map_page(struct address_space *mapping, + struct xa_state *xas, + pgoff_t end_pgoff) +{ + return next_uptodate_page(xas_find(xas, end_pgoff), + mapping, xas, end_pgoff); +} + +static inline struct page *next_map_page(struct address_space *mapping, + struct xa_state *xas, + pgoff_t end_pgoff) +{ + return next_uptodate_page(xas_next_entry(xas, end_pgoff), + mapping, xas, end_pgoff); +} + +vm_fault_t filemap_map_pages(struct vm_fault *vmf, + pgoff_t start_pgoff, pgoff_t end_pgoff) +{ + struct vm_area_struct *vma = vmf->vma; + struct file *file = vma->vm_file; struct address_space *mapping = file->f_mapping; pgoff_t last_pgoff = start_pgoff; - unsigned long max_idx; + unsigned long address = vmf->address; XA_STATE(xas, &mapping->i_pages, start_pgoff); struct page *head, *page; unsigned int mmap_miss = READ_ONCE(file->f_ra.mmap_miss); + vm_fault_t ret = 0; rcu_read_lock(); - xas_for_each(&xas, head, end_pgoff) { - if (xas_retry(&xas, head)) - continue; - if (xa_is_value(head)) - goto next; + head = first_map_page(mapping, &xas, end_pgoff); + if (!head) + goto out; - /* - * Check for a locked page first, as a speculative - * reference may adversely influence page migration. - */ - if (PageLocked(head)) - goto next; - if (!page_cache_get_speculative(head)) - goto next; + if (filemap_map_pmd(vmf, head)) { + ret = VM_FAULT_NOPAGE; + goto out; + } - /* Has the page moved or been split? 
*/ - if (unlikely(head != xas_reload(&xas))) - goto skip; + vmf->address = vma->vm_start + ((start_pgoff - vma->vm_pgoff) << PAGE_SHIFT); + vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, &vmf->ptl); + do { page = find_subpage(head, xas.xa_index); - - if (!PageUptodate(head) || - PageReadahead(page) || - PageHWPoison(page)) - goto skip; - if (!trylock_page(head)) - goto skip; - - if (head->mapping != mapping || !PageUptodate(head)) - goto unlock; - - max_idx = DIV_ROUND_UP(i_size_read(mapping->host), PAGE_SIZE); - if (xas.xa_index >= max_idx) + if (PageHWPoison(page)) goto unlock; if (mmap_miss > 0) mmap_miss--; vmf->address += (xas.xa_index - last_pgoff) << PAGE_SHIFT; - if (vmf->pte) - vmf->pte += xas.xa_index - last_pgoff; + vmf->pte += xas.xa_index - last_pgoff; last_pgoff = xas.xa_index; - if (alloc_set_pte(vmf, page)) + + if (!pte_none(*vmf->pte)) goto unlock; + + do_set_pte(vmf, page); + /* no need to invalidate: a not-present page won't be cached */ + update_mmu_cache(vma, vmf->address, vmf->pte); unlock_page(head); - goto next; + + /* The fault is handled */ + if (vmf->address == address) + ret = VM_FAULT_NOPAGE; + continue; unlock: unlock_page(head); -skip: put_page(head); -next: - /* Huge page is mapped? No need to proceed. */ - if (pmd_trans_huge(*vmf->pmd)) - break; - } + } while ((head = next_map_page(mapping, &xas, end_pgoff)) != NULL); + pte_unmap_unlock(vmf->pte, vmf->ptl); +out: rcu_read_unlock(); + vmf->address = address; WRITE_ONCE(file->f_ra.mmap_miss, mmap_miss); + return ret; } EXPORT_SYMBOL(filemap_map_pages); diff --git a/mm/memory.c b/mm/memory.c index feff48e1465a..3e2fc2950ad7 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3503,7 +3503,7 @@ static vm_fault_t do_anonymous_page(struct vm_fault *vmf) if (pte_alloc(vma->vm_mm, vmf->pmd)) return VM_FAULT_OOM; - /* See the comment in pte_alloc_one_map() */ + /* See comment in handle_pte_fault() */ if (unlikely(pmd_trans_unstable(vmf->pmd))) return 0; @@ -3643,66 +3643,6 @@ static vm_fault_t __do_fault(struct vm_fault *vmf) return ret; } -/* - * The ordering of these checks is important for pmds with _PAGE_DEVMAP set. - * If we check pmd_trans_unstable() first we will trip the bad_pmd() check - * inside of pmd_none_or_trans_huge_or_clear_bad(). This will end up correctly - * returning 1 but not before it spams dmesg with the pmd_clear_bad() output. - */ -static int pmd_devmap_trans_unstable(pmd_t *pmd) -{ - return pmd_devmap(*pmd) || pmd_trans_unstable(pmd); -} - -static vm_fault_t pte_alloc_one_map(struct vm_fault *vmf) -{ - struct vm_area_struct *vma = vmf->vma; - - if (!pmd_none(*vmf->pmd)) - goto map_pte; - if (vmf->prealloc_pte) { - vmf->ptl = pmd_lock(vma->vm_mm, vmf->pmd); - if (unlikely(!pmd_none(*vmf->pmd))) { - spin_unlock(vmf->ptl); - goto map_pte; - } - - mm_inc_nr_ptes(vma->vm_mm); - pmd_populate(vma->vm_mm, vmf->pmd, vmf->prealloc_pte); - spin_unlock(vmf->ptl); - vmf->prealloc_pte = NULL; - } else if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd))) { - return VM_FAULT_OOM; - } -map_pte: - /* - * If a huge pmd materialized under us just retry later. Use - * pmd_trans_unstable() via pmd_devmap_trans_unstable() instead of - * pmd_trans_huge() to ensure the pmd didn't become pmd_trans_huge - * under us and then back to pmd_none, as a result of MADV_DONTNEED - * running immediately after a huge pmd fault in a different thread of - * this mm, in turn leading to a misleading pmd_trans_huge() retval. 
- * All we have to ensure is that it is a regular pmd that we can walk - * with pte_offset_map() and we can do that through an atomic read in - * C, which is what pmd_trans_unstable() provides. - */ - if (pmd_devmap_trans_unstable(vmf->pmd)) - return VM_FAULT_NOPAGE; - - /* - * At this point we know that our vmf->pmd points to a page of ptes - * and it cannot become pmd_none(), pmd_devmap() or pmd_trans_huge() - * for the duration of the fault. If a racing MADV_DONTNEED runs and - * we zap the ptes pointed to by our vmf->pmd, the vmf->ptl will still - * be valid and we will re-check to make sure the vmf->pte isn't - * pte_none() under vmf->ptl protection when we return to - * alloc_set_pte(). - */ - vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, - &vmf->ptl); - return 0; -} - #ifdef CONFIG_TRANSPARENT_HUGEPAGE static void deposit_prealloc_pte(struct vm_fault *vmf) { @@ -3717,7 +3657,7 @@ static void deposit_prealloc_pte(struct vm_fault *vmf) vmf->prealloc_pte = NULL; } -static vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) +vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) { struct vm_area_struct *vma = vmf->vma; bool write = vmf->flags & FAULT_FLAG_WRITE; @@ -3775,52 +3715,17 @@ static vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) return ret; } #else -static vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) +vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) { - BUILD_BUG(); - return 0; + return VM_FAULT_FALLBACK; } #endif -/** - * alloc_set_pte - setup new PTE entry for given page and add reverse page - * mapping. If needed, the function allocates page table or use pre-allocated. - * - * @vmf: fault environment - * @page: page to map - * - * Caller must take care of unlocking vmf->ptl, if vmf->pte is non-NULL on - * return. - * - * Target users are page handler itself and implementations of - * vm_ops->map_pages. - * - * Return: %0 on success, %VM_FAULT_ code in case of error. - */ -vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct page *page) +void do_set_pte(struct vm_fault *vmf, struct page *page) { struct vm_area_struct *vma = vmf->vma; bool write = vmf->flags & FAULT_FLAG_WRITE; pte_t entry; - vm_fault_t ret; - - if (pmd_none(*vmf->pmd) && PageTransCompound(page)) { - ret = do_set_pmd(vmf, page); - if (ret != VM_FAULT_FALLBACK) - return ret; - } - - if (!vmf->pte) { - ret = pte_alloc_one_map(vmf); - if (ret) - return ret; - } - - /* Re-check under ptl */ - if (unlikely(!pte_none(*vmf->pte))) { - update_mmu_tlb(vma, vmf->address, vmf->pte); - return VM_FAULT_NOPAGE; - } flush_icache_page(vma, page); entry = mk_pte(page, vma->vm_page_prot); @@ -3837,14 +3742,8 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct page *page) page_add_file_rmap(page, false); } set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry); - - /* no need to invalidate: a not-present page won't be cached */ - update_mmu_cache(vma, vmf->address, vmf->pte); - - return 0; } - /** * finish_fault - finish page fault once we have prepared the page to fault * @@ -3862,12 +3761,12 @@ vm_fault_t alloc_set_pte(struct vm_fault *vmf, struct page *page) */ vm_fault_t finish_fault(struct vm_fault *vmf) { + struct vm_area_struct *vma = vmf->vma; struct page *page; - vm_fault_t ret = 0; + vm_fault_t ret; /* Did we COW the page? 
*/ - if ((vmf->flags & FAULT_FLAG_WRITE) && - !(vmf->vma->vm_flags & VM_SHARED)) + if ((vmf->flags & FAULT_FLAG_WRITE) && !(vma->vm_flags & VM_SHARED)) page = vmf->cow_page; else page = vmf->page; @@ -3876,12 +3775,38 @@ vm_fault_t finish_fault(struct vm_fault *vmf) * check even for read faults because we might have lost our CoWed * page */ - if (!(vmf->vma->vm_flags & VM_SHARED)) - ret = check_stable_address_space(vmf->vma->vm_mm); - if (!ret) - ret = alloc_set_pte(vmf, page); - if (vmf->pte) - pte_unmap_unlock(vmf->pte, vmf->ptl); + if (!(vma->vm_flags & VM_SHARED)) { + ret = check_stable_address_space(vma->vm_mm); + if (ret) + return ret; + } + + if (pmd_none(*vmf->pmd)) { + if (PageTransCompound(page)) { + ret = do_set_pmd(vmf, page); + if (ret != VM_FAULT_FALLBACK) + return ret; + } + + if (unlikely(pte_alloc(vma->vm_mm, vmf->pmd))) + return VM_FAULT_OOM; + } + + /* See comment in handle_pte_fault() */ + if (pmd_devmap_trans_unstable(vmf->pmd)) + return 0; + + vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, + vmf->address, &vmf->ptl); + ret = 0; + /* Re-check under ptl */ + if (likely(pte_none(*vmf->pte))) + do_set_pte(vmf, page); + else + ret = VM_FAULT_NOPAGE; + + update_mmu_tlb(vma, vmf->address, vmf->pte); + pte_unmap_unlock(vmf->pte, vmf->ptl); return ret; } @@ -3951,13 +3876,12 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf) pgoff_t start_pgoff = vmf->pgoff; pgoff_t end_pgoff; int off; - vm_fault_t ret = 0; nr_pages = READ_ONCE(fault_around_bytes) >> PAGE_SHIFT; mask = ~(nr_pages * PAGE_SIZE - 1) & PAGE_MASK; - vmf->address = max(address & mask, vmf->vma->vm_start); - off = ((address - vmf->address) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1); + address = max(address & mask, vmf->vma->vm_start); + off = ((vmf->address - address) >> PAGE_SHIFT) & (PTRS_PER_PTE - 1); start_pgoff -= off; /* @@ -3965,7 +3889,7 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf) * the vma or nr_pages from start_pgoff, depending what is nearest. */ end_pgoff = start_pgoff - - ((vmf->address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) + + ((address >> PAGE_SHIFT) & (PTRS_PER_PTE - 1)) + PTRS_PER_PTE - 1; end_pgoff = min3(end_pgoff, vma_pages(vmf->vma) + vmf->vma->vm_pgoff - 1, start_pgoff + nr_pages - 1); @@ -3973,31 +3897,11 @@ static vm_fault_t do_fault_around(struct vm_fault *vmf) if (pmd_none(*vmf->pmd)) { vmf->prealloc_pte = pte_alloc_one(vmf->vma->vm_mm); if (!vmf->prealloc_pte) - goto out; + return VM_FAULT_OOM; smp_wmb(); /* See comment in __pte_alloc() */ } - vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff); - - /* Huge page is mapped? Page fault is solved */ - if (pmd_trans_huge(*vmf->pmd)) { - ret = VM_FAULT_NOPAGE; - goto out; - } - - /* ->map_pages() haven't done anything useful. Cold page cache? */ - if (!vmf->pte) - goto out; - - /* check if the page fault is solved */ - vmf->pte -= (vmf->address >> PAGE_SHIFT) - (address >> PAGE_SHIFT); - if (!pte_none(*vmf->pte)) - ret = VM_FAULT_NOPAGE; - pte_unmap_unlock(vmf->pte, vmf->ptl); -out: - vmf->address = address; - vmf->pte = NULL; - return ret; + return vmf->vma->vm_ops->map_pages(vmf, start_pgoff, end_pgoff); } static vm_fault_t do_read_fault(struct vm_fault *vmf) @@ -4353,7 +4257,18 @@ static vm_fault_t handle_pte_fault(struct vm_fault *vmf) */ vmf->pte = NULL; } else { - /* See comment in pte_alloc_one_map() */ + /* + * If a huge pmd materialized under us just retry later. 
Use + * pmd_trans_unstable() via pmd_devmap_trans_unstable() instead + * of pmd_trans_huge() to ensure the pmd didn't become + * pmd_trans_huge under us and then back to pmd_none, as a + * result of MADV_DONTNEED running immediately after a huge pmd + * fault in a different thread of this mm, in turn leading to a + * misleading pmd_trans_huge() retval. All we have to ensure is + * that it is a regular pmd that we can walk with + * pte_offset_map() and we can do that through an atomic read + * in C, which is what pmd_trans_unstable() provides. + */ if (pmd_devmap_trans_unstable(vmf->pmd)) return 0; /* From patchwork Wed Jan 20 17:36:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 12033033 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DC5FCC433E0 for ; Wed, 20 Jan 2021 17:36:30 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 8D29A20705 for ; Wed, 20 Jan 2021 17:36:30 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 8D29A20705 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 12D906B0007; Wed, 20 Jan 2021 12:36:30 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 0DDA46B0008; Wed, 20 Jan 2021 12:36:30 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id E9BC66B000A; Wed, 20 Jan 2021 12:36:29 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0222.hostedemail.com [216.40.44.222]) by kanga.kvack.org (Postfix) with ESMTP id CF1226B0007 for ; Wed, 20 Jan 2021 12:36:29 -0500 (EST) Received: from smtpin12.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 935F2180AD837 for ; Wed, 20 Jan 2021 17:36:29 +0000 (UTC) X-FDA: 77726857698.12.owl55_0d048082755c Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin12.hostedemail.com (Postfix) with ESMTP id 6A4091800DCB5 for ; Wed, 20 Jan 2021 17:36:29 +0000 (UTC) X-HE-Tag: owl55_0d048082755c X-Filterd-Recvd-Size: 7443 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf48.hostedemail.com (Postfix) with ESMTP for ; Wed, 20 Jan 2021 17:36:28 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 90BA3221FE; Wed, 20 Jan 2021 17:36:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1611164187; bh=IeHJxTh2dLGEMk6hqY4LyIwvV7pcdSMi8pKKcvjOfAM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=iJZgHhD1/ANVgDp81pvXm5mD0T9PCPThgef5eboYUt+npa+A94x20BjQHGiu0dmJH RFd4PARLxLTM3loqluv29GYjc2kOTtvGYrzSKKdC6mUiQJKe3xKI9TVD72369BsSj9 IYuBHTgD2Da0vddlUjEFqyaIerUiWuHvA8d7mRXaB7fwx+T1Vc80Nz5mhWznyOs/yO ygNaKxask7Uek/xI8pL5ZkAbULQkoQh0vnhjiflxc21FB55p4xmmks4wGep9Fr7Dwj 
Fbr6xkmYPl/0jKgZMVvG6aW5IRu0N38I6ujTTcx88y6krIlWLhfjFLgxRKJeE+u8jW HjO3r97j8wDBw== From: Will Deacon To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, Will Deacon , Catalin Marinas , Jan Kara , Minchan Kim , Andrew Morton , "Kirill A . Shutemov" , Linus Torvalds , Vinayak Menon , Hugh Dickins , Nick Desaulniers , kernel-team@android.com Subject: [PATCH v4 2/8] mm: Allow architectures to request 'old' entries when prefaulting Date: Wed, 20 Jan 2021 17:36:06 +0000 Message-Id: <20210120173612.20913-3-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210120173612.20913-1-will@kernel.org> References: <20210120173612.20913-1-will@kernel.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Commit 5c0a85fad949 ("mm: make faultaround produce old ptes") changed the "faultaround" behaviour to initialise prefaulted PTEs as 'old', since this avoids vmscan wrongly assuming that they are hot, despite having never been explicitly accessed by userspace. The change has been shown to benefit numerous arm64 micro-architectures (with hardware access flag) running Android, where both application launch latency and direct reclaim time are significantly reduced (by 10%+ and ~80% respectively). Unfortunately, commit 315d09bf30c2 ("Revert "mm: make faultaround produce old ptes"") reverted the change due to it being identified as the cause of a ~6% regression in unixbench on x86. Experiments on a variety of recent arm64 micro-architectures indicate that unixbench is not affected by the original commit, which appears to yield a 0-1% performance improvement. Since one size does not fit all for the initial state of prefaulted PTEs, introduce arch_wants_old_prefaulted_pte(), which allows an architecture to opt-in to 'old' prefaulted PTEs at runtime based on whatever criteria it may have. Cc: Jan Kara Cc: Minchan Kim Cc: Andrew Morton Cc: Kirill A. Shutemov Cc: Linus Torvalds Reported-by: Vinayak Menon Signed-off-by: Will Deacon --- include/linux/mm.h | 5 ++++- mm/filemap.c | 14 ++++++++++---- mm/memory.c | 20 +++++++++++++++++++- 3 files changed, 33 insertions(+), 6 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index 4572a9bc5862..251a2339befb 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -434,6 +434,7 @@ extern pgprot_t protection_map[16]; * @FAULT_FLAG_REMOTE: The fault is not for current task/mm. * @FAULT_FLAG_INSTRUCTION: The fault was during an instruction fetch. * @FAULT_FLAG_INTERRUPTIBLE: The fault can be interrupted by non-fatal signals. + * @FAULT_FLAG_PREFAULT: Fault was a prefault. 
* * About @FAULT_FLAG_ALLOW_RETRY and @FAULT_FLAG_TRIED: we can specify * whether we would allow page faults to retry by specifying these two @@ -464,6 +465,7 @@ extern pgprot_t protection_map[16]; #define FAULT_FLAG_REMOTE 0x80 #define FAULT_FLAG_INSTRUCTION 0x100 #define FAULT_FLAG_INTERRUPTIBLE 0x200 +#define FAULT_FLAG_PREFAULT 0x400 /* * The default fault flags that should be used by most of the @@ -501,7 +503,8 @@ static inline bool fault_flag_allow_retry_first(unsigned int flags) { FAULT_FLAG_USER, "USER" }, \ { FAULT_FLAG_REMOTE, "REMOTE" }, \ { FAULT_FLAG_INSTRUCTION, "INSTRUCTION" }, \ - { FAULT_FLAG_INTERRUPTIBLE, "INTERRUPTIBLE" } + { FAULT_FLAG_INTERRUPTIBLE, "INTERRUPTIBLE" }, \ + { FAULT_FLAG_PREFAULT, "PREFAULT" } /* * vm_fault is filled by the pagefault handler and passed to the vma's diff --git a/mm/filemap.c b/mm/filemap.c index c1f2dc89b8a7..a6dc97906c8e 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -3019,6 +3019,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, struct address_space *mapping = file->f_mapping; pgoff_t last_pgoff = start_pgoff; unsigned long address = vmf->address; + unsigned long flags = vmf->flags; XA_STATE(xas, &mapping->i_pages, start_pgoff); struct page *head, *page; unsigned int mmap_miss = READ_ONCE(file->f_ra.mmap_miss); @@ -3051,14 +3052,18 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, if (!pte_none(*vmf->pte)) goto unlock; + /* We're about to handle the fault */ + if (vmf->address == address) { + vmf->flags &= ~FAULT_FLAG_PREFAULT; + ret = VM_FAULT_NOPAGE; + } else { + vmf->flags |= FAULT_FLAG_PREFAULT; + } + do_set_pte(vmf, page); /* no need to invalidate: a not-present page won't be cached */ update_mmu_cache(vma, vmf->address, vmf->pte); unlock_page(head); - - /* The fault is handled */ - if (vmf->address == address) - ret = VM_FAULT_NOPAGE; continue; unlock: unlock_page(head); @@ -3067,6 +3072,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, pte_unmap_unlock(vmf->pte, vmf->ptl); out: rcu_read_unlock(); + vmf->flags = flags; vmf->address = address; WRITE_ONCE(file->f_ra.mmap_miss, mmap_miss); return ret; diff --git a/mm/memory.c b/mm/memory.c index 3e2fc2950ad7..f0e7c589ca9d 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -134,6 +134,18 @@ static inline bool arch_faults_on_old_pte(void) } #endif +#ifndef arch_wants_old_prefaulted_pte +static inline bool arch_wants_old_prefaulted_pte(void) +{ + /* + * Transitioning a PTE from 'old' to 'young' can be expensive on + * some architectures, even if it's performed in hardware. By + * default, "false" means prefaulted entries will be 'young'. 
+ */ + return false; +} +#endif + static int __init disable_randmaps(char *s) { randomize_va_space = 0; @@ -3725,11 +3737,17 @@ void do_set_pte(struct vm_fault *vmf, struct page *page) { struct vm_area_struct *vma = vmf->vma; bool write = vmf->flags & FAULT_FLAG_WRITE; + bool prefault = vmf->flags & FAULT_FLAG_PREFAULT; pte_t entry; flush_icache_page(vma, page); entry = mk_pte(page, vma->vm_page_prot); - entry = pte_sw_mkyoung(entry); + + if (prefault && arch_wants_old_prefaulted_pte()) + entry = pte_mkold(entry); + else + entry = pte_sw_mkyoung(entry); + if (write) entry = maybe_mkwrite(pte_mkdirty(entry), vma); /* copy-on-write page */ From patchwork Wed Jan 20 17:36:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 12033035 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9D5C1C433E6 for ; Wed, 20 Jan 2021 17:36:33 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4FE7E2070A for ; Wed, 20 Jan 2021 17:36:33 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 4FE7E2070A Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id E03CB6B0008; Wed, 20 Jan 2021 12:36:32 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id DD98D6B000A; Wed, 20 Jan 2021 12:36:32 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C9C566B000C; Wed, 20 Jan 2021 12:36:32 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0235.hostedemail.com [216.40.44.235]) by kanga.kvack.org (Postfix) with ESMTP id B57A96B0008 for ; Wed, 20 Jan 2021 12:36:32 -0500 (EST) Received: from smtpin28.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id 785181EE6 for ; Wed, 20 Jan 2021 17:36:32 +0000 (UTC) X-FDA: 77726857824.28.coat92_5502af62755c Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin28.hostedemail.com (Postfix) with ESMTP id 2DD0D6C04 for ; Wed, 20 Jan 2021 17:36:32 +0000 (UTC) X-HE-Tag: coat92_5502af62755c X-Filterd-Recvd-Size: 3172 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf11.hostedemail.com (Postfix) with ESMTP for ; Wed, 20 Jan 2021 17:36:31 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 4702E207C4; Wed, 20 Jan 2021 17:36:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1611164190; bh=5GZtx5inBi7paTjDAzc0sxuRXktlFwJSoTEoY/D7ZmE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=GdWQv3xvaA0v2TUZLwjq/QGsJ+90rBiCFsxBIr34LBByEZuIYI4bPMdgX8r2vGxkD OaNtl1caOy2s/6Q0kWQ15uL3/Gaoa+ORzSuHZtCsgOpwtnPQyXeNVoAFvfO0qPineM BHV5wE1Hi3ON5AppH0sBkknp/dmYcSm1iY2+MGiJd22K/8Fgjq5zArzlo0FsUSpB19 
OOJMqFjNHI+NLfN1sZ3oi2EeaX4R/p/3FtrMAuqEEMbAzXsz2pEF5AnTwruZj6pEs/ 0yssvKFDKB4+AStqrSSBERctINEwvYNiFSiu42ee+DZd4KB6aALo6lSlcKvv3EPeve c1jbPXJV153NQ== From: Will Deacon To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, Will Deacon , Catalin Marinas , Jan Kara , Minchan Kim , Andrew Morton , "Kirill A . Shutemov" , Linus Torvalds , Vinayak Menon , Hugh Dickins , Nick Desaulniers , kernel-team@android.com Subject: [PATCH v4 3/8] arm64: mm: Implement arch_wants_old_prefaulted_pte() Date: Wed, 20 Jan 2021 17:36:07 +0000 Message-Id: <20210120173612.20913-4-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210120173612.20913-1-will@kernel.org> References: <20210120173612.20913-1-will@kernel.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: On CPUs with hardware AF/DBM, initialising prefaulted PTEs as 'old' improves vmscan behaviour and does not appear to introduce any overhead elsewhere. Implement arch_wants_old_prefaulted_pte() to return 'true' if we detect hardware access flag support at runtime. This can be extended in future based on MIDR matching if necessary. Cc: Catalin Marinas Signed-off-by: Will Deacon --- arch/arm64/include/asm/pgtable.h | 12 +++++++++++- 1 file changed, 11 insertions(+), 1 deletion(-) diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h index 501562793ce2..e17b96d0e4b5 100644 --- a/arch/arm64/include/asm/pgtable.h +++ b/arch/arm64/include/asm/pgtable.h @@ -980,7 +980,17 @@ static inline bool arch_faults_on_old_pte(void) return !cpu_has_hw_af(); } -#define arch_faults_on_old_pte arch_faults_on_old_pte +#define arch_faults_on_old_pte arch_faults_on_old_pte + +/* + * Experimentally, it's cheap to set the access flag in hardware and we + * benefit from prefaulting mappings as 'old' to start with. 
+ */ +static inline bool arch_wants_old_prefaulted_pte(void) +{ + return !arch_faults_on_old_pte(); +} +#define arch_wants_old_prefaulted_pte arch_wants_old_prefaulted_pte #endif /* !__ASSEMBLY__ */

From patchwork Wed Jan 20 17:36:08 2021
X-Patchwork-Submitter: Will Deacon
X-Patchwork-Id: 12033037
From: Will Deacon
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, Will Deacon , Catalin Marinas , Jan Kara , Minchan Kim , Andrew Morton , "Kirill A. Shutemov" , Linus Torvalds , Vinayak Menon , Hugh Dickins , Nick Desaulniers , kernel-team@android.com
Subject: [PATCH v4 4/8] mm: Move immutable fields of 'struct vm_fault' into anonymous struct
Date: Wed, 20 Jan 2021 17:36:08 +0000
Message-Id: <20210120173612.20913-5-will@kernel.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20210120173612.20913-1-will@kernel.org>
References: <20210120173612.20913-1-will@kernel.org>
MIME-Version: 1.0
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID:

'struct vm_fault' contains information about the fault being serviced alongside mutable fields that contribute to the state of the fault-handling logic. Unfortunately, the distinction between the two is not clear-cut, and a number of callers end up manipulating the structure temporarily before restoring it when returning.

Try to clean this up by moving the immutable fault information into an anonymous struct, which will later be marked as 'const'. GCC will then complain (with an error) about modification of these fields after they have been initialised, although LLVM currently allows such modifications without even a warning:

https://bugs.llvm.org/show_bug.cgi?id=48755

Ideally, the 'flags' field would be part of the new structure too, but it seems as though the ->page_mkwrite() path is not ready for this yet.

Cc: Kirill A. Shutemov
Suggested-by: Linus Torvalds
Link: https://lore.kernel.org/r/CAHk-=whYs9XsO88iqJzN6NC=D-dp2m0oYXuOoZ=eWnvv=5OA+w@mail.gmail.com
Signed-off-by: Will Deacon
---
include/linux/mm.h | 13 ++++++++----- 1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/include/linux/mm.h b/include/linux/mm.h index 251a2339befb..b4a5cb9bff7d 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -517,11 +517,14 @@ static inline bool fault_flag_allow_retry_first(unsigned int flags) * pgoff should be used in favour of virtual_address, if possible.
*/ struct vm_fault { - struct vm_area_struct *vma; /* Target VMA */ - unsigned int flags; /* FAULT_FLAG_xxx flags */ - gfp_t gfp_mask; /* gfp mask to be used for allocations */ - pgoff_t pgoff; /* Logical page offset based on vma */ - unsigned long address; /* Faulting virtual address */ + struct { + struct vm_area_struct *vma; /* Target VMA */ + gfp_t gfp_mask; /* gfp mask to be used for allocations */ + pgoff_t pgoff; /* Logical page offset based on vma */ + unsigned long address; /* Faulting virtual address */ + }; + unsigned int flags; /* FAULT_FLAG_xxx flags + * XXX: should really be 'const' */ pmd_t *pmd; /* Pointer to pmd entry matching * the 'address' */ pud_t *pud; /* Pointer to pud entry matching From patchwork Wed Jan 20 17:36:09 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 12033039 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 54379C433DB for ; Wed, 20 Jan 2021 17:36:39 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id D0B1B23122 for ; Wed, 20 Jan 2021 17:36:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D0B1B23122 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 438A36B000C; Wed, 20 Jan 2021 12:36:38 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id 3EE846B000D; Wed, 20 Jan 2021 12:36:38 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 2B2106B000E; Wed, 20 Jan 2021 12:36:38 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0139.hostedemail.com [216.40.44.139]) by kanga.kvack.org (Postfix) with ESMTP id 161F76B000C for ; Wed, 20 Jan 2021 12:36:38 -0500 (EST) Received: from smtpin23.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id CC24B8249980 for ; Wed, 20 Jan 2021 17:36:37 +0000 (UTC) X-FDA: 77726858034.23.fog02_0c02cfe2755c Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin23.hostedemail.com (Postfix) with ESMTP id ABDFC37608 for ; Wed, 20 Jan 2021 17:36:37 +0000 (UTC) X-HE-Tag: fog02_0c02cfe2755c X-Filterd-Recvd-Size: 8084 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf42.hostedemail.com (Postfix) with ESMTP for ; Wed, 20 Jan 2021 17:36:36 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id B73242070A; Wed, 20 Jan 2021 17:36:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1611164196; bh=0pn7bgAeZDKGcdVrLp6LkRvBAhpX3ko/rgDGtAX8D/g=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=XBbI+cW5nTMlH6QSQRgVkTEBVvtv4u6GsHqzTtYhEpsjvPu60/sLBlrQCMfYIazZs ILwZD3avrfsn1ktk8J8uTn16xfq+EldSoErBTdmFA6gZdhou9syARHucXjG46ukvGC 
VrOFDd2VLvwjPUBdSYLJCW8zf7PStqVJfDUvw4dIyYroO1WS3bipJBgDk7F+wCaCTp PHHqhJ6uXPkwudGNTJeU6wdrn0rCYO/eI2DsYPMNMaiIozIXlzBLqStuYywvriJqNb TdyBjarkLYF9OYr97HocAB1WEodd7YYP8ajk+b3UKrdjvDBv/EUOhzW+pmRph0QKIk ir/TzMXaRrRrw== From: Will Deacon To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, Will Deacon , Catalin Marinas , Jan Kara , Minchan Kim , Andrew Morton , "Kirill A . Shutemov" , Linus Torvalds , Vinayak Menon , Hugh Dickins , Nick Desaulniers , kernel-team@android.com Subject: [PATCH v4 5/8] mm: Pass 'address' to map to do_set_pte() and drop FAULT_FLAG_PREFAULT Date: Wed, 20 Jan 2021 17:36:09 +0000 Message-Id: <20210120173612.20913-6-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210120173612.20913-1-will@kernel.org> References: <20210120173612.20913-1-will@kernel.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Rather than modifying the 'address' field of the 'struct vm_fault' passed to do_set_pte(), leave that to identify the real faulting address and pass in the virtual address to be mapped by the new pte as a separate argument. This makes FAULT_FLAG_PREFAULT redundant, as a prefault entry can be identified simply by comparing the new address parameter with the faulting address, so remove the redundant flag at the same time. Cc: Kirill A. Shutemov Cc: Linus Torvalds Signed-off-by: Will Deacon --- include/linux/mm.h | 7 ++----- mm/filemap.c | 21 +++++++-------------- mm/memory.c | 10 +++++----- 3 files changed, 14 insertions(+), 24 deletions(-) diff --git a/include/linux/mm.h b/include/linux/mm.h index b4a5cb9bff7d..e0f056753bef 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -434,7 +434,6 @@ extern pgprot_t protection_map[16]; * @FAULT_FLAG_REMOTE: The fault is not for current task/mm. * @FAULT_FLAG_INSTRUCTION: The fault was during an instruction fetch. * @FAULT_FLAG_INTERRUPTIBLE: The fault can be interrupted by non-fatal signals. - * @FAULT_FLAG_PREFAULT: Fault was a prefault. 
* * About @FAULT_FLAG_ALLOW_RETRY and @FAULT_FLAG_TRIED: we can specify * whether we would allow page faults to retry by specifying these two @@ -465,7 +464,6 @@ extern pgprot_t protection_map[16]; #define FAULT_FLAG_REMOTE 0x80 #define FAULT_FLAG_INSTRUCTION 0x100 #define FAULT_FLAG_INTERRUPTIBLE 0x200 -#define FAULT_FLAG_PREFAULT 0x400 /* * The default fault flags that should be used by most of the @@ -503,8 +501,7 @@ static inline bool fault_flag_allow_retry_first(unsigned int flags) { FAULT_FLAG_USER, "USER" }, \ { FAULT_FLAG_REMOTE, "REMOTE" }, \ { FAULT_FLAG_INSTRUCTION, "INSTRUCTION" }, \ - { FAULT_FLAG_INTERRUPTIBLE, "INTERRUPTIBLE" }, \ - { FAULT_FLAG_PREFAULT, "PREFAULT" } + { FAULT_FLAG_INTERRUPTIBLE, "INTERRUPTIBLE" } /* * vm_fault is filled by the pagefault handler and passed to the vma's @@ -995,7 +992,7 @@ static inline pte_t maybe_mkwrite(pte_t pte, struct vm_area_struct *vma) } vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page); -void do_set_pte(struct vm_fault *vmf, struct page *page); +void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr); vm_fault_t finish_fault(struct vm_fault *vmf); vm_fault_t finish_mkwrite_fault(struct vm_fault *vmf); diff --git a/mm/filemap.c b/mm/filemap.c index a6dc97906c8e..fb7a8d9b5603 100644 --- a/mm/filemap.c +++ b/mm/filemap.c @@ -3018,8 +3018,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, struct file *file = vma->vm_file; struct address_space *mapping = file->f_mapping; pgoff_t last_pgoff = start_pgoff; - unsigned long address = vmf->address; - unsigned long flags = vmf->flags; + unsigned long addr; XA_STATE(xas, &mapping->i_pages, start_pgoff); struct page *head, *page; unsigned int mmap_miss = READ_ONCE(file->f_ra.mmap_miss); @@ -3035,8 +3034,8 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, goto out; } - vmf->address = vma->vm_start + ((start_pgoff - vma->vm_pgoff) << PAGE_SHIFT); - vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, vmf->address, &vmf->ptl); + addr = vma->vm_start + ((start_pgoff - vma->vm_pgoff) << PAGE_SHIFT); + vmf->pte = pte_offset_map_lock(vma->vm_mm, vmf->pmd, addr, &vmf->ptl); do { page = find_subpage(head, xas.xa_index); if (PageHWPoison(page)) @@ -3045,7 +3044,7 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, if (mmap_miss > 0) mmap_miss--; - vmf->address += (xas.xa_index - last_pgoff) << PAGE_SHIFT; + addr += (xas.xa_index - last_pgoff) << PAGE_SHIFT; vmf->pte += xas.xa_index - last_pgoff; last_pgoff = xas.xa_index; @@ -3053,16 +3052,12 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, goto unlock; /* We're about to handle the fault */ - if (vmf->address == address) { - vmf->flags &= ~FAULT_FLAG_PREFAULT; + if (vmf->address == addr) ret = VM_FAULT_NOPAGE; - } else { - vmf->flags |= FAULT_FLAG_PREFAULT; - } - do_set_pte(vmf, page); + do_set_pte(vmf, page, addr); /* no need to invalidate: a not-present page won't be cached */ - update_mmu_cache(vma, vmf->address, vmf->pte); + update_mmu_cache(vma, addr, vmf->pte); unlock_page(head); continue; unlock: @@ -3072,8 +3067,6 @@ vm_fault_t filemap_map_pages(struct vm_fault *vmf, pte_unmap_unlock(vmf->pte, vmf->ptl); out: rcu_read_unlock(); - vmf->flags = flags; - vmf->address = address; WRITE_ONCE(file->f_ra.mmap_miss, mmap_miss); return ret; } diff --git a/mm/memory.c b/mm/memory.c index f0e7c589ca9d..7b1307873325 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -3733,11 +3733,11 @@ vm_fault_t do_set_pmd(struct vm_fault *vmf, struct page *page) } #endif -void do_set_pte(struct vm_fault *vmf, struct page *page) 
+void do_set_pte(struct vm_fault *vmf, struct page *page, unsigned long addr) { struct vm_area_struct *vma = vmf->vma; bool write = vmf->flags & FAULT_FLAG_WRITE; - bool prefault = vmf->flags & FAULT_FLAG_PREFAULT; + bool prefault = vmf->address != addr; pte_t entry; flush_icache_page(vma, page); @@ -3753,13 +3753,13 @@ void do_set_pte(struct vm_fault *vmf, struct page *page) /* copy-on-write page */ if (write && !(vma->vm_flags & VM_SHARED)) { inc_mm_counter_fast(vma->vm_mm, MM_ANONPAGES); - page_add_new_anon_rmap(page, vma, vmf->address, false); + page_add_new_anon_rmap(page, vma, addr, false); lru_cache_add_inactive_or_unevictable(page, vma); } else { inc_mm_counter_fast(vma->vm_mm, mm_counter_file(page)); page_add_file_rmap(page, false); } - set_pte_at(vma->vm_mm, vmf->address, vmf->pte, entry); + set_pte_at(vma->vm_mm, addr, vmf->pte, entry); } /** @@ -3819,7 +3819,7 @@ vm_fault_t finish_fault(struct vm_fault *vmf) ret = 0; /* Re-check under ptl */ if (likely(pte_none(*vmf->pte))) - do_set_pte(vmf, page); + do_set_pte(vmf, page, vmf->address); else ret = VM_FAULT_NOPAGE; From patchwork Wed Jan 20 17:36:10 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Will Deacon X-Patchwork-Id: 12033041 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-16.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,INCLUDES_CR_TRAILER,INCLUDES_PATCH,MAILING_LIST_MULTI, SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C563EC433E0 for ; Wed, 20 Jan 2021 17:36:41 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 72D8023122 for ; Wed, 20 Jan 2021 17:36:41 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 72D8023122 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=kernel.org Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id DB09F6B000D; Wed, 20 Jan 2021 12:36:40 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id D5F546B000E; Wed, 20 Jan 2021 12:36:40 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C4ED86B0010; Wed, 20 Jan 2021 12:36:40 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0196.hostedemail.com [216.40.44.196]) by kanga.kvack.org (Postfix) with ESMTP id A47C46B000D for ; Wed, 20 Jan 2021 12:36:40 -0500 (EST) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay03.hostedemail.com (Postfix) with ESMTP id 65CFB8249980 for ; Wed, 20 Jan 2021 17:36:40 +0000 (UTC) X-FDA: 77726858160.21.beef71_5602bd12755c Received: from filter.hostedemail.com (10.5.16.251.rfc1918.com [10.5.16.251]) by smtpin21.hostedemail.com (Postfix) with ESMTP id 3BA97180445E0 for ; Wed, 20 Jan 2021 17:36:40 +0000 (UTC) X-HE-Tag: beef71_5602bd12755c X-Filterd-Recvd-Size: 4818 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by imf23.hostedemail.com (Postfix) with ESMTP for ; Wed, 20 Jan 2021 17:36:39 +0000 (UTC) Received: by mail.kernel.org (Postfix) with ESMTPSA id 7DADD2333E; Wed, 20 Jan 2021 17:36:36 +0000 (UTC) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1611164198; bh=mPVcvqnnFczBTmevUq2gNa8hUAqbH91DT9UsrPws3+g=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=nPZmHgq6EOYiaoMTHwKnopMYJLLjBf+KUh7gFxWGU6C6ZsFy8Fw59t8JWVzuKxMks IHr208S00Re726/ukSpL9sUdOnyxksN9bEL65lnbXKYooL4MUXmsokpx9GqjnSFELN xejVA5rh5/IVjWljgThf4JfJwS0fyPtRUGs0+FwYduyBEhC7mV9Hhfg6jLpsZXrkPL aSfhGTwDfpO7rYmQ2cmH15i063QV5R2hRT1j+cTbiSZBkaVFNnaVwtM4swaArGolDn i79B4d4fBH0/xGiIVBx93yWErYA3GW7Wu/AvVigqmGB2X1AeLGSdtwajpRte6QK0Zu aPbipWs4q5i8Q== From: Will Deacon To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, Will Deacon , Catalin Marinas , Jan Kara , Minchan Kim , Andrew Morton , "Kirill A . Shutemov" , Linus Torvalds , Vinayak Menon , Hugh Dickins , Nick Desaulniers , kernel-team@android.com Subject: [PATCH v4 6/8] mm: Avoid modifying vmf.address in __collapse_huge_page_swapin() Date: Wed, 20 Jan 2021 17:36:10 +0000 Message-Id: <20210120173612.20913-7-will@kernel.org> X-Mailer: git-send-email 2.20.1 In-Reply-To: <20210120173612.20913-1-will@kernel.org> References: <20210120173612.20913-1-will@kernel.org> MIME-Version: 1.0 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: In preparation for const-ifying the anonymous struct field of 'struct vm_fault', rework __collapse_huge_page_swapin() to avoid continuously updating vmf.address and instead populate a new 'struct vm_fault' on the stack for each page being processed. Cc: Kirill A. Shutemov Cc: Linus Torvalds Signed-off-by: Will Deacon --- mm/khugepaged.c | 37 ++++++++++++++++++------------------- 1 file changed, 18 insertions(+), 19 deletions(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 67ab391a5373..fb0fdaec34d5 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -991,38 +991,41 @@ static int hugepage_vma_revalidate(struct mm_struct *mm, unsigned long address, static bool __collapse_huge_page_swapin(struct mm_struct *mm, struct vm_area_struct *vma, - unsigned long address, pmd_t *pmd, + unsigned long haddr, pmd_t *pmd, int referenced) { int swapped_in = 0; vm_fault_t ret = 0; - struct vm_fault vmf = { - .vma = vma, - .address = address, - .flags = FAULT_FLAG_ALLOW_RETRY, - .pmd = pmd, - .pgoff = linear_page_index(vma, address), - }; - - vmf.pte = pte_offset_map(pmd, address); - for (; vmf.address < address + HPAGE_PMD_NR*PAGE_SIZE; - vmf.pte++, vmf.address += PAGE_SIZE) { + unsigned long address, end = haddr + (HPAGE_PMD_NR * PAGE_SIZE); + + for (address = haddr; address < end; address += PAGE_SIZE) { + struct vm_fault vmf = { + .vma = vma, + .address = address, + .pgoff = linear_page_index(vma, haddr), + .flags = FAULT_FLAG_ALLOW_RETRY, + .pmd = pmd, + }; + + vmf.pte = pte_offset_map(pmd, address); vmf.orig_pte = *vmf.pte; - if (!is_swap_pte(vmf.orig_pte)) + if (!is_swap_pte(vmf.orig_pte)) { + pte_unmap(vmf.pte); continue; + } swapped_in++; ret = do_swap_page(&vmf); /* do_swap_page returns VM_FAULT_RETRY with released mmap_lock */ if (ret & VM_FAULT_RETRY) { mmap_read_lock(mm); - if (hugepage_vma_revalidate(mm, address, &vmf.vma)) { + if (hugepage_vma_revalidate(mm, haddr, &vma)) { /* vma is no longer available, don't continue to swapin */ trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0); return false; } /* check if the pmd is still valid */ - if (mm_find_pmd(mm, address) != pmd) { + if (mm_find_pmd(mm, haddr) != pmd) { 
 				trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
 				return false;
 			}
@@ -1031,11 +1034,7 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
 			trace_mm_collapse_huge_page_swapin(mm, swapped_in, referenced, 0);
 			return false;
 		}
-		/* pte is unmapped now, we need to map it */
-		vmf.pte = pte_offset_map(pmd, vmf.address);
 	}
-	vmf.pte--;
-	pte_unmap(vmf.pte);
 
 	/* Drain LRU add pagevec to remove extra pin on the swapped in pages */
 	if (swapped_in)
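The rework above replaces one long-lived, continuously mutated vm_fault with a fresh,
fully initialised descriptor per page, which is what later lets the immutable fields
become const. The shape of that loop as a standalone C sketch, with hypothetical types
rather than the kernel structures:

#include <stdio.h>

#define PAGE_SIZE	4096UL
#define HPAGE_PMD_NR	8	/* shrunk for the example; 512 for a 2M huge page with 4K pages */

/* Hypothetical miniature of the fields the loop cares about. */
struct fake_fault {
	unsigned long address;
	unsigned long pgoff;
};

int main(void)
{
	unsigned long haddr = 0x200000UL;
	unsigned long end = haddr + HPAGE_PMD_NR * PAGE_SIZE;
	unsigned long address;

	for (address = haddr; address < end; address += PAGE_SIZE) {
		/* A new descriptor is built for every page... */
		struct fake_fault vmf = {
			.address = address,
			.pgoff = haddr / PAGE_SIZE,
		};
		/* ...so nothing has to be patched up between iterations. */
		printf("swap-in at %#lx (pgoff %lu)\n", vmf.address, vmf.pgoff);
	}
	return 0;
}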
From patchwork Wed Jan 20 17:36:11 2021
X-Patchwork-Submitter: Will Deacon
X-Patchwork-Id: 12033043
From: Will Deacon
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, Will Deacon,
    Catalin Marinas, Jan Kara, Minchan Kim, Andrew Morton,
    "Kirill A. Shutemov", Linus Torvalds, Vinayak Menon, Hugh Dickins,
    Nick Desaulniers, kernel-team@android.com
Subject: [PATCH v4 7/8] mm: Use static initialisers for immutable fields of 'struct vm_fault'
Date: Wed, 20 Jan 2021 17:36:11 +0000
Message-Id: <20210120173612.20913-8-will@kernel.org>
In-Reply-To: <20210120173612.20913-1-will@kernel.org>
References: <20210120173612.20913-1-will@kernel.org>

In preparation for const-ifying the anonymous struct field of
'struct vm_fault', ensure that it is initialised using static
initialisers.

Cc: Kirill A. Shutemov
Cc: Linus Torvalds
Signed-off-by: Will Deacon
Reviewed-by: Nick Desaulniers
---
 mm/shmem.c    |  6 +++---
 mm/swapfile.c | 11 ++++++-----
 2 files changed, 9 insertions(+), 8 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 7c6b6d8f6c39..1b254fbfdf52 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -1520,11 +1520,11 @@ static struct page *shmem_swapin(swp_entry_t swap, gfp_t gfp,
 {
 	struct vm_area_struct pvma;
 	struct page *page;
-	struct vm_fault vmf;
+	struct vm_fault vmf = {
+		.vma = &pvma,
+	};
 
 	shmem_pseudo_vma_init(&pvma, info, index);
-	vmf.vma = &pvma;
-	vmf.address = 0;
 	page = swap_cluster_readahead(swap, gfp, &vmf);
 	shmem_pseudo_vma_destroy(&pvma);
 
diff --git a/mm/swapfile.c b/mm/swapfile.c
index 9fffc5af29d1..2b570a566276 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1951,8 +1951,6 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 	si = swap_info[type];
 	pte = pte_offset_map(pmd, addr);
 	do {
-		struct vm_fault vmf;
-
 		if (!is_swap_pte(*pte))
 			continue;
 
@@ -1968,9 +1966,12 @@ static int unuse_pte_range(struct vm_area_struct *vma, pmd_t *pmd,
 		swap_map = &si->swap_map[offset];
 		page = lookup_swap_cache(entry, vma, addr);
 		if (!page) {
-			vmf.vma = vma;
-			vmf.address = addr;
-			vmf.pmd = pmd;
+			struct vm_fault vmf = {
+				.vma = vma,
+				.address = addr,
+				.pmd = pmd,
+			};
+
 			page = swapin_readahead(entry, GFP_HIGHUSER_MOVABLE, &vmf);
 		}
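Converting these call sites to designated initialisers is what makes the final patch
possible: once every immutable field is set at initialisation time, the anonymous
struct can be qualified const and any later assignment simply fails to compile. A
standalone sketch of the idea, using a hypothetical miniature of the structure rather
than the kernel's vm_fault:

#include <stdio.h>

/* Hypothetical miniature: one const anonymous struct plus one mutable field. */
struct fault {
	const struct {
		unsigned long address;	/* read-only once initialised */
	};
	unsigned long orig_pte;		/* still writable */
};

int main(void)
{
	/* Designated initialisers may set the const members... */
	struct fault vmf = {
		.address = 0x1000,
	};

	vmf.orig_pte = 42;	/* fine: outside the const group */
	/* vmf.address = 0x2000; would be rejected: assignment of read-only member */

	printf("address=%#lx orig_pte=%lu\n", vmf.address, vmf.orig_pte);
	return 0;
}

(Anonymous struct members are a C11/GNU extension, which the kernel already relies on
for 'struct vm_fault'.)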
From patchwork Wed Jan 20 17:36:12 2021
X-Patchwork-Submitter: Will Deacon
X-Patchwork-Id: 12033045
From: Will Deacon
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, linux-arm-kernel@lists.infradead.org, Will Deacon,
    Catalin Marinas, Jan Kara, Minchan Kim, Andrew Morton,
    "Kirill A. Shutemov", Linus Torvalds, Vinayak Menon, Hugh Dickins,
    Nick Desaulniers, kernel-team@android.com
Subject: [PATCH v4 8/8] mm: Mark anonymous struct field of 'struct vm_fault' as 'const'
Date: Wed, 20 Jan 2021 17:36:12 +0000
Message-Id: <20210120173612.20913-9-will@kernel.org>
In-Reply-To: <20210120173612.20913-1-will@kernel.org>
References: <20210120173612.20913-1-will@kernel.org>

The fields of this struct are only ever read after being initialised,
so mark it 'const' before somebody tries to modify it again.

Cc: Kirill A. Shutemov
Cc: Linus Torvalds
Signed-off-by: Will Deacon
---
 include/linux/mm.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index e0f056753bef..7ff3d9817d38 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -514,7 +514,7 @@ static inline bool fault_flag_allow_retry_first(unsigned int flags)
  * pgoff should be used in favour of virtual_address, if possible.
  */
 struct vm_fault {
-	struct {
+	const struct {
 		struct vm_area_struct *vma;	/* Target VMA */
 		gfp_t gfp_mask;			/* gfp mask to be used for allocations */
 		pgoff_t pgoff;			/* Logical page offset based on vma */