From patchwork Sun Dec  1 01:52:39 2019
X-Patchwork-Submitter: Andrew Morton
X-Patchwork-Id: 11268295
Date: Sat, 30 Nov 2019 17:52:39 -0800
From: akpm@linux-foundation.org
To: akpm@linux-foundation.org, alex@ghiti.fr, aou@eecs.berkeley.edu,
 ard.biesheuvel@linaro.org, arnd@arndb.de, aryabinin@virtuozzo.com,
 benh@kernel.crashing.org, borntraeger@de.ibm.com, bp@alien8.de,
 catalin.marinas@arm.com, dave.hansen@linux.intel.com, dave.jiang@intel.com,
 davem@davemloft.net, dvyukov@google.com, glider@google.com,
 gor@linux.ibm.com, heiko.carstens@de.ibm.com, hpa@zytor.com,
 james.morse@arm.com, jhogan@kernel.org, kan.liang@linux.intel.com,
 linux-mm@kvack.org, linux@armlinux.org.uk, luto@kernel.org,
 mark.rutland@arm.com, mawilcox@microsoft.com, mingo@elte.hu,
 mm-commits@vger.kernel.org, mpe@ellerman.id.au, n-horiguchi@ah.jp.nec.com,
 palmer@sifive.com, paul.burton@mips.com, paul.walmsley@sifive.com,
 paulus@samba.org, peterz@infradead.org, ralf@linux-mips.org,
 shashim@codeaurora.org, steven.price@arm.com, tglx@linutronix.de,
 torvalds@linux-foundation.org, vgupta@synopsys.com, will@kernel.org,
 zong.li@sifive.com
Subject: [patch 059/158] mm: pagewalk: add 'depth' parameter to pte_hole
Message-ID: <20191201015239.AuqOwIAzc%akpm@linux-foundation.org>

From: Steven Price
Subject: mm: pagewalk: add 'depth' parameter to pte_hole

The pte_hole() callback is called at multiple levels of the page tables.
Code dumping the kernel page tables needs to know at what depth the
missing entry is, so add this as an extra parameter to pte_hole().  When
the depth isn't known (e.g. when processing a vma), -1 is passed.

The depth that is reported is the actual level where the entry is missing
(ignoring any folding that is in place), i.e. any levels where
PTRS_PER_P?D is set to 1 are ignored.  Note that depth starts at 0 for a
PGD so that PUD/PMD/PTE retain their natural numbers as levels 2/3/4.
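As an illustration of the new callback signature, a walker that tallies
holes by level might look like the sketch below.  This example is not
part of the patch; the dump_state structure and the dump_pte_hole /
dump_ops names are hypothetical.

#include <linux/pagewalk.h>

/* Hypothetical example: count holes per (unfolded) page table level. */
struct dump_state {
	unsigned long holes[5];	/* 0:PGD, 1:P4D, 2:PUD, 3:PMD, 4:PTE */
	unsigned long unknown;	/* holes reported with depth == -1 */
};

static int dump_pte_hole(unsigned long addr, unsigned long next,
			 int depth, struct mm_walk *walk)
{
	struct dump_state *st = walk->private;

	if (depth < 0)		/* depth unknown, e.g. a VM_PFNMAP vma */
		st->unknown++;
	else			/* real level; folded levels never show up */
		st->holes[depth]++;
	return 0;
}

static const struct mm_walk_ops dump_ops = {
	.pte_hole	= dump_pte_hole,
};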
Peter Anvin" Cc: Ingo Molnar Cc: James Hogan Cc: James Morse Cc: "Liang, Kan" Cc: Mark Rutland Cc: Matthew Wilcox Cc: Michael Ellerman Cc: Naoya Horiguchi Cc: Palmer Dabbelt Cc: Paul Burton Cc: Paul Mackerras Cc: Paul Walmsley Cc: Peter Zijlstra Cc: Ralf Baechle Cc: Russell King Cc: Shiraz Hashim Cc: Thomas Gleixner Cc: Vasily Gorbik Cc: Vineet Gupta Cc: Will Deacon Signed-off-by: Andrew Morton --- fs/proc/task_mmu.c | 4 ++-- include/linux/pagewalk.h | 7 +++++-- mm/hmm.c | 8 ++++---- mm/migrate.c | 5 +++-- mm/mincore.c | 1 + mm/pagewalk.c | 31 +++++++++++++++++++++++++------ 6 files changed, 40 insertions(+), 16 deletions(-) --- a/fs/proc/task_mmu.c~mm-pagewalk-add-depth-parameter-to-pte_hole +++ a/fs/proc/task_mmu.c @@ -505,7 +505,7 @@ static void smaps_account(struct mem_siz #ifdef CONFIG_SHMEM static int smaps_pte_hole(unsigned long addr, unsigned long end, - struct mm_walk *walk) + __always_unused int depth, struct mm_walk *walk) { struct mem_size_stats *mss = walk->private; @@ -1282,7 +1282,7 @@ static int add_to_pagemap(unsigned long } static int pagemap_pte_hole(unsigned long start, unsigned long end, - struct mm_walk *walk) + __always_unused int depth, struct mm_walk *walk) { struct pagemapread *pm = walk->private; unsigned long addr = start; --- a/include/linux/pagewalk.h~mm-pagewalk-add-depth-parameter-to-pte_hole +++ a/include/linux/pagewalk.h @@ -17,7 +17,10 @@ struct mm_walk; * split_huge_page() instead of handling it explicitly. * @pte_entry: if set, called for each non-empty PTE (lowest-level) * entry - * @pte_hole: if set, called for each hole at all levels + * @pte_hole: if set, called for each hole at all levels, + * depth is -1 if not known, 0:PGD, 1:P4D, 2:PUD, 3:PMD + * 4:PTE. Any folded depths (where PTRS_PER_P?D is equal + * to 1) are skipped. * @hugetlb_entry: if set, called for each hugetlb entry * @test_walk: caller specific callback function to determine whether * we walk over the current vma or not. 
@@ -48,7 +51,7 @@ struct mm_walk_ops {
 	int (*pte_entry)(pte_t *pte, unsigned long addr,
 			 unsigned long next, struct mm_walk *walk);
 	int (*pte_hole)(unsigned long addr, unsigned long next,
-			struct mm_walk *walk);
+			int depth, struct mm_walk *walk);
 	int (*hugetlb_entry)(pte_t *pte, unsigned long hmask,
 			     unsigned long addr, unsigned long next,
 			     struct mm_walk *walk);
--- a/mm/hmm.c~mm-pagewalk-add-depth-parameter-to-pte_hole
+++ a/mm/hmm.c
@@ -186,7 +186,7 @@ static void hmm_range_need_fault(const s
 }
 
 static int hmm_vma_walk_hole(unsigned long addr, unsigned long end,
-			     struct mm_walk *walk)
+			     __always_unused int depth, struct mm_walk *walk)
 {
 	struct hmm_vma_walk *hmm_vma_walk = walk->private;
 	struct hmm_range *range = hmm_vma_walk->range;
@@ -380,7 +380,7 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 again:
 	pmd = READ_ONCE(*pmdp);
 	if (pmd_none(pmd))
-		return hmm_vma_walk_hole(start, end, walk);
+		return hmm_vma_walk_hole(start, end, -1, walk);
 
 	if (thp_migration_supported() && is_pmd_migration_entry(pmd)) {
 		bool fault, write_fault;
@@ -482,7 +482,7 @@ static int hmm_vma_walk_pud(pud_t *pudp,
 again:
 	pud = READ_ONCE(*pudp);
 	if (pud_none(pud))
-		return hmm_vma_walk_hole(start, end, walk);
+		return hmm_vma_walk_hole(start, end, -1, walk);
 
 	if (pud_huge(pud) && pud_devmap(pud)) {
 		unsigned long i, npages, pfn;
@@ -490,7 +490,7 @@ again:
 		bool fault, write_fault;
 
 		if (!pud_present(pud))
-			return hmm_vma_walk_hole(start, end, walk);
+			return hmm_vma_walk_hole(start, end, -1, walk);
 
 		i = (addr - range->start) >> PAGE_SHIFT;
 		npages = (end - addr) >> PAGE_SHIFT;
--- a/mm/migrate.c~mm-pagewalk-add-depth-parameter-to-pte_hole
+++ a/mm/migrate.c
@@ -2123,6 +2123,7 @@ out_unlock:
 #ifdef CONFIG_DEVICE_PRIVATE
 static int migrate_vma_collect_hole(unsigned long start,
 				    unsigned long end,
+				    __always_unused int depth,
 				    struct mm_walk *walk)
 {
 	struct migrate_vma *migrate = walk->private;
@@ -2167,7 +2168,7 @@ static int migrate_vma_collect_pmd(pmd_t
 
 again:
 	if (pmd_none(*pmdp))
-		return migrate_vma_collect_hole(start, end, walk);
+		return migrate_vma_collect_hole(start, end, -1, walk);
 
 	if (pmd_trans_huge(*pmdp)) {
 		struct page *page;
@@ -2200,7 +2201,7 @@ again:
 			return migrate_vma_collect_skip(start, end,
 							walk);
 		if (pmd_none(*pmdp))
-			return migrate_vma_collect_hole(start, end,
+			return migrate_vma_collect_hole(start, end, -1,
 							walk);
 	}
 }
--- a/mm/mincore.c~mm-pagewalk-add-depth-parameter-to-pte_hole
+++ a/mm/mincore.c
@@ -112,6 +112,7 @@ static int __mincore_unmapped_range(unsi
 }
 
 static int mincore_unmapped_range(unsigned long addr, unsigned long end,
+				  __always_unused int depth,
 				  struct mm_walk *walk)
 {
 	walk->private += __mincore_unmapped_range(addr, end,
--- a/mm/pagewalk.c~mm-pagewalk-add-depth-parameter-to-pte_hole
+++ a/mm/pagewalk.c
@@ -4,6 +4,22 @@
 #include <linux/sched.h>
 #include <linux/hugetlb.h>
 
+/*
+ * We want to know the real level where an entry is located ignoring any
+ * folding of levels which may be happening. For example if p4d is folded then
+ * a missing entry found at level 1 (p4d) is actually at level 0 (pgd).
+ */
+static int real_depth(int depth)
+{
+	if (depth == 3 && PTRS_PER_PMD == 1)
+		depth = 2;
+	if (depth == 2 && PTRS_PER_PUD == 1)
+		depth = 1;
+	if (depth == 1 && PTRS_PER_P4D == 1)
+		depth = 0;
+	return depth;
+}
+
 static int walk_pte_range(pmd_t *pmd, unsigned long addr, unsigned long end,
 			  struct mm_walk *walk)
 {
@@ -34,6 +50,7 @@ static int walk_pmd_range(pud_t *pud, un
 	unsigned long next;
 	const struct mm_walk_ops *ops = walk->ops;
 	int err = 0;
+	int depth = real_depth(3);
 
 	if (ops->test_pmd) {
 		err = ops->test_pmd(addr, end, pmd_offset(pud, 0UL), walk);
@@ -49,7 +66,7 @@ again:
 		next = pmd_addr_end(addr, end);
 		if (pmd_none(*pmd) || (!walk->vma && !walk->no_vma)) {
 			if (ops->pte_hole)
-				err = ops->pte_hole(addr, next, walk);
+				err = ops->pte_hole(addr, next, depth, walk);
 			if (err)
 				break;
 			continue;
@@ -93,6 +110,7 @@ static int walk_pud_range(p4d_t *p4d, un
 	unsigned long next;
 	const struct mm_walk_ops *ops = walk->ops;
 	int err = 0;
+	int depth = real_depth(2);
 
 	if (ops->test_pud) {
 		err = ops->test_pud(addr, end, pud_offset(p4d, 0UL), walk);
@@ -108,7 +126,7 @@ static int walk_pud_range(p4d_t *p4d, un
 		next = pud_addr_end(addr, end);
 		if (pud_none(*pud) || (!walk->vma && !walk->no_vma)) {
 			if (ops->pte_hole)
-				err = ops->pte_hole(addr, next, walk);
+				err = ops->pte_hole(addr, next, depth, walk);
 			if (err)
 				break;
 			continue;
@@ -144,6 +162,7 @@ static int walk_p4d_range(pgd_t *pgd, un
 	unsigned long next;
 	const struct mm_walk_ops *ops = walk->ops;
 	int err = 0;
+	int depth = real_depth(1);
 
 	if (ops->test_p4d) {
 		err = ops->test_p4d(addr, end, p4d_offset(pgd, 0UL), walk);
@@ -158,7 +177,7 @@ static int walk_p4d_range(pgd_t *pgd, un
 		next = p4d_addr_end(addr, end);
 		if (p4d_none_or_clear_bad(p4d)) {
 			if (ops->pte_hole)
-				err = ops->pte_hole(addr, next, walk);
+				err = ops->pte_hole(addr, next, depth, walk);
 			if (err)
 				break;
 			continue;
@@ -190,7 +209,7 @@ static int walk_pgd_range(unsigned long
 		next = pgd_addr_end(addr, end);
 		if (pgd_none_or_clear_bad(pgd)) {
 			if (ops->pte_hole)
-				err = ops->pte_hole(addr, next, walk);
+				err = ops->pte_hole(addr, next, 0, walk);
 			if (err)
 				break;
 			continue;
@@ -237,7 +256,7 @@ static int walk_hugetlb_range(unsigned l
 		if (pte)
 			err = ops->hugetlb_entry(pte, hmask, addr, next, walk);
 		else if (ops->pte_hole)
-			err = ops->pte_hole(addr, next, walk);
+			err = ops->pte_hole(addr, next, -1, walk);
 
 		if (err)
 			break;
@@ -281,7 +300,7 @@ static int walk_page_test(unsigned long
 	if (vma->vm_flags & VM_PFNMAP) {
 		int err = 1;
 		if (ops->pte_hole)
-			err = ops->pte_hole(start, end, walk);
+			err = ops->pte_hole(start, end, -1, walk);
 		return err ? err : 1;
 	}
 	return 0;
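
For completeness, a hypothetical caller of the dump_ops example above
(again, not part of the patch).  It assumes the walk_page_range(mm,
start, end, ops, private) signature used by this series and that the
caller takes mmap_sem, as the walk requires.  Note how folding stays
invisible to the callback: on a configuration where both p4d and pud are
folded (PTRS_PER_P4D == 1 and PTRS_PER_PUD == 1), real_depth() turns a
hole found while scanning the pud level into depth 0, i.e. the pgd.

/* Hypothetical usage: walk one mm and report holes seen per level. */
static void dump_mm_holes(struct mm_struct *mm)
{
	struct dump_state st = { };

	down_read(&mm->mmap_sem);
	walk_page_range(mm, 0, TASK_SIZE, &dump_ops, &st);
	up_read(&mm->mmap_sem);

	pr_info("holes: pgd=%lu p4d=%lu pud=%lu pmd=%lu pte=%lu unknown=%lu\n",
		st.holes[0], st.holes[1], st.holes[2], st.holes[3],
		st.holes[4], st.unknown);
}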