From patchwork Tue Oct 8 09:15:02 2019
X-Patchwork-Submitter: Thomas Hellström (Intel)
X-Patchwork-Id: 11179211
From: Thomas Hellström (VMware) <thomas_os@shipmail.org>
To: linux-kernel@vger.kernel.org, linux-mm@kvack.org
Cc: torvalds@linux-foundation.org, Thomas Hellstrom <thellstrom@vmware.com>,
 Matthew Wilcox <willy@infradead.org>, Will Deacon <will.deacon@arm.com>,
 Peter Zijlstra <peterz@infradead.org>, Rik van Riel <riel@surriel.com>,
 Minchan Kim <minchan@kernel.org>, Michal Hocko <mhocko@suse.com>,
 Huang Ying <ying.huang@intel.com>, Jérôme Glisse <jglisse@redhat.com>,
 "Kirill A . Shutemov" <kirill@shutemov.name>
Subject: [PATCH v4 3/9] mm: pagewalk: Don't split transhuge pmds when a pmd_entry is present
Date: Tue, 8 Oct 2019 11:15:02 +0200
Message-Id: <20191008091508.2682-4-thomas_os@shipmail.org>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20191008091508.2682-1-thomas_os@shipmail.org>
References: <20191008091508.2682-1-thomas_os@shipmail.org>

From: Thomas Hellstrom <thellstrom@vmware.com>

The pagewalk code was unconditionally splitting transhuge pmds whenever a
pte_entry was present. However, ideally we'd want to handle transhuge pmds
in the pmd_entry function and ptes in the pte_entry function. So don't
split huge pmds when a pmd_entry function is present; instead, let the
callback take care of it if necessary.

To make sure a virtual address range is handled by one and only one
callback, and since pmd entries may be unstable, introduce a pmd_entry
return code that tells the walk code to continue processing this pmd entry
rather than move on. Since caller-defined positive return codes (up to 2)
are used by current callers, use a high value that leaves a large range of
positive caller-defined return codes for future users.
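As an illustration of the new convention (a sketch only, not part of this
patch; my_pmd_entry() and my_handle_huge_pmd() are made-up names), a
pmd_entry handler that processes huge pmds itself and defers everything
else to the pte level would look roughly like:

	static int my_pmd_entry(pmd_t *pmd, unsigned long addr,
				unsigned long next, struct mm_walk *walk)
	{
		/* Handle the transhuge case at the pmd level... */
		if (pmd_trans_huge(*pmd))
			return my_handle_huge_pmd(pmd, addr, next, walk);

		/*
		 * ...otherwise have the walk code re-check the (possibly
		 * unstable) entry and hand it to the pte_entry handler.
		 */
		return PAGE_WALK_FALLBACK;
	}

A return value of 0 continues the walk, a positive value <=
PAGE_WALK_CALLER_MAX stops the walk and is returned to the caller, and
PAGE_WALK_FALLBACK makes the walk code re-check the pmd with
pmd_trans_unstable() before descending to the pte level.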
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Rik van Riel <riel@surriel.com>
Cc: Minchan Kim <minchan@kernel.org>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Huang Ying <ying.huang@intel.com>
Cc: Jérôme Glisse <jglisse@redhat.com>
Cc: Kirill A. Shutemov <kirill@shutemov.name>
Suggested-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Thomas Hellstrom <thellstrom@vmware.com>
---
 include/linux/pagewalk.h |  8 ++++++++
 mm/pagewalk.c            | 28 +++++++++++++++++++++-------
 2 files changed, 29 insertions(+), 7 deletions(-)

diff --git a/include/linux/pagewalk.h b/include/linux/pagewalk.h
index bddd9759bab9..c4a013eb445d 100644
--- a/include/linux/pagewalk.h
+++ b/include/linux/pagewalk.h
@@ -4,6 +4,11 @@
 
 #include <linux/mm.h>
 
+/* Highest positive pmd_entry caller-specific return value */
+#define PAGE_WALK_CALLER_MAX (INT_MAX / 2)
+/* The handler did not handle the entry. Fall back to the next level */
+#define PAGE_WALK_FALLBACK (PAGE_WALK_CALLER_MAX + 1)
+
 struct mm_walk;
 
 /**
@@ -16,6 +21,9 @@ struct mm_walk;
  *			this handler is required to be able to handle
  *			pmd_trans_huge() pmds. They may simply choose to
  *			split_huge_page() instead of handling it explicitly.
+ *			If the handler did not handle the PMD, or split the
+ *			PMD and wants it handled by the PTE handler, it
+ *			should return PAGE_WALK_FALLBACK.
  * @pte_entry:		if set, called for each non-empty PTE (4th-level) entry
  * @pte_hole:		if set, called for each hole at all levels
  * @hugetlb_entry:	if set, called for each hugetlb entry
diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index 83c0b78363b4..f844c2a2aa60 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -50,10 +50,18 @@ static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
 		 * This implies that each ->pmd_entry() handler
 		 * needs to know about pmd_trans_huge() pmds
 		 */
-		if (ops->pmd_entry)
+		if (ops->pmd_entry) {
 			err = ops->pmd_entry(pmd, addr, next, walk);
-		if (err)
-			break;
+			if (!err)
+				continue;
+			else if (err <= PAGE_WALK_CALLER_MAX)
+				break;
+			WARN_ON(err != PAGE_WALK_FALLBACK);
+			err = 0;
+			if (pmd_trans_unstable(pmd))
+				goto again;
+			/* Fall through */
+		}
 
 		/*
 		 * Check this here so we only break down trans_huge
@@ -61,8 +69,8 @@ static int walk_pmd_range(pud_t *pud, unsigned long addr, unsigned long end,
 		 */
 		if (!ops->pte_entry)
 			continue;
-
-		split_huge_pmd(walk->vma, pmd, addr);
+		if (!ops->pmd_entry)
+			split_huge_pmd(walk->vma, pmd, addr);
 		if (pmd_trans_unstable(pmd))
 			goto again;
 		err = walk_pte_range(pmd, addr, next, walk);
@@ -281,11 +289,17 @@ static int __walk_page_range(unsigned long start, unsigned long end,
  *
  *  - 0  : succeeded to handle the current entry, and if you don't reach the
  *         end address yet, continue to walk.
- *  - >0 : succeeded to handle the current entry, and return to the caller
- *         with caller specific value.
+ *  - >0, and <= PAGE_WALK_CALLER_MAX : succeeded to handle the current entry,
+ *         and return to the caller with caller specific value.
  *  - <0 : failed to handle the current entry, and return to the caller
  *         with error code.
  *
+ * For pmd_entry(), a value <= PAGE_WALK_CALLER_MAX indicates that the entry
+ * was handled by the callback. PAGE_WALK_FALLBACK indicates that the entry
+ * could not be handled by the callback and should be re-checked. If the
+ * callback needs the entry to be handled by the next level, it should
+ * split the entry and then return PAGE_WALK_FALLBACK.
+ *
  * Before starting to walk page table, some callers want to check whether
  * they really want to walk over the current vma, typically by checking
  * its vm_flags.  walk_page_test() and @ops->test_walk() are used for this
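To show how the pieces fit together, a caller of the reworked walk code
might wire up both handlers like this (again a sketch; my_walk_ops and the
handler names are hypothetical, and it assumes the ops-based
walk_page_range() API this kernel version uses, called with mmap_sem held
for read):

	static int my_pte_entry(pte_t *pte, unsigned long addr,
				unsigned long next, struct mm_walk *walk)
	{
		/*
		 * Reached only for pmds for which my_pmd_entry()
		 * returned PAGE_WALK_FALLBACK.
		 */
		return 0;
	}

	static const struct mm_walk_ops my_walk_ops = {
		.pmd_entry	= my_pmd_entry,
		.pte_entry	= my_pte_entry,
	};

	down_read(&mm->mmap_sem);
	err = walk_page_range(mm, start, end, &my_walk_ops, NULL);
	up_read(&mm->mmap_sem);

Note that with this patch applied, the walk code only calls
split_huge_pmd() itself when no pmd_entry is present; with both callbacks
set, splitting (when wanted) becomes the pmd_entry callback's
responsibility before it returns PAGE_WALK_FALLBACK.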