From patchwork Mon May 2 18:17:07 2022
X-Patchwork-Submitter: Zach O'Keefe <zokeefe@google.com>
X-Patchwork-Id: 12834610
Date: Mon, 2 May 2022 11:17:07 -0700
In-Reply-To: <20220502181714.3483177-1-zokeefe@google.com>
Message-Id: <20220502181714.3483177-7-zokeefe@google.com>
References: <20220502181714.3483177-1-zokeefe@google.com>
Subject: [PATCH v4 06/13] mm/khugepaged: add flag to ignore
 khugepaged_max_ptes_*
From: "Zach O'Keefe" <zokeefe@google.com>
To: Alex Shi, David Hildenbrand, David Rientjes, Matthew Wilcox,
 Michal Hocko, Pasha Tatashin, Peter Xu, SeongJae Park, Song Liu,
 Vlastimil Babka, Yang Shi, Zi Yan, linux-mm@kvack.org
Cc: Andrea Arcangeli, Andrew Morton, Arnd Bergmann, Axel Rasmussen,
 Chris Kennelly, Chris Zankel, Helge Deller, Hugh Dickins,
 Ivan Kokshaysky, "James E.J. Bottomley", Jens Axboe,
 "Kirill A. Shutemov", Matt Turner, Max Filippov, Miaohe Lin,
 Minchan Kim, Patrick Xia, Pavel Begunkov, Thomas Bogendoerfer,
 "Zach O'Keefe"

Add an enforce_pte_scan_limits flag to struct collapse_control that
allows a context to ignore the sysfs-controlled knobs
khugepaged_max_ptes_[none|swap|shared], and set this flag in the
khugepaged collapse context to preserve existing khugepaged behavior.

This flag will be used (unset) when introducing the madvise collapse
context, since there the user presumably has reason to believe the
collapse will be beneficial, and khugepaged's heuristics shouldn't tell
the user they are wrong.
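To illustrate the intended semantics, every limit check in the diff
below reduces to the same shape: the sysfs limit only binds when the
context enforces it. The following is a minimal userspace sketch of
that gating pattern; all names (ctx, none_pte_ok, max_ptes_none) are
illustrative stand-ins, not the actual kernel identifiers:

	/*
	 * Sketch of the gating pattern applied at each limit check.
	 * Compiles standalone; not kernel code.
	 */
	#include <stdbool.h>
	#include <stdio.h>

	struct ctx {
		bool enforce_pte_scan_limits; /* true: khugepaged; false: madvise */
	};

	/* May the scan tolerate 'none_or_zero' empty/zero PTEs so far? */
	static bool none_pte_ok(const struct ctx *cc, int none_or_zero,
				int max_ptes_none)
	{
		/* The limit is only consulted when the context enforces it. */
		return none_or_zero <= max_ptes_none ||
		       !cc->enforce_pte_scan_limits;
	}

	int main(void)
	{
		struct ctx khugepaged = { .enforce_pte_scan_limits = true };
		struct ctx madvise = { .enforce_pte_scan_limits = false };

		/* 400 none/zero PTEs vs a limit of 64: khugepaged bails. */
		printf("khugepaged: %d\n", none_pte_ok(&khugepaged, 400, 64)); /* 0 */
		printf("madvise:    %d\n", none_pte_ok(&madvise, 400, 64));    /* 1 */
		return 0;
	}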
Signed-off-by: Zach O'Keefe <zokeefe@google.com>
---
 mm/khugepaged.c | 31 +++++++++++++++++++++----------
 1 file changed, 21 insertions(+), 10 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 986344a04165..94f18be83835 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -87,6 +87,9 @@ static struct kmem_cache *mm_slot_cache __read_mostly;
 #define MAX_PTE_MAPPED_THP 8
 
 struct collapse_control {
+	/* Respect khugepaged_max_ptes_[none|swap|shared] */
+	bool enforce_pte_scan_limits;
+
 	/* Num pages scanned per node */
 	int node_load[MAX_NUMNODES];
 
@@ -614,6 +617,7 @@ static bool is_refcount_suitable(struct page *page)
 static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 					unsigned long address,
 					pte_t *pte,
+					struct collapse_control *cc,
 					struct list_head *compound_pagelist)
 {
 	struct page *page = NULL;
@@ -627,7 +631,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 		if (pte_none(pteval) || (pte_present(pteval) &&
 				is_zero_pfn(pte_pfn(pteval)))) {
 			if (!userfaultfd_armed(vma) &&
-			    ++none_or_zero <= khugepaged_max_ptes_none) {
+			    (++none_or_zero <= khugepaged_max_ptes_none ||
+			     !cc->enforce_pte_scan_limits)) {
 				continue;
 			} else {
 				result = SCAN_EXCEED_NONE_PTE;
@@ -647,8 +652,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 
 		VM_BUG_ON_PAGE(!PageAnon(page), page);
 
-		if (page_mapcount(page) > 1 &&
-				++shared > khugepaged_max_ptes_shared) {
+		if (cc->enforce_pte_scan_limits && page_mapcount(page) > 1 &&
+		    ++shared > khugepaged_max_ptes_shared) {
 			result = SCAN_EXCEED_SHARED_PTE;
 			count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
 			goto out;
@@ -1186,7 +1191,7 @@ static int collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	mmu_notifier_invalidate_range_end(&range);
 
 	spin_lock(pte_ptl);
-	result = __collapse_huge_page_isolate(vma, address, pte,
+	result = __collapse_huge_page_isolate(vma, address, pte, cc,
 					      &compound_pagelist);
 	spin_unlock(pte_ptl);
 
@@ -1276,7 +1281,8 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 	     _pte++, _address += PAGE_SIZE) {
 		pte_t pteval = *_pte;
 		if (is_swap_pte(pteval)) {
-			if (++unmapped <= khugepaged_max_ptes_swap) {
+			if (++unmapped <= khugepaged_max_ptes_swap ||
+			    !cc->enforce_pte_scan_limits) {
 				/*
 				 * Always be strict with uffd-wp
 				 * enabled swap entries.  Please see
@@ -1295,7 +1301,8 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 		}
 		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
 			if (!userfaultfd_armed(vma) &&
-			    ++none_or_zero <= khugepaged_max_ptes_none) {
+			    (++none_or_zero <= khugepaged_max_ptes_none ||
+			     !cc->enforce_pte_scan_limits)) {
 				continue;
 			} else {
 				result = SCAN_EXCEED_NONE_PTE;
@@ -1325,8 +1332,9 @@ static int khugepaged_scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 			goto out_unmap;
 		}
 
-		if (page_mapcount(page) > 1 &&
-				++shared > khugepaged_max_ptes_shared) {
+		if (cc->enforce_pte_scan_limits &&
+		    page_mapcount(page) > 1 &&
+		    ++shared > khugepaged_max_ptes_shared) {
 			result = SCAN_EXCEED_SHARED_PTE;
 			count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
 			goto out_unmap;
@@ -2056,7 +2064,8 @@ static int khugepaged_scan_file(struct mm_struct *mm,
 			continue;
 
 		if (xa_is_value(page)) {
-			if (++swap > khugepaged_max_ptes_swap) {
+			if (cc->enforce_pte_scan_limits &&
+			    ++swap > khugepaged_max_ptes_swap) {
 				result = SCAN_EXCEED_SWAP_PTE;
 				count_vm_event(THP_SCAN_EXCEED_SWAP_PTE);
 				break;
@@ -2107,7 +2116,8 @@ static int khugepaged_scan_file(struct mm_struct *mm,
 	rcu_read_unlock();
 
 	if (result == SCAN_SUCCEED) {
-		if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none) {
+		if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none &&
+		    cc->enforce_pte_scan_limits) {
 			result = SCAN_EXCEED_NONE_PTE;
 			count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
 		} else {
@@ -2337,6 +2347,7 @@ static int khugepaged(void *none)
 {
 	struct mm_slot *mm_slot;
 	struct collapse_control cc = {
+		.enforce_pte_scan_limits = true,
 		.last_target_node = NUMA_NO_NODE,
 		.alloc_charge_hpage = &alloc_charge_hpage,
 	};
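
For comparison with the khugepaged initializer at the end of the last
hunk, the madvise collapse context introduced later in this series is
expected to build its collapse_control with the flag cleared; roughly
as below. This is a sketch only: everything other than the
enforce_pte_scan_limits field itself is an assumption about the
follow-up patch, not code from it.

	/*
	 * Hypothetical madvise-side initializer (not from this patch):
	 * opts out of the sysfs limits because the user explicitly
	 * requested the collapse.
	 */
	struct collapse_control cc = {
		.enforce_pte_scan_limits = false,
		.last_target_node = NUMA_NO_NODE,
		.alloc_charge_hpage = &alloc_charge_hpage,
	};

Defaulting the khugepaged initializer to true keeps the daemon's
behavior identical before and after this patch; only contexts that
deliberately clear the flag bypass the limits.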