From patchwork Sun Apr 10 13:54:40 2022
X-Patchwork-Submitter: Zach O'Keefe
X-Patchwork-Id: 12808151
Date: Sun, 10 Apr 2022 06:54:40 -0700
In-Reply-To: <20220410135445.3897054-1-zokeefe@google.com>
Message-Id: <20220410135445.3897054-8-zokeefe@google.com>
References: <20220410135445.3897054-1-zokeefe@google.com>
X-Mailer: git-send-email 2.35.1.1178.g4f1659d476-goog
Subject: [PATCH 07/12] mm/khugepaged: add flag to ignore khugepaged_max_ptes_*
From: "Zach O'Keefe"
To: Alex Shi, David Hildenbrand, David Rientjes, Matthew Wilcox,
 Michal Hocko, Pasha Tatashin, SeongJae Park, Song Liu, Vlastimil Babka,
 Yang Shi, Zi Yan, linux-mm@kvack.org
Cc: Andrea Arcangeli, Andrew Morton, Arnd Bergmann, Axel Rasmussen,
 Chris Kennelly, Chris Zankel, Helge Deller, Hugh Dickins,
 Ivan Kokshaysky, "James E.J. Bottomley", Jens Axboe,
 "Kirill A. Shutemov", Matt Turner, Max Filippov, Miaohe Lin,
 Minchan Kim, Patrick Xia, Pavel Begunkov, Peter Xu,
 Thomas Bogendoerfer, "Zach O'Keefe"

Add an enforce_pte_scan_limits flag to struct collapse_control that lets
the calling context decide whether to respect the sysfs-controlled knobs
khugepaged_max_ptes_[none|swap|shared].

Set this flag in the khugepaged collapse context to preserve existing
khugepaged behavior, and clear it in the madvise collapse context, since
a user requesting the collapse presumably has reason to believe it will
be beneficial.

Signed-off-by: Zach O'Keefe
---
 mm/khugepaged.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 2717262d1832..7f555da26fdc 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -87,6 +87,9 @@ static struct kmem_cache *mm_slot_cache __read_mostly;
 #define MAX_PTE_MAPPED_THP 8
 
 struct collapse_control {
+	/* Respect khugepaged_max_ptes_[none|swap|shared] */
+	bool enforce_pte_scan_limits;
+
 	/* Num pages scanned per node */
 	int node_load[MAX_NUMNODES];
 
@@ -631,6 +634,7 @@ static bool is_refcount_suitable(struct page *page)
 static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 					unsigned long address,
 					pte_t *pte,
+					struct collapse_control *cc,
 					struct list_head *compound_pagelist)
 {
 	struct page *page = NULL;
@@ -644,7 +648,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 		if (pte_none(pteval) || (pte_present(pteval) &&
 				is_zero_pfn(pte_pfn(pteval)))) {
 			if (!userfaultfd_armed(vma) &&
-			    ++none_or_zero <= khugepaged_max_ptes_none) {
+			    (++none_or_zero <= khugepaged_max_ptes_none ||
+			     !cc->enforce_pte_scan_limits)) {
 				continue;
 			} else {
 				result = SCAN_EXCEED_NONE_PTE;
@@ -664,8 +669,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 
 		VM_BUG_ON_PAGE(!PageAnon(page), page);
 
-		if (page_mapcount(page) > 1 &&
-		    ++shared > khugepaged_max_ptes_shared) {
+		if (cc->enforce_pte_scan_limits && page_mapcount(page) > 1 &&
+		    ++shared > khugepaged_max_ptes_shared) {
 			result = SCAN_EXCEED_SHARED_PTE;
 			count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
 			goto out;
@@ -1207,7 +1212,7 @@ static void collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	mmu_notifier_invalidate_range_end(&range);
 
 	spin_lock(pte_ptl);
-	cr->result = __collapse_huge_page_isolate(vma, address, pte,
+	cr->result = __collapse_huge_page_isolate(vma, address, pte, cc,
 						  &compound_pagelist);
 	spin_unlock(pte_ptl);
@@ -1296,7 +1301,8 @@ static void scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 	     _pte++, _address += PAGE_SIZE) {
 		pte_t pteval = *_pte;
 		if (is_swap_pte(pteval)) {
-			if (++unmapped <= khugepaged_max_ptes_swap) {
+			if (++unmapped <= khugepaged_max_ptes_swap ||
+			    !cc->enforce_pte_scan_limits) {
 				/*
 				 * Always be strict with uffd-wp
 				 * enabled swap entries. Please see
@@ -1315,7 +1321,8 @@ static void scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 		}
 		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
 			if (!userfaultfd_armed(vma) &&
-			    ++none_or_zero <= khugepaged_max_ptes_none) {
+			    (++none_or_zero <= khugepaged_max_ptes_none ||
+			     !cc->enforce_pte_scan_limits)) {
 				continue;
 			} else {
 				cr->result = SCAN_EXCEED_NONE_PTE;
@@ -1345,8 +1352,9 @@ static void scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 			goto out_unmap;
 		}
 
-		if (page_mapcount(page) > 1 &&
-		    ++shared > khugepaged_max_ptes_shared) {
+		if (cc->enforce_pte_scan_limits &&
+		    page_mapcount(page) > 1 &&
+		    ++shared > khugepaged_max_ptes_shared) {
 			cr->result = SCAN_EXCEED_SHARED_PTE;
 			count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
 			goto out_unmap;
@@ -2073,7 +2081,8 @@ static void khugepaged_scan_file(struct mm_struct *mm,
 			continue;
 
 		if (xa_is_value(page)) {
-			if (++swap > khugepaged_max_ptes_swap) {
+			if (cc->enforce_pte_scan_limits &&
+			    ++swap > khugepaged_max_ptes_swap) {
 				cr->result = SCAN_EXCEED_SWAP_PTE;
 				count_vm_event(THP_SCAN_EXCEED_SWAP_PTE);
 				break;
@@ -2124,7 +2133,8 @@ static void khugepaged_scan_file(struct mm_struct *mm,
 	rcu_read_unlock();
 
 	if (cr->result == SCAN_SUCCEED) {
-		if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none) {
+		if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none &&
+		    cc->enforce_pte_scan_limits) {
 			cr->result = SCAN_EXCEED_NONE_PTE;
 			count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
 		} else {
@@ -2351,6 +2361,7 @@ static int khugepaged(void *none)
 {
 	struct mm_slot *mm_slot;
 	struct collapse_control cc = {
+		.enforce_pte_scan_limits = true,
 		.last_target_node = NUMA_NO_NODE,
 		.alloc_hpage = &khugepaged_alloc_page,
 	};
@@ -2492,6 +2503,7 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev,
 			   unsigned long start, unsigned long end)
 {
 	struct collapse_control cc = {
+		.enforce_pte_scan_limits = false,
 		.last_target_node = NUMA_NO_NODE,
 		.hpage = NULL,
 		.alloc_hpage = &alloc_hpage,
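
[Editor's illustration, not part of the patch.] For readers without the
kernel tree at hand, here is a minimal standalone C sketch of the check
pattern the hunks above repeat: each per-PTE limit test short-circuits to
"allowed" when enforce_pte_scan_limits is false. The type and helper names
below are invented for illustration, and MAX_PTES_NONE stands in for the
runtime-tunable sysfs knob khugepaged_max_ptes_none.

#include <stdbool.h>
#include <stdio.h>

/* Stand-in for the sysfs knob khugepaged_max_ptes_none (hypothetical value). */
#define MAX_PTES_NONE 64

/* Mirrors the relevant fields of struct collapse_control in the patch. */
struct collapse_control_sketch {
	bool enforce_pte_scan_limits;
	int none_or_zero;	/* running count, as in the scan loops */
};

/*
 * Same shape as the patched condition:
 *   (++none_or_zero <= khugepaged_max_ptes_none ||
 *    !cc->enforce_pte_scan_limits)
 * When limits are not enforced, the count still advances but can never
 * cause the scan to bail out.
 */
static bool none_pte_ok(struct collapse_control_sketch *cc)
{
	return ++cc->none_or_zero <= MAX_PTES_NONE ||
	       !cc->enforce_pte_scan_limits;
}

int main(void)
{
	/* khugepaged context: limits enforced, as in the patch. */
	struct collapse_control_sketch khugepaged_cc = {
		.enforce_pte_scan_limits = true,
	};
	/* madvise context: limits ignored, as in the patch. */
	struct collapse_control_sketch madvise_cc = {
		.enforce_pte_scan_limits = false,
	};
	int i;

	for (i = 0; i < 100; i++) {
		if (!none_pte_ok(&khugepaged_cc)) {
			printf("khugepaged: bailed at pte %d (SCAN_EXCEED_NONE_PTE)\n", i);
			break;
		}
	}
	for (i = 0; i < 100; i++) {
		if (!none_pte_ok(&madvise_cc)) {
			printf("madvise: bailed at pte %d\n", i); /* never reached */
			break;
		}
	}
	printf("madvise: scanned all 100 ptes despite %d empty ptes\n",
	       madvise_cc.none_or_zero);
	return 0;
}

One design detail visible in the hunks themselves: the placement of the
flag test differs by site. For the none/zero and anon-swap checks the
counter increment stays first, so the counts keep advancing even when
limits are ignored, while for the shared-PTE checks and the file-backed
swap check the cc->enforce_pte_scan_limits test comes first and
short-circuits the ++shared/++swap increment entirely.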