From patchwork Tue Apr 26 14:44:07 2022
X-Patchwork-Submitter: Zach O'Keefe
X-Patchwork-Id: 12827327
Date: Tue, 26 Apr 2022 07:44:07 -0700
In-Reply-To: <20220426144412.742113-1-zokeefe@google.com>
Message-Id: <20220426144412.742113-8-zokeefe@google.com>
References: <20220426144412.742113-1-zokeefe@google.com>
X-Mailer: git-send-email 2.36.0.rc2.479.g8af0fa9b8e-goog
Subject: [PATCH v3 07/12] mm/khugepaged: add flag to ignore
 khugepaged_max_ptes_*
From: "Zach O'Keefe" <zokeefe@google.com>
To: Alex Shi, David Hildenbrand, David Rientjes, Matthew Wilcox,
 Michal Hocko, Pasha Tatashin, Peter Xu, SeongJae Park, Song Liu,
 Vlastimil Babka, Yang Shi, Zi Yan, linux-mm@kvack.org
Cc: Andrea Arcangeli, Andrew Morton, Arnd Bergmann, Axel Rasmussen,
 Chris Kennelly, Chris Zankel, Helge Deller, Hugh Dickins,
 Ivan Kokshaysky, "James E.J. Bottomley", Jens Axboe,
 "Kirill A. Shutemov", Matt Turner, Max Filippov, Miaohe Lin,
 Minchan Kim, Patrick Xia, Pavel Begunkov, Thomas Bogendoerfer,
 "Zach O'Keefe"

Add an enforce_pte_scan_limits flag to struct collapse_control that
allows a collapse context to ignore the sysfs-controlled knobs
khugepaged_max_ptes_[none|swap|shared]. Set the flag in the khugepaged
collapse context to preserve existing khugepaged behavior, and clear it
in the madvise collapse context, since the user presumably has reason
to believe the collapse will be beneficial.

Signed-off-by: Zach O'Keefe <zokeefe@google.com>
---
 mm/khugepaged.c | 32 ++++++++++++++++++++++----------
 1 file changed, 22 insertions(+), 10 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index a6881f5b3c67..57725482290d 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -87,6 +87,9 @@ static struct kmem_cache *mm_slot_cache __read_mostly;
 #define MAX_PTE_MAPPED_THP 8
 
 struct collapse_control {
+	/* Respect khugepaged_max_ptes_[none|swap|shared] */
+	bool enforce_pte_scan_limits;
+
 	/* Num pages scanned per node */
 	int node_load[MAX_NUMNODES];
 
@@ -632,6 +635,7 @@ static bool is_refcount_suitable(struct page *page)
 static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 					unsigned long address,
 					pte_t *pte,
+					struct collapse_control *cc,
 					struct list_head *compound_pagelist)
 {
 	struct page *page = NULL;
@@ -645,7 +649,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 		if (pte_none(pteval) || (pte_present(pteval) &&
 				is_zero_pfn(pte_pfn(pteval)))) {
 			if (!userfaultfd_armed(vma) &&
-			    ++none_or_zero <= khugepaged_max_ptes_none) {
+			    (++none_or_zero <= khugepaged_max_ptes_none ||
+			     !cc->enforce_pte_scan_limits)) {
 				continue;
 			} else {
 				result = SCAN_EXCEED_NONE_PTE;
@@ -665,8 +670,8 @@ static int __collapse_huge_page_isolate(struct vm_area_struct *vma,
 
 		VM_BUG_ON_PAGE(!PageAnon(page), page);
 
-		if (page_mapcount(page) > 1 &&
-				++shared > khugepaged_max_ptes_shared) {
+		if (cc->enforce_pte_scan_limits && page_mapcount(page) > 1 &&
+				++shared > khugepaged_max_ptes_shared) {
 			result = SCAN_EXCEED_SHARED_PTE;
 			count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
 			goto out;
@@ -1208,7 +1213,7 @@ static void collapse_huge_page(struct mm_struct *mm, unsigned long address,
 	mmu_notifier_invalidate_range_end(&range);
 
 	spin_lock(pte_ptl);
-	cr->result = __collapse_huge_page_isolate(vma, address, pte,
+	cr->result = __collapse_huge_page_isolate(vma, address, pte, cc,
 						  &compound_pagelist);
 	spin_unlock(pte_ptl);
 
@@ -1297,7 +1302,8 @@ static void scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 	     _pte++, _address += PAGE_SIZE) {
 		pte_t pteval = *_pte;
 		if (is_swap_pte(pteval)) {
-			if (++unmapped <= khugepaged_max_ptes_swap) {
+			if (++unmapped <= khugepaged_max_ptes_swap ||
+			    !cc->enforce_pte_scan_limits) {
 				/*
 				 * Always be strict with uffd-wp
 				 * enabled swap entries.  Please see
@@ -1316,7 +1322,8 @@ static void scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 		}
 		if (pte_none(pteval) || is_zero_pfn(pte_pfn(pteval))) {
 			if (!userfaultfd_armed(vma) &&
-			    ++none_or_zero <= khugepaged_max_ptes_none) {
+			    (++none_or_zero <= khugepaged_max_ptes_none ||
+			     !cc->enforce_pte_scan_limits)) {
 				continue;
 			} else {
 				cr->result = SCAN_EXCEED_NONE_PTE;
@@ -1346,8 +1353,9 @@ static void scan_pmd(struct mm_struct *mm, struct vm_area_struct *vma,
 			goto out_unmap;
 		}
 
-		if (page_mapcount(page) > 1 &&
-				++shared > khugepaged_max_ptes_shared) {
+		if (cc->enforce_pte_scan_limits &&
+		    page_mapcount(page) > 1 &&
+		    ++shared > khugepaged_max_ptes_shared) {
 			cr->result = SCAN_EXCEED_SHARED_PTE;
 			count_vm_event(THP_SCAN_EXCEED_SHARED_PTE);
 			goto out_unmap;
@@ -2087,7 +2095,8 @@ static void khugepaged_scan_file(struct mm_struct *mm,
 			continue;
 
 		if (xa_is_value(page)) {
-			if (++swap > khugepaged_max_ptes_swap) {
+			if (cc->enforce_pte_scan_limits &&
+			    ++swap > khugepaged_max_ptes_swap) {
 				cr->result = SCAN_EXCEED_SWAP_PTE;
 				count_vm_event(THP_SCAN_EXCEED_SWAP_PTE);
 				break;
@@ -2138,7 +2147,8 @@ static void khugepaged_scan_file(struct mm_struct *mm,
 	rcu_read_unlock();
 
 	if (cr->result == SCAN_SUCCEED) {
-		if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none) {
+		if (present < HPAGE_PMD_NR - khugepaged_max_ptes_none &&
+		    cc->enforce_pte_scan_limits) {
 			cr->result = SCAN_EXCEED_NONE_PTE;
 			count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
 		} else {
@@ -2365,6 +2375,7 @@ static int khugepaged(void *none)
 {
 	struct mm_slot *mm_slot;
 	struct collapse_control cc = {
+		.enforce_pte_scan_limits = true,
 		.last_target_node = NUMA_NO_NODE,
 		.gfp = &alloc_hugepage_khugepaged_gfpmask,
 		.alloc_hpage = &khugepaged_alloc_page,
@@ -2512,6 +2523,7 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev,
 		     unsigned long start, unsigned long end)
 {
 	struct collapse_control cc = {
+		.enforce_pte_scan_limits = false,
 		.last_target_node = NUMA_NO_NODE,
 		.hpage = NULL,
 		.gfp = &alloc_hugepage_madvise_gfpmask,
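
For readers following the series, here is a minimal, illustrative userspace
sketch (not part of this patch) of the behavior difference the flag creates:
khugepaged keeps honoring the max_ptes_* sysfs knobs, while a MADV_COLLAPSE
request (introduced earlier in this series) proceeds even on ranges those
knobs would reject. The MADV_COLLAPSE fallback value below follows the
proposed uapi header and is an assumption, as is the 2MiB PMD hugepage size.

/*
 * Illustrative only -- not part of this patch.
 */
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MADV_COLLAPSE
#define MADV_COLLAPSE	25	/* assumed value from the proposed uapi header */
#endif

#define HPAGE_SIZE	(2UL << 20)	/* assumes 2MiB PMD-sized THPs */

int main(void)
{
	size_t len = 2 * HPAGE_SIZE;
	char *map, *buf;

	map = mmap(NULL, len, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (map == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	/* Round up to a PMD boundary so one whole hugepage fits. */
	buf = (char *)(((uintptr_t)map + HPAGE_SIZE - 1) & ~(HPAGE_SIZE - 1));

	/*
	 * Fault in a single base page, leaving the other 511 PTEs none.
	 * If khugepaged_max_ptes_none has been lowered from its default,
	 * khugepaged will skip this mostly-empty range; MADV_COLLAPSE
	 * requests the collapse anyway, because enforce_pte_scan_limits
	 * is false in its collapse_control.
	 */
	memset(buf, 1, 4096);

	if (madvise(buf, HPAGE_SIZE, MADV_COLLAPSE))
		perror("madvise(MADV_COLLAPSE)");
	else
		printf("collapse requested for %p\n", (void *)buf);

	munmap(map, len);
	return 0;
}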