From patchwork Thu Apr 14 18:06:03 2022
X-Patchwork-Submitter: Zach O'Keefe
X-Patchwork-Id: 12813844
Date: Thu, 14 Apr 2022 11:06:03 -0700
In-Reply-To: <20220414180612.3844426-1-zokeefe@google.com>
Message-Id: <20220414180612.3844426-4-zokeefe@google.com>
References: <20220414180612.3844426-1-zokeefe@google.com>
Subject: [PATCH v2 03/12] mm/khugepaged: make hugepage allocation context-specific
From: "Zach O'Keefe"
To: Alex Shi, David Hildenbrand, David Rientjes, Matthew Wilcox,
 Michal Hocko, Pasha Tatashin, SeongJae Park, Song Liu, Vlastimil Babka,
 Yang Shi, Zi Yan, linux-mm@kvack.org
Cc: Andrea Arcangeli, Andrew Morton, Arnd Bergmann, Axel Rasmussen,
 Chris Kennelly, Chris Zankel, Helge Deller, Hugh Dickins,
 Ivan Kokshaysky, "James E.J. Bottomley", Jens Axboe,
 "Kirill A. Shutemov", Matt Turner, Max Filippov, Miaohe Lin,
 Minchan Kim, Patrick Xia, Pavel Begunkov, Peter Xu,
 Thomas Bogendoerfer, "Zach O'Keefe", kernel test robot

Add a hugepage allocation context to struct collapse_control so that
different collapse contexts can allocate hugepages differently. For
example, khugepaged allocates hugepages differently in NUMA and UMA
configurations, and other collapse contexts shouldn't be coupled to
that decision. Additionally, move the [pre]allocated hugepage pointer
into struct collapse_control.

Signed-off-by: Zach O'Keefe
Reported-by: kernel test robot
---
 mm/khugepaged.c | 96 ++++++++++++++++++++++++-------------------------
 1 file changed, 48 insertions(+), 48 deletions(-)
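As an aside for readers of the series (not part of the patch, and compiled
entirely in userspace), the sketch below illustrates the dispatch pattern the
new fields enable: a context structure owns both the allocated object and the
allocator callback, so each context can plug in its own allocation strategy.
All names in it (struct ctx, alloc_plain, alloc_zeroed) are illustrative and
do not exist in the kernel.

	#include <stdio.h>
	#include <stdlib.h>

	struct ctx;
	typedef void *(*alloc_fn)(struct ctx *c, size_t size);

	struct ctx {
		void *obj;	/* analogous to cc->hpage */
		alloc_fn alloc;	/* analogous to cc->alloc_hpage */
	};

	/* one allocation strategy */
	static void *alloc_plain(struct ctx *c, size_t size)
	{
		c->obj = malloc(size);
		return c->obj;
	}

	/* a different strategy; callers only ever go through c->alloc() */
	static void *alloc_zeroed(struct ctx *c, size_t size)
	{
		c->obj = calloc(1, size);
		return c->obj;
	}

	int main(void)
	{
		struct ctx background = { .obj = NULL, .alloc = alloc_plain };
		struct ctx synchronous = { .obj = NULL, .alloc = alloc_zeroed };

		/* each context allocates through its own callback */
		if (background.alloc(&background, 64) &&
		    synchronous.alloc(&synchronous, 64))
			printf("both contexts allocated via their own callbacks\n");

		free(background.obj);
		free(synchronous.obj);
		return 0;
	}

In the patch itself, khugepaged installs khugepaged_alloc_page as
cc->alloc_hpage when it initializes its struct collapse_control, so its
behavior is unchanged while other collapse contexts remain free to supply a
different allocator.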
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 25f45ac7f6bd..21c8436fa73c 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -92,6 +92,10 @@ struct collapse_control {
 
 	/* Last target selected in khugepaged_find_target_node() for this scan */
 	int last_target_node;
+
+	struct page *hpage;
+	struct page* (*alloc_hpage)(struct collapse_control *cc, gfp_t gfp,
+				    int node);
 };
 
 /**
@@ -877,21 +881,21 @@ static bool khugepaged_prealloc_page(struct page **hpage, bool *wait)
 	return true;
 }
 
-static struct page *
-khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
+static struct page *khugepaged_alloc_page(struct collapse_control *cc,
+					  gfp_t gfp, int node)
 {
-	VM_BUG_ON_PAGE(*hpage, *hpage);
+	VM_BUG_ON_PAGE(cc->hpage, cc->hpage);
 
-	*hpage = __alloc_pages_node(node, gfp, HPAGE_PMD_ORDER);
-	if (unlikely(!*hpage)) {
+	cc->hpage = __alloc_pages_node(node, gfp, HPAGE_PMD_ORDER);
+	if (unlikely(!cc->hpage)) {
 		count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
-		*hpage = ERR_PTR(-ENOMEM);
+		cc->hpage = ERR_PTR(-ENOMEM);
 		return NULL;
 	}
 
-	prep_transhuge_page(*hpage);
+	prep_transhuge_page(cc->hpage);
 	count_vm_event(THP_COLLAPSE_ALLOC);
-	return *hpage;
+	return cc->hpage;
 }
 #else
 static int khugepaged_find_target_node(struct collapse_control *cc)
@@ -953,12 +957,12 @@ static bool khugepaged_prealloc_page(struct page **hpage, bool *wait)
 	return true;
 }
 
-static struct page *
-khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
+static struct page *khugepaged_alloc_page(struct collapse_control *cc,
+					  gfp_t gfp, int node)
 {
-	VM_BUG_ON(!*hpage);
+	VM_BUG_ON(!cc->hpage);
 
-	return *hpage;
+	return cc->hpage;
 }
 #endif
@@ -1080,10 +1084,9 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
 	return true;
 }
 
-static void collapse_huge_page(struct mm_struct *mm,
-			       unsigned long address,
-			       struct page **hpage,
-			       int node, int referenced, int unmapped)
+static void collapse_huge_page(struct mm_struct *mm, unsigned long address,
+			       struct collapse_control *cc, int referenced,
+			       int unmapped)
 {
 	LIST_HEAD(compound_pagelist);
 	pmd_t *pmd, _pmd;
@@ -1096,6 +1099,7 @@ static void collapse_huge_page(struct mm_struct *mm,
 	struct mmu_notifier_range range;
 	gfp_t gfp;
 	const struct cpumask *cpumask;
+	int node;
 
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 
@@ -1110,13 +1114,14 @@ static void collapse_huge_page(struct mm_struct *mm,
 	 */
 	mmap_read_unlock(mm);
 
+	node = khugepaged_find_target_node(cc);
 	/* sched to specified node before huage page memory copy */
 	if (task_node(current) != node) {
 		cpumask = cpumask_of_node(node);
 		if (!cpumask_empty(cpumask))
 			set_cpus_allowed_ptr(current, cpumask);
 	}
-	new_page = khugepaged_alloc_page(hpage, gfp, node);
+	new_page = cc->alloc_hpage(cc, gfp, node);
 	if (!new_page) {
 		result = SCAN_ALLOC_HUGE_PAGE_FAIL;
 		goto out_nolock;
@@ -1238,15 +1243,15 @@ static void collapse_huge_page(struct mm_struct *mm,
 	update_mmu_cache_pmd(vma, address, pmd);
 	spin_unlock(pmd_ptl);
 
-	*hpage = NULL;
+	cc->hpage = NULL;
 
 	khugepaged_pages_collapsed++;
 	result = SCAN_SUCCEED;
 out_up_write:
 	mmap_write_unlock(mm);
 out_nolock:
-	if (!IS_ERR_OR_NULL(*hpage))
-		mem_cgroup_uncharge(page_folio(*hpage));
+	if (!IS_ERR_OR_NULL(cc->hpage))
+		mem_cgroup_uncharge(page_folio(cc->hpage));
 	trace_mm_collapse_huge_page(mm, isolated, result);
 	return;
 }
@@ -1254,7 +1259,6 @@ static void collapse_huge_page(struct mm_struct *mm,
 static int khugepaged_scan_pmd(struct mm_struct *mm,
 			       struct vm_area_struct *vma,
 			       unsigned long address,
-			       struct page **hpage,
 			       struct collapse_control *cc)
 {
 	pmd_t *pmd;
@@ -1399,10 +1403,8 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 out_unmap:
 	pte_unmap_unlock(pte, ptl);
 	if (ret) {
-		node = khugepaged_find_target_node(cc);
 		/* collapse_huge_page will return with the mmap_lock released */
-		collapse_huge_page(mm, address, hpage, node,
-				   referenced, unmapped);
+		collapse_huge_page(mm, address, cc, referenced, unmapped);
 	}
 out:
 	trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced,
@@ -1667,8 +1669,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  * @mm: process address space where collapse happens
  * @file: file that collapse on
  * @start: collapse start address
- * @hpage: new allocated huge page for collapse
- * @node: appointed node the new huge page allocate from
+ * @cc: collapse context and scratchpad
  *
  * Basic scheme is simple, details are more complex:
  * - allocate and lock a new huge page;
@@ -1686,8 +1687,8 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  *    + unlock and free huge page;
  */
 static void collapse_file(struct mm_struct *mm,
-		struct file *file, pgoff_t start,
-		struct page **hpage, int node)
+			  struct file *file, pgoff_t start,
+			  struct collapse_control *cc)
 {
 	struct address_space *mapping = file->f_mapping;
 	gfp_t gfp;
@@ -1697,15 +1698,16 @@ static void collapse_file(struct mm_struct *mm,
 	XA_STATE_ORDER(xas, &mapping->i_pages, start, HPAGE_PMD_ORDER);
 	int nr_none = 0, result = SCAN_SUCCEED;
 	bool is_shmem = shmem_file(file);
-	int nr;
+	int nr, node;
 
 	VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem);
 	VM_BUG_ON(start & (HPAGE_PMD_NR - 1));
 
 	/* Only allocate from the target node */
 	gfp = alloc_hugepage_khugepaged_gfpmask() | __GFP_THISNODE;
+	node = khugepaged_find_target_node(cc);
 
-	new_page = khugepaged_alloc_page(hpage, gfp, node);
+	new_page = cc->alloc_hpage(cc, gfp, node);
 	if (!new_page) {
 		result = SCAN_ALLOC_HUGE_PAGE_FAIL;
 		goto out;
@@ -1998,7 +2000,7 @@ static void collapse_file(struct mm_struct *mm,
 		 * Remove pte page tables, so we can re-fault the page as huge.
 		 */
 		retract_page_tables(mapping, start);
-		*hpage = NULL;
+		cc->hpage = NULL;
 
 		khugepaged_pages_collapsed++;
 	} else {
@@ -2045,14 +2047,14 @@ static void collapse_file(struct mm_struct *mm,
 	unlock_page(new_page);
 out:
 	VM_BUG_ON(!list_empty(&pagelist));
-	if (!IS_ERR_OR_NULL(*hpage))
-		mem_cgroup_uncharge(page_folio(*hpage));
+	if (!IS_ERR_OR_NULL(cc->hpage))
+		mem_cgroup_uncharge(page_folio(cc->hpage));
 	/* TODO: tracepoints */
 }
 
 static void khugepaged_scan_file(struct mm_struct *mm,
-		struct file *file, pgoff_t start, struct page **hpage,
-		struct collapse_control *cc)
+			struct file *file, pgoff_t start,
+			struct collapse_control *cc)
 {
 	struct page *page = NULL;
 	struct address_space *mapping = file->f_mapping;
@@ -2125,8 +2127,7 @@ static void khugepaged_scan_file(struct mm_struct *mm,
 			result = SCAN_EXCEED_NONE_PTE;
 			count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
 		} else {
-			node = khugepaged_find_target_node(cc);
-			collapse_file(mm, file, start, hpage, node);
+			collapse_file(mm, file, start, cc);
 		}
 	}
 
@@ -2134,8 +2135,8 @@ static void khugepaged_scan_file(struct mm_struct *mm,
 }
 #else
 static void khugepaged_scan_file(struct mm_struct *mm,
-		struct file *file, pgoff_t start, struct page **hpage,
-		struct collapse_control *cc)
+			struct file *file, pgoff_t start,
+			struct collapse_control *cc)
 {
 	BUILD_BUG();
 }
@@ -2146,7 +2147,6 @@ static void khugepaged_collapse_pte_mapped_thps(struct mm_slot *mm_slot)
 #endif
 
 static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
-					    struct page **hpage,
 					    struct collapse_control *cc)
 	__releases(&khugepaged_mm_lock)
 	__acquires(&khugepaged_mm_lock)
@@ -2223,12 +2223,11 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages,
 
 				mmap_read_unlock(mm);
 				ret = 1;
-				khugepaged_scan_file(mm, file, pgoff, hpage, cc);
+				khugepaged_scan_file(mm, file, pgoff, cc);
 				fput(file);
 			} else {
 				ret = khugepaged_scan_pmd(mm, vma,
-						khugepaged_scan.address,
-						hpage, cc);
+						khugepaged_scan.address, cc);
 			}
 			/* move to next address */
 			khugepaged_scan.address += HPAGE_PMD_SIZE;
@@ -2286,15 +2285,15 @@ static int khugepaged_wait_event(void)
 
 static void khugepaged_do_scan(struct collapse_control *cc)
 {
-	struct page *hpage = NULL;
 	unsigned int progress = 0, pass_through_head = 0;
 	unsigned int pages = READ_ONCE(khugepaged_pages_to_scan);
 	bool wait = true;
 
+	cc->hpage = NULL;
 	lru_add_drain_all();
 
 	while (progress < pages) {
-		if (!khugepaged_prealloc_page(&hpage, &wait))
+		if (!khugepaged_prealloc_page(&cc->hpage, &wait))
 			break;
 
 		cond_resched();
@@ -2308,14 +2307,14 @@ static void khugepaged_do_scan(struct collapse_control *cc)
 		if (khugepaged_has_work() &&
 		    pass_through_head < 2)
 			progress += khugepaged_scan_mm_slot(pages - progress,
-							    &hpage, cc);
+							    cc);
 		else
 			progress = pages;
 		spin_unlock(&khugepaged_mm_lock);
 	}
 
-	if (!IS_ERR_OR_NULL(hpage))
-		put_page(hpage);
+	if (!IS_ERR_OR_NULL(cc->hpage))
+		put_page(cc->hpage);
 }
 
 static bool khugepaged_should_wakeup(void)
@@ -2349,6 +2348,7 @@ static int khugepaged(void *none)
 	struct mm_slot *mm_slot;
 	struct collapse_control cc = {
 		.last_target_node = NUMA_NO_NODE,
+		.alloc_hpage = &khugepaged_alloc_page,
 	};
 
 	set_freezable();