From patchwork Mon May 2 18:17:04 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Zach O'Keefe
X-Patchwork-Id: 12834607
Date: Mon, 2 May 2022 11:17:04 -0700
In-Reply-To: <20220502181714.3483177-1-zokeefe@google.com>
Message-Id: <20220502181714.3483177-4-zokeefe@google.com>
References: <20220502181714.3483177-1-zokeefe@google.com>
X-Mailer: git-send-email 2.36.0.464.gb9c8b46e94-goog
Subject: [PATCH v4 03/13] mm/khugepaged: dedup and simplify hugepage alloc
 and charging
From: "Zach O'Keefe"
To: Alex Shi, David Hildenbrand, David Rientjes, Matthew Wilcox,
 Michal Hocko, Pasha Tatashin, Peter Xu, SeongJae Park, Song Liu,
 Vlastimil Babka, Yang Shi, Zi Yan, linux-mm@kvack.org
Cc: Andrea Arcangeli, Andrew Morton, Arnd Bergmann, Axel Rasmussen,
 Chris Kennelly, Chris Zankel, Helge Deller, Hugh Dickins,
 Ivan Kokshaysky, "James E.J. Bottomley", Jens Axboe,
 "Kirill A. Shutemov", Matt Turner, Max Filippov, Miaohe Lin,
 Minchan Kim, Patrick Xia, Pavel Begunkov, Thomas Bogendoerfer,
 "Zach O'Keefe"
Sender: owner-linux-mm@kvack.org

The following code is duplicated in collapse_huge_page() and
collapse_file():

	/* Only allocate from the target node */
	gfp = alloc_hugepage_khugepaged_gfpmask() | __GFP_THISNODE;

	new_page = khugepaged_alloc_page(hpage, gfp, node);
	if (!new_page) {
		result = SCAN_ALLOC_HUGE_PAGE_FAIL;
		goto out;
	}

	if (unlikely(mem_cgroup_charge(page_folio(new_page), mm, gfp))) {
		result = SCAN_CGROUP_CHARGE_FAIL;
		goto out;
	}
	count_memcg_page_event(new_page, THP_COLLAPSE_ALLOC);

Also, "node" is passed as an argument to both collapse_huge_page() and
collapse_file() and obtained the same way, via
khugepaged_find_target_node().

Move all this into a new helper, alloc_charge_hpage(), and remove the
duplicate code from collapse_huge_page() and collapse_file(). Also,
simplify khugepaged_alloc_page() by returning a bool indicating
allocation success instead of a copy of the (possibly) allocated
struct page.

Suggested-by: Peter Xu
Signed-off-by: Zach O'Keefe
---

This patch currently depends on 'mm/khugepaged: sched to numa node when
collapse huge page' currently being discussed upstream[1], and
anticipates that this functionality would be equally applicable to
file-backed collapse. It also goes ahead and wraps this code in a
CONFIG_NUMA #ifdef.

[1] https://lore.kernel.org/linux-mm/20220317065024.2635069-1-maobibo@loongson.cn/

 mm/khugepaged.c | 99 +++++++++++++++++++++++--------------------------
 1 file changed, 46 insertions(+), 53 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 59357e34e7ce..b05fb9a85eab 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -866,8 +866,7 @@ static bool khugepaged_prealloc_page(struct page **hpage, bool *wait)
 	return true;
 }
 
-static struct page *
-khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
+static bool khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
 {
 	VM_BUG_ON_PAGE(*hpage, *hpage);
 
@@ -875,12 +874,12 @@ khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
 	if (unlikely(!*hpage)) {
 		count_vm_event(THP_COLLAPSE_ALLOC_FAILED);
 		*hpage = ERR_PTR(-ENOMEM);
-		return NULL;
+		return false;
 	}
 
 	prep_transhuge_page(*hpage);
 	count_vm_event(THP_COLLAPSE_ALLOC);
-	return *hpage;
+	return true;
 }
 #else
 static int khugepaged_find_target_node(struct collapse_control *cc)
@@ -942,12 +941,11 @@ static bool khugepaged_prealloc_page(struct page **hpage, bool *wait)
 	return true;
 }
 
-static struct page *
-khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
+static bool khugepaged_alloc_page(struct page **hpage, gfp_t gfp, int node)
 {
 	VM_BUG_ON(!*hpage);
 
-	return *hpage;
+	return true;
 }
 #endif
 
@@ -1069,10 +1067,34 @@ static bool __collapse_huge_page_swapin(struct mm_struct *mm,
 	return true;
 }
 
-static void collapse_huge_page(struct mm_struct *mm,
-				   unsigned long address,
-				   struct page **hpage,
-				   int node, int referenced, int unmapped)
+static int alloc_charge_hpage(struct page **hpage, struct mm_struct *mm,
+			      struct collapse_control *cc)
+{
+#ifdef CONFIG_NUMA
+	const struct cpumask *cpumask;
+#endif
+	gfp_t gfp = alloc_hugepage_khugepaged_gfpmask() | __GFP_THISNODE;
+	int node = khugepaged_find_target_node(cc);
+
+#ifdef CONFIG_NUMA
+	/* sched to specified node before huge page memory copy */
+	if (task_node(current) != node) {
+		cpumask = cpumask_of_node(node);
+		if (!cpumask_empty(cpumask))
+			set_cpus_allowed_ptr(current, cpumask);
+	}
+#endif
+	if (!khugepaged_alloc_page(hpage, gfp, node))
+		return SCAN_ALLOC_HUGE_PAGE_FAIL;
+	if (unlikely(mem_cgroup_charge(page_folio(*hpage), mm, gfp)))
+		return SCAN_CGROUP_CHARGE_FAIL;
+	count_memcg_page_event(*hpage, THP_COLLAPSE_ALLOC);
+	return SCAN_SUCCEED;
+}
+
+static void collapse_huge_page(struct mm_struct *mm, unsigned long address,
+			       struct page **hpage, int referenced,
+			       int unmapped, struct collapse_control *cc)
 {
 	LIST_HEAD(compound_pagelist);
 	pmd_t *pmd, _pmd;
@@ -1083,14 +1105,9 @@ static void collapse_huge_page(struct mm_struct *mm,
 	int isolated = 0, result = 0;
 	struct vm_area_struct *vma;
 	struct mmu_notifier_range range;
-	gfp_t gfp;
-	const struct cpumask *cpumask;
 
 	VM_BUG_ON(address & ~HPAGE_PMD_MASK);
 
-	/* Only allocate from the target node */
-	gfp = alloc_hugepage_khugepaged_gfpmask() | __GFP_THISNODE;
-
 	/*
 	 * Before allocating the hugepage, release the mmap_lock read lock.
 	 * The allocation can take potentially a long time if it involves
@@ -1099,23 +1116,11 @@ static void collapse_huge_page(struct mm_struct *mm,
 	 */
 	mmap_read_unlock(mm);
 
-	/* sched to specified node before huage page memory copy */
-	if (task_node(current) != node) {
-		cpumask = cpumask_of_node(node);
-		if (!cpumask_empty(cpumask))
-			set_cpus_allowed_ptr(current, cpumask);
-	}
-	new_page = khugepaged_alloc_page(hpage, gfp, node);
-	if (!new_page) {
-		result = SCAN_ALLOC_HUGE_PAGE_FAIL;
+	result = alloc_charge_hpage(hpage, mm, cc);
+	if (result != SCAN_SUCCEED)
 		goto out_nolock;
-	}
 
-	if (unlikely(mem_cgroup_charge(page_folio(new_page), mm, gfp))) {
-		result = SCAN_CGROUP_CHARGE_FAIL;
-		goto out_nolock;
-	}
-	count_memcg_page_event(new_page, THP_COLLAPSE_ALLOC);
+	new_page = *hpage;
 
 	mmap_read_lock(mm);
 	result = hugepage_vma_revalidate(mm, address, &vma);
@@ -1388,10 +1393,9 @@ static int khugepaged_scan_pmd(struct mm_struct *mm,
 out_unmap:
 	pte_unmap_unlock(pte, ptl);
 	if (ret) {
-		node = khugepaged_find_target_node(cc);
 		/* collapse_huge_page will return with the mmap_lock released */
-		collapse_huge_page(mm, address, hpage, node,
-				referenced, unmapped);
+		collapse_huge_page(mm, address, hpage, referenced, unmapped,
+				   cc);
 	}
 out:
 	trace_mm_khugepaged_scan_pmd(mm, page, writable, referenced,
@@ -1657,7 +1661,7 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  * @file: file that collapse on
  * @start: collapse start address
  * @hpage: new allocated huge page for collapse
- * @node: appointed node the new huge page allocate from
+ * @cc: collapse context and scratchpad
  *
  * Basic scheme is simple, details are more complex:
  *  - allocate and lock a new huge page;
@@ -1674,12 +1678,11 @@ static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
  *    + restore gaps in the page cache;
  *    + unlock and free huge page;
  */
-static void collapse_file(struct mm_struct *mm,
-		struct file *file, pgoff_t start,
-		struct page **hpage, int node)
+static void collapse_file(struct mm_struct *mm, struct file *file,
+			  pgoff_t start, struct page **hpage,
+			  struct collapse_control *cc)
 {
 	struct address_space *mapping = file->f_mapping;
-	gfp_t gfp;
 	struct page *new_page;
 	pgoff_t index, end = start + HPAGE_PMD_NR;
 	LIST_HEAD(pagelist);
@@ -1691,20 +1694,11 @@ static void collapse_file(struct mm_struct *mm,
 	VM_BUG_ON(!IS_ENABLED(CONFIG_READ_ONLY_THP_FOR_FS) && !is_shmem);
 	VM_BUG_ON(start & (HPAGE_PMD_NR - 1));
 
-	/* Only allocate from the target node */
-	gfp = alloc_hugepage_khugepaged_gfpmask() | __GFP_THISNODE;
-
-	new_page = khugepaged_alloc_page(hpage, gfp, node);
-	if (!new_page) {
-		result = SCAN_ALLOC_HUGE_PAGE_FAIL;
+	result = alloc_charge_hpage(hpage, mm, cc);
+	if (result != SCAN_SUCCEED)
 		goto out;
-	}
 
-	if (unlikely(mem_cgroup_charge(page_folio(new_page), mm, gfp))) {
-		result = SCAN_CGROUP_CHARGE_FAIL;
-		goto out;
-	}
-	count_memcg_page_event(new_page, THP_COLLAPSE_ALLOC);
+	new_page = *hpage;
 
 	/*
 	 * Ensure we have slots for all the pages in the range. This is
@@ -2114,8 +2108,7 @@ static void khugepaged_scan_file(struct mm_struct *mm,
 			result = SCAN_EXCEED_NONE_PTE;
 			count_vm_event(THP_SCAN_EXCEED_NONE_PTE);
 		} else {
-			node = khugepaged_find_target_node(cc);
-			collapse_file(mm, file, start, hpage, node);
+			collapse_file(mm, file, start, hpage, cc);
 		}
 	}
 
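
Not part of the patch: a minimal user-space sketch of the calling
convention alloc_charge_hpage() introduces, for readers who want to see
the pattern in isolation. The allocator reports success as a bool and
hands the page back through *hpage, and the combined helper returns a
single status code. Everything prefixed mock_, the fail_alloc and
fail_charge knobs, and the reduced struct page are assumptions made for
illustration; none of this is kernel code, and the real helper also
picks the target node and sets the GFP mask.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the kernel's scan result codes. */
enum scan_result {
	SCAN_SUCCEED,
	SCAN_ALLOC_HUGE_PAGE_FAIL,
	SCAN_CGROUP_CHARGE_FAIL,
};

/* Hypothetical stand-in for struct page; not the kernel type. */
struct page {
	bool charged;
};

/* Knobs invented for this sketch: force either step to fail. */
static bool fail_alloc;
static bool fail_charge;

/*
 * Mirrors the bool-returning khugepaged_alloc_page(): success comes back
 * in the return value, the page in *hpage. Unlike the kernel version,
 * this mock leaves *hpage NULL on failure rather than storing ERR_PTR().
 */
static bool mock_alloc_page(struct page **hpage)
{
	assert(*hpage == NULL);	/* plays the role of VM_BUG_ON_PAGE() */
	if (fail_alloc)
		return false;
	*hpage = calloc(1, sizeof(**hpage));
	return *hpage != NULL;
}

/* Mock charge step: 0 on success, nonzero on failure. */
static int mock_charge(struct page *page)
{
	if (fail_charge)
		return -1;
	page->charged = true;
	return 0;
}

/*
 * Mirrors the shape of alloc_charge_hpage(): allocate, then charge, and
 * report which step failed. On charge failure the page stays in *hpage
 * and the caller is responsible for releasing it.
 */
static enum scan_result mock_alloc_charge_hpage(struct page **hpage)
{
	if (!mock_alloc_page(hpage))
		return SCAN_ALLOC_HUGE_PAGE_FAIL;
	if (mock_charge(*hpage))
		return SCAN_CGROUP_CHARGE_FAIL;
	return SCAN_SUCCEED;
}
```

With this shape, each caller reduces to one call and one comparison
against SCAN_SUCCEED, then reads the page from *hpage, which is what
both collapse_huge_page() and collapse_file() do after this patch.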