From patchwork Tue Jun 20 07:58:07 2023
X-Patchwork-Submitter: Hugh Dickins
X-Patchwork-Id: 13285286
Date: Tue, 20 Jun 2023 00:58:07 -0700 (PDT)
From: Hugh Dickins <hughd@google.com>
To: Andrew Morton
cc: Gerald Schaefer, Vasily Gorbik, Mike Kravetz, Mike Rapoport,
    "Kirill A. Shutemov", Matthew Wilcox, David Hildenbrand,
    Suren Baghdasaryan, Qi Zheng, Yang Shi, Mel Gorman, Peter Xu,
    Peter Zijlstra, Will Deacon, Yu Zhao, Alistair Popple, Ralph Campbell,
    Ira Weiny, Steven Price, SeongJae Park, Lorenzo Stoakes, Huang Ying,
    Naoya Horiguchi, Christophe Leroy, Zack Rusin, Jason Gunthorpe,
    Axel Rasmussen, Anshuman Khandual, Pasha Tatashin, Miaohe Lin,
    Minchan Kim, Christoph Hellwig, Song Liu, Thomas Hellstrom,
    Russell King, "David S. Miller", Michael Ellerman,
    "Aneesh Kumar K.V", Heiko Carstens, Christian Borntraeger,
    Claudio Imbrenda, Alexander Gordeev, Jann Horn, Vishal Moola,
    Vlastimil Babka, linux-arm-kernel@lists.infradead.org,
    sparclinux@vger.kernel.org, linuxppc-dev@lists.ozlabs.org,
    linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org
Subject: [PATCH v2 11/12] mm/khugepaged: delete khugepaged_collapse_pte_mapped_thps()
In-Reply-To: <54cb04f-3762-987f-8294-91dafd8ebfb0@google.com>
Message-ID: <90cd6860-eb92-db66-9a8-5fa7b494a10@google.com>
References: <54cb04f-3762-987f-8294-91dafd8ebfb0@google.com>
MIME-Version: 1.0
Now that retract_page_tables() can retract page tables reliably, without
depending on trylocks, delete all the apparatus for khugepaged to try
again later: khugepaged_collapse_pte_mapped_thps() etc; and free up the
per-mm memory which was set aside for that in the khugepaged_mm_slot.

But one part of that is worth keeping: when hpage_collapse_scan_file()
found SCAN_PTE_MAPPED_HUGEPAGE, that address was noted in the mm_slot to
be tried for retraction later - catching, for example, page tables where
a reversible mprotect() of a portion had required splitting the pmd, but
now it can be recollapsed.  Call collapse_pte_mapped_thp() directly in
this case (why was it deferred before?  I assume an issue with needing
mmap_lock for write, but now it's only needed for read).
Signed-off-by: Hugh Dickins <hughd@google.com>
---
 mm/khugepaged.c | 125 +++++++-----------------------------------------
 1 file changed, 16 insertions(+), 109 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 060ac8789a1e..06c659e6a89e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -92,8 +92,6 @@ static __read_mostly DEFINE_HASHTABLE(mm_slots_hash, MM_SLOTS_HASH_BITS);
 
 static struct kmem_cache *mm_slot_cache __read_mostly;
 
-#define MAX_PTE_MAPPED_THP 8
-
 struct collapse_control {
 	bool is_khugepaged;
 
@@ -107,15 +105,9 @@ struct collapse_control {
 /**
  * struct khugepaged_mm_slot - khugepaged information per mm that is being scanned
  * @slot: hash lookup from mm to mm_slot
- * @nr_pte_mapped_thp: number of pte mapped THP
- * @pte_mapped_thp: address array corresponding pte mapped THP
  */
 struct khugepaged_mm_slot {
 	struct mm_slot slot;
-
-	/* pte-mapped THP in this mm */
-	int nr_pte_mapped_thp;
-	unsigned long pte_mapped_thp[MAX_PTE_MAPPED_THP];
 };
 
 /**
@@ -1441,50 +1433,6 @@ static void collect_mm_slot(struct khugepaged_mm_slot *mm_slot)
 }
 
 #ifdef CONFIG_SHMEM
-/*
- * Notify khugepaged that given addr of the mm is pte-mapped THP. Then
- * khugepaged should try to collapse the page table.
- *
- * Note that following race exists:
- * (1) khugepaged calls khugepaged_collapse_pte_mapped_thps() for mm_struct A,
- *     emptying the A's ->pte_mapped_thp[] array.
- * (2) MADV_COLLAPSE collapses some file extent with target mm_struct B, and
- *     retract_page_tables() finds a VMA in mm_struct A mapping the same extent
- *     (at virtual address X) and adds an entry (for X) into mm_struct A's
- *     ->pte-mapped_thp[] array.
- * (3) khugepaged calls khugepaged_collapse_scan_file() for mm_struct A at X,
- *     sees a pte-mapped THP (SCAN_PTE_MAPPED_HUGEPAGE) and adds an entry
- *     (for X) into mm_struct A's ->pte-mapped_thp[] array.
- * Thus, it's possible the same address is added multiple times for the same
- * mm_struct.  Should this happen, we'll simply attempt
- * collapse_pte_mapped_thp() multiple times for the same address, under the same
- * exclusive mmap_lock, and assuming the first call is successful, subsequent
- * attempts will return quickly (without grabbing any additional locks) when
- * a huge pmd is found in find_pmd_or_thp_or_none().  Since this is a cheap
- * check, and since this is a rare occurrence, the cost of preventing this
- * "multiple-add" is thought to be more expensive than just handling it, should
- * it occur.
- */
-static bool khugepaged_add_pte_mapped_thp(struct mm_struct *mm,
-					  unsigned long addr)
-{
-	struct khugepaged_mm_slot *mm_slot;
-	struct mm_slot *slot;
-	bool ret = false;
-
-	VM_BUG_ON(addr & ~HPAGE_PMD_MASK);
-
-	spin_lock(&khugepaged_mm_lock);
-	slot = mm_slot_lookup(mm_slots_hash, mm);
-	mm_slot = mm_slot_entry(slot, struct khugepaged_mm_slot, slot);
-	if (likely(mm_slot && mm_slot->nr_pte_mapped_thp < MAX_PTE_MAPPED_THP)) {
-		mm_slot->pte_mapped_thp[mm_slot->nr_pte_mapped_thp++] = addr;
-		ret = true;
-	}
-	spin_unlock(&khugepaged_mm_lock);
-	return ret;
-}
-
 /* hpage must be locked, and mmap_lock must be held */
 static int set_huge_pmd(struct vm_area_struct *vma, unsigned long addr,
 			pmd_t *pmdp, struct page *hpage)
@@ -1706,29 +1654,6 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
 	return result;
 }
 
-static void khugepaged_collapse_pte_mapped_thps(struct khugepaged_mm_slot *mm_slot)
-{
-	struct mm_slot *slot = &mm_slot->slot;
-	struct mm_struct *mm = slot->mm;
-	int i;
-
-	if (likely(mm_slot->nr_pte_mapped_thp == 0))
-		return;
-
-	if (!mmap_write_trylock(mm))
-		return;
-
-	if (unlikely(hpage_collapse_test_exit(mm)))
-		goto out;
-
-	for (i = 0; i < mm_slot->nr_pte_mapped_thp; i++)
-		collapse_pte_mapped_thp(mm, mm_slot->pte_mapped_thp[i], false);
-
-out:
-	mm_slot->nr_pte_mapped_thp = 0;
-	mmap_write_unlock(mm);
-}
-
 static void retract_page_tables(struct address_space *mapping, pgoff_t pgoff)
 {
 	struct vm_area_struct *vma;
@@ -2372,16 +2297,6 @@ static int hpage_collapse_scan_file(struct mm_struct *mm, unsigned long addr,
 {
 	BUILD_BUG();
 }
-
-static void khugepaged_collapse_pte_mapped_thps(struct khugepaged_mm_slot *mm_slot)
-{
-}
-
-static bool khugepaged_add_pte_mapped_thp(struct mm_struct *mm,
-					  unsigned long addr)
-{
-	return false;
-}
 #endif
 
 static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
@@ -2411,7 +2326,6 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 		khugepaged_scan.mm_slot = mm_slot;
 	}
 	spin_unlock(&khugepaged_mm_lock);
-	khugepaged_collapse_pte_mapped_thps(mm_slot);
 
 	mm = slot->mm;
 	/*
@@ -2464,36 +2378,29 @@ static unsigned int khugepaged_scan_mm_slot(unsigned int pages, int *result,
 							  khugepaged_scan.address);
 
 				mmap_read_unlock(mm);
-				*result = hpage_collapse_scan_file(mm,
-								   khugepaged_scan.address,
-								   file, pgoff, cc);
 				mmap_locked = false;
+				*result = hpage_collapse_scan_file(mm,
+					khugepaged_scan.address, file, pgoff, cc);
+				if (*result == SCAN_PTE_MAPPED_HUGEPAGE) {
+					mmap_read_lock(mm);
+					mmap_locked = true;
+					if (hpage_collapse_test_exit(mm)) {
+						fput(file);
+						goto breakouterloop;
+					}
+					*result = collapse_pte_mapped_thp(mm,
+						khugepaged_scan.address, false);
+					if (*result == SCAN_PMD_MAPPED)
+						*result = SCAN_SUCCEED;
+				}
 				fput(file);
 			} else {
 				*result = hpage_collapse_scan_pmd(mm, vma,
-								  khugepaged_scan.address,
-								  &mmap_locked,
-								  cc);
+					khugepaged_scan.address, &mmap_locked, cc);
 			}
-			switch (*result) {
-			case SCAN_PTE_MAPPED_HUGEPAGE: {
-				pmd_t *pmd;
-				*result = find_pmd_or_thp_or_none(mm,
-								  khugepaged_scan.address,
-								  &pmd);
-				if (*result != SCAN_SUCCEED)
-					break;
-				if (!khugepaged_add_pte_mapped_thp(mm,
-								   khugepaged_scan.address))
-					break;
-			}
-			fallthrough;
-			case SCAN_SUCCEED:
+			if (*result == SCAN_SUCCEED)
 				++khugepaged_pages_collapsed;
-				break;
-			default:
-				break;
-			}
 
 			/* move to next address */
 			khugepaged_scan.address += HPAGE_PMD_SIZE;