From patchwork Mon May 22 05:15:06 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hugh Dickins
X-Patchwork-Id: 13249781
Date: Sun, 21 May 2023 22:15:06 -0700 (PDT)
From: Hugh Dickins
To: Andrew Morton
cc: Mike Kravetz, Mike Rapoport, "Kirill A. Shutemov", Matthew Wilcox,
 David Hildenbrand, Suren Baghdasaryan, Qi Zheng, Yang Shi, Mel Gorman,
 Peter Xu, Peter Zijlstra, Will Deacon, Yu Zhao, Alistair Popple,
 Ralph Campbell, Ira Weiny, Steven Price, SeongJae Park, Naoya Horiguchi,
 Christophe Leroy, Zack Rusin, Jason Gunthorpe, Axel Rasmussen,
 Anshuman Khandual, Pasha Tatashin, Miaohe Lin, Minchan Kim,
 Christoph Hellwig, Song Liu, Thomas Hellstrom,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 20/31] mm/madvise: clean up pte_offset_map_lock() scans
In-Reply-To: <68a97fbe-5c1e-7ac6-72c-7b9c6290b370@google.com>
References: <68a97fbe-5c1e-7ac6-72c-7b9c6290b370@google.com>

Came here to make madvise's several pte_offset_map_lock() scans advance
to the next extent on failure, and remove superfluous pmd_trans_unstable()
and pmd_none_or_trans_huge_or_clear_bad() calls.  But also did some nearby
cleanup.

swapin_walk_pmd_entry(): don't name an address "index"; don't drop the
lock after every pte, only when calling out to read_swap_cache_async().

madvise_cold_or_pageout_pte_range() and madvise_free_pte_range(): prefer
"start_pte" for the pointer, since orig_pte usually denotes a saved pte
value; leave lazy MMU mode before unlocking; merge the success and failure
paths after split_folio().

Signed-off-by: Hugh Dickins
---
 mm/madvise.c | 122 ++++++++++++++++++++++++++++-----------------------
 1 file changed, 68 insertions(+), 54 deletions(-)

diff --git a/mm/madvise.c b/mm/madvise.c
index b5ffbaf616f5..0af64c4a8f82 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -188,37 +188,43 @@ static int madvise_update_vma(struct vm_area_struct *vma,
 
 #ifdef CONFIG_SWAP
 static int swapin_walk_pmd_entry(pmd_t *pmd, unsigned long start,
-	unsigned long end, struct mm_walk *walk)
+		unsigned long end, struct mm_walk *walk)
 {
 	struct vm_area_struct *vma = walk->private;
-	unsigned long index;
 	struct swap_iocb *splug = NULL;
+	pte_t *ptep = NULL;
+	spinlock_t *ptl;
+	unsigned long addr;
 
-	if (pmd_none_or_trans_huge_or_clear_bad(pmd))
-		return 0;
-
-	for (index = start; index != end; index += PAGE_SIZE) {
+	for (addr = start; addr < end; addr += PAGE_SIZE) {
 		pte_t pte;
 		swp_entry_t entry;
 		struct page *page;
-		spinlock_t *ptl;
-		pte_t *ptep;
 
-		ptep = pte_offset_map_lock(vma->vm_mm, pmd, index, &ptl);
+		if (!ptep++) {
+			ptep = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+			if (!ptep)
+				break;
+		}
+
 		pte = *ptep;
-		pte_unmap_unlock(ptep, ptl);
-
 		if (!is_swap_pte(pte))
 			continue;
 		entry = pte_to_swp_entry(pte);
 		if (unlikely(non_swap_entry(entry)))
 			continue;
 
+		pte_unmap_unlock(ptep, ptl);
+		ptep = NULL;
+
 		page = read_swap_cache_async(entry, GFP_HIGHUSER_MOVABLE,
-					     vma, index, false, &splug);
+					     vma, addr, false, &splug);
 		if (page)
 			put_page(page);
 	}
+
+	if (ptep)
+		pte_unmap_unlock(ptep, ptl);
 	swap_read_unplug(splug);
 	cond_resched();
 
@@ -340,7 +346,7 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 	bool pageout = private->pageout;
 	struct mm_struct *mm = tlb->mm;
 	struct vm_area_struct *vma = walk->vma;
-	pte_t *orig_pte, *pte, ptent;
+	pte_t *start_pte, *pte, ptent;
 	spinlock_t *ptl;
 	struct folio *folio = NULL;
 	LIST_HEAD(folio_list);
@@ -422,11 +428,11 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 	}
 
 regular_folio:
-	if (pmd_trans_unstable(pmd))
-		return 0;
 #endif
 	tlb_change_page_size(tlb, PAGE_SIZE);
-	orig_pte = pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+	start_pte = pte = pte_offset_map_lock(vma->vm_mm, pmd, addr, &ptl);
+	if (!start_pte)
+		return 0;
 	flush_tlb_batched_pending(mm);
 	arch_enter_lazy_mmu_mode();
 	for (; addr < end; pte++, addr += PAGE_SIZE) {
@@ -447,25 +453,28 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 		 * are sure it's worth. Split it if we are only owner.
 		 */
 		if (folio_test_large(folio)) {
+			int err;
+
 			if (folio_mapcount(folio) != 1)
 				break;
 			if (pageout_anon_only_filter && !folio_test_anon(folio))
 				break;
+			if (!folio_trylock(folio))
+				break;
 			folio_get(folio);
-			if (!folio_trylock(folio)) {
-				folio_put(folio);
-				break;
-			}
-			pte_unmap_unlock(orig_pte, ptl);
-			if (split_folio(folio)) {
-				folio_unlock(folio);
-				folio_put(folio);
-				orig_pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
-				break;
-			}
+			arch_leave_lazy_mmu_mode();
+			pte_unmap_unlock(start_pte, ptl);
+			start_pte = NULL;
+			err = split_folio(folio);
 			folio_unlock(folio);
 			folio_put(folio);
-			orig_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
+			if (err)
+				break;
+			start_pte = pte =
+				pte_offset_map_lock(mm, pmd, addr, &ptl);
+			if (!start_pte)
+				break;
+			arch_enter_lazy_mmu_mode();
 			pte--;
 			addr -= PAGE_SIZE;
 			continue;
@@ -510,8 +519,10 @@ static int madvise_cold_or_pageout_pte_range(pmd_t *pmd,
 			folio_deactivate(folio);
 	}
 
-	arch_leave_lazy_mmu_mode();
-	pte_unmap_unlock(orig_pte, ptl);
+	if (start_pte) {
+		arch_leave_lazy_mmu_mode();
+		pte_unmap_unlock(start_pte, ptl);
+	}
 	if (pageout)
 		reclaim_pages(&folio_list);
 	cond_resched();
@@ -612,7 +623,7 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 	struct mm_struct *mm = tlb->mm;
 	struct vm_area_struct *vma = walk->vma;
 	spinlock_t *ptl;
-	pte_t *orig_pte, *pte, ptent;
+	pte_t *start_pte, *pte, ptent;
 	struct folio *folio;
 	int nr_swap = 0;
 	unsigned long next;
@@ -620,13 +631,12 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 	next = pmd_addr_end(addr, end);
 	if (pmd_trans_huge(*pmd))
 		if (madvise_free_huge_pmd(tlb, vma, pmd, addr, next))
-			goto next;
-
-	if (pmd_trans_unstable(pmd))
-		return 0;
+			return 0;
 
 	tlb_change_page_size(tlb, PAGE_SIZE);
-	orig_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
+	start_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
+	if (!start_pte)
+		return 0;
 	flush_tlb_batched_pending(mm);
 	arch_enter_lazy_mmu_mode();
 	for (; addr != end; pte++, addr += PAGE_SIZE) {
@@ -664,23 +674,26 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 		 * deactivate all pages.
 		 */
 		if (folio_test_large(folio)) {
+			int err;
+
 			if (folio_mapcount(folio) != 1)
-				goto out;
+				break;
+			if (!folio_trylock(folio))
+				break;
 			folio_get(folio);
-			if (!folio_trylock(folio)) {
-				folio_put(folio);
-				goto out;
-			}
-			pte_unmap_unlock(orig_pte, ptl);
-			if (split_folio(folio)) {
-				folio_unlock(folio);
-				folio_put(folio);
-				orig_pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
-				goto out;
-			}
+			arch_leave_lazy_mmu_mode();
+			pte_unmap_unlock(start_pte, ptl);
+			start_pte = NULL;
+			err = split_folio(folio);
 			folio_unlock(folio);
 			folio_put(folio);
-			orig_pte = pte = pte_offset_map_lock(mm, pmd, addr, &ptl);
+			if (err)
+				break;
+			start_pte = pte =
+				pte_offset_map_lock(mm, pmd, addr, &ptl);
+			if (!start_pte)
+				break;
+			arch_enter_lazy_mmu_mode();
 			pte--;
 			addr -= PAGE_SIZE;
 			continue;
@@ -725,17 +738,18 @@ static int madvise_free_pte_range(pmd_t *pmd, unsigned long addr,
 		}
 		folio_mark_lazyfree(folio);
 	}
-out:
+
 	if (nr_swap) {
 		if (current->mm == mm)
 			sync_mm_rss(mm);
-
 		add_mm_counter(mm, MM_SWAPENTS, nr_swap);
 	}
-	arch_leave_lazy_mmu_mode();
-	pte_unmap_unlock(orig_pte, ptl);
+	if (start_pte) {
+		arch_leave_lazy_mmu_mode();
+		pte_unmap_unlock(start_pte, ptl);
+	}
 	cond_resched();
-next:
+
 	return 0;
 }
 
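
Reading aid, not part of the patch: the swapin_walk_pmd_entry() change takes the pte lock lazily, keeps it across consecutive ptes, and drops it only around the call that may sleep. The userspace sketch below models that control flow under stated assumptions. Every name in it (model_table, model_map_lock(), model_unmap_unlock(), model_slow_call(), model_walk()) is hypothetical and exists only for illustration; none is a kernel API. The kernel writes the "advance or take the lock" step tersely as if (!ptep++); the sketch spells it out as an if/else to stay within portable C.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one page table; not a kernel structure. */
struct model_table {
	int entries[8];          /* nonzero models a swap pte needing readahead */
	size_t nr;               /* number of entries to scan */
	bool present;            /* false models pte_offset_map_lock() failing */
	bool locked;             /* models the pte lock being held */
	unsigned int lock_count; /* how many times the lock was taken */
	unsigned int slow_calls; /* how many times we "called out" */
};

/* Models pte_offset_map_lock(): may fail and return NULL. */
static int *model_map_lock(struct model_table *t, size_t idx)
{
	if (!t->present)
		return NULL;
	assert(!t->locked);
	t->locked = true;
	t->lock_count++;
	return &t->entries[idx];
}

/* Models pte_unmap_unlock(). */
static void model_unmap_unlock(struct model_table *t)
{
	assert(t->locked);
	t->locked = false;
}

/* Models read_swap_cache_async(): must not run with the lock held. */
static void model_slow_call(struct model_table *t)
{
	assert(!t->locked);
	t->slow_calls++;
}

/* Scan all entries, holding the lock except around the slow call. */
static void model_walk(struct model_table *t)
{
	int *p = NULL;
	size_t i;

	for (i = 0; i < t->nr; i++) {
		if (p) {
			p++;                 /* lock still held: just advance */
		} else {
			p = model_map_lock(t, i);
			if (!p)
				break;       /* no page table here: give up */
		}

		if (!*p)
			continue;            /* nothing to do, keep the lock */

		model_unmap_unlock(t);       /* drop the lock before calling out */
		p = NULL;                    /* forces a relock on the next pass */
		model_slow_call(t);
	}

	if (p)
		model_unmap_unlock(t);       /* still locked after the last entry */
}
```

In this model, a table of six entries with two nonzero ones takes the lock three times rather than six, and a table marked not present ends the walk without taking the lock at all.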