From patchwork Thu Feb 23 08:31:56 2023
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13149971
From: Yin Fengwei <fengwei.yin@intel.com>
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org
Cc: fengwei.yin@intel.com
Subject: [PATCH 1/5] rmap: move hugetlb try_to_unmap to dedicated function
Date: Thu, 23 Feb 2023 16:31:56 +0800
Message-Id: <20230223083200.3149015-2-fengwei.yin@intel.com>
In-Reply-To: <20230223083200.3149015-1-fengwei.yin@intel.com>
References: <20230223083200.3149015-1-fengwei.yin@intel.com>
This prepares for batching rmap updates for large folios. There is no
need to handle hugetlb inside the page walk loop; handle it up front
and bail out early.

Almost no functional change, apart from one change to the mm counter
update.

Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
---
 mm/rmap.c | 205 +++++++++++++++++++++++++++++++++---------------------
 1 file changed, 126 insertions(+), 79 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 15ae24585fc4..e7aa63b800f7 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1443,6 +1443,108 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 	munlock_vma_folio(folio, vma, compound);
 }
 
+static bool try_to_unmap_one_hugetlb(struct folio *folio,
+		struct vm_area_struct *vma, struct mmu_notifier_range range,
+		struct page_vma_mapped_walk pvmw, unsigned long address,
+		enum ttu_flags flags)
+{
+	struct mm_struct *mm = vma->vm_mm;
+	pte_t pteval;
+	bool ret = true, anon = folio_test_anon(folio);
+
+	/*
+	 * The try_to_unmap() is only passed a hugetlb page
+	 * in the case where the hugetlb page is poisoned.
+	 */
+	VM_BUG_ON_FOLIO(!folio_test_hwpoison(folio), folio);
+	/*
+	 * huge_pmd_unshare may unmap an entire PMD page.
+	 * There is no way of knowing exactly which PMDs may
+	 * be cached for this mm, so we must flush them all.
+	 * start/end were already adjusted above to cover this
+	 * range.
+	 */
+	flush_cache_range(vma, range.start, range.end);
+
+	/*
+	 * To call huge_pmd_unshare, i_mmap_rwsem must be
+	 * held in write mode.  Caller needs to explicitly
+	 * do this outside rmap routines.
+	 *
+	 * We also must hold hugetlb vma_lock in write mode.
+	 * Lock order dictates acquiring vma_lock BEFORE
+	 * i_mmap_rwsem.  We can only try lock here and fail
+	 * if unsuccessful.
+	 */
+	if (!anon) {
+		VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
+		if (!hugetlb_vma_trylock_write(vma)) {
+			ret = false;
+			goto out;
+		}
+		if (huge_pmd_unshare(mm, vma, address, pvmw.pte)) {
+			hugetlb_vma_unlock_write(vma);
+			flush_tlb_range(vma,
+				range.start, range.end);
+			mmu_notifier_invalidate_range(mm,
+				range.start, range.end);
+			/*
+			 * The ref count of the PMD page was
+			 * dropped which is part of the way map
+			 * counting is done for shared PMDs.
+			 * Return 'true' here.  When there is
+			 * no other sharing, huge_pmd_unshare
+			 * returns false and we will unmap the
+			 * actual page and drop map count
+			 * to zero.
+			 */
+			goto out;
+		}
+		hugetlb_vma_unlock_write(vma);
+	}
+	pteval = huge_ptep_clear_flush(vma, address, pvmw.pte);
+
+	/*
+	 * Now the pte is cleared. If this pte was uffd-wp armed,
+	 * we may want to replace a none pte with a marker pte if
+	 * it's file-backed, so we don't lose the tracking info.
+	 */
+	pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval);
+
+	/* Set the dirty flag on the folio now the pte is gone. */
+	if (pte_dirty(pteval))
+		folio_mark_dirty(folio);
+
+	/* Update high watermark before we lower rss */
+	update_hiwater_rss(mm);
+
+	if (folio_test_hwpoison(folio) && !(flags & TTU_IGNORE_HWPOISON)) {
+		pteval = swp_entry_to_pte(make_hwpoison_entry(&folio->page));
+		set_huge_pte_at(mm, address, pvmw.pte, pteval);
+	}
+
+	/*
+	 * try_to_unmap_one() called dec_mm_counter for the case where
+	 * (folio_test_hwpoison(folio) && !(flags & TTU_IGNORE_HWPOISON))
+	 * is not true, which looks incorrect. Change it to
+	 * hugetlb_count_sub() here.
+	 */
+	hugetlb_count_sub(folio_nr_pages(folio), mm);
+
+	/*
+	 * No need to call mmu_notifier_invalidate_range() it has be
+	 * done above for all cases requiring it to happen under page
+	 * table lock before mmu_notifier_invalidate_range_end()
+	 *
+	 * See Documentation/mm/mmu_notifier.rst
+	 */
+	page_remove_rmap(&folio->page, vma, folio_test_hugetlb(folio));
+	if (vma->vm_flags & VM_LOCKED)
+		mlock_drain_local();
+	folio_put(folio);
+
+out:
+	return ret;
+}
+
 /*
  * @arg: enum ttu_flags will be passed to this argument
  */
@@ -1506,86 +1608,37 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			break;
 		}
 
+		address = pvmw.address;
+		if (folio_test_hugetlb(folio)) {
+			ret = try_to_unmap_one_hugetlb(folio, vma, range,
+							pvmw, address, flags);
+
+			/* no need to loop for hugetlb */
+			page_vma_mapped_walk_done(&pvmw);
+			break;
+		}
+
 		subpage = folio_page(folio,
 					pte_pfn(*pvmw.pte) - folio_pfn(folio));
-		address = pvmw.address;
 		anon_exclusive = folio_test_anon(folio) &&
 				 PageAnonExclusive(subpage);
 
-		if (folio_test_hugetlb(folio)) {
-			bool anon = folio_test_anon(folio);
-
+		flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
+		/* Nuke the page table entry. */
+		if (should_defer_flush(mm, flags)) {
 			/*
-			 * The try_to_unmap() is only passed a hugetlb page
-			 * in the case where the hugetlb page is poisoned.
+			 * We clear the PTE but do not flush so potentially
+			 * a remote CPU could still be writing to the folio.
+			 * If the entry was previously clean then the
+			 * architecture must guarantee that a clear->dirty
+			 * transition on a cached TLB entry is written through
+			 * and traps if the PTE is unmapped.
 			 */
-			VM_BUG_ON_PAGE(!PageHWPoison(subpage), subpage);
-			/*
-			 * huge_pmd_unshare may unmap an entire PMD page.
-			 * There is no way of knowing exactly which PMDs may
-			 * be cached for this mm, so we must flush them all.
-			 * start/end were already adjusted above to cover this
-			 * range.
-			 */
-			flush_cache_range(vma, range.start, range.end);
+			pteval = ptep_get_and_clear(mm, address, pvmw.pte);
 
-			/*
-			 * To call huge_pmd_unshare, i_mmap_rwsem must be
-			 * held in write mode.  Caller needs to explicitly
-			 * do this outside rmap routines.
-			 *
-			 * We also must hold hugetlb vma_lock in write mode.
-			 * Lock order dictates acquiring vma_lock BEFORE
-			 * i_mmap_rwsem.  We can only try lock here and fail
-			 * if unsuccessful.
-			 */
-			if (!anon) {
-				VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
-				if (!hugetlb_vma_trylock_write(vma)) {
-					page_vma_mapped_walk_done(&pvmw);
-					ret = false;
-					break;
-				}
-				if (huge_pmd_unshare(mm, vma, address, pvmw.pte)) {
-					hugetlb_vma_unlock_write(vma);
-					flush_tlb_range(vma,
-						range.start, range.end);
-					mmu_notifier_invalidate_range(mm,
-						range.start, range.end);
-					/*
-					 * The ref count of the PMD page was
-					 * dropped which is part of the way map
-					 * counting is done for shared PMDs.
-					 * Return 'true' here.  When there is
-					 * no other sharing, huge_pmd_unshare
-					 * returns false and we will unmap the
-					 * actual page and drop map count
-					 * to zero.
-					 */
-					page_vma_mapped_walk_done(&pvmw);
-					break;
-				}
-				hugetlb_vma_unlock_write(vma);
-			}
-			pteval = huge_ptep_clear_flush(vma, address, pvmw.pte);
+			set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
 		} else {
-			flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
-			/* Nuke the page table entry. */
-			if (should_defer_flush(mm, flags)) {
-				/*
-				 * We clear the PTE but do not flush so potentially
-				 * a remote CPU could still be writing to the folio.
-				 * If the entry was previously clean then the
-				 * architecture must guarantee that a clear->dirty
-				 * transition on a cached TLB entry is written through
-				 * and traps if the PTE is unmapped.
-				 */
-				pteval = ptep_get_and_clear(mm, address, pvmw.pte);
-
-				set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
-			} else {
-				pteval = ptep_clear_flush(vma, address, pvmw.pte);
-			}
+			pteval = ptep_clear_flush(vma, address, pvmw.pte);
 		}
 
 		/*
@@ -1604,14 +1657,8 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 
 		if (PageHWPoison(subpage) && !(flags & TTU_IGNORE_HWPOISON)) {
 			pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
-			if (folio_test_hugetlb(folio)) {
-				hugetlb_count_sub(folio_nr_pages(folio), mm);
-				set_huge_pte_at(mm, address, pvmw.pte, pteval);
-			} else {
-				dec_mm_counter(mm, mm_counter(&folio->page));
-				set_pte_at(mm, address, pvmw.pte, pteval);
-			}
-
+			dec_mm_counter(mm, mm_counter(&folio->page));
+			set_pte_at(mm, address, pvmw.pte, pteval);
 		} else if (pte_unused(pteval) && !userfaultfd_armed(vma)) {
 			/*
 			 * The guest indicated that the page content is of no

From patchwork Thu Feb 23 08:31:57 2023
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13149972
From: Yin Fengwei <fengwei.yin@intel.com>
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org
Cc: fengwei.yin@intel.com
Subject: [PATCH 2/5] rmap: move page unmap operation to dedicated function
Date: Thu, 23 Feb 2023 16:31:57 +0800
Message-Id: <20230223083200.3149015-3-fengwei.yin@intel.com>
In-Reply-To: <20230223083200.3149015-1-fengwei.yin@intel.com>
References: <20230223083200.3149015-1-fengwei.yin@intel.com>

No functional change. Just code reorganized.
Signed-off-by: Yin Fengwei <fengwei.yin@intel.com>
---
 mm/rmap.c | 369 ++++++++++++++++++++++++++++--------------------------
 1 file changed, 194 insertions(+), 175 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index e7aa63b800f7..879e90bbf6aa 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1545,17 +1545,204 @@ static bool try_to_unmap_one_hugetlb(struct folio *folio,
 	return ret;
 }
 
+static bool try_to_unmap_one_page(struct folio *folio,
+		struct vm_area_struct *vma, struct mmu_notifier_range range,
+		struct page_vma_mapped_walk pvmw, unsigned long address,
+		enum ttu_flags flags)
+{
+	bool anon_exclusive, ret = true;
+	struct page *subpage;
+	struct mm_struct *mm = vma->vm_mm;
+	pte_t pteval;
+
+	subpage = folio_page(folio,
+			pte_pfn(*pvmw.pte) - folio_pfn(folio));
+	anon_exclusive = folio_test_anon(folio) &&
+			 PageAnonExclusive(subpage);
+
+	flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
+	/* Nuke the page table entry. */
+	if (should_defer_flush(mm, flags)) {
+		/*
+		 * We clear the PTE but do not flush so potentially
+		 * a remote CPU could still be writing to the folio.
+		 * If the entry was previously clean then the
+		 * architecture must guarantee that a clear->dirty
+		 * transition on a cached TLB entry is written through
+		 * and traps if the PTE is unmapped.
+		 */
+		pteval = ptep_get_and_clear(mm, address, pvmw.pte);
+
+		set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
+	} else {
+		pteval = ptep_clear_flush(vma, address, pvmw.pte);
+	}
+
+	/*
+	 * Now the pte is cleared. If this pte was uffd-wp armed,
+	 * we may want to replace a none pte with a marker pte if
+	 * it's file-backed, so we don't lose the tracking info.
+	 */
+	pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval);
+
+	/* Set the dirty flag on the folio now the pte is gone. */
+	if (pte_dirty(pteval))
+		folio_mark_dirty(folio);
+
+	/* Update high watermark before we lower rss */
+	update_hiwater_rss(mm);
+
+	if (PageHWPoison(subpage) && !(flags & TTU_IGNORE_HWPOISON)) {
+		pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
+		dec_mm_counter(mm, mm_counter(&folio->page));
+		set_pte_at(mm, address, pvmw.pte, pteval);
+	} else if (pte_unused(pteval) && !userfaultfd_armed(vma)) {
+		/*
+		 * The guest indicated that the page content is of no
+		 * interest anymore. Simply discard the pte, vmscan
+		 * will take care of the rest.
+		 * A future reference will then fault in a new zero
+		 * page. When userfaultfd is active, we must not drop
+		 * this page though, as its main user (postcopy
+		 * migration) will not expect userfaults on already
+		 * copied pages.
+		 */
+		dec_mm_counter(mm, mm_counter(&folio->page));
+		/* We have to invalidate as we cleared the pte */
+		mmu_notifier_invalidate_range(mm, address,
+				address + PAGE_SIZE);
+	} else if (folio_test_anon(folio)) {
+		swp_entry_t entry = { .val = page_private(subpage) };
+		pte_t swp_pte;
+		/*
+		 * Store the swap location in the pte.
+		 * See handle_pte_fault() ...
+		 */
+		if (unlikely(folio_test_swapbacked(folio) !=
+				folio_test_swapcache(folio))) {
+			WARN_ON_ONCE(1);
+			ret = false;
+			/* We have to invalidate as we cleared the pte */
+			mmu_notifier_invalidate_range(mm, address,
+					address + PAGE_SIZE);
+			page_vma_mapped_walk_done(&pvmw);
+			goto discard;
+		}
+
+		/* MADV_FREE page check */
+		if (!folio_test_swapbacked(folio)) {
+			int ref_count, map_count;
+
+			/*
+			 * Synchronize with gup_pte_range():
+			 * - clear PTE; barrier; read refcount
+			 * - inc refcount; barrier; read PTE
+			 */
+			smp_mb();
+
+			ref_count = folio_ref_count(folio);
+			map_count = folio_mapcount(folio);
+
+			/*
+			 * Order reads for page refcount and dirty flag
+			 * (see comments in __remove_mapping()).
+			 */
+			smp_rmb();
+
+			/*
+			 * The only page refs must be one from isolation
+			 * plus the rmap(s) (dropped by discard:).
+			 */
+			if (ref_count == 1 + map_count &&
+			    !folio_test_dirty(folio)) {
+				/* Invalidate as we cleared the pte */
+				mmu_notifier_invalidate_range(mm,
+					address, address + PAGE_SIZE);
+				dec_mm_counter(mm, MM_ANONPAGES);
+				goto discard;
+			}
+
+			/*
+			 * If the folio was redirtied, it cannot be
+			 * discarded. Remap the page to page table.
+			 */
+			set_pte_at(mm, address, pvmw.pte, pteval);
+			folio_set_swapbacked(folio);
+			ret = false;
+			page_vma_mapped_walk_done(&pvmw);
+			goto discard;
+		}
+
+		if (swap_duplicate(entry) < 0) {
+			set_pte_at(mm, address, pvmw.pte, pteval);
+			ret = false;
+			page_vma_mapped_walk_done(&pvmw);
+			goto discard;
+		}
+		if (arch_unmap_one(mm, vma, address, pteval) < 0) {
+			swap_free(entry);
+			set_pte_at(mm, address, pvmw.pte, pteval);
+			ret = false;
+			page_vma_mapped_walk_done(&pvmw);
+			goto discard;
+		}
+
+		/* See page_try_share_anon_rmap(): clear PTE first. */
+		if (anon_exclusive &&
+		    page_try_share_anon_rmap(subpage)) {
+			swap_free(entry);
+			set_pte_at(mm, address, pvmw.pte, pteval);
+			ret = false;
+			page_vma_mapped_walk_done(&pvmw);
+			goto discard;
+		}
+		if (list_empty(&mm->mmlist)) {
+			spin_lock(&mmlist_lock);
+			if (list_empty(&mm->mmlist))
+				list_add(&mm->mmlist, &init_mm.mmlist);
+			spin_unlock(&mmlist_lock);
+		}
+		dec_mm_counter(mm, MM_ANONPAGES);
+		inc_mm_counter(mm, MM_SWAPENTS);
+		swp_pte = swp_entry_to_pte(entry);
+		if (anon_exclusive)
+			swp_pte = pte_swp_mkexclusive(swp_pte);
+		if (pte_soft_dirty(pteval))
+			swp_pte = pte_swp_mksoft_dirty(swp_pte);
+		if (pte_uffd_wp(pteval))
+			swp_pte = pte_swp_mkuffd_wp(swp_pte);
+		set_pte_at(mm, address, pvmw.pte, swp_pte);
+		/* Invalidate as we cleared the pte */
+		mmu_notifier_invalidate_range(mm, address,
+				address + PAGE_SIZE);
+	} else {
+		/*
+		 * This is a locked file-backed folio,
+		 * so it cannot be removed from the page
+		 * cache and replaced by a new folio before
+		 * mmu_notifier_invalidate_range_end, so no
+		 * concurrent thread might update its page table
+		 * to point at a new folio while a device is
+		 * still using this folio.
+		 *
+		 * See Documentation/mm/mmu_notifier.rst
+		 */
+		dec_mm_counter(mm, mm_counter_file(&folio->page));
+	}
+
+discard:
+	return ret;
+}
+
 /*
  * @arg: enum ttu_flags will be passed to this argument
  */
 static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		     unsigned long address, void *arg)
 {
-	struct mm_struct *mm = vma->vm_mm;
 	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0);
-	pte_t pteval;
 	struct page *subpage;
-	bool anon_exclusive, ret = true;
+	bool ret = true;
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
 
@@ -1620,179 +1807,11 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 
 		subpage = folio_page(folio,
 					pte_pfn(*pvmw.pte) - folio_pfn(folio));
-		anon_exclusive = folio_test_anon(folio) &&
-				 PageAnonExclusive(subpage);
-
-		flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
-		/* Nuke the page table entry. */
-		if (should_defer_flush(mm, flags)) {
-			/*
-			 * We clear the PTE but do not flush so potentially
-			 * a remote CPU could still be writing to the folio.
-			 * If the entry was previously clean then the
-			 * architecture must guarantee that a clear->dirty
-			 * transition on a cached TLB entry is written through
-			 * and traps if the PTE is unmapped.
-			 */
-			pteval = ptep_get_and_clear(mm, address, pvmw.pte);
-
-			set_tlb_ubc_flush_pending(mm, pte_dirty(pteval));
-		} else {
-			pteval = ptep_clear_flush(vma, address, pvmw.pte);
-		}
-
-		/*
-		 * Now the pte is cleared. If this pte was uffd-wp armed,
-		 * we may want to replace a none pte with a marker pte if
-		 * it's file-backed, so we don't lose the tracking info.
-		 */
-		pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval);
-
-		/* Set the dirty flag on the folio now the pte is gone. */
-		if (pte_dirty(pteval))
-			folio_mark_dirty(folio);
-
-		/* Update high watermark before we lower rss */
-		update_hiwater_rss(mm);
-
-		if (PageHWPoison(subpage) && !(flags & TTU_IGNORE_HWPOISON)) {
-			pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
-			dec_mm_counter(mm, mm_counter(&folio->page));
-			set_pte_at(mm, address, pvmw.pte, pteval);
-		} else if (pte_unused(pteval) && !userfaultfd_armed(vma)) {
-			/*
-			 * The guest indicated that the page content is of no
-			 * interest anymore. Simply discard the pte, vmscan
-			 * will take care of the rest.
-			 * A future reference will then fault in a new zero
-			 * page. When userfaultfd is active, we must not drop
-			 * this page though, as its main user (postcopy
-			 * migration) will not expect userfaults on already
-			 * copied pages.
-			 */
-			dec_mm_counter(mm, mm_counter(&folio->page));
-			/* We have to invalidate as we cleared the pte */
-			mmu_notifier_invalidate_range(mm, address,
-					address + PAGE_SIZE);
-		} else if (folio_test_anon(folio)) {
-			swp_entry_t entry = { .val = page_private(subpage) };
-			pte_t swp_pte;
-			/*
-			 * Store the swap location in the pte.
-			 * See handle_pte_fault() ...
-			 */
-			if (unlikely(folio_test_swapbacked(folio) !=
-					folio_test_swapcache(folio))) {
-				WARN_ON_ONCE(1);
-				ret = false;
-				/* We have to invalidate as we cleared the pte */
-				mmu_notifier_invalidate_range(mm, address,
-						address + PAGE_SIZE);
-				page_vma_mapped_walk_done(&pvmw);
-				break;
-			}
-
-			/* MADV_FREE page check */
-			if (!folio_test_swapbacked(folio)) {
-				int ref_count, map_count;
-
-				/*
-				 * Synchronize with gup_pte_range():
-				 * - clear PTE; barrier; read refcount
-				 * - inc refcount; barrier; read PTE
-				 */
-				smp_mb();
-
-				ref_count = folio_ref_count(folio);
-				map_count = folio_mapcount(folio);
-
-				/*
-				 * Order reads for page refcount and dirty flag
-				 * (see comments in __remove_mapping()).
-				 */
-				smp_rmb();
-
-				/*
-				 * The only page refs must be one from isolation
-				 * plus the rmap(s) (dropped by discard:).
-				 */
-				if (ref_count == 1 + map_count &&
-				    !folio_test_dirty(folio)) {
-					/* Invalidate as we cleared the pte */
-					mmu_notifier_invalidate_range(mm,
-						address, address + PAGE_SIZE);
-					dec_mm_counter(mm, MM_ANONPAGES);
-					goto discard;
-				}
-
-				/*
-				 * If the folio was redirtied, it cannot be
-				 * discarded. Remap the page to page table.
-				 */
-				set_pte_at(mm, address, pvmw.pte, pteval);
-				folio_set_swapbacked(folio);
-				ret = false;
-				page_vma_mapped_walk_done(&pvmw);
-				break;
-			}
-
-			if (swap_duplicate(entry) < 0) {
-				set_pte_at(mm, address, pvmw.pte, pteval);
-				ret = false;
-				page_vma_mapped_walk_done(&pvmw);
-				break;
-			}
-			if (arch_unmap_one(mm, vma, address, pteval) < 0) {
-				swap_free(entry);
-				set_pte_at(mm, address, pvmw.pte, pteval);
-				ret = false;
-				page_vma_mapped_walk_done(&pvmw);
-				break;
-			}
+		ret = try_to_unmap_one_page(folio, vma,
+						range, pvmw, address, flags);
+		if (!ret)
+			break;
 
-			/* See page_try_share_anon_rmap(): clear PTE first. */
-			if (anon_exclusive &&
-			    page_try_share_anon_rmap(subpage)) {
-				swap_free(entry);
-				set_pte_at(mm, address, pvmw.pte, pteval);
-				ret = false;
-				page_vma_mapped_walk_done(&pvmw);
-				break;
-			}
-			if (list_empty(&mm->mmlist)) {
-				spin_lock(&mmlist_lock);
-				if (list_empty(&mm->mmlist))
-					list_add(&mm->mmlist, &init_mm.mmlist);
-				spin_unlock(&mmlist_lock);
-			}
-			dec_mm_counter(mm, MM_ANONPAGES);
-			inc_mm_counter(mm, MM_SWAPENTS);
-			swp_pte = swp_entry_to_pte(entry);
-			if (anon_exclusive)
-				swp_pte = pte_swp_mkexclusive(swp_pte);
-			if (pte_soft_dirty(pteval))
-				swp_pte = pte_swp_mksoft_dirty(swp_pte);
-			if (pte_uffd_wp(pteval))
-				swp_pte = pte_swp_mkuffd_wp(swp_pte);
-			set_pte_at(mm, address, pvmw.pte, swp_pte);
-			/* Invalidate as we cleared the pte */
-			mmu_notifier_invalidate_range(mm, address,
-					address + PAGE_SIZE);
-		} else {
-			/*
-			 * This is a locked file-backed folio,
-			 * so it cannot be removed from the page
-			 * cache and replaced by a new folio before
-			 * mmu_notifier_invalidate_range_end, so no
-			 * concurrent thread might update its page table
-			 * to point at a new folio while a device is
-			 * still using this folio.
-			 *
-			 * See Documentation/mm/mmu_notifier.rst
-			 */
-			dec_mm_counter(mm, mm_counter_file(&folio->page));
-		}
-discard:
 		/*
 		 * No need to call mmu_notifier_invalidate_range() it has be
 		 * done above for all cases requiring it to happen under page

From patchwork Thu Feb 23 08:31:58 2023
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13149973
sender) smtp.mailfrom=fengwei.yin@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1677141033; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=scTs1tSNlZ/Q2C/i6LH4vDzWHd+TD1Qjw97tbUPFr8Q=; b=NUxAKimqxoRt4bA1tTniiPFdwuA3v5P8coS1+QDGTv/HkqhmRCwoTbuXWcS/j+p4UWTkTj J8IAl6o5XDYbdnSvGwCKgy7WRTwU9Xe/tyFroww/JhgkMWsjjAAqJI7LRH7apVkyMvszZT U69UfhqGowVL3JAo1YpjPzLWe9uHd6w= ARC-Authentication-Results: i=1; imf26.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=O1+B8x3R; spf=pass (imf26.hostedemail.com: domain of fengwei.yin@intel.com designates 134.134.136.20 as permitted sender) smtp.mailfrom=fengwei.yin@intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1677141033; a=rsa-sha256; cv=none; b=dJ8ufiYelAOcmMzl0ko6+IA0zu3kgLlCsLNpTCbWTrtj/pQmZ5VJ37VFZzriel5jiVxxiT mTnD2mhmN0sO1NP0c2p0cD1W/oiO9K431xDTGxKZr+vyBZNS0sYdV4mhKjgdicunptrnsb DigyvWEUuYr8OwHtuStCxoON1TIahak= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1677141033; x=1708677033; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=pn1rQHlQmuF6JxGWRGKiya8QX8Fx+EyQOThE5TbPCIc=; b=O1+B8x3RHjLwSz8/w7GBx4uQwohJ42p4/E4fae0O927sPi0rSOzr6Q8m l9965xTqx6CxcvD4eBoHn3FRW6nz6y7fzuoyeStjPK2Xly6ixUFRlSxmn +zljUPgPdB2sH+jSg9kzCtUBN6IspAFSMkRGkIC/7B1rRKCVLCOd43nA0 aOuIaeJb2YTX3sTXgsVcWIugndvTiAZQmb2fZqld3OuzYbosFS9CciE3D L1ZfOcNp133VzsfQNNH2RmFMY+mB7yL+nHLSIrz51en89k3rL0bVveTid Kj4iMqVoYpNOVGmEKCFAQmsl0cBNub5w4piRNwtz4CDNZXS18PeUyHczU w==; X-IronPort-AV: E=McAfee;i="6500,9779,10629"; a="321298492" X-IronPort-AV: E=Sophos;i="5.97,320,1669104000"; 
d="scan'208";a="321298492" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Feb 2023 00:30:31 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6500,9779,10629"; a="674437903" X-IronPort-AV: E=Sophos;i="5.97,320,1669104000"; d="scan'208";a="674437903" Received: from fyin-dev.sh.intel.com ([10.239.159.32]) by fmsmga007.fm.intel.com with ESMTP; 23 Feb 2023 00:30:30 -0800 From: Yin Fengwei To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org Cc: fengwei.yin@intel.com Subject: [PATCH 3/5] rmap: cleanup exit path of try_to_unmap_one_page() Date: Thu, 23 Feb 2023 16:31:58 +0800 Message-Id: <20230223083200.3149015-4-fengwei.yin@intel.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20230223083200.3149015-1-fengwei.yin@intel.com> References: <20230223083200.3149015-1-fengwei.yin@intel.com> MIME-Version: 1.0 X-Stat-Signature: ya68a8fuwwqjqyfyti4nxeqbaq7o8mpx X-Rspam-User: X-Rspamd-Queue-Id: F2139140011 X-Rspamd-Server: rspam06 X-HE-Tag: 1677141032-12111 X-HE-Meta: 
U2FsdGVkX1/sYa1XWmNiQjcrYK2dbpxSVgJaj7FQIdLUZGF5e0aCSN+n6JcRLiRFK6hAvVSOElVN9aGg4LcgkcfaF+kTY6TYgO9ZQeLek+7upCEFCMgsnuLO6w83puPD3XU51Xh50od8AqYUIZZLBU85Si4pDoXPxeoN53gDfXqER8msOSlfheIG4NQadI3rDlx5JLqGRmVkPcXY5HGsOEkT6dyHzVCWi5p8UNxsFti3DsIEewwtcXlno77NBhM3UbL7U8BD15MUL4berQanl2eebGUkKzQgaP/lZBqprhGoYdecnQsULLO3PkKz3gSFDRHFkfY4dlUTRpJXRPOrc/ENU0WL3HB8V3ID9GY2IrUiJ9TG/d+JqUFsxq0nDIzta3GAuK0t5zRqqMmrPMqHRP9g87raxXEMNvVOJAENYuY/eAHDGj4TmVa07m1TRgwcBKNmMwcDrqvXvN/uAWq9msaVV53QOawbGZEFfI2Jog1Omr6BLVRa+QexfstL7OHbO8COjpxH/T1EOOsuGqKIXTyodzUPYg/vaQ2x8teN2tmWedkXPaY0Zi2TQtxeQu9CldIiK+XwspRguMhderIjkLM4vsp0Dzd+tNi6OZeiogLdJwMNUMczKhB0kr6s7KBan3lGerqDG0fzvHdQsS1Kiw5f0jQZrZUrjcJmitNJ+zWj94khJ5vy8VJQ/HuhxgcD/vpYu3RZFF25KSF2MQFCsWz7TjccmawIP0F4+CXeh+IEioxbt/7oa1TCJJjK89Hn6YXD0D7ooJtny2UnwoxPJGfDYNUKIOZ7FM+gFoxaa3zdZYiTpH1x7loxSShFH/9hBaSLSYYk68CON3Lwe+90hzN6oz3lez01si0HmM01Cty3AZDeVGsJ18Pjd1XIcWWPTmICYLkNMYypT5xOjgYMWp9oD61pzvIsFBq98iogaX07jzKHMkl2QJXiuOfov4uGCDqOnUooJBvh7o28cXS 4cxe6W9X 29lgIi5NAcMS7ACph0FrVcalHV3YdWBrCLlUBgYH5TS5bUWQPiPOjaJh52DILUv2QXu0rgGHAgrmbSl5ZAO+iGPTjdHlPy5ATrZgNg4ePujGARefOXDdOvgcRFY5pqToXc7RBD6owFwCO/MhK+EegkAUtcJGN322Z5My1LWAk71XS8mS/EIlybBJj0CKDa0rmkGBsVe8tcu2CQhipB3nujCwJUQ== X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Cleanup exit path of try_to_unmap_one_page() by removing some duplicated code. Move page_vma_mapped_walk_done() back to try_to_unmap_one(). Change subpage to page as folio has no concept of subpage. 
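The cleanup follows the common kernel single-exit idiom: every failure path that must re-install the pte jumps to one shared label instead of repeating the restore sequence. A minimal userspace sketch of the pattern (the `try_unmap()` function, its `struct state`, and the `swap_ok`/`arch_ok` flags are invented stand-ins, not the kernel code):

```c
#include <stdbool.h>

/* Toy state standing in for the pte and mm-counter bookkeeping. */
struct state {
	int pte;	/* current pte value			*/
	int saved;	/* value to restore on failure		*/
	int counter;	/* stand-in for dec_mm_counter() state	*/
};

/*
 * Sketch of the reworked exit path: failures that must re-install the
 * pte jump to exit_restore_pte; success falls through to one shared
 * counter update instead of each branch doing its own.
 */
static bool try_unmap(struct state *s, bool swap_ok, bool arch_ok)
{
	s->saved = s->pte;
	s->pte = 0;			/* "nuke" the page table entry */

	if (!swap_ok)			/* e.g. swap_duplicate() < 0 */
		goto exit_restore_pte;
	if (!arch_ok)			/* e.g. arch_unmap_one() < 0 */
		goto exit_restore_pte;

	s->counter--;			/* single dec_mm_counter() for all cases */
	return true;

exit_restore_pte:
	s->pte = s->saved;		/* set_pte_at(..., pteval) */
	return false;
}
```

The payoff is that adding another failure case costs one `goto` rather than another copy of the restore-and-return block.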
Signed-off-by: Yin Fengwei
---
 mm/rmap.c | 74 ++++++++++++++++++++++---------------------------------
 1 file changed, 30 insertions(+), 44 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index 879e90bbf6aa..097774c809a0 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1536,7 +1536,7 @@ static bool try_to_unmap_one_hugetlb(struct folio *folio,
 	 *
 	 * See Documentation/mm/mmu_notifier.rst
 	 */
-	page_remove_rmap(&folio->page, vma, folio_test_hugetlb(folio));
+	page_remove_rmap(&folio->page, vma, true);
 	if (vma->vm_flags & VM_LOCKED)
 		mlock_drain_local();
 	folio_put(folio);
@@ -1550,15 +1550,13 @@ static bool try_to_unmap_one_page(struct folio *folio,
 		struct page_vma_mapped_walk pvmw, unsigned long address,
 		enum ttu_flags flags)
 {
-	bool anon_exclusive, ret = true;
-	struct page *subpage;
+	bool anon_exclusive;
+	struct page *page;
 	struct mm_struct *mm = vma->vm_mm;
 	pte_t pteval;
 
-	subpage = folio_page(folio,
-			pte_pfn(*pvmw.pte) - folio_pfn(folio));
-	anon_exclusive = folio_test_anon(folio) &&
-		PageAnonExclusive(subpage);
+	page = folio_page(folio, pte_pfn(*pvmw.pte) - folio_pfn(folio));
+	anon_exclusive = folio_test_anon(folio) && PageAnonExclusive(page);
 
 	flush_cache_page(vma, address, pte_pfn(*pvmw.pte));
 	/* Nuke the page table entry. */
@@ -1586,15 +1584,14 @@ static bool try_to_unmap_one_page(struct folio *folio,
 	pte_install_uffd_wp_if_needed(vma, address, pvmw.pte, pteval);
 
 	/* Set the dirty flag on the folio now the pte is gone. */
-	if (pte_dirty(pteval))
+	if (pte_dirty(pteval) && !folio_test_dirty(folio))
 		folio_mark_dirty(folio);
 
 	/* Update high watermark before we lower rss */
 	update_hiwater_rss(mm);
 
-	if (PageHWPoison(subpage) && !(flags & TTU_IGNORE_HWPOISON)) {
-		pteval = swp_entry_to_pte(make_hwpoison_entry(subpage));
-		dec_mm_counter(mm, mm_counter(&folio->page));
+	if (PageHWPoison(page) && !(flags & TTU_IGNORE_HWPOISON)) {
+		pteval = swp_entry_to_pte(make_hwpoison_entry(page));
 		set_pte_at(mm, address, pvmw.pte, pteval);
 	} else if (pte_unused(pteval) && !userfaultfd_armed(vma)) {
 		/*
@@ -1607,12 +1604,11 @@ static bool try_to_unmap_one_page(struct folio *folio,
 		 * migration) will not expect userfaults on already
 		 * copied pages.
 		 */
-		dec_mm_counter(mm, mm_counter(&folio->page));
 		/* We have to invalidate as we cleared the pte */
 		mmu_notifier_invalidate_range(mm, address,
 				address + PAGE_SIZE);
 	} else if (folio_test_anon(folio)) {
-		swp_entry_t entry = { .val = page_private(subpage) };
+		swp_entry_t entry = { .val = page_private(page) };
 		pte_t swp_pte;
 		/*
 		 * Store the swap location in the pte.
@@ -1621,12 +1617,10 @@ static bool try_to_unmap_one_page(struct folio *folio,
 		if (unlikely(folio_test_swapbacked(folio) !=
 					folio_test_swapcache(folio))) {
 			WARN_ON_ONCE(1);
-			ret = false;
 			/* We have to invalidate as we cleared the pte */
 			mmu_notifier_invalidate_range(mm, address,
 					address + PAGE_SIZE);
-			page_vma_mapped_walk_done(&pvmw);
-			goto discard;
+			goto exit;
 		}
 
 		/* MADV_FREE page check */
@@ -1658,7 +1652,6 @@ static bool try_to_unmap_one_page(struct folio *folio,
 				/* Invalidate as we cleared the pte */
 				mmu_notifier_invalidate_range(mm, address,
 						address + PAGE_SIZE);
-				dec_mm_counter(mm, MM_ANONPAGES);
 				goto discard;
 			}
 
@@ -1666,43 +1659,30 @@ static bool try_to_unmap_one_page(struct folio *folio,
 			 * If the folio was redirtied, it cannot be
 			 * discarded. Remap the page to page table.
 			 */
-			set_pte_at(mm, address, pvmw.pte, pteval);
 			folio_set_swapbacked(folio);
-			ret = false;
-			page_vma_mapped_walk_done(&pvmw);
-			goto discard;
+			goto exit_restore_pte;
 		}
 
-		if (swap_duplicate(entry) < 0) {
-			set_pte_at(mm, address, pvmw.pte, pteval);
-			ret = false;
-			page_vma_mapped_walk_done(&pvmw);
-			goto discard;
-		}
+		if (swap_duplicate(entry) < 0)
+			goto exit_restore_pte;
+
 		if (arch_unmap_one(mm, vma, address, pteval) < 0) {
 			swap_free(entry);
-			set_pte_at(mm, address, pvmw.pte, pteval);
-			ret = false;
-			page_vma_mapped_walk_done(&pvmw);
-			goto discard;
+			goto exit_restore_pte;
 		}
 
 		/* See page_try_share_anon_rmap(): clear PTE first. */
-		if (anon_exclusive &&
-				page_try_share_anon_rmap(subpage)) {
+		if (anon_exclusive && page_try_share_anon_rmap(page)) {
 			swap_free(entry);
-			set_pte_at(mm, address, pvmw.pte, pteval);
-			ret = false;
-			page_vma_mapped_walk_done(&pvmw);
-			goto discard;
+			goto exit_restore_pte;
 		}
+
 		if (list_empty(&mm->mmlist)) {
 			spin_lock(&mmlist_lock);
 			if (list_empty(&mm->mmlist))
 				list_add(&mm->mmlist, &init_mm.mmlist);
 			spin_unlock(&mmlist_lock);
 		}
-		dec_mm_counter(mm, MM_ANONPAGES);
 		inc_mm_counter(mm, MM_SWAPENTS);
 		swp_pte = swp_entry_to_pte(entry);
 		if (anon_exclusive)
@@ -1713,8 +1693,7 @@ static bool try_to_unmap_one_page(struct folio *folio,
 			swp_pte = pte_swp_mkuffd_wp(swp_pte);
 		set_pte_at(mm, address, pvmw.pte, swp_pte);
 		/* Invalidate as we cleared the pte */
-		mmu_notifier_invalidate_range(mm, address,
-				address + PAGE_SIZE);
+		mmu_notifier_invalidate_range(mm, address, address + PAGE_SIZE);
 	} else {
 		/*
 		 * This is a locked file-backed folio,
@@ -1727,11 +1706,16 @@ static bool try_to_unmap_one_page(struct folio *folio,
 		 *
 		 * See Documentation/mm/mmu_notifier.rst
 		 */
-		dec_mm_counter(mm, mm_counter_file(&folio->page));
 	}
 discard:
-	return ret;
+	dec_mm_counter(vma->vm_mm, mm_counter(&folio->page));
+	return true;
+
+exit_restore_pte:
+	set_pte_at(mm, address, pvmw.pte, pteval);
+exit:
+	return false;
 }
 
 /*
@@ -1809,8 +1793,10 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 				pte_pfn(*pvmw.pte) - folio_pfn(folio));
 		ret = try_to_unmap_one_page(folio, vma,
 				range, pvmw, address, flags);
-		if (!ret)
+		if (!ret) {
+			page_vma_mapped_walk_done(&pvmw);
 			break;
+		}
 
 		/*
 		 * No need to call mmu_notifier_invalidate_range() it has be
@@ -1819,7 +1805,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		 *
 		 * See Documentation/mm/mmu_notifier.rst
 		 */
-		page_remove_rmap(subpage, vma, folio_test_hugetlb(folio));
+		page_remove_rmap(subpage, vma, false);
 		if (vma->vm_flags & VM_LOCKED)
 			mlock_drain_local();
 		folio_put(folio);

From patchwork Thu Feb 23 08:31:59 2023
From: Yin Fengwei
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org
Cc: fengwei.yin@intel.com
Subject: [PATCH 4/5] rmap: add folio_remove_rmap_range()
Date: Thu, 23 Feb 2023 16:31:59 +0800
Message-Id: <20230223083200.3149015-5-fengwei.yin@intel.com>
In-Reply-To: <20230223083200.3149015-1-fengwei.yin@intel.com>
References: <20230223083200.3149015-1-fengwei.yin@intel.com>

folio_remove_rmap_range() takes down the pte mappings for a specific
range of pages in a folio. Compared with page_remove_rmap(), it batches
the __lruvec_stat updates for large folios.
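The batching in the non-compound path amounts to: decrement each page's mapcount, count how many pages lost their last mapping, and charge that total to the node stat once for the whole range. A userspace sketch of the loop (C11 atomics standing in for the kernel's `atomic_t` helpers; the `toy_page` struct and `COMPOUND_MAPPED` value here are illustrative, not kernel definitions):

```c
#include <stdatomic.h>

#define COMPOUND_MAPPED (1 << 24)	/* illustrative, mirrors the kernel define */

struct toy_page {
	atomic_int mapcount;	/* -1 means no mappings, like _mapcount */
};

/*
 * Decrement the mapcount of nr_pages consecutive pages and return how
 * many of them lost their last mapping.  The caller then does one
 * stat update of -nr instead of one update per page.
 */
static int remove_rmap_range(struct toy_page *page, atomic_int *nr_mapped,
			     int nr_pages, int large)
{
	int nr = 0, last;

	do {
		/* atomic_add_negative(-1, &page->_mapcount) */
		last = (atomic_fetch_add(&page->mapcount, -1) - 1) < 0;
		if (last && large) {
			/* atomic_dec_return_relaxed(mapped) < COMPOUND_MAPPED */
			last = (atomic_fetch_sub(nr_mapped, 1) - 1) < COMPOUND_MAPPED;
		}

		if (last)
			nr++;
	} while (page++, --nr_pages > 0);

	return nr;	/* caller: __lruvec_stat_mod_folio(..., -nr) once */
}
```

The per-page atomics are unavoidable (each page has its own mapcount), but the expensive per-node stat update drops from O(pages) to O(1) per range.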
Signed-off-by: Yin Fengwei
---
 include/linux/rmap.h |  4 +++
 mm/rmap.c            | 58 +++++++++++++++++++++++++++++++++-----------
 2 files changed, 48 insertions(+), 14 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index a4570da03e58..d7a51b96f379 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -200,6 +200,10 @@ void page_add_file_rmap(struct page *, struct vm_area_struct *,
 		bool compound);
 void page_remove_rmap(struct page *, struct vm_area_struct *,
 		bool compound);
+void folio_remove_rmap_range(struct folio *, struct page *,
+		unsigned int nr_pages, struct vm_area_struct *,
+		bool compound);
+
 void hugepage_add_anon_rmap(struct page *, struct vm_area_struct *,
 		unsigned long address, rmap_t flags);
 
diff --git a/mm/rmap.c b/mm/rmap.c
index 097774c809a0..3680765b7ec8 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1357,23 +1357,25 @@ void page_add_file_rmap(struct page *page, struct vm_area_struct *vma,
 }
 
 /**
- * page_remove_rmap - take down pte mapping from a page
- * @page: page to remove mapping from
+ * folio_remove_rmap_range - take down pte mapping from a range of pages
+ * @folio: folio to remove mapping from
+ * @page: The first page to take down pte mapping
+ * @nr_pages: The number of pages which will be take down pte mapping
 * @vma: the vm area from which the mapping is removed
 * @compound: uncharge the page as compound or small page
 *
 * The caller needs to hold the pte lock.
 */
-void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
-		bool compound)
+void folio_remove_rmap_range(struct folio *folio, struct page *page,
+		unsigned int nr_pages, struct vm_area_struct *vma,
+		bool compound)
 {
-	struct folio *folio = page_folio(page);
 	atomic_t *mapped = &folio->_nr_pages_mapped;
-	int nr = 0, nr_pmdmapped = 0;
-	bool last;
+	int nr = 0, nr_pmdmapped = 0, last;
 	enum node_stat_item idx;
 
-	VM_BUG_ON_PAGE(compound && !PageHead(page), page);
+	VM_BUG_ON_FOLIO(compound && (nr_pages != folio_nr_pages(folio)), folio);
+	VM_BUG_ON_FOLIO(compound && (page != &folio->page), folio);
 
 	/* Hugetlb pages are not counted in NR_*MAPPED */
 	if (unlikely(folio_test_hugetlb(folio))) {
@@ -1384,12 +1386,16 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 
 	/* Is page being unmapped by PTE? Is this its last map to be removed? */
 	if (likely(!compound)) {
-		last = atomic_add_negative(-1, &page->_mapcount);
-		nr = last;
-		if (last && folio_test_large(folio)) {
-			nr = atomic_dec_return_relaxed(mapped);
-			nr = (nr < COMPOUND_MAPPED);
-		}
+		do {
+			last = atomic_add_negative(-1, &page->_mapcount);
+			if (last && folio_test_large(folio)) {
+				last = atomic_dec_return_relaxed(mapped);
+				last = (last < COMPOUND_MAPPED);
+			}
+
+			if (last)
+				nr++;
+		} while (page++, --nr_pages > 0);
 	} else if (folio_test_pmd_mappable(folio)) {
 		/* That test is redundant: it's for safety or to optimize out */
 
@@ -1443,6 +1449,30 @@ void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
 	munlock_vma_folio(folio, vma, compound);
 }
 
+/**
+ * page_remove_rmap - take down pte mapping from a page
+ * @page: page to remove mapping from
+ * @vma: the vm area from which the mapping is removed
+ * @compound: uncharge the page as compound or small page
+ *
+ * The caller needs to hold the pte lock.
+ */
+void page_remove_rmap(struct page *page, struct vm_area_struct *vma,
+		bool compound)
+{
+	struct folio *folio = page_folio(page);
+	unsigned int nr_pages;
+
+	VM_BUG_ON_FOLIO(compound && (page != &folio->page), folio);
+
+	if (likely(!compound))
+		nr_pages = 1;
+	else
+		nr_pages = folio_nr_pages(folio);
+
+	folio_remove_rmap_range(folio, page, nr_pages, vma, compound);
+}
+
 static bool try_to_unmap_one_hugetlb(struct folio *folio,
 		struct vm_area_struct *vma, struct mmu_notifier_range range,
 		struct page_vma_mapped_walk pvmw, unsigned long address,

From patchwork Thu Feb 23 08:32:00 2023
From: Yin Fengwei
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org
Cc: fengwei.yin@intel.com
Subject: [PATCH 5/5] try_to_unmap_one: batched remove rmap, update folio refcount
Date: Thu, 23 Feb 2023 16:32:00 +0800
Message-Id: <20230223083200.3149015-6-fengwei.yin@intel.com>
In-Reply-To: <20230223083200.3149015-1-fengwei.yin@intel.com>
References: <20230223083200.3149015-1-fengwei.yin@intel.com>

If unmapping one page fails, or the vma walk will skip the next pte,
or the walk will end at the next pte, do the batched rmap removal and
folio refcount update.
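The batching rule is: accumulate consecutive successfully unmapped ptes in `count`, and flush the batch (remove rmap, drop refcounts) as soon as the next pte would be skipped or the walk would end, because the pte lock is only held while the walk stays inside a run. A sketch of that accumulate-and-flush control flow over a toy array (the boundary test stands in for `pvmw_walk_skip_or_end_on_next()`; `batch_unmap()` and `struct flush` are invented for illustration):

```c
#include <stddef.h>

/* Record of one flush: start index and length of a contiguous run. */
struct flush {
	int start, count;
};

/*
 * Walk 'mapped', batching runs of consecutive mapped entries (value 1)
 * and flushing each run as soon as the following entry would be
 * skipped or the walk ends -- mirroring how try_to_unmap_one() defers
 * folio_remove_rmap_range()/folio_ref_sub() until a run is over.
 */
static int batch_unmap(const int *mapped, int n, struct flush *out)
{
	int nflush = 0, count = 0, start = -1;

	for (int i = 0; i < n; i++) {
		if (!mapped[i])
			continue;		/* the walk skips this pte */
		if (start < 0)
			start = i;		/* first page of a new run */
		count++;

		/* next pte skipped, or walk ends: flush the batch now */
		if (i + 1 >= n || !mapped[i + 1]) {
			out[nflush].start = start;
			out[nflush].count = count;
			nflush++;
			count = 0;
			start = -1;
		}
	}
	return nflush;
}
```

For a fully-mapped large folio this collapses per-page rmap removal and `folio_put()` calls into one range operation and one `folio_ref_sub()`.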
Signed-off-by: Yin Fengwei
---
 include/linux/rmap.h |  1 +
 mm/page_vma_mapped.c | 30 +++++++++++++++++++++++++++
 mm/rmap.c            | 48 ++++++++++++++++++++++++++++++++++----------
 3 files changed, 68 insertions(+), 11 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index d7a51b96f379..568801ee8d6a 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -424,6 +424,7 @@ static inline void page_vma_mapped_walk_done(struct page_vma_mapped_walk *pvmw)
 }
 
 bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw);
+bool pvmw_walk_skip_or_end_on_next(struct page_vma_mapped_walk *pvmw);
 
 /*
  * Used by swapoff to help locate where page is expected in vma.
diff --git a/mm/page_vma_mapped.c b/mm/page_vma_mapped.c
index 4e448cfbc6ef..19e997dfb5c6 100644
--- a/mm/page_vma_mapped.c
+++ b/mm/page_vma_mapped.c
@@ -291,6 +291,36 @@ bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw)
 	return false;
 }
 
+/**
+ * pvmw_walk_skip_or_end_on_next - check if next pte will be skipped or
+ *				   end the walk
+ * @pvmw: pointer to struct page_vma_mapped_walk.
+ *
+ * This function can only be called with correct pte lock hold
+ */
+bool pvmw_walk_skip_or_end_on_next(struct page_vma_mapped_walk *pvmw)
+{
+	unsigned long address = pvmw->address + PAGE_SIZE;
+
+	if (address >= vma_address_end(pvmw))
+		return true;
+
+	if ((address & (PMD_SIZE - PAGE_SIZE)) == 0)
+		return true;
+
+	if (pte_none(*pvmw->pte))
+		return true;
+
+	pvmw->pte++;
+	if (!check_pte(pvmw)) {
+		pvmw->pte--;
+		return true;
+	}
+	pvmw->pte--;
+
+	return false;
+}
+
 /**
  * page_mapped_in_vma - check whether a page is really mapped in a VMA
  * @page: the page to test
diff --git a/mm/rmap.c b/mm/rmap.c
index 3680765b7ec8..7156b804d424 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1748,6 +1748,26 @@ static bool try_to_unmap_one_page(struct folio *folio,
 	return false;
 }
 
+static void folio_remove_rmap_and_update_count(struct folio *folio,
+		struct page *start, struct vm_area_struct *vma, int count)
+{
+	if (count == 0)
+		return;
+
+	/*
+	 * No need to call mmu_notifier_invalidate_range() it has be
+	 * done above for all cases requiring it to happen under page
+	 * table lock before mmu_notifier_invalidate_range_end()
+	 *
+	 * See Documentation/mm/mmu_notifier.rst
+	 */
+	folio_remove_rmap_range(folio, start, count, vma,
+			folio_test_hugetlb(folio));
+	if (vma->vm_flags & VM_LOCKED)
+		mlock_drain_local();
+	folio_ref_sub(folio, count);
+}
+
 /*
  * @arg: enum ttu_flags will be passed to this argument
  */
@@ -1755,10 +1775,11 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		unsigned long address, void *arg)
 {
 	DEFINE_FOLIO_VMA_WALK(pvmw, folio, vma, address, 0);
-	struct page *subpage;
+	struct page *start = NULL;
 	bool ret = true;
 	struct mmu_notifier_range range;
 	enum ttu_flags flags = (enum ttu_flags)(long)arg;
+	int count = 0;
 
 	/*
 	 * When racing against e.g. zap_pte_range() on another cpu,
@@ -1819,26 +1840,31 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			break;
 		}
 
-		subpage = folio_page(folio,
+		if (!start)
+			start = folio_page(folio,
 					pte_pfn(*pvmw.pte) - folio_pfn(folio));
 		ret = try_to_unmap_one_page(folio, vma,
 				range, pvmw, address, flags);
 		if (!ret) {
+			folio_remove_rmap_and_update_count(folio,
+					start, vma, count);
 			page_vma_mapped_walk_done(&pvmw);
 			break;
 		}
+		count++;
 
 		/*
-		 * No need to call mmu_notifier_invalidate_range() it has be
-		 * done above for all cases requiring it to happen under page
-		 * table lock before mmu_notifier_invalidate_range_end()
-		 *
-		 * See Documentation/mm/mmu_notifier.rst
+		 * If next pte will be skipped in page_vma_mapped_walk() or
+		 * the walk will end at it, batched remove rmap and update
+		 * page refcount. We can't do it after page_vma_mapped_walk()
+		 * return false because the pte lock will not be hold.
 		 */
-		page_remove_rmap(subpage, vma, false);
-		if (vma->vm_flags & VM_LOCKED)
-			mlock_drain_local();
-		folio_put(folio);
+		if (pvmw_walk_skip_or_end_on_next(&pvmw)) {
+			folio_remove_rmap_and_update_count(folio,
+					start, vma, count);
+			count = 0;
+			start = NULL;
+		}
 	}
 
 	mmu_notifier_invalidate_range_end(&range);