From patchwork Mon Mar 13 12:45:21 2023
X-Patchwork-Submitter: Yin Fengwei
X-Patchwork-Id: 13172440
From: Yin Fengwei <fengwei.yin@intel.com>
To: linux-mm@kvack.org, akpm@linux-foundation.org, willy@infradead.org,
    mike.kravetz@oracle.com, sidhartha.kumar@oracle.com,
    naoya.horiguchi@nec.com, jane.chu@oracle.com, david@redhat.com
Cc: fengwei.yin@intel.com
Subject: [PATCH v4 0/5] batched remove rmap in try_to_unmap_one()
Date: Mon, 13 Mar 2023 20:45:21 +0800
Message-Id: <20230313124526.1207490-1-fengwei.yin@intel.com>

This series brings batched rmap removal to try_to_unmap_one(). Batched
rmap removal is expected to bring a performance gain over removing the
rmap one page at a time.

The series restructures try_to_unmap_one() from:

   loop:
        clear and update PTE
        unmap one page
        goto loop

to:

   loop:
        clear and update PTE
        goto loop
   unmap the range of the folio in one call

This is one step toward always mapping/unmapping the entire folio in
one call, which can simplify folio mapcount handling by avoiding
per-page map/unmap accounting.

The changes are organized as follows:

Patch1/2 move the hugetlb and normal page unmap paths to dedicated
functions, to make the try_to_unmap_one() logic clearer and to make it
easy to add batched rmap removal. To ease code review, there is no
functional change.

Patch3 cleans up try_to_unmap_one_page() and removes some duplicated
function calls.

Patch4 adds folio_remove_rmap_range(), which removes the rmap for a
range of pages of a folio in one batch.

Patch5 makes try_to_unmap_one() remove the rmap in batches.
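To illustrate the shape of the change, here is a simplified sketch.
It is not the actual patch: the many special cases handled by
try_to_unmap_one() are elided, and the folio_remove_rmap_range()
signature is an assumption based on this cover letter.

   #include <linux/mm.h>
   #include <linux/rmap.h>

   /*
    * Simplified sketch, not the actual patch: the PTE walk only
    * clears PTEs and records the mapped range; the rmap is removed
    * once for the whole range afterwards. The
    * folio_remove_rmap_range() signature is assumed.
    */
   static void unmap_folio_range_sketch(struct folio *folio,
                                        struct vm_area_struct *vma,
                                        struct page_vma_mapped_walk *pvmw)
   {
           struct page *start = NULL;
           unsigned int nr_pages = 0;

           while (page_vma_mapped_walk(pvmw)) {
                   /* clear and update PTE */
                   ptep_clear_flush(vma, pvmw->address, pvmw->pte);

                   /* record the range instead of unmapping one page */
                   if (!start)
                           start = pfn_to_page(pvmw->pfn);
                   nr_pages++;
           }

           /* unmap the range of the folio in one call */
           if (nr_pages)
                   folio_remove_rmap_range(folio, start, nr_pages, vma);
   }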
Functional testing was done with the v3 patchset in a qemu guest with
4G memory:
  - kernel mm selftest to trigger vmscan() and finally hit
    try_to_unmap_one().
  - Injected hwpoison into a hugetlb page to trigger the
    try_to_unmap_one() call against hugetlb.
  - 8 hours of stress testing: Firefox + kernel mm selftest + kernel
    build.

To demonstrate the performance gain, MADV_PAGEOUT was changed not to
split large folios for the page cache, and a micro benchmark was
created, mainly as follows:

  #include <fcntl.h>
  #include <sys/mman.h>
  #include <unistd.h>

  #define FILESIZE (2 * 1024 * 1024)

  int main(int argc, char **argv)
  {
          int fd = open(argv[1], O_RDWR); /* file >= FILESIZE bytes */
          long pgsize = sysconf(_SC_PAGESIZE);
          unsigned long count = 0;
          volatile char cc;
          char *c = mmap(NULL, FILESIZE, PROT_READ|PROT_WRITE,
                         MAP_PRIVATE, fd, 0);

          while (1) {     /* terminated externally after 1 second */
                  unsigned long i;

                  /* touch every page, then page the mapping out */
                  for (i = 0; i < FILESIZE; i += pgsize)
                          cc = *(volatile char *)(c + i);
                  madvise(c, FILESIZE, MADV_PAGEOUT);
                  count++;        /* the reported metric */
          }
          munmap(c, FILESIZE);
          return 0;
  }

It was run with 96 instances + 96 files on an xfs file system for 1
second. The test platform was IceLake with 48C/96T + 192G memory. The
test result (the benchmark's loop count) shows around a 7% improvement
(58865 -> 63247) with this patch series. perf shows the following.

Without this series:

  18.26%--try_to_unmap_one
          |
          |--10.71%--page_remove_rmap
          |          |
          |           --9.81%--__mod_lruvec_page_state
          |                      |
          |                      |--1.36%--__mod_memcg_lruvec_state
          |                      |           |
          |                      |            --0.80%--cgroup_rstat_updated
          |                      |
          |                       --0.67%--__mod_lruvec_state
          |                                   |
          |                                    --0.59%--__mod_node_page_state
          |
          |--5.41%--ptep_clear_flush
          |          |
          |           --4.64%--flush_tlb_mm_range
          |                      |
          |                       --3.88%--flush_tlb_func
          |                                  |
          |                                   --3.56%--native_flush_tlb_one_user
          |
          |--0.75%--percpu_counter_add_batch
          |
           --0.53%--PageHeadHuge

With this series:

   9.87%--try_to_unmap_one
          |
          |--7.14%--try_to_unmap_one_page.constprop.0.isra.0
          |          |
          |          |--5.21%--ptep_clear_flush
          |          |          |
          |          |           --4.36%--flush_tlb_mm_range
          |          |                      |
          |          |                       --3.54%--flush_tlb_func
          |          |                                  |
          |          |                                   --3.17%--native_flush_tlb_one_user
          |          |
          |           --0.82%--percpu_counter_add_batch
          |
          |--1.18%--folio_remove_rmap_and_update_count.part.0
          |          |
          |           --1.11%--folio_remove_rmap_range
          |                      |
          |                       --0.53%--__mod_lruvec_page_state
          |
           --0.57%--PageHeadHuge

As expected, the cost of __mod_lruvec_page_state is reduced
significantly with the batched folio_remove_rmap_range(). The page
reclaim path should get the same benefit.

This series is based on next-20230310.

Changes from v3:
  - General
    - Rebased to next-20230310
    - Added the performance testing result
  - Patch1
    - Fixed incorrect comments, as Mike Kravetz pointed out
    - Used huge_pte_dirty(), as Mike Kravetz suggested
    - Used true instead of folio_test_hugetlb() in
      try_to_unmap_one_hugetlb(), since the page is a hugetlb page for
      sure there, as Mike Kravetz suggested

Changes from v2:
  - General
    - Rebased the patches to next-20230303
    - Updated the cover letter about the preparation for unmapping the
      entire folio in one call
    - No code change compared to v2, but fixed the patch-apply
      conflict caused by the wrong patch order in v2

Changes from v1:
  - General
    - Rebased the patches to next-20230228
  - Patch1
    - Removed the if (PageHWPoison(page) && !(flags & TTU_HWPOISON))
      check, as suggested by Mike Kravetz and HORIGUCHI NAOYA
    - Removed the mlock_drain_local(), as suggested by Mike Kravetz
    - Removed the comments about the mm counter change, as suggested
      by Mike Kravetz

Yin Fengwei (5):
  rmap: move hugetlb try_to_unmap to dedicated function
  rmap: move page unmap operation to dedicated function
  rmap: cleanup exit path of try_to_unmap_one_page()
  rmap: add folio_remove_rmap_range()
  try_to_unmap_one: batched remove rmap, update folio refcount

 include/linux/rmap.h |   5 +
 mm/page_vma_mapped.c |  30 +++
 mm/rmap.c            | 623 +++++++++++++++++++++++++------------------
 3 files changed, 398 insertions(+), 260 deletions(-)
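For readers scanning the diffstat: the interface Patch4 adds to
include/linux/rmap.h presumably has a shape along these lines (an
assumption based on this cover letter, not a copy of the patch):

  /*
   * Assumed shape, see Patch4 for the authoritative declaration:
   * remove the rmap for @nr consecutive pages of @folio, starting at
   * @page, mapped into @vma.
   */
  void folio_remove_rmap_range(struct folio *folio, struct page *page,
                               int nr, struct vm_area_struct *vma);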