From patchwork Tue Jul 30 12:46:01 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Usama Arif <usamaarif642@gmail.com>
X-Patchwork-Id: 13747382
From: Usama Arif <usamaarif642@gmail.com>
To: akpm@linux-foundation.org, linux-mm@kvack.org
Cc: hannes@cmpxchg.org, riel@surriel.com, shakeel.butt@linux.dev,
    roman.gushchin@linux.dev, yuzhao@google.com, david@redhat.com,
    baohua@kernel.org, ryan.roberts@arm.com, rppt@kernel.org,
    willy@infradead.org, cerasuolodomenico@gmail.com, corbet@lwn.net,
    linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
    kernel-team@meta.com, Shuang Zhai, Usama Arif <usamaarif642@gmail.com>
Subject: [PATCH 4/6] mm: don't remap unused subpages when splitting isolated thp
Date: Tue, 30 Jul 2024 13:46:01 +0100
Message-ID: <20240730125346.1580150-5-usamaarif642@gmail.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240730125346.1580150-1-usamaarif642@gmail.com>
References: <20240730125346.1580150-1-usamaarif642@gmail.com>
MIME-Version: 1.0
From: Yu Zhao <yuzhao@google.com>

Here, "unused" means containing only zeros and inaccessible to
userspace. When splitting an isolated thp under reclaim or migration,
there is no need to remap its unused subpages because they can be
faulted in anew. Not remapping them avoids writeback or copying during
reclaim or migration. This is particularly helpful when the internal
fragmentation of a thp is high, i.e., it has many untouched subpages.

This is also a prerequisite for the THP low utilization shrinker
introduced in later patches, where underutilized THPs are split and
the zero-filled split pages are freed, saving memory.
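For illustration, the "contains only zeros" test at the heart of this
patch amounts to the standalone C sketch below. It is a minimal
userspace analogue, not kernel code: page_is_zeroes() is a made-up
helper, PAGE_SIZE is assumed to be 4 KiB, and plain memcmp() stands in
for the kernel's memchr_inv() on the kmap'd subpage:

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define PAGE_SIZE 4096

/* Hypothetical helper: true if the page-sized buffer holds only zeros. */
static bool page_is_zeroes(const void *page)
{
	/* Userspace lacks memchr_inv(), so compare against a
	 * statically zero-initialized page instead. */
	static const unsigned char zeroes[PAGE_SIZE];

	return memcmp(page, zeroes, PAGE_SIZE) == 0;
}

int main(void)
{
	unsigned char page[PAGE_SIZE] = { 0 };

	printf("untouched: %s\n",
	       page_is_zeroes(page) ? "zero-filled" : "has data");
	page[123] = 1;	/* dirty one byte, as a faulted-in write would */
	printf("after write: %s\n",
	       page_is_zeroes(page) ? "zero-filled" : "has data");
	return 0;
}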
Signed-off-by: Yu Zhao <yuzhao@google.com>
Tested-by: Shuang Zhai
Signed-off-by: Usama Arif <usamaarif642@gmail.com>
---
 include/linux/rmap.h |  2 +-
 mm/huge_memory.c     |  8 ++---
 mm/migrate.c         | 73 +++++++++++++++++++++++++++++++++++++++-----
 mm/migrate_device.c  |  4 +--
 4 files changed, 72 insertions(+), 15 deletions(-)

diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 0978c64f49d8..805ab09057ed 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -745,7 +745,7 @@ int folio_mkclean(struct folio *);
 int pfn_mkclean_range(unsigned long pfn, unsigned long nr_pages, pgoff_t pgoff,
 		      struct vm_area_struct *vma);
 
-void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked);
+void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked, bool unmap_unused);
 
 /*
  * rmap_walk_control: To control rmap traversing for specific needs
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 76a3b6a2b796..892467d85f3a 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2775,7 +2775,7 @@ bool unmap_huge_pmd_locked(struct vm_area_struct *vma, unsigned long addr,
 	return false;
 }
 
-static void remap_page(struct folio *folio, unsigned long nr)
+static void remap_page(struct folio *folio, unsigned long nr, bool unmap_unused)
 {
 	int i = 0;
 
@@ -2783,7 +2783,7 @@ static void remap_page(struct folio *folio, unsigned long nr)
 	if (!folio_test_anon(folio))
 		return;
 	for (;;) {
-		remove_migration_ptes(folio, folio, true);
+		remove_migration_ptes(folio, folio, true, unmap_unused);
 		i += folio_nr_pages(folio);
 		if (i >= nr)
 			break;
@@ -2993,7 +2993,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 
 	if (nr_dropped)
 		shmem_uncharge(folio->mapping->host, nr_dropped);
-	remap_page(folio, nr);
+	remap_page(folio, nr, PageAnon(head));
 
 	/*
 	 * set page to its compound_head when split to non order-0 pages, so
@@ -3286,7 +3286,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 		if (mapping)
 			xas_unlock(&xas);
 		local_irq_enable();
-		remap_page(folio, folio_nr_pages(folio));
+		remap_page(folio, folio_nr_pages(folio), false);
 		ret = -EAGAIN;
 	}
diff --git a/mm/migrate.c b/mm/migrate.c
index b273bac0d5ae..f4f06bdded70 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -177,13 +177,61 @@ void putback_movable_pages(struct list_head *l)
 	}
 }
 
+static bool try_to_unmap_unused(struct page_vma_mapped_walk *pvmw,
+				struct folio *folio,
+				unsigned long idx)
+{
+	struct page *page = folio_page(folio, idx);
+	void *addr;
+	bool dirty;
+	pte_t newpte;
+
+	VM_BUG_ON_PAGE(PageCompound(page), page);
+	VM_BUG_ON_PAGE(!PageAnon(page), page);
+	VM_BUG_ON_PAGE(!PageLocked(page), page);
+	VM_BUG_ON_PAGE(pte_present(*pvmw->pte), page);
+
+	if (PageMlocked(page) || (pvmw->vma->vm_flags & VM_LOCKED))
+		return false;
+
+	/*
+	 * The pmd entry mapping the old thp was flushed and the pte mapping
+	 * this subpage has been non present. Therefore, this subpage is
+	 * inaccessible. We don't need to remap it if it contains only zeros.
+	 */
+	addr = kmap_local_page(page);
+	dirty = memchr_inv(addr, 0, PAGE_SIZE);
+	kunmap_local(addr);
+
+	if (dirty)
+		return false;
+
+	pte_clear_not_present_full(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, false);
+
+	if (userfaultfd_armed(pvmw->vma)) {
+		newpte = pte_mkspecial(pfn_pte(page_to_pfn(ZERO_PAGE(pvmw->address)),
+					       pvmw->vma->vm_page_prot));
+		ptep_clear_flush(pvmw->vma, pvmw->address, pvmw->pte);
+		set_pte_at(pvmw->vma->vm_mm, pvmw->address, pvmw->pte, newpte);
+	}
+
+	dec_mm_counter(pvmw->vma->vm_mm, mm_counter(folio));
+	return true;
+}
+
+struct rmap_walk_arg {
+	struct folio *folio;
+	bool unmap_unused;
+};
+
 /*
  * Restore a potential migration pte to a working pte entry
  */
 static bool remove_migration_pte(struct folio *folio,
-		struct vm_area_struct *vma, unsigned long addr, void *old)
+		struct vm_area_struct *vma, unsigned long addr, void *arg)
 {
-	DEFINE_FOLIO_VMA_WALK(pvmw, old, vma, addr, PVMW_SYNC | PVMW_MIGRATION);
+	struct rmap_walk_arg *rmap_walk_arg = arg;
+	DEFINE_FOLIO_VMA_WALK(pvmw, rmap_walk_arg->folio, vma, addr, PVMW_SYNC | PVMW_MIGRATION);
 
 	while (page_vma_mapped_walk(&pvmw)) {
 		rmap_t rmap_flags = RMAP_NONE;
@@ -207,6 +255,8 @@ static bool remove_migration_pte(struct folio *folio,
 			continue;
 		}
 #endif
+		if (rmap_walk_arg->unmap_unused && try_to_unmap_unused(&pvmw, folio, idx))
+			continue;
 
 		folio_get(folio);
 		pte = mk_pte(new, READ_ONCE(vma->vm_page_prot));
@@ -285,13 +335,20 @@ static bool remove_migration_pte(struct folio *folio,
  * Get rid of all migration entries and replace them by
  * references to the indicated page.
  */
-void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked)
+void remove_migration_ptes(struct folio *src, struct folio *dst, bool locked, bool unmap_unused)
 {
+	struct rmap_walk_arg rmap_walk_arg = {
+		.folio = src,
+		.unmap_unused = unmap_unused,
+	};
+
 	struct rmap_walk_control rwc = {
 		.rmap_one = remove_migration_pte,
-		.arg = src,
+		.arg = &rmap_walk_arg,
 	};
 
+	VM_BUG_ON_FOLIO(unmap_unused && src != dst, src);
+
 	if (locked)
 		rmap_walk_locked(dst, &rwc);
 	else
@@ -904,7 +961,7 @@ static int writeout(struct address_space *mapping, struct folio *folio)
 	 * At this point we know that the migration attempt cannot
 	 * be successful.
 	 */
-	remove_migration_ptes(folio, folio, false);
+	remove_migration_ptes(folio, folio, false, false);
 
 	rc = mapping->a_ops->writepage(&folio->page, &wbc);
 
@@ -1068,7 +1125,7 @@ static void migrate_folio_undo_src(struct folio *src,
 				   struct list_head *ret)
 {
 	if (page_was_mapped)
-		remove_migration_ptes(src, src, false);
+		remove_migration_ptes(src, src, false, false);
 
 	/* Drop an anon_vma reference if we took one */
 	if (anon_vma)
 		put_anon_vma(anon_vma);
@@ -1306,7 +1363,7 @@ static int migrate_folio_move(free_folio_t put_new_folio, unsigned long private,
 		lru_add_drain();
 
 	if (old_page_state & PAGE_WAS_MAPPED)
-		remove_migration_ptes(src, dst, false);
+		remove_migration_ptes(src, dst, false, false);
 
 out_unlock_both:
 	folio_unlock(dst);
@@ -1444,7 +1501,7 @@ static int unmap_and_move_huge_page(new_folio_t get_new_folio,
 
 	if (page_was_mapped)
 		remove_migration_ptes(src,
-			rc == MIGRATEPAGE_SUCCESS ? dst : src, false);
+			rc == MIGRATEPAGE_SUCCESS ? dst : src, false, false);
 
 unlock_put_anon:
 	folio_unlock(dst);
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 6d66dc1c6ffa..a1630d8e0d95 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -424,7 +424,7 @@ static unsigned long migrate_device_unmap(unsigned long *src_pfns,
 			continue;
 
 		folio = page_folio(page);
-		remove_migration_ptes(folio, folio, false);
+		remove_migration_ptes(folio, folio, false, false);
 
 		src_pfns[i] = 0;
 		folio_unlock(folio);
@@ -837,7 +837,7 @@ void migrate_device_finalize(unsigned long *src_pfns,
 		src = page_folio(page);
 		dst = page_folio(newpage);
 
-		remove_migration_ptes(src, dst, false);
+		remove_migration_ptes(src, dst, false, false);
 		folio_unlock(src);
 
 		if (is_zone_device_page(page))
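
To recap for reviewers: the per-subpage decision that try_to_unmap_unused()
encodes, as read from the mm/migrate.c hunk above, can be summarized by the
self-contained sketch below. The enum and struct names here are invented for
illustration and do not exist in the kernel:

#include <stdbool.h>
#include <stdio.h>

enum action {
	ACTION_REMAP,		/* restore a working PTE to the subpage */
	ACTION_DISCARD,		/* drop the mapping; a refault allocates anew */
	ACTION_ZERO_PAGE,	/* map the shared zero page (userfaultfd case) */
};

struct subpage_state {
	bool mlocked;		/* PageMlocked() or the VMA has VM_LOCKED */
	bool contains_data;	/* memchr_inv() found a non-zero byte */
	bool uffd_armed;	/* userfaultfd_armed(vma) */
};

static enum action choose_action(const struct subpage_state *s)
{
	/* mlocked memory must stay resident, so it is always remapped */
	if (s->mlocked)
		return ACTION_REMAP;
	/* any non-zero byte means real data, so remap it too */
	if (s->contains_data)
		return ACTION_REMAP;
	/* a cleared PTE would raise a spurious missing fault under
	 * userfaultfd, so install the zero page instead of dropping it */
	if (s->uffd_armed)
		return ACTION_ZERO_PAGE;
	/* zero-filled and inaccessible: safe to discard entirely */
	return ACTION_DISCARD;
}

int main(void)
{
	struct subpage_state s = { .mlocked = false, .contains_data = false,
				   .uffd_armed = false };

	printf("zero-filled, no uffd -> %d (expect ACTION_DISCARD=%d)\n",
	       choose_action(&s), ACTION_DISCARD);
	return 0;
}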