[v2,30/46] hugetlb: add high-granularity migration support


Message ID

20230218002819.1486479-31-jthoughton@google.com (mailing list archive)

State

New

Headers

Date: Sat, 18 Feb 2023 00:28:03 +0000
In-Reply-To: <20230218002819.1486479-1-jthoughton@google.com>
Mime-Version: 1.0
References: <20230218002819.1486479-1-jthoughton@google.com>
Message-ID: <20230218002819.1486479-31-jthoughton@google.com>
Subject: [PATCH v2 30/46] hugetlb: add high-granularity migration support
From: James Houghton <jthoughton@google.com>
To: Mike Kravetz <mike.kravetz@oracle.com>,
 Muchun Song <songmuchun@bytedance.com>,
	Peter Xu <peterx@redhat.com>, Andrew Morton <akpm@linux-foundation.org>
Cc: David Hildenbrand <david@redhat.com>,
 David Rientjes <rientjes@google.com>,
	Axel Rasmussen <axelrasmussen@google.com>,
 Mina Almasry <almasrymina@google.com>,
	"Zach O'Keefe" <zokeefe@google.com>,
 Manish Mishra <manish.mishra@nutanix.com>,
	Naoya Horiguchi <naoya.horiguchi@nec.com>,
 "Dr . David Alan Gilbert" <dgilbert@redhat.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
 Vlastimil Babka <vbabka@suse.cz>,
	Baolin Wang <baolin.wang@linux.alibaba.com>,
 Miaohe Lin <linmiaohe@huawei.com>,
	Yang Shi <shy828301@gmail.com>, Frank van der Linden <fvdl@google.com>,
 Jiaqi Yan <jiaqiyan@google.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	James Houghton <jthoughton@google.com>
Content-Type: text/plain; charset="UTF-8"
Sender: owner-linux-mm@kvack.org
Precedence: bulk

Series

hugetlb: introduce HugeTLB high-granularity mapping

Commit Message

James Houghton Feb. 18, 2023, 12:28 a.m. UTC

With high-granularity mapping, one hugepage can be mapped by several
PTEs, so the page table walk in queue_folios_hugetlb can encounter the
same folio more than once. To avoid queueing a hugepage for migration
multiple times, we use last_folio to keep track of the last folio we
saw, and if the folio we're looking at is last_folio, we skip it.

For the non-hugetlb cases, last_folio is unused but is still updated so
that its meaning stays consistent with the hugetlb case.

Signed-off-by: James Houghton <jthoughton@google.com>

diff --git a/include/linux/swapops.h b/include/linux/swapops.h
index 3a451b7afcb3..6ef80763e629 100644
--- a/include/linux/swapops.h
+++ b/include/linux/swapops.h
@@ -68,6 +68,8 @@ 
 
 static inline bool is_pfn_swap_entry(swp_entry_t entry);
 
+struct hugetlb_pte;
+
 /* Clear all flags but only keep swp_entry_t related information */
 static inline pte_t pte_swp_clear_flags(pte_t pte)
 {
@@ -339,7 +341,8 @@  extern void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 #ifdef CONFIG_HUGETLB_PAGE
 extern void __migration_entry_wait_huge(struct vm_area_struct *vma,
 					pte_t *ptep, spinlock_t *ptl);
-extern void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte);
+extern void migration_entry_wait_huge(struct vm_area_struct *vma,
+					struct hugetlb_pte *hpte);
 #endif	/* CONFIG_HUGETLB_PAGE */
 #else  /* CONFIG_MIGRATION */
 static inline swp_entry_t make_readable_migration_entry(pgoff_t offset)
@@ -369,7 +372,8 @@  static inline void migration_entry_wait(struct mm_struct *mm, pmd_t *pmd,
 #ifdef CONFIG_HUGETLB_PAGE
 static inline void __migration_entry_wait_huge(struct vm_area_struct *vma,
 					       pte_t *ptep, spinlock_t *ptl) { }
-static inline void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte) { }
+static inline void migration_entry_wait_huge(struct vm_area_struct *vma,
+						struct hugetlb_pte *hpte) { }
 #endif	/* CONFIG_HUGETLB_PAGE */
 static inline int is_writable_migration_entry(swp_entry_t entry)
 {
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 86cd51beb02c..39f541b4a0a8 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -6418,7 +6418,7 @@  vm_fault_t hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 			 * be released there.
 			 */
 			mutex_unlock(&hugetlb_fault_mutex_table[hash]);
-			migration_entry_wait_huge(vma, hpte.ptep);
+			migration_entry_wait_huge(vma, &hpte);
 			return 0;
 		} else if (unlikely(is_hugetlb_entry_hwpoisoned(entry)))
 			ret = VM_FAULT_HWPOISON_LARGE |
diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 0f91be88392b..43e210181cce 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -424,6 +424,7 @@  struct queue_pages {
 	unsigned long start;
 	unsigned long end;
 	struct vm_area_struct *first;
+	struct folio *last_folio;
 };
 
 /*
@@ -475,6 +476,7 @@  static int queue_folios_pmd(pmd_t *pmd, spinlock_t *ptl, unsigned long addr,
 	flags = qp->flags;
 	/* go to folio migration */
 	if (flags & (MPOL_MF_MOVE | MPOL_MF_MOVE_ALL)) {
+		qp->last_folio = folio;
 		if (!vma_migratable(walk->vma) ||
 		    migrate_folio_add(folio, qp->pagelist, flags)) {
 			ret = 1;
@@ -539,6 +541,8 @@  static int queue_folios_pte_range(pmd_t *pmd, unsigned long addr,
 				break;
 			}
 
+			qp->last_folio = folio;
+
 			/*
 			 * Do not abort immediately since there may be
 			 * temporary off LRU pages in the range.  Still
@@ -570,15 +574,22 @@  static int queue_folios_hugetlb(struct hugetlb_pte *hpte,
 	spinlock_t *ptl;
 	pte_t entry;
 
-	/* We don't migrate high-granularity HugeTLB mappings for now. */
-	if (hugetlb_hgm_enabled(walk->vma))
-		return -EINVAL;
-
 	ptl = hugetlb_pte_lock(hpte);
 	entry = huge_ptep_get(hpte->ptep);
 	if (!pte_present(entry))
 		goto unlock;
-	folio = pfn_folio(pte_pfn(entry));
+
+	if (!hugetlb_pte_present_leaf(hpte, entry)) {
+		ret = -EAGAIN;
+		goto unlock;
+	}
+
+	folio = page_folio(pte_page(entry));
+
+	/* We already queued this page with another high-granularity PTE. */
+	if (folio == qp->last_folio)
+		goto unlock;
+
 	if (!queue_folio_required(folio, qp))
 		goto unlock;
 
@@ -747,6 +758,7 @@  queue_pages_range(struct mm_struct *mm, unsigned long start, unsigned long end,
 		.start = start,
 		.end = end,
 		.first = NULL,
+		.last_folio = NULL,
 	};
 
 	err = walk_page_range(mm, start, end, &queue_pages_walk_ops, &qp);
diff --git a/mm/migrate.c b/mm/migrate.c
index 616afcc40fdc..b26169990532 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -196,6 +196,9 @@  static bool remove_migration_pte(struct folio *folio,
 		/* pgoff is invalid for ksm pages, but they are never large */
 		if (folio_test_large(folio) && !folio_test_hugetlb(folio))
 			idx = linear_page_index(vma, pvmw.address) - pvmw.pgoff;
+		else if (folio_test_hugetlb(folio))
+			idx = (pvmw.address & ~huge_page_mask(hstate_vma(vma)))/
+				PAGE_SIZE;
 		new = folio_page(folio, idx);
 
 #ifdef CONFIG_ARCH_ENABLE_THP_MIGRATION
@@ -247,14 +250,16 @@  static bool remove_migration_pte(struct folio *folio,
 
 #ifdef CONFIG_HUGETLB_PAGE
 		if (folio_test_hugetlb(folio)) {
+			struct page *hpage = folio_page(folio, 0);
 			unsigned int shift = pvmw.pte_order + PAGE_SHIFT;
 
 			pte = arch_make_huge_pte(pte, shift, vma->vm_flags);
 			if (folio_test_anon(folio))
-				hugepage_add_anon_rmap(new, vma, pvmw.address,
+				hugepage_add_anon_rmap(hpage, vma, pvmw.address,
 						       rmap_flags);
 			else
-				page_add_file_rmap(new, vma, true);
+				hugetlb_add_file_rmap(new, shift,
+						hstate_vma(vma), vma);
 			set_huge_pte_at(vma->vm_mm, pvmw.address, pvmw.pte, pte);
 		} else
 #endif
@@ -270,7 +275,7 @@  static bool remove_migration_pte(struct folio *folio,
 			mlock_drain_local();
 
 		trace_remove_migration_pte(pvmw.address, pte_val(pte),
-					   compound_order(new));
+					   pvmw.pte_order);
 
 		/* No need to invalidate - it was non-present before */
 		update_mmu_cache(vma, pvmw.address, pvmw.pte);
@@ -361,12 +366,10 @@  void __migration_entry_wait_huge(struct vm_area_struct *vma,
 	}
 }
 
-void migration_entry_wait_huge(struct vm_area_struct *vma, pte_t *pte)
+void migration_entry_wait_huge(struct vm_area_struct *vma,
+				struct hugetlb_pte *hpte)
 {
-	spinlock_t *ptl = huge_pte_lockptr(huge_page_shift(hstate_vma(vma)),
-					   vma->vm_mm, pte);
-
-	__migration_entry_wait_huge(vma, pte, ptl);
+	__migration_entry_wait_huge(vma, hpte->ptep, hpte->ptl);
 }
 #endif
