Message ID | 20210122195231.324857-3-mike.kravetz@oracle.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | create hugetlb flags to consolidate state | expand |
On Fri 22-01-21 11:52:28, Mike Kravetz wrote: > Use the new hugetlb page specific flag HPageMigratable to replace the > page_huge_active interfaces. By it's name, page_huge_active implied > that a huge page was on the active list. However, that is not really > what code checking the flag wanted to know. It really wanted to determine > if the huge page could be migrated. This happens when the page is actually > added to the page cache and/or task page table. This is the reasoning > behind the name change. yeah, definitely less confusing! > The VM_BUG_ON_PAGE() calls in the *_huge_active() interfaces are not > really necessary as we KNOW the page is a hugetlb page. Therefore, they > are removed. > > The routine page_huge_active checked for PageHeadHuge before testing the > active bit. This is unnecessary in the case where we hold a reference or > lock and know it is a hugetlb head page. page_huge_active is also called > without holding a reference or lock (scan_movable_pages), and can race with > code freeing the page. The extra check in page_huge_active shortened the > race window, but did not prevent the race. Offline code calling > scan_movable_pages already deals with these races, so removing the check > is acceptable. Add comment to racy code. > > Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> > Reviewed-by: Oscar Salvador <osalvador@suse.de> > Reviewed-by: Muchun Song <songmuchun@bytedance.com> > Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> Acked-by: Michal Hocko <mhocko@suse.com> One nit below [...] > diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h > index a7eb05315c6e..58be44a915d1 100644 > --- a/include/linux/hugetlb.h > +++ b/include/linux/hugetlb.h > @@ -480,9 +480,13 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr, > * HPG_restore_reserve - Set when a hugetlb page consumes a reservation at > * allocation time. Cleared when page is fully instantiated. Free > * routine checks flag to restore a reservation on error paths. > + * HPG_migratable - Set after a newly allocated page is added to the page > + * cache and/or page tables. Indicates the page is a candidate for > + * migration. The state change is synchronized by hugetlb_lock. Pls. > */ > enum hugetlb_page_flags { > HPG_restore_reserve = 0, > + HPG_migratable, > __NR_HPAGEFLAGS, > };
On 1/27/21 2:25 AM, Michal Hocko wrote: > On Fri 22-01-21 11:52:28, Mike Kravetz wrote: >> Use the new hugetlb page specific flag HPageMigratable to replace the >> page_huge_active interfaces. By it's name, page_huge_active implied >> that a huge page was on the active list. However, that is not really >> what code checking the flag wanted to know. It really wanted to determine >> if the huge page could be migrated. This happens when the page is actually >> added to the page cache and/or task page table. This is the reasoning >> behind the name change. > > yeah, definitely less confusing! > >> The VM_BUG_ON_PAGE() calls in the *_huge_active() interfaces are not >> really necessary as we KNOW the page is a hugetlb page. Therefore, they >> are removed. >> >> The routine page_huge_active checked for PageHeadHuge before testing the >> active bit. This is unnecessary in the case where we hold a reference or >> lock and know it is a hugetlb head page. page_huge_active is also called >> without holding a reference or lock (scan_movable_pages), and can race with >> code freeing the page. The extra check in page_huge_active shortened the >> race window, but did not prevent the race. Offline code calling >> scan_movable_pages already deals with these races, so removing the check >> is acceptable. Add comment to racy code. >> >> Signed-off-by: Mike Kravetz <mike.kravetz@oracle.com> >> Reviewed-by: Oscar Salvador <osalvador@suse.de> >> Reviewed-by: Muchun Song <songmuchun@bytedance.com> >> Reviewed-by: Miaohe Lin <linmiaohe@huawei.com> > > Acked-by: Michal Hocko <mhocko@suse.com> > > One nit below > [...] >> diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h >> index a7eb05315c6e..58be44a915d1 100644 >> --- a/include/linux/hugetlb.h >> +++ b/include/linux/hugetlb.h >> @@ -480,9 +480,13 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr, >> * HPG_restore_reserve - Set when a hugetlb page consumes a reservation at >> * allocation time. Cleared when page is fully instantiated. Free >> * routine checks flag to restore a reservation on error paths. >> + * HPG_migratable - Set after a newly allocated page is added to the page >> + * cache and/or page tables. Indicates the page is a candidate for >> + * migration. > > The state change is synchronized by hugetlb_lock. > > Pls. I will update/explain this. But ... hugetlb_lock does not synchronize all changes of this flag. The flag is set without holding the lock for newly allocated pages after being added to page cache and/or page tables. This 'signals' the page is a candidate for migration. When the migration code (isolation, putback, move state) deals with the flag, it does hold the hugetlb_lock. This patch did not change any of the synchronization.
diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c index b00801fd6002..e1d7ed2a53a9 100644 --- a/fs/hugetlbfs/inode.c +++ b/fs/hugetlbfs/inode.c @@ -735,7 +735,7 @@ static long hugetlbfs_fallocate(struct file *file, int mode, loff_t offset, mutex_unlock(&hugetlb_fault_mutex_table[hash]); - set_page_huge_active(page); + SetHPageMigratable(page); /* * unlock_page because locked by add_to_page_cache() * put_page() due to reference from alloc_huge_page() diff --git a/include/linux/hugetlb.h b/include/linux/hugetlb.h index a7eb05315c6e..58be44a915d1 100644 --- a/include/linux/hugetlb.h +++ b/include/linux/hugetlb.h @@ -480,9 +480,13 @@ unsigned long hugetlb_get_unmapped_area(struct file *file, unsigned long addr, * HPG_restore_reserve - Set when a hugetlb page consumes a reservation at * allocation time. Cleared when page is fully instantiated. Free * routine checks flag to restore a reservation on error paths. + * HPG_migratable - Set after a newly allocated page is added to the page + * cache and/or page tables. Indicates the page is a candidate for + * migration. */ enum hugetlb_page_flags { HPG_restore_reserve = 0, + HPG_migratable, __NR_HPAGEFLAGS, }; @@ -525,6 +529,7 @@ static inline void ClearHPage##uname(struct page *page) \ * Create functions associated with hugetlb page flags */ HPAGEFLAG(RestoreReserve, restore_reserve) +HPAGEFLAG(Migratable, migratable) #ifdef CONFIG_HUGETLB_PAGE diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h index ec5d0290e0ee..db914477057b 100644 --- a/include/linux/page-flags.h +++ b/include/linux/page-flags.h @@ -592,15 +592,9 @@ static inline void ClearPageCompound(struct page *page) #ifdef CONFIG_HUGETLB_PAGE int PageHuge(struct page *page); int PageHeadHuge(struct page *page); -bool page_huge_active(struct page *page); #else TESTPAGEFLAG_FALSE(Huge) TESTPAGEFLAG_FALSE(HeadHuge) - -static inline bool page_huge_active(struct page *page) -{ - return 0; -} #endif diff --git a/mm/hugetlb.c b/mm/hugetlb.c index 8ec6138ca81b..f1a3c8230dbf 100644 --- a/mm/hugetlb.c +++ b/mm/hugetlb.c @@ -1353,30 +1353,6 @@ struct hstate *size_to_hstate(unsigned long size) return NULL; } -/* - * Test to determine whether the hugepage is "active/in-use" (i.e. being linked - * to hstate->hugepage_activelist.) - * - * This function can be called for tail pages, but never returns true for them. - */ -bool page_huge_active(struct page *page) -{ - return PageHeadHuge(page) && PagePrivate(&page[1]); -} - -/* never called for tail page */ -void set_page_huge_active(struct page *page) -{ - VM_BUG_ON_PAGE(!PageHeadHuge(page), page); - SetPagePrivate(&page[1]); -} - -static void clear_page_huge_active(struct page *page) -{ - VM_BUG_ON_PAGE(!PageHeadHuge(page), page); - ClearPagePrivate(&page[1]); -} - /* * Internal hugetlb specific page flag. Do not use outside of the hugetlb * code @@ -1438,7 +1414,7 @@ static void __free_huge_page(struct page *page) } spin_lock(&hugetlb_lock); - clear_page_huge_active(page); + ClearHPageMigratable(page); hugetlb_cgroup_uncharge_page(hstate_index(h), pages_per_huge_page(h), page); hugetlb_cgroup_uncharge_page_rsvd(hstate_index(h), @@ -4218,7 +4194,7 @@ static vm_fault_t hugetlb_cow(struct mm_struct *mm, struct vm_area_struct *vma, make_huge_pte(vma, new_page, 1)); page_remove_rmap(old_page, true); hugepage_add_new_anon_rmap(new_page, vma, haddr); - set_page_huge_active(new_page); + SetHPageMigratable(new_page); /* Make the old page be freed below */ new_page = old_page; } @@ -4455,12 +4431,12 @@ static vm_fault_t hugetlb_no_page(struct mm_struct *mm, spin_unlock(ptl); /* - * Only make newly allocated pages active. Existing pages found - * in the pagecache could be !page_huge_active() if they have been - * isolated for migration. + * Only set HPageMigratable in newly allocated pages. Existing pages + * found in the pagecache may not have HPageMigratableset if they have + * been isolated for migration. */ if (new_page) - set_page_huge_active(page); + SetHPageMigratable(page); unlock_page(page); out: @@ -4771,7 +4747,7 @@ int hugetlb_mcopy_atomic_pte(struct mm_struct *dst_mm, update_mmu_cache(dst_vma, dst_addr, dst_pte); spin_unlock(ptl); - set_page_huge_active(page); + SetHPageMigratable(page); if (vm_shared) unlock_page(page); ret = 0; @@ -5596,12 +5572,13 @@ bool isolate_huge_page(struct page *page, struct list_head *list) bool ret = true; spin_lock(&hugetlb_lock); - if (!PageHeadHuge(page) || !page_huge_active(page) || + if (!PageHeadHuge(page) || + !HPageMigratable(page) || !get_page_unless_zero(page)) { ret = false; goto unlock; } - clear_page_huge_active(page); + ClearHPageMigratable(page); list_move_tail(&page->lru, list); unlock: spin_unlock(&hugetlb_lock); @@ -5612,7 +5589,7 @@ void putback_active_hugepage(struct page *page) { VM_BUG_ON_PAGE(!PageHead(page), page); spin_lock(&hugetlb_lock); - set_page_huge_active(page); + SetHPageMigratable(page); list_move_tail(&page->lru, &(page_hstate(page))->hugepage_activelist); spin_unlock(&hugetlb_lock); put_page(page); diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c index f9d57b9be8c7..465164be4d2d 100644 --- a/mm/memory_hotplug.c +++ b/mm/memory_hotplug.c @@ -1260,7 +1260,14 @@ static int scan_movable_pages(unsigned long start, unsigned long end, if (!PageHuge(page)) continue; head = compound_head(page); - if (page_huge_active(head)) + /* + * This test is racy as we hold no reference or lock. The + * hugetlb page could have been free'ed and head is no longer + * a hugetlb page before the following check. In such unlikely + * cases false positives and negatives are possible. Calling + * code must deal with these scenarios. + */ + if (HPageMigratable(head)) goto found; skip = compound_nr(head) - (page - head); pfn += skip - 1;