diff mbox series

[v3,1/2] mm: migration: fix the FOLL_GET failure on following huge page

Message ID 20220815010349.432313-2-haiyue.wang@intel.com (mailing list archive)
State Superseded
Headers show
Series fix follow_page related issues | expand

Commit Message

Wang, Haiyue Aug. 15, 2022, 1:03 a.m. UTC
Not all huge page APIs support FOLL_GET option, so the __NR_move_pages
will fail to get the page node information for huge page.

This is an temporary solution to mitigate the racing fix.

After supporting follow huge page by FOLL_GET is done, this fix can be
reverted safely.

Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
---
 mm/migrate.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)

Comments

Huang, Ying Aug. 15, 2022, 1:59 a.m. UTC | #1
Haiyue Wang <haiyue.wang@intel.com> writes:

> Not all huge page APIs support FOLL_GET option, so the __NR_move_pages

move_pages() is a syscall, so you can just call it move_pages(), or
move_pages() syscall.

> will fail to get the page node information for huge page.
                                                 ~~~~~~~~~
some huge pages?

> This is an temporary solution to mitigate the racing fix.

Why is it "racing fix"?  This isn't a race condition fix.

Best Regards,
Huang, Ying

> After supporting follow huge page by FOLL_GET is done, this fix can be
> reverted safely.
>
> Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
> Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>

[snip]
Wang, Haiyue Aug. 15, 2022, 2:10 a.m. UTC | #2
> -----Original Message-----
> From: Huang, Ying <ying.huang@intel.com>
> Sent: Monday, August 15, 2022 09:59
> To: Wang, Haiyue <haiyue.wang@intel.com>
> Cc: linux-mm@kvack.org; linux-kernel@vger.kernel.org; akpm@linux-foundation.org; david@redhat.com;
> linmiaohe@huawei.com; songmuchun@bytedance.com; naoya.horiguchi@linux.dev; alex.sierra@amd.com
> Subject: Re: [PATCH v3 1/2] mm: migration: fix the FOLL_GET failure on following huge page
> 
> Haiyue Wang <haiyue.wang@intel.com> writes:
> 
> > Not all huge page APIs support FOLL_GET option, so the __NR_move_pages
> 
> move_pages() is a syscall, so you can just call it move_pages(), or
> move_pages() syscall.

The application meets the issue, use the bellow function:
syscall (__NR_move_pages, 0, n_pages, ptr, 0, status, 0)

So I used it directly in the commit. "move_pages() syscall" is better.
Will update latter.

> 
> > will fail to get the page node information for huge page.
>                                                  ~~~~~~~~~
> some huge pages?

OK.

> 
> > This is an temporary solution to mitigate the racing fix.
> 
> Why is it "racing fix"?  This isn't a race condition fix.

The 'Fixes' commit is about race condition fix.

How about " his is an temporary solution to mitigate the side effect
Of the race condition fix"

> 
> Best Regards,
> Huang, Ying
> 
> > After supporting follow huge page by FOLL_GET is done, this fix can be
> > reverted safely.
> >
> > Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
> > Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
> 
> [snip]
Wang, Haiyue Aug. 15, 2022, 2:15 a.m. UTC | #3
> -----Original Message-----
> From: Wang, Haiyue
> Sent: Monday, August 15, 2022 10:11
> To: Huang, Ying <ying.huang@intel.com>
> Cc: linux-mm@kvack.org; linux-kernel@vger.kernel.org; akpm@linux-foundation.org; david@redhat.com;
> linmiaohe@huawei.com; songmuchun@bytedance.com; naoya.horiguchi@linux.dev; alex.sierra@amd.com
> Subject: RE: [PATCH v3 1/2] mm: migration: fix the FOLL_GET failure on following huge page
> 
> > -----Original Message-----
> > From: Huang, Ying <ying.huang@intel.com>
> > Sent: Monday, August 15, 2022 09:59
> > To: Wang, Haiyue <haiyue.wang@intel.com>
> > Cc: linux-mm@kvack.org; linux-kernel@vger.kernel.org; akpm@linux-foundation.org; david@redhat.com;
> > linmiaohe@huawei.com; songmuchun@bytedance.com; naoya.horiguchi@linux.dev; alex.sierra@amd.com
> > Subject: Re: [PATCH v3 1/2] mm: migration: fix the FOLL_GET failure on following huge page
> >
> > Haiyue Wang <haiyue.wang@intel.com> writes:
> >


> >
> > > This is an temporary solution to mitigate the racing fix.
> >
> > Why is it "racing fix"?  This isn't a race condition fix.
> 
> The 'Fixes' commit is about race condition fix.
> 
> How about " his is an temporary solution to mitigate the side effect
> Of the race condition fix"

Try to add more words to make things clean:

"This is an temporary solution to mitigate the side effect of the race
condition fix by calling follow_page() with FOLL_GET set."

> 
> >
> > Best Regards,
> > Huang, Ying
> >
> > > After supporting follow huge page by FOLL_GET is done, this fix can be
> > > reverted safely.
> > >
> > > Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
> > > Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
> >
> > [snip]
Huang, Ying Aug. 15, 2022, 2:51 a.m. UTC | #4
"Wang, Haiyue" <haiyue.wang@intel.com> writes:

>> -----Original Message-----
>> From: Wang, Haiyue
>> Sent: Monday, August 15, 2022 10:11
>> To: Huang, Ying <ying.huang@intel.com>
>> Cc: linux-mm@kvack.org; linux-kernel@vger.kernel.org; akpm@linux-foundation.org; david@redhat.com;
>> linmiaohe@huawei.com; songmuchun@bytedance.com; naoya.horiguchi@linux.dev; alex.sierra@amd.com
>> Subject: RE: [PATCH v3 1/2] mm: migration: fix the FOLL_GET failure on following huge page
>>
>> > -----Original Message-----
>> > From: Huang, Ying <ying.huang@intel.com>
>> > Sent: Monday, August 15, 2022 09:59
>> > To: Wang, Haiyue <haiyue.wang@intel.com>
>> > Cc: linux-mm@kvack.org; linux-kernel@vger.kernel.org; akpm@linux-foundation.org; david@redhat.com;
>> > linmiaohe@huawei.com; songmuchun@bytedance.com; naoya.horiguchi@linux.dev; alex.sierra@amd.com
>> > Subject: Re: [PATCH v3 1/2] mm: migration: fix the FOLL_GET failure on following huge page
>> >
>> > Haiyue Wang <haiyue.wang@intel.com> writes:
>> >
>
>
>> >
>> > > This is an temporary solution to mitigate the racing fix.
>> >
>> > Why is it "racing fix"?  This isn't a race condition fix.
>>
>> The 'Fixes' commit is about race condition fix.
>>
>> How about " his is an temporary solution to mitigate the side effect
>> Of the race condition fix"
>
> Try to add more words to make things clean:
>
> "This is an temporary solution to mitigate the side effect of the race
> condition fix by calling follow_page() with FOLL_GET set."

Looks good to me.  Thanks!

Best Regards,
Huang, Ying

>>
>> >
>> > Best Regards,
>> > Huang, Ying
>> >
>> > > After supporting follow huge page by FOLL_GET is done, this fix can be
>> > > reverted safely.
>> > >
>> > > Fixes: 4cd614841c06 ("mm: migration: fix possible do_pages_stat_array racing with memory offline")
>> > > Signed-off-by: Haiyue Wang <haiyue.wang@intel.com>
>> >
>> > [snip]
diff mbox series

Patch

diff --git a/mm/migrate.c b/mm/migrate.c
index 6a1597c92261..581dfaad9257 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -1848,6 +1848,7 @@  static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 
 	for (i = 0; i < nr_pages; i++) {
 		unsigned long addr = (unsigned long)(*pages);
+		unsigned int foll_flags = FOLL_DUMP;
 		struct vm_area_struct *vma;
 		struct page *page;
 		int err = -EFAULT;
@@ -1856,8 +1857,12 @@  static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 		if (!vma)
 			goto set_status;
 
+		/* Not all huge page follow APIs support 'FOLL_GET' */
+		if (!is_vm_hugetlb_page(vma))
+			foll_flags |= FOLL_GET;
+
 		/* FOLL_DUMP to ignore special (like zero) pages */
-		page = follow_page(vma, addr, FOLL_GET | FOLL_DUMP);
+		page = follow_page(vma, addr, foll_flags);
 
 		err = PTR_ERR(page);
 		if (IS_ERR(page))
@@ -1865,7 +1870,8 @@  static void do_pages_stat_array(struct mm_struct *mm, unsigned long nr_pages,
 
 		if (page && !is_zone_device_page(page)) {
 			err = page_to_nid(page);
-			put_page(page);
+			if (foll_flags & FOLL_GET)
+				put_page(page);
 		} else {
 			err = -ENOENT;
 		}