
[v2] mm/rmap.c: split huge pmd when it really is

Message ID 20191223222856.7189-1-richardw.yang@linux.intel.com (mailing list archive)
State New, archived
Series: [v2] mm/rmap.c: split huge pmd when it really is

Commit Message

Wei Yang Dec. 23, 2019, 10:28 p.m. UTC
When page is not NULL, the function is called by try_to_unmap_one() with
TTU_SPLIT_HUGE_PMD set. There are two callers that invoke
try_to_unmap_one() with TTU_SPLIT_HUGE_PMD set:

  * unmap_page()
  * shrink_page_list()

In both cases, the page passed to try_to_unmap_one() is the PageHead() of
the THP. If this page's mapping address in the process is not
HPAGE_PMD_SIZE aligned, the THP is not mapped as a PMD THP in that
process. This can happen when we mremap() a PMD-sized range to an
unaligned address.

Currently, this case only happens to be covered by the following check in
__split_huge_pmd():

  page != pmd_page(*pmd)

This patch checks the address up front to skip the needless work.

Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>

---
v2: move the check into split_huge_pmd_address().
---
 mm/huge_memory.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Matthew Wilcox (Oracle) Dec. 23, 2019, 11:11 p.m. UTC | #1
On Tue, Dec 24, 2019 at 06:28:56AM +0800, Wei Yang wrote:
> When page is not NULL, function is called by try_to_unmap_one() with
> TTU_SPLIT_HUGE_PMD set. There are two cases to call try_to_unmap_one()
> with TTU_SPLIT_HUGE_PMD set:
> 
>   * unmap_page()
>   * shrink_page_list()
> 
> In both case, the page passed to try_to_unmap_one() is PageHead() of the
> THP. If this page's mapping address in process is not HPAGE_PMD_SIZE
> aligned, this means the THP is not mapped as PMD THP in this process.
> This could happen when we do mremap() a PMD size range to an un-aligned
> address.
> 
> Currently, this case is handled by following check in __split_huge_pmd()
> luckily.
> 
>   page != pmd_page(*pmd)
> 
> This patch checks the address to skip some work.

The description here is confusing to me.

> +	/*
> +	 * When page is not NULL, function is called by try_to_unmap_one()
> +	 * with TTU_SPLIT_HUGE_PMD set. There are two places set
> +	 * TTU_SPLIT_HUGE_PMD
> +	 *
> +	 *     unmap_page()
> +	 *     shrink_page_list()
> +	 *
> +	 * In both cases, the "page" here is the PageHead() of a THP.
> +	 *
> +	 * If the page is not a PMD mapped huge page, e.g. after mremap(), it
> +	 * is not necessary to split it.
> +	 */
> +	if (page && !IS_ALIGNED(address, HPAGE_PMD_SIZE))
> +		return;

Repeating 75% of it as comments doesn't make it any less confusing.  And
it feels like we're digging a pothole for someone to fall into later.
Why not make it make sense ...

	if (page && !IS_ALIGNED(address, page_size(page)))
		return;
Wei Yang Dec. 24, 2019, 1:56 a.m. UTC | #2
On Mon, Dec 23, 2019 at 03:11:20PM -0800, Matthew Wilcox wrote:
>On Tue, Dec 24, 2019 at 06:28:56AM +0800, Wei Yang wrote:
>> When page is not NULL, function is called by try_to_unmap_one() with
>> TTU_SPLIT_HUGE_PMD set. There are two cases to call try_to_unmap_one()
>> with TTU_SPLIT_HUGE_PMD set:
>> 
>>   * unmap_page()
>>   * shrink_page_list()
>> 
>> In both case, the page passed to try_to_unmap_one() is PageHead() of the
>> THP. If this page's mapping address in process is not HPAGE_PMD_SIZE
>> aligned, this means the THP is not mapped as PMD THP in this process.
>> This could happen when we do mremap() a PMD size range to an un-aligned
>> address.
>> 
>> Currently, this case is handled by following check in __split_huge_pmd()
>> luckily.
>> 
>>   page != pmd_page(*pmd)
>> 
>> This patch checks the address to skip some work.
>
>The description here is confusing to me.
>

Sorry for the confusion.

Below is my understanding; if it is not correct, just let me know :-)

According to the current comment in __split_huge_pmd(), we check
pmd_page() against page for the migration case. In practice, this check
also helps in the following two cases, where the page has already been
split:

   * the page was just split in place
   * the page was split and then moved to a non-PMD-aligned address

In both cases, pmd_page() points to the PTE-level page table. That's why
we don't split an already-split THP page.

If the current code really is intended to cover these two cases, sorry
for my misunderstanding.

>> +	/*
>> +	 * When page is not NULL, function is called by try_to_unmap_one()
>> +	 * with TTU_SPLIT_HUGE_PMD set. There are two places set
>> +	 * TTU_SPLIT_HUGE_PMD
>> +	 *
>> +	 *     unmap_page()
>> +	 *     shrink_page_list()
>> +	 *
>> +	 * In both cases, the "page" here is the PageHead() of a THP.
>> +	 *
>> +	 * If the page is not a PMD mapped huge page, e.g. after mremap(), it
>> +	 * is not necessary to split it.
>> +	 */
>> +	if (page && !IS_ALIGNED(address, HPAGE_PMD_SIZE))
>> +		return;
>
>Repeating 75% of it as comments doesn't make it any less confusing.  And
>it feels like we're digging a pothole for someone to fall into later.
>Why not make it make sense ...
>
>	if (page && !IS_ALIGNED(address, page_size(page))
>		return;

Hmm... using HPAGE_PMD_SIZE here is meant to emphasize that we want the
address to be PMD aligned. If we just use page_size() here, it may
confuse the audience?
Matthew Wilcox (Oracle) Dec. 27, 2019, 3:13 p.m. UTC | #3
On Tue, Dec 24, 2019 at 09:56:02AM +0800, Wei Yang wrote:
> On Mon, Dec 23, 2019 at 03:11:20PM -0800, Matthew Wilcox wrote:
> >On Tue, Dec 24, 2019 at 06:28:56AM +0800, Wei Yang wrote:
> >> When page is not NULL, function is called by try_to_unmap_one() with
> >> TTU_SPLIT_HUGE_PMD set. There are two cases to call try_to_unmap_one()
> >> with TTU_SPLIT_HUGE_PMD set:
> >> 
> >>   * unmap_page()
> >>   * shrink_page_list()
> >> 
> >> In both case, the page passed to try_to_unmap_one() is PageHead() of the
> >> THP. If this page's mapping address in process is not HPAGE_PMD_SIZE
> >> aligned, this means the THP is not mapped as PMD THP in this process.
> >> This could happen when we do mremap() a PMD size range to an un-aligned
> >> address.
> >> 
> >> Currently, this case is handled by following check in __split_huge_pmd()
> >> luckily.
> >> 
> >>   page != pmd_page(*pmd)
> >> 
> >> This patch checks the address to skip some work.
> >
> >The description here is confusing to me.
> >
> 
> Sorry for the confusion.
> 
> Below is my understanding, if not correct or proper, just let me know :-)
> 
> According to current comment in __split_huge_pmd(), we check pmd_page with
> page for migration case. While actually, this check also helps in the
> following two cases when page already split-ed:
> 
>    * page just split-ed in place
>    * page split-ed and moved to non-PMD aligned address
> 
> In both cases, pmd_page() is pointing to the PTE level page table. That's why
> we don't split one already split-ed THP page.
> 
> If current code really intend to cover these two cases, sorry for my poor
> understanding.
> 
> >> +	/*
> >> +	 * When page is not NULL, function is called by try_to_unmap_one()
> >> +	 * with TTU_SPLIT_HUGE_PMD set. There are two places set
> >> +	 * TTU_SPLIT_HUGE_PMD
> >> +	 *
> >> +	 *     unmap_page()
> >> +	 *     shrink_page_list()
> >> +	 *
> >> +	 * In both cases, the "page" here is the PageHead() of a THP.
> >> +	 *
> >> +	 * If the page is not a PMD mapped huge page, e.g. after mremap(), it
> >> +	 * is not necessary to split it.
> >> +	 */
> >> +	if (page && !IS_ALIGNED(address, HPAGE_PMD_SIZE))
> >> +		return;
> >
> >Repeating 75% of it as comments doesn't make it any less confusing.  And
> >it feels like we're digging a pothole for someone to fall into later.
> >Why not make it make sense ...
> >
> >	if (page && !IS_ALIGNED(address, page_size(page))
> >		return;
> 
> Hmm... Use HPAGE_PMD_SIZE here wants to emphasize we want the address to be
> PMD aligned. If just use page_size() here, may confuse the audience?

I'm OK with using HPAGE_PMD_SIZE here.  I was trying to future-proof
this function for supporting 64kB pages with a 4kB page size on ARM,
but this function will need changes for that anyway, so I'm OK with
your suggestion.
Wei Yang Dec. 29, 2019, 10:07 p.m. UTC | #4
On Fri, Dec 27, 2019 at 07:13:46AM -0800, Matthew Wilcox wrote:
>On Tue, Dec 24, 2019 at 09:56:02AM +0800, Wei Yang wrote:
>> On Mon, Dec 23, 2019 at 03:11:20PM -0800, Matthew Wilcox wrote:
>> >On Tue, Dec 24, 2019 at 06:28:56AM +0800, Wei Yang wrote:
>> >> When page is not NULL, function is called by try_to_unmap_one() with
>> >> TTU_SPLIT_HUGE_PMD set. There are two cases to call try_to_unmap_one()
>> >> with TTU_SPLIT_HUGE_PMD set:
>> >> 
>> >>   * unmap_page()
>> >>   * shrink_page_list()
>> >> 
>> >> In both case, the page passed to try_to_unmap_one() is PageHead() of the
>> >> THP. If this page's mapping address in process is not HPAGE_PMD_SIZE
>> >> aligned, this means the THP is not mapped as PMD THP in this process.
>> >> This could happen when we do mremap() a PMD size range to an un-aligned
>> >> address.
>> >> 
>> >> Currently, this case is handled by following check in __split_huge_pmd()
>> >> luckily.
>> >> 
>> >>   page != pmd_page(*pmd)
>> >> 
>> >> This patch checks the address to skip some work.
>> >
>> >The description here is confusing to me.
>> >
>> 
>> Sorry for the confusion.
>> 
>> Below is my understanding, if not correct or proper, just let me know :-)
>> 
>> According to current comment in __split_huge_pmd(), we check pmd_page with
>> page for migration case. While actually, this check also helps in the
>> following two cases when page already split-ed:
>> 
>>    * page just split-ed in place
>>    * page split-ed and moved to non-PMD aligned address
>> 
>> In both cases, pmd_page() is pointing to the PTE level page table. That's why
>> we don't split one already split-ed THP page.
>> 
>> If current code really intend to cover these two cases, sorry for my poor
>> understanding.
>> 
>> >> +	/*
>> >> +	 * When page is not NULL, function is called by try_to_unmap_one()
>> >> +	 * with TTU_SPLIT_HUGE_PMD set. There are two places set
>> >> +	 * TTU_SPLIT_HUGE_PMD
>> >> +	 *
>> >> +	 *     unmap_page()
>> >> +	 *     shrink_page_list()
>> >> +	 *
>> >> +	 * In both cases, the "page" here is the PageHead() of a THP.
>> >> +	 *
>> >> +	 * If the page is not a PMD mapped huge page, e.g. after mremap(), it
>> >> +	 * is not necessary to split it.
>> >> +	 */
>> >> +	if (page && !IS_ALIGNED(address, HPAGE_PMD_SIZE))
>> >> +		return;
>> >
>> >Repeating 75% of it as comments doesn't make it any less confusing.  And
>> >it feels like we're digging a pothole for someone to fall into later.
>> >Why not make it make sense ...
>> >
>> >	if (page && !IS_ALIGNED(address, page_size(page))
>> >		return;
>> 
>> Hmm... Use HPAGE_PMD_SIZE here wants to emphasize we want the address to be
>> PMD aligned. If just use page_size() here, may confuse the audience?
>
>I'm OK with using HPAGE_PMD_SIZE here.  I was trying to future-proof
>this function for supporting 64kB pages with a 4kB page size on ARM,
>but this function will need changes for that anyway, so I'm OK with
>your suggestion.

Thanks for your comments. :-)
Wei Yang Jan. 3, 2020, 7:18 a.m. UTC | #5
On Tue, Dec 24, 2019 at 06:28:56AM +0800, Wei Yang wrote:
>When page is not NULL, function is called by try_to_unmap_one() with
>TTU_SPLIT_HUGE_PMD set. There are two cases to call try_to_unmap_one()
>with TTU_SPLIT_HUGE_PMD set:
>
>  * unmap_page()
>  * shrink_page_list()
>
>In both case, the page passed to try_to_unmap_one() is PageHead() of the
>THP. If this page's mapping address in process is not HPAGE_PMD_SIZE
>aligned, this means the THP is not mapped as PMD THP in this process.
>This could happen when we do mremap() a PMD size range to an un-aligned
>address.
>
>Currently, this case is handled by following check in __split_huge_pmd()
>luckily.
>
>  page != pmd_page(*pmd)
>
>This patch checks the address to skip some work.

I am sorry I forgot to address Kirill's comments on the first version.

The first one is the performance difference this change makes for a
PTE-mapped THP.

Here is the result (in cycles):

        Before     Patched

        963        195
        988        40
        895        78

Average 948        104

So the change reduces the time spent in split_huge_pmd_address() by about 90%.

For the 2nd comment, the vma check: let me take a further look and analyze it.

Thanks for Kirill's suggestion.

>
>Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>
>---
>v2: move the check into split_huge_pmd_address().
>---
> mm/huge_memory.c | 16 ++++++++++++++++
> 1 file changed, 16 insertions(+)
>
>diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>index 893fecd5daa4..2b9c2f412b32 100644
>--- a/mm/huge_memory.c
>+++ b/mm/huge_memory.c
>@@ -2342,6 +2342,22 @@ void split_huge_pmd_address(struct vm_area_struct *vma, unsigned long address,
> 	pud_t *pud;
> 	pmd_t *pmd;
> 
>+	/*
>+	 * When page is not NULL, function is called by try_to_unmap_one()
>+	 * with TTU_SPLIT_HUGE_PMD set. There are two places set
>+	 * TTU_SPLIT_HUGE_PMD
>+	 *
>+	 *     unmap_page()
>+	 *     shrink_page_list()
>+	 *
>+	 * In both cases, the "page" here is the PageHead() of a THP.
>+	 *
>+	 * If the page is not a PMD mapped huge page, e.g. after mremap(), it
>+	 * is not necessary to split it.
>+	 */
>+	if (page && !IS_ALIGNED(address, HPAGE_PMD_SIZE))
>+		return;
>+
> 	pgd = pgd_offset(vma->vm_mm, address);
> 	if (!pgd_present(*pgd))
> 		return;
>-- 
>2.17.1
Wei Yang Jan. 3, 2020, 1:05 p.m. UTC | #6
On Fri, Jan 03, 2020 at 03:18:46PM +0800, Wei Yang wrote:
>On Tue, Dec 24, 2019 at 06:28:56AM +0800, Wei Yang wrote:
>>When page is not NULL, function is called by try_to_unmap_one() with
>>TTU_SPLIT_HUGE_PMD set. There are two cases to call try_to_unmap_one()
>>with TTU_SPLIT_HUGE_PMD set:
>>
>>  * unmap_page()
>>  * shrink_page_list()
>>
>>In both case, the page passed to try_to_unmap_one() is PageHead() of the
>>THP. If this page's mapping address in process is not HPAGE_PMD_SIZE
>>aligned, this means the THP is not mapped as PMD THP in this process.
>>This could happen when we do mremap() a PMD size range to an un-aligned
>>address.
>>
>>Currently, this case is handled by following check in __split_huge_pmd()
>>luckily.
>>
>>  page != pmd_page(*pmd)
>>
>>This patch checks the address to skip some work.
>
>I am sorry to forget address Kirill's comment in 1st version.
>
>The first one is the performance difference after this change for a PTE
>mappged THP.
>
>Here is the result:(in cycle)
>
>        Before     Patched
>
>        963        195
>        988        40
>        895        78
>
>Average 948        104
>
>So the change reduced 90% time for function split_huge_pmd_address().
>
>For the 2nd comment, the vma check. Let me take a further look to analysis.
>
>Thanks for Kirill's suggestion.
>

For the 2nd comment, the check is whether the vma could hold a huge page.

You mean doing this check?

  vma->vm_start <= address && vma->vm_end >= address + HPAGE_PMD_SIZE

This happens after munmap()ing part of the THP range? With this check we
could skip the PMD split for that case.

>>
>>Signed-off-by: Wei Yang <richardw.yang@linux.intel.com>
>>
>>---
>>v2: move the check into split_huge_pmd_address().
>>---
>> mm/huge_memory.c | 16 ++++++++++++++++
>> 1 file changed, 16 insertions(+)
>>
>>diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>index 893fecd5daa4..2b9c2f412b32 100644
>>--- a/mm/huge_memory.c
>>+++ b/mm/huge_memory.c
>>@@ -2342,6 +2342,22 @@ void split_huge_pmd_address(struct vm_area_struct *vma, unsigned long address,
>> 	pud_t *pud;
>> 	pmd_t *pmd;
>> 
>>+	/*
>>+	 * When page is not NULL, function is called by try_to_unmap_one()
>>+	 * with TTU_SPLIT_HUGE_PMD set. There are two places set
>>+	 * TTU_SPLIT_HUGE_PMD
>>+	 *
>>+	 *     unmap_page()
>>+	 *     shrink_page_list()
>>+	 *
>>+	 * In both cases, the "page" here is the PageHead() of a THP.
>>+	 *
>>+	 * If the page is not a PMD mapped huge page, e.g. after mremap(), it
>>+	 * is not necessary to split it.
>>+	 */
>>+	if (page && !IS_ALIGNED(address, HPAGE_PMD_SIZE))
>>+		return;
>>+
>> 	pgd = pgd_offset(vma->vm_mm, address);
>> 	if (!pgd_present(*pgd))
>> 		return;
>>-- 
>>2.17.1
>
>-- 
>Wei Yang
>Help you, Help me
Kirill A. Shutemov Jan. 3, 2020, 1:26 p.m. UTC | #7
On Fri, Jan 03, 2020 at 09:05:54PM +0800, Wei Yang wrote:
> On Fri, Jan 03, 2020 at 03:18:46PM +0800, Wei Yang wrote:
> >On Tue, Dec 24, 2019 at 06:28:56AM +0800, Wei Yang wrote:
> >>When page is not NULL, function is called by try_to_unmap_one() with
> >>TTU_SPLIT_HUGE_PMD set. There are two cases to call try_to_unmap_one()
> >>with TTU_SPLIT_HUGE_PMD set:
> >>
> >>  * unmap_page()
> >>  * shrink_page_list()
> >>
> >>In both case, the page passed to try_to_unmap_one() is PageHead() of the
> >>THP. If this page's mapping address in process is not HPAGE_PMD_SIZE
> >>aligned, this means the THP is not mapped as PMD THP in this process.
> >>This could happen when we do mremap() a PMD size range to an un-aligned
> >>address.
> >>
> >>Currently, this case is handled by following check in __split_huge_pmd()
> >>luckily.
> >>
> >>  page != pmd_page(*pmd)
> >>
> >>This patch checks the address to skip some work.
> >
> >I am sorry to forget address Kirill's comment in 1st version.
> >
> >The first one is the performance difference after this change for a PTE
> >mappged THP.
> >
> >Here is the result:(in cycle)
> >
> >        Before     Patched
> >
> >        963        195
> >        988        40
> >        895        78
> >
> >Average 948        104
> >
> >So the change reduced 90% time for function split_huge_pmd_address().

Right.

But do we have a scenario where the performance of
split_huge_pmd_address() matters? I mean, it is called as part of the
rmap walk; an attempt to split a huge PMD where we don't have a huge
PMD should be within noise.

> >For the 2nd comment, the vma check. Let me take a further look to analysis.
> >
> >Thanks for Kirill's suggestion.
> >
> 
> For 2nd comment, check vma could hold huge page.
> 
> You mean do this check ?
> 
>   vma->vm_start <= address && vma->vm_end >= address + HPAGE_PMD_SIZE
> 
> This happens after munmap a partial of the THP range? After doing so, we can
> skip split_pmd for this case.

Okay, you are right. This kind of check would not be safe, as we call
split_huge_pmd_address() after adjusting the VMA, with the expectation of
splitting the PMD on the boundary of the VMA.
Wei Yang Jan. 3, 2020, 2:01 p.m. UTC | #8
On Fri, Jan 03, 2020 at 04:26:50PM +0300, Kirill A. Shutemov wrote:
>On Fri, Jan 03, 2020 at 09:05:54PM +0800, Wei Yang wrote:
>> On Fri, Jan 03, 2020 at 03:18:46PM +0800, Wei Yang wrote:
>> >On Tue, Dec 24, 2019 at 06:28:56AM +0800, Wei Yang wrote:
>> >>When page is not NULL, function is called by try_to_unmap_one() with
>> >>TTU_SPLIT_HUGE_PMD set. There are two cases to call try_to_unmap_one()
>> >>with TTU_SPLIT_HUGE_PMD set:
>> >>
>> >>  * unmap_page()
>> >>  * shrink_page_list()
>> >>
>> >>In both case, the page passed to try_to_unmap_one() is PageHead() of the
>> >>THP. If this page's mapping address in process is not HPAGE_PMD_SIZE
>> >>aligned, this means the THP is not mapped as PMD THP in this process.
>> >>This could happen when we do mremap() a PMD size range to an un-aligned
>> >>address.
>> >>
>> >>Currently, this case is handled by following check in __split_huge_pmd()
>> >>luckily.
>> >>
>> >>  page != pmd_page(*pmd)
>> >>
>> >>This patch checks the address to skip some work.
>> >
>> >I am sorry to forget address Kirill's comment in 1st version.
>> >
>> >The first one is the performance difference after this change for a PTE
>> >mappged THP.
>> >
>> >Here is the result:(in cycle)
>> >
>> >        Before     Patched
>> >
>> >        963        195
>> >        988        40
>> >        895        78
>> >
>> >Average 948        104
>> >
>> >So the change reduced 90% time for function split_huge_pmd_address().
>
>Right.
>
>But do we have a scenario, where the performance of
>split_huge_pmd_address() matters? I mean, it it called as part of rmap
>walk, attempt to split huge PMD where we don't have huge PMD should be
>within noise.

Sorry for my poor English.

I don't catch the meaning of the last sentence. Does "within noise" here
mean that a non-huge PMD is an expected scenario and we can tolerate its cost?

>
>> >For the 2nd comment, the vma check. Let me take a further look to analysis.
>> >
>> >Thanks for Kirill's suggestion.
>> >
>> 
>> For 2nd comment, check vma could hold huge page.
>> 
>> You mean do this check ?
>> 
>>   vma->vm_start <= address && vma->vm_end >= address + HPAGE_PMD_SIZE
>> 
>> This happens after munmap a partial of the THP range? After doing so, we can
>> skip split_pmd for this case.
>
>Okay, you are right. This kind of check would not be safe as we
>split_huge_pmd_address() after adjusting VMA with expectation of splitting
>PMD on boundary of the VMA.
>
>-- 
> Kirill A. Shutemov
Kirill A. Shutemov Jan. 7, 2020, 12:03 p.m. UTC | #9
On Fri, Jan 03, 2020 at 10:01:28PM +0800, Wei Yang wrote:
> On Fri, Jan 03, 2020 at 04:26:50PM +0300, Kirill A. Shutemov wrote:
> >On Fri, Jan 03, 2020 at 09:05:54PM +0800, Wei Yang wrote:
> >> On Fri, Jan 03, 2020 at 03:18:46PM +0800, Wei Yang wrote:
> >> >On Tue, Dec 24, 2019 at 06:28:56AM +0800, Wei Yang wrote:
> >> >>When page is not NULL, function is called by try_to_unmap_one() with
> >> >>TTU_SPLIT_HUGE_PMD set. There are two cases to call try_to_unmap_one()
> >> >>with TTU_SPLIT_HUGE_PMD set:
> >> >>
> >> >>  * unmap_page()
> >> >>  * shrink_page_list()
> >> >>
> >> >>In both case, the page passed to try_to_unmap_one() is PageHead() of the
> >> >>THP. If this page's mapping address in process is not HPAGE_PMD_SIZE
> >> >>aligned, this means the THP is not mapped as PMD THP in this process.
> >> >>This could happen when we do mremap() a PMD size range to an un-aligned
> >> >>address.
> >> >>
> >> >>Currently, this case is handled by following check in __split_huge_pmd()
> >> >>luckily.
> >> >>
> >> >>  page != pmd_page(*pmd)
> >> >>
> >> >>This patch checks the address to skip some work.
> >> >
> >> >I am sorry to forget address Kirill's comment in 1st version.
> >> >
> >> >The first one is the performance difference after this change for a PTE
> >> >mappged THP.
> >> >
> >> >Here is the result:(in cycle)
> >> >
> >> >        Before     Patched
> >> >
> >> >        963        195
> >> >        988        40
> >> >        895        78
> >> >
> >> >Average 948        104
> >> >
> >> >So the change reduced 90% time for function split_huge_pmd_address().
> >
> >Right.
> >
> >But do we have a scenario, where the performance of
> >split_huge_pmd_address() matters? I mean, it it called as part of rmap
> >walk, attempt to split huge PMD where we don't have huge PMD should be
> >within noise.
> 
> Sorry for my poor English.
> 
> I don't catch the meaning of the last sentence. "within noise" here means
> non-huge PMD is an expected scenario and we could tolerate this? 

Basically, I doubt that this change would bring any measurable performance
benefits on a real workload.
Wei Yang Jan. 8, 2020, 12:36 a.m. UTC | #10
On Tue, Jan 07, 2020 at 03:03:33PM +0300, Kirill A. Shutemov wrote:
>On Fri, Jan 03, 2020 at 10:01:28PM +0800, Wei Yang wrote:
>> On Fri, Jan 03, 2020 at 04:26:50PM +0300, Kirill A. Shutemov wrote:
>> >On Fri, Jan 03, 2020 at 09:05:54PM +0800, Wei Yang wrote:
>> >> On Fri, Jan 03, 2020 at 03:18:46PM +0800, Wei Yang wrote:
>> >> >On Tue, Dec 24, 2019 at 06:28:56AM +0800, Wei Yang wrote:
>> >> >>When page is not NULL, function is called by try_to_unmap_one() with
>> >> >>TTU_SPLIT_HUGE_PMD set. There are two cases to call try_to_unmap_one()
>> >> >>with TTU_SPLIT_HUGE_PMD set:
>> >> >>
>> >> >>  * unmap_page()
>> >> >>  * shrink_page_list()
>> >> >>
>> >> >>In both case, the page passed to try_to_unmap_one() is PageHead() of the
>> >> >>THP. If this page's mapping address in process is not HPAGE_PMD_SIZE
>> >> >>aligned, this means the THP is not mapped as PMD THP in this process.
>> >> >>This could happen when we do mremap() a PMD size range to an un-aligned
>> >> >>address.
>> >> >>
>> >> >>Currently, this case is handled by following check in __split_huge_pmd()
>> >> >>luckily.
>> >> >>
>> >> >>  page != pmd_page(*pmd)
>> >> >>
>> >> >>This patch checks the address to skip some work.
>> >> >
>> >> >I am sorry to forget address Kirill's comment in 1st version.
>> >> >
>> >> >The first one is the performance difference after this change for a PTE
>> >> >mappged THP.
>> >> >
>> >> >Here is the result:(in cycle)
>> >> >
>> >> >        Before     Patched
>> >> >
>> >> >        963        195
>> >> >        988        40
>> >> >        895        78
>> >> >
>> >> >Average 948        104
>> >> >
>> >> >So the change reduced 90% time for function split_huge_pmd_address().
>> >
>> >Right.
>> >
>> >But do we have a scenario, where the performance of
>> >split_huge_pmd_address() matters? I mean, it it called as part of rmap
>> >walk, attempt to split huge PMD where we don't have huge PMD should be
>> >within noise.
>> 
>> Sorry for my poor English.
>> 
>> I don't catch the meaning of the last sentence. "within noise" here means
>> non-huge PMD is an expected scenario and we could tolerate this? 
>
>Basically, I doubt that this change would bring any measurable perfromance
>benefits on a real workload.
>

OK, let's leave it for now.

>-- 
> Kirill A. Shutemov

Patch

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 893fecd5daa4..2b9c2f412b32 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2342,6 +2342,22 @@  void split_huge_pmd_address(struct vm_area_struct *vma, unsigned long address,
 	pud_t *pud;
 	pmd_t *pmd;
 
+	/*
+	 * When page is not NULL, function is called by try_to_unmap_one()
+	 * with TTU_SPLIT_HUGE_PMD set. There are two places set
+	 * TTU_SPLIT_HUGE_PMD
+	 *
+	 *     unmap_page()
+	 *     shrink_page_list()
+	 *
+	 * In both cases, the "page" here is the PageHead() of a THP.
+	 *
+	 * If the page is not a PMD mapped huge page, e.g. after mremap(), it
+	 * is not necessary to split it.
+	 */
+	if (page && !IS_ALIGNED(address, HPAGE_PMD_SIZE))
+		return;
+
 	pgd = pgd_offset(vma->vm_mm, address);
 	if (!pgd_present(*pgd))
 		return;