
[linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

Message ID 20240604134738264WKaKYb3q_YTE32hNAy2lz@zte.com.cn (mailing list archive)
State New
Series [linux-next] mm: huge_memory: fix misused mapping_large_folio_support() for anon folios

Commit Message

xu.xin16@zte.com.cn June 4, 2024, 5:47 a.m. UTC
From: Ran Xiaokai <ran.xiaokai@zte.com.cn>

When I ran a large folio split test, the WARNING
"[ 5059.122759][  T166] Cannot split file folio to non-0 order"
was triggered. But my test cases only cover anonymous folios,
while mapping_large_folio_support() is only reasonable for page
cache folios.

In split_huge_page_to_list_to_order(), the folio passed to
mapping_large_folio_support() may be an anonymous folio. The
folio_test_anon() check is missing, so splitting an anonymous THP
fails. The same applies to shmem_mapping(), so add a folio_test_anon()
check before both. The shmem_mapping() call in __split_huge_page() is
not affected: for anonymous folios the end parameter is set to -1, so
(head[i].index >= end) is always false and shmem_mapping() is never called.

Verified with /sys/kernel/debug/split_huge_pages: with this patch,
large anonymous THPs are split successfully and the warning no longer fires.

Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
Cc: xu xin <xu.xin16@zte.com.cn>
Cc: Yang Yang <yang.yang29@zte.com.cn>
---
 mm/huge_memory.c | 38 ++++++++++++++++++++------------------
 1 file changed, 20 insertions(+), 18 deletions(-)

Comments

David Hildenbrand June 4, 2024, 7:57 a.m. UTC | #1
On 04.06.24 07:47, xu.xin16@zte.com.cn wrote:
> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> 
> When I did a large folios split test, a WARNING
> "[ 5059.122759][  T166] Cannot split file folio to non-0 order"
> was triggered. But my test cases are only for anonmous folios.
> while mapping_large_folio_support() is only reasonable for page
> cache folios.

Agreed.

I wonder if mapping_large_folio_support() should either

a) Complain if used for anon folios, so we can detect the wrong use more 
easily. (VM_WARN_ON_ONCE())

b) Return "true" for anonymous mappings, although that's more debatable.

> 
> In split_huge_page_to_list_to_order(), the folio passed to
> mapping_large_folio_support() maybe anonmous folio. The
> folio_test_anon() check is missing. So the split of the anonmous THP
> is failed. This is also the same for shmem_mapping(). We'd better add
> a check for both. But the shmem_mapping() in __split_huge_page() is
> not involved, as for anonmous folios, the end parameter is set to -1, so
> (head[i].index >= end) is always false. shmem_mapping() is not called.
> 
> Using /sys/kernel/debug/split_huge_pages to verify this, with this
> patch, large anon THP is successfully split and the warning is ceased.
> 
> Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> Cc: xu xin <xu.xin16@zte.com.cn>
> Cc: Yang Yang <yang.yang29@zte.com.cn>
> ---
>   mm/huge_memory.c | 38 ++++++++++++++++++++------------------
>   1 file changed, 20 insertions(+), 18 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 317de2afd371..4c9c7e5ea20c 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>   	if (new_order >= folio_order(folio))
>   		return -EINVAL;
> 
> -	/* Cannot split anonymous THP to order-1 */
> -	if (new_order == 1 && folio_test_anon(folio)) {
> -		VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> -		return -EINVAL;
> -	}
> -
>   	if (new_order) {
>   		/* Only swapping a whole PMD-mapped folio is supported */
>   		if (folio_test_swapcache(folio))
>   			return -EINVAL;
> -		/* Split shmem folio to non-zero order not supported */
> -		if (shmem_mapping(folio->mapping)) {
> -			VM_WARN_ONCE(1,
> -				"Cannot split shmem folio to non-0 order");
> -			return -EINVAL;
> -		}
> -		/* No split if the file system does not support large folio */
> -		if (!mapping_large_folio_support(folio->mapping)) {
> -			VM_WARN_ONCE(1,
> -				"Cannot split file folio to non-0 order");
> -			return -EINVAL;
> +
> +		if (folio_test_anon(folio)) {
> +			/* Cannot split anonymous THP to order-1 */
> +			if (new_order == 1) {
> +				VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> +				return -EINVAL;
> +			}
> +		} else {
> +			/* Split shmem folio to non-zero order not supported */
> +			if (shmem_mapping(folio->mapping)) {
> +				VM_WARN_ONCE(1,
> +					"Cannot split shmem folio to non-0 order");
> +				return -EINVAL;
> +			}
> +			/* No split if the file system does not support large folio */
> +			if (!mapping_large_folio_support(folio->mapping)) {
> +				VM_WARN_ONCE(1,
> +					"Cannot split file folio to non-0 order");
> +				return -EINVAL;
> +			}
>   		}
>   	}

What about the following sequence:

if (folio_test_anon(folio)) {
	if (new_order == 1)
		...
} else if (new_order) {
	if (shmem_mapping(...))
		...
	...
}

if (folio_test_swapcache(folio) && new_order)
	return -EINVAL;

Should result in less churn and reduce indentation level.
Zi Yan June 4, 2024, 1:52 p.m. UTC | #2
On 4 Jun 2024, at 0:57, David Hildenbrand wrote:

> On 04.06.24 07:47, xu.xin16@zte.com.cn wrote:
>> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>>
>> When I did a large folios split test, a WARNING
>> "[ 5059.122759][  T166] Cannot split file folio to non-0 order"
>> was triggered. But my test cases are only for anonmous folios.
>> while mapping_large_folio_support() is only reasonable for page
>> cache folios.
>
> Agreed.
>
> I wonder if mapping_large_folio_support() should either
>
> a) Complain if used for anon folios, so we can detect the wrong use more easily. (VM_WARN_ON_ONCE())

This is much better.

>
> b) Return "true" for anonymous mappings, although that's more debatable.

This might fix the warning here, but the function could easily be misused again.

>
>>
>> In split_huge_page_to_list_to_order(), the folio passed to
>> mapping_large_folio_support() maybe anonmous folio. The
>> folio_test_anon() check is missing. So the split of the anonmous THP
>> is failed. This is also the same for shmem_mapping(). We'd better add
>> a check for both. But the shmem_mapping() in __split_huge_page() is
>> not involved, as for anonmous folios, the end parameter is set to -1, so
>> (head[i].index >= end) is always false. shmem_mapping() is not called.
>>
>> Using /sys/kernel/debug/split_huge_pages to verify this, with this
>> patch, large anon THP is successfully split and the warning is ceased.
>>
>> Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>> Cc: xu xin <xu.xin16@zte.com.cn>
>> Cc: Yang Yang <yang.yang29@zte.com.cn>
>> ---
>>   mm/huge_memory.c | 38 ++++++++++++++++++++------------------
>>   1 file changed, 20 insertions(+), 18 deletions(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 317de2afd371..4c9c7e5ea20c 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>   	if (new_order >= folio_order(folio))
>>   		return -EINVAL;
>>
>> -	/* Cannot split anonymous THP to order-1 */
>> -	if (new_order == 1 && folio_test_anon(folio)) {
>> -		VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>> -		return -EINVAL;
>> -	}
>> -
>>   	if (new_order) {
>>   		/* Only swapping a whole PMD-mapped folio is supported */
>>   		if (folio_test_swapcache(folio))
>>   			return -EINVAL;
>> -		/* Split shmem folio to non-zero order not supported */
>> -		if (shmem_mapping(folio->mapping)) {
>> -			VM_WARN_ONCE(1,
>> -				"Cannot split shmem folio to non-0 order");
>> -			return -EINVAL;
>> -		}
>> -		/* No split if the file system does not support large folio */
>> -		if (!mapping_large_folio_support(folio->mapping)) {
>> -			VM_WARN_ONCE(1,
>> -				"Cannot split file folio to non-0 order");
>> -			return -EINVAL;
>> +
>> +		if (folio_test_anon(folio)) {
>> +			/* Cannot split anonymous THP to order-1 */
>> +			if (new_order == 1) {
>> +				VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>> +				return -EINVAL;
>> +			}
>> +		} else {
>> +			/* Split shmem folio to non-zero order not supported */
>> +			if (shmem_mapping(folio->mapping)) {
>> +				VM_WARN_ONCE(1,
>> +					"Cannot split shmem folio to non-0 order");
>> +				return -EINVAL;
>> +			}
>> +			/* No split if the file system does not support large folio */
>> +			if (!mapping_large_folio_support(folio->mapping)) {
>> +				VM_WARN_ONCE(1,
>> +					"Cannot split file folio to non-0 order");
>> +				return -EINVAL;
>> +			}
>>   		}
>>   	}
>
> What about the following sequence:
>
> if (folio_test_anon(folio)) {
> 	if (new_order == 1)
> 		...
> } else if (new_order) {
> 	if (shmem_mapping(...))
> 		...
> 	...
> }
>
> if (folio_test_swapcache(folio) && new_order)
> 	return -EINVAL;
>
> Should result in less churn and reduce indentation level.

Yeah, this looks better to me.

Best Regards,
Yan, Zi
Zi Yan June 4, 2024, 1:57 p.m. UTC | #3
+Luis and Pankaj, who are working on enabling bs > ps in XFS and whose work touches split_huge_page_to_list_to_order().


On 4 Jun 2024, at 6:52, Zi Yan wrote:

> On 4 Jun 2024, at 0:57, David Hildenbrand wrote:
>
>> On 04.06.24 07:47, xu.xin16@zte.com.cn wrote:
>>> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>>>
>>> When I did a large folios split test, a WARNING
>>> "[ 5059.122759][  T166] Cannot split file folio to non-0 order"
>>> was triggered. But my test cases are only for anonmous folios.
>>> while mapping_large_folio_support() is only reasonable for page
>>> cache folios.
>>
>> Agreed.
>>
>> I wonder if mapping_large_folio_support() should either
>>
>> a) Complain if used for anon folios, so we can detect the wrong use more easily. (VM_WARN_ON_ONCE())
>
> This is much better.
>
>>
>> b) Return "true" for anonymous mappings, although that's more debatable.
>
> This might fix the warning here, but the function might get wrong uses easily.
>
>>
>>>
>>> In split_huge_page_to_list_to_order(), the folio passed to
>>> mapping_large_folio_support() maybe anonmous folio. The
>>> folio_test_anon() check is missing. So the split of the anonmous THP
>>> is failed. This is also the same for shmem_mapping(). We'd better add
>>> a check for both. But the shmem_mapping() in __split_huge_page() is
>>> not involved, as for anonmous folios, the end parameter is set to -1, so
>>> (head[i].index >= end) is always false. shmem_mapping() is not called.
>>>
>>> Using /sys/kernel/debug/split_huge_pages to verify this, with this
>>> patch, large anon THP is successfully split and the warning is ceased.
>>>
>>> Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>>> Cc: xu xin <xu.xin16@zte.com.cn>
>>> Cc: Yang Yang <yang.yang29@zte.com.cn>
>>> ---
>>>   mm/huge_memory.c | 38 ++++++++++++++++++++------------------
>>>   1 file changed, 20 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index 317de2afd371..4c9c7e5ea20c 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>>   	if (new_order >= folio_order(folio))
>>>   		return -EINVAL;
>>>
>>> -	/* Cannot split anonymous THP to order-1 */
>>> -	if (new_order == 1 && folio_test_anon(folio)) {
>>> -		VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>>> -		return -EINVAL;
>>> -	}
>>> -
>>>   	if (new_order) {
>>>   		/* Only swapping a whole PMD-mapped folio is supported */
>>>   		if (folio_test_swapcache(folio))
>>>   			return -EINVAL;
>>> -		/* Split shmem folio to non-zero order not supported */
>>> -		if (shmem_mapping(folio->mapping)) {
>>> -			VM_WARN_ONCE(1,
>>> -				"Cannot split shmem folio to non-0 order");
>>> -			return -EINVAL;
>>> -		}
>>> -		/* No split if the file system does not support large folio */
>>> -		if (!mapping_large_folio_support(folio->mapping)) {
>>> -			VM_WARN_ONCE(1,
>>> -				"Cannot split file folio to non-0 order");
>>> -			return -EINVAL;
>>> +
>>> +		if (folio_test_anon(folio)) {
>>> +			/* Cannot split anonymous THP to order-1 */
>>> +			if (new_order == 1) {
>>> +				VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>>> +				return -EINVAL;
>>> +			}
>>> +		} else {
>>> +			/* Split shmem folio to non-zero order not supported */
>>> +			if (shmem_mapping(folio->mapping)) {
>>> +				VM_WARN_ONCE(1,
>>> +					"Cannot split shmem folio to non-0 order");
>>> +				return -EINVAL;
>>> +			}
>>> +			/* No split if the file system does not support large folio */
>>> +			if (!mapping_large_folio_support(folio->mapping)) {
>>> +				VM_WARN_ONCE(1,
>>> +					"Cannot split file folio to non-0 order");
>>> +				return -EINVAL;
>>> +			}
>>>   		}
>>>   	}
>>
>> What about the following sequence:
>>
>> if (folio_test_anon(folio)) {
>> 	if (new_order == 1)
>> 		...
>> } else if (new_order) {
>> 	if (shmem_mapping(...))
>> 		...
>> 	...
>> }
>>
>> if (folio_test_swapcache(folio) && new_order)
>> 	return -EINVAL;
>>
>> Should result in less churn and reduce indentation level.
>
> Yeah, this looks better to me.
>
> Best Regards,
> Yan, Zi


Best Regards,
Yan, Zi
Ran Xiaokai June 5, 2024, 2:20 a.m. UTC | #4
> On 04.06.24 07:47, xu.xin16@zte.com.cn wrote:
> > From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> > 
> > When I did a large folios split test, a WARNING
> > "[ 5059.122759][  T166] Cannot split file folio to non-0 order"
> > was triggered. But my test cases are only for anonmous folios.
> > while mapping_large_folio_support() is only reasonable for page
> > cache folios.
> 
> Agreed.
> 
> I wonder if mapping_large_folio_support() should either
> 
> a) Complain if used for anon folios, so we can detect the wrong use more 
> easily. (VM_WARN_ON_ONCE())

> b) Return "true" for anonymous mappings, although that's more debatable.
> 

Hi David,
Thanks for the review.
I think a) is better.
But then we would have to add a new "folio" parameter to mapping_large_folio_support(), right?
Something like mapping_large_folio_support(struct address_space *mapping, struct folio *folio)?
But in the __filemap_get_folio() path, 

__filemap_get_folio()
  no_page:
    ....
    if (!mapping_large_folio_support(mapping))

the folio has not been allocated yet, has it?
Or do you mean there is some other way to do this?

> > 
> > In split_huge_page_to_list_to_order(), the folio passed to
> > mapping_large_folio_support() maybe anonmous folio. The
> > folio_test_anon() check is missing. So the split of the anonmous THP
> > is failed. This is also the same for shmem_mapping(). We'd better add
> > a check for both. But the shmem_mapping() in __split_huge_page() is
> > not involved, as for anonmous folios, the end parameter is set to -1, so
> > (head[i].index >= end) is always false. shmem_mapping() is not called.
> > 
> > Using /sys/kernel/debug/split_huge_pages to verify this, with this
> > patch, large anon THP is successfully split and the warning is ceased.
> > 
> > Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> > Cc: xu xin <xu.xin16@zte.com.cn>
> > Cc: Yang Yang <yang.yang29@zte.com.cn>
> > ---
> >   mm/huge_memory.c | 38 ++++++++++++++++++++------------------
> >   1 file changed, 20 insertions(+), 18 deletions(-)
> > 
> > diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> > index 317de2afd371..4c9c7e5ea20c 100644
> > --- a/mm/huge_memory.c
> > +++ b/mm/huge_memory.c
> > @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >   	if (new_order >= folio_order(folio))
> >   		return -EINVAL;
> > 
> > -	/* Cannot split anonymous THP to order-1 */
> > -	if (new_order == 1 && folio_test_anon(folio)) {
> > -		VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> > -		return -EINVAL;
> > -	}
> > -
> >   	if (new_order) {
> >   		/* Only swapping a whole PMD-mapped folio is supported */
> >   		if (folio_test_swapcache(folio))
> >   			return -EINVAL;
> > -		/* Split shmem folio to non-zero order not supported */
> > -		if (shmem_mapping(folio->mapping)) {
> > -			VM_WARN_ONCE(1,
> > -				"Cannot split shmem folio to non-0 order");
> > -			return -EINVAL;
> > -		}
> > -		/* No split if the file system does not support large folio */
> > -		if (!mapping_large_folio_support(folio->mapping)) {
> > -			VM_WARN_ONCE(1,
> > -				"Cannot split file folio to non-0 order");
> > -			return -EINVAL;
> > +
> > +		if (folio_test_anon(folio)) {
> > +			/* Cannot split anonymous THP to order-1 */
> > +			if (new_order == 1) {
> > +				VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> > +				return -EINVAL;
> > +			}
> > +		} else {
> > +			/* Split shmem folio to non-zero order not supported */
> > +			if (shmem_mapping(folio->mapping)) {
> > +				VM_WARN_ONCE(1,
> > +					"Cannot split shmem folio to non-0 order");
> > +				return -EINVAL;
> > +			}
> > +			/* No split if the file system does not support large folio */
> > +			if (!mapping_large_folio_support(folio->mapping)) {
> > +				VM_WARN_ONCE(1,
> > +					"Cannot split file folio to non-0 order");
> > +				return -EINVAL;
> > +			}
> >   		}
> >   	}
> 
> What about the following sequence:
> 
> if (folio_test_anon(folio)) {
> 	if (new_order == 1)
> 		...
> } else if (new_order) {
> 	if (shmem_mapping(...))
> 		...
> 	...
> }
> 
> if (folio_test_swapcache(folio) && new_order)
> 	return -EINVAL;
> 
> Should result in less churn and reduce indentation level.
>

Thanks.
The code is cleaner in this way.
 
> -- 
> Cheers,
> 
> David / dhildenb
Ran Xiaokai June 5, 2024, 2:56 a.m. UTC | #5
> On 4 Jun 2024, at 0:57, David Hildenbrand wrote:
> 
> > On 04.06.24 07:47, xu.xin16@zte.com.cn wrote:
> >> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> >>
> >> When I did a large folios split test, a WARNING
> >> "[ 5059.122759][  T166] Cannot split file folio to non-0 order"
> >> was triggered. But my test cases are only for anonmous folios.
> >> while mapping_large_folio_support() is only reasonable for page
> >> cache folios.
> >
> > Agreed.
> >
> > I wonder if mapping_large_folio_support() should either
> >
> > a) Complain if used for anon folios, so we can detect the wrong use more easily. (VM_WARN_ON_ONCE())
> 
> This is much better.
> 
> >
> > b) Return "true" for anonymous mappings, although that's more debatable.
> 
> This might fix the warning here, but the function might get wrong uses easily.

Yes, maybe we should rename mapping_large_folio_support() if we choose b).
 
> >
> >>
> >> In split_huge_page_to_list_to_order(), the folio passed to
> >> mapping_large_folio_support() maybe anonmous folio. The
> >> folio_test_anon() check is missing. So the split of the anonmous THP
> >> is failed. This is also the same for shmem_mapping(). We'd better add
> >> a check for both. But the shmem_mapping() in __split_huge_page() is
> >> not involved, as for anonmous folios, the end parameter is set to -1, so
> >> (head[i].index >= end) is always false. shmem_mapping() is not called.
> >>
> >> Using /sys/kernel/debug/split_huge_pages to verify this, with this
> >> patch, large anon THP is successfully split and the warning is ceased.
> >>
> >> Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> >> Cc: xu xin <xu.xin16@zte.com.cn>
> >> Cc: Yang Yang <yang.yang29@zte.com.cn>
> >> ---
> >>   mm/huge_memory.c | 38 ++++++++++++++++++++------------------
> >>   1 file changed, 20 insertions(+), 18 deletions(-)
> >>
> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >> index 317de2afd371..4c9c7e5ea20c 100644
> >> --- a/mm/huge_memory.c
> >> +++ b/mm/huge_memory.c
> >> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >>   	if (new_order >= folio_order(folio))
> >>   		return -EINVAL;
> >>
> >> -	/* Cannot split anonymous THP to order-1 */
> >> -	if (new_order == 1 && folio_test_anon(folio)) {
> >> -		VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> >> -		return -EINVAL;
> >> -	}
> >> -
> >>   	if (new_order) {
> >>   		/* Only swapping a whole PMD-mapped folio is supported */
> >>   		if (folio_test_swapcache(folio))
> >>   			return -EINVAL;
> >> -		/* Split shmem folio to non-zero order not supported */
> >> -		if (shmem_mapping(folio->mapping)) {
> >> -			VM_WARN_ONCE(1,
> >> -				"Cannot split shmem folio to non-0 order");
> >> -			return -EINVAL;
> >> -		}
> >> -		/* No split if the file system does not support large folio */
> >> -		if (!mapping_large_folio_support(folio->mapping)) {
> >> -			VM_WARN_ONCE(1,
> >> -				"Cannot split file folio to non-0 order");
> >> -			return -EINVAL;
> >> +
> >> +		if (folio_test_anon(folio)) {
> >> +			/* Cannot split anonymous THP to order-1 */
> >> +			if (new_order == 1) {
> >> +				VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> >> +				return -EINVAL;
> >> +			}
> >> +		} else {
> >> +			/* Split shmem folio to non-zero order not supported */
> >> +			if (shmem_mapping(folio->mapping)) {
> >> +				VM_WARN_ONCE(1,
> >> +					"Cannot split shmem folio to non-0 order");
> >> +				return -EINVAL;
> >> +			}
> >> +			/* No split if the file system does not support large folio */
> >> +			if (!mapping_large_folio_support(folio->mapping)) {
> >> +				VM_WARN_ONCE(1,
> >> +					"Cannot split file folio to non-0 order");
> >> +				return -EINVAL;
> >> +			}
> >>   		}
> >>   	}
> >
> > What about the following sequence:
> >
> > if (folio_test_anon(folio)) {
> > 	if (new_order == 1)
> > 		...
> > } else if (new_order) {
> > 	if (shmem_mapping(...))
> > 		...
> > 	...
> > }
> >
> > if (folio_test_swapcache(folio) && new_order)
> > 	return -EINVAL;
> >
> > Should result in less churn and reduce indentation level.
> 
> Yeah, this looks better to me.
> 
> Best Regards,
> Yan, Zi
David Hildenbrand June 5, 2024, 7:25 a.m. UTC | #6
On 05.06.24 04:20, ran xiaokai wrote:
>> On 04.06.24 07:47, xu.xin16@zte.com.cn wrote:
>>> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>>>
>>> When I did a large folios split test, a WARNING
>>> "[ 5059.122759][  T166] Cannot split file folio to non-0 order"
>>> was triggered. But my test cases are only for anonmous folios.
>>> while mapping_large_folio_support() is only reasonable for page
>>> cache folios.
>>
>> Agreed.
>>
>> I wonder if mapping_large_folio_support() should either
>>
>> a) Complain if used for anon folios, so we can detect the wrong use more
>> easily. (VM_WARN_ON_ONCE())
> 
>> b) Return "true" for anonymous mappings, although that's more debatable.
>>
> 
> Hi, David,
> Thanks for the review.
> I think a) is better.
> But we have to add a new parameter "folio" to mapping_large_folio_support(), right ?
> something like mapping_large_folio_support(struct address_space *mapping, struct folio *folio) ?
> But in the __filemap_get_folio() path,
> 
> __filemap_get_folio()
>    no_page:
>      ....
>      if (!mapping_large_folio_support(mapping))
> 
> the folio is not allocated yet, yes ?
> Or do you mean there is some other way to do this ?

If we really pass unmodified folio->mapping, you can do what 
folio_test_anon() would and make sure PAGE_MAPPING_ANON is not set.
Ran Xiaokai June 5, 2024, 8:30 a.m. UTC | #7
> On 05.06.24 04:20, ran xiaokai wrote:
> >> On 04.06.24 07:47, xu.xin16@zte.com.cn wrote:
> >>> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> >>>
> >>> When I did a large folios split test, a WARNING
> >>> "[ 5059.122759][  T166] Cannot split file folio to non-0 order"
> >>> was triggered. But my test cases are only for anonmous folios.
> >>> while mapping_large_folio_support() is only reasonable for page
> >>> cache folios.
> >>
> >> Agreed.
> >>
> >> I wonder if mapping_large_folio_support() should either
> >>
> >> a) Complain if used for anon folios, so we can detect the wrong use more
> >> easily. (VM_WARN_ON_ONCE())
> > 
> >> b) Return "true" for anonymous mappings, although that's more debatable.
> >>
> > 
> > Hi, David,
> > Thanks for the review.
> > I think a) is better.
> > But we have to add a new parameter "folio" to mapping_large_folio_support(), right ?
> > something like mapping_large_folio_support(struct address_space *mapping, struct folio *folio) ?
> > But in the __filemap_get_folio() path,
> > 
> > __filemap_get_folio()
> >    no_page:
> >      ....
> >      if (!mapping_large_folio_support(mapping))
> > 
> > the folio is not allocated yet, yes ?
> > Or do you mean there is some other way to do this ?
> 
> If we really pass unmodified folio->mapping, you can do what 
> folio_test_anon() would and make sure PAGE_MAPPING_ANON is not set.

I think I just misunderstood your suggestion.
How about this?

static inline bool mapping_large_folio_support(struct address_space *mapping)
{
	VM_WARN_ONCE((unsigned long)mapping & PAGE_MAPPING_ANON, 
			"Anonymous mapping always supports large folio");

	return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
		test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
}
David Hildenbrand June 5, 2024, 8:33 a.m. UTC | #8
On 05.06.24 10:30, ran xiaokai wrote:
>> On 05.06.24 04:20, ran xiaokai wrote:
>>>> On 04.06.24 07:47, xu.xin16@zte.com.cn wrote:
>>>>> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>>>>>
>>>>> When I did a large folios split test, a WARNING
>>>>> "[ 5059.122759][  T166] Cannot split file folio to non-0 order"
>>>>> was triggered. But my test cases are only for anonmous folios.
>>>>> while mapping_large_folio_support() is only reasonable for page
>>>>> cache folios.
>>>>
>>>> Agreed.
>>>>
>>>> I wonder if mapping_large_folio_support() should either
>>>>
>>>> a) Complain if used for anon folios, so we can detect the wrong use more
>>>> easily. (VM_WARN_ON_ONCE())
>>>
>>>> b) Return "true" for anonymous mappings, although that's more debatable.
>>>>
>>>
>>> Hi, David,
>>> Thanks for the review.
>>> I think a) is better.
>>> But we have to add a new parameter "folio" to mapping_large_folio_support(), right ?
>>> something like mapping_large_folio_support(struct address_space *mapping, struct folio *folio) ?
>>> But in the __filemap_get_folio() path,
>>>
>>> __filemap_get_folio()
>>>     no_page:
>>>       ....
>>>       if (!mapping_large_folio_support(mapping))
>>>
>>> the folio is not allocated yet, yes ?
>>> Or do you mean there is some other way to do this ?
>>
>> If we really pass unmodified folio->mapping, you can do what
>> folio_test_anon() would and make sure PAGE_MAPPING_ANON is not set.
> 
> I think I just misunderstood your suggestion.

Likely my use of "folio" was confusing.

> How about this ?
> 
> static inline bool mapping_large_folio_support(struct address_space *mapping)
> {
> 	VM_WARN_ONCE((unsigned long)mapping & PAGE_MAPPING_ANON,
> 			"Anonymous mapping always supports large folio");
> 
> 	return IS_ENABLED(CONFIG_TRANSPARENT_HUGEPAGE) &&
> 		test_bit(AS_LARGE_FOLIO_SUPPORT, &mapping->flags);
> }

Yes, and we should likely document that this is not supposed to be used 
with mappings from anonymous folios.
Barry Song June 5, 2024, 9:06 a.m. UTC | #9
On Tue, Jun 4, 2024 at 5:47 PM <xu.xin16@zte.com.cn> wrote:
>
> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>
> When I did a large folios split test, a WARNING
> "[ 5059.122759][  T166] Cannot split file folio to non-0 order"
> was triggered. But my test cases are only for anonmous folios.
> while mapping_large_folio_support() is only reasonable for page
> cache folios.
>
> In split_huge_page_to_list_to_order(), the folio passed to
> mapping_large_folio_support() maybe anonmous folio. The
> folio_test_anon() check is missing. So the split of the anonmous THP
> is failed. This is also the same for shmem_mapping(). We'd better add
> a check for both. But the shmem_mapping() in __split_huge_page() is
> not involved, as for anonmous folios, the end parameter is set to -1, so
> (head[i].index >= end) is always false. shmem_mapping() is not called.
>
> Using /sys/kernel/debug/split_huge_pages to verify this, with this
> patch, large anon THP is successfully split and the warning is ceased.
>
> Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> Cc: xu xin <xu.xin16@zte.com.cn>
> Cc: Yang Yang <yang.yang29@zte.com.cn>
> ---
>  mm/huge_memory.c | 38 ++++++++++++++++++++------------------
>  1 file changed, 20 insertions(+), 18 deletions(-)
>
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 317de2afd371..4c9c7e5ea20c 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>         if (new_order >= folio_order(folio))
>                 return -EINVAL;
>
> -       /* Cannot split anonymous THP to order-1 */
> -       if (new_order == 1 && folio_test_anon(folio)) {
> -               VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> -               return -EINVAL;
> -       }
> -
>         if (new_order) {
>                 /* Only swapping a whole PMD-mapped folio is supported */
>                 if (folio_test_swapcache(folio))
>                         return -EINVAL;
> -               /* Split shmem folio to non-zero order not supported */
> -               if (shmem_mapping(folio->mapping)) {
> -                       VM_WARN_ONCE(1,
> -                               "Cannot split shmem folio to non-0 order");
> -                       return -EINVAL;
> -               }
> -               /* No split if the file system does not support large folio */
> -               if (!mapping_large_folio_support(folio->mapping)) {
> -                       VM_WARN_ONCE(1,
> -                               "Cannot split file folio to non-0 order");
> -                       return -EINVAL;
> +
> +               if (folio_test_anon(folio)) {
> +                       /* Cannot split anonymous THP to order-1 */
> +                       if (new_order == 1) {
> +                               VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> +                               return -EINVAL;
> +                       }
> +               } else {
> +                       /* Split shmem folio to non-zero order not supported */
> +                       if (shmem_mapping(folio->mapping)) {
> +                               VM_WARN_ONCE(1,
> +                                       "Cannot split shmem folio to non-0 order");
> +                               return -EINVAL;
> +                       }
> +                       /* No split if the file system does not support large folio */
> +                       if (!mapping_large_folio_support(folio->mapping)) {
> +                               VM_WARN_ONCE(1,
> +                                       "Cannot split file folio to non-0 order");
> +                               return -EINVAL;
> +                       }

Am I missing something? If the file system doesn't support large
folios, how could a large folio come to exist in the first place while its
mapping points to a file on that file system?

>                 }
>         }
>
> -
>         is_hzp = is_huge_zero_folio(folio);
>         if (is_hzp) {
>                 pr_warn_ratelimited("Called split_huge_page for huge zero page\n");
> --
> 2.15.2
>

Thanks
Barry
Zi Yan June 5, 2024, 2:08 p.m. UTC | #10
On 5 Jun 2024, at 2:54, ran xiaokai wrote:

>> On Tue, Jun 4, 2024 at 5:47 PM <xu.xin16@zte.com.cn> wrote:
>>>
>>> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>>>
>>> When I did a large folio split test, a WARNING
>>> "[ 5059.122759][  T166] Cannot split file folio to non-0 order"
>>> was triggered, but my test cases cover only anonymous folios,
>>> while mapping_large_folio_support() is only meaningful for page
>>> cache folios.
>>>
>>> In split_huge_page_to_list_to_order(), the folio passed to
>>> mapping_large_folio_support() may be an anonymous folio. The
>>> folio_test_anon() check is missing, so splitting an anonymous THP
>>> fails. The same applies to shmem_mapping(), so we had better add
>>> a check for both. The shmem_mapping() call in __split_huge_page()
>>> is not affected: for anonymous folios the end parameter is set to
>>> -1, so (head[i].index >= end) is always false and shmem_mapping()
>>> is never called.
>>>
>>> Verified via /sys/kernel/debug/split_huge_pages: with this patch,
>>> a large anon THP is successfully split and the warning is gone.
>>>
>>> Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
>>> Cc: xu xin <xu.xin16@zte.com.cn>
>>> Cc: Yang Yang <yang.yang29@zte.com.cn>
>>> ---
>>>  mm/huge_memory.c | 38 ++++++++++++++++++++------------------
>>>  1 file changed, 20 insertions(+), 18 deletions(-)
>>>
>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>> index 317de2afd371..4c9c7e5ea20c 100644
>>> --- a/mm/huge_memory.c
>>> +++ b/mm/huge_memory.c
>>> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>>         if (new_order >= folio_order(folio))
>>>                 return -EINVAL;
>>>
>>> -       /* Cannot split anonymous THP to order-1 */
>>> -       if (new_order == 1 && folio_test_anon(folio)) {
>>> -               VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>>> -               return -EINVAL;
>>> -       }
>>> -
>>>         if (new_order) {
>>>                 /* Only swapping a whole PMD-mapped folio is supported */
>>>                 if (folio_test_swapcache(folio))
>>>                         return -EINVAL;
>>> -               /* Split shmem folio to non-zero order not supported */
>>> -               if (shmem_mapping(folio->mapping)) {
>>> -                       VM_WARN_ONCE(1,
>>> -                               "Cannot split shmem folio to non-0 order");
>>> -                       return -EINVAL;
>>> -               }
>>> -               /* No split if the file system does not support large folio */
>>> -               if (!mapping_large_folio_support(folio->mapping)) {
>>> -                       VM_WARN_ONCE(1,
>>> -                               "Cannot split file folio to non-0 order");
>>> -                       return -EINVAL;
>>> +
>>> +               if (folio_test_anon(folio)) {
>>> +                       /* Cannot split anonymous THP to order-1 */
>>> +                       if (new_order == 1) {
>>> +                               VM_WARN_ONCE(1, "Cannot split to order-1 folio");
>>> +                               return -EINVAL;
>>> +                       }
>>> +               } else {
>>> +                       /* Split shmem folio to non-zero order not supported */
>>> +                       if (shmem_mapping(folio->mapping)) {
>>> +                               VM_WARN_ONCE(1,
>>> +                                       "Cannot split shmem folio to non-0 order");
>>> +                               return -EINVAL;
>>> +                       }
>>> +                       /* No split if the file system does not support large folio */
>>> +                       if (!mapping_large_folio_support(folio->mapping)) {
>>> +                               VM_WARN_ONCE(1,
>>> +                                       "Cannot split file folio to non-0 order");
>>> +                               return -EINVAL;
>>> +                       }
>>
>> Am I missing something? If the file system doesn't support large
>> folios, how could a large folio come to exist in the first place
>> while its mapping points to a file on such a file system?
>
> I think it is the CONFIG_READ_ONLY_THP_FOR_FS case.
> khugepaged will try to collapse read-only file-backed pages to 2M THP.

Can you add this information to the commit log in your next version?

Best Regards,
Yan, Zi
Barry Song June 6, 2024, 1:34 a.m. UTC | #11
On Thu, Jun 6, 2024 at 2:42 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 05.06.24 16:08, Zi Yan wrote:
> > On 5 Jun 2024, at 2:54, ran xiaokai wrote:
> >
> >>> On Tue, Jun 4, 2024 at 5:47 PM <xu.xin16@zte.com.cn> wrote:
> >>>>
> >>>> From: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> >>>>
> >>>> When I did a large folio split test, a WARNING
> >>>> "[ 5059.122759][  T166] Cannot split file folio to non-0 order"
> >>>> was triggered, but my test cases cover only anonymous folios,
> >>>> while mapping_large_folio_support() is only meaningful for page
> >>>> cache folios.
> >>>>
> >>>> In split_huge_page_to_list_to_order(), the folio passed to
> >>>> mapping_large_folio_support() may be an anonymous folio. The
> >>>> folio_test_anon() check is missing, so splitting an anonymous THP
> >>>> fails. The same applies to shmem_mapping(), so we had better add
> >>>> a check for both. The shmem_mapping() call in __split_huge_page()
> >>>> is not affected: for anonymous folios the end parameter is set to
> >>>> -1, so (head[i].index >= end) is always false and shmem_mapping()
> >>>> is never called.
> >>>>
> >>>> Verified via /sys/kernel/debug/split_huge_pages: with this patch,
> >>>> a large anon THP is successfully split and the warning is gone.
> >>>>
> >>>> Signed-off-by: Ran Xiaokai <ran.xiaokai@zte.com.cn>
> >>>> Cc: xu xin <xu.xin16@zte.com.cn>
> >>>> Cc: Yang Yang <yang.yang29@zte.com.cn>
> >>>> ---
> >>>>   mm/huge_memory.c | 38 ++++++++++++++++++++------------------
> >>>>   1 file changed, 20 insertions(+), 18 deletions(-)
> >>>>
> >>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >>>> index 317de2afd371..4c9c7e5ea20c 100644
> >>>> --- a/mm/huge_memory.c
> >>>> +++ b/mm/huge_memory.c
> >>>> @@ -3009,31 +3009,33 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >>>>          if (new_order >= folio_order(folio))
> >>>>                  return -EINVAL;
> >>>>
> >>>> -       /* Cannot split anonymous THP to order-1 */
> >>>> -       if (new_order == 1 && folio_test_anon(folio)) {
> >>>> -               VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> >>>> -               return -EINVAL;
> >>>> -       }
> >>>> -
> >>>>          if (new_order) {
> >>>>                  /* Only swapping a whole PMD-mapped folio is supported */
> >>>>                  if (folio_test_swapcache(folio))
> >>>>                          return -EINVAL;
> >>>> -               /* Split shmem folio to non-zero order not supported */
> >>>> -               if (shmem_mapping(folio->mapping)) {
> >>>> -                       VM_WARN_ONCE(1,
> >>>> -                               "Cannot split shmem folio to non-0 order");
> >>>> -                       return -EINVAL;
> >>>> -               }
> >>>> -               /* No split if the file system does not support large folio */
> >>>> -               if (!mapping_large_folio_support(folio->mapping)) {
> >>>> -                       VM_WARN_ONCE(1,
> >>>> -                               "Cannot split file folio to non-0 order");
> >>>> -                       return -EINVAL;
> >>>> +
> >>>> +               if (folio_test_anon(folio)) {
> >>>> +                       /* Cannot split anonymous THP to order-1 */
> >>>> +                       if (new_order == 1) {
> >>>> +                               VM_WARN_ONCE(1, "Cannot split to order-1 folio");
> >>>> +                               return -EINVAL;
> >>>> +                       }
> >>>> +               } else {
> >>>> +                       /* Split shmem folio to non-zero order not supported */
> >>>> +                       if (shmem_mapping(folio->mapping)) {
> >>>> +                               VM_WARN_ONCE(1,
> >>>> +                                       "Cannot split shmem folio to non-0 order");
> >>>> +                               return -EINVAL;
> >>>> +                       }
> >>>> +                       /* No split if the file system does not support large folio */
> >>>> +                       if (!mapping_large_folio_support(folio->mapping)) {
> >>>> +                               VM_WARN_ONCE(1,
> >>>> +                                       "Cannot split file folio to non-0 order");
> >>>> +                               return -EINVAL;
> >>>> +                       }
> >>>
> >>> Am I missing something? If the file system doesn't support large
> >>> folios, how could a large folio come to exist in the first place
> >>> while its mapping points to a file on such a file system?
> >>
> >> I think it is the CONFIG_READ_ONLY_THP_FOR_FS case.
> >> khugepaged will try to collapse read-only file-backed pages to 2M THP.
> >
> > Can you add this information to the commit log in your next version?
>
> Can we also add that as a comment to that function, like "Note that
> we might still have THPs in such mappings due to
> CONFIG_READ_ONLY_THP_FOR_FS. But in that case, the mapping does not
> actually support large folios properly." Or sth. like that.

+1. Otherwise, the code appears quite confusing.
Would using "#ifdef" help to further clarify it?

#ifdef CONFIG_READ_ONLY_THP_FOR_FS
            if (!mapping_large_folio_support(folio->mapping)) {
                          ....
            }
#endif

>
> --
> Cheers,
>
> David / dhildenb
>

Thanks
Barry
diff mbox series

Patch

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 317de2afd371..4c9c7e5ea20c 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -3009,31 +3009,33 @@  int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
 	if (new_order >= folio_order(folio))
 		return -EINVAL;

-	/* Cannot split anonymous THP to order-1 */
-	if (new_order == 1 && folio_test_anon(folio)) {
-		VM_WARN_ONCE(1, "Cannot split to order-1 folio");
-		return -EINVAL;
-	}
-
 	if (new_order) {
 		/* Only swapping a whole PMD-mapped folio is supported */
 		if (folio_test_swapcache(folio))
 			return -EINVAL;
-		/* Split shmem folio to non-zero order not supported */
-		if (shmem_mapping(folio->mapping)) {
-			VM_WARN_ONCE(1,
-				"Cannot split shmem folio to non-0 order");
-			return -EINVAL;
-		}
-		/* No split if the file system does not support large folio */
-		if (!mapping_large_folio_support(folio->mapping)) {
-			VM_WARN_ONCE(1,
-				"Cannot split file folio to non-0 order");
-			return -EINVAL;
+
+		if (folio_test_anon(folio)) {
+			/* Cannot split anonymous THP to order-1 */
+			if (new_order == 1) {
+				VM_WARN_ONCE(1, "Cannot split to order-1 folio");
+				return -EINVAL;
+			}
+		} else {
+			/* Split shmem folio to non-zero order not supported */
+			if (shmem_mapping(folio->mapping)) {
+				VM_WARN_ONCE(1,
+					"Cannot split shmem folio to non-0 order");
+				return -EINVAL;
+			}
+			/* No split if the file system does not support large folio */
+			if (!mapping_large_folio_support(folio->mapping)) {
+				VM_WARN_ONCE(1,
+					"Cannot split file folio to non-0 order");
+				return -EINVAL;
+			}
 		}
 	}

-
 	is_hzp = is_huge_zero_folio(folio);
 	if (is_hzp) {
 		pr_warn_ratelimited("Called split_huge_page for huge zero page\n");