[1/3] arm64: hugetlb: Fix huge_pte_offset to return poisoned page table entries

Message ID 20170412140459.21824-2-punit.agrawal@arm.com (mailing list archive)
State New, archived

Commit Message

Punit Agrawal April 12, 2017, 2:04 p.m. UTC
When memory failure is enabled, a poisoned hugepage pte is marked as a
swap entry. huge_pte_offset() does not return the poisoned page table
entries when it encounters PUD/PMD hugepages.

This behaviour of huge_pte_offset() leads to errors such as the one
below when munmap is called on poisoned hugepages.

[  344.165544] mm/pgtable-generic.c:33: bad pmd 000000083af00074.

Fix huge_pte_offset() to return the poisoned pte which is then
appropriately handled by the generic layer code.

Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Steve Capper <steve.capper@arm.com>
Cc: David Woods <dwoods@mellanox.com>
---
 arch/arm64/mm/hugetlbpage.c | 20 +++++++++++++++-----
 1 file changed, 15 insertions(+), 5 deletions(-)
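
For context, once huge_pte_offset() hands back the poisoned entry, the generic
hugetlb code recognises it as a hwpoison swap entry instead of tripping over a
"bad pmd". The helper below is a paraphrased sketch of that check, relying on
the <linux/swapops.h> and hugetlb helpers; the real check lives in mm/hugetlb.c
as is_hugetlb_entry_hwpoisoned(), and the name and bool return type here are
illustrative only:

	/* Sketch: is this non-present hugetlb entry a hwpoison marker? */
	static bool hugetlb_entry_is_hwpoisoned(pte_t pte)
	{
		swp_entry_t swp;

		/* A none or present entry cannot be a poison marker. */
		if (huge_pte_none(pte) || pte_present(pte))
			return false;

		/* Non-present, non-none entries encode a swap entry. */
		swp = pte_to_swp_entry(pte);
		return non_swap_entry(swp) && is_hwpoison_entry(swp);
	}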

Comments

Tyler Baicar April 14, 2017, 7:29 p.m. UTC | #1
On 4/12/2017 8:04 AM, Punit Agrawal wrote:
> When memory failure is enabled, a poisoned hugepage pte is marked as a
> swap entry. huge_pte_offset() does not return the poisoned page table
> entries when it encounters PUD/PMD hugepages.
>
> This behaviour of huge_pte_offset() leads to error such as below when
> munmap is called on poisoned hugepages.
>
> [  344.165544] mm/pgtable-generic.c:33: bad pmd 000000083af00074.
>
> Fix huge_pte_offset() to return the poisoned pte which is then
> appropriately handled by the generic layer code.
>
> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Steve Capper <steve.capper@arm.com>
> Cc: David Woods <dwoods@mellanox.com>
Tested-by: Tyler Baicar <tbaicar@codeaurora.org>

Thanks,
Tyler
> ---
>   arch/arm64/mm/hugetlbpage.c | 20 +++++++++++++++-----
>   1 file changed, 15 insertions(+), 5 deletions(-)
>
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index 7514a000e361..5f1832165d69 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -143,15 +143,24 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
>   	pr_debug("%s: addr:0x%lx pgd:%p\n", __func__, addr, pgd);
>   	if (!pgd_present(*pgd))
>   		return NULL;
> -	pud = pud_offset(pgd, addr);
> -	if (!pud_present(*pud))
> -		return NULL;
>   
> -	if (pud_huge(*pud))
> +	pud = pud_offset(pgd, addr);
> +	/*
> +	 * In case of HW Poisoning, a hugepage pud/pmd can contain
> +	 * poisoned entries. Poisoned entries are marked as swap
> +	 * entries.
> +	 *
> +	 * For puds/pmds that are not present, check to see if it
> +	 * could be a swap entry (!present and !none).
> +	 */
> +	if ((!pte_present(pud_pte(*pud)) && !pud_none(*pud)) || pud_huge(*pud))
>   		return (pte_t *)pud;
> +
>   	pmd = pmd_offset(pud, addr);
> -	if (!pmd_present(*pmd))
> +	if (pmd_none(*pmd))
>   		return NULL;
> +	if (!pmd_present(*pmd) && !pmd_none(*pmd))
> +		return (pte_t *)pmd;
>   
>   	if (pte_cont(pmd_pte(*pmd))) {
>   		pmd = pmd_offset(
> @@ -160,6 +169,7 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
>   	}
>   	if (pmd_huge(*pmd))
>   		return (pte_t *)pmd;
> +
>   	pte = pte_offset_kernel(pmd, addr);
>   	if (pte_present(*pte) && pte_cont(*pte)) {
>   		pte = pte_offset_kernel(
Catalin Marinas May 3, 2017, 12:49 p.m. UTC | #2
On Wed, Apr 12, 2017 at 03:04:57PM +0100, Punit Agrawal wrote:
> When memory failure is enabled, a poisoned hugepage pte is marked as a
> swap entry. huge_pte_offset() does not return the poisoned page table
> entries when it encounters PUD/PMD hugepages.
> 
> This behaviour of huge_pte_offset() leads to error such as below when
> munmap is called on poisoned hugepages.
> 
> [  344.165544] mm/pgtable-generic.c:33: bad pmd 000000083af00074.
> 
> Fix huge_pte_offset() to return the poisoned pte which is then
> appropriately handled by the generic layer code.
> 
> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Steve Capper <steve.capper@arm.com>
> Cc: David Woods <dwoods@mellanox.com>
> ---
>  arch/arm64/mm/hugetlbpage.c | 20 +++++++++++++++-----
>  1 file changed, 15 insertions(+), 5 deletions(-)
> 
> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
> index 7514a000e361..5f1832165d69 100644
> --- a/arch/arm64/mm/hugetlbpage.c
> +++ b/arch/arm64/mm/hugetlbpage.c
> @@ -143,15 +143,24 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
>  	pr_debug("%s: addr:0x%lx pgd:%p\n", __func__, addr, pgd);
>  	if (!pgd_present(*pgd))
>  		return NULL;
> -	pud = pud_offset(pgd, addr);
> -	if (!pud_present(*pud))
> -		return NULL;
>  
> -	if (pud_huge(*pud))
> +	pud = pud_offset(pgd, addr);
> +	/*
> +	 * In case of HW Poisoning, a hugepage pud/pmd can contain
> +	 * poisoned entries. Poisoned entries are marked as swap
> +	 * entries.
> +	 *
> +	 * For puds/pmds that are not present, check to see if it
> +	 * could be a swap entry (!present and !none).
> +	 */
> +	if ((!pte_present(pud_pte(*pud)) && !pud_none(*pud)) || pud_huge(*pud))
>  		return (pte_t *)pud;

Since we use puds as huge pages, can we just change pud_present() to
match pmd_present()? I'd like to see similar checks for pud and pmd;
it would be easier to follow. Something like (unchecked):

	if (pud_none(*pud))
		return NULL;
	/* swap or huge page */
	if (!pud_present(*pud) || pud_huge(*pud))
		return (pte_t *)pud;
	/* table; check the next level */

> +
>  	pmd = pmd_offset(pud, addr);
> -	if (!pmd_present(*pmd))
> +	if (pmd_none(*pmd))
>  		return NULL;
> +	if (!pmd_present(*pmd) && !pmd_none(*pmd))
> +		return (pte_t *)pmd;

At this point, we already know that pmd_none(*pmd) is false, no need to
check it again.
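
For reference, an unchecked sketch of the symmetric pud/pmd flow described
above. It assumes pud_present() is adjusted to treat swap entries the same
way pmd_present() does, and it elides the contiguous-hugepage handling that
the real function keeps; see Punit's v2 posting for the actual follow-up:

	pud = pud_offset(pgd, addr);
	if (pud_none(*pud))
		return NULL;
	/* swap (e.g. hwpoison) entry or huge page at the pud level */
	if (!pud_present(*pud) || pud_huge(*pud))
		return (pte_t *)pud;

	pmd = pmd_offset(pud, addr);
	if (pmd_none(*pmd))
		return NULL;
	/* swap (e.g. hwpoison) entry or huge page at the pmd level */
	if (!pmd_present(*pmd) || pmd_huge(*pmd))
		return (pte_t *)pmd;

	/* otherwise a table entry; fall through to the pte level */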
Punit Agrawal May 4, 2017, 3:55 p.m. UTC | #3
Catalin Marinas <catalin.marinas@arm.com> writes:

> On Wed, Apr 12, 2017 at 03:04:57PM +0100, Punit Agrawal wrote:
>> When memory failure is enabled, a poisoned hugepage pte is marked as a
>> swap entry. huge_pte_offset() does not return the poisoned page table
>> entries when it encounters PUD/PMD hugepages.
>> 
>> This behaviour of huge_pte_offset() leads to error such as below when
>> munmap is called on poisoned hugepages.
>> 
>> [  344.165544] mm/pgtable-generic.c:33: bad pmd 000000083af00074.
>> 
>> Fix huge_pte_offset() to return the poisoned pte which is then
>> appropriately handled by the generic layer code.
>> 
>> Signed-off-by: Punit Agrawal <punit.agrawal@arm.com>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Steve Capper <steve.capper@arm.com>
>> Cc: David Woods <dwoods@mellanox.com>
>> ---
>>  arch/arm64/mm/hugetlbpage.c | 20 +++++++++++++++-----
>>  1 file changed, 15 insertions(+), 5 deletions(-)
>> 
>> diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
>> index 7514a000e361..5f1832165d69 100644
>> --- a/arch/arm64/mm/hugetlbpage.c
>> +++ b/arch/arm64/mm/hugetlbpage.c
>> @@ -143,15 +143,24 @@ pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
>>  	pr_debug("%s: addr:0x%lx pgd:%p\n", __func__, addr, pgd);
>>  	if (!pgd_present(*pgd))
>>  		return NULL;
>> -	pud = pud_offset(pgd, addr);
>> -	if (!pud_present(*pud))
>> -		return NULL;
>>  
>> -	if (pud_huge(*pud))
>> +	pud = pud_offset(pgd, addr);
>> +	/*
>> +	 * In case of HW Poisoning, a hugepage pud/pmd can contain
>> +	 * poisoned entries. Poisoned entries are marked as swap
>> +	 * entries.
>> +	 *
>> +	 * For puds/pmds that are not present, check to see if it
>> +	 * could be a swap entry (!present and !none).
>> +	 */
>> +	if ((!pte_present(pud_pte(*pud)) && !pud_none(*pud)) || pud_huge(*pud))
>>  		return (pte_t *)pud;
>
> Since we use puds as huge pages, can we just change pud_present() to
> match the pmd_present()? I'd like to see similar checks for pud and pmd,
> it would be easier to follow. Something like (unchecked):
>
> 	if (pud_none(*pud))
> 		return NULL;
> 	/* swap or huge page */
> 	if (!pud_present(*pud) || pud_huge(*pud))
> 		return (pte_t *)pud;
> 	/* table; check the next level */
>
>> +
>>  	pmd = pmd_offset(pud, addr);
>> -	if (!pmd_present(*pmd))
>> +	if (pmd_none(*pmd))
>>  		return NULL;
>> +	if (!pmd_present(*pmd) && !pmd_none(*pmd))
>> +		return (pte_t *)pmd;
>
> At this point, we already know that pmd_none(*pmd) is false, no ned to
> check it again.

Indeed - I was avoiding restructuring the function in a way that would
drop the contiguous hugepage support which follows this hunk.

I've made changes locally based on your suggestion and will post a
revised version after the merge window.

Thanks,
Punit
Timur Tabi May 17, 2017, 2:35 p.m. UTC | #4
On Thu, May 4, 2017 at 10:55 AM, Punit Agrawal <punit.agrawal@arm.com> wrote:
> Indeed - I was avoiding changing the function to drop contiguous
> hugepage support which follows this hunk.
>
> I've made changes locally based on your suggestion and will post a
> revised version after the merge window.

Punit, will you be able to post a new version that could be a
candidate for 4.13?
Punit Agrawal May 17, 2017, 3:27 p.m. UTC | #5
Timur Tabi <timur@codeaurora.org> writes:

> On Thu, May 4, 2017 at 10:55 AM, Punit Agrawal <punit.agrawal@arm.com> wrote:
>> Indeed - I was avoiding changing the function to drop contiguous
>> hugepage support which follows this hunk.
>>
>> I've made changes locally based on your suggestion and will post a
>> revised version after the merge window.
>
> Punit, will you be able to post a new version that could be a
> candidate for 4.13?

Hi Timur,

I've just posted v2 for this series. I was testing the patches on rc1
when I saw your mail come through.

Please shout out if you notice any problems with the new version.

Thanks,
Punit
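
For anyone who wants to reproduce the scenario from the commit message, a
minimal userspace sketch follows. It assumes CONFIG_MEMORY_FAILURE, a 2M
default hugepage size with a preallocated hugepage pool, and CAP_SYS_ADMIN
(required for MADV_HWPOISON); error handling is trimmed. Without the fix,
the final munmap() produces the "bad pmd" message quoted above.

	#define _GNU_SOURCE
	#include <stdio.h>
	#include <string.h>
	#include <sys/mman.h>

	#define HUGE_SZ (2UL * 1024 * 1024)	/* assumed default hugepage size */

	int main(void)
	{
		/* Map and touch one anonymous hugetlb page. */
		char *p = mmap(NULL, HUGE_SZ, PROT_READ | PROT_WRITE,
			       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
		if (p == MAP_FAILED) {
			perror("mmap");
			return 1;
		}
		memset(p, 0, HUGE_SZ);

		/* Mark the hugepage hwpoisoned (privileged operation). */
		if (madvise(p, HUGE_SZ, MADV_HWPOISON))
			perror("madvise");

		/* Without the fix, this unmap triggers the "bad pmd" warning. */
		munmap(p, HUGE_SZ);
		return 0;
	}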

Patch

diff --git a/arch/arm64/mm/hugetlbpage.c b/arch/arm64/mm/hugetlbpage.c
index 7514a000e361..5f1832165d69 100644
--- a/arch/arm64/mm/hugetlbpage.c
+++ b/arch/arm64/mm/hugetlbpage.c
@@ -143,15 +143,24 @@  pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
 	pr_debug("%s: addr:0x%lx pgd:%p\n", __func__, addr, pgd);
 	if (!pgd_present(*pgd))
 		return NULL;
-	pud = pud_offset(pgd, addr);
-	if (!pud_present(*pud))
-		return NULL;
 
-	if (pud_huge(*pud))
+	pud = pud_offset(pgd, addr);
+	/*
+	 * In case of HW Poisoning, a hugepage pud/pmd can contain
+	 * poisoned entries. Poisoned entries are marked as swap
+	 * entries.
+	 *
+	 * For puds/pmds that are not present, check to see if it
+	 * could be a swap entry (!present and !none).
+	 */
+	if ((!pte_present(pud_pte(*pud)) && !pud_none(*pud)) || pud_huge(*pud))
 		return (pte_t *)pud;
+
 	pmd = pmd_offset(pud, addr);
-	if (!pmd_present(*pmd))
+	if (pmd_none(*pmd))
 		return NULL;
+	if (!pmd_present(*pmd) && !pmd_none(*pmd))
+		return (pte_t *)pmd;
 
 	if (pte_cont(pmd_pte(*pmd))) {
 		pmd = pmd_offset(
@@ -160,6 +169,7 @@  pte_t *huge_pte_offset(struct mm_struct *mm, unsigned long addr)
 	}
 	if (pmd_huge(*pmd))
 		return (pte_t *)pmd;
+
 	pte = pte_offset_kernel(pmd, addr);
 	if (pte_present(*pte) && pte_cont(*pte)) {
 		pte = pte_offset_kernel(