diff mbox series

[1/4] mm/swapfile: fix wrong swap entry type for hwpoisoned swapcache page

Message ID 20230715031729.2420338-2-linmiaohe@huawei.com (mailing list archive)
State New
Headers show
Series A few fixup and cleanup patches for memory-failure | expand

Commit Message

Miaohe Lin July 15, 2023, 3:17 a.m. UTC
Hwpoisoned dirty swap cache page is kept in the swap cache and there's
simple interception code in do_swap_page() to catch it. But when trying
to swapoff, unuse_pte() will wrongly install a general sense of "future
accesses are invalid" swap entry for hwpoisoned swap cache page due to
unaware of such type of page. The user will receive SIGBUS signal without
expected BUS_MCEERR_AR payload.

Fixes: 6b970599e807 ("mm: hwpoison: support recovery from ksm_might_need_to_copy()")
Signed-off-by: Miaohe Lin <linmiaohe@huawei.com>
---
 mm/swapfile.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Matthew Wilcox July 15, 2023, 3:50 a.m. UTC | #1
On Sat, Jul 15, 2023 at 11:17:26AM +0800, Miaohe Lin wrote:
> Hwpoisoned dirty swap cache page is kept in the swap cache and there's
> simple interception code in do_swap_page() to catch it. But when trying
> to swapoff, unuse_pte() will wrongly install a general sense of "future
> accesses are invalid" swap entry for hwpoisoned swap cache page due to
> unaware of such type of page. The user will receive SIGBUS signal without
> expected BUS_MCEERR_AR payload.

Have you observed this, or do you just think it's true?

> +++ b/mm/swapfile.c
> @@ -1767,7 +1767,8 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
>  		swp_entry_t swp_entry;
>  
>  		dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
> -		if (hwposioned) {
> +		/* Hwpoisoned swapcache page is also !PageUptodate. */
> +		if (hwposioned || PageHWPoison(page)) {

This line makes no sense to me.  How do we get here with PageHWPoison()
being true and hwposioned being false?

>  			swp_entry = make_hwpoison_entry(swapcache);
>  			page = swapcache;
>  		} else {
> -- 
> 2.33.0
> 
>
Miaohe Lin July 17, 2023, 2:33 a.m. UTC | #2
On 2023/7/15 11:50, Matthew Wilcox wrote:
> On Sat, Jul 15, 2023 at 11:17:26AM +0800, Miaohe Lin wrote:
>> Hwpoisoned dirty swap cache page is kept in the swap cache and there's
>> simple interception code in do_swap_page() to catch it. But when trying
>> to swapoff, unuse_pte() will wrongly install a general sense of "future
>> accesses are invalid" swap entry for hwpoisoned swap cache page due to
>> unaware of such type of page. The user will receive SIGBUS signal without
>> expected BUS_MCEERR_AR payload.
> 
> Have you observed this, or do you just think it's true?
> 
>> +++ b/mm/swapfile.c
>> @@ -1767,7 +1767,8 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
>>  		swp_entry_t swp_entry;
>>  
>>  		dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
>> -		if (hwposioned) {
>> +		/* Hwpoisoned swapcache page is also !PageUptodate. */
>> +		if (hwposioned || PageHWPoison(page)) {
> 
> This line makes no sense to me.  How do we get here with PageHWPoison()
> being true and hwposioned being false?

hwposioned will be true iff ksm_might_need_to_copy returns -EHWPOISON.
And there's PageUptodate check in ksm_might_need_to_copy before we can return -EHWPOISON:

  ksm_might_need_to_copy
    if (!PageUptodate(page))
      return page;		/* let do_swap_page report the error */
    ^^^
    Will return here because hwpoisoned swapcache page is !PageUptodate(cleared via me_swapcache_dirty()).

Or am I miss something?

Thanks.
Matthew Wilcox July 17, 2023, 2:53 a.m. UTC | #3
On Mon, Jul 17, 2023 at 10:33:14AM +0800, Miaohe Lin wrote:
> On 2023/7/15 11:50, Matthew Wilcox wrote:
> > On Sat, Jul 15, 2023 at 11:17:26AM +0800, Miaohe Lin wrote:
> >> Hwpoisoned dirty swap cache page is kept in the swap cache and there's
> >> simple interception code in do_swap_page() to catch it. But when trying
> >> to swapoff, unuse_pte() will wrongly install a general sense of "future
> >> accesses are invalid" swap entry for hwpoisoned swap cache page due to
> >> unaware of such type of page. The user will receive SIGBUS signal without
> >> expected BUS_MCEERR_AR payload.
> > 
> > Have you observed this, or do you just think it's true?
> > 
> >> +++ b/mm/swapfile.c
> >> @@ -1767,7 +1767,8 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
> >>  		swp_entry_t swp_entry;
> >>  
> >>  		dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
> >> -		if (hwposioned) {
> >> +		/* Hwpoisoned swapcache page is also !PageUptodate. */
> >> +		if (hwposioned || PageHWPoison(page)) {
> > 
> > This line makes no sense to me.  How do we get here with PageHWPoison()
> > being true and hwposioned being false?
> 
> hwposioned will be true iff ksm_might_need_to_copy returns -EHWPOISON.
> And there's PageUptodate check in ksm_might_need_to_copy before we can return -EHWPOISON:
> 
>   ksm_might_need_to_copy
>     if (!PageUptodate(page))
>       return page;		/* let do_swap_page report the error */
>     ^^^
>     Will return here because hwpoisoned swapcache page is !PageUptodate(cleared via me_swapcache_dirty()).
> 
> Or am I miss something?

Ah!  So we don't even get to calling copy_mc_to_kernel().  That seems
like a bug in ksm_might_need_to_copy(), don't you think?  Maybe this
would be a better fix:

+	if (PageHWPoison(page))
+		return ERR_PTR(-EHWPOISON);
	if (!PageUptodate(page))
		return page;
Miaohe Lin July 17, 2023, 5:55 a.m. UTC | #4
On 2023/7/17 10:53, Matthew Wilcox wrote:
> On Mon, Jul 17, 2023 at 10:33:14AM +0800, Miaohe Lin wrote:
>> On 2023/7/15 11:50, Matthew Wilcox wrote:
>>> On Sat, Jul 15, 2023 at 11:17:26AM +0800, Miaohe Lin wrote:
>>>> Hwpoisoned dirty swap cache page is kept in the swap cache and there's
>>>> simple interception code in do_swap_page() to catch it. But when trying
>>>> to swapoff, unuse_pte() will wrongly install a general sense of "future
>>>> accesses are invalid" swap entry for hwpoisoned swap cache page due to
>>>> unaware of such type of page. The user will receive SIGBUS signal without
>>>> expected BUS_MCEERR_AR payload.
>>>
>>> Have you observed this, or do you just think it's true?
>>>
>>>> +++ b/mm/swapfile.c
>>>> @@ -1767,7 +1767,8 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
>>>>  		swp_entry_t swp_entry;
>>>>  
>>>>  		dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
>>>> -		if (hwposioned) {
>>>> +		/* Hwpoisoned swapcache page is also !PageUptodate. */
>>>> +		if (hwposioned || PageHWPoison(page)) {
>>>
>>> This line makes no sense to me.  How do we get here with PageHWPoison()
>>> being true and hwposioned being false?
>>
>> hwposioned will be true iff ksm_might_need_to_copy returns -EHWPOISON.
>> And there's PageUptodate check in ksm_might_need_to_copy before we can return -EHWPOISON:
>>
>>   ksm_might_need_to_copy
>>     if (!PageUptodate(page))
>>       return page;		/* let do_swap_page report the error */
>>     ^^^
>>     Will return here because hwpoisoned swapcache page is !PageUptodate(cleared via me_swapcache_dirty()).
>>
>> Or am I miss something?
> 
> Ah!  So we don't even get to calling copy_mc_to_kernel().  That seems
> like a bug in ksm_might_need_to_copy(), don't you think?  Maybe this

I'm sorry but could you please explain what the bug is?

> would be a better fix:
> 
> +	if (PageHWPoison(page))
> +		return ERR_PTR(-EHWPOISON);

Looks reasonable. We can further avoid accessing contents of hwpoisoned swapcache pages which are
PageHWPoison() && PageUptodate() at first place.

Thanks.

> 	if (!PageUptodate(page))
> 		return page;
> 
> .
>
Miaohe Lin July 23, 2023, 2:23 a.m. UTC | #5
On 2023/7/17 10:53, Matthew Wilcox wrote:
> On Mon, Jul 17, 2023 at 10:33:14AM +0800, Miaohe Lin wrote:
>> On 2023/7/15 11:50, Matthew Wilcox wrote:
>>> On Sat, Jul 15, 2023 at 11:17:26AM +0800, Miaohe Lin wrote:
>>>> Hwpoisoned dirty swap cache page is kept in the swap cache and there's
>>>> simple interception code in do_swap_page() to catch it. But when trying
>>>> to swapoff, unuse_pte() will wrongly install a general sense of "future
>>>> accesses are invalid" swap entry for hwpoisoned swap cache page due to
>>>> unaware of such type of page. The user will receive SIGBUS signal without
>>>> expected BUS_MCEERR_AR payload.
>>>
>>> Have you observed this, or do you just think it's true?
>>>
>>>> +++ b/mm/swapfile.c
>>>> @@ -1767,7 +1767,8 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
>>>>  		swp_entry_t swp_entry;
>>>>  
>>>>  		dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
>>>> -		if (hwposioned) {
>>>> +		/* Hwpoisoned swapcache page is also !PageUptodate. */
>>>> +		if (hwposioned || PageHWPoison(page)) {
>>>
>>> This line makes no sense to me.  How do we get here with PageHWPoison()
>>> being true and hwposioned being false?
>>
>> hwposioned will be true iff ksm_might_need_to_copy returns -EHWPOISON.
>> And there's PageUptodate check in ksm_might_need_to_copy before we can return -EHWPOISON:
>>
>>   ksm_might_need_to_copy
>>     if (!PageUptodate(page))
>>       return page;		/* let do_swap_page report the error */
>>     ^^^
>>     Will return here because hwpoisoned swapcache page is !PageUptodate(cleared via me_swapcache_dirty()).
>>
>> Or am I miss something?
> 
> Ah!  So we don't even get to calling copy_mc_to_kernel().  That seems
> like a bug in ksm_might_need_to_copy(), don't you think?  Maybe this
> would be a better fix:
> 
> +	if (PageHWPoison(page))
> +		return ERR_PTR(-EHWPOISON);

I'm preparing posting the v2. But I found this won't fix the issue if CONFIG_KSM is disabled. So the correct fix
should be something like below?

diff --git a/mm/ksm.c b/mm/ksm.c
index 97a9627116fa..74804158ee02 100644
--- a/mm/ksm.c
+++ b/mm/ksm.c
@@ -2794,6 +2794,8 @@ struct page *ksm_might_need_to_copy(struct page *page,
                        anon_vma->root == vma->anon_vma->root) {
                return page;            /* still no need to copy it */
        }
+       if (PageHWPoison(page))
+               return ERR_PTR(-EHWPOISON);
        if (!PageUptodate(page))
                return page;            /* let do_swap_page report the error */

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 346e22b8ae97..3dcc9d92689c 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1744,7 +1744,7 @@ static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
        struct page *swapcache;
        spinlock_t *ptl;
        pte_t *pte, new_pte, old_pte;
-       bool hwposioned = false;
+       bool hwposioned = PageHWPoison(page);
        int ret = 1;

        swapcache = page;


Thanks.
diff mbox series

Patch

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 346e22b8ae97..02f6808e65bf 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -1767,7 +1767,8 @@  static int unuse_pte(struct vm_area_struct *vma, pmd_t *pmd,
 		swp_entry_t swp_entry;
 
 		dec_mm_counter(vma->vm_mm, MM_SWAPENTS);
-		if (hwposioned) {
+		/* Hwpoisoned swapcache page is also !PageUptodate. */
+		if (hwposioned || PageHWPoison(page)) {
 			swp_entry = make_hwpoison_entry(swapcache);
 			page = swapcache;
 		} else {