mm/hmm: hmm_range_fault handle pages swapped out

Message ID 20190815205227.7949-1-Philip.Yang@amd.com (mailing list archive)
State New, archived

Commit Message

Philip Yang Aug. 15, 2019, 8:52 p.m. UTC
hmm_range_fault may return NULL pages because some of the pfns are equal
to HMM_PFN_NONE. This happens randomly under memory pressure. The reason
is that, on the swapped-out page pte path, hmm_vma_handle_pte doesn't
update the fault variable from cpu_flags, so it fails to call
hmm_vma_do_fault to swap the page in.

The fix is to call hmm_pte_need_fault to update the fault variable.

Change-Id: I2e8611485563d11d938881c18b7935fa1e7c91ee
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
---
 mm/hmm.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Jerome Glisse Aug. 15, 2019, 9:02 p.m. UTC | #1
On Thu, Aug 15, 2019 at 08:52:56PM +0000, Yang, Philip wrote:
> hmm_range_fault may return NULL pages because some of pfns are equal to
> HMM_PFN_NONE. This happens randomly under memory pressure. The reason is
> for swapped out page pte path, hmm_vma_handle_pte doesn't update fault
> variable from cpu_flags, so it failed to call hmm_vma_do_fault to swap
> the page in.
> 
> The fix is to call hmm_pte_need_fault to update fault variable.
> 
> Change-Id: I2e8611485563d11d938881c18b7935fa1e7c91ee
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>

Reviewed-by: Jérôme Glisse <jglisse@redhat.com>

> ---
>  mm/hmm.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/hmm.c b/mm/hmm.c
> index 9f22562e2c43..7ca4fb39d3d8 100644
> --- a/mm/hmm.c
> +++ b/mm/hmm.c
> @@ -544,6 +544,9 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
>  		swp_entry_t entry = pte_to_swp_entry(pte);
>  
>  		if (!non_swap_entry(entry)) {
> +			cpu_flags = pte_to_hmm_pfn_flags(range, pte);
> +			hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags,
> +					   &fault, &write_fault);
>  			if (fault || write_fault)
>  				goto fault;
>  			return 0;
> -- 
> 2.17.1
>
Jason Gunthorpe Aug. 16, 2019, 12:54 a.m. UTC | #2
On Thu, Aug 15, 2019 at 08:52:56PM +0000, Yang, Philip wrote:
> hmm_range_fault may return NULL pages because some of pfns are equal to
> HMM_PFN_NONE. This happens randomly under memory pressure. The reason is
> for swapped out page pte path, hmm_vma_handle_pte doesn't update fault
> variable from cpu_flags, so it failed to call hmm_vma_do_fault to swap
> the page in.
>
> The fix is to call hmm_pte_need_fault to update fault variable.

> Change-Id: I2e8611485563d11d938881c18b7935fa1e7c91ee

I'll fix it for you but please be careful not to send Change-id's to
the public lists.

Also what is the Fixes line for this?

> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>  mm/hmm.c | 3 +++
>  1 file changed, 3 insertions(+)

Ralph has also been looking at this area, so I'll give him a bit
to chime in; otherwise, with Jerome's review, this looks OK to go to
linux-next.

Thanks,
Jason
Philip Yang Aug. 16, 2019, 4:02 p.m. UTC | #3
On 2019-08-15 8:54 p.m., Jason Gunthorpe wrote:
> On Thu, Aug 15, 2019 at 08:52:56PM +0000, Yang, Philip wrote:
>> hmm_range_fault may return NULL pages because some of pfns are equal to
>> HMM_PFN_NONE. This happens randomly under memory pressure. The reason is
>> for swapped out page pte path, hmm_vma_handle_pte doesn't update fault
>> variable from cpu_flags, so it failed to call hmm_vma_do_fault to swap
>> the page in.
>>
>> The fix is to call hmm_pte_need_fault to update fault variable.
> 
>> Change-Id: I2e8611485563d11d938881c18b7935fa1e7c91ee
> 
> I'll fix it for you but please be careful not to send Change-id's to
> the public lists.
> 
Thanks, the Change-Id was added by our Gerrit hook. In future I need to
generate the patch files, remove the Change-Id line, and then send out
the modified patch files.

> Also what is the Fixes line for this?
> 
This fixes an issue found by the internal rocrtst. The
rocrtstFunc.Memory_Max_Mem test evicted some user buffers, and the
following test then failed to restore those user buffers because the
buffers were swapped out and the application doesn't touch the buffers
to swap them in.

>> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>>   mm/hmm.c | 3 +++
>>   1 file changed, 3 insertions(+)
> 
> Ralph has also been looking at this area, so I'll give him a bit
> to chime in; otherwise, with Jerome's review, this looks OK to go to
> linux-next.
>
OK, thanks for helping push this to the hmm branch at
https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git

> Thanks,
> Jason
>
Jason Gunthorpe Aug. 23, 2019, 1:39 p.m. UTC | #4
On Thu, Aug 15, 2019 at 08:52:56PM +0000, Yang, Philip wrote:
> hmm_range_fault may return NULL pages because some of pfns are equal to
> HMM_PFN_NONE. This happens randomly under memory pressure. The reason is
> for swapped out page pte path, hmm_vma_handle_pte doesn't update fault
> variable from cpu_flags, so it failed to call hmm_vma_do_fault to swap
> the page in.
> 
> The fix is to call hmm_pte_need_fault to update fault variable.
> 
> Change-Id: I2e8611485563d11d938881c18b7935fa1e7c91ee
> Signed-off-by: Philip Yang <Philip.Yang@amd.com>
>  mm/hmm.c | 3 +++
>  1 file changed, 3 insertions(+)

Applied to hmm.git, thanks

I fixed the commit message:

Author: Yang, Philip <Philip.Yang@amd.com>
Date:   Thu Aug 15 20:52:56 2019 +0000

    mm/hmm: fix hmm_range_fault()'s handling of swapped out pages
    
    hmm_range_fault() may return NULL pages because some of the pfns are equal
    to HMM_PFN_NONE. This happens randomly under memory pressure. The reason
    is during the swapped out page pte path, hmm_vma_handle_pte() doesn't
    update the fault variable from cpu_flags, so it failed to call
    hmm_vma_do_fault() to swap the page in.
    
    The fix is to call hmm_pte_need_fault() to update fault variable.
    
    Fixes: 74eee180b935 ("mm/hmm/mirror: device page fault handler")
    Link: https://lore.kernel.org/r/20190815205227.7949-1-Philip.Yang@amd.com
    Signed-off-by: Philip Yang <Philip.Yang@amd.com>
    Reviewed-by: "Jérôme Glisse" <jglisse@redhat.com>
    Signed-off-by: Jason Gunthorpe <jgg@mellanox.com>

Patch

diff --git a/mm/hmm.c b/mm/hmm.c
index 9f22562e2c43..7ca4fb39d3d8 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -544,6 +544,9 @@  static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 		swp_entry_t entry = pte_to_swp_entry(pte);
 
 		if (!non_swap_entry(entry)) {
+			cpu_flags = pte_to_hmm_pfn_flags(range, pte);
+			hmm_pte_need_fault(hmm_vma_walk, orig_pfn, cpu_flags,
+					   &fault, &write_fault);
 			if (fault || write_fault)
 				goto fault;
 			return 0;
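
For readers who want to see the control-flow problem in isolation, here is a
rough user-space sketch of the swap-entry branch. It is not the kernel code:
every toy_* name and TOY_* constant is a hypothetical stand-in, the flag
encoding is invented, and it assumes that the CPU flags computed for a swap
entry carry no valid bit. The point it illustrates is only that, without the
added call, the fault variable is never set on this path, so the function
returns without requesting a fault even though the caller asked for one.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits; the real HMM pfn encoding is different. */
#define TOY_PFN_VALID  (1u << 0)  /* page is present and readable */
#define TOY_REQ_FAULT  (1u << 1)  /* caller wants the page faulted in */

enum toy_pte_kind { TOY_PTE_PRESENT, TOY_PTE_SWAPPED };
enum toy_result   { TOY_RETURN_NONE, TOY_GOTO_FAULT, TOY_RETURN_VALID };

/*
 * Toy analogue of hmm_pte_need_fault(): request a fault when the caller
 * asked for the page and the CPU flags say it is not valid.
 */
static void toy_need_fault(uint32_t req_pfn, uint32_t cpu_flags, bool *fault)
{
	if ((req_pfn & TOY_REQ_FAULT) && !(cpu_flags & TOY_PFN_VALID))
		*fault = true;
}

/*
 * Toy analogue of the swap-entry branch in hmm_vma_handle_pte().
 * patched == false models the code before the fix, true models it after.
 */
static enum toy_result toy_handle_pte(enum toy_pte_kind kind,
				      uint32_t req_pfn, bool patched)
{
	bool fault = false;

	if (kind == TOY_PTE_SWAPPED) {
		if (patched) {
			/* Assumption: a swap entry yields no valid bit. */
			uint32_t cpu_flags = 0;

			toy_need_fault(req_pfn, cpu_flags, &fault);
		}
		if (fault)
			return TOY_GOTO_FAULT;
		/* Unpatched: always reached, the pfn stays "none". */
		return TOY_RETURN_NONE;
	}
	return TOY_RETURN_VALID;
}

/* Usage: the unpatched path reports "none"; the patched path faults. */
static void toy_demo(void)
{
	assert(toy_handle_pte(TOY_PTE_SWAPPED, TOY_REQ_FAULT, false)
	       == TOY_RETURN_NONE);
	assert(toy_handle_pte(TOY_PTE_SWAPPED, TOY_REQ_FAULT, true)
	       == TOY_GOTO_FAULT);
}
```

In this model the patched path still returns without faulting when the caller
did not request a fault, which matches the intent of keeping the existing
"if (fault || write_fault)" check after the new call.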