diff mbox series

mm: page_alloc: dump migrate-failed pages only at -EBUSY

Message ID 20210519213341.2620708-1-minchan@kernel.org (mailing list archive)
State New, archived

Commit Message

Minchan Kim May 19, 2021, 9:33 p.m. UTC
alloc_contig_dump_pages() is meant to help debug page migration
failures caused by a page refcount mismatch, or by some other
property of the page itself, as reported by the migration handler.
In the -ENOMEM case, however, the page descriptor offers no clue
about the failure, so dump the pages only when -EBUSY is returned.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/page_alloc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Minchan Kim May 20, 2021, 7:19 p.m. UTC | #1
On Wed, May 19, 2021 at 02:33:41PM -0700, Minchan Kim wrote:
> alloc_contig_dump_pages aims for helping debugging page migration
> failure by page refcount mismatch or something else of page itself
> from migration handler function. However, in -ENOMEM case, there is
> nothing to get clue from page descriptor information so just
> dump pages only when -EBUSY happens.
> 
> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  mm/page_alloc.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3100fcb08500..c0a2971dc755 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8760,7 +8760,8 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
>  
>  	lru_cache_enable();
>  	if (ret < 0) {
> -		alloc_contig_dump_pages(&cc->migratepages);
> +		if (ret == -EBUSY)
> +			alloc_contig_dump_pages(&cc->migratepages);
>  		putback_movable_pages(&cc->migratepages);
>  		return ret;
>  	}
> -- 
> 2.31.1.751.gd2f1c929bd-goog
> 

Resending with a slightly modified description.

From c5a2fea291cf46079b87cc9ac9a25fc7f819d0fd Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@kernel.org>
Date: Wed, 19 May 2021 14:22:18 -0700
Subject: [PATCH] mm: page_alloc: dump migrate-failed pages only at -EBUSY

alloc_contig_dump_pages() is meant to help debug page migration
failures caused by a page refcount that is elevated relative to
expected_count (for details, see migrate_page_move_mapping()).

However, -ENOMEM only indicates that the system is under memory
pressure and has nothing to do with the page refcount. Dumping the
page list in that case therefore does not help with debugging.

Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 mm/page_alloc.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3100fcb08500..c0a2971dc755 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8760,7 +8760,8 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 
 	lru_cache_enable();
 	if (ret < 0) {
-		alloc_contig_dump_pages(&cc->migratepages);
+		if (ret == -EBUSY)
+			alloc_contig_dump_pages(&cc->migratepages);
 		putback_movable_pages(&cc->migratepages);
 		return ret;
 	}
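
The gating introduced by the hunk above can be modelled in user space. This is only a minimal sketch, not kernel code: fake_dump_pages(), fake_migrate_error_path() and the dump_calls counter are hypothetical stand-ins for alloc_contig_dump_pages() and the error path of __alloc_contig_migrate_range(), and the putback step is reduced to a comment.

```c
#include <assert.h>
#include <errno.h>

/* Counts simulated dump calls; stands in for log buffer usage. */
static int dump_calls;

/* Hypothetical stand-in for alloc_contig_dump_pages(). */
static void fake_dump_pages(int nr_pages)
{
	assert(nr_pages >= 0); /* a list cannot hold a negative page count */
	dump_calls += 1;
}

/*
 * Hypothetical model of the patched error path: every negative result
 * still puts the pages back, but only -EBUSY triggers a dump.
 */
static int fake_migrate_error_path(int ret, int nr_pages)
{
	if (ret < 0) {
		if (ret == -EBUSY)
			fake_dump_pages(nr_pages);
		/* putback of the isolated pages would happen here */
		return ret;
	}
	return 0;
}
```

With this model, a call with -ENOMEM returns the error without touching dump_calls, while a call with -EBUSY increments it once, which is the behavioural difference the patch is after.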
David Hildenbrand May 20, 2021, 7:28 p.m. UTC | #2
Minchan Kim <minchan@kernel.org> wrote on Thu, May 20, 2021 at 21:20:

> On Wed, May 19, 2021 at 02:33:41PM -0700, Minchan Kim wrote:
> > alloc_contig_dump_pages aims for helping debugging page migration
> > failure by page refcount mismatch or something else of page itself
> > from migration handler function. However, in -ENOMEM case, there is
> > nothing to get clue from page descriptor information so just
> > dump pages only when -EBUSY happens.
> >
> > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > ---
> >  mm/page_alloc.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > index 3100fcb08500..c0a2971dc755 100644
> > --- a/mm/page_alloc.c
> > +++ b/mm/page_alloc.c
> > @@ -8760,7 +8760,8 @@ static int __alloc_contig_migrate_range(struct
> compact_control *cc,
> >
> >       lru_cache_enable();
> >       if (ret < 0) {
> > -             alloc_contig_dump_pages(&cc->migratepages);
> > +             if (ret == -EBUSY)
> > +                     alloc_contig_dump_pages(&cc->migratepages);
> >               putback_movable_pages(&cc->migratepages);
> >               return ret;
> >       }
> > --
> > 2.31.1.751.gd2f1c929bd-goog
> >
>
> Resend with a little modifying description.
>
> From c5a2fea291cf46079b87cc9ac9a25fc7f819d0fd Mon Sep 17 00:00:00 2001
> From: Minchan Kim <minchan@kernel.org>
> Date: Wed, 19 May 2021 14:22:18 -0700
> Subject: [PATCH] mm: page_alloc: dump migrate-failed pages only at -EBUSY
>
> alloc_contig_dump_pages aims for helping debugging page migration
> failure by elevated page refcount compared to expected_count.
> (for the detail, please look at migrate_page_move_mapping)
>
> However, -ENOMEM is just the case that system is under memory
> pressure state, not relevant with page refcount at all. Thus,
> the dumping page list is not helpful for the debugging point of view.
>

What about -ENOMEM when migrating empty/free huge pages? I think there is
value in having the pages dumped to identify something like that. And it
doesn't require heavy memory pressure to fail allocating a huge page.


> Signed-off-by: Minchan Kim <minchan@kernel.org>
> ---
>  mm/page_alloc.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 3100fcb08500..c0a2971dc755 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -8760,7 +8760,8 @@ static int __alloc_contig_migrate_range(struct
> compact_control *cc,
>
>         lru_cache_enable();
>         if (ret < 0) {
> -               alloc_contig_dump_pages(&cc->migratepages);
> +               if (ret == -EBUSY)
> +                       alloc_contig_dump_pages(&cc->migratepages);
>                 putback_movable_pages(&cc->migratepages);
>                 return ret;
>         }
> --
> 2.31.1.818.g46aad6cb9e-goog
>
> --
Thanks,

David / dhildenb
Minchan Kim May 20, 2021, 8:51 p.m. UTC | #3
On Thu, May 20, 2021 at 09:28:09PM +0200, David Hildenbrand wrote:
> Minchan Kim <minchan@kernel.org> wrote on Thu, May 20, 2021 at 21:20:
> 
> > On Wed, May 19, 2021 at 02:33:41PM -0700, Minchan Kim wrote:
> > > alloc_contig_dump_pages aims for helping debugging page migration
> > > failure by page refcount mismatch or something else of page itself
> > > from migration handler function. However, in -ENOMEM case, there is
> > > nothing to get clue from page descriptor information so just
> > > dump pages only when -EBUSY happens.
> > >
> > > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > > ---
> > >  mm/page_alloc.c | 3 ++-
> > >  1 file changed, 2 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > index 3100fcb08500..c0a2971dc755 100644
> > > --- a/mm/page_alloc.c
> > > +++ b/mm/page_alloc.c
> > > @@ -8760,7 +8760,8 @@ static int __alloc_contig_migrate_range(struct
> > compact_control *cc,
> > >
> > >       lru_cache_enable();
> > >       if (ret < 0) {
> > > -             alloc_contig_dump_pages(&cc->migratepages);
> > > +             if (ret == -EBUSY)
> > > +                     alloc_contig_dump_pages(&cc->migratepages);
> > >               putback_movable_pages(&cc->migratepages);
> > >               return ret;
> > >       }
> > > --
> > > 2.31.1.751.gd2f1c929bd-goog
> > >
> >
> > Resend with a little modifying description.
> >
> > From c5a2fea291cf46079b87cc9ac9a25fc7f819d0fd Mon Sep 17 00:00:00 2001
> > From: Minchan Kim <minchan@kernel.org>
> > Date: Wed, 19 May 2021 14:22:18 -0700
> > Subject: [PATCH] mm: page_alloc: dump migrate-failed pages only at -EBUSY
> >
> > alloc_contig_dump_pages aims for helping debugging page migration
> > failure by elevated page refcount compared to expected_count.
> > (for the detail, please look at migrate_page_move_mapping)
> >
> > However, -ENOMEM is just the case that system is under memory
> > pressure state, not relevant with page refcount at all. Thus,
> > the dumping page list is not helpful for the debugging point of view.
> >
> 
> what about -ENOMEM when migrating empty/free huge pages? I think there is
> value in having the pages dumped to identify something like that. And it
> doesn't require heavy memory pressure to fail allocating a huge page.
> 

-ENOMEM means there is no memory to allocate the destination page.
How would dumping the source pages help in that case, given what
dump_page reports?
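
The distinction argued here can be illustrated with a toy model. It is a rough sketch under stated assumptions: struct toy_page and toy_migrate_one() are hypothetical, the real struct page and migration path are far richer, and a refcount mismatch is collapsed directly into -EBUSY for brevity, since that is the result the thread discusses.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for a page descriptor (hypothetical, not struct page). */
struct toy_page {
	int refcount;       /* references currently held on the page */
	int expected_count; /* references the migration code can account for */
};

/*
 * Simplified model of one migration attempt. dest_available says whether
 * a destination page could be allocated.
 */
static int toy_migrate_one(const struct toy_page *src, int dest_available)
{
	assert(src != NULL);
	if (!dest_available)
		return -ENOMEM; /* allocator-side failure; src plays no part */
	if (src->refcount != src->expected_count)
		return -EBUSY;  /* extra reference pins src; its descriptor shows it */
	return 0;
}
```

In this model the -ENOMEM branch never reads the source page, so printing its descriptor cannot explain the failure, whereas the -EBUSY branch depends entirely on fields a dump would show.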
David Hildenbrand May 21, 2021, 8:08 a.m. UTC | #4
On 20.05.21 22:51, Minchan Kim wrote:
> On Thu, May 20, 2021 at 09:28:09PM +0200, David Hildenbrand wrote:
>> Minchan Kim <minchan@kernel.org> wrote on Thu, May 20, 2021 at 21:20:
>>
>>> On Wed, May 19, 2021 at 02:33:41PM -0700, Minchan Kim wrote:
>>>> alloc_contig_dump_pages aims for helping debugging page migration
>>>> failure by page refcount mismatch or something else of page itself
>>>> from migration handler function. However, in -ENOMEM case, there is
>>>> nothing to get clue from page descriptor information so just
>>>> dump pages only when -EBUSY happens.
>>>>
>>>> Signed-off-by: Minchan Kim <minchan@kernel.org>
>>>> ---
>>>>   mm/page_alloc.c | 3 ++-
>>>>   1 file changed, 2 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index 3100fcb08500..c0a2971dc755 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -8760,7 +8760,8 @@ static int __alloc_contig_migrate_range(struct
>>> compact_control *cc,
>>>>
>>>>        lru_cache_enable();
>>>>        if (ret < 0) {
>>>> -             alloc_contig_dump_pages(&cc->migratepages);
>>>> +             if (ret == -EBUSY)
>>>> +                     alloc_contig_dump_pages(&cc->migratepages);
>>>>                putback_movable_pages(&cc->migratepages);
>>>>                return ret;
>>>>        }
>>>> --
>>>> 2.31.1.751.gd2f1c929bd-goog
>>>>
>>>
>>> Resend with a little modifying description.
>>>
>>>  From c5a2fea291cf46079b87cc9ac9a25fc7f819d0fd Mon Sep 17 00:00:00 2001
>>> From: Minchan Kim <minchan@kernel.org>
>>> Date: Wed, 19 May 2021 14:22:18 -0700
>>> Subject: [PATCH] mm: page_alloc: dump migrate-failed pages only at -EBUSY
>>>
>>> alloc_contig_dump_pages aims for helping debugging page migration
>>> failure by elevated page refcount compared to expected_count.
>>> (for the detail, please look at migrate_page_move_mapping)
>>>
>>> However, -ENOMEM is just the case that system is under memory
>>> pressure state, not relevant with page refcount at all. Thus,
>>> the dumping page list is not helpful for the debugging point of view.
>>>
>>
>> what about -ENOMEM when migrating empty/free huge pages? I think there is
>> value in having the pages dumped to identify something like that. And it
>> doesn't require heavy memory pressure to fail allocating a huge page.
>>
> 
> -ENOMEM means there is no memory to allocate destination page.
> How could it help dumping source pages in those case from dump_page
> content point of view?

You would spot a huge page in the source list (usually at first 
position) without any obvious migration blockers I assume?

I'm wondering, did you actually run into this being suboptimal? If
dumping too much when running into -ENOMEM is a real problem, that's
fine with me. If it's a theoretical issue, I'd prefer to just keep it
simple as is.
Minchan Kim May 21, 2021, 5:39 p.m. UTC | #5
On Fri, May 21, 2021 at 10:08:15AM +0200, David Hildenbrand wrote:
> On 20.05.21 22:51, Minchan Kim wrote:
> > On Thu, May 20, 2021 at 09:28:09PM +0200, David Hildenbrand wrote:
> > > Minchan Kim <minchan@kernel.org> wrote on Thu, May 20, 2021 at 21:20:
> > > 
> > > > On Wed, May 19, 2021 at 02:33:41PM -0700, Minchan Kim wrote:
> > > > > alloc_contig_dump_pages aims for helping debugging page migration
> > > > > failure by page refcount mismatch or something else of page itself
> > > > > from migration handler function. However, in -ENOMEM case, there is
> > > > > nothing to get clue from page descriptor information so just
> > > > > dump pages only when -EBUSY happens.
> > > > > 
> > > > > Signed-off-by: Minchan Kim <minchan@kernel.org>
> > > > > ---
> > > > >   mm/page_alloc.c | 3 ++-
> > > > >   1 file changed, 2 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> > > > > index 3100fcb08500..c0a2971dc755 100644
> > > > > --- a/mm/page_alloc.c
> > > > > +++ b/mm/page_alloc.c
> > > > > @@ -8760,7 +8760,8 @@ static int __alloc_contig_migrate_range(struct
> > > > compact_control *cc,
> > > > > 
> > > > >        lru_cache_enable();
> > > > >        if (ret < 0) {
> > > > > -             alloc_contig_dump_pages(&cc->migratepages);
> > > > > +             if (ret == -EBUSY)
> > > > > +                     alloc_contig_dump_pages(&cc->migratepages);
> > > > >                putback_movable_pages(&cc->migratepages);
> > > > >                return ret;
> > > > >        }
> > > > > --
> > > > > 2.31.1.751.gd2f1c929bd-goog
> > > > > 
> > > > 
> > > > Resend with a little modifying description.
> > > > 
> > > >  From c5a2fea291cf46079b87cc9ac9a25fc7f819d0fd Mon Sep 17 00:00:00 2001
> > > > From: Minchan Kim <minchan@kernel.org>
> > > > Date: Wed, 19 May 2021 14:22:18 -0700
> > > > Subject: [PATCH] mm: page_alloc: dump migrate-failed pages only at -EBUSY
> > > > 
> > > > alloc_contig_dump_pages aims for helping debugging page migration
> > > > failure by elevated page refcount compared to expected_count.
> > > > (for the detail, please look at migrate_page_move_mapping)
> > > > 
> > > > However, -ENOMEM is just the case that system is under memory
> > > > pressure state, not relevant with page refcount at all. Thus,
> > > > the dumping page list is not helpful for the debugging point of view.
> > > > 
> > > 
> > > what about -ENOMEM when migrating empty/free huge pages? I think there is
> > > value in having the pages dumped to identify something like that. And it
> > > doesn't require heavy memory pressure to fail allocating a huge page.
> > > 
> > 
> > -ENOMEM means there is no memory to allocate destination page.
> > How could it help dumping source pages in those case from dump_page
> > content point of view?
> 
> You would spot a huge page in the source list (usually at first position)
> without any obvious migration blockers I assume?

It was not a huge page case.

> 
> I'm wondering, did you actually run into this being suboptimal? If it's a
> real problem dumping too many stuff when running into -ENOMEM, fine with me.
> If it's a theoretical issue, I'd prefer to just keep it simple as is.

That's exactly what I encountered. With -ENOMEM, it dumped a bunch of
pages from the migratepages list. That was useless and only consumed
the log buffer, since dumping the source pages leaves little to
investigate.
David Hildenbrand May 23, 2021, 12:06 p.m. UTC | #6
On 21.05.21 19:39, Minchan Kim wrote:
> On Fri, May 21, 2021 at 10:08:15AM +0200, David Hildenbrand wrote:
>> On 20.05.21 22:51, Minchan Kim wrote:
>>> On Thu, May 20, 2021 at 09:28:09PM +0200, David Hildenbrand wrote:
>>>> Minchan Kim <minchan@kernel.org> wrote on Thu, May 20, 2021 at 21:20:
>>>>
>>>>> On Wed, May 19, 2021 at 02:33:41PM -0700, Minchan Kim wrote:
>>>>>> alloc_contig_dump_pages aims for helping debugging page migration
>>>>>> failure by page refcount mismatch or something else of page itself
>>>>>> from migration handler function. However, in -ENOMEM case, there is
>>>>>> nothing to get clue from page descriptor information so just
>>>>>> dump pages only when -EBUSY happens.
>>>>>>
>>>>>> Signed-off-by: Minchan Kim <minchan@kernel.org>
>>>>>> ---
>>>>>>    mm/page_alloc.c | 3 ++-
>>>>>>    1 file changed, 2 insertions(+), 1 deletion(-)
>>>>>>
>>>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>>>> index 3100fcb08500..c0a2971dc755 100644
>>>>>> --- a/mm/page_alloc.c
>>>>>> +++ b/mm/page_alloc.c
>>>>>> @@ -8760,7 +8760,8 @@ static int __alloc_contig_migrate_range(struct
>>>>> compact_control *cc,
>>>>>>
>>>>>>         lru_cache_enable();
>>>>>>         if (ret < 0) {
>>>>>> -             alloc_contig_dump_pages(&cc->migratepages);
>>>>>> +             if (ret == -EBUSY)
>>>>>> +                     alloc_contig_dump_pages(&cc->migratepages);
>>>>>>                 putback_movable_pages(&cc->migratepages);
>>>>>>                 return ret;
>>>>>>         }
>>>>>> --
>>>>>> 2.31.1.751.gd2f1c929bd-goog
>>>>>>
>>>>>
>>>>> Resend with a little modifying description.
>>>>>
>>>>>   From c5a2fea291cf46079b87cc9ac9a25fc7f819d0fd Mon Sep 17 00:00:00 2001
>>>>> From: Minchan Kim <minchan@kernel.org>
>>>>> Date: Wed, 19 May 2021 14:22:18 -0700
>>>>> Subject: [PATCH] mm: page_alloc: dump migrate-failed pages only at -EBUSY
>>>>>
>>>>> alloc_contig_dump_pages aims for helping debugging page migration
>>>>> failure by elevated page refcount compared to expected_count.
>>>>> (for the detail, please look at migrate_page_move_mapping)
>>>>>
>>>>> However, -ENOMEM is just the case that system is under memory
>>>>> pressure state, not relevant with page refcount at all. Thus,
>>>>> the dumping page list is not helpful for the debugging point of view.
>>>>>
>>>>
>>>> what about -ENOMEM when migrating empty/free huge pages? I think there is
>>>> value in having the pages dumped to identify something like that. And it
>>>> doesn't require heavy memory pressure to fail allocating a huge page.
>>>>
>>>
>>> -ENOMEM means there is no memory to allocate destination page.
>>> How could it help dumping source pages in those case from dump_page
>>> content point of view?
>>
>> You would spot a huge page in the source list (usually at first position)
>> without any obvious migration blockers I assume?
> 
> It was not a huge page case.
> 
>>
>> I'm wondering, did you actually run into this being suboptimal? If it's a
>> real problem dumping too many stuff when running into -ENOMEM, fine with me.
>> If it's a theoretical issue, I'd prefer to just keep it simple as is.
> 
> That's exactly what I encountered. With -ENOMEM, it dumped bunch of
> pages on migratepages list. It was just useless with just consuming
> logbuffer since there are nothing much to investigate with dumping
> source pages.
> 

Fine with me, then

Reviewed-by: David Hildenbrand <david@redhat.com>

Thanks!

Patch

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 3100fcb08500..c0a2971dc755 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -8760,7 +8760,8 @@  static int __alloc_contig_migrate_range(struct compact_control *cc,
 
 	lru_cache_enable();
 	if (ret < 0) {
-		alloc_contig_dump_pages(&cc->migratepages);
+		if (ret == -EBUSY)
+			alloc_contig_dump_pages(&cc->migratepages);
 		putback_movable_pages(&cc->migratepages);
 		return ret;
 	}