diff mbox series

[2/2] mm: compaction: update the cc->nr_migratepages when allocating or freeing the freepages

Message ID 0773058df022fa701b78f9a6dfe3c501a1a77351.1705928395.git.baolin.wang@linux.alibaba.com (mailing list archive)
State New
Headers show
Series [1/2] mm: compaction: limit the suitable target page order to be less than cc->order | expand

Commit Message

Baolin Wang Jan. 22, 2024, 1:01 p.m. UTC
Currently we will use 'cc->nr_freepages >= cc->nr_migratepages' comparison
to ensure that enough freepages are isolated in isolate_freepages(), however
it just decreases the cc->nr_freepages without updating cc->nr_migratepages
in compaction_alloc(), which will waste more CPU cycles and cause too many
freepages to be isolated.

So we should also update the cc->nr_migratepages when allocating or freeing
the freepages to avoid isolating excess freepages. And I can see fewer free
pages are scanned and isolated when running thpcompact on my Arm64 server:
                                       k6.7         k6.7_patched
Ops Compaction pages isolated      120692036.00   118160797.00
Ops Compaction migrate scanned     131210329.00   154093268.00
Ops Compaction free scanned       1090587971.00  1080632536.00
Ops Compact scan efficiency               12.03          14.26

Moreover, I did not see an obvious latency improvements, this is likely because
isolating freepages is not the bottleneck in the thpcompact test case.
                              k6.7                  k6.7_patched
Amean     fault-both-1      1089.76 (   0.00%)     1080.16 *   0.88%*
Amean     fault-both-3      1616.48 (   0.00%)     1636.65 *  -1.25%*
Amean     fault-both-5      2266.66 (   0.00%)     2219.20 *   2.09%*
Amean     fault-both-7      2909.84 (   0.00%)     2801.90 *   3.71%*
Amean     fault-both-12     4861.26 (   0.00%)     4733.25 *   2.63%*
Amean     fault-both-18     7351.11 (   0.00%)     6950.51 *   5.45%*
Amean     fault-both-24     9059.30 (   0.00%)     9159.99 *  -1.11%*
Amean     fault-both-30    10685.68 (   0.00%)    11399.02 *  -6.68%*

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/compaction.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Mel Gorman Feb. 1, 2024, 10:30 a.m. UTC | #1
On Mon, Jan 22, 2024 at 09:01:54PM +0800, Baolin Wang wrote:
> Currently we will use 'cc->nr_freepages >= cc->nr_migratepages' comparison
> to ensure that enough freepages are isolated in isolate_freepages(), however
> it just decreases the cc->nr_freepages without updating cc->nr_migratepages
> in compaction_alloc(), which will waste more CPU cycles and cause too many
> freepages to be isolated.
> 
> So we should also update the cc->nr_migratepages when allocating or freeing
> the freepages to avoid isolating excess freepages. And I can see fewer free
> pages are scanned and isolated when running thpcompact on my Arm64 server:
>                                        k6.7         k6.7_patched
> Ops Compaction pages isolated      120692036.00   118160797.00
> Ops Compaction migrate scanned     131210329.00   154093268.00
> Ops Compaction free scanned       1090587971.00  1080632536.00
> Ops Compact scan efficiency               12.03          14.26
> 
> Moreover, I did not see an obvious latency improvements, this is likely because
> isolating freepages is not the bottleneck in the thpcompact test case.
>                               k6.7                  k6.7_patched
> Amean     fault-both-1      1089.76 (   0.00%)     1080.16 *   0.88%*
> Amean     fault-both-3      1616.48 (   0.00%)     1636.65 *  -1.25%*
> Amean     fault-both-5      2266.66 (   0.00%)     2219.20 *   2.09%*
> Amean     fault-both-7      2909.84 (   0.00%)     2801.90 *   3.71%*
> Amean     fault-both-12     4861.26 (   0.00%)     4733.25 *   2.63%*
> Amean     fault-both-18     7351.11 (   0.00%)     6950.51 *   5.45%*
> Amean     fault-both-24     9059.30 (   0.00%)     9159.99 *  -1.11%*
> Amean     fault-both-30    10685.68 (   0.00%)    11399.02 *  -6.68%*
> 
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>

Acked-by: Mel Gorman <mgorman@techsingularity.net>
Vlastimil Babka Feb. 12, 2024, 10:32 a.m. UTC | #2
On 1/22/24 14:01, Baolin Wang wrote:
> Currently we will use 'cc->nr_freepages >= cc->nr_migratepages' comparison
> to ensure that enough freepages are isolated in isolate_freepages(), however
> it just decreases the cc->nr_freepages without updating cc->nr_migratepages
> in compaction_alloc(), which will waste more CPU cycles and cause too many
> freepages to be isolated.

Hm yeah I guess this can happen with fast_isolate_freepages() if it returns
with something but not all the freepages that are expected to be needed, and
then we get to isolate_freepages() again.

> So we should also update the cc->nr_migratepages when allocating or freeing
> the freepages to avoid isolating excess freepages. And I can see fewer free
> pages are scanned and isolated when running thpcompact on my Arm64 server:
>                                        k6.7         k6.7_patched
> Ops Compaction pages isolated      120692036.00   118160797.00
> Ops Compaction migrate scanned     131210329.00   154093268.00
> Ops Compaction free scanned       1090587971.00  1080632536.00
> Ops Compact scan efficiency               12.03          14.26
> 
> Moreover, I did not see an obvious latency improvements, this is likely because
> isolating freepages is not the bottleneck in the thpcompact test case.
>                               k6.7                  k6.7_patched
> Amean     fault-both-1      1089.76 (   0.00%)     1080.16 *   0.88%*
> Amean     fault-both-3      1616.48 (   0.00%)     1636.65 *  -1.25%*
> Amean     fault-both-5      2266.66 (   0.00%)     2219.20 *   2.09%*
> Amean     fault-both-7      2909.84 (   0.00%)     2801.90 *   3.71%*
> Amean     fault-both-12     4861.26 (   0.00%)     4733.25 *   2.63%*
> Amean     fault-both-18     7351.11 (   0.00%)     6950.51 *   5.45%*
> Amean     fault-both-24     9059.30 (   0.00%)     9159.99 *  -1.11%*
> Amean     fault-both-30    10685.68 (   0.00%)    11399.02 *  -6.68%*
> 
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
>  mm/compaction.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/mm/compaction.c b/mm/compaction.c
> index 066b72b3471a..6c84e3a5b32b 100644
> --- a/mm/compaction.c
> +++ b/mm/compaction.c
> @@ -1779,6 +1779,7 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data)
>  	dst = list_entry(cc->freepages.next, struct folio, lru);
>  	list_del(&dst->lru);
>  	cc->nr_freepages--;
> +	cc->nr_migratepages--;

This is breaking the tracepoint TRACE_EVENT(mm_compaction_migratepages)
which does

__entry->nr_failed = cc->nr_migratepages - nr_succeeded;

and is called after migrate_pages() finishes, so now this will underflow.

Probably need to get a snapshot of cc->nr_migratepages before calling
migrate_pages() and then feed that to the tracepoint instead of cc.

>  
>  	return dst;
>  }
> @@ -1794,6 +1795,7 @@ static void compaction_free(struct folio *dst, unsigned long data)
>  
>  	list_add(&dst->lru, &cc->freepages);
>  	cc->nr_freepages++;
> +	cc->nr_migratepages++;
>  }
>  
>  /* possible outcome of isolate_migratepages */
Baolin Wang Feb. 19, 2024, 2:34 a.m. UTC | #3
On 2024/2/12 18:32, Vlastimil Babka wrote:
> On 1/22/24 14:01, Baolin Wang wrote:
>> Currently we will use 'cc->nr_freepages >= cc->nr_migratepages' comparison
>> to ensure that enough freepages are isolated in isolate_freepages(), however
>> it just decreases the cc->nr_freepages without updating cc->nr_migratepages
>> in compaction_alloc(), which will waste more CPU cycles and cause too many
>> freepages to be isolated.
> 
> Hm yeah I guess this can happen with fast_isolate_freepages() if it returns
> with something but not all the freepages that are expected to be needed, and
> then we get to isolate_freepages() again.

Yes.

> 
>> So we should also update the cc->nr_migratepages when allocating or freeing
>> the freepages to avoid isolating excess freepages. And I can see fewer free
>> pages are scanned and isolated when running thpcompact on my Arm64 server:
>>                                         k6.7         k6.7_patched
>> Ops Compaction pages isolated      120692036.00   118160797.00
>> Ops Compaction migrate scanned     131210329.00   154093268.00
>> Ops Compaction free scanned       1090587971.00  1080632536.00
>> Ops Compact scan efficiency               12.03          14.26
>>
>> Moreover, I did not see an obvious latency improvements, this is likely because
>> isolating freepages is not the bottleneck in the thpcompact test case.
>>                                k6.7                  k6.7_patched
>> Amean     fault-both-1      1089.76 (   0.00%)     1080.16 *   0.88%*
>> Amean     fault-both-3      1616.48 (   0.00%)     1636.65 *  -1.25%*
>> Amean     fault-both-5      2266.66 (   0.00%)     2219.20 *   2.09%*
>> Amean     fault-both-7      2909.84 (   0.00%)     2801.90 *   3.71%*
>> Amean     fault-both-12     4861.26 (   0.00%)     4733.25 *   2.63%*
>> Amean     fault-both-18     7351.11 (   0.00%)     6950.51 *   5.45%*
>> Amean     fault-both-24     9059.30 (   0.00%)     9159.99 *  -1.11%*
>> Amean     fault-both-30    10685.68 (   0.00%)    11399.02 *  -6.68%*
>>
>> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
>> ---
>>   mm/compaction.c | 2 ++
>>   1 file changed, 2 insertions(+)
>>
>> diff --git a/mm/compaction.c b/mm/compaction.c
>> index 066b72b3471a..6c84e3a5b32b 100644
>> --- a/mm/compaction.c
>> +++ b/mm/compaction.c
>> @@ -1779,6 +1779,7 @@ static struct folio *compaction_alloc(struct folio *src, unsigned long data)
>>   	dst = list_entry(cc->freepages.next, struct folio, lru);
>>   	list_del(&dst->lru);
>>   	cc->nr_freepages--;
>> +	cc->nr_migratepages--;
> 
> This is breaking the tracepoint TRACE_EVENT(mm_compaction_migratepages)
> which does
> 
> __entry->nr_failed = cc->nr_migratepages - nr_succeeded;
> 
> and is called after migrate_pages() finishes, so now this will underflow.
> 
> Probably need to get a snapshot of cc->nr_migratepages before calling
> migrate_pages() and then feed that to the tracepoint instead of cc.

Ah, good catch. Will fix in next version. Thanks.
diff mbox series

Patch

diff --git a/mm/compaction.c b/mm/compaction.c
index 066b72b3471a..6c84e3a5b32b 100644
--- a/mm/compaction.c
+++ b/mm/compaction.c
@@ -1779,6 +1779,7 @@  static struct folio *compaction_alloc(struct folio *src, unsigned long data)
 	dst = list_entry(cc->freepages.next, struct folio, lru);
 	list_del(&dst->lru);
 	cc->nr_freepages--;
+	cc->nr_migratepages--;
 
 	return dst;
 }
@@ -1794,6 +1795,7 @@  static void compaction_free(struct folio *dst, unsigned long data)
 
 	list_add(&dst->lru, &cc->freepages);
 	cc->nr_freepages++;
+	cc->nr_migratepages++;
 }
 
 /* possible outcome of isolate_migratepages */