diff mbox series

[v2,1/2] mm: swap: fix performance regression on sparsetruncate-tiny

Message ID 20230405161854.6931-1-zhengqi.arch@bytedance.com (mailing list archive)
State New
Headers show
Series [v2,1/2] mm: swap: fix performance regression on sparsetruncate-tiny | expand

Commit Message

Qi Zheng April 5, 2023, 4:18 p.m. UTC
The ->percpu_pvec_drained was originally introduced by
commit d9ed0d08b6c6 ("mm: only drain per-cpu pagevecs once per
pagevec usage") to drain per-cpu pagevecs only once per pagevec
usage. But after converting the swap code to be more folio-based,
the commit c2bc16817aa0 ("mm/swap: add folio_batch_move_lru()")
breaks this logic, which would cause ->percpu_pvec_drained to be
reset to false, that means per-cpu pagevecs will be drained
multiple times per pagevec usage.

In theory, there should be no functional changes when converting
code to be more folio-based. We should call folio_batch_reinit()
in folio_batch_move_lru() instead of folio_batch_init(). And to
verify that we still need ->percpu_pvec_drained, I ran
mmtests/sparsetruncate-tiny and got the following data:

                             baseline                   with
                            baseline/                 patch/
Min       Time      326.00 (   0.00%)      328.00 (  -0.61%)
1st-qrtle Time      334.00 (   0.00%)      336.00 (  -0.60%)
2nd-qrtle Time      338.00 (   0.00%)      341.00 (  -0.89%)
3rd-qrtle Time      343.00 (   0.00%)      347.00 (  -1.17%)
Max-1     Time      326.00 (   0.00%)      328.00 (  -0.61%)
Max-5     Time      327.00 (   0.00%)      330.00 (  -0.92%)
Max-10    Time      328.00 (   0.00%)      331.00 (  -0.91%)
Max-90    Time      350.00 (   0.00%)      357.00 (  -2.00%)
Max-95    Time      395.00 (   0.00%)      390.00 (   1.27%)
Max-99    Time      508.00 (   0.00%)      434.00 (  14.57%)
Max       Time      547.00 (   0.00%)      476.00 (  12.98%)
Amean     Time      344.61 (   0.00%)      345.56 *  -0.28%*
Stddev    Time       30.34 (   0.00%)       19.51 (  35.69%)
CoeffVar  Time        8.81 (   0.00%)        5.65 (  35.87%)
BAmean-99 Time      342.38 (   0.00%)      344.27 (  -0.55%)
BAmean-95 Time      338.58 (   0.00%)      341.87 (  -0.97%)
BAmean-90 Time      336.89 (   0.00%)      340.26 (  -1.00%)
BAmean-75 Time      335.18 (   0.00%)      338.40 (  -0.96%)
BAmean-50 Time      332.54 (   0.00%)      335.42 (  -0.87%)
BAmean-25 Time      329.30 (   0.00%)      332.00 (  -0.82%)

From the above it can be seen that we get similar data to when
->percpu_pvec_drained was introduced, so we still need it. Let's
call folio_batch_reinit() in folio_batch_move_lru() to restore
the original logic.

Fixes: c2bc16817aa0 ("mm/swap: add folio_batch_move_lru()")
Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
---
Changlog in v1 to v2:
 - revise commit message and add test data

 mm/swap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Matthew Wilcox April 5, 2023, 4:45 p.m. UTC | #1
On Thu, Apr 06, 2023 at 12:18:53AM +0800, Qi Zheng wrote:
> The ->percpu_pvec_drained was originally introduced by
> commit d9ed0d08b6c6 ("mm: only drain per-cpu pagevecs once per
> pagevec usage") to drain per-cpu pagevecs only once per pagevec
> usage. But after converting the swap code to be more folio-based,
> the commit c2bc16817aa0 ("mm/swap: add folio_batch_move_lru()")
> breaks this logic, which would cause ->percpu_pvec_drained to be
> reset to false, that means per-cpu pagevecs will be drained
> multiple times per pagevec usage.

My mistake.  I didn't reaise that we'd need a folio_batch_reinit(),
and indeed we didn't have one until 811561288397 (January 2023).
I thought this usage of percpu_pvec_drained was going to be fine
with being set to false each time.  Thanks for showing I was wrong.

> Fixes: c2bc16817aa0 ("mm/swap: add folio_batch_move_lru()")
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>

Reviewed-by: Matthew Wilcox (Oracle) <willy@infradead.org>
Mel Gorman April 6, 2023, 10:22 a.m. UTC | #2
On Thu, Apr 06, 2023 at 12:18:53AM +0800, Qi Zheng wrote:
> The ->percpu_pvec_drained was originally introduced by
> commit d9ed0d08b6c6 ("mm: only drain per-cpu pagevecs once per
> pagevec usage") to drain per-cpu pagevecs only once per pagevec
> usage. But after converting the swap code to be more folio-based,
> the commit c2bc16817aa0 ("mm/swap: add folio_batch_move_lru()")
> breaks this logic, which would cause ->percpu_pvec_drained to be
> reset to false, that means per-cpu pagevecs will be drained
> multiple times per pagevec usage.
> 
> In theory, there should be no functional changes when converting
> code to be more folio-based. We should call folio_batch_reinit()
> in folio_batch_move_lru() instead of folio_batch_init(). And to
> verify that we still need ->percpu_pvec_drained, I ran
> mmtests/sparsetruncate-tiny and got the following data:
> 
>                              baseline                   with
>                             baseline/                 patch/
> Min       Time      326.00 (   0.00%)      328.00 (  -0.61%)
> 1st-qrtle Time      334.00 (   0.00%)      336.00 (  -0.60%)
> 2nd-qrtle Time      338.00 (   0.00%)      341.00 (  -0.89%)
> 3rd-qrtle Time      343.00 (   0.00%)      347.00 (  -1.17%)
> Max-1     Time      326.00 (   0.00%)      328.00 (  -0.61%)
> Max-5     Time      327.00 (   0.00%)      330.00 (  -0.92%)
> Max-10    Time      328.00 (   0.00%)      331.00 (  -0.91%)
> Max-90    Time      350.00 (   0.00%)      357.00 (  -2.00%)
> Max-95    Time      395.00 (   0.00%)      390.00 (   1.27%)
> Max-99    Time      508.00 (   0.00%)      434.00 (  14.57%)
> Max       Time      547.00 (   0.00%)      476.00 (  12.98%)
> Amean     Time      344.61 (   0.00%)      345.56 *  -0.28%*
> Stddev    Time       30.34 (   0.00%)       19.51 (  35.69%)
> CoeffVar  Time        8.81 (   0.00%)        5.65 (  35.87%)
> BAmean-99 Time      342.38 (   0.00%)      344.27 (  -0.55%)
> BAmean-95 Time      338.58 (   0.00%)      341.87 (  -0.97%)
> BAmean-90 Time      336.89 (   0.00%)      340.26 (  -1.00%)
> BAmean-75 Time      335.18 (   0.00%)      338.40 (  -0.96%)
> BAmean-50 Time      332.54 (   0.00%)      335.42 (  -0.87%)
> BAmean-25 Time      329.30 (   0.00%)      332.00 (  -0.82%)
> 
> From the above it can be seen that we get similar data to when
> ->percpu_pvec_drained was introduced, so we still need it. Let's
> call folio_batch_reinit() in folio_batch_move_lru() to restore
> the original logic.
> 
> Fixes: c2bc16817aa0 ("mm/swap: add folio_batch_move_lru()")
> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>

Well spotted,

Acked-by: Mel Gorman <mgorman@suse.de>
diff mbox series

Patch

diff --git a/mm/swap.c b/mm/swap.c
index 57cb01b042f6..423199ee8478 100644
--- a/mm/swap.c
+++ b/mm/swap.c
@@ -222,7 +222,7 @@  static void folio_batch_move_lru(struct folio_batch *fbatch, move_fn_t move_fn)
 	if (lruvec)
 		unlock_page_lruvec_irqrestore(lruvec, flags);
 	folios_put(fbatch->folios, folio_batch_count(fbatch));
-	folio_batch_init(fbatch);
+	folio_batch_reinit(fbatch);
 }
 
 static void folio_batch_add_and_move(struct folio_batch *fbatch,