Message ID: 20230109033838.2779902-1-senozhatsky@chromium.org
Series: zsmalloc: make zspage chain size configurable
On 01/09/23 12:38, Sergey Senozhatsky wrote:
> Hi,
>
> This turns the hard-coded limit on the maximum number of physical
> pages per-zspage into a config option. It also increases the default
> limit from 4 to 8.
>
> Sergey Senozhatsky (4):
>   zsmalloc: rework zspage chain size selection
>   zsmalloc: skip chain size calculation for pow_of_2 classes
>   zsmalloc: make zspage chain size configurable
>   zsmalloc: set default zspage chain size to 8
>
>  Documentation/mm/zsmalloc.rst | 168 ++++++++++++++++++++++++++++++++++
>  mm/Kconfig                    |  19 ++++
>  mm/zsmalloc.c                 |  72 +++++----------
>  3 files changed, 212 insertions(+), 47 deletions(-)

Hi Sergey,

The following BUG shows up after this series in linux-next. I can easily
recreate it by doing the following:

  # echo large_value > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages

where 'large_value' is so big that there could never possibly be that
many 2MB huge pages in the system.
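[For readers without the patches at hand, the change the cover letter
describes can be sketched as below. The macro names follow mm/zsmalloc.c
and the CONFIG_ZSMALLOC_CHAIN_SIZE option matches the mm/Kconfig entry in
the diffstat, but this is an approximation, not a quote from the series.]

	/* Approximate before/after of the limit in mm/zsmalloc.c. */

	/* Before: hard-coded to 1UL << 2 == 4 pages per zspage. */
	#define ZS_MAX_ZSPAGE_ORDER	2
	#define ZS_MAX_PAGES_PER_ZSPAGE	(_AC(1, UL) << ZS_MAX_ZSPAGE_ORDER)

	/* After: driven by a Kconfig option, defaulting to 8. */
	#define ZS_MAX_PAGES_PER_ZSPAGE	CONFIG_ZSMALLOC_CHAIN_SIZE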
On (23/01/13 11:57), Mike Kravetz wrote:
> > This turns the hard-coded limit on the maximum number of physical
> > pages per-zspage into a config option. It also increases the default
> > limit from 4 to 8.
[..]
> Hi Sergey,

Hi Mike,

> The following BUG shows up after this series in linux-next. I can easily
> recreate it by doing the following:
>
> # echo large_value > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> where 'large_value' is so big that there could never possibly be that
> many 2MB huge pages in the system.

Hmm... Are we sure this is related? I really cannot see how chain-size
can have an effect on the zspage ->isolated counter.

What chain-size value do you use? You don't see problems with chain
size of 4?
On (23/01/13 11:57), Mike Kravetz wrote:
> The following BUG shows up after this series in linux-next. I can easily
> recreate it by doing the following:
>
> # echo large_value > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> where 'large_value' is so big that there could never possibly be that
> many 2MB huge pages in the system.

Just to make sure: do you have this patch applied?
https://lore.kernel.org/lkml/20230112071443.1933880-1-senozhatsky@chromium.org
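[The linked patch is, per the reply further down the thread, "zsmalloc:
turn chain size config option into UL constant" — apparently the hppa64
build fix mentioned later. Presumably it gives the raw Kconfig int an
unsigned-long type, along these lines (sketch, not verbatim):]

	/* Presumed shape of the linked fixup: suffix the Kconfig value
	 * with UL so the macro is an unsigned long constant.
	 */
	#define ZS_MAX_PAGES_PER_ZSPAGE	(_AC(CONFIG_ZSMALLOC_CHAIN_SIZE, UL))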
On (23/01/13 11:57), Mike Kravetz wrote:
> Hi Sergey,
>
> The following BUG shows up after this series in linux-next. I can easily
> recreate it by doing the following:
>
> # echo large_value > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> where 'large_value' is so big that there could never possibly be that
> many 2MB huge pages in the system.

I get migration warnings with the zsmalloc series reverted.
I guess the problem is somewhere else. Can you double check
on your side?

[   87.208255] ------------[ cut here ]------------
[   87.209431] WARNING: CPU: 18 PID: 300 at mm/migrate.c:995 move_to_new_folio+0x1ef/0x260
[   87.211993] Modules linked in: deflate zlib_deflate zstd zstd_compress zram
[   87.214287] CPU: 18 PID: 300 Comm: kcompactd0 Tainted: G N 6.2.0-rc3-next-20230113+ #385
[   87.217529] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.0-debian-1.16.0-4 04/01/2014
[   87.220131] RIP: 0010:move_to_new_folio+0x1ef/0x260
[   87.221892] Code: 84 c0 74 78 48 8b 43 18 44 89 ea 48 89 de 4c 89 e7 ff 50 06 85 c0 0f 85 a9 fe ff ff 48 8b 03 a9 00 00 04 00 0f 85 7a fe ff ff <0f> 0b e9 73 fe ff ff 48 8b 03 f6 c4 20 74 2a be c0 0c 00 00 48 89
[   87.226514] RSP: 0018:ffffc90000b9fb08 EFLAGS: 00010246
[   87.227879] RAX: 4000000000000021 RBX: ffffea0000890500 RCX: 0000000000000000
[   87.230948] RDX: 0000000000000000 RSI: ffffffff81e6f950 RDI: ffffea0000890500
[   87.233026] RBP: ffffea0000890500 R08: 0000001e82ec3c3e R09: 0000000000000001
[   87.235517] R10: 00000000ffffffff R11: 00000000ffffffff R12: ffffea00015a26c0
[   87.237807] R13: 0000000000000001 R14: ffffea00015a2680 R15: ffffea00008904c0
[   87.239438] FS:  0000000000000000(0000) GS:ffff888624200000(0000) knlGS:0000000000000000
[   87.241303] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   87.242627] CR2: 00007fe537ebbdb8 CR3: 0000000110a0a004 CR4: 0000000000770ee0
[   87.244283] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   87.245913] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   87.247559] PKRU: 55555554
[   87.248269] Call Trace:
[   87.248862]  <TASK>
[   87.249370]  ? lock_is_held_type+0xd9/0x130
[   87.250377]  migrate_pages_batch+0x553/0xc80
[   87.251513]  ? move_freelist_tail+0xc0/0xc0
[   87.252545]  ? isolate_freepages+0x290/0x290
[   87.253654]  ? trace_mm_migrate_pages+0xf0/0xf0
[   87.254901]  migrate_pages+0x1ae/0x330
[   87.255877]  ? isolate_freepages+0x290/0x290
[   87.257015]  ? move_freelist_tail+0xc0/0xc0
[   87.258213]  compact_zone+0x528/0x6a0
[   87.260911]  proactive_compact_node+0x87/0xd0
[   87.262090]  kcompactd+0x1ca/0x360
[   87.263018]  ? swake_up_all+0xe0/0xe0
[   87.264101]  ? kcompactd_do_work+0x240/0x240
[   87.265243]  kthread+0xec/0x110
[   87.266031]  ? kthread_complete_and_exit+0x20/0x20
[   87.267268]  ret_from_fork+0x1f/0x30
[   87.268243]  </TASK>
[   87.268984] irq event stamp: 311113
[   87.269930] hardirqs last  enabled at (311125): [<ffffffff810da6c2>] __up_console_sem+0x52/0x60
[   87.272235] hardirqs last disabled at (311134): [<ffffffff810da6a7>] __up_console_sem+0x37/0x60
[   87.275707] softirqs last  enabled at (311088): [<ffffffff819d2b2c>] __do_softirq+0x21c/0x31f
[   87.278450] softirqs last disabled at (311083): [<ffffffff81070b8d>] __irq_exit_rcu+0xad/0x120
[   87.280555] ---[ end trace 0000000000000000 ]---
On 01/14/23 16:08, Sergey Senozhatsky wrote:
> On (23/01/13 11:57), Mike Kravetz wrote:
> > The following BUG shows up after this series in linux-next. I can easily
> > recreate it by doing the following:
> >
> > # echo large_value > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> > where 'large_value' is so big that there could never possibly be that
> > many 2MB huge pages in the system.
>
> I get migration warnings with the zsmalloc series reverted.
> I guess the problem is somewhere else. Can you double check
> on your side?

I did the following:

- Started with clean v6.2-rc3.
  Performed echo, did not see the issue.

- Applied your 5 patches (includes the "zsmalloc: turn chain size config
  option into UL constant" patch). Took the default ZSMALLOC_CHAIN_SIZE
  value of 8.
  Performed echo, recreated the issue.

- Changed ZSMALLOC_CHAIN_SIZE to 1.
  Performed echo, did not see the issue.

I have not looked into the details of your patches or elsewhere. I just
thought it might be related to your series because of the above, and,
since the series is fresh in your mind, this may trigger some
thought/explanation. It is certainly possible that the root cause is
elsewhere and your series is just exposing it. I can take a closer look
on Monday. Thanks,
On (23/01/14 13:34), Mike Kravetz wrote:
> I did the following:
>
> - Started with clean v6.2-rc3.
>   Performed echo, did not see the issue.
>
> - Applied your 5 patches (includes the "zsmalloc: turn chain size config
>   option into UL constant" patch). Took the default ZSMALLOC_CHAIN_SIZE
>   value of 8.
>   Performed echo, recreated the issue.
>
> - Changed ZSMALLOC_CHAIN_SIZE to 1.
>   Performed echo, did not see the issue.

The patch set basically just adjusts $NUM in calculate_zspage_chain_size():

	for (i = 1; i <= $NUM; i++)

It changes the default 4 to 8. Can't really see how this can cause problems.
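[For readers not staring at mm/zsmalloc.c: with the series applied, the
helper under discussion looks roughly like this ($NUM above is
ZS_MAX_PAGES_PER_ZSPAGE, now fed by CONFIG_ZSMALLOC_CHAIN_SIZE). A
lightly abridged sketch, not a verbatim quote:]

	static int calculate_zspage_chain_size(int class_size)
	{
		int i, min_waste = INT_MAX;
		int chain_size = 1;

		if (is_power_of_2(class_size))
			return chain_size;

		/*
		 * Pick the number of 0-order pages whose combined size
		 * divides most evenly into objects of this class size,
		 * i.e. minimizes the wasted tail space per zspage.
		 */
		for (i = 1; i <= ZS_MAX_PAGES_PER_ZSPAGE; i++) {
			int waste;

			waste = (i * PAGE_SIZE) % class_size;
			if (waste < min_waste) {
				min_waste = waste;
				chain_size = i;
			}
		}

		return chain_size;
	}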
On (23/01/15 13:21), Sergey Senozhatsky wrote:
> On (23/01/14 13:34), Mike Kravetz wrote:
[..]
>
> The patch set basically just adjusts $NUM in calculate_zspage_chain_size():
>
> 	for (i = 1; i <= $NUM; i++)
>
> It changes the default 4 to 8. Can't really see how this can cause problems.

OK, I guess it overflows the zspage ->isolated counter, which is a 3-bit
bit-field, so the max chain-size we can have is b111 == 7. We probably need
something like below (this should not increase sizeof(struct zspage)):

---

diff --git a/mm/zsmalloc.c b/mm/zsmalloc.c
index 290053e648b0..86b742a613ee 100644
--- a/mm/zsmalloc.c
+++ b/mm/zsmalloc.c
@@ -129,7 +129,7 @@
 #define HUGE_BITS	1
 #define FULLNESS_BITS	2
 #define CLASS_BITS	8
-#define ISOLATED_BITS	3
+#define ISOLATED_BITS	5
 #define MAGIC_VAL_BITS	8
 
 #define MAX(a, b) ((a) >= (b) ? (a) : (b))
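[For context, the bit-field being widened: ->isolated counts how many of
the zspage's pages are currently isolated for migration, so it must be
able to reach the chain size. An abridged sketch of struct zspage as it
looked around this series (not verbatim). The flag fields together use
1 + 2 + 9 + 3 + 8 = 23 bits, so going to ISOLATED_BITS == 5 (25 bits)
still packs into a single unsigned int, which is why sizeof does not
grow:]

	struct zspage {
		struct {
			unsigned int huge:HUGE_BITS;		/* 1 bit  */
			unsigned int fullness:FULLNESS_BITS;	/* 2 bits */
			unsigned int class:CLASS_BITS + 1;	/* 9 bits */
			unsigned int isolated:ISOLATED_BITS;	/* 3 bits: max 7 */
			unsigned int magic:MAGIC_VAL_BITS;	/* 8 bits */
		};
		unsigned int inuse;
		unsigned int freeobj;
		struct page *first_page;
		struct list_head list;	/* fullness list */
		struct zs_pool *pool;
		rwlock_t lock;
	};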
Cc-ing Matthew,

On (23/01/14 16:08), Sergey Senozhatsky wrote:
> [   87.208255] ------------[ cut here ]------------
> [   87.209431] WARNING: CPU: 18 PID: 300 at mm/migrate.c:995 move_to_new_folio+0x1ef/0x260
[..]
> [   87.280555] ---[ end trace 0000000000000000 ]---

So this warning is move_to_new_folio() being called on an un-isolated
src folio. I had DEBUG_VM disabled so VM_BUG_ON_FOLIO(!folio_test_isolated(src))
did nothing, however after mops->migrate_page() it would trigger WARN_ON()
because it evaluates folio_test_isolated(src) one more time:

[   59.500580] page:0000000097d97a42 refcount:2 mapcount:1665 mapping:0000000000000000 index:0xffffea00185ce940 pfn:0x113dc4
[   59.503239] flags: 0x8000000000000001(locked|zone=2)
[   59.505060] raw: 8000000000000001 ffffea00044f70c8 ffffc90000ba7c20 ffffffff81c22582
[   59.507288] raw: ffffea00185ce940 ffff88809183fdb0 0000000200000680 0000000000000000
[   59.509622] page dumped because: VM_BUG_ON_FOLIO(!folio_test_isolated(src))
[   59.511845] ------------[ cut here ]------------
[   59.513181] kernel BUG at mm/migrate.c:988!
[   59.514821] invalid opcode: 0000 [#1] PREEMPT SMP PTI

[   59.523018] RIP: 0010:move_to_new_folio+0x362/0x3b0
[   59.524160] Code: ff ff e9 55 fd ff ff 48 89 df e8 69 d8 ff ff f0 80 60 02 fb 31 c0 e9 65 fd ff ff 48 c7 c6 00 f5 e9 81 48 89 df e8 be c0 f9 ff <0f> 0b 48 c7 c6 00 f5 e9 81 48 89 df e8 ad c0 f9 ff 0f 0b b8 f5 ff
[   59.528349] RSP: 0018:ffffc90000ba7af8 EFLAGS: 00010246
[   59.529551] RAX: 000000000000003f RBX: ffffea00044f7100 RCX: 0000000000000000
[   59.531186] RDX: 0000000000000000 RSI: ffffffff81e8dcf1 RDI: 00000000ffffffff
[   59.532790] RBP: ffffea00184f1140 R08: 00000000ffffbfff R09: 00000000ffffbfff
[   59.534392] R10: ffff888621ca0000 R11: ffff888621ca0000 R12: 8000000000000001
[   59.536026] R13: 0000000000000001 R14: 0000000000000000 R15: ffffea00184f1140
[   59.537646] FS:  0000000000000000(0000) GS:ffff888626a00000(0000) knlGS:0000000000000000
[   59.539484] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   59.540785] CR2: 00007ff7fbed8000 CR3: 0000000101a26001 CR4: 0000000000770ee0
[   59.542412] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   59.544030] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   59.545637] PKRU: 55555554
[   59.546261] Call Trace:
[   59.546833]  <TASK>
[   59.547371]  ? lock_is_held_type+0xd9/0x130
[   59.548331]  migrate_pages_batch+0x650/0xdc0
[   59.549326]  ? move_freelist_tail+0xc0/0xc0
[   59.550281]  ? isolate_freepages+0x290/0x290
[   59.551289]  ? folio_flags.constprop.0+0x50/0x50
[   59.552348]  migrate_pages+0x3fa/0x4d0
[   59.553224]  ? isolate_freepages+0x290/0x290
[   59.554214]  ? move_freelist_tail+0xc0/0xc0
[   59.555173]  compact_zone+0x51b/0x6a0
[   59.556031]  proactive_compact_node+0x8e/0xe0
[   59.557033]  kcompactd+0x1c3/0x350
[   59.557842]  ? swake_up_all+0xe0/0xe0
[   59.558699]  ? kcompactd_do_work+0x260/0x260
[   59.559703]  kthread+0xec/0x110
[   59.560450]  ? kthread_complete_and_exit+0x20/0x20
[   59.561582]  ret_from_fork+0x1f/0x30
[   59.562427]  </TASK>
[   59.562966] Modules linked in: deflate zlib_deflate zstd zstd_compress zram
[   59.564591] ---[ end trace 0000000000000000 ]---
[   59.565661] RIP: 0010:move_to_new_folio+0x362/0x3b0
[   59.566802] Code: ff ff e9 55 fd ff ff 48 89 df e8 69 d8 ff ff f0 80 60 02 fb 31 c0 e9 65 fd ff ff 48 c7 c6 00 f5 e9 81 48 89 df e8 be c0 f9 ff <0f> 0b 48 c7 c6 00 f5 e9 81 48 89 df e8 ad c0 f9 ff 0f 0b b8 f5 ff
[   59.571048] RSP: 0018:ffffc90000ba7af8 EFLAGS: 00010246
[   59.572257] RAX: 000000000000003f RBX: ffffea00044f7100 RCX: 0000000000000000
[   59.573906] RDX: 0000000000000000 RSI: ffffffff81e8dcf1 RDI: 00000000ffffffff
[   59.575544] RBP: ffffea00184f1140 R08: 00000000ffffbfff R09: 00000000ffffbfff
[   59.577236] R10: ffff888621ca0000 R11: ffff888621ca0000 R12: 8000000000000001
[   59.578893] R13: 0000000000000001 R14: 0000000000000000 R15: ffffea00184f1140
[   59.580593] FS:  0000000000000000(0000) GS:ffff888626a00000(0000) knlGS:0000000000000000
[   59.582432] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   59.583767] CR2: 00007ff7fbed8000 CR3: 0000000101a26001 CR4: 0000000000770ee0
[   59.585437] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[   59.587082] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[   59.588738] PKRU: 55555554
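[To make the DEBUG_VM point concrete: roughly the non-LRU (movable)
branch of move_to_new_folio() in mm/migrate.c at the time. With
CONFIG_DEBUG_VM=n the VM_BUG_ON_FOLIO() compiles away, so the first
signal is the WARN_ON_ONCE() that re-checks the isolated flag after
->migrate_page() returns. Abridged and paraphrased; the exact WARN
condition may differ:]

	/* Sketch of the non-LRU branch of move_to_new_folio(). */
	const struct movable_operations *mops = folio_movable_ops(src);

	/* Compiled out with CONFIG_DEBUG_VM=n, so it caught nothing here. */
	VM_BUG_ON_FOLIO(!folio_test_isolated(src), src);

	rc = mops->migrate_page(&dst->page, &src->page, mode);

	/* ...but the isolated flag is evaluated once more afterwards. */
	WARN_ON_ONCE(rc == MIGRATEPAGE_SUCCESS && !folio_test_isolated(src));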
+ Huang Ying,

> On (23/01/14 16:08), Sergey Senozhatsky wrote:
> > [   87.209431] WARNING: CPU: 18 PID: 300 at mm/migrate.c:995 move_to_new_folio+0x1ef/0x260
[..]
>
> So this warning is move_to_new_folio() being called on an un-isolated
> src folio. I had DEBUG_VM disabled so VM_BUG_ON_FOLIO(!folio_test_isolated(src))
> did nothing, however after mops->migrate_page() it would trigger WARN_ON()
> because it evaluates folio_test_isolated(src) one more time:
>
> [   59.509622] page dumped because: VM_BUG_ON_FOLIO(!folio_test_isolated(src))
> [   59.513181] kernel BUG at mm/migrate.c:988!
[..]
On Sun, Jan 15, 2023 at 04:18:55PM +0900, Sergey Senozhatsky wrote:
> So this warning is move_to_new_folio() being called on an un-isolated
> src folio. I had DEBUG_VM disabled so VM_BUG_ON_FOLIO(!folio_test_isolated(src))
> did nothing, however after mops->migrate_page() it would trigger WARN_ON()
> because it evaluates folio_test_isolated(src) one more time:
>
> [   59.500580] page:0000000097d97a42 refcount:2 mapcount:1665 mapping:0000000000000000 index:0xffffea00185ce940 pfn:0x113dc4
> [   59.503239] flags: 0x8000000000000001(locked|zone=2)
> [   59.505060] raw: 8000000000000001 ffffea00044f70c8 ffffc90000ba7c20 ffffffff81c22582
> [   59.507288] raw: ffffea00185ce940 ffff88809183fdb0 0000000200000680 0000000000000000

That is quite the messed-up page. mapcount is positive, but higher than
refcount. And not just a little bit; 1665 vs 2. But mapping is NULL,
so it's not anon or file memory. Makes me think it belongs to a driver
that's using ->mapcount for its own purposes. It's not PageSlab.

Given that you're working on zsmalloc, I took a look and:

static inline void set_first_obj_offset(struct page *page, unsigned int offset)
{
	page->page_type = offset;
}

(page_type aliases with mapcount). So I'm pretty sure this is a
zsmalloc page. But mapping should point to zsmalloc_mops. Not
really sure what's going on here. Can you bisect?

> [   59.509622] page dumped because: VM_BUG_ON_FOLIO(!folio_test_isolated(src))
[..]
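[The aliasing Matthew refers to lives in include/linux/mm_types.h: the
page_type word shares storage with _mapcount. The dump above appears
consistent with this reading: assuming the usual x86-64 little-endian
layout, the raw word 0000000200000680 decodes as _refcount == 2 in the
upper half and page_type/_mapcount == 0x680 == 1664 (a plausible
first-object offset) in the lower half, and dump_page() prints mapcount
as _mapcount + 1 == 1665. An abridged sketch, comments paraphrased:]

	struct page {
		unsigned long flags;
		/* ... five words of type-specific state ... */

		union {		/* This union is 4 bytes in size. */
			/*
			 * For pages mapped to userspace: the number of
			 * page-table references to this page, minus one.
			 */
			atomic_t _mapcount;

			/*
			 * For pages that are neither slab nor mappable to
			 * userspace: a type/marker word. zsmalloc stores the
			 * first-object offset here via set_first_obj_offset()
			 * quoted above.
			 */
			unsigned int page_type;
		};

		atomic_t _refcount;
		/* ... */
	};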
On (23/01/15 13:04), Matthew Wilcox wrote:
> That is quite the messed-up page. mapcount is positive, but higher than
> refcount. And not just a little bit; 1665 vs 2. But mapping is NULL,
> so it's not anon or file memory. Makes me think it belongs to a driver
> that's using ->mapcount for its own purposes. It's not PageSlab.
>
> Given that you're working on zsmalloc, I took a look and:
>
> static inline void set_first_obj_offset(struct page *page, unsigned int offset)
> {
> 	page->page_type = offset;
> }
>
> (page_type aliases with mapcount). So I'm pretty sure this is a
> zsmalloc page. But mapping should point to zsmalloc_mops. Not
> really sure what's going on here. Can you bisect?

Thanks. Let me try bisecting. From what I can tell, tags/next-20221226
is the last good kernel and tags/next-20230105 is the first bad one.
I'll try to narrow it down from here.
Hi, Sergey,

Sergey Senozhatsky <senozhatsky@chromium.org> writes:

> + Huang Ying,
>
>> On (23/01/14 16:08), Sergey Senozhatsky wrote:
>> > [   87.209431] WARNING: CPU: 18 PID: 300 at mm/migrate.c:995 move_to_new_folio+0x1ef/0x260
[..]
>>
>> So this warning is move_to_new_folio() being called on an un-isolated
>> src folio. I had DEBUG_VM disabled so VM_BUG_ON_FOLIO(!folio_test_isolated(src))
>> did nothing, however after mops->migrate_page() it would trigger WARN_ON()
>> because it evaluates folio_test_isolated(src) one more time:
>>
>> [   59.509622] page dumped because: VM_BUG_ON_FOLIO(!folio_test_isolated(src))
>> [   59.513181] kernel BUG at mm/migrate.c:988!
[..]

Thanks for reporting. We have just fixed a ZRAM-related bug in the
migrate_pages() batching series with the help of Mike.

https://lore.kernel.org/linux-mm/Y8DizzvFXBSEPzI4@monkey/

I will send out a new version today or tomorrow to fix it. Please try
that.

Best Regards,
Huang, Ying
On (23/01/09 12:38), Sergey Senozhatsky wrote:
> This turns the hard-coded limit on the maximum number of physical
> pages per-zspage into a config option. It also increases the default
> limit from 4 to 8.
>
> Sergey Senozhatsky (4):
>   zsmalloc: rework zspage chain size selection
>   zsmalloc: skip chain size calculation for pow_of_2 classes
>   zsmalloc: make zspage chain size configurable
>   zsmalloc: set default zspage chain size to 8
>
>  Documentation/mm/zsmalloc.rst | 168 ++++++++++++++++++++++++++++++++++
>  mm/Kconfig                    |  19 ++++
>  mm/zsmalloc.c                 |  72 +++++----------
>  3 files changed, 212 insertions(+), 47 deletions(-)

Andrew,

Can you please drop this series? We have two fixup patches (the hppa64
build failure and the isolated bit-field overflow reported by Mike) for
this series, and at this point I probably want to send out v3 with all
fixups squashed.

Mike, would it be OK with you if I squash the ->isolated fixup?
Hi,

On (23/01/16 09:27), Huang, Ying wrote:
[..]
> Thanks for reporting. We have just fixed a ZRAM-related bug in the
> migrate_pages() batching series with the help of Mike.

Oh, great. Yeah, I narrowed it down to that series as well.

> https://lore.kernel.org/linux-mm/Y8DizzvFXBSEPzI4@monkey/

That fixes it!
On 01/16/23 12:15, Sergey Senozhatsky wrote:
> Andrew,
>
> Can you please drop this series? We have two fixup patches (the hppa64
> build failure and the isolated bit-field overflow reported by Mike) for
> this series, and at this point I probably want to send out v3 with all
> fixups squashed.
>
> Mike, would it be OK with you if I squash the ->isolated fixup?

I'm OK with however you want to address it. Thanks!