Message ID | 20240226-zsmalloc-zspage-rcu-v1-0-456b0ef1a89d@bytedance.com (mailing list archive) |
---|---|
Series | mm/zsmalloc: simplify synchronization between zs_page_migrate() and free_zspage() |
On (24/02/27 03:02), Chengming Zhou wrote:
> Hello,
>
> free_zspage() has to hold locks of all pages, since zs_page_migrate()
> path rely on this page lock to protect the race between zs_free() and
> it, so it can safely get zspage from page->private.
>
> But this way is not good and simple enough:
>
> 1. Since zs_free() couldn't be sleepable, it can only trylock pages,
> or has to kick_deferred_free() to defer that to a work.
>
> 2. Even in the worker context, async_free_zspage() can't simply
> lock all pages in lock_zspage(), it's still trylock because of
> the race between zs_free() and zs_page_migrate(). Please see
> the commit 2505a981114d ("zsmalloc: fix races between asynchronous
> zspage free and page migration") for details.
>
> Actually, all free_zspage() needs is to get zspage from page safely,
> we can use RCU to achieve it easily. Then free_zspage() don't need to
> hold locks of all pages, so don't need the deferred free mechanism
> at all. This patchset implements it and remove all of deferred free
> related code.
>
> Thanks for review and comments!
>
> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>

JFI, recovered from the SPAM folder
"The sender hasn't authenticated this message"

On 2024/2/28 09:57, Sergey Senozhatsky wrote:
> On (24/02/27 03:02), Chengming Zhou wrote:
>> Hello,
>>
>> free_zspage() has to hold locks of all pages, since zs_page_migrate()
>> path rely on this page lock to protect the race between zs_free() and
>> it, so it can safely get zspage from page->private.
>>
>> But this way is not good and simple enough:
>>
>> 1. Since zs_free() couldn't be sleepable, it can only trylock pages,
>> or has to kick_deferred_free() to defer that to a work.
>>
>> 2. Even in the worker context, async_free_zspage() can't simply
>> lock all pages in lock_zspage(), it's still trylock because of
>> the race between zs_free() and zs_page_migrate(). Please see
>> the commit 2505a981114d ("zsmalloc: fix races between asynchronous
>> zspage free and page migration") for details.
>>
>> Actually, all free_zspage() needs is to get zspage from page safely,
>> we can use RCU to achieve it easily. Then free_zspage() don't need to
>> hold locks of all pages, so don't need the deferred free mechanism
>> at all. This patchset implements it and remove all of deferred free
>> related code.
>>
>> Thanks for review and comments!
>>
>> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
>
> JFI, recovered from the SPAM folder
> "The sender hasn't authenticated this message"

Sorry for this, I thought the problem was fixed after testing with my own
Gmail last time. But it turns out my corporation email still sometimes has
this problem.

I will always use linux.dev email in the future to avoid these problems.

Thanks for your time!

On (24/02/27 03:02), Chengming Zhou wrote:
> free_zspage() has to hold locks of all pages, since zs_page_migrate()
> path rely on this page lock to protect the race between zs_free() and
> it, so it can safely get zspage from page->private.
>
> But this way is not good and simple enough:
>
> 1. Since zs_free() couldn't be sleepable, it can only trylock pages,
> or has to kick_deferred_free() to defer that to a work.
>
> 2. Even in the worker context, async_free_zspage() can't simply
> lock all pages in lock_zspage(), it's still trylock because of
> the race between zs_free() and zs_page_migrate(). Please see
> the commit 2505a981114d ("zsmalloc: fix races between asynchronous
> zspage free and page migration") for details.
>
> Actually, all free_zspage() needs is to get zspage from page safely,
> we can use RCU to achieve it easily. Then free_zspage() don't need to
> hold locks of all pages, so don't need the deferred free mechanism
> at all. This patchset implements it and remove all of deferred free
> related code.
>
> Thanks for review and comments!
>
> Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
> ---
> Chengming Zhou (2):
> mm/zsmalloc: don't hold locks of all pages when free_zspage()

That seems to be crashing on me:

[ 28.123867] ==================================================================
[ 28.125303] BUG: KASAN: null-ptr-deref in obj_malloc+0xa9/0x1f0
[ 28.126289] Read of size 8 at addr 0000000000000028 by task mkfs.ext2/432
[ 28.127414]
[ 28.127684] CPU: 8 PID: 432 Comm: mkfs.ext2 Tainted: G N 6.8.0-rc5+ #309
[ 28.129015] Call Trace:
[ 28.129442] <TASK>
[ 28.129805] dump_stack_lvl+0x6f/0xab
[ 28.130437] print_report+0xe0/0x5e0
[ 28.131050] ? _printk+0x59/0x7b
[ 28.131602] ? kasan_report+0x96/0x120
[ 28.132233] ? obj_malloc+0xa9/0x1f0
[ 28.132837] kasan_report+0xe7/0x120
[ 28.133441] ? obj_malloc+0xa9/0x1f0
[ 28.134046] obj_malloc+0xa9/0x1f0
[ 28.134633] zs_malloc+0x22c/0x3e0
[ 28.135211] zram_submit_bio+0x44e/0xee0
[ 28.135871] ? lock_release+0x50c/0x700
[ 28.136520] submit_bio_noacct_nocheck+0x22a/0x650
[ 28.137327] __block_write_full_folio+0x48b/0x710
[ 28.138119] ? __cfi_blkdev_get_block+0x10/0x10
[ 28.138885] ? __cfi_block_write_full_folio+0x10/0x10
[ 28.139737] write_cache_pages+0x83/0xf0
[ 28.140397] ? __cfi_blkdev_get_block+0x10/0x10
[ 28.141152] blkdev_writepages+0x46/0x80
[ 28.141810] do_writepages+0x1be/0x400
[ 28.142443] file_write_and_wait_range+0x104/0x170
[ 28.143254] blkdev_fsync+0x4a/0x70
[ 28.143846] __x64_sys_fsync+0xe9/0x120
[ 28.144491] do_syscall_64+0x8d/0x130
[ 28.145106] entry_SYSCALL_64_after_hwframe+0x46/0x4e

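A note for readers less familiar with KASAN reports: a null-ptr-deref at a small non-zero address such as 0x28 is what one would expect when code reads a structure member through a NULL pointer, because the faulting address is the NULL base plus the member's offset. The sketch below only illustrates that arithmetic. `struct toy_obj` and `toy_fault_addr()` are hypothetical names with an invented layout; this is not the real `struct zspage` or `struct page`, and the trace alone does not say which pointer was NULL in obj_malloc().

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout, invented for illustration only.
 * It is NOT the real struct zspage or struct page. */
struct toy_obj {
    uint64_t a[5];    /* occupies offsets 0x00..0x27 */
    uint64_t target;  /* 8-byte member at offset 0x28 */
};

/* Compile-time check that the invented layout puts `target` at 0x28. */
static_assert(offsetof(struct toy_obj, target) == 0x28,
              "target is expected at offset 0x28");

/* Address that an 8-byte read of `target` would touch for a given base. */
static uintptr_t toy_fault_addr(uintptr_t base)
{
    return base + offsetof(struct toy_obj, target);
}
```

With a NULL base, `toy_fault_addr(0)` evaluates to 0x28, which has the same shape as the "Read of size 8 at addr 0000000000000028" line in the report.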
Hello,

free_zspage() has to hold locks of all pages, since zs_page_migrate()
path rely on this page lock to protect the race between zs_free() and
it, so it can safely get zspage from page->private.

But this way is not good and simple enough:

1. Since zs_free() couldn't be sleepable, it can only trylock pages,
or has to kick_deferred_free() to defer that to a work.

2. Even in the worker context, async_free_zspage() can't simply
lock all pages in lock_zspage(), it's still trylock because of
the race between zs_free() and zs_page_migrate(). Please see
the commit 2505a981114d ("zsmalloc: fix races between asynchronous
zspage free and page migration") for details.

Actually, all free_zspage() needs is to get zspage from page safely,
we can use RCU to achieve it easily. Then free_zspage() don't need to
hold locks of all pages, so don't need the deferred free mechanism
at all. This patchset implements it and remove all of deferred free
related code.

Thanks for review and comments!

Signed-off-by: Chengming Zhou <zhouchengming@bytedance.com>
---
Chengming Zhou (2):
      mm/zsmalloc: don't hold locks of all pages when free_zspage()
      mm/zsmalloc: remove the deferred free mechanism

 mm/zsmalloc.c | 206 ++++++++++++++++------------------------------------------
 1 file changed, 56 insertions(+), 150 deletions(-)
---
base-commit: ccbd06e764bac9bbf6b4e91c700fe6dd28f08fb3
change-id: 20240226-zsmalloc-zspage-rcu-b2c12f054fb4

Best regards,
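
As an aside for readers, the "trylock all pages, otherwise defer the free" pattern that the cover letter proposes to remove can be sketched in userspace C with pthread mutexes standing in for page locks. This is a toy model under stated assumptions, not the zsmalloc code: every `toy_*` name, `PAGES_PER_ZSPAGE`, and the singly linked deferred list are hypothetical, and the deferred list itself is unsynchronized, so the sketch is only meant to be driven from a single thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGES_PER_ZSPAGE 4            /* hypothetical value for the toy model */

struct toy_page {
    pthread_mutex_t lock;             /* stands in for the page lock */
};

struct toy_zspage {
    struct toy_page pages[PAGES_PER_ZSPAGE];
    struct toy_zspage *next_deferred; /* link in the deferred-free list */
    bool freed;
};

/* Unsynchronized on purpose: the demo is single-threaded. */
static struct toy_zspage *deferred_list;

static void toy_zspage_init(struct toy_zspage *zs)
{
    for (size_t i = 0; i < PAGES_PER_ZSPAGE; i++)
        pthread_mutex_init(&zs->pages[i].lock, NULL);
    zs->next_deferred = NULL;
    zs->freed = false;
}

/* Take every page lock without sleeping; release them all on failure. */
static bool toy_trylock_zspage(struct toy_zspage *zs)
{
    for (size_t i = 0; i < PAGES_PER_ZSPAGE; i++) {
        if (pthread_mutex_trylock(&zs->pages[i].lock) != 0) {
            while (i-- > 0)
                pthread_mutex_unlock(&zs->pages[i].lock);
            return false;
        }
    }
    return true;
}

static void toy_unlock_zspage(struct toy_zspage *zs)
{
    for (size_t i = 0; i < PAGES_PER_ZSPAGE; i++)
        pthread_mutex_unlock(&zs->pages[i].lock);
}

/* Returns true if freed right away, false if the free was deferred. */
static bool toy_free_zspage(struct toy_zspage *zs)
{
    assert(!zs->freed);               /* guard against a double free */
    if (!toy_trylock_zspage(zs)) {
        zs->next_deferred = deferred_list;
        deferred_list = zs;
        return false;
    }
    zs->freed = true;                 /* a real allocator would release memory here */
    toy_unlock_zspage(zs);
    return true;
}

/* Worker-style retry: still uses trylock, so busy entries are re-queued. */
static int toy_drain_deferred(void)
{
    struct toy_zspage *pending = deferred_list;
    int freed = 0;

    deferred_list = NULL;
    while (pending) {
        struct toy_zspage *next = pending->next_deferred;

        pending->next_deferred = NULL;
        if (toy_free_zspage(pending))
            freed++;
        pending = next;
    }
    return freed;
}
```

In this model, holding one page lock (as a concurrent migration might) makes `toy_free_zspage()` return false and queue the zspage; once that lock is dropped, `toy_drain_deferred()` can complete the free. The series argues that looking up the zspage under RCU makes both the all-pages locking and this retry path unnecessary.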