Message ID | 1608207487-30537-2-git-send-email-Kuan-Ying.Lee@mediatek.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | kasan: fix memory leak of kasan quarantine | expand |
On Dec 17, 2020 / 20:18, Kuan-Ying Lee wrote: > When cpu is going offline, set q->offline as true > and interrupt happened. The interrupt may call the > quarantine_put. But quarantine_put do not free the > the object. The object will cause memory leak. > > Add qlink_free() to free the object. > > Fixes: 6c82d45c7f03 (kasan: fix object remaining in offline per-cpu quarantine) > Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> > Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> > Cc: Alexander Potapenko <glider@google.com> > Cc: Dmitry Vyukov <dvyukov@google.com> > Cc: Andrew Morton <akpm@linux-foundation.org> > Cc: Matthias Brugger <matthias.bgg@gmail.com> > Cc: <stable@vger.kernel.org> [5.10-] > --- > mm/kasan/quarantine.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c > index 0e3f8494628f..cac7c617df72 100644 > --- a/mm/kasan/quarantine.c > +++ b/mm/kasan/quarantine.c > @@ -191,6 +191,7 @@ void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache) > > q = this_cpu_ptr(&cpu_quarantine); > if (q->offline) { > + qlink_free(&info->quarantine_link, cache); > local_irq_restore(flags); > return; > } I ran blktests with kernels v5.10.0 and v5.10.3 enabling KASAN, and observed a BUG message [1]. The BUG can be recreated with a test case which hotplugs CPUs during I/O to dm-linear device. The stack trace in the message indicates memory leak was detected when slab of dm-linear bio set was destroyed. I bisected and found the commit 6c82d45c7f03 ("kasan: fix object remaining in offline per-cpu quarantine") triggers the BUG message. I also tried this fix patch by Kuan-Ying and observed that it avoids the BUG message. I suppose the fix is required for v5.10.x. Confirmation by kasan maintainers will be appreciated. I took following steps to recreate the BUG message: 1. Create dm-linear device on top of HDD (dmsetup create). 2. Run blktests' test case block/008 using the dm-linear device. 3. Remove the dm-linear device (dmsetup remove) When I repeat the steps, the BUG message is often observed at the step 3. I repeated the steps with v5.11-rc1 also, and did not observe the BUG. With v5.11-rc1, quarantine_put() returns bool. Then I think the fix patch is not required for v5.11, probably. Wish this report helps for the fix. [1] [ 151.201998] ============================================================================= [ 151.212580] BUG bio-3 (Not tainted): Objects remaining in bio-3 on __kmem_cache_shutdown() [ 151.222321] ----------------------------------------------------------------------------- [ 151.234933] Disabling lock debugging due to kernel taint [ 151.241634] INFO: Slab 0x0000000010690f30 objects=36 used=3 fp=0x00000000e7351615 flags=0x17ffffc0010200 [ 151.252558] CPU: 6 PID: 1996 Comm: dmsetup Tainted: G B 5.10.3 #21 [ 151.261520] Hardware name: Supermicro X10SLL-F/X10SLL-F, BIOS 3.0 04/24/2015 [ 151.270070] Call Trace: [ 151.274029] dump_stack+0x9a/0xcc [ 151.278844] slab_err+0xb7/0xdc [ 151.283505] ? do_raw_spin_lock+0x115/0x240 [ 151.289220] ? rwlock_bug.part.0+0x90/0x90 [ 151.294833] __kmem_cache_shutdown.cold+0x36/0x19f [ 151.301182] kmem_cache_destroy+0x5d/0x110 [ 151.306822] bio_put_slab+0xd3/0x180 [ 151.311952] bioset_exit+0xa5/0x100 [ 151.316989] cleanup_mapped_device+0x5e/0x310 [ 151.322886] free_dev+0xb8/0x210 [ 151.327665] __dm_destroy+0x2e0/0x470 [ 151.332872] ? dm_blk_report_zones+0x2b0/0x2b0 [ 151.338877] ? _raw_spin_unlock+0x1f/0x30 [ 151.344481] dev_remove+0x223/0x2f0 [ 151.349565] ctl_ioctl+0x384/0x970 [ 151.354572] ? remove_all+0x90/0x90 [ 151.359677] ? find_held_lock+0x2c/0x110 [ 151.365211] ? free_params+0x30/0x30 [ 151.370401] ? lockdep_hardirqs_on_prepare+0x273/0x3e0 [ 151.377181] dm_ctl_ioctl+0xa/0x10 [ 151.382224] __x64_sys_ioctl+0x127/0x190 [ 151.387800] do_syscall_64+0x33/0x40 [ 151.393017] entry_SYSCALL_64_after_hwframe+0x44/0xa9 [ 151.399713] RIP: 0033:0x7f6fe944338b [ 151.404947] Code: 89 d8 49 8d 3c 1c 48 f7 d8 49 39 c4 72 b5 e8 1c ff ff ff 85 c0 78 ba 4c 89 e0 5b 5d 41 5c c3 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d bd ba 0c 00 f7 d8 64 89 01 48 [ 151.426499] RSP: 002b:00007ffd00632f08 EFLAGS: 00000202 ORIG_RAX: 0000000000000010 [ 151.435871] RAX: ffffffffffffffda RBX: 00007f6fe9523f30 RCX: 00007f6fe944338b [ 151.444829] RDX: 0000561f9b938440 RSI: 00000000c138fd04 RDI: 0000000000000003 [ 151.453812] RBP: 00007f6fe9560494 R08: 00007f6fe9562de0 R09: 00007ffd00632d70 [ 151.462808] R10: 0000000000000000 R11: 0000000000000202 R12: 0000561f9b936c40 [ 151.471820] R13: 0000561f9b938470 R14: 0000561f9b9384f0 R15: 0000561f9b938440 [ 151.480836] INFO: Object 0x00000000f6e5c796 @offset=5824 [ 151.488063] INFO: Object 0x00000000a14f36d2 @offset=6720 [ 151.495274] INFO: Object 0x000000005bfef957 @offset=9856
diff --git a/mm/kasan/quarantine.c b/mm/kasan/quarantine.c index 0e3f8494628f..cac7c617df72 100644 --- a/mm/kasan/quarantine.c +++ b/mm/kasan/quarantine.c @@ -191,6 +191,7 @@ void quarantine_put(struct kasan_free_meta *info, struct kmem_cache *cache) q = this_cpu_ptr(&cpu_quarantine); if (q->offline) { + qlink_free(&info->quarantine_link, cache); local_irq_restore(flags); return; }
When cpu is going offline, set q->offline as true and interrupt happened. The interrupt may call the quarantine_put. But quarantine_put do not free the the object. The object will cause memory leak. Add qlink_free() to free the object. Fixes: 6c82d45c7f03 (kasan: fix object remaining in offline per-cpu quarantine) Signed-off-by: Kuan-Ying Lee <Kuan-Ying.Lee@mediatek.com> Cc: Andrey Ryabinin <aryabinin@virtuozzo.com> Cc: Alexander Potapenko <glider@google.com> Cc: Dmitry Vyukov <dvyukov@google.com> Cc: Andrew Morton <akpm@linux-foundation.org> Cc: Matthias Brugger <matthias.bgg@gmail.com> Cc: <stable@vger.kernel.org> [5.10-] --- mm/kasan/quarantine.c | 1 + 1 file changed, 1 insertion(+)