| Message ID | 20150703161137.GA10438@codemonkey.org.uk (mailing list archive) |
|---|---|
| State | New, archived |
On 07/03/15 09:11, Dave Jones wrote:
> After a fuzzing run recently, I noticed that the machine had oom'd and
> killed everything, but there was still 3GB of memory in use that I
> couldn't reclaim even with /proc/sys/vm/drop_caches.
>
> So I enabled kmemleak. After applying this..
>
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index cf79f110157c..6dc18dbad9ec 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -553,8 +553,8 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
>
>  	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
>  	if (!object) {
> -		pr_warning("Cannot allocate a kmemleak_object structure\n");
> -		kmemleak_disable();
> +		//pr_warning("Cannot allocate a kmemleak_object structure\n");
> +		//kmemleak_disable();
>  		return NULL;
>  	}
>
> otherwise it would disable itself within a minute of runtime.
>
> I notice now that I'm seeing a lot of traces like this..
>
> unreferenced object 0xffff8800ba8202c0 (size 320):
>   comm "kworker/u4:1", pid 38, jiffies 4294741176 (age 46887.690s)
>   hex dump (first 32 bytes):
>     21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff8969b80e>] kmemleak_alloc+0x4e/0xb0
>     [<ffffffff891b3e37>] kmem_cache_alloc+0x107/0x200
>     [<ffffffff8916528d>] mempool_alloc_slab+0x1d/0x30
>     [<ffffffff89165963>] mempool_alloc+0x63/0x180
>     [<ffffffff8945f85a>] scsi_sg_alloc+0x4a/0x50
>     [<ffffffff89323f0e>] __sg_alloc_table+0x11e/0x180
>     [<ffffffff8945dc03>] scsi_alloc_sgtable+0x43/0x90
>     [<ffffffff8945dc81>] scsi_init_sgtable+0x31/0x80
>     [<ffffffff8945dd1a>] scsi_init_io+0x4a/0x1c0
>     [<ffffffff8946da59>] sd_init_command+0x59/0xe40
>     [<ffffffff8945df81>] scsi_setup_cmnd+0xf1/0x160
>     [<ffffffff8945e75c>] scsi_queue_rq+0x57c/0x6a0
>     [<ffffffff892f60b8>] __blk_mq_run_hw_queue+0x1d8/0x390
>     [<ffffffff892f5e5e>] blk_mq_run_hw_queue+0x9e/0x120
>     [<ffffffff892f7524>] blk_mq_insert_requests+0xd4/0x1a0
>     [<ffffffff892f8273>] blk_mq_flush_plug_list+0x123/0x140
>
> unreferenced object 0xffff8800ba824800 (size 640):
>   comm "trinity-c2", pid 3687, jiffies 4294843075 (age 46785.966s)
>   hex dump (first 32 bytes):
>     21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff8969b80e>] kmemleak_alloc+0x4e/0xb0
>     [<ffffffff891b3e37>] kmem_cache_alloc+0x107/0x200
>     [<ffffffff8916528d>] mempool_alloc_slab+0x1d/0x30
>     [<ffffffff89165963>] mempool_alloc+0x63/0x180
>     [<ffffffff8945f85a>] scsi_sg_alloc+0x4a/0x50
>     [<ffffffff89323f0e>] __sg_alloc_table+0x11e/0x180
>     [<ffffffff8945dc03>] scsi_alloc_sgtable+0x43/0x90
>     [<ffffffff8945dc81>] scsi_init_sgtable+0x31/0x80
>     [<ffffffff8945dd1a>] scsi_init_io+0x4a/0x1c0
>     [<ffffffff8946da59>] sd_init_command+0x59/0xe40
>     [<ffffffff8945df81>] scsi_setup_cmnd+0xf1/0x160
>     [<ffffffff8945e75c>] scsi_queue_rq+0x57c/0x6a0
>     [<ffffffff892f60b8>] __blk_mq_run_hw_queue+0x1d8/0x390
>     [<ffffffff892f5e5e>] blk_mq_run_hw_queue+0x9e/0x120
>     [<ffffffff892f7524>] blk_mq_insert_requests+0xd4/0x1a0
>     [<ffffffff892f8273>] blk_mq_flush_plug_list+0x123/0x140
>
> unreferenced object 0xffff8800a9fe6780 (size 2560):
>   comm "kworker/1:1H", pid 171, jiffies 4294843118 (age 46785.923s)
>   hex dump (first 32 bytes):
>     21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
>     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>   backtrace:
>     [<ffffffff8969b80e>] kmemleak_alloc+0x4e/0xb0
>     [<ffffffff891b3e37>] kmem_cache_alloc+0x107/0x200
>     [<ffffffff8916528d>] mempool_alloc_slab+0x1d/0x30
>     [<ffffffff89165963>] mempool_alloc+0x63/0x180
>     [<ffffffff8945f85a>] scsi_sg_alloc+0x4a/0x50
>     [<ffffffff89323f0e>] __sg_alloc_table+0x11e/0x180
>     [<ffffffff8945dc03>] scsi_alloc_sgtable+0x43/0x90
>     [<ffffffff8945dc81>] scsi_init_sgtable+0x31/0x80
>     [<ffffffff8945dd1a>] scsi_init_io+0x4a/0x1c0
>     [<ffffffff8946da59>] sd_init_command+0x59/0xe40
>     [<ffffffff8945df81>] scsi_setup_cmnd+0xf1/0x160
>     [<ffffffff8945e75c>] scsi_queue_rq+0x57c/0x6a0
>     [<ffffffff892f60b8>] __blk_mq_run_hw_queue+0x1d8/0x390
>     [<ffffffff892f66b2>] blk_mq_run_work_fn+0x12/0x20
>     [<ffffffff8908eba7>] process_one_work+0x147/0x420
>     [<ffffffff8908f209>] worker_thread+0x69/0x470
>
> The sizes vary, but the hex dump is always the same.
>
> What's the usual completion path where these would get deallocated?
> I'm wondering if there's just some annotation missing to appease kmemleak,
> because I'm seeing thousands of these.
>
> Or it could be a real leak, but it seems surprising that no-one else is
> complaining.

(+Catalin)

Dave, with which kernel version has this behavior been observed?

Catalin, can you recommend which patches Dave Jones should apply to
kmemleak? A few weeks ago I had noticed similar kmemleak reports.
However, when I reran my test with kmemleak disabled, memory usage was
stable. See also
https://www.redhat.com/archives/dm-devel/2015-May/msg00198.html.

Thanks,

Bart.
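For readers following along, both the reclaim attempt and the kmemleak run above use standard, documented interfaces. A minimal sketch of that workflow, assuming a kernel built with CONFIG_DEBUG_KMEMLEAK=y and debugfs mounted at the usual /sys/kernel/debug:

```sh
# Drop clean page cache, dentries and inodes -- the reclaim Dave tried first:
sync
echo 3 > /proc/sys/vm/drop_caches

# Basic kmemleak workflow:
echo scan  > /sys/kernel/debug/kmemleak   # trigger an immediate memory scan
cat /sys/kernel/debug/kmemleak            # read the "unreferenced object" reports
echo clear > /sys/kernel/debug/kmemleak   # forget current reports before re-testing
```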
On Fri, Jul 03, 2015 at 10:04:00AM -0700, Bart Van Assche wrote:
> On 07/03/15 09:11, Dave Jones wrote:
> > After a fuzzing run recently, I noticed that the machine had oom'd and
> > killed everything, but there was still 3GB of memory in use that I
> > couldn't reclaim even with /proc/sys/vm/drop_caches.
> ...
> > I'm wondering if there's just some annotation missing to appease kmemleak,
> > because I'm seeing thousands of these.
> >
> > Or it could be a real leak, but it seems surprising that no-one else is
> > complaining.
>
> Dave, with which kernel version has this behavior been observed?

Linus' current tree.

	Dave
On Fri, Jul 03, 2015 at 06:04:00PM +0100, Bart Van Assche wrote:
> On 07/03/15 09:11, Dave Jones wrote:
> > After a fuzzing run recently, I noticed that the machine had oom'd and
> > killed everything, but there was still 3GB of memory in use that I
> > couldn't reclaim even with /proc/sys/vm/drop_caches.
> >
> > So I enabled kmemleak. After applying this..
> >
> > diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> > index cf79f110157c..6dc18dbad9ec 100644
> > --- a/mm/kmemleak.c
> > +++ b/mm/kmemleak.c
> > @@ -553,8 +553,8 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
> >
> >  	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
> >  	if (!object) {
> > -		pr_warning("Cannot allocate a kmemleak_object structure\n");
> > -		kmemleak_disable();
> > +		//pr_warning("Cannot allocate a kmemleak_object structure\n");
> > +		//kmemleak_disable();
> >  		return NULL;
> >  	}
> >
> > otherwise it would disable itself within a minute of runtime.
> >
> > I notice now that I'm seeing a lot of traces like this..
> >
> > unreferenced object 0xffff8800ba8202c0 (size 320):
> >   comm "kworker/u4:1", pid 38, jiffies 4294741176 (age 46887.690s)
> >   hex dump (first 32 bytes):
> >     21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
> >     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

If kmemleak fails to allocate its metadata, the object reference graph is
broken and whatever kmemleak prints afterwards can no longer be trusted
(e.g. if objects sit in a linked list and a block in the middle of the
list is one kmemleak doesn't know about and won't scan, the blocks that
block references will be reported as leaks).

After kmemleak disables itself, does it still show any potential leaks?
Even that output is not entirely to be trusted. A better start would be
to look at /proc/slabinfo.

> Catalin, can you recommend which patches Dave Jones should apply to
> kmemleak?

All the kmemleak patches I had are in mainline (as of 4.2-rc1), but none
of them targets leaks (or false positives).

> A few weeks ago I had noticed similar kmemleak reports.
> However, when I reran my test with kmemleak disabled, memory usage was
> stable. See also
> https://www.redhat.com/archives/dm-devel/2015-May/msg00198.html.

Further down in that thread it gets interesting, with kmalloc-96 objects
disappearing from /proc/slabinfo once kmemleak is disabled. Kmemleak does
not allocate such objects itself; the two object types it allocates live
in their own kmem_caches, kmemleak_object and kmemleak_scan_area. However,
the 96-byte size is reported by kmemleak as well, so maybe it was fixed by
some other patch in the meantime?

If you manage to reproduce it again, please let me know. Thanks.
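Catalin's suggestion to start from /proc/slabinfo amounts to watching per-cache object counts across the workload. A sketch (reading /proc/slabinfo requires root, and cache names vary with the kernel config):

```sh
# Snapshot object counts before and after the workload, then compare.
# Columns 2 and 3 of each slabinfo line are <active_objs> and <num_objs>.
grep -E '^(kmalloc-96|kmemleak_object) ' /proc/slabinfo > slabinfo.before
# ... run the fuzzer / I/O workload here ...
grep -E '^(kmalloc-96|kmemleak_object) ' /proc/slabinfo > slabinfo.after
diff slabinfo.before slabinfo.after

# Or watch the suspect cache grow live:
watch -n 5 'grep ^kmalloc-96 /proc/slabinfo'
```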
On 07/07/15 03:33, Catalin Marinas wrote:
> On Fri, Jul 03, 2015 at 06:04:00PM +0100, Bart Van Assche wrote:
>> A few weeks ago I had noticed similar kmemleak reports.
>> However, when I reran my test with kmemleak disabled, memory usage was
>> stable. See also
>> https://www.redhat.com/archives/dm-devel/2015-May/msg00198.html.
>
> Further down in that thread it gets interesting, with kmalloc-96 objects
> disappearing from /proc/slabinfo once kmemleak is disabled. Kmemleak does
> not allocate such objects itself; the two object types it allocates live
> in their own kmem_caches, kmemleak_object and kmemleak_scan_area. However,
> the 96-byte size is reported by kmemleak as well, so maybe it was fixed by
> some other patch in the meantime?
>
> If you manage to reproduce it again, please let me know. Thanks.

Hello Catalin,

Please note that my test was run with CONFIG_SLUB_DEBUG=y, which causes a
red zone to be allocated before and after each block of allocated memory.
Could that explain the kmalloc-96 objects?

Thanks,

Bart.
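As an aside, whether red-zoning is actually active for a given cache can be checked at runtime through SLUB's sysfs attributes. A sketch, noting that CONFIG_SLUB_DEBUG only compiles in the support; the checks themselves are usually switched on with the slub_debug= boot parameter (or CONFIG_SLUB_DEBUG_ON):

```sh
# Prints 1 if red zones are placed around each object of this cache:
cat /sys/kernel/slab/kmalloc-96/red_zone

# Red-zoning is typically enabled at boot, e.g.:
#   slub_debug=Z               red-zone all caches
#   slub_debug=Z,kmalloc-96    red-zone only the kmalloc-96 cache
```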
On Tue, Jul 07, 2015 at 06:59:37AM -0700, Bart Van Assche wrote:
> Please note that my test was run with CONFIG_SLUB_DEBUG=y, which causes a
> red zone to be allocated before and after each block of allocated memory.
> Could that explain the kmalloc-96 objects?

96 is almost guaranteed to be the sense buffer allocated in
scsi_init_request and freed in scsi_exit_request.
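For context, the init/exit pairing Christoph refers to looked roughly like the sketch below in kernels of that era (simplified, not the exact source). SCSI_SENSE_BUFFERSIZE is 96 bytes, which is why these allocations land in the kmalloc-96 cache, and blk-mq invokes the pair once per pre-allocated request when the tag set is created and torn down, not per I/O:

```c
#include <linux/blk-mq.h>
#include <linux/slab.h>
#include <scsi/scsi_cmnd.h>

/* Called by blk-mq for every request in the tag set at allocation time. */
static int scsi_init_request(void *data, struct request *rq,
			     unsigned int hctx_idx, unsigned int request_idx,
			     unsigned int numa_node)
{
	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq);

	/* SCSI_SENSE_BUFFERSIZE == 96, hence the kmalloc-96 objects. */
	cmd->sense_buffer = kzalloc_node(SCSI_SENSE_BUFFERSIZE, GFP_KERNEL,
					 numa_node);
	return cmd->sense_buffer ? 0 : -ENOMEM;
}

/* Called by blk-mq for every request when the tag set is freed. */
static void scsi_exit_request(void *data, struct request *rq,
			      unsigned int hctx_idx, unsigned int request_idx)
{
	struct scsi_cmnd *cmd = blk_mq_rq_to_pdu(rq);

	kfree(cmd->sense_buffer);
}
```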
On 07/08/2015 01:17 AM, Christoph Hellwig wrote:
> On Tue, Jul 07, 2015 at 06:59:37AM -0700, Bart Van Assche wrote:
>> Please note that my test was run with CONFIG_SLUB_DEBUG=y, which causes a
>> red zone to be allocated before and after each block of allocated memory.
>> Could that explain the kmalloc-96 objects?
>
> 96 is almost guaranteed to be the sense buffer allocated in
> scsi_init_request and freed in scsi_exit_request.

Hello Catalin and Christoph,

kmemleak still reports large numbers of unreferenced objects for the
scsi-mq code in v4.2-rc4, even with the recently posted scsi-mq leak fix
applied on top. Here is an example of one such report:

unreferenced object 0xffff88045e05dc28 (size 96):
  comm "srp_daemon", pid 8461, jiffies 4294973034 (age 742.350s)
  hex dump (first 32 bytes):
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff814f2ede>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff811b0678>] kmem_cache_alloc_trace+0xc8/0x2d0
    [<ffffffffa006cc37>] scsi_init_request+0x27/0x40 [scsi_mod]
    [<ffffffff81278b91>] blk_mq_init_rq_map+0x1d1/0x260
    [<ffffffff81278cc4>] blk_mq_alloc_tag_set+0xa4/0x1f0
    [<ffffffffa006fb0d>] scsi_mq_setup_tags+0xcd/0xd0 [scsi_mod]
    [<ffffffffa0066464>] scsi_add_host_with_dma+0x74/0x2e0 [scsi_mod]
    [<ffffffffa0478e12>] srp_create_target+0xe12/0x1320 [ib_srp]
    [<ffffffff8138a728>] dev_attr_store+0x18/0x30
    [<ffffffff812371f8>] sysfs_kf_write+0x48/0x60
    [<ffffffff812367f4>] kernfs_fop_write+0x144/0x190
    [<ffffffff811bdaf8>] __vfs_write+0x28/0xe0
    [<ffffffff811be1a9>] vfs_write+0xa9/0x190
    [<ffffffff811bef09>] SyS_write+0x49/0xa0
    [<ffffffff815022f2>] entry_SYSCALL_64_fastpath+0x16/0x7a
    [<ffffffffffffffff>] 0xffffffffffffffff

Thanks,

Bart.
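One way to dig into a single report like the one above is kmemleak's object-dump command, which prints everything kmemleak knows about the object at a given address into the kernel log. A sketch, reusing the address from Bart's report (substitute any address from a current report):

```sh
# Ask kmemleak to dump its metadata for one tracked object:
echo dump=0xffff88045e05dc28 > /sys/kernel/debug/kmemleak
dmesg | tail -n 40
```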
diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index cf79f110157c..6dc18dbad9ec 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -553,8 +553,8 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
 
 	object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
 	if (!object) {
-		pr_warning("Cannot allocate a kmemleak_object structure\n");
-		kmemleak_disable();
+		//pr_warning("Cannot allocate a kmemleak_object structure\n");
+		//kmemleak_disable();
 		return NULL;
 	}