
blk-mq vs kmemleak

Message ID 20150703161137.GA10438@codemonkey.org.uk (mailing list archive)
State New, archived

Commit Message

Dave Jones July 3, 2015, 4:11 p.m. UTC
After a fuzzing run recently, I noticed that the machine had oom'd and
killed everything, but there was still 3GB of memory in use that
I couldn't reclaim even with /proc/sys/vm/drop_caches.

So I enabled kmemleak, after applying this patch (otherwise it would
disable itself within a minute of runtime):

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index cf79f110157c..6dc18dbad9ec 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -553,8 +553,8 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,

        object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
        if (!object) {
-               pr_warning("Cannot allocate a kmemleak_object structure\n");
-               kmemleak_disable();
+               //pr_warning("Cannot allocate a kmemleak_object structure\n");
+               //kmemleak_disable();
                return NULL;
        }

I notice now that I'm seeing a lot of traces like this..


unreferenced object 0xffff8800ba8202c0 (size 320):
  comm "kworker/u4:1", pid 38, jiffies 4294741176 (age 46887.690s)
  hex dump (first 32 bytes):
    21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8969b80e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff891b3e37>] kmem_cache_alloc+0x107/0x200
    [<ffffffff8916528d>] mempool_alloc_slab+0x1d/0x30
    [<ffffffff89165963>] mempool_alloc+0x63/0x180
    [<ffffffff8945f85a>] scsi_sg_alloc+0x4a/0x50
    [<ffffffff89323f0e>] __sg_alloc_table+0x11e/0x180
    [<ffffffff8945dc03>] scsi_alloc_sgtable+0x43/0x90
    [<ffffffff8945dc81>] scsi_init_sgtable+0x31/0x80
    [<ffffffff8945dd1a>] scsi_init_io+0x4a/0x1c0
    [<ffffffff8946da59>] sd_init_command+0x59/0xe40
    [<ffffffff8945df81>] scsi_setup_cmnd+0xf1/0x160
    [<ffffffff8945e75c>] scsi_queue_rq+0x57c/0x6a0
    [<ffffffff892f60b8>] __blk_mq_run_hw_queue+0x1d8/0x390
    [<ffffffff892f5e5e>] blk_mq_run_hw_queue+0x9e/0x120
    [<ffffffff892f7524>] blk_mq_insert_requests+0xd4/0x1a0
    [<ffffffff892f8273>] blk_mq_flush_plug_list+0x123/0x140

unreferenced object 0xffff8800ba824800 (size 640):
  comm "trinity-c2", pid 3687, jiffies 4294843075 (age 46785.966s)
  hex dump (first 32 bytes):
    21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8969b80e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff891b3e37>] kmem_cache_alloc+0x107/0x200
    [<ffffffff8916528d>] mempool_alloc_slab+0x1d/0x30
    [<ffffffff89165963>] mempool_alloc+0x63/0x180
    [<ffffffff8945f85a>] scsi_sg_alloc+0x4a/0x50
    [<ffffffff89323f0e>] __sg_alloc_table+0x11e/0x180
    [<ffffffff8945dc03>] scsi_alloc_sgtable+0x43/0x90
    [<ffffffff8945dc81>] scsi_init_sgtable+0x31/0x80
    [<ffffffff8945dd1a>] scsi_init_io+0x4a/0x1c0
    [<ffffffff8946da59>] sd_init_command+0x59/0xe40
    [<ffffffff8945df81>] scsi_setup_cmnd+0xf1/0x160
    [<ffffffff8945e75c>] scsi_queue_rq+0x57c/0x6a0
    [<ffffffff892f60b8>] __blk_mq_run_hw_queue+0x1d8/0x390
    [<ffffffff892f5e5e>] blk_mq_run_hw_queue+0x9e/0x120
    [<ffffffff892f7524>] blk_mq_insert_requests+0xd4/0x1a0
    [<ffffffff892f8273>] blk_mq_flush_plug_list+0x123/0x140

unreferenced object 0xffff8800a9fe6780 (size 2560):
  comm "kworker/1:1H", pid 171, jiffies 4294843118 (age 46785.923s)
  hex dump (first 32 bytes):
    21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
    00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
  backtrace:
    [<ffffffff8969b80e>] kmemleak_alloc+0x4e/0xb0
    [<ffffffff891b3e37>] kmem_cache_alloc+0x107/0x200
    [<ffffffff8916528d>] mempool_alloc_slab+0x1d/0x30
    [<ffffffff89165963>] mempool_alloc+0x63/0x180
    [<ffffffff8945f85a>] scsi_sg_alloc+0x4a/0x50
    [<ffffffff89323f0e>] __sg_alloc_table+0x11e/0x180
    [<ffffffff8945dc03>] scsi_alloc_sgtable+0x43/0x90
    [<ffffffff8945dc81>] scsi_init_sgtable+0x31/0x80
    [<ffffffff8945dd1a>] scsi_init_io+0x4a/0x1c0
    [<ffffffff8946da59>] sd_init_command+0x59/0xe40
    [<ffffffff8945df81>] scsi_setup_cmnd+0xf1/0x160
    [<ffffffff8945e75c>] scsi_queue_rq+0x57c/0x6a0
    [<ffffffff892f60b8>] __blk_mq_run_hw_queue+0x1d8/0x390
    [<ffffffff892f66b2>] blk_mq_run_work_fn+0x12/0x20
    [<ffffffff8908eba7>] process_one_work+0x147/0x420
    [<ffffffff8908f209>] worker_thread+0x69/0x470

The sizes vary, but the hex dump is always the same.

What's the usual completion path where these would get deallocated?
I'm wondering if there's just some annotation missing to appease kmemleak,
because I'm seeing thousands of these.

Or it could be a real leak, but it seems surprising no-one else is complaining.
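(If it does turn out to be a false positive, the usual fix would be one of the annotations from include/linux/kmemleak.h applied at the allocation site. A sketch only; the wrapper function and its placement are hypothetical, the real site would be somewhere like scsi_sg_alloc:)

```c
#include <linux/kmemleak.h>
#include <linux/mempool.h>
#include <linux/scatterlist.h>

/* Hypothetical example: tell kmemleak that a mempool-backed scatterlist
 * allocation is not a leak, even when no scanned object points at it. */
static struct scatterlist *example_sg_alloc(mempool_t *pool, gfp_t gfp)
{
	struct scatterlist *sgl = mempool_alloc(pool, gfp);

	if (sgl)
		kmemleak_not_leak(sgl);	/* or kmemleak_ignore(sgl) to also
					 * skip scanning its contents */
	return sgl;
}
```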

	Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Bart Van Assche July 3, 2015, 5:04 p.m. UTC | #1
On 07/03/15 09:11, Dave Jones wrote:
> After a fuzzing run recently, I noticed that the machine had oom'd and
> killed everything, but there was still 3GB of memory in use that
> I couldn't reclaim even with /proc/sys/vm/drop_caches.
>
> So I enabled kmemleak. After applying this..
>
> diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> index cf79f110157c..6dc18dbad9ec 100644
> --- a/mm/kmemleak.c
> +++ b/mm/kmemleak.c
> @@ -553,8 +553,8 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
>
>          object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
>          if (!object) {
> -               pr_warning("Cannot allocate a kmemleak_object structure\n");
> -               kmemleak_disable();
> +               //pr_warning("Cannot allocate a kmemleak_object structure\n");
> +               //kmemleak_disable();
>                  return NULL;
>          }
>
> otherwise it would disable itself within a minute of runtime.
>
> I notice now that I'm seeing a lot of traces like this..
>
>
> unreferenced object 0xffff8800ba8202c0 (size 320):
>    comm "kworker/u4:1", pid 38, jiffies 4294741176 (age 46887.690s)
>    hex dump (first 32 bytes):
>      21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>    backtrace:
>      [<ffffffff8969b80e>] kmemleak_alloc+0x4e/0xb0
>      [<ffffffff891b3e37>] kmem_cache_alloc+0x107/0x200
>      [<ffffffff8916528d>] mempool_alloc_slab+0x1d/0x30
>      [<ffffffff89165963>] mempool_alloc+0x63/0x180
>      [<ffffffff8945f85a>] scsi_sg_alloc+0x4a/0x50
>      [<ffffffff89323f0e>] __sg_alloc_table+0x11e/0x180
>      [<ffffffff8945dc03>] scsi_alloc_sgtable+0x43/0x90
>      [<ffffffff8945dc81>] scsi_init_sgtable+0x31/0x80
>      [<ffffffff8945dd1a>] scsi_init_io+0x4a/0x1c0
>      [<ffffffff8946da59>] sd_init_command+0x59/0xe40
>      [<ffffffff8945df81>] scsi_setup_cmnd+0xf1/0x160
>      [<ffffffff8945e75c>] scsi_queue_rq+0x57c/0x6a0
>      [<ffffffff892f60b8>] __blk_mq_run_hw_queue+0x1d8/0x390
>      [<ffffffff892f5e5e>] blk_mq_run_hw_queue+0x9e/0x120
>      [<ffffffff892f7524>] blk_mq_insert_requests+0xd4/0x1a0
>      [<ffffffff892f8273>] blk_mq_flush_plug_list+0x123/0x140
>
> unreferenced object 0xffff8800ba824800 (size 640):
>    comm "trinity-c2", pid 3687, jiffies 4294843075 (age 46785.966s)
>    hex dump (first 32 bytes):
>      21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>    backtrace:
>      [<ffffffff8969b80e>] kmemleak_alloc+0x4e/0xb0
>      [<ffffffff891b3e37>] kmem_cache_alloc+0x107/0x200
>      [<ffffffff8916528d>] mempool_alloc_slab+0x1d/0x30
>      [<ffffffff89165963>] mempool_alloc+0x63/0x180
>      [<ffffffff8945f85a>] scsi_sg_alloc+0x4a/0x50
>      [<ffffffff89323f0e>] __sg_alloc_table+0x11e/0x180
>      [<ffffffff8945dc03>] scsi_alloc_sgtable+0x43/0x90
>      [<ffffffff8945dc81>] scsi_init_sgtable+0x31/0x80
>      [<ffffffff8945dd1a>] scsi_init_io+0x4a/0x1c0
>      [<ffffffff8946da59>] sd_init_command+0x59/0xe40
>      [<ffffffff8945df81>] scsi_setup_cmnd+0xf1/0x160
>      [<ffffffff8945e75c>] scsi_queue_rq+0x57c/0x6a0
>      [<ffffffff892f60b8>] __blk_mq_run_hw_queue+0x1d8/0x390
>      [<ffffffff892f5e5e>] blk_mq_run_hw_queue+0x9e/0x120
>      [<ffffffff892f7524>] blk_mq_insert_requests+0xd4/0x1a0
>      [<ffffffff892f8273>] blk_mq_flush_plug_list+0x123/0x140
>
> unreferenced object 0xffff8800a9fe6780 (size 2560):
>    comm "kworker/1:1H", pid 171, jiffies 4294843118 (age 46785.923s)
>    hex dump (first 32 bytes):
>      21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
>      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
>    backtrace:
>      [<ffffffff8969b80e>] kmemleak_alloc+0x4e/0xb0
>      [<ffffffff891b3e37>] kmem_cache_alloc+0x107/0x200
>      [<ffffffff8916528d>] mempool_alloc_slab+0x1d/0x30
>      [<ffffffff89165963>] mempool_alloc+0x63/0x180
>      [<ffffffff8945f85a>] scsi_sg_alloc+0x4a/0x50
>      [<ffffffff89323f0e>] __sg_alloc_table+0x11e/0x180
>      [<ffffffff8945dc03>] scsi_alloc_sgtable+0x43/0x90
>      [<ffffffff8945dc81>] scsi_init_sgtable+0x31/0x80
>      [<ffffffff8945dd1a>] scsi_init_io+0x4a/0x1c0
>      [<ffffffff8946da59>] sd_init_command+0x59/0xe40
>      [<ffffffff8945df81>] scsi_setup_cmnd+0xf1/0x160
>      [<ffffffff8945e75c>] scsi_queue_rq+0x57c/0x6a0
>      [<ffffffff892f60b8>] __blk_mq_run_hw_queue+0x1d8/0x390
>      [<ffffffff892f66b2>] blk_mq_run_work_fn+0x12/0x20
>      [<ffffffff8908eba7>] process_one_work+0x147/0x420
>      [<ffffffff8908f209>] worker_thread+0x69/0x470
>
> The sizes vary, but the hex dump is always the same.
>
> What's the usual completion path where these would get deallocated ?
> I'm wondering if there's just some annotation missing to appease kmemleak,
> because I'm seeing thousands of these.
>
> Or it could be a real leak, but it seems surprising no-one else is complaining.

(+Catalin)

Dave, with which kernel version has this behavior been observed?

Catalin, can you recommend which patches Dave Jones should apply to 
kmemleak? A few weeks ago I had noticed similar kmemleak reports. 
However, when I reran my test with kmemleak disabled memory usage was 
stable. See also 
https://www.redhat.com/archives/dm-devel/2015-May/msg00198.html.

Thanks,

Bart.

Dave Jones July 3, 2015, 5:07 p.m. UTC | #2
On Fri, Jul 03, 2015 at 10:04:00AM -0700, Bart Van Assche wrote:
 > On 07/03/15 09:11, Dave Jones wrote:
 > > After a fuzzing run recently, I noticed that the machine had oom'd and
 > > killed everything, but there was still 3GB of memory in use that
 > > I couldn't reclaim even with /proc/sys/vm/drop_caches.
 > > ...
 > > I'm wondering if there's just some annotation missing to appease kmemleak,
 > > because I'm seeing thousands of these.
 > >
 > > Or it could be a real leak, but it seems surprising no-one else is complaining.
 > 
 > Dave, with which kernel version has this behavior been observed?

Linus' current tree

	Dave

Catalin Marinas July 7, 2015, 10:33 a.m. UTC | #3
On Fri, Jul 03, 2015 at 06:04:00PM +0100, Bart Van Assche wrote:
> On 07/03/15 09:11, Dave Jones wrote:
> > After a fuzzing run recently, I noticed that the machine had oom'd and
> > killed everything, but there was still 3GB of memory in use that
> > I couldn't reclaim even with /proc/sys/vm/drop_caches.
> >
> > So I enabled kmemleak. After applying this..
> >
> > diff --git a/mm/kmemleak.c b/mm/kmemleak.c
> > index cf79f110157c..6dc18dbad9ec 100644
> > --- a/mm/kmemleak.c
> > +++ b/mm/kmemleak.c
> > @@ -553,8 +553,8 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
> >
> >          object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
> >          if (!object) {
> > -               pr_warning("Cannot allocate a kmemleak_object structure\n");
> > -               kmemleak_disable();
> > +               //pr_warning("Cannot allocate a kmemleak_object structure\n");
> > +               //kmemleak_disable();
> >                  return NULL;
> >          }
> >
> > otherwise it would disable itself within a minute of runtime.
> >
> > I notice now that I'm seeing a lot of traces like this..
> >
> >
> > unreferenced object 0xffff8800ba8202c0 (size 320):
> >    comm "kworker/u4:1", pid 38, jiffies 4294741176 (age 46887.690s)
> >    hex dump (first 32 bytes):
> >      21 43 65 87 00 00 00 00 00 00 00 00 00 00 00 00  !Ce.............
> >      00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................

If kmemleak fails to allocate its metadata, the object referencing graph
is broken, so whatever kmemleak prints afterwards can no longer be
trusted (e.g. objects in a linked list and there is a block in the
middle of the list that kmemleak doesn't know about and won't scan; the
blocks referenced by it will be reported as leaks).

After kmemleak disables itself, does it still show any potential leaks?
Even that is not entirely to be trusted. A better start would be to look
at /proc/slabinfo.

> Catalin, can you recommend which patches Dave Jones should apply to 
> kmemleak?

All kmemleak patches I had are in mainline (as of 4.2-rc1) but none of
them targets leaks (or false positives).

> A few weeks ago I had noticed similar kmemleak reports. 
> However, when I reran my test with kmemleak disabled memory usage was 
> stable. See also 
> https://www.redhat.com/archives/dm-devel/2015-May/msg00198.html.

Further down in this thread it gets interesting with kmalloc-96 objects
disappearing in /proc/slabinfo with kmemleak disabled. Kmemleak does not
allocate such objects, the two types it allocates are in separate
kmem_cache as kmemleak_object and kmemleak_scan_area. However, the 96
size is reported by kmemleak as well, so maybe it was fixed by some
other patch in the meantime?

If you manage to reproduce it again please let me know. Thanks.
Bart Van Assche July 7, 2015, 1:59 p.m. UTC | #4
On 07/07/15 03:33, Catalin Marinas wrote:
> On Fri, Jul 03, 2015 at 06:04:00PM +0100, Bart Van Assche wrote:
>> A few weeks ago I had noticed similar kmemleak reports.
>> However, when I reran my test with kmemleak disabled memory usage was
>> stable. See also
>> https://www.redhat.com/archives/dm-devel/2015-May/msg00198.html.
>
> Further down in this thread it gets interesting with kmalloc-96 objects
> disappearing in /proc/slabinfo with kmemleak disabled. Kmemleak does not
> allocate such objects, the two types it allocates are in separate
> kmem_cache as kmemleak_object and kmemleak_scan_area. However, the 96
> size is reported by kmemleak as well, so maybe it was fixed by some
> other patch in the meantime?
>
> If you manage to reproduce it again please let me know. Thanks.

Hello Catalin,

Please note that my test was run with CONFIG_SLUB_DEBUG=y which causes a 
red zone to be allocated before and after each block of allocated 
memory. Could that explain the kmalloc-96 objects?

Thanks,

Bart.
Christoph Hellwig July 8, 2015, 8:17 a.m. UTC | #5
On Tue, Jul 07, 2015 at 06:59:37AM -0700, Bart Van Assche wrote:
> Please note that my test was run with CONFIG_SLUB_DEBUG=y which causes a red
> zone to be allocated before and after each block of allocated memory. Could
> that explain the kmalloc-96 objects?

96 is almost guaranteed to be the sense buffer allocated in
scsi_init_request and freed in scsi_exit_request.
Bart Van Assche Aug. 1, 2015, 12:37 a.m. UTC | #6
On 07/08/2015 01:17 AM, Christoph Hellwig wrote:
> On Tue, Jul 07, 2015 at 06:59:37AM -0700, Bart Van Assche wrote:
>> Please note that my test was run with CONFIG_SLUB_DEBUG=y which causes a red
>> zone to be allocated before and after each block of allocated memory. Could
>> that explain the kmalloc-96 objects?
>
> 96 is almost guaranteed to be the sense buffer allocated in
> scsi_init_request and freed in scsi_exit_request.

Hello Catalin and Christoph,

kmemleak still reports large numbers of unreferenced objects for the 
scsi-mq code with the v4.2-rc4 kernel, even with the recently posted 
scsi-mq leak fix applied. Here is an example of one such report:

unreferenced object 0xffff88045e05dc28 (size 96):
   comm "srp_daemon", pid 8461, jiffies 4294973034 (age 742.350s)
   hex dump (first 32 bytes):
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
     00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  ................
   backtrace:
     [<ffffffff814f2ede>] kmemleak_alloc+0x4e/0xb0
     [<ffffffff811b0678>] kmem_cache_alloc_trace+0xc8/0x2d0
     [<ffffffffa006cc37>] scsi_init_request+0x27/0x40 [scsi_mod]
     [<ffffffff81278b91>] blk_mq_init_rq_map+0x1d1/0x260
     [<ffffffff81278cc4>] blk_mq_alloc_tag_set+0xa4/0x1f0
     [<ffffffffa006fb0d>] scsi_mq_setup_tags+0xcd/0xd0 [scsi_mod]
     [<ffffffffa0066464>] scsi_add_host_with_dma+0x74/0x2e0 [scsi_mod]
     [<ffffffffa0478e12>] srp_create_target+0xe12/0x1320 [ib_srp]
     [<ffffffff8138a728>] dev_attr_store+0x18/0x30
     [<ffffffff812371f8>] sysfs_kf_write+0x48/0x60
     [<ffffffff812367f4>] kernfs_fop_write+0x144/0x190
     [<ffffffff811bdaf8>] __vfs_write+0x28/0xe0
     [<ffffffff811be1a9>] vfs_write+0xa9/0x190
     [<ffffffff811bef09>] SyS_write+0x49/0xa0
     [<ffffffff815022f2>] entry_SYSCALL_64_fastpath+0x16/0x7a
     [<ffffffffffffffff>] 0xffffffffffffffff

Thanks,

Bart.

Patch

diff --git a/mm/kmemleak.c b/mm/kmemleak.c
index cf79f110157c..6dc18dbad9ec 100644
--- a/mm/kmemleak.c
+++ b/mm/kmemleak.c
@@ -553,8 +553,8 @@ static struct kmemleak_object *create_object(unsigned long ptr, size_t size,
 
        object = kmem_cache_alloc(object_cache, gfp_kmemleak_mask(gfp));
        if (!object) {
-               pr_warning("Cannot allocate a kmemleak_object structure\n");
-               kmemleak_disable();
+               //pr_warning("Cannot allocate a kmemleak_object structure\n");
+               //kmemleak_disable();
                return NULL;
        }