Message ID | 1411399521-2752-1-git-send-email-bradley.d.volkin@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Sep 22, 2014 at 08:25:21AM -0700, bradley.d.volkin@intel.com wrote: > From: Brad Volkin <bradley.d.volkin@intel.com> > > Ring init and cleanup are not balanced because we re-init the rings on > resume without having cleaned them up on suspend. This leads to the > driver leaking the parser's hash tables with a kmemleak signature such > as this: > > unreferenced object 0xffff880405960980 (size 32): > comm "systemd-udevd", pid 516, jiffies 4294896961 (age 10202.044s) > hex dump (first 32 bytes): > d0 85 46 c0 ff ff ff ff 00 00 00 00 00 00 00 00 ..F............. > 98 60 28 04 04 88 ff ff 00 00 00 00 00 00 00 00 .`(............. > backtrace: > [<ffffffff81816f9e>] kmemleak_alloc+0x4e/0xb0 > [<ffffffff811fa678>] kmem_cache_alloc_trace+0x168/0x2f0 > [<ffffffffc03e20a5>] i915_cmd_parser_init_ring+0x2a5/0x3e0 [i915] > [<ffffffffc04088a2>] intel_init_ring_buffer+0x202/0x470 [i915] > [<ffffffffc040c998>] intel_init_vebox_ring_buffer+0x1e8/0x2b0 [i915] > [<ffffffffc03eff59>] i915_gem_init_hw+0x2f9/0x3a0 [i915] > [<ffffffffc03f0057>] i915_gem_init+0x57/0x1d0 [i915] > [<ffffffffc045e26a>] i915_driver_load+0xc0a/0x10e0 [i915] > [<ffffffffc02e0d5d>] drm_dev_register+0xad/0x100 [drm] > [<ffffffffc02e3b9f>] drm_get_pci_dev+0x8f/0x200 [drm] > [<ffffffffc03c934b>] i915_pci_probe+0x3b/0x60 [i915] > [<ffffffff81436725>] local_pci_probe+0x45/0xa0 > [<ffffffff81437a69>] pci_device_probe+0xd9/0x130 > [<ffffffff81524f4d>] driver_probe_device+0x12d/0x3e0 > [<ffffffff815252d3>] __driver_attach+0x93/0xa0 > [<ffffffff81522e1b>] bus_for_each_dev+0x6b/0xb0 > > This patch extends the current convention of checking whether a > resource is already allocated before allocating it during ring init. > Longer term it might make sense to only init the rings once. > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83794 > Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> We should finally get around to split the ring init code properly into software init in _init functions (done once) and _init_hw functions for rpm/resume ... Anyway, this looks sane. Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: stable@vger.kernel.org > --- > > According to the report in bugzilla, this happens in linux-next-20140919 as > well. I'm not sure what the path is for getting the fix there in addition to > nightly. > > drivers/gpu/drm/i915/i915_cmd_parser.c | 12 +++++++----- > 1 file changed, 7 insertions(+), 5 deletions(-) > > diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c > index 4c35e2a..86b3ae0 100644 > --- a/drivers/gpu/drm/i915/i915_cmd_parser.c > +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c > @@ -709,11 +709,13 @@ int i915_cmd_parser_init_ring(struct intel_engine_cs *ring) > BUG_ON(!validate_cmds_sorted(ring, cmd_tables, cmd_table_count)); > BUG_ON(!validate_regs_sorted(ring)); > > - ret = init_hash_table(ring, cmd_tables, cmd_table_count); > - if (ret) { > - DRM_ERROR("CMD: cmd_parser_init failed!\n"); > - fini_hash_table(ring); > - return ret; > + if (hash_empty(ring->cmd_hash)) { > + ret = init_hash_table(ring, cmd_tables, cmd_table_count); > + if (ret) { > + DRM_ERROR("CMD: cmd_parser_init failed!\n"); > + fini_hash_table(ring); > + return ret; > + } > } > > ring->needs_cmd_parser = true; > -- > 1.8.3.2 > > _______________________________________________ > Intel-gfx mailing list > Intel-gfx@lists.freedesktop.org > http://lists.freedesktop.org/mailman/listinfo/intel-gfx
On Tue, Sep 23, 2014 at 10:34:46AM +0200, Daniel Vetter wrote: > On Mon, Sep 22, 2014 at 08:25:21AM -0700, bradley.d.volkin@intel.com wrote: > > From: Brad Volkin <bradley.d.volkin@intel.com> > > > > Ring init and cleanup are not balanced because we re-init the rings on > > resume without having cleaned them up on suspend. This leads to the > > driver leaking the parser's hash tables with a kmemleak signature such > > as this: > > > > unreferenced object 0xffff880405960980 (size 32): > > comm "systemd-udevd", pid 516, jiffies 4294896961 (age 10202.044s) > > hex dump (first 32 bytes): > > d0 85 46 c0 ff ff ff ff 00 00 00 00 00 00 00 00 ..F............. > > 98 60 28 04 04 88 ff ff 00 00 00 00 00 00 00 00 .`(............. > > backtrace: > > [<ffffffff81816f9e>] kmemleak_alloc+0x4e/0xb0 > > [<ffffffff811fa678>] kmem_cache_alloc_trace+0x168/0x2f0 > > [<ffffffffc03e20a5>] i915_cmd_parser_init_ring+0x2a5/0x3e0 [i915] > > [<ffffffffc04088a2>] intel_init_ring_buffer+0x202/0x470 [i915] > > [<ffffffffc040c998>] intel_init_vebox_ring_buffer+0x1e8/0x2b0 [i915] > > [<ffffffffc03eff59>] i915_gem_init_hw+0x2f9/0x3a0 [i915] > > [<ffffffffc03f0057>] i915_gem_init+0x57/0x1d0 [i915] > > [<ffffffffc045e26a>] i915_driver_load+0xc0a/0x10e0 [i915] > > [<ffffffffc02e0d5d>] drm_dev_register+0xad/0x100 [drm] > > [<ffffffffc02e3b9f>] drm_get_pci_dev+0x8f/0x200 [drm] > > [<ffffffffc03c934b>] i915_pci_probe+0x3b/0x60 [i915] > > [<ffffffff81436725>] local_pci_probe+0x45/0xa0 > > [<ffffffff81437a69>] pci_device_probe+0xd9/0x130 > > [<ffffffff81524f4d>] driver_probe_device+0x12d/0x3e0 > > [<ffffffff815252d3>] __driver_attach+0x93/0xa0 > > [<ffffffff81522e1b>] bus_for_each_dev+0x6b/0xb0 > > > > This patch extends the current convention of checking whether a > > resource is already allocated before allocating it during ring init. > > Longer term it might make sense to only init the rings once. > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83794 > > Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> > > We should finally get around to split the ring init code properly into > software init in _init functions (done once) and _init_hw functions for > rpm/resume ... Anyway, this looks sane. Maybe someone already sent patches to do so. -Chris
On Tue, Sep 23, 2014 at 10:14:33AM +0100, Chris Wilson wrote: > On Tue, Sep 23, 2014 at 10:34:46AM +0200, Daniel Vetter wrote: > > On Mon, Sep 22, 2014 at 08:25:21AM -0700, bradley.d.volkin@intel.com wrote: > > > From: Brad Volkin <bradley.d.volkin@intel.com> > > > > > > Ring init and cleanup are not balanced because we re-init the rings on > > > resume without having cleaned them up on suspend. This leads to the > > > driver leaking the parser's hash tables with a kmemleak signature such > > > as this: > > > > > > unreferenced object 0xffff880405960980 (size 32): > > > comm "systemd-udevd", pid 516, jiffies 4294896961 (age 10202.044s) > > > hex dump (first 32 bytes): > > > d0 85 46 c0 ff ff ff ff 00 00 00 00 00 00 00 00 ..F............. > > > 98 60 28 04 04 88 ff ff 00 00 00 00 00 00 00 00 .`(............. > > > backtrace: > > > [<ffffffff81816f9e>] kmemleak_alloc+0x4e/0xb0 > > > [<ffffffff811fa678>] kmem_cache_alloc_trace+0x168/0x2f0 > > > [<ffffffffc03e20a5>] i915_cmd_parser_init_ring+0x2a5/0x3e0 [i915] > > > [<ffffffffc04088a2>] intel_init_ring_buffer+0x202/0x470 [i915] > > > [<ffffffffc040c998>] intel_init_vebox_ring_buffer+0x1e8/0x2b0 [i915] > > > [<ffffffffc03eff59>] i915_gem_init_hw+0x2f9/0x3a0 [i915] > > > [<ffffffffc03f0057>] i915_gem_init+0x57/0x1d0 [i915] > > > [<ffffffffc045e26a>] i915_driver_load+0xc0a/0x10e0 [i915] > > > [<ffffffffc02e0d5d>] drm_dev_register+0xad/0x100 [drm] > > > [<ffffffffc02e3b9f>] drm_get_pci_dev+0x8f/0x200 [drm] > > > [<ffffffffc03c934b>] i915_pci_probe+0x3b/0x60 [i915] > > > [<ffffffff81436725>] local_pci_probe+0x45/0xa0 > > > [<ffffffff81437a69>] pci_device_probe+0xd9/0x130 > > > [<ffffffff81524f4d>] driver_probe_device+0x12d/0x3e0 > > > [<ffffffff815252d3>] __driver_attach+0x93/0xa0 > > > [<ffffffff81522e1b>] bus_for_each_dev+0x6b/0xb0 > > > > > > This patch extends the current convention of checking whether a > > > resource is already allocated before allocating it during ring init. > > > Longer term it might make sense to only init the rings once. > > > > > > Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83794 > > > Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> > > > > We should finally get around to split the ring init code properly into > > software init in _init functions (done once) and _init_hw functions for > > rpm/resume ... Anyway, this looks sane. > > Maybe someone already sent patches to do so. Hm, I've thought I've merge the ones that avoid the re-init, but those didn't go the full path to split the init code. Which ones am I still missing? -Daniel
On Tue, 23 Sep 2014, Daniel Vetter <daniel@ffwll.ch> wrote: > On Mon, Sep 22, 2014 at 08:25:21AM -0700, bradley.d.volkin@intel.com wrote: >> From: Brad Volkin <bradley.d.volkin@intel.com> >> >> Ring init and cleanup are not balanced because we re-init the rings on >> resume without having cleaned them up on suspend. This leads to the >> driver leaking the parser's hash tables with a kmemleak signature such >> as this: >> >> unreferenced object 0xffff880405960980 (size 32): >> comm "systemd-udevd", pid 516, jiffies 4294896961 (age 10202.044s) >> hex dump (first 32 bytes): >> d0 85 46 c0 ff ff ff ff 00 00 00 00 00 00 00 00 ..F............. >> 98 60 28 04 04 88 ff ff 00 00 00 00 00 00 00 00 .`(............. >> backtrace: >> [<ffffffff81816f9e>] kmemleak_alloc+0x4e/0xb0 >> [<ffffffff811fa678>] kmem_cache_alloc_trace+0x168/0x2f0 >> [<ffffffffc03e20a5>] i915_cmd_parser_init_ring+0x2a5/0x3e0 [i915] >> [<ffffffffc04088a2>] intel_init_ring_buffer+0x202/0x470 [i915] >> [<ffffffffc040c998>] intel_init_vebox_ring_buffer+0x1e8/0x2b0 [i915] >> [<ffffffffc03eff59>] i915_gem_init_hw+0x2f9/0x3a0 [i915] >> [<ffffffffc03f0057>] i915_gem_init+0x57/0x1d0 [i915] >> [<ffffffffc045e26a>] i915_driver_load+0xc0a/0x10e0 [i915] >> [<ffffffffc02e0d5d>] drm_dev_register+0xad/0x100 [drm] >> [<ffffffffc02e3b9f>] drm_get_pci_dev+0x8f/0x200 [drm] >> [<ffffffffc03c934b>] i915_pci_probe+0x3b/0x60 [i915] >> [<ffffffff81436725>] local_pci_probe+0x45/0xa0 >> [<ffffffff81437a69>] pci_device_probe+0xd9/0x130 >> [<ffffffff81524f4d>] driver_probe_device+0x12d/0x3e0 >> [<ffffffff815252d3>] __driver_attach+0x93/0xa0 >> [<ffffffff81522e1b>] bus_for_each_dev+0x6b/0xb0 >> >> This patch extends the current convention of checking whether a >> resource is already allocated before allocating it during ring init. >> Longer term it might make sense to only init the rings once. >> >> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83794 >> Signed-off-by: Brad Volkin <bradley.d.volkin@intel.com> > > We should finally get around to split the ring init code properly into > software init in _init functions (done once) and _init_hw functions for > rpm/resume ... Anyway, this looks sane. > > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> > Cc: stable@vger.kernel.org Pushed to drm-intel-fixes, thanks for the patch and review. BR, Jani. > >> --- >> >> According to the report in bugzilla, this happens in linux-next-20140919 as >> well. I'm not sure what the path is for getting the fix there in addition to >> nightly. >> >> drivers/gpu/drm/i915/i915_cmd_parser.c | 12 +++++++----- >> 1 file changed, 7 insertions(+), 5 deletions(-) >> >> diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c >> index 4c35e2a..86b3ae0 100644 >> --- a/drivers/gpu/drm/i915/i915_cmd_parser.c >> +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c >> @@ -709,11 +709,13 @@ int i915_cmd_parser_init_ring(struct intel_engine_cs *ring) >> BUG_ON(!validate_cmds_sorted(ring, cmd_tables, cmd_table_count)); >> BUG_ON(!validate_regs_sorted(ring)); >> >> - ret = init_hash_table(ring, cmd_tables, cmd_table_count); >> - if (ret) { >> - DRM_ERROR("CMD: cmd_parser_init failed!\n"); >> - fini_hash_table(ring); >> - return ret; >> + if (hash_empty(ring->cmd_hash)) { >> + ret = init_hash_table(ring, cmd_tables, cmd_table_count); >> + if (ret) { >> + DRM_ERROR("CMD: cmd_parser_init failed!\n"); >> + fini_hash_table(ring); >> + return ret; >> + } >> } >> >> ring->needs_cmd_parser = true; >> -- >> 1.8.3.2 >> >> _______________________________________________ >> Intel-gfx mailing list >> Intel-gfx@lists.freedesktop.org >> http://lists.freedesktop.org/mailman/listinfo/intel-gfx > > -- > Daniel Vetter > Software Engineer, Intel Corporation > +41 (0) 79 365 57 48 - http://blog.ffwll.ch
diff --git a/drivers/gpu/drm/i915/i915_cmd_parser.c b/drivers/gpu/drm/i915/i915_cmd_parser.c index 4c35e2a..86b3ae0 100644 --- a/drivers/gpu/drm/i915/i915_cmd_parser.c +++ b/drivers/gpu/drm/i915/i915_cmd_parser.c @@ -709,11 +709,13 @@ int i915_cmd_parser_init_ring(struct intel_engine_cs *ring) BUG_ON(!validate_cmds_sorted(ring, cmd_tables, cmd_table_count)); BUG_ON(!validate_regs_sorted(ring)); - ret = init_hash_table(ring, cmd_tables, cmd_table_count); - if (ret) { - DRM_ERROR("CMD: cmd_parser_init failed!\n"); - fini_hash_table(ring); - return ret; + if (hash_empty(ring->cmd_hash)) { + ret = init_hash_table(ring, cmd_tables, cmd_table_count); + if (ret) { + DRM_ERROR("CMD: cmd_parser_init failed!\n"); + fini_hash_table(ring); + return ret; + } } ring->needs_cmd_parser = true;