Message ID | 20231115-dont_clean_gt_on_error_path-v2-1-54250125470a@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v2] drm/i915: do not clean GT table on error path | expand |
On 15/11/2023 10:54, Andrzej Hajda wrote: > The only task of intel_gt_release_all is to zero gt table. Calling > it on error path prevents intel_gt_driver_late_release_all (called from > i915_driver_late_release) to cleanup GTs, causing leakage. > After i915_driver_late_release GT array is not used anymore so > it does not need cleaning at all. > > Sample leak report: > > BUG i915_request (...): Objects remaining in i915_request on __kmem_cache_shutdown() > ... > Object 0xffff888113420040 @offset=64 > Allocated in __i915_request_create+0x75/0x610 [i915] age=18339 cpu=1 pid=1454 > kmem_cache_alloc+0x25b/0x270 > __i915_request_create+0x75/0x610 [i915] > i915_request_create+0x109/0x290 [i915] > __engines_record_defaults+0xca/0x440 [i915] > intel_gt_init+0x275/0x430 [i915] > i915_gem_init+0x135/0x2c0 [i915] > i915_driver_probe+0x8d1/0xdc0 [i915] > > v2: removed whole intel_gt_release_all > > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8489 > Fixes: bec68cc9ea42d8 ("drm/i915: Prepare for multiple GTs") > Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> > --- > - Link to v1: https://lore.kernel.org/r/20231114-dont_clean_gt_on_error_path-v1-1-37f2fa827fd2@intel.com > --- > drivers/gpu/drm/i915/gt/intel_gt.c | 11 ----------- > drivers/gpu/drm/i915/i915_driver.c | 4 +--- > 2 files changed, 1 insertion(+), 14 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c > index ed32bf5b15464e..ba1186fc524f84 100644 > --- a/drivers/gpu/drm/i915/gt/intel_gt.c > +++ b/drivers/gpu/drm/i915/gt/intel_gt.c > @@ -982,8 +982,6 @@ int intel_gt_probe_all(struct drm_i915_private *i915) > > err: > i915_probe_error(i915, "Failed to initialize %s! (%d)\n", gtdef->name, ret); > - intel_gt_release_all(i915); > - > return ret; > } > > @@ -1002,15 +1000,6 @@ int intel_gt_tiles_init(struct drm_i915_private *i915) > return 0; > } > > -void intel_gt_release_all(struct drm_i915_private *i915) > -{ > - struct intel_gt *gt; > - unsigned int id; > - > - for_each_gt(gt, i915, id) > - i915->gt[id] = NULL; > -} > - > void intel_gt_info_print(const struct intel_gt_info *info, > struct drm_printer *p) > { > diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c > index 01fd25b622d16c..2a1faf4039659c 100644 > --- a/drivers/gpu/drm/i915/i915_driver.c > +++ b/drivers/gpu/drm/i915/i915_driver.c > @@ -776,7 +776,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent) > > ret = i915_driver_mmio_probe(i915); > if (ret < 0) > - goto out_tiles_cleanup; > + goto out_runtime_pm_put; > > ret = i915_driver_hw_probe(i915); > if (ret < 0) > @@ -836,8 +836,6 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent) > i915_ggtt_driver_late_release(i915); > out_cleanup_mmio: > i915_driver_mmio_release(i915); > -out_tiles_cleanup: > - intel_gt_release_all(i915); > out_runtime_pm_put: > enable_rpm_wakeref_asserts(&i915->runtime_pm); > i915_driver_late_release(i915); > > --- > base-commit: 1489bab52c281a869295414031a56506a375b036 > change-id: 20231114-dont_clean_gt_on_error_path-91cd9c3caa0a Looks okay. Reviewed-by: Tvrtko Ursulin <tvrtko.ursulin@intel.com> Although on the overall driver init/exit in general again started feeling a bit messy. Seems like it would need an overhaul every few years to keep tidy and logical but meh. Regards, Tvrtko
On 11/15/2023 11:54 AM, Andrzej Hajda wrote: > The only task of intel_gt_release_all is to zero gt table. Calling > it on error path prevents intel_gt_driver_late_release_all (called from > i915_driver_late_release) to cleanup GTs, causing leakage. > After i915_driver_late_release GT array is not used anymore so > it does not need cleaning at all. > > Sample leak report: > > BUG i915_request (...): Objects remaining in i915_request on __kmem_cache_shutdown() > ... > Object 0xffff888113420040 @offset=64 > Allocated in __i915_request_create+0x75/0x610 [i915] age=18339 cpu=1 pid=1454 > kmem_cache_alloc+0x25b/0x270 > __i915_request_create+0x75/0x610 [i915] > i915_request_create+0x109/0x290 [i915] > __engines_record_defaults+0xca/0x440 [i915] > intel_gt_init+0x275/0x430 [i915] > i915_gem_init+0x135/0x2c0 [i915] > i915_driver_probe+0x8d1/0xdc0 [i915] > > v2: removed whole intel_gt_release_all > > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8489 > Fixes: bec68cc9ea42d8 ("drm/i915: Prepare for multiple GTs") > Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Nirmoy Das <nirmoy.das@intel.com> > --- > - Link to v1: https://lore.kernel.org/r/20231114-dont_clean_gt_on_error_path-v1-1-37f2fa827fd2@intel.com > --- > drivers/gpu/drm/i915/gt/intel_gt.c | 11 ----------- > drivers/gpu/drm/i915/i915_driver.c | 4 +--- > 2 files changed, 1 insertion(+), 14 deletions(-) > > diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c > index ed32bf5b15464e..ba1186fc524f84 100644 > --- a/drivers/gpu/drm/i915/gt/intel_gt.c > +++ b/drivers/gpu/drm/i915/gt/intel_gt.c > @@ -982,8 +982,6 @@ int intel_gt_probe_all(struct drm_i915_private *i915) > > err: > i915_probe_error(i915, "Failed to initialize %s! (%d)\n", gtdef->name, ret); > - intel_gt_release_all(i915); > - > return ret; > } > > @@ -1002,15 +1000,6 @@ int intel_gt_tiles_init(struct drm_i915_private *i915) > return 0; > } > > -void intel_gt_release_all(struct drm_i915_private *i915) > -{ > - struct intel_gt *gt; > - unsigned int id; > - > - for_each_gt(gt, i915, id) > - i915->gt[id] = NULL; > -} > - > void intel_gt_info_print(const struct intel_gt_info *info, > struct drm_printer *p) > { > diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c > index 01fd25b622d16c..2a1faf4039659c 100644 > --- a/drivers/gpu/drm/i915/i915_driver.c > +++ b/drivers/gpu/drm/i915/i915_driver.c > @@ -776,7 +776,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent) > > ret = i915_driver_mmio_probe(i915); > if (ret < 0) > - goto out_tiles_cleanup; > + goto out_runtime_pm_put; > > ret = i915_driver_hw_probe(i915); > if (ret < 0) > @@ -836,8 +836,6 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent) > i915_ggtt_driver_late_release(i915); > out_cleanup_mmio: > i915_driver_mmio_release(i915); > -out_tiles_cleanup: > - intel_gt_release_all(i915); > out_runtime_pm_put: > enable_rpm_wakeref_asserts(&i915->runtime_pm); > i915_driver_late_release(i915); > > --- > base-commit: 1489bab52c281a869295414031a56506a375b036 > change-id: 20231114-dont_clean_gt_on_error_path-91cd9c3caa0a > > Best regards,
Hi Andrzej, On Wed, Nov 15, 2023 at 11:54:03AM +0100, Andrzej Hajda wrote: > The only task of intel_gt_release_all is to zero gt table. Calling > it on error path prevents intel_gt_driver_late_release_all (called from > i915_driver_late_release) to cleanup GTs, causing leakage. > After i915_driver_late_release GT array is not used anymore so > it does not need cleaning at all. > > Sample leak report: > > BUG i915_request (...): Objects remaining in i915_request on __kmem_cache_shutdown() > ... > Object 0xffff888113420040 @offset=64 > Allocated in __i915_request_create+0x75/0x610 [i915] age=18339 cpu=1 pid=1454 > kmem_cache_alloc+0x25b/0x270 > __i915_request_create+0x75/0x610 [i915] > i915_request_create+0x109/0x290 [i915] > __engines_record_defaults+0xca/0x440 [i915] > intel_gt_init+0x275/0x430 [i915] > i915_gem_init+0x135/0x2c0 [i915] > i915_driver_probe+0x8d1/0xdc0 [i915] > > v2: removed whole intel_gt_release_all > > Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8489 > Fixes: bec68cc9ea42d8 ("drm/i915: Prepare for multiple GTs") > Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Andi
On 16.11.2023 12:44, Andi Shyti wrote: > Hi Andrzej, > > On Wed, Nov 15, 2023 at 11:54:03AM +0100, Andrzej Hajda wrote: >> The only task of intel_gt_release_all is to zero gt table. Calling >> it on error path prevents intel_gt_driver_late_release_all (called from >> i915_driver_late_release) to cleanup GTs, causing leakage. >> After i915_driver_late_release GT array is not used anymore so >> it does not need cleaning at all. >> >> Sample leak report: >> >> BUG i915_request (...): Objects remaining in i915_request on __kmem_cache_shutdown() >> ... >> Object 0xffff888113420040 @offset=64 >> Allocated in __i915_request_create+0x75/0x610 [i915] age=18339 cpu=1 pid=1454 >> kmem_cache_alloc+0x25b/0x270 >> __i915_request_create+0x75/0x610 [i915] >> i915_request_create+0x109/0x290 [i915] >> __engines_record_defaults+0xca/0x440 [i915] >> intel_gt_init+0x275/0x430 [i915] >> i915_gem_init+0x135/0x2c0 [i915] >> i915_driver_probe+0x8d1/0xdc0 [i915] >> >> v2: removed whole intel_gt_release_all >> >> Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8489 >> Fixes: bec68cc9ea42d8 ("drm/i915: Prepare for multiple GTs") >> Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> > Reviewed-by: Andi Shyti <andi.shyti@linux.intel.com> Thx all, pushed. > Andi
diff --git a/drivers/gpu/drm/i915/gt/intel_gt.c b/drivers/gpu/drm/i915/gt/intel_gt.c index ed32bf5b15464e..ba1186fc524f84 100644 --- a/drivers/gpu/drm/i915/gt/intel_gt.c +++ b/drivers/gpu/drm/i915/gt/intel_gt.c @@ -982,8 +982,6 @@ int intel_gt_probe_all(struct drm_i915_private *i915) err: i915_probe_error(i915, "Failed to initialize %s! (%d)\n", gtdef->name, ret); - intel_gt_release_all(i915); - return ret; } @@ -1002,15 +1000,6 @@ int intel_gt_tiles_init(struct drm_i915_private *i915) return 0; } -void intel_gt_release_all(struct drm_i915_private *i915) -{ - struct intel_gt *gt; - unsigned int id; - - for_each_gt(gt, i915, id) - i915->gt[id] = NULL; -} - void intel_gt_info_print(const struct intel_gt_info *info, struct drm_printer *p) { diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c index 01fd25b622d16c..2a1faf4039659c 100644 --- a/drivers/gpu/drm/i915/i915_driver.c +++ b/drivers/gpu/drm/i915/i915_driver.c @@ -776,7 +776,7 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent) ret = i915_driver_mmio_probe(i915); if (ret < 0) - goto out_tiles_cleanup; + goto out_runtime_pm_put; ret = i915_driver_hw_probe(i915); if (ret < 0) @@ -836,8 +836,6 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent) i915_ggtt_driver_late_release(i915); out_cleanup_mmio: i915_driver_mmio_release(i915); -out_tiles_cleanup: - intel_gt_release_all(i915); out_runtime_pm_put: enable_rpm_wakeref_asserts(&i915->runtime_pm); i915_driver_late_release(i915);
The only task of intel_gt_release_all is to zero gt table. Calling it on error path prevents intel_gt_driver_late_release_all (called from i915_driver_late_release) to cleanup GTs, causing leakage. After i915_driver_late_release GT array is not used anymore so it does not need cleaning at all. Sample leak report: BUG i915_request (...): Objects remaining in i915_request on __kmem_cache_shutdown() ... Object 0xffff888113420040 @offset=64 Allocated in __i915_request_create+0x75/0x610 [i915] age=18339 cpu=1 pid=1454 kmem_cache_alloc+0x25b/0x270 __i915_request_create+0x75/0x610 [i915] i915_request_create+0x109/0x290 [i915] __engines_record_defaults+0xca/0x440 [i915] intel_gt_init+0x275/0x430 [i915] i915_gem_init+0x135/0x2c0 [i915] i915_driver_probe+0x8d1/0xdc0 [i915] v2: removed whole intel_gt_release_all Closes: https://gitlab.freedesktop.org/drm/intel/-/issues/8489 Fixes: bec68cc9ea42d8 ("drm/i915: Prepare for multiple GTs") Signed-off-by: Andrzej Hajda <andrzej.hajda@intel.com> --- - Link to v1: https://lore.kernel.org/r/20231114-dont_clean_gt_on_error_path-v1-1-37f2fa827fd2@intel.com --- drivers/gpu/drm/i915/gt/intel_gt.c | 11 ----------- drivers/gpu/drm/i915/i915_driver.c | 4 +--- 2 files changed, 1 insertion(+), 14 deletions(-) --- base-commit: 1489bab52c281a869295414031a56506a375b036 change-id: 20231114-dont_clean_gt_on_error_path-91cd9c3caa0a Best regards,