drm/i915: Call i915_gem_suspend() only after display is turned off

Message ID 20220617190629.355356-1-jose.souza@intel.com (mailing list archive)
State New, archived
Series drm/i915: Call i915_gem_suspend() only after display is turned off

Commit Message

Souza, Jose June 17, 2022, 7:06 p.m. UTC
GEM buffers could still be in use by display after i915_gem_suspend()
is executed, so there is a chance that i915_gem_flush_free_objects()
will be running at the same time that
intel_runtime_pm_driver_release() is executed, printing warnings about
wakerefs still being held.

So call i915_gem_suspend(), and by consequence
i915_gem_drain_freed_objects(), only after display is down, making
sure all buffers are freed.

Signed-off-by: José Roberto de Souza <jose.souza@intel.com>
---
 drivers/gpu/drm/i915/i915_driver.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

Comments

Matt Roper June 17, 2022, 7:28 p.m. UTC | #1
On Fri, Jun 17, 2022 at 12:06:29PM -0700, José Roberto de Souza wrote:
> Gem buffers could still be in use by display after i915_gem_suspend()
> is executed so there is chances that i915_gem_flush_free_objects()
> will be being executed at the same time that
> intel_runtime_pm_driver_release() is executed printing warnings about
> wakerefs will being held.

By the same logic do we need to adjust i915_driver_remove() too?


Matt

Souza, Jose June 21, 2022, 5:03 p.m. UTC | #2
On Fri, 2022-06-17 at 12:28 -0700, Matt Roper wrote:
> On Fri, Jun 17, 2022 at 12:06:29PM -0700, José Roberto de Souza wrote:
> > Gem buffers could still be in use by display after i915_gem_suspend()
> > is executed so there is chances that i915_gem_flush_free_objects()
> > will be being executed at the same time that
> > intel_runtime_pm_driver_release() is executed printing warnings about
> > wakerefs will being held.
> 
> By the same logic do we need to adjust i915_driver_remove() too?

Nope, all display buffers are freed in the i915_driver_unregister() call chain:


i915_driver_remove()
	i915_driver_unregister()
		intel_display_driver_unregister()
			drm_atomic_helper_shutdown()
	i915_gem_suspend()
		i915_gem_drain_freed_objects()


Only the FBC compressed framebuffer is freed after that, but that will not cause any warnings because it is allocated from stolen memory.

Matt Roper June 22, 2022, 10:19 p.m. UTC | #3
On Tue, Jun 21, 2022 at 10:03:04AM -0700, Souza, Jose wrote:
> On Fri, 2022-06-17 at 12:28 -0700, Matt Roper wrote:
> > On Fri, Jun 17, 2022 at 12:06:29PM -0700, José Roberto de Souza wrote:
> > > Gem buffers could still be in use by display after i915_gem_suspend()
> > > is executed so there is chances that i915_gem_flush_free_objects()
> > > will be being executed at the same time that
> > > intel_runtime_pm_driver_release() is executed printing warnings about
> > > wakerefs will being held.
> > 
> > By the same logic do we need to adjust i915_driver_remove() too?
> 
> Nope, all display buffers are freed in i915_driver_unregister() call chain:
> 
> 
> i915_driver_remove()
> 	i915_driver_unregister()
> 		intel_display_driver_unregister()
> 			drm_atomic_helper_shutdown()
> 	i915_gem_suspend()
> 		i915_gem_drain_freed_objects()
> 
> 
> Only FBC compressed framebuffer is freed after that but that will not cause any warnings as it is allocated from stolen memory.

Okay sounds good; thanks for checking.

I'm still having a bit of trouble understanding your description of the
issue in the commit message though:

        "...so there is chances that i915_gem_flush_free_objects() will
        be being executed at the same time that
        intel_runtime_pm_driver_release()..."

I'm not super familiar with the driver teardown paths, or the memory
management cleanup details.  Intuitively it makes sense that we should
clean up memory management (GEM) only after we've torn down display so
that all objects that were used by framebuffers are out of circulation.
But from a cursory view, it looks like i915_gem_suspend() is mostly
concerned with quiescing the GT and cleaning up PPGTT (which doesn't
impact display since all of its buffers are in the GGTT).

Is the problem arising from i915->mm.free_work still doing asynchronous
work to actually release the unused objects at the same time we're
tearing down runtime PM later?  If so does swapping the order of the
gem_suspend and display disable here actually prevent that from
happening or does it just make the race less likely by helping some
objects free up earlier?


Matt

Souza, Jose June 23, 2022, 2:48 p.m. UTC | #4
On Wed, 2022-06-22 at 15:19 -0700, Matt Roper wrote:
> On Tue, Jun 21, 2022 at 10:03:04AM -0700, Souza, Jose wrote:
> > On Fri, 2022-06-17 at 12:28 -0700, Matt Roper wrote:
> > > On Fri, Jun 17, 2022 at 12:06:29PM -0700, José Roberto de Souza wrote:
> > > > Gem buffers could still be in use by display after i915_gem_suspend()
> > > > is executed so there is chances that i915_gem_flush_free_objects()
> > > > will be being executed at the same time that
> > > > intel_runtime_pm_driver_release() is executed printing warnings about
> > > > wakerefs will being held.
> > > 
> > > By the same logic do we need to adjust i915_driver_remove() too?
> > 
> > Nope, all display buffers are freed in i915_driver_unregister() call chain:
> > 
> > 
> > i915_driver_remove()
> > 	i915_driver_unregister()
> > 		intel_display_driver_unregister()
> > 			drm_atomic_helper_shutdown()
> > 	i915_gem_suspend()
> > 		i915_gem_drain_freed_objects()
> > 
> > 
> > Only FBC compressed framebuffer is freed after that but that will not cause any warnings as it is allocated from stolen memory.
> 
> Okay sounds good; thanks for checking.
> 
> I'm still having a bit of trouble understanding your description of the
> issue in the commit message though:
> 
>         "...so there is chances that i915_gem_flush_free_objects() will
>         be being executed at the same time that
>         intel_runtime_pm_driver_release()..."
> 
> I'm not super familiar with the driver teardown paths, or the memory
> management cleanup details.  Intuitively it makes sense that we should
> clean up memory management (GEM) only after we've torn down display so
> that all objects that were used by framebuffers are out of circulation.
> But from a cursory view, it looks like i915_gem_suspend() is mostly
> concerned with quiescing the GT and cleaning up PPGTT (which doesn't
> impact display since all of its buffers are in the GGTT).
> 
> Is the problem arising from i915->mm.free_work still doing asynchronous
> work to actually release the unused objects at the same time we're
> tearing down runtime PM later?  If so does swapping the order of the
> gem_suspend and display disable here actually prevent that from
> happening or does it just make the race less likely by helping some
> objects free up earlier?

When the last reference to a GEM object is dropped, the object is added to mm.free_list and mm.free_work is queued to actually free it.
i915_gem_drain_freed_objects() flushes mm.free_work.

If any other GEM object has its last reference dropped after i915_gem_suspend()/i915_gem_drain_freed_objects(), the warning in
intel_runtime_pm_driver_release() can fire because mm.free_work could be running at the same time.

But by the time pci_driver.remove() is called, all file descriptors attached to this device have probably been closed, and the functions called after
i915_gem_suspend() will not free any GEM object, so I don't believe we will have any more warnings.
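
The ordering problem described above can be sketched as a small single-threaded user-space model. This is only an illustration, not the i915 code: every model_* name is hypothetical, the worker is reduced to a "pending" flag, and the wakeref accounting (pending free work holds one wakeref until it is drained) is a simplifying assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are i915 symbols. */
struct model_obj {
	int refcount;
	struct model_obj *next_free;
};

struct model_dev {
	struct model_obj *free_list;	/* objects whose last reference is gone */
	bool free_work_pending;		/* stands in for the queued free worker */
	int wakerefs;			/* assumed: pending free work holds one */
	int freed;			/* number of objects actually released */
};

/* Drop a reference; on the last one, defer the free instead of doing it here. */
static void model_obj_put(struct model_dev *dev, struct model_obj *obj)
{
	if (--obj->refcount > 0)
		return;

	obj->next_free = dev->free_list;
	dev->free_list = obj;
	if (!dev->free_work_pending) {
		dev->free_work_pending = true;
		dev->wakerefs++;
	}
}

/* Synchronously release everything that is on the free list right now. */
static void model_drain_freed_objects(struct model_dev *dev)
{
	while (dev->free_list) {
		struct model_obj *obj = dev->free_list;

		dev->free_list = obj->next_free;
		obj->next_free = NULL;
		dev->freed++;
	}
	if (dev->free_work_pending) {
		dev->free_work_pending = false;
		dev->wakerefs--;
	}
}

/* Returns the wakerefs still held at release time; nonzero models the warning. */
static int model_runtime_pm_release(const struct model_dev *dev)
{
	return dev->wakerefs;
}

/* Shut down with one framebuffer object still referenced by display. */
static int model_shutdown(bool display_off_first)
{
	struct model_dev dev = {0};
	struct model_obj fb = { .refcount = 1 };

	if (display_off_first) {
		model_obj_put(&dev, &fb);		/* display drops its buffer */
		model_drain_freed_objects(&dev);	/* the drain then catches it */
	} else {
		model_drain_freed_objects(&dev);	/* nothing queued yet */
		model_obj_put(&dev, &fb);		/* queued after the drain */
	}
	return model_runtime_pm_release(&dev);
}
```

In this model, model_shutdown(true) corresponds to the order after this patch and leaves no wakeref held, while model_shutdown(false) corresponds to the old order and leaves one outstanding at release time.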

Matt Roper June 23, 2022, 6:06 p.m. UTC | #5
On Thu, Jun 23, 2022 at 07:48:32AM -0700, Souza, Jose wrote:
> On Wed, 2022-06-22 at 15:19 -0700, Matt Roper wrote:
> > On Tue, Jun 21, 2022 at 10:03:04AM -0700, Souza, Jose wrote:
> > > On Fri, 2022-06-17 at 12:28 -0700, Matt Roper wrote:
> > > > On Fri, Jun 17, 2022 at 12:06:29PM -0700, José Roberto de Souza wrote:
> > > > > Gem buffers could still be in use by display after i915_gem_suspend()
> > > > > is executed so there is chances that i915_gem_flush_free_objects()
> > > > > will be being executed at the same time that
> > > > > intel_runtime_pm_driver_release() is executed printing warnings about
> > > > > wakerefs will being held.
> > > > 
> > > > By the same logic do we need to adjust i915_driver_remove() too?
> > > 
> > > Nope, all display buffers are freed in i915_driver_unregister() call chain:
> > > 
> > > 
> > > i915_driver_remove()
> > > 	i915_driver_unregister()
> > > 		intel_display_driver_unregister()
> > > 			drm_atomic_helper_shutdown()
> > > 	i915_gem_suspend()
> > > 		i915_gem_drain_freed_objects()
> > > 
> > > 
> > > Only FBC compressed framebuffer is freed after that but that will not cause any warnings as it is allocated from stolen memory.
> > 
> > Okay sounds good; thanks for checking.
> > 
> > I'm still having a bit of trouble understanding your description of the
> > issue in the commit message though:
> > 
> >         "...so there is chances that i915_gem_flush_free_objects() will
> >         be being executed at the same time that
> >         intel_runtime_pm_driver_release()..."
> > 
> > I'm not super familiar with the driver teardown paths, or the memory
> > management cleanup details.  Intuitively it makes sense that we should
> > clean up memory management (GEM) only after we've torn down display so
> > that all objects that were used by framebuffers are out of circulation.
> > But from a cursory view, it looks like i915_gem_suspend() is mostly
> > concerned with quiescing the GT and cleaning up PPGTT (which doesn't
> > impact display since all of its buffers are in the GGTT).
> > 
> > Is the problem arising from i915->mm.free_work still doing asynchronous
> > work to actually release the unused objects at the same time we're
> > tearing down runtime PM later?  If so does swapping the order of the
> > gem_suspend and display disable here actually prevent that from
> > happening or does it just make the race less likely by helping some
> > objects free up earlier?
> 
> So when the last reference of a gem object is removed it is added to the mm.free_list list and mm.free_work is queued to actually free the object.
> i915_gem_drain_freed_objects() flushes the mm.free_work.
> 
> If any other gem object has its last reference removed after i915_gem_suspend()/i915_gem_drain_freed_objects() the warning in
> intel_runtime_pm_driver_release() can happen as the mm.free_work could be running at the same time.
> 
> But when pci_driver.remove() is called, probably all file descriptors attached to this device have been closed and the functions called after
> i915_gem_suspend() will not free any gem object, so I don't believe we will have any more warnings.

Okay, thanks for explaining, makes sense.  You might want to add some of
this extra explanation to the commit message too for future reference,
but either way,

Reviewed-by: Matt Roper <matthew.d.roper@intel.com>

Patch

diff --git a/drivers/gpu/drm/i915/i915_driver.c b/drivers/gpu/drm/i915/i915_driver.c
index d26dcca7e654a..4227675dd1cfe 100644
--- a/drivers/gpu/drm/i915/i915_driver.c
+++ b/drivers/gpu/drm/i915/i915_driver.c
@@ -1067,8 +1067,6 @@  void i915_driver_shutdown(struct drm_i915_private *i915)
 	intel_runtime_pm_disable(&i915->runtime_pm);
 	intel_power_domains_disable(i915);
 
-	i915_gem_suspend(i915);
-
 	if (HAS_DISPLAY(i915)) {
 		drm_kms_helper_poll_disable(&i915->drm);
 
@@ -1085,6 +1083,8 @@  void i915_driver_shutdown(struct drm_i915_private *i915)
 
 	intel_dmc_ucode_suspend(i915);
 
+	i915_gem_suspend(i915);
+
 	/*
 	 * The only requirement is to reboot with display DC states disabled,
 	 * for now leaving all display power wells in the INIT power domain