Message ID | 1393001548-2883-4-git-send-email-przanoni@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Fri, 21 Feb 2014 13:52:20 -0300 Paulo Zanoni <przanoni@gmail.com> wrote: > From: Paulo Zanoni <paulo.r.zanoni@intel.com> > > When we call gen6_gt_force_wake_put we don't actually put force_wake, > we just schedule gen6_force_wake_work through mod_delayed_work, and > that will eventually release force_wake. > > The problem is that we call intel_runtime_pm_put directly at > gen6_gt_force_wake_put, so most of the times we put our runtime PM > reference before the delayed work happens, so we may runtime suspend > while force_wake is still supposed to be enabled if the graphics > autosuspend_delay_ms is too small. > > Now the nice thing about the current code is that after it triggers > the delayed work function it gets a refcount, and it only triggers the > delayed work function if refcount is zero. This guarantees that when > we schedule the funciton, it will run before we try to schedule it > again, which simplifies the problem and allows for the current > solution to work properly (hopefully!). > > Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> > --- > drivers/gpu/drm/i915/intel_uncore.c | 7 ++++++- > 1 file changed, 6 insertions(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c > index c628414..1f7226f 100644 > --- a/drivers/gpu/drm/i915/intel_uncore.c > +++ b/drivers/gpu/drm/i915/intel_uncore.c > @@ -299,6 +299,8 @@ static void gen6_force_wake_work(struct work_struct *work) > if (--dev_priv->uncore.forcewake_count == 0) > dev_priv->uncore.funcs.force_wake_put(dev_priv, FORCEWAKE_ALL); > spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); > + > + intel_runtime_pm_put(dev_priv); > } > > static void intel_uncore_forcewake_reset(struct drm_device *dev) > @@ -393,6 +395,7 @@ void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv, int fw_engine) > void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv, int fw_engine) > { > unsigned long irqflags; > + bool delayed = false; > > if (!dev_priv->uncore.funcs.force_wake_put) > return; > @@ -405,13 +408,15 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv, int fw_engine) > spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); > if (--dev_priv->uncore.forcewake_count == 0) { > dev_priv->uncore.forcewake_count++; > + delayed = true; > mod_delayed_work(dev_priv->wq, > &dev_priv->uncore.force_wake_work, > 1); > } > spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); > > - intel_runtime_pm_put(dev_priv); > + if (!delayed) > + intel_runtime_pm_put(dev_priv); > } > > /* We give fast paths for the really cool registers */ Do we need this for the VLV path too? It's a little confusing that we do this delayed thing, incrementing the count and then decrementing again in the work queue, but what you have looks correct for both cases. So with the VLV thing addressed: Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org>
2014-02-21 14:34 GMT-03:00 Jesse Barnes <jbarnes@virtuousgeek.org>: > On Fri, 21 Feb 2014 13:52:20 -0300 > Paulo Zanoni <przanoni@gmail.com> wrote: > >> From: Paulo Zanoni <paulo.r.zanoni@intel.com> >> >> When we call gen6_gt_force_wake_put we don't actually put force_wake, >> we just schedule gen6_force_wake_work through mod_delayed_work, and >> that will eventually release force_wake. >> >> The problem is that we call intel_runtime_pm_put directly at >> gen6_gt_force_wake_put, so most of the times we put our runtime PM >> reference before the delayed work happens, so we may runtime suspend >> while force_wake is still supposed to be enabled if the graphics >> autosuspend_delay_ms is too small. >> >> Now the nice thing about the current code is that after it triggers >> the delayed work function it gets a refcount, and it only triggers the >> delayed work function if refcount is zero. This guarantees that when >> we schedule the funciton, it will run before we try to schedule it >> again, which simplifies the problem and allows for the current >> solution to work properly (hopefully!). >> >> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> >> --- >> drivers/gpu/drm/i915/intel_uncore.c | 7 ++++++- >> 1 file changed, 6 insertions(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c >> index c628414..1f7226f 100644 >> --- a/drivers/gpu/drm/i915/intel_uncore.c >> +++ b/drivers/gpu/drm/i915/intel_uncore.c >> @@ -299,6 +299,8 @@ static void gen6_force_wake_work(struct work_struct *work) >> if (--dev_priv->uncore.forcewake_count == 0) >> dev_priv->uncore.funcs.force_wake_put(dev_priv, FORCEWAKE_ALL); >> spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); >> + >> + intel_runtime_pm_put(dev_priv); >> } >> >> static void intel_uncore_forcewake_reset(struct drm_device *dev) >> @@ -393,6 +395,7 @@ void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv, int fw_engine) >> void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv, int fw_engine) >> { >> unsigned long irqflags; >> + bool delayed = false; >> >> if (!dev_priv->uncore.funcs.force_wake_put) >> return; >> @@ -405,13 +408,15 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv, int fw_engine) >> spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); >> if (--dev_priv->uncore.forcewake_count == 0) { >> dev_priv->uncore.forcewake_count++; >> + delayed = true; >> mod_delayed_work(dev_priv->wq, >> &dev_priv->uncore.force_wake_work, >> 1); >> } >> spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); >> >> - intel_runtime_pm_put(dev_priv); >> + if (!delayed) >> + intel_runtime_pm_put(dev_priv); >> } >> >> /* We give fast paths for the really cool registers */ > > Do we need this for the VLV path too? Yeah, my patch is wrong for VLV due to that "return". I'll send a new version. By the way, why doesn't VLV use the delayed work queue? I would assume the work queue is there to improve performance somehow, so it could be a good idea to use it... And maybe try to avoid special-casing VLV would be good too :) > > It's a little confusing that we do this delayed thing, incrementing the > count and then decrementing again in the work queue, but what you have > looks correct for both cases. > > So with the VLV thing addressed: > > Reviewed-by: Jesse Barnes <jbarnes@virtuousgeek.org> > > -- > Jesse Barnes, Intel Open Source Technology Center
On Fri, 21 Feb 2014 17:08:50 -0300 Paulo Zanoni <przanoni@gmail.com> wrote: > 2014-02-21 14:34 GMT-03:00 Jesse Barnes <jbarnes@virtuousgeek.org>: > > On Fri, 21 Feb 2014 13:52:20 -0300 > > Paulo Zanoni <przanoni@gmail.com> wrote: > > > >> From: Paulo Zanoni <paulo.r.zanoni@intel.com> > >> > >> When we call gen6_gt_force_wake_put we don't actually put force_wake, > >> we just schedule gen6_force_wake_work through mod_delayed_work, and > >> that will eventually release force_wake. > >> > >> The problem is that we call intel_runtime_pm_put directly at > >> gen6_gt_force_wake_put, so most of the times we put our runtime PM > >> reference before the delayed work happens, so we may runtime suspend > >> while force_wake is still supposed to be enabled if the graphics > >> autosuspend_delay_ms is too small. > >> > >> Now the nice thing about the current code is that after it triggers > >> the delayed work function it gets a refcount, and it only triggers the > >> delayed work function if refcount is zero. This guarantees that when > >> we schedule the funciton, it will run before we try to schedule it > >> again, which simplifies the problem and allows for the current > >> solution to work properly (hopefully!). > >> > >> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> > >> --- > >> drivers/gpu/drm/i915/intel_uncore.c | 7 ++++++- > >> 1 file changed, 6 insertions(+), 1 deletion(-) > >> > >> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c > >> index c628414..1f7226f 100644 > >> --- a/drivers/gpu/drm/i915/intel_uncore.c > >> +++ b/drivers/gpu/drm/i915/intel_uncore.c > >> @@ -299,6 +299,8 @@ static void gen6_force_wake_work(struct work_struct *work) > >> if (--dev_priv->uncore.forcewake_count == 0) > >> dev_priv->uncore.funcs.force_wake_put(dev_priv, FORCEWAKE_ALL); > >> spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); > >> + > >> + intel_runtime_pm_put(dev_priv); > >> } > >> > >> static void intel_uncore_forcewake_reset(struct drm_device *dev) > >> @@ -393,6 +395,7 @@ void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv, int fw_engine) > >> void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv, int fw_engine) > >> { > >> unsigned long irqflags; > >> + bool delayed = false; > >> > >> if (!dev_priv->uncore.funcs.force_wake_put) > >> return; > >> @@ -405,13 +408,15 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv, int fw_engine) > >> spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); > >> if (--dev_priv->uncore.forcewake_count == 0) { > >> dev_priv->uncore.forcewake_count++; > >> + delayed = true; > >> mod_delayed_work(dev_priv->wq, > >> &dev_priv->uncore.force_wake_work, > >> 1); > >> } > >> spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); > >> > >> - intel_runtime_pm_put(dev_priv); > >> + if (!delayed) > >> + intel_runtime_pm_put(dev_priv); > >> } > >> > >> /* We give fast paths for the really cool registers */ > > > > Do we need this for the VLV path too? > > Yeah, my patch is wrong for VLV due to that "return". I'll send a new version. > > By the way, why doesn't VLV use the delayed work queue? I would assume > the work queue is there to improve performance somehow, so it could be > a good idea to use it... And maybe try to avoid special-casing VLV > would be good too :) I don't know why VLV has an early return there rather than just using a function pointer like everything else. ISTR reviewing that patch from Deepak and suggesting something else, but I guess Daniel merged it anyway. But yes, it should be fixed up as well and should probably use the delayed mechanism.
On Fri, Feb 21, 2014 at 12:16:58PM -0800, Jesse Barnes wrote: > On Fri, 21 Feb 2014 17:08:50 -0300 > Paulo Zanoni <przanoni@gmail.com> wrote: > > > 2014-02-21 14:34 GMT-03:00 Jesse Barnes <jbarnes@virtuousgeek.org>: > > > On Fri, 21 Feb 2014 13:52:20 -0300 > > > Paulo Zanoni <przanoni@gmail.com> wrote: > > > > > >> From: Paulo Zanoni <paulo.r.zanoni@intel.com> > > >> > > >> When we call gen6_gt_force_wake_put we don't actually put force_wake, > > >> we just schedule gen6_force_wake_work through mod_delayed_work, and > > >> that will eventually release force_wake. > > >> > > >> The problem is that we call intel_runtime_pm_put directly at > > >> gen6_gt_force_wake_put, so most of the times we put our runtime PM > > >> reference before the delayed work happens, so we may runtime suspend > > >> while force_wake is still supposed to be enabled if the graphics > > >> autosuspend_delay_ms is too small. > > >> > > >> Now the nice thing about the current code is that after it triggers > > >> the delayed work function it gets a refcount, and it only triggers the > > >> delayed work function if refcount is zero. This guarantees that when > > >> we schedule the funciton, it will run before we try to schedule it > > >> again, which simplifies the problem and allows for the current > > >> solution to work properly (hopefully!). > > >> > > >> Signed-off-by: Paulo Zanoni <paulo.r.zanoni@intel.com> > > >> --- > > >> drivers/gpu/drm/i915/intel_uncore.c | 7 ++++++- > > >> 1 file changed, 6 insertions(+), 1 deletion(-) > > >> > > >> diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c > > >> index c628414..1f7226f 100644 > > >> --- a/drivers/gpu/drm/i915/intel_uncore.c > > >> +++ b/drivers/gpu/drm/i915/intel_uncore.c > > >> @@ -299,6 +299,8 @@ static void gen6_force_wake_work(struct work_struct *work) > > >> if (--dev_priv->uncore.forcewake_count == 0) > > >> dev_priv->uncore.funcs.force_wake_put(dev_priv, FORCEWAKE_ALL); > > >> spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); > > >> + > > >> + intel_runtime_pm_put(dev_priv); > > >> } > > >> > > >> static void intel_uncore_forcewake_reset(struct drm_device *dev) > > >> @@ -393,6 +395,7 @@ void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv, int fw_engine) > > >> void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv, int fw_engine) > > >> { > > >> unsigned long irqflags; > > >> + bool delayed = false; > > >> > > >> if (!dev_priv->uncore.funcs.force_wake_put) > > >> return; > > >> @@ -405,13 +408,15 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv, int fw_engine) > > >> spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); > > >> if (--dev_priv->uncore.forcewake_count == 0) { > > >> dev_priv->uncore.forcewake_count++; > > >> + delayed = true; > > >> mod_delayed_work(dev_priv->wq, > > >> &dev_priv->uncore.force_wake_work, > > >> 1); > > >> } > > >> spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); > > >> > > >> - intel_runtime_pm_put(dev_priv); > > >> + if (!delayed) > > >> + intel_runtime_pm_put(dev_priv); > > >> } > > >> > > >> /* We give fast paths for the really cool registers */ > > > > > > Do we need this for the VLV path too? > > > > Yeah, my patch is wrong for VLV due to that "return". I'll send a new version. > > > > By the way, why doesn't VLV use the delayed work queue? I would assume > > the work queue is there to improve performance somehow, so it could be > > a good idea to use it... And maybe try to avoid special-casing VLV > > would be good too :) > > I don't know why VLV has an early return there rather than just using a > function pointer like everything else. ISTR reviewing that patch from > Deepak and suggesting something else, but I guess Daniel merged it > anyway. > > But yes, it should be fixed up as well and should probably use the > delayed mechanism. The delayed work (well timer since Chris' latest patch) can only handle one forcewake domain. vlv has two ... That reminds me: This delayed suspend stuff is kinda implemented in the linux pm domain code already. Just needed to drop this since I've predicted that in the end we'll invent all these wheels, too ;-) -Daniel
diff --git a/drivers/gpu/drm/i915/intel_uncore.c b/drivers/gpu/drm/i915/intel_uncore.c index c628414..1f7226f 100644 --- a/drivers/gpu/drm/i915/intel_uncore.c +++ b/drivers/gpu/drm/i915/intel_uncore.c @@ -299,6 +299,8 @@ static void gen6_force_wake_work(struct work_struct *work) if (--dev_priv->uncore.forcewake_count == 0) dev_priv->uncore.funcs.force_wake_put(dev_priv, FORCEWAKE_ALL); spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); + + intel_runtime_pm_put(dev_priv); } static void intel_uncore_forcewake_reset(struct drm_device *dev) @@ -393,6 +395,7 @@ void gen6_gt_force_wake_get(struct drm_i915_private *dev_priv, int fw_engine) void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv, int fw_engine) { unsigned long irqflags; + bool delayed = false; if (!dev_priv->uncore.funcs.force_wake_put) return; @@ -405,13 +408,15 @@ void gen6_gt_force_wake_put(struct drm_i915_private *dev_priv, int fw_engine) spin_lock_irqsave(&dev_priv->uncore.lock, irqflags); if (--dev_priv->uncore.forcewake_count == 0) { dev_priv->uncore.forcewake_count++; + delayed = true; mod_delayed_work(dev_priv->wq, &dev_priv->uncore.force_wake_work, 1); } spin_unlock_irqrestore(&dev_priv->uncore.lock, irqflags); - intel_runtime_pm_put(dev_priv); + if (!delayed) + intel_runtime_pm_put(dev_priv); } /* We give fast paths for the really cool registers */