Message ID | 1315150502-12537-4-git-send-email-daniel.vetter@ffwll.ch (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Sun, Sep 04, 2011 at 05:35:02PM +0200, Daniel Vetter wrote: > The rps disabling code wasn't properly cancelling outstanding work > items. Also add a comment that explains why we're not racing with > the work item, that could again unmask interrupts. > > This also fixes a bug on module unload because nothing was properly > syncing up with that work item, possibly leading to it being run after > freeing dev_priv or removing the module code. > > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> > --- > drivers/gpu/drm/i915/intel_display.c | 8 +++++++- > 1 files changed, 7 insertions(+), 1 deletions(-) > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c > index 56a8554..ccd4600 100644 > --- a/drivers/gpu/drm/i915/intel_display.c > +++ b/drivers/gpu/drm/i915/intel_display.c > @@ -7483,10 +7483,16 @@ void gen6_disable_rps(struct drm_device *dev) > > I915_WRITE(GEN6_RPNSWREQ, 1 << 31); > I915_WRITE(GEN6_PMINTRMSK, 0xffffffff); > - I915_WRITE(GEN6_PMIER, 0); > + /* Complete PM interrupt masking here doesn't race with the rps work > + * item again unmasking PM interrupts because that is using PMIMR > + * (logically sitting atop of PMINTRMSK) to mask interrupts. */ > + cancel_work_sync(&dev_priv->rps_work); > > + /* Clear PMIMR and dev_priv->pm_iir in case outstanding work gets > + * cancelled before having run. */ This comment is wrong, you are clearing PMIER, not IMR. IMR gets cleared in enable() iirc > spin_lock_irq(&dev_priv->rps_lock); > dev_priv->pm_iir = 0; > + I915_WRITE(GEN6_PMIER, 0); > spin_unlock_irq(&dev_priv->rps_lock); > > I915_WRITE(GEN6_PMIIR, I915_READ(GEN6_PMIIR)); > -- > 1.7.6 > I'm not sure this actually fixes a problem. The existing code: 1. disables all interrupts (no more can occur). 2. sets pm_iir to 0 safe in rps lock <workqueues can run at this point, but IMR has no effect with IER = 0> I think you should do the cancel work sync somewhere in the code before module unload (to be correct). I just don't think this fixes a race. Ben
On Sun, Sep 04, 2011 at 05:23:30PM +0000, Ben Widawsky wrote: > On Sun, Sep 04, 2011 at 05:35:02PM +0200, Daniel Vetter wrote: > > The rps disabling code wasn't properly cancelling outstanding work > > items. Also add a comment that explains why we're not racing with > > the work item, that could again unmask interrupts. > > > > This also fixes a bug on module unload because nothing was properly > > syncing up with that work item, possibly leading to it being run after > > freeing dev_priv or removing the module code. > > > > Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> > > --- > > drivers/gpu/drm/i915/intel_display.c | 8 +++++++- > > 1 files changed, 7 insertions(+), 1 deletions(-) > > > > diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c > > index 56a8554..ccd4600 100644 > > --- a/drivers/gpu/drm/i915/intel_display.c > > +++ b/drivers/gpu/drm/i915/intel_display.c > > @@ -7483,10 +7483,16 @@ void gen6_disable_rps(struct drm_device *dev) > > > > I915_WRITE(GEN6_RPNSWREQ, 1 << 31); > > I915_WRITE(GEN6_PMINTRMSK, 0xffffffff); > > - I915_WRITE(GEN6_PMIER, 0); > > + /* Complete PM interrupt masking here doesn't race with the rps work > > + * item again unmasking PM interrupts because that is using PMIMR > > + * (logically sitting atop of PMINTRMSK) to mask interrupts. */ > > + cancel_work_sync(&dev_priv->rps_work); > > > > + /* Clear PMIMR and dev_priv->pm_iir in case outstanding work gets > > + * cancelled before having run. */ > > This comment is wrong, you are clearing PMIER, not IMR. IMR gets cleared > in enable() iirc Bleh, now it's my turn to completely screw it up ;-) Still, I think some comment that we muck around with different registers than the work item would be good. And clearing PMIMR under the lock just for paranoia so that setting dev_priv->pm_iir doesn't leave a strange interrupt mask lingering around > > spin_lock_irq(&dev_priv->rps_lock); > > dev_priv->pm_iir = 0; > > + I915_WRITE(GEN6_PMIER, 0); > > spin_unlock_irq(&dev_priv->rps_lock); > > > > I915_WRITE(GEN6_PMIIR, I915_READ(GEN6_PMIIR)); > I'm not sure this actually fixes a problem. The existing code: > 1. disables all interrupts (no more can occur). > 2. sets pm_iir to 0 safe in rps lock > <workqueues can run at this point, but IMR has no effect with IER = 0> > > I think you should do the cancel work sync somewhere in the code before > module unload (to be correct). I just don't think this fixes a race. Well, we definitely need the cancel_work_sync in the unload path somewhere. I prefer it in the rps disable function. In the context of the other patches my patch description is a bit lousy because I've really only wanted to fix the unload race (and the PMIMR inconsistency) with this patch. -Daniel
On Sun, Sep 04, 2011 at 09:17:24PM +0200, Daniel Vetter wrote: > > > spin_lock_irq(&dev_priv->rps_lock); > > > dev_priv->pm_iir = 0; > > > + I915_WRITE(GEN6_PMIER, 0); > > > spin_unlock_irq(&dev_priv->rps_lock); > > > > > > I915_WRITE(GEN6_PMIIR, I915_READ(GEN6_PMIIR)); > > I'm not sure this actually fixes a problem. The existing code: > > 1. disables all interrupts (no more can occur). > > 2. sets pm_iir to 0 safe in rps lock > > <workqueues can run at this point, but IMR has no effect with IER = 0> > > > > I think you should do the cancel work sync somewhere in the code before > > module unload (to be correct). I just don't think this fixes a race. > > Well, we definitely need the cancel_work_sync in the unload path > somewhere. I prefer it in the rps disable function. In the context of the > other patches my patch description is a bit lousy because I've really only > wanted to fix the unload race (and the PMIMR inconsistency) with this > patch. > > -Daniel I still fail to see the inconsitency. The value of PMR no longer matters once IER is cleared. What the wq or irq handler does after this point should be fine so long as everything is setup up properly at enable time. This is a minor detail. So I would rework some of your comments a bit, and I also think it's about time we create a central place to cancel workqueue items. Mostly because I think this patch is subject to a deadlock with struct_mutex (you're holding it when you call gen6_disable_rps(), but you're doing cancel_work_sync on a workqueue which attempts to acquire struct_mutex. Ben
On Sun, Sep 04, 2011 at 07:50:08PM +0000, Ben Widawsky wrote: > So I would rework some of your comments a bit, and I also think it's > about time we create a central place to cancel workqueue items. Mostly > because I think this patch is subject to a deadlock with struct_mutex > (you're holding it when you call gen6_disable_rps(), but you're doing > cancel_work_sync on a workqueue which attempts to acquire struct_mutex. Oh dang, you're correct. This will deadlock. And the PMIMR is really just cosmetic. So putting the cancel_work_sync into intel_modeset_cleanup sounds like the right approach. And hey, I've already put other work cancelling stuff there in my last fury at fixing module unload, so it can't be that bad ;-) I'll try to write a better patch tomorrow. -Daniel
diff --git a/drivers/gpu/drm/i915/intel_display.c b/drivers/gpu/drm/i915/intel_display.c index 56a8554..ccd4600 100644 --- a/drivers/gpu/drm/i915/intel_display.c +++ b/drivers/gpu/drm/i915/intel_display.c @@ -7483,10 +7483,16 @@ void gen6_disable_rps(struct drm_device *dev) I915_WRITE(GEN6_RPNSWREQ, 1 << 31); I915_WRITE(GEN6_PMINTRMSK, 0xffffffff); - I915_WRITE(GEN6_PMIER, 0); + /* Complete PM interrupt masking here doesn't race with the rps work + * item again unmasking PM interrupts because that is using PMIMR + * (logically sitting atop of PMINTRMSK) to mask interrupts. */ + cancel_work_sync(&dev_priv->rps_work); + /* Clear PMIMR and dev_priv->pm_iir in case outstanding work gets + * cancelled before having run. */ spin_lock_irq(&dev_priv->rps_lock); dev_priv->pm_iir = 0; + I915_WRITE(GEN6_PMIER, 0); spin_unlock_irq(&dev_priv->rps_lock); I915_WRITE(GEN6_PMIIR, I915_READ(GEN6_PMIIR));
The rps disabling code wasn't properly cancelling outstanding work items. Also add a comment that explains why we're not racing with the work item, that could again unmask interrupts. This also fixes a bug on module unload because nothing was properly syncing up with that work item, possibly leading to it being run after freeing dev_priv or removing the module code. Signed-off-by: Daniel Vetter <daniel.vetter@ffwll.ch> --- drivers/gpu/drm/i915/intel_display.c | 8 +++++++- 1 files changed, 7 insertions(+), 1 deletions(-)