Message ID | 20110904100817.16c6c4cc@bwidawsk.net (mailing list archive)
---|---
State | New, archived
On Sun, Sep 04, 2011 at 10:08:17AM -0700, Ben Widawsky wrote:
> diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> index 55518e3..3bc1479 100644
> --- a/drivers/gpu/drm/i915/i915_irq.c
> +++ b/drivers/gpu/drm/i915/i915_irq.c
> @@ -415,12 +415,7 @@ static void gen6_pm_rps_work(struct work_struct *work)
>  	gen6_set_rps(dev_priv->dev, new_delay);
>  	dev_priv->cur_delay = new_delay;
>  
> -	/*
> -	 * rps_lock not held here because clearing is non-destructive. There is
> -	 * an *extremely* unlikely race with gen6_rps_enable() that is prevented
> -	 * by holding struct_mutex for the duration of the write.
> -	 */
> -	I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
> +	I915_WRITE(GEN6_PMIMR, pm_imr & dev_priv->pm_iir);
>  	mutex_unlock(&dev_priv->dev->struct_mutex);
>  }

For this to work we'd need to hold the rps_lock (to avoid racing with the
irq handler). But imo my approach is conceptually simpler: the work func
grabs all outstanding PM interrupts and then enables them again in one go
(protected by rps_lock). And because the dev_priv->wq workqueue is
single-threaded (no point in using multiple threads when all work items
grab dev->struct_mutex) we also cannot make a mess by running work items
in the wrong order (or in parallel).
-Daniel
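For reference, a minimal sketch of the scheme Daniel describes, assuming
the i915 names of this era (dev_priv->rps_lock, dev_priv->pm_iir,
dev_priv->rps_work, GEN6_PMIMR); this is illustrative, not the actual
patch, and the frequency-computation half of the handler is elided:

static void gen6_pm_rps_work(struct work_struct *work)
{
	drm_i915_private_t *dev_priv = container_of(work, drm_i915_private_t,
						    rps_work);
	unsigned long flags;
	u32 pm_iir;

	/* Grab all outstanding PM interrupts in one snapshot; the irqsafe
	 * rps_lock keeps the irq handler from updating pm_iir or PMIMR
	 * underneath us. */
	spin_lock_irqsave(&dev_priv->rps_lock, flags);
	pm_iir = dev_priv->pm_iir;
	dev_priv->pm_iir = 0;
	I915_WRITE(GEN6_PMIMR, 0);	/* re-enable everything in one go */
	spin_unlock_irqrestore(&dev_priv->rps_lock, flags);

	/* ... compute new_delay from the pm_iir snapshot and call
	 * gen6_set_rps() under struct_mutex, as in the existing code ... */
}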
On Sun, Sep 04, 2011 at 09:26:48PM +0200, Daniel Vetter wrote:
> On Sun, Sep 04, 2011 at 10:08:17AM -0700, Ben Widawsky wrote:
> > [diff snipped]
>
> For this to work we'd need to hold the rps_lock (to avoid racing with the
> irq handler). But imo my approach is conceptually simpler: the work func
> grabs all outstanding PM interrupts and then enables them again in one go
> (protected by rps_lock).

I agree your approach is simpler, but I think we should really consider
whether my approach actually requires the lock. I *think* it doesn't. At
least in my head my patch should fix the error you spotted. I don't know,
maybe I need to think some more.

The reason I worked so hard to avoid doing it the way you did in my
original implementation is I was trying really hard not to break the
cardinal rule about minimizing time holding spinlock_irqs. To go with the
other patch, you probably want a POSTING_READ also before releasing the
spin_lock (though I think this is being a bit paranoid).

Again I need to think on this some more, but I'm also not terribly fond
of ever unconditionally setting a value to 0, as it tends to complicate
things when adding new readers or writers to the system.

In summary, let me think about whether my solution actually won't work
(feel free to contribute to my thinking), and if it doesn't then the
decision is easy.

Ben
On Sun, Sep 04, 2011 at 07:56:57PM +0000, Ben Widawsky wrote:
> On Sun, Sep 04, 2011 at 09:26:48PM +0200, Daniel Vetter wrote:
> > [diff snipped]
> >
> > For this to work we'd need to hold the rps_lock (to avoid racing with the
> > irq handler). But imo my approach is conceptually simpler: the work func
> > grabs all outstanding PM interrupts and then enables them again in one go
> > (protected by rps_lock).
>
> I agree your approach is simpler, but I think we should really consider
> whether my approach actually requires the lock. I *think* it doesn't. At
> least in my head my patch should fix the error you spotted. I don't
> know, maybe I need to think some more.

1. rps work reads dev_priv->pm_iir (anew in the line you've added).
2. irq handler runs, adds a new bit to dev_priv->pm_iir and sets PMIMR to
   dev_priv->pm_iir (under the irqsafe rps_lock).
3. rps work writes crap to PMIMR.

I.e. same race, you've just dramatically reduced the window ;-)

> The reason I worked so hard to avoid doing it the way you did in my
> original implementation is I was trying really hard not to break the
> cardinal rule about minimizing time holding spinlock_irqs. To go with
> the other patch, you probably want a POSTING_READ also before releasing
> the spin_lock (though I think this is being a bit paranoid).

The POSTING_READ was to order the PMIMR write with the PMIIR write (both
in the irq handler). There's no such ordering here (and the irq handler
can't be interrupted), so I think we're safe.
-Daniel
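To make the interleaving concrete, this is roughly what the PM path in
the irq handler looked like at the time (an abridged sketch, not
verbatim code): the whole sequence runs under the irqsafe rps_lock, so a
work func that loads dev_priv->pm_iir and writes PMIMR *without* that
lock can have all of this execute between its load and its store, after
which the store clobbers PMIMR with a stale mask.

/* In the irq handler, with pm_iir holding the bits just read from
 * GEN6_PMIIR: */
unsigned long flags;

spin_lock_irqsave(&dev_priv->rps_lock, flags);
WARN(dev_priv->pm_iir & pm_iir, "Missed a PM interrupt\n");
dev_priv->pm_iir |= pm_iir;			/* step 2: accumulate the new bit */
I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);	/* mask it until handled */
POSTING_READ(GEN6_PMIMR);	/* the ordering read Daniel mentions */
spin_unlock_irqrestore(&dev_priv->rps_lock, flags);
queue_work(dev_priv->wq, &dev_priv->rps_work);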
On Sun, Sep 04, 2011 at 10:10:30PM +0200, Daniel Vetter wrote:
> On Sun, Sep 04, 2011 at 07:56:57PM +0000, Ben Widawsky wrote:
> > [snip]
>
> 1. rps work reads dev_priv->pm_iir (anew in the line you've added).
> 2. irq handler runs, adds a new bit to dev_priv->pm_iir and sets PMIMR to
>    dev_priv->pm_iir (under the irqsafe rps_lock).
> 3. rps work writes crap to PMIMR.
>
> I.e. same race, you've just dramatically reduced the window ;-)
>
> [snip]

Oops, you're totally right. I think I meant:

-	I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
+	I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);

With regard to the POSTING_READ, the concern I had was that the write to
IMR might not land before releasing the spinlock, but I don't feel like
addressing that concern anymore.

Ben
On Sun, Sep 04, 2011 at 09:38:56PM +0000, Ben Widawsky wrote:
> Oops, you're totally right. I think I meant:
>
> -	I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
> +	I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);

Imo still racy without the irqsafe rps_lock around it. gcc is free to
compile that into a separate load and store, which the irq handler can
get in between and change dev_priv->pm_iir and PMIMR. The race is now
only one instruction wide, though ;-)
-Daniel
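Spelled out (a sketch under the same naming assumptions as above), the
narrow fix Daniel is asking for is to hold the lock across the
load-and-store pair:

unsigned long flags;

/* The single C statement below compiles to a load of dev_priv->pm_iir
 * followed by a register store; taking the irqsafe rps_lock across both
 * keeps the irq handler from changing pm_iir (and PMIMR) in between. */
spin_lock_irqsave(&dev_priv->rps_lock, flags);
I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);
spin_unlock_irqrestore(&dev_priv->rps_lock, flags);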
On Mon, 5 Sep 2011 08:38:07 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Sun, Sep 04, 2011 at 09:38:56PM +0000, Ben Widawsky wrote:
> > Oops, you're totally right. I think I meant:
> >
> > -	I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
> > +	I915_WRITE(GEN6_PMIMR, dev_priv->pm_iir);
>
> Imo still racy without the irqsafe rps_lock around it. gcc is free to
> compile that into a separate load and store, which the irq handler can
> get in between and change dev_priv->pm_iir and PMIMR. The race is now
> only one instruction wide, though ;-)
> -Daniel

You are absolutely correct. The modification to GEN6_PMIMR must be
within the protection of rps_lock.

Ben
On Sun, 4 Sep 2011 23:51:52 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Mon, 5 Sep 2011 08:38:07 +0200, Daniel Vetter <daniel@ffwll.ch> wrote:
> > [snip]
>
> You are absolutely correct. The modification to GEN6_PMIMR must be
> within the protection of rps_lock.

Right. However, I don't see why we need to hold the mutex. Why are we
touching PMIMR in gen6_enable_rps()? Afaics, it is because
gen6_disable_rps() may leave a stale PMIMR (it sets dev_priv->pm_iir to
0, causing the existing work-handler to return early and not touch
PMIMR).

I believe everything is correct if we clear PMIMR on module load, remove
the setting of PMIMR from gen6_enable_rps(), and clear PMIMR under the
rps_lock at the top of the work handler (which both Daniel and I have
desired to do... ;-)
-Chris
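A sketch of the rearrangement Chris proposes, with the usual caveats:
the names follow the i915 code of the time, and exactly where the
init-time write lives (irq install vs. rps setup) is an assumption here.

/* 1) Once at module load / irq install: start with no PM interrupts
 *    masked, so there is no stale PMIMR to inherit. */
I915_WRITE(GEN6_PMIMR, 0);

/* 2) gen6_enable_rps(): the PMIMR write is simply deleted, so the work
 *    handler no longer needs struct_mutex to avoid racing with it. */

/* 3) Top of gen6_pm_rps_work(): snapshot and clear under the rps_lock
 *    (the work item runs in process context, so plain spin_lock_irq is
 *    fine). */
spin_lock_irq(&dev_priv->rps_lock);
pm_iir = dev_priv->pm_iir;
dev_priv->pm_iir = 0;
I915_WRITE(GEN6_PMIMR, 0);
spin_unlock_irq(&dev_priv->rps_lock);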
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index 55518e3..3bc1479 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -415,12 +415,7 @@ static void gen6_pm_rps_work(struct work_struct *work)
 	gen6_set_rps(dev_priv->dev, new_delay);
 	dev_priv->cur_delay = new_delay;
 
-	/*
-	 * rps_lock not held here because clearing is non-destructive. There is
-	 * an *extremely* unlikely race with gen6_rps_enable() that is prevented
-	 * by holding struct_mutex for the duration of the write.
-	 */
-	I915_WRITE(GEN6_PMIMR, pm_imr & ~pm_iir);
+	I915_WRITE(GEN6_PMIMR, pm_imr & dev_priv->pm_iir);
 	mutex_unlock(&dev_priv->dev->struct_mutex);
 }