Message ID | 1308870382-1587-1-git-send-email-ben@bwidawsk.net (mailing list archive) |
---|---|
State | New, archived |
On Thu, 23 Jun 2011 16:06:22 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
>
> Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_drv.c |    1 +
>  1 files changed, 1 insertions(+), 0 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index 0defd42..9292499 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -579,6 +579,7 @@ int i915_reset(struct drm_device *dev, u8 flags)
>  	} else switch (INTEL_INFO(dev)->gen) {
>  	case 6:
>  		ret = gen6_do_reset(dev, flags);
> +		atomic_set(&dev_priv->forcewake_count, 0);
>  		break;
>  	case 5:
>  		ret = ironlake_do_reset(dev, flags);

Can forcewake be non-zero here? If it has been bumped by a user wakelock,
then what happens when that is subsequently released? I don't think this
is safe...

What scenario are you trying to fix?
-Chris
On Fri, Jun 24, 2011 at 12:45:27AM +0100, Chris Wilson wrote:
> On Thu, 23 Jun 2011 16:06:22 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> >
> > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.c |    1 +
> >  1 files changed, 1 insertions(+), 0 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > index 0defd42..9292499 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.c
> > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > @@ -579,6 +579,7 @@ int i915_reset(struct drm_device *dev, u8 flags)
> >  	} else switch (INTEL_INFO(dev)->gen) {
> >  	case 6:
> >  		ret = gen6_do_reset(dev, flags);
> > +		atomic_set(&dev_priv->forcewake_count, 0);
> >  		break;
> >  	case 5:
> >  		ret = ironlake_do_reset(dev, flags);
>
> Can forcewake be non-zero here? If it has been bumped by a user wakelock,
> then what happens when that is subsequently released? I don't think this
> is safe...
>
> What scenario are you trying to fix?
> -Chris

This is not the cleanest fix, but the problem is the following:

1. User bumps refcount
2. GPU hangs
3. Reset occurs
4. User doesn't close the file (or even the race before the user closes
   the file after the reset) the driver is now completely screwed in
   this case, once the user does close the file, things will go back to
   normal.

I was actually just about to respond to my original email to say this
belongs in -fixes (unless I'm confused).

Ben
On Thu, Jun 23, 2011 at 07:00:50PM -0700, Ben Widawsky wrote:
> On Fri, Jun 24, 2011 at 12:45:27AM +0100, Chris Wilson wrote:
> > On Thu, 23 Jun 2011 16:06:22 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> > >
> > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.c |    1 +
> > >  1 files changed, 1 insertions(+), 0 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > > index 0defd42..9292499 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > @@ -579,6 +579,7 @@ int i915_reset(struct drm_device *dev, u8 flags)
> > >  	} else switch (INTEL_INFO(dev)->gen) {
> > >  	case 6:
> > >  		ret = gen6_do_reset(dev, flags);
> > > +		atomic_set(&dev_priv->forcewake_count, 0);
> > >  		break;
> > >  	case 5:
> > >  		ret = ironlake_do_reset(dev, flags);
> >
> > Can forcewake be non-zero here? If it has been bumped by a user wakelock,
> > then what happens when that is subsequently released? I don't think this
> > is safe...
> >
> > What scenario are you trying to fix?
> > -Chris
>
> This is not the cleanest fix, but the problem is the following:
>
> 1. User bumps refcount
> 2. GPU hangs
> 3. Reset occurs
> 4. User doesn't close the file (or even the race before the user closes
>    the file after the reset) the driver is now completely screwed in
>    this case, once the user does close the file, things will go back to
>    normal.
>
> I was actually just about to respond to my original email to say this
> belongs in -fixes (unless I'm confused).
>
> Ben

Just realized that you're right. My code is buggy at step 4 when the
user closes the file... I do think we need some fix though. Agree?
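
[Editor's sketch] The two failure modes discussed so far can be reproduced with a small userspace model. This is not driver code: `struct fw_model`, `fw_get()`, `fw_put()`, `fw_reset()` and the two scenario helpers are hypothetical names, and the model assumes (as the thread suspects but has not yet confirmed at this point) that a GPU reset drops the hardware forcewake state, and that only the 0 <-> 1 refcount transitions touch the hardware.

```c
#include <assert.h>

/* Hypothetical userspace model of the refcount; not the i915 code. */
struct fw_model {
	int count;    /* software refcount, stands in for forcewake_count */
	int hw_awake; /* 1 while the modelled hardware is held awake */
};

/* Take a reference; only the 0 -> 1 transition wakes the hardware. */
static void fw_get(struct fw_model *m)
{
	assert(m->count >= 0); /* a get should never see a negative count */
	if (m->count++ == 0)
		m->hw_awake = 1;
}

/* Drop a reference; only the 1 -> 0 transition lets the hardware sleep. */
static void fw_put(struct fw_model *m)
{
	if (--m->count == 0)
		m->hw_awake = 0;
}

/*
 * Modelled reset. Assumption: the reset clears the hardware wake state.
 * zero_count != 0 mimics the atomic_set(..., 0) in the patch.
 */
static void fw_reset(struct fw_model *m, int zero_count)
{
	m->hw_awake = 0;
	if (zero_count)
		m->count = 0;
}

/* Steps 1-3: user get, hang, reset. Is the count stale afterwards? */
static int fw_stale_after_reset(int zero_count)
{
	struct fw_model m = { 0, 0 };

	fw_get(&m);
	fw_reset(&m, zero_count);
	return m.count > 0 && !m.hw_awake;
}

/* Steps 1-4 with the user closing the file; returns the final count. */
static int fw_scenario(int zero_count)
{
	struct fw_model m = { 0, 0 };

	fw_get(&m);
	fw_reset(&m, zero_count);
	fw_put(&m);
	return m.count;
}
```

In this model, `fw_stale_after_reset(0)` reports the original problem (count says "awake", hardware is not), while `fw_scenario(1)` ends with a count of -1, which is the underflow Chris asked about: the user's later release no longer balances anything.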
On Thu, 23 Jun 2011 19:02:32 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> On Thu, Jun 23, 2011 at 07:00:50PM -0700, Ben Widawsky wrote:
> > On Fri, Jun 24, 2011 at 12:45:27AM +0100, Chris Wilson wrote:
> > > On Thu, 23 Jun 2011 16:06:22 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> > > >
> > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > > ---
> > > >  drivers/gpu/drm/i915/i915_drv.c |    1 +
> > > >  1 files changed, 1 insertions(+), 0 deletions(-)
> > > >
> > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > > > index 0defd42..9292499 100644
> > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > @@ -579,6 +579,7 @@ int i915_reset(struct drm_device *dev, u8 flags)
> > > >  	} else switch (INTEL_INFO(dev)->gen) {
> > > >  	case 6:
> > > >  		ret = gen6_do_reset(dev, flags);
> > > > +		atomic_set(&dev_priv->forcewake_count, 0);
> > > >  		break;
> > > >  	case 5:
> > > >  		ret = ironlake_do_reset(dev, flags);
> > >
> > > Can forcewake be non-zero here? If it has been bumped by a user wakelock,
> > > then what happens when that is subsequently released? I don't think this
> > > is safe...
> > >
> > > What scenario are you trying to fix?
> > > -Chris
> >
> > This is not the cleanest fix, but the problem is the following:
> >
> > 1. User bumps refcount
> > 2. GPU hangs
> > 3. Reset occurs
> > 4. User doesn't close the file (or even the race before the user closes
> >    the file after the reset) the driver is now completely screwed in
> >    this case, once the user does close the file, things will go back to
> >    normal.
> >
> > I was actually just about to respond to my original email to say this
> > belongs in -fixes (unless I'm confused).
> >
> > Ben
>
> Just realized that you're right. My code is buggy at step 4 when the
> user closes the file... I do think we need some fix though. Agree?

Are we sure that the GT forcedwake is hammered along with the GPU reset? I
haven't checked but that's the crux of the issue...

Assuming it is, I see the problem you're trying to solve (sleep is good!).
Even if it isn't, we could perform the forcedwake sequence so that our
refcnt was back in sync with the hardware. If we continue to presume that
struct_mutex is the one and only guard for forcedwake, then we should be
race free?

Another solution would be to defer the reset until the forcedwake refcnt
drops to zero. But that conflates the notion of a resetlock with the
wakelock (although we could say that the user wakelock is the combination
of forcedwakelock and resetlock).

Something to think about, at least :)
-Chris
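
[Editor's sketch] Chris's first suggestion, replaying the forcewake sequence after the reset so the refcount and the hardware agree again, might look roughly like the following in a userspace model. All names here (`struct fw_state`, `fw_hw_wake()`, `fw_reset_resync()`, `fw_put()`) are hypothetical, the model assumes the reset clears the hardware wake state, and it ignores locking entirely (the thread assumes struct_mutex covers that).

```c
#include <assert.h>

/* Hypothetical model state; not the i915 driver structures. */
struct fw_state {
	int count;    /* outstanding forcewake references */
	int hw_awake; /* 1 while the modelled hardware is held awake */
};

/* Stand-in for the hardware wake sequence. */
static void fw_hw_wake(struct fw_state *s)
{
	s->hw_awake = 1;
}

/*
 * Modelled reset that leaves the refcount alone and instead replays
 * the wake sequence when references are still outstanding.
 * Assumption: the reset itself drops the hardware wake state.
 */
static void fw_reset_resync(struct fw_state *s)
{
	assert(s->count >= 0);
	s->hw_awake = 0;
	if (s->count > 0)
		fw_hw_wake(s);
}

/* Drop a reference; only the 1 -> 0 transition lets the hardware sleep. */
static void fw_put(struct fw_state *s)
{
	if (--s->count == 0)
		s->hw_awake = 0;
}
```

Because the count is never overwritten, a user who took a reference before the reset can still release it afterwards and the count returns to zero rather than going negative.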
On Fri, Jun 24, 2011 at 08:54:24AM +0100, Chris Wilson wrote:
> On Thu, 23 Jun 2011 19:02:32 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> > On Thu, Jun 23, 2011 at 07:00:50PM -0700, Ben Widawsky wrote:
> > > On Fri, Jun 24, 2011 at 12:45:27AM +0100, Chris Wilson wrote:
> > > > On Thu, 23 Jun 2011 16:06:22 -0700, Ben Widawsky <ben@bwidawsk.net> wrote:
> > > > >
> > > > > Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/i915_drv.c |    1 +
> > > > >  1 files changed, 1 insertions(+), 0 deletions(-)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> > > > > index 0defd42..9292499 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_drv.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_drv.c
> > > > > @@ -579,6 +579,7 @@ int i915_reset(struct drm_device *dev, u8 flags)
> > > > >  	} else switch (INTEL_INFO(dev)->gen) {
> > > > >  	case 6:
> > > > >  		ret = gen6_do_reset(dev, flags);
> > > > > +		atomic_set(&dev_priv->forcewake_count, 0);
> > > > >  		break;
> > > > >  	case 5:
> > > > >  		ret = ironlake_do_reset(dev, flags);
> > > >
> > > > Can forcewake be non-zero here? If it has been bumped by a user wakelock,
> > > > then what happens when that is subsequently released? I don't think this
> > > > is safe...
> > > >
> > > > What scenario are you trying to fix?
> > > > -Chris
> > >
> > > This is not the cleanest fix, but the problem is the following:
> > >
> > > 1. User bumps refcount
> > > 2. GPU hangs
> > > 3. Reset occurs
> > > 4. User doesn't close the file (or even the race before the user closes
> > >    the file after the reset) the driver is now completely screwed in
> > >    this case, once the user does close the file, things will go back to
> > >    normal.
> > >
> > > I was actually just about to respond to my original email to say this
> > > belongs in -fixes (unless I'm confused).
> > >
> > > Ben
> >
> > Just realized that you're right. My code is buggy at step 4 when the
> > user closes the file... I do think we need some fix though. Agree?
>
> Are we sure that the GT forcedwake is hammered along with the GPU reset? I
> haven't checked but that's the crux of the issue...

Yes, the test I am performing leads me to believe so. You can try
yourself and tell me what you think:
forcewaked - you remember that nifty app I posted ;-)
gpu reset
intel_reg_write anything < 0x40000

>
> Assuming it is, I see the problem you're trying to solve (sleep is good!).
> Even if it isn't, we could perform the forcedwake sequence so that our
> refcnt was back in sync with the hardware. If we continue to presume that
> struct_mutex is the one and only guard for forcedwake, then we should be
> race free?

I believe the problem only exists with user initiated forcewake.
Fortunately that debugfs entry is root only, and your average person
won't be using it. I'll modify this patch to be a WARN_ON instead of
atomic_set(). That will be helpful in proving it.

> Something to think about, at least :)
> -Chris

You know, I was really excited to post this because I felt it could be
really helpful, but now that you've convinced me this is only a problem
for users using the debugfs forcewakes (which is probably only me at
this moment), let me go think about it again.

Ben
Signed-off-by: Ben Widawsky <ben@bwidawsk.net>
---
 drivers/gpu/drm/i915/i915_drv.c |    1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 0defd42..9292499 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -579,6 +579,7 @@ int i915_reset(struct drm_device *dev, u8 flags)
 	} else switch (INTEL_INFO(dev)->gen) {
 	case 6:
 		ret = gen6_do_reset(dev, flags);
+		atomic_set(&dev_priv->forcewake_count, 0);
 		break;
 	case 5:
 		ret = ironlake_do_reset(dev, flags);