| Message ID | 1452768585-18661-1-git-send-email-arun.siluvery@linux.intel.com (mailing list archive) |
|---|---|
| State | New, archived |
| Headers | show |
Hi Arun,

[auto build test WARNING on drm-intel/for-linux-next]
[also build test WARNING on v4.4 next-20160114]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Arun-Siluvery/drm-i915-Clear-pending-reset-requests-during-suspend/20160114-185121
base:   git://anongit.freedesktop.org/drm-intel for-linux-next
config: x86_64-randconfig-x010-01140842 (attached as .config)
reproduce:
        # save the attached .config to linux build tree
        make ARCH=x86_64

All warnings (new ones prefixed by >>):

   drivers/gpu/drm/i915/i915_drv.c: In function 'i915_drm_suspend':
>> drivers/gpu/drm/i915/i915_drv.c:601:2: warning: 'atomic_clear_mask' is deprecated [-Wdeprecated-declarations]
     atomic_clear_mask(I915_RESET_IN_PROGRESS_FLAG,
     ^
   In file included from include/linux/debug_locks.h:5:0,
                    from include/linux/lockdep.h:23,
                    from include/linux/spinlock_types.h:18,
                    from include/linux/mutex.h:15,
                    from include/linux/kernfs.h:13,
                    from include/linux/sysfs.h:15,
                    from include/linux/kobject.h:21,
                    from include/linux/device.h:17,
                    from drivers/gpu/drm/i915/i915_drv.c:30:
   include/linux/atomic.h:458:33: note: declared here
    static inline __deprecated void atomic_clear_mask(unsigned int mask, atomic_t *v)
                                    ^

vim +/atomic_clear_mask +601 drivers/gpu/drm/i915/i915_drv.c

   585
   586		drm_kms_helper_poll_disable(dev);
   587
   588		pci_save_state(dev->pdev);
   589
   590		error = i915_gem_suspend(dev);
   591		if (error) {
   592			dev_err(&dev->pdev->dev,
   593				"GEM idle failed, resume might fail\n");
   594			goto out;
   595		}
   596
   597		/*
   598		 * Clear any pending reset requests. They should be picked up
   599		 * after resume when new work is submitted
   600		 */
 > 601		atomic_clear_mask(I915_RESET_IN_PROGRESS_FLAG,
   602				  &dev_priv->gpu_error.reset_counter);
   603
   604		intel_guc_suspend(dev);
   605
   606		intel_suspend_gt_powersave(dev);
   607
   608		/*
   609		 * Disable CRTCs directly since we want to preserve sw state

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                    Intel Corporation
On Thu, Jan 14, 2016 at 10:49:45AM +0000, Arun Siluvery wrote:
> Pending reset requests are cleared before suspending, they should be picked up
> after resume when new work is submitted.
>
> This is originally added as part of TDR patches for Gen8 from Tomas Elf which
> are under review, as suggested by Chris this is extracted as a separate patch
> as it can be useful now.
>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
> index f17a2b0..09ed83e 100644
> --- a/drivers/gpu/drm/i915/i915_drv.c
> +++ b/drivers/gpu/drm/i915/i915_drv.c
> @@ -594,6 +594,13 @@ static int i915_drm_suspend(struct drm_device *dev)
>  		goto out;
>  	}
>
> +	/*
> +	 * Clear any pending reset requests. They should be picked up
> +	 * after resume when new work is submitted
> +	 */
> +	atomic_clear_mask(I915_RESET_IN_PROGRESS_FLAG,
> +			  &dev_priv->gpu_error.reset_counter);
> +

The comment is slightly wrong. When the error tasklet in progress sees that
the flag is unset, it returns (i.e. doesn't perform the reset). This is ok,
because we are putting the device into PCI_D3: we are powering it down, which
should be our ultimate reset. So no need for the reset on resume.

Except... we do need to clean up the bookkeeping. Hmm. So what we need to do
is actually flush the reset task, and pretend it succeeded.
-Chris
On Thu, Jan 14, 2016 at 10:49:45AM +0000, Arun Siluvery wrote:
> Pending reset requests are cleared before suspending, they should be picked up
> after resume when new work is submitted.
>
> This is originally added as part of TDR patches for Gen8 from Tomas Elf which
> are under review, as suggested by Chris this is extracted as a separate patch
> as it can be useful now.
>
> Cc: Mika Kuoppala <mika.kuoppala@intel.com>
> Cc: Chris Wilson <chris@chris-wilson.co.uk>
> Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>

Pulling in the discussion we had from irc: Imo the right approach is to
simply wait for the gpu reset to finish its job. Since that could in turn
lead to a dead gpu (if we're unlucky and init_hw failed) we'd need to do
that in a loop around gem_idle, and drop dev->struct_mutex in between.
E.g.

	while (busy) {
		mutex_lock();
		gpu_idle();
		mutex_unlock();

		flush_work(reset_work);
	}

Cheers, Daniel

> ---
>  drivers/gpu/drm/i915/i915_drv.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> [snip patch]
On Tue, Jan 19, 2016 at 01:09:28PM +0100, Daniel Vetter wrote:
> On Thu, Jan 14, 2016 at 10:49:45AM +0000, Arun Siluvery wrote:
> > Pending reset requests are cleared before suspending, they should be picked up
> > after resume when new work is submitted.
> > [snip]
>
> Pulling in the discussion we had from irc: Imo the right approach is to
> simply wait for the gpu reset to finish its job. Since that could in turn
> lead to a dead gpu (if we're unlucky and init_hw failed) we'd need to do
> that in a loop around gem_idle, and drop dev->struct_mutex in between.
> E.g.
>
> 	while (busy) {
> 		mutex_lock();
> 		gpu_idle();
> 		mutex_unlock();
>
> 		flush_work(reset_work);
> 	}

Where does the requirement for gpu_idle come from? If there is a global
reset in progress, it cannot queue a request to flush the work, and
waiting on the old results will be skipped. So just wait for the global
reset to complete, i.e. flush_work().
-Chris
On Tue, Jan 19, 2016 at 01:48:05PM +0000, Chris Wilson wrote:
> On Tue, Jan 19, 2016 at 01:09:28PM +0100, Daniel Vetter wrote:
> > Pulling in the discussion we had from irc: Imo the right approach is to
> > simply wait for the gpu reset to finish its job. Since that could in turn
> > lead to a dead gpu (if we're unlucky and init_hw failed) we'd need to do
> > that in a loop around gem_idle, and drop dev->struct_mutex in between.
> > E.g.
> >
> > 	while (busy) {
> > 		mutex_lock();
> > 		gpu_idle();
> > 		mutex_unlock();
> >
> > 		flush_work(reset_work);
> > 	}
>
> Where does the requirement for gpu_idle come from? If there is a global
> reset in progress, it cannot queue a request to flush the work, and
> waiting on the old results will be skipped. So just wait for the global
> reset to complete, i.e. flush_work().

Yes, but the global reset might in turn leave a wrecked gpu behind, or at
least a non-idle one. Hence another gpu_idle on top, to make sure. If we
change init_hw() of the engines to be synchronous then we should have at
least a WARN_ON(not_idle_but_i_expected_so()); in there ...
-Daniel
On Tue, Jan 19, 2016 at 03:04:40PM +0100, Daniel Vetter wrote:
> On Tue, Jan 19, 2016 at 01:48:05PM +0000, Chris Wilson wrote:
> > Where does the requirement for gpu_idle come from? If there is a global
> > reset in progress, it cannot queue a request to flush the work, and
> > waiting on the old results will be skipped. So just wait for the global
> > reset to complete, i.e. flush_work().
>
> Yes, but the global reset might in turn leave a wrecked gpu behind, or at
> least a non-idle one. Hence another gpu_idle on top, to make sure. If we
> change init_hw() of the engines to be synchronous then we should have at
> least a WARN_ON(not_idle_but_i_expected_so()); in there ...

Does it matter on suspend? We test on resume if the GPU is usable, but
if we wanted to test on suspend then we should do

	flush_work();
	if (i915_terminally_wedged())
		/* oh noes */;
-Chris
On 19/01/2016 14:13, Chris Wilson wrote:
> On Tue, Jan 19, 2016 at 03:04:40PM +0100, Daniel Vetter wrote:
>> [snip]
>> Yes, but the global reset might in turn leave a wrecked gpu behind, or at
>> least a non-idle one. Hence another gpu_idle on top, to make sure. If we
>> change init_hw() of the engines to be synchronous then we should have at
>> least a WARN_ON(not_idle_but_i_expected_so()); in there ...

gpu_error.work was removed in b8d24a06568368076ebd5a858a011699a97bfa42; we
are doing the reset in the hangcheck work itself, so I think there is no
need to flush the work.

	while (i915_reset_in_progress(gpu_error) &&
	       !i915_terminally_wedged(gpu_error)) {
		int ret;

		mutex_lock(&dev->struct_mutex);
		ret = i915_gpu_idle(dev);
		if (ret)
			DRM_ERROR("GPU is in inconsistent state after reset\n");
		mutex_unlock(&dev->struct_mutex);
	}

If the reset is successful we are idle before suspend, otherwise we are in a
wedged state. Is this ok?

regards
Arun

> Does it matter on suspend? We test on resume if the GPU is usable, but
> if we wanted to test on suspend then we should do
>
> 	flush_work();
> 	if (i915_terminally_wedged())
> 		/* oh noes */;
> -Chris
On Tue, Jan 19, 2016 at 03:04:09PM +0000, Arun Siluvery wrote:
> On 19/01/2016 14:13, Chris Wilson wrote:
> > [snip]
>
> gpu_error.work is removed in b8d24a06568368076ebd5a858a011699a97bfa42, we

git sha1 from your private tree are meaningless in the public. Either link
to some git weburl or a mailing list archive link.

Thanks, Daniel

> are doing reset in hangcheck work itself so I think there is no need to
> flush work.
>
> 	while (i915_reset_in_progress(gpu_error) &&
> 	       !i915_terminally_wedged(gpu_error)) {
> 		int ret;
>
> 		mutex_lock(&dev->struct_mutex);
> 		ret = i915_gpu_idle(dev);
> 		if (ret)
> 			DRM_ERROR("GPU is in inconsistent state after reset\n");
> 		mutex_unlock(&dev->struct_mutex);
> 	}
>
> If the reset is successful we are idle before suspend otherwise in a wedged
> state. is this ok?
>
> regards
> Arun
On 19/01/2016 16:42, Daniel Vetter wrote:
> On Tue, Jan 19, 2016 at 03:04:09PM +0000, Arun Siluvery wrote:
>> gpu_error.work is removed in b8d24a06568368076ebd5a858a011699a97bfa42, we
>
> git sha1 from your private tree are meaningless in the public. Either link
> to some git weburl or a mailing list archive link.

It is from the drm-intel repo:
http://cgit.freedesktop.org/drm-intel/commit/?id=b8d24a06568368076ebd5a858a011699a97bfa42
http://lists.freedesktop.org/archives/intel-gfx/2015-January/059154.html

regards
Arun
On Tue, Jan 19, 2016 at 05:01:00PM +0000, Arun Siluvery wrote:
> On 19/01/2016 16:42, Daniel Vetter wrote:
> > git sha1 from your private tree are meaningless in the public. Either link
> > to some git weburl or a mailing list archive link.
>
> It is from the drm-intel repo:
> http://cgit.freedesktop.org/drm-intel/commit/?id=b8d24a06568368076ebd5a858a011699a97bfa42
> http://lists.freedesktop.org/archives/intel-gfx/2015-January/059154.html

Oh right, forgot that this landed, sorry for the confusion.

Summary of our irc discussion: we idle the gpu and flush the hangcheck
(which should flush the reset work), so at least with current upstream
there shouldn't be a bug. If there is a bug we need to understand it; we
can't just add code without a clear explanation and reasons: at best that
confuses, at worst it hides some real bugs.
-Daniel
diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index f17a2b0..09ed83e 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -594,6 +594,13 @@ static int i915_drm_suspend(struct drm_device *dev)
 		goto out;
 	}
 
+	/*
+	 * Clear any pending reset requests. They should be picked up
+	 * after resume when new work is submitted
+	 */
+	atomic_clear_mask(I915_RESET_IN_PROGRESS_FLAG,
+			  &dev_priv->gpu_error.reset_counter);
+
 	intel_guc_suspend(dev);
 
 	intel_suspend_gt_powersave(dev);
Pending reset requests are cleared before suspending, they should be picked up
after resume when new work is submitted.

This is originally added as part of TDR patches for Gen8 from Tomas Elf which
are under review, as suggested by Chris this is extracted as a separate patch
as it can be useful now.

Cc: Mika Kuoppala <mika.kuoppala@intel.com>
Cc: Chris Wilson <chris@chris-wilson.co.uk>
Signed-off-by: Arun Siluvery <arun.siluvery@linux.intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c | 7 +++++++
 1 file changed, 7 insertions(+)