
[v6,0/9] drm/i915: Suspend / resume backup- and restore of LMEM.

Message ID 20210922062527.865433-1-thomas.hellstrom@linux.intel.com (mailing list archive)

Message

Thomas Hellstrom Sept. 22, 2021, 6:25 a.m. UTC
Implement backup and restore of LMEM during suspend / resume.
What complicates things a bit is the handling of pinned LMEM during
suspend, and the fact that we might be dealing with unmappable LMEM in
the future, which makes us want to restrict the number of pinned objects that
need memcpy resume.

The first two patches are prereq patches implementing object content copy
and a generic means of iterating through all objects in a region.
The third patch adds the backup / recover / restore functions and the
last two patches deal with restricting the number of objects we need to
use memcpy for.
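
As a rough illustration of the flow described above, here is a minimal
userspace sketch in C. It is not the code from this series: every name in it
(struct lmem_obj, region_process_objects(), lmem_suspend(), lmem_resume(),
the no_backup flag) is hypothetical, plain malloc()/memcpy() stand in for
system-memory backup storage and for whatever copy mechanism the driver
actually uses, and the two-pass ordering is an assumption made for the
example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an object placed in device-local memory. */
struct lmem_obj {
	void *lmem;		/* contents in (simulated) LMEM */
	void *backup;		/* system-memory copy held across suspend */
	size_t size;
	bool pinned;		/* pinned objects are handled in their own pass */
	bool no_backup;		/* contents can be recreated, so skip the copy */
	struct lmem_obj *next;
};

struct lmem_region {
	struct lmem_obj *objects;
};

struct pm_pass {
	bool pinned;		/* which class of objects this pass handles */
};

typedef int (*obj_fn)(struct lmem_obj *obj, void *arg);

/* Generic iterator: apply fn to every object, stopping at the first error. */
int region_process_objects(struct lmem_region *r, obj_fn fn, void *arg)
{
	for (struct lmem_obj *o = r->objects; o; o = o->next) {
		int err = fn(o, arg);

		if (err)
			return err;
	}
	return 0;
}

static int backup_one(struct lmem_obj *o, void *arg)
{
	const struct pm_pass *pass = arg;

	if (o->pinned != pass->pinned || o->no_backup)
		return 0;

	assert(!o->backup);	/* a second backup would leak the first */
	o->backup = malloc(o->size);
	if (!o->backup)
		return -1;
	memcpy(o->backup, o->lmem, o->size);
	return 0;
}

static int restore_one(struct lmem_obj *o, void *arg)
{
	const struct pm_pass *pass = arg;

	if (o->pinned != pass->pinned || !o->backup)
		return 0;

	memcpy(o->lmem, o->backup, o->size);
	free(o->backup);
	o->backup = NULL;
	return 0;
}

/* Suspend: back up unpinned objects first, then the pinned ones. */
int lmem_suspend(struct lmem_region *r)
{
	struct pm_pass pass = { .pinned = false };
	int err = region_process_objects(r, backup_one, &pass);

	if (err)
		return err;
	pass.pinned = true;
	return region_process_objects(r, backup_one, &pass);
}

/* Resume: restore pinned objects early, then everything else. */
int lmem_resume(struct lmem_region *r)
{
	struct pm_pass pass = { .pinned = true };
	int err = region_process_objects(r, restore_one, &pass);

	if (err)
		return err;
	pass.pinned = false;
	return region_process_objects(r, restore_one, &pass);
}
```

In this sketch, setting no_backup on a pinned object models the idea of
keeping the set of pinned objects that need a memcpy restore small: such an
object is skipped by both the backup and the restore pass.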

v2:
- Some polishing of patch 4/6, see patch commit message for details (Chris
  Wilson)
- Rework of patch 3/6.

v3:
- Comment changes in patch 2/6 (Matthew Auld)
- A number of changes to patch 3/6, see commit message.
- Slightly reword comment in patch 5/6. (Matthew Auld).

v4:
- Various cleanups, among other things reworking the ttm / lmem backup-
  and resume interfaces somewhat.

v5:
- GuC adaptations. Mark GuC LMEM objects for early resume and increase
  the suspend idle timeout.

v6:
- Add two HAX patches to make broken CI happy.

Kai Vehmanen (1):
  HAX: component: do not leave master devres group open after bind

Thomas Hellström (8):
  drm/i915/ttm: Implement a function to copy the contents of two
    TTM-based objects
  drm/i915/gem: Implement a function to process all gem objects of a
    region
  drm/i915/gt: Increase suspend timeout
  drm/i915 Implement LMEM backup and restore for suspend / resume
  drm/i915/gt: Register the migrate contexts with their engines
  drm/i915: Don't back up pinned LMEM context images and rings during
    suspend
  drm/i915: Reduce the number of objects subject to memcpy recover
  HAX: drm/i915/gem: Fix the __i915_gem_is_lmem() function

 drivers/base/component.c                      |   5 +-
 drivers/gpu/drm/i915/Makefile                 |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_context.c   |   4 +-
 drivers/gpu/drm/i915/gem/i915_gem_lmem.c      |   2 +-
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |  21 +-
 drivers/gpu/drm/i915/gem/i915_gem_pm.c        |  91 ++++++++
 drivers/gpu/drm/i915/gem/i915_gem_pm.h        |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_region.c    |  70 ++++++
 drivers/gpu/drm/i915/gem/i915_gem_region.h    |  37 ++++
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c       |  99 +++++++--
 drivers/gpu/drm/i915/gem/i915_gem_ttm.h       |  14 ++
 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c    | 206 ++++++++++++++++++
 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.h    |  26 +++
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |   2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          |   2 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          |   5 +-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h          |   4 +-
 drivers/gpu/drm/i915/gt/intel_context_types.h |   8 +
 drivers/gpu/drm/i915/gt/intel_engine_cs.c     |   4 +
 drivers/gpu/drm/i915/gt/intel_engine_pm.c     |  23 ++
 drivers/gpu/drm/i915/gt/intel_engine_pm.h     |   2 +
 drivers/gpu/drm/i915/gt/intel_engine_types.h  |   7 +
 .../drm/i915/gt/intel_execlists_submission.c  |   2 +
 drivers/gpu/drm/i915/gt/intel_ggtt.c          |   3 +-
 drivers/gpu/drm/i915/gt/intel_gt.c            |   2 +-
 drivers/gpu/drm/i915/gt/intel_gt_pm.c         |   8 +-
 drivers/gpu/drm/i915/gt/intel_gtt.c           |   3 +-
 drivers/gpu/drm/i915/gt/intel_gtt.h           |   9 +-
 drivers/gpu/drm/i915/gt/intel_lrc.c           |   3 +-
 drivers/gpu/drm/i915/gt/intel_migrate.c       |   2 +-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  13 +-
 drivers/gpu/drm/i915/gt/intel_ring.c          |   3 +-
 .../gpu/drm/i915/gt/intel_ring_submission.c   |   3 +
 drivers/gpu/drm/i915/gt/mock_engine.c         |   2 +
 drivers/gpu/drm/i915/gt/selftest_hangcheck.c  |   2 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc.c        |   3 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  12 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c      |   7 +-
 drivers/gpu/drm/i915/gvt/scheduler.c          |   2 +-
 drivers/gpu/drm/i915/i915_drv.c               |   4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c |   4 +-
 41 files changed, 658 insertions(+), 63 deletions(-)
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c
 create mode 100644 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.h

Comments

Thomas Hellstrom Sept. 22, 2021, 6:06 p.m. UTC | #1
On 9/22/21 11:05 AM, Patchwork wrote:
> Patch Details
> Series: drm/i915: Suspend / resume backup- and restore of LMEM. (rev9)
> URL: https://patchwork.freedesktop.org/series/94278/
> State: failure
> Details: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21124/index.html
>
> CI Bug Log - changes from CI_DRM_10622_full -> Patchwork_21124_full
>
> Summary
>
> FAILURE
>
> Serious unknown changes coming with Patchwork_21124_full absolutely need to be
> verified manually.
>
> If you think the reported changes have nothing to do with the changes
> introduced in Patchwork_21124_full, please notify your bug team to allow them
> to document this new failure mode, which will reduce false positives in CI.
>
> Possible new issues
>
> Here are the unknown changes that may have been introduced in
> Patchwork_21124_full:
>
> IGT changes
>
> Possible regressions
>
>   * igt@gem_exec_schedule@u-submit-golden-slice@rcs0:
>       o shard-tglb: PASS
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10622/shard-tglb3/igt@gem_exec_schedule@u-submit-golden-slice@rcs0.html>
>         -> INCOMPLETE
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21124/shard-tglb6/igt@gem_exec_schedule@u-submit-golden-slice@rcs0.html>
>

Lakshmi, this failure is unrelated.

The igt@gem_exec_schedule@u-submit-golden-slice plus some other subtests 
have been broken since igt commit

a9987a8d tests/i915/gem_exec_schedule: Convert to intel_ctx_t (v3)

Although the tests typically say SUCCESS, that is only because they are
typically interrupted by the watchdog and then move on.

Thanks,

Thomas
Vudum, Lakshminarayana Sept. 23, 2021, 2:11 a.m. UTC | #2
Failure is related to https://gitlab.freedesktop.org/drm/intel/-/issues/3797. I have re-reported the results after updating the CI bug log filters.

Lakshmi.

From: Thomas Hellström <thomas.hellstrom@linux.intel.com>
Sent: Wednesday, September 22, 2021 11:07 AM
To: intel-gfx@lists.freedesktop.org; Vudum, Lakshminarayana <lakshminarayana.vudum@intel.com>
Subject: Re: ✗ Fi.CI.IGT: failure for drm/i915: Suspend / resume backup- and restore of LMEM. (rev9)


