[v7,0/6] drm/i915/ttm: Async migration

Message ID 20211122214554.371864-1-thomas.hellstrom@linux.intel.com (mailing list archive)

Thomas Hellstrom Nov. 22, 2021, 9:45 p.m. UTC
This patch series deals with async migration and async VRAM management.
It still leaves out an important part, async unbinding, which will
reduce latency further, at least when trying to migrate already active
objects.

Patch 1/6 deals with accessing and waiting for the TTM moving
fence from i915 GEM.
Patch 2 is pure code reorganization, no functional change.
Patch 3 breaks a refcounting loop involving the TTM moving fence.
Patch 4 makes the i915 TTM shrinking code handle async moves.
Patch 5 implements the TTM move() callback asynchronously. It also
introduces a utility to collect dependencies and turn them into a
single dma_fence, which is needed for the intel_migrate code.
This also affects the GEM object migrate code.
Patch 6 makes the object copy utility async as well, mainly for future
users since the only current user, suspend backup and restore, typically
will want to sync anyway.
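
As a rough illustration of the dependency-collection idea mentioned for
patch 5, here is a minimal userspace sketch. It is not the code from the
series and does not use the kernel's dma_fence API: struct toy_fence,
struct toy_deps and the toy_deps_*() helpers are hypothetical names made
up for this example. The sketch only models the concept: gather the
still-pending dependencies, skip the ones that have already signaled,
and expose them as one combined condition that is met once every
dependency has signaled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy userspace stand-in for a fence; NOT the kernel's struct dma_fence. */
struct toy_fence {
	bool signaled;
};

#define TOY_MAX_DEPS 8

/* Hypothetical collector holding the dependencies that are still pending. */
struct toy_deps {
	const struct toy_fence *fences[TOY_MAX_DEPS];
	size_t count;
};

static void toy_deps_init(struct toy_deps *deps)
{
	deps->count = 0;
}

/*
 * Add one dependency. NULL and already-signaled fences are skipped, and
 * a fence that is already tracked is not added twice.
 * Returns 0 on success, -1 if the collector is full.
 */
static int toy_deps_add(struct toy_deps *deps, const struct toy_fence *f)
{
	size_t i;

	if (!f || f->signaled)
		return 0;

	for (i = 0; i < deps->count; i++)
		if (deps->fences[i] == f)
			return 0;

	if (deps->count == TOY_MAX_DEPS)
		return -1;

	deps->fences[deps->count++] = f;
	return 0;
}

/*
 * The "single fence" view of the collection: it counts as signaled only
 * when every collected dependency has signaled.
 */
static bool toy_deps_signaled(const struct toy_deps *deps)
{
	size_t i;

	for (i = 0; i < deps->count; i++)
		if (!deps->fences[i]->signaled)
			return false;

	return true;
}
```

A caller would initialize a collector, add each fence it depends on, and
only start the dependent work (a migration blit, in the terms of this
series) once toy_deps_signaled() returns true. A real implementation
would attach callbacks rather than poll, and would hold references on
the fences it tracks.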

v2:
- Fix a couple of SPARSE warnings.
v3:
- Fix a NULL pointer dereference.
v4:
- Squash what was previously patch 1 and 2 into patch 1
- Ditch the moving fence waiting in i915_vma_pin_iomap()
- Rework how the refcounting loop is broken in patch 3. Drop region
  reference counting.
- Break what is now patch 4 out of patch 5. Add support for avoiding
  waiting for gpu when shrinking.
- A number of changes in patch 5. See the commit message for details.
v5:
- Some fixes to i915_vma_verify_bind_complete() (Matthew Auld)
- Update patches with R-B.
v6:
- Code comment update
- Re-check for fence signaled before returning -EBUSY (Matthew Auld)
- Use dma_resv_iter_is_exclusive() (Matthew Auld)
- Await all dma-resv fences before a migration blit (Matthew Auld)
v7:
- Fix yet another compilation failure in patch 1.

Maarten Lankhorst (1):
  drm/i915: Add support for moving fence waiting

Thomas Hellström (5):
  drm/i915/ttm: Move the i915_gem_obj_copy_ttm() function
  drm/i915/ttm: Drop region reference counting
  drm/i915/ttm: Correctly handle waiting for gpu when shrinking
  drm/i915/ttm: Implement asynchronous TTM moves
  drm/i915/ttm: Update i915_gem_obj_copy_ttm() to be asynchronous

 drivers/gpu/drm/i915/gem/i915_gem_object.c    |  52 +++
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   6 +
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     |   6 +
 drivers/gpu/drm/i915/gem/i915_gem_region.c    |   4 +-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |   3 +-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |   1 +
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c    |   6 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c       |  89 ++--
 drivers/gpu/drm/i915/gem/i915_gem_ttm.h       |   6 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 409 ++++++++++++++++--
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.h  |  10 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_pm.c    |   3 +
 drivers/gpu/drm/i915/gem/i915_gem_wait.c      |   4 +-
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |   2 +-
 drivers/gpu/drm/i915/gt/intel_region_lmem.c   |  10 +-
 drivers/gpu/drm/i915/i915_vma.c               |  43 +-
 drivers/gpu/drm/i915/intel_memory_region.c    |  26 +-
 drivers/gpu/drm/i915/intel_memory_region.h    |   9 +-
 drivers/gpu/drm/i915/intel_region_ttm.c       |  35 +-
 drivers/gpu/drm/i915/intel_region_ttm.h       |   2 +-
 .../drm/i915/selftests/intel_memory_region.c  |   8 +-
 drivers/gpu/drm/i915/selftests/mock_region.c  |   7 +-
 23 files changed, 599 insertions(+), 143 deletions(-)

Comments

Thomas Hellstrom Nov. 24, 2021, 10:56 a.m. UTC | #1
On 11/24/21 10:42, Patchwork wrote:
> Patch Details
> Series:  drm/i915/ttm: Async migration (rev11)
> URL:     https://patchwork.freedesktop.org/series/96798/
> State:   failure
> Details: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21672/index.html
>
> CI Bug Log - changes from CI_DRM_10921_full -> Patchwork_21672_full
>
> Summary
>
> FAILURE
>
> Serious unknown changes coming with Patchwork_21672_full absolutely
> need to be verified manually.
>
> If you think the reported changes have nothing to do with the changes
> introduced in Patchwork_21672_full, please notify your bug team to
> allow them to document this new failure mode, which will reduce false
> positives in CI.
>
> Participating hosts (11 -> 11)
>
> No changes in participating hosts
>
> Possible new issues
>
> Here are the unknown changes that may have been introduced in
> Patchwork_21672_full:
>
> IGT changes
>
> Possible regressions
>
>  * igt@gem_exec_capture@pi@rcs0:
>    - shard-skl: PASS
>      <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10921/shard-skl7/igt@gem_exec_capture@pi@rcs0.html>
>      -> INCOMPLETE
>      <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21672/shard-skl2/igt@gem_exec_capture@pi@rcs0.html>
>
>  * igt@gem_exec_create@forked@smem:
>    - shard-kbl: NOTRUN -> INCOMPLETE
>      <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21672/shard-kbl7/igt@gem_exec_create@forked@smem.html>
>
>  * igt@kms_atomic_interruptible@universal-setplane-cursor@dp-1-pipe-a:
>    - shard-kbl: PASS
>      <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_10921/shard-kbl1/igt@kms_atomic_interruptible@universal-setplane-cursor@dp-1-pipe-a.html>
>      -> INCOMPLETE
>      <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_21672/shard-kbl7/igt@kms_atomic_interruptible@universal-setplane-cursor@dp-1-pipe-a.html>
>      +2 similar issues
>
Lakshmi,

These failures are unrelated. The first one also appears elsewhere. The
other two appear to be a general problem with shard-kbl. Lots of random
hangs from that one.

/Thomas
Thomas Hellstrom Nov. 24, 2021, 3:51 p.m. UTC | #2
On Wed, 2021-11-24 at 11:56 +0100, Thomas Hellström wrote:
> Lakshmi,
> These failures are unrelated. The first one appears also elsewhere,
> The other two appears to be a general problem with shard-kbl. Lots of
> random hangs from that one.
> /Thomas
> 
Lakshmi, those shard-kbl failures appear to be machine-related (though
not the first skl failure), so there is no need to file issues for the
last two. I'll trigger a rerun.

/Thomas
Vudum, Lakshminarayana Nov. 24, 2021, 8:16 p.m. UTC | #3
Filed two new issues
https://gitlab.freedesktop.org/drm/intel/-/issues/4632
igt@gem_exec_create@forked@smem - incomplete - No warnings/errors

https://gitlab.freedesktop.org/drm/intel/-/issues/4633
igt@kms_atomic_interruptible@universal-setplane-cursor@dp-1-pipe-a - incomplete - No warnings/errors

re-reported.

Lakshmi.