[v4,0/4] TTM unlockable restartable LRU list iteration

Message ID	20240306070125.27071-1-thomas.hellstrom@linux.intel.com (mailing list archive)
Headers	show Return-Path: <intel-gfx-bounces@lists.freedesktop.org> From: =?utf-8?q?Thomas_Hellstr=C3=B6m?= <thomas.hellstrom@linux.intel.com> To: intel-xe@lists.freedesktop.org, intel-gfx@lists.freedesktop.org Cc: =?utf-8?q?Thomas_Hellstr=C3=B6m?= <thomas.hellstrom@linux.intel.com>, Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>, =?utf-8?q?Christian_?= =?utf-8?q?K=C3=B6nig?= <christian.koenig@amd.com>, dri-devel@lists.freedesktop.org Subject: [PATCH v4 0/4] TTM unlockable restartable LRU list iteration Date: Wed, 6 Mar 2024 08:01:21 +0100 Message-ID: <20240306070125.27071-1-thomas.hellstrom@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: list Errors-To: intel-gfx-bounces@lists.freedesktop.org Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>
Series	TTM unlockable restartable LRU list iteration \| expand [v4,0/4] TTM unlockable restartable LRU list iteration [v4,1/4] drm/ttm: Allow TTM LRU list nodes of different types [v4,2/4] drm/ttm: Use LRU hitches [v4,3/4] drm/ttm, drm/amdgpu, drm/xe: Consider hitch moves within bulk sublist moves [v4,4/4] drm/ttm: Allow continued swapout after -ENOSPC falure

Message ID

20240306070125.27071-1-thomas.hellstrom@linux.intel.com (mailing list archive)

Headers

From: =?utf-8?q?Thomas_Hellstr=C3=B6m?= <thomas.hellstrom@linux.intel.com>
To: intel-xe@lists.freedesktop.org,
	intel-gfx@lists.freedesktop.org
Cc: =?utf-8?q?Thomas_Hellstr=C3=B6m?= <thomas.hellstrom@linux.intel.com>,
 Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>, =?utf-8?q?Christian_?=
	=?utf-8?q?K=C3=B6nig?= <christian.koenig@amd.com>,
 dri-devel@lists.freedesktop.org
Subject: [PATCH v4 0/4] TTM unlockable restartable LRU list iteration
Date: Wed,  6 Mar 2024 08:01:21 +0100
Message-ID: <20240306070125.27071-1-thomas.hellstrom@linux.intel.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Precedence: list
Errors-To: intel-gfx-bounces@lists.freedesktop.org
Sender: "Intel-gfx" <intel-gfx-bounces@lists.freedesktop.org>

Series

TTM unlockable restartable LRU list iteration | expand

Message

Thomas Hellström March 6, 2024, 7:01 a.m. UTC

This patch-set is a prerequisite for a standalone TTM shrinker
and for exhaustive TTM eviction using sleeping dma_resv locks,
which is the motivation for it.

Currently when unlocking the TTM lru list lock, iteration needs
to be restarted from the beginning, rather from the next LRU list
node. This can potentially be a big problem, because if eviction
or shrinking fails for whatever reason after unlock, restarting
is likely to cause the same failure over and over again.

There are various schemes to be able to continue the list
iteration from where we left off. One such scheme used by the
GEM LRU list traversal is to pull items already considered off
the LRU list and reinsert them when iteration is done.
This has the drawback that concurrent list iteration doesn't see
the complete list (which is bad for exhaustive eviction) and also
doesn't lend itself well to bulk-move sublists since these will
be split in the process where items from those lists are
temporarily pulled from the list and moved to the list tail.

The approach taken here is that list iterators insert themselves
into the list next position using a special list node. Iteration
is then using that list node as starting point when restarting.
Concurrent iterators just skip over the special list nodes.

This is implemented in patch 1 and 2.

For bulk move sublist the approach is the same, but when a bulk
move sublist is moved to the tail, the iterator is also moved,
causing us to skip parts of the list. That is undesirable.
Patch 3 deals with that, and when iterator detects it is
traversing a sublist, it registers with the ttm_lru_bulk_move
struct using a linked list, and when that bulk move sublist
is moved to the tail, any iterator registered with it will
first be moved to the tail of the sublist.
This is implemented in patch 3.

The restartable property is used in patch 4 to restart swapout if
needed, but the main purpose is this paves the way for
shrinker- and exhaustive eviction.

v2:
- Rework patch 3 completely.
v3:
- Fix a NULL pointer dereference found by Xe CI.
v4:
- Remove some leftover code causing build problems.

Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
Cc: Christian König <christian.koenig@amd.com>
Cc: <dri-devel@lists.freedesktop.org>

Thomas Hellström (4):
  drm/ttm: Allow TTM LRU list nodes of different types
  drm/ttm: Use LRU hitches
  drm/ttm, drm/amdgpu, drm/xe: Consider hitch moves within bulk sublist
    moves
  drm/ttm: Allow continued swapout after -ENOSPC falure

 drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |   4 +
 drivers/gpu/drm/ttm/ttm_bo.c           |   1 +
 drivers/gpu/drm/ttm/ttm_device.c       |  33 +++-
 drivers/gpu/drm/ttm/ttm_resource.c     | 228 ++++++++++++++++++++-----
 drivers/gpu/drm/xe/xe_vm.c             |   4 +
 include/drm/ttm/ttm_device.h           |   2 +
 include/drm/ttm/ttm_resource.h         |  96 +++++++++--
 7 files changed, 308 insertions(+), 60 deletions(-)

Comments

Somalapuram, Amaranath March 8, 2024, 7:43 a.m. UTC | #1

Patches are tested on AMD platform.
Repeated stress test on Unigine Heaven, memory full (VRAM + GTT + system 
SWAP), then free.
No errors/warning in kernel log.
Any suggestion specific tests?

Regards,
S.Amarnath
On 3/6/2024 12:31 PM, Thomas Hellström wrote:
> This patch-set is a prerequisite for a standalone TTM shrinker
> and for exhaustive TTM eviction using sleeping dma_resv locks,
> which is the motivation for it.
>
> Currently when unlocking the TTM lru list lock, iteration needs
> to be restarted from the beginning, rather from the next LRU list
> node. This can potentially be a big problem, because if eviction
> or shrinking fails for whatever reason after unlock, restarting
> is likely to cause the same failure over and over again.
>
> There are various schemes to be able to continue the list
> iteration from where we left off. One such scheme used by the
> GEM LRU list traversal is to pull items already considered off
> the LRU list and reinsert them when iteration is done.
> This has the drawback that concurrent list iteration doesn't see
> the complete list (which is bad for exhaustive eviction) and also
> doesn't lend itself well to bulk-move sublists since these will
> be split in the process where items from those lists are
> temporarily pulled from the list and moved to the list tail.
>
> The approach taken here is that list iterators insert themselves
> into the list next position using a special list node. Iteration
> is then using that list node as starting point when restarting.
> Concurrent iterators just skip over the special list nodes.
>
> This is implemented in patch 1 and 2.
>
> For bulk move sublist the approach is the same, but when a bulk
> move sublist is moved to the tail, the iterator is also moved,
> causing us to skip parts of the list. That is undesirable.
> Patch 3 deals with that, and when iterator detects it is
> traversing a sublist, it registers with the ttm_lru_bulk_move
> struct using a linked list, and when that bulk move sublist
> is moved to the tail, any iterator registered with it will
> first be moved to the tail of the sublist.
> This is implemented in patch 3.
>
> The restartable property is used in patch 4 to restart swapout if
> needed, but the main purpose is this paves the way for
> shrinker- and exhaustive eviction.
>
> v2:
> - Rework patch 3 completely.
> v3:
> - Fix a NULL pointer dereference found by Xe CI.
> v4:
> - Remove some leftover code causing build problems.
>
> Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
> Cc: Christian König <christian.koenig@amd.com>
> Cc: <dri-devel@lists.freedesktop.org>
>
> Thomas Hellström (4):
>    drm/ttm: Allow TTM LRU list nodes of different types
>    drm/ttm: Use LRU hitches
>    drm/ttm, drm/amdgpu, drm/xe: Consider hitch moves within bulk sublist
>      moves
>    drm/ttm: Allow continued swapout after -ENOSPC falure
>
>   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |   4 +
>   drivers/gpu/drm/ttm/ttm_bo.c           |   1 +
>   drivers/gpu/drm/ttm/ttm_device.c       |  33 +++-
>   drivers/gpu/drm/ttm/ttm_resource.c     | 228 ++++++++++++++++++++-----
>   drivers/gpu/drm/xe/xe_vm.c             |   4 +
>   include/drm/ttm/ttm_device.h           |   2 +
>   include/drm/ttm/ttm_resource.h         |  96 +++++++++--
>   7 files changed, 308 insertions(+), 60 deletions(-)
>

Thomas Hellström March 11, 2024, 1:07 p.m. UTC | #2

On Fri, 2024-03-08 at 13:13 +0530, Somalapuram, Amaranath wrote:
> Patches are tested on AMD platform.
> Repeated stress test on Unigine Heaven, memory full (VRAM + GTT +
> system 
> SWAP), then free.
> No errors/warning in kernel log.
> Any suggestion specific tests?

We are testing locally against Intel Xe CI and Intel i915 CI which
should give rather good coverage. If there are some amdgpu tests that
exercise eviction / swapping also with a lot of local objects (Vulkan
apps?) that would be great.

Thanks,
Thomas



> 
> Regards,
> S.Amarnath
> On 3/6/2024 12:31 PM, Thomas Hellström wrote:
> > This patch-set is a prerequisite for a standalone TTM shrinker
> > and for exhaustive TTM eviction using sleeping dma_resv locks,
> > which is the motivation for it.
> > 
> > Currently when unlocking the TTM lru list lock, iteration needs
> > to be restarted from the beginning, rather from the next LRU list
> > node. This can potentially be a big problem, because if eviction
> > or shrinking fails for whatever reason after unlock, restarting
> > is likely to cause the same failure over and over again.
> > 
> > There are various schemes to be able to continue the list
> > iteration from where we left off. One such scheme used by the
> > GEM LRU list traversal is to pull items already considered off
> > the LRU list and reinsert them when iteration is done.
> > This has the drawback that concurrent list iteration doesn't see
> > the complete list (which is bad for exhaustive eviction) and also
> > doesn't lend itself well to bulk-move sublists since these will
> > be split in the process where items from those lists are
> > temporarily pulled from the list and moved to the list tail.
> > 
> > The approach taken here is that list iterators insert themselves
> > into the list next position using a special list node. Iteration
> > is then using that list node as starting point when restarting.
> > Concurrent iterators just skip over the special list nodes.
> > 
> > This is implemented in patch 1 and 2.
> > 
> > For bulk move sublist the approach is the same, but when a bulk
> > move sublist is moved to the tail, the iterator is also moved,
> > causing us to skip parts of the list. That is undesirable.
> > Patch 3 deals with that, and when iterator detects it is
> > traversing a sublist, it registers with the ttm_lru_bulk_move
> > struct using a linked list, and when that bulk move sublist
> > is moved to the tail, any iterator registered with it will
> > first be moved to the tail of the sublist.
> > This is implemented in patch 3.
> > 
> > The restartable property is used in patch 4 to restart swapout if
> > needed, but the main purpose is this paves the way for
> > shrinker- and exhaustive eviction.
> > 
> > v2:
> > - Rework patch 3 completely.
> > v3:
> > - Fix a NULL pointer dereference found by Xe CI.
> > v4:
> > - Remove some leftover code causing build problems.
> > 
> > Cc: Somalapuram Amaranath <Amaranath.Somalapuram@amd.com>
> > Cc: Christian König <christian.koenig@amd.com>
> > Cc: <dri-devel@lists.freedesktop.org>
> > 
> > Thomas Hellström (4):
> >    drm/ttm: Allow TTM LRU list nodes of different types
> >    drm/ttm: Use LRU hitches
> >    drm/ttm, drm/amdgpu, drm/xe: Consider hitch moves within bulk
> > sublist
> >      moves
> >    drm/ttm: Allow continued swapout after -ENOSPC falure
> > 
> >   drivers/gpu/drm/amd/amdgpu/amdgpu_vm.c |   4 +
> >   drivers/gpu/drm/ttm/ttm_bo.c           |   1 +
> >   drivers/gpu/drm/ttm/ttm_device.c       |  33 +++-
> >   drivers/gpu/drm/ttm/ttm_resource.c     | 228
> > ++++++++++++++++++++-----
> >   drivers/gpu/drm/xe/xe_vm.c             |   4 +
> >   include/drm/ttm/ttm_device.h           |   2 +
> >   include/drm/ttm/ttm_resource.h         |  96 +++++++++--
> >   7 files changed, 308 insertions(+), 60 deletions(-)
> >

Thomas Hellström March 13, 2024, 12:26 p.m. UTC | #3

Hi!

On Mon, 2024-03-11 at 14:07 +0100, Thomas Hellström wrote:
> On Fri, 2024-03-08 at 13:13 +0530, Somalapuram, Amaranath wrote:
> > Patches are tested on AMD platform.
> > Repeated stress test on Unigine Heaven, memory full (VRAM + GTT +
> > system 
> > SWAP), then free.
> > No errors/warning in kernel log.
> > Any suggestion specific tests?
> 
> We are testing locally against Intel Xe CI and Intel i915 CI which
> should give rather good coverage. If there are some amdgpu tests that
> exercise eviction / swapping also with a lot of local objects (Vulkan
> apps?) that would be great.
> 
> Thanks,
> Thomas
> 

Any updates on this?

FWIW, For patch 3, IMO after looking a bit at other solutions, IMO this
is the preferred solution mostly because it is self-contained. In
particular if we allow drivers to iterate over the LRU lists with this
interface, most likely if we add semantics like "You must block any
bulk lru bumping if unlocking the lru_lock" That becomes pretty nasty
and will most likely end up incorrect. It might well be that we've
traversed well into a bulk move lru sublist before we try to unlock.

/Thomas