[RFC,00/16] Add a TTM shrinker

Message ID 20230215161405.187368-1-thomas.hellstrom@linux.intel.com (mailing list archive)

Thomas Hellstrom Feb. 15, 2023, 4:13 p.m. UTC
This series introduces a TTM shrinker.

Currently the TTM subsystem allows a certain watermark fraction of
system memory to be pinned by GPUs. Any allocation beyond that will
cause TTM to attempt to copy memory to shmem objects for possible
later swapout so that the fraction is respected. This also happens,
unnecessarily, on systems where swapping is not available, but it still
works reasonably well in many cases.

However, there is no way for the system to swap out all of graphics
memory, even in situations where graphics processes are suspended.

So add a TTM shrinker capable of moving graphics memory pages to the
swap cache for later laundering and freeing, and, when no swap is
available, of freeing graphics memory that is kept around for
caching purposes.
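
The per-page decision described above can be modelled in userspace C.
This is only a sketch under stated assumptions: struct model_page,
enum shrink_action, model_shrink_page() and model_shrink() are
hypothetical names invented for illustration, not code from this series,
and the model follows the wording above literally (purging only when no
swap is available).

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in TTM. */
enum shrink_action {
	SHRINK_SKIP,	/* Leave the page alone. */
	SHRINK_BACKUP,	/* Move content to the swap cache, free later. */
	SHRINK_PURGE,	/* Content is only cached: drop it and free. */
};

struct model_page {
	bool pinned;	/* Assumption: pinned pages are never shrunk. */
	bool purgeable;	/* Not needed, kept around for caching only. */
};

/* Decide what a shrinker pass would do with a single page. */
static enum shrink_action
model_shrink_page(const struct model_page *p, bool swap_available)
{
	if (p->pinned)
		return SHRINK_SKIP;
	if (swap_available)
		return SHRINK_BACKUP;
	if (p->purgeable)
		return SHRINK_PURGE;
	return SHRINK_SKIP;
}

/* Count how many pages one pass over the array could release. */
static size_t
model_shrink(const struct model_page *pages, size_t n, bool swap_available)
{
	size_t freed = 0;
	size_t i;

	for (i = 0; i < n; i++)
		if (model_shrink_page(&pages[i], swap_available) != SHRINK_SKIP)
			freed++;

	return freed;
}
```

Without swap, only purgeable content can be released in this model,
which is the case the cover letter singles out.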

For devices where the shrinker is active, the watermark fraction is
disabled. For devices that do not (yet) support shrinking, or that use
dma_alloced memory which we can't insert into the swap-cache, the
watermark is kept.
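
As a rough illustration of that rule, the choice between shrinker and
watermark accounting can be written as a predicate. This is a sketch
only: struct model_pool and both helper names are hypothetical and do
not correspond to fields or functions in TTM.

```c
#include <stdbool.h>

/* Hypothetical userspace model of a TTM pool; not kernel code. */
struct model_pool {
	bool driver_enables_shrinker;	/* Driver implemented the callback. */
	bool uses_dma_alloc;		/* Pages can't go into the swap-cache. */
};

/* A pool is shrinkable only if the driver opted in and pages are swappable. */
static bool model_pool_shrinkable(const struct model_pool *pool)
{
	return pool->driver_enables_shrinker && !pool->uses_dma_alloc;
}

/* Every pool that is not shrinkable keeps the watermark fraction. */
static bool model_pool_uses_watermark(const struct model_pool *pool)
{
	return !model_pool_shrinkable(pool);
}
```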

Each driver needs to implement a callback to enable the shrinker for
its devices. Enable it for i915 as a POC. Will also be used by the
new Intel xe driver if accepted.

The parts of the series most in need of consideration and feedback are:

*) The mm part, inserting pages into the swap-cache. Is it acceptable and,
   if so, correct? It *might* be possible we can do without this part,
   but then we'd have to be able to call read_mapping_page() and
   trylock_page() on non-isolated shmem pages from reclaim context,
   and need to be able to recover from failures.

*) The TTM driver callback for shrinking

*) The additional TTM functions to mark buffer-objects as not needed, but
   good to have around for caching purposes.

*) Swapin doesn't lose content on error and is also interruptible or at
   least killable ATM. This complicates helpers. Should we
   drop this and just drop content on error, and wait for swapin
   uninterruptible? The TTM pool code could indeed do without additional
   complication...

*) Is there a better way to do shrink throttling to avoid filling the
   swap-cache completely?

*) Is it good enough for real-world workloads?

The series has been tested using the i915 driver with a 4GiB
VRAM DG1 on a system with 14GiB system memory and 16GiB SSD swap, using
an old igt-gpu-tools version, 8c0bb07b7b4d, of gem_lmem_swapping,
which overcommits system memory quite extensively.

Patch walkthrough:

Initial bugfixes, could be decoupled from the series.
drm/ttm: Fix a NULL pointer dereference.
drm/ttm/pool: Fix ttm_pool_alloc error path.

Cleanups and restructuring:
drm/ttm: Use the BIT macro for the TTM_TT_FLAGs
drm/ttm, drm/vmwgfx: Update the TTM swapout interface
drm/ttm: Unexport ttm_global_swapout()

Adding shrinker without enabling it:
drm/ttm: Don't use watermark accounting on shrinkable pools
drm/ttm: Reduce the number of used allocation orders for TTM pages
drm/ttm: Add a shrinker and shrinker accounting
drm/ttm: Introduce shrink throttling
drm/ttm: Remove pinned bos from shrinkable accounting
drm/ttm: Add a simple api to set / clear purgeable ttm_tt content

Adding the core mm part to insert and read-back pages from the swap-cache:
mm: Add interfaces to back up and recover folio contents using swap.

TTM helpers for shrinking:
drm/ttm: Make the call to ttm_tt_populate() interruptible when faulting.
drm/ttm: Provide helpers for shrinking.
drm/ttm: Use fault-injection to test error paths.

Enable i915:
drm/i915, drm/ttm: Use the TTM shrinker rather than the external shmem pool

Any feedback greatly appreciated.
Thomas

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: "Matthew Wilcox (Oracle)" <willy@infradead.org>
Cc: Miaohe Lin <linmiaohe@huawei.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Peter Xu <peterx@redhat.com>
Cc: NeilBrown <neilb@suse.de>
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Christian Koenig <christian.koenig@amd.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: <linux-graphics-maintainer@vmware.com>
Cc: <linux-mm@kvack.org>
Cc: <intel-gfx@lists.freedesktop.org>


Thomas Hellström (16):
  drm/ttm: Fix a NULL pointer dereference
  drm/ttm/pool: Fix ttm_pool_alloc error path
  drm/ttm: Use the BIT macro for the TTM_TT_FLAGs
  drm/ttm, drm/vmwgfx: Update the TTM swapout interface
  drm/ttm: Unexport ttm_global_swapout()
  drm/ttm: Don't use watermark accounting on shrinkable pools
  drm/ttm: Reduce the number of used allocation orders for TTM pages
  drm/ttm: Add a shrinker and shrinker accounting
  drm/ttm: Introduce shrink throttling.
  drm/ttm: Remove pinned bos from shrinkable accounting
  drm/ttm: Add a simple api to set / clear purgeable ttm_tt content
  mm: Add interfaces to back up and recover folio contents using swap
  drm/ttm: Make the call to ttm_tt_populate() interruptible when
    faulting
  drm/ttm: Provide helpers for shrinking
  drm/ttm: Use fault-injection to test error paths
  drm/i915, drm/ttm: Use the TTM shrinker rather than the external shmem
    pool

 drivers/gpu/drm/Kconfig                       |  11 +
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |   6 -
 .../gpu/drm/i915/gem/i915_gem_object_types.h  |   6 -
 drivers/gpu/drm/i915/gem/i915_gem_pages.c     |   5 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm.c       | 273 ++-------
 drivers/gpu/drm/i915/i915_gem.c               |   3 +-
 drivers/gpu/drm/ttm/ttm_bo.c                  |  45 +-
 drivers/gpu/drm/ttm/ttm_bo_vm.c               |  19 +-
 drivers/gpu/drm/ttm/ttm_device.c              |  85 ++-
 drivers/gpu/drm/ttm/ttm_pool.c                | 522 ++++++++++++++++--
 drivers/gpu/drm/ttm/ttm_tt.c                  | 336 +++++++++--
 drivers/gpu/drm/vmwgfx/vmwgfx_drv.c           |   3 +-
 include/drm/ttm/ttm_bo.h                      |   4 +-
 include/drm/ttm/ttm_device.h                  |  36 +-
 include/drm/ttm/ttm_pool.h                    |  19 +
 include/drm/ttm/ttm_tt.h                      |  57 +-
 include/linux/swap.h                          |  10 +
 mm/Kconfig                                    |  18 +
 mm/Makefile                                   |   2 +
 mm/swap_backup_folio.c                        | 178 ++++++
 mm/swap_backup_folio_test.c                   | 111 ++++
 21 files changed, 1361 insertions(+), 388 deletions(-)
 create mode 100644 mm/swap_backup_folio.c
 create mode 100644 mm/swap_backup_folio_test.c