mbox series

[v3,00/16] drm/i915/guc: Refactor ADS access to use iosys_map

Message ID 20220216174147.3073235-1-lucas.demarchi@intel.com (mailing list archive)
Headers show
Series drm/i915/guc: Refactor ADS access to use iosys_map | expand

Message

Lucas De Marchi Feb. 16, 2022, 5:41 p.m. UTC
original: https://patchwork.freedesktop.org/series/99378/
v2: https://patchwork.freedesktop.org/series/99711/#rev1,
    https://patchwork.freedesktop.org/series/99711/#rev2

Main changes from previous version:

	- Unrelated patches to iosys-map conversion have landed
	- Remove unecessary kernel.h include from iosys-map.h
	- Rebase on latest drm-tip

Original cover letter:

While porting i915 to arm64 we noticed some issues accessing lmem.
Some writes were getting corrupted and the final state of the buffer
didn't have exactly what we wrote. This became evident when enabling
GuC submission: depending on the number of engines the ADS struct was
being corrupted and GuC would reject it, refusin to initialize.

From Documentation/core-api/bus-virt-phys-mapping.rst:

	This memory is called "PCI memory" or "shared memory" or "IO memory" or
	whatever, and there is only one way to access it: the readb/writeb and
	related functions. You should never take the address of such memory, because
	there is really nothing you can do with such an address: it's not
	conceptually in the same memory space as "real memory" at all, so you cannot
	just dereference a pointer. (Sadly, on x86 it **is** in the same memory space,
	so on x86 it actually works to just deference a pointer, but it's not
	portable).

When reading or writing words directly to IO memory, in order to be portable
the Linux kernel provides the abstraction detailed in section "Differences
between I/O access functions" of Documentation/driver-api/device-io.rst.

This limits our ability to simply overlay our structs on top a buffer
and directly access it since that buffer may come from IO memory rather than
system memory. Hence the approach taken in intel_guc_ads.c needs to be
refactored. This is not the only place in i915 that need to be changed, but
the one causing the most problems, with a real reproducer. This first set of
patch focuses on fixing the gem object to pass the ADS

After the addition of a few helpers in the dma_buf_map API, most of
intel_guc_ads.c can be converted to use it. The exception is the regset
initialization: we'd incur into a lot of extra indirection when
reading/writing each register. So the regset is converted to use a
temporary buffer allocated on probe, which is then copied to its
final location when finishing the initialization or on gt reset.
[v3: the part unrelated to iosys-map has already landed]

Testing on some discrete cards, after this change we can correctly pass the
ADS struct to GuC and have it initialized correctly.

thanks
Lucas De Marchi

Lucas De Marchi (16):
  iosys-map: Add offset to iosys_map_memcpy_to()
  iosys-map: Add a few more helpers
  drm/i915/gt: Add helper for shmem copy to iosys_map
  drm/i915/guc: Keep iosys_map of ads_blob around
  drm/i915/guc: Add read/write helpers for ADS blob
  drm/i915/guc: Convert golden context init to iosys_map
  drm/i915/guc: Convert policies update to iosys_map
  drm/i915/guc: Convert engine record to iosys_map
  drm/i915/guc: Convert guc_ads_private_data_reset to iosys_map
  drm/i915/guc: Convert golden context prep to iosys_map
  drm/i915/guc: Replace check for golden context size
  drm/i915/guc: Convert mapping table to iosys_map
  drm/i915/guc: Convert capture list to iosys_map
  drm/i915/guc: Convert guc_mmio_reg_state_init to iosys_map
  drm/i915/guc: Convert __guc_ads_init to iosys_map
  drm/i915/guc: Remove plain ads_blob pointer

 drivers/gpu/drm/drm_cache.c                   |   2 +-
 drivers/gpu/drm/drm_fb_helper.c               |   2 +-
 drivers/gpu/drm/i915/gt/shmem_utils.c         |  32 +++
 drivers/gpu/drm/i915/gt/shmem_utils.h         |   3 +
 drivers/gpu/drm/i915/gt/uc/intel_guc.h        |   7 +-
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.c    | 233 ++++++++++--------
 drivers/gpu/drm/i915/gt/uc/intel_guc_ads.h    |   3 +-
 .../gpu/drm/i915/gt/uc/intel_guc_submission.c |  17 +-
 include/linux/iosys-map.h                     | 218 +++++++++++++++-
 9 files changed, 396 insertions(+), 121 deletions(-)

Comments

Lucas De Marchi Feb. 19, 2022, 6:46 p.m. UTC | #1
On Sat, Feb 19, 2022 at 12:47:31PM +0000, Patchwork wrote:
>== Series Details ==
>
>Series: drm/i915/guc: Refactor ADS access to use iosys_map (rev4)
>URL   : https://patchwork.freedesktop.org/series/99711/
>State : failure
>
>== Summary ==
>
>CI Bug Log - changes from CI_DRM_11250_full -> Patchwork_22334_full
>====================================================
>
>Summary
>-------
>
>  **FAILURE**
>
>  Serious unknown changes coming with Patchwork_22334_full absolutely need to be
>  verified manually.
>
>  If you think the reported changes have nothing to do with the changes
>  introduced in Patchwork_22334_full, please notify your bug team to allow them
>  to document this new failure mode, which will reduce false positives in CI.
>
>
>
>Participating hosts (13 -> 13)
>------------------------------
>
>  No changes in participating hosts
>
>Possible new issues
>-------------------
>
>  Here are the unknown changes that may have been introduced in Patchwork_22334_full:
>
>### IGT changes ###
>
>#### Possible regressions ####
>
>  * igt@gem_exec_balancer@bonded-dual:
>    - shard-tglb:         [PASS][1] -> [INCOMPLETE][2]
>   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11250/shard-tglb6/igt@gem_exec_balancer@bonded-dual.html
>   [2]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22334/shard-tglb6/igt@gem_exec_balancer@bonded-dual.html

<7>[    7.738070] i915 0000:00:02.0: [drm:intel_uc_init_early [i915]] enable_guc=0 (guc:no submission:no huc:no slpc:no)

>
>  * igt@gem_mmap_gtt@fault-concurrent-x:
>    - shard-snb:          [PASS][3] -> [INCOMPLETE][4]
>   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11250/shard-snb4/igt@gem_mmap_gtt@fault-concurrent-x.html
>   [4]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22334/shard-snb6/igt@gem_mmap_gtt@fault-concurrent-x.html

<7>[    4.614922] i915 0000:00:02.0: [drm:intel_uc_init_early [i915]] enable_guc=0 (guc:no submission:no huc:no slpc:no)


anything in this series can only change the behavior when using guc, so these failures can't be related to this series.

Lucas De Marchi
Vudum, Lakshminarayana Feb. 22, 2022, 4:50 p.m. UTC | #2
Filed two new issues
https://gitlab.freedesktop.org/drm/intel/-/issues/5162
igt@gem_exec_balancer@bonded-dual - incomplete - No logs

https://gitlab.freedesktop.org/drm/intel/-/issues/5161
igt@gem_mmap_gtt@fault-concurrent-x - incomplete - unhandled error in i915_error_to_vmf_fault: -105

#5161 looks like a new issue but yet to happen on post-merge.

Thanks,
Lakshmi.

-----Original Message-----
From: De Marchi, Lucas <lucas.demarchi@intel.com> 
Sent: Saturday, February 19, 2022 10:46 AM
To: intel-gfx@lists.freedesktop.org
Cc: Vudum, Lakshminarayana <lakshminarayana.vudum@intel.com>
Subject: Re: ✗ Fi.CI.IGT: failure for drm/i915/guc: Refactor ADS access to use iosys_map (rev4)

On Sat, Feb 19, 2022 at 12:47:31PM +0000, Patchwork wrote:
>== Series Details ==
>
>Series: drm/i915/guc: Refactor ADS access to use iosys_map (rev4)
>URL   : https://patchwork.freedesktop.org/series/99711/
>State : failure
>
>== Summary ==
>
>CI Bug Log - changes from CI_DRM_11250_full -> Patchwork_22334_full 
>====================================================
>
>Summary
>-------
>
>  **FAILURE**
>
>  Serious unknown changes coming with Patchwork_22334_full absolutely 
> need to be  verified manually.
>
>  If you think the reported changes have nothing to do with the changes  
> introduced in Patchwork_22334_full, please notify your bug team to 
> allow them  to document this new failure mode, which will reduce false positives in CI.
>
>
>
>Participating hosts (13 -> 13)
>------------------------------
>
>  No changes in participating hosts
>
>Possible new issues
>-------------------
>
>  Here are the unknown changes that may have been introduced in Patchwork_22334_full:
>
>### IGT changes ###
>
>#### Possible regressions ####
>
>  * igt@gem_exec_balancer@bonded-dual:
>    - shard-tglb:         [PASS][1] -> [INCOMPLETE][2]
>   [1]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11250/shard-tglb6/igt@gem_exec_balancer@bonded-dual.html
>   [2]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22334/shard-tglb6/i
> gt@gem_exec_balancer@bonded-dual.html

<7>[    7.738070] i915 0000:00:02.0: [drm:intel_uc_init_early [i915]] enable_guc=0 (guc:no submission:no huc:no slpc:no)

>
>  * igt@gem_mmap_gtt@fault-concurrent-x:
>    - shard-snb:          [PASS][3] -> [INCOMPLETE][4]
>   [3]: https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_11250/shard-snb4/igt@gem_mmap_gtt@fault-concurrent-x.html
>   [4]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_22334/shard-snb6/ig
> t@gem_mmap_gtt@fault-concurrent-x.html

<7>[    4.614922] i915 0000:00:02.0: [drm:intel_uc_init_early [i915]] enable_guc=0 (guc:no submission:no huc:no slpc:no)


anything in this series can only change the behavior when using guc, so these failures can't be related to this series.

Lucas De Marchi