mbox series

[v2,0/5] drm/i915: Allow user to set cache at BO creation

Message ID 20230426062423.320519-1-fei.yang@intel.com (mailing list archive)
Headers show
Series drm/i915: Allow user to set cache at BO creation | expand

Message

Yang, Fei April 26, 2023, 6:24 a.m. UTC
From: Fei Yang <fei.yang@intel.com>

The first three patches in this series are taken from
https://patchwork.freedesktop.org/series/116868/
These patches are included here because the last patch
has dependency on the pat_index refactor.

This series is focusing on uAPI changes,
1. end support for set caching ioctl [PATCH 4/5]
2. add set_pat extension for gem_create [PATCH 5/5]

v2: drop one patch that was merged separately
    341ad0e8e254 drm/i915/mtl: Add PTE encode function

Fei Yang (5):
  drm/i915: preparation for using PAT index
  drm/i915: use pat_index instead of cache_level
  drm/i915: make sure correct pte encode is used
  drm/i915/mtl: end support for set caching ioctl
  drm/i915: Allow user to set cache at BO creation

 drivers/gpu/drm/i915/display/intel_dpt.c      | 12 +--
 drivers/gpu/drm/i915/gem/i915_gem_create.c    | 36 +++++++++
 drivers/gpu/drm/i915/gem/i915_gem_domain.c    | 30 +++----
 .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 10 ++-
 drivers/gpu/drm/i915/gem/i915_gem_mman.c      |  3 +-
 drivers/gpu/drm/i915/gem/i915_gem_object.c    | 67 +++++++++++++++-
 drivers/gpu/drm/i915/gem/i915_gem_object.h    |  8 ++
 .../gpu/drm/i915/gem/i915_gem_object_types.h  | 26 +++++-
 drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  9 ++-
 drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 -
 drivers/gpu/drm/i915/gem/i915_gem_stolen.c    |  4 +-
 drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
 .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
 .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
 .../drm/i915/gem/selftests/i915_gem_mman.c    |  2 +-
 drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 10 ++-
 drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 73 +++++++++--------
 drivers/gpu/drm/i915/gt/gen8_ppgtt.h          |  3 +-
 drivers/gpu/drm/i915/gt/intel_ggtt.c          | 76 +++++++++---------
 drivers/gpu/drm/i915/gt/intel_gtt.h           | 20 +++--
 drivers/gpu/drm/i915/gt/intel_migrate.c       | 47 ++++++-----
 drivers/gpu/drm/i915/gt/intel_migrate.h       | 13 ++-
 drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  6 +-
 drivers/gpu/drm/i915/gt/selftest_migrate.c    | 47 +++++------
 drivers/gpu/drm/i915/gt/selftest_reset.c      |  8 +-
 drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
 drivers/gpu/drm/i915/gt/selftest_tlb.c        |  4 +-
 drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c      | 10 ++-
 drivers/gpu/drm/i915/i915_debugfs.c           | 55 ++++++++++---
 drivers/gpu/drm/i915/i915_gem.c               | 16 +++-
 drivers/gpu/drm/i915/i915_gpu_error.c         |  8 +-
 drivers/gpu/drm/i915/i915_pci.c               | 79 ++++++++++++++++---
 drivers/gpu/drm/i915/i915_vma.c               | 16 ++--
 drivers/gpu/drm/i915/i915_vma.h               |  2 +-
 drivers/gpu/drm/i915/i915_vma_types.h         |  2 -
 drivers/gpu/drm/i915/intel_device_info.h      |  5 ++
 drivers/gpu/drm/i915/selftests/i915_gem.c     |  5 +-
 .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
 drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
 .../drm/i915/selftests/intel_memory_region.c  |  4 +-
 .../gpu/drm/i915/selftests/mock_gem_device.c  |  9 +++
 drivers/gpu/drm/i915/selftests/mock_gtt.c     |  8 +-
 include/uapi/drm/i915_drm.h                   | 36 +++++++++
 tools/include/uapi/drm/i915_drm.h             | 36 +++++++++
 44 files changed, 605 insertions(+), 243 deletions(-)

Comments

Tvrtko Ursulin April 26, 2023, 10:38 a.m. UTC | #1
On 26/04/2023 07:24, fei.yang@intel.com wrote:
> From: Fei Yang <fei.yang@intel.com>
> 
> The first three patches in this series are taken from
> https://patchwork.freedesktop.org/series/116868/
> These patches are included here because the last patch
> has dependency on the pat_index refactor.
> 
> This series is focusing on uAPI changes,
> 1. end support for set caching ioctl [PATCH 4/5]
> 2. add set_pat extension for gem_create [PATCH 5/5]
> 
> v2: drop one patch that was merged separately
>      341ad0e8e254 drm/i915/mtl: Add PTE encode function

Are the re-sends for stabilizing the series, or focusing on merge?

If the latter then opens are:

1) Link to Mesa MR reviewed and ready to merge.

2) IGT reviewed.

3) I raised an open that get/set_caching should not "lie" but return an 
error if set pat extension has been used. I don't see a good reason not 
to do that.

+ Joonas on this one.

4) Refactoring as done is not very pretty and I proposed an idea for a 
nicer approach. Feasible or not, open for discussion.

At a push I can look past that and someone can attempt to tidy the 
driver later.

But without 1-3 we cannot merge this.

Regards,

Tvrtko

> 
> Fei Yang (5):
>    drm/i915: preparation for using PAT index
>    drm/i915: use pat_index instead of cache_level
>    drm/i915: make sure correct pte encode is used
>    drm/i915/mtl: end support for set caching ioctl
>    drm/i915: Allow user to set cache at BO creation
> 
>   drivers/gpu/drm/i915/display/intel_dpt.c      | 12 +--
>   drivers/gpu/drm/i915/gem/i915_gem_create.c    | 36 +++++++++
>   drivers/gpu/drm/i915/gem/i915_gem_domain.c    | 30 +++----
>   .../gpu/drm/i915/gem/i915_gem_execbuffer.c    | 10 ++-
>   drivers/gpu/drm/i915/gem/i915_gem_mman.c      |  3 +-
>   drivers/gpu/drm/i915/gem/i915_gem_object.c    | 67 +++++++++++++++-
>   drivers/gpu/drm/i915/gem/i915_gem_object.h    |  8 ++
>   .../gpu/drm/i915/gem/i915_gem_object_types.h  | 26 +++++-
>   drivers/gpu/drm/i915/gem/i915_gem_shmem.c     |  9 ++-
>   drivers/gpu/drm/i915/gem/i915_gem_shrinker.c  |  2 -
>   drivers/gpu/drm/i915/gem/i915_gem_stolen.c    |  4 +-
>   drivers/gpu/drm/i915/gem/i915_gem_ttm_move.c  | 16 ++--
>   .../gpu/drm/i915/gem/selftests/huge_pages.c   |  2 +-
>   .../drm/i915/gem/selftests/i915_gem_migrate.c |  2 +-
>   .../drm/i915/gem/selftests/i915_gem_mman.c    |  2 +-
>   drivers/gpu/drm/i915/gt/gen6_ppgtt.c          | 10 ++-
>   drivers/gpu/drm/i915/gt/gen8_ppgtt.c          | 73 +++++++++--------
>   drivers/gpu/drm/i915/gt/gen8_ppgtt.h          |  3 +-
>   drivers/gpu/drm/i915/gt/intel_ggtt.c          | 76 +++++++++---------
>   drivers/gpu/drm/i915/gt/intel_gtt.h           | 20 +++--
>   drivers/gpu/drm/i915/gt/intel_migrate.c       | 47 ++++++-----
>   drivers/gpu/drm/i915/gt/intel_migrate.h       | 13 ++-
>   drivers/gpu/drm/i915/gt/intel_ppgtt.c         |  6 +-
>   drivers/gpu/drm/i915/gt/selftest_migrate.c    | 47 +++++------
>   drivers/gpu/drm/i915/gt/selftest_reset.c      |  8 +-
>   drivers/gpu/drm/i915/gt/selftest_timeline.c   |  2 +-
>   drivers/gpu/drm/i915/gt/selftest_tlb.c        |  4 +-
>   drivers/gpu/drm/i915/gt/uc/intel_uc_fw.c      | 10 ++-
>   drivers/gpu/drm/i915/i915_debugfs.c           | 55 ++++++++++---
>   drivers/gpu/drm/i915/i915_gem.c               | 16 +++-
>   drivers/gpu/drm/i915/i915_gpu_error.c         |  8 +-
>   drivers/gpu/drm/i915/i915_pci.c               | 79 ++++++++++++++++---
>   drivers/gpu/drm/i915/i915_vma.c               | 16 ++--
>   drivers/gpu/drm/i915/i915_vma.h               |  2 +-
>   drivers/gpu/drm/i915/i915_vma_types.h         |  2 -
>   drivers/gpu/drm/i915/intel_device_info.h      |  5 ++
>   drivers/gpu/drm/i915/selftests/i915_gem.c     |  5 +-
>   .../gpu/drm/i915/selftests/i915_gem_evict.c   |  4 +-
>   drivers/gpu/drm/i915/selftests/i915_gem_gtt.c | 15 ++--
>   .../drm/i915/selftests/intel_memory_region.c  |  4 +-
>   .../gpu/drm/i915/selftests/mock_gem_device.c  |  9 +++
>   drivers/gpu/drm/i915/selftests/mock_gtt.c     |  8 +-
>   include/uapi/drm/i915_drm.h                   | 36 +++++++++
>   tools/include/uapi/drm/i915_drm.h             | 36 +++++++++
>   44 files changed, 605 insertions(+), 243 deletions(-)
>
Yang, Fei April 26, 2023, 3:41 p.m. UTC | #2
> On 26/04/2023 07:24, fei.yang@intel.com wrote:
>> From: Fei Yang <fei.yang@intel.com>
>>
>> The first three patches in this series are taken from
>> https://patchwork.freedesktop.org/series/116868/
>> These patches are included here because the last patch
>> has dependency on the pat_index refactor.
>>
>> This series is focusing on uAPI changes,
>> 1. end support for set caching ioctl [PATCH 4/5]
>> 2. add set_pat extension for gem_create [PATCH 5/5]
>>
>> v2: drop one patch that was merged separately
>>      341ad0e8e254 drm/i915/mtl: Add PTE encode function
>
> Are the re-sends for stabilizing the series, or focusing on merge?

v2 was sent just to drop one of patches that has already been merged.

> If the latter then opens are:
>
> 1) Link to Mesa MR reviewed and ready to merge.
>
> 2) IGT reviewed.
>
> 3) I raised an open that get/set_caching should not "lie" but return an
> error if set pat extension has been used. I don't see a good reason not
> to do that.

I don't think it's "lying" to the userspace. the comparison is only valid
for objects created by KMD because only such object uses the pat_index
from the cachelevel_to_pat table.

> + Joonas on this one.
>
> 4) Refactoring as done is not very pretty and I proposed an idea for a
> nicer approach. Feasible or not, open for discussion.

Still digesting your proposal. but not sure how would you define things
like PAT_UC as that is platform dependent, not a constant.
- Fei

> At a push I can look past that and someone can attempt to tidy the
> driver later.
>
> But without 1-3 we cannot merge this.
>
> Regards,
>
> Tvrtko
Tvrtko Ursulin April 27, 2023, 2:33 p.m. UTC | #3
On 26/04/2023 16:41, Yang, Fei wrote:
>  > On 26/04/2023 07:24, fei.yang@intel.com wrote:
>  >> From: Fei Yang <fei.yang@intel.com>
>  >>
>  >> The first three patches in this series are taken from
>  >> https://patchwork.freedesktop.org/series/116868/
>  >> These patches are included here because the last patch
>  >> has dependency on the pat_index refactor.
>  >>
>  >> This series is focusing on uAPI changes,
>  >> 1. end support for set caching ioctl [PATCH 4/5]
>  >> 2. add set_pat extension for gem_create [PATCH 5/5]
>  >>
>  >> v2: drop one patch that was merged separately
>  >>      341ad0e8e254 drm/i915/mtl: Add PTE encode function
>  >
>  > Are the re-sends for stabilizing the series, or focusing on merge?
> 
> v2 was sent just to drop one of patches that has already been merged.
> 
>  > If the latter then opens are:
>  >
>  > 1) Link to Mesa MR reviewed and ready to merge.
>  >
>  > 2) IGT reviewed.
>  >
>  > 3) I raised an open that get/set_caching should not "lie" but return an
>  > error if set pat extension has been used. I don't see a good reason not
>  > to do that.
> 
> I don't think it's "lying" to the userspace. the comparison is only valid
> for objects created by KMD because only such object uses the pat_index
> from the cachelevel_to_pat table.

Lets double check my understanding is correct. Userspace sequence of 
operations:

1)
obj = gem_create()+set_pat(PAT_UC)

2a)
get_caching(obj)

What gets reported?

2b)
set_caching(obj, I915_CACHE_LLC)

What is the return code?

If answer to 2a is I915_CACHING_CACHED and to 2b) success, then please 
state a good reason why both shouldn't return an error.

> 
>  > + Joonas on this one.
>  >
>  > 4) Refactoring as done is not very pretty and I proposed an idea for a
>  > nicer approach. Feasible or not, open for discussion.
> 
> Still digesting your proposal. but not sure how would you define things
> like PAT_UC as that is platform dependent, not a constant.

"PAT_UC" in my outline was purely a driver specific value, similarly as 
I915_CACHE_... are. With the whole point that driver can ask if 
something is uncached, WT or whatever. Using the platform specific 
mapping table which converts platform specific obj->pat_index to driver 
representation of caching modes (PAT_UC etc).

Quite possible I missed some detail, but if I haven't then it could be 
a neat and lightweight solution.

Regards,

Tvrtko
Yang, Fei April 27, 2023, 4:07 p.m. UTC | #4
> On 26/04/2023 16:41, Yang, Fei wrote:
>>> On 26/04/2023 07:24, fei.yang@intel.com wrote:
>>>> From: Fei Yang <fei.yang@intel.com>
>>>>
>>>> The first three patches in this series are taken from
>>>> https://patchwork.freedesktop.org/series/116868/
>>>> These patches are included here because the last patch
>>>> has dependency on the pat_index refactor.
>>>>
>>>> This series is focusing on uAPI changes,
>>>> 1. end support for set caching ioctl [PATCH 4/5]
>>>> 2. add set_pat extension for gem_create [PATCH 5/5]
>>>>
>>>> v2: drop one patch that was merged separately
>>>>      341ad0e8e254 drm/i915/mtl: Add PTE encode function
>>>
>>> Are the re-sends for stabilizing the series, or focusing on merge?
>>
>> v2 was sent just to drop one of patches that has already been merged.
>>
>>> If the latter then opens are:
>>>
>>> 1) Link to Mesa MR reviewed and ready to merge.
>>>
>>> 2) IGT reviewed.
>>>
>>> 3) I raised an open that get/set_caching should not "lie" but return an
>>> error if set pat extension has been used. I don't see a good reason not
>>> to do that.
>>
>> I don't think it's "lying" to the userspace. the comparison is only valid
>> for objects created by KMD because only such object uses the pat_index
>> from the cachelevel_to_pat table.
>
> Lets double check my understanding is correct. Userspace sequence of
> operations:
> 1)
> obj = gem_create()+set_pat(PAT_UC)
>
> 2a)
> get_caching(obj)
> What gets reported?

I see your point here. nice catch.
That should be handled by,
if (obj->cachel_level == I915_CACHE_INVAL) /* indicated pat-index is set by userspace */
      return -EOPNOTSUPP;
Will update the patch.

>
> 2b)
> set_caching(obj, I915_CACHE_LLC)
> What is the return code?

This will either return -EOPNOTSUPP, or ignored because set_pat extension was called.

>
> If answer to 2a is I915_CACHING_CACHED and to 2b) success, then please
> state a good reason why both shouldn't return an error.
>
>>
>>> + Joonas on this one.
>>>
>>> 4) Refactoring as done is not very pretty and I proposed an idea for a
>>> nicer approach. Feasible or not, open for discussion.
>>
>> Still digesting your proposal. but not sure how would you define things
>> like PAT_UC as that is platform dependent, not a constant.
>
> "PAT_UC" in my outline was purely a driver specific value, similarly as
> I915_CACHE_... are.

Okay. Then you were suggesting to add a translation table for pat_index-to-(PAT-UC/WB/WT)?
It's going to be a N:1 mapping.

> With the whole point that driver can ask if
> something is uncached, WT or whatever. Using the platform specific
> mapping table which converts platform specific obj->pat_index to driver
> representation of caching modes (PAT_UC etc).

Are you suggesting completely remove i915_cache_level and use (PAT-UC/WB/WT) instead?
Let's say a KMD object wants to be set as WB, how you get the exact pat_index to use?
Note that there are multiple PAT indices having caching mod WB, but they are different,
e.g. do you want just WB or WB with 1-way-coherency or WB with 2-way coherency?

Also, in case a checking of pat_index is needed, do we say WB with 1-way-coherency is
equal to WB or not?

BTW, isn't PAT-UC/WB/WT the same kind of abstraction as enum i915_cache_level, I'm not
sure how that would simplify anything.

> Quite possible I missed some detail, but if I haven't then it could be
> a neat and lightweight solution.
>
> Regards,
>
> Tvrtko
Tvrtko Ursulin April 28, 2023, 9:43 a.m. UTC | #5
On 27/04/2023 17:07, Yang, Fei wrote:
>  > On 26/04/2023 16:41, Yang, Fei wrote:
>  >>> On 26/04/2023 07:24, fei.yang@intel.com wrote:
>  >>>> From: Fei Yang <fei.yang@intel.com>
>  >>>>
>  >>>> The first three patches in this series are taken from
>  >>>> https://patchwork.freedesktop.org/series/116868/
>  >>>> These patches are included here because the last patch
>  >>>> has dependency on the pat_index refactor.
>  >>>>
>  >>>> This series is focusing on uAPI changes,
>  >>>> 1. end support for set caching ioctl [PATCH 4/5]
>  >>>> 2. add set_pat extension for gem_create [PATCH 5/5]
>  >>>>
>  >>>> v2: drop one patch that was merged separately
>  >>>>      341ad0e8e254 drm/i915/mtl: Add PTE encode function
>  >>>
>  >>> Are the re-sends for stabilizing the series, or focusing on merge?
>  >>
>  >> v2 was sent just to drop one of patches that has already been merged.
>  >>
>  >>> If the latter then opens are:
>  >>>
>  >>> 1) Link to Mesa MR reviewed and ready to merge.
>  >>>
>  >>> 2) IGT reviewed.
>  >>>
>  >>> 3) I raised an open that get/set_caching should not "lie" but return an
>  >>> error if set pat extension has been used. I don't see a good reason not
>  >>> to do that.
>  >>
>  >> I don't think it's "lying" to the userspace. the comparison is only 
> valid
>  >> for objects created by KMD because only such object uses the pat_index
>  >> from the cachelevel_to_pat table.
>  >
>  > Lets double check my understanding is correct. Userspace sequence of
>  > operations:
>  > 1)
>  > obj = gem_create()+set_pat(PAT_UC)
>  >
>  > 2a)
>  > get_caching(obj)
>> What gets reported?
> 
> I see your point here. nice catch.
> That should be handled by,
> if (obj->cachel_level == I915_CACHE_INVAL) /* indicated pat-index is set 
> by userspace */
>       return -EOPNOTSUPP;
> Will update the patch.
> 
>  >
>  > 2b)
>  > set_caching(obj, I915_CACHE_LLC)
>> What is the return code?
> 
> This will either return -EOPNOTSUPP, or ignored because set_pat 
> extension was called.

I see that you made it fail instead of fake success in the latest respin 
and I definitely agree with that.

> 
>  >
>  > If answer to 2a is I915_CACHING_CACHED and to 2b) success, then please
>  > state a good reason why both shouldn't return an error.
>  >
>  >>
>  >>> + Joonas on this one.
>  >>>
>  >>> 4) Refactoring as done is not very pretty and I proposed an idea for a
>  >>> nicer approach. Feasible or not, open for discussion.
>  >>
>  >> Still digesting your proposal. but not sure how would you define things
>  >> like PAT_UC as that is platform dependent, not a constant.
>  >
>  > "PAT_UC" in my outline was purely a driver specific value, similarly as
>  > I915_CACHE_... are.
> 
> Okay. Then you were suggesting to add a translation table for 
> pat_index-to-(PAT-UC/WB/WT)?
> It's going to be a N:1 mapping.

PAT index to a value, probably a bitmask, built out of bits which define 
caching modes. Like "PAT_WB | PAT_1WAY_COHERENT", or just PAT_WB. Would 
that approach be enough to express everything?

> 
>  > With the whole point that driver can ask if
>  > something is uncached, WT or whatever. Using the platform specific
>  > mapping table which converts platform specific obj->pat_index to driver
>  > representation of caching modes (PAT_UC etc).
> 
> Are you suggesting completely remove i915_cache_level and use 
> (PAT-UC/WB/WT) instead?

Not completely but throughout the most internal code paths, which would 
all just work on PAT index. Basically object always has PAT index set, 
with a separate boolean saying whether it came from gem_create_ext or 
set_cache_level.

> Let's say a KMD object wants to be set as WB, how you get the exact 
> pat_index to use?
> Note that there are multiple PAT indices having caching mod WB, but they 
> are different,
> e.g. do you want just WB or WB with 1-way-coherency or WB with 2-way 
> coherency?

Just use the cache_level to pat_index mapping you added in the series?

> Also, in case a checking of pat_index is needed, do we say WB with 
> 1-way-coherency is
> equal to WB or not?

You mean the call sites where i915 is checking the object caching mode?

We have two call sites which check for !I915_CACHE_NONE. Those would 
just check if PAT_UC is not set.

Then the one in gpu_write_needs_clflush is checking for neither UC nor 
WT, which also directly translates.

For the WB case there aren't any callers but if we just checked for 
"base" PAT_WB bit being set that would work.

So in all cases helper which does "return bits_required | bits_set" 
seems would work fine.

> BTW, isn't PAT-UC/WB/WT the same kind of abstraction as enum 
> i915_cache_level, I'm not
> sure how that would simplify anything.

As I wrote before, I *think* it provides a way of not needing to 
sprinkle around i915_gem_get_pat_index and a simpler "has_cache_level". 
Conceptually cache_level->pat_index is done only in gem_create_ext and 
set_cache_level. Lower level code does not have to consult cache_level 
at all. And existence of tables simplifies the pretty printing code to a 
platform agnostic loop.

I *think* at least.. We can leave it all for later. My main concern was 
that UAPI needs to be clear and solid which it now seems to be.

Regards,

Tvrtko

> 
>  > Quite possible I missed some detail, but if I haven't then it could be
>  > a neat and lightweight solution.
>  >
>  > Regards,
>  >
>  > Tvrtko
Yang, Fei April 28, 2023, 7:51 p.m. UTC | #6
> On 27/04/2023 17:07, Yang, Fei wrote:
>>> On 26/04/2023 16:41, Yang, Fei wrote:
>>>>> On 26/04/2023 07:24, fei.yang@intel.com wrote:
>>>>>> From: Fei Yang <fei.yang@intel.com>
>>>>>>
>>>>>> The first three patches in this series are taken from
>>>>>> https://patchwork.freedesktop.org/series/116868/
>>>>>> These patches are included here because the last patch
>>>>>> has dependency on the pat_index refactor.
>>>>>>
>>>>>> This series is focusing on uAPI changes,
>>>>>> 1. end support for set caching ioctl [PATCH 4/5]
>>>>>> 2. add set_pat extension for gem_create [PATCH 5/5]
>>>>>>
>>>>>> v2: drop one patch that was merged separately
>>>>>>      341ad0e8e254 drm/i915/mtl: Add PTE encode function
>>>>>
>>>>> Are the re-sends for stabilizing the series, or focusing on merge?
>>>>
>>>> v2 was sent just to drop one of patches that has already been merged.
>>>>
>>>>> If the latter then opens are:
>>>>>
>>>>> 1) Link to Mesa MR reviewed and ready to merge.
>>>>>
>>>>> 2) IGT reviewed.
>>>>>
>>>>> 3) I raised an open that get/set_caching should not "lie" but return an
>>>>> error if set pat extension has been used. I don't see a good reason not
>>>>> to do that.
>>>>
>>>> I don't think it's "lying" to the userspace. the comparison is only valid
>>>> for objects created by KMD because only such object uses the pat_index
>>>> from the cachelevel_to_pat table.
>>>
>>> Lets double check my understanding is correct. Userspace sequence of
>>> operations:
>>> 1)
>>> obj = gem_create()+set_pat(PAT_UC)
>>>
>>> 2a)
>>> get_caching(obj)
>>> What gets reported?
>>
>> I see your point here. nice catch.
>> That should be handled by,
>> if (obj->cachel_level == I915_CACHE_INVAL) /* indicated pat-index is set by userspace */
>>       return -EOPNOTSUPP;
>> Will update the patch.
>>
>>>
>>> 2b)
>>> set_caching(obj, I915_CACHE_LLC)
>>> What is the return code?
>>
>> This will either return -EOPNOTSUPP, or ignored because set_pat
>> extension was called.
>
> I see that you made it fail instead of fake success in the latest respin
> and I definitely agree with that.

Thanks for your picky eyes. :)

>>>
>>> If answer to 2a is I915_CACHING_CACHED and to 2b) success, then please
>>> state a good reason why both shouldn't return an error.
>>>
>>>>
>>>>> + Joonas on this one.
>>>>>
>>>>> 4) Refactoring as done is not very pretty and I proposed an idea for a
>>>>> nicer approach. Feasible or not, open for discussion.
>>>>
>>>> Still digesting your proposal. but not sure how would you define things
>>>> like PAT_UC as that is platform dependent, not a constant.
>>>
>>> "PAT_UC" in my outline was purely a driver specific value, similarly as
>>> I915_CACHE_... are.
>>
>> Okay. Then you were suggesting to add a translation table for
>> pat_index-to-(PAT-UC/WB/WT)?
>> It's going to be a N:1 mapping.
>
> PAT index to a value, probably a bitmask, built out of bits which define
> caching modes. Like "PAT_WB | PAT_1WAY_COHERENT", or just PAT_WB. Would
> that approach be enough to express everything?

I was thinking about dumping the PAT tables from Bspec into struct intel_device_info {}.
But that would be only useful to unify all those *_setup_private_ppat() functions in
intel_gtt.c. Kernel doesn't really care much about PAT index except making sure the bits
are programmed correctly in PTE.

>>
>>> With the whole point that driver can ask if
>>> something is uncached, WT or whatever. Using the platform specific
>>> mapping table which converts platform specific obj->pat_index to driver
>>> representation of caching modes (PAT_UC etc).
>>
>> Are you suggesting completely remove i915_cache_level and use
>> (PAT-UC/WB/WT) instead?
>
> Not completely but throughout the most internal code paths, which would
> all just work on PAT index. Basically object always has PAT index set,
> with a separate boolean saying whether it came from gem_create_ext or
> set_cache_level.

What's in my mind as an improvement is to completely remove i915_cache_level.
KMD is supposed to abstract the hardware, but in this case, since the desgin
is that UMDs understand PAT index (and MOCS as well), having an abstraction
in between would only introduce ambiguity and complexity.

Also kernel doesn't need to play with all available PAT indices, so it should
be okay to add i915->pat_{uc|wb|wt} and initialize them with proper indices
respectively. That takes care of the PAT checking as well wherever it's needed,
like "if (pat_index == i915->pat_uc)"

>> Let's say a KMD object wants to be set as WB, how you get the exact pat_index to use?
>> Note that there are multiple PAT indices having caching mod WB, but they are different,
>> e.g. do you want just WB or WB with 1-way-coherency or WB with 2-way
>> coherency?
>
> Just use the cache_level to pat_index mapping you added in the series?
>
>> Also, in case a checking of pat_index is needed, do we say WB with
>> 1-way-coherency is equal to WB or not?
>
> You mean the call sites where i915 is checking the object caching mode?
> We have two call sites which check for !I915_CACHE_NONE. Those would
> just check if PAT_UC is not set.
>
> Then the one in gpu_write_needs_clflush is checking for neither UC nor
> WT, which also directly translates.
>
> For the WB case there aren't any callers but if we just checked for
> "base" PAT_WB bit being set that would work.
>
> So in all cases helper which does "return bits_required | bits_set"
> seems would work fine.
>
>> BTW, isn't PAT-UC/WB/WT the same kind of abstraction as enum
>> i915_cache_level, I'm not
>> sure how that would simplify anything.
>
> As I wrote before, I *think* it provides a way of not needing to
> sprinkle around i915_gem_get_pat_index and a simpler "has_cache_level".
> Conceptually cache_level->pat_index is done only in gem_create_ext and
> set_cache_level. Lower level code does not have to consult cache_level
> at all. And existence of tables simplifies the pretty printing code to a
> platform agnostic loop.
>
> I *think* at least.. We can leave it all for later. My main concern was
> that UAPI needs to be clear and solid which it now seems to be.

I'm hesitant to do all that I said above in one shot. But I think completely
removing enum i915_cache_level is what to do next.

-Fei

> Regards,
>
> Tvrtko
>
>>
>>> Quite possible I missed some detail, but if I haven't then it could be
>>> a neat and lightweight solution.