diff mbox series

[v2] drm/i915/display: Check GGTT to determine phys_base

Message ID 20231206184736.3769657-1-pazz@chromium.org (mailing list archive)
State New, archived
Headers show
Series [v2] drm/i915/display: Check GGTT to determine phys_base | expand

Commit Message

Paz Zcharya Dec. 6, 2023, 6:46 p.m. UTC
There was an assumption that for iGPU there should be a 1:1 mapping
of GGTT to physical address pointing to the framebuffer.
This assumption is not strictly true effective generation 8 or newer.
Fix that by checking GGTT to determine the phys address on gen8+.

The following algorithm for phys_base should be valid for all platforms:
1. Find pte
2. if(IS_DGFX(i915) && pte & GEN12_GGTT_PTE_LM) mem =
i915->mm.regions[INTEL_REGION_LMEM_0] else mem = i915->mm.stolen_region
3. phys_base = (pte & I915_GTT_PAGE_MASK) - mem->region.start;

- On older platforms, stolen_region points to system memory, starting at 0
- on DG2, it uses lmem region which starts at 0 as well
- on MTL, stolen_region points to stolen-local which starts at 0x800000

Changes from v1:
  - Add an if statement for gen7-, where there is a 1:1 mapping

Signed-off-by: Paz Zcharya <pazz@chromium.org>
---

 .../drm/i915/display/intel_plane_initial.c    | 64 +++++++++++--------
 1 file changed, 39 insertions(+), 25 deletions(-)

Comments

Andrzej Hajda Dec. 7, 2023, 10:10 a.m. UTC | #1
On 07.12.2023 01:18, Patchwork wrote:
> *Patch Details*
> *Series:*	drm/i915/display: Check GGTT to determine phys_base (rev2)
> *URL:*	https://patchwork.freedesktop.org/series/127130/ 
> <https://patchwork.freedesktop.org/series/127130/>
> *State:*	failure
> *Details:* 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/index.html 
> <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/index.html>
> 
> 
>   CI Bug Log - changes from CI_DRM_13990 -> Patchwork_127130v2
> 
> 
>     Summary
> 
> *FAILURE*
> 
> Serious unknown changes coming with Patchwork_127130v2 absolutely need to be
> verified manually.
> 
> If you think the reported changes have nothing to do with the changes
> introduced in Patchwork_127130v2, please notify your bug team 
> (I915-ci-infra@lists.freedesktop.org) to allow them
> to document this new failure mode, which will reduce false positives in CI.
> 
> External URL: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/index.html
> 
> 
>     Participating hosts (37 -> 34)
> 
> Missing (3): fi-pnv-d510 fi-snb-2520m bat-dg1-5
> 
> 
>     Possible new issues
> 
> Here are the unknown changes that may have been introduced in 
> Patchwork_127130v2:
> 
> 
>       IGT changes
> 
> 
>         Possible regressions
> 
>   * igt@i915_module_load@load:
>       o bat-mtlp-8: PASS
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-mtlp-8/igt@i915_module_load@load.html> -> INCOMPLETE <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-mtlp-8/igt@i915_module_load@load.html>


It seems related. I think the patch is correct but it just unveils other 
display take-over issues.
Ie with this patch initial_plane_vma returns valid buffer, but 
subsequent display code fails miserably with kernel panic.

So until this is not solved, we shouldn't merge the patch, IMO.

CC: i915 maintainers and display developers

Regards
Andrzej

> 
> 
>     Known issues
> 
> Here are the changes found in Patchwork_127130v2 that come from known 
> issues:
> 
> 
>       IGT changes
> 
> 
>         Issues hit
> 
>   * igt@kms_pm_backlight@basic-brightness@edp-1:
>       o bat-rplp-1: NOTRUN -> ABORT
>         <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-rplp-1/igt@kms_pm_backlight@basic-brightness@edp-1.html> (i915#8668 <https://gitlab.freedesktop.org/drm/intel/issues/8668>)
> 
> 
>         Possible fixes
> 
>   *
> 
>     igt@gem_exec_suspend@basic-s0@lmem0:
> 
>       o bat-dg2-9: INCOMPLETE
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-dg2-9/igt@gem_exec_suspend@basic-s0@lmem0.html> (i915#9275 <https://gitlab.freedesktop.org/drm/intel/issues/9275>) -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-dg2-9/igt@gem_exec_suspend@basic-s0@lmem0.html>
>   *
> 
>     igt@kms_flip@basic-flip-vs-dpms@d-dp6:
> 
>       o bat-adlp-11: DMESG-FAIL
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-adlp-11/igt@kms_flip@basic-flip-vs-dpms@d-dp6.html> (i915#6868 <https://gitlab.freedesktop.org/drm/intel/issues/6868>) -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-adlp-11/igt@kms_flip@basic-flip-vs-dpms@d-dp6.html>
>   *
> 
>     igt@kms_flip@basic-flip-vs-modeset@d-dp6:
> 
>       o bat-adlp-11: DMESG-WARN
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-adlp-11/igt@kms_flip@basic-flip-vs-modeset@d-dp6.html> -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-adlp-11/igt@kms_flip@basic-flip-vs-modeset@d-dp6.html>
>   *
> 
>     igt@kms_pipe_crc_basic@read-crc-frame-sequence@pipe-d-edp-1:
> 
>       o bat-rplp-1: ABORT
>         <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-rplp-1/igt@kms_pipe_crc_basic@read-crc-frame-sequence@pipe-d-edp-1.html> (i915#8668 <https://gitlab.freedesktop.org/drm/intel/issues/8668>) -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-rplp-1/igt@kms_pipe_crc_basic@read-crc-frame-sequence@pipe-d-edp-1.html>
> 
> 
>     Build changes
> 
>   * Linux: CI_DRM_13990 -> Patchwork_127130v2
> 
> CI-20190529: 20190529
> CI_DRM_13990: 85d33d0ad82a5c1a71492f14a5ceb67ada6a22d8 @ 
> git://anongit.freedesktop.org/gfx-ci/linux
> IGT_7626: 154b7288552cd7ed3033f8ef396e88d0bd1b7646 @ 
> https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
> Patchwork_127130v2: 85d33d0ad82a5c1a71492f14a5ceb67ada6a22d8 @ 
> git://anongit.freedesktop.org/gfx-ci/linux
> 
> 
>       Linux commits
> 
> 43f210e851cd drm/i915/display: Check GGTT to determine phys_base
>
Andrzej Hajda Dec. 7, 2023, 11:26 a.m. UTC | #2
On 07.12.2023 11:10, Andrzej Hajda wrote:
> On 07.12.2023 01:18, Patchwork wrote:
>> *Patch Details*
>> *Series:*    drm/i915/display: Check GGTT to determine phys_base (rev2)
>> *URL:*    https://patchwork.freedesktop.org/series/127130/ 
>> <https://patchwork.freedesktop.org/series/127130/>
>> *State:*    failure
>> *Details:* 
>> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/index.html 
>> <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/index.html>
>>
>>
>>   CI Bug Log - changes from CI_DRM_13990 -> Patchwork_127130v2
>>
>>
>>     Summary
>>
>> *FAILURE*
>>
>> Serious unknown changes coming with Patchwork_127130v2 absolutely need 
>> to be
>> verified manually.
>>
>> If you think the reported changes have nothing to do with the changes
>> introduced in Patchwork_127130v2, please notify your bug team 
>> (I915-ci-infra@lists.freedesktop.org) to allow them
>> to document this new failure mode, which will reduce false positives 
>> in CI.
>>
>> External URL: 
>> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/index.html
>>
>>
>>     Participating hosts (37 -> 34)
>>
>> Missing (3): fi-pnv-d510 fi-snb-2520m bat-dg1-5
>>
>>
>>     Possible new issues
>>
>> Here are the unknown changes that may have been introduced in 
>> Patchwork_127130v2:
>>
>>
>>       IGT changes
>>
>>
>>         Possible regressions
>>
>>   * igt@i915_module_load@load:
>>       o bat-mtlp-8: PASS
>>         
>> <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-mtlp-8/igt@i915_module_load@load.html> -> INCOMPLETE <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-mtlp-8/igt@i915_module_load@load.html>
> 
> 
> It seems related. I think the patch is correct but it just unveils other 
> display take-over issues.
> Ie with this patch initial_plane_vma returns valid buffer, but 
> subsequent display code fails miserably with kernel panic.
> 
> So until this is not solved, we shouldn't merge the patch, IMO.
> 
> CC: i915 maintainers and display developers


After taking a look on panic log [1], I have found:
[drm:i915_init_ggtt [i915]] Failed to reserve top of GGTT for GuC

I don't know why it is only debug level? It seems serious failure, as a 
result i915_init_ggtt fails and probe fails.

The cause is that initial framebuffer is located at the end of GGTT and 
it overlaps with reserved area (see ggtt_reserve_guc_top).

I am not sure how it can be properly fixed, I guess dirty fix could be
just relocation of vma (hopefully into free area), sth like:
new_gte = gsm + (ggtt->vm.total - GUC_TOP_RESERVE_SIZE - size) / 
I915_GTT_PAGE_SIZE;
memmove(new_gte, gte, size / I915_GTT_PAGE_SIZE);

but I have no idea of possible side effects :)

[1]: 
https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-mtlp-8/pstore0-2849851684_Panic_1.txt

Regards
Andrzej


> 
> Regards
> Andrzej
> 
>>
>>
>>     Known issues
>>
>> Here are the changes found in Patchwork_127130v2 that come from known 
>> issues:
>>
>>
>>       IGT changes
>>
>>
>>         Issues hit
>>
>>   * igt@kms_pm_backlight@basic-brightness@edp-1:
>>       o bat-rplp-1: NOTRUN -> ABORT
>>         
>> <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-rplp-1/igt@kms_pm_backlight@basic-brightness@edp-1.html> (i915#8668 <https://gitlab.freedesktop.org/drm/intel/issues/8668>)
>>
>>
>>         Possible fixes
>>
>>   *
>>
>>     igt@gem_exec_suspend@basic-s0@lmem0:
>>
>>       o bat-dg2-9: INCOMPLETE
>>         
>> <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-dg2-9/igt@gem_exec_suspend@basic-s0@lmem0.html> (i915#9275 <https://gitlab.freedesktop.org/drm/intel/issues/9275>) -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-dg2-9/igt@gem_exec_suspend@basic-s0@lmem0.html>
>>   *
>>
>>     igt@kms_flip@basic-flip-vs-dpms@d-dp6:
>>
>>       o bat-adlp-11: DMESG-FAIL
>>         
>> <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-adlp-11/igt@kms_flip@basic-flip-vs-dpms@d-dp6.html> (i915#6868 <https://gitlab.freedesktop.org/drm/intel/issues/6868>) -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-adlp-11/igt@kms_flip@basic-flip-vs-dpms@d-dp6.html>
>>   *
>>
>>     igt@kms_flip@basic-flip-vs-modeset@d-dp6:
>>
>>       o bat-adlp-11: DMESG-WARN
>>         
>> <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-adlp-11/igt@kms_flip@basic-flip-vs-modeset@d-dp6.html> -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-adlp-11/igt@kms_flip@basic-flip-vs-modeset@d-dp6.html>
>>   *
>>
>>     igt@kms_pipe_crc_basic@read-crc-frame-sequence@pipe-d-edp-1:
>>
>>       o bat-rplp-1: ABORT
>>         
>> <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-rplp-1/igt@kms_pipe_crc_basic@read-crc-frame-sequence@pipe-d-edp-1.html> (i915#8668 <https://gitlab.freedesktop.org/drm/intel/issues/8668>) -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-rplp-1/igt@kms_pipe_crc_basic@read-crc-frame-sequence@pipe-d-edp-1.html>
>>
>>
>>     Build changes
>>
>>   * Linux: CI_DRM_13990 -> Patchwork_127130v2
>>
>> CI-20190529: 20190529
>> CI_DRM_13990: 85d33d0ad82a5c1a71492f14a5ceb67ada6a22d8 @ 
>> git://anongit.freedesktop.org/gfx-ci/linux
>> IGT_7626: 154b7288552cd7ed3033f8ef396e88d0bd1b7646 @ 
>> https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
>> Patchwork_127130v2: 85d33d0ad82a5c1a71492f14a5ceb67ada6a22d8 @ 
>> git://anongit.freedesktop.org/gfx-ci/linux
>>
>>
>>       Linux commits
>>
>> 43f210e851cd drm/i915/display: Check GGTT to determine phys_base
>>
>
Imre Deak Dec. 7, 2023, 12:05 p.m. UTC | #3
On Thu, Dec 07, 2023 at 12:26:25PM +0100, Andrzej Hajda wrote:
> 
> 
> On 07.12.2023 11:10, Andrzej Hajda wrote:
> > On 07.12.2023 01:18, Patchwork wrote:
> > > *Patch Details*
> > > *Series:*    drm/i915/display: Check GGTT to determine phys_base (rev2)
> > > *URL:*    https://patchwork.freedesktop.org/series/127130/
> > > <https://patchwork.freedesktop.org/series/127130/>
> > > *State:*    failure
> > > *Details:*
> > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/index.html
> > > <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/index.html>
> > > 
> > > 
> > >   CI Bug Log - changes from CI_DRM_13990 -> Patchwork_127130v2
> > > 
> > > 
> > >     Summary
> > > 
> > > *FAILURE*
> > > 
> > > Serious unknown changes coming with Patchwork_127130v2 absolutely
> > > need to be
> > > verified manually.
> > > 
> > > If you think the reported changes have nothing to do with the changes
> > > introduced in Patchwork_127130v2, please notify your bug team
> > > (I915-ci-infra@lists.freedesktop.org) to allow them
> > > to document this new failure mode, which will reduce false positives
> > > in CI.
> > > 
> > > External URL:
> > > https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/index.html
> > > 
> > > 
> > >     Participating hosts (37 -> 34)
> > > 
> > > Missing (3): fi-pnv-d510 fi-snb-2520m bat-dg1-5
> > > 
> > > 
> > >     Possible new issues
> > > 
> > > Here are the unknown changes that may have been introduced in
> > > Patchwork_127130v2:
> > > 
> > > 
> > >       IGT changes
> > > 
> > > 
> > >         Possible regressions
> > > 
> > >   * igt@i915_module_load@load:
> > >       o bat-mtlp-8: PASS
> > > <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-mtlp-8/igt@i915_module_load@load.html> -> INCOMPLETE <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-mtlp-8/igt@i915_module_load@load.html>
> > 
> > 
> > It seems related. I think the patch is correct but it just unveils other
> > display take-over issues.
> > Ie with this patch initial_plane_vma returns valid buffer, but
> > subsequent display code fails miserably with kernel panic.
> > 
> > So until this is not solved, we shouldn't merge the patch, IMO.
> > 
> > CC: i915 maintainers and display developers
> 
> 
> After taking a look on panic log [1], I have found:
> [drm:i915_init_ggtt [i915]] Failed to reserve top of GGTT for GuC
> 
> I don't know why it is only debug level? It seems serious failure, as a
> result i915_init_ggtt fails and probe fails.
> 
> The cause is that initial framebuffer is located at the end of GGTT and it
> overlaps with reserved area (see ggtt_reserve_guc_top).
> 
> I am not sure how it can be properly fixed, I guess dirty fix could be
> just relocation of vma (hopefully into free area), sth like:
> new_gte = gsm + (ggtt->vm.total - GUC_TOP_RESERVE_SIZE - size) /
> I915_GTT_PAGE_SIZE;
> memmove(new_gte, gte, size / I915_GTT_PAGE_SIZE);
> 
> but I have no idea of possible side effects :)
> 
> [1]: https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-mtlp-8/pstore0-2849851684_Panic_1.txt

fwiw, the following hack should fix the error path:
@@ -822,6 +823,8 @@ int i915_driver_probe(struct pci_dev *pdev, const struct pci_device_id *ent)
        i915_gem_driver_release(i915);
 out_cleanup_modeset2:
        /* FIXME clean up the error path */
+       if (HAS_DISPLAY(i915))
+               drm_atomic_helper_shutdown(&i915->drm);
        intel_display_driver_remove(i915);
        intel_irq_uninstall(i915);
        intel_display_driver_remove_noirq(i915);

> Regards
> Andrzej
> 
> 
> > 
> > Regards
> > Andrzej
> > 
> > > 
> > > 
> > >     Known issues
> > > 
> > > Here are the changes found in Patchwork_127130v2 that come from
> > > known issues:
> > > 
> > > 
> > >       IGT changes
> > > 
> > > 
> > >         Issues hit
> > > 
> > >   * igt@kms_pm_backlight@basic-brightness@edp-1:
> > >       o bat-rplp-1: NOTRUN -> ABORT
> > > <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-rplp-1/igt@kms_pm_backlight@basic-brightness@edp-1.html> (i915#8668 <https://gitlab.freedesktop.org/drm/intel/issues/8668>)
> > > 
> > > 
> > >         Possible fixes
> > > 
> > >   *
> > > 
> > >     igt@gem_exec_suspend@basic-s0@lmem0:
> > > 
> > >       o bat-dg2-9: INCOMPLETE
> > > <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-dg2-9/igt@gem_exec_suspend@basic-s0@lmem0.html> (i915#9275 <https://gitlab.freedesktop.org/drm/intel/issues/9275>) -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-dg2-9/igt@gem_exec_suspend@basic-s0@lmem0.html>
> > >   *
> > > 
> > >     igt@kms_flip@basic-flip-vs-dpms@d-dp6:
> > > 
> > >       o bat-adlp-11: DMESG-FAIL
> > > <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-adlp-11/igt@kms_flip@basic-flip-vs-dpms@d-dp6.html> (i915#6868 <https://gitlab.freedesktop.org/drm/intel/issues/6868>) -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-adlp-11/igt@kms_flip@basic-flip-vs-dpms@d-dp6.html>
> > >   *
> > > 
> > >     igt@kms_flip@basic-flip-vs-modeset@d-dp6:
> > > 
> > >       o bat-adlp-11: DMESG-WARN
> > > <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-adlp-11/igt@kms_flip@basic-flip-vs-modeset@d-dp6.html> -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-adlp-11/igt@kms_flip@basic-flip-vs-modeset@d-dp6.html>
> > >   *
> > > 
> > >     igt@kms_pipe_crc_basic@read-crc-frame-sequence@pipe-d-edp-1:
> > > 
> > >       o bat-rplp-1: ABORT
> > > <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-rplp-1/igt@kms_pipe_crc_basic@read-crc-frame-sequence@pipe-d-edp-1.html> (i915#8668 <https://gitlab.freedesktop.org/drm/intel/issues/8668>) -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-rplp-1/igt@kms_pipe_crc_basic@read-crc-frame-sequence@pipe-d-edp-1.html>
> > > 
> > > 
> > >     Build changes
> > > 
> > >   * Linux: CI_DRM_13990 -> Patchwork_127130v2
> > > 
> > > CI-20190529: 20190529
> > > CI_DRM_13990: 85d33d0ad82a5c1a71492f14a5ceb67ada6a22d8 @
> > > git://anongit.freedesktop.org/gfx-ci/linux
> > > IGT_7626: 154b7288552cd7ed3033f8ef396e88d0bd1b7646 @
> > > https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
> > > Patchwork_127130v2: 85d33d0ad82a5c1a71492f14a5ceb67ada6a22d8 @
> > > git://anongit.freedesktop.org/gfx-ci/linux
> > > 
> > > 
> > >       Linux commits
> > > 
> > > 43f210e851cd drm/i915/display: Check GGTT to determine phys_base
> > > 
> >
Almahallawy, Khaled Dec. 8, 2023, 1 a.m. UTC | #4
Thank You for the patch. We noticed a break in the customer board with
the latest GOP + this patch.


Thank You
Khaled	

On Wed, 2023-12-06 at 18:46 +0000, Paz Zcharya wrote:
> There was an assumption that for iGPU there should be a 1:1 mapping
> of GGTT to physical address pointing to the framebuffer.
> This assumption is not strictly true effective generation 8 or newer.
> Fix that by checking GGTT to determine the phys address on gen8+.
> 
> The following algorithm for phys_base should be valid for all
> platforms:
> 1. Find pte
> 2. if(IS_DGFX(i915) && pte & GEN12_GGTT_PTE_LM) mem =
> i915->mm.regions[INTEL_REGION_LMEM_0] else mem = i915-
> >mm.stolen_region
> 3. phys_base = (pte & I915_GTT_PAGE_MASK) - mem->region.start;
> 
> - On older platforms, stolen_region points to system memory, starting
> at 0
> - on DG2, it uses lmem region which starts at 0 as well
> - on MTL, stolen_region points to stolen-local which starts at
> 0x800000
> 
> Changes from v1:
>   - Add an if statement for gen7-, where there is a 1:1 mapping
> 
> Signed-off-by: Paz Zcharya <pazz@chromium.org>
> ---
> 
>  .../drm/i915/display/intel_plane_initial.c    | 64 +++++++++++----
> ----
>  1 file changed, 39 insertions(+), 25 deletions(-)
> 
> diff --git a/drivers/gpu/drm/i915/display/intel_plane_initial.c
> b/drivers/gpu/drm/i915/display/intel_plane_initial.c
> index a55c09cbd0e4..7d9bb631b93b 100644
> --- a/drivers/gpu/drm/i915/display/intel_plane_initial.c
> +++ b/drivers/gpu/drm/i915/display/intel_plane_initial.c
> @@ -59,44 +59,58 @@ initial_plane_vma(struct drm_i915_private *i915,
>  		return NULL;
>  
>  	base = round_down(plane_config->base, I915_GTT_MIN_ALIGNMENT);
> -	if (IS_DGFX(i915)) {
> +
> +	if (GRAPHICS_VER(i915) < 8) {
> +		/*
> +		 * In gen7-, there is a 1:1 mapping
> +		 * between GSM and physical address.
> +		 */
> +		phys_base = base;
> +		mem = i915->mm.stolen_region;
> +	} else {
> +		/*
> +		 * In gen8+, there is no 1:1 mapping between
> +		 * GSM and physical address, so we need to
> +		 * check GGTT to determine the physical address.
> +		 */
>  		gen8_pte_t __iomem *gte = to_gt(i915)->ggtt->gsm;
>  		gen8_pte_t pte;
>  
>  		gte += base / I915_GTT_PAGE_SIZE;
> -
>  		pte = ioread64(gte);
> -		if (!(pte & GEN12_GGTT_PTE_LM)) {
> -			drm_err(&i915->drm,
> -				"Initial plane programming missing
> PTE_LM bit\n");
> -			return NULL;
> -		}
> -
> -		phys_base = pte & I915_GTT_PAGE_MASK;
> -		mem = i915->mm.regions[INTEL_REGION_LMEM_0];
>  
> -		/*
> -		 * We don't currently expect this to ever be placed in
> the
> -		 * stolen portion.
> -		 */
> -		if (phys_base >= resource_size(&mem->region)) {
> -			drm_err(&i915->drm,
> -				"Initial plane programming using
> invalid range, phys_base=%pa\n",
> -				&phys_base);
> -			return NULL;
> +		if (IS_DGFX(i915)) {
> +			if (!(pte & GEN12_GGTT_PTE_LM)) {
> +				drm_err(&i915->drm,
> +					"Initial plane programming
> missing PTE_LM bit\n");
> +				return NULL;
> +			}
> +			mem = i915->mm.regions[INTEL_REGION_LMEM_0];
> +		} else {
> +			mem = i915->mm.stolen_region;
>  		}
>  
> -		drm_dbg(&i915->drm,
> -			"Using phys_base=%pa, based on initial plane
> programming\n",
> -			&phys_base);
> -	} else {
> -		phys_base = base;
> -		mem = i915->mm.stolen_region;
> +		phys_base = (pte & I915_GTT_PAGE_MASK) - mem-
> >region.start;
>  	}
>  
>  	if (!mem)
>  		return NULL;
>  
> +	/*
> +	 * We don't currently expect this to ever be placed in the
> +	 * stolen portion.
> +	 */
> +	if (phys_base >= resource_size(&mem->region)) {
> +		drm_err(&i915->drm,
> +			"Initial plane programming using invalid range,
> phys_base=%pa\n",
> +			&phys_base);
> +		return NULL;
> +	}
> +
> +	drm_dbg(&i915->drm,
> +		"Using phys_base=%pa, based on initial plane
> programming\n",
> +		&phys_base);
> +
>  	size = round_up(plane_config->base + plane_config->size,
>  			mem->min_page_size);
>  	size -= base;
Andrzej Hajda Dec. 8, 2023, 8:32 a.m. UTC | #5
On 07.12.2023 12:26, Andrzej Hajda wrote:
> 
> 
> On 07.12.2023 11:10, Andrzej Hajda wrote:
>> On 07.12.2023 01:18, Patchwork wrote:
>>> *Patch Details*
>>> *Series:*    drm/i915/display: Check GGTT to determine phys_base (rev2)
>>> *URL:*    https://patchwork.freedesktop.org/series/127130/ 
>>> <https://patchwork.freedesktop.org/series/127130/>
>>> *State:*    failure
>>> *Details:* 
>>> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/index.html <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/index.html>
>>>
>>>
>>>   CI Bug Log - changes from CI_DRM_13990 -> Patchwork_127130v2
>>>
>>>
>>>     Summary
>>>
>>> *FAILURE*
>>>
>>> Serious unknown changes coming with Patchwork_127130v2 absolutely 
>>> need to be
>>> verified manually.
>>>
>>> If you think the reported changes have nothing to do with the changes
>>> introduced in Patchwork_127130v2, please notify your bug team 
>>> (I915-ci-infra@lists.freedesktop.org) to allow them
>>> to document this new failure mode, which will reduce false positives 
>>> in CI.
>>>
>>> External URL: 
>>> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/index.html
>>>
>>>
>>>     Participating hosts (37 -> 34)
>>>
>>> Missing (3): fi-pnv-d510 fi-snb-2520m bat-dg1-5
>>>
>>>
>>>     Possible new issues
>>>
>>> Here are the unknown changes that may have been introduced in 
>>> Patchwork_127130v2:
>>>
>>>
>>>       IGT changes
>>>
>>>
>>>         Possible regressions
>>>
>>>   * igt@i915_module_load@load:
>>>       o bat-mtlp-8: PASS
>>> <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-mtlp-8/igt@i915_module_load@load.html> -> INCOMPLETE <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-mtlp-8/igt@i915_module_load@load.html>
>>
>>
>> It seems related. I think the patch is correct but it just unveils 
>> other display take-over issues.
>> Ie with this patch initial_plane_vma returns valid buffer, but 
>> subsequent display code fails miserably with kernel panic.
>>
>> So until this is not solved, we shouldn't merge the patch, IMO.
>>
>> CC: i915 maintainers and display developers
> 
> 
> After taking a look on panic log [1], I have found:
> [drm:i915_init_ggtt [i915]] Failed to reserve top of GGTT for GuC
> 
> I don't know why it is only debug level? It seems serious failure, as a 
> result i915_init_ggtt fails and probe fails.
> 
> The cause is that initial framebuffer is located at the end of GGTT and 
> it overlaps with reserved area (see ggtt_reserve_guc_top).
> 
> I am not sure how it can be properly fixed, I guess dirty fix could be
> just relocation of vma (hopefully into free area), sth like:
> new_gte = gsm + (ggtt->vm.total - GUC_TOP_RESERVE_SIZE - size) / 
> I915_GTT_PAGE_SIZE;
> memmove(new_gte, gte, size / I915_GTT_PAGE_SIZE);
> 
> but I have no idea of possible side effects :)

I looked once more into the code and maybe you can just pin the buffer 
to earlier address (not overlapping with GuC reservation and current vma 
of the fb):
@@ -143,6 +143,9 @@ initial_plane_vma(struct drm_i915_private *i915,
         if (IS_ERR(vma))
                 goto err_obj;

+       if (base + size > GUC_GGTT_TOP)
+               base = min(base, GUC_GGTT_TOP) - size;
+
         pinctl = PIN_GLOBAL | PIN_OFFSET_FIXED | base;
         if (HAS_GMCH(i915))
                 pinctl |= PIN_MAPPABLE;


Regards
Andrzej

> 
> [1]: 
> https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-mtlp-8/pstore0-2849851684_Panic_1.txt
> 
> Regards
> Andrzej
> 
> 
>>
>> Regards
>> Andrzej
>>
>>>
>>>
>>>     Known issues
>>>
>>> Here are the changes found in Patchwork_127130v2 that come from known 
>>> issues:
>>>
>>>
>>>       IGT changes
>>>
>>>
>>>         Issues hit
>>>
>>>   * igt@kms_pm_backlight@basic-brightness@edp-1:
>>>       o bat-rplp-1: NOTRUN -> ABORT
>>> <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-rplp-1/igt@kms_pm_backlight@basic-brightness@edp-1.html> (i915#8668 <https://gitlab.freedesktop.org/drm/intel/issues/8668>)
>>>
>>>
>>>         Possible fixes
>>>
>>>   *
>>>
>>>     igt@gem_exec_suspend@basic-s0@lmem0:
>>>
>>>       o bat-dg2-9: INCOMPLETE
>>> <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-dg2-9/igt@gem_exec_suspend@basic-s0@lmem0.html> (i915#9275 <https://gitlab.freedesktop.org/drm/intel/issues/9275>) -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-dg2-9/igt@gem_exec_suspend@basic-s0@lmem0.html>
>>>   *
>>>
>>>     igt@kms_flip@basic-flip-vs-dpms@d-dp6:
>>>
>>>       o bat-adlp-11: DMESG-FAIL
>>> <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-adlp-11/igt@kms_flip@basic-flip-vs-dpms@d-dp6.html> (i915#6868 <https://gitlab.freedesktop.org/drm/intel/issues/6868>) -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-adlp-11/igt@kms_flip@basic-flip-vs-dpms@d-dp6.html>
>>>   *
>>>
>>>     igt@kms_flip@basic-flip-vs-modeset@d-dp6:
>>>
>>>       o bat-adlp-11: DMESG-WARN
>>> <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-adlp-11/igt@kms_flip@basic-flip-vs-modeset@d-dp6.html> -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-adlp-11/igt@kms_flip@basic-flip-vs-modeset@d-dp6.html>
>>>   *
>>>
>>>     igt@kms_pipe_crc_basic@read-crc-frame-sequence@pipe-d-edp-1:
>>>
>>>       o bat-rplp-1: ABORT
>>> <https://intel-gfx-ci.01.org/tree/drm-tip/CI_DRM_13990/bat-rplp-1/igt@kms_pipe_crc_basic@read-crc-frame-sequence@pipe-d-edp-1.html> (i915#8668 <https://gitlab.freedesktop.org/drm/intel/issues/8668>) -> PASS <https://intel-gfx-ci.01.org/tree/drm-tip/Patchwork_127130v2/bat-rplp-1/igt@kms_pipe_crc_basic@read-crc-frame-sequence@pipe-d-edp-1.html>
>>>
>>>
>>>     Build changes
>>>
>>>   * Linux: CI_DRM_13990 -> Patchwork_127130v2
>>>
>>> CI-20190529: 20190529
>>> CI_DRM_13990: 85d33d0ad82a5c1a71492f14a5ceb67ada6a22d8 @ 
>>> git://anongit.freedesktop.org/gfx-ci/linux
>>> IGT_7626: 154b7288552cd7ed3033f8ef396e88d0bd1b7646 @ 
>>> https://gitlab.freedesktop.org/drm/igt-gpu-tools.git
>>> Patchwork_127130v2: 85d33d0ad82a5c1a71492f14a5ceb67ada6a22d8 @ 
>>> git://anongit.freedesktop.org/gfx-ci/linux
>>>
>>>
>>>       Linux commits
>>>
>>> 43f210e851cd drm/i915/display: Check GGTT to determine phys_base
>>>
>>
Ville Syrjälä Dec. 13, 2023, 4:03 p.m. UTC | #6
On Fri, Dec 08, 2023 at 09:32:00AM +0100, Andrzej Hajda wrote:
> On 07.12.2023 12:26, Andrzej Hajda wrote:
> > On 07.12.2023 11:10, Andrzej Hajda wrote:
> > 
> > After taking a look on panic log [1], I have found:
> > [drm:i915_init_ggtt [i915]] Failed to reserve top of GGTT for GuC
> > 
> > I don't know why it is only debug level? It seems serious failure, as a 
> > result i915_init_ggtt fails and probe fails.
> > 
> > The cause is that initial framebuffer is located at the end of GGTT and 
> > it overlaps with reserved area (see ggtt_reserve_guc_top).
> > 
> > I am not sure how it can be properly fixed, I guess dirty fix could be
> > just relocation of vma (hopefully into free area), sth like:
> > new_gte = gsm + (ggtt->vm.total - GUC_TOP_RESERVE_SIZE - size) / 
> > I915_GTT_PAGE_SIZE;
> > memmove(new_gte, gte, size / I915_GTT_PAGE_SIZE);
> > 
> > but I have no idea of possible side effects :)
> 
> I looked once more into the code and maybe you can just pin the buffer 
> to earlier address (not overlapping with GuC reservation and current vma 
> of the fb):
> @@ -143,6 +143,9 @@ initial_plane_vma(struct drm_i915_private *i915,
>          if (IS_ERR(vma))
>                  goto err_obj;
> 
> +       if (base + size > GUC_GGTT_TOP)
> +               base = min(base, GUC_GGTT_TOP) - size;
> +
>          pinctl = PIN_GLOBAL | PIN_OFFSET_FIXED | base;
>          if (HAS_GMCH(i915))
>                  pinctl |= PIN_MAPPABLE;

This is not a solution. The correct fix is either:
1. fix the guc code to not insist on a fixed chunk of the ggtt
   and instead allocate it normally like a good boy. It could
   still try to allocate as high as possible to ideally
   land in the GUC_GGTT_TOP range
2. relocate the display vma to a different part of the ggtt

1. should be far simpler by the looks of it, as 2. would involve:
a) pinning the same object into two places in the ggtt simultanously
   I don't think we have support for that except for special ggtt views,
   and xe doesn't have even that (should we need this trick there as well)
b) flip the plane(s) over to the relocated vma
c) wait for vblank(s)
d) discard the original vma
e) all of that would need to happen pretty early so we may not have
   a lot of the required normal machinery fully up and running yet

Anyways, I ran into much of the same issues when debugging an MTL
machine. This is the result of my efforts (partially overlaps with
the proposed patch here):
https://patchwork.freedesktop.org/series/127721/
diff mbox series

Patch

diff --git a/drivers/gpu/drm/i915/display/intel_plane_initial.c b/drivers/gpu/drm/i915/display/intel_plane_initial.c
index a55c09cbd0e4..7d9bb631b93b 100644
--- a/drivers/gpu/drm/i915/display/intel_plane_initial.c
+++ b/drivers/gpu/drm/i915/display/intel_plane_initial.c
@@ -59,44 +59,58 @@  initial_plane_vma(struct drm_i915_private *i915,
 		return NULL;
 
 	base = round_down(plane_config->base, I915_GTT_MIN_ALIGNMENT);
-	if (IS_DGFX(i915)) {
+
+	if (GRAPHICS_VER(i915) < 8) {
+		/*
+		 * In gen7-, there is a 1:1 mapping
+		 * between GSM and physical address.
+		 */
+		phys_base = base;
+		mem = i915->mm.stolen_region;
+	} else {
+		/*
+		 * In gen8+, there is no 1:1 mapping between
+		 * GSM and physical address, so we need to
+		 * check GGTT to determine the physical address.
+		 */
 		gen8_pte_t __iomem *gte = to_gt(i915)->ggtt->gsm;
 		gen8_pte_t pte;
 
 		gte += base / I915_GTT_PAGE_SIZE;
-
 		pte = ioread64(gte);
-		if (!(pte & GEN12_GGTT_PTE_LM)) {
-			drm_err(&i915->drm,
-				"Initial plane programming missing PTE_LM bit\n");
-			return NULL;
-		}
-
-		phys_base = pte & I915_GTT_PAGE_MASK;
-		mem = i915->mm.regions[INTEL_REGION_LMEM_0];
 
-		/*
-		 * We don't currently expect this to ever be placed in the
-		 * stolen portion.
-		 */
-		if (phys_base >= resource_size(&mem->region)) {
-			drm_err(&i915->drm,
-				"Initial plane programming using invalid range, phys_base=%pa\n",
-				&phys_base);
-			return NULL;
+		if (IS_DGFX(i915)) {
+			if (!(pte & GEN12_GGTT_PTE_LM)) {
+				drm_err(&i915->drm,
+					"Initial plane programming missing PTE_LM bit\n");
+				return NULL;
+			}
+			mem = i915->mm.regions[INTEL_REGION_LMEM_0];
+		} else {
+			mem = i915->mm.stolen_region;
 		}
 
-		drm_dbg(&i915->drm,
-			"Using phys_base=%pa, based on initial plane programming\n",
-			&phys_base);
-	} else {
-		phys_base = base;
-		mem = i915->mm.stolen_region;
+		phys_base = (pte & I915_GTT_PAGE_MASK) - mem->region.start;
 	}
 
 	if (!mem)
 		return NULL;
 
+	/*
+	 * We don't currently expect this to ever be placed in the
+	 * stolen portion.
+	 */
+	if (phys_base >= resource_size(&mem->region)) {
+		drm_err(&i915->drm,
+			"Initial plane programming using invalid range, phys_base=%pa\n",
+			&phys_base);
+		return NULL;
+	}
+
+	drm_dbg(&i915->drm,
+		"Using phys_base=%pa, based on initial plane programming\n",
+		&phys_base);
+
 	size = round_up(plane_config->base + plane_config->size,
 			mem->min_page_size);
 	size -= base;