Message ID | 20231229163821.144599-1-mwen@igalia.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] drm/amd/display: fix bandwidth validation failure on DCN 2.1 | expand |
On 12/29/23 11:25, Melissa Wen wrote: > IGT `amdgpu/amd_color/crtc-lut-accuracy` fails right at the beginning of > the test execution, during atomic check, because DC rejects the > bandwidth state for a fb sizing 64x64. The test was previously working > with the deprecated dc_commit_state(). Now using > dc_validate_with_context() approach, the atomic check needs to perform a > full state validation. Therefore, set fast_validation to false in the > dc_validate_global_state call for atomic check. > > Fixes: b8272241ff9d ("drm/amd/display: Drop dc_commit_state in favor of dc_commit_streams") > Signed-off-by: Melissa Wen <mwen@igalia.com> > --- > > Hi, > > It's a long story. I was inspecting this bug report: > - https://gitlab.freedesktop.org/drm/amd/-/issues/2016 > and noticed the IGT test `igt@amdgpu/amd_color@crtc-lut-accuracy` > mentioned there wasn't even being executed on a laptop with DCN 2.1 > (HP HP ENVY x360 Convertible 13-ay1xxx/8929). The test fails right at > the beginning due to an atomic check rejection, as below: > > Starting subtest: crtc-lut-accuracy > (amd_color:14772) igt_kms-CRITICAL: Test assertion failure function igt_display_commit_atomic, file ../lib/igt_kms.c:4530: > (amd_color:14772) igt_kms-CRITICAL: Failed assertion: ret == 0 > (amd_color:14772) igt_kms-CRITICAL: Last errno: 22, Invalid argument > (amd_color:14772) igt_kms-CRITICAL: error: -22 != 0 > Stack trace: > #0 ../lib/igt_core.c:1989 __igt_fail_assert() > #1 [igt_display_commit_atomic+0x44] > #2 ../tests/amdgpu/amd_color.c:159 __igt_unique____real_main395() > #3 ../tests/amdgpu/amd_color.c:395 main() > #4 ../sysdeps/nptl/libc_start_call_main.h:74 __libc_start_call_main() > #5 ../csu/libc-start.c:128 __libc_start_main@@GLIBC_2.34() > #6 [_start+0x21] > Subtest crtc-lut-accuracy failed. > > Checking dmesg, we can see that a bandwidth validation failure causes > the atomic check rejection: > > [ 711.147663] amdgpu 0000:04:00.0: [drm] Mode Validation Warning: Unknown Status failed validation. > [ 711.147667] [drm:amdgpu_dm_atomic_check [amdgpu]] DC global validation failure: Bandwidth validation failure (BW and Watermark) (13) > [ 711.147772] [drm:amdgpu_irq_dispatch [amdgpu]] Unregistered interrupt src_id: 243 of client_id:8 > [ 711.148033] [drm:amdgpu_dm_atomic_check [amdgpu]] Atomic check failed with err: -22 > > I also noticed that the atomic check doesn't fail if I change the fb > width and height used in the test from 64 to 66, and I can get the test > execution back (and with success). However, I recall that all test cases > of IGT `amd_color` were working in the past, so I bisected and found the > first bad commit: > > b8272241ff9d drm/amd/display: Drop dc_commit_state in favor of dc_commit_streams > > Bringing the `dc_commit_state` machinery back also prevents the > bandwidth validation failure, but the commit above says > dc_commit_streams validation is more complete than dc_commit_state, so I > discarded this approach. > > After some debugging and code inspection, I found out that avoiding fast > validation on dc_validate_global_state during atomic check solves the > issue, but I'm not sure if this change may affect performance. I > compared exec time of some IGT tests and didn't see any differences, but > I recognize it doesn't provide enough evidence. > > What do you think about this change? Should I examine other things? Do > you see any potential issue that I should investigate? Could you > recommend a better approach to assess any side-effect of not enabling > fast validation in the atomic check? > > Please, let me know your thoughts. We shouldn't be doing fast updates when lock_and_validation_needed is true, so this seems to be correct. Which is to say, applied, thanks! Cc: stable@vger.kernel.org > > Happy New Year! > > Melissa > > drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > index 2845c884398e..4f51a7ad7a3c 100644 > --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c > @@ -10745,7 +10745,7 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev, > DRM_DEBUG_DRIVER("drm_dp_mst_atomic_check() failed\n"); > goto fail; > } > - status = dc_validate_global_state(dc, dm_state->context, true); > + status = dc_validate_global_state(dc, dm_state->context, false); > if (status != DC_OK) { > DRM_DEBUG_DRIVER("DC global validation failure: %s (%d)", > dc_status_to_str(status), status);
On Wed, Jan 03, 2024 at 09:44:18AM -0500, Hamza Mahfooz wrote: > On 12/29/23 11:25, Melissa Wen wrote: > > IGT `amdgpu/amd_color/crtc-lut-accuracy` fails right at the beginning of > > the test execution, during atomic check, because DC rejects the > > bandwidth state for a fb sizing 64x64. The test was previously working > > with the deprecated dc_commit_state(). Now using > > dc_validate_with_context() approach, the atomic check needs to perform a > > full state validation. Therefore, set fast_validation to false in the > > dc_validate_global_state call for atomic check. > > > > Fixes: b8272241ff9d ("drm/amd/display: Drop dc_commit_state in favor of dc_commit_streams") > > Signed-off-by: Melissa Wen <mwen@igalia.com> > > --- > > > > Hi, > > > > It's a long story. I was inspecting this bug report: > > - https://gitlab.freedesktop.org/drm/amd/-/issues/2016 > > and noticed the IGT test `igt@amdgpu/amd_color@crtc-lut-accuracy` > > mentioned there wasn't even being executed on a laptop with DCN 2.1 > > (HP HP ENVY x360 Convertible 13-ay1xxx/8929). The test fails right at > > the beginning due to an atomic check rejection, as below: > > > > Starting subtest: crtc-lut-accuracy > > (amd_color:14772) igt_kms-CRITICAL: Test assertion failure function igt_display_commit_atomic, file ../lib/igt_kms.c:4530: > > (amd_color:14772) igt_kms-CRITICAL: Failed assertion: ret == 0 > > (amd_color:14772) igt_kms-CRITICAL: Last errno: 22, Invalid argument > > (amd_color:14772) igt_kms-CRITICAL: error: -22 != 0 > > Stack trace: > > #0 ../lib/igt_core.c:1989 __igt_fail_assert() > > #1 [igt_display_commit_atomic+0x44] > > #2 ../tests/amdgpu/amd_color.c:159 __igt_unique____real_main395() > > #3 ../tests/amdgpu/amd_color.c:395 main() > > #4 ../sysdeps/nptl/libc_start_call_main.h:74 __libc_start_call_main() > > #5 ../csu/libc-start.c:128 __libc_start_main@@GLIBC_2.34() > > #6 [_start+0x21] > > Subtest crtc-lut-accuracy failed. > > > > Checking dmesg, we can see that a bandwidth validation failure causes > > the atomic check rejection: > > > > [ 711.147663] amdgpu 0000:04:00.0: [drm] Mode Validation Warning: Unknown Status failed validation. > > [ 711.147667] [drm:amdgpu_dm_atomic_check [amdgpu]] DC global validation failure: Bandwidth validation failure (BW and Watermark) (13) > > [ 711.147772] [drm:amdgpu_irq_dispatch [amdgpu]] Unregistered interrupt src_id: 243 of client_id:8 > > [ 711.148033] [drm:amdgpu_dm_atomic_check [amdgpu]] Atomic check failed with err: -22 > > > > I also noticed that the atomic check doesn't fail if I change the fb > > width and height used in the test from 64 to 66, and I can get the test > > execution back (and with success). However, I recall that all test cases > > of IGT `amd_color` were working in the past, so I bisected and found the > > first bad commit: > > > > b8272241ff9d drm/amd/display: Drop dc_commit_state in favor of dc_commit_streams > > > > Bringing the `dc_commit_state` machinery back also prevents the > > bandwidth validation failure, but the commit above says > > dc_commit_streams validation is more complete than dc_commit_state, so I > > discarded this approach. > > > > After some debugging and code inspection, I found out that avoiding fast > > validation on dc_validate_global_state during atomic check solves the > > issue, but I'm not sure if this change may affect performance. I > > compared exec time of some IGT tests and didn't see any differences, but > > I recognize it doesn't provide enough evidence. > > > > What do you think about this change? Should I examine other things? Do > > you see any potential issue that I should investigate? Could you > > recommend a better approach to assess any side-effect of not enabling > > fast validation in the atomic check? > > > > Please, let me know your thoughts. > > We shouldn't be doing fast updates when lock_and_validation_needed is > true, so this seems to be correct. > > Which is to say, applied, thanks! > > Cc: stable@vger.kernel.org <formletter> This is not the correct way to submit patches for inclusion in the stable kernel tree. Please read: https://www.kernel.org/doc/html/latest/process/stable-kernel-rules.html for how to do this properly. </formletter>
diff --git a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c index 2845c884398e..4f51a7ad7a3c 100644 --- a/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c +++ b/drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c @@ -10745,7 +10745,7 @@ static int amdgpu_dm_atomic_check(struct drm_device *dev, DRM_DEBUG_DRIVER("drm_dp_mst_atomic_check() failed\n"); goto fail; } - status = dc_validate_global_state(dc, dm_state->context, true); + status = dc_validate_global_state(dc, dm_state->context, false); if (status != DC_OK) { DRM_DEBUG_DRIVER("DC global validation failure: %s (%d)", dc_status_to_str(status), status);
IGT `amdgpu/amd_color/crtc-lut-accuracy` fails right at the beginning of the test execution, during atomic check, because DC rejects the bandwidth state for a fb sizing 64x64. The test was previously working with the deprecated dc_commit_state(). Now using dc_validate_with_context() approach, the atomic check needs to perform a full state validation. Therefore, set fast_validation to false in the dc_validate_global_state call for atomic check. Fixes: b8272241ff9d ("drm/amd/display: Drop dc_commit_state in favor of dc_commit_streams") Signed-off-by: Melissa Wen <mwen@igalia.com> --- Hi, It's a long story. I was inspecting this bug report: - https://gitlab.freedesktop.org/drm/amd/-/issues/2016 and noticed the IGT test `igt@amdgpu/amd_color@crtc-lut-accuracy` mentioned there wasn't even being executed on a laptop with DCN 2.1 (HP HP ENVY x360 Convertible 13-ay1xxx/8929). The test fails right at the beginning due to an atomic check rejection, as below: Starting subtest: crtc-lut-accuracy (amd_color:14772) igt_kms-CRITICAL: Test assertion failure function igt_display_commit_atomic, file ../lib/igt_kms.c:4530: (amd_color:14772) igt_kms-CRITICAL: Failed assertion: ret == 0 (amd_color:14772) igt_kms-CRITICAL: Last errno: 22, Invalid argument (amd_color:14772) igt_kms-CRITICAL: error: -22 != 0 Stack trace: #0 ../lib/igt_core.c:1989 __igt_fail_assert() #1 [igt_display_commit_atomic+0x44] #2 ../tests/amdgpu/amd_color.c:159 __igt_unique____real_main395() #3 ../tests/amdgpu/amd_color.c:395 main() #4 ../sysdeps/nptl/libc_start_call_main.h:74 __libc_start_call_main() #5 ../csu/libc-start.c:128 __libc_start_main@@GLIBC_2.34() #6 [_start+0x21] Subtest crtc-lut-accuracy failed. Checking dmesg, we can see that a bandwidth validation failure causes the atomic check rejection: [ 711.147663] amdgpu 0000:04:00.0: [drm] Mode Validation Warning: Unknown Status failed validation. [ 711.147667] [drm:amdgpu_dm_atomic_check [amdgpu]] DC global validation failure: Bandwidth validation failure (BW and Watermark) (13) [ 711.147772] [drm:amdgpu_irq_dispatch [amdgpu]] Unregistered interrupt src_id: 243 of client_id:8 [ 711.148033] [drm:amdgpu_dm_atomic_check [amdgpu]] Atomic check failed with err: -22 I also noticed that the atomic check doesn't fail if I change the fb width and height used in the test from 64 to 66, and I can get the test execution back (and with success). However, I recall that all test cases of IGT `amd_color` were working in the past, so I bisected and found the first bad commit: b8272241ff9d drm/amd/display: Drop dc_commit_state in favor of dc_commit_streams Bringing the `dc_commit_state` machinery back also prevents the bandwidth validation failure, but the commit above says dc_commit_streams validation is more complete than dc_commit_state, so I discarded this approach. After some debugging and code inspection, I found out that avoiding fast validation on dc_validate_global_state during atomic check solves the issue, but I'm not sure if this change may affect performance. I compared exec time of some IGT tests and didn't see any differences, but I recognize it doesn't provide enough evidence. What do you think about this change? Should I examine other things? Do you see any potential issue that I should investigate? Could you recommend a better approach to assess any side-effect of not enabling fast validation in the atomic check? Please, let me know your thoughts. Happy New Year! Melissa drivers/gpu/drm/amd/display/amdgpu_dm/amdgpu_dm.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)