diff mbox

[2/2] drm/radeon: Prevent races on pre DCE4 between flip submission and completion.

Message ID 1474115139-25201-3-git-send-email-mario.kleiner.de@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Mario Kleiner Sept. 17, 2016, 12:25 p.m. UTC
Pre DCE4 hw doesn't have reliable pageflip completion
interrupts, so instead polling for flip completion is
used from within the vblank irq handler to complete
page flips.

This causes a race if pageflip ioctl is called close to
vblank:

1. pageflip ioctl queues execution of radeon_flip_work_func.

2. vblank irq fires, radeon_crtc_handle_vblank checks for
   flip_status == FLIP_SUBMITTED finds none, no-ops.

3. radeon_flip_work_func runs inside vblank, decides to
   set flip_status == FLIP_SUBMITTED and programs the
   flip into hw.

4. hw executes flip immediately (because in vblank), but
   as 2 already happened, the flip completion routine only
   emits the flip completion event one refresh later ->
   wrong vblank count/timestamp for completion and no
   performance gain, as instead of delaying the flip until
   next vblank, we now delay the next flip by 1 refresh
   while waiting for the delayed flip completion event.

Given we often don't gain anything due to this race, but
lose precision, prevent the programmed flip from executing
in vblank on pre DCE4 asics to avoid this race.

On pre-AVIVO hw we can't program the hw for edge-triggered
flips, they always execute anywhere in vblank. Therefore delay
the actual flip programming until after vblank on pre-AVIVO.

Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
---
 drivers/gpu/drm/radeon/atombios_crtc.c  |  4 ++--
 drivers/gpu/drm/radeon/radeon_display.c | 17 ++++++++++-------
 drivers/gpu/drm/radeon/rv515.c          |  3 ++-
 3 files changed, 14 insertions(+), 10 deletions(-)

Comments

Michel Dänzer Sept. 20, 2016, 2:56 a.m. UTC | #1
On 17/09/16 09:25 PM, Mario Kleiner wrote:
> Pre DCE4 hw doesn't have reliable pageflip completion
> interrupts, so instead polling for flip completion is
> used from within the vblank irq handler to complete
> page flips.
> 
> This causes a race if pageflip ioctl is called close to
> vblank:
> 
> 1. pageflip ioctl queues execution of radeon_flip_work_func.
> 
> 2. vblank irq fires, radeon_crtc_handle_vblank checks for
>    flip_status == FLIP_SUBMITTED finds none, no-ops.
> 
> 3. radeon_flip_work_func runs inside vblank, decides to
>    set flip_status == FLIP_SUBMITTED and programs the
>    flip into hw.
> 
> 4. hw executes flip immediately (because in vblank), but
>    as 2 already happened, the flip completion routine only
>    emits the flip completion event one refresh later ->
>    wrong vblank count/timestamp for completion and no
>    performance gain, as instead of delaying the flip until
>    next vblank, we now delay the next flip by 1 refresh
>    while waiting for the delayed flip completion event.
> 
> Given we often don't gain anything due to this race, but
> lose precision, prevent the programmed flip from executing
> in vblank on pre DCE4 asics to avoid this race.
> 
> On pre-AVIVO hw we can't program the hw for edge-triggered
> flips, they always execute anywhere in vblank. Therefore delay
> the actual flip programming until after vblank on pre-AVIVO.
> 
> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>

Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>


P.S. Please send radeon (and amdgpu) driver patches to the amd-gfx
mailing list (as well).
Alex Deucher Sept. 29, 2016, 1:36 p.m. UTC | #2
On Mon, Sep 19, 2016 at 10:56 PM, Michel Dänzer <michel@daenzer.net> wrote:
> On 17/09/16 09:25 PM, Mario Kleiner wrote:
>> Pre DCE4 hw doesn't have reliable pageflip completion
>> interrupts, so instead polling for flip completion is
>> used from within the vblank irq handler to complete
>> page flips.
>>
>> This causes a race if pageflip ioctl is called close to
>> vblank:
>>
>> 1. pageflip ioctl queues execution of radeon_flip_work_func.
>>
>> 2. vblank irq fires, radeon_crtc_handle_vblank checks for
>>    flip_status == FLIP_SUBMITTED finds none, no-ops.
>>
>> 3. radeon_flip_work_func runs inside vblank, decides to
>>    set flip_status == FLIP_SUBMITTED and programs the
>>    flip into hw.
>>
>> 4. hw executes flip immediately (because in vblank), but
>>    as 2 already happened, the flip completion routine only
>>    emits the flip completion event one refresh later ->
>>    wrong vblank count/timestamp for completion and no
>>    performance gain, as instead of delaying the flip until
>>    next vblank, we now delay the next flip by 1 refresh
>>    while waiting for the delayed flip completion event.
>>
>> Given we often don't gain anything due to this race, but
>> lose precision, prevent the programmed flip from executing
>> in vblank on pre DCE4 asics to avoid this race.
>>
>> On pre-AVIVO hw we can't program the hw for edge-triggered
>> flips, they always execute anywhere in vblank. Therefore delay
>> the actual flip programming until after vblank on pre-AVIVO.
>>
>> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com>
>
> Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>

Applied this series.  Thanks!

Alex

>
>
> P.S. Please send radeon (and amdgpu) driver patches to the amd-gfx
> mailing list (as well).
>
> --
> Earthling Michel Dänzer               |               http://www.amd.com
> Libre software enthusiast             |             Mesa and X developer
> _______________________________________________
> dri-devel mailing list
> dri-devel@lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/dri-devel
diff mbox

Patch

diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c b/drivers/gpu/drm/radeon/atombios_crtc.c
index a4e9f35..74f99ba 100644
--- a/drivers/gpu/drm/radeon/atombios_crtc.c
+++ b/drivers/gpu/drm/radeon/atombios_crtc.c
@@ -1638,8 +1638,8 @@  static int avivo_crtc_do_set_base(struct drm_crtc *crtc,
 	WREG32(AVIVO_D1MODE_VIEWPORT_SIZE + radeon_crtc->crtc_offset,
 	       (viewport_w << 16) | viewport_h);
 
-	/* set pageflip to happen anywhere in vblank interval */
-	WREG32(AVIVO_D1MODE_MASTER_UPDATE_MODE + radeon_crtc->crtc_offset, 0);
+	/* set pageflip to happen only at start of vblank interval (front porch) */
+	WREG32(AVIVO_D1MODE_MASTER_UPDATE_MODE + radeon_crtc->crtc_offset, 3);
 
 	if (!atomic && fb && fb != crtc->primary->fb) {
 		radeon_fb = to_radeon_framebuffer(fb);
diff --git a/drivers/gpu/drm/radeon/radeon_display.c b/drivers/gpu/drm/radeon/radeon_display.c
index 5070646..b8ab30a 100644
--- a/drivers/gpu/drm/radeon/radeon_display.c
+++ b/drivers/gpu/drm/radeon/radeon_display.c
@@ -452,16 +452,19 @@  static void radeon_flip_work_func(struct work_struct *__work)
 	}
 
 	/* Wait until we're out of the vertical blank period before the one
-	 * targeted by the flip
+	 * targeted by the flip. Always wait on pre DCE4 to avoid races with
+	 * flip completion handling from vblank irq, as these old asics don't
+	 * have reliable pageflip completion interrupts.
 	 */
 	while (radeon_crtc->enabled &&
-	       (radeon_get_crtc_scanoutpos(dev, work->crtc_id, 0,
-					   &vpos, &hpos, NULL, NULL,
-					   &crtc->hwmode)
+		(radeon_get_crtc_scanoutpos(dev, work->crtc_id, 0,
+					    &vpos, &hpos, NULL, NULL,
+					    &crtc->hwmode)
 		& (DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_IN_VBLANK)) ==
-	       (DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_IN_VBLANK) &&
-	       (int)(work->target_vblank -
-		     dev->driver->get_vblank_counter(dev, work->crtc_id)) > 0)
+		(DRM_SCANOUTPOS_VALID | DRM_SCANOUTPOS_IN_VBLANK) &&
+		(!ASIC_IS_AVIVO(rdev) ||
+		((int) (work->target_vblank -
+		dev->driver->get_vblank_counter(dev, work->crtc_id)) > 0)))
 		usleep_range(1000, 2000);
 
 	/* We borrow the event spin lock for protecting flip_status */
diff --git a/drivers/gpu/drm/radeon/rv515.c b/drivers/gpu/drm/radeon/rv515.c
index 76c55c5..c55d653 100644
--- a/drivers/gpu/drm/radeon/rv515.c
+++ b/drivers/gpu/drm/radeon/rv515.c
@@ -406,8 +406,9 @@  void rv515_mc_resume(struct radeon_device *rdev, struct rv515_mc_save *save)
 	for (i = 0; i < rdev->num_crtc; i++) {
 		if (save->crtc_enabled[i]) {
 			tmp = RREG32(AVIVO_D1MODE_MASTER_UPDATE_MODE + crtc_offsets[i]);
-			if ((tmp & 0x7) != 0) {
+			if ((tmp & 0x7) != 3) {
 				tmp &= ~0x7;
+				tmp |= 0x3;
 				WREG32(AVIVO_D1MODE_MASTER_UPDATE_MODE + crtc_offsets[i], tmp);
 			}
 			tmp = RREG32(AVIVO_D1GRPH_UPDATE + crtc_offsets[i]);