drm/nv84+: fix fence context seqno's

Message ID 541FE720.50504@canonical.com (mailing list archive)
State Accepted

Commit Message

Maarten Lankhorst Sept. 22, 2014, 9:08 a.m. UTC
This fixes a regression introduced by "drm/nouveau: rework to new fence interface"
(commit 29ba89b2371d466).

The fence sequence should not be reset after creation; the old value is used instead.
On destruction the final value is written. This prevents another source of accidental
wraparound when a channel is destroyed after a hang, and unblocks any other channel
that may be waiting on the about-to-be-deleted channel to signal.

I'm nothing if not optimistic about any hope of recovery from that. ;-)

Reported-by: Ted Percival <ted@tedp.id.au>
Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
---
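Editorial sketch (not part of the patch): the teardown behaviour described in the
commit message can be modelled in user space. Everything below is a hypothetical
stand-in, not driver code: fence_buf, struct fence_ctx, slot_read(), slot_write(),
fence_signaled() and ctx_del() are names invented for illustration, and the signed
32-bit comparison is an assumed wraparound-safe check rather than a quote from nouveau.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MAX_CHANNELS 4

/* Hypothetical stand-in for the shared fence buffer: one 16-byte slot per channel. */
static uint32_t fence_buf[MAX_CHANNELS * 16 / 4];

/* Hypothetical per-channel context; sequence is the last seqno handed out. */
struct fence_ctx {
	uint32_t chid;
	uint32_t sequence;
};

static uint32_t slot_read(uint32_t chid)
{
	return fence_buf[chid * 16 / 4];
}

static void slot_write(uint32_t chid, uint32_t val)
{
	fence_buf[chid * 16 / 4] = val;
}

/* Assumed wraparound-safe check: has the channel's slot reached seqno yet? */
static bool fence_signaled(uint32_t chid, uint32_t seqno)
{
	return (int32_t)(slot_read(chid) - seqno) >= 0;
}

/* Teardown as the patch describes it: publish the final seqno so waiters unblock. */
static void ctx_del(const struct fence_ctx *ctx)
{
	slot_write(ctx->chid, ctx->sequence);
}
```

In this model, a channel that hung with its slot at 3 while seqno 5 was outstanding
leaves waiters on seqno 5 blocked. After ctx_del() the slot holds 5 and the same
check passes, which is the unblocking effect the commit message refers to.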

Comments

Ted Percival Sept. 22, 2014, 4:23 p.m. UTC | #1
On 09/22/2014 03:08 AM, Maarten Lankhorst wrote:
> This fixes a regression introduced by "drm/nouveau: rework to new fence interface"
> (commit 29ba89b2371d466).
> 
> The fence sequence should not be reset after creation, the old value is used instead.
> On destruction the final value is written, to prevent another source of accidental
> wraparound in case of a channel being destroyed after a hang, and unblocking any other
> channel that may wait on the about-to-be-deleted channel to signal.
> 
> I'm nothing if not optimistic about any hope of recovery from that. ;-)
> 
> Reported-by: Ted Percival <ted@tedp.id.au>
> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> ---
> diff --git a/drivers/gpu/drm/nouveau/nv84_fence.c b/drivers/gpu/drm/nouveau/nv84_fence.c
> index 7b372a68aa4e..4138db4d8291 100644
> --- a/drivers/gpu/drm/nouveau/nv84_fence.c
> +++ b/drivers/gpu/drm/nouveau/nv84_fence.c
> @@ -120,6 +120,7 @@ nv84_fence_context_del(struct nouveau_channel *chan)
>  		nouveau_bo_vma_del(bo, &fctx->dispc_vma[i]);
>  	}
>  
> +	nouveau_bo_wr32(priv->bo, chan->chid * 16 / 4, fctx->base.sequence);
>  	nouveau_bo_vma_del(priv->bo, &fctx->vma_gart);
>  	nouveau_bo_vma_del(priv->bo, &fctx->vma);
>  	nouveau_fence_context_del(&fctx->base);
> @@ -159,8 +160,6 @@ nv84_fence_context_new(struct nouveau_channel *chan)
>  		ret = nouveau_bo_vma_add(bo, cli->vm, &fctx->dispc_vma[i]);
>  	}
>  
> -	nouveau_bo_wr32(priv->bo, chan->chid * 16/4, 0x00000000);
> -
>  	if (ret)
>  		nv84_fence_context_del(chan);
>  	return ret;
> 

This works, thanks :-)

Tested-by: Ted Percival <ted@tedp.id.au>
Ben Skeggs Sept. 23, 2014, 5:35 a.m. UTC | #2
On Tue, Sep 23, 2014 at 2:23 AM, Ted Percival <ted@tedp.id.au> wrote:
> On 09/22/2014 03:08 AM, Maarten Lankhorst wrote:
>> This fixes a regression introduced by "drm/nouveau: rework to new fence interface"
>> (commit 29ba89b2371d466).
>>
>> The fence sequence should not be reset after creation, the old value is used instead.
>> On destruction the final value is written, to prevent another source of accidental
>> wraparound in case of a channel being destroyed after a hang, and unblocking any other
>> channel that may wait on the about-to-be-deleted channel to signal.
>>
>> I'm nothing if not optimistic about any hope of recovery from that. ;-)
>>
>> Reported-by: Ted Percival <ted@tedp.id.au>
>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
Acked-by: Ben Skeggs <bskeggs@redhat.com>

I'm still seeing issues with suspend, even with this patch, and the
one you pastebinned recently.

>> ---
>> diff --git a/drivers/gpu/drm/nouveau/nv84_fence.c b/drivers/gpu/drm/nouveau/nv84_fence.c
>> index 7b372a68aa4e..4138db4d8291 100644
>> --- a/drivers/gpu/drm/nouveau/nv84_fence.c
>> +++ b/drivers/gpu/drm/nouveau/nv84_fence.c
>> @@ -120,6 +120,7 @@ nv84_fence_context_del(struct nouveau_channel *chan)
>>               nouveau_bo_vma_del(bo, &fctx->dispc_vma[i]);
>>       }
>>
>> +     nouveau_bo_wr32(priv->bo, chan->chid * 16 / 4, fctx->base.sequence);
>>       nouveau_bo_vma_del(priv->bo, &fctx->vma_gart);
>>       nouveau_bo_vma_del(priv->bo, &fctx->vma);
>>       nouveau_fence_context_del(&fctx->base);
>> @@ -159,8 +160,6 @@ nv84_fence_context_new(struct nouveau_channel *chan)
>>               ret = nouveau_bo_vma_add(bo, cli->vm, &fctx->dispc_vma[i]);
>>       }
>>
>> -     nouveau_bo_wr32(priv->bo, chan->chid * 16/4, 0x00000000);
>> -
>>       if (ret)
>>               nv84_fence_context_del(chan);
>>       return ret;
>>
>
> This works, thanks :-)
>
> Tested-by: Ted Percival <ted@tedp.id.au>
Maarten Lankhorst Sept. 23, 2014, 2:24 p.m. UTC | #3
On 23-09-14 at 07:35, Ben Skeggs wrote:
> On Tue, Sep 23, 2014 at 2:23 AM, Ted Percival <ted@tedp.id.au> wrote:
>> On 09/22/2014 03:08 AM, Maarten Lankhorst wrote:
>>> This fixes a regression introduced by "drm/nouveau: rework to new fence interface"
>>> (commit 29ba89b2371d466).
>>>
>>> The fence sequence should not be reset after creation, the old value is used instead.
>>> On destruction the final value is written, to prevent another source of accidental
>>> wraparound in case of a channel being destroyed after a hang, and unblocking any other
>>> channel that may wait on the about-to-be-deleted channel to signal.
>>>
>>> I'm nothing if not optimistic about any hope of recovery from that. ;-)
>>>
>>> Reported-by: Ted Percival <ted@tedp.id.au>
>>> Signed-off-by: Maarten Lankhorst <maarten.lankhorst@canonical.com>
> Acked-by: Ben Skeggs <bskeggs@redhat.com>
>
> I'm still seeing issues with suspend, even with this patch, and the
> one you pastebinned recently.
>
Annoying, and I'm out of ideas. The pastebinned patch is posted to dri-devel as:
[PATCH 2/8] drm/nouveau: specify if interruptible wait is desired in nouveau_fence_sync.

Could you bisect to where the suspend issues started? With this patch applied after
"drm/nouveau: rework to new fence interface", and the other patch applied after
"drm/nouveau: use shared fences for readable objects"

~Maarten
Ted Percival Sept. 25, 2014, 10:28 p.m. UTC | #4
On 09/23/2014 08:24 AM, Maarten Lankhorst wrote:
> On 23-09-14 at 07:35, Ben Skeggs wrote:
>>> On 09/22/2014 03:08 AM, Maarten Lankhorst wrote:
>>>> This fixes a regression introduced by "drm/nouveau: rework to new fence interface"
>>>> (commit 29ba89b2371d466).
>>
>> I'm still seeing issues with suspend, even with this patch, and the
>> one you pastebinned recently.
>>
> Annoying, and I'm out of ideas. The pastebinned patch is posted to dri-devel as:
> [PATCH 2/8] drm/nouveau: specify if interruptible wait is desired in nouveau_fence_sync.
> 
> Could you bisect to where the suspend issues started? With this patch applied after
> "drm/nouveau: rework to new fence interface", and the other patch applied after
> "drm/nouveau: use shared fences for readable objects"

I started bisecting to track down the suspend issue, but X won't start
at all with today's dri-next tree @ d743ecf36063 ("drm/doc: Fixup
drm_irq kerneldoc includes") plus the two patches, so I'll have to get
that out of the way first.

Patch

diff --git a/drivers/gpu/drm/nouveau/nv84_fence.c b/drivers/gpu/drm/nouveau/nv84_fence.c
index 7b372a68aa4e..4138db4d8291 100644
--- a/drivers/gpu/drm/nouveau/nv84_fence.c
+++ b/drivers/gpu/drm/nouveau/nv84_fence.c
@@ -120,6 +120,7 @@  nv84_fence_context_del(struct nouveau_channel *chan)
 		nouveau_bo_vma_del(bo, &fctx->dispc_vma[i]);
 	}
 
+	nouveau_bo_wr32(priv->bo, chan->chid * 16 / 4, fctx->base.sequence);
 	nouveau_bo_vma_del(priv->bo, &fctx->vma_gart);
 	nouveau_bo_vma_del(priv->bo, &fctx->vma);
 	nouveau_fence_context_del(&fctx->base);
@@ -159,8 +160,6 @@  nv84_fence_context_new(struct nouveau_channel *chan)
 		ret = nouveau_bo_vma_add(bo, cli->vm, &fctx->dispc_vma[i]);
 	}
 
-	nouveau_bo_wr32(priv->bo, chan->chid * 16/4, 0x00000000);
-
 	if (ret)
 		nv84_fence_context_del(chan);
 	return ret;
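
Editorial note (not part of the patch): the index expression chan->chid * 16 / 4 in
both hunks reads as "16 bytes per channel, converted to a 32-bit word index". The
sketch below spells that arithmetic out, assuming nouveau_bo_wr32() takes its index
in units of 32-bit words. FENCE_SLOT_BYTES and both helper functions are hypothetical
names used only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed slot size, taken from the "* 16" in the patch. */
#define FENCE_SLOT_BYTES 16u

/* Byte offset of a channel's slot inside the shared fence buffer. */
static size_t fence_slot_byte_offset(uint32_t chid)
{
	return (size_t)chid * FENCE_SLOT_BYTES;
}

/* The same position expressed as an index into an array of 32-bit words. */
static size_t fence_slot_word_index(uint32_t chid)
{
	return fence_slot_byte_offset(chid) / sizeof(uint32_t);
}
```

For example, channel 3 starts at byte offset 48, which is word index 12.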