diff mbox series

nouveau/fence: handle cross device fences properly. (v3)

Message ID 20250109005553.623947-1-airlied@gmail.com (mailing list archive)
State New
Headers show
Series nouveau/fence: handle cross device fences properly. (v3) | expand

Commit Message

Dave Airlie Jan. 9, 2025, 12:55 a.m. UTC
From: Dave Airlie <airlied@redhat.com>

This is the 3rd iteration of this after talking to Ben and
Danilo, I think this makes sense now.

The fence sync logic doesn't handle a fence sync across devices
as it tries to write to a channel offset from one device into
the fence bo from a different device, which won't work so well.

This patch fixes that to avoid using the sync path in the case
where the fences come from different nouveau drm devices.

This works fine on a single device as the fence bo is shared
across the devices, and mapped into each channels vma space,
the channel offsets are therefore okay to pass between sides,
so one channel can sync on the seqnos from the other by using
the offset into it's vma.

Signed-off-by: Dave Airlie <airlied@redhat.com>
Cc: stable@vger.kernel.org
---
 drivers/gpu/drm/nouveau/nouveau_fence.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Ben Skeggs Jan. 9, 2025, 3:42 a.m. UTC | #1
On 9/1/25 10:55, Dave Airlie wrote:

> From: Dave Airlie <airlied@redhat.com>
>
> This is the 3rd iteration of this after talking to Ben and
> Danilo, I think this makes sense now.
>
> The fence sync logic doesn't handle a fence sync across devices
> as it tries to write to a channel offset from one device into
> the fence bo from a different device, which won't work so well.
>
> This patch fixes that to avoid using the sync path in the case
> where the fences come from different nouveau drm devices.
>
> This works fine on a single device as the fence bo is shared
> across the devices, and mapped into each channels vma space,
> the channel offsets are therefore okay to pass between sides,
> so one channel can sync on the seqnos from the other by using
> the offset into it's vma.
>
> Signed-off-by: Dave Airlie <airlied@redhat.com>
Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>
> Cc: stable@vger.kernel.org
> ---
>   drivers/gpu/drm/nouveau/nouveau_fence.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
> index ee5e9d40c166..a3eb1f447a29 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> @@ -367,11 +367,13 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
>   			if (f) {
>   				struct nouveau_channel *prev;
>   				bool must_wait = true;
> +				bool local;
>   
>   				rcu_read_lock();
>   				prev = rcu_dereference(f->channel);
> -				if (prev && (prev == chan ||
> -					     fctx->sync(f, prev, chan) == 0))
> +				local = prev && prev->drm == chan->drm;
> +				if (local && (prev == chan ||
> +					      fctx->sync(f, prev, chan) == 0))
>   					must_wait = false;
>   				rcu_read_unlock();
>   				if (!must_wait)
Ben Skeggs Jan. 9, 2025, 4:13 a.m. UTC | #2
On 9/1/25 13:42, Ben Skeggs wrote:

> On 9/1/25 10:55, Dave Airlie wrote:
>
>> From: Dave Airlie <airlied@redhat.com>
>>
>> This is the 3rd iteration of this after talking to Ben and
>> Danilo, I think this makes sense now.
>>
>> The fence sync logic doesn't handle a fence sync across devices
>> as it tries to write to a channel offset from one device into
>> the fence bo from a different device, which won't work so well.
>>
>> This patch fixes that to avoid using the sync path in the case
>> where the fences come from different nouveau drm devices.
>>
>> This works fine on a single device as the fence bo is shared
>> across the devices, and mapped into each channels vma space,
>> the channel offsets are therefore okay to pass between sides,
>> so one channel can sync on the seqnos from the other by using
>> the offset into it's vma.
>>
>> Signed-off-by: Dave Airlie <airlied@redhat.com>
> Signed-off-by: Ben Skeggs <bskeggs@nvidia.com>

Err, not sure what my fingers did there ;)

Reviewed-by: Ben Skeggs <bskeggs@nvidia.com>

>
>> Cc: stable@vger.kernel.org
>> ---
>>   drivers/gpu/drm/nouveau/nouveau_fence.c | 6 ++++--
>>   1 file changed, 4 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c 
>> b/drivers/gpu/drm/nouveau/nouveau_fence.c
>> index ee5e9d40c166..a3eb1f447a29 100644
>> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
>> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
>> @@ -367,11 +367,13 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, 
>> struct nouveau_channel *chan,
>>               if (f) {
>>                   struct nouveau_channel *prev;
>>                   bool must_wait = true;
>> +                bool local;
>>                     rcu_read_lock();
>>                   prev = rcu_dereference(f->channel);
>> -                if (prev && (prev == chan ||
>> -                         fctx->sync(f, prev, chan) == 0))
>> +                local = prev && prev->drm == chan->drm;
>> +                if (local && (prev == chan ||
>> +                          fctx->sync(f, prev, chan) == 0))
>>                       must_wait = false;
>>                   rcu_read_unlock();
>>                   if (!must_wait)
Danilo Krummrich Jan. 9, 2025, 4:58 p.m. UTC | #3
On Thu, Jan 09, 2025 at 10:55:53AM +1000, Dave Airlie wrote:
> From: Dave Airlie <airlied@redhat.com>
> 
> This is the 3rd iteration of this after talking to Ben and
> Danilo, I think this makes sense now.
> 
> The fence sync logic doesn't handle a fence sync across devices
> as it tries to write to a channel offset from one device into
> the fence bo from a different device, which won't work so well.
> 
> This patch fixes that to avoid using the sync path in the case
> where the fences come from different nouveau drm devices.
> 
> This works fine on a single device as the fence bo is shared
> across the devices, and mapped into each channels vma space,
> the channel offsets are therefore okay to pass between sides,
> so one channel can sync on the seqnos from the other by using
> the offset into it's vma.

Huh, they indeed all share and map drm->fence->bo, good catch.

> 
> Signed-off-by: Dave Airlie <airlied@redhat.com>
> Cc: stable@vger.kernel.org
> ---
>  drivers/gpu/drm/nouveau/nouveau_fence.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
> index ee5e9d40c166..a3eb1f447a29 100644
> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> @@ -367,11 +367,13 @@ nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
>  			if (f) {
>  				struct nouveau_channel *prev;
>  				bool must_wait = true;
> +				bool local;
>  
>  				rcu_read_lock();
>  				prev = rcu_dereference(f->channel);
> -				if (prev && (prev == chan ||
> -					     fctx->sync(f, prev, chan) == 0))
> +				local = prev && prev->drm == chan->drm;

struct nouveau_channel has no pointer to a struct nouveau_drm, this should be
prev->cli->drm and chan->cli->drm instead.

No need to resend, I can fix it when applying the patch if you want.

> +				if (local && (prev == chan ||
> +					      fctx->sync(f, prev, chan) == 0))
>  					must_wait = false;
>  				rcu_read_unlock();
>  				if (!must_wait)
> -- 
> 2.43.0
>
diff mbox series

Patch

diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
index ee5e9d40c166..a3eb1f447a29 100644
--- a/drivers/gpu/drm/nouveau/nouveau_fence.c
+++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
@@ -367,11 +367,13 @@  nouveau_fence_sync(struct nouveau_bo *nvbo, struct nouveau_channel *chan,
 			if (f) {
 				struct nouveau_channel *prev;
 				bool must_wait = true;
+				bool local;
 
 				rcu_read_lock();
 				prev = rcu_dereference(f->channel);
-				if (prev && (prev == chan ||
-					     fctx->sync(f, prev, chan) == 0))
+				local = prev && prev->drm == chan->drm;
+				if (local && (prev == chan ||
+					      fctx->sync(f, prev, chan) == 0))
 					must_wait = false;
 				rcu_read_unlock();
 				if (!must_wait)