diff mbox

[1/3] dma-fence: Reserve 0 as a special NO_CONTEXT token

Message ID 20170408162603.28305-1-chris@chris-wilson.co.uk (mailing list archive)
State New, archived
Headers show

Commit Message

Chris Wilson April 8, 2017, 4:26 p.m. UTC
Reserve 0 for general use a token meaning that the fence doesn't belong
to an ordered timeline (fence context).

Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
Cc: Sumit Semwal <sumit.semwal@linaro.org>
Cc: Gustavo Padovan <gustavo@padovan.org>
Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
Cc: "Christian König" <christian.koenig@amd.com
Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
Cc: Alex Deucher <alexander.deucher@amd.com>
---
 drivers/dma-buf/dma-fence.c | 4 +++-
 include/linux/dma-fence.h   | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

Comments

Christian König April 8, 2017, 5:49 p.m. UTC | #1
Am 08.04.2017 um 18:26 schrieb Chris Wilson:
> Reserve 0 for general use a token meaning that the fence doesn't belong
> to an ordered timeline (fence context).

NAK, we kept context allocation cheap to avoid exactly that.

Please elaborate further why it should be necessary now.

Regards,
Christian.

> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk>
> Cc: Sumit Semwal <sumit.semwal@linaro.org>
> Cc: Gustavo Padovan <gustavo@padovan.org>
> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com>
> Cc: "Christian König" <christian.koenig@amd.com
> Cc: Daniel Vetter <daniel.vetter@ffwll.ch>
> Cc: Alex Deucher <alexander.deucher@amd.com>
> ---
>   drivers/dma-buf/dma-fence.c | 4 +++-
>   include/linux/dma-fence.h   | 2 ++
>   2 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 0918d3f003d6..0646357ea350 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -36,8 +36,10 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
>    * fence context, this allows checking if fences belong to the same
>    * context or not. One device can have multiple separate contexts,
>    * and they're used if some engine can run independently of another.
> + *
> + * 0 is excluded and treated as a special DMA_FENCE_NO_CONTEXT.
>    */
> -static atomic64_t dma_fence_context_counter = ATOMIC64_INIT(0);
> +static atomic64_t dma_fence_context_counter = ATOMIC64_INIT(1);
>   
>   /**
>    * dma_fence_context_alloc - allocate an array of fence contexts
> diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
> index 6048fa404e57..adfdc7fdd9c3 100644
> --- a/include/linux/dma-fence.h
> +++ b/include/linux/dma-fence.h
> @@ -455,6 +455,8 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
>   	return ret < 0 ? ret : 0;
>   }
>   
> +#define DMA_FENCE_NO_CONTEXT ((u64)0)
> +
>   u64 dma_fence_context_alloc(unsigned num);
>   
>   #define DMA_FENCE_TRACE(f, fmt, args...) \
Chris Wilson April 8, 2017, 6 p.m. UTC | #2
On Sat, Apr 08, 2017 at 07:49:37PM +0200, Christian König wrote:
> Am 08.04.2017 um 18:26 schrieb Chris Wilson:
> >Reserve 0 for general use a token meaning that the fence doesn't belong
> >to an ordered timeline (fence context).
> 
> NAK, we kept context allocation cheap to avoid exactly that.

However, they result in very sparse mappings.

> Please elaborate further why it should be necessary now.

Because I want to efficiently exclude them from comparisons as
demonstrated by this small series as there may be several hundred such
fences as dependencies for this job.
-Chris
Christian König April 9, 2017, 7:08 a.m. UTC | #3
Am 08.04.2017 um 20:00 schrieb Chris Wilson:
> On Sat, Apr 08, 2017 at 07:49:37PM +0200, Christian König wrote:
>> Am 08.04.2017 um 18:26 schrieb Chris Wilson:
>>> Reserve 0 for general use a token meaning that the fence doesn't belong
>>> to an ordered timeline (fence context).
>> NAK, we kept context allocation cheap to avoid exactly that.
> However, they result in very sparse mappings.

Which is perfectly fine at least for how we use this in Radeon and Amdgpu.

The fence context is used as key for a hashtable, even when the context 
is only used once we want an evenly distribution over all of them.

>
>> Please elaborate further why it should be necessary now.
> Because I want to efficiently exclude them from comparisons as
> demonstrated by this small series as there may be several hundred such
> fences as dependencies for this job.

That would horrible break Amdgpu, Radeon and the GPU scheduler because 
all of them assume that context numbers are unique.

So that is a clear NAK from my side,
Christian.

> -Chris
>
Chris Wilson April 9, 2017, 4:02 p.m. UTC | #4
On Sun, Apr 09, 2017 at 09:08:53AM +0200, Christian König wrote:
> Am 08.04.2017 um 20:00 schrieb Chris Wilson:
> >On Sat, Apr 08, 2017 at 07:49:37PM +0200, Christian König wrote:
> >>Am 08.04.2017 um 18:26 schrieb Chris Wilson:
> >>>Reserve 0 for general use a token meaning that the fence doesn't belong
> >>>to an ordered timeline (fence context).
> >>NAK, we kept context allocation cheap to avoid exactly that.
> >However, they result in very sparse mappings.
> 
> Which is perfectly fine at least for how we use this in Radeon and Amdgpu.
> 
> The fence context is used as key for a hashtable, even when the
> context is only used once we want an evenly distribution over all of
> them.

The ht is a more expensive solution.

> >>Please elaborate further why it should be necessary now.
> >Because I want to efficiently exclude them from comparisons as
> >demonstrated by this small series as there may be several hundred such
> >fences as dependencies for this job.
> 
> That would horrible break Amdgpu, Radeon and the GPU scheduler
> because all of them assume that context numbers are unique.

Why would it break if it respected the trivial notion of an unordered
timeline?
-Chris
Christian König April 9, 2017, 6 p.m. UTC | #5
Am 09.04.2017 um 18:02 schrieb Chris Wilson:
> On Sun, Apr 09, 2017 at 09:08:53AM +0200, Christian König wrote:
>> Am 08.04.2017 um 20:00 schrieb Chris Wilson:
>>> On Sat, Apr 08, 2017 at 07:49:37PM +0200, Christian König wrote:
>>>> Am 08.04.2017 um 18:26 schrieb Chris Wilson:
>>>>> Reserve 0 for general use a token meaning that the fence doesn't belong
>>>>> to an ordered timeline (fence context).
>>>> NAK, we kept context allocation cheap to avoid exactly that.
>>> However, they result in very sparse mappings.
>> Which is perfectly fine at least for how we use this in Radeon and Amdgpu.
>>
>> The fence context is used as key for a hashtable, even when the
>> context is only used once we want an evenly distribution over all of
>> them.
> The ht is a more expensive solution.

Not when the fences without contexts are rare.

>>>> Please elaborate further why it should be necessary now.
>>> Because I want to efficiently exclude them from comparisons as
>>> demonstrated by this small series as there may be several hundred such
>>> fences as dependencies for this job.
>> That would horrible break Amdgpu, Radeon and the GPU scheduler
>> because all of them assume that context numbers are unique.
> Why would it break if it respected the trivial notion of an unordered
> timeline?

You seem to miss the point. We already had that discussion and decided 
against using no-context fences.

Now we have a whole bunch of cases over different drivers/components 
where we assume uniqueness of fence contexts.

If you change this without fixing all those cases they will just subtle 
break, probably without anybody noticing the problem immediately.

So if you don't want to scan all drivers for such cases (and I know at 
least three of hand) and fix them before that patch then that is a clear 
no-go for this patch.

I suggest to just use two separate fence contexts for you clflush 
problem in i915. We have a similar situation in the GPU scheduler solve 
it the same way.

Regards,
Christian.

> -Chris
>
diff mbox

Patch

diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 0918d3f003d6..0646357ea350 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -36,8 +36,10 @@  EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal);
  * fence context, this allows checking if fences belong to the same
  * context or not. One device can have multiple separate contexts,
  * and they're used if some engine can run independently of another.
+ *
+ * 0 is excluded and treated as a special DMA_FENCE_NO_CONTEXT.
  */
-static atomic64_t dma_fence_context_counter = ATOMIC64_INIT(0);
+static atomic64_t dma_fence_context_counter = ATOMIC64_INIT(1);
 
 /**
  * dma_fence_context_alloc - allocate an array of fence contexts
diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h
index 6048fa404e57..adfdc7fdd9c3 100644
--- a/include/linux/dma-fence.h
+++ b/include/linux/dma-fence.h
@@ -455,6 +455,8 @@  static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr)
 	return ret < 0 ? ret : 0;
 }
 
+#define DMA_FENCE_NO_CONTEXT ((u64)0)
+
 u64 dma_fence_context_alloc(unsigned num);
 
 #define DMA_FENCE_TRACE(f, fmt, args...) \