Message ID | 20170408162603.28305-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Am 08.04.2017 um 18:26 schrieb Chris Wilson: > Reserve 0 for general use a token meaning that the fence doesn't belong > to an ordered timeline (fence context). NAK, we kept context allocation cheap to avoid exactly that. Please elaborate further why it should be necessary now. Regards, Christian. > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > Cc: Sumit Semwal <sumit.semwal@linaro.org> > Cc: Gustavo Padovan <gustavo@padovan.org> > Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> > Cc: "Christian König" <christian.koenig@amd.com > Cc: Daniel Vetter <daniel.vetter@ffwll.ch> > Cc: Alex Deucher <alexander.deucher@amd.com> > --- > drivers/dma-buf/dma-fence.c | 4 +++- > include/linux/dma-fence.h | 2 ++ > 2 files changed, 5 insertions(+), 1 deletion(-) > > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c > index 0918d3f003d6..0646357ea350 100644 > --- a/drivers/dma-buf/dma-fence.c > +++ b/drivers/dma-buf/dma-fence.c > @@ -36,8 +36,10 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal); > * fence context, this allows checking if fences belong to the same > * context or not. One device can have multiple separate contexts, > * and they're used if some engine can run independently of another. > + * > + * 0 is excluded and treated as a special DMA_FENCE_NO_CONTEXT. > */ > -static atomic64_t dma_fence_context_counter = ATOMIC64_INIT(0); > +static atomic64_t dma_fence_context_counter = ATOMIC64_INIT(1); > > /** > * dma_fence_context_alloc - allocate an array of fence contexts > diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h > index 6048fa404e57..adfdc7fdd9c3 100644 > --- a/include/linux/dma-fence.h > +++ b/include/linux/dma-fence.h > @@ -455,6 +455,8 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr) > return ret < 0 ? ret : 0; > } > > +#define DMA_FENCE_NO_CONTEXT ((u64)0) > + > u64 dma_fence_context_alloc(unsigned num); > > #define DMA_FENCE_TRACE(f, fmt, args...) \
On Sat, Apr 08, 2017 at 07:49:37PM +0200, Christian König wrote: > Am 08.04.2017 um 18:26 schrieb Chris Wilson: > >Reserve 0 for general use a token meaning that the fence doesn't belong > >to an ordered timeline (fence context). > > NAK, we kept context allocation cheap to avoid exactly that. However, they result in very sparse mappings. > Please elaborate further why it should be necessary now. Because I want to efficiently exclude them from comparisons as demonstrated by this small series as there may be several hundred such fences as dependencies for this job. -Chris
Am 08.04.2017 um 20:00 schrieb Chris Wilson: > On Sat, Apr 08, 2017 at 07:49:37PM +0200, Christian König wrote: >> Am 08.04.2017 um 18:26 schrieb Chris Wilson: >>> Reserve 0 for general use a token meaning that the fence doesn't belong >>> to an ordered timeline (fence context). >> NAK, we kept context allocation cheap to avoid exactly that. > However, they result in very sparse mappings. Which is perfectly fine at least for how we use this in Radeon and Amdgpu. The fence context is used as key for a hashtable, even when the context is only used once we want an evenly distribution over all of them. > >> Please elaborate further why it should be necessary now. > Because I want to efficiently exclude them from comparisons as > demonstrated by this small series as there may be several hundred such > fences as dependencies for this job. That would horrible break Amdgpu, Radeon and the GPU scheduler because all of them assume that context numbers are unique. So that is a clear NAK from my side, Christian. > -Chris >
On Sun, Apr 09, 2017 at 09:08:53AM +0200, Christian König wrote: > Am 08.04.2017 um 20:00 schrieb Chris Wilson: > >On Sat, Apr 08, 2017 at 07:49:37PM +0200, Christian König wrote: > >>Am 08.04.2017 um 18:26 schrieb Chris Wilson: > >>>Reserve 0 for general use a token meaning that the fence doesn't belong > >>>to an ordered timeline (fence context). > >>NAK, we kept context allocation cheap to avoid exactly that. > >However, they result in very sparse mappings. > > Which is perfectly fine at least for how we use this in Radeon and Amdgpu. > > The fence context is used as key for a hashtable, even when the > context is only used once we want an evenly distribution over all of > them. The ht is a more expensive solution. > >>Please elaborate further why it should be necessary now. > >Because I want to efficiently exclude them from comparisons as > >demonstrated by this small series as there may be several hundred such > >fences as dependencies for this job. > > That would horrible break Amdgpu, Radeon and the GPU scheduler > because all of them assume that context numbers are unique. Why would it break if it respected the trivial notion of an unordered timeline? -Chris
Am 09.04.2017 um 18:02 schrieb Chris Wilson: > On Sun, Apr 09, 2017 at 09:08:53AM +0200, Christian König wrote: >> Am 08.04.2017 um 20:00 schrieb Chris Wilson: >>> On Sat, Apr 08, 2017 at 07:49:37PM +0200, Christian König wrote: >>>> Am 08.04.2017 um 18:26 schrieb Chris Wilson: >>>>> Reserve 0 for general use a token meaning that the fence doesn't belong >>>>> to an ordered timeline (fence context). >>>> NAK, we kept context allocation cheap to avoid exactly that. >>> However, they result in very sparse mappings. >> Which is perfectly fine at least for how we use this in Radeon and Amdgpu. >> >> The fence context is used as key for a hashtable, even when the >> context is only used once we want an evenly distribution over all of >> them. > The ht is a more expensive solution. Not when the fences without contexts are rare. >>>> Please elaborate further why it should be necessary now. >>> Because I want to efficiently exclude them from comparisons as >>> demonstrated by this small series as there may be several hundred such >>> fences as dependencies for this job. >> That would horrible break Amdgpu, Radeon and the GPU scheduler >> because all of them assume that context numbers are unique. > Why would it break if it respected the trivial notion of an unordered > timeline? You seem to miss the point. We already had that discussion and decided against using no-context fences. Now we have a whole bunch of cases over different drivers/components where we assume uniqueness of fence contexts. If you change this without fixing all those cases they will just subtle break, probably without anybody noticing the problem immediately. So if you don't want to scan all drivers for such cases (and I know at least three of hand) and fix them before that patch then that is a clear no-go for this patch. I suggest to just use two separate fence contexts for you clflush problem in i915. We have a similar situation in the GPU scheduler solve it the same way. Regards, Christian. > -Chris >
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 0918d3f003d6..0646357ea350 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -36,8 +36,10 @@ EXPORT_TRACEPOINT_SYMBOL(dma_fence_enable_signal); * fence context, this allows checking if fences belong to the same * context or not. One device can have multiple separate contexts, * and they're used if some engine can run independently of another. + * + * 0 is excluded and treated as a special DMA_FENCE_NO_CONTEXT. */ -static atomic64_t dma_fence_context_counter = ATOMIC64_INIT(0); +static atomic64_t dma_fence_context_counter = ATOMIC64_INIT(1); /** * dma_fence_context_alloc - allocate an array of fence contexts diff --git a/include/linux/dma-fence.h b/include/linux/dma-fence.h index 6048fa404e57..adfdc7fdd9c3 100644 --- a/include/linux/dma-fence.h +++ b/include/linux/dma-fence.h @@ -455,6 +455,8 @@ static inline signed long dma_fence_wait(struct dma_fence *fence, bool intr) return ret < 0 ? ret : 0; } +#define DMA_FENCE_NO_CONTEXT ((u64)0) + u64 dma_fence_context_alloc(unsigned num); #define DMA_FENCE_TRACE(f, fmt, args...) \
Reserve 0 for general use a token meaning that the fence doesn't belong to an ordered timeline (fence context). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Cc: Sumit Semwal <sumit.semwal@linaro.org> Cc: Gustavo Padovan <gustavo@padovan.org> Cc: Joonas Lahtinen <joonas.lahtinen@linux.intel.com> Cc: "Christian König" <christian.koenig@amd.com Cc: Daniel Vetter <daniel.vetter@ffwll.ch> Cc: Alex Deucher <alexander.deucher@amd.com> --- drivers/dma-buf/dma-fence.c | 4 +++- include/linux/dma-fence.h | 2 ++ 2 files changed, 5 insertions(+), 1 deletion(-)