Message ID | 20170811170147.26416-1-chris@chris-wilson.co.uk (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Am 11.08.2017 um 19:01 schrieb Chris Wilson: > This is an illegal scenario, to free the fence whilst there are pending > callbacks. Currently, we emit a WARN and then cast aside the callbacks > leaving them dangling. Alternatively, we could set an error on the fence > and then signal fence so that any dependency chains from the fence can > be tidied up, and if they care they can check for the error. > > The question is whether or not the cure is worse than the disease > (premature fence signaling is never pretty). > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> Not sure if -EDEADLK is the best error code, but in general the approach sounds like the least evil to me. Patch is Reviewed-by: Christian König <christian.koenig@amd.com>. > --- > drivers/dma-buf/dma-fence.c | 14 +++++++++++++- > 1 file changed, 13 insertions(+), 1 deletion(-) > > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c > index 9a302799040e..ed311edbeefa 100644 > --- a/drivers/dma-buf/dma-fence.c > +++ b/drivers/dma-buf/dma-fence.c > @@ -172,7 +172,19 @@ void dma_fence_release(struct kref *kref) > > trace_dma_fence_destroy(fence); > > - WARN_ON(!list_empty(&fence->cb_list)); > + if (WARN_ON(!list_empty(&fence->cb_list))) { > + unsigned long flags; > + > + /* > + * This should never happen, but if it does make sure that we > + * don't leave chains dangling. We set the error flag first > + * so that the callbacks know this signal is due to an error. > + */ > + spin_lock_irqsave(fence->lock, flags); > + fence->error = -EDEADLK; > + dma_fence_signal_locked(fence); > + spin_unlock_irqrestore(fence->lock, flags); > + } > > if (fence->ops->release) > fence->ops->release(fence);
Quoting Christian König (2017-08-13 14:04:29) > Am 11.08.2017 um 19:01 schrieb Chris Wilson: > > This is an illegal scenario, to free the fence whilst there are pending > > callbacks. Currently, we emit a WARN and then cast aside the callbacks > > leaving them dangling. Alternatively, we could set an error on the fence > > and then signal fence so that any dependency chains from the fence can > > be tidied up, and if they care they can check for the error. > > > > The question is whether or not the cure is worse than the disease > > (premature fence signaling is never pretty). > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Not sure if -EDEADLK is the best error code, but in general the approach > sounds like the least evil to me. EDEADLK felt appropriate since that was the situation that I expect might arise, well livelock. -ENXIO? That seems a reasonable alternative. -ENODEV? The classic -ENOTTY? -Chris
Am 15.08.2017 um 13:23 schrieb Chris Wilson: > Quoting Christian König (2017-08-13 14:04:29) >> Am 11.08.2017 um 19:01 schrieb Chris Wilson: >>> This is an illegal scenario, to free the fence whilst there are pending >>> callbacks. Currently, we emit a WARN and then cast aside the callbacks >>> leaving them dangling. Alternatively, we could set an error on the fence >>> and then signal fence so that any dependency chains from the fence can >>> be tidied up, and if they care they can check for the error. >>> >>> The question is whether or not the cure is worse than the disease >>> (premature fence signaling is never pretty). >>> >>> Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> >> Not sure if -EDEADLK is the best error code, but in general the approach >> sounds like the least evil to me. > EDEADLK felt appropriate since that was the situation that I expect might > arise, well livelock. > > -ENXIO? That seems a reasonable alternative. > -ENODEV? > The classic -ENOTTY? No strong opinion either. For now I think we should just stick with -EDEADLK. We would need something like -EHUMAN, because in this particular case the human who wrote the code has messed up the reference counting. Christian. > -Chris
Quoting Christian König (2017-08-13 14:04:29) > Am 11.08.2017 um 19:01 schrieb Chris Wilson: > > This is an illegal scenario, to free the fence whilst there are pending > > callbacks. Currently, we emit a WARN and then cast aside the callbacks > > leaving them dangling. Alternatively, we could set an error on the fence > > and then signal fence so that any dependency chains from the fence can > > be tidied up, and if they care they can check for the error. > > > > The question is whether or not the cure is worse than the disease > > (premature fence signaling is never pretty). > > > > Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> > > Not sure if -EDEADLK is the best error code, but in general the approach > sounds like the least evil to me. > > Patch is Reviewed-by: Christian König <christian.koenig@amd.com>. Since the spectre of an early dma_fence_release() has shown itself recently, I thought I'd ping this patch for inclusion along with the new comment. -Chris > > --- > > drivers/dma-buf/dma-fence.c | 14 +++++++++++++- > > 1 file changed, 13 insertions(+), 1 deletion(-) > > > > diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c > > index 9a302799040e..ed311edbeefa 100644 > > --- a/drivers/dma-buf/dma-fence.c > > +++ b/drivers/dma-buf/dma-fence.c > > @@ -172,7 +172,19 @@ void dma_fence_release(struct kref *kref) > > > > trace_dma_fence_destroy(fence); > > > > - WARN_ON(!list_empty(&fence->cb_list)); > > + if (WARN_ON(!list_empty(&fence->cb_list))) { > > + unsigned long flags; > > + > > + /* > > + * This should never happen, but if it does make sure that we > > + * don't leave chains dangling. We set the error flag first > > + * so that the callbacks know this signal is due to an error. > > + */ > > + spin_lock_irqsave(fence->lock, flags); > > + fence->error = -EDEADLK; > > + dma_fence_signal_locked(fence); > > + spin_unlock_irqrestore(fence->lock, flags); > > + } > > > > if (fence->ops->release) > > fence->ops->release(fence); > >
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c index 9a302799040e..ed311edbeefa 100644 --- a/drivers/dma-buf/dma-fence.c +++ b/drivers/dma-buf/dma-fence.c @@ -172,7 +172,19 @@ void dma_fence_release(struct kref *kref) trace_dma_fence_destroy(fence); - WARN_ON(!list_empty(&fence->cb_list)); + if (WARN_ON(!list_empty(&fence->cb_list))) { + unsigned long flags; + + /* + * This should never happen, but if it does make sure that we + * don't leave chains dangling. We set the error flag first + * so that the callbacks know this signal is due to an error. + */ + spin_lock_irqsave(fence->lock, flags); + fence->error = -EDEADLK; + dma_fence_signal_locked(fence); + spin_unlock_irqrestore(fence->lock, flags); + } if (fence->ops->release) fence->ops->release(fence);
This is an illegal scenario, to free the fence whilst there are pending callbacks. Currently, we emit a WARN and then cast aside the callbacks leaving them dangling. Alternatively, we could set an error on the fence and then signal fence so that any dependency chains from the fence can be tidied up, and if they care they can check for the error. The question is whether or not the cure is worse than the disease (premature fence signaling is never pretty). Signed-off-by: Chris Wilson <chris@chris-wilson.co.uk> --- drivers/dma-buf/dma-fence.c | 14 +++++++++++++- 1 file changed, 13 insertions(+), 1 deletion(-)