Message ID | 20180129154002.3287-1-oded.gabbay@gmail.com (mailing list archive) |
---|---|
State | New, archived |
On Mon, Jan 29, 2018 at 05:40:02PM +0200, Oded Gabbay wrote:
> In dma_fence_release() there is a WARN_ON which could be triggered by
> several cases of wrong dma-fence usage. This patch adds a comment to
> explain two use-cases to help driver developers that use dma-fence
> and trigger that WARN_ON to better understand the reasons for it.
>
> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>

Not sure how useful this is, trying to do anything with a dma_fence while
not holding a reference is just plain buggy. If you do that, then all
kinds of use-after-free hilarity can happen. Trying to enumerate all the
ways you can get refcounting wrong seems futile.

What we maybe could do is a simple one-line like

/* Failed to signal before release, could be a refcounting issue. */

I think this is generally a true statement and more useful hint for the
next convoluted scenario (which in all likelihood will be different from
yours).
-Daniel

> ---
>  drivers/dma-buf/dma-fence.c | 33 +++++++++++++++++++++++++++++++++
>  1 file changed, 33 insertions(+)
>
> diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
> index 5d101c4053e0..a7170ab23ec0 100644
> --- a/drivers/dma-buf/dma-fence.c
> +++ b/drivers/dma-buf/dma-fence.c
> @@ -171,6 +171,39 @@ void dma_fence_release(struct kref *kref)
>
>  	trace_dma_fence_destroy(fence);
>
> +	/*
> +	 * If the WARN_ON below is triggered it could be because the dma fence
> +	 * was not signaled and therefore, the cb list is still not empty
> +	 * because the cb functions were not called.
> +	 *
> +	 * A more subtle case is where the fence got signaled by a thread that
> +	 * didn't hold a ref to the fence. The following describes the scenario:
> +	 *
> +	 *	Thread A			Thread B
> +	 *--------------------------	--------------------------
> +	 * calls dma_fence_signal() {
> +	 *	set signal bit
> +	 *
> +	 *	scheduled out
> +	 *	--------------------------->	calls dma_fence_wait_timeout() and
> +	 *					returns immediately
> +	 *
> +	 *					calls dma_fence_put()
> +	 *						|
> +	 *						|thread A doesn't hold ref
> +	 *						|to fence so ref goes to 0
> +	 *						|and release is called
> +	 *						|
> +	 *						-> dma_fence_release()
> +	 *							|
> +	 *							-> WARN_ON triggered
> +	 *
> +	 *	go over CB list,
> +	 *	call each CB and remove it
> +	 *	}
> +	 *
> +	 *
> +	 */
>  	WARN_ON(!list_empty(&fence->cb_list));
>
>  	if (fence->ops->release)
> --
> 2.14.3
>
On Tue, Jan 30, 2018 at 12:33 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Mon, Jan 29, 2018 at 05:40:02PM +0200, Oded Gabbay wrote:
>> In dma_fence_release() there is a WARN_ON which could be triggered by
>> several cases of wrong dma-fence usage. This patch adds a comment to
>> explain two use-cases to help driver developers that use dma-fence
>> and trigger that WARN_ON to better understand the reasons for it.
>>
>> Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
>
> Not sure how useful this is, trying to do anything with a dma_fence while
> not holding a reference is just plain buggy. If you do that, then all
> kinds of use-after-free hilarity can happen. Trying to enumerate all the
> ways you can get refcounting wrong seems futile.
>
> What we maybe could do is a simple one-line like
>
> /* Failed to signal before release, could be a refcounting issue. */
>
> I think this is generally a true statement and more useful hint for the
> next convoluted scenario (which in all likelihood will be different from
> yours).
> -Daniel

Hi Daniel,
Yeah, I get what you are saying.
Can I add your RB to the one-liner version ?

Thanks,
Oded
On Wed, Jan 31, 2018 at 11:03:39AM +0200, Oded Gabbay wrote:
> On Tue, Jan 30, 2018 at 12:33 PM, Daniel Vetter <daniel@ffwll.ch> wrote:
> > Not sure how useful this is, trying to do anything with a dma_fence while
> > not holding a reference is just plain buggy. If you do that, then all
> > kinds of use-after-free hilarity can happen. Trying to enumerate all the
> > ways you can get refcounting wrong seems futile.
> >
> > What we maybe could do is a simple one-line like
> >
> > /* Failed to signal before release, could be a refcounting issue. */
> >
> > I think this is generally a true statement and more useful hint for the
> > next convoluted scenario (which in all likelihood will be different from
> > yours).
> > -Daniel
>
> Hi Daniel,
> Yeah, I get what you are saying.
> Can I add your RB to the one-liner version ?

Sure.

Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch>

Cheers, Daniel
diff --git a/drivers/dma-buf/dma-fence.c b/drivers/dma-buf/dma-fence.c
index 5d101c4053e0..a7170ab23ec0 100644
--- a/drivers/dma-buf/dma-fence.c
+++ b/drivers/dma-buf/dma-fence.c
@@ -171,6 +171,39 @@ void dma_fence_release(struct kref *kref)
 
 	trace_dma_fence_destroy(fence);
 
+	/*
+	 * If the WARN_ON below is triggered it could be because the dma fence
+	 * was not signaled and therefore, the cb list is still not empty
+	 * because the cb functions were not called.
+	 *
+	 * A more subtle case is where the fence got signaled by a thread that
+	 * didn't hold a ref to the fence. The following describes the scenario:
+	 *
+	 *	Thread A			Thread B
+	 *--------------------------	--------------------------
+	 * calls dma_fence_signal() {
+	 *	set signal bit
+	 *
+	 *	scheduled out
+	 *	--------------------------->	calls dma_fence_wait_timeout() and
+	 *					returns immediately
+	 *
+	 *					calls dma_fence_put()
+	 *						|
+	 *						|thread A doesn't hold ref
+	 *						|to fence so ref goes to 0
+	 *						|and release is called
+	 *						|
+	 *						-> dma_fence_release()
+	 *							|
+	 *							-> WARN_ON triggered
+	 *
+	 *	go over CB list,
+	 *	call each CB and remove it
+	 *	}
+	 *
+	 *
+	 */
 	WARN_ON(!list_empty(&fence->cb_list));
 
 	if (fence->ops->release)
In dma_fence_release() there is a WARN_ON which could be triggered by
several cases of wrong dma-fence usage. This patch adds a comment to
explain two use-cases to help driver developers that use dma-fence
and trigger that WARN_ON to better understand the reasons for it.

Signed-off-by: Oded Gabbay <oded.gabbay@gmail.com>
---
 drivers/dma-buf/dma-fence.c | 33 +++++++++++++++++++++++++++++++++
 1 file changed, 33 insertions(+)
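
For readers who want to step through the Thread A / Thread B interleaving from the patch comment, here is a minimal user-space sketch in C. It is not kernel code: `struct toy_fence` and every `toy_*` helper, as well as `replay_race()`, are hypothetical names invented for this illustration, and the two threads are replayed sequentially in the order the comment describes rather than run concurrently. The model only counts how often a check equivalent to `WARN_ON(!list_empty(&fence->cb_list))` would fire.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct dma_fence: only the state the race touches. */
struct toy_fence {
	int refcount;     /* models the kref */
	bool signaled;    /* models the "signaled" flag bit */
	int pending_cbs;  /* models the number of entries on cb_list */
	int warnings;     /* how often the modelled WARN_ON fired */
	bool released;    /* set once the last reference is dropped */
};

static void toy_fence_init(struct toy_fence *f, int pending_cbs)
{
	f->refcount = 1;
	f->signaled = false;
	f->pending_cbs = pending_cbs;
	f->warnings = 0;
	f->released = false;
}

static void toy_fence_get(struct toy_fence *f)
{
	assert(f->refcount > 0);
	f->refcount++;
}

static void toy_fence_put(struct toy_fence *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0) {
		/* Models WARN_ON(!list_empty(&fence->cb_list)) in release. */
		if (f->pending_cbs != 0)
			f->warnings++;
		f->released = true;
	}
}

/* First half of signalling: set the bit, so waiters return immediately. */
static void toy_signal_set_bit(struct toy_fence *f)
{
	f->signaled = true;
}

/* Second half of signalling: call each callback and remove it. */
static void toy_signal_run_cbs(struct toy_fence *f)
{
	f->pending_cbs = 0;
}

/* Thread B: the wait returns once the bit is set, then B drops its ref. */
static void toy_wait_and_put(struct toy_fence *f)
{
	if (f->signaled)
		toy_fence_put(f);
}

/*
 * Replays the interleaving from the patch comment and returns how many
 * times the modelled WARN_ON fired. In the buggy case the real kernel
 * would touch freed memory in the third step; here the fence lives on
 * the stack, so the model stays well defined.
 */
int replay_race(bool signaler_holds_ref)
{
	struct toy_fence f;

	toy_fence_init(&f, 2);          /* B owns the only ref, 2 callbacks */
	if (signaler_holds_ref)
		toy_fence_get(&f);      /* A takes its own ref first */

	toy_signal_set_bit(&f);         /* A: set bit, then scheduled out */
	toy_wait_and_put(&f);           /* B: wait returns, put */
	toy_signal_run_cbs(&f);         /* A: resumes, walks the CB list */

	if (signaler_holds_ref)
		toy_fence_put(&f);      /* A drops its ref after signalling */
	return f.warnings;
}
```

In this model `replay_race(false)` returns 1, because B's put drops the last reference while two callbacks are still pending, and `replay_race(true)` returns 0, because the final put happens only after A has emptied the list. That matches the point made in the review: the underlying problem is signalling without holding a reference, and the warning is just one of the ways it can surface.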