Message ID | 20210723075857.4065-1-michel@daenzer.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | dma-buf/poll: Get a file reference for outstanding fence callbacks | expand |
On 2021-07-23 10:04 a.m., Christian König wrote: > Am 23.07.21 um 09:58 schrieb Michel Dänzer: >> From: Michel Dänzer <mdaenzer@redhat.com> >> >> This makes sure we don't hit the >> >> BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active); >> >> in dma_buf_release, which could be triggered by user space closing the >> dma-buf file description while there are outstanding fence callbacks >> from dma_buf_poll. > > I was also wondering the same thing while working on this, but then thought that the poll interface would take care of this. I was able to hit the BUG_ON with https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1880 . >> Cc: stable@vger.kernel.org >> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> >> --- >> drivers/dma-buf/dma-buf.c | 18 ++++++++++++------ >> 1 file changed, 12 insertions(+), 6 deletions(-) >> >> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c >> index 6c520c9bd93c..ec25498a971f 100644 >> --- a/drivers/dma-buf/dma-buf.c >> +++ b/drivers/dma-buf/dma-buf.c >> @@ -65,12 +65,9 @@ static void dma_buf_release(struct dentry *dentry) >> BUG_ON(dmabuf->vmapping_counter); >> /* >> - * Any fences that a dma-buf poll can wait on should be signaled >> - * before releasing dma-buf. This is the responsibility of each >> - * driver that uses the reservation objects. >> - * >> - * If you hit this BUG() it means someone dropped their ref to the >> - * dma-buf while still having pending operation to the buffer. >> + * If you hit this BUG() it could mean: >> + * * There's a file reference imbalance in dma_buf_poll / dma_buf_poll_cb or somewhere else >> + * * dmabuf->cb_in/out.active are non-0 despite no pending fence callback >> */ >> BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active); >> @@ -196,6 +193,7 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence) >> static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb) >> { >> struct dma_buf_poll_cb_t *dcb = (struct dma_buf_poll_cb_t *)cb; >> + struct dma_buf *dmabuf = container_of(dcb->poll, struct dma_buf, poll); >> unsigned long flags; >> spin_lock_irqsave(&dcb->poll->lock, flags); >> @@ -203,6 +201,8 @@ static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb) >> dcb->active = 0; >> spin_unlock_irqrestore(&dcb->poll->lock, flags); >> dma_fence_put(fence); >> + /* Paired with get_file in dma_buf_poll */ >> + fput(dmabuf->file); > > Is calling fput() in interrupt context ok? IIRC that could potentially sleep. Looks fine AFAICT: It has if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) { and as a fallback for that, it adds the file to a lock-less delayed_fput_list which is processed by a workqueue.
Am 23.07.21 um 10:19 schrieb Michel Dänzer: > On 2021-07-23 10:04 a.m., Christian König wrote: >> Am 23.07.21 um 09:58 schrieb Michel Dänzer: >>> From: Michel Dänzer <mdaenzer@redhat.com> >>> >>> This makes sure we don't hit the >>> >>> BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active); >>> >>> in dma_buf_release, which could be triggered by user space closing the >>> dma-buf file description while there are outstanding fence callbacks >>> from dma_buf_poll. >> I was also wondering the same thing while working on this, but then thought that the poll interface would take care of this. > I was able to hit the BUG_ON with https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1880 . > > >>> Cc: stable@vger.kernel.org >>> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> >>> --- >>> drivers/dma-buf/dma-buf.c | 18 ++++++++++++------ >>> 1 file changed, 12 insertions(+), 6 deletions(-) >>> >>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c >>> index 6c520c9bd93c..ec25498a971f 100644 >>> --- a/drivers/dma-buf/dma-buf.c >>> +++ b/drivers/dma-buf/dma-buf.c >>> @@ -65,12 +65,9 @@ static void dma_buf_release(struct dentry *dentry) >>> BUG_ON(dmabuf->vmapping_counter); >>> /* >>> - * Any fences that a dma-buf poll can wait on should be signaled >>> - * before releasing dma-buf. This is the responsibility of each >>> - * driver that uses the reservation objects. >>> - * >>> - * If you hit this BUG() it means someone dropped their ref to the >>> - * dma-buf while still having pending operation to the buffer. >>> + * If you hit this BUG() it could mean: >>> + * * There's a file reference imbalance in dma_buf_poll / dma_buf_poll_cb or somewhere else >>> + * * dmabuf->cb_in/out.active are non-0 despite no pending fence callback >>> */ >>> BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active); >>> @@ -196,6 +193,7 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence) >>> static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb) >>> { >>> struct dma_buf_poll_cb_t *dcb = (struct dma_buf_poll_cb_t *)cb; >>> + struct dma_buf *dmabuf = container_of(dcb->poll, struct dma_buf, poll); >>> unsigned long flags; >>> spin_lock_irqsave(&dcb->poll->lock, flags); >>> @@ -203,6 +201,8 @@ static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb) >>> dcb->active = 0; >>> spin_unlock_irqrestore(&dcb->poll->lock, flags); >>> dma_fence_put(fence); >>> + /* Paired with get_file in dma_buf_poll */ >>> + fput(dmabuf->file); >> Is calling fput() in interrupt context ok? IIRC that could potentially sleep. > Looks fine AFAICT: It has > > if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) { > > and as a fallback for that, it adds the file to a lock-less delayed_fput_list which is processed by a workqueue. Ah, yes that makes sense. Fell free to add Reviewed-by: Christian König <christian.koenig@amd.com> Thanks, Christian.
On Fri, Jul 23, 2021 at 10:19:49AM +0200, Michel Dänzer wrote: > On 2021-07-23 10:04 a.m., Christian König wrote: > > Am 23.07.21 um 09:58 schrieb Michel Dänzer: > >> From: Michel Dänzer <mdaenzer@redhat.com> > >> > >> This makes sure we don't hit the > >> > >> BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active); > >> > >> in dma_buf_release, which could be triggered by user space closing the > >> dma-buf file description while there are outstanding fence callbacks > >> from dma_buf_poll. > > > > I was also wondering the same thing while working on this, but then thought that the poll interface would take care of this. > > I was able to hit the BUG_ON with https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1880 . igt test would be really lovely. Maybe base something off the import/export igts from Jason? -Daniel > > > >> Cc: stable@vger.kernel.org > >> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> > >> --- > >> drivers/dma-buf/dma-buf.c | 18 ++++++++++++------ > >> 1 file changed, 12 insertions(+), 6 deletions(-) > >> > >> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c > >> index 6c520c9bd93c..ec25498a971f 100644 > >> --- a/drivers/dma-buf/dma-buf.c > >> +++ b/drivers/dma-buf/dma-buf.c > >> @@ -65,12 +65,9 @@ static void dma_buf_release(struct dentry *dentry) > >> BUG_ON(dmabuf->vmapping_counter); > >> /* > >> - * Any fences that a dma-buf poll can wait on should be signaled > >> - * before releasing dma-buf. This is the responsibility of each > >> - * driver that uses the reservation objects. > >> - * > >> - * If you hit this BUG() it means someone dropped their ref to the > >> - * dma-buf while still having pending operation to the buffer. > >> + * If you hit this BUG() it could mean: > >> + * * There's a file reference imbalance in dma_buf_poll / dma_buf_poll_cb or somewhere else > >> + * * dmabuf->cb_in/out.active are non-0 despite no pending fence callback > >> */ > >> BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active); > >> @@ -196,6 +193,7 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence) > >> static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb) > >> { > >> struct dma_buf_poll_cb_t *dcb = (struct dma_buf_poll_cb_t *)cb; > >> + struct dma_buf *dmabuf = container_of(dcb->poll, struct dma_buf, poll); > >> unsigned long flags; > >> spin_lock_irqsave(&dcb->poll->lock, flags); > >> @@ -203,6 +201,8 @@ static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb) > >> dcb->active = 0; > >> spin_unlock_irqrestore(&dcb->poll->lock, flags); > >> dma_fence_put(fence); > >> + /* Paired with get_file in dma_buf_poll */ > >> + fput(dmabuf->file); > > > > Is calling fput() in interrupt context ok? IIRC that could potentially sleep. > > Looks fine AFAICT: It has > > if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) { > > and as a fallback for that, it adds the file to a lock-less delayed_fput_list which is processed by a workqueue. > > > -- > Earthling Michel Dänzer | https://redhat.com > Libre software enthusiast | Mesa and X developer
On 2021-07-23 11:02 a.m., Daniel Vetter wrote: > On Fri, Jul 23, 2021 at 10:19:49AM +0200, Michel Dänzer wrote: >> On 2021-07-23 10:04 a.m., Christian König wrote: >>> Am 23.07.21 um 09:58 schrieb Michel Dänzer: >>>> From: Michel Dänzer <mdaenzer@redhat.com> >>>> >>>> This makes sure we don't hit the >>>> >>>> BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active); >>>> >>>> in dma_buf_release, which could be triggered by user space closing the >>>> dma-buf file description while there are outstanding fence callbacks >>>> from dma_buf_poll. >>> >>> I was also wondering the same thing while working on this, but then thought that the poll interface would take care of this. >> >> I was able to hit the BUG_ON with https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1880 . > > igt test would be really lovely. Maybe base something off the > import/export igts from Jason? I'll see what I can do, busy with other stuff right now though.
On 2021-07-23 10:22, Christian König wrote: > Am 23.07.21 um 10:19 schrieb Michel Dänzer: >> On 2021-07-23 10:04 a.m., Christian König wrote: >>> Am 23.07.21 um 09:58 schrieb Michel Dänzer: >>>> From: Michel Dänzer <mdaenzer@redhat.com> >>>> >>>> This makes sure we don't hit the >>>> >>>> BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active); >>>> >>>> in dma_buf_release, which could be triggered by user space closing the >>>> dma-buf file description while there are outstanding fence callbacks >>>> from dma_buf_poll. >>> I was also wondering the same thing while working on this, but then thought that the poll interface would take care of this. >> I was able to hit the BUG_ON with https://gitlab.gnome.org/GNOME/mutter/-/merge_requests/1880 . >> >> >>>> Cc: stable@vger.kernel.org >>>> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> >>>> --- >>>> drivers/dma-buf/dma-buf.c | 18 ++++++++++++------ >>>> 1 file changed, 12 insertions(+), 6 deletions(-) >>>> >>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c >>>> index 6c520c9bd93c..ec25498a971f 100644 >>>> --- a/drivers/dma-buf/dma-buf.c >>>> +++ b/drivers/dma-buf/dma-buf.c >>>> @@ -65,12 +65,9 @@ static void dma_buf_release(struct dentry *dentry) >>>> BUG_ON(dmabuf->vmapping_counter); >>>> /* >>>> - * Any fences that a dma-buf poll can wait on should be signaled >>>> - * before releasing dma-buf. This is the responsibility of each >>>> - * driver that uses the reservation objects. >>>> - * >>>> - * If you hit this BUG() it means someone dropped their ref to the >>>> - * dma-buf while still having pending operation to the buffer. >>>> + * If you hit this BUG() it could mean: >>>> + * * There's a file reference imbalance in dma_buf_poll / dma_buf_poll_cb or somewhere else >>>> + * * dmabuf->cb_in/out.active are non-0 despite no pending fence callback >>>> */ >>>> BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active); >>>> @@ -196,6 +193,7 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence) >>>> static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb) >>>> { >>>> struct dma_buf_poll_cb_t *dcb = (struct dma_buf_poll_cb_t *)cb; >>>> + struct dma_buf *dmabuf = container_of(dcb->poll, struct dma_buf, poll); >>>> unsigned long flags; >>>> spin_lock_irqsave(&dcb->poll->lock, flags); >>>> @@ -203,6 +201,8 @@ static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb) >>>> dcb->active = 0; >>>> spin_unlock_irqrestore(&dcb->poll->lock, flags); >>>> dma_fence_put(fence); >>>> + /* Paired with get_file in dma_buf_poll */ >>>> + fput(dmabuf->file); >>> Is calling fput() in interrupt context ok? IIRC that could potentially sleep. >> Looks fine AFAICT: It has >> >> if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) { >> >> and as a fallback for that, it adds the file to a lock-less delayed_fput_list which is processed by a workqueue. > > Ah, yes that makes sense. > > Fell free to add Reviewed-by: Christian König <christian.koenig@amd.com> Thanks! AFAICT this fix can be merged now for 5.16?
Am 03.11.21 um 15:50 schrieb Michel Dänzer: > On 2021-07-23 10:22, Christian König wrote: >> Am 23.07.21 um 10:19 schrieb Michel Dänzer: >>> On 2021-07-23 10:04 a.m., Christian König wrote: >>>> Am 23.07.21 um 09:58 schrieb Michel Dänzer: >>>>> From: Michel Dänzer <mdaenzer@redhat.com> >>>>> >>>>> This makes sure we don't hit the >>>>> >>>>> BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active); >>>>> >>>>> in dma_buf_release, which could be triggered by user space closing the >>>>> dma-buf file description while there are outstanding fence callbacks >>>>> from dma_buf_poll. >>>> I was also wondering the same thing while working on this, but then thought that the poll interface would take care of this. >>> I was able to hit the BUG_ON with https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Fgitlab.gnome.org%2FGNOME%2Fmutter%2F-%2Fmerge_requests%2F1880&data=04%7C01%7Cchristian.koenig%40amd.com%7C8d930ab39011481a839c08d99ed95755%7C3dd8961fe4884e608e11a82d994e183d%7C0%7C0%7C637715479787056688%7CUnknown%7CTWFpbGZsb3d8eyJWIjoiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C3000&sdata=SjxSZIsWkP7ru1iHyL0IY9hN9882ENv7Cy38vzOtqyc%3D&reserved=0 . >>> >>> >>>>> Cc: stable@vger.kernel.org >>>>> Signed-off-by: Michel Dänzer <mdaenzer@redhat.com> >>>>> --- >>>>> drivers/dma-buf/dma-buf.c | 18 ++++++++++++------ >>>>> 1 file changed, 12 insertions(+), 6 deletions(-) >>>>> >>>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c >>>>> index 6c520c9bd93c..ec25498a971f 100644 >>>>> --- a/drivers/dma-buf/dma-buf.c >>>>> +++ b/drivers/dma-buf/dma-buf.c >>>>> @@ -65,12 +65,9 @@ static void dma_buf_release(struct dentry *dentry) >>>>> BUG_ON(dmabuf->vmapping_counter); >>>>> /* >>>>> - * Any fences that a dma-buf poll can wait on should be signaled >>>>> - * before releasing dma-buf. This is the responsibility of each >>>>> - * driver that uses the reservation objects. >>>>> - * >>>>> - * If you hit this BUG() it means someone dropped their ref to the >>>>> - * dma-buf while still having pending operation to the buffer. >>>>> + * If you hit this BUG() it could mean: >>>>> + * * There's a file reference imbalance in dma_buf_poll / dma_buf_poll_cb or somewhere else >>>>> + * * dmabuf->cb_in/out.active are non-0 despite no pending fence callback >>>>> */ >>>>> BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active); >>>>> @@ -196,6 +193,7 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence) >>>>> static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb) >>>>> { >>>>> struct dma_buf_poll_cb_t *dcb = (struct dma_buf_poll_cb_t *)cb; >>>>> + struct dma_buf *dmabuf = container_of(dcb->poll, struct dma_buf, poll); >>>>> unsigned long flags; >>>>> spin_lock_irqsave(&dcb->poll->lock, flags); >>>>> @@ -203,6 +201,8 @@ static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb) >>>>> dcb->active = 0; >>>>> spin_unlock_irqrestore(&dcb->poll->lock, flags); >>>>> dma_fence_put(fence); >>>>> + /* Paired with get_file in dma_buf_poll */ >>>>> + fput(dmabuf->file); >>>> Is calling fput() in interrupt context ok? IIRC that could potentially sleep. >>> Looks fine AFAICT: It has >>> >>> if (likely(!in_interrupt() && !(task->flags & PF_KTHREAD))) { >>> >>> and as a fallback for that, it adds the file to a lock-less delayed_fput_list which is processed by a workqueue. >> Ah, yes that makes sense. >> >> Fell free to add Reviewed-by: Christian König <christian.koenig@amd.com> > Thanks! AFAICT this fix can be merged now for 5.16? I've just pushed it to drm-misc-next-fixes since it won't even apply to drm-misc-fixes. Could be that we get requests to backport this because of the CC stable. Christian.
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 6c520c9bd93c..ec25498a971f 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -65,12 +65,9 @@ static void dma_buf_release(struct dentry *dentry) BUG_ON(dmabuf->vmapping_counter); /* - * Any fences that a dma-buf poll can wait on should be signaled - * before releasing dma-buf. This is the responsibility of each - * driver that uses the reservation objects. - * - * If you hit this BUG() it means someone dropped their ref to the - * dma-buf while still having pending operation to the buffer. + * If you hit this BUG() it could mean: + * * There's a file reference imbalance in dma_buf_poll / dma_buf_poll_cb or somewhere else + * * dmabuf->cb_in/out.active are non-0 despite no pending fence callback */ BUG_ON(dmabuf->cb_in.active || dmabuf->cb_out.active); @@ -196,6 +193,7 @@ static loff_t dma_buf_llseek(struct file *file, loff_t offset, int whence) static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb) { struct dma_buf_poll_cb_t *dcb = (struct dma_buf_poll_cb_t *)cb; + struct dma_buf *dmabuf = container_of(dcb->poll, struct dma_buf, poll); unsigned long flags; spin_lock_irqsave(&dcb->poll->lock, flags); @@ -203,6 +201,8 @@ static void dma_buf_poll_cb(struct dma_fence *fence, struct dma_fence_cb *cb) dcb->active = 0; spin_unlock_irqrestore(&dcb->poll->lock, flags); dma_fence_put(fence); + /* Paired with get_file in dma_buf_poll */ + fput(dmabuf->file); } static bool dma_buf_poll_shared(struct dma_resv *resv, @@ -278,6 +278,9 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll) spin_unlock_irq(&dmabuf->poll.lock); if (events & EPOLLOUT) { + /* Paired with fput in dma_buf_poll_cb */ + get_file(dmabuf->file); + if (!dma_buf_poll_shared(resv, dcb) && !dma_buf_poll_excl(resv, dcb)) /* No callback queued, wake up any other waiters */ @@ -299,6 +302,9 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll) spin_unlock_irq(&dmabuf->poll.lock); if (events & EPOLLIN) { + /* Paired with fput in dma_buf_poll_cb */ + get_file(dmabuf->file); + if (!dma_buf_poll_excl(resv, dcb)) /* No callback queued, wake up any other waiters */ dma_buf_poll_cb(NULL, &dcb->cb);