Message ID | 20230809165330.2451699-2-boris.brezillon@collabora.com (mailing list archive) |
---|---|
State | New, archived |
Series | drm: Add a driver for FW-based Mali GPUs |
On 09/08/2023 17:53, Boris Brezillon wrote:
> This way we can grab a pages ref without acquiring the resv lock when
> pages_use_count > 0. Need to implement asynchronous map using the

NIT: s/Need/This is needed/

> drm_gpuva_mgr when the map/unmap operation triggers a mapping split,
> requiring the new left/right regions to grab an additional page ref
> to guarantee that the pages stay pinned when the middle section is
> unmapped.
>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
>  drivers/gpu/drm/drm_gem_shmem_helper.c  | 28 +++++++++++++------------
>  drivers/gpu/drm/lima/lima_gem.c         |  2 +-
>  drivers/gpu/drm/panfrost/panfrost_mmu.c |  2 +-
>  include/drm/drm_gem_shmem_helper.h      |  2 +-
>  4 files changed, 18 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> index a783d2245599..ca6938ea1b82 100644
> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> @@ -155,7 +155,7 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
>  		if (shmem->pages)
>  			drm_gem_shmem_put_pages(shmem);
>
> -		drm_WARN_ON(obj->dev, shmem->pages_use_count);
> +		drm_WARN_ON(obj->dev, atomic_read(&shmem->pages_use_count));
>
>  		dma_resv_unlock(shmem->base.resv);
>  	}
> @@ -172,14 +172,14 @@ static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
>
>  	dma_resv_assert_held(shmem->base.resv);
>
> -	if (shmem->pages_use_count++ > 0)
> +	if (atomic_inc_return(&shmem->pages_use_count) > 1)
>  		return 0;
>
>  	pages = drm_gem_get_pages(obj);
>  	if (IS_ERR(pages)) {
>  		drm_dbg_kms(obj->dev, "Failed to get pages (%ld)\n",
>  			    PTR_ERR(pages));
> -		shmem->pages_use_count = 0;
> +		atomic_set(&shmem->pages_use_count, 0);
>  		return PTR_ERR(pages);
>  	}
>
> @@ -210,10 +210,10 @@ void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem)
>
>  	dma_resv_assert_held(shmem->base.resv);
>
> -	if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_use_count))
> +	if (drm_WARN_ON_ONCE(obj->dev, !atomic_read(&shmem->pages_use_count)))
>  		return;
>
> -	if (--shmem->pages_use_count > 0)
> +	if (atomic_dec_return(&shmem->pages_use_count) > 0)
>  		return;
>
>  #ifdef CONFIG_X86
> @@ -263,6 +263,10 @@ int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem)
>
>  	drm_WARN_ON(obj->dev, obj->import_attach);
>
> +	/* If we are the first owner, we need to grab the lock. */
> +	if (atomic_inc_not_zero(&shmem->pages_use_count))
> +		return 0;
> +

Unless I'm misunderstanding I think this introduces a race where two
threads call drm_gem_shmem_pin() at the same time:

Thread1                         | Thread 2
--------------------------------+------------------------------
drm_gem_shmem_pin()             |
 - pages_use_count == 0 so not  |
   incremented                  |
 - lock taken                   |
drm_gem_shmem_pin_locked()      |
drm_gem_shmem_get_pages()       |
 - pages_use_count incremented  |
<thread descheduled>            | drm_gem_shmem_pin()
                                | - pages_use_count == 1 so it is
                                |   incremented and returns early
                                |   without taking the lock
                                | Code tries to use shmem->pages
<thread rescheduled>            | and blows up
drm_gem_get_pages()             |
shmem->pages populated          |
lock released                   |

I think you need to modify drm_gem_shmem_get_pages() to only increment
pages_use_count when shmem->pages has been populated. That also gets rid
of the atomic_set() in that function which scares me.

Steve

>  	ret = dma_resv_lock_interruptible(shmem->base.resv, NULL);
>  	if (ret)
>  		return ret;
> @@ -286,6 +290,10 @@ void drm_gem_shmem_unpin(struct drm_gem_shmem_object *shmem)
>
>  	drm_WARN_ON(obj->dev, obj->import_attach);
>
> +	/* If we are the last owner, we need to grab the lock. */
> +	if (atomic_add_unless(&shmem->pages_use_count, -1, 1))
> +		return;
> +
>  	dma_resv_lock(shmem->base.resv, NULL);
>  	drm_gem_shmem_unpin_locked(shmem);
>  	dma_resv_unlock(shmem->base.resv);
> @@ -543,18 +551,12 @@ static void drm_gem_shmem_vm_open(struct vm_area_struct *vma)
>
>  	drm_WARN_ON(obj->dev, obj->import_attach);
>
> -	dma_resv_lock(shmem->base.resv, NULL);
> -
>  	/*
>  	 * We should have already pinned the pages when the buffer was first
>  	 * mmap'd, vm_open() just grabs an additional reference for the new
>  	 * mm the vma is getting copied into (ie. on fork()).
>  	 */
> -	if (!drm_WARN_ON_ONCE(obj->dev, !shmem->pages_use_count))
> -		shmem->pages_use_count++;
> -
> -	dma_resv_unlock(shmem->base.resv);
> -
> +	drm_WARN_ON_ONCE(obj->dev, atomic_inc_return(&shmem->pages_use_count) == 1);
>  	drm_gem_vm_open(vma);
>  }
>
> @@ -632,7 +634,7 @@ void drm_gem_shmem_print_info(const struct drm_gem_shmem_object *shmem,
>  	if (shmem->base.import_attach)
>  		return;
>
> -	drm_printf_indent(p, indent, "pages_use_count=%u\n", shmem->pages_use_count);
> +	drm_printf_indent(p, indent, "pages_use_count=%u\n", atomic_read(&shmem->pages_use_count));
>  	drm_printf_indent(p, indent, "vmap_use_count=%u\n", shmem->vmap_use_count);
>  	drm_printf_indent(p, indent, "vaddr=%p\n", shmem->vaddr);
>  }
> diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
> index 4f9736e5f929..0116518b1601 100644
> --- a/drivers/gpu/drm/lima/lima_gem.c
> +++ b/drivers/gpu/drm/lima/lima_gem.c
> @@ -47,7 +47,7 @@ int lima_heap_alloc(struct lima_bo *bo, struct lima_vm *vm)
>  		}
>
>  		bo->base.pages = pages;
> -		bo->base.pages_use_count = 1;
> +		atomic_set(&bo->base.pages_use_count, 1);
>
>  		mapping_set_unevictable(mapping);
>  	}
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index c0123d09f699..f66e63bf743e 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -487,7 +487,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
>  			goto err_unlock;
>  		}
>  		bo->base.pages = pages;
> -		bo->base.pages_use_count = 1;
> +		atomic_set(&bo->base.pages_use_count, 1);
>  	} else {
>  		pages = bo->base.pages;
>  		if (pages[page_offset]) {
> diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
> index bf0c31aa8fbe..0661f87d3bda 100644
> --- a/include/drm/drm_gem_shmem_helper.h
> +++ b/include/drm/drm_gem_shmem_helper.h
> @@ -37,7 +37,7 @@ struct drm_gem_shmem_object {
>  	 * Reference count on the pages table.
>  	 * The pages are put when the count reaches zero.
>  	 */
> -	unsigned int pages_use_count;
> +	atomic_t pages_use_count;
>
>  	/**
>  	 * @madv: State for madvise
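For readers following the race described in this review, here is a minimal userspace sketch of the suggested ordering: the count is only published after the pages pointer is valid, so the lockless fast path can never succeed against a half-initialized object. This is not the kernel code. `struct model_obj`, `model_inc_not_zero()`, `model_get_pages_locked()` and `model_pin()` are hypothetical stand-ins, a pthread mutex plays the role of the resv lock, and C11 atomics stand in for `atomic_t`.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct drm_gem_shmem_object. */
struct model_obj {
	pthread_mutex_t lock;       /* models the dma_resv lock */
	atomic_int pages_use_count; /* models the atomic_t counter */
	void *pages;                /* models shmem->pages */
};

/* Userspace equivalent of atomic_inc_not_zero(): increment unless zero. */
static bool model_inc_not_zero(atomic_int *v)
{
	int old = atomic_load(v);

	while (old != 0) {
		/* On failure, old is reloaded and the zero check is repeated. */
		if (atomic_compare_exchange_weak(v, &old, old + 1))
			return true;
	}
	return false;
}

/* Slow path. The caller must hold obj->lock. */
static int model_get_pages_locked(struct model_obj *obj)
{
	/* Another thread may have populated the pages while we waited. */
	if (model_inc_not_zero(&obj->pages_use_count))
		return 0;

	/* Count is zero, so no pages should be installed at this point. */
	assert(obj->pages == NULL);

	obj->pages = malloc(4096);
	if (!obj->pages)
		return -1; /* the count was never touched, nothing to undo */

	/* Publish the first reference only now that pages is valid. */
	atomic_store(&obj->pages_use_count, 1);
	return 0;
}

static int model_pin(struct model_obj *obj)
{
	int ret;

	/* Lockless fast path: can only succeed once pages is populated. */
	if (model_inc_not_zero(&obj->pages_use_count))
		return 0;

	pthread_mutex_lock(&obj->lock);
	ret = model_get_pages_locked(obj);
	pthread_mutex_unlock(&obj->lock);
	return ret;
}
```

With this ordering the error path has no counter to reset, which is why the `atomic_set(..., 0)` goes away in the sketch.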
On 8/11/23 16:08, Steven Price wrote:
> On 09/08/2023 17:53, Boris Brezillon wrote:
>> This way we can grab a pages ref without acquiring the resv lock when
>> pages_use_count > 0. Need to implement asynchronous map using the
>
> NIT: s/Need/This is needed/
>
>> drm_gpuva_mgr when the map/unmap operation triggers a mapping split,
>> requiring the new left/right regions to grab an additional page ref
>> to guarantee that the pages stay pinned when the middle section is
>> unmapped.
>>
>> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
>> ---
>>  drivers/gpu/drm/drm_gem_shmem_helper.c  | 28 +++++++++++++------------
>>  drivers/gpu/drm/lima/lima_gem.c         |  2 +-
>>  drivers/gpu/drm/panfrost/panfrost_mmu.c |  2 +-
>>  include/drm/drm_gem_shmem_helper.h      |  2 +-
>>  4 files changed, 18 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
>> index a783d2245599..ca6938ea1b82 100644
>> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
>> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
>> @@ -155,7 +155,7 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
>>  		if (shmem->pages)
>>  			drm_gem_shmem_put_pages(shmem);
>>
>> -		drm_WARN_ON(obj->dev, shmem->pages_use_count);
>> +		drm_WARN_ON(obj->dev, atomic_read(&shmem->pages_use_count));
>>
>>  		dma_resv_unlock(shmem->base.resv);
>>  	}
>> @@ -172,14 +172,14 @@ static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
>>
>>  	dma_resv_assert_held(shmem->base.resv);
>>
>> -	if (shmem->pages_use_count++ > 0)
>> +	if (atomic_inc_return(&shmem->pages_use_count) > 1)
>>  		return 0;
>>
>>  	pages = drm_gem_get_pages(obj);
>>  	if (IS_ERR(pages)) {
>>  		drm_dbg_kms(obj->dev, "Failed to get pages (%ld)\n",
>>  			    PTR_ERR(pages));
>> -		shmem->pages_use_count = 0;
>> +		atomic_set(&shmem->pages_use_count, 0);
>>  		return PTR_ERR(pages);
>>  	}
>>
>> @@ -210,10 +210,10 @@ void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem)
>>
>>  	dma_resv_assert_held(shmem->base.resv);
>>
>> -	if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_use_count))
>> +	if (drm_WARN_ON_ONCE(obj->dev, !atomic_read(&shmem->pages_use_count)))
>>  		return;
>>
>> -	if (--shmem->pages_use_count > 0)
>> +	if (atomic_dec_return(&shmem->pages_use_count) > 0)
>>  		return;
>>
>>  #ifdef CONFIG_X86
>> @@ -263,6 +263,10 @@ int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem)
>>
>>  	drm_WARN_ON(obj->dev, obj->import_attach);
>>
>> +	/* If we are the first owner, we need to grab the lock. */
>> +	if (atomic_inc_not_zero(&shmem->pages_use_count))
>> +		return 0;
>> +
>
> Unless I'm misunderstanding I think this introduces a race where two
> threads call drm_gem_shmem_pin() at the same time:
>
> Thread1                         | Thread 2
> --------------------------------+------------------------------
> drm_gem_shmem_pin()             |
>  - pages_use_count == 0 so not  |
>    incremented                  |
>  - lock taken                   |
> drm_gem_shmem_pin_locked()      |
> drm_gem_shmem_get_pages()       |
>  - pages_use_count incremented  |
> <thread descheduled>            | drm_gem_shmem_pin()
>                                 | - pages_use_count == 1 so it is
>                                 |   incremented and returns early
>                                 |   without taking the lock
>                                 | Code tries to use shmem->pages
> <thread rescheduled>            | and blows up
> drm_gem_get_pages()             |
> shmem->pages populated          |
> lock released                   |
>
> I think you need to modify drm_gem_shmem_get_pages() to only increment
> pages_use_count when shmem->pages has been populated. That also gets rid
> of the atomic_set() in that function which scares me.

This is correct, both pin() and get_pages() should use
atomic_inc_not_zero().

Note that we shouldn't use atomic functions open-coded, there is a kref
helper for that which uses refcount_t underneath and has additional
checks/warnings for count underflow/overflow. I'm going to post patches
converting drm-shmem to kref around next week, Boris is aware of it
and we should then sync shrinker/panthor patchsets to the common
drm-shmem base.
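To illustrate what the extra underflow/overflow checks mentioned above buy over an open-coded atomic counter, here is a rough userspace model of a saturating reference count. It is only a sketch of the idea, not the kernel's `refcount_t` or `kref` implementation: `struct model_refcount`, `MODEL_REFCOUNT_SATURATED` and both helpers are hypothetical names, and the warnings are plain `fprintf()` calls.

```c
#include <assert.h>
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical sentinel: once reached, the counter is never trusted again. */
#define MODEL_REFCOUNT_SATURATED (INT_MIN / 2)

static_assert(MODEL_REFCOUNT_SATURATED < 0, "sentinel must be negative");

struct model_refcount {
	atomic_int refs;
};

/* Take a reference unless the object is already dead (count == 0). */
static bool model_refcount_inc_not_zero(struct model_refcount *r)
{
	int old = atomic_load(&r->refs);

	for (;;) {
		if (old == 0)
			return false;
		if (old < 0 || old == INT_MAX) {
			/* Overflow or already saturated: pin the value and warn. */
			atomic_store(&r->refs, MODEL_REFCOUNT_SATURATED);
			fprintf(stderr, "model_refcount: saturated\n");
			return true;
		}
		if (atomic_compare_exchange_weak(&r->refs, &old, old + 1))
			return true;
	}
}

/* Drop a reference. Returns true only for the caller that hits zero. */
static bool model_refcount_dec_and_test(struct model_refcount *r)
{
	int old = atomic_load(&r->refs);

	for (;;) {
		if (old <= 0) {
			/* Underflow: warn and saturate instead of wrapping. */
			atomic_store(&r->refs, MODEL_REFCOUNT_SATURATED);
			fprintf(stderr, "model_refcount: underflow\n");
			return false;
		}
		/* On success old still holds the value we decremented from. */
		if (atomic_compare_exchange_weak(&r->refs, &old, old - 1))
			return old == 1;
	}
}
```

The point of the model is that an unbalanced put leaves the counter stuck at a recognizable sentinel and reports it, where a bare `atomic_dec_return()` would silently go negative.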
On Sat, 19 Aug 2023 05:13:06 +0300
Dmitry Osipenko <dmitry.osipenko@collabora.com> wrote:

> On 8/11/23 16:08, Steven Price wrote:
> > On 09/08/2023 17:53, Boris Brezillon wrote:
> >> This way we can grab a pages ref without acquiring the resv lock when
> >> pages_use_count > 0. Need to implement asynchronous map using the
> >
> > NIT: s/Need/This is needed/
> >
> >> drm_gpuva_mgr when the map/unmap operation triggers a mapping split,
> >> requiring the new left/right regions to grab an additional page ref
> >> to guarantee that the pages stay pinned when the middle section is
> >> unmapped.
> >>
> >> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> >> ---
> >>  drivers/gpu/drm/drm_gem_shmem_helper.c  | 28 +++++++++++++------------
> >>  drivers/gpu/drm/lima/lima_gem.c         |  2 +-
> >>  drivers/gpu/drm/panfrost/panfrost_mmu.c |  2 +-
> >>  include/drm/drm_gem_shmem_helper.h      |  2 +-
> >>  4 files changed, 18 insertions(+), 16 deletions(-)
> >>
> >> diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
> >> index a783d2245599..ca6938ea1b82 100644
> >> --- a/drivers/gpu/drm/drm_gem_shmem_helper.c
> >> +++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
> >> @@ -155,7 +155,7 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
> >>  		if (shmem->pages)
> >>  			drm_gem_shmem_put_pages(shmem);
> >>
> >> -		drm_WARN_ON(obj->dev, shmem->pages_use_count);
> >> +		drm_WARN_ON(obj->dev, atomic_read(&shmem->pages_use_count));
> >>
> >>  		dma_resv_unlock(shmem->base.resv);
> >>  	}
> >> @@ -172,14 +172,14 @@ static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)
> >>
> >>  	dma_resv_assert_held(shmem->base.resv);
> >>
> >> -	if (shmem->pages_use_count++ > 0)
> >> +	if (atomic_inc_return(&shmem->pages_use_count) > 1)
> >>  		return 0;
> >>
> >>  	pages = drm_gem_get_pages(obj);
> >>  	if (IS_ERR(pages)) {
> >>  		drm_dbg_kms(obj->dev, "Failed to get pages (%ld)\n",
> >>  			    PTR_ERR(pages));
> >> -		shmem->pages_use_count = 0;
> >> +		atomic_set(&shmem->pages_use_count, 0);
> >>  		return PTR_ERR(pages);
> >>  	}
> >>
> >> @@ -210,10 +210,10 @@ void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem)
> >>
> >>  	dma_resv_assert_held(shmem->base.resv);
> >>
> >> -	if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_use_count))
> >> +	if (drm_WARN_ON_ONCE(obj->dev, !atomic_read(&shmem->pages_use_count)))
> >>  		return;
> >>
> >> -	if (--shmem->pages_use_count > 0)
> >> +	if (atomic_dec_return(&shmem->pages_use_count) > 0)
> >>  		return;
> >>
> >>  #ifdef CONFIG_X86
> >> @@ -263,6 +263,10 @@ int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem)
> >>
> >>  	drm_WARN_ON(obj->dev, obj->import_attach);
> >>
> >> +	/* If we are the first owner, we need to grab the lock. */
> >> +	if (atomic_inc_not_zero(&shmem->pages_use_count))
> >> +		return 0;
> >> +
> >
> > Unless I'm misunderstanding I think this introduces a race where two
> > threads call drm_gem_shmem_pin() at the same time:
> >
> > Thread1                         | Thread 2
> > --------------------------------+------------------------------
> > drm_gem_shmem_pin()             |
> >  - pages_use_count == 0 so not  |
> >    incremented                  |
> >  - lock taken                   |
> > drm_gem_shmem_pin_locked()      |
> > drm_gem_shmem_get_pages()       |
> >  - pages_use_count incremented  |
> > <thread descheduled>            | drm_gem_shmem_pin()
> >                                 | - pages_use_count == 1 so it is
> >                                 |   incremented and returns early
> >                                 |   without taking the lock
> >                                 | Code tries to use shmem->pages
> > <thread rescheduled>            | and blows up
> > drm_gem_get_pages()             |
> > shmem->pages populated          |
> > lock released                   |
> >
> > I think you need to modify drm_gem_shmem_get_pages() to only increment
> > pages_use_count when shmem->pages has been populated.

Oops, didn't spot that race. Thanks for pointing it out.

>
> This is correct, both pin() and get_pages() should use
> atomic_inc_not_zero().
>
> Note that we shouldn't use atomic functions open-coded, there is a kref
> helper for that which uses refcount_t underneath and has additional
> checks/warnings for count underflow/overflow. I'm going to post patches
> converting drm-shmem to kref around next week, Boris is aware of it
> and we should then sync shrinker/panthor patchsets to the common
> drm-shmem base.

Thanks, I'll have a look at these patches pretty soon.
diff --git a/drivers/gpu/drm/drm_gem_shmem_helper.c b/drivers/gpu/drm/drm_gem_shmem_helper.c
index a783d2245599..ca6938ea1b82 100644
--- a/drivers/gpu/drm/drm_gem_shmem_helper.c
+++ b/drivers/gpu/drm/drm_gem_shmem_helper.c
@@ -155,7 +155,7 @@ void drm_gem_shmem_free(struct drm_gem_shmem_object *shmem)
 		if (shmem->pages)
 			drm_gem_shmem_put_pages(shmem);

-		drm_WARN_ON(obj->dev, shmem->pages_use_count);
+		drm_WARN_ON(obj->dev, atomic_read(&shmem->pages_use_count));

 		dma_resv_unlock(shmem->base.resv);
 	}
@@ -172,14 +172,14 @@ static int drm_gem_shmem_get_pages(struct drm_gem_shmem_object *shmem)

 	dma_resv_assert_held(shmem->base.resv);

-	if (shmem->pages_use_count++ > 0)
+	if (atomic_inc_return(&shmem->pages_use_count) > 1)
 		return 0;

 	pages = drm_gem_get_pages(obj);
 	if (IS_ERR(pages)) {
 		drm_dbg_kms(obj->dev, "Failed to get pages (%ld)\n",
 			    PTR_ERR(pages));
-		shmem->pages_use_count = 0;
+		atomic_set(&shmem->pages_use_count, 0);
 		return PTR_ERR(pages);
 	}

@@ -210,10 +210,10 @@ void drm_gem_shmem_put_pages(struct drm_gem_shmem_object *shmem)

 	dma_resv_assert_held(shmem->base.resv);

-	if (drm_WARN_ON_ONCE(obj->dev, !shmem->pages_use_count))
+	if (drm_WARN_ON_ONCE(obj->dev, !atomic_read(&shmem->pages_use_count)))
 		return;

-	if (--shmem->pages_use_count > 0)
+	if (atomic_dec_return(&shmem->pages_use_count) > 0)
 		return;

 #ifdef CONFIG_X86
@@ -263,6 +263,10 @@ int drm_gem_shmem_pin(struct drm_gem_shmem_object *shmem)

 	drm_WARN_ON(obj->dev, obj->import_attach);

+	/* If we are the first owner, we need to grab the lock. */
+	if (atomic_inc_not_zero(&shmem->pages_use_count))
+		return 0;
+
 	ret = dma_resv_lock_interruptible(shmem->base.resv, NULL);
 	if (ret)
 		return ret;
@@ -286,6 +290,10 @@ void drm_gem_shmem_unpin(struct drm_gem_shmem_object *shmem)

 	drm_WARN_ON(obj->dev, obj->import_attach);

+	/* If we are the last owner, we need to grab the lock. */
+	if (atomic_add_unless(&shmem->pages_use_count, -1, 1))
+		return;
+
 	dma_resv_lock(shmem->base.resv, NULL);
 	drm_gem_shmem_unpin_locked(shmem);
 	dma_resv_unlock(shmem->base.resv);
@@ -543,18 +551,12 @@ static void drm_gem_shmem_vm_open(struct vm_area_struct *vma)

 	drm_WARN_ON(obj->dev, obj->import_attach);

-	dma_resv_lock(shmem->base.resv, NULL);
-
 	/*
 	 * We should have already pinned the pages when the buffer was first
 	 * mmap'd, vm_open() just grabs an additional reference for the new
 	 * mm the vma is getting copied into (ie. on fork()).
 	 */
-	if (!drm_WARN_ON_ONCE(obj->dev, !shmem->pages_use_count))
-		shmem->pages_use_count++;
-
-	dma_resv_unlock(shmem->base.resv);
-
+	drm_WARN_ON_ONCE(obj->dev, atomic_inc_return(&shmem->pages_use_count) == 1);
 	drm_gem_vm_open(vma);
 }

@@ -632,7 +634,7 @@ void drm_gem_shmem_print_info(const struct drm_gem_shmem_object *shmem,
 	if (shmem->base.import_attach)
 		return;

-	drm_printf_indent(p, indent, "pages_use_count=%u\n", shmem->pages_use_count);
+	drm_printf_indent(p, indent, "pages_use_count=%u\n", atomic_read(&shmem->pages_use_count));
 	drm_printf_indent(p, indent, "vmap_use_count=%u\n", shmem->vmap_use_count);
 	drm_printf_indent(p, indent, "vaddr=%p\n", shmem->vaddr);
 }
diff --git a/drivers/gpu/drm/lima/lima_gem.c b/drivers/gpu/drm/lima/lima_gem.c
index 4f9736e5f929..0116518b1601 100644
--- a/drivers/gpu/drm/lima/lima_gem.c
+++ b/drivers/gpu/drm/lima/lima_gem.c
@@ -47,7 +47,7 @@ int lima_heap_alloc(struct lima_bo *bo, struct lima_vm *vm)
 		}

 		bo->base.pages = pages;
-		bo->base.pages_use_count = 1;
+		atomic_set(&bo->base.pages_use_count, 1);

 		mapping_set_unevictable(mapping);
 	}
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index c0123d09f699..f66e63bf743e 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -487,7 +487,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
 			goto err_unlock;
 		}
 		bo->base.pages = pages;
-		bo->base.pages_use_count = 1;
+		atomic_set(&bo->base.pages_use_count, 1);
 	} else {
 		pages = bo->base.pages;
 		if (pages[page_offset]) {
diff --git a/include/drm/drm_gem_shmem_helper.h b/include/drm/drm_gem_shmem_helper.h
index bf0c31aa8fbe..0661f87d3bda 100644
--- a/include/drm/drm_gem_shmem_helper.h
+++ b/include/drm/drm_gem_shmem_helper.h
@@ -37,7 +37,7 @@ struct drm_gem_shmem_object {
 	 * Reference count on the pages table.
 	 * The pages are put when the count reaches zero.
 	 */
-	unsigned int pages_use_count;
+	atomic_t pages_use_count;

 	/**
 	 * @madv: State for madvise
This way we can grab a pages ref without acquiring the resv lock when
pages_use_count > 0. Need to implement asynchronous map using the
drm_gpuva_mgr when the map/unmap operation triggers a mapping split,
requiring the new left/right regions to grab an additional page ref
to guarantee that the pages stay pinned when the middle section is
unmapped.

Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/drm_gem_shmem_helper.c  | 28 +++++++++++++------------
 drivers/gpu/drm/lima/lima_gem.c         |  2 +-
 drivers/gpu/drm/panfrost/panfrost_mmu.c |  2 +-
 include/drm/drm_gem_shmem_helper.h      |  2 +-
 4 files changed, 18 insertions(+), 16 deletions(-)
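The release side of the patch relies on `atomic_add_unless(&count, -1, 1)`: drop a reference without the lock unless it is the last one. A small userspace sketch of that pattern may help when reading the unpin hunk. Everything here is a hypothetical model, not the driver code: `struct model_obj`, `model_add_unless()` and `model_unpin()` are illustrative names, a pthread mutex stands in for the resv lock, and `free()` stands in for releasing the pages.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for struct drm_gem_shmem_object. */
struct model_obj {
	pthread_mutex_t lock;       /* models the dma_resv lock */
	atomic_int pages_use_count; /* models the atomic_t counter */
	void *pages;                /* models shmem->pages */
};

/* Userspace equivalent of atomic_add_unless(v, a, u): add a unless *v == u. */
static bool model_add_unless(atomic_int *v, int a, int u)
{
	int old = atomic_load(v);

	while (old != u) {
		/* On failure, old is reloaded and compared against u again. */
		if (atomic_compare_exchange_weak(v, &old, old + a))
			return true;
	}
	return false;
}

static void model_unpin(struct model_obj *obj)
{
	/* Fast path: not the last owner, so no teardown and no lock needed. */
	if (model_add_unless(&obj->pages_use_count, -1, 1))
		return;

	/* Slow path: we may be the last owner, teardown must be serialized. */
	pthread_mutex_lock(&obj->lock);
	if (atomic_fetch_sub(&obj->pages_use_count, 1) == 1) {
		/* Count went from 1 to 0 under the lock: release the pages. */
		assert(obj->pages != NULL);
		free(obj->pages);
		obj->pages = NULL;
	}
	pthread_mutex_unlock(&obj->lock);
}
```

The count is re-checked under the lock because another thread can take a lockless reference between the failed fast path and the lock acquisition. Note that this model has no underflow check, so calling `model_unpin()` with a zero count would drive the counter negative.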