Message ID | 20171122203928.28135-1-boris.brezillon@free-electrons.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Boris Brezillon <boris.brezillon@free-electrons.com> writes: > With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's > passed a refcount object that has its counter set to 0. In this driver, > this is a valid use case since we want to increment ->usecnt only when > the BO object starts to be used by real HW components and this is > definitely not the case when the BO is created. > > Fix the problem by using refcount_inc_not_zero() instead of > refcount_inc() and fallback to refcount_set(1) when > refcount_inc_not_zero() returns false. Note that this 2-steps operation > is not racy here because the whole section is protected by a mutex > which guarantees that the counter does not change between the > refcount_inc_not_zero() and refcount_set() calls. If we're not following the model, and protecting the refcount by a mutex, shouldn't we just be using addition and subtraction instead of refcount's atomics?
On Wed, 22 Nov 2017 13:16:08 -0800 Eric Anholt <eric@anholt.net> wrote: > Boris Brezillon <boris.brezillon@free-electrons.com> writes: > > > With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's > > passed a refcount object that has its counter set to 0. In this driver, > > this is a valid use case since we want to increment ->usecnt only when > > the BO object starts to be used by real HW components and this is > > definitely not the case when the BO is created. > > > > Fix the problem by using refcount_inc_not_zero() instead of > > refcount_inc() and fallback to refcount_set(1) when > > refcount_inc_not_zero() returns false. Note that this 2-steps operation > > is not racy here because the whole section is protected by a mutex > > which guarantees that the counter does not change between the > > refcount_inc_not_zero() and refcount_set() calls. > > If we're not following the model, and protecting the refcount by a > mutex, shouldn't we just be using addition and subtraction instead of > refcount's atomics? Actually, this mutex is protecting the bo->madv value which has to be checked when the counter reaches 0 (when decrementing) or 1 (when incrementing). We just benefit from this protection here, but ideally it would be better to have an refcount_inc_allow_zero() as suggested by Daniel. Anyway, I can also use a regular counter and protect it with the madv mutex, it's just that I thought using the existing refcount infrastructure would be safer than open-coding it (actually, I found several unbalanced refcounting issues thanks to this API while developing the feature).
Boris Brezillon <boris.brezillon@free-electrons.com> writes: > On Wed, 22 Nov 2017 13:16:08 -0800 > Eric Anholt <eric@anholt.net> wrote: > >> Boris Brezillon <boris.brezillon@free-electrons.com> writes: >> >> > With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's >> > passed a refcount object that has its counter set to 0. In this driver, >> > this is a valid use case since we want to increment ->usecnt only when >> > the BO object starts to be used by real HW components and this is >> > definitely not the case when the BO is created. >> > >> > Fix the problem by using refcount_inc_not_zero() instead of >> > refcount_inc() and fallback to refcount_set(1) when >> > refcount_inc_not_zero() returns false. Note that this 2-steps operation >> > is not racy here because the whole section is protected by a mutex >> > which guarantees that the counter does not change between the >> > refcount_inc_not_zero() and refcount_set() calls. >> >> If we're not following the model, and protecting the refcount by a >> mutex, shouldn't we just be using addition and subtraction instead of >> refcount's atomics? > > Actually, this mutex is protecting the bo->madv value which has to be > checked when the counter reaches 0 (when decrementing) or 1 (when > incrementing). We just benefit from this protection here, but ideally > it would be better to have an refcount_inc_allow_zero() as suggested by > Daniel. Let me restate this to see if it makes sense: The refcount is always >= 0, this is is the only path that increases the refcount from 0 to 1, and it's (incidentally) protected by a mutex, so there's no race between the attempted increase from nonzero and the set from nonzero to 1. This seems fine to me as a bugfix.
On Wed, 22 Nov 2017 16:13:31 -0800 Eric Anholt <eric@anholt.net> wrote: > Boris Brezillon <boris.brezillon@free-electrons.com> writes: > > > On Wed, 22 Nov 2017 13:16:08 -0800 > > Eric Anholt <eric@anholt.net> wrote: > > > >> Boris Brezillon <boris.brezillon@free-electrons.com> writes: > >> > >> > With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's > >> > passed a refcount object that has its counter set to 0. In this driver, > >> > this is a valid use case since we want to increment ->usecnt only when > >> > the BO object starts to be used by real HW components and this is > >> > definitely not the case when the BO is created. > >> > > >> > Fix the problem by using refcount_inc_not_zero() instead of > >> > refcount_inc() and fallback to refcount_set(1) when > >> > refcount_inc_not_zero() returns false. Note that this 2-steps operation > >> > is not racy here because the whole section is protected by a mutex > >> > which guarantees that the counter does not change between the > >> > refcount_inc_not_zero() and refcount_set() calls. > >> > >> If we're not following the model, and protecting the refcount by a > >> mutex, shouldn't we just be using addition and subtraction instead of > >> refcount's atomics? > > > > Actually, this mutex is protecting the bo->madv value which has to be > > checked when the counter reaches 0 (when decrementing) or 1 (when > > incrementing). We just benefit from this protection here, but ideally > > it would be better to have an refcount_inc_allow_zero() as suggested by > > Daniel. > > Let me restate this to see if it makes sense: The refcount is always >= > 0, this is is the only path that increases the refcount from 0 to 1, and > it's (incidentally) protected by a mutex, so there's no race between the > attempted increase from nonzero and the set from nonzero to 1. Yep. > > This seems fine to me as a bugfix. The discussion is going on in the other thread, let's wait a bit before taking a decision.
On Thu, Nov 23, 2017 at 09:24:11AM +0100, Boris Brezillon wrote: > On Wed, 22 Nov 2017 16:13:31 -0800 > Eric Anholt <eric@anholt.net> wrote: > > > Boris Brezillon <boris.brezillon@free-electrons.com> writes: > > > > > On Wed, 22 Nov 2017 13:16:08 -0800 > > > Eric Anholt <eric@anholt.net> wrote: > > > > > >> Boris Brezillon <boris.brezillon@free-electrons.com> writes: > > >> > > >> > With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's > > >> > passed a refcount object that has its counter set to 0. In this driver, > > >> > this is a valid use case since we want to increment ->usecnt only when > > >> > the BO object starts to be used by real HW components and this is > > >> > definitely not the case when the BO is created. > > >> > > > >> > Fix the problem by using refcount_inc_not_zero() instead of > > >> > refcount_inc() and fallback to refcount_set(1) when > > >> > refcount_inc_not_zero() returns false. Note that this 2-steps operation > > >> > is not racy here because the whole section is protected by a mutex > > >> > which guarantees that the counter does not change between the > > >> > refcount_inc_not_zero() and refcount_set() calls. > > >> > > >> If we're not following the model, and protecting the refcount by a > > >> mutex, shouldn't we just be using addition and subtraction instead of > > >> refcount's atomics? > > > > > > Actually, this mutex is protecting the bo->madv value which has to be > > > checked when the counter reaches 0 (when decrementing) or 1 (when > > > incrementing). We just benefit from this protection here, but ideally > > > it would be better to have an refcount_inc_allow_zero() as suggested by > > > Daniel. > > > > Let me restate this to see if it makes sense: The refcount is always >= > > 0, this is is the only path that increases the refcount from 0 to 1, and > > it's (incidentally) protected by a mutex, so there's no race between the > > attempted increase from nonzero and the set from nonzero to 1. > > Yep. > > > > > This seems fine to me as a bugfix. > > The discussion is going on in the other thread, let's wait a bit > before taking a decision. Let's not block the bugfix on reaching perfection imo. I'd merge this as the minimal fix, and then apply the pretty paint once we have a clear idea for that. -Daniel
Hi Daniel, Am 24.11.2017 um 15:52 schrieb Daniel Vetter: > On Thu, Nov 23, 2017 at 09:24:11AM +0100, Boris Brezillon wrote: >> On Wed, 22 Nov 2017 16:13:31 -0800 >> Eric Anholt <eric@anholt.net> wrote: >> >>> Boris Brezillon <boris.brezillon@free-electrons.com> writes: >>> >>>> On Wed, 22 Nov 2017 13:16:08 -0800 >>>> Eric Anholt <eric@anholt.net> wrote: >>>> >>>>> Boris Brezillon <boris.brezillon@free-electrons.com> writes: >>>>> >>>>>> With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's >>>>>> passed a refcount object that has its counter set to 0. In this driver, >>>>>> this is a valid use case since we want to increment ->usecnt only when >>>>>> the BO object starts to be used by real HW components and this is >>>>>> definitely not the case when the BO is created. >>>>>> >>>>>> Fix the problem by using refcount_inc_not_zero() instead of >>>>>> refcount_inc() and fallback to refcount_set(1) when >>>>>> refcount_inc_not_zero() returns false. Note that this 2-steps operation >>>>>> is not racy here because the whole section is protected by a mutex >>>>>> which guarantees that the counter does not change between the >>>>>> refcount_inc_not_zero() and refcount_set() calls. >>>>> If we're not following the model, and protecting the refcount by a >>>>> mutex, shouldn't we just be using addition and subtraction instead of >>>>> refcount's atomics? >>>> Actually, this mutex is protecting the bo->madv value which has to be >>>> checked when the counter reaches 0 (when decrementing) or 1 (when >>>> incrementing). We just benefit from this protection here, but ideally >>>> it would be better to have an refcount_inc_allow_zero() as suggested by >>>> Daniel. >>> Let me restate this to see if it makes sense: The refcount is always >= >>> 0, this is is the only path that increases the refcount from 0 to 1, and >>> it's (incidentally) protected by a mutex, so there's no race between the >>> attempted increase from nonzero and the set from nonzero to 1. >> Yep. >> >>> This seems fine to me as a bugfix. >> The discussion is going on in the other thread, let's wait a bit >> before taking a decision. > Let's not block the bugfix on reaching perfection imo. I'd merge this as > the minimal fix, and then apply the pretty paint once we have a clear idea > for that. > -Daniel sorry but i can't find it in the DRI tree. Regards Stefan
On Thu, 7 Dec 2017 13:31:34 +0100 Stefan Wahren <stefan.wahren@i2se.com> wrote: > Hi Daniel, > > > Am 24.11.2017 um 15:52 schrieb Daniel Vetter: > > On Thu, Nov 23, 2017 at 09:24:11AM +0100, Boris Brezillon wrote: > >> On Wed, 22 Nov 2017 16:13:31 -0800 > >> Eric Anholt <eric@anholt.net> wrote: > >> > >>> Boris Brezillon <boris.brezillon@free-electrons.com> writes: > >>> > >>>> On Wed, 22 Nov 2017 13:16:08 -0800 > >>>> Eric Anholt <eric@anholt.net> wrote: > >>>> > >>>>> Boris Brezillon <boris.brezillon@free-electrons.com> writes: > >>>>> > >>>>>> With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's > >>>>>> passed a refcount object that has its counter set to 0. In this driver, > >>>>>> this is a valid use case since we want to increment ->usecnt only when > >>>>>> the BO object starts to be used by real HW components and this is > >>>>>> definitely not the case when the BO is created. > >>>>>> > >>>>>> Fix the problem by using refcount_inc_not_zero() instead of > >>>>>> refcount_inc() and fallback to refcount_set(1) when > >>>>>> refcount_inc_not_zero() returns false. Note that this 2-steps operation > >>>>>> is not racy here because the whole section is protected by a mutex > >>>>>> which guarantees that the counter does not change between the > >>>>>> refcount_inc_not_zero() and refcount_set() calls. > >>>>> If we're not following the model, and protecting the refcount by a > >>>>> mutex, shouldn't we just be using addition and subtraction instead of > >>>>> refcount's atomics? > >>>> Actually, this mutex is protecting the bo->madv value which has to be > >>>> checked when the counter reaches 0 (when decrementing) or 1 (when > >>>> incrementing). We just benefit from this protection here, but ideally > >>>> it would be better to have an refcount_inc_allow_zero() as suggested by > >>>> Daniel. > >>> Let me restate this to see if it makes sense: The refcount is always >= > >>> 0, this is is the only path that increases the refcount from 0 to 1, and > >>> it's (incidentally) protected by a mutex, so there's no race between the > >>> attempted increase from nonzero and the set from nonzero to 1. > >> Yep. > >> > >>> This seems fine to me as a bugfix. > >> The discussion is going on in the other thread, let's wait a bit > >> before taking a decision. > > Let's not block the bugfix on reaching perfection imo. I'd merge this as > > the minimal fix, and then apply the pretty paint once we have a clear idea > > for that. > > -Daniel > > sorry but i can't find it in the DRI tree. Sorry for the delay. I applied it this morning [1] and it should appear in the next -rc. Regards, Boris [1]https://cgit.freedesktop.org/drm/drm-misc/log/?h=drm-misc-fixes
diff --git a/drivers/gpu/drm/vc4/vc4_bo.c b/drivers/gpu/drm/vc4/vc4_bo.c index 4ae45d7dac42..2decc8e2c79f 100644 --- a/drivers/gpu/drm/vc4/vc4_bo.c +++ b/drivers/gpu/drm/vc4/vc4_bo.c @@ -637,7 +637,8 @@ int vc4_bo_inc_usecnt(struct vc4_bo *bo) mutex_lock(&bo->madv_lock); switch (bo->madv) { case VC4_MADV_WILLNEED: - refcount_inc(&bo->usecnt); + if (!refcount_inc_not_zero(&bo->usecnt)) + refcount_set(&bo->usecnt, 1); ret = 0; break; case VC4_MADV_DONTNEED:
With CONFIG_REFCOUNT_FULL enabled, refcount_inc() complains when it's passed a refcount object that has its counter set to 0. In this driver, this is a valid use case since we want to increment ->usecnt only when the BO object starts to be used by real HW components and this is definitely not the case when the BO is created. Fix the problem by using refcount_inc_not_zero() instead of refcount_inc() and fallback to refcount_set(1) when refcount_inc_not_zero() returns false. Note that this 2-steps operation is not racy here because the whole section is protected by a mutex which guarantees that the counter does not change between the refcount_inc_not_zero() and refcount_set() calls. Fixes: b9f19259b84d ("drm/vc4: Add the DRM_IOCTL_VC4_GEM_MADVISE ioctl") Reported-by: Stefan Wahren <stefan.wahren@i2se.com> Signed-off-by: Boris Brezillon <boris.brezillon@free-electrons.com> --- drivers/gpu/drm/vc4/vc4_bo.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-)