Message ID | 1454894009-15466-3-git-send-email-mario.kleiner.de@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Mon, Feb 08, 2016 at 02:13:25AM +0100, Mario Kleiner wrote: > This fixes a regression introduced by the new drm_update_vblank_count() > implementation in Linux 4.4: > > Restrict the bump of the software vblank counter in drm_update_vblank_count() > to a safe maximum value of +1 whenever there is the possibility that > concurrent readers of vblank timestamps could be active at the moment, > as the current implementation of the timestamp caching and updating is > not safe against concurrent readers for calls to store_vblank() with a > bump of anything but +1. A bump != 1 would very likely return corrupted > timestamps to userspace, because the same slot in the cache could > be concurrently written by store_vblank() and read by one of those > readers in a non-atomic fashion and without the read-retry logic > detecting this collision. > > Concurrent readers can exist while drm_update_vblank_count() is called > from the drm_vblank_off() or drm_vblank_on() functions or other non-vblank- > irq callers. However, all those calls are happening with the vbl_lock > locked thereby preventing a drm_vblank_get(), so the vblank refcount > can't increase while drm_update_vblank_count() is executing. Therefore > a zero vblank refcount during execution of that function signals that > is safe for arbitrary counter bumps if called from outside vblank irq, > whereas a non-zero count is not safe. > > Whenever the function is called from vblank irq, we have to assume concurrent > readers could show up any time during its execution, even if the refcount > is currently zero, as vblank irqs are usually only enabled due to the > presence of readers, and because when it is called from vblank irq it > can't hold the vbl_lock to protect it from sudden bumps in vblank refcount. > Therefore also restrict bumps to +1 when the function is called from vblank > irq. > > Such bumps of more than +1 can happen at other times than reenabling > vblank irqs, e.g., when regular vblank interrupts get delayed by more > than 1 frame due to long held locks, long irq off periods, realtime > preemption on RT kernels, or system management interrupts. > > Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> > Cc: <stable@vger.kernel.org> # 4.4+ > Cc: michel@daenzer.net > Cc: vbabka@suse.cz > Cc: ville.syrjala@linux.intel.com > Cc: daniel.vetter@ffwll.ch > Cc: dri-devel@lists.freedesktop.org > Cc: alexander.deucher@amd.com > Cc: christian.koenig@amd.com Imo this is duct-tape. If we want to fix this up properly I think we should just use a full-blown seqlock instead of our hand-rolled one. And that could handle any increment at all. -Daniel > --- > drivers/gpu/drm/drm_irq.c | 41 +++++++++++++++++++++++++++++++++++++++++ > 1 file changed, 41 insertions(+) > > diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c > index bcb8528..aa2c74b 100644 > --- a/drivers/gpu/drm/drm_irq.c > +++ b/drivers/gpu/drm/drm_irq.c > @@ -221,6 +221,47 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, > diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0; > } > > + /* > + * Restrict the bump of the software vblank counter to a safe maximum > + * value of +1 whenever there is the possibility that concurrent readers > + * of vblank timestamps could be active at the moment, as the current > + * implementation of the timestamp caching and updating is not safe > + * against concurrent readers for calls to store_vblank() with a bump > + * of anything but +1. A bump != 1 would very likely return corrupted > + * timestamps to userspace, because the same slot in the cache could > + * be concurrently written by store_vblank() and read by one of those > + * readers without the read-retry logic detecting the collision. > + * > + * Concurrent readers can exist when we are called from the > + * drm_vblank_off() or drm_vblank_on() functions and other non-vblank- > + * irq callers. However, all those calls to us are happening with the > + * vbl_lock locked to prevent drm_vblank_get(), so the vblank refcount > + * can't increase while we are executing. Therefore a zero refcount at > + * this point is safe for arbitrary counter bumps if we are called > + * outside vblank irq, a non-zero count is not 100% safe. Unfortunately > + * we must also accept a refcount of 1, as whenever we are called from > + * drm_vblank_get() -> drm_vblank_enable() the refcount will be 1 and > + * we must let that one pass through in order to not lose vblank counts > + * during vblank irq off - which would completely defeat the whole > + * point of this routine. > + * > + * Whenever we are called from vblank irq, we have to assume concurrent > + * readers exist or can show up any time during our execution, even if > + * the refcount is currently zero, as vblank irqs are usually only > + * enabled due to the presence of readers, and because when we are called > + * from vblank irq we can't hold the vbl_lock to protect us from sudden > + * bumps in vblank refcount. Therefore also restrict bumps to +1 when > + * called from vblank irq. > + */ > + if ((diff > 1) && (atomic_read(&vblank->refcount) > 1 || > + (flags & DRM_CALLED_FROM_VBLIRQ))) { > + DRM_DEBUG_VBL("clamping vblank bump to 1 on crtc %u: diffr=%u " > + "refcount %u, vblirq %u\n", pipe, diff, > + atomic_read(&vblank->refcount), > + (flags & DRM_CALLED_FROM_VBLIRQ) != 0); > + diff = 1; > + } > + > DRM_DEBUG_VBL("updating vblank count on crtc %u:" > " current=%u, diff=%u, hw=%u hw_last=%u\n", > pipe, vblank->count, diff, cur_vblank, vblank->last); > -- > 1.9.1 >
On Tue, Feb 09, 2016 at 10:56:38AM +0100, Daniel Vetter wrote: > On Mon, Feb 08, 2016 at 02:13:25AM +0100, Mario Kleiner wrote: > > This fixes a regression introduced by the new drm_update_vblank_count() > > implementation in Linux 4.4: > > > > Restrict the bump of the software vblank counter in drm_update_vblank_count() > > to a safe maximum value of +1 whenever there is the possibility that > > concurrent readers of vblank timestamps could be active at the moment, > > as the current implementation of the timestamp caching and updating is > > not safe against concurrent readers for calls to store_vblank() with a > > bump of anything but +1. A bump != 1 would very likely return corrupted > > timestamps to userspace, because the same slot in the cache could > > be concurrently written by store_vblank() and read by one of those > > readers in a non-atomic fashion and without the read-retry logic > > detecting this collision. > > > > Concurrent readers can exist while drm_update_vblank_count() is called > > from the drm_vblank_off() or drm_vblank_on() functions or other non-vblank- > > irq callers. However, all those calls are happening with the vbl_lock > > locked thereby preventing a drm_vblank_get(), so the vblank refcount > > can't increase while drm_update_vblank_count() is executing. Therefore > > a zero vblank refcount during execution of that function signals that > > is safe for arbitrary counter bumps if called from outside vblank irq, > > whereas a non-zero count is not safe. > > > > Whenever the function is called from vblank irq, we have to assume concurrent > > readers could show up any time during its execution, even if the refcount > > is currently zero, as vblank irqs are usually only enabled due to the > > presence of readers, and because when it is called from vblank irq it > > can't hold the vbl_lock to protect it from sudden bumps in vblank refcount. > > Therefore also restrict bumps to +1 when the function is called from vblank > > irq. > > > > Such bumps of more than +1 can happen at other times than reenabling > > vblank irqs, e.g., when regular vblank interrupts get delayed by more > > than 1 frame due to long held locks, long irq off periods, realtime > > preemption on RT kernels, or system management interrupts. > > > > Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> > > Cc: <stable@vger.kernel.org> # 4.4+ > > Cc: michel@daenzer.net > > Cc: vbabka@suse.cz > > Cc: ville.syrjala@linux.intel.com > > Cc: daniel.vetter@ffwll.ch > > Cc: dri-devel@lists.freedesktop.org > > Cc: alexander.deucher@amd.com > > Cc: christian.koenig@amd.com > > Imo this is duct-tape. If we want to fix this up properly I think we > should just use a full-blown seqlock instead of our hand-rolled one. And > that could handle any increment at all. And I even fixed this [1] almost a half a year ago when I sent the original series, but that part got held hostage to the same seqlock argument. Perfect is the enemy of good. [1] https://lists.freedesktop.org/archives/intel-gfx/2015-September/075879.html > -Daniel > > > --- > > drivers/gpu/drm/drm_irq.c | 41 +++++++++++++++++++++++++++++++++++++++++ > > 1 file changed, 41 insertions(+) > > > > diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c > > index bcb8528..aa2c74b 100644 > > --- a/drivers/gpu/drm/drm_irq.c > > +++ b/drivers/gpu/drm/drm_irq.c > > @@ -221,6 +221,47 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, > > diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0; > > } > > > > + /* > > + * Restrict the bump of the software vblank counter to a safe maximum > > + * value of +1 whenever there is the possibility that concurrent readers > > + * of vblank timestamps could be active at the moment, as the current > > + * implementation of the timestamp caching and updating is not safe > > + * against concurrent readers for calls to store_vblank() with a bump > > + * of anything but +1. A bump != 1 would very likely return corrupted > > + * timestamps to userspace, because the same slot in the cache could > > + * be concurrently written by store_vblank() and read by one of those > > + * readers without the read-retry logic detecting the collision. > > + * > > + * Concurrent readers can exist when we are called from the > > + * drm_vblank_off() or drm_vblank_on() functions and other non-vblank- > > + * irq callers. However, all those calls to us are happening with the > > + * vbl_lock locked to prevent drm_vblank_get(), so the vblank refcount > > + * can't increase while we are executing. Therefore a zero refcount at > > + * this point is safe for arbitrary counter bumps if we are called > > + * outside vblank irq, a non-zero count is not 100% safe. Unfortunately > > + * we must also accept a refcount of 1, as whenever we are called from > > + * drm_vblank_get() -> drm_vblank_enable() the refcount will be 1 and > > + * we must let that one pass through in order to not lose vblank counts > > + * during vblank irq off - which would completely defeat the whole > > + * point of this routine. > > + * > > + * Whenever we are called from vblank irq, we have to assume concurrent > > + * readers exist or can show up any time during our execution, even if > > + * the refcount is currently zero, as vblank irqs are usually only > > + * enabled due to the presence of readers, and because when we are called > > + * from vblank irq we can't hold the vbl_lock to protect us from sudden > > + * bumps in vblank refcount. Therefore also restrict bumps to +1 when > > + * called from vblank irq. > > + */ > > + if ((diff > 1) && (atomic_read(&vblank->refcount) > 1 || > > + (flags & DRM_CALLED_FROM_VBLIRQ))) { > > + DRM_DEBUG_VBL("clamping vblank bump to 1 on crtc %u: diffr=%u " > > + "refcount %u, vblirq %u\n", pipe, diff, > > + atomic_read(&vblank->refcount), > > + (flags & DRM_CALLED_FROM_VBLIRQ) != 0); > > + diff = 1; > > + } > > + > > DRM_DEBUG_VBL("updating vblank count on crtc %u:" > > " current=%u, diff=%u, hw=%u hw_last=%u\n", > > pipe, vblank->count, diff, cur_vblank, vblank->last); > > -- > > 1.9.1 > > > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
On Tue, Feb 09, 2016 at 12:07:27PM +0200, Ville Syrjälä wrote: > On Tue, Feb 09, 2016 at 10:56:38AM +0100, Daniel Vetter wrote: > > On Mon, Feb 08, 2016 at 02:13:25AM +0100, Mario Kleiner wrote: > > > This fixes a regression introduced by the new drm_update_vblank_count() > > > implementation in Linux 4.4: > > > > > > Restrict the bump of the software vblank counter in drm_update_vblank_count() > > > to a safe maximum value of +1 whenever there is the possibility that > > > concurrent readers of vblank timestamps could be active at the moment, > > > as the current implementation of the timestamp caching and updating is > > > not safe against concurrent readers for calls to store_vblank() with a > > > bump of anything but +1. A bump != 1 would very likely return corrupted > > > timestamps to userspace, because the same slot in the cache could > > > be concurrently written by store_vblank() and read by one of those > > > readers in a non-atomic fashion and without the read-retry logic > > > detecting this collision. > > > > > > Concurrent readers can exist while drm_update_vblank_count() is called > > > from the drm_vblank_off() or drm_vblank_on() functions or other non-vblank- > > > irq callers. However, all those calls are happening with the vbl_lock > > > locked thereby preventing a drm_vblank_get(), so the vblank refcount > > > can't increase while drm_update_vblank_count() is executing. Therefore > > > a zero vblank refcount during execution of that function signals that > > > is safe for arbitrary counter bumps if called from outside vblank irq, > > > whereas a non-zero count is not safe. > > > > > > Whenever the function is called from vblank irq, we have to assume concurrent > > > readers could show up any time during its execution, even if the refcount > > > is currently zero, as vblank irqs are usually only enabled due to the > > > presence of readers, and because when it is called from vblank irq it > > > can't hold the vbl_lock to protect it from sudden bumps in vblank refcount. > > > Therefore also restrict bumps to +1 when the function is called from vblank > > > irq. > > > > > > Such bumps of more than +1 can happen at other times than reenabling > > > vblank irqs, e.g., when regular vblank interrupts get delayed by more > > > than 1 frame due to long held locks, long irq off periods, realtime > > > preemption on RT kernels, or system management interrupts. > > > > > > Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> > > > Cc: <stable@vger.kernel.org> # 4.4+ > > > Cc: michel@daenzer.net > > > Cc: vbabka@suse.cz > > > Cc: ville.syrjala@linux.intel.com > > > Cc: daniel.vetter@ffwll.ch > > > Cc: dri-devel@lists.freedesktop.org > > > Cc: alexander.deucher@amd.com > > > Cc: christian.koenig@amd.com > > > > Imo this is duct-tape. If we want to fix this up properly I think we > > should just use a full-blown seqlock instead of our hand-rolled one. And > > that could handle any increment at all. > > And I even fixed this [1] almost a half a year ago when I sent the > original series, but that part got held hostage to the same seqlock > argument. Perfect is the enemy of good. > > [1] https://lists.freedesktop.org/archives/intel-gfx/2015-September/075879.html Hm yeah, that does suffer from reinventing seqlocks. But I'd prefer your patch over Mario's hack here tbh. Your patch with seqlock would be even more awesome. -Daniel > > > -Daniel > > > > > --- > > > drivers/gpu/drm/drm_irq.c | 41 +++++++++++++++++++++++++++++++++++++++++ > > > 1 file changed, 41 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c > > > index bcb8528..aa2c74b 100644 > > > --- a/drivers/gpu/drm/drm_irq.c > > > +++ b/drivers/gpu/drm/drm_irq.c > > > @@ -221,6 +221,47 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, > > > diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0; > > > } > > > > > > + /* > > > + * Restrict the bump of the software vblank counter to a safe maximum > > > + * value of +1 whenever there is the possibility that concurrent readers > > > + * of vblank timestamps could be active at the moment, as the current > > > + * implementation of the timestamp caching and updating is not safe > > > + * against concurrent readers for calls to store_vblank() with a bump > > > + * of anything but +1. A bump != 1 would very likely return corrupted > > > + * timestamps to userspace, because the same slot in the cache could > > > + * be concurrently written by store_vblank() and read by one of those > > > + * readers without the read-retry logic detecting the collision. > > > + * > > > + * Concurrent readers can exist when we are called from the > > > + * drm_vblank_off() or drm_vblank_on() functions and other non-vblank- > > > + * irq callers. However, all those calls to us are happening with the > > > + * vbl_lock locked to prevent drm_vblank_get(), so the vblank refcount > > > + * can't increase while we are executing. Therefore a zero refcount at > > > + * this point is safe for arbitrary counter bumps if we are called > > > + * outside vblank irq, a non-zero count is not 100% safe. Unfortunately > > > + * we must also accept a refcount of 1, as whenever we are called from > > > + * drm_vblank_get() -> drm_vblank_enable() the refcount will be 1 and > > > + * we must let that one pass through in order to not lose vblank counts > > > + * during vblank irq off - which would completely defeat the whole > > > + * point of this routine. > > > + * > > > + * Whenever we are called from vblank irq, we have to assume concurrent > > > + * readers exist or can show up any time during our execution, even if > > > + * the refcount is currently zero, as vblank irqs are usually only > > > + * enabled due to the presence of readers, and because when we are called > > > + * from vblank irq we can't hold the vbl_lock to protect us from sudden > > > + * bumps in vblank refcount. Therefore also restrict bumps to +1 when > > > + * called from vblank irq. > > > + */ > > > + if ((diff > 1) && (atomic_read(&vblank->refcount) > 1 || > > > + (flags & DRM_CALLED_FROM_VBLIRQ))) { > > > + DRM_DEBUG_VBL("clamping vblank bump to 1 on crtc %u: diffr=%u " > > > + "refcount %u, vblirq %u\n", pipe, diff, > > > + atomic_read(&vblank->refcount), > > > + (flags & DRM_CALLED_FROM_VBLIRQ) != 0); > > > + diff = 1; > > > + } > > > + > > > DRM_DEBUG_VBL("updating vblank count on crtc %u:" > > > " current=%u, diff=%u, hw=%u hw_last=%u\n", > > > pipe, vblank->count, diff, cur_vblank, vblank->last); > > > -- > > > 1.9.1 > > > > > > > -- > > Daniel Vetter > > Software Engineer, Intel Corporation > > http://blog.ffwll.ch > > -- > Ville Syrjälä > Intel OTC
On 02/09/2016 11:23 AM, Daniel Vetter wrote: > On Tue, Feb 09, 2016 at 12:07:27PM +0200, Ville Syrjälä wrote: >> On Tue, Feb 09, 2016 at 10:56:38AM +0100, Daniel Vetter wrote: >>> On Mon, Feb 08, 2016 at 02:13:25AM +0100, Mario Kleiner wrote: >>>> This fixes a regression introduced by the new drm_update_vblank_count() >>>> implementation in Linux 4.4: >>>> >>>> Restrict the bump of the software vblank counter in drm_update_vblank_count() >>>> to a safe maximum value of +1 whenever there is the possibility that >>>> concurrent readers of vblank timestamps could be active at the moment, >>>> as the current implementation of the timestamp caching and updating is >>>> not safe against concurrent readers for calls to store_vblank() with a >>>> bump of anything but +1. A bump != 1 would very likely return corrupted >>>> timestamps to userspace, because the same slot in the cache could >>>> be concurrently written by store_vblank() and read by one of those >>>> readers in a non-atomic fashion and without the read-retry logic >>>> detecting this collision. >>>> >>>> Concurrent readers can exist while drm_update_vblank_count() is called >>>> from the drm_vblank_off() or drm_vblank_on() functions or other non-vblank- >>>> irq callers. However, all those calls are happening with the vbl_lock >>>> locked thereby preventing a drm_vblank_get(), so the vblank refcount >>>> can't increase while drm_update_vblank_count() is executing. Therefore >>>> a zero vblank refcount during execution of that function signals that >>>> is safe for arbitrary counter bumps if called from outside vblank irq, >>>> whereas a non-zero count is not safe. >>>> >>>> Whenever the function is called from vblank irq, we have to assume concurrent >>>> readers could show up any time during its execution, even if the refcount >>>> is currently zero, as vblank irqs are usually only enabled due to the >>>> presence of readers, and because when it is called from vblank irq it >>>> can't hold the vbl_lock to protect it from sudden bumps in vblank refcount. >>>> Therefore also restrict bumps to +1 when the function is called from vblank >>>> irq. >>>> >>>> Such bumps of more than +1 can happen at other times than reenabling >>>> vblank irqs, e.g., when regular vblank interrupts get delayed by more >>>> than 1 frame due to long held locks, long irq off periods, realtime >>>> preemption on RT kernels, or system management interrupts. >>>> >>>> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> >>>> Cc: <stable@vger.kernel.org> # 4.4+ >>>> Cc: michel@daenzer.net >>>> Cc: vbabka@suse.cz >>>> Cc: ville.syrjala@linux.intel.com >>>> Cc: daniel.vetter@ffwll.ch >>>> Cc: dri-devel@lists.freedesktop.org >>>> Cc: alexander.deucher@amd.com >>>> Cc: christian.koenig@amd.com >>> >>> Imo this is duct-tape. If we want to fix this up properly I think we >>> should just use a full-blown seqlock instead of our hand-rolled one. And >>> that could handle any increment at all. >> >> And I even fixed this [1] almost a half a year ago when I sent the >> original series, but that part got held hostage to the same seqlock >> argument. Perfect is the enemy of good. >> >> [1] https://lists.freedesktop.org/archives/intel-gfx/2015-September/075879.html > > Hm yeah, that does suffer from reinventing seqlocks. But I'd prefer your > patch over Mario's hack here tbh. Your patch with seqlock would be even > more awesome. > -Daniel > I agree that my hack is only duct-tape. That's why the long code comment to let people know under which condition they could remove it. Using seqlocks would be the robust long term solution. But as this is supposed to be a fix for both 4.4 and 4.5 i thought that such a rewrite would be too intrusive as a change, compared to this one-liner? The original "roll our own" seqlock look alike implementation was meant to avoid/minimize taking locks, esp. with _irqsave that are taken by both userspace and timing sensitive vblank irq handling code. There were various locking changes since then and that advantage might have been lost already quite a long time ago, so maybe switching to full seqlocks wouldn't pose some new performance problems there, but i haven't looked into this. -mario >> >>> -Daniel >>> >>>> --- >>>> drivers/gpu/drm/drm_irq.c | 41 +++++++++++++++++++++++++++++++++++++++++ >>>> 1 file changed, 41 insertions(+) >>>> >>>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c >>>> index bcb8528..aa2c74b 100644 >>>> --- a/drivers/gpu/drm/drm_irq.c >>>> +++ b/drivers/gpu/drm/drm_irq.c >>>> @@ -221,6 +221,47 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, >>>> diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0; >>>> } >>>> >>>> + /* >>>> + * Restrict the bump of the software vblank counter to a safe maximum >>>> + * value of +1 whenever there is the possibility that concurrent readers >>>> + * of vblank timestamps could be active at the moment, as the current >>>> + * implementation of the timestamp caching and updating is not safe >>>> + * against concurrent readers for calls to store_vblank() with a bump >>>> + * of anything but +1. A bump != 1 would very likely return corrupted >>>> + * timestamps to userspace, because the same slot in the cache could >>>> + * be concurrently written by store_vblank() and read by one of those >>>> + * readers without the read-retry logic detecting the collision. >>>> + * >>>> + * Concurrent readers can exist when we are called from the >>>> + * drm_vblank_off() or drm_vblank_on() functions and other non-vblank- >>>> + * irq callers. However, all those calls to us are happening with the >>>> + * vbl_lock locked to prevent drm_vblank_get(), so the vblank refcount >>>> + * can't increase while we are executing. Therefore a zero refcount at >>>> + * this point is safe for arbitrary counter bumps if we are called >>>> + * outside vblank irq, a non-zero count is not 100% safe. Unfortunately >>>> + * we must also accept a refcount of 1, as whenever we are called from >>>> + * drm_vblank_get() -> drm_vblank_enable() the refcount will be 1 and >>>> + * we must let that one pass through in order to not lose vblank counts >>>> + * during vblank irq off - which would completely defeat the whole >>>> + * point of this routine. >>>> + * >>>> + * Whenever we are called from vblank irq, we have to assume concurrent >>>> + * readers exist or can show up any time during our execution, even if >>>> + * the refcount is currently zero, as vblank irqs are usually only >>>> + * enabled due to the presence of readers, and because when we are called >>>> + * from vblank irq we can't hold the vbl_lock to protect us from sudden >>>> + * bumps in vblank refcount. Therefore also restrict bumps to +1 when >>>> + * called from vblank irq. >>>> + */ >>>> + if ((diff > 1) && (atomic_read(&vblank->refcount) > 1 || >>>> + (flags & DRM_CALLED_FROM_VBLIRQ))) { >>>> + DRM_DEBUG_VBL("clamping vblank bump to 1 on crtc %u: diffr=%u " >>>> + "refcount %u, vblirq %u\n", pipe, diff, >>>> + atomic_read(&vblank->refcount), >>>> + (flags & DRM_CALLED_FROM_VBLIRQ) != 0); >>>> + diff = 1; >>>> + } >>>> + >>>> DRM_DEBUG_VBL("updating vblank count on crtc %u:" >>>> " current=%u, diff=%u, hw=%u hw_last=%u\n", >>>> pipe, vblank->count, diff, cur_vblank, vblank->last); >>>> -- >>>> 1.9.1 >>>> >>> >>> -- >>> Daniel Vetter >>> Software Engineer, Intel Corporation >>> http://blog.ffwll.ch >> >> -- >> Ville Syrjälä >> Intel OTC >
On Tue, Feb 09, 2016 at 02:39:44PM +0100, Mario Kleiner wrote: > On 02/09/2016 11:23 AM, Daniel Vetter wrote: > >On Tue, Feb 09, 2016 at 12:07:27PM +0200, Ville Syrjälä wrote: > >>On Tue, Feb 09, 2016 at 10:56:38AM +0100, Daniel Vetter wrote: > >>>On Mon, Feb 08, 2016 at 02:13:25AM +0100, Mario Kleiner wrote: > >>>>This fixes a regression introduced by the new drm_update_vblank_count() > >>>>implementation in Linux 4.4: > >>>> > >>>>Restrict the bump of the software vblank counter in drm_update_vblank_count() > >>>>to a safe maximum value of +1 whenever there is the possibility that > >>>>concurrent readers of vblank timestamps could be active at the moment, > >>>>as the current implementation of the timestamp caching and updating is > >>>>not safe against concurrent readers for calls to store_vblank() with a > >>>>bump of anything but +1. A bump != 1 would very likely return corrupted > >>>>timestamps to userspace, because the same slot in the cache could > >>>>be concurrently written by store_vblank() and read by one of those > >>>>readers in a non-atomic fashion and without the read-retry logic > >>>>detecting this collision. > >>>> > >>>>Concurrent readers can exist while drm_update_vblank_count() is called > >>>>from the drm_vblank_off() or drm_vblank_on() functions or other non-vblank- > >>>>irq callers. However, all those calls are happening with the vbl_lock > >>>>locked thereby preventing a drm_vblank_get(), so the vblank refcount > >>>>can't increase while drm_update_vblank_count() is executing. Therefore > >>>>a zero vblank refcount during execution of that function signals that > >>>>is safe for arbitrary counter bumps if called from outside vblank irq, > >>>>whereas a non-zero count is not safe. > >>>> > >>>>Whenever the function is called from vblank irq, we have to assume concurrent > >>>>readers could show up any time during its execution, even if the refcount > >>>>is currently zero, as vblank irqs are usually only enabled due to the > >>>>presence of readers, and because when it is called from vblank irq it > >>>>can't hold the vbl_lock to protect it from sudden bumps in vblank refcount. > >>>>Therefore also restrict bumps to +1 when the function is called from vblank > >>>>irq. > >>>> > >>>>Such bumps of more than +1 can happen at other times than reenabling > >>>>vblank irqs, e.g., when regular vblank interrupts get delayed by more > >>>>than 1 frame due to long held locks, long irq off periods, realtime > >>>>preemption on RT kernels, or system management interrupts. > >>>> > >>>>Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> > >>>>Cc: <stable@vger.kernel.org> # 4.4+ > >>>>Cc: michel@daenzer.net > >>>>Cc: vbabka@suse.cz > >>>>Cc: ville.syrjala@linux.intel.com > >>>>Cc: daniel.vetter@ffwll.ch > >>>>Cc: dri-devel@lists.freedesktop.org > >>>>Cc: alexander.deucher@amd.com > >>>>Cc: christian.koenig@amd.com > >>> > >>>Imo this is duct-tape. If we want to fix this up properly I think we > >>>should just use a full-blown seqlock instead of our hand-rolled one. And > >>>that could handle any increment at all. > >> > >>And I even fixed this [1] almost a half a year ago when I sent the > >>original series, but that part got held hostage to the same seqlock > >>argument. Perfect is the enemy of good. > >> > >>[1] https://lists.freedesktop.org/archives/intel-gfx/2015-September/075879.html > > > >Hm yeah, that does suffer from reinventing seqlocks. But I'd prefer your > >patch over Mario's hack here tbh. Your patch with seqlock would be even > >more awesome. > >-Daniel > > > > I agree that my hack is only duct-tape. That's why the long code comment to > let people know under which condition they could remove it. > > Using seqlocks would be the robust long term solution. But as this is > supposed to be a fix for both 4.4 and 4.5 i thought that such a rewrite > would be too intrusive as a change, compared to this one-liner? > > The original "roll our own" seqlock look alike implementation was meant to > avoid/minimize taking locks, esp. with _irqsave that are taken by both > userspace and timing sensitive vblank irq handling code. There were various > locking changes since then and that advantage might have been lost already > quite a long time ago, so maybe switching to full seqlocks wouldn't pose > some new performance problems there, but i haven't looked into this. Last time I've checked we've already reinvented seqlocks completely, except buggy since ours can't take an increment > 1. I don't expect you'll be able to measure anything if we switch. Agree that it might be better to delay this for 4.6. So if you add a big "FIMXE: Need to replace this hack with proper seqlocks." a the top of your big comment (or just as a replacement for it), then Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> But currently it looks like this is a proper long-term solution, which it imo isn't. -Daniel > > -mario > > >> > >>>-Daniel > >>> > >>>>--- > >>>> drivers/gpu/drm/drm_irq.c | 41 +++++++++++++++++++++++++++++++++++++++++ > >>>> 1 file changed, 41 insertions(+) > >>>> > >>>>diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c > >>>>index bcb8528..aa2c74b 100644 > >>>>--- a/drivers/gpu/drm/drm_irq.c > >>>>+++ b/drivers/gpu/drm/drm_irq.c > >>>>@@ -221,6 +221,47 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, > >>>> diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0; > >>>> } > >>>> > >>>>+ /* > >>>>+ * Restrict the bump of the software vblank counter to a safe maximum > >>>>+ * value of +1 whenever there is the possibility that concurrent readers > >>>>+ * of vblank timestamps could be active at the moment, as the current > >>>>+ * implementation of the timestamp caching and updating is not safe > >>>>+ * against concurrent readers for calls to store_vblank() with a bump > >>>>+ * of anything but +1. A bump != 1 would very likely return corrupted > >>>>+ * timestamps to userspace, because the same slot in the cache could > >>>>+ * be concurrently written by store_vblank() and read by one of those > >>>>+ * readers without the read-retry logic detecting the collision. > >>>>+ * > >>>>+ * Concurrent readers can exist when we are called from the > >>>>+ * drm_vblank_off() or drm_vblank_on() functions and other non-vblank- > >>>>+ * irq callers. However, all those calls to us are happening with the > >>>>+ * vbl_lock locked to prevent drm_vblank_get(), so the vblank refcount > >>>>+ * can't increase while we are executing. Therefore a zero refcount at > >>>>+ * this point is safe for arbitrary counter bumps if we are called > >>>>+ * outside vblank irq, a non-zero count is not 100% safe. Unfortunately > >>>>+ * we must also accept a refcount of 1, as whenever we are called from > >>>>+ * drm_vblank_get() -> drm_vblank_enable() the refcount will be 1 and > >>>>+ * we must let that one pass through in order to not lose vblank counts > >>>>+ * during vblank irq off - which would completely defeat the whole > >>>>+ * point of this routine. > >>>>+ * > >>>>+ * Whenever we are called from vblank irq, we have to assume concurrent > >>>>+ * readers exist or can show up any time during our execution, even if > >>>>+ * the refcount is currently zero, as vblank irqs are usually only > >>>>+ * enabled due to the presence of readers, and because when we are called > >>>>+ * from vblank irq we can't hold the vbl_lock to protect us from sudden > >>>>+ * bumps in vblank refcount. Therefore also restrict bumps to +1 when > >>>>+ * called from vblank irq. > >>>>+ */ > >>>>+ if ((diff > 1) && (atomic_read(&vblank->refcount) > 1 || > >>>>+ (flags & DRM_CALLED_FROM_VBLIRQ))) { > >>>>+ DRM_DEBUG_VBL("clamping vblank bump to 1 on crtc %u: diffr=%u " > >>>>+ "refcount %u, vblirq %u\n", pipe, diff, > >>>>+ atomic_read(&vblank->refcount), > >>>>+ (flags & DRM_CALLED_FROM_VBLIRQ) != 0); > >>>>+ diff = 1; > >>>>+ } > >>>>+ > >>>> DRM_DEBUG_VBL("updating vblank count on crtc %u:" > >>>> " current=%u, diff=%u, hw=%u hw_last=%u\n", > >>>> pipe, vblank->count, diff, cur_vblank, vblank->last); > >>>>-- > >>>>1.9.1 > >>>> > >>> > >>>-- > >>>Daniel Vetter > >>>Software Engineer, Intel Corporation > >>>http://blog.ffwll.ch > >> > >>-- > >>Ville Syrjälä > >>Intel OTC > >
On 02/09/2016 03:29 PM, Daniel Vetter wrote: > On Tue, Feb 09, 2016 at 02:39:44PM +0100, Mario Kleiner wrote: >> On 02/09/2016 11:23 AM, Daniel Vetter wrote: >>> On Tue, Feb 09, 2016 at 12:07:27PM +0200, Ville Syrjälä wrote: >>>> On Tue, Feb 09, 2016 at 10:56:38AM +0100, Daniel Vetter wrote: >>>>> On Mon, Feb 08, 2016 at 02:13:25AM +0100, Mario Kleiner wrote: >>>>>> This fixes a regression introduced by the new drm_update_vblank_count() >>>>>> implementation in Linux 4.4: >>>>>> >>>>>> Restrict the bump of the software vblank counter in drm_update_vblank_count() >>>>>> to a safe maximum value of +1 whenever there is the possibility that >>>>>> concurrent readers of vblank timestamps could be active at the moment, >>>>>> as the current implementation of the timestamp caching and updating is >>>>>> not safe against concurrent readers for calls to store_vblank() with a >>>>>> bump of anything but +1. A bump != 1 would very likely return corrupted >>>>>> timestamps to userspace, because the same slot in the cache could >>>>>> be concurrently written by store_vblank() and read by one of those >>>>>> readers in a non-atomic fashion and without the read-retry logic >>>>>> detecting this collision. >>>>>> >>>>>> Concurrent readers can exist while drm_update_vblank_count() is called >>>>> >from the drm_vblank_off() or drm_vblank_on() functions or other non-vblank- >>>>>> irq callers. However, all those calls are happening with the vbl_lock >>>>>> locked thereby preventing a drm_vblank_get(), so the vblank refcount >>>>>> can't increase while drm_update_vblank_count() is executing. Therefore >>>>>> a zero vblank refcount during execution of that function signals that >>>>>> is safe for arbitrary counter bumps if called from outside vblank irq, >>>>>> whereas a non-zero count is not safe. >>>>>> >>>>>> Whenever the function is called from vblank irq, we have to assume concurrent >>>>>> readers could show up any time during its execution, even if the refcount >>>>>> is currently zero, as vblank irqs are usually only enabled due to the >>>>>> presence of readers, and because when it is called from vblank irq it >>>>>> can't hold the vbl_lock to protect it from sudden bumps in vblank refcount. >>>>>> Therefore also restrict bumps to +1 when the function is called from vblank >>>>>> irq. >>>>>> >>>>>> Such bumps of more than +1 can happen at other times than reenabling >>>>>> vblank irqs, e.g., when regular vblank interrupts get delayed by more >>>>>> than 1 frame due to long held locks, long irq off periods, realtime >>>>>> preemption on RT kernels, or system management interrupts. >>>>>> >>>>>> Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> >>>>>> Cc: <stable@vger.kernel.org> # 4.4+ >>>>>> Cc: michel@daenzer.net >>>>>> Cc: vbabka@suse.cz >>>>>> Cc: ville.syrjala@linux.intel.com >>>>>> Cc: daniel.vetter@ffwll.ch >>>>>> Cc: dri-devel@lists.freedesktop.org >>>>>> Cc: alexander.deucher@amd.com >>>>>> Cc: christian.koenig@amd.com >>>>> >>>>> Imo this is duct-tape. If we want to fix this up properly I think we >>>>> should just use a full-blown seqlock instead of our hand-rolled one. And >>>>> that could handle any increment at all. >>>> >>>> And I even fixed this [1] almost a half a year ago when I sent the >>>> original series, but that part got held hostage to the same seqlock >>>> argument. Perfect is the enemy of good. >>>> >>>> [1] https://lists.freedesktop.org/archives/intel-gfx/2015-September/075879.html >>> >>> Hm yeah, that does suffer from reinventing seqlocks. But I'd prefer your >>> patch over Mario's hack here tbh. Your patch with seqlock would be even >>> more awesome. >>> -Daniel >>> >> >> I agree that my hack is only duct-tape. That's why the long code comment to >> let people know under which condition they could remove it. >> >> Using seqlocks would be the robust long term solution. But as this is >> supposed to be a fix for both 4.4 and 4.5 i thought that such a rewrite >> would be too intrusive as a change, compared to this one-liner? >> >> The original "roll our own" seqlock look alike implementation was meant to >> avoid/minimize taking locks, esp. with _irqsave that are taken by both >> userspace and timing sensitive vblank irq handling code. There were various >> locking changes since then and that advantage might have been lost already >> quite a long time ago, so maybe switching to full seqlocks wouldn't pose >> some new performance problems there, but i haven't looked into this. > > Last time I've checked we've already reinvented seqlocks completely, > except buggy since ours can't take an increment > 1. I don't expect you'll > be able to measure anything if we switch. > > Agree that it might be better to delay this for 4.6. So if you add a big > "FIMXE: Need to replace this hack with proper seqlocks." a the top of your > big comment (or just as a replacement for it), then > > Reviewed-by: Daniel Vetter <daniel.vetter@ffwll.ch> > > But currently it looks like this is a proper long-term solution, which it > imo isn't. > -Daniel > Ok, will do. -mario > >> >> -mario >> >>>> >>>>> -Daniel >>>>> >>>>>> --- >>>>>> drivers/gpu/drm/drm_irq.c | 41 +++++++++++++++++++++++++++++++++++++++++ >>>>>> 1 file changed, 41 insertions(+) >>>>>> >>>>>> diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c >>>>>> index bcb8528..aa2c74b 100644 >>>>>> --- a/drivers/gpu/drm/drm_irq.c >>>>>> +++ b/drivers/gpu/drm/drm_irq.c >>>>>> @@ -221,6 +221,47 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, >>>>>> diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0; >>>>>> } >>>>>> >>>>>> + /* >>>>>> + * Restrict the bump of the software vblank counter to a safe maximum >>>>>> + * value of +1 whenever there is the possibility that concurrent readers >>>>>> + * of vblank timestamps could be active at the moment, as the current >>>>>> + * implementation of the timestamp caching and updating is not safe >>>>>> + * against concurrent readers for calls to store_vblank() with a bump >>>>>> + * of anything but +1. A bump != 1 would very likely return corrupted >>>>>> + * timestamps to userspace, because the same slot in the cache could >>>>>> + * be concurrently written by store_vblank() and read by one of those >>>>>> + * readers without the read-retry logic detecting the collision. >>>>>> + * >>>>>> + * Concurrent readers can exist when we are called from the >>>>>> + * drm_vblank_off() or drm_vblank_on() functions and other non-vblank- >>>>>> + * irq callers. However, all those calls to us are happening with the >>>>>> + * vbl_lock locked to prevent drm_vblank_get(), so the vblank refcount >>>>>> + * can't increase while we are executing. Therefore a zero refcount at >>>>>> + * this point is safe for arbitrary counter bumps if we are called >>>>>> + * outside vblank irq, a non-zero count is not 100% safe. Unfortunately >>>>>> + * we must also accept a refcount of 1, as whenever we are called from >>>>>> + * drm_vblank_get() -> drm_vblank_enable() the refcount will be 1 and >>>>>> + * we must let that one pass through in order to not lose vblank counts >>>>>> + * during vblank irq off - which would completely defeat the whole >>>>>> + * point of this routine. >>>>>> + * >>>>>> + * Whenever we are called from vblank irq, we have to assume concurrent >>>>>> + * readers exist or can show up any time during our execution, even if >>>>>> + * the refcount is currently zero, as vblank irqs are usually only >>>>>> + * enabled due to the presence of readers, and because when we are called >>>>>> + * from vblank irq we can't hold the vbl_lock to protect us from sudden >>>>>> + * bumps in vblank refcount. Therefore also restrict bumps to +1 when >>>>>> + * called from vblank irq. >>>>>> + */ >>>>>> + if ((diff > 1) && (atomic_read(&vblank->refcount) > 1 || >>>>>> + (flags & DRM_CALLED_FROM_VBLIRQ))) { >>>>>> + DRM_DEBUG_VBL("clamping vblank bump to 1 on crtc %u: diffr=%u " >>>>>> + "refcount %u, vblirq %u\n", pipe, diff, >>>>>> + atomic_read(&vblank->refcount), >>>>>> + (flags & DRM_CALLED_FROM_VBLIRQ) != 0); >>>>>> + diff = 1; >>>>>> + } >>>>>> + >>>>>> DRM_DEBUG_VBL("updating vblank count on crtc %u:" >>>>>> " current=%u, diff=%u, hw=%u hw_last=%u\n", >>>>>> pipe, vblank->count, diff, cur_vblank, vblank->last); >>>>>> -- >>>>>> 1.9.1 >>>>>> >>>>> >>>>> -- >>>>> Daniel Vetter >>>>> Software Engineer, Intel Corporation >>>>> http://blog.ffwll.ch >>>> >>>> -- >>>> Ville Syrjälä >>>> Intel OTC >>> >
diff --git a/drivers/gpu/drm/drm_irq.c b/drivers/gpu/drm/drm_irq.c index bcb8528..aa2c74b 100644 --- a/drivers/gpu/drm/drm_irq.c +++ b/drivers/gpu/drm/drm_irq.c @@ -221,6 +221,47 @@ static void drm_update_vblank_count(struct drm_device *dev, unsigned int pipe, diff = (flags & DRM_CALLED_FROM_VBLIRQ) != 0; } + /* + * Restrict the bump of the software vblank counter to a safe maximum + * value of +1 whenever there is the possibility that concurrent readers + * of vblank timestamps could be active at the moment, as the current + * implementation of the timestamp caching and updating is not safe + * against concurrent readers for calls to store_vblank() with a bump + * of anything but +1. A bump != 1 would very likely return corrupted + * timestamps to userspace, because the same slot in the cache could + * be concurrently written by store_vblank() and read by one of those + * readers without the read-retry logic detecting the collision. + * + * Concurrent readers can exist when we are called from the + * drm_vblank_off() or drm_vblank_on() functions and other non-vblank- + * irq callers. However, all those calls to us are happening with the + * vbl_lock locked to prevent drm_vblank_get(), so the vblank refcount + * can't increase while we are executing. Therefore a zero refcount at + * this point is safe for arbitrary counter bumps if we are called + * outside vblank irq, a non-zero count is not 100% safe. Unfortunately + * we must also accept a refcount of 1, as whenever we are called from + * drm_vblank_get() -> drm_vblank_enable() the refcount will be 1 and + * we must let that one pass through in order to not lose vblank counts + * during vblank irq off - which would completely defeat the whole + * point of this routine. + * + * Whenever we are called from vblank irq, we have to assume concurrent + * readers exist or can show up any time during our execution, even if + * the refcount is currently zero, as vblank irqs are usually only + * enabled due to the presence of readers, and because when we are called + * from vblank irq we can't hold the vbl_lock to protect us from sudden + * bumps in vblank refcount. Therefore also restrict bumps to +1 when + * called from vblank irq. + */ + if ((diff > 1) && (atomic_read(&vblank->refcount) > 1 || + (flags & DRM_CALLED_FROM_VBLIRQ))) { + DRM_DEBUG_VBL("clamping vblank bump to 1 on crtc %u: diffr=%u " + "refcount %u, vblirq %u\n", pipe, diff, + atomic_read(&vblank->refcount), + (flags & DRM_CALLED_FROM_VBLIRQ) != 0); + diff = 1; + } + DRM_DEBUG_VBL("updating vblank count on crtc %u:" " current=%u, diff=%u, hw=%u hw_last=%u\n", pipe, vblank->count, diff, cur_vblank, vblank->last);
This fixes a regression introduced by the new drm_update_vblank_count() implementation in Linux 4.4: Restrict the bump of the software vblank counter in drm_update_vblank_count() to a safe maximum value of +1 whenever there is the possibility that concurrent readers of vblank timestamps could be active at the moment, as the current implementation of the timestamp caching and updating is not safe against concurrent readers for calls to store_vblank() with a bump of anything but +1. A bump != 1 would very likely return corrupted timestamps to userspace, because the same slot in the cache could be concurrently written by store_vblank() and read by one of those readers in a non-atomic fashion and without the read-retry logic detecting this collision. Concurrent readers can exist while drm_update_vblank_count() is called from the drm_vblank_off() or drm_vblank_on() functions or other non-vblank- irq callers. However, all those calls are happening with the vbl_lock locked thereby preventing a drm_vblank_get(), so the vblank refcount can't increase while drm_update_vblank_count() is executing. Therefore a zero vblank refcount during execution of that function signals that is safe for arbitrary counter bumps if called from outside vblank irq, whereas a non-zero count is not safe. Whenever the function is called from vblank irq, we have to assume concurrent readers could show up any time during its execution, even if the refcount is currently zero, as vblank irqs are usually only enabled due to the presence of readers, and because when it is called from vblank irq it can't hold the vbl_lock to protect it from sudden bumps in vblank refcount. Therefore also restrict bumps to +1 when the function is called from vblank irq. Such bumps of more than +1 can happen at other times than reenabling vblank irqs, e.g., when regular vblank interrupts get delayed by more than 1 frame due to long held locks, long irq off periods, realtime preemption on RT kernels, or system management interrupts. Signed-off-by: Mario Kleiner <mario.kleiner.de@gmail.com> Cc: <stable@vger.kernel.org> # 4.4+ Cc: michel@daenzer.net Cc: vbabka@suse.cz Cc: ville.syrjala@linux.intel.com Cc: daniel.vetter@ffwll.ch Cc: dri-devel@lists.freedesktop.org Cc: alexander.deucher@amd.com Cc: christian.koenig@amd.com --- drivers/gpu/drm/drm_irq.c | 41 +++++++++++++++++++++++++++++++++++++++++ 1 file changed, 41 insertions(+)