[2/2] drm: get lock before accessing vblank refcount

Message ID 20220722215234.129793-2-Yunxiang.Li@amd.com (mailing list archive)
State New, archived
Series [1/2] drm: Fix vblank refcount during modeset

Commit Message

Yunxiang Li July 22, 2022, 9:52 p.m. UTC
Acquire vbl_lock before accessing the vblank refcount in drm_vblank_put,
just like everywhere else that accesses the refcount.

Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>
---
 drivers/gpu/drm/drm_vblank.c | 11 +++++++++--
 1 file changed, 9 insertions(+), 2 deletions(-)

Comments

Daniel Vetter Sept. 6, 2022, 7:20 p.m. UTC | #1
On Fri, Jul 22, 2022 at 05:52:34PM -0400, Yunxiang Li wrote:
> Acquire vbl_lock before accessing the vblank refcount in drm_vblank_put,
> just like everywhere else that accesses the refcount.
> 
> Signed-off-by: Yunxiang Li <Yunxiang.Li@amd.com>

The entire point of using atomic for the refcount is that we can check it
lockless, so I'm not sure what you're trying to fix here?
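
As a rough illustration of that point, here is a minimal userspace sketch that assumes C11 atomics as a stand-in for the kernel's atomic_t helpers. The struct and function names (vblank_ref, ref_dec_and_test, vblank_ref_put) are hypothetical and are not part of the DRM API; the sketch only shows why the put path can decide "last user" without taking a lock.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the refcount in struct drm_vblank_crtc. */
struct vblank_ref {
	atomic_int refcount;
};

/*
 * Userspace analogue of the kernel's atomic_dec_and_test(): decrement and
 * report whether the new value is zero. atomic_fetch_sub() returns the
 * previous value, so "previous == 1" means this caller dropped the last
 * reference. The read-modify-write is a single atomic operation, so at
 * most one concurrent caller can observe the transition to zero.
 */
static bool ref_dec_and_test(atomic_int *v)
{
	return atomic_fetch_sub(v, 1) == 1;
}

/* Returns true if the caller dropped the last reference. */
static bool vblank_ref_put(struct vblank_ref *vblank)
{
	/*
	 * Diagnostic only: flags an obviously unbalanced put. It is a
	 * separate atomic load, so it is a sanity check rather than a
	 * guarantee, and it needs no lock either.
	 */
	if (atomic_load(&vblank->refcount) == 0) {
		fprintf(stderr, "unbalanced put\n");
		return false;
	}

	return ref_dec_and_test(&vblank->refcount);
}
```

With the count initialised to 2, the first put returns false and the second returns true; only the caller that sees true would go on to schedule the interrupt disable.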

For the first patch I think it's clear that the bug needs to be fixed in
amdgpu dc code already.
-Daniel
> ---
>  drivers/gpu/drm/drm_vblank.c | 11 +++++++++--
>  1 file changed, 9 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
> index 159d13b5d97b..77b8c40fc7ba 100644
> --- a/drivers/gpu/drm/drm_vblank.c
> +++ b/drivers/gpu/drm/drm_vblank.c
> @@ -1203,15 +1203,22 @@ EXPORT_SYMBOL(drm_crtc_vblank_get);
>  void drm_vblank_put(struct drm_device *dev, unsigned int pipe)
>  {
>  	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
> +	unsigned long irqflags;
> +	int ret;
>  
>  	if (drm_WARN_ON(dev, pipe >= dev->num_crtcs))
>  		return;
>  
> -	if (drm_WARN_ON(dev, atomic_read(&vblank->refcount) == 0))
> +	spin_lock_irqsave(&dev->vbl_lock, irqflags);
> +	if (drm_WARN_ON(dev, atomic_read(&vblank->refcount) == 0)) {
> +		spin_unlock_irqrestore(&dev->vbl_lock, irqflags);
>  		return;
> +	}
>  
>  	/* Last user schedules interrupt disable */
> -	if (atomic_dec_and_test(&vblank->refcount)) {
> +	ret = atomic_dec_and_test(&vblank->refcount);
> +	spin_unlock_irqrestore(&dev->vbl_lock, irqflags);
> +	if (ret) {
>  		if (drm_vblank_offdelay == 0)
>  			return;
>  		else if (drm_vblank_offdelay < 0)
> -- 
> 2.37.1
>

Yunxiang Li Sept. 6, 2022, 8:18 p.m. UTC | #2
[Public]

Hi Daniel,

I added the check because I saw that the refcount was always updated or read with the lock held elsewhere, and the pattern there (e.g. in drm_crtc_vblank_reset) looked very similar to the one in the put. This patchset is outdated by now, and I've sent a fix to amd-gfx for the specific issue in amdgpu.

However, I think the way the drm_crtc_vblank_on/off functions increment the refcount without enabling the vblank is still a bit risky, given how many unchecked calls to drm_vblank_get there are elsewhere. Maybe it's more appropriate to simply add a WARN to drm_vblank_get when it's called with inmodeset set? This way the WARN happens right at the problematic point, instead of far in the future when the put is called.

Yunxiang

Daniel Vetter Sept. 6, 2022, 9:58 p.m. UTC | #3
On Tue, Sep 06, 2022 at 08:18:30PM +0000, Li, Yunxiang (Teddy) wrote:
> [Public]
> 
> Hi Daniel,
> 
> I added the check because I saw that the refcount was always updated or
> read with the lock held elsewhere, and the pattern there (e.g. in
> drm_crtc_vblank_reset) looked very similar to the one in the put. This
> patchset is outdated by now, and I've sent a fix to amd-gfx for the
> specific issue in amdgpu.
> 
> However, I think the way the drm_crtc_vblank_on/off functions increment
> the refcount without enabling the vblank is still a bit risky, given how
> many unchecked calls to drm_vblank_get there are elsewhere. Maybe it's
> more appropriate to simply add a WARN to drm_vblank_get when it's called
> with inmodeset set? This way the WARN happens right at the problematic
> point, instead of far in the future when the put is called.

drm_crtc_vblank_get failing when the crtc is off is how this is supposed
to work; calling WARN_ON or similar in there would upset everything.

What might be an option is adding __must_check or similar annotations, but
the problem is that in many cases the driver knows that it cannot fail, so
this isn't great either.

Another option would be to split this up into drm_crtc_vblank_get with
void return value (and a WARN_ON when it fails), and
drm_crtc_vblank_try_get, which can fail. And then go through _all_ the
callers and audit them.

Imo not really worth the work, but we could do that.
-Daniel
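
For illustration, the split suggested above could look roughly like the following userspace sketch. Everything in it is an assumption made for the example: the vblank_state struct, the model_* function names, the MODEL_MUST_CHECK macro (a GCC/Clang attribute standing in for the kernel's __must_check), and the simplified "enabled" flag. It is not the existing DRM API, and no such drm_crtc_vblank_try_get is confirmed to exist.

```c
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* GCC/Clang-specific; assumed stand-in for the kernel's __must_check. */
#define MODEL_MUST_CHECK __attribute__((warn_unused_result))

/* Hypothetical, simplified model of per-CRTC vblank state. */
struct vblank_state {
	atomic_int refcount;
	bool enabled;	/* false models "CRTC is off": no reference can be taken */
	int warnings;	/* counts WARN-style diagnostics, for illustration only */
};

/*
 * Fallible variant: the caller must check the result.
 * Returns 0 on success, -EINVAL if the vblank is unavailable.
 */
static MODEL_MUST_CHECK int model_vblank_try_get(struct vblank_state *vblank)
{
	if (!vblank->enabled)
		return -EINVAL;

	atomic_fetch_add(&vblank->refcount, 1);
	return 0;
}

/*
 * Variant for callers that know the get cannot fail: no return value,
 * and a failure is reported loudly at the call site instead of showing
 * up later as an unbalanced put.
 */
static void model_vblank_get(struct vblank_state *vblank)
{
	if (model_vblank_try_get(vblank) != 0) {
		vblank->warnings++;
		fprintf(stderr, "WARN: vblank get while CRTC is off\n");
	}
}
```

The design trade-off is the one described above: the fallible variant keeps "get fails while the crtc is off" as normal behaviour, while the void variant moves the warning to the point of the bad call. The cost is auditing every existing caller to decide which variant it should use.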

Patch

diff --git a/drivers/gpu/drm/drm_vblank.c b/drivers/gpu/drm/drm_vblank.c
index 159d13b5d97b..77b8c40fc7ba 100644
--- a/drivers/gpu/drm/drm_vblank.c
+++ b/drivers/gpu/drm/drm_vblank.c
@@ -1203,15 +1203,22 @@ EXPORT_SYMBOL(drm_crtc_vblank_get);
 void drm_vblank_put(struct drm_device *dev, unsigned int pipe)
 {
 	struct drm_vblank_crtc *vblank = &dev->vblank[pipe];
+	unsigned long irqflags;
+	int ret;
 
 	if (drm_WARN_ON(dev, pipe >= dev->num_crtcs))
 		return;
 
-	if (drm_WARN_ON(dev, atomic_read(&vblank->refcount) == 0))
+	spin_lock_irqsave(&dev->vbl_lock, irqflags);
+	if (drm_WARN_ON(dev, atomic_read(&vblank->refcount) == 0)) {
+		spin_unlock_irqrestore(&dev->vbl_lock, irqflags);
 		return;
+	}
 
 	/* Last user schedules interrupt disable */
-	if (atomic_dec_and_test(&vblank->refcount)) {
+	ret = atomic_dec_and_test(&vblank->refcount);
+	spin_unlock_irqrestore(&dev->vbl_lock, irqflags);
+	if (ret) {
 		if (drm_vblank_offdelay == 0)
 			return;
 		else if (drm_vblank_offdelay < 0)